The solutions engineer’s perspective on the complicated, time-consuming, and challenging process called data migration.
Data migration is a key step in IT transformation projects. Without proper planning, it is also highly likely to veer off course and need correction.
So, what is data migration? Essentially, it’s the process of transferring data from one system to another. In reality, it’s a complicated, time-consuming, and challenging process that requires careful planning and precise execution. This is even more true when dealing with large amounts of data, complex and interdependent data structures, and data that underpins business processes.
Why is the success of data migration so important?
Before we go into “how”, let’s talk about “why” for a moment. When moving to a new system, a lot of focus is given to the selection and design of the new system, but data migration is perceived as an ad hoc activity that “the IT team just needs to get through so we can finally leverage the capabilities of our new system”.
Thinking like this is a mistake.
Imagine moving to a new house. You’ve spent months selecting it, planning, and designing the interior, and then booked a truck to move your furniture. When the truck arrives, you realize that with the amount of stuff you have and the size of the truck, it will take five trips to move everything. So now it will take five days instead of one to move everything.
You then find that some of your furniture does not fit through the door of the new place, some does not fit the room size, and some items got broken during transportation and need to be fixed or replaced. Eventually, you end up having to rent your existing place for another month at a higher price because you could not complete the move on time – and it will take a few months for you to unclutter and organize things in your new place.
You get the idea.
Migration is the foundation of transformations that involve a transition to a new system or platform. Organizations need to get it right to unlock the full potential of the new system and achieve their goals because your system is only as good as the quality of data you have in it.
Staging the data migration situation
An example of transformation projects that involve migration is a transition from legacy OSS/BSS systems to a new digital commerce and subscription management platform like CloudBlue.
This change would enable a digital marketplace and a subscription consumption model, automate fulfilment, and expand the portfolio by offering more products.
So, you need to migrate existing accounts and subscriptions to a new system. Where do you start? What challenges should you anticipate?
The data migration process consists of four major phases:
- Scoping and planning: the most important phase of migration. It involves understanding what exactly you need to migrate, discovering technical and business requirements and constraints, choosing the right strategy, researching possible data migration software, tools, and solutions, and aligning on strategy and execution with stakeholders.
- Execution: depends primarily on the first phase, because good planning and an appropriate solution will help you migrate without failures.
- Verification: run verification tests to ensure that everything was done according to the plan.
- Cutover: in this phase, you switch to the new system.
Depending on the data migration strategy, the execution, verification, and cutover phases can be performed once, or iteratively for each migration batch.
Planning the data migration
The first questions you would usually ask when planning a data migration are “what do we need to migrate?” and “how many objects?” You then expand on them:
- As you are migrating customer accounts that belong to resellers, do you need to migrate the resellers as well? Or will new resellers be created in the new system?
- Will you migrate all subscriptions or only subscriptions for specific products?
- Will you migrate only active accounts/subscriptions or inactive/disabled ones too?
- If you need to migrate inactive accounts/subscriptions, will you migrate all of them or just the ones that have been inactive for a certain period?
- Are there any other dependent objects you need to migrate together with accounts, such as past orders, invoices, terms and conditions acceptance history, etc.?
To ensure you don’t miss anything regarding types of objects, their properties, and dependencies, use multiple data sources: look at existing and new systems design documents, UI, data models, API specification, and database schema.
Data mapping, validation, and transformation
This is where you map the entities that you need to migrate from the old system (accounts, subscriptions, users, user roles, etc.) to their counterparts in the new system.
This is not just about mapping objects; it’s also about mapping properties. Some, like account company name, contact details, and address, will be straightforward. But what about properties that do not have direct counterparts in the new system? Or what if the new system has a mandatory property that does not exist in the old system – what logic should you use to decide what value to set for such a property?
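As a minimal sketch of property mapping, the rules can be written down as data: a table of direct counterparts, a table of defaults for properties that are mandatory only in the new system, and a report of old properties that have no counterpart and still need a decision. All field names below (`company`, `company_name`, `language`, and so on) are hypothetical and do not come from any real system.

```python
# Hypothetical field names: old-system property -> new-system property.
FIELD_MAP = {
    "company": "company_name",
    "email": "contact_email",
    "addr": "address",
}

# Hypothetical properties that are mandatory in the new system but do not
# exist in the old one, with the default value agreed for migrated records.
DEFAULTS = {"language": "en"}


def map_account(old: dict) -> tuple[dict, list[str]]:
    """Map an old-system account to the new shape.

    Returns the mapped record and the sorted list of old properties that
    have no counterpart in FIELD_MAP and therefore need a decision.
    """
    new = {
        target: old[source]
        for source, target in FIELD_MAP.items()
        if source in old
    }
    # Fill mandatory new-system properties that the old record cannot supply.
    for field, default in DEFAULTS.items():
        new.setdefault(field, default)
    unmapped = sorted(set(old) - set(FIELD_MAP))
    return new, unmapped
```

Returning the unmapped properties, rather than silently dropping them, makes the open mapping questions visible early in planning.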
Data consistency checks and validation are also an important part of this process. If you think that all data in the system you’ve been using for the past five years is accurate and consistent – you’re in for a surprise. Old bugs, accounts created before validation checks were added, and workarounds involving direct data manipulations bypassing validation checks are just a few of the reasons that can lead to “broken” data in the system.
It’s much easier to find and fix such data inconsistencies in advance, either in the old system or as part of data transformation, than to discover them in the middle of the migration. Even worse is finding these issues weeks or months after the migration is done, when they have already caused problems in the new system.
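A pre-migration consistency check can be as simple as running every record through a list of rules and collecting the failures into a report. The sketch below assumes a hypothetical account shape (`id`, `company`, `email`, `status`) and three illustrative rules; the real rules would come from the new system’s validation requirements.

```python
import re

# Deliberately loose pattern: it only catches obviously malformed addresses.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical set of statuses the new system accepts.
VALID_STATUSES = {"active", "inactive"}


def find_issues(account: dict) -> list[str]:
    """Return a list of human-readable problems found in one account."""
    issues = []
    if not (account.get("company") or "").strip():
        issues.append("missing company name")
    if not EMAIL_RE.match(account.get("email") or ""):
        issues.append("invalid email")
    if account.get("status") not in VALID_STATUSES:
        issues.append(f"unknown status: {account.get('status')!r}")
    return issues


def validation_report(accounts: list[dict]) -> dict:
    """Map account id -> issues, listing only accounts that have issues."""
    return {a["id"]: issues for a in accounts if (issues := find_issues(a))}
```

Running a report like this against an export of the old system gives you a list of records to fix at the source or to handle in transformation rules.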
Data migration approach 101
This is where you will decide what high-level approach will be used for data migration:
- Bulk export data from the source system, copy the exported data to the target system, apply data transformation, and import the data to the new system.
- Use data migration tools that process each object one by one by retrieving the object and its properties from the source system API, applying transformation rules, and creating an object in the new system using its API.
- A combination of the two approaches above.
The right choice depends on the migration requirements and the capabilities of the old and new systems.
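The object-by-object approach boils down to a loop of retrieve, transform, and create. Here is a minimal sketch in which `fetch`, `transform`, and `create` are hypothetical callables standing in for the source system API, your transformation rules, and the target system API. A failure on one object is recorded and the loop moves on, so one bad record does not stop the whole run.

```python
from typing import Any, Callable, Iterable


def migrate_objects(
    ids: Iterable[Any],
    fetch: Callable[[Any], dict],
    transform: Callable[[dict], dict],
    create: Callable[[dict], Any],
) -> tuple[list, dict]:
    """Migrate objects one by one.

    Returns the list of migrated ids and a dict of failed id -> error text.
    """
    migrated, failed = [], {}
    for object_id in ids:
        try:
            create(transform(fetch(object_id)))
            migrated.append(object_id)
        except Exception as exc:  # record the failure and continue
            failed[object_id] = str(exc)
    return migrated, failed


# Usage with in-memory stand-ins for the two systems:
source = {1: {"name": "acme"}, 2: {"name": "globex"}}
target: list[dict] = []
migrated, failed = migrate_objects(
    [1, 2, 3],
    fetch=source.__getitem__,  # id 3 does not exist and will fail
    transform=lambda obj: {"title": obj["name"].upper()},
    create=target.append,
)
```

The `failed` dictionary is the starting point for the post-batch report and for deciding what to retry or roll back.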
Data co-existence and cutover processes
A large migration can take days, weeks, or even months to complete. This means there will be a period when some objects are managed by the old system, and some are managed by the new one. Planning co-existence and cutover is crucial for the success of the migration. You need to make sure that migration design and procedures address the following:
- Will it be a one-off migration where everything is migrated in one shot and the old system is shut down, or will you do it in batches? Will you need to move the management of accounts and subscriptions from the old system to the new system batch by batch?
- Once you migrate an object, do you want to make it possible to manage this object from the old system and synchronize these changes to the new system? Or do you want to make sure that once an object is migrated, the old system cannot change it, and all changes can only be done from the new system?
- If the decision is that once an object is migrated, it should not be changed from the old system, how do you achieve this? Just sending an email to the customer asking them to log in to the new system is not enough, because some customers may miss this email. If possible, this should be implemented as a technical restriction: for example, by marking all users of the account as inactive so no one can log in and make changes from the UI.
- For systems that handle account/subscription billing, make sure the migration aligns dates and billing cadence so that there is neither missed billing (a gap in which some period is billed by neither system) nor double billing (an overlap in which some period is billed by both the old and new systems).
- What is your rollback plan in case migration of an object fails and you need to restore it to its original state in the old system?
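The billing alignment point in the list above can be checked mechanically for each migrated subscription. This sketch assumes two hypothetical dates per subscription: the last day already billed by the old system (inclusive) and the first day the new system will bill. Billing is aligned only when the second date is exactly the day after the first.

```python
from datetime import date


def billing_alignment(old_billed_through: date, new_billing_starts: date) -> str:
    """Classify the handover between old-system and new-system billing.

    old_billed_through: last day billed by the old system (inclusive).
    new_billing_starts: first day billed by the new system.
    """
    delta = (new_billing_starts - old_billed_through).days
    if delta == 1:
        return "aligned"
    if delta > 1:
        # Days between the two dates are billed by neither system.
        return f"gap of {delta - 1} day(s)"
    # The new system starts on or before the last day the old one billed.
    return f"overlap of {1 - delta} day(s)"
```

Running this over every subscription in a batch before cutover turns a billing complaint weeks later into a line in the pre-migration report.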
Requirements for systems configuration
When designing and configuring the new system, the focus is on the future. This can lead to overlooking what’s required to bring over existing customers from the old system who may be using “legacy” products.
Think back to the “furniture does not fit through the door of a new house” analogy.
Let’s say you are migrating customers with Microsoft 365 subscriptions. Some customers have subscriptions based on offers that are now discontinued and no longer sold but still work. If you don’t have those offers configured in the new system – can you migrate customers with subscriptions that use these offers? Do you need to configure these offers in the new system but mark them as not available for new sale?
Then the final questions: what will change for customers, are they aware of and informed about it, and do you have a compatible offer configured in the new system for each legacy offer?
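One way to answer the last question before migration day is to compare the offers used by existing subscriptions with what is configured in the new system. The sketch below uses made-up offer identifiers and a hypothetical `offer_map` from legacy offers to new-system offers; it counts the subscriptions that would be blocked because their offer has no mapping, or maps to an offer that is not configured.

```python
from collections import Counter


def unmapped_offers(
    subscriptions: list[dict],
    offer_map: dict[str, str],
    configured_offers: set[str],
) -> dict[str, int]:
    """Return legacy offer id -> number of subscriptions that cannot migrate."""
    blocked: Counter = Counter()
    for subscription in subscriptions:
        legacy_offer = subscription["offer"]
        target_offer = offer_map.get(legacy_offer)
        # Blocked if there is no mapping, or the mapped offer is not set up.
        if target_offer is None or target_offer not in configured_offers:
            blocked[legacy_offer] += 1
    return dict(blocked)
```

An empty result means every subscription in scope has somewhere to land; anything else is a configuration task or a customer conversation to schedule.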
Connectivity requirements across systems
Your connectivity requirements will depend on how the migration is executed and on the location and network configuration of the old system, the new system, and the migration tools (same network, separate isolated networks, or one on-premises and another in the cloud):
- Do you need additional configuration in the old/new system to expose the APIs required for migration activities?
- Is a direct connection possible or do you need a VPN? What is the expected throughput considering the size of the data that will be sent over?
Data security considerations
Here we assume that the new system is already set up to meet security requirements, and we need to make sure that migration processes are aligned with those requirements too.
If you are using an API over a public network:
- Is communication with the API encrypted using a cryptographic protocol that meets your requirements? It is less likely to be a problem on the new system side, but you’ll be surprised how many legacy systems still use APIs over plain HTTP or use SSL 2.0 or SSL 3.0.
- Is the authentication mechanism strong enough?
If not, do you have to switch from public API access to a VPN, or will you use an API gateway to take care of authentication and encryption requirements?
- If you are doing bulk export and uploading data to an intermediate server for further processing before it’s imported to the target system: where is this server located, is that in line with data residency requirements, is data encrypted at rest on this server, and how do you make sure exported data is removed from the server once it’s no longer needed?
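If your migration tool is written in Python, the encryption requirement can be enforced on the client side with the standard `ssl` module. The sketch below assumes your policy is TLS 1.2 or newer (an assumption; substitute your own requirement). A connection to an endpoint that only offers older protocols then fails instead of silently downgrading. The `negotiated_tls_version` helper is an illustrative probe that needs network access to whatever host you pass in.

```python
import socket
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Client context that verifies certificates and refuses TLS below 1.2.

    The TLS 1.2 floor is an assumed policy, not a universal requirement.
    """
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_tls_version(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to host:port and return the negotiated protocol name.

    Raises ssl.SSLError if the server cannot meet the minimum version.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with strict_tls_context().wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

Probing the legacy system’s API endpoint this way during planning tells you early whether you need a VPN or an API gateway in front of it.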
What is the timeframe for data migration?
“How long will it take?” can be answered in the context of the overall migration and project timelines, or in the context of the daily migration batches that need to fit within allocated maintenance windows.
You can use existing performance estimates or benchmark results to calculate durations, but relying on them alone is risky because the performance of your system can differ from abstract benchmark results.
Base duration estimates on test migrations in the staging environment, or on pilot batches in the production environment, and extrapolate the measured times to the planned batch size. Run these tests on a comparable volume. If you are migrating batches of 10,000 accounts, a test migration of five accounts is not a reliable estimate: you will not see performance bottlenecks such as API throttling, running out of worker threads, or database performance degradation.
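The extrapolation itself is simple arithmetic. The sketch below assumes throughput stays roughly constant between the pilot and the planned batch, which is exactly the assumption that breaks when the pilot is too small. The `safety_factor` and `overhead_hours` parameters are illustrative additions for contingency and for fixed activities such as verification.

```python
def estimate_batch_hours(
    pilot_objects: int,
    pilot_seconds: float,
    batch_objects: int,
    safety_factor: float = 1.0,
    overhead_hours: float = 0.0,
) -> float:
    """Linearly extrapolate a pilot run to a planned batch size.

    Assumes constant throughput, so the pilot must be of comparable volume.
    """
    if pilot_objects <= 0 or pilot_seconds <= 0:
        raise ValueError("pilot run must have positive size and duration")
    seconds_per_object = pilot_seconds / pilot_objects
    migration_hours = batch_objects * seconds_per_object / 3600
    return migration_hours * safety_factor + overhead_hours


# A pilot of 2,000 accounts took one hour; the planned batch is 10,000.
hours = estimate_batch_hours(
    pilot_objects=2_000,
    pilot_seconds=3_600,
    batch_objects=10_000,
    safety_factor=1.2,   # 20% contingency
    overhead_hours=1.0,  # verification and reporting
)
```

Comparing the result with the length of your maintenance window tells you whether the planned batch size is realistic.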
Planning migration batches
If you are running a large migration, chances are you won’t be able to complete it all in one run and will need to break it down into batches.
When planning batches, consider the following:
- Batch size, migration performance estimates, and how much time you are given to complete this batch. There will usually be a maintenance or migration window where you can perform these activities to avoid/minimize the impact on day-to-day activities.
- When planning batch size vs migration duration, allocate time for additional activities that may be required depending on your migration design, such as batch validation, post-batch migration verification, troubleshooting problems, generating post-migration reports, etc.
- Object dependencies. For example, you have an account with two subscriptions in the old system. Does your migration and systems co-existence design allow you to migrate the account today and migrate its subscriptions two days later, or do they all have to be migrated within the same migration window so that, once it’s done, the account and its subscriptions are all managed by the new system only?
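If your design requires an account and its subscriptions to move in the same window, batch planning has to treat them as one unit. Here is a minimal sketch, assuming a hypothetical input that maps each account id to its number of subscriptions and a `max_objects` limit derived from your performance estimates and migration window.

```python
def plan_batches(accounts: dict[str, int], max_objects: int) -> list[list[str]]:
    """Group accounts into batches without splitting an account from its
    subscriptions.

    accounts maps account id -> number of subscriptions; each account
    counts as 1 + subscriptions objects towards the batch limit.
    """
    batches: list[list[str]] = []
    current: list[str] = []
    current_size = 0
    for account_id, subscription_count in accounts.items():
        size = 1 + subscription_count
        if size > max_objects:
            raise ValueError(f"account {account_id} alone exceeds the batch limit")
        # Start a new batch if this account would overflow the current one.
        if current and current_size + size > max_objects:
            batches.append(current)
            current, current_size = [], 0
        current.append(account_id)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

This simple first-fit pass keeps the original account order; if you need tighter packing or priority ordering, sort the input before calling it.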
Data migration is a critical process that requires careful planning, preparation, and execution. If you want to talk about any of the information shared, please get in touch: igor.safonov@cloudblue.com.