Data Migration Solution Offering

This Solution Offering currently receives Major Coverage in the MIKE2.0 Methodology. Most required activities are provided through the Overall Implementation Guide and SAFE Architecture, but some Activities are still missing and there are only a few Supporting Assets. In summary, aspects of the Solution Offering can be used but it cannot be used as a whole.
A Creation Guide exists that can be used to help complete this article. Contributors should reference this guide to help complete the article.

Introduction


The Data Migration Solution Offering provides techniques for handling different types of Data Migration scenarios, from simple migrations to complex application co-existence scenarios. It includes techniques for measuring the complexity of a Data Migration initiative and determining which activities are required for that level of complexity. Other enabling techniques include real-time integration, data investigation and re-engineering, and delivery of an integrated Operational Data Store.

Executive Summary

Migrating from the legacy environment to a new system can be a straightforward activity or a very complex initiative. Migration can come in many forms:

  • A migration from a relatively simple system into another system
  • Upgrading a system to a new version through an approach that requires changing the underlying data
  • The convergence of multiple systems into a single composite system
  • A complex migration from one system to a new system, which requires the migration to be rolled out over a period of time
  • Multiple, concurrent system migrations and consolidation efforts; this is referred to as “IT Transformation”

In most large organisations, migration of Enterprise Applications is very complex. To manage this complexity, we first aim to understand the scope of the problem and then formulate initial solution techniques. The MIKE2.0 Solution for Data Migration provides techniques for measuring the complexity of the Data Migration initiative and determining the activities that are required. It also defines the strategic architectural capabilities as well as high-level solution architecture options for solving different data migration challenges. It then moves into the set of required Foundation Activities, Incremental Design and Delivery steps. The Executive Summary presents some of the strategy activities.

Solution Offering Purpose

This is a Core Solution Offering. Core Solution Offerings bring together all assets in MIKE2.0 relevant to solving a specific business and technology problem. Many of these assets may already exist and as the suite is built out over time, assets can be progressively added to an Offering.

A Core Solution Offering contains all the elements required to define and deliver a go-to-market offering. It can use a combination of open, shared and private assets.

Solution Offering Relationship Overview

The MIKE2.0 Data Migration Solution Offering is part of the EDM Solution Group.

The MIKE2.0 Solution Offering for Data Migration describes how the Activities and Supporting Assets of the MIKE2.0 Methodology can be used to deliver successful solutions for migration challenges of varying complexity.

MIKE2.0 Solutions provide a detailed and holistic way of addressing specific problems. MIKE2.0 Solutions can be mapped directly to the Phases and Activities of the MIKE2.0 Overall Implementation Guide, providing additional content to help understand the overall approach.

The MIKE2.0 Overall Implementation Guide explains the relationships between the Phases, Activities and Tasks of the overall methodology, as well as how the Supporting Assets tie to the overall Methodology and MIKE2.0 Solutions. Users of the MIKE2.0 Methodology should use the Overall Implementation Guide and the MIKE2.0 Usage Model as the starting point for projects.

Solution Offering Definition

The MIKE2.0 Methodology can be applied to solve different types of Data Migration problems.

  • For simple migration activities, not all activities from the Overall Implementation Guide are required. The complete migration may take only a single release.
  • For complex migration scenarios, most activities will be required and will be implemented over multiple increments.

Complex migration scenarios often require very sophisticated architectural capabilities, and most migrations of Enterprise Applications are very complex processes.

The MIKE2.0 Solution for Data Migration introduces the overall approach to Data Migration, references the key activities from the Overall Implementation Guide and provides an approach for prioritizing complex migrations, based on business priorities and complexity of the implementation. It also provides a list of the key Supporting Assets to be used as part of the Solution.

Comparison of the Three Orientations

Depending on the level of complexity, different migration orientations are required. At an introductory level, MIKE2.0 classifies orientations as “lite”, “medium” and “heavy”.

Lite Migration Scenario

A lite migration scenario is straightforward: it typically involves loading data from a single source into a single target. Few changes are required in terms of data quality improvement, and the mapping is relatively simple, as is the application functionality to be enabled. Data integration may sit on the back-end of systems and will likely be a once-off, “big bang” movement.

A Lite Scenario for Data Migration

Medium Migration Scenario

A medium migration scenario may involve loading data from a single source into a single target or to multiple systems. Data quality improvement will be performed through multiple iterations, transformation issues may be significant and integration into a common data model is typically complex.

A Medium Scenario for Data Migration

Heavy Migration Scenario

A heavy migration scenario typically involves providing a solution for application co-existence that allows multiple systems to be run in parallel. The integration framework is formulated so the current-state and future-state can work together. The model for a heavy migration scenario is representative of an organisation in IT Transformation.

As heavy migrations are long running and involve a significant data integration effort, it is useful to build a parallel analytical environment to attain a “vertical” view of information.

A Heavy Scenario for Data Migration


It should be noted that a migration effort may start at the lite orientation and move to the next orientation (medium) as requirements are found to be more complex than initially envisaged. An IT Transformation programme may have parts of the effort start concurrently at each of the orientations.

Migration Stages

Migration in MIKE2.0 takes place across multiple stages, and some of the activities from the continuous implementation phases (Phases 3, 4 and 5), such as Data Profiling and Data Re-Engineering, are repeated at multiple steps. The migration process is thought of in four stages: Acquisition, Consolidation, Move and Post-Move. Guidelines for each stage are listed below. Some final activities are often deferred to the Post-Move stage. For heavy migration scenarios, and to some extent for medium-complexity migrations, this migration process is used for historical data and then enables co-existent applications, an arrangement similar in nature to operational data integration.

Acquisition Stage

The acquisition stage is focused on the sourcing of data from the producer. The data is placed in a staging area where the data is scanned and assessed. Judgments are made on the complexity of data quality issues and initially identified data quality problems are addressed.

Consolidation Stage

The consolidation stage focuses on attribute rationalisation into an integrated data store that may be required to bring data together from multiple systems. Key transformations occur and further steps are required for re-engineering data. The data and processes are prepared for migration to the Move environment. Considerable collaboration is needed in those areas where decommissioning occurs.

Move Stage

The move stage focuses on moving the data and application capabilities that have been developed to the production environment. The move stage has a staging area that is as close to production as possible. Final steps around data quality improvement are done in this environment.

Post Move Stage

The post-move stage is focused on the data transformations and quality aspects that were best done after the move to production (but before the system goes live), such as environment-specific data or reference data. Additional process changes or software upgrades may also be required. The skills and toolsets used are the same as the ones used in the prior phases. Attention is paid to the ongoing use of the interfaces created during the transition process.
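
The staged flow above lends itself to being expressed as a simple pipeline. The sketch below is illustrative only: the stage names come from this section, while the function bodies are placeholder assumptions showing where profiling and re-engineering recur.

  # Illustrative sketch (Python) of the four migration stages as a pipeline.
  # Stage names are from the text; the bodies are placeholder assumptions.

  def profile(rows):
      # Stand-in for Data Profiling, which is repeated at multiple stages.
      print(f"profiled {len(rows)} rows")

  def acquisition(source_rows):
      # Source data lands in a staging area where it is scanned and assessed.
      staged = list(source_rows)
      profile(staged)
      return staged

  def consolidation(staged):
      # Attribute rationalisation and key transformations into an integrated store.
      integrated = [dict(row) for row in staged]
      profile(integrated)
      return integrated

  def move(integrated):
      # Promote data to an environment as close to production as possible.
      return [row for row in integrated if row.get("eligible", True)]

  def post_move(moved):
      # Environment-specific and reference-data fixes before go-live.
      return moved

  result = post_move(move(consolidation(acquisition([{"id": 1}, {"id": 2}]))))
  print(result)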

Relationship to Solution Capabilities

Relationship to Enterprise Views

Data Migration to implement a new application will typically involve:

  • Application Development of functionality related to the new target system(s). Functionality changes will also be made in the legacy systems, which may be decommissioned over time for complex migrations.
  • Information Development for modelling, data investigation, data re-engineering and metadata management. More advanced capabilities are required for medium and heavy implementations.

Other than the initial definition of an Application Portfolio and high-level business processes for scoping, the focus of MIKE2.0 is across the Technology Backplane of Information Development and Infrastructure Development. Methods external to MIKE2.0 should be used for Application Development and, to an extent, Infrastructure Development.

In particular for heavy migration scenarios, a comprehensive approach will be required across people, process, organisation and technology, driven by an overall strategy.

Mapping to the Information Governance Framework

Mapping to the SAFE Architecture Framework

Depending on the complexity of the migration effort, different capabilities are required across the SAFE Architecture. For a lite migration, only very basic capabilities are typically required. For a medium-level scenario, at least all foundation capabilities are needed. For a heavy migration, very sophisticated capabilities are needed to fulfil application co-existence requirements and a roll-out of system functionality over time.

A breakdown of typical capabilities required is shown below:

Capability Required | Lite Scenario | Medium Scenario | Heavy Scenario
Data Profiling | Direct Copy of Sources | Key Integrity validated | Referential Integrity required
Data Replication | None | None | Multiple Targets
Data Transfer | To Target | To Target | Target / Downstream
Data Synchronisation | None | None | For Interfaces
Data Transformation | Modest | Significant to similar structures | Major Activity
Data Mapping | Minimal | SME supported | Major Activity
Data Standardisation | None | Key Attributes | All attributes
Pattern Analysis and Parsing | None | None | Yes
Record Matching | None | Based on similar IDs | IDs and pattern matching
Record De-Duping | None | None | Yes
Out of Box Business Rules | As Appropriate | As Appropriate | As Appropriate
Configure Complex Rules | None | Application | Application/Infrastructure
Out of the Box Interfaces | As appropriate | As appropriate | As appropriate
Configure Custom Interfaces | None | Application | Application/Infrastructure
Data Governance Process Model | Documented in high level form | Key or Lynchpin Processes modeled | End to End Models
DB Independent Functions | As Existed at Source | Few Custom APIs | Infrastructure Services
Data Management Reporting | Data Move Metrics only | DQ and DM metrics | Reporting as a Service
Active Metadata Repository | Specific and ‘physical’ | Multiple Passive dictionaries | Initial implementation


It is important that foundation capabilities be in place before more sophisticated capabilities, such as a Services Oriented Architecture and Enabling Technologies, are introduced; the foundation smooths the transition to these more advanced techniques. A number of artefacts from the SAFE Architecture can help define the component capabilities as part of a comprehensive approach.

Mapping to the Overall Implementation Guide

Depending on the complexity of the migration, different activities from MIKE2.0 will be required. Shown below is a high-level description of the key activities for defining the strategy, design and implementation of the migration programme.

Lite Migration Scenario

A lite migration typically involves only the movement of data and minor process changes. Data structures in the consumer are similar to those in the producer, and only straightforward transformations are required. Extracts from sources can largely be produced using out-of-the-box tool capabilities. A Data Quality assessment is done for decision-making purposes.

Only modest data mapping is required since structures and attributes are similar. No standardisation or rationalisation of attributes is performed across multiple sources. Metadata management is focused only on this particular transformation as opposed to a broader effort. The migration involves a once-off movement of data, and most test cases are simple.

In this type of scenario, the amount of strategy work that is required is generally minor (although it is possible that this work is in the context of a larger project). If it is a standalone project and the technologies in place are sufficient, it is possible to go quickly into the activities in Phase 3 of MIKE2.0 and bypass a comprehensive Blueprint.

Key Activities in the Process
Data Migration Lite Scenario

The key activities in this process include:

  1. Extraction of data from producers into a test system area. Sometimes a data store is used as a staging area although staging in files may be appropriate. For a lite migration, there is generally only a single source system although additional enrichment data may also be required for new application functionality. Although the migration is simple, Data Profiling can still be valuable.
  2. The test target system provides the same functionality as the production target. In the test system area all transformations and data quality improvements are performed to get the data into an appropriate form. ETL Logical Design, Physical Design and Construction are straightforward for a lite migration.
  3. Testing is conducted in the test system. This process may involve multiple steps to get data into the new system to test newly built application functionality. Functional Testing, End-to-End Testing and UAT are the key testing activities to be performed; System Integration Testing and SVT (from a migration perspective) are typically not required.
  4. Loading into production may come from either the current-state system or the test environment. It is typically easier to load directly from the test system. As lite migrations are relatively simple, a once-off migration to production is appropriate.
  5. Data is loaded into the production system where some further data quality cleanup may be required. Production Verification Testing is conducted, which should also include functional testing of features that are environment specific. After testing is complete, the system is activated as a live production system.
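
As a minimal sketch of the flow described in the steps above, the following Python example uses sqlite3 in place of the producer and consumer systems; the table, column names and transformation are hypothetical.

  import sqlite3

  src = sqlite3.connect(":memory:")   # stands in for the single source system
  tgt = sqlite3.connect(":memory:")   # stands in for the target system

  # Hypothetical source data.
  src.execute("CREATE TABLE customer (id INTEGER, name TEXT, country TEXT)")
  src.executemany("INSERT INTO customer VALUES (?, ?, ?)",
                  [(1, "Acme Ltd ", "uk"), (2, "Beta GmbH", "de")])

  # The target structure is similar to the source, so mapping is straightforward.
  tgt.execute("CREATE TABLE customer (id INTEGER, name TEXT, country_code TEXT)")

  # Extract, apply a modest transformation and load in a single pass
  # (the once-off, "big bang" movement described above).
  rows = src.execute("SELECT id, name, country FROM customer").fetchall()
  tgt.executemany("INSERT INTO customer VALUES (?, ?, ?)",
                  [(i, n.strip(), c.upper()) for i, n, c in rows])
  tgt.commit()

  print(tgt.execute("SELECT COUNT(*) FROM customer").fetchone()[0], "rows migrated")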
Other Key Activities

Shown below are some of the other key activities required for a lite migration scenario.

Detailed Business Requirements

One of the first activities for a lite migration scenario is to define the Detailed Business Requirements. From an Information Development perspective, the approach focuses on translating the business requirements of the new system into data requirements that must be sourced from the data producer.

Currently this Activity in MIKE2.0 is expressed in a fashion that is better aligned with building an analytical system. Minor changes may therefore be made to the Overall Implementation Guide to better align this Activity with requirements definition for the integration of operational systems during migration.

Solution Architecture Definition/Revision

For smaller initiatives, the Solution Architecture Definition/Revision often acts as the key deliverable for defining the overall approach, key technical concepts and the conceptual design. Although it is ideally done as a follow-on activity from a Business and Technology Blueprint, many shorter engagements start at this more focused level.

For a lite migration, the Solution Architecture will cover the conceptual design of all major components in the system, including infrastructural components and SDLC process design. It will be a translation from the initial set of business requirements at the application level to the technology solution that helps the new application meet its data requirements.

Medium Migration Scenario

A medium migration scenario involves a modest number of producers consolidated into a set of consumer structures. Whilst there is significant attribute rationalisation and standardisation, it is well understood. Extracts from source systems are either straightforward or there is a sufficient shadow copy available to use for Data Quality assessments.

Medium migrations can involve one-time movements of data. Application co-existence strategies are generally not required, although multiple migration runs may be needed to fully decommission the system. Due to this complexity, releases are prioritised and “targets of opportunity” are addressed first. Further data quality work may take place post-move.

Data mapping from producers to consumers may be complex; Subject Matter Experts should be made available to assist in the mapping and the creation of test cases. Systems may be iteratively decommissioned as processes are moved to the target environment.

As a medium migration will often take place over an extended period and involves significant complexity, a Business and Technology Blueprint should first be developed to plan the increments and to ensure that the technologies that exist are sufficient.

For a medium migration scenario, foundation capabilities for Information Development and Infrastructure Development will be required at a minimum to get an effective solution in place.

Key Activities in the Process
Data Migration Medium Scenario

The key activities in this process include:

  1. Extraction of data from producers into a staging area.
  2. The data in the staging area will be profiled to measure down columns, across rows and between tables. This information will be used to determine which business rules and transformations need to be invoked early in the process.
  3. Metadata such as data mapping rules will begin to be established at this time. Data Standards will be agreed to and invoked at this stage in preparation for data movement. All source attributes will be mapped into the target attributes within the metadata management environment.
  4. All agreed transformations and standardisations required to move the data into the staging area for testing and production are implemented. The data is moved into the Integrated Data Store.
  5. Data Profiling is done again and measured against the agreed move success criteria for all steps up to this point. Additional data standardisations are performed to assist in data matching, and data quality is again measured against the agreed criteria. After the standardisations, the rules governing which records cannot or should not be moved are applied (see the sketch after this list); this step is expected to require considerable analysis.
  6. This step involves the actual move of the data into either the testing environment or the production environment of the target consumers. This step could be thought of as 6-T for testing and 6-P for production. The processes and capabilities used will be as identical as possible varying only in the target destination.
  7. Data is loaded into the production system where some further data quality cleanup may be required. Production Verification Testing is conducted, which should also include functional testing of features that are environment specific. After testing is complete, the system is activated as a live production system.
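
The following is an illustrative sketch of the “cannot move / should not move” rules referenced in step 5; the rule names, record structure and thresholds are assumptions.

  # Hypothetical records and exclusion rules; real rule sets are agreed with the business.
  records = [
      {"id": 1, "name": "Acme Ltd", "status": "active"},
      {"id": 2, "name": "", "status": "active"},           # fails a mandatory-attribute rule
      {"id": 3, "name": "Gamma Pty", "status": "purged"},  # excluded by a business rule
  ]

  def cannot_move(record):
      # Hard data quality failure: the record is not fit to load.
      return not record["name"]

  def should_not_move(record):
      # Business exclusion: the record is valid but out of scope for the move.
      return record["status"] == "purged"

  eligible = [r for r in records if not cannot_move(r) and not should_not_move(r)]
  excluded = [r for r in records if cannot_move(r) or should_not_move(r)]

  print(f"{len(eligible)} eligible for the move, {len(excluded)} held back for remediation")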

By storing information in a metadata repository throughout the process, the capabilities delivered for integration and data management become available beyond the project.

Other Key Activities

Some of the key activities from the Overall Implementation Guide include:

Metadata Management

Due to the complexity of the migration effort, significant metadata artefacts are produced relating to data definition, business rules, transformation logic and data quality. This information should be stored in a metadata repository; getting this repository in place from the early stages of the project, as part of the Metadata Driven Architecture, is a key aspect of the MIKE2.0 architectural approach.
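
As a minimal sketch, assuming mapping rules are captured as structured records rather than in spreadsheets, repository entries might look like the following; the field names and sample rules are hypothetical.

  from dataclasses import dataclass, asdict
  import json

  @dataclass
  class MappingRule:
      source_system: str
      source_attribute: str
      target_attribute: str
      transformation: str       # transformation logic, recorded as metadata
      data_quality_rule: str    # associated data quality expectation

  repository = [
      MappingRule("LegacyCRM", "cust_nm", "customer_name",
                  "trim and title-case", "must not be null"),
      MappingRule("LegacyCRM", "cntry", "country_code",
                  "map to ISO 3166-1 alpha-2", "must be a valid code"),
  ]

  # Persisting the rules keeps the metadata available beyond the project itself.
  print(json.dumps([asdict(rule) for rule in repository], indent=2))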

Data Profiling

Data Profiling is an assessment of actual data and data structures, measuring data integrity, consistency, completeness and validity. As part of this process, data quality issues are identified at the individual attribute level, at the table level and between tables. Metadata such as business rules and mapping rules are identified as a by-product of this process. It is important to conduct Data Profiling early in the migration programme to reduce the risk of project failure due to data quality issues. For ongoing migration efforts, data profiling rules are ideally operationalised to monitor data quality over time.
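
The checks below illustrate the three levels of profiling described above (attribute, table and cross-table). The sample data and measures are assumptions; dedicated profiling tools cover far more, but the intent is the same.

  customers = [
      {"id": 1, "country": "UK"},
      {"id": 2, "country": None},
      {"id": 2, "country": "DE"},   # duplicate key
  ]
  orders = [{"order_id": 10, "customer_id": 1}, {"order_id": 11, "customer_id": 9}]

  # Attribute level: completeness of a single column.
  nulls = sum(1 for c in customers if c["country"] is None)
  print(f"country completeness: {1 - nulls / len(customers):.0%}")

  # Table level: key integrity (duplicate identifiers).
  ids = [c["id"] for c in customers]
  print(f"duplicate keys: {len(ids) - len(set(ids))}")

  # Cross-table: referential integrity of orders against customers.
  orphans = [o for o in orders if o["customer_id"] not in set(ids)]
  print(f"orphaned orders: {len(orphans)}")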

Data Re-Engineering

Data Re-Engineering is used to standardise, correct, match, de-duplicate and enrich data for the consumer system. For most migration efforts, some level of re-engineering is required, and it can make up a significant percentage of the work in a complex migration. The process for Data Re-Engineering typically follows the “80/20 rule”, using a repetitive software development lifecycle until the data reaches the level that provides the most business value. This process often involves moving data into a staging area and re-engineering it in an iterative fashion before finally loading it into a production target.
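
A minimal sketch of the standardise, match and de-duplicate steps follows, assuming matching on a simple derived key; real re-engineering also relies on parsing, pattern analysis and fuzzy matching.

  raw = [
      {"id": 1, "name": "  acme ltd ", "country": "uk"},
      {"id": 2, "name": "ACME LTD", "country": "UK"},    # duplicate of record 1
      {"id": 3, "name": "Beta GmbH", "country": "de"},
  ]

  def standardise(record):
      # Correct and standardise key attributes before matching.
      return {**record,
              "name": record["name"].strip().upper(),
              "country": record["country"].upper()}

  def match_key(record):
      # Records that standardise to the same name and country are treated as matches.
      return (record["name"], record["country"])

  survivors = {}
  for record in map(standardise, raw):
      survivors.setdefault(match_key(record), record)   # keep the first survivor

  print(list(survivors.values()))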

Data Modelling

For medium level migration, the Data Modelling process is used to build test and production targets. Sometimes an intermediary data store is built to bring data together from multiple producer systems before loading it into the test system. This data store provides a common, integrated model where data may undergo significant re-engineering.

ETL Design and Development

The Data Integration process for a medium level migration is generally complex enough to warrant a comprehensive design process that covers conceptual, logical and physical design. These integration components are then constructed during Technology Backplane Development.

Testing

Testing is first conducted in the test system. This process may involve multiple steps to get data into the new system to test newly built application functionality. Functional Testing, End-to-End Testing and UAT are the key testing activities to be performed; System Integration Testing and Stress and Volume Testing (from a migration perspective) may be required if there are multiple runs.
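
Although not named above, a common supplement to these testing activities is reconciliation of each migration run, comparing row counts and simple checksums between producer and consumer; the sketch below assumes both sides can be read into comparable tuples.

  import hashlib

  source_rows = [(1, "Acme Ltd"), (2, "Beta GmbH")]   # hypothetical extract from the producer
  target_rows = [(1, "Acme Ltd"), (2, "Beta GmbH")]   # hypothetical read-back from the consumer

  def checksum(rows):
      digest = hashlib.sha256()
      for row in sorted(rows):
          digest.update(repr(row).encode())
      return digest.hexdigest()

  assert len(source_rows) == len(target_rows), "row count mismatch"
  assert checksum(source_rows) == checksum(target_rows), "content mismatch"
  print("reconciliation passed")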

Deployment and Final Verification

Data is loaded into the production system where some further data quality cleanup may be required. Production Verification Testing is conducted, which should also include functional testing of features that are environment specific. After testing is complete, the system is activated as a live production system. This may involve multiple runs for a medium level migration scenario.

Heavy Migration Scenario

Data Migration Heavy Scenario

A heavy migration scenario will require a comprehensive strategy that develops a vision for people, process, organisation and technology. It will often be conducted over a multi-year programme and involve a large number of stakeholders and significant technology changes.

The solution architecture for a heavy migration scenario is sophisticated, as decommissioning a system typically requires several source systems to continue providing data on an ongoing basis in a co-existent application scenario. Data Quality improvement covers both an initial batch process and an ongoing capability, and significant data mapping and rationalisation is needed across multiple systems in the architecture.

Due to the significant effort required to build this integrated environment, organisations should make sure to align this work with other data-centric initiatives. The Integrated Operational Data Store, for example, can be used as a staging area for the analytical Warehouse as well as a hub between co-existent applications. This allows new business functionality to be delivered on the analytical side in parallel with new operational functionality.

Business Assessment and Strategy Definition Blueprint (Phase 1)

All activities from the Business Blueprint phase of MIKE2.0 will be required to define this strategic approach. In the Business Blueprint phase, the focus is on developing an initial information and infrastructure development strategy that is aligned to the specific set of business requirements, many of which are driven from the application development stream. The Organisational QuickScan for Information Development is used to provide an initial assessment of the current state. For complex data migration scenarios, this includes the definition of an Application Portfolio and assessing Data Governance levels through the IM QuickScan. The SAFE Architecture is used as a starting point to define the strategic conceptual architecture at the component level and a set of high-level solution architecture options.

Technology Assessment and Selection Blueprint (Phase 2)

All activities from the Technology Blueprint phase of MIKE2.0 will be required for a heavy data migration scenario, which may involve implementation of a number of new technologies.

In Phase 2 of MIKE2.0, a diligent approach is applied to establish the technology requirements at the level required to make strategic product decisions. Once these are in line with the overall business case, technology selection can take place during this phase. Before these technologies are implemented, standards are put in place for the SDLC process, followed by implementation of the initial baseline infrastructure.

Also in phase 2, the Data Governance activities move from establishing the initial organisation to determining how it will function. The strategic set of standards, policies and procedures for the overall Information Development organisation are first established during this phase. This Information Development Organisation has established reporting lines into the other aspects of the organisation from a management, architecture and delivery perspective.

Roadmap and Foundation Activities

The Roadmap and Foundation Activities provide some of the most critical activities for reducing the risk of the migration programme and providing an integrated conceptual design across multiple solution areas.

In addition to the activities described in the medium level migration scenario, some of the activities for a very complex migration include:

Enterprise Information Architecture

MIKE2.0 takes the approach of building out the Enterprise Information Architecture over time, for each new increment that is implemented as part of the overall programme. The scope for building the Enterprise Information Architecture is defined by the in-scope Key Data Elements (KDEs) that must be integrated across the different systems in the migration architecture.

Solution Architecture Definition

Due to the complexity of the implementation, the Solution Architecture Definition/Revision must incorporate advanced techniques such as definition of a Services Oriented Architecture and Integrated Operational Data Store. Some architectural techniques that may be employed, such as Active Metadata Integration, may have a significant impact on the overall development approach. The design for testing this will also need to be sophisticated and should also be incorporated into the Solution Architecture.

Prototype the Solution Architecture

Due to this complexity, the Prototype the Solution Architecture activity should be used to test the conceptual design. For a heavy data migration it can be particularly effective for testing complex areas such as application co-existence, the functionality of the integrated Operational Data Store and ongoing Data Quality Monitoring.

Design Increment

All design and development activities of MIKE2.0 will be required for this style of migration. In particular, the Services Oriented Architecture Design may be employed due to the ongoing nature of the interfaces and business capabilities that will be built.

The MIKE2.0 approach to a heavy migration recommends that a Business Intelligence environment be built in parallel with the delivery of the operational systems integration. Therefore the Business Intelligence Design activities from the Overall Implementation Guide should be conducted at this stage.

Incremental Development, Testing, Deployment and Improvement

Development of the Technology Backplane and Business Intelligence Application will occur during this phase. The earlier activities around Solution Architecture and Incremental Design feed directly into the build process.

Testing

For a heavy migration scenario, Testing will be very complex. In addition to Functional Testing, End-to-End Testing and UAT, System Integration Testing and SVT will also be required. Testing will need to be performed against historical data loads as well as against the ongoing feeds of data into the system.

Deployment and Final Verification

Data is loaded into the production system where some further data quality cleanup may be required. PVT is conducted, which should also include functional testing of features that are environment specific. After testing is complete, the system is activated as a live production system. Ongoing activities will be required to monitor the system for a heavy migration.

Continuous Improvement

The Continuous Improvement activities will be important for heavy migrations due to the complexity of the programme and its long-running nature. Of particular importance will be the Continuous Improvement of Data Quality and Infrastructure.

Mapping to Supporting Assets

Logical Architecture, Design and Development Best Practices

A number of artefacts help support the MIKE2.0 Solution for Data Migration:

The following MIKE2.0 Solutions should also be referenced:

Complementary MIKE2.0 Solutions related to IT Transformation and Master Data Management may also prove useful.

Product-Specific Implementation Techniques

Product Selection Criteria

Estimating Project Complexity

Data Migration - Business Goals vs. Difficulty

For Data Migration initiatives that involve the replacement of a number of systems, a key part of prioritisation involves balancing the desire for new business capabilities with the complexity of their implementation.

Metrics on Business Alignment and Difficulty help to formulate priorities for the overall implementation of a large-scale transformation programme that involves migration of a large amount of system functionality. This is done by starting with the areas that are most important to the business and of the lowest complexity. Whilst a simple model, this helps to clearly illustrate to the business and technical community how priorities for the project were determined in an objective fashion.


High Level Project Estimating Factors Include:

  • The complexity of the current-state environment
  • The number of critical business functions to be enabled
  • The level of technology sophistication that is required
  • The number of systems to be migrated
  • The amount of data within these systems to be migrated
  • The level of documentation on the systems
  • The availability of Subject Matter Experts
  • The complexity of system interfaces
  • The quality of the data within the systems

A key aspect of building a Blueprint for a Data Migration programme is determining these Estimating Factors. The Data Migration Complexity Estimating Model that is available as part of MIKE2.0 provides a quick mechanism to start to determine migration complexity.
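
As an illustrative sketch only, the Estimating Factors above could be combined into a simple weighted score; the scales, weights and orientation bands below are assumptions, not the actual Data Migration Complexity Estimating Model.

  factors = {                       # each factor scored 1 (low) to 5 (high) by assessors
      "current_state_complexity": 4,
      "critical_business_functions": 3,
      "technology_sophistication": 2,
      "systems_to_migrate": 5,
      "data_volume": 3,
      "documentation_gaps": 4,      # higher = poorer documentation
      "sme_scarcity": 2,            # higher = scarcer Subject Matter Experts
      "interface_complexity": 4,
      "data_quality_issues": 3,     # higher = poorer data quality
  }
  weights = {name: 1.0 for name in factors}   # equal weights for this sketch

  score = sum(factors[name] * weights[name] for name in factors) / sum(weights.values())
  orientation = "lite" if score < 2.0 else "medium" if score < 3.5 else "heavy"
  print(f"complexity score {score:.1f} -> {orientation} orientation")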

Relationships to other Solution Offerings

Similar MIKE2.0 Solutions include:

Used in conjunction with these Solutions, the MIKE2.0 Solution for Data Migration provides an approach for decommissioning-oriented migrations and application co-existence scenarios.
