From MIKE2 Methodology
| This deliverable template is used to describe a sample of the MIKE2.0 Methodology (typically at a task level). More templates are now being added to MIKE2.0 as this has been a frequently requested aspect of the methodology. Contributors are strongly encouraged to assist in this effort.
|
| Deliverable templates are illustrative as opposed to fully representative. Please help add examples to this template that are representative of the proposed output.
|
Overview
Data Enrichment typically refers to supplementing an organisation’s internal data with data from external sources. Types of data typically used for enrichment include:
- Personal data such as date-of-birth and gender codes
- Geographical data
- Postal Data, such as Delivery Point Identifiers (DPID)
- Demographic information
- Economic data
- World event information
Once data has been standardised, corrected and matched, enriching it is essentially the same as loading data from any other source.
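As a minimal sketch of this idea, the example below joins an external reference set onto internal records that have already been matched to a shared key. The record layouts, the `postcode` key, the sample values and the `enrich` helper are all hypothetical and are not part of MIKE2.0.

```python
# Hypothetical internal records, assumed to be standardised, corrected
# and matched already. All names and values are illustrative only.
customers = [
    {"customer_id": 1, "name": "A. Smith", "postcode": "3000"},
    {"customer_id": 2, "name": "B. Jones", "postcode": "2000"},
    {"customer_id": 3, "name": "C. Lee", "postcode": "9999"},
]

# Hypothetical external demographic feed, keyed by postcode.
demographics = {
    "3000": {"median_age": 29, "region": "Metro"},
    "2000": {"median_age": 32, "region": "Metro"},
}


def enrich(records, reference, key):
    """Left-join reference attributes onto records.

    Records with no match in the reference set are kept, and their
    enrichment attributes are set to None.
    """
    attribute_names = sorted(
        {name for attrs in reference.values() for name in attrs}
    )
    enriched = []
    for record in records:
        extra = reference.get(record[key], {})
        row = dict(record)  # copy so the internal record is not modified
        for name in attribute_names:
            row[name] = extra.get(name)
        enriched.append(row)
    return enriched


enriched_customers = enrich(customers, demographics, "postcode")
for row in enriched_customers:
    print(row)
```

Keeping unmatched records with empty enrichment attributes, rather than dropping them, is a design choice made for this sketch so that the core data set is never reduced by the enrichment step.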
Key Deliverables for Data Enrichment include:
- Design for enrichment
- Changes to data model and meta-model
- New data load (refer to ETL solution approach)
Steps in the Process
| Step 1 Determine requirements for Data Enrichment
|
| Objective:
| Understand the business needs and the proposed benefit provided by external data
|
| Input:
| Functional requirements that apply to needs for Data Enrichment
|
| Process:
| Key Steps in the Process include:
- Understand the gaps in existing data when it is mapped to business requirements.
- Define the business value provided by supplementary data and how it maps to business requirements.
- Define options for existing data feeds.
- Provide an estimated cost-benefit analysis for purchasing additional source data. This should be high-level and presented to the team.
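The gap assessment and high-level cost-benefit estimate above could be approximated as in the following sketch. The field names, sample records, and the simple `net_benefit` formula are assumptions made for illustration. A real analysis would use the organisation's own valuation of each enriched record.

```python
def missing_rate(records, field):
    """Return the share of records where the field is absent or empty."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) in (None, ""))
    return missing / len(records)


def net_benefit(records_enriched, value_per_record, annual_cost):
    """Illustrative annual net benefit of purchasing a data source."""
    return records_enriched * value_per_record - annual_cost


# Hypothetical internal records with gaps in two required attributes.
records = [
    {"id": 1, "date_of_birth": "1980-01-01", "dpid": None},
    {"id": 2, "date_of_birth": None, "dpid": None},
    {"id": 3, "date_of_birth": "1975-06-30", "dpid": "12345678"},
    {"id": 4, "date_of_birth": "", "dpid": None},
]

# Gap per attribute: the fraction of records an external source could fill.
gaps = {f: missing_rate(records, f) for f in ("date_of_birth", "dpid")}
print(gaps)

# Assumed figures: 1000 records enriched, 2 units of value each,
# 1500 units annual licence cost.
print(net_benefit(1000, 2, 1500))
```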
|
| Output:
| Selected sources for data enrichment
|
| Step 2 Source Extract Definition and Design
|
| Objective:
| Determine the source to be used, how often it will need to be reloaded, and whether it may contain any data issues.
|
| Input:
| Sources for data enrichment
|
| Process:
| Key Steps in the Process include:
- Determine frequency of source data loads
- Determine whether the loaded data will be the primary set of information in the organisation or whether it needs to be checked against an authoritative source
- Determine method for receiving data (extract load, external connection, etc.)
- Refer to Data Investigation process for definition of source system extracts
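One way to record the decisions listed above is a small structured definition per source. The sketch below is an assumption about how such a definition might look. The class, field names and validation rules are hypothetical and are not prescribed by MIKE2.0.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class DeliveryMethod(Enum):
    """How the enrichment data is received (values taken from the step above)."""
    EXTRACT_LOAD = "extract load"
    EXTERNAL_CONNECTION = "external connection"


@dataclass
class SourceExtractDefinition:
    """Hypothetical record of the design decisions for one enrichment source."""
    source_name: str
    load_frequency_days: int
    delivery_method: DeliveryMethod
    is_primary_source: bool
    authoritative_source: Optional[str] = None
    known_issues: List[str] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of problems with the definition (empty if none)."""
        problems = []
        if self.load_frequency_days <= 0:
            problems.append("load frequency must be a positive number of days")
        if not self.is_primary_source and not self.authoritative_source:
            problems.append(
                "a non-primary source needs an authoritative source to check against"
            )
        return problems


# Illustrative definitions; the source names are made up.
postal_feed = SourceExtractDefinition(
    source_name="postal_reference_feed",
    load_frequency_days=90,
    delivery_method=DeliveryMethod.EXTRACT_LOAD,
    is_primary_source=True,
)
demographic_feed = SourceExtractDefinition(
    source_name="demographic_feed",
    load_frequency_days=365,
    delivery_method=DeliveryMethod.EXTERNAL_CONNECTION,
    is_primary_source=False,
)

print(postal_feed.validate())
print(demographic_feed.validate())
```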
|
| Output:
| Source extract definition and logical design of extract
|
| Step 3 Update Target Model Design
|
| Objective:
| Update target data model to reflect any requirements for enrichment data
|
| Input:
| Functional information requirements
|
| Process:
| The target data model should be defined as part of the initial requirements, but may require some minor changes (especially at the physical level) to accommodate the new data load.
Follow the data modeling process for this area.
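A minor physical-level change of this kind might look like the sketch below, which uses Python's standard `sqlite3` module and an in-memory database. The `customer` table and the added columns are hypothetical. A real project would apply the change through its own data modeling and migration process.

```python
import sqlite3

# In-memory database standing in for the target environment (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, name TEXT, postcode TEXT)"
)

# Hypothetical enrichment attributes to add to the physical model.
new_columns = {"median_age": "INTEGER", "region": "TEXT"}

# PRAGMA table_info returns one row per column; index 1 is the column name.
existing = {row[1] for row in conn.execute("PRAGMA table_info(customer)")}

for column, sql_type in new_columns.items():
    # Only add columns that are not already present, so the script can be
    # re-run safely. Column names come from a fixed dict, not user input.
    if column not in existing:
        conn.execute(f"ALTER TABLE customer ADD COLUMN {column} {sql_type}")

columns_after = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
print(columns_after)
```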
|
| Output:
| Extensions made to target data model
|
| Step 4 Supplement initial data with enriched data
|
| Objective:
| Load enrichment data into the target environment
|
| Input:
| Determination of source extracts, Target Data Model design, Extract Logical Design
|
| Process:
| Follow the ETL solution process for this area
|
| Output:
| Enrichment data added to core data
|
Examples