Data Investigation Component
From MIKE2.0 Methodology
Data Investigation typically forms one of the key first steps in building capabilities for Information Development, through a quantitative assessment of an organisation's data. It also typically involves an ongoing monitoring process that is put in place once the solution has been implemented.
Data Discovery
Data Discovery (Profiling) should be considered a prerequisite to any significant data integration effort. Data Profiling removes uncertainty and assumptions about the current information environment: rather than relying on assumptions, it provides a discovery framework for identifying and analysing data quality issues and making fact-based decisions. This step is required to accurately cost and schedule any data transition or consolidation project.
Use of Data Profiling Tools
Data Profiling often uses a tool-based approach that enables the initial establishment of standards and the initial formulation of metadata. It works by parsing and analysing free-form and single-domain fields, determining the number and frequency of unique values, and classifying or assigning a business meaning to each occurrence of a value within a field. As a result, Data Profiling:
Data Profiling assesses the completeness of each field, which determines the field's “usefulness” for matching purposes. Incomplete fields yield lower aggregate weights for the record, which can then fail to meet the match cut-off requirements. Investigations are performed on both non-standardised and standardised fields. The purpose of investigating field patterns is either to correct those patterns so that they can be standardised and used for matching, or to isolate them for manual data quality improvement.
Information Derived from Data Profiling
Following this approach, a tools-based profiling assessment provides information about data structures and data content:
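As an illustration of the field-level checks described in this section (completeness, the number and frequency of unique values, and field patterns), the following is a minimal sketch in Python. It is not taken from any particular profiling tool: the function names `char_pattern` and `profile_field`, the pattern notation (`A` for a letter, `9` for a digit) and the sample postcode values are all assumptions made for illustration.

```python
from collections import Counter


def char_pattern(value: str) -> str:
    """Reduce a value to a character pattern: letters -> 'A', digits -> '9'.

    The notation is a hypothetical convention chosen for this sketch.
    Other characters (spaces, punctuation) are kept unchanged.
    """
    out = []
    for ch in value:
        if ch.isalpha():
            out.append("A")
        elif ch.isdigit():
            out.append("9")
        else:
            out.append(ch)
    return "".join(out)


def profile_field(values):
    """Profile a single field given as a list of strings (None = missing)."""
    total = len(values)
    # Treat None and blank strings as unpopulated.
    populated = [v.strip() for v in values if v is not None and v.strip() != ""]
    value_counts = Counter(populated)
    pattern_counts = Counter(char_pattern(v) for v in populated)
    return {
        "total": total,
        "populated": len(populated),
        "completeness": len(populated) / total if total else 0.0,
        "distinct": len(value_counts),
        "value_counts": value_counts,
        "pattern_counts": pattern_counts,
    }


# Illustrative sample data only (not from the source).
postcodes = ["2000", "2000", "3004", "", None, "20O0", "NSW 2000"]
profile = profile_field(postcodes)

print(f"completeness: {profile['completeness']:.2f}")
print("most common values:", profile["value_counts"].most_common(2))
print("patterns:", dict(profile["pattern_counts"]))
```

In this sample, the dominant pattern `9999` marks the values that could be standardised directly, while the rarer patterns (a letter typed in place of a digit, a state prefix) are the kind of occurrences that would be corrected or isolated for manual data quality improvement.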
Data Monitoring (Ongoing Data Profiling)
Data Monitoring occurs through the re-use of the processes that facilitated the initial data profiling. The monitoring of known data problems is proactive and has two major objectives:
The data quality baseline and metrics, broken down by various dimensions, are an important part of ongoing DQ monitoring. By scheduling data profiling tasks and comparing the resulting figures to the baseline and to previous metrics, it is possible to identify and address known data quality problems that may reappear in the system.
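The comparison of scheduled profiling results against the baseline can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed MIKE2.0 implementation: the function `compare_to_baseline`, the metric names, the sample figures and the `tolerance` threshold are all hypothetical, and metrics are assumed to be ratios where a higher value is better.

```python
def compare_to_baseline(baseline, current, tolerance=0.02):
    """Return metrics that have dropped more than `tolerance` below baseline.

    `baseline` and `current` map a metric name to a ratio in [0, 1].
    Metrics missing from `current` are skipped rather than flagged.
    """
    regressions = {}
    for name, base_value in baseline.items():
        cur_value = current.get(name)
        if cur_value is None:
            continue
        if base_value - cur_value > tolerance:
            regressions[name] = (base_value, cur_value)
    return regressions


# Hypothetical metrics, keyed by entity, field and data quality dimension.
baseline = {
    "customer.postcode.completeness": 0.98,
    "customer.email.validity": 0.95,
}
current = {
    "customer.postcode.completeness": 0.91,
    "customer.email.validity": 0.96,
}

regressions = compare_to_baseline(baseline, current)
for name, (base_value, cur_value) in regressions.items():
    print(f"{name}: baseline {base_value:.2f}, current {cur_value:.2f}")
```

Running the same comparison after each scheduled profiling task, and keeping the previous results alongside the baseline, gives a simple way to notice when a known data quality problem reappears.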

