Open Framework, Information Management Strategy & Collaborative Governance | Data & Social Methodology - MIKE2.0 Methodology

Toward Zero Latency Reporting

From MIKE2.0 Methodology



The Need For An Integrated Data Store

Most organizations draw on a variety of data sources within their reporting infrastructure. These sources typically span many database platforms and data formats, stored on many kinds of media. Data quality may be suspect, and the data may require specific transformation (lookups for referential integrity, aggregation or disaggregation) or cleansing before it can be integrated. Enter the operational data store (ODS). According to Kimball (The Data Warehouse Lifecycle Toolkit (Wiley, 1998)), the ODS is defined as follows:

"The ODS should be a subject-oriented, integrated, frequently updated store of detailed data to support transactions with integrated data. Any detailed data used for decision support should simply be viewed as the lowest atomic level of the data warehouse."

Without such an integrated source of data, operational analyses lead to “stovepipe” decision-making that does not reflect the dynamic environment in which most companies operate.

PeopleSoft offers such an ODS in its Enterprise Performance Management (EPM) product suite through the use of four fully functional warehouse products in addition to the Enterprise Warehouse. (See Figure 1 - Appendix).

The Concept of Latency In Reporting from the ODS

A term more typically associated with web hardware, latency often refers to the quality of a data connection: how quickly a specific node returns a “ping”. In the context of data warehousing and enterprise reporting, we use the term to describe how close to “real time” the refresh of data in the operational data store is. While refreshes of data in the PeopleSoft data store are typically periodic (daily, weekly, monthly), the proliferation of eBusiness has encouraged many organizations to accelerate their refresh rates. Why?

In the context of eBusiness, high quality data enters the system constantly. This data becomes information as the eBusiness professional uses real-time or near-real-time data to make improved business decisions. The eBusiness information becomes a closed-loop decision support system. To support the most dynamic decisions, a more integrated data store is demanded. Enter the “Zero-Latency Enterprise” or the “Low-Latency Enterprise”.

Low latency comes at a price, however. Frequent data refreshes can result in huge data stores. Further, as the spectrum of decisions supported widens, the transformations required often multiply and the de-normalized warehouse blossoms. The costs can grow exponentially.
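To make the notion of latency used here concrete, the following is a minimal sketch in Python. It treats latency as the age of the data in the ODS (the time since the last completed refresh) and shows how quickly the number of ETL runs grows as the refresh interval shrinks. The function names and the sample intervals are illustrative assumptions, not part of PeopleSoft EPM or the MIKE2.0 methodology.

```python
from datetime import datetime, timedelta


def data_latency(last_refresh: datetime, now: datetime) -> timedelta:
    """Age of the data in the ODS: time elapsed since the last completed refresh."""
    if now < last_refresh:
        raise ValueError("last_refresh cannot be later than now")
    return now - last_refresh


def refreshes_per_year(refresh_interval: timedelta) -> float:
    """Number of ETL runs that a given refresh interval implies over 365 days."""
    return timedelta(days=365) / refresh_interval


if __name__ == "__main__":
    # Hypothetical refresh schedules, from periodic to near real time.
    schedules = [
        ("monthly (30 days)", timedelta(days=30)),
        ("weekly", timedelta(weeks=1)),
        ("daily", timedelta(days=1)),
        ("every 5 minutes", timedelta(minutes=5)),
    ]
    for label, interval in schedules:
        runs = refreshes_per_year(interval)
        print(f"{label}: about {runs:,.0f} refreshes per year")
```

Each refresh consumes load time, storage and support effort, so the run count is a rough proxy for the price of lower latency described above.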

Enterprise Reporting Strategy - Toward the Zero Latency Enterprise: Data Frequency vs Data Integration

In reality, most enterprises need a variety of reports and reporting tools. In developing its enterprise reporting strategy, the company needs to discern the relative demands of its reporting tools and tactics, primarily in terms of two things: the level of integration required (to what extent do we need a reporting warehouse such as an ODS or data mart?) and how frequently it would be willing or able to refresh the data in a relatively static data store. (Indeed, as you increase the frequency of refreshes, your data becomes more dynamic.) To describe the enterprise reporting model we can use a four-quadrant grid such as that shown in Figure 1.

ZLE Grid.gif

Figure 1

Typically, most organizations will have reporting needs in each quadrant of the grid. The goal of the organization’s enterprise reporting strategy will be to detail the extent of these needs. To represent these demands, we can overlay a set of circles on the grid representing three types of enterprise reporting: Transaction Reports, Off-line Operational Reports, and Low (or Zero) Latency Reports. The size of each circle and its position can be used to represent the volume and dimensions of each type of report that the organization will require:

ZLE Sizes.gif

Figure 2
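One way to start sizing the circles is to inventory reporting needs and sort each one along the two dimensions of the grid. The sketch below is a simplification under stated assumptions: the `ReportNeed` fields, the one-hour threshold and the hard classification rule are all hypothetical, and in the model itself the circles overlap rather than partition the grid.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class ReportType(Enum):
    TRANSACTION = "Transaction Report"
    OFFLINE_OPERATIONAL = "Off-line Operational Report"
    LOW_LATENCY = "Low (or Zero) Latency Report"


@dataclass(frozen=True)
class ReportNeed:
    name: str
    source_systems: int        # integration axis: systems whose data must be combined
    max_data_age_hours: float  # frequency axis: oldest data the consumers will tolerate


def classify(need: ReportNeed, low_latency_hours: float = 1.0) -> ReportType:
    """Assign a reporting need to one of the three report types (assumed rule)."""
    if need.source_systems <= 1:
        # No integration needed: report straight from the transaction system.
        return ReportType.TRANSACTION
    if need.max_data_age_hours <= low_latency_hours:
        # Integrated data that must also be close to real time.
        return ReportType.LOW_LATENCY
    # Integrated data that tolerates a periodic refresh.
    return ReportType.OFFLINE_OPERATIONAL


def circle_sizes(needs: list[ReportNeed]) -> Counter:
    """Count needs per report type, a rough stand-in for the size of each circle."""
    return Counter(classify(need) for need in needs)


if __name__ == "__main__":
    # Hypothetical inventory of reporting needs.
    inventory = [
        ReportNeed("Open purchase orders", source_systems=1, max_data_age_hours=0.0),
        ReportNeed("Monthly margin by product", source_systems=3, max_data_age_hours=24.0),
        ReportNeed("Web order fulfilment status", source_systems=2, max_data_age_hours=0.25),
    ]
    for report_type, count in circle_sizes(inventory).items():
        print(f"{report_type.value}: {count}")
```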

Ad Hoc Reporting

Ad hoc reports can be found among any of the three types of report groups listed above. The circles typically represent report data rather than report format. Ad hoc reporting requirements need to be addressed when the organization assesses the volume requirements of each of the report categories in the model above.


Figure 3

The intersection of these circles represents areas of potential economy in establishing an enterprise's reporting strategy. Perhaps the organization may want to reduce the volume of reports created in the transaction system and produce some of these out of the ODS in an offline operational reporting system. The case for creating some reports directly out of the transaction system is not diminished. Rather, the reports are created (once) in the environment where they can be most efficiently managed. An additional benefit comes as users become more familiar with the offline data store: information quality can be easily improved by incorporating additional enterprise (integrated) data.

As the organization’s reporting needs expand from the lower left of the grid to the upper right, the need for an integrated operational data store becomes more acute. Perhaps one ODS is required for Zero Latency (Low Latency) reporting and another for Offline Operational Reporting. Perhaps they can be combined into one warehouse, or data store. Regardless, tight integration with the source data systems will be key. This is where the reporting issue becomes more of a generic data warehousing issue (and hence, out of the scope of this paper).

An Integrated Solution for Enterprise Reporting

A fully integrated data warehouse is key to many people’s vision for true integrated enterprise reporting. Certain (non-integrated) transaction reports can remain within the ERP (source) system to meet the needs of the majority of end users. These systems offer the advantages of security (working under the application’s security model), accuracy (reported directly from the transaction detail), accessibility and speed (no refresh of an ODS is required). To the extent that these reports do not require integration with other transaction processing systems, on-line transaction reporting is both efficient and effective.

When the enterprise needs to move to a more integrated data model, a less granular (atomic) level of data is required. Through a series of Extract, Transform and Load maps often bundled with the Warehouse, data can be seamlessly copied from ERP to EPM. This enables several things. First, summary level reporting data (including aggregations and any number of data transformations) is available without interfering with core ERP transaction processes. Second, the data moves into a product that has a more robust metadata layer (within the warehouse) that can be used to facilitate rules-based analytics. Third, the warehouse can be built to leverage the same (or similar) hardware and software to control maintenance, training and support costs. Fourth, this is the beginning of a more denormalized data structure which will facilitate more timely and focused reporting.

As the focus shifts to EPM, however, so does security. In addition to its standard page based security model, EPM also incorporates a robust row level security model. Typically, as the ODS contains a less granular level of detail than the transaction system, row level security in EPM will differ from that in the source OLTP system and will not (as a general rule) replicate the more restrictive OLTP model.

Zero latency is, at this point, conceptual. While there is no reason that the Enterprise Warehouse and ODS could not be used for zero latency reporting, doing so will become more a matter of hardware and support (primarily for the near-constant ETL jobs) than of limitations in your architecture.
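As a rough illustration of what a single ETL map does on the way from ERP to the warehouse, the sketch below applies a referential integrity lookup and then aggregates transaction detail to a summary level. It is written in plain Python under stated assumptions: the field names (`account`, `period`, `amount_cents`) and the reject handling are hypothetical and do not reflect the PeopleSoft EPM ETL maps themselves.

```python
def transform_and_aggregate(transactions, valid_accounts):
    """Check each row against a lookup set, then sum amounts by account and period.

    Returns (totals, rejected): totals maps (account, period) to the summed
    amount in cents; rejected holds rows that failed the lookup.
    """
    totals = {}
    rejected = []
    for row in transactions:
        # Referential integrity lookup: the account must exist in the reference set.
        if row["account"] not in valid_accounts:
            rejected.append(row)
            continue
        # Aggregation: roll transaction detail up to one figure per account and period.
        key = (row["account"], row["period"])
        totals[key] = totals.get(key, 0) + row["amount_cents"]
    return totals, rejected


if __name__ == "__main__":
    # Hypothetical extract from a source transaction system.
    extract = [
        {"account": "1000", "period": "2024-01", "amount_cents": 10000},
        {"account": "1000", "period": "2024-01", "amount_cents": 5000},
        {"account": "9999", "period": "2024-01", "amount_cents": 700},
    ]
    summary, rejects = transform_and_aggregate(extract, valid_accounts={"1000", "2000"})
    print(summary)
    print(f"{len(rejects)} row(s) rejected by the lookup")
```

Running a job like this more often shortens latency but changes nothing about the logic, which is consistent with the point above that zero latency is mainly a question of hardware and support.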
