
Evaluating Data Practices with Information Governance

From MIKE2.0 Methodology


There really are no stand-alone systems anymore. And it’s not only systems that are related: data is related across the enterprise as well, and these relationships extend across development, testing, and production systems. However, many of the relationships, and much of the data, are dormant and inactive.

Dormant, inactive data just sits there, costing money and energy and complicating debugging and migration activity. This data carries not only an unnecessary storage cost; it is also a security risk (or carries the cost of managing its security).

Strategy efforts tend to focus time and energy on new and glamorous production systems. However, the world most developers live in is one of development and testing systems. And what about those systems around us that have seen better days and should be retired? What about the dormant data that resides in them? Forrester estimates that 85% of data stored in databases is inactive.

Concern about the connection points between systems certainly seems to hold us back from system retirement. Then there are the many copies of systems and data supporting the myriad development and test activities that occur in the organization. You probably don’t need me to list the alphabet soup of testing scenarios most companies run between development and production. Compound that with ongoing project development that needs its own copy of data – in numerous environments – and it’s no wonder one company reported having 70 copies of production data strewn across the organization.

Most organizations need a big picture of their systems and a strategy for effective governance of those important assets: systems and data. There is creeping cost and security risk in current practices that must be addressed, and to that end, we can follow some simple steps:


Develop that big picture

Develop a model of your system relationships and, as you do, keep the 80/20 rule in mind. In this case, it means the most actionable information will not come from exhaustive detail, but from capturing the high points that are either scattered across disparate documents or readily known.
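Even a lightweight model helps. As a minimal sketch, the big picture can start as a directed graph of which system feeds which; the system names below are hypothetical examples, not a prescribed inventory format:

```python
# Hypothetical inventory: each system maps to the systems it feeds data to.
feeds = {
    "CRM": ["Warehouse"],
    "Billing": ["Warehouse", "CRM"],
    "LegacyHR": [],           # feeds nothing downstream
    "Warehouse": ["Reporting"],
    "Reporting": [],          # a leaf, but actively consumed
}

def retirement_candidates(feeds):
    """Systems that neither feed nor are fed by any other system --
    isolated nodes in the dependency graph, and the first place to
    look when reviewing systems for retirement."""
    consumers = {target for targets in feeds.values() for target in targets}
    return sorted(s for s in feeds if not feeds[s] and s not in consumers)

print(retirement_candidates(feeds))  # ['LegacyHR']
```

Capturing just the high points this way is usually enough to surface the isolated, low-activity systems worth a closer look.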

Reevaluate the current practices around copies of data

All too often, organizations are quick to spin up a new copy of data because it is the easiest way to satisfy an urgent requirement. However, looking just a bit deeper often reveals that sharing existing data is possible if the timing is worked out. When I was a software developer, we had an entire department managing “code change control”, allocating instances and test data judiciously and making sure each piece of production code moved through the necessary testing before it was released. You may be in an IT shop rather than a software shop and consequently view that as overkill. However, I suggest that, in larger organizations, it is absolutely essential that this function receive focused attention. Left as task #42 for an overworked project manager, this potentially costly function will turn out to be just that – costly.

Reevaluate the current practices around using production data in non-production environments

OK, now I’m going to add overhead to the project rather than take it away. There is something more important to companies than the storage cost of data: the risk factor. Taking production data outside the production firewall and making it more broadly accessible increases the odds that your company could end up somewhere you don’t want to be.

Manual scrubbing techniques abound. You can randomly scramble key data. You can mix up attribute assignments. However, you must also keep the data representative, or the testing is invalid. That includes making sure referential sets are maintained: you cannot drop all referential integrity when you scrub data.
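One way to preserve referential sets while scrubbing is deterministic masking: the same source value always maps to the same masked value, so foreign keys still line up across tables. This is only a sketch of the idea; the table layouts, column names, and secret are hypothetical, and a real effort would use a proper test data management tool:

```python
import hashlib

def mask(value, secret="rotate-me"):
    """Deterministically pseudonymize a key: same input, same output,
    so relationships between tables survive the scrub."""
    digest = hashlib.sha256(f"{secret}:{value}".encode()).hexdigest()
    return "CUST-" + digest[:8]

# Hypothetical production extracts with a customer/order relationship.
customers = [{"cust_id": "C1001", "name": "Alice"}]
orders    = [{"order_id": "O1", "cust_id": "C1001"}]

masked_customers = [{**c, "cust_id": mask(c["cust_id"]), "name": "REDACTED"}
                    for c in customers]
masked_orders    = [{**o, "cust_id": mask(o["cust_id"])} for o in orders]

# Referential integrity holds: the masked order still points at its customer.
assert masked_orders[0]["cust_id"] == masked_customers[0]["cust_id"]
```

Note the secret folded into the hash: without it, anyone holding the original keys could trivially recompute the masked values and reverse the scrub.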

However, if someone in the company, or their consultants, does the scrubbing manually, it is always open to debate whether the data has (1) been accurately scrubbed AND (2) remained a good enough representation of production. The challenge can come from both perspectives, from the same people, on a very repetitive basis. Trust me, I know; I’ve been living it. My thinking is that something like the IBM InfoSphere Optim Test Data Management Solution would take that conversation, and the energy put into writing routines to meet the two criteria, off the table.

In recent polling, two-thirds of organizations say they are not masking pre-production data! With an extremely efficient black market for information, the people trading in it may need to expand their server capacity.

Develop a cost model for harboring data and seemingly dormant systems

You may not yet know all the interdependencies of some low- to no-priority systems, but you can calculate how much keeping them around is costing you. There are the obvious software and hardware licenses and whatever staff effort goes into maintenance. Then there are the overhead costs associated with every system that is in inventory to any degree.
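A cost model does not need to be elaborate to be useful. The sketch below adds up the cost categories mentioned above for one dormant system; every figure is a hypothetical placeholder, not a benchmark:

```python
def annual_carrying_cost(storage_gb,
                         cost_per_gb_month=0.50,  # storage rate (assumed)
                         license_fees=12_000,     # annual software/hardware licenses
                         maintenance_hours=40,    # annual staff effort
                         hourly_rate=85,
                         overhead_pct=0.15):      # general IT overhead loading
    """Rough annual cost of keeping a dormant system in inventory."""
    direct = (storage_gb * cost_per_gb_month * 12
              + license_fees
              + maintenance_hours * hourly_rate)
    return direct * (1 + overhead_pct)

# A 2 TB dormant system, under the placeholder rates above.
print(round(annual_carrying_cost(storage_gb=2_000)))  # 31510
```

Even with placeholder rates, running this per system makes the creeping cost of “harmless” dormant inventory concrete enough to prioritize retirements.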

Retire unneeded systems

The key word here is ‘unneeded’. Chances are every system that has been introduced into the company and hasn’t formally been retired is still needed somewhere. I’ve had clients who kept entire systems running for many years because a single function was still in use – paying full maintenance on the system for a function that could easily be re-developed in another system where all the data is now maintained.

You also may be able to keep the data for the application, just in case you need to access it, while still retiring the application.

These are a few of the important matters for data governance within an organization. Optimizing performance while controlling costs and mitigating risk is not easy, but it is a necessary function in an organization thriving on information.
