Open Framework, Information Management Strategy & Collaborative Governance | Data & Social Methodology - MIKE2.0 Methodology

Archive for the ‘Master Data Management’ Category

by: Robert.hillard
25  Jul  2015

Don’t seek to know everything about your customer

I hate customer service surveys. Hotels and retailers spend millions trying to speed our checkout or purchase by helping us avoid having to wait around. Then they undo all of that good work by pestering us with customer service surveys which take longer than any queue that they’ve worked so hard to remove!

Perhaps I’d be less grumpy if all of the data that organisations spend so much time collecting (much of that time ours) was actually applied in a way that provided tangible value. The reality is that most customer data simply goes to waste (I argue this in terms of “decision entropy” in chapter 6 of my book, Information-Driven Business).

Customer data is expensive

Many years ago, I interviewed a large bank about their data warehouse. It was the 1990s and the era of large databases was just starting to arrive. The bank had achieved an impressive feat of engineering by building a huge repository of customer data, although they admitted it had cost a phenomenal sum of money to build.

The project was a huge technical success, overcoming many of the performance hurdles that plagued large databases of the time. It was only in the last few minutes of the interview that the real issue started to emerge. The data warehouse investment was in vain: the products that the bank was passionate about taking to its customers were deliberately generic, with little room for customisation. Intimate customer data was of little use in such an environment.

Customer data can be really useful, but it comes at a cost. There is the huge expense of maintaining the data, and there is the goodwill that you draw upon in order to collect it. Perhaps most importantly, processes to identify a customer and manage the relationship add friction to almost every transaction.

Imagine that you own a clothing or electrical goods store. From your vantage point behind the counter you see a customer run up to you with cash in one hand and a product in the other. They look like they’re in a hurry and thrust the cash at you. Do you a) take the cash and thank them; or b) ask them to stop before they pay and register for your loyalty programme, often complete with a username and password? Obviously you should go with option a, yet so many retailers go with option b. Online businesses at least have the excuse that they can’t see the look of urgency and frustration in their customers’ eyes; it is impossible to fathom why so many bricks-and-mortar stores make the same mistake!

Commoditised relationships aren’t bad

Many people argue that Apple stores are close to best practice when it comes to retail, yet for most of the customer interaction the store staff member doesn’t know anything about the individual’s identity. It is not until the point of purchase that they actually access any purchase history. The lesson is that if the service is commoditised it is better to avoid cluttering the process with extraneous information.

Arguably the success of discount air travel has been the standardisation of the experience. Those who spend much of their lives emulating the movie Up in the Air want to be recognised. For the rest of the population, who just want to get to their destination at the lowest price possible while keeping a small amount of comfort and staying safe, a commoditised service is ideal. Given the product is not customised there is little need to know much about the individual customers. Aggregate data for demand forecasting can often be gained in more efficient ways including third party sources.

Do more with less

Online and in person, organisations are collecting more data than ever about their customers. Many of these organisations would do better to focus on a few items of data and build true relationships by understanding everything they can from this small number of key data elements. I’ve previously argued for the use of a magic 150 or “Dunbar’s number” (see The rule of 150 applied to data). If they did this, not only would they be more effective in their use of their data, they could also be more transparent about what data they collect and the purposes to which they put it.

People increasingly have a view of the value of their information and they often end up resenting its misuse. Perhaps the only thing worse than misusing it is not using it at all. So much information is collected that causes resentment when the customer doesn’t get the obvious benefit that should have been derived from it. Nothing frustrates people more than having to tell their providers things that are obvious from the information they have already been asked for, such as their interests, family relationships or location.

Organisations that don’t heed this will face a backlash as people seek to regain control of their own information (see You should own your own data).

Customers value simplicity

In this age of complexity, customers are often willing to trade convenience for simplicity. Many people are perfectly happy to be a guest at the sites they use infrequently, even though they have to re-enter their details each time, rather than having to remember yet another login. They like these relationships to be cheerfully transactional and want their service providers to respect them regardless.

The future is not just more data, it is more tailored data with less creepy insight and a greater focus on a few meaningful relationships.

Category: Enterprise Data Management, Enterprise2.0, Information Development, Master Data Management
No Comments »

by: Alandduncan
01  Jul  2014

Information Requirements Gathering: The One Question You Must Never Ask!

Over the years, I’ve tended to find that asking any individual or group the question “What data/information do you want?” gets one of two responses:

“I don’t know.” Or;

“I don’t know what you mean by that.”

End of discussion, meeting over, pack up, go home, nobody is any the wiser. Result? IT makes up the requirements based on what they think the business should want, the business gets all huffy because IT doesn’t understand what they need, and general disappointment and resentment ensues.

Clearly for Information Management & Business Intelligence solutions, this is not a good thing.

So I’ve stopped asking the question. Instead, when doing requirements gathering for an information project, I go through a workshop process that follows the following outline agenda:

Context setting: Why information management / Business Intelligence / Analytics / Data Governance* is generally perceived to be a “good thing”. This is essentially a very quick précis of the BI project mandate, and should aim at putting people at ease by answering the question “What exactly are we all doing here?”

(*Delete as appropriate).

Business Function & Process discovery: What do people do in their jobs – functions & tasks? If you can get them to explain why they do those things – i.e. to what end purpose or outcome – so much the better (though this can be a stretch for many.)

Challenges: what problems or issues do they currently face in their endeavours? What prevents them from succeeding in their jobs? What would they do differently if they had the opportunity to do so?

Opportunities: What is currently good? What existing capabilities (systems, processes, resources) are in place that could be developed further, or re-used/re-purposed to help achieve the desired outcomes?

Desired Actions: What should happen next?

As a consultant, I see it as part of my role to inject ideas into the workshop dialogue too, using a couple of question forms specifically designed to provoke a response:

“What would happen if…X”

“Have you thought about…Y”

“Why do you do/want…Z”.

Notice that as the workshop discussion proceeds, the participants will naturally start to explore aspects that relate to later parts of the agenda – this is entirely ok. The agenda is there to provide a framework for the discussion, not a constraint. We want people to open up and spill their guts, not clam up. (Although beware of the “rambler” who just won’t shut up but never gets to the point…)

Notice also that not once have we actively explored the “D” or “I” words. That’s because as you explore the agenda, any information requirements will either naturally fall out of the discussion as it proceeds, or else you can infer the information requirements arising from the other aspects of the discussion.

As the workshop attendees explore the different aspects of the session, you will find that the discussion will touch upon a number of different themes, which you can categorise and capture on-the-fly (I tend to do this on sheets of butcher’s paper tacked to the walls, so that the findings are shared and visible to all participants). Comments will typically fall into the following broad categories:

* Functions: Things that people do as part of doing business.
* Stakeholders: people who are involved (including helpful people elsewhere in the organisation – follow up with them!)
* Inhibitors: Things that currently prevent progress (these either become immediate scope-change items if they are show-stoppers for the current initiative, or else they form additional future project opportunities to raise with management)
* Enablers: Resources to make use of (e.g. data sets that another team hold, which aren’t currently shared)
* Constraints: “non-negotiable” aspects that must be taken into account. (Note: I tend to find that all constraints are actually negotiable and can be overcome if there is enough desire, money and political will.)
* Considerations: Things to be aware of that may have an influence somewhere along the line.
* Source systems: places where data comes from
* Information requirements: Outputs that people want

Here’s a (semi) fictitious example:

e.g. ADD: “What does your team do?”

Workshop Victim Participant #1: “Well, we’re trying to reconcile the customer account balances with the individual transactions.”

ADD: “And why do you want to do that?”

Workshop Victim Participant #2: “We think there’s a discrepancy in the warehouse stock balances, compared with what’s been shipped to customers. The sales guys keep their own database of customer contracts and orders and Jim’s already given us a dump of the data, while finance run the accounts receivables process. But Sally the Accounts Clerk doesn’t let the numbers out under any circumstances, so basically we’re screwed.”

Functions: Sales Processing, Contract Management, Order Fulfilment, Stock Management, Accounts Receivable.
Stakeholders: Warehouse team, Sales team (Jim), Finance team.
Inhibitors: Finance don’t collaborate.
Enablers: Jim is helpful.
Source Systems: Stock System, Customer Database, Order Management, Finance System.
Information Requirements: Orders (Quantity & Price by Customer, by Salesman, by Stock Item), Dispatches (Quantity & Price by Customer, by Salesman, by Warehouse Clerk, by Stock Item), Financial Transactions (Value by Customer, by Order Ref)
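The captured findings above lend themselves to a simple structure for later collation. Here is a minimal Python sketch: the category names follow the list given earlier, the entries are taken from the example dialogue, and the gap-checking helper is my own illustrative addition, not part of any formal method.

```python
# Minimal sketch: workshop findings captured under the broad categories
# described above. All entries are from the (semi) fictitious example.
findings = {
    "Functions": ["Sales Processing", "Contract Management", "Order Fulfilment",
                  "Stock Management", "Accounts Receivable"],
    "Stakeholders": ["Warehouse team", "Sales team (Jim)", "Finance team"],
    "Inhibitors": ["Finance don't collaborate"],
    "Enablers": ["Jim is helpful"],
    "Source systems": ["Stock System", "Customer Database",
                       "Order Management", "Finance System"],
    "Information requirements": ["Orders", "Dispatches", "Financial Transactions"],
}

def gaps(findings, expected=("Functions", "Stakeholders", "Inhibitors",
                             "Enablers", "Constraints", "Considerations",
                             "Source systems", "Information requirements")):
    """Return the categories with no comments yet -- each empty category
    is a prompt for a follow-up question before the workshop closes."""
    return [c for c in expected if not findings.get(c)]

print(gaps(findings))
```

Running the gap check against the example shows that Constraints and Considerations never came up in the dialogue, which is exactly the kind of thing to probe before packing up the butcher’s paper.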

You will also probably end up with the attendees identifying a number of immediate self-assigned actions arising from the discussion – good ideas that either haven’t occurred to them before or have sat on the “To-Do” list. That’s your workshop “value add” right there….

e.g.
Workshop Victim Participant #1: “I could go and speak to the Financial Controller about getting access to the finance data. He’s more amenable to working together than Sally, who just does what she’s told.”

Happy information requirements gathering!

Category: Business Intelligence, Data Quality, Enterprise Data Management, Information Development, Information Governance, Information Management, Information Strategy, Information Value, Master Data Management, Metadata
No Comments »

by: Alandduncan
24  May  2014

Data Quality Profiling: Do you trust in the Dark Arts?

Why estimating Data Quality profiling doesn’t have to be guess-work

Data Management lore would have us believe that estimating the amount of work involved in Data Quality analysis is a bit of a “Dark Art,” and to get a close enough approximation for quoting purposes requires much scrying, haruspicy and wet-finger-waving, as well as plenty of general wailing and gnashing of teeth. (Those of you with a background in Project Management could probably argue that any type of work estimation is just as problematic, and that in any event work will expand to more than fill the time available…).

However, you may no longer need to call on the services of Severus Snape or Mystic Meg to get a workable estimate for data quality profiling. My colleague from QFire Software, Neil Currie, recently put me onto a post by David Loshin on SearchDataManagement.com, which proposes a more structured and rational approach to estimating data quality work effort.

At first glance, the overall methodology that David proposes is reasonable in terms of estimating effort for a pure profiling exercise – at least in principle. (It’s analogous to similar “bottom-up” calculations that I’ve used in the past to estimate ETL development on a job-by-job basis, or creation of standard Business Intelligence reports on a report-by-report basis).

I would observe that David’s approach is predicated on the (big and probably optimistic) assumption that we’re only doing the profiling step. The follow-on stages of analysis, remediation and prevention are excluded – and in my experience, that’s where the real work most often lies! There is also the assumption that a pre-existing checklist of assessment criteria exists – and developing the library of quality check criteria can be a significant exercise in its own right.

However, even accepting the “profiling only” principle, I’d also offer a couple of additional enhancements to the overall approach.

Firstly, even with profiling tools, the inspection and analysis process for any “wrong” elements can go a lot further than just a 10-minute-per-item-compare-with-the-checklist, particularly in data sets with a large number of records. Also, there’s the question of root-cause diagnosis (And good DQ methods WILL go into inspecting the actual member records themselves). So for contra-indicated attributes, I’d suggest a slightly extended estimation model:

* 10 mins: for each “Simple” item (standard format, no applied business rules, fewer than 100 member records)
* 30 mins: for each “Medium” complexity item (unusual formats, some embedded business logic, data sets up to 1000 member records)
* 60 mins: for any “Hard” high-complexity items (significant, complex business logic, data sets over 1000 member records)
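The extended per-item model above can be sketched as a quick estimator. The minutes per band come straight from the list; the classification rule and the sample attribute list are my own illustrative assumptions, not David’s formula.

```python
# Sketch of the extended per-item estimation model described above.
# Minutes per contra-indicated attribute, by complexity band.
MINUTES = {"simple": 10, "medium": 30, "hard": 60}

def classify(record_count, has_business_rules, standard_format):
    """Rough banding following the criteria in the list above
    (an assumed interpretation -- adjust thresholds to taste)."""
    if record_count > 1000 or (has_business_rules and not standard_format):
        return "hard"
    if record_count > 100 or has_business_rules or not standard_format:
        return "medium"
    return "simple"

def profiling_estimate(items):
    """Total profiling effort in hours for a list of attributes,
    each given as (member records, business rules?, standard format?)."""
    total_minutes = sum(MINUTES[classify(*item)] for item in items)
    return total_minutes / 60.0

# Illustrative attribute list: one simple, one medium, one hard item.
items = [(50, False, True), (500, True, True), (5000, True, False)]
print(profiling_estimate(items))  # hours
```

As with any bottom-up estimate, the value is less in the arithmetic than in forcing you to enumerate and band every attribute before quoting.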

Secondly, and more importantly – David doesn’t really allow for the human factor. It’s always people that are bloody hard work! While it’s all very well to do a profiling exercise in-and-of-itself, the results need to be shared with human beings – presented, scrutinised, questioned, validated, evaluated, verified, justified. (Then acted upon, hopefully!) And even allowing for the set-aside of the “Analysis” stages onwards, there will need to be some form of socialisation within the “Profiling” phase.

That’s not a technical exercise – it’s about communication, collaboration and co-operation. Which means it may take an awful lot longer than just doing the tool-based profiling process!

How much socialisation? That depends on the number of stakeholders, and their nature. As a rule-of-thumb, I’d suggest the following:

* Two hours of preparation per workshop (if the stakeholder group is “tame”. Double it if there are participants who are negatively inclined).
* One hour face-time per workshop (Double it for “negatives”)
* One hour post-workshop write-up time per workshop
* One workshop per 10 stakeholders.
* Two days to prepare any final papers and recommendations, and present to the Steering Group/Project Board.
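That rule-of-thumb can be turned into a quick calculation. The hours and ratios below are the ones listed above; the assumption that the “negatives” doubling applies to both preparation and face-time per workshop, and the eight-hour working day, are mine.

```python
import math

def socialisation_estimate(stakeholders, negatives_present=False, hours_per_day=8):
    """Sketch of the rule-of-thumb above: socialisation effort in hours.

    stakeholders      -- number of stakeholders to cover (one workshop per 10)
    negatives_present -- double prep and face-time if any participants
                         are negatively inclined
    """
    workshops = math.ceil(stakeholders / 10)    # one workshop per 10 stakeholders
    prep = 2 * (2 if negatives_present else 1)  # preparation hours per workshop
    face = 1 * (2 if negatives_present else 1)  # face-time hours per workshop
    write_up = 1                                # post-workshop write-up per workshop
    final_papers = 2 * hours_per_day            # two days for papers & Steering Group
    return workshops * (prep + face + write_up) + final_papers

print(socialisation_estimate(25))                          # a "tame" group
print(socialisation_estimate(25, negatives_present=True))  # with hostiles
```

Even for a modest 25-stakeholder engagement, the socialisation effort lands well north of a week – which is rather the point of the argument above.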

That’s in addition to David’s formula for estimating the pure data profiling tasks.

Detailed root-cause analysis (Validate), remediation (Protect) and ongoing evaluation (Monitor) stages are a whole other ball-game.

Alternatively, just stick with the crystal balls and goats – you might not even need to kill the goat anymore…

Category: Business Intelligence, Data Quality, Enterprise Data Management, Information Development, Information Governance, Information Management, Information Strategy, Information Value, Master Data Management, Metadata
No Comments »

by: Alandduncan
16  May  2014

Five Whiskies in a Hotel: Key Questions to Get Started with Data Governance

A “foreign” colleague of mine once told me a trick his English language teacher taught him to help him remember the “questioning words” in English. (To the British, anyone who is a non-native speaker of English is “foreign.” I should also add that as a Scotsman, English is effectively my second language…).

“Five Whiskies in a Hotel” is the clue – i.e. five questioning words begin with “W” (Who, What, When, Why, Where), with one beginning with “H” (How).

These simple question words give us a great entry point when we are trying to capture the initial set of issues and concerns around data governance – what questions are important/need to be asked.

* What data/information do you want? (What inputs? What outputs? What tests/measures/criteria will be applied to confirm whether the data is fit for purpose or not?)
* Why do you want it? (What outcomes do you hope to achieve? Does the data being requested actually support those questions & outcome? Consider Efficiency/Effectiveness/Risk Mitigation drivers for benefit.)
* When is the information required? (When is it first required? How frequently? Particular events?)
* Who is involved? (Who is the information for? Who has rights to see the data? Who is it being provided by? Who is ultimately accountable for the data – both contents and definitions? Consider multiple stakeholder groups in both recipients and providers)
* Where is the data to reside? (Where is it originating from? Where is it going to?)
* How will it be shared? (How will the mechanisms/methods work to collect/collate/integrate/store/disseminate/access/archive the data? How should it be structured & formatted? Consider Systems, Processes and Human methods.)

Clearly, each question can generate multiple answers!
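For workshop use, the six question words can be carried around as a reusable checklist. Here is a minimal Python sketch – the prompts are condensed from the list above, while the structure and the helper function are my own illustrative choices.

```python
# Sketch: the "Five Whiskies in a Hotel" question set as a reusable checklist.
# Prompts are condensed from the data governance questions above.
GOVERNANCE_QUESTIONS = {
    "What":  "What data/information is wanted? What inputs, outputs and fitness tests?",
    "Why":   "What outcomes are hoped for? Efficiency, effectiveness or risk mitigation?",
    "When":  "When is it first required, and how frequently thereafter?",
    "Who":   "Who is it for, who provides it, and who is accountable for it?",
    "Where": "Where does the data originate, and where is it to reside?",
    "How":   "How will it be collected, structured, shared and archived?",
}

def interview_sheet(questions=GOVERNANCE_QUESTIONS):
    """Yield (question word, prompt, answers) rows ready for workshop capture --
    each question can, of course, generate multiple answers."""
    return [(word, prompt, []) for word, prompt in questions.items()]

for word, prompt, answers in interview_sheet():
    print(f"{word}? {prompt}")
```

The empty answers list per row is deliberate: the checklist only earns its keep once the workshop fills it in.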

Aside: in the Doric dialect of North-East of Scotland where I originally hail from, all the “question” words begin with “F”:
Fit…? (What?) e.g. “Fit dis yon feel loon wint?” (What does that silly chap want?)
Fit wye…? (Why?) e.g. “Fit wye div ye wint a’thin’?” (Why do you want everything?)
Fan…? (When?) e.g. “Fan div ye wint it?” (When do you want it?)
Fa…? (Who?) e.g. “Fa div I gie ‘is tae?” (Who do I give this to?)
Far…? (Where?) e.g. “Far aboots dis yon thingumyjig ging?” (Where exactly does that item go?)
Foo…? (How?) e.g. “Foo div ye expect me tae dae it by ‘e morn?” (How do you expect me to do it by tomorrow?)

Whatever your native language, these key questions should get the conversation started…

Remember too, the homily by Rudyard Kipling:

https://www.youtube.com/watch?v=e6ySN72EasU

Category: Business Intelligence, Data Quality, Enterprise Data Management, Information Development, Information Governance, Information Management, Information Strategy, Information Value, Master Data Management, Metadata
No Comments »

by: Alandduncan
13  May  2014

Is Your Data Quality Boring?

https://www.youtube.com/watch?v=vEJy-xtHfHY

Is this the kind of response you get when you mention to people that you work in Data Quality?!

Let’s be honest here. Data Quality is good and worthy, but it can be a pretty dull affair at times. Information Management is something that “just happens”, and folks would rather not know the ins-and-outs of how the monthly Management Pack gets created.

Yet I’ll bet that they’ll be right on your case when the numbers are “wrong”.

Right?

So here’s an idea. The next time you want to engage someone in a discussion about data quality, don’t start by discussing data quality. Don’t mention the processes of profiling, validating or cleansing data. Don’t talk about integration, storage or reporting. And don’t even think about metadata, lineage or auditability. Yaaaaaaaaawn!!!!

Instead of concentrating on telling people about the practitioner processes (which of course are vital, and fascinating no doubt if you happen to be a practitioner), think about engaging in a manner that is relevant to the business community, using language and examples that are business-oriented. Make it fun!

Once you’ve got the discussion flowing in terms of the impacts, challenges and inhibitors that get in the way of successful business operations, then you can start to drill into the underlying data issues and their root causes. More often than not, a data quality issue is symptomatic of a business process failure rather than being an end in itself. By fixing the process problem, the business user gains a benefit, and the data is enhanced as a by-product. Everyone wins (and you didn’t even have to mention the dreaded DQ phrase!)

Data Quality is a human thing – that’s why it’s hard. As practitioners, we need to be communicators. Lead the thinking, identify the impact and deliver the value.

Now, that’s interesting!

https://www.youtube.com/watch?v=5sGjYgDXEWo

Category: Business Intelligence, Data Quality, Enterprise Data Management, Information Governance, Information Management, Information Strategy, Information Value, Master Data Management, Metadata
No Comments »

by: Alandduncan
12  May  2014

The Information Management Tube Map

Just recently, Gary Allemann posted a guest article on Nicola Askham’s Blog, which made an analogy between Data Governance and the London Tube map. (Nicola is also on Twitter. See also Gary Allemann’s blog, Data Quality Matters.)

Up until now, I’ve always struggled to think of a way to represent all of the different aspects of Information Management/Data Governance; the environment is multi-faceted, with the interconnections between the component capabilities being complex and not hierarchical. I’ve sometimes alluded to there being a network of relationships between elements, but this has been a fairly abstract concept that I’ve never been able to adequately illustrate.

And in a moment of perspiration, I came up with this…

http://informationaction.blogspot.com.au/2014/03/the-information-management-tube-map.html

I’ll be developing this further as I go but in the meantime, please let me know what you think.

(NOTE: following on from Seth Godin’s plea for more sharing of ideas, I am publishing the Information Management Tube Map under Creative Commons License Attribution Share-Alike V4.0 International. Please credit me where you use the concept, and I would appreciate it if you could reference back to me with any changes, suggestions or feedback. Thanks in advance.)

Category: Business Intelligence, Data Quality, Enterprise Data Management, Information Development, Information Governance, Information Management, Information Strategy, Information Value, Master Data Management, Metadata
No Comments »

by: Alandduncan
29  Mar  2014

Now that’s magic!

When I was a kid growing up in the UK, Paul Daniels was THE television magician. With a combination of slick high drama illusions, close-up trickery and cheeky end-of-the-pier humour, (plus a touch of glamour courtesy of The Lovely Debbie McGee TM), Paul had millions of viewers captivated on a weekly basis and his cheeky catch-phrases are still recognised to this day.

Of course, part of the fascination of watching a magician perform is to wonder how the trick works. “How the bloody hell did he do that?” my dad would splutter as Paul Daniels performed yet another goofy gag or hair-raising stunt (no mean feat, when you’re as bald as a coot…) But most people don’t REALLY want to know the inner secrets, and even fewer of us are inspired to spray a riffle-shuffled pack of cards all over granny’s lunch, stick a coin up our nose or grab the family goldfish from its bowl and hide it in the folds of our nether-garments. (Um, yeah. Let’s not go there…)

Penn and Teller are great of course, because they expose the basic techniques of really old, hackneyed tricks and force more innovation within the magician community. They’re at their most engaging when they actually do something that you don’t get to see the workings of. Illusion maintained, audience entertained.

As data practitioners, I think we can learn a few of these tricks. I often see us getting too hot-and-bothered about differentiating data, master data, reference data, metadata, classification scheme, taxonomy, dimensional vs relational vs data vault modelling etc. These concepts are certainly relevant to our practitioner world, but I don’t necessarily believe they need to be exposed at the business-user level.

For example, I often hear business users talking about “creating the metadata” for an event or transaction, when they’re talking about compiling the picklist of valid descriptive values and mapping these to the contextualising descriptive information for that event (which by my reckoning, really means compiling the reference data!). But I’ve found that business people really aren’t all that bothered about the underlying structure or rigour of the modelling process.

That’s our job.

There will always be exceptions. My good friend and colleague Ben Bor is something of a special case and has the talent to combine data management and magic.

But for the rest of us mere mortals, I suggest that we keep the deep discussion of data techniques for the Data Magic Circle, and just let the paying customers enjoy the show….

Category: Business Intelligence, Data Quality, Enterprise Data Management, Information Development, Information Governance, Information Management, Information Strategy, Information Value, Master Data Management, Metadata
1 Comment »

by: Alandduncan
04  Mar  2014

The (Data) Doctor Is In: ADD looks for a data diagnosis…

Being a data management practitioner can be tough.

You’re expected to work your data quality magic, solve other people’s data problems, and help people get better business outcomes. It’s a valuable, worthy and satisfying profession. But people can be infuriating and frustrating, especially when the business user isn’t taking responsibility for the quality of their own data.

It’s a bit like being a Medical Doctor in general practice.

The patient presents with some early indicative symptoms. The MD then performs a full diagnosis and recommends a course of treatment. It’s then up to the patient whether or not they take their MD’s advice…

AlanDDuncan: “Doctor, Doctor. I get very short of breath when I go upstairs.”
MD: “Yes, well. Your Body Mass Index is over 30, you’ve got consistently high blood pressure, your heartbeat is arrhythmic, and your cholesterol levels are off the scale.”
ADD: “So what does that mean, doctor?”
MD: “It means you’re fat, you drink like a fish, you smoke like a chimney, your diet consists of fried food and cakes and you don’t do any exercise.”
ADD: “I’m Scottish.”
MD: “You need to change your lifestyle completely, or you’re going to die.”
ADD: “Oh. So, can you give me some pills?….”

If you’re going to get healthy with your data, you’re going to have to put the pies down, step away from the Martinis and get off the couch, folks.

Category: Business Intelligence, Data Quality, Information Development, Information Governance, Information Management, Information Strategy, Information Value, Master Data Management, Metadata
No Comments »

by: John McClure
21  Dec  2013

The RDF PROVenance Ontology

“At the toolbar (menu, whatever) associated with a document there is a button marked ‘Oh, yeah?’. You press it when you lose that feeling of trust. It says to the Web, ‘so how do I know I can trust this information?’. The software then goes directly or indirectly back to metainformation about the document, which suggests a number of reasons.” – Tim Berners-Lee, 1997

“The problem is – and this is true of books and every other medium – we don’t know whether the information we find [on the Web] is accurate or not. We don’t necessarily know what its provenance is.” – Vint Cerf

The World Wide Web Consortium (W3C) hit another home run when the RDF PROVenance Ontology officially joined the Resource Description Framework family of standards last May. This timely publication proposes a data model well-suited to its task: representing provenance metadata about any resource. Provenance data for a thing relates directly to its chain of ownership, its development or treatment as a managed resource, and its intended uses and audiences. Provenance data is a central requirement for any trust-ranking process, which is most often applied to digital resources sourced from outside an organization.

The PROV Ontology is bound to have important impacts on existing provenance models in the field, including Google’s Open Provenance Model Vocabulary; DERI’s X-Prov and W3P vocabularies; the open-source SWAN Provenance, Authoring and Versioning Ontology and Provenance Vocabulary; Inference Web’s Proof Markup Language-2 Ontology; the W3C’s now outdated RDF Datastore Schema; among others. As a practical matter, the PROV Ontology is already the underlying model for the bio-informatics industry as implemented at Oxford University, a prominent thought-leader in the RDF community.

At the core of the PROV Ontology is a conceptual data model with semantics instantiated by serializations including RDF and XML, plus a notation aimed at human consumption. These serializations are used by implementations to interchange provenance data. To help developers and users create valid provenance, a set of constraints is defined, useful for the creation of provenance validators. Finally, to further support the interchange of provenance, additional definitions are provided for protocols to locate, access and connect multiple provenance descriptions and, most importantly, for how to interoperate with the two widely used Dublin Core metadata vocabularies.
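To give a flavour of the RDF serialization, a provenance chain is just a set of subject–predicate–object statements. The sketch below models one with plain Python tuples rather than an RDF library; the resource names (ex:report, ex:analysis, ex:alice) are hypothetical, while the predicates (prov:wasGeneratedBy, prov:wasAttributedTo, prov:used) are terms defined by the PROV Ontology.

```python
# Sketch: a PROV-style provenance chain as plain (subject, predicate, object)
# triples. Resource names are hypothetical; predicates are PROV Ontology terms.
triples = [
    ("ex:report",   "rdf:type",             "prov:Entity"),
    ("ex:analysis", "rdf:type",             "prov:Activity"),
    ("ex:alice",    "rdf:type",             "prov:Agent"),
    ("ex:report",   "prov:wasGeneratedBy",  "ex:analysis"),
    ("ex:report",   "prov:wasAttributedTo", "ex:alice"),
    ("ex:analysis", "prov:used",            "ex:rawdata"),
]

def provenance_of(entity, triples):
    """Answer the 'Oh, yeah?' question: everything asserted about an entity."""
    return {(p, o) for s, p, o in triples if s == entity}

print(sorted(provenance_of("ex:report", triples)))
```

A trust-ranking process would walk such chains back through prov:wasGeneratedBy and prov:used links to decide how far a resource’s claimed origins can be believed.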

The PROV Ontology is slightly ambitious too despite the perils of over-specification. It aims to provide a model not just for discrete data-points and relations applicable to any managed-resource, but also for describing in-depth the processes relevant to its development as a concept. This is reasonable in many contexts — such as a scholarly article, to capture its bibliography — but it seems odd in the context of non-media resources such as Persons. For instance, it might be odd to think of a notation of one’s parents as within the scope of “provenance data”. The danger of over-specification is palpable in the face of grand claims that, for instance, scientific publications will be describable by the PROV Ontology to an extent that reveals “How new results were obtained: from assumptions to conclusions and everything in between” [W3 Working Group Presentation].

Recommendations. Enterprises and organizations should immediately adopt the RDF PROVenance Ontology in their semantic applications. At a minimum this ontology should be deeply incorporated within the fundamentals of any enterprise-wide models now driving semantic applications, and it should be a point of priority among key decision-makers. Based upon my review and use in my clients’ applications, this ontology is of a quality and scope that will drive a massive amount of web traffic in the not-too-distant future. A market for user-facing ‘trust’ tools based on this ontology should begin to appear soon, which can stimulate the evolution of one’s semantic applications.

As for timing, the best strategy is to incubate the present ontology internally, with plans to fully adopt the second Candidate Recommendation once it is published. This gives the standardization process a chance to reach a better level of maturity and completeness.

Category: Data Quality, Enterprise Data Management, Information Development, Information Governance, Information Management, Information Strategy, Information Value, Master Data Management, Metadata, Semantic Web, Web Content Management

by: John McClure
11  Nov  2013

Semantic Business Vocabularies and Rules

For many in the traditional applications-development community, "semantics" sounds like a perfect buzzword to toss at management in an effort to pry loose fresh funding for what may appear to be academic projects with little discernible practical payback. Indeed, when challenged for examples of "semantic applications", one often hears stumbling litanies about "Linked Open Data", ubiquitous "URLs" and "statement triples". Traditional database folks might then retort: where's the beef? URLs are just as ubiquitous in conventional web applications; they are stored in database columns that are named just as "semantic properties" are; and they sit in rows whose foreign keys are as construable as "subjects" in a standard semantic statement. And that's still not to mention the many other SQL tools in which enterprises have invested heavily over many years! So…
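The retort has real force: a relational row is mechanically construable as a set of semantic statements. A minimal sketch of that mapping (table, column and prefix names here are hypothetical) looks like this:

```python
# Sketch: a relational row construed as semantic triples.
# The row's URI is the subject; each column name becomes a predicate,
# and a foreign key (customer_id) points at another "subject".
row = {"id": 42, "customer_id": 7, "total": 99.5}

def row_to_triples(table, row, key="id"):
    """Map one row to (subject, predicate, object) statements."""
    subject = f"ex:{table}/{row[key]}"
    return [(subject, f"ex:{table}#{col}", val)
            for col, val in row.items() if col != key]

for t in row_to_triples("order", row):
    print(t)
```

The difference the article goes on to develop is not whether this mapping exists, but what happens when the logical model has to change.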

It’s a good question – where IS the beef?

The Object Management Group (OMG), a highly respected standards organization with outsized impacts on many enterprises, has recently released a revision of its Semantics of Business Vocabulary and Rules (SBVR) that provides one clear answer to this important question.

Before we go there, let's stipulate that for relatively straightforward (but nonetheless closed-world) applications, it's probably not worth the time or expense to 'semanticize' the application and its associated database. These are applications that have few database tables; that change at a low rate; that have fairly simple forms-based user interfaces, if any; and that are often connected through transaction queues to an enterprise's complex of applications. For these, fine: exclude them from projects migrating an enterprise to a semantic-processing orientation.

Another genre of applications is those deemed 'mission critical'. These are characterized by a large number of database tables and consequently a large number of columns and datatypes the application must juggle. They have moderate to high rates of change to accommodate the shifting functional requirements (and new additions) of dynamic enterprises: not so much mission creep as the normal response to the tempo of the competitive environments in which enterprises exist.

The fact is that the physical schema for a triples-based (or quads-based) semantic database does not change; the physical schema is static. Rather, it is the ontology, the logical database schema, that changes to meet new requirements of the enterprise. This is an important outcome of a re-engineered applications-development process: it eliminates the often costly tasks of designing, configuring and deploying physical schemas.
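A minimal sketch of this point, using Python's built-in SQLite: the one physical table never changes, and a new business attribute is simply a new predicate in the logical model, with no `ALTER TABLE` required. Table and predicate names are illustrative.

```python
import sqlite3

# Sketch: the triple store's physical schema is fixed at one table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")

# Initial ontology: persons have names.
db.execute("INSERT INTO triples VALUES ('ex:p1', 'ex:name', 'Ada')")

# New requirement: record email addresses. Only the ontology (the logical
# schema) grows a new predicate; the physical schema is untouched.
db.execute("INSERT INTO triples VALUES ('ex:p1', 'ex:email', 'ada@example.com')")

rows = db.execute("SELECT p, o FROM triples WHERE s = 'ex:p1'").fetchall()
print(rows)
```

Contrast this with a relational design, where the same change means altering a table and often redeploying the application.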

Traditionalists might view this shift as mere cleverness, equally accomplished by tools that transform logical database designs into physical database schemas. Personally I don't have the background to debate the effectiveness of those tools. However, let's take a larger view, one suggested by the OMG specification for Business Vocabularies and Rules.

Business Policies – where it begins, and will end

Classically, business management establishes policies that are handed to an Information Technology department for incorporation into new and existing applications. It is then the job of systems analysts to pore over these policies and translate them into coding specifications for development and testing. Agile and other methodologies speed this process internally to the IT department, but until the fundamental dynamic between management and IT changes, the cycle remains slow, costly and mistake-prone.

Now this is where OMG's SBVR applies: it is an ontology for capturing rules such as "If A, then thou shalt not do X when Y or Z applies; otherwise thou shalt do B and C" in a machine-processable form (that is, as semantic statements). Initially, suitably trained systems analysts draft these statements, along with the pertinent queries that applications will perform at the proper moment. At some point, however, tools will appear that permit management themselves to draft and test the impact of new and changed policies against live databases.
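To make the idea concrete, here is a toy sketch (my own encoding, not SBVR syntax): the rule "if A, then thou shalt not do X when Y or Z applies" is captured as data, so an analyst can add or change rules without touching application code.

```python
# Sketch: a policy captured as data rather than hard-coded logic.
# This encoding is illustrative only; SBVR defines its own vocabulary.
rule = {"if": "A", "forbid": "X", "when_any": ["Y", "Z"]}

def violates(rule, facts, action):
    """True when the proposed action breaks the rule given current facts."""
    return (rule["if"] in facts
            and action == rule["forbid"]
            and any(cond in facts for cond in rule["when_any"]))

print(violates(rule, {"A", "Y"}, "X"))  # True: A holds and Y applies
print(violates(rule, {"A"}, "X"))       # False: neither Y nor Z applies
```

An application would evaluate such rules at the proper moment, for instance before committing a transaction.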

This is real business process re-engineering at its brightest. Policy implementation and operational costs fall when the same language (a somewhat 'structured English') used to state what should and must be is also used to state what is. Without that common language, enterprises can only rely on the skill of systems analysts to adequately communicate business and regulatory requirements to others.

Capturing and weaving unstructured lexical information into enterprise applications has never been possible with traditional databases. This is why 'semantics' is such a big deal.

Cheers!

Category: Enterprise Content Management, Information Development, Information Management, Information Strategy, Information Value, Master Data Management, Semantic Web
