Open Framework, Information Management Strategy & Collaborative Governance | Data & Social Methodology - MIKE2.0 Methodology

Guiding Principles for the Open Semantic Enterprise

From MIKE2.0 Methodology




An open semantic enterprise is an organization that uses the languages and standards of the semantic Web, including RDF, RDFS, OWL, SPARQL and others, to integrate existing information assets. It does so by applying the best practices of linked data and the open world assumption, and by targeting knowledge management applications, using some or all of the seven guiding principles noted herein.[1]

These guiding principles do not necessarily mean open data nor open source. The techniques can equivalently be applied to internal, closed, proprietary data and structures. The techniques can themselves be used as a basis for bringing external information into the enterprise. ‘Open’ is in reference to the critical use of the open world assumption.

These practices do not require replacing current systems and assets; they can be applied equally to public or proprietary information; and they can be tested and deployed incrementally at low risk and cost. The very foundations of the practice encourage a learn-as-you-go approach and active and agile adaptation. While embracing the open semantic enterprise can lead to quite disruptive benefits and changes, the transition itself can be accomplished with minimal disruption. This is its most compelling aspect.

Like any change in practice or learning, embracing the open semantic enterprise is fundamentally a people process. This is the pivotal piece of the puzzle, but also the one that does not lend itself to ready formulas about principles or best practices. Leadership and vision are necessary to begin the process. People are the fuel for impelling it. In this sense, then, there are really eight principles to the open semantic enterprise, with people residing at the apex.

Summary: The Seven Principles and Their Benefits

The natural scope of the open semantic enterprise is in knowledge management and representation. Suitable applications include data federation, data warehousing, search, enterprise information integration, business intelligence, competitive intelligence, knowledge representation, and so forth.[2] The figure below shows the seven guiding principles for the open semantic enterprise:

[Figure: Principles of the Open Semantic Enterprise]

Embracing these principles of the open semantic enterprise can bring these knowledge management benefits:

  • Domains can be analyzed and inspected incrementally
  • Schema can be incomplete and developed and refined incrementally
  • The data and the structures within these frameworks can be used and expressed in a piecemeal or incomplete manner
  • Data with partial characterizations can be combined with other data having complete characterizations
  • Systems built with these frameworks are flexible and robust; as new information or structure is gained, it can be incorporated without negating the information already resident, and
  • Both open and closed world subsystems can be bridged.

Moreover, by building on successful Web architectures, the enterprise can put in place loosely coupled, distributed systems that can grow and interoperate in a decentralized manner. The potential benefits can be summarized as greater insight with lower risk, lower cost, faster deployment, and more agile responsiveness.

Now, let's summarize these seven principles.

The RDF Data Model

RDF is perhaps the single most important foundation to the open semantic enterprise. RDF can be applied equally to all structured, semi-structured and unstructured content. By defining new types and predicates, it is possible to create more expressive vocabularies within RDF. This expressiveness enables RDF to define controlled vocabularies with exact semantics. These features make RDF a powerful data model and language for data federation and interoperability across disparate datasets.

Via various processors or extractors, RDF can capture and convey the metadata or information in unstructured (say, text), semi-structured (say, HTML documents) or structured sources (say, standard databases). This makes RDF almost a “universal solvent” for representing data structure.

Because of this universality, there are now more than 150 off-the-shelf ‘RDFizers’ for converting various non-RDF notations (data formats and serializations) to RDF. Because RDF has a simple data model and a diversity of serializations, it is often straightforward to create new converters. Once in a common RDF representation, it is possible to incorporate new datasets or new attributes. It is also possible to aggregate disparate data sources as if they came from a single source. This enables meaningful compositions of data from different applications regardless of format or serialization.

What this means in practice is that the integration layer can be based on RDF while all source data and schema still reside in their native forms. Information may be readily authored, transferred or represented in non-RDF forms. RDF is only necessary at the point of federation, and not all knowledge workers need be versed in the framework.
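The aggregation idea above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: triples are modeled as simple tuples instead of using an RDF library, and the `http://example.org/` namespace, the record contents and the helper names `federate` and `describe` are all hypothetical.

```python
# Hypothetical namespace; any URI prefix under your control would do.
EX = "http://example.org/"

# Triples "RDFized" from a relational customer table (illustrative data).
crm_triples = {
    (EX + "customer/42", EX + "name", "Acme Corp"),
    (EX + "customer/42", EX + "region", "EMEA"),
}

# Triples extracted from a support-ticket spreadsheet (illustrative data).
support_triples = {
    (EX + "customer/42", EX + "openTickets", 3),
    (EX + "customer/42", EX + "name", "Acme Corp"),  # same fact, second source
}


def federate(*graphs):
    """Merge graphs by set union; identical triples collapse into one."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Collect every predicate and its values known about one subject."""
    description = {}
    for s, p, o in graph:
        if s == subject:
            description.setdefault(p, set()).add(o)
    return description


merged = federate(crm_triples, support_triples)
acme = describe(merged, EX + "customer/42")
```

Because both sources name the customer with the same URI, the merged graph answers questions about it as if the data had come from a single source, while the source systems themselves stay untouched.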

Linked Data Techniques

Linked data is a set of best practices for publishing and deploying instance and class data using the RDF data model. Two of the best practices are to name the data objects using uniform resource identifiers (URIs), and to expose the data for access via the HTTP protocol. Both of these practices enable the Web to become a distributed database, which means that Web architectures can also be readily employed.

Linked data is applicable to public or enterprise data, open or proprietary. It is straightforward to employ. Additional linked data best practices relate to how to characterize and classify data, especially in the use of predicates with the proper semantics for establishing the degree of relatedness for linked data items from disparate sources. With proper techniques, linked data can be used to represent data interconnections, interrelationships and context that are equally useful to both humans and machine agents.
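As a rough sketch of these practices, the snippet below mints HTTP URIs for data objects and records links whose predicates state how closely two items are related. The `owl:sameAs` and `skos:closeMatch` URIs are the standard W3C ones; the base URL, the identifiers, the linked targets and the helper names `mint_uri` and `linked` are assumptions made for illustration.

```python
from urllib.parse import quote

# Standard W3C predicates expressing different degrees of relatedness.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
SKOS_CLOSE_MATCH = "http://www.w3.org/2004/02/skos/core#closeMatch"

# Hypothetical base under which the enterprise publishes its data.
BASE = "http://data.example.com/"


def mint_uri(kind, local_id):
    """Build an HTTP URI for a data object, percent-encoding the local id."""
    return f"{BASE}{quote(kind)}/{quote(str(local_id), safe='')}"


product = mint_uri("product", "SKU 1001")

# Links to items in other (hypothetical) datasets.
links = [
    (product, OWL_SAME_AS, "http://supplier.example.net/item/1001"),
    (product, SKOS_CLOSE_MATCH, "http://catalog.example.org/widgets/standard"),
]


def linked(links, subject, predicate):
    """Return the targets linked from a subject by one specific predicate."""
    return [o for s, p, o in links if s == subject and p == predicate]


identical = linked(links, product, OWL_SAME_AS)       # safe to merge
similar = linked(links, product, SKOS_CLOSE_MATCH)    # related, not identical
```

Keeping the two predicates apart matters: only the `owl:sameAs` targets assert identity, so a consumer can merge those while treating close matches as merely related.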

Adaptive Ontologies

Ontologies are the guiding structures for how information is interrelated and made coherent using RDF and its related schema and ontology vocabularies, RDFS and OWL. Thousands of off-the-shelf ontologies exist — though only a few are likely applicable to any given circumstance — and new ones appropriate to any domain or scope at hand can be constructed in an incremental fashion.

In standard form, semantic Web ontologies may range from the small and simple to the large and complex, and may perform the roles of defining relationships among concepts, integrating instance data, orienting to other knowledge and domains, or mapping to other schema. Another best practice is to keep concept definitions and their relationships (the "schema") expressed separately from instance data and their attributes.

But, in addition to these standard roles, ontologies can also act as guiding structures for ontology-driven applications (see next principle). With relatively few, minor additions to best practice, ontologies can take on the double role of informing user interfaces in addition to standard information integration. With these dual roles, these structures become adaptive ontologies.

Some of the user interface considerations that can be driven by adaptive ontologies include: attribute labels and tooltips; navigation and browsing structures and trees; menu structures; auto-completion of entered data; contextual dropdown list choices; spell checkers; online help systems; etc. Put another way, an ontology becomes adaptive when its standard machine-readable content is supplemented with human-readable labels, synonyms, definitions and the like. A neat trick occurs with this slight expansion of roles: the knowledge management effort can now shift to the actual description, nature and relationships of the information environment.
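To make the idea concrete, here is a minimal sketch of human-readable annotations driving two of the interface features listed above, auto-completion and tooltips. The concept identifiers, labels and definitions are invented for illustration, and the dictionary layout is an assumption, not a prescribed ontology format.

```python
# A toy ontology fragment: each concept carries human-readable annotations.
# All identifiers and texts are hypothetical.
ONTOLOGY = {
    "ex:Invoice": {
        "prefLabel": "Invoice",
        "altLabels": ["Bill", "Statement"],
        "definition": "A request for payment issued to a customer.",
    },
    "ex:Inventory": {
        "prefLabel": "Inventory",
        "altLabels": ["Stock"],
        "definition": "Goods currently held for sale or use.",
    },
    "ex:Supplier": {
        "prefLabel": "Supplier",
        "altLabels": ["Vendor"],
        "definition": "An organization that provides goods or services.",
    },
}


def autocomplete(prefix, ontology):
    """Return sorted (label, concept) pairs whose label starts with prefix.

    Both preferred labels and synonyms are searched, case-insensitively.
    """
    needle = prefix.casefold()
    matches = []
    for concept, annotations in ontology.items():
        labels = [annotations["prefLabel"], *annotations["altLabels"]]
        for label in labels:
            if label.casefold().startswith(needle):
                matches.append((label, concept))
    return sorted(matches)


def tooltip(concept, ontology):
    """Use the concept's definition as its tooltip text."""
    return ontology[concept]["definition"]


suggestions = autocomplete("st", ONTOLOGY)
```

Adding a synonym or rewording a definition changes the interface behavior without touching the code, which is the shift of effort the text describes.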

Any existing structure (or multiples thereof) can become a starting basis for these ontologies and their vocabularies, from spreadsheets to naïve data structures and lists and taxonomies. So, while producing an operating ontology that meets the best practice thresholds noted herein has certain requirements, kicking off or contributing to this process poses few technical or technology demands.

Ontology-driven Applications

The complement to adaptive ontologies is ontology-driven applications. By definition, ontology-driven apps are modular, generic software applications designed to operate in accordance with the specifications contained in an adaptive ontology. The relationships and structure of the information driving these applications are based on the standard functions and roles of ontologies, as supplemented by the human and user interface roles noted above.

Ontology-driven apps fulfill specific generic tasks. Examples of current ontology-driven apps include imports and exports in various formats, dataset creation and management, data record creation and management, reporting, browsing, searching, data visualization, user access rights and permissions, and similar. These applications provide their specific functionality in response to the specifications in the ontologies fed to them.

The applications are designed more similarly to widgets or API-based frameworks than to the dedicated software of the past, though the dedicated functionality (e.g., graphing, reporting, etc.) is obviously quite similar. The major change in these ontology-driven apps is to accommodate a relatively common abstraction layer that responds to the structure and conventions of the guiding ontologies. The major advantage is that single generic applications can supply shared functionality based on any properly constructed adaptive ontology.

This design thus limits software brittleness and maximizes software re-use. Moreover, as noted above, it shifts the locus of effort from software development and maintenance to the creation and modification of knowledge structures. 
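One way to picture such a generic application is a single form builder that knows nothing about any particular domain and takes all of its structure from the ontology handed to it. The sketch below is illustrative only: the class specifications, property names and the `build_form` helper are hypothetical, and a real system would read them from an ontology store.

```python
# Two unrelated class specifications, as they might be read from adaptive
# ontologies. Names, labels and allowed values are hypothetical.
SUPPLIER = {
    "label": "Supplier",
    "properties": [
        {"id": "name", "label": "Supplier name"},
        {"id": "tier", "label": "Tier", "allowed": ["Gold", "Silver"]},
    ],
}

EMPLOYEE = {
    "label": "Employee",
    "properties": [
        {"id": "fullName", "label": "Full name"},
    ],
}


def build_form(class_spec):
    """Turn any class specification into a list of form-field descriptions.

    Properties with an enumerated range become dropdowns; all others
    become free-text inputs.
    """
    fields = []
    for prop in class_spec["properties"]:
        field = {"name": prop["id"], "label": prop["label"]}
        if "allowed" in prop:
            field["widget"] = "dropdown"
            field["choices"] = list(prop["allowed"])
        else:
            field["widget"] = "text"
        fields.append(field)
    return fields


# The same generic function serves both domains.
supplier_form = build_form(SUPPLIER)
employee_form = build_form(EMPLOYEE)
```

Supporting a new record type then means describing it in the ontology, not writing another form.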

A Web-oriented Architecture

A Web-oriented architecture (WOA) is a subset of the service-oriented architectural (SOA) style, wherein discrete functions are packaged into modular and shareable elements (“services”) that are made available in a distributed and loosely coupled manner. WOA uses the representational state transfer (REST) style. REST provides principles for how resources are defined and used and addressed with simple interfaces without additional messaging layers such as SOAP or RPC. The principles are couched within the framework of a generalized architectural style and are not limited to the Web, though they are a foundation to it.

REST and WOA stand in contrast to earlier Web service styles that are often known by the WS-* acronym (such as WSDL). WOA has proven itself to be highly scalable and robust for decentralized users since all messages and interactions are self-contained.
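The self-contained nature of REST interactions can be illustrated with a toy request handler. This is a sketch, not a Web server: the `handle` function, the in-memory `store` and the resource paths are assumptions for illustration, while the status codes are the standard HTTP ones.

```python
def handle(method, path, store, body=None):
    """Process one request against resources addressed by path.

    Each call carries everything needed to serve it (method, address,
    payload); no session state is kept between calls.
    """
    if method == "GET":
        if path in store:
            return 200, store[path]
        return 404, None
    if method == "PUT":
        created = path not in store
        store[path] = body
        return (201 if created else 200), body
    if method == "DELETE":
        if path in store:
            del store[path]
            return 204, None
        return 404, None
    return 405, None  # method not supported by this sketch


# Hypothetical resource store and a short interaction.
store = {}
first_put = handle("PUT", "/datasets/suppliers", store, {"records": 12})
read_back = handle("GET", "/datasets/suppliers", store)
```

Because no call depends on an earlier one, any node holding the resources could answer any request, which is one reason the style scales well for decentralized users.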

Enterprises have much to learn from the Web’s success. WOA has a simple design with REST and simple messaging, distributed and modular services, and simple interfaces. It has a natural synergy with linked data via the use of URI identifiers and the HTTP transport protocol. Since the same architecture has worked well in linking documents, it is now pointing the way to linking data.

An Incremental, Layered Approach

These principles are essentially “layers”. This layering begins with existing assets, both internal and external, in many diverse formats. These are then converted or transformed into RDF-capable forms. These various sources are then exposed via a WOA Web services layer for distributed and loosely coupled access. Then, the information is integrated and federated via adaptive ontologies, which then can be searched, inspected and managed via ontology-driven apps. Here is one way to look at this layered approach:
[Figure: Layered Semantic Enterprise Architecture]
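The layering can also be traced end to end in a compact sketch. Everything here is a simplification under stated assumptions: the two source systems, their field names, the `ex:` predicates and the endpoint paths are hypothetical, and plain callables stand in for what would be REST services in a real deployment.

```python
# Layer 1: existing assets in their native forms (illustrative records).
crm_rows = [{"id": 7, "company": "Acme Corp"}]
erp_rows = [{"cust_no": 7, "credit_limit": 5000}]

BASE = "http://example.org/customer/"  # hypothetical URI base


# Layer 2: convert native records into RDF-like triples.
def to_triples(rows, id_field, base):
    triples = set()
    for row in rows:
        subject = f"{base}{row[id_field]}"
        for field, value in row.items():
            if field != id_field:
                triples.add((subject, field, value))
    return triples


# Layer 3: expose each source independently; callables stand in for
# loosely coupled Web service endpoints.
endpoints = {
    "/crm": lambda: to_triples(crm_rows, "id", BASE),
    "/erp": lambda: to_triples(erp_rows, "cust_no", BASE),
}

# Layer 4: the ontology maps source-specific fields to shared predicates.
MAPPING = {"company": "ex:name", "credit_limit": "ex:creditLimit"}


def integrate(endpoints, mapping):
    merged = set()
    for fetch in endpoints.values():
        for s, p, o in fetch():
            merged.add((s, mapping.get(p, p), o))
    return merged


# Layer 5: a generic application working only against the shared view.
def lookup(graph, subject):
    return {p: o for s, p, o in graph if s == subject}


graph = integrate(endpoints, MAPPING)
customer = lookup(graph, BASE + "7")
```

Note that neither source system is modified; each layer is added on top of what already exists, which is what allows incremental deployment.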
Semantic technology does not change or alter the fact that most activities of the enterprise are transactional, communicative or documentary in nature. Structured, relational data systems for transactions or records are proven and understood. On its very face, it should be clear that the meaning of these activities — their semantics — is by nature an augmentation or added layer to how to conduct the activities themselves.

This simple truth affirms that semantic technologies are not a starting basis, then, for these activities, but a way of expressing and interoperating their outcomes. Sure, some semantic understanding and common vocabularies at the front end can help bring consistency and a common language to an enterprise’s activities. This is good practice, and the more that can be done within reason while not stifling innovation, all the better. But an obvious benefit to the semantic enterprise is to federate across existing data silos. This should be an objective of the first semantic “layer”, and to do so in a way that leverages existing information already in hand.

The Open World Mindset

There is a common thread to many of these principles: the open world assumption (OWA). Enterprises have traditionally followed the (most often unstated) closed world assumption (CWA). Given the success of relational systems for transaction and operational systems — applications for which they are still clearly superior — it is understandable and not surprising that this same mindset has seemed logical for knowledge management problems as well. But knowledge and KM are by their nature incomplete, changing and uncertain. A closed-world mindset carries with it certainty and logic implications perhaps not readily supportable by real circumstances.

This is not an esoteric point, but a fundamental one. How one thinks about the world and evaluates it is pivotal to what can be learned and how and with what information. Transactions require completeness and performance; insight requires drawing connections in the face of incompleteness or unknowns. By itself, the open world mindset provides no assurance of gaining insight or wisdom. But, absent it, we place thresholds on information and understanding that may neither be affordable nor achievable with traditional, closed-world approaches.
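The difference between the two assumptions shows up as soon as a question is asked about something the data does not mention. The sketch below is a deliberately small illustration; the facts and the function names `ask_closed` and `ask_open` are hypothetical.

```python
# What the enterprise currently knows (illustrative fact).
facts = {("acme", "supplies", "widgets")}


def ask_closed(facts, triple):
    """Closed world: anything not recorded is taken to be false."""
    return triple in facts


def ask_open(facts, triple, known_false=frozenset()):
    """Open world: a missing statement is unknown (None), not false.

    A statement is False only if it has been explicitly ruled out.
    """
    if triple in facts:
        return True
    if triple in known_false:
        return False
    return None


question = ("acme", "supplies", "gears")
closed_answer = ask_closed(facts, question)  # treated as false
open_answer = ask_open(facts, question)      # treated as unknown

# New information extends what is known without contradicting earlier answers.
facts.add(question)
updated_answer = ask_open(facts, question)
```

Under the closed world assumption the later fact overturns an answer already given; under the open world assumption it simply fills in what was previously unknown.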

  1. An earlier version of this article was M.K. Bergman, 2010. “Seven Pillars of the Open Semantic Enterprise”, AI3:::Adaptive Information blog, January 12, 2010; see that article for a longer listing of references.
  2. In most instances, semantic technologies are poorly suited to transactional or operational applications. Also, there are instances in modeling specific closed-world domains where ontologies can be quite useful, such as in aerospace, petrochemicals, engineering, etc., where the scope of the domain can be precisely bounded and defined. Such efforts tend to be higher cost with lengthy lead times.