Layered Semantic Enterprise Architecture

From MIKE2.0 Methodology


Many modern IT architectures are Web-based and layered. They are Web-based because the designs are proven, simple, scalable and accessible. They are layered to prevent "lock-in" to particular vendors or standards and to allow incorporation of new innovations, including open source.

Given its layered role, the Semantic Enterprise Offering acts as an additional set of functions, or middleware, with respect to the standard SAFE Architecture. This is reflected in its layered architecture, as the diagram shows (click to enlarge):

Layered Semantic Enterprise Architecture

Most of the existing SAFE architecture resides in the "Existing Assets" layer. The specific aspects of the Semantic Enterprise Offering are then sandwiched in the "Access/Conversion" and "Ontologies" layers of the diagram. These capabilities can, however, also greatly affect the application layer, as discussed more specifically in the Offering itself.


The Application Layer

The application layer is appropriately diverse. By combining the richness of existing information structure with semantic technologies, this design can lend itself to a new paradigm in information technology: ontology-driven applications. Ontology-driven applications are modular, generic tools, which operate and present results to users based on the underlying structures that feed them.

The ontology-driven paradigm is an alternative to the often brittle, traditional approaches to application code development, query formulation and report writers. Instead, attention shifts to the structure and organization of the information itself, a focus that can democratize the knowledge management process.
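As a rough illustration of this paradigm, the sketch below shows a generic renderer whose output is controlled entirely by a declarative schema rather than by per-application code. The names (`render`, `product_schema`, the `label`/`unit` hints) are invented for illustration and do not belong to any actual MIKE2.0 tool:

```python
# Ontology-driven sketch: one generic function, many behaviors, all dictated
# by the declarative schema that is passed in. (Illustrative names only.)

def render(record: dict, schema: dict) -> str:
    """Render a record using display hints taken from the schema."""
    lines = []
    for field, hints in schema.items():
        if field in record:
            label = hints.get("label", field)
            unit = hints.get("unit", "")
            lines.append(f"{label}: {record[field]}{unit}")
    return "\n".join(lines)

# Changing the presentation means editing the schema, not the code.
product_schema = {
    "name": {"label": "Product"},
    "mass": {"label": "Mass", "unit": " kg"},
}

print(render({"name": "Widget", "mass": 2}, product_schema))
# Product: Widget
# Mass: 2 kg
```

Adding a new field or relabeling an existing one touches only the schema, which is the sense in which such applications are "driven" by the underlying structure.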

Existing options at the application layer enable structured data and its controlling vocabularies (ontologies) to drive applications and user interfaces. These options are based on RDF and various Web services frameworks (see below).

Users and groups can flexibly access and manage any or all datasets exposed by the system depending on roles and permissions. Report and presentation templates are readily defined, styled or modified based on the underlying datasets and structure. Collaboration networks can be established across multiple installations and Web service endpoints. Powerful linked data integration can be included to embrace data anywhere on the Web.
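A minimal sketch of the dataset-centric access model described above, assuming a simple role-to-grant table (the role names and `can` helper are hypothetical, not any specific product's API):

```python
# Dataset-level access control sketch: users acquire rights to whole
# datasets through roles. The grant table below is invented for illustration.

ROLE_GRANTS = {
    "analyst": {("sales", "read"), ("hr", "read")},
    "steward": {("sales", "read"), ("sales", "write")},
}

def can(user_roles, dataset, action):
    """True if any of the user's roles grants the action on the dataset."""
    return any((dataset, action) in ROLE_GRANTS.get(r, set())
               for r in user_roles)

assert can(["analyst"], "sales", "read")
assert not can(["analyst"], "sales", "write")
assert can(["analyst", "steward"], "sales", "write")
```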

Existing application functionality includes CRUD, data display templating, faceted browsing, full-text search, and import and export over structured data stores based on RDF or other structured data formats.
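The baseline CRUD pattern over a triple store can be sketched as follows. Real deployments would use an RDF store behind a Web service; this pure-Python in-memory version only illustrates the operations the text lists:

```python
# Minimal in-memory triple store exposing the baseline CRUD operations.
# (Illustrative sketch, not a production RDF store.)

class TripleStore:
    def __init__(self):
        self.triples = set()  # {(subject, predicate, object)}

    def create(self, s, p, o):
        self.triples.add((s, p, o))

    def read(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]

    def update(self, s, p, old_o, new_o):
        self.triples.discard((s, p, old_o))
        self.triples.add((s, p, new_o))

    def delete(self, s, p, o):
        self.triples.discard((s, p, o))

store = TripleStore()
store.create("ex:widget", "rdfs:label", "Widget")
store.update("ex:widget", "rdfs:label", "Widget", "Blue Widget")
print(store.read(s="ex:widget"))
# [('ex:widget', 'rdfs:label', 'Blue Widget')]
```

Faceted browsing and full-text search are, in this view, just richer variants of the `read` pattern match over the same structure.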

The Ontology ('Schema') Layer

Ontologies are the key structures that provide the horsepower behind ontology-driven applications. Because these data-driven adaptive ontologies take on expanded duties in Web deployment and user interfaces, they have added requirements:

  • Linked data and the use and accessibility of URIs as resource identifiers
  • Workflow considerations with explicit treatment of user edits and candidate suggestions
  • Context- and instance-sensitive data display, including templates for data objects to "drive" user interfaces.

Two roles for ontologies are to position existing datasets into an "aboutness" framework and to help guide how the data can be described and related to other data. General vocabularies can support more focused vocabularies through the incorporation of domain-specific ontologies. These various "mapping" structures can be developed and deployed incrementally without adversely affecting what has already been designed. In this manner, the semantic enterprise can grow and extend piecemeal, attuned to available budgets and with low risk of obsolescence. These ontology data structures provide long-sought benefits in data federation, consistent enterprise-wide semantics, and flexible frameworks for business intelligence and knowledge management.
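The incremental-growth idea can be sketched with a simple broader-than mapping: a general vocabulary already in place, with a domain-specific concept added later without touching existing structure. The concept names and the `ancestors` helper are illustrative assumptions:

```python
# Sketch of incremental ontology growth: concepts map to broader concepts,
# and new domain-specific mappings extend the vocabulary non-destructively.
# All concept names are invented for illustration.

broader = {
    "Invoice": "FinancialDocument",
    "FinancialDocument": "Document",
}

def ancestors(concept):
    """Walk the broader-than chain up to the vocabulary root."""
    chain = []
    while concept in broader:
        concept = broader[concept]
        chain.append(concept)
    return chain

# A domain-specific ontology is incorporated later, non-destructively:
broader["PurchaseOrder"] = "FinancialDocument"

assert ancestors("PurchaseOrder") == ["FinancialDocument", "Document"]
```

Any dataset tagged with the new concept immediately inherits the "aboutness" context of the general vocabulary above it.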

Web Services Interfaces

An essential interface layer mediates between existing data assets and structures on one side and the interoperability provided by adaptive ontologies on the other. This layer needs to communicate and present clear semantics at the interoperable side of the interface. It needs to accept and convert a diversity of data, structures and schema. The interface layer must be neutral to any data format or application presented to it.

A Web services framework can provide this platform-independent middleware for accessing and exposing structured RDF data, with generic tools driven by the underlying data structures. Best practice is to take the perspective of the dataset: access and user rights are granted around datasets, making the framework enterprise-ready and designed for collaboration. Since this Web services layer may be placed over virtually any existing datastore with Web access, including large instance record stores in existing relational databases, it is also a framework for Web-wide deployments and interoperability.

In keeping with this Web-oriented architecture, the framework should generally be RESTful in design, based on HTTP, Web protocols and open standards. Recommended baseline services include CRUD (create, read, update, delete), browse, full-text and faceted search, and export and import in multiple supported formats. More services can readily be added to the system, including advanced analytics and data visualization. All Web services are exposed via APIs and SPARQL endpoints.
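As a small concrete example of such a RESTful interface, the SPARQL Protocol lets a client send a query to an endpoint as an ordinary HTTP GET with the query in the `query` parameter, negotiating the result format via the `Accept` header. The endpoint URL below is a placeholder; the request is only constructed, not sent:

```python
# Building a SPARQL Protocol GET request with the standard library.
# http://example.org/sparql is a placeholder endpoint.
from urllib.parse import urlencode
from urllib.request import Request

query = """
SELECT ?label WHERE { <http://example.org/widget> rdfs:label ?label }
"""

url = "http://example.org/sparql?" + urlencode({"query": query})
req = Request(url, headers={"Accept": "application/sparql-results+json"})

print(req.full_url.split("?")[0])
# http://example.org/sparql
```

Because the interface is plain HTTP, any client stack, from browser JavaScript to enterprise middleware, can consume the same endpoint.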

The tools within such Web services frameworks (or those designed to interoperate with them) are different from traditional applications. They are designed to have generic functionality, the specific operation and expression of which is based on the inherent structure within the data and its relationships. This design approach is closer to Web 2.0 "mashup" designs, which emphasize APIs and protocols.

The Conversion, Extraction and Authoring Layer

A further possible layer, not shown on the diagram because it is a supporting input to it, provides information extraction, RDF conversion of legacy data structures, and simple dataset authoring and exchange formats.

Information extraction is important because an estimated 80% to 85% of all information resides in unstructured text. Metadata tagging through information extraction allows faceting, finding named entities, and inferencing over conceptual relationships. Many open source and proprietary systems enable concept or named-entity (instance record) extraction, or the use of dictionaries to help guide the extraction process. The co-occurrence of matches between concepts and entities also aids the disambiguation task. The resulting tags can be managed separately, fed to user interfaces, or re-injected back into the original content as RDFa.
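A toy sketch of dictionary-guided extraction with co-occurrence disambiguation, in the spirit described above. The dictionaries, hint table and `extract` function are invented for illustration, not drawn from any particular extraction system:

```python
# Dictionary-driven entity extraction sketch: a controlled vocabulary
# supplies the surface forms, and co-occurring concepts narrow ambiguous
# readings. All dictionary contents are invented for illustration.

ENTITY_DICT = {"Paris": ["City", "Person"], "Texas": ["State"]}
CONCEPT_HINTS = {"State": {"City"}}  # a State mention suggests City readings

def extract(text):
    found = {e: list(types) for e, types in ENTITY_DICT.items() if e in text}
    # Collect the readings suggested by every matched concept.
    hinted = set()
    for types in found.values():
        for t in types:
            hinted |= CONCEPT_HINTS.get(t, set())
    # Co-occurrence disambiguation: keep hinted readings where ambiguous.
    for e, types in found.items():
        if len(types) > 1:
            narrowed = [t for t in types if t in hinted]
            if narrowed:
                found[e] = narrowed
    return found

print(extract("Paris, Texas"))
# {'Paris': ['City'], 'Texas': ['State']}
```

Here the co-occurring "Texas" (a State) tips the ambiguous "Paris" toward its City reading, which is the disambiguation effect the text describes.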

"RDFizers" provide conversion and exposure of other data formats and structures. Many of these converters can work directly with major application APIs, and conversion from relational data structures is often straightforward.

For dataset authoring, many simple external applications and scripting languages exist, including the use of standard spreadsheets.

Leveraging Existing Assets

Most of the information for interoperation at the semantic layer comes from existing assets. These more traditional sources are the basis for much of the methodology in MIKE2.0 and are generally captured by the SAFE architecture design.
