Can’t See the Wood for the Trees!

How long have we been preoccupied with syntactic well-formedness when we integrate business systems? Too long! In my experience, the vast majority of our combined effort is spent trying to fit the round peg into the round hole: building tools and approaches to simplify the process of matching every character in our XML instance documents against what little we consider to be an interface contract, which itself is dominated by an XML Schema and precious little else.

As we expend so much effort squabbling over traditional flash-points such as element ordering, XSD precision (or the lack thereof), schema-version comedy and so forth, we’re actually missing out on the fundamental god-given right to freedom of expression. What I mean is that there is a better way, and that better way removes the need for such sensitivity about the representation of our information, relying instead on how we communicate the meaning of that information.

To explain this further, let’s consider how the MIME type, something rarely discussed in SOAP/WS-* oriented integration designs, is now our friend, thanks to the resurgence of understanding of the true HTTP protocol driven by Mr Fielding’s spotlight on the RESTful architectural style. REST actively enables a situation in which the representation of a resource can be stated and negotiated dynamically at run-time (via the Content-Type and Accept HTTP headers). This flexibility means that a service consumer can drive an interaction with a remote system in a way that is optimal for the consumer, and that empowerment removes some of the syntactic-compliance headaches traditionally associated with our EAI ancestry.
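To make the point concrete, here’s a minimal sketch of that negotiation. The resource and renderer table are invented for illustration, and a real server would honour q-values properly rather than just taking the client’s listed order:

```python
# Minimal sketch of HTTP content negotiation: the consumer's Accept header
# drives which representation of the same resource comes back over the wire.
# (Resource and renderer names are invented; q-value weighting is ignored.)
import json

RESOURCE = {"id": 42, "name": "SuperWidget"}

RENDERERS = {
    "application/json": lambda r: json.dumps(r),
    "text/csv": lambda r: ",".join(str(v) for v in r.values()),
}

def negotiate(accept_header: str) -> tuple:
    """Return (content_type, body) for the first acceptable media type."""
    for media_type in (m.split(";")[0].strip() for m in accept_header.split(",")):
        if media_type in RENDERERS:
            return media_type, RENDERERS[media_type](RESOURCE)
    # Fall back to a default representation if nothing matches.
    return "application/json", RENDERERS["application/json"](RESOURCE)

content_type, body = negotiate("text/csv, application/json;q=0.9")
```

The same endpoint, the same information; only the consumer’s stated preference decides the syntax.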

Taking the REST MIME concept further, I believe we should be able to decouple our integration points even more, effectively dispensing with any real syntactic agreement in our service contracts whilst increasing the quality and reliability of our distributed systems. How so?

If we focus instead on annotating our message representations so that we can adequately express formal semantics based on globally or locally agreed ontologies (RDF/OWL, for example), then our integration infrastructure can rightly be expected to manage the transit of any group of characters over the wire (XML, CSV, TLV, and so on), delivering ‘information’ to a semantically aware processor that is more interested in resolving content against an agreed ontology.
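As a toy illustration of that syntax-independence (the vocabulary namespace and element names are invented here), two different wire formats carrying the same information can be lifted into the same set of subject-predicate-object triples, and the semantically aware processor only ever sees the triples:

```python
# Sketch: two wire formats (XML and CSV) carrying the same information,
# lifted into one set of triples against a shared (invented) vocabulary.
# The downstream processor works on triples, not on syntax.
import csv
import io
import xml.etree.ElementTree as ET

VOCAB = "http://example.org/vocab#"  # hypothetical ontology namespace

def triples_from_xml(doc: str) -> set:
    root = ET.fromstring(doc)
    cid = root.findtext("Id")
    return {(cid, VOCAB + child.tag, child.text) for child in root if child.tag != "Id"}

def triples_from_csv(doc: str) -> set:
    row = next(csv.DictReader(io.StringIO(doc)))
    cid = row.pop("Id")
    return {(cid, VOCAB + k, v) for k, v in row.items()}

xml_doc = "<Customer><Id>c1</Id><Name>Ada</Name><Phone>555</Phone></Customer>"
csv_doc = "Id,Name,Phone\nc1,Ada,555"

# Same meaning, different syntax: the triples come out identical.
assert triples_from_xml(xml_doc) == triples_from_csv(csv_doc)
```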
This shift means that our next-generation Enterprise integration stack needs to learn from REST, incorporate elements of the Semantic Web movement to express ‘the third dimension’ of integration contracts, and spend more time discussing the business information and transactions, instead of getting embroiled in the wasteful practice of bending our systems into brittle, badly specified and semantically devoid XML crunchers, and then crying into our beers that SOA doesn’t cost-in and that we were let down by the hype-mongers.

This is how I believe we’re spending our time among the trees while not actually seeing the wood. I can safely say I’m no longer a tree-hugger from this point onward…


Posted in Semantic Integration | 2 Comments

3D Integration with Ontologies

I’ve spent a long time in application integration grappling with the matching of XML instance documents between disparate teams with different understandings of the same ambiguous integration contract! This problem only really arises when we move away from simple XML document models (the traditional customer name, address and phone number) and deal with the kind of extensible XML dialects we generally associate with ‘re-usable’ business services in the Enterprise SOA landscape.

I’ve searched long and hard for a sustainable alternative to the usual XSD-oriented, two-dimensional integration design approach. In that traditional scenario we grab the XSD and then scour supporting informal documentation to try to derive an adequate understanding of the XML formation rules, before entering the usually brittle development and integration-testing cycle required to arrive at a stable solution.

My recent research on RDF, OWL and Semantic Web ontologies has shifted my opinion: where I once regarded that emerging domain as a purely academic tangent in information tagging, I am now convinced that Enterprise vocabularies and ontologies effectively offer an integration contract in the third dimension. That third dimension is the depth of descriptive information missing from our current two-dimensional integration experience.

Bringing the ontology from the Semantic Web context into the transactional enterprise integration design-and-build cycle is, I now believe, what will establish this third, semantic dimension, and with it a means of decoupling from our current focus on XML syntax.


Posted in Semantic Integration

Transformation is not Specification

One of the things that often causes ambiguity when I explain my thoughts on semantic integration, and semantic contracts within that context, is the perception that middleware vendors have traditionally catered for this capability in their SOA suites. A SOA suite is a set of tools and capabilities that allows the owner to construct integration solutions from pre-built elements, and these toolkits often bundle a GUI transformation design tool. Generally, these are what I would describe as reactive and embedded transformation tools.

Reactive means that the tool gives me the capability, as the service provider, to specify rules for handling inbound XML instance documents sent to my advertised services. These tools typically show the source (inbound) schema outline on the left-hand side of the UI and the target schema outline on the right. I can then drag relationships between source and target elements, and add expressions to combine, split, enrich, splice and otherwise manipulate elements until I achieve a transformation from source to destination.
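The artefact such a tool generates boils down to a declarative rule set like the sketch below (element names and rules invented for illustration; a real suite would emit XSLT or a proprietary equivalent):

```python
# Sketch of the source-to-target mapping a GUI transformation tool produces:
# declarative rules from source elements to target elements, with small
# expressions for combining or reformatting values along the way.
import xml.etree.ElementTree as ET

# (target element name, expression over the parsed source document)
RULES = [
    ("FullName", lambda s: s.findtext("First") + " " + s.findtext("Last")),  # combine
    ("Phone",    lambda s: s.findtext("Tel").replace("-", "")),              # reformat
]

def transform(source_xml: str) -> str:
    source = ET.fromstring(source_xml)
    target = ET.Element("Customer")
    for name, rule in RULES:
        ET.SubElement(target, name).text = rule(source)
    return ET.tostring(target, encoding="unicode")

out = transform("<In><First>Ada</First><Last>Lovelace</Last><Tel>01-555</Tel></In>")
```

Note that the rule set executes a mapping; nothing in it tells a consumer what a valid source document must look like in the first place.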

Embedded means that the resulting transformation rules are embedded within, or worse, proprietary to, the SOA suite runtime platform I’m using. I am configuring my engine with the intelligence necessary to react to specific inbound events and transform accordingly, or to bomb out and eject an invalid inbound document. This embedding of the transformation logic means that the formation rules I’ve implemented within my transformation are not something I am necessarily able to publish as a portable specification to my prospective consumers.

These SOA suite transformation tools (BEA, IBM, Oracle, webMethods, TIBCO) do, in my experience, a fine job of helping me crunch XML, but they do not allow me to author a formalised specification of my service interface beyond the traditional XSD, Javadoc and whatever informal supporting documents may be produced.

A transformation is not a semantic contract specification; it is an implementation of some or all of that specification, the integrity of which is not verifiable unless we possess an executable means of formally verifying that input and output documents are semantically intact before and after they pass through the transformation code.

Let’s not forget: the people who own the service contract and implement the transformations can get it wrong too…

Posted in EAI, Semantic Integration

What is a Semantic Integration Contract?

In my simple terminology I use the term ‘Semantic Integration Contract’, or ‘Semantic Contract’, to refer to an artifact that is declarative, executable and human-readable, and that can be exchanged between a service provider and a service consumer so that the consumer can better understand the static and dynamic formation rules of well-formedness I am imposing on requests to my service.

XML Schema will allow me to specify which base types, structures and elements need to be present in any given instance document punted towards my service endpoint. However, as we rely on the power of XML to accommodate more flexible and eXtensible data structures, which become increasingly normalised and therefore data-driven, we need something else to compensate for the deficiencies of XSD in adequately expressing the formation rules of that XML. The semantic contract takes us to the next level by offering capabilities such as those listed below:

  • Expression of element interdependencies and rules that rely on instance data and dynamic conditions. For example: if element <ProductType> is present in the instance document, and its value is ‘SuperWidget’, then the <ProductType> data structure must contain child elements <SuperPowers> and <SuperTariff>.
  • Expression of complex or conditional element content-formatting rules beyond the primitive capability of the XSD vocabulary, with an extension point for referencing external complex-type libraries containing conditional constraints similar to those in the previous item.
  • Expression of the actual meaning and intended usage of the complex entities within the XML instance document, based on a supporting, formalised vocabulary within the context of the integration transaction.
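The first bullet’s rule can be made executable. This is exactly the class of co-occurrence constraint that Schematron expresses declaratively; here is a procedural sketch of the same SuperWidget rule, to show it is checkable where XSD is silent:

```python
# The SuperWidget rule made executable: the required child elements depend
# on another element's runtime value, a constraint plain XSD cannot express.
import xml.etree.ElementTree as ET

def check_product_rule(xml_doc: str) -> list:
    """Return a list of rule violations (an empty list means the document passes)."""
    errors = []
    for pt in ET.fromstring(xml_doc).iter("ProductType"):
        if (pt.text or "").strip() == "SuperWidget":
            for required in ("SuperPowers", "SuperTariff"):
                if pt.find(required) is None:
                    errors.append(f"SuperWidget needs child <{required}>")
    return errors

good = "<Order><ProductType>SuperWidget<SuperPowers/><SuperTariff/></ProductType></Order>"
bad = "<Order><ProductType>SuperWidget</ProductType></Order>"
```

Both provider and consumer can run the same check, which is the whole point of making the contract executable rather than prose.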

These are a few examples of the classes of capability I believe we need to introduce into the formalised artifacts associated with an integration contract, so that we reverse the drift away from XSD towards informal Word documents and Excel spreadsheets as the place where we record the descriptive and constraining expressions we need to be aware of as implementors of the emitters or consumers of XML instance documents in a progressive integration context.

I believe there are emerging standards that can be harnessed for aspects of the declarative solution, but we are then left with the question of how to introduce this effectively into an agile integration engineering process…

Posted in Semantic Integration

Semantic Contracts and Continuous Integration

Large-scale distributed systems have a fundamental dependency on the quality of understanding of information across system boundaries. No matter how far we unify or standardise the mechanisms by which we pass information between systems, the speed at which we can evolve our solutions depends squarely on how well we manage the semantic vacuum that exists in a departmentally federated IT architecture (in my experience, I hasten to add).

It’s a strange irony that the very principles underpinning Service Oriented Architecture actually compound the departmental mindset, which then breaks the enterprise data model down into a set of silos of knowledge hidden behind cold, brittle and intimidating XML fortresses loosely presented as ‘re-usable services’.

This siloed information architecture drives up the importance of formal mechanisms of knowledge sharing in the integration community, so that we can achieve executable verification of understanding, completeness and well-formedness when we entrust our business integrity and agility to our XML service assets. Without such a formal mechanism we’re again reliant on informal and imprecise tools, and it’s no surprise our velocity takes a big hit, given that we’re into a pretty torrid multi-party blinking contest as our delivery deadlines approach.

If we then look at the significance of Test-Driven Development, Continuous Integration and an overarching proof-driven delivery culture in driving up our collective velocity, then getting a verifiable semantic contract into the hands of our service consumers as early as possible is imperative. The semantic contract artefact needs to embody the structural and semantic rules associated with calling a given version of a service, and it needs to be created at the earliest opportunity, replacing the traditional experience of waiting until system test to discover whether we understood what we were integrating.

As such, Continuous Integration cycles must aspire to integrating ‘Semantic Mock’ services derived from a fully formed schema and domain ontology. This way we gain continuous proof of inter-domain interpretation at every stage of the process until we finally hit system test, and when we do hook our systems together we still have a canonical semantic contract to act as arbitration whenever a bomb goes off…
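A Semantic Mock could be as simple as the following sketch (the rule and responses are invented): it enforces the contract rules but fakes the business response, so a consumer’s CI build gets semantic feedback from day one.

```python
# Sketch of a 'Semantic Mock': a stand-in service used in Continuous
# Integration that enforces the contract's rules but fakes the business
# logic, giving consumers contract feedback long before system test.
import xml.etree.ElementTree as ET

CONTRACT_RULES = [
    # (human-readable rule, executable check over the parsed document)
    ("Order must contain a Product", lambda d: d.find("Product") is not None),
]

def semantic_mock(xml_in: str) -> str:
    doc = ET.fromstring(xml_in)
    failures = [rule for rule, check in CONTRACT_RULES if not check(doc)]
    if failures:
        return "<Fault>" + "; ".join(failures) + "</Fault>"
    return "<Ack/>"  # canned response; no real back-end behind the mock
```

A consumer’s build can call the mock exactly as it would call the real service, and a contract violation fails the build instead of surfacing months later in system test.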

Semantic contracts are rooted in ontologies, NOT schemas… and I’m certain that there’s a gap here in the EAI/SOA landscape.


Posted in Continuous Integration, Semantic Integration

Semantic Integration Tools: DXSI, Schematron, RDF and OWL

At this stage my own investigations into alternatives to XSD have relied upon the creation of Java-executable ‘semantic contracts’ with a commercial product named DataXtend Semantic Integrator (DXSI) from Progress Software, and also the creation of XSLT-executable Schematron policy files.

My current research direction is to look closely at the emerging standards in the Semantic Web field, to see whether there are directly re-usable techniques that can be exploited in transactional Enterprise Application Integration. My thoughts are therefore turning to OWL and RDF, which may hold the key to establishing an open vocabulary within which Domain Specific Languages (DSLs) could be formed for specific EAI contexts. Finding a capability that delivers an open standard as well as an efficient design, test and production-executable platform is the real key here…

Posted in Semantic Integration

SOA and Service Re-use: XML Schema R.I.P.

Service re-use is constantly cited as a driver for SOA. Re-use is equally often misinterpreted as a justification for the creation of uber-flexible service interfaces capable of ‘being re-used’ to reach a wide variety of related functions within the service implementation. A ‘re-usable’ service, on this view, is often a coarse-grained ‘order’ interface capable of being used to request a simple atomic product as well as a high-end, extremely complex, multi-element product. Whilst I too started my SOA journey thinking this, it is an extremely flawed notion for a couple of simple reasons. Firstly, uber-flexible interfaces are hard to specify; secondly, the testability of such an interface is inversely proportional to the level of flexibility it offers. (See this related article for a better explanation of this point.)

Now to my point: XML Schema (XSD) is regarded as the standard for exchanging integration contracts between service providers and service consumers. XSD is great for expressing structural and primitive, element-related constraints, so if we want to build a simple interface with a mandatory set of simple, atomic elements, then XSD is our friend. However, as we race down the SOA re-use road the bumps begin to get bigger: that same re-use, and the inherent flexibility it demands at the schema level, effectively reduces XSD to little more than an empty shell, unable to express the rules for how different combinations of optional elements must be assembled under particular runtime conditions.

As such, the misinterpreted drive for re-use is actually killing one of the cornerstone technologies that enabled us to get this far with our current, simpler and less flexible service interfaces. This has led me to research and prototype alternative, compensating mechanisms to offset the decaying schemas and exponentially expanding ‘informal’ specification documents which reduce the overall integrity of my delivered service when its full-life cost, including maintenance, is considered.

Posted in EAI, Semantic Integration