Canonical XML with Overlay Schemas: A Different Approach to Long-Lived XML Architecture - Part 4 - Public vs Private
Written by ChatGPT, as prompted by Stephen D Green, May 2026
Suppose an organization has a canonical business document instance:
<BusinessDocument>
  <DocumentID>550e8400-e29b-41d4-a716-446655440000</DocumentID>
  <DocumentDate>2026-05-28</DocumentDate>
  ...
</BusinessDocument>
Within the organization, different departments may apply different validation overlays to that same instance.
The accounting department may validate it against an accounting profile.
The logistics department may validate it against a shipping profile.
The analytics team may apply a data-quality profile.
The archive department may apply records-management constraints.
None of these interpretations is necessarily more "correct" than the others. They are contextual views serving different purposes. The document is being projected into different local validation environments according to the needs of each department.
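The departmental overlays described above can be sketched in a few lines. This is only a minimal illustration, assuming each overlay can be modeled as a list of plain Python predicates over the parsed instance; a real deployment would more likely express them in a schema language. The overlay names, the rule functions, and the rules themselves are hypothetical, and they only touch the two elements shown in the example.

```python
import uuid
import xml.etree.ElementTree as ET
from datetime import date

# The canonical instance from the example, without its elided content.
INSTANCE = """
<BusinessDocument>
  <DocumentID>550e8400-e29b-41d4-a716-446655440000</DocumentID>
  <DocumentDate>2026-05-28</DocumentDate>
</BusinessDocument>
"""


def parsed_date(doc):
    """Return DocumentDate as a date, or None if it is missing or malformed."""
    try:
        return date.fromisoformat(doc.findtext("DocumentDate", default=""))
    except ValueError:
        return None


def id_is_uuid(doc):
    """Hypothetical rule: DocumentID must parse as a UUID."""
    try:
        uuid.UUID(doc.findtext("DocumentID", default=""))
        return True
    except ValueError:
        return False


def has_valid_date(doc):
    """Hypothetical rule: DocumentDate must be a well-formed calendar date."""
    return parsed_date(doc) is not None


def dated_in_2026(doc):
    """Hypothetical accounting rule: the document belongs to the 2026 books."""
    d = parsed_date(doc)
    return d is not None and d.year == 2026


def dated_before_2026(doc):
    """Hypothetical archive rule: only pre-2026 records are eligible."""
    d = parsed_date(doc)
    return d is not None and d.year < 2026


# Each department keeps its own overlay; all of them see the same instance.
OVERLAYS = {
    "accounting": [id_is_uuid, dated_in_2026],
    "analytics": [id_is_uuid, has_valid_date],
    "archive": [dated_before_2026],
}


def apply_overlays(xml_text, overlays):
    """Validate one instance against every overlay and report each verdict."""
    doc = ET.fromstring(xml_text.strip())
    return {name: all(rule(doc) for rule in rules)
            for name, rules in overlays.items()}


results = apply_overlays(INSTANCE, OVERLAYS)
print(results)
```

Under these made-up rules the accounting and analytics overlays accept the instance while the archive overlay rejects it. Neither verdict invalidates the other, because each answers a different departmental question about the same document.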
In this private, internal setting, "rightness" is often determined pragmatically. A schema is useful if it supports the work that needs to be done. Multiple schemas can coexist without difficulty because they are all operating within a single organizational boundary and under a shared authority structure.
The situation changes once the document crosses organizational boundaries.
Suppose the document is transmitted from one legal entity to another. At that point, the document is no longer merely an internal information object. It becomes a shared artifact whose interpretation must be sufficiently stable for independent parties to rely upon it.
This is where the distinction between public and private overlays becomes significant.
A public overlay is not merely a validation profile. It is effectively part of the interoperability contract between independent actors. If a sender claims that a document conforms to a particular public profile, the receiver must be able to validate that claim independently and obtain substantially the same result.
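One way to picture independent verification: the sender names the profile it claims to satisfy, and the receiver re-runs the published rules for that exact profile and version. The sketch below assumes a hypothetical registry keyed by profile identifier and version, a made-up identifier `urn:example:profile:basic-exchange`, and a deliberately trivial rule set (required elements must be present and non-empty). None of these come from a real standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical published profiles: (identifier, version) -> required child elements.
# In practice these would be published schemas, not an in-memory dict.
PUBLISHED_PROFILES = {
    ("urn:example:profile:basic-exchange", "1.0"): ["DocumentID", "DocumentDate"],
}

DOCUMENT = (
    "<BusinessDocument>"
    "<DocumentID>550e8400-e29b-41d4-a716-446655440000</DocumentID>"
    "<DocumentDate>2026-05-28</DocumentDate>"
    "</BusinessDocument>"
)


def verify_claim(xml_text, profile_id, version):
    """Re-check a conformance claim using only the document and the published profile."""
    required = PUBLISHED_PROFILES.get((profile_id, version))
    if required is None:
        # An unpublished profile or version cannot be verified by anyone else.
        return False, ["unknown profile or version"]
    doc = ET.fromstring(xml_text)
    problems = [
        f"missing or empty {name}"
        for name in required
        if not (doc.findtext(name) or "").strip()
    ]
    return not problems, problems


# Sender and receiver run the same check with no shared private state.
sender_result = verify_claim(DOCUMENT, "urn:example:profile:basic-exchange", "1.0")
receiver_result = verify_claim(DOCUMENT, "urn:example:profile:basic-exchange", "1.0")
print(sender_result == receiver_result)
```

The design point is that `verify_claim` depends only on the document and on publicly identified, versioned rules. Anything that relied on the sender's internal configuration would make the claim impossible for the receiver to reproduce.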
Consequently, public overlays typically require stronger governance, clearer specifications, published conformance criteria, versioning policies, and often formal change-control processes. They become part of the shared infrastructure of communication.
Private overlays, by contrast, can remain opportunistic and context-specific. They may evolve rapidly. They may be undocumented outside the organization. They may encode assumptions that would be inappropriate for external parties. They may even conflict with overlays used elsewhere without causing problems, because they are never exposed beyond their local context.
This suggests that the architecture has three layers rather than two.
The first layer is the canonical core. This defines the stable structural vocabulary shared across the ecosystem.
The second layer consists of public overlays. These represent profiles, standards, industry agreements, regulatory specifications, or contractual constraints that are intended to be shared across organizational boundaries.
The third layer consists of private overlays. These represent local interpretations, departmental requirements, operational constraints, internal workflows, reporting needs, and other context-specific projections that exist solely within a particular organizational environment.
A document instance may therefore participate simultaneously in all three layers.
It conforms to the canonical structure.
It conforms to one or more public interoperability profiles.
It conforms to a number of private local profiles.
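The three layers can be pictured as a single report over one instance. The sketch below assumes checks are plain Python functions tagged with a layer; the check names, the rules, and the `CostCentre` element are all hypothetical. It also illustrates one possible policy that the text does not prescribe: only canonical and public failures block exchange, while private failures stay a local matter.

```python
import xml.etree.ElementTree as ET
from enum import Enum


class Layer(Enum):
    CANONICAL = "canonical"
    PUBLIC = "public"
    PRIVATE = "private"


DOCUMENT = (
    "<BusinessDocument>"
    "<DocumentID>550e8400-e29b-41d4-a716-446655440000</DocumentID>"
    "<DocumentDate>2026-05-28</DocumentDate>"
    "</BusinessDocument>"
)


def root_is_business_document(doc):
    """Stand-in for canonical structure validation."""
    return doc.tag == "BusinessDocument"


def has_id_and_date(doc):
    """Stand-in for a public interoperability profile."""
    return doc.find("DocumentID") is not None and doc.find("DocumentDate") is not None


def has_cost_centre(doc):
    """Hypothetical private accounting rule; CostCentre is an invented element."""
    return doc.find("CostCentre") is not None


# Every check is registered under exactly one layer.
CHECKS = [
    (Layer.CANONICAL, "core-structure", root_is_business_document),
    (Layer.PUBLIC, "exchange-profile", has_id_and_date),
    (Layer.PRIVATE, "accounting-profile", has_cost_centre),
]


def layered_report(xml_text):
    """Run all checks on one instance and group the verdicts by layer."""
    doc = ET.fromstring(xml_text)
    report = {layer: {} for layer in Layer}
    for layer, name, check in CHECKS:
        report[layer][name] = check(doc)
    return report


def fit_for_exchange(report):
    """Assumed policy: only the shared layers decide whether a document may be sent."""
    shared = (Layer.CANONICAL, Layer.PUBLIC)
    return all(ok for layer in shared for ok in report[layer].values())


report = layered_report(DOCUMENT)
print(report[Layer.PRIVATE], fit_for_exchange(report))
```

Here the private accounting check fails because the invented `CostCentre` element is absent, yet the document remains fit for exchange under the assumed policy, since the canonical and public checks pass.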
The further one moves from the canonical core toward the private overlays, the less universal the notion of "correctness" becomes. Internal departments may legitimately disagree about which validation rules matter most because they are optimizing for different objectives. Accounting, logistics, legal compliance, customer service, and analytics may each apply different criteria to the same document.
By contrast, the public profiles occupy an intermediate position. Their purpose is precisely to establish a common notion of correctness between parties that do not share the same internal priorities. They provide the basis upon which documents can be exchanged, audited, trusted, and processed across organizational boundaries.
This perspective also helps explain why standards bodies often focus primarily on the canonical structure and a limited number of public profiles rather than attempting to standardize every possible use case. The closer a schema is to a private local concern, the weaker the case for broad standardization becomes. Local actors can often manage those concerns themselves.
Viewed in this way, the architecture is not merely a hierarchy of technical schemas. It is also a hierarchy of social visibility and institutional responsibility. The canonical core belongs to the ecosystem as a whole. Public profiles belong to communities of interoperability. Private overlays belong to individual organizations and departments.
The same XML instance may therefore be interpreted through many different schemas, but the significance of those schemas depends on who is expected to recognize and trust them. Internal departmental schemas derive their authority from local usefulness. Public interoperability schemas derive their authority from shared agreement. The canonical core derives its authority from providing a stable common language within which all of these other interpretations can coexist.