Sunday 10 April 2011

Document Engineering: Test Assertions Example

I was asked to clarify yesterday's blog posting with some examples, so I'll give it a go.

A simple example is an OASIS Universal Business Language (UBL) invoice's so-called 'calculation model' (an example of rules showing how calculations in the invoice are to be made). The invoice has all kinds of totals and amounts so the way you calculate totals from the amounts is important to get right.

In TAML (Test Assertions Markup Language) I can write the rules as test assertions but using XPath so they can be executed against the XML invoice:
...
        <taml:testAssertion id="IN1" name="Invoice" enable="true">
                <taml:normativeSource>U2ICMDraft5Rule1:

"To be a conforming UBL 2 invoice the document MUST be valid according to a standard UBL 2 Invoice schema."</taml:normativeSource>
                <taml:target type="document" idscheme="'document'">/</taml:target>
                <taml:predicate>count(//in:Invoice) ge 1</taml:predicate>
                <taml:prescription level="mandatory"/>
                <taml:report label="failed" message="Not a standard UBL 2 invoice">The file does not contain a standard UBL 2 invoice.</taml:report>
        </taml:testAssertion>

        <taml:testAssertion id="INTOT1" name="LineExtensionAmount (1)" enable="true">
                <taml:normativeSource>U2ICMDraft5Rule2:

"The 'LineExtensionAmount' in the invoice 'LegalMonetaryTotal' SHOULD equal the sum of all 'LineExtensionAmount's in all of the invoice lines."
                </taml:normativeSource>
                <taml:target type="total" idscheme="'invoice-total'">/in:Invoice/cac:LegalMonetaryTotal</taml:target>
                <taml:prerequisite>count(distinct-values(//*/@currencyID)) eq 1</taml:prerequisite>
                <taml:predicate>number(./cbc:LineExtensionAmount) eq sum(/in:Invoice/cac:InvoiceLine/cbc:LineExtensionAmount)</taml:predicate>
                <taml:prescription level="preferred"/>
                <taml:report label="failed" message="Error in Line Extension Amount">The line extension total is not the sum of the invoice lines' line extension amounts.</taml:report>
        </taml:testAssertion>
...


These are just two rules in an example set of rules.

If I use XML Schema (XSD) 1.1 to apply the assertions (effectively as test cases) by combining them with a schema (see previous blog posting) I run into some immediate problems:

1) Ideally I need to use a schema which targets a UBL invoice, but a) my UBL invoice already has a schema, and b) my UBL invoice schema has some design rules which might make it tricky to change a globally defined element into a locally defined one.

2) I am assuming I can write my own schema in XSD 1.1 and can find a way to define some elements globally, if the assertion(s) targeting them allow this, and some elements locally, if the assertions targeting those elements demand it (e.g. have relevance only to certain contexts for that element).


3) Does every assertion map to one or more elements? I need to be able to turn any failed test of one assertion into a report referring to the one TA (by its ID). Can I do that?

4) How do I apply prerequisites? I could add them to the predicate perhaps, but that makes it quite complex. The logic needs to be that if (count(distinct-values(//*/@currencyID)) eq 1) then the predicate applies, else it does not apply, so it might be a little more complex than I'd like. This still doesn't give me a mapping to my test assertion, so I need to add the TA id somewhere, in a report or annotation, say (can I do that in XML Schema 1.1?).
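The prerequisite logic in point 4 can be sketched in a few lines of Python. This is only an illustration of the logical shape, not a tested schema: the helper names `combined_test` and `assert_element` are made up for the example, and it says nothing about whether the resulting XPath can see enough of the document when run as a schema assertion. It folds the prerequisite into the predicate as a single XPath 2.0 conditional and records the TA id in an xs:annotation, which XSD 1.1 permits as a child of xs:assert.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def combined_test(prerequisite: str, predicate: str) -> str:
    """Fold a TA prerequisite into its predicate as one XPath 2.0 expression.

    The result is true whenever the prerequisite does not hold, so an
    inapplicable TA looks the same as a passing one.
    """
    return f"if ({prerequisite}) then ({predicate}) else true()"


def assert_element(ta_id: str, prerequisite: str, predicate: str) -> ET.Element:
    """Build an xs:assert element that carries the TA id in its documentation."""
    node = ET.Element(f"{{{XS}}}assert", {"test": combined_test(prerequisite, predicate)})
    annotation = ET.SubElement(node, f"{{{XS}}}annotation")
    documentation = ET.SubElement(annotation, f"{{{XS}}}documentation")
    documentation.text = ta_id
    return node


# Serialise with the conventional xs: prefix.
ET.register_namespace("xs", XS)

# Expressions copied from the INTOT1 test assertion above.
intot1 = assert_element(
    "INTOT1",
    "count(distinct-values(//*/@currencyID)) eq 1",
    "number(./cbc:LineExtensionAmount) eq "
    "sum(/in:Invoice/cac:InvoiceLine/cbc:LineExtensionAmount)",
)
print(ET.tostring(intot1, encoding="unicode"))
```

One consequence of combining the two expressions this way is that "prerequisite not met" and "predicate passed" both come out as true, so the distinction between irrelevant and pass has to be recovered somewhere else if the report needs it.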

That's how I'd like it to work but I'm told there is a hitch to this. The XML Schema 1.1 assert cannot look up values from another part of the document: its XPath sees only the element it is attached to and that element's contents. Duh! Still, I can handle it perhaps. I need to take all the values from the UBL that I want to test, calculate the appropriate totals, do the appropriate lookups and put the results in a special XML file, which I might call the provisional report file. It is then this file's markup, specially designed to support my test assertions, which I define using XML Schema 1.1.

e.g.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="InvoiceCalculationModelReport">
   <xs:complexType>
    <xs:sequence>
     <xs:element name="IN1">
      <xs:complexType>
       <xs:sequence>
        <xs:element name="InvoiceCount">
         <xs:simpleType>
          <xs:restriction base="xs:int">
           <xs:assertion test="$value ge 1"/>
          </xs:restriction>
         </xs:simpleType>
        </xs:element>
       </xs:sequence>
      </xs:complexType>
     </xs:element>
     <xs:element name="INTOT1">
      <xs:complexType>
       <xs:sequence>
        <xs:element name="DistinctCurrencyCount">
         <xs:simpleType>
          <xs:restriction base="xs:int">
           <xs:assertion test="$value eq 1"/>
          </xs:restriction>
         </xs:simpleType>
        </xs:element>
        <xs:element name="LegalMonetaryTotalLineExtensionAmount">
         <xs:simpleType>
          <xs:restriction base="xs:decimal"/>
         </xs:simpleType>
        </xs:element>
        <xs:element name="SumOfLineExtensionAmounts">
         <xs:simpleType>
          <xs:restriction base="xs:decimal"/>
         </xs:simpleType>
        </xs:element>
       </xs:sequence>
       <xs:assert test="SumOfLineExtensionAmounts eq LegalMonetaryTotalLineExtensionAmount"/>
      </xs:complexType>
     </xs:element>

     <!-- ... -->
   </xs:sequence>
  </xs:complexType>
 </xs:element>
</xs:schema>


Then it is a relatively simple matter to extract values using code, database or XSLT (with help from the TA XPaths perhaps if the latter is used) from the target UBL invoice into a provisional report file which is validated by this schema. The report file would, in this example with just a few test assertions, look like the following:

<?xml version="1.0" encoding="UTF-8"?>
<InvoiceCalculationModelReport>
  <IN1>
   <InvoiceCount>1</InvoiceCount>
  </IN1>
  <INTOT1>
   <DistinctCurrencyCount>1</DistinctCurrencyCount>
   <LegalMonetaryTotalLineExtensionAmount>100</LegalMonetaryTotalLineExtensionAmount>
   <SumOfLineExtensionAmounts>100</SumOfLineExtensionAmounts>
  </INTOT1>
  <!-- ... -->
</InvoiceCalculationModelReport>
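As a rough sketch of the "code" route for producing that provisional report file, the following Python uses only the standard library. The function name `build_report` and the trimmed-down sample invoice are invented for illustration (the sample is nowhere near a schema-valid UBL invoice); the namespace URIs are the standard UBL 2 ones. Amounts are summed as decimals rather than floats.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

NS = {
    "in": "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2",
    "cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2",
    "cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2",
}

# Hypothetical, heavily trimmed invoice used only to exercise the sketch.
SAMPLE = """<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
  xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
  xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2">
  <cac:LegalMonetaryTotal>
    <cbc:LineExtensionAmount currencyID="GBP">100</cbc:LineExtensionAmount>
  </cac:LegalMonetaryTotal>
  <cac:InvoiceLine>
    <cbc:LineExtensionAmount currencyID="GBP">60</cbc:LineExtensionAmount>
  </cac:InvoiceLine>
  <cac:InvoiceLine>
    <cbc:LineExtensionAmount currencyID="GBP">40</cbc:LineExtensionAmount>
  </cac:InvoiceLine>
</Invoice>"""


def build_report(invoice_root: ET.Element) -> ET.Element:
    """Extract the values the report schema asserts on into a provisional report."""
    invoice_tag = f"{{{NS['in']}}}Invoice"
    # Mirrors count(//in:Invoice): every Invoice element, the root included.
    invoice_count = sum(1 for _ in invoice_root.iter(invoice_tag))
    # Mirrors count(distinct-values(//*/@currencyID)).
    currencies = {el.get("currencyID") for el in invoice_root.iter() if el.get("currencyID")}
    total = invoice_root.findtext(
        "cac:LegalMonetaryTotal/cbc:LineExtensionAmount", namespaces=NS
    )
    line_amounts = invoice_root.findall("cac:InvoiceLine/cbc:LineExtensionAmount", NS)
    line_sum = sum((Decimal(el.text) for el in line_amounts), Decimal(0))

    report = ET.Element("InvoiceCalculationModelReport")
    in1 = ET.SubElement(report, "IN1")
    ET.SubElement(in1, "InvoiceCount").text = str(invoice_count)
    intot1 = ET.SubElement(report, "INTOT1")
    ET.SubElement(intot1, "DistinctCurrencyCount").text = str(len(currencies))
    ET.SubElement(intot1, "LegalMonetaryTotalLineExtensionAmount").text = total
    ET.SubElement(intot1, "SumOfLineExtensionAmounts").text = str(line_sum)
    return report


report = build_report(ET.fromstring(SAMPLE))
print(ET.tostring(report, encoding="unicode"))
```

The extraction step does no judging of its own: it only records counts and totals, leaving the pass or fail decision to the assertions in the report schema.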


The added advantage is that I can then get results into my provisional report file whatever the target system, even for a non-software system. The final output is the final report file, which reports pass/fail/irrelevant or the like for each assert or, better still, for each TA.

In the open source tool for executing test assertions written in XPath, called Tamelizer (hosted on Google Code) there is a similar way to handle such scenarios. The Test Assertions (TAs) are written in TAML and there is a provisional markup written describing the features of the targeted product. The executable TAs are written in terms of the markup used to describe the product features and executed against that marked up description document.

Here's a shot at how it would work for a target which is a mechanical widget with some features as follows:
1) red button on top
2) number of batteries = 2
3) voltage = 6V
4) alarm sound = continuous
5) country of destination = UK


The spec might have some TAs written for it such as

TA1: if there is a 6V battery then the button on top MUST be red
TA2: there MUST be at least 1 battery
TA3: if there are two batteries then their voltage MUST be 3V
TA4: if the country of destination for the widget is US then the button on the widget MUST be blue


The features can be marked up with some XML. The test cases are written as the TA expressions, but as XPaths in terms of that markup.

This is where XSD 1.1 comes in. The markup for the features is now defined in XSD 1.1 so that the test cases can be added into the asserts of the respective elements.

The markup might be

<widget>
  <buttonColour>red</buttonColour>
  <batteries>
    <number>2</number>
    <voltage units="Volts">6</voltage>
  </batteries>
  <destination>UK</destination>
  ...
</widget>


The expression of a TA in terms of the above might be

<TA id="TA2"><predicate language="XPath">/widget/batteries/number &gt;= 1</predicate>...</TA>

The schema definition for batteries might be

<xsd:element name="batteries">
...
<xsd:assert test="number &gt;= 1"/>
...
</xsd:element>
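To make the four widget TAs concrete, here is a minimal Python sketch that evaluates each one against the feature markup, treating the "if ..." part of a TA as its prerequisite. It is an illustration only and not how Tamelizer or an XSD 1.1 processor works internally. It assumes the `voltage` element holds the value that TA1 and TA3 are talking about, and the outcome labels are simply the pass/fail/irrelevant wording used earlier.

```python
import xml.etree.ElementTree as ET

WIDGET = """<widget>
  <buttonColour>red</buttonColour>
  <batteries>
    <number>2</number>
    <voltage units="Volts">6</voltage>
  </batteries>
  <destination>UK</destination>
</widget>"""

# Each TA is (id, prerequisite, predicate); both are functions of the widget element.
TAS = [
    ("TA1", lambda w: w.findtext("batteries/voltage") == "6",
            lambda w: w.findtext("buttonColour") == "red"),
    ("TA2", lambda w: True,  # no prerequisite: always applies
            lambda w: int(w.findtext("batteries/number")) >= 1),
    ("TA3", lambda w: int(w.findtext("batteries/number")) == 2,
            lambda w: w.findtext("batteries/voltage") == "3"),
    ("TA4", lambda w: w.findtext("destination") == "US",
            lambda w: w.findtext("buttonColour") == "blue"),
]


def evaluate(widget: ET.Element) -> dict:
    """Return one outcome per TA id: 'pass', 'fail' or 'irrelevant'."""
    results = {}
    for ta_id, prerequisite, predicate in TAS:
        if not prerequisite(widget):
            # The TA does not apply to this widget at all.
            results[ta_id] = "irrelevant"
        else:
            results[ta_id] = "pass" if predicate(widget) else "fail"
    return results


results = evaluate(ET.fromstring(WIDGET))
for ta_id, outcome in results.items():
    print(ta_id, outcome)
```

Under those assumptions the sample widget passes TA1 and TA2, fails TA3 (two batteries but 6V rather than 3V) and leaves TA4 irrelevant because the destination is UK, so each result still points back to exactly one TA id.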



Now the point is that the assertion needs to meet some criteria of TAs. One is that the report needs to refer to exactly one TA by its ID. There might be several TAs applying to the same element, so the XSD 1.1 might or might not meet this requirement. This relates to the fact that a TA should itself map to a single requirement in the spec. The TA needs to be self-contained too so that this mapping is unambiguous.

If there is a prerequisite for a TA then there needs to be a corresponding prerequisite for the assertion in the schema.

It gets complex when I have a single prerequisite for many TAs together, but I can just apply the same prerequisite logic to each and every assertion individually. If I do, though, the XPath tests might take longer to run. If I structure the schema, and if I have control over the features markup too, I might be able to put my prerequisites on ancestor elements so that descendants' assertions are irrelevant if those higher assertions fail.

Now this is where portability might apply. I really want to make each assertion composable, and composability makes portability desirable. Say I have many profiles for my spec: I might want some TAs to be transferred between those profiles. This might mean I have to make the assertions in the schema portable for those TAs so that I can move them from a schema for one profile into a schema for another one. Having those higher-level assertions might hinder this: it might be better to have any prerequisites added to every assertion where they apply, so that I can always copy that element with its prerequisites and predicate expressed all in the one assertion. That is really my main point. Plus I tend to think the TAs are going to be each self-contained and atomic, so the assertions in the schema relating to them might need to be too.

Now I can go back to the real world example of the UBL invoice test assertions and instead of writing an XSD 1.1 schema again for UBL, I can write a schema for a list of elements to take values extracted from a UBL invoice - the tax totals, line totals, etc. I can derive it from an invoice using XSLT, say, or just programming code. Now the schema for this totals document can itself be written in XML Schema 1.1 and can be modelled along the lines of my set of test assertions. It can then take XPath Boolean expressions derived from my Test Assertions (if the latter are already written in XPath I might only want to combine the prerequisite and predicate expressions to get my XSD 1.1 assert expression). Then I can execute the schema against the totals in their XML markup and, using say a tool like Saxon which can read and execute the XSD 1.1, obtain a list of any deviations from the rules.

Not exactly keeping it simple but sometimes it has to be just a little complex to accomplish what we need, just no more complex than it needs to be, hopefully.

I have to say though, even when compared to Schematron, another assertion-based schema language based on XPath and executed using a two-step XSLT approach, I do rather prefer Tamelizer, the Google Code project, with XPath expressions in executable TAML test assertions. Maybe it won't get as wide use as XML Schema 1.1 but it is just right, I think, for test cases for XML document targets. It takes more skill to write all of the reasoning into the TAML XPath expressions but they execute against the actual target XML so for this kind of target it is worth that little more struggle getting the XPaths right. The prerequisite facility makes it all the more satisfactory in my view.

Document Engineering: Test Assertions and XML Schema Assertions

Computer software does sometimes require some engineering discipline to keep it running smoothly. Software sometimes needs to be spoon fed information and sometimes that information comes in the form of documents similar to those read by humans (invoices can be sent from computer system to computer system to simplify and improve the efficiency of business transactions, perhaps over the Internet). Sometimes humans need to use software to write documents and software to read those same documents. The software used to write a document might be different from the software used to read it (Word at one end and Open Office at the other, say). On the Internet, web browsers need to read websites written and served up with various kinds of software packages from various software producers. All this makes it important at times to apply some engineering practices to ensure things work well.

One such practice is the process from specification to conformance test or interoperability test. The specification for, say, a document might require that software reading the document handle it in this way or that way. It might also say how the document is to be written, perhaps using one of the many markup languages such as those which are based on the W3C standards authority's eXtensible Markup Language (XML). One software engineering practice used over and over for decades of computer system history is the production of many test assertions corresponding to statements in a specification. These are atomic restatements which get numbered or indexed in some way so that a test based on the spec can be tied to an individual statement in the spec. The test assertions are a bit like a special engineering index for the spec to help with testing.

Now a test assertion exists so that testers know that there is something particular which they ideally ought to make a special point of testing. They can refer to this test assertion in their test so that a failure of a component being tested can be tied in a test report to the exactly relevant item in the spec. Not complicated really. They often call the individual tests 'test cases'. Large, complicated systems can involve many thousands of test assertions (for large, complicated specifications, of course) and similarly large numbers of test cases based on these assertions. The test assertions could fill a fair bit of a database or a pretty big file of data, depending how they are stored. Then you have to be able to cater for various versions of the specs and various versions of the software or documents being tested. Simplicity might help make it all manageable and help real people keep track of it all. 

Now I have an interest at the moment in a particular technology and I'd like to explore an idea: for XML document testing, and maybe any kind of testing when the tests are documented first as XML documents, you can either write the test using this technology or, when the tests themselves are run some other way, turn test results into test reports using it. The technology in question is W3C XML Schema version 1.1. This allows an assertion, a bit like a test assertion, to be inserted into the middle of the definition of a particular part of an XML document, written using a special syntax for expressions relating to that document. Test assertions targeting XML documents do not have to be expressed so that they can be executed as a test applied to the XML documents. There are some well-known examples where they are written this way (some WS-I web services specs have such test assertions), but there the test assertions are really doubling up as test cases. I reckon the assertion feature in XML Schema 1.1 might be used for such test-assertion-like test cases, but how? If you write a set of descriptions of tests using XML markup of some kind and define that markup using XML Schema 1.1, I reckon you could match every atomic statement of a requirement or the like in a spec to a test assertion for that requirement, put an element in the markup to report on the testing of the test assertion, and define that element in the XML Schema. If the XML Schema is written using version 1.1 then you can put an assertion into the definition of the element which, when executed as an expression against the element in the test report, produces a yes/no answer (true/false, it being a Boolean 'predicate' expression).

That might be one way to do it, which results in a layer of yes/no answers which can be overlaid on the report to show whether test results are conforming to the test assertions or not. Another way works if the target is itself an XML document, but here it gets more challenging. The assertions are run against the XML target, but how do they get stored in the XML Schema, I wonder? The test assertions might themselves be documented using markup such as the Test Assertion Markup Language (in progress) from OASIS's Test Assertion Guidelines Technical Committee (which I helped write up). Then we are left with the test cases, and I wonder whether they can somehow be put into the form of a W3C XML Schema 1.1 schema. They could be put into the form of a Test Assertion Markup Language XPath profile document and executed that way against the target XML using a tool like the Google Code project's Tamelizer (by Fujitsu America, based on previous work with WS-I), but that doesn't use XML Schema 1.1, it uses XSLT 2.0, which is cool. (Schematron is a similar alternative too, which also uses XSLT.) I'd like to attach assertions, in the XPath used by XML Schema 1.1 and the above alternatives, to some kind of schema, but it isn't obvious how to do so. I think I'd need a report-like XML structure with one element for each test case and, for each element (or element's 'type', an XML Schema thing), a schema definition where I can insert the XPath assertion expression. That doesn't work. I don't want to run the schema against the XML structure, I want to run the assertions in it against the target XML. No good. The target XML might have its own schema. What I have to do is create a schema for the target and put my assertions there. But that constrains the way I define my schema, perhaps not the way I want to do it for the type of XML document I am testing. Still, it is an option. Another is to write tests and report on them in a test report document and define the test report document's XML using the schema where I put the assertions. That means two steps and is the same as the first option I looked at above, which need not be limited to XML targets (or even software targets, similarly with Tamelizer).

Right, so I might decide I can indeed use XML Schema 1.1 to define the target and put my assertions into the individual elements' definitions. I'd probably want a 'global' element definition for each element which always has the same test assertion(s) applying to it wherever it occurs. If the test assertions depend on where the element occurs then I might have to have local elements defined so they can have a set of test assertions applied to them depending on their context (where they occur in the document). This messes with my design a bit but seems to be part and parcel of this technique. All in all, I need my assertions to map to the test assertions and the test assertions to map to the 'normative' statements in the spec for that type of document. OK, fine. It means that to use this approach I have to base my schema design not just on the document's XML structure and on the processes intended for the document, which might impose requirements on the schema design, but also partly on the spec design, insomuch as that dictates the test assertions and their granularity. I might then have a mix of global element definitions and local element definitions, but I would foresee possible problems if an element has some assertions relating to it as a target which depend on its context in the document and some which don't. In these cases the local, context-dependent requirements (and their test assertions) trump the global, context-independent ones, and the element might just have to be defined with one or more local definitions in the schema.

I conclude I'm fairly comfortable with the use of W3C XML Schema 1.1 for associating test assertions with XML documents and even with other targets for testing. It might, though, be limited to applying a kind of truth table of yes and no test reports as a layer over the top of a more general test report for tests made some way other than with the assertion expressions themselves, and this goes along similar lines to those used in tools like Tamelizer. To go further and make the schema double as the set of test cases (the test suite or part of it) requires, I think, that the target be an XML document which is itself defined using a schema under the control of the test assertion and/or test case author(s). Cool.