Wednesday, July 25, 2007

Bug day is tomorrow...

... but I must honor all schema locations today

I was bitten by this bug so I fixed it, although it may not make it into the next WTP maintenance release since it contains more or less "gratuitous" UI changes/additions.

The underlying problem is that of XML schemas that directly or indirectly import several schema files from different locations, like this:

A.xsd defines element nsA:A in nsA and imports B1.xsd and B2.xsd
B1.xsd defines element nsB:B1
B2.xsd defines element nsB:B2
A-instance.xml contains an nsA:A which contains a nsB:B1 and a nsB:B2.

(confused? This is a simplification of an example derived from a simplification of the real-world problem)

Now, the XML Schema spec says this is undefined and that a processor is free to ignore the schemaLocation attribute of the subsequent imports. Nevertheless, the Danish government office for IT standardisation and such has decided in ~2003 to mandate such use in their guidelines (OIOXML), and if you want to design XML interfaces to work in the public sector in Denmark, you'd better play along. Sigh. Many implementations, such as Microsoft's, did the sensible thing and ignored the extra schemaLocation, some even with a warning.

Now, WTP's XSD editor supports this in general (due to elaborate 'best effort' lookups in the EMF model for XML Schema), but the validation framework uses the Xerces schema functionality, not the EMF model, for validation. This validation drives the annotations you see in the Problems view and the as-you-type squiggly lines in the editor.

Fortunately, the problem was mostly solved in WTP a while back, possibly in 1.0.1, by introducing the "Honour all schema locations" checkbox in the XML Schema preferences.
However, this only works for the validation of the schema files themselves, not for XML files which use such schemas.

Now at least there's a solution, if the patch is accepted.

Next question is: Which bug to pick for bugday?