Evaluating the X3D Schema with Semantic Tools
Marc PETIT (EDF) - Henry BOCCON-GIBOD (EDF) - Christophe MOUTON (EDF)
- X3D turns 18 this year !
- It is a successor to VRML (1.0 : 1994; 2.0 : 1997).
- A large part of X3D's data model comes from VRML.
- X3D even has a VRML encoding.
- 2004 : X3D appears as v3
- X3D's evolution is guided by several Working Groups.
- Each one focuses on a topic : CAD, GIS, medical ...
- With this long and rich history comes complexity.
- Maintaining consistency in this context is difficult at best.
X3D : the need for consistency
- The need for consistency in 3D data depends on the target.
- For art or games, it may not be very high as long as the behaviour seems correct.
- For CAD or medical visualization, any issue can have dire consequences.
- In XML, data consistency is mostly enforced by the document's schema.
- Other solutions are available (like schematron), yet not always used.
- An online validator is available at https://savage.nps.edu/X3dValidator
(thanks to Don Brutzman at Naval Postgraduate School).
- Using additional validation services makes the system more complex.
- The set of rules which make up the standard is defined by the schema.
- The ability to control data quality is important for many applications.
- The Ariane 5 launch failure in 1996 was due to improper ranges of values !
- Our company operates 58 civil nuclear reactors ...
- 3D is critical for none of these, yet quality assurance is mandatory for us.
- A methodology for systematic XML schema quality control is required.
Metamodels and their uses
- Metamodels are models of models.
- A classical metamodel is UML.
- In order to avoid multiplying levels, the last one usually is self-described.
- This led to the introduction of the Meta-Object Facility by the Object Management Group.
- Each metamodel has specific uses.
- Terminology (categories of things)
- Assertions (actual things)
- An XML schema is a description of the structure and limits of a valid document.
- Its philosophy is inherited from the Document Type Definition.
- Schemas are more powerful, allowing a more precise definition of structure and values.
- A schema is an XML document.
- XML schemas are validated by XMLSchema.xsd.
- Analysis of the schema is guided by its tree structure.
- X3D documents are validated by the X3D schema.
- A fragment from the 3.2 schema :
- This fragment defines the IndexedFaceSet element.
- The whole document is more than 10,000 lines long !
- 65 simple types, 72 complex types, 232 elements.
- Anonymous types, hundreds of attributes.
- Keeping this document consistent is a complex task at best.
Resource Description Framework
- RDF is a framework for description (knowledge representation).
- It is based on assertions.
- An assertion is a triple : subject - predicate - object.
- A more usual way to refer to this is : subject - verb - complement.
- The lack of document structure makes contradictions possible.
- The absence of a statement doesn't imply its falseness.
- This is the open world assumption.
- This can be opposed to default values in schemas.
- New assertions can thus be deduced by reasoning.
- Anything is a resource.
- Only a Uniform Resource Identifier is needed.
- RDF Schema extends the RDF vocabulary.
- It introduces new resources which help in structuring knowledge.
- It introduces the notions of class and property.
- Properties are also part of RDF Schema.
- Properties can have domain and range.
- Specific properties like "label" or "subClassOf" are included.
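The ideas on this slide (triples, deduction of new assertions, open world assumption) can be illustrated with a minimal Python sketch. It is not a real RDF library: triples are plain tuples, every `ex:` resource name is a hypothetical example, and only two RDFS entailment rules based on `rdfs:subClassOf` are applied.

```python
# Minimal sketch: RDF-style assertions as (subject, predicate, object) tuples.
# All "ex:" resource names are hypothetical examples, not taken from X3D.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    ("ex:Box", SUBCLASS_OF, "ex:Geometry"),
    ("ex:Geometry", SUBCLASS_OF, "ex:Node"),
    ("ex:myBox", RDF_TYPE, "ex:Box"),
}


def infer_types(facts):
    """Apply two RDFS entailment rules until no new triple appears."""
    facts = set(facts)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                if p2 != SUBCLASS_OF or s2 != o:
                    continue
                if p == RDF_TYPE:
                    # x type A, A subClassOf B  =>  x type B
                    new.add((s, RDF_TYPE, o2))
                elif p == SUBCLASS_OF:
                    # A subClassOf B, B subClassOf C  =>  A subClassOf C
                    new.add((s, SUBCLASS_OF, o2))
        if new <= facts:
            return facts
        facts |= new


def holds(facts, triple):
    """Open world: a missing triple is unknown (None), never False."""
    return True if triple in facts else None


closed = infer_types(triples)
print(holds(closed, ("ex:myBox", RDF_TYPE, "ex:Node")))
print(holds(closed, ("ex:myBox", RDF_TYPE, "ex:Light")))
```

The second query returns `None` rather than `False`: nothing states that `ex:myBox` is a light, but nothing states the contrary either.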
Web Ontology Language
- An ontology is a formal description of the concepts and entities of a domain.
- Concepts and their relationships form the terminology.
- Knowledge about the concepts and entities is supported by assertions.
- Knowledge can be extended by reasoning.
- The OWL language extends RDFS. It adds new features to it :
- Property characteristics and restrictions.
- Transitive, symmetric, functional, cardinality ...
- Mapping (equivalentClass, sameAs ...).
- Complex classes (intersection, union ...).
- It represents knowledge, and nothing else.
- Nothing is implied by structure or by default.
- Is X3D about description or document ?
- Is content or structure more important ?
Semantic evaluation process of the schema
- Before even considering data, the semantic validity of the schema in itself should be assessed.
- We use a process in 4 steps.
- Extract knowledge from the schema.
- Convert the schema to an ontology with an XSL transform.
- Clean the extracted ontology.
- Reordering and simplification are achieved by opening and saving the ontology with the Protege editor.
- The use of a knowledge base makes simplification possible.
- Reasoning could be used here to enhance the available knowledge (RDFS would be enough at the moment).
- Produce an analysis report.
- Suspicious patterns are encoded into an XSL which directly produces an HTML report.
- Pattern detection is made easier by the use of Protege.
- Analyze the report.
- This is the only human action (diagnosis is too specific and complex to be automated).
- For X3D 3.3, 81 out of 469 attributes are detected as suspicious.
XML Schema to OWL mapping
- There are many possible mappings between XSD and OWL.
- The mapping has to be adapted to the need.
- Usually, the goal is to map 3D to other data (through the owl:sameAs property).
- In our process, the ontology is only used for terminology control.
- As a consequence, it doesn't need to be populated.
- Elements and attributes are translated into OWL classes and thus can have properties.
- Links between elements, types and attributes are translated into properties.
- The conversion is based upon three rules.
- Named complex types are mapped to classes, wherever they are defined.
- Anonymous complex types are expanded in place.
- Named simple types are mapped to classes, wherever they are defined.
- Anonymous simple types are not mapped. This makes detection of lack of consistency between simple types easier.
- The mixed attribute is not mapped.
- The included text is unstructured, so trying to check its model is futile.
- Attributes are mapped to properties.
- Attributes are identified by name only. Homonymous attributes are considered a single property.
- This is why we need to simplify the ontology : stating tens of times that a name is an SFString is useless.
- Attributes with anonymous type are detected.
- It is possible that defining a named type is useless.
- There is however a risk that such a type should have been shared.
- Attributes with multiply defined type are detected.
- A generic concept could exist with multiple implementations.
- The concept is usual in meta-programming.
- This can be implemented in OWL as a property with no defined range, with a subproperty for each type.
- It can also be the symptom of an incompletely defined concept.
- As long as the possible inconsistency isn't identified and documented, there remains an unacceptable risk.
- Considering the list of issues detected with these two rules, we have stopped there for now.
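The two detection rules (attributes with an anonymous type, attributes with a multiply defined type) can be sketched as follows. This is an illustration in Python, not the XSL transforms used by the authors, and it reads the schema directly instead of going through an ontology. The schema fragment is a hypothetical, heavily simplified sample, and the function name `suspicious_attributes` is an assumption made for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical, heavily simplified schema fragment for illustration only.
SAMPLE_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Contact">
    <xs:complexType>
      <xs:attribute name="depth" type="SFFloat"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="FogCoordinate">
    <xs:complexType>
      <xs:attribute name="depth" type="MFVec3f"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="Arc2D">
    <xs:complexType>
      <xs:attribute name="startAngle">
        <xs:simpleType>
          <xs:restriction base="SFFloat"/>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def suspicious_attributes(xsd_text):
    """Return (names with an anonymous type, names with several named types)."""
    root = ET.fromstring(xsd_text)
    named_types = defaultdict(set)
    anonymous = set()
    for attr in root.iter(XS + "attribute"):
        name = attr.get("name")
        if name is None:
            continue  # attribute references (ref="...") are ignored here
        if attr.find(XS + "simpleType") is not None:
            anonymous.add(name)  # type defined in place, not shared
        elif attr.get("type") is not None:
            # Attributes are identified by name only, whatever their element.
            named_types[name].add(attr.get("type"))
    multiple = {n: sorted(t) for n, t in named_types.items() if len(t) > 1}
    return sorted(anonymous), multiple


anonymous, multiple = suspicious_attributes(SAMPLE_XSD)
print("anonymous type:", anonymous)
print("multiply defined type:", multiple)
```

Grouping by attribute name alone mirrors the choice of treating homonymous attributes as a single property. The output is only a list of candidates: deciding whether each one is a real inconsistency remains a human task.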
Evaluation of the X3D schema
Example issue : angle
- 5 attributes are related to the angle concept.
- startAngle, endAngle, creaseAngle, cutOffAngle, beamWidth
- For each one, the type is defined locally.
- As a float between -6.2858 and 6.2858.
- These are not the best IEEE 754 approximations for 2*Pi and -2*Pi.
- As a positive float.
- There is no maximum value.
- As a positive float limited to 1.570796.
- This local definition has several drawbacks.
- The schema is bigger than it needs to be (because of multiple definitions for the same type).
- There are no identified relationships between angles.
- These could be useful for scripting or code generation.
- Adding the needed types is transparent for X3D documents !
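The remark about approximations can be checked numerically. The sketch below compares the two bounds quoted on this slide with 2*Pi and Pi/2, assuming that single precision (IEEE 754 binary32) is the relevant format; the helper `to_float32` and the constant names are introduced only for this illustration.

```python
import math
import struct


def to_float32(x):
    """Round a Python float (binary64) to the nearest IEEE 754 binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


SCHEMA_BOUND = 6.2858       # bound quoted on the slide for 2*Pi
SCHEMA_HALF_PI = 1.570796   # bound quoted on the slide for Pi/2

two_pi_32 = to_float32(2 * math.pi)
half_pi_32 = to_float32(math.pi / 2)

print(f"2*pi as binary32 : {two_pi_32!r}")
print(f"schema bound exceeds 2*pi by : {SCHEMA_BOUND - 2 * math.pi:.6f}")
print(f"pi/2 as binary32 : {half_pi_32!r}")
print(f"schema bound is below pi/2 by : {math.pi / 2 - SCHEMA_HALF_PI:.3e}")
```

Under these assumptions, the first bound lets through values slightly larger than a full turn, while the second one rejects values just below a quarter turn. A single shared angle type would make such bounds defined, and corrected, in one place.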
Example issue : depth
- The depth attribute exists with two different types :
- SFFloat on Contact.
- MFVec3f on FogCoordinate.
- The difference is too large for a well defined concept.
- SFFloat and MFFloat could be right considering the difference of dimension of the supporting objects.
- There is however no reason for a 3D depth for each coordinate of a fog.
- How has this gone unnoticed for so many years ?
- The regular expression supposed to check the validity of MFVec3f is flawed.
- An empty string is valid from the schema's viewpoint.
- The same is true for SFVec3f.
- The number of values doesn't have to be a multiple of 3 !
- Most people don't actually read the schema or test error cases.
- Even worse, the specification is correct and inconsistent with the schema.
- Section 24.4.3 of the specification states that FogCoordinate.depth is of type MFFloat.
- The schema states it is an MFVec3f.
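The kind of flaw described for the MFVec3f and SFVec3f patterns can be reproduced with a small Python sketch. The lax pattern below is hypothetical and is not the regular expression used in the X3D schema; it only shows how a "list of floats" pattern accepts both an empty string and a number of values that is not a multiple of 3. The stricter patterns are one possible alternative, not a proposed replacement for the schema.

```python
import re

NUMBER = r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?"

# Hypothetical lax pattern: any whitespace-separated list of floats.
# It is NOT the pattern found in the X3D schema.
LAX_MFVEC3F = re.compile(rf"\s*(?:{NUMBER}(?:\s+{NUMBER})*)?\s*")

# Stricter sketches: values must come as complete triples.
TRIPLE = rf"{NUMBER}\s+{NUMBER}\s+{NUMBER}"
STRICT_SFVEC3F = re.compile(rf"\s*{TRIPLE}\s*")
# Whether an empty list should be accepted is a design decision;
# this sketch requires at least one triple.
STRICT_MFVEC3F = re.compile(rf"\s*{TRIPLE}(?:[\s,]+{TRIPLE})*\s*")


def accepts(pattern, text):
    """True when the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None


for text in ["", "1 2 3 4", "0 0 1, 1.5 -2 3e2"]:
    print(repr(text), accepts(LAX_MFVEC3F, text), accepts(STRICT_MFVEC3F, text))
```

Even the stricter patterns only check the shape of the string: ranges and precision are out of reach for a regular expression.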
- Inline type definition
- Several simple types should be added (listed in the paper).
- Listing and adding them systematically actually reduces the size of the schema slightly.
- Regular expression based validation has limits.
- Too much freedom is left in the definition of some types.
- This was detected even without considering the double parsing issue.
- Inconsistencies exist between specification and schema.
- Precision may be inconsistent within a scene.
- GeoElevationGrid uses double precision (except for yScale) yet is an X3DGeometryNode and can be used anywhere.
- In some cases, very generic concepts are used.
- This is a false positive in our detection.
- Being able to understand the genericity of these concepts is however interesting.
Example of a corrupt "valid" file
- This file, despite its flaws, is considered valid by the schema.
- Even Schematron only warns about the 2.666 triples issue !
- We proposed a methodology for schema evaluation.
- A large part of it is automated, and makes it possible to focus only on possible problems.
- It was used on X3D, yet the automated part can be applied to any XML schema.
- As of today, only the attributes are investigated, and no actual OWL reasoning is applied.
- We proposed several enhancements to X3D.
- Some have a limited impact, others will break backward compatibility.
- Since the only documents that cannot be automatically transformed to the evolved schema are inconsistent ones, this would be beneficial anyway.
- We showed that an ontology helps keep a more consistent model.
- An ontology only focuses on knowledge and doesn't imply anything due to structure or default values.
- The knowledge inside the X3D schema could be integrated into an ontology.
- Evolution could be made more consistent.
- Ideas could emerge (are CAD people the only ones interested in visibility and layers ?).
- The Schema could be generated from this ontology.
- OWL offers possibilities (unused here) which could be used for data consistency checking.
- For the authors.
- Extend the transforms to other aspects of the schema.
- Attributes grouping.
- Elements and types relationships.
- Apply the methodology to other formats (COLLADA breaking news : the methodology was applied to COLLADA and an error was confirmed yesterday !).
- For the whole X3D community.
- Integrate the proposed evolutions.
- Create an ontology for X3D knowledge management.
- Our XSL transforms can be used as a basis.
- Use this X3D ontology to build X3D v4.
- Backward compatibility could be ensured by transforms.
- An OWL encoding should be added to X3D.
Thank you for your attention