Connecting XML, RDF and Web Technologies for Representing Knowledge on the Semantic Web
Slides: http://ilrt.org/people/cmdjb/talks/xmleurope2002/
Introduction
- HTML, XML and RDF compared
- Extensibility
- Schemas
- Webiness
- A little topic maps
- What to use when?
Representing Knowledge for whom/what?
- For people to read
- For machines to interpret
HTML - The Web of Markup
- Documents in markup for people to read
- Describing things in the real world (you may get sued)
- A web - URIs for linking to other documents
- Can point to anything, without synchronising with them
- ... even if it doesn't exist, the web doesn't break
- To a machine, very little information is available
Semantic Web - The Web of Data
- Documents such that some terms are usable by software
- A web means universal, scalable, decentralised and evolvable
- Connect the documents, point to anything, allow 404s
Types of Web Data
For this talk considering:
- XML - (mostly-)tree of structured elements, attributes, ...
- RDF - graph of nodes connected by directed URI-labelled arcs (arrows)
- Topic Maps - graph-like structure with many topic node-types
Extensible Markup Language (XML) overview
- Unicode
- Mostly-tree structure; some other pointers allowed
- Designed for documents, not data - featureful
- Describes documents, not things in the real world
- A different DTD or schema needed for each XML syntax
- Best practice(?) is to work with the Infoset; a new data model is being defined for XQuery and XPath
XML - Extensibility
- Extensibility via XML Namespaces, DTDs, schemas and/or entities
- Best current practice seems to be namespaces
- Mixing multiple XML languages is not well understood
- Fragile - unexpected elements or attributes cause validation or processing to fail
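As a rough illustration of the namespace-based extensibility above, the following Python sketch parses a document that mixes two vocabularies and separates the elements a consumer knows from the ones it does not. Both namespace URIs and the element names are invented for this example, and skipping foreign elements is only one possible policy; a validating consumer with a closed schema would reject them instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing two vocabularies. Both namespace URIs
# are made-up example.org names, not real vocabularies.
DOC = """\
<book xmlns="http://example.org/ns/book"
      xmlns:rev="http://example.org/ns/review">
  <title>Weaving the Web</title>
  <rev:rating>5</rev:rating>
</book>"""

BOOK_NS = "http://example.org/ns/book"  # the only namespace this consumer knows


def local_name(tag):
    """Drop the '{namespace URI}' part of an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


root = ET.fromstring(DOC)

known_names = []    # elements from the namespace we understand
foreign_names = []  # elements imported from any other namespace

for child in root:
    # ElementTree reports each name as "{namespace URI}local name".
    if child.tag.startswith("{" + BOOK_NS + "}"):
        known_names.append(local_name(child.tag))
    else:
        foreign_names.append(local_name(child.tag))

print("known:", known_names)
print("foreign:", foreign_names)
```

A consumer written this way tolerates the extra rev:rating element, but only because it was coded to expect the unexpected.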
Resource Description Framework (RDF) overview
- A graph of items identified by URIs, no local identifiers
- Graph is a set of statements (subject, predicate, object) like:
"The PREDICATE of SUBJECT is OBJECT"
- Describes things in the real world
- Formal model theoretic semantics
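The statement model above can be sketched in a few lines of Python: a graph is simply a set of (subject, predicate, object) triples. The Statement type, the describe helper and all the example.org URIs are assumptions made for this illustration; they are not part of RDF or of any RDF library.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One RDF-style statement. Hypothetical helper type for this sketch."""
    subject: str
    predicate: str
    obj: str


def describe(s):
    """Render a statement using the sentence pattern from the slide."""
    return f"The {s.predicate} of {s.subject} is {s.obj}"


# A graph is a set of statements. All URIs here are invented.
DOC = "http://example.org/doc/1"
graph = {
    Statement(DOC, "http://example.org/terms/creator", "Alice"),
    Statement(DOC, "http://example.org/terms/title", "A Sample Page"),
}

for statement in sorted(graph):
    print(describe(statement))
```

Because the graph is a set, the order of statements carries no meaning and a repeated statement adds nothing.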
W3C Semantic Web Standards Activity
W3C RDF Core Working Drafts
RDF - Extensibility
- Can always add more statements
- Extensible by XML namespaces for importing new sets of names
- RDF/XML syntax very general - no DTD
- Revised RDF/XML syntax now based on XML namespaces, base, infoset
- W3C RDF Core working on allowing W3C XSD datatypes
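To make "can always add more statements" concrete, here is a minimal Python sketch that merges two independently written descriptions of the same resource by set union. It assumes every node is a URI or a plain literal, and all the URIs and property names are invented for the example.

```python
# Two sources describe the same (hypothetical) resource independently.
DOC = "http://example.org/doc/1"
TITLE = "http://example.org/terms/title"
CREATOR = "http://example.org/terms/creator"
RATING = "http://example.org/review/rating"  # a name from a second vocabulary

graph_a = {
    (DOC, TITLE, "A Sample Page"),
    (DOC, CREATOR, "Alice"),
}
graph_b = {
    (DOC, CREATOR, "Alice"),  # repeats a statement already in graph_a
    (DOC, RATING, "5"),       # adds a property graph_a never anticipated
}

# Merging is set union: nothing in graph_a has to change, and the
# repeated statement collapses into one.
merged = graph_a | graph_b

for subject, predicate, obj in sorted(merged):
    print(subject, predicate, obj)
```

Neither source needed to know about the other's vocabulary in advance, which is the contrast with a fixed XML document structure.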
Schema Languages
Different approaches:
- XML schema languages describe or constrain the structure and content of XML
- RDF schema is a simple vocabulary language
XML Schema Languages
- W3C XML Schema describes the shape and structure of the tree
- Validate markup, content, integrity, links, ...
- Links to the web via namespace and schema URIs
- Model is one schema per namespace
- Namespace URI and local name are independent
- XSD uses (namespace URI, local name) pair as identifier
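The (namespace URI, local name) pair can be seen directly in Python's xml.etree.ElementTree, which reports each element name as "{namespace URI}local name". The sketch below splits that form into a pair and compares names across three small documents. The expanded_name helper and the example.org namespaces are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def expanded_name(tag):
    """Split an ElementTree '{uri}local' tag into a (uri, local) pair."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return (uri, local)
    return ("", tag)  # element in no namespace


# Same namespace, written with a prefix and with a default declaration.
DOC_A = '<a:title xmlns:a="http://example.org/ns/book">x</a:title>'
DOC_B = '<title xmlns="http://example.org/ns/book">x</title>'
# Same local name, different (hypothetical) namespace.
DOC_C = '<title xmlns="http://example.org/ns/person">Dr</title>'

name_a = expanded_name(ET.fromstring(DOC_A).tag)
name_b = expanded_name(ET.fromstring(DOC_B).tag)
name_c = expanded_name(ET.fromstring(DOC_C).tag)

print(name_a == name_b)  # the prefix is not part of the identity
print(name_a == name_c)  # the same local name in another namespace differs
```

The pair stays a pair here: the two parts are compared together but never concatenated into a single URI, which is the point of contrast with RDF, where names are URIs.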
RDF Schema Languages 1
RDF Schema Languages 2
- More checking via higher layers
- New languages: DAML+OIL / OWL from W3C Web Ontology WG
- Some support for XSD types in RDF property values
- W3C XML Schema for RDF/XML is hard
Webiness
- XML:
  - No built-in linking, needs XLink
  - Fragile for unexpected things
  - Each new layer needs a new application
  => XML is not webby
- RDF and XML Topic Maps:
  - Built-in linking via URI-refs
  - No terms forbidden
  - (RDF) layering expected
  => RDF and XTM are webby
Topic Maps
- A formerly non-web technology being applied to description on the web
- Rich description of subjects via subject indicators, identifiers
- Relationships between subjects - associations are scoped, associated with topics
- Processing model for manipulating topic maps
- See other sessions for more
Summary - XML
XML when:
- data is structural (like a C struct or union, with data types)
- format is fixed, internal
- when sharing, expect every consumer to write new code; publish a schema or DTD
Summary - RDF
RDF when:
- everything based on describing the real world
- expect the unexpected to turn up
- descriptive rather than detailed structural
- data types are not crucial, but nice
- want to leverage into rules, logic, semantic web
Summary - Topic Maps
Topic Maps when:
- relationships are complex, scoped - in context of other topics
- identity of entities in relationship needs careful consideration
- also usable for semantic web content
Questions?
Further Reading
References