June 14, 2003
The W3C Semantic Tour of Europe comes to London
On Wednesday I attended a day of lectures given by the W3C about the Semantic Web. Following are my notes from the two talks that I found the most interesting, which are mainly boiled-down versions of the speakers' slides. Online elsewhere are the slides from Ivan Herman's introduction to the W3C.
"Towards the Semantic Web" - Ora Lassila, Nokia
- Driver: Automation
- Short term goal: Interoperability
- Long term goal: "Departure from the Tool Paradigm"
Automation
Remove humans from the loop by making information understandable by machines - accessible formal semantics. Hypertexts become semantic networks. As it stands, the WWW is an architecture for linkages ("pointing") - linkages are data, but data has no meaning without interpretation, and human interpretation doesn't scale.
The Semantic Web is a team sport - the problem is to achieve critical mass.
Accessible formal semantics: Ontologies. Features:
- controlled vocabulary
- concept taxonomy
- other relationships between concepts
- they are "a specification of a conceptualization" (Thomas Gruber, Toward Principles for the Design of Ontologies Used for Knowledge Sharing, 1995)
- they enable reasoning
- robust ontologies prevent the problem of "semantic drift".
The Resource Description Framework (RDF) allows the extension of ontologies. RDF:
- a data model of Directed Labelled Graphs
- an XML-based syntax for serialisation of DLGs
- nodes and arcs in an RDF DLG are named by URIs
- graphs decompose into object/attribute/value "triples"; each triple is a statement of the form "subject/predicate/object".
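A minimal sketch of the triple model described above, not taken from the slides. It uses plain Python rather than a real RDF library; the `example.org` URIs and the `worksFor` property are hypothetical, while the two Dublin Core property URIs are real. Literal values are kept as plain strings for simplicity.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: an arc (predicate) from one node (subject) to another (object)."""
    subject: str
    predicate: str
    object: str


# Hypothetical namespace, for illustration only.
EX = "http://example.org/"
# Dublin Core element URIs.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

# A directed labelled graph is just a set of triples.
graph = {
    Triple(EX + "talk1", DC_TITLE, "Towards the Semantic Web"),
    Triple(EX + "talk1", DC_CREATOR, EX + "OraLassila"),
    Triple(EX + "OraLassila", EX + "worksFor", EX + "Nokia"),
}


def outgoing(graph, node):
    """Return the (predicate, object) arcs leaving a node, sorted for stable output."""
    return sorted((t.predicate, t.object) for t in graph if t.subject == node)


for predicate, obj in outgoing(graph, EX + "talk1"):
    print(predicate, "->", obj)
```

Because the graph is only a set of statements, two graphs from different sources can be merged with a set union, which is part of what makes the model easy to extend.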
The Web Ontology Language (OWL) is layered on RDF and offers more expressive power. The Semantic Web is built in a layered manner; not everybody needs all the layers. (Unicode → XML → RDF → RDF Schema → OWL...) XML is not a magic bullet; it's not enough just to use it. (Not only is XML not a lingua franca, it's not even a lingua.)
Ora prefers the "Darwinian" approach to the formation of ontologies rather than the committee approach. He feels that in the end a "mix and match" of individually-created and formally designed ontologies will prevail.
Reasoning and Inference
Reasoning allows one to draw inferences based on generalised "rules" - enabled by ontologies. It eases interoperability by allowing inferences between data in compatible formats. The Semantic Web, via ontologies and reasoning, will improve interoperability of information services.
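To make the idea of rule-based inference concrete, here is a small sketch that is not from the talk. It assumes just two generalised rules (subclass relationships are transitive, and an instance of a class is also an instance of its superclasses) and applies them repeatedly until nothing new can be derived. The class and instance names are invented for the example.

```python
def infer(triples):
    """Forward-chain two rules over (subject, predicate, object) triples
    until no new triples appear, then return the enlarged set."""
    facts = set(triples)
    while True:
        new = set()
        for (s, p, o) in facts:
            for (s2, p2, o2) in facts:
                if o != s2:
                    continue
                # Rule 1: subClassOf is transitive.
                if p == "subClassOf" and p2 == "subClassOf":
                    new.add((s, "subClassOf", o2))
                # Rule 2: instances inherit the superclasses of their class.
                if p == "type" and p2 == "subClassOf":
                    new.add((s, "type", o2))
        if new <= facts:
            return facts
        facts |= new


# Hypothetical ontology fragment and one piece of data.
stated = {
    ("SiliconChip", "subClassOf", "Component"),
    ("Component", "subClassOf", "Artifact"),
    ("chip42", "type", "SiliconChip"),
}

derived = infer(stated)
print(("chip42", "type", "Artifact") in derived)  # True
```

Nothing in the stated data says that `chip42` is an `Artifact`; that statement exists only because the ontology supplies the rules that connect the two.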
Tools and Beyond
Before: human-operated software. After: software assistants for a range of tasks.
Formal Semantics
"The manifest destiny of AI". Machine learning could bootstrap semantic annotation for existing content. Ontologies → Reasoning → Agents
"Migrating Thesauri to the Semantic Web" - Brian Matthews, Council for the Central Laboratory of the Research Councils
Searching the Web with current technology has difficulty with ambiguity (chocolate chips versus silicon chips) and complex queries ("find me the paper on X that Y wrote on date T"). The Semantic Web will allow searching by descriptions of documents rather than by their content.
There's a mismatch between terms of search and catalogue terms: translation is necessary. Restricted vocabularies [ontologies] - Knowledge Organisation Systems - can catalogue all items.
Levels of semantic structure may vary on different systems:
- simplest KOS: lists, maybe with cross-referencing (dictionaries).
- next level: classification schemes and taxonomies
- thesauri introduce more relationships: equivalence (exact, inexact, partial, one to many) - for multilingual thesauri; hierarchical (broader term, narrower term); associative (use, used for, related term).
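The thesaurus relationships listed above can be sketched as a small data structure. This is an illustration only, not something shown in the talk: the class and method names are hypothetical, and the sample terms are invented. It also shows the translation step between a searcher's term and the catalogue's preferred term.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    label: str
    broader: set = field(default_factory=set)   # hierarchical, upwards
    narrower: set = field(default_factory=set)  # hierarchical, downwards
    related: set = field(default_factory=set)   # associative
    used_for: set = field(default_factory=set)  # non-preferred synonyms


class Thesaurus:
    def __init__(self):
        self.terms = {}

    def term(self, label):
        """Fetch a term, creating it on first use."""
        return self.terms.setdefault(label, Term(label))

    def add_broader(self, narrow, broad):
        self.term(narrow).broader.add(broad)
        self.term(broad).narrower.add(narrow)

    def add_related(self, a, b):
        self.term(a).related.add(b)
        self.term(b).related.add(a)

    def add_used_for(self, preferred, alternative):
        self.term(preferred).used_for.add(alternative)

    def preferred(self, label):
        """Translate a search term into the catalogue's preferred term."""
        for t in self.terms.values():
            if label in t.used_for:
                return t.label
        return label

    def expand(self, label):
        """Preferred term plus every narrower term beneath it."""
        seen, stack = set(), [self.preferred(label)]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.term(current).narrower)
        return seen


t = Thesaurus()
t.add_broader("microprocessors", "integrated circuits")
t.add_broader("memory chips", "integrated circuits")
t.add_used_for("integrated circuits", "silicon chips")
t.add_related("integrated circuits", "semiconductors")

print(sorted(t.expand("silicon chips")))
# ['integrated circuits', 'memory chips', 'microprocessors']
```

A search for "silicon chips" is first mapped to the preferred term and then widened to its narrower terms, so it never touches anything catalogued under chocolate.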
Existing thesauri can be used to provide descriptions to use as a starting point to develop ontologies for the Semantic Web. The legacy of traditional thesaurus development is that rigorous schemas exist, but they were developed for human consumption. Traditional thesauri often provide additional relationships not covered by broader/narrower terms, etc. (Subtype of, instance...)
Thesauri on the Web: provide data in RDF and allow people to mark up their data using these terms. Migrating to the Semantic Web requires:
- increased precision in thesauri
- human intervention.
One possibility is term-to-term modelling (e.g. modelling the relationships in DAML+OIL). Alternative approach: use RDF Schema to model concepts used in thesauri.
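The difference between the two approaches can be sketched as the triples each would produce for a single broader/narrower pair. This is a guess at the shape of the idea, not the speaker's actual schema: `rdfs:subClassOf` stands in here for the DAML+OIL modelling mentioned in the talk, and the `example.org` namespace with its `Concept` class and `broader` property is hypothetical. The `rdf:type`, `rdfs:subClassOf` and `rdfs:label` URIs are the real ones.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical thesaurus vocabulary, for illustration only.
EX = "http://example.org/thesaurus#"


def uri(label):
    """Build a URI for a term from its label."""
    return EX + label.replace(" ", "_")


def term_to_term(narrow, broad):
    """Approach 1: each term becomes a class and 'broader' becomes subclassing."""
    return {(uri(narrow), RDFS_SUBCLASS, uri(broad))}


def concept_based(narrow, broad):
    """Approach 2: each term is an instance of a Concept class, and the
    thesaurus relationship is kept as its own property."""
    return {
        (uri(narrow), RDF_TYPE, EX + "Concept"),
        (uri(broad), RDF_TYPE, EX + "Concept"),
        (uri(narrow), RDFS_LABEL, narrow),
        (uri(broad), RDFS_LABEL, broad),
        (uri(narrow), EX + "broader", uri(broad)),
    }


for triple in sorted(concept_based("silicon chips", "integrated circuits")):
    print(triple)
```

The first approach commits to a strict class hierarchy, which a loosely defined "broader term" may not really justify; the second keeps the thesaurus semantics as they are and leaves any stronger interpretation for later.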
The task of migration is a complex one, requiring new tools and heuristics (a revised RDF thesaurus format? Alternative mappings?), but once it has been done, the process of searching for information on the Web will be improved dramatically.
Posted by Earle Martin at 10:09 PM