CWA Part 1b - SDShare: Protocol for the Syndication of Semantic Descriptions
1. Abstract
This document describes a protocol for the exchange of semantic descriptions. It defines how a RESTful web service can publish a series of web-accessible feeds that describe snapshots of, and changes to, collections of semantic descriptions. This protocol also defines how a client should process those feeds so that a local store stays in sync. A client can synchronize with more than one server to act as an aggregator for semantic descriptions.
NOTE: The current version of the specification, which continues to be subject to revision in the standardization process, is available under http://www.egovpt.org/fg/CWA_Part_1b
2. Scope
This document specifies the underlying syndication protocol for the exchange of information about semantic descriptions. The protocol conforms to the Atom Syndication Format and the Topic Maps Data Model (TMDM) and works with semantic descriptions represented in XTM 1.0, XTM 2.0 and RDF/XML. It defines several layers of syndication feeds that a conforming application must provide. Finally, it defines algorithms for the provision and processing of the different feeds on the server and on the client.
3. Normative References
Atom: RFC 4287 The Atom Syndication Format. December 2005.
HTTP: RFC 2616 HTTP 1.1. Revision: 1.8. September 2004.
OWL: OWL Web Ontology Language. Semantics and Abstract Syntax. W3C Recommendation, 10 February 2004
RDF: Resource Description Framework (RDF): Concepts and Abstract Syntax. February 2004
TMDM: ISO/IEC 13250-2:2006 Information technology - Topic Maps - Part 2: Data model
Topic Maps: ISO/IEC 13250-1:2003: Information technology - SGML applications - Topic maps
4. Concepts
This protocol defines the following structure of Atom feeds published by a server to expose one or more collection(s) of semantic descriptions:
- an overview feed which lists the collections of semantic descriptions that the server manages
- a collection feed for each collection of semantic descriptions, which links to a corresponding snapshots feed and a corresponding fragments feed
- a snapshots feed for each collection. Each snapshot represents the state of a collection at a point in time
- a fragments feed for each collection. Each fragment is created because it is the new representation of a topic that has changed
Semantic collections can be represented as topic maps or OWL-compliant RDF/XML.
The protocol also specifies how a client should interpret and process these feeds in order to syndicate the server's semantic descriptions. A client that wishes to keep a local semantic collection in sync with one held on the server first fetches the most recent snapshot for the required collection. It then subscribes to the feed of change fragments for that collection. This feed lists fragments, each of which represents one or more changes in the underlying semantic collection. A fragment is created, and an entry added to the feed, when a topic is updated in, added to or deleted from the collection. The next time the client pulls the feed, it updates its local collection with the new version of the topic.
NOTE: the notion of a client and server is solely defined by the responsibilities of each. Thus a given machine can act both as client and server (peer-to-peer scenario) or restrict itself to exactly one of the roles (publish-subscribe scenario).
NOTE: The term “topic” refers in this specification to any reification of a given semantic description.
The protocol offers no protection against malicious or broken servers. It requires the client to trust all upstream servers.
5. Protocol
5.1. Terminology
Server Node: A source of semantic descriptions that hosts both feed and data services so other nodes can observe and understand the state of the collection being managed over time.
Client Node: A node that subscribes to one or more server nodes and implements the update semantics defined in this protocol.
5.2. Server Contract
A server node is responsible for providing information about the state of the collection(s) it is managing. It provides a number of feeds that allow clients to see which aspects of the map have changed over time and data services that allow a client to fetch representations of the collection or individual topic instances in order to update a client environment.
5.3. Feeds & Data Services
A compliant server will provide the following Atom 1.0 feeds, fragment data services and snapshot data services.
5.3.1. Overview Feed
The overview feed lists all collections of semantic descriptions that are being managed by the server.
NOTE: This and the following class diagrams list selected properties of the Atom specification in addition to properties that are specific to this protocol.
Example Service URL: [server]/topicmaps
Example HTTP request for server http://sdshare.networkedplanet.com:
GET /topicmaps
The Atom payload of the overview feed contains an entry for each separate semantic collection. Each entry has a link to an Atom feed for the specified collection.
Example response body:
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Topic Maps managed by sdshare.networkedplanet.com</title>
  <link href="http://sdshare.networkedplanet.com/"/>
  <updated>2008-12-13T18:30:02Z</updated>
  <author>
    <name>SDShare Server</name>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
  <!-- topic map entry -->
  <entry>
    <title>eGovernment Resources Topic Map</title>
    <!-- a link to the atom feed that has entries linking to
         the snapshots and fragments feed for this map -->
    <link rel="http://www.egovpt.org/sdshare/collectionfeed" type="application/atom+xml"
          href="http://sdshare.networkedplanet.com/topicmaps/egov"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2008-12-13T18:30:02Z</updated>
    <summary>A topic map that contains the classification of eGovernment services.</summary>
  </entry>
  <!-- an entry follows for each topic map being exposed.
       ...
  -->
</feed>
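The following is a minimal sketch, in Python with only the standard library, of how a client might extract the collection feed links from an overview feed. The function name collection_feed_links and the trimmed SAMPLE document are illustrative assumptions and not part of the protocol; only the Atom namespace and the collectionfeed rel value are taken from the example above.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
COLLECTION_REL = "http://www.egovpt.org/sdshare/collectionfeed"


def collection_feed_links(overview_xml):
    """Return (title, href) pairs, one per collection feed link in the overview feed."""
    root = ET.fromstring(overview_xml)
    result = []
    for entry in root.findall(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="").strip()
        for link in entry.findall(ATOM + "link"):
            # Only links carrying the SDShare collection feed relation are of interest.
            if link.get("rel") == COLLECTION_REL:
                result.append((title, link.get("href")))
    return result


# Trimmed sample (hypothetical, modelled on the overview feed example).
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Topic Maps managed by sdshare.networkedplanet.com</title>
  <entry>
    <title>eGovernment Resources Topic Map</title>
    <link rel="http://www.egovpt.org/sdshare/collectionfeed"
          type="application/atom+xml"
          href="http://sdshare.networkedplanet.com/topicmaps/egov"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
  </entry>
</feed>"""

for title, href in collection_feed_links(SAMPLE):
    print(title, "->", href)
```

Matching on the rel attribute rather than on the position of the link keeps the sketch tolerant of servers that add further links to an entry.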
5.3.2. Collection Feed
A collection feed is a feed for a given collection of semantic descriptions that provides exactly two entries, one linking to a snapshots feed and one to a fragments feed. Through one or more share:dependency elements the collection feed can optionally declare dependencies on other collection feeds from this or other servers that should be recursively processed before the current one.
Example Service URL: [server]/topicmaps/egov
Example HTTP request for server psi.egovpt.org:
GET /topicmaps/egov
The Atom payload of the response contains the two entries, one that links to a feed with all snapshots of the collection (rel-attribute of the link is http://www.egovpt.org/sdshare/snapshotsfeed), and another one that links to an Atom feed that lists change fragments to the collection (rel-attribute of the link is http://www.egovpt.org/sdshare/fragmentsfeed). Optionally, both links can be duplicated in the entries with rel-attributes that have the value alternate. This helps Atom feed readers to correctly display the links.
<a:feed xmlns:a="http://www.w3.org/2005/Atom">
  <a:title>eGov TM</a:title>
  <a:updated>2008-09-26T11:13:40-01:00</a:updated>
  <a:subtitle>Feeds around the eGov TM</a:subtitle>
  <a:id>http://psi.egovpt.org/feeds/testtm/</a:id>
  <a:author>
    <a:name>Isidor</a:name>
  </a:author>
  <a:link href="http://psi.egovpt.org/feeds/testtm/" rel="self"/>
  <dependency xmlns="http://www.egovpt.org/sdshare">http://www.otherserver.org/feeds/ontology/</dependency>
  <a:entry>
    <a:title>eGov TM: Fragments</a:title>
    <a:id>http://psi.egovpt.org/testtm/fragments/</a:id>
    <a:updated>2008-09-11T17:58:39-01:00</a:updated>
    <a:author>
      <a:name>Isidor</a:name>
    </a:author>
    <a:link href="http://psi.egovpt.org/testtm/fragments/" rel="alternate" type="application/atom+xml"/>
    <a:link href="http://psi.egovpt.org/testtm/fragments/" rel="http://www.egovpt.org/sdshare/fragmentsfeed" type="application/atom+xml"/>
  </a:entry>
  <a:entry>
    <a:title>eGov TM: Snapshots</a:title>
    <a:id>http://psi.egovpt.org/testtm/snapshots/</a:id>
    <a:updated>2008-09-11T17:58:39-01:00</a:updated>
    <a:author>
      <a:name>Isidor</a:name>
    </a:author>
    <a:link href="http://psi.egovpt.org/testtm/snapshots/" rel="alternate" type="application/atom+xml"/>
    <a:link href="http://psi.egovpt.org/testtm/snapshots/" rel="http://www.egovpt.org/sdshare/snapshotsfeed" type="application/atom+xml"/>
  </a:entry>
</a:feed>
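As an illustration, a client could read a collection feed roughly as sketched below: it collects the declared dependencies and the two SDShare-specific links, ignoring the optional alternate duplicates. The function name read_collection_feed, the shape of the returned dictionary and the trimmed SAMPLE are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
SDSHARE = "{http://www.egovpt.org/sdshare}"
REL_PREFIX = "http://www.egovpt.org/sdshare/"
WANTED = ("snapshotsfeed", "fragmentsfeed")


def read_collection_feed(feed_xml):
    """Return the dependencies and the snapshots/fragments feed URLs of a collection feed."""
    root = ET.fromstring(feed_xml)
    info = {"dependencies": [], "snapshotsfeed": None, "fragmentsfeed": None}
    for dep in root.findall(SDSHARE + "dependency"):
        info["dependencies"].append((dep.text or "").strip())
    for entry in root.findall(ATOM + "entry"):
        for link in entry.findall(ATOM + "link"):
            rel = link.get("rel", "")
            # rel="alternate" duplicates are skipped; only the SDShare rels count.
            if rel.startswith(REL_PREFIX) and rel[len(REL_PREFIX):] in WANTED:
                info[rel[len(REL_PREFIX):]] = link.get("href")
    return info


# Trimmed sample (hypothetical, modelled on the collection feed example).
SAMPLE = """<a:feed xmlns:a="http://www.w3.org/2005/Atom">
  <a:link href="http://psi.egovpt.org/feeds/testtm/" rel="self"/>
  <dependency xmlns="http://www.egovpt.org/sdshare">http://www.otherserver.org/feeds/ontology/</dependency>
  <a:entry>
    <a:link href="http://psi.egovpt.org/testtm/fragments/" rel="alternate"/>
    <a:link href="http://psi.egovpt.org/testtm/fragments/"
            rel="http://www.egovpt.org/sdshare/fragmentsfeed"/>
  </a:entry>
  <a:entry>
    <a:link href="http://psi.egovpt.org/testtm/snapshots/"
            rel="http://www.egovpt.org/sdshare/snapshotsfeed"/>
  </a:entry>
</a:feed>"""

print(read_collection_feed(SAMPLE))
```

A client that honours dependencies would process each URL in the dependencies list in the same way before continuing with the current feed.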
5.3.3. Snapshots Feed
A snapshots feed lists all representations of a given collection over time. At present, XTM 1.0, XTM 2.0 and RDF/XML representations are supported.
Example Service URL: [server]/topicmaps/egov/snapshots
Example HTTP request for server psi.egovpt.org:
GET /topicmaps/egov/snapshots
Example Response:
<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:sdshare="http://www.egovpt.org/sdshare">
  <title>The Snapshots of the eGovernment Resources Topic Map</title>
  <subtitle>A list of all snapshots of this collection</subtitle>
  <author>
    <name>SDShare Server</name>
  </author>
  <updated>2008-07-17T12:15:07.020071Z</updated>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
  <sdshare:ServerSrcLocatorPrefix>http://psi.networkedplanet.com/</sdshare:ServerSrcLocatorPrefix>
  <!-- a link to the feed -->
  <link rel="self"
        href="http://sdshare.networkedplanet.com/topicmaps/egov/snapshots"/>
  <entry>
    <title>Snapshot 2008-07-17</title>
    <updated>2008-07-17T14:04:42.205299Z</updated>
    <!-- a link to the XTM 1.0 snapshot -->
    <link rel="alternate" type="application/x-tm+xml;version=1.0"
          href="http://sdshare.networkedplanet.com/topicmaps/egov/snapshots/60a76c80-d300-11d9-b93C-0003939e0af6"/>
    <id>60a76c80-d300-11d9-b93C-0003939e0af6</id>
  </entry>
  <!-- an entry follows for each XTM snapshot being exposed.
       ...
  -->
</feed>
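A client performing a clean start needs the most recent snapshot. The sketch below shows one way to pick it from a snapshots feed by comparing the updated elements. The names parse_atom_time and latest_snapshot, and the example.org URLs in SAMPLE, are hypothetical. The timestamp parser is deliberately simple and assumes timestamps shaped like the ones in the examples of this document.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def parse_atom_time(text):
    """Parse an Atom timestamp; a trailing 'Z' is rewritten as an explicit UTC offset."""
    text = text.strip()
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


def latest_snapshot(feed_xml):
    """Return (updated, href, media_type) of the newest snapshot entry, or None."""
    root = ET.fromstring(feed_xml)
    best = None
    for entry in root.findall(ATOM + "entry"):
        link = entry.find(ATOM + "link")
        updated = entry.findtext(ATOM + "updated")
        if link is None or updated is None:
            continue  # entries without a link or timestamp cannot be ranked
        candidate = (parse_atom_time(updated), link.get("href"), link.get("type"))
        if best is None or candidate[0] > best[0]:
            best = candidate
    return best


# Hypothetical feed with two snapshots; the URLs are placeholders.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <updated>2008-07-16T09:00:00.000000Z</updated>
    <link rel="alternate" type="application/x-tm+xml;version=1.0"
          href="http://example.org/topicmaps/egov/snapshots/older"/>
  </entry>
  <entry>
    <updated>2008-07-17T14:04:42.205299Z</updated>
    <link rel="alternate" type="application/x-tm+xml;version=1.0"
          href="http://example.org/topicmaps/egov/snapshots/newer"/>
  </entry>
</feed>"""

print(latest_snapshot(SAMPLE))
```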
5.3.4. Fragments Feed
A fragments feed lists all fragments of semantic descriptions that indicate changes for a given collection over a period of time.
NOTE: A given fragment can represent more than one change
Example Service URL: [server]/topicmaps/egov/fragments
Example HTTP request for server sdshare.networkedplanet.com:
GET /topicmaps/egov/fragments
The Atom payload contains an entry for each fragment. Each entry contains one link to the fragment, and its updated element contains the time at which the fragment was created. In addition to the standard Atom elements, this protocol introduces two new elements:
<ServerSrcLocatorPrefix>: Indicates to a client the prefix to use to locate topic properties that should be removed when updating a semantic description. This element should occur once as a child element of the <feed>, before the first entry. (See the fragment update algorithm below for more information.)
<TopicSI>: Indicates to a client which of the topics present in the fragment is being updated. This element MUST occur at least once as a child element of each <entry>; more than one TopicSI element can be specified. (See the fragment update algorithm below for more information.)
Example response body:
<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:sdshare="http://www.egovpt.org/sdshare">
  <title>Change fragments from the eGovernment Resources Topic Map</title>
  <author>
    <name>SDShare Server</name>
  </author>
  <updated>2008-07-17T15:47:17.062211Z</updated>
  <id>28C5DBD8-652A-4617-8C4A-C0FFC49B4475</id>
  <!-- The ServerSrcLocatorPrefix is used by a client
       to know the provenance of topic map constructs. -->
  <sdshare:ServerSrcLocatorPrefix>http://psi.networkedplanet.com/</sdshare:ServerSrcLocatorPrefix>
  <link rel="self"
        href="http://sdshare.networkedplanet.com/topicmaps/egov/fragments"/>
  <entry>
    <!-- Best practice: the topic display name or the PSI should be used for the entry title -->
    <title>ISO 19115:2003 Geographic Information - Metadata</title>
    <!-- the published date and time of the fragment -->
    <updated>2008-07-17T15:55:21.971145Z</updated>
    <!-- the id value is some unique value -->
    <id>69CD5264-DB78-49c1-A7E4-04EECFA0AA85</id>
    <link rel="alternate" type="application/x-tm+xml;version=1.0"
          href="http://sdshare.networkedplanet.com/topicmaps/egov/fragments/69CD5264-DB78-49c1-A7E4-04EECFA0AA85"/>
    <sdshare:TopicSI>http://psi.egovpt.org/standard/ISO+19115%3A+Geographic+Information+-+Metadata</sdshare:TopicSI>
  </entry>
  <!-- an entry follows for each fragment being exposed ...
  -->
</feed>
Supported datatypes are:
application/x-tm+xml;version=1.0: Serialization as XTM 1.0
application/x-tm+xml;version=2.0: Serialization as XTM 2.0
application/x-tm+xml: Serialization as either XTM 1.0 or XTM 2.0 is acceptable
application/rdf+xml: Serialization as RDF/XML
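To make the role of the two protocol-specific elements concrete, here is a hedged sketch of a reader for the fragments feed. It returns the ServerSrcLocatorPrefix together with one record per entry and rejects entries that lack a TopicSI, since that element is mandatory. The function name read_fragments_feed, the record layout and the example.org URL in SAMPLE are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
SDSHARE = "{http://www.egovpt.org/sdshare}"


def read_fragments_feed(feed_xml):
    """Return (prefix, entries); each entry is a dict describing one fragment."""
    root = ET.fromstring(feed_xml)
    prefix = root.findtext(SDSHARE + "ServerSrcLocatorPrefix")
    if prefix is not None:
        prefix = prefix.strip()
    entries = []
    for entry in root.findall(ATOM + "entry"):
        topic_sis = [(el.text or "").strip()
                     for el in entry.findall(SDSHARE + "TopicSI")]
        if not topic_sis:
            raise ValueError("fragment entry without a TopicSI element")
        link = entry.find(ATOM + "link")
        entries.append({
            "updated": entry.findtext(ATOM + "updated"),
            "href": link.get("href") if link is not None else None,
            "type": link.get("type") if link is not None else None,
            "topic_sis": topic_sis,
        })
    return prefix, entries


# Trimmed sample (hypothetical fragment URL, other values from the example above).
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:sdshare="http://www.egovpt.org/sdshare">
  <sdshare:ServerSrcLocatorPrefix>http://psi.networkedplanet.com/</sdshare:ServerSrcLocatorPrefix>
  <entry>
    <updated>2008-07-17T15:55:21.971145Z</updated>
    <link rel="alternate" type="application/x-tm+xml;version=1.0"
          href="http://example.org/topicmaps/egov/fragments/1"/>
    <sdshare:TopicSI>http://psi.egovpt.org/standard/ISO+19115%3A+Geographic+Information+-+Metadata</sdshare:TopicSI>
  </entry>
</feed>"""

prefix, entries = read_fragments_feed(SAMPLE)
print(prefix, entries[0]["type"], entries[0]["topic_sis"])
```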
5.3.5. Fragment Data Service
The fragment data service is a service that returns a specified collection fragment.
Example Service URL:
[server]/topicmaps/egov/fragments/fragment-for-topic-1
Example HTTP request for server sdshare.networkedplanet.com:
GET /topicmaps/egov/fragments/fragment-for-topic-1
Response structure:
A collection fragment can be represented as a valid XTM 1.0, XTM 2.0 or RDF/XML document.
The generation of fragments for feeds with RDF/XML payload is described in the RDF subsection below.
A fragment for a feed with XTM 1.0 or XTM 2.0 payload is created in the context of one or more topics. The following algorithm should be applied when generating a fragment for a given topic:
- Let 'export' mean to create an XTM representation of the TMDM construct
- Let T be the topic being exported.
- export T including ALL topicnames, identifiers, and occurrences.
- for each topicname in T export a topic stub for each name type (if it exists)
- for each topicname in T export a topic stub for each scope topic (if it exists)
- for each occurrence in T export a topic stub for the occurrence type (if it exists)
- for each occurrence in T export a topic stub for each scope topic (if it exists)
- for each association A in which T plays a role export the association
- for each association A export a topic stub for the association type
- for each association A export a topic stub for each scope topic
- for each role R in A export a topic stub for the role type and one for the role player UNLESS the role player is T
For each stub topic exported, at least the following must be exported:
- ALL of the topic's identifiers
ALL topics (stub or not) MUST have at least one Subject Identifier.
A server may choose to export more information in the fragment; what is described here is the minimum required.
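The fragment generation steps above can be sketched over a small in-memory model. The classes Characteristic, Association and Topic and the function plan_fragment are hypothetical and stand in for whatever TMDM implementation a server uses. The sketch only decides which associations and which stub topics belong in the fragment and does not serialize XTM. It assumes that the "UNLESS the role player is T" clause applies to the role player stub only, and that T itself never needs a stub because it is exported in full.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Characteristic:
    """A topic name or occurrence: a value with an optional type and a scope."""
    value: str
    type: Optional[str] = None
    scope: Tuple[str, ...] = ()


@dataclass
class Association:
    type: str
    roles: List[Tuple[str, str]]  # (role type topic id, role player topic id)
    scope: Tuple[str, ...] = ()


@dataclass
class Topic:
    id: str
    identifiers: List[str]
    names: List[Characteristic] = field(default_factory=list)
    occurrences: List[Characteristic] = field(default_factory=list)


def plan_fragment(topic, associations):
    """Return (associations to export, ids of topics to export as stubs)."""
    stubs = set()
    for item in topic.names + topic.occurrences:
        if item.type is not None:
            stubs.add(item.type)   # stub for the name or occurrence type
        stubs.update(item.scope)   # stubs for the scope topics
    exported = [a for a in associations
                if any(player == topic.id for _, player in a.roles)]
    for assoc in exported:
        stubs.add(assoc.type)
        stubs.update(assoc.scope)
        for role_type, player in assoc.roles:
            stubs.add(role_type)
            if player != topic.id:
                stubs.add(player)
    stubs.discard(topic.id)  # T is exported in full, so it needs no stub
    return exported, stubs
```

For each id in the returned stub set, a server would then export all of that topic's identifiers, as the minimum described above requires.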
5.4. Client Responsibilities
There are two aspects to client behaviour. The first is the consumption of the feeds provided by the service; the second is the updating of the local map based on the fragments it retrieves.
5.4.1. A Clean start
When a client first wants to sync with a server it can use the feeds provided to locate the collection of interest, retrieve its full representation and merge it into the local collection it is managing.
5.4.2. A Clean Replacement
If a client has a local collection that contains semantic descriptions from more than one server and wants to fetch and update the latest full collection from ONE source, then it MUST do the following. Apply the delete topic algorithm from below, but apply it to all topics in its local collection. Then proceed as in 'A Clean Start', by fetching the collection from the originating server and merging it in.
5.4.3. A partial update
Clients wishing to update their local collection as new changes occur on the server should process the fragments feed for the appropriate collection. The client MUST record the date and time at which it last updated its local copy and then find all Atom entries that have an updated value after that time. For each of these, in time order from oldest to most recent, it should apply the following update algorithm.
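The selection step of a partial update can be sketched as follows. The function name entries_to_apply and the representation of an entry as an (updated, href) pair are assumptions of this sketch, and the URLs are placeholders.

```python
from datetime import datetime, timezone


def entries_to_apply(entries, last_sync):
    """Return the entries updated after last_sync, ordered oldest first.

    entries   -- iterable of (updated, href) pairs, updated being a datetime
    last_sync -- datetime of the client's last successful update
    """
    pending = [entry for entry in entries if entry[0] > last_sync]
    return sorted(pending, key=lambda entry: entry[0])


# Hypothetical feed contents, deliberately out of order.
feed_entries = [
    (datetime(2008, 7, 17, 16, 5, tzinfo=timezone.utc), "http://example.org/fragments/c"),
    (datetime(2008, 7, 17, 15, 0, tzinfo=timezone.utc), "http://example.org/fragments/a"),
    (datetime(2008, 7, 17, 15, 55, tzinfo=timezone.utc), "http://example.org/fragments/b"),
]
last_sync = datetime(2008, 7, 17, 15, 30, tzinfo=timezone.utc)

for updated, href in entries_to_apply(feed_entries, last_sync):
    print(updated.isoformat(), href)
```

One possible design choice, not mandated by the protocol, is to store the updated value of the last entry applied as the new synchronization time rather than the client's own clock reading, which sidesteps clock differences between client and server.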
5.4.4. The Fragment Update Algorithm
Bind SI to each TopicSI element in the Atom entry E.
Let SP be the value of the ServerSrcLocatorPrefix element in the Atom feed F
- feed F contains E
- entry E references fragment TF
- Let LC be the local collection
- Let T be the topic in LC that has a subject identifier that matches SI
- For each name and occurrence of T, and each association in which T plays a role (call each such construct TMC):
  - Delete all SrcLocators of TMC that begin with SP
  - If the count of SrcLocators on TMC = 0 then delete TMC
- Merge in the fragment TF using SP as the base of all generated source locators.
NOTE: To deprecate a topic, publish an empty topic.
NOTE: The understanding is that each name, occurrence and association created or modified during the update will in its internal, TMDM-conformant representation have or get item identifiers that act as source locators and start with the ServerSrcLocatorPrefix.
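The update algorithm can be illustrated with a deliberately simplified model in which every name, occurrence and association is a dictionary carrying the subject identifiers of the topics it is attached to and a set of source locators. The function apply_fragment, this data layout and the scheme used to generate new locators (prefix, construct kind and a running number) are assumptions of the sketch and are not prescribed by the protocol.

```python
def apply_fragment(local, topic_sis, prefix, incoming):
    """Apply one fragments-feed entry to the local collection and return the result.

    local     -- list of dicts with keys 'kind', 'value', 'topics' (set of
                 subject identifiers) and 'src_locators' (set of URIs)
    topic_sis -- the TopicSI values of the Atom entry
    prefix    -- the feed's ServerSrcLocatorPrefix (SP)
    incoming  -- (kind, value, topics) tuples parsed from the fragment TF
    """
    targets = set(topic_sis)
    kept = []
    for construct in local:
        if construct["topics"] & targets:
            # Remove the locators this server assigned earlier.
            construct["src_locators"] = {
                loc for loc in construct["src_locators"]
                if not loc.startswith(prefix)
            }
            if not construct["src_locators"]:
                continue  # no source locators left: delete the construct
        kept.append(construct)

    # Merge step: reuse an equal construct if one survived, otherwise add it.
    index = {(c["kind"], c["value"], frozenset(c["topics"])): c for c in kept}
    for n, (kind, value, topics) in enumerate(incoming):
        key = (kind, value, frozenset(topics))
        construct = index.get(key)
        if construct is None:
            construct = {"kind": kind, "value": value,
                         "topics": set(topics), "src_locators": set()}
            kept.append(construct)
            index[key] = construct
        # New locators start with SP so that a later update can find them.
        construct["src_locators"].add(f"{prefix}{kind}/{n}")
    return kept
```

In this model a construct that another server also vouches for keeps that server's locator and therefore survives the update, which is the behaviour that makes aggregation from several servers possible.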
5.5. RDF payload
This protocol is agnostic to the nature of the semantic payload. This section describes how the protocol works when the payload used is RDF.
NOTE: While the protocol is payload agnostic, it does not describe how to interoperate between different semantic payload representations.
5.5.1. The Fragment Generation Algorithm
To create an RDF fragment for a given entity, use the following algorithm:
For a given Resource
- Let T be the topic (a logical thing) being exported.
- Let R be the resource that identifies T
- Let S be the set of statements that comprise T
- Let 'export' mean to create an RDF representation of a graph of statements linked to the RDF resource. This graph is treated as a logical unit for the purposes of this algorithm
5.5.2. The Fragment Update Algorithm
Let SP be the value of the ServerSrcLocatorPrefix element in the Atom feed F
- feed F contains E
- entry E references fragment TF
- Let LC be the local collection
- Let R be the resource in LC that has a Resource URI that matches SI, where SI is the value of a TopicSI element in E
- For each statement in which R is either the Subject or the Object (call each such statement STAT):
  - Delete all SrcLocators of STAT that begin with SP
  - If the count of SrcLocators on STAT = 0 then delete STAT
- Merge in the fragment TF using SP as the base of all generated source locators.
The above algorithm refers to the SrcLocators of a Statement. These are not an inherent property of the RDF data model; they are a required extension for the RDF payload. All Statements are defined as a subtype of Resource. Thus each RDF statement (as a Resource) MUST have a property whose value is a set of URIs. This property is called 'SrcLocators' and is used as described above.
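A minimal sketch of the RDF variant, assuming the local store is a dictionary that maps each (subject, predicate, object) triple to its set of SrcLocators. The function name apply_rdf_fragment, this store layout and the numbering scheme for generated locators are illustrative assumptions. A real RDF store would need its own way of attaching the SrcLocators extension to statements.

```python
def apply_rdf_fragment(store, resource_uri, prefix, fragment_triples):
    """Update store in place for resource R and return it.

    store            -- dict mapping (subject, predicate, object) to a set of
                        source-locator URIs
    resource_uri     -- URI of R, taken from a TopicSI element
    prefix           -- the feed's ServerSrcLocatorPrefix (SP)
    fragment_triples -- triples parsed from the fragment TF
    """
    for triple in list(store):  # copy the keys, the dict is modified below
        subject, _, obj = triple
        if resource_uri in (subject, obj):
            remaining = {loc for loc in store[triple]
                         if not loc.startswith(prefix)}
            if remaining:
                store[triple] = remaining
            else:
                del store[triple]  # no source locators left: delete STAT
    for n, triple in enumerate(fragment_triples):
        # Merge: existing statements gain a locator, new ones are created.
        store.setdefault(triple, set()).add(f"{prefix}stmt/{n}")
    return store
```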
6. Deployment Guidelines and Best Practices (informative)
This document defines no normative rules for the correct deployment of network topologies that build on it. However, this section proposes a few best practices to handle recurrent deployment issues.
6.1. Trust and Security
The protocol assumes that the syndication of semantic descriptions happens with trusted servers. Importing descriptions from untrusted, potentially malicious sources can have serious repercussions including corrupting the client's data. If the protocol is used in P2P scenarios where every node acts both as server and client, this implies that all partners need to trust each other.
Should the protocol be used to syndicate confidential data, it can be combined with end-to-end encryption between nodes, typically over https. Furthermore, it can leverage existing authentication mechanisms such as basic access authentication over http or https. In particularly sensitive cases it can also be employed over dedicated network connections.
6.2. Source Locator
This protocol does not define the exact structure of a source locator. However, it assumes that typically the source locator a server uses should start with its host name(s) to indicate a claim of ownership.
6.3. Topic Publication and Topic Ownership
This protocol does not define a strict notion of topic ownership, nor does it specify exactly when topics should be regarded as published. In the context of a given application, servers and clients need to agree on who owns a topic or a statement about it (in the form of a property or association), especially if the client in turn acts as a server and might want to reuse part of the syndicated information (as would be the normal case, e.g., in the resolution of feed dependencies).
In many cases, it is best for the server not to republish topics using its "own" source locator unless, in fact, it considers itself to be the originator of the original topic or statement. Rather, it should refrain from republishing a topic and explicitly advertise potential dependencies in the feed.
6.4. Versioning
The protocol implies that the server has a mechanism to keep information on topic versions over time. However, it does not define how that is to be achieved (proper versioning, maintenance of change logs, or other approaches). Clients have no need to maintain version information unless they in turn act as servers.
6.5. Selective Feeds
Often a server wants to provide several views on its semantic descriptions, e.g. a tightly controlled selection of its data for public use and a fuller view for internal consumption. In this case the server should publish a number of separate collection feeds, one for each view.
6.6. Extensions: Atom Publishing
The protocol does not define a mechanism for the client to update information on the server. However, valid use cases for such behaviour exist, e.g. local editing that is then pushed back to the server. The Atom Publishing Protocol is a good way to implement such behaviour, and a second revision of this CWA could contain a detailed specification for using it for this purpose.
7. Test Suite
This is a set of files that can evolve into a test suite for implementations of the client and serve as a target production set for servers.
Base test topic map: XTM 1.0: notificationbase.1_0.xtm , XTM 2.0: notificationbase.xtm , XSLT: xtm2toxtm1.xsl
Atom topic map feed document: topicmapfeed.xml
Atom topic fragments feed: notificationbase_changes.xml
Atom topic map snapshots feed: notificationbase_snapshotfeed.xml
- starting topic map xtm: (identical to the first snapshot topic map xtm)
- snapshot topic map xtm (in normal XTM.)
fragments of the topics changed from one snapshot to the other: t100_1216302921.9711449.xtm , t100_1216303605.4192071.xtm
8. References
RFC 2616 HTTP 1.1 http://www.w3.org/Protocols/rfc2616/rfc2616.html
XML 1.0 http://www.w3.org/TR/REC-xml/
RFC 4287 Atom Syndication Format 1.0 http://www.atompub.org/rfc4287.html






