WADM
This is a suggestion
Imagine we have a knowledge graph with statements about passages of text, and we use URIs pointing to the document endpoint to identify these passages. For example, the statement that John:1:3 is the archetype of the DIY principle 😉
<https://example.com/api/dts/document?resource=https://coexist.org/b/john.xml&ref=John:1:3> my:archetypeOf diy:Principle .
Nice. However, that's not sufficient.
A lot of information is encoded in the URI, but we do not want to
parse the URI. We want statements that describe the RDF resource
identified by the URI
https://example.com/api/dts/document?resource=https://coexist.org/b/john.xml&ref=John:1:3. We
want triples that tell us what that resource is.
DTS does not provide the properties and classes for formalizing this information. But there is already an open standard for it: the Web Annotation Data Model (WADM). By describing the RDF resource with the WADM, we also get an alignment with CIDOC-CRM, at least if we follow the proposal of the LINCS project.
In terms of the WADM, a part of a document, as returned by the
document endpoint, is a specific resource. A derived view, as
returned when the mediaType parameter is specified, is also a specific
resource:
While it is possible using only the constructions described above to create Annotations that reference parts of resources by using IRIs with a fragment component, there are many situations when this is not sufficient. For example, even a simple circular region of an image, or a diagonal line across it, are not possible. Selecting an arbitrary span of text in an HTML page, perhaps the simplest annotation concept, is also not supported by fragments. Furthermore, there are non-segment use cases that require a client to retrieve a specific state or representation of the resource, to style it in a particular way, to associate a role with the resource that is specific to the Annotation's use of it, or for the Annotation to only apply when the resource is used in a particular context.
The Web Annotation Data Model uses a new type of resource to capture these Annotation-specific requirements: a SpecificResource. WADM TR, Sec. 4
Let's first look at describing a partial resource such as
John:1:3. In WADM, such a partial resource is a specific resource,
consisting of the source (identified by the URI of the whole
resource) and a selector that describes the passage by some
selection mechanism.
There is no fixed set of selection mechanisms in WADM, but much of
the work for XML is already done, since WADM introduces the
oa:XPathSelector
for DOM-based documents. We can also specify the DTS selection
mechanism and provide alternative selectors that both describe the
same portion of the document.
Multiple Selectors can be given to describe the same Segment in different ways in order to maximize the chances that it will be discoverable later, and that the consuming user agent will be able to use at least one of the Selectors. WADM 4.2
Here's what it could look like when we describe the partial resource in two ways:
@prefix dts: <https://w3id.org/dts/api#> .
@prefix oa: <http://www.w3.org/ns/oa#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix my: <http://onto.me/theory/> .
@prefix diy: <...> .
<https://example.com/api/dts/document?resource=https://coexist.org/b/john.xml&ref=John:1:3>
    a oa:SpecificResource ;
    oa:hasSource <https://coexist.org/b/john.xml> ;
    oa:hasSelector [
        a oa:FragmentSelector ;
        dcterms:conformsTo <https://w3id.org/dts/api#> ;
        rdf:value "tree=wadm&ref=John:1:3"
    ] ;
    oa:hasSelector [
        a oa:XPathSelector ;
        rdf:value "/Q{http://www.tei-c.org/ns/1.0}TEI[1]/Q{http://www.tei-c.org/ns/1.0}text[1]/Q{http://www.tei-c.org/ns/1.0}body[1]/Q{http://www.tei-c.org/ns/1.0}lg[1]/Q{http://www.tei-c.org/ns/1.0}lg[1]/Q{http://www.tei-c.org/ns/1.0}l[3]"
    ] ;
    # Our analytical assertions. They might be formalized a bit
    # differently, but that's not the point here.
    my:archetypeOf diy:Principle .
Note the first selector, which describes the text passage in the style
we know from the DTS specifications, although using DTS as a
fragment style for oa:FragmentSelector is still unspecified. The proposed
syntax in the rdf:value property is borrowed from RFC5147, which is used
for the style of plain-text selectors. Let's change this!
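The proposed selector value can be put together mechanically from the tree label and the ref. The following is only a minimal sketch of that idea: the function names are made up for illustration, and the `tree=...&ref=...` layout is the proposal from this page, not part of any published specification. An empty tree label is kept as `tree=`, on the assumption that it stands for the default citation tree.

```python
from urllib.parse import parse_qsl, urlencode


def dts_fragment_value(ref: str, tree: str = "") -> str:
    """Build the rdf:value string of a DTS-style fragment selector.

    Hypothetical helper: the layout follows the proposal on this page.
    """
    # Keep ":" readable; everything else is percent-encoded as needed.
    return urlencode([("tree", tree), ("ref", ref)], safe=":")


def parse_dts_fragment(value: str) -> dict:
    """Split a selector value back into its named parameters."""
    # keep_blank_values preserves an empty tree label (the default tree).
    return dict(parse_qsl(value, keep_blank_values=True))


if __name__ == "__main__":
    value = dts_fragment_value("John:1:3", tree="wadm")
    print(value)
    print(parse_dts_fragment(value))
```

Using the query-string helpers from the standard library means that a ref containing `&` or `=` would be escaped instead of silently breaking the selector value.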
How can we generate such an RDF-based description of the document part? Do we need another endpoint? Good news: no. We can get it from the LOD returned by the navigation endpoint by applying a SPARQL CONSTRUCT query to it. We need some parameters for the SPARQL query, but a client knows them from its query to the document endpoint: the query URL, the citation tree and the ref parameter (or start and end).
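One way to hand these three parameters to a SPARQL query is to render them as BIND clauses. The sketch below assumes the variable names ?PARAMTREE, ?PARAMREF and ?PARAMURI; the helper functions are hypothetical and only cover the ref case, not start and end.

```python
def sparql_string(value: str) -> str:
    """Render a Python string as a double-quoted SPARQL string literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def sparql_iri(iri: str) -> str:
    """Render an IRI reference, rejecting characters not allowed inside <...>."""
    forbidden = set('<>"{}|^`\\')
    if any(ch in forbidden or ord(ch) <= 0x20 for ch in iri):
        raise ValueError(f"not usable as an IRI reference: {iri!r}")
    return f"<{iri}>"


def bind_parameters(query_url: str, ref: str, tree: str = "") -> str:
    """Return BIND clauses for the parameters the client already knows."""
    return "\n".join(
        [
            f"BIND({sparql_string(tree)} AS ?PARAMTREE) .",
            f"BIND({sparql_string(ref)} AS ?PARAMREF) .",
            f"BIND({sparql_iri(query_url)} AS ?PARAMURI) .",
        ]
    )


if __name__ == "__main__":
    print(
        bind_parameters(
            "https://example.com/api/dts/document"
            "?resource=https://coexist.org/b/john.xml&ref=John:1:3",
            ref="John:1:3",
            tree="wadm",
        )
    )
```

Escaping the literals and checking the IRI keeps a client-supplied ref from changing the structure of the query.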
Everything else needed can be provided in the citation tree, especially the value for the XPathSelector:
<refsDecl n="wadm" default="false">
  <citeStructure unit="book" match="//body/lg" use="@n">
    <citeData use="path(.)" property="https://w3id.org/dts/api#xpath"/>
    <citeStructure unit="chapter" match="lg" use="@n" delim=":">
      <citeData use="path(.)" property="https://w3id.org/dts/api#xpath"/>
      <citeStructure unit="verse" match="l" use="@n" delim=":">
        <citeData use="path(.)" property="https://w3id.org/dts/api#xpath"/>
      </citeStructure>
    </citeStructure>
  </citeStructure>
</refsDecl>
There's an analogous example in test/john.xml.
target/bin/xslt.sh -config:test/saxon.xml -xsl:xsl/navigation.xsl -s:test/john.xml tree=wadm
The members of the wadm citation look like this:
"member": [
{
"level": 1,
"dts:xpath": "/Q{http://www.tei-c.org/ns/1.0}TEI[1]/Q{http://www.tei-c.org/ns/1.0}text[1]/Q{http://www.tei-c.org/ns/1.0}body[1]/Q{http://www.tei-c.org/ns/1.0}lg[1]",
"identifier": "John",
"parent": null,
"citeType": "book",
"@type": "CitableUnit"
},
{
"level": 2,
"dts:xpath": "/Q{http://www.tei-c.org/ns/1.0}TEI[1]/Q{http://www.tei-c.org/ns/1.0}text[1]/Q{http://www.tei-c.org/ns/1.0}body[1]/Q{http://www.tei-c.org/ns/1.0}lg[1]/Q{http://www.tei-c.org/ns/1.0}lg[1]",
"identifier": "John:1",
"parent": "John",
"citeType": "chapter",
"@type": "CitableUnit"
},
{
"level": 3,
"dts:xpath": "/Q{http://www.tei-c.org/ns/1.0}TEI[1]/Q{http://www.tei-c.org/ns/1.0}text[1]/Q{http://www.tei-c.org/ns/1.0}body[1]/Q{http://www.tei-c.org/ns/1.0}lg[1]/Q{http://www.tei-c.org/ns/1.0}lg[1]/Q{http://www.tei-c.org/ns/1.0}l[1]",
"identifier": "John:1:1",
"parent": "John:1",
"citeType": "verse",
"@type": "CitableUnit"
},
{
"level": 3,
"dts:xpath": "/Q{http://www.tei-c.org/ns/1.0}TEI[1]/Q{http://www.tei-c.org/ns/1.0}text[1]/Q{http://www.tei-c.org/ns/1.0}body[1]/Q{http://www.tei-c.org/ns/1.0}lg[1]/Q{http://www.tei-c.org/ns/1.0}lg[1]/Q{http://www.tei-c.org/ns/1.0}l[2]",
"identifier": "John:1:2",
"parent": "John:1",
"citeType": "verse",
"@type": "CitableUnit"
},
{
"level": 3,
"dts:xpath": "/Q{http://www.tei-c.org/ns/1.0}TEI[1]/Q{http://www.tei-c.org/ns/1.0}text[1]/Q{http://www.tei-c.org/ns/1.0}body[1]/Q{http://www.tei-c.org/ns/1.0}lg[1]/Q{http://www.tei-c.org/ns/1.0}lg[1]/Q{http://www.tei-c.org/ns/1.0}l[3]",
"identifier": "John:1:3",
"parent": "John:1",
"citeType": "verse",
"@type": "CitableUnit"
},
/* ... */
Here's the SPARQL query:
# SPARQL for constructing a WADM selector for the output of the
# document endpoint queried with a ref parameter. The input graph must
# be the output of a navigation endpoint for the same citation tree of
# the same resource.
#
# Parameters to be set:
#
# ?PARAMTREE - the label of the citation tree, empty string for default
# ?PARAMREF - the identifier member passed to the document endpoint as ref parameter
# ?PARAMURI - the document query URL
PREFIX dts: <https://w3id.org/dts/api#>
PREFIX oa: <http://www.w3.org/ns/oa#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
CONSTRUCT {
?PARAMURI rdf:type oa:SpecificResource .
?PARAMURI oa:hasSource ?resource .
?PARAMURI oa:hasSelector _:xps .
_:xps rdf:type oa:XPathSelector .
_:xps rdf:value ?xpath .
?PARAMURI oa:hasSelector _:fgs .
_:fgs rdf:type oa:FragmentSelector .
_:fgs dcterms:conformsTo dts: .
_:fgs rdf:value ?DTSSEL .
# _:fgs dts:isMember _:m .
# _:m dts:identifier ?PARAMREF .
# _:m dts:citeType ?citeType .
# _:m rdf:type dts:CitableUnit .
# _:m dts:fromTree ?PARAMTREE .
# _:m dts:level ?level .
# _:m dts:parent ?parent .
?PARAMURI dts:citeType ?citeType .
}
WHERE {
# ?PARAM* must be passed in as parameters
BIND("wadm" as ?PARAMTREE) . # empty value means the default tree?
BIND("John:1:3" as ?PARAMREF) .
BIND(<https://example.com/api/dts/document?resource=https://coexist.org/b/john.xml&ref=John:1:3> as ?PARAMURI) .
BIND(CONCAT("tree=", STR(?PARAMTREE), "&ref=", STR(?PARAMREF)) as ?DTSSEL) .
?resource rdf:type dts:Resource .
?member rdf:type dts:CitableUnit .
?member dts:identifier ?PARAMREF .
?member dts:xpath ?xpath .
?member dts:citeType ?citeType .
?member dts:level ?level .
?member dts:parent ?parent .
}
Let's apply this!
First, generate an N-Triples file graph.n3 from the navigation endpoint data
for test/john.xml and its wadm citation tree:
target/bin/xslt.sh -config:test/saxon.xml -xsl:xsl/navigation.xsl -s:test/john.xml tree=wadm resource=https://coexist.org/b/john.xml | target/bin/riot.sh --syntax=jsonld --out=ntriples > graph.n3
Then use SPARQL to construct the desired graph from that graph.n3:
target/bin/sparql.sh --query=sparql/wadm-with-ref.rq --data=graph.n3
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix dts: <https://w3id.org/dts/api#> .
@prefix oa: <http://www.w3.org/ns/oa#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
<https://example.com/api/dts/document?resource=https://coexist.org/b/john.xml&ref=John:1:3>
    rdf:type oa:SpecificResource;
    oa:hasSelector [ rdf:type oa:FragmentSelector;
                     rdf:value "tree=wadm&ref=John:1:3";
                     dcterms:conformsTo dts:
                   ];
    oa:hasSelector [ rdf:type oa:XPathSelector;
                     rdf:value "/Q{http://www.tei-c.org/ns/1.0}TEI[1]/Q{http://www.tei-c.org/ns/1.0}text[1]/Q{http://www.tei-c.org/ns/1.0}body[1]/Q{http://www.tei-c.org/ns/1.0}lg[1]/Q{http://www.tei-c.org/ns/1.0}lg[1]/Q{http://www.tei-c.org/ns/1.0}l[3]"
                   ];
    oa:hasSource <https://coexist.org/b/john.xml>;
    dts:citeType "verse" .
Nice!
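For clients without a SPARQL engine, the same lookup could be done directly on the members of the navigation response. This is only a rough sketch under several assumptions: `describe_passage` is a hypothetical helper, the source URI is passed in by the caller instead of being read from the graph, and the key names are modelled on the WADM JSON-LD serialization without attaching a context, so the result is plain data, not validated RDF.

```python
import json

DTS = "https://w3id.org/dts/api#"
TEI = "Q{http://www.tei-c.org/ns/1.0}"

# One member as listed for the wadm citation tree of test/john.xml.
MEMBERS = [
    {
        "identifier": "John:1:3",
        "level": 3,
        "parent": "John:1",
        "citeType": "verse",
        "@type": "CitableUnit",
        "dts:xpath": "/" + "/".join(
            TEI + step
            for step in ("TEI[1]", "text[1]", "body[1]", "lg[1]", "lg[1]", "l[3]")
        ),
    },
]


def describe_passage(members, query_url, source, ref, tree=""):
    """Describe the passage behind query_url as a specific resource."""
    # Find the citable unit whose identifier equals the ref parameter.
    member = next((m for m in members if m.get("identifier") == ref), None)
    if member is None:
        raise KeyError(f"no member with identifier {ref!r}")
    return {
        "id": query_url,
        "type": "SpecificResource",
        "source": source,
        "selector": [
            {
                "type": "FragmentSelector",
                "conformsTo": DTS,
                # Same layout as the CONCAT in the SPARQL query.
                "value": f"tree={tree}&ref={ref}",
            },
            {"type": "XPathSelector", "value": member["dts:xpath"]},
        ],
        "dts:citeType": member["citeType"],
    }


if __name__ == "__main__":
    description = describe_passage(
        MEMBERS,
        query_url="https://example.com/api/dts/document"
        "?resource=https://coexist.org/b/john.xml&ref=John:1:3",
        source="https://coexist.org/b/john.xml",
        ref="John:1:3",
        tree="wadm",
    )
    print(json.dumps(description, indent=2))
```

The SPARQL route remains the more general one, since it works on the graph itself and does not depend on the shape of the JSON serialization.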