Guide to the Open Provenance Model Vocabulary

The Open Provenance Model Vocabulary, OPMV, is a lightweight vocabulary that provides terms to enable practitioners of data publishing to publish their data responsibly. It is closely based on the community provenance data model, the Open Provenance Model (OPM). OPMV can be used together with other provenance-related RDF/OWL vocabularies/ontologies, such as Dublin Core, FOAF, the Changeset Vocabulary, and the Provenance Vocabulary.

Scope
1. Introduction
2. OPMV: an implementation of the Open Provenance Model
3. Describe the basic provenance
4. Describe the creation of an artifact
5. Describe the changes of an artifact
6. Describe the creation of an artifact using a specialized OPMV module
7. Build your own OPMV supplementary module
References

Scope

This document, the OPMV guide, is one of the two core documents of OPMV; the other is the OPMV vocabulary document [OPMV Vocabulary]. The OPMV guide is aimed at both data publishers (those wishing to publish their datasets on the Web responsibly) and data consumers (those wishing to be aware of the quality of the datasets that they query and use in their applications). We assume that readers of this document are familiar with the core concepts of the Web of Data, such as URIs and RDF, and with the Turtle syntax for RDF. Basic knowledge of certain widely used vocabularies such as Dublin Core (DC) and Friend of a Friend (FOAF) is also assumed.

1. Introduction

The Open Provenance Model Vocabulary (OPMV) is a vocabulary defined using OWL that implements the Open Provenance Model, a community provenance model driven by the need to facilitate interoperability between provenance systems [The OPM Specification].

OPMV aims to be as lightweight as possible. It tries to take full advantage of Semantic Web technologies by using a minimal set of OWL constructs and reusing existing RDF vocabularies wherever possible. An alternative OPM OWL serialization, OPMO, is available at [The OPMO Ontology]; it uses more complex OWL 2 constructs to define more constraints. Users should opt for OPMO if they need to perform complex reasoning over, or validity checking of, their OPM provenance information.

The Open Provenance Model Vocabulary is currently implemented as an OWL-DL ontology and is available in its namespace http://purl.org/net/opmv/ns#. The vocabulary is partitioned into the core OPMV vocabulary and several supplementary modules that provide less frequently used terms and a broad range of specializations of the core terms. At the moment the following modules are implemented:

The common module, under the namespace of http://purl.org/net/opmv/types/common#
The XSLT module, under the namespace of http://purl.org/net/opmv/types/xslt#
The SPARQL module, under the namespace of http://purl.org/net/opmv/types/sparql#

This document is aimed at Linked Data practitioners who want to publish their data responsibly, and it covers only how the OPMV Core should be used in practice. Information about individual supplementary modules can be found in the corresponding short guides. We use concrete examples to explain how the OPMV Core can be used to publish basic as well as detailed and precise provenance information for linked data. Most of the examples are based on use cases from the data.gov.uk team.

All examples in this document are written in the Turtle RDF syntax. Throughout this document, the following namespaces are used:

  
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix dc:      <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix doap:    <http://usefulinc.com/ns/doap#> .
@prefix org:     <http://www.w3.org/ns/org#> .
@prefix time:    <http://www.w3.org/2006/time#> .
@prefix geo:     <http://www.w3.org/2003/01/geo/wgs84_pos#> .

@prefix opmv:    <http://purl.org/net/opmv/ns#> .

@prefix common:  <http://purl.org/net/opmv/types/common#> .
@prefix xslt:    <http://purl.org/net/opmv/types/xslt#> .
@prefix sparql:  <http://purl.org/net/opmv/types/sparql#> .
@prefix gate:    <http://purl.org/net/opmv/types/gate#> .

@prefix eg:      <http://example.org.uk/> .

2. OPMV: an implementation of the Open Provenance Model

OPMV does not explicitly implement all the structures from the OPM specification as OWL classes or properties. It aims to take full advantage of existing Semantic Web technologies, such as Named Graphs, and existing vocabularies, such as the W3C Time Ontology:

It defines three OWL classes and five (abstract) object properties to implement the basics of OPM, i.e. the three nodes and five edges.
It defines different sub-properties to make a distinction between the different roles that an artifact or an agent plays in a process.
It makes use of Named Graphs to implement OPM accounts, which are designed to separate provenance information at different levels of granularity or from different points of view.
It reuses the W3C Time Ontology to support the description of time-related provenance information.
Unlike OPMO, which provides specific OWL classes to define an OPM provenance graph and its nodes and edges, OPMV does not explicitly define such structures and their relationships. OPMV states the membership of an entity in an OPM graph neither strongly nor explicitly; instead, it relies on the graph data model of RDF.

The following sections describe in detail how each part of the OPM specification is supported in OPMV in combination with existing technologies and vocabularies.

2.1 Implementing the basics of OPM

The three top OPM entities and five top properties are implemented in OPMV as classes and object properties:

opmv:wasDerivedFrom, dom(opmv:wasDerivedFrom) = opmv:Artifact and range (opmv:wasDerivedFrom) = opmv:Artifact;
opmv:used, dom(opmv:used) = opmv:Process and range (opmv:used) = opmv:Artifact;
opmv:wasGeneratedBy, dom(opmv:wasGeneratedBy) = opmv:Artifact and range (opmv:wasGeneratedBy) = opmv:Process;
opmv:wasControlledBy, dom(opmv:wasControlledBy) = opmv:Process and range (opmv:wasControlledBy) = opmv:Agent;
opmv:wasTriggeredBy, dom(opmv:wasTriggeredBy) = opmv:Process and range (opmv:wasTriggeredBy) = opmv:Process.

These terms can be used to express some basic provenance information about data creation and transformation.

We define opmv:Process as disjoint with opmv:Agent and opmv:Artifact. We also define sub-properties for properties like opmv:wasControlledBy, to enable users to express provenance information in a more specific way.
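
The five core properties can be combined in a single description. The following is a minimal sketch, not taken from the OPMV documents; all eg: resource names in it are hypothetical and serve only as illustration.

@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix opmv: <http://purl.org/net/opmv/ns#> .
@prefix eg:   <http://example.org.uk/> .

# an artifact that was derived from another artifact and generated by a process
eg:report
    rdf:type               opmv:Artifact ;
    opmv:wasDerivedFrom    eg:rawData ;
    opmv:wasGeneratedBy    eg:conversion ;
.

# the process used the source artifact, was controlled by an agent,
# and was triggered by an earlier process
eg:conversion
    rdf:type                opmv:Process ;
    opmv:used               eg:rawData ;
    opmv:wasControlledBy    eg:alice ;
    opmv:wasTriggeredBy     eg:download ;
.

eg:rawData     rdf:type    opmv:Artifact .
eg:download    rdf:type    opmv:Process .
eg:alice       rdf:type    opmv:Agent .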

2.2 Implementation of OPM roles

According to OPM, roles are used to "designate an artifact's or agent's function in a process" [The OPM Specification]. This structure can be used to refine provenance information expressed using the basic terms and to express provenance information more specifically. For example, an agent could have controlled the execution process or simply played a "performer" role. Instead of defining a class of roles, OPMV defines sub-properties of the five top abstract object properties to reflect the different roles that an artifact or an agent plays in a process.

For example, we define sub-property opmv:wasPerformedBy for opmv:wasControlledBy, to distinguish the roles played by an agent. We differentiate the different roles played by an artifact by refining the property of opmv:used. This has been implemented in the common module, which defines common:usedData and common:usedScript as two sub-properties of opmv:used; in the first case the artifact played the role of "data" and in the latter case it played the role of a (configuration) script.
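
As a sketch of how these role-specific sub-properties might appear in data, the description below uses common:usedData, common:usedScript and opmv:wasPerformedBy in place of the generic properties. The eg: resource names are hypothetical.

@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix opmv:   <http://purl.org/net/opmv/ns#> .
@prefix common: <http://purl.org/net/opmv/types/common#> .
@prefix eg:     <http://example.org.uk/> .

# one process, two used artifacts in different roles, one performing agent
eg:transform
    rdf:type               opmv:Process ;
    common:usedData        eg:sourceTable ;      # artifact in the role of "data"
    common:usedScript      eg:mappingScript ;    # artifact in the role of a (configuration) script
    opmv:wasPerformedBy    eg:converterTool ;    # agent in the role of "performer"
.

Because these are sub-properties, a consumer that applies RDFS sub-property reasoning can still read the same data in terms of the generic opmv:used and opmv:wasControlledBy.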

2.3 Implementation of OPM account

The provenance information about an artifact could be expressed at different levels of abstraction or from different viewpoints [The OPM Specification]. The OPM specification introduces the concept of "account" to "represent a description at some level of detail as provided by one or more observers".

OPMV does not provide specific terms to define accounts. We suggest using Named Graphs to represent such information. A separate named graph can be created for the provenance information provided by each observer. Provenance information at different levels of abstraction could either be extracted by queries (using, for example, SPARQL) or be defined in separate named graphs.
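
A minimal sketch of this suggestion, assuming two hypothetical observers who describe the generation of the same artifact at different levels of detail, each in its own named graph. The graph names eg:accountPublisher and eg:accountOperator and all other eg: names are invented for this illustration.

@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix opmv: <http://purl.org/net/opmv/ns#> .
@prefix eg:   <http://example.org.uk/> .

# coarse-grained account: only who was responsible
eg:accountPublisher {
    eg:dataset    opmv:wasGeneratedBy    eg:publication .
    eg:publication
        rdf:type                opmv:Process ;
        opmv:wasControlledBy    eg:publisher .
}

# finer-grained account of the same process: what was used and which tool ran it
eg:accountOperator {
    eg:dataset    opmv:wasGeneratedBy    eg:publication .
    eg:publication
        rdf:type               opmv:Process ;
        opmv:used              eg:spreadsheet ;
        opmv:wasPerformedBy    eg:conversionTool .
}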

2.4 Implementation of Time

OPM provides a very refined representation of time-related information. It differentiates between instantaneous occurrences and those that are not instantaneous. It recognizes four instantaneous occurrences: the creation and use of artifacts, and the starting and ending of processes.

In OPMV, we define object properties by reusing the W3C Time Ontology (http://www.w3.org/TR/owl-time/) to express this time-related information:

opmv:wasGeneratedAt, dom(opmv:wasGeneratedAt) = opmv:Artifact and range (opmv:wasGeneratedAt) = time:Instant, expresses the time when an artifact was generated or created;
opmv:wasUsedAt, dom(opmv:wasUsedAt) = opmv:Artifact and range (opmv:wasUsedAt) = time:Instant, expresses the time when an artifact was used;
opmv:wasPerformedAt, dom(opmv:wasPerformedAt) = opmv:Process and range (opmv:wasPerformedAt) = time:TemporalEntity, expresses the time when/during which a process was performed;
opmv:wasStartedAt, dom(opmv:wasStartedAt) = opmv:Process and range (opmv:wasStartedAt) = time:TemporalEntity, expresses the time when a process started;
opmv:wasEndedAt, dom(opmv:wasEndedAt) = opmv:Process and range (opmv:wasEndedAt) = time:TemporalEntity, expresses the time when a process ended.

At a very fine-grained level, the time when an artifact was created might be different from the time when the process creating the artifact was finished; hence we define both opmv:wasGeneratedAt and opmv:wasEndedAt. Similarly, the time when an artifact was used (opmv:wasUsedAt) might be different from the time when the process using the artifact was started (opmv:wasStartedAt).
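
The distinction can be illustrated with a small sketch in which the four instants differ. The eg: names and the timestamps are hypothetical; each time is modelled as a time:Instant, which the W3C Time Ontology defines as a kind of time:TemporalEntity.

@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix time: <http://www.w3.org/2006/time#> .
@prefix opmv: <http://purl.org/net/opmv/ns#> .
@prefix eg:   <http://example.org.uk/> .

# the process starts, then reads its input a few seconds later
eg:run
    rdf:type             opmv:Process ;
    opmv:used            eg:input ;
    opmv:wasStartedAt    eg:tStart ;
    opmv:wasEndedAt      eg:tEnd ;
.

eg:input
    rdf:type          opmv:Artifact ;
    opmv:wasUsedAt    eg:tUsed ;
.

# the output exists shortly before the process finishes
eg:output
    rdf:type               opmv:Artifact ;
    opmv:wasGeneratedBy    eg:run ;
    opmv:wasGeneratedAt    eg:tGenerated ;
.

eg:tStart        rdf:type time:Instant ; time:inXSDDateTime "2010-10-07T12:00:00Z"^^xsd:dateTime .
eg:tUsed         rdf:type time:Instant ; time:inXSDDateTime "2010-10-07T12:00:05Z"^^xsd:dateTime .
eg:tGenerated    rdf:type time:Instant ; time:inXSDDateTime "2010-10-07T12:04:30Z"^^xsd:dateTime .
eg:tEnd          rdf:type time:Instant ; time:inXSDDateTime "2010-10-07T12:05:00Z"^^xsd:dateTime .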

3. Describe the basic provenance

Provenance information about an entity, either an artifact or an agent, can be very broad and very fine-grained. Although very detailed provenance information provides a very precise recording of what happened and evidence for the existence of the entity, it can lead to unnecessary performance and scalability burdens. The minimum provenance information about an entity at a specific state should include at least information about the when and who, for example, when an artifact was created, and by whom. This section describes how OPMV and other vocabularies can be used to provide the basic provenance and the following section explains how more detailed provenance information can be expressed using OPMV and related vocabularies.

This section starts by describing how the basic provenance information about an artifact can be represented using OPMV or Dublin Core, and what the implications of using either or both vocabularies are. It then describes how the basic provenance information about an agent can be represented using DOAP and/or Dublin Core. Finally, it provides further examples to show how OPMV can be used to describe the provenance of different types of artifacts, either those that are merely physical or those with only a digital representation, and how it can be used together with Named Graphs to describe the provenance of artifacts at different levels of granularity.

3.1 Describe basic provenance of an artifact

The following example shows how the OPMV core can be used to express provenance information about an artifact, i.e. when it was created, by whom.


#### when an artifact was created, by whom
eg:d0
    rdf:type        opmv:Artifact ;
    opmv:wasGeneratedAt      eg:t0 ;
    opmv:wasGeneratedBy [
        rdf:type     opmv:Process ;
        opmv:wasPerformedBy    eg:p0
    ]
.

eg:t0
    rdf:type    time:Instant ;
    time:inXSDDateTime    "2010-10-07T12:09:00Z"^^xsd:dateTime ;
.

eg:p0
    rdf:type    opmv:Agent, foaf:Agent ;
.

Because OPMV is a process-oriented provenance vocabulary, the existence of an entity must be scoped in a process. For instance, in our example, the creator of an entity cannot be expressed without explicitly stating the process that the creator performed and that led to the creation of this entity. By contrast, Dublin Core is a resource-oriented metadata schema. A provenance statement can be directly associated with a resource, making it much less verbose than OPMV for expressing the above simple provenance information, as shown by the example below.


#### when an artifact was created, by whom
eg:d0
    rdf:type           opmv:Artifact ;
    dcterms:created    "2010-10-07T12:09:00Z" ;
    dcterms:creator    eg:p0 ;
.

eg:p0
    rdf:type    dcterms:Agent, foaf:Agent, opmv:Agent 
.

DC Terms can be used together with OPMV to describe the provenance of an entity. However, users should note that the range of dcterms:created is a literal, which is different from that of opmv:wasGeneratedAt. dcterms:creator can be used to replace the similar statement expressed using OPMV. However, we currently have no effective means to express the mapping between dcterms:creator and opmv:wasPerformedBy. When using DC Terms to express the creator information, users lose the interoperability of their provenance information with other provenance information expressed using OPMV or other OPM serializations. This is one drawback to be aware of.
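
One possible way to use both vocabularies together, sketched here as an assumption rather than a recommendation of the OPMV documents, is to state the creator and creation time twice: once with DC Terms for brevity and once with OPMV for interoperability with other OPM data. The sketch reuses the names eg:d0, eg:t0 and eg:p0 from the examples above.

@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix time:    <http://www.w3.org/2006/time#> .
@prefix opmv:    <http://purl.org/net/opmv/ns#> .
@prefix eg:      <http://example.org.uk/> .

eg:d0
    rdf:type           opmv:Artifact ;
    # resource-oriented statements (the creation time is a literal)
    dcterms:created    "2010-10-07T12:09:00Z"^^xsd:dateTime ;
    dcterms:creator    eg:p0 ;
    # process-oriented statements (the creation time is a time:Instant resource)
    opmv:wasGeneratedAt    eg:t0 ;
    opmv:wasGeneratedBy [
        rdf:type               opmv:Process ;
        opmv:wasPerformedBy    eg:p0
    ]
.

eg:t0
    rdf:type              time:Instant ;
    time:inXSDDateTime    "2010-10-07T12:09:00Z"^^xsd:dateTime ;
.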

3.2 Describe basic provenance of an agent

An agent can be a person or an organization who controlled a process execution; it can also be a service or a software tool that performed the execution. The DOAP (Description of a Project) vocabulary, a vocabulary for describing software projects (http://usefulinc.com/ns/doap), can be used to express provenance information about a service or tool. For example, we can describe the provenance of the software tool that was used for creating the artifact eg:d0, including when it was created and who developed it.


### when a software release was created, by whom
eg:s0
    rdf:type        doap:Version ;  ### a specific version of a software project release
    doap:revision   "0.0" ;
    doap:created    "2010-10-19" ;
.

eg:prj0
    rdf:type       doap:Project ;
    doap:release           eg:s0 ;
    doap:developer     eg:stuart ;
    doap:maintainer    eg:stuart ;
.

eg:stuart    rdf:type    foaf:Person .

Similarly, some of the above information can equally be expressed using the DC Terms, as shown below.


### when a software release was created, by whom
eg:s0
    rdf:type    doap:Version ;  ### a specific version of a software project release
    doap:revision    "0.0" ;
    dcterms:created    "2010-10-19" ;
.

eg:prj0
    rdf:type      doap:Project ;
    doap:release          eg:s0 ;
    dcterms:creator   eg:stuart ;
    doap:maintainer   eg:stuart ;
.

eg:stuart    rdf:type    foaf:Person .

dcterms:created can be used to express the same information as doap:created, while dcterms:creator might have slightly different semantics from doap:developer.

3.3 When an artifact is a non-digital object

OPMV can be used to describe the provenance both of artifacts that have a physical embodiment, such as an organization, and of those that have only a digital representation in a computer system, such as an RDF graph.

The following example shows how OPMV can be used to describe the provenance of a non-digital object. The Organization Ontology (http://www.epimorphics.com/public/vocabulary/org.html) is a vocabulary for describing organizational structures. OPMV has been reused in the Organization Ontology to describe historical changes of organizational structure, as illustrated below. Because an org:Organization is an opmv:Artifact, we can use OPMV to express who created the organization and when.


#### when an organization was created, by whom
eg:org0
    rdf:type    org:Organization, opmv:Artifact ;
    opmv:wasGeneratedAt       eg:t1 ;
    opmv:wasGeneratedBy [
        rdf:type              opmv:Process ;
        opmv:wasPerformedBy   eg:p1
    ]
.

eg:t1
    rdf:type    time:Instant ;
    time:inXSDDateTime "2007-10-07T14:51:00Z"^^xsd:dateTime ;
.

eg:p1
    rdf:type    opmv:Agent, foaf:Agent ;
.

An org:Organization "represents a collection of people organized together into a community or other social, commercial or political structure". It is a sub-class of foaf:Agent as well as opmv:Artifact. This is consistent with the OPMV vocabulary because we do not define opmv:Agent as being disjoint with opmv:Artifact. Because foaf:Agent is defined as owl:equivalentClass of opmv:Agent, the example above also shows how OPMV can be used to describe the provenance of a human type of agent.
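
Written out in Turtle, the class relationships described in this paragraph would look as follows. This merely restates the paragraph as a sketch; it is not copied from the ontology files themselves.

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix org:  <http://www.w3.org/ns/org#> .
@prefix opmv: <http://purl.org/net/opmv/ns#> .

# an organization is both an agent and an artifact
org:Organization    rdfs:subClassOf        foaf:Agent , opmv:Artifact .

# FOAF agents and OPMV agents are the same class
foaf:Agent          owl:equivalentClass    opmv:Agent .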

3.4 When an artifact is a set of RDF statements

OPMV can be used together with Named Graphs to describe provenance information for artifacts of different levels of granularity. An OPMV artifact can be of any level of granularity, an RDF triple or a collection of RDF triples. A Named Graph can be used to refer to that one RDF triple or that collection of RDF triples. Such a graph can be an opmv:Artifact. The example below shows how a Named Graph is used to refer to one RDF statement so that we can describe who published that statement and when.


#### when an RDF statement was published, by whom

eg:g0 {
    eg:d1       rdf:type        org:Organization .
}

eg:g0    rdf:type    <http://www.w3.org/2004/03/trix/rdfg-1/Graph> ,  opmv:Artifact ;
    opmv:wasGeneratedAt         eg:t2 ;
    opmv:wasGeneratedBy [
        rdf:type        opmv:Process ;
        opmv:wasPerformedBy    eg:p2
    ]
.

eg:t2
    rdf:type           time:Instant ;
    time:inXSDDateTime "2009-10-10T15:14:00Z"^^xsd:dateTime ;
.

eg:p2
    rdf:type    opmv:Agent, foaf:Agent ;
.

Named Graphs are our recommended way to describe the provenance of a set of RDF statements. However, for performance reasons, users might have to use RDF reification instead, which can express the same information more tersely, although with different semantics; discussing those differences is beyond the scope of this document. For example, the provenance of the RDF statement <eg:d1 rdf:type org:Organization> described above can be expressed using OPMV and RDF reification as follows:


eg:statementxxxx    rdf:type    rdf:Statement ;
                    rdf:subject    eg:d1 ;
                    rdf:predicate  rdf:type ;
                    rdf:object     org:Organization ;
                    rdf:type       opmv:Artifact ;
                    opmv:wasGeneratedAt         eg:t2 ;
                    opmv:wasGeneratedBy [
                        rdf:type        opmv:Process ;
                        opmv:wasPerformedBy    eg:p2
                    ]
.

4. Describe the creation of an artifact

The previous section shows how to express the basic information about when an artifact was created and by whom, which gives some basic credibility to the artifact, just enough to track the responsibility for that artifact. A higher level of credibility can be established by knowing what tools were used to create the artifact and what other source artifacts were used. In this section, we show how OPMV can be used to provide more detailed, finer-grained provenance information, including the source artifact used to create an artifact and more specific information about the process leading to the given state of this artifact.

In this way, users can spot quality issues by inspecting the tools used or by tracing who created or operated the tools, and can trace the propagation of poor-quality artifacts through the derivation paths of artifacts.

Our example describes the exact process and source data that led to an artifact, i.e. eg:school1, which represents some information about a school in RDF format, that was transformed from some legacy data format into RDF by a script and published in a Named Graph, identified by eg:school1.


eg:school1
    rdf:type opmv:Artifact, <http://www.w3.org/2004/03/trix/rdfg-1/Graph> ;
    opmv:wasDerivedFrom eg:queryResult ;
    opmv:wasGeneratedBy eg:p0 
.

eg:p0
    rdf:type opmv:Process ;         
    opmv:used eg:queryResult ;
    opmv:wasPerformedBy eg:netcode ;    
    opmv:wasControlledBy <http://www.jenitennison.com/#me> ;
.

eg:queryResult rdf:type opmv:Artifact ;  
    opmv:wasGeneratedBy [
        rdf:type opmv:Process ;         
        opmv:used  ;
        opmv:used eg:query ;
    ] 
.

eg:netcode rdf:type opmv:Agent ;   
    rdfs:label ".NET code that formats the result of a SQL query on the database as RDF/XML" ;
.

Our example shows that the graph eg:school1 was derived from another artifact, eg:queryResult, and was generated by a process, eg:p0, that used this query result, was performed by a piece of code identified by eg:netcode, and was controlled by <http://www.jenitennison.com/#me>.

The artifact eg:school1 can be defined as both an opmv:Artifact and an RDF graph. An opmv:Artifact can also be a foaf:Document or a prv:DataItem (from the Provenance Vocabulary), if appropriate. However, because an opmv:Artifact is immutable, a foaf:Document regarded as an opmv:Artifact implies that this foaf:Document refers to a document in a specific state, rather than to an abstract embodiment of some work that can be conceived of as a "document". The same applies to a http://www.w3.org/2004/03/trix/rdfg-1/Graph. A prv:DataItem from the Provenance Vocabulary shares the same semantics as an opmv:Artifact, i.e. it refers to an immutable representation of data.

5. Describe the changes of an artifact

Apart from the basic provenance (the when and who about artifacts and agents), and the detailed provenance describing the process and source artifacts, another type of provenance could be information about changes of an artifact.

An opmv:Artifact can represent both something that has a physical embodiment and something that exists only as a digital representation in a computer system. Describing changes of a physical object, such as an organization or a legislation document, can help data consumers find out how things have changed over time and trace the relationship between source and resulting objects. Describing changes of digital objects, such as descriptions of an entity that are available in RDF format, is essential for data consumers to find out which descriptions of an entity are most up-to-date and trustworthy, and how descriptions of the entity have changed over time.

The latter is particularly important in the context of the Web of Data (WoD). Due to the openness of the WoD, linked datasets are often replicated and hosted at different locations, under the same or different URI namespaces. Even though these datasets are updated over time, little care is taken when they are updated. Commonly, different copies of statements about the same set of entities exist on the Web at the same time, completely interconnected and intertwined. Without sufficient context information about these statements, data consumers are confronted with questions: Which statements provide the most up-to-date information? How has information about an entity changed over time?

In this section, we show how OPMV can be used to track changes of an artifact, either as a physical or as a digital object.

Take the change history of organizations as an example. Any aspect of organizational structure is subject to change over time. When organizations change substantially, for example through a merger, the result is a new organization, which will typically be denoted by a new URI. To track changes over time and trace the relationship between the original and resulting organizations, we can provide the following provenance information:

The basic provenance, recording when an organization was created and who was responsible for the given state of the organization;
The change history of the organization, i.e. how it was changed and from which source organization it originated.

The following example describes when each organization eg:org0 and eg:org1 was created (eg:t3 and eg:t5 respectively) and by whom (eg:p1 in both cases). Additionally, it describes how the resulting organization eg:org1 was changed from the source organization eg:org0 during the change event eg:changeEvent0, which took eg:org0 as the input and produced eg:org1 as the output.


#### when the first organization was created, by whom, and that it was changed in the
#### change event eg:changeEvent0

eg:org0
    rdf:type    org:Organization, opmv:Artifact ;
    opmv:wasGeneratedAt       eg:t3 ;
    opmv:wasGeneratedBy [
        rdf:type              opmv:Process ;
        opmv:wasPerformedBy   eg:p1
    ];
    org:changedBy eg:changeEvent0 ;
.

#### when the second organization was created, by whom, and that it resulted from the
#### change event eg:changeEvent0 and was derived from the source organization eg:org0


eg:org1
    rdf:type    org:Organization, opmv:Artifact ;
    opmv:wasGeneratedAt   eg:t5 ;
    opmv:wasGeneratedBy   eg:changeEvent0 ;
    org:resultedFrom      eg:changeEvent0 ;
    opmv:wasDerivedFrom   eg:org0 ;
.


#### describe the change event eg:changeEvent0, what was the source and the result, when
#### it was performed, and by whom

eg:changeEvent0
    rdf:type          org:ChangeEvent, opmv:Process ;
    org:originalOrganization    eg:org0 ;
    opmv:used                   eg:org0 ;
    opmv:wasPerformedBy         eg:p1 ;
    org:resultingOrganization   eg:org1 ;
    opmv:wasPerformedAt         eg:t4 ;    
.


eg:t3
    rdf:type    time:Instant ;
    time:inXSDDateTime "2007-10-07T14:51:00Z"^^xsd:dateTime ;
.

eg:t4
    rdf:type    time:Instant ;
    time:inXSDDateTime "2010-10-20T14:51:00Z"^^xsd:dateTime ;
.


eg:t5
    rdf:type    time:Instant ;
    time:inXSDDateTime "2010-10-20T15:11:00Z"^^xsd:dateTime ;
.

eg:p1
    rdf:type    opmv:Agent, foaf:Agent ;
.

Not only can the organization itself change, but so can the descriptions of an organization. In the context of Linked Data, such descriptions are a set of triples that are associated with an organization resource. A new URI is used to identify a new organization, but a new URI is not always created when descriptions of an organization are changed. It is up to data publishers to decide whether a new URI should be coined when information about an organization, such as its name or its location, is changed. Mechanisms such as Named Graphs can be used to handle such cases. If each Named Graph identifies the descriptions of an organization at a given state and remains immutable at that state, we can describe the provenance of these Named Graphs just as we describe that of any opmv:Artifact.

In the following example, the first Named Graph eg:g1 contains information about the organization eg:org2 and was generated in 1960, while the second graph eg:g2 contains information about the same organization and was created in 2000, with a change in the organization title. For the purpose of the demonstration, we use the same URI to identify this organization. In practice, it is more appropriate to create a new URI to represent the organization whose title was updated. The example also describes the relationship between these two graphs: eg:g2 was derived from eg:g1.




#### when an organization was created, by whom

eg:g1 {
    eg:org2        rdf:type        org:Organization ;
    dc:title "Computing Laboratory" ;
    org:hasPrimarySite eg:40002001 ;
.

eg:40002001
    rdf:type    org:Site ;
    dc:title    "OUCS" ;
    geo:lat "51.76001"^^xsd:float ;
    geo:long "-1.26035"^^xsd:float ;
.

}

eg:g2 {
    eg:org2        rdf:type        org:Organization ;
    dc:title "Computing Services" ;
    org:hasPrimarySite eg:40002001 ;
.

eg:40002001
    rdf:type    org:Site ;
    dc:title    "OUCS" ;
    geo:lat "51.76001"^^xsd:float ;
    geo:long "-1.26035"^^xsd:float ;
.

}


eg:g1    rdf:type    <http://www.w3.org/2004/03/trix/rdfg-1/Graph> ,  opmv:Artifact ;
    opmv:wasGeneratedAt         eg:t6 ;
    opmv:wasGeneratedBy [
        rdf:type        opmv:Process ;
        opmv:wasPerformedBy    eg:p1
    ]
.


eg:g2    rdf:type    <http://www.w3.org/2004/03/trix/rdfg-1/Graph> ,  opmv:Artifact ;
    opmv:wasGeneratedAt         eg:t7 ;
    opmv:wasGeneratedBy [
        rdf:type        opmv:Process ;
        opmv:wasPerformedBy    eg:p1
    ];
    opmv:wasDerivedFrom    eg:g1
.

eg:t6
    rdf:type    time:Instant ;
    time:inXSDDateTime "1960-10-20T14:51:00Z"^^xsd:dateTime ;
.


eg:t7
    rdf:type    time:Instant ;
    time:inXSDDateTime "2000-10-20T15:11:00Z"^^xsd:dateTime ;
.




Additionally, we can provide more information about the change of information about organization eg:org2, such as who made the change and when, as shown in the following example.



eg:g2
    opmv:wasGeneratedBy [
        rdf:type    opmv:Process ;
        opmv:used   eg:g1 ;
        opmv:wasPerformedBy    eg:p2 ;
        opmv:wasPerformedAt    eg:t7 ;
    ]
.




Our examples show that the relationship between older and more recent artifacts is expressed in a derivation path, using properties like opmv:wasDerivedFrom or concepts like org:ChangeEvent. However, this information is not sufficient to support queries like "finding the latest information about an organization" or "finding information about an organization ". Additional metadata could be provided in different patterns to achieve different query efficiency [FlyWeb Provenance].




Data publishers can use an official URI, which is destined to always provide the latest information about the resource identified by that URI. This resource URI is linked to previous copies of information about that resource by properties like opmv:wasDerivedFrom. However, this does not work for cases shown in the second and third examples. A vocabulary that can express versioning of datasets is needed and this is out of the scope of the OPMV vocabulary. Users can refer to related terms from Dublin Core or others.
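
The official-URI pattern could be sketched as follows. The name eg:org2-description is a hypothetical official URI introduced only for this illustration, and eg:g1 and eg:g2 stand for earlier, immutable copies of the description, such as the Named Graphs used above.

@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix opmv: <http://purl.org/net/opmv/ns#> .
@prefix eg:   <http://example.org.uk/> .

# the official URI always serves the latest description and points back
# to the copy it was most recently derived from
eg:org2-description    opmv:wasDerivedFrom    eg:g2 .

# earlier copies form a derivation path
eg:g2
    rdf:type               opmv:Artifact ;
    opmv:wasDerivedFrom    eg:g1 ;
.

eg:g1    rdf:type    opmv:Artifact .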




6. Describe the creation of an artifact using a specialized OPMV module


OPMV was designed as a very simple provenance vocabulary. Its generic terms might not be sufficient to express provenance information as precisely as needed. We encourage users to extend OPMV in a separate module to define more specialized terms for their specific needs.



So far, the data.gov.uk team has created 3 OPMV typed modules, as mentioned at the beginning of this document. Examples showing how we can use each module to express provenance information more accurately and in more detail can be found in the following links:



http://code.google.com/p/opmv/wiki/GuideOfCommonModule


http://code.google.com/p/opmv/wiki/GuideOfXsltModule


http://code.google.com/p/opmv/wiki/GuideOfSparqlModule 




7. Build your own OPMV supplementary module


Of the three typed modules that extend OPMV, the common module is designed to hold terms that are commonly needed but not defined in the OPM specification.

Those who wish to propose new terms for the OPMV vocabularies should first consider whether such terms should fall into the common module. Those who wish to create a new typed module, like the XSLT and SPARQL modules, should build upon the common module as well as the OPMV core, in order to reuse as many terms as possible.

Users can host their own OPMV supplementary modules. However, we encourage users to use namespace patterns like http://purl.org/net/opmv/types/examplemodule# and to inform us of their extensions.
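
As a rough sketch of what such a supplementary module might contain, the following Turtle defines two specialized properties on top of the OPMV core and the common module. The ex: prefix, its namespace, and the terms ex:usedTemplate and ex:wasReviewedBy are hypothetical and exist only for this illustration.

@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix opmv:   <http://purl.org/net/opmv/ns#> .
@prefix common: <http://purl.org/net/opmv/types/common#> .
@prefix ex:     <http://purl.org/net/opmv/types/examplemodule#> .

# a more specific role for a used artifact, refining a common-module term
ex:usedTemplate
    rdf:type              owl:ObjectProperty ;
    rdfs:subPropertyOf    common:usedScript ;
    rdfs:label            "used template" ;
.

# a more specific role for an agent, refining a core term
ex:wasReviewedBy
    rdf:type              owl:ObjectProperty ;
    rdfs:subPropertyOf    opmv:wasControlledBy ;
    rdfs:label            "was reviewed by" ;
.

Defining new terms as sub-properties keeps data that uses them readable by consumers who only know the core vocabulary.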



References

[OPMV Vocabulary] OPMV Vocabulary, November 2010.
[The OPM Specification] The Open Provenance Model Core Specification (v1.1), Moreau et al.
[The OPMO Ontology] The OPMO Ontology, November 2010.
[FlyWeb Provenance] Linked data and provenance in biological data webs, Zhao J. et al. Briefings in Bioinformatics. 10(2), December 2008.