Dataset information
In ttl2html, you can describe information about an entire dataset (e.g., title, author, release date, data size) in RDF format and input it into the tool. The tool then displays this information on the main page and the “About” page, giving users a clear overview before they use the dataset.
The dataset metadata expected by this tool consists of three main parts:
Metadata for the entire dataset
Contact information
Version history information
The basic data structure is illustrated in the following diagram:
[Figure: Data model for the dataset information]
The table below lists the metadata vocabularies and their namespaces used in the model diagram and in the descriptions that follow:
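| Metadata vocabulary | Prefix | Namespace URI |
|---|---|---|
| DCMI Metadata Terms | dct: | http://purl.org/dc/terms/ |
| RDF Schema | rdfs: | http://www.w3.org/2000/01/rdf-schema# |
| FOAF (Friend of a Friend) | foaf: | http://xmlns.com/foaf/0.1/ |
| VoID (Vocabulary of Interlinked Datasets) | void: | http://rdfs.org/ns/void# |
| PAV (Provenance, Authoring and Versioning) | pav: | http://purl.org/pav/ |
| DCAT (Data Catalog Vocabulary) | dcat: | http://www.w3.org/ns/dcat# |
| PROV-O (PROV Ontology) | prov: | http://www.w3.org/ns/prov# |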
Metadata for the Entire Dataset
The resource labeled “Entire Dataset” in the model represents the dataset as a whole. Metadata about the dataset is attached as properties of this resource.
A resource linked with pav:hasCurrentVersion represents the “latest release version” of the dataset and contains details of the available Linked Data. When the tool finds a resource with a pav:hasCurrentVersion property, it treats the input as containing dataset information and writes that information out automatically.
A resource linked with dct:publisher represents the person or organization providing the dataset. This information is displayed as “contact information” in the Linked Data output.
Resources linked with pav:hasVersion represent “previous versions” and serve as historical information.
The following properties are available for describing the overall dataset:
| Property | Description |
|---|---|
| dct:title | Title of the dataset |
| dct:description | Description of the dataset |
| dct:license | Dataset license; use a URI such as a Creative Commons license if possible |
| foaf:homepage | URI of the published dataset site |
| | Additional page for the dataset (if it has a different URI from above) |
| dct:publisher | Dataset publisher (see contact information below) |
Example (Turtle):
```turtle
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix pav:  <http://purl.org/pav/> .
@prefix ex:   <http://example.org/dataset/> .

ex:dataset1 a void:Dataset ;
    dct:title "Sample RDF Dataset"@en ;
    dct:description "An example dataset for demonstrating ttl2html metadata"@en ;
    dct:license <https://creativecommons.org/licenses/by/4.0/> ;
    foaf:homepage <http://example.org/dataset/> ;
    dct:publisher ex:project1 ;
    pav:hasCurrentVersion ex:dataset1-v2 ;
    pav:hasVersion ex:dataset1-v1 .
```
Contact Information
The following properties can be used to describe the provider of the dataset.
If the provider consists of multiple people, the contact resource should be represented as an instance of the foaf:Project class, and each member is linked with the foaf:member property.
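Properties of the project resource (foaf:Project):

| Property | Description |
|---|---|
| foaf:name | Name of the project |
| foaf:member | Member of the project (may be repeated). Links to the resources that represent the individuals described below. |

Properties of each individual:

| Property | Description |
|---|---|
| foaf:name | Name of the individual |
| foaf:mbox | Email address |
| http://www.w3.org/2006/vcard/ns#organization-name | Name of the organization to which the individual belongs |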
Example (Turtle):
```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:   <http://example.org/dataset/> .

# Provider made up of several people
ex:project1 a foaf:Project ;
    foaf:name "Example Project" ;
    foaf:member ex:alice ;
    foaf:member ex:bob .

ex:alice a foaf:Person ;
    foaf:name "Alice Example" ;
    foaf:mbox <mailto:alice@example.org> ;
    <http://www.w3.org/2006/vcard/ns#organization-name> "Example University" .

ex:bob a foaf:Person ;
    foaf:name "Bob Example" ;
    foaf:mbox <mailto:bob@example.org> .
```
Version History Information
Version history information describes how the dataset has been revised over time. It is represented using PAV, the Provenance, Authoring and Versioning ontology.
The latest version is linked from the “Entire Dataset” resource with the pav:hasCurrentVersion property.
Past versions are linked with the pav:hasVersion property.
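As a sketch of how the two properties sit together on the “Entire Dataset” resource, the fragment below links one current version and two past versions. The resource names (ex:dataset1-v3 and so on) are hypothetical, and writing several pav:hasVersion values as a comma-separated object list is standard Turtle rather than anything specific to ttl2html.

```turtle
@prefix void: <http://rdfs.org/ns/void#> .
@prefix pav:  <http://purl.org/pav/> .
@prefix ex:   <http://example.org/dataset/> .

ex:dataset1 a void:Dataset ;
    # Exactly one resource is marked as the latest release (hypothetical name)
    pav:hasCurrentVersion ex:dataset1-v3 ;
    # Earlier releases; equivalent to repeating pav:hasVersion once per version
    pav:hasVersion ex:dataset1-v2, ex:dataset1-v1 .
```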
The following properties can be used for each version resource:
| Property | Description |
|---|---|
| dct:title | Version title |
| dct:issued | Release date of the version |
| pav:version | Version number |
| dcat:byteSize | File size of the dataset |
| void:triples | Number of triples in the dataset |
| void:dataDump | URI of the dataset file |
| prov:qualifiedRevision | Resource describing revision details (can be a blank node) |
| prov:wasDerivedFrom | Source resource from which the data was obtained (can be a blank node) |
Example (Turtle):
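```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix pav:  <http://purl.org/pav/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix ex:   <http://example.org/dataset/> .

# Latest version (target of pav:hasCurrentVersion)
ex:dataset1-v2 a prov:Entity ;
    dct:title "Dataset Version 2.0" ;
    pav:version "2.0" ;
    dct:issued "2025-12-25" ;
    dcat:byteSize 123456 ;
    void:triples 50000 ;
    void:dataDump <http://example.org/dataset/v2/dump.nt.gz> ;
    prov:qualifiedRevision ex:revnote-v2 ;
    prov:wasDerivedFrom [
        rdf:value <https://example.go.jp/sample-project/> ;
        rdfs:label "Project Report 2022-2024"
    ] .

# Revision details for version 2.0
ex:revnote-v2 a prov:Revision ;
    rdfs:comment "Second release: added new data and fixed errors in metadata"@en ;
    rdfs:seeAlso <http://example.org/dataset/v2/changelog> .

# Past version (target of pav:hasVersion)
ex:dataset1-v1 a prov:Entity ;
    dct:title "Dataset Version 1.0" ;
    pav:version "1.0" ;
    void:triples 30000 ;
    void:dataDump <http://example.org/dataset/v1/dump.nt.gz> .
```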
Revision Details
The value of prov:qualifiedRevision may contain the following properties:
| Property | Description |
|---|---|
| rdfs:comment | Description of the revision |
| rdfs:seeAlso | URI with more details on the revision (if available) |
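Because the value of prov:qualifiedRevision can be a blank node, the revision details can also be written inline on the version resource. The following is a minimal sketch that reuses ex:dataset1-v2 and the revision text from the version history example. The prov:Revision type is carried over from that example and is not known to be required by the tool.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix ex:   <http://example.org/dataset/> .

ex:dataset1-v2 prov:qualifiedRevision [
    a prov:Revision ;
    # Human-readable summary of what changed in this release
    rdfs:comment "Second release: added new data and fixed errors in metadata"@en ;
    # Optional link to a page with more details
    rdfs:seeAlso <http://example.org/dataset/v2/changelog>
] .
```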
Source Information
The prov:wasDerivedFrom property, assigned to a dataset version resource, can be used to represent source information.
Describing the value of this property as a blank node with the structure described below makes the source of a published dataset explicit.
This method can also be used to provide attribution required by licenses such as CC-BY.
The resource used as the value of prov:wasDerivedFrom (typically a blank node) should have at least two properties: rdf:value and rdfs:label.
The rdf:value property should contain the URI of the source, while rdfs:label should give its human-readable name.
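The structure described above can be sketched as follows. This is a minimal illustration: the ex:dataset1-v2 resource, the source URI, and the label are reused from the version history example, and the prefix declarations are repeated so that the fragment parses on its own.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix ex:   <http://example.org/dataset/> .

# Version resource that carries the source information
ex:dataset1-v2 prov:wasDerivedFrom [
    # URI of the source the data was obtained from
    rdf:value <https://example.go.jp/sample-project/> ;
    # Human-readable name of the source
    rdfs:label "Project Report 2022-2024"
] .
```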
License Information
License information can be expressed not only as a single URI, but also as an extended representation that adds explanatory text to the URI.
| Property | Description |
|---|---|
| rdf:value | URI representing the license itself |
| rdfs:label | Descriptive text describing the license |
| foaf:thumbnail | URI of a thumbnail image of the license information |
```turtle
ex:dataset1 a void:Dataset ;
    ...
    dct:license ex:license ;
    ... .

ex:license
    rdf:value <https://creativecommons.org/licenses/by/4.0/> ;
    rdfs:label "Creative Commons Attribution 4.0 International (CC BY 4.0)" ;
    foaf:thumbnail ex:license.png .
```
License information can also be expressed using blank nodes as follows:
```turtle
ex:dataset1 a void:Dataset ;
    ...
    dct:license [
        rdf:value <https://creativecommons.org/licenses/by/4.0/> ;
        rdfs:label "Creative Commons Attribution 4.0 International (CC BY 4.0)" ;
        foaf:thumbnail ex:license.png
    ] ;
    ... .
```
SPARQL Endpoint Information
In ttl2html, the location of a SPARQL endpoint can be displayed on the top page and the “About” page by describing it in the input RDF (Turtle).
To express the endpoint location in RDF, use one of the following methods:
Method A (recommended): Express as a DataService using DCAT’s dcat:accessService.
You can specify not only the endpoint URL but also the landing page.
Method B (simplified): Write only VoID’s void:sparqlEndpoint.
This provides a minimal endpoint URI, but no landing page.
In the following examples, _:toplevel stands for the resource that represents the entire dataset.
Method A: Express using dcat:accessService (DataService)
The following triples can be added to the metadata for the entire dataset.
```turtle
@prefix void: <http://rdfs.org/ns/void#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .

_:toplevel a void:Dataset, dcat:Dataset ;
    dcat:accessService [
        a dcat:DataService ;
        dcat:endpointURL <https://dydra.com/masao/jp-naaa/sparql> ;
        dcat:landingPage <https://dydra.com/masao/jp-naaa/@query>
    ] .
```
The terms used in the example above have the following meanings:
dcat:accessService: Service for accessing this dataset
a dcat:DataService: Type of service (DataService)
dcat:endpointURL: SPARQL endpoint URL
dcat:landingPage: Landing page for humans (e.g., query UI, description page)
Method B: Express using void:sparqlEndpoint
The following triples can be added to the metadata for the entire dataset.
```turtle
@prefix void: <http://rdfs.org/ns/void#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .

_:toplevel a void:Dataset, dcat:Dataset ;
    void:sparqlEndpoint <https://dydra.com/masao/jp-naaa/sparql> .
```
Note that when using void:sparqlEndpoint, you can only add endpoint URIs for machine access; you cannot describe search or description pages for humans.