SENSE XML-RDF feed documentation¶
An EEA initiative for the national input to SOER 2010
- Antonio De Marinis
- Sasha Vinčić (Valentine Web Systems)
This document is the technical documentation for the specification and is part of SENSE (Shared European National State of the Environment). SENSE is the name of a specific EEA/Eionet project for SOER 2010 Part C. Read more about the SENSE project here; a forum and FAQ about this spec are also available (login required).
Focus and target audience¶
This document explains the XML-RDF feed elements, primarily for the EEA NRCs for Information systems, NFPs and others interested in seeing a pragmatic implementation.
The intended audience is IT system architects and developers.
Other audiences, e.g. generalists and project managers, need a basic knowledge of XML and OO modelling. Top managers and other non-technical readers may find useful information in the initial web proposal document and on the SENSE project page.
The class diagram is an OO model of the content we are going to harvest in XML-RDF format. The National Story is the main content item that each country shares with EEA. It coincides with one or more indicator assessments (the answer to one question) about a specific topic of SOER Part C. The National Story may contain one or several figures. Each Figure must also refer to one or several data files (from which the figure is derived).
A National Story may also be directly supported by one or several data files. This happens when the assessment in the National Story makes no use of figures/images, but instead explains the data in textual form and refers to the data files directly. If there is no figure, the National Story must refer to at least one data source.
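The two rules above (every figure needs at least one data file, and a story without figures needs at least one data file of its own) can be sketched as a small check. This is only an illustration: the Python class and attribute names below are hypothetical and are not taken from the SOER schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataFile:
    url: str  # link to a machine-readable data file


@dataclass
class Figure:
    url: str
    data_files: List[DataFile] = field(default_factory=list)


@dataclass
class NationalStory:
    about: str  # unique identifier of the story
    figures: List[Figure] = field(default_factory=list)
    data_files: List[DataFile] = field(default_factory=list)


def validate_story(story: NationalStory) -> List[str]:
    """Return a list of rule violations; an empty list means the story passes."""
    problems = []
    # Rule 1: each figure must refer to one or several data files.
    for figure in story.figures:
        if not figure.data_files:
            problems.append(f"figure {figure.url} has no data file")
    # Rule 2: without figures, the story itself must refer to a data source.
    if not story.figures and not story.data_files:
        problems.append("story has neither figures nor data files")
    return problems
```

A story that only explains data in text would pass as long as `data_files` is not empty.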
(Done with Poseidon for UML - source file attached to this page)
SOER XML-RDF feed specification (version 1.0)¶
See Soer RDF Schema/Soer Feed spec for all the different classes and properties.
To share content we also need to know how it can be used and whether there are any copyright restrictions. To facilitate the re-use of content, EEA uses the standard Creative Commons Attribution license version 2.5. Our examples use a shorter version, without the translations and descriptions found in the full version of the CC BY 2.5 license in RDF format. You must therefore provide similar license information in your RDF feed.
The full example "All use cases of RDF" shows how the RDF would look for the following use cases:
- normal National Story with a link to a core set of indicators
- National Story based on multiple indicators
- National Story without figures
- Figure that is a map which is also its own data.
- Channel information
The example below shows an RDF feed from Norway containing one National Story. It answers the question "What are the state and impacts?" for the freshwater topic. It was officially published on the national site on 15 May 2009.
<rdf:RDF xml:lang="en"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.eea.europa.eu/soer/1.0#">
  <NationalStory rdf:about="http://www.environment.no/status-of-water-bodies">
    <question>What are the state and impacts?</question>
    <topic>freshwater</topic>
    <evaluation>FV</evaluation>
    <keyMessage>Environmental conditions in Norway’s rivers and lakes are good compared with those in most other countries in Europe.</keyMessage>
    <pubDate>2009-05-15</pubDate>
    <geoCoverage rdf:resource="http://rod.eionet.europa.eu/spatial/28" />
  </NationalStory>
</rdf:RDF>
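To check what a consumer would see in such a feed, the following sketch reads it with Python's standard `xml.etree.ElementTree`. It treats the feed as plain namespaced XML rather than using a full RDF parser, which is an assumption made for brevity; the feed text is an abridged copy of the example above and the function name `read_stories` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SOER_NS = "http://www.eea.europa.eu/soer/1.0#"

# Abridged copy of the Norwegian example feed.
FEED = """<rdf:RDF xml:lang="en"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://www.eea.europa.eu/soer/1.0#">
  <NationalStory rdf:about="http://www.environment.no/status-of-water-bodies">
    <question>What are the state and impacts?</question>
    <topic>freshwater</topic>
    <evaluation>FV</evaluation>
    <pubDate>2009-05-15</pubDate>
    <geoCoverage rdf:resource="http://rod.eionet.europa.eu/spatial/28" />
  </NationalStory>
</rdf:RDF>"""


def read_stories(feed_xml):
    """Return one dict per NationalStory with its identifier and properties."""
    root = ET.fromstring(feed_xml)
    stories = []
    for node in root.iter(f"{{{SOER_NS}}}NationalStory"):
        story = {"about": node.get(f"{{{RDF_NS}}}about")}
        for child in node:
            name = child.tag.split("}", 1)[1]  # drop the namespace part of the tag
            resource = child.get(f"{{{RDF_NS}}}resource")
            # Properties either point to a resource or carry literal text.
            story[name] = resource if resource is not None else (child.text or "").strip()
        stories.append(story)
    return stories


stories = read_stories(FEED)
print(stories[0]["about"], stories[0]["topic"])
```

Running the feed through the RDF validator listed at the end of this page remains the more reliable check of the RDF itself.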
You can add all the national stories sequentially in the same RDF feed. The value in rdf:about is the unique identifier for the national story we are describing.
<rdf:RDF xml:lang="en"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.eea.europa.eu/soer/1.0#">
  <NationalStory rdf:about="..."> ... </NationalStory>
  <NationalStory rdf:about="..."> ... </NationalStory>
</rdf:RDF>
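If the feed is generated by a script, the sequential layout above could be produced roughly as follows. This is a minimal sketch using Python's standard library; the function name `build_feed`, the input shape and the example.org identifiers are assumptions for illustration only, and a real feed would carry the full set of properties from the spec.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SOER_NS = "http://www.eea.europa.eu/soer/1.0#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Serialise with the same prefixes as the examples: rdf: and a default namespace.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("", SOER_NS)


def build_feed(stories):
    """Build a feed from (identifier, {property: text}) pairs, one story after another."""
    root = ET.Element(f"{{{RDF_NS}}}RDF", {XML_LANG: "en"})
    for about, properties in stories:
        story = ET.SubElement(
            root, f"{{{SOER_NS}}}NationalStory", {f"{{{RDF_NS}}}about": about}
        )
        for name, value in properties.items():
            ET.SubElement(story, f"{{{SOER_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


feed = build_feed([
    ("http://example.org/story-1", {"topic": "freshwater"}),
    ("http://example.org/story-2", {"topic": "freshwater",
                                    "question": "What are the state and impacts?"}),
])
print(feed)
```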
Now let's add generic channel information about who is sharing these SOER national stories. We add a channel tag with this information to the feed, as in the following example. Note the option to use xml:lang to specify the name of the organisation in different languages. We also add the license tag for Creative Commons 2.5 (CC-BY):
<rdf:RDF xml:lang="en"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.eea.europa.eu/soer/1.0#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:cc="http://creativecommons.org/ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:dcq="http://purl.org/dc/terms/">
  <channel rdf:about="Link to this RDF feed">
    <organisationName>National Institute for the Protection and Environmental Research</organisationName>
    <organisationName xml:lang="it">Istituto Superiore per la Protezione e la Ricerca Ambientale</organisationName>
    <reportingCountryCode>IT</reportingCountryCode>
    <organisationLogoURL xml:lang="en">http://nfp.irceline.be/logo_irceline_nfp.png</organisationLogoURL>
    <organisationURL>http://www.isprambiente.it</organisationURL>
    <organisationContactURL xml:lang="it">http://www.isprambiente.it/site/it-IT/Servizi_del_sito/Contatti</organisationContactURL>
    <cc:License rdf:about="http://creativecommons.org/licenses/by/2.5/">
      <cc:permits rdf:resource="http://creativecommons.org/ns#Distribution"/>
      <cc:permits rdf:resource="http://creativecommons.org/ns#Reproduction"/>
      <cc:permits rdf:resource="http://creativecommons.org/ns#DerivativeWorks"/>
      <cc:requires rdf:resource="http://creativecommons.org/ns#Attribution"/>
      <cc:requires rdf:resource="http://creativecommons.org/ns#Notice"/>
      <dcq:hasVersion>2.5</dcq:hasVersion>
      <cc:legalcode rdf:resource="http://creativecommons.org/licenses/by/2.5/legalcode"/>
    </cc:License>
    <NationalStory rdf:about="..."> ... </NationalStory>
    <NationalStory rdf:about="..."> ... </NationalStory>
  </channel>
</rdf:RDF>
Dealing with HTML¶
In the <assessment> element you can have HTML. The HTML has to be escaped because it is embedded as a payload inside an XML element. There are two ways to escape HTML:
- Convert all '&' to '&amp;', all '<' to '&lt;' and all '>' to '&gt;', in that order. '<h2>' becomes '&lt;h2&gt;'
- Put '<![CDATA[ ]]>' around the text. '<h2>' becomes '<![CDATA[<h2>]]>'. Remember that if your HTML code contains ']]>', it has to be escaped by leaving CDATA mode and re-entering it after the ']]>'. For example, ']]>' becomes ']]>]]&gt;<![CDATA['
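Both escaping methods can be sketched in a few lines of Python. The function names are hypothetical; `xml.sax.saxutils.escape` is the standard-library helper that replaces '&', '<' and '>' with their entities, handling '&' first so that already produced entities are not escaped twice.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def escape_entities(html):
    """Method 1: replace &, < and > with XML entities."""
    return escape(html)


def wrap_cdata(html):
    """Method 2: wrap the HTML in CDATA.

    Any ']]>' in the HTML would end the section early, so we close the
    section, write the sequence as ']]&gt;' and open a new section.
    """
    return "<![CDATA[" + html.replace("]]>", "]]>]]&gt;<![CDATA[") + "]]>"


html = "<h2>Lakes & rivers</h2>"
for payload in (escape_entities(html), wrap_cdata(html)):
    # Either payload gives back the original HTML when the XML is parsed.
    element = ET.fromstring("<assessment>" + payload + "</assessment>")
    print(element.text == html)
```

Whichever method is used, an XML parser on the receiving side should recover the original HTML text unchanged.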
We aim to get HTML that is as clean and semantic as possible, preferably XHTML. We intend to run some kind of "HTML tidier" on our side, just to remove CSS and styles generated by MS Word. This will give us more control over the style.
We expect all the content to be available at least in English. (You can, however, also provide several translations within the same RDF feed by using the xml:lang attribute on the specific property tags to distinguish languages.)
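As an illustration of how translations marked with xml:lang might be read, the sketch below collects the values of one property per language. It is a simplification: a tag without its own xml:lang is assumed to inherit the language declared on the root element only, not on intermediate elements. The function name `values_by_language` is hypothetical, and the sample reuses the organisation names from the channel example.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
SOER_NS = "http://www.eea.europa.eu/soer/1.0#"

CHANNEL = """<rdf:RDF xml:lang="en"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://www.eea.europa.eu/soer/1.0#">
  <channel rdf:about="Link to this RDF feed">
    <organisationName>National Institute for the Protection and Environmental Research</organisationName>
    <organisationName xml:lang="it">Istituto Superiore per la Protezione e la Ricerca Ambientale</organisationName>
  </channel>
</rdf:RDF>"""


def values_by_language(feed_xml, property_name):
    """Map language code to text for every occurrence of one property."""
    root = ET.fromstring(feed_xml)
    default_lang = root.get(XML_LANG, "en")  # language declared on rdf:RDF
    result = {}
    for node in root.iter(f"{{{SOER_NS}}}{property_name}"):
        result[node.get(XML_LANG, default_lang)] = (node.text or "").strip()
    return result


names = values_by_language(CHANNEL, "organisationName")
print(names["en"])
print(names["it"])
```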
If the assessment makes use of figures we assume all the figures are included in the RDF feed with at least one reference to one data source.
We also expect the figure URLs to point to the highest resolution available. The EEA can then handle figures more flexibly, for example by scaling them to a lower or higher resolution depending on the output channel, or by generating thumbnails for search result lists.
We do not impose any layout restrictions on the figures. The original layout used in the national reporting is sufficient, as these figures have already been laid out and published at national level.
Figure and Data source declarations¶
DataURL should point to a real data file, not to the same or another image. The image is not regarded as the data source; it is the representation of one or more DataFiles for human consumption. The DataFile must point to a machine-readable file, such as XML, RDF/XML, Excel or tab-separated values. We prefer RDF/XML. This allows the user to download the raw data behind the image/figure. See Assessment with figures and data: Freshwater, commonality part, State and impacts - Norway
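A feed producer could add a rough pre-publication check that a data URL does not simply point to an image. The sketch below is only a heuristic based on the file extension in the URL path; the extension list, the function name `looks_like_image` and the example.org URLs are assumptions, and a flagged URL should be reviewed by a person rather than rejected automatically.

```python
from posixpath import splitext
from urllib.parse import urlparse

# Assumed list of common image extensions; extend as needed.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tif", ".tiff"}


def looks_like_image(url):
    """Return True when the path part of the URL ends in an image extension."""
    path = urlparse(url).path  # ignores any query string or fragment
    return splitext(path)[1].lower() in IMAGE_EXTENSIONS


for url in ("http://example.org/figures/map.PNG",
            "http://example.org/data/water.rdf"):
    print(url, looks_like_image(url))
```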
- http://www.w3.org/RDF/Validator/ RDF validator