SDI data folder structure » History » Revision 11

Jose Rubio, 2018-08-21 09:15


Directory structure used for SDI geospatial datasets

A directory structure with a strict naming convention has been set up under \\sdi.eea.europa.eu\data to organize the reference versions of the geospatial datasets used by the EEA and the ETCs.

1. Allowed characters in directory and file names

To ensure smooth access to the datasets from scripts running in Windows and Linux environments, the names used for files and directories shall meet the following requirements:
  • Only lower-case alphanumeric characters, hyphens (-) and underscores (_) are allowed
  • Names shall not contain any space (blank character)
  • Words in a name shall be separated with underscores (_)
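
The rules above can be checked mechanically. The following is a minimal sketch, not part of the convention itself: the function name is_valid_name and the allow_extension option are assumptions made for illustration. The extension handling in particular is a guess, since the rules do not mention the dot that separates a file name from its extension.

```python
import re

# Lower-case letters, digits, hyphen and underscore only, as listed above.
NAME_RE = re.compile(r"[a-z0-9_-]+")


def is_valid_name(name: str, allow_extension: bool = False) -> bool:
    """Return True if `name` uses only the allowed characters.

    `allow_extension` is an assumption of this sketch: when set, a single
    dot-separated extension (e.g. ".tif") is tolerated for file names.
    """
    if allow_extension:
        stem, dot, ext = name.rpartition(".")
        if dot:
            # Both the stem and the extension must follow the character rules.
            return bool(NAME_RE.fullmatch(stem)) and bool(NAME_RE.fullmatch(ext))
    return bool(NAME_RE.fullmatch(name))


print(is_valid_name("corine_land_cover"))
print(is_valid_name("Corine Land Cover"))
print(is_valid_name("clc.tif", allow_extension=True))
```

Directory names would be checked with the default arguments, file names with allow_extension=True.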

2. Overview of the directory structure

The adopted structure groups the data first by geographic coverage and then by theme, as referenced in the GEMET thesaurus.

data
└───gis_sdi
    └───L1_Geographic_Location
        └───L2_Theme
            └───L3_Dataset_Parent_Folder
                └───L4_Dataset_A

3. Directory structure description

3.1 Level 1 - Geographic location

Level 1 (geographic location) consists of two nested directories for the continental, local and regional coverages, and of a single directory for the global coverage:

gis_sdi
├───continental
│   ├───...
│   └───south_america
├───global
├───local
│   ├───...
│   └───uk
└───regional
    ├───...
    └───west_balkans

The complete Level 1 structure is available here.

3.2 Level 2 - Theme or external database

Level 2 of the directory structure consists of one directory per theme. In addition, complex databases delivered by external providers such as ESRI Data & Maps, GISCO or EuroGeographics, which contain data belonging to several themes, are stored under external_db according to their native structure.

L1
├───external_db
│   ├───...
│   └───gisco
├───agriculture
├───...
└───water

The current Level 2 structure is available here. Extra themes from the GEMET thesaurus may be added when needed. Please note that the general theme should only be used when no specific theme applies.

3.3 Level 3 - Dataset parent folder

Level 3 of the directory structure consists of one directory per dataset family with possibly one level of sub-directories. The names chosen for level 3 (sub-)directories should be short and meaningful.

As an example, Corine Land Cover data consist of land cover datasets and changes datasets. With L2 equal to natural_areas, the following structure could be used:

L2
└───corine_land_cover
    ├───land_cover
    └───changes

3.4 Level 4 - Dataset folder

Level 4 of the directory structure consists of one directory per dataset. A level 4 folder shall contain exactly one dataset.

To ensure that key metadata elements are directly accessible to the user, level 4 directories shall use the following naming convention (currently under implementation):

Provider_DataType_EpsgCode_ScaleResolution_ScaleResUnit_DatasetShortName_PublicOrInternal_TimeCoverage_VersionNumber_RevisionNumber

As an example, the following structure contains two datasets:
  • A 2006 [2006] raster [r] version [13.0] of Corine Land Cover [clc] provided by EEA [EEA] with a resolution of 100 metres [100][m] in Lambert Azimuthal Equal Area projection [3035] which is publicly [p] available
  • A coverage [c] version [17.0] of Corine Land Cover changes [clc-changes] between 2000 and 2006 [2000-2006] provided by EEA [EEA] at a resolution of 100 metres [100][m] in Lambert Azimuthal Equal Area projection [3035] which is publicly [p] available

L2
└───corine_land_cover
    ├───land_cover
    │   ├───...
    │   └───eea_r_3035_100_m_clc-2006_p_2006_v13_r00
    └───changes
        ├───...
        └───eea_v_3035_100_k_clcc-2000-2006_p_2000-2006_v17_r00
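
Because the ten fields of the naming convention are separated by underscores, a level 4 folder name can be split back into its metadata elements. The sketch below is one possible way to do this. The class and function names are hypothetical, and it assumes that no field contains an underscore itself, as in the two folder names above, where the short name and time coverage use hyphens.

```python
from typing import NamedTuple


class DatasetFolderName(NamedTuple):
    """The ten fields of the level 4 naming convention, in order."""
    provider: str
    data_type: str
    epsg_code: str
    scale_resolution: str
    scale_res_unit: str
    dataset_short_name: str
    public_or_internal: str
    time_coverage: str
    version_number: str
    revision_number: str


def parse_dataset_folder(name: str) -> DatasetFolderName:
    """Split a level 4 folder name into its fields.

    Assumption of this sketch: fields never contain an underscore.
    """
    parts = name.split("_")
    expected = len(DatasetFolderName._fields)
    if len(parts) != expected:
        raise ValueError(
            f"expected {expected} underscore-separated fields, "
            f"got {len(parts)}: {name!r}"
        )
    return DatasetFolderName(*parts)


info = parse_dataset_folder("eea_r_3035_100_m_clc-2006_p_2006_v13_r00")
print(info.provider, info.epsg_code, info.time_coverage)
```

All fields are kept as strings; converting the EPSG code or resolution to numbers is left to the caller.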

4. Dataset file names

Given the large number of existing datasets, renaming all of them according to a single convention is not feasible. Nevertheless, the restrictions on allowed characters in directory and file names shall still apply, so some datasets might have to be renamed accordingly.
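
Where a dataset file does need renaming, a compliant name could be proposed automatically. The sketch below is only one possible approach. The function name suggest_compliant_name is hypothetical, and the choices to replace every run of disallowed characters with a single underscore and to keep one dot-separated extension are assumptions, not rules stated on this page.

```python
import re


def suggest_compliant_name(filename: str) -> str:
    """Propose a name that satisfies the character restrictions.

    Assumptions of this sketch: the last dot separates an extension, and
    each run of disallowed characters becomes a single underscore.
    """
    stem, dot, ext = filename.rpartition(".")
    if not dot or not stem:
        # No extension (or a leading dot only): treat the whole name as stem.
        stem, ext = filename, ""

    def clean(part: str) -> str:
        part = part.strip().lower()
        part = re.sub(r"[^a-z0-9_-]+", "_", part)
        return part.strip("_")

    clean_stem, clean_ext = clean(stem), clean(ext)
    return f"{clean_stem}.{clean_ext}" if clean_ext else clean_stem


print(suggest_compliant_name("Corine Land Cover 2006.TIF"))
```

Names that already comply are returned unchanged. Suggestions should still be reviewed before renaming, since two different original names can map to the same result.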

Updated by Jose Rubio almost 3 years ago · 11 revisions
