SDI data folder structure » History » Version 12
Jose Rubio, 2018-08-21 09:17
{{>toc}}

h1. Directory structure used for SDI geospatial datasets

A directory structure with a strict naming convention has been set up under *\\sdi.eea.europa.eu\data* to organize the reference versions of the geospatial datasets used by the EEA and the ETCs.

h2. 1. Allowed characters in directory and file names

To ensure smooth access to the datasets from scripts running in Windows and Linux environments, the names used for files and directories shall meet the following requirements:

* Only lower-case alphanumeric characters, hyphen (-) and underscore (_) are allowed
* Names shall not contain any space (blank character)
* Words in a name shall be separated with underscores (_)

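The character rules above can be checked automatically. The following is a minimal Python sketch, not an official EEA tool: the function name @is_valid_name@ and the @allow_extension@ option are illustrative assumptions. In particular, the rules do not mention the dot of a file extension, so the sketch assumes that each dot-separated part of a file name has to satisfy the rules on its own.

```python
import re

# Lower-case letters, digits, hyphen and underscore only.
# Spaces and upper-case characters are rejected by construction.
_ALLOWED = re.compile(r"[a-z0-9_-]+")


def is_valid_name(name: str, allow_extension: bool = False) -> bool:
    """Return True if `name` only uses the allowed characters.

    With allow_extension=True (an assumption, see the lead-in), the name is
    split on dots and every part must be non-empty and valid.
    """
    parts = name.split(".") if allow_extension else [name]
    return all(_ALLOWED.fullmatch(part) is not None for part in parts)
```

For example, @is_valid_name("corine_land_cover")@ is accepted, while @is_valid_name("Corine Land Cover")@ is rejected because of the upper-case letters and the spaces.
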
h2. 2. Overview of the directory structure

The chosen structure groups the data by geographic coverage and then by theme, as referenced in the "GEMET thesaurus":http://www.eionet.europa.eu/gemet.

<pre>
data
└───gis_sdi
    └───L1_Geographic_Location
        └───L2_Theme
            └───L3_Dataset_Parent_Folder
                └───L4_Dataset_A
</pre>

h2. 3. Directory structure description

h3. 3.1 Level 1 - Geographic location

Level 1 (geographic location) consists of two nested directory levels for continental, local and regional coverages:

<pre>
gis_sdi
├───continental
│   ├───...
│   └───south_america
├───global
├───local
│   ├───...
│   └───uk
└───regional
    ├───...
    └───west_balkans
</pre>

The complete Level 1 structure is available [[SDI_data_folder_structure_-_Level_1|here]].

h3. 3.2 Level 2 - Theme or external database

Level 2 of the directory structure consists of one directory per theme. In addition, complex databases delivered by external providers such as ESRI Data & Maps, GISCO or EuroGeographics, which contain data belonging to several themes, are stored according to their native structure.

<pre>
L1
├───external_db
│   ├───...
│   └───gisco
├───agriculture
├───...
└───water
</pre>

The current Level 2 structure is available [[SDI_data_folder_structure_-_Level_2|here]]. Extra themes from the GEMET thesaurus may be added when needed. Please note that _general_ should only be used when no specific theme applies.

h3. 3.3 Level 3 - Dataset parent folder

Level 3 of the directory structure consists of one directory per dataset family, optionally with one level of sub-directories. The names chosen for level 3 (sub-)directories should be short and meaningful.

As an example, Corine Land Cover data consist of land cover datasets and change datasets, so the following structure could be used, with @L2@ being equal to @natural_areas@:

<pre>
L2
└───corine_land_cover
    ├───land_cover
    └───changes
</pre>

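Levels 1 to 3 can be combined into a path programmatically. The sketch below is only an illustration under stated assumptions: the helper name @dataset_parent_dir@ is hypothetical, the area @europe@ used in the usage note is an assumed Level 1 sub-directory that is not listed on this page, and the root is passed in as a parameter rather than hard-coded.

```python
from pathlib import PurePosixPath
from typing import Optional


def dataset_parent_dir(
    root: str,
    coverage: str,
    area: Optional[str],
    theme: str,
    family: str,
    sub_family: Optional[str] = None,
) -> PurePosixPath:
    """Compose the Level 3 directory path from Level 1, 2 and 3 names.

    `area` is the second Level 1 directory; pass None for a coverage such as
    `global`, which is shown without sub-directories in the Level 1 tree.
    """
    path = PurePosixPath(root) / "gis_sdi" / coverage
    if area is not None:
        path = path / area
    path = path / theme / family
    if sub_family is not None:
        # Optional single level of sub-directories below the dataset family.
        path = path / sub_family
    return path
```

Calling @dataset_parent_dir("data", "continental", "europe", "natural_areas", "corine_land_cover", "land_cover")@ yields @data/gis_sdi/continental/europe/natural_areas/corine_land_cover/land_cover@.
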
h3. 3.4 Level 4 - Dataset folder

Level 4 of the directory structure consists of one directory per dataset. A level 4 folder shall contain _one *and only one*_ dataset.

To ensure that key metadata elements are directly accessible to the user, level 4 directories shall follow [[Naming_conventions|the following naming convention]] (currently under implementation)*:

*Provider_DataType_EpsgCode_ScaleResolution_ScaleResUnit_DatasetShortName_PublicOrInternal_TimeCoverage_VersionNumber_RevisionNumber*

As an example, the following structure contains two datasets:

* A 2006 @[2006]@ raster @[r]@ version @[13.0]@ of Corine Land Cover @[clc]@ provided by EEA @[EEA]@ with a resolution of 100 metres @[100][m]@ in Lambert Azimuthal Equal Area projection @[3035]@ which is publicly @[p]@ available
* A coverage @[c]@ version @[17.0]@ of Corine Land Cover changes @[clc-changes]@ between 2000 and 2006 @[2000-2006]@ provided by EEA @[EEA]@ at a resolution of 100 metres @[100][m]@ in Lambert Azimuthal Equal Area projection @[3035]@ which is publicly @[p]@ available

<pre>
L2
└───corine_land_cover
    ├───land_cover
    │   ├───...
    │   └───eea_r_3035_100_m_clc-2006_p_2006_v13_r00
    └───changes
        ├───...
        └───eea_v_3035_100_k_clcc-2000-2006_p_2000-2006_v17_r00
</pre>

*At the time of writing (21/08/2018) the existing implementation of the naming convention at this level is slightly different:

*Provider_DataType_EpsgCode_ScaleResolution_ScaleResUnit_DatasetShortName_TimeCoverage_RevisionNumber*

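Because the ten elements of the target naming convention are separated by underscores, a Level 4 directory name can be split back into its metadata fields. The following Python sketch is an illustration only: the class @DatasetName@, its field names and the function @parse_dataset_dir@ are hypothetical, and it assumes that no element contains an underscore itself (the examples above use hyphens inside the short name and the time coverage). Names that follow the shorter convention currently in place have fewer elements and are rejected by this sketch.

```python
from typing import NamedTuple


class DatasetName(NamedTuple):
    """The ten elements of the target Level 4 naming convention, in order."""
    provider: str
    data_type: str
    epsg_code: str
    scale_resolution: str
    scale_res_unit: str
    short_name: str
    public_or_internal: str
    time_coverage: str
    version: str
    revision: str


def parse_dataset_dir(name: str) -> DatasetName:
    """Split a Level 4 directory name into its underscore-separated elements."""
    parts = name.split("_")
    expected = len(DatasetName._fields)
    if len(parts) != expected:
        raise ValueError(
            f"expected {expected} underscore-separated elements, "
            f"got {len(parts)} in {name!r}"
        )
    return DatasetName(*parts)
```

For the first example directory, @parse_dataset_dir("eea_r_3035_100_m_clc-2006_p_2006_v13_r00")@ gives @epsg_code == "3035"@, @short_name == "clc-2006"@ and @version == "v13"@.
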
h2. 4. Dataset file names

Given the large number of existing datasets, renaming all of them according to a convention is not feasible. Nevertheless, the restrictions on allowed characters in directory and file names still apply, so some datasets might have to be renamed accordingly.
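
To estimate how many existing files and directories would need renaming, a directory tree can be scanned for names that break the character rules. This is a hedged sketch rather than an existing SDI script: the function name @find_invalid_names@ is hypothetical, and it assumes that for files each dot-separated part (stem and extension) is checked separately, since the rules do not mention the dot.

```python
import os
import re
from pathlib import Path

# Lower-case letters, digits, hyphen and underscore only.
_ALLOWED = re.compile(r"[a-z0-9_-]+")


def find_invalid_names(root):
    """Return the sorted relative paths under `root` whose last component
    breaks the allowed-character rules."""
    root = Path(root)
    offenders = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            if _ALLOWED.fullmatch(name) is None:
                offenders.append(Path(dirpath, name).relative_to(root).as_posix())
        for name in filenames:
            # Assumption: every dot-separated part of a file name must be valid.
            if not all(_ALLOWED.fullmatch(part) for part in name.split(".")):
                offenders.append(Path(dirpath, name).relative_to(root).as_posix())
    return sorted(offenders)
```

The returned list can serve as a starting point for a renaming plan; the sketch only reports names and does not rename anything.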