Data Workspace
The Data Workspace is the Digital Enabler component that federates existing Open Data Management Systems (ODMS), providing a single access point to search and discover open datasets from heterogeneous sources.
The Data Sources Federation is based on the FIWARE Generic Enabler Idra. It natively supports ODMS based on CKAN, DKAN, Socrata, Orion Context Broker (NGSI v2) and many other technologies, and it also provides a set of APIs to federate ODMS that are not natively supported.
Access Data Sources Federation
The tool is available at https://catalogue.demo.digital-enabler.eng.it/catalogue. No login is required to consult the existing data; logging in gives the user access to the more advanced features.
Platform management & user guide
For details on platform management and usage, the user can refer to the official Idra documentation.
Data evaluator
Compared to Idra, the Data Workspace provides one additional feature: the data evaluator. It reports on a dataset's quality, usability and level of machine readability.
The dataset evaluation is built from five indexes, each analyzing the dataset from a different perspective:
- Semantic level
- Format level
- Internationalization level
- Resources level
- Update level
Semantic level: This index evaluates the metadata associated with a dataset and indicates how complete that metadata is. To obtain a high value for this index, the following fields must be filled in: update information, temporal information, geo information, title, description, language, owner, keywords, license, website, contact.
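As an illustration only, metadata completeness could be measured as the share of the listed fields that are filled in. The sketch below is not the evaluator's actual algorithm: the key names and the linear mapping onto the 1-5 range are assumptions made for the example.

```python
# Fields taken from the list above; the key names are hypothetical.
REQUIRED_FIELDS = [
    "update_information", "temporal_information", "geo_information",
    "title", "description", "language", "owner", "keywords",
    "license", "website", "contact",
]


def semantic_completeness(metadata: dict) -> float:
    """Return the fraction of required fields that hold a non-empty value."""
    filled = sum(1 for field in REQUIRED_FIELDS if metadata.get(field))
    return filled / len(REQUIRED_FIELDS)


def semantic_level(metadata: dict) -> int:
    """Map completeness onto the 1-5 range (assumed linear mapping)."""
    return 1 + round(4 * semantic_completeness(metadata))
```

Under these assumptions, a dataset with no metadata gets level 1 and a dataset with every field filled in gets level 5.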
Format level: This index gives information about the type and number of resources contained in a dataset. It is based on the 5-star scheme for classifying resources. A dataset with multiple resources (e.g. RDF files, JSON and so on) gets a higher format level.
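The 5-star open data scheme ranks data from one star (any format under an open license) through structured, non-proprietary and URI-based formats up to five stars (linked data). A rough sketch of a format score built on that idea follows. The format-to-star table and the one-point bonus for offering several formats are assumptions made for the example, not the evaluator's documented rules; the fifth star depends on linking, which a file format alone cannot show, so no single format maps to it here.

```python
from typing import Iterable

# Assumed mapping from file format to stars; unknown formats count as 1.
STARS_BY_FORMAT = {
    "pdf": 1,
    "xls": 2, "xlsx": 2,
    "csv": 3, "json": 3, "xml": 3,
    "rdf": 4, "ttl": 4,
}


def format_level(resource_formats: Iterable[str]) -> int:
    """Score a dataset from the formats of its resources (1-5)."""
    formats = {f.strip().lower() for f in resource_formats if f and f.strip()}
    if not formats:
        return 1
    best = max(STARS_BY_FORMAT.get(f, 1) for f in formats)
    # Hypothetical rule: one extra point when several distinct formats exist.
    bonus = 1 if len(formats) > 1 else 0
    return min(5, best + bonus)
```

With this sketch, a dataset offering only a CSV file scores 3, while one offering both CSV and RDF scores 5.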
Internationalization level: This index evaluates the number of languages in which a dataset is described. Describing a dataset in multiple languages improves its usability. Datasets described in English plus other languages get a higher internationalization score.
Resources level: This value is obtained by checking the semantic level of each resource and its format level. To obtain a high value for this index, the following fields must be filled in for the resources: title, description, license, download URL and format.
Update level: This index evaluates whether a dataset has its own metadata field named "update frequency" and whether the dataset is updated accordingly. A value of 0 means that the dataset has not been evaluated yet; in that case the value does not affect the overall evaluation.
How does the evaluation process work?
Every evaluated index has a value in the 1-5 range. Combining these scores assigns the dataset to one of five evaluation categories:
- poor
- lacking
- fairly good
- good
- excellent
Each category gives users an idea of the dataset's quality. The evaluator checks datasets periodically, so the evaluation can change over time.
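The text does not state how the five scores are combined. As a minimal sketch, assuming a plain average rounded half up, the mapping onto the five categories could look like the following. The function name and the averaging rule are hypothetical; only the category names and the rule that an update level of 0 is ignored come from the description above.

```python
CATEGORIES = ["poor", "lacking", "fairly good", "good", "excellent"]


def overall_category(semantic: int, format_: int, i18n: int,
                     resources: int, update: int) -> str:
    """Combine the index scores into one category (assumed: simple mean)."""
    scores = [semantic, format_, i18n, resources]
    if update != 0:  # 0 means "not evaluated yet", so it is left out
        scores.append(update)
    mean = sum(scores) / len(scores)
    rounded = int(mean + 0.5)  # round half up; mean stays within 1-5
    return CATEGORIES[rounded - 1]
```

For example, under this assumption the scores 4, 4, 4, 4 with an update level of 0 give "good", while the same scores with an update level of 1 drop to "fairly good".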
Data Evaluator on the Data Workspace
Assuming that the Data Workspace has already been populated with datasets, the user can browse the list of datasets, for example by tag.
Each dataset is shown with a label carrying its quality evaluation. In the following picture, two datasets have been evaluated as "excellent" (1)(2) and two others as "fairly good" (3)(4).
Clicking on a label opens the detailed evaluation: