DaViz - Data Visualization wizard

Goal

To provide a web feature on the EEA data service for uploading and visualising small-to-medium data sets (max 5000 rows) in a few steps, without requiring advanced programming skills or the installation of any client software. The data service user (data provider) will be more independent, able to quickly generate interactive maps and applications from any location without depending on IT staff or advanced desktop tools. Unlike other EEA data service tools, DaViz must make data visualisation as easy a task as creating a chart in Excel or Google Docs.

Project name

The initial project name for this tool is DaViz, derived from "data visualization".

Implementation

The implementation of this product is called DaViz: http://plone.org/products/eea.daviz

Introduction

This project aims to create a data visualisation wizard for the data service system: a fully through-the-web (TTW) solution that lets users visualise data and create rich interactive web applications in a few steps.

The data visualisation wizard will ease:
  1. uploading data (e.g. Excel, XML/RDF) or connecting to online data (Google Docs, SPARQL endpoint)
  2. configuring facets, i.e. which data columns to show
  3. defining the visualisation (table, map, charts)
  4. exporting and sharing data and visualisations, so they can be embedded in other sites by simple copy and paste
  5. in the long term, combining the data with other similar data from the data service, the GIS infrastructure or Reportnet.

All the steps are done through the web. Anybody will be able to upload data, create visualisations and submit them to the data service administrator for publication. EEA staff and ETC will be the first intended users.

DaViz aims to assist in creating simple yet powerful data visualisation applications (interactive viewers and maps) for very small (10 records) to medium (5000 records) data sets. The outputs are produced with the existing client-side technology Exhibit.
The intended users of DaViz need no programming skills.

Existing similar applications

  1. Potluck

Use cases

Data administrator
  1. uploads data from a PC or connects to existing online data (SPARQL endpoint, existing CSV/TSV files, data catalogue services)
  2. previews the data
  3. configures which facets (columns) to show
  4. decides on the visualisations: table, map, timeline, chart
Web visitor
  1. interacts with the data via facets
  2. suggests other sources
  3. views the data on a map
  4. views the data as a timeline
  5. views the data as a table
  6. exports the data
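
To make the data administrator's steps 3 and 4 more concrete, the choices could be kept in a small configuration record. The following is only a sketch: the class name VisualizationConfig and its methods are hypothetical and not part of any existing DaViz code; only the four view names come from the use case above.

```python
from dataclasses import dataclass, field

# View names taken from the data administrator use case.
ALLOWED_VIEWS = ("table", "map", "timeline", "chart")


@dataclass
class VisualizationConfig:
    """Hypothetical record of which facets and views are enabled."""

    columns: list                                 # column names, in data order
    hidden: set = field(default_factory=set)      # columns not shown as facets
    views: list = field(default_factory=lambda: ["table"])

    @property
    def facets(self):
        # Facets keep the column order of the data.
        return [name for name in self.columns if name not in self.hidden]

    def hide_facet(self, name):
        if name not in self.columns:
            raise ValueError("unknown column: %s" % name)
        self.hidden.add(name)

    def show_facet(self, name):
        # A hidden facet can be brought back later.
        self.hidden.discard(name)

    def enable_view(self, view):
        if view not in ALLOWED_VIEWS:
            raise ValueError("unknown view: %s" % view)
        if view not in self.views:
            self.views.append(view)
```

Storing the hidden columns rather than the visible ones means a facet that is removed and re-added returns to its original position.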

Technical implementation overview

The tool will be an add-on product for Plone. It should be usable with any content type that has a file attachment (the Excel data). Typical content types that will use it are EEADataFile and EEAFigureFile.

For the visualisation part we will use a ready-made open source solution, MIT Simile Exhibit.
Exhibit is very simple but still requires the user to know HTML. Our goal is to remove even this last technical obstacle for the common user.

The main work and challenge for DaViz is the server-side transformation of Excel to JSON, and letting the user configure the result TTW in a few steps without writing HTML.

There are tools that already do some kind of on-the-fly data-to-JSON-to-Exhibit transformation, such as Simile Babel, but their output is very simple. DaViz will use a Babel service plus additional steps to obtain better results, which can be used as final output.
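
As an illustration of the server-side transformation discussed in this section, here is a minimal sketch that turns tab-separated text into the kind of JSON Exhibit reads (a top-level "items" list in which every item has a "label"). The function name tsv_to_exhibit_json and the rule of copying the first column into "label" are assumptions made for this example. It uses the standard-library json module, whose dumps call matches simplejson's.

```python
import csv
import io
import json


def tsv_to_exhibit_json(tsv_text, label_column=None):
    """Convert tab-separated text into an Exhibit-style JSON string.

    The first non-empty row supplies the property names and every
    following row becomes one item. One column is copied into "label"
    (the first column unless label_column says otherwise).
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    # Ignore rows that contain no values at all.
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    if not rows:
        return json.dumps({"items": []})

    header = [name.strip() for name in rows[0]]
    label_column = label_column or header[0]

    items = []
    for row in rows[1:]:
        # Leave out empty cells so the item only lists known values.
        item = {name: value.strip()
                for name, value in zip(header, row) if value.strip()}
        item["label"] = item.get(label_column, "")
        items.append(item)
    return json.dumps({"items": items}, indent=2)
```

All values stay strings in this sketch; converting them to numbers or dates would be a separate step driven by the user's column configuration.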

Sample of technologies to integrate

Server side
  1. eea.dataservice
  2. Python Excel and tab-separated conversion tools
  3. Python simplejson library
  4. http://triplify.org
Client side
  1. MIT Simile Exhibit
  2. jQuery and jQuery Tools
  3. Google Visualization API
  4. Potluck

Sample workflow

Workflow when a user adds an Excel file to DaViz:

  1. User uploads an Excel file to the data service
  2. System stores the original Excel file
  3. System converts the Excel file to tab-separated text
  4. System determines column labels and formats by
    1. removing empty rows at the beginning
    2. showing the first row to the user
    3. letting the user give alternative labels to columns if necessary
    4. letting the user specify which columns contain number, date, boolean, text, id or label values, if this is not present in the original Excel file
  5. Exhibit preview is created by
    1. exporting the tab-separated data to JSON, using the first row as property names
    2. displaying all facets in the same order as the properties in the first row
    3. using the table view for the preview
  6. In preview mode, user decides which facets to remove (they can be added back later)
  7. In preview mode, user enables additional views such as timeline, map etc.
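
The sub-steps of step 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the DaViz implementation: the names CONVERTERS, drop_leading_empty_rows and build_items are hypothetical, dates are assumed to be written as YYYY-MM-DD, and the "id" and "label" column types are left out for brevity.

```python
from datetime import datetime

# Hypothetical converters for the value types a user can assign to a column.
CONVERTERS = {
    "number": float,
    # Assumes ISO dates (YYYY-MM-DD); other formats would need more work.
    "date": lambda text: datetime.strptime(text, "%Y-%m-%d").date().isoformat(),
    "boolean": lambda text: text.strip().lower() in ("true", "yes", "1"),
    "text": str,
}


def drop_leading_empty_rows(rows):
    """Skip rows at the top of the sheet that contain no values."""
    start = 0
    while start < len(rows) and not any(cell.strip() for cell in rows[start]):
        start += 1
    return rows[start:]


def build_items(rows, labels=None, column_types=None):
    """Label the columns and convert each cell to its configured type.

    labels maps an original column header to the user's alternative
    label; column_types maps a final label to a key of CONVERTERS.
    """
    rows = drop_leading_empty_rows(rows)
    if not rows:
        return []
    labels = labels or {}
    column_types = column_types or {}
    # The first remaining row holds the headers, renamed where requested.
    names = [labels.get(cell, cell) for cell in rows[0]]
    items = []
    for row in rows[1:]:
        item = {}
        for name, cell in zip(names, row):
            if cell == "":
                continue  # empty cells are left out of the item
            convert = CONVERTERS[column_types.get(name, "text")]
            item[name] = convert(cell)
        items.append(item)
    return items
```

Columns without a configured type fall back to text, so the preview can be built before the user has described every column.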

We need more brainstorms and pictures to sketch the interfaces used in the workflow above.

Examples of DaViz excel data input and DaViz output

Excel input -> DaViz output

Excel Input -> DaViz output

To do: add more examples of EEA Excel files used in indicators, which can be used for testing the tool.

Links

Exhibit and shape files

Exhibit: Lightweight Structured Data Publishing

Demo: Interactive Faceted Browser for Earthquake Data

Bachelor thesis: Creating interactive web pages using the Exhibit framework

#6855 Charts with ajax/javascript

Python library: simplejson (already in the buildout, so we can make use of it)

earthquakes.xls (34 KB) Antonio De Marinis, 23 Oct 2009 15:02

millionairs.xls (27.5 KB) Antonio De Marinis, 23 Oct 2009 15:03