Interactive Data Visualizer

High-Level Project Summary

For a curious mind, knowledge is not a passion; it is a necessity. We at The Black Sheep have restless minds, forever in pursuit of extending our knowledge. That is why we designed the Interactive Data Visualizer, a comprehensive application that brings the vast knowledge of the collective human mind to your fingertips. It gives every budding scientist, researcher, or simply curious person access to cutting-edge research data. Finding datasets has never been this effortless!

Detailed Project Description

Project GitHub Repository

ChristoJobyAntony/nasa-oinv (github.com)


Abstract

Our project, "Interactive Data Visualizer", is an open-source web application that searches through thousands of datasets in the NASA Open Data Portal and displays an interactive network visualization that sketches the relationships among these datasets.


Pre-processing

We utilized the following raw dataset:




We parsed the data using Python in accordance with the schema provided. From this data we identified the following attributes as potential relationships:



  • Agency
  • Bureau
  • Program
  • Publisher
  • Contact Point
  • Keyword
  • Themes
  • Identifier
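
As a rough illustration of this step, the sketch below pulls the linkable attributes out of one dataset record. It assumes the records follow the Project Open Data metadata field names (bureauCode, programCode, keyword, theme, publisher.name, contactPoint.fn, identifier); the function name and the sample record are hypothetical and are not taken from the project's code.

```python
from __future__ import annotations

from typing import Any

# Field names assumed from the Project Open Data metadata schema.
LIST_FIELDS = ("bureauCode", "programCode", "keyword", "theme")


def extract_relations(record: dict[str, Any]) -> dict[str, list[str]]:
    """Collect the values that can link one dataset to other datasets."""
    relations: dict[str, list[str]] = {}
    for field in LIST_FIELDS:
        value = record.get(field) or []
        # Tolerate a single string where a list is expected.
        relations[field] = [value] if isinstance(value, str) else list(value)

    # Publisher and contact point are nested objects; keep only their names.
    publisher = (record.get("publisher") or {}).get("name")
    relations["publisher"] = [publisher] if publisher else []
    contact = (record.get("contactPoint") or {}).get("fn")
    relations["contactPoint"] = [contact] if contact else []

    identifier = record.get("identifier")
    relations["identifier"] = [identifier] if identifier else []
    return relations


# Hypothetical record, for illustration only.
sample_record = {
    "identifier": "EXAMPLE-0001",
    "title": "Example dataset",
    "keyword": ["moon", "craters"],
    "bureauCode": ["026:00"],
    "publisher": {"name": "Example Publisher"},
}
sample_relations = extract_relations(sample_record)
```

Missing attributes simply produce empty lists, so records with incomplete metadata can still be loaded.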




The Database

The parsed data was loaded into a Neo4j graph database using Python. A full-text search index was then created over all nodes, based on their name and description attributes. This allows the database to respond faster to search queries that span the whole database.
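
A minimal sketch of how such an index could be declared and queried is shown below. It only builds the Cypher text, using the CREATE FULLTEXT INDEX syntax of recent Neo4j releases and the db.index.fulltext.queryNodes procedure; the index name and node labels are assumptions made for illustration, not the project's actual names.

```python
from __future__ import annotations


def fulltext_index_statement(
    index_name: str, labels: list[str], properties: list[str]
) -> str:
    """Build a Cypher statement that creates a full-text index over nodes.

    Index names and labels cannot be passed as query parameters, so they
    are interpolated here; only pass trusted, hard-coded values.
    """
    label_part = "|".join(labels)
    prop_part = ", ".join(f"n.{prop}" for prop in properties)
    return (
        f"CREATE FULLTEXT INDEX {index_name} IF NOT EXISTS "
        f"FOR (n:{label_part}) ON EACH [{prop_part}]"
    )


# Parameterised search against the index; $index, $term and $limit are
# supplied by the caller at run time.
SEARCH_QUERY = (
    "CALL db.index.fulltext.queryNodes($index, $term) "
    "YIELD node, score "
    "RETURN node.name AS name, score ORDER BY score DESC LIMIT $limit"
)

# Hypothetical index name and labels.
create_statement = fulltext_index_statement(
    "nodeSearch", ["Dataset", "Keyword", "Publisher"], ["name", "description"]
)
```

Both strings can be passed to a session's run method in the official Neo4j Python driver, with the search parameters supplied as keyword arguments.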


We chose a graph database for the following reasons:


  • Scalable: Because Neo4j is a schema-less database, it is inherently suited to handling varying data structures.
  • Relationships: A core goal of our application is to conveniently create and query relationships, which graph databases store as first-class elements.
  • Quick response: Despite being schema-less, Neo4j answers relationship queries quickly because connections are stored natively and traversed directly rather than computed through joins. It is also ACID compliant.



Server Side

The server side of our project is written in Node.js and TypeScript and uses the Express framework. The server handles API requests from the client and queries the Neo4j database. It has the following endpoints:


  • /Node/getAllRelations - Gets all relations of a dataset
  • /Node/info - Gets the data of a specific dataset
  • /Node/search - Searches the database for matching datasets
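
To show how a client might address these endpoints, here is a small sketch that only composes request URLs. The base URL, the use of GET-style query strings, and the parameter names (id, q) are all assumptions made for illustration; the actual request format is defined by the server code in the repository.

```python
from urllib.parse import urlencode

# Hypothetical address of a locally running server.
BASE_URL = "http://localhost:3000"


def build_url(endpoint: str, **params: str) -> str:
    """Join the base URL, an endpoint path and URL-encoded query parameters."""
    query = urlencode(params)
    return f"{BASE_URL}{endpoint}?{query}" if query else f"{BASE_URL}{endpoint}"


# Parameter names below are assumed, not taken from the project.
search_url = build_url("/Node/search", q="lunar craters")
info_url = build_url("/Node/info", id="EXAMPLE-0001")
relations_url = build_url("/Node/getAllRelations", id="EXAMPLE-0001")
```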



Front End

The client side of our project is written in TypeScript with the Vue.js framework. It uses the Cytoscape.js library for the interactive data visualization and the Materialize CSS framework for the user interface.



The Application

On start-up, the app loads a network visualization of a random dataset. Each node on the graph represents either a dataset or a data point (Agency, Publisher, Keyword, Contact Point, Bureau, Program, Theme). The edges represent the attributes that connect the nodes. The user can click on any node to view detailed information about it. By default the network is laid out in concentric circles, but the layout can be changed by simply dragging any node. Holding on a node or clicking the "View relations" button shows the relationships of that node with other datasets. Clicking on the title or on the "Check it out!" button takes you to the dataset landing page. You can also use the search box to search for different datasets.
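
The node-and-edge structure described above can be sketched as follows. The output uses the element format that Cytoscape.js accepts (a data object with an id for nodes, plus source and target for edges); the function name, the id scheme, and the sample values are hypothetical and are not the project's own.

```python
from __future__ import annotations


def to_cytoscape_elements(
    dataset_id: str, title: str, relations: dict[str, list[str]]
) -> list[dict]:
    """Turn one dataset and its attributes into Cytoscape.js elements."""
    elements = [{"data": {"id": dataset_id, "label": title, "type": "dataset"}}]
    seen = {dataset_id}
    for relation, values in relations.items():
        for value in values:
            node_id = f"{relation}:{value}"
            edge_id = f"{dataset_id}->{node_id}"
            if edge_id in seen:
                continue  # skip repeated values of the same attribute
            if node_id not in seen:
                seen.add(node_id)
                elements.append(
                    {"data": {"id": node_id, "label": value, "type": relation}}
                )
            seen.add(edge_id)
            elements.append(
                {
                    "data": {
                        "id": edge_id,
                        "source": dataset_id,
                        "target": node_id,
                        "label": relation,
                    }
                }
            )
    return elements


# Hypothetical dataset, for illustration only.
graph = to_cytoscape_elements(
    "EXAMPLE-0001",
    "Example dataset",
    {"keyword": ["moon", "craters"], "publisher": ["Example Publisher"]},
)
```

Serialized as JSON, a list like this can be handed to Cytoscape.js as its elements, and the library's built-in concentric layout then produces the default circular arrangement.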

Space Agency Data

Datasets

This was the main dataset our application used to construct its database and draw relations from.


Schema

The dataset above was interpreted and parsed in adherence to this schema.


Bureau Code

We utilized this database to associate a meaningful name with the 'bureauCode' attribute and to identify the agency responsible.


Program Code

We utilized this database to associate a meaningful name with the 'programCode' attribute and to identify the agency responsible.
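
The lookup for both codes can be sketched as below, assuming the codes use the "agency:bureau" and "agency:program" format of the Project Open Data schema (for example "026:00"). The lookup table holds a single illustrative row; in practice it would be filled from the code lists referenced above, and the function names are hypothetical.

```python
from __future__ import annotations


def split_code(code: str) -> tuple[str, str]:
    """Split 'agency:unit' into its agency part and bureau/program part."""
    agency, _, unit = code.partition(":")
    return agency, unit


# Illustrative row only; a real table would be loaded from the code lists.
BUREAU_NAMES = {
    ("026", "00"): "National Aeronautics and Space Administration",
}


def bureau_name(
    code: str, table: dict[tuple[str, str], str] = BUREAU_NAMES
) -> str:
    """Return the readable name for a code, or the raw code if unknown."""
    return table.get(split_code(code), code)


known = bureau_name("026:00")
unknown = bureau_name("999:99")  # falls back to the raw code
```

The same pattern applies to 'programCode' with a table keyed by agency and program number.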

Hackathon Journey

By participating in the world's biggest hackathon challenge, we were able to take away a lot of knowledge and experience. We had the opportunity to learn, grow, share, and inspire. With our software, users save the time they would otherwise spend searching for a variety of datasets and have those datasets accessible within seconds.


CHALLENGES WE FACED:

  • One of the biggest challenges our team encountered was obtaining the data and formatting it so that it was usable for building a database.
  • Converting the datasets into a manageable format that can be understood by somebody with no background in research.
  • Finding the correct relationships and nodes to create a suitable ontology.


OUR APPROACH:

  • Create a schema to be followed throughout the development of the web application.
  • Identify the connections and data points.
  • Think of an ontology that works for the available datasets.
  • Determine a software stack for effective development of the web app.

We created an open-source, click-and-visualize web application that concisely represents all the datasets we obtained from the publicly available databases provided at NASADATA.GOV.

The need for access to credible and reliable sources of information is what prompted us to choose this challenge. The importance of using coherent sources ultimately comes down to effective communication.

References

Technical Tools


Dataset Resources

Tags

#datasets #ontology #nasa #network #relationships #nodes #researches #data

Global Judging

This project has been submitted for consideration during the Judging process.