Challenge

Ontologies and Interactive Network Visualizations

Summary

Tens of thousands of NASA data sets are publicly available online, but with so many files available, how can potential users determine which ones will meet their needs? Your challenge is to (1) create an ontology to integrate descriptions of disparate NASA data sets, and (2) develop an interactive network visualization to depict relationships among those data sets.

Details

Background

The NASA Open Data Portal provides a catalog of 42,946 data sets and 555 code repositories, and links to open-innovation sites, NASA science archives, and code and data from numerous U.S. federal agencies. Many of the data set collections are described in JavaScript Object Notation (JSON) files.

Given the growing volume of NASA data, users can have a hard time knowing where to start or which data sets might meet their needs. The problem is compounded because different data sets are organized and listed in different ways. NASA is creating an internal metadata catalog that would act in many ways like data.nasa.gov does publicly. Both are primarily catalogs of metadata, not data. The data.nasa.gov metadata follows the Project Open Data standard and is available as a single JSON file at data.nasa.gov/data.json. What types of data visualizations or user interfaces can you come up with that save users time and frustration by connecting them with the right data sets faster and more accurately?
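As a minimal sketch of reading such a catalog, the Python below parses a small JSON document laid out like a Project Open Data catalog (a top-level "dataset" list whose entries carry fields such as "identifier", "title", "keyword", and "landingPage"). The two sample entries, their titles, and their URLs are invented for illustration, and the function name summarize_catalog is hypothetical; a real catalog should be checked against the published schema before relying on any field.

```python
import json

# Hypothetical two-entry catalog in the Project Open Data layout.
# Titles, keywords, and URLs are invented for illustration only.
SAMPLE_CATALOG = """
{
  "dataset": [
    {
      "identifier": "example-001",
      "title": "Example Sea Surface Temperature Grid",
      "keyword": ["Ocean", " temperature "],
      "landingPage": "https://example.org/datasets/example-001"
    },
    {
      "identifier": "example-002",
      "title": "Example Lunar Crater Catalog",
      "keyword": ["moon", "craters"]
    }
  ]
}
"""


def summarize_catalog(catalog_text):
    """Return one small dict per data set entry in a data.json-style catalog."""
    catalog = json.loads(catalog_text)
    summaries = []
    for entry in catalog.get("dataset", []):
        summaries.append({
            "id": entry.get("identifier"),
            "title": entry.get("title", "(untitled)"),
            # Normalize keywords so later comparisons ignore case and padding.
            "keywords": sorted({k.strip().lower() for k in entry.get("keyword", [])}),
            # Optional field: stays None when the entry does not provide it.
            "url": entry.get("landingPage"),
        })
    return summaries


summaries = summarize_catalog(SAMPLE_CATALOG)
for s in summaries:
    print(s["id"], "|", s["title"], "|", ", ".join(s["keywords"]))
```

Using .get() with defaults keeps the loop from failing on entries that omit optional fields, which is likely when descriptions come from many different publishers.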

An ontology defines terminology, organizes classes of things into a taxonomy, describes types of relationships among things, and specifies the attributes of things. The relationships among these things can be depicted as a network consisting of shapes and the lines that connect them. Each shape represents a node (or thing), and each line depicts a relationship between the connected nodes (or things). Free, open-source software is available for developing ontologies and network visualizations.
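The parts of an ontology named above can be sketched as a small data structure. This is only an illustration using the Python standard library, not a substitute for an ontology editor or a format such as RDF; the class names (DataSet, EarthScienceDataSet, OceanDataSet) and the property names (sharesKeywordWith, title) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    # Taxonomy: each class name maps to its parent class (None for a root).
    parents: dict = field(default_factory=dict)
    # Relationship types (object properties): name -> (domain class, range class).
    object_properties: dict = field(default_factory=dict)
    # Attributes (data properties): name -> (domain class, value type).
    data_properties: dict = field(default_factory=dict)

    def add_class(self, name, parent=None):
        """Register a class, optionally beneath an already registered parent."""
        if parent is not None and parent not in self.parents:
            raise ValueError(f"unknown parent class: {parent}")
        self.parents[name] = parent

    def ancestors(self, name):
        """Walk up the taxonomy from a class to its root."""
        chain = []
        current = self.parents[name]
        while current is not None:
            chain.append(current)
            current = self.parents[current]
        return chain


# Hypothetical classes and properties, chosen only to illustrate the structure.
onto = Ontology()
onto.add_class("DataSet")
onto.add_class("EarthScienceDataSet", parent="DataSet")
onto.add_class("OceanDataSet", parent="EarthScienceDataSet")
onto.object_properties["sharesKeywordWith"] = ("DataSet", "DataSet")
onto.data_properties["title"] = ("DataSet", str)

print(onto.ancestors("OceanDataSet"))  # prints ['EarthScienceDataSet', 'DataSet']
```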

Objectives

Your challenge is to (1) create an ontology to integrate descriptions of disparate, publicly available NASA data sets, and (2) develop an interactive network visualization to depict relationships among those data sets. Objectives of this challenge include specifying a taxonomy of data set classes, integrating terminology, defining relationships (also known as object properties), and identifying attributes (also known as data properties) of data sets in open data catalogs. You may then apply the ontology to define a network wherein data sets are nodes and the notional relationships among them are the links between nodes. The network should be visualized as an interactive diagram or 3D model embedded in a web page.
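One possible way to turn data set descriptions into such a network is to link any two data sets that share at least one key word. The sketch below assumes that rule, which is only one of many notional relationships a team might choose; the data set identifiers and key words are hypothetical.

```python
from itertools import combinations

# Hypothetical data set identifiers mapped to their key words.
DATASETS = {
    "sst-grid": {"ocean", "temperature", "climate"},
    "sea-ice": {"ocean", "ice", "climate"},
    "crater-catalog": {"moon", "craters"},
}


def build_network(datasets):
    """Return (nodes, edges): an edge joins two data sets with a shared key word."""
    nodes = sorted(datasets)
    edges = []
    for a, b in combinations(nodes, 2):
        shared = datasets[a] & datasets[b]
        if shared:
            edges.append({"source": a, "target": b, "shared": sorted(shared)})
    return nodes, edges


nodes, edges = build_network(DATASETS)
for edge in edges:
    print(edge["source"], "--", edge["target"], "via", ", ".join(edge["shared"]))
```

Comparing every pair grows quadratically with the number of data sets, so for a catalog of tens of thousands of entries an index from key word to data sets would likely be a better starting point.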

Specific activities associated with this challenge include:

  1. Exploring open data sets at NASA and other U.S. government agencies, as well as free and open data sets from international space agencies.
  2. Identifying complementary data sets, e.g., Earth science data sets, population densities, demographics, and pollution.
  3. Obtaining the JSON description of those data sets. Viewing the source of a web page can reveal links to JSON data.
  4. Creating an ontology or a master JSON structure for integrating disparate data set descriptions.
  5. Visualizing the ontology or integrated JSON data as an interactive network. Relationships could be based on, for example, shared locations, sources, or key words.
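Activity 4 in the list above, building a master JSON structure, can be sketched as a field mapping from each source catalog to one common layout. Everything named here is an assumption made for illustration: the catalog labels catalog_a and catalog_b, the source field names, and the master fields title, keywords, and url.

```python
import json

# Hypothetical mappings: master field -> field name used by each source catalog.
FIELD_MAPS = {
    "catalog_a": {"title": "title", "keywords": "keyword", "url": "landingPage"},
    "catalog_b": {"title": "name", "keywords": "tags", "url": "link"},
}


def to_master(record, source):
    """Copy a source record into the common layout, recording where it came from."""
    master = {"source": source}
    for master_field, source_field in FIELD_MAPS[source].items():
        master[master_field] = record.get(source_field)
    # Guarantee a list so downstream code can iterate without checking for None.
    master["keywords"] = list(master["keywords"] or [])
    return master


# Invented records standing in for two differently structured descriptions.
records = [
    to_master({"title": "Example Aerosol Index", "keyword": ["air quality"],
               "landingPage": "https://example.org/a/1"}, "catalog_a"),
    to_master({"name": "Example Population Density", "tags": ["demographics"]},
              "catalog_b"),
]

master_json = json.dumps({"datasets": records}, indent=2)
print(master_json)
```

Keeping the mapping in a table rather than in code means a new catalog can be added by describing its field names, without changing the conversion function.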

An ideal network visualization will be interactive; e.g., the nodes will be linked to the web page that describes the data set. Your interactive network should be integrated into a web page.
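A very small form of this interactivity is a node that links to its data set page. The sketch below writes an SVG string in which each node is wrapped in a link; it assumes a simple circular layout, a hypothetical function name network_to_svg, and invented node identifiers and URLs. It uses the plain href attribute on the SVG link element, which current browsers accept; older viewers may expect xlink:href instead.

```python
import math
from xml.sax.saxutils import escape, quoteattr


def network_to_svg(nodes, edges, size=400):
    """nodes: {id: url}; edges: [(id, id)]. Returns an SVG document as a string."""
    center, radius = size / 2, size * 0.35
    ids = sorted(nodes)
    # Place the nodes evenly around a circle.
    pos = {}
    for i, node_id in enumerate(ids):
        angle = 2 * math.pi * i / len(ids)
        pos[node_id] = (center + radius * math.cos(angle),
                        center + radius * math.sin(angle))
    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">']
    # Draw the lines first so the node shapes sit on top of them.
    for a, b in edges:
        (x1, y1), (x2, y2) = pos[a], pos[b]
        parts.append(f'<line x1="{x1:.1f}" y1="{y1:.1f}" '
                     f'x2="{x2:.1f}" y2="{y2:.1f}" stroke="gray"/>')
    # Wrap each node in a link to the page that describes the data set.
    for node_id in ids:
        x, y = pos[node_id]
        parts.append(
            f'<a href={quoteattr(nodes[node_id])}>'
            f'<circle cx="{x:.1f}" cy="{y:.1f}" r="12" fill="steelblue"/>'
            f'<text x="{x:.1f}" y="{y + 26:.1f}" text-anchor="middle" '
            f'font-size="10">{escape(node_id)}</text>'
            f'</a>')
    parts.append('</svg>')
    return "\n".join(parts)


# Hypothetical nodes and URLs for illustration.
svg = network_to_svg(
    {"sst-grid": "https://example.org/sst-grid",
     "sea-ice": "https://example.org/sea-ice",
     "crater-catalog": "https://example.org/crater-catalog"},
    [("sea-ice", "sst-grid")],
)
print(svg)
```

The resulting string can be saved as an .svg file or pasted into the body of an HTML page. Richer behavior such as dragging, zooming, or expanding nodes would need JavaScript or a visualization library.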

Potential Considerations

You may (but are not required to) consider the following when developing your ontology and visualization:

  • Popular code repositories offer free hosting of web pages that can include JavaScript, which can parse JSON data.
  • Your interactive network could be hierarchical; clicking on a node could open a lower-level detailed network. A web application could display a taxonomy in a navigation pane, and another pane could present the interactive network of data sets that exist within a selected class. Nodes could be color-coded, or various icons could be used as node shapes, to indicate different categories. Scalable Vector Graphics (SVG) could depict a network, and the shapes could have embedded Uniform Resource Locators (URLs). A file in GLB, the binary version of the GL Transmission Format (glTF), could be embedded in a web page and displayed via a viewer, which would enable the creation of an interactive 3D network.
  • This challenge requires knowledge in ontology development and skills in data visualization. When recruiting team members, consider seeking ontologists as well as web application developers who are familiar with data visualization code libraries. Research free and open-source applications and code libraries.
  • The Example Resources include links to a NASA Scientific and Technical Information (STI) thesaurus, a NASA Taxonomy at the Library of Congress, a Planetary Data System (PDS) ontology expressed in the Resource Description Framework (RDF) format, and subjects from the NASA taxonomy specified in a Simple Knowledge Organization System (SKOS) ontology. The thesaurus, taxonomy, and ontologies can serve as examples of the types of products that could be generated in this challenge. Useful search terms may include “Labeled Property Graphs” (LPG), “Typed Property Graphs” (TPG), and “RDF to LPG conversion.”
  • Security policies and regulations prohibit NASA personnel and contractors from downloading executable code to their computers. Ontology documentation and a network diagram with embedded hyperlinks could be delivered as a Portable Document Format (PDF) file. Ontology editors can export web-based documentation, and code libraries can produce interactive, collapsible and expandable networks within a web application. Web-based documentation and web applications ought to be deployed on a web server; free web hosting services are available.
  • Key words in the text above can serve as a starting point for your research online; inclusion of those key words in this challenge description does not constitute an endorsement.

For data and resources related to this challenge, refer to the Resources tab at the top of the page. More resources may be added before the hackathon begins.

NASA does not endorse any non-U.S. Government entity and is not responsible for information contained on non-U.S. Government websites.