TripleStore

From Evolutionary Interoperability and Outreach

Main Goals

  • Understand AllegroGraph WebView, including the semantics it provides for various RDFS and OWL constructs and how to use it for geospatial reasoning.
  • Implement use cases involving GBIF occurrence data and biodiversity inventories of the world's protected areas.
  • Provide a triple store implementation that supports the use cases being generated by one of the other groups.


Outcomes

Server instance
We installed Franz's AllegroGraph WebView on http://eb3.cs.umbc.edu:8080
Username/passwords are available from jsachs.cs.umbc.edu

AG WebView is a free triplestore with support for geospatial data and reasoning. It is the only triplestore that supports standard geospatial queries like point_within_polygon?, points_within_radius?, etc. One of our goals was to understand the extent of this support.
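To make the radius style of query concrete, here is a minimal Python sketch of what a points-within-radius test computes. It is not AllegroGraph's implementation or API: the function names, the (lat, lon) tuple layout, and the mean Earth radius constant are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def points_within_radius(points, center, radius_km):
    """Return the (lat, lon) points lying within radius_km of center."""
    clat, clon = center
    return [p for p in points
            if haversine_km(p[0], p[1], clat, clon) <= radius_km]
```

A triplestore would answer the same question through a spatial index rather than a linear scan; the sketch only pins down the semantics of the query.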

Documentation and code patches
Working through the documentation, we were unable to replicate all the functionality described in the official tutorial. Email communication with the developers revealed that Franz is aware of the shakiness of the still-nascent geospatial functionality. Two of our outcomes are therefore:

  • documentation of the steps necessary to import points and polygons into AG.
  • a Ruby script which parses a KML file and creates a Lisp program that loads the vertices into AG.
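The parsing half of such a script can be sketched as follows. This is a Python illustration, not the Ruby script itself, and it stops at extracting vertices: the Lisp calls that load them into AG are specific to AllegroGraph and are not reproduced here. The function name kml_polygon_vertices is hypothetical. The sketch relies on the KML convention that each coordinate tuple is written as lon,lat[,alt], and it returns one vertex list per coordinates element, so a file that also contains points or line strings would need extra filtering.

```python
import xml.etree.ElementTree as ET


def kml_polygon_vertices(kml_text):
    """Return one list of (lon, lat) vertices per <coordinates> element."""
    root = ET.fromstring(kml_text)
    polygons = []
    for elem in root.iter():
        # Compare the local tag name so any KML namespace version matches.
        if elem.tag.rsplit("}", 1)[-1] != "coordinates" or not elem.text:
            continue
        vertices = []
        for token in elem.text.split():
            lon, lat = token.split(",")[:2]  # KML order is lon,lat[,alt]
            vertices.append((float(lon), float(lat)))
        polygons.append(vertices)
    return polygons
```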


Data
There is heavy variation in the parsing behaviour of RDF consumers. For example, there are RDF documents that validate with the W3C validator but cause rdflib to throw an exception, and vice versa. We verified that the following sources of data provide RDF which is correctly interpreted by AG:

  • GBIF occurrence records

The Darwin Core occurrence records served by GBIF violate RDF conventions in a few ways. Nevertheless, AG parses the data and transforms the records into appropriate triples (rdflib and the W3C validator both complain about the GBIF data).

  • Biodiversity inventories of the world's protected areas.

(Also validates with rdflib and the W3C validator.)

  • IUCN redlist.

(The W3C validator sees no triples; rdflib handles it fine.)

  • A small set of artificial data, which we loaded into AG. AG seems to properly support the semantics of rdfs:subClassOf, owl:sameAs, and owl:TransitiveProperty. Proper inferences are produced when reasoning is turned on.
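As an illustration of which inferences are meant, the following Python sketch applies the usual entailment patterns for rdfs:subClassOf, owl:sameAs, and owl:TransitiveProperty to a set of triples by naive forward chaining. It is not AG's reasoner and says nothing about how AG implements these semantics; the infer function and the plain-tuple triple representation are assumptions of the sketch, and owl:sameAs substitution is applied only to subjects and objects.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SAME_AS = "owl:sameAs"
TRANSITIVE = "owl:TransitiveProperty"


def infer(triples):
    """Naive forward chaining to a fixed point over (s, p, o) tuples."""
    facts = set(triples)
    while True:
        new = set()
        # subClassOf and sameAs are transitive, as is any declared property.
        transitive = {SUBCLASS, SAME_AS} | {
            s for s, p, o in facts if p == RDF_TYPE and o == TRANSITIVE
        }
        for s, p, o in facts:
            if p == SAME_AS:
                new.add((o, SAME_AS, s))  # symmetry
                for s2, p2, o2 in facts:  # substitute o wherever s appears
                    if s2 == s:
                        new.add((o, p2, o2))
                    if o2 == s:
                        new.add((s2, p2, o))
            if p in transitive:
                for s2, p2, o2 in facts:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
            if p == RDF_TYPE:  # instances inherit superclasses
                for s2, p2, o2 in facts:
                    if p2 == SUBCLASS and s2 == o:
                        new.add((s, RDF_TYPE, o2))
        if new <= facts:
            return facts
        facts |= new
```

Given the hypothetical facts that ex:simba has type ex:Lion, that ex:Lion is a subclass of ex:Felid, and that ex:Felid is a subclass of ex:Mammal, the closure contains the triple typing ex:simba as ex:Mammal. This is the kind of result we checked for with reasoning turned on.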


Follow-up

  • Command-line utility.

Input: pointer to a file of GBIF occurrence records, and a pointer to a KML file.
Output: list of all occurrence records bounded by the polygon.

  • A massive triplestore of biodiversity knowledge: occurrence records; biodiversity inventories of the world's protected areas; shapefiles for protected areas; shapefiles for countries and regions; food webs; species profiles; conservation status; invasiveness status; range maps; genomics data; etc.
  • A matrix of RDF constructs and idiosyncrasies, and the way they are interpreted by different parsers.
  • Working with Franz to improve their documentation and support for geospatial queries.
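The core of the command-line utility proposed above is a point-in-polygon filter. A minimal Python sketch using the standard ray-casting test is given below. It treats longitude and latitude as planar coordinates, which is only reasonable for polygons that are small and do not cross the antimeridian, and it leaves points exactly on the boundary undefined. The record layout (dicts with "lon" and "lat" keys) and both function names are hypothetical; real GBIF records would first have to be parsed into this shape.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a ray cast from the point toward +lon cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def records_in_polygon(records, polygon):
    """Keep the records whose (lon, lat) fall inside the polygon."""
    return [r for r in records
            if point_in_polygon(r["lon"], r["lat"], polygon)]
```

The polygon may be given either open or as a closed ring with the first vertex repeated at the end, since the resulting zero-length edge is never counted as a crossing.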

Resources

AllegroGraphTutorial