GeoWave aims at linking popular geospatial tools to “big data” technology

A very exciting project has been proposed at LocationTech (it’s in the Project Proposal Phase as defined in the Eclipse Development Process).  Simply put, GeoWave intends to do for “big data” databases (initially Apache Accumulo) what PostGIS does for SQL databases (PostgreSQL).  GeoWave is open source software (licensed under Apache 2.0) that adds support for geographic objects, multi-dimensional indexing and geospatial operators to Apache Accumulo.  

NoSQL databases

To deal with data volumes that are too large for traditional SQL databases, Google began developing “BigTable” in 2004.  BigTable is a compressed, high-performance, proprietary data storage system built on the Google File System, and it is used by a number of Google applications including Google Maps.  Apache Accumulo is a distributed database that is based on Google’s BigTable design and is built on top of Apache Hadoop and other Apache projects.  For putting geospatial data into a key/value store like Accumulo, the key concept is the “geospatial hash”, which converts a 2D, 3D or 4D coordinate (lon and lat; lon, lat and elevation; or lon, lat, elevation and time) to an integer index, such as a quadtree or R-Tree index, that can be used to order and rapidly retrieve spatial data.  With GeoWave you can manage massive amounts of geoinformation in key/value databases such as Accumulo and take advantage of programs such as MapReduce, which Accumulo uses for distributed processing.
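To make the idea of a geospatial hash concrete, here is a minimal sketch in Python of one common approach, a Z-order (Morton) code, which interleaves the bits of a quantized lon and lat into a single integer.  This is an illustration only and is not taken from GeoWave; the function names (quantize, interleave, zorder_key) and the 16-bit resolution are assumptions chosen for the example, and GeoWave's own indexing may work differently.

```python
def quantize(value, lo, hi, bits):
    """Map a value in [lo, hi] to an integer cell number in [0, 2**bits - 1]."""
    cells = 1 << bits
    scaled = int((value - lo) / (hi - lo) * cells)
    # Clamp so that value == hi falls into the last cell instead of overflowing.
    return min(max(scaled, 0), cells - 1)


def interleave(x, y, bits):
    """Interleave the low `bits` bits of x and y into one Z-order code.

    Bits of x go to even positions and bits of y to odd positions.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def zorder_key(lon, lat, bits=16):
    """Hypothetical helper: turn a lon/lat pair into a sortable integer key."""
    x = quantize(lon, -180.0, 180.0, bits)
    y = quantize(lat, -90.0, 90.0, bits)
    return interleave(x, y, bits)


if __name__ == "__main__":
    # Sorting by the integer key is what lets an ordered key/value store
    # answer spatial queries with range scans over the keys.
    points = [(-75.7, 45.4), (2.35, 48.85), (-75.6, 45.5), (151.2, -33.9)]
    for lon, lat in sorted(points, key=lambda p: zorder_key(*p)):
        print(zorder_key(lon, lat), lon, lat)
```

Each pair of bits in the key, read from the most significant end, selects one of four quadrants at successively finer levels, so the key can be read as a path down a quadtree.  The same interleaving extends to 3D or 4D by adding elevation and time as further dimensions.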

Connecting Accumulo to GeoServer, GeoTools, and PDAL

In addition, GeoWave includes a GeoServer plugin that enables geospatial data in Accumulo to be shared and visualized via GeoServer OGC standard web services.  It also provides plugins to connect the popular geospatial toolset GeoTools and the point cloud library PDAL to an Accumulo-based data store.  The PDAL plugin makes it possible to interact with point cloud data in Accumulo through the PDAL library.
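Once data is published through GeoServer, a client reaches it with ordinary OGC requests and does not need to know that Accumulo sits behind the layer.  As a rough sketch, the Python below builds a standard WMS 1.1.1 GetMap URL.  The server address and layer name are hypothetical placeholders, not real GeoWave or GeoServer endpoints, and the helper function wms_getmap_url is made up for this example.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width=512, height=512):
    """Build a WMS 1.1.1 GetMap URL for a layer and a lon/lat bounding box.

    bbox is (min_lon, min_lat, max_lon, max_lat) in EPSG:4326.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value asks the server for the default style
        "SRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical server and layer name, for illustration only.
    url = wms_getmap_url(
        "http://localhost:8080/geoserver/wms",
        "demo_layer",
        (-10, 40, 5, 55),
    )
    print(url)
```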
 
The GeoWave project plans to extend the same geospatial capabilities to other distributed key-value stores in addition to Accumulo.  The next data store will be HBase.  It also will support other geospatial frameworks in addition to GeoTools/GeoServer.  Mapnik is the next geospatial framework targeted for GeoWave support.  GeoWave says it is very interested in GeoGig, and support for this geospatial data versioning library is currently on its backlog.  GeoGig takes the concepts used in distributed version control systems such as Git and applies them to versioned spatial data.

Background
 
GeoWave was developed at the National Geospatial-Intelligence Agency (NGA) in collaboration with RadiantBlue Technologies and Booz Allen Hamilton.  The NGA released GeoWave under an open source license in June 2014.  The primary goal of GeoWave is to bridge the gap between well-known geospatial projects such as GeoTools and distributed databases.

I blogged previously about GeoMesa, the first LocationTech project that aims at providing a foundation for storing, querying, and transforming spatio-temporal data in Accumulo.  It implements interfaces that enable GeoServer and other GeoTools projects to use Accumulo as a data store.

Geoff Zeiss

Geoff Zeiss has more than 20 years experience in the geospatial software industry and 15 years experience developing enterprise geospatial solutions for the utilities, communications, and public works industries. His particular interests include the convergence of BIM, CAD, geospatial, and 3D. In recognition of his efforts to evangelize geospatial in vertical industries such as utilities and construction, Geoff received the Geospatial Ambassador Award at Geospatial World Forum 2014. Currently Geoff is Principal at Between the Poles, a thought leadership consulting firm. From 2001 to 2012 Geoff was Director of Utility Industry Program at Autodesk Inc, where he was responsible for thought leadership for the utility industry program. From 1999 to 2001 he was Director of Enterprise Software Development at Autodesk. He received one of ten annual global technology awards in 2004 from Oracle Corporation for technical innovation and leadership in the use of Oracle. Prior to Autodesk Geoff was Director of Product Development at VISION* Solutions. VISION* Solutions is credited with pioneering relational spatial data management, CAD/GIS integration, and long transactions (data versioning) in the utility, communications, and public works industries. Geoff is a frequent speaker at geospatial and utility events around the world including Geospatial World Forum, Where 2.0, MundoGeo Connect (Brazil), Middle East Spatial Geospatial Forum, India Geospatial Forum, Location Intelligence, Asia Geospatial Forum, and GITA events in US, Japan and Australia. Geoff received Speaker Excellence Awards at GITA 2007-2009.
