Spatial Data Infrastructure for Government

One of the scenarios that is becoming increasingly common, especially in government at the national level, is an IT solution for sharing data, often referred to as a spatial data infrastructure or SDI.  In most organizations this is what I would call a Web 1.0 application, where spatial data created somewhere within the government, for example in the national mapping agency, is made available through a single portal to consumers of the data.  Initially, the consumers are typically other government agencies, but over time the portal frequently evolves into a public portal.  In addition to providing a description of the data and access to the data, the portal will often also include metadata about the data.  I've included a logical architecture (thanks to Bruce Argue for this) that illustrates what a typical SDI looks like logically.  There are creators, who create data; a consolidator or portal, which provides a gateway to and metadata about the data; and consumers, who use the data.

Challenge

Typically the creators or data providers have used applications from multiple vendors to develop their datasets, so the consolidator needs to be able to consume data in Autodesk, Oracle Spatial, ESRI, MapInfo, Intergraph, and Bentley formats and then publish this data through the consolidator portal in a common, standards-based way so that everyone in the organization is able to access the data.

Solutions

From an IT perspective the solution to this type of problem rests on two basic components, standards-based interoperability and a scalable architecture.

For spatial data, fortunately the Open Geospatial Consortium (OGC) has defined standards which are becoming widely adopted in government.  The OGC’s open web standards WMS and WFS are the key to interoperability in a web environment.  In the context of SDI this means that the consolidator portal needs to be able to publish WMS/WFS compliant data and consumers need to be able to consume WMS and WFS datasets.  As I mentioned in a previous blog, most vendors of geospatial products support at least WMS, and increasingly WFS.
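To make the interoperability point concrete, here is a rough Python sketch of what a standards-compliant request to an SDI portal looks like.  The parameter names and values follow the OGC WMS 1.3.0 specification; the endpoint URL, layer name, and bounding box are hypothetical placeholders, not a real government service.

```python
from urllib.parse import urlencode, urlparse, parse_qs

def wms_getmap_url(base_url, layers, bbox, width, height,
                   crs="EPSG:4326", fmt="image/png"):
    """Build a WMS 1.3.0 GetMap request URL.

    Any WMS-compliant consumer can issue this request against any
    WMS-compliant portal, regardless of which vendor's software
    created the underlying dataset -- that is the interoperability
    the OGC standards buy you.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return base_url + "?" + urlencode(params)

# Hypothetical consolidator portal endpoint and layer name
url = wms_getmap_url("https://sdi.example.gov/wms",
                     ["parcels"], (-114.0, 49.0, -110.0, 53.0), 800, 600)
```

A consumer would fetch this URL with any HTTP client and get back a rendered map image; WFS requests for the raw features are built the same way with `SERVICE=WFS` and `REQUEST=GetFeature`.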

From an architectural perspective there are several solutions to the problem of publishing datasets stored in different formats through a WMS/WFS portal.

Clearinghouse

A common architecture is a clearinghouse, where the portal simply includes descriptions, metadata, and a URL for each dataset.  In the past this type of approach was common, but the URL would redirect you to a web site where you could download the data in whatever format the data creator supported.  The advantage of a standards-based architecture as illustrated here is that the data is accessible through a common standards-based web services API.
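The essence of the clearinghouse is that the portal holds only catalog records, never the data itself.  A minimal sketch in Python, with illustrative field names and hypothetical dataset URLs (real catalogs would use a metadata standard such as ISO 19115):

```python
from dataclasses import dataclass, field

@dataclass
class CatalogRecord:
    """One clearinghouse entry: description, metadata keywords, and a
    service URL pointing at the data creator's own server."""
    title: str
    abstract: str
    keywords: list = field(default_factory=list)
    wms_url: str = ""

catalog = [
    CatalogRecord("Road network", "National road centrelines",
                  ["transport", "roads"], "https://roads.example.gov/wms"),
    CatalogRecord("Cadastral parcels", "Property boundaries",
                  ["cadastre", "parcels"], "https://parcels.example.gov/wms"),
]

def search(catalog, keyword):
    """Cross-dataset search is limited to the metadata fields;
    the portal never touches the actual data."""
    return [r for r in catalog if keyword in r.keywords]

hits = search(catalog, "cadastre")
```

Note that `search` can only match what the creators put in their metadata, which is exactly the query limitation discussed below.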

An advantage of this approach is that the portal is relatively simple to implement because it does not publish data directly.  In addition, it puts the onus on the data creator, rather than the consolidator, to ensure that the data is accessible.

A disadvantage is that cross-dataset queries are limited to the contents of the metadata, since to query the actual data you need to go to the website of each dataset.  Another potential disadvantage is that the data creators may not all support the same versions of the WMS and WFS standards, so you may not find the same level of data access for all datasets.

Spatial Warehouse

This is probably the most typical approach at the present time.  Periodically, data creators upload a copy of their current dataset to the consolidator, who converts it and loads it into a spatial data warehouse consisting of a spatially-enabled relational database management system (RDBMS).  This process can be automated so that the datasets are refreshed periodically.  The frequency with which this occurs depends on the volatility and perishability of the data.

The primary advantage of this approach is that it has minimal impact on the operational processes of the data creators.   Another advantage is that since the data is stored in a common spatially-enabled RDBMS, spatial SQL queries on each of the datasets are supported.  Potentially, cross-dataset queries can also be defined.  In addition, uploads can be timed to occur at off-peak times so that large uploads do not impact operational systems during normal business hours, when network availability is at a premium.  For remote sites with poor network connections, datasets can be provided to the central office on CDs.
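This is the kind of cross-dataset query that becomes possible once everything sits in one spatially-enabled RDBMS.  The SQL below is in the style of PostGIS; the table names, columns, and the choice of `ST_Intersects` are illustrative assumptions, not a schema from any actual SDI.

```python
# A cross-dataset spatial join: find road segments that cross cadastral
# parcels.  With a clearinghouse this query is impossible, because the
# two datasets live on different creators' servers; in the warehouse
# they are just two tables in the same database.
CROSS_DATASET_QUERY = """
SELECT r.road_id, p.parcel_id
FROM   roads r
JOIN   parcels p ON ST_Intersects(r.geom, p.geom);
"""

def query_for(road_table, parcel_table):
    """Build the same join for arbitrary (hypothetical) table names."""
    return (f"SELECT r.road_id, p.parcel_id "
            f"FROM {road_table} r "
            f"JOIN {parcel_table} p ON ST_Intersects(r.geom, p.geom);")
```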

This approach requires that RDBMS schemas for all of the datasets be defined. In general each dataset will require its own schema. You also have to identify applications such as FME for data conversion and write scripts for converting each dataset and loading it into the RDBMS.
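As a sketch of what the per-dataset schema and conversion script amount to, here is one hypothetical example.  The DDL is a PostGIS-style table definition; the loader builds an `ogr2ogr` command line (GDAL's conversion tool, an open-source alternative to a commercial product such as FME).  All names and connection details are placeholders.

```python
# One schema per dataset: a minimal PostGIS-style table for a
# hypothetical cadastral parcels layer.
PARCELS_DDL = """
CREATE TABLE parcels (
    parcel_id   varchar(32) PRIMARY KEY,
    owner_name  varchar(128),
    geom        geometry(Polygon, 4326)
);
"""

def load_command(shp_path, table, dbname):
    """Assemble the ogr2ogr invocation that converts one uploaded
    shapefile and loads it into the warehouse, replacing the previous
    copy.  Running it would be subprocess.run(cmd, check=True)."""
    return ["ogr2ogr", "-f", "PostgreSQL",
            f"PG:dbname={dbname}", shp_path,
            "-nln", table, "-overwrite"]
```

In practice you end up with one such DDL statement and one such conversion script per dataset, which is the schema-definition overhead this approach carries.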

Distributed Databases

Another approach is based on an intelligent portal that provides a common data access API to all the datasets.  This approach requires data providers for the different types of spatial datasets such as ESRI shape, Autodesk DWG, and Oracle Spatial.

The advantage of this approach is that each dataset is current and accessible through a common programmable interface, though you should bear in mind that in general each will have its own data model and the portal needs to reflect this.  In this approach each dataset is individually queryable using the common data access API.  Another potential advantage is that this approach allows the portal to provide cross-dataset queries.  A potential disadvantage is that in situations of limited network capacity, for example in the case of a low-bandwidth WAN, there may simply not be enough available capacity, so that performance levels become unacceptable.

But with the future in mind one of the most exciting things about this approach is that it supports Web 2.0 interactivity, in other words, users can edit the data.

Geoff Zeiss

Geoff Zeiss has more than 20 years experience in the geospatial software industry and 15 years experience developing enterprise geospatial solutions for the utilities, communications, and public works industries. His particular interests include the convergence of BIM, CAD, geospatial, and 3D. In recognition of his efforts to evangelize geospatial in vertical industries such as utilities and construction, Geoff received the Geospatial Ambassador Award at Geospatial World Forum 2014. Currently Geoff is Principal at Between the Poles, a thought leadership consulting firm. From 2001 to 2012 Geoff was Director of Utility Industry Program at Autodesk Inc, where he was responsible for thought leadership for the utility industry program. From 1999 to 2001 he was Director of Enterprise Software Development at Autodesk. He received one of ten annual global technology awards in 2004 from Oracle Corporation for technical innovation and leadership in the use of Oracle. Prior to Autodesk Geoff was Director of Product Development at VISION* Solutions. VISION* Solutions is credited with pioneering relational spatial data management, CAD/GIS integration, and long transactions (data versioning) in the utility, communications, and public works industries. Geoff is a frequent speaker at geospatial and utility events around the world including Geospatial World Forum, Where 2.0, MundoGeo Connect (Brazil), Middle East Spatial Geospatial Forum, India Geospatial Forum, Location Intelligence, Asia Geospatial Forum, and GITA events in US, Japan and Australia. Geoff received Speaker Excellence Awards at GITA 2007-2009.
