Citizen geodata science: an emerging big data trend

Gartner’s latest hype cycle identifies a new trend, “citizen data science”, that has important implications for the geospatial sector.

At the recent Canadian Institute of Geomatics (CIG) Ottawa Branch Workshop on Big Data: What is it? Technology and Applications, Monica Wachowicz from the University of New Brunswick pointed out a brand-new entry, Data Science, in the Gartner Hype Cycle 2014 that wasn’t in the Gartner Hype Cycle 2013. Then, in the Gartner Hype Cycle 2015, Data Science and Big Data disappeared and a new entry with the intriguing name Citizen Data Science appeared.

2013 Big Data

2014 Big Data and Data Science

2015 Citizen Data Science

What is data science?

Monica referenced research by Harlan Harris et al., “Analyzing the Analyzers: An Introspective Survey of Data Scientists and Their Work”, which surveyed data scientists to try to determine what data scientists do. The authors divided data scientists into five categories based on their self-ranked skill sets: Statistics (which includes spatial statistics), Math/Operations Research, Business, Programming, and Machine Learning/Big Data; and into four categories based on the respondents’ self-identification: Data Researchers, Data Business people, Data Developers, and Data Creatives.

  • Data Business people are most focused on the organization and how data projects yield profit.
  • Data Creatives excel at applying a wide range of tools and technologies to a problem, or creating innovative prototypes at hackathons.
  • Data Developers are focused on the technical problem of managing data — how to get it, store it, and learn from it.
  • Data Researchers have backgrounds in statistics or mathematics; 75% of them have published in peer-reviewed journals, and over half have a PhD.

The resulting landscape of data scientists shows a broad range of skills distributed among the self-identified professional categories. The researchers found that

  • a defining feature of data scientists is the breadth of their skills, and their ability to single-handedly do at least prototype-level versions of all the steps needed to derive new insights or build data products.
  • the evidence points in favor of a scientific versus a tools-based education for data scientists. 
  • 70% of the respondents had at least a Master’s degree. 
  • 40% had undergraduate degrees in scientific fields, specifically, physical or social sciences (but not mathematics, computer science, statistics, or engineering).

The survey showed that the most successful data scientists are those with substantial, deep expertise in at least one aspect of data science: statistics, big data, or business communication. The authors concluded that data science is a collaborative and creative field, where the successful professional is able to work with database administrators, business people, and others to get data projects completed in innovative ways.

What is citizen data science?

One of the biggest drivers for a new type of data specialist is that there are simply not enough data scientists to satisfy the demand from virtually all sectors of the economy. Gartner has recommended cultivating “citizen data scientists”, which it identifies as people on the business side who may have some undergraduate mathematics or social science background and who can be assigned to explore and analyze data with the appropriate software tools. Last year Gartner predicted that demand for citizen data scientists would grow five times faster than demand for the highly skilled data scientists Harlan Harris studied.

Software tools are critical for helping citizen data scientists find real insights and avoid simple statistical mistakes. For example, one technology that Gartner identified is “smart data discovery”, a next-generation data discovery capability that provides business users or citizen data scientists with insights from advanced analytics and helps them avoid some of the common statistical pitfalls.

Citizen geodata science

There is at least anecdotal evidence that a trend toward “citizen geodata science” has been under way for some time. At the India Geospatial Forum in Hyderabad in 2014, I moderated a session on electric power. One of the speakers was Arup Ghosh, Chief Technology Officer at Tata Power Delhi Distribution Ltd (TPDDL), who presented his perspective on implementing geospatial technology in a private utility. The geospatial group at TPDDL had about 60 field personnel and 18 analysts and support staff, none of whom had an educational background in geospatial data and technology. Twelve were electrical engineers and the rest were people with electric power experience. All learned geospatial data management and simple analytics “on the fly”.

At a recent GoGeomatics Social, Jonathan Murphy gave a presentation about his experience working in Northern Alberta preparing terrain for seismic surveys. Geospatial data and technology were used in all aspects of field operations. The staff were experienced in seismic surveying, winter drilling programs, wildfire management, and road and facility construction, but had minimal education in geospatial data management and analytics. They, too, had picked up enough geospatial knowledge “on the fly” to do their jobs.

These are two examples of what could be called “citizen geodata science”, which I suspect is part of Gartner’s citizen data science trend and is almost certainly growing more rapidly than traditional “geospatial data science” as geospatial technology is adopted into vertical industries. The challenge is how to reach these people and help them avoid the common pitfalls of geodata management and geoanalytics, perhaps through MOOCs, community colleges, conferences that focus on vertical industries and include geospatial technology vendors, presentations and hands-on training, or vendor marketing.

Geoff Zeiss

Geoff Zeiss has more than 20 years of experience in the geospatial software industry and 15 years of experience developing enterprise geospatial solutions for the utilities, communications, and public works industries. His particular interests include the convergence of BIM, CAD, geospatial, and 3D. In recognition of his efforts to evangelize geospatial in vertical industries such as utilities and construction, Geoff received the Geospatial Ambassador Award at Geospatial World Forum 2014. Currently Geoff is Principal at Between the Poles, a thought leadership consulting firm. From 2001 to 2012 Geoff was Director of Utility Industry Program at Autodesk Inc, where he was responsible for thought leadership for the utility industry program. From 1999 to 2001 he was Director of Enterprise Software Development at Autodesk. He received one of ten annual global technology awards in 2004 from Oracle Corporation for technical innovation and leadership in the use of Oracle. Prior to Autodesk, Geoff was Director of Product Development at VISION* Solutions. VISION* Solutions is credited with pioneering relational spatial data management, CAD/GIS integration, and long transactions (data versioning) in the utility, communications, and public works industries. Geoff is a frequent speaker at geospatial and utility events around the world, including Geospatial World Forum, Where 2.0, MundoGeo Connect (Brazil), Middle East Spatial Geospatial Forum, India Geospatial Forum, Location Intelligence, Asia Geospatial Forum, and GITA events in the US, Japan and Australia. Geoff received Speaker Excellence Awards at GITA 2007-2009.
