The Canadian Institute of Geomatics Brings Big Data to Ottawa
Last week (June 1st, 2016) I was among an interesting group of participants and presenters who came together to discuss Big Data. The day-long event, put on by the Ottawa Branch of the Canadian Institute of Geomatics (CIG), was framed as an exploration of “Big Data: Technology and Applications.”
At first, not being terribly familiar with “Big Data,” I wasn’t sure what to expect from the day. The first speaker to the podium was Greg Richards, who is the MBA Program Director at the University of Ottawa. He kicked us off with: “What is Big Data? Why does it matter? Who uses it?” Geoff Zeiss was sitting to my left throughout the day, and he has produced an excellent analysis of what he took from the event, entitled “Citizen Geodata Science an Emerging Big Data Trend.” He also shares some informative images from the slides.
During the day my understanding of what Big Data is, and what it means for the geomatics sector, grew exponentially. I was particularly impressed with Dr Tracey P. Lauriault, whose talk, “Big Data: Policy Issues, Data Sharing, Privacy and Access,” was impactful in terms of both the volume and the quality of the information shared. She stood out to me as the most knowledgeable of all the speakers with regard to the ramifications of Big Data. It was also quite fun having her talk to us from Sweden via a live video link, which helped diversify the format of the day.
What did I learn about Big Data? I feel that now, after the event, I have a much better handle on the subject. You can find a list of the speakers and the topics here. As you can see from the list, the quality of the speakers was quite good. If I have one criticism of the event, it is this: I found that the speakers who came from what we might term the more traditional geomatics community were weaker than those who came from outside it. It felt to me as though they had simply rehashed older presentations and re-titled them “Big Data” presentations. Their talks were still interesting, but they served as a sharp contrast to the more dynamic speakers, who leaned more towards Big Data and less towards geomatics as a focus.
All this talk of Big Data raises the question: what is it? A fairly mainstream definition of Big Data is to think about it as three Vs: data Volume, Velocity and Variety.
The volume part of Big Data needs no explanation to us geomatics professionals, as at one point or another most of us have run into massive amounts of data. For us this comes in the form of raster imagery or vector data. So the idea of volume is right there in the name of the thing, “Big Data,” but what about those other two Vs? What do they mean?
Velocity means that data capture is happening really, really fast. We are not talking about data loggers in the field that get downloaded at the end of the month. We are talking about data streaming almost continuously from the sensor or other form of data production. I would widen this definition to include the speed at which analysis takes place. As the volume of data is high, and the speed at which it is being created is enormous, the analysis needs to happen as quickly as possible.
In the future, a lot of data won’t go into the cloud or onto your clunky old server. It will live and move on the network, and that is where we will do our analysis. Now we have one more V to look at: Variety.
Here is another aspect of Big Data I found surprising. As a geomatics specialist I deal with certain types of data sets fairly regularly. I have my raster and vector data, and I also have my tables of attribute data. There are other types of data, but for the most part that is what I have dealt with. Big Data means drawing together and working with much more varied types of data. Think about working with sound and video as well as our more traditional data sets. Now how about throwing in social media streams as a new data set as well?
If you are not quite sure about my definition of Big Data, I would invite you to have a look at how a lot of IT professionals much smarter than myself have defined it. You will see there is a lot of room to maneuver.
Kudos to the CIG branch in Ottawa for putting on an excellent event. Paul Mrstik, who helped to organize the event on behalf of the CIG, sent me the post-event wrap-up below to share with the community. The event was well attended, with 38 presenters and participants.
What was particularly interesting about the topics presented on big data was the connection they made to geomatics, and to some of the concerns the geomatics community has when it comes to using and applying geospatial information to analytics. Data quality came up more than once, as did accuracy. For some, these terms refer to coordinates, datums, projections and observations, while for others the terms may apply to data sources, statistical anomalies, verification, modeling, etc. The two communities can learn from each other; when they combine technologies it does not take much translation to get on the same page.
We learned that analytics is what big data is all about, and that geomatics is used as a component of descriptive and prescriptive techniques. The point was made several times that those engaged in big data analytics could really benefit from more geomatics expertise, and that educators in the geomatics community, recognizing this, are starting to address the need. Several speakers who work with huge mapping data sets made the point that the cloud is there, but it is still not plug-and-play for someone who needs to walk in with a box full of hard drives.
CIG Ottawa Branch wants to thank the presenters who took valuable time out of their busy schedules to make our event a success. We also appreciate the support from our geomatics community, which spread the word about the workshop, and from the 31 people who registered. The success of our last two workshops encourages us to organize more in the future!