Some reflections on Big Data – Strata Europe

by Justin Souter on November 16, 2012

Introduction

I’ve recently been to several meetings in London, and had the opportunity to learn more about data, analytics, and developments in this area – especially around ‘big data’.

This follows on from my interests in cloud computing, business and web analytics, data visualisation, and innovation: I think you get the idea. Hopefully these days I balance my interest in new trends with a bit more circumspection!

This post is the first of my reflections on big data – about a recent conference in London. I’m also posting in more detail about the meetups I attended, and I’m queuing up a post about IBM reports in this area, so watch out for that.

OK, so what’s this big data thing?

Wikipedia summary:

In information technology, big data[1][2] is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools. The challenges include capture, curation, storage,[3] search, sharing, analysis,[4] and visualization. The trend to larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data, allowing correlations to be found to “spot business trends, determine quality of research, prevent diseases, link legal citations, combat crime, and determine real-time roadway traffic conditions.”[5][6][7]

Hype cycle

Many people are sceptical of tech trends, and Gartner invented the Hype Cycle to reflect this:

Gartner has used hype cycles to characterize the over-enthusiasm or “hype” and subsequent disappointment that typically happens with the introduction of new technologies.[2] Hype cycles also show how and when technologies move beyond the hype, offer practical benefits and become widely accepted.

Gartner says that big data is near the ‘peak of inflated expectations’, although some argue with this! See graphic to the right.

Visit to Strata Conference in London

I had arranged to be in London on business, and was pleased to take the opportunity to attend the periphery of Strata Europe.

The pass I got gave me access to the exhibitors’ stands and the chance to meet delegates, which I managed to shoehorn in around my other commitments.

I met some fascinating folk, and am listing interesting exhibitors below.

Strata Conference blurb

From here:

Unprecedented computing power and connectivity are bringing new layers of experience to our lives: a change that brings both opportunity and the challenge of new technologies and skills. The future belongs to those who understand how to collect and use their data successfully.

Strata Conference offers the nuts-and-bolts of building a data-driven business—the latest on the skills, tools, and technologies you need to make data work. The inaugural Strata 2012 event in London will focus on the core issues and opportunities that are specific to the European data community, as well as the most significant worldwide themes and players.

Exhibitors

  • Splunk – “turns machine data into valuable insights”
  • Greenplum – [my take] a collaboration space for data science insights, along the lines of eRoom; both are owned by EMC Corporation.
  • Datasift – takes the full feeds from platforms such as Twitter; home to the main man, Stewart Townsend – he of flower-shirt fame ;-)
  • Hortonworks – provides services around Hadoop (a bit like Red Hat re Linux)
  • Scraperwiki – “helps coders make data do stuff across the web”
  • Tableau – awesome dataviz
  • Feedzai – “a software company specialized in the processing of data in real time”; uncover and manage anomalies
  • Talend – open source data integration tools

OK, so we didn’t have The Zuck, but it was a great show! I think it could be the first of many – there are certainly East and West Coast events in the US, and a London event is scheduled for Autumn 2013.

Documents to learn from – O’Reilly

I’ve made my way through many of the following documents, which are free and pretty straightforward to read. They are a tremendous introduction to the whole field – more at the biz / tech end of things.

DJ Patil

Mike Loukides

Alex Howard

Wash up

Hopefully this post has set some context around big data, and reflects the fact-finding I did at this event.

I’m pretty confident the change is coming – what will you do to prepare?
