We’re Not KDDing

Next up: making sense of all the Internet of Things data.
Published: August 12, 2013

It wasn’t so long ago that the Internet of Things was futuristic. Today, it’s high tech’s equivalent of the “Harlem Shake.” Just as thousands of people worldwide seemingly overnight created dance videos to accompany the song and uploaded them to YouTube, suddenly every tech company, consultant and journalist is “doing” the Internet of Things. Not only that, but just as with the Harlem Shake, they are doing their own versions—from the “Web of Things” to the “Industrial Internet” and, my personal favorite, the “Internet of Everything,” which apparently is the Internet of Things plus people and data. Well, duh.

The Harlem Shake, which went viral early this year, no doubt will be replaced soon by another Internet craze and then another. The Internet of Things will last for millennia, but it is already being replaced as the next futuristic idea. That’s because the Internet of Things is taking shape in the present. “Smart” cities are popping up in Europe and Asia. Want a “smart” home? You can purchase Belkin’s WeMo Internet of Things technology at Best Buy, Costco or Target for less than $50 (full disclosure: I led the development of this product).

So, what comes next? KDD: knowledge discovery and data mining. Actually, the idea of KDD is not new. In August, the Association for Computing Machinery is holding its 19th annual KDD conference. But the term hasn’t yet been co-opted and misused, and the field is alive with change.

For centuries, the dimensions of data analysis were somewhat constrained. There was a single data set from a single source—say, payroll data from a particular company for the past 10 years—and statisticians would analyze it using conventional mathematical tools. Computers made this process faster, easier and more accurate by putting the data into a “database,” where it could be analyzed electronically. Computers and computerized data became more common, and eventually data was moved online and to the “cloud,” and another term emerged to describe all this information: “big data.”

The rise of the Internet of Things means the world of big data is changing quickly. Data now streams from real-time sensors distributed globally and networked together, gathering noisy signals. Big data is now made of what Deborah Estrin, professor of computer science at Cornell NYC Tech, calls “small data”: lots of tiny, almost insignificant bits of information. Making all this data useful is a job for machines that must be programmed to comb through ever-changing deposits of data (data mining), find significant pieces and patterns (discovery) and synthesize them into something useful (knowledge). This technology, also known as “data science” or “machine learning,” is the frontier of computing. A network that senses data needs a network to make that data make sense.

Kevin Ashton was cofounder and executive director of the Auto-ID Center.