Big Data, Meet Long Data, Meet Blog Data

April 2, 2013 · Posted by jfrank

Big Data Meet Long Data by Jeff Bertolucci - @jbertolucci - appears this week in InformationWeek and reminds us that "Long Data," or historical data, is vital for analyzing and understanding trends that span years.

Bertolucci's article links to a Wired article from January 2013 titled Stop Hyping Big Data and Start Paying Attention to ‘Long Data’ where the author, Samuel Arbesman - @arbesman - says "Big data puts slices of knowledge in context. But to really understand the big picture, we need to place a phenomenon in its longer, more historical context."

Bringing this home to a company's context, Bertolucci quotes Benjamin Bruce (Pitney Bowes - Marketing Director) as saying "Big data is more about taking a slice in time across many different channels" and that "long data involves looking at information on a much longer timescale. Ignoring customer data and records that go back decades can limit a company's ability to connect with its customers."

Taking a long-data trip back to May 2008 brings us to a quote from Now Everything is Fragmented by Dave Snowden - @snowded - in KMWorld where he said:

"Over the last decade as I have worked on homeland security, we have had the chance to run some experiments that show that raw field intelligence has more utility over longer periods of time than intelligence reports written at a specific time and place. In other experiments, we have demonstrated that narrative assessment of a battlefield picks up more weak signals (those things that after the event you wished you had paid attention to) than analytical structured thinking."

He continues with an explanation: "we live in a world subject to constant change, and it’s better to blend fragments at the time of need than attempt to anticipate all needs." Amen.

So when you leave data on the chopping block after completing an analysis, you are denying the next person an opportunity to go back to the raw data and run their own analysis, possibly for the same or different purposes.

I'll assert that what's needed is the thin slicing of big data combined with long data trends, which together allow for understanding change and forming some picture of the future.

Bringing this back to "blog" data - that's where we can capture the vital narrative that Snowden says carries the weak signals. The blog data helps to annotate the context where big data lives.

Blog data in a TeamPage demo also offers a simple and easy example to explain the importance of thin slicing over long trends. Here is a tag cloud from one of my demo servers. It's set to All Time. The tags tell a story, of sorts. Interpret it as you may.

Image

Now, when we use a date selector control, we can see it tells a very different story in 2010 vs. 2011.

Date Selector: Year 2010
Image

Date Selector: Year 2011
Image

In the example, you can see the company's attention has shifted from competitors like Nike and Vibram in 2010 to competitors like Merrell in 2011. It also looks like they've done less Policy work in 2011 and have shifted away from HUMINT (human intelligence collection).
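For readers who want to try the same thin slicing on their own tagged content, here is a minimal Python sketch of the idea. It is not how TeamPage builds its tag clouds. The sample posts, their dates, and the counts are made up for illustration, and `tag_counts` is a hypothetical helper. Only the tag names are borrowed from the example above.

```python
from collections import Counter
from datetime import date

# Hypothetical sample posts as (publication date, tags).
# Dates and counts are invented purely for illustration.
POSTS = [
    (date(2010, 3, 14), ["Nike", "Policy"]),
    (date(2010, 6, 2), ["Vibram", "HUMINT"]),
    (date(2010, 9, 21), ["Nike", "Vibram", "Policy"]),
    (date(2011, 2, 8), ["Merrell"]),
    (date(2011, 5, 30), ["Merrell", "Policy"]),
    (date(2011, 11, 17), ["Merrell", "Nike"]),
]


def tag_counts(posts, year=None):
    """Count tag occurrences, optionally restricted to one calendar year."""
    counts = Counter()
    for published, tags in posts:
        if year is None or published.year == year:
            counts.update(tags)
    return counts


# The thick slice: every tag across all time.
all_time = tag_counts(POSTS)

# The thin slices: one tag count per year.
by_year = {year: tag_counts(POSTS, year) for year in (2010, 2011)}

if __name__ == "__main__":
    print("All time:", all_time.most_common())
    for year, counts in by_year.items():
        print(year, counts.most_common())
```

In the all-time count of this made-up data, Nike, Policy and Merrell are tied. The per-year counts show that Merrell only appears in 2011 and HUMINT only in 2010, which is the kind of shift the thick slice hides.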

This sort of tag cloud offers a pretty blunt view across a whole server or a particular space at a time. Greater precision is often required. Another way to slice the content is with our premium search, which is powered by Attivio's Active Intelligence Engine. You can search for any set of words and get a tag cloud for the search results, then slice that by time period or by any other facet or set of tags.

Image

In this search interface, Attivio also helps us extract and display tag cloud style views of any facet, including keywords, spaces, content authors, and more. From the keywords facet, you can quickly see there are 5 hits on Marathon Training and Injury Prevention.

This brings us to Small Data. Once we can thin slice small, relevant data, we can quickly assess which topics are prominent and understand trends more quickly, even before digging in to read the source content.

A thick slice across all time isn't adequate to explain a course of events - a long view with thin slices and supporting narrative is vital. With all this, you might take a long music trip back to the 80s and You May Ask Yourself, How Did I Get Here? ...and actually come to a good answer!

Or, if you want to consider more thought-provoking ideas about tags, other metadata and the role of time, please click over to my Tag Mush presentation linked at the bottom of Ontologies & Tagsonomies at Taxonomy Boot Camp to read more about the Information T.

Image

This Information T model talks directly to the importance of Long Data, Big Data and context from Blog Data.
