Journalists Seek Answers From LulzSec And Sarah Palin’s Emails…Without Asking Questions
He is also a woman.
In May, the hacker collective LulzSec hacked into and released Fox’s database of X-Factor contestants. You can download the whole thing, peruse the 73,000 hopefuls, pull out data points. Such as the following: the most common first name in the bunch is Michael, last name is Johnson – but the group is 2-to-1 female. (There were six actual “Michael Johnson”s. None were from Hollywood; all were male.)
LulzSec’s site is a cornucopia of data: usernames and passwords and sales databases and media outlets. Aside from nefarious purposes, there’s not a lot that can be gained from it. By me, anyway.
Which is a good transition point to another data dump from recent memory: the Sarah Palin emails. The 24,000 messages sent to and from the former Governor of Alaska revealed… not much. An enormous depth of information – but not a whole lot that was newsworthy. Not that media organizations didn’t do their best – recognizing the scope of the task, they cleverly outsourced the analysis process to the general public, hoping that some fascinating tidbit they missed might be spotted by a savvy observer on his home computer. But in this case, it was as though Geraldo had invited the entire city of Chicago to help crack open Al Capone’s vault.
You can’t fault the Post and Times for trying the strategy; after all, it worked spectacularly with Wikileaks. Growing more sophisticated after the release of files from the Afghanistan conflict, Wikileaks launched its diplomatic cables release with an indexed, searchable website already created. Journalists, paid and unpaid, seized upon it, uncovering a broad array of data – and arguably launching the Arab Spring.
This is the era we live in. Data, always omnipresent, is now digitized. Meaning it’s searchable, indexable. Which is the point of the smart column Daily Dot’s Nicholas White wrote for PBS’ Mediashift. He wrote:
To cover the online community, The Daily Dot needs data skills. We don’t just need programmers to produce a website; we need some in the newsroom, too. And we need highly skilled mathematicians. We need people who didn’t spend all their time in the humanities in college — we need those who understand scientific research.
In the information age, journalism needs to go further. Information bombards us. What is scarce is insight, understanding and knowledge.
Data, once mute, is now a source. One of White’s points is that a media institution – particularly one like his, predicated on covering a medium built on data – needs to know how to get answers from that source.
Yesterday, the Knight Foundation unveiled its 2011 News Challenge winners – $1.5 million in prizes given in part to projects which can facilitate asking questions of data. There are some brilliant and deserving recipients: upgrading DocumentCloud to allow annotation, for example; or aggregating social media during big news events with iWitness. (Nieman Lab did a nice overview of the winners.)
Data is a source. We are building smarter tools to get answers from it. Which leaves only one problem: who knows what questions to ask?
In a sense, that’s what broadening access to Wikileaks and the Palin emails allowed – those paid and unpaid journalists digging through, each looking for whatever his or her passion was. Andrew Sullivan might have kept an eye out for information about Trig. Lisa Murkowski could have scanned for mentions of her father. A campaign finance lawyer would have brought one lens; an advocate for increased religion in government, another. The questions would have depended on the asker.
Just like in everything else, journalism to a large extent is the art of finding patterns in chaos, stories in the unremarkable. Whether that was Weegee’s photos or Woodward’s interest in the Watergate break-in, journalism is about finding the newsworthy.
In his column, White notes that it is still the asker that makes the difference – but that we need smoother processes to allow askers to question the data.
I downloaded the X Factor data, but had no questions to ask it. A police reporter in Hollywood might have scanned for particular names (did Whitey Bulger sing?); an entertainment reporter would certainly have wanted to dive deeper on some of the contestants. All I did was pull out some statistics.
And journalism, we can all agree, needs to be much more than that.