Molten content, data ghettos and why your CMS problems are an excuse, not a reason

* Note: This post came from a version of this blog that got lost in a server failure. It's been restored from old RSS feeds, Google caches and other sources. As such, the comments, links and associated media have been lost.

The other component of the data ghetto that bothers me is that you can’t find that data outside the ghetto. Please, someone point me to a place where there’s dynamic content being fed to the story-level pages. I have yet to see someone’s crime data being fed into a story about a crime, e.g. a map of murders from the data ghetto’s crime application dynamically generated on a story page about a murder. Or a list of a politician’s largest donors from a campaign finance app on a story about that politician.

And that seems to be a problem we’re creating for ourselves: we’re only thinking about getting the data online, not about what to do next. Or about what else we could do with our data. Or what someone else could do with it if we let them. We’re content with a couple of search boxes, a button and a results page. And we’re content to leave it right where we put it.

Here’s why I’m thinking about this now: I spent all this time building PolitiFact as a layered, data-driven approach to political and fact-check journalism. What have I spent my time doing since? Trying to figure out how to get my content out of that site and into other forms. First was the automatically generated email newsletter (sign up here). Then it was a widget, which required turning the Truth-O-Meter into a JSON stream I could parse with a little JavaScript. Now? We’re syndicating PolitiFact content to newspapers that sign up at a whole other site. Subscribers can go to PolitiFactMedia and get our content for their publications via a password-protected site, which includes a pure REST API for automated import into whatever CMS the customer is using.
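To make the JSON-stream idea concrete: the original feed's format is lost with the old site, so the URL-free sketch below uses hypothetical field names, but it shows the shape of the thing a widget consumer does with a stream like the Truth-O-Meter's.

```python
import json

# A hypothetical Truth-O-Meter payload. The real feed's field names
# weren't preserved in the restoration, so these are illustrative only.
feed = """
[
  {"speaker": "A Politician", "statement": "Taxes went down.", "ruling": "False"},
  {"speaker": "Another Politician", "statement": "Crime is up.", "ruling": "Half True"}
]
"""

def render_items(raw_json):
    """Parse the JSON stream and build the HTML fragments a widget
    would write into the host page."""
    items = json.loads(raw_json)
    return [
        '<li>%s: "%s" (%s)</li>' % (i["speaker"], i["statement"], i["ruling"])
        for i in items
    ]

for line in render_items(feed):
    print(line)
```

Once the content is serialized like this, the consuming page doesn't care whether the producer is Django, PHP or anything else; that indifference is the whole point.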

Doing all this got me thinking of another concept: molten content.

I’ve always thought of the work I was doing as building something out of raw materials. As a reporter, I did interviews, read documents and analyzed data. All that raw material was worked into a story, a graphic, maybe some photos, and lately some online interactive content. Building news applications, I’m finding, is more like working with metal. The more malleable you make your content, the easier it is to mold your application into all the different places it may need to go.

Most places, the data in the main CMS is cold iron — hard as hell to work with, if not impossible. So most of the time, we’re not going to be bringing our newspaper content to our applications. But what if we brought our application to our newspaper content?

Ask yourself this: why can’t you find data from a site’s data ghetto anywhere outside of it? A large reason for a lot of data ghettos is that the CMS is on one set of servers built on one technology, and the place for data is on a whole other server setup with another technology. That’s driven by the horror stories most webworkers have about how gawdawful their CMS is and how the CMS won’t serve up this or handle that (or hell, even shovel the previous day’s paper online cleanly).

But, if you designed your application right, your CMS problems are an excuse, not a reason.

As I go forward, I’m adding a hidden requirement to my applications: make the data molten — the stage where the metal is nearly liquid, easy to pour in whatever form I need it to go into.

Here’s a simple example: PolitiFact’s widget. It was my first attempt at molten content. I’m going to breeze over the code for now — if anyone wants me to detail it, post a comment below and I’ll do it in another post.

Basically, the widget is the sum of a few parts. First, a user embeds a little piece of script on their page (you can get it here). That calls a piece of JavaScript that looks like this. That JavaScript then makes a couple of calls itself: to a CSS file, and to a page that returns the results of a query against the PolitiFact database in JSON format. That page is actually pretty hackish: it’s a pretty vanilla view in Django that returns a template faking the JSON. If I were to do it over again, there are better ways to serialize data in Django. But my hack worked, and I haven’t had time to redo it the more elegant way. Anyway, the script goes to that page, parses out the data and writes it to your browser screen.
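For what the "more elegant way" might look like: instead of faking JSON in a template, the view builds the JSON itself. This is a plain-Python sketch under assumptions — the `Ruling` class stands in for a Django model, the attribute names are hypothetical, and in real Django you'd hand the result to an `HttpResponse` (or use `django.core.serializers`/`simplejson` on the queryset directly).

```python
import json

class Ruling:
    """Stand-in for a Django model instance; attributes are illustrative."""
    def __init__(self, speaker, statement, ruling):
        self.speaker = speaker
        self.statement = statement
        self.ruling = ruling

def ruling_to_dict(ruling):
    """Flatten one Truth-O-Meter ruling into JSON-safe types."""
    return {
        "speaker": ruling.speaker,
        "statement": ruling.statement,
        "ruling": ruling.ruling,
    }

def json_feed(rulings):
    """What the view would return as its response body
    (served with a JSON content type)."""
    return json.dumps([ruling_to_dict(r) for r in rulings])

body = json_feed([Ruling("A Politician", "Taxes went down.", "False")])
print(body)
```

The payoff is that the serialization lives in one place, so the same feed can back the widget, the syndication API or anything else that comes along.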

Here’s what I mean by molten content. PolitiFact resides on a server in my employer’s server room. This blog is hosted somewhere else. That’s Django; this is WordPress, which is PHP-based (* this is a post from my old blog; it's all Django now). Different systems, different servers, different states, different everything. And here’s that dynamic content:

So, if I can take PolitiFact and put it on my blog with these tools, why can’t we take it a step further and put any and all data from our news applications into our story pages?

Because, as I just showed you, we can. We can do it better than this. If you’re developing news apps and don’t know anything about web services, you should start learning.

The broader point here, divorced from technologies and implementations, is that we need to start thinking about where our data is going to go and what we’re going to do with it beyond search and results pages at our one URL. More on this in the coming months, when some other projects I’m working on go live.

By: Matt Waite | Posted: Jan. 11, 2008 | Tags: Journalism, Databases | 0 comments