Setting the head of a post using a delimiter in your doc

I’ve gotten a lot of requests from people using our Docs to WordPress plugin asking how to set a headline that’s different from the title of the Doc, as we do at the BDN using a pipe.

This isn’t a standard feature of the plugin, but the plugin does include a few filters to modify how posts are formatted. One of these filters is pre_docs_to_wp_insert, and it can be leveraged as such:

<?php
/*
Plugin Name: Extend Docs to WP like so
*/

add_filter( 'pre_docs_to_wp_insert', 'bdn_split_post' );
function bdn_split_post( $post_array = array() ) {

    $exploded_fields = explode( '|', $post_array[ 'post_content' ] );

    // Sometimes people forget a pipe, and we don't want to put the entire post in the headline
    if ( is_array( $exploded_fields ) && count( $exploded_fields ) >= 2 ) {

        // Save the old title in case you want to do something with it
        $old_title = $post_array[ 'post_title' ];

        // Set the title to the text before the first pipe
        $post_array[ 'post_title' ] = strip_tags( $exploded_fields[ 0 ] );

        // Remove the headline from the content
        unset( $exploded_fields[ 0 ] );

        // Now restore the rest of the post content (keeping any later pipes) and save it
        $post_array[ 'post_content' ] = implode( '|', $exploded_fields );
    }

    return $post_array;
}

I haven’t tested it but the above code should do the trick.
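To try the splitting logic without a WordPress install, here is a minimal sketch of the same idea as a plain function. The function name and the sample post are made up for illustration, and it uses explode() with a limit of 2 instead of the unset/implode steps, which should give the same result.

```php
<?php
// Hypothetical helper (not part of the plugin): split "Headline|Body" content
// into a title and a body without needing WordPress loaded.
function split_headline_from_content( array $post_array ): array {
    // A limit of 2 means only the first pipe separates headline from body;
    // any later pipes stay in the body untouched.
    $parts = explode( '|', $post_array['post_content'], 2 );

    // No pipe found: leave the post exactly as it was.
    if ( count( $parts ) < 2 ) {
        return $post_array;
    }

    $post_array['post_title']   = trim( strip_tags( $parts[0] ) );
    $post_array['post_content'] = $parts[1];

    return $post_array;
}

// Made-up sample data for illustration.
$sample = array(
    'post_title'   => 'budget-story-draft',
    'post_content' => 'Council passes budget|The city council voted Monday | after a long debate.',
);
$result = split_headline_from_content( $sample );
```

With the sample above, the headline becomes the text before the first pipe and the second pipe stays in the body.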

Coming soon: Details on moving stories from WordPress to InDesign!

Update to Google Docs to WordPress plugin

A few weeks ago, unbeknownst to me, Google made a (wise) change to its APIs to require SSL for all API calls. This caused parts (but not all, apparently) of the Docs to WordPress plugin to stop working.

This evening I released an update to that plugin that should resolve any problems people had. In addition, I made a small change to make bold and italics come through more consistently.

As always, you can download the latest version in the WordPress Plugin Repository.

Updated Docs to WordPress plugin: Now with better formatting

Version 0.3-beta of our Docs to WordPress plugin has been released. It’s a fairly minor release, and only affects people using the cleaner extension, but we just rolled it out internally this morning at the BDN and it’s exciting for us, so I wanted to pass it on to the public.

The newest version of the cleaner extension correctly parses bold and italic text and headings, such as <h4>. It also strips out span tags to avoid a few instances where they would cause an extra line break.

In an earlier version of Docs, bold and italic text were surrounded by <strong> and <em> tags, as would be expected. That made it very easy to handle styled text.

In the newest version of Docs, however, text formatting is done completely via CSS. So the header of each of the HTML versions of the Doc contains a stylesheet, where bold and italic text is given an arbitrary class name.

To carry that through to WordPress, I use preg_match to find the correct class name from the stylesheet and then preg_replace to convert the matching span tags to <em> and <strong>. Thanks much to Andrew Nacin, Rob Flaherty and Bill, who replied to my cry for help.
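Roughly, that approach could look like the sketch below. This is not the plugin's actual code: the helper names are hypothetical, and it assumes the exported stylesheet contains rules written exactly like `.c3{font-weight:bold}` and `.c4{font-style:italic}`, which may not match what Docs emits today.

```php
<?php
// Hypothetical helper: find the class whose CSS rule contains the given
// declaration (e.g. "font-weight:bold"). Returns null when nothing matches.
function find_class_for_style( string $html, string $declaration ): ?string {
    $pattern = '/\.([A-Za-z0-9_-]+)\s*\{[^}]*' . preg_quote( $declaration, '/' ) . '[^}]*\}/';
    return preg_match( $pattern, $html, $m ) ? $m[1] : null;
}

// Hypothetical helper: replace <span class="... CLASS ...">text</span>
// with <TAG>text</TAG>. Nested spans are not handled.
function convert_spans( string $html, string $class, string $tag ): string {
    $pattern = '/<span class="(?:[^"]*\s)?' . preg_quote( $class, '/' )
        . '(?:\s[^"]*)?">(.*?)<\/span>/s';
    return preg_replace( $pattern, '<' . $tag . '>$1</' . $tag . '>', $html );
}

// Made-up sample document for illustration.
$html = '<style>.c0{color:#000}.c3{font-weight:bold}.c4{font-style:italic}</style>'
    . '<p><span class="c0 c3">Jane Doe</span> said <span class="c4">maybe</span>.</p>';

$bold   = find_class_for_style( $html, 'font-weight:bold' );
$italic = find_class_for_style( $html, 'font-style:italic' );

$out = convert_spans( $html, $bold, 'strong' );
$out = convert_spans( $out, $italic, 'em' );
```

A regex is fragile against nested or unusual markup, so a DOM parser would likely be sturdier, but this matches the preg_match/preg_replace approach described above.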

As always, the Docs to WordPress plugin can be found in the WordPress Plugin Repository.

Appeal for help, because I’m terrible at regex

One of the most annoying things about using Google Docs is that none of the styles are inline. It used to be that bold text was wrapped in a <strong> tag and italic text was wrapped in an <em> tag. No longer. Now each style of text is wrapped in a span with a number of different classes applied to it. Those styles don’t carry through when we bring the text into WordPress, and the names of the classes vary from article to article. This can be very annoying for columnists who bold names of subjects, for example.

So, what I’m looking for is a regular expression to turn <span class="c0 c3">My text</span> into <span class="c0 c3"><strong>My text</strong></span>, where class c3 is the bold class, for example.
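One possible answer, offered as a sketch and not a tested solution: capture the opening span tag, the inner text and the closing tag separately, then put <strong> around the middle group. The function name is hypothetical, and the pattern assumes the bold class is already known and that spans are not nested.

```php
<?php
// Hypothetical helper: keep the span, but wrap its contents in <strong>
// whenever the span's class list includes the given bold class.
function bold_spans_with_class( string $html, string $bold_class ): string {
    $class   = preg_quote( $bold_class, '/' );
    // Group 1: opening tag, group 2: inner text, group 3: closing tag.
    $pattern = '/(<span class="(?:[^"]*\s)?' . $class . '(?:\s[^"]*)?">)(.*?)(<\/span>)/s';
    return preg_replace( $pattern, '$1<strong>$2</strong>$3', $html );
}

$wrapped = bold_spans_with_class( '<span class="c0 c3">My text</span>', 'c3' );
// $wrapped is now: <span class="c0 c3"><strong>My text</strong></span>
```

Spans whose class list lacks c3 (including look-alikes such as c30) are left alone.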

OK, so you’ve built your site. Now what? Focus on performance. #wcbos

Here at the BDN, we’re all a bit relieved to finally have completed the relaunch of And one of our tasks now is to clean up all the mistakes we’ve made and all the sloppy code we’ve written while rushing toward the finish line.

Frederick Townes, CTO of and the man behind the must-use W3 Total Cache plugin, gave a great talk at WordCamp Boston this weekend about performance enhancements, and I thought it’d be good to repeat some of his points and add a few of my own.

New Plugin: Define SSL Pages

Just released: A very simple plugin that allows you to require SSL for certain pages on your site. For example, at the BDN we moved the login form from /wp-login.php to /login/, and wanted to require SSL for that page. So, using this plugin, we can force anyone who visits to

It’s in the WordPress plugin repository.
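For a sense of the idea, here is a minimal sketch of how a redirect decision like this might be made. It is not the plugin's code: the function name, its parameters and the example host are all assumptions for illustration.

```php
<?php
// Hypothetical helper: return the HTTPS URL to redirect to, or null when
// the request is already secure or the path is not on the SSL list.
function ssl_redirect_target( string $host, string $request_uri, bool $is_https, array $ssl_paths ): ?string {
    if ( $is_https ) {
        return null;
    }

    // Compare only the path, ignoring any query string.
    $path = parse_url( $request_uri, PHP_URL_PATH );
    if ( ! is_string( $path ) ) {
        return null;
    }

    // Normalize so "/login" and "/login/" are treated the same.
    $path = '/' . trim( $path, '/' ) . '/';

    foreach ( $ssl_paths as $ssl_path ) {
        if ( $path === '/' . trim( $ssl_path, '/' ) . '/' ) {
            return 'https://' . $host . $request_uri;
        }
    }

    return null;
}

// Example with a made-up host: an insecure visit to /login/ gets an HTTPS target.
$target = ssl_redirect_target( 'example.com', '/login/?redirect_to=%2F', false, array( '/login/' ) );
```

Inside WordPress, the HTTPS flag would presumably come from is_ssl() and the result would be passed to wp_redirect().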

Form ever follows function, eventually

Our online editor, Will Davis, has been explaining how we flowed text from Google Docs to WordPress to print and created a low-cost, portable front-end system for our newsroom at Bangor Daily News. I wanted to tell you a little about why we did it.

If you read what Jeff Jarvis, Chuck Peters and Clay Shirky have been writing about decaying newsrooms and the need for new models, it’s hard to believe they weren’t taking notes in some darkened corner of ours, perhaps in Sports behind the stacks of old game results and Mountain Dew cans.

Like many newsrooms, until very recently we were production heavy because we had to be. Moving stories to the web was a copy-and-paste affair, but that’s not where the trouble started. If you begin with a print-directed front-end system, as we did, how does that system accommodate a story being updated from the field? Or how would the full possibility of story assets land online, to be chosen among for print? Even simpler: When do reporters add links? The answers, as countless journalists know, are: It can’t; they won’t; they don’t. From there, it’s all production, not creation.

As we lost staff to cutbacks over the years, assembling our content into finished products was taking a larger and larger percentage of our time. Simply processing press releases seemed to suck up significant portions of editors’ days. No one wanted to be in this situation, but our infrastructure for moving content demanded it. We were trapped.

We needed reporters to get out of the tools they had been using for more than a decade to drive toward single shift-end deadlines. We needed to simplify the connections between what reporters wrote and what the public saw. We needed to link our bureaus so that they were much more a part of the daily news flow; mobility, so that any staff member with a cell phone could file from anywhere; web archiving that allowed us to expand on stories and retrieve content below the level of a story — in brief, we needed to match the way our audience now acquires information. Also, we didn’t have any money for this project.

Then along came JRC’s Ben Franklin Project, pointing the way. We had begun using Google Docs in our new media department in 2007, when the department was created and we suddenly had to keep and share records on web development, ad sales and commissions, good ideas and meeting minutes. The potential of Docs as a front-end newsroom system became apparent as Google improved its product and we needed somewhere reporters could store and organize notes, interviews, story ideas and all the rest. The WordPress CMS, as good as it is, didn’t seem like the place to do that.

As the newsroom has grown comfortable with Docs, it is becoming more efficient (links and headlines, for instance, travel from Docs to WordPress) and we are shifting staff members from production to content creation. We knew we had a winner in Docs when we had a major election story with two reporters in the field and an editor in the newsroom, all working simultaneously on the same breaking story, adding content, seeing in real time what each was adding, talking to each other through the chat function and responding with updated information. Fast, simple, low cost.

We’re a long way from done. We’re still working on ways to present data and extract pieces of story content to create a coherent, useful whole; and we are just beginning the process of providing our audience a range of tools to contribute their own content. But in the newsroom, the guiding ideas we have put into practice are to match the tool to the job we need done (rather than the reverse), reduce the number of steps required and anticipate how our audience will want the information next. And the cost should be next to nothing.

Quick update to the Docs to WordPress plugin

We’ve released a quick update to the Docs to WordPress plugin that allows the plugin to integrate with WP Cron rather than requiring you to create your own file to run cron against (though you can still do that, too). To take advantage of it, you’ll have to upgrade to the latest version and edit your wp-config file to define at least three variables:

DOCSTOWP_USER: Your Google Docs username

DOCSTOWP_PASS: Your Google Docs password

DOCSTOWP_ORIGIN: The ID of the origin folder you want to draw the docs from

DOCSTOWP_DESTINATION: This one’s optional, but if defined it will put the docs in this folder when everything is finished processing.

To define a variable:

[php]define( 'DOCSTOWP_USER', '' );[/php]

To get the ID of the folder, look at the folder’s URL. The ID starts after the 0 and the period.

A quick note: I decided to hardcode the variables in wp-config instead of using a setting because I’d rather store a plaintext password on the server than in the database. Obviously, neither is ideal, so if anybody has a better solution I’m all ears.
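One commonly used alternative, sketched here as an assumption and not something the plugin requires, is to have wp-config read the values from environment variables set in the web server or shell. The helper name below is hypothetical. The password still lives somewhere on the server, but it stays out of the file itself and out of version control.

```php
<?php
// Hypothetical helper: read a setting from the environment, falling back
// to a default when the variable is not set.
function docstowp_env( string $name, string $default = '' ): string {
    $value = getenv( $name );
    return ( false === $value ) ? $default : $value;
}

// The constant names are the plugin's; taking their values from the
// environment is this sketch's assumption.
define( 'DOCSTOWP_USER', docstowp_env( 'DOCSTOWP_USER' ) );
define( 'DOCSTOWP_PASS', docstowp_env( 'DOCSTOWP_PASS' ) );
define( 'DOCSTOWP_ORIGIN', docstowp_env( 'DOCSTOWP_ORIGIN' ) );

// Optional: only define the destination folder when one is configured.
if ( '' !== docstowp_env( 'DOCSTOWP_DESTINATION' ) ) {
    define( 'DOCSTOWP_DESTINATION', docstowp_env( 'DOCSTOWP_DESTINATION' ) );
}
```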

Chronicling the BDN on WordPress