Migrating legacy content to WordPress (is pretty easy)

Recently I encountered a situation where I wanted to move some content to a new WordPress installation (not this one) from a piece of forum software that hadn't been attended to or upgraded in, you know, half a decade or so. Normally I would hit Google to try and find a solution for a problem like this, but I figured my particular circumstances were sufficiently esoteric that I could likely sort it out more quickly on my own. And it became a convenient opportunity to test and flex some basic XSLT muscle memory.

In any data migration process the first questions you need to ask yourself are "where am I?" and "where am I going?" The answer to the first question was unambiguous: a MySQL database. The answer to the second question, writ large, was also a MySQL database. But as any self-respecting programmer knows, you can't just take some stuff from one database and insert it into another database willy-nilly without fully expecting to…well, break shit. Badly. So how does one get data into an application without directly injecting it into the database…and without first understanding every single interdependency in the entire database (a scope of knowledge that is neither reasonable nor feasible in this context)? In the case of WordPress, the most obvious route seemed to be via the stock XML import/export plugin. So I had my migration path: MySQL DB → (Database) XML → XSLT → (WordPress) XML → (WordPress) MySQL DB

Querying & exporting the data

The content of interest resided in two tables: topics & posts. It was a very simple SQL JOIN to extract the data I wanted (putting the data of greatest interest in the leading columns then everything else, often redundantly, after):

-- One row per topic: join each topic to its opening post in forum 1.
SELECT t.tid, t.title, t.description, t.starter_name,
    FROM_UNIXTIME(t.start_date) AS `unixtime_start_date`, p.post, t.*, p.*
FROM `ibf_1_1_2_topics` t
INNER JOIN `ibf_1_1_2_posts` p ON p.`topic_id` = t.`tid`
WHERE p.`forum_id` = 1 AND p.`new_topic` = 1;

This query returned all the data of interest, and because I had run it in phpMyAdmin, exporting the result as XML took only a couple more clicks. Here's an abbreviated and lorem-ipsum-ized version of the output: generic_database_export.xml
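For anyone who wants to poke at an export like this before writing a transform, here is a minimal Python sketch that reads the rows into dictionaries. It is not part of the original workflow. The sample document is hypothetical and assumes the layout recent phpMyAdmin versions tend to produce (one `<table>` element per row, with `<column name="...">` children); an older export may be shaped differently, so treat the element names as assumptions. `read_rows` is a name invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, assumed to resemble a recent phpMyAdmin XML export.
# The real generic_database_export.xml may use different element names.
SAMPLE_EXPORT = """<pma_xml_export version="1.0">
  <database name="forum">
    <table name="query_result">
      <column name="tid">1</column>
      <column name="title">Lorem ipsum</column>
      <column name="starter_name">alice</column>
      <column name="unixtime_start_date">2005-03-01 12:00:00</column>
      <column name="post">Dolor sit amet.</column>
      <column name="title">Lorem ipsum (duplicate from t.*)</column>
    </table>
  </database>
</pma_xml_export>
"""


def read_rows(xml_text):
    """Return one dict per exported row, keyed by column name."""
    root = ET.fromstring(xml_text)
    rows = []
    for table in root.iter("table"):
        row = {}
        for col in table.findall("column"):
            # The query selects t.* and p.* after the leading columns, so
            # names can repeat; setdefault keeps the first (leading) value.
            row.setdefault(col.get("name"), col.text or "")
        rows.append(row)
    return rows


rows = read_rows(SAMPLE_EXPORT)
```

Keeping only the first occurrence of each column name mirrors the query's design, where the columns of greatest interest come first and the redundant ones follow.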

Transforming the data

So I had the (Database) XML and all I needed now was the XSLT to transform it to the (WordPress) XML for import. Since I had zero knowledge of the WordPress XML import/export format, I exported as XML the default data that are created when you first install WordPress (i.e., the "Hello World" post) and used that as the target for the transform output. Here's the transform and here's the XML output.
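The mapping itself is small enough to sketch. The version below is in Python rather than XSLT, so it is a stand-in for the idea and not the transform linked above. It assumes each row is a dict with the leading column names from the query, and it emits only a handful of the elements a WordPress export contains. The namespace URIs follow the WXR 1.2 convention, which is an assumption about the target installation; compare against a file exported from your own site before importing anything. `rows_to_wxr` and `q` are names invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as used by WXR 1.2 exports. The version is an assumption;
# check the header of an export from your own WordPress installation.
NS = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "wp": "http://wordpress.org/export/1.2/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Build a namespace-qualified tag name for ElementTree."""
    return "{%s}%s" % (NS[prefix], tag)


def rows_to_wxr(rows):
    """Map forum rows (dicts) onto a minimal WordPress-style RSS document."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, q("wp", "wxr_version")).text = "1.2"
    for row in rows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = row["title"]
        ET.SubElement(item, q("dc", "creator")).text = row["starter_name"]
        # ElementTree escapes the body instead of wrapping it in CDATA,
        # which is equivalent as far as an XML parser is concerned.
        ET.SubElement(item, q("content", "encoded")).text = row["post"]
        ET.SubElement(item, q("wp", "post_date")).text = row["unixtime_start_date"]
        ET.SubElement(item, q("wp", "status")).text = "publish"
        ET.SubElement(item, q("wp", "post_type")).text = "post"
    return ET.tostring(rss, encoding="unicode")


wxr = rows_to_wxr([{
    "title": "Lorem ipsum",
    "starter_name": "alice",
    "post": "Dolor sit amet.",
    "unixtime_start_date": "2005-03-01 12:00:00",
}])
```

A real import will likely want more fields than this (channel metadata, author records, slugs, comment status), which is exactly why exporting the "Hello World" post and copying its structure is the safer template.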

Final thoughts

Quick & Dirty were very much the key words here. If I could have copy/pasted the content in question faster than I could have extracted and reconstituted and imported it, then this would have largely been an exercise in XSLT practice rather than a successful use of technology to simplify and expedite.