Note: The post below is extremely geeky and probably not interesting to anyone except those who would like to follow along with the progress of HOW I'm implementing a Vox export tool. If you're just interested in hearing when I'm done with it, this is not the post for you – that'll come soon.
I'm laying this out more to organize my own thoughts than to explain how the export tool is finally going to work. The good news is I have a tentatively working solution that will theoretically import a full Vox blog onto a self-hosted WordPress installation. The bad news is that the solution I have in mind will NOT work for (free-hosted) WordPress.com installations, so I'm still trying to figure out an alternative for those. Preferably one that does not involve someone having to find a friend with access to a self-hosted version to do an intermediate conversion for them.
After countless hours (days? weeks?) of half-assed research online, here's a summary of what I've come up with regarding exporting from Vox (VoxPorting? Anyone got a better name for the eventual tool I'll be posting?)
- Blogging services SUCK at standardizing on an export format. Every single one of them is different. Likewise, almost all of them try to trap you into their service by only importing their own export types and/or only exporting a type that is incompatible with other services. This means people have to get crafty if they want to jump from one platform to another, especially if they do it more than once.
- The big contenders for free (hosted) blogging services out there seem to be (in no particular order): Vox, LiveJournal, Blogger, and WordPress (hosted on WordPress.com). Yes, MySpace and its clones exist, and no, I'm not going to even try to get content over on to them.
- Additionally, you've got WordPress (self-hosted) and Movable Type (self-hosted), which are free but require you to host them somewhere.
- Paid services exist (TypePad, etc.) but since they require you to front money, I'm not focusing on trying to export to them.
- That being said, looking at the free services, I've found the following:
- I'm not looking to import into Vox, since that's obviously contrary to the whole point of a Vox export tool. I believe there are easier ways to migrate content from one Vox account to another than exporting/importing. That being said, if you're just trying to back up your Vox blog, you can either use BlogBackupOnline (to back up online only) or Simon Wistow's VoxSlurp (to back up to an .mbox file) – more on these in another post.
- Apparently exporting to a file to import to LiveJournal is out, as LJ doesn't even appear to be able to import its own export files. Unless you're planning to repost every individual post on LJ, probably not an option. I'm not even considering this at the moment.
- Blogger only imports "Blogger export files". There are solutions out there that seem to use Blogger APIs to get around this limitation, but this looks like A LOT of work. I looked at what the Blogger export files look like and don't know that I can forge one to duplicate a Vox account onto a Blogger blog. Holding this out as a last-resort option, especially as there seems to be an alternative (see a couple of bullets down).
- WordPress (self-hosted or on WordPress.com) seems to be the most likely choice. I've had success importing an RSS feed from Vox to a self-hosted WordPress blog. It would be fairly trivial to expand this to create a custom RSS .xml file to encompass a full Vox blog, and import that into a new WordPress blog. HOWEVER, WordPress.com blogs (free-hosted) do NOT have "import from RSS" as one of their options (for some bizarre reason). Instead:
- WordPress.com imports from WordPress export files, called WXR (WordPress eXtended RSS). Both self-hosted and free-hosted solutions export to WXR files, and both can import from the other (I believe). Furthermore, once you've got a WXR file, you can use a solution to convert this into a Blogger-compatible format to import to Blogger! Sounds like the winner, if I can figure out how to properly create a WXR file from a Vox blog. Except documentation on the WXR format seems to be pretty much nonexistent, so the only way to figure it out is to analyze an existing blog's export file, read the WordPress import code, and experiment. Not the ideal way to make sure I'm doing it correctly, and definitely a way that's going to take more time to complete.
- One added benefit to doing a WXR file – if I set it up properly, I could actually scrape the Vox blog posts for comments, and forge new comments to be imported along with the blog posts – this way, not only would you be importing your hard work to a new blog, you'd be carrying along the comments (which oftentimes are as informative/entertaining as the original post!) Currently the plan is to do the first pass with just blog posts, and then once I get that up and running, consider devising the import w/ comments. The big problem is my approach to getting the content off the Vox blog will vary tremendously depending on whether or not I'm capturing comments – if I am, I have to do the much more tedious (and much slower) page-scraping, as opposed to taking advantage of the Vox RSS feeds that I would be using for the other non-comment method. I'm not sure I'd want to commit to doing a page-scrape for every Vox export – I currently am doing that for my Picture and MP3 backup tool and it takes a bit of time – this would be even worse, given that some people have thousands of posts on Vox.
- Movable Type also seems to be able to import WXR files. Definitely looks like WXR is the way to go, and then provide that file to the user for their use in importing to WordPress or MT (directly) or Blogger (via the converter).
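For the "custom RSS .xml file" idea mentioned above, here is a minimal sketch in Python of what stitching a pile of posts into one RSS 2.0 file might look like. Everything specific in it is an assumption for illustration: the function name `build_rss`, the dictionary keys (`title`, `link`, `body`, `published`), and the example URLs are all hypothetical, and it assumes the posts have already been pulled off Vox by some other step. Whether the WordPress RSS importer is happy with exactly this shape is something that would still need testing.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(blog_title, blog_url, posts):
    """Combine already-fetched posts into one RSS 2.0 document (as a string).

    `posts` is a list of dicts with hypothetical keys: title, link, body,
    and published (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = blog_title
    ET.SubElement(channel, "link").text = blog_url
    ET.SubElement(channel, "description").text = "Full export of " + blog_title

    # Oldest post first, so an importer that reads items in order
    # recreates them chronologically.
    for post in sorted(posts, key=lambda p: p["published"]):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "link").text = post["link"]
        ET.SubElement(item, "guid").text = post["link"]
        # RSS 2.0 expects RFC 822 style dates.
        ET.SubElement(item, "pubDate").text = format_datetime(post["published"])
        # ElementTree escapes the HTML body when serializing.
        ET.SubElement(item, "description").text = post["body"]

    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample data; real posts would come from the Vox feeds.
    sample_posts = [
        {
            "title": "Second post",
            "link": "http://example.vox.com/library/post/second.html",
            "body": "<p>Later entry</p>",
            "published": datetime(2010, 1, 2, 12, 0, tzinfo=timezone.utc),
        },
        {
            "title": "First post",
            "link": "http://example.vox.com/library/post/first.html",
            "body": "<p>Earlier entry</p>",
            "published": datetime(2010, 1, 1, 12, 0, tzinfo=timezone.utc),
        },
    ]
    print(build_rss("Example Vox Blog", "http://example.vox.com/", sample_posts))
```

This only covers plain posts. Comments have no home in a bare RSS item, which is part of why WXR looks like the better target in the long run.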
Since I know you CAN import to a self-hosted WordPress blog from Vox and then export that right back out to a WXR, the cynical part of me says I should post this solution and then people who self-host can go ahead and import, and people who don't can find someone to do it for them. Heck, I might even go ahead and do this as an intermediate step to the final solution. But in the end, I don't want to create half a solution and have most of the users have to fend for themselves. People shouldn't be penalized just because they signed up for a free blog on Vox and now want to have a free blog somewhere else instead.