2010-02-14
Dealing with my archived Blogger posts
I've stopped using Blogger for the weblog and have moved to using Jekyll.
I've set things up so I can edit posts on my local machine and push to my server using git, which then publishes the new post.
Blogger provides a way to export all existing posts and comments as an XML file. I used this to manually import a few posts to test out Jekyll, but then decided it was way too much work to convert everything. Instead I opted to create posts that link to my original Blogger ones, so at least the archive and tag lists in Jekyll will show the old posts and their titles.
To import the existing archived Blogger posts I wrote a program that reads the Blogger export file, extracts the title, tags/categories and existing URL of each post, and creates Jekyll equivalents. It seems to have worked OK and I'll manually fix up any problems if I find them.
I wrote the importer in the Pure Programming Language to have a play with that language. I dabbled with it when it first came out but this was my first time using it in anger. It worked out pretty well. The core of the code to write out the posts looks like:
// Emit the "tags:" key only when the post has at least one tag.
write_tag_header fd [] = ();
write_tag_header fd (x:xs) = fprintf fd "tags:\n" ();

// Write each tag as a YAML list item.
write_tags fd (x:xs) = fprintf fd " - %s\n" x $$
                       write_tags fd xs;
write_tags fd [] = ();

// Write a stub Jekyll post that links back to the original Blogger post.
create_post post = fprintf fd "---\n" () $$
                   fprintf fd "layout: post\n" () $$
                   fprintf fd "title: %s\n" title $$
                   write_tag_header fd tags $$
                   write_tags fd tags $$
                   fprintf fd "---\n" () $$
                   fprintf fd "Original Post [%s](%s)\n" (title,href) $$
                   fclose fd
when
  filename = post_filename post;
  fd = fopen filename "w";
  title = article_title post;
  href = article_url post;
  tags = article_tags post;
end;
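To make the result concrete, here is what create_post would write for one post. The title, tags and URL are made up for illustration; only the layout follows from the fprintf calls.

```text
---
layout: post
title: Hello World
tags:
 - pure
 - jekyll
---
Original Post [Hello World](http://example.blogspot.com/2009/01/hello-world.html)
```

A post without tags gets no tags: key at all. The title is written unquoted, so one containing a colon would probably need quoting by hand to remain valid YAML front matter.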
Basically I used the Pure XML library to get a list of posts (I used an XPath expression to get the relevant parts of each post from the Blogger XML). For each post I then call create_post on it:
map create_post all_posts
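The definitions of all_posts and the article_* accessors are not shown. The following is only a rough sketch of how they could look, not the code from the importer. It assumes that the pure-xml functions xml::load_file, xml::select, xml::node_content and xml::node_attr behave as that module's documentation describes, so check them against the installed version. It also assumes the Atom layout of a Blogger export. The file name blog.xml, the helper sel and the XPath expressions are hypothetical, and post_filename is left out.

```pure
using system, xml;

// Assumption: the Blogger export is an Atom feed, so the XPath
// expressions need a prefix bound to the Atom namespace.
let atom = ["a"=>"http://www.w3.org/2005/Atom"];

// "blog.xml" is a placeholder name for the downloaded export file.
let doc = xml::load_file "blog.xml" 0;

// The export holds posts, comments and settings as entries;
// keep only the entries whose kind category marks them as posts.
let all_posts = xml::select doc
  ("//a:entry[a:category/@term='http://schemas.google.com/blogger/2008/kind#post']",
   atom);

// Hypothetical helper: evaluate an XPath relative to one entry node.
sel post xpath = xml::select post (xpath, atom);

// Text of the entry's <title> element.
article_title post = xml::node_content (head (sel post "a:title"));

// href of the link to the published page. head fails on an entry
// that has no such link, so those would need separate handling.
article_url post =
  xml::node_attr (head (sel post "a:link[@rel='alternate']")) "href";

// Blogger labels are categories in the blogger.com scheme.
article_tags post =
  [xml::node_attr c "term" |
   c = sel post "a:category[@scheme='http://www.blogger.com/atom/ns#']"];
```

Since create_post is only called for its side effects, do create_post all_posts would also work here: do applies the function to every element like map, but returns () instead of collecting a list of results.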
I didn't go crazy and convert the actual HTML content of each post into the Markdown format I'm using for the Jekyll posts. Maybe that's a task for another day.