Jumping on the Jekyll Bandwagon

24 Jun 2013

Regular readers of this blog may have noticed the new look and feel of the site that was rolled out a couple of weeks ago. Hope you like it!

These changes are not purely cosmetic, however: I also migrated from WordPress to Jekyll, a Markdown-aware static website generator. We use Jekyll for the Firebase Blog, and I thought it was about time I switched my personal blog over as well.

Why switch?

The first step was to import all my existing content. Jekyll offers a WordPress import tool, but I found it rather lacking. The resulting Markdown files weren’t formatted correctly, and a lot of metadata was missing. After looking around a little more, I found exitwp, a more full-featured tool that claimed to preserve as much data as possible.

After exporting all your posts and pages from WordPress, you may need to make a small tweak to the resulting XML file (if you’re exporting from wordpress.com): add a namespace declaration, xmlns:atom="http://www.w3.org/2005/Atom", to the top-level <rss> element. Then, tweak exitwp’s config.yaml file to your liking. Mine looked like this:

target_format: markdown
date_format: '%Y-%m-%d %H:%M:%S'
download_images: True
item_type_filter: {attachment, nav_menu_item}
item_field_filter: {status: draft}
taxonomies:
  filter: {category}
  entry_filter: {category: Uncategorized}
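If you'd rather not edit the export by hand, the namespace tweak described before the config can be scripted. What follows is only a minimal sketch in JavaScript (runnable with Node): addAtomNamespace is a hypothetical helper, not part of exitwp or WordPress, and it assumes the attributes on the <rss> tag contain no literal > characters.

```javascript
// Hypothetical helper (not part of exitwp): adds the Atom namespace
// declaration to the first opening <rss ...> tag of a WordPress export.
function addAtomNamespace(xml) {
  var ATOM_NS = 'xmlns:atom="http://www.w3.org/2005/Atom"';
  // Without the "g" flag, replace() only touches the first match.
  return xml.replace(/<rss\b([^>]*)>/, function (tag, attrs) {
    // Leave the tag alone if the namespace is already declared.
    if (attrs.indexOf('xmlns:atom=') !== -1) {
      return tag;
    }
    return '<rss' + attrs + ' ' + ATOM_NS + '>';
  });
}

// Example input: a stripped-down stand-in for a real export file.
var sample = '<?xml version="1.0"?>\n<rss version="2.0">\n<channel></channel>\n</rss>';
console.log(addAtomNamespace(sample));
```

To apply it to a real export you would read the file into a string, pass it through the function, and write the result back out before running exitwp.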

exitwp will also download all the images contained in your posts, which is really convenient! Then run the script:

$ python exitwp.py
writing...........
done

Now that you have a bunch of Markdown files (they’ll be in build/jekyll/<domain>/_posts), you can bootstrap your Jekyll site:

$ jekyll new kix
$ mv exitwp/build/jekyll/kix.in/_posts/* kix/_posts/
$ cd kix && jekyll serve

Access your brand new site at http://localhost:4000. Jekyll comes with a simple default theme (the one used by Tom Preston-Werner), so you can see if your posts were all imported correctly straight away. You may want to customize the look and feel of your website. Just start editing _layouts/default.html to your liking! I used the Pure CSS framework, some icons from Topcoat and the color scheme from Solarized to build the layout for this website.

The one thing I couldn’t figure out how to automate was making sure syntax highlighting for code snippets worked correctly. Code blocks are kind of a mess in Jekyll right now, especially if you’re going to be hosting on GitHub Pages. In order to use GitHub-style fenced code blocks, you’ll have to switch to using redcarpet as the Markdown parser:

markdown: redcarpet
redcarpet:
  extensions: ["no_intra_emphasis", "fenced_code_blocks"]

Unfortunately, I also wanted to enable SmartyPants to display beautiful punctuation. GitHub Pages has enabled SmartyPants, but only if you use rdiscount. Since I had to go in and convert all my code snippets to either fenced code blocks or Liquid tags anyway, I opted to enable rdiscount with SmartyPants and use Liquid tags to highlight code.
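For snippets that are already fenced, switching them over to Liquid highlight tags is mechanical enough to script. The sketch below is one possible approach, not what I actually ran: fencedToLiquid is a hypothetical helper, it only recognizes three-backtick fences that start at the beginning of a line, and falling back to the "text" lexer for unlabeled blocks is my own assumption.

```javascript
// Hypothetical helper: rewrites three-backtick fenced code blocks as
// Jekyll {% highlight %} Liquid tags. Assumes fences start at column 0.
function fencedToLiquid(markdown) {
  // Opening fence with an optional language name, a lazily matched body,
  // then a closing fence on a line of its own.
  var fence = /^`{3}[ \t]*([\w+-]*)[ \t]*\n([\s\S]*?)\n`{3}[ \t]*$/gm;
  return markdown.replace(fence, function (match, lang, body) {
    // Assumption: unlabeled blocks fall back to the plain "text" lexer.
    var name = lang || 'text';
    return '{% highlight ' + name + ' %}\n' + body + '\n{% endhighlight %}';
  });
}

// Build the sample with an explicit fence marker so this listing
// never contains a literal fence of its own.
var mark = '`'.repeat(3);
var post = 'Intro\n' + mark + 'js\nvar a = 1;\n' + mark + '\nOutro';
console.log(fencedToLiquid(post));
```

Indented code blocks and anything exitwp mangled during the export would still need fixing by hand.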

Now for comments. Since Jekyll generates static HTML pages, the only option is to use Disqus (or a similar JS-only comment system - I haven’t come across any yet). Disqus makes it really easy to import your WordPress comments: just upload your wordpress.xml in the admin interface and you should be good to go in a few hours.

Setting up a feed is really easy. Jekyll automatically runs every file that starts with “---” through its processor, so a feed.xml template at the root of the site that begins with that marker is all you need. You can repeat this process to create separate RSS and Atom feeds if you wish.

Finally, permalinks. The URL syntax for my site has (regrettably) changed over the years, so there are lots of incoming links with varying permalink syntaxes. I wanted to make sure these didn’t break. WordPress did a great job of redirecting broken links as necessary, but with Jekyll you’ll need to use a plugin like the Alias Generator. Unfortunately, GitHub Pages does not support plugins, so I had to use the rather hacky approach of a custom 404 page.

Whenever you arrive at a broken link, GitHub will render the custom 404 page instead. This doesn’t help search engines, of course - you’d need a 301 redirect for that - but at least for users I can do a dynamic redirect to the closest match. Since the slugs always match across multiple permalink syntaxes, I can do something simple like this:

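// Map each post's slug to its current URL (filled in by Liquid at build time).
var toRedirect = {};
{% for post in site.posts %}
  toRedirect["{{ post.slug }}"] = "{{ site.url }}{{ post.url }}";
{% endfor %}
var url = document.location.href;
for (var key in toRedirect) {
  // The first slug that appears anywhere in the requested URL wins.
  if (url.indexOf(key) !== -1) {
    // Reveal the element with id "redirect", then redirect after two seconds.
    document.getElementById("redirect").style.display = "block";
    setTimeout(function() {
      document.location.href = toRedirect[key];
    }, 2000);
    break;
  }
}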

One caveat with this approach is that GitHub only recognizes a static 404.html as a valid custom 404 page - it won’t run Jekyll on it. You’ll need to run jekyll build yourself to generate the HTML file locally, before checking it in.
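One thing to watch with the slug lookup in the 404 script: it stops at the first slug found in the URL, so a short slug that happens to be a substring of a longer one can win by accident. A possible refinement, sketched here under my own assumptions, is to prefer the longest matching slug. findRedirect is a hypothetical function, and the slugs and example.com URLs are made up for illustration.

```javascript
// Hypothetical refinement: return the target URL for the longest slug
// contained in the requested URL, or null when no slug matches.
function findRedirect(url, toRedirect) {
  var best = null;
  for (var slug in toRedirect) {
    // Skip anything inherited from the prototype chain.
    if (!Object.prototype.hasOwnProperty.call(toRedirect, slug)) continue;
    // Skip slugs that do not appear in the URL at all.
    if (url.indexOf(slug) === -1) continue;
    // Keep the longest match seen so far.
    if (best === null || slug.length > best.length) best = slug;
  }
  return best === null ? null : toRedirect[best];
}

// Made-up data: both slugs occur in the old-style URL below.
var posts = {
  'jekyll': 'http://example.com/2013/01/01/jekyll/',
  'jumping-on-the-jekyll-bandwagon': 'http://example.com/2013/06/24/jumping-on-the-jekyll-bandwagon/'
};
console.log(findRedirect('http://example.com/blog/jumping-on-the-jekyll-bandwagon.html', posts));
```

In the 404 page itself, the for loop would be replaced by a single call to this function, redirecting only when the result is not null.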

That’s it! You’ll notice that Jekyll is a fairly low-level engine: it leaves a lot of room for customization, but also requires a fair bit of work upfront. If you’re looking for something that’s dedicated to blogging and easier to get started with, check out Octopress. I haven’t used it, but it seems to be quite popular.