Tumblr Eats a Pelican: The Problem With Static Website Generators

Tl;dr: Don’t bother switching. Tumblr is better.

Guys, my blog is failing.

I’ve been blogging (via Tumblr) at http://www.rogueleaderr.com for two years and the Google Analytics data speaks for itself:

Except for one random post about subverting Megabus wifi, my posts hardly ever see more than 200 hits (and few of those visitors actually read to the end). I’m grateful for the readers I have, but I’m only getting marginally more amplification than shouting from a literal soapbox.

So I’ve decided to step up my blogging game

I’ve got some ideas I want to spread. And it’s time to find out whether no one else cares about those ideas or whether I’m just doing a bad job of broadcasting.

So I’m taking a few steps to step things up:

  1. Creating an email newsletter. (You can sign up here)
  2. Getting serious about promoting my blog posts.
  3. Redesigning my blog.

1 & 2 are topics for a different day. This post is about #3, my decision to switch from trusty Tumblr to a static website generator.

Spoiler: I will be wrecked by “what you know you know that just ain’t so…”

I should have listened to Mark Twain. That’s the main moral of this story. The longer story is that I let bad assumptions waste quite a lot of my time.

If you’ve spent any time on Hacker News et al., you’re probably familiar with the classic “hacker profile page”, as best exemplified by Tom Preston-Werner or Kenneth Reitz or Paul Graham. These inimitably classy and minimalist sites generally contain a short tagline-style bio of the hacker, a list of projects, a list of blog posts, and maybe a contact form or easter egg.

I may not have a tenth the skill of any of those chaps, but I know how to steal a good idea when I see one. I wanted my blog to look like theirs and so I assumed (falsely!) that I needed to build my blog the way they built theirs. That meant a static website generator since, after all, Tom Preston-Werner popularized the very idea of static website generators with his tool Jekyll.

If you’re not familiar, static website generators (hereafter “SWG’s”) let you store all the components of your website (including each individual blog post) as files in a directory (often in a markup language like Markdown). You run a script and presto, everything is compiled to interlinked HTML files which can be uploaded and served directly from a webserver or even for free from GitHub.
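To make the “run a script and presto” part concrete, here is a toy sketch in Python. It is not how Pelican or Jekyll work internally; it only illustrates the core idea of turning a directory of source files into interlinked HTML. The file layout (one `.txt` file per post, title on the first line) and the names `render_post` and `build_site` are assumptions made up for this example.

```python
import html
from pathlib import Path
from typing import List


def render_post(title: str, body: str) -> str:
    """Wrap one post in a bare-bones HTML page (a stand-in for a real theme)."""
    paragraphs = "\n".join(
        f"<p>{html.escape(block.strip())}</p>"
        for block in body.split("\n\n")  # blank lines separate paragraphs
        if block.strip()
    )
    safe_title = html.escape(title)
    return (
        f"<html><head><title>{safe_title}</title></head>\n"
        f"<body><h1>{safe_title}</h1>\n{paragraphs}\n</body></html>\n"
    )


def build_site(content_dir: Path, output_dir: Path) -> List[str]:
    """Compile every .txt post in content_dir to HTML, plus an index linking them.

    Returns the names of the generated post pages.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    pages = []
    links = []
    for source in sorted(content_dir.glob("*.txt")):
        # Hypothetical convention: first line is the title, the rest is the body.
        title, _, body = source.read_text(encoding="utf-8").partition("\n")
        page_name = source.stem + ".html"
        (output_dir / page_name).write_text(render_post(title, body), encoding="utf-8")
        pages.append(page_name)
        links.append(f'<li><a href="{page_name}">{html.escape(title)}</a></li>')
    index = "<html><body><ul>\n" + "\n".join(links) + "\n</ul></body></html>\n"
    (output_dir / "index.html").write_text(index, encoding="utf-8")
    return pages
```

Calling `build_site(Path("content"), Path("output"))` would leave a folder of plain HTML files that any webserver can serve as-is. Real generators add templating, themes, feeds and Markdown parsing on top of this loop.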

Static is the move.

As I started researching SWG’s, I quickly realized that there are a lot of options. The most popular seems to be the Ruby-based Jekyll, the original heavyweight. I read Ruby at a 3rd grade level and SWG’s in general seem to require a lot of tweaking, so I was inclined to prefer a Python-based option.

There are plenty of those as well, but I found a glowing post about one called Pelican so, in the absence of any clear compelling differences among the options, I decided Pelican was the way to go. It’s pretty simple to set up a basic Pelican site and even to deploy it publicly on GitHub Pages (probably 30 minutes’ work). But a few minutes later, the problems started.

First, I wanted to create an “About Me” page. That’s quite easy, if you already know how to do it. If you don’t, the Pelican docs (and limited tutorials available online) require some pretty close study.

Next, I wanted to make my blog pretty. Pelican ships with some off-the-shelf themes. IMHO, they’re all somewhere between ugly and “meh.” So I hopped over to CreativeMarket (with a big discount from AppSumo burning a hole in my pocket) and looked for a theme. There are hundreds of WordPress and Tumblr themes for sale, but nothing for Pelican and only a few for “generic HTML site”. Before accepting the cost of customizing one of those to fit Pelican, I googled and found a generic Svbtle ripoff theme for Pelican.

Sure it’s not responsive, but only 7% of my traffic is mobile. So…goodbye forever iPhone losers!

I need to build a link to the past

Next problem: my blog gets a small but non-trivial amount of inbound search traffic. Simply shutting down the old site would kill all traffic to those archives. So I needed to import my posts from Tumblr. But as I’ve learned the hard way on my current project LinerNotes, Googlebot ain’t too bright. Even if I redirect the rogueleaderr.com domain to my pretty new static site, any change in the URL’s of my old posts will at least break any external links to my posts and at worst will cause Google to penalize my posts as both broken links and duplicate content. Plus, cool URI’s don’t change.

A site deployed behind a webserver like nginx could use URL rewrites to redirect incoming URL’s (assuming the URL slugs could be preserved). But…GitHub Pages doesn’t allow redirects (except perhaps via JavaScript; I never got that far.)
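For what it’s worth, the mapping a rewrite rule would have to express is simple enough to sketch in Python. Tumblr permalinks have the shape `/post/<numeric id>/<slug>`; the sketch below assumes (purely for illustration) that the static site publishes each article at `/<slug>.html`. The function names and the sample paths are hypothetical, and I never actually deployed anything like this.

```python
import re
from typing import Dict, Iterable, Optional

# Tumblr permalinks look like /post/<numeric id>/<slug>; the slug part is optional.
TUMBLR_POST = re.compile(r"^/post/(?P<post_id>\d+)(?:/(?P<slug>[^/?#]+))?/?$")


def new_path_for(old_path: str) -> Optional[str]:
    """Map a Tumblr permalink path to the path a static site might use.

    Assumes articles live at /<slug>.html on the new site (an illustrative
    choice, not a requirement of any particular generator).
    """
    match = TUMBLR_POST.match(old_path)
    if match is None or match.group("slug") is None:
        return None  # not a post permalink, or no slug to build a new path from
    return f"/{match.group('slug')}.html"


def redirect_table(old_paths: Iterable[str]) -> Dict[str, str]:
    """Collect old -> new pairs for every path that can be mapped."""
    table = {}
    for old in old_paths:
        new = new_path_for(old)
        if new is not None:
            table[old] = new
    return table
```

A table like `redirect_table(["/post/123/my-post", "/about"])` could then be turned into whatever redirect syntax your webserver supports. Paths with no slug come back as `None`, which is exactly the case where old links would break.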

And I’m back to Tumblr

While googling for how to import from Tumblr to Pelican, I stumbled upon a Tumblr manual page and had a massive facepalm moment.

It turns out that Tumblr has a feature called “Pages” that lets you include and link to static pages right inside your Tumblog.

All you have to do is press a button and drop in a little HTML and you’ve got your own “About Me” page. In other words, basically all the customization power I was hoping to get from a static site is already available in Tumblr with none of the hassle of learning how to use a generator, managing the migration, and maintaining a separate site.

With about 45 minutes of work and the help of the Tumblr Svbtle Theme, I was able to create a site that looked and worked better than my Pelican site and preserved all my lovely links and Tumblr followers.

And it’s even responsive.

Moral of the story

Tumblr is a lot more powerful than I realized (as probably is WordPress, which I haven’t used). And Tumblr gives you an awful lot of bang for the setup buck. I’m now struggling to think of a use case that SWG’s allow and Tumblr doesn’t. (Maybe if you want lots of highly customized static pages, or if you want to run a webserver with complicated URL trickery.) If you’re publishing a lot of varied types of content, you’ll probably be better served by a full-blown CMS. SWG’s are only marginally less complicated to learn.

I’ve read worries about “what if Tumblr dies off like Posterous” – well, Tumblr allows export of your posts via an API, so you can deal with learning Pelican as soon as Yahoo writes off its $1bln investment.
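If that day comes, the conversion itself shouldn’t be much work. Here is a rough Python sketch that turns one exported post into a Pelican-style Markdown file with the `Title:`, `Date:` and `Slug:` header lines Pelican reads. The shape of the `post` dictionary (keys `title`, `date`, `slug`, `body`, and a date like `2013-06-01 12:34:56 GMT`) is my assumption about what an exported text post looks like; check it against a real export before trusting it.

```python
from datetime import datetime
from typing import Dict


def to_pelican_markdown(post: Dict[str, str]) -> str:
    """Render one exported post as a Markdown file with a Pelican metadata header.

    Assumes `post` has "date", "slug" and "body" keys and an optional "title";
    these field names are assumptions, not taken from the original post.
    """
    # Assumed date format, e.g. "2013-06-01 12:34:56 GMT".
    published = datetime.strptime(post["date"], "%Y-%m-%d %H:%M:%S GMT")
    header = [
        # Fall back to the slug when a post has no title.
        f"Title: {post.get('title') or post['slug']}",
        f"Date: {published:%Y-%m-%d %H:%M}",
        f"Slug: {post['slug']}",
    ]
    # The body is kept as-is; Markdown files may contain raw HTML.
    return "\n".join(header) + "\n\n" + post["body"].strip() + "\n"
```

Keeping the original slug in the header is the important part, since that is what lets the old URL’s be matched up with the new ones.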

One good argument I’ve heard is that storing a blog in a directory allows version control. That’s true, but I basically never change old posts. And it’s easy enough to save a copy of the HTML of a Tumblr theme before changing it.

All in all, the benefits of a static website generator don’t outweigh the hassles.

Think I’m crazy? Let’s talk it out in the comments.

Did you like this post?

Then upvote it on Hacker News. Follow me on Twitter. Or subscribe to my newsletter.

Never again be thwarted by restrictive “guest” wifi (e.g. on buses or airplanes)

Last week, I took a Megabus from New York to Boston. It’s a four-hour trip and Megabus advertises free wifi, so I expected to be able to get in some serious undisturbed working time.

Imagine my disappointment when I opened my laptop, connected to wifi, tried to ssh into a server I’m working on, and then watched helplessly as ssh timed out again and again without connecting.

I’m not exactly sure what Megabus is doing, but my guess is that they block all non-web traffic (probably primarily to avoid torrents hogging bandwidth), and they do that by just blocking all network traffic on ports other than 80 and 443 (the traditional http and https ports), or by filtering certain communications protocols like SSH. Once I got to Boston, I tried to use another guest wifi network that was also randomly blocking ports I needed to connect to other servers, so I decided to put a stop to this nonsense once and for all.
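If you want to test the port-blocking guess on your own network, a quick probe is easy to write. This is a minimal Python sketch using only the standard library; the function names are made up for this example, and the host you probe should be a server you control.

```python
import socket
from typing import Dict, Iterable


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and unreachable hosts
        return False


def probe(host: str, ports: Iterable[int] = (22, 80, 443)) -> Dict[int, bool]:
    """Check several ports on one host and report which ones accept connections."""
    return {port: port_is_reachable(host, port) for port in ports}
```

If `probe("your-host-name")` shows 22 failing while 80 and 443 succeed (and something is actually listening on all three), port blocking is the likely culprit. Note that a successful connection only rules out port blocking; it can’t tell you whether the network also inspects and filters protocols.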

The solution? Create a (mostly free) micro server on Amazon’s EC2 cloud and use it as a “poor man’s VPN” by routing all traffic from your laptop through the server and from there out onto the internet. This worked marvelously on the Boston guest wifi, and as I’m writing this it’s letting me connect to EC2 servers via SSH on a Southwest flight.

This is easier than it sounds to set up, provided you have directions. So…here you go!

1)   Launch an EC2 micro server instance running Linux. This is straightforward but a bit complicated if you haven’t done it before, so if you need help Google something like “quickstart set up EC2 server linux” and you should find a good guide.

2)   Ssh into your server (“ssh ubuntu@your-host-name”)

3)   Open up /etc/ssh/sshd_config (“sudo nano /etc/ssh/sshd_config”)

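4)   Find the line “Port 22”, and under it add the lines “Port 80” (the normal web port) and “Port 443” (the https port) – this tells the server to listen for incoming ssh connections on ports 80 and 443 as well, which will almost always be unblocked on guest wifi because they’re needed for web traffic. Save the file and restart the ssh daemon (“sudo service ssh restart” on Ubuntu) so the change takes effect, and make sure your instance’s EC2 security group allows inbound traffic on both ports.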

5)   On your laptop, visit https://github.com/apenwarr/sshuttle/ and clone the repo into somewhere convenient (i.e. “git clone https://github.com/apenwarr/sshuttle/”)

6)   Go into the sshuttle folder, and type “./sshuttle -r username@sshserver:80 0.0.0.0/0 -l 127.0.0.1:443 -vv”

That’s all there is to it!
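If you’d rather script the edit from steps 3 and 4 than do it by hand in nano, the change amounts to inserting two lines. Here is a hedged Python sketch that works on the text of the config file; the function name is hypothetical, and it only handles the simple case described above, where an active “Port 22” line already exists.

```python
from typing import Iterable


def add_ssh_ports(config_text: str, extra_ports: Iterable[int] = (80, 443)) -> str:
    """Return config_text with a 'Port N' line added under 'Port 22' for each extra port.

    If there is no active 'Port 22' line (for example, it is commented out),
    no ports are added. Ports that are already configured are not duplicated.
    """
    lines = config_text.splitlines()

    # Collect ports that already have an active 'Port N' line.
    existing = set()
    for line in lines:
        parts = line.split()
        if len(parts) == 2 and parts[0] == "Port":
            existing.add(parts[1])

    result = []
    for line in lines:
        result.append(line)
        if line.split() == ["Port", "22"]:
            for port in extra_ports:
                if str(port) not in existing:
                    result.append(f"Port {port}")
    return "\n".join(result) + "\n"
```

You would read `/etc/ssh/sshd_config`, pass its contents through `add_ssh_ports`, and write the result back with root privileges. Keep a backup of the original file, since a broken sshd config can lock you out of the server.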

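Now all of your TCP traffic will be securely routed to your server over ssh on port 80 (the port given after the hostname in the command above; use “:443” there instead if port 80 is blocked), and then forwarded on to the internet by your EC2 server.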

This has two benefits:

1)   No more pesky port / protocol blocking on the guest wifi

2)   All your data transmitted over the open wifi network is encrypted, so you can’t be snooped on with Wireshark.

Now you can do whatever you want, and Megabus (or Southwest Airlines, where I’ve now confirmed this works too) can’t say a darn thing about it. Unless they, you know, change their security policies.

If you like this guide, follow me on Twitter (@rogueleaderr) for more like it soon.

WARNING: this only encrypts TCP traffic, not other kinds like DNS (unless you use an extra flag in sshuttle) or UDP etc. So some kinds of traffic may still be snoop-able. Also, you are not anonymous, since your traffic can still be traced back to your EC2 server, which has your name on the billing records. So not that you would anyway, but don’t go committing any cybercrime.

Edit: I’m shocked by how much traffic this post got. I’ll freely admit that I’m a networking n00b and that although this approach worked for me it’s probably not ideal. Many commenters on the Hacker News thread had great suggestions for alternative approaches. Check out the comments at http://news.ycombinator.com/item?id=4410195 for more options on how to get around network restrictions.