org.jclouds.rest.AuthorizationException with Whirr and Ubuntu Precise 12.04

Just leaving a Google breadcrumb in case someone hits this error. If you’re trying to launch a cluster using Whirr and Precise, and you get this error:

java.io.IOException: org.jclouds.rest.AuthorizationException: (ubuntu:rsa[fingerprint(8a:da:a2:0f:23:d0:3b:91:38:93:1f:5b:2b:dc:33:90),sha1(f5:eb:ab:91:a2:a7:be:f1:90:f6:2c:10:f0:89:1f:51:12:a9:32:e5)]@23.20.104.174:22) (ubuntu:rsa[fingerprint(8a:da:a2:0f:23:d0:3b:91:38:93:1f:5b:2b:dc:33:90),sha1(f5:eb:ab:91:a2:a7:be:f1:90:f6:2c:10:f0:89:1f:51:12:a9:32:e5)]@23.20.104.174:22) error acquiring SSHClient(timeout=60000): Exhausted available authentication methods

You can fix it by adding:
whirr.cluster-user=huser
to your hadoop.properties (or equivalent) file.
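
If you script your cluster setup, here is a minimal sketch of adding that line without duplicating it on repeat runs. The helper name `set_cluster_user` is made up for illustration, and it assumes the properties file already ends with a newline:

```shell
# Hypothetical helper: append whirr.cluster-user only if the file
# does not already define it.
set_cluster_user() {
  # $1 = properties file (e.g. hadoop.properties), $2 = user name (e.g. huser)
  if ! grep -q '^whirr\.cluster-user=' "$1" 2>/dev/null; then
    printf 'whirr.cluster-user=%s\n' "$2" >> "$1"
  fi
}

# Usage:
#   set_cluster_user hadoop.properties huser
```

Running it twice leaves a single `whirr.cluster-user` line in the file.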

Setting up MusicBrainz Server on EC2 using Postgresql 9.1 and Ubuntu 11.10

[Warning: super technical post to follow]

I sank a large amount of time this week into trying to get a MusicBrainz server running in the cloud. Since, as of a week ago, I didn’t know jack about Postgres, Linux, Perl, or “the Cloud”, this was a rather large challenge for me. But I finally got through it with the help of a lot of scattered web resources and a bit of help from the most excellent “ruaok” in the #MusicBrainz-Devel IRC room (on FreeNode).

So I’d like to offer a few pointers that might save a substantial amount of time for anyone else trying to do this, especially if you (like me) don’t know enough postgres or perl to fix things when they break.

I’m going to assume that you know how to set up an EC2 server running the latest (as of 3/9/2012) stable version of Ubuntu, 11.10. You can find AMIs linked on the official Ubuntu website. Building the database is a bit computationally intensive, so I recommend at least a large instance if you don’t want to wait around for a long time. I also recommend starting with a 20 GB+ volume to be safe, so you don’t have to waste time resizing if you run out of space. I strongly recommend you make sure you use the latest stable version of Ubuntu and don’t (like I did) accidentally install an unstable beta of the next version, because that will lead to a lot of weird errors.

If you don’t know how to use EC2…consider learning. There are lots of good guides online, and it’s pretty powerful. It’s straightforward, but most of the procedures are fairly arbitrary, so there’s no super-easy way to just jump into it. Note that the VirtualBox MusicBrainz server (as suggested on the MB website) does not work by default on EC2 w/ Ubuntu 11.10, so don’t waste your time trying unless you’re already familiar with virtualization. It’s complicated enough to start with, even if you’re not trying to run a VM inside a VM.

So…

1.     Follow the instructions. The install instructions are good and will get you most of the way there. Find them here (https://github.com/metabrainz/musicbrainz-server/blob/master/INSTALL)

2.     As of this writing, the latest version of Postgres is 9.1, so type in 9.1 when the guide tells you to enter the version number. Postgres is kind of confusing, and unhelpfully they seem to have changed a lot of the directory names in 9.1 without updating the official manual, so if you try to google for the paths you want, you’ll often find the wrong ones. As of now, the key directory to care about is `/usr/lib/postgresql/9.1/bin`, which contains the control commands for the server. Some of these will be put on your path by default from the installation, but not all of them.

3.     By default on Ubuntu, postgres keeps its config files at `/etc/postgresql/9.1/main/` (the data itself lives under `/var/lib/postgresql/9.1/main/`). The `/etc` directory is where you can find the config files the INSTALL guide references. If you want to use a different directory (e.g. to put the data on a separate EBS volume so you can clone it easily), you can use the control commands and type `initdb -D /your/dir` to create a new directory with its own configuration. You can then start that server with `postgres -D /your/dir`. Note that the flag is a plain hyphen followed by a capital D.

4.     Edit pg_hba.conf as recommended. Inside of postgresql.conf, find the `listen_addresses` line (the setting name is plural), uncomment it if needed, and set it to `listen_addresses = '127.0.0.1'`, which allows you to connect to your server through TCP from the same machine. A few lines below, uncomment the line that says `port = 5432` if it is commented out.

5.     Inside your musicbrainz-server directory, edit `lib/DBDefs.pm`. In the block that says READWRITE, change “schema” to “musicbrainz_db” and username to “postgres” (changing the schema and username might not be necessary, but they helped with some errors I was seeing), then uncomment “port” and change it to 5432. Below, in the “System” block, make sure that the username and password are the same as your postgres account (by default the username is “postgres”, and you have to set the password with `sudo passwd postgres`). Also uncomment “port” and set it to 5432.

6.     Follow all of the other INSTALL instructions. But before building the database in the last step, make sure you log in to the postgres account by doing `sudo su - postgres`
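
For anyone who prefers to script it, steps 2 and 4 can be sketched roughly like this. It assumes the stock Ubuntu paths mentioned above; the function name `enable_local_tcp` is made up for illustration, and the `sed` rewrite drops any trailing comment on the two lines it touches (a `.bak` copy of the original file is kept):

```shell
# Step 2: put the 9.1 control commands (initdb, postgres, ...) on the PATH.
export PATH="/usr/lib/postgresql/9.1/bin:$PATH"

# Step 4 (hypothetical helper): set listen_addresses and port in the given
# postgresql.conf, whether or not the lines are currently commented out.
enable_local_tcp() {
  # $1 = path to postgresql.conf
  sed -i.bak -E \
    -e "s/^[[:space:]]*#?[[:space:]]*listen_addresses[[:space:]]*=.*/listen_addresses = '127.0.0.1'/" \
    -e "s/^[[:space:]]*#?[[:space:]]*port[[:space:]]*=.*/port = 5432/" \
    "$1"
}

# Usage (as root, since the file lives under /etc), then restart the server:
#   enable_local_tcp /etc/postgresql/9.1/main/postgresql.conf
#   service postgresql restart
```

Editing the file by hand works just as well; the point is only that both settings need to be uncommented and the server restarted before the build script can connect over TCP.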

Hopefully, everything should build for you the first time. If it doesn’t, the script can get jammed because it created a musicbrainz_db before crashing, so clear any old databases by doing `dropdb musicbrainz_db` from the shell before running the build script again.
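
Here is a small sketch of that cleanup that only calls `dropdb` when the leftover database is actually there, so it doesn’t error out on a clean server. The function names are made up for illustration, and it assumes you are running as the postgres user so that `psql` and `dropdb` can connect without extra flags:

```shell
# Reads `psql -lqtA` output on stdin (one database per line, fields
# separated by "|") and succeeds if the named database is listed.
db_listed() {
  cut -d '|' -f 1 | grep -qx "$1"
}

# Drop the database only if it currently exists.
drop_if_present() {
  if psql -lqtA | db_listed "$1"; then
    dropdb "$1"
  fi
}

# Usage, before re-running the build script from the INSTALL guide:
#   drop_if_present musicbrainz_db
```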

If that doesn’t work, feel free to comment here and I’ll see if I can help you. Or, better yet, ask the much more knowledgeable people in the #MusicBrainz-Devel IRC channel or on the musicbrainz-devel mailing list.

Good luck! (Also, if anything here is wrong or doesn’t work for you, please comment with what you had to do differently so future readers can figure out what they need to do.)