FOURPROC

Choosing a Backup Implementation for stand-alone cloud servers

LabKey has a number of stand-alone, single-purpose servers under management. These servers run at a variety of hosting and cloud providers (SliceHost, Rackspace and AWS, to name a few). When we first got into this business, I wrote some very simple backup scripts that performed the steps required to back up the system configuration and important data. Over time, the simplified architecture that I designed for backups began to experience problems. As with all “quick” solutions, the problems appeared when the size of the data to be backed up on some of these servers began to grow and grow.

As I began re-evaluating our backup system, I reviewed a number of commercial and open source options out there. The solution that I chose is definitely not for everyone, as we have some very peculiar requirements. The results of my search are below.

My Requirements

  1. This backup system will only be used for our “hosted” servers (i.e., servers located at AWS, Rackspace, etc).
    • This backup system will not be used to backup servers or desktops from the office
  2. Each server can be its own backup server and backup client
    • The servers are not part of a cluster, nor do they provide shared resources for the other servers.
    • The backup requirements of these servers, in terms of types of data, backup frequency and retention time, are nearly identical
    • Data transfer between servers on different cloud or hosting providers can be expensive
    • Servers at Rackspace need the ability to use CloudFiles and servers at Amazon to use S3, as those transfers are then free.
    • The cost and operational overhead of running dedicated “backup servers” seem too high.
  3. Server installation and configuration is performed using the automation tool Chef.
    • OS or application configuration files do not need to be backed up.
    • This greatly reduces the complexity of the required backup system.
    • Do not require a “bare metal” restore option
      • In the case of a catastrophic failure of the server, I simply need to use Chef to rebuild the server and then restore our data.
  4. Use Rackspace CloudFiles or Amazon S3 for off-server storage of backups
    • Servers at Rackspace should use CloudFiles and servers at Amazon should use S3, as those transfers are then free.

Available Options

I evaluated a number of options. Many of them did not meet my requirements and were immediately dismissed. Some of them are listed here; this is by no means an exhaustive list.

  • Zmanda
    • Despite the overhead of learning Amanda, this was the most robust solution out there. The code base has been around forever (I first used it back in the late ’90s).
    • It can be run either in stand-alone mode (i.e., the server is both the backup client and the backup server) or with a centralized backup server
    • It is configured using text files which makes it easy to manage via Chef
    • It handles all the annoyances of managing backup frequency, retention times, aging of old backups, etc
    • The only problems were
      • No support for CloudFiles
      • Not very good documentation for installing in stand-alone mode
      • It would have been great if they had this for other operating systems besides Windows.
  • Several other tools required a central backup server, so they were not chosen
  • Backup Manager
    • This tool fit all my needs except that it does not support Rackspace CloudFiles
  • Brackup
    • Does not support Rackspace CloudFiles, and development appears to have ended in 2006
  • Roll-my-own
    • Found Duplicity, which can handle writing backups to local disk, CloudFiles and S3.
    • Not an ideal solution, as this requires me to essentially write a backup system from scratch, and that has been done before by better programmers than me.
    • Increased support burden over time.
    • If I leave the company, there is no one else who can come in and easily manage the system

What did I choose?

In the end, the only choice I had was to roll my own. There are no commercial or open-source solutions that support CloudFiles, which definitely shows the immaturity and limited adoption of Rackspace’s cloud offerings. I was not really happy about this choice, as it feels like I am re-inventing the wheel. Sadly, there do not seem to be any good enterprise-class backup solutions out there for backing up servers that are hosted “in the cloud”.

Other information

Below is some further information I found during my research

.....

Submit a new website to the major search engines for indexing.

It turns out that a site I run was not properly registered with Google Webmaster Tools. As I was fixing that, I took the time to also submit the website and its sitemap.xml to Yahoo and Bing. Here is how I did it.

Google

First we need to register the website and verify that we own it. Then we can submit the sitemap file (which tells the indexer how to crawl the site).

  1. Log into the Google Webmaster tools at http://www.google.com/webmaster
  2. Click on the Add Site button
  3. Post the HTML verification file, googlexxxxxxxxxxxxxxx.html, at http://www.example.org/googlexxxxxxxxxxxxxxx.html
  4. Click on the verify button
  5. You will now see all the information in Google’s index about the site.
  6. Submit the sitemap by clicking Site Configuration —> Sitemaps in the left pane
  7. Click on the Submit button and enter https://www.example.org/sitemap.xml
  8. Google will contact our servers and read the sitemap.xml file to index the pages.

Yahoo

First we need to register the website and verify that we own it. Then we can submit the sitemap file (which tells the indexer how to crawl the site).

  1. Go to https://siteexplorer.search.yahoo.com/mysites
  2. In the left pane, click on Submit your Site
  3. Click on Submit a Website or Webpage
  4. Enter in https://www.example.org/ and click submit
  5. In the left pane click on My Sites
  6. Click on www.example.org in the list of sites
  7. In the left pane click on Authentication
  8. I chose to verify using a file.
    1. I downloaded the file y_key_xxxxxxxxxxxxxxxxxx.html from the webpage and posted it at http://www.example.org/y_key_xxxxxxxxxxxxxxxxxx.html
    2. Click on Ready to Authenticate button
  9. In the left pane, click on the Feeds button
  10. In the text box enter sitemap.xml and click Add Feed button

Bing

First we need to register the website and verify that we own it. Then we can submit the sitemap file (which tells the indexer how to crawl the site).

  1. Go to http://www.bing.com/webmaster
  2. At the Add a Site page, enter
    • Web Address = http://www.example.org/
    • Sitemap address = http://www.example.org/sitemap.xml
    • Hit the submit button
  3. Now we need to Verify ownership of the website.
    1. I downloaded the file, LiveSearchSiteAuth.xml, by clicking on the link Download XML Verification file
    2. Copied the file to http://www.example.org/LiveSearchSiteAuth.xml
    3. Click on Return to Site List button
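For all three engines, verification only succeeds if the uploaded file is actually reachable by the crawler. The steps above can be sanity-checked from the command line; this is a quick sketch using the placeholder file names from the steps above, not real files.

```shell
#!/bin/sh
# Check that each verification file and the sitemap are served with HTTP 200.
# The URLs below are the placeholder examples from the steps above.
for url in \
    http://www.example.org/googlexxxxxxxxxxxxxxx.html \
    http://www.example.org/y_key_xxxxxxxxxxxxxxxxxx.html \
    http://www.example.org/LiveSearchSiteAuth.xml \
    http://www.example.org/sitemap.xml
do
    # -o /dev/null discards the body; -w prints only the status code
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    echo "$code $url"
done
```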

.....

Postgres Database Backups

There is a plethora of information online about backup and restore of Postgres databases. The problem is that a lot of it is old and out of date, which can make it hard to figure out how you should do something as simple as a backup. So here is a list of the various backup file formats. I hope this helps you.

There are 3 output formats for a backup of a Postgres database. I am going to describe the pros and cons of each below. This information is for Postgres v8.4.

Script or plain-text format

  • This is the default format for backup
  • This is simply a text file which contains the commands necessary to rebuild the database.
  • Can be restored using psql (pg_restore cannot read this format)
  • Individual tables CANNOT be restored using this format. In order to restore an individual table, you will need to restore the entire database into a scratch database first, then copy the table contents from the restored database to the working database.

Custom archive format

  • Most flexible format.
  • By default this format is compressed, but compression can be disabled using the --compress=0 option
  • Must use pg_restore to restore the database
  • The backup file can be opened and the contents reviewed using pg_restore
  • Individual tables CAN be restored using the format.
  • The backup file size (uncompressed) can be roughly 2x the size of the database. This is due to the overhead necessary to allow pg_restore to restore individual tables, etc.

tar archive format

  • Has the same features as Custom format except
  • Tables cannot be larger than 8GB in size or the restore will fail.
  • Output file can be opened using the tar command.

(Much of this information was found at http://www.postgresql.org/docs/8.4/static/app-pgdump.html and http://www.postgresql.org/docs/8.4/static/app-pgrestore.html )

.....

Create a SSL keystore for a Tomcat server using Openssl

An SSL certificate was required for one of our customers. The SSL certificate was to be used with a Tomcat server, but I decided to give the customer the flexibility to re-use this certificate on a different webserver if needed. This meant I used openssl to generate the certificate and then created a pkcs12 keystore.

Create the private key and certificate request

Create the certificate key

openssl genrsa -des3 -out customercert.key 2048

Remove the passphrase from the key

openssl rsa -in customercert.key -out customercert.key.new
mv customercert.key.new customercert.key

Create the Certificate request

openssl req -new -key customercert.key -out customercert.csr

Create the Keystore file for use with tomcat and keytool

I had some trouble getting this to work. This is a very simple procedure when working with certs signed by GoDaddy, but certs from Verisign needed some extra hand-holding. Some information on how to do this can be found at http://conshell.net/wiki/index.php/OpenSSL_to_Keytool_Conversion_tips.

I did not follow the instructions on that site. I ended up creating a keystore in the pkcs12 format instead of the default jks format. The site above does have instructions for converting a pkcs12 keystore to the jks format, if you require it.

The signed certificate was downloaded to clients.adaptivetcr.com.cer (referred to as customercert.cer below). The Secure Site with EV Root bundle was downloaded to intermediate.crt. When I first attempted to create the keystore file, I received the error below.

openssl pkcs12 -export -chain -CAfile intermediate.crt -in customercert.cer \
    -inkey customercert.key -out customercert.keystore -name tomcat \
    -passout pass:changeit

Error unable to get issuer certificate getting chain.

Now the interesting thing about this error is that if you attempt an openssl verify using both the cert file and intermediate.crt, it does not complain and gives the “OK” message. After a bit of testing, I found that you need to build a new CAfile that combines the cacerts file from the openssl distribution and the intermediate.crt file.

cat intermediate.crt /etc/ssl/certs/ca-certificates.crt > allcacerts.crt
openssl pkcs12 -export -chain -CAfile allcacerts.crt -in customercert.cer \
    -inkey customercert.key -out customercert.keystore -name tomcat -passout \
    pass:changeit

This successfully created the keystore file. You can look at the contents of the keystore by running

keytool -list -keystore customercert.keystore -storetype pkcs12 -v
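To actually serve HTTPS from this keystore, Tomcat's connector needs to be told the keystore is pkcs12 rather than the default jks. A hedged sketch of the server.xml connector (the file path is an assumption; attribute names are the standard Tomcat connector SSL options):

```xml
<Connector port="443" scheme="https" secure="true" SSLEnabled="true"
           keystoreFile="/usr/local/tomcat/conf/customercert.keystore"
           keystoreType="PKCS12"
           keystorePass="changeit"
           clientAuth="false" sslProtocol="TLS"/>
```

The keystoreType="PKCS12" attribute is the important part; without it Tomcat assumes a jks keystore and fails to open the file.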

.....

Improved Access Logging format for LabKey Server

The LabKey Server runs on Apache Tomcat. The Tomcat server has the ability to perform access logging similar to the Apache web server. When configured, Tomcat will log each request to the server to a file on the file system. By default:

  • the log file is located in the $CATALINA_HOME/logs directory
  • the log file name will be of the form localhost_access_log.YYYY-MM-DD.txt
    • where YYYY = 4 digit year, MM = month and DD is the day of the month.
  • the log file is rotated daily at midnight

The Tomcat server comes with two preconfigured Access Log formats: common and combined. If I had to choose one of these formats, I would recommend combined. The combined pattern is

%h %l %u %t "%r" %s %b "%{Referer}i" "%{User-Agent}i"

See below for a description of what each pattern code (i.e., %h) means. Complete documentation on Tomcat Access Logging can be found at http://tomcat.apache.org/tomcat-5.5-doc/config/valve.html

Recommended LabKey Server Access Log format

For production installations of the LabKey server, I recommend a new format:

%h %l %u %t "%r" %s %b %D %S "%{Referer}i" "%{User-Agent}i" %{LABKEY.username}s

The differences between this format and the combined format are

  • %D = The time taken by the server to process the request, in milliseconds. This is valuable when debugging performance problems
  • %S = The Tomcat user session ID
  • %{LABKEY.username}s = The username of the person accessing the LabKey server. If the person is not logged in, this will be a “-”

There are 3 other LabKey-specific fields that might be of interest in particular circumstances:

  • %{LABKEY.container}r = The LabKey server container being accessed during the request
  • %{LABKEY.controller}r = The LabKey server controller being used during the request
  • %{LABKEY.action}r = The LabKey server action being used during the request

Example Access Log Valve configuration

Below is an example configuration for using the pattern above

<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs" prefix="localhost_access_log." suffix=".txt" pattern='%h %l %u %t "%r" %s %b %D %S "%{Referer}i" "%{User-Agent}i" %{LABKEY.username}s' resolveHosts="false"/>

Pattern Code description

This was taken from http://tomcat.apache.org/tomcat-5.5-doc/config/valve.html

  • %a - Remote IP address
  • %A - Local IP address
  • %b - Bytes sent, excluding HTTP headers, or ‘-’ if zero
  • %B - Bytes sent, excluding HTTP headers
  • %h - Remote host name (or IP address if resolveHosts is false)
  • %H - Request protocol
  • %l - Remote logical username from identd (always returns ‘-‘)
  • %m - Request method (GET, POST, etc.)
  • %p - Local port on which this request was received
  • %q - Query string (prepended with a ‘?’ if it exists)
  • %r - First line of the request (method and request URI)
  • %s - HTTP status code of the response
  • %S - User session ID
  • %t - Date and time, in Common Log Format
  • %u - Remote user that was authenticated (if any), else ‘-‘
  • %U - Requested URL path
  • %v - Local server name
  • %D - Time taken to process the request, in millis
  • %T - Time taken to process the request, in seconds
  • %I - current request thread name (can compare later with stacktraces)

There is also support for writing information from a cookie, an incoming header, an outgoing response header, the Session, or something else in the ServletRequest. It is modeled after the Apache syntax:

  • %{xxx}i for incoming request headers
  • %{xxx}o for outgoing response headers
  • %{xxx}c for a specific request cookie
  • %{xxx}r xxx is an attribute in the ServletRequest
  • %{xxx}s xxx is an attribute in the HttpSession

.....