Building an MD5 Rainbow Table

30 November -0001

The MD5 hashing algorithm is a common way to store user passwords in many PHP based applications. This mechanism effectively obscures the password so that if the password store is compromised, user accounts are not necessarily exposed. This mechanism also obscures passwords from site administrators, protecting the privacy of users.

MD5 password hashes work in a relatively straightforward and streamlined way. A hash is a mathematical function that takes input and passes it through routines to transform the value of the input into a unique value. The idea is that the output value from one input will be distinct from the output value of all other inputs. However, "collisions" do exist in MD5, meaning that more than one input will produce the same hash, they are very rare.

Typical MD5 implementations take a sting, usually of user input, and has the value which results in a 32 character output. For instance, the MD5 hash of the word 'editor' is '5aee9dbd2a188839105073571bee1b1f'. You'll note the hash is represented in hexadecimal notation, this allows for a full 16 possible characters in each of the 32 characters that make up the hash value.

The MD5 hash is put to practical use by allowing an application user to create an account and chose a password. The application stores the user's username in a table, and the MD5 hash of the user's chosen password. This means that a user who chose the password 'editor' would actually have the value '5aee9dbd2a188839105073571bee1b1f' stored by the application. This way, anyone reading the stored values could not discern the original password. When a user logs in, the authentication algorithm takes the password the user supplies at login and runs it through the MD5 hashing algorithm, the resulting value is then compared to the stored value in the database. This allows the application to discard passwords as soon as they are provided.

As discussed in a previous article, this scheme suffers from a few notable flaws. For one thing, it is easy to discern if two users have the same password by looking at the hashes. The other major flaw is that MD5 hashes can be brute forced. Brute forcing means trying all possible combinations to guess a password. This is a time consuming and tedious process. However, many attackers will seek out MD5 hashes of passwords as soon as they compromise systems and then either upload these to a password cracking service, or run them through their own "rainbow tables" to try and crack the password.

A rainbow table is a precomputed list of words and their MD5 hashes. The idea is that instead of trying combinations of characters to guess a password, with a rainbow table you can simply look up values that have been computed beforehand. The following are some simple instructions for creating a functional MD5 rainbow table.

The first step in this process is to find a suitably large dictionary file. Dictionary files can be found in many locations around the internet. For this example we'll be using the dic-0294.txt from http://www.outpost9.com/files/WordLists.html. This dictionary can be downloaded as a zip file, but inflates to 8.5 MB and contains around 870,000 words. Once we've downloaded this file we need to create a database and a table for our rainbow values. First log into mysql and create a new database called 'rainbow', next create a table called 'hash' for the values like so:

$ mysql -u root -p
mysql> create database rainbow;
mysql> use rainbow;
mysql> create table hash (hash_id int not null auto_increment primary key, hash_word varchar(20) not null, hash_hash varchar(32));

Once you have the database created all you need to do is run the following simple Perl script to create your rainbow table:

#!/usr/bin/perl
# MD5 Rainbow table generator by Justin Klein Keane 

use Digest::MD5  qw(md5 md5_hex md5_base64);
use DBI;

my $dbh = DBI->connect('DBI:mysql:rainbow', 'db_user', 'db_password') || die "Could not connect $DBI::errstr";

open (FILE, 'dic-0294.txt') || die('Could not open file');
while (<FILE>) {
        my $data;
        chomp($data = $_);
        $data =~ s/\r\n?//g;
        $hash = md5_hex $data;
        $data =~ s/'/''/g;
        my @vals = ($data, $hash);
        my $sth = $dbh->prepare("insert into hash (hash_word,hash_hash) values (?,?)");
        $sth->execute(@vals) || die "Query failed! $DBI::errstr";
}
close(FILE);
$dbh->disconnect();

This will run over the entire word list and take each word and calculate it's MD5 hash value. You will probably be surprised at the speed with which a computer of even moderate computing power can calculate the entire table. Once the table is built it is quite easy to look up both MD5 values for common words as well as the words associated with MD5 values:

mysql> select * from hash where hash_word = 'admin';
+----------------------------------+---------+-----------+
| hash_hash                        | hash_id | hash_word |
+----------------------------------+---------+-----------+
| 21232f297a57a5a743894a0e4a801fc3 |  949489 | admin     |
+----------------------------------+---------+-----------+
1 row in set (0.97 sec)

mysql> select * from hash where hash_hash = '5aee9dbd2a188839105073571bee1b1f';
+----------------------------------+---------+-----------+
| hash_hash                        | hash_id | hash_word |
+----------------------------------+---------+-----------+
| 5aee9dbd2a188839105073571bee1b1f | 1060421 | editor    |
+----------------------------------+---------+-----------+
1 row in set (0.46 sec)

Armed with such a table an attacker can easily deduce passwords if they are able to get a hold of the hash values from a database. The best defense against such an attack is to "salt" passwords before hashing them. This increases the length of the original password, thus forcing an attacker to have a much larger, and more complex, rainbow table in order to recover the original password.