August 16, 2007

Is MD5 the God of Hash?

According to Skrentablog, the answer is 'Yes'. I found some interesting uses of MD5 in its article 'We Worship MD5, the GOD of HASH'. Here is the gist of it.

MD5 takes an input byte string of any length and outputs 128 bits. The output is deterministic (the same input always gives the same hash), yet the bits look random. If you make even a tiny change to the input string, you'll get a completely different output hash.
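
To see that behaviour for yourself, here is a minimal sketch using Python's standard hashlib module. The helper names md5_hex and differing_bits are made up for this illustration; they are not from the original article.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as 32 hex characters (128 bits)."""
    return hashlib.md5(data).hexdigest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which the MD5 digests of a and b differ."""
    da = int.from_bytes(hashlib.md5(a).digest(), "big")
    db = int.from_bytes(hashlib.md5(b).digest(), "big")
    return bin(da ^ db).count("1")


if __name__ == "__main__":
    first = b"The quick brown fox jumps over the lazy dog"
    second = first + b"."  # a one-character change to the input
    print(md5_hex(first))
    print(md5_hex(second))
    print(differing_bits(first, second), "of 128 bits differ")
```

Running it prints two unrelated-looking hashes for inputs that differ by a single character.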

MD5 tips & tricks

  • Unique ID generation

    Say you want to create a set of fixed-size IDs based on chunks of text -- URLs, for example. md5(url) is 16 bytes, consistently, and you're unlikely to ever see an accidental collision. So, it's safe to use the MD5 as an ID for the URL.
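
    A minimal sketch of the idea in Python follows. The function name url_id and the example URLs are assumptions made for illustration only.

```python
import hashlib


def url_id(url: str) -> bytes:
    """Return a fixed-size 16-byte ID for a URL (hypothetical helper)."""
    return hashlib.md5(url.encode("utf-8")).digest()


if __name__ == "__main__":
    # Short or long, every URL maps to an ID of the same size.
    for url in ("http://example.com/", "http://example.com/a/very/long/path?x=1"):
        print(len(url_id(url)), url_id(url).hex(), url)
```

    Because the ID has a fixed width, it fits neatly into a fixed-size database column or index key, whatever the length of the URL.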

  • Checksums

    Don't trust your disk or your OS to properly detect errors for you. The CRC and protocol checksums they use are weak, and bad data can get delivered.

    Instead, bring out an industrial-strength checksum and protect your own data. MD5 your data before you stuff it onto the disk, and check the MD5 when you read it back.

        save_to_disk(data, md5(data))
        ...
        (data, stored_md5) = read_from_disk()
        if (md5(data) != stored_md5)
            read_error
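
    The pseudocode above could be fleshed out roughly as follows in Python. This is only a sketch: the on-disk layout (16-byte digest followed by the data) and the ReadError exception are assumptions for this example, not something the original article specifies.

```python
import hashlib
import os
import tempfile

DIGEST_SIZE = 16  # an MD5 digest is always 16 bytes


class ReadError(Exception):
    """Raised when the stored checksum does not match the data read back."""


def save_to_disk(path: str, data: bytes) -> None:
    """Write the MD5 digest first, then the data (assumed file layout)."""
    with open(path, "wb") as f:
        f.write(hashlib.md5(data).digest())
        f.write(data)


def read_from_disk(path: str) -> bytes:
    """Read the data back and verify it against the stored digest."""
    with open(path, "rb") as f:
        stored_md5 = f.read(DIGEST_SIZE)
        data = f.read()
    if hashlib.md5(data).digest() != stored_md5:
        raise ReadError(f"checksum mismatch while reading {path}")
    return data


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "blob.bin")
    save_to_disk(path, b"important data")
    print(read_from_disk(path))
```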

  • Password security

    You could store the password in your database "in the clear", but this should be avoided. If your site is hacked, someone could walk away with a giant list of usernames and passwords.

    So instead, store md5(password) in the database. When a user tries to log in, take the password they entered, MD5 it, and then check it against what is in the database. The process can then forget the cleartext password they entered.
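
    Here is a small Python sketch of that login check. The function names and the in-memory users table are hypothetical. It uses plain, unsalted MD5 only because that is what the text describes; a salted, deliberately slow function such as hashlib.pbkdf2_hmac is the more usual choice for password storage today.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return md5(password) as a hex string, the value kept in the database."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def check_login(entered: str, stored_hash: str) -> bool:
    """Hash the entered password and compare it with the stored hash."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(hash_password(entered), stored_hash)


if __name__ == "__main__":
    users = {"alice": hash_password("s3cret")}  # stand-in for a database table
    print(check_login("s3cret", users["alice"]))
    print(check_login("wrong guess", users["alice"]))
```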


  • Hash table addressing

    MD5 isn't a weak hash function, so you don't need to worry about poorly distributed keys. MD5 your key and make your table size a power of 2. Two keys can still land in the same bucket, but the keys spread evenly enough that you should not have to worry about clustered bucket collisions and similar issues.
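
    As a rough Python sketch, bucket addressing with a power-of-2 table could look like this. The name bucket_index and the choice to use the first 8 bytes of the digest are assumptions for illustration.

```python
import hashlib


def bucket_index(key: str, table_size: int) -> int:
    """Map a key to a bucket in a table whose size is a power of 2."""
    if table_size <= 0 or table_size & (table_size - 1):
        raise ValueError("table_size must be a power of 2")
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # With a power-of-2 size, masking the low bits equals taking the
    # remainder modulo table_size.
    return int.from_bytes(digest[:8], "big") & (table_size - 1)


if __name__ == "__main__":
    for key in ("apple", "banana", "cherry"):
        print(key, bucket_index(key, 1024))
```

    Since the table size is a power of 2, the bucket index is just the low bits of the hash, so no division is needed.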


  • Random number generation

    The typical library RNG available isn't generally very good. For the same reason that you want your hashes to be randomly distributed, you want your random numbers to actually be random, and not to have some underlying mathematical structure showing through.

    Having random numbers that can't be guessed or predicted can be surprisingly useful. MD5-based sequence numbers, for example, were a solution to the TCP sequence number guessing attacks.
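
    One way to build such a generator, sketched in Python, is to hash a secret seed together with a running counter. This is an illustrative construction under that assumption, not necessarily the scheme the original article or the TCP fix used; the class name Md5Random is made up.

```python
import hashlib
import os


class Md5Random:
    """Toy generator: each output is derived from md5(seed + counter)."""

    def __init__(self, seed: bytes):
        self._seed = seed      # must stay secret for the outputs to be unguessable
        self._counter = 0

    def next_u32(self) -> int:
        """Return the next 32-bit value in the sequence."""
        block = self._seed + self._counter.to_bytes(8, "big")
        self._counter += 1
        return int.from_bytes(hashlib.md5(block).digest()[:4], "big")


if __name__ == "__main__":
    rng = Md5Random(os.urandom(16))  # seed taken from the OS entropy source
    print([rng.next_u32() for _ in range(5)])
```

    The same seed always reproduces the same sequence, so the outputs are only unpredictable as long as the seed itself is kept secret.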

Comments:

Saagar said...

You can use MD5 for your passwords, but the catch is that the password cannot be recovered. If a user forgets their password, you can only ask them to reset it, because the code cannot reverse the stored hash to get the lost password back.

Seshu Karthick said...

True... and that's actually a good thing. It means there is no direct way to reverse engineer stolen MD5 passwords, while legitimate users can still reset a forgotten password.
