Best Practice Web Application Password Hashing
A primary reason for hashing is to stop a potential hacker from getting usable data once they have broken into your server. Poorly built websites store passwords without hashing them, and since most users reuse the same username and password on other sites, an unhashed password table endangers every user of your service. For a business, this can be a huge liability, not to mention a horrible PR nightmare. Once a hacker gains access to your SQL database, they can turn around and try the usernames and passwords on other sites, hoping to get lucky. In the end it could cost your users access to their online banking accounts and more. Hashing is a method to make the stored password unusable to the hacker.
How Hashing Works
Without going into any individual hashing function, hashing is a one-way transformation. It is often loosely called one-way encryption, but unlike encryption there is no key that turns the output back into the input. For many, the idea of a one-way function is a little hard to grasp. Let me give a very basic explanation of how a one-way hash works:
Step 1 : Take the original input (we will use a number to make it easy, but it works with text by using its Unicode or ASCII values) : 11
Step 2 : Start a series of modifications using the math function modulo : 11 mod 4 = 3
Step N : Continue using modulo : 3 mod 2 = 1
Obviously it doesn’t make much sense with such small numbers, but real hashing techniques use much larger numbers, where you have a larger usable space. So your hashed output is now 1. Reversing the hash is next to impossible because many different inputs produce the same result:
7 mod 2 = 1
5 mod 2 = 1
3 mod 2 = 1
Then you have to guess 7, 5 or 3 and run each guess back through the previous level of mod, which gives yet another infinite set of possible inputs. With infinitely many candidates you cannot tell which one was the original password. Real hashing functions are also designed so that finding two different inputs with the same output is impractical, so no two realistic passwords will both allow a login. It’s not just the modulo function but the series and order of the math operations that let each input generate a different hashed output while still leaving an infinite set of candidates for any reverse calculation.
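The toy steps above can be written out in a few lines of PHP. This is only a sketch of the explanation, not a real hash function, and the names toy_hash() and toy_preimages() are made up for illustration. It shows that many different inputs end up at the same output, which is why the output alone does not tell you the original input.

```php
<?php
// Toy "hash" from the steps above: reduce modulo 4, then modulo 2.
// Illustration only; this is NOT a secure hash function.
function toy_hash(int $input): int
{
    $step = $input % 4;   // e.g. 11 mod 4 = 3
    return $step % 2;     // e.g. 3 mod 2 = 1
}

// Collect every non-negative input below $limit that produces $output.
function toy_preimages(int $output, int $limit): array
{
    $matches = [];
    for ($candidate = 0; $candidate < $limit; $candidate++) {
        if (toy_hash($candidate) === $output) {
            $matches[] = $candidate;
        }
    }
    return $matches;
}

echo toy_hash(11), "\n";                          // 1
echo implode(', ', toy_preimages(1, 12)), "\n";   // 1, 3, 5, 7, 9, 11
```

Raising the limit only makes the list of candidates longer, so an attacker holding the output 1 has no way to single out 11.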
How Hashing Is Implemented in Real Applications
It’s actually quite basic. When you create a user account, you take the password and run it through a hashing function to get a hash result. You then store that result in your database. The hash result is just a mix of letters and numbers to you, but the next time the user logs in, you run the submitted password through the same hashing function and compare the result with the stored hash. If they match, then you have the same password and you can allow the login. If they don’t match, then you don’t have the correct password. Now your SQL database holds only hashed passwords, so a hacker who reads it cannot see the users’ original passwords.
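Here is a minimal sketch of that register-and-login flow. The $users array is a hypothetical stand-in for a database table, and register_user() and check_login() are names invented for this example. It uses a plain unsalted SHA-256 hash purely to show the flow; salting, discussed further down, is deliberately left out here.

```php
<?php
// In-memory stand-in for a users table (hypothetical; a real app would use SQL).
$users = [];

function register_user(array &$users, string $username, string $password): void
{
    // Store only the hash, never the original password.
    $users[$username] = hash('sha256', $password);
}

function check_login(array $users, string $username, string $password): bool
{
    if (!isset($users[$username])) {
        return false;
    }
    // Hash the submitted password the same way and compare the results.
    // hash_equals() compares in constant time to avoid timing leaks.
    return hash_equals($users[$username], hash('sha256', $password));
}

register_user($users, 'joe@example.com', 'password');
var_dump(check_login($users, 'joe@example.com', 'password')); // bool(true)
var_dump(check_login($users, 'joe@example.com', 'wrong'));    // bool(false)
```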
Rainbow Tables
Hashing isn’t perfectly secure though. While it’s almost impossible to reverse the hash, there is another way to decode it. Rainbow tables are large precomputed tables of hashed data. Think of one as a phonebook or dictionary: by hashing every combination of input and storing both the input and the output in a database, you can do almost instant lookups of hashed data. With many distributed rainbow table projects going on right now, you can easily download a few gigabytes of rainbow tables. With a copy of such a table, you can decode a hashed user table and once again get the users’ passwords.
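The lookup idea can be sketched like this. Note the simplification: a true rainbow table stores compressed hash chains to save space, while the sketch below is a plain hash-to-input dictionary. The function names and the short candidate list are assumptions made for the example.

```php
<?php
// Build a precomputed lookup table: hash => original input.
// Simplification: real rainbow tables store hash chains, not every pair.
function build_lookup_table(array $candidates): array
{
    $table = [];
    foreach ($candidates as $candidate) {
        $table[md5($candidate)] = $candidate;
    }
    return $table;
}

// Look a stolen hash up in the table; null means "not in the table".
function lookup_hash(array $table, string $hash): ?string
{
    return $table[$hash] ?? null;
}

$table = build_lookup_table(['123456', 'password', 'letmein', 'qwerty']);

var_dump(lookup_hash($table, md5('password')));                // string(8) "password"
var_dump(lookup_hash($table, md5('joe@example.compassword'))); // NULL
```

The second lookup fails only because that longer input was never put into the table, which is the gap that salting tries to exploit.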
Salting
Salting adds another layer to the hashing technique. It adds a combination of letters, numbers and symbols to each password before hashing, which increases the character space that a rainbow table would need to cover. Without a salt, you could build a rainbow table by just running the hash function on every combination of the letters a to z up to 8 characters long and decode a reasonable share of passwords.
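In practice, what you should do is combine the username with the password before hashing it. This makes a basic rainbow table useless, because the table would have to contain the username and password together. A random value generated for each user and stored next to the hash works the same way and does not depend on the username staying unchanged.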
Example:
email: joe@example.com password : password
hash : joe@example.compassword >>
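As a rough sketch of salting in code: the example above uses the email address as the salt, while the version below swaps in a random per-user salt, which is my assumption rather than something the example requires. The helper names new_salt() and hash_with_salt() are hypothetical, and random_bytes() needs PHP 7 or later.

```php
<?php
// Generate a random salt. It is stored next to the hash, so it does not
// need to be secret, only different for every user.
function new_salt(): string
{
    return bin2hex(random_bytes(16)); // 32 hexadecimal characters
}

// Hash the salt and the password together.
function hash_with_salt(string $password, string $salt): string
{
    return hash('sha256', $salt . $password);
}

// Two users who picked the same password get their own salts...
$saltA = new_salt();
$saltB = new_salt();
$hashA = hash_with_salt('password', $saltA);
$hashB = hash_with_salt('password', $saltB);

// ...so their stored hashes differ, and neither matches a table built
// from unsalted hashes of 'password'.
var_dump($hashA === hash('sha256', 'password'));

// Login check: re-hash the attempt with the stored salt and compare.
var_dump(hash_equals($hashA, hash_with_salt('password', $saltA)));
```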
Crypt() vs md5()
In PHP, the basic md5() function does not support a salt, while crypt() does. However, crypt() only uses the first 8 characters of the password in its basic (DES) mode. From here there isn’t a perfect answer for security. With enough time and resources, a rainbow table can be built for almost any hashing technique that is unsalted or salted in a predictable way. Your best bet is to do something nonstandard, such as adding a salt, in the hope of forcing the hacker to build the tables from scratch as opposed to just downloading one for plain MD5.
DO NOT USE md5($password)
A better solution is to use md5('salt' . $password . $username . 'salt'), which isn’t as common as md5($password) and will require building custom tables, at least until the available rainbow tables cover the full space of MD5 hashes. BUT don’t use that exact form either, because if it becomes standard practice it will no longer be best practice. In your salts, add symbols and other unusual characters that aren’t normally used. Once a hacker has access to your SQL database they can probably also get to your code and see your hashing technique, but at least this forces them to build a table from scratch.
Changing md5 to another hashing function (see PHP’s hash() function) will help, but only by stopping them from just downloading a prebuilt rainbow table. Using a hashing function that supports salting is important, as it will seriously increase the size of the rainbow tables needed to decode your hashes. This is one of the biggest flaws of hashing: if someone knows how you did it, they can build a rainbow table to decode it. There is almost no defense against a rainbow table built specifically for your scheme. A unique salt per user at least means a separate table has to be built for every user, and beyond that you can only make sure your hashes aren’t in easily downloadable tables today.
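For a hashing function with salting built in, PHP 5.5 and later ship password_hash() and password_verify(). This is a different technique from the md5() forms discussed above, offered as a sketch rather than as the one right answer: password_hash() generates a random salt on its own and embeds it in the string it returns, so one column stores everything. The wrapper names below are made up for the example.

```php
<?php
// Hash a new password. password_hash() picks a random salt itself and
// embeds the salt and cost settings in the returned string.
function make_password_hash(string $password): string
{
    return password_hash($password, PASSWORD_DEFAULT);
}

// Check a login attempt against the stored string.
function is_valid_password(string $password, string $storedHash): bool
{
    return password_verify($password, $storedHash);
}

$stored = make_password_hash('password');

var_dump(is_valid_password('password', $stored)); // bool(true)
var_dump(is_valid_password('wrong', $stored));    // bool(false)
```

Because the salt is random, hashing the same password twice gives two different strings, and both still verify.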
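In fact, the US government (through NIST) recommends that everyone stop using MD5 and switch to a SHA-2 function such as SHA-256. This hash produces a 256-bit output (64 hexadecimal characters) instead of MD5’s 128-bit output (32 hexadecimal characters). To use it, just switch your md5() calls to hash('sha256', $valuetohash);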
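A quick way to see the size difference is to count the hexadecimal characters each function returns. The input string here is only a placeholder value.

```php
<?php
$valuetohash = 'joe@example.compassword'; // placeholder input

$md5Hash    = md5($valuetohash);
$sha256Hash = hash('sha256', $valuetohash);

// Each hexadecimal character encodes 4 bits.
echo strlen($md5Hash), "\n";    // 32 characters = 128 bits
echo strlen($sha256Hash), "\n"; // 64 characters = 256 bits
```

If you store hashes in a fixed-width column, it needs to be widened accordingly when you make the switch.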