Rainbow table attack (Explained)
On member sites, you will be asked to verify your identity with a combination of your account name and password before logging in. On the site management side, this password is saved using a hash function (described later) that converts it in a certain way.
If you store it as a hash value using a hash function, it is almost impossible to guess the original password from it. However, if you use this rainbow table, you can recover the hashed password in the memory space and the computing power of a laptop computer.
What is a hash function?
Hash functions are sometimes called one-way functions, summary functions, and so on. Hash means to mix them up. Hashed potatoes are chopped potatoes, mixed with pancake mix and baked.
Various algorithms have been devised for the hash function, but when you convert the word “apple” using the most famous md5, it becomes “c4c0ac673fd8010a257753cc2c7922cc”. In addition, if you convert “Apple” with the capital letter at the beginning, it becomes “e453f5acbd718a36eaf46b5a04e647d6”. The output value changes drastically even though the leading p is only case-sensitive.
Also, it can be said that it is unrealistic to calculate the original apple or Apple from this output value, because it requires a very large calculation cost. Therefore, it is also called a one-way function.
You can use this hash function to store passwords securely. Depending on the purpose, there are various hash functions with output values of 256 characters and 512 characters, but for the sake of clarity, the output value is short crc32 (security is weak, it is not suitable for actual password storage) Will be used for the explanation.
First, ask the newly registered members to decide their own password. Suppose this was abc123. On the website, save the “dbf164c4” that has been converted with a hash function.
Enter the password when this user comes in. Transform that input value with a hash function and check if it is the same as the stored hash value. If they match, you have entered the correct password.
All the stored password lists are only hash values, so even if it leaks, you cannot recover the password from the hash value. Internal people can only see the hash value, not the original password.
How to crack a hashed password
However, the hash function method is not completely safe. There is an easy way to attack. If you know the hash function you are using, make a list of how the main passwords are converted with the hash function in advance and search for the hash function you got by hacking You just have to try it.
Although it is not possible to analyze the passwords of all, if you create a hash value list of common weak passwords such as “password”, “abc123”, “admin”, “qwerty”, “iloveyou”, etc. You will be sure to find it.
The purpose of the dark side hacker is often to take over the account, so it doesn’t matter who you are. Well, I can fulfill my purpose.
If the hash function is known, the hash value (dbf164c4) of the weak password “abc123” can be easily calculated. If the hash value “dbf164c4” is searched from the obtained hashed password list and a hit is found, the password of the person (here, Betty) is found to be “abc123”.
That said, I don’t think anybody is using passwords that are too simple, such as “password”, “abcdefg”, “qwerty”, etc. You need to make
Assuming that you can use 64 uppercase letters, lowercase letters, numbers, and symbols with an 8-character password, all combinations will be a huge number of 64 ^ 8 = 281 trillion. If this were to be converted into a hash value that was 256 bytes long, we would have to create a table with 256 bytes x 281 trillion = approximately 67 million terabytes (64 exabytes). My HDD is 2 terabytes, so I need 33.5 million units. It’s not very realistic.
Therefore, saving the storage area is the basic idea of the rainbow table attack. As I will show you the concrete method later, even if you attack the same 8-character password, with a rainbow table you can finish with a list of about 256 megabytes. All of the hash values for the 8-character password will be included in this. This is the rainbow table.
The original plaintext cannot be restored from the hash value
A reduction function is important for making this rainbow table.
This is a function that transforms a hash value into plaintext. However, you cannot make a function that returns the hash value “dbf164c4” to the original “abc123”. The reduction function only needs to be able to be mapped to one of the original password lists. You don’t have to think about the relationship to which password you want to correspond, but you can associate it with any password.
The reduction function is a function that corresponds to any of the password lists based on the hash value.
Now, the password “Tokyo” has been converted to the hash value “5a62c” by the hash function. This hash value “5a62c” is associated with one of the password lists using a reduction function. Let’s assume that it is associated with “Beijing”. Furthermore, “Beijing” is converted into “6ge2f” using a hash function, and this hash value is associated with “London” by a reduction function, and this is repeated endlessly to create one chain. This is equivalent to one row of the rainbow table.
I think there are many people who wonder about this mysterious function and chain. But this is the key to the idea of a rainbow table. Now, please understand that you need a reduction function even if you feel a little strange. I can understand the need for it later.
Contents of rainbow table
When 1000 words are used as the starting point, each time the hash conversion → conversion by the reduction function is repeated 1000 times, 1000 times × 1000 rows (chain), and a table containing 1 million password and hash value pairs is completed.
For rainbow tables, the hash function always uses the same. It is necessary to use different reduction functions from R 1 to R n depending on the sequence . This table contains innumerable “password and hash value” pairs.
The purpose of creating such a table is to save memory space. Actually, this rainbow table, you can leave only the first row and the last row, and discard all the way. In the figure, leave only the columns of “Tokyo, Kyoto …” on the left and “Paris, Seoul …” at the end, and discard all passwords and hash values in the middle.
Should I do such a bold thing? It’s good. This is because you can easily restore only the necessary parts later. If you want to restore Kyoto line, you can easily restore this line by calculating hash function and restore function one after another. Therefore, you can delete everything in the middle. If this happens, the memory space will be 8 bytes x 1000 x 2 = 16000 bytes = about 15.6K bytes. The size is only 0.006%.
Attack using a rainbow table
Now, let’s actually attack using this rainbow table.
The rainbow table attack procedure searches assuming that the hash value obtained from the attacked site is one of the last hash values in the rainbow table. If that doesn’t work, it will search backwards, assuming it’s the hash value of the column before the last one.
However, all the hash values have been deleted. Therefore, the reduction function R n used last is used for the obtained hash value . If the assumption that “it is the same as one of the hash values in the last column” is met, the plaintext derived from the result of “R n (obtained hash value)” is one of the passwords at the end of the rainbow table. Should match.
In the second attack, Liverpool was found in the rainbow table. This means that this Liverpool line contains the hash value and password we got. If you try to restore only this Liverpool row, “Beijing” will appear in front of the obtained hash value. That is, it can be seen that the obtained value is a hash value generated from Beijing and the target password is “Beijing”.
Now, applying the reduction function that last used the hash value obtained from the attack target, it became Hanoi.
I searched for “Hanoi” in the password at the end of the rainbow table, but I could not find it. In other words, the assumption “the same as the hash value of the last column” was wrong.
Next, suppose it is the hash value of the last column before. When the reduction function R n-1 is applied to the obtained hash value, it becomes Fukuoka. I provided a hash function to this Fukuoka and applied a reduction function R n to it. The last one became Liverpool. This is recreating the chain to match the last password group on the rainbow table side.
This plaintext Liverpool was found at the end of a chain on the rainbow table side. This means that the desired hash value and password should be in this chain.
The top of the chain with Liverpool at the end is Beijing. Starting from Beijing, apply the hash and reduction functions to recover just this chain. Then, since the hash value obtained from the attack target comes out, you can see that the previous password is the target password.
As I’ve read so far, rainbow table attacks are rarely used. This is because rainbow table attacks can be prevented by methods such as “multiplexing hash functions” and “salt (add keywords to passwords and then apply hash functions)”.
However, since the algorithm is extremely interesting and the understanding of password hashing is deepened, it is not wasteful for students and new engineers to understand and write simple code. There is no doubt that it will help you to understand. If you have the time, try a simple coding and it will be easier to understand.