This is the page to view the actual task: https://seedsecuritylabs.org/Labs_20.04/Crypto/Crypto_MD5_Collision/
Step 1:
Write a C program based on the pseudo-code given in Task 4:
Array X;
Array Y;
main()
{
    if(X’s contents and Y’s contents are the same)
        run benign code;
    else
        run malicious code;
    return;
}
If the contents of the 2 arrays are the same, print a message indicating that the benign code was run; otherwise, print a message indicating that the malicious code was run.
This is my md5.c file:
#include <stdio.h>
unsigned char x[400] = { 'A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A', 'A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A'};
unsigned char y[400] = { 'A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A', 'A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A'};
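int main()
{
    int i;
    int same = 1;

    /* Compare the two arrays byte by byte. */
    for (i = 0; i < 400; i++)
    {
        if (x[i] != y[i])
            same = 0;
    }

    /* Identical arrays take the benign branch, otherwise the malicious one. */
    if (same)
        printf("Benign code executed\n");
    else
        printf("Malicious code executed\n");

    return 0;
}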
To compile and run the file, use the following commands in the terminal:
gcc md5.c     # produces an a.out file
./a.out       # runs the file
Step 2:
Create a prefix file, then create 2 binary files with the same hash. The prefix needs to be part of the output file from Step 1. Copy the first N bytes (N should be a multiple of 128 bytes) from the output file and save it to a prefix file.
Running xxd a.out in the terminal shows that the first array of A's in the a.out file starts at offset 0x1040 (hexadecimal), which is 4160 in decimal.
For the prefix file to end in the middle of the first array, I chose N = 4224 bytes. The following command extracts the first 4224 bytes of the a.out file into the prefix file:
head -c 4224 a.out > prefix
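The choice of N can be sanity-checked with a little arithmetic. The sketch below is an addition to this report: it uses only the numbers quoted above (0x1040, 400, 4224 and the 128 bytes that md5collgen appends), the variable names are made up for illustration, and it assumes MD5's 64-byte block size. The offsets will differ for another compiler or build.

```shell
#!/bin/sh
# Values observed in this report; re-read them from xxd for your own a.out.
array_start=$((0x1040))   # start of the first array (4160 in decimal)
array_len=400             # size of x[] in md5.c
prefix_len=4224           # N chosen in this report
coll_len=128              # bytes appended by md5collgen

# N must be block aligned. MD5 works on 64-byte blocks, so a multiple of
# 128 (as the task asks) is aligned as well.
aligned=$(( prefix_len % 64 == 0 ))

# The 128 generated bytes must fall entirely inside the first array.
inside=$(( prefix_len >= array_start && prefix_len + coll_len <= array_start + array_len ))

echo "array_start=$array_start aligned=$aligned inside=$inside"
```

Both flags should print as 1 for the values used here.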
Then, I used the md5collgen command to create 2 binary files with the same hash, named pprefix and qprefix:
md5collgen -p prefix -o pprefix qprefix
You can use the md5sum command to show that the pprefix and qprefix binary files have the same hash:
md5sum pprefix qprefix
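Matching hashes alone do not show a collision, because two identical files also hash the same. A minimal sketch of a check that covers both conditions is given below. The helper name compare_files is hypothetical and is not part of the lab tools; it relies only on md5sum and cmp.

```shell
#!/bin/sh
# compare_files A B: hypothetical helper written for this report.
# Prints "collision" when the MD5 hashes match but the bytes differ.
compare_files() {
    hash_a=$(md5sum < "$1" | cut -d ' ' -f 1)
    hash_b=$(md5sum < "$2" | cut -d ' ' -f 1)
    if [ "$hash_a" != "$hash_b" ]; then
        echo "different-hash"
    elif cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "collision"
    fi
}
```

If md5collgen succeeded, compare_files pprefix qprefix should print collision. The same helper can later be pointed at bcode and mcode.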
Step 3:
Now you have two binary files with same prefix and P and Q (128 bytes
each). Refer to the Figure 4 of Task 4. Observe carefully what the diagram is
showing you and recreate the file.
Now that we have the pprefix file and the qprefix file, we just need to connect them with the correct suffix.
First, the suffix file is obtained from the part of the a.out file that follows the bytes covered by the pprefix (or qprefix) file.
Running the ls -l *prefix command shows that the prefix file is 4224 bytes while the pprefix and qprefix files are 4352 bytes each; the extra 128 bytes were generated by our md5collgen command.
Therefore, the suffix file is everything after the first 4352 bytes of the a.out file, that is, everything from byte 4353 onward.
To obtain the suffix file:
tail -c +4353 a.out > suffix
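The off-by-one between "skip 4352 bytes" and "+4353" is easy to get wrong, so here is a toy illustration on an 8-byte string standing in for a.out. It is only a sketch of how tail -c counts and is not part of the original lab steps.

```shell
#!/bin/sh
# Toy 8-byte input standing in for a.out.
data='abcdefgh'

# tail -c +K starts output AT byte K (1-based), so it skips K-1 bytes.
from_fourth=$(printf '%s' "$data" | tail -c +4)   # skips "abc"

# Without the plus sign, tail -c K keeps only the LAST K bytes.
last_four=$(printf '%s' "$data" | tail -c 4)

echo "$from_fourth $last_four"
```

By the same rule, tail -c +4353 a.out skips exactly the 4352 bytes that pprefix already covers.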
Since we know that the last 128 bytes of pprefix were generated by our md5collgen command, those bytes are the p that we want.
To obtain the p file:
tail -c 128 pprefix > p
After that, we know that the middle file starts right after p (or q) and ends in the middle of the second array, so by using the xxd command to inspect the suffix file, we can determine the size of our middle file.
xxd suffix
From the figure above, we know that the second array starts at offset 00e0 (hexadecimal) in the suffix file, which is 224 in decimal. I added another 64 bytes so that the middle file ends in the middle of the second array, which makes the middle file 288 bytes long.
To obtain the middle file:
head -c 288 suffix > middle
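The 64 extra bytes are not arbitrary. My reading of Figure 4 is that the copy of p has to sit at the same index in the second array as the original P (or Q) does in the first, so that the two arrays compare equal in one program and differ in the other. The sketch below redoes that arithmetic with the offsets observed in this report; the variable names are illustrative only.

```shell
#!/bin/sh
# All offsets are the values observed in this report.
x_start=$((0x1040))        # first array in a.out (4160)
p_start=4224               # where P begins in a.out (the prefix length)
suffix_start=4352          # a.out offset of the first suffix byte
y_in_suffix=$((0x00e0))    # second array inside the suffix file (224)

p_offset_in_x=$(( p_start - x_start ))           # index of P inside x[]
middle_len=$(( y_in_suffix + p_offset_in_x ))    # bytes to keep before p
y_start=$(( suffix_start + y_in_suffix ))        # second array in a.out

echo "p_offset_in_x=$p_offset_in_x middle_len=$middle_len y_start=$y_start"
```

With these inputs the index is 64 and the middle length is 288, matching the numbers used above.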
Then, everything after the middle file and the 128 bytes that p will replace will be our commonsuffix file. Since p is 128 bytes and middle is 288 bytes, our commonsuffix file is everything in the suffix file after the first 416 (128+288) bytes, that is, from byte 417 onward.
To obtain the commonsuffix file:
tail -c +417 suffix > commonsuffix
Finally, concatenate all the files together according to this figure:
cp pprefix bcode
cp qprefix mcode
cat middle >> bcode
cat middle >> mcode
cat p >> bcode
cat p >> mcode
cat commonsuffix >> bcode
cat commonsuffix >> mcode
After concatenating everything, your bcode and mcode should have the same file size as your a.out.
From the figure above, we can observe that all three of the files have the same file size of 8228 bytes.
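The split and reassembly can be rehearsed without md5collgen. The sketch below is a dry run on a stand-in file of random bytes with the size quoted in this report, and its "pprefix" is simply the first 4352 bytes of that stand-in. It checks the byte bookkeeping only, not the collision itself.

```shell
#!/bin/sh
# Dry run on a stand-in file; nothing here touches the real a.out.
work=$(mktemp -d)
head -c 8228 /dev/urandom > "$work/a.out"      # stand-in with the report's size

head -c 4352 "$work/a.out" > "$work/pprefix"   # prefix (4224) plus P (128)
tail -c +4353 "$work/a.out" > "$work/suffix"   # everything after pprefix
tail -c 128 "$work/pprefix" > "$work/p"        # the 128 bytes to duplicate
head -c 288 "$work/suffix" > "$work/middle"    # up to the slot in array two
tail -c +417 "$work/suffix" > "$work/commonsuffix"

cat "$work/pprefix" "$work/middle" "$work/p" "$work/commonsuffix" > "$work/bcode"

orig_size=$(( $(wc -c < "$work/a.out") ))
new_size=$(( $(wc -c < "$work/bcode") ))
echo "a.out=$orig_size bcode=$new_size"
```

The two sizes should match, and the rebuilt file can differ from the stand-in only inside the 128-byte slot where p was written.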
Step 4:
This is the content of the bcode binary file:
This is the content of the mcode binary file:
Notice the difference between the two files. Then, run the code.
chmod +x bcode mcode
./bcode
./mcode
These are the results:
We can see that the benign code is executed for the bcode file while the malicious code is executed for the mcode file, although both files have the same hash value.
This report has described the steps to launch a simple MD5 collision attack. In conclusion, MD5 collision attacks pose a significant threat to the security of systems that rely on this hash function. As demonstrated by this report, it is possible to create 2 different programs that share the same MD5 hash but behave completely differently, which can lead to serious security vulnerabilities.