MD5 Collision Attack Lab Task 4 — SEEDLabs 2.0 Cryptography

Emily
5 min read · Mar 14, 2023

This is the page to view the actual task: https://seedsecuritylabs.org/Labs_20.04/Crypto/Crypto_MD5_Collision/

Step 1:

Write a C program based on the pseudo-code given in Task 4:

Array X;
Array Y;

main()
{
if(X’s contents and Y’s contents are the same)
run benign code;
else
run malicious code;
return;
}

If the contents of the two arrays are the same, print a message indicating that the benign code was run; otherwise, print a message indicating that the malicious code was run.

This is my md5.c file:

#include <stdio.h>
unsigned char x[400] = { 'A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A', 'A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A'};
unsigned char y[400] = { 'A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A', 'A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A'};

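int main()
{
    int i;
    int same = 1;

    /* Compare the two arrays byte by byte. */
    for (i = 0; i < 400; i++)
    {
        if (x[i] != y[i])
            same = 0;
    }

    /* Identical arrays take the benign branch, any difference the malicious one. */
    if (same)
        printf("Benign code executed\n");
    else
        printf("Malicious code executed\n");
}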

To run the file, use the following commands in the terminal:

gcc md5.c   # produces an a.out file
./a.out     # runs the file

Step 2:

Create a prefix file, then create two binary files with the same hash. The prefix must be the beginning of the executable produced in Step 1: copy the first N bytes of a.out (N should be a multiple of 64, the MD5 block size) and save them to a prefix file.

Running xxd a.out in the terminal shows that the first array of A's in a.out starts at offset 0x1040, which is 4160 in decimal.

For the prefix to end inside the first array, I chose N = 4224 bytes (4160 + 64). The command head -c 4224 a.out > prefix extracts the first 4224 bytes of a.out into the prefix file.
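
The choice of N can be sanity-checked with a little arithmetic. The sketch below is a hypothetical helper (the name is_valid_split and its parameters are not part of the lab). It assumes the 64-byte MD5 block size and the 128 bytes that md5collgen appends in this write-up, and checks that those 128 bytes would land entirely inside the array.

```c
#include <stddef.h>

#define MD5_BLOCK     64   /* MD5 processes its input in 64-byte blocks */
#define COLLISION_LEN 128  /* bytes appended by md5collgen in this write-up */

/* Hypothetical helper, not part of the lab: returns 1 if a prefix of
 * prefix_len bytes ends on a block boundary at or after the start of the
 * array and the 128 collision bytes that follow still fit inside the
 * array, otherwise 0. */
int is_valid_split(size_t array_start, size_t array_len, size_t prefix_len)
{
    size_t array_end = array_start + array_len;

    if (prefix_len % MD5_BLOCK != 0)
        return 0;                    /* not aligned to an MD5 block */
    if (prefix_len < array_start)
        return 0;                    /* prefix ends before the array begins */
    return prefix_len + COLLISION_LEN <= array_end;
}
```

With the numbers above, is_valid_split(4160, 400, 4224) returns 1: 4224 is 66 × 64, and 4224 + 128 = 4352 is still below the end of the array at 4560.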

Then I used the md5collgen command to create two binary files with the same hash, named pprefix and qprefix.

md5collgen -p prefix -o pprefix qprefix

You can use the md5sum command to show that both the pprefix and qprefix binary files have the same hash.
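
md5sum only shows that the digests match. To confirm that the two outputs really are different files, their bytes can be compared directly, which is what cmp -l pprefix qprefix does on the command line. A minimal C sketch of the same idea, where count_diff is a hypothetical helper that works on buffers already read into memory:

```c
#include <stddef.h>

/* Hypothetical helper: number of positions at which two buffers of the
 * same length n hold different bytes. */
size_t count_diff(const unsigned char *a, const unsigned char *b, size_t n)
{
    size_t diff = 0;

    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i])
            diff++;
    }
    return diff;
}
```

For pprefix and qprefix the count should be non-zero, and every differing position should fall in the last 128 bytes, since the first 4224 bytes are the shared prefix.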

Step 3:

Now you have two binary files that share the same prefix and end in P and Q respectively (128 bytes each). Refer to Figure 4 of Task 4, observe carefully what the diagram shows, and recreate that file layout.

Internal composition of bcode and mcode

Now that we have the pprefix and qprefix files, we just need to append the correct suffix to each.

First, the suffix file is the part of a.out that comes after the bytes covered by pprefix (or qprefix).

Running ls -l *prefix shows that the prefix file is 4224 bytes while pprefix and qprefix are 4352 bytes each; the extra 128 bytes were generated by md5collgen.

Therefore, the suffix file is everything after the first 4352 bytes of a.out, that is, everything from byte 4353 onward.

To obtain suffix file:

tail -c +4353 a.out > suffix

The 128 bytes generated by md5collgen sit at the end of pprefix, and they are the p that we want.

To obtain p file:

tail -c 128 pprefix > p

After that, we know that the middle file starts right after p (or q) and ends inside the second array, at the same position within that array where p sits in the first one. Using xxd on the suffix file, we can determine the size of the middle file.

xxd suffix

From the xxd output, the second array starts at offset 0x00e0 of the suffix, which is 224 in decimal. I added another 64 bytes so that the middle file ends 64 bytes into the second array, just as the prefix ends 64 bytes into the first array. This makes the middle file 288 bytes.

To obtain middle file:

head -c 288 suffix > middle

Then, everything after the middle file and the 128 bytes that p will overwrite is our commonsuffix file. Since middle is 288 bytes and p is 128 bytes, commonsuffix is everything in the suffix file after the first 416 (288 + 128) bytes, that is, from byte 417 onward.

To obtain commonsuffix file:

tail -c +417 suffix > commonsuffix
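
The sizes used in this step all follow from four numbers: the file offsets of the two arrays, N, and the size of a.out. The sketch below redoes that arithmetic in C. The struct and function names are illustrative only, and the second array's file offset of 4576 is derived from the values above (4352 + 224).

```c
#include <stddef.h>

/* Sizes of the pieces that make up bcode/mcode. Names are illustrative. */
struct layout {
    size_t pprefix_len;  /* prefix plus the 128 collision bytes */
    size_t middle_len;   /* bytes between the first P/Q and the second P */
    size_t common_off;   /* 0-based offset of commonsuffix inside suffix */
    size_t common_len;   /* size of commonsuffix */
};

/* x_start, y_start: file offsets of the two arrays, as read from xxd.
 * prefix_len: N from Step 2, assumed to be >= x_start.
 * file_len: size of a.out. */
struct layout compute_layout(size_t x_start, size_t y_start,
                             size_t prefix_len, size_t file_len)
{
    struct layout l;
    size_t in_array = prefix_len - x_start;  /* where P sits inside x */

    l.pprefix_len = prefix_len + 128;
    /* P must sit at the same position inside y as it does inside x. */
    l.middle_len = (y_start + in_array) - l.pprefix_len;
    l.common_off = l.middle_len + 128;
    l.common_len = file_len - l.pprefix_len - l.common_off;
    return l;
}
```

Calling compute_layout(4160, 4576, 4224, 8228) gives 4352, 288, 416 and 3460. Because tail -c +N counts bytes from 1, the commonsuffix starts at N = 416 + 1 = 417.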

Finally, concatenate all the files together according to this figure:

cp pprefix bcode
cp qprefix mcode
cat middle >> bcode
cat middle >> mcode
cat p >> bcode
cat p >> mcode
cat commonsuffix >> bcode
cat commonsuffix >> mcode

After concatenating everything, your bcode and mcode should have the same file size as your a.out.

From the figure above, we can observe that all three of the files have the same file size of 8228 bytes.

Step 4:

This is the content of the bcode binary file:

This is the content of the mcode binary file:

Notice the difference between the two files. Then, run the code.

chmod +x bcode mcode
./bcode
./mcode

These are the results:

Command to execute the binary files

We can see that the benign code is executed for the bcode file while the malicious code is executed for the mcode file, even though both files have the same hash value.
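
The reason for the different behaviour can be checked directly on the bytes: in bcode the 128-byte regions inside both arrays hold P, while in mcode the first region holds Q. A minimal sketch, assuming the file contents have already been loaded into a buffer; same_region is a hypothetical name, not something the lab provides.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the len bytes at offsets off_x and
 * off_y of the same file image are identical, otherwise 0. The caller
 * must make sure both regions lie inside the buffer. */
int same_region(const unsigned char *file, size_t off_x, size_t off_y,
                size_t len)
{
    return memcmp(file + off_x, file + off_y, len) == 0;
}
```

With the offsets from this write-up (the first region at 4224, the second at 4352 + 288 = 4640), same_region(bcode, 4224, 4640, 128) is expected to be 1 and same_region(mcode, 4224, 4640, 128) is expected to be 0.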

Checking the hash of the binary files using md5sum

This report has walked through the steps of a simple MD5 collision attack. In conclusion, MD5 collision attacks pose a significant threat to the security of systems that rely on this hash function. As demonstrated here, it is possible to create two different programs that share the same MD5 hash but behave completely differently, which can lead to serious security vulnerabilities.
