3,439,775 members and growing!   10,248 currently online.
Email Password Lost your password?
All Topics, .NET, C# >> .NET >> Encryption

Exploiting MD5 collisions (in C#)
By ediazc.

This article shows how the MD5 collisions can be used to tamper software distribution schemas.
C#
Windows, .NET (.NET 1.1)
ASP.NET, Win32, VS (VS.NET2003)
CEO, Arch, DB, Dev, QA
Posted: 14 Sep 2005
Updated: 20 Sep 2005
Views: 65,473
Programming books
Ultimate Combo
MFC tools $449
Send to a friend


Search    
Advanced Search
Sitemap
PrintBroken Article?Bookmark Discuss
29 votes for this article.
Popularity: 6.4. Rating: 4.38 out of 5.

Introduction

In my previous article Good Bye MD5, I introduced you to the current findings on cryptology and MD5 collision detection. A debate started, and most of the people think that these findings are not a serious issue.

Microsoft agreed that this is an important issue:

"Microsoft is banning certain cryptographic functions from new computer code, citing increasingly sophisticated attacks that make them less secure, according to a company executive".

"The Redmond, Wash., software company instituted a new policy for all developers that bans functions using the DES, MD4, MD5 and, in some cases, the SHA1 encryption algorithm, which is becoming "creaky at the edges," said Michael Howard, senior security program manager at the company, Howard said". Source: Microsoft Scraps Old Encryption in New Code

We now have some proofs of concept, like a pair of X.509 colliding certificates and a spectacular example of a pair of postscript documents, with the same MD5 hash value, you can read about this in the excellent paper, Attacking Hash Functions by Poisoned Messages "The Story of Alice and her Boss".

How to exploit the collisions

There is a known result about MD5 hash function:

If MD5(x) == MD5(y) then MD5(x+q) == MD5(y+q)

So, if you have a pair of messages, x and y, with the same MD5 value, you can append a payload q, the MD5 value remains the same, the size of q is arbitrary. You need a pair of vectors, x and y to do the exploit. You can try to find a pair for yourself, but we already have a pair of values, given by the Chinese investigators Joux and Wang. A practical use of this pair of vector values is explained in the paper MD5 To Be Considered Harmful Someday, by Dan Kaminsky.

In my blog, I have written about these issues, and now I want to show you how you can make an exploit using these vectors.

Hacking software distribution

The proof of concept to be shown in this article has the following scenario:

  • This is a simulated software distribution mechanism.
  • The software is distributed in binary format, in files with .bin extension.
  • Exists as an extraction program that checks and extracts the software from the .bin file.
  • For verification purpose, we use the MD5 value to check the integrity of the .bin files.

This picture shows a scenario, where a pair of binary files with the same MD5 are generated, MD5(good.bin) == MD5(evil.bin):

Attacking the distribution software

First, we will build a generator program, this program takes a pair of executables, the first is a harmless program and the second the evil file, is a harmful program, and generates a pair of binary distribution files (.bin files). These are good.bin distribution file, and an evil.bin distribution file.

The good program

The code of the harmless program is simple:

namespace GoodExe
{
 /// <summary>
 /// A Harmless program
 /// </summary>
 class Class1
 {
  /// <summary>
  /// The main entry point for the application.
  /// </summary>
  [STAThread]
  static void Main(string[] args)
  {
   Console.WriteLine ("this is a good executable");
  }
 }
}

The evil program

This is the code of the evil program, that simulates a harmful behavior:

using System;
using System.Threading;
 
namespace EvilExe
{
  /// <summary>
  /// A harmfull program
  /// </summary>
  class Class1
  {
    /// <summary>
    /// The main entry point for the application.
    /// </summary>
    [STAThread]
    static void Main(string[] args)
    {
      Console.WriteLine ("This is an evil file");
      Console.WriteLine ("Formatting your hard drive...");
      Thread.Sleep(2000);
      Console.WriteLine ("Just joking...");
    }
  }
}

The collision generator

We have a pair of vectors with the same MD5 value:

Collapse
/// <summary>
/// First prefix vector
/// Taken from stripwire by Dan Kaminsky
/// http://www.doxpara.com/md5_someday.pdf
/// </summary>
static byte[] vec1 = 
{
   0xd1, 0x31, 0xdd, 0x02, 0xc5, 0xe6 
   , 0xee , 0xc4 , 0x69 , 0x3d, 0x9a , 0x06 
   , 0x98 , 0xaf , 0xf9 , 0x5c , 0x2f , 0xca 
   , 0xb5 , /**/0x87 , 0x12 , 0x46 , 0x7e 
   , 0xab , 0x40 , 0x04 , 0x58 , 0x3e , 0xb8 
   , 0xfb , 0x7f , 0x89 , 0x55 , 0xad 
   , 0x34 , 0x06 , 0x09 , 0xf4 , 0xb3 , 0x02 
   , 0x83 , 0xe4 , 0x88 , 0x83 , 0x25 
   , 0x71 , 0x41 , 0x5a, 0x08 , 0x51 , 0x25 
   , 0xe8 , 0xf7 , 0xcd , 0xc9 , 0x9f , 
   0xd9 , 0x1d , 0xbd , 0xf2 , 0x80 , 0x37 
   , 0x3c , 0x5b , 0xd8 , 0x82 , 0x3e 
   , 0x31 , 0x56 , 0x34 , 0x8f , 0x5b , 0xae 
   , 0x6d , 0xac , 0xd4 , 0x36 , 0xc9 
   , 0x19 , 0xc6 , 0xdd , 0x53 , 0xe2 , 0xb4 
   , 0x87 , 0xda , 0x03 , 0xfd , 0x02 
   , 0x39 , 0x63 , 0x06 , 0xd2 , 0x48 , 0xcd 
   , 0xa0 , 0xe9 , 0x9f , 0x33 , 0x42 
   , 0x0f , 0x57 , 0x7e , 0xe8 , 0xce , 0x54 
   , 0xb6 , 0x70 , 0x80 , 0xa8 , 0x0d 
   , 0x1e , 0xc6 , 0x98 , 0x21 , 0xbc , 0xb6 
   , 0xa8 , 0x83 , 0x93 , 0x96 , 0xf9 
   , 0x65 , 0x2b , 0x6f , 0xf7 , 0x2a , 0x70
};
Collapse
/// <summary>
/// Second prefix vector
/// Taken from stripwire by Dan Kaminsky
/// http://www.doxpara.com/md5_someday.pdf
/// </summary>
static byte[] vec2 = 
{
   0xd1 , 0x31, 0xdd , 0x02 , 0xc5 , 0xe6 
   , 0xee , 0xc4 , 0x69 , 0x3d , 0x9a , 0x06 
   , 0x98 , 0xaf , 0xf9 , 0x5c, 0x2f , 0xca 
   , 0xb5 , /**/0x07 , 0x12 , 0x46 , 0x7e 
   , 0xab , 0x40 , 0x04 , 0x58 , 0x3e , 0xb8 
   , 0xfb , 0x7f , 0x89 , 0x55 , 0xad 
   , 0x34 , 0x06 , 0x09 , 0xf4 , 0xb3 , 0x02 
   , 0x83 , 0xe4 , 0x88 , 0x83 , 0x25 
   ,/**/ 0xf1 , 0x41 , 0x5a , 0x08 , 0x51 , 0x25 
   , 0xe8 , 0xf7 , 0xcd , 0xc9 , 0x9f
   , 0xd9 , 0x1d , 0xbd , /**/0x72 , 0x80 
   , 0x37 , 0x3c , 0x5b, 0xd8 , 0x82 
   , 0x3e , 0x31 , 0x56 , 0x34 , 0x8f , 0x5b 
   , 0xae , 0x6d , 0xac , 0xd4 , 0x36 
   , 0xc9 , 0x19 , 0xc6 , 0xdd , 0x53 , 0xe2 
   , /**/0x34 , 0x87 , 0xda , 0x03 , 0xfd 
   , 0x02 , 0x39 , 0x63 , 0x06 , 0xd2 , 0x48 
   , 0xcd , 0xa0, 0xe9 , 0x9f , 0x33 
   , 0x42 , 0x0f , 0x57 , 0x7e , 0xe8 , 0xce 
   , 0x54 , 0xb6 , 0x70 , 0x80 , /**/ 0x28 
   , 0x0d , 0x1e, 0xc6 , 0x98 , 0x21 , 0xbc 
   , 0xb6 , 0xa8 , 0x83 , 0x93 , 0x96 
   , 0xf9 , 0x65 , /* flag byte*/0xab 
   , 0x6f , 0xf7 , 0x2a , 0x70
};

Remember that given this pair of vectors, if we have a payload of any size, then MD5(vec1+payload) == MD5(vec2+payload). The payload is built in this way, the length of good file, the length of evil file, the content of the good file, and the content of the evil file.

The core of the generation program is shown below:

Collapse
[STAThread]
static void Main(string[] args)
{
   if (args.Length != 2)
    Usage();

 
   byte[] goodFile = ReadBinaryFile(args[0]);
   byte[] evilFile = ReadBinaryFile(args[1]);
   
   WriteBinary("good.bin", vec1, goodFile, evilFile);
   WriteBinary("evil.bin", vec2, goodFile, evilFile);
   
   ShowMD5("good.bin");
   ShowMD5("evil.bin");
}
   
private static void Usage ()
{
   Console.WriteLine("Usage: md5gen good_file evil_file");
   Environment.Exit (-1);
}

 
private static void WriteBinary (string sOutFileName, 
          byte[] prefix, byte[] goodFile, byte[] evilFile)
{
   using (FileStream fs = File.OpenWrite (sOutFileName))
   {
      using (BinaryWriter writer = new BinaryWriter(fs))
      {
         writer.Write(prefix);
         writer.Write ( goodFile.Length );
         writer.Write ( evilFile.Length );
         fs.Write (goodFile, 0, goodFile.Length);
         fs.Write (evilFile, 0, evilFile.Length);
         fs.Close ();
      }
   }
}

If we apply the generator program to the pair of programs, we generate a pair of files, good.bin and evil.bin in the following way:

C:\TEMP>md5clone goodexe.exe evilexe.exe
MD5 Hash for good.bin is 1D8EE13FBA00DD022F002AAD0E3EF9C7
MD5 Hash for evil.bin is 1D8EE13FBA00DD022F002AAD0E3EF9C7

Now, we can publish good.bin in the Internet for people to download it, and later, we can replace it with evil.bin. Now, the users will get infected, without noticing and convinced that there is no tampering, because the MD5 signature is the same for both files, in others words we have MD5(good.bin) == MD5(evil.bin).

The extractor program

Now, suppose we have changed the extractor program, with our own version. Our extractor receives the .bin distribution file, and extracts the good or evil program based on the prefix vector at the beginning of the .bin file. We use the byte at position 123 to detect the vector that is used for the prefix.

The code of the extractor is very simple:

Collapse
  [STAThread]
  static void Main(string[] args)
  {
    if (args.Length == 0)
     Usage();
    ExtractFile(args[0], args[1]);
  }

 
  private static void ExtractFile (string sfilename, 
                                      string soutputfile)
  {
    using (BinaryReader reader = 
            new BinaryReader(File.OpenRead (sfilename)))
    {
      byte[] vec = reader.ReadBytes (128);
      int goodSize = reader.ReadInt32 ();
      int evilSize  = reader.ReadInt32 ();
      /// open evil file
      if (vec[123] == 0xab)
      {
        reader.ReadBytes (goodSize);
        byte[] evil = reader.ReadBytes (evilSize);
        using (BinaryWriter writer = 
              new BinaryWriter(File.OpenWrite (soutputfile)))
        {
           writer.Write (evil);
           writer.Close (); 
        }
      }
      else
      {
        byte[] good = reader.ReadBytes (goodSize);
        using (BinaryWriter writer = 
            new BinaryWriter(File.OpenWrite (soutputfile)))
        {
           writer.Write (good);
           writer.Close ();
        }
      }
      Console.WriteLine ("File written on {0}", 
                                       soutputfile);
    }
  }

 
  private static void Usage ()
  {
     Console.WriteLine(
        "Usage: md5extract file.bin output_file.exe");
     Environment.Exit (-1);
  }

Suppose, you receive the good.bin file, then you apply the extractor on good.bin and the good.exe file is extracted. But if you receive the evil.bin file, then the extractor will extract the evil.exe, i.e. the harmful executable. Remember that MD5(evil.bin) == MD5(good.bin).

Conclusion

Recently, the world of cryptographic hash functions was on crisis. A lot of researchers announced "attacks" to find collisions for common hash functions such as MD5 and SHA-1. "For cryptographers, these results are exciting - but many so-called "practitioners" turned them down as practically irrelevant".

I hope, this proof of concept will convince you that there is a serious issue with MD5.

"There have already been a few exploits of the collision-finding attacks against MD5: Kaminski [Ka] and Mikle [Mi] presented different executables with the same MD5 hash. One of Kaminski's executables was quite harmless, the other one very harmful. Mikle's executables were self-extracting archives, extracting different stuffs. Lenstra, Wang and de Weger [LWdW,LdW] constructed colliding X.509 certificates".

This article shows how a failure on the software distribution chain, allows exploiting the current findings on cryptology about the MD5 hash function.

History

  • September 20th, 2005: Some grammar correction, and title modified.

About ediazc


Eduardo Diaz
site | english blog | spanish blog

Click here to view ediazc's online profile.


Other popular .NET articles:

[Top] Sign in to vote for this article:     PoorExcellent  
Xoreax cuts VS build times by up to 90%

Note: You must Sign in to post to this message board.
FAQ Message score threshold    Search comments  
 View   Per page  
 Msgs 1 to 25 of 43 (Total: 43) (Refresh)First Prev Next     
Subject 
Author 
Date 
  Certificate collision
 MilanTomic78 5:49 14 Oct '05 
  Re: Certificate collision
 ediazc 9:27 14 Oct '05 
  Caesar and Alice was better
Unconfirmed/Anonymous posting Anonymous 20:32 23 Sep '05 
  Re: Caesar and Alice was better
 ediazc 11:24 24 Sep '05 
  Security is a process
 ediazc 13:44 23 Sep '05 
  this is stupid
 ahurwitz 11:21 23 Sep '05 
  Re: this is stupid
Unconfirmed/Anonymous posting Vanco 12:20 23 Sep '05 
  Re: this is stupid
 ahurwitz 13:32 23 Sep '05 
  Re: this is stupid
Unconfirmed/Anonymous posting Anonymous 15:08 23 Sep '05 
  Re: this is stupid
 ahurwitz 13:34 23 Sep '05 
  Re: this is stupid
Unconfirmed/Anonymous posting Anonymous 14:01 23 Sep '05 
  Re: this is stupid
Unconfirmed/Anonymous posting xacatecas 4:05 24 Sep '05 
  Re: this is stupid
 evildictaitor 13:38 7 Sep '06 
  Re: this is stupid
Unconfirmed/Anonymous posting anonymous 4:06 24 Sep '05 
  Re: this is stupid
Unconfirmed/Anonymous posting Anonymous 15:10 23 Sep '05 
  Re: this is stupid
 orlyfu 18:05 7 Nov '05 
  Re: this is stupid
 evildictaitor 13:33 7 Sep '06 
  What would be *really* useful ...
 CDGuru 11:06 23 Sep '05 
  Slashdotted
 Judah Himango 10:18 23 Sep '05 
  Re: Slashdotted
 ediazc 13:33 23 Sep '05 
  extractor
Unconfirmed/Anonymous posting Fred Spiessens 9:01 23 Sep '05 
  Does this example demonstrate anything?
Unconfirmed/Anonymous posting anonymous 8:58 23 Sep '05 
  Re: Does this example demonstrate anything?
 moobobcat 4:50 10 Oct '05 
  file size
Unconfirmed/Anonymous posting eean 6:47 23 Sep '05 
  Re: file size
Unconfirmed/Anonymous posting Anonymous 7:53 23 Sep '05 
Last Visit: 12:12 Friday 13th October, 2006First Prev Next     

General comment    News / Info    Question    Answer    Joke / Game    Admin message


Updated: 20 Sep 2005 Article content copyright ediazc, 2005
everything else Copyright © CodeProject, 1999-2006.
Web12 | Advertise on The Code Project | Privacy

The Ultimate ToolboxASP AllianceDeveloper FusionDevelopersdexDevGuruProgrammers HeavenPlanet Source CodeTek-Tips Forums
Help!
Articles
Message Boards
StoreFront
Lounge
What is 'The Code Project'?
General FAQ
Post a Question
Site Directory
About Us
Latest
Most Popular
Search
Site Directory
Submit an Article
Update an Article
Article Competition
Visual C++
ATL / WTL / STL
COM
C++/CLI
C#
ASP.NET
VB.NET
Web Development
.NET Framework
SQL / ADO / ADO.NET
XML / XSL
OS / SysAdmin
Work Issues
Article Requests
Collaboration
General Discussions
Hardware
Algorithms / Math
Suggestions
The Soapbox