Friday, January 6, 2012

Poor Man's Bin-Diff

Sometimes you come across a binary file that is nearly identical to another and you need to pinpoint the very minor differences between them. Usually the first n bytes are identical, then deviations or complete differences exist beyond that point. If you don't have access to a tool like bindiff, or the file format is not applicable to bindiff (like the 2 ISO files I had to compare today, or anything other than an executable), there is a very easy way to find where the differences start and, potentially, all of the differences between the files. With some command-line kung-fu we'll use xxd (the hex dump utility, which ships with Vim) and diff (the programmer's difference identification tool) to locate the changes and the point where they begin.

First, use xxd to dump the binary file contents to an ASCII hex representation:

# xxd File1.exe > File1_dump 
# xxd File2.exe > File2_dump

Standard PE/Executable file, dumped to hex with xxd.
As you can see above, xxd makes for a very convenient binary file viewer when looking for plain text meta-data, basic data structures and other potentially interesting items. We've essentially converted the binary files into ASCII-based text files, which is exactly the kind of input diff expects.
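To try the dump step without a real executable at hand, here is a minimal sketch. The file names under the temporary directory and their 48-byte contents are made up for illustration; only the xxd invocation itself comes from the steps above.

```shell
# Hypothetical sample data: two 48-byte files that differ only in an embedded IP address.
workdir=$(mktemp -d)
printf 'HEADERHEADERHEADERHEADERHEADER..10.0.0.1........' > "$workdir/File1.bin"
printf 'HEADERHEADERHEADERHEADERHEADER..10.0.0.9........' > "$workdir/File2.bin"

# Dump each file to text: offset column, hex bytes, then the ASCII column.
xxd "$workdir/File1.bin" > "$workdir/File1_dump"
xxd "$workdir/File2.bin" > "$workdir/File2_dump"

# By default xxd prints 16 bytes per line, so each dump here has three lines.
cat "$workdir/File1_dump"
```

Each line of the dump begins with the file offset of its first byte, which is what lets you map a changed line back to a position in the binary.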

Some useful notes about this method: if the matching content starts at a different offset in each file, the offset column will differ on every line and diff will flag the entire dump. In that case you'll need to use the following options when creating your xxd dumps:

-s bytes_to_skip  Skip to the point where the two files start to match (the xxd man page documents this as -s [+][-]seek).
-ps  Create the output without the byte offset column and without the ASCII representation column.
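The two options above can be sketched on a small made-up case where the same payload sits behind prefixes of different lengths. The file names and contents are hypothetical, and -c 8 (8 bytes per output line) is an extra I added only to keep the lines short.

```shell
workdir=$(mktemp -d)
# Hypothetical inputs: the payloads differ in one byte and start at offsets 4 and 8.
printf 'AAAApayload-with-ip-10.0.0.1-end' > "$workdir/a.bin"
printf 'BBBBBBBBpayload-with-ip-10.0.0.9-end' > "$workdir/b.bin"

# -s N skips the first N bytes; -ps emits plain hex with no offset or ASCII columns.
xxd -s 4 -ps -c 8 "$workdir/a.bin" > "$workdir/a_dump"
xxd -s 8 -ps -c 8 "$workdir/b.bin" > "$workdir/b_dump"

# Only the line holding the changed byte is reported.
# diff exits with status 1 when the inputs differ, hence the "|| true".
diff "$workdir/a_dump" "$workdir/b_dump" || true
```

Because -ps drops the offset column, diff can no longer tell you where in the file the change is. You have to work that out from the line number, the bytes per line, and the number of bytes you skipped.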

Next, use diff to view the areas of difference between these files.

# diff File1_dump File2_dump


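We can clearly see changes (an IP address) starting at file offset 0x4f70.
diff's default output already lists only the lines that differ. If you prefer a side-by-side view (diff -y), the switch --suppress-common-lines will reduce the content you have to review before identifying areas of interest.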
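As a rough sketch of the diff step, the snippet below rebuilds two small sample dumps (hypothetical files and contents), shows the side-by-side view, and pulls the offset of the first changed line out of the default diff output. It assumes GNU diff for -y, -W and --suppress-common-lines; the first_offset variable is my own naming.

```shell
workdir=$(mktemp -d)
# Hypothetical sample data: identical except for the last digit of an IP address.
printf 'HEADERHEADERHEADERHEADERHEADER..10.0.0.1........' > "$workdir/File1.bin"
printf 'HEADERHEADERHEADERHEADERHEADER..10.0.0.9........' > "$workdir/File2.bin"
xxd "$workdir/File1.bin" > "$workdir/File1_dump"
xxd "$workdir/File2.bin" > "$workdir/File2_dump"

# Side by side, changed lines only; -W widens the output so xxd lines are not cut off.
diff -y -W 160 --suppress-common-lines "$workdir/File1_dump" "$workdir/File2_dump" || true

# In default diff output, lines from the first file start with "< ",
# so characters 3-10 of the first such line are the xxd offset column.
first_offset=$(diff "$workdir/File1_dump" "$workdir/File2_dump" | grep '^<' | head -n 1 | cut -c3-10)
echo "first difference is in the dump line starting at 0x$first_offset"
```

The reported offset is that of the 16-byte line containing the first change, not of the changed byte itself, so the real difference lies somewhere within the following 16 bytes.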

In this manufactured example, we see the difference between the two executable files is an IP address.
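If you do this often, the whole workflow fits in a small shell function. This is only a sketch: poor_bindiff is a name I made up, and it assumes xxd, diff and mktemp are on your PATH.

```shell
# Hypothetical helper: hex-dump two files, diff the dumps, then clean up.
# Returns diff's status (0 = identical, 1 = different) or 2 on setup errors.
poor_bindiff() {
    tmp=$(mktemp -d) || return 2
    if ! xxd "$1" > "$tmp/a" || ! xxd "$2" > "$tmp/b"; then
        rm -rf "$tmp"
        return 2
    fi
    diff "$tmp/a" "$tmp/b"
    status=$?
    rm -rf "$tmp"
    return "$status"
}

# Usage on made-up sample files.
workdir=$(mktemp -d)
printf 'server=10.0.0.1;' > "$workdir/one.bin"
printf 'server=10.0.0.9;' > "$workdir/two.bin"

if poor_bindiff "$workdir/one.bin" "$workdir/two.bin"; then
    echo "files are identical"
else
    echo "files differ (see the lines above)"
fi
```

Keeping the dumps in a temporary directory avoids leaving large text files behind, which matters when the inputs are the size of an ISO image.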

References:
http://linux.die.net/man/1/diff
http://linux.die.net/man/1/xxd