Commit 2046e27d authored by Sophie Brun

Imported Upstream version 2.0.1

parent 20c7e28c
Description
===========
The binwalk python module can be used by any python script to programmatically perform binwalk scans and obtain the results of those scans.
The classes, methods and objects in the binwalk modules are documented via pydoc, including examples, so those interested in using the binwalk module are encouraged to look there. However, several common usage examples are provided here to help jump-start development efforts.
Binwalk Scripting
=================
Each of binwalk's features (signature scans, entropy analysis, etc.) is implemented as a separate module. These modules can be invoked via the binwalk.core.module.Modules class, which makes scripting trivial through its execute method.
In fact, the binwalk command line utility can be duplicated nearly entirely with just two lines of code:
```python
import binwalk
binwalk.Modules().execute()
```
Both the Modules class constructor and the execute method accept Python args and kwargs that correspond to the options of the binwalk command line utility, which gives you considerable freedom in how you specify binwalk options. If none are specified, sys.argv is used by default.
For example, to execute a signature scan, you must specify at least the --signature command line option and a list of files to scan. This can be done in a number of ways:
```python
binwalk.Modules().execute('firmware1.bin', 'firmware2.bin', signature=True)
binwalk.Modules().execute('firmware1.bin', 'firmware2.bin', **{'signature' : True})
binwalk.Modules().execute(*['firmware1.bin', 'firmware2.bin'], signature=True)
binwalk.Modules().execute(*['--signature', 'firmware1.bin', 'firmware2.bin',])
binwalk.Modules().execute('--signature', 'firmware1.bin', 'firmware2.bin')
```
All args and kwargs keys/values correspond to binwalk's command line options. Either args or kwargs, or a combination of the two may be used, with the following caveats:
* All command line switches passed via args must be preceded by hyphens (not required for kwargs)
* All file names must be passed via args, not kwargs
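The two caveats above can be pictured with a small helper that flattens args and kwargs into a command-line style list. This is only an illustrative sketch: `to_argv` is a hypothetical name, and the code is not binwalk's actual argument handling. It shows why kwargs keys need no hyphens while switches passed via args do.

```python
def to_argv(*args, **kwargs):
    """Flatten args and kwargs into a command-line style list (illustration only)."""
    argv = []
    for key in sorted(kwargs):
        value = kwargs[key]
        if value is True:
            # Boolean switch: the hyphens are added for kwargs keys.
            argv.append("--" + key)
        elif value is not False and value is not None:
            # Option that carries a value.
            argv.extend(["--" + key, str(value)])
    # Entries in args (hyphenated switches and file names) pass through unchanged.
    argv.extend(args)
    return argv


from_kwargs = to_argv('firmware1.bin', 'firmware2.bin', signature=True)
from_args = to_argv('--signature', 'firmware1.bin', 'firmware2.bin')
print(from_kwargs)
print(from_args)
```

Both calls describe the same scan, which mirrors the equivalent `execute` invocations listed earlier.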
Accessing Scan Results
======================
binwalk.Modules.execute returns a list of objects. Each object corresponds to a module that was run. For example, if you specified --signature and --entropy, then both the Signature and Entropy modules would be executed and a list of two objects would be returned.
The two attributes of interest for each object are 'results' and 'errors'. These are lists of binwalk.core.module.Result and binwalk.core.module.Error objects, respectively. Each Result or Error object may contain custom attributes set by its module, but is guaranteed to have at least the following attributes (though modules are not required to populate all of them):
| Attribute | Description |
|-------------|-------------|
| offset | The file offset of the result/error (usually unused for errors) |
| description | The result/error description, as displayed to the user |
| module | Name of the module that generated the result/error |
| file | The file object of the scanned file |
| valid | Set to True if the result is valid, False if invalid (usually unused for errors) |
| display | Set to True to display the result to the user, False to hide it (usually unused for errors) |
| extract | Set to True to flag this result for extraction (not used for errors) |
| plot | Set to False to exclude this result from entropy plots (not used for errors) |
binwalk.core.module.Error has the additional guaranteed attribute:
| Attribute | Description |
|-------------|-------------|
| exception | Contains the Python exception object if the encountered error was an exception |
Thus, scan results and errors can be programmatically accessed rather easily:
```python
for module in binwalk.Modules().execute('firmware1.bin', 'firmware2.bin', signature=True):
    print ("%s Results:" % module.name)
    for result in module.results:
        print ("\t%s 0x%.8X %s" % (result.file.name, result.offset, result.description))
```
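The same loop can be wrapped in a helper that keeps only valid results. The sketch below is a minimal example, not part of binwalk: `summarize` is a hypothetical name, and the stand-in objects only mimic the documented `name`, `results`, `offset`, `description` and `valid` attributes so the example runs without binwalk installed.

```python
from types import SimpleNamespace


def summarize(modules):
    """Map each module name to (offset, description) tuples of its valid results."""
    summary = {}
    for module in modules:
        summary[module.name] = [
            (result.offset, result.description)
            for result in module.results
            if result.valid
        ]
    return summary


# Stand-in objects; a real script would pass the list returned by
# binwalk.Modules().execute(...) instead.
fake_modules = [
    SimpleNamespace(
        name="Signature",
        results=[
            SimpleNamespace(offset=0, description="uImage header", valid=True),
            SimpleNamespace(offset=64, description="false positive", valid=False),
        ],
    )
]

for name, hits in summarize(fake_modules).items():
    for offset, description in hits:
        print("%s 0x%.8X %s" % (name, offset, description))
```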
Module Exceptions
=================
The only expected exception that should be raised by binwalk.Modules is that of binwalk.ModuleException. This exception is thrown only if a required module encountered a fatal error (e.g., one of the specified target files could not be opened):
```python
try:
    binwalk.Modules().execute()
except binwalk.ModuleException as e:
    print ("Critical failure:", e)
```
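Problems that do not raise an exception can still be inspected through each module's `errors` list described earlier. The following sketch gathers them in one place. It assumes only the documented `name`, `errors`, `description` and `exception` attributes; `collect_errors` is a hypothetical helper, and the stand-in objects replace a real `execute` result so the example runs on its own.

```python
from types import SimpleNamespace


def collect_errors(modules):
    """Return (module name, description, exception) tuples for every reported error."""
    return [
        (module.name, error.description, error.exception)
        for module in modules
        for error in module.errors
    ]


# Stand-in objects mimicking the documented attributes (not real binwalk objects).
fake_modules = [
    SimpleNamespace(name="Signature", errors=[]),
    SimpleNamespace(
        name="Entropy",
        errors=[SimpleNamespace(description="plotting unavailable", exception=None)],
    ),
]

for name, description, exception in collect_errors(fake_modules):
    print("%s error: %s (%r)" % (name, description, exception))
```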
Before You Start
================
Binwalk supports Python 2.7 - 3.x. Although binwalk is slightly faster in Python 3, the Python OpenGL bindings are still experimental for Python 3, so Python 2.7 is recommended.
The following installation procedures assume that you are installing binwalk to be run using Python 2.7. If you want to use binwalk in Python 3, some package names and installation procedures may differ slightly.
Installation
============
Installation follows the typical configure/make process (standard development tools such as gcc, make, and Python must be installed in order to build):
    $ ./configure
    $ make
    $ sudo make install
Binwalk's core features will work out of the box without any additional dependencies. However, to take advantage of binwalk's graphing and extraction capabilities, multiple supporting utilities/packages need to be installed.
To ease "dependency hell", a shell script named `deps.sh` is included which attempts to install all required dependencies for Debian and RedHat based systems:
    $ ./deps.sh
If you are running a different system, or prefer to install these dependencies manually, see the Dependencies section below.
Dependencies
============
The following dependencies are only required for optional binwalk features, such as file extraction and graphing capabilities. Unless otherwise specified, these dependencies are available from most Linux package managers.
Binwalk uses [pyqtgraph](http://www.pyqtgraph.org) to generate graphs and visualizations, which requires the following:
    libqt4-opengl
    python-opengl
    python-qt4
    python-qt4-gl
    python-numpy
    python-scipy
Binwalk relies on multiple external utilities in order to automatically extract/decompress files and data:
    mtd-utils
    zlib1g-dev
    liblzma-dev
    ncompress
    gzip
    bzip2
    tar
    arj
    p7zip
    cabextract
    p7zip-full
    openjdk-6-jdk
    firmware-mod-kit [https://code.google.com/p/firmware-mod-kit]
Bundled Software
================
For convenience, the following libraries are bundled with binwalk and will not conflict with system-wide libraries:
    libmagic
    libfuzzy
    pyqtgraph
Installation of any individual bundled library can be disabled at build time:
    $ ./configure --disable-libmagic --disable-libfuzzy --disable-pyqtgraph
Alternatively, installation of all bundled libraries can be disabled at build time:
    $ ./configure --disable-bundles
If a bundled library is disabled, the equivalent library must be installed to a standard system library location (e.g., `/usr/lib`, `/usr/local/lib`, etc) in order for binwalk to function properly.
**Note:** If the bundled libmagic library is not used, be aware that:
1. Some versions of libmagic have known bugs that are triggered by binwalk under some circumstances.
2. Minor version releases of libmagic may not be backwards compatible with each other and installation of the wrong version of libmagic may cause binwalk to fail to function properly.
3. Conversely, updating libmagic to a version that works with binwalk may cause other utilities that rely on libmagic to fail.
Currently, the following libmagic versions are known to work properly with binwalk (other versions may or may not work):
    5.18
    5.19
Specifying a Python Interpreter
===============================
The default python interpreter used during install is the system-wide `python` interpreter. A different interpreter (e.g., `python2`, `python3`) can be specified at build time:
    $ ./configure --with-python=python3
Uninstallation
==============
The following command will remove binwalk from your system. Note that this will *not* remove manually installed packages, or utilities installed via deps.sh:
    $ sudo make uninstall
The MIT License (MIT)
Copyright (c) 2010-2014 Craig Heffner
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
export CC=@CC@
export CFLAGS=@CFLAGS@
export SONAME=@SONAME@
export SOEXT=@SOEXT@
export prefix=@prefix@
export exec_prefix=@exec_prefix@
export LIBDIR=@libdir@
export INSTALL_OPTIONS=@INSTALL_OPTIONS@
export PLATFORM=@PLATFORM@
export BUILD_MAGIC=@BUILD_MAGIC@
export BUILD_FUZZY=@BUILD_FUZZY@
export BUILD_PYQTGRAPH=@BUILD_PYQTGRAPH@
export PYLIBDIR="./binwalk/libs"
BUILD_C_LIBS=@BUILD_C_LIBS@
BUILD_BUNDLES=@BUILD_BUNDLES@
PYTHON=@PYTHON@
SRC_C_DIR="./src/C"
SRC_BUNDLES_DIR="./src/bundles"
ifeq ($(strip $(prefix)),)
PREFIX=""
else
PREFIX="--prefix=$(prefix)"
endif
.PHONY: all install build deps clean uninstall

all: build

install: build
	$(PYTHON) ./setup.py install $(PREFIX)

build:
	if [ "$(BUILD_C_LIBS)" -eq "1" ]; then make -C $(SRC_C_DIR); fi
	if [ "$(BUILD_BUNDLES)" -eq "1" ]; then make -C $(SRC_BUNDLES_DIR); fi
	$(PYTHON) ./setup.py build

deps:
	./deps.sh

clean:
	if [ "$(BUILD_C_LIBS)" -eq "1" ]; then make -C $(SRC_C_DIR) clean; fi
	if [ "$(BUILD_BUNDLES)" -eq "1" ]; then make -C $(SRC_BUNDLES_DIR) clean; fi
	$(PYTHON) ./setup.py clean

distclean: clean
	if [ "$(BUILD_C_LIBS)" -eq "1" ]; then make -C $(SRC_C_DIR) distclean; fi
	if [ "$(BUILD_BUNDLES)" -eq "1" ]; then make -C $(SRC_BUNDLES_DIR) distclean; fi
	rm -rf Makefile config.* *.cache

uninstall:
	$(PYTHON) ./setup.py uninstall --pydir=`find $(prefix)/lib -name binwalk | head -1` --pybin=`find $(prefix)/bin -name binwalk | head -1`
Description
===========
Binwalk is a fast, easy to use tool for analyzing, reverse engineering, and extracting firmware images.
Installation
============
Binwalk follows the standard Unix configure/make installation procedure:
    $ ./configure
    $ make
    $ sudo make install
For convenience, optional dependencies for automatic extraction and graphical visualizations can be installed by running the included `deps.sh` script:
    $ ./deps.sh
If your system is not supported by `deps.sh`, or if you wish to manually install dependencies, see `INSTALL.md`.
For advanced installation options, see `INSTALL.md`.
Usage
=====
Basic usage is simple:
    $ binwalk firmware.bin
For additional examples and descriptions of advanced options, see the [wiki](https://github.com/devttys0/binwalk/wiki).
# Common functions.
import os
import re

def file_size(filename):
    '''
    Obtains the size of a given file.

    @filename - Path to the file.

    Returns the size of the file.
    '''
    # Using open/lseek works on both regular files and block devices
    fd = os.open(filename, os.O_RDONLY)
    try:
        return os.lseek(fd, 0, os.SEEK_END)
    except Exception as e:
        # "except ... as e" is valid in both Python 2.7 and Python 3
        raise Exception("file_size failed to obtain the size of '%s': %s" % (filename, str(e)))
    finally:
        os.close(fd)

def str2int(string):
    '''
    Attempts to convert string to a base 10 integer; if that fails, then base 16.

    @string - String to convert to an integer.

    Returns the integer value on success.
    Throws an exception if the string cannot be converted into either a base 10 or base 16 integer value.
    '''
    try:
        return int(string)
    except ValueError:
        return int(string, 16)

def strip_quoted_strings(string):
    '''
    Strips out data in between double quotes.

    @string - String to strip.

    Returns a sanitized string.
    '''
    # This regex removes all quoted data from string.
    # Note that this removes everything in between the first and last double quote.
    # This is intentional, as printed (and quoted) strings from a target file may contain
    # double quotes, and this function should ignore those. However, it also means that any
    # data between two quoted strings (ex: '"quote 1" you won't see me "quote 2"') will also be stripped.
    return re.sub(r'\"(.*)\"', "", string)

def get_quoted_strings(string):
    '''
    Returns a string comprised of all data in between double quotes.

    @string - String to get quoted data from.

    Returns a string of quoted data on success.
    Returns a blank string if no quoted data is present.
    '''
    try:
        # This regex grabs all quoted data from string.
        # Note that this gets everything in between the first and last double quote.
        # This is intentional, as printed (and quoted) strings from a target file may contain
        # double quotes, and this function should ignore those. However, it also means that any
        # data between two quoted strings (ex: '"quote 1" non-quoted data "quote 2"') will also be included.
        return re.findall(r'\"(.*)\"', string)[0]
    except Exception:
        return ''

def unique_file_name(base_name, extension=''):
    '''
    Creates a unique file name based on the specified base name.

    @base_name - The base name to use for the unique file name.
    @extension - The file extension to use for the unique file name.

    Returns a unique file string.
    '''
    idcount = 0

    if extension and not extension.startswith('.'):
        extension = '.%s' % extension

    fname = base_name + extension

    while os.path.exists(fname):
        fname = "%s-%d%s" % (base_name, idcount, extension)
        idcount += 1

    return fname
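The greedy matching described in the comments of the quoted-string helpers, and the base 10 then base 16 fallback of `str2int`, are easiest to see on concrete input. The sketch below is a stand-alone illustration that reuses the same regular expression and the same fallback logic; the sample string and the `QUOTED` and `sample` names are made up for the example.

```python
import re

# The same greedy pattern used by strip_quoted_strings and get_quoted_strings.
QUOTED = r'\"(.*)\"'

sample = '"quote 1" hidden "quote 2" tail'

# Everything from the first to the last double quote is removed.
stripped = re.sub(QUOTED, "", sample)

# The same span, without the outer quotes, is what get_quoted_strings would return.
quoted = re.findall(QUOTED, sample)[0]

print(repr(stripped))
print(repr(quoted))


def str2int(string):
    # Base 10 first, then base 16, as in the helper above.
    try:
        return int(string)
    except ValueError:
        return int(string, 16)


# "10" parses as decimal; "ff" and "0x1F" only parse as hexadecimal.
print(str2int("10"), str2int("ff"), str2int("0x1F"))
```

Note that a string made up only of decimal digits is always read as base 10, so hexadecimal values such as "10" need a letter or a `0x` prefix to be interpreted as base 16.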
#!/usr/bin/env python
# Routines to perform Monte Carlo Pi approximation and Chi Squared tests.
# Used for fingerprinting unknown areas of high entropy (e.g., is this block of high entropy data compressed or encrypted?).
# Inspired by people who actually know what they're doing: http://www.fourmilab.ch/random/
import math
class MonteCarloPi(object):
    '''
    Performs a Monte Carlo Pi approximation.
    Currently unused.
    '''

    def __init__(self):
        '''
        Class constructor.

        Returns None.
        '''
        self.reset()

    def reset(self):
        '''
        Reset state to the beginning.
        '''
        self.pi = 0
        self.error = 0
        self.m = 0
        self.n = 0

    def update(self, data):
        '''
        Update the pi approximation with new data.

        @data - A string of bytes to update (length must be >= 6).

        Returns None.
        '''
        c = 0
        dlen = len(data)

        # Use <= so that a final complete 6-byte group (including data of exactly 6 bytes) is consumed
        while (c+6) <= dlen:
            # Treat 3 bytes as an x coordinate, the next 3 bytes as a y coordinate.
            # Our box is 1x1, so divide by 2^24 to put the x y values inside the box.
            x = ((ord(data[c]) << 16) + (ord(data[c+1]) << 8) + ord(data[c+2])) / 16777216.0
            c += 3
            y = ((ord(data[c]) << 16) + (ord(data[c+1]) << 8) + ord(data[c+2])) / 16777216.0
            c += 3

            # Does the x,y point lie inside the quarter circle of radius 1 centered at the origin of our box?
            if ((x**2) + (y**2)) <= 1:
                self.m += 1
            self.n += 1

    def montecarlo(self):
        '''
        Approximates the value of Pi based on the provided data.

        Returns a tuple of (approximated value of pi, percent deviation).
        '''
        if self.n:
            self.pi = (float(self.m) / float(self.n) * 4.0)

        if self.pi:
            self.error = math.fabs(1.0 - (math.pi / self.pi)) * 100.0
            return (self.pi, self.error)
        else:
            return (0.0, 0.0)
class ChiSquare(object):
    '''
    Performs a Chi Squared test against the provided data.
    '''

    IDEAL = 256.0

    def __init__(self):
        '''
        Class constructor.

        Returns None.
        '''
        self.bytes = {}
        self.freedom = self.IDEAL - 1

        # Initialize the self.bytes dictionary with keys for all possible byte values (0 - 255)
        for i in range(0, int(self.IDEAL)):
            self.bytes[chr(i)] = 0

        self.reset()

    def reset(self):
        self.xc2 = 0.0
        self.byte_count = 0

        for key in self.bytes.keys():
            self.bytes[key] = 0

    def update(self, data):
        '''
        Updates the current byte counts with new data.

        @data - String of bytes to update.

        Returns None.
        '''
        # Count the number of occurrences of each byte value
        for i in data:
            self.bytes[i] += 1

        self.byte_count += len(data)

    def chisq(self):
        '''
        Calculate the Chi Square critical value.

        Returns the critical value.
        '''
        expected = self.byte_count / self.IDEAL

        if expected:
            for byte in self.bytes.values():
                self.xc2 += ((byte - expected) ** 2 ) / expected

        return self.xc2
class CompressionEntropyAnalyzer(object):
    '''
    Class wrapper around ChiSquare.
    Performs analysis and attempts to interpret the results.
    '''

    BLOCK_SIZE = 32
    CHI_CUTOFF = 512

    DESCRIPTION = "Statistical Compression Analysis"

    def __init__(self, fname, start, length, binwalk=None, fp=None):
        '''
        Class constructor.

        @fname   - The file to scan.
        @start   - The start offset to begin analysis at.
        @length  - The number of bytes to analyze.
        @binwalk - Optional Binwalk class instance; if set, its display object is used to print results.
        @fp      - Optional open file object, used when fname is not specified.

        Returns None.
        '''
        if fname:
            self.fp = open(fname, 'rb')
        else:
            self.fp = fp

        self.start = start
        self.length = length
        self.binwalk = binwalk

    def analyze(self):
        '''
        Perform analysis and interpretation.

        Returns a list containing a single result dictionary, whose 'description' holds the results and attempted interpretation.
        '''
        i = 0
        num_error = 0
        analyzer_results = []

        if self.binwalk:
            self.binwalk.display.header(file_name=self.fp.name, description=self.DESCRIPTION)

        chi = ChiSquare()

        self.fp.seek(self.start)
        while i < self.length:
            rsize = self.length - i
            if rsize > self.BLOCK_SIZE:
                rsize = self.BLOCK_SIZE

            d = self.fp.read(rsize)
            if len(d) != rsize:
                break

            chi.reset()
            chi.update(d)

            if chi.chisq() >= self.CHI_CUTOFF:
                num_error += 1

            i += rsize

        if num_error > 0:
            verdict = 'Moderate entropy data, best guess: compressed'
        else:
            verdict = 'High entropy data, best guess: encrypted'

        result = [{'offset' : self.start, 'description' : '%s, size: %d, %d low entropy blocks' % (verdict, self.length, num_error)}]

        if self.binwalk:
            self.binwalk.display.results(self.start, result)
            self.binwalk.display.footer()

        return result
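To make the per-block test above concrete, the following stand-alone sketch computes the same chi-square statistic for two 32-byte blocks and compares it against the cutoff of 512 used by `CompressionEntropyAnalyzer`. It is an independent re-implementation for illustration, not the code above: `chi_square` and the two sample blocks are assumptions made for the example, and it uses `collections.Counter` instead of the `ChiSquare` class.

```python
from collections import Counter

BLOCK_SIZE = 32    # same block size as CompressionEntropyAnalyzer
CHI_CUTOFF = 512   # same cutoff as CompressionEntropyAnalyzer


def chi_square(block):
    """Chi-square statistic of a byte block against a uniform byte distribution."""
    expected = len(block) / 256.0
    if not expected:
        return 0.0
    counts = Counter(bytearray(block))
    return sum((counts.get(value, 0) - expected) ** 2 / expected for value in range(256))


distinct_block = bytes(bytearray(range(BLOCK_SIZE)))  # 32 different byte values
constant_block = b"\x00" * BLOCK_SIZE                 # one byte value repeated 32 times

for label, block in (("distinct", distinct_block), ("constant", constant_block)):
    value = chi_square(block)
    print("%s block: chi-square = %.1f, low entropy block: %s" % (label, value, value >= CHI_CUTOFF))
```

With 32 bytes per block the expected count per byte value is 0.125, so a block of 32 distinct values scores 224 and stays below the cutoff, while a block made of a single repeated value scores 8160 and is counted as a low entropy block.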
import os
class Config:
    '''
    Binwalk configuration class, used for accessing user and system file paths.

    After instantiating the class, file paths can be accessed via the self.paths dictionary.
    System file paths are listed under the 'system' key, user file paths under the 'user' key.

    For example, to get the path to both the user and system binwalk magic files:

        from binwalk import Config

        conf = Config()
        user_binwalk_file = conf.paths['user'][conf.BINWALK_MAGIC_FILE]
        system_binwalk_file = conf.paths['system'][conf.BINWALK_MAGIC_FILE]

    There is also an instance of this class available via the Binwalk.config object:

        import binwalk

        bw = binwalk.Binwalk()
        user_binwalk_file = bw.config.paths['user'][bw.config.BINWALK_MAGIC_FILE]
        system_binwalk_file = bw.config.paths['system'][bw.config.BINWALK_MAGIC_FILE]

    Valid file names under both the 'user' and 'system' keys are as follows:

        o BINWALK_MAGIC_FILE - Path to the default binwalk magic file.
        o BINCAST_MAGIC_FILE - Path to the bincast magic file (used when -C is specified with the command line binwalk script).
        o BINARCH_MAGIC_FILE - Path to the binarch magic file (used when -A is specified with the command line binwalk script).
        o EXTRACT_FILE       - Path to the extract configuration file (used when -e is specified with the command line binwalk script).
        o PLUGINS            - Path to the plugins directory.
    '''
    # Release version
    VERSION = "1.2.2-1"

    # Sub directories
    BINWALK_USER_DIR = ".binwalk"
    BINWALK_MAGIC_DIR = "magic"
    BINWALK_CONFIG_DIR = "config"
    BINWALK_PLUGINS_DIR = "plugins"

    # File names
    PLUGINS = "plugins"
    EXTRACT_FILE = "extract.conf"
    BINWALK_MAGIC_FILE = "binwalk"
    BINCAST_MAGIC_FILE = "bincast"
    BINARCH_MAGIC_FILE = "binarch"
    ZLIB_MAGIC_FILE = "zlib"

    def __init__(self):
        '''
        Class constructor. Enumerates file paths and populates self.paths.
        '''
        # Path to the user binwalk directory
        self.user_dir = self._get_user_dir()
        # Path to the system wide binwalk directory
        self.system_dir = self._get_system_dir()

        # Dictionary of all absolute user/system file paths
        self.paths = {
            'user' : {},
            'system' : {},
        }

        # Build the paths to all user-specific files
        self.paths['user'][self.BINWALK_MAGIC_FILE] = self._user_path(self.BINWALK_MAGIC_DIR, self.BINWALK_MAGIC_FILE)
        self.paths['user'][self.BINCAST_MAGIC_FILE] = self._user_path(self.BINWALK_MAGIC_DIR, self.BINCAST_MAGIC_FILE)
        self.paths['user'][self.BINARCH_MAGIC_FILE] = self._user_path(self.BINWALK_MAGIC_DIR, self.BINARCH_MAGIC_FILE)
        self.paths['user'][self.EXTRACT_FILE] = self._user_path(self.BINWALK_CONFIG_DIR, self.EXTRACT_FILE)
        self.paths['user'][self.PLUGINS] = self._user_path(self.BINWALK_PLUGINS_DIR)

        # Build the paths to all system-wide files
        self.paths['system'][self.BINWALK_MAGIC_FILE] = self._system_path(self.BINWALK_MAGIC_DIR, self.BINWALK_MAGIC_FILE)
        self.paths['system'][self.BINCAST_MAGIC_FILE] = self._system_path(self.BINWALK_MAGIC_DIR, self.BINCAST_MAGIC_FILE)
        self.paths['system'][self.BINARCH_MAGIC_FILE] = self._system_path(self.BINWALK_MAGIC_DIR, self.BINARCH_MAGIC_FILE)
        self.paths['system'][self.ZLIB_MAGIC_FILE] = self._system_path(self.BINWALK_MAGIC_DIR, self.ZLIB_MAGIC_FILE)
        self.paths['system'][self.EXTRACT_FILE] = self._system_path(self.BINWALK_CONFIG_DIR, self.EXTRACT_FILE)
        self.paths['system'][self.PLUGINS] = self._system_path(self.BINWALK_PLUGINS_DIR)

    def _get_system_dir(self):
        '''
        Find the directory where the binwalk module is installed on the system.
        '''
        try:
            root = __file__
            if os.path.islink(root):
                root = os.path.realpath(root)
            return os.path.dirname(os.path.abspath(root))
        except Exception:
            return ''

    def _get_user_dir(self):
        '''
        Get the user's home directory.
        '''
        try:
            # This should work in both Windows and Unix environments
            return os.getenv('USERPROFILE') or os.getenv('HOME')
        except Exception:
            return ''

    def _file_path(self, dirname, filename):
        '''
        Builds an absolute path and creates the directory and file if they don't already exist.

        @dirname  - Directory path.
        @filename - File name.

        Returns a full path of 'dirname/filename'.
        '''
        if not os.path.exists(dirname):
            try:
                os.makedirs(dirname)
            except Exception:
                pass

        fpath = os.path.join(dirname, filename)
if not os