Reverse engineering message formats from static network traces is a difficult and time-consuming task, but it is critical for a variety of security purposes, from recovering the lost or incomplete specifications of legacy systems to understanding the communications of hostile systems. The ambiguous nature of binary data makes reverse engineering difficult: the same sequence of four bytes could be interpreted as an integer, a float, a string, a timestamp, or even several smaller fields. Our key insight in tackling this problem is that while data can in principle be encoded in infinitely many ways, in practice engineers reuse the same standard encodings again and again, both for atomic types, such as integers, IEEE 754 floats, and timestamps, and for compound types, such as variable-length sequences. These common idioms leave behind fingerprints that we can use to identify them. In the BinaryInferno project, we are exploring an ensemble-based approach in which a collection of simple detectors, each focused on a particular kind of data, works together to infer an overall description of the message format.
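To make the ambiguity concrete, the following minimal Python sketch decodes one four-byte sequence under several common encodings. It is an illustration only and not part of BinaryInferno; the function name `interpretations`, the sample bytes, and the particular encodings chosen are assumptions made for this example.

```python
import struct
from datetime import datetime, timezone


def interpretations(raw: bytes) -> dict:
    """Return several plausible readings of the same four bytes.

    Hypothetical helper for illustration; not taken from BinaryInferno.
    """
    if len(raw) != 4:
        raise ValueError("expected exactly four bytes")
    as_uint_be = struct.unpack(">I", raw)[0]
    return {
        # One 32-bit unsigned integer, big-endian and little-endian.
        "uint32_be": as_uint_be,
        "uint32_le": struct.unpack("<I", raw)[0],
        # One IEEE 754 single-precision float, big-endian.
        "float32_be": struct.unpack(">f", raw)[0],
        # Four ASCII characters (non-ASCII bytes are replaced).
        "ascii": raw.decode("ascii", errors="replace"),
        # Seconds since the Unix epoch, rendered in UTC.
        "unix_time_utc": datetime.fromtimestamp(
            as_uint_be, tz=timezone.utc
        ).isoformat(),
        # Two smaller fields: a pair of 16-bit unsigned integers.
        "two_uint16_be": struct.unpack(">HH", raw),
    }


if __name__ == "__main__":
    for name, value in interpretations(b"ABCD").items():
        print(f"{name:>14}: {value!r}")
```

Every reading above is syntactically valid for a single message, which is why a detector generally needs evidence from many messages in a trace before it can favor one interpretation over the others.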