Binspector can analyze a binary file and report to the user if the file is well-formed or not, that is, if the file passes analysis. While true is a straightforward answer, false comes with a host of complications. Specifically, what was it about the file that caused the analysis to fail? Was there some invariant violated, a read that went off into the weeds… what? Validation works best when it fails as fast as it can, because the closer one halts to the actual point of failure, the more information can be gathered about it.
Sentries are one way to facilitate failing as fast as possible during file validation. So how do they work?
File formats such as PNG and TIFF contain data wrapped in length-prefixed blocks. Sometimes the format is completely block-based; sometimes it’s just substructures that are. For our purposes, let’s modify our original sample format grammar to be length-prefixed:
To keep our binary file up to speed with the grammar, we prefix file.bin with two bytes that indicate the length of the block:
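To make the layout concrete, here is a minimal Python sketch that builds such a block. The details are assumptions rather than anything taken from the sample format: the two-byte prefix is treated as big-endian and as counting only the bytes that follow it, a pascal_t is taken to be one length byte followed by that many bytes, and the helper names pascal and length_prefixed are hypothetical.

```python
import struct


def pascal(text: bytes) -> bytes:
    """One length byte followed by that many payload bytes (assumed layout)."""
    if len(text) > 255:
        raise ValueError("a single length byte cannot describe more than 255 bytes")
    return bytes([len(text)]) + text


def length_prefixed(payload: bytes) -> bytes:
    """Prefix the payload with its size as a 2-byte big-endian integer (assumed)."""
    return struct.pack(">H", len(payload)) + payload


body = pascal(b"Hello") + pascal(b"world")
block = length_prefixed(body)

print(block.hex(" "))
# 00 0c 05 48 65 6c 6c 6f 05 77 6f 72 6c 64
```

The first two bytes (00 0c) say that exactly twelve bytes belong to the block, which is the fact a validator can later hold the parse to.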
If, in the course of analyzing one of the pascal_ts, a length is larger or smaller than it should be, we won’t find out about it until the parse is completed. Given a malformed binary file:
The analysis result doesn’t give us much to go on:
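To see why the failure surfaces so late, consider a rough Python sketch of a parser that reads the block length but never enforces it. This is not how Binspector works internally; the function parse_naive, the one-string block, and the trailing END bytes are all invented for illustration, and the two-byte length is assumed to be big-endian.

```python
import struct


def parse_naive(data: bytes):
    """Read a 2-byte block length, then one pascal string.

    The block length is returned but never used to bound the reads.
    """
    (block_len,) = struct.unpack_from(">H", data, 0)
    pos = 2
    n = data[pos]                  # pascal length byte
    pos += 1
    text = data[pos:pos + n]       # payload, however long the byte claims
    pos += n
    return block_len, text, data[pos:]   # whatever follows the block


good = b"\x00\x06\x05HelloEND"     # 6-byte block, then unrelated bytes
bad = b"\x00\x06\x08HelloEND"      # pascal length corrupted from 5 to 8

for name, data in (("good", good), ("bad", bad)):
    block_len, text, rest = parse_naive(data)
    consumed = 1 + len(text)       # length byte plus payload
    print(name, text, rest, "block respected:", consumed == block_len)
```

The corrupted file parses without complaint: the string simply swallows the bytes that belonged to whatever came after the block. The mismatch only shows up if someone compares the totals after the fact, by which point the offending read is long gone.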
The key piece of information we need to leverage is main.length. If we know the scope to which that length applies, we could inform Binspector of a boundary that must be met exactly by the time that scope ends. The boundary is specified with the sentry declaration:
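The idea behind a sentry can be sketched in Python, independent of Binspector's grammar syntax. Everything below is a hypothetical illustration under stated assumptions: the Reader class, the SentryBreach exception, the context-manager form, and a main structure holding exactly two pascal strings behind a big-endian two-byte length are all made up for the example. What it shows is the two rules described above: no read may cross the boundary, and the scope must end exactly on it.

```python
import struct
from contextlib import contextmanager


class SentryBreach(Exception):
    """Raised when a read crosses a sentry or a scope ends off its boundary."""


class Reader:
    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0
        self.sentries = []          # stack of (name, absolute end offset)

    def read(self, n: int, what: str) -> bytes:
        end = self.pos + n
        # Fail at the offending read, not at the end of the parse.
        for name, limit in self.sentries:
            if end > limit:
                raise SentryBreach(
                    f"{name} sentry barrier breach while reading {what} "
                    f"at offset {self.pos}")
        if end > len(self.data):
            raise EOFError(f"read of {what} runs past end of file")
        chunk = self.data[self.pos:end]
        self.pos = end
        return chunk

    @contextmanager
    def sentry(self, name: str, length: int):
        limit = self.pos + length
        self.sentries.append((name, limit))
        try:
            yield
        finally:
            self.sentries.pop()
        # Reached only if the scope finished without error: it must land
        # exactly on the boundary.
        if self.pos != limit:
            raise SentryBreach(
                f"{name} sentry not met: stopped at {self.pos}, expected {limit}")


def parse_main(data: bytes):
    r = Reader(data)
    (length,) = struct.unpack(">H", r.read(2, "main.length"))
    strings = []
    with r.sentry("main", length):
        for _ in range(2):          # assumed: main holds two pascal strings
            n = r.read(1, "pascal_t.length")[0]
            strings.append(r.read(n, "pascal_t.string"))
    return strings


good = b"\x00\x0c\x05Hello\x05world"
overrun = b"\x00\x0c\x06Hello\x05world"    # first pascal length too large
underrun = b"\x00\x0d\x05Hello\x05world"   # main.length claims one byte too many

for name, data in (("good", good), ("overrun", overrun), ("underrun", underrun)):
    try:
        print(name, parse_main(data))
    except SentryBreach as err:
        print(name, "failed:", err)
```

Checking the boundary on every read is what buys the early failure; checking it again when the scope closes catches the opposite case, where the parse stops short of what the length promised.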
And the Binspector output is more informative:
I’ll be the first to admit the sentry error reporting needs to be cleaned up, but let me break down what Binspector is trying to say. The two key bits of information are main sentry barrier breach and the point the grammar failed, namely format.bfft:3. Binspector was in the process of executing the line found at format.bfft:3, namely, the length of a pascal_t, when the sentry established by main.length was overrun.
If the length value is malformed and specifies a larger block than the actual data:
We get notified of that in turn:
Notice that in both cases, Binspector still drops you into a command-line interface. This gives the user the ability to navigate the analysis up to the point of failure in an attempt to discern where things went wrong.