BARfly Help - Node editing - Constructs: how BAR makes a schema of binary data

  Constructs: how BAR makes a schema of binary data

Let's return to what BAR stands for:  Binary Artifact Reference.  This means the binary data (the "artifact") must be treated in a way that makes more sense than just the bits and bytes it is composed of on the disk.  The data must be referenced in a way that gives the user a reasonable interpretation of it.

BAR accomplishes this task by using three types of file format components, known as constructs.  These constructs are blocks, data structures, and lists.  Their definitions are as follows.

  • Blocks are variable-length sections of a file containing data of simple or indeterminate types
  • Data structures are fixed-length sections of a file containing data of known, structured types
  • Lists are definitions of dynamically-sized portions of a file, and control the selection and repetition of blocks and data structures

These constructs are represented as nodes in a deserialized binary file.  A node is a discrete location in a hierarchical linked list; it may have a parent, a previous sibling, a next sibling, and child nodes.  In a deserialized file in BAR, a node represents an absolute location in the file.  In BARfly, nodes are nearly synonymous with lines in the node browser and subnode browser tree controls.
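To make the idea of a node more concrete, here is a minimal sketch of the links described above.  It is not BAR's actual implementation; the class name Node, its fields, and the add_child helper are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for one node of a deserialized file."""
    name: str
    offset: int                               # absolute byte offset in the file
    parent: Optional["Node"] = None
    prev_sibling: Optional["Node"] = None
    next_sibling: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add_child(self, child: "Node") -> "Node":
        """Append a child and wire up its parent and sibling links."""
        child.parent = self
        if self.children:
            last = self.children[-1]
            last.next_sibling = child
            child.prev_sibling = last
        self.children.append(child)
        return child


# Build a tiny two-level hierarchy: a file with a header and a body.
root = Node("file", 0)
header = root.add_child(Node("header", 0))
body = root.add_child(Node("body", 16))
```

Here header and body are siblings under root, much as two adjacent lines would sit under a common parent line in a tree control.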

Blocks, lists, and data structures break down into the following sub-types of constructs.

Data Structures

Simple Type: This is a fixed-length, fundamental unit of data between one and eight bytes in size.  This type of construct never has any children.  A simple type can be a character (1-byte, 2-byte, 4-byte, or 8-byte), an integer (1-byte, 2-byte, 4-byte, or 8-byte), or an IEEE 754 floating-point number (4-byte or 8-byte).
Simple Structure: This is a fixed-length chunk of data with named data members.  The children of this type are always of Simple Type, and every structure of a particular type has the same number of children.
Bit Field Structure: This is a fixed-length chunk of data with named data members whose bit counts do not necessarily match the size of any Simple Type, and whose values are aligned on bit boundaries rather than byte boundaries.  The children of this type are always of Simple Type, and every structure of a particular type has the same number of children.
Complex Structure: This is a fixed-length chunk of data with structured data members.  The children of this type can be of Simple Type, Simple Structure, Bit Field Structure, Complex Structure, Array of Simple Type, or Array of Complex Type.  Every structure of a particular type has the same number of children.
Array of Simple Type: This is a fixed-length chunk of data that acts as an array of Simple Type constructs.  There is always the same number of children in the array (since the number of elements is fixed).
Array of Complex Type: This is a fixed-length chunk of data that acts as an array of Simple Structure, Bit Field Structure, Complex Structure, Array of Simple Type, or Array of Complex Type constructs.  There is always the same number of children in the array (since the number of elements is fixed).
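As a rough illustration of the data structure sub-types, the sketch below reads a Simple Structure and a Bit Field Structure using Python's standard struct module.  The layouts (HEADER_FORMAT, the field names, and the 3/1/4 bit split) are invented for this example and do not come from any real BAR format definition.

```python
import struct

# Hypothetical Simple Structure: three named Simple Type members,
# little-endian with no padding: uint16 width, uint16 height, float32 gamma.
HEADER_FORMAT = "<HHf"
HEADER_FIELDS = ("width", "height", "gamma")


def read_simple_structure(data: bytes, offset: int = 0) -> dict:
    """Unpack the fixed-length structure into named members."""
    values = struct.unpack_from(HEADER_FORMAT, data, offset)
    return dict(zip(HEADER_FIELDS, values))


def read_bit_fields(byte: int) -> dict:
    """Hypothetical Bit Field Structure packed into a single byte:
    a 3-bit 'version' (high bits), a 1-bit 'flag', a 4-bit 'kind' (low bits)."""
    return {
        "version": (byte >> 5) & 0b111,
        "flag": (byte >> 4) & 0b1,
        "kind": byte & 0b1111,
    }


raw = struct.pack(HEADER_FORMAT, 640, 480, 1.0)
simple = read_simple_structure(raw)      # always 8 bytes, always 3 children
bits = read_bit_fields(0b10110011)       # members aligned on bit boundaries
```

Note that the structure always occupies the same number of bytes and always yields the same members, which is what makes it "fixed-length" in the sense used above.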

Blocks

Unorganized Block: This is a variable-length chunk of data of a Simple Type, such as characters or unsigned long integers.  The children of this block are always of Simple Type.  Unlike an Array of Simple Type, an Unorganized Block can have any number of children.
     Text Block: This is a subtype of Unorganized Block; it treats its Simple-Type children as text characters.  The children of this block are of integral Simple Type (1-byte, 2-byte, 4-byte, or 8-byte character).
Organized Block: This is a variable-length chunk of data that usually contains a header structure and always contains a body.  The children of this block include an optional header structure (Simple Structure, Bit Field Structure, or Complex Structure), and any number of additional blocks or data structures, whose pattern is defined by a Node List, Decision List, or a single data structure or block.  This type of block can have any number of children.
     Bit Scan Block: This is a subtype of Organized Block that only stores bit fields.  Unlike Bit Field Structures, a Bit Scan Block does not expect the bit fields to be consistently aligned within the scan.  If a header structure exists, it must be a Bit Field Structure.  All other children must be either Bit Field Structures or Bit Scan Blocks.
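To show how a block differs from a data structure, here is a small sketch of reading something like an Organized Block: a fixed-length header followed by a variable-length body.  The 4-byte header layout (a tag and a body length) and the function name read_organized_block are assumptions for illustration only; a real format may determine the body's extent in some other way.

```python
import struct


def read_organized_block(data: bytes, offset: int = 0):
    """Read a hypothetical block: a 4-byte header structure
    (uint16 tag, uint16 body length) followed by a variable-length body.
    Returns the block and the offset just past it."""
    tag, length = struct.unpack_from("<HH", data, offset)
    body_start = offset + 4
    body = data[body_start:body_start + length]
    if len(body) != length:
        raise ValueError("body is shorter than the header claims")
    return {"tag": tag, "body": body}, body_start + length


# Two blocks back to back, with bodies of different lengths.
blob = (struct.pack("<HH", 1, 5) + b"hello"
        + struct.pack("<HH", 2, 2) + b"ok")
first, next_offset = read_organized_block(blob)
second, end_offset = read_organized_block(blob, next_offset)
```

The header is the same size in both blocks, but the bodies are not, so the position of the second block can only be known after the first one has been read.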

Lists

Node List: This construct defines a list of one or more constructs.  A list can be either repeating or non-repeating.  List items can include the following construct types:  Simple Structure, Bit Field Structure, Complex Structure, Unorganized Block, Organized Block, and Decision List.  If the Node List is part of a Bit Scan Block, the allowable list items are restricted to Bit Field Structure, Decision List, and Bit Scan Block.
Decision List: This construct defines a point in a file where a particular block or data structure is chosen from a list of alternatives.  List choices can include the following construct types:  Simple Structure, Bit Field Structure, Complex Structure, Unorganized Block, and Organized Block.  If the Decision List is part of a Bit Scan Block, the allowable list items are restricted to Bit Field Structure and Bit Scan Block.
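The following sketch shows one way a repeating Node List and a Decision List could work together.  How BAR actually selects an alternative is not described here; choosing by a leading 1-byte tag, along with the names ALTERNATIVES and read_repeating_list, is purely an assumption for this example.

```python
import struct

# Hypothetical alternatives of a Decision List, keyed by a 1-byte tag.
# Each entry is (structure name, struct format of the payload).
ALTERNATIVES = {
    0x01: ("point", "<hh"),   # two signed 16-bit integers
    0x02: ("scale", "<f"),    # one 32-bit float
}


def read_repeating_list(data: bytes):
    """Repeat a one-item Node List until the data runs out.  The item is a
    Decision List that selects a structure based on the tag byte."""
    offset, nodes = 0, []
    while offset < len(data):
        tag = data[offset]
        if tag not in ALTERNATIVES:
            raise ValueError(f"no alternative for tag {tag:#04x}")
        name, fmt = ALTERNATIVES[tag]
        values = struct.unpack_from(fmt, data, offset + 1)
        nodes.append((name, offset, values))   # only the structure is kept
        offset += 1 + struct.calcsize(fmt)
    return nodes


stream = (b"\x01" + struct.pack("<hh", 3, -4)
          + b"\x02" + struct.pack("<f", 2.0))
nodes = read_repeating_list(stream)
```

The result contains only the chosen structures and their locations; the lists themselves guide the reading but leave nothing behind in the output.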


The BAR protocol uses node lists and decision lists to aid in the deserialization process.  Once deserialization is complete, only data structures and blocks are available for examination.  For this reason, you will only see blocks and data structures in a data file in BARfly; you will never see a list in a data file.  For the same reason, you can only insert data structures and blocks into a file you wish to edit; you can never insert a list.

It is no simple feat to construct an I.F., which defines the constructs as well as the relationships between the constructs.  Fortunately, you won't have to deal with these relationships most of the time.  BARfly enforces all the rules necessary to keep a data file consistent with the relationships described above.

BARfly does NOT guarantee that a file you are editing is internally consistent; that is, that if you saved it now, you could re-open it and get exactly the same results without errors.  It is impossible to guarantee the internal consistency of most binary files without performing an empirical test, which involves actually saving and reloading the file.  However, BARfly does allow you to explicitly "revalidate" any file you are editing by running just that sort of test:  saving with the I.F.'s serialization process, then re-loading with the I.F.'s deserialization process.  You can perform this test at any time during editing by selecting Run.Revalidate from the menu.
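The idea behind revalidation can be sketched as a simple round trip.  This is not BARfly's code; the toy record format and the functions serialize, deserialize, and revalidate are hypothetical stand-ins for an I.F.'s serialization and deserialization processes.

```python
import struct


def serialize(records):
    """Hypothetical serializer: a uint16 record count, then each record
    as a uint8 length followed by that many bytes."""
    out = [struct.pack("<H", len(records))]
    for rec in records:
        out.append(struct.pack("<B", len(rec)) + rec)
    return b"".join(out)


def deserialize(data):
    """Inverse of serialize; raises if the data does not fit the format."""
    (count,) = struct.unpack_from("<H", data, 0)
    offset, records = 2, []
    for _ in range(count):
        (length,) = struct.unpack_from("<B", data, offset)
        offset += 1
        rec = data[offset:offset + length]
        if len(rec) != length:
            raise ValueError("record is truncated")
        records.append(rec)
        offset += length
    if offset != len(data):
        raise ValueError("unexpected trailing bytes")
    return records


def revalidate(records) -> bool:
    """Save and reload in memory, then check that nothing changed."""
    try:
        return deserialize(serialize(records)) == records
    except (struct.error, ValueError):
        return False


ok = revalidate([b"abc", b""])
too_long = revalidate([b"x" * 300])   # length does not fit in one byte
```

In this toy format, an edit that makes a record longer than 255 bytes cannot be written back faithfully, so the round trip fails even though the in-memory data looked fine while editing.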

Now that you know the basic "building blocks" and terminology BAR uses to represent binary data, you should feel more comfortable exploring files with the node browser and data display viewer.


  See also: [Using the node data viewer] [Inserting and deleting nodes] [Editing data in the data display view]


BARfly Help Copyright © 2009 Christopher Allen