Tuesday, July 26, 2011

Common Trace Format Bitfields

The Common Trace Format (CTF) will be the default trace format of the upcoming LTTng 2.0 release. This new format can write data at the bit level to optimize the space used by each field. Such a run of packed bits, which need not start or end on a byte boundary, is called a bitfield.

The endianness of a bitfield follows the native endianness of the architecture on which the trace is recorded. Both big endian (BE) and little endian (LE) are supported. But how are the bits written into the trace? According to the documentation in include/babeltrace/bitfield.h:

The inside of a bitfield is from high bits to low bits.

On big endian, bytes are places from most significant to less significant.
Also, consecutive bitfields are placed from higher to lower bits.

On little endian, bytes are placed from the less significant to the most
significant. Also, consecutive bitfields are placed from lower bits to higher
bits.

To make sure I understood correctly, I created an example based on the current implementation in babeltrace. The structure has three fields. Each field is followed by its length in bits.

struct x {
    unsigned int a : 4;
    unsigned int b : 32;
    unsigned int c : 4;
};

The total size of the structure is 4 + 32 + 4 = 40 bits, or 5 bytes. For this example, the following values are assigned to the fields so that they are easy to spot:

x.a = 0x0000000E;
x.b = 0xA1B2C3D4;
x.c = 0x0000000F;
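A side note on the 40-bit figure: it describes the packed layout in the trace, not necessarily what a C compiler does with the same declaration, since the in-memory layout of C bit-fields is implementation-defined. The sketch below rounds a bit count up to whole bytes and prints it next to the compiler's own sizeof. The names packed_size_bytes and print_sizes are hypothetical helpers made up for this illustration, and the code assumes unsigned int is at least 32 bits wide.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Same declaration as in the post; the compiler may pad it as it sees fit. */
struct x {
    unsigned int a : 4;
    unsigned int b : 32;
    unsigned int c : 4;
};

/* Hypothetical helper: bytes needed to hold `bits` packed bits, rounded up. */
size_t packed_size_bytes(size_t bits)
{
    assert(bits <= SIZE_MAX - (CHAR_BIT - 1)); /* guard against overflow */
    return (bits + CHAR_BIT - 1) / CHAR_BIT;
}

/* Print the packed size next to the size this compiler gives struct x. */
void print_sizes(void)
{
    printf("packed: %zu bytes, compiler: %zu bytes\n",
           packed_size_bytes(4 + 32 + 4), sizeof(struct x));
}
```

On platforms with 8-bit bytes, packed_size_bytes(40) gives 5. The sizeof value depends on the compiler and ABI and is often larger, which is why the trace writer packs the bits itself instead of copying the structure.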

Writing those values into a byte array in big endian and in little endian gives the results in the tables below. The binary and hexadecimal representations of consecutive bytes are shown.


Big endian
0b11101010 0xEA
0b00011011 0x1B
0b00101100 0x2C
0b00111101 0x3D
0b01001111 0x4F
0b00000000 0x00
[...]

   

Little endian
0b01001110 0x4E
0b00111101 0x3D
0b00101100 0x2C
0b00011011 0x1B
0b11111010 0xFA
0b00000000 0x00
[...]

Big endian is simple because the bitfields are written sequentially. The first byte, 0xEA, starts with the four bits of the a field, followed by the first four bits of the b field. As a consequence, all subsequent bytes of the b field are shifted by four bits. The last byte starts with the last four bits of the b field, and the c field fills the rest of it. All bytes that follow are untouched.
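The big endian layout can be reproduced with a naive bit-by-bit writer. This is only a sketch to check the reasoning, not the babeltrace implementation: write_bits_be and pack_example_be are hypothetical names, and the code numbers bits so that bit 0 is the most significant bit of the first byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the babeltrace API): write the low `len` bits
 * of `v` into `buf`, starting at bit offset `start`.
 * Bit offset 0 is the most significant bit of buf[0].
 */
void write_bits_be(uint8_t *buf, size_t start, unsigned len, uint64_t v)
{
    assert(len >= 1 && len <= 64);
    for (unsigned i = 0; i < len; i++) {
        size_t pos = start + i;                         /* absolute bit offset */
        unsigned bit = (unsigned)((v >> (len - 1 - i)) & 1u); /* MSB first */
        uint8_t mask = (uint8_t)(0x80u >> (pos % 8));
        if (bit)
            buf[pos / 8] |= mask;
        else
            buf[pos / 8] &= (uint8_t)~mask;
    }
}

/* Pack the example structure: a (4 bits), b (32 bits), c (4 bits). */
void pack_example_be(uint8_t buf[5])
{
    write_bits_be(buf, 0, 4, 0xE);
    write_bits_be(buf, 4, 32, 0xA1B2C3D4);
    write_bits_be(buf, 36, 4, 0xF);
}
```

Calling pack_example_be on a 5-byte buffer should produce EA 1B 2C 3D 4F, the same bytes as in the big endian table. A real implementation would write whole bytes or words at a time; the per-bit loop only keeps the ordering easy to follow.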

In little endian, the a field occupies the lower half of the first byte. The b field is written next, starting in the upper half of the first byte, and its most significant bits end up in the lower half of the last byte. Notice that the bytes of b appear in reverse order compared with big endian and are also shifted by four bits. The third field, c, is placed in the upper half of the last byte.
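The little endian layout can be checked the same way with a naive bit-by-bit writer. Again, this is a sketch under my own assumptions and not the babeltrace code: write_bits_le and pack_example_le are hypothetical names, and here bit 0 is the least significant bit of the first byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the babeltrace API): write the low `len` bits
 * of `v` into `buf`, starting at bit offset `start`.
 * Bit offset 0 is the least significant bit of buf[0].
 */
void write_bits_le(uint8_t *buf, size_t start, unsigned len, uint64_t v)
{
    assert(len >= 1 && len <= 64);
    for (unsigned i = 0; i < len; i++) {
        size_t pos = start + i;                      /* absolute bit offset */
        unsigned bit = (unsigned)((v >> i) & 1u);    /* LSB first */
        uint8_t mask = (uint8_t)(1u << (pos % 8));
        if (bit)
            buf[pos / 8] |= mask;
        else
            buf[pos / 8] &= (uint8_t)~mask;
    }
}

/* Pack the example structure: a (4 bits), b (32 bits), c (4 bits). */
void pack_example_le(uint8_t buf[5])
{
    write_bits_le(buf, 0, 4, 0xE);
    write_bits_le(buf, 4, 32, 0xA1B2C3D4);
    write_bits_le(buf, 36, 4, 0xF);
}
```

Calling pack_example_le on a 5-byte buffer should produce 4E 3D 2C 1B FA, matching the little endian table. Another way to see it: the three fields form the 40-bit integer 0xFA1B2C3D4E, which is then stored least significant byte first.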

Happy hacking!