Investigating Fixed Length Messages

Fixed-length messages are the simplest form of message type in SBE. They occupy a fixed number of bytes and are typically used for messages that need the highest level of performance.

We will continue to use the simple fixed-length message type we defined when generating the Java code before. This message is defined in the following XML schema, schema-01.xml:

schema-01.xml

<sbe:messageSchema xmlns:sbe="http://fixprotocol.io/2016/sbe"
                   package="com.shaunlaurens.pa"
                   id="1000"
                   version="1"
                   semanticVersion="pa0.1"
                   description="Schema 1 for the PA samples, version 0.1">
    <types>
        <composite name="messageHeader"
                   description="Message identifiers and length of message root">
            <type name="blockLength" primitiveType="uint16"/>
            <type name="templateId" primitiveType="uint16"/>
            <type name="schemaId" primitiveType="uint16"/>
            <type name="version" primitiveType="uint16"/>
        </composite>
    </types>

    <sbe:message name="MessageType1" id="1"
                 description="A simple, fixed length message type">
        <field name="field1" id="1" type="int64"/>
        <field name="field2" id="2" type="int32"/>
        <field name="field3" id="3" type="int64"/>
    </sbe:message>

</sbe:messageSchema>

When run through the SBE tool, this schema file will result in the following Java code being generated:

Generated Java Code

└── src
    └── main
        └── java
            └── com
                └── shaunlaurens
                    └── pa
                        ├── MessageType1Decoder.java
                        ├── MessageType1Encoder.java
                        ├── MessageHeaderDecoder.java
                        ├── MessageHeaderEncoder.java
                        ├── MetaAttribute.java
                        └── package-info.java

Let's investigate the generated code.

Message Type 1 Encoder and Decoder

Static header information

Both the encoder and decoder classes for the message type include static constants holding the header values we defined in the schema.

public static final int BLOCK_LENGTH = 20;
public static final int TEMPLATE_ID = 1;
public static final int SCHEMA_ID = 1000;
public static final int SCHEMA_VERSION = 1;
public static final String SEMANTIC_VERSION = "pa0.1";

We can see:

BLOCK_LENGTH is the length in bytes of the message's fixed-size fields. It does not include any header information. In our case, the message type is 20 bytes long: 8 bytes for field1, 4 bytes for field2, and 8 bytes for field3.
TEMPLATE_ID is the unique identifier for the message type, as we defined in the schema.
SCHEMA_ID is the unique identifier for the schema, again set to the value provided in the schema.
SCHEMA_VERSION is the version of the schema.
SEMANTIC_VERSION is the human-readable version string we set with the semanticVersion attribute in the schema.

Field encoding and decoding

Message Type 1 is a simple, fixed-length message type. It includes three fields: field1, field2, and field3. Field 1 is an int64, field 2 is an int32, and field 3 is an int64. Visually, the message type content (excluding any header information) looks like this, with each block representing a byte:

Message Type 1 - Byte layout: field1 occupies bytes 0-7, field2 bytes 8-11, and field3 bytes 12-19.

Within the generated Java code, we can see that the MessageType1Encoder and MessageType1Decoder classes use fixed byte offsets for reading and writing the data. Each offset is determined by the field's position in the message type and the combined length of the fields that precede it.

MessageType1Decoder.java - reading snippets

...
public long field1()
{
    return buffer.getLong(offset + 0, BYTE_ORDER);
}

public int field2()
{
    return buffer.getInt(offset + 8, BYTE_ORDER);
}

public long field3()
{
    return buffer.getLong(offset + 12, BYTE_ORDER);
}
...

In much the same way, we can see the writing also uses fixed byte offsets:

MessageType1Encoder.java - writing snippets

public MessageType1Encoder field1(final long value)
{
    buffer.putLong(offset + 0, value, BYTE_ORDER);
    return this;
}

public MessageType1Encoder field2(final int value)
{
    buffer.putInt(offset + 8, value, BYTE_ORDER);
    return this;
}

public MessageType1Encoder field3(final long value)
{
    buffer.putLong(offset + 12, value, BYTE_ORDER);
    return this;
}
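The offsets 0, 8, and 12 in both snippets follow directly from the field sizes. As a rough sketch of that arithmetic (the class and helper names below are hypothetical and are not part of the SBE tool or its generated code), each field starts where the previous one ends, assuming the schema sets no explicit offsets:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; the SBE tool does this at code-generation time.
final class FixedLayoutSketch
{
    // Encoded sizes in bytes, kept in schema order.
    static Map<String, Integer> fieldSizes()
    {
        final Map<String, Integer> sizes = new LinkedHashMap<>();
        sizes.put("field1", Long.BYTES);    // int64 -> 8 bytes
        sizes.put("field2", Integer.BYTES); // int32 -> 4 bytes
        sizes.put("field3", Long.BYTES);    // int64 -> 8 bytes
        return sizes;
    }

    // Each field's offset is the sum of the sizes of the fields before it.
    static Map<String, Integer> offsets(final Map<String, Integer> sizes)
    {
        final Map<String, Integer> result = new LinkedHashMap<>();
        int next = 0;
        for (final Map.Entry<String, Integer> entry : sizes.entrySet())
        {
            result.put(entry.getKey(), next);
            next += entry.getValue();
        }
        return result;
    }

    // The block length is the sum of all field sizes.
    static int blockLength(final Map<String, Integer> sizes)
    {
        int total = 0;
        for (final int size : sizes.values())
        {
            total += size;
        }
        return total;
    }

    public static void main(final String[] args)
    {
        final Map<String, Integer> sizes = fieldSizes();
        // prints {field1=0, field2=8, field3=12} blockLength=20
        System.out.println(offsets(sizes) + " blockLength=" + blockLength(sizes));
    }
}
```

Because the result depends only on the schema, the generated code can use these numbers as literals rather than computing them at runtime.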

Some key points to note about this fixed-length message type:

None of the header information is included in the encoder and decoder data access. The header information is hard-coded into the encoder and decoder classes. We can make use of the MessageHeaderEncoder and MessageHeaderDecoder classes to encode and decode the header information separately.
The offsets are fixed and calculated based on the field's position in the message type. There is nothing within the message type itself that indicates the length of the fields, so if a specific decoder is placed over a buffer containing the same number of bytes, but a different message type, then the decoder will still go ahead and read it. This starts to get interesting as we move to evolving schemas.
The generated code is efficient and performs well, but it is also low-level and requires careful handling to ensure correctness. In this scenario, the offsets are fixed and known at compile time, so there is no need to calculate them at runtime. This can lead to very fast encoding and decoding of messages. We are also able to read and write using these decoders in any order. This is not always the case with SBE.
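To make the second point concrete, here is a minimal sketch that uses java.nio.ByteBuffer as a stand-in for the Agrona buffer and the generated decoder. The "other" message layout (int32, int64, int64) is invented for illustration, and little-endian byte order is assumed because the schema does not set byteOrder. Reading those 20 bytes with the MessageType1 offsets succeeds without any error, but the values are meaningless:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Illustration only: plain ByteBuffer reads standing in for MessageType1Decoder.
final class MisreadSketch
{
    // Hypothetical 20-byte message with a different layout: int32 @0, int64 @4, int64 @12.
    static ByteBuffer writeOtherMessage()
    {
        final ByteBuffer buffer = ByteBuffer.allocate(20).order(ByteOrder.LITTLE_ENDIAN);
        buffer.putInt(0, 7);
        buffer.putLong(4, 8L);
        buffer.putLong(12, 9L);
        return buffer;
    }

    // Reads with the MessageType1 layout: int64 @0, int32 @8, int64 @12.
    static long[] readAsMessageType1(final ByteBuffer buffer, final int offset)
    {
        return new long[]
        {
            buffer.getLong(offset),
            buffer.getInt(offset + 8),
            buffer.getLong(offset + 12)
        };
    }

    public static void main(final String[] args)
    {
        // No exception is raised; the bytes are simply reinterpreted.
        // prints [34359738375, 0, 9]
        System.out.println(Arrays.toString(readAsMessageType1(writeOtherMessage(), 0)));
    }
}
```

Only the last value happens to survive, because both layouts place an int64 at offset 12. Nothing in the payload itself can flag the mismatch, which is why the header matters.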

Message Header Decoder and Encoder

The SBE tool generates a MessageHeaderEncoder and MessageHeaderDecoder for each schema that includes the messageHeader composite type. These classes can be used to read and write the header data on the buffer, independently of the payload data.

In much the same way as the MessageType1 encoder and decoder, the header encoder and decoder classes use fixed byte offsets for reading and writing the data. The offsets follow the order and sizes of the fields in the messageHeader composite we supplied earlier.

schema-01.xml - Header aspects

<composite name="messageHeader"
           description="Message identifiers and length of message root">
    <type name="blockLength" primitiveType="uint16"/>
    <type name="templateId" primitiveType="uint16"/>
    <type name="schemaId" primitiveType="uint16"/>
    <type name="version" primitiveType="uint16"/>
</composite>

This is reflected in the generated Java code as follows:

MessageHeaderDecoder.java - reading snippets

public int blockLength()
{
    return (buffer.getShort(offset + 0, BYTE_ORDER) & 0xFFFF);
}

public int templateId()
{
    return (buffer.getShort(offset + 2, BYTE_ORDER) & 0xFFFF);
}

public int schemaId()
{
    return (buffer.getShort(offset + 4, BYTE_ORDER) & 0xFFFF);
}

public int version()
{
    return (buffer.getShort(offset + 6, BYTE_ORDER) & 0xFFFF);
}

The & 0xFFFF mask ensures that the value is treated as an unsigned 16-bit integer (as requested in the schema with uint16). Java's short is signed, so the decoder widens the value to an int and masks off the sign extension.
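The effect of the mask is easiest to see with a value above 32767. The following is a small sketch using java.nio.ByteBuffer rather than the Agrona buffer used by the generated code; the class and method names are made up for illustration:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustration only: mirrors the cast on write and the mask on read.
final class Uint16Sketch
{
    static ByteBuffer newBuffer()
    {
        return ByteBuffer.allocate(2).order(ByteOrder.LITTLE_ENDIAN);
    }

    // The cast keeps the low 16 bits of the int value.
    static void putUint16(final ByteBuffer buffer, final int index, final int value)
    {
        buffer.putShort(index, (short)value);
    }

    // The mask discards the sign extension added when the short is widened to int.
    static int getUint16(final ByteBuffer buffer, final int index)
    {
        return buffer.getShort(index) & 0xFFFF;
    }

    public static void main(final String[] args)
    {
        final ByteBuffer buffer = newBuffer();
        putUint16(buffer, 0, 40000);
        System.out.println(buffer.getShort(0));   // prints -25536 (signed interpretation)
        System.out.println(getUint16(buffer, 0)); // prints 40000 (unsigned interpretation)
    }
}
```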

MessageHeaderEncoder.java - writing snippets

public MessageHeaderEncoder blockLength(final int value)
{
    buffer.putShort(offset + 0, (short)value, BYTE_ORDER);
    return this;
}

public MessageHeaderEncoder templateId(final int value)
{
    buffer.putShort(offset + 2, (short)value, BYTE_ORDER);
    return this;
}

public MessageHeaderEncoder schemaId(final int value)
{
    buffer.putShort(offset + 4, (short)value, BYTE_ORDER);
    return this;
}

public MessageHeaderEncoder version(final int value)
{
    buffer.putShort(offset + 6, (short)value, BYTE_ORDER);
    return this;
}

We typically do not interact directly with the MessageHeaderEncoder - we can simply use wrapAndApplyHeader within the MessageType1Encoder to apply the header to the buffer.

MessageType1Encoder.java applying the header

public MessageType1Encoder wrapAndApplyHeader(
    final MutableDirectBuffer buffer, final int offset,
    final MessageHeaderEncoder headerEncoder)
{
    headerEncoder
        .wrap(buffer, offset)
        .blockLength(BLOCK_LENGTH)
        .templateId(TEMPLATE_ID)
        .schemaId(SCHEMA_ID)
        .version(SCHEMA_VERSION);

    return wrap(buffer, offset + MessageHeaderEncoder.ENCODED_LENGTH);
}
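The resulting buffer layout is an 8-byte header (four uint16 fields) followed immediately by the 20-byte block, 28 bytes in total. A rough equivalent written against java.nio.ByteBuffer, with hypothetical names and little-endian byte order assumed, might look like this:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustration only: hand-written stand-in for wrapAndApplyHeader plus the field setters.
final class HeaderThenBodySketch
{
    static final int HEADER_LENGTH = 8;   // four uint16 fields; plays the role of ENCODED_LENGTH
    static final int BLOCK_LENGTH = 20;
    static final int TEMPLATE_ID = 1;
    static final int SCHEMA_ID = 1000;
    static final int SCHEMA_VERSION = 1;

    // Writes the header at offset and returns the offset at which the payload starts.
    static int applyHeader(final ByteBuffer buffer, final int offset)
    {
        buffer.putShort(offset, (short)BLOCK_LENGTH);
        buffer.putShort(offset + 2, (short)TEMPLATE_ID);
        buffer.putShort(offset + 4, (short)SCHEMA_ID);
        buffer.putShort(offset + 6, (short)SCHEMA_VERSION);
        return offset + HEADER_LENGTH;
    }

    // Writes header and payload; returns the total number of bytes used.
    static int encode(
        final ByteBuffer buffer, final int offset,
        final long field1, final int field2, final long field3)
    {
        final int body = applyHeader(buffer, offset);
        buffer.putLong(body, field1);
        buffer.putInt(body + 8, field2);
        buffer.putLong(body + 12, field3);
        return HEADER_LENGTH + BLOCK_LENGTH;
    }

    public static void main(final String[] args)
    {
        final ByteBuffer buffer = ByteBuffer.allocate(256).order(ByteOrder.LITTLE_ENDIAN);
        System.out.println(encode(buffer, 0, 1234L, 4321, 6789L)); // prints 28
    }
}
```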

Safely consuming messages from buffers

Unless you're using SBE in a very controlled environment (for example, when you're moving data across an Agrona ring buffer within an application), you should always make use of the headers. This allows the application to identify the template id, and to validate the schema id and schema version, before attempting to decode the payload.

private static final UnsafeBuffer BUFFER =
        new UnsafeBuffer(ByteBuffer.allocate(256));
private static final MessageHeaderEncoder MESSAGE_HEADER_ENCODER =
        new MessageHeaderEncoder();
private static final MessageHeaderDecoder MESSAGE_HEADER_DECODER =
        new MessageHeaderDecoder();
private static final MessageType1Encoder MESSAGE_TYPE1_ENCODER =
        new MessageType1Encoder();
private static final MessageType1Decoder MESSAGE_TYPE1_DECODER =
        new MessageType1Decoder();

@Test
public void encodeDecode()
{
    final int bufferOffset = 0;

    MESSAGE_TYPE1_ENCODER.wrapAndApplyHeader(BUFFER, bufferOffset,
                    MESSAGE_HEADER_ENCODER) // (1)
        .field1(1234L)
        .field2(4321)
        .field3(6789L); // (2)

    MESSAGE_HEADER_DECODER.wrap(BUFFER, bufferOffset); // (3)

    // Check the values actually read from the header in the buffer.
    assertEquals(1, MESSAGE_HEADER_DECODER.templateId());
    assertEquals(20, MESSAGE_HEADER_DECODER.blockLength());
    assertEquals(1000, MESSAGE_HEADER_DECODER.schemaId());
    assertEquals(1, MESSAGE_HEADER_DECODER.version());

    MESSAGE_TYPE1_DECODER.wrapAndApplyHeader(BUFFER, bufferOffset,
            MESSAGE_HEADER_DECODER); // (4)
    final long field1 = MESSAGE_TYPE1_DECODER.field1();
    final int field2 = MESSAGE_TYPE1_DECODER.field2();
    final long field3 = MESSAGE_TYPE1_DECODER.field3();

    assertEquals(1234L, field1);
    assertEquals(4321, field2);
    assertEquals(6789L, field3);
}

(1) We wrap and apply the header to have the header data automatically written from the hard-coded values.
(2) When writing with SBE, a best practice is to write the fields in the exact order defined in the schema. Writing fields out of order in a message type that is not fixed-length can lead to incorrect encoding and decoding.
(3) The header decoder wraps the buffer and reads the header data. This is a critical step to ensure the correct schema is used for decoding the payload. The MessageType1Decoder will raise an exception if the template id does not match the expected template id, but it does not check the schema id or version; that is up to the application. This step is not necessary if you are certain that the buffer contains only one type of message.
(4) Now that we are certain of the template id, we can wrap and apply the header to the decoder and read the fields.
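Since checking the schema id and version is left to the application, a guard along the following lines can sit in front of the decoder. This is only a sketch: it reads the header with java.nio.ByteBuffer instead of MessageHeaderDecoder, every name in it is hypothetical, and the version policy (reject anything newer than the version we were built against) is an assumption that a real application may well relax.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustration only: application-level header validation before decoding a payload.
final class SafeConsumeSketch
{
    static final int HEADER_LENGTH = 8;
    static final int EXPECTED_SCHEMA_ID = 1000;
    static final int SUPPORTED_VERSION = 1;
    static final int MESSAGE_TYPE1_TEMPLATE_ID = 1;

    static int uint16(final ByteBuffer buffer, final int index)
    {
        return buffer.getShort(index) & 0xFFFF;
    }

    // Test helper that writes a header with arbitrary values.
    static void writeHeader(
        final ByteBuffer buffer, final int offset,
        final int blockLength, final int templateId, final int schemaId, final int version)
    {
        buffer.putShort(offset, (short)blockLength);
        buffer.putShort(offset + 2, (short)templateId);
        buffer.putShort(offset + 4, (short)schemaId);
        buffer.putShort(offset + 6, (short)version);
    }

    // Returns the payload offset if the header is acceptable, otherwise throws.
    static int checkHeader(final ByteBuffer buffer, final int offset)
    {
        final int templateId = uint16(buffer, offset + 2);
        final int schemaId = uint16(buffer, offset + 4);
        final int version = uint16(buffer, offset + 6);

        if (schemaId != EXPECTED_SCHEMA_ID)
        {
            throw new IllegalStateException("Unexpected schema id: " + schemaId);
        }
        if (version > SUPPORTED_VERSION) // assumed policy, see lead-in
        {
            throw new IllegalStateException("Unsupported schema version: " + version);
        }
        if (templateId != MESSAGE_TYPE1_TEMPLATE_ID)
        {
            throw new IllegalStateException("Unexpected template id: " + templateId);
        }
        return offset + HEADER_LENGTH;
    }

    public static void main(final String[] args)
    {
        final ByteBuffer buffer = ByteBuffer.allocate(64).order(ByteOrder.LITTLE_ENDIAN);
        writeHeader(buffer, 0, 20, 1, 1000, 1);
        System.out.println(checkHeader(buffer, 0)); // prints 8
    }
}
```

With more than one message type in play, the template id check would typically become a switch that selects the matching decoder.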