When implementing a Kafka broker for Codecrafters, one of the challenges I encountered was ensuring that the message_size field in requests and responses was calculated correctly. The message_size field is critical because it specifies the size of the message that follows, in bytes, allowing clients to parse messages accurately. However, I noticed that my implementation required an “off-by-one” adjustment to pass Codecrafters’ tests, which suggests a deeper issue.
[tester::#PV1] Hexdump of received "ApiVersions" response:
[tester::#PV1] Idx | Hex | ASCII
[tester::#PV1] -----+-------------------------------------------------+-----------------
[tester::#PV1] 0000 | 00 00 00 1b 0d 7c 16 e6 00 00 03 00 00 00 00 00 | .....|..........
[tester::#PV1] 0010 | 03 00 00 12 00 00 00 04 00 00 00 00 00 00 00 | ...............
[tester::#PV1]
[tester::#PV1] [Decoder] - .ResponseHeader
[tester::#PV1] [Decoder] - .correlation_id (226236134)
[tester::#PV1] [Decoder] - .ResponseBody
[tester::#PV1] [Decoder] - .error_code (0)
[tester::#PV1] [Decoder] - .num_api_keys (2)
[tester::#PV1] [Decoder] - .ApiKeys[0]
[tester::#PV1] [Decoder] - .api_key (0)
[tester::#PV1] [Decoder] - .min_version (0)
[tester::#PV1] [Decoder] - .max_version (3)
[tester::#PV1] [Decoder] - .TAG_BUFFER
[tester::#PV1] [Decoder] - .ApiKeys[1]
[tester::#PV1] [Decoder] - .api_key (18)
[tester::#PV1] [Decoder] - .min_version (0)
[tester::#PV1] [Decoder] - .max_version (4)
[tester::#PV1] [Decoder] - .TAG_BUFFER
[tester::#PV1] [Decoder] - .throttle_time_ms (0)
[tester::#PV1] [Decoder] - .TAG_BUFFER
[tester::#PV1] Received:
[tester::#PV1] Hex (bytes 21-26) | ASCII
[tester::#PV1] ------------------------------------------------+------------------
[tester::#PV1] 00 00 00 00 00 00 | ......
[tester::#PV1] ^ ^
[tester::#PV1] Error: unexpected 1 bytes remaining in decoder after decoding ApiVersionsResponse
[tester::#PV1] Context:
[tester::#PV1] - ApiVersions v3
[tester::#PV1] - Response Body
[tester::#PV1]
[tester::#PV1] Test failed
[tester::#PV1] Terminating program
[tester::#PV1] Program terminated successfully
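Working through the hexdump above: the frame declares 0x1b = 27 bytes, but the fields the tester actually decodes (correlation_id, error_code, a compact array of two api_keys entries, throttle_time_ms, and the tagged-field buffers) only add up to 26, which accounts for the “unexpected 1 bytes remaining” error. A quick arithmetic check, sketched from the decoder output:

```csharp
using System;

class SizeCheck
{
    static void Main()
    {
        int declared = 0x1b;                      // message_size field from the hexdump (27)

        // Byte sizes of the fields the tester decoded, in order:
        int correlationId  = 4;                   // int32
        int errorCode      = 2;                   // int16
        int numApiKeys     = 1;                   // compact array length (1-byte varint here)
        int apiKeysEntries = 2 * (2 + 2 + 2 + 1); // api_key + min_version + max_version + TAG_BUFFER
        int throttleTimeMs = 4;                   // int32
        int tagBuffer      = 1;                   // trailing TAG_BUFFER

        int expected = correlationId + errorCode + numApiKeys
                     + apiKeysEntries + throttleTimeMs + tagBuffer;

        Console.WriteLine(declared - expected);   // bytes left over after decoding
    }
}
```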
The Problem: Off-by-One in message_size

In Kafka, the message_size field represents the size of the message excluding the size of the message_size field itself. For example:
- If the message_size is N, the total size of the message (including the message_size field) should be N + 4 bytes.
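This framing rule is easy to demonstrate in isolation. The sketch below (illustrative only, not the challenge code) prefixes a payload with a 4-byte big-endian length that counts just the payload:

```csharp
using System;
using System.Buffers.Binary;

class Framing
{
    // Prefix the payload with a 4-byte big-endian length.
    // The length counts only the payload, never the length field itself.
    static byte[] Frame(byte[] payload)
    {
        var frame = new byte[4 + payload.Length];
        BinaryPrimitives.WriteInt32BigEndian(frame, payload.Length);
        payload.CopyTo(frame, 4);
        return frame;
    }

    static void Main()
    {
        var payload = new byte[27];      // N = 27 bytes of message content
        var frame = Frame(payload);
        Console.WriteLine(frame.Length); // total on the wire: N + 4
    }
}
```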
In my implementation, the NetworkResponse.Write method calculates the message_size as follows:
public int Write(Span<byte> writter, short version = default)
{
int count = 4; // Reserve 4 bytes for the message_size field
count += Header.Write(writter[count..]); // Write the header
count += Body.Write(writter[count..], version); // Write the body
MessageSize = count - 4; // Calculate the message size (excluding the 4-byte size field)
writter[0..4].WriteBigEndian(MessageSize - 1); // Write the message size to the span
return count;
}
Notice the line:

writter[0..4].WriteBigEndian(MessageSize - 1);

Here, I subtract 1 from MessageSize to make the tests pass. This adjustment is a clear indication of an off-by-one bug: the tester’s “unexpected 1 bytes remaining” error shows that the frame carries one more byte than the decoded response actually needs, and the - 1 merely papers over it.
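If the stray byte is tracked down in whichever writer emits it, the size calculation itself needs no adjustment. Below is a self-contained sketch of what the corrected method should look like, with hypothetical stand-ins for Header.Write and Body.Write, and BinaryPrimitives in place of my WriteBigEndian extension:

```csharp
using System;
using System.Buffers.Binary;

class NetworkResponseSketch
{
    // Hypothetical stand-ins for Header.Write and Body.Write:
    // each writes some bytes and returns how many it wrote.
    static int WriteHeader(Span<byte> dest) { dest[0] = 0; dest[1] = 1; dest[2] = 2; dest[3] = 3; return 4; }
    static int WriteBody(Span<byte> dest)   { dest[0] = 9; return 1; }

    static int Write(Span<byte> buffer)
    {
        int count = 4;                                              // reserve 4 bytes for message_size
        count += WriteHeader(buffer[count..]);                      // write the header
        count += WriteBody(buffer[count..]);                        // write the body
        int messageSize = count - 4;                                // size excludes the prefix itself
        BinaryPrimitives.WriteInt32BigEndian(buffer, messageSize);  // no "- 1" adjustment
        return count;
    }

    static void Main()
    {
        Span<byte> buffer = stackalloc byte[64];
        int total = Write(buffer);
        int declared = BinaryPrimitives.ReadInt32BigEndian(buffer);
        Console.WriteLine($"{total} {declared}");  // total written = declared size + 4
    }
}
```

With the off-by-one fixed at its source, the invariant total == declared + 4 holds without any compensating subtraction.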