New challenge: Build your own Kafka

We’ve got a new challenge in beta for you to try: Build your own Kafka.

This is the second-most-voted challenge so far.

Kafka is a distributed event streaming platform often used for high-performance data pipelines. In this challenge, you’ll build a Kafka broker from scratch. Along the way, you’ll learn about TCP servers, the Kafka wire protocol and more.

We’re starting out with a set of base stages + 2 extensions:

And we’ll extend this in the future with other extensions:

Currently available in Rust, Go, Python and JavaScript, with more languages on the way.

The challenge will be free during beta (about a month or two). The instructions for later stages are still a work in progress, but we’ve added notes on how the tester works.

Start building now and let us know which languages you’d like to see next!


A warning to those brave enough to attempt this: the Kafka docs are absolutely horrible :grimacing:

@ryan-gang (author of this challenge), @andy1li and I spent countless hours banging our heads against the wall, inspecting hexdumps and wading through the Kafka source just to make sense of what’s going on.

We’ve tried to make this easier for you folks by building a parser that highlights errors well, for example:

But even with this, you are inevitably going to have to suffer a bit due to Kafka’s poor docs.

In Jensen Huang’s words: “I wish upon you ample doses of pain and suffering.”

If you’ve got ideas on how we can make this more approachable / easy to work with - please let us know! Maybe we could work on an “Unofficial guide to the Kafka protocol”? Or a more interactive version like what Wireshark does with network protocols?


I seem to be stuck on the Handle APIVersions requests stage.

The instructions seem to be out of sync with what is actually being tested:

  • The first 4 bytes of your response (the “message length”) are valid.
  • The correlation ID in the response header matches the correlation ID in the request header.
  • The error code in the response body is 0 (No Error).
  • The response body contains at least one entry for the API key 18 (API_VERSIONS).
  • The MaxVersion for the ApiKey 18 is at least 4.
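
For anyone following along, here is a rough sketch of a response that should satisfy the checks listed above. It assumes the ApiVersions v3/v4 response layout as I understand it from the Kafka protocol docs: a response header holding only the correlation ID, then error_code, a compact array of API keys (each followed by an empty tag buffer), throttle_time_ms and a final empty tag buffer. The function names `encode_unsigned_varint` and `build_api_versions_response` are made up for illustration; this is not the tester's code or a reference implementation.

```python
import struct


def encode_unsigned_varint(value: int) -> bytes:
    """Encode a non-negative int as an unsigned varint (7 bits per byte, LSB first)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def build_api_versions_response(correlation_id, api_keys, error_code=0, throttle_time_ms=0):
    """Build a length-prefixed ApiVersions response (assumed v3/v4 layout).

    api_keys is a list of (api_key, min_version, max_version) tuples.
    """
    header = struct.pack(">i", correlation_id)

    body = struct.pack(">h", error_code)
    # Compact arrays store their length as N + 1 in an unsigned varint.
    body += encode_unsigned_varint(len(api_keys) + 1)
    for api_key, min_version, max_version in api_keys:
        body += struct.pack(">hhh", api_key, min_version, max_version)
        body += b"\x00"  # empty tag buffer for this entry
    body += struct.pack(">i", throttle_time_ms)
    body += b"\x00"  # empty tag buffer for the response body

    payload = header + body
    # The 4-byte message length counts everything after itself.
    return struct.pack(">i", len(payload)) + payload


if __name__ == "__main__":
    # Advertise API key 18 (API_VERSIONS) with versions 0 through 4.
    response = build_api_versions_response(810919229, [(18, 0, 4)])
    print(response.hex(" "))
```

Note that throttle_time_ms and both tag buffers have to be written even though the checklist does not mention them, because a decoder reading this layout expects those bytes to be present.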

The tester, however, seems to check more than that (for example throttle_time_ms, which is not mentioned in any of the challenge steps):

remote: [tester::#PV1] [Decoder] - .ResponseHeader
remote: [tester::#PV1] [Decoder]   ✔️ .correlation_id (810919229)
remote: [tester::#PV1] [Decoder] - .ResponseBody
remote: [tester::#PV1] [Decoder]   ✔️ .error_code (0)
remote: [tester::#PV1] [Decoder]   ✔️ .num_api_keys (0)
remote: [tester::#PV1] [Decoder]   ✔️ .throttle_time_ms (256)
remote: [tester::#PV1] Received:
remote: [tester::#PV1] Hex (bytes 21-25)                               | ASCII
remote: [tester::#PV1] ------------------------------------------------+------------------
remote: [tester::#PV1] 00 00 00 00 00                                  | .....
remote: [tester::#PV1]                 ^                                      ^
remote: [tester::#PV1] Error: Unexpected end of data
remote: [tester::#PV1] Context:
remote: [tester::#PV1] - ApiVersions v3
remote: [tester::#PV1]   - Response Body
remote: [tester::#PV1]     - TAG_BUFFER
remote: [tester::#PV1]       - TAGGED_FIELD_ARRAY
remote: [tester::#PV1]         - UNSIGNED_VARINT
remote: [tester::#PV1]
remote: [tester::#PV1] Test failed

Also, the tester output mentions ApiVersions v3, whereas the challenge says the tester will use v4.
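
To make the "Unexpected end of data" error in the log above more concrete, here is a minimal sketch of a decoder that walks a response field by field and fails as soon as it runs out of bytes. It assumes the same ApiVersions v3/v4 response layout as the Kafka protocol docs describe (correlation ID, error_code, compact array of API keys, throttle_time_ms, tag buffer). The `Reader` class, `UnexpectedEndOfData` exception and `decode_api_versions_response` function are hypothetical names of my own, not the actual tester.

```python
import struct


class UnexpectedEndOfData(Exception):
    """Raised when the decoder needs more bytes than the message contains."""


class Reader:
    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def take(self, n: int) -> bytes:
        if self.pos + n > len(self.data):
            raise UnexpectedEndOfData(f"need {n} byte(s) at offset {self.pos}")
        chunk = self.data[self.pos:self.pos + n]
        self.pos += n
        return chunk

    def int16(self) -> int:
        return struct.unpack(">h", self.take(2))[0]

    def int32(self) -> int:
        return struct.unpack(">i", self.take(4))[0]

    def unsigned_varint(self) -> int:
        value = shift = 0
        while True:
            byte = self.take(1)[0]
            value |= (byte & 0x7F) << shift
            if not byte & 0x80:
                return value
            shift += 7

    def skip_tag_buffer(self) -> None:
        # A tag buffer is a varint count, then (tag, size, data) per field.
        for _ in range(self.unsigned_varint()):
            self.unsigned_varint()             # tag
            self.take(self.unsigned_varint())  # size, then that many bytes


def decode_api_versions_response(message: bytes) -> dict:
    r = Reader(message)
    r.int32()  # message length
    correlation_id = r.int32()
    error_code = r.int16()
    # Compact array length is stored as N + 1 (0 means a null array).
    count = max(r.unsigned_varint() - 1, 0)
    api_keys = []
    for _ in range(count):
        api_keys.append((r.int16(), r.int16(), r.int16()))
        r.skip_tag_buffer()
    throttle_time_ms = r.int32()
    r.skip_tag_buffer()  # fails here if the final tag buffer byte is missing
    return {
        "correlation_id": correlation_id,
        "error_code": error_code,
        "api_keys": api_keys,
        "throttle_time_ms": throttle_time_ms,
    }
```

With a decoder shaped like this, a response that stops before the trailing tag buffer fails on the last unsigned varint read, which is consistent with the TAG_BUFFER / UNSIGNED_VARINT context in the log.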

I think it is better to release something a little bit more polished than to beta test on users:

:construction: We’re still working on instructions for this stage. You can find notes on how the tester works below.

I’m a little bit confused. If this “challenge” is about guessing what to do then that doesn’t seem that much fun and challenging at all.

@Eghizio thanks for highlighting these. I’ve moved your specific notes on the stage to “Question about Handle APIVersions requests stage” (#3 by rohitpaulk) and will address them there!

I’m a little bit confused. If this “challenge” is about guessing what to do then that doesn’t seem that much fun and challenging at all.

There’s no guessing here - the challenge is to re-implement Kafka. We just don’t have a friendly re-interpretation of Kafka docs like we have for the previous stages (yet). We’ve added tester notes so that it’s clear what exact tests you need to pass. Our tests are run against an official Kafka server to ensure that they’re valid.

I think it is better to release something a little bit more polished than to beta test on users:

The challenge is marked as beta for a reason :slight_smile: This is our style – we release early and often. If you’re looking for a more polished experience, I’d recommend checking out our other challenges that have been around for a while, like Redis.

I ran into exactly what this warning describes today, just getting to the ApiVersionRequest stage. I didn’t think the Kafka docs were too bad until now: they loosely define what this request is, then entirely skip defining what it actually looks like!

Thanks for putting this together. I’ve been wanting to learn more about how Kafka works before potentially starting on a large real-world project that would interface directly with its APIs. I’m looking forward to the upcoming additional instructions.
