Tests attempt to connect before code is ready

I’m stuck on Stage #ZU2

The tests sometimes attempt to connect before my code is ready.

Here are my logs:

[tester::#ZU2] Running tests for Stage #ZU2 (Handle concurrent clients)
[tester::#ZU2] $ ./your_program.sh
[your_program] Waiting for connection
[your_program] Connected to 0
[tester::#ZU2] [client-1] $ redis-cli PING

When “Waiting for connection” appears before the [client-1] logs, the test passes. However, sometimes this happens:

[tester::#WY1] Running tests for Stage #WY1 (Respond to multiple PINGs)
[tester::#WY1] $ ./your_program.sh
[tester::#WY1] [client-1] $ redis-cli PING
[your_program] Waiting for connection
[tester::#WY1] Received: "" (no content received)
[tester::#WY1]            ^ error
[tester::#WY1] Error: Expected start of a new RESP2 value (either +, -, :, $ or *)
[tester::#WY1] Test failed

The failure is random. Sometimes it gets past this to the first test, but then that one fails.

And here’s a snippet of my code: (formatter doesn’t have a zig option)

pub fn main(init: std.process.Init) !void {
    for (0..10) |_| {
        try receivers.append(init.arena.allocator(), .{});
    }

    var loop = try xev.Loop.init(.{});
    defer loop.deinit();

    const address = try std.Io.net.IpAddress.parse("127.0.0.1", 6379);
    const x_server = try xev.TCP.init(address);
    try x_server.bind(address);
    try x_server.listen(0);

    var c: xev.Completion = undefined;
    x_server.accept(&loop, &c, ?void, null, onConnect);
    std.debug.print("Waiting for connection\n", .{});

    while (true) {
        try loop.run(.until_done);
    }
}

Any help would be great

Hey @benjamaan476, could you upload your code to GitHub and share the link? It will be much easier to debug if I can run it directly.

Here it is https://github.com/benjamaan476/codecrafters-redis-zig

It runs perfectly locally for me. I can have multiple terminals open, each connected with redis-cli, and run things like echo -e "PING\nPING" | redis-cli and see the correct number of PONGs.

@benjamaan476 Thanks for sharing the link! I’m getting a 404 when trying to open it, could you double-check that the repo is public?

Sorry! Could have sworn I chose Public when I created it. It is public now :slight_smile:

Thanks @benjamaan476! I tried running tests for the first stages using codecrafters test --previous:

Looks like the issue happens after the server restarts for different stages, which suggests connections may not be getting cleaned up properly.

Could you try ensuring all connections are closed when the server shuts down, and that any shared state is reset as well? Let me know what you find.

How does cleanup of tests work? I am seeing no logs from my cleanup code after a test passes and it starts the next one. Do I need to catch the kill?

Yes, the tester terminates your server at the end of each stage.

One thing to watch out for: if sockets aren’t closed cleanly, the OS may take a short while to release the port (e.g. due to TIME_WAIT). This can cause issues when the next stage starts and your server tries to bind to the same port immediately.
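To illustrate the port-reuse point, here is a minimal sketch in Python rather than Zig (the socket options are the same at the OS level). It assumes the usual remedy of setting SO_REUSEADDR on the listening socket before bind(); whether xev.TCP.init already sets that option is not something this thread confirms, so treat it as something to check. The helper name make_listener is made up for the example.

```python
import socket

def make_listener(host: str = "127.0.0.1", port: int = 6379) -> socket.socket:
    """Create a listening TCP socket that can rebind a recently used port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind(). It lets bind() succeed even while old
    # connections on this port are still sitting in TIME_WAIT.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(16)  # backlog size is an arbitrary choice for this sketch
    return srv
```

With this option set, restarting the server right after the previous process was killed should not fail at bind() just because of lingering TIME_WAIT entries.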

I have attempted cleanup, but it still isn’t working. You can see in this screenshot, and in the one you posted earlier, that the test is calling the [client-1] redis-cli .. before my code even starts.

That “Start” line is literally the first line of my main.

I think this is a display issue. (Yes, it’s confusing, so we’ll need to fix this.)

You can check the tester code to see that your server is started before the client connects:

You can also verify that several connections to port 6379 are still in the TIME-WAIT state using the ss -tan command:

Putting in a sleep 5 before exec “$(dirnam … results in this

so I think the client is starting before run.sh is finished.

I’m seeing those TIME-WAIT sockets now, so thanks for pointing them out.

I have updated my code with a proper shutdown, which I can see happening locally. There are still sockets in TIME-WAIT, though, so I don’t know what to do. The tests all passed once locally, but when I submitted, they didn’t pass.

Well, if those sockets are still in TIME-WAIT, it suggests the shutdown isn’t fully clean yet :sweat_smile:

One way to confirm whether this is a cleanup issue is to run tests for #WY1 (Respond to multiple PINGs) first. Here’s how:

  1. Restart the challenge in Zig, clone the new repo, and copy your code over.
  2. Run codecrafters submit and click “mark stage as complete” until you reach stage #WY1.
  3. At that point, #WY1 should consistently pass, while other stages start failing.

“so I think the client is starting before run.sh is finished.”

To clarify, your server and the clients run concurrently, so either one may print logs first:

For context, our tester has been validated against the official Redis implementation (that is, the real Redis passes these tests), so we’re confident that log ordering isn’t the root cause here.
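A small sketch of why log order alone does not prove the client connected too early. This is not the tester's code, just an illustration in Python with a made-up function name: once a server has called listen(), the kernel completes incoming handshakes and queues the data, even if the program has not reached accept() (or printed anything) yet.

```python
import socket

def connect_before_accept() -> bytes:
    """Show that a client can connect and send before the server calls accept()."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen(8)               # from here on, the kernel accepts handshakes

    # The client connects and sends while the server has not called accept().
    cli = socket.create_connection(srv.getsockname())
    cli.sendall(b"PING\r\n")

    # Only now does the "server" accept; the bytes are already queued.
    conn, _ = srv.accept()
    data = conn.recv(1024)

    for s in (conn, cli, srv):
        s.close()
    return data
```

The caveat is that this only holds after listen() has been called. A connection attempt made before that point is refused instead of queued.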

Can I confirm how the sockets are supposed to be closed correctly?

If I get a read_len of zero, that means the client has closed the connection. I then call shutdown on the socket and close the fd. I can see logs showing that this happens every time you call redis-cli PING, but there is always a TIME-WAIT entry for each connection made when I exit the server.

Is this not how you’re supposed to clean up a socket?
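For reference, the read-until-zero-then-close pattern described above looks roughly like this in Python. It is a simplified sketch, not a libxev equivalent: serve_client is a hypothetical name, and it replies +PONG to every read instead of parsing RESP. One general TCP fact worth keeping in mind here: TIME-WAIT is a normal state held by whichever side closes the connection first, so seeing such entries in ss -tan does not by itself show that a close was missed.

```python
import socket

def serve_client(conn: socket.socket) -> int:
    """Reply +PONG to each read until the peer closes; return the reply count."""
    replies = 0
    with conn:  # closes the socket when the block exits
        while True:
            data = conn.recv(1024)
            if not data:  # b"" means the peer closed its side
                break
            conn.sendall(b"+PONG\r\n")
            replies += 1
    return replies
```

A zero-length read followed by close, as in this loop, is the conventional way to finish a connection once the client has hung up.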

To be honest, I’m not familiar with how to handle graceful shutdown using libxev.

One thing you could try is looking through the Code Examples to see how other Zig users who passed this stage approached it.

That should give you some useful reference points.

I can only view the first two solutions. They just seem to close the connection when they’re done. I might just have to drop libxev and do it another way