Bug in tester: `list_blpop`

I do realise the challenge is unreleased, but I had been tinkering with it locally, so I thought it would be best to share my findings! Apologies if this feels premature.

The current BlockingClientGroupTestCase runs the assertion as is, without waiting. This is troublesome for custom implementations: when the server is supposed to unblock a command/client, it may take a small amount of time to synchronize and do so. If the assertion runs within that window, we get the "no content received" error and the whole test fails.

For example, consider stage 510 (BLPOP-1). Here are the logs from an unlucky run:

[stage-510] Running tests for Stage #510: BLPOP-1
[stage-510] $ ./your_program.sh
[your_program] Server started at port: 6379
[your_program] [Handshake] Master role with replID: 9d075485237c4f20908e83e0a352fc41 and replOffset: 0
[your_program] [Handshake] Completed successfully
[your_program] [:33364] Accepted connection
[stage-510] client-2: $ redis-cli BLPOP banana 0
[stage-510] client-2: Sent bytes: "*3\r\n$5\r\nBLPOP\r\n$6\r\nbanana\r\n$1\r\n0\r\n"
[stage-510] client-3: $ redis-cli BLPOP banana 0
[stage-510] client-3: Sent bytes: "*3\r\n$5\r\nBLPOP\r\n$6\r\nbanana\r\n$1\r\n0\r\n"
[stage-510] client-1: $ redis-cli RPUSH banana pineapple
[stage-510] client-1: Sent bytes: "*3\r\n$5\r\nRPUSH\r\n$6\r\nbanana\r\n$9\r\npineapple\r\n"
[your_program] [:33372] Accepted connection
[your_program] [:33378] Accepted connection
[your_program] [C :33378] [735339 ms] (BLPOP, banana, 0)
[your_program] [C :33372] [735339 ms] (BLPOP, banana, 0)
[your_program] [C :33364] [735345 ms] (RPUSH, banana, pineapple)
[your_program] [C :33364] [735345 ms] Response: Int(1)
[stage-510] client-1: Received bytes: ":1\r\n"
[stage-510] client-1: Received RESP integer: 1
[stage-510] Received 1
[your_program] [C :33378] [735339 ms] Response: Arr(Bulk('banana'), Bulk('pineapple'))
[stage-510] Received: "" (no content received)
[stage-510]            ^ error
[stage-510] Error: Expected start of a new RESP2 value (either +, -, :, $ or *)
[stage-510] Test failed
[stage-510] Terminating program
[your_program] [C :33378] Unexpected error: Connection reset
[your_program] [C :33364] Connection closed

As you can see here:

  • My sample implementation does respond to the RPUSH command correctly
  • But before my server is able to sync the list and unblock the client that issued the BLPOP command, the tester runs the BlockingClientGroupTestCase’s assertion, so it reads no value and the test fails.
  • On lucky runs, the assertion runs after my server has synced and sent the value, so the test passes.

The tester could fix this by either:

  • Retrying the assertion a couple of times if the received value is empty
  • Defining a maximum time that solutions are allowed to take to sync, and only running the assertions after it has elapsed

Hey @EshaanAgg, thanks for testing the Lists extension out!

We’ll take a closer look before releasing it. cc: @UdeshyaDhungana

@EshaanAgg could you share your code here, please? We did account for the delay; it works the same way as in all other stages: we wait up to 2 seconds whenever we try to read a RESP value. See redis-tester/internal/resp/connection/connection.go at 4ca1c5368a45d0181a1462cdb9fe82728776fb4d · codecrafters-io/redis-tester · GitHub.
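To illustrate the idea of waiting for a bounded time before giving up on a read, here is a minimal Python sketch. It is not the redis-tester code (that lives in the Go file linked above); the function name `read_with_deadline` and the buffer size are assumptions made purely for illustration.

```python
import socket


def read_with_deadline(sock, timeout_s=2.0):
    """Wait up to timeout_s seconds for bytes to arrive on sock.

    Hypothetical helper, not part of redis-tester.
    Returns the received bytes, or None if nothing arrived in time.
    Note that recv() returns b"" when the peer has closed the connection.
    """
    sock.settimeout(timeout_s)
    try:
        return sock.recv(4096)
    except socket.timeout:
        return None
```

With a read like this, a reply that arrives a few milliseconds after RPUSH is still picked up, so an immediate failure points to the reply never reaching that connection rather than to timing.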

We can try running the tester against your code and help figure out what’s wrong, whether it’s a tester fault or a code error.

Sure, you can find it here.

It may well be a fault in my code, but when I ran the relevant test, it failed almost instantaneously, without blocking for the 2 seconds you mentioned. Let me know if there’s anything I can do to help with debugging!

Hi @EshaanAgg, after running the code sample you provided and inspecting the messages in Wireshark, we found that the server responds to the wrong client. For example:

  1. Client 1 issues BLPOP first

  2. Client 2 issues BLPOP

  3. Client 3 runs RPUSH

According to the Redis documentation for BLPOP, when multiple clients are blocked on the same key, the client that has been waiting the longest should be served first. Ref: Redis Docs.
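One way to get this ordering is to keep a FIFO queue of blocked clients per key and always hand a newly pushed element to the client at the front. The following Python sketch shows the idea in isolation; the class `BlockingListStore`, its method names, and the client ids are hypothetical and are not taken from Redis, redis-tester, or the code under discussion.

```python
from collections import deque


class BlockingListStore:
    """Toy in-memory store illustrating FIFO wake-up order (hypothetical)."""

    def __init__(self):
        self.lists = {}    # key -> deque of values
        self.waiters = {}  # key -> deque of client ids, oldest waiter first

    def blpop(self, client_id, key):
        """Pop from the head if possible; otherwise register a waiter.

        Returns (key, value) when an element is available, else None.
        """
        items = self.lists.get(key)
        if items:
            value = items.popleft()
            if not items:
                del self.lists[key]
            return (key, value)
        self.waiters.setdefault(key, deque()).append(client_id)
        return None

    def rpush(self, key, value):
        """Append value, then serve blocked clients in arrival order.

        Returns (length_after_push, deliveries), where deliveries is a
        list of (client_id, key, value) tuples to send out.
        """
        items = self.lists.setdefault(key, deque())
        items.append(value)
        length = len(items)
        deliveries = []
        queue = self.waiters.get(key)
        while queue and items:
            client_id = queue.popleft()  # longest-waiting client first
            deliveries.append((client_id, key, items.popleft()))
        if not items:
            del self.lists[key]
        if queue is not None and not queue:
            del self.waiters[key]
        return length, deliveries
```

In this sketch, if "client-2" and then "client-3" both call `blpop` on `banana`, a later `rpush("banana", "pineapple")` produces a single delivery addressed to "client-2", and "client-3" stays queued.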



Here, since the client on port 61783 issued BLPOP first, the tester expects the response on that connection. Instead, the RESP array is received on port 61784.

Meanwhile, we will be improving the tester before release.

Please let me know if you have any queries.

Ahh! That’s a sneaky bug! I didn’t realise the error was that the server was sending the response to the wrong client. Thanks so much for the debugging help :slight_smile:

Adding more test cases to catch flaky implementations like mine, or logging when the wrong client receives the message, would be great additions to the tester! Thanks again :slight_smile:
