Ruby: Redis Replication Stage #NA2: Error reading replica acknowledgement

I’m stuck on stage 18 replication.

My approach is to send a REPLCONF GETACK * command to each replica and update a counter until either the timeout expires or the expected number of replicas has acknowledged.
I haven’t been able to get the expected result yet, but I noticed that whenever I try to read a replica’s response to the GETACK command (replica.gets, replica.read, etc.), I get this error:

/app/app/client_handler.rb:129:in `read': Connection reset by peer (Errno::ECONNRESET)

Here are my logs:

[replication-18] Running tests for Replication > Stage #18: WAIT with multiple commands
[replication-18] $ ./spawn_redis_server.sh --port 6379
[replication-18] Proceeding to create 4 replicas.
[replication-18] replica-1: $ redis-cli PING
[replication-18] Received "PONG"
[replication-18] replica-1: $ redis-cli REPLCONF listening-port 6380
[replication-18] Received "OK"
[replication-18] replica-1: $ redis-cli REPLCONF capa psync2
[replication-18] Received "OK"
[replication-18] replica-1: $ redis-cli PSYNC ? -1
[replication-18] Received "FULLRESYNC 8371b4fb1155b71f4a04d3e1bc3e18c4a990aeeb 0"
[replication-18] Received RDB file
[replication-18] replica-2: $ redis-cli PING
[replication-18] Received "PONG"
[replication-18] replica-2: $ redis-cli REPLCONF listening-port 6380
[replication-18] Received "OK"
[replication-18] replica-2: $ redis-cli REPLCONF capa psync2
[replication-18] Received "OK"
[replication-18] replica-2: $ redis-cli PSYNC ? -1
[replication-18] Received "FULLRESYNC 8371b4fb1155b71f4a04d3e1bc3e18c4a990aeeb 0"
[replication-18] Received RDB file
[replication-18] replica-3: $ redis-cli PING
[replication-18] Received "PONG"
[replication-18] replica-3: $ redis-cli REPLCONF listening-port 6380
[replication-18] Received "OK"
[replication-18] replica-3: $ redis-cli REPLCONF capa psync2
[replication-18] Received "OK"
[replication-18] replica-3: $ redis-cli PSYNC ? -1
[replication-18] Received "FULLRESYNC 8371b4fb1155b71f4a04d3e1bc3e18c4a990aeeb 0"
[replication-18] Received RDB file
[replication-18] replica-4: $ redis-cli PING
[replication-18] Received "PONG"
[replication-18] replica-4: $ redis-cli REPLCONF listening-port 6380
[replication-18] Received "OK"
[replication-18] replica-4: $ redis-cli REPLCONF capa psync2
[replication-18] Received "OK"
[replication-18] replica-4: $ redis-cli PSYNC ? -1
[replication-18] Received "FULLRESYNC 8371b4fb1155b71f4a04d3e1bc3e18c4a990aeeb 0"
[replication-18] Received RDB file
[replication-18] client: $ redis-cli SET foo 123
[replication-18] Received "OK"
[replication-18] client: $ redis-cli WAIT 1 500
[replication-18] Testing Replica : 1
[replication-18] Received ["SET", "foo", "123"]
[replication-18] Received ["REPLCONF", "GETACK", "*"]
[replication-18] replica-1: $ redis-cli REPLCONF ACK 31
[replication-18] Testing Replica : 2
[replication-18] Received ["SET", "foo", "123"]
[replication-18] Received ["REPLCONF", "GETACK", "*"]
[replication-18] Testing Replica : 3
[replication-18] Received ["SET", "foo", "123"]
[replication-18] Received ["REPLCONF", "GETACK", "*"]
[replication-18] Testing Replica : 4
[replication-18] Received ["SET", "foo", "123"]
[replication-18] Received ["REPLCONF", "GETACK", "*"]
[replication-18] Expected 1, got 4
[replication-18] Test failed (try setting 'debug: true' in your codecrafters.yml to see more details)

I’m hoping to understand why I’m getting the error and any other discussion that can point me in the right direction.

Cheers.

Hi @rohitpaulk,
There are a few other topics related to not getting a response when REPLCONF GETACK * is sent to a replica. Any help regarding this will be appreciated :pray:

Hey @9jaswag!

Yep, this does seem to be a common point of confusion. We’re going to update the instructions here. In this stage, it is expected that some replicas will not send ACKs back (that’s what we’re testing – whether your program can accurately identify replicas which did send ACKs vs. ones that didn’t).

In my case, I get an Errno::ECONNRESET error. Does getting this error mean the replica didn’t acknowledge the previous command?
From my understanding from previous stages, I would expect a reply with an offset of 0 or so to mean the last command wasn’t acknowledged. Is this assumption incorrect?

Yep, there can be cases where you don’t receive an ACK at all, like if a replica disconnects in the middle. That’s a case where you’d expect to see a “connection reset” or “broken pipe” error.

Another case is just that the replica doesn’t respond in time (i.e. it’s busy processing the replication stream and hasn’t hit the “GETACK” command yet).

(We’ll highlight both of these in the updated instructions)
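To make the two cases above concrete, here is a minimal sketch of counting ACKs against a deadline. It is not the tester's or the author's implementation: the helper name count_acks is hypothetical, replicas is assumed to be an array of connected sockets, and any reply containing "ACK" is counted without comparing offsets.

```ruby
require 'socket'

# RESP encoding of: REPLCONF GETACK *
GETACK = "*3\r\n$8\r\nREPLCONF\r\n$6\r\nGETACK\r\n$1\r\n*\r\n"

# Hypothetical helper: returns how many replicas replied before the deadline.
def count_acks(replicas, needed, timeout_ms)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout_ms / 1000.0

  # Send GETACK to every replica first; skip replicas that already disconnected.
  pending = replicas.select do |replica|
    replica.write(GETACK)
    true
  rescue Errno::ECONNRESET, Errno::EPIPE, IOError
    false
  end

  acked = 0
  while acked < needed && !pending.empty?
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    break if remaining <= 0

    # Wait only as long as the deadline allows; nil means the wait timed out.
    ready, = IO.select(pending, nil, nil, remaining)
    break if ready.nil?

    ready.each do |replica|
      begin
        data = replica.read_nonblock(1024)
        acked += 1 if data.include?('ACK') # simplification: offsets are ignored
      rescue IO::WaitReadable
        next # nothing to read yet; keep waiting on this replica
      rescue EOFError, Errno::ECONNRESET
        # Replica disconnected, so it cannot acknowledge.
      end
      pending.delete(replica)
    end
  end
  acked
end
```

A replica that disconnects is dropped from the pending list, and a replica that stays silent simply never becomes readable, so both cases end with a lower count rather than an unhandled exception. A fuller version would parse the RESP reply and compare the reported offset.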

Catching the Errno::ECONNRESET error doesn’t work and returns this output:

[replication-18] client: $ redis-cli WAIT 3 500
[replication-18] Received: "" (no content received)
[replication-18]            ^ error
[replication-18] Error: Expected start of a new RESP value (either +, -, :, $ or *)
[replication-18] Test failed (try setting 'debug: true' in your codecrafters.yml to see more details)
[your_program] Logs from your program will appear here!
[your_program] "--- catch error below"

Removing the code that catches the error returns this output:

[replication-18] Received RDB file
[replication-18] client: $ redis-cli WAIT 3 500
[replication-18] Received: "" (no content received)
[replication-18]            ^ error
[replication-18] Error: Expected start of a new RESP value (either +, -, :, $ or *)
[replication-18] Test failed (try setting 'debug: true' in your codecrafters.yml to see more details)
[your_program] /app/app/client_handler.rb:153:in `readpartial': Connection reset by peer (Errno::ECONNRESET)
[your_program] Logs from your program will appear here!
[your_program] "--- catch error below"

Can you confirm whether this is due to some form of error suppression in the system, or whether I’m doing something wrong?

@9jaswag After catching the error, can you log out the bytes that you’re sending back to the client? The logs suggest that no response was received. If you could send a small code snippet, that’d be useful for us to look through too!
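As a rough illustration of logging the reply bytes, the sketch below encodes the WAIT result as a RESP integer and prints it before writing. The helper names encode_resp_integer and reply_to_wait are hypothetical, and client is assumed to be the socket of the connection that issued WAIT.

```ruby
# Hypothetical helper: RESP integers are ":" followed by the number and CRLF.
def encode_resp_integer(n)
  ":#{Integer(n)}\r\n"
end

# Hypothetical helper: log the exact bytes, then send them to the client.
def reply_to_wait(client, ack_count)
  payload = encode_resp_integer(ack_count)
  warn "WAIT reply bytes: #{payload.inspect}"
  client.write(payload)
end
```

If the log line never appears, the code path that writes the reply was not reached, which would match the tester reporting "no content received".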

@rohitpaulk I agree with you. It seems as though as soon as the error occurs, code execution is terminated and then nothing is returned to the client from my redis server (hence why I asked if there was some form of error suppression or if I was doing something wrong).

Here’s a snippet of my code:

ack_count = 0
replicas.each do |replica|
  replica.write(generate_resp_array(['REPLCONF', 'GETACK', '*']))

  response = replica.read # I actually don't know what response is since this line raises the error.

  ack_count += 1 if response
rescue EOFError, Errno::ECONNRESET => e
  pp "Error occurred: #{e}"
end

# from this point henceforth, the code exits and everything below isn't executed.
pp "returned ack count: #{ack_count}"

ack_count

It seems as though the code below the rescue block isn’t executed after the error is caught, which is unexpected.

@rohitpaulk any thoughts on this?

@9jaswag In the loop above, I think that after the rescue block triggers, the loop will just continue (it’ll move on to the next element). Could that explain why you aren’t hitting the log line below the each block?
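The behaviour described here can be checked in isolation. This is a small sketch with a stand-in object instead of a real socket; FakeReplica and count_responses are hypothetical names used only for this demonstration.

```ruby
# Stand-in for a replica socket: `read` raises when `reset` is true.
FakeReplica = Struct.new(:name, :reset) do
  def read
    raise Errno::ECONNRESET if reset

    'ACK'
  end
end

def count_responses(replicas)
  count = 0
  replicas.each do |replica|
    count += 1 if replica.read
  rescue EOFError, Errno::ECONNRESET => e
    warn "#{replica.name}: #{e.class}" # handled here; the loop moves on
  end
  count # still reached even though one iteration raised
end

fakes = [
  FakeReplica.new('r1', false),
  FakeReplica.new('r2', true),
  FakeReplica.new('r3', false)
]
puts count_responses(fakes) # prints 2
```

So a rescued error on one replica does not stop the code after the loop from running. One possible explanation for the snippet appearing to stop, which is not confirmed in this thread, is that IO#read called without a length keeps reading until EOF, so the call on a replica that stays connected may block past the WAIT timeout.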

Closing this for now, please do let us know if you still need help!

Note: I’ve updated the title of this post to include the stage ID (#NA2). You can learn about the stages rename here: Upcoming change: Stages overhaul.