Test master not waiting for response to get request?

I’m stuck on Stage #YG4.

It feels like the test master gives me no time to react to the first get command.

My Python code can succesfully finish the master handshake, and then use the same socket to receive set commands from the test master to update the replica db - even when e.g. multiple set commands are received at the same time. However, I keep getting errors with the first get command.

Unlike the set commands, I was completely incapable to receive the get command from the master handshake socket. First, I thought my code maybe was going through the set commands too slowly, and that this lead to timeouts. I made a function testing the socket after the master handshake completes. It simply “empties” the socket by receiving data until chunk size is not above 0. It does not process the set commands which I could get away with because the master does not expect responses to the set commands.

No matter what, the data received from the handshake socket did not include the get command. I then tried creating a new socket by using accept() with the replica server socket immediately after emptying the handshake socket. Now I was able to receive the get command - but it´s as if the test master does not wait for me to respond even though I try to make the code as fast as possible by cheating with the set commands. Immediately after receiving the get command from the new socket, I respond with a sendall hardcoded to send the expected answer to the get command without spending time on my actual get function extracting info from the replica db. No matter what, I keep getting the message that nothing is received immediately after being told that a get cmd request has been sent.

I don´t have a lot of experience with sockets. Is this expected behavior? I thought all the requests were supposed to be made with the handshake socket.

Here are my logs. The prints in test_socket don´t even all finish before tester terminates the program:

remote: [tester::#YG4] [handshake] master: Waiting for replica to initiate handshake with "PING" command
remote: [tester::#YG4] [handshake] master: Received bytes: "*1\r\n$4\r\nping\r\n"
remote: [tester::#YG4] [handshake] master: Received RESP array: ["ping"]
remote: [tester::#YG4] [handshake] Received ["ping"]
remote: [tester::#YG4] [handshake] master: Sent "PONG"
remote: [tester::#YG4] [handshake] master: Sent bytes: "+PONG\r\n"
remote: [tester::#YG4] [handshake] master: Waiting for replica to send "REPLCONF listening-port 6380" command
remote: [tester::#YG4] [handshake] master: Received bytes: "*3\r\n$8\r\nREPLCONF\r\n$14\r\nlistening-port\r\n$4\r\n6380\r\n"
remote: [tester::#YG4] [handshake] master: Received RESP array: ["REPLCONF", "listening-port", "6380"]
remote: [tester::#YG4] [handshake] Received ["REPLCONF", "listening-port", "6380"]
remote: [tester::#YG4] [handshake] master: Sent "OK"
remote: [tester::#YG4] [handshake] master: Sent bytes: "+OK\r\n"
remote: [tester::#YG4] [handshake] master: Waiting for replica to send "REPLCONF capa" command
remote: [tester::#YG4] [handshake] master: Received bytes: "*3\r\n$8\r\nREPLCONF\r\n$4\r\ncapa\r\n$6\r\npsync2\r\n"
remote: [tester::#YG4] [handshake] master: Received RESP array: ["REPLCONF", "capa", "psync2"]
remote: [tester::#YG4] [handshake] Received ["REPLCONF", "capa", "psync2"]
remote: [tester::#YG4] [handshake] master: Sent "OK"
remote: [tester::#YG4] [handshake] master: Sent bytes: "+OK\r\n"
remote: [tester::#YG4] [handshake] master: Waiting for replica to send "PSYNC" command
remote: [tester::#YG4] [handshake] master: Received bytes: "*3\r\n$5\r\nPSYNC\r\n$1\r\n?\r\n$2\r\n-1\r\n"
remote: [tester::#YG4] [handshake] master: Received RESP array: ["PSYNC", "?", "-1"]
remote: [tester::#YG4] [handshake] Received ["PSYNC", "?", "-1"]
remote: [tester::#YG4] [handshake] master: Sent "FULLRESYNC 75cd7bc10c49047e0d163660f3b90625b1af31dc 0"
remote: [tester::#YG4] [handshake] master: Sent bytes: "+FULLRESYNC 75cd7bc10c49047e0d163660f3b90625b1af31dc 0\r\n"
remote: [tester::#YG4] [handshake] Sending RDB file...
remote: [tester::#YG4] [handshake] master: Sent bytes: "$88\r\nREDIS0011\xfa\tredis-ver\x057.2.0\xfa\nredis-bits\xc0@\xfa\x05ctime\xc2m\b\xbce\xfa\bused-mem°\xc4\x10\x00\xfa\baof-base\xc0\x00\xff\xf0n;\xfe\xc0\xffZ\xa2"
remote: [tester::#YG4] [handshake] Sent RDB file.
remote: [tester::#YG4] [propagation] master: > SET foo 123
remote: [tester::#YG4] [propagation] master: Sent bytes: "*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\n123\r\n"
remote: [tester::#YG4] [propagation] master: > SET bar 456
remote: [tester::#YG4] [propagation] master: Sent bytes: "*3\r\n$3\r\nSET\r\n$3\r\nbar\r\n$3\r\n456\r\n"
remote: [tester::#YG4] [propagation] master: > SET baz 789
remote: [tester::#YG4] [propagation] master: Sent bytes: "*3\r\n$3\r\nSET\r\n$3\r\nbaz\r\n$3\r\n789\r\n"
remote: [tester::#YG4] [test] Getting key foo
remote: [tester::#YG4] [test] client: $ redis-cli GET foo
remote: [tester::#YG4] [test] client: Sent bytes: "*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n"
remote: [your_program] Starting test_socket...
remote: [your_program] Emptying handshake_socket
remote: [tester::#YG4] Received: "" (no content received)
remote: [tester::#YG4]            ^ error
remote: [tester::#YG4] Error: Expected start of a new RESP2 value (either +, -, :, $ or *)
remote: [tester::#YG4] Test failed
remote: [tester::#YG4] Terminating program
remote: [your_program] Command(s) received with handshake_socket: b'*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\n123\r\n*3\r\n$3\r\nSET\r\n$3\r\nbar\r\n$3\r\n456\r\n*3\r\n$3\r\nSET\r\n$3\r\nbaz\r\n$3\r\n789\r\n'
remote: [your_program] Creating get_cmd_socket
remote: [your_program] Command(s) received with get_cmd_socket: b'*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n'
remote: [your_program] Sending response to get command
remote: [tester::#YG4] Program terminated successfully

And here’s a snippet of my code:

def test_socket(handshake_socket):
        data = b""
        while True:
            chunk = handshake_socket.recv(4096)
            if len(chunk) > 0:
                data += chunk
            else:
                break
        print(f"Command(s) received with handshake_socket: {data}")
        set_cmd_socket, addr = server_socket.accept()
        chunk = get_cmd_socket.recv(4096)
        print(f"Command(s) received with get_cmd_socket: {chunk}")
        get_cmd_socket.sendall("$3\r\n123\r\n").encode()

@mitoeth I’ll take a look at your code and get back to you before Monday.

1 Like

@mitoeth I tried running your code against the previous stages, but it’s no longer passing the second stage.

Suggestions:

  1. Use our CLI to test against previous stages by running:
codecrafters test --previous
  1. Focus on fixing the early stages first, as later stages depend on them.

Hello @andy1li, thanks for the help,

The previous stages primarily failed due to changes I made while trying to figure out how to make #YG4 work, my mistake.

While performing previous tests, I have to disable the handshake procedure in the code for #HC6 which tests an INFO command with my replica server. Normally, when there´s a replicaof argument, I run my master_handshake function and return the socket for any additional requests to the replica e.g. an INFO command. This kept on failing in #HC6, but I passed the test when I commented out the handshake step and just went straight to receiving commands with _create_client_thread() using a socket from server_socket.accept() like shown below:

if args.replicaof:
            print("Creating replica...")
            master_host, master_port = args.replicaof.split(" ")
            #client_socket = replica.master_handshake(master_host, int(master_port), server_port)
            client_socket, addr = server_socket.accept()
            _create_client_thread(client_socket)
        else:
            while True:
                client_socket, addr = server_socket.accept()
                _create_client_thread(client_socket)

#HC6 is before you start implementing the handshake, so it might not expect my replica to do it. Maybe it just wants a response to INFO. But if I disable master_handshake(), I start receiving errors in later challenges.

This is the result when master_handshake() is not commented out:

[tester::#HC6] Running tests for Stage #HC6 (Replication - The INFO command on a replica)
[tester::#HC6] Master is running on port 6379
[tester::#HC6] $ ./spawn_redis_server.sh --port 6380 --replicaof "localhost 6379"
[your_program] Logs from your program will appear here!
[your_program] Running main._server_accept_socket
[your_program] Creating replica...
[your_program] Running replica.master_handshake...
[your_program] Running server.decode_response...
[your_program] Response to PING is: b'+PONG\r\n'
[your_program] Running server.decode_response...
[your_program] Response to 1st REPLCONF is: b'+OK\r\n'
[your_program] Running server.decode_response...
[your_program] Response to 2nd REPLCONF is: b'+OK\r\n'
[your_program] Running server.decode_response...
[your_program] 1st response to PSYNC is: b'+FULLRESYNC 75cd7bc10c49047e0d163660f3b90625b1af31dc 0\r\n'
[your_program] Running server.decode_response...
[your_program] 2nd response to PSYNC is: b'$88\r\nREDIS0011\xfa\tredis-ver\x057.2.0\xfa\nredis-bits\xc0@\xfa\x05ctime\xc2m\x08\xbce\xfa\x08used-mem\xc2\xb0\xc4\x10\x00\xfa\x08aof-base\xc0\x00\xff\xf0n;\xfe\xc0\xffZ\xa2'
[your_program] Running main.manage_client_socket_request
[your_program] Running server.decode_request...
[your_program] Running server._parse_array...
[your_program] Request decoding failed: invalid literal for int() with base 10: b''
[tester::#HC6] dial tcp [::1]:6380: connect: connection refused
[tester::#HC6] Test failed
[tester::#HC6] Terminating program
[tester::#HC6] Program terminated successfully

The decoding fails because server._parse_array() looks for an integer to determine the parameter count, but, by the time I finish the handshake, recv() only seems to return empty reponses like b’'.

This is the result when master_handshake() is commented out. #GL7 is also the challenge where you start implementing the handshake:

[tester::#GL7] Running tests for Stage #GL7 (Replication - Send handshake (1/3))
[tester::#GL7] Master is running on port 6379.
[tester::#GL7] $ ./spawn_redis_server.sh --port 6380 --replicaof "localhost 6379"
[your_program] Logs from your program will appear here!
[your_program] Running main._server_accept_socket
[your_program] Creating replica...
[tester::#GL7] timed out, test exceeded 10 seconds
[tester::#GL7] Test failed
[tester::#GL7] Terminating program
[tester::#GL7] Error terminating program: 'execution timed out'

This is currently the output for #YG4 without testing previous challenges (master_handshake() is not commented out):

[tester::#YG4] Running tests for Stage #YG4 (Replication - Command Processing)
[tester::#YG4] Master is running on port 6379
[tester::#YG4] $ ./spawn_redis_server.sh --port 6380 --replicaof "localhost 6379"
[your_program] Logs from your program will appear here!
[your_program] Running main._server_accept_socket
[your_program] Creating replica...
[your_program] Running replica.master_handshake...
[your_program] Running server.decode_response...
[tester::#YG4] [handshake] master: Waiting for replica to initiate handshake with "PING" command
[tester::#YG4] [handshake] master: Received bytes: "*1\r\n$4\r\nping\r\n"
[tester::#YG4] [handshake] master: Received RESP array: ["ping"]
[tester::#YG4] [handshake] Received ["ping"]
[tester::#YG4] [handshake] master: Sent "PONG"
[tester::#YG4] [handshake] master: Sent bytes: "+PONG\r\n"
[tester::#YG4] [handshake] master: Waiting for replica to send "REPLCONF listening-port 6380" command
[tester::#YG4] [handshake] master: Received bytes: "*3\r\n$8\r\nREPLCONF\r\n$14\r\nlistening-port\r\n$4\r\n6380\r\n"
[tester::#YG4] [handshake] master: Received RESP array: ["REPLCONF", "listening-port", "6380"]
[tester::#YG4] [handshake] Received ["REPLCONF", "listening-port", "6380"]
[tester::#YG4] [handshake] master: Sent "OK"
[tester::#YG4] [handshake] master: Sent bytes: "+OK\r\n"
[tester::#YG4] [handshake] master: Waiting for replica to send "REPLCONF capa" command
[your_program] Response to PING is: b'+PONG\r\n'
[your_program] Running server.decode_response...
[your_program] Response to 1st REPLCONF is: b'+OK\r\n'
[your_program] Running server.decode_response...
[tester::#YG4] [handshake] master: Received bytes: "*3\r\n$8\r\nREPLCONF\r\n$4\r\ncapa\r\n$6\r\npsync2\r\n"
[tester::#YG4] [handshake] master: Received RESP array: ["REPLCONF", "capa", "psync2"]
[tester::#YG4] [handshake] Received ["REPLCONF", "capa", "psync2"]
[tester::#YG4] [handshake] master: Sent "OK"
[tester::#YG4] [handshake] master: Sent bytes: "+OK\r\n"
[tester::#YG4] [handshake] master: Waiting for replica to send "PSYNC" command
[your_program] Response to 2nd REPLCONF is: b'+OK\r\n'
[your_program] Running server.decode_response...
[tester::#YG4] [handshake] master: Received bytes: "*3\r\n$5\r\nPSYNC\r\n$1\r\n?\r\n$2\r\n-1\r\n"
[tester::#YG4] [handshake] master: Received RESP array: ["PSYNC", "?", "-1"]
[tester::#YG4] [handshake] Received ["PSYNC", "?", "-1"]
[tester::#YG4] [handshake] master: Sent "FULLRESYNC 75cd7bc10c49047e0d163660f3b90625b1af31dc 0"
[tester::#YG4] [handshake] master: Sent bytes: "+FULLRESYNC 75cd7bc10c49047e0d163660f3b90625b1af31dc 0\r\n"
[tester::#YG4] [handshake] Sending RDB file...
[tester::#YG4] [handshake] master: Sent bytes: "$88\r\nREDIS0011\xfa\tredis-ver\x057.2.0\xfa\nredis-bits\xc0@\xfa\x05ctime\xc2m\b\xbce\xfa\bused-mem°\xc4\x10\x00\xfa\baof-base\xc0\x00\xff\xf0n;\xfe\xc0\xffZ\xa2"
[tester::#YG4] [handshake] Sent RDB file.
[your_program] 1st response to PSYNC is: b'+FULLRESYNC 75cd7bc10c49047e0d163660f3b90625b1af31dc 0\r\n'
[your_program] Running server.decode_response...
[your_program] 2nd response to PSYNC is: b'$88\r\nREDIS0011\xfa\tredis-ver\x057.2.0\xfa\nredis-bits\xc0@\xfa\x05ctime\xc2m\x08\xbce\xfa\x08used-mem\xc2\xb0\xc4\x10\x00\xfa\x08aof-base\xc0\x00\xff\xf0n;\xfe\xc0\xffZ\xa2'
[tester::#YG4] [propagation] master: > SET foo 123
[tester::#YG4] [propagation] master: Sent bytes: "*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\n123\r\n"
[tester::#YG4] [propagation] master: > SET bar 456
[tester::#YG4] [propagation] master: Sent bytes: "*3\r\n$3\r\nSET\r\n$3\r\nbar\r\n$3\r\n456\r\n"
[tester::#YG4] [propagation] master: > SET baz 789
[tester::#YG4] [propagation] master: Sent bytes: "*3\r\n$3\r\nSET\r\n$3\r\nbaz\r\n$3\r\n789\r\n"
[tester::#YG4] [test] Getting key foo
[tester::#YG4] [test] client: $ redis-cli GET foo
[tester::#YG4] [test] client: Sent bytes: "*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n"
[your_program] Running main.manage_client_socket_request
[your_program] Running server.decode_request...
[your_program] Running server._parse_array...
[your_program] Creatin redis_cmd...
[your_program] redis_cmd is: ['SET', 'foo', '123']
[your_program] Checking if replica...
[your_program] Is replica
[your_program] Running replica.response_handler...
[your_program] Running commands.set...
[your_program] response_list is:
[your_program] []
[your_program] Checking if response_list is empty...
[your_program] Response is empty
[your_program] Running server.decode_request...
[your_program] Running server._parse_array...
[your_program] Creatin redis_cmd...
[your_program] redis_cmd is: ['SET', 'bar', '456']
[your_program] Checking if replica...
[your_program] Is replica
[your_program] Running replica.response_handler...
[your_program] Running commands.set...
[your_program] response_list is:
[your_program] []
[your_program] Checking if response_list is empty...
[your_program] Response is empty
[your_program] Running server.decode_request...
[your_program] Running server._parse_array...
[your_program] Creatin redis_cmd...
[your_program] redis_cmd is: ['SET', 'baz', '789']
[your_program] Checking if replica...
[your_program] Is replica
[your_program] Running replica.response_handler...
[your_program] Running commands.set...
[your_program] response_list is:
[your_program] []
[your_program] Checking if response_list is empty...
[your_program] Response is empty
[your_program] Running server.decode_request...
[your_program] Running server._parse_array...
[tester::#YG4] Received: "" (no content received)
[tester::#YG4]            ^ error
[tester::#YG4] Error: Expected start of a new RESP2 value (either +, -, :, $ or *)
[tester::#YG4] Test failed
[tester::#YG4] Terminating program
[your_program] Request decoding failed: invalid literal for int() with base 10: b''
[tester::#YG4] Program terminated successfully

@mitoeth Thanks for the detailed explanation!

Let’s focus on fixing #HC6 (Replication - The INFO command on a replica) first. Here are two key issues to address:

  1. When recv() returns b'', it means that the connection has been closed by the peer. Make sure to handle this case gracefully, rather than attempting to read further from a closed connection.

  2. The handshake and client connection logic are mixed together.


For the second issue, take a look at the connection refused error:

This error occurs because the replica isn’t actually set up to accept client connections after the handshake. (The code below highlighted in blue only handles the connection to the master.)

The code accepting client connections (highlighted in red) should always run, regardless of whether the server is acting as a master or replica.

Let me know if you need further assistance.

1 Like

Thanks for all the help @andy1li. After making the while True apply every time and not just for non-replica servers - as well as making some other changes - it seems to work now. I´ve passed #YG4.

I have a question about the sockets for the sake of my own learning if you can help clarify things. I assumed that the replica’s handshake socket would be used for all incoming requests. This applies to the set commands still handled by the handshake socket, but not the get commands that require a new socket to manage those. What is the overall logic here? Is the handshake socket only meant to be used for certain requests? Is some other factor at play? I want to understand which socket does what, so I can manage it better in the future.

Thanks again

@mitoeth Nice work. Congrats on passing #YG4! :tada:

To answer your question, let’s break down replica connections based on who initiates them:

  • Outgoing:

    • The replica initiates an outgoing connection to the master to perform the handshake and synchronize data.
    • The master then uses this same connection to send back updates and commands to keep the replica in sync.
  • Incoming:

    • The replica can also accept incoming connection from clients, allowing it to respond to GET commands without interfering with the master connection.

Does this explanation help clarify things?

1 Like

@andy1li Ah OK, thank you very much. Nothing else needed from my side

1 Like

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.