I’m stuck on stage 11 of replication in the redis challenge.
As far as I can tell things should be operating as expected but I can’t tell what about the behavior isn’t meeting this step’s expectations.
Here are my logs:
[your_program] command: Set("baz", "789", None)
[replication-11] client: Received bytes: "+OK\r\n"
[replication-11] client: Received RESP value: "OK"
[replication-11] Received "OK"
[your_program] replicating set command to slave (127.0.0.1, 37342): "*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\n123\r\n"
[your_program] replicating set command to slave (127.0.0.1, 37342): "*3\r\n$3\r\nSET\r\n$3\r\nbar\r\n$3\r\n456\r\n"
[your_program] replicating set command to slave (127.0.0.1, 37342): "*3\r\n$3\r\nSET\r\n$3\r\nbaz\r\n$3\r\n789\r\n"
[replication-11] replica: Received bytes: ""
[replication-11] Received: ""
[replication-11] ^ error
[replication-11] Error: Expected start of a new RESP value (either +, -, :, $ or *)
[replication-11] Test failed
[replication-11] Terminating program
[your_program] thread 'tokio-runtime-worker' panicked at src/main.rs:470:77:
[your_program] failed to write set command to replication client: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
From the logs point of view, improvements I can think of:
We should make it clear that after receiving the “OK”, we’re now checking for commands to be propagated on the replication connection. The next “Received bytes: xxx” log doesn’t make sense without this.
The replica: Received bytes: "" can be confusing when we actually didn’t read anything, we should probably omit that log entirely if we read 0 bytes.
The second Received: "" log is a tad confusing too, maybe we can say something more like Received: "" (no content received after 2 seconds).
Re: the issue itself, I think the last line is most relevant:
failed to write set command to replication client: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
I’d expect this to happen if the connection was prematurely closed - looks like the handshake succeeded, so this must’ve happened after the handshake, and before the first command was propagated.
I haven’t taken a look at the code yet, thought I’d write down what I could infer from the logs first. Will take a look at the code sometime today!
thanks so much for taking a look! very much appreciate your reflections regarding the logs.
that’s really strange, i just ran the same thing checking out that commit (i’ve been noodling around a bunch locally since) and i still got the same output (attached).
Was about to go to bed and realized that there could be a conflict between the write & read parts since they use the same lock, and that turned out to be it:
This obviously isn’t a “proper” solution, since you’ll need to read from the client in later stages. Maybe using separate read / write locks will help?
This explains the behaviours we were seeing:
The “replicating to …” log happens before the lock is acquired, and at that point the program is waiting on the lock, which is held by the “read” call at the start of the loop.
The tester waiting for 2 seconds to receive a value, but eventually gave up. That caused the “Received: ‘’” error.
When the tester gave up, it closed the replica’s connection, causing the “broken pipe” error.
Ahh thank you so much Paul! I was already mulling over moving replication to use channels, but I’d have been a bit dissatisfied without understanding why that will resolve it. Thanks again!