Hi
In the Redis replication section, we are told that a master should replicate WRITE commands (SET, DEL) to its replica.
Further, in the challenge, we are introduced to the REPLCONF GETACK * request from the Master to the Replica to retrieve the replication offset of the replica.
In the challenge “ACKs with commands #yd3”, the test and the challenge documentation explain that when a replica has finished processing a REPLCONF GETACK * from the Master, it should increase its replication offset. The same applies to a PING received from the Master. The test and the docs explain:
"The master sends a REPLCONF GETACK * command.
The replica should respond with REPLCONF ACK 0.
The returned offset is 0 since no commands have been processed yet (before receiving the REPLCONF GETACK command)
The master then sends REPLCONF GETACK * again.
The replica should respond with REPLCONF ACK 37.
The returned offset is 37 since the first REPLCONF GETACK command was processed, and it was 37 bytes long.
The RESP encoding for the REPLCONF GETACK command looks like this: `*3\r\n$8\r\nreplconf\r\n$6\r\ngetack\r\n$1\r\n*\r\n` (that’s 37 bytes long)
The master then sends a PING command to the replica (masters do this periodically to notify replicas that the master is still alive).
The replica must silently process the PING command and update its offset. It should not send a response back to the master.
The master then sends REPLCONF GETACK * again (this is the third REPLCONF GETACK command received by the replica)
The replica should respond with REPLCONF ACK 88.
The returned offset is 88 (37 + 37 + 14)
37 for the first REPLCONF GETACK command
37 for the second REPLCONF GETACK command
14 for the PING command"
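For reference, the byte counts in the quoted steps can be checked with a few lines of Python. This is only a minimal sketch: `encode_command` is a hypothetical helper name (it is not part of the challenge or of Redis), and it assumes commands are sent as RESP arrays of bulk strings.

```python
def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings (hypothetical helper)."""
    out = [f"*{len(parts)}\r\n".encode()]
    for part in parts:
        data = part.encode()
        out.append(f"${len(data)}\r\n".encode() + data + b"\r\n")
    return b"".join(out)


getack = encode_command("REPLCONF", "GETACK", "*")
ping = encode_command("PING")

print(len(getack))                  # 37
print(len(ping))                    # 14
print(2 * len(getack) + len(ping))  # 88, the offset expected in the third ACK
```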
However, this is not how replication offset works!
The replication offset is only increased for WRITE operations. Otherwise, how would a Master figure out whether a Replica is up to date if the Replica's offset also increased for non-write operations?
Example:
Master offset: 0
Replica offset: 0
Master receives SET foo bar
Master offset: 30 (I didn’t do the maths btw)
Replica offset: 0
Master propagates SET foo bar to Replica
Master offset: 30
Replica offset: 30
Master sends REPLCONF GETACK * to the Replica and receives REPLCONF ACK 30
Master offset: 30
Replica offset: 67 (30 + 37 for the GETACK it just processed)
Great, now the Master knows the Replica is up to date.
Some time elapses and the Master sends another REPLCONF GETACK * to the Replica and receives REPLCONF ACK 67
So now the Replica is somehow ahead of the Master.
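The scenario above can be reproduced with a small simulation. This is a rough sketch under the assumptions of this post, not a description of how Redis itself is implemented: the `Master` and `Replica` classes and the `encode_command` helper are hypothetical names, the master is assumed to count only write commands, and the replica counts every byte it receives, as the challenge tests require. With the real RESP encoding, SET foo bar is 31 bytes, so the numbers come out as 31 and 68 rather than the approximate 30 and 67 used above.

```python
def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings (hypothetical helper)."""
    out = [f"*{len(parts)}\r\n".encode()]
    for part in parts:
        data = part.encode()
        out.append(f"${len(data)}\r\n".encode() + data + b"\r\n")
    return b"".join(out)


WRITE_COMMANDS = {"SET", "DEL"}


class Replica:
    def __init__(self) -> None:
        self.offset = 0

    def receive(self, payload: bytes, parts: tuple):
        """Count every received byte; answer GETACK with the offset before it."""
        reply = None
        if [p.upper() for p in parts[:2]] == ["REPLCONF", "GETACK"]:
            reply = self.offset
        self.offset += len(payload)
        return reply


class Master:
    def __init__(self) -> None:
        self.offset = 0

    def send(self, replica: Replica, *parts: str):
        payload = encode_command(*parts)
        # Assumption from this post: only write commands advance the master offset.
        if parts[0].upper() in WRITE_COMMANDS:
            self.offset += len(payload)
        return replica.receive(payload, parts)


master, replica = Master(), Replica()
master.send(replica, "SET", "foo", "bar")
first_ack = master.send(replica, "REPLCONF", "GETACK", "*")
second_ack = master.send(replica, "REPLCONF", "GETACK", "*")

print(master.offset, first_ack, second_ack)  # 31 31 68
```

The second ACK is larger than the master offset even though no new write happened, which is the mismatch described above.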
The Redis replication documentation is clear:
In the previous section we said that if two instances have the same replication ID and replication offset, they have exactly the same data. However it is useful to understand what exactly is the replication ID, and why instances have actually two replication IDs: the main ID and the secondary ID.
A replication ID basically marks a given history of the data set. Every time an instance restarts from scratch as a master, or a replica is promoted to master, a new replication ID is generated for this instance. The replicas connected to a master will inherit its replication ID after the handshake. So two instances with the same ID are related by the fact that they hold the same data, but potentially at a different time. It is the offset that works as a logical time to understand, for a given history (replication ID), who holds the most updated data set.
Btw, I’m not blocked by this. I followed the rule and had the Replica count PING and REPLCONF towards its offset, but that means the CodeCrafters “expected” implementation of the replication offset BREAKS the nature of replication as Redis implements it.
I’m sad that the tests forced me to implement the replication offset incorrectly. Nevertheless, I still really like the challenges.