REPLCONF GETACK should not increase the replication offset

ValentinJub · October 30, 2024, 9:21pm

Hi

In the Redis replication section, we are told that a master should replicate WRITE commands (SET, DEL) to its replica.

Further, in the challenge, we are introduced to the REPLCONF GETACK * request from the Master to the Replica to retrieve the replication offset of the replica.

In the challenge “ACKs with commands #yd3”, the test and the challenge documentation explain that when a replica has finished processing a REPLCONF GETACK * from the Master, it should increase its replication offset. The same goes for a PING received from the Master. The test and the doc explain:

"The master sends a REPLCONF GETACK * command.

The replica should respond with REPLCONF ACK 0.
The returned offset is 0 since no commands have been processed yet (before receiving the REPLCONF GETACK command)

The master then sends REPLCONF GETACK * again.

The replica should respond with REPLCONF ACK 37.
The returned offset is 37 since the first REPLCONF GETACK command was processed, and it was 37 bytes long.
The RESP encoding for the REPLCONF GETACK command looks like this: `*3\r\n$8\r\nreplconf\r\n$6\r\ngetack\r\n$1\r\n*\r\n` (that’s 37 bytes long)

The master then sends a PING command to the replica (masters do this periodically to notify replicas that the master is still alive).

The replica must silently process the PING command and update its offset. It should not send a response back to the master.

The master then sends REPLCONF GETACK * again (this is the third REPLCONF GETACK command received by the replica)

The replica should respond with REPLCONF ACK 88.
The returned offset is 88 (37 + 37 + 14)
37 for the first REPLCONF GETACK command
37 for the second REPLCONF GETACK command
14 for the PING command"
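
As a rough illustration of the sequence quoted above, here is a minimal Python sketch of a replica that counts the bytes of every command it reads from the master and replies only to REPLCONF GETACK. The names `parse_command` and `run_replica` are made up for this example and are not part of Redis or the CodeCrafters tester. The sketch also assumes each command arrives complete in the buffer.

```python
def parse_command(buf: bytes):
    """Parse one RESP array of bulk strings from the start of buf.

    Returns (args, bytes_consumed). Assumes buf starts with a complete command.
    """
    assert buf[:1] == b"*"
    end = buf.index(b"\r\n")
    count = int(buf[1:end])
    pos = end + 2
    args = []
    for _ in range(count):
        assert buf[pos:pos + 1] == b"$"
        end = buf.index(b"\r\n", pos)
        length = int(buf[pos + 1:end])
        start = end + 2
        args.append(buf[start:start + length])
        pos = start + length + 2  # skip the payload and its trailing CRLF
    return args, pos


def run_replica(stream: bytes):
    """Process every command in stream; return (ACK offsets sent, final offset)."""
    offset = 0
    acks = []
    while stream:
        args, consumed = parse_command(stream)
        if args[0].upper() == b"REPLCONF" and args[1].upper() == b"GETACK":
            acks.append(offset)  # reply with the offset before this command
        # PING is processed silently: no reply, but its bytes still count
        offset += consumed
        stream = stream[consumed:]
    return acks, offset


GETACK = b"*3\r\n$8\r\nreplconf\r\n$6\r\ngetack\r\n$1\r\n*\r\n"  # 37 bytes
PING = b"*1\r\n$4\r\nping\r\n"  # 14 bytes

acks, final_offset = run_replica(GETACK + GETACK + PING + GETACK)
print(acks)          # [0, 37, 88]
print(final_offset)  # 125
```

The third ACK is 88 because it is sent before the third GETACK's own 37 bytes are added to the offset.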

However, this is not how replication offset works!

The replication offset is only increased for WRITE operations. Otherwise, how would a Master figure out whether a Replica is up to date if the Replica's offset also increased for non-write operations?

Example:

Master offset: 0
Replica offset: 0

Master receives SET foo bar

Master offset: 30 (I didn’t do the maths btw)
Replica offset: 0

Master propagates SET foo bar to Replica

Master offset: 30
Replica offset: 30

Master sends REPLCONF GETACK * to replica and receives REPLCONF ACK 30

Master offset: 30
Replica offset: 67

Great, now the Master knows the Replica is up to date.
Some time elapses and the Master sends another REPLCONF GETACK * to the replica and receives REPLCONF ACK 67

So now the Replica is somehow ahead of the Master.

The Redis replication documentation is clear:

In the previous section we said that if two instances have the same replication ID and replication offset, they have exactly the same data. However it is useful to understand what exactly is the replication ID, and why instances have actually two replication IDs: the main ID and the secondary ID.

A replication ID basically marks a given history of the data set. Every time an instance restarts from scratch as a master, or a replica is promoted to master, a new replication ID is generated for this instance. The replicas connected to a master will inherit its replication ID after the handshake. So two instances with the same ID are related by the fact that they hold the same data, but potentially at a different time. It is the offset that works as a logical time to understand, for a given history (replication ID), who holds the most updated data set.

Btw, I’m not blocked by this. I abided by the rule of having the Replica count PING and REPLCONF toward its offset, but that means that the CodeCrafters “expected” implementation of the replication offset BREAKS the nature of replication as Redis implements it.

I’m sad that the tests forced me to implement the replication offset incorrectly. Nevertheless, I still really like the challenges.

rohitpaulk · October 30, 2024, 9:38pm

The replication offset is only increased for WRITE operations

Can you share a source for this? I understand it isn’t super intuitive, but we verify our tester against Redis + I remember researching this fact when we were working on the extension.

Master sends REPLCONF GETACK * to replica and receives REPLCONF ACK 30

Master offset: 30
Replica offset: 67

This is incorrect. When a master sends REPLCONF GETACK *, it’ll update its offset too (to 67). The offset is a count of how many bytes were written to the “replication stream”, and the GETACK command is appended to the replication stream so the offset would increase.
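
To make that concrete, here is a small Python sketch, under the assumption that both sides simply add the length of every command written to the replication stream. The `encode` helper is hypothetical and only mimics RESP array encoding. Note that SET foo bar actually encodes to 31 bytes, not the rough 30 used in the example above.

```python
def encode(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


master_offset = 0
replica_offset = 0
history = []

commands = [
    ("SET", "foo", "bar"),
    ("REPLCONF", "GETACK", "*"),
    ("REPLCONF", "GETACK", "*"),
]

for command in commands:
    payload = encode(*command)
    # The replica answers GETACK with its offset before counting the GETACK itself.
    ack = replica_offset if command[0] == "REPLCONF" else None
    master_offset += len(payload)   # the master counts what it writes to the stream
    replica_offset += len(payload)  # the replica counts what it reads from the stream
    history.append((command[0], ack, master_offset, replica_offset))

for row in history:
    print(row)
# ('SET', None, 31, 31)
# ('REPLCONF', 31, 68, 68)
# ('REPLCONF', 68, 105, 105)
```

In this model the replica never gets ahead of the master: each ACK equals the master's offset just before it sent that GETACK.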

ValentinJub · October 30, 2024, 9:45pm

Okay, so I proved myself wrong by trying to prove I was right, but at least I have the answer, and you guys are right.

github.com/redis/redis

[QUESTION] master_repl_offset increased without write operations

opened 03:05AM - 23 Mar 23 UTC

polaris-alioth

In master-slave replication (1 master, 1 slave), I find that `master_repl_offset` increases by 14 every 10 seconds (I did not do any write operations), e.g.

```
127.0.0.1:9091> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=9092,state=online,offset=28,lag=0
master_failover_state:no-failover
master_replid:bc3c679709bc361f8d188499a717863cb3f08ef3
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:28
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:28

127.0.0.1:9091> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=9092,state=online,offset=42,lag=0
master_failover_state:no-failover
master_replid:bc3c679709bc361f8d188499a717863cb3f08ef3
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:42
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:42

127.0.0.1:9091> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=9092,state=online,offset=42,lag=1
master_failover_state:no-failover
master_replid:bc3c679709bc361f8d188499a717863cb3f08ef3
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:56
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:56
```

Scanning the code, I find that `master_repl_offset` is increased by `ping`: without any write operations, `master_repl_offset` still increases. I'm wondering if this is a mechanism or a bug?

```
if (!manual_failover_in_progress) {
    ping_argv[0] = shared.ping;
    replicationFeedSlaves(server.slaves, -1, ping_argv, 1);
}
```

The master_repl_offset is indeed increased when the master sends PING to the replicas.
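
The numbers in that issue line up with the size of a PING in the replication stream. A quick check in Python, assuming the heartbeat is encoded as a one-element RESP array:

```python
PING = b"*1\r\n$4\r\nPING\r\n"  # assumed wire encoding of the heartbeat

start = 28  # first master_repl_offset shown in the issue
offsets = [start + len(PING) * n for n in range(3)]

print(len(PING))  # 14
print(offsets)    # [28, 42, 56]
```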

I’ll do a better search next time, but the Redis doc says the replication offset keeps the history of the data set, and PING is not part of the data set.

rohitpaulk · October 30, 2024, 9:48pm

Nice find! I’ll keep an eye out for other users who run into this and might include that link in the stage instructions down the line

system · November 4, 2024, 9:49pm

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.
