Replication stage (#NA2) No ACKs received from replicas?

I’m stuck on Redis replication stage 18.

Here are my logs:

[replication-18] client: $ redis-cli SET foo 123
[replication-18] Received "OK"
[replication-18] client: $ redis-cli WAIT 1 500
[replication-18] Testing Replica : 1
[replication-18] Received ["SET", "foo", "123"]
[your_program] [ "set", "foo", "123", "" ]
[your_program] [ "wait", "1", "500", "" ]
[replication-18] Received ["REPLCONF", "GETACK", "*"]
[replication-18] replica-1: $ redis-cli REPLCONF ACK 197
[replication-18] Testing Replica : 2
[replication-18] Received ["SET", "foo", "123"]
[your_program] [ "replconf", "ack", "197", "" ]
[replication-18] Received ["REPLCONF", "GETACK", "*"]
[replication-18] Testing Replica : 3
[replication-18] Received ["SET", "foo", "123"]
[replication-18] Received ["REPLCONF", "GETACK", "*"]
[replication-18] Received: ""
[replication-18]            ^ error

GitHub link for the code. After the expected acknowledgements arrive, the response is not sent.

Interesting. I think I’m seeing the same issue. I hope someone takes a look at this. I was asserting that I get a proper ‘REPL ACK ’ back, but my server doesn’t receive anything from the slave, so the test doesn’t pass.

I’m stuck on Replication 18 WAIT with multiple commands.

I tried printing the messages received from the slave, but I receive nothing in this stage. I get the correct messages in my own environment.

Here are my logs:

[your_program] server receive *3
[your_program] $4
[your_program] WAIT
[your_program] $1
[your_program] 3
[your_program] $3
[your_program] 500
[your_program] Server offset: 0
[your_program] Master try to get offset of 12
[your_program] Master try to get offset of 11
[your_program] Master try to get offset of 10
[your_program] Master try to get offset of 9
[your_program] Master try to get offset of 8
[your_program] Master try to get offset of 7
[your_program] Master try to get offset of 6
[your_program] Master try to get offset of 5
[replication-17] client: Received bytes: ":0\r\n"
[replication-17] client: Received RESP value: 0
[replication-17] Expected 8, got 0
[replication-17] Test failed
[replication-17] Terminating program
[replication-17] Program terminated successfully

And here’s a snippet of my code:

while (true) {
    bool reply = (server->ismaster) || (clientfd != server->masterfd);
    int cur_location = 0;
    std::string replyMsg(server->clients[clientfd].buffer.begin(), server->clients[clientfd].buffer.end());
    ReplyParser parser(replyMsg);
    int msg_len = parser.parseReply(cur_location);
    if (msg_len <= 0) // incomplete message
        return -1;
    std::cout << "server receive " << replyMsg << std::endl;
    // ...
    case WAIT: {
        auto now = std::chrono::steady_clock::now();
        std::chrono::milliseconds duration(std::stoi(parser.tokens[1]));
        auto expire_timepoint = now + duration;
        std::thread t(waitprocess, server, clientfd, std::stoi(parser.tokens[0]), expire_timepoint);
        // Send REPLCONF GETACK * to each fd in needReplica_fd, removing it from the list once sent.
        for (auto it = server->needReplica_fd.begin(); it != server->needReplica_fd.end();) {
            std::cout << "Master try to get offset of " << *it << std::endl;
            if (send(*it, "*3\r\n$8\r\nREPLCONF\r\n$6\r\nGETACK\r\n$1\r\n*\r\n", 37, 0) < 0)
                std::cout << "Send GETACK ERROR" << std::endl;
            it = server->needReplica_fd.erase(it);
        }
        // ...
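
A side note on the hard-coded `37` in the `send` call above: the count is correct for `REPLCONF GETACK *`, but hand-counted lengths are easy to get wrong when the command changes. A minimal sketch of building the RESP array and letting the string carry its own length, assuming a helper named `encodeRespArray` (a hypothetical name, not something from the code above):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: encode a command as a RESP array of bulk strings,
// e.g. {"REPLCONF", "GETACK", "*"}.
std::string encodeRespArray(const std::vector<std::string>& args) {
    assert(!args.empty() && "a command needs at least a name");
    // Array header: "*<number of elements>\r\n"
    std::string out = "*" + std::to_string(args.size()) + "\r\n";
    for (const std::string& arg : args) {
        // Each element: "$<byte length>\r\n<bytes>\r\n"
        out += "$" + std::to_string(arg.size()) + "\r\n" + arg + "\r\n";
    }
    return out;
}
```

With this, the call could become something like `send(fd, msg.data(), msg.size(), 0)` where `msg = encodeRespArray({"REPLCONF", "GETACK", "*"})`, so the length always matches the payload.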

Yep, I think we just need proper instructions here. It is expected that some slaves will not respond to GETACK requests with an ACK; that’s what we explicitly test in this stage.

Will keep this open until we’ve improved the instructions!
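
To make that concrete: since some replicas stay silent, WAIT cannot block until every replica answers. It has to reply with the number of ACKs it has collected, either as soon as enough have arrived or when the timeout expires. Below is a rough sketch of that decision, not the tester's or anyone's actual implementation. `PendingWait`, `AckTable`, `countAcked` and `tryResolveWait` are made-up names, and using the master's offset at the time WAIT arrived as the target is an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <unordered_map>

using Clock = std::chrono::steady_clock;

// Hypothetical bookkeeping for one in-flight WAIT command.
struct PendingWait {
    int needed;                  // numreplicas argument of WAIT
    std::int64_t targetOffset;   // assumed: master offset when WAIT arrived
    Clock::time_point deadline;  // arrival time + timeout argument
};

// Latest offset each replica reported via REPLCONF ACK, keyed by socket fd.
using AckTable = std::unordered_map<int, std::int64_t>;

// Count replicas whose acknowledged offset has reached the target.
int countAcked(const AckTable& acks, std::int64_t targetOffset) {
    int n = 0;
    for (const auto& entry : acks) {
        if (entry.second >= targetOffset) ++n;
    }
    return n;
}

// Returns the integer to send back to the client,
// or -1 if WAIT should keep waiting for more ACKs.
int tryResolveWait(const PendingWait& w, const AckTable& acks, Clock::time_point now) {
    assert(w.needed >= 0);
    const int acked = countAcked(acks, w.targetOffset);
    if (acked >= w.needed || now >= w.deadline) {
        return acked;  // enough ACKs, or timed out: reply with what we have
    }
    return -1;  // silent replicas may never answer; the deadline ends the wait
}
```

The idea is to call `tryResolveWait` whenever a `REPLCONF ACK` arrives and again when the deadline passes. If every replica is entered into the table with offset 0 when it connects, a WAIT issued before any write resolves immediately with the number of connected replicas, which, if I read it right, is what the earlier "Expected 8, got 0" log is checking.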

Just adding a quick note – we haven’t updated the instructions here yet, but we do have better logs now.

Same here, I’ve been stuck on this stage for an embarrassingly long time. For some reason, none of the replicas respond except the first one.

If anyone knows the reason for this, any help is greatly appreciated. I couldn’t come up with a working solution for this one, I’m afraid.

As noted above (^), it is expected that some slaves will not respond with ACKs in this stage.

Note: I’ve updated the title of this post to include the stage ID (#NA2). You can learn about the stages rename here: Upcoming change: Stages overhaul.

The logs now clearly mention when a replica ACKs vs. doesn’t ACK a message - marking as closed!