Download a piece #nd2: Connection reset by peer

I’m stuck on Stage #ND2.

I’ve followed all the information in the hints and, interestingly, my code works locally when I try to download any piece (0, 1, 2) from any server from the sample.torrent file via:

./your_bittorrent.sh download_piece -o test-piece-0 sample.torrent 0

I am able to successfully verify that the hashes of the downloaded piece match the associated piece hash in the torrent file.

However, running the same code with codecrafters test always results in the same error:

[tester::#ND2] Running tests for Stage #ND2 (Download a piece)
[tester::#ND2] Running ./your_bittorrent.sh download_piece -o /tmp/torrents2250797056/piece-2 /tmp/torrents2250797056/congratulations.gif.torrent 2
[your_program] Connecting to 161.35.46.221:51486
[your_program] Traceback (most recent call last):
[your_program]   File "<frozen runpy>", line 198, in _run_module_as_main
[your_program]   File "<frozen runpy>", line 88, in _run_code
[your_program]   File "/app/app/main.py", line 330, in <module>
[your_program]     main()
[your_program]   File "/app/app/main.py", line 319, in main
[your_program]     piece = download_piece(torrent_metainfo, piece_index)
[your_program]             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[your_program]   File "/app/app/main.py", line 230, in download_piece
[your_program]     handshake_resp, s = peer_handshake(metainfo, peers[0])
[your_program]                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[your_program]   File "/app/app/main.py", line 213, in peer_handshake
[your_program]     resp = s.recv(len(handshake))
[your_program]            ^^^^^^^^^^^^^^^^^^^^^^
[your_program] ConnectionResetError: [Errno 104] Connection reset by peer
[tester::#ND2] Application didn't terminate successfully without errors. Expected 0 as exit code, got: 1
[tester::#ND2] Test failed

And here’s a snippet of my code:

def peer_handshake(
    metainfo: TorrentMetainfo, addr: tuple[str, int]
) -> tuple[bytes, socket.socket]:
    ip, port = addr
    print(f"Connecting to {ip}:{port}")

    handshake = (
        (19).to_bytes(1)
        + b"BitTorrent protocol"
        + (0).to_bytes(8)
        + metainfo.info_hash
        + (112233445566778899).to_bytes(20)
    )

    s = socket.socket()
    s.connect((ip, int(port)))
    s.send(handshake)

    resp = s.recv(len(handshake))
    return resp, s


def recv_message(s: socket.socket) -> bytes:
    msg_len = int.from_bytes(s.recv(4))
    print(f"Received message of length {msg_len}")

    msg = s.recv(msg_len)
    while len(msg) < msg_len:
        msg += s.recv(msg_len - len(msg))

    return msg


def download_piece(metainfo: TorrentMetainfo, piece_idx: int):
    peers = fetch_peers(metainfo)
    handshake_resp, s = peer_handshake(metainfo, peers[0])

    # Wait for bitfield message
    msg = recv_message(s)
    assert msg[0] == 5

    # Send interested message
    s.send((1).to_bytes(4) + (2).to_bytes(1))

    # Wait for unchoke message
    msg = recv_message(s)
    assert msg[0] == 1

    # Determine size of requested piece
    piece_size = metainfo.piece_length
    num_pieces = len(metainfo.piece_hashes)
    if piece_idx == num_pieces - 1:
        piece_size = metainfo.length - (piece_size * (num_pieces - 1))

    # Determine number of blocks making up the piece
    num_blocks = math.ceil(piece_size / STD_BLOCK_SIZE)

    # Send request messages for the piece blocks
    for b in range(num_blocks):
        # Smaller block size for last block
        block_size = STD_BLOCK_SIZE
        if b == num_blocks - 1:
            block_size = piece_size - (STD_BLOCK_SIZE * (num_blocks - 1))

        payload = (
            (piece_idx).to_bytes(4)
            + (b * STD_BLOCK_SIZE).to_bytes(4)
            + (block_size).to_bytes(4)
        )
        s.send((1 + len(payload)).to_bytes(4) + (6).to_bytes(1) + payload)

        print(f"Sent request for piece {piece_idx} block {b}")

    # Wait for piece messages
    recv_blocks = []
    while len(recv_blocks) < num_blocks:
        msg = recv_message(s)
        assert msg[0] == 7

        p = int.from_bytes(msg[1:5])
        begin = int.from_bytes(msg[5:9])
        block = msg[9:]

        print(f"Received piece {p} block {begin // STD_BLOCK_SIZE}")
        recv_blocks.append((begin, block))

    # Create piece from blocks
    piece = b"".join([block for _, block in recv_blocks])

    # Verify piece hash
    assert hashlib.sha1(piece).digest() == metainfo.piece_hashes[piece_idx]

    return piece
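
A side note on the snippet above: socket.recv may return fewer bytes than requested, so a single recv call for the handshake or the 4-byte length prefix can come back short, and the loop in recv_message would spin forever if the peer closed mid-message (recv then returns b""). The sketch below is one way to guard against that, assuming Python 3 and blocking sockets. recv_exact is a hypothetical helper name, not something from the course or the original code, and this would not by itself prevent the "Connection reset by peer" error shown in the traceback.

```python
import socket


def recv_exact(s: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from s, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = s.recv(n - len(buf))
        if not chunk:
            # An empty read means the peer closed the connection
            raise ConnectionError(f"peer closed after {len(buf)} of {n} bytes")
        buf += chunk
    return bytes(buf)


def recv_message(s: socket.socket) -> bytes:
    """Read one length-prefixed peer message, skipping keep-alives."""
    while True:
        msg_len = int.from_bytes(recv_exact(s, 4), "big")
        if msg_len > 0:  # a zero length prefix is a keep-alive with no body
            return recv_exact(s, msg_len)
```

With this in place, the handshake response could be read with recv_exact(s, len(handshake)) instead of a single s.recv call.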

Same issue here. I’m seeing the exact same problem using Rust.

I wonder if this is intentional, since some peers might fail to send data or send incorrect data.

I tried skipping peers if they failed, but they all failed.

This definitely isn’t intentional - our peers should be reliable, and the only circumstance under which they’d close a connection would be if there were duplicate peer connections initiated from the same IP.

@warpftl thanks for sharing your code, we’ll give this a try (cc: @andy1li).

@shide1989 & @afresquet could you share your code please? Here’s how you can publish to GitHub: Publish to GitHub - CodeCrafters.

Code in GitHub

“only circumstance under which they’d close a connection would be if there were duplicate peer connections initiated from the same IP.”

That was my issue! I was reconnecting for each piece with the same peer, fixed that and the tests passed, thank you. Now to try to make it work with multiple peers at a time.
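
For anyone hitting the same thing, here is a rough sketch of the idea in Python (to match the code earlier in the thread): open one connection, reuse it for every piece, and close it at the end. The names download_all, open_connection and fetch_piece are hypothetical placeholders for your own handshake and per-piece logic, not part of any library or of the code posted above.

```python
import socket
from typing import Callable


def download_all(
    open_connection: Callable[[], socket.socket],
    fetch_piece: Callable[[socket.socket, int], bytes],
    num_pieces: int,
) -> bytes:
    """Download every piece over a single peer connection.

    open_connection and fetch_piece are hypothetical stand-ins for the
    handshake and the per-piece request/response exchange.
    """
    s = open_connection()  # one connection and handshake for the whole download
    try:
        # Reuse the same socket for each piece instead of reconnecting
        return b"".join(fetch_piece(s, i) for i in range(num_pieces))
    finally:
        s.close()  # close the connection once all pieces are done
```

The point of the design is simply that the peer only ever sees one connection from this client, so the duplicate-connection check mentioned above should not be triggered by the client itself.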

I tried going through all the peers to find a working one (instead of always choosing the first one), and now my tests pass some of the time, but not every time.

Relevant updated code:

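def get_socket(metainfo: TorrentMetainfo) -> socket.socket:
    peers = fetch_peers(metainfo)

    for p in peers:
        try:
            _, s = peer_handshake(metainfo, p)
            return s
        except OSError:
            # Connection refused or reset by this peer; try the next one
            continue

    raise ConnectionError("Handshake failed with every peer")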


def download_piece(metainfo: TorrentMetainfo, piece_idx: int):
    s = get_socket(metainfo)

    ...

It seems that there is some flakiness in peer responses.

To me, it feels like some test server load or concurrency issue.

I got the error 95% of the time yesterday, and now (noon European time, when most people in the Americas are asleep) I tested and resubmitted the unchanged code and it ran flawlessly.

@rohitpaulk here’s my code. I had to make some fixes, but I’m still getting the “Connection reset by peer” problem during the handshake.

Ok, so running codecrafters submit worked perfectly, but running codecrafters test will fail with the connection reset error.

@afresquet We’re currently investigating the handshake issue. While it might take some time, I’ll keep you updated once we have a resolution.

This is what it turned out to be! Our hosted peers (we use an off-the-shelf BitTorrent client that is battle-tested & production-ready) have a check that prevents multiple connections from the same IP - and our test runners can occasionally share outbound IPs, especially when we have a lot of activity and have to place multiple runners on the same machine.

We’re looking into a solution here, will update once done!

@warpftl In addition to the test runner IP issue Paul mentioned, peers will not accept multiple connections with the same peer_id either. Using a constant ID like 00112233445566778899 increases the likelihood of colliding with another CodeCrafters user. Apologies, we’ll be updating course instructions about this.
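
As a minimal sketch of the peer_id point above, assuming Python 3: generate a fresh random 20-byte ID on each run and use it when building the 68-byte handshake. The function names random_peer_id and build_handshake are hypothetical, chosen for illustration only.

```python
import os


def random_peer_id() -> bytes:
    """Return a fresh random 20-byte peer ID for this run."""
    return os.urandom(20)


def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """Assemble the 68-byte BitTorrent handshake message."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must each be 20 bytes")
    return (
        bytes([19])                 # length of the protocol string
        + b"BitTorrent protocol"    # protocol string
        + bytes(8)                  # eight reserved bytes, all zero
        + info_hash
        + peer_id
    )
```

Generating the ID once per run and reusing it for every connection in that run keeps it stable within a session while making collisions with other users very unlikely.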

@sarp Ah that makes sense, I can see why this might be an issue.

I tried a few more test runs after changing the peer ID to a random number, but there seems to be no major improvement in the failure rate (it still fails more than half the time). Even if the last stage passes (downloading the entire file), the older stage (downloading a single piece) often fails, which is a blocker for moving ahead.

I was wondering if the logs could include more detail about why the connection is being reset (reused peer ID, connection from the same IP, invalid handshake or peer message). This would be helpful in debugging the specific issue.

Just wanted to add that I am experiencing the same issues as well. I was able to successfully download and validate the pieces and the entire file for all provided torrents, yet it fails both times on the test runner.

I understand this is being worked on; I just wanted to confirm that I am also seeing the issue. Best of luck!

@mole-179 Thanks for sharing your experience! We’ll keep you updated on any progress.

@rohitpaulk @sarp I saw the post in the other thread and tried running my code a few more times.

It seems to pass consistently now, thank you for fixing this!

It is working for me now. Thank you!
