Clarification needed for how many bytes to read

Nicholas-EG · May 13, 2025, 11:15pm

I’m stuck on Stage #MG6.

When dealing with Git objects in the pack file, I understand that the object header encodes information about the object type and the size of the object. Can someone clarify if the size is referring to the size of the compressed object or the size of the uncompressed object? I’ve read sources claiming it both ways, and the Git documentation isn’t clear on this. If it is referring to the size of the uncompressed object, can someone also clarify how to figure out how many compressed bytes we should read for each object? It seems like the 0/1 MSB indicator only applies to the object headers, so I’m not sure how many bytes I need to feed into my decompressing function.

Any help and links are appreciated!

andy1li · May 17, 2025, 11:20am

> When dealing with Git objects in the pack file, I understand that the object header encodes information about the object type and the size of the object. Can someone clarify if the size is referring to the size of the compressed object or the size of the uncompressed object? I’ve read sources claiming it both ways, and the Git documentation isn’t clear on this.

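Hey @Nicholas-EG, it refers to the uncompressed size of the object. You can confirm this by generating a minimal pack file and inspecting an object header byte.

For example, the header byte 0x36 translates to 0 011 0110 in binary: the leading 0 is the MSB (no further size bytes follow), 011 is the object type (3, a blob), and 0110 is the size (6 bytes, uncompressed).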
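
A minimal Java sketch of that decoding, extended to headers longer than one byte. The class and method names (PackObjectHeader, parse) are made up for illustration. The layout assumed is the usual pack one: the first byte holds the MSB continuation flag, 3 type bits, and the low 4 size bits, and each following byte contributes 7 more size bits, least significant group first.

```java
class PackObjectHeader {
    final int type;         // 1=commit, 2=tree, 3=blob, 4=tag, 6=ofs-delta, 7=ref-delta
    final long size;        // uncompressed size in bytes
    final int headerLength; // number of header bytes consumed

    PackObjectHeader(int type, long size, int headerLength) {
        this.type = type;
        this.size = size;
        this.headerLength = headerLength;
    }

    // Hypothetical helper: decode the object header that starts at data[offset].
    static PackObjectHeader parse(byte[] data, int offset) {
        int pos = offset;
        int b = data[pos++] & 0xFF;
        int type = (b >> 4) & 0x07; // bits 6..4 of the first byte
        long size = b & 0x0F;       // bits 3..0 are the lowest size bits
        int shift = 4;
        while ((b & 0x80) != 0) {   // MSB set: another size byte follows
            b = data[pos++] & 0xFF;
            size |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        return new PackObjectHeader(type, size, pos - offset);
    }

    public static void main(String[] args) {
        PackObjectHeader h = parse(new byte[] {0x36}, 0);
        // prints: type=3 size=6 headerLength=1
        System.out.println("type=" + h.type + " size=" + h.size + " headerLength=" + h.headerLength);
    }
}
```

The zlib stream for the object starts at offset + headerLength (for the two delta types, a base reference sits between the header and the stream).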

If it is referring to the size of the uncompressed object, can someone also clarify how to figure out how many compressed bytes we should read for each object? It seems like the 0/1 MSB indicator only applies to the object headers, so I’m not sure how many bytes I need to feed into my decompressing function.

In fact, you don’t need to manually determine the compressed size. Zlib streams are self-terminating. You just need to feed bytes into a zlib decompressor, and it will stop exactly where the compressed object ends.

Here’s an AI-generated code example in Java:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

public class GitPackDecompressor {

    public static void main(String[] args) {
        // Concatenated zlib-compressed Git objects ("abc\n" and "hello world" as examples)
        byte[] concatenatedCompressed = new byte[] {
            // Object 1: "abc\n"
            (byte) 0x78, (byte) 0x9c, (byte) 0x4b, (byte) 0x4c,
            (byte) 0x4a, (byte) 0xe6, (byte) 0x02, (byte) 0x00,
            (byte) 0x03, (byte) 0x7e, (byte) 0x01, (byte) 0x31,
            // Object 2: "hello world" (the last four bytes are its Adler-32 checksum)
            (byte) 0x78, (byte) 0x9c, (byte) 0xcb, (byte) 0x48,
            (byte) 0xcd, (byte) 0xc9, (byte) 0xc9, (byte) 0x57,
            (byte) 0x28, (byte) 0xcf, (byte) 0x2f, (byte) 0xca,
            (byte) 0x49, (byte) 0x01, (byte) 0x00, (byte) 0x1a,
            (byte) 0x0b, (byte) 0x04, (byte) 0x5d
        };

        int offset = 0;

        while (offset < concatenatedCompressed.length) {
            try {
                // nowrap=false: expect the zlib header and Adler-32 trailer, as in pack files
                Inflater inflater = new Inflater(false);
                InflaterInputStream inflaterInput = new InflaterInputStream(
                        new ByteArrayInputStream(concatenatedCompressed, offset, concatenatedCompressed.length - offset),
                        inflater
                );

                StringBuilder output = new StringBuilder();
                int b;
                while ((b = inflaterInput.read()) != -1) {
                    output.append((char) b);
                }

                System.out.println("Decompressed object: " + output);

                // getBytesRead() is the total compressed input this inflater consumed,
                // which stays correct even when the stream has buffered ahead
                int consumed = (int) inflater.getBytesRead();
                offset += consumed;

                inflater.end();
            } catch (IOException e) {
                System.err.println("Decompression error at offset " + offset + ": " + e.getMessage());
                break;
            }
        }
    }
}
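
A variant that ties the two answers together: since the header gives the uncompressed size, the output buffer can be allocated up front and Inflater used directly, without a stream wrapper. This is only a sketch. The names PackInflateSketch, Result, and inflateAt are hypothetical, the sizes are hard-coded here rather than parsed from real object headers, and the sample bytes are zlib streams for "abc\n" and "hello world".

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

class PackInflateSketch {
    // Two zlib streams back to back: "abc\n" (12 bytes), then "hello world" (19 bytes).
    static final byte[] SAMPLE = {
        0x78, (byte) 0x9c, 0x4b, 0x4c, 0x4a, (byte) 0xe6, 0x02, 0x00, 0x03, 0x7e, 0x01, 0x31,
        0x78, (byte) 0x9c, (byte) 0xcb, 0x48, (byte) 0xcd, (byte) 0xc9, (byte) 0xc9, 0x57,
        0x28, (byte) 0xcf, 0x2f, (byte) 0xca, 0x49, 0x01, 0x00, 0x1a, 0x0b, 0x04, 0x5d
    };

    // Inflated bytes of one object plus the number of compressed bytes it occupied.
    static final class Result {
        final byte[] data;
        final int consumed;

        Result(byte[] data, int consumed) {
            this.data = data;
            this.consumed = consumed;
        }
    }

    // Inflate the zlib stream starting at pack[offset]; expectedSize comes from the object header.
    static Result inflateAt(byte[] pack, int offset, int expectedSize) throws DataFormatException {
        Inflater inflater = new Inflater(); // default: zlib header and Adler-32 trailer expected
        try {
            // Hand over everything that remains; the inflater stops at the end of its stream.
            inflater.setInput(pack, offset, pack.length - offset);
            byte[] out = new byte[expectedSize];
            int written = 0;
            while (!inflater.finished()) {
                int n = inflater.inflate(out, written, out.length - written);
                written += n;
                if (n == 0 && !inflater.finished()) {
                    throw new DataFormatException("stream truncated or longer than the header size");
                }
            }
            if (written != expectedSize) {
                throw new DataFormatException("stream shorter than the header size");
            }
            // Must be read before end(): total compressed bytes consumed for this object.
            return new Result(out, (int) inflater.getBytesRead());
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) throws DataFormatException {
        int[] sizes = {4, 11}; // in a real pack these come from each object header
        int offset = 0;
        for (int size : sizes) {
            Result r = inflateAt(SAMPLE, offset, size);
            System.out.println(r.consumed + " compressed bytes -> "
                    + new String(r.data, StandardCharsets.US_ASCII).trim());
            offset += r.consumed; // the next object starts right after this stream
        }
    }
}
```

Comparing the inflated length against the header size also serves as a cheap sanity check that parsing has not drifted out of sync.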

system · June 1, 2025, 7:09pm

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.
