Clarification needed for how many bytes to read

I’m stuck on Stage #MG6.

When dealing with Git objects in the pack file, I understand that the object header encodes information about the object type and the size of the object. Can someone clarify if the size is referring to the size of the compressed object or the size of the uncompressed object? I’ve read sources claiming it both ways, and the Git documentation isn’t clear on this. If it is referring to the size of the uncompressed object, can someone also clarify how to figure out how many compressed bytes we should read for each object? It seems like the 0/1 MSB indicator only applies to the object headers, so I’m not sure how many bytes I need to feed into my decompressing function.

Any help and links are appreciated!

When dealing with Git objects in the pack file, I understand that the object header encodes information about the object type and the size of the object. Can someone clarify if the size is referring to the size of the compressed object or the size of the uncompressed object? I’ve read sources claiming it both ways, and the Git documentation isn’t clear on this.

Hey @Nicholas-EG, it refers to the uncompressed size of the object. You can confirm this by generating a minimal pack file:

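The byte 0x36 translates to 0 011 0110 in binary: the leading 0 is the MSB continuation flag (no further size bytes follow), 011 is the object type (3, a blob), and 0110 is the size (6).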
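To make the bit layout concrete, here's a minimal sketch of decoding a pack object header in Java. The class and field names (`PackObjectHeader`, `parse`, `headerLength`) are made up for illustration and not taken from any library. It assumes the usual layout: the first byte holds the continuation bit, 3 type bits and the low 4 size bits, and each following byte contributes 7 more size bits, least significant group first.

```java
// Hypothetical helper, not part of any library: decodes the type and the
// uncompressed size from a pack object header that starts at `offset`.
final class PackObjectHeader {
    final int type;          // 1=commit, 2=tree, 3=blob, 4=tag, 6=ofs-delta, 7=ref-delta
    final long size;         // size of the data AFTER inflation
    final int headerLength;  // number of header bytes consumed

    PackObjectHeader(int type, long size, int headerLength) {
        this.type = type;
        this.size = size;
        this.headerLength = headerLength;
    }

    static PackObjectHeader parse(byte[] data, int offset) {
        int pos = offset;
        int b = data[pos++] & 0xff;
        int type = (b >> 4) & 0x07;   // bits 6..4 of the first byte
        long size = b & 0x0f;         // low 4 bits of the size
        int shift = 4;
        while ((b & 0x80) != 0) {     // MSB set: another size byte follows
            b = data[pos++] & 0xff;
            size |= (long) (b & 0x7f) << shift;
            shift += 7;
        }
        return new PackObjectHeader(type, size, pos - offset);
    }

    public static void main(String[] args) {
        PackObjectHeader h = parse(new byte[] {0x36}, 0);
        // prints "type=3 size=6 headerLength=1"
        System.out.println("type=" + h.type + " size=" + h.size + " headerLength=" + h.headerLength);
    }
}
```

The zlib stream for the object starts at `offset + headerLength` for the non-delta types; the delta types carry an extra base reference before the compressed data.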

If it is referring to the size of the uncompressed object, can someone also clarify how to figure out how many compressed bytes we should read for each object? It seems like the 0/1 MSB indicator only applies to the object headers, so I’m not sure how many bytes I need to feed into my decompressing function.

In fact, you don’t need to manually determine the compressed size. Zlib streams are self-terminating. You just need to feed bytes into a zlib decompressor, and it will stop exactly where the compressed object ends.

Here’s an AI-generated code example in Java:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

public class GitPackDecompressor {

    public static void main(String[] args) {
        // Concatenated zlib streams ("abc\n" and "hello world" as examples)
        byte[] concatenatedCompressed = new byte[] {
            // Object 1: "abc\n"
            (byte) 0x78, (byte) 0x9c, (byte) 0x4b, (byte) 0x4c,
            (byte) 0x4a, (byte) 0xe6, (byte) 0x02, (byte) 0x00,
            (byte) 0x03, (byte) 0x7e, (byte) 0x01, (byte) 0x31,
            // Object 2: "hello world" (the last four bytes are its Adler-32 checksum, 0x1a0b045d)
            (byte) 0x78, (byte) 0x9c, (byte) 0xcb, (byte) 0x48,
            (byte) 0xcd, (byte) 0xc9, (byte) 0xc9, (byte) 0x57,
            (byte) 0x28, (byte) 0xcf, (byte) 0x2f, (byte) 0xca,
            (byte) 0x49, (byte) 0x01, (byte) 0x00, (byte) 0x1a,
            (byte) 0x0b, (byte) 0x04, (byte) 0x5d
        };

        int offset = 0;

        while (offset < concatenatedCompressed.length) {
            try {
                // nowrap=false: expect the zlib header and Adler-32 trailer around the deflate data
                Inflater inflater = new Inflater(false);
                InflaterInputStream inflaterInput = new InflaterInputStream(
                        new ByteArrayInputStream(concatenatedCompressed, offset, concatenatedCompressed.length - offset),
                        inflater
                );

                StringBuilder output = new StringBuilder();
                int b;
                while ((b = inflaterInput.read()) != -1) {
                    output.append((char) b);
                }

                System.out.println("Decompressed object: " + output);

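                // getBytesRead() reports how many compressed bytes this stream actually used.
                // (Subtracting getRemaining() from the input length only works while all remaining
                // input fits in InflaterInputStream's internal buffer, 512 bytes by default.)
                offset += (int) inflater.getBytesRead();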

                inflater.end();
            } catch (IOException e) {
                System.err.println("Decompression error at offset " + offset + ": " + e.getMessage());
                break;
            }
        }
    }
}
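Putting the two answers together: since the header size is the uncompressed size, you can use it as a sanity check after inflating, and ask the inflater how many compressed bytes it consumed to find where the next object starts. Here's a minimal sketch using `java.util.zip.Inflater` directly, which avoids the stream's read-ahead buffering. `PackObjectInflater`, `Result` and `inflateAt` are hypothetical names for illustration, and `expectedSize` is assumed to come from an already parsed object header.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

// Hypothetical helper; the names are illustrative, not from any library.
final class PackObjectInflater {

    /** One inflated object plus the number of compressed bytes it occupied. */
    static final class Result {
        final byte[] data;
        final int compressedLength;

        Result(byte[] data, int compressedLength) {
            this.data = data;
            this.compressedLength = compressedLength;
        }
    }

    static Result inflateAt(byte[] pack, int offset, long expectedSize) {
        Inflater inflater = new Inflater(); // nowrap=false: zlib header + Adler-32 trailer
        try {
            // Hand over everything from `offset` on; the inflater stops at the stream end.
            inflater.setInput(pack, offset, pack.length - offset);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                out.write(buf, 0, n);
                if (n == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported zlib stream at offset " + offset);
                }
            }
            if (out.size() != expectedSize) {
                throw new IllegalStateException(
                        "header says " + expectedSize + " bytes but inflated " + out.size());
            }
            // Compressed bytes consumed, i.e. the distance to whatever follows this object.
            return new Result(out.toByteArray(), (int) inflater.getBytesRead());
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt zlib data at offset " + offset, e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        // zlib stream for "abc\n" followed by one unrelated trailing byte
        byte[] data = {
            0x78, (byte) 0x9c, 0x4b, 0x4c, 0x4a, (byte) 0xe6, 0x02, 0x00,
            0x03, 0x7e, 0x01, 0x31, 0x7f
        };
        Result r = inflateAt(data, 0, 4);
        // prints "4 bytes inflated, 12 compressed bytes consumed"
        System.out.println(r.data.length + " bytes inflated, "
                + r.compressedLength + " compressed bytes consumed");
    }
}
```

The next object header would then start at `offset + compressedLength`.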