#FE4 - mismatch on tree length

I’m stuck on Stage #FE4. The test logs report a mismatch on the tree’s length, but I’m not sure why. My implementation seems to work, since git ls-tree can read the tree object just fine.

Here are my logs:

[tester::#FE4] Running tests for Stage #FE4 (Write a tree object)
[tester::#FE4] $ ./your_program.sh init
[your_program] Initialized git directory
[tester::#FE4] Creating some files & directories
[tester::#FE4] $ ./your_program.sh write-tree
[your_program] d3f060708d12b8c4498f6726862f6760e6edb04f
[tester::#FE4] Reading file at .git/objects/d3/f060708d12b8c4498f6726862f6760e6edb04f
[tester::#FE4] Found git object file written at .git/objects/d3/f060708d12b8c4498f6726862f6760e6edb04f.
[tester::#FE4] Git object file doesn't match official Git implementation. Diff after zlib decompression:
[tester::#FE4]
[tester::#FE4] Expected (bytes 0-100), hexadecimal:                        | ASCII:
[tester::#FE4] 74 72 65 65 20 39 39 00 31 30 30 36 34 34 20 64 6f 6f 62 79 | tree 99.100644 dooby
[tester::#FE4] 00 d6 ad 1c 23 2f 42 64 d0 ac f9 71 9e 8e db b3 e9 52 6f 82 | ....#/Bd...q.....Ro.
[tester::#FE4] 1f 34 30 30 30 30 20 68 6f 72 73 65 79 00 ac ee 8e 68 2a 49 | .40000 horsey....h*I
[tester::#FE4] 9d 46 a2 f3 0e f5 57 e8 6f d5 16 91 d2 49 34 30 30 30 30 20 | .F....W.o....I40000
[tester::#FE4] 6d 6f 6e 6b 65 79 00 d1 99 ab b6 42 4e 11 d1 e2 2b 30 dd c9 | monkey.....BN...+0..
[tester::#FE4]
[tester::#FE4] Actual (bytes 0-100), hexadecimal:                          | ASCII:
[tester::#FE4] 74 72 65 65 20 31 33 35 00 31 30 30 36 34 34 20 64 6f 6f 62 | tree 135.100644 doob
[tester::#FE4] 79 00 d6 ad 1c 23 2f 42 64 d0 ac f9 71 9e 8e db b3 e9 52 6f | y....#/Bd...q.....Ro
[tester::#FE4] 82 1f 31 30 30 36 34 34 20 64 75 6d 70 74 79 00 2e 2d 8f 56 | ..100644 dumpty..-.V
[tester::#FE4] 1f 68 0b 14 26 87 50 94 b4 ff af dd b5 3e 80 b8 31 30 30 36 | .h..&.P......>..1006
[tester::#FE4] 34 34 20 68 6f 72 73 65 79 00 95 04 2d 5b 53 38 84 41 f1 b4 | 44 horsey...-[S8.A..
[tester::#FE4]
[tester::#FE4] Git object file doesn't match official Git implementation
[tester::#FE4] Test failed

The logs show that my content length (135) is larger than the expected length (99). I’m filtering out the .git directory, so I’m not sure what’s up. Modes, names and hashes seem to match.
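
As a sanity check on those numbers: a tree body is a concatenation of entries, each laid out as `<mode> <name>\0` followed by the 20 raw SHA-1 bytes. Below is a minimal sketch that recomputes the length from the three entries visible in the Expected dump. The helper name `tree_body_length` is made up for illustration and is not part of the original code.

```python
def tree_body_length(entries):
    """Return the byte length of a tree body built from (mode, name) pairs.

    Each entry is: mode, a space, the name, a NUL byte, then a 20-byte SHA-1.
    """
    return sum(
        len(mode.encode()) + 1 + len(name.encode()) + 1 + 20
        for mode, name in entries
    )


# Entries read off the "Expected" hex dump in the tester logs
expected_entries = [
    ("100644", "dooby"),
    ("40000", "horsey"),
    ("40000", "monkey"),
]

print(tree_body_length(expected_entries))  # 99
```

Three entries of 33 bytes each give 99, which matches the expected `tree 99` header. A body of 135 bytes therefore means the written tree holds different entries than the three Git wrote, and the Actual dump does show `horsey` with mode 100644 rather than 40000.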

And here’s a snippet of my code:

...
        case "write-tree":
            entries = _create_entries_for_directory()

            contents = b""
            for sha_hash, entry in sorted(
                entries.items(), key=lambda entry: entry[1].name
            ):
                contents += entry.mode.encode() + b" "
                contents += entry.name.encode() + b"\x00"
                contents += entry.sha_hash

            content_length = len(contents)
            blob_object = b"tree " + str(content_length).encode() + b"\x00" + contents
            object_hash = sha1(blob_object).hexdigest()

            folder = object_hash[0:2]
            filename = object_hash[2:]
            if not os.path.isdir(f".git/objects/{folder}"):
                os.mkdir(f".git/objects/{folder}")

            compressed_object = zlib.compress(blob_object)
            with open(f".git/objects/{folder}/{filename}", "wb") as file:
                file.write(compressed_object)

            print(object_hash)
...

def _create_entries_for_directory(path: str | None = None):
    entries = {}
    for entry in os.scandir(path):
        if entry.is_file():
            with open(entry.path, "rb") as file:
                contents = file.read()

            content_length = len(contents)
            blob_object = b"blob " + str(content_length).encode() + b"\0" + contents
            object_hash = sha1(blob_object).digest()
            entries[object_hash] = TreeEntry(
                mode=_get_mode_for_entry(entry),
                name=entry.name,
                sha_hash=object_hash,
            )
        elif entry.is_dir():
            if entry.name in [".git", ".venv", "__pycache__"]:
                continue
            entries |= _create_entries_for_directory(entry.path)
    return entries


def _get_mode_for_entry(entry: os.DirEntry) -> str:
    if entry.is_dir():
        return "40000"
    if entry.is_symlink():
        return "120000"
    if entry.is_file():
        if os.access(entry.path, os.X_OK):
            return "100755"
        return "100644"
    raise Exception("Invalid entry")


@dataclass
class TreeEntry:
    mode: str
    name: str
    sha_hash: str | bytes

I appreciate any help. Thanks!

Hi, thanks for your post!

I’m currently out of the office and will return on Feb 3. I’ll get back to you as soon as possible after I’m back.

Hey @andy1li, did you have a chance to look at this? I’d love to make progress on this challenge, but I’m out of ideas.

Hi @eaverdeja, thanks for sharing the detailed logs!

This entry for main.py does not look correct:

The file app/main.py shouldn’t be included directly in the root. Instead, it’s the directory app that should be included.

Here’s an example of the output from the real Git:

Let me know if you’d like further clarification or assistance in debugging your implementation!

Ahh that makes sense. I was doing the recursion at the wrong place.
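
For anyone hitting the same mismatch, here is a minimal sketch of recursing in the right place: each subdirectory is written as its own tree object and then appears in the parent as a single `40000` entry, instead of having its files merged into the parent. This is not the exact code from this thread. The names `write_object`, `write_tree` and `IGNORED_DIRS` are hypothetical, symlinks are skipped, and empty directories are not special-cased (real Git does not track them).

```python
import os
import zlib
from hashlib import sha1

IGNORED_DIRS = {".git", ".venv", "__pycache__"}


def write_object(kind: bytes, body: bytes, git_dir: str = ".git") -> bytes:
    """Store a loose object and return its raw 20-byte SHA-1."""
    store = kind + b" " + str(len(body)).encode() + b"\x00" + body
    digest = sha1(store)
    hex_hash = digest.hexdigest()
    folder = os.path.join(git_dir, "objects", hex_hash[:2])
    os.makedirs(folder, exist_ok=True)
    with open(os.path.join(folder, hex_hash[2:]), "wb") as file:
        file.write(zlib.compress(store))
    return digest.digest()


def write_tree(path: str = ".", git_dir: str = ".git") -> bytes:
    """Write a tree object for `path` and return its raw 20-byte SHA-1."""
    entries = []
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            if entry.name in IGNORED_DIRS:
                continue
            # Recurse here: the subdirectory becomes ONE entry in this tree.
            sub_sha = write_tree(entry.path, git_dir)
            entries.append(("40000", entry.name, sub_sha))
        elif entry.is_file(follow_symlinks=False):
            with open(entry.path, "rb") as file:
                blob_sha = write_object(b"blob", file.read(), git_dir)
            mode = "100755" if os.access(entry.path, os.X_OK) else "100644"
            entries.append((mode, entry.name, blob_sha))

    # Git orders entries by name bytes, comparing directory names as if
    # they ended with "/".
    entries.sort(
        key=lambda e: e[1].encode() + (b"/" if e[0] == "40000" else b"")
    )
    body = b"".join(
        mode.encode() + b" " + name.encode() + b"\x00" + sha
        for mode, name, sha in entries
    )
    return write_object(b"tree", body, git_dir)
```

Calling `print(write_tree().hex())` from the repository root would print the 40-character hash. Keeping entries in a list rather than a dict keyed by hash also avoids silently dropping one of two files that have identical contents.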

Tests are now passing, I just gotta clean up the code 🙈

[tester::#FE4] Running tests for Stage #FE4 (Write a tree object)
[tester::#FE4] $ ./your_program.sh init
[your_program] Initialized git directory
[tester::#FE4] Creating some files & directories
[tester::#FE4] $ ./your_program.sh write-tree
[your_program] e0eaa24bc3e7945df1c780de90481b709c800827
[tester::#FE4] Reading file at .git/objects/e0/eaa24bc3e7945df1c780de90481b709c800827
[tester::#FE4] Found git object file written at .git/objects/e0/eaa24bc3e7945df1c780de90481b709c800827.
[tester::#FE4] $ git ls-tree --name-only e0eaa24bc3e7945df1c780de90481b709c800827
[tester::#FE4] Test passed.

Thanks Andy!
