I’m stuck on Stage #FE4. The test logs report a mismatch in the tree’s length, but I’m not sure why. My implementation seems to work, since git ls-tree can read the tree object just fine.
Here are my logs:
[tester::#FE4] Running tests for Stage #FE4 (Write a tree object)
[tester::#FE4] $ ./your_program.sh init
[your_program] Initialized git directory
[tester::#FE4] Creating some files & directories
[tester::#FE4] $ ./your_program.sh write-tree
[your_program] d3f060708d12b8c4498f6726862f6760e6edb04f
[tester::#FE4] Reading file at .git/objects/d3/f060708d12b8c4498f6726862f6760e6edb04f
[tester::#FE4] Found git object file written at .git/objects/d3/f060708d12b8c4498f6726862f6760e6edb04f.
[tester::#FE4] Git object file doesn't match official Git implementation. Diff after zlib decompression:
[tester::#FE4]
[tester::#FE4] Expected (bytes 0-100), hexadecimal: | ASCII:
[tester::#FE4] 74 72 65 65 20 39 39 00 31 30 30 36 34 34 20 64 6f 6f 62 79 | tree 99.100644 dooby
[tester::#FE4] 00 d6 ad 1c 23 2f 42 64 d0 ac f9 71 9e 8e db b3 e9 52 6f 82 | ....#/Bd...q.....Ro.
[tester::#FE4] 1f 34 30 30 30 30 20 68 6f 72 73 65 79 00 ac ee 8e 68 2a 49 | .40000 horsey....h*I
[tester::#FE4] 9d 46 a2 f3 0e f5 57 e8 6f d5 16 91 d2 49 34 30 30 30 30 20 | .F....W.o....I40000
[tester::#FE4] 6d 6f 6e 6b 65 79 00 d1 99 ab b6 42 4e 11 d1 e2 2b 30 dd c9 | monkey.....BN...+0..
[tester::#FE4]
[tester::#FE4] Actual (bytes 0-100), hexadecimal: | ASCII:
[tester::#FE4] 74 72 65 65 20 31 33 35 00 31 30 30 36 34 34 20 64 6f 6f 62 | tree 135.100644 doob
[tester::#FE4] 79 00 d6 ad 1c 23 2f 42 64 d0 ac f9 71 9e 8e db b3 e9 52 6f | y....#/Bd...q.....Ro
[tester::#FE4] 82 1f 31 30 30 36 34 34 20 64 75 6d 70 74 79 00 2e 2d 8f 56 | ..100644 dumpty..-.V
[tester::#FE4] 1f 68 0b 14 26 87 50 94 b4 ff af dd b5 3e 80 b8 31 30 30 36 | .h..&.P......>..1006
[tester::#FE4] 34 34 20 68 6f 72 73 65 79 00 95 04 2d 5b 53 38 84 41 f1 b4 | 44 horsey...-[S8.A..
[tester::#FE4]
[tester::#FE4] Git object file doesn't match official Git implementation
[tester::#FE4] Test failed
The logs show that my content length (135) is bigger than the expected length (99). I’m filtering out the .git directory, so I’m not sure what’s up. Modes, names and hashes seem to match.
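To compare the two dumps entry by entry rather than byte by byte, the decompressed object can be split into its entries. The following is a minimal sketch, assuming the usual layout of a `tree <size>` header terminated by a NUL byte, followed by `<mode> <name>`, a NUL byte, and a raw 20-byte SHA-1 per entry. `parse_tree` is a hypothetical helper name and is not part of my program.

```python
def parse_tree(raw: bytes) -> list[tuple[str, str, str]]:
    """Split a decompressed tree object into (mode, name, hex SHA-1) tuples."""
    header, _, body = raw.partition(b"\x00")
    kind, _, size = header.partition(b" ")
    if kind != b"tree" or int(size) != len(body):
        raise ValueError("not a tree object, or the size header is wrong")

    entries = []
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)      # end of the mode
        nul = body.index(b"\x00", space)   # end of the name
        mode = body[pos:space].decode()
        name = body[space + 1:nul].decode()
        sha = body[nul + 1:nul + 21].hex()  # 20 raw bytes follow the NUL
        entries.append((mode, name, sha))
        pos = nul + 21
    return entries


if __name__ == "__main__":
    # First entry from the logs above, rebuilt by hand for illustration.
    sha = bytes.fromhex("d6ad1c232f4264d0acf9719e8edbb3e9526f821f")
    body = b"100644 dooby\x00" + sha
    raw = b"tree " + str(len(body)).encode() + b"\x00" + body
    for mode, name, hex_sha in parse_tree(raw):
        print(mode, name, hex_sha)
```

To run it against a real object, pass in `zlib.decompress(...)` of the file under `.git/objects/`. Printing one entry per line makes it easier to see which names and modes differ between the two objects.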
And here’s a snippet of my code:
...
    case "write-tree":
        entries = _create_entries_for_directory()
        contents = b""
        for sha_hash, entry in sorted(
            entries.items(), key=lambda entry: entry[1].name
        ):
            contents += entry.mode.encode() + b" "
            contents += entry.name.encode() + b"\x00"
            contents += entry.sha_hash
        content_length = len(contents)
        blob_object = b"tree " + str(content_length).encode() + b"\x00" + contents
        object_hash = sha1(blob_object).hexdigest()
        folder = object_hash[0:2]
        filename = object_hash[2:]
        if not os.path.isdir(f".git/objects/{folder}"):
            os.mkdir(f".git/objects/{folder}")
        compressed_object = zlib.compress(blob_object)
        with open(f".git/objects/{folder}/{filename}", "wb") as file:
            file.write(compressed_object)
        print(object_hash)
...
def _create_entries_for_directory(path: str | None = None):
    entries = {}
    for entry in os.scandir(path):
        if entry.is_file():
            with open(entry.path, "rb") as file:
                contents = file.read()
            content_length = len(contents)
            blob_object = b"blob " + str(content_length).encode() + b"\0" + contents
            object_hash = sha1(blob_object).digest()
            entries[object_hash] = TreeEntry(
                mode=_get_mode_for_entry(entry),
                name=entry.name,
                sha_hash=object_hash,
            )
        elif entry.is_dir():
            if entry.name in [".git", ".venv", "__pycache__"]:
                continue
            entries |= _create_entries_for_directory(entry.path)
    return entries


def _get_mode_for_entry(entry: os.DirEntry) -> str:
    if entry.is_dir():
        return "40000"
    if entry.is_symlink():
        return "120000"
    if entry.is_file():
        if os.access(entry.path, os.X_OK):
            return "100755"
        return "100644"
    raise Exception("Invalid entry")


@dataclass
class TreeEntry:
    mode: str
    name: str
    sha_hash: str | bytes
I appreciate any help. Thanks!
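One detail in the dumps may be relevant: the expected object lists `horsey` as a single `40000` entry, while the actual object lists `dumpty` and `horsey` as `100644` files, which looks like files from subdirectories ending up in the top-level tree. The following is a minimal sketch of the nested layout, assuming each directory gets its own tree object and the parent stores one `40000` entry pointing at that subtree's hash. `hash_object`, `build_tree` and `SKIP_DIRS` are hypothetical names, and the sketch only computes hashes; writing the compressed objects to `.git/objects` is left out. It also assumes no empty directories, which real Git does not track.

```python
import os
from hashlib import sha1

SKIP_DIRS = {".git"}  # assumption: only .git needs to be ignored


def hash_object(kind: bytes, body: bytes) -> tuple[bytes, bytes]:
    """Return (raw 20-byte SHA-1, full object bytes) for a git object."""
    obj = kind + b" " + str(len(body)).encode() + b"\x00" + body
    return sha1(obj).digest(), obj


def build_tree(path: str = ".") -> tuple[bytes, bytes]:
    """Build the tree object for one directory; subdirectories recurse."""
    rows = []  # (sort key, mode, name, raw SHA-1)
    for entry in os.scandir(path):
        name = entry.name.encode()
        if entry.is_dir(follow_symlinks=False):
            if entry.name in SKIP_DIRS:
                continue
            # The subtree's own hash becomes this directory's entry.
            sub_sha, _ = build_tree(entry.path)
            # Git orders directories as if their name ended with "/".
            rows.append((name + b"/", b"40000", name, sub_sha))
        elif entry.is_file(follow_symlinks=False):
            with open(entry.path, "rb") as fh:
                blob_sha, _ = hash_object(b"blob", fh.read())
            mode = b"100755" if os.access(entry.path, os.X_OK) else b"100644"
            rows.append((name, mode, name, blob_sha))

    body = b"".join(
        mode + b" " + name + b"\x00" + sha
        for _, mode, name, sha in sorted(rows, key=lambda row: row[0])
    )
    return hash_object(b"tree", body)


if __name__ == "__main__":
    tree_sha, _ = build_tree(".")
    print(tree_sha.hex())
```

With this shape, the length in the `tree` header only counts the direct children of a directory, so nested files no longer add to the top-level size.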