[Python][#FE4] write tree - object content differs

I’m stuck on Stage #FE4.

I have implemented a recursive write-tree, but the object it writes differs from the one the tester expects.

Here are my logs:

[user@fedora codecrafters-git-python]$ codecrafters test
Initiating test run...

⏳ Turbo test runners busy. You are in queue.

Upgrade to skip the wait: https://codecrafters.io/turbo

Running tests. Logs should appear shortly...

[compile] Moved ./.codecrafters/run.sh → ./your_program.sh
[compile] Compilation successful.

[tester::#FE4] Running tests for Stage #FE4 (Write a tree object)
[tester::#FE4] $ ./your_program.sh init
[your_program] Initialized git directory
[tester::#FE4] Creating some files & directories
[tester::#FE4] $ ./your_program.sh write-tree
[your_program] 35f788256ebc421c93c7dc6bf4b7878cde4319c6
[tester::#FE4] Found git object file written at .git/objects/35/f788256ebc421c93c7dc6bf4b7878cde4319c6.
[tester::#FE4] Git object file doesn't match official Git implementation. Diff after zlib decompression:
[tester::#FE4] 
[tester::#FE4] Expected (bytes 0-100), hexadecimal:                        | ASCII:
[tester::#FE4] 74 72 65 65 20 39 38 00 34 30 30 30 30 20 64 6f 6f 00 57 77 | tree 98.40000 doo.Ww
[tester::#FE4] de 63 c8 c1 d7 5b 22 f3 77 3b 04 56 15 ac 05 cf 7f 22 34 30 | .c...[".w;.V....."40
[tester::#FE4] 30 30 30 20 6d 6f 6e 6b 65 79 00 dd 40 10 af 93 91 79 eb b4 | 000 monkey..@....y..
[tester::#FE4] 08 df ca 79 f7 f3 94 55 c7 60 87 31 30 30 36 34 34 20 76 61 | ...y...U.`.100644 va
[tester::#FE4] 6e 69 6c 6c 61 00 eb c8 50 fa fe d7 7a 95 a3 4a 95 1a 06 b2 | nilla...P...z..J....
[tester::#FE4] 
[tester::#FE4] Actual (bytes 0-100), hexadecimal:                          | ASCII:
[tester::#FE4] 74 72 65 65 20 39 38 00 34 30 30 30 30 20 64 6f 6f 00 07 9f | tree 98.40000 doo...
[tester::#FE4] 59 2f 6f 98 bf 6e 62 82 18 e6 ca cd 80 6c bb 58 af b4 34 30 | Y/o..nb......l.X..40
[tester::#FE4] 30 30 30 20 6d 6f 6e 6b 65 79 00 3e a3 2e c9 03 3e b9 b9 75 | 000 monkey.>....>..u
[tester::#FE4] 75 02 81 24 b9 20 dc ab 0a 46 df 31 30 30 36 34 34 20 76 61 | u..$. ...F.100644 va
[tester::#FE4] 6e 69 6c 6c 61 00 eb c8 50 fa fe d7 7a 95 a3 4a 95 1a 06 b2 | nilla...P...z..J....
[tester::#FE4] 
[tester::#FE4] Git object file doesn't match official Git implementation
[tester::#FE4] Test failed (try setting 'debug: true' in your codecrafters.yml to see more details)
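
To see which entries differ in a dump like the one above, the decompressed object can be split into its entries. This is a minimal sketch, assuming the layout visible in the hex dump (a `tree <size>\0` header followed by `<mode> <name>\0` and a 20-byte raw SHA-1 per entry). `parse_tree` is a hypothetical helper name, not something the tester or Git provides.

```python
def parse_tree(raw: bytes):
    """Split a decompressed tree object into (mode, name, sha_hex) tuples."""
    # The header is everything up to the first NUL byte: b"tree <size>".
    header, _, body = raw.partition(b"\0")
    kind, _, size = header.partition(b" ")
    if kind != b"tree" or int(size) != len(body):
        raise ValueError("not a well-formed tree object")

    entries = []
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)    # end of the mode field
        nul = body.index(b"\0", space)   # end of the entry name
        mode = body[pos:space].decode()
        name = body[space + 1:nul].decode()
        sha = body[nul + 1:nul + 21]     # 20 raw SHA-1 bytes follow the NUL
        entries.append((mode, name, sha.hex()))
        pos = nul + 21
    return entries
```

In the dump above, the doo and monkey entries (both directories) carry different hashes in the expected and actual output, while the visible part of the vanilla blob hash is identical. That suggests looking at how subtree hashes are computed rather than at blob hashing.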

And here’s a snippet of my code:

    def hash_object(filepath, write_enabled=False):
        filepath = Path(filepath)
        if not filepath.is_file():
            print("Object not found.")
            raise FileNotFoundError
        with open(filepath, "rb") as f:
            contents = f.read()
        size = len(contents)
        header = f"blob {size}"
        blob = header.encode() + b"\0" + contents
        raw_sha = hashlib.sha1(blob)
        sha_hash = raw_sha.hexdigest()
        if write_enabled:
            compressed_blob = zlib.compress(blob)
            sub_dir = sha_hash[:2]
            obj_name = sha_hash[2:]
            if not (Path(".git/objects") / sub_dir).is_dir():
                Path.mkdir(Path(".git/objects") / sub_dir)
            with open(Path(".git/objects") / sub_dir / obj_name, "wb") as f:
                f.write(compressed_blob)
        return (raw_sha.digest()[:20], sha_hash)


    def write_tree(loc: Path):
        tree_contents = b""
        for file in sorted(list(loc.iterdir())):
            if file.name == ".git":
                continue
            elif file.is_file():
                mode = "100644"
                raw_sha, _ = hash_object(file)
            else:
                mode = "40000"
                raw_sha, _ = write_tree(file)
            tree_contents += f"{mode} {file.name}".encode() + b"\0" + raw_sha
            # print(mode, file.name, raw_sha)

        tree_sha = hashlib.sha1(tree_contents)
        tree_sha_hash = tree_sha.hexdigest()
        if not (Path(".git/objects") / tree_sha_hash[:2]).is_dir():
            Path.mkdir(Path(".git/objects") / tree_sha_hash[:2])
        with open(Path(".git/objects") / tree_sha_hash[:2] / tree_sha_hash[2:], "wb") as f:
            tree_obj = f"tree {len(tree_contents)}".encode() + b"\0" + tree_contents
            compressed_tree_obj = zlib.compress(tree_obj)
            f.write(compressed_tree_obj)
        return tree_sha.digest()[:20], tree_sha_hash

Hey @neerajnangireddy, could you upload your code to GitHub and share the link? It will be much easier to debug if I can run it directly.

Hey @neerajnangireddy, I tried running your code against the previous stages, and it no longer passes an earlier stage, #IC4 (Read a blob object).

Suggestions:

  1. Use our CLI to test against previous stages by running:
     codecrafters test --previous
  2. Focus on fixing the early stages first, as later stages depend on them.

Hi,
I refactored some of my code for the previous stages and forgot to update the arguments, so the previous stages failed. However, the logic is the same, and it still fails the #FE4 stage.
The latest changes are pushed to my GitHub. Any help in identifying the error would be appreciated.

Thank you for the help,

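I finally fixed it.
The problem was that I calculated the hash of the tree without including the tree header, i.e. tree {size}\0. Since each subtree's hash is stored in its parent's entry, the wrong subtree hashes also changed the contents of the parent tree.

The modified Python code is: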

 def write_tree(loc: Path):
@@ -91,15 +91,16 @@ def write_tree(loc: Path):
         tree_contents += f"{mode} {file.name}".encode() + b"\0" + raw_sha
         # print(mode, file.name, raw_sha)
 
-    tree_sha = hashlib.sha1(tree_contents)
+    tree_obj = f"tree {len(tree_contents)}".encode() + b"\0" + tree_contents
+    tree_sha = hashlib.sha1(tree_obj)
     tree_sha_hash = tree_sha.hexdigest()
     if not (Path(".git/objects") / tree_sha_hash[:2]).is_dir():
         Path.mkdir(Path(".git/objects") / tree_sha_hash[:2])
     with open(Path(".git/objects") / tree_sha_hash[:2] / tree_sha_hash[2:], "wb") as f:
-        tree_obj = f"tree {len(tree_contents)}".encode() + b"\0" + tree_contents
+        # tree_obj = f"tree {len(tree_contents)}".encode() + b"\0" + tree_contents
         compressed_tree_obj = zlib.compress(tree_obj)
         f.write(compressed_tree_obj)
-    return tree_sha.digest()[:20], tree_sha_hash
+    return tree_sha.digest(), tree_sha_hash
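
The change above can be checked in isolation. This is a minimal sketch, assuming only that the object ID is the SHA-1 of the header plus the entries. `tree_object_id` is a hypothetical helper name, not part of the code in this thread. It uses the empty tree, whose ID is the well-known `4b825dc642cb6eb9a060e54bf8d69288fbee4904`.

```python
import hashlib


def tree_object_id(tree_contents: bytes):
    """Return (raw_digest, hex_digest) for a tree with the given entries."""
    # The hash covers the full object: header, NUL byte, then the entries.
    tree_obj = f"tree {len(tree_contents)}".encode() + b"\0" + tree_contents
    sha = hashlib.sha1(tree_obj)
    return sha.digest(), sha.hexdigest()


# An empty tree has no entries, so the object is just b"tree 0\0".
raw_id, hex_id = tree_object_id(b"")

# Hashing only the entries (the original bug) gives a different ID.
without_header = hashlib.sha1(b"").hexdigest()
```

Here `hex_id` is the empty-tree ID quoted above, while `without_header` is not. A SHA-1 digest is always 20 bytes, which is why the `[:20]` slice could be dropped in the diff.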

Once again, thanks for the help.
