Thanks for the links! They were helpful.
I managed to progress a bit further - I can now find & resolve the 3 REF_DELTA objects in the git-sample-1 repo.
Number of base objects: 329
Number of delta objects to resolve: 3
Attempting to resolve delta with base SHA: 6a9f27650d6d08a9307f28a3e1697b32dc250a8a
Successfully resolved delta, new object SHA: 5a201b017b9c92745491d72a7301de7ec773f782
Attempting to resolve delta with base SHA: 5a201b017b9c92745491d72a7301de7ec773f782
Successfully resolved delta, new object SHA: 781007c281d79173c6c166b753376147af4ace50
Attempting to resolve delta with base SHA: 0f99f9c5b83b010cfbd67870502df7b293ec0e37
Successfully resolved delta, new object SHA: 2a7a45d39bd312e00c01f5972063b7ca12b6bd28
Remaining deltas after iteration: 0
Verifying the packfile gives me 329 non-delta objects and 3 delta objects, so everything lines up so far.
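For context, this is roughly what resolving one of those deltas involves. It is a minimal sketch assuming the standard pack delta encoding (two size varints, then copy/insert instructions); `read_size` and `apply_delta` are hypothetical names for illustration, not the functions in app/commands.py.

```python
def read_size(data: bytes, pos: int) -> tuple[int, int]:
    """Read a little-endian base-128 size; return (value, next position)."""
    value = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:  # high bit clear: last byte of the size
            return value, pos


def apply_delta(base: bytes, delta: bytes) -> bytes:
    """Rebuild the target object body from a base body and a delta."""
    source_size, pos = read_size(delta, 0)
    target_size, pos = read_size(delta, pos)
    if source_size != len(base):
        raise ValueError("base size mismatch")

    out = bytearray()
    while pos < len(delta):
        op = delta[pos]
        pos += 1
        if op & 0x80:
            # Copy instruction: bits 0-3 select offset bytes, bits 4-6 size bytes.
            offset = size = 0
            for i in range(4):
                if op & (1 << i):
                    offset |= delta[pos] << (8 * i)
                    pos += 1
            for i in range(3):
                if op & (0x10 << i):
                    size |= delta[pos] << (8 * i)
                    pos += 1
            if size == 0:
                size = 0x10000  # a size of zero means 64 KiB
            out += base[offset:offset + size]
        elif op:
            # Insert instruction: the next `op` bytes are literal data.
            out += delta[pos:pos + op]
            pos += op
        else:
            raise ValueError("reserved delta opcode 0")

    if len(out) != target_size:
        raise ValueError("target size mismatch")
    return bytes(out)
```

Note that the result is only the object body. The resolved object has the same type as its base (blob, tree, commit or tag), and that type has to go into the header before hashing and writing it.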
The new issue shows up during checkout, when reading back the git objects I just wrote - there’s a tree that wasn’t written to .git/objects for some reason.
File "/Users/eaverdeja/Projects/codecrafters-git-python/app/commands.py", line 237, in _checkout_tree
type_str, data = _read_git_object(tree_sha1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/eaverdeja/Projects/codecrafters-git-python/app/commands.py", line 263, in _read_git_object
with open(obj_path, "rb") as file:
^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '.git/objects/f0/88b44e824793b7aaf2f65c0919c4287d500188'
In a fresh git clone of the sample repo, this SHA does exist and is a tree:
➜ git-sample-1 git:(master) ✗ g cat-file -p f088b44e824793b7aaf2f65c0919c4287d500188
100644 blob ab87edd434a2c46290ecbd9799b3bd2b6525f6d3 donkey
100644 blob 9f0eabb707754276cf5fa8492bc5ae2b1becbc4b doo
100644 blob 578e86f46ec2d281b72c817c54c4c16bc2ff9e08 dooby
100644 blob a28840bd27bf82659e591c61f484f30714728249 dumpty
100644 blob e1c1cade21c7761bdb226666ae0d5ed70f7e3cdd horsey
100644 blob 4a21229f266438606fa578ea9b478013b89e77b0 humpty
100644 blob 409b28ad925a57dd4b521d04274416301fec8828 monkey
100644 blob bd8a3fd65926c98bde51900ce0ab287197154e34 scooby
100644 blob 7f487faaebd80604675fc12b9e41f8c25cdeea76 vanilla
Running tree on my cloned repo’s .git/objects yields 188 directories, 334 files. On a fresh git clone of the repo I get 187 directories, 332 files.
It seems weird that just a single object is missing. Running diff --brief git-sample-1/.git/objects ../git-sample-1/.git/objects yields the following (left is my clone, right is git clone):
Only in git-sample-1/.git/objects: 23
Only in git-sample-1/.git/objects: 4c
Only in git-sample-1/.git/objects: ff
Only in ../git-sample-1/.git/objects: info
Only in ../git-sample-1/.git/objects: pack
So it seems like I have extra objects (or wrongly named ones). How could that be happening? These extra objects are invalid btw:
(venv) ➜ codecrafters-git-python git:(master) ✗ cd git-sample-1 && g cat-file -p 23575504f27d489deee7ad72bfc9c2a185d4eb49
fatal: invalid object type
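One way to narrow this down would be to scan every loose object and check its header and hash. The sketch below assumes the usual loose object layout (zlib-compressed `<type> <size>\0<body>`, named after the SHA-1 of the uncompressed bytes); `check_loose_objects` is a hypothetical helper, not something from my code or from git itself.

```python
import hashlib
import zlib
from pathlib import Path

VALID_TYPES = {b"blob", b"tree", b"commit", b"tag"}


def check_loose_objects(objects_dir: str) -> list[tuple[str, str]]:
    """Return (sha_from_path, problem) pairs for suspicious loose objects."""
    problems = []
    # Loose objects live in two-hex-digit directories; this skips info/ and pack/.
    for path in sorted(Path(objects_dir).glob("[0-9a-f][0-9a-f]/*")):
        sha_from_path = path.parent.name + path.name
        try:
            raw = zlib.decompress(path.read_bytes())
        except zlib.error:
            problems.append((sha_from_path, "not zlib-compressed"))
            continue

        header, sep, body = raw.partition(b"\x00")
        obj_type, _, size = header.partition(b" ")
        if not sep or obj_type not in VALID_TYPES:
            problems.append((sha_from_path, f"invalid type {obj_type[:20]!r}"))
        elif not size.isdigit() or int(size) != len(body):
            problems.append((sha_from_path, "size mismatch"))
        elif hashlib.sha1(raw).hexdigest() != sha_from_path:
            problems.append((sha_from_path, "content does not hash to its path"))
    return problems
```

Calling `check_loose_objects("git-sample-1/.git/objects")` and printing the pairs should show whether the objects under 23, 4c and ff carry an unexpected type in their header (for example a pack type number or a delta type instead of blob/tree/commit/tag), which might also explain why the expected tree ended up under a different name.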
I’m so close that I can even get the tests passing - which is amazing but not quite fulfilling, since I have to ignore errors when reading git objects to do so. I’d much prefer a properly working solution.
[tester::#MG6] Running tests for Stage #MG6 (Clone a repository)
[tester::#MG6] $ ./your_program.sh clone https://github.com/codecrafters-io/git-sample-3 <testDir>
[your_program]
[your_program] Number of base objects: 300
[your_program] Number of delta objects to resolve: 1
[your_program] Attempting to resolve delta with base SHA: 614a4a38d7b3dd6d34df0b99110b81ea32bef5a6
[your_program] Successfully resolved delta, new object SHA: b64fa8e5c5fcea1ecfd0cf36986a7b450656d944
[your_program] Remaining deltas after iteration: 0
[tester::#MG6] $ git cat-file commit 23f0bc3b5c7c3108e41c448f01a3db31e7064bbb
[tester::#MG6] Commit contents verified
[tester::#MG6] Reading contents of a sample file
[tester::#MG6] File contents verified
[tester::#MG6] Test passed.