* git gc gives "error: Could not read..."
@ 2015-06-01 7:37 Stefan Näwe
  2015-06-01 8:14 ` Jeff King
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Näwe @ 2015-06-01 7:37 UTC (permalink / raw)
  To: Git list

Hi there.

One of my repos started giving an error on 'git gc' recently:

  $ git gc
  error: Could not read 7713c3b1e9ea2dd9126244697389e4000bb39d85
  Counting objects: 3052, done.
  Delta compression using up to 4 threads.
  Compressing objects: 100% (531/531), done.
  Writing objects: 100% (3052/3052), done.
  Total 3052 (delta 2504), reused 3052 (delta 2504)
  error: Could not read 7713c3b1e9ea2dd9126244697389e4000bb39d85

(Yes, the error comes twice).

I tried:

  $ git cat-file -t 7713c3b1e9ea2dd9126244
  fatal: Not a valid object name 7713c3b1e9ea2dd9126244

Otherwise, everything works fine.

I used that repo initially on Win7 with an older msysgit installation
but also tried the latest git-for-windows installer, and version 2.4.2
on linux. All give the same error (no matter if I clone or copy the repo).

Unfortunately I cannot make the repo publicly available.

Any chance to get rid of that error?

Thanks,
  Stefan
-- 
----------------------------------------------------------------
/dev/random says: When it comes to humility, I'm the very BEST there is!
python -c "print '73746566616e2e6e616577654061746c61732d656c656b74726f6e696b2e636f6d'.decode('hex')"
GPG Key fingerprint = 2DF5 E01B 09C3 7501 BCA9 9666 829B 49C5 9221 27AF

^ permalink raw reply [flat|nested] 15+ messages in thread

* Re: git gc gives "error: Could not read..."
From: Jeff King @ 2015-06-01 8:14 UTC
  To: Stefan Näwe; +Cc: Git list

On Mon, Jun 01, 2015 at 09:37:17AM +0200, Stefan Näwe wrote:

> One of my repos started giving an error on 'git gc' recently:
>
>   $ git gc
>   error: Could not read 7713c3b1e9ea2dd9126244697389e4000bb39d85
>   Counting objects: 3052, done.
>   Delta compression using up to 4 threads.
>   Compressing objects: 100% (531/531), done.
>   Writing objects: 100% (3052/3052), done.
>   Total 3052 (delta 2504), reused 3052 (delta 2504)
>   error: Could not read 7713c3b1e9ea2dd9126244697389e4000bb39d85

The only error string that matches that is the one in parse_commit(),
when we fail to read the object. It happens twice here because
`git gc` runs several subcommands; you can see which ones are generating
the error if you run with GIT_TRACE=1.

I am surprised that it doesn't cause the commands to abort, though. If
we are traversing the object graph to repack, for example, we would want
to abort if we are missing a reachable object (i.e., the repository is
corrupt).

> I tried:
>
>   $ git cat-file -t 7713c3b1e9ea2dd9126244
>   fatal: Not a valid object name 7713c3b1e9ea2dd9126244

Not surprising, if we don't have the object. What is curious is why git
wants to look it up in the first place. I.e., who is referencing it?

Either:

  1. It is an object that we are OK to be missing (e.g., the
     UNINTERESTING side of a traversal), and the error should be
     suppressed.

  2. Your repository really is corrupted, and this is a case where we
     need to be paying attention to the return value of parse_commit but
     are not.

I'd love to see:

  - the output of "GIT_TRACE=1 git gc" (to see which subcommand is
    causing the error)

  - the output of "git fsck" (which should hopefully confirm whether or
    not there is a real problem)

  - any mentions of the sha1 in the refs or reflogs. Something like:

      sha1=7713c3b1e9ea2dd9126244697389e4000bb39d85
      cd .git
      grep $sha1 $(find packed-refs refs logs -type f)

  - If that doesn't turn up any hits, then presumably it's an object
    referencing the sha1. We can dig into the objects (all of them, not
    just reachable ones), like:

      {
        # loose objects
        (cd .git/objects && find ?? -type f | tr -d /)
        # packed objects
        for i in .git/objects/pack/*.idx; do
          git show-index <$i
        done | cut -d' ' -f2
      } |
      # omit blobs; they are expensive to access and cannot have
      # reachability pointers
      git cat-file --batch-check='%(objecttype) %(objectname)' |
      grep -v ^blob |
      cut -d' ' -f2 |
      # now get all of the contents, and look for our object; this is
      # going to be slow, since it's one process per object; but we
      # can't use --batch because we need to pretty-print the trees
      xargs -n1 git cat-file -p |
      less +/$sha1

I would have guessed this was maybe caused by trying to traverse
unreachable recent objects for reachability. It fits case 1 (it is OK
for us to be missing these objects, but we might accidentally complain),
and it would probably happen twice during a gc (once for the repack, and
once for `git prune`).

But that code should not be present in older versions of msysgit, as it
came in v2.2.0 (and I assume "older msysgit" is v1.9.5). And if that is
the problem, it would follow a copy of the repo, but not a clone (though
I guess if your clone was on the local filesystem, we blindly hardlink
the objects, so it might follow there).

-Peff
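
The ref and reflog search suggested in the message above assumes that
packed-refs, refs and logs all exist and that no path contains
whitespace. A slightly more defensive sketch of the same idea follows.
The function name find_sha1_in_refs is made up for illustration and is
not a git command; it takes the path to the .git directory and an object
name, and prints every ref or reflog file that mentions that name.

```shell
# Hypothetical helper (not part of git): list the ref and reflog files
# under a .git directory that mention a given object name.
find_sha1_in_refs() {
    gitdir=$1
    sha1=$2
    for path in "$gitdir/packed-refs" "$gitdir/refs" "$gitdir/logs"; do
        # Skip locations that do not exist (e.g. refs were never packed).
        [ -e "$path" ] || continue
        # -l prints only the names of matching files; -F treats the
        # object name as a fixed string rather than a regex.
        # grep exits non-zero when nothing matches, which is not an
        # error for this search, so that status is ignored.
        find "$path" -type f -exec grep -lF -- "$sha1" {} + || :
    done
}

# Example invocation from the top of a work tree:
#   find_sha1_in_refs .git 7713c3b1e9ea2dd9126244697389e4000bb39d85
```

No output means that no ref or reflog entry points at the object, which
is the result reported later in the thread.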

* Re: git gc gives "error: Could not read..."
From: Stefan Näwe @ 2015-06-01 8:40 UTC
  To: Jeff King; +Cc: Git list

[-- Attachment #1: Type: text/plain, Size: 4480 bytes --]

On 01.06.2015 at 10:14, Jeff King wrote:
> On Mon, Jun 01, 2015 at 09:37:17AM +0200, Stefan Näwe wrote:
>
>> One of my repos started giving an error on 'git gc' recently:
>>
>>   $ git gc
>>   error: Could not read 7713c3b1e9ea2dd9126244697389e4000bb39d85
>>   Counting objects: 3052, done.
>>   Delta compression using up to 4 threads.
>>   Compressing objects: 100% (531/531), done.
>>   Writing objects: 100% (3052/3052), done.
>>   Total 3052 (delta 2504), reused 3052 (delta 2504)
>>   error: Could not read 7713c3b1e9ea2dd9126244697389e4000bb39d85
>
> The only error string that matches that is the one in parse_commit(),
> when we fail to read the object. It happens twice here because
> `git gc` runs several subcommands; you can see which ones are generating
> the error if you run with GIT_TRACE=1.
>
> I am surprised that it doesn't cause the commands to abort, though. If
> we are traversing the object graph to repack, for example, we would want
> to abort if we are missing a reachable object (i.e., the repository is
> corrupt).
>
>> I tried:
>>
>>   $ git cat-file -t 7713c3b1e9ea2dd9126244
>>   fatal: Not a valid object name 7713c3b1e9ea2dd9126244
>
> Not surprising, if we don't have the object. What is curious is why git
> wants to look it up in the first place. I.e., who is referencing it?
>
> Either:
>
>   1. It is an object that we are OK to be missing (e.g., the
>      UNINTERESTING side of a traversal), and the error should be
>      suppressed.
>
>   2. Your repository really is corrupted, and this is a case where we
>      need to be paying attention to the return value of parse_commit but
>      are not.
>
> I'd love to see:
>
>   - the output of "GIT_TRACE=1 git gc" (to see which subcommand is
>     causing the error)
>
>   - the output of "git fsck" (which should hopefully confirm whether or
>     not there is a real problem)

See attached file.

>   - any mentions of the sha1 in the refs or reflogs. Something like:
>
>       sha1=7713c3b1e9ea2dd9126244697389e4000bb39d85
>       cd .git
>       grep $sha1 $(find packed-refs refs logs -type f)

That gives nothing.

>   - If that doesn't turn up any hits, then presumably it's an object
>     referencing the sha1. We can dig into the objects (all of them, not
>     just reachable ones), like:
>
>       {
>         # loose objects
>         (cd .git/objects && find ?? -type f | tr -d /)
>         # packed objects
>         for i in .git/objects/pack/*.idx; do
>           git show-index <$i
>         done | cut -d' ' -f2
>       } |
>       # omit blobs; they are expensive to access and cannot have
>       # reachability pointers
>       git cat-file --batch-check='%(objecttype) %(objectname)' |
>       grep -v ^blob |
>       cut -d' ' -f2 |
>       # now get all of the contents, and look for our object; this is
>       # going to be slow, since it's one process per object; but we
>       # can't use --batch because we need to pretty-print the trees
>       xargs -n1 git cat-file -p |
>       less +/$sha1

Turns out to be a tree:

tree 7713c3b1e9ea2dd9126244697389e4000bb39d85
parent d7acfc22fbc0fba467d82f41c90aab7d61f8d751
author Stefan Naewe <stefan.naewe@atlas-elektronik.com> 1429536806 +0200
committer Stefan Naewe <stefan.naewe@atlas-elektronik.com> 1429536806 +0200

> I would have guessed this was maybe caused by trying to traverse
> unreachable recent objects for reachability. It fits case 1 (it is OK
> for us to be missing these objects, but we might accidentally complain),
> and it would probably happen twice during a gc (once for the repack, and
> once for `git prune`).
>
> But that code should not be present in older versions of msysgit, as it
> came in v2.2.0 (and I assume "older msysgit" is v1.9.5).

Not exactly. My msysgit is merge-rebase'd (or rebase-merge'd...)
onto v2.2.0...
I'll try older versions (pre v2.2.0) on linux.

> And if that is
> the problem, it would follow a copy of the repo, but not a clone (though
> I guess if your clone was on the local filesystem, we blindly hardlink
> the objects, so it might follow there).

I also cloned from local filesystem (Windows drive) to a samba share.

Thanks,
  Stefan

[-- Attachment #2: git-trace.log --]
[-- Type: text/plain, Size: 7749 bytes --]

$ GIT_TRACE=1 git gc
10:21:27.228845 git.c:348 trace: built-in: git 'gc'
10:21:27.228845 run-command.c:347 trace: run_command: 'pack-refs' '--all' '--prune'
10:21:27.244445 git.c:348 trace: built-in: git 'pack-refs' '--all' '--prune'
10:21:27.260045 run-command.c:347 trace: run_command: 'reflog' 'expire' '--all'
10:21:27.275646 git.c:348 trace: built-in: git 'reflog' 'expire' '--all'
10:21:27.338047 run-command.c:347 trace: run_command: 'repack' '-d' '-l' '-A' '--unpack-unreachable=2.weeks.ago'
10:21:27.353647 git.c:348 trace: built-in: git 'repack' '-d' '-l' '-A' '--unpack-unreachable=2.weeks.ago'
10:21:27.353647 run-command.c:347 trace: run_command: 'pack-objects' '--keep-true-parents' '--honor-pack-keep' '--non-empty' '--all' '--reflog' '--indexed-objects' '--unpack-unreachable=2.weeks.ago' '--local' '--delta-base-offset' '.git/objects/pack/.tmp-3852-pack'
10:21:27.384848 git.c:348 trace: built-in: git 'pack-objects' '--keep-true-parents' '--honor-pack-keep' '--non-empty' '--all' '--reflog' '--indexed-objects' '--unpack-unreachable=2.weeks.ago' '--local' '--delta-base-offset' '.git/objects/pack/.tmp-3852-pack'
error: Could not read 7713c3b1e9ea2dd9126244697389e4000bb39d85
10:21:27.915258 run-command.c:347 trace: run_command: 'prune' '--expire' '2.weeks.ago'
10:21:27.930858 git.c:348 trace: built-in: git 'prune' '--expire' '2.weeks.ago'
error: Could not read 7713c3b1e9ea2dd9126244697389e4000bb39d85
10:21:28.196063 run-command.c:347 trace: run_command: 'rerere' 'gc'
10:21:28.211664 git.c:348 trace: built-in: git 'rerere' 'gc'
$

-----------------------------------------------------------------------------

$ git fsck
dangling commit 8a0066c756e13e5f8b02fdab4716bff74de7556e
dangling commit 450111131bb54c2a7426c7c4d07729d96a3a4b08
dangling blob e6c1f0b06d7c2571f27885efb722628a5640f5bb
dangling blob ad026f17ef3ca76c8e6a176b0e9c161820cb55bd
dangling commit bb828e3aea9cd8ca472b7bf84175c2d786c6bca6
dangling blob cf82a18977b340e8d52f0e7d5a27cc95de79083e
dangling commit e822f5fd21cb7c0523d73474d7fd5f7038c53323
dangling blob 132355dd750c3425f566db4df93d11dd5713ee57
dangling blob 3ba352808438fbcb937efbfe297572e5629622be
dangling blob 3f6315e9789309b0b608e130d863b99bbf479ea4
dangling commit 902399eb3d16606940c373340ab287693981094d
dangling blob d6c30db3396fd157f9051068f85e57b9c0eec5d6
dangling blob 3344fa068975a3b5f8bbb00161367fd5cfe75c56
dangling blob 5ea460158d22979898e8c72650c98074d3f9f9d7
dangling blob dd247bd0e183903e80c634e3d97d757902c19829
dangling commit 78859135a46bbde0bfd54add4138d6492cf1509a
dangling blob 84e5b7254a52115ade3f11fcdc39ad3d9480af5a
dangling blob b085bbf3b6a3021fd28ff54f102aef90bae41ffb
dangling blob dac51e999c3820142c4b7ecffc7a8249e1d83c31
dangling commit 8ae6f24c2f6f1e0494a95020131d7c326bc74a4d
dangling blob 93a6270236e39959b4e1a1bb4996a49819bd8bb8
dangling blob 7ee7c42e98e64a253e5183b7a0fb0d5fe8ac62d4
dangling blob a1e7795049e459db59cb897422d6b5b9284f02c4
dangling commit 1508fa2bb1beba32ce97d9fe52ba41f162cd2a52
dangling blob 35e85b19da57331a6dff5da3f98dbcba00b1f5cf
dangling commit 3d8856caf448bf6fd8f255531cd242e937cfe60e
dangling blob 88e898403686622bee2f5a56e1dcabd3f3e76d7f
dangling commit fb68819d653555456a1c8a69adfed59b8714fdcb
dangling commit 66095370ee7c01f97e55bbafb46173d46324d566
dangling blob 7109dbbb65382961bad78e96e759dc10c6089893
dangling blob 432a66e07bc0c72b11666ca428e4c144664b8d9d
dangling commit ae4a91c13b448c1e0df3ea5a490eaadd295b775d
dangling blob 212bcc5073d3f15d554c57c586607b176999fa6e
dangling blob 374b9ffcec1d3c0526ac280b3e23019995577229
dangling blob 75abddbc662d2ad4116cfd1d1b4652a28cdf7516
dangling commit 136c216120c7c6d562e67ef6a26e118f3d377d75
dangling blob b48c13bf7ab920fe9d6dae3dd4960616e146e2c4
dangling blob 12cd4bf13c90918841acde16a84c129f4fe4c1cb
dangling blob 448e6eca0bdd9070628fe98a3b13af3542eeb190
dangling blob af8e10f66a96920f9caf7abc7e778a8ed25b7e7f
dangling blob 2f2fe720093863e455fd350982d22013268db4ab
dangling blob 786fd76d6e90bd0f812685bcf21308f469cff806
dangling commit b8cf8a7c3382ae243e3691de437c179f6fe5b109
dangling commit deef19eb8b606899d18a77e5671569ee422e5c74
dangling blob c01002da12138b843efc3794ae057828d57713ff
dangling blob 6e31e31e214628f5cfb208b09f4c8cb6e0a7efd7
dangling commit c01168eb4f10937347035388e82c34b36b5e757a
dangling commit c5f1e1f2a04bd46f0f0017be6dbfe39170dacd37
dangling commit c671700332dac70cb8f02b49d49085635db829b7
dangling commit c7b16704a89c5e6dff26251c5c3c3596643a4765
dangling blob 13d2cdca94567550f5720d7271404d5cf1465b76
dangling blob 7a9216bd39c0249267da24e374b93b0bef7bab3a
dangling blob e272943ad649011749d99a102f75cbb2cfe0481c
dangling blob e412a8d9f91ac5ab621c607e9e79dd070f0575ea
dangling blob 15d3e1c23d11963494f1fddffdfbe8d82281ebe7
dangling blob dcb3ece673bd71b2a7cd797863ab05b9f55d982a
dangling blob e893edce44198fc3ed7d2a0a800addb9f7fcdd00
dangling commit 2c74dba4cb87622bf281fea95495580ed7169f7f
dangling commit 38349ee95d761badfb6ef86899744ffa7aa6dc4d
dangling commit 3b348f3eae0e14cdaf792ea11334f1ea20b8ac21
dangling commit 5874ff569260833b7da14565c9a04172f45bea34
dangling blob 86d47d16d7cf6b460ff7f2d821845add385b168a
dangling blob 8e14d0376ae4c963654038ab8552c14a88c82f19
dangling commit 9674dae0470e1659d08d7258fc87ccb65469c377
dangling commit a894434804e644d52f18eac6564b5470c355a0fd
dangling commit fdf458b7ffa6f81038988818f6bdb148ddbd2747
dangling blob 1a154456fd202678d09c8fe3488a7ff5eb128f0e
dangling commit 315500c96a705a19bda551b552bb4af76a5423b9
dangling blob 35d541a16983732fb33205711a4ae275659f35bf
dangling blob a7354d59cd449515daf1da563d4a81b0511e43f7
dangling commit bfb5538171c1a00a09b979005a032400b8be6c11
dangling blob da35f034d130a0d5a7864b243e919c2885af7dcb
dangling commit 25365b87e74792bd278afec51afdfc74ef378301
dangling blob 9f36254fdae6ed8e543d8ada1dfeb7273322421f
dangling commit 82375e755bebab8b4a5c34cc227fdfbcaf587878
dangling commit 1f78f76aeb42cdd5e7dd5201eb46f11b1794b959
dangling commit 4418d11dc76a4eb0327c2f182c40584774f1fa27
dangling commit 925850b1e58b665bd5df508eb7cb99eedae8b472
dangling commit a138760e4b7c324fd54e52998ab19ed4c4106064
dangling blob de788db974ca67606102f301cbc9926a55e1d9a2
dangling blob 0d7943f697e316b9d97ff89b0efc341425205426
dangling commit 0e999989958b7b7a9e5718500bc0d2122e5155ac
dangling blob 2d59c05f686f88684c65359e2af46b0a2e2c1385
dangling blob 96591f7dc1e9f6d1b540884c2201da25b4435611
dangling blob b239ff3b282c9d45d5e901835f217dc28b949b32
dangling blob df7968cdc5d56584641f611f6f907727048d72cf
dangling blob 1f5a1377e8f920e9392696f1187886b85a65c794
dangling blob 49fbe34371c0ddd391f35ec05c5e6b9166016fa1
dangling blob 599bdfe5070788fc301d3002ee3973383d0b6cf9
dangling commit 5a3b29c6cf9a62252e607d2a367402549b981b67
dangling blob 1edc5a3c2de4a883e0b959a663727bff691707af
dangling blob 81fc0a7f742d6c9c629d22fc3a8eccfc631f2165
dangling commit 3e7ddd2f7b25d6422a6baae129cc4c2a1be09728
dangling blob 90dd4cf5ae9c6d9b8eefff04099ec76093622738
dangling blob 9b3ddb9a8e5ac2adbf0f2c3b40e9c93f95b3770d
dangling commit e67d527c84eeb095ab8a627cc0d369e49708af1b
dangling blob 061e173ee667f82b952f57998fc7666420cbeca5
dangling commit 3bbef5a498d6fbfe34a5f06828da1f9b1f26990d
dangling commit 6e7e6d49490c88a3c974f80baa94a505453b3b87
dangling blob 87be2bcf0238fc64671dfb4d3c56b5116f8e0939
dangling commit 8fde9bf8b855191936f565f8ce1207da4ff7f70f
dangling blob 967e88babd993b76f9e03916beb28c2bea7accbd
dangling blob c6be8987d0adeac36a53691076950a68440e6b1e
dangling commit d59ed0675627b36d7ad836b810a3ede13cbdee17
dangling blob 48dff406c16f2aa182e0e22f87cd7a4bef600072
dangling commit 4adf7f23bba31cbb7b0628367c51aba0e68359c5
dangling blob 81ff52ca8932c6f5b64737b28e07a18af0056f38
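
Reading the trace by eye works for one run, but the attribution of each
error to a subcommand can also be done mechanically. The following is a
minimal sketch, assuming the trace format shown in the attachment above
(a "trace: built-in: git '<subcommand>' ..." line whenever a built-in
starts). The names attribute_errors and sample_trace are made up for
illustration, and the sample lines are abbreviated from the attachment.

```shell
# Hypothetical helper: prefix each "error:" line in GIT_TRACE=1 output
# with the built-in subcommand that was started most recently.
attribute_errors() {
    # Split fields on single quotes so that $2 is the subcommand name
    # on "trace: built-in: git '<subcommand>' ..." lines.
    awk -F"'" '
        /trace: built-in: git / { current = $2; next }
        /^error: /              { print current ": " $0 }
    '
}

# Abbreviated sample input taken from the trace in the attachment.
sample_trace() {
    cat <<'EOF'
10:21:27.384848 git.c:348 trace: built-in: git 'pack-objects' '--all' '--reflog'
error: Could not read 7713c3b1e9ea2dd9126244697389e4000bb39d85
10:21:27.930858 git.c:348 trace: built-in: git 'prune' '--expire' '2.weeks.ago'
error: Could not read 7713c3b1e9ea2dd9126244697389e4000bb39d85
10:21:28.211664 git.c:348 trace: built-in: git 'rerere' 'gc'
EOF
}

sample_trace | attribute_errors
```

On a real repository the trace and the errors both go to stderr, so the
invocation would be "GIT_TRACE=1 git gc 2>&1 | attribute_errors". For
the sample input this names pack-objects and prune as the two sources of
the message.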

* Re: git gc gives "error: Could not read..."
From: Jeff King @ 2015-06-01 8:52 UTC
  To: Stefan Näwe; +Cc: Git list

On Mon, Jun 01, 2015 at 10:40:53AM +0200, Stefan Näwe wrote:

> Turns out to be a tree:
>
> tree 7713c3b1e9ea2dd9126244697389e4000bb39d85
> parent d7acfc22fbc0fba467d82f41c90aab7d61f8d751
> author Stefan Naewe <stefan.naewe@atlas-elektronik.com> 1429536806 +0200
> committer Stefan Naewe <stefan.naewe@atlas-elektronik.com> 1429536806 +0200

Yeah, I bungled the grep earlier. That message can come from a missing
tag, tree, or commit object. But I think the root cause is the same.

> Not exactly. My msysgit is merge-rebase'd (or rebase-merge'd...) onto v2.2.0...
> I'll try older versions (pre v2.2.0) on linux.

OK, that makes more sense then.

> I also cloned from local filesystem (Windows drive) to a samba share.

And that, too.

I've managed to create a small test case that replicates the problem:

diff --git a/t/t6501-freshen-objects.sh b/t/t6501-freshen-objects.sh
index 157f3f9..015b0da 100755
--- a/t/t6501-freshen-objects.sh
+++ b/t/t6501-freshen-objects.sh
@@ -129,4 +129,19 @@ for repack in '' true; do
 	'
 done
 
+test_expect_failure 'do not complain about existing broken links' '
+	cat >broken-commit <<-\EOF &&
+	tree 0000000000000000000000000000000000000001
+	parent 0000000000000000000000000000000000000002
+	author whatever <whatever@example.com> 1234 -0000
+	committer whatever <whatever@example.com> 1234 -0000
+
+	some message
+	EOF
+	commit=$(git hash-object -t commit -w broken-commit) &&
+	git gc 2>stderr &&
+	verbose git cat-file -e $commit &&
+	test_must_be_empty stderr
+'
+
 test_done

which produces:

  'stderr' is not empty, it contains:
  error: Could not read 0000000000000000000000000000000000000002
  error: Could not read 0000000000000000000000000000000000000001
  error: Could not read 0000000000000000000000000000000000000002
  error: Could not read 0000000000000000000000000000000000000001

Unfortunately the fix is a little bit invasive. I'll send something out
in a few minutes.

-Peff

* Re: git gc gives "error: Could not read..."
From: Stefan Näwe @ 2015-06-01 9:14 UTC
  To: Jeff King; +Cc: Git List

On 01.06.2015 at 10:52, Jeff King wrote:
> On Mon, Jun 01, 2015 at 10:40:53AM +0200, Stefan Näwe wrote:
>
>> Turns out to be a tree:
>>
>> tree 7713c3b1e9ea2dd9126244697389e4000bb39d85
>> parent d7acfc22fbc0fba467d82f41c90aab7d61f8d751
>> author Stefan Naewe <stefan.naewe@atlas-elektronik.com> 1429536806 +0200
>> committer Stefan Naewe <stefan.naewe@atlas-elektronik.com> 1429536806 +0200
>
> Yeah, I bungled the grep earlier. That message can come from a missing
> tag, tree, or commit object. But I think the root cause is the same.

Maybe this one:

  d3038d (prune: keep objects reachable from recent objects)

??

That's what 'git bisect' told me.

>> Not exactly. My msysgit is merge-rebase'd (or rebase-merge'd...) onto v2.2.0...
>> I'll try older versions (pre v2.2.0) on linux.
>
> OK, that makes more sense then.
>
>> I also cloned from local filesystem (Windows drive) to a samba share.
>
> And that, too.
>
> I've managed to create a small test case that replicates the problem:
>
> diff --git a/t/t6501-freshen-objects.sh b/t/t6501-freshen-objects.sh
> index 157f3f9..015b0da 100755
> --- a/t/t6501-freshen-objects.sh
> +++ b/t/t6501-freshen-objects.sh
> @@ -129,4 +129,19 @@ for repack in '' true; do
>  	'
>  done
>
> +test_expect_failure 'do not complain about existing broken links' '
> +	cat >broken-commit <<-\EOF &&
> +	tree 0000000000000000000000000000000000000001
> +	parent 0000000000000000000000000000000000000002
> +	author whatever <whatever@example.com> 1234 -0000
> +	committer whatever <whatever@example.com> 1234 -0000
> +
> +	some message
> +	EOF
> +	commit=$(git hash-object -t commit -w broken-commit) &&
> +	git gc 2>stderr &&
> +	verbose git cat-file -e $commit &&
> +	test_must_be_empty stderr
> +'
> +
>  test_done
>
> which produces:
>
>   'stderr' is not empty, it contains:
>   error: Could not read 0000000000000000000000000000000000000002
>   error: Could not read 0000000000000000000000000000000000000001
>   error: Could not read 0000000000000000000000000000000000000002
>   error: Could not read 0000000000000000000000000000000000000001
>
> Unfortunately the fix is a little bit invasive. I'll send something out
> in a few minutes.

It would be really helpful if you sent the patch as an attachment.
I know that's not the normal workflow but our mail server garbles every
message so that I can't (or don't know how to...) use 'git am' to test
the patch, which I'm willing to do!

Thanks,
  Stefan

* Re: git gc gives "error: Could not read..."
From: Jeff King @ 2015-06-01 9:58 UTC
  To: Stefan Näwe; +Cc: Git List

On Mon, Jun 01, 2015 at 11:14:27AM +0200, Stefan Näwe wrote:

> Maybe this one:
>
>   d3038d (prune: keep objects reachable from recent objects)

Yes, exactly.

> It would be really helpful if you sent the patch as an attachment.
> I know that's not the normal workflow but our mail server garbles every
> message so that I can't (or don't know how to...) use 'git am' to test
> the patch, which I'm willing to do!

It ended up as a patch series. However, you can fetch it from:

  git://github.com/peff/git.git jk/silence-unreachable-broken-links

which is perhaps even easier.

-Peff

* Re: git gc gives "error: Could not read..."
From: Stefan Näwe @ 2015-06-01 10:08 UTC
  To: Jeff King; +Cc: Git List

On 01.06.2015 at 11:58, Jeff King wrote:
> On Mon, Jun 01, 2015 at 11:14:27AM +0200, Stefan Näwe wrote:
>
>> Maybe this one:
>>
>>   d3038d (prune: keep objects reachable from recent objects)
>
> Yes, exactly.
>
>> It would be really helpful if you sent the patch as an attachment.
>> I know that's not the normal workflow but our mail server garbles every
>> message so that I can't (or don't know how to...) use 'git am' to test
>> the patch, which I'm willing to do!
>
> It ended up as a patch series. However, you can fetch it from:
>
>   git://github.com/peff/git.git jk/silence-unreachable-broken-links
>
> which is perhaps even easier.

Not really in my situation...(but that's another story)

I managed to create patch files by simply copy-and-pasting the message
text (and the From:, Date:, and Subject: fields from 'View message
source' in Thunderbird...) which I could then 'git am' ;-)

The patches applied (and compiled) cleanly on v2.4.2 and 'git gc'
stopped giving me the error message.

So (FWIW):

Tested-by: Stefan Naewe <stefan.naewe@gmail.com>

Anything else I could test?

Thanks,
  Stefan

* Re: git gc gives "error: Could not read..."
From: Jeff King @ 2015-06-01 10:22 UTC
  To: Stefan Näwe; +Cc: Git List

On Mon, Jun 01, 2015 at 12:08:27PM +0200, Stefan Näwe wrote:

> > It ended up as a patch series. However, you can fetch it from:
> >
> >   git://github.com/peff/git.git jk/silence-unreachable-broken-links
> >
> > which is perhaps even easier.
>
> Not really in my situation...(but that's another story)

Oh, sorry. :)

> So (FWIW):
>
> Tested-by: Stefan Naewe <stefan.naewe@gmail.com>
>
> Anything else I could test ?

Thanks for confirming (though I was pretty sure it was the problem when
you bisected to the same commit I suspected).

My only open questions are the concerns I raised in the cover letter,
but I don't think there is anything else for you to test there.

-Peff

* [RFC/PATCH 0/3] silence missing-link warnings in some cases
From: Jeff King @ 2015-06-01 9:54 UTC
  To: git; +Cc: Stefan Näwe

Stefan noticed that running "git gc" with a recent version of git causes
some useless complaints about missing objects.

The reason is that since git d3038d2 (prune: keep objects reachable from
recent objects, 2014-10-15), we will traverse objects that are not
reachable but have recent mtimes (within the 2-week prune expiration
window). Because they are not reachable, we may not actually have all of
their ancestors; we use the revs->ignore_missing_links option to avoid
making this a fatal error. But we still print an error message. This
series suppresses those messages.

The first two patches below implement that. The third one gives the same
treatment to UNINTERESTING parents, which we implicitly ignore when they
are missing. I have slightly mixed feelings on this, just because it
could be a clue that there is repo corruption. E.g., if you do:

  git log foo..bar

and we find that "foo^" is missing, it is the only error message you
get.

OTOH, I think the reason we ignore errors with UNINTERESTING parents is
that it does not necessarily mean corruption. E.g., while serving a
fetch, if the client claims to have "x", we check only
"has_sha1_file(x)" before putting the object on the UNINTERESTING side
of our traversal. It might not be reachable at all, but rather just part
of an incomplete segment of unreachable history.

Of course, with modern git (post-d3038d2), we try to avoid getting that
situation in the first place, which means that it _is_ an exceptional
situation, and we should continue to at least print the error message.

Note that post-d3038d2, it is also exceptional to see this in the
ignore_missing_links cases. The reason Stefan is seeing it is probably
that the repo was pruned in the past 2 weeks by an older version of git
(so it removed an older "x^", but kept "x"; whereas modern git would
keep both).

So yet another possibility is to scrap this whole series. Within 2 weeks
the problem will magically go away on its own, or sooner if the user
runs "git prune".

  [1/3]: add quieter versions of parse_{tree,commit}
  [2/3]: silence broken link warnings with revs->ignore_missing_links
  [3/3]: suppress errors on missing UNINTERESTING links

-Peff
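
The cover letter's reasoning rests on the 2-week window: an unreachable
object with a recent mtime is kept and traversed, while its older
ancestors may already have been pruned. As a rough illustration only,
the sketch below lists loose objects whose files were modified within a
given number of days. It is an approximation under stated assumptions:
it uses find's whole-day -mtime granularity rather than git's own expiry
parsing, it looks only at loose objects and ignores packs, and the
function name recent_loose_objects is hypothetical.

```shell
# Hypothetical helper (not part of git): print the names of loose
# objects under a .git directory modified within the last N days
# (default 14, mirroring the 2-week window discussed above).
recent_loose_objects() {
    gitdir=$1
    days=${2:-14}
    # Loose objects live at objects/<2 hex digits>/<38 hex digits>;
    # the sed expression joins both parts into the 40-digit name and
    # drops anything else (pack files, info/, and so on).
    find "$gitdir/objects" -type f -mtime "-$days" |
        sed -n 's|.*/\([0-9a-f]\{2\}\)/\([0-9a-f]\{38\}\)$|\1\2|p'
}

# Example invocation from the top of a work tree:
#   recent_loose_objects .git 14
```

Comparing that list with the unreachable objects reported by "git fsck"
gives a rough idea of which objects fall into the recent-but-unreachable
category that the series is concerned with.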

* [PATCH 1/3] add quieter versions of parse_{tree,commit}
From: Jeff King @ 2015-06-01 9:56 UTC
  To: git; +Cc: Stefan Näwe

When we call parse_commit, it will complain to stderr if the object does
not exist or cannot be read. This means that we may produce useless
error messages if this situation is expected (e.g., because the object
is marked UNINTERESTING, or because revs->ignore_missing_links is set).

We can fix this by adding a new "parse_X_gently" form that takes a flag
to suppress the messages. The existing "parse_X" form is already gentle
in the sense that it returns an error rather than dying, and we could in
theory just add a "quiet" flag to it (with existing callers passing
"0"). But doing it this way means we do not have to disturb existing
callers.

Note also that the new flag is "quiet_on_missing", and not just "quiet".
We could add a flag to suppress _all_ errors, but besides being a more
invasive change (we would have to pass the flag down to sub-functions,
too), there is a good reason not to: we would never want to use it.
Missing a linked object is expected in some circumstances, but it is
never expected to have a malformed commit, or to get a tree when we
wanted a commit. We should always complain about these corruptions.
Signed-off-by: Jeff King <peff@peff.net> --- commit.c | 5 +++-- commit.h | 6 +++++- tree.c | 5 +++-- tree.h | 6 +++++- 4 files changed, 16 insertions(+), 6 deletions(-) diff --git a/commit.c b/commit.c index 2d9de80..6e2103c 100644 --- a/commit.c +++ b/commit.c @@ -357,7 +357,7 @@ int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long s return 0; } -int parse_commit(struct commit *item) +int parse_commit_gently(struct commit *item, int quiet_on_missing) { enum object_type type; void *buffer; @@ -370,7 +370,8 @@ int parse_commit(struct commit *item) return 0; buffer = read_sha1_file(item->object.sha1, &type, &size); if (!buffer) - return error("Could not read %s", + return quiet_on_missing ? -1 : + error("Could not read %s", sha1_to_hex(item->object.sha1)); if (type != OBJ_COMMIT) { free(buffer); diff --git a/commit.h b/commit.h index ed3a1d5..9a1fa96 100644 --- a/commit.h +++ b/commit.h @@ -59,7 +59,11 @@ struct commit *lookup_commit_reference_by_name(const char *name); struct commit *lookup_commit_or_die(const unsigned char *sha1, const char *ref_name); int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long size); -int parse_commit(struct commit *item); +int parse_commit_gently(struct commit *item, int quiet_on_missing); +static inline int parse_commit(struct commit *item) +{ + return parse_commit_gently(item, 0); +} void parse_commit_or_die(struct commit *item); /* diff --git a/tree.c b/tree.c index 58ebfce..413a5b1 100644 --- a/tree.c +++ b/tree.c @@ -204,7 +204,7 @@ int parse_tree_buffer(struct tree *item, void *buffer, unsigned long size) return 0; } -int parse_tree(struct tree *item) +int parse_tree_gently(struct tree *item, int quiet_on_missing) { enum object_type type; void *buffer; @@ -214,7 +214,8 @@ int parse_tree(struct tree *item) return 0; buffer = read_sha1_file(item->object.sha1, &type, &size); if (!buffer) - return error("Could not read %s", + return quiet_on_missing ? 
-1 : + error("Could not read %s", sha1_to_hex(item->object.sha1)); if (type != OBJ_TREE) { free(buffer); diff --git a/tree.h b/tree.h index d24125f..d24786c 100644 --- a/tree.h +++ b/tree.h @@ -16,7 +16,11 @@ struct tree *lookup_tree(const unsigned char *sha1); int parse_tree_buffer(struct tree *item, void *buffer, unsigned long size); -int parse_tree(struct tree *tree); +int parse_tree_gently(struct tree *tree, int quiet_on_missing); +static inline int parse_tree(struct tree *tree) +{ + return parse_tree_gently(tree, 0); +} void free_tree_buffer(struct tree *tree); /* Parses and returns the tree in the given ent, chasing tags and commits. */ -- 2.4.2.690.g2a79674 ^ permalink raw reply related [flat|nested] 15+ messages in thread
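Editorial sketch (not part of the patch): the "gently" pattern described in the commit message above can be tried in isolation. Everything named toy_* below is hypothetical and stands in for git's real object and error machinery. The sketch only illustrates the two properties the message argues for: the noisy form keeps its old signature, and quiet_on_missing silences a missing object but never a malformed one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a git object; not git's real struct. */
struct toy_object {
	const char *name;    /* printable identifier */
	const char *content; /* NULL models an object missing from the store */
	int parsed;
};

/* Counts messages written to stderr so a caller can observe the noise. */
static int toy_error_count;

static int toy_error(const char *msg, const char *name)
{
	toy_error_count++;
	fprintf(stderr, "error: %s %s\n", msg, name);
	return -1;
}

/*
 * Returns 0 on success and -1 on failure. A missing object is reported
 * only when quiet_on_missing is 0; an empty (malformed) object is
 * always reported, whatever the flag says.
 */
static int toy_parse_gently(struct toy_object *item, int quiet_on_missing)
{
	assert(item != NULL);
	if (item->parsed)
		return 0;
	if (!item->content)
		return quiet_on_missing ? -1 :
			toy_error("Could not read", item->name);
	if (!*item->content)
		return toy_error("Malformed object", item->name);
	item->parsed = 1;
	return 0;
}

/* The noisy form keeps its old signature, so existing callers are untouched. */
static inline int toy_parse(struct toy_object *item)
{
	return toy_parse_gently(item, 0);
}
```

A caller that expects gaps would use toy_parse_gently(obj, 1) and still check the return value. Every other caller keeps using toy_parse(obj) unchanged.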
* [PATCH 2/3] silence broken link warnings with revs->ignore_missing_links
From: Jeff King @ 2015-06-01 9:56 UTC
To: git; +Cc: Stefan Näwe

We set revs->ignore_missing_links to instruct the
revision-walking machinery that we know the history graph
may be incomplete. For example, we use it when walking
unreachable but recent objects; we want to add what we can,
but it's OK if the history is incomplete.

However, we still print error messages for the missing
objects, which can be confusing. This is not an error, but
just a normal situation when transitioning from a repository
last pruned by an older git (which can leave broken segments
of history) to a more recent one (where we try to preserve
whole reachable segments).

Signed-off-by: Jeff King <peff@peff.net>
---
 list-objects.c             |  2 +-
 revision.c                 |  2 +-
 t/t6501-freshen-objects.sh | 15 +++++++++++++++
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/list-objects.c b/list-objects.c
index 2a139b6..41736d2 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -81,7 +81,7 @@ static void process_tree(struct rev_info *revs,
 		die("bad tree object");
 	if (obj->flags & (UNINTERESTING | SEEN))
 		return;
-	if (parse_tree(tree) < 0) {
+	if (parse_tree_gently(tree, revs->ignore_missing_links) < 0) {
 		if (revs->ignore_missing_links)
 			return;
 		die("bad tree object %s", sha1_to_hex(obj->sha1));
diff --git a/revision.c b/revision.c
index 7ddbaa0..29e5143 100644
--- a/revision.c
+++ b/revision.c
@@ -844,7 +844,7 @@ static int add_parents_to_list(struct rev_info *revs, struct commit *commit,
 	for (parent = commit->parents; parent; parent = parent->next) {
 		struct commit *p = parent->item;
 
-		if (parse_commit(p) < 0)
+		if (parse_commit_gently(p, revs->ignore_missing_links) < 0)
 			return -1;
 		if (revs->show_source && !p->util)
 			p->util = commit->util;
diff --git a/t/t6501-freshen-objects.sh b/t/t6501-freshen-objects.sh
index 157f3f9..2adf825 100755
--- a/t/t6501-freshen-objects.sh
+++ b/t/t6501-freshen-objects.sh
@@ -129,4 +129,19 @@ for repack in '' true; do
 	'
 done
 
+test_expect_success 'do not complain about existing broken links' '
+	cat >broken-commit <<-\EOF &&
+	tree 0000000000000000000000000000000000000001
+	parent 0000000000000000000000000000000000000002
+	author whatever <whatever@example.com> 1234 -0000
+	committer whatever <whatever@example.com> 1234 -0000
+
+	some message
+	EOF
+	commit=$(git hash-object -t commit -w broken-commit) &&
+	git gc 2>stderr &&
+	verbose git cat-file -e $commit &&
+	test_must_be_empty stderr
+'
+
 test_done
-- 
2.4.2.690.g2a79674
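Editorial sketch (not part of the patch): a toy graph walk shows what the change above is after. All toy_* names are hypothetical, and the control flow is deliberately simpler than git's revision machinery, where the caller of add_parents_to_list decides what to do with the error. The point is only that one flag now governs both whether a missing link is fatal and whether it is reported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_MAX_PARENTS 2

/* Hypothetical commit node; present == 0 models a missing object. */
struct toy_commit {
	const char *name;
	int present;
	int seen;
	struct toy_commit *parents[TOY_MAX_PARENTS];
};

struct toy_walk {
	int ignore_missing_links; /* caller knows history may be incomplete */
	int visited;              /* commits successfully walked */
	int errors_printed;       /* messages written to stderr */
};

/* Returns 0 if the commit can be read, -1 otherwise; noisy unless told to be quiet. */
static int toy_load(struct toy_walk *w, struct toy_commit *c, int quiet_on_missing)
{
	assert(w != NULL && c != NULL);
	if (c->present)
		return 0;
	if (!quiet_on_missing) {
		w->errors_printed++;
		fprintf(stderr, "error: Could not read %s\n", c->name);
	}
	return -1;
}

/* Returns 0 if the walk completed, -1 if a missing link was fatal. */
static int toy_walk_from(struct toy_walk *w, struct toy_commit *c)
{
	int i;

	if (!c || c->seen)
		return 0;
	/* The same flag decides both quietness and whether to carry on. */
	if (toy_load(w, c, w->ignore_missing_links) < 0)
		return w->ignore_missing_links ? 0 : -1;
	c->seen = 1;
	w->visited++;
	for (i = 0; i < TOY_MAX_PARENTS; i++)
		if (toy_walk_from(w, c->parents[i]) < 0)
			return -1;
	return 0;
}
```

With ignore_missing_links set, a walk over a tip whose second parent is missing visits what it can and prints nothing. With the flag clear, the same walk prints one error and fails.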
* [PATCH 3/3] suppress errors on missing UNINTERESTING links
From: Jeff King @ 2015-06-01 9:56 UTC
To: git; +Cc: Stefan Näwe

When we are traversing commit parents along the
UNINTERESTING side of a revision walk, we do not care if
the parent turns out to be missing. That lets us limit
traversals using unreachable and possibly incomplete
sections of history. However, we do still print error
messages about the missing commits; this patch suppresses
the error, as well.

Signed-off-by: Jeff King <peff@peff.net>
---
 revision.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index 29e5143..0b322b4 100644
--- a/revision.c
+++ b/revision.c
@@ -817,7 +817,7 @@ static int add_parents_to_list(struct rev_info *revs, struct commit *commit,
 			parent = parent->next;
 			if (p)
 				p->object.flags |= UNINTERESTING;
-			if (parse_commit(p) < 0)
+			if (parse_commit_gently(p, 1) < 0)
 				continue;
 			if (p->parents)
 				mark_parents_uninteresting(p);
-- 
2.4.2.690.g2a79674
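Editorial sketch (not part of the patch): the UNINTERESTING case can be modelled with a single-parent chain. The toy_* names and the one-parent simplification are assumptions made for brevity, not git's data structures. As in the hunk above, a commit is flagged before any attempt to read it, and an unreadable commit simply ends the marking without a message.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_UNINTERESTING 1u

/* Hypothetical commit node with a single parent to keep the sketch short. */
struct toy_node {
	int present;             /* 0 models a commit missing from the store */
	unsigned flags;
	struct toy_node *parent;
};

/*
 * Flags the ancestors of commit as uninteresting and returns how many
 * were newly flagged. A missing ancestor is still flagged, but its own
 * parents cannot be read, so the marking stops there silently.
 */
static int toy_mark_parents_uninteresting(struct toy_node *commit)
{
	int marked = 0;
	struct toy_node *p;

	assert(commit != NULL);
	for (p = commit->parent; p; p = p->parent) {
		if (p->flags & TOY_UNINTERESTING)
			break; /* already handled by an earlier pass */
		p->flags |= TOY_UNINTERESTING;
		marked++;
		if (!p->present)
			break; /* nothing more to follow, and nothing to report */
	}
	return marked;
}
```

The design choice mirrored here is that the uninteresting side only limits the walk, so a gap there costs some precision in the boundary but does not make the result wrong.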
* Re: [RFC/PATCH 0/3] silence missing-link warnings in some cases
From: Junio C Hamano @ 2015-06-01 15:03 UTC
To: Jeff King; +Cc: git, Stefan Näwe

Jeff King <peff@peff.net> writes:

> Stefan noticed that running "git gc" with a recent version of git causes
> some useless complaints about missing objects.
>
> The reason is that since git d3038d2 (prune: keep objects reachable from
> recent objects, 2014-10-15), we will traverse objects that are not
> reachable but have recent mtimes (within the 2-week prune expiration
> window). Because they are not reachable, we may not actually have all of
> their ancestors; we use the revs->ignore_missing_links option to avoid
> making this a fatal error. But we still print an error message. This
> series suppresses those messages.

Nice finding. One of us should have thought of this kind of fallout
when we discussed that change, but we apparently failed.

The fixes make sense to me (I haven't carefully read the
implementation, but design/approach explained in the proposed log
messages are very sound), and I think 3/3 is a good thing to do,
too, in the new world order after d3038d2.

Thanks, both.
* Re: [RFC/PATCH 0/3] silence missing-link warnings in some cases
From: Jeff King @ 2015-06-01 15:41 UTC
To: Junio C Hamano; +Cc: git, Stefan Näwe

On Mon, Jun 01, 2015 at 08:03:05AM -0700, Junio C Hamano wrote:

> > The reason is that since git d3038d2 (prune: keep objects reachable from
> > recent objects, 2014-10-15), we will traverse objects that are not
> > reachable but have recent mtimes (within the 2-week prune expiration
> > window). Because they are not reachable, we may not actually have all of
> > their ancestors; we use the revs->ignore_missing_links option to avoid
> > making this a fatal error. But we still print an error message. This
> > series suppresses those messages.
>
> Nice finding. One of us should have thought of this kind of fallout
> when we discussed that change, but we apparently failed.

I think the real culprit is that this should have been added along with
ignore_missing_links in the first place. That came along with the bitmap
code, but I was too busy focusing on the hard problems there to notice. :)

> The fixes make sense to me (I haven't carefully read the
> implementation, but design/approach explained in the proposed log
> messages are very sound), and I think 3/3 is a good thing to do,
> too, in the new world order after d3038d2.

I think it's rather the opposite. In a post-d3038d2 world, a missing
object is _more_ likely to be a real corruption, and we would probably
prefer to complain about it. I am on the fence though.

-Peff
* Re: [RFC/PATCH 0/3] silence missing-link warnings in some cases
From: Junio C Hamano @ 2015-06-01 16:11 UTC
To: Jeff King; +Cc: git, Stefan Näwe

Jeff King <peff@peff.net> writes:

>> The fixes make sense to me (I haven't carefully read the
>> implementation, but design/approach explained in the proposed log
>> messages are very sound), and I think 3/3 is a good thing to do,
>> too, in the new world order after d3038d2.
>
> I think it's rather the opposite. In a post-d3038d2 world, a missing
> object is _more_ likely to be a real corruption, and we would probably
> prefer to complain about it. I am on the fence though.

Sorry, but I wasn't talking about that far in the future. In the
immediate future that necessitates patches 1 and 2, a warning on
such a missing object from the codepath in 3 would be equally
annoying noise, no?

And a purely post-d3038d2 world, all of these warnings may be
pointing at a real corruption, as you referred to as "yet another
possibility". As you said, these should have been part of
ignore-missing-links, so I'd say we should treat the codepaths that
special case the callers that pass that option the same way.

Having said all that, I do not think it is healthy to assume that
pre-d3038d2 prune is the only thing that may leave an incomplete and
unreachable island of objects in the repository (two easy ways to do
so are to interrupt unpack-objects or the commit walker dumb fetch).
So from that point of view, these three patches are reasonable
things to keep even in the longer term (in other words, I do not
think "yet another possibility" of waiting for the older versions of
gc/prune to die out is a viable solution to the issue Stefan Näwe
noticed).

Thanks.