git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Taylor Blau <me@ttaylorr.com>, Garima Singh <garimasigit@gmail.com>
Cc: git@vger.kernel.org, peff@peff.net
Subject: Re: [RFC PATCH 0/1] commit-graph.c: handle corrupt commit trees
Date: Fri, 6 Sep 2019 12:48:36 -0400	[thread overview]
Message-ID: <a2f6d2d9-e7ea-addd-97a0-f05fa3f1be51@gmail.com> (raw)
In-Reply-To: <20190904212121.GB20904@syl.local>

On 9/4/2019 5:21 PM, Taylor Blau wrote:
> Hi Garima,
> 
> On Wed, Sep 04, 2019 at 02:25:55PM -0400, Garima Singh wrote:
>>
>> On 9/3/2019 10:22 PM, Taylor Blau wrote:
>>> Hi,
>>>
>>> I was running some of the new 'git commit-graph' commands, and noticed
>>> that I could consistently get 'git commit-graph write --reachable' to
>>> segfault when a commit's root tree is corrupt.
>>>
>>> I have an extremely-unfinished fix attached as an RFC PATCH below, but I
>>> wanted to get a few thoughts on this before sending it out as a non-RFC.
>>>
>>> In my patch, I simply 'die()' when a commit isn't able to be parsed
>>> (i.e., when 'parse_commit_no_graph' returns a non-zero code), but I
>>> wanted to see if others thought that this was an OK approach. Some
>>> thoughts:
>>
>> I like the idea of completely bailing if the commit can't be parsed too.
>> Only question: Is there a reason you chose to die() instead of BUG() like
>> the other two places in that function? What is the criteria of choosing one
>> over the other?
> 
> I did not call 'BUG' here because 'BUG' is traditionally used to
> indicate an internal bug, e.g., an unexpected state or some such. On the
> other side of that coin, 'BUG' is _not_ used to indicate repository
> corruption, since that is not an issue in the Git codebase, rather in
> the user's repository.
> 
> Though, to be honest, I've never seen that rule written out explicitly
> (maybe if it were to be written somewhere, it could be stored in
> Documentation/CodingGuidelines?). I think that this is some good
> #leftoverbits material.
> 
>>>
>>>    * It seems like we could write a commit-graph by placing a "filler"
>>>      entry where the broken commit would have gone. I don't see any place
>>>      where this is implemented currently, but this seems like a viable
>>>      alternative to not writing _any_ commits into the commit-graph.
>>
>> I would rather we didn't do this cause it will probably kick open the can of
>> always watching for that filler when we are working with the commit-graph.
>> Or do we already do that today? Maybe @stolee can chime in on what we do in
>> cases of shallow clones and other potential gaps in the walk
> 
> Yeah, I think that the consensus is that it makes sense to just die
> here, which is fine by me.

I agree the die() is the best thing to do for now.

If we wanted to salvage as much as possible, then we could use these
corrupt marks and then use the "reverse walk" in compute_generation_numbers()
to mark all commits that can reach the corrupt commit as corrupt.
We would then need to remove all corrupt commits from the list we are
planning to write.

However, that is just hiding a corrupt object in the object database,
which is not a situation we want to leave unnoticed.

Thanks,
-Stolee

      parent reply	other threads:[~2019-09-06 16:48 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-04  2:22 [RFC PATCH 0/1] commit-graph.c: handle corrupt commit trees Taylor Blau
2019-09-04  2:22 ` [RFC PATCH 1/1] commit-graph.c: die on un-parseable commits Taylor Blau
2019-09-04  3:04   ` Jeff King
2019-09-04 21:18     ` Taylor Blau
2019-09-05  6:47       ` Jeff King
2019-09-06 16:48         ` Derrick Stolee
2019-09-06 17:04           ` Jeff King
2019-09-06 17:19             ` Derrick Stolee
2019-09-06 17:20             ` Derrick Stolee
2019-09-05 22:19     ` Junio C Hamano
2019-09-06  6:35       ` Jeff King
2019-09-06  6:56         ` Jeff King
2019-09-06 16:59         ` Junio C Hamano
2019-09-06 17:04           ` Jeff King
2019-09-09 16:39             ` Junio C Hamano
2019-09-09 16:54               ` Jeff King
2019-09-04 18:25 ` [RFC PATCH 0/1] commit-graph.c: handle corrupt commit trees Garima Singh
2019-09-04 21:21   ` Taylor Blau
2019-09-05  6:08     ` Jeff King
2019-09-06 16:48     ` Derrick Stolee [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a2f6d2d9-e7ea-addd-97a0-f05fa3f1be51@gmail.com \
    --to=stolee@gmail.com \
    --cc=garimasigit@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).