git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Cc: git@vger.kernel.org
Subject: Re: Suggestion: "verify/repair" option for 'git gc'
Date: Thu, 14 Oct 2021 17:17:23 +0200	[thread overview]
Message-ID: <87czo7haha.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <96bf2eff-f4c8-cae8-76cb-6eeb233cd1d3@syntevo.com>


On Thu, Oct 14 2021, Alexandr Miloslavskiy wrote:

> On 14.10.2021 4:19, Ævar Arnfjörð Bjarmason wrote:
>> I'd be interested in a copy of it, I've been slowly trying to improve
>> these sorts of corruption cases.
>
> Sent.

Thanks, I can't promise I'll take a look at it in detail time soon, but
I was going to loop back to checking out these corruption cases at some
point.

>> I wonder if this and other issues you encountered wouldn't need a full
>> "fsck", but merely gc triggering a complete repack.
>
> That sounds slow :( For example, it's going to be a lot of disk write
> bandwidth. Just doing the verification along with regular gc sounds
> faster.

Having looked at your repo the immedite issue is that you've got a tree
in it that has a (manually crafted?) entry that points to a commit
object, without the relevant mode being correct:

    $ git cat-file -p 1d571d7354f99b726bbcc0cb232b3f47846c71a1
    100644 blob 0189425cc210555c36383293c468df5da73acc48    common.mak
    040000 tree 6a2c4a5ef0b0ee7aa85d88c3147b7558a6a7c29f    include

Was this created with git itself, or some tool that's manually crafting
trees? I.e. that the object on-disk has the exact expected content but
is just bad in this particular way points to corruption in git or
another tool writing the data, not e.g. FS corruption or a bit-flip.

Anyway, getting back on track the "gc" command actually does do exactly
what you're suggesting:

    $ git gc; echo $?
    error: object 0189425cc210555c36383293c468df5da73acc48 is a commit, not a blob
    fatal: entry 'common.mak' in tree 1d571d7354f99b726bbcc0cb232b3f47846c71a1 has blob mode, but is not a blob
    fatal: failed to run repack
    128

The problem is that as a user you won't have seen that because we won't
get that far without running into the gc.auto limit, then it would have
run into that, and you'd have had the contents of gc.log spewed at you
by other commands.

So maybe we should be more aggressive there, e.g. as a function of repo
size or whatever (this repo is 18MB).

You really don't need "git fsck" to verify FS corruption or basic object
graph issues like these, and I think it's rather unfortunate that we
expose it like that.

What it does over and beyond a full repack of the repo is to
exhaustively verify object contents, which is most useful e.g. if you're
running a git service and want to prevent users from pushing crafted
corrupt objects, either intentionally or unintentionally.

I've also been meaning to look at that aspect of it for a while, i.e. it
should be able to have some --fast, it has --connectivity-only, but that
one goes a bit too far, although in this case it would have helped you.

>> Yes, we still definitely have cases where dealing with this sort of
>> thing can be very painful.
>
> With the new remote promisor code, I think that auto-fixing corrupted
> blobs is easy enough (provided they can be found on any remote) ?

Hypothetically, but these blobs aren't corrupted, and no amount of
fetching something from other places is going to fix a bad DAG. If that
thing didn't really point at the wrong object type the hash would be
different. The problem is that it was wrong when it was written.

I say "hypothetically" because even in the case of a bitflip or whatever
coercing git into some sort of auto-repair mode is pretty far off. I've
been able to do it manually in some cases, but e.g. promisors not having
a blob *and knowing that* is very different from the more general cases
of object(s) XYZ being corrupt.

We have well-intentioned features like the collision detection that
actively gets in the way of some repairs like that (I had a patch[1] to
disable it, which as a side-effect made recovering from some forms of
corruption easier).

But even without that you'll find that e.g. if a recent object is bad,
and we'd like to fetch it from upstream, that we're just going to die
pretty early, as none of the code involved in say incremental fetching
is prepared to run across a bad/corrupt object.

Those aren't inherent problems, and it would be very nice to have more
such auto-repair in git, just limitations of the current implementation.

1. https://lore.kernel.org/git/20181028225023.26427-5-avarab@gmail.com/ 

  reply	other threads:[~2021-10-14 15:50 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-13 15:47 Suggestion: "verify/repair" option for 'git gc' Alexandr Miloslavskiy
2021-10-14  1:19 ` Ævar Arnfjörð Bjarmason
2021-10-14 12:47   ` Alexandr Miloslavskiy
2021-10-14 15:17     ` Ævar Arnfjörð Bjarmason [this message]
2021-10-14 20:23       ` Alexandr Miloslavskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87czo7haha.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=alexandr.miloslavskiy@syntevo.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).