From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Cc: git@vger.kernel.org
Subject: Re: Suggestion: "verify/repair" option for 'git gc'
Date: Thu, 14 Oct 2021 03:19:10 +0200
Message-ID: <87h7dkh04o.fsf@evledraar.gmail.com>
In-Reply-To: <e288dbe1-b7c7-5a2e-5271-404a14de836a@syntevo.com>


On Wed, Oct 13 2021, Alexandr Miloslavskiy wrote:

> Suggestion
> ----------
> 1) It would be nice if 'git gc' had an option to also verify
>    (like 'git fsck') the repo and report corruption. I think that it's
>    a good idea to have it in 'gc' for performance reasons, because
>    'git gc' already reads things.
>
> 2) It would be nice if git could automatically download blobs from
>    the remote if a local blob is corrupted. Maybe this has already been
>    implemented; see story 3 below.
>
> Motivation
> ----------
>
> -- Story 1 --
> Just a few days ago I encountered another secretly broken repo which
> caused some small bugs in the git UI I'm using. The repo mostly worked
> fine, which is why I had no idea that it was corrupted.
>
> My git UI sometimes invokes 'git gc', and if that had detected the
> corruption, I wouldn't have had to spend time hunting the bug in the UI.
>
> Specifically, `git fsck` reports these errors:
>   error: object 0189425cc210555c36383293c468df5da73acc48 is a commit, not a blob
>   error in tree 1d571d7354f99b726bbcc0cb232b3f47846c71a1: broken links
>   error: object 0189425cc210555c36383293c468df5da73acc48 is a commit, not a blob
>   error in tree 2808b286c2a933e88735d97416e29b9514fc6af2: broken links
>   error: object 0189425cc210555c36383293c468df5da73acc48 is a commit, not a blob
>   error in tree 604f6f6c4fbf8da7a593708e863e68f8c5a27d07: broken links
>   error: object 0189425cc210555c36383293c468df5da73acc48 is a commit, not a blob
>   error in tree 6a2c4a5ef0b0ee7aa85d88c3147b7558a6a7c29f: broken links
>
> The repo is not confidential and I could share it if needed.
> I "solved" the problem by cloning a new copy.

I'd be interested in a copy of it; I've been slowly trying to improve
these sorts of corruption cases.
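
In the meantime, the fsck output quoted above is regular enough to
summarize mechanically. Below is a minimal sketch, assuming only the two
message shapes shown above matter ("is a X, not a Y" and "error in tree
...: broken links"); the function name summarize_fsck is hypothetical and
not part of git, and any other fsck message is simply ignored.

```python
import re

# Only the two message shapes quoted above are recognized (an assumption);
# every other fsck message is ignored by this sketch.
TYPE_ERR = re.compile(r"error: object ([0-9a-f]{40}) is a (\w+), not a (\w+)")
TREE_ERR = re.compile(r"error in tree ([0-9a-f]{40}): broken links")


def summarize_fsck(output):
    """Group fsck errors by mistyped object and by tree with broken links."""
    # Collapse all whitespace so messages wrapped across lines still match.
    text = " ".join(output.split())

    mistyped = {}
    for oid, actual, expected in TYPE_ERR.findall(text):
        mistyped[oid] = (actual, expected)

    broken_trees = []
    for tree in TREE_ERR.findall(text):
        if tree not in broken_trees:
            broken_trees.append(tree)

    return {"mistyped": mistyped, "broken_trees": broken_trees}
```

Fed the report above, this would show a single mistyped object referenced
from four trees, which is easier to reason about than the repeated lines.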

> -- Story 2 --
> A few years ago, I had another repo that hadn't been used for a couple
> of years and had corrupted blobs. The repo looked fine until I tried to clone
> from it. Unfortunately it was the only copy and I had to write some
> code to "guess" the blob's contents to repair the repo.
>
> If 'git gc' had detected the corruption, I would have known about the
> problem earlier, when I still had other copies around.

I wonder whether this and the other issues you encountered really need a
full "fsck", or whether gc merely triggering a complete repack would have
been enough to surface them. Which is not to say that some regular
background "fsck" wouldn't be a good idea...
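
Until something like that exists, a caller such as a git UI can
approximate it by running the two commands back to back. A rough sketch,
assuming a non-zero exit status from "git fsck" means errors were found
and that skipping "git gc" in that case is the desired policy (that policy
is my assumption, not something git does); verify_then_gc and its "run"
parameter are hypothetical names used only for illustration.

```python
import subprocess


def verify_then_gc(repo, run=subprocess.run):
    """Run `git fsck` in repo; run `git gc` only if fsck exited with 0.

    Returns (ok, stderr_text). `run` is injectable so the policy can be
    exercised without a real repository.
    """
    fsck = run(["git", "-C", repo, "fsck"], capture_output=True, text=True)
    if fsck.returncode != 0:
        # Assumed policy: leave a possibly corrupt repo untouched and
        # hand the fsck report back to the caller.
        return False, fsck.stderr

    gc = run(["git", "-C", repo, "gc"], capture_output=True, text=True)
    return gc.returncode == 0, gc.stderr
```

This reads the object store twice, so it does not address the performance
argument for doing the check inside gc; it only gives earlier warning.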

> -- Story 3 --
> Also a few years ago, I had a repo with a single corrupted blob. I don't
> remember why, but simply re-cloning it was a headache. I managed to fix the
> repo by issuing a command to re-download the blob from the remote. Git could
> totally do that itself, I think.

Yes, we still definitely have cases where dealing with this sort of
thing can be very painful.

