From: shejialuo <shejialuo@gmail.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Karthik Nayak <karthik.188@gmail.com>,
Michael Haggerty <mhagger@alum.mit.edu>,
Junio C Hamano <gitster@pobox.com>
Subject: Re: [RFC] Implement consistency check for packed refs
Date: Tue, 17 Dec 2024 19:44:56 +0800 [thread overview]
Message-ID: <Z2FkOO9bl-IS_era@ArchLinux> (raw)
In-Reply-To: <Z2A93X2rxZEnYE76@pks.im>
On Mon, Dec 16, 2024 at 03:49:01PM +0100, Patrick Steinhardt wrote:
> On Mon, Dec 16, 2024 at 09:58:13PM +0800, shejialuo wrote:
> > Hi all,
> >
> > At current, I have already implemented the consistency check for files
> > backend. My next step is to implement the consistency check for packed
> > refs. And during these days, I have learned the source code of the
> > "refs/packed-backend.c".
>
> Great. I'm also starting to work into the direction of consistency
> checks for reftables right now, which requires a couple more changes for
> the reftable library to expose reftable blocks. I'll probably get to it
> early in the next release cycle.
>
I am also happy to see this. We are gradually making things work. It's
good. And maybe we could make a GSoC project "implement consistency check
for reftable backend" if we decide to attend GSoC next year.
> > The current "git-fsck(1)" implicitly checks the packed refs by calling
> > the method `refs/packed-backend.c::packed_refs_lock` which alls
> > `get_snapshot` and `create_snapshot`.
> >
> > In the `create_snapshot` function, it will check some consistency of the
> > "packed-refs" file, if anything is wrong, the program will die.
> > "git-fsck(1)" relies on this to check the basic format correctness for
> > "packed-refs" file. It's not suitable to just call "packed_refs_lock"
> > in the code, we should not die the program.
> >
> > Based on above, I have some ideas to implement the functionality. But
> > before I implement the actual code, I want to share my ideas to the
> > mailing list to get some feedback.
> >
> > There are two ways we could add consistency check for packed refs.
> >
> > 1. We simply read the raw file "packed-refs" file, check each line. Of
> > course, we should also need to check whether the content is sorted.
> > 2. We may reuse the data structure "struct snapshot" to do this. And we
> > call "packed_refs_lock" without calling the "creating_snapshot" instead,
> > we should just check. The reason why we do this is that we could reuse
> > some functions defined in the "packed-backend.c" instead of repeating
> > some logics. However, this way would be more complicated and require
> > more effort.
>
> Hm. I cannot really say much on this. The important part is that you
> have enough information at hand to be able to implement those checks. If
> you have all necessary information in both cases I would recommend to go
> with the one that is simpler.
>
> > However, one thing I am not sure is that should we lock the raw file
> > when checking? From my perspective, we should lock the file because we
> > are checking the consistency of it. If other processes are processing
> > the "packed-refs", we may encounter inconsistency and things would be
> > strange.
>
> The consistency checks may run for an extended amount of time in repos
> with huge number of refs, and locking for the whole duration may easily
> cause problems.
>
> Furthermore, "packed-refs" files are written atomically: the client
> writes a new "packed-refs.lock" file, syncs it to disk and then renames
> it into place. This operation doesn't invalidate any file descriptors
> that refer to the old file and you can continue reading from it, so the
> snapshot would remain consistent in itself. It could of course become
> inconsistent with any loose refs, but that's always the case with the
> "files" backend and something we'll have to accept.
>
> So I don't see any reason why the consistency checks should lock the
> "packed-refs" file.
>
Thanks, I either don't want to lock the file because this would bring
complexity but without much benefit.
> Patrick
prev parent reply other threads:[~2024-12-17 11:44 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-16 13:58 [RFC] Implement consistency check for packed refs shejialuo
2024-12-16 14:49 ` Patrick Steinhardt
2024-12-17 11:44 ` shejialuo [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z2FkOO9bl-IS_era@ArchLinux \
--to=shejialuo@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=karthik.188@gmail.com \
--cc=mhagger@alum.mit.edu \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.