From: shejialuo <shejialuo@gmail.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Jeff King <peff@peff.net>
Subject: Re: [PATCH v3 1/3] packed-backend: fsck should allow an empty "packed-refs" file
Date: Mon, 12 May 2025 20:25:06 +0800
Message-ID: <aCHoovrKiSUemBCL@ArchLinux>
In-Reply-To: <aCGzIlLH_ESNg6-v@pks.im>

On Mon, May 12, 2025 at 10:36:50AM +0200, Patrick Steinhardt wrote:
> On Sun, May 11, 2025 at 10:01:43PM +0800, shejialuo wrote:
> > During fsck, an empty "packed-refs" gives an error; this is unwarranted.
> > We should just skip checking the content of "packed-refs" just like the
> > runtime code paths such as "create_snapshot" which simply returns the
> > "snapshot" without checking the content of "packed-refs".
> 
> I think this doesn't quite answer the question whether this is a _good_
> idea though. The question that we need to answer is whether there are
> any writing code paths that may end up writing a "packed-refs" file that
> is completely empty. Modern Git would at least write the packed-refs
> header, wouldn't it?
> 

That's right. In the current codebase, we always write the header, which
can easily be verified with the following commands:

    git init repo
    cd repo && git pack-refs
    cat .git/packed-refs

And "write_with_updates" in "packed-backend.c" always writes the
header.

> The reason why I'm a little sceptical is that there is a common problem
> with ext4 caused by its delayed allocation [1]. If you:
> 
>   1. Write data to a temporary file.
>   2. Rename the file into place.
>   3. The host system crashes.
> 
> Then it may happen that the renamed file is now completely empty.
> 
> The root cause is a bug in the application: before renaming the file
> into place it _must_ fsync the file to disk. Git does that by default,
> but it is extremely easy to get wrong and we had bugs around this until
> ~2 years ago, if I remember correctly. We hit the problem several times
> in our production systems.
> 

I see. I agree that in most situations, an empty "packed-refs" file
means that there is an issue.
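
For readers unfamiliar with the pattern, the write/fsync/rename sequence
described above can be sketched in C as follows. This is only a minimal
illustration and not Git's actual implementation: the function name
"write_file_durably" and the ".tmp" suffix are hypothetical, EINTR is not
retried, and the containing directory is not synced.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not Git's code): replace `path` with `len` bytes
 * from `data` by writing a temporary file, flushing it to disk, and only
 * then renaming it into place. Returns 0 on success, -1 on error.
 */
static int write_file_durably(const char *path, const char *data, size_t len)
{
	char tmp[4096];
	int n, fd;

	/* Assumed naming scheme for the temporary file: "<path>.tmp". */
	n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;

	fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0666);
	if (fd < 0)
		return -1;

	/* Step 1: write the data to the temporary file. */
	while (len > 0) {
		ssize_t written = write(fd, data, len);
		if (written < 0)
			goto fail;
		data += written;
		len -= (size_t)written;
	}

	/*
	 * The step that is easy to forget: without fsync(), a crash after
	 * the rename may leave a zero-length file at `path`.
	 */
	if (fsync(fd) < 0)
		goto fail;
	if (close(fd) < 0) {
		unlink(tmp);
		return -1;
	}

	/* Step 2: rename the fully written file into place. */
	if (rename(tmp, path) < 0) {
		unlink(tmp);
		return -1;
	}
	return 0;

fail:
	close(fd);
	unlink(tmp);
	return -1;
}
```

A caller would use it as write_file_durably("some-file", buf, len) and
treat a non-zero return value as a failed update.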

> So I wonder whether ignoring empty files would cause us to miss such a
> common error. But I guess if there are valid cases where we may end up
> with an empty "packed-refs" file we cannot do anything about it.
> 

I think modern Git always writes the header. But "create_snapshot"
accepts an empty existing "packed-refs" file at runtime.

The header was introduced in 694b7a1999 (repack_without_ref(): write peeled
refs in the rewritten file, 2013-04-22). As of that commit, we always
write the header into the "packed-refs" file.

But at runtime, we accept an empty file, or a file without a header, in
order to stay compatible. In my humble opinion, we should allow an empty
file for now. Then, in Git 3.0, we can tighten all the rules (there must
always be a header, etc.) and also update the runtime behavior.
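
The proposed policy can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the code in the patch: the
enum, the function names and the "strict" flag are hypothetical, and the
header prefix "# pack-refs with:" is assumed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of the content of a "packed-refs" file. */
enum packed_refs_check {
	PACKED_REFS_OK,        /* starts with the header */
	PACKED_REFS_EMPTY,     /* zero-length file */
	PACKED_REFS_NO_HEADER  /* has content, but no header line */
};

/* Assumed header prefix; the full header line also lists traits. */
static const char header_prefix[] = "# pack-refs with:";

static enum packed_refs_check classify_packed_refs(const char *buf, size_t len)
{
	size_t prefix_len = sizeof(header_prefix) - 1;

	if (!len)
		return PACKED_REFS_EMPTY;
	if (len >= prefix_len && !memcmp(buf, header_prefix, prefix_len))
		return PACKED_REFS_OK;
	return PACKED_REFS_NO_HEADER;
}

/*
 * Returns non-zero when fsck should report an error.
 *
 * strict == 0: today's lenient behavior, matching what the runtime
 *              accepts (an empty or header-less file is not an error).
 * strict != 0: the tightened rules suggested for Git 3.0 (a header must
 *              always be present).
 */
static int is_fsck_error(enum packed_refs_check check, int strict)
{
	if (check == PACKED_REFS_OK)
		return 0;
	return strict != 0;
}
```

With strict set to 0, an empty file passes, which matches what
"create_snapshot" accepts at runtime; with strict set to 1, both an empty
file and a file without a header are rejected.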

> Patrick
> 
> [1]: https://thunk.org/tytso/blog/2009/03/12/delayed-allocation-and-the-zero-length-file-problem/

Thanks,
Jialuo
