git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] align the behavior when opening "packed-refs"
@ 2025-05-06 16:39 shejialuo
  2025-05-06 16:41 ` [PATCH 1/4] packed-backend: skip checking consistency of empty packed-refs file shejialuo
                   ` (4 more replies)
  0 siblings, 5 replies; 67+ messages in thread
From: shejialuo @ 2025-05-06 16:39 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

Hi All:

As discussed in [1], we need to use mmap mechanism to open large
"packed_refs" file to save the memory usage. This patch mainly does the
following things:

1: Fix an issue that we would report an error when the "packed-refs"
file is empty, which does not align with the runtime behavior.
2-4: Extract some logic from the existing code and then use these
created helper functions to let fsck code to use mmap necessarily

[1] https://lore.kernel.org/git/20250503133158.GA4450@coredump.intra.peff.net

Really thank Peff and Patrick to suggest me to do above change.

Thanks,
Jialuo

shejialuo (4):
  packed-backend: skip checking consistency of empty packed-refs file
  packed-backend: extract snapshot allocation in `load_contents`
  packed-backend: extract munmap operation for `MMAP_TEMPORARY`
  packed-backend: use mmap when opening large "packed-refs" file

 refs/packed-backend.c    | 106 +++++++++++++++++++++++----------------
 t/t0602-reffiles-fsck.sh |  13 +++++
 2 files changed, 75 insertions(+), 44 deletions(-)

-- 
2.49.0


^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 1/4] packed-backend: skip checking consistency of empty packed-refs file
  2025-05-06 16:39 [PATCH 0/4] align the behavior when opening "packed-refs" shejialuo
@ 2025-05-06 16:41 ` shejialuo
  2025-05-06 18:42   ` Junio C Hamano
  2025-05-06 19:14   ` Junio C Hamano
  2025-05-06 16:41 ` [PATCH 2/4] packed-backend: extract snapshot allocation in `load_contents` shejialuo
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 67+ messages in thread
From: shejialuo @ 2025-05-06 16:41 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

In "load_contents", when the "packed-refs" is empty, we will just return
the snapshot. However, we would report an error to the user when
checking the consistency of the empty "packed-refs".

We should align with the runtime behavior. As what "load_contents" does,
let's check whether the file size is zero and if so, we will skip
checking the consistency and simply return.

Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c    |  3 +++
 t/t0602-reffiles-fsck.sh | 13 +++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 3ad1ed0787..0dd6c6677b 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -2103,6 +2103,9 @@ static int packed_fsck(struct ref_store *ref_store,
 		goto cleanup;
 	}
 
+	if (!st.st_size)
+		goto cleanup;
+
 	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
 		ret = error_errno(_("unable to read '%s'"), refs->path);
 		goto cleanup;
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 9d1dc2144c..0a9e9ccc55 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -647,6 +647,19 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
 	)
 '
 
+test_expect_success 'empty packed-refs should not be reported' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit default &&
+
+		touch .git/packed-refs &&
+		git refs verify 2>err &&
+		test_must_be_empty err
+	)
+'
+
 test_expect_success 'packed-refs header should be checked' '
 	test_when_finished "rm -rf repo" &&
 	git init repo &&
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 2/4] packed-backend: extract snapshot allocation in `load_contents`
  2025-05-06 16:39 [PATCH 0/4] align the behavior when opening "packed-refs" shejialuo
  2025-05-06 16:41 ` [PATCH 1/4] packed-backend: skip checking consistency of empty packed-refs file shejialuo
@ 2025-05-06 16:41 ` shejialuo
  2025-05-06 19:16   ` Junio C Hamano
  2025-05-06 16:41 ` [PATCH 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY` shejialuo
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 67+ messages in thread
From: shejialuo @ 2025-05-06 16:41 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

"load_contents" would choose which way to load the content of the
"packed-refs". However, we cannot directly use this function when
checking the consistency due to we don't want to open the file. And we
also need to reuse the logic to avoid causing repetition.

Let's create a new helper function "allocate_snapshot_buffer" to extract
the snapshot allocation logic in "load_contents" and update the
"load_contents" to align with the behavior.

Suggested-by: Jeff King <peff@peff.net>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c | 54 +++++++++++++++++++++++++------------------
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 0dd6c6677b..e582227772 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -517,6 +517,32 @@ static int refname_contains_nul(struct strbuf *refname)
 
 #define SMALL_FILE_SIZE (32*1024)
 
+static int allocate_snapshot_buffer(struct snapshot *snapshot, int fd, struct stat *st)
+{
+	ssize_t bytes_read;
+	size_t size;
+
+	size = xsize_t(st->st_size);
+	if (!size)
+		return 0;
+
+	if (mmap_strategy == MMAP_NONE || size <= SMALL_FILE_SIZE) {
+		snapshot->buf = xmalloc(size);
+		bytes_read = read_in_full(fd, snapshot->buf, size);
+		if (bytes_read < 0 || bytes_read != size)
+			die_errno("couldn't read %s", snapshot->refs->path);
+		snapshot->mmapped = 0;
+	} else {
+		snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
+		snapshot->mmapped = 1;
+	}
+
+	snapshot->start = snapshot->buf;
+	snapshot->eof = snapshot->buf + size;
+
+	return 1;
+}
+
 /*
  * Depending on `mmap_strategy`, either mmap or read the contents of
  * the `packed-refs` file into the snapshot. Return 1 if the file
@@ -525,10 +551,9 @@ static int refname_contains_nul(struct strbuf *refname)
  */
 static int load_contents(struct snapshot *snapshot)
 {
-	int fd;
 	struct stat st;
-	size_t size;
-	ssize_t bytes_read;
+	int ret = 1;
+	int fd;
 
 	fd = open(snapshot->refs->path, O_RDONLY);
 	if (fd < 0) {
@@ -550,27 +575,12 @@ static int load_contents(struct snapshot *snapshot)
 
 	if (fstat(fd, &st) < 0)
 		die_errno("couldn't stat %s", snapshot->refs->path);
-	size = xsize_t(st.st_size);
-
-	if (!size) {
-		close(fd);
-		return 0;
-	} else if (mmap_strategy == MMAP_NONE || size <= SMALL_FILE_SIZE) {
-		snapshot->buf = xmalloc(size);
-		bytes_read = read_in_full(fd, snapshot->buf, size);
-		if (bytes_read < 0 || bytes_read != size)
-			die_errno("couldn't read %s", snapshot->refs->path);
-		snapshot->mmapped = 0;
-	} else {
-		snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
-		snapshot->mmapped = 1;
-	}
-	close(fd);
 
-	snapshot->start = snapshot->buf;
-	snapshot->eof = snapshot->buf + size;
+	if (!allocate_snapshot_buffer(snapshot, fd, &st))
+		ret = 0;
 
-	return 1;
+	close(fd);
+	return ret;
 }
 
 static const char *find_reference_location_1(struct snapshot *snapshot,
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY`
  2025-05-06 16:39 [PATCH 0/4] align the behavior when opening "packed-refs" shejialuo
  2025-05-06 16:41 ` [PATCH 1/4] packed-backend: skip checking consistency of empty packed-refs file shejialuo
  2025-05-06 16:41 ` [PATCH 2/4] packed-backend: extract snapshot allocation in `load_contents` shejialuo
@ 2025-05-06 16:41 ` shejialuo
  2025-05-06 18:52   ` Junio C Hamano
  2025-05-06 16:41 ` [PATCH 4/4] packed-backend: use mmap when opening large "packed-refs" file shejialuo
  2025-05-07 14:52 ` [PATCH v2 0/4] align the behavior when opening "packed-refs" shejialuo
  4 siblings, 1 reply; 67+ messages in thread
From: shejialuo @ 2025-05-06 16:41 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

"create_snapshot" would try to munmap the file when the "mmap_strategy"
is "MMAP_TEMPORARY". We also need to do this operation when checking the
consistency of the "packed-refs" file.

Create a new function "munmap_snapshot_if_temporary" to do above and
change "create_snapshot" to align with the behavior.

Suggested-by: Jeff King <peff@peff.net>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c | 31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index e582227772..dd903db301 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -543,6 +543,23 @@ static int allocate_snapshot_buffer(struct snapshot *snapshot, int fd, struct st
 	return 1;
 }
 
+static void munmap_snapshot_if_temporary(struct snapshot *snapshot)
+{
+	if (mmap_strategy != MMAP_OK && snapshot->mmapped) {
+		/*
+		 * We don't want to leave the file mmapped, so we are
+		 * forced to make a copy now:
+		 */
+		size_t size = snapshot->eof - snapshot->start;
+		char *buf_copy = xmalloc(size);
+
+		memcpy(buf_copy, snapshot->start, size);
+		clear_snapshot_buffer(snapshot);
+		snapshot->buf = snapshot->start = buf_copy;
+		snapshot->eof = buf_copy + size;
+	}
+}
+
 /*
  * Depending on `mmap_strategy`, either mmap or read the contents of
  * the `packed-refs` file into the snapshot. Return 1 if the file
@@ -761,19 +778,7 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs)
 		verify_buffer_safe(snapshot);
 	}
 
-	if (mmap_strategy != MMAP_OK && snapshot->mmapped) {
-		/*
-		 * We don't want to leave the file mmapped, so we are
-		 * forced to make a copy now:
-		 */
-		size_t size = snapshot->eof - snapshot->start;
-		char *buf_copy = xmalloc(size);
-
-		memcpy(buf_copy, snapshot->start, size);
-		clear_snapshot_buffer(snapshot);
-		snapshot->buf = snapshot->start = buf_copy;
-		snapshot->eof = buf_copy + size;
-	}
+	munmap_snapshot_if_temporary(snapshot);
 
 	return snapshot;
 }
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 4/4] packed-backend: use mmap when opening large "packed-refs" file
  2025-05-06 16:39 [PATCH 0/4] align the behavior when opening "packed-refs" shejialuo
                   ` (2 preceding siblings ...)
  2025-05-06 16:41 ` [PATCH 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY` shejialuo
@ 2025-05-06 16:41 ` shejialuo
  2025-05-06 19:00   ` Junio C Hamano
  2025-05-07 14:52 ` [PATCH v2 0/4] align the behavior when opening "packed-refs" shejialuo
  4 siblings, 1 reply; 67+ messages in thread
From: shejialuo @ 2025-05-06 16:41 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

We use "strbuf_read" to read the content of "packed-refs". However, this
is a bad practice which would consume a lot of memory usage if there are
multiple processes reading large "packed-refs".

As we have introduced two helper functions "allocate_snapshot_buffer"
and "munmap_snapshot_if_temporary", we could simply call these functions
to use mmap mechanism.

Suggested-by: Jeff King <peff@peff.net>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index dd903db301..1370b982df 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -2074,7 +2074,7 @@ static int packed_fsck(struct ref_store *ref_store,
 {
 	struct packed_ref_store *refs = packed_downcast(ref_store,
 							REF_STORE_READ, "fsck");
-	struct strbuf packed_ref_content = STRBUF_INIT;
+	struct snapshot *snapshot = xcalloc(1, sizeof(*snapshot));
 	unsigned int sorted = 0;
 	struct stat st;
 	int ret = 0;
@@ -2121,21 +2121,21 @@ static int packed_fsck(struct ref_store *ref_store,
 	if (!st.st_size)
 		goto cleanup;
 
-	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
-		ret = error_errno(_("unable to read '%s'"), refs->path);
+	if (!allocate_snapshot_buffer(snapshot, fd, &st))
 		goto cleanup;
-	}
+	munmap_snapshot_if_temporary(snapshot);
 
-	ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
-				      packed_ref_content.buf + packed_ref_content.len);
+	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot->start,
+				      snapshot->eof);
 	if (!ret && sorted)
-		ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
-					     packed_ref_content.buf + packed_ref_content.len);
+		ret = packed_fsck_ref_sorted(o, ref_store, snapshot->start,
+					     snapshot->eof);
 
 cleanup:
 	if (fd >= 0)
 		close(fd);
-	strbuf_release(&packed_ref_content);
+	clear_snapshot_buffer(snapshot);
+	free(snapshot);
 	return ret;
 }
 
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH 1/4] packed-backend: skip checking consistency of empty packed-refs file
  2025-05-06 16:41 ` [PATCH 1/4] packed-backend: skip checking consistency of empty packed-refs file shejialuo
@ 2025-05-06 18:42   ` Junio C Hamano
  2025-05-07 12:09     ` shejialuo
  2025-05-06 19:14   ` Junio C Hamano
  1 sibling, 1 reply; 67+ messages in thread
From: Junio C Hamano @ 2025-05-06 18:42 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Jeff King, Patrick Steinhardt

shejialuo <shejialuo@gmail.com> writes:

> +		touch .git/packed-refs &&

Unless you want to signal your readers that you care about file
timestamp somehow, don't use "touch" as a means to create a file.
Readers would have to wonder if .git/packed-refs existed before,
or what git command that follows this part cares about its last
modified time, or what behaviour the timestamp would affect.

What you are interested in doing with this is to ensure that *AN*
*EMPTY* file exists there, hence you should say

		>.git/packed-refs &&

instread.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY`
  2025-05-06 16:41 ` [PATCH 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY` shejialuo
@ 2025-05-06 18:52   ` Junio C Hamano
  2025-05-06 22:17     ` Junio C Hamano
  2025-05-07 12:21     ` shejialuo
  0 siblings, 2 replies; 67+ messages in thread
From: Junio C Hamano @ 2025-05-06 18:52 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Jeff King, Patrick Steinhardt

shejialuo <shejialuo@gmail.com> writes:

> +static void munmap_snapshot_if_temporary(struct snapshot *snapshot)
> +{
> +	if (mmap_strategy != MMAP_OK && snapshot->mmapped) {

In general, a refactoring tomove conditionals like this to the
callee and make the callers unconditionally call the helper is an
antipattern from maintainability's point of view.

Imagine what would happen when we acquire a different mmap_strategy
in the future, and by that time, there are callers in the codebase
other than what we currently have (which is just one below after
this patch).  Do you have to verify if all existing callers that
trusted "if_temporary" really meant MMAP_TEMPORARY, as the name of
the helper function suggested, or do some of the callers meant "any
strategy other than MMAP_OK"?  What if some callers want the former
and others want the latter semantics?

Without even talking about longer term maintainability, at the
callsite, _if_temporary in the name is a much weaker sign than an
explicit if() condition that says what is going on.

I'd prefer the caller to be more like

	if (mmap_strategy == MMAP_TEMPORARY)
		munmap_temporary_stapshot(snapshot)

and make the caller to return immediately when !snapshot, i.e.

	static void munmap_temporary_stapshot(struct snapshot *snapshot
	{
		if (!snapshot)
			return;
		... the rest of the helper function ...
	}

Thanks.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 4/4] packed-backend: use mmap when opening large "packed-refs" file
  2025-05-06 16:41 ` [PATCH 4/4] packed-backend: use mmap when opening large "packed-refs" file shejialuo
@ 2025-05-06 19:00   ` Junio C Hamano
  2025-05-06 22:18     ` Junio C Hamano
  2025-05-07 12:34     ` shejialuo
  0 siblings, 2 replies; 67+ messages in thread
From: Junio C Hamano @ 2025-05-06 19:00 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Jeff King, Patrick Steinhardt

shejialuo <shejialuo@gmail.com> writes:

> We use "strbuf_read" to read the content of "packed-refs". However, this
> is a bad practice which would consume a lot of memory usage if there are
> multiple processes reading large "packed-refs".

Neither this nor the commit title says that the issue is limited to
the code path that runs fsck on packed-refs file, but I thought the
code paths to use packed-refs to resolve refs correctly uses mmap()
and does not share this issue?  If it is limited to one single code
path, please mention it explicitly.

Also, I think it was already pointed out that "multiple processes"
is not all that interesting issue.  Even if there is a single
process using a single large packed-refs file, alloc+read gives the
system more memory pressure than the read-only mmap like we do.

As to the title

	packed-backend: use mmap when opening large "packed-refs" file
	packed-backend: mmap large "packed-refs" file during fsck

would be shorter and clearer.

The patch looks OK.  Nice to see this one-off strbuf use going away.

> -	struct strbuf packed_ref_content = STRBUF_INIT;
> +	struct snapshot *snapshot = xcalloc(1, sizeof(*snapshot));
>  	unsigned int sorted = 0;
>  	struct stat st;
>  	int ret = 0;
> @@ -2121,21 +2121,21 @@ static int packed_fsck(struct ref_store *ref_store,
>  	if (!st.st_size)
>  		goto cleanup;
>  
> -	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
> -		ret = error_errno(_("unable to read '%s'"), refs->path);
> +	if (!allocate_snapshot_buffer(snapshot, fd, &st))
>  		goto cleanup;
> -	}
> +	munmap_snapshot_if_temporary(snapshot);
>  
> -	ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
> -				      packed_ref_content.buf + packed_ref_content.len);
> +	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot->start,
> +				      snapshot->eof);
>  	if (!ret && sorted)
> -		ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
> -					     packed_ref_content.buf + packed_ref_content.len);
> +		ret = packed_fsck_ref_sorted(o, ref_store, snapshot->start,
> +					     snapshot->eof);
>  
>  cleanup:
>  	if (fd >= 0)
>  		close(fd);
> -	strbuf_release(&packed_ref_content);
> +	clear_snapshot_buffer(snapshot);
> +	free(snapshot);
>  	return ret;
>  }

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 1/4] packed-backend: skip checking consistency of empty packed-refs file
  2025-05-06 16:41 ` [PATCH 1/4] packed-backend: skip checking consistency of empty packed-refs file shejialuo
  2025-05-06 18:42   ` Junio C Hamano
@ 2025-05-06 19:14   ` Junio C Hamano
  2025-05-07 12:10     ` shejialuo
  1 sibling, 1 reply; 67+ messages in thread
From: Junio C Hamano @ 2025-05-06 19:14 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Jeff King, Patrick Steinhardt

shejialuo <shejialuo@gmail.com> writes:

> In "load_contents", when the "packed-refs" is empty, we will just return
> the snapshot. However, we would report an error to the user when
> checking the consistency of the empty "packed-refs".

Neither the commit title nor the above paragraph hints that this is
talking about "fsck" part of the packed-refs subsystem.  That leaves
the readers confused when they read "with the runtime behavior"
below.

> We should align with the runtime behavior. As what "load_contents" does,
> let's check whether the file size is zero and if so, we will skip
> checking the consistency and simply return.

How about

	During fsck, an empty "packed-refs" file gives an error;
	this is unwarranted.  We should instead just return an empty
	"snapshot" and let the caller happily declare success, just
	like the code paths that implement the runtime use of the
	file do.

or something?

As to the title

	packed-backend: fsck should allow an empty packed-refs file

is shorter and clearer, I would think.

The code change is trivially correct, I think.  Nicely found.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 2/4] packed-backend: extract snapshot allocation in `load_contents`
  2025-05-06 16:41 ` [PATCH 2/4] packed-backend: extract snapshot allocation in `load_contents` shejialuo
@ 2025-05-06 19:16   ` Junio C Hamano
  0 siblings, 0 replies; 67+ messages in thread
From: Junio C Hamano @ 2025-05-06 19:16 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Jeff King, Patrick Steinhardt

shejialuo <shejialuo@gmail.com> writes:

> "load_contents" would choose which way to load the content of the
> "packed-refs". However, we cannot directly use this function when
> checking the consistency due to we don't want to open the file. And we
> also need to reuse the logic to avoid causing repetition.
>
> Let's create a new helper function "allocate_snapshot_buffer" to extract
> the snapshot allocation logic in "load_contents" and update the
> "load_contents" to align with the behavior.
>
> Suggested-by: Jeff King <peff@peff.net>
> Suggested-by: Patrick Steinhardt <ps@pks.im>
> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
>  refs/packed-backend.c | 54 +++++++++++++++++++++++++------------------
>  1 file changed, 32 insertions(+), 22 deletions(-)

Trivially correct, cleanly done, and nicely described.

Thanks.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY`
  2025-05-06 18:52   ` Junio C Hamano
@ 2025-05-06 22:17     ` Junio C Hamano
  2025-05-07 12:21     ` shejialuo
  1 sibling, 0 replies; 67+ messages in thread
From: Junio C Hamano @ 2025-05-06 22:17 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Jeff King, Patrick Steinhardt

Junio C Hamano <gitster@pobox.com> writes:

> shejialuo <shejialuo@gmail.com> writes:
>
>> +static void munmap_snapshot_if_temporary(struct snapshot *snapshot)
>> +{
>> +	if (mmap_strategy != MMAP_OK && snapshot->mmapped) {
>
> In general, a refactoring tomove conditionals like this to the
> callee and make the callers unconditionally call the helper is an
> antipattern from maintainability's point of view.
>
> Imagine what would happen when we acquire a different mmap_strategy
> in the future, and by that time, there are callers in the codebase
> other than what we currently have (which is just one below after
> this patch).  Do you have to verify if all existing callers that
> trusted "if_temporary" really meant MMAP_TEMPORARY, as the name of
> the helper function suggested, or do some of the callers meant "any
> strategy other than MMAP_OK"?  What if some callers want the former
> and others want the latter semantics?
>
> Without even talking about longer term maintainability, at the
> callsite, _if_temporary in the name is a much weaker sign than an
> explicit if() condition that says what is going on.
>
> I'd prefer the caller to be more like
>
> 	if (mmap_strategy == MMAP_TEMPORARY)
> 		munmap_temporary_stapshot(snapshot)
>
> and make the caller to return immediately when !snapshot, i.e.

Oops.  I obviously meant "the callee" here.

> 	static void munmap_temporary_stapshot(struct snapshot *snapshot
> 	{
> 		if (!snapshot)
> 			return;
> 		... the rest of the helper function ...
> 	}
>
> Thanks.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 4/4] packed-backend: use mmap when opening large "packed-refs" file
  2025-05-06 19:00   ` Junio C Hamano
@ 2025-05-06 22:18     ` Junio C Hamano
  2025-05-07 12:34     ` shejialuo
  1 sibling, 0 replies; 67+ messages in thread
From: Junio C Hamano @ 2025-05-06 22:18 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Jeff King, Patrick Steinhardt

Junio C Hamano <gitster@pobox.com> writes:

> As to the title
>
> 	packed-backend: use mmap when opening large "packed-refs" file
> 	packed-backend: mmap large "packed-refs" file during fsck
>
> would be shorter and clearer.

Sorry to be cryptic.  I meant to suggest the second one.  The first
one is what you gave us, and was left to contrast its length with
the suggested altenative.


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 1/4] packed-backend: skip checking consistency of empty packed-refs file
  2025-05-06 18:42   ` Junio C Hamano
@ 2025-05-07 12:09     ` shejialuo
  0 siblings, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-07 12:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Patrick Steinhardt

On Tue, May 06, 2025 at 11:42:37AM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
> 
> > +		touch .git/packed-refs &&
> 
> Unless you want to signal your readers that you care about file
> timestamp somehow, don't use "touch" as a means to create a file.
> Readers would have to wonder if .git/packed-refs existed before,
> or what git command that follows this part cares about its last
> modified time, or what behaviour the timestamp would affect.
> 
> What you are interested in doing with this is to ensure that *AN*
> *EMPTY* file exists there, hence you should say
> 
> 		>.git/packed-refs &&
> 
> instread.

Thanks for the suggestion, I didn't realise about this.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 1/4] packed-backend: skip checking consistency of empty packed-refs file
  2025-05-06 19:14   ` Junio C Hamano
@ 2025-05-07 12:10     ` shejialuo
  0 siblings, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-07 12:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Patrick Steinhardt

On Tue, May 06, 2025 at 12:14:59PM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
> 
> > In "load_contents", when the "packed-refs" is empty, we will just return
> > the snapshot. However, we would report an error to the user when
> > checking the consistency of the empty "packed-refs".
> 
> Neither the commit title nor the above paragraph hints that this is
> talking about "fsck" part of the packed-refs subsystem.  That leaves
> the readers confused when they read "with the runtime behavior"
> below.
> 

That's right. My message is vague.

> > We should align with the runtime behavior. As what "load_contents" does,
> > let's check whether the file size is zero and if so, we will skip
> > checking the consistency and simply return.
> 
> How about
> 
> 	During fsck, an empty "packed-refs" file gives an error;
> 	this is unwarranted.  We should instead just return an empty
> 	"snapshot" and let the caller happily declare success, just
> 	like the code paths that implement the runtime use of the
> 	file do.
> 
> or something?
> 
> As to the title
> 
> 	packed-backend: fsck should allow an empty packed-refs file
> 
> is shorter and clearer, I would think.
> 

Thanks, I will improve this in the next version.

> The code change is trivially correct, I think.  Nicely found.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY`
  2025-05-06 18:52   ` Junio C Hamano
  2025-05-06 22:17     ` Junio C Hamano
@ 2025-05-07 12:21     ` shejialuo
  1 sibling, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-07 12:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Patrick Steinhardt

On Tue, May 06, 2025 at 11:52:02AM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
> 
> > +static void munmap_snapshot_if_temporary(struct snapshot *snapshot)
> > +{
> > +	if (mmap_strategy != MMAP_OK && snapshot->mmapped) {
> 
> In general, a refactoring tomove conditionals like this to the
> callee and make the callers unconditionally call the helper is an
> antipattern from maintainability's point of view.
> 
> Imagine what would happen when we acquire a different mmap_strategy
> in the future, and by that time, there are callers in the codebase
> other than what we currently have (which is just one below after
> this patch).  Do you have to verify if all existing callers that
> trusted "if_temporary" really meant MMAP_TEMPORARY, as the name of
> the helper function suggested, or do some of the callers meant "any
> strategy other than MMAP_OK"?  What if some callers want the former
> and others want the latter semantics?
> 

That's right. Actually when I wake up in the morning, I suddenly find
out I made a mistake here. We should try to make the function as pure as
possible. So, this function would just do one thing, call `mumap` when
the mmap strategy is "MMAP_TEMPORARY".

> Without even talking about longer term maintainability, at the
> callsite, _if_temporary in the name is a much weaker sign than an
> explicit if() condition that says what is going on.
> 
> I'd prefer the caller to be more like
> 
> 	if (mmap_strategy == MMAP_TEMPORARY)
> 		munmap_temporary_stapshot(snapshot)
> 
> and make the caller to return immediately when !snapshot, i.e.
> 
> 	static void munmap_temporary_stapshot(struct snapshot *snapshot
> 	{
> 		if (!snapshot)
> 			return;
> 		... the rest of the helper function ...
> 	}
> 
> Thanks.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 4/4] packed-backend: use mmap when opening large "packed-refs" file
  2025-05-06 19:00   ` Junio C Hamano
  2025-05-06 22:18     ` Junio C Hamano
@ 2025-05-07 12:34     ` shejialuo
  1 sibling, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-07 12:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Patrick Steinhardt

On Tue, May 06, 2025 at 12:00:39PM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
> 
> > We use "strbuf_read" to read the content of "packed-refs". However, this
> > is a bad practice which would consume a lot of memory usage if there are
> > multiple processes reading large "packed-refs".
> 
> Neither this nor the commit title says that the issue is limited to
> the code path that runs fsck on packed-refs file, but I thought the
> code paths to use packed-refs to resolve refs correctly uses mmap()
> and does not share this issue?  If it is limited to one single code
> path, please mention it explicitly.
> 

Nice advice, I will improve this in the next version.

> Also, I think it was already pointed out that "multiple processes"
> is not all that interesting issue.  Even if there is a single
> process using a single large packed-refs file, alloc+read gives the
> system more memory pressure than the read-only mmap like we do.
> 

That's right. Even there is only one process, we could avoid copying the
file content from kernel space to the user space. I will improve the
commit message.

> As to the title
> 
> 	packed-backend: use mmap when opening large "packed-refs" file
> 	packed-backend: mmap large "packed-refs" file during fsck
> 
> would be shorter and clearer.
> 

Yes, the second one is better. Will improve this.


^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v2 0/4] align the behavior when opening "packed-refs"
  2025-05-06 16:39 [PATCH 0/4] align the behavior when opening "packed-refs" shejialuo
                   ` (3 preceding siblings ...)
  2025-05-06 16:41 ` [PATCH 4/4] packed-backend: use mmap when opening large "packed-refs" file shejialuo
@ 2025-05-07 14:52 ` shejialuo
  2025-05-07 14:53   ` [PATCH v2 1/4] packed-backend: fsck should allow an empty "packed-refs" file shejialuo
                     ` (5 more replies)
  4 siblings, 6 replies; 67+ messages in thread
From: shejialuo @ 2025-05-07 14:52 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

Hi All:

As discussed in [1], we need to use mmap mechanism to open large
"packed_refs" file to save the memory usage. This patch mainly does the
following things:

1: Fix an issue that we would report an error when the "packed-refs"
file is empty, which does not align with the runtime behavior.
2-4: Extract some logic from the existing code and then use these
created helper functions to let fsck code to use mmap necessarily

[1] https://lore.kernel.org/git/20250503133158.GA4450@coredump.intra.peff.net

Really thank Peff and Patrick to suggest me to do above change.

---

Change since v1:

1. Update the commit message of [PATCH 1/4]. And use redirection to
create an empty file instead of using `touch`.
2. Don't use if for the refactored function in [PATCH 3/4] and then
update the commit message to align with the new function name.
3. Enhance the commit message of [PATCH 4/4].

Thanks,
Jialuo

shejialuo (4):
  packed-backend: fsck should allow an empty "packed-refs" file
  packed-backend: extract snapshot allocation in `load_contents`
  packed-backend: extract munmap operation for `MMAP_TEMPORARY`
  packed-backend: mmap large "packed-refs" file during fsck

 refs/packed-backend.c    | 113 ++++++++++++++++++++++++---------------
 t/t0602-reffiles-fsck.sh |  13 +++++
 2 files changed, 82 insertions(+), 44 deletions(-)

Range-diff against v1:
1:  aa9037ebfa ! 1:  26c3fd55a8 packed-backend: skip checking consistency of empty packed-refs file
    @@ Metadata
     Author: shejialuo <shejialuo@gmail.com>
     
      ## Commit message ##
    -    packed-backend: skip checking consistency of empty packed-refs file
    +    packed-backend: fsck should allow an empty "packed-refs" file
     
    -    In "load_contents", when the "packed-refs" is empty, we will just return
    -    the snapshot. However, we would report an error to the user when
    -    checking the consistency of the empty "packed-refs".
    -
    -    We should align with the runtime behavior. As what "load_contents" does,
    -    let's check whether the file size is zero and if so, we will skip
    -    checking the consistency and simply return.
    +    During fsck, an empty "packed-refs" gives an error; this is unwarranted.
    +    We should just skip checking the content of "packed-refs" just like the
    +    runtime code paths such as "create_snapshot" which simply returns the
    +    "snapshot" without checking the content of "packed-refs".
     
         Signed-off-by: shejialuo <shejialuo@gmail.com>
     
    @@ t/t0602-reffiles-fsck.sh: test_expect_success SYMLINKS 'the filetype of packed-r
     +		cd repo &&
     +		test_commit default &&
     +
    -+		touch .git/packed-refs &&
    ++		>.git/packed-refs &&
     +		git refs verify 2>err &&
     +		test_must_be_empty err
     +	)
2:  852e8a606b = 2:  4604be8b51 packed-backend: extract snapshot allocation in `load_contents`
3:  de146155f6 ! 3:  c0609afac9 packed-backend: extract munmap operation for `MMAP_TEMPORARY`
    @@ Commit message
         is "MMAP_TEMPORARY". We also need to do this operation when checking the
         consistency of the "packed-refs" file.
     
    -    Create a new function "munmap_snapshot_if_temporary" to do above and
    +    Create a new function "munmap_temporary_snapshot" to do above and
         change "create_snapshot" to align with the behavior.
     
         Suggested-by: Jeff King <peff@peff.net>
    @@ refs/packed-backend.c: static int allocate_snapshot_buffer(struct snapshot *snap
      	return 1;
      }
      
    -+static void munmap_snapshot_if_temporary(struct snapshot *snapshot)
    ++static void munmap_temporary_snapshot(struct snapshot *snapshot)
     +{
    -+	if (mmap_strategy != MMAP_OK && snapshot->mmapped) {
    -+		/*
    -+		 * We don't want to leave the file mmapped, so we are
    -+		 * forced to make a copy now:
    -+		 */
    -+		size_t size = snapshot->eof - snapshot->start;
    -+		char *buf_copy = xmalloc(size);
    ++	char *buf_copy;
    ++	size_t size;
     +
    -+		memcpy(buf_copy, snapshot->start, size);
    -+		clear_snapshot_buffer(snapshot);
    -+		snapshot->buf = snapshot->start = buf_copy;
    -+		snapshot->eof = buf_copy + size;
    -+	}
    ++	if (!snapshot)
    ++		return;
    ++
    ++	/*
    ++	 * We don't want to leave the file mmapped, so we are
    ++	 * forced to make a copy now:
    ++	 */
    ++	size = snapshot->eof - snapshot->start;
    ++	buf_copy = xmalloc(size);
    ++
    ++	memcpy(buf_copy, snapshot->start, size);
    ++	clear_snapshot_buffer(snapshot);
    ++	snapshot->buf = snapshot->start = buf_copy;
    ++	snapshot->eof = buf_copy + size;
     +}
     +
      /*
    @@ refs/packed-backend.c: static struct snapshot *create_snapshot(struct packed_ref
     -		snapshot->buf = snapshot->start = buf_copy;
     -		snapshot->eof = buf_copy + size;
     -	}
    -+	munmap_snapshot_if_temporary(snapshot);
    ++	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
    ++		munmap_temporary_snapshot(snapshot);
      
      	return snapshot;
      }
4:  be1e9e2540 ! 4:  c868e3dd16 packed-backend: use mmap when opening large "packed-refs" file
    @@ Metadata
     Author: shejialuo <shejialuo@gmail.com>
     
      ## Commit message ##
    -    packed-backend: use mmap when opening large "packed-refs" file
    +    packed-backend: mmap large "packed-refs" file during fsck
     
    -    We use "strbuf_read" to read the content of "packed-refs". However, this
    -    is a bad practice which would consume a lot of memory usage if there are
    -    multiple processes reading large "packed-refs".
    +    During fsck, we use "strbuf_read" to read the content of "packed-refs"
    +    without using mmap mechanism. This is a bad practice which would consume
    +    more memory than using mmap mechanism. Besides, as all code paths in
    +    "packed-backend.c" use this way, we should make "fsck" align with the
    +    current codebase.
     
         As we have introduced two helper functions "allocate_snapshot_buffer"
         and "munmap_snapshot_if_temporary", we could simply call these functions
    @@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
     +	if (!allocate_snapshot_buffer(snapshot, fd, &st))
      		goto cleanup;
     -	}
    -+	munmap_snapshot_if_temporary(snapshot);
      
     -	ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
     -				      packed_ref_content.buf + packed_ref_content.len);
    ++	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
    ++		munmap_temporary_snapshot(snapshot);
    ++
     +	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot->start,
     +				      snapshot->eof);
      	if (!ret && sorted)
-- 
2.49.0


^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v2 1/4] packed-backend: fsck should allow an empty "packed-refs" file
  2025-05-07 14:52 ` [PATCH v2 0/4] align the behavior when opening "packed-refs" shejialuo
@ 2025-05-07 14:53   ` shejialuo
  2025-05-07 14:53   ` [PATCH v2 2/4] packed-backend: extract snapshot allocation in `load_contents` shejialuo
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-07 14:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

During fsck, an empty "packed-refs" gives an error; this is unwarranted.
We should just skip checking the content of "packed-refs" just like the
runtime code paths such as "create_snapshot" which simply returns the
"snapshot" without checking the content of "packed-refs".

Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c    |  3 +++
 t/t0602-reffiles-fsck.sh | 13 +++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 3ad1ed0787..0dd6c6677b 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -2103,6 +2103,9 @@ static int packed_fsck(struct ref_store *ref_store,
 		goto cleanup;
 	}
 
+	if (!st.st_size)
+		goto cleanup;
+
 	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
 		ret = error_errno(_("unable to read '%s'"), refs->path);
 		goto cleanup;
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 9d1dc2144c..e04967581c 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -647,6 +647,19 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
 	)
 '
 
+test_expect_success 'empty packed-refs should not be reported' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit default &&
+
+		>.git/packed-refs &&
+		git refs verify 2>err &&
+		test_must_be_empty err
+	)
+'
+
 test_expect_success 'packed-refs header should be checked' '
 	test_when_finished "rm -rf repo" &&
 	git init repo &&
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v2 2/4] packed-backend: extract snapshot allocation in `load_contents`
  2025-05-07 14:52 ` [PATCH v2 0/4] align the behavior when opening "packed-refs" shejialuo
  2025-05-07 14:53   ` [PATCH v2 1/4] packed-backend: fsck should allow an empty "packed-refs" file shejialuo
@ 2025-05-07 14:53   ` shejialuo
  2025-05-07 14:53   ` [PATCH v2 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY` shejialuo
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-07 14:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

"load_contents" would choose which way to load the content of the
"packed-refs". However, we cannot directly use this function when
checking the consistency due to we don't want to open the file. And we
also need to reuse the logic to avoid causing repetition.

Let's create a new helper function "allocate_snapshot_buffer" to extract
the snapshot allocation logic in "load_contents" and update the
"load_contents" to align with the behavior.

Suggested-by: Jeff King <peff@peff.net>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c | 54 +++++++++++++++++++++++++------------------
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 0dd6c6677b..e582227772 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -517,6 +517,32 @@ static int refname_contains_nul(struct strbuf *refname)
 
 #define SMALL_FILE_SIZE (32*1024)
 
+static int allocate_snapshot_buffer(struct snapshot *snapshot, int fd, struct stat *st)
+{
+	ssize_t bytes_read;
+	size_t size;
+
+	size = xsize_t(st->st_size);
+	if (!size)
+		return 0;
+
+	if (mmap_strategy == MMAP_NONE || size <= SMALL_FILE_SIZE) {
+		snapshot->buf = xmalloc(size);
+		bytes_read = read_in_full(fd, snapshot->buf, size);
+		if (bytes_read < 0 || bytes_read != size)
+			die_errno("couldn't read %s", snapshot->refs->path);
+		snapshot->mmapped = 0;
+	} else {
+		snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
+		snapshot->mmapped = 1;
+	}
+
+	snapshot->start = snapshot->buf;
+	snapshot->eof = snapshot->buf + size;
+
+	return 1;
+}
+
 /*
  * Depending on `mmap_strategy`, either mmap or read the contents of
  * the `packed-refs` file into the snapshot. Return 1 if the file
@@ -525,10 +551,9 @@ static int refname_contains_nul(struct strbuf *refname)
  */
 static int load_contents(struct snapshot *snapshot)
 {
-	int fd;
 	struct stat st;
-	size_t size;
-	ssize_t bytes_read;
+	int ret = 1;
+	int fd;
 
 	fd = open(snapshot->refs->path, O_RDONLY);
 	if (fd < 0) {
@@ -550,27 +575,12 @@ static int load_contents(struct snapshot *snapshot)
 
 	if (fstat(fd, &st) < 0)
 		die_errno("couldn't stat %s", snapshot->refs->path);
-	size = xsize_t(st.st_size);
-
-	if (!size) {
-		close(fd);
-		return 0;
-	} else if (mmap_strategy == MMAP_NONE || size <= SMALL_FILE_SIZE) {
-		snapshot->buf = xmalloc(size);
-		bytes_read = read_in_full(fd, snapshot->buf, size);
-		if (bytes_read < 0 || bytes_read != size)
-			die_errno("couldn't read %s", snapshot->refs->path);
-		snapshot->mmapped = 0;
-	} else {
-		snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
-		snapshot->mmapped = 1;
-	}
-	close(fd);
 
-	snapshot->start = snapshot->buf;
-	snapshot->eof = snapshot->buf + size;
+	if (!allocate_snapshot_buffer(snapshot, fd, &st))
+		ret = 0;
 
-	return 1;
+	close(fd);
+	return ret;
 }
 
 static const char *find_reference_location_1(struct snapshot *snapshot,
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v2 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY`
  2025-05-07 14:52 ` [PATCH v2 0/4] align the behavior when opening "packed-refs" shejialuo
  2025-05-07 14:53   ` [PATCH v2 1/4] packed-backend: fsck should allow an empty "packed-refs" file shejialuo
  2025-05-07 14:53   ` [PATCH v2 2/4] packed-backend: extract snapshot allocation in `load_contents` shejialuo
@ 2025-05-07 14:53   ` shejialuo
  2025-05-08 19:57     ` Jeff King
  2025-05-07 14:54   ` [PATCH v2 4/4] packed-backend: mmap large "packed-refs" file during fsck shejialuo
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 67+ messages in thread
From: shejialuo @ 2025-05-07 14:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

"create_snapshot" would try to munmap the file when the "mmap_strategy"
is "MMAP_TEMPORARY". We also need to do this operation when checking the
consistency of the "packed-refs" file.

Create a new function "munmap_temporary_snapshot" to do above and
change "create_snapshot" to align with the behavior.

Suggested-by: Jeff King <peff@peff.net>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c | 36 +++++++++++++++++++++++-------------
 1 file changed, 23 insertions(+), 13 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index e582227772..ae6b6845a6 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -543,6 +543,27 @@ static int allocate_snapshot_buffer(struct snapshot *snapshot, int fd, struct st
 	return 1;
 }
 
+static void munmap_temporary_snapshot(struct snapshot *snapshot)
+{
+	char *buf_copy;
+	size_t size;
+
+	if (!snapshot)
+		return;
+
+	/*
+	 * We don't want to leave the file mmapped, so we are
+	 * forced to make a copy now:
+	 */
+	size = snapshot->eof - snapshot->start;
+	buf_copy = xmalloc(size);
+
+	memcpy(buf_copy, snapshot->start, size);
+	clear_snapshot_buffer(snapshot);
+	snapshot->buf = snapshot->start = buf_copy;
+	snapshot->eof = buf_copy + size;
+}
+
 /*
  * Depending on `mmap_strategy`, either mmap or read the contents of
  * the `packed-refs` file into the snapshot. Return 1 if the file
@@ -761,19 +782,8 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs)
 		verify_buffer_safe(snapshot);
 	}
 
-	if (mmap_strategy != MMAP_OK && snapshot->mmapped) {
-		/*
-		 * We don't want to leave the file mmapped, so we are
-		 * forced to make a copy now:
-		 */
-		size_t size = snapshot->eof - snapshot->start;
-		char *buf_copy = xmalloc(size);
-
-		memcpy(buf_copy, snapshot->start, size);
-		clear_snapshot_buffer(snapshot);
-		snapshot->buf = snapshot->start = buf_copy;
-		snapshot->eof = buf_copy + size;
-	}
+	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
+		munmap_temporary_snapshot(snapshot);
 
 	return snapshot;
 }
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v2 4/4] packed-backend: mmap large "packed-refs" file during fsck
  2025-05-07 14:52 ` [PATCH v2 0/4] align the behavior when opening "packed-refs" shejialuo
                     ` (2 preceding siblings ...)
  2025-05-07 14:53   ` [PATCH v2 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY` shejialuo
@ 2025-05-07 14:54   ` shejialuo
  2025-05-08 20:07     ` Jeff King
  2025-05-07 22:51   ` [PATCH v2 0/4] align the behavior when opening "packed-refs" Junio C Hamano
  2025-05-11 13:59   ` [PATCH v3 0/3] " shejialuo
  5 siblings, 1 reply; 67+ messages in thread
From: shejialuo @ 2025-05-07 14:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

During fsck, we use "strbuf_read" to read the content of "packed-refs"
without using mmap mechanism. This is a bad practice which would consume
more memory than using mmap mechanism. Besides, as all code paths in
"packed-backend.c" use this way, we should make "fsck" align with the
current codebase.

As we have introduced two helper functions "allocate_snapshot_buffer"
and "munmap_snapshot_if_temporary", we could simply call these functions
to use mmap mechanism.

Suggested-by: Jeff King <peff@peff.net>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index ae6b6845a6..ff744f1d4c 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -2079,7 +2079,7 @@ static int packed_fsck(struct ref_store *ref_store,
 {
 	struct packed_ref_store *refs = packed_downcast(ref_store,
 							REF_STORE_READ, "fsck");
-	struct strbuf packed_ref_content = STRBUF_INIT;
+	struct snapshot *snapshot = xcalloc(1, sizeof(*snapshot));
 	unsigned int sorted = 0;
 	struct stat st;
 	int ret = 0;
@@ -2126,21 +2126,23 @@ static int packed_fsck(struct ref_store *ref_store,
 	if (!st.st_size)
 		goto cleanup;
 
-	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
-		ret = error_errno(_("unable to read '%s'"), refs->path);
+	if (!allocate_snapshot_buffer(snapshot, fd, &st))
 		goto cleanup;
-	}
 
-	ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
-				      packed_ref_content.buf + packed_ref_content.len);
+	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
+		munmap_temporary_snapshot(snapshot);
+
+	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot->start,
+				      snapshot->eof);
 	if (!ret && sorted)
-		ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
-					     packed_ref_content.buf + packed_ref_content.len);
+		ret = packed_fsck_ref_sorted(o, ref_store, snapshot->start,
+					     snapshot->eof);
 
 cleanup:
 	if (fd >= 0)
 		close(fd);
-	strbuf_release(&packed_ref_content);
+	clear_snapshot_buffer(snapshot);
+	free(snapshot);
 	return ret;
 }
 
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH v2 0/4] align the behavior when opening "packed-refs"
  2025-05-07 14:52 ` [PATCH v2 0/4] align the behavior when opening "packed-refs" shejialuo
                     ` (3 preceding siblings ...)
  2025-05-07 14:54   ` [PATCH v2 4/4] packed-backend: mmap large "packed-refs" file during fsck shejialuo
@ 2025-05-07 22:51   ` Junio C Hamano
  2025-05-08 20:08     ` Jeff King
  2025-05-11 13:59   ` [PATCH v3 0/3] " shejialuo
  5 siblings, 1 reply; 67+ messages in thread
From: Junio C Hamano @ 2025-05-07 22:51 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Jeff King, Patrick Steinhardt

shejialuo <shejialuo@gmail.com> writes:

> Hi All:
>
> As discussed in [1], we need to use mmap mechanism to open large
> "packed_refs" file to save the memory usage. This patch mainly does the
> following things:
>
> 1: Fix an issue that we would report an error when the "packed-refs"
> file is empty, which does not align with the runtime behavior.
> 2-4: Extract some logic from the existing code and then use these
> created helper functions to let fsck code to use mmap necessarily
>
> [1] https://lore.kernel.org/git/20250503133158.GA4450@coredump.intra.peff.net
>
> Really thank Peff and Patrick to suggest me to do above change.

This round looks good to me.  Others?

Thanks.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v2 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY`
  2025-05-07 14:53   ` [PATCH v2 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY` shejialuo
@ 2025-05-08 19:57     ` Jeff King
  2025-05-08 20:05       ` Junio C Hamano
  0 siblings, 1 reply; 67+ messages in thread
From: Jeff King @ 2025-05-08 19:57 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Junio C Hamano, Patrick Steinhardt

On Wed, May 07, 2025 at 10:53:56PM +0800, shejialuo wrote:

> @@ -761,19 +782,8 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs)
>  		verify_buffer_safe(snapshot);
>  	}
>  
> -	if (mmap_strategy != MMAP_OK && snapshot->mmapped) {
> -		/*
> -		 * We don't want to leave the file mmapped, so we are
> -		 * forced to make a copy now:
> -		 */
> -		size_t size = snapshot->eof - snapshot->start;
> -		char *buf_copy = xmalloc(size);
> -
> -		memcpy(buf_copy, snapshot->start, size);
> -		clear_snapshot_buffer(snapshot);
> -		snapshot->buf = snapshot->start = buf_copy;
> -		snapshot->eof = buf_copy + size;
> -	}
> +	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
> +		munmap_temporary_snapshot(snapshot);

The original triggers this conditional whenever the strategy is not
MMAP_OK (so MMAP_TEMPORARY or MMAP_NONE). But in your post-image, we do
so only for MMAP_TEMPORARY.

I can guess that the two end up the same, because snapshot->mmapped
would never be set when MMAP_NONE is set. But if we are going to make
such a logical inference, it should be explained in the commit message
(though my preference is to leave the code as-is, or to pull the
refactor into its own commit).

-Peff

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v2 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY`
  2025-05-08 19:57     ` Jeff King
@ 2025-05-08 20:05       ` Junio C Hamano
  2025-05-09 15:03         ` shejialuo
  0 siblings, 1 reply; 67+ messages in thread
From: Junio C Hamano @ 2025-05-08 20:05 UTC (permalink / raw)
  To: Jeff King; +Cc: shejialuo, git, Patrick Steinhardt

Jeff King <peff@peff.net> writes:

>> -	if (mmap_strategy != MMAP_OK && snapshot->mmapped) {
>> -		/*
>> -		 * We don't want to leave the file mmapped, so we are
>> -		 * forced to make a copy now:
>> -		 */
>> -		size_t size = snapshot->eof - snapshot->start;
>> -		char *buf_copy = xmalloc(size);
>> -
>> -		memcpy(buf_copy, snapshot->start, size);
>> -		clear_snapshot_buffer(snapshot);
>> -		snapshot->buf = snapshot->start = buf_copy;
>> -		snapshot->eof = buf_copy + size;
>> -	}
>> +	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
>> +		munmap_temporary_snapshot(snapshot);
>
> The original triggers this conditional whenever the strategy is not
> MMAP_OK (so MMAP_TEMPORARY or MMAP_NONE). But in your post-image, we do
> so only for MMAP_TEMPORARY.
>
> I can guess that the two end up the same, because snapshot->mmapped
> would never be set when MMAP_NONE is set. But if we are going to make
> such a logical inference, it should be explained in the commit message
> (though my preference is to leave the code as-is, or to pull the
> refactor into its own commit).

Good thinking.

An "extract" step that is meant as a preliminary refactoring should
not make such a change.  The change may or may not prepare the code
for better maintainability, but I agree that such a change needs to
be justified separately.

Thanks.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v2 4/4] packed-backend: mmap large "packed-refs" file during fsck
  2025-05-07 14:54   ` [PATCH v2 4/4] packed-backend: mmap large "packed-refs" file during fsck shejialuo
@ 2025-05-08 20:07     ` Jeff King
  2025-05-09 15:21       ` shejialuo
  0 siblings, 1 reply; 67+ messages in thread
From: Jeff King @ 2025-05-08 20:07 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Junio C Hamano, Patrick Steinhardt

On Wed, May 07, 2025 at 10:54:03PM +0800, shejialuo wrote:

> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index ae6b6845a6..ff744f1d4c 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -2079,7 +2079,7 @@ static int packed_fsck(struct ref_store *ref_store,
>  {
>  	struct packed_ref_store *refs = packed_downcast(ref_store,
>  							REF_STORE_READ, "fsck");
> -	struct strbuf packed_ref_content = STRBUF_INIT;
> +	struct snapshot *snapshot = xcalloc(1, sizeof(*snapshot));

Minor, but is there any reason to allocate this here and not just:

  struct snapshot snapshot = { 0 };

?

> @@ -2126,21 +2126,23 @@ static int packed_fsck(struct ref_store *ref_store,
>  	if (!st.st_size)
>  		goto cleanup;
>  
> -	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
> -		ret = error_errno(_("unable to read '%s'"), refs->path);
> +	if (!allocate_snapshot_buffer(snapshot, fd, &st))
>  		goto cleanup;
> -	}

Looking at allocate_snapshot_buffer(), it will return 0 only when the
file is empty (and thus there is nothing to allocate) and will
otherwise die(). So we do not need to report any error when it fails.
Good.

But that makes the "!st.st_size" check in the context redundant, doesn't
it? It can just go away.

> -	ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
> -				      packed_ref_content.buf + packed_ref_content.len);
> +	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
> +		munmap_temporary_snapshot(snapshot);
> +
> +	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot->start,
> +				      snapshot->eof);

Why are we unmapping here before we use the content? That will create an
allocated in-memory copy of the mmap'd content. I thought the whole
point here was to avoid doing so.

It does shorten the amount of time we hold the temporary mmap in place,
but I don't think we care about that here. The whole point of
MMAP_TEMPORARY is that we usually hold the packed-refs file open across
many requests, and on some platforms (like Windows) we don't want to do
that. But in this code path we plan to mmap, do our verification, and
then drop the snapshot. So we're always "temporary" anyway.

I.e., I'd have expected this code to allocate_snapshot_buffer(), do its
checks, and then call clear_snapshot_buffer().

-Peff

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v2 0/4] align the behavior when opening "packed-refs"
  2025-05-07 22:51   ` [PATCH v2 0/4] align the behavior when opening "packed-refs" Junio C Hamano
@ 2025-05-08 20:08     ` Jeff King
  2025-05-08 20:20       ` Junio C Hamano
  0 siblings, 1 reply; 67+ messages in thread
From: Jeff King @ 2025-05-08 20:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: shejialuo, git, Patrick Steinhardt

On Wed, May 07, 2025 at 03:51:02PM -0700, Junio C Hamano wrote:

> shejialuo <shejialuo@gmail.com> writes:
> 
> > Hi All:
> >
> > As discussed in [1], we need to use mmap mechanism to open large
> > "packed_refs" file to save the memory usage. This patch mainly does the
> > following things:
> >
> > 1: Fix an issue that we would report an error when the "packed-refs"
> > file is empty, which does not align with the runtime behavior.
> > 2-4: Extract some logic from the existing code and then use these
> > created helper functions to let fsck code to use mmap necessarily
> >
> > [1] https://lore.kernel.org/git/20250503133158.GA4450@coredump.intra.peff.net
> >
> > Really thank Peff and Patrick to suggest me to do above change.
> 
> This round looks good to me.  Others?

I left a few comments that I think bear addressing (or at least some
discussion).

-Peff

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v2 0/4] align the behavior when opening "packed-refs"
  2025-05-08 20:08     ` Jeff King
@ 2025-05-08 20:20       ` Junio C Hamano
  2025-05-08 20:33         ` Jeff King
  0 siblings, 1 reply; 67+ messages in thread
From: Junio C Hamano @ 2025-05-08 20:20 UTC (permalink / raw)
  To: Jeff King; +Cc: shejialuo, git, Patrick Steinhardt

Jeff King <peff@peff.net> writes:

> On Wed, May 07, 2025 at 03:51:02PM -0700, Junio C Hamano wrote:
>
>> shejialuo <shejialuo@gmail.com> writes:
>> 
>> > Hi All:
>> >
>> > As discussed in [1], we need to use mmap mechanism to open large
>> > "packed_refs" file to save the memory usage. This patch mainly does the
>> > following things:
>> >
>> > 1: Fix an issue that we would report an error when the "packed-refs"
>> > file is empty, which does not align with the runtime behavior.
>> > 2-4: Extract some logic from the existing code and then use these
>> > created helper functions to let fsck code to use mmap necessarily
>> >
>> > [1] https://lore.kernel.org/git/20250503133158.GA4450@coredump.intra.peff.net
>> >
>> > Really thank Peff and Patrick to suggest me to do above change.
>> 
>> This round looks good to me.  Others?
>
> I left a few comments that I think bear addressing (or at least some
> discussion).

I found both of your points very good ones, especially the "why are
we making an in-core copy anyway later?"

Which may probably mean that munmap_temporary_snapshot() is not the
helper function we want in the code path, so one of the preliminary
refactoring patches can be removed.

Even on mmap-incapable platforms, we have enough emulation in
git_mmap() and git_munmap(), and this code path that wants to read a
packed-refs file just mmap(), do its thing, and then munmap(),
without worrying anything about "ah, temporary, so we need to make
an in-core copy for ourselves".

THanks.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v2 0/4] align the behavior when opening "packed-refs"
  2025-05-08 20:20       ` Junio C Hamano
@ 2025-05-08 20:33         ` Jeff King
  2025-05-09 15:26           ` shejialuo
  0 siblings, 1 reply; 67+ messages in thread
From: Jeff King @ 2025-05-08 20:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: shejialuo, git, Patrick Steinhardt

On Thu, May 08, 2025 at 01:20:54PM -0700, Junio C Hamano wrote:

> > I left a few comments that I think bear addressing (or at least some
> > discussion).
> 
> I found both of your points very good ones, especially the "why are
> we making an in-core copy anyway later?"
> 
> Which may probably mean that munmap_temporary_snapshot() is not the
> helper function we want in the code path, so one of the preliminary
> refactoring patches can be removed.

Yep.

> Even on mmap-incapable platforms, we have enough emulation in
> git_mmap() and git_munmap(), and this code path that wants to read a
> packed-refs file just mmap(), do its thing, and then munmap(),
> without worrying anything about "ah, temporary, so we need to make
> an in-core copy for ourselves".

Hmm, yeah. I had thought that somebody could explicitly set
mmap_strategy themselves via config, in which case we'd want to avoid
calling git_mmap() on systems where it actually does mmap. But it
doesn't look like we have any mechanism for setting the config. So
MMAP_NONE happens only when NO_MMAP is set.

  I briefly wondered if the existing code could be simplified to get rid
  of MMAP_NONE, since it could also rely on git_mmap() to make an
  in-memory copy. But it gets weird with MMAP_PREVENTS_DELETE and
  NO_MMAP combined, since then we make a pointless extra copy (one to
  fake-mmap, and one into the new buffer). OTOH, I'd expect those to be
  mutually exclusive.

  I kind of wonder with MMAP_TEMPORARY why we bother mapping at all and
  not just reading into a buffer. Maybe it's more efficient?

  Probably not worth revisiting all of this old code, though. And
  anyway, because of the SMALL_FILE_SIZE check, we couldn't even
  simplify away the read_in_full() code.

-Peff

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v2 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY`
  2025-05-08 20:05       ` Junio C Hamano
@ 2025-05-09 15:03         ` shejialuo
  0 siblings, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-09 15:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, git, Patrick Steinhardt

On Thu, May 08, 2025 at 01:05:49PM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
> >> -	if (mmap_strategy != MMAP_OK && snapshot->mmapped) {
> >> -		/*
> >> -		 * We don't want to leave the file mmapped, so we are
> >> -		 * forced to make a copy now:
> >> -		 */
> >> -		size_t size = snapshot->eof - snapshot->start;
> >> -		char *buf_copy = xmalloc(size);
> >> -
> >> -		memcpy(buf_copy, snapshot->start, size);
> >> -		clear_snapshot_buffer(snapshot);
> >> -		snapshot->buf = snapshot->start = buf_copy;
> >> -		snapshot->eof = buf_copy + size;
> >> -	}
> >> +	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
> >> +		munmap_temporary_snapshot(snapshot);
> >
> > The original triggers this conditional whenever the strategy is not
> > MMAP_OK (so MMAP_TEMPORARY or MMAP_NONE). But in your post-image, we do
> > so only for MMAP_TEMPORARY.
> >
> > I can guess that the two end up the same, because snapshot->mmapped
> > would never be set when MMAP_NONE is set. But if we are going to make
> > such a logical inference, it should be explained in the commit message
> > (though my preference is to leave the code as-is, or to pull the
> > refactor into its own commit).

That's right, I made a mistake here. I somehow think that we should just
check whether "mmap_strategy" is "MMAP_TEMPORARY". And this would be
enough. Because when "mmap_strategy" is "MMAP_NONE", we would never call
`mmap`.

Actually, when I refactor the code, this one makes me quite confusing.
And I think we should not change. I will revert in the next version.

> 
> Good thinking.
> 
> An "extract" step that is meant as a preliminary refactoring should
> not make such a change.  The change may or may not prepare the code
> for better maintainability, but I agree that such a change needs to
> be justified separately.
> 

Yeah, sure. I didn't consider well.

> Thanks.

Thanks,
Jialuo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v2 4/4] packed-backend: mmap large "packed-refs" file during fsck
  2025-05-08 20:07     ` Jeff King
@ 2025-05-09 15:21       ` shejialuo
  2025-05-09 15:59         ` Jeff King
  0 siblings, 1 reply; 67+ messages in thread
From: shejialuo @ 2025-05-09 15:21 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano, Patrick Steinhardt

On Thu, May 08, 2025 at 04:07:41PM -0400, Jeff King wrote:
> On Wed, May 07, 2025 at 10:54:03PM +0800, shejialuo wrote:
> 
> > diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> > index ae6b6845a6..ff744f1d4c 100644
> > --- a/refs/packed-backend.c
> > +++ b/refs/packed-backend.c
> > @@ -2079,7 +2079,7 @@ static int packed_fsck(struct ref_store *ref_store,
> >  {
> >  	struct packed_ref_store *refs = packed_downcast(ref_store,
> >  							REF_STORE_READ, "fsck");
> > -	struct strbuf packed_ref_content = STRBUF_INIT;
> > +	struct snapshot *snapshot = xcalloc(1, sizeof(*snapshot));
> 
> Minor, but is there any reason to allocate this here and not just:
> 
>   struct snapshot snapshot = { 0 };
> 
> ?

I simply copy the code from the existing code... I will change.

> 
> > @@ -2126,21 +2126,23 @@ static int packed_fsck(struct ref_store *ref_store,
> >  	if (!st.st_size)
> >  		goto cleanup;
> >  
> > -	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
> > -		ret = error_errno(_("unable to read '%s'"), refs->path);
> > +	if (!allocate_snapshot_buffer(snapshot, fd, &st))
> >  		goto cleanup;
> > -	}
> 
> Looking at allocate_snapshot_buffer(), it will return 0 only when the
> file is empty (and thus there is nothing to allocate) and will
> otherwise die(). So we do not need to report any error when it fails.
> Good.
> 
> But that makes the "!st.st_size" check in the context redundant, doesn't
> it? It can just go away.
> 

Good catch. I remember in the V1, this does not exist. I may make
something wrong when rebasing the code. Thanks!

> > -	ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
> > -				      packed_ref_content.buf + packed_ref_content.len);
> > +	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
> > +		munmap_temporary_snapshot(snapshot);
> > +
> > +	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot->start,
> > +				      snapshot->eof);
> 
> Why are we unmapping here before we use the content? That will create an
> allocated in-memory copy of the mmap'd content. I thought the whole
> point here was to avoid doing so.
> 

I simply follow how "create_snapshot" does. Actually, I am also quite
confused about this. If we would eventually copy the content into the
user space's memory. What is the reason that we mmap at Windows in the
first place?

My understanding is that after mmaping, we need to do some sanity checks
and then if there is a need, we may sort the "packed-refs" file. So, we
would improve some efficiency at Windows for this part?

> It does shorten the amount of time we hold the temporary mmap in place,
> but I don't think we care about that here. The whole point of
> MMAP_TEMPORARY is that we usually hold the packed-refs file open across
> many requests, and on some platforms (like Windows) we don't want to do
> that. But in this code path we plan to mmap, do our verification, and
> then drop the snapshot. So we're always "temporary" anyway.
> 
> I.e., I'd have expected this code to allocate_snapshot_buffer(), do its
> checks, and then call clear_snapshot_buffer().
> 

I will improve this in the next version.

> -Peff

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v2 0/4] align the behavior when opening "packed-refs"
  2025-05-08 20:33         ` Jeff King
@ 2025-05-09 15:26           ` shejialuo
  0 siblings, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-09 15:26 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Patrick Steinhardt

On Thu, May 08, 2025 at 04:33:50PM -0400, Jeff King wrote:
> On Thu, May 08, 2025 at 01:20:54PM -0700, Junio C Hamano wrote:
> 
> > > I left a few comments that I think bear addressing (or at least some
> > > discussion).
> > 
> > I found both of your points very good ones, especially the "why are
> > we making an in-core copy anyway later?"
> > 
> > Which may probably mean that munmap_temporary_snapshot() is not the
> > helper function we want in the code path, so one of the preliminary
> > refactoring patches can be removed.
> 
> Yep.
> 

Right, I will remove this patch in the next version.

> > Even on mmap-incapable platforms, we have enough emulation in
> > git_mmap() and git_munmap(), and this code path that wants to read a
> > packed-refs file just mmap(), do its thing, and then munmap(),
> > without worrying anything about "ah, temporary, so we need to make
> > an in-core copy for ourselves".
> 
> Hmm, yeah. I had thought that somebody could explicitly set
> mmap_strategy themselves via config, in which case we'd want to avoid
> calling git_mmap() on systems where it actually does mmap. But it
> doesn't look like we have any mechanism for setting the config. So
> MMAP_NONE happens only when NO_MMAP is set.
> 
>   I briefly wondered if the existing code could be simplified to get rid
>   of MMAP_NONE, since it could also rely on git_mmap() to make an
>   in-memory copy. But it gets weird with MMAP_PREVENTS_DELETE and
>   NO_MMAP combined, since then we make a pointless extra copy (one to
>   fake-mmap, and one into the new buffer). OTOH, I'd expect those to be
>   mutually exclusive.
> 
>   I kind of wonder with MMAP_TEMPORARY why we bother mapping at all and
>   not just reading into a buffer. Maybe it's more efficient?
> 

I am also confused about this. If we eventually would allocate a buffer,
it is wired that we map in the first place.

>   Probably not worth revisiting all of this old code, though. And
>   anyway, because of the SMALL_FILE_SIZE check, we couldn't even
>   simplify away the read_in_full() code.
> 

Right, I will try to clean the code later. At now, I think we need to
concentrate on using `mmap`. I will add this into my TODOs. Thanks for
the suggestion.

> -Peff

Thanks,
Jialuo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v2 4/4] packed-backend: mmap large "packed-refs" file during fsck
  2025-05-09 15:21       ` shejialuo
@ 2025-05-09 15:59         ` Jeff King
  2025-05-09 16:40           ` shejialuo
  0 siblings, 1 reply; 67+ messages in thread
From: Jeff King @ 2025-05-09 15:59 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Junio C Hamano, Patrick Steinhardt

On Fri, May 09, 2025 at 11:21:34PM +0800, shejialuo wrote:

> > > -	struct strbuf packed_ref_content = STRBUF_INIT;
> > > +	struct snapshot *snapshot = xcalloc(1, sizeof(*snapshot));
> > 
> > Minor, but is there any reason to allocate this here and not just:
> > 
> >   struct snapshot snapshot = { 0 };
> > 
> > ?
> 
> I simply copy the code from the existing code... I will change.

Ah, I see. The existing code must allocate on the heap because it is
returning the snapshot to its caller. But here the variable is
completely local to the function.

However, if we stop using mmap_strategy altogether and just use xmmap()
directly, I don't think you'd even need a snapshot variable.

> > Why are we unmapping here before we use the content? That will create an
> > allocated in-memory copy of the mmap'd content. I thought the whole
> > point here was to avoid doing so.
> > 
> 
> I simply follow how "create_snapshot" does. Actually, I am also quite
> confused about this. If we would eventually copy the content into the
> user space's memory. What is the reason that we mmap at Windows in the
> first place?
> 
> My understanding is that after mmaping, we need to do some sanity checks
> and then if there is a need, we may sort the "packed-refs" file. So, we
> would improve some efficiency at Windows for this part?

Ah, yes, that makes sense for the existing code. If we do have to sort,
then mapping on Windows means we could sort into our internal buffer,
saving an extra copy. That is probably a rare case these days (once upon
a time the file was not guaranteed to be sorted, but these days the
writer makes sure it is sorted and puts a marker in the header claiming
it is so). So maybe that approach is not as useful as it once was, but I
don't know if it's worth spending effort to rip it out now.

I don't think any of that applies for packed_fsck(), though, since it is
just processing the file linearly in that case (rather than making a
sorted copy).

-Peff

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v2 4/4] packed-backend: mmap large "packed-refs" file during fsck
  2025-05-09 15:59         ` Jeff King
@ 2025-05-09 16:40           ` shejialuo
  0 siblings, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-09 16:40 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano, Patrick Steinhardt

On Fri, May 09, 2025 at 11:59:34AM -0400, Jeff King wrote:
> On Fri, May 09, 2025 at 11:21:34PM +0800, shejialuo wrote:
> 
> > > > -	struct strbuf packed_ref_content = STRBUF_INIT;
> > > > +	struct snapshot *snapshot = xcalloc(1, sizeof(*snapshot));
> > > 
> > > Minor, but is there any reason to allocate this here and not just:
> > > 
> > >   struct snapshot snapshot = { 0 };
> > > 
> > > ?
> > 
> > I simply copy the code from the existing code... I will change.
> 
> Ah, I see. The existing code must allocate on the heap because it is
> returning the snapshot to its caller. But here the variable is
> completely local to the function.
> 
> However, if we stop using mmap_strategy altogether and just use xmmap()
> directly, I don't think you'd even need a snapshot variable.
> 

Yes, that's right. Maybe the simplest way is to use `xmmap()`. But I
don't want to introduce repetition. In the current codebase, we already
have the logic to load the "packed-refs". Let's just reuse it.

Out of topic, I have more to express.

Actually, my eventual goal is to unify the fsck and other parts. For
example, "create_snapshot" would also do some basic sanity checks. And
of course, there is some overlap between fsck and this function. And
also for "next_record", it would also check something.

During my implementation, I find it hard to unify and it would require a
lot of effort. So, I simply introduce some redundant logic. But this is
not perfect. Because I still parse the "packed-refs" file just like
"next_record" and "create_snapshot" do. And the most disappointed thing
is that I cannot reuse them at all. But the things I want to do is very
similar to these functions.

So, I think I would eventually to find out a way to do above in the
future.

Thanks,
Jialuo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v3 0/3] align the behavior when opening "packed-refs"
  2025-05-07 14:52 ` [PATCH v2 0/4] align the behavior when opening "packed-refs" shejialuo
                     ` (4 preceding siblings ...)
  2025-05-07 22:51   ` [PATCH v2 0/4] align the behavior when opening "packed-refs" Junio C Hamano
@ 2025-05-11 13:59   ` shejialuo
  2025-05-11 14:01     ` [PATCH v3 1/3] packed-backend: fsck should allow an empty "packed-refs" file shejialuo
                       ` (3 more replies)
  5 siblings, 4 replies; 67+ messages in thread
From: shejialuo @ 2025-05-11 13:59 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

Hi All:

As discussed in [1], we need to use mmap mechanism to open large
"packed_refs" file to save the memory usage. This patch mainly does the
following things:

1: Fix an issue that we would report an error when the "packed-refs"
file is empty, which does not align with the runtime behavior.
2-4: Extract some logic from the existing code and then use these
created helper functions to let fsck code to use mmap necessarily

[1] https://lore.kernel.org/git/20250503133158.GA4450@coredump.intra.peff.net

Really thank Peff and Patrick to suggest me to do above change.

---

Change in v2:

1. Update the commit message of [PATCH 1/4]. And use redirection to
create an empty file instead of using `touch`.
2. Don't use if for the refactored function in [PATCH 3/4] and then
update the commit message to align with the new function name.
3. Enhance the commit message of [PATCH 4/4].

---

Change in v3:

1. Drop the patch which creates a new function
"munmap_temporary_snapshot". As discussed, there is no need to munmap
the file during fsck.
2. Allocate snapshot variable in the stack instead of heap.
3. Fix rebase issue, remove unneeded code to check the file size
explicitly.

The range-diff seems unreadable. Briefly, for this new version, the only
change is the last patch compared to the v2.

    packed-backend: mmap large "packed-refs" file during fsck

Thanks,
Jialuo

shejialuo (3):
  packed-backend: fsck should allow an empty "packed-refs" file
  packed-backend: extract snapshot allocation in `load_contents`
  packed-backend: mmap large "packed-refs" file during fsck

 refs/packed-backend.c    | 70 ++++++++++++++++++++++------------------
 t/t0602-reffiles-fsck.sh | 13 ++++++++
 2 files changed, 52 insertions(+), 31 deletions(-)

Range-diff against v2:
-:  ---------- > 1:  26c3fd55a8 packed-backend: fsck should allow an empty "packed-refs" file
1:  c0609afac9 ! 2:  4604be8b51 packed-backend: extract munmap operation for `MMAP_TEMPORARY`
    @@ Metadata
     Author: shejialuo <shejialuo@gmail.com>
     
      ## Commit message ##
    -    packed-backend: extract munmap operation for `MMAP_TEMPORARY`
    +    packed-backend: extract snapshot allocation in `load_contents`
     
    -    "create_snapshot" would try to munmap the file when the "mmap_strategy"
    -    is "MMAP_TEMPORARY". We also need to do this operation when checking the
    -    consistency of the "packed-refs" file.
    +    "load_contents" would choose which way to load the content of the
    +    "packed-refs". However, we cannot directly use this function when
    +    checking the consistency due to we don't want to open the file. And we
    +    also need to reuse the logic to avoid causing repetition.
     
    -    Create a new function "munmap_temporary_snapshot" to do above and
    -    change "create_snapshot" to align with the behavior.
    +    Let's create a new helper function "allocate_snapshot_buffer" to extract
    +    the snapshot allocation logic in "load_contents" and update the
    +    "load_contents" to align with the behavior.
     
         Suggested-by: Jeff King <peff@peff.net>
         Suggested-by: Patrick Steinhardt <ps@pks.im>
         Signed-off-by: shejialuo <shejialuo@gmail.com>
     
      ## refs/packed-backend.c ##
    -@@ refs/packed-backend.c: static int allocate_snapshot_buffer(struct snapshot *snapshot, int fd, struct st
    - 	return 1;
    - }
    +@@ refs/packed-backend.c: static int refname_contains_nul(struct strbuf *refname)
    + 
    + #define SMALL_FILE_SIZE (32*1024)
      
    -+static void munmap_temporary_snapshot(struct snapshot *snapshot)
    ++static int allocate_snapshot_buffer(struct snapshot *snapshot, int fd, struct stat *st)
     +{
    -+	char *buf_copy;
    ++	ssize_t bytes_read;
     +	size_t size;
     +
    -+	if (!snapshot)
    -+		return;
    ++	size = xsize_t(st->st_size);
    ++	if (!size)
    ++		return 0;
    ++
    ++	if (mmap_strategy == MMAP_NONE || size <= SMALL_FILE_SIZE) {
    ++		snapshot->buf = xmalloc(size);
    ++		bytes_read = read_in_full(fd, snapshot->buf, size);
    ++		if (bytes_read < 0 || bytes_read != size)
    ++			die_errno("couldn't read %s", snapshot->refs->path);
    ++		snapshot->mmapped = 0;
    ++	} else {
    ++		snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    ++		snapshot->mmapped = 1;
    ++	}
     +
    -+	/*
    -+	 * We don't want to leave the file mmapped, so we are
    -+	 * forced to make a copy now:
    -+	 */
    -+	size = snapshot->eof - snapshot->start;
    -+	buf_copy = xmalloc(size);
    ++	snapshot->start = snapshot->buf;
    ++	snapshot->eof = snapshot->buf + size;
     +
    -+	memcpy(buf_copy, snapshot->start, size);
    -+	clear_snapshot_buffer(snapshot);
    -+	snapshot->buf = snapshot->start = buf_copy;
    -+	snapshot->eof = buf_copy + size;
    ++	return 1;
     +}
     +
      /*
       * Depending on `mmap_strategy`, either mmap or read the contents of
       * the `packed-refs` file into the snapshot. Return 1 if the file
    -@@ refs/packed-backend.c: static struct snapshot *create_snapshot(struct packed_ref_store *refs)
    - 		verify_buffer_safe(snapshot);
    - 	}
    +@@ refs/packed-backend.c: static int refname_contains_nul(struct strbuf *refname)
    +  */
    + static int load_contents(struct snapshot *snapshot)
    + {
    +-	int fd;
    + 	struct stat st;
    +-	size_t size;
    +-	ssize_t bytes_read;
    ++	int ret = 1;
    ++	int fd;
      
    --	if (mmap_strategy != MMAP_OK && snapshot->mmapped) {
    --		/*
    --		 * We don't want to leave the file mmapped, so we are
    --		 * forced to make a copy now:
    --		 */
    --		size_t size = snapshot->eof - snapshot->start;
    --		char *buf_copy = xmalloc(size);
    + 	fd = open(snapshot->refs->path, O_RDONLY);
    + 	if (fd < 0) {
    +@@ refs/packed-backend.c: static int load_contents(struct snapshot *snapshot)
    + 
    + 	if (fstat(fd, &st) < 0)
    + 		die_errno("couldn't stat %s", snapshot->refs->path);
    +-	size = xsize_t(st.st_size);
     -
    --		memcpy(buf_copy, snapshot->start, size);
    --		clear_snapshot_buffer(snapshot);
    --		snapshot->buf = snapshot->start = buf_copy;
    --		snapshot->eof = buf_copy + size;
    +-	if (!size) {
    +-		close(fd);
    +-		return 0;
    +-	} else if (mmap_strategy == MMAP_NONE || size <= SMALL_FILE_SIZE) {
    +-		snapshot->buf = xmalloc(size);
    +-		bytes_read = read_in_full(fd, snapshot->buf, size);
    +-		if (bytes_read < 0 || bytes_read != size)
    +-			die_errno("couldn't read %s", snapshot->refs->path);
    +-		snapshot->mmapped = 0;
    +-	} else {
    +-		snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    +-		snapshot->mmapped = 1;
     -	}
    -+	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
    -+		munmap_temporary_snapshot(snapshot);
    +-	close(fd);
    + 
    +-	snapshot->start = snapshot->buf;
    +-	snapshot->eof = snapshot->buf + size;
    ++	if (!allocate_snapshot_buffer(snapshot, fd, &st))
    ++		ret = 0;
      
    - 	return snapshot;
    +-	return 1;
    ++	close(fd);
    ++	return ret;
      }
    + 
    + static const char *find_reference_location_1(struct snapshot *snapshot,
2:  c868e3dd16 ! 3:  82e19a65c6 packed-backend: mmap large "packed-refs" file during fsck
    @@ Commit message
         "packed-backend.c" use this way, we should make "fsck" align with the
         current codebase.
     
    -    As we have introduced two helper functions "allocate_snapshot_buffer"
    -    and "munmap_snapshot_if_temporary", we could simply call these functions
    -    to use mmap mechanism.
    +    As we have introduced the helper function "allocate_snapshot_buffer", we
    +    could simple use this function to use mmap mechanism.
     
         Suggested-by: Jeff King <peff@peff.net>
         Suggested-by: Patrick Steinhardt <ps@pks.im>
    @@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
      	struct packed_ref_store *refs = packed_downcast(ref_store,
      							REF_STORE_READ, "fsck");
     -	struct strbuf packed_ref_content = STRBUF_INIT;
    -+	struct snapshot *snapshot = xcalloc(1, sizeof(*snapshot));
    ++	struct snapshot snapshot = { 0 };
      	unsigned int sorted = 0;
      	struct stat st;
      	int ret = 0;
     @@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
    - 	if (!st.st_size)
    + 		goto cleanup;
    + 	}
    + 
    +-	if (!st.st_size)
    ++	if (!allocate_snapshot_buffer(&snapshot, fd, &st))
      		goto cleanup;
      
     -	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
     -		ret = error_errno(_("unable to read '%s'"), refs->path);
    -+	if (!allocate_snapshot_buffer(snapshot, fd, &st))
    - 		goto cleanup;
    +-		goto cleanup;
     -	}
    - 
    +-
     -	ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
     -				      packed_ref_content.buf + packed_ref_content.len);
    -+	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
    -+		munmap_temporary_snapshot(snapshot);
    -+
    -+	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot->start,
    -+				      snapshot->eof);
    ++	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot.start,
    ++				      snapshot.eof);
      	if (!ret && sorted)
     -		ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
     -					     packed_ref_content.buf + packed_ref_content.len);
    -+		ret = packed_fsck_ref_sorted(o, ref_store, snapshot->start,
    -+					     snapshot->eof);
    ++		ret = packed_fsck_ref_sorted(o, ref_store, snapshot.start,
    ++					     snapshot.eof);
      
      cleanup:
      	if (fd >= 0)
      		close(fd);
     -	strbuf_release(&packed_ref_content);
    -+	clear_snapshot_buffer(snapshot);
    -+	free(snapshot);
    ++	clear_snapshot_buffer(&snapshot);
      	return ret;
      }
      
-- 
2.49.0


^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v3 1/3] packed-backend: fsck should allow an empty "packed-refs" file
  2025-05-11 13:59   ` [PATCH v3 0/3] " shejialuo
@ 2025-05-11 14:01     ` shejialuo
  2025-05-12  8:36       ` Patrick Steinhardt
  2025-05-11 14:01     ` [PATCH v3 2/3] packed-backend: extract snapshot allocation in `load_contents` shejialuo
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 67+ messages in thread
From: shejialuo @ 2025-05-11 14:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

During fsck, an empty "packed-refs" gives an error; this is unwarranted.
We should just skip checking the content of "packed-refs" just like the
runtime code paths such as "create_snapshot" which simply returns the
"snapshot" without checking the content of "packed-refs".

Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c    |  3 +++
 t/t0602-reffiles-fsck.sh | 13 +++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 3ad1ed0787..0dd6c6677b 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -2103,6 +2103,9 @@ static int packed_fsck(struct ref_store *ref_store,
 		goto cleanup;
 	}
 
+	if (!st.st_size)
+		goto cleanup;
+
 	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
 		ret = error_errno(_("unable to read '%s'"), refs->path);
 		goto cleanup;
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 9d1dc2144c..e04967581c 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -647,6 +647,19 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
 	)
 '
 
+test_expect_success 'empty packed-refs should not be reported' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit default &&
+
+		>.git/packed-refs &&
+		git refs verify 2>err &&
+		test_must_be_empty err
+	)
+'
+
 test_expect_success 'packed-refs header should be checked' '
 	test_when_finished "rm -rf repo" &&
 	git init repo &&
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v3 2/3] packed-backend: extract snapshot allocation in `load_contents`
  2025-05-11 13:59   ` [PATCH v3 0/3] " shejialuo
  2025-05-11 14:01     ` [PATCH v3 1/3] packed-backend: fsck should allow an empty "packed-refs" file shejialuo
@ 2025-05-11 14:01     ` shejialuo
  2025-05-12  8:37       ` Patrick Steinhardt
  2025-05-12 13:06       ` Jeff King
  2025-05-11 14:01     ` [PATCH v3 3/3] packed-backend: mmap large "packed-refs" file during fsck shejialuo
  2025-05-13 11:06     ` [PATCH v4 0/3] align the behavior when opening "packed-refs" shejialuo
  3 siblings, 2 replies; 67+ messages in thread
From: shejialuo @ 2025-05-11 14:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

"load_contents" would choose which way to load the content of the
"packed-refs". However, we cannot directly use this function when
checking the consistency due to we don't want to open the file. And we
also need to reuse the logic to avoid causing repetition.

Let's create a new helper function "allocate_snapshot_buffer" to extract
the snapshot allocation logic in "load_contents" and update the
"load_contents" to align with the behavior.

Suggested-by: Jeff King <peff@peff.net>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c | 54 +++++++++++++++++++++++++------------------
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 0dd6c6677b..e582227772 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -517,6 +517,32 @@ static int refname_contains_nul(struct strbuf *refname)
 
 #define SMALL_FILE_SIZE (32*1024)
 
+static int allocate_snapshot_buffer(struct snapshot *snapshot, int fd, struct stat *st)
+{
+	ssize_t bytes_read;
+	size_t size;
+
+	size = xsize_t(st->st_size);
+	if (!size)
+		return 0;
+
+	if (mmap_strategy == MMAP_NONE || size <= SMALL_FILE_SIZE) {
+		snapshot->buf = xmalloc(size);
+		bytes_read = read_in_full(fd, snapshot->buf, size);
+		if (bytes_read < 0 || bytes_read != size)
+			die_errno("couldn't read %s", snapshot->refs->path);
+		snapshot->mmapped = 0;
+	} else {
+		snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
+		snapshot->mmapped = 1;
+	}
+
+	snapshot->start = snapshot->buf;
+	snapshot->eof = snapshot->buf + size;
+
+	return 1;
+}
+
 /*
  * Depending on `mmap_strategy`, either mmap or read the contents of
  * the `packed-refs` file into the snapshot. Return 1 if the file
@@ -525,10 +551,9 @@ static int refname_contains_nul(struct strbuf *refname)
  */
 static int load_contents(struct snapshot *snapshot)
 {
-	int fd;
 	struct stat st;
-	size_t size;
-	ssize_t bytes_read;
+	int ret = 1;
+	int fd;
 
 	fd = open(snapshot->refs->path, O_RDONLY);
 	if (fd < 0) {
@@ -550,27 +575,12 @@ static int load_contents(struct snapshot *snapshot)
 
 	if (fstat(fd, &st) < 0)
 		die_errno("couldn't stat %s", snapshot->refs->path);
-	size = xsize_t(st.st_size);
-
-	if (!size) {
-		close(fd);
-		return 0;
-	} else if (mmap_strategy == MMAP_NONE || size <= SMALL_FILE_SIZE) {
-		snapshot->buf = xmalloc(size);
-		bytes_read = read_in_full(fd, snapshot->buf, size);
-		if (bytes_read < 0 || bytes_read != size)
-			die_errno("couldn't read %s", snapshot->refs->path);
-		snapshot->mmapped = 0;
-	} else {
-		snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
-		snapshot->mmapped = 1;
-	}
-	close(fd);
 
-	snapshot->start = snapshot->buf;
-	snapshot->eof = snapshot->buf + size;
+	if (!allocate_snapshot_buffer(snapshot, fd, &st))
+		ret = 0;
 
-	return 1;
+	close(fd);
+	return ret;
 }
 
 static const char *find_reference_location_1(struct snapshot *snapshot,
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v3 3/3] packed-backend: mmap large "packed-refs" file during fsck
  2025-05-11 13:59   ` [PATCH v3 0/3] " shejialuo
  2025-05-11 14:01     ` [PATCH v3 1/3] packed-backend: fsck should allow an empty "packed-refs" file shejialuo
  2025-05-11 14:01     ` [PATCH v3 2/3] packed-backend: extract snapshot allocation in `load_contents` shejialuo
@ 2025-05-11 14:01     ` shejialuo
  2025-05-12 13:08       ` Jeff King
  2025-05-13 11:06     ` [PATCH v4 0/3] align the behavior when opening "packed-refs" shejialuo
  3 siblings, 1 reply; 67+ messages in thread
From: shejialuo @ 2025-05-11 14:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

During fsck, we use "strbuf_read" to read the content of "packed-refs"
without using mmap mechanism. This is a bad practice which would consume
more memory than using mmap mechanism. Besides, as all code paths in
"packed-backend.c" use this way, we should make "fsck" align with the
current codebase.

As we have introduced the helper function "allocate_snapshot_buffer", we
could simple use this function to use mmap mechanism.

Suggested-by: Jeff King <peff@peff.net>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index e582227772..85f5a45160 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -2069,7 +2069,7 @@ static int packed_fsck(struct ref_store *ref_store,
 {
 	struct packed_ref_store *refs = packed_downcast(ref_store,
 							REF_STORE_READ, "fsck");
-	struct strbuf packed_ref_content = STRBUF_INIT;
+	struct snapshot snapshot = { 0 };
 	unsigned int sorted = 0;
 	struct stat st;
 	int ret = 0;
@@ -2113,24 +2113,19 @@ static int packed_fsck(struct ref_store *ref_store,
 		goto cleanup;
 	}
 
-	if (!st.st_size)
+	if (!allocate_snapshot_buffer(&snapshot, fd, &st))
 		goto cleanup;
 
-	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
-		ret = error_errno(_("unable to read '%s'"), refs->path);
-		goto cleanup;
-	}
-
-	ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
-				      packed_ref_content.buf + packed_ref_content.len);
+	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot.start,
+				      snapshot.eof);
 	if (!ret && sorted)
-		ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
-					     packed_ref_content.buf + packed_ref_content.len);
+		ret = packed_fsck_ref_sorted(o, ref_store, snapshot.start,
+					     snapshot.eof);
 
 cleanup:
 	if (fd >= 0)
 		close(fd);
-	strbuf_release(&packed_ref_content);
+	clear_snapshot_buffer(&snapshot);
 	return ret;
 }
 
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH v3 1/3] packed-backend: fsck should allow an empty "packed-refs" file
  2025-05-11 14:01     ` [PATCH v3 1/3] packed-backend: fsck should allow an empty "packed-refs" file shejialuo
@ 2025-05-12  8:36       ` Patrick Steinhardt
  2025-05-12 12:25         ` shejialuo
  0 siblings, 1 reply; 67+ messages in thread
From: Patrick Steinhardt @ 2025-05-12  8:36 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Junio C Hamano, Jeff King

On Sun, May 11, 2025 at 10:01:43PM +0800, shejialuo wrote:
> During fsck, an empty "packed-refs" gives an error; this is unwarranted.
> We should just skip checking the content of "packed-refs" just like the
> runtime code paths such as "create_snapshot" which simply returns the
> "snapshot" without checking the content of "packed-refs".

I think this doesn't quite answer the question whether this is a _good_
idea though. The question that we need to answer is whether there are
any writing code paths that may end up writing a "packed-refs" file that
is completely empty. Modern Git would at least write the packed-refs
header, wouldn't it?

The reason why I'm a little sceptical is that there is a common problem
with ext4 caused by its delayed allocation [1]. If you:

  1. Write data to a temporary file.
  2. Rename the file into place.
  3. The host system crashes.

Then it may happen that the renamed file is now completely empty.

The root cause is a bug in the application: before renaming the file
into place it _must_ fsync the file to disk. Git does that by default,
but it is extremely easy to get wrong and we had bugs around this until
~2 years ago, if I remember correctly. We hit the problem several times
in our production systems.

So I wonder whether ignoring empty files would cause us to miss such a
common error. But I guess if there are valid cases where we may end up
with an empty "packed-refs" file we cannot do anything about it.

Patrick

[1]: https://thunk.org/tytso/blog/2009/03/12/delayed-allocation-and-the-zero-length-file-problem/

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v3 2/3] packed-backend: extract snapshot allocation in `load_contents`
  2025-05-11 14:01     ` [PATCH v3 2/3] packed-backend: extract snapshot allocation in `load_contents` shejialuo
@ 2025-05-12  8:37       ` Patrick Steinhardt
  2025-05-12 10:35         ` shejialuo
  2025-05-12 13:06       ` Jeff King
  1 sibling, 1 reply; 67+ messages in thread
From: Patrick Steinhardt @ 2025-05-12  8:37 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Junio C Hamano, Jeff King

On Sun, May 11, 2025 at 10:01:50PM +0800, shejialuo wrote:
> "load_contents" would choose which way to load the content of the
> "packed-refs". However, we cannot directly use this function when
> checking the consistency due to we don't want to open the file. And we
> also need to reuse the logic to avoid causing repetition.
> 
> Let's create a new helper function "allocate_snapshot_buffer" to extract
> the snapshot allocation logic in "load_contents" and update the
> "load_contents" to align with the behavior.
> 
> Suggested-by: Jeff King <peff@peff.net>
> Suggested-by: Patrick Steinhardt <ps@pks.im>

Huh. Are you sure I suggested this? :) I cannot remember at least.

That being said, the change looks sensible.

Patrick

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v3 2/3] packed-backend: extract snapshot allocation in `load_contents`
  2025-05-12  8:37       ` Patrick Steinhardt
@ 2025-05-12 10:35         ` shejialuo
  2025-05-12 14:41           ` Patrick Steinhardt
  0 siblings, 1 reply; 67+ messages in thread
From: shejialuo @ 2025-05-12 10:35 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Junio C Hamano, Jeff King

On Mon, May 12, 2025 at 10:37:03AM +0200, Patrick Steinhardt wrote:
> On Sun, May 11, 2025 at 10:01:50PM +0800, shejialuo wrote:
> > "load_contents" would choose which way to load the content of the
> > "packed-refs". However, we cannot directly use this function when
> > checking the consistency due to we don't want to open the file. And we
> > also need to reuse the logic to avoid causing repetition.
> > 
> > Let's create a new helper function "allocate_snapshot_buffer" to extract
> > the snapshot allocation logic in "load_contents" and update the
> > "load_contents" to align with the behavior.
> > 
> > Suggested-by: Jeff King <peff@peff.net>
> > Suggested-by: Patrick Steinhardt <ps@pks.im>
> 
> Huh. Are you sure I suggested this? :) I cannot remember at least.
> 

Because you explain me a lot how Gitlab handles and Peff tells me how
Github handles, I add both of you.

> That being said, the change looks sensible.
> 
> Patrick

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v3 1/3] packed-backend: fsck should allow an empty "packed-refs" file
  2025-05-12  8:36       ` Patrick Steinhardt
@ 2025-05-12 12:25         ` shejialuo
  2025-05-12 14:39           ` Patrick Steinhardt
  0 siblings, 1 reply; 67+ messages in thread
From: shejialuo @ 2025-05-12 12:25 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Junio C Hamano, Jeff King

On Mon, May 12, 2025 at 10:36:50AM +0200, Patrick Steinhardt wrote:
> On Sun, May 11, 2025 at 10:01:43PM +0800, shejialuo wrote:
> > During fsck, an empty "packed-refs" gives an error; this is unwarranted.
> > We should just skip checking the content of "packed-refs" just like the
> > runtime code paths such as "create_snapshot" which simply returns the
> > "snapshot" without checking the content of "packed-refs".
> 
> I think this doesn't quite answer the question whether this is a _good_
> idea though. The question that we need to answer is whether there are
> any writing code paths that may end up writing a "packed-refs" file that
> is completely empty. Modern Git would at least write the packed-refs
> header, wouldn't it?
> 

That's right. In the current codebase, we would always write the header
which could be easily reproduced by using the following command:

    git init repo
    cd repo && git pack-refs
    cat .git/packed-refs

And in "packed-backend.c::write_with_updates", we would always write the
header.

> The reason why I'm a little sceptical is that there is a common problem
> with ext4 caused by its delayed allocation [1]. If you:
> 
>   1. Write data to a temporary file.
>   2. Rename the file into place.
>   3. The host system crashes.
> 
> Then it may happen that the renamed file is now completely empty.
> 
> The root cause is a bug in the application: before renaming the file
> into place it _must_ fsync the file to disk. Git does that by default,
> but it is extremely easy to get wrong and we had bugs around this until
> ~2 years ago, if I remember correctly. We hit the problem several times
> in our production systems.
> 

I see. I agree that in the most situation, an empty "packed-refs" file
means that there is an issue.

> So I wonder whether ignoring empty files would cause us to miss such a
> common error. But I guess if there are valid cases where we may end up
> with an empty "packed-refs" file we cannot do anything about it.
> 

I somehow think we would always write header in the Modern Git. But
"create_snapshot" accept an empty existing "packed-refs" file at
runtime.

And header is introduced in 694b7a1999 (repack_without_ref(): write peeled
refs in the rewritten file, 2013-04-22). At this commit, we would always
write the header into the "packed-refs" file.

But in runtime, we accept empty file or no header of the file content as we
want to keep compatible. In my humble word, I think we should allow
empty file at now. Then, In Git 3.0, we tighten all the rules (there
must always be a header etc) and also update the runtime behavior.

> Patrick
> 
> [1]: https://thunk.org/tytso/blog/2009/03/12/delayed-allocation-and-the-zero-length-file-problem/

Thanks,
Jialuo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v3 2/3] packed-backend: extract snapshot allocation in `load_contents`
  2025-05-11 14:01     ` [PATCH v3 2/3] packed-backend: extract snapshot allocation in `load_contents` shejialuo
  2025-05-12  8:37       ` Patrick Steinhardt
@ 2025-05-12 13:06       ` Jeff King
  2025-05-13  6:55         ` shejialuo
  1 sibling, 1 reply; 67+ messages in thread
From: Jeff King @ 2025-05-12 13:06 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Junio C Hamano, Patrick Steinhardt

On Sun, May 11, 2025 at 10:01:50PM +0800, shejialuo wrote:

> "load_contents" would choose which way to load the content of the
> "packed-refs". However, we cannot directly use this function when
> checking the consistency due to we don't want to open the file. And we
> also need to reuse the logic to avoid causing repetition.
> 
> Let's create a new helper function "allocate_snapshot_buffer" to extract
> the snapshot allocation logic in "load_contents" and update the
> "load_contents" to align with the behavior.

This looks good to me. One thing that did give me a slight pause while
reviewing:

>  static int load_contents(struct snapshot *snapshot)
>  {
> -	int fd;
>  	struct stat st;
> -	size_t size;
> -	ssize_t bytes_read;
> +	int ret = 1;
> [...]
> +	if (!allocate_snapshot_buffer(snapshot, fd, &st))
> +		ret = 0;
>  
> -	return 1;
> +	close(fd);
> +	return ret;
>  }

I wanted to see what the semantics of "ret" were, but there aren't any
other assignments. So I think this is equivalent to:

  int ret;
  ...
  ret = allocate_snapshot_buffer(snapshot, fd, &st);

which makes it a little more clear we are just relaying the value from
that function.

Probably not worth re-rolling for that, though.

-Peff

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v3 3/3] packed-backend: mmap large "packed-refs" file during fsck
  2025-05-11 14:01     ` [PATCH v3 3/3] packed-backend: mmap large "packed-refs" file during fsck shejialuo
@ 2025-05-12 13:08       ` Jeff King
  0 siblings, 0 replies; 67+ messages in thread
From: Jeff King @ 2025-05-12 13:08 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Junio C Hamano, Patrick Steinhardt

On Sun, May 11, 2025 at 10:01:59PM +0800, shejialuo wrote:

> @@ -2113,24 +2113,19 @@ static int packed_fsck(struct ref_store *ref_store,
>  		goto cleanup;
>  	}
>  
> -	if (!st.st_size)
> +	if (!allocate_snapshot_buffer(&snapshot, fd, &st))
>  		goto cleanup;
>  
> -	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
> -		ret = error_errno(_("unable to read '%s'"), refs->path);
> -		goto cleanup;
> -	}
> -
> -	ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
> -				      packed_ref_content.buf + packed_ref_content.len);
> +	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot.start,
> +				      snapshot.eof);
>  	if (!ret && sorted)
> -		ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
> -					     packed_ref_content.buf + packed_ref_content.len);
> +		ret = packed_fsck_ref_sorted(o, ref_store, snapshot.start,
> +					     snapshot.eof);

OK, so we still use allocate_snapshot_buffer(), but then we hold on to
the snapshot. Good.

I do think just using xmmap() would have been sufficient (and not needed
patch 2 then), but I'm OK with this direction under the logic that we
may end up sharing more code between the normal and fsck code paths
eventually.

-Peff

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v3 1/3] packed-backend: fsck should allow an empty "packed-refs" file
  2025-05-12 12:25         ` shejialuo
@ 2025-05-12 14:39           ` Patrick Steinhardt
  2025-05-12 15:56             ` Jeff King
  0 siblings, 1 reply; 67+ messages in thread
From: Patrick Steinhardt @ 2025-05-12 14:39 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Junio C Hamano, Jeff King

On Mon, May 12, 2025 at 08:25:06PM +0800, shejialuo wrote:
> On Mon, May 12, 2025 at 10:36:50AM +0200, Patrick Steinhardt wrote:
> > So I wonder whether ignoring empty files would cause us to miss such a
> > common error. But I guess if there are valid cases where we may end up
> > with an empty "packed-refs" file we cannot do anything about it.
> > 
> 
> I somehow think we would always write header in the Modern Git. But
> "create_snapshot" accept an empty existing "packed-refs" file at
> runtime.
> 
> And header is introduced in 694b7a1999 (repack_without_ref(): write peeled
> refs in the rewritten file, 2013-04-22). At this commit, we would always
> write the header into the "packed-refs" file.

So what happened before this commit? Would we end up writing a
completely empty file?

> But in runtime, we accept empty file or no header of the file content as we
> want to keep compatible. In my humble word, I think we should allow
> empty file at now. Then, In Git 3.0, we tighten all the rules (there
> must always be a header etc) and also update the runtime behavior.

I think it depends. If we can prove that Git (and other, third-party
implementations thereof) wouldn't have ever written such "packed-refs"
files then we should start warning now already, even if we accept such
files at runtime. Otherwise, if there were versions of Git that could
have ended up writing such files I agree that we have to accept this for
now.

But if we were to deprecate the ability to read an empty file I wouldn't
adapt both paths at the Git 3.0 boundary. If we decide that an empty
file is broken we should introduce the check for `git refs verify` way
before we tighten the runtime behaviour. Otherwise there isn't really an
opportunity for people to learn that we are about to remove this.

Patrick

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v3 2/3] packed-backend: extract snapshot allocation in `load_contents`
  2025-05-12 10:35         ` shejialuo
@ 2025-05-12 14:41           ` Patrick Steinhardt
  0 siblings, 0 replies; 67+ messages in thread
From: Patrick Steinhardt @ 2025-05-12 14:41 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Junio C Hamano, Jeff King

On Mon, May 12, 2025 at 06:35:05PM +0800, shejialuo wrote:
> On Mon, May 12, 2025 at 10:37:03AM +0200, Patrick Steinhardt wrote:
> > On Sun, May 11, 2025 at 10:01:50PM +0800, shejialuo wrote:
> > > "load_contents" would choose which way to load the content of the
> > > "packed-refs". However, we cannot directly use this function when
> > > checking the consistency due to we don't want to open the file. And we
> > > also need to reuse the logic to avoid causing repetition.
> > > 
> > > Let's create a new helper function "allocate_snapshot_buffer" to extract
> > > the snapshot allocation logic in "load_contents" and update the
> > > "load_contents" to align with the behavior.
> > > 
> > > Suggested-by: Jeff King <peff@peff.net>
> > > Suggested-by: Patrick Steinhardt <ps@pks.im>
> > 
> > Huh. Are you sure I suggested this? :) I cannot remember at least.
> > 
> 
> Because you explain me a lot how Gitlab handles and Peff tells me how
> Github handles, I add both of you.

I think that would've made sense for the last step where you introduce
the adapted logic. But the intermediate steps have all been designed by
yourself without my help :)

Anyway, I don't mind this, I just found it to be funny.

Patrick

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v3 1/3] packed-backend: fsck should allow an empty "packed-refs" file
  2025-05-12 14:39           ` Patrick Steinhardt
@ 2025-05-12 15:56             ` Jeff King
  2025-05-12 17:18               ` Junio C Hamano
  0 siblings, 1 reply; 67+ messages in thread
From: Jeff King @ 2025-05-12 15:56 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: shejialuo, git, Junio C Hamano

On Mon, May 12, 2025 at 04:39:43PM +0200, Patrick Steinhardt wrote:

> > And header is introduced in 694b7a1999 (repack_without_ref(): write peeled
> > refs in the rewritten file, 2013-04-22). At this commit, we would always
> > write the header into the "packed-refs" file.
> 
> So what happened before this commit? Would we end up writing a
> completely empty file?

Yeah, prior to that if you delete the final entry, you'd get an empty
file. Reproducible with:

  # you can use modern git for this part
  git init
  git commit --allow-empty -m foo
  git checkout --detach
  git pack-refs --all

  # old git writes empty file during deletion
  git.v1.8.4 branch -D main
  ls -l .git/packed-refs

Going back even further, we'd also write an empty file via pack-refs.
With git v1.4.4, doing "git init && git pack-refs" will get you an empty
file. I didn't bisect completely, but presumably that changed in
f4204ab9f6 (Store peeled refs in packed-refs (take 2)., 2006-11-21).

> I think it depends. If we can prove that Git (and other, third-party
> implementations thereof) wouldn't have ever written such "packed-refs"
> files then we should start warning now already, even if we accept such
> files at runtime. Otherwise, if there were versions of Git that could
> have ended up writing such files I agree that we have to accept this for
> now.

So we definitely did write those files. Those are pretty old versions of
Git, but something we should continue to support.

It may be useful for fsck to detect this, though, even if the default
message severity is set to "info" or even "ignore. That would allow
people who know they are using modern Git to increase it themselves (I
don't expect normal users to do this, but it would probably be useful
for forges which run automated "fsck" across a lot of repos).

And then the backwards-incompatible Git 3.0 thing would just be tweaking
the severity of the config (and in the meantime, it would help flush out
any unexpected instances people run into).

-Peff

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v3 1/3] packed-backend: fsck should allow an empty "packed-refs" file
  2025-05-12 15:56             ` Jeff King
@ 2025-05-12 17:18               ` Junio C Hamano
  2025-05-13  5:08                 ` Patrick Steinhardt
  0 siblings, 1 reply; 67+ messages in thread
From: Junio C Hamano @ 2025-05-12 17:18 UTC (permalink / raw)
  To: Jeff King; +Cc: Patrick Steinhardt, shejialuo, git

Jeff King <peff@peff.net> writes:

> It may be useful for fsck to detect this, though, even if the default
> message severity is set to "info" or even "ignore. That would allow
> people who know they are using modern Git to increase it themselves (I
> don't expect normal users to do this, but it would probably be useful
> for forges which run automated "fsck" across a lot of repos).
>
> And then the backwards-incompatible Git 3.0 thing would just be tweaking
> the severity of the config (and in the meantime, it would help flush out
> any unexpected instances people run into).

I came to make a same comment but the above has everything I wanted
to say (and more).

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v3 1/3] packed-backend: fsck should allow an empty "packed-refs" file
  2025-05-12 17:18               ` Junio C Hamano
@ 2025-05-13  5:08                 ` Patrick Steinhardt
  2025-05-13  7:06                   ` shejialuo
  0 siblings, 1 reply; 67+ messages in thread
From: Patrick Steinhardt @ 2025-05-13  5:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, shejialuo, git

On Mon, May 12, 2025 at 10:18:34AM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
> > It may be useful for fsck to detect this, though, even if the default
> > message severity is set to "info" or even "ignore. That would allow
> > people who know they are using modern Git to increase it themselves (I
> > don't expect normal users to do this, but it would probably be useful
> > for forges which run automated "fsck" across a lot of repos).
> >
> > And then the backwards-incompatible Git 3.0 thing would just be tweaking
> > the severity of the config (and in the meantime, it would help flush out
> > any unexpected instances people run into).
> 
> I came to make a same comment but the above has everything I wanted
> to say (and more).

Yup, agreed, that sounds like a reasonable approach indeed.

Patrick

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v3 2/3] packed-backend: extract snapshot allocation in `load_contents`
  2025-05-12 13:06       ` Jeff King
@ 2025-05-13  6:55         ` shejialuo
  0 siblings, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-13  6:55 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano, Patrick Steinhardt

On Mon, May 12, 2025 at 09:06:19AM -0400, Jeff King wrote:
> On Sun, May 11, 2025 at 10:01:50PM +0800, shejialuo wrote:
> 
> > "load_contents" would choose which way to load the content of the
> > "packed-refs". However, we cannot directly use this function when
> > checking the consistency due to we don't want to open the file. And we
> > also need to reuse the logic to avoid causing repetition.
> > 
> > Let's create a new helper function "allocate_snapshot_buffer" to extract
> > the snapshot allocation logic in "load_contents" and update the
> > "load_contents" to align with the behavior.
> 
> This looks good to me. One thing that did give me a slight pause while
> reviewing:
> 
> >  static int load_contents(struct snapshot *snapshot)
> >  {
> > -	int fd;
> >  	struct stat st;
> > -	size_t size;
> > -	ssize_t bytes_read;
> > +	int ret = 1;
> > [...]
> > +	if (!allocate_snapshot_buffer(snapshot, fd, &st))
> > +		ret = 0;
> >  
> > -	return 1;
> > +	close(fd);
> > +	return ret;
> >  }
> 
> I wanted to see what the semantics of "ret" were, but there aren't any
> other assignments. So I think this is equivalent to:
> 
>   int ret;
>   ...
>   ret = allocate_snapshot_buffer(snapshot, fd, &st);
> 
> that function.
> 

That's right, it would be more clear. I will update in the next version.

> Probably not worth re-rolling for that, though.
> 
> -Peff

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v3 1/3] packed-backend: fsck should allow an empty "packed-refs" file
  2025-05-13  5:08                 ` Patrick Steinhardt
@ 2025-05-13  7:06                   ` shejialuo
  0 siblings, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-13  7:06 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Junio C Hamano, Jeff King, git

On Tue, May 13, 2025 at 07:08:02AM +0200, Patrick Steinhardt wrote:
> On Mon, May 12, 2025 at 10:18:34AM -0700, Junio C Hamano wrote:
> > Jeff King <peff@peff.net> writes:
> > 
> > > It may be useful for fsck to detect this, though, even if the default
> > > message severity is set to "info" or even "ignore. That would allow
> > > people who know they are using modern Git to increase it themselves (I
> > > don't expect normal users to do this, but it would probably be useful
> > > for forges which run automated "fsck" across a lot of repos).
> > >
> > > And then the backwards-incompatible Git 3.0 thing would just be tweaking
> > > the severity of the config (and in the meantime, it would help flush out
> > > any unexpected instances people run into).
> > 
> > I came to make a same comment but the above has everything I wanted
> > to say (and more).
> 
> Yup, agreed, that sounds like a reasonable approach indeed.
> 

Agree, I will use a "info" to report an empty "packed-refs" file. Thank
everyone.

> Patrick

Jialuo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v4 0/3] align the behavior when opening "packed-refs"
  2025-05-11 13:59   ` [PATCH v3 0/3] " shejialuo
                       ` (2 preceding siblings ...)
  2025-05-11 14:01     ` [PATCH v3 3/3] packed-backend: mmap large "packed-refs" file during fsck shejialuo
@ 2025-05-13 11:06     ` shejialuo
  2025-05-13 11:07       ` [PATCH v4 1/3] packed-backend: fsck should warn when "packed-refs" file is empty shejialuo
                         ` (3 more replies)
  3 siblings, 4 replies; 67+ messages in thread
From: shejialuo @ 2025-05-13 11:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

Hi All:

As discussed in [1], we need to use mmap mechanism to open large
"packed_refs" file to save the memory usage. This patch mainly does the
following things:

1: Fix an issue that we would report an error when the "packed-refs"
file is empty, which does not align with the runtime behavior.
2-4: Extract some logic from the existing code and then use these
created helper functions to let fsck code to use mmap necessarily

[1] https://lore.kernel.org/git/20250503133158.GA4450@coredump.intra.peff.net

Really thank Peff and Patrick to suggest me to do above change.

---

Change in v2:

1. Update the commit message of [PATCH 1/4]. And use redirection to
create an empty file instead of using `touch`.
2. Don't use if for the refactored function in [PATCH 3/4] and then
update the commit message to align with the new function name.
3. Enhance the commit message of [PATCH 4/4].

---

Change in v3:

1. Drop the patch which creates a new function
"munmap_temporary_snapshot". As discussed, there is no need to munmap
the file during fsck.
2. Allocate snapshot variable in the stack instead of heap.
3. Fix rebase issue, remove unneeded code to check the file size
explicitly.

---

Change in v4:

1. Report the "emptyPackedRefsFile(INFO)" to the user when the
"packed-refs" is empty instead of ONLY skipping checking the content and
update the shell script and commit message.
2. Apply Peff's advice to make [PATCH v3 2/3] more clear.


Thanks,
Jialuo

shejialuo (3):
  packed-backend: fsck should warn when "packed-refs" file is empty
  packed-backend: extract snapshot allocation in `load_contents`
  packed-backend: mmap large "packed-refs" file during fsck

 Documentation/fsck-msgids.adoc |  6 +++
 fsck.h                         |  1 +
 refs/packed-backend.c          | 73 ++++++++++++++++++++--------------
 t/t0602-reffiles-fsck.sh       | 17 ++++++++
 4 files changed, 67 insertions(+), 30 deletions(-)

Range-diff against v3:
1:  26c3fd55a8 ! 1:  75636c9c85 packed-backend: fsck should allow an empty "packed-refs" file
    @@ Metadata
     Author: shejialuo <shejialuo@gmail.com>
     
      ## Commit message ##
    -    packed-backend: fsck should allow an empty "packed-refs" file
    +    packed-backend: fsck should warn when "packed-refs" file is empty
     
         During fsck, an empty "packed-refs" gives an error; this is unwarranted.
    -    We should just skip checking the content of "packed-refs" just like the
    -    runtime code paths such as "create_snapshot" which simply returns the
    -    "snapshot" without checking the content of "packed-refs".
    +    The runtime code paths would accept an empty "packed-refs" file, such as
    +    "create_snapshot" would simply return the "snapshot" without checking
    +    the content of "packed-refs".
    +
    +    And we should also skip checking the content of "packed-refs" when it is
    +    empty during fsck. However, we should think about whether we need to
    +    report something to the users in this case.
    +
    +    After 694b7a1999 (repack_without_ref(): write peeled refs in the
    +    rewritten file, 2013-04-22), we would always write a header into the
    +    "packed-refs" file where we would never create empty file since then.
    +    Because we only create empty "packed-refs" in the very early versions,
    +    we may tighten this rule in the future. In order to notify the users
    +    about this, we should at least report an warning to the users.
    +
    +    But we need to consider the fsck message type carefully, it is not
    +    appropriate that we use "FSCK_ERROR". This is because we would
    +    definitely break the compatibility. Let's create a "FSCK_INFO" message
    +    id EMPTY_PACKED_REFS_FILE" to indicate that "packed-refs" is empty.
     
         Signed-off-by: shejialuo <shejialuo@gmail.com>
     
    + ## Documentation/fsck-msgids.adoc ##
    +@@
    + `emptyName`::
    + 	(WARN) A path contains an empty name.
    + 
    ++`emptyPackedRefsFile`::
    ++	(INFO) "packed-refs" file is empty. Report to the
    ++	git@vger.kernel.org mailing list if you see this error. As only
    ++	very early versions of Git would create such an empty
    ++	"packed_refs" file, we might tighten this rule in the future.
    ++
    + `extraHeaderEntry`::
    + 	(IGNORE) Extra headers found after `tagger`.
    + 
    +
    + ## fsck.h ##
    +@@ fsck.h: enum fsck_msg_type {
    + 	FUNC(LARGE_PATHNAME, WARN) \
    + 	/* infos (reported as warnings, but ignored by default) */ \
    + 	FUNC(BAD_FILEMODE, INFO) \
    ++	FUNC(EMPTY_PACKED_REFS_FILE, INFO) \
    + 	FUNC(GITMODULES_PARSE, INFO) \
    + 	FUNC(GITIGNORE_SYMLINK, INFO) \
    + 	FUNC(GITATTRIBUTES_SYMLINK, INFO) \
    +
      ## refs/packed-backend.c ##
     @@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
      		goto cleanup;
      	}
      
    -+	if (!st.st_size)
    ++	if (!st.st_size) {
    ++		struct fsck_ref_report report = { 0 };
    ++		report.path = "packed-refs";
    ++		ret = fsck_report_ref(o, &report,
    ++				      FSCK_MSG_EMPTY_PACKED_REFS_FILE,
    ++				      "file is empty");
     +		goto cleanup;
    ++	}
     +
      	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
      		ret = error_errno(_("unable to read '%s'"), refs->path);
    @@ t/t0602-reffiles-fsck.sh: test_expect_success SYMLINKS 'the filetype of packed-r
      	)
      '
      
    -+test_expect_success 'empty packed-refs should not be reported' '
    ++test_expect_success 'empty packed-refs should be reported' '
     +	test_when_finished "rm -rf repo" &&
     +	git init repo &&
     +	(
    @@ t/t0602-reffiles-fsck.sh: test_expect_success SYMLINKS 'the filetype of packed-r
     +
     +		>.git/packed-refs &&
     +		git refs verify 2>err &&
    -+		test_must_be_empty err
    ++		cat >expect <<-EOF &&
    ++		warning: packed-refs: emptyPackedRefsFile: file is empty
    ++		EOF
    ++		rm .git/packed-refs &&
    ++		test_cmp expect err
     +	)
     +'
     +
2:  4604be8b51 ! 2:  1a5893379d packed-backend: extract snapshot allocation in `load_contents`
    @@ refs/packed-backend.c: static int refname_contains_nul(struct strbuf *refname)
      	struct stat st;
     -	size_t size;
     -	ssize_t bytes_read;
    -+	int ret = 1;
    ++	int ret;
     +	int fd;
      
      	fd = open(snapshot->refs->path, O_RDONLY);
    @@ refs/packed-backend.c: static int load_contents(struct snapshot *snapshot)
      
     -	snapshot->start = snapshot->buf;
     -	snapshot->eof = snapshot->buf + size;
    -+	if (!allocate_snapshot_buffer(snapshot, fd, &st))
    -+		ret = 0;
    ++	ret = allocate_snapshot_buffer(snapshot, fd, &st);
      
     -	return 1;
     +	close(fd);
3:  82e19a65c6 ! 3:  31e272db7e packed-backend: mmap large "packed-refs" file during fsck
    @@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
      		goto cleanup;
      	}
      
    --	if (!st.st_size)
    -+	if (!allocate_snapshot_buffer(&snapshot, fd, &st))
    +-	if (!st.st_size) {
    ++	if (!allocate_snapshot_buffer(&snapshot, fd, &st)) {
    + 		struct fsck_ref_report report = { 0 };
    + 		report.path = "packed-refs";
    + 		ret = fsck_report_ref(o, &report,
    +@@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
      		goto cleanup;
    + 	}
      
     -	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
     -		ret = error_errno(_("unable to read '%s'"), refs->path);
-- 
2.49.0


^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v4 1/3] packed-backend: fsck should warn when "packed-refs" file is empty
  2025-05-13 11:06     ` [PATCH v4 0/3] align the behavior when opening "packed-refs" shejialuo
@ 2025-05-13 11:07       ` shejialuo
  2025-05-13 16:30         ` Junio C Hamano
  2025-05-13 11:07       ` [PATCH v4 2/3] packed-backend: extract snapshot allocation in `load_contents` shejialuo
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 67+ messages in thread
From: shejialuo @ 2025-05-13 11:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

During fsck, an empty "packed-refs" gives an error; this is unwarranted.
The runtime code paths would accept an empty "packed-refs" file, such as
"create_snapshot" would simply return the "snapshot" without checking
the content of "packed-refs".

And we should also skip checking the content of "packed-refs" when it is
empty during fsck. However, we should think about whether we need to
report something to the users in this case.

After 694b7a1999 (repack_without_ref(): write peeled refs in the
rewritten file, 2013-04-22), we would always write a header into the
"packed-refs" file where we would never create empty file since then.
Because we only create empty "packed-refs" in the very early versions,
we may tighten this rule in the future. In order to notify the users
about this, we should at least report an warning to the users.

But we need to consider the fsck message type carefully, it is not
appropriate that we use "FSCK_ERROR". This is because we would
definitely break the compatibility. Let's create a "FSCK_INFO" message
id EMPTY_PACKED_REFS_FILE" to indicate that "packed-refs" is empty.

Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 Documentation/fsck-msgids.adoc |  6 ++++++
 fsck.h                         |  1 +
 refs/packed-backend.c          |  9 +++++++++
 t/t0602-reffiles-fsck.sh       | 17 +++++++++++++++++
 4 files changed, 33 insertions(+)

diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
index 9601fff228..0ba4f9a27e 100644
--- a/Documentation/fsck-msgids.adoc
+++ b/Documentation/fsck-msgids.adoc
@@ -59,6 +59,12 @@
 `emptyName`::
 	(WARN) A path contains an empty name.
 
+`emptyPackedRefsFile`::
+	(INFO) "packed-refs" file is empty. Report to the
+	git@vger.kernel.org mailing list if you see this error. As only
+	very early versions of Git would create such an empty
+	"packed_refs" file, we might tighten this rule in the future.
+
 `extraHeaderEntry`::
 	(IGNORE) Extra headers found after `tagger`.
 
diff --git a/fsck.h b/fsck.h
index b1deae61ee..0c5869ac34 100644
--- a/fsck.h
+++ b/fsck.h
@@ -84,6 +84,7 @@ enum fsck_msg_type {
 	FUNC(LARGE_PATHNAME, WARN) \
 	/* infos (reported as warnings, but ignored by default) */ \
 	FUNC(BAD_FILEMODE, INFO) \
+	FUNC(EMPTY_PACKED_REFS_FILE, INFO) \
 	FUNC(GITMODULES_PARSE, INFO) \
 	FUNC(GITIGNORE_SYMLINK, INFO) \
 	FUNC(GITATTRIBUTES_SYMLINK, INFO) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 3ad1ed0787..fb91833e76 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -2103,6 +2103,15 @@ static int packed_fsck(struct ref_store *ref_store,
 		goto cleanup;
 	}
 
+	if (!st.st_size) {
+		struct fsck_ref_report report = { 0 };
+		report.path = "packed-refs";
+		ret = fsck_report_ref(o, &report,
+				      FSCK_MSG_EMPTY_PACKED_REFS_FILE,
+				      "file is empty");
+		goto cleanup;
+	}
+
 	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
 		ret = error_errno(_("unable to read '%s'"), refs->path);
 		goto cleanup;
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 9d1dc2144c..f671ac4d3a 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -647,6 +647,23 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
 	)
 '
 
+test_expect_success 'empty packed-refs should be reported' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit default &&
+
+		>.git/packed-refs &&
+		git refs verify 2>err &&
+		cat >expect <<-EOF &&
+		warning: packed-refs: emptyPackedRefsFile: file is empty
+		EOF
+		rm .git/packed-refs &&
+		test_cmp expect err
+	)
+'
+
 test_expect_success 'packed-refs header should be checked' '
 	test_when_finished "rm -rf repo" &&
 	git init repo &&
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v4 2/3] packed-backend: extract snapshot allocation in `load_contents`
  2025-05-13 11:06     ` [PATCH v4 0/3] align the behavior when opening "packed-refs" shejialuo
  2025-05-13 11:07       ` [PATCH v4 1/3] packed-backend: fsck should warn when "packed-refs" file is empty shejialuo
@ 2025-05-13 11:07       ` shejialuo
  2025-05-13 11:07       ` [PATCH v4 3/3] packed-backend: mmap large "packed-refs" file during fsck shejialuo
  2025-05-14 15:48       ` [PATCH v5 0/3] align the behavior when opening "packed-refs" shejialuo
  3 siblings, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-13 11:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

"load_contents" would choose which way to load the content of the
"packed-refs". However, we cannot directly use this function when
checking the consistency due to we don't want to open the file. And we
also need to reuse the logic to avoid causing repetition.

Let's create a new helper function "allocate_snapshot_buffer" to extract
the snapshot allocation logic in "load_contents" and update the
"load_contents" to align with the behavior.

Suggested-by: Jeff King <peff@peff.net>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c | 53 +++++++++++++++++++++++++------------------
 1 file changed, 31 insertions(+), 22 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index fb91833e76..1da44a3d6d 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -517,6 +517,32 @@ static int refname_contains_nul(struct strbuf *refname)
 
 #define SMALL_FILE_SIZE (32*1024)
 
+static int allocate_snapshot_buffer(struct snapshot *snapshot, int fd, struct stat *st)
+{
+	ssize_t bytes_read;
+	size_t size;
+
+	size = xsize_t(st->st_size);
+	if (!size)
+		return 0;
+
+	if (mmap_strategy == MMAP_NONE || size <= SMALL_FILE_SIZE) {
+		snapshot->buf = xmalloc(size);
+		bytes_read = read_in_full(fd, snapshot->buf, size);
+		if (bytes_read < 0 || bytes_read != size)
+			die_errno("couldn't read %s", snapshot->refs->path);
+		snapshot->mmapped = 0;
+	} else {
+		snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
+		snapshot->mmapped = 1;
+	}
+
+	snapshot->start = snapshot->buf;
+	snapshot->eof = snapshot->buf + size;
+
+	return 1;
+}
+
 /*
  * Depending on `mmap_strategy`, either mmap or read the contents of
  * the `packed-refs` file into the snapshot. Return 1 if the file
@@ -525,10 +551,9 @@ static int refname_contains_nul(struct strbuf *refname)
  */
 static int load_contents(struct snapshot *snapshot)
 {
-	int fd;
 	struct stat st;
-	size_t size;
-	ssize_t bytes_read;
+	int ret;
+	int fd;
 
 	fd = open(snapshot->refs->path, O_RDONLY);
 	if (fd < 0) {
@@ -550,27 +575,11 @@ static int load_contents(struct snapshot *snapshot)
 
 	if (fstat(fd, &st) < 0)
 		die_errno("couldn't stat %s", snapshot->refs->path);
-	size = xsize_t(st.st_size);
-
-	if (!size) {
-		close(fd);
-		return 0;
-	} else if (mmap_strategy == MMAP_NONE || size <= SMALL_FILE_SIZE) {
-		snapshot->buf = xmalloc(size);
-		bytes_read = read_in_full(fd, snapshot->buf, size);
-		if (bytes_read < 0 || bytes_read != size)
-			die_errno("couldn't read %s", snapshot->refs->path);
-		snapshot->mmapped = 0;
-	} else {
-		snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
-		snapshot->mmapped = 1;
-	}
-	close(fd);
 
-	snapshot->start = snapshot->buf;
-	snapshot->eof = snapshot->buf + size;
+	ret = allocate_snapshot_buffer(snapshot, fd, &st);
 
-	return 1;
+	close(fd);
+	return ret;
 }
 
 static const char *find_reference_location_1(struct snapshot *snapshot,
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v4 3/3] packed-backend: mmap large "packed-refs" file during fsck
  2025-05-13 11:06     ` [PATCH v4 0/3] align the behavior when opening "packed-refs" shejialuo
  2025-05-13 11:07       ` [PATCH v4 1/3] packed-backend: fsck should warn when "packed-refs" file is empty shejialuo
  2025-05-13 11:07       ` [PATCH v4 2/3] packed-backend: extract snapshot allocation in `load_contents` shejialuo
@ 2025-05-13 11:07       ` shejialuo
  2025-05-13 16:51         ` Junio C Hamano
  2025-05-14 15:48       ` [PATCH v5 0/3] align the behavior when opening "packed-refs" shejialuo
  3 siblings, 1 reply; 67+ messages in thread
From: shejialuo @ 2025-05-13 11:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

During fsck, we use "strbuf_read" to read the content of "packed-refs"
without using mmap mechanism. This is a bad practice which would consume
more memory than using mmap mechanism. Besides, as all code paths in
"packed-backend.c" use this way, we should make "fsck" align with the
current codebase.

As we have introduced the helper function "allocate_snapshot_buffer", we
could simple use this function to use mmap mechanism.

Suggested-by: Jeff King <peff@peff.net>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 1da44a3d6d..7fd73a0e6d 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -2068,7 +2068,7 @@ static int packed_fsck(struct ref_store *ref_store,
 {
 	struct packed_ref_store *refs = packed_downcast(ref_store,
 							REF_STORE_READ, "fsck");
-	struct strbuf packed_ref_content = STRBUF_INIT;
+	struct snapshot snapshot = { 0 };
 	unsigned int sorted = 0;
 	struct stat st;
 	int ret = 0;
@@ -2112,7 +2112,7 @@ static int packed_fsck(struct ref_store *ref_store,
 		goto cleanup;
 	}
 
-	if (!st.st_size) {
+	if (!allocate_snapshot_buffer(&snapshot, fd, &st)) {
 		struct fsck_ref_report report = { 0 };
 		report.path = "packed-refs";
 		ret = fsck_report_ref(o, &report,
@@ -2121,21 +2121,16 @@ static int packed_fsck(struct ref_store *ref_store,
 		goto cleanup;
 	}
 
-	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
-		ret = error_errno(_("unable to read '%s'"), refs->path);
-		goto cleanup;
-	}
-
-	ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
-				      packed_ref_content.buf + packed_ref_content.len);
+	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot.start,
+				      snapshot.eof);
 	if (!ret && sorted)
-		ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
-					     packed_ref_content.buf + packed_ref_content.len);
+		ret = packed_fsck_ref_sorted(o, ref_store, snapshot.start,
+					     snapshot.eof);
 
 cleanup:
 	if (fd >= 0)
 		close(fd);
-	strbuf_release(&packed_ref_content);
+	clear_snapshot_buffer(&snapshot);
 	return ret;
 }
 
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH v4 1/3] packed-backend: fsck should warn when "packed-refs" file is empty
  2025-05-13 11:07       ` [PATCH v4 1/3] packed-backend: fsck should warn when "packed-refs" file is empty shejialuo
@ 2025-05-13 16:30         ` Junio C Hamano
  2025-05-14 12:51           ` shejialuo
  0 siblings, 1 reply; 67+ messages in thread
From: Junio C Hamano @ 2025-05-13 16:30 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Jeff King, Patrick Steinhardt

shejialuo <shejialuo@gmail.com> writes:

> During fsck, an empty "packed-refs" gives an error; this is unwarranted.
> The runtime code paths would accept an empty "packed-refs" file, such as
> "create_snapshot" would simply return the "snapshot" without checking
> the content of "packed-refs".

Perhaps "unwarranted" is now too strong a word; we still want to
consider it an anomaly (that is why emptyPackedRefsFile warning is
introduced after all). I think the problem description you want to
here in the above pragraph is that fsck giving an error and runtime
completely silent is inconsistent.  

    Side note: and you'd probably want to say what "an error"
    reported here is.  The problem, if I understand correctly, is
    that the code assumes the file won't be empty and instead has at
    least one line in it (even when there are no refs packed, there
    is the file header line) and insists that all lines must be well
    terminated---if we tolerate an empty file, of course such a
    check will fail, as there is no terminating LF in a file with 0
    lines in it.

And because versions of Git that are not too ancient never wrote an
empty packed-refs file, and often having an empty file there is/was
a sign of a filesystem-level issue, the way we want resolve this
inconsistency is not make everybody totally silent but notice and
report the anomaly.

> But we need to consider the fsck message type carefully, it is not
> appropriate that we use "FSCK_ERROR". This is because we would
> definitely break the compatibility. Let's create a "FSCK_INFO" message
> id EMPTY_PACKED_REFS_FILE" to indicate that "packed-refs" is empty.

OK.

> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
>  Documentation/fsck-msgids.adoc |  6 ++++++
>  fsck.h                         |  1 +
>  refs/packed-backend.c          |  9 +++++++++
>  t/t0602-reffiles-fsck.sh       | 17 +++++++++++++++++
>  4 files changed, 33 insertions(+)
>
> diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
> index 9601fff228..0ba4f9a27e 100644
> --- a/Documentation/fsck-msgids.adoc
> +++ b/Documentation/fsck-msgids.adoc
> @@ -59,6 +59,12 @@
>  `emptyName`::
>  	(WARN) A path contains an empty name.
>  
> +`emptyPackedRefsFile`::
> +	(INFO) "packed-refs" file is empty. Report to the
> +	git@vger.kernel.org mailing list if you see this error. As only
> +	very early versions of Git would create such an empty
> +	"packed_refs" file, we might tighten this rule in the future.

I am not too happy to see "Report to ..." and everything after that
here, primarily because it takes one extra step for the user to find
it out when they see such an informational message.  There are other
existing error classes, like refMissingNewline, etc., that have the
same problem.  One thing to make it easier for the users to report
is to put it in the error/info messages themselves, but I think it
is OK to make such a clean-up (including the existing offenders)
after the dust settles from this topic.

> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 3ad1ed0787..fb91833e76 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -2103,6 +2103,15 @@ static int packed_fsck(struct ref_store *ref_store,
>  		goto cleanup;
>  	}
>  
> +	if (!st.st_size) {
> +		struct fsck_ref_report report = { 0 };
> +		report.path = "packed-refs";
> +		ret = fsck_report_ref(o, &report,
> +				      FSCK_MSG_EMPTY_PACKED_REFS_FILE,
> +				      "file is empty");
> +		goto cleanup;
> +	}

OK.

> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index 9d1dc2144c..f671ac4d3a 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -647,6 +647,23 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
>  	)
>  '
>  
> +test_expect_success 'empty packed-refs should be reported' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +		test_commit default &&
> +
> +		>.git/packed-refs &&
> +		git refs verify 2>err &&
> +		cat >expect <<-EOF &&
> +		warning: packed-refs: emptyPackedRefsFile: file is empty
> +		EOF
> +		rm .git/packed-refs &&
> +		test_cmp expect err
> +	)
> +'
> +
>  test_expect_success 'packed-refs header should be checked' '
>  	test_when_finished "rm -rf repo" &&
>  	git init repo &&

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v4 3/3] packed-backend: mmap large "packed-refs" file during fsck
  2025-05-13 11:07       ` [PATCH v4 3/3] packed-backend: mmap large "packed-refs" file during fsck shejialuo
@ 2025-05-13 16:51         ` Junio C Hamano
  2025-05-14 13:05           ` shejialuo
  0 siblings, 1 reply; 67+ messages in thread
From: Junio C Hamano @ 2025-05-13 16:51 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Jeff King, Patrick Steinhardt

shejialuo <shejialuo@gmail.com> writes:

> During fsck, we use "strbuf_read" to read the content of "packed-refs"
> without using mmap mechanism. This is a bad practice which would consume
> more memory than using mmap mechanism. Besides, as all code paths in
> "packed-backend.c" use this way, we should make "fsck" align with the
> current codebase.
>
> As we have introduced the helper function "allocate_snapshot_buffer", we
> could simple use this function to use mmap mechanism.

"could simple" -> "can simply".

> Suggested-by: Jeff King <peff@peff.net>
> Suggested-by: Patrick Steinhardt <ps@pks.im>
> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
>  refs/packed-backend.c | 19 +++++++------------
>  1 file changed, 7 insertions(+), 12 deletions(-)

Nice loss of line count ;-)

> -	if (!st.st_size) {
> +	if (!allocate_snapshot_buffer(&snapshot, fd, &st)) {

It is a bit funny to see that a helper function that works at a much
higher conceptual level treat an empty file so specially (namely,
should it be different from a header-only packed-refs file?).  If I
were doing this refactoring in 2 & 3, I would probalby have made the
helper return "void", and have callers who do care about st.st_size
check that themselves.

But I'll let it pass.

> @@ -2121,21 +2121,16 @@ static int packed_fsck(struct ref_store *ref_store,
>  		goto cleanup;
>  	}
>  
> -	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
> -		ret = error_errno(_("unable to read '%s'"), refs->path);
> -		goto cleanup;
> -	}
> -
> -	ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
> -				      packed_ref_content.buf + packed_ref_content.len);
> +	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot.start,
> +				      snapshot.eof);
>  	if (!ret && sorted)
> -		ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
> -					     packed_ref_content.buf + packed_ref_content.len);
> +		ret = packed_fsck_ref_sorted(o, ref_store, snapshot.start,
> +					     snapshot.eof);
>  
>  cleanup:
>  	if (fd >= 0)
>  		close(fd);
> -	strbuf_release(&packed_ref_content);
> +	clear_snapshot_buffer(&snapshot);
>  	return ret;
>  }

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v4 1/3] packed-backend: fsck should warn when "packed-refs" file is empty
  2025-05-13 16:30         ` Junio C Hamano
@ 2025-05-14 12:51           ` shejialuo
  0 siblings, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-14 12:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Patrick Steinhardt

On Tue, May 13, 2025 at 09:30:57AM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
> 
> > During fsck, an empty "packed-refs" gives an error; this is unwarranted.
> > The runtime code paths would accept an empty "packed-refs" file, such as
> > "create_snapshot" would simply return the "snapshot" without checking
> > the content of "packed-refs".
> 
> Perhaps "unwarranted" is now too strong a word; we still want to
> consider it an anomaly (that is why emptyPackedRefsFile warning is
> introduced after all). I think the problem description you want to
> here in the above pragraph is that fsck giving an error and runtime
> completely silent is inconsistent.  
> 
>     Side note: and you'd probably want to say what "an error"
>     reported here is.  The problem, if I understand correctly, is
>     that the code assumes the file won't be empty and instead has at
>     least one line in it (even when there are no refs packed, there
>     is the file header line) and insists that all lines must be well
>     terminated---if we tolerate an empty file, of course such a
>     check will fail, as there is no terminating LF in a file with 0
>     lines in it.
> 

Good idea, I will improve the commit message.

> And because versions of Git that are not too ancient never wrote an
> empty packed-refs file, and often having an empty file there is/was
> a sign of a filesystem-level issue, the way we want resolve this
> inconsistency is not make everybody totally silent but notice and
> report the anomaly.
> 

That's right. I should talk about this problem.

> > But we need to consider the fsck message type carefully, it is not
> > appropriate that we use "FSCK_ERROR". This is because we would
> > definitely break the compatibility. Let's create a "FSCK_INFO" message
> > id EMPTY_PACKED_REFS_FILE" to indicate that "packed-refs" is empty.
> 
> OK.
> 
> > Signed-off-by: shejialuo <shejialuo@gmail.com>
> > ---
> >  Documentation/fsck-msgids.adoc |  6 ++++++
> >  fsck.h                         |  1 +
> >  refs/packed-backend.c          |  9 +++++++++
> >  t/t0602-reffiles-fsck.sh       | 17 +++++++++++++++++
> >  4 files changed, 33 insertions(+)
> >
> > diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
> > index 9601fff228..0ba4f9a27e 100644
> > --- a/Documentation/fsck-msgids.adoc
> > +++ b/Documentation/fsck-msgids.adoc
> > @@ -59,6 +59,12 @@
> >  `emptyName`::
> >  	(WARN) A path contains an empty name.
> >  
> > +`emptyPackedRefsFile`::
> > +	(INFO) "packed-refs" file is empty. Report to the
> > +	git@vger.kernel.org mailing list if you see this error. As only
> > +	very early versions of Git would create such an empty
> > +	"packed_refs" file, we might tighten this rule in the future.
> 
> I am not too happy to see "Report to ..." and everything after that
> here, primarily because it takes one extra step for the user to find
> it out when they see such an informational message.  There are other
> existing error classes, like refMissingNewline, etc., that have the
> same problem.  One thing to make it easier for the users to report
> is to put it in the error/info messages themselves, but I think it
> is OK to make such a clean-up (including the existing offenders)
> after the dust settles from this topic.
> 

That's right. Actually we introduce redirection here. I think I will
improve this in the next release cycle. I'll add this into my TODO list.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v4 3/3] packed-backend: mmap large "packed-refs" file during fsck
  2025-05-13 16:51         ` Junio C Hamano
@ 2025-05-14 13:05           ` shejialuo
  0 siblings, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-14 13:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Patrick Steinhardt

On Tue, May 13, 2025 at 09:51:20AM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
> 
> > During fsck, we use "strbuf_read" to read the content of "packed-refs"
> > without using mmap mechanism. This is a bad practice which would consume
> > more memory than using mmap mechanism. Besides, as all code paths in
> > "packed-backend.c" use this way, we should make "fsck" align with the
> > current codebase.
> >
> > As we have introduced the helper function "allocate_snapshot_buffer", we
> > could simple use this function to use mmap mechanism.
> 
> "could simple" -> "can simply".
> 

Nice catch.

> > Suggested-by: Jeff King <peff@peff.net>
> > Suggested-by: Patrick Steinhardt <ps@pks.im>
> > Signed-off-by: shejialuo <shejialuo@gmail.com>
> > ---
> >  refs/packed-backend.c | 19 +++++++------------
> >  1 file changed, 7 insertions(+), 12 deletions(-)
> 
> Nice loss of line count ;-)
> 
> > -	if (!st.st_size) {
> > +	if (!allocate_snapshot_buffer(&snapshot, fd, &st)) {
> 
> It is a bit funny to see that a helper function that works at a much
> higher conceptual level treat an empty file so specially (namely,
> should it be different from a header-only packed-refs file?).  If I
> were doing this refactoring in 2 & 3, I would probalby have made the
> helper return "void", and have callers who do care about st.st_size
> check that themselves.
> 

This is a good question. Actually, I intentionally design this. In the
very first implementation, I use above way. But later I think this is
inappropriate to let callers check `st.st_size`.

The reason is that we would do the same thing in "load_content" function
and fsck function. But after thinking, I think we should make the
runtime behavior and fsck consistent. That's our goal. And by using
this, let's say if one day we want to tighten the rule in the future, we
could just change `allocate_snapshot_buffer`.

I agree that introducing side effect into a function is not a good
design. But in this situation, we could make things consistent.

Thanks,
Jialuo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v5 0/3] align the behavior when opening "packed-refs"
  2025-05-13 11:06     ` [PATCH v4 0/3] align the behavior when opening "packed-refs" shejialuo
                         ` (2 preceding siblings ...)
  2025-05-13 11:07       ` [PATCH v4 3/3] packed-backend: mmap large "packed-refs" file during fsck shejialuo
@ 2025-05-14 15:48       ` shejialuo
  2025-05-14 15:50         ` [PATCH v5 1/3] packed-backend: fsck should warn when "packed-refs" file is empty shejialuo
                           ` (4 more replies)
  3 siblings, 5 replies; 67+ messages in thread
From: shejialuo @ 2025-05-14 15:48 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

Hi All:

As discussed in [1], we need to use mmap mechanism to open large
"packed_refs" file to save the memory usage. This patch mainly does the
following things:

1: Fix an issue that we would report an error when the "packed-refs"
file is empty, which does not align with the runtime behavior.
2-4: Extract some logic from the existing code and then use these
created helper functions to let fsck code to use mmap necessarily

[1] https://lore.kernel.org/git/20250503133158.GA4450@coredump.intra.peff.net

Really thank Peff and Patrick to suggest me to do above change.

---

Change in v2:

1. Update the commit message of [PATCH 1/4]. And use redirection to
create an empty file instead of using `touch`.
2. Don't use if for the refactored function in [PATCH 3/4] and then
update the commit message to align with the new function name.
3. Enhance the commit message of [PATCH 4/4].

---

Change in v3:

1. Drop the patch which creates a new function
"munmap_temporary_snapshot". As discussed, there is no need to munmap
the file during fsck.
2. Allocate snapshot variable in the stack instead of heap.
3. Fix rebase issue, remove unneeded code to check the file size
explicitly.

---

Change in v4:

1. Report the "emptyPackedRefsFile(INFO)" to the user when the
"packed-refs" is empty instead of ONLY skipping checking the content and
update the shell script and commit message.
2. Apply Peff's advice to make [PATCH v3 2/3] more clear.

---

Change in v5:

1. Improve the commit message in the first patch to be more clear:
    1. Talk about the current behavior, what error we would report if
       "packed-refs" is empty.
    2. To align with the runtime behavior, we should skip checking the
       content of "packed-refs".
    3. Why do we need to report to the user when the "packed-refs" is
       empty
2. Fix grammar issue in the last patch.

Thanks,
Jialuo

shejialuo (3):
  packed-backend: fsck should warn when "packed-refs" file is empty
  packed-backend: extract snapshot allocation in `load_contents`
  packed-backend: mmap large "packed-refs" file during fsck

 Documentation/fsck-msgids.adoc |  6 +++
 fsck.h                         |  1 +
 refs/packed-backend.c          | 73 ++++++++++++++++++++--------------
 t/t0602-reffiles-fsck.sh       | 17 ++++++++
 4 files changed, 67 insertions(+), 30 deletions(-)

Range-diff against v4:
1:  75636c9c85 ! 1:  3487692a03 packed-backend: fsck should warn when "packed-refs" file is empty
    @@ Metadata
      ## Commit message ##
         packed-backend: fsck should warn when "packed-refs" file is empty
     
    -    During fsck, an empty "packed-refs" gives an error; this is unwarranted.
    -    The runtime code paths would accept an empty "packed-refs" file, such as
    -    "create_snapshot" would simply return the "snapshot" without checking
    -    the content of "packed-refs".
    +    We assume the "packed-refs" won't be empty and instead has at least one
    +    line in it (even when there are no refs packed, there is the file header
    +    line). Because there is no terminating LF in the empty file, we will
    +    report "packedRefEntryNotTerminated(ERROR)" to the user.
     
    -    And we should also skip checking the content of "packed-refs" when it is
    -    empty during fsck. However, we should think about whether we need to
    -    report something to the users in this case.
    +    However, the runtime code paths would accept an empty "packed-refs"
    +    file, for example, "create_snapshot" would simply return the "snapshot"
    +    without checking the content of "packed-refs". So, we should skip
    +    checking the content of "packed-refs" when it is empty during fsck.
     
         After 694b7a1999 (repack_without_ref(): write peeled refs in the
         rewritten file, 2013-04-22), we would always write a header into the
    -    "packed-refs" file where we would never create empty file since then.
    -    Because we only create empty "packed-refs" in the very early versions,
    -    we may tighten this rule in the future. In order to notify the users
    -    about this, we should at least report an warning to the users.
    +    "packed-refs" file. So, versions of Git that are not too ancient never
    +    write such an empty "packed-refs" file.
     
    -    But we need to consider the fsck message type carefully, it is not
    -    appropriate that we use "FSCK_ERROR". This is because we would
    -    definitely break the compatibility. Let's create a "FSCK_INFO" message
    -    id EMPTY_PACKED_REFS_FILE" to indicate that "packed-refs" is empty.
    +    As an empty file often indicates a sign of a filesystem-level issue, the
    +    way we want to resolve this inconsistency is not make everybody totally
    +    silent but notice and report the anomaly.
    +
    +    Let's create a "FSCK_INFO" message id "EMPTY_PACKED_REFS_FILE" to report
    +    to the users that "packed-refs" is empty.
     
         Signed-off-by: shejialuo <shejialuo@gmail.com>
     
2:  1a5893379d = 2:  0d050849bc packed-backend: extract snapshot allocation in `load_contents`
3:  31e272db7e ! 3:  fe5ffec8fb packed-backend: mmap large "packed-refs" file during fsck
    @@ Commit message
         current codebase.
     
         As we have introduced the helper function "allocate_snapshot_buffer", we
    -    could simple use this function to use mmap mechanism.
    +    can simply use this function to use mmap mechanism.
     
         Suggested-by: Jeff King <peff@peff.net>
         Suggested-by: Patrick Steinhardt <ps@pks.im>
-- 
2.49.0


^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v5 1/3] packed-backend: fsck should warn when "packed-refs" file is empty
  2025-05-14 15:48       ` [PATCH v5 0/3] align the behavior when opening "packed-refs" shejialuo
@ 2025-05-14 15:50         ` shejialuo
  2025-05-14 15:50         ` [PATCH v5 2/3] packed-backend: extract snapshot allocation in `load_contents` shejialuo
                           ` (3 subsequent siblings)
  4 siblings, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-14 15:50 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

We assume the "packed-refs" won't be empty and instead has at least one
line in it (even when there are no refs packed, there is the file header
line). Because there is no terminating LF in the empty file, we will
report "packedRefEntryNotTerminated(ERROR)" to the user.

However, the runtime code paths would accept an empty "packed-refs"
file, for example, "create_snapshot" would simply return the "snapshot"
without checking the content of "packed-refs". So, we should skip
checking the content of "packed-refs" when it is empty during fsck.

After 694b7a1999 (repack_without_ref(): write peeled refs in the
rewritten file, 2013-04-22), we would always write a header into the
"packed-refs" file. So, versions of Git that are not too ancient never
write such an empty "packed-refs" file.

As an empty file often indicates a sign of a filesystem-level issue, the
way we want to resolve this inconsistency is not make everybody totally
silent but notice and report the anomaly.

Let's create a "FSCK_INFO" message id "EMPTY_PACKED_REFS_FILE" to report
to the users that "packed-refs" is empty.

Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 Documentation/fsck-msgids.adoc |  6 ++++++
 fsck.h                         |  1 +
 refs/packed-backend.c          |  9 +++++++++
 t/t0602-reffiles-fsck.sh       | 17 +++++++++++++++++
 4 files changed, 33 insertions(+)

diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc
index 9601fff228..0ba4f9a27e 100644
--- a/Documentation/fsck-msgids.adoc
+++ b/Documentation/fsck-msgids.adoc
@@ -59,6 +59,12 @@
 `emptyName`::
 	(WARN) A path contains an empty name.
 
+`emptyPackedRefsFile`::
+	(INFO) "packed-refs" file is empty. Report to the
+	git@vger.kernel.org mailing list if you see this error. As only
+	very early versions of Git would create such an empty
+	"packed_refs" file, we might tighten this rule in the future.
+
 `extraHeaderEntry`::
 	(IGNORE) Extra headers found after `tagger`.
 
diff --git a/fsck.h b/fsck.h
index b1deae61ee..0c5869ac34 100644
--- a/fsck.h
+++ b/fsck.h
@@ -84,6 +84,7 @@ enum fsck_msg_type {
 	FUNC(LARGE_PATHNAME, WARN) \
 	/* infos (reported as warnings, but ignored by default) */ \
 	FUNC(BAD_FILEMODE, INFO) \
+	FUNC(EMPTY_PACKED_REFS_FILE, INFO) \
 	FUNC(GITMODULES_PARSE, INFO) \
 	FUNC(GITIGNORE_SYMLINK, INFO) \
 	FUNC(GITATTRIBUTES_SYMLINK, INFO) \
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 3ad1ed0787..fb91833e76 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -2103,6 +2103,15 @@ static int packed_fsck(struct ref_store *ref_store,
 		goto cleanup;
 	}
 
+	if (!st.st_size) {
+		struct fsck_ref_report report = { 0 };
+		report.path = "packed-refs";
+		ret = fsck_report_ref(o, &report,
+				      FSCK_MSG_EMPTY_PACKED_REFS_FILE,
+				      "file is empty");
+		goto cleanup;
+	}
+
 	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
 		ret = error_errno(_("unable to read '%s'"), refs->path);
 		goto cleanup;
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 9d1dc2144c..f671ac4d3a 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -647,6 +647,23 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
 	)
 '
 
+test_expect_success 'empty packed-refs should be reported' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit default &&
+
+		>.git/packed-refs &&
+		git refs verify 2>err &&
+		cat >expect <<-EOF &&
+		warning: packed-refs: emptyPackedRefsFile: file is empty
+		EOF
+		rm .git/packed-refs &&
+		test_cmp expect err
+	)
+'
+
 test_expect_success 'packed-refs header should be checked' '
 	test_when_finished "rm -rf repo" &&
 	git init repo &&
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v5 2/3] packed-backend: extract snapshot allocation in `load_contents`
  2025-05-14 15:48       ` [PATCH v5 0/3] align the behavior when opening "packed-refs" shejialuo
  2025-05-14 15:50         ` [PATCH v5 1/3] packed-backend: fsck should warn when "packed-refs" file is empty shejialuo
@ 2025-05-14 15:50         ` shejialuo
  2025-05-14 15:50         ` [PATCH v5 3/3] packed-backend: mmap large "packed-refs" file during fsck shejialuo
                           ` (2 subsequent siblings)
  4 siblings, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-14 15:50 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

"load_contents" would choose which way to load the content of the
"packed-refs". However, we cannot directly use this function when
checking the consistency due to we don't want to open the file. And we
also need to reuse the logic to avoid causing repetition.

Let's create a new helper function "allocate_snapshot_buffer" to extract
the snapshot allocation logic in "load_contents" and update the
"load_contents" to align with the behavior.

Suggested-by: Jeff King <peff@peff.net>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c | 53 +++++++++++++++++++++++++------------------
 1 file changed, 31 insertions(+), 22 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index fb91833e76..1da44a3d6d 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -517,6 +517,32 @@ static int refname_contains_nul(struct strbuf *refname)
 
 #define SMALL_FILE_SIZE (32*1024)
 
+static int allocate_snapshot_buffer(struct snapshot *snapshot, int fd, struct stat *st)
+{
+	ssize_t bytes_read;
+	size_t size;
+
+	size = xsize_t(st->st_size);
+	if (!size)
+		return 0;
+
+	if (mmap_strategy == MMAP_NONE || size <= SMALL_FILE_SIZE) {
+		snapshot->buf = xmalloc(size);
+		bytes_read = read_in_full(fd, snapshot->buf, size);
+		if (bytes_read < 0 || bytes_read != size)
+			die_errno("couldn't read %s", snapshot->refs->path);
+		snapshot->mmapped = 0;
+	} else {
+		snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
+		snapshot->mmapped = 1;
+	}
+
+	snapshot->start = snapshot->buf;
+	snapshot->eof = snapshot->buf + size;
+
+	return 1;
+}
+
 /*
  * Depending on `mmap_strategy`, either mmap or read the contents of
  * the `packed-refs` file into the snapshot. Return 1 if the file
@@ -525,10 +551,9 @@ static int refname_contains_nul(struct strbuf *refname)
  */
 static int load_contents(struct snapshot *snapshot)
 {
-	int fd;
 	struct stat st;
-	size_t size;
-	ssize_t bytes_read;
+	int ret;
+	int fd;
 
 	fd = open(snapshot->refs->path, O_RDONLY);
 	if (fd < 0) {
@@ -550,27 +575,11 @@ static int load_contents(struct snapshot *snapshot)
 
 	if (fstat(fd, &st) < 0)
 		die_errno("couldn't stat %s", snapshot->refs->path);
-	size = xsize_t(st.st_size);
-
-	if (!size) {
-		close(fd);
-		return 0;
-	} else if (mmap_strategy == MMAP_NONE || size <= SMALL_FILE_SIZE) {
-		snapshot->buf = xmalloc(size);
-		bytes_read = read_in_full(fd, snapshot->buf, size);
-		if (bytes_read < 0 || bytes_read != size)
-			die_errno("couldn't read %s", snapshot->refs->path);
-		snapshot->mmapped = 0;
-	} else {
-		snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
-		snapshot->mmapped = 1;
-	}
-	close(fd);
 
-	snapshot->start = snapshot->buf;
-	snapshot->eof = snapshot->buf + size;
+	ret = allocate_snapshot_buffer(snapshot, fd, &st);
 
-	return 1;
+	close(fd);
+	return ret;
 }
 
 static const char *find_reference_location_1(struct snapshot *snapshot,
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v5 3/3] packed-backend: mmap large "packed-refs" file during fsck
  2025-05-14 15:48       ` [PATCH v5 0/3] align the behavior when opening "packed-refs" shejialuo
  2025-05-14 15:50         ` [PATCH v5 1/3] packed-backend: fsck should warn when "packed-refs" file is empty shejialuo
  2025-05-14 15:50         ` [PATCH v5 2/3] packed-backend: extract snapshot allocation in `load_contents` shejialuo
@ 2025-05-14 15:50         ` shejialuo
  2025-05-15 12:57         ` [PATCH v5 0/3] align the behavior when opening "packed-refs" Junio C Hamano
  2025-05-21 16:31         ` Junio C Hamano
  4 siblings, 0 replies; 67+ messages in thread
From: shejialuo @ 2025-05-14 15:50 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt

During fsck, we use "strbuf_read" to read the content of "packed-refs"
without using mmap mechanism. This is a bad practice which would consume
more memory than using mmap mechanism. Besides, as all code paths in
"packed-backend.c" use this way, we should make "fsck" align with the
current codebase.

As we have introduced the helper function "allocate_snapshot_buffer", we
can simply use this function to use mmap mechanism.

Suggested-by: Jeff King <peff@peff.net>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 1da44a3d6d..7fd73a0e6d 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -2068,7 +2068,7 @@ static int packed_fsck(struct ref_store *ref_store,
 {
 	struct packed_ref_store *refs = packed_downcast(ref_store,
 							REF_STORE_READ, "fsck");
-	struct strbuf packed_ref_content = STRBUF_INIT;
+	struct snapshot snapshot = { 0 };
 	unsigned int sorted = 0;
 	struct stat st;
 	int ret = 0;
@@ -2112,7 +2112,7 @@ static int packed_fsck(struct ref_store *ref_store,
 		goto cleanup;
 	}
 
-	if (!st.st_size) {
+	if (!allocate_snapshot_buffer(&snapshot, fd, &st)) {
 		struct fsck_ref_report report = { 0 };
 		report.path = "packed-refs";
 		ret = fsck_report_ref(o, &report,
@@ -2121,21 +2121,16 @@ static int packed_fsck(struct ref_store *ref_store,
 		goto cleanup;
 	}
 
-	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
-		ret = error_errno(_("unable to read '%s'"), refs->path);
-		goto cleanup;
-	}
-
-	ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
-				      packed_ref_content.buf + packed_ref_content.len);
+	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot.start,
+				      snapshot.eof);
 	if (!ret && sorted)
-		ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
-					     packed_ref_content.buf + packed_ref_content.len);
+		ret = packed_fsck_ref_sorted(o, ref_store, snapshot.start,
+					     snapshot.eof);
 
 cleanup:
 	if (fd >= 0)
 		close(fd);
-	strbuf_release(&packed_ref_content);
+	clear_snapshot_buffer(&snapshot);
 	return ret;
 }
 
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH v5 0/3] align the behavior when opening "packed-refs"
  2025-05-14 15:48       ` [PATCH v5 0/3] align the behavior when opening "packed-refs" shejialuo
                           ` (2 preceding siblings ...)
  2025-05-14 15:50         ` [PATCH v5 3/3] packed-backend: mmap large "packed-refs" file during fsck shejialuo
@ 2025-05-15 12:57         ` Junio C Hamano
  2025-05-21 16:31         ` Junio C Hamano
  4 siblings, 0 replies; 67+ messages in thread
From: Junio C Hamano @ 2025-05-15 12:57 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Jeff King, Patrick Steinhardt

shejialuo <shejialuo@gmail.com> writes:

> As discussed in [1], we need to use mmap mechanism to open large
> "packed_refs" file to save the memory usage. This patch mainly does the
> following things:
>
> 1: Fix an issue that we would report an error when the "packed-refs"
> file is empty, which does not align with the runtime behavior.
> 2-4: Extract some logic from the existing code and then use these
> created helper functions to let fsck code to use mmap necessarily
>
> [1] https://lore.kernel.org/git/20250503133158.GA4450@coredump.intra.peff.net
>
> Really thank Peff and Patrick to suggest me to do above change.

This round looks good to me.  Queued.  Thanks.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v5 0/3] align the behavior when opening "packed-refs"
  2025-05-14 15:48       ` [PATCH v5 0/3] align the behavior when opening "packed-refs" shejialuo
                           ` (3 preceding siblings ...)
  2025-05-15 12:57         ` [PATCH v5 0/3] align the behavior when opening "packed-refs" Junio C Hamano
@ 2025-05-21 16:31         ` Junio C Hamano
  2025-05-22  5:50           ` Jeff King
  4 siblings, 1 reply; 67+ messages in thread
From: Junio C Hamano @ 2025-05-21 16:31 UTC (permalink / raw)
  To: shejialuo; +Cc: git, Jeff King, Patrick Steinhardt

shejialuo <shejialuo@gmail.com> writes:

> As discussed in [1], we need to use mmap mechanism to open large
> "packed_refs" file to save the memory usage. This patch mainly does the
> following things:
>
> 1: Fix an issue that we would report an error when the "packed-refs"
> file is empty, which does not align with the runtime behavior.
> 2-4: Extract some logic from the existing code and then use these
> created helper functions to let fsck code to use mmap necessarily
>
> [1] https://lore.kernel.org/git/20250503133158.GA4450@coredump.intra.peff.net
>
> Really thank Peff and Patrick to suggest me to do above change.
> ...
> Change in v5:
>
> 1. Improve the commit message in the first patch to be more clear:
>     1. Talk about the current behavior, what error we would report if
>        "packed-refs" is empty.
>     2. To align with the runtime behavior, we should skip checking the
>        content of "packed-refs".
>     3. Why do we need to report to the user when the "packed-refs" is
>        empty
> 2. Fix grammar issue in the last patch.

The thread has gone quiet on this topic.  Is everybody happy with
this version?

Thanks.


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v5 0/3] align the behavior when opening "packed-refs"
  2025-05-21 16:31         ` Junio C Hamano
@ 2025-05-22  5:50           ` Jeff King
  2025-05-23  9:40             ` Patrick Steinhardt
  0 siblings, 1 reply; 67+ messages in thread
From: Jeff King @ 2025-05-22  5:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: shejialuo, git, Patrick Steinhardt

On Wed, May 21, 2025 at 09:31:09AM -0700, Junio C Hamano wrote:

> > Change in v5:
> >
> > 1. Improve the commit message in the first patch to be more clear:
> >     1. Talk about the current behavior, what error we would report if
> >        "packed-refs" is empty.
> >     2. To align with the runtime behavior, we should skip checking the
> >        content of "packed-refs".
> >     3. Why do we need to report to the user when the "packed-refs" is
> >        empty
> > 2. Fix grammar issue in the last patch.
> 
> The thread has gone quiet on this topic.  Is everybody happy with
> this version?

Yep, it looks good to me. Thanks.

-Peff

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v5 0/3] align the behavior when opening "packed-refs"
  2025-05-22  5:50           ` Jeff King
@ 2025-05-23  9:40             ` Patrick Steinhardt
  2025-05-23 15:58               ` Junio C Hamano
  0 siblings, 1 reply; 67+ messages in thread
From: Patrick Steinhardt @ 2025-05-23  9:40 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, shejialuo, git

On Thu, May 22, 2025 at 01:50:06AM -0400, Jeff King wrote:
> On Wed, May 21, 2025 at 09:31:09AM -0700, Junio C Hamano wrote:
> 
> > > Change in v5:
> > >
> > > 1. Improve the commit message in the first patch to be more clear:
> > >     1. Talk about the current behavior, what error we would report if
> > >        "packed-refs" is empty.
> > >     2. To align with the runtime behavior, we should skip checking the
> > >        content of "packed-refs".
> > >     3. Why do we need to report to the user when the "packed-refs" is
> > >        empty
> > > 2. Fix grammar issue in the last patch.
> > 
> > The thread has gone quiet on this topic.  Is everybody happy with
> > this version?
> 
> Yep, it looks good to me. Thanks.

Didn't have anything else to add, either. Thanks!

Patrick

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v5 0/3] align the behavior when opening "packed-refs"
  2025-05-23  9:40             ` Patrick Steinhardt
@ 2025-05-23 15:58               ` Junio C Hamano
  0 siblings, 0 replies; 67+ messages in thread
From: Junio C Hamano @ 2025-05-23 15:58 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Jeff King, shejialuo, git

Patrick Steinhardt <ps@pks.im> writes:

> On Thu, May 22, 2025 at 01:50:06AM -0400, Jeff King wrote:
>> On Wed, May 21, 2025 at 09:31:09AM -0700, Junio C Hamano wrote:
>> 
>> > > Change in v5:
>> > >
>> > > 1. Improve the commit message in the first patch to be more clear:
>> > >     1. Talk about the current behavior, what error we would report if
>> > >        "packed-refs" is empty.
>> > >     2. To align with the runtime behavior, we should skip checking the
>> > >        content of "packed-refs".
>> > >     3. Why do we need to report to the user when the "packed-refs" is
>> > >        empty
>> > > 2. Fix grammar issue in the last patch.
>> > 
>> > The thread has gone quiet on this topic.  Is everybody happy with
>> > this version?
>> 
>> Yep, it looks good to me. Thanks.
>
> Didn't have anything else to add, either. Thanks!

Thanks, all.  Let's merge it down.

^ permalink raw reply	[flat|nested] 67+ messages in thread

end of thread, other threads:[~2025-05-23 15:59 UTC | newest]

Thread overview: 67+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-06 16:39 [PATCH 0/4] align the behavior when opening "packed-refs" shejialuo
2025-05-06 16:41 ` [PATCH 1/4] packed-backend: skip checking consistency of empty packed-refs file shejialuo
2025-05-06 18:42   ` Junio C Hamano
2025-05-07 12:09     ` shejialuo
2025-05-06 19:14   ` Junio C Hamano
2025-05-07 12:10     ` shejialuo
2025-05-06 16:41 ` [PATCH 2/4] packed-backend: extract snapshot allocation in `load_contents` shejialuo
2025-05-06 19:16   ` Junio C Hamano
2025-05-06 16:41 ` [PATCH 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY` shejialuo
2025-05-06 18:52   ` Junio C Hamano
2025-05-06 22:17     ` Junio C Hamano
2025-05-07 12:21     ` shejialuo
2025-05-06 16:41 ` [PATCH 4/4] packed-backend: use mmap when opening large "packed-refs" file shejialuo
2025-05-06 19:00   ` Junio C Hamano
2025-05-06 22:18     ` Junio C Hamano
2025-05-07 12:34     ` shejialuo
2025-05-07 14:52 ` [PATCH v2 0/4] align the behavior when opening "packed-refs" shejialuo
2025-05-07 14:53   ` [PATCH v2 1/4] packed-backend: fsck should allow an empty "packed-refs" file shejialuo
2025-05-07 14:53   ` [PATCH v2 2/4] packed-backend: extract snapshot allocation in `load_contents` shejialuo
2025-05-07 14:53   ` [PATCH v2 3/4] packed-backend: extract munmap operation for `MMAP_TEMPORARY` shejialuo
2025-05-08 19:57     ` Jeff King
2025-05-08 20:05       ` Junio C Hamano
2025-05-09 15:03         ` shejialuo
2025-05-07 14:54   ` [PATCH v2 4/4] packed-backend: mmap large "packed-refs" file during fsck shejialuo
2025-05-08 20:07     ` Jeff King
2025-05-09 15:21       ` shejialuo
2025-05-09 15:59         ` Jeff King
2025-05-09 16:40           ` shejialuo
2025-05-07 22:51   ` [PATCH v2 0/4] align the behavior when opening "packed-refs" Junio C Hamano
2025-05-08 20:08     ` Jeff King
2025-05-08 20:20       ` Junio C Hamano
2025-05-08 20:33         ` Jeff King
2025-05-09 15:26           ` shejialuo
2025-05-11 13:59   ` [PATCH v3 0/3] " shejialuo
2025-05-11 14:01     ` [PATCH v3 1/3] packed-backend: fsck should allow an empty "packed-refs" file shejialuo
2025-05-12  8:36       ` Patrick Steinhardt
2025-05-12 12:25         ` shejialuo
2025-05-12 14:39           ` Patrick Steinhardt
2025-05-12 15:56             ` Jeff King
2025-05-12 17:18               ` Junio C Hamano
2025-05-13  5:08                 ` Patrick Steinhardt
2025-05-13  7:06                   ` shejialuo
2025-05-11 14:01     ` [PATCH v3 2/3] packed-backend: extract snapshot allocation in `load_contents` shejialuo
2025-05-12  8:37       ` Patrick Steinhardt
2025-05-12 10:35         ` shejialuo
2025-05-12 14:41           ` Patrick Steinhardt
2025-05-12 13:06       ` Jeff King
2025-05-13  6:55         ` shejialuo
2025-05-11 14:01     ` [PATCH v3 3/3] packed-backend: mmap large "packed-refs" file during fsck shejialuo
2025-05-12 13:08       ` Jeff King
2025-05-13 11:06     ` [PATCH v4 0/3] align the behavior when opening "packed-refs" shejialuo
2025-05-13 11:07       ` [PATCH v4 1/3] packed-backend: fsck should warn when "packed-refs" file is empty shejialuo
2025-05-13 16:30         ` Junio C Hamano
2025-05-14 12:51           ` shejialuo
2025-05-13 11:07       ` [PATCH v4 2/3] packed-backend: extract snapshot allocation in `load_contents` shejialuo
2025-05-13 11:07       ` [PATCH v4 3/3] packed-backend: mmap large "packed-refs" file during fsck shejialuo
2025-05-13 16:51         ` Junio C Hamano
2025-05-14 13:05           ` shejialuo
2025-05-14 15:48       ` [PATCH v5 0/3] align the behavior when opening "packed-refs" shejialuo
2025-05-14 15:50         ` [PATCH v5 1/3] packed-backend: fsck should warn when "packed-refs" file is empty shejialuo
2025-05-14 15:50         ` [PATCH v5 2/3] packed-backend: extract snapshot allocation in `load_contents` shejialuo
2025-05-14 15:50         ` [PATCH v5 3/3] packed-backend: mmap large "packed-refs" file during fsck shejialuo
2025-05-15 12:57         ` [PATCH v5 0/3] align the behavior when opening "packed-refs" Junio C Hamano
2025-05-21 16:31         ` Junio C Hamano
2025-05-22  5:50           ` Jeff King
2025-05-23  9:40             ` Patrick Steinhardt
2025-05-23 15:58               ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).