git.vger.kernel.org archive mirror
From: shejialuo <shejialuo@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
	Patrick Steinhardt <ps@pks.im>
Subject: [PATCH v2 0/4] align the behavior when opening "packed-refs"
Date: Wed, 7 May 2025 22:52:15 +0800
Message-ID: <aBtzn4nwLsI9p5Cp@ArchLinux>
In-Reply-To: <aBo7OiCKHTyT4DzH@ArchLinux>

Hi All:

As discussed in [1], we need to use mmap to open a large "packed-refs"
file in order to reduce memory usage. This series mainly does the
following things:

1: Fix an issue where fsck would report an error when the "packed-refs"
file is empty, which does not align with the runtime behavior.
2-4: Extract some logic from the existing code and then use the newly
created helper functions to let the fsck code use mmap where necessary.
[1] https://lore.kernel.org/git/20250503133158.GA4450@coredump.intra.peff.net
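
To make the two points above concrete, here is a minimal, standalone
sketch of the general technique (skip an empty file, otherwise map it
read-only instead of reading it into a heap buffer). It is not the code
from this series: "struct file_view", "view_file" and "release_view" are
hypothetical names used only for illustration, and it assumes a POSIX
system.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical stand-in for a snapshot; not Git's "struct snapshot". */
struct file_view {
	char *start;	/* first byte of the content */
	char *eof;	/* one past the last byte of the content */
	int mmapped;	/* non-zero if "start" came from mmap() */
};

/*
 * Map "path" read-only. Returns 1 if there is content to check,
 * 0 if the file is empty (nothing to check), and -1 on error.
 */
static int view_file(const char *path, struct file_view *view)
{
	struct stat st;
	void *map;
	int fd;

	assert(path && view);
	view->start = view->eof = NULL;
	view->mmapped = 0;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0) {
		close(fd);
		return -1;
	}
	if (!st.st_size) {
		/* An empty file is not an error; just skip the checks. */
		close(fd);
		return 0;
	}

	map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd); /* the mapping stays valid after close() */
	if (map == MAP_FAILED)
		return -1;

	view->start = map;
	view->eof = view->start + st.st_size;
	view->mmapped = 1;
	return 1;
}

/* Drop the mapping, if any, and reset the view. */
static void release_view(struct file_view *view)
{
	if (view->mmapped)
		munmap(view->start, (size_t)(view->eof - view->start));
	view->start = view->eof = NULL;
	view->mmapped = 0;
}
```

A caller would run its content checks only when "view_file" returns 1,
and treat a return value of 0 as "nothing to verify" rather than as a
failure.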

Many thanks to Peff and Patrick for suggesting the above changes.

---

Changes since v1:

1. Update the commit message of [PATCH 1/4], and use redirection to
create an empty file instead of using `touch`.
2. Don't use "if" in the name of the refactored function in [PATCH 3/4],
and update the commit message to align with the new function name.
3. Enhance the commit message of [PATCH 4/4].

Thanks,
Jialuo

shejialuo (4):
  packed-backend: fsck should allow an empty "packed-refs" file
  packed-backend: extract snapshot allocation in `load_contents`
  packed-backend: extract munmap operation for `MMAP_TEMPORARY`
  packed-backend: mmap large "packed-refs" file during fsck

 refs/packed-backend.c    | 113 ++++++++++++++++++++++++---------------
 t/t0602-reffiles-fsck.sh |  13 +++++
 2 files changed, 82 insertions(+), 44 deletions(-)

Range-diff against v1:
1:  aa9037ebfa ! 1:  26c3fd55a8 packed-backend: skip checking consistency of empty packed-refs file
    @@ Metadata
     Author: shejialuo <shejialuo@gmail.com>
     
      ## Commit message ##
    -    packed-backend: skip checking consistency of empty packed-refs file
    +    packed-backend: fsck should allow an empty "packed-refs" file
     
    -    In "load_contents", when the "packed-refs" is empty, we will just return
    -    the snapshot. However, we would report an error to the user when
    -    checking the consistency of the empty "packed-refs".
    -
    -    We should align with the runtime behavior. As what "load_contents" does,
    -    let's check whether the file size is zero and if so, we will skip
    -    checking the consistency and simply return.
    +    During fsck, an empty "packed-refs" gives an error; this is unwarranted.
    +    We should just skip checking the content of "packed-refs" just like the
    +    runtime code paths such as "create_snapshot" which simply returns the
    +    "snapshot" without checking the content of "packed-refs".
     
         Signed-off-by: shejialuo <shejialuo@gmail.com>
     
    @@ t/t0602-reffiles-fsck.sh: test_expect_success SYMLINKS 'the filetype of packed-r
     +		cd repo &&
     +		test_commit default &&
     +
    -+		touch .git/packed-refs &&
    ++		>.git/packed-refs &&
     +		git refs verify 2>err &&
     +		test_must_be_empty err
     +	)
2:  852e8a606b = 2:  4604be8b51 packed-backend: extract snapshot allocation in `load_contents`
3:  de146155f6 ! 3:  c0609afac9 packed-backend: extract munmap operation for `MMAP_TEMPORARY`
    @@ Commit message
         is "MMAP_TEMPORARY". We also need to do this operation when checking the
         consistency of the "packed-refs" file.
     
    -    Create a new function "munmap_snapshot_if_temporary" to do above and
    +    Create a new function "munmap_temporary_snapshot" to do above and
         change "create_snapshot" to align with the behavior.
     
         Suggested-by: Jeff King <peff@peff.net>
    @@ refs/packed-backend.c: static int allocate_snapshot_buffer(struct snapshot *snap
      	return 1;
      }
      
    -+static void munmap_snapshot_if_temporary(struct snapshot *snapshot)
    ++static void munmap_temporary_snapshot(struct snapshot *snapshot)
     +{
    -+	if (mmap_strategy != MMAP_OK && snapshot->mmapped) {
    -+		/*
    -+		 * We don't want to leave the file mmapped, so we are
    -+		 * forced to make a copy now:
    -+		 */
    -+		size_t size = snapshot->eof - snapshot->start;
    -+		char *buf_copy = xmalloc(size);
    ++	char *buf_copy;
    ++	size_t size;
     +
    -+		memcpy(buf_copy, snapshot->start, size);
    -+		clear_snapshot_buffer(snapshot);
    -+		snapshot->buf = snapshot->start = buf_copy;
    -+		snapshot->eof = buf_copy + size;
    -+	}
    ++	if (!snapshot)
    ++		return;
    ++
    ++	/*
    ++	 * We don't want to leave the file mmapped, so we are
    ++	 * forced to make a copy now:
    ++	 */
    ++	size = snapshot->eof - snapshot->start;
    ++	buf_copy = xmalloc(size);
    ++
    ++	memcpy(buf_copy, snapshot->start, size);
    ++	clear_snapshot_buffer(snapshot);
    ++	snapshot->buf = snapshot->start = buf_copy;
    ++	snapshot->eof = buf_copy + size;
     +}
     +
      /*
    @@ refs/packed-backend.c: static struct snapshot *create_snapshot(struct packed_ref
     -		snapshot->buf = snapshot->start = buf_copy;
     -		snapshot->eof = buf_copy + size;
     -	}
    -+	munmap_snapshot_if_temporary(snapshot);
    ++	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
    ++		munmap_temporary_snapshot(snapshot);
      
      	return snapshot;
      }
4:  be1e9e2540 ! 4:  c868e3dd16 packed-backend: use mmap when opening large "packed-refs" file
    @@ Metadata
     Author: shejialuo <shejialuo@gmail.com>
     
      ## Commit message ##
    -    packed-backend: use mmap when opening large "packed-refs" file
    +    packed-backend: mmap large "packed-refs" file during fsck
     
    -    We use "strbuf_read" to read the content of "packed-refs". However, this
    -    is a bad practice which would consume a lot of memory usage if there are
    -    multiple processes reading large "packed-refs".
    +    During fsck, we use "strbuf_read" to read the content of "packed-refs"
    +    without using mmap mechanism. This is a bad practice which would consume
    +    more memory than using mmap mechanism. Besides, as all code paths in
    +    "packed-backend.c" use this way, we should make "fsck" align with the
    +    current codebase.
     
         As we have introduced two helper functions "allocate_snapshot_buffer"
         and "munmap_snapshot_if_temporary", we could simply call these functions
    @@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
     +	if (!allocate_snapshot_buffer(snapshot, fd, &st))
      		goto cleanup;
     -	}
    -+	munmap_snapshot_if_temporary(snapshot);
      
     -	ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
     -				      packed_ref_content.buf + packed_ref_content.len);
    ++	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
    ++		munmap_temporary_snapshot(snapshot);
    ++
     +	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot->start,
     +				      snapshot->eof);
      	if (!ret && sorted)
-- 
2.49.0

