public inbox for git@vger.kernel.org
From: Jeff King <peff@peff.net>
To: Jacob Keller <jacob.e.keller@intel.com>
Cc: git@vger.kernel.org
Subject: [PATCH 3/4] pack-revindex: avoid double-loading .rev files
Date: Thu, 5 Mar 2026 18:12:29 -0500
Message-ID: <20260305231229.GC2901305@coredump.intra.peff.net>
In-Reply-To: <20260305230315.GA2354983@coredump.intra.peff.net>

The usual entry point for loading the pack revindex is the
load_pack_revindex() function. It returns immediately if the packed_git
has a non-NULL revindex or revindex_data field (representing an
in-memory revindex or an mmap'd .rev file, respectively), since the data
is already loaded.

But in 5a6072f631 (fsck: validate .rev file header, 2023-04-17) the fsck
code path switched to calling load_pack_revindex_from_disk() directly,
since it wants to check the on-disk data (if there is any). That
function does _not_ check whether the data has already been loaded; it
just maps the file, overwriting the revindex_map pointer (and pointing
revindex_data inside the new map). If revindex_map was already non-NULL,
the mmap() it pointed to is leaked.
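
As a rough illustration of the overwrite described above, here is a
self-contained toy model in C. Nothing in it is git's real code:
struct toy_pack, toy_map(), toy_load() and the 4-byte TOY_HEADER_SIZE
are hypothetical stand-ins, and calloc() plays the role of mmap(). It
only shows that a second unguarded load drops the first mapping, while
a check on revindex_data keeps it.

```c
#include <stdlib.h>

#define TOY_HEADER_SIZE 4 /* made-up header size, for illustration only */

/* Hypothetical stand-in for the two packed_git fields discussed above. */
struct toy_pack {
	void *revindex_map;                 /* start of the "mapping" */
	const unsigned char *revindex_data; /* points inside revindex_map */
};

static int live_maps; /* toy mappings created but not yet released */

static void *toy_map(void)
{
	void *m = calloc(1, 16); /* heap allocation stands in for mmap() */
	if (m)
		live_maps++;
	return m;
}

static void toy_unmap(void *m)
{
	if (m) {
		free(m);
		live_maps--;
	}
}

/* With guarded == 0 this overwrites whatever revindex_map held before. */
static int toy_load(struct toy_pack *p, int guarded)
{
	void *m;

	if (guarded && p->revindex_data)
		return 0; /* already loaded; keep the existing mapping */

	m = toy_map();
	if (!m)
		return -1;
	p->revindex_map = m;
	p->revindex_data = (const unsigned char *)m + TOY_HEADER_SIZE;
	return 0;
}

/* Load twice, release once; return how many mappings were left behind. */
static int toy_leaked_after_double_load(int guarded)
{
	struct toy_pack p = { NULL, NULL };
	int before = live_maps, leaked;
	void *first;

	toy_load(&p, guarded);
	first = p.revindex_map; /* the demo keeps a copy; real callers do not */
	toy_load(&p, guarded);
	toy_unmap(p.revindex_map);
	leaked = live_maps - before;

	if (leaked)
		toy_unmap(first); /* tidy up so the demo itself stays leak-free */
	return leaked;
}
```

Under these assumptions, toy_leaked_after_double_load(0) reports one
leftover mapping and toy_leaked_after_double_load(1) reports none.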

This usually doesn't happen, since fsck wouldn't need to load the
revindex for any reason before we get to these checks. But there are
some cases where it does. For example, is_promisor_object() runs
odb_for_each_object() with the PACK_ORDER flag, which uses the revindex.

This happens a few times in our test suite, but SANITIZE=leak doesn't
detect it because we are leaking an mmap(), not a heap-allocated buffer
from malloc(). However, if you build with NO_MMAP, then our compat mmap
will read into a heap buffer instead, and LSan will complain. This
causes failures in t5601, t0410, t5702, and t5616.
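
To see why NO_MMAP makes the leak visible, consider a simplified sketch
of a heap-backed mmap replacement. This is not git's compat code;
toy_mmap() and toy_munmap() are hypothetical names, and the FILE-based
interface is an assumption chosen only to keep the example in standard
C. Because the buffer comes from malloc(), forgetting toy_munmap() is an
ordinary heap leak of the kind LSan reports.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical heap-backed "mmap": copy `length` bytes starting at
 * `offset` from `fp` into a malloc()ed buffer. Bytes past end-of-file
 * are zero-filled. Returns NULL on error.
 */
static void *toy_mmap(size_t length, FILE *fp, long offset)
{
	unsigned char *buf;
	size_t got;

	if (fseek(fp, offset, SEEK_SET))
		return NULL;

	buf = malloc(length ? length : 1);
	if (!buf)
		return NULL;

	got = fread(buf, 1, length, fp);
	if (ferror(fp)) {
		free(buf);
		return NULL;
	}
	memset(buf + got, 0, length - got); /* zero the tail on a short read */
	return buf;
}

/* Releasing the "mapping" is just free(); skipping it leaks heap memory. */
static int toy_munmap(void *start)
{
	free(start);
	return 0;
}
```

A caller that invokes toy_mmap() twice on the same slot without an
intervening toy_munmap() loses the first buffer, which is the same
pattern as the revindex_map overwrite, only now on the heap.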

We can fix it by checking for existing revindex_data when loading from
disk. This is redundant when we're called from load_pack_revindex(), but
it's a cheap check. The alternative is to teach check_pack_rev_indexes()
in fsck to skip the load, but that seems messier; it doesn't otherwise
know about internals like revindex_map and revindex_data.

Signed-off-by: Jeff King <peff@peff.net>
---
 pack-revindex.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/pack-revindex.c b/pack-revindex.c
index 56cd803a67..1fe0afe899 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -277,6 +277,10 @@ int load_pack_revindex_from_disk(struct packed_git *p)
 {
 	char *revindex_name;
 	int ret;
+
+	if (p->revindex_data)
+		return 0;
+
 	if (open_pack_index(p))
 		return -1;
 
-- 
2.53.0.786.g466665faa3


