public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>,
	git@vger.kernel.org, Patrick Steinhardt <ps@pks.im>
Subject: [PATCH 3.5/4] object-file: fix mmap() leak in odb_source_loose_read_object_stream()
Date: Fri, 6 Mar 2026 21:24:59 -0500	[thread overview]
Message-ID: <20260307022459.GA693632@coredump.intra.peff.net> (raw)
In-Reply-To: <xmqqqzpwv3t7.fsf@gitster.g>

On Fri, Mar 06, 2026 at 05:14:28PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > @@ -1600,6 +1600,7 @@ BASIC_CFLAGS += -DSHA1DC_FORCE_ALIGNED_ACCESS
> >  endif
> >  ifneq ($(filter leak,$(SANITIZERS)),)
> >  BASIC_CFLAGS += -O0
> > +NO_MMAP = CatchMapLeaks
> >  SANITIZE_LEAK = YesCompiledWithIt
> >  endif
> >  ifneq ($(filter address,$(SANITIZERS)),)
> 
> And of course, this "breaks" the leaks job at CI without being the
> true culprit.
> 
>     https://github.com/git/git/actions/runs/22786105918/job/66103114142
> 
> My bisection between v2.52.0 and v2.53.0 with the following
> 
>     $ git bisect start v2.53.0 v2.52.0
>     $ git bisect run sh :doit
> 
> where :doit has the shell script attached at the end of this message
> blames this commit.  I didn't dig further than that.

Interesting. I ran my tests on "master", which would include v2.53.0,
and it came up clean. But I use gcc locally; switching to clang does
indeed report a leak for me.

Even more curiously, if I try testing the tip of jch, then gcc does find
the same leak! Bisecting, it starts to find the leak as of 1f3fd68e06
(odb/source: make `read_object_stream()` function pluggable,
2026-03-05).

There is a real leak here; the fix is below. But curiously, it is _not_
the fault of the commit you found by bisection.

In the test in question, we die() shortly after the leak happens. We've
definitely left the function that holds the pointer to the leaked
buffer, so it's a true leak. But my guess is that the leak detector
doesn't quite know which parts of stack memory are valid or not when we
die(), so it scans the whole thing looking for plausible pointers to
allocations. If it gets "lucky", then the stale out-of-scope pointer is
still in stack memory, and we consider it still reachable.

And whether that happens or not can depend on the compiler, or even
compile options. And as the code is refactored to use the more abstract
odb API (and call more functions), it is increasingly likely that
something else has re-used that bit of stack memory.

So that's why the leak "appears" in 4c89d31494 (streaming: rely on
object sources to create object stream, 2025-11-23) for clang, and
1f3fd68e06 (odb/source: make `read_object_stream()` function pluggable,
2026-03-05) for gcc. But it was really there all along.

Anyway, here's the fix. It should probably be slotted in before patch 4
(which turns on NO_MMAP for leak-check builds).

-- >8 --
Subject: object-file: fix mmap() leak in odb_source_loose_read_object_stream()

We mmap() a loose object file, storing the result in the local variable
"mapped", which is eventually assigned into our stream struct as
"st.mapped". If we hit an error, we jump to an error label which does:

  munmap(st.mapped, st.mapsize);

to clean up. But this is wrong; we don't assign st.mapped until the end
of the function, after all of the "goto error" jumps. So this munmap()
is never cleaning up anything (st.mapped is always NULL, because we
initialize the struct with calloc).

Instead, we should feed the local variable to munmap().

This leak is due to 595296e124 (streaming: allocate stream inside the
backend-specific logic, 2025-11-23), which introduced the local
variable. Before that, we assigned the mmap result directly into
st.mapped. It was probably switched there so that we do not have to
allocate/free the struct when the map operation fails (e.g., because we
don't have the loose object). Before that commit, the struct was passed
in from the caller, so there was no allocation at all.

You can see the leak in the test suite by building with:

  make SANITIZE=leak NO_MMAP=1 CC=clang

and running t1060. We need NO_MMAP so that the mmap() is backed by an
actual malloc(), which allows LSan to detect it. And the leak seems not
to be detected when compiling with gcc, probably due to some internal
compiler decisions about how the stack memory is written.

Signed-off-by: Jeff King <peff@peff.net>
---
 object-file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/object-file.c b/object-file.c
index 3094140055..ab2fb9c4eb 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2197,7 +2197,7 @@ int odb_source_loose_read_object_stream(struct odb_read_stream **out,
 	return 0;
 error:
 	git_inflate_end(&st->z);
-	munmap(st->mapped, st->mapsize);
+	munmap(mapped, mapsize);
 	free(st);
 	return -1;
 }
-- 
2.53.0.791.g8baeb4ea4d


  reply	other threads:[~2026-03-07  2:25 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-05 20:51 memory leak when cloning a repository Jacob Keller
2026-03-05 22:02 ` Jeff King
2026-03-05 23:03   ` [PATCH 0/4] plugging some mmap() leaks Jeff King
2026-03-05 23:08     ` [PATCH 1/4] check_connected(): delay opening new_pack Jeff King
2026-03-05 23:18       ` Jacob Keller
2026-03-05 23:09     ` [PATCH 2/4] check_connected(): fix leak of pack-index mmap Jeff King
2026-03-05 23:20       ` Jacob Keller
2026-03-05 23:12     ` [PATCH 3/4] pack-revindex: avoid double-loading .rev files Jeff King
2026-03-05 23:13     ` [PATCH 4/4] Makefile: turn on NO_MMAP when building with LSan Jeff King
2026-03-06  9:17       ` Jacob Keller
2026-03-06 16:25         ` [PATCH 5/4] meson: " Jeff King
2026-03-06 18:00           ` Ramsay Jones
2026-03-07  1:14       ` [PATCH 4/4] Makefile: " Junio C Hamano
2026-03-07  2:24         ` Jeff King [this message]
2026-03-07  5:35           ` [PATCH 3.5/4] object-file: fix mmap() leak in odb_source_loose_read_object_stream() Junio C Hamano
2026-03-10 12:23             ` Patrick Steinhardt
2026-03-06  4:37     ` [PATCH 0/4] plugging some mmap() leaks Ramsay Jones
2026-03-06 16:21       ` Jeff King
2026-03-06 17:49         ` Ramsay Jones
2026-03-06 18:37       ` Junio C Hamano
2026-03-06 18:55         ` Ramsay Jones
2026-03-06 22:05           ` Junio C Hamano
2026-03-06 23:25             ` Ramsay Jones
2026-03-07  1:15               ` Junio C Hamano
2026-03-05 23:16   ` memory leak when cloning a repository Jacob Keller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260307022459.GA693632@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jacob.e.keller@intel.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox