Git development
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Cc: Git mailing list <git@vger.kernel.org>, Patrick Steinhardt <ps@pks.im>
Subject: Re: [PATCH] builtin/history: unuse the commit buffer after use
Date: Tue, 30 Jun 2026 01:26:08 -0400	[thread overview]
Message-ID: <20260630052608.GA2495216@coredump.intra.peff.net> (raw)
In-Reply-To: <317d0f7b-469f-4456-8808-506e17de264d@gmail.com>

On Tue, Jun 30, 2026 at 09:15:49AM +0530, Kaartic Sivaraam wrote:

> > So what the patch is doing is correct, but the explanation is a little
> > confused. We see the leak only when re-encoding, so we'd probably want a
> > test case that triggers that. Which I assume implies rewriting a commit
> > that was previously generated with an encoding header.
> > 
> 
> Thank you very much for these insights! It has been helpful but on further
> digging I think this is not about reencoding. On testing and digging
> further, the leak appears to be happening when the commit that is being
> reworded we get is a freshly allocated buffer from repo_get_commit_buffer.
> I'm still trying to figure out how specific commits get cached in the slab
> while other commits don't. I'll update this thread shortly once I get an
> idea about the same.
> 
> Meanwhile, if anyone knows offhand about this, kindly chime in.

The most likely cause of an uncached commit buffer is that we loaded the
commit from the commit-graph (and thus never opened the object in the
first place, so there was nothing to cache).

I suppose it is also possible that we might create a commit struct and
then before ever calling parse_commit() on it, ask to see the commit
msg. And thus we never looked at either the commit graph nor the object
itself.

We'd also refuse to cache if save_commit_buffer is disabled (since
that's the point of that flag). If that were the case I'd expect it to
be set consistently within a given program.

So my guess is probably the commit graph, but it could be that the
command tries to format unparsed commits (which is _not_ a bug or error,
but would trigger the situation where we hand back a freshly allocated
buffer).

I suspect the leak could _also_ be caused by re-encoding, but yeah, the
root cause is "repo_logmsg_reencode gave us an allocated string", and
that can happen when re-encoding is necessary, or when we did not have a
cached buffer in the first place.

> > But note that you need to do _both_ the unuse and free calls. If we did
> > re-encode, the former is needed to free the newly allocated buffer. The
> > latter only drops the original buffer in the cache.
> 
> From my understanding, I think we may not need free_commit_buffer for the
> following reasons:
> 
> - The leak was only being reported when the commit did not come
>   from the commit slab.
> - We are not going to be reading too many commit objects into memory in
>   this code path. Hence freeing the commit in the slab isn't strictly
>   necessary.
> 
> Kindly correct me if I missed something, though.

It depends on the definition of "too many" here. ;) Probably nobody will
notice if you keep a dozen in memory, but they may if the command
happens to get thousands of commits as input. That's not likely, but if
there's an easy moment where the program can say "ah, I am done with
commit X, let's free any resources associated with it" then I think it
is worth doing.

If there isn't such an easy moment, it is probably not that big a deal
to leave it as-is. It's a rare case when somebody would notice at all,
and if you think over the implications above, a repo with an up-to-date
commit graph won't have very many cached entries in the first place.

> To conclude, I think the change that the patch proposes if fine but the
> commit message definitely needs updation.

Yep. Whether we should be calling free_commit_buffer() or not is really
an orthogonal question to the leak. If we want to do something about it,
it would be a separate patch anyway.

-Peff

  reply	other threads:[~2026-06-30  5:26 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-14 14:15 [PATCH] builtin/history: unuse the commit buffer after use Kaartic Sivaraam
2026-06-15  9:48 ` Patrick Steinhardt
2026-06-15 17:29   ` Jeff King
2026-06-16  5:45     ` Patrick Steinhardt
2026-06-30  3:45     ` Kaartic Sivaraam
2026-06-30  5:26       ` Jeff King [this message]
2026-06-30  3:43   ` Kaartic Sivaraam
2026-06-30  5:38     ` Jeff King
2026-06-30  5:50       ` Jeff King
2026-06-30  6:44         ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260630052608.GA2495216@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=kaartic.sivaraam@gmail.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox