All of lore.kernel.org
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Eric Sunshine <sunshine@sunshineco.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v2 04/21] strbuf: fix leak when `appendwholeline()` fails with EOF
Date: Wed, 29 May 2024 13:25:02 +0200	[thread overview]
Message-ID: <ZlcQjvAS-27S-mjw@tanuki> (raw)
In-Reply-To: <20240529091633.GB1098944@coredump.intra.peff.net>

[-- Attachment #1: Type: text/plain, Size: 3872 bytes --]

On Wed, May 29, 2024 at 05:16:33AM -0400, Jeff King wrote:
> On Mon, May 27, 2024 at 08:44:46AM +0200, Patrick Steinhardt wrote:
> > > diff --git a/strbuf.c b/strbuf.c
> > > diff --git a/strbuf.c b/strbuf.c
> > > index e1076c9891..aed699c6bf 100644
> > > --- a/strbuf.c
> > > +++ b/strbuf.c
> > > @@ -656,10 +656,8 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
> > >  	 * we can just re-init, but otherwise we should make sure that our
> > >  	 * length is empty, and that the result is NUL-terminated.
> > >  	 */
> > > -	if (!sb->buf)
> > > -		strbuf_init(sb, 0);
> > > -	else
> > > -		strbuf_reset(sb);
> > > +	FREE_AND_NULL(sb->buf);
> > > +	strbuf_init(sb, 0);
> > >  	return EOF;
> > >  }
> > >  #else
> > > 
> > > But I think either of those would solve your leak, _and_ would help with
> > > similar leaks of strbuf_getwholeline() and friends.
> > 
> > I'm not quite convinced that `strbuf_getwholeline()` should deallocate
> > the buffer for the caller, I think that makes for quite a confusing
> > calling convention. The caller may want to reuse the buffer for other
> > operations, and it feels hostile to release the buffer under their feet.
> > 
> > The only edge case where I think it would make sense to free allocated
> > data is when being passed a not-yet-allocated strbuf. But I wonder
> > whether the added complexity would be worth it.
> 
> I'm not sure what they'd reuse it for. We necessarily have to reset it
> before reading, so the contents are now garbage. The allocated buffer
> could be reused, but since everybody has to call strbuf_grow() before
> assuming they can write, it's not a correctness issue, but only an
> optimization. But that optimization is pretty unlikely to matter. Since
> we hit this code only on EOF or error, it's generally going to happen
> once in a program, and not in a tight loop.
> 
> If we really cared, though, I think you could check sb->alloc before the
> call to getdelim(), and then we'd know whether the original held an
> allocation or not (and we could restore its state). That's what other
> syscall-ish strbuf functions like strbuf_readlink() and strbuf_getcwd()
> do.

Ah, I didn't know that we did similar things in other strbuf functions.
With that precedence I think it's less ugly to do this dance.

> That said, I agree that leaks here are not going to be common. Most
> callers are going to call it in a loop and unconditionally release at
> the end, whether they get multiple lines or not. The "append" function
> is the odd man out by reading a single line into a new buffer[1].
> 
> Looking through the results of:
> 
>   git grep -P '(?<!while) \(!?strbuf_get(whole)?line'
> 
> I saw only one questionable case. builtin/difftool.c does:
> 
>   if (strbuf_getline_nul(&lpath, fp))
> 	break;
> 
> without freeing lpath. But then...it does not free it in the case that
> we got a value, either! So I think it is leaking either way, and the
> solution, to strbuf_release(&lpath) outside of the loop, would fix both
> cases.

Indeed. We also didn't free `rpath` and `info`. I do have a follow up to
this series already, so let me add those leak fixes to it.

> > I've been going through all callsites and couldn't spot any that doesn't
> > free the buffer on EOF. So I'd propose to leave this as-is and revisit
> > if we eventually see that this is causing more memory leaks.
> 
> OK. I don't feel too strongly about it, but mostly thought it seemed
> inconsistent with the philosophy of those other strbuf functions.

I get where you're coming from now with the additional info that other
syscall-ish functions do a similar dance. I'll refrain from rerolling
this series just to fix this in a different way, also because neither of
us did spot any additional leaks caused by this.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2024-05-29 11:25 UTC|newest]

Thread overview: 115+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-23 12:25 [PATCH 00/20] Various memory leak fixes Patrick Steinhardt
2024-05-23 12:25 ` [PATCH 01/20] t: mark a bunch of tests as leak-free Patrick Steinhardt
2024-05-23 17:44   ` Junio C Hamano
2024-05-24  6:56     ` Patrick Steinhardt
2024-05-24 16:05       ` Junio C Hamano
2024-05-24 17:53         ` Junio C Hamano
2024-05-24 20:34   ` Karthik Nayak
2024-05-23 12:25 ` [PATCH 02/20] transport-helper: fix leaking helper name Patrick Steinhardt
2024-05-23 17:36   ` Junio C Hamano
2024-05-24 20:38   ` Karthik Nayak
2024-05-23 12:25 ` [PATCH 03/20] strbuf: fix leak when `appendwholeline()` fails with EOF Patrick Steinhardt
2024-05-23 12:25 ` [PATCH 04/20] checkout: clarify memory ownership in `unique_tracking_name()` Patrick Steinhardt
2024-05-23 12:25 ` [PATCH 05/20] http: refactor code to clarify memory ownership Patrick Steinhardt
2024-05-23 12:25 ` [PATCH 06/20] config: clarify memory ownership in `git_config_pathname()` Patrick Steinhardt
2024-05-23 12:25 ` [PATCH 07/20] diff: refactor code to clarify memory ownership of prefixes Patrick Steinhardt
2024-05-23 16:59   ` Eric Sunshine
2024-05-23 12:25 ` [PATCH 08/20] convert: refactor code to clarify ownership of check_roundtrip_encoding Patrick Steinhardt
2024-05-23 12:25 ` [PATCH 09/20] builtin/log: stop using globals for log config Patrick Steinhardt
2024-05-23 12:25 ` [PATCH 10/20] builtin/log: stop using globals for format config Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 11/20] config: clarify memory ownership in `git_config_string()` Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 12/20] config: plug various memory leaks Patrick Steinhardt
2024-05-23 17:13   ` Junio C Hamano
2024-05-24  6:58     ` Patrick Steinhardt
2024-05-24  8:55       ` Patrick Steinhardt
2024-05-24 16:12         ` Junio C Hamano
2024-05-24 16:11       ` Junio C Hamano
2024-05-23 12:26 ` [PATCH 13/20] builtin/credential: clear credential before exit Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 14/20] commit-reach: fix memory leak in `ahead_behind()` Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 15/20] submodule: fix leaking memory for submodule entries Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 16/20] strvec: add functions to replace and remove strings Patrick Steinhardt
2024-05-23 17:09   ` Eric Sunshine
2024-05-24  6:56     ` Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 17/20] builtin/mv: refactor `add_slash()` to always return allocated strings Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 18/20] builtin/mv duplicate string list memory Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 19/20] builtin/mv: refactor to use `struct strvec` Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 20/20] builtin/mv: fix leaks for submodule gitfile paths Patrick Steinhardt
2024-05-23 16:45 ` [PATCH 00/20] Various memory leak fixes Junio C Hamano
2024-05-24  6:56   ` Patrick Steinhardt
2024-05-24 10:03 ` [PATCH v2 00/21] " Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 01/21] ci: add missing dependency for TTY prereq Patrick Steinhardt
2024-05-24 16:31     ` Junio C Hamano
2024-05-24 10:03   ` [PATCH v2 02/21] t: mark a bunch of tests as leak-free Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 03/21] transport-helper: fix leaking helper name Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 04/21] strbuf: fix leak when `appendwholeline()` fails with EOF Patrick Steinhardt
2024-05-25  4:46     ` Jeff King
2024-05-27  6:44       ` Patrick Steinhardt
2024-05-29  9:16         ` Jeff King
2024-05-29 11:25           ` Patrick Steinhardt [this message]
2024-05-30  7:16             ` Jeff King
2024-05-24 10:03   ` [PATCH v2 05/21] checkout: clarify memory ownership in `unique_tracking_name()` Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 06/21] http: refactor code to clarify memory ownership Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 07/21] config: clarify memory ownership in `git_config_pathname()` Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 08/21] diff: refactor code to clarify memory ownership of prefixes Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 09/21] convert: refactor code to clarify ownership of check_roundtrip_encoding Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 10/21] builtin/log: stop using globals for log config Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 11/21] builtin/log: stop using globals for format config Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 12/21] config: clarify memory ownership in `git_config_string()` Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 13/21] config: plug various memory leaks Patrick Steinhardt
2024-05-24 10:13     ` Patrick Steinhardt
2024-05-25  4:33     ` Jeff King
2024-05-27  6:46       ` Patrick Steinhardt
2024-05-29  9:20         ` Jeff King
2024-05-24 10:04   ` [PATCH v2 14/21] builtin/credential: clear credential before exit Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 15/21] commit-reach: fix memory leak in `ahead_behind()` Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 16/21] submodule: fix leaking memory for submodule entries Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 17/21] strvec: add functions to replace and remove strings Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 18/21] builtin/mv: refactor `add_slash()` to always return allocated strings Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 19/21] builtin/mv duplicate string list memory Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 20/21] builtin/mv: refactor to use `struct strvec` Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 21/21] builtin/mv: fix leaks for submodule gitfile paths Patrick Steinhardt
2024-05-25  2:10   ` [PATCH v2 00/21] Various memory leak fixes Junio C Hamano
2024-05-27  6:44     ` Patrick Steinhardt
2024-05-27 17:38       ` Junio C Hamano
2024-05-27 18:02         ` Junio C Hamano
2024-05-28  5:09         ` Patrick Steinhardt
2024-05-29  8:25       ` Karthik Nayak
2024-05-27 11:45 ` [PATCH v3 " Patrick Steinhardt
2024-05-27 11:45   ` [PATCH v3 01/21] ci: add missing dependency for TTY prereq Patrick Steinhardt
2024-05-27 11:45   ` [PATCH v3 02/21] t: mark a bunch of tests as leak-free Patrick Steinhardt
2024-05-27 11:45   ` [PATCH v3 03/21] transport-helper: fix leaking helper name Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 04/21] strbuf: fix leak when `appendwholeline()` fails with EOF Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 05/21] checkout: clarify memory ownership in `unique_tracking_name()` Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 06/21] http: refactor code to clarify memory ownership Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 07/21] config: clarify memory ownership in `git_config_pathname()` Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 08/21] diff: refactor code to clarify memory ownership of prefixes Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 09/21] convert: refactor code to clarify ownership of check_roundtrip_encoding Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 10/21] builtin/log: stop using globals for log config Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 11/21] builtin/log: stop using globals for format config Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 12/21] config: clarify memory ownership in `git_config_string()` Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 13/21] config: plug various memory leaks Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 14/21] builtin/credential: clear credential before exit Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 15/21] commit-reach: fix memory leak in `ahead_behind()` Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 16/21] submodule: fix leaking memory for submodule entries Patrick Steinhardt
2024-05-27 11:47   ` [PATCH v3 17/21] strvec: add functions to replace and remove strings Patrick Steinhardt
2024-05-27 11:47   ` [PATCH v3 18/21] builtin/mv: refactor `add_slash()` to always return allocated strings Patrick Steinhardt
2024-05-27 11:47   ` [PATCH v3 19/21] builtin/mv duplicate string list memory Patrick Steinhardt
2024-05-27 11:47   ` [PATCH v3 20/21] builtin/mv: refactor to use `struct strvec` Patrick Steinhardt
2024-05-27 11:47   ` [PATCH v3 21/21] builtin/mv: fix leaks for submodule gitfile paths Patrick Steinhardt
2024-05-27 17:52   ` [PATCH v3 00/21] Various memory leak fixes Junio C Hamano
2024-05-30  6:38   ` [PATCH 0/5] add-ons for ps/leakfixes Jeff King
2024-05-30  6:39     ` [PATCH 1/5] t-strvec: use va_end() to match va_start() Jeff King
2024-05-30  6:39     ` [PATCH 2/5] t-strvec: mark variable-arg helper with LAST_ARG_MUST_BE_NULL Jeff King
2024-05-30  6:44     ` [PATCH 3/5] mv: move src_dir cleanup to end of cmd_mv() Jeff King
2024-05-30  7:04       ` Patrick Steinhardt
2024-05-30  7:21         ` Jeff King
2024-05-30  7:24           ` Patrick Steinhardt
2024-05-30  8:15             ` Jeff King
2024-05-30  8:19               ` Patrick Steinhardt
2024-05-30  8:28                 ` Jeff King
2024-05-30  6:45     ` [PATCH 4/5] mv: factor out empty src_dir removal Jeff King
2024-05-30  6:46     ` [PATCH 5/5] mv: replace src_dir with a strvec Jeff King
2024-05-30 15:36       ` Junio C Hamano
2024-05-31 11:12         ` Jeff King
2024-05-31 14:56           ` Junio C Hamano
2024-05-30  7:05     ` [PATCH 0/5] add-ons for ps/leakfixes Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZlcQjvAS-27S-mjw@tanuki \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.