git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Lidong Yan via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org,  Lidong Yan <502024330056@smail.nju.edu.cn>
Subject: Re: [PATCH v2] mailinfo: fix pointential memory leak if `decode_header` failed
Date: Mon, 12 May 2025 07:12:08 -0700	[thread overview]
Message-ID: <xmqqcycdx6iv.fsf@gitster.g> (raw)
In-Reply-To: <pull.1956.v2.git.git.1746980097510.gitgitgadget@gmail.com> (Lidong Yan via GitGitGadget's message of "Sun, 11 May 2025 16:14:57 +0000")

"Lidong Yan via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Lidong Yan <502024330056@smail.nju.edu.cn>
>
> In mailinfo.c:decode_header, if convert_to_utf8 failed, the strbuf stored
> in dec will leak. Simply add strbuf_release and free(dec) will solve
> this problem.
>
> Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn>
> ---
>     decode_header: fix pointential memory leak if decode_header failed
>     
>     In mailinfo.c line 539, if convert_to_utf8 failed, the strbuf stored in
>     dec will leak. Simply add strbuf_release and free(dec) will solve this
>     problem.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1956%2Fbrandb97%2Ffix-mailinfo-decode-header-leak-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1956/brandb97/fix-mailinfo-decode-header-leak-v2
> Pull-Request: https://github.com/git/git/pull/1956
>
> Range-diff vs v1:
>
>  1:  81fdfb94315 ! 1:  90dc9b0d49b decode_header: fix pointential memory leak if decode_header failed
>      @@ Metadata
>       Author: Lidong Yan <502024330056@smail.nju.edu.cn>
>       
>        ## Commit message ##
>      -    decode_header: fix pointential memory leak if decode_header failed
>      +    mailinfo: fix pointential memory leak if `decode_header` failed
>       
>      -    In mailinfo.c line 539, if convert_to_utf8 failed, the strbuf stored
>      +    In mailinfo.c:decode_header, if convert_to_utf8 failed, the strbuf stored
>           in dec will leak. Simply add strbuf_release and free(dec) will solve
>           this problem.

Much better.

> +static int decode_q_segment(struct strbuf *out, const struct strbuf *q_seg,
> +			    int rfc2047)
>  {
>  	const char *in = q_seg->buf;
>  	int c;
> -	struct strbuf *out = xmalloc(sizeof(struct strbuf));
>  	strbuf_init(out, q_seg->len);

Don't let the caller pass in an uninitialized thing and force the
callee to initialize it.  Drop this strbuf_init(), and make the
caller always do:

	struct strbuf dec = STRBUF_INIT;

	...
		decode_q_segment(&dec, ...);

instead.  That makes the division of labor easier to see (e.g., what
if the caller had a code path that never calls decode_x_segment()
before it has to return?  it might want to add something to dec
itself so that it can base its behaviour always on what is in dec,
or at the end it may just be able to strbuf_release(&dec) without
having to remember if it called decode_x_segment().  Which means it
is more convenient for it to always assume that dec is initialized
whether it called decode_x_segment() or not).

> -static struct strbuf *decode_b_segment(const struct strbuf *b_seg)
> +static int decode_b_segment(struct strbuf *out, const struct strbuf *b_seg)
>  {
>  	/* Decode in..ep, possibly in-place to ot */
>  	int c, pos = 0, acc = 0;
>  	const char *in = b_seg->buf;
> -	struct strbuf *out = xmalloc(sizeof(struct strbuf));
>  	strbuf_init(out, b_seg->len);

Ditto.

> @@ -530,18 +529,23 @@ static void decode_header(struct mailinfo *mi, struct strbuf *it)
>  		default:
>  			goto release_return;
>  		case 'b':
> -			dec = decode_b_segment(&piecebuf);
> +			if ((found_error = decode_b_segment(&dec, &piecebuf))) {
> +				goto release_return;
> +			}

Don't enclose a single statement block inside {braces}.

>  			break;
>  		case 'q':
> -			dec = decode_q_segment(&piecebuf, 1);
> +			if ((found_error = decode_q_segment(&dec, &piecebuf, 1))) {
> +				goto release_return;
> +			}

Ditto.

>  			break;

Just a mental note (i.e., not anything wrong in the posted patch).
Even though the caller is prepared to see decode_x_segment() to
notice and report an error, the implementation just does not bother,
and mostly skips a garbage in the input.  Outside the topic of this
series, we may want to consider allowing the user to say "be strict
and barf when encoded contents are broken".

>  		}
> -		if (convert_to_utf8(mi, dec, charset_q.buf))
> +		if (convert_to_utf8(mi, &dec, charset_q.buf)) {
> +			strbuf_release(&dec);
>  			goto release_return;
> +		}

This, together with ...

> -		strbuf_addbuf(&outbuf, dec);
> -		strbuf_release(dec);
> -		free(dec);
> +		strbuf_addbuf(&outbuf, &dec);
> +		strbuf_release(&dec);

... release here, look somewhat pointless.  As you declared "dec" at
the outermost scope in this function, why not do the release at the
place everybody else is released/freed, at release_return: label?

>  		in = ep + 2;
>  	}
>  	strbuf_addstr(&outbuf, in);
> @@ -634,23 +638,24 @@ static int is_inbody_header(const struct mailinfo *mi,
>  
>  static void decode_transfer_encoding(struct mailinfo *mi, struct strbuf *line)
>  {
> -	struct strbuf *ret;
> +	struct strbuf ret;

Do the "= STRBUF_INIT" at the caller.

> +	int found_error = 0;
>  
>  	switch (mi->transfer_encoding) {
>  	case TE_QP:
> -		ret = decode_q_segment(line, 0);
> +		found_error = decode_q_segment(&ret, line, 0);
>  		break;
>  	case TE_BASE64:
> -		ret = decode_b_segment(line);
> +		found_error = decode_b_segment(&ret, line);
>  		break;
>  	case TE_DONTCARE:
>  	default:
>  		return;
>  	}
>  	strbuf_reset(line);
> -	strbuf_addbuf(line, ret);
> -	strbuf_release(ret);
> -	free(ret);
> +	strbuf_addbuf(line, &ret);
> +	if (!found_error)
> +		strbuf_release(&ret);
>  }

THis is puzzling.  We add whatever is in ret to line, but release it
only when there is no error?  What happens when we did find error?
There does not seem to be any caller-callee contract on what the out
parameter should contain upon an error, which is a good thing, so we
should release it unconditionally, no?




  reply	other threads:[~2025-05-12 14:12 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-08 13:38 [PATCH] decode_header: fix pointential memory leak if decode_header failed Lidong Yan via GitGitGadget
2025-05-08 22:16 ` Junio C Hamano
2025-05-09  1:51   ` lidongyan
2025-05-11 16:14 ` [PATCH v2] mailinfo: fix pointential memory leak if `decode_header` failed Lidong Yan via GitGitGadget
2025-05-12 14:12   ` Junio C Hamano [this message]
2025-05-12 16:17   ` [PATCH v3] " Lidong Yan via GitGitGadget
2025-05-12 19:32     ` Junio C Hamano
2025-05-13  2:49     ` [PATCH v4] " Lidong Yan via GitGitGadget
2025-05-13 13:34       ` Junio C Hamano
2025-05-13 13:42         ` lidongyan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqcycdx6iv.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=502024330056@smail.nju.edu.cn \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).