From: Junio C Hamano <gitster@pobox.com>
To: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org,
LorenzoPegorari <lorenzo.pegorari2002@gmail.com>,
Elijah Newren <newren@gmail.com>
Subject: Re: [PATCH] diff: fix out-of-bounds reads and NULL deref in diffstat UTF-8 truncation
Date: Fri, 17 Apr 2026 12:21:38 -0700 [thread overview]
Message-ID: <xmqqv7dpwfy5.fsf@gitster.g> (raw)
In-Reply-To: <pull.2093.git.1776443163041.gitgitgadget@gmail.com> (Elijah Newren via GitGitGadget's message of "Fri, 17 Apr 2026 16:26:03 +0000")
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Elijah Newren <newren@gmail.com>
>
> f85b49f3d4a (diff: improve scaling of filenames in diffstat to handle
> UTF-8 chars, 2024-10-27) introduced a loop in show_stats() that calls
> utf8_width() repeatedly to skip leading characters until the displayed
> width fits.
A tangent, but I get a datestamp for the same f85b49f3 (diff:
improve scaling of filenames in diffstat to handle UTF-8 chars,
2026-01-16) that is different from what you showed above. Did you
find a bug in "git show -s --pretty=reference"?
> diff --git a/diff.c b/diff.c
> index 397e38b41c..7b27241733 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -3093,8 +3093,17 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
> if (len < 0)
> len = 0;
>
> - while (name_len > len)
> - name_len -= utf8_width((const char**)&name, NULL);
> + while (name_len > len && *name) {
> + int w = utf8_width((const char **)&name, NULL);
> + if (!name) { /* Invalid UTF-8 */
> + name = file->print_name;
> + name_len = utf8_strwidth(name);
> + break;
> + }
IOW, we punt on "scaling" and instead use the full string? I was
wondering if we can punt on only this segment by replacing this
segment with just "..." and resync at the next slash.
> + if (w < 0) /* control character */
> + break;
When we have a control characer, we instead chomp immediately before
that byte, which sounds good. But then wouldn't the loop that found
an Invalid UTF-8 sequence in the middle of a name want to do the
same, i.e., take the good bits found so far and chomp at the broken
byte?
> + name_len -= w;
> + }
>
> slash = strchr(name, '/');
> if (slash)
Thanks.
next prev parent reply other threads:[~2026-04-17 19:21 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-17 16:26 [PATCH] diff: fix out-of-bounds reads and NULL deref in diffstat UTF-8 truncation Elijah Newren via GitGitGadget
2026-04-17 19:21 ` Junio C Hamano [this message]
2026-04-17 22:00 ` Elijah Newren
2026-04-17 22:21 ` Junio C Hamano
2026-04-17 22:45 ` [PATCH v2] " Elijah Newren via GitGitGadget
2026-04-19 23:52 ` Lorenzo Pegorari
2026-04-20 14:51 ` Elijah Newren
2026-04-20 15:42 ` [PATCH v3] " Elijah Newren via GitGitGadget
2026-04-20 16:41 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqv7dpwfy5.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=lorenzo.pegorari2002@gmail.com \
--cc=newren@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.