From: "René Scharfe" <l.s.r@web.de>
To: Alex via GitGitGadget <gitgitgadget@gmail.com>, git@vger.kernel.org
Cc: Alex <alexguo1023@gmail.com>, jinyaoguo <guo846@purdue.edu>
Subject: Re: [PATCH] Fix buffer underflow in xdl_build_script
Date: Sat, 24 May 2025 11:08:42 +0200 [thread overview]
Message-ID: <a58bf04a-d9fc-4ecf-96a5-82d31ada510d@web.de> (raw)
In-Reply-To: <037b4612-ea92-45bf-bacb-ba66204f941e@web.de>
Am 24.05.25 um 07:57 schrieb René Scharfe:
> Am 23.05.25 um 22:51 schrieb Alex via GitGitGadget:
>> From: jinyaoguo <guo846@purdue.edu>
>>
>> The loop in xdl_build_script used `i1 >= 0 || i2 >= 0`, causing
>> `i1` (or `i2`) to reach 0 and then access `rchg1[i1-1]` (or
>> `rchg2[i2-1]`), which underflows the buffer.
>> This commit adds explicit `i1 > 0` and `i2 > 0` checks around
>> those array accesses to prevent invalid negative indexing.
>
> xdl_prepare_ctx() in xdiff/xprepare.c allocates an extra entry at both
> ends for rchg arrays, so an index of -1 should be within the bounds.
>
> i1 and i2 are decreased in lockstep, though, so one of them can become
> smaller than -1 if nrec is different between the files. And that's how
> this code run can indeed run off into the weeds.
Actually no, i1 can't seem to reach 0 without i2 also being 0 and vice
versa. Or can it? It makes sense that we reach the start of both
buffers at the same time if we walk backwards from the end, don't
misstep and have consistent rchg array contents, but I'm not sure.
Are you able to demonstrate any out-of-bounds access with e.g.,
Valgrind, AddressSanitizer or an assertion?
> Curiously, AddressSanitizer doesn't report anything, but if I add the
> following line after the outer for, I can trigger it to report a
> heap-buffer-overflow with e.g., git show 8613c2bb6c:
>
> if (i1 < 0 || i2 < 0) fprintf(stderr, "Oops: %ld %ld\n", i1, i2);
That's because I forgot to add braces. D'oh! I can't trigger any
out-of-bounds access or that Oops with them properly in place. So I
let myself get fooled by a daring coding style. :-|
>
>>
>> Signed-off-by: Alex Guo <alexguo1023@gmail.com>
>> ---
>> Fix buffer underflow in xdl_build_script
>>
>> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1976%2Fmugitya03%2Fbuf-1-v1
>> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1976/mugitya03/buf-1-v1
>> Pull-Request: https://github.com/git/git/pull/1976
>>
>> xdiff/xdiffi.c | 7 ++++---
>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
>> index 5a96e36dfbe..2e983965328 100644
>> --- a/xdiff/xdiffi.c
>> +++ b/xdiff/xdiffi.c
>> @@ -951,9 +951,10 @@ int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
>> * Trivial. Collects "groups" of changes and creates an edit script.
Trivial for Davide perhaps (libxdiff author), but not my mushy brain..
>> */
>> for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
>
> Should the || be a && instead? From a birds-eye view I would assume we
> can stop scanning for changes when we exhaust (reach the top) of either
> side. We just have to make sure everything from the other side is
> accounted for in the last added change.
>
>> - if (rchg1[i1 - 1] || rchg2[i2 - 1]) {
>> - for (l1 = i1; rchg1[i1 - 1]; i1--);
>> - for (l2 = i2; rchg2[i2 - 1]; i2--);
>> + if ((i1 > 0 && rchg1[i1 - 1]) ||
>> + (i2 > 0 && rchg2[i2 - 1])) {
>> + for (l1 = i1; i1 > 0 && rchg1[i1 - 1]; i1--);
>> + for (l2 = i2; i2 > 0 && rchg2[i2 - 1]; i2--);
>
> Nit: The indentation of that line is off.
>
>>
>> if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) {
>> xdl_free_script(cscr);
>>
>> base-commit: 8613c2bb6cd16ef530dc5dd74d3b818a1ccbf1c0
>
>
next prev parent reply other threads:[~2025-05-24 9:08 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-23 20:51 [PATCH] Fix buffer underflow in xdl_build_script Alex via GitGitGadget
2025-05-24 5:57 ` René Scharfe
2025-05-24 9:08 ` René Scharfe [this message]
2025-05-24 13:38 ` Phillip Wood
2025-05-24 13:53 ` Phillip Wood
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a58bf04a-d9fc-4ecf-96a5-82d31ada510d@web.de \
--to=l.s.r@web.de \
--cc=alexguo1023@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=guo846@purdue.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).