* [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev`
@ 2025-01-09 6:21 Patrick Steinhardt
2025-01-09 10:31 ` Kristoffer Haugsbakk
` (3 more replies)
0 siblings, 4 replies; 18+ messages in thread
From: Patrick Steinhardt @ 2025-01-09 6:21 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin
In 6411a0a896 (builtin/blame: fix type of `length` variable when
emitting object ID, 2024-12-06) we have fixed the type of the `length`
variable. In order to avoid a cast from `size_t` to `int` in the call to
printf(3p) with the "%.*s" formatter we have converted the code to
instead use fwrite(3p), which accepts the length as a `size_t`.
It was reported though that this makes us read over the end of the OID
array when the provided `--abbrev=` length exceeds the length of the
object ID. This is because fwrite(3p) of course doesn't stop when it
sees a NUL byte, where as printf(3p) does.
Fix the bug by reverting back to printf(3p) and culling the provided
length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an
`int`.
Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
This fixes the issue reported in [1]. Thanks!
Patrick
[1]: <4d812802-afbc-4635-7a19-73896fcda625@gmx.de>
---
builtin/blame.c | 4 +++-
t/t8002-blame.sh | 4 ++++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/builtin/blame.c b/builtin/blame.c
index 867032e4c16878ffd56df8a73162b89ca4bd2694..ad91fe9e97f90625dd2708fbd44bf2dd24a337a6 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -475,6 +475,8 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
char ch;
size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ?
the_hash_algo->hexsz : (size_t) abbrev;
+ if (length > GIT_MAX_HEXSZ)
+ length = GIT_MAX_HEXSZ;
if (opt & OUTPUT_COLOR_LINE) {
if (cnt > 0) {
@@ -505,7 +507,7 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
length--;
putchar('?');
}
- fwrite(hex, 1, length, stdout);
+ printf("%.*s", (int)length, hex);
if (opt & OUTPUT_ANNOTATE_COMPAT) {
const char *name;
if (opt & OUTPUT_SHOW_EMAIL)
diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..fcaba8c11f7ede084e069eefd292f337e8396cb4 100755
--- a/t/t8002-blame.sh
+++ b/t/t8002-blame.sh
@@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
check_abbrev $hexsz --no-abbrev
'
+test_expect_success 'blame --abbrev gets truncated' '
+ check_abbrev 9000 --abbrev=$hexsz HEAD
+'
+
test_expect_success '--exclude-promisor-objects does not BUG-crash' '
test_must_fail git blame --exclude-promisor-objects one
'
---
base-commit: 14650065b76b28d3cfa9453356ac5669b19e706e
change-id: 20250109-b4-pks-blame-truncate-hash-length-c875cac66d71
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` 2025-01-09 6:21 [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt @ 2025-01-09 10:31 ` Kristoffer Haugsbakk 2025-01-09 10:49 ` Johannes Schindelin ` (2 subsequent siblings) 3 siblings, 0 replies; 18+ messages in thread From: Kristoffer Haugsbakk @ 2025-01-09 10:31 UTC (permalink / raw) To: Patrick Steinhardt, git; +Cc: Johannes Schindelin On Thu, Jan 9, 2025, at 07:21, Patrick Steinhardt wrote: > object ID. This is because fwrite(3p) of course doesn't stop when it > sees a NUL byte, where as printf(3p) does. s/where as/whereas -- Kristoffer Haugsbakk ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` 2025-01-09 6:21 [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt 2025-01-09 10:31 ` Kristoffer Haugsbakk @ 2025-01-09 10:49 ` Johannes Schindelin 2025-01-09 11:14 ` Patrick Steinhardt 2025-01-09 11:48 ` [PATCH v2] " Patrick Steinhardt 2025-01-10 11:26 ` [PATCH v3 0/2] builtin/blame: fix out-of-bounds reads and writes Patrick Steinhardt 3 siblings, 1 reply; 18+ messages in thread From: Johannes Schindelin @ 2025-01-09 10:49 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git Hi Patrick, On Thu, 9 Jan 2025, Patrick Steinhardt wrote: > In 6411a0a896 (builtin/blame: fix type of `length` variable when > emitting object ID, 2024-12-06) we have fixed the type of the `length` > variable. In order to avoid a cast from `size_t` to `int` in the call to > printf(3p) with the "%.*s" formatter we have converted the code to > instead use fwrite(3p), which accepts the length as a `size_t`. > > It was reported though that this makes us read over the end of the OID > array when the provided `--abbrev=` length exceeds the length of the > object ID. This is because fwrite(3p) of course doesn't stop when it > sees a NUL byte, where as printf(3p) does. > > Fix the bug by reverting back to printf(3p) and culling the provided > length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an > `int`. > > Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> > Signed-off-by: Patrick Steinhardt <ps@pks.im> > --- > This fixes the issue reported in [1]. Thanks! Thank you for the quick fix! We will need to adjust it a little more, though: > > Patrick > > [1]: <4d812802-afbc-4635-7a19-73896fcda625@gmx.de> > --- > builtin/blame.c | 4 +++- > t/t8002-blame.sh | 4 ++++ > 2 files changed, 7 insertions(+), 1 deletion(-) > > diff --git a/builtin/blame.c b/builtin/blame.c > index 867032e4c16878ffd56df8a73162b89ca4bd2694..ad91fe9e97f90625dd2708fbd44bf2dd24a337a6 100644 > --- a/builtin/blame.c > +++ b/builtin/blame.c > @@ -475,6 +475,8 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > char ch; > size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ? > the_hash_algo->hexsz : (size_t) abbrev; > + if (length > GIT_MAX_HEXSZ) > + length = GIT_MAX_HEXSZ; This causes a subtle change of behavior because there are a couple of conditional code blocks between this change and the `printf()` call decrease `length`, i.e. specifying values larger than the maximal hex size causes potentially-desirable, different behavior (and think about https://www.hyrumslaw.com/). > > if (opt & OUTPUT_COLOR_LINE) { > if (cnt > 0) { > @@ -505,7 +507,7 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > length--; > putchar('?'); > } > - fwrite(hex, 1, length, stdout); > + printf("%.*s", (int)length, hex); > if (opt & OUTPUT_ANNOTATE_COMPAT) { > const char *name; > if (opt & OUTPUT_SHOW_EMAIL) > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh > index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..fcaba8c11f7ede084e069eefd292f337e8396cb4 100755 > --- a/t/t8002-blame.sh > +++ b/t/t8002-blame.sh > @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' ' > check_abbrev $hexsz --no-abbrev > ' > > +test_expect_success 'blame --abbrev gets truncated' ' > + check_abbrev 9000 --abbrev=$hexsz HEAD This is actually incorrect: it passes `--abbrev=$hexsz` instead of a value that needs to be truncated. > +' > + > test_expect_success '--exclude-promisor-objects does not BUG-crash' ' > test_must_fail git blame --exclude-promisor-objects one > ' > > --- > base-commit: 14650065b76b28d3cfa9453356ac5669b19e706e > change-id: 20250109-b4-pks-blame-truncate-hash-length-c875cac66d71 Here is my proposed fixup: -- snipsnap -- [PATCH] fixup! builtin/blame: fix out-of-bounds read with excessive `--abbrev` The test case needs to actually test an excessive `--abbrev` value. Also, when calling `git blame --abbrev=<N>` with an `N` that is larger than the maximal OID hex size, there is a subtle side effect that makes it behave _differently_ than specifying said maximal hex size: When the command outputs boundary, unblamable or ignored commits' OIDs, those outputs are prefixed with characters indicating this, and the `abbrev` value is used to align the information that comes after the OID, clipping it as needed. Specifying a "too large" abbrev value here tells Git that yes, we want the full OIDs and don't you worry about alignment. Thanks to SHA-256 being _larger_ than the default SHA-1-based OIDs, and thanks to clipping at `GIT_MAX_HEXSZ`, this change of behavior can only be observed when running the test in SHA-256 mode. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- builtin/blame.c | 9 +++++++-- t/t8002-blame.sh | 2 +- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/builtin/blame.c b/builtin/blame.c index ad91fe9e97f9..5b4976835066 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -475,8 +475,13 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int char ch; size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ? the_hash_algo->hexsz : (size_t) abbrev; - if (length > GIT_MAX_HEXSZ) - length = GIT_MAX_HEXSZ; + + /* + * Leave enough space for ^, * and ? indicators (boundary, + * unblamable, ignored). + */ + if (length > GIT_MAX_HEXSZ + 3) + length = GIT_MAX_HEXSZ + 3; if (opt & OUTPUT_COLOR_LINE) { if (cnt > 0) { diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh index fcaba8c11f7e..71fa70a64679 100755 --- a/t/t8002-blame.sh +++ b/t/t8002-blame.sh @@ -127,7 +127,7 @@ test_expect_success '--no-abbrev works like --abbrev with full length' ' ' test_expect_success 'blame --abbrev gets truncated' ' - check_abbrev 9000 --abbrev=$hexsz HEAD + check_abbrev 9000 --abbrev=9000 HEAD.. ' test_expect_success '--exclude-promisor-objects does not BUG-crash' ' -- 2.48.0.rc0.windows.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` 2025-01-09 10:49 ` Johannes Schindelin @ 2025-01-09 11:14 ` Patrick Steinhardt 2025-01-09 13:41 ` Johannes Schindelin 0 siblings, 1 reply; 18+ messages in thread From: Patrick Steinhardt @ 2025-01-09 11:14 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git On Thu, Jan 09, 2025 at 11:49:43AM +0100, Johannes Schindelin wrote: > > diff --git a/builtin/blame.c b/builtin/blame.c > > index 867032e4c16878ffd56df8a73162b89ca4bd2694..ad91fe9e97f90625dd2708fbd44bf2dd24a337a6 100644 > > --- a/builtin/blame.c > > +++ b/builtin/blame.c > > @@ -475,6 +475,8 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > > char ch; > > size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ? > > the_hash_algo->hexsz : (size_t) abbrev; > > + if (length > GIT_MAX_HEXSZ) > > + length = GIT_MAX_HEXSZ; > > This causes a subtle change of behavior because there are a couple of > conditional code blocks between this change and the `printf()` call > decrease `length`, i.e. specifying values larger than the maximal hex size > causes potentially-desirable, different behavior (and think about > https://www.hyrumslaw.com/). Alternatively we can move this until after we have done the subtractions. Then we don't have to do weird gymnastics. > > > > if (opt & OUTPUT_COLOR_LINE) { > > if (cnt > 0) { > > @@ -505,7 +507,7 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > > length--; > > putchar('?'); > > } > > - fwrite(hex, 1, length, stdout); > > + printf("%.*s", (int)length, hex); > > if (opt & OUTPUT_ANNOTATE_COMPAT) { > > const char *name; > > if (opt & OUTPUT_SHOW_EMAIL) > > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh > > index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..fcaba8c11f7ede084e069eefd292f337e8396cb4 100755 > > --- a/t/t8002-blame.sh > > +++ b/t/t8002-blame.sh > > @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' ' > > check_abbrev $hexsz --no-abbrev > > ' > > > > +test_expect_success 'blame --abbrev gets truncated' ' > > + check_abbrev 9000 --abbrev=$hexsz HEAD > > This is actually incorrect: it passes `--abbrev=$hexsz` instead of a value > that needs to be truncated. Oh dear. The test did manage to catch the bug, but thinking more about it that was only because my initial fix was broken. > diff --git a/builtin/blame.c b/builtin/blame.c > index ad91fe9e97f9..5b4976835066 100644 > --- a/builtin/blame.c > +++ b/builtin/blame.c > @@ -475,8 +475,13 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > char ch; > size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ? > the_hash_algo->hexsz : (size_t) abbrev; > - if (length > GIT_MAX_HEXSZ) > - length = GIT_MAX_HEXSZ; > + > + /* > + * Leave enough space for ^, * and ? indicators (boundary, > + * unblamable, ignored). > + */ > + if (length > GIT_MAX_HEXSZ + 3) > + length = GIT_MAX_HEXSZ + 3; > > if (opt & OUTPUT_COLOR_LINE) { > if (cnt > 0) { How about this instead? diff --git a/builtin/blame.c b/builtin/blame.c index ad91fe9e97..f92e487bed 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -475,8 +475,6 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int char ch; size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ? the_hash_algo->hexsz : (size_t) abbrev; - if (length > GIT_MAX_HEXSZ) - length = GIT_MAX_HEXSZ; if (opt & OUTPUT_COLOR_LINE) { if (cnt > 0) { @@ -507,6 +505,9 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int length--; putchar('?'); } + + if (length > GIT_MAX_HEXSZ) + length = GIT_MAX_HEXSZ; printf("%.*s", (int)length, hex); if (opt & OUTPUT_ANNOTATE_COMPAT) { const char *name; In that case there's no need to juggle with the magic indicators, which makes it a bit easier to reason about. > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh > index fcaba8c11f7e..71fa70a64679 100755 > --- a/t/t8002-blame.sh > +++ b/t/t8002-blame.sh > @@ -127,7 +127,7 @@ test_expect_success '--no-abbrev works like --abbrev with full length' ' > ' > > test_expect_success 'blame --abbrev gets truncated' ' > - check_abbrev 9000 --abbrev=$hexsz HEAD > + check_abbrev 9000 --abbrev=9000 HEAD.. > ' This should be `check_abbrev $hexsz --abbrev=9000`, shouldn't it? Patrick ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` 2025-01-09 11:14 ` Patrick Steinhardt @ 2025-01-09 13:41 ` Johannes Schindelin 2025-01-10 9:27 ` Johannes Schindelin 0 siblings, 1 reply; 18+ messages in thread From: Johannes Schindelin @ 2025-01-09 13:41 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git Hi Patrick, On Thu, 9 Jan 2025, Patrick Steinhardt wrote: > On Thu, Jan 09, 2025 at 11:49:43AM +0100, Johannes Schindelin wrote: > > > diff --git a/builtin/blame.c b/builtin/blame.c > > > index 867032e4c16878ffd56df8a73162b89ca4bd2694..ad91fe9e97f90625dd2708fbd44bf2dd24a337a6 100644 > > > --- a/builtin/blame.c > > > +++ b/builtin/blame.c > > > @@ -475,6 +475,8 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > > > char ch; > > > size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ? > > > the_hash_algo->hexsz : (size_t) abbrev; > > > + if (length > GIT_MAX_HEXSZ) > > > + length = GIT_MAX_HEXSZ; > > > > This causes a subtle change of behavior because there are a couple of > > conditional code blocks between this change and the `printf()` call > > decrease `length`, i.e. specifying values larger than the maximal hex size > > causes potentially-desirable, different behavior (and think about > > https://www.hyrumslaw.com/). > > Alternatively we can move this until after we have done the > subtractions. Then we don't have to do weird gymnastics. Or we can even avoid assiging a maximum altogether: if (length < GIT_MAX_HEXSZ) printf("%.*s", (int)length, hex); else printf("%s", hex); Or be more consistent with Git's source code style which often prefers ternaries, favoring succinctness over readability: printf("%.*s", (int)(length < GIT_MAX_HEXSZ ? length : GIT_MAX_HEXSZ), hex); > > > if (opt & OUTPUT_COLOR_LINE) { > > > if (cnt > 0) { > > > @@ -505,7 +507,7 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > > > length--; > > > putchar('?'); > > > } > > > - fwrite(hex, 1, length, stdout); > > > + printf("%.*s", (int)length, hex); > > > if (opt & OUTPUT_ANNOTATE_COMPAT) { > > > const char *name; > > > if (opt & OUTPUT_SHOW_EMAIL) > > > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh > > > index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..fcaba8c11f7ede084e069eefd292f337e8396cb4 100755 > > > --- a/t/t8002-blame.sh > > > +++ b/t/t8002-blame.sh > > > @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' ' > > > check_abbrev $hexsz --no-abbrev > > > ' > > > > > > +test_expect_success 'blame --abbrev gets truncated' ' > > > + check_abbrev 9000 --abbrev=$hexsz HEAD > > > > This is actually incorrect: it passes `--abbrev=$hexsz` instead of a value > > that needs to be truncated. > > Oh dear. The test did manage to catch the bug, but thinking more about > it that was only because my initial fix was broken. > > > diff --git a/builtin/blame.c b/builtin/blame.c > > index ad91fe9e97f9..5b4976835066 100644 > > --- a/builtin/blame.c > > +++ b/builtin/blame.c > > @@ -475,8 +475,13 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > > char ch; > > size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ? > > the_hash_algo->hexsz : (size_t) abbrev; > > - if (length > GIT_MAX_HEXSZ) > > - length = GIT_MAX_HEXSZ; > > + > > + /* > > + * Leave enough space for ^, * and ? indicators (boundary, > > + * unblamable, ignored). > > + */ > > + if (length > GIT_MAX_HEXSZ + 3) > > + length = GIT_MAX_HEXSZ + 3; > > > > if (opt & OUTPUT_COLOR_LINE) { > > if (cnt > 0) { > > How about this instead? > > diff --git a/builtin/blame.c b/builtin/blame.c > index ad91fe9e97..f92e487bed 100644 > --- a/builtin/blame.c > +++ b/builtin/blame.c > @@ -475,8 +475,6 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > char ch; > size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ? > the_hash_algo->hexsz : (size_t) abbrev; > - if (length > GIT_MAX_HEXSZ) > - length = GIT_MAX_HEXSZ; > > if (opt & OUTPUT_COLOR_LINE) { > if (cnt > 0) { > @@ -507,6 +505,9 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > length--; > putchar('?'); > } > + > + if (length > GIT_MAX_HEXSZ) > + length = GIT_MAX_HEXSZ; > printf("%.*s", (int)length, hex); > if (opt & OUTPUT_ANNOTATE_COMPAT) { > const char *name; > > In that case there's no need to juggle with the magic indicators, which > makes it a bit easier to reason about. > > > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh > > index fcaba8c11f7e..71fa70a64679 100755 > > --- a/t/t8002-blame.sh > > +++ b/t/t8002-blame.sh > > @@ -127,7 +127,7 @@ test_expect_success '--no-abbrev works like --abbrev with full length' ' > > ' > > > > test_expect_success 'blame --abbrev gets truncated' ' > > - check_abbrev 9000 --abbrev=$hexsz HEAD > > + check_abbrev 9000 --abbrev=9000 HEAD.. > > ' > > This should be `check_abbrev $hexsz --abbrev=9000`, shouldn't it? I kind of liked the idea to keep the same cut-off threshold for the validation. But I won't insist. The construct is obtuse in either case ;-) (Not your fault, of course, you merely imitated existing code, which is the correct thing to do). Ciao, Johannes ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` 2025-01-09 13:41 ` Johannes Schindelin @ 2025-01-10 9:27 ` Johannes Schindelin 2025-01-10 9:49 ` Patrick Steinhardt 0 siblings, 1 reply; 18+ messages in thread From: Johannes Schindelin @ 2025-01-10 9:27 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git Hi Patrick, On Thu, 9 Jan 2025, Johannes Schindelin wrote: > On Thu, 9 Jan 2025, Patrick Steinhardt wrote: > > > On Thu, Jan 09, 2025 at 11:49:43AM +0100, Johannes Schindelin wrote: > > > > diff --git a/builtin/blame.c b/builtin/blame.c > > > > index 867032e4c16878ffd56df8a73162b89ca4bd2694..ad91fe9e97f90625dd2708fbd44bf2dd24a337a6 100644 > > > > --- a/builtin/blame.c > > > > +++ b/builtin/blame.c > > > > @@ -475,6 +475,8 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > > > > char ch; > > > > size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ? > > > > the_hash_algo->hexsz : (size_t) abbrev; > > > > + if (length > GIT_MAX_HEXSZ) > > > > + length = GIT_MAX_HEXSZ; > > > > > > This causes a subtle change of behavior because there are a couple of > > > conditional code blocks between this change and the `printf()` call > > > decrease `length`, i.e. specifying values larger than the maximal hex size > > > causes potentially-desirable, different behavior (and think about > > > https://www.hyrumslaw.com/). > > > > Alternatively we can move this until after we have done the > > subtractions. Then we don't have to do weird gymnastics. > > Or we can even avoid assiging a maximum altogether: > > if (length < GIT_MAX_HEXSZ) > printf("%.*s", (int)length, hex); > else > printf("%s", hex); > > Or be more consistent with Git's source code style which often prefers > ternaries, favoring succinctness over readability: > > printf("%.*s", (int)(length < GIT_MAX_HEXSZ ? length : GIT_MAX_HEXSZ), hex); Coverity noticed a problem with this approach, looking at https://github.com/git/git/blob/v2.48.0-rc2/builtin/blame.c#L493: memset(hex, ' ', length); If the `GIT_MAX_HEXSZ` guard is moved after this statement, then we can easily overrun the `hex` buffer. Ciao, Johannes ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` 2025-01-10 9:27 ` Johannes Schindelin @ 2025-01-10 9:49 ` Patrick Steinhardt 0 siblings, 0 replies; 18+ messages in thread From: Patrick Steinhardt @ 2025-01-10 9:49 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git On Fri, Jan 10, 2025 at 10:27:23AM +0100, Johannes Schindelin wrote: > Hi Patrick, > > On Thu, 9 Jan 2025, Johannes Schindelin wrote: > > > On Thu, 9 Jan 2025, Patrick Steinhardt wrote: > > > > > On Thu, Jan 09, 2025 at 11:49:43AM +0100, Johannes Schindelin wrote: > > > > > diff --git a/builtin/blame.c b/builtin/blame.c > > > > > index 867032e4c16878ffd56df8a73162b89ca4bd2694..ad91fe9e97f90625dd2708fbd44bf2dd24a337a6 100644 > > > > > --- a/builtin/blame.c > > > > > +++ b/builtin/blame.c > > > > > @@ -475,6 +475,8 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > > > > > char ch; > > > > > size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ? > > > > > the_hash_algo->hexsz : (size_t) abbrev; > > > > > + if (length > GIT_MAX_HEXSZ) > > > > > + length = GIT_MAX_HEXSZ; > > > > > > > > This causes a subtle change of behavior because there are a couple of > > > > conditional code blocks between this change and the `printf()` call > > > > decrease `length`, i.e. specifying values larger than the maximal hex size > > > > causes potentially-desirable, different behavior (and think about > > > > https://www.hyrumslaw.com/). > > > > > > Alternatively we can move this until after we have done the > > > subtractions. Then we don't have to do weird gymnastics. > > > > Or we can even avoid assiging a maximum altogether: > > > > if (length < GIT_MAX_HEXSZ) > > printf("%.*s", (int)length, hex); > > else > > printf("%s", hex); > > > > Or be more consistent with Git's source code style which often prefers > > ternaries, favoring succinctness over readability: > > > > printf("%.*s", (int)(length < GIT_MAX_HEXSZ ? length : GIT_MAX_HEXSZ), hex); > > Coverity noticed a problem with this approach, looking at > https://github.com/git/git/blob/v2.48.0-rc2/builtin/blame.c#L493: > > memset(hex, ' ', length); > > If the `GIT_MAX_HEXSZ` guard is moved after this statement, then we can > easily overrun the `hex` buffer. Oh. That's even an old-standing issue that wasn't caused by the refactoring, right? Your proposed fix to set `length = GIT_MAX_HEXSZ + 3` to account for the old behaviour wouldn't fix it either, as we could still end up overwriting two bytes. I'll send a new version with another commit on top to fix this. Patrick ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2] builtin/blame: fix out-of-bounds read with excessive `--abbrev` 2025-01-09 6:21 [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt 2025-01-09 10:31 ` Kristoffer Haugsbakk 2025-01-09 10:49 ` Johannes Schindelin @ 2025-01-09 11:48 ` Patrick Steinhardt 2025-01-09 12:40 ` shejialuo ` (2 more replies) 2025-01-10 11:26 ` [PATCH v3 0/2] builtin/blame: fix out-of-bounds reads and writes Patrick Steinhardt 3 siblings, 3 replies; 18+ messages in thread From: Patrick Steinhardt @ 2025-01-09 11:48 UTC (permalink / raw) To: git; +Cc: Johannes Schindelin, Kristoffer Haugsbakk In 6411a0a896 (builtin/blame: fix type of `length` variable when emitting object ID, 2024-12-06) we have fixed the type of the `length` variable. In order to avoid a cast from `size_t` to `int` in the call to printf(3p) with the "%.*s" formatter we have converted the code to instead use fwrite(3p), which accepts the length as a `size_t`. It was reported though that this makes us read over the end of the OID array when the provided `--abbrev=` length exceeds the length of the object ID. This is because fwrite(3p) of course doesn't stop when it sees a NUL byte, whereas printf(3p) does. Fix the bug by reverting back to printf(3p) and culling the provided length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an `int`. Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Patrick Steinhardt <ps@pks.im> --- This fixes the issue reported in [1]. Thanks! Changes in v2: - Take into account that we may strip ^, * and ? indicators by moving around the check. - Fix the testcase so that it actually fails without the fix. - Link to v1: https://lore.kernel.org/r/20250109-b4-pks-blame-truncate-hash-length-v1-1-9ad4bb09e059@pks.im Patrick [1]: <4d812802-afbc-4635-7a19-73896fcda625@gmx.de> --- builtin/blame.c | 5 ++++- t/t8002-blame.sh | 4 ++++ 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/builtin/blame.c b/builtin/blame.c index 867032e4c16878ffd56df8a73162b89ca4bd2694..f92e487bed22eec576a4716f2e654cb61efb9903 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -505,7 +505,10 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int length--; putchar('?'); } - fwrite(hex, 1, length, stdout); + + if (length > GIT_MAX_HEXSZ) + length = GIT_MAX_HEXSZ; + printf("%.*s", (int)length, hex); if (opt & OUTPUT_ANNOTATE_COMPAT) { const char *name; if (opt & OUTPUT_SHOW_EMAIL) diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..7cf6e0253a5bbd4d6e438e627dc18b47eac4df66 100755 --- a/t/t8002-blame.sh +++ b/t/t8002-blame.sh @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' ' check_abbrev $hexsz --no-abbrev ' +test_expect_success 'blame --abbrev gets truncated' ' + check_abbrev $hexsz --abbrev=9000 HEAD +' + test_expect_success '--exclude-promisor-objects does not BUG-crash' ' test_must_fail git blame --exclude-promisor-objects one ' --- base-commit: 14650065b76b28d3cfa9453356ac5669b19e706e change-id: 20250109-b4-pks-blame-truncate-hash-length-c875cac66d71 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v2] builtin/blame: fix out-of-bounds read with excessive `--abbrev` 2025-01-09 11:48 ` [PATCH v2] " Patrick Steinhardt @ 2025-01-09 12:40 ` shejialuo 2025-01-09 13:49 ` Johannes Schindelin 2025-01-09 13:43 ` Johannes Schindelin 2025-01-09 14:59 ` Junio C Hamano 2 siblings, 1 reply; 18+ messages in thread From: shejialuo @ 2025-01-09 12:40 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git, Johannes Schindelin, Kristoffer Haugsbakk On Thu, Jan 09, 2025 at 12:48:22PM +0100, Patrick Steinhardt wrote: > In 6411a0a896 (builtin/blame: fix type of `length` variable when > emitting object ID, 2024-12-06) we have fixed the type of the `length` > variable. In order to avoid a cast from `size_t` to `int` in the call to > printf(3p) with the "%.*s" formatter we have converted the code to > instead use fwrite(3p), which accepts the length as a `size_t`. > > It was reported though that this makes us read over the end of the OID > array when the provided `--abbrev=` length exceeds the length of the > object ID. This is because fwrite(3p) of course doesn't stop when it > sees a NUL byte, whereas printf(3p) does. > > Fix the bug by reverting back to printf(3p) and culling the provided > length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an > `int`. > > Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> > Signed-off-by: Patrick Steinhardt <ps@pks.im> > --- > This fixes the issue reported in [1]. Thanks! > > Changes in v2: > > - Take into account that we may strip ^, * and ? indicators by moving > around the check. > - Fix the testcase so that it actually fails without the fix. > - Link to v1: https://lore.kernel.org/r/20250109-b4-pks-blame-truncate-hash-length-v1-1-9ad4bb09e059@pks.im > > Patrick > > [1]: <4d812802-afbc-4635-7a19-73896fcda625@gmx.de> > --- > builtin/blame.c | 5 ++++- > t/t8002-blame.sh | 4 ++++ > 2 files changed, 8 insertions(+), 1 deletion(-) > > diff --git a/builtin/blame.c b/builtin/blame.c > index 867032e4c16878ffd56df8a73162b89ca4bd2694..f92e487bed22eec576a4716f2e654cb61efb9903 100644 > --- a/builtin/blame.c > +++ b/builtin/blame.c > @@ -505,7 +505,10 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > length--; > putchar('?'); > } > - fwrite(hex, 1, length, stdout); > + > + if (length > GIT_MAX_HEXSZ) > + length = GIT_MAX_HEXSZ; Here, we explicitly set the length if it exceeds the max hex of the repo. Although `printf` will automatically hide the length, we should explicitly do this. Make sense. > + printf("%.*s", (int)length, hex); > if (opt & OUTPUT_ANNOTATE_COMPAT) { > const char *name; > if (opt & OUTPUT_SHOW_EMAIL) > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh > index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..7cf6e0253a5bbd4d6e438e627dc18b47eac4df66 100755 > --- a/t/t8002-blame.sh > +++ b/t/t8002-blame.sh > @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' ' > check_abbrev $hexsz --no-abbrev > ' > > +test_expect_success 'blame --abbrev gets truncated' ' > + check_abbrev $hexsz --abbrev=9000 HEAD > +' > + By the way, I feel this usage is a little strange as the user side. When I received the report mail from Johannes today morning, I feel a little funny that we allow the value of the `--abrrev` option exceeds the `GIT_MAX_HEXSZ` in the first place. > test_expect_success '--exclude-promisor-objects does not BUG-crash' ' > test_must_fail git blame --exclude-promisor-objects one > ' > > --- > base-commit: 14650065b76b28d3cfa9453356ac5669b19e706e > change-id: 20250109-b4-pks-blame-truncate-hash-length-c875cac66d71 > Thanks, Jialuo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2] builtin/blame: fix out-of-bounds read with excessive `--abbrev` 2025-01-09 12:40 ` shejialuo @ 2025-01-09 13:49 ` Johannes Schindelin 2025-01-10 12:16 ` shejialuo 0 siblings, 1 reply; 18+ messages in thread From: Johannes Schindelin @ 2025-01-09 13:49 UTC (permalink / raw) To: shejialuo; +Cc: Patrick Steinhardt, git, Kristoffer Haugsbakk Hi Jialuo, On Thu, 9 Jan 2025, shejialuo wrote: > On Thu, Jan 09, 2025 at 12:48:22PM +0100, Patrick Steinhardt wrote: > > > + printf("%.*s", (int)length, hex); > > if (opt & OUTPUT_ANNOTATE_COMPAT) { > > const char *name; > > if (opt & OUTPUT_SHOW_EMAIL) > > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh > > index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..7cf6e0253a5bbd4d6e438e627dc18b47eac4df66 100755 > > --- a/t/t8002-blame.sh > > +++ b/t/t8002-blame.sh > > @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' ' > > check_abbrev $hexsz --no-abbrev > > ' > > > > +test_expect_success 'blame --abbrev gets truncated' ' > > + check_abbrev $hexsz --abbrev=9000 HEAD > > +' > > + > > By the way, I feel this usage is a little strange as the user side. When > I received the report mail from Johannes today morning, I feel a little > funny that we allow the value of the `--abrrev` option exceeds the > `GIT_MAX_HEXSZ` in the first place. See the explanation I provided in https://lore.kernel.org/git/c439fcaf-11af-7862-9c3c-18dc0842b57d@gmx.de/: When calling `git blame --abbrev=40 HEAD.. -- <file>` (in a SHA-1-based repository), the OIDs are prefixed with a `^` and then the last hex digit will be cut. The reason? Git wants to align the text after the OID. When calling it with `--abbrev=41`, the full OID is shown. Ciao, Johannes ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2] builtin/blame: fix out-of-bounds read with excessive `--abbrev` 2025-01-09 13:49 ` Johannes Schindelin @ 2025-01-10 12:16 ` shejialuo 0 siblings, 0 replies; 18+ messages in thread From: shejialuo @ 2025-01-10 12:16 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Patrick Steinhardt, git, Kristoffer Haugsbakk On Thu, Jan 09, 2025 at 02:49:09PM +0100, Johannes Schindelin wrote: > Hi Jialuo, > > On Thu, 9 Jan 2025, shejialuo wrote: > > > On Thu, Jan 09, 2025 at 12:48:22PM +0100, Patrick Steinhardt wrote: > > > > > + printf("%.*s", (int)length, hex); > > > if (opt & OUTPUT_ANNOTATE_COMPAT) { > > > const char *name; > > > if (opt & OUTPUT_SHOW_EMAIL) > > > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh > > > index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..7cf6e0253a5bbd4d6e438e627dc18b47eac4df66 100755 > > > --- a/t/t8002-blame.sh > > > +++ b/t/t8002-blame.sh > > > @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' ' > > > check_abbrev $hexsz --no-abbrev > > > ' > > > > > > +test_expect_success 'blame --abbrev gets truncated' ' > > > + check_abbrev $hexsz --abbrev=9000 HEAD > > > +' > > > + > > > > By the way, I feel this usage is a little strange as the user side. When > > I received the report mail from Johannes today morning, I feel a little > > funny that we allow the value of the `--abrrev` option exceeds the > > `GIT_MAX_HEXSZ` in the first place. > > See the explanation I provided in > https://lore.kernel.org/git/c439fcaf-11af-7862-9c3c-18dc0842b57d@gmx.de/: > When calling `git blame --abbrev=40 HEAD.. -- <file>` (in a SHA-1-based > repository), the OIDs are prefixed with a `^` and then the last hex digit > will be cut. The reason? Git wants to align the text after the OID. > I have read through this, thanks for the detailed explanation. Thanks, Jialuo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2] builtin/blame: fix out-of-bounds read with excessive `--abbrev` 2025-01-09 11:48 ` [PATCH v2] " Patrick Steinhardt 2025-01-09 12:40 ` shejialuo @ 2025-01-09 13:43 ` Johannes Schindelin 2025-01-09 14:59 ` Junio C Hamano 2 siblings, 0 replies; 18+ messages in thread From: Johannes Schindelin @ 2025-01-09 13:43 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git, Kristoffer Haugsbakk Hi Patrick, On Thu, 9 Jan 2025, Patrick Steinhardt wrote: > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh > index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..7cf6e0253a5bbd4d6e438e627dc18b47eac4df66 100755 > --- a/t/t8002-blame.sh > +++ b/t/t8002-blame.sh > @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' ' > check_abbrev $hexsz --no-abbrev > ' > > +test_expect_success 'blame --abbrev gets truncated' ' > + check_abbrev $hexsz --abbrev=9000 HEAD Please note that the patch I proposed had `HEAD..` here, not `HEAD`. The latter would not have surfaced the problem with boundary commits. Ciao, Johannes ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2] builtin/blame: fix out-of-bounds read with excessive `--abbrev` 2025-01-09 11:48 ` [PATCH v2] " Patrick Steinhardt 2025-01-09 12:40 ` shejialuo 2025-01-09 13:43 ` Johannes Schindelin @ 2025-01-09 14:59 ` Junio C Hamano 2 siblings, 0 replies; 18+ messages in thread From: Junio C Hamano @ 2025-01-09 14:59 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git, Johannes Schindelin, Kristoffer Haugsbakk Patrick Steinhardt <ps@pks.im> writes: > diff --git a/builtin/blame.c b/builtin/blame.c > index 867032e4c16878ffd56df8a73162b89ca4bd2694..f92e487bed22eec576a4716f2e654cb61efb9903 100644 > --- a/builtin/blame.c > +++ b/builtin/blame.c > @@ -505,7 +505,10 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int Mental note: the "hex" in question that is used later is defined as char hex[GIT_MAX_HEXSZ + 1]; at the top of this scope. > length--; > putchar('?'); > } > - fwrite(hex, 1, length, stdout); > + > + if (length > GIT_MAX_HEXSZ) > + length = GIT_MAX_HEXSZ; > + printf("%.*s", (int)length, hex); This is not wrong per-se, but leaves a funny aftertaste after reading it, especially if the reader knows the original *bug* was about showing past the end of the string in "hex". The question is "what if the current hash function produces shorter than MAX_HEXSZ?" The updated code is correct only because it reinstates the use of printf() so the "length" being longer than the string itself no longer matters, and the only requirement on "length" is not to read beyond the end of hex[] array. So feeding the length of the array as the limit is not wrong, even though it may be feeding a limit that is larger than the string stored in the array. An obvious alternative would have been to base the limit on the value of strlen(hex) and then we may even keep using fwrite(); then we do not have worry about "what if MAX_HEXSZ is larger than the current hash function?" and nonsense like that (I personally prefer what is being reviewed much better; please do not take this "obvious alternative" as a suggestion). Limiting the "length" here does improve the resulting code, relative to v1 iteration. This printf is about showing the hexstring and we are making sure we do not read past the end of the array that holds the string, and doing it here means we do not have to worry about leading prefix characters about boundary etc. Very good improvement. Thanks. ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 0/2] builtin/blame: fix out-of-bounds reads and writes 2025-01-09 6:21 [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt ` (2 preceding siblings ...) 2025-01-09 11:48 ` [PATCH v2] " Patrick Steinhardt @ 2025-01-10 11:26 ` Patrick Steinhardt 2025-01-10 11:26 ` [PATCH v3 1/2] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt 2025-01-10 11:26 ` [PATCH v3 2/2] builtin/blame: fix out-of-bounds write with blank boundary commits Patrick Steinhardt 3 siblings, 2 replies; 18+ messages in thread From: Patrick Steinhardt @ 2025-01-10 11:26 UTC (permalink / raw) To: git; +Cc: Johannes Schindelin, Kristoffer Haugsbakk, Junio C Hamano Hi, This fixes the issues reported in [1] and [2]. Thanks! Changes in v2: - Take into account that we may strip ^, * and ? indicators by moving around the check. - Fix the testcase so that it actually fails without the fix. - Link to v1: https://lore.kernel.org/r/20250109-b4-pks-blame-truncate-hash-length-v1-1-9ad4bb09e059@pks.im Changes in v3: - Add another testcase for boundary commits. - Fix another out-of-bound write noticed by Coverity. This bug is not a regression in v2.48.0, but is a preexisting error. - Simplify the printf statement a bit by using a ternary statement. - Link to v2: https://lore.kernel.org/r/20250109-b4-pks-blame-truncate-hash-length-v2-1-589c81a6ddb0@pks.im Patrick [1]: <4d812802-afbc-4635-7a19-73896fcda625@gmx.de> [2]: <48ca0114-124b-e3f5-af80-1e302bf9ce52@gmx.de> --- Patrick Steinhardt (2): builtin/blame: fix out-of-bounds read with excessive `--abbrev` builtin/blame: fix out-of-bounds write with blank boundary commits builtin/blame.c | 9 +++++---- t/t8002-blame.sh | 26 ++++++++++++++++++++++++++ 2 files changed, 31 insertions(+), 4 deletions(-) Range-diff versus v2: 1: 7065637d8e ! 1: 3865bfe643 builtin/blame: fix out-of-bounds read with excessive `--abbrev` @@ builtin/blame.c: static void emit_other(struct blame_scoreboard *sb, struct blam } - fwrite(hex, 1, length, stdout); + -+ if (length > GIT_MAX_HEXSZ) -+ length = GIT_MAX_HEXSZ; -+ printf("%.*s", (int)length, hex); ++ printf("%.*s", (int)(length < GIT_MAX_HEXSZ ? length : GIT_MAX_HEXSZ), hex); if (opt & OUTPUT_ANNOTATE_COMPAT) { const char *name; if (opt & OUTPUT_SHOW_EMAIL) @@ t/t8002-blame.sh: test_expect_success '--no-abbrev works like --abbrev with full +test_expect_success 'blame --abbrev gets truncated' ' + check_abbrev $hexsz --abbrev=9000 HEAD +' ++ ++test_expect_success 'blame --abbrev gets truncated with boundary commit' ' ++ check_abbrev $hexsz --abbrev=9000 ^HEAD ++' + test_expect_success '--exclude-promisor-objects does not BUG-crash' ' test_must_fail git blame --exclude-promisor-objects one -: ---------- > 2: af0af67a8a builtin/blame: fix out-of-bounds write with blank boundary commits --- base-commit: 14650065b76b28d3cfa9453356ac5669b19e706e change-id: 20250109-b4-pks-blame-truncate-hash-length-c875cac66d71 ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 1/2] builtin/blame: fix out-of-bounds read with excessive `--abbrev` 2025-01-10 11:26 ` [PATCH v3 0/2] builtin/blame: fix out-of-bounds reads and writes Patrick Steinhardt @ 2025-01-10 11:26 ` Patrick Steinhardt 2025-01-10 11:26 ` [PATCH v3 2/2] builtin/blame: fix out-of-bounds write with blank boundary commits Patrick Steinhardt 1 sibling, 0 replies; 18+ messages in thread From: Patrick Steinhardt @ 2025-01-10 11:26 UTC (permalink / raw) To: git; +Cc: Johannes Schindelin, Kristoffer Haugsbakk, Junio C Hamano In 6411a0a896 (builtin/blame: fix type of `length` variable when emitting object ID, 2024-12-06) we have fixed the type of the `length` variable. In order to avoid a cast from `size_t` to `int` in the call to printf(3p) with the "%.*s" formatter we have converted the code to instead use fwrite(3p), which accepts the length as a `size_t`. It was reported though that this makes us read over the end of the OID array when the provided `--abbrev=` length exceeds the length of the object ID. This is because fwrite(3p) of course doesn't stop when it sees a NUL byte, whereas printf(3p) does. Fix the bug by reverting back to printf(3p) and culling the provided length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an `int`. Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Patrick Steinhardt <ps@pks.im> --- builtin/blame.c | 3 ++- t/t8002-blame.sh | 8 ++++++++ 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/builtin/blame.c b/builtin/blame.c index 867032e4c16878ffd56df8a73162b89ca4bd2694..d7630ac89cb7bd6e9ce5d72c6a98aa433b3b12da 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -505,7 +505,8 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int length--; putchar('?'); } - fwrite(hex, 1, length, stdout); + + printf("%.*s", (int)(length < GIT_MAX_HEXSZ ? length : GIT_MAX_HEXSZ), hex); if (opt & OUTPUT_ANNOTATE_COMPAT) { const char *name; if (opt & OUTPUT_SHOW_EMAIL) diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..b3f8b63d2e6744dd434f38fd9f10b56cd432141b 100755 --- a/t/t8002-blame.sh +++ b/t/t8002-blame.sh @@ -126,6 +126,14 @@ test_expect_success '--no-abbrev works like --abbrev with full length' ' check_abbrev $hexsz --no-abbrev ' +test_expect_success 'blame --abbrev gets truncated' ' + check_abbrev $hexsz --abbrev=9000 HEAD +' + +test_expect_success 'blame --abbrev gets truncated with boundary commit' ' + check_abbrev $hexsz --abbrev=9000 ^HEAD +' + test_expect_success '--exclude-promisor-objects does not BUG-crash' ' test_must_fail git blame --exclude-promisor-objects one ' -- 2.48.0.rc2.279.g1de40edade.dirty ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] builtin/blame: fix out-of-bounds write with blank boundary commits 2025-01-10 11:26 ` [PATCH v3 0/2] builtin/blame: fix out-of-bounds reads and writes Patrick Steinhardt 2025-01-10 11:26 ` [PATCH v3 1/2] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt @ 2025-01-10 11:26 ` Patrick Steinhardt 2025-01-10 13:00 ` Johannes Schindelin 1 sibling, 1 reply; 18+ messages in thread From: Patrick Steinhardt @ 2025-01-10 11:26 UTC (permalink / raw) To: git; +Cc: Johannes Schindelin, Kristoffer Haugsbakk, Junio C Hamano When passing the `-b` flag to git-blame(1), then any blamed boundary commits which were marked as uninteresting will not get their actual commit ID printed, but will instead be replaced by a couple of spaces. The flag can lead to an out-of-bounds write as though when combined with `--abbrev=` when the abbreviation length is longer than `GIT_MAX_HEXSZ` as we simply use memset(3p) on that array with the user-provided length directly. The result is most likely that we segfault. An obvious fix would be to cull `length` to `GIT_MAX_HEXSZ` many bytes. But when the underlying object ID is SHA1, and if the abbreviated length exceeds the SHA1 length, it would cause us to print more bytes than desired, and the result would be misaligned. Instead, fix the bug by computing the length via strlen(3p). This makes us write as many bytes as the formatted object ID requires and thus effectively limits the length of what we may end up printing to the length of its hash. If `--abbrev=` asks us to abbreviate to something shorter than the full length of the underlying hash function it would be handled by the call to printf(3p) correctly. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- builtin/blame.c | 6 +++--- t/t8002-blame.sh | 18 ++++++++++++++++++ 2 files changed, 21 insertions(+), 3 deletions(-) diff --git a/builtin/blame.c b/builtin/blame.c index d7630ac89cb7bd6e9ce5d72c6a98aa433b3b12da..7555c445abe7ca2fa54670ac8fee1d95a6dbafe3 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -489,9 +489,9 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int fputs(color, stdout); if (suspect->commit->object.flags & UNINTERESTING) { - if (blank_boundary) - memset(hex, ' ', length); - else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) { + if (blank_boundary) { + memset(hex, ' ', strlen(hex)); + } else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) { length--; putchar('^'); } diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh index b3f8b63d2e6744dd434f38fd9f10b56cd432141b..1ad039e1234828ca8779ad76147bfa7fe14c5a2e 100755 --- a/t/t8002-blame.sh +++ b/t/t8002-blame.sh @@ -134,6 +134,24 @@ test_expect_success 'blame --abbrev gets truncated with boundary commit' ' check_abbrev $hexsz --abbrev=9000 ^HEAD ' +test_expect_success 'blame --abbrev -b truncates the blank boundary' ' + # Note that `--abbrev=` always gets incremented by 1, which is why we + # expect 11 leading spaces and not 10. + cat >expect <<-EOF && + $(printf "%0.s " $(test_seq 11)) (<author@example.com> 2005-04-07 15:45:13 -0700 1) abbrev + EOF + git blame -b --abbrev=10 ^HEAD -- abbrev.t >actual && + test_cmp expect actual +' + +test_expect_success 'blame with excessive --abbrev and -b culls to hash length' ' + cat >expect <<-EOF && + $(printf "%0.s " $(test_seq $hexsz)) (<author@example.com> 2005-04-07 15:45:13 -0700 1) abbrev + EOF + git blame -b --abbrev=9000 ^HEAD -- abbrev.t >actual && + test_cmp expect actual +' + test_expect_success '--exclude-promisor-objects does not BUG-crash' ' test_must_fail git blame --exclude-promisor-objects one ' -- 2.48.0.rc2.279.g1de40edade.dirty ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v3 2/2] builtin/blame: fix out-of-bounds write with blank boundary commits 2025-01-10 11:26 ` [PATCH v3 2/2] builtin/blame: fix out-of-bounds write with blank boundary commits Patrick Steinhardt @ 2025-01-10 13:00 ` Johannes Schindelin 2025-01-10 14:21 ` Junio C Hamano 0 siblings, 1 reply; 18+ messages in thread From: Johannes Schindelin @ 2025-01-10 13:00 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git, Kristoffer Haugsbakk, Junio C Hamano Hi Patrick, On Fri, 10 Jan 2025, Patrick Steinhardt wrote: > diff --git a/builtin/blame.c b/builtin/blame.c > index d7630ac89cb7bd6e9ce5d72c6a98aa433b3b12da..7555c445abe7ca2fa54670ac8fee1d95a6dbafe3 100644 > --- a/builtin/blame.c > +++ b/builtin/blame.c > @@ -489,9 +489,9 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > fputs(color, stdout); > > if (suspect->commit->object.flags & UNINTERESTING) { > - if (blank_boundary) > - memset(hex, ' ', length); > - else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) { > + if (blank_boundary) { > + memset(hex, ' ', strlen(hex)); Using `strlen()` is a neat trick. I could have done without slipping in a style change (introducing curlies), but the most important thing is that it fixes the bug. Thank you, Johannes > + } else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) { > length--; > putchar('^'); > } ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 2/2] builtin/blame: fix out-of-bounds write with blank boundary commits 2025-01-10 13:00 ` Johannes Schindelin @ 2025-01-10 14:21 ` Junio C Hamano 0 siblings, 0 replies; 18+ messages in thread From: Junio C Hamano @ 2025-01-10 14:21 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Patrick Steinhardt, git, Kristoffer Haugsbakk Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > On Fri, 10 Jan 2025, Patrick Steinhardt wrote: > >> diff --git a/builtin/blame.c b/builtin/blame.c >> index d7630ac89cb7bd6e9ce5d72c6a98aa433b3b12da..7555c445abe7ca2fa54670ac8fee1d95a6dbafe3 100644 >> --- a/builtin/blame.c >> +++ b/builtin/blame.c >> @@ -489,9 +489,9 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int >> fputs(color, stdout); >> >> if (suspect->commit->object.flags & UNINTERESTING) { >> - if (blank_boundary) >> - memset(hex, ' ', length); >> - else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) { >> + if (blank_boundary) { >> + memset(hex, ' ', strlen(hex)); > > Using `strlen()` is a neat trick. > > I could have done without slipping in a style change (introducing > curlies), but the most important thing is that it fixes the bug. Thank both of you for these last-minute fixes. Hopefully we can have them in today's release, and we didn't miss unexpected side effects in them, I hope ;-). ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2025-01-10 14:21 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-01-09 6:21 [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt 2025-01-09 10:31 ` Kristoffer Haugsbakk 2025-01-09 10:49 ` Johannes Schindelin 2025-01-09 11:14 ` Patrick Steinhardt 2025-01-09 13:41 ` Johannes Schindelin 2025-01-10 9:27 ` Johannes Schindelin 2025-01-10 9:49 ` Patrick Steinhardt 2025-01-09 11:48 ` [PATCH v2] " Patrick Steinhardt 2025-01-09 12:40 ` shejialuo 2025-01-09 13:49 ` Johannes Schindelin 2025-01-10 12:16 ` shejialuo 2025-01-09 13:43 ` Johannes Schindelin 2025-01-09 14:59 ` Junio C Hamano 2025-01-10 11:26 ` [PATCH v3 0/2] builtin/blame: fix out-of-bounds reads and writes Patrick Steinhardt 2025-01-10 11:26 ` [PATCH v3 1/2] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt 2025-01-10 11:26 ` [PATCH v3 2/2] builtin/blame: fix out-of-bounds write with blank boundary commits Patrick Steinhardt 2025-01-10 13:00 ` Johannes Schindelin 2025-01-10 14:21 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).