* [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev`
@ 2025-01-09 6:21 Patrick Steinhardt
2025-01-09 10:31 ` Kristoffer Haugsbakk
` (3 more replies)
0 siblings, 4 replies; 18+ messages in thread
From: Patrick Steinhardt @ 2025-01-09 6:21 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin
In 6411a0a896 (builtin/blame: fix type of `length` variable when
emitting object ID, 2024-12-06) we have fixed the type of the `length`
variable. In order to avoid a cast from `size_t` to `int` in the call to
printf(3p) with the "%.*s" formatter we have converted the code to
instead use fwrite(3p), which accepts the length as a `size_t`.
It was reported though that this makes us read over the end of the OID
array when the provided `--abbrev=` length exceeds the length of the
object ID. This is because fwrite(3p) of course doesn't stop when it
sees a NUL byte, where as printf(3p) does.
Fix the bug by reverting back to printf(3p) and culling the provided
length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an
`int`.
Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
This fixes the issue reported in [1]. Thanks!
Patrick
[1]: <4d812802-afbc-4635-7a19-73896fcda625@gmx.de>
---
builtin/blame.c | 4 +++-
t/t8002-blame.sh | 4 ++++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/builtin/blame.c b/builtin/blame.c
index 867032e4c16878ffd56df8a73162b89ca4bd2694..ad91fe9e97f90625dd2708fbd44bf2dd24a337a6 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -475,6 +475,8 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
char ch;
size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ?
the_hash_algo->hexsz : (size_t) abbrev;
+ if (length > GIT_MAX_HEXSZ)
+ length = GIT_MAX_HEXSZ;
if (opt & OUTPUT_COLOR_LINE) {
if (cnt > 0) {
@@ -505,7 +507,7 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
length--;
putchar('?');
}
- fwrite(hex, 1, length, stdout);
+ printf("%.*s", (int)length, hex);
if (opt & OUTPUT_ANNOTATE_COMPAT) {
const char *name;
if (opt & OUTPUT_SHOW_EMAIL)
diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..fcaba8c11f7ede084e069eefd292f337e8396cb4 100755
--- a/t/t8002-blame.sh
+++ b/t/t8002-blame.sh
@@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
check_abbrev $hexsz --no-abbrev
'
+test_expect_success 'blame --abbrev gets truncated' '
+ check_abbrev 9000 --abbrev=$hexsz HEAD
+'
+
test_expect_success '--exclude-promisor-objects does not BUG-crash' '
test_must_fail git blame --exclude-promisor-objects one
'
---
base-commit: 14650065b76b28d3cfa9453356ac5669b19e706e
change-id: 20250109-b4-pks-blame-truncate-hash-length-c875cac66d71
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-09 6:21 [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt
@ 2025-01-09 10:31 ` Kristoffer Haugsbakk
2025-01-09 10:49 ` Johannes Schindelin
` (2 subsequent siblings)
3 siblings, 0 replies; 18+ messages in thread
From: Kristoffer Haugsbakk @ 2025-01-09 10:31 UTC (permalink / raw)
To: Patrick Steinhardt, git; +Cc: Johannes Schindelin
On Thu, Jan 9, 2025, at 07:21, Patrick Steinhardt wrote:
> object ID. This is because fwrite(3p) of course doesn't stop when it
> sees a NUL byte, where as printf(3p) does.
s/where as/whereas
--
Kristoffer Haugsbakk
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-09 6:21 [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt
2025-01-09 10:31 ` Kristoffer Haugsbakk
@ 2025-01-09 10:49 ` Johannes Schindelin
2025-01-09 11:14 ` Patrick Steinhardt
2025-01-09 11:48 ` [PATCH v2] " Patrick Steinhardt
2025-01-10 11:26 ` [PATCH v3 0/2] builtin/blame: fix out-of-bounds reads and writes Patrick Steinhardt
3 siblings, 1 reply; 18+ messages in thread
From: Johannes Schindelin @ 2025-01-09 10:49 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
Hi Patrick,
On Thu, 9 Jan 2025, Patrick Steinhardt wrote:
> In 6411a0a896 (builtin/blame: fix type of `length` variable when
> emitting object ID, 2024-12-06) we have fixed the type of the `length`
> variable. In order to avoid a cast from `size_t` to `int` in the call to
> printf(3p) with the "%.*s" formatter we have converted the code to
> instead use fwrite(3p), which accepts the length as a `size_t`.
>
> It was reported though that this makes us read over the end of the OID
> array when the provided `--abbrev=` length exceeds the length of the
> object ID. This is because fwrite(3p) of course doesn't stop when it
> sees a NUL byte, where as printf(3p) does.
>
> Fix the bug by reverting back to printf(3p) and culling the provided
> length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an
> `int`.
>
> Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> This fixes the issue reported in [1]. Thanks!
Thank you for the quick fix! We will need to adjust it a little more,
though:
>
> Patrick
>
> [1]: <4d812802-afbc-4635-7a19-73896fcda625@gmx.de>
> ---
> builtin/blame.c | 4 +++-
> t/t8002-blame.sh | 4 ++++
> 2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/blame.c b/builtin/blame.c
> index 867032e4c16878ffd56df8a73162b89ca4bd2694..ad91fe9e97f90625dd2708fbd44bf2dd24a337a6 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -475,6 +475,8 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> char ch;
> size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ?
> the_hash_algo->hexsz : (size_t) abbrev;
> + if (length > GIT_MAX_HEXSZ)
> + length = GIT_MAX_HEXSZ;
This causes a subtle change of behavior because there are a couple of
conditional code blocks between this change and the `printf()` call
decrease `length`, i.e. specifying values larger than the maximal hex size
causes potentially-desirable, different behavior (and think about
https://www.hyrumslaw.com/).
>
> if (opt & OUTPUT_COLOR_LINE) {
> if (cnt > 0) {
> @@ -505,7 +507,7 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> length--;
> putchar('?');
> }
> - fwrite(hex, 1, length, stdout);
> + printf("%.*s", (int)length, hex);
> if (opt & OUTPUT_ANNOTATE_COMPAT) {
> const char *name;
> if (opt & OUTPUT_SHOW_EMAIL)
> diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
> index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..fcaba8c11f7ede084e069eefd292f337e8396cb4 100755
> --- a/t/t8002-blame.sh
> +++ b/t/t8002-blame.sh
> @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
> check_abbrev $hexsz --no-abbrev
> '
>
> +test_expect_success 'blame --abbrev gets truncated' '
> + check_abbrev 9000 --abbrev=$hexsz HEAD
This is actually incorrect: it passes `--abbrev=$hexsz` instead of a value
that needs to be truncated.
> +'
> +
> test_expect_success '--exclude-promisor-objects does not BUG-crash' '
> test_must_fail git blame --exclude-promisor-objects one
> '
>
> ---
> base-commit: 14650065b76b28d3cfa9453356ac5669b19e706e
> change-id: 20250109-b4-pks-blame-truncate-hash-length-c875cac66d71
Here is my proposed fixup:
-- snipsnap --
[PATCH] fixup! builtin/blame: fix out-of-bounds read with excessive `--abbrev`
The test case needs to actually test an excessive `--abbrev` value.
Also, when calling `git blame --abbrev=<N>` with an `N` that is larger
than the maximal OID hex size, there is a subtle side effect that makes
it behave _differently_ than specifying said maximal hex size: When the
command outputs boundary, unblamable or ignored commits' OIDs, those
outputs are prefixed with characters indicating this, and the `abbrev`
value is used to align the information that comes after the OID,
clipping it as needed. Specifying a "too large" abbrev value here tells
Git that yes, we want the full OIDs and don't you worry about alignment.
Thanks to SHA-256 being _larger_ than the default SHA-1-based OIDs, and
thanks to clipping at `GIT_MAX_HEXSZ`, this change of behavior can only
be observed when running the test in SHA-256 mode.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
builtin/blame.c | 9 +++++++--
t/t8002-blame.sh | 2 +-
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/builtin/blame.c b/builtin/blame.c
index ad91fe9e97f9..5b4976835066 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -475,8 +475,13 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
char ch;
size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ?
the_hash_algo->hexsz : (size_t) abbrev;
- if (length > GIT_MAX_HEXSZ)
- length = GIT_MAX_HEXSZ;
+
+ /*
+ * Leave enough space for ^, * and ? indicators (boundary,
+ * unblamable, ignored).
+ */
+ if (length > GIT_MAX_HEXSZ + 3)
+ length = GIT_MAX_HEXSZ + 3;
if (opt & OUTPUT_COLOR_LINE) {
if (cnt > 0) {
diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
index fcaba8c11f7e..71fa70a64679 100755
--- a/t/t8002-blame.sh
+++ b/t/t8002-blame.sh
@@ -127,7 +127,7 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
'
test_expect_success 'blame --abbrev gets truncated' '
- check_abbrev 9000 --abbrev=$hexsz HEAD
+ check_abbrev 9000 --abbrev=9000 HEAD..
'
test_expect_success '--exclude-promisor-objects does not BUG-crash' '
--
2.48.0.rc0.windows.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-09 10:49 ` Johannes Schindelin
@ 2025-01-09 11:14 ` Patrick Steinhardt
2025-01-09 13:41 ` Johannes Schindelin
0 siblings, 1 reply; 18+ messages in thread
From: Patrick Steinhardt @ 2025-01-09 11:14 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git
On Thu, Jan 09, 2025 at 11:49:43AM +0100, Johannes Schindelin wrote:
> > diff --git a/builtin/blame.c b/builtin/blame.c
> > index 867032e4c16878ffd56df8a73162b89ca4bd2694..ad91fe9e97f90625dd2708fbd44bf2dd24a337a6 100644
> > --- a/builtin/blame.c
> > +++ b/builtin/blame.c
> > @@ -475,6 +475,8 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> > char ch;
> > size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ?
> > the_hash_algo->hexsz : (size_t) abbrev;
> > + if (length > GIT_MAX_HEXSZ)
> > + length = GIT_MAX_HEXSZ;
>
> This causes a subtle change of behavior because there are a couple of
> conditional code blocks between this change and the `printf()` call
> decrease `length`, i.e. specifying values larger than the maximal hex size
> causes potentially-desirable, different behavior (and think about
> https://www.hyrumslaw.com/).
Alternatively we can move this until after we have done the
subtractions. Then we don't have to do weird gymnastics.
> >
> > if (opt & OUTPUT_COLOR_LINE) {
> > if (cnt > 0) {
> > @@ -505,7 +507,7 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> > length--;
> > putchar('?');
> > }
> > - fwrite(hex, 1, length, stdout);
> > + printf("%.*s", (int)length, hex);
> > if (opt & OUTPUT_ANNOTATE_COMPAT) {
> > const char *name;
> > if (opt & OUTPUT_SHOW_EMAIL)
> > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
> > index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..fcaba8c11f7ede084e069eefd292f337e8396cb4 100755
> > --- a/t/t8002-blame.sh
> > +++ b/t/t8002-blame.sh
> > @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
> > check_abbrev $hexsz --no-abbrev
> > '
> >
> > +test_expect_success 'blame --abbrev gets truncated' '
> > + check_abbrev 9000 --abbrev=$hexsz HEAD
>
> This is actually incorrect: it passes `--abbrev=$hexsz` instead of a value
> that needs to be truncated.
Oh dear. The test did manage to catch the bug, but thinking more about
it that was only because my initial fix was broken.
> diff --git a/builtin/blame.c b/builtin/blame.c
> index ad91fe9e97f9..5b4976835066 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -475,8 +475,13 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> char ch;
> size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ?
> the_hash_algo->hexsz : (size_t) abbrev;
> - if (length > GIT_MAX_HEXSZ)
> - length = GIT_MAX_HEXSZ;
> +
> + /*
> + * Leave enough space for ^, * and ? indicators (boundary,
> + * unblamable, ignored).
> + */
> + if (length > GIT_MAX_HEXSZ + 3)
> + length = GIT_MAX_HEXSZ + 3;
>
> if (opt & OUTPUT_COLOR_LINE) {
> if (cnt > 0) {
How about this instead?
diff --git a/builtin/blame.c b/builtin/blame.c
index ad91fe9e97..f92e487bed 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -475,8 +475,6 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
char ch;
size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ?
the_hash_algo->hexsz : (size_t) abbrev;
- if (length > GIT_MAX_HEXSZ)
- length = GIT_MAX_HEXSZ;
if (opt & OUTPUT_COLOR_LINE) {
if (cnt > 0) {
@@ -507,6 +505,9 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
length--;
putchar('?');
}
+
+ if (length > GIT_MAX_HEXSZ)
+ length = GIT_MAX_HEXSZ;
printf("%.*s", (int)length, hex);
if (opt & OUTPUT_ANNOTATE_COMPAT) {
const char *name;
In that case there's no need to juggle with the magic indicators, which
makes it a bit easier to reason about.
> diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
> index fcaba8c11f7e..71fa70a64679 100755
> --- a/t/t8002-blame.sh
> +++ b/t/t8002-blame.sh
> @@ -127,7 +127,7 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
> '
>
> test_expect_success 'blame --abbrev gets truncated' '
> - check_abbrev 9000 --abbrev=$hexsz HEAD
> + check_abbrev 9000 --abbrev=9000 HEAD..
> '
This should be `check_abbrev $hexsz --abbrev=9000`, shouldn't it?
Patrick
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2] builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-09 6:21 [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt
2025-01-09 10:31 ` Kristoffer Haugsbakk
2025-01-09 10:49 ` Johannes Schindelin
@ 2025-01-09 11:48 ` Patrick Steinhardt
2025-01-09 12:40 ` shejialuo
` (2 more replies)
2025-01-10 11:26 ` [PATCH v3 0/2] builtin/blame: fix out-of-bounds reads and writes Patrick Steinhardt
3 siblings, 3 replies; 18+ messages in thread
From: Patrick Steinhardt @ 2025-01-09 11:48 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin, Kristoffer Haugsbakk
In 6411a0a896 (builtin/blame: fix type of `length` variable when
emitting object ID, 2024-12-06) we have fixed the type of the `length`
variable. In order to avoid a cast from `size_t` to `int` in the call to
printf(3p) with the "%.*s" formatter we have converted the code to
instead use fwrite(3p), which accepts the length as a `size_t`.
It was reported though that this makes us read over the end of the OID
array when the provided `--abbrev=` length exceeds the length of the
object ID. This is because fwrite(3p) of course doesn't stop when it
sees a NUL byte, whereas printf(3p) does.
Fix the bug by reverting back to printf(3p) and culling the provided
length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an
`int`.
Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
This fixes the issue reported in [1]. Thanks!
Changes in v2:
- Take into account that we may strip ^, * and ? indicators by moving
around the check.
- Fix the testcase so that it actually fails without the fix.
- Link to v1: https://lore.kernel.org/r/20250109-b4-pks-blame-truncate-hash-length-v1-1-9ad4bb09e059@pks.im
Patrick
[1]: <4d812802-afbc-4635-7a19-73896fcda625@gmx.de>
---
builtin/blame.c | 5 ++++-
t/t8002-blame.sh | 4 ++++
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/builtin/blame.c b/builtin/blame.c
index 867032e4c16878ffd56df8a73162b89ca4bd2694..f92e487bed22eec576a4716f2e654cb61efb9903 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -505,7 +505,10 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
length--;
putchar('?');
}
- fwrite(hex, 1, length, stdout);
+
+ if (length > GIT_MAX_HEXSZ)
+ length = GIT_MAX_HEXSZ;
+ printf("%.*s", (int)length, hex);
if (opt & OUTPUT_ANNOTATE_COMPAT) {
const char *name;
if (opt & OUTPUT_SHOW_EMAIL)
diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..7cf6e0253a5bbd4d6e438e627dc18b47eac4df66 100755
--- a/t/t8002-blame.sh
+++ b/t/t8002-blame.sh
@@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
check_abbrev $hexsz --no-abbrev
'
+test_expect_success 'blame --abbrev gets truncated' '
+ check_abbrev $hexsz --abbrev=9000 HEAD
+'
+
test_expect_success '--exclude-promisor-objects does not BUG-crash' '
test_must_fail git blame --exclude-promisor-objects one
'
---
base-commit: 14650065b76b28d3cfa9453356ac5669b19e706e
change-id: 20250109-b4-pks-blame-truncate-hash-length-c875cac66d71
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v2] builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-09 11:48 ` [PATCH v2] " Patrick Steinhardt
@ 2025-01-09 12:40 ` shejialuo
2025-01-09 13:49 ` Johannes Schindelin
2025-01-09 13:43 ` Johannes Schindelin
2025-01-09 14:59 ` Junio C Hamano
2 siblings, 1 reply; 18+ messages in thread
From: shejialuo @ 2025-01-09 12:40 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Johannes Schindelin, Kristoffer Haugsbakk
On Thu, Jan 09, 2025 at 12:48:22PM +0100, Patrick Steinhardt wrote:
> In 6411a0a896 (builtin/blame: fix type of `length` variable when
> emitting object ID, 2024-12-06) we have fixed the type of the `length`
> variable. In order to avoid a cast from `size_t` to `int` in the call to
> printf(3p) with the "%.*s" formatter we have converted the code to
> instead use fwrite(3p), which accepts the length as a `size_t`.
>
> It was reported though that this makes us read over the end of the OID
> array when the provided `--abbrev=` length exceeds the length of the
> object ID. This is because fwrite(3p) of course doesn't stop when it
> sees a NUL byte, whereas printf(3p) does.
>
> Fix the bug by reverting back to printf(3p) and culling the provided
> length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an
> `int`.
>
> Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> This fixes the issue reported in [1]. Thanks!
>
> Changes in v2:
>
> - Take into account that we may strip ^, * and ? indicators by moving
> around the check.
> - Fix the testcase so that it actually fails without the fix.
> - Link to v1: https://lore.kernel.org/r/20250109-b4-pks-blame-truncate-hash-length-v1-1-9ad4bb09e059@pks.im
>
> Patrick
>
> [1]: <4d812802-afbc-4635-7a19-73896fcda625@gmx.de>
> ---
> builtin/blame.c | 5 ++++-
> t/t8002-blame.sh | 4 ++++
> 2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/blame.c b/builtin/blame.c
> index 867032e4c16878ffd56df8a73162b89ca4bd2694..f92e487bed22eec576a4716f2e654cb61efb9903 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -505,7 +505,10 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> length--;
> putchar('?');
> }
> - fwrite(hex, 1, length, stdout);
> +
> + if (length > GIT_MAX_HEXSZ)
> + length = GIT_MAX_HEXSZ;
Here, we explicitly set the length if it exceeds the max hex of the
repo. Although `printf` will automatically hide the length, we should
explicitly do this. Make sense.
> + printf("%.*s", (int)length, hex);
> if (opt & OUTPUT_ANNOTATE_COMPAT) {
> const char *name;
> if (opt & OUTPUT_SHOW_EMAIL)
> diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
> index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..7cf6e0253a5bbd4d6e438e627dc18b47eac4df66 100755
> --- a/t/t8002-blame.sh
> +++ b/t/t8002-blame.sh
> @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
> check_abbrev $hexsz --no-abbrev
> '
>
> +test_expect_success 'blame --abbrev gets truncated' '
> + check_abbrev $hexsz --abbrev=9000 HEAD
> +'
> +
By the way, I feel this usage is a little strange as the user side. When
I received the report mail from Johannes today morning, I feel a little
funny that we allow the value of the `--abrrev` option exceeds the
`GIT_MAX_HEXSZ` in the first place.
> test_expect_success '--exclude-promisor-objects does not BUG-crash' '
> test_must_fail git blame --exclude-promisor-objects one
> '
>
> ---
> base-commit: 14650065b76b28d3cfa9453356ac5669b19e706e
> change-id: 20250109-b4-pks-blame-truncate-hash-length-c875cac66d71
>
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-09 11:14 ` Patrick Steinhardt
@ 2025-01-09 13:41 ` Johannes Schindelin
2025-01-10 9:27 ` Johannes Schindelin
0 siblings, 1 reply; 18+ messages in thread
From: Johannes Schindelin @ 2025-01-09 13:41 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
Hi Patrick,
On Thu, 9 Jan 2025, Patrick Steinhardt wrote:
> On Thu, Jan 09, 2025 at 11:49:43AM +0100, Johannes Schindelin wrote:
> > > diff --git a/builtin/blame.c b/builtin/blame.c
> > > index 867032e4c16878ffd56df8a73162b89ca4bd2694..ad91fe9e97f90625dd2708fbd44bf2dd24a337a6 100644
> > > --- a/builtin/blame.c
> > > +++ b/builtin/blame.c
> > > @@ -475,6 +475,8 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> > > char ch;
> > > size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ?
> > > the_hash_algo->hexsz : (size_t) abbrev;
> > > + if (length > GIT_MAX_HEXSZ)
> > > + length = GIT_MAX_HEXSZ;
> >
> > This causes a subtle change of behavior because there are a couple of
> > conditional code blocks between this change and the `printf()` call
> > decrease `length`, i.e. specifying values larger than the maximal hex size
> > causes potentially-desirable, different behavior (and think about
> > https://www.hyrumslaw.com/).
>
> Alternatively we can move this until after we have done the
> subtractions. Then we don't have to do weird gymnastics.
Or we can even avoid assiging a maximum altogether:
if (length < GIT_MAX_HEXSZ)
printf("%.*s", (int)length, hex);
else
printf("%s", hex);
Or be more consistent with Git's source code style which often prefers
ternaries, favoring succinctness over readability:
printf("%.*s", (int)(length < GIT_MAX_HEXSZ ? length : GIT_MAX_HEXSZ), hex);
> > > if (opt & OUTPUT_COLOR_LINE) {
> > > if (cnt > 0) {
> > > @@ -505,7 +507,7 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> > > length--;
> > > putchar('?');
> > > }
> > > - fwrite(hex, 1, length, stdout);
> > > + printf("%.*s", (int)length, hex);
> > > if (opt & OUTPUT_ANNOTATE_COMPAT) {
> > > const char *name;
> > > if (opt & OUTPUT_SHOW_EMAIL)
> > > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
> > > index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..fcaba8c11f7ede084e069eefd292f337e8396cb4 100755
> > > --- a/t/t8002-blame.sh
> > > +++ b/t/t8002-blame.sh
> > > @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
> > > check_abbrev $hexsz --no-abbrev
> > > '
> > >
> > > +test_expect_success 'blame --abbrev gets truncated' '
> > > + check_abbrev 9000 --abbrev=$hexsz HEAD
> >
> > This is actually incorrect: it passes `--abbrev=$hexsz` instead of a value
> > that needs to be truncated.
>
> Oh dear. The test did manage to catch the bug, but thinking more about
> it that was only because my initial fix was broken.
>
> > diff --git a/builtin/blame.c b/builtin/blame.c
> > index ad91fe9e97f9..5b4976835066 100644
> > --- a/builtin/blame.c
> > +++ b/builtin/blame.c
> > @@ -475,8 +475,13 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> > char ch;
> > size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ?
> > the_hash_algo->hexsz : (size_t) abbrev;
> > - if (length > GIT_MAX_HEXSZ)
> > - length = GIT_MAX_HEXSZ;
> > +
> > + /*
> > + * Leave enough space for ^, * and ? indicators (boundary,
> > + * unblamable, ignored).
> > + */
> > + if (length > GIT_MAX_HEXSZ + 3)
> > + length = GIT_MAX_HEXSZ + 3;
> >
> > if (opt & OUTPUT_COLOR_LINE) {
> > if (cnt > 0) {
>
> How about this instead?
>
> diff --git a/builtin/blame.c b/builtin/blame.c
> index ad91fe9e97..f92e487bed 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -475,8 +475,6 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> char ch;
> size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ?
> the_hash_algo->hexsz : (size_t) abbrev;
> - if (length > GIT_MAX_HEXSZ)
> - length = GIT_MAX_HEXSZ;
>
> if (opt & OUTPUT_COLOR_LINE) {
> if (cnt > 0) {
> @@ -507,6 +505,9 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> length--;
> putchar('?');
> }
> +
> + if (length > GIT_MAX_HEXSZ)
> + length = GIT_MAX_HEXSZ;
> printf("%.*s", (int)length, hex);
> if (opt & OUTPUT_ANNOTATE_COMPAT) {
> const char *name;
>
> In that case there's no need to juggle with the magic indicators, which
> makes it a bit easier to reason about.
>
> > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
> > index fcaba8c11f7e..71fa70a64679 100755
> > --- a/t/t8002-blame.sh
> > +++ b/t/t8002-blame.sh
> > @@ -127,7 +127,7 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
> > '
> >
> > test_expect_success 'blame --abbrev gets truncated' '
> > - check_abbrev 9000 --abbrev=$hexsz HEAD
> > + check_abbrev 9000 --abbrev=9000 HEAD..
> > '
>
> This should be `check_abbrev $hexsz --abbrev=9000`, shouldn't it?
I kind of liked the idea to keep the same cut-off threshold for the
validation. But I won't insist. The construct is obtuse in either case ;-)
(Not your fault, of course, you merely imitated existing code, which is
the correct thing to do).
Ciao,
Johannes
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2] builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-09 11:48 ` [PATCH v2] " Patrick Steinhardt
2025-01-09 12:40 ` shejialuo
@ 2025-01-09 13:43 ` Johannes Schindelin
2025-01-09 14:59 ` Junio C Hamano
2 siblings, 0 replies; 18+ messages in thread
From: Johannes Schindelin @ 2025-01-09 13:43 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Kristoffer Haugsbakk
Hi Patrick,
On Thu, 9 Jan 2025, Patrick Steinhardt wrote:
> diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
> index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..7cf6e0253a5bbd4d6e438e627dc18b47eac4df66 100755
> --- a/t/t8002-blame.sh
> +++ b/t/t8002-blame.sh
> @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
> check_abbrev $hexsz --no-abbrev
> '
>
> +test_expect_success 'blame --abbrev gets truncated' '
> + check_abbrev $hexsz --abbrev=9000 HEAD
Please note that the patch I proposed had `HEAD..` here, not `HEAD`. The
latter would not have surfaced the problem with boundary commits.
Ciao,
Johannes
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2] builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-09 12:40 ` shejialuo
@ 2025-01-09 13:49 ` Johannes Schindelin
2025-01-10 12:16 ` shejialuo
0 siblings, 1 reply; 18+ messages in thread
From: Johannes Schindelin @ 2025-01-09 13:49 UTC (permalink / raw)
To: shejialuo; +Cc: Patrick Steinhardt, git, Kristoffer Haugsbakk
Hi Jialuo,
On Thu, 9 Jan 2025, shejialuo wrote:
> On Thu, Jan 09, 2025 at 12:48:22PM +0100, Patrick Steinhardt wrote:
>
> > + printf("%.*s", (int)length, hex);
> > if (opt & OUTPUT_ANNOTATE_COMPAT) {
> > const char *name;
> > if (opt & OUTPUT_SHOW_EMAIL)
> > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
> > index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..7cf6e0253a5bbd4d6e438e627dc18b47eac4df66 100755
> > --- a/t/t8002-blame.sh
> > +++ b/t/t8002-blame.sh
> > @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
> > check_abbrev $hexsz --no-abbrev
> > '
> >
> > +test_expect_success 'blame --abbrev gets truncated' '
> > + check_abbrev $hexsz --abbrev=9000 HEAD
> > +'
> > +
>
> By the way, I feel this usage is a little strange as the user side. When
> I received the report mail from Johannes today morning, I feel a little
> funny that we allow the value of the `--abrrev` option exceeds the
> `GIT_MAX_HEXSZ` in the first place.
See the explanation I provided in
https://lore.kernel.org/git/c439fcaf-11af-7862-9c3c-18dc0842b57d@gmx.de/:
When calling `git blame --abbrev=40 HEAD.. -- <file>` (in a SHA-1-based
repository), the OIDs are prefixed with a `^` and then the last hex digit
will be cut. The reason? Git wants to align the text after the OID.
When calling it with `--abbrev=41`, the full OID is shown.
Ciao,
Johannes
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2] builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-09 11:48 ` [PATCH v2] " Patrick Steinhardt
2025-01-09 12:40 ` shejialuo
2025-01-09 13:43 ` Johannes Schindelin
@ 2025-01-09 14:59 ` Junio C Hamano
2 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2025-01-09 14:59 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Johannes Schindelin, Kristoffer Haugsbakk
Patrick Steinhardt <ps@pks.im> writes:
> diff --git a/builtin/blame.c b/builtin/blame.c
> index 867032e4c16878ffd56df8a73162b89ca4bd2694..f92e487bed22eec576a4716f2e654cb61efb9903 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -505,7 +505,10 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
Mental note: the "hex" in question that is used later is defined as
char hex[GIT_MAX_HEXSZ + 1];
at the top of this scope.
> length--;
> putchar('?');
> }
> - fwrite(hex, 1, length, stdout);
> +
> + if (length > GIT_MAX_HEXSZ)
> + length = GIT_MAX_HEXSZ;
> + printf("%.*s", (int)length, hex);
This is not wrong per-se, but leaves a funny aftertaste after
reading it, especially if the reader knows the original *bug* was
about showing past the end of the string in "hex". The question is
"what if the current hash function produces shorter than MAX_HEXSZ?"
The updated code is correct only because it reinstates the use of
printf() so the "length" being longer than the string itself no
longer matters, and the only requirement on "length" is not to
read beyond the end of hex[] array. So feeding the length of the
array as the limit is not wrong, even though it may be feeding a
limit that is larger than the string stored in the array.
An obvious alternative would have been to base the limit on the
value of strlen(hex) and then we may even keep using fwrite(); then
we do not have worry about "what if MAX_HEXSZ is larger than the
current hash function?" and nonsense like that (I personally prefer
what is being reviewed much better; please do not take this "obvious
alternative" as a suggestion).
Limiting the "length" here does improve the resulting code, relative
to v1 iteration. This printf is about showing the hexstring and we
are making sure we do not read past the end of the array that holds
the string, and doing it here means we do not have to worry about
leading prefix characters about boundary etc. Very good
improvement.
Thanks.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-09 13:41 ` Johannes Schindelin
@ 2025-01-10 9:27 ` Johannes Schindelin
2025-01-10 9:49 ` Patrick Steinhardt
0 siblings, 1 reply; 18+ messages in thread
From: Johannes Schindelin @ 2025-01-10 9:27 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
Hi Patrick,
On Thu, 9 Jan 2025, Johannes Schindelin wrote:
> On Thu, 9 Jan 2025, Patrick Steinhardt wrote:
>
> > On Thu, Jan 09, 2025 at 11:49:43AM +0100, Johannes Schindelin wrote:
> > > > diff --git a/builtin/blame.c b/builtin/blame.c
> > > > index 867032e4c16878ffd56df8a73162b89ca4bd2694..ad91fe9e97f90625dd2708fbd44bf2dd24a337a6 100644
> > > > --- a/builtin/blame.c
> > > > +++ b/builtin/blame.c
> > > > @@ -475,6 +475,8 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> > > > char ch;
> > > > size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ?
> > > > the_hash_algo->hexsz : (size_t) abbrev;
> > > > + if (length > GIT_MAX_HEXSZ)
> > > > + length = GIT_MAX_HEXSZ;
> > >
> > > This causes a subtle change of behavior because there are a couple of
> > > conditional code blocks between this change and the `printf()` call
> > > decrease `length`, i.e. specifying values larger than the maximal hex size
> > > causes potentially-desirable, different behavior (and think about
> > > https://www.hyrumslaw.com/).
> >
> > Alternatively we can move this until after we have done the
> > subtractions. Then we don't have to do weird gymnastics.
>
> Or we can even avoid assiging a maximum altogether:
>
> if (length < GIT_MAX_HEXSZ)
> printf("%.*s", (int)length, hex);
> else
> printf("%s", hex);
>
> Or be more consistent with Git's source code style which often prefers
> ternaries, favoring succinctness over readability:
>
> printf("%.*s", (int)(length < GIT_MAX_HEXSZ ? length : GIT_MAX_HEXSZ), hex);
Coverity noticed a problem with this approach, looking at
https://github.com/git/git/blob/v2.48.0-rc2/builtin/blame.c#L493:
memset(hex, ' ', length);
If the `GIT_MAX_HEXSZ` guard is moved after this statement, then we can
easily overrun the `hex` buffer.
Ciao,
Johannes
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-10 9:27 ` Johannes Schindelin
@ 2025-01-10 9:49 ` Patrick Steinhardt
0 siblings, 0 replies; 18+ messages in thread
From: Patrick Steinhardt @ 2025-01-10 9:49 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git
On Fri, Jan 10, 2025 at 10:27:23AM +0100, Johannes Schindelin wrote:
> Hi Patrick,
>
> On Thu, 9 Jan 2025, Johannes Schindelin wrote:
>
> > On Thu, 9 Jan 2025, Patrick Steinhardt wrote:
> >
> > > On Thu, Jan 09, 2025 at 11:49:43AM +0100, Johannes Schindelin wrote:
> > > > > diff --git a/builtin/blame.c b/builtin/blame.c
> > > > > index 867032e4c16878ffd56df8a73162b89ca4bd2694..ad91fe9e97f90625dd2708fbd44bf2dd24a337a6 100644
> > > > > --- a/builtin/blame.c
> > > > > +++ b/builtin/blame.c
> > > > > @@ -475,6 +475,8 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> > > > > char ch;
> > > > > size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ?
> > > > > the_hash_algo->hexsz : (size_t) abbrev;
> > > > > + if (length > GIT_MAX_HEXSZ)
> > > > > + length = GIT_MAX_HEXSZ;
> > > >
> > > > This causes a subtle change of behavior because there are a couple of
> > > > conditional code blocks between this change and the `printf()` call
> > > > decrease `length`, i.e. specifying values larger than the maximal hex size
> > > > causes potentially-desirable, different behavior (and think about
> > > > https://www.hyrumslaw.com/).
> > >
> > > Alternatively we can move this until after we have done the
> > > subtractions. Then we don't have to do weird gymnastics.
> >
> > Or we can even avoid assiging a maximum altogether:
> >
> > if (length < GIT_MAX_HEXSZ)
> > printf("%.*s", (int)length, hex);
> > else
> > printf("%s", hex);
> >
> > Or be more consistent with Git's source code style which often prefers
> > ternaries, favoring succinctness over readability:
> >
> > printf("%.*s", (int)(length < GIT_MAX_HEXSZ ? length : GIT_MAX_HEXSZ), hex);
>
> Coverity noticed a problem with this approach, looking at
> https://github.com/git/git/blob/v2.48.0-rc2/builtin/blame.c#L493:
>
> memset(hex, ' ', length);
>
> If the `GIT_MAX_HEXSZ` guard is moved after this statement, then we can
> easily overrun the `hex` buffer.
Oh. That's even an old-standing issue that wasn't caused by the
refactoring, right? Your proposed fix to set `length = GIT_MAX_HEXSZ +
3` to account for the old behaviour wouldn't fix it either, as we could
still end up overwriting two bytes.
I'll send a new version with another commit on top to fix this.
Patrick
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 0/2] builtin/blame: fix out-of-bounds reads and writes
2025-01-09 6:21 [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt
` (2 preceding siblings ...)
2025-01-09 11:48 ` [PATCH v2] " Patrick Steinhardt
@ 2025-01-10 11:26 ` Patrick Steinhardt
2025-01-10 11:26 ` [PATCH v3 1/2] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt
2025-01-10 11:26 ` [PATCH v3 2/2] builtin/blame: fix out-of-bounds write with blank boundary commits Patrick Steinhardt
3 siblings, 2 replies; 18+ messages in thread
From: Patrick Steinhardt @ 2025-01-10 11:26 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin, Kristoffer Haugsbakk, Junio C Hamano
Hi,
This fixes the issues reported in [1] and [2]. Thanks!
Changes in v2:
- Take into account that we may strip ^, * and ? indicators by moving
around the check.
- Fix the testcase so that it actually fails without the fix.
- Link to v1: https://lore.kernel.org/r/20250109-b4-pks-blame-truncate-hash-length-v1-1-9ad4bb09e059@pks.im
Changes in v3:
- Add another testcase for boundary commits.
- Fix another out-of-bound write noticed by Coverity. This bug is not
a regression in v2.48.0, but is a preexisting error.
- Simplify the printf statement a bit by using a ternary statement.
- Link to v2: https://lore.kernel.org/r/20250109-b4-pks-blame-truncate-hash-length-v2-1-589c81a6ddb0@pks.im
Patrick
[1]: <4d812802-afbc-4635-7a19-73896fcda625@gmx.de>
[2]: <48ca0114-124b-e3f5-af80-1e302bf9ce52@gmx.de>
---
Patrick Steinhardt (2):
builtin/blame: fix out-of-bounds read with excessive `--abbrev`
builtin/blame: fix out-of-bounds write with blank boundary commits
builtin/blame.c | 9 +++++----
t/t8002-blame.sh | 26 ++++++++++++++++++++++++++
2 files changed, 31 insertions(+), 4 deletions(-)
Range-diff versus v2:
1: 7065637d8e ! 1: 3865bfe643 builtin/blame: fix out-of-bounds read with excessive `--abbrev`
@@ builtin/blame.c: static void emit_other(struct blame_scoreboard *sb, struct blam
}
- fwrite(hex, 1, length, stdout);
+
-+ if (length > GIT_MAX_HEXSZ)
-+ length = GIT_MAX_HEXSZ;
-+ printf("%.*s", (int)length, hex);
++ printf("%.*s", (int)(length < GIT_MAX_HEXSZ ? length : GIT_MAX_HEXSZ), hex);
if (opt & OUTPUT_ANNOTATE_COMPAT) {
const char *name;
if (opt & OUTPUT_SHOW_EMAIL)
@@ t/t8002-blame.sh: test_expect_success '--no-abbrev works like --abbrev with full
+test_expect_success 'blame --abbrev gets truncated' '
+ check_abbrev $hexsz --abbrev=9000 HEAD
+'
++
++test_expect_success 'blame --abbrev gets truncated with boundary commit' '
++ check_abbrev $hexsz --abbrev=9000 ^HEAD
++'
+
test_expect_success '--exclude-promisor-objects does not BUG-crash' '
test_must_fail git blame --exclude-promisor-objects one
-: ---------- > 2: af0af67a8a builtin/blame: fix out-of-bounds write with blank boundary commits
---
base-commit: 14650065b76b28d3cfa9453356ac5669b19e706e
change-id: 20250109-b4-pks-blame-truncate-hash-length-c875cac66d71
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 1/2] builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-10 11:26 ` [PATCH v3 0/2] builtin/blame: fix out-of-bounds reads and writes Patrick Steinhardt
@ 2025-01-10 11:26 ` Patrick Steinhardt
2025-01-10 11:26 ` [PATCH v3 2/2] builtin/blame: fix out-of-bounds write with blank boundary commits Patrick Steinhardt
1 sibling, 0 replies; 18+ messages in thread
From: Patrick Steinhardt @ 2025-01-10 11:26 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin, Kristoffer Haugsbakk, Junio C Hamano
In 6411a0a896 (builtin/blame: fix type of `length` variable when
emitting object ID, 2024-12-06) we have fixed the type of the `length`
variable. In order to avoid a cast from `size_t` to `int` in the call to
printf(3p) with the "%.*s" formatter we have converted the code to
instead use fwrite(3p), which accepts the length as a `size_t`.
It was reported though that this makes us read over the end of the OID
array when the provided `--abbrev=` length exceeds the length of the
object ID. This is because fwrite(3p) of course doesn't stop when it
sees a NUL byte, whereas printf(3p) does.
Fix the bug by reverting back to printf(3p) and culling the provided
length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an
`int`.
Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/blame.c | 3 ++-
t/t8002-blame.sh | 8 ++++++++
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/builtin/blame.c b/builtin/blame.c
index 867032e4c16878ffd56df8a73162b89ca4bd2694..d7630ac89cb7bd6e9ce5d72c6a98aa433b3b12da 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -505,7 +505,8 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
length--;
putchar('?');
}
- fwrite(hex, 1, length, stdout);
+
+ printf("%.*s", (int)(length < GIT_MAX_HEXSZ ? length : GIT_MAX_HEXSZ), hex);
if (opt & OUTPUT_ANNOTATE_COMPAT) {
const char *name;
if (opt & OUTPUT_SHOW_EMAIL)
diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..b3f8b63d2e6744dd434f38fd9f10b56cd432141b 100755
--- a/t/t8002-blame.sh
+++ b/t/t8002-blame.sh
@@ -126,6 +126,14 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
check_abbrev $hexsz --no-abbrev
'
+test_expect_success 'blame --abbrev gets truncated' '
+ check_abbrev $hexsz --abbrev=9000 HEAD
+'
+
+test_expect_success 'blame --abbrev gets truncated with boundary commit' '
+ check_abbrev $hexsz --abbrev=9000 ^HEAD
+'
+
test_expect_success '--exclude-promisor-objects does not BUG-crash' '
test_must_fail git blame --exclude-promisor-objects one
'
--
2.48.0.rc2.279.g1de40edade.dirty
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] builtin/blame: fix out-of-bounds write with blank boundary commits
2025-01-10 11:26 ` [PATCH v3 0/2] builtin/blame: fix out-of-bounds reads and writes Patrick Steinhardt
2025-01-10 11:26 ` [PATCH v3 1/2] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt
@ 2025-01-10 11:26 ` Patrick Steinhardt
2025-01-10 13:00 ` Johannes Schindelin
1 sibling, 1 reply; 18+ messages in thread
From: Patrick Steinhardt @ 2025-01-10 11:26 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin, Kristoffer Haugsbakk, Junio C Hamano
When passing the `-b` flag to git-blame(1), then any blamed boundary
commits which were marked as uninteresting will not get their actual
commit ID printed, but will instead be replaced by a couple of spaces.
The flag can lead to an out-of-bounds write as though when combined with
`--abbrev=` when the abbreviation length is longer than `GIT_MAX_HEXSZ`
as we simply use memset(3p) on that array with the user-provided length
directly. The result is most likely that we segfault.
An obvious fix would be to cull `length` to `GIT_MAX_HEXSZ` many bytes.
But when the underlying object ID is SHA1, and if the abbreviated length
exceeds the SHA1 length, it would cause us to print more bytes than
desired, and the result would be misaligned.
Instead, fix the bug by computing the length via strlen(3p). This makes
us write as many bytes as the formatted object ID requires and thus
effectively limits the length of what we may end up printing to the
length of its hash. If `--abbrev=` asks us to abbreviate to something
shorter than the full length of the underlying hash function it would be
handled by the call to printf(3p) correctly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/blame.c | 6 +++---
t/t8002-blame.sh | 18 ++++++++++++++++++
2 files changed, 21 insertions(+), 3 deletions(-)
diff --git a/builtin/blame.c b/builtin/blame.c
index d7630ac89cb7bd6e9ce5d72c6a98aa433b3b12da..7555c445abe7ca2fa54670ac8fee1d95a6dbafe3 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -489,9 +489,9 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
fputs(color, stdout);
if (suspect->commit->object.flags & UNINTERESTING) {
- if (blank_boundary)
- memset(hex, ' ', length);
- else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) {
+ if (blank_boundary) {
+ memset(hex, ' ', strlen(hex));
+ } else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) {
length--;
putchar('^');
}
diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
index b3f8b63d2e6744dd434f38fd9f10b56cd432141b..1ad039e1234828ca8779ad76147bfa7fe14c5a2e 100755
--- a/t/t8002-blame.sh
+++ b/t/t8002-blame.sh
@@ -134,6 +134,24 @@ test_expect_success 'blame --abbrev gets truncated with boundary commit' '
check_abbrev $hexsz --abbrev=9000 ^HEAD
'
+test_expect_success 'blame --abbrev -b truncates the blank boundary' '
+ # Note that `--abbrev=` always gets incremented by 1, which is why we
+ # expect 11 leading spaces and not 10.
+ cat >expect <<-EOF &&
+ $(printf "%0.s " $(test_seq 11)) (<author@example.com> 2005-04-07 15:45:13 -0700 1) abbrev
+ EOF
+ git blame -b --abbrev=10 ^HEAD -- abbrev.t >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'blame with excessive --abbrev and -b culls to hash length' '
+ cat >expect <<-EOF &&
+ $(printf "%0.s " $(test_seq $hexsz)) (<author@example.com> 2005-04-07 15:45:13 -0700 1) abbrev
+ EOF
+ git blame -b --abbrev=9000 ^HEAD -- abbrev.t >actual &&
+ test_cmp expect actual
+'
+
test_expect_success '--exclude-promisor-objects does not BUG-crash' '
test_must_fail git blame --exclude-promisor-objects one
'
--
2.48.0.rc2.279.g1de40edade.dirty
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v2] builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-09 13:49 ` Johannes Schindelin
@ 2025-01-10 12:16 ` shejialuo
0 siblings, 0 replies; 18+ messages in thread
From: shejialuo @ 2025-01-10 12:16 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Patrick Steinhardt, git, Kristoffer Haugsbakk
On Thu, Jan 09, 2025 at 02:49:09PM +0100, Johannes Schindelin wrote:
> Hi Jialuo,
>
> On Thu, 9 Jan 2025, shejialuo wrote:
>
> > On Thu, Jan 09, 2025 at 12:48:22PM +0100, Patrick Steinhardt wrote:
> >
> > > + printf("%.*s", (int)length, hex);
> > > if (opt & OUTPUT_ANNOTATE_COMPAT) {
> > > const char *name;
> > > if (opt & OUTPUT_SHOW_EMAIL)
> > > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
> > > index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..7cf6e0253a5bbd4d6e438e627dc18b47eac4df66 100755
> > > --- a/t/t8002-blame.sh
> > > +++ b/t/t8002-blame.sh
> > > @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
> > > check_abbrev $hexsz --no-abbrev
> > > '
> > >
> > > +test_expect_success 'blame --abbrev gets truncated' '
> > > + check_abbrev $hexsz --abbrev=9000 HEAD
> > > +'
> > > +
> >
> > By the way, I feel this usage is a little strange as the user side. When
> > I received the report mail from Johannes today morning, I feel a little
> > funny that we allow the value of the `--abrrev` option exceeds the
> > `GIT_MAX_HEXSZ` in the first place.
>
> See the explanation I provided in
> https://lore.kernel.org/git/c439fcaf-11af-7862-9c3c-18dc0842b57d@gmx.de/:
> When calling `git blame --abbrev=40 HEAD.. -- <file>` (in a SHA-1-based
> repository), the OIDs are prefixed with a `^` and then the last hex digit
> will be cut. The reason? Git wants to align the text after the OID.
>
I have read through this, thanks for the detailed explanation.
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 2/2] builtin/blame: fix out-of-bounds write with blank boundary commits
2025-01-10 11:26 ` [PATCH v3 2/2] builtin/blame: fix out-of-bounds write with blank boundary commits Patrick Steinhardt
@ 2025-01-10 13:00 ` Johannes Schindelin
2025-01-10 14:21 ` Junio C Hamano
0 siblings, 1 reply; 18+ messages in thread
From: Johannes Schindelin @ 2025-01-10 13:00 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Kristoffer Haugsbakk, Junio C Hamano
Hi Patrick,
On Fri, 10 Jan 2025, Patrick Steinhardt wrote:
> diff --git a/builtin/blame.c b/builtin/blame.c
> index d7630ac89cb7bd6e9ce5d72c6a98aa433b3b12da..7555c445abe7ca2fa54670ac8fee1d95a6dbafe3 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -489,9 +489,9 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> fputs(color, stdout);
>
> if (suspect->commit->object.flags & UNINTERESTING) {
> - if (blank_boundary)
> - memset(hex, ' ', length);
> - else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) {
> + if (blank_boundary) {
> + memset(hex, ' ', strlen(hex));
Using `strlen()` is a neat trick.
I could have done without slipping in a style change (introducing
curlies), but the most important thing is that it fixes the bug.
Thank you,
Johannes
> + } else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) {
> length--;
> putchar('^');
> }
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 2/2] builtin/blame: fix out-of-bounds write with blank boundary commits
2025-01-10 13:00 ` Johannes Schindelin
@ 2025-01-10 14:21 ` Junio C Hamano
0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2025-01-10 14:21 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Patrick Steinhardt, git, Kristoffer Haugsbakk
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> On Fri, 10 Jan 2025, Patrick Steinhardt wrote:
>
>> diff --git a/builtin/blame.c b/builtin/blame.c
>> index d7630ac89cb7bd6e9ce5d72c6a98aa433b3b12da..7555c445abe7ca2fa54670ac8fee1d95a6dbafe3 100644
>> --- a/builtin/blame.c
>> +++ b/builtin/blame.c
>> @@ -489,9 +489,9 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
>> fputs(color, stdout);
>>
>> if (suspect->commit->object.flags & UNINTERESTING) {
>> - if (blank_boundary)
>> - memset(hex, ' ', length);
>> - else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) {
>> + if (blank_boundary) {
>> + memset(hex, ' ', strlen(hex));
>
> Using `strlen()` is a neat trick.
>
> I could have done without slipping in a style change (introducing
> curlies), but the most important thing is that it fixes the bug.
Thank both of you for these last-minute fixes. Hopefully we can
have them in today's release, and we didn't miss unexpected side
effects in them, I hope ;-).
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2025-01-10 14:21 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-09 6:21 [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt
2025-01-09 10:31 ` Kristoffer Haugsbakk
2025-01-09 10:49 ` Johannes Schindelin
2025-01-09 11:14 ` Patrick Steinhardt
2025-01-09 13:41 ` Johannes Schindelin
2025-01-10 9:27 ` Johannes Schindelin
2025-01-10 9:49 ` Patrick Steinhardt
2025-01-09 11:48 ` [PATCH v2] " Patrick Steinhardt
2025-01-09 12:40 ` shejialuo
2025-01-09 13:49 ` Johannes Schindelin
2025-01-10 12:16 ` shejialuo
2025-01-09 13:43 ` Johannes Schindelin
2025-01-09 14:59 ` Junio C Hamano
2025-01-10 11:26 ` [PATCH v3 0/2] builtin/blame: fix out-of-bounds reads and writes Patrick Steinhardt
2025-01-10 11:26 ` [PATCH v3 1/2] builtin/blame: fix out-of-bounds read with excessive `--abbrev` Patrick Steinhardt
2025-01-10 11:26 ` [PATCH v3 2/2] builtin/blame: fix out-of-bounds write with blank boundary commits Patrick Steinhardt
2025-01-10 13:00 ` Johannes Schindelin
2025-01-10 14:21 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).