From: Taylor Blau <me@ttaylorr.com>
To: Phillip Wood <phillip.wood@dunelm.org.uk>
Cc: git@vger.kernel.org, Derrick Stolee <stolee@gmail.com>,
Phillip Wood <phillip.wood123@gmail.com>
Subject: Re: [PATCH 1/4] midx repack: avoid integer overflow on 32 bit systems
Date: Tue, 20 May 2025 13:54:06 -0400 [thread overview]
Message-ID: <aCzBvvZDS2OFJ30h@nand.local> (raw)
In-Reply-To: <cbc5e69b908cef3800569abe79cb9c107f72bfec.1747753388.git.phillip.wood@dunelm.org.uk>
On Tue, May 20, 2025 at 04:04:24PM +0100, Phillip Wood wrote:
> diff --git a/midx-write.c b/midx-write.c
> index dd3b3070e55..c7cb2315431 100644
> --- a/midx-write.c
> +++ b/midx-write.c
> @@ -1699,19 +1699,23 @@ static void fill_included_packs_batch(struct repository *r,
> for (i = 0; total_size < batch_size && i < m->num_packs; i++) {
> int pack_int_id = pack_info[i].pack_int_id;
> struct packed_git *p = m->packs[pack_int_id];
> - size_t expected_size;
> + uint64_t expected_size;
>
> if (!want_included_pack(r, m, pack_kept_objects, pack_int_id))
> continue;
>
> - expected_size = st_mult(p->pack_size,
> - pack_info[i].referenced_objects);
> + expected_size = uint64_mult(p->pack_size,
> + pack_info[i].referenced_objects);
Makes sense.
> expected_size /= p->num_objects;
>
> if (expected_size >= batch_size)
> continue;
>
> - total_size += expected_size;
> + if (unsigned_add_overflows (total_size, (size_t)expected_size))
> + total_size = SIZE_MAX;
> + else
> + total_size += expected_size;
> +
But this part I am not totally following. Here we have 'total_size'
declared as a size_t, and 'expected_size' as a uint64_t, and (on 32-bit
systems) down-cast to a 32-bit unsigned value.
So if 'expected_size' is larger than SIZE_MAX, we should set
'total_size' to SIZE_MAX. But that may not happen, say if
'expected_size' is (2^32-1<<32). Should total_size also be declared as a
uint64_t here?
I wondered if it might be easier to count down from the given batch_size
instead of adding up to it (requiring the second
unsigned_add_overflows() check). I tried it out and got this instead:
--- 8< ---
diff --git a/midx-write.c b/midx-write.c
index 48a4dc5e94..f81dd9ff6d 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -1671,7 +1671,7 @@ static void fill_included_packs_batch(struct repository *r,
size_t batch_size)
{
uint32_t i;
- size_t total_size;
+ uint64_t remaining = batch_size;
struct repack_info *pack_info;
int pack_kept_objects = 0;
@@ -1695,23 +1695,23 @@ static void fill_included_packs_batch(struct repository *r,
QSORT(pack_info, m->num_packs, compare_by_mtime);
- total_size = 0;
- for (i = 0; total_size < batch_size && i < m->num_packs; i++) {
+ for (i = 0; i < m->num_packs; i++) {
int pack_int_id = pack_info[i].pack_int_id;
struct packed_git *p = m->packs[pack_int_id];
- size_t expected_size;
+ uint64_t expected_size, factor;
if (!want_included_pack(r, m, pack_kept_objects, pack_int_id))
continue;
- expected_size = st_mult(p->pack_size,
- pack_info[i].referenced_objects);
- expected_size /= p->num_objects;
+ factor = pack_info[i].referenced_objects / p->num_objects;
+ if (p->pack_size > UINT64_MAX / factor)
+ die(...);
- if (expected_size >= batch_size)
- continue;
+ expected_size = p->pack_size * factor;
+ if (expected_size > remaining)
+ break;
- total_size += expected_size;
+ remaining -= expected_size;
include_pack[pack_int_id] = 1;
}
--- >8 ---
That reduces the two overflow checks down to one, and avoids the need to
introduce a uint64_t-specific variant of the st_add() function.
Thanks,
Taylor
next prev parent reply other threads:[~2025-05-20 17:54 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-20 15:04 [PATCH 0/4] midx repack: fix overflow on 32 bit systems Phillip Wood
2025-05-20 15:04 ` [PATCH 1/4] midx repack: avoid integer " Phillip Wood
2025-05-20 17:54 ` Taylor Blau [this message]
2025-05-21 15:19 ` Phillip Wood
2025-05-23 0:34 ` Taylor Blau
2025-05-21 13:10 ` D. Ben Knoble
2025-05-21 15:01 ` Junio C Hamano
2025-05-21 15:20 ` Phillip Wood
2025-05-20 15:04 ` [PATCH 2/4] midx repack: avoid potential integer overflow on 64 " Phillip Wood
2025-05-20 17:59 ` Taylor Blau
2025-05-21 15:20 ` Phillip Wood
2025-05-20 15:04 ` [PATCH 3/4] midx: avoid negative array index Phillip Wood
2025-05-20 17:58 ` Taylor Blau
2025-05-20 15:04 ` [PATCH 4/4] midx docs: clarify tie breaking Phillip Wood
2025-05-20 18:07 ` Taylor Blau
2025-05-21 15:20 ` Phillip Wood
2025-05-21 13:14 ` D. Ben Knoble
2025-05-22 15:55 ` [PATCH v2 0/4] midx repack: fix overflow on 32 bit systems Phillip Wood
2025-05-22 15:55 ` [PATCH v2 1/4] midx repack: avoid integer " Phillip Wood
2025-05-22 15:55 ` [PATCH v2 2/4] midx repack: avoid potential integer overflow on 64 " Phillip Wood
2025-05-22 15:55 ` [PATCH v2 3/4] midx: avoid negative array index Phillip Wood
2025-05-22 15:55 ` [PATCH v2 4/4] midx docs: clarify tie breaking Phillip Wood
2025-05-23 0:36 ` [PATCH v2 0/4] midx repack: fix overflow on 32 bit systems Taylor Blau
2025-05-27 8:26 ` Phillip Wood
2025-05-27 15:42 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aCzBvvZDS2OFJ30h@nand.local \
--to=me@ttaylorr.com \
--cc=git@vger.kernel.org \
--cc=phillip.wood123@gmail.com \
--cc=phillip.wood@dunelm.org.uk \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.