From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from cloud.peff.net (cloud.peff.net [217.216.95.84]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A5B221A6808 for ; Sat, 16 May 2026 02:51:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.216.95.84 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778899883; cv=none; b=nF0ZMOxJDU3VV4c8kZ+frpzyEn2uZ9yrRP4iyES9eLyB8k1hRVHPpkg67Xeb9xckgag55KwcLf8IYrpII1x5EvLWEVGYF/y0Bhg28TguJ4BhgcHLnYR1yShTWW7RM0/9opv7XYmlD2EErZnEzFYxoif1KJklofIu/yBkIGdUZL8= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778899883; c=relaxed/simple; bh=pqVTqga7I2qD6iRbPCEO3fWRtGNquhJNu2nH3gIh31Q=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=KQ2uQECpvDniLfmVZpEcmnLQfN+ejjTz4eWg2vOLtG0Y8HtSTmt2OCnaQ5gQNu2+K0w936PSWYSI+8P/+4JZ5mt5xVJDljPojHr0XVIJSNsLBDaOutcUtNSfV7YUxOHKaGTdCXLIMTn/k1FINROcXVFLSW7G0uJtZlQ8YpNZR4o= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=H7QebALv; arc=none smtp.client-ip=217.216.95.84 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="H7QebALv" Received: (qmail 3656 invoked by uid 106); 16 May 2026 02:51:20 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:content-transfer-encoding:in-reply-to; s=20240930; bh=pqVTqga7I2qD6iRbPCEO3fWRtGNquhJNu2nH3gIh31Q=; b=H7QebALvMuZGnlbvl+zzS1tMVwwgIqJgzFJH3LhDe1tHHV1RlEkhaMBPuFz4Oi3RRjTtvlDifXPDRMMmJoipd5eyXD4/xg8rKVqWg85yM9CR8RrRCik4f+HiY4CCLFbVPLJfvIakl4giWZlSOFifK/Yco7+SKBhXXNCAE1uOWdD8hRnjJKjyLVQ4uumToRkyiCRr3HszmOB+SYgqvRnDEldZZ6A4t8LjpHtNmnqci4sVIPAHLtXlCetCHENMhHuALCe7covcKJUPIRf3gvqlpwyocBiB7GwOfjhh5mX/uw5rIhEK9b7+JGCZdXGGx2CAMdojcpnVUvA5QMdgvbRx2w== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Sat, 16 May 2026 02:51:20 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 7832 invoked by uid 111); 16 May 2026 02:51:19 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 15 May 2026 22:51:19 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 15 May 2026 22:51:19 -0400 From: Jeff King To: =?utf-8?B?UmVuw6k=?= Scharfe Cc: Git List , Junio C Hamano Subject: Re: [PATCH] evaluate the second argument of ALLOC_GROW only once Message-ID: <20260516025119.GA832077@coredump.intra.peff.net> References: <323f5677-301b-4d7a-b552-6606597c2b1f@web.de> <20260515190818.GA98370@coredump.intra.peff.net> <20260515195049.GA149960@coredump.intra.peff.net> <1d79d7bc-5441-4e72-9cb0-e8900f57172c@web.de> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1d79d7bc-5441-4e72-9cb0-e8900f57172c@web.de> On Sat, May 16, 2026 at 01:01:05AM +0200, René Scharfe wrote: > > I think as long as the behavior remains "slow, but we do not overflow > > any buffers" when you reach these limits, that's OK. Nobody is going to > > do it in practice, and we just want to make sure that malicious inputs > > cannot get out-of-bounds writes. It might be worth adding a comment, > > though, to make sure nobody ever swaps "alloc_grow_new_alloc_" for > > "alloc" in that macro. > There is no overflow check in either version (yet), so neither is safe > to operate close to the boundary. Close meaning the intermediate term > (alloc + 16) * 3 being bigger than the maximum value. Yes, but for some definition of safe. Both before and after your patch, as we get close to the boundary the allocation will grow slower than it should, but we'll never write out of bounds. The behavior for the "git foo" I showed earlier is slightly different: - before your patch, ~2GB we stop doubling and instead start growing the array by one at each ALLOC_GROW() call. This is because alloc_nr() overflows to a small value, but the: if (alloc_nr(alloc) < (nr)) alloc = (nr); check kicks in. - after your patch we grow to ~4GB, and then things get super slow. This is because we correctly compute the new allocation as a size_t, but then truncate it while assigning to alloc. So on the next ALLOC_GROW() call, we'll think the buffer is way too small and try to realloc again. I don't know why this is so much slower than the grow-by-one above, but it is. Neither is really correct, but both are in the realm of OK: stupidly large input doesn't perform well, but there's no buffer overflow vulnerability. What I was worried about is what happens if you tweak your patch like this: diff --git a/git-compat-util.h b/git-compat-util.h index 2bc1f43f48..0730dd24ad 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -870,7 +870,7 @@ static inline bool st_alloc_nr(size_t nr, size_t alloc, size_t *outp) size_t alloc_grow_new_alloc_; \ if (st_alloc_nr((nr), (alloc), &alloc_grow_new_alloc_)) { \ alloc = alloc_grow_new_alloc_; \ - REALLOC_ARRAY(x, alloc_grow_new_alloc_); \ + REALLOC_ARRAY(x, alloc); \ } \ } while (0) In that case we really do end up with too-small allocations and out-of-bounds writes. Maybe you saw that coming and that's why you wrote it as you did. But it is definitely subtle enough that I think it would merit a big warning comment that "alloc" and "alloc_grow_new_alloc_" are not necessarily the same type, and hence not necessarily the same value. > Here's a demo program exercising the arithmetic part of the macros: I think the difference isn't in the arithmetic values that come out, but in what is fed to realloc() itself. And in your harness, realloc is just "x = true". If you actually store the value that would be passed to realloc() like this: diff --git a/foo.c.orig b/foo.c index 2fbce8c..7498f36 100644 --- a/foo.c.orig +++ b/foo.c @@ -11,7 +11,7 @@ alloc = (nr); \ else \ alloc = alloc_nr(alloc); \ - x = true; \ + x = alloc; \ } \ } while (0) @@ -31,7 +31,7 @@ static inline bool st_alloc_nr(size_t nr, size_t alloc, size_t *outp) size_t alloc_grow_new_alloc_; \ if (st_alloc_nr((nr), (alloc), &alloc_grow_new_alloc_)) { \ alloc = alloc_grow_new_alloc_; \ - x = true; \ + x = alloc_grow_new_alloc_; \ } \ } while (0) @@ -44,7 +44,7 @@ int main(int argc, char **argv) for (T i = 0;; i++) { for (T j = MIN;; j++) { T alloc1 = j, alloc2 = j; - bool allocated1 = false, allocated2 = false; + size_t allocated1 = 0, allocated2 = 0; ALLOC_GROW1(allocated1, i, alloc1); ALLOC_GROW2(allocated2, i, alloc2); if (alloc1 != alloc2 || allocated1 != allocated2) then you see the differences. For negative values, yeah, you end up with big size_t values. But for an unsigned type you get different small allocations. -Peff