From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from cloud.peff.net (cloud.peff.net [217.216.95.84]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 48EEF413D90 for ; Fri, 15 May 2026 19:50:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.216.95.84 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778874653; cv=none; b=IzrAvrGCdAQWTe5YdEYKxxV7u5H3V1/+MpmyommpxanyBhfrQuWdBauekbge7qHzGExFuInumfTLnv1Y5QSCH9nnPG2qSH+/TL1IzSkg7EYnIbqx2kQb5FeR8O4j0Q9bEZX9x/Zg/hxeW6hbyiy8wcR6oy+NFJKhIO85+9w5bTg= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778874653; c=relaxed/simple; bh=pRxD8MSvvbr1pENJZ1sZJkhINVIQ1Uj71/oREnPOjPY=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=iFieBGNBHrVGvzO/NmUBydUhGazhI7NADnV+wHZSYjQ+617+uvlR1IrrC4m/EP+2au5LfmXT1UqBX3THu77PqEdc5LLPKnuO6a/dniAFa1CK5p6iLVBToYWQ1uApWS0CdmNz7tBUNZ4Hnc9dHH3RmBVnS0BbjTif/bAyRr8BbmA= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=UQsEWL+t; arc=none smtp.client-ip=217.216.95.84 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="UQsEWL+t" Received: (qmail 56915 invoked by uid 106); 15 May 2026 19:50:50 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:content-transfer-encoding:in-reply-to; s=20240930; bh=pRxD8MSvvbr1pENJZ1sZJkhINVIQ1Uj71/oREnPOjPY=; b=UQsEWL+tXDB03f3MnI0hzlBi/EWzHOwA1yGDLA1TnkdCR9nWKmLzkHlstONUolvKxjCrVBZmy+ceSgW3Ms2I25k9tJJZMn1gKKTkct2No6pCJIziVbsBZwMD9TKIA/C2bi62w3PmTi/B9NCJFAPCribm8x7ljDXl8iVP3uSDMjqgT42U6oBmiEmsGBAO/zvo46F8eyptxl1bNVF8LEYaN6NGepdRHJYmvYXxTrqEU7I0zMZvdfeT1uumIJY7kIOT4+VnSGrUHZLlC+DG8Y0Iabcj6o4k7DuL++QXL+wZsJt1uDzLROCcRLW0+kVlAYRB8ceXzM4tlUkwwosw2UjC9Q== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Fri, 15 May 2026 19:50:50 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 105811 invoked by uid 111); 15 May 2026 19:50:49 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 15 May 2026 15:50:49 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 15 May 2026 15:50:49 -0400 From: Jeff King To: =?utf-8?B?UmVuw6k=?= Scharfe Cc: Git List , Junio C Hamano Subject: Re: [PATCH] evaluate the second argument of ALLOC_GROW only once Message-ID: <20260515195049.GA149960@coredump.intra.peff.net> References: <323f5677-301b-4d7a-b552-6606597c2b1f@web.de> <20260515190818.GA98370@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20260515190818.GA98370@coredump.intra.peff.net> On Fri, May 15, 2026 at 03:08:18PM -0400, Jeff King wrote: > On Fri, May 15, 2026 at 08:16:50PM +0200, René Scharfe wrote: > > > + size_t alloc_grow_new_alloc_; \ > > + if (st_alloc_nr((nr), (alloc), &alloc_grow_new_alloc_)) { \ > > + alloc = alloc_grow_new_alloc_; \ > > + REALLOC_ARRAY(x, alloc_grow_new_alloc_); \ > > } \ > > What happens if a caller passes in an argument that isn't a size_t? > We'll check for overflow in the size_t space, and then truncate it when > we assign to alloc, I think. > > I think we generally try to hold allocations in size_t these days, but > I'd be surprised if there weren't a few "int" holdouts. Grepping around, > alloc_node() seems to be an example. > > BTW, non-size_t arguments nullifies my earlier hand-waving around "nr + > 1 overflowing implies we've filled up the address space". But we are > still protected in the existing code by the: > > if (alloc_nr(alloc) < (nr)) > alloc = (nr); > > logic. But with your patch, that all happens in the size_t space, so I > think it would actually introduce possible array overflows when the > caller is using a smaller type. Hmm, playing with it and looking a little closer, I think we don't end up overflowing the buffer because you use the size_t for REALLOC_ARRAY(). So the result is big, but then "alloc" is truncated. And then on the next call, we think "oh no, the allocation is way too small" because we are using the truncated value. So we try to size up for every single allocation, even though it's actually big enough, and the program slows to a crawl. ;) For reference, IU was using this hack to play around and demonstrate: diff --git a/git.c b/git.c index 5a40eab8a2..638bbc69e4 100644 --- a/git.c +++ b/git.c @@ -969,6 +969,19 @@ int cmd_main(int argc, const char **argv) cmd = argv[0]; + if (!strcmp(cmd, "foo")) { + unsigned char *buf = NULL; + unsigned nr = 0, alloc = 0; + for (unsigned i = 0; i < UINT_MAX; i++) { + ALLOC_GROW(buf, nr + 1, alloc); + if (i % 313370 == 0) + warning("at i=%u, alloc=%u, nr=%u", i, alloc, nr); + buf[nr++] = i % 256; + } + printf("done, final nr=%u, alloc=%u\n", nr, alloc); + return 0; + } + /* * We use PATH to find git commands, but we prepend some higher * precedence paths: the "--exec-path" option, the GIT_EXEC_PATH The same problem exists in several places in actual code, but I'm not sure how practical it is to trigger. The alloc_node() is counting not just objects, but blocks of objects. So you'd need 2*31 * 1024 objects of one type, which is probably going to run afoul of other limitations. Other cases are similar; for example "yes | git fetch-pack --stdin foo" will grow an array indefinitely, but at one strbuf per line it starts swapping on my 64GB machine at only 350M entries. I think as long as the behavior remains "slow, but we do not overflow any buffers" when you reach these limits, that's OK. Nobody is going to do it in practice, and we just want to make sure that malicious inputs cannot get out-of-bounds writes. It might be worth adding a comment, though, to make sure nobody ever swaps "alloc_grow_new_alloc_" for "alloc" in that macro. -Peff