Date: Tue, 14 Apr 2026 11:26:00 +0100
From: David Laight
To: "Arnd Bergmann"
Cc: "Lukas Wunner", "Andy Shevchenko", "Herbert Xu",
 "David S . Miller", "Andrew Morton", "Andrey Ryabinin",
 "Ignat Korchagin", "Stefan Berger", linux-crypto@vger.kernel.org,
 linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com,
 "Alexander Potapenko", "Andrey Konovalov", "Dmitry Vyukov",
 "Vincenzo Frascino"
Subject: Re: [PATCH] crypto: ecc - Unbreak the build on arm with CONFIG_KASAN_STACK=y
Message-ID: <20260414112600.553e7c44@pumpkin>
In-Reply-To:
References: <05d3e296-1b61-4ab4-9bec-6c11407e6f89@app.fastmail.com>
X-Mailer: Claws Mail 4.1.1 (GTK 3.24.38; arm-unknown-linux-gnueabihf)
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Mon, 13 Apr 2026 22:32:24 +0200
"Arnd Bergmann" wrote:

> On Mon, Apr 13, 2026, at 21:46, Lukas Wunner wrote:
> > On Mon, Apr 13, 2026 at 05:42:39PM +0200, Arnd Bergmann wrote:
> >> On Wed, Apr 8, 2026, at 15:36, Lukas Wunner wrote:
> >
> > Attached please find the Assembler output created by gcc -save-temps,
> > both the original version and the one with limited inlining.
> >
> > The former requires a 1360 bytes stack frame, the latter 1232 bytes.
> > E.g.
> > xycz_initial_double() is not inlined into ecc_point_mult(),
> > together with all its recursive baggage, so the latter version
> > contains two branch instructions to that function which the former
> > (original) version does not contain.
>
> Thanks!
>
> So it indeed appears that the problem does not go away but only
> stays below the arbitrary threshold of 1280 bytes (which was
> recently raised). I would not trust that to actually be the
> case across all architectures then, as there are some targets
> like mips or parisc tend to use even more stack space than
> arm. With your current patch, that means there is a good chance
> the problem will come back later.

Not only that, the 'stack frame size' is just a proxy for total stack use -
which is a lot harder to calculate.

I've a cunning plan to use clang's function prototype hashing to do a static
stack calculation that includes indirect calls.
(I did one many years ago for some embedded code that had no indirect calls.)
I suspect it will find all sorts of code paths that 'blow' the kernel stack
out of the water.
A good bet will be snprintf() calls in unusual error paths (even after
ignoring recursive snprintf() calls and all the %px modifiers).

> > At the beginning of the function, it looks like the same register values
> > are stored to multiple locations on the stack. I assume that's what you
> > mean by awful code generation? This odd behavior seems more subdued in
> > the version with limited inlining.
>
> Right. As far as I can tell, the source code is heavily optimized
> for performance, but with the sanitizer active this would likely
> be several times slower, both from the actual sanitizing and
> from the register spilling. I can see how the use of 'u64'
> arrays makes this harder for a 32-bit target with limited
> available registers.

gcc makes a right 'pig's breakfast' of handling u64 items on 32bit.
It gets really horrid on x86 (which has 8 registers including %sp and %bp).
I got the impression it sometimes treats a u64 as being two 32bit values,
and other times as a 64bit value held in two registers.
The former tends to generate better code, but the latter happens if an
asm() block (or probably anything else) ends up with an 'A' constraint
for a value in %edx:%eax.
It will spill constant zero words to stack, and do multiplies by values
that are constant zero.
(I think the code generated for a single call to mul_64_64() will show it all.)

I've just looked at that source.
It seems to be doing 'very wide' arithmetic using u64[].
That will be really horrid on 32bit - it needs to use u32[].
Stopping some of those functions being inlined will help.
Even on 64bit I doubt it'll make that much difference to overall performance.

	David

>
> Arnd
>
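
For illustration only, here is a minimal sketch of what 'very wide'
arithmetic over u32[] could look like. It is not taken from crypto/ecc.c:
the name wide_mul_u32() and its argument order are made up for this
example, and it is a plain schoolbook multiply rather than whatever the
real code does. The point is that every partial product is a single
32x32->64 multiply whose running total fits in one 64bit temporary, so a
32bit compiler never has to juggle a 128bit intermediate.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (assumption, not kernel code): schoolbook multiply
 * of two n-limb numbers stored least significant limb first in 32bit
 * limbs. result must have room for 2 * n limbs and must not alias
 * left or right.
 */
void wide_mul_u32(uint32_t *result, const uint32_t *left,
                  const uint32_t *right, size_t n)
{
    size_t i, j;

    for (i = 0; i < 2 * n; i++)
        result[i] = 0;

    for (i = 0; i < n; i++) {
        uint32_t carry = 0;

        for (j = 0; j < n; j++) {
            /*
             * One 32x32->64 multiply plus two 32bit addends cannot
             * overflow 64 bits:
             * (2^32 - 1)^2 + 2 * (2^32 - 1) == 2^64 - 1.
             */
            uint64_t t = (uint64_t)left[i] * right[j]
                         + result[i + j] + carry;

            result[i + j] = (uint32_t)t;   /* low half stays in place */
            carry = (uint32_t)(t >> 32);   /* high half moves up one limb */
        }
        /* result[i + n] has not been written by any earlier row. */
        result[i + n] = carry;
    }
}
```

With u64 limbs the equivalent inner step needs a 64x64->128 product, which
a 32bit target has to assemble from four 32bit multiplies plus carry
handling. In kernel code a helper like this could also be marked noinline
so that its temporaries do not end up in the caller's stack frame.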