Date: Fri, 24 Apr 2026 23:26:37 +0100
From: David Laight
To: Pasha Tatashin
Cc: Dave Hansen, David Stevens, Linus Walleij, Will Deacon, Quentin Perret,
 Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org,
 "H. Peter Anvin", Andy Lutomirski, Xin Li, Peter Zijlstra, Andrew Morton,
 David Hildenbrand, Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka,
 Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Uladzislau Rezki, Kees Cook,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2 00/13] Dynamic Kernel Stacks
Message-ID: <20260424232637.054f15dd@pumpkin>
References: <20260424191456.2679717-1-stevensd@google.com>

On Fri, 24 Apr 2026 21:35:20 +0000
Pasha Tatashin wrote:

> On 04-24 12:41, Dave Hansen wrote:
> > On 4/24/26 12:14, David Stevens wrote:
> > > The question is then: is this approach something that is fundamentally
> > > untenable in the kernel
> >
> > Yes. Fundamentally untenable.
> >
> > Not allowing stack faults has been a wonderful simplification. It's one
> > of those things that just plain makes the kernel easier to maintain.
> > Saving low single digits of system memory is not exactly making me eager
> > to go back to the harder-to-maintain days.
> >
> > I seriously doubt that this 1% is the lowest hanging fruit for memory
> > bloat on these systems. ;)
>
> This true until, in a fleet of millions of machines, you encounter a
> one-in-a-billion chance of a stack overflow. You are then forced to
> double the statically allocated kernel stacks on every machine, paying a
> memory tax even though 99.999..% of threads never exceed 4K. This
> overhead accumulates to petabytes of wasted capacity.

And then you hit a stack fault in some path where you can't sleep and
there isn't any available kernel memory.

An alternative idea is to arrange for some system calls to sleep in userspace,
so when the thread is woken it re-executes the system call.
It then makes sense to assign the kernel stack to the process when it enters
the kernel.
That might mean that you don't need a kernel stack for all the threads
sleeping in futex() - it might even be possible to do the retry in userspace
saving the second kernel entry most of the time.

It is all 'hard and difficult' though.
The easier solution is to rewrite the system code so it doesn't have
1000s of threads :-)

	David

>
> Pasha
>
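
For reference, the "retry in userspace" half of the idea above already has a
familiar shape with today's futex(2): check the word in userspace, enter the
kernel only when it still has the "must wait" value, and re-check after every
wakeup. The sketch below shows that loop only. It does not implement the
proposed change (the sleeping thread still owns a kernel stack here), and the
names wait_for_nonzero(), futex_wait() and kernel_entries are hypothetical,
made up for illustration. It assumes Linux and a C11 compiler.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical counter: how often the loop really entered the kernel. */
static unsigned long kernel_entries;

/* Thin wrapper around the existing FUTEX_WAIT operation of futex(2). */
static long futex_wait(_Atomic uint32_t *word, uint32_t expected)
{
    kernel_entries++;
    return syscall(SYS_futex, (uint32_t *)word, FUTEX_WAIT_PRIVATE,
                   expected, NULL, NULL, 0);
}

/* Block until *word is non-zero and return the value that was seen. */
static uint32_t wait_for_nonzero(_Atomic uint32_t *word)
{
    assert(word != NULL);
    for (;;) {
        /* Userspace check first: if this succeeds no syscall is made. */
        uint32_t v = atomic_load_explicit(word, memory_order_acquire);
        if (v != 0)
            return v;
        /*
         * The kernel sleeps only if *word is still 0. A wakeup, EAGAIN
         * or EINTR all lead back to the userspace check above. A real
         * implementation would also bail out on unexpected errors.
         */
        futex_wait(word, 0);
    }
}
```

A waker would store a non-zero value to the word and then issue FUTEX_WAKE.
When the store lands before the waiter looks, wait_for_nonzero() returns
without any kernel entry at all, which is the "most of the time" saving
mentioned above.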