Date: Mon, 27 Apr 2026 16:31:02 +0000
From: Pasha Tatashin
To: "H. Peter Anvin"
Cc: Dave Hansen, David Stevens, Pasha Tatashin, Linus Walleij,
    Will Deacon, Quentin Perret, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, Dave Hansen, x86@kernel.org, Andy Lutomirski,
    Xin Li, Peter Zijlstra, Andrew Morton, David Hildenbrand,
    Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka, Mike Rapoport,
    Suren Baghdasaryan, Michal Hocko, Uladzislau Rezki, Kees Cook,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2 00/13] Dynamic Kernel Stacks
References: <20260424191456.2679717-1-stevensd@google.com>
    <6369e5ce-74e3-4c68-8053-d7d7d21b6955@zytor.com>
In-Reply-To: <6369e5ce-74e3-4c68-8053-d7d7d21b6955@zytor.com>

On 04-25 02:19, H. Peter Anvin wrote:
> On 2026-04-24 12:41, Dave Hansen wrote:
> > On 4/24/26 12:14, David Stevens wrote:
> >> The question is then: is this approach something that is fundamentally
> >> untenable in the kernel
> >
> > Yes. Fundamentally untenable.
> >
> > Not allowing stack faults has been a wonderful simplification. It's one
> > of those things that just plain makes the kernel easier to maintain.
> > Saving low single digits of system memory is not exactly making me eager
> > to go back to the harder-to-maintain days.
> >
> > I seriously doubt that this 1% is the lowest hanging fruit for memory
> > bloat on these systems. ;)
>
> It is worth noting that this was one of the VERY early design decisions that
> has shaped Linux from the beginning:
>
> - No swapping of kernel memory
> - Kernel stacks are statically allocated
> - Physical RAM is mapped into the kernel at all times
> - A "monolithic" kernel using function calls, not message passing
> - A kernel interface that closely maps to the low-level application API
>   (e.g. each user space thread is a kernel thread.)
> - Kernel ABIs and APIs are subject to evolution; stability is only guaranteed
>   in user space.
>
> Those design decisions are, by and large, what has made Linux Linux: a
> relatively simple, highly performant, and reliable system.

I think there is a bit of survivorship bias in that list. Originally,
there were many other foundational assumptions that have since evolved
as hardware and requirements scaled.

For example, there were assumptions about no dynamic hardware
reconfiguration (no memory/CPU hot-plug), uniform memory access (no
NUMA), and fixed page sizes (no THP or HugeTLB). All of those have
changed, and you, better than most, know of many other such examples.
A more recent example is PREEMPT_RT: the Linux kernel was originally
designed to be non-preemptible.

Even the assumptions in your list, such as "physical RAM is mapped into
the kernel at all times," are evolving: emulated pmem is not mapped, and
guest_memfd plans to allow unmapping memory from the direct map for
security reasons.

Aside from trying our best not to break user space and allowing the
internal kernel API to evolve, the other items are architectural
decisions that can and should adapt to new requirements.

We now have machines with thousands of hardware threads. Running
millions of software threads on such machines is a practical reality,
and at fleet scales, statically allocating kernel stacks for all of
them wastes a massive amount of memory.

The proposed solution won't affect Linux as a whole. It can be
optionally enabled for targeted configurations. Additionally, the max
stack size is still statically set; it simply isn't populated until
actually used.

Pasha