From: Dave Hansen <dave.hansen@intel.com>
To: David Stevens <stevensd@google.com>
Cc: David Laight <david.laight.linux@gmail.com>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
Linus Walleij <linus.walleij@linaro.org>,
Will Deacon <willdeacon@google.com>,
Quentin Perret <qperret@google.com>,
Thomas Gleixner <tglx@kernel.org>, Ingo Molnar <mingo@redhat.com>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
Andy Lutomirski <luto@kernel.org>, Xin Li <xin@zytor.com>,
Peter Zijlstra <peterz@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <ljs@kernel.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@kernel.org>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Uladzislau Rezki <urezki@gmail.com>, Kees Cook <kees@kernel.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2 00/13] Dynamic Kernel Stacks
Date: Sat, 20 Jun 2026 16:22:08 -0700
Message-ID: <ecd980ee-bba6-4183-94fa-ae4cee630841@intel.com>
In-Reply-To: <CAOiLmNE8okatb6RSonemyMT+BGVGg2h3=i_Epg5zppfp1Ks54g@mail.gmail.com>
On 6/19/26 22:25, David Stevens wrote:
>> Needing memory in the middle of schedule() is generally a no-go. But it's
>> a lot better than not being able to continue _execution_ of a kernel
>> thread at *ALL*, possibly in a non-preemptible context, like when you do
>> it in a #PF.
> I don't think this is different from the current proposal from a
> memory allocation standpoint. Both proposals effectively maintain a
> pool of preallocated pages used to fill the current thread's stack.
> They vary substantially in when the pages are put into the page
> tables, but both need to allocate during schedule().
I think you're saying: "Dave, you didn't solve all of our problems for
us." I'd definitely agree. ;)
I thought I wrote it somewhere, but I either deleted it or it got
ignored. I'll repeat: this PoC series has two big, big sticking points:
 1. It requires allocation in very sticky contexts. It's theoretically
    any code that pushes on the stack. That's a *LOT* of the kernel.
    An allocation failure pretty much means the CPU thread is stuck.
 2. Because those pushes happen almost anywhere, a #PF can happen almost
    anywhere, which widens the places #PF needs to be handled. Thus, the
    angst from the x86 maintainers.
I think I've at least hand-waved a potential path to getting rid of
sticking point #2 in its entirety, and reducing the x86 maintainer angst.
My hand waving also reduces the scope of #1. It removes the need to
allocate memory in some crazy interrupt-disabled region in the I/O
driver interrupt handler holding a bunch of locks when a #MC happens
during an NMI while kswapd was running.
So, yeah, "both need to allocate during schedule()" is factually correct.
But this PoC needs to allocate successfully *EVERYWHERE*. Virtually all
kernel code paths, modulo some very very special areas.
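To make the distinction concrete, here is a userspace toy model of the
"preallocated pool" idea from the quoted text. It is only a sketch, not
code from either proposal: every name in it is hypothetical, and the
3-page target simply assumes 4k pages to match the 12k figure. The
refill step stands in for a schedule()-like point where a failed
allocation can still be reported; the growth step stands in for the
fault path, which only consumes the reserve and never allocates.

```c
/* Userspace toy model, not kernel code. All names are hypothetical. */
#include <stdbool.h>
#include <stddef.h>

#define POOL_TARGET 3	/* assumption: 3 x 4k pages = 12k */

struct stack_pool {
	size_t reserved;	/* preallocated pages, not yet mapped */
	size_t mapped;		/* pages already backing the stack */
};

/* Stand-in for a page allocator that can fail; *avail models free pages. */
static bool toy_alloc_page(size_t *avail)
{
	if (*avail == 0)
		return false;
	(*avail)--;
	return true;
}

/*
 * Refill at a schedule()-like point. A failure is returned to the
 * caller, which is still in a position to react to it.
 */
static bool pool_refill_at_schedule(struct stack_pool *p, size_t *avail)
{
	while (p->reserved < POOL_TARGET) {
		if (!toy_alloc_page(avail))
			return false;
		p->reserved++;
	}
	return true;
}

/*
 * Stack growth (the fault-like path) never allocates; it only moves a
 * page from the reserve to the stack. Returning false models the
 * "stuck" case where the reserve has run dry.
 */
static bool stack_grow_one_page(struct stack_pool *p)
{
	if (p->reserved == 0)
		return false;
	p->reserved--;
	p->mapped++;
	return true;
}
```

In this model the only place an allocation can fail is
pool_refill_at_schedule(). The complaint about the PoC is that its
equivalent of toy_alloc_page() can be reached from almost any code that
pushes on the stack.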
Are you saying that, as an engineering principle, you see needing to
guarantee allocation success of 12k in "virtually all kernel code paths"
and in "schedule()" as equivalent barriers to solving the problem at hand
because they're both non-zero in size?
I suspect not. But it's kinda coming off that way. A bit of coaching for
dealing with grumpy time-constrained maintainers: if they take their
time to help you solve your problem, don't spend undue effort pointing
out the engineering compromises in their proposals. Take more time to
consider the engineering tradeoffs as opposed to simply arguing a lack
of utter perfection.
But, really, my big takeaway from this thread is that the folks pushing
dynamic kernel stacks have a very limited understanding of upstream or
what its priorities are. Probably the single biggest obstacle here is
going to be proving to the long-term maintainers that this isn't another
dump and run operation. I suspect the x86 folks are going to be a bit
more amenable in that territory than our mm friends. <cough>MGLRU<cough>
Either way, welcome to the party! If you want to come help upstream,
there are always patches to review and always bugs to fix.