From: Thomas Gleixner
To: David Stevens, Pasha Tatashin, Linus Walleij, Will Deacon,
    Quentin Perret, Ingo Molnar, Borislav Petkov, Dave Hansen,
    x86@kernel.org, "H. Peter Anvin", Andy Lutomirski, Xin Li,
    Peter Zijlstra, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
    "Liam R. Howlett", Vlastimil Babka, Mike Rapoport,
    Suren Baghdasaryan, Michal Hocko, Uladzislau Rezki, Kees Cook
Cc: David Stevens, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2 12/13] x86: Add support for dynamic kernel stacks via FRED
In-Reply-To: <20260424191456.2679717-13-stevensd@google.com>
References: <20260424191456.2679717-1-stevensd@google.com>
    <20260424191456.2679717-13-stevensd@google.com>
Date: Sat, 27 Jun 2026 00:39:52 +0200
Message-ID: <87zf0hgc3r.ffs@fw13>

On Fri, Apr 24 2026 at 12:14, David Stevens wrote:

As promised, I've come around to look at this part too.

> +#ifdef CONFIG_DYNAMIC_STACK
> +
> +static noinstr unsigned long copy_stack_data(struct pt_regs *regs)
> +{
> +	unsigned long new_sp;
> +	unsigned long data_len;

Just a minor issue.
Please read and comply with the coding standards documented in

  https://docs.kernel.org/process/maintainer-tip.html

> +
> +	new_sp = regs->sp - (FRED_CONFIG_REDZONE_AMOUNT << 6);
> +	new_sp &= FRED_STACK_FRAME_RSP_MASK;
> +	data_len = sizeof(struct fred_frame);
> +	new_sp -= data_len;
> +
> +	memcpy((void *)new_sp, regs, data_len);
> +
> +	return new_sp;
> +}
> +
> +__visible noinstr unsigned long switch_to_kstack(struct pt_regs *regs)
> +{
> +	return copy_stack_data(regs);
> +}
> +
> +#define ALIGN_TO_STACK(addr) ((addr) & ~(THREAD_ALIGN - 1))
> +
> +__visible noinstr unsigned long handle_dynamic_stack_kernel_faults(struct pt_regs *regs)
> +{
> +	unsigned long address;
> +	struct task_struct *tsk;
> +	bool on_stack;
> +
> +	address = fred_event_data(regs);
> +	if (fault_in_kernel_space(address) && !in_nmi()) {

Assume the following scenario:

    CPU runs in user space
    NMI is raised
    NMI runs on SL0
    NMI execution triggers the stack guard page

Don't tell me that's unlikely. As discussed before, unlikely does not
exist. You just have to enable enough debug muck, build with a compiler
which aggressively out-of-lines code (hint: CLANG) and then end up in a
deep perf/trace/bpf callchain.

> +		tsk = task_from_stack_address(address);
> +
> +		if (tsk && dynamic_stack_fault(tsk, address, &on_stack)) {
> +			WARN_ON_ONCE(tsk != current &&
> +				     ALIGN_TO_STACK(regs->sp) != ALIGN_TO_STACK(address));
> +			return 0;
> +		}
> +	}
> +
> +	/*
> +	 * The regular fault handler won't sleep when executing in an
> +	 * atomic context, so we can complete the #PF directly on the
> +	 * #PF stack.

Q: What guarantees that in_atomic() always gives you the right answer?
A: Nothing

Why?

If there is a #PF (not caused by the thread stack guard) on a SL>0 stack
_before_ the kernel reached the point where it increments preempt_count,
then in_atomic() returns false.
As a consequence you copy stack data around to the same place, which
should be benign, but it is well understood that memcpy() source and
destination areas _must_ not overlap. That's UB, no?

I know that should not happen, but that doesn't make it less UB :)

> +	 */
> +	if (in_atomic())
> +		return (unsigned long)regs;
> +	else
> +		return copy_stack_data(regs);

Thanks,

        tglx
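
For illustration only, a minimal userspace sketch of the memcpy() overlap
point above. This is not kernel code and not a proposed fix: struct frame
and copy_frame() are hypothetical names, and the frame layout is invented.
It only shows one way to make a copy well defined when source and
destination may coincide, by skipping the exact self-copy and letting
memmove() handle any other overlap.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a saved register frame; layout is illustrative. */
struct frame {
	unsigned long regs[8];
};

/*
 * Hypothetical helper: copy @len bytes from @src to @dst.
 *
 * memcpy() requires that the two areas do not overlap. If @dst and @src
 * are the same address there is nothing to do, so return early. Any other
 * overlap is handled by memmove(), which is defined for overlapping areas.
 */
static void *copy_frame(void *dst, const void *src, size_t len)
{
	if (dst == src)
		return dst;

	return memmove(dst, src, len);
}
```

Whether the real code should skip the copy, use memmove(), or avoid
reaching that path in the first place is a separate question; the sketch
only demonstrates that the self-copy case can be made well defined.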