All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Zizhi Wo <wozizhi@huaweicloud.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	jack@suse.com, brauner@kernel.org, hch@lst.de,
	akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org, yangerkun@huawei.com,
	wangkefeng.wang@huawei.com, pangliyuan1@huawei.com,
	xieyuanbin1@huawei.com
Subject: Re: [Bug report] hash_name() may cross page boundary and trigger sleep in RCU context
Date: Tue, 2 Dec 2025 12:43:32 +0000	[thread overview]
Message-ID: <aS7e9CbQXS27sGcd@shell.armlinux.org.uk> (raw)
In-Reply-To: <CAHk-=wh+cFLLi2x6u61pvL07phSyHPVBTo9Lac2uuqK4eRG_=w@mail.gmail.com>

On Fri, Nov 28, 2025 at 09:06:50AM -0800, Linus Torvalds wrote:
> I don't think it's necessarily all that big of a deal. Yeah, this is
> old code, and yeah, it could probably be cleaned up a bit, but at the
> same time, "old and crusty" also means "fairly well tested". This
> whole fault on a kernel address is a fairly unusual case, and as
> mentioned, I *think* the above fix is sufficient.

We have another issue in the code - which has the branch predictor
hardening for spectre issues, which can be called with interrupts
enabled, causing a kernel warning - obviously not good.

There's another issue which PREEMPT_RT has picked up on - which is
that delivering signals via __do_user_fault() with interrupts disabled
causes spinlocks (which can sleep on PREEMPT_RT) to warn.

What I'm thinking is to address both of these by handling kernel space
page faults (which will be permission or PTE-not-present) separately
(not even build tested):

diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 2bc828a1940c..972bce697c6c 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -175,7 +175,8 @@ __do_kernel_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr,
 
 /*
  * Something tried to access memory that isn't in our memory map..
- * User mode accesses just cause a SIGSEGV
+ * User mode accesses just cause a SIGSEGV. Ensure interrupts are enabled
+ * here, which is safe as the fault being handled is from userspace.
  */
 static void
 __do_user_fault(unsigned long addr, unsigned int fsr, unsigned int sig,
@@ -183,8 +184,7 @@ __do_user_fault(unsigned long addr, unsigned int fsr, unsigned int sig,
 {
 	struct task_struct *tsk = current;
 
-	if (addr > TASK_SIZE)
-		harden_branch_predictor();
+	local_irq_enable();
 
 #ifdef CONFIG_DEBUG_USER
 	if (((user_debug & UDBG_SEGV) && (sig == SIGSEGV)) ||
@@ -259,6 +259,38 @@ static inline bool ttbr0_usermode_access_allowed(struct pt_regs *regs)
 }
 #endif
 
+static int __kprobes
+do_kernel_address_page_fault(unsigned long addr, unsigned int fsr,
+			     struct pt_regs *regs)
+{
+	if (user_mode(regs)) {
+		/*
+		 * Fault from user mode for a kernel space address. User mode
+		 * should not be faulting in kernel space, which includes the
+		 * vector/khelper page. Handle the Spectre issues while
+		 * interrupts are still disabled, then send a SIGSEGV. Note
+		 * that __do_user_fault() will enable interrupts.
+		 */
+		harden_branch_predictor();
+		__do_user_fault(addr, fsr, SIGSEGV, SEGV_MAPERR, regs);
+	} else {
+		/*
+		 * Fault from kernel mode. Enable interrupts if they were
+		 * enabled in the parent context. Section (upper page table)
+		 * translation faults are handled via do_translation_fault(),
+		 * so we will only get here for a non-present kernel space
+		 * PTE or kernel space permission fault. Both of these should
+		 * not happen.
+		 */
+		if (interrupts_enabled(regs))
+			local_irq_enable();
+
+		__do_kernel_fault(mm, addr, fsr, regs);
+	}
+
+	return 0;
+}
+
 static int __kprobes
 do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 {
@@ -272,6 +304,8 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	if (kprobe_page_fault(regs, fsr))
 		return 0;
 
+	if (addr >= TASK_SIZE)
+		return do_kernel_address_page_fault(addr, fsr, regs);
 
 	/* Enable interrupts if they were enabled in the parent context. */
 	if (interrupts_enabled(regs))

... and I think there was a bug in the branch predictor handling -
addr == TASK_SIZE should have been included.

Does this look sensible?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!


  parent reply	other threads:[~2025-12-02 12:44 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-26  9:05 [Bug report] hash_name() may cross page boundary and trigger sleep in RCU context Zizhi Wo
2025-11-26 10:19 ` [RFC PATCH] vfs: Fix might sleep in load_unaligned_zeropad() with rcu read lock held Xie Yuanbin
2025-11-26 18:10   ` Al Viro
2025-11-26 18:48     ` Al Viro
2025-11-26 19:05       ` Russell King (Oracle)
2025-11-26 19:26         ` Al Viro
2025-11-26 19:51           ` Russell King (Oracle)
2025-11-26 20:02             ` Al Viro
2025-11-26 22:25               ` david laight
2025-11-26 23:51                 ` Al Viro
2025-11-26 23:31               ` Russell King (Oracle)
2025-11-27  3:03                 ` Xie Yuanbin
2025-11-27  7:20                   ` Sebastian Andrzej Siewior
2025-11-27 11:20                     ` Xie Yuanbin
2025-11-28  1:39           ` Xie Yuanbin
2025-11-26 20:42   ` Al Viro
2025-11-26 10:27 ` [Bug report] hash_name() may cross page boundary and trigger sleep in RCU context Zizhi Wo
2025-11-26 21:12   ` Linus Torvalds
2025-11-27 10:27     ` Will Deacon
2025-11-27 10:57     ` Russell King (Oracle)
2025-11-28 17:06       ` Linus Torvalds
2025-11-29  1:01         ` Zizhi Wo
2025-11-29  1:35           ` Linus Torvalds
2025-11-29  4:08             ` [Bug report] hash_name() may cross page boundary and trigger Xie Yuanbin
2025-11-29  9:08               ` Al Viro
2025-11-29  9:25                 ` Xie Yuanbin
2025-11-29  9:44                   ` Al Viro
2025-11-29 10:05                     ` Xie Yuanbin
2025-11-29 10:45                 ` david laight
2025-11-29  8:54             ` [Bug report] hash_name() may cross page boundary and trigger sleep in RCU context Al Viro
2025-12-01  2:08             ` Zizhi Wo
2025-11-29  2:18         ` [Bug report] hash_name() may cross page boundary and trigger Xie Yuanbin
2025-12-01 13:28         ` [Bug report] hash_name() may cross page boundary and trigger sleep in RCU context Will Deacon
2025-12-02 12:43         ` Russell King (Oracle) [this message]
2025-12-02 13:02           ` Xie Yuanbin
2025-12-02 22:07           ` Linus Torvalds
2025-12-03  1:48             ` Xie Yuanbin
2025-12-05 12:08               ` Russell King (Oracle)
2025-12-08  2:32                 ` Xie Yuanbin
2025-12-08  9:26                   ` David Laight
2025-12-08 10:07                   ` Russell King (Oracle)
2025-12-08 13:18                     ` Xie Yuanbin
2025-12-08 15:43                       ` Russell King (Oracle)
2025-12-09  1:30                         ` Xie Yuanbin
2025-11-26 18:55 ` Al Viro
2025-11-27  2:24   ` Zizhi Wo
2025-11-29  3:37     ` Al Viro
2025-11-30  3:01       ` [RFC][alpha] saner vmalloc handling (was Re: [Bug report] hash_name() may cross page boundary and trigger sleep in RCU context) Al Viro
2025-11-30 11:32         ` david laight
2025-11-30 16:43           ` Al Viro
2025-11-30 18:14             ` Magnus Lindholm
2025-11-30 19:03             ` david laight
2025-11-30 20:31               ` Al Viro
2025-11-30 20:32                 ` Al Viro
2025-11-30 22:16         ` Linus Torvalds
2025-11-30 23:37           ` Al Viro
2025-12-01  2:03       ` [Bug report] hash_name() may cross page boundary and trigger sleep in RCU context Zizhi Wo
2025-11-27 12:59 ` Will Deacon
2025-11-28  1:17   ` Zizhi Wo
2025-11-28  1:18     ` Zizhi Wo
2025-11-28  1:39       ` Zizhi Wo
2025-11-28 12:25         ` Will Deacon
2025-11-29  1:02           ` Zizhi Wo
2025-11-29  3:55             ` Al Viro
2025-12-01  2:38               ` Zizhi Wo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aS7e9CbQXS27sGcd@shell.armlinux.org.uk \
    --to=linux@armlinux.org.uk \
    --cc=akpm@linux-foundation.org \
    --cc=brauner@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=hch@lst.de \
    --cc=jack@suse.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=pangliyuan1@huawei.com \
    --cc=torvalds@linux-foundation.org \
    --cc=wangkefeng.wang@huawei.com \
    --cc=will@kernel.org \
    --cc=wozizhi@huaweicloud.com \
    --cc=xieyuanbin1@huawei.com \
    --cc=yangerkun@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.