From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Hou Tao, Thomas Gleixner,
	syzbot+72aa0161922eba61b50e@syzkaller.appspotmail.com,
	xingwei lee, Sohil Mehta, Alexei Starovoitov, Sasha Levin,
	dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
	mingo@redhat.com, bp@alien8.de, x86@kernel.org, bpf@vger.kernel.org
Subject: [PATCH AUTOSEL 6.7 09/24] x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault()
Date: Thu, 29 Feb 2024 15:36:49 -0500
Message-ID: <20240229203729.2860356-9-sashal@kernel.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240229203729.2860356-1-sashal@kernel.org>
References: <20240229203729.2860356-1-sashal@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
X-stable: review
X-Patchwork-Hint: Ignore
X-stable-base: Linux 6.7.6
Content-Transfer-Encoding: 8bit

From: Hou Tao

[ Upstream commit 32019c659ecfe1d92e3bf9fcdfbb11a7c70acd58 ]

When trying to use copy_from_kernel_nofault() to read vsyscall page
through a bpf program, the following oops was reported:

  BUG: unable to handle page fault for address: ffffffffff600000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 3231067 P4D 3231067 PUD 3233067 PMD 3235067 PTE 0
  Oops: 0000 [#1] PREEMPT SMP PTI
  CPU: 1 PID: 20390 Comm: test_progs ...... 6.7.0+ #58
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ......
  RIP: 0010:copy_from_kernel_nofault+0x6f/0x110
  ......
  Call Trace:
   ? copy_from_kernel_nofault+0x6f/0x110
   bpf_probe_read_kernel+0x1d/0x50
   bpf_prog_2061065e56845f08_do_probe_read+0x51/0x8d
   trace_call_bpf+0xc5/0x1c0
   perf_call_bpf_enter.isra.0+0x69/0xb0
   perf_syscall_enter+0x13e/0x200
   syscall_trace_enter+0x188/0x1c0
   do_syscall_64+0xb5/0xe0
   entry_SYSCALL_64_after_hwframe+0x6e/0x76
  ......
  ---[ end trace 0000000000000000 ]---

The oops is triggered when:

1) A bpf program uses bpf_probe_read_kernel() to read from the vsyscall
page and invokes copy_from_kernel_nofault() which in turn calls
__get_user_asm().

2) Because the vsyscall page address is not readable from kernel space,
a page fault exception is triggered accordingly.

3) handle_page_fault() considers the vsyscall page address as a user
space address instead of a kernel space address. This results in the
fix-up setup by bpf not being applied and a page_fault_oops() is invoked
due to SMAP.

Considering handle_page_fault() has already considered the vsyscall page
address as a userspace address, fix the problem by disallowing vsyscall
page read for copy_from_kernel_nofault().

Originally-by: Thomas Gleixner
Reported-by: syzbot+72aa0161922eba61b50e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/bpf/CAG48ez06TZft=ATH1qh2c5mpS5BT8UakwNkzi6nvK5_djC-4Nw@mail.gmail.com
Reported-by: xingwei lee
Closes: https://lore.kernel.org/bpf/CABOYnLynjBoFZOf3Z4BhaZkc5hx_kHfsjiW+UWLoB=w33LvScw@mail.gmail.com
Signed-off-by: Hou Tao
Reviewed-by: Sohil Mehta
Acked-by: Thomas Gleixner
Link: https://lore.kernel.org/r/20240202103935.3154011-3-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov
Signed-off-by: Sasha Levin
---
 arch/x86/mm/maccess.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/mm/maccess.c b/arch/x86/mm/maccess.c
index 6993f026adec9..42115ac079cfe 100644
--- a/arch/x86/mm/maccess.c
+++ b/arch/x86/mm/maccess.c
@@ -3,6 +3,8 @@
 #include
 #include
 
+#include
+
 #ifdef CONFIG_X86_64
 bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
 {
@@ -15,6 +17,14 @@ bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
 	if (vaddr < TASK_SIZE_MAX + PAGE_SIZE)
 		return false;
 
+	/*
+	 * Reading from the vsyscall page may cause an unhandled fault in
+	 * certain cases.  Though it is at an address above TASK_SIZE_MAX, it is
+	 * usually considered as a user space address.
+	 */
+	if (is_vsyscall_vaddr(vaddr))
+		return false;
+
 	/*
 	 * Allow everything during early boot before 'x86_virt_bits'
 	 * is initialized.  Needed for instruction decoding in early
-- 
2.43.0
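
For readers who want to try the address check outside the kernel, the
following is a minimal user-space sketch of the two early rejections that
copy_from_kernel_nofault_allowed() performs after this patch. It is not
the kernel code. Every sketch_* and SKETCH_* name is hypothetical, and
the constants are assumptions for x86-64 with 4 KiB pages and 4-level
paging (vsyscall page at ffffffffff600000, the faulting address in the
oops above). The canonical-address check that follows in the real
function is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for x86-64, 4 KiB pages, 4-level paging (illustrative). */
#define SKETCH_PAGE_SIZE     4096ULL
#define SKETCH_PAGE_MASK     (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_TASK_SIZE_MAX ((1ULL << 47) - SKETCH_PAGE_SIZE)
#define SKETCH_VSYSCALL_ADDR 0xffffffffff600000ULL

/* True when vaddr lies anywhere inside the single vsyscall page. */
bool sketch_is_vsyscall_vaddr(uint64_t vaddr)
{
	return (vaddr & SKETCH_PAGE_MASK) == SKETCH_VSYSCALL_ADDR;
}

/*
 * Hypothetical stand-in for the patched check: reject user space
 * addresses (plus the guard page above them), then reject the vsyscall
 * page, which sits above TASK_SIZE_MAX but is handled as a user space
 * address by the page fault code.
 */
bool sketch_nofault_allowed(uint64_t vaddr, size_t size)
{
	(void)size; /* unused here, kept to mirror the two-argument shape */

	if (vaddr < SKETCH_TASK_SIZE_MAX + SKETCH_PAGE_SIZE)
		return false;

	if (sketch_is_vsyscall_vaddr(vaddr))
		return false;

	return true;
}
```

Under these assumptions, sketch_nofault_allowed(0xffffffffff600000, 8)
returns false, while the page immediately after it at ffffffffff601000
is still allowed by this simplified check. Before the patch only the
first test existed, so the vsyscall address passed and the read went on
to fault.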