* [PATCH 0/2] mm: make copy_to_kernel_nofault() not fault on user addresses
@ 2024-09-02 5:31 Omar Sandoval
2024-09-02 5:31 ` [PATCH 1/2] mm: rename copy_from_kernel_nofault_allowed() to copy_kernel_nofault_allowed() Omar Sandoval
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Omar Sandoval @ 2024-09-02 5:31 UTC (permalink / raw)
To: linux-mm, Andrew Morton
Cc: Christoph Hellwig, x86, linux-arm-kernel, linux-kernel, loongarch,
linux-mips, linux-parisc, linuxppc-dev, linux-um, kernel-team
From: Omar Sandoval <osandov@fb.com>
Hi,
I hit a case where copy_to_kernel_nofault() will fault (lol): if the
destination address is in userspace and x86 Supervisor Mode Access
Prevention is enabled. Patch 2 has the details and the fix. Patch 1
renames a helper function so that its use in patch 2 makes more sense.
If the rename is too intrusive, I can drop it.
Thanks,
Omar
Omar Sandoval (2):
mm: rename copy_from_kernel_nofault_allowed() to
copy_kernel_nofault_allowed()
mm: make copy_to_kernel_nofault() not fault on user addresses
arch/arm/mm/fault.c | 2 +-
arch/loongarch/mm/maccess.c | 2 +-
arch/mips/mm/maccess.c | 2 +-
arch/parisc/lib/memcpy.c | 2 +-
arch/powerpc/mm/maccess.c | 2 +-
arch/um/kernel/maccess.c | 2 +-
arch/x86/mm/maccess.c | 4 ++--
include/linux/uaccess.h | 2 +-
mm/maccess.c | 10 ++++++----
9 files changed, 15 insertions(+), 13 deletions(-)
--
2.46.0
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/2] mm: rename copy_from_kernel_nofault_allowed() to copy_kernel_nofault_allowed()
2024-09-02 5:31 [PATCH 0/2] mm: make copy_to_kernel_nofault() not fault on user addresses Omar Sandoval
@ 2024-09-02 5:31 ` Omar Sandoval
2024-09-02 5:31 ` [PATCH 2/2] mm: make copy_to_kernel_nofault() not fault on user addresses Omar Sandoval
2024-09-02 6:19 ` [PATCH 0/2] " Christophe Leroy
2 siblings, 0 replies; 10+ messages in thread
From: Omar Sandoval @ 2024-09-02 5:31 UTC (permalink / raw)
To: linux-mm, Andrew Morton
Cc: Christoph Hellwig, x86, linux-arm-kernel, linux-kernel, loongarch,
linux-mips, linux-parisc, linuxppc-dev, linux-um, kernel-team
From: Omar Sandoval <osandov@fb.com>
All of the existing checks are applicable to both the "from" and "to"
directions, and the next patch needs it for copy_to_kernel_nofault().
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
arch/arm/mm/fault.c | 2 +-
arch/loongarch/mm/maccess.c | 2 +-
arch/mips/mm/maccess.c | 2 +-
arch/parisc/lib/memcpy.c | 2 +-
arch/powerpc/mm/maccess.c | 2 +-
arch/um/kernel/maccess.c | 2 +-
arch/x86/mm/maccess.c | 4 ++--
include/linux/uaccess.h | 2 +-
mm/maccess.c | 7 +++----
9 files changed, 12 insertions(+), 13 deletions(-)
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index ab01b51de559..3fef0a59af4f 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -27,7 +27,7 @@
#ifdef CONFIG_MMU
-bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
+bool copy_kernel_nofault_allowed(const void *unsafe_src, size_t size)
{
unsigned long addr = (unsigned long)unsafe_src;
diff --git a/arch/loongarch/mm/maccess.c b/arch/loongarch/mm/maccess.c
index 58173842c6be..70012ac0a5a8 100644
--- a/arch/loongarch/mm/maccess.c
+++ b/arch/loongarch/mm/maccess.c
@@ -3,7 +3,7 @@
#include <linux/uaccess.h>
#include <linux/kernel.h>
-bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
+bool copy_kernel_nofault_allowed(const void *unsafe_src, size_t size)
{
/* highest bit set means kernel space */
return (unsigned long)unsafe_src >> (BITS_PER_LONG - 1);
diff --git a/arch/mips/mm/maccess.c b/arch/mips/mm/maccess.c
index 58173842c6be..70012ac0a5a8 100644
--- a/arch/mips/mm/maccess.c
+++ b/arch/mips/mm/maccess.c
@@ -3,7 +3,7 @@
#include <linux/uaccess.h>
#include <linux/kernel.h>
-bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
+bool copy_kernel_nofault_allowed(const void *unsafe_src, size_t size)
{
/* highest bit set means kernel space */
return (unsigned long)unsafe_src >> (BITS_PER_LONG - 1);
diff --git a/arch/parisc/lib/memcpy.c b/arch/parisc/lib/memcpy.c
index 5fc0c852c84c..78758b9a0f49 100644
--- a/arch/parisc/lib/memcpy.c
+++ b/arch/parisc/lib/memcpy.c
@@ -48,7 +48,7 @@ void * memcpy(void * dst,const void *src, size_t count)
EXPORT_SYMBOL(memcpy);
-bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
+bool copy_kernel_nofault_allowed(const void *unsafe_src, size_t size)
{
if ((unsigned long)unsafe_src < PAGE_SIZE)
return false;
diff --git a/arch/powerpc/mm/maccess.c b/arch/powerpc/mm/maccess.c
index ea821d0ffe16..e0f4c394836c 100644
--- a/arch/powerpc/mm/maccess.c
+++ b/arch/powerpc/mm/maccess.c
@@ -7,7 +7,7 @@
#include <asm/inst.h>
#include <asm/ppc-opcode.h>
-bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
+bool copy_kernel_nofault_allowed(const void *unsafe_src, size_t size)
{
return is_kernel_addr((unsigned long)unsafe_src);
}
diff --git a/arch/um/kernel/maccess.c b/arch/um/kernel/maccess.c
index 8ccd56813f68..3497148e9aa5 100644
--- a/arch/um/kernel/maccess.c
+++ b/arch/um/kernel/maccess.c
@@ -7,7 +7,7 @@
#include <linux/kernel.h>
#include <os.h>
-bool copy_from_kernel_nofault_allowed(const void *src, size_t size)
+bool copy_kernel_nofault_allowed(const void *src, size_t size)
{
void *psrc = (void *)rounddown((unsigned long)src, PAGE_SIZE);
diff --git a/arch/x86/mm/maccess.c b/arch/x86/mm/maccess.c
index 42115ac079cf..be28eda2c0b0 100644
--- a/arch/x86/mm/maccess.c
+++ b/arch/x86/mm/maccess.c
@@ -6,7 +6,7 @@
#include <asm/vsyscall.h>
#ifdef CONFIG_X86_64
-bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
+bool copy_kernel_nofault_allowed(const void *unsafe_src, size_t size)
{
unsigned long vaddr = (unsigned long)unsafe_src;
@@ -36,7 +36,7 @@ bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
return __is_canonical_address(vaddr, boot_cpu_data.x86_virt_bits);
}
#else
-bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
+bool copy_kernel_nofault_allowed(const void *unsafe_src, size_t size)
{
return (unsigned long)unsafe_src >= TASK_SIZE_MAX;
}
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index d8e4105a2f21..297a02b2bd53 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -387,7 +387,7 @@ copy_struct_from_user(void *dst, size_t ksize, const void __user *src,
return 0;
}
-bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size);
+bool copy_kernel_nofault_allowed(const void *unsafe_src, size_t size);
long copy_from_kernel_nofault(void *dst, const void *src, size_t size);
long notrace copy_to_kernel_nofault(void *dst, const void *src, size_t size);
diff --git a/mm/maccess.c b/mm/maccess.c
index 518a25667323..72e9c03ea37f 100644
--- a/mm/maccess.c
+++ b/mm/maccess.c
@@ -7,8 +7,7 @@
#include <linux/uaccess.h>
#include <asm/tlb.h>
-bool __weak copy_from_kernel_nofault_allowed(const void *unsafe_src,
- size_t size)
+bool __weak copy_kernel_nofault_allowed(const void *unsafe_src, size_t size)
{
return true;
}
@@ -28,7 +27,7 @@ long copy_from_kernel_nofault(void *dst, const void *src, size_t size)
if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS))
align = (unsigned long)dst | (unsigned long)src;
- if (!copy_from_kernel_nofault_allowed(src, size))
+ if (!copy_kernel_nofault_allowed(src, size))
return -ERANGE;
pagefault_disable();
@@ -83,7 +82,7 @@ long strncpy_from_kernel_nofault(char *dst, const void *unsafe_addr, long count)
if (unlikely(count <= 0))
return 0;
- if (!copy_from_kernel_nofault_allowed(unsafe_addr, count))
+ if (!copy_kernel_nofault_allowed(unsafe_addr, count))
return -ERANGE;
pagefault_disable();
--
2.46.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/2] mm: make copy_to_kernel_nofault() not fault on user addresses
2024-09-02 5:31 [PATCH 0/2] mm: make copy_to_kernel_nofault() not fault on user addresses Omar Sandoval
2024-09-02 5:31 ` [PATCH 1/2] mm: rename copy_from_kernel_nofault_allowed() to copy_kernel_nofault_allowed() Omar Sandoval
@ 2024-09-02 5:31 ` Omar Sandoval
2024-09-04 7:50 ` Christophe Leroy
2024-09-02 6:19 ` [PATCH 0/2] " Christophe Leroy
2 siblings, 1 reply; 10+ messages in thread
From: Omar Sandoval @ 2024-09-02 5:31 UTC (permalink / raw)
To: linux-mm, Andrew Morton
Cc: Christoph Hellwig, x86, linux-arm-kernel, linux-kernel, loongarch,
linux-mips, linux-parisc, linuxppc-dev, linux-um, kernel-team
From: Omar Sandoval <osandov@fb.com>
I found that on x86, copy_to_kernel_nofault() still faults on addresses
outside of the kernel address range (including NULL):
# echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc
# echo g > /proc/sysrq-trigger
...
[15]kdb> mm 0 1234
[ 94.652476] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 94.652481] #PF: supervisor write access in kernel mode
[ 94.652483] #PF: error_code(0x0002) - not-present page
[ 94.652485] PGD 0 P4D 0
[ 94.652489] Oops: Oops: 0002 [#1] PREEMPT SMP PTI
[ 94.652493] CPU: 15 UID: 0 PID: 619 Comm: bash Not tainted 6.11.0-rc6 #11
[ 94.652497] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 94.652498] RIP: 0010:copy_to_kernel_nofault+0x1c/0x90
[ 94.652505] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 65 48 8b 0d 14 9d 72 4c 83 81 04 1b 00 00 01 48 83 fa 07 76 18 48 8b 06 <48> 89 07 48 83 ea 08 48 83 c7 08 48 83 c6 08 48 83 fa 07 77 e8 48
[ 94.652507] RSP: 0018:ffffa4a640b1fa48 EFLAGS: 00010002
[ 94.652509] RAX: 00000000000004d2 RBX: 0000000000000003 RCX: ffff8a8251020000
[ 94.652510] RDX: 0000000000000008 RSI: ffffa4a640b1fa68 RDI: 0000000000000000
[ 94.652512] RBP: 0000000000000000 R08: ffffa4a640b1f9f0 R09: ffffa4a640b1f9f8
[ 94.652513] R10: 000000000000000a R11: f000000000000000 R12: 0000000000000000
[ 94.652515] R13: 00000000000004d2 R14: 0000000000000000 R15: ffff8a8251020000
[ 94.652518] FS: 00007fa9c15f6b80(0000) GS:ffff8a895f7c0000(0000) knlGS:0000000000000000
[ 94.652520] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 94.652521] CR2: 0000000000000000 CR3: 000000010ff72004 CR4: 0000000000770ef0
[ 94.652522] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 94.652523] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 94.652524] PKRU: 55555554
[ 94.652525] Call Trace:
[ 94.652528] <TASK>
[ 94.652534] ? __die+0x24/0x70
[ 94.652538] ? page_fault_oops+0x15a/0x480
[ 94.652543] ? kdb_msg_write.part.0+0x3e/0xb0
[ 94.652547] ? vkdb_printf+0x1a6/0x8b0
[ 94.652550] ? exc_page_fault+0x72/0x160
[ 94.652555] ? asm_exc_page_fault+0x26/0x30
[ 94.652564] ? copy_to_kernel_nofault+0x1c/0x90
[ 94.652566] kdb_putarea_size+0x15/0x80
[ 94.652570] kdb_putword+0x72/0xc0
[ 94.652573] kdb_mm+0xdc/0x130
[ 94.652576] kdb_parse+0x2b7/0x6c0
[ 94.652578] ? kdb_getstr+0x40a/0x910
[ 94.652581] kdb_main_loop+0x4a0/0xa40
[ 94.652584] ? module_event+0x10/0x10
[ 94.652590] kdb_stub+0x1ae/0x3f0
[ 94.652594] kgdb_cpu_enter+0x2a8/0x5f0
[ 94.652599] kgdb_handle_exception+0xbd/0x100
[ 94.652605] __kgdb_notify+0x30/0xd0
[ 94.652610] kgdb_notify+0x15/0x30
[ 94.652612] notifier_call_chain+0x5b/0xd0
[ 94.652618] notify_die+0x53/0x80
[ 94.652622] exc_int3+0xf9/0x130
[ 94.652626] asm_exc_int3+0x39/0x40
Note that copy_to_kernel_nofault() uses pagefault_disable(), but it
still faults. This is because with Supervisor Mode Access Prevention
(SMAP) enabled, do_user_addr_fault() Oopses on a fault for a user
address from kernel space _before_ checking faulthandler_disabled().
copy_from_kernel_nofault() avoids this by checking that the address is
in the kernel before doing the actual memory access. Do the same in
copy_to_kernel_nofault() so that we get an error as expected:
# echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc
# echo g > /proc/sysrq-trigger
...
[17]kdb> mm 0 1234
kdb_putarea_size: Bad address 0x0
diag: -21: Invalid address
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
mm/maccess.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/mm/maccess.c b/mm/maccess.c
index 72e9c03ea37f..d67dee51a1cc 100644
--- a/mm/maccess.c
+++ b/mm/maccess.c
@@ -61,6 +61,9 @@ long copy_to_kernel_nofault(void *dst, const void *src, size_t size)
if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS))
align = (unsigned long)dst | (unsigned long)src;
+ if (!copy_kernel_nofault_allowed(dst, size))
+ return -ERANGE;
+
pagefault_disable();
if (!(align & 7))
copy_to_kernel_nofault_loop(dst, src, size, u64, Efault);
--
2.46.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 0/2] mm: make copy_to_kernel_nofault() not fault on user addresses
2024-09-02 5:31 [PATCH 0/2] mm: make copy_to_kernel_nofault() not fault on user addresses Omar Sandoval
2024-09-02 5:31 ` [PATCH 1/2] mm: rename copy_from_kernel_nofault_allowed() to copy_kernel_nofault_allowed() Omar Sandoval
2024-09-02 5:31 ` [PATCH 2/2] mm: make copy_to_kernel_nofault() not fault on user addresses Omar Sandoval
@ 2024-09-02 6:19 ` Christophe Leroy
2024-09-02 6:31 ` Omar Sandoval
2 siblings, 1 reply; 10+ messages in thread
From: Christophe Leroy @ 2024-09-02 6:19 UTC (permalink / raw)
To: Omar Sandoval, linux-mm, Andrew Morton
Cc: Christoph Hellwig, x86, linux-arm-kernel, linux-kernel, loongarch,
linux-mips, linux-parisc, linuxppc-dev, linux-um, kernel-team
Le 02/09/2024 à 07:31, Omar Sandoval a écrit :
> [Vous ne recevez pas souvent de courriers de osandov@osandov.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
>
> From: Omar Sandoval <osandov@fb.com>
>
> Hi,
>
> I hit a case where copy_to_kernel_nofault() will fault (lol): if the
> destination address is in userspace and x86 Supervisor Mode Access
> Prevention is enabled. Patch 2 has the details and the fix. Patch 1
> renames a helper function so that its use in patch 2 makes more sense.
> If the rename is too intrusive, I can drop it.
The name of the function is "copy_to_kernel". If the destination is a
user address, it is not a copy to kernel but a copy to user and you
already have the function copy_to_user() for that. copy_to_user()
properly handles SMAP.
Christophe
>
> Thanks,
> Omar
>
> Omar Sandoval (2):
> mm: rename copy_from_kernel_nofault_allowed() to
> copy_kernel_nofault_allowed()
> mm: make copy_to_kernel_nofault() not fault on user addresses
>
> arch/arm/mm/fault.c | 2 +-
> arch/loongarch/mm/maccess.c | 2 +-
> arch/mips/mm/maccess.c | 2 +-
> arch/parisc/lib/memcpy.c | 2 +-
> arch/powerpc/mm/maccess.c | 2 +-
> arch/um/kernel/maccess.c | 2 +-
> arch/x86/mm/maccess.c | 4 ++--
> include/linux/uaccess.h | 2 +-
> mm/maccess.c | 10 ++++++----
> 9 files changed, 15 insertions(+), 13 deletions(-)
>
> --
> 2.46.0
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/2] mm: make copy_to_kernel_nofault() not fault on user addresses
2024-09-02 6:19 ` [PATCH 0/2] " Christophe Leroy
@ 2024-09-02 6:31 ` Omar Sandoval
2024-09-02 8:56 ` David Hildenbrand
0 siblings, 1 reply; 10+ messages in thread
From: Omar Sandoval @ 2024-09-02 6:31 UTC (permalink / raw)
To: Christophe Leroy
Cc: linux-mm, Andrew Morton, Christoph Hellwig, x86, linux-arm-kernel,
linux-kernel, loongarch, linux-mips, linux-parisc, linuxppc-dev,
linux-um, kernel-team
On Mon, Sep 02, 2024 at 08:19:33AM +0200, Christophe Leroy wrote:
>
>
> Le 02/09/2024 à 07:31, Omar Sandoval a écrit :
> > [Vous ne recevez pas souvent de courriers de osandov@osandov.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
> >
> > From: Omar Sandoval <osandov@fb.com>
> >
> > Hi,
> >
> > I hit a case where copy_to_kernel_nofault() will fault (lol): if the
> > destination address is in userspace and x86 Supervisor Mode Access
> > Prevention is enabled. Patch 2 has the details and the fix. Patch 1
> > renames a helper function so that its use in patch 2 makes more sense.
> > If the rename is too intrusive, I can drop it.
>
> The name of the function is "copy_to_kernel". If the destination is a user
> address, it is not a copy to kernel but a copy to user and you already have
> the function copy_to_user() for that. copy_to_user() properly handles SMAP.
I'm not trying to copy to user. I am (well, KDB is) trying to copy to an
arbitrary address, and I want it to return an error instead of crashing
if the address is not a valid kernel address. As far as I can tell, that
is the whole point of copy_to_kernel_nofault().
Thanks,
Omar
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/2] mm: make copy_to_kernel_nofault() not fault on user addresses
2024-09-02 6:31 ` Omar Sandoval
@ 2024-09-02 8:56 ` David Hildenbrand
2024-09-02 15:26 ` Omar Sandoval
0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand @ 2024-09-02 8:56 UTC (permalink / raw)
To: Omar Sandoval, Christophe Leroy
Cc: linux-mm, Andrew Morton, Christoph Hellwig, x86, linux-arm-kernel,
linux-kernel, loongarch, linux-mips, linux-parisc, linuxppc-dev,
linux-um, kernel-team
On 02.09.24 08:31, Omar Sandoval wrote:
> On Mon, Sep 02, 2024 at 08:19:33AM +0200, Christophe Leroy wrote:
>>
>>
>> Le 02/09/2024 à 07:31, Omar Sandoval a écrit :
>>> [Vous ne recevez pas souvent de courriers de osandov@osandov.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
>>>
>>> From: Omar Sandoval <osandov@fb.com>
>>>
>>> Hi,
>>>
>>> I hit a case where copy_to_kernel_nofault() will fault (lol): if the
>>> destination address is in userspace and x86 Supervisor Mode Access
>>> Prevention is enabled. Patch 2 has the details and the fix. Patch 1
>>> renames a helper function so that its use in patch 2 makes more sense.
>>> If the rename is too intrusive, I can drop it.
>>
>> The name of the function is "copy_to_kernel". If the destination is a user
>> address, it is not a copy to kernel but a copy to user and you already have
>> the function copy_to_user() for that. copy_to_user() properly handles SMAP.
>
> I'm not trying to copy to user. I am (well, KDB is) trying to copy to an
> arbitrary address, and I want it to return an error instead of crashing
> if the address is not a valid kernel address. As far as I can tell, that
> is the whole point of copy_to_kernel_nofault().
The thing is that you (well, KDB) triggers something that would be
considered a real BUG when triggered from "ordinary" (non-debugging) code.
But now I am confused: "if the destination address is in userspace" does
not really make sense in the context of KDB, no?
[15]kdb> mm 0 1234
[ 94.652476] BUG: kernel NULL pointer dereference, address:
0000000000000000
Why is address 0 in "user space"? "Which" user space?
Isn't the problem here that KDB lets you blindly write to any
non-existing memory address?
Likely it should do some proper filtering like we do in fs/proc/kcore.c:
Take a look at the KCORE_RAM case where we make sure the page exists, is
online and may be accessed. Only then, we trigger a
copy_from_kernel_nofault(). Note that the KCORE_USER is a corner case
only for some special thingies on x86 (vsyscall), and can be ignored for
our case here.
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/2] mm: make copy_to_kernel_nofault() not fault on user addresses
2024-09-02 8:56 ` David Hildenbrand
@ 2024-09-02 15:26 ` Omar Sandoval
2024-09-02 16:39 ` David Hildenbrand
0 siblings, 1 reply; 10+ messages in thread
From: Omar Sandoval @ 2024-09-02 15:26 UTC (permalink / raw)
To: David Hildenbrand
Cc: Christophe Leroy, linux-mm, Andrew Morton, Christoph Hellwig, x86,
linux-arm-kernel, linux-kernel, loongarch, linux-mips,
linux-parisc, linuxppc-dev, linux-um, kernel-team
On Mon, Sep 02, 2024 at 10:56:27AM +0200, David Hildenbrand wrote:
> On 02.09.24 08:31, Omar Sandoval wrote:
> > On Mon, Sep 02, 2024 at 08:19:33AM +0200, Christophe Leroy wrote:
> > >
> > >
> > > Le 02/09/2024 à 07:31, Omar Sandoval a écrit :
> > > > [Vous ne recevez pas souvent de courriers de osandov@osandov.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
> > > >
> > > > From: Omar Sandoval <osandov@fb.com>
> > > >
> > > > Hi,
> > > >
> > > > I hit a case where copy_to_kernel_nofault() will fault (lol): if the
> > > > destination address is in userspace and x86 Supervisor Mode Access
> > > > Prevention is enabled. Patch 2 has the details and the fix. Patch 1
> > > > renames a helper function so that its use in patch 2 makes more sense.
> > > > If the rename is too intrusive, I can drop it.
> > >
> > > The name of the function is "copy_to_kernel". If the destination is a user
> > > address, it is not a copy to kernel but a copy to user and you already have
> > > the function copy_to_user() for that. copy_to_user() properly handles SMAP.
> >
> > I'm not trying to copy to user. I am (well, KDB is) trying to copy to an
> > arbitrary address, and I want it to return an error instead of crashing
> > if the address is not a valid kernel address. As far as I can tell, that
> > is the whole point of copy_to_kernel_nofault().
>
> The thing is that you (well, KDB) triggers something that would be
> considered a real BUG when triggered from "ordinary" (non-debugging) code.
If that's the case, then it's a really weird inconsistency that it's OK
to call copy_from_kernel_nofault() with an invalid address but a bug to
call copy_to_kernel_nofault() on the same address. Again, isn't the
whole point of these functions to fail gracefully instead of crashing on
invalid addresses? (Modulo the offline and hwpoison cases you mention
for /proc/kcore.)
> But now I am confused: "if the destination address is in userspace" does not
> really make sense in the context of KDB, no?
>
> [15]kdb> mm 0 1234
> [ 94.652476] BUG: kernel NULL pointer dereference, address:
> 0000000000000000
>
> Why is address 0 in "user space"? "Which" user space?
Sure, it's not really user space, but it's below TASK_SIZE_MAX, so
things like handle_page_fault() and fault_in_kernel_space() treat it as
if it were a user address. I could
s/userspace address/address that is less than TASK_SIZE_MAX or is_vsyscall_vaddr(address)/.
> Isn't the problem here that KDB lets you blindly write to any non-existing
> memory address?
>
>
> Likely it should do some proper filtering like we do in fs/proc/kcore.c:
>
> Take a look at the KCORE_RAM case where we make sure the page exists, is
> online and may be accessed. Only then, we trigger a
> copy_from_kernel_nofault(). Note that the KCORE_USER is a corner case only
> for some special thingies on x86 (vsyscall), and can be ignored for our case
> here.
Sure, it would be better to harden KDB against all of these special
cases. But you can break things in all sorts of fun ways with a
debugger, anyways. The point of this patch is that it's nonsense that a
function named copy_to_kernel_nofault() does indeed fault in a trivial
case like address < TASK_SIZE_MAX.
Thanks,
Omar
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/2] mm: make copy_to_kernel_nofault() not fault on user addresses
2024-09-02 15:26 ` Omar Sandoval
@ 2024-09-02 16:39 ` David Hildenbrand
0 siblings, 0 replies; 10+ messages in thread
From: David Hildenbrand @ 2024-09-02 16:39 UTC (permalink / raw)
To: Omar Sandoval
Cc: Christophe Leroy, linux-mm, Andrew Morton, Christoph Hellwig, x86,
linux-arm-kernel, linux-kernel, loongarch, linux-mips,
linux-parisc, linuxppc-dev, linux-um, kernel-team
On 02.09.24 17:26, Omar Sandoval wrote:
> On Mon, Sep 02, 2024 at 10:56:27AM +0200, David Hildenbrand wrote:
>> On 02.09.24 08:31, Omar Sandoval wrote:
>>> On Mon, Sep 02, 2024 at 08:19:33AM +0200, Christophe Leroy wrote:
>>>>
>>>>
>>>> Le 02/09/2024 à 07:31, Omar Sandoval a écrit :
>>>>> [Vous ne recevez pas souvent de courriers de osandov@osandov.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
>>>>>
>>>>> From: Omar Sandoval <osandov@fb.com>
>>>>>
>>>>> Hi,
>>>>>
>>>>> I hit a case where copy_to_kernel_nofault() will fault (lol): if the
>>>>> destination address is in userspace and x86 Supervisor Mode Access
>>>>> Prevention is enabled. Patch 2 has the details and the fix. Patch 1
>>>>> renames a helper function so that its use in patch 2 makes more sense.
>>>>> If the rename is too intrusive, I can drop it.
>>>>
>>>> The name of the function is "copy_to_kernel". If the destination is a user
>>>> address, it is not a copy to kernel but a copy to user and you already have
>>>> the function copy_to_user() for that. copy_to_user() properly handles SMAP.
>>>
>>> I'm not trying to copy to user. I am (well, KDB is) trying to copy to an
>>> arbitrary address, and I want it to return an error instead of crashing
>>> if the address is not a valid kernel address. As far as I can tell, that
>>> is the whole point of copy_to_kernel_nofault().
>>
>> The thing is that you (well, KDB) triggers something that would be
>> considered a real BUG when triggered from "ordinary" (non-debugging) code.
>
> If that's the case, then it's a really weird inconsistency that it's OK
> to call copy_from_kernel_nofault() with an invalid address but a bug to
> call copy_to_kernel_nofault() on the same address. Again, isn't the
> whole point of these functions to fail gracefully instead of crashing on
> invalid addresses? (Modulo the offline and hwpoison cases you mention
> for /proc/kcore.)
I assume the difference is mostly historically, because usually, when
modifying something (ftrace, live patch, kdb) you better know what you
want to modify actually exist and can be modified. IOW, you usually
read-before-weite.
In contrast, things like /proc/kcore, (I think) while limiting it to
sane addresses, might still read from areas where we remove entries from
the directmap (e.g., secretmem), I think.
Like, in a compiler, modifying a variable you didn't read before is
rather rare as well. If you would have tried to read it, the
copy_from_kernel_nofault() would have failed.
I agree that the difference is weird, and likely really "nobody ran into
this before in sane use cases".
>
>> But now I am confused: "if the destination address is in userspace" does not
>> really make sense in the context of KDB, no?
>>
>> [15]kdb> mm 0 1234
>> [ 94.652476] BUG: kernel NULL pointer dereference, address:
>> 0000000000000000
>>
>> Why is address 0 in "user space"? "Which" user space?
>
> Sure, it's not really user space, but it's below TASK_SIZE_MAX, so
> things like handle_page_fault() and fault_in_kernel_space() treat it as
> if it were a user address. I could
> s/userspace address/address that is less than TASK_SIZE_MAX or is_vsyscall_vaddr(address)/.
Ah, okay, that's x86 specifics detail in
copy_from_kernel_nofault_allowed(), thanks.
>
>> Isn't the problem here that KDB lets you blindly write to any non-existing
>> memory address?
>>
>>
>> Likely it should do some proper filtering like we do in fs/proc/kcore.c:
>>
>> Take a look at the KCORE_RAM case where we make sure the page exists, is
>> online and may be accessed. Only then, we trigger a
>> copy_from_kernel_nofault(). Note that the KCORE_USER is a corner case only
>> for some special thingies on x86 (vsyscall), and can be ignored for our case
>> here.
>
> Sure, it would be better to harden KDB against all of these special
> cases. But you can break things in all sorts of fun ways with a
> debugger, anyways. The point of this patch is that it's nonsense that a
> function named copy_to_kernel_nofault() does indeed fault in a trivial
> case like address < TASK_SIZE_MAX.
Yes, because the write-without-read to kernel memory "you don't know
even exists" is rather ... weird :)
Anyhow, no strong opinion here. Patches look simple enough.
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: make copy_to_kernel_nofault() not fault on user addresses
2024-09-02 5:31 ` [PATCH 2/2] mm: make copy_to_kernel_nofault() not fault on user addresses Omar Sandoval
@ 2024-09-04 7:50 ` Christophe Leroy
2024-09-04 22:56 ` Omar Sandoval
0 siblings, 1 reply; 10+ messages in thread
From: Christophe Leroy @ 2024-09-04 7:50 UTC (permalink / raw)
To: Omar Sandoval, linux-mm, Andrew Morton, Benjamin Gray,
Christopher M. Riedl
Cc: Christoph Hellwig, x86, linux-arm-kernel, linux-kernel, loongarch,
linux-mips, linux-parisc, linuxppc-dev, linux-um, kernel-team
Hi,
Le 02/09/2024 à 07:31, Omar Sandoval a écrit :
> [Vous ne recevez pas souvent de courriers de osandov@osandov.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
>
> From: Omar Sandoval <osandov@fb.com>
>
> I found that on x86, copy_to_kernel_nofault() still faults on addresses
> outside of the kernel address range (including NULL):
>
> # echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc
> # echo g > /proc/sysrq-trigger
> ...
> [15]kdb> mm 0 1234
> [ 94.652476] BUG: kernel NULL pointer dereference, address: 0000000000000000
...
>
> Note that copy_to_kernel_nofault() uses pagefault_disable(), but it
> still faults. This is because with Supervisor Mode Access Prevention
> (SMAP) enabled, do_user_addr_fault() Oopses on a fault for a user
> address from kernel space _before_ checking faulthandler_disabled().
>
> copy_from_kernel_nofault() avoids this by checking that the address is
> in the kernel before doing the actual memory access. Do the same in
> copy_to_kernel_nofault() so that we get an error as expected:
>
> # echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc
> # echo g > /proc/sysrq-trigger
> ...
> [17]kdb> mm 0 1234
> kdb_putarea_size: Bad address 0x0
> diag: -21: Invalid address
>
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
> mm/maccess.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/mm/maccess.c b/mm/maccess.c
> index 72e9c03ea37f..d67dee51a1cc 100644
> --- a/mm/maccess.c
> +++ b/mm/maccess.c
> @@ -61,6 +61,9 @@ long copy_to_kernel_nofault(void *dst, const void *src, size_t size)
> if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS))
> align = (unsigned long)dst | (unsigned long)src;
>
> + if (!copy_kernel_nofault_allowed(dst, size))
> + return -ERANGE;
> +
> pagefault_disable();
> if (!(align & 7))
> copy_to_kernel_nofault_loop(dst, src, size, u64, Efault);
> --
> 2.46.0
>
This patch leads to the following errors on ppc64le_defconfig:
[ 2.423930][ T1] Running code patching self-tests ...
[ 2.428912][ T1] code-patching: test failed at line 395
[ 2.429085][ T1] code-patching: test failed at line 398
[ 2.429561][ T1] code-patching: test failed at line 432
[ 2.429679][ T1] code-patching: test failed at line 435
This seems to be linked to commit c28c15b6d28a ("powerpc/code-patching:
Use temporary mm for Radix MMU"), copy_from_kernel_nofault_allowed()
returns false for the patching area.
Christophe
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: make copy_to_kernel_nofault() not fault on user addresses
2024-09-04 7:50 ` Christophe Leroy
@ 2024-09-04 22:56 ` Omar Sandoval
0 siblings, 0 replies; 10+ messages in thread
From: Omar Sandoval @ 2024-09-04 22:56 UTC (permalink / raw)
To: Christophe Leroy
Cc: linux-mm, Andrew Morton, Benjamin Gray, Christopher M. Riedl,
Christoph Hellwig, x86, linux-arm-kernel, linux-kernel, loongarch,
linux-mips, linux-parisc, linuxppc-dev, linux-um, kernel-team
On Wed, Sep 04, 2024 at 09:50:56AM +0200, Christophe Leroy wrote:
> Hi,
>
> Le 02/09/2024 à 07:31, Omar Sandoval a écrit :
> > [Vous ne recevez pas souvent de courriers de osandov@osandov.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
> >
> > From: Omar Sandoval <osandov@fb.com>
> >
> > I found that on x86, copy_to_kernel_nofault() still faults on addresses
> > outside of the kernel address range (including NULL):
> >
> > # echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc
> > # echo g > /proc/sysrq-trigger
> > ...
> > [15]kdb> mm 0 1234
> > [ 94.652476] BUG: kernel NULL pointer dereference, address: 0000000000000000
> ...
> >
> > Note that copy_to_kernel_nofault() uses pagefault_disable(), but it
> > still faults. This is because with Supervisor Mode Access Prevention
> > (SMAP) enabled, do_user_addr_fault() Oopses on a fault for a user
> > address from kernel space _before_ checking faulthandler_disabled().
> >
> > copy_from_kernel_nofault() avoids this by checking that the address is
> > in the kernel before doing the actual memory access. Do the same in
> > copy_to_kernel_nofault() so that we get an error as expected:
> >
> > # echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc
> > # echo g > /proc/sysrq-trigger
> > ...
> > [17]kdb> mm 0 1234
> > kdb_putarea_size: Bad address 0x0
> > diag: -21: Invalid address
> >
> > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > ---
> > mm/maccess.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/maccess.c b/mm/maccess.c
> > index 72e9c03ea37f..d67dee51a1cc 100644
> > --- a/mm/maccess.c
> > +++ b/mm/maccess.c
> > @@ -61,6 +61,9 @@ long copy_to_kernel_nofault(void *dst, const void *src, size_t size)
> > if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS))
> > align = (unsigned long)dst | (unsigned long)src;
> >
> > + if (!copy_kernel_nofault_allowed(dst, size))
> > + return -ERANGE;
> > +
> > pagefault_disable();
> > if (!(align & 7))
> > copy_to_kernel_nofault_loop(dst, src, size, u64, Efault);
> > --
> > 2.46.0
> >
>
> This patch leads to the following errors on ppc64le_defconfig:
>
> [ 2.423930][ T1] Running code patching self-tests ...
> [ 2.428912][ T1] code-patching: test failed at line 395
> [ 2.429085][ T1] code-patching: test failed at line 398
> [ 2.429561][ T1] code-patching: test failed at line 432
> [ 2.429679][ T1] code-patching: test failed at line 435
>
> This seems to be linked to commit c28c15b6d28a ("powerpc/code-patching: Use
> temporary mm for Radix MMU"), copy_from_kernel_nofault_allowed() returns
> false for the patching area.
Thanks for testing. This patch isn't worth the trouble, so we can drop
it.
Thanks,
Omar
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-09-04 22:56 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-02 5:31 [PATCH 0/2] mm: make copy_to_kernel_nofault() not fault on user addresses Omar Sandoval
2024-09-02 5:31 ` [PATCH 1/2] mm: rename copy_from_kernel_nofault_allowed() to copy_kernel_nofault_allowed() Omar Sandoval
2024-09-02 5:31 ` [PATCH 2/2] mm: make copy_to_kernel_nofault() not fault on user addresses Omar Sandoval
2024-09-04 7:50 ` Christophe Leroy
2024-09-04 22:56 ` Omar Sandoval
2024-09-02 6:19 ` [PATCH 0/2] " Christophe Leroy
2024-09-02 6:31 ` Omar Sandoval
2024-09-02 8:56 ` David Hildenbrand
2024-09-02 15:26 ` Omar Sandoval
2024-09-02 16:39 ` David Hildenbrand
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).