* [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup
@ 2024-04-24 22:54 Luis Chamberlain
2024-04-24 22:54 ` [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split Luis Chamberlain
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Luis Chamberlain @ 2024-04-24 22:54 UTC (permalink / raw)
To: akpm, ziy, linux-mm
Cc: fstests, linux-xfs, linux-kernel, willy, hare, john.g.garry,
p.raghav, da.gomez, mcgrof
I've been stress testing large folio splits using the new debugfs interface
with a new fstests test [0], to make the debugfs interface more useful it helps
to do bounds checks on input parameters so this fixes two issues found while
doing automatic stress testing of the debugfs interface with found files.
Provided there are no issues I think this should go in for v6.9-rc6.
Also, *should* we strive to support splits for large folios to min order
with say MADV_NOHUGEPAGE ?
[0] https://lkml.kernel.org/r/20240424224649.1494092-1-mcgrof@kernel.org
Luis Chamberlain (2):
mm/huge_memory: skip invalid debugfs file entry for folio split
mm/huge_memory: cap max length on debugfs file entry folio split
mm/huge_memory.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 11+ messages in thread* [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split 2024-04-24 22:54 [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup Luis Chamberlain @ 2024-04-24 22:54 ` Luis Chamberlain 2024-04-25 1:03 ` Zi Yan 2024-04-25 21:01 ` Andrew Morton 2024-04-24 22:54 ` [PATCH 2/2] mm/huge_memory: cap max length on debugfs file entry " Luis Chamberlain 2024-04-24 22:55 ` [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup Luis Chamberlain 2 siblings, 2 replies; 11+ messages in thread From: Luis Chamberlain @ 2024-04-24 22:54 UTC (permalink / raw) To: akpm, ziy, linux-mm Cc: fstests, linux-xfs, linux-kernel, willy, hare, john.g.garry, p.raghav, da.gomez, mcgrof If the file entry is too long we may easily end up going out of bounds and crash after strsep() on sscanf(). To avoid this ensure we bound the string to an expected length before we use sscanf() on it. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> --- mm/huge_memory.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 9e9879d2f501..8386d24a163e 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -3623,6 +3623,7 @@ static ssize_t split_huge_pages_write(struct file *file, const char __user *buf, char file_path[MAX_INPUT_BUF_SZ]; pgoff_t off_start = 0, off_end = 0; size_t input_len = strlen(input_buf); + size_t max_left_over; tok = strsep(&buf, ","); if (tok) { @@ -3632,6 +3633,14 @@ static ssize_t split_huge_pages_write(struct file *file, const char __user *buf, goto out; } + max_left_over = MAX_INPUT_BUF_SZ - strlen(file_path); + if (!buf || + strnlen(buf, max_left_over) < 7 || + strnlen(buf, max_left_over) > max_left_over) { + ret = -EINVAL; + goto out; + } + ret = sscanf(buf, "0x%lx,0x%lx,%d", &off_start, &off_end, &new_order); if (ret != 2 && ret != 3) { ret = -EINVAL; -- 2.43.0 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split 2024-04-24 22:54 ` [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split Luis Chamberlain @ 2024-04-25 1:03 ` Zi Yan 2024-04-25 22:40 ` Luis Chamberlain 2024-04-25 21:01 ` Andrew Morton 1 sibling, 1 reply; 11+ messages in thread From: Zi Yan @ 2024-04-25 1:03 UTC (permalink / raw) To: Luis Chamberlain Cc: akpm, linux-mm, fstests, linux-xfs, linux-kernel, willy, hare, john.g.garry, p.raghav, da.gomez [-- Attachment #1: Type: text/plain, Size: 1610 bytes --] On 24 Apr 2024, at 18:54, Luis Chamberlain wrote: > If the file entry is too long we may easily end up going out of bounds > and crash after strsep() on sscanf(). To avoid this ensure we bound the > string to an expected length before we use sscanf() on it. > > Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> > --- > mm/huge_memory.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index 9e9879d2f501..8386d24a163e 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -3623,6 +3623,7 @@ static ssize_t split_huge_pages_write(struct file *file, const char __user *buf, > char file_path[MAX_INPUT_BUF_SZ]; > pgoff_t off_start = 0, off_end = 0; > size_t input_len = strlen(input_buf); > + size_t max_left_over; > > tok = strsep(&buf, ","); > if (tok) { > @@ -3632,6 +3633,14 @@ static ssize_t split_huge_pages_write(struct file *file, const char __user *buf, > goto out; > } > > + max_left_over = MAX_INPUT_BUF_SZ - strlen(file_path); > + if (!buf || > + strnlen(buf, max_left_over) < 7 || What is this magic number 7? strlen("0xN,0xN") as the minimal input string size? Maybe use sizeof("0xN,0xN") - 1 instead? > + strnlen(buf, max_left_over) > max_left_over) { > + ret = -EINVAL; > + goto out; > + } > + > ret = sscanf(buf, "0x%lx,0x%lx,%d", &off_start, &off_end, &new_order); > if (ret != 2 && ret != 3) { > ret = -EINVAL; > -- > 2.43.0 Everything else looks good to me. Reviewed-by: Zi Yan <ziy@nvidia.com> -- Best Regards, Yan, Zi [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 854 bytes --] ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split 2024-04-25 1:03 ` Zi Yan @ 2024-04-25 22:40 ` Luis Chamberlain 0 siblings, 0 replies; 11+ messages in thread From: Luis Chamberlain @ 2024-04-25 22:40 UTC (permalink / raw) To: Zi Yan Cc: akpm, linux-mm, fstests, linux-xfs, linux-kernel, willy, hare, john.g.garry, p.raghav, da.gomez On Wed, Apr 24, 2024 at 09:03:51PM -0400, Zi Yan wrote: > On 24 Apr 2024, at 18:54, Luis Chamberlain wrote: > > > If the file entry is too long we may easily end up going out of bounds > > and crash after strsep() on sscanf(). To avoid this ensure we bound the > > string to an expected length before we use sscanf() on it. > > > > Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> > > --- > > mm/huge_memory.c | 9 +++++++++ > > 1 file changed, 9 insertions(+) > > > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > > index 9e9879d2f501..8386d24a163e 100644 > > --- a/mm/huge_memory.c > > +++ b/mm/huge_memory.c > > @@ -3623,6 +3623,7 @@ static ssize_t split_huge_pages_write(struct file *file, const char __user *buf, > > char file_path[MAX_INPUT_BUF_SZ]; > > pgoff_t off_start = 0, off_end = 0; > > size_t input_len = strlen(input_buf); > > + size_t max_left_over; > > > > tok = strsep(&buf, ","); > > if (tok) { > > @@ -3632,6 +3633,14 @@ static ssize_t split_huge_pages_write(struct file *file, const char __user *buf, > > goto out; > > } > > > > + max_left_over = MAX_INPUT_BUF_SZ - strlen(file_path); > > + if (!buf || > > + strnlen(buf, max_left_over) < 7 || > > What is this magic number 7? strlen("0xN,0xN") as the minimal input string size? > Maybe use sizeof("0xN,0xN") - 1 instead? Sure and I forgot the fixes tag, will send a v2. Luis ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split 2024-04-24 22:54 ` [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split Luis Chamberlain 2024-04-25 1:03 ` Zi Yan @ 2024-04-25 21:01 ` Andrew Morton 2024-04-29 4:04 ` Luis Chamberlain 1 sibling, 1 reply; 11+ messages in thread From: Andrew Morton @ 2024-04-25 21:01 UTC (permalink / raw) To: Luis Chamberlain Cc: ziy, linux-mm, fstests, linux-xfs, linux-kernel, willy, hare, john.g.garry, p.raghav, da.gomez On Wed, 24 Apr 2024 15:54:48 -0700 Luis Chamberlain <mcgrof@kernel.org> wrote: > If the file entry is too long we may easily end up going out of bounds > and crash after strsep() on sscanf(). > Can you explain why? I'm not seeing it. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split 2024-04-25 21:01 ` Andrew Morton @ 2024-04-29 4:04 ` Luis Chamberlain 2024-04-29 16:23 ` Andrew Morton 0 siblings, 1 reply; 11+ messages in thread From: Luis Chamberlain @ 2024-04-29 4:04 UTC (permalink / raw) To: Andrew Morton Cc: ziy, linux-mm, fstests, linux-xfs, linux-kernel, willy, hare, john.g.garry, p.raghav, da.gomez On Thu, Apr 25, 2024 at 02:01:26PM -0700, Andrew Morton wrote: > On Wed, 24 Apr 2024 15:54:48 -0700 Luis Chamberlain <mcgrof@kernel.org> wrote: > > > If the file entry is too long we may easily end up going out of bounds > > and crash after strsep() on sscanf(). > > > > Can you explain why? I'm not seeing it. I couldn't see it either but I just looked at the crash below and its the only thing I could think of. So I think its when userspace somehow abuses MAX_INPUT_BUF_SZ a lot somehow. The test I am using: https://lore.kernel.org/all/20240424224649.1494092-1-mcgrof@kernel.org/ The crash: Apr 28 20:09:21 debian12-xfs-reflink-16k-4ks unknown: run fstests generic/745 at 2024-04-28 20:09:21 Apr 28 20:09:22 debian12-xfs-reflink-16k-4ks kernel: XFS (loop5): EXPERIMENTAL: Filesystem with Large Block Size (16384 bytes) enabled. Apr 28 20:09:22 debian12-xfs-reflink-16k-4ks kernel: XFS (loop5): Mounting V5 Filesystem 6633551d-bd66-4a7c-a4ce-7e312e611742 Apr 28 20:09:22 debian12-xfs-reflink-16k-4ks kernel: XFS (loop5): Ending clean mount Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: split file-backed THPs in file: /media/scratch/test/fss896, page offset: [0x0 - 0x0] with order: 2 Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: 0 of 0 file-backed THP split Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: split file-backed THPs in file: /media/scratch/test/p0/f0, page offset: [0x0 - 0x6a4000] with order: 2 Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: 0 of 2 file-backed THP split Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: split file-backed THPs in file: /media/scratch/test/p0/f1, page offset: [0x0 - 0x0] with order: 2 Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: 0 of 0 file-backed THP split Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: split file-backed THPs in file: /media/scratch/test/p2/d0/d1/d4/f5, page offset: [0x0 - 0xc83d64] with order: 2 Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: 1 of 6 file-backed THP split Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: split file-backed THPs in file: /media/scratch/test/p2/d0/d1/d4/f9, page offset: [0x0 - 0xcad301] with order: 2 Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: 9 of 79 file-backed THP split <--- after ~2 minutes --> Apr 28 20:11:24 debian12-xfs-reflink-16k-4ks kernel: huge_memory: split file-backed THPs in file: /media/scratch/test/p0/dc/dd/d274/d2ca/de12/d139/d4b1/d3cf/d4bb/d1240/d96c/d902/d7a5/d8b0/dbfb/dd18/d795/d8a1/dcc5/d1950/d19ed/d1204/d1452/db5f/dbb7/d10e8/d316/d707/f8f3, page offset: [0x0 - 0x17c0000] with order: 2 Apr 28 20:11:24 debian12-xfs-reflink-16k-4ks kernel: huge_memory: 0 of 6 file-backed THP split Apr 28 20:11:24 debian12-xfs-reflink-16k-4ks kernel: huge_memory: split file-backed THPs in file: /media/scratch/test/p0/dc/dd/d274/d2ca/de12/d139/d4b1/d3cf/d4bb/d1240/d96c/d902/d7a5/d8b0/dbfb/dd18/d795/d8a1/dcc5/d1950/d19ed/d1204/d1452/db5f/dbb7/d10e8/d316/d707/fbb8, page offset: [0x0 - 0x32b533] with order: 2 Apr 28 20:11:24 debian12-xfs-reflink-16k-4ks kernel: huge_memory: 0 of 5 file-backed THP split Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: #PF: supervisor read access in kernel mode Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: #PF: error_code(0x0000) - not-present page Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: PGD 0 P4D 0 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: CPU: 7 PID: 2177 Comm: 745 Not tainted 6.9.0-rc5+ #5 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RIP: 0010:vsscanf (lib/vsprintf.c:3465) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: Code: 0f 85 1d 02 00 00 48 8b 54 24 08 0f b6 02 3c 25 40 0f 95 c7 84 c0 0f 95 c1 40 20 cf 41 89 fe 74 40 48 8d 4a 01 48 89 4c 24 08 <41> 0f b6 45 00 38 02 0f 84 87 02 00 00 48 8b 44 24 38 65 48 2b 04 All code ======== 0: 0f 85 1d 02 00 00 jne 0x223 6: 48 8b 54 24 08 mov 0x8(%rsp),%rdx b: 0f b6 02 movzbl (%rdx),%eax e: 3c 25 cmp $0x25,%al 10: 40 0f 95 c7 setne %dil 14: 84 c0 test %al,%al 16: 0f 95 c1 setne %cl 19: 40 20 cf and %cl,%dil 1c: 41 89 fe mov %edi,%r14d 1f: 74 40 je 0x61 21: 48 8d 4a 01 lea 0x1(%rdx),%rcx 25: 48 89 4c 24 08 mov %rcx,0x8(%rsp) 2a:* 41 0f b6 45 00 movzbl 0x0(%r13),%eax <-- trapping instruction 2f: 38 02 cmp %al,(%rdx) 31: 0f 84 87 02 00 00 je 0x2be 37: 48 8b 44 24 38 mov 0x38(%rsp),%rax 3c: 65 gs 3d: 48 rex.W 3e: 2b .byte 0x2b 3f: 04 .byte 0x4 Code starting with the faulting instruction =========================================== 0: 41 0f b6 45 00 movzbl 0x0(%r13),%eax 5: 38 02 cmp %al,(%rdx) 7: 0f 84 87 02 00 00 je 0x294 d: 48 8b 44 24 38 mov 0x38(%rsp),%rax 12: 65 gs 13: 48 rex.W 14: 2b .byte 0x2b 15: 04 .byte 0x4 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RSP: 0018:ffffb24ec358bad8 EFLAGS: 00010202 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RAX: 0000000000000030 RBX: ffffb24ec358bb50 RCX: ffffffffa03fbc18 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RDX: ffffffffa03fbc17 RSI: ffffffffa03fbc17 RDI: 0000000000000001 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RBP: 0000000000000000 R08: ffffb24ec358bbdc R09: 000000000000002c Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: R10: ffffb24ec358bbfa R11: 0000000000000000 R12: 0000000000000000 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: R13: 0000000000000000 R14: 0000000000000001 R15: ffffb24ec358bf08 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: FS: 00007f8b39189740(0000) GS:ffff88837bdc0000(0000) knlGS:0000000000000000 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: CR2: 0000000000000000 CR3: 000000010c9f2003 CR4: 0000000000770ef0 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: PKRU: 55555554 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: Call Trace: Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: <TASK> Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? page_fault_oops (arch/x86/mm/fault.c:713) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? do_user_addr_fault (./include/linux/kprobes.h:591 (discriminator 1) arch/x86/mm/fault.c:1265 (discriminator 1)) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? exc_page_fault (./arch/x86/include/asm/paravirt.h:693 arch/x86/mm/fault.c:1513 arch/x86/mm/fault.c:1563) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? vsscanf (lib/vsprintf.c:3465) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? stack_depot_save_flags (lib/stackdepot.c:609) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: sscanf (lib/vsprintf.c:3725) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? __check_object_size (mm/usercopy.c:226 (discriminator 1) mm/usercopy.c:213 (discriminator 1)) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: split_huge_pages_write (mm/huge_memory.c:3664) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: full_proxy_write (fs/debugfs/file.c:328 (discriminator 1)) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: vfs_write (fs/read_write.c:588) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? __count_memcg_events (mm/memcontrol.c:723 (discriminator 1) mm/memcontrol.c:962 (discriminator 1)) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? count_memcg_events.constprop.0 (./arch/x86/include/asm/irqflags.h:134 (discriminator 1) ./include/linux/memcontrol.h:1097 (discriminator 1)) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? handle_mm_fault (mm/memory.c:5483 mm/memory.c:5622) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? preempt_count_add (./include/linux/ftrace.h:974 (discriminator 1) kernel/sched/core.c:5852 (discriminator 1) kernel/sched/core.c:5849 (discriminator 1) kernel/sched/core.c:5877 (discriminator 1)) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ksys_write (fs/read_write.c:643) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RIP: 0033:0x7f8b39284240 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: Code: 40 00 48 8b 15 c1 9b 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 80 3d a1 23 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89 All code ======== 0: 40 00 48 8b rex add %cl,-0x75(%rax) 4: 15 c1 9b 0d 00 adc $0xd9bc1,%eax 9: f7 d8 neg %eax b: 64 89 02 mov %eax,%fs:(%rdx) e: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax 15: eb b7 jmp 0xffffffffffffffce 17: 0f 1f 00 nopl (%rax) 1a: 80 3d a1 23 0e 00 00 cmpb $0x0,0xe23a1(%rip) # 0xe23c2 21: 74 17 je 0x3a 23: b8 01 00 00 00 mov $0x1,%eax 28: 0f 05 syscall 2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction 30: 77 58 ja 0x8a 32: c3 ret 33: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) 3a: 48 83 ec 28 sub $0x28,%rsp 3e: 48 rex.W 3f: 89 .byte 0x89 Code starting with the faulting instruction =========================================== 0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax 6: 77 58 ja 0x60 8: c3 ret 9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) 10: 48 83 ec 28 sub $0x28,%rsp 14: 48 rex.W 15: 89 .byte 0x89 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RSP: 002b:00007fff2a14b5f8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RAX: ffffffffffffffda RBX: 0000000000000109 RCX: 00007f8b39284240 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RDX: 0000000000000109 RSI: 00005589ea673400 RDI: 0000000000000001 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RBP: 00005589ea673400 R08: 0000000000000007 R09: 0000000000000073 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000109 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: R13: 00007f8b3935f760 R14: 0000000000000109 R15: 00007f8b3935a9e0 Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: </TASK> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split 2024-04-29 4:04 ` Luis Chamberlain @ 2024-04-29 16:23 ` Andrew Morton 2024-05-07 0:30 ` Luis Chamberlain 0 siblings, 1 reply; 11+ messages in thread From: Andrew Morton @ 2024-04-29 16:23 UTC (permalink / raw) To: Luis Chamberlain Cc: ziy, linux-mm, fstests, linux-xfs, linux-kernel, willy, hare, john.g.garry, p.raghav, da.gomez On Sun, 28 Apr 2024 21:04:50 -0700 Luis Chamberlain <mcgrof@kernel.org> wrote: > On Thu, Apr 25, 2024 at 02:01:26PM -0700, Andrew Morton wrote: > > On Wed, 24 Apr 2024 15:54:48 -0700 Luis Chamberlain <mcgrof@kernel.org> wrote: > > > > > If the file entry is too long we may easily end up going out of bounds > > > and crash after strsep() on sscanf(). > > > > > > > Can you explain why? I'm not seeing it. > > I couldn't see it either but I just looked at the crash below and > its the only thing I could think of. So I think its when userspace > somehow abuses MAX_INPUT_BUF_SZ a lot somehow. This isn't a good basis for making kernel changes :( Can you investigate a little further please? What actually is present at *buf when your new checks succeed? Could we be seeing 0xNNN,0xNNN and leaving new_order unaltered? Or something. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split 2024-04-29 16:23 ` Andrew Morton @ 2024-05-07 0:30 ` Luis Chamberlain 0 siblings, 0 replies; 11+ messages in thread From: Luis Chamberlain @ 2024-05-07 0:30 UTC (permalink / raw) To: Andrew Morton Cc: ziy, linux-mm, fstests, linux-xfs, linux-kernel, willy, hare, john.g.garry, p.raghav, da.gomez On Mon, Apr 29, 2024 at 09:23:07AM -0700, Andrew Morton wrote: > On Sun, 28 Apr 2024 21:04:50 -0700 Luis Chamberlain <mcgrof@kernel.org> wrote: > > > On Thu, Apr 25, 2024 at 02:01:26PM -0700, Andrew Morton wrote: > > > On Wed, 24 Apr 2024 15:54:48 -0700 Luis Chamberlain <mcgrof@kernel.org> wrote: > > > > > > > If the file entry is too long we may easily end up going out of bounds > > > > and crash after strsep() on sscanf(). > > > > > > > > > > Can you explain why? I'm not seeing it. > > > > I couldn't see it either but I just looked at the crash below and > > its the only thing I could think of. So I think its when userspace > > somehow abuses MAX_INPUT_BUF_SZ a lot somehow. > > This isn't a good basis for making kernel changes :( > > Can you investigate a little further please? Sure, this will require some more time, feel free to ignore these two patches for now then. Luis ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/2] mm/huge_memory: cap max length on debugfs file entry folio split 2024-04-24 22:54 [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup Luis Chamberlain 2024-04-24 22:54 ` [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split Luis Chamberlain @ 2024-04-24 22:54 ` Luis Chamberlain 2024-04-25 1:05 ` Zi Yan 2024-04-24 22:55 ` [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup Luis Chamberlain 2 siblings, 1 reply; 11+ messages in thread From: Luis Chamberlain @ 2024-04-24 22:54 UTC (permalink / raw) To: akpm, ziy, linux-mm Cc: fstests, linux-xfs, linux-kernel, willy, hare, john.g.garry, p.raghav, da.gomez, mcgrof Don't allow to query beyond a mapped file's length. Since this is just a debugfs interface allow userspace to be lazy and use a large value so we can just use the entire file. Without this we can end up wasting cycles looking for folios which just don't exist for no good reason. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> --- mm/huge_memory.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 8386d24a163e..86a8c7b3b8dc 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -3535,7 +3535,7 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start, struct file *candidate; struct address_space *mapping; int ret = -EINVAL; - pgoff_t index; + pgoff_t index, fsize; int nr_pages = 1; unsigned long total = 0, split = 0; @@ -3547,11 +3547,14 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start, if (IS_ERR(candidate)) goto out; + mapping = candidate->f_mapping; + fsize = i_size_read(mapping->host); + if (off_end > fsize) + off_end = fsize; + pr_debug("split file-backed THPs in file: %s, page offset: [0x%lx - 0x%lx]\n", file_path, off_start, off_end); - mapping = candidate->f_mapping; - for (index = off_start; index < off_end; index += nr_pages) { struct folio *folio = filemap_get_folio(mapping, index); -- 2.43.0 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] mm/huge_memory: cap max length on debugfs file entry folio split 2024-04-24 22:54 ` [PATCH 2/2] mm/huge_memory: cap max length on debugfs file entry " Luis Chamberlain @ 2024-04-25 1:05 ` Zi Yan 0 siblings, 0 replies; 11+ messages in thread From: Zi Yan @ 2024-04-25 1:05 UTC (permalink / raw) To: Luis Chamberlain Cc: akpm, linux-mm, fstests, linux-xfs, linux-kernel, willy, hare, john.g.garry, p.raghav, da.gomez [-- Attachment #1: Type: text/plain, Size: 1617 bytes --] On 24 Apr 2024, at 18:54, Luis Chamberlain wrote: > Don't allow to query beyond a mapped file's length. Since this is just > a debugfs interface allow userspace to be lazy and use a large value so > we can just use the entire file. > > Without this we can end up wasting cycles looking for folios which > just don't exist for no good reason. > > Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> > --- > mm/huge_memory.c | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index 8386d24a163e..86a8c7b3b8dc 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -3535,7 +3535,7 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start, > struct file *candidate; > struct address_space *mapping; > int ret = -EINVAL; > - pgoff_t index; > + pgoff_t index, fsize; > int nr_pages = 1; > unsigned long total = 0, split = 0; > > @@ -3547,11 +3547,14 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start, > if (IS_ERR(candidate)) > goto out; > > + mapping = candidate->f_mapping; > + fsize = i_size_read(mapping->host); > + if (off_end > fsize) > + off_end = fsize; > + > pr_debug("split file-backed THPs in file: %s, page offset: [0x%lx - 0x%lx]\n", > file_path, off_start, off_end); > > - mapping = candidate->f_mapping; > - > for (index = off_start; index < off_end; index += nr_pages) { > struct folio *folio = filemap_get_folio(mapping, index); LGTM. Reviewed-by: Zi Yan <ziy@nvidia.com> -- Best Regards, Yan, Zi [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 854 bytes --] ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup 2024-04-24 22:54 [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup Luis Chamberlain 2024-04-24 22:54 ` [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split Luis Chamberlain 2024-04-24 22:54 ` [PATCH 2/2] mm/huge_memory: cap max length on debugfs file entry " Luis Chamberlain @ 2024-04-24 22:55 ` Luis Chamberlain 2 siblings, 0 replies; 11+ messages in thread From: Luis Chamberlain @ 2024-04-24 22:55 UTC (permalink / raw) To: akpm, ziy, linux-mm Cc: fstests, linux-xfs, linux-kernel, willy, hare, john.g.garry, p.raghav, da.gomez The subject hints at a cleanup but I decided to leave that for another separate patch which I'll send next. Luis ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2024-05-07 0:31 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-04-24 22:54 [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup Luis Chamberlain 2024-04-24 22:54 ` [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split Luis Chamberlain 2024-04-25 1:03 ` Zi Yan 2024-04-25 22:40 ` Luis Chamberlain 2024-04-25 21:01 ` Andrew Morton 2024-04-29 4:04 ` Luis Chamberlain 2024-04-29 16:23 ` Andrew Morton 2024-05-07 0:30 ` Luis Chamberlain 2024-04-24 22:54 ` [PATCH 2/2] mm/huge_memory: cap max length on debugfs file entry " Luis Chamberlain 2024-04-25 1:05 ` Zi Yan 2024-04-24 22:55 ` [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup Luis Chamberlain
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).