* [linux-next:master] [mm, slab] 5660ee54e7: BUG:KASAN:stack-out-of-bounds_in_copy_from_iter
From: kernel test robot @ 2025-07-22 7:07 UTC
To: Vlastimil Babka
Cc: oe-lkp, lkp, Roman Gushchin, Harry Yoo, linux-mm, oliver.sang
Hello,
kernel test robot noticed "BUG:KASAN:stack-out-of-bounds_in_copy_from_iter" on:
commit: 5660ee54e7982f9097ddc684e90f15bdcc7fef4b ("mm, slab: use frozen pages for large kmalloc")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
[test failed on linux-next/master d086c886ceb9f59dea6c3a9dae7eb89e780a20c9]
in testcase: blktests
version: blktests-x86_64-5d9ef47-1_20250709
with following parameters:
disk: 1SSD
test: nvme-group-00
nvme_trtype: rdma
use_siw: true
config: x86_64-rhel-9.4-func
compiler: gcc-12
test machine: 8 threads Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz (Skylake) with 28G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202507220801.50a7210-lkp@intel.com
[ 232.729908][ T3003] BUG: KASAN: stack-out-of-bounds in _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260)
[ 232.737608][ T3003] Read of size 4 at addr ffffc90002527694 by task siw_tx/2/3003
[ 232.745045][ T3003]
[ 232.747222][ T3003] CPU: 2 UID: 0 PID: 3003 Comm: siw_tx/2 Not tainted 6.16.0-rc2-00002-g5660ee54e798 #1 PREEMPT(voluntary)
[ 232.747226][ T3003] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016
[ 232.747228][ T3003] Call Trace:
[ 232.747230][ T3003] <TASK>
[ 232.747231][ T3003] dump_stack_lvl (lib/dump_stack.c:123 (discriminator 1))
[ 232.747236][ T3003] print_address_description+0x2c/0x3b0
[ 232.747241][ T3003] ? _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260)
[ 232.747244][ T3003] print_report (mm/kasan/report.c:522)
[ 232.747247][ T3003] ? kasan_addr_to_slab (mm/kasan/common.c:37)
[ 232.747250][ T3003] ? _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260)
[ 232.747252][ T3003] kasan_report (mm/kasan/report.c:636)
[ 232.747255][ T3003] ? _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260)
[ 232.747259][ T3003] _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260)
[ 232.747263][ T3003] ? __pfx__copy_from_iter (lib/iov_iter.c:254)
[ 232.747266][ T3003] ? __pfx_tcp_current_mss (net/ipv4/tcp_output.c:1873)
[ 232.747270][ T3003] ? check_heap_object (arch/x86/include/asm/bitops.h:206 arch/x86/include/asm/bitops.h:238 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/page-flags.h:867 include/linux/page-flags.h:888 include/linux/mm.h:992 include/linux/mm.h:2050 mm/usercopy.c:199)
[ 232.747274][ T3003] ? 0xffffffff81000000
[ 232.747276][ T3003] ? __check_object_size (mm/memremap.c:421)
[ 232.747280][ T3003] skb_do_copy_data_nocache (include/linux/uio.h:228 include/linux/uio.h:245 include/net/sock.h:2243)
[ 232.747284][ T3003] ? __pfx_skb_do_copy_data_nocache (include/net/sock.h:2234)
[ 232.747286][ T3003] ? __sk_mem_schedule (net/core/sock.c:3403)
[ 232.747291][ T3003] tcp_sendmsg_locked (include/net/sock.h:2271 net/ipv4/tcp.c:1254)
[ 232.747297][ T3003] ? sock_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:750)
[ 232.747300][ T3003] ? __pfx_tcp_sendmsg_locked (net/ipv4/tcp.c:1061)
[ 232.747303][ T3003] ? __pfx_sock_sendmsg (net/socket.c:739)
[ 232.747306][ T3003] ? _raw_spin_lock_bh (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:127 kernel/locking/spinlock.c:178)
[ 232.747312][ T3003] siw_tcp_sendpages+0x1f1/0x4f0 siw
[ 232.747326][ T3003] ? __pfx_siw_tcp_sendpages+0x10/0x10 siw
[ 232.747340][ T3003] siw_tx_hdt (drivers/infiniband/sw/siw/siw_qp_tx.c:379 drivers/infiniband/sw/siw/siw_qp_tx.c:586) siw
[ 232.747354][ T3003] ? __pfx_siw_tx_hdt (drivers/infiniband/sw/siw/siw_qp_tx.c:431) siw
[ 232.747368][ T3003] ? dl_scaled_delta_exec (kernel/sched/deadline.c:1481)
[ 232.747372][ T3003] ? __pfx_sched_balance_rq (kernel/sched/fair.c:11754)
[ 232.747375][ T3003] ? update_curr_dl_se (kernel/sched/deadline.c:1509)
[ 232.747379][ T3003] ? place_entity (kernel/sched/fair.c:5211)
[ 232.747382][ T3003] ? switch_hrtimer_base (kernel/time/hrtimer.c:232 kernel/time/hrtimer.c:258)
[ 232.747386][ T3003] ? pick_eevdf (kernel/sched/fair.c:946)
[ 232.747389][ T3003] ? __resched_curr (arch/x86/include/asm/bitops.h:60 include/asm-generic/bitops/instrumented-atomic.h:29 include/linux/thread_info.h:97 kernel/sched/core.c:1114)
[ 232.747393][ T3003] ? update_curr (kernel/sched/fair.c:1236)
[ 232.747395][ T3003] ? xas_load (include/linux/xarray.h:175 include/linux/xarray.h:1270 lib/xarray.c:241)
[ 232.747400][ T3003] ? xa_load (lib/xarray.c:1613)
[ 232.747403][ T3003] ? __pfx_xa_load (lib/xarray.c:1613)
[ 232.747407][ T3003] ? ttwu_do_activate (kernel/sched/core.c:3719 kernel/sched/core.c:3749)
[ 232.747410][ T3003] ? update_rq_clock_task (kernel/sched/sched.h:1327 kernel/sched/pelt.h:120 kernel/sched/core.c:798)
[ 232.747415][ T3003] ? siw_mem_id2obj (drivers/infiniband/sw/siw/siw_mem.c:28) siw
[ 232.747425][ T3003] ? __pfx_siw_try_1seg (drivers/infiniband/sw/siw/siw_qp_tx.c:50) siw
[ 232.747436][ T3003] ? __pfx_try_to_wake_up (kernel/sched/core.c:4189)
[ 232.747440][ T3003] ? siw_qp_prepare_tx (drivers/infiniband/sw/siw/siw_qp_tx.c:222) siw
[ 232.747452][ T3003] siw_qp_sq_proc_tx (drivers/infiniband/sw/siw/siw_qp_tx.c:882) siw
[ 232.747463][ T3003] ? siw_activate_tx (drivers/infiniband/sw/siw/siw_qp.c:996) siw
[ 232.747474][ T3003] siw_qp_sq_process (drivers/infiniband/sw/siw/siw_qp_tx.c:1038) siw
[ 232.747486][ T3003] siw_sq_resume (drivers/infiniband/sw/siw/siw_qp_tx.c:1170) siw
[ 232.747497][ T3003] siw_run_sq (drivers/infiniband/sw/siw/siw_qp_tx.c:1258) siw
[ 232.747508][ T3003] ? __pfx_siw_run_sq (drivers/infiniband/sw/siw/siw_qp_tx.c:1236) siw
[ 232.747518][ T3003] ? __pfx__raw_spin_lock_irqsave (kernel/locking/spinlock.c:161)
[ 232.747522][ T3003] ? __pfx_autoremove_wake_function (kernel/sched/wait.c:383)
[ 232.747526][ T3003] ? __kthread_parkme (arch/x86/include/asm/bitops.h:206 (discriminator 15) arch/x86/include/asm/bitops.h:238 (discriminator 15) include/asm-generic/bitops/instrumented-non-atomic.h:142 (discriminator 15) kernel/kthread.c:291 (discriminator 15))
[ 232.747530][ T3003] ? __pfx_siw_run_sq (drivers/infiniband/sw/siw/siw_qp_tx.c:1236) siw
[ 232.747541][ T3003] kthread (kernel/kthread.c:464)
[ 232.747544][ T3003] ? __pfx_kthread (kernel/kthread.c:413)
[ 232.747546][ T3003] ? __pfx__raw_spin_lock_irq (kernel/locking/spinlock.c:169)
[ 232.747549][ T3003] ? __pfx_kthread (kernel/kthread.c:413)
[ 232.747552][ T3003] ? __pfx_kthread (kernel/kthread.c:413)
[ 232.747555][ T3003] ret_from_fork (arch/x86/kernel/process.c:148)
[ 232.747559][ T3003] ? __pfx_kthread (kernel/kthread.c:413)
[ 232.747561][ T3003] ret_from_fork_asm (arch/x86/entry/entry_64.S:258)
[ 232.747568][ T3003] </TASK>
[ 232.747569][ T3003]
[ 233.078198][ T3003] The buggy address belongs to stack of task siw_tx/2/3003
[ 233.085214][ T3003] and is located at offset 76 in frame:
[ 233.090677][ T3003] siw_tcp_sendpages+0x0/0x4f0 siw
[ 233.096405][ T3003]
[ 233.098576][ T3003] This frame has 2 objects:
[ 233.102906][ T3003] [48, 64) 'bvec'
[ 233.102908][ T3003] [80, 184) 'msg'
[ 233.106463][ T3003]
[ 233.112188][ T3003] The buggy address belongs to the virtual mapping at
[ 233.112188][ T3003] [ffffc90002520000, ffffc90002529000) created by:
[ 233.112188][ T3003] dup_task_struct (kernel/fork.c:878)
[ 233.129638][ T3003]
[ 233.131813][ T3003] The buggy address belongs to the physical page:
[ 233.138055][ T3003] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888700000000 pfn:0x745e9a
[ 233.147993][ T3003] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[ 233.155173][ T3003] raw: 0017ffffc0000000 0000000000000000 dead000000000122 0000000000000000
[ 233.163555][ T3003] raw: ffff888700000000 0000000000000000 00000001ffffffff 0000000000000000
[ 233.171938][ T3003] page dumped because: kasan: bad access detected
[ 233.178164][ T3003]
[ 233.180337][ T3003] Memory state around the buggy address:
[ 233.185804][ T3003] ffffc90002527580: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
[ 233.193683][ T3003] ffffc90002527600: 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 00
[ 233.201548][ T3003] >ffffc90002527680: 00 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 233.209414][ T3003] ^
[ 233.213833][ T3003] ffffc90002527700: f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00
[ 233.221697][ T3003] ffffc90002527780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 233.229562][ T3003] ==================================================================
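The numbers in the report above can be cross-checked by hand. The following is a minimal sketch, not kernel code: `model_bio_vec` is a hypothetical stand-in that assumes the usual x86_64 layout of `struct bio_vec` (an 8-byte page pointer followed by 4-byte `bv_len` and `bv_offset`), and the constants are copied from the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct bio_vec; x86_64 layout is assumed. */
struct model_bio_vec {
	void *bv_page;
	unsigned int bv_len;
	unsigned int bv_offset;
};

/* Values copied from the KASAN report. */
#define BUGGY_ADDR   0xffffc90002527694ULL
#define SHADOW_ROW   0xffffc90002527680ULL /* row marked with '>' */
#define FRAME_OFFSET 76u                   /* "located at offset 76 in frame" */
#define BVEC_END     64u                   /* [48, 64) 'bvec' */
#define MSG_START    80u                   /* [80, 184) 'msg' */

/* Start of the instrumented frame implied by the report. */
static uint64_t frame_base(void)
{
	return BUGGY_ADDR - FRAME_OFFSET;
}

/* Distance of the access from the end of the one-element bvec. */
static unsigned int offset_past_bvec(void)
{
	/* The access must sit in the gap between 'bvec' and 'msg'. */
	assert(FRAME_OFFSET >= BVEC_END && FRAME_OFFSET < MSG_START);
	return FRAME_OFFSET - BVEC_END;
}

/* Index of the shadow byte in the marked row (one shadow byte per 8 bytes). */
static unsigned int shadow_index_in_row(void)
{
	return (unsigned int)((BUGGY_ADDR - SHADOW_ROW) / 8);
}
```

Under these assumptions the access is 12 bytes past the end of `bvec`, which is where `bv_offset` of a second array element would live, and shadow index 2 in the marked row is the second `f2` (stack mid redzone) byte. That is consistent with a bvec walk stepping onto a second element that does not exist.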
[ 233.237471][ T3003] Disabling lock debugging due to kernel taint
[ 233.243463][ T3003] Oops: general protection fault, probably for non-canonical address 0x5088000005158: 0000 [#1] SMP KASAN PTI
[ 233.254872][ T3003] CPU: 2 UID: 0 PID: 3003 Comm: siw_tx/2 Tainted: G B 6.16.0-rc2-00002-g5660ee54e798 #1 PREEMPT(voluntary)
[ 233.267574][ T3003] Tainted: [B]=BAD_PAGE
[ 233.271559][ T3003] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016
[ 233.279597][ T3003] RIP: 0010:memcpy_orig (arch/x86/lib/memcpy_64.S:95)
[ 233.284533][ T3003] Code: 89 07 4c 89 4f 08 4c 89 57 10 4c 89 5f 18 48 8d 7f 20 73 d4 83 c2 20 eb 44 48 01 d6 48 01 d7 48 83 ea 20 0f 1f 00 48 83 ea 20 <4c> 8b 46 f8 4c 8b 4e f0 4c 8b 56 e8 4c 8b 5e e0 48 8d 76 e0 4c 89
All code
========
0: 89 07 mov %eax,(%rdi)
2: 4c 89 4f 08 mov %r9,0x8(%rdi)
6: 4c 89 57 10 mov %r10,0x10(%rdi)
a: 4c 89 5f 18 mov %r11,0x18(%rdi)
e: 48 8d 7f 20 lea 0x20(%rdi),%rdi
12: 73 d4 jae 0xffffffffffffffe8
14: 83 c2 20 add $0x20,%edx
17: eb 44 jmp 0x5d
19: 48 01 d6 add %rdx,%rsi
1c: 48 01 d7 add %rdx,%rdi
1f: 48 83 ea 20 sub $0x20,%rdx
23: 0f 1f 00 nopl (%rax)
26: 48 83 ea 20 sub $0x20,%rdx
2a:* 4c 8b 46 f8 mov -0x8(%rsi),%r8 <-- trapping instruction
2e: 4c 8b 4e f0 mov -0x10(%rsi),%r9
32: 4c 8b 56 e8 mov -0x18(%rsi),%r10
36: 4c 8b 5e e0 mov -0x20(%rsi),%r11
3a: 48 8d 76 e0 lea -0x20(%rsi),%rsi
3e: 4c rex.WR
3f: 89 .byte 0x89
Code starting with the faulting instruction
===========================================
0: 4c 8b 46 f8 mov -0x8(%rsi),%r8
4: 4c 8b 4e f0 mov -0x10(%rsi),%r9
8: 4c 8b 56 e8 mov -0x18(%rsi),%r10
c: 4c 8b 5e e0 mov -0x20(%rsi),%r11
10: 48 8d 76 e0 lea -0x20(%rsi),%rsi
14: 4c rex.WR
15: 89 .byte 0x89
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250722/202507220801.50a7210-lkp@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [linux-next:master] [mm, slab] 5660ee54e7: BUG:KASAN:stack-out-of-bounds_in_copy_from_iter
From: Pedro Falcato @ 2025-07-22 10:52 UTC
To: kernel test robot
Cc: Vlastimil Babka, oe-lkp, lkp, Roman Gushchin, Harry Yoo,
David Howells, linux-mm
+cc dhowells
On Tue, Jul 22, 2025 at 03:07:44PM +0800, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed "BUG:KASAN:stack-out-of-bounds_in_copy_from_iter" on:
>
> commit: 5660ee54e7982f9097ddc684e90f15bdcc7fef4b ("mm, slab: use frozen pages for large kmalloc")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>
> [test failed on linux-next/master d086c886ceb9f59dea6c3a9dae7eb89e780a20c9]
>
> in testcase: blktests
> version: blktests-x86_64-5d9ef47-1_20250709
> with following parameters:
>
> disk: 1SSD
> test: nvme-group-00
> nvme_trtype: rdma
> use_siw: true
>
>
>
> config: x86_64-rhel-9.4-func
> compiler: gcc-12
> test machine: 8 threads Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz (Skylake) with 28G memory
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202507220801.50a7210-lkp@intel.com
>
>
> [ 232.729908][ T3003] BUG: KASAN: stack-out-of-bounds in _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260)
> [ 232.737608][ T3003] Read of size 4 at addr ffffc90002527694 by task siw_tx/2/3003
> [ 232.745045][ T3003]
> [ 232.747222][ T3003] CPU: 2 UID: 0 PID: 3003 Comm: siw_tx/2 Not tainted 6.16.0-rc2-00002-g5660ee54e798 #1 PREEMPT(voluntary)
> [ 232.747226][ T3003] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016
> [ 232.747228][ T3003] Call Trace:
> [ 232.747230][ T3003] <TASK>
> [ 232.747231][ T3003] dump_stack_lvl (lib/dump_stack.c:123 (discriminator 1))
> [ 232.747236][ T3003] print_address_description+0x2c/0x3b0
> [ 232.747241][ T3003] ? _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260)
> [ 232.747244][ T3003] print_report (mm/kasan/report.c:522)
> [ 232.747247][ T3003] ? kasan_addr_to_slab (mm/kasan/common.c:37)
> [ 232.747250][ T3003] ? _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260)
> [ 232.747252][ T3003] kasan_report (mm/kasan/report.c:636)
> [ 232.747255][ T3003] ? _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260)
> [ 232.747259][ T3003] _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260)
> [ 232.747263][ T3003] ? __pfx__copy_from_iter (lib/iov_iter.c:254)
> [ 232.747266][ T3003] ? __pfx_tcp_current_mss (net/ipv4/tcp_output.c:1873)
> [ 232.747270][ T3003] ? check_heap_object (arch/x86/include/asm/bitops.h:206 arch/x86/include/asm/bitops.h:238 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/page-flags.h:867 include/linux/page-flags.h:888 include/linux/mm.h:992 include/linux/mm.h:2050 mm/usercopy.c:199)
> [ 232.747274][ T3003] ? 0xffffffff81000000
> [ 232.747276][ T3003] ? __check_object_size (mm/memremap.c:421)
> [ 232.747280][ T3003] skb_do_copy_data_nocache (include/linux/uio.h:228 include/linux/uio.h:245 include/net/sock.h:2243)
> [ 232.747284][ T3003] ? __pfx_skb_do_copy_data_nocache (include/net/sock.h:2234)
> [ 232.747286][ T3003] ? __sk_mem_schedule (net/core/sock.c:3403)
> [ 232.747291][ T3003] tcp_sendmsg_locked (include/net/sock.h:2271 net/ipv4/tcp.c:1254)
> [ 232.747297][ T3003] ? sock_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:750)
> [ 232.747300][ T3003] ? __pfx_tcp_sendmsg_locked (net/ipv4/tcp.c:1061)
> [ 232.747303][ T3003] ? __pfx_sock_sendmsg (net/socket.c:739)
> [ 232.747306][ T3003] ? _raw_spin_lock_bh (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:127 kernel/locking/spinlock.c:178)
> [ 232.747312][ T3003] siw_tcp_sendpages+0x1f1/0x4f0 siw
It seems to me that the change David introduced back in 6.4 was silently
borked (credit to Vlastimil for initially pointing it out to me). Namely, the series at
https://lore.kernel.org/all/20230331160914.1608208-1-dhowells@redhat.com/
introduced three changes that inline do_tcp_sendpages():
c2ff29e99a76 ("siw: Inline do_tcp_sendpages()")
e117dcfd646e ("tls: Inline do_tcp_sendpages()")
7f8816ab4bae ("espintcp: Inline do_tcp_sendpages()")
(there's a separate ebf2e8860eea, but it looks okay)
Taking a closer look at siw (comments are mine):
static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
			     size_t size)
[...]
	/* Calculate the number of bytes we need to push, for this page
	 * specifically */
	size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
	/* If we can't splice it, then copy it in, as normal */
	if (!sendpage_ok(page[i]))
		msg.msg_flags &= ~MSG_SPLICE_PAGES;
	/* Set the bvec pointing to the page, with len $bytes */
	bvec_set_page(&bvec, page[i], bytes, offset);
	/* Set the iter to $size, aka the size of the whole sendpages (!!!) */
	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
try_page_again:
	lock_sock(sk);
	/* Sendmsg with $size size (!!!) */
	rv = tcp_sendmsg_locked(sk, &msg, size);
Now, (probably) why we didn't see this before: ever since Vlastimil introduced
5660ee54e798 ("mm, slab: use frozen pages for large kmalloc") into -next, sendpage_ok()
fails for large kmalloc pages. As a result we no longer take the MSG_SPLICE_PAGES path,
which differs subtly from the copy path deep in the iov_iter code:
(MSG_SPLICE_PAGES)
  skb_splice_from_iter
    iov_iter_extract_pages
      iov_iter_extract_bvec_pages
        uses i->nr_segs to correctly stop in its tracks before OoB'ing everywhere
  skb_splice_from_iter gets a "short" read
(!MSG_SPLICE_PAGES)
  skb_copy_to_page_nocache copy=iov_iter_count
    [...]
      copy_from_iter
        /* this doesn't help */
        if (unlikely(iter->count < len))
          len = iter->count;
        iterate_bvec
          ... and we run off the bvecs
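The difference between the two paths can be modelled in userspace. This is a rough sketch under stated assumptions, not the kernel's iov_iter code: `model_bvec`, `model_iter` and `segments_touched()` are hypothetical names, and the loop only counts which array slots a count-driven walk would visit instead of dereferencing them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct bio_vec and struct iov_iter. */
struct model_bvec {
	size_t bv_len;
};

struct model_iter {
	const struct model_bvec *bvec;
	size_t nr_segs; /* number of valid entries in bvec[] */
	size_t count;   /* bytes the iterator claims to describe */
};

/*
 * Walk segments the way a count-driven copy loop does: keep consuming until
 * `len` bytes are gone, clamped only by it->count. Returns how many array
 * slots the walk would visit. The model never reads past nr_segs itself;
 * it just reports that the real loop would.
 */
static size_t segments_touched(const struct model_iter *it, size_t len)
{
	size_t touched = 0, idx = 0;

	if (len > it->count) /* the clamp that "doesn't help" */
		len = it->count;

	while (len) {
		size_t part;

		touched++;
		if (idx >= it->nr_segs)
			break; /* real code would read whatever follows bvec[] */
		part = it->bvec[idx].bv_len < len ? it->bvec[idx].bv_len : len;
		len -= part;
		idx++;
	}
	return touched;
}
```

With one 4096-byte segment and `count` set to 8192, the walk visits two slots even though `nr_segs` is 1; with `count` equal to the segment length it stops after one.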
Anyway, long-winded analysis just to say:
--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
@@ -332,11 +332,11 @@ static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
 		if (!sendpage_ok(page[i]))
 			msg.msg_flags &= ~MSG_SPLICE_PAGES;
 		bvec_set_page(&bvec, page[i], bytes, offset);
-		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
+		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);

 try_page_again:
 		lock_sock(sk);
-		rv = tcp_sendmsg_locked(sk, &msg, size);
+		rv = tcp_sendmsg_locked(sk, &msg, bytes);
 		release_sock(sk);

 		if (rv > 0) {
(I had a closer look at the tls and espintcp changes, and they seem correct)
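For illustration, the per-page accounting after the change can be sketched as below. This is a simplified userspace model, not the driver: `MODEL_PAGE_SIZE`, `chunk_bytes()` and `send_pages_model()` are hypothetical, and the model assumes every send consumes its whole chunk, whereas the real function also retries on partial sends.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size for the model */

/* Per-page chunk, mirroring min_t(size_t, PAGE_SIZE - offset, size). */
static size_t chunk_bytes(size_t offset, size_t size)
{
	size_t room;

	assert(offset < MODEL_PAGE_SIZE);
	room = MODEL_PAGE_SIZE - offset;
	return room < size ? room : size;
}

/*
 * One single-segment iterator per page, with the iterator count equal to
 * that segment's length. Returns the number of pages used and stores the
 * total number of bytes described in *total.
 */
static size_t send_pages_model(size_t offset, size_t size, size_t *total)
{
	size_t pages = 0;

	*total = 0;
	while (size) {
		size_t bytes = chunk_bytes(offset, size);

		/* iterator count == bvec length == bytes, so nothing overruns */
		*total += bytes;
		size -= bytes;
		offset = 0; /* only the first page starts mid-page */
		pages++;
	}
	return pages;
}
```

For example, `offset = 100` and `size = 10000` splits into chunks of 3996, 4096 and 1908 bytes, so each iterator describes exactly one page's worth of data and the chunks still add up to the full request.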
--
Pedro
* Re: [linux-next:master] [mm, slab] 5660ee54e7: BUG:KASAN:stack-out-of-bounds_in_copy_from_iter
From: Vlastimil Babka @ 2025-07-22 11:32 UTC
To: Pedro Falcato, kernel test robot, Bernard Metzler,
Jason Gunthorpe, Leon Romanovsky, linux-rdma@vger.kernel.org
Cc: oe-lkp, lkp, Roman Gushchin, Harry Yoo, David Howells, linux-mm
On 7/22/25 12:52, Pedro Falcato wrote:
> +cc dhowells
+Cc siw+infiniband maintainers too.
Thanks Pedro. Hopefully there can be a hotfix for 6.16; otherwise the fix could
be part of the 6.17 merge window (and I'd tell Linus to merge slab only
afterwards), or I could get the blessing to include it in my tree preceding
commit 5660ee54e798 (to be merged in the 6.17 merge window).
Also, would you submit the fix formally?
Thanks,
Vlastimil
* Re: [linux-next:master] [mm, slab] 5660ee54e7: BUG:KASAN:stack-out-of-bounds_in_copy_from_iter
From: Pedro Falcato @ 2025-07-22 12:01 UTC
To: Vlastimil Babka
Cc: kernel test robot, Bernard Metzler, Jason Gunthorpe,
Leon Romanovsky, linux-rdma@vger.kernel.org, oe-lkp, lkp,
Roman Gushchin, Harry Yoo, David Howells, linux-mm
On Tue, Jul 22, 2025 at 01:32:09PM +0200, Vlastimil Babka wrote:
> On 7/22/25 12:52, Pedro Falcato wrote:
> > +cc dhowells
>
> +Cc siw+infiniband maintainers too.
>
> Thanks Pedro. Hope there can be either a hotfix for 6.16, or the fix is part
> of 6.17 merge window (and I tell Linus to merge slab only afterwards), or I
> get the blessing to include it in my tree preceding commit 5660ee54e798 (to
> be merged in 6.17 merge window).
>
> Also would you submit the fix formally?
Yep, I'll send it out as soon as we figure out the tree situation
(I was also waiting for comments from David, if any).
--
Pedro
* Re: [linux-next:master] [mm, slab] 5660ee54e7: BUG:KASAN:stack-out-of-bounds_in_copy_from_iter
From: David Howells @ 2025-07-28 20:46 UTC
To: Pedro Falcato
Cc: dhowells, kernel test robot, Vlastimil Babka, oe-lkp, lkp,
Roman Gushchin, Harry Yoo, linux-mm
Pedro Falcato <pfalcato@suse.de> wrote:
> --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> @@ -332,11 +332,11 @@ static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
> if (!sendpage_ok(page[i]))
> msg.msg_flags &= ~MSG_SPLICE_PAGES;
> bvec_set_page(&bvec, page[i], bytes, offset);
> - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
>
> try_page_again:
> lock_sock(sk);
> - rv = tcp_sendmsg_locked(sk, &msg, size);
> + rv = tcp_sendmsg_locked(sk, &msg, bytes);
> release_sock(sk);
>
> if (rv > 0) {
Looks good.
Reviewed-by: David Howells <dhowells@redhat.com>