[linux-next:master] [mm, slab] 5660ee54e7: BUG:KASAN:stack-out-of-bounds_in_copy_from_iter


* [linux-next:master] [mm, slab]  5660ee54e7: BUG:KASAN:stack-out-of-bounds_in_copy_from_iter
@ 2025-07-22  7:07 kernel test robot
  2025-07-22 10:52 ` Pedro Falcato
  2025-07-28 20:46 ` David Howells
  0 siblings, 2 replies; 5+ messages in thread
From: kernel test robot @ 2025-07-22  7:07 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: oe-lkp, lkp, Roman Gushchin, Harry Yoo, linux-mm, oliver.sang



Hello,

kernel test robot noticed "BUG:KASAN:stack-out-of-bounds_in_copy_from_iter" on:

commit: 5660ee54e7982f9097ddc684e90f15bdcc7fef4b ("mm, slab: use frozen pages for large kmalloc")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

[test failed on linux-next/master d086c886ceb9f59dea6c3a9dae7eb89e780a20c9]

in testcase: blktests
version: blktests-x86_64-5d9ef47-1_20250709
with following parameters:

	disk: 1SSD
	test: nvme-group-00
	nvme_trtype: rdma
	use_siw: true



config: x86_64-rhel-9.4-func
compiler: gcc-12
test machine: 8 threads Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz (Skylake) with 28G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202507220801.50a7210-lkp@intel.com


[ 232.729908][ T3003] BUG: KASAN: stack-out-of-bounds in _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260) 
[  232.737608][ T3003] Read of size 4 at addr ffffc90002527694 by task siw_tx/2/3003
[  232.745045][ T3003]
[  232.747222][ T3003] CPU: 2 UID: 0 PID: 3003 Comm: siw_tx/2 Not tainted 6.16.0-rc2-00002-g5660ee54e798 #1 PREEMPT(voluntary)
[  232.747226][ T3003] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016
[  232.747228][ T3003] Call Trace:
[  232.747230][ T3003]  <TASK>
[ 232.747231][ T3003] dump_stack_lvl (lib/dump_stack.c:123 (discriminator 1)) 
[ 232.747236][ T3003] print_address_description+0x2c/0x3b0 
[ 232.747241][ T3003] ? _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260) 
[ 232.747244][ T3003] print_report (mm/kasan/report.c:522) 
[ 232.747247][ T3003] ? kasan_addr_to_slab (mm/kasan/common.c:37) 
[ 232.747250][ T3003] ? _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260) 
[ 232.747252][ T3003] kasan_report (mm/kasan/report.c:636) 
[ 232.747255][ T3003] ? _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260) 
[ 232.747259][ T3003] _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260) 
[ 232.747263][ T3003] ? __pfx__copy_from_iter (lib/iov_iter.c:254) 
[ 232.747266][ T3003] ? __pfx_tcp_current_mss (net/ipv4/tcp_output.c:1873) 
[ 232.747270][ T3003] ? check_heap_object (arch/x86/include/asm/bitops.h:206 arch/x86/include/asm/bitops.h:238 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/page-flags.h:867 include/linux/page-flags.h:888 include/linux/mm.h:992 include/linux/mm.h:2050 mm/usercopy.c:199) 
[  232.747274][ T3003]  ? 0xffffffff81000000
[ 232.747276][ T3003] ? __check_object_size (mm/memremap.c:421) 
[ 232.747280][ T3003] skb_do_copy_data_nocache (include/linux/uio.h:228 include/linux/uio.h:245 include/net/sock.h:2243) 
[ 232.747284][ T3003] ? __pfx_skb_do_copy_data_nocache (include/net/sock.h:2234) 
[ 232.747286][ T3003] ? __sk_mem_schedule (net/core/sock.c:3403) 
[ 232.747291][ T3003] tcp_sendmsg_locked (include/net/sock.h:2271 net/ipv4/tcp.c:1254) 
[ 232.747297][ T3003] ? sock_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:750) 
[ 232.747300][ T3003] ? __pfx_tcp_sendmsg_locked (net/ipv4/tcp.c:1061) 
[ 232.747303][ T3003] ? __pfx_sock_sendmsg (net/socket.c:739) 
[ 232.747306][ T3003] ? _raw_spin_lock_bh (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:127 kernel/locking/spinlock.c:178) 
[ 232.747312][ T3003] siw_tcp_sendpages+0x1f1/0x4f0 siw 
[ 232.747326][ T3003] ? __pfx_siw_tcp_sendpages+0x10/0x10 siw 
[ 232.747340][ T3003] siw_tx_hdt (drivers/infiniband/sw/siw/siw_qp_tx.c:379 drivers/infiniband/sw/siw/siw_qp_tx.c:586) siw 
[ 232.747354][ T3003] ? __pfx_siw_tx_hdt (drivers/infiniband/sw/siw/siw_qp_tx.c:431) siw 
[ 232.747368][ T3003] ? dl_scaled_delta_exec (kernel/sched/deadline.c:1481) 
[ 232.747372][ T3003] ? __pfx_sched_balance_rq (kernel/sched/fair.c:11754) 
[ 232.747375][ T3003] ? update_curr_dl_se (kernel/sched/deadline.c:1509) 
[ 232.747379][ T3003] ? place_entity (kernel/sched/fair.c:5211) 
[ 232.747382][ T3003] ? switch_hrtimer_base (kernel/time/hrtimer.c:232 kernel/time/hrtimer.c:258) 
[ 232.747386][ T3003] ? pick_eevdf (kernel/sched/fair.c:946) 
[ 232.747389][ T3003] ? __resched_curr (arch/x86/include/asm/bitops.h:60 include/asm-generic/bitops/instrumented-atomic.h:29 include/linux/thread_info.h:97 kernel/sched/core.c:1114) 
[ 232.747393][ T3003] ? update_curr (kernel/sched/fair.c:1236) 
[ 232.747395][ T3003] ? xas_load (include/linux/xarray.h:175 include/linux/xarray.h:1270 lib/xarray.c:241) 
[ 232.747400][ T3003] ? xa_load (lib/xarray.c:1613) 
[ 232.747403][ T3003] ? __pfx_xa_load (lib/xarray.c:1613) 
[ 232.747407][ T3003] ? ttwu_do_activate (kernel/sched/core.c:3719 kernel/sched/core.c:3749) 
[ 232.747410][ T3003] ? update_rq_clock_task (kernel/sched/sched.h:1327 kernel/sched/pelt.h:120 kernel/sched/core.c:798) 
[ 232.747415][ T3003] ? siw_mem_id2obj (drivers/infiniband/sw/siw/siw_mem.c:28) siw 
[ 232.747425][ T3003] ? __pfx_siw_try_1seg (drivers/infiniband/sw/siw/siw_qp_tx.c:50) siw 
[ 232.747436][ T3003] ? __pfx_try_to_wake_up (kernel/sched/core.c:4189) 
[ 232.747440][ T3003] ? siw_qp_prepare_tx (drivers/infiniband/sw/siw/siw_qp_tx.c:222) siw 
[ 232.747452][ T3003] siw_qp_sq_proc_tx (drivers/infiniband/sw/siw/siw_qp_tx.c:882) siw 
[ 232.747463][ T3003] ? siw_activate_tx (drivers/infiniband/sw/siw/siw_qp.c:996) siw 
[ 232.747474][ T3003] siw_qp_sq_process (drivers/infiniband/sw/siw/siw_qp_tx.c:1038) siw 
[ 232.747486][ T3003] siw_sq_resume (drivers/infiniband/sw/siw/siw_qp_tx.c:1170) siw 
[ 232.747497][ T3003] siw_run_sq (drivers/infiniband/sw/siw/siw_qp_tx.c:1258) siw 
[ 232.747508][ T3003] ? __pfx_siw_run_sq (drivers/infiniband/sw/siw/siw_qp_tx.c:1236) siw 
[ 232.747518][ T3003] ? __pfx__raw_spin_lock_irqsave (kernel/locking/spinlock.c:161) 
[ 232.747522][ T3003] ? __pfx_autoremove_wake_function (kernel/sched/wait.c:383) 
[ 232.747526][ T3003] ? __kthread_parkme (arch/x86/include/asm/bitops.h:206 (discriminator 15) arch/x86/include/asm/bitops.h:238 (discriminator 15) include/asm-generic/bitops/instrumented-non-atomic.h:142 (discriminator 15) kernel/kthread.c:291 (discriminator 15)) 
[ 232.747530][ T3003] ? __pfx_siw_run_sq (drivers/infiniband/sw/siw/siw_qp_tx.c:1236) siw 
[ 232.747541][ T3003] kthread (kernel/kthread.c:464) 
[ 232.747544][ T3003] ? __pfx_kthread (kernel/kthread.c:413) 
[ 232.747546][ T3003] ? __pfx__raw_spin_lock_irq (kernel/locking/spinlock.c:169) 
[ 232.747549][ T3003] ? __pfx_kthread (kernel/kthread.c:413) 
[ 232.747552][ T3003] ? __pfx_kthread (kernel/kthread.c:413) 
[ 232.747555][ T3003] ret_from_fork (arch/x86/kernel/process.c:148) 
[ 232.747559][ T3003] ? __pfx_kthread (kernel/kthread.c:413) 
[ 232.747561][ T3003] ret_from_fork_asm (arch/x86/entry/entry_64.S:258) 
[  232.747568][ T3003]  </TASK>
[  232.747569][ T3003]
[  233.078198][ T3003] The buggy address belongs to stack of task siw_tx/2/3003
[  233.085214][ T3003]  and is located at offset 76 in frame:
[ 233.090677][ T3003] siw_tcp_sendpages+0x0/0x4f0 siw 
[  233.096405][ T3003]
[  233.098576][ T3003] This frame has 2 objects:
[  233.102906][ T3003]  [48, 64) 'bvec'
[  233.102908][ T3003]  [80, 184) 'msg'
[  233.106463][ T3003]
[  233.112188][ T3003] The buggy address belongs to the virtual mapping at
[  233.112188][ T3003]  [ffffc90002520000, ffffc90002529000) created by:
[ 233.112188][ T3003] dup_task_struct (kernel/fork.c:878) 
[  233.129638][ T3003]
[  233.131813][ T3003] The buggy address belongs to the physical page:
[  233.138055][ T3003] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888700000000 pfn:0x745e9a
[  233.147993][ T3003] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[  233.155173][ T3003] raw: 0017ffffc0000000 0000000000000000 dead000000000122 0000000000000000
[  233.163555][ T3003] raw: ffff888700000000 0000000000000000 00000001ffffffff 0000000000000000
[  233.171938][ T3003] page dumped because: kasan: bad access detected
[  233.178164][ T3003]
[  233.180337][ T3003] Memory state around the buggy address:
[  233.185804][ T3003]  ffffc90002527580: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
[  233.193683][ T3003]  ffffc90002527600: 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 00
[  233.201548][ T3003] >ffffc90002527680: 00 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 00
[  233.209414][ T3003]                          ^
[  233.213833][ T3003]  ffffc90002527700: f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00
[  233.221697][ T3003]  ffffc90002527780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  233.229562][ T3003] ==================================================================
[  233.237471][ T3003] Disabling lock debugging due to kernel taint
[  233.243463][ T3003] Oops: general protection fault, probably for non-canonical address 0x5088000005158: 0000 [#1] SMP KASAN PTI
[  233.254872][ T3003] CPU: 2 UID: 0 PID: 3003 Comm: siw_tx/2 Tainted: G    B               6.16.0-rc2-00002-g5660ee54e798 #1 PREEMPT(voluntary)
[  233.267574][ T3003] Tainted: [B]=BAD_PAGE
[  233.271559][ T3003] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016
[ 233.279597][ T3003] RIP: 0010:memcpy_orig (arch/x86/lib/memcpy_64.S:95) 
[ 233.284533][ T3003] Code: 89 07 4c 89 4f 08 4c 89 57 10 4c 89 5f 18 48 8d 7f 20 73 d4 83 c2 20 eb 44 48 01 d6 48 01 d7 48 83 ea 20 0f 1f 00 48 83 ea 20 <4c> 8b 46 f8 4c 8b 4e f0 4c 8b 56 e8 4c 8b 5e e0 48 8d 76 e0 4c 89
All code
========
   0:	89 07                	mov    %eax,(%rdi)
   2:	4c 89 4f 08          	mov    %r9,0x8(%rdi)
   6:	4c 89 57 10          	mov    %r10,0x10(%rdi)
   a:	4c 89 5f 18          	mov    %r11,0x18(%rdi)
   e:	48 8d 7f 20          	lea    0x20(%rdi),%rdi
  12:	73 d4                	jae    0xffffffffffffffe8
  14:	83 c2 20             	add    $0x20,%edx
  17:	eb 44                	jmp    0x5d
  19:	48 01 d6             	add    %rdx,%rsi
  1c:	48 01 d7             	add    %rdx,%rdi
  1f:	48 83 ea 20          	sub    $0x20,%rdx
  23:	0f 1f 00             	nopl   (%rax)
  26:	48 83 ea 20          	sub    $0x20,%rdx
  2a:*	4c 8b 46 f8          	mov    -0x8(%rsi),%r8		<-- trapping instruction
  2e:	4c 8b 4e f0          	mov    -0x10(%rsi),%r9
  32:	4c 8b 56 e8          	mov    -0x18(%rsi),%r10
  36:	4c 8b 5e e0          	mov    -0x20(%rsi),%r11
  3a:	48 8d 76 e0          	lea    -0x20(%rsi),%rsi
  3e:	4c                   	rex.WR
  3f:	89                   	.byte 0x89

Code starting with the faulting instruction
===========================================
   0:	4c 8b 46 f8          	mov    -0x8(%rsi),%r8
   4:	4c 8b 4e f0          	mov    -0x10(%rsi),%r9
   8:	4c 8b 56 e8          	mov    -0x18(%rsi),%r10
   c:	4c 8b 5e e0          	mov    -0x20(%rsi),%r11
  10:	48 8d 76 e0          	lea    -0x20(%rsi),%rsi
  14:	4c                   	rex.WR
  15:	89                   	.byte 0x89


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250722/202507220801.50a7210-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki




* Re: [linux-next:master] [mm, slab]  5660ee54e7: BUG:KASAN:stack-out-of-bounds_in_copy_from_iter
  2025-07-22  7:07 [linux-next:master] [mm, slab] 5660ee54e7: BUG:KASAN:stack-out-of-bounds_in_copy_from_iter kernel test robot
@ 2025-07-22 10:52 ` Pedro Falcato
  2025-07-22 11:32   ` Vlastimil Babka
  2025-07-28 20:46 ` David Howells
  1 sibling, 1 reply; 5+ messages in thread
From: Pedro Falcato @ 2025-07-22 10:52 UTC (permalink / raw)
  To: kernel test robot
  Cc: Vlastimil Babka, oe-lkp, lkp, Roman Gushchin, Harry Yoo,
	David Howells, linux-mm

+cc dhowells

On Tue, Jul 22, 2025 at 03:07:44PM +0800, kernel test robot wrote:
> [ 232.729908][ T3003] BUG: KASAN: stack-out-of-bounds in _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260) 
> [  232.737608][ T3003] Read of size 4 at addr ffffc90002527694 by task siw_tx/2/3003
> [  232.745045][ T3003]
> [  232.747222][ T3003] CPU: 2 UID: 0 PID: 3003 Comm: siw_tx/2 Not tainted 6.16.0-rc2-00002-g5660ee54e798 #1 PREEMPT(voluntary)
> [  232.747226][ T3003] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016
> [  232.747228][ T3003] Call Trace:
> [  232.747230][ T3003]  <TASK>
> [ 232.747231][ T3003] dump_stack_lvl (lib/dump_stack.c:123 (discriminator 1)) 
> [ 232.747236][ T3003] print_address_description+0x2c/0x3b0 
> [ 232.747241][ T3003] ? _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260) 
> [ 232.747244][ T3003] print_report (mm/kasan/report.c:522) 
> [ 232.747247][ T3003] ? kasan_addr_to_slab (mm/kasan/common.c:37) 
> [ 232.747250][ T3003] ? _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260) 
> [ 232.747252][ T3003] kasan_report (mm/kasan/report.c:636) 
> [ 232.747255][ T3003] ? _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260) 
> [ 232.747259][ T3003] _copy_from_iter (include/linux/iov_iter.h:117 include/linux/iov_iter.h:304 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260) 
> [ 232.747263][ T3003] ? __pfx__copy_from_iter (lib/iov_iter.c:254) 
> [ 232.747266][ T3003] ? __pfx_tcp_current_mss (net/ipv4/tcp_output.c:1873) 
> [ 232.747270][ T3003] ? check_heap_object (arch/x86/include/asm/bitops.h:206 arch/x86/include/asm/bitops.h:238 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/page-flags.h:867 include/linux/page-flags.h:888 include/linux/mm.h:992 include/linux/mm.h:2050 mm/usercopy.c:199) 
> [  232.747274][ T3003]  ? 0xffffffff81000000
> [ 232.747276][ T3003] ? __check_object_size (mm/memremap.c:421) 
> [ 232.747280][ T3003] skb_do_copy_data_nocache (include/linux/uio.h:228 include/linux/uio.h:245 include/net/sock.h:2243) 
> [ 232.747284][ T3003] ? __pfx_skb_do_copy_data_nocache (include/net/sock.h:2234) 
> [ 232.747286][ T3003] ? __sk_mem_schedule (net/core/sock.c:3403) 
> [ 232.747291][ T3003] tcp_sendmsg_locked (include/net/sock.h:2271 net/ipv4/tcp.c:1254) 
> [ 232.747297][ T3003] ? sock_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:750) 
> [ 232.747300][ T3003] ? __pfx_tcp_sendmsg_locked (net/ipv4/tcp.c:1061) 
> [ 232.747303][ T3003] ? __pfx_sock_sendmsg (net/socket.c:739) 
> [ 232.747306][ T3003] ? _raw_spin_lock_bh (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:127 kernel/locking/spinlock.c:178) 
> [ 232.747312][ T3003] siw_tcp_sendpages+0x1f1/0x4f0 siw 

It seems to me that the change David introduced back in 6.4 was silently
broken (credit to Vlastimil for initially pointing it out to me). Namely:

https://lore.kernel.org/all/20230331160914.1608208-1-dhowells@redhat.com/
introduced three changes that inline do_tcp_sendpages():

c2ff29e99a76 ("siw: Inline do_tcp_sendpages()")
e117dcfd646e ("tls: Inline do_tcp_sendpages()")
7f8816ab4bae ("espintcp: Inline do_tcp_sendpages()")

(there's a separate ebf2e8860eea, but it looks okay)

Taking a closer look into siw (my comments):

static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
			     size_t size)
[...]
	/* Calculate the number of bytes we need to push, for this page
	 * specifically */
	size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
	/* If we can't splice it, then copy it in, as normal */
	if (!sendpage_ok(page[i]))
		msg.msg_flags &= ~MSG_SPLICE_PAGES;
	/* Set the bvec pointing to the page, with len $bytes */
	bvec_set_page(&bvec, page[i], bytes, offset);
	/* Set the iter to $size, aka the size of the whole sendpages (!!!) */
	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
try_page_again:
	lock_sock(sk);
	/* Sendmsg with $size size (!!!) */
	rv = tcp_sendmsg_locked(sk, &msg, size);
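
To make the mismatch concrete, here is a minimal user-space sketch (not kernel
code) of the arithmetic in the snippet above. It assumes 4 KiB pages, and
chunk_len() and overclaim() are hypothetical helper names that do not exist
in siw:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, as on the x86_64 test machine. */
#define MODEL_PAGE_SIZE 4096u

/* Mirrors min_t(size_t, PAGE_SIZE - offset, size): bytes that fit in this page. */
static size_t chunk_len(size_t offset, size_t size)
{
	size_t room;

	assert(offset < MODEL_PAGE_SIZE);
	room = MODEL_PAGE_SIZE - offset;
	return room < size ? room : size;
}

/*
 * Bytes the iterator claims to hold beyond the single segment backing it.
 * Non-zero means a consumer trusting the iterator's count will want data
 * that the one bvec cannot supply.
 */
static size_t overclaim(size_t iter_count, size_t bvec_len)
{
	return iter_count > bvec_len ? iter_count - bvec_len : 0;
}
```

For example, with offset = 512 and size = 10000 the bvec covers 3584 bytes
while the iterator is told it has 10000, so it claims 6416 bytes that no
segment backs. Passing bytes as the count instead makes the overclaim zero.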


Now, (probably) why we didn't see this before: ever since Vlastimil introduced
5660ee54e798 ("mm, slab: use frozen pages for large kmalloc") into -next, sendpage_ok()
fails for large kmalloc pages. As a result we no longer take the MSG_SPLICE_PAGES path,
which differs subtly from the copy path deep in the iov_iter code:

(MSG_SPLICE_PAGES)
skb_splice_from_iter
  iov_iter_extract_pages
    iov_iter_extract_bvec_pages
      uses i->nr_segs to correctly stop in its tracks before OoB'ing everywhere
  skb_splice_from_iter gets a "short" read

(!MSG_SPLICE_PAGES)
skb_copy_to_page_nocache copy=iov_iter_count
 [...]
   copy_from_iter
   	/* this doesn't help */
     	if (unlikely(iter->count < len))
		len = iter->count;
	  iterate_bvec
	    ... and we run off the bvecs
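
The difference between the two paths can be sketched in user space as
follows. This is only a model of the behaviour described above, not the
kernel's iterate_bvec() or iov_iter_extract_bvec_pages(); struct model_bvec
and both walk functions are made up for illustration, and `cap` exists only
so the model itself never reads out of bounds:

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for a bvec entry; not the kernel's struct bio_vec. */
struct model_bvec {
	const void *page;
	unsigned int len;
	unsigned int offset;
};

/*
 * Copy-path model: the walk is bounded only by `count`. Returns how many
 * array slots were dereferenced. `cap` is the real size of the backing
 * array, so a result larger than nr_segs is the out-of-bounds case.
 */
static size_t walk_by_count(const struct model_bvec *v, size_t cap, size_t count)
{
	size_t i = 0;

	while (count > 0 && i < cap) {
		size_t part = v[i].len < count ? v[i].len : count;

		count -= part;
		i++;
	}
	return i;
}

/*
 * Splice-path model: the walk also stops at nr_segs. Returns the bytes
 * consumed, which may be fewer than `count` (a "short" read).
 */
static size_t walk_by_segs(const struct model_bvec *v, size_t nr_segs, size_t count)
{
	size_t done = 0, i;

	for (i = 0; i < nr_segs && done < count; i++) {
		size_t left = count - done;

		done += v[i].len < left ? v[i].len : left;
	}
	return done;
}
```

With one real segment of 3584 bytes followed by two slots standing in for
whatever happens to sit next to it on the stack, a count of 10000 makes
walk_by_count() touch three slots, while walk_by_segs() with nr_segs = 1
stops after 3584 bytes.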

Anyway, long-winded analysis just to say:

--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
@@ -332,11 +332,11 @@ static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
                if (!sendpage_ok(page[i]))
                        msg.msg_flags &= ~MSG_SPLICE_PAGES;
                bvec_set_page(&bvec, page[i], bytes, offset);
-               iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
+               iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);

 try_page_again:
                lock_sock(sk);
-               rv = tcp_sendmsg_locked(sk, &msg, size);
+               rv = tcp_sendmsg_locked(sk, &msg, bytes);
                release_sock(sk);

                if (rv > 0) {

(I had a closer look at the tls, espintcp changes, and they seem correct)
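
The KASAN report looks consistent with this reading: the frame holds 'bvec'
at [48, 64) and the bad 4-byte read is at offset 76. A small sketch of that
arithmetic, assuming a 16-byte bvec entry laid out as pointer, length, offset
on a 64-bit build (struct model_bvec, struct hit and locate() are
hypothetical names used only here):

```c
#include <assert.h>
#include <stddef.h>

/* Assumed 64-bit layout: 8-byte pointer, then two 4-byte fields. */
struct model_bvec {
	void *page;
	unsigned int len;
	unsigned int offset;
};

struct hit {
	size_t index;     /* which array element the address falls into */
	size_t field_off; /* byte offset inside that element */
};

/* Treat the stack object as element 0 of an array and locate frame_off in it. */
static struct hit locate(size_t frame_off, size_t array_start, size_t elem_size)
{
	struct hit h;

	assert(frame_off >= array_start && elem_size > 0);
	h.index = (frame_off - array_start) / elem_size;
	h.field_off = (frame_off - array_start) % elem_size;
	return h;
}
```

locate(76, 48, sizeof(struct model_bvec)) gives element 1 at byte 12, that
is, the offset field of a second bvec entry that was never there, which
matches a walker stepping one entry past the single on-stack bvec.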

-- 
Pedro



* Re: [linux-next:master] [mm, slab] 5660ee54e7: BUG:KASAN:stack-out-of-bounds_in_copy_from_iter
  2025-07-22 10:52 ` Pedro Falcato
@ 2025-07-22 11:32   ` Vlastimil Babka
  2025-07-22 12:01     ` Pedro Falcato
  0 siblings, 1 reply; 5+ messages in thread
From: Vlastimil Babka @ 2025-07-22 11:32 UTC (permalink / raw)
  To: Pedro Falcato, kernel test robot, Bernard Metzler,
	Jason Gunthorpe, Leon Romanovsky, linux-rdma@vger.kernel.org
  Cc: oe-lkp, lkp, Roman Gushchin, Harry Yoo, David Howells, linux-mm

On 7/22/25 12:52, Pedro Falcato wrote:
> +cc dhowells

+Cc siw+infiniband maintainers too.

Thanks Pedro. I hope there can be either a hotfix for 6.16, or the fix can be
part of the 6.17 merge window (and I tell Linus to merge slab only afterwards),
or I get the blessing to include it in my tree preceding commit 5660ee54e798
(to be merged in the 6.17 merge window).

Also would you submit the fix formally?

Thanks,
Vlastimil





* Re: [linux-next:master] [mm, slab] 5660ee54e7: BUG:KASAN:stack-out-of-bounds_in_copy_from_iter
  2025-07-22 11:32   ` Vlastimil Babka
@ 2025-07-22 12:01     ` Pedro Falcato
  0 siblings, 0 replies; 5+ messages in thread
From: Pedro Falcato @ 2025-07-22 12:01 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: kernel test robot, Bernard Metzler, Jason Gunthorpe,
	Leon Romanovsky, linux-rdma@vger.kernel.org, oe-lkp, lkp,
	Roman Gushchin, Harry Yoo, David Howells, linux-mm

On Tue, Jul 22, 2025 at 01:32:09PM +0200, Vlastimil Babka wrote:
> On 7/22/25 12:52, Pedro Falcato wrote:
> > +cc dhowells
> 
> +Cc siw+infiniband maintainers too.
> 
> Thanks Pedro. Hope there can be either a hotfix for 6.16, or the fix is part
> of 6.17 merge window (and I tell Linus to merge slab only afterwards), or I
> get the blessing to include it in my tree preceding commit 5660ee54e798 (to
> be merged in 6.17 merge window).
> 
> Also would you submit the fix formally?

Yep, I'll send it out as soon as we figure out the tree situation
(I was also waiting for comments from David, if any).

-- 
Pedro



* Re: [linux-next:master] [mm, slab] 5660ee54e7: BUG:KASAN:stack-out-of-bounds_in_copy_from_iter
  2025-07-22  7:07 [linux-next:master] [mm, slab] 5660ee54e7: BUG:KASAN:stack-out-of-bounds_in_copy_from_iter kernel test robot
  2025-07-22 10:52 ` Pedro Falcato
@ 2025-07-28 20:46 ` David Howells
  1 sibling, 0 replies; 5+ messages in thread
From: David Howells @ 2025-07-28 20:46 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: dhowells, kernel test robot, Vlastimil Babka, oe-lkp, lkp,
	Roman Gushchin, Harry Yoo, linux-mm

Pedro Falcato <pfalcato@suse.de> wrote:

> --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> @@ -332,11 +332,11 @@ static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
>                 if (!sendpage_ok(page[i]))
>                         msg.msg_flags &= ~MSG_SPLICE_PAGES;
>                 bvec_set_page(&bvec, page[i], bytes, offset);
> -               iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> +               iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
> 
>  try_page_again:
>                 lock_sock(sk);
> -               rv = tcp_sendmsg_locked(sk, &msg, size);
> +               rv = tcp_sendmsg_locked(sk, &msg, bytes);
>                 release_sock(sk);
> 
>                 if (rv > 0) {

Looks good.

Reviewed-by: David Howells <dhowells@redhat.com>



