Linux network filesystem support library
* [linus:master] [netfs]  cd0277ed0c: BUG:KASAN:slab-use-after-free_in_copy_from_iter
@ 2024-09-18  2:18 kernel test robot
  2024-09-24 21:40 ` David Howells
  0 siblings, 1 reply; 3+ messages in thread
From: kernel test robot @ 2024-09-18  2:18 UTC (permalink / raw)
  To: David Howells
  Cc: oe-lkp, lkp, linux-kernel, Christian Brauner, Jeff Layton, netfs,
	linux-fsdevel, oliver.sang



Hello,

kernel test robot noticed "BUG:KASAN:slab-use-after-free_in_copy_from_iter" on:

commit: cd0277ed0c188dd40e7744e89299af7b78831ca4 ("netfs: Use new folio_queue data type and iterator instead of xarray iter")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master      a430d95c5efa2b545d26a094eb5f624e36732af0]
[test failed on linux-next/master 7083504315d64199a329de322fce989e1e10f4f7]

in testcase: xfstests
version: xfstests-x86_64-b1465280-1_20240909
with the following parameters:

	disk: 4HDD
	fs: ext4
	fs2: smbv2
	test: generic-group-11



compiler: gcc-12
test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (Skylake) with 32G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202409180928.f20b5a08-oliver.sang@intel.com


[ 461.422026][ T2594] BUG: KASAN: slab-use-after-free in _copy_from_iter (include/linux/iov_iter.h:157 include/linux/iov_iter.h:308 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260) 
[  461.429760][ T2594] Read of size 8 at addr ffff8881ec497520 by task aio-stress/2594
[  461.437419][ T2594]
[  461.439617][ T2594] CPU: 2 UID: 0 PID: 2594 Comm: aio-stress Not tainted 6.11.0-rc6-00065-gcd0277ed0c18 #1
[  461.449270][ T2594] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.8.1 12/05/2017
[  461.457360][ T2594] Call Trace:
[  461.460504][ T2594]  <TASK>
[ 461.463295][ T2594] dump_stack_lvl (lib/dump_stack.c:122 (discriminator 1)) 
[ 461.467658][ T2594] print_address_description+0x2c/0x3a0 
[ 461.474100][ T2594] ? _copy_from_iter (include/linux/iov_iter.h:157 include/linux/iov_iter.h:308 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260) 
[ 461.479060][ T2594] print_report (mm/kasan/report.c:489) 
[ 461.483327][ T2594] ? kasan_addr_to_slab (mm/kasan/common.c:37) 
[ 461.488112][ T2594] ? _copy_from_iter (include/linux/iov_iter.h:157 include/linux/iov_iter.h:308 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260) 
[ 461.493071][ T2594] kasan_report (mm/kasan/report.c:603) 
[ 461.497337][ T2594] ? _copy_from_iter (include/linux/iov_iter.h:157 include/linux/iov_iter.h:308 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260) 
[ 461.502295][ T2594] _copy_from_iter (include/linux/iov_iter.h:157 include/linux/iov_iter.h:308 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260) 
[ 461.507081][ T2594] ? __pfx_try_charge_memcg (mm/memcontrol.c:2158) 
[ 461.512301][ T2594] ? __pfx__copy_from_iter (lib/iov_iter.c:254) 
[ 461.517445][ T2594] ? __mod_memcg_state (mm/memcontrol.c:555 mm/memcontrol.c:669) 
[ 461.522420][ T2594] ? check_heap_object (arch/x86/include/asm/bitops.h:206 arch/x86/include/asm/bitops.h:238 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/page-flags.h:827 include/linux/page-flags.h:848 include/linux/mm.h:1126 include/linux/mm.h:2142 mm/usercopy.c:199) 
[  461.527414][ T2594]  ? 0xffffffff81000000
[ 461.531417][ T2594] ? __check_object_size (mm/memremap.c:167) 
[ 461.537080][ T2594] skb_do_copy_data_nocache (include/linux/uio.h:219 include/linux/uio.h:236 include/net/sock.h:2167) 
[ 461.542472][ T2594] ? __pfx_skb_do_copy_data_nocache (include/net/sock.h:2158) 
[ 461.548385][ T2594] ? __sk_mem_schedule (net/core/sock.c:3194) 
[ 461.553191][ T2594] tcp_sendmsg_locked (include/net/sock.h:2195 net/ipv4/tcp.c:1218) 
[ 461.558236][ T2594] ? cifs_strict_fsync (fs/smb/client/cifsglob.h:1577 fs/smb/client/file.c:2658) cifs
[ 461.563805][ T2594] ? __x64_sys_fsync (include/linux/file.h:47 fs/sync.c:213 fs/sync.c:220 fs/sync.c:218 fs/sync.c:218) 
[ 461.568431][ T2594] ? __pfx_tcp_sendmsg_locked (net/ipv4/tcp.c:1049) 
[ 461.573822][ T2594] ? do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) 
[ 461.578348][ T2594] ? _raw_spin_lock_bh (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:127 kernel/locking/spinlock.c:178) 
[ 461.583134][ T2594] ? __pfx__raw_spin_lock_bh (kernel/locking/spinlock.c:177) 
[ 461.588464][ T2594] tcp_sendmsg (net/ipv4/tcp.c:1355) 
[ 461.592569][ T2594] sock_sendmsg (net/socket.c:730 net/socket.c:745 net/socket.c:768) 
[ 461.596921][ T2594] ? stack_trace_save (kernel/stacktrace.c:123) 
[ 461.601618][ T2594] ? __pfx_sock_sendmsg (net/socket.c:757) 
[ 461.606491][ T2594] ? recalc_sigpending (arch/x86/include/asm/bitops.h:75 include/asm-generic/bitops/instrumented-atomic.h:42 include/linux/thread_info.h:94 kernel/signal.c:178 kernel/signal.c:175) 
[ 461.611464][ T2594] smb_send_kvec (fs/smb/client/transport.c:215) cifs
[ 461.616599][ T2594] __smb_send_rqst (fs/smb/client/transport.c:361) cifs
[ 461.621910][ T2594] ? __pfx___smb_send_rqst (fs/smb/client/transport.c:274) cifs
[ 461.627741][ T2594] ? __pfx_mempool_alloc_noprof (mm/mempool.c:385) 
[ 461.633311][ T2594] ? __asan_memset (mm/kasan/shadow.c:84) 
[ 461.637750][ T2594] ? _raw_spin_lock (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154) 
[ 461.642275][ T2594] ? __pfx__raw_spin_lock (kernel/locking/spinlock.c:153) 
[ 461.647320][ T2594] ? smb2_setup_async_request (fs/smb/client/smb2transport.c:903) cifs
[ 461.653633][ T2594] cifs_call_async (fs/smb/client/transport.c:841) cifs
[ 461.658940][ T2594] ? __pfx_cifs_call_async (fs/smb/client/transport.c:787) cifs
[ 461.664762][ T2594] ? _raw_spin_lock (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154) 
[ 461.669288][ T2594] ? __asan_memset (mm/kasan/shadow.c:84) 
[ 461.673740][ T2594] ? __smb2_plain_req_init (arch/x86/include/asm/atomic.h:53 include/linux/atomic/atomic-arch-fallback.h:992 include/linux/atomic/atomic-instrumented.h:436 fs/smb/client/smb2pdu.c:555) cifs
[ 461.679842][ T2594] smb2_async_writev (fs/smb/client/smb2pdu.c:5026) cifs
[ 461.685454][ T2594] ? __pfx_smb2_async_writev (fs/smb/client/smb2pdu.c:4894) cifs
[ 461.691472][ T2594] ? _raw_spin_lock (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154) 
[ 461.696006][ T2594] ? __pfx__raw_spin_lock_bh (kernel/locking/spinlock.c:177) 
[ 461.701323][ T2594] ? cifs_prepare_write (fs/smb/client/file.c:77) cifs
[ 461.707116][ T2594] ? netfs_advance_write (fs/netfs/write_issue.c:300) 
[ 461.712257][ T2594] netfs_advance_write (fs/netfs/write_issue.c:300) 
[ 461.717218][ T2594] ? netfs_buffer_append_folio (arch/x86/include/asm/bitops.h:206 (discriminator 3) arch/x86/include/asm/bitops.h:238 (discriminator 3) include/asm-generic/bitops/instrumented-non-atomic.h:142 (discriminator 3) include/linux/page-flags.h:827 (discriminator 3) include/linux/page-flags.h:848 (discriminator 3) include/linux/mm.h:1126 (discriminator 3) include/linux/folio_queue.h:102 (discriminator 3) fs/netfs/misc.c:43 (discriminator 3)) 
[ 461.722870][ T2594] netfs_write_folio (fs/netfs/write_issue.c:468) 
[ 461.727743][ T2594] ? writeback_iter (mm/page-writeback.c:2591) 
[ 461.732460][ T2594] netfs_writepages (fs/netfs/write_issue.c:540) 
[ 461.737161][ T2594] ? __pfx_netfs_writepages (fs/netfs/write_issue.c:499) 
[ 461.742379][ T2594] ? copy_page_from_iter_atomic (arch/x86/include/asm/uaccess_64.h:110 arch/x86/include/asm/uaccess_64.h:125 lib/iov_iter.c:55 include/linux/iov_iter.h:30 include/linux/iov_iter.h:300 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:481) 
[ 461.748225][ T2594] do_writepages (mm/page-writeback.c:2683) 
[ 461.752665][ T2594] ? inode_maybe_inc_iversion (arch/x86/include/asm/atomic64_64.h:101 include/linux/atomic/atomic-arch-fallback.h:4256 include/linux/atomic/atomic-instrumented.h:2858 fs/libfs.c:2020) 
[ 461.758140][ T2594] ? __pfx_do_writepages (mm/page-writeback.c:2673) 
[ 461.763099][ T2594] ? __pfx___might_resched (kernel/sched/core.c:8418) 
[ 461.768230][ T2594] ? shmem_write_end (arch/x86/include/asm/atomic.h:67 include/linux/atomic/atomic-arch-fallback.h:2278 include/linux/atomic/atomic-instrumented.h:1384 include/linux/page_ref.h:205 include/linux/mm.h:1152 include/linux/mm.h:1157 include/linux/mm.h:1489 mm/shmem.c:2934) 
[ 461.773015][ T2594] ? balance_dirty_pages_ratelimited_flags (mm/page-writeback.c:2084) 
[ 461.779621][ T2594] ? _raw_spin_lock (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154) 
[ 461.784146][ T2594] ? __pfx__raw_spin_lock (kernel/locking/spinlock.c:153) 
[ 461.789190][ T2594] ? generic_perform_write (mm/filemap.c:4044) 
[ 461.794495][ T2594] ? wbc_attach_and_unlock_inode (arch/x86/include/asm/jump_label.h:27 include/linux/backing-dev.h:176 fs/fs-writeback.c:737) 
[ 461.800236][ T2594] filemap_fdatawrite_wbc (mm/filemap.c:398 mm/filemap.c:387) 
[ 461.805469][ T2594] __filemap_fdatawrite_range (mm/filemap.c:422) 
[ 461.810860][ T2594] ? __pfx___filemap_fdatawrite_range (mm/filemap.c:422) 
[ 461.816951][ T2594] ? mutex_unlock (arch/x86/include/asm/atomic64_64.h:101 include/linux/atomic/atomic-arch-fallback.h:4329 include/linux/atomic/atomic-long.h:1506 include/linux/atomic/atomic-instrumented.h:4481 kernel/locking/mutex.c:181 kernel/locking/mutex.c:545) 
[ 461.821303][ T2594] ? __pfx_mutex_unlock (kernel/locking/mutex.c:543) 
[ 461.826174][ T2594] file_write_and_wait_range (mm/filemap.c:788) 
[ 461.831566][ T2594] cifs_strict_fsync (fs/smb/client/file.c:2660) cifs
[ 461.836956][ T2594] __x64_sys_fsync (include/linux/file.h:47 fs/sync.c:213 fs/sync.c:220 fs/sync.c:218 fs/sync.c:218) 
[ 461.841414][ T2594] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) 
[ 461.845780][ T2594] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) 
[  461.851526][ T2594] RIP: 0033:0x7f8302baab10
[ 461.855793][ T2594] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 80 3d d1 ba 0d 00 00 74 17 b8 4a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c
All code
========
   0:	00 f7                	add    %dh,%bh
   2:	d8 64 89 01          	fsubs  0x1(%rcx,%rcx,4)
   6:	48 83 c8 ff          	or     $0xffffffffffffffff,%rax
   a:	c3                   	retq   
   b:	66 2e 0f 1f 84 00 00 	nopw   %cs:0x0(%rax,%rax,1)
  12:	00 00 00 
  15:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  1a:	80 3d d1 ba 0d 00 00 	cmpb   $0x0,0xdbad1(%rip)        # 0xdbaf2
  21:	74 17                	je     0x3a
  23:	b8 4a 00 00 00       	mov    $0x4a,%eax
  28:	0f 05                	syscall 
  2a:*	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax		<-- trapping instruction
  30:	77 48                	ja     0x7a
  32:	c3                   	retq   
  33:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)
  3a:	48 83 ec 18          	sub    $0x18,%rsp
  3e:	89                   	.byte 0x89
  3f:	7c                   	.byte 0x7c

Code starting with the faulting instruction
===========================================
   0:	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax
   6:	77 48                	ja     0x50
   8:	c3                   	retq   
   9:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)
  10:	48 83 ec 18          	sub    $0x18,%rsp
  14:	89                   	.byte 0x89
  15:	7c                   	.byte 0x7c


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240918/202409180928.f20b5a08-oliver.sang@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [linus:master] [netfs] cd0277ed0c: BUG:KASAN:slab-use-after-free_in_copy_from_iter
  2024-09-18  2:18 [linus:master] [netfs] cd0277ed0c: BUG:KASAN:slab-use-after-free_in_copy_from_iter kernel test robot
@ 2024-09-24 21:40 ` David Howells
  2024-09-26  2:15   ` Oliver Sang
  0 siblings, 1 reply; 3+ messages in thread
From: David Howells @ 2024-09-24 21:40 UTC (permalink / raw)
  To: kernel test robot
  Cc: dhowells, oe-lkp, lkp, linux-kernel, Christian Brauner,
	Jeff Layton, netfs, linux-fsdevel

Hi Oliver,

Can you try the attached?

Thanks,
David
---
netfs: Fix write oops in generic/346 (9p) and maybe generic/074 (cifs)

In netfslib, a buffered writeback operation has a 'write queue' of folios
that are being written, held in a linear sequence of folio_queue structs.
The 'issuer' adds new folio_queues on the leading edge of the queue and
populates each one progressively; the 'collector' pops them off the
trailing edge and discards them and the folios they point to as they are
consumed.

The queue is required to always retain at least one folio_queue structure.
This allows the queue to be accessed without locking and with just a bit of
barriering.
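
As a rough illustration of the rule above, here is a minimal user-space
sketch of a single-issuer, single-collector list in which the collector
never removes the last remaining segment.  This is not the netfslib code:
struct toy_seg, struct toy_queue, toy_append() and toy_pop() are
hypothetical names, and C11 atomics stand in for the kernel's barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct folio_queue. */
struct toy_seg {
	_Atomic(struct toy_seg *) next;	/* written by issuer, read by collector */
	int payload;
};

struct toy_queue {
	struct toy_seg *head;	/* trailing edge: collector only, once set up */
	struct toy_seg *tail;	/* leading edge: issuer only */
};

/* Issuer side: append a segment at the leading edge.  The release store
 * publishes the fully initialised segment to the collector.  The very
 * first append is assumed to happen before the collector starts.
 * Returns 0 on success or -1 on allocation failure.
 */
static int toy_append(struct toy_queue *q, int payload)
{
	struct toy_seg *seg = malloc(sizeof(*seg));

	assert(!q->head == !q->tail);	/* both set or both unset */
	if (!seg)
		return -1;
	atomic_init(&seg->next, NULL);
	seg->payload = payload;
	if (q->tail)
		atomic_store_explicit(&q->tail->next, seg,
				      memory_order_release);
	else
		q->head = seg;
	q->tail = seg;
	return 0;
}

/* Collector side: free the trailing segment, but only if it has a
 * successor, so the queue always retains at least one segment.
 * Returns 1 if a segment was freed, 0 if it was kept as a placeholder.
 */
static int toy_pop(struct toy_queue *q)
{
	struct toy_seg *head = q->head, *next;

	if (!head)
		return 0;
	next = atomic_load_explicit(&head->next, memory_order_acquire);
	if (!next)
		return 0;	/* last one: keep it */
	q->head = next;
	free(head);
	return 1;
}
```

Note that as soon as toy_append() has stored ->next on the old tail,
toy_pop() is free to release that old tail, so the issuer must not keep
any other reference to it.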

When a new subrequest is prepared, its ->io_iter iterator is pointed at the
current end of the write queue and then the iterator is extended as more
data is added to the queue until the subrequest is committed.

Now, the problem is that the folio_queue at the leading edge of the write
queue when a subrequest is prepared might have been entirely consumed - but
not yet removed from the queue as it is the only remaining one and is
preventing the queue from collapsing.

So, what happens is that subreq->io_iter is pointed at the spent
folio_queue, then a new folio_queue is added, and, at that point, the
collector is entirely at liberty to immediately delete the spent
folio_queue.

This leaves the subreq->io_iter pointing at a freed object.  If the system
is lucky, iterate_folioq() follows ->io_iter to the as-yet uncorrupted
freed object and advances to the next folio_queue in the queue.

In the case seen, however, the freed object gets recycled and put back onto
the queue at the tail and filled to the end.  This confuses
iterate_folioq() and it tries to step ->next, which may be NULL - resulting
in an oops.

Fix this by the following means:

 (1) When preparing a write subrequest, make sure there's a folio_queue
     struct with space in it at the leading edge of the queue.  A function
     to make space is split out of the function to append a folio so that
     it can be called for this purpose.

 (2) If the request struct iterator is pointing to a completely spent
     folio_queue when we make space, then advance the iterator to the newly
     allocated folio_queue.  The subrequest's iterator will then be set
     from this.

Whilst we're at it, also split out the function to allocate a folio_queue,
initialise it and do the accounting.
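
Steps (1) and (2) above can be modelled in user space roughly as follows.
This is only a sketch under simplifying assumptions, not the patch itself:
struct toy_seg, struct toy_iter, struct toy_req, TOY_NR_SLOTS and
toy_make_space() are hypothetical names, every segment has the same fixed
capacity, and there is no concurrency or accounting.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_NR_SLOTS 4	/* hypothetical capacity of one segment */

/* Hypothetical stand-in for struct folio_queue. */
struct toy_seg {
	struct toy_seg *next, *prev;
	unsigned int nr_used;	/* slots filled by the issuer so far */
};

/* Hypothetical stand-in for the folioq fields of struct iov_iter. */
struct toy_iter {
	struct toy_seg *seg;
	unsigned int slot;
};

struct toy_req {
	struct toy_seg *head, *tail;
	struct toy_iter iter;	/* the request's master iterator */
};

/* Make sure the tail segment has a free slot, appending a new segment if
 * it does not.  If the master iterator sits at the very end of the old,
 * full tail, move it to the start of the new segment so that it never
 * refers to a segment the collector is free to delete.
 * Returns the segment with space, or NULL on allocation failure.
 */
static struct toy_seg *toy_make_space(struct toy_req *req)
{
	struct toy_seg *prev = req->tail, *seg;

	assert(!req->head == !req->tail);	/* both set or both unset */
	if (prev && prev->nr_used < TOY_NR_SLOTS)
		return prev;

	seg = calloc(1, sizeof(*seg));
	if (!seg)
		return NULL;
	seg->prev = prev;
	if (prev)
		prev->next = seg;	/* from here on, prev may go away */
	req->tail = seg;

	if (!req->head) {
		/* First segment: start the iterator on it. */
		req->head = seg;
		req->iter.seg = seg;
		req->iter.slot = 0;
	} else if (req->iter.seg == prev && req->iter.slot == TOY_NR_SLOTS) {
		/* Iterator was parked on a completely spent segment. */
		req->iter.seg = seg;
		req->iter.slot = 0;
	}
	return seg;
}
```

A subrequest that copies req->iter after toy_make_space() has run then
starts on a segment that still has a free slot, or at a position inside a
segment that has not been fully consumed.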

The oops could be triggered using the generic/346 xfstest with a filesystem
on 9P over TCP with cache=loose.  The oops looked something like:

 BUG: kernel NULL pointer dereference, address: 0000000000000008
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 ...
 RIP: 0010:_copy_from_iter+0x2db/0x530
 ...
 Call Trace:
  <TASK>
 ...
  p9pdu_vwritef+0x3d8/0x5d0
  p9_client_prepare_req+0xa8/0x140
  p9_client_rpc+0x81/0x280
  p9_client_write+0xcf/0x1c0
  v9fs_issue_write+0x87/0xc0
  netfs_advance_write+0xa0/0xb0
  netfs_write_folio.isra.0+0x42d/0x500
  netfs_writepages+0x15a/0x1f0
  do_writepages+0xd1/0x220
  filemap_fdatawrite_wbc+0x5c/0x80
  v9fs_mmap_vm_close+0x7d/0xb0
  remove_vma+0x35/0x70
  vms_complete_munmap_vmas+0x11a/0x170
  do_vmi_align_munmap+0x17d/0x1c0
  do_vmi_munmap+0x13e/0x150
  __vm_munmap+0x92/0xd0
  __x64_sys_munmap+0x17/0x20
  do_syscall_64+0x80/0xe0
  entry_SYSCALL_64_after_hwframe+0x71/0x79

This may also fix a similar-looking issue with cifs and generic/074.

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202409180928.f20b5a08-oliver.sang@intel.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Van Hensbergen <ericvh@kernel.org>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Christian Schoenebeck <linux_oss@crudebyte.com>
cc: Steve French <sfrench@samba.org>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: v9fs@lists.linux.dev
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/internal.h    |    2 +
 fs/netfs/misc.c        |   72 ++++++++++++++++++++++++++++++++++---------------
 fs/netfs/objects.c     |   12 ++++++++
 fs/netfs/write_issue.c |   12 +++++++-
 4 files changed, 76 insertions(+), 22 deletions(-)

diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
index c7f23dd3556a..79c0ad89affb 100644
--- a/fs/netfs/internal.h
+++ b/fs/netfs/internal.h
@@ -58,6 +58,7 @@ static inline void netfs_proc_del_rreq(struct netfs_io_request *rreq) {}
 /*
  * misc.c
  */
+struct folio_queue *netfs_buffer_make_space(struct netfs_io_request *rreq);
 int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
 			      bool needs_put);
 struct folio_queue *netfs_delete_buffer_head(struct netfs_io_request *wreq);
@@ -76,6 +77,7 @@ void netfs_clear_subrequests(struct netfs_io_request *rreq, bool was_async);
 void netfs_put_request(struct netfs_io_request *rreq, bool was_async,
 		       enum netfs_rreq_ref_trace what);
 struct netfs_io_subrequest *netfs_alloc_subrequest(struct netfs_io_request *rreq);
+struct folio_queue *netfs_folioq_alloc(struct netfs_io_request *rreq, gfp_t gfp);
 
 static inline void netfs_see_request(struct netfs_io_request *rreq,
 				     enum netfs_rreq_ref_trace what)
diff --git a/fs/netfs/misc.c b/fs/netfs/misc.c
index 0ad0982ce0e2..a743e8963247 100644
--- a/fs/netfs/misc.c
+++ b/fs/netfs/misc.c
@@ -9,34 +9,64 @@
 #include "internal.h"
 
 /*
- * Append a folio to the rolling queue.
+ * Make sure there's space in the rolling queue.
  */
-int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
-			      bool needs_put)
+struct folio_queue *netfs_buffer_make_space(struct netfs_io_request *rreq)
 {
-	struct folio_queue *tail = rreq->buffer_tail;
-	unsigned int slot, order = folio_order(folio);
+	struct folio_queue *tail = rreq->buffer_tail, *prev;
+	unsigned int prev_nr_slots = 0;
 
 	if (WARN_ON_ONCE(!rreq->buffer && tail) ||
 	    WARN_ON_ONCE(rreq->buffer && !tail))
-		return -EIO;
-
-	if (!tail || folioq_full(tail)) {
-		tail = kmalloc(sizeof(*tail), GFP_NOFS);
-		if (!tail)
-			return -ENOMEM;
-		netfs_stat(&netfs_n_folioq);
-		folioq_init(tail);
-		tail->prev = rreq->buffer_tail;
-		if (tail->prev)
-			tail->prev->next = tail;
-		rreq->buffer_tail = tail;
-		if (!rreq->buffer) {
-			rreq->buffer = tail;
-			iov_iter_folio_queue(&rreq->io_iter, ITER_SOURCE, tail, 0, 0, 0);
+		return ERR_PTR(-EIO);
+
+	prev = tail;
+	if (prev) {
+		if (!folioq_full(tail))
+			return tail;
+		prev_nr_slots = folioq_nr_slots(tail);
+	}
+
+	tail = netfs_folioq_alloc(rreq, GFP_NOFS);
+	if (!tail)
+		return ERR_PTR(-ENOMEM);
+	tail->prev = prev;
+	if (prev)
+		/* [!] NOTE: After we set prev->next, the consumer is entirely
+		 * at liberty to delete prev.
+		 */
+		WRITE_ONCE(prev->next, tail);
+
+	rreq->buffer_tail = tail;
+	if (!rreq->buffer) {
+		rreq->buffer = tail;
+		iov_iter_folio_queue(&rreq->io_iter, ITER_SOURCE, tail, 0, 0, 0);
+	} else {
+		/* Make sure we don't leave the master iterator pointing to a
+		 * block that might get immediately consumed.
+		 */
+		if (rreq->io_iter.folioq == prev &&
+		    rreq->io_iter.folioq_slot == prev_nr_slots) {
+			rreq->io_iter.folioq = tail;
+			rreq->io_iter.folioq_slot = 0;
 		}
-		rreq->buffer_tail_slot = 0;
 	}
+	rreq->buffer_tail_slot = 0;
+	return tail;
+}
+
+/*
+ * Append a folio to the rolling queue.
+ */
+int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
+			      bool needs_put)
+{
+	struct folio_queue *tail;
+	unsigned int slot, order = folio_order(folio);
+
+	tail = netfs_buffer_make_space(rreq);
+	if (IS_ERR(tail))
+		return PTR_ERR(tail);
 
 	rreq->io_iter.count += PAGE_SIZE << order;
 
diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
index d32964e8ca5d..dd8241bc996b 100644
--- a/fs/netfs/objects.c
+++ b/fs/netfs/objects.c
@@ -250,3 +250,15 @@ void netfs_put_subrequest(struct netfs_io_subrequest *subreq, bool was_async,
 	if (dead)
 		netfs_free_subrequest(subreq, was_async);
 }
+
+struct folio_queue *netfs_folioq_alloc(struct netfs_io_request *rreq, gfp_t gfp)
+{
+	struct folio_queue *fq;
+
+	fq = kmalloc(sizeof(*fq), gfp);
+	if (fq) {
+		netfs_stat(&netfs_n_folioq);
+		folioq_init(fq);
+	}
+	return fq;
+}
diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
index 04e66d587f77..0929d9fd4ce7 100644
--- a/fs/netfs/write_issue.c
+++ b/fs/netfs/write_issue.c
@@ -153,12 +153,22 @@ static void netfs_prepare_write(struct netfs_io_request *wreq,
 				loff_t start)
 {
 	struct netfs_io_subrequest *subreq;
+	struct iov_iter *wreq_iter = &wreq->io_iter;
+
+	/* Make sure we don't point the iterator at a used-up folio_queue
+	 * struct being used as a placeholder to prevent the queue from
+	 * collapsing.  In such a case, extend the queue.
+	 */
+	if (iov_iter_is_folioq(wreq_iter) &&
+	    wreq_iter->folioq_slot >= folioq_nr_slots(wreq_iter->folioq)) {
+		netfs_buffer_make_space(wreq);
+	}
 
 	subreq = netfs_alloc_subrequest(wreq);
 	subreq->source		= stream->source;
 	subreq->start		= start;
 	subreq->stream_nr	= stream->stream_nr;
-	subreq->io_iter		= wreq->io_iter;
+	subreq->io_iter		= *wreq_iter;
 
 	_enter("R=%x[%x]", wreq->debug_id, subreq->debug_index);
 


* Re: [linus:master] [netfs] cd0277ed0c: BUG:KASAN:slab-use-after-free_in_copy_from_iter
  2024-09-24 21:40 ` David Howells
@ 2024-09-26  2:15   ` Oliver Sang
  0 siblings, 0 replies; 3+ messages in thread
From: Oliver Sang @ 2024-09-26  2:15 UTC (permalink / raw)
  To: David Howells
  Cc: oe-lkp, lkp, linux-kernel, Christian Brauner, Jeff Layton, netfs,
	linux-fsdevel, oliver.sang

Hi, David,

On Tue, Sep 24, 2024 at 10:40:07PM +0100, David Howells wrote:
> Hi Oliver,
> 
> Can you try the attached?

yes, this patch fixed the issue we reported.
Tested-by: kernel test robot <oliver.sang@intel.com>

we found this patch does not apply to cd0277ed0c directly, so we applied it
on top of mainline commit
684a64bf32b6e Merge tag 'nfs-for-6.12-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

for this report, the failure was in generic/113
(https://download.01.org/0day-ci/archive/20240918/202409180928.f20b5a08-oliver.sang@intel.com/xfstests)

with the patch applied, it passes:

=========================================================================================
compiler/disk/fs2/fs/kconfig/rootfs/tbox_group/test/testcase:
  gcc-12/4HDD/smbv2/ext4/x86_64-rhel-8.3-func/debian-12-x86_64-20240206.cgz/lkp-skl-d05/generic-group-11/xfstests

commit:
  684a64bf32b6e ("Merge tag 'nfs-for-6.12-1' of git://git.linux-nfs.org/projects/anna/linux-nfs")
  b0b53eafc5a38 (linux-devel/fixup-684a64bf32b6e) netfs: Fix write oops in generic/346 (9p) and maybe generic/074 (cifs)

684a64bf32b6e488 b0b53eafc5a3803dcebf2899cbc
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
          6:6          -83%            :6     dmesg.BUG:KASAN:slab-use-after-free_in_copy_from_iter
           :6          100%           6:6     xfstests.generic.113.pass


since generic/074 is mentioned, we also tested it and confirmed the patch
fixes it as well. thanks

=========================================================================================
compiler/disk/fs2/fs/kconfig/rootfs/tbox_group/test/testcase:
  gcc-12/4HDD/smbv2/ext4/x86_64-rhel-8.3-func/debian-12-x86_64-20240206.cgz/lkp-skl-d05/generic-074/xfstests

commit:
  684a64bf32b6e ("Merge tag 'nfs-for-6.12-1' of git://git.linux-nfs.org/projects/anna/linux-nfs")
  b0b53eafc5a38 (linux-devel/fixup-684a64bf32b6e) netfs: Fix write oops in generic/346 (9p) and maybe generic/074 (cifs)


684a64bf32b6e488 b0b53eafc5a3803dcebf2899cbc
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
          6:6          -83%            :6     dmesg.BUG:KASAN:slab-use-after-free_in_copy_from_iter
           :6          100%           6:6     xfstests.generic.074.pass


> 
> Thanks,
> David

