All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zhu Yanjun <yanjun.zhu@linux.dev>
To: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>
Cc: Bart Van Assche <bvanassche@acm.org>
Subject: Re: [bug report] KASAN slab-use-after-free at blktests srp/002 with siw driver
Date: Tue, 4 Jun 2024 11:26:34 +0200	[thread overview]
Message-ID: <6bcbe337-c2fe-46ee-8228-a3cff6852c28@linux.dev> (raw)
In-Reply-To: <5prftateosuvgmosryes4lakptbxccwtx7yajoicjhudt7gyvp@w3f6nqdvurir>


On 04.06.24 09:25, Shinichiro Kawasaki wrote:
> As I noted in another thread [1], KASAN slab-use-after-free is observed when
> I repeat the blktests test case srp/002 with the siw driver [2]. The kernel
> version was v6.10-rc2. The failure is recreated in stable manner when the test
> case is repeated around 30 times. It was not observed with the rxe driver.
>
> I think this failure is same as that I reported in Jun/2023 [3]. The Call Trace
> reported is quite similar. Also, I confirmed that the trial fix patch that I
> created in Jun/2023 avoided the KASAN failure at srp/002.

"the trial fix patch that I created in Jun/2023" that you mentioned is 
the commit in the link?

https://lore.kernel.org/linux-rdma/20230612054237.1855292-1-shinichiro.kawasaki@wdc.com/

Thanks,

Zhu Yanjun

>
> In Jun/2023, the KASAN failure was observed with the test cases nvme/030 and
> nvme/031. But the symptom disappeared in Sep/2023 [4]. I guess the failure has
> got observable again with srp/002.
>
> As for the root cause, it was advised that "There is something wrong with the
> iwarp cm if it is destroying IDs in handlers" [5]. Actions for fix will be
> appreciated. I'm willing to test fix patches.
>
> [1] https://lore.kernel.org/linux-block/n2adhqzr6x5fss6jff7pxhubkkalvxeyesmg7jre4uomfcdudb@dwn3wgkqhmj7/
>
> [2]
>
> Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 000000006d1c31fe with status 5
> Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 00000000916ce050 with status 5
> Jun 04 09:23:11 testnode2 kernel: ==================================================================
> Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 000000001770ef1b with status 5
> Jun 04 09:23:11 testnode2 kernel: BUG: KASAN: slab-use-after-free in __mutex_lock+0x1110/0x13c0
> Jun 04 09:23:11 testnode2 kernel: Read of size 8 at addr ffff888131a3e418 by task kworker/u16:6/1345
> Jun 04 09:23:11 testnode2 kernel:
> Jun 04 09:23:11 testnode2 kernel: CPU: 1 PID: 1345 Comm: kworker/u16:6 Not tainted 6.10.0-rc2+ #288
> Jun 04 09:23:11 testnode2 kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014
> Jun 04 09:23:11 testnode2 kernel: Workqueue: iw_cm_wq cm_work_handler [iw_cm]
> Jun 04 09:23:11 testnode2 kernel: Call Trace:
> Jun 04 09:23:11 testnode2 kernel:  <TASK>
> Jun 04 09:23:11 testnode2 kernel:  dump_stack_lvl+0x6a/0x90
> Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 00000000f727e5c2 with status 5
> Jun 04 09:23:11 testnode2 kernel:  ? __mutex_lock+0x1110/0x13c0
> Jun 04 09:23:11 testnode2 kernel:  print_report+0x174/0x505
> Jun 04 09:23:11 testnode2 kernel:  ? __mutex_lock+0x1110/0x13c0
> Jun 04 09:23:11 testnode2 kernel:  ? __virt_addr_valid+0x1b9/0x400
> Jun 04 09:23:11 testnode2 kernel:  ? __mutex_lock+0x1110/0x13c0
> Jun 04 09:23:11 testnode2 kernel:  kasan_report+0xa7/0x180
> Jun 04 09:23:11 testnode2 kernel:  ? __mutex_lock+0x1110/0x13c0
> Jun 04 09:23:11 testnode2 kernel:  __mutex_lock+0x1110/0x13c0
> Jun 04 09:23:11 testnode2 kernel:  ? cma_iw_handler+0xac/0x500 [rdma_cm]
> Jun 04 09:23:11 testnode2 kernel:  ? __lock_acquire+0x139d/0x5d60
> Jun 04 09:23:11 testnode2 kernel:  ? __pfx___mutex_lock+0x10/0x10
> Jun 04 09:23:11 testnode2 kernel:  ? mark_lock+0xf5/0x1580
> Jun 04 09:23:11 testnode2 kernel:  ? __pfx_mark_lock+0x10/0x10
> Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 000000009bc71497 with status 5
> Jun 04 09:23:11 testnode2 kernel:  ? cma_iw_handler+0xac/0x500 [rdma_cm]
> Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 0000000041c0fa4b with status 5
> Jun 04 09:23:11 testnode2 kernel:  cma_iw_handler+0xac/0x500 [rdma_cm]
> Jun 04 09:23:11 testnode2 kernel:  ? __pfx_cma_iw_handler+0x10/0x10 [rdma_cm]
> Jun 04 09:23:11 testnode2 kernel:  ? mark_held_locks+0x94/0xe0
> Jun 04 09:23:11 testnode2 kernel:  ? _raw_spin_unlock_irqrestore+0x4c/0x60
> Jun 04 09:23:11 testnode2 kernel:  cm_work_handler+0xb54/0x1c50 [iw_cm]
> Jun 04 09:23:11 testnode2 kernel:  ? __pfx_cm_work_handler+0x10/0x10 [iw_cm]
> Jun 04 09:23:11 testnode2 kernel:  ? __pfx_lock_release+0x10/0x10
> Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 00000000f48094cb with status 5
> Jun 04 09:23:11 testnode2 kernel:  process_one_work+0x865/0x1410
> Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 000000001c3faa8a with status 5
> Jun 04 09:23:11 testnode2 kernel:  ? __pfx_lock_acquire+0x10/0x10
> Jun 04 09:23:11 testnode2 kernel:  ? __pfx_process_one_work+0x10/0x10
> Jun 04 09:23:11 testnode2 kernel:  ? assign_work+0x16c/0x240
> Jun 04 09:23:11 testnode2 kernel:  ? lock_is_held_type+0xd5/0x130
> Jun 04 09:23:11 testnode2 kernel:  worker_thread+0x5e2/0x1010
> Jun 04 09:23:11 testnode2 kernel:  ? __pfx_worker_thread+0x10/0x10
> Jun 04 09:23:11 testnode2 kernel:  kthread+0x2d1/0x3a0
> Jun 04 09:23:11 testnode2 kernel:  ? _raw_spin_unlock_irq+0x24/0x50
> Jun 04 09:23:11 testnode2 kernel:  ? __pfx_kthread+0x10/0x10
> Jun 04 09:23:11 testnode2 kernel:  ret_from_fork+0x30/0x70
> Jun 04 09:23:11 testnode2 kernel:  ? __pfx_kthread+0x10/0x10
> Jun 04 09:23:11 testnode2 kernel:  ret_from_fork_asm+0x1a/0x30
> Jun 04 09:23:11 testnode2 kernel:  </TASK>
> Jun 04 09:23:11 testnode2 kernel:
> Jun 04 09:23:11 testnode2 kernel: Allocated by task 75327:
> Jun 04 09:23:11 testnode2 kernel:  kasan_save_stack+0x2c/0x50
> Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 000000001bd9ea09 with status 5
> Jun 04 09:23:11 testnode2 kernel:  kasan_save_track+0x10/0x30
> Jun 04 09:23:11 testnode2 kernel:  __kasan_kmalloc+0xa6/0xb0
> Jun 04 09:23:11 testnode2 kernel:  __rdma_create_id+0x5b/0x5d0 [rdma_cm]
> Jun 04 09:23:11 testnode2 kernel:  __rdma_create_kernel_id+0x12/0x40 [rdma_cm]
> Jun 04 09:23:11 testnode2 kernel:  srp_new_rdma_cm_id+0x7c/0x200 [ib_srp]
> Jun 04 09:23:11 testnode2 kernel:  add_target_store+0x135e/0x29f0 [ib_srp]
> Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 000000005afc8065 with status 5
> Jun 04 09:23:11 testnode2 kernel:  kernfs_fop_write_iter+0x3a4/0x5a0
> Jun 04 09:23:11 testnode2 kernel:  vfs_write+0x5e3/0xe70
> Jun 04 09:23:11 testnode2 kernel:  ksys_write+0xf7/0x1d0
> Jun 04 09:23:11 testnode2 kernel:  do_syscall_64+0x93/0x180
> Jun 04 09:23:11 testnode2 kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> Jun 04 09:23:11 testnode2 kernel:
> Jun 04 09:23:11 testnode2 kernel: Freed by task 66344:
> Jun 04 09:23:11 testnode2 kernel:  kasan_save_stack+0x2c/0x50
> Jun 04 09:23:11 testnode2 kernel:  kasan_save_track+0x10/0x30
> Jun 04 09:23:11 testnode2 kernel:  kasan_save_free_info+0x37/0x60
> Jun 04 09:23:11 testnode2 kernel:  poison_slab_object+0x109/0x180
> Jun 04 09:23:11 testnode2 kernel:  __kasan_slab_free+0x2e/0x50
> Jun 04 09:23:11 testnode2 kernel:  kfree+0x11a/0x390
> Jun 04 09:23:11 testnode2 kernel:  srp_free_ch_ib+0x895/0xc80 [ib_srp]
> Jun 04 09:23:11 testnode2 kernel:  srp_remove_work+0x309/0x6c0 [ib_srp]
> Jun 04 09:23:11 testnode2 kernel:  process_one_work+0x865/0x1410
> Jun 04 09:23:11 testnode2 kernel:  worker_thread+0x5e2/0x1010
> Jun 04 09:23:11 testnode2 kernel:  kthread+0x2d1/0x3a0
> Jun 04 09:23:11 testnode2 kernel:  ret_from_fork+0x30/0x70
> Jun 04 09:23:11 testnode2 kernel:  ret_from_fork_asm+0x1a/0x30
> Jun 04 09:23:11 testnode2 kernel:
> Jun 04 09:23:11 testnode2 kernel: The buggy address belongs to the object at ffff888131a3e000
>                                     which belongs to the cache kmalloc-2k of size 2048
> Jun 04 09:23:11 testnode2 kernel: The buggy address is located 1048 bytes inside of
>                                     freed 2048-byte region [ffff888131a3e000, ffff888131a3e800)
> Jun 04 09:23:11 testnode2 kernel:
> Jun 04 09:23:11 testnode2 kernel: The buggy address belongs to the physical page:
> Jun 04 09:23:11 testnode2 kernel: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888131a38000 pfn:0x131a38
> Jun 04 09:23:11 testnode2 kernel: head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> Jun 04 09:23:11 testnode2 kernel: flags: 0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff)
> Jun 04 09:23:11 testnode2 kernel: page_type: 0xffffefff(slab)
> Jun 04 09:23:11 testnode2 kernel: raw: 0017ffffc0000240 ffff888100042f00 ffffea0004c89610 ffffea0004a3c010
> Jun 04 09:23:11 testnode2 kernel: raw: ffff888131a38000 0000000000080006 00000001ffffefff 0000000000000000
> Jun 04 09:23:11 testnode2 kernel: head: 0017ffffc0000240 ffff888100042f00 ffffea0004c89610 ffffea0004a3c010
> Jun 04 09:23:11 testnode2 kernel: head: ffff888131a38000 0000000000080006 00000001ffffefff 0000000000000000
> Jun 04 09:23:11 testnode2 kernel: head: 0017ffffc0000003 ffffea0004c68e01 ffffffffffffffff 0000000000000000
> Jun 04 09:23:11 testnode2 kernel: head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
> Jun 04 09:23:11 testnode2 kernel: page dumped because: kasan: bad access detected
> Jun 04 09:23:11 testnode2 kernel:
> Jun 04 09:23:11 testnode2 kernel: Memory state around the buggy address:
> Jun 04 09:23:11 testnode2 kernel:  ffff888131a3e300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> Jun 04 09:23:11 testnode2 kernel:  ffff888131a3e380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> Jun 04 09:23:11 testnode2 kernel: >ffff888131a3e400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> Jun 04 09:23:11 testnode2 kernel:                             ^
> Jun 04 09:23:11 testnode2 kernel:  ffff888131a3e480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> Jun 04 09:23:11 testnode2 kernel:  ffff888131a3e500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> Jun 04 09:23:11 testnode2 kernel: ==================================================================
> Jun 04 09:23:11 testnode2 kernel: Disabling lock debugging due to kernel taint
> Jun 04 09:23:11 testnode2 kernel: device-mapper: multipath: 253:2: Failing path 8:80.
> Jun 04 09:23:11 testnode2 kernel: device-mapper: uevent: dm_send_uevents: skipping sending uevent for lost device
> ...
>
> [3] https://lore.kernel.org/linux-rdma/20230612054237.1855292-1-shinichiro.kawasaki@wdc.com/
> [4] https://lore.kernel.org/linux-rdma/g2lh3wh6e6yossw2ktqmxx2rf63m36mumqmx4qbtzvxuygsr6h@gpgftgfigllv/
> [5] https://lore.kernel.org/linux-rdma/ZIn6ul5jPuxC+uIG@ziepe.ca/

-- 
Best Regards,
Yanjun.Zhu


  reply	other threads:[~2024-06-04  9:26 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-04  7:25 [bug report] KASAN slab-use-after-free at blktests srp/002 with siw driver Shinichiro Kawasaki
2024-06-04  9:26 ` Zhu Yanjun [this message]
2024-06-04 20:15   ` Bart Van Assche
2024-06-04 20:22     ` Jason Gunthorpe
2024-06-04 20:25       ` Bart Van Assche
2024-06-05  7:36     ` Shinichiro Kawasaki
2024-06-05 10:42     ` Zhu Yanjun
2024-06-05 13:07       ` Bart Van Assche
2024-06-05  7:39   ` Shinichiro Kawasaki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6bcbe337-c2fe-46ee-8228-a3cff6852c28@linux.dev \
    --to=yanjun.zhu@linux.dev \
    --cc=bvanassche@acm.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=shinichiro.kawasaki@wdc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.