* [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
@ 2023-06-12 5:42 Shin'ichiro Kawasaki
2023-06-12 7:13 ` Leon Romanovsky
2023-06-12 14:18 ` Jason Gunthorpe
0 siblings, 2 replies; 17+ messages in thread
From: Shin'ichiro Kawasaki @ 2023-06-12 5:42 UTC (permalink / raw)
To: linux-rdma, Jason Gunthorpe, Leon Romanovsky
Cc: linux-nvme, Damien Le Moal, Shin'ichiro Kawasaki
When rdma_destroy_id() and cma_iw_handler() race, struct rdma_id_private
*id_priv can be destroyed during the cma_iw_handler() call. This causes "BUG:
KASAN: slab-use-after-free" at mutex_lock() in cma_iw_handler() [1].
To prevent the destruction of id_priv, hold a reference to it by calling
cma_id_get() and cma_id_put() at the start and end of cma_iw_handler().
[1]
==================================================================
BUG: KASAN: slab-use-after-free in __mutex_lock+0x1324/0x18f0
Read of size 8 at addr ffff888197b37418 by task kworker/u8:0/9
CPU: 0 PID: 9 Comm: kworker/u8:0 Not tainted 6.3.0 #62
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
Workqueue: iw_cm_wq cm_work_handler [iw_cm]
Call Trace:
<TASK>
dump_stack_lvl+0x57/0x90
print_report+0xcf/0x660
? __mutex_lock+0x1324/0x18f0
kasan_report+0xa4/0xe0
? __mutex_lock+0x1324/0x18f0
__mutex_lock+0x1324/0x18f0
? cma_iw_handler+0xac/0x4f0 [rdma_cm]
? _raw_spin_unlock_irqrestore+0x30/0x60
? rcu_is_watching+0x11/0xb0
? _raw_spin_unlock_irqrestore+0x30/0x60
? trace_hardirqs_on+0x12/0x100
? __pfx___mutex_lock+0x10/0x10
? __percpu_counter_sum+0x147/0x1e0
? domain_dirty_limits+0x246/0x390
? wb_over_bg_thresh+0x4d5/0x610
? rcu_is_watching+0x11/0xb0
? cma_iw_handler+0xac/0x4f0 [rdma_cm]
cma_iw_handler+0xac/0x4f0 [rdma_cm]
? rcu_is_watching+0x11/0xb0
? __pfx_cma_iw_handler+0x10/0x10 [rdma_cm]
? attach_entity_load_avg+0x4e2/0x920
? _raw_spin_unlock_irqrestore+0x30/0x60
? rcu_is_watching+0x11/0xb0
cm_work_handler+0x139e/0x1c50 [iw_cm]
? __pfx_cm_work_handler+0x10/0x10 [iw_cm]
? rcu_is_watching+0x11/0xb0
? __pfx_try_to_wake_up+0x10/0x10
? __pfx_do_raw_spin_lock+0x10/0x10
? __pfx___might_resched+0x10/0x10
? _raw_spin_unlock_irq+0x24/0x50
process_one_work+0x843/0x1350
? __pfx_lock_acquire+0x10/0x10
? __pfx_process_one_work+0x10/0x10
? __pfx_do_raw_spin_lock+0x10/0x10
worker_thread+0xfc/0x1260
? __pfx_worker_thread+0x10/0x10
kthread+0x29e/0x340
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
Allocated by task 4225:
kasan_save_stack+0x2f/0x50
kasan_set_track+0x21/0x30
__kasan_kmalloc+0xa6/0xb0
__rdma_create_id+0x5b/0x5d0 [rdma_cm]
__rdma_create_kernel_id+0x12/0x40 [rdma_cm]
nvme_rdma_alloc_queue+0x26a/0x5f0 [nvme_rdma]
nvme_rdma_setup_ctrl+0xb84/0x1d90 [nvme_rdma]
nvme_rdma_create_ctrl+0x7b5/0xd20 [nvme_rdma]
nvmf_dev_write+0xddd/0x22b0 [nvme_fabrics]
vfs_write+0x211/0xd50
ksys_write+0x100/0x1e0
do_syscall_64+0x5b/0x80
entry_SYSCALL_64_after_hwframe+0x72/0xdc
Freed by task 4227:
kasan_save_stack+0x2f/0x50
kasan_set_track+0x21/0x30
kasan_save_free_info+0x2a/0x50
____kasan_slab_free+0x169/0x1c0
slab_free_freelist_hook+0xdb/0x1b0
__kmem_cache_free+0xb8/0x2e0
nvme_rdma_free_queue+0x4a/0x70 [nvme_rdma]
nvme_rdma_teardown_io_queues.part.0+0x14a/0x1e0 [nvme_rdma]
nvme_rdma_delete_ctrl+0x4f/0x100 [nvme_rdma]
nvme_do_delete_ctrl+0x14e/0x240 [nvme_core]
nvme_sysfs_delete+0xcb/0x100 [nvme_core]
kernfs_fop_write_iter+0x359/0x530
vfs_write+0x58f/0xd50
ksys_write+0x100/0x1e0
do_syscall_64+0x5b/0x80
entry_SYSCALL_64_after_hwframe+0x72/0xdc
The buggy address belongs to the object at ffff888197b37000
which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 1048 bytes inside of
freed 2048-byte region [ffff888197b37000, ffff888197b37800)
The buggy address belongs to the physical page:
page:00000000fbe33a6e refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x197b30
head:00000000fbe33a6e order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
anon flags: 0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
raw: 0017ffffc0010200 ffff888100042f00 0000000000000000 dead000000000001
raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888197b37300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888197b37380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888197b37400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff888197b37480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888197b37500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Fixes: de910bd92137 ("RDMA/cma: Simplify locking needed for serialization of callbacks")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: stable@vger.kernel.org
---
The KASAN BUG was observed with the blktests test cases nvme/030 and nvme/031,
using the SIW transport [*]. To reproduce it on my test system, the test cases
need to be repeated 30 to 50 times.
[*] https://lore.kernel.org/linux-block/rsmmxrchy6voi5qhl4irss5sprna3f5owkqtvybxglcv2pnylm@xmrnpfu3tfpe/
Changes from v1:
* Improved the commit message per comments on the list
drivers/infiniband/core/cma.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 93a1c48d0c32..c5267d9bb184 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2477,6 +2477,7 @@ static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event)
struct sockaddr *laddr = (struct sockaddr *)&iw_event->local_addr;
struct sockaddr *raddr = (struct sockaddr *)&iw_event->remote_addr;
+ cma_id_get(id_priv);
mutex_lock(&id_priv->handler_mutex);
if (READ_ONCE(id_priv->state) != RDMA_CM_CONNECT)
goto out;
@@ -2524,12 +2525,14 @@ static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event)
if (ret) {
/* Destroy the CM ID by returning a non-zero value. */
id_priv->cm_id.iw = NULL;
+ cma_id_put(id_priv);
destroy_id_handler_unlock(id_priv);
return ret;
}
out:
mutex_unlock(&id_priv->handler_mutex);
+ cma_id_put(id_priv);
return ret;
}
--
2.40.1
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-06-12 5:42 [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler Shin'ichiro Kawasaki
@ 2023-06-12 7:13 ` Leon Romanovsky
2023-06-12 14:18 ` Jason Gunthorpe
1 sibling, 0 replies; 17+ messages in thread
From: Leon Romanovsky @ 2023-06-12 7:13 UTC (permalink / raw)
To: linux-rdma, Jason Gunthorpe, Shin'ichiro Kawasaki
Cc: linux-nvme, Damien Le Moal
On Mon, 12 Jun 2023 14:42:37 +0900, Shin'ichiro Kawasaki wrote:
> When rdma_destroy_id() and cma_iw_handler() race, struct rdma_id_private
> *id_priv can be destroyed during the cma_iw_handler() call. This causes "BUG:
> KASAN: slab-use-after-free" at mutex_lock() in cma_iw_handler() [1].
> To prevent the destruction of id_priv, hold a reference to it by calling
> cma_id_get() and cma_id_put() at the start and end of cma_iw_handler().
>
> [1]
>
> [...]
Applied, thanks!
[1/1] RDMA/cma: prevent rdma id destroy during cma_iw_handler
https://git.kernel.org/rdma/rdma/c/fd06a5925e4773
Best regards,
--
Leon Romanovsky <leon@kernel.org>
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-06-12 5:42 [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler Shin'ichiro Kawasaki
2023-06-12 7:13 ` Leon Romanovsky
@ 2023-06-12 14:18 ` Jason Gunthorpe
2023-06-13 1:43 ` Shinichiro Kawasaki
1 sibling, 1 reply; 17+ messages in thread
From: Jason Gunthorpe @ 2023-06-12 14:18 UTC (permalink / raw)
To: Shin'ichiro Kawasaki
Cc: linux-rdma, Leon Romanovsky, linux-nvme, Damien Le Moal
On Mon, Jun 12, 2023 at 02:42:37PM +0900, Shin'ichiro Kawasaki wrote:
> When rdma_destroy_id() and cma_iw_handler() race, struct rdma_id_private
> *id_priv can be destroyed during the cma_iw_handler() call. This causes "BUG:
> KASAN: slab-use-after-free" at mutex_lock() in cma_iw_handler() [1].
> To prevent the destruction of id_priv, hold a reference to it by calling
> cma_id_get() and cma_id_put() at the start and end of cma_iw_handler().
>
> [1]
>
> ==================================================================
> BUG: KASAN: slab-use-after-free in __mutex_lock+0x1324/0x18f0
> Read of size 8 at addr ffff888197b37418 by task kworker/u8:0/9
>
> CPU: 0 PID: 9 Comm: kworker/u8:0 Not tainted 6.3.0 #62
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
> Workqueue: iw_cm_wq cm_work_handler [iw_cm]
> Call Trace:
> <TASK>
> dump_stack_lvl+0x57/0x90
> print_report+0xcf/0x660
> ? __mutex_lock+0x1324/0x18f0
> kasan_report+0xa4/0xe0
> ? __mutex_lock+0x1324/0x18f0
> __mutex_lock+0x1324/0x18f0
> ? cma_iw_handler+0xac/0x4f0 [rdma_cm]
> ? _raw_spin_unlock_irqrestore+0x30/0x60
> ? rcu_is_watching+0x11/0xb0
> ? _raw_spin_unlock_irqrestore+0x30/0x60
> ? trace_hardirqs_on+0x12/0x100
> ? __pfx___mutex_lock+0x10/0x10
> ? __percpu_counter_sum+0x147/0x1e0
> ? domain_dirty_limits+0x246/0x390
> ? wb_over_bg_thresh+0x4d5/0x610
> ? rcu_is_watching+0x11/0xb0
> ? cma_iw_handler+0xac/0x4f0 [rdma_cm]
> cma_iw_handler+0xac/0x4f0 [rdma_cm]
What is the full call chain here, eg with the static functions
un-inlined?
>
> drivers/infiniband/core/cma.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 93a1c48d0c32..c5267d9bb184 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -2477,6 +2477,7 @@ static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event)
> struct sockaddr *laddr = (struct sockaddr *)&iw_event->local_addr;
> struct sockaddr *raddr = (struct sockaddr *)&iw_event->remote_addr;
>
> + cma_id_get(id_priv);
> mutex_lock(&id_priv->handler_mutex);
> if (READ_ONCE(id_priv->state) != RDMA_CM_CONNECT)
> goto out;
> @@ -2524,12 +2525,14 @@ static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event)
> if (ret) {
> /* Destroy the CM ID by returning a non-zero value. */
> id_priv->cm_id.iw = NULL;
> + cma_id_put(id_priv);
> destroy_id_handler_unlock(id_priv);
> return ret;
> }
>
> out:
> mutex_unlock(&id_priv->handler_mutex);
> + cma_id_put(id_priv);
> return ret;
> }
cm_work_handler already has a ref on the iwcm_id_private
I think there is likely some much larger issue with the IW CM if the
cm_id can be destroyed while the iwcm_id is in use? It is weird that
there are two id memories for this :\
Jason
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-06-12 14:18 ` Jason Gunthorpe
@ 2023-06-13 1:43 ` Shinichiro Kawasaki
2023-06-13 6:47 ` Shinichiro Kawasaki
2023-06-13 13:30 ` Jason Gunthorpe
0 siblings, 2 replies; 17+ messages in thread
From: Shinichiro Kawasaki @ 2023-06-13 1:43 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: linux-rdma@vger.kernel.org, Leon Romanovsky,
linux-nvme@lists.infradead.org, Damien Le Moal
On Jun 12, 2023 / 11:18, Jason Gunthorpe wrote:
> On Mon, Jun 12, 2023 at 02:42:37PM +0900, Shin'ichiro Kawasaki wrote:
> > When rdma_destroy_id() and cma_iw_handler() race, struct rdma_id_private
> > *id_priv can be destroyed during the cma_iw_handler() call. This causes "BUG:
> > KASAN: slab-use-after-free" at mutex_lock() in cma_iw_handler() [1].
> > To prevent the destruction of id_priv, hold a reference to it by calling
> > cma_id_get() and cma_id_put() at the start and end of cma_iw_handler().
> >
> > [1]
> >
> > ==================================================================
> > BUG: KASAN: slab-use-after-free in __mutex_lock+0x1324/0x18f0
> > Read of size 8 at addr ffff888197b37418 by task kworker/u8:0/9
> >
> > CPU: 0 PID: 9 Comm: kworker/u8:0 Not tainted 6.3.0 #62
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
> > Workqueue: iw_cm_wq cm_work_handler [iw_cm]
> > Call Trace:
> > <TASK>
> > dump_stack_lvl+0x57/0x90
> > print_report+0xcf/0x660
> > ? __mutex_lock+0x1324/0x18f0
> > kasan_report+0xa4/0xe0
> > ? __mutex_lock+0x1324/0x18f0
> > __mutex_lock+0x1324/0x18f0
> > ? cma_iw_handler+0xac/0x4f0 [rdma_cm]
> > ? _raw_spin_unlock_irqrestore+0x30/0x60
> > ? rcu_is_watching+0x11/0xb0
> > ? _raw_spin_unlock_irqrestore+0x30/0x60
> > ? trace_hardirqs_on+0x12/0x100
> > ? __pfx___mutex_lock+0x10/0x10
> > ? __percpu_counter_sum+0x147/0x1e0
> > ? domain_dirty_limits+0x246/0x390
> > ? wb_over_bg_thresh+0x4d5/0x610
> > ? rcu_is_watching+0x11/0xb0
> > ? cma_iw_handler+0xac/0x4f0 [rdma_cm]
> > cma_iw_handler+0xac/0x4f0 [rdma_cm]
>
> What is the full call chain here, eg with the static functions
> un-inlined?
I checked the inlined function call chain from cm_work_handler to cma_iw_handler (I
recreated the symptom using kernel v6.4-rc5, so the addresses are different):
$ ./scripts/faddr2line ./drivers/infiniband/core/iw_cm.ko cm_work_handler+0xb57/0x1c50
cm_work_handler+0xb57/0x1c50:
cm_close_handler at /home/shin/Linux/linux/drivers/infiniband/core/iwcm.c:974
(inlined by) process_event at /home/shin/Linux/linux/drivers/infiniband/core/iwcm.c:997
(inlined by) cm_work_handler at /home/shin/Linux/linux/drivers/infiniband/core/iwcm.c:1036
With this, my understanding of the full call chain from the NVMe driver to
cma_iw_handler is as follows, including the task switch to cm_work_handler:
nvme_rdma_teardown_io_queue
nvme_rdma_stop_io_queues
nvme_rdma_stop_queue
__nvme_rdma_stop_queue
rdma_disconnect
iw_cm_disconnect
iwcm_modify_qp_sqd
ib_modify_qp
_ib_modify_qp
ib_security_modify_qp
siw_verbs_modify_qp
siw_qp_modify
siw_qp_cm_drop
siw_cm_upcall(IW_CM_EVENT_CLOSE)
cm_event_handler -> refcount_inc(&cm_id_priv->refcount)
queue_work
-> cm_work_handler
process_event
cm_close_handler
cm_work_handler
cma_iw_handler
> >
> > drivers/infiniband/core/cma.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> > index 93a1c48d0c32..c5267d9bb184 100644
> > --- a/drivers/infiniband/core/cma.c
> > +++ b/drivers/infiniband/core/cma.c
> > @@ -2477,6 +2477,7 @@ static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event)
> > struct sockaddr *laddr = (struct sockaddr *)&iw_event->local_addr;
> > struct sockaddr *raddr = (struct sockaddr *)&iw_event->remote_addr;
> >
> > + cma_id_get(id_priv);
> > mutex_lock(&id_priv->handler_mutex);
> > if (READ_ONCE(id_priv->state) != RDMA_CM_CONNECT)
> > goto out;
> > @@ -2524,12 +2525,14 @@ static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event)
> > if (ret) {
> > /* Destroy the CM ID by returning a non-zero value. */
> > id_priv->cm_id.iw = NULL;
> > + cma_id_put(id_priv);
> > destroy_id_handler_unlock(id_priv);
> > return ret;
> > }
> >
> > out:
> > mutex_unlock(&id_priv->handler_mutex);
> > + cma_id_put(id_priv);
> > return ret;
> > }
>
> cm_work_handler already has a ref on the iwcm_id_private
>
> I think there is likely some much larger issue with the IW CM if the
> cm_id can be destroyed while the iwcm_id is in use? It is weird that
> there are two id memories for this :\
My understanding of the call chain leading to the rdma id destroy is as follows. I guess
_destroy_id calls iw_destroy_cm_id before destroying the rdma id, but I am not sure
why it does not wait for the cm_id deref by cm_work_handler.
nvme_rdma_teardown_io_queues
nvme_rdma_stop_io_queues -> chained to cma_iw_handler
nvme_rdma_free_io_queues
nvme_rdma_free_queue
rdma_destroy_id
mutex_lock(&id_priv->handler_mutex)
destroy_id_handler_unlock
mutex_unlock(&id_priv->handler_mutex)
_destroy_id
iw_destroy_cm_id
wait_for_completion(&id_priv->comp)
kfree(id_priv)
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-06-13 1:43 ` Shinichiro Kawasaki
@ 2023-06-13 6:47 ` Shinichiro Kawasaki
2023-06-13 13:30 ` Jason Gunthorpe
1 sibling, 0 replies; 17+ messages in thread
From: Shinichiro Kawasaki @ 2023-06-13 6:47 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: linux-rdma@vger.kernel.org, Leon Romanovsky,
linux-nvme@lists.infradead.org, Damien Le Moal
On Jun 13, 2023 / 10:43, Shin'ichiro Kawasaki wrote:
[...]
> My understanding of the call chain leading to the rdma id destroy is as follows. I guess
> _destroy_id calls iw_destroy_cm_id before destroying the rdma id, but I am not sure
> why it does not wait for the cm_id deref by cm_work_handler.
I found that commit 59c68ac31e15 ("iw_cm: free cm_id resources on the last
deref") looks related to the issue. Its commit message says that it "removed the
block in iw_destroy_cm_id()" to avoid a potential deadlock. So it looks like the
cm_id ref no longer guards the rdma id destroy. I'm not sure how to fix this issue
at the IW CM level.
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-06-13 1:43 ` Shinichiro Kawasaki
2023-06-13 6:47 ` Shinichiro Kawasaki
@ 2023-06-13 13:30 ` Jason Gunthorpe
2023-06-13 18:07 ` Leon Romanovsky
1 sibling, 1 reply; 17+ messages in thread
From: Jason Gunthorpe @ 2023-06-13 13:30 UTC (permalink / raw)
To: Shinichiro Kawasaki
Cc: linux-rdma@vger.kernel.org, Leon Romanovsky,
linux-nvme@lists.infradead.org, Damien Le Moal
On Tue, Jun 13, 2023 at 01:43:43AM +0000, Shinichiro Kawasaki wrote:
> > I think there is likely some much larger issue with the IW CM if the
> > cm_id can be destroyed while the iwcm_id is in use? It is weird that
> > there are two id memories for this :\
>
> My understanding of the call chain leading to the rdma id destroy is as follows. I guess
> _destroy_id calls iw_destroy_cm_id before destroying the rdma id, but I am not sure
> why it does not wait for the cm_id deref by cm_work_handler.
>
> nvme_rdma_teardown_io_queues
> nvme_rdma_stop_io_queues -> chained to cma_iw_handler
> nvme_rdma_free_io_queues
> nvme_rdma_free_queue
> rdma_destroy_id
> mutex_lock(&id_priv->handler_mutex)
> destroy_id_handler_unlock
> mutex_unlock(&id_priv->handler_mutex)
> _destroy_id
> iw_destroy_cm_id
> wait_for_completion(&id_priv->comp)
> kfree(id_priv)
Once a destroy_cm_id() has returned that layer is no longer
permitted to run or be running in its handlers. The iw cm is broken if
it allows this, and that is the cause of the bug.
Taking more refs within handlers that are already not allowed to be
running is just racy.
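To make that concrete, here is a minimal userspace model of the pattern the
patch adds (purely illustrative: the structure and function names below are
invented for the sketch and are not the kernel code). If the destroy path can
drop the last reference before the handler thread runs, the handler's own
get() already touches freed memory, so the extra get()/put() only narrows the
window instead of closing it:

#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

struct obj {					/* stands in for rdma_id_private */
	atomic_int refcount;
	pthread_mutex_t handler_mutex;
};

static struct obj *global_obj;

static void obj_put(struct obj *o)
{
	/* free on the last deref, like cma_id_put() leading to kfree() */
	if (atomic_fetch_sub(&o->refcount, 1) == 1) {
		pthread_mutex_destroy(&o->handler_mutex);
		free(o);
	}
}

/* models cma_iw_handler() with the proposed cma_id_get()/cma_id_put() */
static void *handler(void *arg)
{
	struct obj *o = global_obj;

	/*
	 * If destroyer() has already dropped the last reference, 'o' points
	 * at freed memory and this get() is itself a use-after-free.
	 */
	atomic_fetch_add(&o->refcount, 1);		/* "cma_id_get()" */
	pthread_mutex_lock(&o->handler_mutex);
	/* ... event handling ... */
	pthread_mutex_unlock(&o->handler_mutex);
	obj_put(o);					/* "cma_id_put()" */
	return NULL;
}

/* models rdma_destroy_id() dropping the last reference */
static void *destroyer(void *arg)
{
	obj_put(global_obj);
	return NULL;
}

int main(void)
{
	pthread_t h, d;

	global_obj = calloc(1, sizeof(*global_obj));
	atomic_init(&global_obj->refcount, 1);
	pthread_mutex_init(&global_obj->handler_mutex, NULL);

	pthread_create(&d, NULL, destroyer, NULL);
	pthread_create(&h, NULL, handler, NULL);
	pthread_join(d, NULL);
	pthread_join(h, NULL);
	return 0;
}

Run under a sanitizer, any schedule where handler() executes after destroyer()
has freed the object shows up as a use-after-free, which is why the fix has to
make the destroy path wait for the handlers rather than add references inside
the handler.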
Jason
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-06-13 13:30 ` Jason Gunthorpe
@ 2023-06-13 18:07 ` Leon Romanovsky
2023-06-14 7:53 ` Shinichiro Kawasaki
0 siblings, 1 reply; 17+ messages in thread
From: Leon Romanovsky @ 2023-06-13 18:07 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Shinichiro Kawasaki, linux-rdma@vger.kernel.org,
linux-nvme@lists.infradead.org, Damien Le Moal
On Tue, Jun 13, 2023 at 10:30:37AM -0300, Jason Gunthorpe wrote:
> On Tue, Jun 13, 2023 at 01:43:43AM +0000, Shinichiro Kawasaki wrote:
> > > I think there is likely some much larger issue with the IW CM if the
> > > cm_id can be destroyed while the iwcm_id is in use? It is weird that
> > > there are two id memories for this :\
> >
> > My understanding of the call chain leading to the rdma id destroy is as follows. I guess
> > _destroy_id calls iw_destroy_cm_id before destroying the rdma id, but I am not sure
> > why it does not wait for the cm_id deref by cm_work_handler.
> >
> > nvme_rdma_teardown_io_queues
> > nvme_rdma_stop_io_queues -> chained to cma_iw_handler
> > nvme_rdma_free_io_queues
> > nvme_rdma_free_queue
> > rdma_destroy_id
> > mutex_lock(&id_priv->handler_mutex)
> > destroy_id_handler_unlock
> > mutex_unlock(&id_priv->handler_mutex)
> > _destroy_id
> > iw_destroy_cm_id
> > wait_for_completion(&id_priv->comp)
> > kfree(id_priv)
>
> Once a destroy_cm_id() has returned that layer is no longer
> permitted to run or be running in its handlers. The iw cm is broken if
> it allows this, and that is the cause of the bug.
>
> Taking more refs within handlers that are already not allowed to be
> running is just racy.
So we need to revert that patch from our rdma-rc.
Thanks
>
> Jason
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-06-13 18:07 ` Leon Romanovsky
@ 2023-06-14 7:53 ` Shinichiro Kawasaki
2023-06-14 17:36 ` Jason Gunthorpe
0 siblings, 1 reply; 17+ messages in thread
From: Shinichiro Kawasaki @ 2023-06-14 7:53 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Jason Gunthorpe, linux-rdma@vger.kernel.org,
linux-nvme@lists.infradead.org, Damien Le Moal
On Jun 13, 2023 / 21:07, Leon Romanovsky wrote:
> On Tue, Jun 13, 2023 at 10:30:37AM -0300, Jason Gunthorpe wrote:
> > On Tue, Jun 13, 2023 at 01:43:43AM +0000, Shinichiro Kawasaki wrote:
> > > > I think there is likely some much larger issue with the IW CM if the
> > > > cm_id can be destroyed while the iwcm_id is in use? It is weird that
> > > > there are two id memories for this :\
> > >
> > > My understanding of the call chain leading to the rdma id destroy is as follows. I guess
> > > _destroy_id calls iw_destroy_cm_id before destroying the rdma id, but I am not sure
> > > why it does not wait for the cm_id deref by cm_work_handler.
> > >
> > > nvme_rdma_teardown_io_queues
> > > nvme_rdma_stop_io_queues -> chained to cma_iw_handler
> > > nvme_rdma_free_io_queues
> > > nvme_rdma_free_queue
> > > rdma_destroy_id
> > > mutex_lock(&id_priv->handler_mutex)
> > > destroy_id_handler_unlock
> > > mutex_unlock(&id_priv->handler_mutex)
> > > _destroy_id
> > > iw_destroy_cm_id
> > > wait_for_completion(&id_priv->comp)
> > > kfree(id_priv)
> >
> > Once a destroy_cm_id() has returned that layer is no longer
> > permitted to run or be running in its handlers. The iw cm is broken if
> > it allows this, and that is the cause of the bug.
> >
> > Taking more refs within handlers that are already not allowed to be
> > running is just racy.
>
> So we need to revert that patch from our rdma-rc.
I see, thanks for the clarifications.
As another fix approach, I reverted the commit 59c68ac31e15 ("iw_cm: free cm_id
resources on the last deref") so that iw_destroy_cm_id() waits for deref of
cm_id. With that revert, the KASAN slab-use-after-free disappeared. Is this
the right fix approach?
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-06-14 7:53 ` Shinichiro Kawasaki
@ 2023-06-14 17:36 ` Jason Gunthorpe
2023-06-15 0:45 ` Shinichiro Kawasaki
0 siblings, 1 reply; 17+ messages in thread
From: Jason Gunthorpe @ 2023-06-14 17:36 UTC (permalink / raw)
To: Shinichiro Kawasaki
Cc: Leon Romanovsky, linux-rdma@vger.kernel.org,
linux-nvme@lists.infradead.org, Damien Le Moal
On Wed, Jun 14, 2023 at 07:53:49AM +0000, Shinichiro Kawasaki wrote:
> On Jun 13, 2023 / 21:07, Leon Romanovsky wrote:
> > On Tue, Jun 13, 2023 at 10:30:37AM -0300, Jason Gunthorpe wrote:
> > > On Tue, Jun 13, 2023 at 01:43:43AM +0000, Shinichiro Kawasaki wrote:
> > > > > I think there is likely some much larger issue with the IW CM if the
> > > > > cm_id can be destroyed while the iwcm_id is in use? It is weird that
> > > > > there are two id memories for this :\
> > > >
> > > > My understanding of the call chain leading to the rdma id destroy is as follows. I guess
> > > > _destroy_id calls iw_destroy_cm_id before destroying the rdma id, but I am not sure
> > > > why it does not wait for the cm_id deref by cm_work_handler.
> > > >
> > > > nvme_rdma_teardown_io_queues
> > > > nvme_rdma_stop_io_queues -> chained to cma_iw_handler
> > > > nvme_rdma_free_io_queues
> > > > nvme_rdma_free_queue
> > > > rdma_destroy_id
> > > > mutex_lock(&id_priv->handler_mutex)
> > > > destroy_id_handler_unlock
> > > > mutex_unlock(&id_priv->handler_mutex)
> > > > _destroy_id
> > > > iw_destroy_cm_id
> > > > wait_for_completion(&id_priv->comp)
> > > > kfree(id_priv)
> > >
> > > Once a destroy_cm_id() has returned that layer is no longer
> > > permitted to run or be running in its handlers. The iw cm is broken if
> > > it allows this, and that is the cause of the bug.
> > >
> > > Taking more refs within handlers that are already not allowed to be
> > > running is just racy.
> >
> > So we need to revert that patch from our rdma-rc.
>
> I see, thanks for the clarifications.
>
> As another fix approach, I reverted the commit 59c68ac31e15 ("iw_cm: free cm_id
> resources on the last deref") so that iw_destroy_cm_id() waits for deref of
> cm_id. With that revert, the KASAN slab-use-after-free disappeared. Is this
> the right fix approach?
That seems like it would bring back the bug it was fixing, though it
isn't totally clear what that is.
There is something wrong with the iwarp cm if it is destroying IDs in
handlers; the IB cm avoids doing that to avoid the deadlock, and the same
solution will be needed for iwarp too.
Also, the code this patch removed is quite ugly; if we are going back
to waiting, it should be written in a more modern way without the test
bit and so on.
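For illustration only, here is a rough userspace sketch of that "wait for the
last deref before freeing" shape (a condition variable stands in for struct
completion and a plain counter for refcount_t; the names are invented for the
sketch and are not the actual iw_cm code):

#include <pthread.h>
#include <stdlib.h>

struct cm_obj {				/* invented name, not the iw_cm struct */
	int refcount;			/* kernel analogue: refcount_t */
	pthread_cond_t released;	/* kernel analogue: struct completion */
	pthread_mutex_t lock;
};

/* drop one reference; wake the destroyer when the last one is gone */
static void cm_obj_put(struct cm_obj *o)
{
	pthread_mutex_lock(&o->lock);
	if (--o->refcount == 0)
		pthread_cond_signal(&o->released);	/* ~ complete() */
	pthread_mutex_unlock(&o->lock);
}

/* models a queued event handler that owns one reference */
static void *handler(void *arg)
{
	struct cm_obj *o = arg;

	/* ... deliver the event; 'o' is pinned by our reference ... */
	cm_obj_put(o);
	return NULL;
}

/* destroy path: block until every handler reference is gone, then free */
static void cm_obj_destroy(struct cm_obj *o)
{
	pthread_mutex_lock(&o->lock);
	o->refcount--;				/* drop the creator's reference */
	while (o->refcount > 0)			/* ~ wait_for_completion() */
		pthread_cond_wait(&o->released, &o->lock);
	pthread_mutex_unlock(&o->lock);
	pthread_cond_destroy(&o->released);
	pthread_mutex_destroy(&o->lock);
	free(o);
}

int main(void)
{
	pthread_t t;
	struct cm_obj *o = calloc(1, sizeof(*o));

	o->refcount = 2;			/* creator + one queued handler */
	pthread_mutex_init(&o->lock, NULL);
	pthread_cond_init(&o->released, NULL);

	pthread_create(&t, NULL, handler, o);
	cm_obj_destroy(o);		/* returns only after the handler's put */
	pthread_join(t, NULL);
	return 0;
}

The point of the shape is that the destroy call returns only after every
outstanding handler reference has been dropped, so no handler can still be
running when the memory is freed; in the kernel the same idea would use
refcount_dec_and_test(), complete() and wait_for_completion() instead of the
pthread primitives.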
Jason
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-06-14 17:36 ` Jason Gunthorpe
@ 2023-06-15 0:45 ` Shinichiro Kawasaki
2023-09-05 0:39 ` Shinichiro Kawasaki
0 siblings, 1 reply; 17+ messages in thread
From: Shinichiro Kawasaki @ 2023-06-15 0:45 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Leon Romanovsky, linux-rdma@vger.kernel.org,
linux-nvme@lists.infradead.org, Damien Le Moal
On Jun 14, 2023 / 14:36, Jason Gunthorpe wrote:
> On Wed, Jun 14, 2023 at 07:53:49AM +0000, Shinichiro Kawasaki wrote:
[...]
> > As another fix approach, I reverted the commit 59c68ac31e15 ("iw_cm: free cm_id
> > resources on the last deref") so that iw_destroy_cm_id() waits for deref of
> > cm_id. With that revert, the KASAN slab-use-after-free disappeared. Is this
> > the right fix approach?
>
> That seems like it would bring back the bug it was fixing, though it
> isn't totally clear what that is.
>
> There is something wrong with the iwarp cm if it is destroying IDs in
> handlers; the IB cm avoids doing that to avoid the deadlock, and the same
> solution will be needed for iwarp too.
>
> Also, the code this patch removed is quite ugly; if we are going back
> to waiting, it should be written in a more modern way without the test
> bit and so on.
I see, thanks for the clarifications. I took a look at ib_destroy_cm_id() and
found it works differently from iw_destroy_cm_id(). I guess a certain amount of
change will be required for a proper fix in the iwarp code.
I stop my fix attempt here, since it looks beyond my bandwidth now. If anyone
provides fix patches, I'm willing to test them.
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-06-15 0:45 ` Shinichiro Kawasaki
@ 2023-09-05 0:39 ` Shinichiro Kawasaki
2023-09-06 8:54 ` Daniel Wagner
0 siblings, 1 reply; 17+ messages in thread
From: Shinichiro Kawasaki @ 2023-09-05 0:39 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Leon Romanovsky, linux-rdma@vger.kernel.org,
linux-nvme@lists.infradead.org, Damien Le Moal
On Jun 15, 2023 / 09:45, Shin'ichiro Kawasaki wrote:
[...]
> I stop my fix attempt here, since it looks beyond my bandwidth now. If anyone
> provides fix patches, I'm willing to test them.
FYI, I tried to recreate this failure on my test system this week, and found it
is no longer recreated. In case your TODO list includes a work item for this
issue, please put a lower priority on it.
Two months ago, the failure had been recreated by repeating the blktests test
cases nvme/030 and nvme/031 with the SIW transport around 30 times. Now I do not see
it when I repeat them 100 times. I suspected some kernel changes between v6.4
and v6.5 would have fixed the issue, but even when I use both kernel versions,
the failure is no longer seen. I reverted nvme-cli and libnvme to older versions,
but still do not see the failure. I just guess some change on my test system
affected the symptom (Fedora 38 on QEMU).
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-09-05 0:39 ` Shinichiro Kawasaki
@ 2023-09-06 8:54 ` Daniel Wagner
2023-09-06 11:31 ` Shinichiro Kawasaki
0 siblings, 1 reply; 17+ messages in thread
From: Daniel Wagner @ 2023-09-06 8:54 UTC (permalink / raw)
To: Shinichiro Kawasaki
Cc: Jason Gunthorpe, Leon Romanovsky, linux-rdma@vger.kernel.org,
linux-nvme@lists.infradead.org, Damien Le Moal
On Tue, Sep 05, 2023 at 12:39:38AM +0000, Shinichiro Kawasaki wrote:
> Two months ago, the failure had been recreated by repeating the blktests test
> cases nvme/030 and nvme/031 with the SIW transport around 30 times. Now I do not see
> it when I repeat them 100 times. I suspected some kernel changes between v6.4
> and v6.5 would have fixed the issue, but even when I use both kernel versions,
> the failure is no longer seen. I reverted nvme-cli and libnvme to older versions,
> but still do not see the failure. I just guess some change on my test system
> affected the symptom (Fedora 38 on QEMU).
Maybe the refactoring in blktests could also be the reason the tests are
not working? Just an idea.
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-09-06 8:54 ` Daniel Wagner
@ 2023-09-06 11:31 ` Shinichiro Kawasaki
2023-09-07 7:47 ` Yi Zhang
0 siblings, 1 reply; 17+ messages in thread
From: Shinichiro Kawasaki @ 2023-09-06 11:31 UTC (permalink / raw)
To: Daniel Wagner
Cc: Jason Gunthorpe, Leon Romanovsky, linux-rdma@vger.kernel.org,
linux-nvme@lists.infradead.org, Damien Le Moal
On Sep 06, 2023 / 10:54, Daniel Wagner wrote:
> On Tue, Sep 05, 2023 at 12:39:38AM +0000, Shinichiro Kawasaki wrote:
> > Two months ago, the failure had been recreated by repeating the blktests test
> > cases nvme/030 and nvme/031 with the SIW transport around 30 times. Now I do not see
> > it when I repeat them 100 times. I suspected some kernel changes between v6.4
> > and v6.5 would have fixed the issue, but even when I use both kernel versions,
> > the failure is no longer seen. I reverted nvme-cli and libnvme to older versions,
> > but still do not see the failure. I just guess some change on my test system
> > affected the symptom (Fedora 38 on QEMU).
>
> Maybe the refactoring in blktests could also be the reason the tests are
> not working? Just an idea.
That is an idea. I rolled back blktests to the git revision of two months ago,
but still the failure is not recreated. Hmm.
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-09-06 11:31 ` Shinichiro Kawasaki
@ 2023-09-07 7:47 ` Yi Zhang
2023-09-07 16:17 ` Jason Gunthorpe
2023-09-08 7:18 ` Hannes Reinecke
0 siblings, 2 replies; 17+ messages in thread
From: Yi Zhang @ 2023-09-07 7:47 UTC (permalink / raw)
To: Shinichiro Kawasaki
Cc: Daniel Wagner, Jason Gunthorpe, Leon Romanovsky,
linux-rdma@vger.kernel.org, linux-nvme@lists.infradead.org,
Damien Le Moal
Seems I reproduced this issue[3] when I run [1] on the latest
linux-block/for-next with kernel config[2].
[1]nvme_trtype=rdma ./check nvme/
[2] https://pastebin.com/u9fh7dbX
[3]
[ 387.880016] run blktests nvme/003 at 2023-09-07 03:31:15
[ 388.173979] (null): rxe_set_mtu: Set mtu to 1024
[ 388.310460] infiniband eno1_rxe: set active
[ 388.310495] infiniband eno1_rxe: added eno1
[ 388.708380] loop0: detected capacity change from 0 to 2097152
[ 388.767800] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 388.839397] nvmet_rdma: enabling port 0 (10.16.221.26:4420)
[ 389.053251] nvmet: creating discovery controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 389.061415] nvme nvme2: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 10.16.221.26:4420
[ 402.202057] nvme nvme2: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[ 402.251605] ======================================================
[ 402.257782] WARNING: possible circular locking dependency detected
[ 402.263963] 6.5.0+ #1 Not tainted
[ 402.267280] ------------------------------------------------------
[ 402.273459] kworker/5:0/48 is trying to acquire lock:
[ 402.278512] ffff88831a9b6450 (&id_priv->handler_mutex){+.+.}-{3:3}, at: rdma_destroy_id+0x17/0x20 [rdma_cm]
[ 402.288279] but task is already holding lock:
[ 402.294112] ffffc90000687da0 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_one_work+0x83a/0x1660
[ 402.304729] which lock already depends on the new lock.
[ 402.312903] the existing dependency chain (in reverse order) is:
[ 402.320380] -> #3 ((work_completion)(&queue->release_work)){+.+.}-{0:0}:
[ 402.328561] __lock_acquire+0xbd6/0x1c00
[ 402.333015] lock_acquire+0x1da/0x5e0
[ 402.337203] process_one_work+0x884/0x1660
[ 402.341822] worker_thread+0x5be/0xef0
[ 402.346093] kthread+0x2f4/0x3d0
[ 402.349845] ret_from_fork+0x30/0x70
[ 402.353947] ret_from_fork_asm+0x1b/0x30
[ 402.358391] -> #2 ((wq_completion)nvmet-wq){+.+.}-{0:0}:
[ 402.365186] __lock_acquire+0xbd6/0x1c00
[ 402.369631] lock_acquire+0x1da/0x5e0
[ 402.373818] __flush_workqueue+0x101/0x1230
[ 402.378524] nvmet_rdma_queue_connect+0x106/0x4b0 [nvmet_rdma]
[ 402.384885] cma_cm_event_handler+0xf2/0x430 [rdma_cm]
[ 402.390562] cma_ib_req_handler+0x93a/0x1cd0 [rdma_cm]
[ 402.396238] cm_process_work+0x48/0x3b0 [ib_cm]
[ 402.401307] cm_req_handler+0xf9b/0x2610 [ib_cm]
[ 402.406465] cm_work_handler+0x80a/0x1380 [ib_cm]
[ 402.411708] process_one_work+0x955/0x1660
[ 402.416326] worker_thread+0x5be/0xef0
[ 402.420600] kthread+0x2f4/0x3d0
[ 402.424351] ret_from_fork+0x30/0x70
[ 402.428450] ret_from_fork_asm+0x1b/0x30
[ 402.432897] -> #1 (&id_priv->handler_mutex/1){+.+.}-{3:3}:
[ 402.439874] __lock_acquire+0xbd6/0x1c00
[ 402.444320] lock_acquire+0x1da/0x5e0
[ 402.448504] __mutex_lock+0x154/0x1490
[ 402.452776] cma_ib_req_handler+0x4f8/0x1cd0 [rdma_cm]
[ 402.458455] cm_process_work+0x48/0x3b0 [ib_cm]
[ 402.463525] cm_req_handler+0xf9b/0x2610 [ib_cm]
[ 402.468681] cm_work_handler+0x80a/0x1380 [ib_cm]
[ 402.473925] process_one_work+0x955/0x1660
[ 402.478543] worker_thread+0x5be/0xef0
[ 402.482817] kthread+0x2f4/0x3d0
[ 402.486570] ret_from_fork+0x30/0x70
[ 402.490668] ret_from_fork_asm+0x1b/0x30
[ 402.495113] -> #0 (&id_priv->handler_mutex){+.+.}-{3:3}:
[ 402.501908] check_prev_add+0x1bd/0x23a0
[ 402.506352] validate_chain+0xb02/0xf30
[ 402.510713] __lock_acquire+0xbd6/0x1c00
[ 402.515159] lock_acquire+0x1da/0x5e0
[ 402.519344] __mutex_lock+0x154/0x1490
[ 402.523617] rdma_destroy_id+0x17/0x20 [rdma_cm]
[ 402.528774] nvmet_rdma_free_queue+0x7b/0x380 [nvmet_rdma]
[ 402.534790] nvmet_rdma_release_queue_work+0x3e/0x90 [nvmet_rdma]
[ 402.541408] process_one_work+0x955/0x1660
[ 402.546029] worker_thread+0x5be/0xef0
[ 402.550303] kthread+0x2f4/0x3d0
[ 402.554054] ret_from_fork+0x30/0x70
[ 402.558153] ret_from_fork_asm+0x1b/0x30
[ 402.562599] other info that might help us debug this:
[ 402.570599] Chain exists of: &id_priv->handler_mutex --> (wq_completion)nvmet-wq --> (work_completion)(&queue->release_work)
[ 402.584854] Possible unsafe locking scenario:
[ 402.590775]        CPU0                    CPU1
[ 402.595307]        ----                    ----
[ 402.599839]   lock((work_completion)(&queue->release_work));
[ 402.605508]                                lock((wq_completion)nvmet-wq);
[ 402.612300]                                lock((work_completion)(&queue->release_work));
[ 402.620475]   lock(&id_priv->handler_mutex);
[ 402.624755] *** DEADLOCK ***
[ 402.630675] 2 locks held by kworker/5:0/48:
[ 402.634860] #0: ffff8883070c9d48 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0x80c/0x1660
[ 402.644524] #1: ffffc90000687da0 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_one_work+0x83a/0x1660
[ 402.655572] stack backtrace:
[ 402.659933] CPU: 5 PID: 48 Comm: kworker/5:0 Not tainted 6.5.0+ #1
[ 402.666110] Hardware name: Dell Inc. PowerEdge R6515/07PXPY, BIOS 2.11.4 03/22/2023
[ 402.673763] Workqueue: nvmet-wq nvmet_rdma_release_queue_work [nvmet_rdma]
[ 402.680646] Call Trace:
[ 402.683099] <TASK>
[ 402.685205] dump_stack_lvl+0x60/0xb0
[ 402.688879] check_noncircular+0x2f9/0x3e0
[ 402.692982] ? __pfx_check_noncircular+0x10/0x10
[ 402.697609] ? srso_return_thunk+0x5/0x10
[ 402.701635] ? srso_return_thunk+0x5/0x10
[ 402.705649] ? alloc_chain_hlocks+0x1de/0x520
[ 402.710010] check_prev_add+0x1bd/0x23a0
[ 402.713944] ? mark_lock.part.0+0xca/0xab0
[ 402.718054] validate_chain+0xb02/0xf30
[ 402.721909] ? __pfx_validate_chain+0x10/0x10
[ 402.726274] ? _raw_spin_unlock_irq+0x24/0x50
[ 402.730635] ? lockdep_hardirqs_on+0x79/0x100
[ 402.735000] ? srso_return_thunk+0x5/0x10
[ 402.739012] ? _raw_spin_unlock_irq+0x2f/0x50
[ 402.743380] __lock_acquire+0xbd6/0x1c00
[ 402.747321] lock_acquire+0x1da/0x5e0
[ 402.750991] ? rdma_destroy_id+0x17/0x20 [rdma_cm]
[ 402.755805] ? __pfx_lock_acquire+0x10/0x10
[ 402.759997] ? do_raw_spin_trylock+0xb5/0x180
[ 402.764363] ? srso_return_thunk+0x5/0x10
[ 402.768381] ? srso_return_thunk+0x5/0x10
[ 402.772402] __mutex_lock+0x154/0x1490
[ 402.776159] ? rdma_destroy_id+0x17/0x20 [rdma_cm]
[ 402.780973] ? rdma_destroy_id+0x17/0x20 [rdma_cm]
[ 402.785782] ? __pfx___mutex_lock+0x10/0x10
[ 402.789981] ? srso_return_thunk+0x5/0x10
[ 402.793992] ? rcu_is_watching+0x11/0xb0
[ 402.797929] ? rdma_destroy_id+0x17/0x20 [rdma_cm]
[ 402.802738] rdma_destroy_id+0x17/0x20 [rdma_cm]
[ 402.807377] nvmet_rdma_free_queue+0x7b/0x380 [nvmet_rdma]
[ 402.812875] nvmet_rdma_release_queue_work+0x3e/0x90 [nvmet_rdma]
[ 402.818972] process_one_work+0x955/0x1660
[ 402.823078] ? __lock_acquired+0x207/0x7b0
[ 402.827177] ? __pfx_process_one_work+0x10/0x10
[ 402.831711] ? __pfx___lock_acquired+0x10/0x10
[ 402.836164] ? worker_thread+0x15a/0xef0
[ 402.840095] worker_thread+0x5be/0xef0
[ 402.843862] ? __pfx_worker_thread+0x10/0x10
[ 402.848134] kthread+0x2f4/0x3d0
[ 402.851366] ? __pfx_kthread+0x10/0x10
[ 402.855121] ret_from_fork+0x30/0x70
[ 402.858704] ? __pfx_kthread+0x10/0x10
[ 402.862459] ret_from_fork_asm+0x1b/0x30
[ 402.866398] </TASK>
[ 403.865323] rdma_rxe: unloaded
[ 405.242319] rdma_rxe: loaded
[ 405.272491] loop: module loaded
[ 405.352101] run blktests nvme/004 at 2023-09-07 03:31:32
[ 405.504367] (null): rxe_set_mtu: Set mtu to 1024
[ 405.587145] infiniband eno1_rxe: set active
[ 405.595981] infiniband eno1_rxe: added eno1
[ 405.714913] loop0: detected capacity change from 0 to 2097152
[ 405.746602] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 405.784747] nvmet_rdma: enabling port 0 (10.16.221.26:4420)
[ 405.876322] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 405.894750] nvme nvme2: creating 16 I/O queues.
[ 406.181403] nvme nvme2: mapped 16/0/0 default/read/poll queues.
[ 406.216348] nvme nvme2: new ctrl: NQN "blktests-subsystem-1", addr 10.16.221.26:4420
[ 406.552633] nvme nvme2: Removing ctrl: NQN "blktests-subsystem-1"
[ 407.879250] rdma_rxe: unloaded
[ 409.146426] rdma_rxe: loaded
[ 409.176282] loop: module loaded
[ 409.460211] rdma_rxe: unloaded
[ 410.938596] rdma_rxe: loaded
[ 410.966908] loop: module loaded
On Wed, Sep 6, 2023 at 7:33 PM Shinichiro Kawasaki
<shinichiro.kawasaki@wdc.com> wrote:
>
> On Sep 06, 2023 / 10:54, Daniel Wagner wrote:
> > On Tue, Sep 05, 2023 at 12:39:38AM +0000, Shinichiro Kawasaki wrote:
> > > Two months ago, the failure had been recreated by repeating the blktests test
> > > cases nvme/030 and nvme/031 with the SIW transport around 30 times. Now I do not see
> > > it when I repeat them 100 times. I suspected some kernel changes between v6.4
> > > and v6.5 would have fixed the issue, but even when I use both kernel versions,
> > > the failure is no longer seen. I reverted nvme-cli and libnvme to older versions,
> > > but still do not see the failure. I just guess some change on my test system
> > > affected the symptom (Fedora 38 on QEMU).
> >
> > Maybe the refactoring in blktests could also be the reason the tests are
> > not working? Just an idea.
>
> That is an idea. I rolled back blktests to the git revision of two months ago,
> but still the failure is not recreated. Hmm.
>
--
Best Regards,
Yi Zhang
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-09-07 7:47 ` Yi Zhang
@ 2023-09-07 16:17 ` Jason Gunthorpe
2023-09-08 7:18 ` Hannes Reinecke
1 sibling, 0 replies; 17+ messages in thread
From: Jason Gunthorpe @ 2023-09-07 16:17 UTC (permalink / raw)
To: Yi Zhang
Cc: Shinichiro Kawasaki, Daniel Wagner, Leon Romanovsky,
linux-rdma@vger.kernel.org, linux-nvme@lists.infradead.org,
Damien Le Moal
On Thu, Sep 07, 2023 at 03:47:03PM +0800, Yi Zhang wrote:
> Seems I reproduced this issue[3] when I run [1] on the latest
> linux-block/for-next with kernel config[2].
https://lore.kernel.org/linux-rdma/13441b9b-cc13-f0e0-bd46-f14983dadd49@grimberg.me/
So far nobody has come with a solution
Jason
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-09-07 7:47 ` Yi Zhang
2023-09-07 16:17 ` Jason Gunthorpe
@ 2023-09-08 7:18 ` Hannes Reinecke
2023-09-08 8:33 ` Yi Zhang
1 sibling, 1 reply; 17+ messages in thread
From: Hannes Reinecke @ 2023-09-08 7:18 UTC (permalink / raw)
To: Yi Zhang, Shinichiro Kawasaki
Cc: Daniel Wagner, Jason Gunthorpe, Leon Romanovsky,
linux-rdma@vger.kernel.org, linux-nvme@lists.infradead.org,
Damien Le Moal
On 9/7/23 09:47, Yi Zhang wrote:
> Seems I reproduced this issue[3] when I run [1] on the latest
> linux-block/for-next with kernel config[2].
>
> [1]nvme_trtype=rdma ./check nvme/
>
> [2] https://pastebin.com/u9fh7dbX
>
But isn't this the same issue as the circular locking discussed earlier?
Can you check if the patch
'nvmet-tcp: avoid circular locking dependency on install_queue()'
fixes this issue, too?
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman
* Re: [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
2023-09-08 7:18 ` Hannes Reinecke
@ 2023-09-08 8:33 ` Yi Zhang
0 siblings, 0 replies; 17+ messages in thread
From: Yi Zhang @ 2023-09-08 8:33 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Shinichiro Kawasaki, Daniel Wagner, Jason Gunthorpe,
Leon Romanovsky, linux-rdma@vger.kernel.org,
linux-nvme@lists.infradead.org, Damien Le Moal, Guoqing Jiang
On Fri, Sep 8, 2023 at 3:18 PM Hannes Reinecke <hare@suse.de> wrote:
>
> On 9/7/23 09:47, Yi Zhang wrote:
> > Seems I reproduced this issue[3] when I run [1] on the latest
> > linux-block/for-next with kernel config[2].
> >
> > [1]nvme_trtype=rdma ./check nvme/
> >
> > [2] https://pastebin.com/u9fh7dbX
> >
> But isn't this the same issue as the circular locking discussed earlier?
Sorry, maybe I mixed up the issues here. There are three issues:
1. "BUG: KASAN: slab-use-after-free" at mutex_lock() in
cma_iw_handler(), which Shin'ichiro mentioned in this patch; it is
triggered by nvme/030 and nvme/031 over rdma with the siw driver
2. nvme/tcp circular locking on install_queue, which could be fixed by:
https://lore.kernel.org/linux-nvme/20230810132006.129365-1-hare@suse.de/
https://lore.kernel.org/linux-nvme/20230726142939.10062-1-guoqing.jiang@linux.dev/
3. circular locking at rdma_destroy_id+0x17/0x20 [rdma_cm], which was
already reported last year but still has no solution:
https://lore.kernel.org/linux-rdma/CAHj4cs93BfTRgWF6PbuZcfq6AARHgYC2g=RQ-7Jgcf1-6h+2SQ@mail.gmail.com/
>
> Can you check if the patch
> 'nvmet-tcp: avoid circular locking dependency on install_queue()'
> fixes this issue, too?
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke Kernel Storage Architect
> hare@suse.de +49 911 74053 688
> SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
> HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
> Myers, Andrew McDonald, Martje Boudien Moerman
>
--
Best Regards,
Yi Zhang
end of thread, other threads:[~2023-09-08 8:34 UTC | newest]
Thread overview: 17+ messages
2023-06-12 5:42 [PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler Shin'ichiro Kawasaki
2023-06-12 7:13 ` Leon Romanovsky
2023-06-12 14:18 ` Jason Gunthorpe
2023-06-13 1:43 ` Shinichiro Kawasaki
2023-06-13 6:47 ` Shinichiro Kawasaki
2023-06-13 13:30 ` Jason Gunthorpe
2023-06-13 18:07 ` Leon Romanovsky
2023-06-14 7:53 ` Shinichiro Kawasaki
2023-06-14 17:36 ` Jason Gunthorpe
2023-06-15 0:45 ` Shinichiro Kawasaki
2023-09-05 0:39 ` Shinichiro Kawasaki
2023-09-06 8:54 ` Daniel Wagner
2023-09-06 11:31 ` Shinichiro Kawasaki
2023-09-07 7:47 ` Yi Zhang
2023-09-07 16:17 ` Jason Gunthorpe
2023-09-08 7:18 ` Hannes Reinecke
2023-09-08 8:33 ` Yi Zhang