[PATCH rdma-next] RDMA/core: Skip initialized but not leaked GID entries

public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH rdma-next] RDMA/core: Skip initialized but not leaked GID entries
@ 2024-09-02 13:42 Leon Romanovsky
  2024-09-02 22:46 ` Zhu Yanjun
  2024-09-04 14:31 ` Jason Gunthorpe
  0 siblings, 2 replies; 7+ messages in thread
From: Leon Romanovsky @ 2024-09-02 13:42 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, syzbot+b8b7a6774bf40cf8296b

From: Leon Romanovsky <leonro@nvidia.com>

Failure in driver initialization can lead to a situation where the GID
entries are set but not used yet. In this case, the kref will be equal to 1,
which will trigger a false positive leak detection.

For example, these messages are printed during the driver initialization
and followed by release_gid_table() call:

 infiniband syz1: ib_query_port failed (-19)
 infiniband syz1: Couldn't set up InfiniBand P_Key/GID cache
------------[ cut here ]------------
 GID entry ref leak for dev syz1 index 0 ref=1
 WARNING: CPU: 0 PID: 19837 at drivers/infiniband/core/cache.c:806 gid_table_release_one+0x387/0x4b0
 Modules linked in:
 CPU: 0 UID: 0 PID: 19837 Comm: syz.1.3934 Not tainted 6.11.0-rc5-syzkaller-00079-g928f79a188aa #0
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
 RIP: 0010:gid_table_release_one+0x387/0x4b0
 Code: 78 07 00 00 48 85 f6 74 2a 48 89 74 24 38
e8 b0 0a 76 f9 48 8b 74 24 38 44 89 f9 89 da 48 c7 c7 c0 69 51 8c e8 5a
c3 38 f9 90 <0f> 0b 90 90 e9 6f fe ff ff e8 8b 0a 76 f9 49 8d bc 24 28
07 00 00
 RSP: 0018:ffffc900042b7080 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc9002811e000
 RDX: 0000000000040000 RSI: ffffffff814dd406 RDI: 0000000000000001
 RBP: ffff88807ebaaf00 R08: 0000000000000001 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffff888051860000
 R13: dffffc0000000000 R14: ffffed100fd755fb R15: 0000000000000001
 FS:  0000000000000000(0000) GS:ffff88802c000000(0063) knlGS:00000000f56c6b40
 CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
 CR2: 000000002effcff8 CR3: 0000000060c5e000 CR4: 0000000000350ef0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:0000000000000400
 Call Trace:
  <TASK>
  ? show_regs+0x8c/0xa0
  ? __warn+0xe5/0x3c0
  ? gid_table_release_one+0x387/0x4b0
  ? report_bug+0x3c0/0x580
  ? handle_bug+0x3d/0x70
  ? exc_invalid_op+0x17/0x50
  ? asm_exc_invalid_op+0x1a/0x20
  ? __warn_printk+0x1a6/0x350
  ? gid_table_release_one+0x387/0x4b0
  ib_device_release+0xef/0x1e0
  ? __pfx_ib_device_release+0x10/0x10
  device_release+0xa1/0x240
  kobject_put+0x1e4/0x5a0
  put_device+0x1f/0x30
  rxe_net_add+0xe0/0x110
  rxe_newlink+0x70/0x190
  nldev_newlink+0x373/0x5e0
  ? __pfx_nldev_newlink+0x10/0x10
  ? aa_get_newest_label+0x376/0x680
  ? __pfx_lock_acquire+0x10/0x10
  ? __pfx_aa_get_newest_label+0x10/0x10
  ? __pfx_rwsem_read_trylock+0x10/0x10
  ? __pfx___might_resched+0x10/0x10
  ? apparmor_capable+0x114/0x1d0
  ? ns_capable+0xd7/0x110
  ? __pfx_nldev_newlink+0x10/0x10
  rdma_nl_rcv_msg+0x388/0x6e0
  ? __pfx_rdma_nl_rcv_msg+0x10/0x10
  ? __pfx___lock_acquire+0x10/0x10
  ? find_held_lock+0x2d/0x110
  rdma_nl_rcv_skb.constprop.0.isra.0+0x2e6/0x450
  ? __pfx_rdma_nl_rcv_skb.constprop.0.isra.0+0x10/0x10
  ? netlink_deliver_tap+0x1ae/0xcf0
  netlink_unicast+0x53c/0x7f0
  ? __pfx_netlink_unicast+0x10/0x10
  ? __phys_addr_symbol+0x30/0x80
  ? __check_object_size+0x497/0x720
  netlink_sendmsg+0x8b8/0xd70
  ? __pfx_netlink_sendmsg+0x10/0x10
  ? bpf_lsm_socket_sendmsg+0x9/0x10
  ____sys_sendmsg+0x9b4/0xb50
  ? __pfx_____sys_sendmsg+0x10/0x10
  ? get_compat_msghdr+0x11b/0x170
  ? __pfx___lock_acquire+0x10/0x10
  ? try_to_wake_up+0xc08/0x13e0
  ___sys_sendmsg+0x135/0x1e0
  ? __pfx____sys_sendmsg+0x10/0x10
  ? __fget_light+0x173/0x210
  __sys_sendmsg+0x117/0x1f0
  ? __pfx___sys_sendmsg+0x10/0x10
  ? __ia32_sys_futex_time32+0x1da/0x460
  __do_fast_syscall_32+0x73/0x120
  do_fast_syscall_32+0x32/0x80
  entry_SYSENTER_compat_after_hwframe+0x84/0x8e
 RIP: 0023:0xf7f20579
 Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01
10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5
0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00
00 00 00
 RSP: 002b:00000000f56c656c EFLAGS: 00000296 ORIG_RAX: 0000000000000172
 RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00000000200003c0
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
SYZFAIL: failed to recv rpc fd=3 want=4 recv=0 n=0 (errno 9: Bad file descriptor)
 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
 Kernel panic - not syncing: kernel: panic_on_warn set ...

In order to fix it, don't print kern

Fixes: a92fbeac7e94 ("RDMA/cache: Release GID table even if leak is detected")
Reported-by: syzbot+b8b7a6774bf40cf8296b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/000000000000d70eed06211ac86b@google.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/core/cache.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index b7c078b7f7cf..c6aec2e04d4c 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -800,13 +800,15 @@ static void release_gid_table(struct ib_device *device,
 		return;
 
 	for (i = 0; i < table->sz; i++) {
+		int gid_kref;
+
 		if (is_gid_entry_free(table->data_vec[i]))
 			continue;
 
-		WARN_ONCE(true,
+		gid_kref = kref_read(&table->data_vec[i]->kref);
+		WARN_ONCE(gid_kref > 1,
 			  "GID entry ref leak for dev %s index %d ref=%u\n",
-			  dev_name(&device->dev), i,
-			  kref_read(&table->data_vec[i]->kref));
+			  dev_name(&device->dev), i, gid_kref);
 	}
 
 	mutex_destroy(&table->lock);
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH rdma-next] RDMA/core: Skip initialized but not leaked GID entries
  2024-09-02 13:42 [PATCH rdma-next] RDMA/core: Skip initialized but not leaked GID entries Leon Romanovsky
@ 2024-09-02 22:46 ` Zhu Yanjun
  2024-09-03  7:26   ` Leon Romanovsky
  2024-09-04  8:33   ` Leon Romanovsky
  2024-09-04 14:31 ` Jason Gunthorpe
  1 sibling, 2 replies; 7+ messages in thread
From: Zhu Yanjun @ 2024-09-02 22:46 UTC (permalink / raw)
  To: Leon Romanovsky, Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, syzbot+b8b7a6774bf40cf8296b

在 2024/9/2 21:42, Leon Romanovsky 写道:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> Failure in driver initialization can lead to a situation where the GID
> entries are set but not used yet. In this case, the kref will be equal to 1,
> which will trigger a false positive leak detection.

I have also delved into this problem. And I agree with you. This is a 
false positive leak detection.

Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Zhu Yanjun

> 
> For example, these messages are printed during the driver initialization
> and followed by release_gid_table() call:
> 
>   infiniband syz1: ib_query_port failed (-19)
>   infiniband syz1: Couldn't set up InfiniBand P_Key/GID cache
> ------------[ cut here ]------------
>   GID entry ref leak for dev syz1 index 0 ref=1
>   WARNING: CPU: 0 PID: 19837 at drivers/infiniband/core/cache.c:806 gid_table_release_one+0x387/0x4b0
>   Modules linked in:
>   CPU: 0 UID: 0 PID: 19837 Comm: syz.1.3934 Not tainted 6.11.0-rc5-syzkaller-00079-g928f79a188aa #0
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
>   RIP: 0010:gid_table_release_one+0x387/0x4b0
>   Code: 78 07 00 00 48 85 f6 74 2a 48 89 74 24 38
> e8 b0 0a 76 f9 48 8b 74 24 38 44 89 f9 89 da 48 c7 c7 c0 69 51 8c e8 5a
> c3 38 f9 90 <0f> 0b 90 90 e9 6f fe ff ff e8 8b 0a 76 f9 49 8d bc 24 28
> 07 00 00
>   RSP: 0018:ffffc900042b7080 EFLAGS: 00010286
>   RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc9002811e000
>   RDX: 0000000000040000 RSI: ffffffff814dd406 RDI: 0000000000000001
>   RBP: ffff88807ebaaf00 R08: 0000000000000001 R09: 0000000000000000
>   R10: 0000000000000001 R11: 0000000000000000 R12: ffff888051860000
>   R13: dffffc0000000000 R14: ffffed100fd755fb R15: 0000000000000001
>   FS:  0000000000000000(0000) GS:ffff88802c000000(0063) knlGS:00000000f56c6b40
>   CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
>   CR2: 000000002effcff8 CR3: 0000000060c5e000 CR4: 0000000000350ef0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:0000000000000400
>   Call Trace:
>    <TASK>
>    ? show_regs+0x8c/0xa0
>    ? __warn+0xe5/0x3c0
>    ? gid_table_release_one+0x387/0x4b0
>    ? report_bug+0x3c0/0x580
>    ? handle_bug+0x3d/0x70
>    ? exc_invalid_op+0x17/0x50
>    ? asm_exc_invalid_op+0x1a/0x20
>    ? __warn_printk+0x1a6/0x350
>    ? gid_table_release_one+0x387/0x4b0
>    ib_device_release+0xef/0x1e0
>    ? __pfx_ib_device_release+0x10/0x10
>    device_release+0xa1/0x240
>    kobject_put+0x1e4/0x5a0
>    put_device+0x1f/0x30
>    rxe_net_add+0xe0/0x110
>    rxe_newlink+0x70/0x190
>    nldev_newlink+0x373/0x5e0
>    ? __pfx_nldev_newlink+0x10/0x10
>    ? aa_get_newest_label+0x376/0x680
>    ? __pfx_lock_acquire+0x10/0x10
>    ? __pfx_aa_get_newest_label+0x10/0x10
>    ? __pfx_rwsem_read_trylock+0x10/0x10
>    ? __pfx___might_resched+0x10/0x10
>    ? apparmor_capable+0x114/0x1d0
>    ? ns_capable+0xd7/0x110
>    ? __pfx_nldev_newlink+0x10/0x10
>    rdma_nl_rcv_msg+0x388/0x6e0
>    ? __pfx_rdma_nl_rcv_msg+0x10/0x10
>    ? __pfx___lock_acquire+0x10/0x10
>    ? find_held_lock+0x2d/0x110
>    rdma_nl_rcv_skb.constprop.0.isra.0+0x2e6/0x450
>    ? __pfx_rdma_nl_rcv_skb.constprop.0.isra.0+0x10/0x10
>    ? netlink_deliver_tap+0x1ae/0xcf0
>    netlink_unicast+0x53c/0x7f0
>    ? __pfx_netlink_unicast+0x10/0x10
>    ? __phys_addr_symbol+0x30/0x80
>    ? __check_object_size+0x497/0x720
>    netlink_sendmsg+0x8b8/0xd70
>    ? __pfx_netlink_sendmsg+0x10/0x10
>    ? bpf_lsm_socket_sendmsg+0x9/0x10
>    ____sys_sendmsg+0x9b4/0xb50
>    ? __pfx_____sys_sendmsg+0x10/0x10
>    ? get_compat_msghdr+0x11b/0x170
>    ? __pfx___lock_acquire+0x10/0x10
>    ? try_to_wake_up+0xc08/0x13e0
>    ___sys_sendmsg+0x135/0x1e0
>    ? __pfx____sys_sendmsg+0x10/0x10
>    ? __fget_light+0x173/0x210
>    __sys_sendmsg+0x117/0x1f0
>    ? __pfx___sys_sendmsg+0x10/0x10
>    ? __ia32_sys_futex_time32+0x1da/0x460
>    __do_fast_syscall_32+0x73/0x120
>    do_fast_syscall_32+0x32/0x80
>    entry_SYSENTER_compat_after_hwframe+0x84/0x8e
>   RIP: 0023:0xf7f20579
>   Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01
> 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5
> 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00
> 00 00 00
>   RSP: 002b:00000000f56c656c EFLAGS: 00000296 ORIG_RAX: 0000000000000172
>   RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00000000200003c0
>   RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
>   RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
>   R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
> SYZFAIL: failed to recv rpc fd=3 want=4 recv=0 n=0 (errno 9: Bad file descriptor)
>   R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
>   </TASK>
>   Kernel panic - not syncing: kernel: panic_on_warn set ...
> 
> In order to fix it, don't print kern
> 
> Fixes: a92fbeac7e94 ("RDMA/cache: Release GID table even if leak is detected")
> Reported-by: syzbot+b8b7a6774bf40cf8296b@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/000000000000d70eed06211ac86b@google.com
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>   drivers/infiniband/core/cache.c | 8 +++++---
>   1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> index b7c078b7f7cf..c6aec2e04d4c 100644
> --- a/drivers/infiniband/core/cache.c
> +++ b/drivers/infiniband/core/cache.c
> @@ -800,13 +800,15 @@ static void release_gid_table(struct ib_device *device,
>   		return;
>   
>   	for (i = 0; i < table->sz; i++) {
> +		int gid_kref;
> +
>   		if (is_gid_entry_free(table->data_vec[i]))
>   			continue;
>   
> -		WARN_ONCE(true,
> +		gid_kref = kref_read(&table->data_vec[i]->kref);
> +		WARN_ONCE(gid_kref > 1,
>   			  "GID entry ref leak for dev %s index %d ref=%u\n",
> -			  dev_name(&device->dev), i,
> -			  kref_read(&table->data_vec[i]->kref));
> +			  dev_name(&device->dev), i, gid_kref);
>   	}
>   
>   	mutex_destroy(&table->lock);


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH rdma-next] RDMA/core: Skip initialized but not leaked GID entries
  2024-09-02 22:46 ` Zhu Yanjun
@ 2024-09-03  7:26   ` Leon Romanovsky
  2024-09-04  8:33   ` Leon Romanovsky
  1 sibling, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2024-09-03  7:26 UTC (permalink / raw)
  To: Zhu Yanjun; +Cc: Jason Gunthorpe, linux-rdma, syzbot+b8b7a6774bf40cf8296b

On Tue, Sep 03, 2024 at 06:46:24AM +0800, Zhu Yanjun wrote:
> 在 2024/9/2 21:42, Leon Romanovsky 写道:
> > From: Leon Romanovsky <leonro@nvidia.com>
> > 
> > Failure in driver initialization can lead to a situation where the GID
> > entries are set but not used yet. In this case, the kref will be equal to 1,
> > which will trigger a false positive leak detection.
> 
> I have also delved into this problem. And I agree with you. This is a false
> positive leak detection.
> 
> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Thanks for the review.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH rdma-next] RDMA/core: Skip initialized but not leaked GID entries
  2024-09-02 22:46 ` Zhu Yanjun
  2024-09-03  7:26   ` Leon Romanovsky
@ 2024-09-04  8:33   ` Leon Romanovsky
  1 sibling, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2024-09-04  8:33 UTC (permalink / raw)
  To: Zhu Yanjun; +Cc: Jason Gunthorpe, linux-rdma, syzbot+b8b7a6774bf40cf8296b

On Tue, Sep 03, 2024 at 06:46:24AM +0800, Zhu Yanjun wrote:
> 在 2024/9/2 21:42, Leon Romanovsky 写道:
> > From: Leon Romanovsky <leonro@nvidia.com>
> > 
> > Failure in driver initialization can lead to a situation where the GID
> > entries are set but not used yet. In this case, the kref will be equal to 1,
> > which will trigger a false positive leak detection.
> 
> I have also delved into this problem. And I agree with you. This is a false
> positive leak detection.

I retested after applying commit 1403c8b14765 ("IB/core: Fix ib_cache_setup_one error flow cleanup")
and it looks like this patch is not needed anymore. I will drop it.

Thanks

> 
> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> 
> Zhu Yanjun
> 
> > 
> > For example, these messages are printed during the driver initialization
> > and followed by release_gid_table() call:
> > 
> >   infiniband syz1: ib_query_port failed (-19)
> >   infiniband syz1: Couldn't set up InfiniBand P_Key/GID cache
> > ------------[ cut here ]------------
> >   GID entry ref leak for dev syz1 index 0 ref=1
> >   WARNING: CPU: 0 PID: 19837 at drivers/infiniband/core/cache.c:806 gid_table_release_one+0x387/0x4b0
> >   Modules linked in:
> >   CPU: 0 UID: 0 PID: 19837 Comm: syz.1.3934 Not tainted 6.11.0-rc5-syzkaller-00079-g928f79a188aa #0
> >   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> >   RIP: 0010:gid_table_release_one+0x387/0x4b0
> >   Code: 78 07 00 00 48 85 f6 74 2a 48 89 74 24 38
> > e8 b0 0a 76 f9 48 8b 74 24 38 44 89 f9 89 da 48 c7 c7 c0 69 51 8c e8 5a
> > c3 38 f9 90 <0f> 0b 90 90 e9 6f fe ff ff e8 8b 0a 76 f9 49 8d bc 24 28
> > 07 00 00
> >   RSP: 0018:ffffc900042b7080 EFLAGS: 00010286
> >   RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc9002811e000
> >   RDX: 0000000000040000 RSI: ffffffff814dd406 RDI: 0000000000000001
> >   RBP: ffff88807ebaaf00 R08: 0000000000000001 R09: 0000000000000000
> >   R10: 0000000000000001 R11: 0000000000000000 R12: ffff888051860000
> >   R13: dffffc0000000000 R14: ffffed100fd755fb R15: 0000000000000001
> >   FS:  0000000000000000(0000) GS:ffff88802c000000(0063) knlGS:00000000f56c6b40
> >   CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
> >   CR2: 000000002effcff8 CR3: 0000000060c5e000 CR4: 0000000000350ef0
> >   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:0000000000000400
> >   Call Trace:
> >    <TASK>
> >    ? show_regs+0x8c/0xa0
> >    ? __warn+0xe5/0x3c0
> >    ? gid_table_release_one+0x387/0x4b0
> >    ? report_bug+0x3c0/0x580
> >    ? handle_bug+0x3d/0x70
> >    ? exc_invalid_op+0x17/0x50
> >    ? asm_exc_invalid_op+0x1a/0x20
> >    ? __warn_printk+0x1a6/0x350
> >    ? gid_table_release_one+0x387/0x4b0
> >    ib_device_release+0xef/0x1e0
> >    ? __pfx_ib_device_release+0x10/0x10
> >    device_release+0xa1/0x240
> >    kobject_put+0x1e4/0x5a0
> >    put_device+0x1f/0x30
> >    rxe_net_add+0xe0/0x110
> >    rxe_newlink+0x70/0x190
> >    nldev_newlink+0x373/0x5e0
> >    ? __pfx_nldev_newlink+0x10/0x10
> >    ? aa_get_newest_label+0x376/0x680
> >    ? __pfx_lock_acquire+0x10/0x10
> >    ? __pfx_aa_get_newest_label+0x10/0x10
> >    ? __pfx_rwsem_read_trylock+0x10/0x10
> >    ? __pfx___might_resched+0x10/0x10
> >    ? apparmor_capable+0x114/0x1d0
> >    ? ns_capable+0xd7/0x110
> >    ? __pfx_nldev_newlink+0x10/0x10
> >    rdma_nl_rcv_msg+0x388/0x6e0
> >    ? __pfx_rdma_nl_rcv_msg+0x10/0x10
> >    ? __pfx___lock_acquire+0x10/0x10
> >    ? find_held_lock+0x2d/0x110
> >    rdma_nl_rcv_skb.constprop.0.isra.0+0x2e6/0x450
> >    ? __pfx_rdma_nl_rcv_skb.constprop.0.isra.0+0x10/0x10
> >    ? netlink_deliver_tap+0x1ae/0xcf0
> >    netlink_unicast+0x53c/0x7f0
> >    ? __pfx_netlink_unicast+0x10/0x10
> >    ? __phys_addr_symbol+0x30/0x80
> >    ? __check_object_size+0x497/0x720
> >    netlink_sendmsg+0x8b8/0xd70
> >    ? __pfx_netlink_sendmsg+0x10/0x10
> >    ? bpf_lsm_socket_sendmsg+0x9/0x10
> >    ____sys_sendmsg+0x9b4/0xb50
> >    ? __pfx_____sys_sendmsg+0x10/0x10
> >    ? get_compat_msghdr+0x11b/0x170
> >    ? __pfx___lock_acquire+0x10/0x10
> >    ? try_to_wake_up+0xc08/0x13e0
> >    ___sys_sendmsg+0x135/0x1e0
> >    ? __pfx____sys_sendmsg+0x10/0x10
> >    ? __fget_light+0x173/0x210
> >    __sys_sendmsg+0x117/0x1f0
> >    ? __pfx___sys_sendmsg+0x10/0x10
> >    ? __ia32_sys_futex_time32+0x1da/0x460
> >    __do_fast_syscall_32+0x73/0x120
> >    do_fast_syscall_32+0x32/0x80
> >    entry_SYSENTER_compat_after_hwframe+0x84/0x8e
> >   RIP: 0023:0xf7f20579
> >   Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01
> > 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5
> > 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00
> > 00 00 00
> >   RSP: 002b:00000000f56c656c EFLAGS: 00000296 ORIG_RAX: 0000000000000172
> >   RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00000000200003c0
> >   RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> >   RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> >   R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
> > SYZFAIL: failed to recv rpc fd=3 want=4 recv=0 n=0 (errno 9: Bad file descriptor)
> >   R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> >   </TASK>
> >   Kernel panic - not syncing: kernel: panic_on_warn set ...
> > 
> > In order to fix it, don't print kern
> > 
> > Fixes: a92fbeac7e94 ("RDMA/cache: Release GID table even if leak is detected")
> > Reported-by: syzbot+b8b7a6774bf40cf8296b@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/all/000000000000d70eed06211ac86b@google.com
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> >   drivers/infiniband/core/cache.c | 8 +++++---
> >   1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> > index b7c078b7f7cf..c6aec2e04d4c 100644
> > --- a/drivers/infiniband/core/cache.c
> > +++ b/drivers/infiniband/core/cache.c
> > @@ -800,13 +800,15 @@ static void release_gid_table(struct ib_device *device,
> >   		return;
> >   	for (i = 0; i < table->sz; i++) {
> > +		int gid_kref;
> > +
> >   		if (is_gid_entry_free(table->data_vec[i]))
> >   			continue;
> > -		WARN_ONCE(true,
> > +		gid_kref = kref_read(&table->data_vec[i]->kref);
> > +		WARN_ONCE(gid_kref > 1,
> >   			  "GID entry ref leak for dev %s index %d ref=%u\n",
> > -			  dev_name(&device->dev), i,
> > -			  kref_read(&table->data_vec[i]->kref));
> > +			  dev_name(&device->dev), i, gid_kref);
> >   	}
> >   	mutex_destroy(&table->lock);
> 
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH rdma-next] RDMA/core: Skip initialized but not leaked GID entries
  2024-09-02 13:42 [PATCH rdma-next] RDMA/core: Skip initialized but not leaked GID entries Leon Romanovsky
  2024-09-02 22:46 ` Zhu Yanjun
@ 2024-09-04 14:31 ` Jason Gunthorpe
  2024-09-04 15:34   ` Leon Romanovsky
  1 sibling, 1 reply; 7+ messages in thread
From: Jason Gunthorpe @ 2024-09-04 14:31 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Leon Romanovsky, linux-rdma, syzbot+b8b7a6774bf40cf8296b

On Mon, Sep 02, 2024 at 04:42:52PM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> Failure in driver initialization can lead to a situation where the GID
> entries are set but not used yet. In this case, the kref will be equal to 1,
> which will trigger a false positive leak detection.

Why does that happen??


> For example, these messages are printed during the driver initialization
> and followed by release_gid_table() call:
> 
>  infiniband syz1: ib_query_port failed (-19)
>  infiniband syz1: Couldn't set up InfiniBand P_Key/GID cache

Okay, but who set the ref=1?

> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> index b7c078b7f7cf..c6aec2e04d4c 100644
> --- a/drivers/infiniband/core/cache.c
> +++ b/drivers/infiniband/core/cache.c
> @@ -800,13 +800,15 @@ static void release_gid_table(struct ib_device *device,
>  		return;
>  
>  	for (i = 0; i < table->sz; i++) {
> +		int gid_kref;
> +
>  		if (is_gid_entry_free(table->data_vec[i]))
>  			continue;
>  
> -		WARN_ONCE(true,
> +		gid_kref = kref_read(&table->data_vec[i]->kref);
> +		WARN_ONCE(gid_kref > 1,
>  			  "GID entry ref leak for dev %s index %d ref=%u\n",
> -			  dev_name(&device->dev), i,
> -			  kref_read(&table->data_vec[i]->kref));
> +			  dev_name(&device->dev), i, gid_kref);
>  	}

I'm not convinced, I think the bug here is something wrong on the
refcounting side not the freeing side. Ref should not be 1. Seems like
missing error unwinding in the init side.

Jason

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH rdma-next] RDMA/core: Skip initialized but not leaked GID entries
  2024-09-04 14:31 ` Jason Gunthorpe
@ 2024-09-04 15:34   ` Leon Romanovsky
  2024-09-05  6:54     ` Zhu Yanjun
  0 siblings, 1 reply; 7+ messages in thread
From: Leon Romanovsky @ 2024-09-04 15:34 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-rdma, syzbot+b8b7a6774bf40cf8296b

On Wed, Sep 04, 2024 at 11:31:13AM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 02, 2024 at 04:42:52PM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> > 
> > Failure in driver initialization can lead to a situation where the GID
> > entries are set but not used yet. In this case, the kref will be equal to 1,
> > which will trigger a false positive leak detection.
> 
> Why does that happen??
> 
> 
> > For example, these messages are printed during the driver initialization
> > and followed by release_gid_table() call:
> > 
> >  infiniband syz1: ib_query_port failed (-19)
> >  infiniband syz1: Couldn't set up InfiniBand P_Key/GID cache
> 
> Okay, but who set the ref=1?
> 
> > diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> > index b7c078b7f7cf..c6aec2e04d4c 100644
> > --- a/drivers/infiniband/core/cache.c
> > +++ b/drivers/infiniband/core/cache.c
> > @@ -800,13 +800,15 @@ static void release_gid_table(struct ib_device *device,
> >  		return;
> >  
> >  	for (i = 0; i < table->sz; i++) {
> > +		int gid_kref;
> > +
> >  		if (is_gid_entry_free(table->data_vec[i]))
> >  			continue;
> >  
> > -		WARN_ONCE(true,
> > +		gid_kref = kref_read(&table->data_vec[i]->kref);
> > +		WARN_ONCE(gid_kref > 1,
> >  			  "GID entry ref leak for dev %s index %d ref=%u\n",
> > -			  dev_name(&device->dev), i,
> > -			  kref_read(&table->data_vec[i]->kref));
> > +			  dev_name(&device->dev), i, gid_kref);
> >  	}
> 
> I'm not convinced, I think the bug here is something wrong on the
> refcounting side not the freeing side. Ref should not be 1. Seems like
> missing error unwinding in the init side.

I dropped this patch as the real fix is here 1403c8b14765 ("IB/core: Fix ib_cache_setup_one error flow cleanup")

Thanks

> 
> Jason
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH rdma-next] RDMA/core: Skip initialized but not leaked GID entries
  2024-09-04 15:34   ` Leon Romanovsky
@ 2024-09-05  6:54     ` Zhu Yanjun
  0 siblings, 0 replies; 7+ messages in thread
From: Zhu Yanjun @ 2024-09-05  6:54 UTC (permalink / raw)
  To: Leon Romanovsky, Jason Gunthorpe; +Cc: linux-rdma, syzbot+b8b7a6774bf40cf8296b

在 2024/9/4 23:34, Leon Romanovsky 写道:
> On Wed, Sep 04, 2024 at 11:31:13AM -0300, Jason Gunthorpe wrote:
>> On Mon, Sep 02, 2024 at 04:42:52PM +0300, Leon Romanovsky wrote:
>>> From: Leon Romanovsky <leonro@nvidia.com>
>>>
>>> Failure in driver initialization can lead to a situation where the GID
>>> entries are set but not used yet. In this case, the kref will be equal to 1,
>>> which will trigger a false positive leak detection.
>>
>> Why does that happen??
>>
>>
>>> For example, these messages are printed during the driver initialization
>>> and followed by release_gid_table() call:
>>>
>>>   infiniband syz1: ib_query_port failed (-19)
>>>   infiniband syz1: Couldn't set up InfiniBand P_Key/GID cache
>>
>> Okay, but who set the ref=1?
>>
>>> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
>>> index b7c078b7f7cf..c6aec2e04d4c 100644
>>> --- a/drivers/infiniband/core/cache.c
>>> +++ b/drivers/infiniband/core/cache.c
>>> @@ -800,13 +800,15 @@ static void release_gid_table(struct ib_device *device,
>>>   		return;
>>>   
>>>   	for (i = 0; i < table->sz; i++) {
>>> +		int gid_kref;
>>> +
>>>   		if (is_gid_entry_free(table->data_vec[i]))
>>>   			continue;
>>>   
>>> -		WARN_ONCE(true,
>>> +		gid_kref = kref_read(&table->data_vec[i]->kref);
>>> +		WARN_ONCE(gid_kref > 1,
>>>   			  "GID entry ref leak for dev %s index %d ref=%u\n",
>>> -			  dev_name(&device->dev), i,
>>> -			  kref_read(&table->data_vec[i]->kref));
>>> +			  dev_name(&device->dev), i, gid_kref);
>>>   	}
>>
>> I'm not convinced, I think the bug here is something wrong on the
>> refcounting side not the freeing side. Ref should not be 1. Seems like
>> missing error unwinding in the init side.
> 
> I dropped this patch as the real fix is here 1403c8b14765 ("IB/core: Fix ib_cache_setup_one error flow cleanup")

The commit 1403c8b14765 ("IB/core: Fix ib_cache_setup_one error flow 
cleanup") is in the link 
https://patchwork.kernel.org/project/linux-rdma/patch/79137687d829899b0b1c9835fcb4b258004c439a.1725273354.git.leon@kernel.org/

Zhu Yanjun

> 
> Thanks
> 
>>
>> Jason
>>


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-09-05  6:54 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-02 13:42 [PATCH rdma-next] RDMA/core: Skip initialized but not leaked GID entries Leon Romanovsky
2024-09-02 22:46 ` Zhu Yanjun
2024-09-03  7:26   ` Leon Romanovsky
2024-09-04  8:33   ` Leon Romanovsky
2024-09-04 14:31 ` Jason Gunthorpe
2024-09-04 15:34   ` Leon Romanovsky
2024-09-05  6:54     ` Zhu Yanjun

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox