* [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
@ 2025-02-21 2:05 Roman Gushchin
2025-02-21 3:14 ` Parav Pandit
0 siblings, 1 reply; 22+ messages in thread
From: Roman Gushchin @ 2025-02-21 2:05 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Roman Gushchin, Leon Romanovsky, Maher Sanalla, Parav Pandit,
linux-rdma, linux-kernel
Commit 54747231150f ("RDMA: Introduce and use rdma_device_to_ibdev()")
introduced the rdma_device_to_ibdev() helper, which has to be used to
obtain an ib_device pointer from a device pointer.

hw_stat_device_show() and hw_stat_device_store() were missed in that
conversion.

This causes a NULL pointer dereference panic on an attempt to read
hw counters from a namespace when the device structure is not
embedded into the ib_device structure. In this case, converting the
device pointer into an ib_device pointer using container_of() is wrong.
Instead, rdma_device_to_ibdev() should be used, which follows the
back-reference (container_of(device, struct ib_core_device, dev))->owner.

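The difference between the two conversions can be sketched in user space. This is only an illustration under stated assumptions: the three structures below are heavily simplified stand-ins for the kernel's struct device, struct ib_core_device and struct ib_device, the container_of() macro is a reduced version without the kernel's type checking, and ibdev_by_container_of(), init_ibdev() and init_compat() are hypothetical helper names that do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified container_of(): no type checking, unlike the kernel macro. */
#define container_of(ptr, type, member) \
	((type *)((uintptr_t)(ptr) - offsetof(type, member)))

/* Stand-in for the kernel's struct device (assumption: one dummy field). */
struct device {
	int id;
};

struct ib_device;

/* A core device carries a back-reference to the ib_device that owns it. */
struct ib_core_device {
	struct device dev;
	struct ib_device *owner;
};

/* In the owning ib_device, dev and coredev share the same storage. */
struct ib_device {
	const char *name;
	union {
		struct device dev;
		struct ib_core_device coredev;
	};
};

/* Mirrors the helper named in the patch: follow the owner back-reference. */
static inline struct ib_device *rdma_device_to_ibdev(struct device *device)
{
	struct ib_core_device *coredev =
		container_of(device, struct ib_core_device, dev);

	return coredev->owner;
}

/* Hypothetical name for the pattern the patch removes. */
static inline struct ib_device *ibdev_by_container_of(struct device *device)
{
	return container_of(device, struct ib_device, dev);
}

/* Hypothetical initializer: the embedded core device points at its parent. */
static inline void init_ibdev(struct ib_device *ibdev, const char *name)
{
	ibdev->name = name;
	ibdev->coredev.dev.id = 0;
	ibdev->coredev.owner = ibdev;
}

/* Hypothetical initializer for a core device that is NOT embedded. */
static inline void init_compat(struct ib_core_device *cdev,
			       struct ib_device *owner)
{
	assert(owner != NULL);
	cdev->dev.id = 1;
	cdev->owner = owner;
}
```

For a struct device embedded in an ib_device, both helpers return the same pointer. For a standalone ib_core_device, only rdma_device_to_ibdev() returns the owner; ibdev_by_container_of() merely computes an address in front of the core device, and that address does not point at any ib_device, so dereferencing it reads unrelated memory.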
[42021.807566] BUG: kernel NULL pointer dereference, address: 0000000000000028
[42021.814463] #PF: supervisor read access in kernel mode
[42021.819549] #PF: error_code(0x0000) - not-present page
[42021.824636] PGD 0 P4D 0
[42021.827145] Oops: 0000 [#1] SMP PTI
[42021.830598] CPU: 82 PID: 2843922 Comm: switchto-defaul Kdump: loaded Tainted: G S W I XXX
[42021.841697] Hardware name: XXX
[42021.849619] RIP: 0010:hw_stat_device_show+0x1e/0x40 [ib_core]
[42021.855362] Code: 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 49 89 d0 4c 8b 5e 20 48 8b 8f b8 04 00 00 48 81 c7 f0 fa ff ff <48> 8b 41 28 48 29 ce 48 83 c6 d0 48 c1 ee 04 69 d6 ab aa aa aa 48
[42021.873931] RSP: 0018:ffff97fe90f03da0 EFLAGS: 00010287
[42021.879108] RAX: ffff9406988a8c60 RBX: ffff940e1072d438 RCX: 0000000000000000
[42021.886169] RDX: ffff94085f1aa000 RSI: ffff93c6cbbdbcb0 RDI: ffff940c7517aef0
[42021.893230] RBP: ffff97fe90f03e70 R08: ffff94085f1aa000 R09: 0000000000000000
[42021.900294] R10: ffff94085f1aa000 R11: ffffffffc0775680 R12: ffffffff87ca2530
[42021.907355] R13: ffff940651602840 R14: ffff93c6cbbdbcb0 R15: ffff94085f1aa000
[42021.914418] FS: 00007fda1a3b9700(0000) GS:ffff94453fb80000(0000) knlGS:0000000000000000
[42021.922423] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[42021.928130] CR2: 0000000000000028 CR3: 00000042dcfb8003 CR4: 00000000003726f0
[42021.935194] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[42021.942257] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[42021.949324] Call Trace:
[42021.951756] <TASK>
[42021.953842] [<ffffffff86c58674>] ? show_regs+0x64/0x70
[42021.959030] [<ffffffff86c58468>] ? __die+0x78/0xc0
[42021.963874] [<ffffffff86c9ef75>] ? page_fault_oops+0x2b5/0x3b0
[42021.969749] [<ffffffff87674b92>] ? exc_page_fault+0x1a2/0x3c0
[42021.975549] [<ffffffff87801326>] ? asm_exc_page_fault+0x26/0x30
[42021.981517] [<ffffffffc0775680>] ? __pfx_show_hw_stats+0x10/0x10 [ib_core]
[42021.988482] [<ffffffffc077564e>] ? hw_stat_device_show+0x1e/0x40 [ib_core]
[42021.995438] [<ffffffff86ac7f8e>] dev_attr_show+0x1e/0x50
[42022.000803] [<ffffffff86a3eeb1>] sysfs_kf_seq_show+0x81/0xe0
[42022.006508] [<ffffffff86a11134>] seq_read_iter+0xf4/0x410
[42022.011954] [<ffffffff869f4b2e>] vfs_read+0x16e/0x2f0
[42022.017058] [<ffffffff869f50ee>] ksys_read+0x6e/0xe0
[42022.022073] [<ffffffff8766f1ca>] do_syscall_64+0x6a/0xa0
[42022.027441] [<ffffffff8780013b>] entry_SYSCALL_64_after_hwframe+0x78/0xe2
Fixes: 54747231150f ("RDMA: Introduce and use rdma_device_to_ibdev()")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Maher Sanalla <msanalla@nvidia.com>
Cc: Parav Pandit <parav@mellanox.com>
Cc: linux-rdma@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
drivers/infiniband/core/sysfs.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index 7491328ca5e6..0be77b8abeae 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -148,7 +148,7 @@ static ssize_t hw_stat_device_show(struct device *dev,
{
struct hw_stats_device_attribute *stat_attr =
container_of(attr, struct hw_stats_device_attribute, attr);
- struct ib_device *ibdev = container_of(dev, struct ib_device, dev);
+ struct ib_device *ibdev = rdma_device_to_ibdev(dev);
return stat_attr->show(ibdev, ibdev->hw_stats_data->stats,
stat_attr - ibdev->hw_stats_data->attrs, 0, buf);
@@ -160,7 +160,7 @@ static ssize_t hw_stat_device_store(struct device *dev,
{
struct hw_stats_device_attribute *stat_attr =
container_of(attr, struct hw_stats_device_attribute, attr);
- struct ib_device *ibdev = container_of(dev, struct ib_device, dev);
+ struct ib_device *ibdev = rdma_device_to_ibdev(dev);
return stat_attr->store(ibdev, ibdev->hw_stats_data->stats,
stat_attr - ibdev->hw_stats_data->attrs, 0, buf,
--
2.48.1.601.g30ceb7b040-goog
* RE: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-21 2:05 [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show() Roman Gushchin
@ 2025-02-21 3:14 ` Parav Pandit
2025-02-21 4:25 ` Roman Gushchin
0 siblings, 1 reply; 22+ messages in thread
From: Parav Pandit @ 2025-02-21 3:14 UTC (permalink / raw)
To: Roman Gushchin, Jason Gunthorpe
Cc: Leon Romanovsky, Maher Sanalla, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org
> From: Roman Gushchin <roman.gushchin@linux.dev>
> Sent: Friday, February 21, 2025 7:36 AM
>
> Commit 54747231150f ("RDMA: Introduce and use rdma_device_to_ibdev()")
> introduced rdma_device_to_ibdev() helper which has to be used to obtain an
> ib_device pointer from a device pointer.
>
> hw_stat_device_show() and hw_stat_device_store() were missed.
>
> It causes a NULL pointer dereference panic on an attempt to read hw counters
> from a namespace, when the device structure is not embedded into the
> ib_device structure.
Do you mean a net namespace other than the default init_net?
Assuming the answer is yes, I have some questions below.
> In this case casting the device pointer into the ib_device
> pointer using container_of() is wrong.
> Instead, rdma_device_to_ibdev() should be used, which uses the back-
> reference (container_of(device, struct ib_core_device, dev))->owner.
>
> [42021.807566] BUG: kernel NULL pointer dereference, address:
> 0000000000000028 [42021.814463] #PF: supervisor read access in kernel
> mode [42021.819549] #PF: error_code(0x0000) - not-present page
> [42021.824636] PGD 0 P4D 0 [42021.827145] Oops: 0000 [#1] SMP PTI
> [42021.830598] CPU: 82 PID: 2843922 Comm: switchto-defaul Kdump: loaded
> Tainted: G S W I XXX
> [42021.841697] Hardware name: XXX
> [42021.849619] RIP: 0010:hw_stat_device_show+0x1e/0x40 [ib_core]
> [42021.855362] Code: 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f
> 44 00 00 49 89 d0 4c 8b 5e 20 48 8b 8f b8 04 00 00 48 81 c7 f0 fa ff ff <48> 8b
> 41 28 48 29 ce 48 83 c6 d0 48 c1 ee 04 69 d6 ab aa aa aa 48 [42021.873931]
> RSP: 0018:ffff97fe90f03da0 EFLAGS: 00010287 [42021.879108] RAX:
> ffff9406988a8c60 RBX: ffff940e1072d438 RCX: 0000000000000000
> [42021.886169] RDX: ffff94085f1aa000 RSI: ffff93c6cbbdbcb0 RDI:
> ffff940c7517aef0 [42021.893230] RBP: ffff97fe90f03e70 R08:
> ffff94085f1aa000 R09: 0000000000000000 [42021.900294] R10:
> ffff94085f1aa000 R11: ffffffffc0775680 R12: ffffffff87ca2530 [42021.907355]
> R13: ffff940651602840 R14: ffff93c6cbbdbcb0 R15: ffff94085f1aa000
> [42021.914418] FS: 00007fda1a3b9700(0000) GS:ffff94453fb80000(0000)
> knlGS:0000000000000000 [42021.922423] CS: 0010 DS: 0000 ES: 0000 CR0:
> 0000000080050033 [42021.928130] CR2: 0000000000000028 CR3:
> 00000042dcfb8003 CR4: 00000000003726f0 [42021.935194] DR0:
> 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [42021.942257] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400 [42021.949324] Call Trace:
> [42021.951756] <TASK>
> [42021.953842] [<ffffffff86c58674>] ? show_regs+0x64/0x70 [42021.959030]
> [<ffffffff86c58468>] ? __die+0x78/0xc0 [42021.963874] [<ffffffff86c9ef75>] ?
> page_fault_oops+0x2b5/0x3b0 [42021.969749] [<ffffffff87674b92>] ?
> exc_page_fault+0x1a2/0x3c0 [42021.975549] [<ffffffff87801326>] ?
> asm_exc_page_fault+0x26/0x30 [42021.981517] [<ffffffffc0775680>] ?
> __pfx_show_hw_stats+0x10/0x10 [ib_core] [42021.988482]
> [<ffffffffc077564e>] ? hw_stat_device_show+0x1e/0x40 [ib_core]
> [42021.995438] [<ffffffff86ac7f8e>] dev_attr_show+0x1e/0x50
> [42022.000803] [<ffffffff86a3eeb1>] sysfs_kf_seq_show+0x81/0xe0
> [42022.006508] [<ffffffff86a11134>] seq_read_iter+0xf4/0x410
> [42022.011954] [<ffffffff869f4b2e>] vfs_read+0x16e/0x2f0 [42022.017058]
> [<ffffffff869f50ee>] ksys_read+0x6e/0xe0 [42022.022073] [<ffffffff8766f1ca>]
> do_syscall_64+0x6a/0xa0 [42022.027441] [<ffffffff8780013b>]
> entry_SYSCALL_64_after_hwframe+0x78/0xe2
>
> Fixes: 54747231150f ("RDMA: Introduce and use rdma_device_to_ibdev()")
Commit eb15c78b05bd9 eliminated the hw_counters sysfs directory from the net namespace.
With kernel 6.12+, I don't see it created in any net ns other than init_net.
I am puzzled. Can you please share the reproduction steps for generating the above call trace?
> Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Leon Romanovsky <leon@kernel.org>
> Cc: Maher Sanalla <msanalla@nvidia.com>
> Cc: Parav Pandit <parav@mellanox.com>
> Cc: linux-rdma@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
> drivers/infiniband/core/sysfs.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
> index 7491328ca5e6..0be77b8abeae 100644
> --- a/drivers/infiniband/core/sysfs.c
> +++ b/drivers/infiniband/core/sysfs.c
> @@ -148,7 +148,7 @@ static ssize_t hw_stat_device_show(struct device
> *dev, {
> struct hw_stats_device_attribute *stat_attr =
> container_of(attr, struct hw_stats_device_attribute, attr);
> - struct ib_device *ibdev = container_of(dev, struct ib_device, dev);
> + struct ib_device *ibdev = rdma_device_to_ibdev(dev);
>
> return stat_attr->show(ibdev, ibdev->hw_stats_data->stats,
> stat_attr - ibdev->hw_stats_data->attrs, 0, buf);
> @@ -160,7 +160,7 @@ static ssize_t hw_stat_device_store(struct device
> *dev, {
> struct hw_stats_device_attribute *stat_attr =
> container_of(attr, struct hw_stats_device_attribute, attr);
> - struct ib_device *ibdev = container_of(dev, struct ib_device, dev);
> + struct ib_device *ibdev = rdma_device_to_ibdev(dev);
>
> return stat_attr->store(ibdev, ibdev->hw_stats_data->stats,
> stat_attr - ibdev->hw_stats_data->attrs, 0,
> buf,
> --
> 2.48.1.601.g30ceb7b040-goog
* Re: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-21 3:14 ` Parav Pandit
@ 2025-02-21 4:25 ` Roman Gushchin
2025-02-21 4:34 ` Parav Pandit
0 siblings, 1 reply; 22+ messages in thread
From: Roman Gushchin @ 2025-02-21 4:25 UTC (permalink / raw)
To: Parav Pandit
Cc: Jason Gunthorpe, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
On Fri, Feb 21, 2025 at 03:14:16AM +0000, Parav Pandit wrote:
>
> > From: Roman Gushchin <roman.gushchin@linux.dev>
> > Sent: Friday, February 21, 2025 7:36 AM
> >
> > Commit 54747231150f ("RDMA: Introduce and use rdma_device_to_ibdev()")
> > introduced rdma_device_to_ibdev() helper which has to be used to obtain an
> > ib_device pointer from a device pointer.
> >
>
> > hw_stat_device_show() and hw_stat_device_store() were missed.
> >
> > It causes a NULL pointer dereference panic on an attempt to read hw counters
> > from a namespace, when the device structure is not embedded into the
> > ib_device structure.
> Do you mean net namespace other than default init_net?
> Assuming the answer is yes, some question below.
>
> > In this case casting the device pointer into the ib_device
> > pointer using container_of() is wrong.
> > Instead, rdma_device_to_ibdev() should be used, which uses the back-
> > reference (container_of(device, struct ib_core_device, dev))->owner.
> >
> > [42021.807566] BUG: kernel NULL pointer dereference, address:
> > 0000000000000028 [42021.814463] #PF: supervisor read access in kernel
> > mode [42021.819549] #PF: error_code(0x0000) - not-present page
> > [42021.824636] PGD 0 P4D 0 [42021.827145] Oops: 0000 [#1] SMP PTI
> > [42021.830598] CPU: 82 PID: 2843922 Comm: switchto-defaul Kdump: loaded
> > Tainted: G S W I XXX
> > [42021.841697] Hardware name: XXX
> > [42021.849619] RIP: 0010:hw_stat_device_show+0x1e/0x40 [ib_core]
> > [42021.855362] Code: 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f
> > 44 00 00 49 89 d0 4c 8b 5e 20 48 8b 8f b8 04 00 00 48 81 c7 f0 fa ff ff <48> 8b
> > 41 28 48 29 ce 48 83 c6 d0 48 c1 ee 04 69 d6 ab aa aa aa 48 [42021.873931]
> > RSP: 0018:ffff97fe90f03da0 EFLAGS: 00010287 [42021.879108] RAX:
> > ffff9406988a8c60 RBX: ffff940e1072d438 RCX: 0000000000000000
> > [42021.886169] RDX: ffff94085f1aa000 RSI: ffff93c6cbbdbcb0 RDI:
> > ffff940c7517aef0 [42021.893230] RBP: ffff97fe90f03e70 R08:
> > ffff94085f1aa000 R09: 0000000000000000 [42021.900294] R10:
> > ffff94085f1aa000 R11: ffffffffc0775680 R12: ffffffff87ca2530 [42021.907355]
> > R13: ffff940651602840 R14: ffff93c6cbbdbcb0 R15: ffff94085f1aa000
> > [42021.914418] FS: 00007fda1a3b9700(0000) GS:ffff94453fb80000(0000)
> > knlGS:0000000000000000 [42021.922423] CS: 0010 DS: 0000 ES: 0000 CR0:
> > 0000000080050033 [42021.928130] CR2: 0000000000000028 CR3:
> > 00000042dcfb8003 CR4: 00000000003726f0 [42021.935194] DR0:
> > 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [42021.942257] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> > 0000000000000400 [42021.949324] Call Trace:
> > [42021.951756] <TASK>
> > [42021.953842] [<ffffffff86c58674>] ? show_regs+0x64/0x70 [42021.959030]
> > [<ffffffff86c58468>] ? __die+0x78/0xc0 [42021.963874] [<ffffffff86c9ef75>] ?
> > page_fault_oops+0x2b5/0x3b0 [42021.969749] [<ffffffff87674b92>] ?
> > exc_page_fault+0x1a2/0x3c0 [42021.975549] [<ffffffff87801326>] ?
> > asm_exc_page_fault+0x26/0x30 [42021.981517] [<ffffffffc0775680>] ?
> > __pfx_show_hw_stats+0x10/0x10 [ib_core] [42021.988482]
> > [<ffffffffc077564e>] ? hw_stat_device_show+0x1e/0x40 [ib_core]
> > [42021.995438] [<ffffffff86ac7f8e>] dev_attr_show+0x1e/0x50
> > [42022.000803] [<ffffffff86a3eeb1>] sysfs_kf_seq_show+0x81/0xe0
> > [42022.006508] [<ffffffff86a11134>] seq_read_iter+0xf4/0x410
> > [42022.011954] [<ffffffff869f4b2e>] vfs_read+0x16e/0x2f0 [42022.017058]
> > [<ffffffff869f50ee>] ksys_read+0x6e/0xe0 [42022.022073] [<ffffffff8766f1ca>]
> > do_syscall_64+0x6a/0xa0 [42022.027441] [<ffffffff8780013b>]
> > entry_SYSCALL_64_after_hwframe+0x78/0xe2
> >
> > Fixes: 54747231150f ("RDMA: Introduce and use rdma_device_to_ibdev()")
> Commit eb15c78b05bd9 eliminated hw_counters sysfs directory into the net namespace.
> I don't see it created in any other net ns other than init_net with kernel 6.12+.
>
> I am puzzled. Can you please explain/share the reproduction steps for generating above call trace?
Hi Parav!

This bug was spotted in production on a small number of machines. They were
running a 6.6-based kernel (with no changes around this code). I don't have
a reproducer (and there is no simple way for me to reproduce the problem), but
I have several core dumps, and from inspecting them it was clear that the
ib_device pointer obtained in hw_stat_device_show() was wrong. At the same time,
the ib_device pointer obtained the way rdma_device_to_ibdev() works was correct.

Thanks!
* RE: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-21 4:25 ` Roman Gushchin
@ 2025-02-21 4:34 ` Parav Pandit
2025-02-21 4:49 ` Roman Gushchin
2025-02-21 17:43 ` Jason Gunthorpe
0 siblings, 2 replies; 22+ messages in thread
From: Parav Pandit @ 2025-02-21 4:34 UTC (permalink / raw)
To: Roman Gushchin
Cc: Jason Gunthorpe, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
> From: Roman Gushchin <roman.gushchin@linux.dev>
> Sent: Friday, February 21, 2025 9:56 AM
>
> On Fri, Feb 21, 2025 at 03:14:16AM +0000, Parav Pandit wrote:
> >
> > > From: Roman Gushchin <roman.gushchin@linux.dev>
> > > Sent: Friday, February 21, 2025 7:36 AM
> > >
> > > Commit 54747231150f ("RDMA: Introduce and use
> > > rdma_device_to_ibdev()") introduced rdma_device_to_ibdev() helper
> > > which has to be used to obtain an ib_device pointer from a device pointer.
> > >
> >
> > > hw_stat_device_show() and hw_stat_device_store() were missed.
> > >
> > > It causes a NULL pointer dereference panic on an attempt to read hw
> > > counters from a namespace, when the device structure is not embedded
> > > into the ib_device structure.
> > Do you mean net namespace other than default init_net?
> > Assuming the answer is yes, some question below.
> >
> > > In this case casting the device pointer into the ib_device pointer
> > > using container_of() is wrong.
> > > Instead, rdma_device_to_ibdev() should be used, which uses the back-
> > > reference (container_of(device, struct ib_core_device, dev))->owner.
> > >
> > > [42021.807566] BUG: kernel NULL pointer dereference, address:
> > > 0000000000000028 [42021.814463] #PF: supervisor read access in
> > > kernel mode [42021.819549] #PF: error_code(0x0000) - not-present
> > > page [42021.824636] PGD 0 P4D 0 [42021.827145] Oops: 0000 [#1] SMP
> > > PTI [42021.830598] CPU: 82 PID: 2843922 Comm: switchto-defaul Kdump:
> loaded
> > > Tainted: G S W I XXX
> > > [42021.841697] Hardware name: XXX
> > > [42021.849619] RIP: 0010:hw_stat_device_show+0x1e/0x40 [ib_core]
> > > [42021.855362] Code: 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa
> > > 0f 1f
> > > 44 00 00 49 89 d0 4c 8b 5e 20 48 8b 8f b8 04 00 00 48 81 c7 f0 fa ff
> > > ff <48> 8b
> > > 41 28 48 29 ce 48 83 c6 d0 48 c1 ee 04 69 d6 ab aa aa aa 48
> > > [42021.873931]
> > > RSP: 0018:ffff97fe90f03da0 EFLAGS: 00010287 [42021.879108] RAX:
> > > ffff9406988a8c60 RBX: ffff940e1072d438 RCX: 0000000000000000
> > > [42021.886169] RDX: ffff94085f1aa000 RSI: ffff93c6cbbdbcb0 RDI:
> > > ffff940c7517aef0 [42021.893230] RBP: ffff97fe90f03e70 R08:
> > > ffff94085f1aa000 R09: 0000000000000000 [42021.900294] R10:
> > > ffff94085f1aa000 R11: ffffffffc0775680 R12: ffffffff87ca2530
> > > [42021.907355]
> > > R13: ffff940651602840 R14: ffff93c6cbbdbcb0 R15: ffff94085f1aa000
> > > [42021.914418] FS: 00007fda1a3b9700(0000)
> GS:ffff94453fb80000(0000)
> > > knlGS:0000000000000000 [42021.922423] CS: 0010 DS: 0000 ES: 0000
> CR0:
> > > 0000000080050033 [42021.928130] CR2: 0000000000000028 CR3:
> > > 00000042dcfb8003 CR4: 00000000003726f0 [42021.935194] DR0:
> > > 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > [42021.942257] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> > > 0000000000000400 [42021.949324] Call Trace:
> > > [42021.951756] <TASK>
> > > [42021.953842] [<ffffffff86c58674>] ? show_regs+0x64/0x70
> > > [42021.959030] [<ffffffff86c58468>] ? __die+0x78/0xc0 [42021.963874]
> [<ffffffff86c9ef75>] ?
> > > page_fault_oops+0x2b5/0x3b0 [42021.969749] [<ffffffff87674b92>] ?
> > > exc_page_fault+0x1a2/0x3c0 [42021.975549] [<ffffffff87801326>] ?
> > > asm_exc_page_fault+0x26/0x30 [42021.981517] [<ffffffffc0775680>] ?
> > > __pfx_show_hw_stats+0x10/0x10 [ib_core] [42021.988482]
> > > [<ffffffffc077564e>] ? hw_stat_device_show+0x1e/0x40 [ib_core]
> > > [42021.995438] [<ffffffff86ac7f8e>] dev_attr_show+0x1e/0x50
> > > [42022.000803] [<ffffffff86a3eeb1>] sysfs_kf_seq_show+0x81/0xe0
> > > [42022.006508] [<ffffffff86a11134>] seq_read_iter+0xf4/0x410
> > > [42022.011954] [<ffffffff869f4b2e>] vfs_read+0x16e/0x2f0
> > > [42022.017058] [<ffffffff869f50ee>] ksys_read+0x6e/0xe0
> > > [42022.022073] [<ffffffff8766f1ca>]
> > > do_syscall_64+0x6a/0xa0 [42022.027441] [<ffffffff8780013b>]
> > > entry_SYSCALL_64_after_hwframe+0x78/0xe2
> > >
> > > Fixes: 54747231150f ("RDMA: Introduce and use
> > > rdma_device_to_ibdev()")
> > Commit eb15c78b05bd9 eliminated hw_counters sysfs directory into the
> net namespace.
> > I don't see it created in any other net ns other than init_net with kernel
> 6.12+.
> >
> > I am puzzled. Can you please explain/share the reproduction steps for
> generating above call trace?
>
> Hi Parav!
>
> This bug was spotted in the production on a small number of machines. They
> were running a 6.6-based kernel (with no changes around this code). I don't
> have a reproducer (and there is no simple way for me to reproduce the
> problem), but I've several core dumps and from inspecting them it was clear
> that a ib_device pointer obtained in hw_stat_device_show() was wrong. At
> the same time the ib_pointer obtained in the way rdma_device_to_ibdev()
> works was correct.
>
I just tried reproducing this manually on a 6.12+ kernel.
To me it appears impossible to reach this flow, as intended by the commit I listed,
yet the call trace shows the opposite.
So please gather information from the production system on how to reproduce it, or on its configuration.
We still need to block the hw counters from net namespaces, and will have to come up with a different fix if this path was reached somehow.
* Re: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-21 4:34 ` Parav Pandit
@ 2025-02-21 4:49 ` Roman Gushchin
2025-02-21 8:03 ` Parav Pandit
2025-02-21 17:43 ` Jason Gunthorpe
1 sibling, 1 reply; 22+ messages in thread
From: Roman Gushchin @ 2025-02-21 4:49 UTC (permalink / raw)
To: Parav Pandit
Cc: Jason Gunthorpe, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
On Fri, Feb 21, 2025 at 04:34:25AM +0000, Parav Pandit wrote:
>
> > From: Roman Gushchin <roman.gushchin@linux.dev>
> > Sent: Friday, February 21, 2025 9:56 AM
> >
> > On Fri, Feb 21, 2025 at 03:14:16AM +0000, Parav Pandit wrote:
> > >
> > > > From: Roman Gushchin <roman.gushchin@linux.dev>
> > > > Sent: Friday, February 21, 2025 7:36 AM
> > > >
> > > > Commit 54747231150f ("RDMA: Introduce and use
> > > > rdma_device_to_ibdev()") introduced rdma_device_to_ibdev() helper
> > > > which has to be used to obtain an ib_device pointer from a device pointer.
> > > >
> > >
> > > > hw_stat_device_show() and hw_stat_device_store() were missed.
> > > >
> > > > It causes a NULL pointer dereference panic on an attempt to read hw
> > > > counters from a namespace, when the device structure is not embedded
> > > > into the ib_device structure.
> > > Do you mean net namespace other than default init_net?
> > > Assuming the answer is yes, some question below.
> > >
> > > > In this case casting the device pointer into the ib_device pointer
> > > > using container_of() is wrong.
> > > > Instead, rdma_device_to_ibdev() should be used, which uses the back-
> > > > reference (container_of(device, struct ib_core_device, dev))->owner.
> > > >
> > > > [42021.807566] BUG: kernel NULL pointer dereference, address:
> > > > 0000000000000028 [42021.814463] #PF: supervisor read access in
> > > > kernel mode [42021.819549] #PF: error_code(0x0000) - not-present
> > > > page [42021.824636] PGD 0 P4D 0 [42021.827145] Oops: 0000 [#1] SMP
> > > > PTI [42021.830598] CPU: 82 PID: 2843922 Comm: switchto-defaul Kdump:
> > loaded
> > > > Tainted: G S W I XXX
> > > > [42021.841697] Hardware name: XXX
> > > > [42021.849619] RIP: 0010:hw_stat_device_show+0x1e/0x40 [ib_core]
> > > > [42021.855362] Code: 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa
> > > > 0f 1f
> > > > 44 00 00 49 89 d0 4c 8b 5e 20 48 8b 8f b8 04 00 00 48 81 c7 f0 fa ff
> > > > ff <48> 8b
> > > > 41 28 48 29 ce 48 83 c6 d0 48 c1 ee 04 69 d6 ab aa aa aa 48
> > > > [42021.873931]
> > > > RSP: 0018:ffff97fe90f03da0 EFLAGS: 00010287 [42021.879108] RAX:
> > > > ffff9406988a8c60 RBX: ffff940e1072d438 RCX: 0000000000000000
> > > > [42021.886169] RDX: ffff94085f1aa000 RSI: ffff93c6cbbdbcb0 RDI:
> > > > ffff940c7517aef0 [42021.893230] RBP: ffff97fe90f03e70 R08:
> > > > ffff94085f1aa000 R09: 0000000000000000 [42021.900294] R10:
> > > > ffff94085f1aa000 R11: ffffffffc0775680 R12: ffffffff87ca2530
> > > > [42021.907355]
> > > > R13: ffff940651602840 R14: ffff93c6cbbdbcb0 R15: ffff94085f1aa000
> > > > [42021.914418] FS: 00007fda1a3b9700(0000)
> > GS:ffff94453fb80000(0000)
> > > > knlGS:0000000000000000 [42021.922423] CS: 0010 DS: 0000 ES: 0000
> > CR0:
> > > > 0000000080050033 [42021.928130] CR2: 0000000000000028 CR3:
> > > > 00000042dcfb8003 CR4: 00000000003726f0 [42021.935194] DR0:
> > > > 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > [42021.942257] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> > > > 0000000000000400 [42021.949324] Call Trace:
> > > > [42021.951756] <TASK>
> > > > [42021.953842] [<ffffffff86c58674>] ? show_regs+0x64/0x70
> > > > [42021.959030] [<ffffffff86c58468>] ? __die+0x78/0xc0 [42021.963874]
> > [<ffffffff86c9ef75>] ?
> > > > page_fault_oops+0x2b5/0x3b0 [42021.969749] [<ffffffff87674b92>] ?
> > > > exc_page_fault+0x1a2/0x3c0 [42021.975549] [<ffffffff87801326>] ?
> > > > asm_exc_page_fault+0x26/0x30 [42021.981517] [<ffffffffc0775680>] ?
> > > > __pfx_show_hw_stats+0x10/0x10 [ib_core] [42021.988482]
> > > > [<ffffffffc077564e>] ? hw_stat_device_show+0x1e/0x40 [ib_core]
> > > > [42021.995438] [<ffffffff86ac7f8e>] dev_attr_show+0x1e/0x50
> > > > [42022.000803] [<ffffffff86a3eeb1>] sysfs_kf_seq_show+0x81/0xe0
> > > > [42022.006508] [<ffffffff86a11134>] seq_read_iter+0xf4/0x410
> > > > [42022.011954] [<ffffffff869f4b2e>] vfs_read+0x16e/0x2f0
> > > > [42022.017058] [<ffffffff869f50ee>] ksys_read+0x6e/0xe0
> > > > [42022.022073] [<ffffffff8766f1ca>]
> > > > do_syscall_64+0x6a/0xa0 [42022.027441] [<ffffffff8780013b>]
> > > > entry_SYSCALL_64_after_hwframe+0x78/0xe2
> > > >
> > > > Fixes: 54747231150f ("RDMA: Introduce and use
> > > > rdma_device_to_ibdev()")
> > > Commit eb15c78b05bd9 eliminated hw_counters sysfs directory into the
> > net namespace.
> > > I don't see it created in any other net ns other than init_net with kernel
> > 6.12+.
> > >
> > > I am puzzled. Can you please explain/share the reproduction steps for
> > generating above call trace?
> >
> > Hi Parav!
> >
> > This bug was spotted in the production on a small number of machines. They
> > were running a 6.6-based kernel (with no changes around this code). I don't
> > have a reproducer (and there is no simple way for me to reproduce the
> > problem), but I've several core dumps and from inspecting them it was clear
> > that a ib_device pointer obtained in hw_stat_device_show() was wrong. At
> > the same time the ib_pointer obtained in the way rdma_device_to_ibdev()
> > works was correct.
> >
> I just tried reproducing now on 6.12+ kernel manually.
Can you, please, share your steps? Or try the 6.6 kernel?
> It appears impossible to reach flow to me as intended in the commit I listed.
> And the call trace shows opposite.
> So please gather the information from the production system on reproducing it or configuration wise.
> We still need to block the hw counters from net ns and will have to generate different fix if it was reached somehow.
I'll try, but it'll take a lot of time: I'm very limited in terms of what I can
do because it's a production workload.

So if you have any suggestions on where to look or what to try, I'd
appreciate it.

Thanks!
* RE: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-21 4:49 ` Roman Gushchin
@ 2025-02-21 8:03 ` Parav Pandit
2025-02-21 20:02 ` Roman Gushchin
0 siblings, 1 reply; 22+ messages in thread
From: Parav Pandit @ 2025-02-21 8:03 UTC (permalink / raw)
To: Roman Gushchin
Cc: Jason Gunthorpe, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
> From: Roman Gushchin <roman.gushchin@linux.dev>
> Sent: Friday, February 21, 2025 10:20 AM
>
> On Fri, Feb 21, 2025 at 04:34:25AM +0000, Parav Pandit wrote:
> >
> > > From: Roman Gushchin <roman.gushchin@linux.dev>
> > > Sent: Friday, February 21, 2025 9:56 AM
> > >
> > > On Fri, Feb 21, 2025 at 03:14:16AM +0000, Parav Pandit wrote:
> > > >
> > > > > From: Roman Gushchin <roman.gushchin@linux.dev>
> > > > > Sent: Friday, February 21, 2025 7:36 AM
> > > > >
> > > > > Commit 54747231150f ("RDMA: Introduce and use
> > > > > rdma_device_to_ibdev()") introduced rdma_device_to_ibdev()
> > > > > helper which has to be used to obtain an ib_device pointer from a
> device pointer.
> > > > >
> > > >
> > > > > hw_stat_device_show() and hw_stat_device_store() were missed.
> > > > >
> > > > > It causes a NULL pointer dereference panic on an attempt to read
> > > > > hw counters from a namespace, when the device structure is not
> > > > > embedded into the ib_device structure.
> > > > Do you mean net namespace other than default init_net?
> > > > Assuming the answer is yes, some question below.
> > > >
> > > > > In this case casting the device pointer into the ib_device
> > > > > pointer using container_of() is wrong.
> > > > > Instead, rdma_device_to_ibdev() should be used, which uses the
> > > > > back- reference (container_of(device, struct ib_core_device, dev))-
> >owner.
> > > > >
> > > > > [42021.807566] BUG: kernel NULL pointer dereference, address:
> > > > > 0000000000000028 [42021.814463] #PF: supervisor read access in
> > > > > kernel mode [42021.819549] #PF: error_code(0x0000) - not-present
> > > > > page [42021.824636] PGD 0 P4D 0 [42021.827145] Oops: 0000 [#1]
> > > > > SMP PTI [42021.830598] CPU: 82 PID: 2843922 Comm: switchto-
> defaul Kdump:
> > > loaded
> > > > > Tainted: G S W I XXX
> > > > > [42021.841697] Hardware name: XXX [42021.849619] RIP:
> > > > > 0010:hw_stat_device_show+0x1e/0x40 [ib_core] [42021.855362]
> > > > > Code: 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f
> > > > > 44 00 00 49 89 d0 4c 8b 5e 20 48 8b 8f b8 04 00 00 48 81 c7 f0
> > > > > fa ff ff <48> 8b
> > > > > 41 28 48 29 ce 48 83 c6 d0 48 c1 ee 04 69 d6 ab aa aa aa 48
> > > > > [42021.873931]
> > > > > RSP: 0018:ffff97fe90f03da0 EFLAGS: 00010287 [42021.879108] RAX:
> > > > > ffff9406988a8c60 RBX: ffff940e1072d438 RCX: 0000000000000000
> > > > > [42021.886169] RDX: ffff94085f1aa000 RSI: ffff93c6cbbdbcb0 RDI:
> > > > > ffff940c7517aef0 [42021.893230] RBP: ffff97fe90f03e70 R08:
> > > > > ffff94085f1aa000 R09: 0000000000000000 [42021.900294] R10:
> > > > > ffff94085f1aa000 R11: ffffffffc0775680 R12: ffffffff87ca2530
> > > > > [42021.907355]
> > > > > R13: ffff940651602840 R14: ffff93c6cbbdbcb0 R15:
> > > > > ffff94085f1aa000 [42021.914418] FS: 00007fda1a3b9700(0000)
> > > GS:ffff94453fb80000(0000)
> > > > > knlGS:0000000000000000 [42021.922423] CS: 0010 DS: 0000 ES:
> > > > > 0000
> > > CR0:
> > > > > 0000000080050033 [42021.928130] CR2: 0000000000000028 CR3:
> > > > > 00000042dcfb8003 CR4: 00000000003726f0 [42021.935194] DR0:
> > > > > 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> > > > > [42021.942257] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> > > > > 0000000000000400 [42021.949324] Call Trace:
> > > > > [42021.951756] <TASK>
> > > > > [42021.953842] [<ffffffff86c58674>] ? show_regs+0x64/0x70
> > > > > [42021.959030] [<ffffffff86c58468>] ? __die+0x78/0xc0
> > > > > [42021.963874]
> > > [<ffffffff86c9ef75>] ?
> > > > > page_fault_oops+0x2b5/0x3b0 [42021.969749] [<ffffffff87674b92>] ?
> > > > > exc_page_fault+0x1a2/0x3c0 [42021.975549] [<ffffffff87801326>] ?
> > > > > asm_exc_page_fault+0x26/0x30 [42021.981517] [<ffffffffc0775680>]
> ?
> > > > > __pfx_show_hw_stats+0x10/0x10 [ib_core] [42021.988482]
> > > > > [<ffffffffc077564e>] ? hw_stat_device_show+0x1e/0x40 [ib_core]
> > > > > [42021.995438] [<ffffffff86ac7f8e>] dev_attr_show+0x1e/0x50
> > > > > [42022.000803] [<ffffffff86a3eeb1>] sysfs_kf_seq_show+0x81/0xe0
> > > > > [42022.006508] [<ffffffff86a11134>] seq_read_iter+0xf4/0x410
> > > > > [42022.011954] [<ffffffff869f4b2e>] vfs_read+0x16e/0x2f0
> > > > > [42022.017058] [<ffffffff869f50ee>] ksys_read+0x6e/0xe0
> > > > > [42022.022073] [<ffffffff8766f1ca>]
> > > > > do_syscall_64+0x6a/0xa0 [42022.027441] [<ffffffff8780013b>]
> > > > > entry_SYSCALL_64_after_hwframe+0x78/0xe2
> > > > >
> > > > > Fixes: 54747231150f ("RDMA: Introduce and use
> > > > > rdma_device_to_ibdev()")
> > > > Commit eb15c78b05bd9 eliminated hw_counters sysfs directory into
> > > > the
> > > net namespace.
> > > > I don't see it created in any other net ns other than init_net
> > > > with kernel
> > > 6.12+.
> > > >
> > > > I am puzzled. Can you please explain/share the reproduction steps
> > > > for
> > > generating above call trace?
> > >
> > > Hi Parav!
> > >
> > > This bug was spotted in the production on a small number of
> > > machines. They were running a 6.6-based kernel (with no changes
> > > around this code). I don't have a reproducer (and there is no simple
> > > way for me to reproduce the problem), but I've several core dumps
> > > and from inspecting them it was clear that a ib_device pointer
> > > obtained in hw_stat_device_show() was wrong. At the same time the
> > > ib_pointer obtained in the way rdma_device_to_ibdev() works was
> correct.
> > >
> > I just tried reproducing now on 6.12+ kernel manually.
>
> Can you, please, share your steps? Or try the 6.6 kernel?
>
$ rdma system show    # should display 'netns shared'
$ ip netns add foo
$ ip netns exec foo bash
Then attempt to access the hw counters from the foo net namespace.
For RDMA devices, the following directory must not contain a "hw_counters" directory:
$ ls -l /sys/class/infiniband/<dev_name>/ports/1/
> > It appears impossible to reach flow to me as intended in the commit I listed.
> > And the call trace shows opposite.
> > So please gather the information from the production system on
> reproducing it or configuration wise.
> > We still need to block the hw counters from net ns and will have to generate
> different fix if it was reached somehow.
>
> I'll try, but it'll take a lot of time - I'm very limited in terms of what I can do
> because it's a production workload.
>
The reproduction should be straightforward if it's a basic issue.
> So if you have any suggestions on where to look at or what to try, I'll
> appreciate it.
The output of the commands below will provide some hints.
$ rdma system show
$ rdma dev show
$ ip netns show
$ mount | grep net
$ ls -l /run/netns    # or whichever directory the netns is mounted on, per the output of the previous command
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-21 4:34 ` Parav Pandit
2025-02-21 4:49 ` Roman Gushchin
@ 2025-02-21 17:43 ` Jason Gunthorpe
2025-02-22 18:34 ` Parav Pandit
1 sibling, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2025-02-21 17:43 UTC (permalink / raw)
To: Parav Pandit
Cc: Roman Gushchin, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
On Fri, Feb 21, 2025 at 04:34:25AM +0000, Parav Pandit wrote:
> I just tried reproducing now on 6.12+ kernel manually.
> It appears impossible to reach flow to me as intended in the commit
> I listed.
It looks to me like this:

  static void rdma_init_coredev(struct ib_core_device *coredev,
                                struct ib_device *dev, struct net *net)
  {
          coredev->dev.groups = dev->groups;
          ^^^^^^^^^^^^^^^^^^^^^

Copies the sysfs groups from the normal ib_dev which includes the hw_*
stuff to the per-NS device?

Everything in that groups list must use rdma_device_to_ibdev()

  int ib_setup_device_attrs(struct ib_device *ibdev)
  {
          [..]
          attr->attr.show = hw_stat_device_show;
          attr->show = show_hw_stats;
          data->group.attrs[pos] = &attr->attr.attr;
          [..]
          ibdev->groups[i] = &data->group;

Which means the sysfs reported here is in that list?

Maybe this was missed when the sysfs was shut off?
Jason
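
The pointer mix-up discussed above can be modelled in userspace. The sketch
below is only an illustration under stated assumptions: the struct layouts
are simplified stand-ins, not the real kernel definitions, and
model_device_to_ibdev(), model_buggy_cast() and demo_owner_lookup() are
hypothetical names. It shows why container_of() straight to the ib_device
is only valid when the device is embedded in it, while going through the
->owner back-reference works for a per-namespace coredev as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace re-implementation of the kernel's container_of() idea. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-ins; NOT the real kernel layouts. */
struct device {
	const char *name;
};

struct ib_device;

struct ib_core_device {
	struct device dev;
	struct ib_device *owner;	/* back-reference to the real ib_device */
};

struct ib_device {
	int hw_stats;			/* stand-in for per-device state */
	struct ib_core_device coredev;	/* embedded only for the init_net device */
};

/* Models the rdma_device_to_ibdev() approach: always follow ->owner. */
static struct ib_device *model_device_to_ibdev(struct device *device)
{
	struct ib_core_device *coredev =
		container_of(device, struct ib_core_device, dev);

	return coredev->owner;
}

/* Models the buggy pattern: assume the device lives inside an ib_device. */
static struct ib_device *model_buggy_cast(struct device *device)
{
	return container_of(device, struct ib_device, coredev.dev);
}

/* Returns 1 if the owner lookup is right in both cases and the cast is not. */
int demo_owner_lookup(void)
{
	struct ib_device real = { .hw_stats = 42 };
	/* Per-namespace coredev allocated on its own; the padding keeps the
	 * bogus pointer computed below inside this object. It is compared,
	 * never dereferenced. */
	struct {
		char pad[64];
		struct ib_core_device coredev;
	} ns = { .coredev = { .dev = { .name = "ns-dev" } } };

	real.coredev.owner = &real;
	ns.coredev.owner = &real;

	int embedded_ok = model_device_to_ibdev(&real.coredev.dev) == &real &&
			  model_buggy_cast(&real.coredev.dev) == &real;
	int ns_owner_ok = model_device_to_ibdev(&ns.coredev.dev) == &real;
	int ns_cast_bad = model_buggy_cast(&ns.coredev.dev) != &real;

	printf("embedded ok: %d, ns via owner ok: %d, ns via cast wrong: %d\n",
	       embedded_ok, ns_owner_ok, ns_cast_bad);
	return embedded_ok && ns_owner_ok && ns_cast_bad;
}
```

Calling demo_owner_lookup() from a small main() is enough to try it out. In
the kernel the bogus pointer is then dereferenced, which is consistent with
the NULL-pointer oops in the report.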
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-21 8:03 ` Parav Pandit
@ 2025-02-21 20:02 ` Roman Gushchin
2025-02-22 18:36 ` Parav Pandit
0 siblings, 1 reply; 22+ messages in thread
From: Roman Gushchin @ 2025-02-21 20:02 UTC (permalink / raw)
To: Parav Pandit
Cc: Jason Gunthorpe, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
On Fri, Feb 21, 2025 at 08:03:33AM +0000, Parav Pandit wrote:
> > From: Roman Gushchin <roman.gushchin@linux.dev>
> > Sent: Friday, February 21, 2025 10:20 AM
> >
> > On Fri, Feb 21, 2025 at 04:34:25AM +0000, Parav Pandit wrote:
> > >
> > > > From: Roman Gushchin <roman.gushchin@linux.dev>
> > > > Sent: Friday, February 21, 2025 9:56 AM
> > > >
> > > > On Fri, Feb 21, 2025 at 03:14:16AM +0000, Parav Pandit wrote:
> > > > >
> > > > > > From: Roman Gushchin <roman.gushchin@linux.dev>
> > > > > > Sent: Friday, February 21, 2025 7:36 AM
> > > > > >
> > > > > > Commit 54747231150f ("RDMA: Introduce and use
> > > > > > rdma_device_to_ibdev()") introduced rdma_device_to_ibdev()
> > > > > > helper which has to be used to obtain an ib_device pointer from a
> > device pointer.
> > > > > >
> > > > >
> > > > > > hw_stat_device_show() and hw_stat_device_store() were missed.
> > > > > >
> > > > > > It causes a NULL pointer dereference panic on an attempt to read
> > > > > > hw counters from a namespace, when the device structure is not
> > > > > > embedded into the ib_device structure.
> > > > > Do you mean net namespace other than default init_net?
> > > > > Assuming the answer is yes, some question below.
> > > > >
> > > > > > In this case casting the device pointer into the ib_device
> > > > > > pointer using container_of() is wrong.
> > > > > > > Instead, rdma_device_to_ibdev() should be used, which uses the
> > > > > > > back-reference (container_of(device, struct ib_core_device, dev))->owner.
> > > > > >
> > > > > > Fixes: 54747231150f ("RDMA: Introduce and use
> > > > > > rdma_device_to_ibdev()")
> > > > > Commit eb15c78b05bd9 eliminated hw_counters sysfs directory into
> > > > > the
> > > > net namespace.
> > > > > I don't see it created in any other net ns other than init_net
> > > > > with kernel
> > > > 6.12+.
> > > > >
> > > > > I am puzzled. Can you please explain/share the reproduction steps
> > > > > for
> > > > generating above call trace?
> > > >
> > > > Hi Parav!
> > > >
> > > > This bug was spotted in the production on a small number of
> > > > machines. They were running a 6.6-based kernel (with no changes
> > > > around this code). I don't have a reproducer (and there is no simple
> > > > way for me to reproduce the problem), but I've several core dumps
> > > > and from inspecting them it was clear that a ib_device pointer
> > > > obtained in hw_stat_device_show() was wrong. At the same time the
> > > > ib_pointer obtained in the way rdma_device_to_ibdev() works was
> > correct.
> > > >
> > > I just tried reproducing now on 6.12+ kernel manually.
> >
> > Can you, please, share your steps? Or try the 6.6 kernel?
> >
> $ rdma system show to display 'netns shared'.
> $ ip netns add foo
> $ ip netns exec foo bash
> $ attempt to access the hw counters from the foo net namespace.
Ok, it worked well. The following commands
$ ip netns add foo
$ ip netns exec foo bash
$ cat /sys/class/infiniband/mlx4_0/hw_counters/*
cause a panic on a vanilla v6.12.9 without my changes
and work perfectly fine with my patch.
Thanks!
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-21 17:43 ` Jason Gunthorpe
@ 2025-02-22 18:34 ` Parav Pandit
2025-02-24 15:11 ` Jason Gunthorpe
0 siblings, 1 reply; 22+ messages in thread
From: Parav Pandit @ 2025-02-22 18:34 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Roman Gushchin, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, February 21, 2025 11:14 PM
>
> On Fri, Feb 21, 2025 at 04:34:25AM +0000, Parav Pandit wrote:
>
> > I just tried reproducing now on 6.12+ kernel manually.
> > It appears impossible to reach flow to me as intended in the commit I
> > listed.
>
> It looks to me like this:
>
> static void rdma_init_coredev(struct ib_core_device *coredev,
> struct ib_device *dev, struct net *net) {
> coredev->dev.groups = dev->groups;
> ^^^^^^^^^^^^^^^^^^^^^
>
> Copies the sysfs groups from the normal ib_dev which includes the hw_* stuff
> to the per-NS device?
>
I dug deeper.
Yes, the source commit was not what Roman listed.
The source is commit b7066b32a14fd.
> Everything in that groups list must use rdma_device_to_ibdev()
>
> int ib_setup_device_attrs(struct ib_device *ibdev) { [..]
> attr->attr.show = hw_stat_device_show;
> attr->show = show_hw_stats;
> data->group.attrs[pos] = &attr->attr.attr; [..]
> ibdev->groups[i] = &data->group;
>
> Which means the sysfs reported here is in that list?
>
> Maybe this was misses when the sysfs was shut off?
>
The original commit eb15c78b05bd9, and others, never introduced device-level
counters in the net ns in shared mode.
Commit eb15c78b05bd9 by design didn't introduce device-global port-level and
device-level counters in a non-init net ns, as that may lead to unwanted side
talk via counters, and to avoid the RW lifespan file.
We should continue that way, so as not to take on the burden of maintaining
the counters in shared mode for non-init net.
Roman's crash report validates that device-level counters are never used in
non-init net.
So I suggest we fix that, rather than adding the proposed change, which opens
the doors.
ib_setup_device_attrs() should be merged to ib_setup_port_attrs() by renaming
ib_setup_port_attrs() to be generic.
To utilize the group initialization, ib_setup_port_attrs() needs to move up
before device_add().

Roman,
Will you please modify the fix to avoid hw_stats exposure in non-init net
using the above proposal?
Please let me know if you need my help to revise the patch.
> Jason
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-21 20:02 ` Roman Gushchin
@ 2025-02-22 18:36 ` Parav Pandit
2025-02-22 20:50 ` Roman Gushchin
0 siblings, 1 reply; 22+ messages in thread
From: Parav Pandit @ 2025-02-22 18:36 UTC (permalink / raw)
To: Roman Gushchin
Cc: Jason Gunthorpe, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
Hi Roman,
> From: Roman Gushchin <roman.gushchin@linux.dev>
> Sent: Saturday, February 22, 2025 1:33 AM
>
> On Fri, Feb 21, 2025 at 08:03:33AM +0000, Parav Pandit wrote:
> > > From: Roman Gushchin <roman.gushchin@linux.dev>
> > > Sent: Friday, February 21, 2025 10:20 AM
> > >
> > > On Fri, Feb 21, 2025 at 04:34:25AM +0000, Parav Pandit wrote:
> > > >
> > > > > From: Roman Gushchin <roman.gushchin@linux.dev>
> > > > > Sent: Friday, February 21, 2025 9:56 AM
> > > > >
> > > > > On Fri, Feb 21, 2025 at 03:14:16AM +0000, Parav Pandit wrote:
> > > > > >
> > > > > > > From: Roman Gushchin <roman.gushchin@linux.dev>
> > > > > > > Sent: Friday, February 21, 2025 7:36 AM
> > > > > > >
> > > > > > > Commit 54747231150f ("RDMA: Introduce and use
> > > > > > > rdma_device_to_ibdev()") introduced rdma_device_to_ibdev()
> > > > > > > helper which has to be used to obtain an ib_device pointer
> > > > > > > from a
> > > device pointer.
> > > > > > >
> > > > > >
> > > > > > > hw_stat_device_show() and hw_stat_device_store() were missed.
> > > > > > >
> > > > > > > It causes a NULL pointer dereference panic on an attempt to
> > > > > > > read hw counters from a namespace, when the device structure
> > > > > > > is not embedded into the ib_device structure.
> > > > > > Do you mean net namespace other than default init_net?
> > > > > > Assuming the answer is yes, some question below.
> > > > > >
> > > > > > > In this case casting the device pointer into the ib_device
> > > > > > > pointer using container_of() is wrong.
> > > > > > > > Instead, rdma_device_to_ibdev() should be used, which uses the
> > > > > > > > back-reference (container_of(device, struct ib_core_device, dev))->owner.
> > > > > > >
> > > > > > > Fixes: 54747231150f ("RDMA: Introduce and use
> > > > > > > rdma_device_to_ibdev()")
> > > > > > Commit eb15c78b05bd9 eliminated hw_counters sysfs directory
> > > > > > into the
> > > > > net namespace.
> > > > > > I don't see it created in any other net ns other than init_net
> > > > > > with kernel
> > > > > 6.12+.
> > > > > >
> > > > > > I am puzzled. Can you please explain/share the reproduction
> > > > > > steps for
> > > > > generating above call trace?
> > > > >
> > > > > Hi Parav!
> > > > >
> > > > > This bug was spotted in the production on a small number of
> > > > > machines. They were running a 6.6-based kernel (with no changes
> > > > > around this code). I don't have a reproducer (and there is no
> > > > > simple way for me to reproduce the problem), but I've several
> > > > > core dumps and from inspecting them it was clear that a
> > > > > ib_device pointer obtained in hw_stat_device_show() was wrong.
> > > > > At the same time the ib_pointer obtained in the way
> > > > > rdma_device_to_ibdev() works was
> > > correct.
> > > > >
> > > > I just tried reproducing now on 6.12+ kernel manually.
> > >
> > > Can you, please, share your steps? Or try the 6.6 kernel?
> > >
> > $ rdma system show to display 'netns shared'.
> > $ ip netns add foo
> > $ ip netns exec foo bash
> > $ attempt to access the hw counters from the foo net namespace.
>
> Ok, it worked well. The following commands
>
> $ ip netns add foo
> $ ip netns exec foo bash
> $ cat /sys/class/infiniband/mlx4_0/hw_counters/*
>
> cause a panic on a vanilla v6.12.9 without my changes and work perfectly fine
> with my patch.
>
I dug further and I see the issue. It's not the per-port counter.
It's the per-device counter, which got broken by commit 467f432a521a2.
It introduced the device counter unintentionally in non-init net.
And we need to fix that (instead of allowing it and opening more holes).
I replied with more details to Jason G in my previous reply.
Please take a look.
> Thanks!
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-22 18:36 ` Parav Pandit
@ 2025-02-22 20:50 ` Roman Gushchin
0 siblings, 0 replies; 22+ messages in thread
From: Roman Gushchin @ 2025-02-22 20:50 UTC (permalink / raw)
To: Parav Pandit
Cc: Jason Gunthorpe, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
On Sat, Feb 22, 2025 at 06:36:46PM +0000, Parav Pandit wrote:
> Hi Roman,
>
> > From: Roman Gushchin <roman.gushchin@linux.dev>
> > Sent: Saturday, February 22, 2025 1:33 AM
> >
> > On Fri, Feb 21, 2025 at 08:03:33AM +0000, Parav Pandit wrote:
> > > > From: Roman Gushchin <roman.gushchin@linux.dev>
> > > > Sent: Friday, February 21, 2025 10:20 AM
> > > >
> > > > On Fri, Feb 21, 2025 at 04:34:25AM +0000, Parav Pandit wrote:
> > > > >
> > > > > > From: Roman Gushchin <roman.gushchin@linux.dev>
> > > > > > Sent: Friday, February 21, 2025 9:56 AM
> > > > > >
> > > > > > On Fri, Feb 21, 2025 at 03:14:16AM +0000, Parav Pandit wrote:
> > > > > > >
> > > > > > > > From: Roman Gushchin <roman.gushchin@linux.dev>
> > > > > > > > Sent: Friday, February 21, 2025 7:36 AM
> > > > > > > >
> > > > > > > > Commit 54747231150f ("RDMA: Introduce and use
> > > > > > > > rdma_device_to_ibdev()") introduced rdma_device_to_ibdev()
> > > > > > > > helper which has to be used to obtain an ib_device pointer
> > > > > > > > from a
> > > > device pointer.
> > > > > > > >
> > > > > > >
> > > > > > > > hw_stat_device_show() and hw_stat_device_store() were missed.
> > > > > > > >
> > > > > > > > It causes a NULL pointer dereference panic on an attempt to
> > > > > > > > read hw counters from a namespace, when the device structure
> > > > > > > > is not embedded into the ib_device structure.
> > > > > > > Do you mean net namespace other than default init_net?
> > > > > > > Assuming the answer is yes, some question below.
> > > > > > >
> > > > > > > > In this case casting the device pointer into the ib_device
> > > > > > > > pointer using container_of() is wrong.
> > > > > > > > > Instead, rdma_device_to_ibdev() should be used, which uses the
> > > > > > > > > back-reference (container_of(device, struct ib_core_device, dev))->owner.
> > > > > > > >
> > > > > > > > Fixes: 54747231150f ("RDMA: Introduce and use
> > > > > > > > rdma_device_to_ibdev()")
> > > > > > > Commit eb15c78b05bd9 eliminated hw_counters sysfs directory
> > > > > > > into the
> > > > > > net namespace.
> > > > > > > I don't see it created in any other net ns other than init_net
> > > > > > > with kernel
> > > > > > 6.12+.
> > > > > > >
> > > > > > > I am puzzled. Can you please explain/share the reproduction
> > > > > > > steps for
> > > > > > generating above call trace?
> > > > > >
> > > > > > Hi Parav!
> > > > > >
> > > > > > This bug was spotted in the production on a small number of
> > > > > > machines. They were running a 6.6-based kernel (with no changes
> > > > > > around this code). I don't have a reproducer (and there is no
> > > > > > simple way for me to reproduce the problem), but I've several
> > > > > > core dumps and from inspecting them it was clear that a
> > > > > > ib_device pointer obtained in hw_stat_device_show() was wrong.
> > > > > > At the same time the ib_pointer obtained in the way
> > > > > > rdma_device_to_ibdev() works was
> > > > correct.
> > > > > >
> > > > > I just tried reproducing now on 6.12+ kernel manually.
> > > >
> > > > Can you, please, share your steps? Or try the 6.6 kernel?
> > > >
> > > $ rdma system show to display 'netns shared'.
> > > $ ip netns add foo
> > > $ ip netns exec foo bash
> > > $ attempt to access the hw counters from the foo net namespace.
> >
> > Ok, it worked well. The following commands
> >
> > $ ip netns add foo
> > $ ip netns exec foo bash
> > $ cat /sys/class/infiniband/mlx4_0/hw_counters/*
> >
> > cause a panic on a vanilla v6.12.9 without my changes and work perfectly fine
> > with my patch.
> >
> I dig further and I see the issue. Its not the per port counter.
> It's the per device counter which got broken by commit 467f432a521a2.
> It introduced the device counter unintentionally in non init net.
> And we need to fix that (instead of allowing it and opening more holes).
> I replied with more details to Jason G in previous reply.
> Please take a look.
I can prepare a patch like this, but I'm slightly worried about hiding
previously exposed counters. I don't think it's a problem for us (even though
I'm not 100% sure), but technically it can be seen as breaking the API.
How about merging 2 patches: my original patch which fixes the memory corruption
and a separate patch which hides those counters from non-init namespace?
The first can be safely propagated towards stable trees.
I'll prepare the second patch soon.
Thanks!
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-22 18:34 ` Parav Pandit
@ 2025-02-24 15:11 ` Jason Gunthorpe
2025-02-24 15:16 ` Parav Pandit
0 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2025-02-24 15:11 UTC (permalink / raw)
To: Parav Pandit
Cc: Roman Gushchin, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
On Sat, Feb 22, 2025 at 06:34:21PM +0000, Parav Pandit wrote:
> ib_setup_device_attrs() should be merged to ib_setup_port_attrs() by
> renaming ib_setup_port_attrs() to be generic. To utilize the group
> initialization ib_setup_port_attrs() needs to move up before
> device_add().
It needs more than that: somehow you have to maintain two groups lists,
or somehow remove the coredev->dev.groups assignment.
Jason
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-24 15:11 ` Jason Gunthorpe
@ 2025-02-24 15:16 ` Parav Pandit
2025-02-24 23:22 ` Roman Gushchin
2025-02-24 23:31 ` Jason Gunthorpe
0 siblings, 2 replies; 22+ messages in thread
From: Parav Pandit @ 2025-02-24 15:16 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Roman Gushchin, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Monday, February 24, 2025 8:41 PM
>
> On Sat, Feb 22, 2025 at 06:34:21PM +0000, Parav Pandit wrote:
> > ib_setup_device_attrs() should be merged to ib_setup_port_attrs() by
> > renaming ib_setup_port_attrs() to be generic. To utilize the group
> > initialization ib_setup_port_attrs() needs to move up before
> > device_add().
>
> It needs more than that, somehow you have to maintain two groups list or
> somehow remove the coredev->dev.groups assignment..
>
I was thinking that if both device and port attr setup are done in the same
function, the knowledge of is_full_dev is available there and can be used for
device-level hw_stats setup (similar to how it's done at the port level).
> Jason
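
The "two groups lists" idea from this exchange can be sketched in userspace
as follows. This is a rough model under stated assumptions, not the eventual
kernel change: struct attribute_group is reduced to a name, and
model_setup_groups() and model_has_group() are hypothetical helpers. Only the
is_full_dev flag comes from the discussion. The point is that the hw_counters
group is added only when the groups list is built for the full (init_net)
device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_GROUPS 4	/* room for the groups below plus the NULL terminator */

/* Reduced stand-in for the kernel's struct attribute_group. */
struct attribute_group {
	const char *name;	/* NULL means attributes in the device directory */
};

static const struct attribute_group dev_attr_group = { .name = NULL };
static const struct attribute_group hw_stats_group = { .name = "hw_counters" };

/*
 * Build a NULL-terminated groups list. The device-level hw_stats group is
 * added only for the full device, so a per-namespace coredev never sees it.
 * Returns the number of groups stored (terminator excluded).
 */
size_t model_setup_groups(const struct attribute_group **groups,
			  bool is_full_dev)
{
	size_t n = 0;

	groups[n++] = &dev_attr_group;
	if (is_full_dev)
		groups[n++] = &hw_stats_group;
	groups[n] = NULL;
	return n;
}

/* Check whether a named group is present in a NULL-terminated list. */
bool model_has_group(const struct attribute_group *const *groups,
		     const char *name)
{
	for (; *groups; groups++)
		if ((*groups)->name && strcmp((*groups)->name, name) == 0)
			return true;
	return false;
}
```

With two arrays of MAX_GROUPS entries, one filled with is_full_dev set and
one without, only the first ends up containing "hw_counters". Whether the
real code keeps two lists or filters a single one is exactly the open
question in this thread.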
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-24 15:16 ` Parav Pandit
@ 2025-02-24 23:22 ` Roman Gushchin
2025-02-24 23:30 ` Jason Gunthorpe
2025-02-24 23:31 ` Jason Gunthorpe
1 sibling, 1 reply; 22+ messages in thread
From: Roman Gushchin @ 2025-02-24 23:22 UTC (permalink / raw)
To: Parav Pandit
Cc: Jason Gunthorpe, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
On Mon, Feb 24, 2025 at 03:16:46PM +0000, Parav Pandit wrote:
>
>
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Monday, February 24, 2025 8:41 PM
> >
> > On Sat, Feb 22, 2025 at 06:34:21PM +0000, Parav Pandit wrote:
> > > ib_setup_device_attrs() should be merged to ib_setup_port_attrs() by
> > > renaming ib_setup_port_attrs() to be generic. To utilize the group
> > > initialization ib_setup_port_attrs() needs to move up before
> > > device_add().
> >
> > It needs more than that, somehow you have to maintain two groups list or
> > somehow remove the coredev->dev.groups assignment..
> >
> I was thinking that if both device and port attr setup is done in same function, there is knowledge of is_full_dev that can be used for device level hw_stats setup. (similar to how its done at port level).
Given that there is a bit of discussion on how to move forward with this,
can we please merge the trivial fix in the meantime? (Just sent out v2 with
the fixed commit log).
Thanks!
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-24 23:22 ` Roman Gushchin
@ 2025-02-24 23:30 ` Jason Gunthorpe
2025-02-25 3:42 ` Roman Gushchin
0 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2025-02-24 23:30 UTC (permalink / raw)
To: Roman Gushchin
Cc: Parav Pandit, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
On Mon, Feb 24, 2025 at 11:22:29PM +0000, Roman Gushchin wrote:
> On Mon, Feb 24, 2025 at 03:16:46PM +0000, Parav Pandit wrote:
> >
> >
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Monday, February 24, 2025 8:41 PM
> > >
> > > On Sat, Feb 22, 2025 at 06:34:21PM +0000, Parav Pandit wrote:
> > > > ib_setup_device_attrs() should be merged to ib_setup_port_attrs() by
> > > > renaming ib_setup_port_attrs() to be generic. To utilize the group
> > > > initialization ib_setup_port_attrs() needs to move up before
> > > > device_add().
> > >
> > > It needs more than that, somehow you have to maintain two groups list or
> > > somehow remove the coredev->dev.groups assignment..
> > >
> > I was thinking that if both device and port attr setup is done in
> > same function, there is knowledge of is_full_dev that can be used
> > for device level hw_stats setup. (similar to how its done at port
> > level).
>
> Given that there is a bit of discussion on how to move forward with this,
> can we please merge the trivial fix in the mean time? (Just sent out v2 with
> the fixed commit log).
Well, the issue now is the ABI break.
If the right answer is to remove the sysfs entirely then it doesn't
make sense to make it work in the stable and LTS kernels since that
would create users. Currently it is fully broken so there are no
users. Can we say that with the same certainty after it is fixed?
Jason
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-24 15:16 ` Parav Pandit
2025-02-24 23:22 ` Roman Gushchin
@ 2025-02-24 23:31 ` Jason Gunthorpe
2025-02-25 10:38 ` Parav Pandit
1 sibling, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2025-02-24 23:31 UTC (permalink / raw)
To: Parav Pandit
Cc: Roman Gushchin, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
On Mon, Feb 24, 2025 at 03:16:46PM +0000, Parav Pandit wrote:
>
>
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Monday, February 24, 2025 8:41 PM
> >
> > On Sat, Feb 22, 2025 at 06:34:21PM +0000, Parav Pandit wrote:
> > > ib_setup_device_attrs() should be merged to ib_setup_port_attrs() by
> > > renaming ib_setup_port_attrs() to be generic. To utilize the group
> > > initialization ib_setup_port_attrs() needs to move up before
> > > device_add().
> >
> > It needs more than that, somehow you have to maintain two groups list or
> > somehow remove the coredev->dev.groups assignment..
> >
> I was thinking that if both device and port attr setup is done in
> same function, there is knowledge of is_full_dev that can be used
> for device level hw_stats setup. (similar to how its done at port
> level).
Again, the issue is the group list: so long as we are setting up
attributes through the copied group list, the whole thing doesn't work.
The group list is used to avoid startup races with udev, so this is a
bit complex.
Jason
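The lookup problem behind the copied group list can be sketched in user space. This is a simplified model, not the kernel code: the struct layouts are assumptions (the real ib_device is laid out differently), and to_ibdev() and demo_lookup() are hypothetical names. to_ibdev() only mirrors the (container_of(device, struct ib_core_device, dev))->owner expression quoted in the patch description.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct device { int id; };		/* stand-in for struct device */

struct ib_device;

struct ib_core_device {
	struct device dev;
	struct ib_device *owner;	/* back-reference to the real ib_device */
};

struct ib_device {
	struct ib_core_device coredev;	/* only the full device embeds this */
	const char *name;
};

/* Hypothetical helper modelling the owner-based lookup. */
struct ib_device *to_ibdev(struct device *dev)
{
	return container_of(dev, struct ib_core_device, dev)->owner;
}

int demo_lookup(void)
{
	struct ib_device ibdev = { .name = "model0" };
	struct ib_core_device compat = { .owner = &ibdev };	/* not embedded */
	void *wrong;

	ibdev.coredev.owner = &ibdev;

	/* The owner lookup works for the full device and for the compat one. */
	assert(to_ibdev(&ibdev.coredev.dev) == &ibdev);
	assert(to_ibdev(&compat.dev) == &ibdev);

	/* Treating the compat device as if it were embedded in an ib_device
	 * yields a pointer that is not the real ib_device. */
	wrong = container_of(&compat, struct ib_device, coredev);
	assert(wrong != (void *)&ibdev);
	return 0;
}
```

In this model any attribute callback that reaches a compat device has to go through the owner pointer; a plain cast only happens to work for the full device.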
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-24 23:30 ` Jason Gunthorpe
@ 2025-02-25 3:42 ` Roman Gushchin
2025-02-25 4:34 ` Parav Pandit
2025-02-25 13:16 ` Jason Gunthorpe
0 siblings, 2 replies; 22+ messages in thread
From: Roman Gushchin @ 2025-02-25 3:42 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Parav Pandit, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
On Mon, Feb 24, 2025 at 07:30:04PM -0400, Jason Gunthorpe wrote:
> On Mon, Feb 24, 2025 at 11:22:29PM +0000, Roman Gushchin wrote:
> > On Mon, Feb 24, 2025 at 03:16:46PM +0000, Parav Pandit wrote:
> > >
> > >
> > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > Sent: Monday, February 24, 2025 8:41 PM
> > > >
> > > > On Sat, Feb 22, 2025 at 06:34:21PM +0000, Parav Pandit wrote:
> > > > > ib_setup_device_attrs() should be merged to ib_setup_port_attrs() by
> > > > > renaming ib_setup_port_attrs() to be generic. To utilize the group
> > > > > initialization ib_setup_port_attrs() needs to move up before
> > > > > device_add().
> > > >
> > > > It needs more than that, somehow you have to maintain two groups list or
> > > > somehow remove the coredev->dev.groups assignment..
> > > >
> > > I was thinking that if both device and port attr setup is done in
> > > same function, there is knowledge of is_full_dev that can be used
> > > for device level hw_stats setup. (similar to how its done at port
> > > level).
> >
> > Given that there is a bit of discussion on how to move forward with this,
> > can we please merge the trivial fix in the mean time? (Just sent out v2 with
> > the fixed commit log).
>
> Well, the issue now is the ABI break
>
> If the right answer is to remove the sysfs entirely then it doesn't
> make sense to make it work in the stable and LTS kernels since that
> would create users. Currently it is fully broken so there are no
> users. Can we say that so certainly after it is fixed?
It's a good point.
Ok, then we need something like this (obviously, coded more nicely):
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 0ded91f056f3..6998907fc779 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -956,6 +956,7 @@ static int add_one_compat_dev(struct ib_device *device,
ret = device_add(&cdev->dev);
if (ret)
goto add_err;
+ device->groups[2] = NULL;
ret = ib_setup_port_attrs(cdev);
if (ret)
goto port_err;
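For readers following the groups[2] idea, here is a user-space sketch of what truncating a shared, NULL-terminated group list does, next to a separate list built for compat devices. It is only a model under assumptions: the struct is a stand-in, the slot order (hw_counters at index 2) is taken from the diff above rather than verified, and count_groups(), copy_groups_without() and demo_groups() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

struct attribute_group { const char *name; };	/* stand-in type */

#define MAX_GROUPS 4

/* Walk a NULL-terminated list the way a consumer of the list would. */
size_t count_groups(const struct attribute_group *const *groups)
{
	size_t n = 0;

	while (groups[n])
		n++;
	return n;
}

/* Build a second list that omits src[skip]; dst must hold MAX_GROUPS slots. */
void copy_groups_without(const struct attribute_group **dst,
			 const struct attribute_group *const *src,
			 size_t skip)
{
	size_t i, n = 0;

	for (i = 0; src[i] && n < MAX_GROUPS - 1; i++)
		if (i != skip)
			dst[n++] = src[i];
	dst[n] = NULL;
}

int demo_groups(void)
{
	static const struct attribute_group dev_attrs = { "dev_attrs" };
	static const struct attribute_group driver = { "driver" };
	static const struct attribute_group hw_stats = { "hw_counters" };
	const struct attribute_group *full[MAX_GROUPS] = {
		&dev_attrs, &driver, &hw_stats, NULL
	};
	const struct attribute_group *compat[MAX_GROUPS];

	/* Two lists: the full device keeps all groups, compat omits slot 2. */
	copy_groups_without(compat, full, 2);
	assert(count_groups(full) == 3);
	assert(count_groups(compat) == 2);

	/* In-place truncation changes what every later walker of the
	 * shared list sees, including the owner of the list. */
	full[2] = NULL;
	assert(count_groups(full) == 2);
	return 0;
}
```

The in-place variant is a single store but affects anyone who walks the shared list afterwards; the second list costs a copy and leaves the original untouched.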
^ permalink raw reply related [flat|nested] 22+ messages in thread
* RE: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-25 3:42 ` Roman Gushchin
@ 2025-02-25 4:34 ` Parav Pandit
2025-02-26 3:41 ` Roman Gushchin
2025-02-25 13:16 ` Jason Gunthorpe
1 sibling, 1 reply; 22+ messages in thread
From: Parav Pandit @ 2025-02-25 4:34 UTC (permalink / raw)
To: Roman Gushchin, Jason Gunthorpe
Cc: Leon Romanovsky, Maher Sanalla, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org
> From: Roman Gushchin <roman.gushchin@linux.dev>
> Sent: Tuesday, February 25, 2025 9:12 AM
>
> On Mon, Feb 24, 2025 at 07:30:04PM -0400, Jason Gunthorpe wrote:
> > On Mon, Feb 24, 2025 at 11:22:29PM +0000, Roman Gushchin wrote:
> > > On Mon, Feb 24, 2025 at 03:16:46PM +0000, Parav Pandit wrote:
> > > >
> > > >
> > > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > > Sent: Monday, February 24, 2025 8:41 PM
> > > > >
> > > > > On Sat, Feb 22, 2025 at 06:34:21PM +0000, Parav Pandit wrote:
> > > > > > ib_setup_device_attrs() should be merged to
> > > > > > ib_setup_port_attrs() by renaming ib_setup_port_attrs() to be
> > > > > > generic. To utilize the group initialization
> > > > > > ib_setup_port_attrs() needs to move up before device_add().
> > > > >
> > > > > It needs more than that, somehow you have to maintain two groups
> > > > > list or somehow remove the coredev->dev.groups assignment..
> > > > >
> > > > I was thinking that if both device and port attr setup is done in
> > > > same function, there is knowledge of is_full_dev that can be used
> > > > for device level hw_stats setup. (similar to how its done at port
> > > > level).
> > >
> > > Given that there is a bit of discussion on how to move forward with
> > > this, can we please merge the trivial fix in the mean time? (Just
> > > sent out v2 with the fixed commit log).
> >
> > Well, the issue now is the ABI break
> >
> > If the right answer is to remove the sysfs entirely then it doesn't
> > make sense to make it work in the stable and LTS kernels since that
> > would create users. Currently it is fully broken so there are no
> > users. Can we say that so certainly after it is fixed?
>
> It's a good point.
>
> Ok, then we need something like this (obviously, coded more nicely):
>
Yes, this is my suggestion too, in a little more detail at [1].
[1] https://lore.kernel.org/linux-rdma/20250224233109.GE520155@nvidia.com/T/#m43a5974cad17566080eeb64c6d5327aad4f0a852
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index 0ded91f056f3..6998907fc779 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -956,6 +956,7 @@ static int add_one_compat_dev(struct ib_device *device,
> ret = device_add(&cdev->dev);
> if (ret)
> goto add_err;
> + device->groups[2] = NULL;
> ret = ib_setup_port_attrs(cdev);
> if (ret)
> goto port_err;
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-24 23:31 ` Jason Gunthorpe
@ 2025-02-25 10:38 ` Parav Pandit
0 siblings, 0 replies; 22+ messages in thread
From: Parav Pandit @ 2025-02-25 10:38 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Roman Gushchin, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, February 25, 2025 5:01 AM
>
> On Mon, Feb 24, 2025 at 03:16:46PM +0000, Parav Pandit wrote:
> >
> >
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Monday, February 24, 2025 8:41 PM
> > >
> > > On Sat, Feb 22, 2025 at 06:34:21PM +0000, Parav Pandit wrote:
> > > > ib_setup_device_attrs() should be merged to ib_setup_port_attrs()
> > > > by renaming ib_setup_port_attrs() to be generic. To utilize the
> > > > group initialization ib_setup_port_attrs() needs to move up before
> > > > device_add().
> > >
> > > It needs more than that, somehow you have to maintain two groups
> > > list or somehow remove the coredev->dev.groups assignment..
> > >
> > I was thinking that if both device and port attr setup is done in same
> > function, there is knowledge of is_full_dev that can be used for
> > device level hw_stats setup. (similar to how its done at port level).
>
> Again the issue is the group list, so long as we are setting up attributes
> through the copied group list the whole thing doesn't work.
>
> The group list is used to avoid startup races with udev, so this is a bit complex
>
Yes, I was suggesting initializing the group early, as is done today, if we can also move the rest of the port sysfs setup to the same place.
If that is too complex, we need to write the ugly code that stores the group index for the blind copy of the groups in the compat devices.
> Jason
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-25 3:42 ` Roman Gushchin
2025-02-25 4:34 ` Parav Pandit
@ 2025-02-25 13:16 ` Jason Gunthorpe
2025-02-26 3:37 ` Roman Gushchin
1 sibling, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2025-02-25 13:16 UTC (permalink / raw)
To: Roman Gushchin
Cc: Parav Pandit, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
On Tue, Feb 25, 2025 at 03:42:28AM +0000, Roman Gushchin wrote:
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index 0ded91f056f3..6998907fc779 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -956,6 +956,7 @@ static int add_one_compat_dev(struct ib_device *device,
> ret = device_add(&cdev->dev);
> if (ret)
> goto add_err;
> + device->groups[2] = NULL;
> ret = ib_setup_port_attrs(cdev);
> if (ret)
> goto port_err;
That's horrible - but OK, maybe something like that..
Does it work? Or does the driver core need groups after the initial
setup?
Could we have two group lists and link them together? IIRC there was a
way to do that without creating a subdirectory.
Jason
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-25 13:16 ` Jason Gunthorpe
@ 2025-02-26 3:37 ` Roman Gushchin
0 siblings, 0 replies; 22+ messages in thread
From: Roman Gushchin @ 2025-02-26 3:37 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Parav Pandit, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
On Tue, Feb 25, 2025 at 09:16:18AM -0400, Jason Gunthorpe wrote:
> On Tue, Feb 25, 2025 at 03:42:28AM +0000, Roman Gushchin wrote:
>
> > diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> > index 0ded91f056f3..6998907fc779 100644
> > --- a/drivers/infiniband/core/device.c
> > +++ b/drivers/infiniband/core/device.c
> > @@ -956,6 +956,7 @@ static int add_one_compat_dev(struct ib_device *device,
> > ret = device_add(&cdev->dev);
> > if (ret)
> > goto add_err;
> > + device->groups[2] = NULL;
> > ret = ib_setup_port_attrs(cdev);
> > if (ret)
> > goto port_err;
>
> That's horrible - but OK, maybe something like that..
>
> Does it work? Or does the driver core need groups after the initial
> setup?
>
> Could we have two group lists and link them together? IIRC there was a
> way to do that without creating a sub directory
It does work.
I just sent a decent implementation of this idea; please take a look.
Thank you!
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show()
2025-02-25 4:34 ` Parav Pandit
@ 2025-02-26 3:41 ` Roman Gushchin
0 siblings, 0 replies; 22+ messages in thread
From: Roman Gushchin @ 2025-02-26 3:41 UTC (permalink / raw)
To: Parav Pandit
Cc: Jason Gunthorpe, Leon Romanovsky, Maher Sanalla,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
On Tue, Feb 25, 2025 at 04:34:11AM +0000, Parav Pandit wrote:
>
>
> > From: Roman Gushchin <roman.gushchin@linux.dev>
> > Sent: Tuesday, February 25, 2025 9:12 AM
> >
> > On Mon, Feb 24, 2025 at 07:30:04PM -0400, Jason Gunthorpe wrote:
> > > On Mon, Feb 24, 2025 at 11:22:29PM +0000, Roman Gushchin wrote:
> > > > On Mon, Feb 24, 2025 at 03:16:46PM +0000, Parav Pandit wrote:
> > > > >
> > > > >
> > > > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > > > Sent: Monday, February 24, 2025 8:41 PM
> > > > > >
> > > > > > On Sat, Feb 22, 2025 at 06:34:21PM +0000, Parav Pandit wrote:
> > > > > > > ib_setup_device_attrs() should be merged to
> > > > > > > ib_setup_port_attrs() by renaming ib_setup_port_attrs() to be
> > > > > > > generic. To utilize the group initialization
> > > > > > > ib_setup_port_attrs() needs to move up before device_add().
> > > > > >
> > > > > > It needs more than that, somehow you have to maintain two groups
> > > > > > list or somehow remove the coredev->dev.groups assignment..
> > > > > >
> > > > > I was thinking that if both device and port attr setup is done in
> > > > > same function, there is knowledge of is_full_dev that can be used
> > > > > for device level hw_stats setup. (similar to how its done at port
> > > > > level).
> > > >
> > > > Given that there is a bit of discussion on how to move forward with
> > > > this, can we please merge the trivial fix in the mean time? (Just
> > > > sent out v2 with the fixed commit log).
> > >
> > > Well, the issue now is the ABI break
> > >
> > > If the right answer is to remove the sysfs entirely then it doesn't
> > > make sense to make it work in the stable and LTS kernels since that
> > > would create users. Currently it is fully broken so there are no
> > > users. Can we say that so certainly after it is fixed?
> >
> > It's a good point.
> >
> > Ok, then we need something like this (obviously, coded more nicely):
> >
> Yes, this is my suggestion too in little more detail at [1].
>
> [1] https://lore.kernel.org/linux-rdma/20250224233109.GE520155@nvidia.com/T/#m43a5974cad17566080eeb64c6d5327aad4f0a852
I tried to implement what you suggested, but apparently it's complicated.
You can't call ib_setup_port_attrs() before device_add() because
kobject_init_and_add() fails. Maybe an option is to add "hw_counters"
dynamically, similar to ports; I don't know.
Anyway, I sent a slightly different version; please take a look.
Thanks!
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2025-02-26 3:41 UTC | newest]
Thread overview: 22+ messages
2025-02-21 2:05 [PATCH] RDMA/core: fix a NULL-pointer dereference in hw_stat_device_show() Roman Gushchin
2025-02-21 3:14 ` Parav Pandit
2025-02-21 4:25 ` Roman Gushchin
2025-02-21 4:34 ` Parav Pandit
2025-02-21 4:49 ` Roman Gushchin
2025-02-21 8:03 ` Parav Pandit
2025-02-21 20:02 ` Roman Gushchin
2025-02-22 18:36 ` Parav Pandit
2025-02-22 20:50 ` Roman Gushchin
2025-02-21 17:43 ` Jason Gunthorpe
2025-02-22 18:34 ` Parav Pandit
2025-02-24 15:11 ` Jason Gunthorpe
2025-02-24 15:16 ` Parav Pandit
2025-02-24 23:22 ` Roman Gushchin
2025-02-24 23:30 ` Jason Gunthorpe
2025-02-25 3:42 ` Roman Gushchin
2025-02-25 4:34 ` Parav Pandit
2025-02-26 3:41 ` Roman Gushchin
2025-02-25 13:16 ` Jason Gunthorpe
2025-02-26 3:37 ` Roman Gushchin
2025-02-24 23:31 ` Jason Gunthorpe
2025-02-25 10:38 ` Parav Pandit