* [PATCH] rdma: infiniband: Added __alloc_cq request value Return value non-zero value determination
@ 2025-04-07 9:33 luoqing
2025-04-07 16:25 ` Jason Gunthorpe
0 siblings, 1 reply; 4+ messages in thread
From: luoqing @ 2025-04-07 9:33 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: luoqing, Leon Romanovsky, linux-rdma, linux-kernel
From: luoqing <luoqing@kylinos.cn>
When the kernel allocates memory for completion queue object ib_cq on the specified
InfiniBand device dev and ensures that the allocated memory is cleared to zero,
if the ib_cq object is not initialized to 0, a non-null value is still returned,
and the kernel should exit and give a warning.
Avoid kernel crash when this memory is initialized.
ib_mad_init_device
-->ib_mad_port_open
-->__ib_alloc_cq
-->rdma_zalloc_drv_obj(dev, ib_cq);
#8 [ffff80211b3c7430] do_mem_abort at ffff4bedae5912c4
#9 [ffff80211b3c7610] el1_ia at ffff4bedae592f8c
PC: ffff4bed866f5aac [__ib_alloc_cq+100]
LR: ffff4bed866f5a98 [__ib_alloc_cq+80]
SP: ffff80211b3c7620 PSTATE: 60400009
X29: ffff80211b3c7620 X28: ffff4bedae5c7a70 X27: 0000000000000000
X26: ffff4bed86737680 X25: ffff8020b62f4000 X24: 0000000000000002
X23: 0000000000000280 X22: ffff4bedaf8a4d28 X21: ffff8020ca8d0000
X20: 0000000000000000 X19: 0000000000000010 X18: ffff80211b3c7410
X17: 00000000172acefd X16: ffff4bedae8603e8 X15: 00000000b19d2ea3
X14: 000000000950e09b X13: 000000009bd4e304 X12: 00000000f81e149c
X11: 0000000096b29e56 X10: 0000000000000f70 X9: ffff80211b3c7360
X8: ffff80211b4e9350 X7: 0000000000000000 X6: ffff4bedaf2d08f0
X5: ffff4bed86737680 X4: 0000000000000002 X3: 0000000000000000
X2: 0000000000000280 X1: 00000000006080c0 X0: 0000000000000010
#10 [ffff80211b3c7620] __ib_alloc_cq at ffff4bed866f5aa8 [ib_core]
#11 [ffff80211b3c7690] ib_mad_port_open at ffff4bed86711338 [ib_core]
#12 [ffff80211b3c7710] ib_mad_init_device at ffff4bed867118d0 [ib_core]
#13 [ffff80211b3c7760] add_client_context at ffff4bed866fca40 [ib_core]
#14 [ffff80211b3c77a0] enable_device_and_get at ffff4bed866fcb90 [ib_core]
#15 [ffff80211b3c77f0] ib_register_device at ffff4bed866fd750 [ib_core]
#16 [ffff80211b3c78b0] irdma_ib_register_device at ffff4bed81ea3d20 [irdma]
#17 [ffff80211b3c7920] irdma_probe at ffff4bed81e7130c [irdma]
When the size of ib_cq is zero, the returned cq is ZERO_SIZE_PTR ((void *)16), which is not NULL:
cq = rdma_zalloc_drv_obj(dev, ib_cq);
Signed-off-by: luoqing <luoqing@kylinos.cn>
---
drivers/infiniband/core/cq.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
index a70876a0a231..90ea9fc99fb7 100644
--- a/drivers/infiniband/core/cq.c
+++ b/drivers/infiniband/core/cq.c
@@ -221,7 +221,7 @@ struct ib_cq *__ib_alloc_cq(struct ib_device *dev, void *private, int nr_cqe,
 	int ret = -ENOMEM;
 
 	cq = rdma_zalloc_drv_obj(dev, ib_cq);
-	if (!cq)
+	if (unlikely(ZERO_OR_NULL_PTR(cq)))
 		return ERR_PTR(ret);
 
 	cq->device = dev;
--
2.27.0
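
The ZERO_SIZE_PTR behaviour the commit message refers to can be sketched in user space. This is only a model under stated assumptions: model_kzalloc(), passes_null_check() and passes_zero_or_null_check() are hypothetical names invented for illustration, and calloc() stands in for the kernel allocator. The two macros use the same values as the kernel's slab.h definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Same values as the kernel definitions in include/linux/slab.h. */
#define ZERO_SIZE_PTR ((void *)16)
#define ZERO_OR_NULL_PTR(x) \
	((unsigned long)(x) <= (unsigned long)ZERO_SIZE_PTR)

/*
 * Hypothetical user-space stand-in for kzalloc(): a zero-size request
 * yields ZERO_SIZE_PTR rather than NULL, as the kernel allocator does.
 */
static void *model_kzalloc(size_t size)
{
	if (size == 0)
		return ZERO_SIZE_PTR;
	return calloc(1, size);
}

/* The existing "if (!cq)" test: only catches allocation failure. */
static bool passes_null_check(const void *p)
{
	return p != NULL;
}

/* The test proposed by the patch: also rejects ZERO_SIZE_PTR. */
static bool passes_zero_or_null_check(const void *p)
{
	return !ZERO_OR_NULL_PTR(p);
}
```

In this model a zero-size allocation passes the plain NULL test but fails the ZERO_OR_NULL_PTR test, which matches the 0x10 value visible in X0 and X19 in the trace above.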
* Re: [PATCH] rdma: infiniband: Added __alloc_cq request value Return value non-zero value determination
2025-04-07 9:33 [PATCH] rdma: infiniband: Added __alloc_cq request value Return value non-zero value determination luoqing
@ 2025-04-07 16:25 ` Jason Gunthorpe
[not found] ` <7afc834e.5498.1965c20f9f0.Coremail.l1138897701@163.com>
0 siblings, 1 reply; 4+ messages in thread
From: Jason Gunthorpe @ 2025-04-07 16:25 UTC (permalink / raw)
To: luoqing; +Cc: luoqing, Leon Romanovsky, linux-rdma, linux-kernel
On Mon, Apr 07, 2025 at 05:33:41PM +0800, luoqing wrote:
> From: luoqing <luoqing@kylinos.cn>
>
> When the kernel allocates memory for completion queue object ib_cq on the specified
> InfiniBand device dev and ensures that the allocated memory is cleared to zero,
> if the ib_cq object is not initialized to 0, a non-null value is still returned,
> and the kernel should exit and give a warning.
> Avoid kernel crash when this memory is initialized.
?? This doesn't make any sense.
> ib_mad_init_device
> -->ib_mad_port_open
> -->__ib_alloc_cq
> -->rdma_zalloc_drv_obj(dev, ib_cq);
rdma_zalloc_drv_obj() must return memory that is validly castable to
the struct ib_cq.
> When the size of ib_cq is zero, the returned cq is ZERO_SIZE_PTR ((void *)16), which is not NULL:
> cq = rdma_zalloc_drv_obj(dev, ib_cq);
It looks to me like the driver returned the wrong size for the ib_cq
in the ops->size_ib_cq. It is not allowed to be 0 if the driver is
supporting cq.
Arguably we should check that the size_* values meet the required
minimum size when registering the driver.
Allocation time is too late.
Jason
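
A registration-time check of the kind suggested above might look roughly like the following. This is a self-contained sketch, not kernel code: struct model_ib_cq, struct model_device_ops, model_create_cq() and model_check_cq_size() are hypothetical simplifications invented here, and only the size_ib_cq field name is taken from the thread.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for the core's CQ object (hypothetical). */
struct model_ib_cq {
	void *device;
	int cqe;
};

/* Simplified stand-in for the driver's ops table (hypothetical). */
struct model_device_ops {
	/* Non-NULL when the driver supports completion queues. */
	int (*create_cq)(struct model_ib_cq *cq);
	/* Size the core should allocate for the driver's CQ object. */
	size_t size_ib_cq;
};

/* Dummy driver callback, present only so the example is complete. */
static int model_create_cq(struct model_ib_cq *cq)
{
	(void)cq;
	return 0;
}

/*
 * Hypothetical validation run once at registration: a driver that
 * supports CQs must declare at least sizeof(struct model_ib_cq).
 */
static int model_check_cq_size(const struct model_device_ops *ops)
{
	if (!ops->create_cq)
		return 0;	/* no CQ support, nothing to validate */
	if (ops->size_ib_cq < sizeof(struct model_ib_cq))
		return -EINVAL;	/* reject the driver before any allocation */
	return 0;
}
```

With a check like this, a driver that leaves size_ib_cq at 0 would be refused at registration instead of faulting later in the allocation path.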
* Re: Re: [PATCH] rdma: infiniband: Added __alloc_cq request value Return value non-zero value determination
[not found] ` <7afc834e.5498.1965c20f9f0.Coremail.l1138897701@163.com>
@ 2025-04-22 11:58 ` Leon Romanovsky
[not found] ` <2ac5915f.14aa.196604a64b6.Coremail.l1138897701@163.com>
0 siblings, 1 reply; 4+ messages in thread
From: Leon Romanovsky @ 2025-04-22 11:58 UTC (permalink / raw)
To: l1138897701; +Cc: Jason Gunthorpe, luoqing, linux-rdma, linux-kernel
On Tue, Apr 22, 2025 at 02:13:07PM +0800, l1138897701 wrote:
> In fact, this problem occurs because when the out-of-tree driver is compiled and installed,
So let's fix your out-of-tree driver.
Thanks
* Re: Re: Re: [PATCH] rdma: infiniband: Added __alloc_cq request value Return value non-zero value determination
[not found] ` <2ac5915f.14aa.196604a64b6.Coremail.l1138897701@163.com>
@ 2025-04-23 7:14 ` Leon Romanovsky
0 siblings, 0 replies; 4+ messages in thread
From: Leon Romanovsky @ 2025-04-23 7:14 UTC (permalink / raw)
To: l1138897701; +Cc: Jason Gunthorpe, luoqing, linux-rdma, linux-kernel
On Wed, Apr 23, 2025 at 09:36:50AM +0800, l1138897701 wrote:
> Thank you for your reply.
>
> The point of this patch is this: if an out-of-tree driver has design flaws, then when it is
> compiled and installed, I personally think the kernel should issue a warning or report an
> error instead of panicking outright.
>
> It is worth considering whether the kernel needs such a fault-proofing mechanism.
Kernel code doesn't have such protections by design. Panic is a perfect
way to teach users not to use broken out-of-tree modules.
Thanks