* [PATCH for-rc] RDMA/efa: Propagate destroy AH error
@ 2026-05-26 7:33 Tom Sela
2026-06-02 0:22 ` Jason Gunthorpe
0 siblings, 1 reply; 4+ messages in thread
From: Tom Sela @ 2026-05-26 7:33 UTC (permalink / raw)
To: mrgolin, tomsela, jgg, leon, linux-rdma
Cc: sleybo, matua, gal.pressman, Yonatan Nachum
AH destruction currently always returns success, ignoring any error
from the device. Propagate the actual device error so the caller can
handle failures appropriately.
Fixes: 9a9ebf8cd72b ("RDMA: Restore ability to fail on AH destroy")
Reviewed-by: Michael Margolin <mrgolin@amazon.com>
Reviewed-by: Yonatan Nachum <ynachum@amazon.com>
Signed-off-by: Tom Sela <tomsela@amazon.com>
---
drivers/infiniband/hw/efa/efa_verbs.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
index 7bd0838ebc99..1e4f052e6385 100644
--- a/drivers/infiniband/hw/efa/efa_verbs.c
+++ b/drivers/infiniband/hw/efa/efa_verbs.c
@@ -2134,8 +2134,7 @@ int efa_destroy_ah(struct ib_ah *ibah, u32 flags)
 		return -EOPNOTSUPP;
 	}
 
-	efa_ah_destroy(dev, ah);
-	return 0;
+	return efa_ah_destroy(dev, ah);
 }
 
 struct rdma_hw_stats *efa_alloc_hw_port_stats(struct ib_device *ibdev,
--
2.47.3
* Re: [PATCH for-rc] RDMA/efa: Propagate destroy AH error
2026-05-26 7:33 [PATCH for-rc] RDMA/efa: Propagate destroy AH error Tom Sela
@ 2026-06-02 0:22 ` Jason Gunthorpe
2026-06-08 14:57 ` tom sela
0 siblings, 1 reply; 4+ messages in thread
From: Jason Gunthorpe @ 2026-06-02 0:22 UTC (permalink / raw)
To: Tom Sela
Cc: mrgolin, leon, linux-rdma, sleybo, matua, gal.pressman,
Yonatan Nachum
On Tue, May 26, 2026 at 07:33:34AM +0000, Tom Sela wrote:
> AH destruction currently always returns success, ignoring any error
> from the device. Propagate the actual device error so the caller can
> handle failures appropriately.
Callers don't handle failures. Drivers are not permitted to fail
destroy; if they do, it will probably trigger a WARN_ON.

You can make something of an argument for allowing destroy to fail for
user objects only, but not like this in general for kernel objects.

If your FW fails to destroy a kernel object then the device is busted;
you should reset it and succeed in destroying the kernel object anyhow.
Jason
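
The kernel-object policy described above can be sketched as a small userspace model. This is not efa driver code: `model_dev`, `model_fw_destroy_ah()` and `model_reset_device()` are hypothetical stand-ins, and the assumption that a reset always clears the fault is a simplification made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device; not the real efa device structure. */
struct model_dev {
	bool fw_broken;	/* when true, the modelled FW rejects destroy */
	int resets;	/* number of resets performed so far */
};

/* Hypothetical stand-in for the firmware destroy command. */
static int model_fw_destroy_ah(struct model_dev *dev)
{
	return dev->fw_broken ? -EIO : 0;
}

/* Hypothetical reset; assumed here to always clear the fault. */
static void model_reset_device(struct model_dev *dev)
{
	dev->fw_broken = false;
	dev->resets++;
}

/*
 * Destroy for a kernel-owned object: never reports failure.
 * A FW error is treated as a broken device, which is reset, and the
 * destroy is then reported as successful (assumption: the reset leaves
 * no FW state behind for the object).
 */
static int model_destroy_kernel_ah(struct model_dev *dev)
{
	assert(dev != NULL);

	if (model_fw_destroy_ah(dev))
		model_reset_device(dev);

	return 0;
}
```

In this model the caller never sees an error: a failing firmware command costs one reset, and later destroys on the recovered device succeed without a further reset.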
* Re: [PATCH for-rc] RDMA/efa: Propagate destroy AH error
2026-06-02 0:22 ` Jason Gunthorpe
@ 2026-06-08 14:57 ` tom sela
2026-06-08 15:26 ` Jason Gunthorpe
0 siblings, 1 reply; 4+ messages in thread
From: tom sela @ 2026-06-08 14:57 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: mrgolin, leon, linux-rdma, sleybo, matua, gal.pressman,
Yonatan Nachum
On Mon, Jun 01, 2026 at 09:22:23PM -0300, Jason Gunthorpe wrote:
> On Tue, May 26, 2026 at 07:33:34AM +0000, Tom Sela wrote:
> > AH destruction currently always returns success, ignoring any error
> > from the device. Propagate the actual device error so the caller can
> > handle failures appropriately.
>
> Callers don't handle failures. Drivers are not permitted to fail
> destroy; if they do, it will probably trigger a WARN_ON.
>
> You can make something of an argument for allowing destroy to fail for
> user objects only, but not like this in general for kernel objects.
>
> If your FW fails to destroy a kernel object then the device is busted;
> you should reset it and succeed in destroying the kernel object anyhow.
>
> Jason
This code is for user objects only. When destroy is called for a
user object, the core code handles the failure gracefully and can
retry cleanup at a later stage.

Currently we don't have a code path where destroy_ah actually fails
in the device, but we'd like the error propagation in place for
completeness, so that if a future FW change returns a transient
error, we handle it correctly rather than silently ignoring it.

Would you prefer that we explicitly guard this with a check for
ibah->uobject (i.e., only propagate the error when it's a user
object)?

Tom
* Re: [PATCH for-rc] RDMA/efa: Propagate destroy AH error
2026-06-08 14:57 ` tom sela
@ 2026-06-08 15:26 ` Jason Gunthorpe
0 siblings, 0 replies; 4+ messages in thread
From: Jason Gunthorpe @ 2026-06-08 15:26 UTC (permalink / raw)
To: tom sela
Cc: mrgolin, leon, linux-rdma, sleybo, matua, gal.pressman,
Yonatan Nachum
On Mon, Jun 08, 2026 at 02:57:38PM +0000, tom sela wrote:
> On Mon, Jun 01, 2026 at 09:22:23PM -0300, Jason Gunthorpe wrote:
> > On Tue, May 26, 2026 at 07:33:34AM +0000, Tom Sela wrote:
> > > AH destruction currently always returns success, ignoring any error
> > > from the device. Propagate the actual device error so the caller can
> > > handle failures appropriately.
> >
> > Callers don't handle failures. Drivers are not permitted to fail
> > destroy; if they do, it will probably trigger a WARN_ON.
> >
> > You can make something of an argument for allowing destroy to fail for
> > user objects only, but not like this in general for kernel objects.
> >
> > If your FW fails to destroy a kernel object then the device is busted;
> > you should reset it and succeed in destroying the kernel object anyhow.
> >
> > Jason
>
>
> This code is for user objects only. When destroy is called for a
> user object, the core code handles the failure gracefully and can
> retry cleanup at a later stage.
>
> Currently we don't have a code path where destroy_ah actually fails
> in the device, but we'd like the error propagation in place for
> completeness, so that if a future FW change returns a transient
> error, we handle it correctly rather than silently ignoring it.
>
> Would you prefer that we explicitly guard this with a check for
> ibah->uobject (i.e., only propagate the error when it's a user
> object)?

Do you ever plan to support kverbs on efa?

It is still not OK to propagate all failures, even on uobjects; you will
still trigger a WARN_ON eventually. It has to succeed under the retry
logic.

Jason
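
One possible reading of "it has to succeed under the retry logic" is sketched below as a userspace model, not as efa or RDMA core code. The `final_attempt` flag, the `retry_*` names, and the idea that the caller tells the driver which attempt is its last are all assumptions made for illustration; the thread does not say how the core signals this.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device; not the real efa device structure. */
struct retry_dev {
	int transient_failures;	/* upcoming FW destroy commands that fail */
	int resets;		/* number of resets performed so far */
};

/* Hypothetical FW command: fails while transient_failures > 0. */
static int retry_fw_destroy_ah(struct retry_dev *dev)
{
	if (dev->transient_failures > 0) {
		dev->transient_failures--;
		return -EAGAIN;
	}
	return 0;
}

/* Hypothetical reset; assumed here to always clear pending faults. */
static void retry_reset_device(struct retry_dev *dev)
{
	dev->transient_failures = 0;
	dev->resets++;
}

/*
 * An error is returned only for a user object on a non-final attempt,
 * where the caller is assumed to retry later. Kernel objects and final
 * attempts always end in success, falling back to a device reset.
 */
static int retry_destroy_ah(struct retry_dev *dev, bool is_user_object,
			    bool final_attempt)
{
	int err;

	assert(dev != NULL);

	err = retry_fw_destroy_ah(dev);
	if (!err)
		return 0;

	if (is_user_object && !final_attempt)
		return err;

	retry_reset_device(dev);
	return 0;
}
```

With two pending transient failures, a first non-final call on a user object returns -EAGAIN; a second call marked final still hits a firmware failure but resets the modelled device and returns 0, so the retry path never ends in an error.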