Linux RDMA and InfiniBand development
* [PATCH for-rc] RDMA/efa: Propagate destroy AH error
From: Tom Sela @ 2026-05-26  7:33 UTC (permalink / raw)
  To: mrgolin, tomsela, jgg, leon, linux-rdma
  Cc: sleybo, matua, gal.pressman, Yonatan Nachum

AH destruction currently always returns success, ignoring any error
from the device. Propagate the actual device error so the caller can
handle failures appropriately.

Fixes: 9a9ebf8cd72b ("RDMA: Restore ability to fail on AH destroy")
Reviewed-by: Michael Margolin <mrgolin@amazon.com>
Reviewed-by: Yonatan Nachum <ynachum@amazon.com>
Signed-off-by: Tom Sela <tomsela@amazon.com>
---
 drivers/infiniband/hw/efa/efa_verbs.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
index 7bd0838ebc99..1e4f052e6385 100644
--- a/drivers/infiniband/hw/efa/efa_verbs.c
+++ b/drivers/infiniband/hw/efa/efa_verbs.c
@@ -2134,8 +2134,7 @@ int efa_destroy_ah(struct ib_ah *ibah, u32 flags)
 		return -EOPNOTSUPP;
 	}
 
-	efa_ah_destroy(dev, ah);
-	return 0;
+	return efa_ah_destroy(dev, ah);
 }
 
 struct rdma_hw_stats *efa_alloc_hw_port_stats(struct ib_device *ibdev,
-- 
2.47.3



* Re: [PATCH for-rc] RDMA/efa: Propagate destroy AH error
From: Jason Gunthorpe @ 2026-06-02  0:22 UTC (permalink / raw)
  To: Tom Sela
  Cc: mrgolin, leon, linux-rdma, sleybo, matua, gal.pressman,
	Yonatan Nachum

On Tue, May 26, 2026 at 07:33:34AM +0000, Tom Sela wrote:
> AH destruction currently always returns success, ignoring any error
> from the device. Propagate the actual device error so the caller can
> handle failures appropriately.

Callers don't handle failures. Drivers are not permitted to fail
destroy, if they do it probably will trigger a WARN_ON.

You can make some of an argument to allow failing destroy for user
objects only, but not like this in general for kernel objects.

If your FW fails destroying a kernel object then the device is busted,
you should reset it and succeed to destroy the kernel object anyhow.

Jason


* Re: [PATCH for-rc] RDMA/efa: Propagate destroy AH error
From: tom sela @ 2026-06-08 14:57 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: mrgolin, leon, linux-rdma, sleybo, matua, gal.pressman,
	Yonatan Nachum

On Mon, Jun 01, 2026 at 09:22:23PM -0300, Jason Gunthorpe wrote:
> On Tue, May 26, 2026 at 07:33:34AM +0000, Tom Sela wrote:
> > AH destruction currently always returns success, ignoring any error
> > from the device. Propagate the actual device error so the caller can
> > handle failures appropriately.
> 
> Callers don't handle failures. Drivers are not permitted to fail
> destroy, if they do it probably will trigger a WARN_ON.
> 
> You can make some of an argument to allow failing destroy for user
> objects only, but not like this in general for kernel objects.
> 
> If your FW fails destroying a kernel object then the device is busted,
> you should reset it and succeed to destroy the kernel object anyhow.
> 
> Jason


This code is for user objects only. When destroy is called for a user object, the core code handles the failure gracefully and can retry cleanup at a later stage.

Currently we don't have a code path where destroy_ah actually fails in device, but we'd like the error propagation in place for completeness so that if a future FW change can return a transient error, we handle it correctly rather than silently ignoring it.

Would you prefer that we explicitly guard this with a check for ibah->uobject
(i.e., only propagate the error when it's a user object)?

Tom


* Re: [PATCH for-rc] RDMA/efa: Propagate destroy AH error
From: Jason Gunthorpe @ 2026-06-08 15:26 UTC (permalink / raw)
  To: tom sela
  Cc: mrgolin, leon, linux-rdma, sleybo, matua, gal.pressman,
	Yonatan Nachum

On Mon, Jun 08, 2026 at 02:57:38PM +0000, tom sela wrote:
> On Mon, Jun 01, 2026 at 09:22:23PM -0300, Jason Gunthorpe wrote:
> > On Tue, May 26, 2026 at 07:33:34AM +0000, Tom Sela wrote:
> > > AH destruction currently always returns success, ignoring any error
> > > from the device. Propagate the actual device error so the caller can
> > > handle failures appropriately.
> > 
> > Callers don't handle failures. Drivers are not permitted to fail
> > destroy, if they do it probably will trigger a WARN_ON.
> > 
> > You can make some of an argument to allow failing destroy for user
> > objects only, but not like this in general for kernel objects.
> > 
> > If your FW fails destroying a kernel object then the device is busted,
> > you should reset it and succeed to destroy the kernel object anyhow.
> > 
> > Jason
> 
> 
> This code is for user objects only. When destroy is called for a
> user object, the core code handles the failure gracefully and can
> retry cleanup at a later stage.
> 
> Currently we don't have a code path where destroy_ah actually fails
> in device, but we'd like the error propagation in place for
> completeness so that if a future FW change can return a transient
> error, we handle it correctly rather than silently ignoring it.
> 
> Would you prefer that we explicitly guard this with a check for
> ibah->uobject (i.e., only propagate the error when it's a user
> object)?

Do you ever plan to support kverbs on efa?

It is still not OK to propagate all failures even on uobjects, you will
still trigger a WARN_ON eventually. It has to succeed under the retry
logic.

Jason
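
The constraint discussed in this thread can be illustrated with a small
userspace sketch. It is not the efa driver code and does not use kernel APIs:
`struct mock_ah`, `mock_fw_destroy_ah()`, `mock_device_reset()`,
`mock_destroy_ah()` and the `final_attempt` flag are all hypothetical names
invented for illustration. The assumption is that a transient firmware error
may be returned only for a user object while a retry is still possible, and
that destroy must otherwise report success, recovering the device if needed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an address handle; not the kernel's struct ib_ah. */
struct mock_ah {
	bool is_user_object;	/* plays the role of a non-NULL ibah->uobject */
	int fw_failures_left;	/* how many times the mock firmware still fails */
	bool destroyed;
};

/* Mock firmware command: fails with -EAGAIN until the failure budget is used up. */
int mock_fw_destroy_ah(struct mock_ah *ah)
{
	if (ah->fw_failures_left > 0) {
		ah->fw_failures_left--;
		return -EAGAIN;
	}
	ah->destroyed = true;
	return 0;
}

/* Mock recovery path: after a reset the object no longer exists on the device. */
void mock_device_reset(struct mock_ah *ah)
{
	ah->fw_failures_left = 0;
	ah->destroyed = true;
}

/*
 * Destroy policy sketched from the review comments (an assumption, not
 * the actual RDMA core contract):
 *  - user object and a retry is still possible: report the error;
 *  - kernel object, or no retry left: recover and report success.
 */
int mock_destroy_ah(struct mock_ah *ah, bool final_attempt)
{
	int err;

	assert(ah != NULL);

	err = mock_fw_destroy_ah(ah);
	if (!err)
		return 0;

	if (ah->is_user_object && !final_attempt)
		return err;

	mock_device_reset(ah);
	return 0;
}
```

In this sketch a user object with one pending firmware failure returns
-EAGAIN on the first call and 0 on the retry, while a kernel object, or any
object on its final attempt, returns 0 regardless of the firmware result.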

