Linux RDMA and InfiniBand development
 help / color / mirror / Atom feed
* [PATCH for-next v4 0/2] RDMA/efa: Add AH cache for AH reuse
@ 2026-06-08  7:16 Yonatan Nachum
  2026-06-08  7:16 ` [PATCH for-next v4 1/2] RDMA/efa: Add initialization of AH cache rhashtable Yonatan Nachum
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Yonatan Nachum @ 2026-06-08  7:16 UTC (permalink / raw)
  To: jgg, leon, linux-rdma
  Cc: mrgolin, sleybo, matua, gal.pressman, Yonatan Nachum

Changelog:
v4:
 * Use kzalloc_obj for AH cache entry allocation instead of kzalloc
v3: https://lore.kernel.org/all/20260607161753.1607559-1-ynachum@amazon.com/
 * Address Sashiko comments in:
   https://sashiko.dev/#/patchset/20260512061121.2177521-1-ynachum%40amazon.com
v2: https://lore.kernel.org/all/20260512061121.2177521-1-ynachum@amazon.com/
 * Zero-initialize AH cache key on cache lookup.
v1: https://lore.kernel.org/all/20260510083035.458081-1-ynachum@amazon.com/

-------------------------------------------------------------------------
New EFA devices don't support the creation of multiple AHs to the same
remote on the same PD. To overcome this limitation, introduce an AH
cache that manages AH reuse transparently.

The cache uses an rhashtable keyed by (PD, GID) to track active address
handles with refcounts. On create AH, the driver returns an existing AH
number if one is already cached, or creates a new one and caches it. On
destroy AH, the driver only issues the device destroy command when the
last reference is dropped.

A per-entry mutex serializes concurrent device commands on the same
cache entry, preventing create-before-destroy races on the device.

Yonatan Nachum (2):
  RDMA/efa: Add initialization of AH cache rhashtable
  RDMA/efa: Add AH cache handling on create and destroy AH

 drivers/infiniband/hw/efa/Makefile       |   4 +-
 drivers/infiniband/hw/efa/efa_ah_cache.c | 163 +++++++++++++++++++++++
 drivers/infiniband/hw/efa/efa_ah_cache.h |  42 ++++++
 drivers/infiniband/hw/efa/efa_com.c      |  12 +-
 drivers/infiniband/hw/efa/efa_com.h      |   3 +
 drivers/infiniband/hw/efa/efa_com_cmd.c  |  73 +++++++---
 drivers/infiniband/hw/efa/efa_com_cmd.h  |   1 +
 drivers/infiniband/hw/efa/efa_verbs.c    |   9 +-
 8 files changed, 281 insertions(+), 26 deletions(-)
 create mode 100644 drivers/infiniband/hw/efa/efa_ah_cache.c
 create mode 100644 drivers/infiniband/hw/efa/efa_ah_cache.h

-- 
2.50.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH for-next v4 1/2] RDMA/efa: Add initialization of AH cache rhashtable
  2026-06-08  7:16 [PATCH for-next v4 0/2] RDMA/efa: Add AH cache for AH reuse Yonatan Nachum
@ 2026-06-08  7:16 ` Yonatan Nachum
  2026-06-08  7:16 ` [PATCH for-next v4 2/2] RDMA/efa: Add AH cache handling on create and destroy AH Yonatan Nachum
  2026-06-14  7:12 ` [PATCH for-next v4 0/2] RDMA/efa: Add AH cache for AH reuse Yonatan Nachum
  2 siblings, 0 replies; 6+ messages in thread
From: Yonatan Nachum @ 2026-06-08  7:16 UTC (permalink / raw)
  To: jgg, leon, linux-rdma
  Cc: mrgolin, sleybo, matua, gal.pressman, Yonatan Nachum,
	Firas Jahjah

New EFA devices don't support the creation of multiple address handles
to the same remote on the same PD.
To overcome this limitation, introduce an AH cache rhashtable which will
store the refcounts of the same AH creation on the same PD and will
allow the driver to manage AH reuse. The hashtable key is the
combination of PD and GID. Add initialization and teardown logic for the
rhashtable.

Reviewed-by: Firas Jahjah <firasj@amazon.com>
Reviewed-by: Michael Margolin <mrgolin@amazon.com>
Signed-off-by: Yonatan Nachum <ynachum@amazon.com>
---
 drivers/infiniband/hw/efa/Makefile       |  4 +--
 drivers/infiniband/hw/efa/efa_ah_cache.c | 41 ++++++++++++++++++++++++
 drivers/infiniband/hw/efa/efa_ah_cache.h | 37 +++++++++++++++++++++
 drivers/infiniband/hw/efa/efa_com.c      | 12 ++++++-
 drivers/infiniband/hw/efa/efa_com.h      |  3 ++
 5 files changed, 94 insertions(+), 3 deletions(-)
 create mode 100644 drivers/infiniband/hw/efa/efa_ah_cache.c
 create mode 100644 drivers/infiniband/hw/efa/efa_ah_cache.h

diff --git a/drivers/infiniband/hw/efa/Makefile b/drivers/infiniband/hw/efa/Makefile
index 6e83083af0bc..a6a433b0ba2f 100644
--- a/drivers/infiniband/hw/efa/Makefile
+++ b/drivers/infiniband/hw/efa/Makefile
@@ -1,9 +1,9 @@
 # SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
-# Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
+# Copyright 2018-2026 Amazon.com, Inc. or its affiliates. All rights reserved.
 #
 # Makefile for Amazon Elastic Fabric Adapter (EFA) device driver.
 #
 
 obj-$(CONFIG_INFINIBAND_EFA) += efa.o
 
-efa-y := efa_com_cmd.o efa_com.o efa_main.o efa_verbs.o
+efa-y := efa_com_cmd.o efa_ah_cache.o efa_com.o efa_main.o efa_verbs.o
diff --git a/drivers/infiniband/hw/efa/efa_ah_cache.c b/drivers/infiniband/hw/efa/efa_ah_cache.c
new file mode 100644
index 000000000000..ab763b06b9bb
--- /dev/null
+++ b/drivers/infiniband/hw/efa/efa_ah_cache.c
@@ -0,0 +1,41 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/*
+ * Copyright 2026 Amazon.com, Inc. or its affiliates. All rights reserved.
+ */
+
+#include <linux/slab.h>
+
+#include "efa_ah_cache.h"
+
+static const struct rhashtable_params ah_cache_params = {
+	.key_len = sizeof(struct efa_ah_cache_key),
+	.key_offset = offsetof(struct efa_ah_cache_entry, key),
+	.head_offset = offsetof(struct efa_ah_cache_entry, linkage),
+};
+
+int efa_ah_cache_init(struct efa_ah_cache *ah_cache)
+{
+	int err;
+
+	mutex_init(&ah_cache->lock);
+	err = rhashtable_init(&ah_cache->hashtable, &ah_cache_params);
+	if (err)
+		mutex_destroy(&ah_cache->lock);
+
+	return err;
+}
+
+static void efa_ah_cache_entry_free(void *ptr, void *arg)
+{
+	struct efa_ah_cache_entry *entry = ptr;
+
+	mutex_destroy(&entry->lock);
+	kfree(entry);
+}
+
+void efa_ah_cache_destroy(struct efa_ah_cache *ah_cache)
+{
+	rcu_barrier();
+	rhashtable_free_and_destroy(&ah_cache->hashtable, efa_ah_cache_entry_free, NULL);
+	mutex_destroy(&ah_cache->lock);
+}
diff --git a/drivers/infiniband/hw/efa/efa_ah_cache.h b/drivers/infiniband/hw/efa/efa_ah_cache.h
new file mode 100644
index 000000000000..133181b4466d
--- /dev/null
+++ b/drivers/infiniband/hw/efa/efa_ah_cache.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
+/*
+ * Copyright 2026 Amazon.com, Inc. or its affiliates. All rights reserved.
+ */
+
+#ifndef _EFA_AH_CACHE_H_
+#define _EFA_AH_CACHE_H_
+
+#include <linux/refcount.h>
+#include <linux/rhashtable.h>
+
+#define EFA_AH_GID_SIZE 16
+
+struct efa_ah_cache_key {
+	u8 gid[EFA_AH_GID_SIZE];
+	u16 pd;
+};
+
+struct efa_ah_cache_entry {
+	struct efa_ah_cache_key key;
+	u16 ah;
+	bool initialized;
+	refcount_t refcount;
+	struct rhash_head linkage;
+	struct rcu_head rcu_head;
+	struct mutex lock; /* Serializes device commands per cache entry */
+};
+
+struct efa_ah_cache {
+	struct rhashtable hashtable;
+	struct mutex lock; /* Protects AH cache hashtable */
+};
+
+int efa_ah_cache_init(struct efa_ah_cache *ah_cache);
+void efa_ah_cache_destroy(struct efa_ah_cache *ah_cache);
+
+#endif /* _EFA_AH_CACHE_H_ */
diff --git a/drivers/infiniband/hw/efa/efa_com.c b/drivers/infiniband/hw/efa/efa_com.c
index 7cc3f4af0bb9..7097d1c2f23d 100644
--- a/drivers/infiniband/hw/efa/efa_com.c
+++ b/drivers/infiniband/hw/efa/efa_com.c
@@ -724,6 +724,8 @@ void efa_com_admin_destroy(struct efa_com_dev *edev)
 
 	size = aenq->depth * sizeof(*aenq->entries);
 	dma_free_coherent(edev->dmadev, size, aenq->entries, aenq->dma_addr);
+
+	efa_ah_cache_destroy(&edev->ah_cache);
 }
 
 /**
@@ -782,6 +784,12 @@ int efa_com_admin_init(struct efa_com_dev *edev,
 		return -ENODEV;
 	}
 
+	err = efa_ah_cache_init(&edev->ah_cache);
+	if (err) {
+		ibdev_err(edev->efa_dev, "Failed to init AH cache\n");
+		return err;
+	}
+
 	aq->depth = EFA_ADMIN_QUEUE_DEPTH;
 
 	aq->dmadev = edev->dmadev;
@@ -794,7 +802,7 @@ int efa_com_admin_init(struct efa_com_dev *edev,
 
 	err = efa_com_init_comp_ctxt(aq);
 	if (err)
-		return err;
+		goto err_destroy_ah_cache;
 
 	err = efa_com_admin_init_sq(edev);
 	if (err)
@@ -832,6 +840,8 @@ int efa_com_admin_init(struct efa_com_dev *edev,
 			  aq->sq.entries, aq->sq.dma_addr);
 err_destroy_comp_ctxt:
 	devm_kfree(edev->dmadev, aq->comp_ctx);
+err_destroy_ah_cache:
+	efa_ah_cache_destroy(&edev->ah_cache);
 
 	return err;
 }
diff --git a/drivers/infiniband/hw/efa/efa_com.h b/drivers/infiniband/hw/efa/efa_com.h
index f8c692b0e092..599db9d583bf 100644
--- a/drivers/infiniband/hw/efa/efa_com.h
+++ b/drivers/infiniband/hw/efa/efa_com.h
@@ -14,6 +14,7 @@
 
 #include <rdma/ib_verbs.h>
 
+#include "efa_ah_cache.h"
 #include "efa_common_defs.h"
 #include "efa_admin_defs.h"
 #include "efa_admin_cmds_defs.h"
@@ -113,6 +114,8 @@ struct efa_com_dev {
 	u32 supported_features;
 	u32 dma_addr_bits;
 
+	struct efa_ah_cache ah_cache;
+
 	u32 dev_api_ver;
 	struct efa_com_mmio_read mmio_read;
 };
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH for-next v4 2/2] RDMA/efa: Add AH cache handling on create and destroy AH
  2026-06-08  7:16 [PATCH for-next v4 0/2] RDMA/efa: Add AH cache for AH reuse Yonatan Nachum
  2026-06-08  7:16 ` [PATCH for-next v4 1/2] RDMA/efa: Add initialization of AH cache rhashtable Yonatan Nachum
@ 2026-06-08  7:16 ` Yonatan Nachum
  2026-06-14  7:12 ` [PATCH for-next v4 0/2] RDMA/efa: Add AH cache for AH reuse Yonatan Nachum
  2 siblings, 0 replies; 6+ messages in thread
From: Yonatan Nachum @ 2026-06-08  7:16 UTC (permalink / raw)
  To: jgg, leon, linux-rdma
  Cc: mrgolin, sleybo, matua, gal.pressman, Yonatan Nachum,
	Firas Jahjah

On create AH, first check if the AH cache entry already exists and if
so, returns the already stored AH number. If the entry doesn't exist,
the driver creates it and calls the device to create the AH. A per-entry
mutex serializes concurrent device commands on the same AH cache entry,
ensuring only one thread issues the device create while others wait and
reuse the result. If the device create fails, the entry remains
uninitialized so subsequent threads calls can create the AH.

On destroy AH, the refcount is checked and if it's the last reference,
the driver issues the device destroy command while holding the entry
mutex. The entry remains in the hashtable during destroy to allow
concurrent create threads to find it and wait on the entry mutex,
preventing create-before-destroy races on the device. After the device
destroy completes, the entry is either recycled if new users arrived or
removed and freed.

Reviewed-by: Firas Jahjah <firasj@amazon.com>
Reviewed-by: Michael Margolin <mrgolin@amazon.com>
Signed-off-by: Yonatan Nachum <ynachum@amazon.com>
---
 drivers/infiniband/hw/efa/efa_ah_cache.c | 122 +++++++++++++++++++++++
 drivers/infiniband/hw/efa/efa_ah_cache.h |   5 +
 drivers/infiniband/hw/efa/efa_com_cmd.c  |  73 ++++++++++----
 drivers/infiniband/hw/efa/efa_com_cmd.h  |   1 +
 drivers/infiniband/hw/efa/efa_verbs.c    |   9 +-
 5 files changed, 187 insertions(+), 23 deletions(-)

diff --git a/drivers/infiniband/hw/efa/efa_ah_cache.c b/drivers/infiniband/hw/efa/efa_ah_cache.c
index ab763b06b9bb..b738261dca35 100644
--- a/drivers/infiniband/hw/efa/efa_ah_cache.c
+++ b/drivers/infiniband/hw/efa/efa_ah_cache.c
@@ -39,3 +39,125 @@ void efa_ah_cache_destroy(struct efa_ah_cache *ah_cache)
 	rhashtable_free_and_destroy(&ah_cache->hashtable, efa_ah_cache_entry_free, NULL);
 	mutex_destroy(&ah_cache->lock);
 }
+
+static struct efa_ah_cache_entry *efa_ah_cache_lookup(struct efa_ah_cache *ah_cache, u16 pd,
+						      u8 *gid)
+	__must_hold(&ah_cache->lock)
+{
+	struct efa_ah_cache_key key = {};
+
+	memcpy(key.gid, gid, sizeof(key.gid));
+	key.pd = pd;
+
+	return rhashtable_lookup_fast(&ah_cache->hashtable, &key, ah_cache_params);
+}
+
+/**
+ * efa_ah_cache_get_or_create - Get or create an AH cache entry
+ * @ah_cache: AH cache
+ * @pd: Protection domain number
+ * @gid: GID address
+ *
+ * Look up an AH cache entry by PD and GID. If found, increment the refcount and
+ * return it. If not found, allocate a new entry and insert it into the
+ * hashtable. The entry is returned unlocked.
+ *
+ * Return: Pointer to the entry on success, ERR_PTR on failure.
+ */
+struct efa_ah_cache_entry *efa_ah_cache_get_or_create(struct efa_ah_cache *ah_cache, u16 pd,
+						      u8 *gid)
+{
+	struct efa_ah_cache_entry *entry;
+	int err;
+
+	mutex_lock(&ah_cache->lock);
+
+	entry = efa_ah_cache_lookup(ah_cache, pd, gid);
+	if (entry) {
+		refcount_inc(&entry->refcount);
+		mutex_unlock(&ah_cache->lock);
+		return entry;
+	}
+
+	entry = kzalloc_obj(*entry);
+	if (!entry) {
+		mutex_unlock(&ah_cache->lock);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	memcpy(entry->key.gid, gid, sizeof(entry->key.gid));
+	entry->key.pd = pd;
+	refcount_set(&entry->refcount, 1);
+	mutex_init(&entry->lock);
+
+	err = rhashtable_insert_fast(&ah_cache->hashtable, &entry->linkage, ah_cache_params);
+	if (err) {
+		mutex_destroy(&entry->lock);
+		kfree(entry);
+		mutex_unlock(&ah_cache->lock);
+		return ERR_PTR(err);
+	}
+
+	mutex_unlock(&ah_cache->lock);
+	return entry;
+}
+
+/**
+ * efa_ah_cache_put_unless_last - Release a reference to an AH cache entry
+ * @ah_cache: AH cache
+ * @pd: Protection domain number
+ * @gid: GID address
+ *
+ * If this is not the last reference, decrement the refcount and return NULL.
+ * If this is the last reference, return the entry with its mutex locked
+ * without decrementing.
+ *
+ * Return: Pointer to the locked entry if last reference, NULL otherwise.
+ */
+struct efa_ah_cache_entry *efa_ah_cache_put_unless_last(struct efa_ah_cache *ah_cache, u16 pd,
+							u8 *gid)
+{
+	struct efa_ah_cache_entry *entry;
+
+	mutex_lock(&ah_cache->lock);
+	entry = efa_ah_cache_lookup(ah_cache, pd, gid);
+	if (!entry) {
+		mutex_unlock(&ah_cache->lock);
+		return NULL;
+	}
+
+	if (refcount_dec_not_one(&entry->refcount)) {
+		mutex_unlock(&ah_cache->lock);
+		return NULL;
+	}
+
+	mutex_lock(&entry->lock);
+	mutex_unlock(&ah_cache->lock);
+	return entry;
+}
+
+/**
+ * efa_ah_cache_put - Release the final reference to an AH cache entry
+ * @ah_cache: AH cache
+ * @entry: AH cache entry
+ *
+ * Decrement the refcount. If it reaches zero, the entry is removed from the
+ * hashtable and freed. Otherwise, the entry is kept for reuse.
+ *
+ * Called after the device destroy completes or on a failed create to release
+ * the caller's reference.
+ */
+void efa_ah_cache_put(struct efa_ah_cache *ah_cache, struct efa_ah_cache_entry *entry)
+{
+	mutex_lock(&ah_cache->lock);
+	if (!refcount_dec_and_test(&entry->refcount)) {
+		mutex_unlock(&ah_cache->lock);
+		return;
+	}
+
+	rhashtable_remove_fast(&ah_cache->hashtable, &entry->linkage, ah_cache_params);
+	mutex_unlock(&ah_cache->lock);
+
+	mutex_destroy(&entry->lock);
+	kfree_rcu(entry, rcu_head);
+}
diff --git a/drivers/infiniband/hw/efa/efa_ah_cache.h b/drivers/infiniband/hw/efa/efa_ah_cache.h
index 133181b4466d..573fd29bb416 100644
--- a/drivers/infiniband/hw/efa/efa_ah_cache.h
+++ b/drivers/infiniband/hw/efa/efa_ah_cache.h
@@ -33,5 +33,10 @@ struct efa_ah_cache {
 
 int efa_ah_cache_init(struct efa_ah_cache *ah_cache);
 void efa_ah_cache_destroy(struct efa_ah_cache *ah_cache);
+struct efa_ah_cache_entry *efa_ah_cache_get_or_create(struct efa_ah_cache *ah_cache, u16 pd,
+						      u8 *gid);
+struct efa_ah_cache_entry *efa_ah_cache_put_unless_last(struct efa_ah_cache *ah_cache, u16 pd,
+							u8 *gid);
+void efa_ah_cache_put(struct efa_ah_cache *ah_cache, struct efa_ah_cache_entry *entry);
 
 #endif /* _EFA_AH_CACHE_H_ */
diff --git a/drivers/infiniband/hw/efa/efa_com_cmd.c b/drivers/infiniband/hw/efa/efa_com_cmd.c
index 63c7f07806a8..9eafcba5e028 100644
--- a/drivers/infiniband/hw/efa/efa_com_cmd.c
+++ b/drivers/infiniband/hw/efa/efa_com_cmd.c
@@ -313,19 +313,25 @@ int efa_com_dereg_mr(struct efa_com_dev *edev,
 	return 0;
 }
 
-int efa_com_create_ah(struct efa_com_dev *edev,
-		      struct efa_com_create_ah_params *params,
-		      struct efa_com_create_ah_result *result)
+int efa_com_destroy_ah(struct efa_com_dev *edev,
+		       struct efa_com_destroy_ah_params *params)
 {
-	struct efa_admin_create_ah_resp cmd_completion;
+	struct efa_admin_destroy_ah_resp cmd_completion;
+	struct efa_admin_destroy_ah_cmd ah_cmd = {};
 	struct efa_com_admin_queue *aq = &edev->aq;
-	struct efa_admin_create_ah_cmd ah_cmd = {};
+	struct efa_ah_cache_entry *entry;
 	int err;
 
-	ah_cmd.aq_common_desc.opcode = EFA_ADMIN_CREATE_AH;
+	entry = efa_ah_cache_put_unless_last(&edev->ah_cache, params->pdn, params->gid);
+	if (!entry)
+		return 0;
 
-	memcpy(ah_cmd.dest_addr, params->dest_addr, sizeof(ah_cmd.dest_addr));
-	ah_cmd.pd = params->pdn;
+	if (!entry->initialized)
+		goto out;
+
+	ah_cmd.aq_common_desc.opcode = EFA_ADMIN_DESTROY_AH;
+	ah_cmd.ah = entry->ah;
+	ah_cmd.pd = entry->key.pd;
 
 	err = efa_com_cmd_exec(aq,
 			       (struct efa_admin_aq_entry *)&ah_cmd,
@@ -333,27 +339,47 @@ int efa_com_create_ah(struct efa_com_dev *edev,
 			       (struct efa_admin_acq_entry *)&cmd_completion,
 			       sizeof(cmd_completion));
 	if (err) {
+		mutex_unlock(&entry->lock);
 		ibdev_err_ratelimited(edev->efa_dev,
-				      "Failed to create ah for %pI6 [%d]\n",
-				      ah_cmd.dest_addr, err);
+				      "Failed to destroy ah-%d pd-%d [%d]\n",
+				      ah_cmd.ah, ah_cmd.pd, err);
 		return err;
 	}
 
-	result->ah = cmd_completion.ah;
+	entry->initialized = false;
+
+out:
+	mutex_unlock(&entry->lock);
+	efa_ah_cache_put(&edev->ah_cache, entry);
 
 	return 0;
 }
 
-int efa_com_destroy_ah(struct efa_com_dev *edev,
-		       struct efa_com_destroy_ah_params *params)
+int efa_com_create_ah(struct efa_com_dev *edev,
+		      struct efa_com_create_ah_params *params,
+		      struct efa_com_create_ah_result *result)
 {
-	struct efa_admin_destroy_ah_resp cmd_completion;
-	struct efa_admin_destroy_ah_cmd ah_cmd = {};
+	struct efa_com_destroy_ah_params destroy_params = {};
+	struct efa_admin_create_ah_resp cmd_completion;
 	struct efa_com_admin_queue *aq = &edev->aq;
+	struct efa_admin_create_ah_cmd ah_cmd = {};
+	struct efa_ah_cache_entry *entry;
 	int err;
 
-	ah_cmd.aq_common_desc.opcode = EFA_ADMIN_DESTROY_AH;
-	ah_cmd.ah = params->ah;
+	entry = efa_ah_cache_get_or_create(&edev->ah_cache, params->pdn, params->dest_addr);
+	if (IS_ERR(entry))
+		return PTR_ERR(entry);
+
+	mutex_lock(&entry->lock);
+	if (entry->initialized) {
+		result->ah = entry->ah;
+		mutex_unlock(&entry->lock);
+		return 0;
+	}
+
+	ah_cmd.aq_common_desc.opcode = EFA_ADMIN_CREATE_AH;
+
+	memcpy(ah_cmd.dest_addr, params->dest_addr, sizeof(ah_cmd.dest_addr));
 	ah_cmd.pd = params->pdn;
 
 	err = efa_com_cmd_exec(aq,
@@ -362,12 +388,21 @@ int efa_com_destroy_ah(struct efa_com_dev *edev,
 			       (struct efa_admin_acq_entry *)&cmd_completion,
 			       sizeof(cmd_completion));
 	if (err) {
+		mutex_unlock(&entry->lock);
+		memcpy(destroy_params.gid, params->dest_addr, sizeof(destroy_params.gid));
+		destroy_params.pdn = params->pdn;
+		efa_com_destroy_ah(edev, &destroy_params);
 		ibdev_err_ratelimited(edev->efa_dev,
-				      "Failed to destroy ah-%d pd-%d [%d]\n",
-				      ah_cmd.ah, ah_cmd.pd, err);
+				      "Failed to create ah for %pI6 [%d]\n",
+				      ah_cmd.dest_addr, err);
 		return err;
 	}
 
+	entry->ah = cmd_completion.ah;
+	entry->initialized = true;
+	result->ah = cmd_completion.ah;
+	mutex_unlock(&entry->lock);
+
 	return 0;
 }
 
diff --git a/drivers/infiniband/hw/efa/efa_com_cmd.h b/drivers/infiniband/hw/efa/efa_com_cmd.h
index ef15b3c38429..39bd4e06684a 100644
--- a/drivers/infiniband/hw/efa/efa_com_cmd.h
+++ b/drivers/infiniband/hw/efa/efa_com_cmd.h
@@ -106,6 +106,7 @@ struct efa_com_create_ah_result {
 
 struct efa_com_destroy_ah_params {
 	u16 ah;
+	u8 gid[EFA_GID_SIZE];
 	u16 pdn;
 };
 
diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
index 434d60235945..6742a4037888 100644
--- a/drivers/infiniband/hw/efa/efa_verbs.c
+++ b/drivers/infiniband/hw/efa/efa_verbs.c
@@ -2032,10 +2032,11 @@ int efa_mmap(struct ib_ucontext *ibucontext,
 
 static int efa_ah_destroy(struct efa_dev *dev, struct efa_ah *ah)
 {
-	struct efa_com_destroy_ah_params params = {
-		.ah = ah->ah,
-		.pdn = to_epd(ah->ibah.pd)->pdn,
-	};
+	struct efa_com_destroy_ah_params params = {};
+
+	params.ah = ah->ah;
+	memcpy(params.gid, ah->id, sizeof(params.gid));
+	params.pdn = to_epd(ah->ibah.pd)->pdn;
 
 	return efa_com_destroy_ah(&dev->edev, &params);
 }
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH for-next v4 0/2] RDMA/efa: Add AH cache for AH reuse
  2026-06-08  7:16 [PATCH for-next v4 0/2] RDMA/efa: Add AH cache for AH reuse Yonatan Nachum
  2026-06-08  7:16 ` [PATCH for-next v4 1/2] RDMA/efa: Add initialization of AH cache rhashtable Yonatan Nachum
  2026-06-08  7:16 ` [PATCH for-next v4 2/2] RDMA/efa: Add AH cache handling on create and destroy AH Yonatan Nachum
@ 2026-06-14  7:12 ` Yonatan Nachum
  2026-06-16 17:50   ` Leon Romanovsky
  2 siblings, 1 reply; 6+ messages in thread
From: Yonatan Nachum @ 2026-06-14  7:12 UTC (permalink / raw)
  To: jgg, leon, linux-rdma; +Cc: mrgolin, sleybo, matua, gal.pressman

On Mon, Jun 08, 2026 at 07:16:18AM +0000, Yonatan Nachum wrote:
> Changelog:
> v4:
>  * Use kzalloc_obj for AH cache entry allocation instead of kzalloc
> v3: https://lore.kernel.org/all/20260607161753.1607559-1-ynachum@amazon.com/
>  * Address Sashiko comments in:
>    https://sashiko.dev/#/patchset/20260512061121.2177521-1-ynachum%40amazon.com
> v2: https://lore.kernel.org/all/20260512061121.2177521-1-ynachum@amazon.com/
>  * Zero-initialize AH cache key on cache lookup.
> v1: https://lore.kernel.org/all/20260510083035.458081-1-ynachum@amazon.com/
> 
> -------------------------------------------------------------------------
> New EFA devices don't support the creation of multiple AHs to the same
> remote on the same PD. To overcome this limitation, introduce an AH
> cache that manages AH reuse transparently.
> 
> The cache uses an rhashtable keyed by (PD, GID) to track active address
> handles with refcounts. On create AH, the driver returns an existing AH
> number if one is already cached, or creates a new one and caches it. On
> destroy AH, the driver only issues the device destroy command when the
> last reference is dropped.
> 
> A per-entry mutex serializes concurrent device commands on the same
> cache entry, preventing create-before-destroy races on the device.
> 
> Yonatan Nachum (2):
>   RDMA/efa: Add initialization of AH cache rhashtable
>   RDMA/efa: Add AH cache handling on create and destroy AH
> 
>  drivers/infiniband/hw/efa/Makefile       |   4 +-
>  drivers/infiniband/hw/efa/efa_ah_cache.c | 163 +++++++++++++++++++++++
>  drivers/infiniband/hw/efa/efa_ah_cache.h |  42 ++++++
>  drivers/infiniband/hw/efa/efa_com.c      |  12 +-
>  drivers/infiniband/hw/efa/efa_com.h      |   3 +
>  drivers/infiniband/hw/efa/efa_com_cmd.c  |  73 +++++++---
>  drivers/infiniband/hw/efa/efa_com_cmd.h  |   1 +
>  drivers/infiniband/hw/efa/efa_verbs.c    |   9 +-
>  8 files changed, 281 insertions(+), 26 deletions(-)
>  create mode 100644 drivers/infiniband/hw/efa/efa_ah_cache.c
>  create mode 100644 drivers/infiniband/hw/efa/efa_ah_cache.h
> 
> -- 
> 2.50.1
>

Hi, kind reminder.
This series blocks merging EFAv4 device ID and we would want it to land
for the next merge window if possible.

I reviewed the last Sashiko comment and it's not relevant since on
destroy AH failure, we keep the entry in the hashtable.

Thanks
 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH for-next v4 0/2] RDMA/efa: Add AH cache for AH reuse
  2026-06-14  7:12 ` [PATCH for-next v4 0/2] RDMA/efa: Add AH cache for AH reuse Yonatan Nachum
@ 2026-06-16 17:50   ` Leon Romanovsky
  2026-06-16 19:31     ` Yonatan Nachum
  0 siblings, 1 reply; 6+ messages in thread
From: Leon Romanovsky @ 2026-06-16 17:50 UTC (permalink / raw)
  To: Yonatan Nachum; +Cc: jgg, linux-rdma, mrgolin, sleybo, matua, gal.pressman

On Sun, Jun 14, 2026 at 07:12:29AM +0000, Yonatan Nachum wrote:
> On Mon, Jun 08, 2026 at 07:16:18AM +0000, Yonatan Nachum wrote:
> > Changelog:
> > v4:
> >  * Use kzalloc_obj for AH cache entry allocation instead of kzalloc
> > v3: https://lore.kernel.org/all/20260607161753.1607559-1-ynachum@amazon.com/
> >  * Address Sashiko comments in:
> >    https://sashiko.dev/#/patchset/20260512061121.2177521-1-ynachum%40amazon.com
> > v2: https://lore.kernel.org/all/20260512061121.2177521-1-ynachum@amazon.com/
> >  * Zero-initialize AH cache key on cache lookup.
> > v1: https://lore.kernel.org/all/20260510083035.458081-1-ynachum@amazon.com/
> > 
> > -------------------------------------------------------------------------
> > New EFA devices don't support the creation of multiple AHs to the same
> > remote on the same PD. To overcome this limitation, introduce an AH
> > cache that manages AH reuse transparently.
> > 
> > The cache uses an rhashtable keyed by (PD, GID) to track active address
> > handles with refcounts. On create AH, the driver returns an existing AH
> > number if one is already cached, or creates a new one and caches it. On
> > destroy AH, the driver only issues the device destroy command when the
> > last reference is dropped.
> > 
> > A per-entry mutex serializes concurrent device commands on the same
> > cache entry, preventing create-before-destroy races on the device.
> > 
> > Yonatan Nachum (2):
> >   RDMA/efa: Add initialization of AH cache rhashtable
> >   RDMA/efa: Add AH cache handling on create and destroy AH
> > 
> >  drivers/infiniband/hw/efa/Makefile       |   4 +-
> >  drivers/infiniband/hw/efa/efa_ah_cache.c | 163 +++++++++++++++++++++++
> >  drivers/infiniband/hw/efa/efa_ah_cache.h |  42 ++++++
> >  drivers/infiniband/hw/efa/efa_com.c      |  12 +-
> >  drivers/infiniband/hw/efa/efa_com.h      |   3 +
> >  drivers/infiniband/hw/efa/efa_com_cmd.c  |  73 +++++++---
> >  drivers/infiniband/hw/efa/efa_com_cmd.h  |   1 +
> >  drivers/infiniband/hw/efa/efa_verbs.c    |   9 +-
> >  8 files changed, 281 insertions(+), 26 deletions(-)
> >  create mode 100644 drivers/infiniband/hw/efa/efa_ah_cache.c
> >  create mode 100644 drivers/infiniband/hw/efa/efa_ah_cache.h
> > 
> > -- 
> > 2.50.1
> >
> 
> Hi, kind reminder.
> This series blocks merging EFAv4 device ID and we would want it to land
> for the next merge window if possible.

It is not possible.

The use of entry->lock together with a refcount and an "initialize"
flag suggests that the refcount is not being used correctly.

I would expect a single ah_cache lock, with the refcount tracking the
number of users of the entry.

Thanks.

> 
> I reviewed the last Sashiko comment and it's not relevant since on
> destroy AH failure, we keep the entry in the hashtable.
> 
> Thanks
>  

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH for-next v4 0/2] RDMA/efa: Add AH cache for AH reuse
  2026-06-16 17:50   ` Leon Romanovsky
@ 2026-06-16 19:31     ` Yonatan Nachum
  0 siblings, 0 replies; 6+ messages in thread
From: Yonatan Nachum @ 2026-06-16 19:31 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: jgg, linux-rdma, mrgolin, sleybo, matua, gal.pressman

On Tue, Jun 16, 2026 at 08:50:33PM +0300, Leon Romanovsky wrote:
> On Sun, Jun 14, 2026 at 07:12:29AM +0000, Yonatan Nachum wrote:
> > On Mon, Jun 08, 2026 at 07:16:18AM +0000, Yonatan Nachum wrote:
> > > Changelog:
> > > v4:
> > >  * Use kzalloc_obj for AH cache entry allocation instead of kzalloc
> > > v3: https://lore.kernel.org/all/20260607161753.1607559-1-ynachum@amazon.com/
> > >  * Address Sashiko comments in:
> > >    https://sashiko.dev/#/patchset/20260512061121.2177521-1-ynachum%40amazon.com
> > > v2: https://lore.kernel.org/all/20260512061121.2177521-1-ynachum@amazon.com/
> > >  * Zero-initialize AH cache key on cache lookup.
> > > v1: https://lore.kernel.org/all/20260510083035.458081-1-ynachum@amazon.com/
> > > 
> > > -------------------------------------------------------------------------
> > > New EFA devices don't support the creation of multiple AHs to the same
> > > remote on the same PD. To overcome this limitation, introduce an AH
> > > cache that manages AH reuse transparently.
> > > 
> > > The cache uses an rhashtable keyed by (PD, GID) to track active address
> > > handles with refcounts. On create AH, the driver returns an existing AH
> > > number if one is already cached, or creates a new one and caches it. On
> > > destroy AH, the driver only issues the device destroy command when the
> > > last reference is dropped.
> > > 
> > > A per-entry mutex serializes concurrent device commands on the same
> > > cache entry, preventing create-before-destroy races on the device.
> > > 
> > > Yonatan Nachum (2):
> > >   RDMA/efa: Add initialization of AH cache rhashtable
> > >   RDMA/efa: Add AH cache handling on create and destroy AH
> > > 
> > >  drivers/infiniband/hw/efa/Makefile       |   4 +-
> > >  drivers/infiniband/hw/efa/efa_ah_cache.c | 163 +++++++++++++++++++++++
> > >  drivers/infiniband/hw/efa/efa_ah_cache.h |  42 ++++++
> > >  drivers/infiniband/hw/efa/efa_com.c      |  12 +-
> > >  drivers/infiniband/hw/efa/efa_com.h      |   3 +
> > >  drivers/infiniband/hw/efa/efa_com_cmd.c  |  73 +++++++---
> > >  drivers/infiniband/hw/efa/efa_com_cmd.h  |   1 +
> > >  drivers/infiniband/hw/efa/efa_verbs.c    |   9 +-
> > >  8 files changed, 281 insertions(+), 26 deletions(-)
> > >  create mode 100644 drivers/infiniband/hw/efa/efa_ah_cache.c
> > >  create mode 100644 drivers/infiniband/hw/efa/efa_ah_cache.h
> > > 
> > > -- 
> > > 2.50.1
> > >
> > 
> > Hi, kind reminder.
> > This series blocks merging EFAv4 device ID and we would want it to land
> > for the next merge window if possible.
> 
> It is not possible.
> 
> The use of entry->lock together with a refcount and an "initialize"
> flag suggests that the refcount is not being used correctly.
> 
> I would expect a single ah_cache lock, with the refcount tracking the
> number of users of the entry.
> 
> Thanks.

A global AH cache lock would serialize all AH commands for any PD-GID
combination, including the ones that go to the device.
The per-entry mutex allows different entries to issue device commands in
parallel while only serializing operations on the same entry.

The initialized flag is needed because the entry must exist in the
hashtable before the device command completes, so concurrent threads
targeting the same PD-GID find it and wait on the per-entry mutex.

I am open to simplifying to a single globlal lock if you prefer, but it
comes at the performance cost of serializing all AH commands.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-06-16 19:32 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-08  7:16 [PATCH for-next v4 0/2] RDMA/efa: Add AH cache for AH reuse Yonatan Nachum
2026-06-08  7:16 ` [PATCH for-next v4 1/2] RDMA/efa: Add initialization of AH cache rhashtable Yonatan Nachum
2026-06-08  7:16 ` [PATCH for-next v4 2/2] RDMA/efa: Add AH cache handling on create and destroy AH Yonatan Nachum
2026-06-14  7:12 ` [PATCH for-next v4 0/2] RDMA/efa: Add AH cache for AH reuse Yonatan Nachum
2026-06-16 17:50   ` Leon Romanovsky
2026-06-16 19:31     ` Yonatan Nachum

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox