public inbox for linux-rdma@vger.kernel.org
* [PATCH rdma-next 0/5] IB service resolution
@ 2025-06-30 10:52 Leon Romanovsky
  2025-06-30 10:52 ` [PATCH rdma-next 1/5] RDMA/sa_query: Add RMPP support for SA queries Leon Romanovsky
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Leon Romanovsky @ 2025-06-30 10:52 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, Mark Zhang, Or Har-Toov,
	Vlad Dumitrescu

From: Leon Romanovsky <leonro@nvidia.com>

From Mark,

This patchset adds support for resolving IB service records from the
SM. With this feature, user-space can resolve and query IB address
information, such as the DGID and PKey, based on an IB service name or
ID, through new ucma commands.

New CM states:
 * RDMA_CM_ADDRINFO_QUERY - Indicates the CM is in the process of IB
   service resolution;
 * RDMA_CM_ADDRINFO_RESOLVED - Indicates the CM has finished the
   process of IB service resolution.

New CM events:
 * RDMA_CM_EVENT_ADDRINFO_RESOLVED - Indicates successful resolution of
   address information.
 * RDMA_CM_EVENT_ADDRINFO_ERROR - Indicates a failure in address
   information resolution.

The patchset also enables writing custom events into the CM. This is
particularly useful for applications that require events not typically
generated in the standard rdmacm flow. Two new event types are
supported:

 * RDMA_CM_EVENT_USER - A user-defined event, where the event details
   are specified by the application and not interpreted by the librdmacm
   library.
 * RDMA_CM_EVENT_INTERNAL - An internal event, used and consumed exclusively
   by the librdmacm library.

For instance, the new user-space API rdma_resolve_addrinfo() will
support both SA (Subnet Administration) and DNS resolution. In the DNS case,
since there is no standard CM event generated upon completion, an
RDMA_CM_EVENT_INTERNAL event with "ADDRINFO_RESOLVED" information as the
parameter can be written into the CM. This allows the librdmacm library
to receive the event and report an ADDRINFO_RESOLVED event to the user,
ensuring that DNS resolution follows the same workflow as IB service
record resolution.
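
The split between user and internal events described above can be
sketched as a tiny dispatcher (purely illustrative; the enum names
mirror the events proposed here and the payload handling is
simplified, this is not an existing librdmacm API):

```c
#include <stdbool.h>

enum cm_event {
	CM_EVENT_ADDRINFO_RESOLVED,
	CM_EVENT_ADDRINFO_ERROR,
	CM_EVENT_USER,      /* opaque to the library, delivered as-is */
	CM_EVENT_INTERNAL,  /* consumed by the library, never delivered raw */
};

/* Returns true if an event should be reported to the application.
 * An INTERNAL event carrying "ADDRINFO_RESOLVED" information (e.g. a
 * DNS completion) is translated into the public ADDRINFO_RESOLVED
 * event, so DNS resolution follows the same reporting path as IB
 * service record resolution. */
static bool dispatch(enum cm_event in, enum cm_event *out)
{
	switch (in) {
	case CM_EVENT_INTERNAL:
		*out = CM_EVENT_ADDRINFO_RESOLVED;
		return true;
	case CM_EVENT_USER:
	case CM_EVENT_ADDRINFO_RESOLVED:
	case CM_EVENT_ADDRINFO_ERROR:
		*out = in;
		return true;
	}
	return false;
}
```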

Thanks

Mark Zhang (5):
  RDMA/sa_query: Add RMPP support for SA queries
  RDMA/sa_query: Support IB service records resolution
  RDMA/cma: Support IB service record resolution
  RDMA/ucma: Support query resolved service records
  RDMA/ucma: Support write an event into a CM

 drivers/infiniband/core/cma.c      | 136 +++++++++++++-
 drivers/infiniband/core/cma_priv.h |   4 +-
 drivers/infiniband/core/sa_query.c | 277 +++++++++++++++++++++++++++--
 drivers/infiniband/core/ucma.c     | 120 ++++++++++++-
 include/rdma/ib_mad.h              |   1 +
 include/rdma/ib_sa.h               |  37 ++++
 include/rdma/rdma_cm.h             |  21 ++-
 include/uapi/rdma/ib_user_sa.h     |  14 ++
 include/uapi/rdma/rdma_user_cm.h   |  42 ++++-
 9 files changed, 634 insertions(+), 18 deletions(-)

-- 
2.50.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH rdma-next 1/5] RDMA/sa_query: Add RMPP support for SA queries
  2025-06-30 10:52 [PATCH rdma-next 0/5] IB service resolution Leon Romanovsky
@ 2025-06-30 10:52 ` Leon Romanovsky
  2025-06-30 10:52 ` [PATCH rdma-next 2/5] RDMA/sa_query: Support IB service records resolution Leon Romanovsky
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2025-06-30 10:52 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Mark Zhang, linux-rdma, Or Har-Toov, Vlad Dumitrescu

From: Mark Zhang <markzhang@nvidia.com>

Register the GSI MAD agent with RMPP support and add an rmpp_callback
for SA queries. This is needed for returning more than one service
record in a single query.
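
The reassembly an RMPP receiver has to do can be sketched in plain C:
fixed-size records arrive split across segment payloads, and a small
staging buffer stitches records that straddle a segment boundary (a
userspace illustration of the idea, not the kernel code; REC_SZ and
the segment length are illustrative):

```c
#include <stddef.h>
#include <string.h>

#define REC_SZ 176	/* size of one packed record (illustrative) */

/* Copy fixed-size records out of a list of segment payloads, using a
 * staging buffer for records that straddle a segment boundary.
 * Returns the number of complete records written to 'out'. */
static size_t reassemble(const unsigned char *segs[], size_t nsegs,
			 size_t seg_len, unsigned char *out,
			 size_t max_recs)
{
	unsigned char buf[REC_SZ];
	size_t buf_i = 0, rec_i = 0;

	for (size_t s = 0; s < nsegs && rec_i < max_recs; s++) {
		size_t data_i = 0;

		while (data_i < seg_len && rec_i < max_recs) {
			size_t cp = REC_SZ - buf_i;

			if (cp > seg_len - data_i)
				cp = seg_len - data_i;
			memcpy(buf + buf_i, segs[s] + data_i, cp);
			data_i += cp;
			buf_i += cp;
			if (buf_i == REC_SZ) {	/* one full record staged */
				memcpy(out + rec_i * REC_SZ, buf, REC_SZ);
				buf_i = 0;
				rec_i++;
			}
		}
	}
	return rec_i;
}
```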

Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/core/sa_query.c | 39 +++++++++++++++++++++---------
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 53571e6b3162..770e9f18349b 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -107,6 +107,8 @@ struct ib_sa_device {
 struct ib_sa_query {
 	void (*callback)(struct ib_sa_query *sa_query, int status,
 			 struct ib_sa_mad *mad);
+	void (*rmpp_callback)(struct ib_sa_query *sa_query, int status,
+			      struct ib_mad_recv_wc *mad);
 	void (*release)(struct ib_sa_query *);
 	struct ib_sa_client    *client;
 	struct ib_sa_port      *port;
@@ -1987,23 +1989,29 @@ static void send_handler(struct ib_mad_agent *agent,
 {
 	struct ib_sa_query *query = mad_send_wc->send_buf->context[0];
 	unsigned long flags;
+	int status = 0;
 
-	if (query->callback)
+	if (query->callback || query->rmpp_callback) {
 		switch (mad_send_wc->status) {
 		case IB_WC_SUCCESS:
 			/* No callback -- already got recv */
 			break;
 		case IB_WC_RESP_TIMEOUT_ERR:
-			query->callback(query, -ETIMEDOUT, NULL);
+			status = -ETIMEDOUT;
 			break;
 		case IB_WC_WR_FLUSH_ERR:
-			query->callback(query, -EINTR, NULL);
+			status = -EINTR;
 			break;
 		default:
-			query->callback(query, -EIO, NULL);
+			status = -EIO;
 			break;
 		}
 
+		if (status)
+			query->callback ? query->callback(query, status, NULL) :
+				query->rmpp_callback(query, status, NULL);
+	}
+
 	xa_lock_irqsave(&queries, flags);
 	__xa_erase(&queries, query->id);
 	xa_unlock_irqrestore(&queries, flags);
@@ -2019,17 +2027,25 @@ static void recv_handler(struct ib_mad_agent *mad_agent,
 			 struct ib_mad_recv_wc *mad_recv_wc)
 {
 	struct ib_sa_query *query;
+	struct ib_mad *mad;
+
 
 	if (!send_buf)
 		return;
 
 	query = send_buf->context[0];
-	if (query->callback) {
+	mad = mad_recv_wc->recv_buf.mad;
+
+	if (query->rmpp_callback) {
+		if (mad_recv_wc->wc->status == IB_WC_SUCCESS)
+			query->rmpp_callback(query, mad->mad_hdr.status ?
+					     -EINVAL : 0, mad_recv_wc);
+		else
+			query->rmpp_callback(query, -EIO, NULL);
+	} else if (query->callback) {
 		if (mad_recv_wc->wc->status == IB_WC_SUCCESS)
-			query->callback(query,
-					mad_recv_wc->recv_buf.mad->mad_hdr.status ?
-					-EINVAL : 0,
-					(struct ib_sa_mad *) mad_recv_wc->recv_buf.mad);
+			query->callback(query, mad->mad_hdr.status ?
+					-EINVAL : 0, (struct ib_sa_mad *)mad);
 		else
 			query->callback(query, -EIO, NULL);
 	}
@@ -2181,8 +2197,9 @@ static int ib_sa_add_one(struct ib_device *device)
 
 		sa_dev->port[i].agent =
 			ib_register_mad_agent(device, i + s, IB_QPT_GSI,
-					      NULL, 0, send_handler,
-					      recv_handler, sa_dev, 0);
+					      NULL, IB_MGMT_RMPP_VERSION,
+					      send_handler, recv_handler,
+					      sa_dev, 0);
 		if (IS_ERR(sa_dev->port[i].agent)) {
 			ret = PTR_ERR(sa_dev->port[i].agent);
 			goto err;
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH rdma-next 2/5] RDMA/sa_query: Support IB service records resolution
  2025-06-30 10:52 [PATCH rdma-next 0/5] IB service resolution Leon Romanovsky
  2025-06-30 10:52 ` [PATCH rdma-next 1/5] RDMA/sa_query: Add RMPP support for SA queries Leon Romanovsky
@ 2025-06-30 10:52 ` Leon Romanovsky
  2025-06-30 10:52 ` [PATCH rdma-next 3/5] RDMA/cma: Support IB service record resolution Leon Romanovsky
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2025-06-30 10:52 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Mark Zhang, linux-rdma, Or Har-Toov, Vlad Dumitrescu

From: Mark Zhang <markzhang@nvidia.com>

Add an SA query API ib_sa_service_rec_get() to support building and
sending SA query MADs that ask for service records with a specific
name or ID, and receiving and parsing responses from the SM.
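
The field offsets in the new service_rec_table can be cross-checked
against the wire layout with a packed struct (an illustration only; the
kernel packs via ib_pack(), not a struct, and byte order is not
modeled here — the expected offsets are offset_words * 4 from the
table):

```c
#include <stddef.h>
#include <stdint.h>

/* Wire layout implied by service_rec_table: 44 words = 176 bytes. */
struct wire_service_rec {
	uint64_t id;		/* word 0  */
	uint8_t  gid[16];	/* word 2  */
	uint16_t pkey;		/* word 6  */
	uint8_t  rsvd[2];	/* reserved */
	uint32_t lease;		/* word 7  */
	uint8_t  key[16];	/* word 8  */
	uint8_t  name[64];	/* word 12 */
	uint8_t  data_8[16];	/* word 28 */
	uint16_t data_16[8];	/* word 32 */
	uint32_t data_32[4];	/* word 36 */
	uint64_t data_64[2];	/* word 40 */
} __attribute__((packed));
```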

Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/core/sa_query.c | 238 +++++++++++++++++++++++++++++
 include/rdma/ib_mad.h              |   1 +
 include/rdma/ib_sa.h               |  37 +++++
 3 files changed, 276 insertions(+)

diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 770e9f18349b..c0a7af1b4fe4 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -152,6 +152,13 @@ struct ib_sa_mcmember_query {
 	struct ib_sa_query sa_query;
 };
 
+struct ib_sa_service_query {
+	void (*callback)(int status, struct sa_service_rec *rec,
+			 unsigned int num_services, void *context);
+	void *context;
+	struct ib_sa_query sa_query;
+};
+
 static LIST_HEAD(ib_nl_request_list);
 static DEFINE_SPINLOCK(ib_nl_request_lock);
 static atomic_t ib_nl_sa_request_seq;
@@ -686,6 +693,58 @@ static const struct ib_field guidinfo_rec_table[] = {
 	  .size_bits    = 512 },
 };
 
+#define SERVICE_REC_FIELD(field) \
+	.struct_offset_bytes = offsetof(struct sa_service_rec, field),     \
+	.struct_size_bytes   = sizeof_field(struct sa_service_rec, field), \
+	.field_name          = "sa_service_rec:" #field
+
+static const struct ib_field service_rec_table[] = {
+	{ SERVICE_REC_FIELD(id),
+	  .offset_words = 0,
+	  .offset_bits  = 0,
+	  .size_bits    = 64 },
+	{ SERVICE_REC_FIELD(gid),
+	  .offset_words = 2,
+	  .offset_bits  = 0,
+	  .size_bits    = 128 },
+	{ SERVICE_REC_FIELD(pkey),
+	  .offset_words = 6,
+	  .offset_bits  = 0,
+	  .size_bits    = 16 },
+	{ RESERVED,
+	  .offset_words = 6,
+	  .offset_bits  = 16,
+	  .size_bits    = 16 },
+	{ SERVICE_REC_FIELD(lease),
+	  .offset_words = 7,
+	  .offset_bits  = 0,
+	  .size_bits    = 32 },
+	{ SERVICE_REC_FIELD(key),
+	  .offset_words = 8,
+	  .offset_bits  = 0,
+	  .size_bits    = 128 },
+	{ SERVICE_REC_FIELD(name),
+	  .offset_words = 12,
+	  .offset_bits  = 0,
+	  .size_bits    = 512 },
+	{ SERVICE_REC_FIELD(data_8),
+	  .offset_words = 28,
+	  .offset_bits  = 0,
+	  .size_bits    = 128 },
+	{ SERVICE_REC_FIELD(data_16),
+	  .offset_words = 32,
+	  .offset_bits  = 0,
+	  .size_bits    = 128 },
+	{ SERVICE_REC_FIELD(data_32),
+	  .offset_words = 36,
+	  .offset_bits  = 0,
+	  .size_bits    = 128 },
+	{ SERVICE_REC_FIELD(data_64),
+	  .offset_words = 40,
+	  .offset_bits  = 0,
+	  .size_bits    = 128 },
+};
+
 #define RDMA_PRIMARY_PATH_MAX_REC_NUM 3
 
 static inline void ib_sa_disable_local_svc(struct ib_sa_query *query)
@@ -1392,6 +1451,20 @@ void ib_sa_pack_path(struct sa_path_rec *rec, void *attribute)
 }
 EXPORT_SYMBOL(ib_sa_pack_path);
 
+void ib_sa_pack_service(struct sa_service_rec *rec, void *attribute)
+{
+	ib_pack(service_rec_table, ARRAY_SIZE(service_rec_table), rec,
+		attribute);
+}
+EXPORT_SYMBOL(ib_sa_pack_service);
+
+void ib_sa_unpack_service(void *attribute, struct sa_service_rec *rec)
+{
+	ib_unpack(service_rec_table, ARRAY_SIZE(service_rec_table), attribute,
+		  rec);
+}
+EXPORT_SYMBOL(ib_sa_unpack_service);
+
 static bool ib_sa_opa_pathrecord_support(struct ib_sa_client *client,
 					 struct ib_sa_device *sa_dev,
 					 u32 port_num)
@@ -1481,6 +1554,68 @@ static void ib_sa_path_rec_callback(struct ib_sa_query *sa_query,
 	}
 }
 
+#define IB_SA_DATA_OFFS 56
+#define IB_SERVICE_REC_SZ 176
+
+static void ib_unpack_service_rmpp(struct sa_service_rec *rec,
+				   struct ib_mad_recv_wc *mad_wc,
+				   int num_services)
+{
+	unsigned int cp_sz, data_i, data_size, rec_i = 0, buf_i = 0;
+	struct ib_mad_recv_buf *mad_buf;
+	u8 buf[IB_SERVICE_REC_SZ];
+	u8 *data;
+
+	data_size = sizeof(((struct ib_sa_mad *) mad_buf->mad)->data);
+
+	list_for_each_entry(mad_buf, &mad_wc->rmpp_list, list) {
+		data = ((struct ib_sa_mad *) mad_buf->mad)->data;
+		data_i = 0;
+		while (data_i < data_size && rec_i < num_services) {
+			cp_sz = min(IB_SERVICE_REC_SZ - buf_i,
+				    data_size - data_i);
+			memcpy(buf + buf_i, data + data_i, cp_sz);
+			data_i += cp_sz;
+			buf_i += cp_sz;
+			if (buf_i == IB_SERVICE_REC_SZ) {
+				ib_sa_unpack_service(buf, rec + rec_i);
+				buf_i = 0;
+				rec_i++;
+			}
+		}
+	}
+}
+
+static void ib_sa_service_rec_callback(struct ib_sa_query *sa_query, int status,
+				       struct ib_mad_recv_wc *mad_wc)
+{
+	struct ib_sa_service_query *query =
+		container_of(sa_query, struct ib_sa_service_query, sa_query);
+	struct sa_service_rec *rec;
+	int num_services;
+
+	if (!mad_wc || !mad_wc->recv_buf.mad) {
+		query->callback(status, NULL, 0, query->context);
+		return;
+	}
+
+	num_services = (mad_wc->mad_len - IB_SA_DATA_OFFS) / IB_SERVICE_REC_SZ;
+	if (!num_services) {
+		query->callback(-ENODATA, NULL, 0, query->context);
+		return;
+	}
+
+	rec = kmalloc_array(num_services, sizeof(*rec), GFP_KERNEL);
+	if (!rec) {
+		query->callback(-ENOMEM, NULL, 0, query->context);
+		return;
+	}
+
+	ib_unpack_service_rmpp(rec, mad_wc, num_services);
+	query->callback(status, rec, num_services, query->context);
+	kfree(rec);
+}
+
 static void ib_sa_path_rec_release(struct ib_sa_query *sa_query)
 {
 	struct ib_sa_path_query *query =
@@ -1490,6 +1625,14 @@ static void ib_sa_path_rec_release(struct ib_sa_query *sa_query)
 	kfree(query);
 }
 
+static void ib_sa_service_rec_release(struct ib_sa_query *sa_query)
+{
+	struct ib_sa_service_query *query =
+		container_of(sa_query, struct ib_sa_service_query, sa_query);
+
+	kfree(query);
+}
+
 /**
  * ib_sa_path_rec_get - Start a Path get query
  * @client:SA client
@@ -1620,6 +1763,101 @@ int ib_sa_path_rec_get(struct ib_sa_client *client,
 }
 EXPORT_SYMBOL(ib_sa_path_rec_get);
 
+/**
+ * ib_sa_service_rec_get - Start a Service get query
+ * @client: SA client
+ * @device: device to send query on
+ * @port_num: port number to send query on
+ * @rec: Service Record to send in query
+ * @comp_mask: component mask to send in query
+ * @timeout_ms: time to wait for response
+ * @gfp_mask: GFP mask to use for internal allocations
+ * @callback: function called when query completes, times out or is
+ * canceled
+ * @context: opaque user context passed to callback
+ * @sa_query: query context, used to cancel query
+ *
+ * Send a Service Record Get query to the SA to look up service
+ * records.  The callback function will be called when the query
+ * completes (or fails); status is 0 for a successful response,
+ * -EINTR if the query is canceled, -ETIMEDOUT if the query timed
+ * out, or -EIO if an error occurred sending the query.  The resp
+ * parameter of the callback is only valid if status is 0.
+ *
+ * If the return value of ib_sa_service_rec_get() is negative, it is an
+ * error code. Otherwise it is a query ID that can be used to cancel
+ * the query.
+ */
+int ib_sa_service_rec_get(struct ib_sa_client *client,
+			  struct ib_device *device, u32 port_num,
+			  struct sa_service_rec *rec,
+			  ib_sa_comp_mask comp_mask,
+			  unsigned long timeout_ms, gfp_t gfp_mask,
+			  void (*callback)(int status,
+					   struct sa_service_rec *resp,
+					   unsigned int num_services,
+					   void *context),
+			  void *context, struct ib_sa_query **sa_query)
+{
+	struct ib_sa_device *sa_dev = ib_get_client_data(device, &sa_client);
+	struct ib_sa_service_query *query;
+	struct ib_mad_agent *agent;
+	struct ib_sa_port   *port;
+	struct ib_sa_mad *mad;
+	int ret;
+
+	if (!sa_dev)
+		return -ENODEV;
+
+	port = &sa_dev->port[port_num - sa_dev->start_port];
+	agent = port->agent;
+
+	query = kzalloc(sizeof(*query), gfp_mask);
+	if (!query)
+		return -ENOMEM;
+
+	query->sa_query.port = port;
+
+	ret = alloc_mad(&query->sa_query, gfp_mask);
+	if (ret)
+		goto err1;
+
+	ib_sa_client_get(client);
+	query->sa_query.client = client;
+	query->callback        = callback;
+	query->context         = context;
+
+	mad = query->sa_query.mad_buf->mad;
+	init_mad(&query->sa_query, agent);
+
+	query->sa_query.rmpp_callback = callback ? ib_sa_service_rec_callback :
+		NULL;
+	query->sa_query.release = ib_sa_service_rec_release;
+	mad->mad_hdr.method	= IB_MGMT_METHOD_GET_TABLE;
+	mad->mad_hdr.attr_id	= cpu_to_be16(IB_SA_ATTR_SERVICE_REC);
+	mad->sa_hdr.comp_mask	= comp_mask;
+
+	ib_sa_pack_service(rec, mad->data);
+
+	*sa_query = &query->sa_query;
+	query->sa_query.mad_buf->context[1] = rec;
+
+	ret = send_mad(&query->sa_query, timeout_ms, gfp_mask);
+	if (ret < 0)
+		goto err2;
+
+	return ret;
+
+err2:
+	*sa_query = NULL;
+	ib_sa_client_put(query->sa_query.client);
+	free_mad(&query->sa_query);
+err1:
+	kfree(query);
+	return ret;
+}
+EXPORT_SYMBOL(ib_sa_service_rec_get);
+
 static void ib_sa_mcmember_rec_callback(struct ib_sa_query *sa_query,
 					int status, struct ib_sa_mad *mad)
 {
diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h
index 3f1b58d8b4bf..8bd0e1eb393b 100644
--- a/include/rdma/ib_mad.h
+++ b/include/rdma/ib_mad.h
@@ -48,6 +48,7 @@
 #define IB_MGMT_METHOD_REPORT			0x06
 #define IB_MGMT_METHOD_REPORT_RESP		0x86
 #define IB_MGMT_METHOD_TRAP_REPRESS		0x07
+#define IB_MGMT_METHOD_GET_TABLE		0x12
 
 #define IB_MGMT_METHOD_RESP			0x80
 #define IB_BM_ATTR_MOD_RESP			cpu_to_be32(1)
diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
index b46353fc53bf..95e8924ad563 100644
--- a/include/rdma/ib_sa.h
+++ b/include/rdma/ib_sa.h
@@ -189,6 +189,20 @@ struct sa_path_rec {
 	u32 flags;
 };
 
+struct sa_service_rec {
+	__be64 id;
+	__u8 gid[16];
+	__be16 pkey;
+	__u8 reserved[2];
+	__be32 lease;
+	__u8 key[16];
+	__u8 name[64];
+	__u8 data_8[16];
+	__be16 data_16[8];
+	__be32 data_32[4];
+	__be64 data_64[2];
+};
+
 static inline enum ib_gid_type
 		sa_conv_pathrec_to_gid_type(struct sa_path_rec *rec)
 {
@@ -417,6 +431,17 @@ int ib_sa_path_rec_get(struct ib_sa_client *client, struct ib_device *device,
 					unsigned int num_prs, void *context),
 		       void *context, struct ib_sa_query **query);
 
+int ib_sa_service_rec_get(struct ib_sa_client *client,
+			  struct ib_device *device, u32 port_num,
+			  struct sa_service_rec *rec,
+			  ib_sa_comp_mask comp_mask,
+			  unsigned long timeout_ms, gfp_t gfp_mask,
+			  void (*callback)(int status,
+					   struct sa_service_rec *resp,
+					   unsigned int num_services,
+					   void *context),
+			  void *context, struct ib_sa_query **sa_query);
+
 struct ib_sa_multicast {
 	struct ib_sa_mcmember_rec rec;
 	ib_sa_comp_mask		comp_mask;
@@ -508,6 +533,18 @@ int ib_init_ah_attr_from_path(struct ib_device *device, u32 port_num,
  */
 void ib_sa_pack_path(struct sa_path_rec *rec, void *attribute);
 
+/**
+ * ib_sa_pack_service - Convert a service record from struct
+ * sa_service_rec to IB MAD wire format.
+ */
+void ib_sa_pack_service(struct sa_service_rec *rec, void *attribute);
+
+/**
+ * ib_sa_unpack_service - Convert a service record from MAD format to
+ * struct sa_service_rec.
+ */
+void ib_sa_unpack_service(void *attribute, struct sa_service_rec *rec);
+
 /**
  * ib_sa_unpack_path - Convert a path record from MAD format to struct
  * ib_sa_path_rec.
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH rdma-next 3/5] RDMA/cma: Support IB service record resolution
  2025-06-30 10:52 [PATCH rdma-next 0/5] IB service resolution Leon Romanovsky
  2025-06-30 10:52 ` [PATCH rdma-next 1/5] RDMA/sa_query: Add RMPP support for SA queries Leon Romanovsky
  2025-06-30 10:52 ` [PATCH rdma-next 2/5] RDMA/sa_query: Support IB service records resolution Leon Romanovsky
@ 2025-06-30 10:52 ` Leon Romanovsky
  2025-06-30 10:52 ` [PATCH rdma-next 4/5] RDMA/ucma: Support query resolved service records Leon Romanovsky
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2025-06-30 10:52 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Mark Zhang, linux-rdma, Or Har-Toov, Vlad Dumitrescu

From: Mark Zhang <markzhang@nvidia.com>

Add a new UCMA command and the corresponding CMA implementation.
Userspace can send this command to request service resolution by
service name or ID.

On successful resolution, one or more service records are returned;
by default, the first one is used as the destination address.

Two new CM events are added and returned to the caller accordingly:
  - RDMA_CM_EVENT_ADDRINFO_RESOLVED: Resolution succeeded;
  - RDMA_CM_EVENT_ADDRINFO_ERROR: Resolution failed.

Internally, two new CM states are added:
  - RDMA_CM_ADDRINFO_QUERY: The CM is in the process of IB service
    resolution;
  - RDMA_CM_ADDRINFO_RESOLVED: The CM has finished the resolution
    process.

With these new states, besides the existing state transitions, two new
flows are supported:
 1. The default address is used:
    RDMA_CM_ADDR_BOUND ->
      RDMA_CM_ADDRINFO_QUERY ->
        RDMA_CM_ADDRINFO_RESOLVED ->
          RDMA_CM_ROUTE_QUERY

 2. To use a different address:
    RDMA_CM_ADDR_BOUND ->
      RDMA_CM_ADDRINFO_QUERY->
        RDMA_CM_ADDRINFO_RESOLVED ->
          RDMA_CM_ADDR_QUERY ->
            RDMA_CM_ADDR_RESOLVED ->
              RDMA_CM_ROUTE_QUERY

In the second case, where the resolution returns multiple records, the
user can call rdma_resolve_addr() with a record other than the first.
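
The guarded transitions above can be modeled as a compare-and-exchange
on the id's state (a userspace sketch of what cma_comp_exch() does;
the kernel performs this under a spinlock, and the enum names are
abbreviated here):

```c
#include <stdbool.h>

enum cm_state {
	CM_ADDR_BOUND,
	CM_ADDRINFO_QUERY,
	CM_ADDRINFO_RESOLVED,
	CM_ADDR_QUERY,
	CM_ADDR_RESOLVED,
	CM_ROUTE_QUERY,
};

/* Move to 'next' only if the id is currently in 'want'. */
static bool comp_exch(enum cm_state *cur, enum cm_state want,
		      enum cm_state next)
{
	if (*cur != want)
		return false;
	*cur = next;
	return true;
}

/* rdma_resolve_route() now accepts either ADDR_RESOLVED or
 * ADDRINFO_RESOLVED as the starting state. */
static bool start_route_query(enum cm_state *cur)
{
	return comp_exch(cur, CM_ADDR_RESOLVED, CM_ROUTE_QUERY) ||
	       comp_exch(cur, CM_ADDRINFO_RESOLVED, CM_ROUTE_QUERY);
}
```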

Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/core/cma.c      | 136 ++++++++++++++++++++++++++++-
 drivers/infiniband/core/cma_priv.h |   4 +-
 drivers/infiniband/core/ucma.c     |  30 ++++++-
 include/rdma/rdma_cm.h             |  18 +++-
 include/uapi/rdma/rdma_user_cm.h   |  20 ++++-
 5 files changed, 202 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 9b471548e7ae..5b2d3ae3f9fc 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2076,6 +2076,7 @@ static void _destroy_id(struct rdma_id_private *id_priv,
 	kfree(id_priv->id.route.path_rec);
 	kfree(id_priv->id.route.path_rec_inbound);
 	kfree(id_priv->id.route.path_rec_outbound);
+	kfree(id_priv->id.route.service_recs);
 
 	put_net(id_priv->id.route.addr.dev_addr.net);
 	kfree(id_priv);
@@ -3382,13 +3383,18 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv)
 int rdma_resolve_route(struct rdma_cm_id *id, unsigned long timeout_ms)
 {
 	struct rdma_id_private *id_priv;
+	enum rdma_cm_state state;
 	int ret;
 
 	if (!timeout_ms)
 		return -EINVAL;
 
 	id_priv = container_of(id, struct rdma_id_private, id);
-	if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_RESOLVED, RDMA_CM_ROUTE_QUERY))
+	state = id_priv->state;
+	if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_RESOLVED,
+			   RDMA_CM_ROUTE_QUERY) &&
+	    !cma_comp_exch(id_priv, RDMA_CM_ADDRINFO_RESOLVED,
+			   RDMA_CM_ROUTE_QUERY))
 		return -EINVAL;
 
 	cma_id_get(id_priv);
@@ -3409,7 +3415,7 @@ int rdma_resolve_route(struct rdma_cm_id *id, unsigned long timeout_ms)
 
 	return 0;
 err:
-	cma_comp_exch(id_priv, RDMA_CM_ROUTE_QUERY, RDMA_CM_ADDR_RESOLVED);
+	cma_comp_exch(id_priv, RDMA_CM_ROUTE_QUERY, state);
 	cma_id_put(id_priv);
 	return ret;
 }
@@ -5506,3 +5512,129 @@ static void __exit cma_cleanup(void)
 
 module_init(cma_init);
 module_exit(cma_cleanup);
+
+static void cma_query_ib_service_handler(int status,
+					 struct sa_service_rec *recs,
+					 unsigned int num_recs, void *context)
+{
+	struct cma_work *work = context;
+	struct rdma_id_private *id_priv = work->id;
+	struct sockaddr_ib *addr;
+
+	if (status)
+		goto fail;
+
+	if (!num_recs) {
+		status = -ENOENT;
+		goto fail;
+	}
+
+	if (id_priv->id.route.service_recs) {
+		status = -EALREADY;
+		goto fail;
+	}
+
+	id_priv->id.route.service_recs =
+		kmalloc_array(num_recs, sizeof(*recs), GFP_KERNEL);
+	if (!id_priv->id.route.service_recs) {
+		status = -ENOMEM;
+		goto fail;
+	}
+
+	id_priv->id.route.num_service_recs = num_recs;
+	memcpy(id_priv->id.route.service_recs, recs, sizeof(*recs) * num_recs);
+
+	addr = (struct sockaddr_ib *)&id_priv->id.route.addr.dst_addr;
+	addr->sib_family = AF_IB;
+	addr->sib_addr = *(struct ib_addr *)&recs->gid;
+	addr->sib_pkey = recs->pkey;
+	addr->sib_sid = recs->id;
+	rdma_addr_set_dgid(&id_priv->id.route.addr.dev_addr,
+			   (union ib_gid *)&addr->sib_addr);
+	ib_addr_set_pkey(&id_priv->id.route.addr.dev_addr,
+			 ntohs(addr->sib_pkey));
+
+	queue_work(cma_wq, &work->work);
+	return;
+
+fail:
+	work->old_state = RDMA_CM_ADDRINFO_QUERY;
+	work->new_state = RDMA_CM_ADDR_BOUND;
+	work->event.event = RDMA_CM_EVENT_ADDRINFO_ERROR;
+	work->event.status = status;
+	pr_debug_ratelimited(
+		"RDMA CM: SERVICE_ERROR: failed to query service record. status %d\n",
+		status);
+	queue_work(cma_wq, &work->work);
+}
+
+static int cma_resolve_ib_service(struct rdma_id_private *id_priv,
+				  struct rdma_ucm_ib_service *ibs)
+{
+	struct sa_service_rec sr = {};
+	ib_sa_comp_mask mask = 0;
+	struct cma_work *work;
+
+	work = kzalloc(sizeof(*work), GFP_KERNEL);
+	if (!work)
+		return -ENOMEM;
+
+	cma_id_get(id_priv);
+
+	work->id = id_priv;
+	INIT_WORK(&work->work, cma_work_handler);
+	work->old_state = RDMA_CM_ADDRINFO_QUERY;
+	work->new_state = RDMA_CM_ADDRINFO_RESOLVED;
+	work->event.event = RDMA_CM_EVENT_ADDRINFO_RESOLVED;
+
+	if (ibs->flags & RDMA_USER_CM_IB_SERVICE_FLAG_ID) {
+		sr.id = cpu_to_be64(ibs->service_id);
+		mask |= IB_SA_SERVICE_REC_SERVICE_ID;
+	}
+	if (ibs->flags & RDMA_USER_CM_IB_SERVICE_FLAG_NAME) {
+		strscpy(sr.name, ibs->service_name, sizeof(sr.name));
+		mask |= IB_SA_SERVICE_REC_SERVICE_NAME;
+	}
+
+	id_priv->query_id = ib_sa_service_rec_get(&sa_client,
+						  id_priv->id.device,
+						  id_priv->id.port_num,
+						  &sr, mask,
+						  2000, GFP_KERNEL,
+						  cma_query_ib_service_handler,
+						  work, &id_priv->query);
+
+	if (id_priv->query_id < 0) {
+		cma_id_put(id_priv);
+		kfree(work);
+		return id_priv->query_id;
+	}
+
+	return 0;
+}
+
+int rdma_resolve_ib_service(struct rdma_cm_id *id,
+			    struct rdma_ucm_ib_service *ibs)
+{
+	struct rdma_id_private *id_priv;
+	int ret;
+
+	id_priv = container_of(id, struct rdma_id_private, id);
+	if (!id_priv->cma_dev ||
+	    !cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND, RDMA_CM_ADDRINFO_QUERY))
+		return -EINVAL;
+
+	if (rdma_cap_ib_sa(id->device, id->port_num))
+		ret = cma_resolve_ib_service(id_priv, ibs);
+	else
+		ret = -EOPNOTSUPP;
+
+	if (ret)
+		goto err;
+
+	return 0;
+err:
+	cma_comp_exch(id_priv, RDMA_CM_ADDRINFO_QUERY, RDMA_CM_ADDR_BOUND);
+	return ret;
+}
+EXPORT_SYMBOL(rdma_resolve_ib_service);
diff --git a/drivers/infiniband/core/cma_priv.h b/drivers/infiniband/core/cma_priv.h
index b7354c94cf1b..c604b601f4d9 100644
--- a/drivers/infiniband/core/cma_priv.h
+++ b/drivers/infiniband/core/cma_priv.h
@@ -47,7 +47,9 @@ enum rdma_cm_state {
 	RDMA_CM_ADDR_BOUND,
 	RDMA_CM_LISTEN,
 	RDMA_CM_DEVICE_REMOVAL,
-	RDMA_CM_DESTROYING
+	RDMA_CM_DESTROYING,
+	RDMA_CM_ADDRINFO_QUERY,
+	RDMA_CM_ADDRINFO_RESOLVED
 };
 
 struct rdma_id_private {
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 6e700b974033..1915f4e68308 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -282,6 +282,10 @@ static struct ucma_event *ucma_create_uevent(struct ucma_context *ctx,
 	}
 	uevent->resp.event = event->event;
 	uevent->resp.status = event->status;
+
+	if (event->event == RDMA_CM_EVENT_ADDRINFO_RESOLVED)
+		goto out;
+
 	if (ctx->cm_id->qp_type == IB_QPT_UD)
 		ucma_copy_ud_event(ctx->cm_id->device, &uevent->resp.param.ud,
 				   &event->param.ud);
@@ -289,6 +293,7 @@ static struct ucma_event *ucma_create_uevent(struct ucma_context *ctx,
 		ucma_copy_conn_event(&uevent->resp.param.conn,
 				     &event->param.conn);
 
+out:
 	uevent->resp.ece.vendor_id = event->ece.vendor_id;
 	uevent->resp.ece.attr_mod = event->ece.attr_mod;
 	return uevent;
@@ -728,6 +733,28 @@ static ssize_t ucma_resolve_addr(struct ucma_file *file,
 	return ret;
 }
 
+static ssize_t ucma_resolve_ib_service(struct ucma_file *file,
+				       const char __user *inbuf, int in_len,
+				       int out_len)
+{
+	struct rdma_ucm_resolve_ib_service cmd;
+	struct ucma_context *ctx;
+	int ret;
+
+	if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
+		return -EFAULT;
+
+	ctx = ucma_get_ctx(file, cmd.id);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	mutex_lock(&ctx->mutex);
+	ret = rdma_resolve_ib_service(ctx->cm_id, &cmd.ibs);
+	mutex_unlock(&ctx->mutex);
+	ucma_put_ctx(ctx);
+	return ret;
+}
+
 static ssize_t ucma_resolve_route(struct ucma_file *file,
 				  const char __user *inbuf,
 				  int in_len, int out_len)
@@ -1703,7 +1730,8 @@ static ssize_t (*ucma_cmd_table[])(struct ucma_file *file,
 	[RDMA_USER_CM_CMD_QUERY]	 = ucma_query,
 	[RDMA_USER_CM_CMD_BIND]		 = ucma_bind,
 	[RDMA_USER_CM_CMD_RESOLVE_ADDR]	 = ucma_resolve_addr,
-	[RDMA_USER_CM_CMD_JOIN_MCAST]	 = ucma_join_multicast
+	[RDMA_USER_CM_CMD_JOIN_MCAST]	 = ucma_join_multicast,
+	[RDMA_USER_CM_CMD_RESOLVE_IB_SERVICE] = ucma_resolve_ib_service
 };
 
 static ssize_t ucma_write(struct file *filp, const char __user *buf,
diff --git a/include/rdma/rdma_cm.h b/include/rdma/rdma_cm.h
index d1593ad47e28..72d1568e4cfb 100644
--- a/include/rdma/rdma_cm.h
+++ b/include/rdma/rdma_cm.h
@@ -33,7 +33,9 @@ enum rdma_cm_event_type {
 	RDMA_CM_EVENT_MULTICAST_JOIN,
 	RDMA_CM_EVENT_MULTICAST_ERROR,
 	RDMA_CM_EVENT_ADDR_CHANGE,
-	RDMA_CM_EVENT_TIMEWAIT_EXIT
+	RDMA_CM_EVENT_TIMEWAIT_EXIT,
+	RDMA_CM_EVENT_ADDRINFO_RESOLVED,
+	RDMA_CM_EVENT_ADDRINFO_ERROR
 };
 
 const char *__attribute_const__ rdma_event_msg(enum rdma_cm_event_type event);
@@ -63,6 +65,9 @@ struct rdma_route {
 	 * 2 - Both primary and alternate path are available
 	 */
 	int num_pri_alt_paths;
+
+	unsigned int num_service_recs;
+	struct sa_service_rec *service_recs;
 };
 
 struct rdma_conn_param {
@@ -197,6 +202,17 @@ int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
  */
 int rdma_resolve_route(struct rdma_cm_id *id, unsigned long timeout_ms);
 
+/**
+ * rdma_resolve_ib_service - Resolve the IB service record of the
+ *   service with the given service ID or name.
+ *
+ * This function is optional in the rdma cm flow. It is called on the client
+ * side of a connection, before calling rdma_resolve_route. The resolution
+ * can be done once per rdma_cm_id.
+ */
+int rdma_resolve_ib_service(struct rdma_cm_id *id,
+			    struct rdma_ucm_ib_service *ibs);
+
 /**
  * rdma_create_qp - Allocate a QP and associate it with the specified RDMA
  * identifier.
diff --git a/include/uapi/rdma/rdma_user_cm.h b/include/uapi/rdma/rdma_user_cm.h
index 7cea03581f79..8799623bcba0 100644
--- a/include/uapi/rdma/rdma_user_cm.h
+++ b/include/uapi/rdma/rdma_user_cm.h
@@ -67,7 +67,8 @@ enum {
 	RDMA_USER_CM_CMD_QUERY,
 	RDMA_USER_CM_CMD_BIND,
 	RDMA_USER_CM_CMD_RESOLVE_ADDR,
-	RDMA_USER_CM_CMD_JOIN_MCAST
+	RDMA_USER_CM_CMD_JOIN_MCAST,
+	RDMA_USER_CM_CMD_RESOLVE_IB_SERVICE
 };
 
 /* See IBTA Annex A11, servies ID bytes 4 & 5 */
@@ -338,4 +339,21 @@ struct rdma_ucm_migrate_resp {
 	__u32 events_reported;
 };
 
+enum {
+	RDMA_USER_CM_IB_SERVICE_FLAG_ID = 1 << 0,
+	RDMA_USER_CM_IB_SERVICE_FLAG_NAME = 1 << 1,
+};
+
+#define RDMA_USER_CM_IB_SERVICE_NAME_SIZE 64
+struct rdma_ucm_ib_service {
+	__u64 service_id;
+	__u8  service_name[RDMA_USER_CM_IB_SERVICE_NAME_SIZE];
+	__u32 flags;
+	__u32 reserved;
+};
+
+struct rdma_ucm_resolve_ib_service {
+	__u32 id;
+	struct rdma_ucm_ib_service ibs;
+};
 #endif /* RDMA_USER_CM_H */
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH rdma-next 4/5] RDMA/ucma: Support query resolved service records
  2025-06-30 10:52 [PATCH rdma-next 0/5] IB service resolution Leon Romanovsky
                   ` (2 preceding siblings ...)
  2025-06-30 10:52 ` [PATCH rdma-next 3/5] RDMA/cma: Support IB service record resolution Leon Romanovsky
@ 2025-06-30 10:52 ` Leon Romanovsky
  2025-06-30 10:52 ` [PATCH rdma-next 5/5] RDMA/ucma: Support write an event into a CM Leon Romanovsky
  2025-08-13 10:16 ` [PATCH rdma-next 0/5] IB service resolution Leon Romanovsky
  5 siblings, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2025-06-30 10:52 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Mark Zhang, linux-rdma, Or Har-Toov, Vlad Dumitrescu

From: Mark Zhang <markzhang@nvidia.com>

Enable user-space to query resolved service records through a ucma
command when an RDMA_CM_EVENT_ADDRINFO_RESOLVED event is received.

Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/core/ucma.c   | 40 ++++++++++++++++++++++++++++++++
 include/uapi/rdma/ib_user_sa.h   | 14 +++++++++++
 include/uapi/rdma/rdma_user_cm.h |  8 ++++++-
 3 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 1915f4e68308..3b9ca6d7a21b 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -1021,6 +1021,43 @@ static ssize_t ucma_query_gid(struct ucma_context *ctx,
 	return ret;
 }
 
+static ssize_t ucma_query_ib_service(struct ucma_context *ctx,
+				     void __user *response, int out_len)
+{
+	struct rdma_ucm_query_ib_service_resp *resp;
+	int n, ret = 0;
+
+	if (out_len < sizeof(struct rdma_ucm_query_ib_service_resp))
+		return -ENOSPC;
+
+	if (!ctx->cm_id->route.service_recs)
+		return -ENODATA;
+
+	resp = kzalloc(out_len, GFP_KERNEL);
+	if (!resp)
+		return -ENOMEM;
+
+	resp->num_service_recs = ctx->cm_id->route.num_service_recs;
+
+	n = (out_len - sizeof(struct rdma_ucm_query_ib_service_resp)) /
+		sizeof(struct ib_user_service_rec);
+
+	if (!n)
+		goto out;
+
+	if (n > ctx->cm_id->route.num_service_recs)
+		n = ctx->cm_id->route.num_service_recs;
+
+	memcpy(resp->recs, ctx->cm_id->route.service_recs,
+	       sizeof(*resp->recs) * n);
+	if (copy_to_user(response, resp, struct_size(resp, recs, n)))
+		ret = -EFAULT;
+
+out:
+	kfree(resp);
+	return ret;
+}
+
 static ssize_t ucma_query(struct ucma_file *file,
 			  const char __user *inbuf,
 			  int in_len, int out_len)
@@ -1049,6 +1086,9 @@ static ssize_t ucma_query(struct ucma_file *file,
 	case RDMA_USER_CM_QUERY_GID:
 		ret = ucma_query_gid(ctx, response, out_len);
 		break;
+	case RDMA_USER_CM_QUERY_IB_SERVICE:
+		ret = ucma_query_ib_service(ctx, response, out_len);
+		break;
 	default:
 		ret = -ENOSYS;
 		break;
diff --git a/include/uapi/rdma/ib_user_sa.h b/include/uapi/rdma/ib_user_sa.h
index 435155d6e1c6..acfa20816bc6 100644
--- a/include/uapi/rdma/ib_user_sa.h
+++ b/include/uapi/rdma/ib_user_sa.h
@@ -74,4 +74,18 @@ struct ib_user_path_rec {
 	__u8	preference;
 };
 
+struct ib_user_service_rec {
+	__be64	id;
+	__u8	gid[16];
+	__be16	pkey;
+	__u8	reserved[2];
+	__be32	lease;
+	__u8	key[16];
+	__u8	name[64];
+	__u8	data_8[16];
+	__be16	data_16[8];
+	__be32	data_32[4];
+	__be64	data_64[2];
+};
+
 #endif /* IB_USER_SA_H */
diff --git a/include/uapi/rdma/rdma_user_cm.h b/include/uapi/rdma/rdma_user_cm.h
index 8799623bcba0..00501da0567e 100644
--- a/include/uapi/rdma/rdma_user_cm.h
+++ b/include/uapi/rdma/rdma_user_cm.h
@@ -148,7 +148,8 @@ struct rdma_ucm_resolve_route {
 enum {
 	RDMA_USER_CM_QUERY_ADDR,
 	RDMA_USER_CM_QUERY_PATH,
-	RDMA_USER_CM_QUERY_GID
+	RDMA_USER_CM_QUERY_GID,
+	RDMA_USER_CM_QUERY_IB_SERVICE
 };
 
 struct rdma_ucm_query {
@@ -188,6 +189,11 @@ struct rdma_ucm_query_path_resp {
 	struct ib_path_rec_data path_data[];
 };
 
+struct rdma_ucm_query_ib_service_resp {
+	__u32 num_service_recs;
+	struct ib_user_service_rec recs[];
+};
+
 struct rdma_ucm_conn_param {
 	__u32 qp_num;
 	__u32 qkey;
-- 
2.50.0



* [PATCH rdma-next 5/5] RDMA/ucma: Support write an event into a CM
  2025-06-30 10:52 [PATCH rdma-next 0/5] IB service resolution Leon Romanovsky
                   ` (3 preceding siblings ...)
  2025-06-30 10:52 ` [PATCH rdma-next 4/5] RDMA/ucma: Support query resolved service records Leon Romanovsky
@ 2025-06-30 10:52 ` Leon Romanovsky
  2025-08-13 10:16 ` [PATCH rdma-next 0/5] IB service resolution Leon Romanovsky
  5 siblings, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2025-06-30 10:52 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Mark Zhang, linux-rdma, Or Har-Toov, Vlad Dumitrescu

From: Mark Zhang <markzhang@nvidia.com>

Enable user-space to inject an event into a CM through its event
channel. Two new events are added: RDMA_CM_EVENT_USER and
RDMA_CM_EVENT_INTERNAL. With these two events, a new event parameter
"arg" is supported, which is passed from sender to receiver
transparently.

With this feature, an application can write an event into a CM
channel with a new user-space rdmacm API. For example, thread T1 could
write an event with the API:
    rdma_write_cm_event(cm_id, RDMA_CM_EVENT_USER, status, arg);
and thread T2 could receive the event with rdma_get_cm_event().

Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/core/ucma.c   | 52 +++++++++++++++++++++++++++++++-
 include/rdma/rdma_cm.h           |  5 ++-
 include/uapi/rdma/rdma_user_cm.h | 16 +++++++++-
 3 files changed, 70 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 3b9ca6d7a21b..f86ece701db6 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -1745,6 +1745,55 @@ static ssize_t ucma_migrate_id(struct ucma_file *new_file,
 	return ret;
 }
 
+static ssize_t ucma_write_cm_event(struct ucma_file *file,
+				   const char __user *inbuf, int in_len,
+				   int out_len)
+{
+	struct rdma_ucm_write_cm_event cmd;
+	struct rdma_cm_event event = {};
+	struct ucma_event *uevent;
+	struct ucma_context *ctx;
+	int ret = 0;
+
+	if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
+		return -EFAULT;
+
+	if ((cmd.event != RDMA_CM_EVENT_USER) &&
+	    (cmd.event != RDMA_CM_EVENT_INTERNAL))
+		return -EINVAL;
+
+	ctx = ucma_get_ctx(file, cmd.id);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	event.event = cmd.event;
+	event.status = cmd.status;
+	event.param.arg = cmd.param.arg;
+
+	uevent = kzalloc(sizeof(*uevent), GFP_KERNEL);
+	if (!uevent) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	uevent->ctx = ctx;
+	uevent->resp.uid = ctx->uid;
+	uevent->resp.id = ctx->id;
+	uevent->resp.event = event.event;
+	uevent->resp.status = event.status;
+	memcpy(uevent->resp.param.arg32, &event.param.arg,
+	       sizeof(event.param.arg));
+
+	mutex_lock(&ctx->file->mut);
+	list_add_tail(&uevent->list, &ctx->file->event_list);
+	mutex_unlock(&ctx->file->mut);
+	wake_up_interruptible(&ctx->file->poll_wait);
+
+out:
+	ucma_put_ctx(ctx);
+	return ret;
+}
+
 static ssize_t (*ucma_cmd_table[])(struct ucma_file *file,
 				   const char __user *inbuf,
 				   int in_len, int out_len) = {
@@ -1771,7 +1820,8 @@ static ssize_t (*ucma_cmd_table[])(struct ucma_file *file,
 	[RDMA_USER_CM_CMD_BIND]		 = ucma_bind,
 	[RDMA_USER_CM_CMD_RESOLVE_ADDR]	 = ucma_resolve_addr,
 	[RDMA_USER_CM_CMD_JOIN_MCAST]	 = ucma_join_multicast,
-	[RDMA_USER_CM_CMD_RESOLVE_IB_SERVICE] = ucma_resolve_ib_service
+	[RDMA_USER_CM_CMD_RESOLVE_IB_SERVICE] = ucma_resolve_ib_service,
+	[RDMA_USER_CM_CMD_WRITE_CM_EVENT] = ucma_write_cm_event,
 };
 
 static ssize_t ucma_write(struct file *filp, const char __user *buf,
diff --git a/include/rdma/rdma_cm.h b/include/rdma/rdma_cm.h
index 72d1568e4cfb..9bd930a83e6e 100644
--- a/include/rdma/rdma_cm.h
+++ b/include/rdma/rdma_cm.h
@@ -35,7 +35,9 @@ enum rdma_cm_event_type {
 	RDMA_CM_EVENT_ADDR_CHANGE,
 	RDMA_CM_EVENT_TIMEWAIT_EXIT,
 	RDMA_CM_EVENT_ADDRINFO_RESOLVED,
-	RDMA_CM_EVENT_ADDRINFO_ERROR
+	RDMA_CM_EVENT_ADDRINFO_ERROR,
+	RDMA_CM_EVENT_USER,
+	RDMA_CM_EVENT_INTERNAL,
 };
 
 const char *__attribute_const__ rdma_event_msg(enum rdma_cm_event_type event);
@@ -98,6 +100,7 @@ struct rdma_cm_event {
 	union {
 		struct rdma_conn_param	conn;
 		struct rdma_ud_param	ud;
+		u64			arg;
 	} param;
 	struct rdma_ucm_ece ece;
 };
diff --git a/include/uapi/rdma/rdma_user_cm.h b/include/uapi/rdma/rdma_user_cm.h
index 00501da0567e..5ded174687ee 100644
--- a/include/uapi/rdma/rdma_user_cm.h
+++ b/include/uapi/rdma/rdma_user_cm.h
@@ -68,7 +68,8 @@ enum {
 	RDMA_USER_CM_CMD_BIND,
 	RDMA_USER_CM_CMD_RESOLVE_ADDR,
 	RDMA_USER_CM_CMD_JOIN_MCAST,
-	RDMA_USER_CM_CMD_RESOLVE_IB_SERVICE
+	RDMA_USER_CM_CMD_RESOLVE_IB_SERVICE,
+	RDMA_USER_CM_CMD_WRITE_CM_EVENT,
 };
 
 /* See IBTA Annex A11, servies ID bytes 4 & 5 */
@@ -304,6 +305,7 @@ struct rdma_ucm_event_resp {
 	union {
 		struct rdma_ucm_conn_param conn;
 		struct rdma_ucm_ud_param   ud;
+		__u32 arg32[2];
 	} param;
 	__u32 reserved;
 	struct rdma_ucm_ece ece;
@@ -362,4 +364,16 @@ struct rdma_ucm_resolve_ib_service {
 	__u32 id;
 	struct rdma_ucm_ib_service ibs;
 };
+
+struct rdma_ucm_write_cm_event {
+	__u32 id;
+	__u32 reserved;
+	__u32 event;
+	__u32 status;
+	union {
+		struct rdma_ucm_conn_param conn;
+		struct rdma_ucm_ud_param ud;
+		__u64 arg;
+	} param;
+};
 #endif /* RDMA_USER_CM_H */
-- 
2.50.0



* Re: [PATCH rdma-next 0/5] IB service resolution
  2025-06-30 10:52 [PATCH rdma-next 0/5] IB service resolution Leon Romanovsky
                   ` (4 preceding siblings ...)
  2025-06-30 10:52 ` [PATCH rdma-next 5/5] RDMA/ucma: Support write an event into a CM Leon Romanovsky
@ 2025-08-13 10:16 ` Leon Romanovsky
  5 siblings, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2025-08-13 10:16 UTC (permalink / raw)
  To: Jason Gunthorpe, Leon Romanovsky
  Cc: linux-rdma, Mark Zhang, Or Har-Toov, Vlad Dumitrescu,
	Leon Romanovsky


On Mon, 30 Jun 2025 13:52:30 +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> From Mark,
> 
> This patchset adds support to resolve IB service records from SM. With
> this feature, the user-space is able to resolve and query IB address
> information like DGID and PKEY based on an IB service name or ID,
> through new ucma commands.
> 
> [...]

Applied, thanks!

[1/5] RDMA/sa_query: Add RMPP support for SA queries
      https://git.kernel.org/rdma/rdma/c/ef5fcdb7300aba
[2/5] RDMA/sa_query: Support IB service records resolution
      https://git.kernel.org/rdma/rdma/c/a892a3e74fb4f6
[3/5] RDMA/cma: Support IB service record resolution
      https://git.kernel.org/rdma/rdma/c/a6404823fe20e0
[4/5] RDMA/ucma: Support query resolved service records
      https://git.kernel.org/rdma/rdma/c/810f874eda8e49
[5/5] RDMA/ucma: Support write an event into a CM
      https://git.kernel.org/rdma/rdma/c/a3c9d0fcd37155

Best regards,
-- 
Leon Romanovsky <leon@kernel.org>



end of thread, other threads:[~2025-08-13 10:16 UTC | newest]

Thread overview: 7+ messages
-- links below jump to the message on this page --
2025-06-30 10:52 [PATCH rdma-next 0/5] IB service resolution Leon Romanovsky
2025-06-30 10:52 ` [PATCH rdma-next 1/5] RDMA/sa_query: Add RMPP support for SA queries Leon Romanovsky
2025-06-30 10:52 ` [PATCH rdma-next 2/5] RDMA/sa_query: Support IB service records resolution Leon Romanovsky
2025-06-30 10:52 ` [PATCH rdma-next 3/5] RDMA/cma: Support IB service record resolution Leon Romanovsky
2025-06-30 10:52 ` [PATCH rdma-next 4/5] RDMA/ucma: Support query resolved service records Leon Romanovsky
2025-06-30 10:52 ` [PATCH rdma-next 5/5] RDMA/ucma: Support write an event into a CM Leon Romanovsky
2025-08-13 10:16 ` [PATCH rdma-next 0/5] IB service resolution Leon Romanovsky
