linux-rdma.vger.kernel.org archive mirror
* [PATCH for-next 0/7] Verbs RSS
@ 2015-10-15  7:16 Yishai Hadas
       [not found] ` <1444893410-13242-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Yishai Hadas @ 2015-10-15  7:16 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, yishaih-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	eli-VPRAkNaXOzVWk0Htik3J/w

RSS (Receive Side Scaling) technology allows spreading incoming traffic
across different receive descriptor queues.
Assigning each queue to a different CPU core allows better load balancing
of the incoming traffic and improves performance.

This patch-set introduces some new objects and verbs in order to allow
verbs-based solutions to utilize the RSS offload capability, which is
widely supported today by many modern NICs. It extends the IB and uverbs
layers to support the above functionality and supplies a specific
implementation for the mlx5_ib driver.

The implementation is based on an RFC that was sent to the list some months ago
and describes the expected verbs and objects.
RFC: http://www.spinics.net/lists/linux-rdma/msg25012.html

In addition, the URL below can be used as a reference for the motivation
and justification for adding the new objects described below.
http://lxr.free-electrons.com/source/Documentation/networking/scaling.txt

Overview of the changes:
- Add new objects: Work Queue and Receive Work Queue Indirection Table.
- Add new verbs that are required to handle the new objects:
  ib_create_wq(), ib_modify_wq(), ib_destroy_wq(),
  ib_create_rwq_ind_table(), ib_destroy_rwq_ind_table().

Work Queue: (ib_wq)
- A Work Queue is associated (many-to-one) with a Completion Queue.
- It owns the Work Queue properties (PD, WQ size, etc.).
- Currently the Work Queue type can only be IB_WQT_RQ (receive queue); other
  types (e.g. IB_WQT_SQ, send queue) may be added in the future.
- A Work Queue of type IB_WQT_RQ contains receive work requests.
- A Work Queue context is subject to well-defined state transitions driven
  by the modify_wq verb.
- The Work Queue is a necessary component for RSS technology, since the RSS
  mechanism distributes traffic among multiple Receive Work Queues; see the
  creation sketch below.
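
As an illustration, a kernel consumer could create an RQ-type WQ roughly as
follows (a minimal sketch against the verbs introduced in patch #2; 'pd' and
'cq' are assumed to already exist, and the requested sizes are arbitrary):

	struct ib_wq_init_attr wq_attr = {
		.wq_type	= IB_WQT_RQ,
		.max_wr		= 256,	/* requested depth, updated on return */
		.max_sge	= 1,	/* requested SGEs, updated on return */
		.cq		= cq,	/* several WQs may share one CQ */
	};
	struct ib_wq *wq;

	wq = ib_create_wq(pd, &wq_attr);
	if (IS_ERR(wq))
		return PTR_ERR(wq);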
  
Receive Work Queue Indirection Table: (ib_rwq_ind_tbl)
- Serves to spread traffic among Work Queues of type RQ.
- Can be modified dynamically to give different queues different relative weights.
- The receive queue for a packet is determined by a hash computed over the
  incoming packet.
- A Receive Work Queue Indirection Table is associated (one-to-many) with QPs;
  see the sketch below.
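
For example (a sketch only, assuming 'wqs' holds four previously created
RQ-type WQs and 'device' is the ib_device they belong to):

	struct ib_wq *wqs[4];		/* filled by earlier ib_create_wq() calls */
	struct ib_rwq_ind_table_init_attr init_attr = {
		.log_ind_tbl_size = 2,	/* 1 << 2 == 4 entries */
		.ind_tbl	  = wqs,
	};
	struct ib_rwq_ind_table *ind_tbl;

	ind_tbl = ib_create_rwq_ind_table(device, &init_attr);
	if (IS_ERR(ind_tbl))
		return PTR_ERR(ind_tbl);

	/* Conceptually, the HW then selects: rq = wqs[hash(packet) % 4] */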

Future extensions to this patch-set:
- Add ib_modify_rwq_ind_table() verb to enable a dynamic RQ mapping change.
- Introduce an RSS hashing configuration to be used when computing the RQ
  entry for an incoming packet.
- Extend the ib_create_qp() verb to work with external WQs via the indirection
  table object and with an RSS hash configuration.
  - Will enable a ULP/user application to benefit from RSS scaling.
  - QPs that support flow steering rules can benefit from RSS scaling in
    addition to their steering capabilities.
- Reflect RSS capabilities via the query device verb.
- User space support (i.e. libibverbs/vendor drivers) to expose the new verbs and objects.

Patches:
#1 - Exposes the required mlx5_core APIs to be used by the mlx5_ib driver in the following patches.
#2 - Introduces the Work Queue object and its verbs in the IB layer.
#3 - Adds uverbs support for the Work Queue verbs.
#4 - Implements the Work Queue verbs in mlx5_ib driver.
#5 - Introduces Receive Work Queue indirection table and its verbs in the IB layer.
#6 - Adds uverbs support for the Receive Work Queue indirection table verbs.
#7 - Implements the Receive Work Queue indirection table verbs in mlx5_ib driver.
 
Yishai Hadas (7):
  net/mlx5_core: Expose transobj APIs from mlx5 core
  IB: Introduce Work Queue object and its verbs
  IB/uverbs: Add WQ support
  IB/mlx5: Add receive Work Queue verbs
  IB: Introduce Receive Work Queue indirection table
  IB/uverbs: Introduce RWQ Indirection table
  IB/mlx5: Add Receive Work Queue Indirection table operations

 drivers/infiniband/core/uverbs.h                   |   12 +
 drivers/infiniband/core/uverbs_cmd.c               |  405 ++++++++++++++++++++
 drivers/infiniband/core/uverbs_main.c              |   38 ++
 drivers/infiniband/core/verbs.c                    |  111 ++++++
 drivers/infiniband/hw/mlx5/main.c                  |   13 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h               |   49 +++
 drivers/infiniband/hw/mlx5/qp.c                    |  319 +++++++++++++++
 drivers/infiniband/hw/mlx5/user.h                  |   15 +
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/srq.c      |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/transobj.c |    8 +-
 drivers/net/ethernet/mellanox/mlx5/core/transobj.h |   72 ----
 include/linux/mlx5/transobj.h                      |   73 ++++
 include/rdma/ib_verbs.h                            |  136 +++++++
 include/uapi/rdma/ib_user_verbs.h                  |   65 ++++
 15 files changed, 1245 insertions(+), 75 deletions(-)
 delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/transobj.h
 create mode 100644 include/linux/mlx5/transobj.h


* [PATCH for-next 1/7] net/mlx5_core: Expose transobj APIs from mlx5 core
       [not found] ` <1444893410-13242-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-10-15  7:16   ` Yishai Hadas
  2015-10-15  7:16   ` [PATCH for-next 2/7] IB: Introduce Work Queue object and its verbs Yishai Hadas
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Yishai Hadas @ 2015-10-15  7:16 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, yishaih-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	eli-VPRAkNaXOzVWk0Htik3J/w

Move transobj.h from the core library to include/linux/mlx5
to enable using its functionality outside of mlx5 core.

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/srq.c      |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/transobj.c |    8 ++-
 drivers/net/ethernet/mellanox/mlx5/core/transobj.h |   72 -------------------
 include/linux/mlx5/transobj.h                      |   73 ++++++++++++++++++++
 5 files changed, 82 insertions(+), 75 deletions(-)
 delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/transobj.h
 create mode 100644 include/linux/mlx5/transobj.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 0983a20..47b2811 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -36,8 +36,8 @@
 #include <linux/mlx5/qp.h>
 #include <linux/mlx5/cq.h>
 #include <linux/mlx5/vport.h>
+#include <linux/mlx5/transobj.h>
 #include "wq.h"
-#include "transobj.h"
 #include "mlx5_core.h"
 
 #define MLX5E_MAX_NUM_TC	8
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/srq.c b/drivers/net/ethernet/mellanox/mlx5/core/srq.c
index c48f504..1c1619a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/srq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/srq.c
@@ -35,9 +35,9 @@
 #include <linux/mlx5/driver.h>
 #include <linux/mlx5/cmd.h>
 #include <linux/mlx5/srq.h>
+#include <linux/mlx5/transobj.h>
 #include <rdma/ib_verbs.h>
 #include "mlx5_core.h"
-#include "transobj.h"
 
 void mlx5_srq_event(struct mlx5_core_dev *dev, u32 srqn, int event_type)
 {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/transobj.c b/drivers/net/ethernet/mellanox/mlx5/core/transobj.c
index b4c87c7..3d5db0b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/transobj.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/transobj.c
@@ -30,9 +30,10 @@
  * SOFTWARE.
  */
 
+#include <linux/export.h>
 #include <linux/mlx5/driver.h>
+#include <linux/mlx5/transobj.h>
 #include "mlx5_core.h"
-#include "transobj.h"
 
 int mlx5_alloc_transport_domain(struct mlx5_core_dev *dev, u32 *tdn)
 {
@@ -83,6 +84,7 @@ int mlx5_core_create_rq(struct mlx5_core_dev *dev, u32 *in, int inlen, u32 *rqn)
 
 	return err;
 }
+EXPORT_SYMBOL(mlx5_core_create_rq);
 
 int mlx5_core_modify_rq(struct mlx5_core_dev *dev, u32 rqn, u32 *in, int inlen)
 {
@@ -94,6 +96,7 @@ int mlx5_core_modify_rq(struct mlx5_core_dev *dev, u32 rqn, u32 *in, int inlen)
 	memset(out, 0, sizeof(out));
 	return mlx5_cmd_exec_check_status(dev, in, inlen, out, sizeof(out));
 }
+EXPORT_SYMBOL(mlx5_core_modify_rq);
 
 void mlx5_core_destroy_rq(struct mlx5_core_dev *dev, u32 rqn)
 {
@@ -107,6 +110,7 @@ void mlx5_core_destroy_rq(struct mlx5_core_dev *dev, u32 rqn)
 
 	mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
 }
+EXPORT_SYMBOL(mlx5_core_destroy_rq);
 
 int mlx5_core_create_sq(struct mlx5_core_dev *dev, u32 *in, int inlen, u32 *sqn)
 {
@@ -386,6 +390,7 @@ int mlx5_core_create_rqt(struct mlx5_core_dev *dev, u32 *in, int inlen,
 
 	return err;
 }
+EXPORT_SYMBOL(mlx5_core_create_rqt);
 
 int mlx5_core_modify_rqt(struct mlx5_core_dev *dev, u32 rqtn, u32 *in,
 			 int inlen)
@@ -411,3 +416,4 @@ void mlx5_core_destroy_rqt(struct mlx5_core_dev *dev, u32 rqtn)
 
 	mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
 }
+EXPORT_SYMBOL(mlx5_core_destroy_rqt);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/transobj.h b/drivers/net/ethernet/mellanox/mlx5/core/transobj.h
deleted file mode 100644
index 74cae51..0000000
--- a/drivers/net/ethernet/mellanox/mlx5/core/transobj.h
+++ /dev/null
@@ -1,72 +0,0 @@
-/*
- * Copyright (c) 2013-2015, Mellanox Technologies, Ltd.  All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses.  You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *      - Redistributions of source code must retain the above
- *        copyright notice, this list of conditions and the following
- *        disclaimer.
- *
- *      - Redistributions in binary form must reproduce the above
- *        copyright notice, this list of conditions and the following
- *        disclaimer in the documentation and/or other materials
- *        provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
- */
-
-#ifndef __TRANSOBJ_H__
-#define __TRANSOBJ_H__
-
-int mlx5_alloc_transport_domain(struct mlx5_core_dev *dev, u32 *tdn);
-void mlx5_dealloc_transport_domain(struct mlx5_core_dev *dev, u32 tdn);
-int mlx5_core_create_rq(struct mlx5_core_dev *dev, u32 *in, int inlen,
-			u32 *rqn);
-int mlx5_core_modify_rq(struct mlx5_core_dev *dev, u32 rqn, u32 *in, int inlen);
-void mlx5_core_destroy_rq(struct mlx5_core_dev *dev, u32 rqn);
-int mlx5_core_create_sq(struct mlx5_core_dev *dev, u32 *in, int inlen,
-			u32 *sqn);
-int mlx5_core_modify_sq(struct mlx5_core_dev *dev, u32 sqn, u32 *in, int inlen);
-void mlx5_core_destroy_sq(struct mlx5_core_dev *dev, u32 sqn);
-int mlx5_core_create_tir(struct mlx5_core_dev *dev, u32 *in, int inlen,
-			 u32 *tirn);
-int mlx5_core_modify_tir(struct mlx5_core_dev *dev, u32 tirn, u32 *in,
-			 int inlen);
-void mlx5_core_destroy_tir(struct mlx5_core_dev *dev, u32 tirn);
-int mlx5_core_create_tis(struct mlx5_core_dev *dev, u32 *in, int inlen,
-			 u32 *tisn);
-void mlx5_core_destroy_tis(struct mlx5_core_dev *dev, u32 tisn);
-int mlx5_core_create_rmp(struct mlx5_core_dev *dev, u32 *in, int inlen,
-			 u32 *rmpn);
-int mlx5_core_modify_rmp(struct mlx5_core_dev *dev, u32 *in, int inlen);
-int mlx5_core_destroy_rmp(struct mlx5_core_dev *dev, u32 rmpn);
-int mlx5_core_query_rmp(struct mlx5_core_dev *dev, u32 rmpn, u32 *out);
-int mlx5_core_arm_rmp(struct mlx5_core_dev *dev, u32 rmpn, u16 lwm);
-int mlx5_core_create_xsrq(struct mlx5_core_dev *dev, u32 *in, int inlen,
-			  u32 *rmpn);
-int mlx5_core_destroy_xsrq(struct mlx5_core_dev *dev, u32 rmpn);
-int mlx5_core_query_xsrq(struct mlx5_core_dev *dev, u32 rmpn, u32 *out);
-int mlx5_core_arm_xsrq(struct mlx5_core_dev *dev, u32 rmpn, u16 lwm);
-
-int mlx5_core_create_rqt(struct mlx5_core_dev *dev, u32 *in, int inlen,
-			 u32 *rqtn);
-int mlx5_core_modify_rqt(struct mlx5_core_dev *dev, u32 rqtn, u32 *in,
-			 int inlen);
-void mlx5_core_destroy_rqt(struct mlx5_core_dev *dev, u32 rqtn);
-
-#endif /* __TRANSOBJ_H__ */
diff --git a/include/linux/mlx5/transobj.h b/include/linux/mlx5/transobj.h
new file mode 100644
index 0000000..6b4b3a4
--- /dev/null
+++ b/include/linux/mlx5/transobj.h
@@ -0,0 +1,73 @@
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies, Ltd.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __TRANSOBJ_H__
+#define __TRANSOBJ_H__
+
+#include <linux/mlx5/driver.h>
+
+int mlx5_alloc_transport_domain(struct mlx5_core_dev *dev, u32 *tdn);
+void mlx5_dealloc_transport_domain(struct mlx5_core_dev *dev, u32 tdn);
+int mlx5_core_create_rq(struct mlx5_core_dev *dev, u32 *in, int inlen,
+			u32 *rqn);
+int mlx5_core_modify_rq(struct mlx5_core_dev *dev, u32 rqn, u32 *in, int inlen);
+void mlx5_core_destroy_rq(struct mlx5_core_dev *dev, u32 rqn);
+int mlx5_core_create_sq(struct mlx5_core_dev *dev, u32 *in, int inlen,
+			u32 *sqn);
+int mlx5_core_modify_sq(struct mlx5_core_dev *dev, u32 sqn, u32 *in, int inlen);
+void mlx5_core_destroy_sq(struct mlx5_core_dev *dev, u32 sqn);
+int mlx5_core_create_tir(struct mlx5_core_dev *dev, u32 *in, int inlen,
+			 u32 *tirn);
+int mlx5_core_modify_tir(struct mlx5_core_dev *dev, u32 tirn, u32 *in,
+			 int inlen);
+void mlx5_core_destroy_tir(struct mlx5_core_dev *dev, u32 tirn);
+int mlx5_core_create_tis(struct mlx5_core_dev *dev, u32 *in, int inlen,
+			 u32 *tisn);
+void mlx5_core_destroy_tis(struct mlx5_core_dev *dev, u32 tisn);
+int mlx5_core_create_rmp(struct mlx5_core_dev *dev, u32 *in, int inlen,
+			 u32 *rmpn);
+int mlx5_core_modify_rmp(struct mlx5_core_dev *dev, u32 *in, int inlen);
+int mlx5_core_destroy_rmp(struct mlx5_core_dev *dev, u32 rmpn);
+int mlx5_core_query_rmp(struct mlx5_core_dev *dev, u32 rmpn, u32 *out);
+int mlx5_core_arm_rmp(struct mlx5_core_dev *dev, u32 rmpn, u16 lwm);
+int mlx5_core_create_xsrq(struct mlx5_core_dev *dev, u32 *in, int inlen,
+			  u32 *rmpn);
+int mlx5_core_destroy_xsrq(struct mlx5_core_dev *dev, u32 rmpn);
+int mlx5_core_query_xsrq(struct mlx5_core_dev *dev, u32 rmpn, u32 *out);
+int mlx5_core_arm_xsrq(struct mlx5_core_dev *dev, u32 rmpn, u16 lwm);
+int mlx5_core_create_rqt(struct mlx5_core_dev *dev, u32 *in, int inlen,
+			 u32 *rqtn);
+int mlx5_core_modify_rqt(struct mlx5_core_dev *dev, u32 rqtn, u32 *in,
+			 int inlen);
+void mlx5_core_destroy_rqt(struct mlx5_core_dev *dev, u32 rqtn);
+
+#endif /* __TRANSOBJ_H__ */
-- 
1.7.1


* [PATCH for-next 2/7] IB: Introduce Work Queue object and its verbs
       [not found] ` <1444893410-13242-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-10-15  7:16   ` [PATCH for-next 1/7] net/mlx5_core: Expose transobj APIs from mlx5 core Yishai Hadas
@ 2015-10-15  7:16   ` Yishai Hadas
       [not found]     ` <1444893410-13242-3-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-10-15  7:16   ` [PATCH for-next 3/7] IB/uverbs: Add WQ support Yishai Hadas
                     ` (4 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Yishai Hadas @ 2015-10-15  7:16 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, yishaih-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	eli-VPRAkNaXOzVWk0Htik3J/w

Introduce the Work Queue object and its create/destroy/modify verbs.

A QP can be created without internal WQs "packaged" inside it; such
a QP can be configured to use an "external" WQ object as its
receive/send queue.
A WQ is a necessary component for RSS technology, since the RSS
mechanism distributes traffic among multiple Receive Work Queues.

A WQ is associated (many-to-one) with a Completion Queue, and it owns
the WQ properties (PD, WQ size, etc.).
A WQ has a type; this patch introduces IB_WQT_RQ (i.e. receive queue),
and it may be extended to others such as IB_WQT_SQ (send queue).
A WQ of type IB_WQT_RQ contains receive work requests.

A WQ context is subject to well-defined state transitions driven by
the modify_wq verb.
When a WQ is created, its initial state is IB_WQS_RESET.
From IB_WQS_RESET it can be modified to itself or to IB_WQS_RDY.
From IB_WQS_RDY it can be modified to itself, to IB_WQS_RESET
or to IB_WQS_ERR.
From IB_WQS_ERR it can be modified to IB_WQS_RESET.

Note: a transition to IB_WQS_ERR might occur implicitly in case of
a HW error.
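
A minimal sketch of driving these transitions, assuming 'wq' was just
returned by ib_create_wq() and is therefore in IB_WQS_RESET:

	struct ib_wq_attr wq_attr = {
		.curr_wq_state	= IB_WQS_RESET,	/* the state we expect now */
		.wq_state	= IB_WQS_RDY,	/* the state we want */
	};
	int err;

	err = ib_modify_wq(wq, &wq_attr, IB_WQ_CUR_STATE | IB_WQ_STATE);
	if (err)
		return err;	/* the WQ is now ready to receive */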


Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/verbs.c |   59 ++++++++++++++++++++++++++
 include/rdma/ib_verbs.h         |   88 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 147 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index e1f2c98..c63c622 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1435,6 +1435,65 @@ int ib_dealloc_xrcd(struct ib_xrcd *xrcd)
 }
 EXPORT_SYMBOL(ib_dealloc_xrcd);
 
+struct ib_wq *ib_create_wq(struct ib_pd *pd,
+			   struct ib_wq_init_attr *wq_attr)
+{
+	struct ib_wq *wq;
+
+	if (!pd->device->create_wq)
+		return ERR_PTR(-ENOSYS);
+
+	wq = pd->device->create_wq(pd, wq_attr, NULL);
+	if (!IS_ERR(wq)) {
+		wq->event_handler = wq_attr->event_handler;
+		wq->wq_context = wq_attr->wq_context;
+		wq->wq_type = wq_attr->wq_type;
+		wq->cq = wq_attr->cq;
+		wq->device = pd->device;
+		wq->pd = pd;
+		wq->uobject = NULL;
+		atomic_inc(&pd->usecnt);
+		atomic_inc(&wq_attr->cq->usecnt);
+		atomic_set(&wq->usecnt, 0);
+	}
+	return wq;
+}
+EXPORT_SYMBOL(ib_create_wq);
+
+int ib_destroy_wq(struct ib_wq *wq)
+{
+	int err;
+	struct ib_cq *cq = wq->cq;
+	struct ib_pd *pd = wq->pd;
+
+	if (!wq->device->destroy_wq)
+		return -ENOSYS;
+
+	if (atomic_read(&wq->usecnt))
+		return -EBUSY;
+
+	err = wq->device->destroy_wq(wq);
+	if (!err) {
+		atomic_dec(&pd->usecnt);
+		atomic_dec(&cq->usecnt);
+	}
+	return err;
+}
+EXPORT_SYMBOL(ib_destroy_wq);
+
+int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
+		 enum ib_wq_attr_mask attr_mask)
+{
+	int err;
+
+	if (!wq->device->modify_wq)
+		return -ENOSYS;
+
+	err = wq->device->modify_wq(wq, wq_attr, attr_mask, NULL);
+	return err;
+}
+EXPORT_SYMBOL(ib_modify_wq);
+
 struct ib_flow *ib_create_flow(struct ib_qp *qp,
 			       struct ib_flow_attr *flow_attr,
 			       int domain)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index e1f65e2..0c6291b 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1310,6 +1310,48 @@ struct ib_srq {
 	} ext;
 };
 
+enum ib_wq_type {
+	IB_WQT_RQ
+};
+
+enum ib_wq_state {
+	IB_WQS_RESET,
+	IB_WQS_RDY,
+	IB_WQS_ERR
+};
+
+struct ib_wq {
+	struct ib_device       *device;
+	struct ib_uobject      *uobject;
+	void		    *wq_context;
+	void		    (*event_handler)(struct ib_event *, void *);
+	struct ib_pd	       *pd;
+	struct ib_cq	       *cq;
+	u32		wq_num;
+	enum ib_wq_state       state;
+	enum ib_wq_type	wq_type;
+	atomic_t		usecnt;
+};
+
+struct ib_wq_init_attr {
+	void		       *wq_context;
+	enum ib_wq_type	wq_type;
+	u32		max_wr;
+	u32		max_sge;
+	struct	ib_cq	       *cq;
+	void		    (*event_handler)(struct ib_event *, void *);
+};
+
+enum ib_wq_attr_mask {
+	IB_WQ_STATE	= 1 << 0,
+	IB_WQ_CUR_STATE	= 1 << 1,
+};
+
+struct ib_wq_attr {
+	enum	ib_wq_state	wq_state;
+	enum	ib_wq_state	curr_wq_state;
+};
+
 struct ib_qp {
 	struct ib_device       *device;
 	struct ib_pd	       *pd;
@@ -1771,6 +1813,14 @@ struct ib_device {
 	int			   (*check_mr_status)(struct ib_mr *mr, u32 check_mask,
 						      struct ib_mr_status *mr_status);
 	void			   (*disassociate_ucontext)(struct ib_ucontext *ibcontext);
+	struct ib_wq *		   (*create_wq)(struct ib_pd *pd,
+						struct ib_wq_init_attr *init_attr,
+						struct ib_udata *udata);
+	int			   (*destroy_wq)(struct ib_wq *wq);
+	int			   (*modify_wq)(struct ib_wq *wq,
+						struct ib_wq_attr *attr,
+						enum ib_wq_attr_mask attr_mask,
+						struct ib_udata *udata);
 
 	struct ib_dma_mapping_ops   *dma_ops;
 
@@ -3024,4 +3074,42 @@ struct net_device *ib_get_net_dev_by_params(struct ib_device *dev, u8 port,
 					    u16 pkey, const union ib_gid *gid,
 					    const struct sockaddr *addr);
 
+/**
+ * ib_create_wq - Creates a WQ associated with the specified protection
+ * domain.
+ * @pd: The protection domain associated with the WQ.
+ * @wq_init_attr: A list of initial attributes required to create the
+ * WQ. If WQ creation succeeds, then the attributes are updated to
+ * the actual capabilities of the created WQ.
+ *
+ * wq_init_attr->max_wr and wq_init_attr->max_sge determine
+ * the requested size of the WQ, and set to the actual values allocated
+ * on return.
+ * If ib_create_wq() succeeds, then max_wr and max_sge will always be
+ * at least as large as the requested values.
+ *
+ * Return Value
+ * ib_create_wq() returns a pointer to the created WQ, or an ERR_PTR()
+ * encoded error if the request fails.
+ */
+struct ib_wq *ib_create_wq(struct ib_pd *pd,
+			   struct ib_wq_init_attr *init_attr);
+
+/**
+ * ib_destroy_wq - Destroys the specified WQ.
+ * @wq: The WQ to destroy.
+ */
+int ib_destroy_wq(struct ib_wq *wq);
+
+/**
+ * ib_modify_wq - Modifies the specified WQ.
+ * @wq: The WQ to modify.
+ * @wq_attr: On input, specifies the WQ attributes to modify.
+ * @attr_mask: A bit-mask used to specify which attributes of the WQ
+ *   are being modified.
+ * On output, the current values of selected WQ attributes are returned.
+ */
+int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *attr,
+		 enum ib_wq_attr_mask attr_mask);
+
 #endif /* IB_VERBS_H */
-- 
1.7.1


* [PATCH for-next 3/7] IB/uverbs: Add WQ support
       [not found] ` <1444893410-13242-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-10-15  7:16   ` [PATCH for-next 1/7] net/mlx5_core: Expose transobj APIs from mlx5 core Yishai Hadas
  2015-10-15  7:16   ` [PATCH for-next 2/7] IB: Introduce Work Queue object and its verbs Yishai Hadas
@ 2015-10-15  7:16   ` Yishai Hadas
  2015-10-15  7:16   ` [PATCH for-next 4/7] IB/mlx5: Add receive Work Queue verbs Yishai Hadas
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Yishai Hadas @ 2015-10-15  7:16 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, yishaih-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	eli-VPRAkNaXOzVWk0Htik3J/w

Add Work Queue support; it includes create, modify and
destroy commands.
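
In userspace, a provider library would marshal the new extended command
roughly as below (a sketch of the ABI structs this patch adds; the
extended-command header and write() plumbing are omitted, and the handle
variables are hypothetical):

	struct ib_uverbs_ex_create_wq cmd = {
		.comp_mask	= 0,			/* no extensions defined yet */
		.wq_type	= 0,			/* IB_WQT_RQ */
		.user_handle	= (uintptr_t)user_wq,	/* echoed back in async events */
		.pd_handle	= pd_handle,
		.cq_handle	= cq_handle,
		.max_wr		= 256,
		.max_sge	= 1,
	};
	struct ib_uverbs_ex_create_wq_resp resp;	/* wq_handle/wqn returned here */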

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs.h      |    9 ++
 drivers/infiniband/core/uverbs_cmd.c  |  218 +++++++++++++++++++++++++++++++++
 drivers/infiniband/core/uverbs_main.c |   25 ++++
 include/rdma/ib_verbs.h               |    3 +
 include/uapi/rdma/ib_user_verbs.h     |   41 ++++++
 5 files changed, 296 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 3863d33..ab3d331 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -162,6 +162,10 @@ struct ib_uqp_object {
 	struct ib_uxrcd_object *uxrcd;
 };
 
+struct ib_uwq_object {
+	struct ib_uevent_object	uevent;
+};
+
 struct ib_ucq_object {
 	struct ib_uobject	uobject;
 	struct ib_uverbs_file  *uverbs_file;
@@ -181,6 +185,7 @@ extern struct idr ib_uverbs_qp_idr;
 extern struct idr ib_uverbs_srq_idr;
 extern struct idr ib_uverbs_xrcd_idr;
 extern struct idr ib_uverbs_rule_idr;
+extern struct idr ib_uverbs_wq_idr;
 
 void idr_remove_uobj(struct idr *idp, struct ib_uobject *uobj);
 
@@ -199,6 +204,7 @@ void ib_uverbs_release_uevent(struct ib_uverbs_file *file,
 void ib_uverbs_comp_handler(struct ib_cq *cq, void *cq_context);
 void ib_uverbs_cq_event_handler(struct ib_event *event, void *context_ptr);
 void ib_uverbs_qp_event_handler(struct ib_event *event, void *context_ptr);
+void ib_uverbs_wq_event_handler(struct ib_event *event, void *context_ptr);
 void ib_uverbs_srq_event_handler(struct ib_event *event, void *context_ptr);
 void ib_uverbs_event_handler(struct ib_event_handler *handler,
 			     struct ib_event *event);
@@ -272,5 +278,8 @@ IB_UVERBS_DECLARE_EX_CMD(create_flow);
 IB_UVERBS_DECLARE_EX_CMD(destroy_flow);
 IB_UVERBS_DECLARE_EX_CMD(query_device);
 IB_UVERBS_DECLARE_EX_CMD(create_cq);
+IB_UVERBS_DECLARE_EX_CMD(create_wq);
+IB_UVERBS_DECLARE_EX_CMD(modify_wq);
+IB_UVERBS_DECLARE_EX_CMD(destroy_wq);
 
 #endif /* UVERBS_H */
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index be4cb9f..d316cd7 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -57,6 +57,7 @@ static struct uverbs_lock_class ah_lock_class	= { .name = "AH-uobj" };
 static struct uverbs_lock_class srq_lock_class	= { .name = "SRQ-uobj" };
 static struct uverbs_lock_class xrcd_lock_class = { .name = "XRCD-uobj" };
 static struct uverbs_lock_class rule_lock_class = { .name = "RULE-uobj" };
+static struct uverbs_lock_class wq_lock_class = { .name = "WQ-uobj" };
 
 /*
  * The ib_uobject locking scheme is as follows:
@@ -241,6 +242,16 @@ static struct ib_qp *idr_read_qp(int qp_handle, struct ib_ucontext *context)
 	return idr_read_obj(&ib_uverbs_qp_idr, qp_handle, context, 0);
 }
 
+static struct ib_wq *idr_read_wq(int wq_handle, struct ib_ucontext *context)
+{
+	return idr_read_obj(&ib_uverbs_wq_idr, wq_handle, context, 0);
+}
+
+static void put_wq_read(struct ib_wq *wq)
+{
+	put_uobj_read(wq->uobject);
+}
+
 static struct ib_qp *idr_write_qp(int qp_handle, struct ib_ucontext *context)
 {
 	struct ib_uobject *uobj;
@@ -327,6 +338,7 @@ ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 	INIT_LIST_HEAD(&ucontext->qp_list);
 	INIT_LIST_HEAD(&ucontext->srq_list);
 	INIT_LIST_HEAD(&ucontext->ah_list);
+	INIT_LIST_HEAD(&ucontext->wq_list);
 	INIT_LIST_HEAD(&ucontext->xrcd_list);
 	INIT_LIST_HEAD(&ucontext->rule_list);
 	rcu_read_lock();
@@ -2915,6 +2927,212 @@ static int kern_spec_to_ib_spec(struct ib_uverbs_flow_spec *kern_spec,
 	return 0;
 }
 
+int ib_uverbs_ex_create_wq(struct ib_uverbs_file *file,
+			   struct ib_device *ib_dev,
+			   struct ib_udata *ucore,
+			   struct ib_udata *uhw)
+{
+	struct ib_uverbs_ex_create_wq	  cmd;
+	struct ib_uverbs_ex_create_wq_resp resp;
+	struct ib_uwq_object           *obj;
+	int err = 0;
+	struct ib_cq *cq;
+	struct ib_pd *pd;
+	struct ib_wq *wq;
+	struct ib_wq_init_attr wq_init_attr;
+
+	if (ucore->inlen < sizeof(cmd))
+		return -EINVAL;
+
+	if (ucore->outlen < sizeof(resp))
+		return -ENOSPC;
+
+	err = ib_copy_from_udata(&cmd, ucore, sizeof(cmd));
+	if (err)
+		return err;
+
+	if (cmd.comp_mask)
+		return -EINVAL;
+
+	obj = kmalloc(sizeof(*obj), GFP_KERNEL);
+	if (!obj)
+		return -ENOMEM;
+
+	init_uobj(&obj->uevent.uobject, cmd.user_handle, file->ucontext,
+		  &wq_lock_class);
+	down_write(&obj->uevent.uobject.mutex);
+	pd  = idr_read_pd(cmd.pd_handle, file->ucontext);
+	if (!pd) {
+		err = -EINVAL;
+		goto err_uobj;
+	}
+
+	cq = idr_read_cq(cmd.cq_handle, file->ucontext, 0);
+	if (!cq) {
+		err = -EINVAL;
+		goto err_put_pd;
+	}
+
+	memset(&wq_init_attr, 0, sizeof(wq_init_attr));
+	wq_init_attr.cq = cq;
+	wq_init_attr.max_sge = cmd.max_sge;
+	wq_init_attr.max_wr = cmd.max_wr;
+	wq_init_attr.wq_context = file;
+	wq_init_attr.wq_type = cmd.wq_type;
+	wq_init_attr.event_handler = ib_uverbs_wq_event_handler;
+	obj->uevent.events_reported = 0;
+	INIT_LIST_HEAD(&obj->uevent.event_list);
+	wq = pd->device->create_wq(pd, &wq_init_attr, uhw);
+	if (IS_ERR(wq)) {
+		err = PTR_ERR(wq);
+		goto err_put_cq;
+	}
+
+	wq->uobject = &obj->uevent.uobject;
+	obj->uevent.uobject.object = wq;
+	wq->wq_type = wq_init_attr.wq_type;
+	wq->cq = cq;
+	wq->pd = pd;
+	wq->device = pd->device;
+	wq->wq_context = wq_init_attr.wq_context;
+	atomic_set(&wq->usecnt, 0);
+	atomic_inc(&pd->usecnt);
+	atomic_inc(&cq->usecnt);
+	err = idr_add_uobj(&ib_uverbs_wq_idr, &obj->uevent.uobject);
+	if (err)
+		goto destroy_wq;
+
+	memset(&resp, 0, sizeof(resp));
+	resp.wq_handle = obj->uevent.uobject.id;
+	resp.max_sge = wq_init_attr.max_sge;
+	resp.max_wr = wq_init_attr.max_wr;
+	resp.wqn = wq->wq_num;
+	resp.response_length = offsetof(typeof(resp), wqn) + sizeof(resp.wqn);
+	err = ib_copy_to_udata(ucore,
+			       &resp, sizeof(resp));
+	if (err)
+		goto err_copy;
+
+	put_pd_read(pd);
+	put_cq_read(cq);
+
+	mutex_lock(&file->mutex);
+	list_add_tail(&obj->uevent.uobject.list, &file->ucontext->wq_list);
+	mutex_unlock(&file->mutex);
+
+	obj->uevent.uobject.live = 1;
+	up_write(&obj->uevent.uobject.mutex);
+	return 0;
+
+err_copy:
+	idr_remove_uobj(&ib_uverbs_wq_idr, &obj->uevent.uobject);
+destroy_wq:
+	ib_destroy_wq(wq);
+err_put_cq:
+	put_cq_read(cq);
+err_put_pd:
+	put_pd_read(pd);
+err_uobj:
+	put_uobj_write(&obj->uevent.uobject);
+
+	return err;
+}
+
+int ib_uverbs_ex_destroy_wq(struct ib_uverbs_file *file,
+			    struct ib_device *ib_dev,
+			    struct ib_udata *ucore,
+			    struct ib_udata *uhw)
+{
+	struct ib_uverbs_ex_destroy_wq	cmd;
+	struct ib_uverbs_ex_destroy_wq_resp	resp;
+	struct ib_wq			*wq;
+	struct ib_uobject		*uobj;
+	struct ib_uwq_object		*obj;
+	int				ret;
+
+	if (ucore->inlen < sizeof(cmd))
+		return -EINVAL;
+
+	ret = ib_copy_from_udata(&cmd, ucore, sizeof(cmd));
+	if (ret)
+		return ret;
+
+	if (cmd.comp_mask)
+		return -EINVAL;
+
+	memset(&resp, 0, sizeof(resp));
+	resp.response_length = offsetof(typeof(resp), reserved) + sizeof(resp.reserved);
+	if (ucore->outlen < resp.response_length)
+		return -ENOSPC;
+
+	uobj = idr_write_uobj(&ib_uverbs_wq_idr, cmd.wq_handle,
+			      file->ucontext);
+	if (!uobj)
+		return -EINVAL;
+
+	wq = uobj->object;
+	obj = container_of(uobj, struct ib_uwq_object, uevent.uobject);
+	ret = ib_destroy_wq(wq);
+	if (!ret)
+		uobj->live = 0;
+
+	put_uobj_write(uobj);
+	if (ret)
+		return ret;
+
+	idr_remove_uobj(&ib_uverbs_wq_idr, uobj);
+
+	mutex_lock(&file->mutex);
+	list_del(&uobj->list);
+	mutex_unlock(&file->mutex);
+
+	ib_uverbs_release_uevent(file, &obj->uevent);
+	resp.events_reported = obj->uevent.events_reported;
+	put_uobj(uobj);
+
+	ret = ib_copy_to_udata(ucore, &resp, resp.response_length);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+int ib_uverbs_ex_modify_wq(struct ib_uverbs_file *file,
+			   struct ib_device *ib_dev,
+			   struct ib_udata *ucore,
+			   struct ib_udata *uhw)
+{
+	struct ib_uverbs_ex_modify_wq cmd;
+	struct ib_wq *wq;
+	int ret;
+	struct ib_wq_attr wq_attr;
+	enum ib_wq_attr_mask attr_mask;
+
+	if (ucore->inlen < sizeof(cmd))
+		return -EINVAL;
+
+	ret = ib_copy_from_udata(&cmd, ucore, sizeof(cmd));
+	if (ret)
+		return ret;
+
+	if (!cmd.comp_mask)
+		return -EINVAL;
+
+	attr_mask = cmd.comp_mask;
+	wq = idr_read_wq(cmd.wq_handle, file->ucontext);
+	if (!wq)
+		return -EINVAL;
+
+	memset(&wq_attr, 0, sizeof(wq_attr));
+	wq_attr.curr_wq_state = cmd.curr_wq_state;
+	wq_attr.wq_state = cmd.wq_state;
+	ret = wq->device->modify_wq(wq, &wq_attr, attr_mask, uhw);
+	put_wq_read(wq);
+	return ret;
+}
+
 int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 			     struct ib_device *ib_dev,
 			     struct ib_udata *ucore,
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index c29a660..8ae75cc 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -74,6 +74,7 @@ DEFINE_IDR(ib_uverbs_qp_idr);
 DEFINE_IDR(ib_uverbs_srq_idr);
 DEFINE_IDR(ib_uverbs_xrcd_idr);
 DEFINE_IDR(ib_uverbs_rule_idr);
+DEFINE_IDR(ib_uverbs_wq_idr);
 
 static DEFINE_SPINLOCK(map_lock);
 static DECLARE_BITMAP(dev_map, IB_UVERBS_MAX_DEVICES);
@@ -127,6 +128,9 @@ static int (*uverbs_ex_cmd_table[])(struct ib_uverbs_file *file,
 	[IB_USER_VERBS_EX_CMD_DESTROY_FLOW]	= ib_uverbs_ex_destroy_flow,
 	[IB_USER_VERBS_EX_CMD_QUERY_DEVICE]	= ib_uverbs_ex_query_device,
 	[IB_USER_VERBS_EX_CMD_CREATE_CQ]	= ib_uverbs_ex_create_cq,
+	[IB_USER_VERBS_EX_CMD_CREATE_WQ]	= ib_uverbs_ex_create_wq,
+	[IB_USER_VERBS_EX_CMD_MODIFY_WQ]	= ib_uverbs_ex_modify_wq,
+	[IB_USER_VERBS_EX_CMD_DESTROY_WQ]	= ib_uverbs_ex_destroy_wq,
 };
 
 static void ib_uverbs_add_one(struct ib_device *device);
@@ -251,6 +255,17 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		kfree(uqp);
 	}
 
+	list_for_each_entry_safe(uobj, tmp, &context->wq_list, list) {
+		struct ib_wq *wq = uobj->object;
+		struct ib_uwq_object *uwq =
+			container_of(uobj, struct ib_uwq_object, uevent.uobject);
+
+		idr_remove_uobj(&ib_uverbs_wq_idr, uobj);
+		ib_destroy_wq(wq);
+		ib_uverbs_release_uevent(file, &uwq->uevent);
+		kfree(uwq);
+	}
+
 	list_for_each_entry_safe(uobj, tmp, &context->srq_list, list) {
 		struct ib_srq *srq = uobj->object;
 		struct ib_uevent_object *uevent =
@@ -554,6 +569,16 @@ void ib_uverbs_qp_event_handler(struct ib_event *event, void *context_ptr)
 				&uobj->events_reported);
 }
 
+void ib_uverbs_wq_event_handler(struct ib_event *event, void *context_ptr)
+{
+	struct ib_uevent_object *uobj = container_of(event->element.wq->uobject,
+						  struct ib_uevent_object, uobject);
+
+	ib_uverbs_async_handler(context_ptr, uobj->uobject.user_handle,
+				event->event, &uobj->event_list,
+				&uobj->events_reported);
+}
+
 void ib_uverbs_srq_event_handler(struct ib_event *event, void *context_ptr)
 {
 	struct ib_uevent_object *uobj;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 0c6291b..808ccee 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -474,6 +474,7 @@ enum ib_event_type {
 	IB_EVENT_QP_LAST_WQE_REACHED,
 	IB_EVENT_CLIENT_REREGISTER,
 	IB_EVENT_GID_CHANGE,
+	IB_EVENT_WQ_FATAL,
 };
 
 __attribute_const__ const char *ib_event_msg(enum ib_event_type event);
@@ -484,6 +485,7 @@ struct ib_event {
 		struct ib_cq	*cq;
 		struct ib_qp	*qp;
 		struct ib_srq	*srq;
+		struct ib_wq	*wq;
 		u8		port_num;
 	} element;
 	enum ib_event_type	event;
@@ -1218,6 +1220,7 @@ struct ib_ucontext {
 	struct list_head	ah_list;
 	struct list_head	xrcd_list;
 	struct list_head	rule_list;
+	struct list_head	wq_list;
 	int			closing;
 
 	struct pid             *tgid;
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 978841e..2dc47e0 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -94,6 +94,9 @@ enum {
 	IB_USER_VERBS_EX_CMD_CREATE_CQ = IB_USER_VERBS_CMD_CREATE_CQ,
 	IB_USER_VERBS_EX_CMD_CREATE_FLOW = IB_USER_VERBS_CMD_THRESHOLD,
 	IB_USER_VERBS_EX_CMD_DESTROY_FLOW,
+	IB_USER_VERBS_EX_CMD_CREATE_WQ,
+	IB_USER_VERBS_EX_CMD_MODIFY_WQ,
+	IB_USER_VERBS_EX_CMD_DESTROY_WQ,
 };
 
 /*
@@ -919,4 +922,42 @@ struct ib_uverbs_destroy_srq_resp {
 	__u32 events_reported;
 };
 
+struct ib_uverbs_ex_create_wq  {
+	__u32 comp_mask;
+	__u32 wq_type;
+	__u64 user_handle;
+	__u32 pd_handle;
+	__u32 cq_handle;
+	__u32 max_wr;
+	__u32 max_sge;
+};
+
+struct ib_uverbs_ex_create_wq_resp {
+	__u32 comp_mask;
+	__u32 response_length;
+	__u32 wq_handle;
+	__u32 max_wr;
+	__u32 max_sge;
+	__u32 wqn;
+};
+
+struct ib_uverbs_ex_destroy_wq  {
+	__u32 comp_mask;
+	__u32 wq_handle;
+};
+
+struct ib_uverbs_ex_destroy_wq_resp {
+	__u32 comp_mask;
+	__u32 response_length;
+	__u32 events_reported;
+	__u32 reserved;
+};
+
+struct ib_uverbs_ex_modify_wq  {
+	__u32 comp_mask;
+	__u32 wq_handle;
+	__u32 wq_state;
+	__u32 curr_wq_state;
+};
+
 #endif /* IB_USER_VERBS_H */
-- 
1.7.1


* [PATCH for-next 4/7] IB/mlx5: Add receive Work Queue verbs
       [not found] ` <1444893410-13242-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2015-10-15  7:16   ` [PATCH for-next 3/7] IB/uverbs: Add WQ support Yishai Hadas
@ 2015-10-15  7:16   ` Yishai Hadas
  2015-10-15  7:16   ` [PATCH for-next 5/7] IB: Introduce Receive Work Queue indirection table Yishai Hadas
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Yishai Hadas @ 2015-10-15  7:16 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, yishaih-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	eli-VPRAkNaXOzVWk0Htik3J/w

A QP can be created without internal WQs "packaged" inside it; such
a QP can be configured to use an "external" WQ object as its
receive/send queue.

A WQ is a necessary component for RSS technology, since the RSS
mechanism distributes traffic among multiple Receive Work Queues.

Add the receive Work Queue verbs: creation, modification and
destruction.
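
For reference, the vendor-private data a userspace mlx5 provider would pass
through udata might look as follows (a sketch of the struct added to user.h
below; buffer and doorbell allocation are provider-specific and omitted, and
the sizes shown are arbitrary):

	struct mlx5_ib_create_wq ucmd = {
		.buf_addr	= (uintptr_t)rq_buf,	/* user-space WQE ring */
		.db_addr	= (uintptr_t)db_rec,	/* user-space doorbell record */
		.rq_wqe_count	= 256,			/* power of two; encoded via ilog2() */
		.rq_wqe_shift	= 6,			/* log2 of the WQE stride */
		.flags		= 0,			/* or MLX5_WQ_FLAG_SIGNATURE */
	};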

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/main.c    |    9 ++
 drivers/infiniband/hw/mlx5/mlx5_ib.h |   35 +++++
 drivers/infiniband/hw/mlx5/qp.c      |  263 ++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/mlx5/user.h    |   15 ++
 4 files changed, 322 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 41d6911..1601f19 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1505,6 +1505,15 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 			(1ull << IB_USER_VERBS_CMD_CLOSE_XRCD);
 	}
 
+	if (MLX5_CAP_GEN(mdev, port_type) == MLX5_CAP_PORT_TYPE_ETH) {
+		dev->ib_dev.create_wq		= mlx5_ib_create_wq;
+		dev->ib_dev.modify_wq		= mlx5_ib_modify_wq;
+		dev->ib_dev.destroy_wq		= mlx5_ib_destroy_wq;
+		dev->ib_dev.uverbs_ex_cmd_mask |=
+				(1ull << IB_USER_VERBS_EX_CMD_CREATE_WQ) |
+				(1ull << IB_USER_VERBS_EX_CMD_MODIFY_WQ) |
+				(1ull << IB_USER_VERBS_EX_CMD_DESTROY_WQ);
+	}
 	err = init_node_data(dev);
 	if (err)
 		goto err_dealloc;
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index bb8cda7..6c00aca 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -143,12 +143,36 @@ struct mlx5_ib_wq {
 	void		       *qend;
 };
 
+struct mlx5_ib_rwq {
+	struct ib_wq		ibwq;
+	u32			rqn;
+	u32			rq_num_pas;
+	u32			log_rq_stride;
+	u32			log_rq_size;
+	u32			rq_page_offset;
+	u32			log_page_size;
+	struct ib_umem		*umem;
+	size_t			buf_size;
+	unsigned int		page_shift;
+	int			create_type;
+	struct mlx5_db		db;
+	u32			user_index;
+	u32			wqe_count;
+	u32			wqe_shift;
+	int			wq_sig;
+};
+
 enum {
 	MLX5_QP_USER,
 	MLX5_QP_KERNEL,
 	MLX5_QP_EMPTY
 };
 
+enum {
+	MLX5_WQ_USER,
+	MLX5_WQ_KERNEL
+};
+
 /*
  * Connect-IB can trigger up to four concurrent pagefaults
  * per-QP.
@@ -493,6 +517,11 @@ static inline struct mlx5_ib_qp *to_mqp(struct ib_qp *ibqp)
 	return container_of(ibqp, struct mlx5_ib_qp, ibqp);
 }
 
+static inline struct mlx5_ib_rwq *to_mrwq(struct ib_wq *ibwq)
+{
+	return container_of(ibwq, struct mlx5_ib_rwq, ibwq);
+}
+
 static inline struct mlx5_ib_srq *to_mibsrq(struct mlx5_core_srq *msrq)
 {
 	return container_of(msrq, struct mlx5_ib_srq, msrq);
@@ -630,6 +659,12 @@ int mlx5_mr_ib_cont_pages(struct ib_umem *umem, u64 addr, int *count, int *shift
 void mlx5_umr_cq_handler(struct ib_cq *cq, void *cq_context);
 int mlx5_ib_check_mr_status(struct ib_mr *ibmr, u32 check_mask,
 			    struct ib_mr_status *mr_status);
+struct ib_wq *mlx5_ib_create_wq(struct ib_pd *pd,
+				struct ib_wq_init_attr *init_attr,
+				struct ib_udata *udata);
+int mlx5_ib_destroy_wq(struct ib_wq *wq);
+int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
+		      enum ib_wq_attr_mask attr_mask, struct ib_udata *udata);
 
 #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 extern struct workqueue_struct *mlx5_ib_page_fault_wq;
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index c745c6c..3763205 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -32,6 +32,7 @@
 
 #include <linux/module.h>
 #include <rdma/ib_umem.h>
+#include <linux/mlx5/transobj.h>
 #include "mlx5_ib.h"
 #include "user.h"
 
@@ -590,6 +591,71 @@ static int uuarn_to_uar_index(struct mlx5_uuar_info *uuari, int uuarn)
 	return uuari->uars[uuarn / MLX5_BF_REGS_PER_PAGE].index;
 }
 
+static void destroy_user_rq(struct ib_pd *pd, struct mlx5_ib_rwq *rwq)
+{
+	struct mlx5_ib_ucontext *context;
+
+	context = to_mucontext(pd->uobject->context);
+	mlx5_ib_db_unmap_user(context, &rwq->db);
+	if (rwq->umem)
+		ib_umem_release(rwq->umem);
+}
+
+static int create_user_rq(struct mlx5_ib_dev *dev, struct ib_pd *pd,
+			  struct mlx5_ib_rwq *rwq,
+			  struct mlx5_ib_create_wq *ucmd)
+{
+	struct mlx5_ib_ucontext *context;
+	int page_shift = 0;
+	int npages;
+	u32 offset = 0;
+	int ncont = 0;
+	int err;
+
+	if (!ucmd->buf_addr)
+		return -EINVAL;
+
+	context = to_mucontext(pd->uobject->context);
+	rwq->umem = ib_umem_get(pd->uobject->context, ucmd->buf_addr,
+			       rwq->buf_size, 0, 0);
+	if (IS_ERR(rwq->umem)) {
+		mlx5_ib_dbg(dev, "umem_get failed\n");
+		err = PTR_ERR(rwq->umem);
+		return err;
+	}
+
+	mlx5_ib_cont_pages(rwq->umem, ucmd->buf_addr, &npages, &page_shift,
+			   &ncont, NULL);
+	err = mlx5_ib_get_buf_offset(ucmd->buf_addr, page_shift,
+				     &rwq->rq_page_offset);
+	if (err) {
+		mlx5_ib_warn(dev, "bad offset\n");
+		goto err_umem;
+	}
+
+	rwq->rq_num_pas = ncont;
+	rwq->page_shift = page_shift;
+	rwq->log_page_size =  page_shift - PAGE_SHIFT;
+	rwq->wq_sig = !!(ucmd->flags & MLX5_WQ_FLAG_SIGNATURE);
+
+	mlx5_ib_dbg(dev, "addr 0x%llx, size %zd, npages %d, page_shift %d, ncont %d, offset %d\n",
+		    (unsigned long long)ucmd->buf_addr, rwq->buf_size,
+		    npages, page_shift, ncont, offset);
+
+	err = mlx5_ib_db_map_user(context, ucmd->db_addr, &rwq->db);
+	if (err) {
+		mlx5_ib_dbg(dev, "map failed\n");
+		goto err_umem;
+	}
+
+	rwq->create_type = MLX5_WQ_USER;
+	return 0;
+
+err_umem:
+	ib_umem_release(rwq->umem);
+	return err;
+}
+
 static int create_user_qp(struct mlx5_ib_dev *dev, struct ib_pd *pd,
 			  struct mlx5_ib_qp *qp, struct ib_udata *udata,
 			  struct mlx5_create_qp_mbox_in **in,
@@ -3160,3 +3226,200 @@ int mlx5_ib_dealloc_xrcd(struct ib_xrcd *xrcd)
 
 	return 0;
 }
+
+static int  create_rq(struct mlx5_ib_rwq *rwq, struct ib_pd *pd,
+		      struct ib_wq_init_attr *init_attr)
+{
+	struct mlx5_ib_dev *dev;
+	__be64 *rq_pas0;
+	void *in;
+	void *rqc;
+	void *wq;
+	int inlen;
+	int err;
+
+	dev = to_mdev(pd->device);
+
+	inlen = MLX5_ST_SZ_BYTES(create_rq_in) + sizeof(u64) * rwq->rq_num_pas;
+	in = mlx5_vzalloc(inlen);
+	if (!in)
+		return -ENOMEM;
+
+	rqc = MLX5_ADDR_OF(create_rq_in, in, ctx);
+	MLX5_SET(rqc,  rqc, mem_rq_type,
+		 MLX5_RQC_MEM_RQ_TYPE_MEMORY_RQ_INLINE);
+	MLX5_SET(rqc, rqc, user_index, rwq->user_index);
+	MLX5_SET(rqc,  rqc, cqn, to_mcq(init_attr->cq)->mcq.cqn);
+	MLX5_SET(rqc,  rqc, state, MLX5_RQC_STATE_RST);
+	MLX5_SET(rqc,  rqc, flush_in_error_en, 1);
+	wq = MLX5_ADDR_OF(rqc, rqc, wq);
+	MLX5_SET(wq, wq, wq_type, MLX5_WQ_TYPE_CYCLIC);
+	MLX5_SET(wq, wq, end_padding_mode, MLX5_WQ_END_PAD_MODE_ALIGN);
+	MLX5_SET(wq, wq, log_wq_stride, rwq->log_rq_stride);
+	MLX5_SET(wq, wq, log_wq_sz, rwq->log_rq_size);
+	MLX5_SET(wq, wq, pd, to_mpd(pd)->pdn);
+	MLX5_SET(wq, wq, page_offset, rwq->rq_page_offset);
+	MLX5_SET(wq, wq, log_wq_pg_sz, rwq->log_page_size);
+	MLX5_SET(wq, wq, wq_signature, rwq->wq_sig);
+	MLX5_SET64(wq, wq, dbr_addr, rwq->db.dma);
+	rq_pas0 = (__be64 *)MLX5_ADDR_OF(wq, wq, pas);
+	mlx5_ib_populate_pas(dev, rwq->umem, rwq->page_shift, rq_pas0, 0);
+	err = mlx5_core_create_rq(dev->mdev, in, inlen, &rwq->rqn);
+	kvfree(in);
+	return err;
+}
+
+static int set_user_rq_size(struct mlx5_ib_dev *dev,
+			    struct ib_wq_init_attr *wq_init_attr,
+			    struct mlx5_ib_create_wq *ucmd,
+			    struct mlx5_ib_rwq *rwq)
+{
+	/* Sanity check RQ size before proceeding */
+	if (wq_init_attr->max_wr > (1 << MLX5_CAP_GEN(dev->mdev, log_max_wq_sz)))
+		return -EINVAL;
+
+	if (!ucmd->rq_wqe_count)
+		return -EINVAL;
+
+	rwq->wqe_count = ucmd->rq_wqe_count;
+	rwq->wqe_shift = ucmd->rq_wqe_shift;
+	rwq->buf_size = (rwq->wqe_count << rwq->wqe_shift);
+	rwq->log_rq_stride = rwq->wqe_shift;
+	rwq->log_rq_size = ilog2(rwq->wqe_count);
+	return 0;
+}
+
+static int prepare_user_rq(struct ib_pd *pd,
+			   struct ib_wq_init_attr *init_attr,
+			   struct ib_udata *udata,
+			   struct mlx5_ib_rwq *rwq)
+{
+	struct mlx5_ib_dev *dev = to_mdev(pd->device);
+	struct mlx5_ib_create_wq ucmd;
+	int err;
+
+	if (udata->inlen < sizeof(ucmd)) {
+		mlx5_ib_dbg(dev, "invalid inlen\n");
+		return -EINVAL;
+	}
+
+	if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) {
+		mlx5_ib_dbg(dev, "copy failed\n");
+		return -EFAULT;
+	}
+
+	if (ucmd.reserved[0] || (ucmd.reserved[1])) {
+		mlx5_ib_dbg(dev, "invalid reserved\n");
+		return -EFAULT;
+	}
+
+	err = set_user_rq_size(dev, init_attr, &ucmd, rwq);
+	if (err) {
+		mlx5_ib_dbg(dev, "err %d\n", err);
+		return err;
+	}
+
+	err = create_user_rq(dev, pd, rwq, &ucmd);
+	if (err) {
+		mlx5_ib_dbg(dev, "err %d\n", err);
+		return err;
+	}
+
+	rwq->user_index = ucmd.user_index;
+	return 0;
+}
+
+struct ib_wq *mlx5_ib_create_wq(struct ib_pd *pd,
+				struct ib_wq_init_attr *init_attr,
+				struct ib_udata *udata)
+{
+	struct mlx5_ib_dev *dev;
+	struct mlx5_ib_rwq *rwq;
+	int err;
+
+	if (!udata)
+		return ERR_PTR(-ENOSYS);
+
+	dev = to_mdev(pd->device);
+	switch (init_attr->wq_type) {
+	case IB_WQT_RQ:
+		rwq = kzalloc(sizeof(*rwq), GFP_KERNEL);
+		if (!rwq)
+			return ERR_PTR(-ENOMEM);
+		err = prepare_user_rq(pd, init_attr, udata, rwq);
+		if (err)
+			goto err;
+		err = create_rq(rwq, pd, init_attr);
+		if (err)
+			goto err_user_rq;
+		break;
+	default:
+		mlx5_ib_dbg(dev, "unsupported wq type %d\n",
+			    init_attr->wq_type);
+		return ERR_PTR(-EINVAL);
+	}
+
+	rwq->ibwq.wq_num = rwq->rqn;
+	rwq->ibwq.state = IB_WQS_RESET;
+	return &rwq->ibwq;
+
+err_user_rq:
+	destroy_user_rq(pd, rwq);
+err:
+	kfree(rwq);
+	return ERR_PTR(err);
+}
+
+int mlx5_ib_destroy_wq(struct ib_wq *wq)
+{
+	struct mlx5_ib_dev *dev = to_mdev(wq->device);
+	struct mlx5_ib_rwq *rwq = to_mrwq(wq);
+
+	if (wq->wq_type != IB_WQT_RQ)
+		return -EINVAL;
+
+	mlx5_core_destroy_rq(dev->mdev, rwq->rqn);
+	destroy_user_rq(wq->pd, rwq);
+	kfree(rwq);
+
+	return 0;
+}
+
+int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
+		      enum ib_wq_attr_mask attr_mask, struct ib_udata *udata)
+{
+	struct mlx5_ib_dev *dev = to_mdev(wq->device);
+	struct mlx5_ib_rwq *rwq = to_mrwq(wq);
+	int curr_wq_state;
+	int wq_state;
+	int inlen;
+	int err;
+	void *rqc;
+	void *in;
+
+	inlen = MLX5_ST_SZ_BYTES(modify_rq_in);
+	in = mlx5_vzalloc(inlen);
+	if (!in)
+		return -ENOMEM;
+
+	rqc = MLX5_ADDR_OF(modify_rq_in, in, ctx);
+
+	curr_wq_state = (attr_mask & IB_WQ_CUR_STATE) ?
+		wq_attr->curr_wq_state : wq->state;
+	wq_state = (attr_mask & IB_WQ_STATE) ?
+		wq_attr->wq_state : curr_wq_state;
+	if (curr_wq_state == IB_WQS_ERR)
+		curr_wq_state = MLX5_RQC_STATE_ERR;
+	if (wq_state == IB_WQS_ERR)
+		wq_state = MLX5_RQC_STATE_ERR;
+	MLX5_SET(modify_rq_in, in, rq_state, curr_wq_state);
+	MLX5_SET(rqc, rqc, state, wq_state);
+
+	err = mlx5_core_modify_rq(dev->mdev, rwq->rqn, in, inlen);
+	kvfree(in);
+	if (!err)
+		rwq->ibwq.state = (wq_state == MLX5_RQC_STATE_ERR) ? IB_WQS_ERR : wq_state;
+
+	return err;
+}
diff --git a/drivers/infiniband/hw/mlx5/user.h b/drivers/infiniband/hw/mlx5/user.h
index 76fb7b9..270c699 100644
--- a/drivers/infiniband/hw/mlx5/user.h
+++ b/drivers/infiniband/hw/mlx5/user.h
@@ -44,6 +44,10 @@ enum {
 	MLX5_SRQ_FLAG_SIGNATURE		= 1 << 0,
 };
 
+enum {
+	MLX5_WQ_FLAG_SIGNATURE		= 1 << 0,
+};
+
 
 /* Increment this value if any changes that break userspace ABI
  * compatibility are made.
@@ -130,4 +134,15 @@ struct mlx5_ib_create_qp {
 struct mlx5_ib_create_qp_resp {
 	__u32	uuar_index;
 };
+
+struct mlx5_ib_create_wq {
+	__u64	buf_addr;
+	__u64	db_addr;
+	__u32	rq_wqe_count;
+	__u32	rq_wqe_shift;
+	__u32	user_index;
+	__u32	flags;
+	__u64	reserved[2];
+};
+
 #endif /* MLX5_IB_USER_H */
-- 
1.7.1


* [PATCH for-next 5/7] IB: Introduce Receive Work Queue indirection table
       [not found] ` <1444893410-13242-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (3 preceding siblings ...)
  2015-10-15  7:16   ` [PATCH for-next 4/7] IB/mlx5: Add receive Work Queue verbs Yishai Hadas
@ 2015-10-15  7:16   ` Yishai Hadas
  2015-10-15  7:16   ` [PATCH for-next 6/7] IB/uverbs: Introduce RWQ Indirection table Yishai Hadas
  2015-10-15  7:16   ` [PATCH for-next 7/7] IB/mlx5: Add Receive Work Queue Indirection table operations Yishai Hadas
  6 siblings, 0 replies; 18+ messages in thread
From: Yishai Hadas @ 2015-10-15  7:16 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, yishaih-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	eli-VPRAkNaXOzVWk0Htik3J/w

Introduce the Receive Work Queue indirection table.
This object can be used to spread incoming traffic across different
receive Work Queues.
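
The reference counting introduced here pins the WQs for the table's
lifetime; a sketch of the resulting contract (assuming 'wq' is one of
the entries of 'ind_tbl'):

	err = ib_destroy_wq(wq);	/* fails with -EBUSY: usecnt > 0 */
	err = ib_destroy_rwq_ind_table(ind_tbl); /* drops the WQ references */
	err = ib_destroy_wq(wq);	/* now succeeds */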

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/verbs.c |   52 +++++++++++++++++++++++++++++++++++++++
 include/rdma/ib_verbs.h         |   44 +++++++++++++++++++++++++++++++++
 2 files changed, 96 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index c63c622..12e1a82 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1494,6 +1494,58 @@ int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
 }
 EXPORT_SYMBOL(ib_modify_wq);
 
+struct ib_rwq_ind_table *ib_create_rwq_ind_table(struct ib_device *device,
+						 struct ib_rwq_ind_table_init_attr*
+						 init_attr)
+{
+	struct ib_rwq_ind_table *rwq_ind_table;
+	int i;
+	u32 table_size;
+
+	if (!device->create_rwq_ind_table)
+		return ERR_PTR(-ENOSYS);
+
+	table_size = (1 << init_attr->log_ind_tbl_size);
+	rwq_ind_table = device->create_rwq_ind_table(device,
+				init_attr, NULL);
+	if (IS_ERR(rwq_ind_table))
+		return rwq_ind_table;
+
+	rwq_ind_table->ind_tbl = init_attr->ind_tbl;
+	rwq_ind_table->log_ind_tbl_size = init_attr->log_ind_tbl_size;
+	rwq_ind_table->device = device;
+	rwq_ind_table->uobject = NULL;
+	atomic_set(&rwq_ind_table->usecnt, 0);
+
+	for (i = 0; i < table_size; i++)
+		atomic_inc(&rwq_ind_table->ind_tbl[i]->usecnt);
+
+	return rwq_ind_table;
+}
+EXPORT_SYMBOL(ib_create_rwq_ind_table);
+
+int ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *rwq_ind_table)
+{
+	int err, i;
+	u32 table_size = (1 << rwq_ind_table->log_ind_tbl_size);
+	struct ib_wq **ind_tbl = rwq_ind_table->ind_tbl;
+
+	if (!rwq_ind_table->device->destroy_rwq_ind_table)
+		return -ENOSYS;
+
+	if (atomic_read(&rwq_ind_table->usecnt))
+		return -EBUSY;
+
+	err = rwq_ind_table->device->destroy_rwq_ind_table(rwq_ind_table);
+	if (!err) {
+		for (i = 0; i < table_size; i++)
+			atomic_dec(&ind_tbl[i]->usecnt);
+	}
+
+	return err;
+}
+EXPORT_SYMBOL(ib_destroy_rwq_ind_table);
+
 struct ib_flow *ib_create_flow(struct ib_qp *qp,
 			       struct ib_flow_attr *flow_attr,
 			       int domain)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 808ccee..d517abe 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1355,6 +1355,21 @@ struct ib_wq_attr {
 	enum	ib_wq_state	curr_wq_state;
 };
 
+struct ib_rwq_ind_table {
+	struct ib_device	*device;
+	struct ib_uobject      *uobject;
+	atomic_t		usecnt;
+	u32		ind_tbl_num;
+	u32		log_ind_tbl_size;
+	struct ib_wq	**ind_tbl;
+};
+
+struct ib_rwq_ind_table_init_attr {
+	u32		log_ind_tbl_size;
+	/* Each entry is a pointer to Receive Work Queue */
+	struct ib_wq	**ind_tbl;
+};
+
 struct ib_qp {
 	struct ib_device       *device;
 	struct ib_pd	       *pd;
@@ -1824,6 +1839,11 @@ struct ib_device {
 						struct ib_wq_attr *attr,
 						enum ib_wq_attr_mask attr_mask,
 						struct ib_udata *udata);
+	struct ib_rwq_ind_table *  (*create_rwq_ind_table)(struct ib_device *device,
+							   struct ib_rwq_ind_table_init_attr *init_attr,
+							   struct ib_udata *udata);
+	int			   (*destroy_rwq_ind_table)(struct ib_rwq_ind_table *wq_ind_table);
+
 
 	struct ib_dma_mapping_ops   *dma_ops;
 
@@ -3115,4 +3135,28 @@ int ib_destroy_wq(struct ib_wq *wq);
 int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *attr,
 		 enum ib_wq_attr_mask attr_mask);
 
+/*
+ * ib_create_rwq_ind_table - Creates a RQ Indirection Table associated
+ * with the specified protection domain.
+ * @device: The device on which to create the rwq indirection table.
+ * @ib_rwq_ind_table_init_attr: A list of initial attributes required to
+ * create the Indirection Table.
+ *
+ * Note: The lifetime of ib_rwq_ind_table_init_attr->ind_tbl must not be
+ *	shorter than that of the created ib_rwq_ind_table object; the caller
+ *	is responsible for its memory allocation/free.
+ *
+ * Return Value:
+ * ib_create_rwq_ind_table returns a pointer to the created
+ * Indirection Table, or NULL if the request fails.
+ */
+struct ib_rwq_ind_table *ib_create_rwq_ind_table(struct ib_device *device,
+						 struct ib_rwq_ind_table_init_attr
+						 *wq_ind_table_init_attr);
+/**
+ * ib_destroy_rwq_ind_table - Destroys the specified Indirection Table.
+ * @wq_ind_table: The Indirection Table to destroy.
+ */
+int ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *wq_ind_table);
+
 #endif /* IB_VERBS_H */
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 18+ messages in thread
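
To make the new verbs concrete, here is a minimal kernel-side sketch of
a possible consumer.  The helper name my_create_rss_table, the fixed
table size and the WQ capacities are hypothetical, and unwinding of
already-created WQs is abbreviated:

/* Hypothetical consumer: build a 4-entry RSS indirection table from
 * four RQ-type WQs sharing one CQ.  init_attr.ind_tbl must outlive the
 * indirection table object (see the Note above), hence the kcalloc()
 * rather than a stack array.
 */
static struct ib_rwq_ind_table *my_create_rss_table(struct ib_pd *pd,
						    struct ib_cq *cq)
{
	struct ib_rwq_ind_table_init_attr init_attr = {};
	struct ib_wq_init_attr wq_attr = {};
	struct ib_rwq_ind_table *ind_tbl;
	struct ib_wq **wqs;
	int i;

	wqs = kcalloc(4, sizeof(*wqs), GFP_KERNEL);
	if (!wqs)
		return ERR_PTR(-ENOMEM);

	wq_attr.wq_type = IB_WQT_RQ;
	wq_attr.max_wr = 128;	/* may be rounded up on return */
	wq_attr.max_sge = 1;
	wq_attr.cq = cq;

	for (i = 0; i < 4; i++) {
		wqs[i] = ib_create_wq(pd, &wq_attr);
		if (IS_ERR(wqs[i])) {
			/* destroying wqs[0..i-1] omitted */
			kfree(wqs);
			return ERR_PTR(-EINVAL);
		}
	}

	init_attr.log_ind_tbl_size = 2;	/* 1 << 2 == 4 entries */
	init_attr.ind_tbl = wqs;

	ind_tbl = ib_create_rwq_ind_table(pd->device, &init_attr);
	if (IS_ERR(ind_tbl)) {
		/* destroying the WQs omitted */
		kfree(wqs);
	}
	return ind_tbl;
}

Teardown mirrors this in reverse: ib_destroy_rwq_ind_table() drops the
usecnt references taken on each WQ at create time, after which the WQs
can be destroyed and the array freed by the caller.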

* [PATCH for-next 6/7] IB/uverbs: Introduce RWQ Indirection table
       [not found] ` <1444893410-13242-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (4 preceding siblings ...)
  2015-10-15  7:16   ` [PATCH for-next 5/7] IB: Introduce Receive Work Queue indirection table Yishai Hadas
@ 2015-10-15  7:16   ` Yishai Hadas
  2015-10-15  7:16   ` [PATCH for-next 7/7] IB/mlx5: Add Receive Work Queue Indirection table operations Yishai Hadas
  6 siblings, 0 replies; 18+ messages in thread
From: Yishai Hadas @ 2015-10-15  7:16 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, yishaih-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	eli-VPRAkNaXOzVWk0Htik3J/w

Introduce the RWQ indirection table uverbs commands:
create and destroy.

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs.h      |    3 +
 drivers/infiniband/core/uverbs_cmd.c  |  191 +++++++++++++++++++++++++++++++++
 drivers/infiniband/core/uverbs_main.c |   13 +++
 include/rdma/ib_verbs.h               |    1 +
 include/uapi/rdma/ib_user_verbs.h     |   24 ++++
 5 files changed, 232 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index ab3d331..0885d7a 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -186,6 +186,7 @@ extern struct idr ib_uverbs_srq_idr;
 extern struct idr ib_uverbs_xrcd_idr;
 extern struct idr ib_uverbs_rule_idr;
 extern struct idr ib_uverbs_wq_idr;
+extern struct idr ib_uverbs_rwq_ind_tbl_idr;
 
 void idr_remove_uobj(struct idr *idp, struct ib_uobject *uobj);
 
@@ -281,5 +282,7 @@ IB_UVERBS_DECLARE_EX_CMD(create_cq);
 IB_UVERBS_DECLARE_EX_CMD(create_wq);
 IB_UVERBS_DECLARE_EX_CMD(modify_wq);
 IB_UVERBS_DECLARE_EX_CMD(destroy_wq);
+IB_UVERBS_DECLARE_EX_CMD(create_rwq_ind_table);
+IB_UVERBS_DECLARE_EX_CMD(destroy_rwq_ind_table);
 
 #endif /* UVERBS_H */
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index d316cd7..7653a0c 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -58,6 +58,7 @@ static struct uverbs_lock_class srq_lock_class	= { .name = "SRQ-uobj" };
 static struct uverbs_lock_class xrcd_lock_class = { .name = "XRCD-uobj" };
 static struct uverbs_lock_class rule_lock_class = { .name = "RULE-uobj" };
 static struct uverbs_lock_class wq_lock_class = { .name = "WQ-uobj" };
+static struct uverbs_lock_class rwq_ind_table_lock_class = { .name = "IND_TBL-uobj" };
 
 /*
  * The ib_uobject locking scheme is as follows:
@@ -339,6 +340,7 @@ ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 	INIT_LIST_HEAD(&ucontext->srq_list);
 	INIT_LIST_HEAD(&ucontext->ah_list);
 	INIT_LIST_HEAD(&ucontext->wq_list);
+	INIT_LIST_HEAD(&ucontext->rwq_ind_tbl_list);
 	INIT_LIST_HEAD(&ucontext->xrcd_list);
 	INIT_LIST_HEAD(&ucontext->rule_list);
 	rcu_read_lock();
@@ -3133,6 +3135,195 @@ int ib_uverbs_ex_modify_wq(struct ib_uverbs_file *file,
 	return ret;
 }
 
+int ib_uverbs_ex_create_rwq_ind_table(struct ib_uverbs_file *file,
+				      struct ib_device *ib_dev,
+				      struct ib_udata *ucore,
+				      struct ib_udata *uhw)
+{
+	struct ib_uverbs_ex_create_rwq_ind_table	  cmd;
+	struct ib_uverbs_ex_create_rwq_ind_table_resp	resp;
+	struct ib_uobject		  *uobj;
+	int err = 0;
+	struct ib_rwq_ind_table_init_attr init_attr;
+	struct ib_rwq_ind_table *rwq_ind_tbl;
+	struct ib_wq	**wqs = NULL;
+	u32 *wqs_handles = NULL;
+	struct ib_wq	*wq = NULL;
+	int i, j, num_read_wqs;
+	u32 num_wq_handles;
+	u32 expected_in_size;
+
+	if (ucore->inlen < sizeof(cmd))
+		return -EINVAL;
+
+	if (ucore->outlen < sizeof(resp))
+		return -ENOSPC;
+
+	err = ib_copy_from_udata(&cmd, ucore, sizeof(cmd));
+	if (err)
+		return err;
+
+	ucore->inbuf += sizeof(cmd);
+	ucore->inlen -= sizeof(cmd);
+
+	if (cmd.comp_mask)
+		return -EINVAL;
+
+	num_wq_handles = 1 << cmd.log_ind_tbl_size;
+	expected_in_size = num_wq_handles * sizeof(__u32);
+	if (num_wq_handles == 1)
+		/* input size for wq handles is u64 aligned */
+		expected_in_size += sizeof(__u32);
+
+	if (ucore->inlen != expected_in_size)
+		return -EINVAL;
+
+	wqs_handles = kcalloc(num_wq_handles, sizeof(*wqs_handles),
+			      GFP_KERNEL);
+	if (!wqs_handles)
+		return -ENOMEM;
+
+	err = ib_copy_from_udata(wqs_handles, ucore,
+				 num_wq_handles * sizeof(__u32));
+	if (err)
+		goto err_free;
+
+	wqs = kcalloc(num_wq_handles, sizeof(*wqs), GFP_KERNEL);
+	if (!wqs) {
+		err = -ENOMEM;
+		goto err_free;
+	}
+
+	for (num_read_wqs = 0; num_read_wqs < num_wq_handles;
+			num_read_wqs++) {
+		wq = idr_read_wq(wqs_handles[num_read_wqs], file->ucontext);
+		if (!wq) {
+			err = -EINVAL;
+			goto put_wqs;
+		}
+
+		wqs[num_read_wqs] = wq;
+	}
+
+	uobj = kmalloc(sizeof(*uobj), GFP_KERNEL);
+	if (!uobj) {
+		err = -ENOMEM;
+		goto put_wqs;
+	}
+
+	init_uobj(uobj, 0, file->ucontext, &rwq_ind_table_lock_class);
+	down_write(&uobj->mutex);
+	memset(&init_attr, 0, sizeof(init_attr));
+	init_attr.log_ind_tbl_size = cmd.log_ind_tbl_size;
+	init_attr.ind_tbl = wqs;
+	rwq_ind_tbl = ib_dev->create_rwq_ind_table(ib_dev, &init_attr, uhw);
+
+	if (IS_ERR(rwq_ind_tbl)) {
+		err = PTR_ERR(rwq_ind_tbl);
+		goto err_uobj;
+	}
+
+	rwq_ind_tbl->ind_tbl = wqs;
+	rwq_ind_tbl->log_ind_tbl_size = init_attr.log_ind_tbl_size;
+	rwq_ind_tbl->uobject = uobj;
+	uobj->object = rwq_ind_tbl;
+	rwq_ind_tbl->device = ib_dev;
+	atomic_set(&rwq_ind_tbl->usecnt, 0);
+
+	for (i = 0; i < num_wq_handles; i++)
+		atomic_inc(&wqs[i]->usecnt);
+
+	err = idr_add_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+	if (err)
+		goto destroy_ind_tbl;
+
+	memset(&resp, 0, sizeof(resp));
+	resp.ind_tbl_handle = uobj->id;
+	resp.ind_tbl_num = rwq_ind_tbl->ind_tbl_num;
+	resp.response_length = offsetof(typeof(resp), ind_tbl_num) + sizeof(resp.ind_tbl_num);
+
+	err = ib_copy_to_udata(ucore,
+			       &resp, sizeof(resp));
+	if (err)
+		goto err_copy;
+
+	kfree(wqs_handles);
+
+	for (j = 0; j < num_read_wqs; j++)
+		put_wq_read(wqs[j]);
+
+	mutex_lock(&file->mutex);
+	list_add_tail(&uobj->list, &file->ucontext->rwq_ind_tbl_list);
+	mutex_unlock(&file->mutex);
+
+	uobj->live = 1;
+
+	up_write(&uobj->mutex);
+	return 0;
+
+err_copy:
+	idr_remove_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+destroy_ind_tbl:
+	ib_destroy_rwq_ind_table(rwq_ind_tbl);
+err_uobj:
+	put_uobj_write(uobj);
+put_wqs:
+	for (j = 0; j < num_read_wqs; j++)
+		put_wq_read(wqs[j]);
+err_free:
+	kfree(wqs_handles);
+	kfree(wqs);
+	return err;
+}
+
+int ib_uverbs_ex_destroy_rwq_ind_table(struct ib_uverbs_file *file,
+				       struct ib_device *ib_dev,
+				       struct ib_udata *ucore,
+				       struct ib_udata *uhw)
+{
+	struct ib_uverbs_ex_destroy_rwq_ind_table	cmd;
+	struct ib_rwq_ind_table *rwq_ind_tbl;
+	struct ib_uobject		*uobj;
+	int			ret;
+	struct ib_wq	**ind_tbl;
+
+	if (ucore->inlen < sizeof(cmd))
+		return -EINVAL;
+
+	ret = ib_copy_from_udata(&cmd, ucore, sizeof(cmd));
+	if (ret)
+		return ret;
+
+	if (cmd.comp_mask)
+		return -EINVAL;
+
+	uobj = idr_write_uobj(&ib_uverbs_rwq_ind_tbl_idr, cmd.ind_tbl_handle,
+			      file->ucontext);
+	if (!uobj)
+		return -EINVAL;
+	rwq_ind_tbl = uobj->object;
+	ind_tbl = rwq_ind_tbl->ind_tbl;
+
+	ret = ib_destroy_rwq_ind_table(rwq_ind_tbl);
+	if (!ret)
+		uobj->live = 0;
+
+	put_uobj_write(uobj);
+
+	if (ret)
+		return ret;
+
+	idr_remove_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+
+	mutex_lock(&file->mutex);
+	list_del(&uobj->list);
+	mutex_unlock(&file->mutex);
+
+	put_uobj(uobj);
+	kfree(ind_tbl);
+	return ret;
+}
+
 int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 			     struct ib_device *ib_dev,
 			     struct ib_udata *ucore,
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 8ae75cc..cf548df 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -75,6 +75,7 @@ DEFINE_IDR(ib_uverbs_srq_idr);
 DEFINE_IDR(ib_uverbs_xrcd_idr);
 DEFINE_IDR(ib_uverbs_rule_idr);
 DEFINE_IDR(ib_uverbs_wq_idr);
+DEFINE_IDR(ib_uverbs_rwq_ind_tbl_idr);
 
 static DEFINE_SPINLOCK(map_lock);
 static DECLARE_BITMAP(dev_map, IB_UVERBS_MAX_DEVICES);
@@ -131,6 +132,8 @@ static int (*uverbs_ex_cmd_table[])(struct ib_uverbs_file *file,
 	[IB_USER_VERBS_EX_CMD_CREATE_WQ]	= ib_uverbs_ex_create_wq,
 	[IB_USER_VERBS_EX_CMD_MODIFY_WQ]	= ib_uverbs_ex_modify_wq,
 	[IB_USER_VERBS_EX_CMD_DESTROY_WQ]	= ib_uverbs_ex_destroy_wq,
+	[IB_USER_VERBS_EX_CMD_CREATE_RWQ_IND_TBL] = ib_uverbs_ex_create_rwq_ind_table,
+	[IB_USER_VERBS_EX_CMD_DESTROY_RWQ_IND_TBL] = ib_uverbs_ex_destroy_rwq_ind_table,
 };
 
 static void ib_uverbs_add_one(struct ib_device *device);
@@ -255,6 +258,16 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		kfree(uqp);
 	}
 
+	list_for_each_entry_safe(uobj, tmp, &context->rwq_ind_tbl_list, list) {
+		struct ib_rwq_ind_table *rwq_ind_tbl = uobj->object;
+		struct ib_wq **ind_tbl = rwq_ind_tbl->ind_tbl;
+
+		idr_remove_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+		ib_destroy_rwq_ind_table(rwq_ind_tbl);
+		kfree(ind_tbl);
+		kfree(uobj);
+	}
+
 	list_for_each_entry_safe(uobj, tmp, &context->wq_list, list) {
 		struct ib_wq *wq = uobj->object;
 		struct ib_uwq_object *uwq =
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index d517abe..36f4a45 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1221,6 +1221,7 @@ struct ib_ucontext {
 	struct list_head	xrcd_list;
 	struct list_head	rule_list;
 	struct list_head	wq_list;
+	struct list_head	rwq_ind_tbl_list;
 	int			closing;
 
 	struct pid             *tgid;
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 2dc47e0..d4b92ee 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -97,6 +97,8 @@ enum {
 	IB_USER_VERBS_EX_CMD_CREATE_WQ,
 	IB_USER_VERBS_EX_CMD_MODIFY_WQ,
 	IB_USER_VERBS_EX_CMD_DESTROY_WQ,
+	IB_USER_VERBS_EX_CMD_CREATE_RWQ_IND_TBL,
+	IB_USER_VERBS_EX_CMD_DESTROY_RWQ_IND_TBL
 };
 
 /*
@@ -960,4 +962,26 @@ struct ib_uverbs_ex_modify_wq  {
 	__u32 curr_wq_state;
 };
 
+struct ib_uverbs_ex_create_rwq_ind_table  {
+	__u32 comp_mask;
+	__u32 log_ind_tbl_size;
+	/* Following are the wq handles according to log_ind_tbl_size
+	 * wq_handle1
+	 * wq_handle2
+	 */
+	__u32 wq_handles[0];
+};
+
+struct ib_uverbs_ex_create_rwq_ind_table_resp {
+	__u32 comp_mask;
+	__u32 response_length;
+	__u32 ind_tbl_handle;
+	__u32 ind_tbl_num;
+};
+
+struct ib_uverbs_ex_destroy_rwq_ind_table  {
+	__u32 comp_mask;
+	__u32 ind_tbl_handle;
+};
+
 #endif /* IB_USER_VERBS_H */
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 18+ messages in thread
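
The variable-length wq_handles[] array makes the command sizing easy to
get wrong.  Below is a sketch mirroring the expected_in_size computation
in the create handler above; the helper is hypothetical and not part of
any posted userspace library:

#include <stddef.h>
#include <stdint.h>
#include <rdma/ib_user_verbs.h>

/* Hypothetical userspace helper: total payload size for the
 * CREATE_RWQ_IND_TBL extended command.  The kernel requires the
 * handle array to keep the payload u64 aligned, so a single-entry
 * table carries one extra __u32 of padding.
 */
static size_t rwq_ind_tbl_cmd_size(uint32_t log_ind_tbl_size)
{
	uint32_t num_wq_handles = 1U << log_ind_tbl_size;
	size_t sz = sizeof(struct ib_uverbs_ex_create_rwq_ind_table);

	sz += num_wq_handles * sizeof(uint32_t);
	if (num_wq_handles == 1)
		sz += sizeof(uint32_t);	/* pad the handles to 8 bytes */
	return sz;
}

Note that the kernel checks ucore->inlen only after the fixed part of
the command has been consumed, so expected_in_size in the handler
covers just the handle array plus padding.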

* [PATCH for-next 7/7] IB/mlx5: Add Receive Work Queue Indirection table operations
       [not found] ` <1444893410-13242-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (5 preceding siblings ...)
  2015-10-15  7:16   ` [PATCH for-next 6/7] IB/uverbs: Introduce RWQ Indirection table Yishai Hadas
@ 2015-10-15  7:16   ` Yishai Hadas
  6 siblings, 0 replies; 18+ messages in thread
From: Yishai Hadas @ 2015-10-15  7:16 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, yishaih-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	eli-VPRAkNaXOzVWk0Htik3J/w

Add Receive Work Queue Indirection table operations:
create and destroy.

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/main.c    |    6 +++-
 drivers/infiniband/hw/mlx5/mlx5_ib.h |   14 ++++++++
 drivers/infiniband/hw/mlx5/qp.c      |   56 ++++++++++++++++++++++++++++++++++
 3 files changed, 75 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 1601f19..aa5f3c3 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1512,7 +1512,11 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 		dev->ib_dev.uverbs_ex_cmd_mask |=
 				(1ull << IB_USER_VERBS_EX_CMD_CREATE_WQ) |
 				(1ull << IB_USER_VERBS_EX_CMD_MODIFY_WQ) |
-				(1ull << IB_USER_VERBS_EX_CMD_DESTROY_WQ);
+				(1ull << IB_USER_VERBS_EX_CMD_DESTROY_WQ) |
+				(1ull << IB_USER_VERBS_EX_CMD_CREATE_RWQ_IND_TBL) |
+				(1ull << IB_USER_VERBS_EX_CMD_DESTROY_RWQ_IND_TBL);
+		dev->ib_dev.create_rwq_ind_table = mlx5_ib_create_rwq_ind_table;
+		dev->ib_dev.destroy_rwq_ind_table = mlx5_ib_destroy_rwq_ind_table;
 	}
 	err = init_node_data(dev);
 	if (err)
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 6c00aca..adf544e 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -173,6 +173,11 @@ enum {
 	MLX5_WQ_KERNEL
 };
 
+struct mlx5_ib_rwq_ind_table {
+	struct ib_rwq_ind_table ib_rwq_ind_tbl;
+	u32			rqtn;
+};
+
 /*
  * Connect-IB can trigger up to four concurrent pagefaults
  * per-QP.
@@ -522,6 +527,11 @@ static inline struct mlx5_ib_rwq *to_mrwq(struct ib_wq *ibwq)
 	return container_of(ibwq, struct mlx5_ib_rwq, ibwq);
 }
 
+static inline struct mlx5_ib_rwq_ind_table *to_mrwq_ind_table(struct ib_rwq_ind_table *ib_rwq_ind_tbl)
+{
+	return container_of(ib_rwq_ind_tbl, struct mlx5_ib_rwq_ind_table, ib_rwq_ind_tbl);
+}
+
 static inline struct mlx5_ib_srq *to_mibsrq(struct mlx5_core_srq *msrq)
 {
 	return container_of(msrq, struct mlx5_ib_srq, msrq);
@@ -665,6 +675,10 @@ struct ib_wq *mlx5_ib_create_wq(struct ib_pd *pd,
 int mlx5_ib_destroy_wq(struct ib_wq *wq);
 int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
 		      enum ib_wq_attr_mask attr_mask, struct ib_udata *udata);
+struct ib_rwq_ind_table *mlx5_ib_create_rwq_ind_table(struct ib_device *device,
+						      struct ib_rwq_ind_table_init_attr *init_attr,
+						      struct ib_udata *udata);
+int mlx5_ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *wq_ind_table);
 
 #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 extern struct workqueue_struct *mlx5_ib_page_fault_wq;
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 3763205..cef137a 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -3386,6 +3386,62 @@ int mlx5_ib_destroy_wq(struct ib_wq *wq)
 	return 0;
 }
 
+struct ib_rwq_ind_table *mlx5_ib_create_rwq_ind_table(struct ib_device *device,
+						      struct ib_rwq_ind_table_init_attr *init_attr,
+						      struct ib_udata *udata)
+{
+	struct mlx5_ib_dev *dev = to_mdev(device);
+	struct mlx5_ib_rwq_ind_table *rwq_ind_tbl;
+	int sz = 1 << init_attr->log_ind_tbl_size;
+	int inlen;
+	int err;
+	int i;
+	u32 *in;
+	void *rqtc;
+
+	rwq_ind_tbl = kzalloc(sizeof(*rwq_ind_tbl), GFP_KERNEL);
+	if (!rwq_ind_tbl)
+		return ERR_PTR(-ENOMEM);
+
+	inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + sizeof(u32) * sz;
+	in = mlx5_vzalloc(inlen);
+	if (!in) {
+		err = -ENOMEM;
+		goto err;
+	}
+
+	rqtc = MLX5_ADDR_OF(create_rqt_in, in, rqt_context);
+
+	MLX5_SET(rqtc, rqtc, rqt_actual_size, sz);
+	MLX5_SET(rqtc, rqtc, rqt_max_size, sz);
+
+	for (i = 0; i < sz; i++)
+		MLX5_SET(rqtc, rqtc, rq_num[i], init_attr->ind_tbl[i]->wq_num);
+
+	err = mlx5_core_create_rqt(dev->mdev, in, inlen, &rwq_ind_tbl->rqtn);
+	kvfree(in);
+
+	if (err)
+		goto err;
+
+	rwq_ind_tbl->ib_rwq_ind_tbl.ind_tbl_num = rwq_ind_tbl->rqtn;
+	return &rwq_ind_tbl->ib_rwq_ind_tbl;
+err:
+	kfree(rwq_ind_tbl);
+	return ERR_PTR(err);
+}
+
+int mlx5_ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *ib_rwq_ind_tbl)
+{
+	struct mlx5_ib_rwq_ind_table *rwq_ind_tbl = to_mrwq_ind_table(ib_rwq_ind_tbl);
+	struct mlx5_ib_dev *dev = to_mdev(ib_rwq_ind_tbl->device);
+
+	mlx5_core_destroy_rqt(dev->mdev, rwq_ind_tbl->rqtn);
+
+	kfree(rwq_ind_tbl);
+	return 0;
+}
+
 int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
 		      enum ib_wq_attr_mask attr_mask, struct ib_udata *udata)
 {
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH for-next 2/7] IB: Introduce Work Queue object and its verbs
       [not found]     ` <1444893410-13242-3-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-10-15  8:50       ` Sagi Grimberg
       [not found]         ` <561F68D0.4050205-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  2015-10-15  9:13       ` Parav Pandit
  1 sibling, 1 reply; 18+ messages in thread
From: Sagi Grimberg @ 2015-10-15  8:50 UTC (permalink / raw)
  To: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, alexv-VPRAkNaXOzVWk0Htik3J/w,
	tzahio-VPRAkNaXOzVWk0Htik3J/w, eli-VPRAkNaXOzVWk0Htik3J/w

Hi Yishai,

> +/**
> + * ib_create_wq - Creates a WQ associated with the specified protection
> + * domain.
> + * @pd: The protection domain associated with the WQ.
> + * @wq_init_attr: A list of initial attributes required to create the
> + * WQ. If WQ creation succeeds, then the attributes are updated to
> + * the actual capabilities of the created WQ.
> + *
> + * wq_init_attr->max_wr and wq_init_attr->max_sge determine
> + * the requested size of the WQ, and are set to the actual values allocated
> + * on return.
> + * If ib_create_wq() succeeds, then max_wr and max_sge will always be
> + * at least as large as the requested values.
> + *
> + * Return Value
> + * ib_create_wq() returns a pointer to the created WQ, or an error pointer
> + * (ERR_PTR) if the request fails.
> + */
> +struct ib_wq *ib_create_wq(struct ib_pd *pd,
> +			   struct ib_wq_init_attr *init_attr);
> +

We started shifting function documentation from the header declarations
to the *.c implementations. Would you mind moving this one too?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH for-next 2/7] IB: Introduce Work Queue object and its verbs
       [not found]     ` <1444893410-13242-3-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-10-15  8:50       ` Sagi Grimberg
@ 2015-10-15  9:13       ` Parav Pandit
       [not found]         ` <CAG53R5V7u7o+cVQnqhfjXdUjKJRVBiMx5CF3snWAWKG6ZbWMow-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 18+ messages in thread
From: Parav Pandit @ 2015-10-15  9:13 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	eli-VPRAkNaXOzVWk0Htik3J/w

Just curious, why does a WQ need to be bound to a PD?
Isn't the ucontext sufficient?
Or, because a kcontext doesn't exist, does the PD serve that role?
Or is this just a manifestation of how the hardware behaves?

Since you mentioned that "QP can be configured to use "external" WQ
object", it might be worthwhile to reuse a WQ across multiple QPs of
different PDs.
The MR validation check has to happen between the MR and the actual
QP, and might not require that check against the WQ.

Parav

On Thu, Oct 15, 2015 at 12:46 PM, Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> Introduce Work Queue object and its create/destroy/modify verbs.
>
> QP can be created without internal WQs "packaged" inside it,
> this QP can be configured to use "external" WQ object as its
> receive/send queue.
> WQ is a necessary component for RSS technology since RSS mechanism
> is supposed to distribute the traffic between multiple
> Receive Work Queues.
>
> A WQ is associated (many to one) with a Completion Queue, and it owns the
> WQ properties (PD, WQ size, etc.).
> A WQ has a type; this patch introduces IB_WQT_RQ (i.e. receive queue),
> and it may be extended to others such as IB_WQT_SQ (send queue).
> A WQ of type IB_WQT_RQ contains receive work requests.
>
> WQ context is subject to well-defined state transitions done by
> the modify_wq verb.
> When WQ is created its initial state becomes IB_WQS_RESET.
> From IB_WQS_RESET it can be modified to itself or to IB_WQS_RDY.
> From IB_WQS_RDY it can be modified to itself, to IB_WQS_RESET
> or to IB_WQS_ERR.
> From IB_WQS_ERR it can be modified to IB_WQS_RESET.
>
> Note: transition to IB_WQS_ERR might occur implicitly in case there
> was some HW error.
>
>
> Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/core/verbs.c |   59 ++++++++++++++++++++++++++
>  include/rdma/ib_verbs.h         |   88 +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 147 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index e1f2c98..c63c622 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -1435,6 +1435,65 @@ int ib_dealloc_xrcd(struct ib_xrcd *xrcd)
>  }
>  EXPORT_SYMBOL(ib_dealloc_xrcd);
>
> +struct ib_wq *ib_create_wq(struct ib_pd *pd,
> +                          struct ib_wq_init_attr *wq_attr)
> +{
> +       struct ib_wq *wq;
> +
> +       if (!pd->device->create_wq)
> +               return ERR_PTR(-ENOSYS);
> +
> +       wq = pd->device->create_wq(pd, wq_attr, NULL);
> +       if (!IS_ERR(wq)) {
> +               wq->event_handler = wq_attr->event_handler;
> +               wq->wq_context = wq_attr->wq_context;
> +               wq->wq_type = wq_attr->wq_type;
> +               wq->cq = wq_attr->cq;
> +               wq->device = pd->device;
> +               wq->pd = pd;
> +               wq->uobject = NULL;
> +               atomic_inc(&pd->usecnt);
> +               atomic_inc(&wq_attr->cq->usecnt);
> +               atomic_set(&wq->usecnt, 0);
> +       }
> +       return wq;
> +}
> +EXPORT_SYMBOL(ib_create_wq);
> +
> +int ib_destroy_wq(struct ib_wq *wq)
> +{
> +       int err;
> +       struct ib_cq *cq = wq->cq;
> +       struct ib_pd *pd = wq->pd;
> +
> +       if (!wq->device->destroy_wq)
> +               return -ENOSYS;
> +
> +       if (atomic_read(&wq->usecnt))
> +               return -EBUSY;
> +
> +       err = wq->device->destroy_wq(wq);
> +       if (!err) {
> +               atomic_dec(&pd->usecnt);
> +               atomic_dec(&cq->usecnt);
> +       }
> +       return err;
> +}
> +EXPORT_SYMBOL(ib_destroy_wq);
> +
> +int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
> +                enum ib_wq_attr_mask attr_mask)
> +{
> +       int err;
> +
> +       if (!wq->device->modify_wq)
> +               return -ENOSYS;
> +
> +       err = wq->device->modify_wq(wq, wq_attr, attr_mask, NULL);
> +       return err;
> +}
> +EXPORT_SYMBOL(ib_modify_wq);
> +
>  struct ib_flow *ib_create_flow(struct ib_qp *qp,
>                                struct ib_flow_attr *flow_attr,
>                                int domain)
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index e1f65e2..0c6291b 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1310,6 +1310,48 @@ struct ib_srq {
>         } ext;
>  };
>
> +enum ib_wq_type {
> +       IB_WQT_RQ
> +};
> +
> +enum ib_wq_state {
> +       IB_WQS_RESET,
> +       IB_WQS_RDY,
> +       IB_WQS_ERR
> +};
> +
> +struct ib_wq {
> +       struct ib_device       *device;
> +       struct ib_uobject      *uobject;
> +       void                *wq_context;
> +       void                (*event_handler)(struct ib_event *, void *);
> +       struct ib_pd           *pd;
> +       struct ib_cq           *cq;
> +       u32             wq_num;
> +       enum ib_wq_state       state;
> +       enum ib_wq_type wq_type;
> +       atomic_t                usecnt;
> +};
> +
> +struct ib_wq_init_attr {
> +       void                   *wq_context;
> +       enum ib_wq_type wq_type;
> +       u32             max_wr;
> +       u32             max_sge;
> +       struct  ib_cq          *cq;
> +       void                (*event_handler)(struct ib_event *, void *);
> +};
> +
> +enum ib_wq_attr_mask {
> +       IB_WQ_STATE     = 1 << 0,
> +       IB_WQ_CUR_STATE = 1 << 1,
> +};
> +
> +struct ib_wq_attr {
> +       enum    ib_wq_state     wq_state;
> +       enum    ib_wq_state     curr_wq_state;
> +};
> +
>  struct ib_qp {
>         struct ib_device       *device;
>         struct ib_pd           *pd;
> @@ -1771,6 +1813,14 @@ struct ib_device {
>         int                        (*check_mr_status)(struct ib_mr *mr, u32 check_mask,
>                                                       struct ib_mr_status *mr_status);
>         void                       (*disassociate_ucontext)(struct ib_ucontext *ibcontext);
> +       struct ib_wq *             (*create_wq)(struct ib_pd *pd,
> +                                               struct ib_wq_init_attr *init_attr,
> +                                               struct ib_udata *udata);
> +       int                        (*destroy_wq)(struct ib_wq *wq);
> +       int                        (*modify_wq)(struct ib_wq *wq,
> +                                               struct ib_wq_attr *attr,
> +                                               enum ib_wq_attr_mask attr_mask,
> +                                               struct ib_udata *udata);
>
>         struct ib_dma_mapping_ops   *dma_ops;
>
> @@ -3024,4 +3074,42 @@ struct net_device *ib_get_net_dev_by_params(struct ib_device *dev, u8 port,
>                                             u16 pkey, const union ib_gid *gid,
>                                             const struct sockaddr *addr);
>
> +/**
> + * ib_create_wq - Creates a WQ associated with the specified protection
> + * domain.
> + * @pd: The protection domain associated with the WQ.
> + * @wq_init_attr: A list of initial attributes required to create the
> + * WQ. If WQ creation succeeds, then the attributes are updated to
> + * the actual capabilities of the created WQ.
> + *
> + * wq_init_attr->max_wr and wq_init_attr->max_sge determine
> + * the requested size of the WQ, and are set to the actual values allocated
> + * on return.
> + * If ib_create_wq() succeeds, then max_wr and max_sge will always be
> + * at least as large as the requested values.
> + *
> + * Return Value
> + * ib_create_wq() returns a pointer to the created WQ, or an error pointer
> + * (ERR_PTR) if the request fails.
> + */
> +struct ib_wq *ib_create_wq(struct ib_pd *pd,
> +                          struct ib_wq_init_attr *init_attr);
> +
> +/**
> + * ib_destroy_wq - Destroys the specified WQ.
> + * @wq: The WQ to destroy.
> + */
> +int ib_destroy_wq(struct ib_wq *wq);
> +
> +/**
> + * ib_modify_wq - Modifies the specified WQ.
> + * @wq: The WQ to modify.
> + * @wq_attr: On input, specifies the WQ attributes to modify.
> + * @attr_mask: A bit-mask used to specify which attributes of the WQ
> + *   are being modified.
> + * On output, the current values of selected WQ attributes are returned.
> + */
> +int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *attr,
> +                enum ib_wq_attr_mask attr_mask);
> +
>  #endif /* IB_VERBS_H */
> --
> 1.7.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread
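
As a companion to the state machine described in the quoted commit
message, here is a minimal sketch of driving a freshly created WQ to
the ready state; the helper name is hypothetical:

/* Hypothetical helper: a newly created WQ starts in IB_WQS_RESET;
 * move it to IB_WQS_RDY so it can start receiving.  IB_WQ_CUR_STATE
 * additionally asserts the expected current state of the transition.
 */
static int my_wq_to_ready(struct ib_wq *wq)
{
	struct ib_wq_attr wq_attr = {};

	wq_attr.wq_state = IB_WQS_RDY;
	wq_attr.curr_wq_state = IB_WQS_RESET;

	return ib_modify_wq(wq, &wq_attr, IB_WQ_STATE | IB_WQ_CUR_STATE);
}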

* Re: [PATCH for-next 2/7] IB: Introduce Work Queue object and its verbs
       [not found]         ` <CAG53R5V7u7o+cVQnqhfjXdUjKJRVBiMx5CF3snWAWKG6ZbWMow-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-10-15 14:12           ` Yishai Hadas
       [not found]             ` <561FB463.8030602-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Yishai Hadas @ 2015-10-15 14:12 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Yishai Hadas, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	eli-VPRAkNaXOzVWk0Htik3J/w

On 10/15/2015 12:13 PM, Parav Pandit wrote:
> Just curious, why does a WQ need to be bound to a PD?
> Isn't the ucontext sufficient?
> Or, because a kcontext doesn't exist, does the PD serve that role?
> Or is this just a manifestation of how the hardware behaves?

PD is an attribute of a work queue (i.e. send/receive queue); it's used
by the hardware for security validation before scattering to a memory
region. For that, an external WQ object needs a PD, letting the
hardware make that validation.

> Since you mentioned that "QP can be configured to use "external" WQ
> object", it might be worthwhile to reuse a WQ across multiple QPs of
> different PDs.

Correct, an external WQ can be used across multiple QPs. In that case its
PD is used by the hardware for security validation when it accesses
the MR, and the QP's PD is not in use.

> The MR validation check has to happen between the MR and the actual
> QP, and might not require that check against the WQ.

No, in the case of an external WQ its PD is used and the QP's PD is not
in use.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH for-next 2/7] IB: Introduce Work Queue object and its verbs
       [not found]             ` <561FB463.8030602-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2015-10-15 15:17               ` Parav Pandit
       [not found]                 ` <CAG53R5U256Nt=3xd3sVUuf8J1icdK5YEiQFoQjR_HB_=5-cVww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Parav Pandit @ 2015-10-15 15:17 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Yishai Hadas, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Alex Vainman, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	eli-VPRAkNaXOzVWk0Htik3J/w

On Thu, Oct 15, 2015 at 7:42 PM, Yishai Hadas
<yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On 10/15/2015 12:13 PM, Parav Pandit wrote:
>>
>> Just curious, why does a WQ need to be bound to a PD?
>> Isn't the ucontext sufficient?
>> Or, because a kcontext doesn't exist, does the PD serve that role?
>> Or is this just a manifestation of how the hardware behaves?
>
>
> PD is an attribute of a work queue (i.e. send/receive queue); it's used by
> the hardware for security validation before scattering to a memory region.
> For that, an external WQ object needs a PD, letting the
> hardware make that validation.
>
>> Since you mentioned that "QP can be configured to use "external" WQ
>> object", it might be worthwhile to reuse a WQ across multiple QPs of
>> different PDs.
>
>
> Correct, an external WQ can be used across multiple QPs. In that case its PD is
> used by the hardware for security validation when it accesses the MR, and
> the QP's PD is not in use.
>
I think I get it; just confirming with the example below.

So I think below is possible.
WQ_A having PD=1.
QP_A having PD=2 bound to WQ_A.
QP_B having PD=3 bound to WQ_A.
MR_X having PD=2.
And checks are done between MR and QP.

In another use case,
no MR is used at all (only physical addresses are used).
WQ_A having PD=1.
QP_A having PD=2 bound to WQ_A.
QP_B having PD=3 bound to WQ_A.

WQ entries fail, as no MR is associated and the QPs are bound to a
different PD than the PD of WQ_A.
Because at the time the QP is bound to the WQ, it's unknown whether a
WQE will use an MR or not at run time.
Right?


>> The MR validation check has to happen between the MR and the actual
>> QP, and might not require that check against the WQ.
>
>
> No, in the case of an external WQ its PD is used and the QP's PD is not in
> use.
>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH for-next 2/7] IB: Introduce Work Queue object and its verbs
       [not found]                 ` <CAG53R5U256Nt=3xd3sVUuf8J1icdK5YEiQFoQjR_HB_=5-cVww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-10-15 16:25                   ` Yishai Hadas
       [not found]                     ` <561FD375.3000205-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Yishai Hadas @ 2015-10-15 16:25 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Yishai Hadas, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Alex Vainman, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	eli-VPRAkNaXOzVWk0Htik3J/w

On 10/15/2015 6:17 PM, Parav Pandit wrote:
> On Thu, Oct 15, 2015 at 7:42 PM, Yishai Hadas
> <yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>> On 10/15/2015 12:13 PM, Parav Pandit wrote:
>>>
>>> Just curious, why does a WQ need to be bound to a PD?
>>> Isn't the ucontext sufficient?
>>> Or, because a kcontext doesn't exist, does the PD serve that role?
>>> Or is this just a manifestation of how the hardware behaves?
>>
>>
>> PD is an attribute of a work queue (i.e. send/receive queue); it's used by
>> the hardware for security validation before scattering to a memory region.
>> For that, an external WQ object needs a PD, letting the
>> hardware make that validation.
>>
>>> Since you mentioned that "QP can be configured to use "external" WQ
>>> object", it might be worthwhile to reuse a WQ across multiple QPs of
>>> different PDs.
>>
>>
>> Correct, an external WQ can be used across multiple QPs. In that case its PD is
>> used by the hardware for security validation when it accesses the MR, and
>> the QP's PD is not in use.
>>
> I think I get it; just confirming with the example below.

> So I think below is possible.
> WQ_A having PD=1.
> QP_A having PD=2 bound to WQ_A.
> QP_B having PD=3 bound to WQ_A.
> MR_X having PD=2.
> And checks are done between MR and QP.
No, please follow the description above; in that case PD=1 of WQ_A is
used for the checks.

> In another use case,
> no MR is used at all (only physical addresses are used).
> WQ_A having PD=1.
> QP_A having PD=2 bound to WQ_A.
> QP_B having PD=3 bound to WQ_A.
>
> WQ entries fail, as no MR is associated and the QPs are bound to a
> different PD than the PD of WQ_A.
> Because at the time the QP is bound to the WQ, it's unknown whether a
> WQE will use an MR or not at run time.
> Right?

In case there is an MR for the physical addresses, it has a PD and the
WQ's PD is used; in case there is no MR, the PD is not applicable.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH for-next 2/7] IB: Introduce Work Queue object and its verbs
       [not found]                     ` <561FD375.3000205-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2015-10-15 16:49                       ` Parav Pandit
       [not found]                         ` <CAG53R5UmEhOh3n-+9qMoDthAtROdWQXstwqAFxzx8Urx1ZETnw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Parav Pandit @ 2015-10-15 16:49 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Yishai Hadas, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Alex Vainman, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	eli-VPRAkNaXOzVWk0Htik3J/w

On Thu, Oct 15, 2015 at 9:55 PM, Yishai Hadas
<yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On 10/15/2015 6:17 PM, Parav Pandit wrote:
>>
>> On Thu, Oct 15, 2015 at 7:42 PM, Yishai Hadas
>> <yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>>>
>>> On 10/15/2015 12:13 PM, Parav Pandit wrote:
>>>>
>>>>
>>>> Just curious, why does a WQ need to be bound to a PD?
>>>> Isn't the ucontext sufficient?
>>>> Or, because a kcontext doesn't exist, does the PD serve that role?
>>>> Or is this just a manifestation of how the hardware behaves?
>>>
>>>
>>>
>>> PD is an attribute of a work queue (i.e. send/receive queue); it's used
>>> by
>>> the hardware for security validation before scattering to a memory
>>> region.
>>> For that, an external WQ object needs a PD, letting the
>>> hardware make that validation.
>>>
>>>> Since you mentioned that "QP can be configured to use "external" WQ
>>>> object", it might be worthwhile to reuse a WQ across multiple QPs of
>>>> different PDs.
>>>
>>>
>>>
>>> Correct, an external WQ can be used across multiple QPs. In that case its PD
>>> is
>>> used by the hardware for security validation when it accesses the MR,
>>> and
>>> the QP's PD is not in use.
>>>
>> I think I get it; just confirming with the example below.
>
>
>> So I think below is possible.
>> WQ_A having PD=1.
>> QP_A having PD=2 bound to WQ_A.
>> QP_B having PD=3 bound to WQ_A.
>> MR_X having PD=2.
>> And checks are done between MR and QP.
>
> No, please follow the description above; in that case PD=1 of WQ_A is used
> for the checks.
>
This appears to me to be a manifestation of the hardware implementation
surfacing at the verbs layer.
There may be nothing wrong with it, but it's worth knowing how to
actually do verbs programming.

If a stateless WQ is being used by multiple QPs in a multiplexed
way, it should be able to multiplex between QPs of different PDs as
well.
Otherwise, for every PD being created, there will have to be one WQ
to service all the QPs belonging to that PD.

>> In another use case,
>> no MR is used at all (only physical addresses are used).
>> WQ_A having PD=1.
>> QP_A having PD=2 bound to WQ_A.
>> QP_B having PD=3 bound to WQ_A.
>>
>> WQ entries fail, as no MR is associated and the QPs are bound to a
>> different PD than the PD of WQ_A.
>> Because at the time the QP is bound to the WQ, it's unknown whether a
>> WQE will use an MR or not at run time.
>> Right?
>
>
> In case there is an MR for the physical addresses, it has a PD and the
> WQ's PD is used; in case there is no MR, the PD is not applicable.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH for-next 2/7] IB: Introduce Work Queue object and its verbs
       [not found]                         ` <CAG53R5UmEhOh3n-+9qMoDthAtROdWQXstwqAFxzx8Urx1ZETnw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-10-18 15:08                           ` Yishai Hadas
       [not found]                             ` <5623B5D3.2020206-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Yishai Hadas @ 2015-10-18 15:08 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Yishai Hadas, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Alex Vainman, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	eli-VPRAkNaXOzVWk0Htik3J/w

On 10/15/2015 7:49 PM, Parav Pandit wrote:

> If a stateless WQ is being used by multiple QPs in a multiplexed

The WQ is not stateless and always has its own PD.

> way, it should be able to multiplex between QPs of different PDs as
> well.
> Otherwise, for every PD being created, there will have to be one WQ
> to service all the QPs belonging to that PD.

As mentioned, the same WQ can serve multiple QPs; from a PD point of
view it behaves similarly to an SRQ, which may be associated with many
QPs with different PDs.

See IB SPEC, Release 1.3, o10-2.2.1:
"SRQ may be associated with the same PD as used by one or more of its 
associated QPs or a different PD."

As part of the coming V1, I will improve the commit message to better
clarify the WQ's PD behavior, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread
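
For reference, the SRQ analogy can be spelled out with the existing
verbs.  A hypothetical sketch, with an SRQ owned by one PD serving a
QP created on another (error handling omitted):

/* Hypothetical sketch: the SRQ is created on pd1 while the QP that
 * references it is created on pd2, matching the IB spec text quoted
 * above.
 */
static struct ib_qp *my_qp_on_foreign_srq(struct ib_pd *pd1,
					  struct ib_pd *pd2,
					  struct ib_cq *cq)
{
	struct ib_srq_init_attr srq_attr = {};
	struct ib_qp_init_attr qp_attr = {};
	struct ib_srq *srq;

	srq_attr.attr.max_wr = 128;
	srq_attr.attr.max_sge = 1;
	srq = ib_create_srq(pd1, &srq_attr);	/* SRQ owned by pd1 */

	qp_attr.send_cq = cq;
	qp_attr.recv_cq = cq;
	qp_attr.srq = srq;			/* receives land on the SRQ */
	qp_attr.qp_type = IB_QPT_RC;
	qp_attr.cap.max_send_wr = 32;
	qp_attr.cap.max_send_sge = 1;

	return ib_create_qp(pd2, &qp_attr);	/* QP owned by pd2 */
}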

* Re: [PATCH for-next 2/7] IB: Introduce Work Queue object and its verbs
       [not found]         ` <561F68D0.4050205-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2015-10-18 15:13           ` Yishai Hadas
  0 siblings, 0 replies; 18+ messages in thread
From: Yishai Hadas @ 2015-10-18 15:13 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, alexv-VPRAkNaXOzVWk0Htik3J/w,
	tzahio-VPRAkNaXOzVWk0Htik3J/w, eli-VPRAkNaXOzVWk0Htik3J/w

On 10/15/2015 11:50 AM, Sagi Grimberg wrote:
> Hi Yishai,
>
>> +/**
>> + * ib_create_wq - Creates a WQ associated with the specified protection
>> + * domain.
>> + * @pd: The protection domain associated with the WQ.
>> + * @wq_init_attr: A list of initial attributes required to create the
>> + * WQ. If WQ creation succeeds, then the attributes are updated to
>> + * the actual capabilities of the created WQ.
>> + *
>> + * wq_init_attr->max_wr and wq_init_attr->max_sge determine
>> + * the requested size of the WQ, and set to the actual values allocated
>> + * on return.
>> + * If ib_create_wq() succeeds, then max_wr and max_sge will always be
>> + * at least as large as the requested values.
>> + *
>> + * Return Value
>> + * ib_create_wq() returns a pointer to the created WQ, or an error
>> + * pointer (ERR_PTR) if the request fails.
>> + */
>> +struct ib_wq *ib_create_wq(struct ib_pd *pd,
>> +               struct ib_wq_init_attr *init_attr);
>> +
>
> We started shifting function documentations from the header declarations
> to the *.c implmentations. Would you mind moving it too?

OK, I will move it to the C file as part of V1.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH for-next 2/7] IB: Introduce Work Queue object and its verbs
       [not found]                             ` <5623B5D3.2020206-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2015-10-18 15:15                               ` Parav Pandit
       [not found]                                 ` <CAG53R5VFK20e-=1VANya3XQCWts0wyhttsdp+SVJ=MR_27smnQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Parav Pandit @ 2015-10-18 15:15 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Yishai Hadas, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Alex Vainman, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	eli-VPRAkNaXOzVWk0Htik3J/w

On Sun, Oct 18, 2015 at 8:38 PM, Yishai Hadas
<yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On 10/15/2015 7:49 PM, Parav Pandit wrote:
>
>> If a stateless WQ is being used by multiple QPs in a multiplexed
>
>
> The WQ is not stateless and always has its own PD.
>
>> way, it should be able to multiplex between QPs of different PDs as
>> well.
>> Otherwise, for every PD being created, there will have to be one WQ
>> to service all the QPs belonging to that PD.
>
>
> As mentioned, the same WQ can serve multiple QPs; from a PD point of view it
> behaves similarly to an SRQ, which may be associated with many QPs with
> different PDs.
>
> See IB SPEC, Release 1.3, o10-2.2.1:
> "SRQ may be associated with the same PD as used by one or more of its
> associated QPs or a different PD."
>
> As part of the coming V1, I will improve the commit message to better clarify
> the WQ's PD behavior, thanks.

Ok. Got it. Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH for-next 2/7] IB: Introduce Work Queue object and its verbs
       [not found]                                 ` <CAG53R5VFK20e-=1VANya3XQCWts0wyhttsdp+SVJ=MR_27smnQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-10-18 15:38                                   ` Devesh Sharma
  0 siblings, 0 replies; 18+ messages in thread
From: Devesh Sharma @ 2015-10-18 15:38 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Yishai Hadas, Yishai Hadas, Doug Ledford,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Alex Vainman,
	tzahio-VPRAkNaXOzVWk0Htik3J/w, Eli Cohen

Hi All,

Would it be a good idea to have a separate header for this feature?
Let's not keep appending to ib_verbs.h.

-Regards
Devesh

On Sun, Oct 18, 2015 at 8:45 PM, Parav Pandit <pandit.parav-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Sun, Oct 18, 2015 at 8:38 PM, Yishai Hadas
> <yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>> On 10/15/2015 7:49 PM, Parav Pandit wrote:
>>
>>> If a stateless WQ is being used by multiple QPs in a multiplexed
>>
>>
>> The WQ is not stateless and always has its own PD.
>>
>>> way, it should be able to multiplex between QPs of different PDs as
>>> well.
>>> Otherwise, for every PD being created, there will have to be one WQ
>>> to service all the QPs belonging to that PD.
>>
>>
>> As mentioned, the same WQ can serve multiple QPs; from a PD point of view it
>> behaves similarly to an SRQ, which may be associated with many QPs with
>> different PDs.
>>
>> See IB SPEC, Release 1.3, o10-2.2.1:
>> "SRQ may be associated with the same PD as used by one or more of its
>> associated QPs or a different PD."
>>
>> As part of the coming V1, I will improve the commit message to better clarify
>> the WQ's PD behavior, thanks.
>
> Ok. Got it. Thanks.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2015-10-18 15:38 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-10-15  7:16 [PATCH for-next 0/7] Verbs RSS Yishai Hadas
     [not found] ` <1444893410-13242-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-15  7:16   ` [PATCH for-next 1/7] net/mlx5_core: Expose transobj APIs from mlx5 core Yishai Hadas
2015-10-15  7:16   ` [PATCH for-next 2/7] IB: Introduce Work Queue object and its verbs Yishai Hadas
     [not found]     ` <1444893410-13242-3-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-15  8:50       ` Sagi Grimberg
     [not found]         ` <561F68D0.4050205-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-10-18 15:13           ` Yishai Hadas
2015-10-15  9:13       ` Parav Pandit
     [not found]         ` <CAG53R5V7u7o+cVQnqhfjXdUjKJRVBiMx5CF3snWAWKG6ZbWMow-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-15 14:12           ` Yishai Hadas
     [not found]             ` <561FB463.8030602-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-10-15 15:17               ` Parav Pandit
     [not found]                 ` <CAG53R5U256Nt=3xd3sVUuf8J1icdK5YEiQFoQjR_HB_=5-cVww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-15 16:25                   ` Yishai Hadas
     [not found]                     ` <561FD375.3000205-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-10-15 16:49                       ` Parav Pandit
     [not found]                         ` <CAG53R5UmEhOh3n-+9qMoDthAtROdWQXstwqAFxzx8Urx1ZETnw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-18 15:08                           ` Yishai Hadas
     [not found]                             ` <5623B5D3.2020206-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-10-18 15:15                               ` Parav Pandit
     [not found]                                 ` <CAG53R5VFK20e-=1VANya3XQCWts0wyhttsdp+SVJ=MR_27smnQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-18 15:38                                   ` Devesh Sharma
2015-10-15  7:16   ` [PATCH for-next 3/7] IB/uverbs: Add WQ support Yishai Hadas
2015-10-15  7:16   ` [PATCH for-next 4/7] IB/mlx5: Add receive Work Queue verbs Yishai Hadas
2015-10-15  7:16   ` [PATCH for-next 5/7] IB: Introduce Receive Work Queue indirection table Yishai Hadas
2015-10-15  7:16   ` [PATCH for-next 6/7] IB/uverbs: Introduce RWQ Indirection table Yishai Hadas
2015-10-15  7:16   ` [PATCH for-next 7/7] IB/mlx5: Add Receive Work Queue Indirection table operations Yishai Hadas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).