* [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4
@ 2015-08-20 14:34 Eran Ben Elisha
From: Eran Ben Elisha @ 2015-08-20 14:34 UTC (permalink / raw)
To: Doug Ledford
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
Christoph Lameter
Hi Doug,
This patch-set adds a new implementation of multicast loopback prevention for
the mlx4 driver. The current implementation is very limited, especially when
the link layer is Ethernet. The new implementation is based on a HW feature
that drops incoming multicast packets if the sender QP's counter index is equal
to the receiver's counter index.
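As a rough illustration of the rule described above (a userspace model only, not driver code), the drop decision can be sketched as follows. The struct and function names are hypothetical, and the sketch assumes the check is enabled per receiving QP:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a QP for illustration; not an mlx4 structure. */
struct model_qp {
	uint8_t counter_index;   /* counter attached to the QP */
	bool    src_check_mc_lb; /* receiver enforces the multicast source check */
};

/*
 * Returns true if a looped-back multicast packet should be dropped:
 * the receiver has the check enabled and both QPs share a counter index.
 */
bool model_drop_mc_loopback(const struct model_qp *sender,
			    const struct model_qp *receiver)
{
	return receiver->src_check_mc_lb &&
	       sender->counter_index == receiver->counter_index;
}
```

In this model, QPs that should not see each other's multicast traffic are given the same counter index, while QPs that need loopback either use a different counter or leave the check disabled.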
Patch 0001 extends ib_uverbs_create_qp so that the multicast loopback flag can
be received in the create flags.
Patch 0002 allows setting IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK in the
create_flags field, both from uverbs and from ib_create_qp.
Patch 0003 adds infrastructure for the counters' loopback prevention in
mlx4_core.
Patch 0004 modifies mlx4_en QPs to use the new loopback prevention mode.
Patches 0005-0007 implement this feature for the mlx4_ib driver.
Changes from v0:
Move loopback assignment outside the for loop according to Yuval's comment
rebase over to-be-rebased/for-4.3
Thanks,
Eran.
Eran Ben Elisha (5):
IB/core: Extend ib_uverbs_create_qp
IB/core: Allow setting create flags in QP init attribute
IB/mlx4: Add IB counters table
IB/mlx4: Add counter based implementation for QP multicast loopback
block
IB/mlx4: Add support for blocking multicast loopback QP creation user
flag
Maor Gottlieb (2):
net/mlx4_core: Add support for filtering multicast loopback
net/mlx4_en: Implement mcast loopback prevention for ETH qps
drivers/infiniband/core/uverbs.h | 1 +
drivers/infiniband/core/uverbs_cmd.c | 259 +++++++++++++++------
drivers/infiniband/core/uverbs_main.c | 1 +
drivers/infiniband/hw/mlx4/mad.c | 25 +-
drivers/infiniband/hw/mlx4/main.c | 66 ++++--
drivers/infiniband/hw/mlx4/mlx4_ib.h | 10 +-
drivers/infiniband/hw/mlx4/qp.c | 88 ++++++-
drivers/net/ethernet/mellanox/mlx4/en_main.c | 22 ++
drivers/net/ethernet/mellanox/mlx4/en_resources.c | 25 ++
drivers/net/ethernet/mellanox/mlx4/fw.c | 6 +
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 3 +-
drivers/net/ethernet/mellanox/mlx4/qp.c | 19 +-
.../net/ethernet/mellanox/mlx4/resource_tracker.c | 30 ++-
include/linux/mlx4/device.h | 2 +
include/linux/mlx4/qp.h | 24 +-
include/uapi/rdma/ib_user_verbs.h | 26 +++
16 files changed, 498 insertions(+), 109 deletions(-)
--
1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* [PATCH v1 for-next 1/7] IB/core: Extend ib_uverbs_create_qp
From: Eran Ben Elisha @ 2015-08-20 14:34 UTC (permalink / raw)
To: Doug Ledford
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
Christoph Lameter
ib_uverbs_ex_create_qp follows the extension verbs
mechanism. New features (for example, the QP creation flags
field, which is added in a downstream patch) can be used
via user-space libraries without breaking the ABI.
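A minimal sketch of how an extensible command can stay ABI compatible: a trailing field is read only if the caller's buffer is large enough to contain it. The struct and function names below are hypothetical; only the offsetof-plus-sizeof pattern mirrors the check used in create_qp() in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extensible command; the layout is for illustration only. */
struct demo_cmd {
	uint32_t qp_type;
	uint32_t comp_mask;
	uint32_t create_flags; /* field added in a later ABI revision */
};

/* True if a command of cmd_sz bytes fully contains create_flags. */
int demo_has_create_flags(size_t cmd_sz)
{
	return cmd_sz >= offsetof(struct demo_cmd, create_flags) +
			 sizeof(((struct demo_cmd *)0)->create_flags);
}

/* Old callers that pass a shorter command get the default value 0. */
uint32_t demo_get_create_flags(const struct demo_cmd *cmd, size_t cmd_sz)
{
	return demo_has_create_flags(cmd_sz) ? cmd->create_flags : 0;
}
```

With this pattern an older library that sends only the first two fields keeps working, and a newer one can pass create_flags without a new command number.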
Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/core/uverbs.h | 1 +
drivers/infiniband/core/uverbs_cmd.c | 259 +++++++++++++++++++++++++---------
drivers/infiniband/core/uverbs_main.c | 1 +
include/uapi/rdma/ib_user_verbs.h | 26 ++++
4 files changed, 222 insertions(+), 65 deletions(-)
diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 3863d33..94bbd8c 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -272,5 +272,6 @@ IB_UVERBS_DECLARE_EX_CMD(create_flow);
IB_UVERBS_DECLARE_EX_CMD(destroy_flow);
IB_UVERBS_DECLARE_EX_CMD(query_device);
IB_UVERBS_DECLARE_EX_CMD(create_cq);
+IB_UVERBS_DECLARE_EX_CMD(create_qp);
#endif /* UVERBS_H */
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index a15318a..3cc2261 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -1741,66 +1741,65 @@ ssize_t ib_uverbs_destroy_cq(struct ib_uverbs_file *file,
return in_len;
}
-ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file,
- struct ib_device *ib_dev,
- const char __user *buf, int in_len,
- int out_len)
-{
- struct ib_uverbs_create_qp cmd;
- struct ib_uverbs_create_qp_resp resp;
- struct ib_udata udata;
- struct ib_uqp_object *obj;
- struct ib_device *device;
- struct ib_pd *pd = NULL;
- struct ib_xrcd *xrcd = NULL;
- struct ib_uobject *uninitialized_var(xrcd_uobj);
- struct ib_cq *scq = NULL, *rcq = NULL;
- struct ib_srq *srq = NULL;
- struct ib_qp *qp;
- struct ib_qp_init_attr attr;
- int ret;
-
- if (out_len < sizeof resp)
- return -ENOSPC;
-
- if (copy_from_user(&cmd, buf, sizeof cmd))
- return -EFAULT;
+static int create_qp(struct ib_uverbs_file *file,
+ struct ib_udata *ucore,
+ struct ib_udata *uhw,
+ struct ib_uverbs_ex_create_qp *cmd,
+ size_t cmd_sz,
+ int (*cb)(struct ib_uverbs_file *file,
+ struct ib_uverbs_ex_create_qp_resp *resp,
+ struct ib_udata *udata),
+ void *context)
+{
+ struct ib_uqp_object *obj;
+ struct ib_device *device;
+ struct ib_pd *pd = NULL;
+ struct ib_xrcd *xrcd = NULL;
+ struct ib_uobject *uninitialized_var(xrcd_uobj);
+ struct ib_cq *scq = NULL, *rcq = NULL;
+ struct ib_srq *srq = NULL;
+ struct ib_qp *qp;
+ char *buf;
+ struct ib_qp_init_attr attr;
+ struct ib_uverbs_ex_create_qp_resp resp;
+ int ret;
- if (cmd.qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW))
+ if (cmd->qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW))
return -EPERM;
- INIT_UDATA(&udata, buf + sizeof cmd,
- (unsigned long) cmd.response + sizeof resp,
- in_len - sizeof cmd, out_len - sizeof resp);
-
obj = kzalloc(sizeof *obj, GFP_KERNEL);
if (!obj)
return -ENOMEM;
- init_uobj(&obj->uevent.uobject, cmd.user_handle, file->ucontext, &qp_lock_class);
+ init_uobj(&obj->uevent.uobject, cmd->user_handle, file->ucontext,
+ &qp_lock_class);
down_write(&obj->uevent.uobject.mutex);
- if (cmd.qp_type == IB_QPT_XRC_TGT) {
- xrcd = idr_read_xrcd(cmd.pd_handle, file->ucontext, &xrcd_uobj);
+ if (cmd->qp_type == IB_QPT_XRC_TGT) {
+ xrcd = idr_read_xrcd(cmd->pd_handle, file->ucontext,
+ &xrcd_uobj);
if (!xrcd) {
ret = -EINVAL;
goto err_put;
}
device = xrcd->device;
} else {
- if (cmd.qp_type == IB_QPT_XRC_INI) {
- cmd.max_recv_wr = cmd.max_recv_sge = 0;
+ if (cmd->qp_type == IB_QPT_XRC_INI) {
+ cmd->max_recv_wr = 0;
+ cmd->max_recv_sge = 0;
} else {
- if (cmd.is_srq) {
- srq = idr_read_srq(cmd.srq_handle, file->ucontext);
+ if (cmd->is_srq) {
+ srq = idr_read_srq(cmd->srq_handle,
+ file->ucontext);
if (!srq || srq->srq_type != IB_SRQT_BASIC) {
ret = -EINVAL;
goto err_put;
}
}
- if (cmd.recv_cq_handle != cmd.send_cq_handle) {
- rcq = idr_read_cq(cmd.recv_cq_handle, file->ucontext, 0);
+ if (cmd->recv_cq_handle != cmd->send_cq_handle) {
+ rcq = idr_read_cq(cmd->recv_cq_handle,
+ file->ucontext, 0);
if (!rcq) {
ret = -EINVAL;
goto err_put;
@@ -1808,9 +1807,9 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file,
}
}
- scq = idr_read_cq(cmd.send_cq_handle, file->ucontext, !!rcq);
+ scq = idr_read_cq(cmd->send_cq_handle, file->ucontext, !!rcq);
rcq = rcq ?: scq;
- pd = idr_read_pd(cmd.pd_handle, file->ucontext);
+ pd = idr_read_pd(cmd->pd_handle, file->ucontext);
if (!pd || !scq) {
ret = -EINVAL;
goto err_put;
@@ -1825,31 +1824,54 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file,
attr.recv_cq = rcq;
attr.srq = srq;
attr.xrcd = xrcd;
- attr.sq_sig_type = cmd.sq_sig_all ? IB_SIGNAL_ALL_WR : IB_SIGNAL_REQ_WR;
- attr.qp_type = cmd.qp_type;
+ attr.sq_sig_type = cmd->sq_sig_all ? IB_SIGNAL_ALL_WR :
+ IB_SIGNAL_REQ_WR;
+ attr.qp_type = cmd->qp_type;
attr.create_flags = 0;
- attr.cap.max_send_wr = cmd.max_send_wr;
- attr.cap.max_recv_wr = cmd.max_recv_wr;
- attr.cap.max_send_sge = cmd.max_send_sge;
- attr.cap.max_recv_sge = cmd.max_recv_sge;
- attr.cap.max_inline_data = cmd.max_inline_data;
+ attr.cap.max_send_wr = cmd->max_send_wr;
+ attr.cap.max_recv_wr = cmd->max_recv_wr;
+ attr.cap.max_send_sge = cmd->max_send_sge;
+ attr.cap.max_recv_sge = cmd->max_recv_sge;
+ attr.cap.max_inline_data = cmd->max_inline_data;
obj->uevent.events_reported = 0;
INIT_LIST_HEAD(&obj->uevent.event_list);
INIT_LIST_HEAD(&obj->mcast_list);
- if (cmd.qp_type == IB_QPT_XRC_TGT)
+ if (cmd_sz >= offsetof(typeof(*cmd), create_flags) +
+ sizeof(cmd->create_flags))
+ attr.create_flags = cmd->create_flags;
+
+ if (attr.create_flags) {
+ ret = -EINVAL;
+ goto err_put;
+ }
+
+ if (cmd->comp_mask) {
+ ret = -EINVAL;
+ goto err_put;
+ }
+
+ buf = (void *)cmd + sizeof(*cmd);
+ if (cmd_sz > sizeof(*cmd))
+ if (!(buf[0] == 0 && !memcmp(buf, buf + 1,
+ cmd_sz - sizeof(*cmd) - 1))) {
+ ret = -EINVAL;
+ goto err_put;
+ }
+
+ if (cmd->qp_type == IB_QPT_XRC_TGT)
qp = ib_create_qp(pd, &attr);
else
- qp = device->create_qp(pd, &attr, &udata);
+ qp = device->create_qp(pd, &attr, uhw);
if (IS_ERR(qp)) {
ret = PTR_ERR(qp);
goto err_put;
}
- if (cmd.qp_type != IB_QPT_XRC_TGT) {
+ if (cmd->qp_type != IB_QPT_XRC_TGT) {
qp->real_qp = qp;
qp->device = device;
qp->pd = pd;
@@ -1875,19 +1897,20 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file,
goto err_destroy;
memset(&resp, 0, sizeof resp);
- resp.qpn = qp->qp_num;
- resp.qp_handle = obj->uevent.uobject.id;
- resp.max_recv_sge = attr.cap.max_recv_sge;
- resp.max_send_sge = attr.cap.max_send_sge;
- resp.max_recv_wr = attr.cap.max_recv_wr;
- resp.max_send_wr = attr.cap.max_send_wr;
- resp.max_inline_data = attr.cap.max_inline_data;
+ resp.base.qpn = qp->qp_num;
+ resp.base.qp_handle = obj->uevent.uobject.id;
+ resp.base.max_recv_sge = attr.cap.max_recv_sge;
+ resp.base.max_send_sge = attr.cap.max_send_sge;
+ resp.base.max_recv_wr = attr.cap.max_recv_wr;
+ resp.base.max_send_wr = attr.cap.max_send_wr;
+ resp.base.max_inline_data = attr.cap.max_inline_data;
- if (copy_to_user((void __user *) (unsigned long) cmd.response,
- &resp, sizeof resp)) {
- ret = -EFAULT;
- goto err_copy;
- }
+ resp.response_length = offsetof(typeof(resp), response_length) +
+ sizeof(resp.response_length);
+
+ ret = cb(file, &resp, ucore);
+ if (ret)
+ goto err_cb;
if (xrcd) {
obj->uxrcd = container_of(xrcd_uobj, struct ib_uxrcd_object,
@@ -1913,9 +1936,8 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file,
up_write(&obj->uevent.uobject.mutex);
- return in_len;
-
-err_copy:
+ return 0;
+err_cb:
idr_remove_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject);
err_destroy:
@@ -1937,6 +1959,113 @@ err_put:
return ret;
}
+static int ib_uverbs_create_qp_cb(struct ib_uverbs_file *file,
+ struct ib_uverbs_ex_create_qp_resp *resp,
+ struct ib_udata *ucore)
+{
+ if (ib_copy_to_udata(ucore, &resp->base, sizeof(resp->base)))
+ return -EFAULT;
+
+ return 0;
+}
+
+ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file,
+ struct ib_device *ib_dev,
+ const char __user *buf, int in_len,
+ int out_len)
+{
+ struct ib_uverbs_create_qp cmd;
+ struct ib_uverbs_ex_create_qp cmd_ex;
+ struct ib_udata ucore;
+ struct ib_udata uhw;
+ ssize_t resp_size = sizeof(struct ib_uverbs_create_qp_resp);
+ int err;
+
+ if (out_len < resp_size)
+ return -ENOSPC;
+
+ if (copy_from_user(&cmd, buf, sizeof(cmd)))
+ return -EFAULT;
+
+ INIT_UDATA(&ucore, buf, (unsigned long)cmd.response, sizeof(cmd),
+ resp_size);
+ INIT_UDATA(&uhw, buf + sizeof(cmd),
+ (unsigned long)cmd.response + resp_size,
+ in_len - sizeof(cmd), out_len - resp_size);
+
+ memset(&cmd_ex, 0, sizeof(cmd_ex));
+ cmd_ex.user_handle = cmd.user_handle;
+ cmd_ex.pd_handle = cmd.pd_handle;
+ cmd_ex.send_cq_handle = cmd.send_cq_handle;
+ cmd_ex.recv_cq_handle = cmd.recv_cq_handle;
+ cmd_ex.srq_handle = cmd.srq_handle;
+ cmd_ex.max_send_wr = cmd.max_send_wr;
+ cmd_ex.max_recv_wr = cmd.max_recv_wr;
+ cmd_ex.max_send_sge = cmd.max_send_sge;
+ cmd_ex.max_recv_sge = cmd.max_recv_sge;
+ cmd_ex.max_inline_data = cmd.max_inline_data;
+ cmd_ex.sq_sig_all = cmd.sq_sig_all;
+ cmd_ex.qp_type = cmd.qp_type;
+ cmd_ex.is_srq = cmd.is_srq;
+
+ err = create_qp(file, &ucore, &uhw, &cmd_ex,
+ offsetof(typeof(cmd_ex), is_srq) +
+ sizeof(cmd.is_srq), ib_uverbs_create_qp_cb,
+ NULL);
+
+ if (err)
+ return err;
+
+ return in_len;
+}
+
+static int ib_uverbs_ex_create_qp_cb(struct ib_uverbs_file *file,
+ struct ib_uverbs_ex_create_qp_resp *resp,
+ struct ib_udata *ucore)
+{
+ if (ib_copy_to_udata(ucore, resp, resp->response_length))
+ return -EFAULT;
+
+ return 0;
+}
+
+int ib_uverbs_ex_create_qp(struct ib_uverbs_file *file,
+ struct ib_device *ib_dev,
+ struct ib_udata *ucore,
+ struct ib_udata *uhw)
+{
+ struct ib_uverbs_ex_create_qp_resp resp;
+ struct ib_uverbs_ex_create_qp cmd = {0};
+ int err;
+
+ if (ucore->inlen < (offsetof(typeof(cmd), comp_mask) +
+ sizeof(cmd.comp_mask)))
+ return -EINVAL;
+
+ err = ib_copy_from_udata(&cmd, ucore, min(sizeof(cmd), ucore->inlen));
+ if (err)
+ return err;
+
+ if (cmd.comp_mask)
+ return -EINVAL;
+
+ if (cmd.reserved)
+ return -EINVAL;
+
+ if (ucore->outlen < (offsetof(typeof(resp), response_length) +
+ sizeof(resp.response_length)))
+ return -ENOSPC;
+
+ err = create_qp(file, ucore, uhw, &cmd,
+ min(ucore->inlen, sizeof(cmd)),
+ ib_uverbs_ex_create_qp_cb, NULL);
+
+ if (err)
+ return err;
+
+ return 0;
+}
+
ssize_t ib_uverbs_open_qp(struct ib_uverbs_file *file,
struct ib_device *ib_dev,
const char __user *buf, int in_len, int out_len)
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index c29a660..e3ef288 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -127,6 +127,7 @@ static int (*uverbs_ex_cmd_table[])(struct ib_uverbs_file *file,
[IB_USER_VERBS_EX_CMD_DESTROY_FLOW] = ib_uverbs_ex_destroy_flow,
[IB_USER_VERBS_EX_CMD_QUERY_DEVICE] = ib_uverbs_ex_query_device,
[IB_USER_VERBS_EX_CMD_CREATE_CQ] = ib_uverbs_ex_create_cq,
+ [IB_USER_VERBS_EX_CMD_CREATE_QP] = ib_uverbs_ex_create_qp,
};
static void ib_uverbs_add_one(struct ib_device *device);
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 978841e..8126c14 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -92,6 +92,7 @@ enum {
enum {
IB_USER_VERBS_EX_CMD_QUERY_DEVICE = IB_USER_VERBS_CMD_QUERY_DEVICE,
IB_USER_VERBS_EX_CMD_CREATE_CQ = IB_USER_VERBS_CMD_CREATE_CQ,
+ IB_USER_VERBS_EX_CMD_CREATE_QP = IB_USER_VERBS_CMD_CREATE_QP,
IB_USER_VERBS_EX_CMD_CREATE_FLOW = IB_USER_VERBS_CMD_THRESHOLD,
IB_USER_VERBS_EX_CMD_DESTROY_FLOW,
};
@@ -516,6 +517,25 @@ struct ib_uverbs_create_qp {
__u64 driver_data[0];
};
+struct ib_uverbs_ex_create_qp {
+ __u64 user_handle;
+ __u32 pd_handle;
+ __u32 send_cq_handle;
+ __u32 recv_cq_handle;
+ __u32 srq_handle;
+ __u32 max_send_wr;
+ __u32 max_recv_wr;
+ __u32 max_send_sge;
+ __u32 max_recv_sge;
+ __u32 max_inline_data;
+ __u8 sq_sig_all;
+ __u8 qp_type;
+ __u8 is_srq;
+ __u8 reserved;
+ __u32 comp_mask;
+ __u32 create_flags;
+};
+
struct ib_uverbs_open_qp {
__u64 response;
__u64 user_handle;
@@ -538,6 +558,12 @@ struct ib_uverbs_create_qp_resp {
__u32 reserved;
};
+struct ib_uverbs_ex_create_qp_resp {
+ struct ib_uverbs_create_qp_resp base;
+ __u32 comp_mask;
+ __u32 response_length;
+};
+
/*
* This struct needs to remain a multiple of 8 bytes to keep the
* alignment of the modify QP parameters.
--
1.8.3.1
* [PATCH v1 for-next 2/7] IB/core: Allow setting create flags in QP init attribute
From: Eran Ben Elisha @ 2015-08-20 14:34 UTC (permalink / raw)
To: Doug Ledford
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
Christoph Lameter
Allow setting IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK in create_flags in
ib_uverbs_ex_create_qp.
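The validation this patch relaxes can be sketched in isolation as below: any bit outside the allowed set is rejected, so unknown flags still fail with -EINVAL. The macro name and its bit value are placeholders for this sketch, not the kernel definition.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Placeholder bit for the sketch; the real value lives in the kernel headers. */
#define DEMO_QP_CREATE_BLOCK_MC_LB (1u << 1)

/* Accept only flags from the allowed set; reject anything else. */
int demo_validate_create_flags(uint32_t flags)
{
	if (flags & ~DEMO_QP_CREATE_BLOCK_MC_LB)
		return -EINVAL;
	return 0;
}
```

Masking with the complement of the allowed set means that each flag added later only needs to be ORed into the mask.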
Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/core/uverbs_cmd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 3cc2261..fbde0c6 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -1843,7 +1843,7 @@ static int create_qp(struct ib_uverbs_file *file,
sizeof(cmd->create_flags))
attr.create_flags = cmd->create_flags;
- if (attr.create_flags) {
+ if (attr.create_flags & ~IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK) {
ret = -EINVAL;
goto err_put;
}
--
1.8.3.1
* [PATCH v1 for-next 3/7] net/mlx4_core: Add support for filtering multicast loopback
From: Eran Ben Elisha @ 2015-08-20 14:34 UTC (permalink / raw)
To: Doug Ledford
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
Christoph Lameter, Maor Gottlieb
From: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Update device capabilities regarding HW support for filtering multicast
loopback.
Add the MLX4_UPDATE_QP_ETH_SRC_CHECK_MC_LB attribute to mlx4_update_qp to
enable changing the QP context to support filtering incoming multicast
loopback traffic according to the sender's counter index.
Set the corresponding bits in the QP context to force the loopback source
checks if the attribute is given and the HW supports it.
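The capability gating described above can be sketched as a small userspace model. The names are hypothetical; the bit positions are copied from the definitions in this patch for illustration only, and ENOTSUP stands in for the kernel's ENOTSUPP.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Bit positions mirror the patch; the DEMO_ names are hypothetical. */
#define DEMO_CAP_UPDATE_QP_SRC_CHECK_LB (1ull << 31)
#define DEMO_PATH_MASK_SRC_CHECK_MC_LB  (1ull << (19 + 32))
#define DEMO_FL_SRC_CHECK_MC_LB         (1u << 1)

/* Hypothetical stand-in for the UPDATE_QP mailbox contents. */
struct demo_update {
	uint64_t path_mask; /* which path fields the command modifies */
	uint8_t  fl;        /* new value of the path flags */
};

/*
 * Request a change of the multicast source check. The mask bit is always
 * set so the field is written; the flag bit is set only when enabling.
 */
int demo_set_src_check(uint64_t caps, int enable, struct demo_update *upd)
{
	if (!(caps & DEMO_CAP_UPDATE_QP_SRC_CHECK_LB))
		return -ENOTSUP; /* HW cannot change the check via UPDATE_QP */

	upd->path_mask |= DEMO_PATH_MASK_SRC_CHECK_MC_LB;
	if (enable)
		upd->fl |= DEMO_FL_SRC_CHECK_MC_LB;
	return 0;
}
```

Setting the mask bit while leaving the flag bit clear is how the same command disables the check.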
Signed-off-by: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/net/ethernet/mellanox/mlx4/fw.c | 6 +++++
drivers/net/ethernet/mellanox/mlx4/qp.c | 19 +++++++++++++-
.../net/ethernet/mellanox/mlx4/resource_tracker.c | 30 +++++++++++++++++-----
include/linux/mlx4/device.h | 2 ++
include/linux/mlx4/qp.h | 24 +++++++++++++----
5 files changed, 68 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index e30bf57..5218e1e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -154,6 +154,8 @@ static void dump_dev_cap_flags2(struct mlx4_dev *dev, u64 flags)
[26] = "Port ETS Scheduler support",
[27] = "Port beacon support",
[28] = "RX-ALL support",
+ [31] = "Modifying loopback source checks using UPDATE_QP support",
+ [32] = "Loopback source checks support",
};
int i;
@@ -946,6 +948,10 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
MLX4_GET(field32, outbox, QUERY_DEV_CAP_EXT_2_FLAGS_OFFSET);
if (field32 & (1 << 16))
dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_UPDATE_QP;
+ if (field32 & (1 << 18))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB;
+ if (field32 & (1 << 19))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_LB_SRC_CHK;
if (field32 & (1 << 26))
dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_VLAN_CONTROL;
if (field32 & (1 << 20))
diff --git a/drivers/net/ethernet/mellanox/mlx4/qp.c b/drivers/net/ethernet/mellanox/mlx4/qp.c
index 2026863..b162495 100644
--- a/drivers/net/ethernet/mellanox/mlx4/qp.c
+++ b/drivers/net/ethernet/mellanox/mlx4/qp.c
@@ -436,6 +436,23 @@ int mlx4_update_qp(struct mlx4_dev *dev, u32 qpn,
cmd->qp_context.pri_path.grh_mylmc = params->smac_index;
}
+ if (attr & MLX4_UPDATE_QP_ETH_SRC_CHECK_MC_LB) {
+ if (!(dev->caps.flags2
+ & MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB)) {
+ mlx4_warn(dev,
+ "Trying to set src check LB, but it isn't supported\n");
+ err = -ENOTSUPP;
+ goto out;
+ }
+ pri_addr_path_mask |=
+ 1ULL << MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_MC_LB;
+ if (params->flags &
+ MLX4_UPDATE_QP_PARAMS_FLAGS_ETH_CHECK_MC_LB) {
+ cmd->qp_context.pri_path.fl |=
+ MLX4_FL_ETH_SRC_CHECK_MC_LB;
+ }
+ }
+
if (attr & MLX4_UPDATE_QP_VSD) {
qp_mask |= 1ULL << MLX4_UPD_QP_MASK_VSD;
if (params->flags & MLX4_UPDATE_QP_PARAMS_FLAGS_VSD_ENABLE)
@@ -458,7 +475,7 @@ int mlx4_update_qp(struct mlx4_dev *dev, u32 qpn,
err = mlx4_cmd(dev, mailbox->dma, qpn & 0xffffff, 0,
MLX4_CMD_UPDATE_QP, MLX4_CMD_TIME_CLASS_A,
MLX4_CMD_NATIVE);
-
+out:
mlx4_free_cmd_mailbox(dev, mailbox);
return err;
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
index 731423c..502f335 100644
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -770,9 +770,12 @@ static int update_vport_qp_param(struct mlx4_dev *dev,
}
}
+ /* preserve IF_COUNTER flag */
+ qpc->pri_path.vlan_control &=
+ MLX4_CTRL_ETH_SRC_CHECK_IF_COUNTER;
if (vp_oper->state.link_state == IFLA_VF_LINK_STATE_DISABLE &&
dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_UPDATE_QP) {
- qpc->pri_path.vlan_control =
+ qpc->pri_path.vlan_control |=
MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
MLX4_VLAN_CTRL_ETH_TX_BLOCK_PRIO_TAGGED |
MLX4_VLAN_CTRL_ETH_TX_BLOCK_UNTAGGED |
@@ -780,12 +783,12 @@ static int update_vport_qp_param(struct mlx4_dev *dev,
MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED |
MLX4_VLAN_CTRL_ETH_RX_BLOCK_TAGGED;
} else if (0 != vp_oper->state.default_vlan) {
- qpc->pri_path.vlan_control =
+ qpc->pri_path.vlan_control |=
MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
MLX4_VLAN_CTRL_ETH_RX_BLOCK_PRIO_TAGGED |
MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED;
} else { /* priority tagged */
- qpc->pri_path.vlan_control =
+ qpc->pri_path.vlan_control |=
MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
MLX4_VLAN_CTRL_ETH_RX_BLOCK_TAGGED;
}
@@ -3762,9 +3765,6 @@ int mlx4_INIT2RTR_QP_wrapper(struct mlx4_dev *dev, int slave,
update_gid(dev, inbox, (u8)slave);
adjust_proxy_tun_qkey(dev, vhcr, qpc);
orig_sched_queue = qpc->pri_path.sched_queue;
- err = update_vport_qp_param(dev, inbox, slave, qpn);
- if (err)
- return err;
err = get_res(dev, slave, qpn, RES_QP, &qp);
if (err)
@@ -3774,6 +3774,10 @@ int mlx4_INIT2RTR_QP_wrapper(struct mlx4_dev *dev, int slave,
goto out;
}
+ err = update_vport_qp_param(dev, inbox, slave, qpn);
+ if (err)
+ goto out;
+
err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
out:
/* if no error, save sched queue value passed in by VF. This is
@@ -4208,7 +4212,9 @@ static int add_eth_header(struct mlx4_dev *dev, int slave,
}
-#define MLX4_UPD_QP_PATH_MASK_SUPPORTED (1ULL << MLX4_UPD_QP_PATH_MASK_MAC_INDEX)
+#define MLX4_UPD_QP_PATH_MASK_SUPPORTED ( \
+ 1ULL << MLX4_UPD_QP_PATH_MASK_MAC_INDEX |\
+ 1ULL << MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_MC_LB)
int mlx4_UPDATE_QP_wrapper(struct mlx4_dev *dev, int slave,
struct mlx4_vhcr *vhcr,
struct mlx4_cmd_mailbox *inbox,
@@ -4231,6 +4237,16 @@ int mlx4_UPDATE_QP_wrapper(struct mlx4_dev *dev, int slave,
(pri_addr_path_mask & ~MLX4_UPD_QP_PATH_MASK_SUPPORTED))
return -EPERM;
+ if ((pri_addr_path_mask &
+ (1ULL << MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_MC_LB)) &&
+ !(dev->caps.flags2 &
+ MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB)) {
+ mlx4_warn(dev,
+ "Src check LB for slave %d isn't supported\n",
+ slave);
+ return -ENOTSUPP;
+ }
+
/* Just change the smac for the QP */
err = get_res(dev, slave, qpn, RES_QP, &rqp);
if (err) {
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 5c7687a..3b9602f 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -212,6 +212,8 @@ enum {
MLX4_DEV_CAP_FLAG2_ETS_CFG = 1LL << 26,
MLX4_DEV_CAP_FLAG2_PORT_BEACON = 1LL << 27,
MLX4_DEV_CAP_FLAG2_IGNORE_FCS = 1LL << 28,
+ MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB = 1ULL << 31,
+ MLX4_DEV_CAP_FLAG2_LB_SRC_CHK = 1ULL << 32,
};
enum {
diff --git a/include/linux/mlx4/qp.h b/include/linux/mlx4/qp.h
index 6fed539..62175c2 100644
--- a/include/linux/mlx4/qp.h
+++ b/include/linux/mlx4/qp.h
@@ -135,7 +135,10 @@ struct mlx4_rss_context {
struct mlx4_qp_path {
u8 fl;
- u8 vlan_control;
+ union {
+ u8 vlan_control;
+ u8 control;
+ };
u8 disable_pkey_check;
u8 pkey_index;
u8 counter_index;
@@ -156,9 +159,16 @@ struct mlx4_qp_path {
};
enum { /* fl */
- MLX4_FL_CV = 1 << 6,
- MLX4_FL_ETH_HIDE_CQE_VLAN = 1 << 2
+ MLX4_FL_CV = 1 << 6,
+ MLX4_FL_ETH_HIDE_CQE_VLAN = 1 << 2,
+ MLX4_FL_ETH_SRC_CHECK_MC_LB = 1 << 1,
+ MLX4_FL_ETH_SRC_CHECK_UC_LB = 1 << 0,
};
+
+enum { /* control */
+ MLX4_CTRL_ETH_SRC_CHECK_IF_COUNTER = 1 << 7,
+};
+
enum { /* vlan_control */
MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED = 1 << 6,
MLX4_VLAN_CTRL_ETH_TX_BLOCK_PRIO_TAGGED = 1 << 5, /* 802.1p priority tag */
@@ -254,6 +264,8 @@ enum {
MLX4_UPD_QP_PATH_MASK_SCHED_QUEUE = 14 + 32,
MLX4_UPD_QP_PATH_MASK_IF_COUNTER_INDEX = 15 + 32,
MLX4_UPD_QP_PATH_MASK_FVL_RX = 16 + 32,
+ MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_UC_LB = 18 + 32,
+ MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_MC_LB = 19 + 32,
};
enum { /* param3 */
@@ -435,11 +447,13 @@ enum mlx4_update_qp_attr {
MLX4_UPDATE_QP_VSD = 1 << 1,
MLX4_UPDATE_QP_RATE_LIMIT = 1 << 2,
MLX4_UPDATE_QP_QOS_VPORT = 1 << 3,
- MLX4_UPDATE_QP_SUPPORTED_ATTRS = (1 << 4) - 1
+ MLX4_UPDATE_QP_ETH_SRC_CHECK_MC_LB = 1 << 4,
+ MLX4_UPDATE_QP_SUPPORTED_ATTRS = (1 << 5) - 1
};
enum mlx4_update_qp_params_flags {
- MLX4_UPDATE_QP_PARAMS_FLAGS_VSD_ENABLE = 1 << 0,
+ MLX4_UPDATE_QP_PARAMS_FLAGS_ETH_CHECK_MC_LB = 1 << 0,
+ MLX4_UPDATE_QP_PARAMS_FLAGS_VSD_ENABLE = 1 << 1,
};
struct mlx4_update_qp_params {
--
1.8.3.1
* [PATCH v1 for-next 4/7] net/mlx4_en: Implement mcast loopback prevention for ETH qps
From: Eran Ben Elisha @ 2015-08-20 14:34 UTC (permalink / raw)
To: Doug Ledford
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
Christoph Lameter, Maor Gottlieb
From: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Set the mcast loopback prevention bit in the QPC for ETH MLX QPs (not
RSS QPs) when the firmware supports this feature. In addition, all rx
ring QPs need to be updated so that they do not enforce loopback checks.
This prevents getting the packets we sent back from both the network
stack and the HCA. Loopback prevention is done by comparing the counter
indices of the sending and receiving QPs. If they're equal, packets
aren't looped back.
Signed-off-by: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/net/ethernet/mellanox/mlx4/en_main.c | 22 ++++++++++++++++++++
drivers/net/ethernet/mellanox/mlx4/en_resources.c | 25 +++++++++++++++++++++++
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 3 ++-
3 files changed, 49 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_main.c b/drivers/net/ethernet/mellanox/mlx4/en_main.c
index a946e4b..005f910 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_main.c
@@ -123,6 +123,28 @@ void mlx4_en_update_loopback_state(struct net_device *dev,
*/
if (mlx4_is_mfunc(priv->mdev->dev) || priv->validate_loopback)
priv->flags |= MLX4_EN_FLAG_ENABLE_HW_LOOPBACK;
+
+ mutex_lock(&priv->mdev->state_lock);
+ if (priv->mdev->dev->caps.flags2 &
+ MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB &&
+ priv->rss_map.indir_qp.qpn) {
+ int i;
+ int err = 0;
+ int loopback = !!(features & NETIF_F_LOOPBACK);
+
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ int ret;
+
+ ret = mlx4_en_change_mcast_lb(priv,
+ &priv->rss_map.qps[i],
+ loopback);
+ if (!err)
+ err = ret;
+ }
+ if (err)
+ mlx4_warn(priv->mdev, "failed to change mcast loopback\n");
+ }
+ mutex_unlock(&priv->mdev->state_lock);
}
static int mlx4_en_get_profile(struct mlx4_en_dev *mdev)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_resources.c b/drivers/net/ethernet/mellanox/mlx4/en_resources.c
index e482fa1b..12aab5a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_resources.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_resources.c
@@ -69,6 +69,15 @@ void mlx4_en_fill_qp_context(struct mlx4_en_priv *priv, int size, int stride,
context->pri_path.counter_index = priv->counter_index;
context->cqn_send = cpu_to_be32(cqn);
context->cqn_recv = cpu_to_be32(cqn);
+ if (!rss &&
+ (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_LB_SRC_CHK) &&
+ context->pri_path.counter_index !=
+ MLX4_SINK_COUNTER_INDEX(mdev->dev)) {
+ /* disable multicast loopback to qp with same counter */
+ if (!(dev->features & NETIF_F_LOOPBACK))
+ context->pri_path.fl |= MLX4_FL_ETH_SRC_CHECK_MC_LB;
+ context->pri_path.control |= MLX4_CTRL_ETH_SRC_CHECK_IF_COUNTER;
+ }
context->db_rec_addr = cpu_to_be64(priv->res.db.dma << 2);
if (!(dev->features & NETIF_F_HW_VLAN_CTAG_RX))
context->param3 |= cpu_to_be32(1 << 30);
@@ -80,6 +89,22 @@ void mlx4_en_fill_qp_context(struct mlx4_en_priv *priv, int size, int stride,
}
}
+int mlx4_en_change_mcast_lb(struct mlx4_en_priv *priv, struct mlx4_qp *qp,
+ int loopback)
+{
+ int ret;
+ struct mlx4_update_qp_params qp_params;
+
+ memset(&qp_params, 0, sizeof(qp_params));
+ if (!loopback)
+ qp_params.flags = MLX4_UPDATE_QP_PARAMS_FLAGS_ETH_CHECK_MC_LB;
+
+ ret = mlx4_update_qp(priv->mdev->dev, qp->qpn,
+ MLX4_UPDATE_QP_ETH_SRC_CHECK_MC_LB,
+ &qp_params);
+
+ return ret;
+}
int mlx4_en_map_buffer(struct mlx4_buf *buf)
{
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 666d166..7db86d4 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -797,7 +797,8 @@ void mlx4_en_fill_qp_context(struct mlx4_en_priv *priv, int size, int stride,
void mlx4_en_sqp_event(struct mlx4_qp *qp, enum mlx4_event event);
int mlx4_en_map_buffer(struct mlx4_buf *buf);
void mlx4_en_unmap_buffer(struct mlx4_buf *buf);
-
+int mlx4_en_change_mcast_lb(struct mlx4_en_priv *priv, struct mlx4_qp *qp,
+ int loopback);
void mlx4_en_calc_rx_buf(struct net_device *dev);
int mlx4_en_config_rss_steer(struct mlx4_en_priv *priv);
void mlx4_en_release_rss_steer(struct mlx4_en_priv *priv);
--
1.8.3.1
* [PATCH v1 for-next 5/7] IB/mlx4: Add IB counters table
[not found] ` <1440081275-15864-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
` (3 preceding siblings ...)
2015-08-20 14:34 ` [PATCH v1 for-next 4/7] net/mlx4_en: Implement mcast loopback prevention for ETH qps Eran Ben Elisha
@ 2015-08-20 14:34 ` Eran Ben Elisha
2015-08-20 14:34 ` [PATCH v1 for-next 6/7] IB/mlx4: Add counter based implementation for QP multicast loopback block Eran Ben Elisha
` (3 subsequent siblings)
8 siblings, 0 replies; 22+ messages in thread
From: Eran Ben Elisha @ 2015-08-20 14:34 UTC (permalink / raw)
To: Doug Ledford
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
Christoph Lameter
This is an infrastructure step for allocating and attaching more than
one counter to QPs on the same port. Allocate a counters table and
manage the insertion and removal of counters on load and unload of
mlx4 IB.
Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/hw/mlx4/mad.c | 25 ++++++++++----
drivers/infiniband/hw/mlx4/main.c | 63 ++++++++++++++++++++++++++++--------
drivers/infiniband/hw/mlx4/mlx4_ib.h | 9 +++++-
drivers/infiniband/hw/mlx4/qp.c | 8 +++--
4 files changed, 81 insertions(+), 24 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index 1cd75ff..68f2567 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -824,18 +824,29 @@ static int iboe_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
{
struct mlx4_counter counter_stats;
struct mlx4_ib_dev *dev = to_mdev(ibdev);
- int err;
+ struct counter_index *tmp_counter;
+ int err = IB_MAD_RESULT_FAILURE, stats_avail = 0;
if (in_mad->mad_hdr.mgmt_class != IB_MGMT_CLASS_PERF_MGMT)
return -EINVAL;
memset(&counter_stats, 0, sizeof(counter_stats));
- err = mlx4_get_counter_stats(dev->dev,
- dev->counters[port_num - 1].index,
- &counter_stats, 0);
- if (err)
- err = IB_MAD_RESULT_FAILURE;
- else {
+ mutex_lock(&dev->counters_table[port_num - 1].mutex);
+ list_for_each_entry(tmp_counter,
+ &dev->counters_table[port_num - 1].counters_list,
+ list) {
+ err = mlx4_get_counter_stats(dev->dev,
+ tmp_counter->index,
+ &counter_stats, 0);
+ if (err) {
+ err = IB_MAD_RESULT_FAILURE;
+ stats_avail = 0;
+ break;
+ }
+ stats_avail = 1;
+ }
+ mutex_unlock(&dev->counters_table[port_num - 1].mutex);
+ if (stats_avail) {
memset(out_mad->data, 0, sizeof out_mad->data);
switch (counter_stats.counter_mode & 0xf) {
case 0:
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index efecdf0..26a96b8 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -1247,6 +1247,22 @@ static int add_gid_entry(struct ib_qp *ibqp, union ib_gid *gid)
return 0;
}
+static void mlx4_ib_delete_counters_table(struct mlx4_ib_dev *ibdev,
+ struct mlx4_ib_counters *ctr_table)
+{
+ struct counter_index *counter, *tmp_count;
+
+ mutex_lock(&ctr_table->mutex);
+ list_for_each_entry_safe(counter, tmp_count, &ctr_table->counters_list,
+ list) {
+ if (counter->allocated)
+ mlx4_counter_free(ibdev->dev, counter->index);
+ list_del(&counter->list);
+ kfree(counter);
+ }
+ mutex_unlock(&ctr_table->mutex);
+}
+
int mlx4_ib_add_mc(struct mlx4_ib_dev *mdev, struct mlx4_ib_qp *mqp,
union ib_gid *gid)
{
@@ -2131,6 +2147,7 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
int num_req_counters;
int allocated;
u32 counter_index;
+ struct counter_index *new_counter_index = NULL;
pr_info_once("%s", mlx4_ib_version);
@@ -2302,6 +2319,11 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
if (init_node_data(ibdev))
goto err_map;
+ for (i = 0; i < ibdev->num_ports; ++i) {
+ mutex_init(&ibdev->counters_table[i].mutex);
+ INIT_LIST_HEAD(&ibdev->counters_table[i].counters_list);
+ }
+
num_req_counters = mlx4_is_bonded(dev) ? 1 : ibdev->num_ports;
for (i = 0; i < num_req_counters; ++i) {
mutex_init(&ibdev->qp1_proxy_lock[i]);
@@ -2320,15 +2342,34 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
counter_index = mlx4_get_default_counter_index(dev,
i + 1);
}
- ibdev->counters[i].index = counter_index;
- ibdev->counters[i].allocated = allocated;
+ new_counter_index = kmalloc(sizeof(*new_counter_index),
+ GFP_KERNEL);
+ if (!new_counter_index) {
+ if (allocated)
+ mlx4_counter_free(ibdev->dev, counter_index);
+ goto err_counter;
+ }
+ new_counter_index->index = counter_index;
+ new_counter_index->allocated = allocated;
+ list_add_tail(&new_counter_index->list,
+ &ibdev->counters_table[i].counters_list);
+ ibdev->counters_table[i].default_counter = counter_index;
pr_info("counter index %d for port %d allocated %d\n",
counter_index, i + 1, allocated);
}
if (mlx4_is_bonded(dev))
for (i = 1; i < ibdev->num_ports ; ++i) {
- ibdev->counters[i].index = ibdev->counters[0].index;
- ibdev->counters[i].allocated = 0;
+ new_counter_index =
+ kmalloc(sizeof(struct counter_index),
+ GFP_KERNEL);
+ if (!new_counter_index)
+ goto err_counter;
+ new_counter_index->index = counter_index;
+ new_counter_index->allocated = 0;
+ list_add_tail(&new_counter_index->list,
+ &ibdev->counters_table[i].counters_list);
+ ibdev->counters_table[i].default_counter =
+ counter_index;
}
mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_IB)
@@ -2437,12 +2478,9 @@ err_steer_qp_release:
mlx4_qp_release_range(dev, ibdev->steer_qpn_base,
ibdev->steer_qpn_count);
err_counter:
- for (i = 0; i < ibdev->num_ports; ++i) {
- if (ibdev->counters[i].index != -1 &&
- ibdev->counters[i].allocated)
- mlx4_counter_free(ibdev->dev,
- ibdev->counters[i].index);
- }
+ for (i = 0; i < ibdev->num_ports; ++i)
+ mlx4_ib_delete_counters_table(ibdev, &ibdev->counters_table[i]);
+
err_map:
iounmap(ibdev->uar_map);
@@ -2546,9 +2584,8 @@ static void mlx4_ib_remove(struct mlx4_dev *dev, void *ibdev_ptr)
iounmap(ibdev->uar_map);
for (p = 0; p < ibdev->num_ports; ++p)
- if (ibdev->counters[p].index != -1 &&
- ibdev->counters[p].allocated)
- mlx4_counter_free(ibdev->dev, ibdev->counters[p].index);
+ mlx4_ib_delete_counters_table(ibdev, &ibdev->counters_table[p]);
+
mlx4_foreach_port(p, dev, MLX4_PORT_TYPE_IB)
mlx4_CLOSE_PORT(dev, p);
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 1e7b23b..4056dc1 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -528,10 +528,17 @@ struct mlx4_ib_iov_port {
};
struct counter_index {
+ struct list_head list;
u32 index;
u8 allocated;
};
+struct mlx4_ib_counters {
+ struct list_head counters_list;
+ struct mutex mutex; /* mutex for accessing counters list */
+ u32 default_counter;
+};
+
struct mlx4_ib_dev {
struct ib_device ib_dev;
struct mlx4_dev *dev;
@@ -550,7 +557,7 @@ struct mlx4_ib_dev {
struct mutex cap_mask_mutex;
bool ib_active;
struct mlx4_ib_iboe iboe;
- struct counter_index counters[MLX4_MAX_PORTS];
+ struct mlx4_ib_counters counters_table[MLX4_MAX_PORTS];
int *eq_table;
struct kobject *iov_parent;
struct kobject *ports_parent;
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 4ad9be3..382913e 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -1460,6 +1460,7 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
int sqd_event;
int steer_qp = 0;
int err = -EINVAL;
+ int counter_index;
/* APM is not supported under RoCE */
if (attr_mask & IB_QP_ALT_PATH &&
@@ -1543,9 +1544,10 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
}
if (cur_state == IB_QPS_INIT && new_state == IB_QPS_RTR) {
- if (dev->counters[qp->port - 1].index != -1) {
- context->pri_path.counter_index =
- dev->counters[qp->port - 1].index;
+ counter_index =
+ dev->counters_table[qp->port - 1].default_counter;
+ if (counter_index != -1) {
+ context->pri_path.counter_index = counter_index;
optpar |= MLX4_QP_OPTPAR_COUNTER_INDEX;
} else
context->pri_path.counter_index =
--
1.8.3.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
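For readers following patch 5/7 outside the kernel tree, the counters table can be pictured as one lock-protected list per port. The block below is only a userspace sketch under stated assumptions: a pthread mutex stands in for the kernel mutex, a hand-rolled singly linked list stands in for list_head, and counters_table_init(), counters_table_add(), counters_table_delete() and hw_free_count are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-in for struct counter_index from the patch. */
struct counter_entry {
	struct counter_entry *next;
	uint32_t index;
	uint8_t allocated;	/* 1 if this entry owns the HW counter */
};

/* Userspace stand-in for struct mlx4_ib_counters (one per port). */
struct counters_table {
	struct counter_entry *head;
	pthread_mutex_t mutex;	/* protects the list */
	uint32_t default_counter;
};

/* Hypothetical bookkeeping: how many HW counters were "freed". */
static int hw_free_count;

static void counters_table_init(struct counters_table *t)
{
	assert(t != NULL);
	t->head = NULL;
	t->default_counter = 0;
	pthread_mutex_init(&t->mutex, NULL);
}

/* Append an entry at the tail, as list_add_tail() does in the patch. */
static int counters_table_add(struct counters_table *t, uint32_t index,
			      uint8_t allocated)
{
	struct counter_entry *e = malloc(sizeof(*e));
	struct counter_entry **pp;

	if (!e)
		return -1;
	e->next = NULL;
	e->index = index;
	e->allocated = allocated;

	pthread_mutex_lock(&t->mutex);
	for (pp = &t->head; *pp; pp = &(*pp)->next)
		;
	*pp = e;
	pthread_mutex_unlock(&t->mutex);
	return 0;
}

/* Drop every entry; only entries that own their counter release it. */
static void counters_table_delete(struct counters_table *t)
{
	struct counter_entry *e;

	pthread_mutex_lock(&t->mutex);
	while ((e = t->head) != NULL) {
		t->head = e->next;
		if (e->allocated)
			hw_free_count++;	/* stands in for mlx4_counter_free() */
		free(e);
	}
	pthread_mutex_unlock(&t->mutex);
}
```

The point of the sketch is the ownership rule: an entry that merely references a shared default counter (allocated == 0) is unlinked and freed without releasing the counter itself.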
* [PATCH v1 for-next 6/7] IB/mlx4: Add counter based implementation for QP multicast loopback block
[not found] ` <1440081275-15864-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
` (4 preceding siblings ...)
2015-08-20 14:34 ` [PATCH v1 for-next 5/7] IB/mlx4: Add IB counters table Eran Ben Elisha
@ 2015-08-20 14:34 ` Eran Ben Elisha
2015-08-20 14:34 ` [PATCH v1 for-next 7/7] IB/mlx4: Add support for blocking multicast loopback QP creation user flag Eran Ben Elisha
` (2 subsequent siblings)
8 siblings, 0 replies; 22+ messages in thread
From: Eran Ben Elisha @ 2015-08-20 14:34 UTC (permalink / raw)
To: Doug Ledford
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
Christoph Lameter
The current implementation of MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK is not
supported when the link layer is Ethernet.
This patch adds a counter-based implementation of multicast loopback
prevention. HW can drop multicast loopback packets if the sender QP counter
index is equal to the receiver QP counter index. If the QP flag
MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK is set and the link layer is Ethernet,
create a new counter and attach it to the QP so that it continues
receiving multicast loopback traffic except its own.
The decision whether to create a new counter is made when the QP is
modified to RTR, after the QP's port is set. When the QP is destroyed or
moved back to the reset state, delete the counter.
Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/hw/mlx4/mlx4_ib.h | 1 +
drivers/infiniband/hw/mlx4/qp.c | 67 ++++++++++++++++++++++++++++++++++++
2 files changed, 68 insertions(+)
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 4056dc1..086416a 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -320,6 +320,7 @@ struct mlx4_ib_qp {
struct list_head qps_list;
struct list_head cq_recv_list;
struct list_head cq_send_list;
+ struct counter_index *counter_index;
};
struct mlx4_ib_srq {
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 382913e..2871949 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -617,6 +617,18 @@ static int qp0_enabled_vf(struct mlx4_dev *dev, int qpn)
return 0;
}
+static void mlx4_ib_free_qp_counter(struct mlx4_ib_dev *dev,
+ struct mlx4_ib_qp *qp)
+{
+ mutex_lock(&dev->counters_table[qp->port - 1].mutex);
+ mlx4_counter_free(dev->dev, qp->counter_index->index);
+ list_del(&qp->counter_index->list);
+ mutex_unlock(&dev->counters_table[qp->port - 1].mutex);
+
+ kfree(qp->counter_index);
+ qp->counter_index = NULL;
+}
+
static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
struct ib_qp_init_attr *init_attr,
struct ib_udata *udata, int sqpn, struct mlx4_ib_qp **caller_qp,
@@ -1189,6 +1201,9 @@ int mlx4_ib_destroy_qp(struct ib_qp *qp)
mutex_unlock(&dev->qp1_proxy_lock[mqp->port - 1]);
}
+ if (mqp->counter_index)
+ mlx4_ib_free_qp_counter(dev, mqp);
+
pd = get_pd(mqp);
destroy_qp_common(dev, mqp, !!pd->ibpd.uobject);
@@ -1447,6 +1462,40 @@ static int handle_eth_ud_smac_index(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *
return 0;
}
+static int create_qp_lb_counter(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp)
+{
+ struct counter_index *new_counter_index;
+ int err;
+ u32 tmp_idx;
+
+ if (rdma_port_get_link_layer(&dev->ib_dev, qp->port) !=
+ IB_LINK_LAYER_ETHERNET ||
+ !(qp->flags & MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK) ||
+ !(dev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_LB_SRC_CHK))
+ return 0;
+
+ err = mlx4_counter_alloc(dev->dev, &tmp_idx);
+ if (err)
+ return err;
+
+ new_counter_index = kmalloc(sizeof(*new_counter_index), GFP_KERNEL);
+ if (!new_counter_index) {
+ mlx4_counter_free(dev->dev, tmp_idx);
+ return -ENOMEM;
+ }
+
+ new_counter_index->index = tmp_idx;
+ new_counter_index->allocated = 1;
+ qp->counter_index = new_counter_index;
+
+ mutex_lock(&dev->counters_table[qp->port - 1].mutex);
+ list_add_tail(&new_counter_index->list,
+ &dev->counters_table[qp->port - 1].counters_list);
+ mutex_unlock(&dev->counters_table[qp->port - 1].mutex);
+
+ return 0;
+}
+
static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
const struct ib_qp_attr *attr, int attr_mask,
enum ib_qp_state cur_state, enum ib_qp_state new_state)
@@ -1520,6 +1569,9 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
context->sq_size_stride = ilog2(qp->sq.wqe_cnt) << 3;
context->sq_size_stride |= qp->sq.wqe_shift - 4;
+ if (new_state == IB_QPS_RESET && qp->counter_index)
+ mlx4_ib_free_qp_counter(dev, qp);
+
if (cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) {
context->sq_size_stride |= !!qp->sq_no_prefetch << 7;
context->xrcd = cpu_to_be32((u32) qp->xrcdn);
@@ -1544,11 +1596,24 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
}
if (cur_state == IB_QPS_INIT && new_state == IB_QPS_RTR) {
+ err = create_qp_lb_counter(dev, qp);
+ if (err)
+ goto out;
+
counter_index =
dev->counters_table[qp->port - 1].default_counter;
+ if (qp->counter_index)
+ counter_index = qp->counter_index->index;
+
if (counter_index != -1) {
context->pri_path.counter_index = counter_index;
optpar |= MLX4_QP_OPTPAR_COUNTER_INDEX;
+ if (qp->counter_index) {
+ context->pri_path.fl |=
+ MLX4_FL_ETH_SRC_CHECK_MC_LB;
+ context->pri_path.vlan_control |=
+ MLX4_CTRL_ETH_SRC_CHECK_IF_COUNTER;
+ }
} else
context->pri_path.counter_index =
MLX4_SINK_COUNTER_INDEX(dev->dev);
@@ -1850,6 +1915,8 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
}
}
out:
+ if (err && qp->counter_index)
+ mlx4_ib_free_qp_counter(dev, qp);
if (err && steer_qp)
mlx4_ib_steer_qp_reg(dev, qp, 0);
kfree(context);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
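The drop rule described in the commit message of patch 6/7 can be modelled in a few lines. This is only a sketch of the described behaviour, not the hardware logic: struct qp_model and drops_mc_loopback() are hypothetical names, and the assumption that the source-check bit is evaluated on the receiving QP is mine, not something the patch states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two QP properties the rule depends on. */
struct qp_model {
	uint32_t counter_index;	/* counter attached to the QP */
	bool src_check_mc_lb;	/* source check enabled (assumed per-receiver) */
};

/*
 * Returns true if a multicast packet sent by 'sender' and looped back
 * to 'receiver' would be dropped under the rule in the commit message:
 * equal counter indices, with the check enabled on the receiver.
 */
static bool drops_mc_loopback(const struct qp_model *sender,
			      const struct qp_model *receiver)
{
	return receiver->src_check_mc_lb &&
	       sender->counter_index == receiver->counter_index;
}
```

Under this model, giving a QP its own private counter means only its own multicast traffic matches on receive, while loopback traffic from QPs on other counters still arrives.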
* [PATCH v1 for-next 7/7] IB/mlx4: Add support for blocking multicast loopback QP creation user flag
[not found] ` <1440081275-15864-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
` (5 preceding siblings ...)
2015-08-20 14:34 ` [PATCH v1 for-next 6/7] IB/mlx4: Add counter based implementation for QP multicast loopback block Eran Ben Elisha
@ 2015-08-20 14:34 ` Eran Ben Elisha
2015-08-25 16:44 ` [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4 Christoph Lameter
2015-09-21 14:24 ` eran ben elisha
8 siblings, 0 replies; 22+ messages in thread
From: Eran Ben Elisha @ 2015-08-20 14:34 UTC (permalink / raw)
To: Doug Ledford
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
Christoph Lameter
MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK is now supported downstream.
In addition, this flag was previously supported only for IB_QPT_UD; with
the new implementation it is supported for all QP types.
Support IB_USER_VERBS_EX_CMD_CREATE_QP in order to get the flag from
user space using the extended create QP command.
Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/hw/mlx4/main.c | 3 ++-
drivers/infiniband/hw/mlx4/qp.c | 13 ++++++++-----
2 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 26a96b8..1079ee5 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2310,7 +2310,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
ibdev->ib_dev.uverbs_ex_cmd_mask |=
(1ull << IB_USER_VERBS_EX_CMD_QUERY_DEVICE) |
- (1ull << IB_USER_VERBS_EX_CMD_CREATE_CQ);
+ (1ull << IB_USER_VERBS_EX_CMD_CREATE_CQ) |
+ (1ull << IB_USER_VERBS_EX_CMD_CREATE_QP);
mlx4_ib_alloc_eqs(dev, ibdev);
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 2871949..ba25a1b 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -758,9 +758,6 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
} else {
qp->sq_no_prefetch = 0;
- if (init_attr->create_flags & IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK)
- qp->flags |= MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK;
-
if (init_attr->create_flags & IB_QP_CREATE_IPOIB_UD_LSO)
qp->flags |= MLX4_IB_QP_LSO;
@@ -834,6 +831,9 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
goto err_proxy;
}
+ if (init_attr->create_flags & IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK)
+ qp->flags |= MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK;
+
err = mlx4_qp_alloc(dev->dev, qpn, &qp->mqp, gfp);
if (err)
goto err_qpn;
@@ -1098,6 +1098,7 @@ struct ib_qp *mlx4_ib_create_qp(struct ib_pd *pd,
{
struct mlx4_ib_qp *qp = NULL;
int err;
+ int sup_u_create_flags = MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK;
u16 xrcdn = 0;
gfp_t gfp;
@@ -1121,8 +1122,10 @@ struct ib_qp *mlx4_ib_create_qp(struct ib_pd *pd,
}
if (init_attr->create_flags &&
- (udata ||
- ((init_attr->create_flags & ~(MLX4_IB_SRIOV_SQP | MLX4_IB_QP_CREATE_USE_GFP_NOIO)) &&
+ ((udata && init_attr->create_flags & ~(sup_u_create_flags)) ||
+ ((init_attr->create_flags & ~(MLX4_IB_SRIOV_SQP |
+ MLX4_IB_QP_CREATE_USE_GFP_NOIO |
+ MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK)) &&
init_attr->qp_type != IB_QPT_UD) ||
((init_attr->create_flags & MLX4_IB_SRIOV_SQP) &&
init_attr->qp_type > IB_QPT_GSI)))
--
1.8.3.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
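The user-space part of the create-flags check in patch 7/7 boils down to a mask test: any bit outside the supported set rejects the request. A minimal sketch, assuming made-up flag values; the SKETCH_* macros and user_create_flags_ok() are hypothetical and do not correspond to the real IB_QP_CREATE_* or MLX4_IB_* definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values, chosen only for this sketch. */
#define SKETCH_QP_LSO			(1u << 0)
#define SKETCH_QP_BLOCK_MCAST_LOOPBACK	(1u << 1)

/* Flags a user-space caller may pass in this sketch. */
#define SKETCH_USER_SUPPORTED		SKETCH_QP_BLOCK_MCAST_LOOPBACK

/* True if no bit outside the supported mask is set. */
static bool user_create_flags_ok(uint32_t create_flags)
{
	return (create_flags & ~SKETCH_USER_SUPPORTED) == 0;
}
```

Keeping the supported set in a single mask means that allowing another user flag later only touches the mask, not the condition.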
* Re: [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4
[not found] ` <1440081275-15864-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
` (6 preceding siblings ...)
2015-08-20 14:34 ` [PATCH v1 for-next 7/7] IB/mlx4: Add support for blocking multicast loopback QP creation user flag Eran Ben Elisha
@ 2015-08-25 16:44 ` Christoph Lameter
[not found] ` <alpine.DEB.2.11.1508251143430.16801-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org>
2015-09-21 14:24 ` eran ben elisha
8 siblings, 1 reply; 22+ messages in thread
From: Christoph Lameter @ 2015-08-25 16:44 UTC (permalink / raw)
To: Eran Ben Elisha
Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz
On Thu, 20 Aug 2015, Eran Ben Elisha wrote:
> This patch-set adds a new implementation for multicast loopback prevention for
> mlx4 driver. The current implementation is very limited, especially if link
> layer is Ethernet. The new implementation is based on HW feature of dropping
> incoming multicast packets if the sender QP counter index is equal to the
> receiver counter index.
Do you have this in a git tree somewhere for testing?
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4
[not found] ` <alpine.DEB.2.11.1508251143430.16801-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org>
@ 2015-08-26 9:12 ` eran ben elisha
[not found] ` <CAKHjkjnk9sFfr3KhNGBD_LzXUbDdb7_=vF2A3Z5+ousFShfxJw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 22+ messages in thread
From: eran ben elisha @ 2015-08-26 9:12 UTC (permalink / raw)
To: Christoph Lameter
Cc: Eran Ben Elisha, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
Or Gerlitz
On Tue, Aug 25, 2015 at 7:44 PM, Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote:
> On Thu, 20 Aug 2015, Eran Ben Elisha wrote:
>
>> This patch-set adds a new implementation for multicast loopback prevention for
>> mlx4 driver. The current implementation is very limited, especially if link
>> layer is Ethernet. The new implementation is based on HW feature of dropping
>> incoming multicast packets if the sender QP counter index is equal to the
>> receiver counter index.
>
> Do you have this in a git tree somewhere for testing?
Yes,
please pull from https://github.com/eranbenelisha/linux branch
rebased-for-4.3/lb_prev
If you wish to get the user space as well for testing, please use:
https://github.com/eranbenelisha/libibverbs branch for-linux-rdma-lb_prev
https://github.com/eranbenelisha/libmlx4 branch for-linux-rdma-lb_prev
Eran.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4
[not found] ` <1440081275-15864-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
` (7 preceding siblings ...)
2015-08-25 16:44 ` [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4 Christoph Lameter
@ 2015-09-21 14:24 ` eran ben elisha
[not found] ` <CAKHjkjnQNqeLu+irEg97a1hcKp_ziXsDOFaTRwcaPpZAX_fMng-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
8 siblings, 1 reply; 22+ messages in thread
From: eran ben elisha @ 2015-09-21 14:24 UTC (permalink / raw)
To: Eran Ben Elisha
Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz,
Christoph Lameter
On Thu, Aug 20, 2015 at 5:34 PM, Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> Hi Doug,
>
> This patch-set adds a new implementation for multicast loopback prevention for
> mlx4 driver. The current implementation is very limited, especially if link
> layer is Ethernet. The new implementation is based on HW feature of dropping
> incoming multicast packets if the sender QP counter index is equal to the
> receiver counter index.
>
> Patch 0001 extends ib_uverbs_create_qp in order to allow receiving the
> multicast loopback flag at create flags.
> Patch 0002 adds an infrastructure for the counters' loopback prevention in the
> mlx4_core.
> Patch 0003 modifies mlx4_en QPs to use the new loopback prevention mode.
> Patches 0004-0006 implements this feature for mlx4_ib driver.
> Patch 0007 allows setting IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK in create_flag
> field both from uverbs and ib_create_qp.
>
> Changes from v0:
> Move loopback assignment outside the for loop according to Yuval's comment
> rebase over to-be-rebased/for-4.3
>
>
> Thanks,
> Eran.
>
> Eran Ben Elisha (5):
> IB/core: Extend ib_uverbs_create_qp
> IB/core: Allow setting create flags in QP init attribute
> IB/mlx4: Add IB counters table
> IB/mlx4: Add counter based implementation for QP multicast loopback
> block
> IB/mlx4: Add support for blocking multicast loopback QP creation user
> flag
>
> Maor Gottlieb (2):
> net/mlx4_core: Add support for filtering multicast loopback
> net/mlx4_en: Implement mcast loopback prevention for ETH qps
Hi Doug,
This version has been sitting on the mailing list for a month with no comments.
It has been tested by Christoph Lameter.
When do you plan to take it into your tree?
Eran.
>
> drivers/infiniband/core/uverbs.h | 1 +
> drivers/infiniband/core/uverbs_cmd.c | 259 +++++++++++++++------
> drivers/infiniband/core/uverbs_main.c | 1 +
> drivers/infiniband/hw/mlx4/mad.c | 25 +-
> drivers/infiniband/hw/mlx4/main.c | 66 ++++--
> drivers/infiniband/hw/mlx4/mlx4_ib.h | 10 +-
> drivers/infiniband/hw/mlx4/qp.c | 88 ++++++-
> drivers/net/ethernet/mellanox/mlx4/en_main.c | 22 ++
> drivers/net/ethernet/mellanox/mlx4/en_resources.c | 25 ++
> drivers/net/ethernet/mellanox/mlx4/fw.c | 6 +
> drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 3 +-
> drivers/net/ethernet/mellanox/mlx4/qp.c | 19 +-
> .../net/ethernet/mellanox/mlx4/resource_tracker.c | 30 ++-
> include/linux/mlx4/device.h | 2 +
> include/linux/mlx4/qp.h | 24 +-
> include/uapi/rdma/ib_user_verbs.h | 26 +++
> 16 files changed, 498 insertions(+), 109 deletions(-)
>
> --
> 1.8.3.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4
[not found] ` <CAKHjkjnQNqeLu+irEg97a1hcKp_ziXsDOFaTRwcaPpZAX_fMng-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-09-29 18:24 ` Doug Ledford
[not found] ` <560AD75A.7080700-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 22+ messages in thread
From: Doug Ledford @ 2015-09-29 18:24 UTC (permalink / raw)
To: eran ben elisha, Eran Ben Elisha
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Christoph Lameter
[-- Attachment #1: Type: text/plain, Size: 3684 bytes --]
On 09/21/2015 10:24 AM, eran ben elisha wrote:
> On Thu, Aug 20, 2015 at 5:34 PM, Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>> Hi Doug,
>>
>> This patch-set adds a new implementation for multicast loopback prevention for
>> mlx4 driver. The current implementation is very limited, especially if link
>> layer is Ethernet. The new implementation is based on HW feature of dropping
>> incoming multicast packets if the sender QP counter index is equal to the
>> receiver counter index.
>>
>> Patch 0001 extends ib_uverbs_create_qp in order to allow receiving the
>> multicast loopback flag at create flags.
>> Patch 0002 adds an infrastructure for the counters' loopback prevention in the
>> mlx4_core.
>> Patch 0003 modifies mlx4_en QPs to use the new loopback prevention mode.
>> Patches 0004-0006 implements this feature for mlx4_ib driver.
>> Patch 0007 allows setting IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK in create_flag
>> field both from uverbs and ib_create_qp.
>>
>> Changes from v0:
>> Move loopback assignment outside the for loop according to Yuval's comment
>> rebase over to-be-rebased/for-4.3
>>
>>
>> Thanks,
>> Eran.
>>
>> Eran Ben Elisha (5):
>> IB/core: Extend ib_uverbs_create_qp
>> IB/core: Allow setting create flags in QP init attribute
>> IB/mlx4: Add IB counters table
>> IB/mlx4: Add counter based implementation for QP multicast loopback
>> block
>> IB/mlx4: Add support for blocking multicast loopback QP creation user
>> flag
>>
>> Maor Gottlieb (2):
>> net/mlx4_core: Add support for filtering multicast loopback
>> net/mlx4_en: Implement mcast loopback prevention for ETH qps
>
> Hi Doug,
> This version sits in the mailing list for a month with no comment.
> It has been tested by Christoph Lameter.
> When do you plan to take it into your tree?
I'm getting ready to tackle the for-next backlog.
> Eran.
>
>>
>> drivers/infiniband/core/uverbs.h | 1 +
>> drivers/infiniband/core/uverbs_cmd.c | 259 +++++++++++++++------
>> drivers/infiniband/core/uverbs_main.c | 1 +
>> drivers/infiniband/hw/mlx4/mad.c | 25 +-
>> drivers/infiniband/hw/mlx4/main.c | 66 ++++--
>> drivers/infiniband/hw/mlx4/mlx4_ib.h | 10 +-
>> drivers/infiniband/hw/mlx4/qp.c | 88 ++++++-
>> drivers/net/ethernet/mellanox/mlx4/en_main.c | 22 ++
>> drivers/net/ethernet/mellanox/mlx4/en_resources.c | 25 ++
>> drivers/net/ethernet/mellanox/mlx4/fw.c | 6 +
>> drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 3 +-
>> drivers/net/ethernet/mellanox/mlx4/qp.c | 19 +-
>> .../net/ethernet/mellanox/mlx4/resource_tracker.c | 30 ++-
>> include/linux/mlx4/device.h | 2 +
>> include/linux/mlx4/qp.h | 24 +-
>> include/uapi/rdma/ib_user_verbs.h | 26 +++
>> 16 files changed, 498 insertions(+), 109 deletions(-)
>>
>> --
>> 1.8.3.1
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
GPG KeyID: 0E572FDD
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4
[not found] ` <560AD75A.7080700-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-10-05 21:59 ` Or Gerlitz
[not found] ` <CAJ3xEMiuPEJZ5EQueC=8wQyok2QbEaywQ3gRYVpqOwSaFQcj0A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 22+ messages in thread
From: Or Gerlitz @ 2015-10-05 21:59 UTC (permalink / raw)
To: Doug Ledford
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Christoph Lameter, Matan Barak, Haggai Eran
On Tue, Sep 29, 2015 at 9:24 PM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> I'm getting ready to tackle the for-next backlog.
Sounds good, rc4 is here... so let's get things going. I see that
there is a 4.4 branch @ your kernel.org tree with the checksum bits,
is there anything else which you've already picked into your github
branches or is this all for now?
Re the backlog, we have Eran's multicast loopback prevention patches
(this thread), Haggai's work to actually add name spaces support [1] on
top of his effort for demuxing requests which landed in 4.3, and
Matan's subsequent RoCE work. Where do these all stand?
Or.
[1] http://marc.info/?l=linux-rdma&m=144068579319679&w=2
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4
[not found] ` <CAJ3xEMiuPEJZ5EQueC=8wQyok2QbEaywQ3gRYVpqOwSaFQcj0A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-10-06 16:05 ` Doug Ledford
[not found] ` <5613F145.3040204-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 22+ messages in thread
From: Doug Ledford @ 2015-10-06 16:05 UTC (permalink / raw)
To: Or Gerlitz
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Christoph Lameter, Matan Barak, Haggai Eran
[-- Attachment #1: Type: text/plain, Size: 1583 bytes --]
On 10/05/2015 05:59 PM, Or Gerlitz wrote:
> On Tue, Sep 29, 2015 at 9:24 PM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> I'm getting ready to tackle the for-next backlog.
>
> Sounds, good, rc4 is here... so lets get things going. I see that
> there is a 4.4 branch @ your kernel.org tree with the checksum bits,
> is there anything else which you've already picked into your github
> branches or this is all for now?
>
> Re the backlog, we have Eran's multicast loopback prevention patches
> (this thread, Haggai's work to actually add name spaces support [1] on
> top of his effort for demuxing requests which landed in 4.3 and
> Matan's subsequent RoCE work, where these all stand?
>
> Or.
>
> [1] http://marc.info/?l=linux-rdma&m=144068579319679&w=2
I'll have some sort of answer for that soon. I spent the better part of
last week, and what time I worked on the weekend, plus all day yesterday
on the internal infrastructure here at Red Hat. We're experiencing some
growing pains in our cluster and some downtime as a result that keeps me
from being able to test code effectively. I wouldn't be surprised if it
takes another day or two to get it completely sorted out (or sorted as
best I can, some things are out of my control). Then I have to see if
any of the currently posted fixes for 4.3rc that I haven't grabbed yet
resolve the iSER issue I'm seeing, then I'll move on to for-next processing.
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
GPG KeyID: 0E572FDD
* Re: [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4
[not found] ` <5613F145.3040204-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-10-06 16:54 ` Sagi Grimberg
[not found] ` <5613FCB1.4050605-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-10-06 20:54 ` Or Gerlitz
1 sibling, 1 reply; 22+ messages in thread
From: Sagi Grimberg @ 2015-10-06 16:54 UTC (permalink / raw)
To: Doug Ledford, Or Gerlitz
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Christoph Lameter, Matan Barak, Haggai Eran
> Then I have to see if any of the currently posted fixes for 4.3rc that I haven't grabbed yet
> resolve the iSER issue I'm seeing, then I'll move on to for-next processing.
Anything I can help with Doug?
* Re: [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4
[not found] ` <5613FCB1.4050605-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2015-10-06 17:41 ` Doug Ledford
0 siblings, 0 replies; 22+ messages in thread
From: Doug Ledford @ 2015-10-06 17:41 UTC (permalink / raw)
To: Sagi Grimberg, Or Gerlitz
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Christoph Lameter, Matan Barak, Haggai Eran
On 10/06/2015 12:54 PM, Sagi Grimberg wrote:
>> Then I have to see if any of the currently posted fixes for 4.3rc that
>> I haven't grabbed yet
>> resolve the iSER issue I'm seeing, then I'll move on to for-next
>> processing.
>
> Anything I can help with Doug?
I'll tell you in a few hours.
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
GPG KeyID: 0E572FDD
* Re: [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4
[not found] ` <5613F145.3040204-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-10-06 16:54 ` Sagi Grimberg
@ 2015-10-06 20:54 ` Or Gerlitz
[not found] ` <CAJ3xEMhsr5uRLZ=Ur56TOpZy5XT3UbK=M1kcgjt=zx_x3HNcSQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
1 sibling, 1 reply; 22+ messages in thread
From: Or Gerlitz @ 2015-10-06 20:54 UTC (permalink / raw)
To: Doug Ledford
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Christoph Lameter, Matan Barak, Haggai Eran, Eli Cohen
On Tue, Oct 6, 2015 at 7:05 PM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> I'll have some sort of answer for that soon. I spent the better part of
> last week, and what time I worked on the weekend, plus all day yesterday
> on the internal infrastructure here at Red Hat. We're experiencing some
> growing pains in our cluster and some downtime as a result that keeps me
> from being able to test code effectively. I wouldn't be surprised if it
> takes another day or two to get it completely sorted out (or sorted as
> best I can, some things are out of my control). Then I have to see if
> any of the currently posted fixes for 4.3rc that I haven't grabbed yet
> resolve the iSER issue I'm seeing, then I'll move on to for-next processing.
Doug,
From my experience with VPI (IB/RoCE) clusters, librdmacm/rping is the
answer... namely -- if you have rping up and running over kernel X
for both IB and RoCE, things aren't in such a bad state. If you want
to go deeper, have it working over an IB non-default partition and
an Ethernet VLAN.
Also, for IB multicast, mckey with the IPoIB port space and iperf multicast
over IPoIB would tell you how things are.
So all in all, sans SRIOV, it should take you a whole 20m to figure out
if something is really DOA over IB/RoCE HW and I believe iWARP too
(rping) - makes sense?
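The smoke test described above can be sketched as a small script that only assembles the command lines and prints them; it does not run anything. This is a rough sketch under stated assumptions: the addresses are documentation placeholders, the fabric labels are arbitrary, the helper names are hypothetical, and the `mckey -p 0x0002` value for the IPoIB port space should be checked against the local mckey man page before use.

```python
import shlex
from typing import Dict, List, Tuple


def rping_pair(server_ip: str, count: int = 10) -> Tuple[List[str], List[str]]:
    """Return (server_argv, client_argv) for one rping connectivity check."""
    server = ["rping", "-s", "-a", server_ip, "-v"]
    client = ["rping", "-c", "-a", server_ip, "-v", "-C", str(count)]
    return server, client


def mckey_pair(mcast_addr: str, bind_ip: str) -> Tuple[List[str], List[str]]:
    """Return (receiver_argv, sender_argv) for mckey on the IPoIB port space.

    Assumption: 0x0002 selects the IPoIB port space; verify locally.
    """
    receiver = ["mckey", "-m", mcast_addr, "-b", bind_ip, "-p", "0x0002"]
    return receiver, receiver + ["-s"]


def iperf_mcast_pair(group: str, ttl: int = 4) -> Tuple[List[str], List[str]]:
    """Return (receiver_argv, sender_argv) for an iperf (v2) UDP multicast run."""
    receiver = ["iperf", "-s", "-u", "-B", group]
    sender = ["iperf", "-c", group, "-u", "-T", str(ttl)]
    return receiver, sender


def render(argv: List[str]) -> str:
    """Quote an argv list so it can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in argv)


def smoke_plan(fabrics: Dict[str, str], mcast_addr: str, ipoib_ip: str) -> List[str]:
    """Build the list of commands: rping per fabric, then the multicast checks."""
    plan = []
    for name, server_ip in fabrics.items():
        for role, argv in zip(("server", "client"), rping_pair(server_ip)):
            plan.append(f"[{name}] rping {role}: {render(argv)}")
    for role, argv in zip(("receiver", "sender"), mckey_pair(mcast_addr, ipoib_ip)):
        plan.append(f"[ipoib] mckey {role}: {render(argv)}")
    for role, argv in zip(("receiver", "sender"), iperf_mcast_pair(mcast_addr)):
        plan.append(f"[ipoib] iperf {role}: {render(argv)}")
    return plan


if __name__ == "__main__":
    # Placeholder addresses (RFC 5737 ranges); substitute the real fabric IPs.
    fabrics = {"ib": "192.0.2.10", "roce": "198.51.100.10"}
    for line in smoke_plan(fabrics, "239.1.1.1", "192.0.2.20"):
        print(line)
```

Going deeper as suggested (an IB non-default partition, an Ethernet VLAN) would mean adding more entries to `fabrics`, each with the address of the corresponding child interface.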
What we do know that needs fixing for 4.3-rc:
--> RoCE: you need the patch re-posted by Haggai a few hours ago,
"IB/cma: Accept connection without a valid netdev on RoCE" -- without
it, RoCE isn't working.
--> mlx5 devices and non-default IB pkeys: Haggai and Co are
working on a fix, since this hasn't been working since 4.3-rc1. I told them
we need it by rc5.5 (i.e. a few days before rc6), and if not, we will have
to revert some 4.3-rc1 bits.
Or.
* Re: [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4
[not found] ` <CAJ3xEMhsr5uRLZ=Ur56TOpZy5XT3UbK=M1kcgjt=zx_x3HNcSQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-10-06 21:26 ` Doug Ledford
[not found] ` <56143C8E.9090407-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 22+ messages in thread
From: Doug Ledford @ 2015-10-06 21:26 UTC (permalink / raw)
To: Or Gerlitz
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Christoph Lameter, Matan Barak, Haggai Eran, Eli Cohen
On 10/06/2015 04:54 PM, Or Gerlitz wrote:
> On Tue, Oct 6, 2015 at 7:05 PM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> I'll have some sort of answer for that soon. I spent the better part of
>> last week, and what time I worked on the weekend, plus all day yesterday
>> on the internal infrastructure here at Red Hat. We're experiencing some
>> growing pains in our cluster and some downtime as a result that keeps me
>> from being able to test code effectively. I wouldn't be surprised if it
>> takes another day or two to get it completely sorted out (or sorted as
>> best I can, some things are out of my control). Then I have to see if
>> any of the currently posted fixes for 4.3rc that I haven't grabbed yet
>> resolve the iSER issue I'm seeing, then I'll move on to for-next processing.
>
> Doug,
>
> From my experience with VPI (IB/RoCE) clusters, librdmacm/rping is the
> answer... namely -- if you have rping up and running over kernel X
> for both IB and RoCE, things aren't in such a bad state. If you want
> to go deeper, have it working over IB non-default partition and
> Ethernet VLAN.
Nothing so simple unfortunately. And it isn't an IB/RoCE cluster, it's
IB/IB/OPA/RoCE/IWARP cluster. Regardless though, that's not my problem
and what I'm chasing.
> Also, for IB multicast, mckey with IPoIB port space, iperf multicast
> over IPoIB would tell you how things are.
>
> So all to all, sans SRIOV, it should take you whole 20m to figure out
> if something is really DOA over IB/RoCE HW and I believe iWARP too
> (rping) - makes sense?
Yes, I know how to do DOA testing.
> What we do know that needs fixing for 4.3-rc
>
> --> RoCE, you need the patch re-posted by Haggai few hours ago
> "IB/cma: Accept connection without a valid netdev on RoCE" -- without
> it, RoCE isn't working.
I have that already. It's available on both github and k.o and just
waiting for a pull request.
> --> mlx5 devices and no-default IB pkeys, Haggai and Co are
> working on a fix since this isn't working since 4.3-rc1. I told them
> we need it till rc5.5 (i.e few days before rc6 and if not, will have
> to revert some 4.3-rc1 bits.
I already have one patch related to this in my repo as well. The 0day
testing just came back and it's all good.
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
GPG KeyID: 0E572FDD
* Re: [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4
[not found] ` <56143C8E.9090407-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-10-06 21:49 ` Or Gerlitz
[not found] ` <CAJ3xEMiNk2ZXgg_Ji5dg+ahL_D5_RAvEzEEhMsPDjC-PJYDu7Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 22+ messages in thread
From: Or Gerlitz @ 2015-10-06 21:49 UTC (permalink / raw)
To: Doug Ledford
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Christoph Lameter, Matan Barak, Haggai Eran, Eli Cohen
On Wed, Oct 7, 2015 at 12:26 AM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> Nothing so simple unfortunately. And it isn't an IB/RoCE cluster, it's
> IB/IB/OPA/RoCE/IWARP cluster. Regardless though, that's not my problem
> and what I'm chasing.
To be precise no two transports out of IB/RoCE/iWARP/OPA are
inter-operable, so these are "just" different cards/transports under
the same IB core on this cluster.
> Yes, I know how to do DOA testing.
So what's dead in your env after (say) 59m of examination?
>> What we do know that needs fixing for 4.3-rc
>> --> RoCE, you need the patch re-posted by Haggai few hours ago
>> "IB/cma: Accept connection without a valid netdev on RoCE" -- without
>> it, RoCE isn't working.
> I have that already. It's available on both github and k.o and just
> waiting for a pull request.
Maybe wait to get the fixes for the non-default pkey on mlx5 (see more below)?
Did you actually note that before Haggai posted the patch?!
Once I realized how deep the breakage was, I became rather
worried about your testing env not shouting hard at us that something was
broken even before 4.3-rc1.
>> --> mlx5 devices and no-default IB pkeys, Haggai and Co are
>> working on a fix since this isn't working since 4.3-rc1. I told them
>> we need it till rc5.5 (i.e few days before rc6 and if not, will have
>> to revert some 4.3-rc1 bits.
> I already have on patch related to this in my repo as well. The 0day
> testing just came back and it's all good.
I suspect that you don't... Do you have rping up and running between
mlx4 and mlx5 on a non-default pkey? The breakage is a bit tricky and
you might not see it if you run mlx5 against mlx5. BTW, which patch is
that?
* Re: [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4
[not found] ` <CAJ3xEMiNk2ZXgg_Ji5dg+ahL_D5_RAvEzEEhMsPDjC-PJYDu7Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-10-07 15:28 ` Doug Ledford
[not found] ` <56153A0C.4040006-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 22+ messages in thread
From: Doug Ledford @ 2015-10-07 15:28 UTC (permalink / raw)
To: Or Gerlitz
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Christoph Lameter, Matan Barak, Haggai Eran, Eli Cohen
On 10/06/2015 05:49 PM, Or Gerlitz wrote:
> On Wed, Oct 7, 2015 at 12:26 AM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>
>> Nothing so simple unfortunately. And it isn't an IB/RoCE cluster, it's
>> IB/IB/OPA/RoCE/IWARP cluster. Regardless though, that's not my problem
>> and what I'm chasing.
>
> To be precise no two transports out of IB/RoCE/iWARP/OPA are
> inter-operable, so these are "just" different cards/transports under
> the same IB core on this cluster.
Except that some machines have links to as many as four of the different
fabrics and so a problem in one can affect testing of others.
>> Yes, I know how to do DOA testing.
>
> So what's dead in your env after (say) 59m of examination?
It's not dead after 59m, it's DOA immediately. And it's iSER.
But the details are much more complex than iSER is DOA. It was DOA when
running a rhel7 kernel (internal, for next kernel, not a release
kernel). That kernel is pretty close to upstream. When I went to put
an upstream kernel on there to see if it had the same issue, the
upstream kernel on that machine oopses on boot. It oopses in list_add,
but the backtrace doesn't list any usable information about who called
list_add with bogus data. However, reliably, right before the oops, the
csiostor driver fails to load properly, so I'm going with that being the
likely culprit. But each iteration is slow because when the rhel7
kernel iSER does its thing, it causes a hung reboot, but it also
crashes the iDRAC in the machine (errant drivers crashing a baseboard
management controller is never a good sign), so the reboot must be done
via a hard power cycle. When the upstream kernel oopses on boot, at
least the iDRAC is still working. As a result, each test iteration is
pretty slow.
There we go, bootup on a 4.3-rc4 kernel with cxgb4 FCoE driver disabled
succeeded. A hurdle passed. Now I can test upstream iSER.
With an upstream kernel, the drive is still read-only with iSER (it's
not configured that way to the best of my knowledge, but I'm using auto
generate ACLs, I'm getting ready to switch the system to specific ACLs
instead), but the thread isn't stuck in D state, so that's an improvement.
However, the machine is still crashing the iDRAC on reboot. I can't be
certain if it's the SRP target or iSER target causing this as they both
were brought up live at the same time and reboot cycles without either
of these work fine. So I have more investigation to go before I know
exactly what's going on. And as I pointed out, each iteration is slow :-/
>>> What we do know that needs fixing for 4.3-rc
>>> --> RoCE, you need the patch re-posted by Haggai few hours ago
>>> "IB/cma: Accept connection without a valid netdev on RoCE" -- without
>>> it, RoCE isn't working.
>
>> I have that already. It's available on both github and k.o and just
>> waiting for a pull request.
>
> Maybe wait to get the fixes for the non-default pkey on mlx5 (see more below)?
>
> Did you actually note that before Haggai posted the patch?!
No.
> once I realized how deep was the breakage, I became sort of very
> worried re your testing env not shouting hard on us this something is
> broken even before 4.3-rc1
My test environment has been down for upgrades. In the last little bit
we've brought a second rack online, added 10 new machines, 3 new
switches, and moved existing machines around between the two racks in
order to more evenly balance the need for each port type across switches
in the two racks. There's been more than that going on behind the
scenes here too, but it's not really worth getting into all of it.
Suffice it to say I've been working on A) expanding the cluster, B)
expanding the things the cluster is configured to do and therefore able
to test, and C) finding a way to get upstream code into this testing
framework since it was previously all rhel/fedora centric.
And this test infrastructure goes down by COB Thursday of this week and
won't be back for a week because it's being used for NFSoRDMA testing at
this fall's Bake-a-thon.
>>> --> mlx5 devices and no-default IB pkeys, Haggai and Co are
>>> working on a fix since this isn't working since 4.3-rc1. I told them
>>> we need it till rc5.5 (i.e few days before rc6 and if not, will have
>>> to revert some 4.3-rc1 bits.
>
>> I already have on patch related to this in my repo as well. The 0day
>> testing just came back and it's all good.
>
> I suspect that you don't...
I meant build tests passed, not run tests.
> do you have rping up and running between
> mlx4 and mlx5 on non default pkey? the breakage is a bit tricky and
> you might not see it if you run mlx5 against mlx5, BTW which patch is
> that?
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
GPG KeyID: 0E572FDD
* Re: [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4
[not found] ` <56153A0C.4040006-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-10-07 15:36 ` Or Gerlitz
0 siblings, 0 replies; 22+ messages in thread
From: Or Gerlitz @ 2015-10-07 15:36 UTC (permalink / raw)
To: Doug Ledford, Or Gerlitz, Sagi Grimberg
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Christoph Lameter, Matan Barak, Haggai Eran, Eli Cohen
On 10/7/2015 6:28 PM, Doug Ledford wrote:
> However, the machine is still crashing the iDRAC on reboot. I can't be
> certain if it's the SRP target or iSER target causing this as they both
> were brought up live at the same time and reboot cycles without either
> of these work fine. So I have more investigation to go before I know
> exactly what's going on. And as I pointed out, each iteration is slow :-/
You may want to switch to the lightweight user-space iSER target
(TGT); it's provided with Fedora
and also with RHEL7 through the unsupported add-ons (I forgot what you call them).
And anyway, Sagi offered to help, so please make use of his brainpower.
Or.
* Re: [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4
[not found] ` <CAKHjkjnk9sFfr3KhNGBD_LzXUbDdb7_=vF2A3Z5+ousFShfxJw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-10-14 15:34 ` Christoph Lameter
0 siblings, 0 replies; 22+ messages in thread
From: Christoph Lameter @ 2015-10-14 15:34 UTC (permalink / raw)
To: eran ben elisha
Cc: Eran Ben Elisha, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
Or Gerlitz
On Wed, 26 Aug 2015, eran ben elisha wrote:
> > Do you have this in a git tree somewhere for testing?
>
> Yes,
> please pull from https://github.com/eranbenelisha/linux branch
> rebased-for-4.3/lb_prev
>
> If you wish to get the user space as well for testing, please use:
> https://github.com/eranbenelisha/libibverbs branch for-linux-rdma-lb_prev
> https://github.com/eranbenelisha/libmlx4 branch for-linux-rdma-lb_prev
This needs to be rebased. It cannot be pulled on top of the current Linus tree.
end of thread (last message: 2015-10-14 15:34 UTC)
Thread overview: 22+ messages
2015-08-20 14:34 [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4 Eran Ben Elisha
[not found] ` <1440081275-15864-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-08-20 14:34 ` [PATCH v1 for-next 1/7] IB/core: Extend ib_uverbs_create_qp Eran Ben Elisha
2015-08-20 14:34 ` [PATCH v1 for-next 2/7] IB/core: Allow setting create flags in QP init attribute Eran Ben Elisha
2015-08-20 14:34 ` [PATCH v1 for-next 3/7] net/mlx4_core: Add support for filtering multicast loopback Eran Ben Elisha
2015-08-20 14:34 ` [PATCH v1 for-next 4/7] net/mlx4_en: Implement mcast loopback prevention for ETH qps Eran Ben Elisha
2015-08-20 14:34 ` [PATCH v1 for-next 5/7] IB/mlx4: Add IB counters table Eran Ben Elisha
2015-08-20 14:34 ` [PATCH v1 for-next 6/7] IB/mlx4: Add counter based implementation for QP multicast loopback block Eran Ben Elisha
2015-08-20 14:34 ` [PATCH v1 for-next 7/7] IB/mlx4: Add support for blocking multicast loopback QP creation user flag Eran Ben Elisha
2015-08-25 16:44 ` [PATCH v1 for-next 0/7] Add support for multicast loopback prevention to mlx4 Christoph Lameter
[not found] ` <alpine.DEB.2.11.1508251143430.16801-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org>
2015-08-26 9:12 ` eran ben elisha
[not found] ` <CAKHjkjnk9sFfr3KhNGBD_LzXUbDdb7_=vF2A3Z5+ousFShfxJw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-14 15:34 ` Christoph Lameter
2015-09-21 14:24 ` eran ben elisha
[not found] ` <CAKHjkjnQNqeLu+irEg97a1hcKp_ziXsDOFaTRwcaPpZAX_fMng-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-09-29 18:24 ` Doug Ledford
[not found] ` <560AD75A.7080700-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-10-05 21:59 ` Or Gerlitz
[not found] ` <CAJ3xEMiuPEJZ5EQueC=8wQyok2QbEaywQ3gRYVpqOwSaFQcj0A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-06 16:05 ` Doug Ledford
[not found] ` <5613F145.3040204-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-10-06 16:54 ` Sagi Grimberg
[not found] ` <5613FCB1.4050605-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-10-06 17:41 ` Doug Ledford
2015-10-06 20:54 ` Or Gerlitz
[not found] ` <CAJ3xEMhsr5uRLZ=Ur56TOpZy5XT3UbK=M1kcgjt=zx_x3HNcSQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-06 21:26 ` Doug Ledford
[not found] ` <56143C8E.9090407-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-10-06 21:49 ` Or Gerlitz
[not found] ` <CAJ3xEMiNk2ZXgg_Ji5dg+ahL_D5_RAvEzEEhMsPDjC-PJYDu7Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-07 15:28 ` Doug Ledford
[not found] ` <56153A0C.4040006-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-10-07 15:36 ` Or Gerlitz