* [PATCH rdma-next 0/5] Dropples RQ
@ 2017-05-30 7:29 Leon Romanovsky
[not found] ` <20170530072914.9709-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
0 siblings, 1 reply; 7+ messages in thread
From: Leon Romanovsky @ 2017-05-30 7:29 UTC (permalink / raw)
To: Doug Ledford; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Hi Doug,
>From Maor:
---
This series adding support to avoid RAW ethernet packet drops.
When packets received and there are no receive WQEs, then
the packets are dropped. In certain scenarios it may be preferred
to configure the device to avoid such packet drops, assuming
the posting of WQEs will resume shortly.
Under such configuration, when a packet is received for a
WQ with no receive WQEs, the packet processing is delayed
until receive WQEs are posted.
If receive WQEs are still not posted after a device configured
time, the packet is dropped.
To enable this delay drop feature on a WQ, the user should
set the IB_WQ_FLAGS_DELAY_DROP flag on creation.
---
Available in the "topic/dropless-rq" topic branch of this git repo:
git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git
Or for browsing:
https://git.kernel.org/cgit/linux/kernel/git/leon/linux-rdma.git/log/?h=topic/dropless-rq
Possible merge conflicts resolution is available in:
https://git.kernel.org/cgit/linux/kernel/git/leon/linux-rdma.git/log/?h=testing/queue-next
Thanks
Maor Gottlieb (5):
IB/core: Introduce delay drop for a WQ
net/mlx5: Introduce set delay drop command
net/mlx5: Introduce general notification event
IB/mlx5: Add support to dropless RQ
IB/mlx5: Add delay drop configuration and statistics
drivers/infiniband/hw/mlx5/main.c | 163 ++++++++++++++++++++++++++-
drivers/infiniband/hw/mlx5/mlx5_ib.h | 29 +++++
drivers/infiniband/hw/mlx5/qp.c | 50 +++++++-
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 23 ++++
drivers/net/ethernet/mellanox/mlx5/core/qp.c | 14 +++
include/linux/mlx5/device.h | 5 +
include/linux/mlx5/driver.h | 1 +
include/linux/mlx5/mlx5_ifc.h | 30 ++++-
include/linux/mlx5/qp.h | 3 +
include/rdma/ib_verbs.h | 5 +
10 files changed, 313 insertions(+), 10 deletions(-)
--
2.12.2
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH rdma-next 1/5] IB/core: Introduce delay drop for a WQ
[not found] ` <20170530072914.9709-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2017-05-30 7:29 ` Leon Romanovsky
2017-05-30 7:29 ` [PATCH rdma-next 2/5] net/mlx5: Introduce set delay drop command Leon Romanovsky
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2017-05-30 7:29 UTC (permalink / raw)
To: Doug Ledford; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Maor Gottlieb
From: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Work queue which is created with IB_WQ_FLAGS_DELAY_DROP won't
cause packet drops when there aren't receive WQEs, but will wait until
posting of receive WQEs or for some period of time that the device
was configured with.
It includes:
* Add a new creation flag to enable delay drop functionality in a WQ.
* A new capability was introduced - IB_RAW_PACKET_CAP_DELAY_DROP, which
is the device's ability to delay packet drops when there aren't receive
WQEs.
Signed-off-by: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
include/rdma/ib_verbs.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index ba8314ec5768..7574c52dc355 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1546,6 +1546,10 @@ enum ib_raw_packet_caps {
IB_RAW_PACKET_CAP_SCATTER_FCS = (1 << 1),
/* Checksum offloads are supported (for both send and receive). */
IB_RAW_PACKET_CAP_IP_CSUM = (1 << 2),
+ /* When a packet is received for an RQ with no receive WQEs, the
+ * packet processing is delayed.
+ */
+ IB_RAW_PACKET_CAP_DELAY_DROP = (1 << 3),
};
enum ib_wq_type {
@@ -1574,6 +1578,7 @@ struct ib_wq {
enum ib_wq_flags {
IB_WQ_FLAGS_CVLAN_STRIPPING = 1 << 0,
IB_WQ_FLAGS_SCATTER_FCS = 1 << 1,
+ IB_WQ_FLAGS_DELAY_DROP = 1 << 2,
};
struct ib_wq_init_attr {
--
2.12.2
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH rdma-next 2/5] net/mlx5: Introduce set delay drop command
[not found] ` <20170530072914.9709-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2017-05-30 7:29 ` [PATCH rdma-next 1/5] IB/core: Introduce delay drop for a WQ Leon Romanovsky
@ 2017-05-30 7:29 ` Leon Romanovsky
2017-05-30 7:29 ` [PATCH rdma-next 3/5] net/mlx5: Introduce general notification event Leon Romanovsky
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2017-05-30 7:29 UTC (permalink / raw)
To: Doug Ledford; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Maor Gottlieb
From: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Add support to SET_DELAY_DROP command.
This command will be used in downstream patches for delay packet drop.
The timeout value should be indicated by delay_drop_timeout field.
Packet processing will be delayed till timeout value passed or until
more WQEs are posted.
Setting this value to 0 disables the feature.
Signed-off-by: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
drivers/net/ethernet/mellanox/mlx5/core/qp.c | 14 ++++++++++++++
include/linux/mlx5/mlx5_ifc.h | 25 ++++++++++++++++++++++++-
include/linux/mlx5/qp.h | 3 +++
3 files changed, 41 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/qp.c b/drivers/net/ethernet/mellanox/mlx5/core/qp.c
index 573a6b27fed8..f9fcfec0e198 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/qp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/qp.c
@@ -243,6 +243,20 @@ int mlx5_core_destroy_qp(struct mlx5_core_dev *dev,
}
EXPORT_SYMBOL_GPL(mlx5_core_destroy_qp);
+int mlx5_core_set_delay_drop(struct mlx5_core_dev *dev,
+ u32 timeout_usec)
+{
+ u32 out[MLX5_ST_SZ_DW(set_delay_drop_params_out)] = {0};
+ u32 in[MLX5_ST_SZ_DW(set_delay_drop_params_in)] = {0};
+
+ MLX5_SET(set_delay_drop_params_in, in, opcode,
+ MLX5_CMD_OP_SET_DELAY_DROP_PARAMS);
+ MLX5_SET(set_delay_drop_params_in, in, delay_drop_timeout,
+ timeout_usec / 100);
+ return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
+}
+EXPORT_SYMBOL_GPL(mlx5_core_set_delay_drop);
+
struct mbox_info {
u32 *in;
u32 *out;
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 6fa1eb6766af..5b9f836e66ca 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -200,6 +200,7 @@ enum {
MLX5_CMD_OP_QUERY_SQ = 0x907,
MLX5_CMD_OP_CREATE_RQ = 0x908,
MLX5_CMD_OP_MODIFY_RQ = 0x909,
+ MLX5_CMD_OP_SET_DELAY_DROP_PARAMS = 0x910,
MLX5_CMD_OP_DESTROY_RQ = 0x90a,
MLX5_CMD_OP_QUERY_RQ = 0x90b,
MLX5_CMD_OP_CREATE_RMP = 0x90c,
@@ -824,7 +825,7 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 retransmission_q_counters[0x1];
u8 reserved_at_183[0x1];
u8 modify_rq_counter_set_id[0x1];
- u8 reserved_at_185[0x1];
+ u8 rq_delay_drop[0x1];
u8 max_qp_cnt[0xa];
u8 pkey_table_size[0x10];
@@ -5820,6 +5821,28 @@ struct mlx5_ifc_destroy_rq_in_bits {
u8 reserved_at_60[0x20];
};
+struct mlx5_ifc_set_delay_drop_params_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_at_10[0x10];
+
+ u8 reserved_at_20[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_at_40[0x20];
+
+ u8 reserved_at_60[0x10];
+ u8 delay_drop_timeout[0x10];
+};
+
+struct mlx5_ifc_set_delay_drop_params_out_bits {
+ u8 status[0x8];
+ u8 reserved_at_8[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_at_40[0x40];
+};
+
struct mlx5_ifc_destroy_rmp_out_bits {
u8 status[0x8];
u8 reserved_at_8[0x18];
diff --git a/include/linux/mlx5/qp.h b/include/linux/mlx5/qp.h
index bef80d0a0e30..9aaa4951bc6a 100644
--- a/include/linux/mlx5/qp.h
+++ b/include/linux/mlx5/qp.h
@@ -551,6 +551,9 @@ int mlx5_core_destroy_qp(struct mlx5_core_dev *dev,
int mlx5_core_qp_query(struct mlx5_core_dev *dev, struct mlx5_core_qp *qp,
u32 *out, int outlen);
+int mlx5_core_set_delay_drop(struct mlx5_core_dev *dev,
+ u32 timeout_usec);
+
int mlx5_core_xrcd_alloc(struct mlx5_core_dev *dev, u32 *xrcdn);
int mlx5_core_xrcd_dealloc(struct mlx5_core_dev *dev, u32 xrcdn);
void mlx5_init_qp_table(struct mlx5_core_dev *dev);
--
2.12.2
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH rdma-next 3/5] net/mlx5: Introduce general notification event
[not found] ` <20170530072914.9709-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2017-05-30 7:29 ` [PATCH rdma-next 1/5] IB/core: Introduce delay drop for a WQ Leon Romanovsky
2017-05-30 7:29 ` [PATCH rdma-next 2/5] net/mlx5: Introduce set delay drop command Leon Romanovsky
@ 2017-05-30 7:29 ` Leon Romanovsky
2017-05-30 7:29 ` [PATCH rdma-next 4/5] IB/mlx5: Add support to dropless RQ Leon Romanovsky
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2017-05-30 7:29 UTC (permalink / raw)
To: Doug Ledford; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Maor Gottlieb
From: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
When delay drop timeout is expired, the firmware raises
general notification event of DELAY_DROP_TIMEOUT subtype.
In addition the feature is disable so the driver have to
reactivate the timeout.
Signed-off-by: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 23 +++++++++++++++++++++++
include/linux/mlx5/device.h | 5 +++++
include/linux/mlx5/driver.h | 1 +
include/linux/mlx5/mlx5_ifc.h | 3 ++-
4 files changed, 31 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 01d2cd7e4746..afbd9171acb0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -159,6 +159,8 @@ static const char *eqe_type_str(u8 type)
return "MLX5_EVENT_TYPE_PPS_EVENT";
case MLX5_EVENT_TYPE_FPGA_ERROR:
return "MLX5_EVENT_TYPE_FPGA_ERROR";
+ case MLX5_EVENT_TYPE_GENERAL_EVENT:
+ return "MLX5_EVENT_TYPE_GENERAL_EVENT";
default:
return "Unrecognized event";
}
@@ -376,6 +378,20 @@ int mlx5_core_page_fault_resume(struct mlx5_core_dev *dev, u32 token,
EXPORT_SYMBOL_GPL(mlx5_core_page_fault_resume);
#endif
+static void general_event_handler(struct mlx5_core_dev *dev,
+ struct mlx5_eqe *eqe)
+{
+ switch (eqe->sub_type) {
+ case MLX5_GENERAL_SUBTYPE_DELAY_DROP_TIMEOUT:
+ if (dev->event)
+ dev->event(dev, MLX5_DEV_EVENT_DELAY_DROP_TIMEOUT, 0);
+ break;
+ default:
+ mlx5_core_dbg(dev, "General event with unrecognized subtype: sub_type %d\n",
+ eqe->sub_type);
+ }
+}
+
static irqreturn_t mlx5_eq_int(int irq, void *eq_ptr)
{
struct mlx5_eq *eq = eq_ptr;
@@ -484,6 +500,9 @@ static irqreturn_t mlx5_eq_int(int irq, void *eq_ptr)
mlx5_fpga_event(dev, eqe->type, &eqe->data.raw);
break;
+ case MLX5_EVENT_TYPE_GENERAL_EVENT:
+ general_event_handler(dev, eqe);
+ break;
default:
mlx5_core_warn(dev, "Unhandled event 0x%x on EQ 0x%x\n",
eqe->type, eq->eqn);
@@ -693,6 +712,10 @@ int mlx5_start_eqs(struct mlx5_core_dev *dev)
mlx5_core_is_pf(dev))
async_event_mask |= (1ull << MLX5_EVENT_TYPE_NIC_VPORT_CHANGE);
+ if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH &&
+ MLX5_CAP_GEN(dev, general_notification_event))
+ async_event_mask |= (1ull << MLX5_EVENT_TYPE_GENERAL_EVENT);
+
if (MLX5_CAP_GEN(dev, port_module_event))
async_event_mask |= (1ull << MLX5_EVENT_TYPE_PORT_MODULE_EVENT);
else
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index 786a43843da9..586f9f8a1481 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -290,6 +290,7 @@ enum mlx5_event {
MLX5_EVENT_TYPE_GPIO_EVENT = 0x15,
MLX5_EVENT_TYPE_PORT_MODULE_EVENT = 0x16,
MLX5_EVENT_TYPE_REMOTE_CONFIG = 0x19,
+ MLX5_EVENT_TYPE_GENERAL_EVENT = 0x22,
MLX5_EVENT_TYPE_PPS_EVENT = 0x25,
MLX5_EVENT_TYPE_DB_BF_CONGESTION = 0x1a,
@@ -305,6 +306,10 @@ enum mlx5_event {
};
enum {
+ MLX5_GENERAL_SUBTYPE_DELAY_DROP_TIMEOUT = 0x1,
+};
+
+enum {
MLX5_PORT_CHANGE_SUBTYPE_DOWN = 1,
MLX5_PORT_CHANGE_SUBTYPE_ACTIVE = 4,
MLX5_PORT_CHANGE_SUBTYPE_INITIALIZED = 5,
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 55bb712643cb..e88b710d3cb0 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -180,6 +180,7 @@ enum mlx5_dev_event {
MLX5_DEV_EVENT_GUID_CHANGE,
MLX5_DEV_EVENT_CLIENT_REREG,
MLX5_DEV_EVENT_PPS,
+ MLX5_DEV_EVENT_DELAY_DROP_TIMEOUT,
};
enum mlx5_port_status {
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 5b9f836e66ca..1e107d474802 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -858,7 +858,8 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 max_tc[0x4];
u8 reserved_at_1d0[0x1];
u8 dcbx[0x1];
- u8 reserved_at_1d2[0x3];
+ u8 general_notification_event[0x1];
+ u8 reserved_at_1d3[0x2];
u8 fpga[0x1];
u8 rol_s[0x1];
u8 rol_g[0x1];
--
2.12.2
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH rdma-next 4/5] IB/mlx5: Add support to dropless RQ
[not found] ` <20170530072914.9709-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
` (2 preceding siblings ...)
2017-05-30 7:29 ` [PATCH rdma-next 3/5] net/mlx5: Introduce general notification event Leon Romanovsky
@ 2017-05-30 7:29 ` Leon Romanovsky
2017-05-30 7:29 ` [PATCH rdma-next 5/5] IB/mlx5: Add delay drop configuration and statistics Leon Romanovsky
2017-07-28 18:18 ` [PATCH rdma-next 0/5] Dropples RQ Doug Ledford
5 siblings, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2017-05-30 7:29 UTC (permalink / raw)
To: Doug Ledford; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Maor Gottlieb
From: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
RQs that were configured for "delay drop" will prevent packet drops
when their WQEs are depleted.
Marking an RQ to be drop-less is done by setting delay_drop_en in RQ
context using CREATE_RQ command.
Since this feature is globally activated/deactivated by using the
SET_DELAY_DROP command on all the marked RQs, we activated/deactivated
it according to the number of RQs with 'delay_drop' enabled.
When timeout is expired, then the feature is deactivated. Therefore
the driver handles the delay drop timeout event and reactivate it.
Signed-off-by: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
drivers/infiniband/hw/mlx5/main.c | 60 +++++++++++++++++++++++++++++++++---
drivers/infiniband/hw/mlx5/mlx5_ib.h | 19 ++++++++++++
drivers/infiniband/hw/mlx5/qp.c | 37 ++++++++++++++++++++++
include/linux/mlx5/mlx5_ifc.h | 2 +-
4 files changed, 113 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 42defaa0d6c6..e76891a2a3b9 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -692,6 +692,10 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
props->device_cap_flags |= IB_DEVICE_UD_TSO;
}
+ if (MLX5_CAP_GEN(dev->mdev, rq_delay_drop) &&
+ MLX5_CAP_GEN(dev->mdev, general_notification_event))
+ props->raw_packet_caps |= IB_RAW_PACKET_CAP_DELAY_DROP;
+
if (MLX5_CAP_GEN(dev->mdev, eth_net_offloads) &&
MLX5_CAP_ETH(dev->mdev, scatter_fcs)) {
/* Legacy bit to support old userspace libraries */
@@ -2698,6 +2702,24 @@ static void mlx5_ib_handle_internal_error(struct mlx5_ib_dev *ibdev)
spin_unlock_irqrestore(&ibdev->reset_flow_resource_lock, flags);
}
+static void delay_drop_handler(struct work_struct *work)
+{
+ int err;
+ struct mlx5_ib_delay_drop *delay_drop =
+ container_of(work, struct mlx5_ib_delay_drop,
+ delay_drop_work);
+
+ mutex_lock(&delay_drop->lock);
+ err = mlx5_core_set_delay_drop(delay_drop->dev->mdev,
+ delay_drop->timeout);
+ if (err) {
+ mlx5_ib_warn(delay_drop->dev, "Failed to set delay drop, timeout=%u\n",
+ delay_drop->timeout);
+ delay_drop->activate = false;
+ }
+ mutex_unlock(&delay_drop->lock);
+}
+
static void mlx5_ib_event(struct mlx5_core_dev *dev, void *context,
enum mlx5_dev_event event, unsigned long param)
{
@@ -2750,8 +2772,11 @@ static void mlx5_ib_event(struct mlx5_core_dev *dev, void *context,
ibev.event = IB_EVENT_CLIENT_REREGISTER;
port = (u8)param;
break;
+ case MLX5_DEV_EVENT_DELAY_DROP_TIMEOUT:
+ schedule_work(&ibdev->delay_drop.delay_drop_work);
+ goto out;
default:
- return;
+ goto out;
}
ibev.device = &ibdev->ib_dev;
@@ -2759,7 +2784,7 @@ static void mlx5_ib_event(struct mlx5_core_dev *dev, void *context,
if (port < 1 || port > ibdev->num_ports) {
mlx5_ib_warn(ibdev, "warning: event on port %d\n", port);
- return;
+ goto out;
}
if (ibdev->ib_active)
@@ -2767,6 +2792,9 @@ static void mlx5_ib_event(struct mlx5_core_dev *dev, void *context,
if (fatal)
ibdev->ib_active = false;
+
+out:
+ return;
}
static int set_has_smi_cap(struct mlx5_ib_dev *dev)
@@ -3549,6 +3577,26 @@ static void mlx5_ib_free_rdma_netdev(struct net_device *netdev)
return mlx5_rdma_netdev_free(netdev);
}
+static void cancel_delay_drop(struct mlx5_ib_dev *dev)
+{
+ if (!(dev->ib_dev.attrs.raw_packet_caps & IB_RAW_PACKET_CAP_DELAY_DROP))
+ return;
+
+ cancel_work_sync(&dev->delay_drop.delay_drop_work);
+}
+
+static void init_delay_drop(struct mlx5_ib_dev *dev)
+{
+ if (!(dev->ib_dev.attrs.raw_packet_caps & IB_RAW_PACKET_CAP_DELAY_DROP))
+ return;
+
+ mutex_init(&dev->delay_drop.lock);
+ dev->delay_drop.dev = dev;
+ dev->delay_drop.activate = false;
+ dev->delay_drop.timeout = MLX5_MAX_DELAY_DROP_TIMEOUT_MS * 1000;
+ INIT_WORK(&dev->delay_drop.delay_drop_work, delay_drop_handler);
+}
+
static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
{
struct mlx5_ib_dev *dev;
@@ -3780,18 +3828,21 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
if (err)
goto err_dev;
+ init_delay_drop(dev);
+
for (i = 0; i < ARRAY_SIZE(mlx5_class_attributes); i++) {
err = device_create_file(&dev->ib_dev.dev,
mlx5_class_attributes[i]);
if (err)
- goto err_umrc;
+ goto err_delay_drop;
}
dev->ib_active = true;
return dev;
-err_umrc:
+err_delay_drop:
+ cancel_delay_drop(dev);
destroy_umrc_res(dev);
err_dev:
@@ -3836,6 +3887,7 @@ static void mlx5_ib_remove(struct mlx5_core_dev *mdev, void *context)
struct mlx5_ib_dev *dev = context;
enum rdma_link_layer ll = mlx5_ib_port_link_layer(&dev->ib_dev, 1);
+ cancel_delay_drop(dev);
mlx5_remove_netdev_notifier(dev);
ib_unregister_device(&dev->ib_dev);
mlx5_free_bfreg(dev->mdev, &dev->fp_bfreg);
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 38c877bc45e5..dd44624de56a 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -247,6 +247,10 @@ struct mlx5_ib_wq {
void *qend;
};
+enum mlx5_ib_wq_flags {
+ MLX5_IB_WQ_FLAGS_DELAY_DROP = 0x1,
+};
+
struct mlx5_ib_rwq {
struct ib_wq ibwq;
struct mlx5_core_qp core_qp;
@@ -264,6 +268,7 @@ struct mlx5_ib_rwq {
u32 wqe_count;
u32 wqe_shift;
int wq_sig;
+ u32 create_flags; /* Use enum mlx5_ib_wq_flags */
};
enum {
@@ -618,6 +623,19 @@ struct mlx5_roce {
atomic_t next_port;
};
+enum {
+ MLX5_MAX_DELAY_DROP_TIMEOUT_MS = 100,
+};
+
+struct mlx5_ib_delay_drop {
+ struct mlx5_ib_dev *dev;
+ struct work_struct delay_drop_work;
+ /* serialize setting of delay drop */
+ struct mutex lock;
+ u32 timeout;
+ bool activate;
+};
+
struct mlx5_ib_dev {
struct ib_device ib_dev;
struct mlx5_core_dev *mdev;
@@ -654,6 +672,7 @@ struct mlx5_ib_dev {
struct mlx5_ib_port *port;
struct mlx5_sq_bfreg bfreg;
struct mlx5_sq_bfreg fp_bfreg;
+ struct mlx5_ib_delay_drop delay_drop;
};
static inline struct mlx5_ib_cq *to_mibcq(struct mlx5_core_cq *mcq)
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index d17aad0f54c0..fe519487461e 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -4610,6 +4610,24 @@ static void mlx5_ib_wq_event(struct mlx5_core_qp *core_qp, int type)
}
}
+static int set_delay_drop(struct mlx5_ib_dev *dev)
+{
+ int err = 0;
+
+ mutex_lock(&dev->delay_drop.lock);
+ if (dev->delay_drop.activate)
+ goto out;
+
+ err = mlx5_core_set_delay_drop(dev->mdev, dev->delay_drop.timeout);
+ if (err)
+ goto out;
+
+ dev->delay_drop.activate = true;
+out:
+ mutex_unlock(&dev->delay_drop.lock);
+ return err;
+}
+
static int create_rq(struct mlx5_ib_rwq *rwq, struct ib_pd *pd,
struct ib_wq_init_attr *init_attr)
{
@@ -4664,9 +4682,28 @@ static int create_rq(struct mlx5_ib_rwq *rwq, struct ib_pd *pd,
}
MLX5_SET(rqc, rqc, scatter_fcs, 1);
}
+ if (init_attr->create_flags & IB_WQ_FLAGS_DELAY_DROP) {
+ if (!(dev->ib_dev.attrs.raw_packet_caps &
+ IB_RAW_PACKET_CAP_DELAY_DROP)) {
+ mlx5_ib_dbg(dev, "Delay drop is not supported\n");
+ err = -EOPNOTSUPP;
+ goto out;
+ }
+ MLX5_SET(rqc, rqc, delay_drop_en, 1);
+ }
rq_pas0 = (__be64 *)MLX5_ADDR_OF(wq, wq, pas);
mlx5_ib_populate_pas(dev, rwq->umem, rwq->page_shift, rq_pas0, 0);
err = mlx5_core_create_rq_tracked(dev->mdev, in, inlen, &rwq->core_qp);
+ if (!err && init_attr->create_flags & IB_WQ_FLAGS_DELAY_DROP) {
+ err = set_delay_drop(dev);
+ if (err) {
+ mlx5_ib_warn(dev, "Failed to enable delay drop err=%d\n",
+ err);
+ mlx5_core_destroy_rq_tracked(dev->mdev, &rwq->core_qp);
+ } else {
+ rwq->create_flags |= MLX5_IB_WQ_FLAGS_DELAY_DROP;
+ }
+ }
out:
kvfree(in);
return err;
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 1e107d474802..b0578e46ed6d 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -2498,7 +2498,7 @@ enum {
struct mlx5_ifc_rqc_bits {
u8 rlky[0x1];
- u8 reserved_at_1[0x1];
+ u8 delay_drop_en[0x1];
u8 scatter_fcs[0x1];
u8 vsd[0x1];
u8 mem_rq_type[0x4];
--
2.12.2
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH rdma-next 5/5] IB/mlx5: Add delay drop configuration and statistics
[not found] ` <20170530072914.9709-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
` (3 preceding siblings ...)
2017-05-30 7:29 ` [PATCH rdma-next 4/5] IB/mlx5: Add support to dropless RQ Leon Romanovsky
@ 2017-05-30 7:29 ` Leon Romanovsky
2017-07-28 18:18 ` [PATCH rdma-next 0/5] Dropples RQ Doug Ledford
5 siblings, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2017-05-30 7:29 UTC (permalink / raw)
To: Doug Ledford; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Maor Gottlieb
From: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Add debugfs interface for monitor the number of delay drop timeout
events and the number of existing dropless RQs in the system.
In addition add debugfs interface for configuring the global timeout value
which is used in the SET_DELAY_DROP command.
Signed-off-by: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
drivers/infiniband/hw/mlx5/main.c | 103 +++++++++++++++++++++++++++++++++++
drivers/infiniband/hw/mlx5/mlx5_ib.h | 10 ++++
drivers/infiniband/hw/mlx5/qp.c | 13 ++++-
3 files changed, 123 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index e76891a2a3b9..bf7ae5aa6c82 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -30,6 +30,7 @@
* SOFTWARE.
*/
+#include <linux/debugfs.h>
#include <linux/highmem.h>
#include <linux/module.h>
#include <linux/init.h>
@@ -2709,6 +2710,8 @@ static void delay_drop_handler(struct work_struct *work)
container_of(work, struct mlx5_ib_delay_drop,
delay_drop_work);
+ atomic_inc(&delay_drop->events_cnt);
+
mutex_lock(&delay_drop->lock);
err = mlx5_core_set_delay_drop(delay_drop->dev->mdev,
delay_drop->timeout);
@@ -3577,12 +3580,107 @@ static void mlx5_ib_free_rdma_netdev(struct net_device *netdev)
return mlx5_rdma_netdev_free(netdev);
}
+static void delay_drop_debugfs_cleanup(struct mlx5_ib_dev *dev)
+{
+ if (!dev->delay_drop.dbg)
+ return;
+ debugfs_remove_recursive(dev->delay_drop.dbg->dir_debugfs);
+ kfree(dev->delay_drop.dbg);
+ dev->delay_drop.dbg = NULL;
+}
+
static void cancel_delay_drop(struct mlx5_ib_dev *dev)
{
if (!(dev->ib_dev.attrs.raw_packet_caps & IB_RAW_PACKET_CAP_DELAY_DROP))
return;
cancel_work_sync(&dev->delay_drop.delay_drop_work);
+ delay_drop_debugfs_cleanup(dev);
+}
+
+static ssize_t delay_drop_timeout_read(struct file *filp, char __user *buf,
+ size_t count, loff_t *pos)
+{
+ struct mlx5_ib_delay_drop *delay_drop = filp->private_data;
+ char lbuf[20];
+ int len;
+
+ len = snprintf(lbuf, sizeof(lbuf), "%u\n", delay_drop->timeout);
+ return simple_read_from_buffer(buf, count, pos, lbuf, len);
+}
+
+static ssize_t delay_drop_timeout_write(struct file *filp, const char __user *buf,
+ size_t count, loff_t *pos)
+{
+ struct mlx5_ib_delay_drop *delay_drop = filp->private_data;
+ u32 timeout;
+ u32 var;
+
+ if (kstrtouint_from_user(buf, count, 0, &var))
+ return -EFAULT;
+
+ timeout = min_t(u32, roundup(var, 100), MLX5_MAX_DELAY_DROP_TIMEOUT_MS *
+ 1000);
+ if (timeout != var)
+ mlx5_ib_dbg(delay_drop->dev, "Round delay drop timeout to %u usec\n",
+ timeout);
+
+ delay_drop->timeout = timeout;
+
+ return count;
+}
+
+static const struct file_operations fops_delay_drop_timeout = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .write = delay_drop_timeout_write,
+ .read = delay_drop_timeout_read,
+};
+
+static int delay_drop_debugfs_init(struct mlx5_ib_dev *dev)
+{
+ struct mlx5_ib_dbg_delay_drop *dbg;
+
+ if (!mlx5_debugfs_root)
+ return 0;
+
+ dbg = kzalloc(sizeof(*dbg), GFP_KERNEL);
+ if (!dbg)
+ return -ENOMEM;
+
+ dbg->dir_debugfs =
+ debugfs_create_dir("delay_drop",
+ dev->mdev->priv.dbg_root);
+ if (!dbg->dir_debugfs)
+ return -ENOMEM;
+
+ dbg->events_cnt_debugfs =
+ debugfs_create_atomic_t("num_timeout_events", 0400,
+ dbg->dir_debugfs,
+ &dev->delay_drop.events_cnt);
+ if (!dbg->events_cnt_debugfs)
+ goto out_debugfs;
+
+ dbg->rqs_cnt_debugfs =
+ debugfs_create_atomic_t("num_rqs", 0400,
+ dbg->dir_debugfs,
+ &dev->delay_drop.rqs_cnt);
+ if (!dbg->rqs_cnt_debugfs)
+ goto out_debugfs;
+
+ dbg->timeout_debugfs =
+ debugfs_create_file("timeout", 0600,
+ dbg->dir_debugfs,
+ &dev->delay_drop,
+ &fops_delay_drop_timeout);
+ if (!dbg->timeout_debugfs)
+ goto out_debugfs;
+
+ return 0;
+
+out_debugfs:
+ delay_drop_debugfs_cleanup(dev);
+ return -ENOMEM;
}
static void init_delay_drop(struct mlx5_ib_dev *dev)
@@ -3595,6 +3693,11 @@ static void init_delay_drop(struct mlx5_ib_dev *dev)
dev->delay_drop.activate = false;
dev->delay_drop.timeout = MLX5_MAX_DELAY_DROP_TIMEOUT_MS * 1000;
INIT_WORK(&dev->delay_drop.delay_drop_work, delay_drop_handler);
+ atomic_set(&dev->delay_drop.rqs_cnt, 0);
+ atomic_set(&dev->delay_drop.events_cnt, 0);
+
+ if (delay_drop_debugfs_init(dev))
+ mlx5_ib_warn(dev, "Failed to init delay drop debugfs\n");
}
static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index dd44624de56a..14e4398e7a6a 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -627,6 +627,13 @@ enum {
MLX5_MAX_DELAY_DROP_TIMEOUT_MS = 100,
};
+struct mlx5_ib_dbg_delay_drop {
+ struct dentry *dir_debugfs;
+ struct dentry *rqs_cnt_debugfs;
+ struct dentry *events_cnt_debugfs;
+ struct dentry *timeout_debugfs;
+};
+
struct mlx5_ib_delay_drop {
struct mlx5_ib_dev *dev;
struct work_struct delay_drop_work;
@@ -634,6 +641,9 @@ struct mlx5_ib_delay_drop {
struct mutex lock;
u32 timeout;
bool activate;
+ atomic_t events_cnt;
+ atomic_t rqs_cnt;
+ struct mlx5_ib_dbg_delay_drop *dbg;
};
struct mlx5_ib_dev {
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index fe519487461e..75b87d72df7b 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -675,10 +675,14 @@ static int mlx5_ib_umem_get(struct mlx5_ib_dev *dev,
return err;
}
-static void destroy_user_rq(struct ib_pd *pd, struct mlx5_ib_rwq *rwq)
+static void destroy_user_rq(struct mlx5_ib_dev *dev, struct ib_pd *pd,
+ struct mlx5_ib_rwq *rwq)
{
struct mlx5_ib_ucontext *context;
+ if (rwq->create_flags & MLX5_IB_WQ_FLAGS_DELAY_DROP)
+ atomic_dec(&dev->delay_drop.rqs_cnt);
+
context = to_mucontext(pd->uobject->context);
mlx5_ib_db_unmap_user(context, &rwq->db);
if (rwq->umem)
@@ -4625,6 +4629,9 @@ static int set_delay_drop(struct mlx5_ib_dev *dev)
dev->delay_drop.activate = true;
out:
mutex_unlock(&dev->delay_drop.lock);
+
+ if (!err)
+ atomic_inc(&dev->delay_drop.rqs_cnt);
return err;
}
@@ -4837,7 +4844,7 @@ struct ib_wq *mlx5_ib_create_wq(struct ib_pd *pd,
err_copy:
mlx5_core_destroy_rq_tracked(dev->mdev, &rwq->core_qp);
err_user_rq:
- destroy_user_rq(pd, rwq);
+ destroy_user_rq(dev, pd, rwq);
err:
kfree(rwq);
return ERR_PTR(err);
@@ -4849,7 +4856,7 @@ int mlx5_ib_destroy_wq(struct ib_wq *wq)
struct mlx5_ib_rwq *rwq = to_mrwq(wq);
mlx5_core_destroy_rq_tracked(dev->mdev, &rwq->core_qp);
- destroy_user_rq(wq->pd, rwq);
+ destroy_user_rq(dev, wq->pd, rwq);
kfree(rwq);
return 0;
--
2.12.2
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH rdma-next 0/5] Dropples RQ
[not found] ` <20170530072914.9709-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
` (4 preceding siblings ...)
2017-05-30 7:29 ` [PATCH rdma-next 5/5] IB/mlx5: Add delay drop configuration and statistics Leon Romanovsky
@ 2017-07-28 18:18 ` Doug Ledford
5 siblings, 0 replies; 7+ messages in thread
From: Doug Ledford @ 2017-07-28 18:18 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Tue, 2017-05-30 at 10:29 +0300, Leon Romanovsky wrote:
> Hi Doug,
>
> From Maor:
> ---
> This series adding support to avoid RAW ethernet packet drops.
>
> When packets received and there are no receive WQEs, then
> the packets are dropped. In certain scenarios it may be preferred
> to configure the device to avoid such packet drops, assuming
> the posting of WQEs will resume shortly.
>
> Under such configuration, when a packet is received for a
> WQ with no receive WQEs, the packet processing is delayed
> until receive WQEs are posted.
>
> If receive WQEs are still not posted after a device configured
> time, the packet is dropped.
>
> To enable this delay drop feature on a WQ, the user should
> set the IB_WQ_FLAGS_DELAY_DROP flag on creation.
The series changes core code API, but only on the raw ethernet QP type,
which is essentially a Mellanox only thing, so I have no issues with
the API change as it is. Series applied.
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
GPG KeyID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2017-07-28 18:18 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-30 7:29 [PATCH rdma-next 0/5] Dropples RQ Leon Romanovsky
[not found] ` <20170530072914.9709-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2017-05-30 7:29 ` [PATCH rdma-next 1/5] IB/core: Introduce delay drop for a WQ Leon Romanovsky
2017-05-30 7:29 ` [PATCH rdma-next 2/5] net/mlx5: Introduce set delay drop command Leon Romanovsky
2017-05-30 7:29 ` [PATCH rdma-next 3/5] net/mlx5: Introduce general notification event Leon Romanovsky
2017-05-30 7:29 ` [PATCH rdma-next 4/5] IB/mlx5: Add support to dropless RQ Leon Romanovsky
2017-05-30 7:29 ` [PATCH rdma-next 5/5] IB/mlx5: Add delay drop configuration and statistics Leon Romanovsky
2017-07-28 18:18 ` [PATCH rdma-next 0/5] Dropples RQ Doug Ledford
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).