netdev.vger.kernel.org archive mirror
* [PATCH net-next 0/3] Add QCN support to the DCB NL layer
@ 2015-03-04 12:51 Or Gerlitz
  2015-03-04 12:51 ` [PATCH net-next 1/3] net/dcb: Add IEEE QCN attribute Or Gerlitz
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Or Gerlitz @ 2015-03-04 12:51 UTC (permalink / raw)
  To: David S. Miller, John Fastabend; +Cc: netdev, Amir Vadai, Tal Alon, Or Gerlitz

Hi Dave, John, all

This series from Shani Michaeli adds support for the IEEE QCN attribute
to the kernel DCB NL stack, along with an implementation in the mlx4 driver
that programs the firmware according to the admin directives.

Or.

Shani Michaeli (3):
  net/dcb: Add IEEE QCN attribute
  net/mlx4_core: Add basic elements for QCN
  net/mlx4_en: Add QCN parameters and statistics handling

 drivers/net/ethernet/mellanox/mlx4/cmd.c       |    9 +
 drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c |  218 ++++++++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx4/fw.c        |   13 ++-
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |    1 +
 include/linux/mlx4/cmd.h                       |   13 ++
 include/linux/mlx4/device.h                    |    3 +-
 include/net/dcbnl.h                            |    3 +
 include/uapi/linux/dcbnl.h                     |   66 +++++++
 net/dcb/dcbnl.c                                |   41 ++++-
 9 files changed, 361 insertions(+), 6 deletions(-)

* [PATCH net-next 1/3] net/dcb: Add IEEE QCN attribute
  2015-03-04 12:51 [PATCH net-next 0/3] Add QCN support to the DCB NL layer Or Gerlitz
@ 2015-03-04 12:51 ` Or Gerlitz
  2015-03-04 17:19   ` John Fastabend
  2015-03-04 12:51 ` [PATCH net-next 2/3] net/mlx4_core: Add basic elements for QCN Or Gerlitz
  2015-03-04 12:51 ` [PATCH net-next 3/3] net/mlx4_en: Add QCN parameters and statistics handling Or Gerlitz
  2 siblings, 1 reply; 8+ messages in thread
From: Or Gerlitz @ 2015-03-04 12:51 UTC (permalink / raw)
  To: David S. Miller, John Fastabend
  Cc: netdev, Amir Vadai, Tal Alon, Shani Michaeli, Shachar Raindel,
	Or Gerlitz

From: Shani Michaeli <shanim@mellanox.com>

Add the optional QCN attribute, as specified in the IEEE 802.1Qau spec,
to the DCB netlink layer. To allow applications to use the new attribute,
NIC drivers should implement and register the callbacks ieee_getqcn,
ieee_setqcn and ieee_getqcnstats.

The QCN attribute holds a set of parameters for management, and
a set of statistics to provide informative data on Congestion-Control
defined by this spec.

Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 include/net/dcbnl.h        |    3 ++
 include/uapi/linux/dcbnl.h |   66 ++++++++++++++++++++++++++++++++++++++++++++
 net/dcb/dcbnl.c            |   41 +++++++++++++++++++++++++--
 3 files changed, 107 insertions(+), 3 deletions(-)

diff --git a/include/net/dcbnl.h b/include/net/dcbnl.h
index 597b88a..207d9ba 100644
--- a/include/net/dcbnl.h
+++ b/include/net/dcbnl.h
@@ -49,6 +49,9 @@ struct dcbnl_rtnl_ops {
 	int (*ieee_setets) (struct net_device *, struct ieee_ets *);
 	int (*ieee_getmaxrate) (struct net_device *, struct ieee_maxrate *);
 	int (*ieee_setmaxrate) (struct net_device *, struct ieee_maxrate *);
+	int (*ieee_getqcn) (struct net_device *, struct ieee_qcn *);
+	int (*ieee_setqcn) (struct net_device *, struct ieee_qcn *);
+	int (*ieee_getqcnstats) (struct net_device *, struct ieee_qcn_stats *);
 	int (*ieee_getpfc) (struct net_device *, struct ieee_pfc *);
 	int (*ieee_setpfc) (struct net_device *, struct ieee_pfc *);
 	int (*ieee_getapp) (struct net_device *, struct dcb_app *);
diff --git a/include/uapi/linux/dcbnl.h b/include/uapi/linux/dcbnl.h
index e711f20..6497d79 100644
--- a/include/uapi/linux/dcbnl.h
+++ b/include/uapi/linux/dcbnl.h
@@ -78,6 +78,70 @@ struct ieee_maxrate {
 	__u64	tc_maxrate[IEEE_8021QAZ_MAX_TCS];
 };
 
+enum dcbnl_cndd_states {
+	DCB_CNDD_RESET = 0,
+	DCB_CNDD_EDGE,
+	DCB_CNDD_INTERIOR,
+	DCB_CNDD_INTERIOR_READY,
+};
+
+/* This structure contains the IEEE 802.1Qau QCN managed object.
+ *
+ *@rpg_enable: enable QCN RP
+ *@rppp_max_rps: maximum number of RPs allowed for this CNPV on this port
+ *@rpg_time_reset: time between rate increases if no CNMs received.
+ *		   given in u-seconds
+ *@rpg_byte_reset: transmitted data between rate increases if no CNMs received.
+ *		   given in Bytes
+ *@rpg_threshold: The number of times rpByteStage or rpTimeStage can count
+ *		   before RP rate control state machine advances states
+ *@rpg_max_rate: the maximum rate, in Mbits per second,
+ *		 at which an RP can transmit
+ *@rpg_ai_rate: The rate, in Mbits per second,
+ *		used to increase rpTargetRate in the RPR_ACTIVE_INCREASE
+ *@rpg_hai_rate: The rate, in Mbits per second,
+ *		 used to increase rpTargetRate in the RPR_HYPER_INCREASE state
+ *@rpg_gd: Upon CNM receive, flow rate is limited to (Fb/Gd)*CurrentRate.
+ *	   rpgGd is given as log2(Gd), where Gd may only be powers of 2
+ *@rpg_min_dec_fac: The minimum factor by which the current transmit rate
+ *		    can be changed by reception of a CNM.
+ *		    value is given as percentage (1-100)
+ *@rpg_min_rate: The minimum value, in bits per second, for rate to limit
+ *@cndd_state_machine: The state of the congestion notification domain
+ *		       defense state machine, as defined by IEEE 802.1Qau
+ *		       section 32.1.1. In the interior ready state,
+ *		       the QCN capable hardware may add CN-TAG TLV to the
+ *		       outgoing traffic, to specifically identify outgoing
+ *		       flows.
+ */
+
+struct ieee_qcn {
+	__u8 rpg_enable[IEEE_8021QAZ_MAX_TCS];
+	__u32 rppp_max_rps[IEEE_8021QAZ_MAX_TCS];
+	__u32 rpg_time_reset[IEEE_8021QAZ_MAX_TCS];
+	__u32 rpg_byte_reset[IEEE_8021QAZ_MAX_TCS];
+	__u32 rpg_threshold[IEEE_8021QAZ_MAX_TCS];
+	__u32 rpg_max_rate[IEEE_8021QAZ_MAX_TCS];
+	__u32 rpg_ai_rate[IEEE_8021QAZ_MAX_TCS];
+	__u32 rpg_hai_rate[IEEE_8021QAZ_MAX_TCS];
+	__u32 rpg_gd[IEEE_8021QAZ_MAX_TCS];
+	__u32 rpg_min_dec_fac[IEEE_8021QAZ_MAX_TCS];
+	__u32 rpg_min_rate[IEEE_8021QAZ_MAX_TCS];
+	__u32 cndd_state_machine[IEEE_8021QAZ_MAX_TCS];
+};
+
+/* This structure contains the IEEE 802.1Qau QCN statistics.
+ *
+ *@rppp_rp_centiseconds: the number of RP-centiseconds accumulated
+ *			 by RPs at this priority level on this Port
+ *@rppp_created_rps: number of active RPs(flows) that react to CNMs
+ */
+
+struct ieee_qcn_stats {
+	__u64 rppp_rp_centiseconds[IEEE_8021QAZ_MAX_TCS];
+	__u32 rppp_created_rps[IEEE_8021QAZ_MAX_TCS];
+};
+
 /* This structure contains the IEEE 802.1Qaz PFC managed object
  *
  * @pfc_cap: Indicates the number of traffic classes on the local device
@@ -334,6 +398,8 @@ enum ieee_attrs {
 	DCB_ATTR_IEEE_PEER_PFC,
 	DCB_ATTR_IEEE_PEER_APP,
 	DCB_ATTR_IEEE_MAXRATE,
+	DCB_ATTR_IEEE_QCN,
+	DCB_ATTR_IEEE_QCN_STATS,
 	__DCB_ATTR_IEEE_MAX
 };
 #define DCB_ATTR_IEEE_MAX (__DCB_ATTR_IEEE_MAX - 1)
diff --git a/net/dcb/dcbnl.c b/net/dcb/dcbnl.c
index 93ea801..6fce458 100644
--- a/net/dcb/dcbnl.c
+++ b/net/dcb/dcbnl.c
@@ -177,6 +177,8 @@ static const struct nla_policy dcbnl_ieee_policy[DCB_ATTR_IEEE_MAX + 1] = {
 	[DCB_ATTR_IEEE_PFC]	    = {.len = sizeof(struct ieee_pfc)},
 	[DCB_ATTR_IEEE_APP_TABLE]   = {.type = NLA_NESTED},
 	[DCB_ATTR_IEEE_MAXRATE]   = {.len = sizeof(struct ieee_maxrate)},
+	[DCB_ATTR_IEEE_QCN]         = {.len = sizeof(struct ieee_qcn)},
+	[DCB_ATTR_IEEE_QCN_STATS]   = {.len = sizeof(struct ieee_qcn_stats)},
 };
 
 static const struct nla_policy dcbnl_ieee_app[DCB_ATTR_IEEE_APP_MAX + 1] = {
@@ -1030,7 +1032,7 @@ nla_put_failure:
 	return err;
 }
 
-/* Handle IEEE 802.1Qaz GET commands. */
+/* Handle IEEE 802.1Qaz/802.1Qau/802.1Qbb GET commands. */
 static int dcbnl_ieee_fill(struct sk_buff *skb, struct net_device *netdev)
 {
 	struct nlattr *ieee, *app;
@@ -1067,6 +1069,30 @@ static int dcbnl_ieee_fill(struct sk_buff *skb, struct net_device *netdev)
 		}
 	}
 
+	if (ops->ieee_getqcn) {
+		struct ieee_qcn qcn;
+		memset(&qcn, 0, sizeof(qcn));
+		err = ops->ieee_getqcn(netdev, &qcn);
+		if (!err) {
+			err = nla_put(skb, DCB_ATTR_IEEE_QCN,
+				      sizeof(qcn), &qcn);
+			if (err)
+				return -EMSGSIZE;
+		}
+	}
+
+	if (ops->ieee_getqcnstats) {
+		struct ieee_qcn_stats qcn_stats;
+		memset(&qcn_stats, 0, sizeof(qcn_stats));
+		err = ops->ieee_getqcnstats(netdev, &qcn_stats);
+		if (!err) {
+			err = nla_put(skb, DCB_ATTR_IEEE_QCN_STATS,
+				      sizeof(qcn_stats), &qcn_stats);
+			if (err)
+				return -EMSGSIZE;
+		}
+	}
+
 	if (ops->ieee_getpfc) {
 		struct ieee_pfc pfc;
 		memset(&pfc, 0, sizeof(pfc));
@@ -1379,8 +1405,9 @@ int dcbnl_cee_notify(struct net_device *dev, int event, int cmd,
 }
 EXPORT_SYMBOL(dcbnl_cee_notify);
 
-/* Handle IEEE 802.1Qaz SET commands. If any requested operation can not
- * be completed the entire msg is aborted and error value is returned.
+/* Handle IEEE 802.1Qaz/802.1Qau/802.1Qbb SET commands.
+ * If any requested operation can not be completed
+ * the entire msg is aborted and error value is returned.
  * No attempt is made to reconcile the case where only part of the
  * cmd can be completed.
  */
@@ -1417,6 +1444,14 @@ static int dcbnl_ieee_set(struct net_device *netdev, struct nlmsghdr *nlh,
 			goto err;
 	}
 
+	if (ieee[DCB_ATTR_IEEE_QCN] && ops->ieee_setqcn) {
+		struct ieee_qcn *qcn =
+			nla_data(ieee[DCB_ATTR_IEEE_QCN]);
+		err = ops->ieee_setqcn(netdev, qcn);
+		if (err)
+			goto err;
+	}
+
 	if (ieee[DCB_ATTR_IEEE_PFC] && ops->ieee_setpfc) {
 		struct ieee_pfc *pfc = nla_data(ieee[DCB_ATTR_IEEE_PFC]);
 		err = ops->ieee_setpfc(netdev, pfc);
-- 
1.7.1

* [PATCH net-next 2/3] net/mlx4_core: Add basic elements for QCN
  2015-03-04 12:51 [PATCH net-next 0/3] Add QCN support to the DCB NL layer Or Gerlitz
  2015-03-04 12:51 ` [PATCH net-next 1/3] net/dcb: Add IEEE QCN attribute Or Gerlitz
@ 2015-03-04 12:51 ` Or Gerlitz
  2015-03-04 12:51 ` [PATCH net-next 3/3] net/mlx4_en: Add QCN parameters and statistics handling Or Gerlitz
  2 siblings, 0 replies; 8+ messages in thread
From: Or Gerlitz @ 2015-03-04 12:51 UTC (permalink / raw)
  To: David S. Miller, John Fastabend
  Cc: netdev, Amir Vadai, Tal Alon, Shani Michaeli, Or Gerlitz

From: Shani Michaeli <shanim@mellanox.com>

Add the device capability, firmware command opcode and other preliminary
elements needed for QCN support. QCN view/access is disabled for SRIOV VFs.

While here, remove a redundant offset definition into the
QUERY_DEV_CAP mailbox.

Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/cmd.c |    9 +++++++++
 drivers/net/ethernet/mellanox/mlx4/fw.c  |   13 +++++++++++--
 include/linux/mlx4/cmd.h                 |   13 +++++++++++++
 include/linux/mlx4/device.h              |    3 ++-
 4 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index a681d7c..20b3c7b 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -1499,6 +1499,15 @@ static struct mlx4_cmd_info cmd_info[] = {
 		.verify = NULL,
 		.wrapper = mlx4_ACCESS_REG_wrapper,
 	},
+	{
+		.opcode = MLX4_CMD_CONGESTION_CTRL_OPCODE,
+		.has_inbox = false,
+		.has_outbox = false,
+		.out_is_imm = false,
+		.encode_slave_id = false,
+		.verify = NULL,
+		.wrapper = mlx4_CMD_EPERM_wrapper,
+	},
 	/* Native multicast commands are not available for guests */
 	{
 		.opcode = MLX4_CMD_QP_ATTACH,
diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index 5a21e5d..242bcee 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -143,7 +143,8 @@ static void dump_dev_cap_flags2(struct mlx4_dev *dev, u64 flags)
 		[18] = "More than 80 VFs support",
 		[19] = "Performance optimized for limited rule configuration flow steering support",
 		[20] = "Recoverable error events support",
-		[21] = "Port Remap support"
+		[21] = "Port Remap support",
+		[22] = "QCN support"
 	};
 	int i;
 
@@ -675,7 +676,7 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
 #define QUERY_DEV_CAP_FLOW_STEERING_RANGE_EN_OFFSET	0x76
 #define QUERY_DEV_CAP_FLOW_STEERING_MAX_QP_OFFSET	0x77
 #define QUERY_DEV_CAP_CQ_EQ_CACHE_LINE_STRIDE	0x7a
-#define QUERY_DEV_CAP_ETH_PROT_CTRL_OFFSET	0x7a
+#define QUERY_DEV_CAP_ECN_QCN_VER_OFFSET	0x7b
 #define QUERY_DEV_CAP_RDMARC_ENTRY_SZ_OFFSET	0x80
 #define QUERY_DEV_CAP_QPC_ENTRY_SZ_OFFSET	0x82
 #define QUERY_DEV_CAP_AUX_ENTRY_SZ_OFFSET	0x84
@@ -777,6 +778,9 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
 		dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_DMFS_IPOIB;
 	MLX4_GET(field, outbox, QUERY_DEV_CAP_FLOW_STEERING_MAX_QP_OFFSET);
 	dev_cap->fs_max_num_qp_per_entry = field;
+	MLX4_GET(field, outbox, QUERY_DEV_CAP_ECN_QCN_VER_OFFSET);
+	if (field & 0x1)
+		dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_QCN;
 	MLX4_GET(stat_rate, outbox, QUERY_DEV_CAP_RATE_SUPPORT_OFFSET);
 	dev_cap->stat_rate_support = stat_rate;
 	MLX4_GET(field, outbox, QUERY_DEV_CAP_CQ_TS_SUPPORT_OFFSET);
@@ -1149,6 +1153,11 @@ int mlx4_QUERY_DEV_CAP_wrapper(struct mlx4_dev *dev, int slave,
 		     DEV_CAP_EXT_2_FLAG_FSM);
 	MLX4_PUT(outbox->buf, field32, QUERY_DEV_CAP_EXT_2_FLAGS_OFFSET);
 
+	/* turn off QCN for guests */
+	MLX4_GET(field, outbox->buf, QUERY_DEV_CAP_ECN_QCN_VER_OFFSET);
+	field &= 0xfe;
+	MLX4_PUT(outbox->buf, field, QUERY_DEV_CAP_ECN_QCN_VER_OFFSET);
+
 	return 0;
 }
 
diff --git a/include/linux/mlx4/cmd.h b/include/linux/mlx4/cmd.h
index 7b6d4e9..7299e95 100644
--- a/include/linux/mlx4/cmd.h
+++ b/include/linux/mlx4/cmd.h
@@ -163,6 +163,9 @@ enum {
 	MLX4_QP_FLOW_STEERING_ATTACH = 0x65,
 	MLX4_QP_FLOW_STEERING_DETACH = 0x66,
 	MLX4_FLOW_STEERING_IB_UC_QP_RANGE = 0x64,
+
+	/* Update and read QCN parameters */
+	MLX4_CMD_CONGESTION_CTRL_OPCODE = 0x68,
 };
 
 enum {
@@ -233,6 +236,16 @@ struct mlx4_config_dev_params {
 	u8	rx_csum_flags_port_2;
 };
 
+enum mlx4_en_congestion_control_algorithm {
+	MLX4_CTRL_ALGO_802_1_QAU_REACTION_POINT = 0,
+};
+
+enum mlx4_en_congestion_control_opmod {
+	MLX4_CONGESTION_CONTROL_GET_PARAMS,
+	MLX4_CONGESTION_CONTROL_GET_STATISTICS,
+	MLX4_CONGESTION_CONTROL_SET_PARAMS = 4,
+};
+
 struct mlx4_dev;
 
 struct mlx4_cmd_mailbox {
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index e4ebff7..1cc5482 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -203,7 +203,8 @@ enum {
 	MLX4_DEV_CAP_FLAG2_80_VFS		= 1LL <<  18,
 	MLX4_DEV_CAP_FLAG2_FS_A0		= 1LL <<  19,
 	MLX4_DEV_CAP_FLAG2_RECOVERABLE_ERROR_EVENT = 1LL << 20,
-	MLX4_DEV_CAP_FLAG2_PORT_REMAP		= 1LL <<  21
+	MLX4_DEV_CAP_FLAG2_PORT_REMAP		= 1LL <<  21,
+	MLX4_DEV_CAP_FLAG2_QCN			= 1LL <<  22,
 };
 
 enum {
-- 
1.7.1
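
The capability handling in this patch can be sketched in plain userspace C.
This is only an illustration under stated assumptions: the function names
host_parse_qcn_cap and guest_mask_qcn_cap are hypothetical, and the two
constants simply mirror the values visible in the diff above (bit 0 of the
byte read at QUERY_DEV_CAP_ECN_QCN_VER_OFFSET, and bit 22 of flags2).

```c
#include <assert.h>
#include <stdint.h>

#define QCN_VER_BIT       0x1u          /* bit 0 of the ECN/QCN version byte */
#define DEV_CAP_FLAG2_QCN (1ULL << 22)  /* mirrors MLX4_DEV_CAP_FLAG2_QCN */

/* Host side: translate the firmware byte into the flags2 capability word. */
static uint64_t host_parse_qcn_cap(uint8_t field, uint64_t flags2)
{
	if (field & QCN_VER_BIT)
		flags2 |= DEV_CAP_FLAG2_QCN;
	return flags2;
}

/* Guest side: clear bit 0 and leave the other seven bits untouched. */
static uint8_t guest_mask_qcn_cap(uint8_t field)
{
	uint8_t masked = field & 0xfe;

	assert((masked & QCN_VER_BIT) == 0);
	return masked;
}
```

Feeding the masked byte back through the host-side parser never sets the
QCN flag, which is the effect the "turn off QCN for guests" hunk aims for.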

* [PATCH net-next 3/3] net/mlx4_en: Add QCN parameters and statistics handling
  2015-03-04 12:51 [PATCH net-next 0/3] Add QCN support to the DCB NL layer Or Gerlitz
  2015-03-04 12:51 ` [PATCH net-next 1/3] net/dcb: Add IEEE QCN attribute Or Gerlitz
  2015-03-04 12:51 ` [PATCH net-next 2/3] net/mlx4_core: Add basic elements for QCN Or Gerlitz
@ 2015-03-04 12:51 ` Or Gerlitz
  2 siblings, 0 replies; 8+ messages in thread
From: Or Gerlitz @ 2015-03-04 12:51 UTC (permalink / raw)
  To: David S. Miller, John Fastabend
  Cc: netdev, Amir Vadai, Tal Alon, Shani Michaeli, Or Gerlitz

From: Shani Michaeli <shanim@mellanox.com>

Implement the IEEE DCB handlers for set/get QCN parameters and
statistics reading per TC.

Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c |  218 ++++++++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |    1 +
 2 files changed, 219 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
index c95ca25..cde14fa 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
@@ -36,6 +36,49 @@
 
 #include "mlx4_en.h"
 
+/* Definitions for QCN
+ */
+
+struct mlx4_congestion_control_mb_prio_802_1_qau_params {
+	__be32 modify_enable_high;
+	__be32 modify_enable_low;
+	__be32 reserved1;
+	__be32 extended_enable;
+	__be32 rppp_max_rps;
+	__be32 rpg_time_reset;
+	__be32 rpg_byte_reset;
+	__be32 rpg_threshold;
+	__be32 rpg_max_rate;
+	__be32 rpg_ai_rate;
+	__be32 rpg_hai_rate;
+	__be32 rpg_gd;
+	__be32 rpg_min_dec_fac;
+	__be32 rpg_min_rate;
+	__be32 max_time_rise;
+	__be32 max_byte_rise;
+	__be32 max_qdelta;
+	__be32 min_qoffset;
+	__be32 gd_coefficient;
+	__be32 reserved2[5];
+	__be32 cp_sample_base;
+	__be32 reserved3[39];
+};
+
+struct mlx4_congestion_control_mb_prio_802_1_qau_statistics {
+	__be64 rppp_rp_centiseconds;
+	__be32 reserved1;
+	__be32 ignored_cnm;
+	__be32 rppp_created_rps;
+	__be32 estimated_total_rate;
+	__be32 max_active_rate_limiter_index;
+	__be32 dropped_cnms_busy_fw;
+	__be32 reserved2;
+	__be32 cnms_handled_successfully;
+	__be32 min_total_limiters_rate;
+	__be32 max_total_limiters_rate;
+	__be32 reserved3[4];
+};
+
 static int mlx4_en_dcbnl_ieee_getets(struct net_device *dev,
 				   struct ieee_ets *ets)
 {
@@ -242,6 +285,178 @@ static int mlx4_en_dcbnl_ieee_setmaxrate(struct net_device *dev,
 	return 0;
 }
 
+#define RPG_ENABLE_BIT	31
+#define CN_TAG_BIT	30
+
+static int mlx4_en_dcbnl_ieee_getqcn(struct net_device *dev,
+				     struct ieee_qcn *qcn)
+{
+	struct mlx4_en_priv *priv = netdev_priv(dev);
+	struct mlx4_congestion_control_mb_prio_802_1_qau_params *hw_qcn;
+	struct mlx4_cmd_mailbox *mailbox_out = NULL;
+	u64 mailbox_in_dma = 0;
+	u32 inmod = 0;
+	int i, err;
+
+	if (!(priv->mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_QCN))
+		return -EOPNOTSUPP;
+
+	mailbox_out = mlx4_alloc_cmd_mailbox(priv->mdev->dev);
+	if (IS_ERR(mailbox_out))
+		return -ENOMEM;
+	hw_qcn =
+	(struct mlx4_congestion_control_mb_prio_802_1_qau_params *)
+	mailbox_out->buf;
+
+	for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
+		inmod = priv->port | ((1 << i) << 8) |
+			 (MLX4_CTRL_ALGO_802_1_QAU_REACTION_POINT << 16);
+		err = mlx4_cmd_box(priv->mdev->dev, mailbox_in_dma,
+				   mailbox_out->dma,
+				   inmod, MLX4_CONGESTION_CONTROL_GET_PARAMS,
+				   MLX4_CMD_CONGESTION_CTRL_OPCODE,
+				   MLX4_CMD_TIME_CLASS_C,
+				   MLX4_CMD_NATIVE);
+		if (err) {
+			mlx4_free_cmd_mailbox(priv->mdev->dev, mailbox_out);
+			return err;
+		}
+
+		qcn->rpg_enable[i] =
+			be32_to_cpu(hw_qcn->extended_enable) >> RPG_ENABLE_BIT;
+		qcn->rppp_max_rps[i] =
+			be32_to_cpu(hw_qcn->rppp_max_rps);
+		qcn->rpg_time_reset[i] =
+			be32_to_cpu(hw_qcn->rpg_time_reset);
+		qcn->rpg_byte_reset[i] =
+			be32_to_cpu(hw_qcn->rpg_byte_reset);
+		qcn->rpg_threshold[i] =
+			be32_to_cpu(hw_qcn->rpg_threshold);
+		qcn->rpg_max_rate[i] =
+			be32_to_cpu(hw_qcn->rpg_max_rate);
+		qcn->rpg_ai_rate[i] =
+			be32_to_cpu(hw_qcn->rpg_ai_rate);
+		qcn->rpg_hai_rate[i] =
+			be32_to_cpu(hw_qcn->rpg_hai_rate);
+		qcn->rpg_gd[i] =
+			be32_to_cpu(hw_qcn->rpg_gd);
+		qcn->rpg_min_dec_fac[i] =
+			be32_to_cpu(hw_qcn->rpg_min_dec_fac);
+		qcn->rpg_min_rate[i] =
+			be32_to_cpu(hw_qcn->rpg_min_rate);
+		qcn->cndd_state_machine[i] =
+			priv->cndd_state[i];
+	}
+	mlx4_free_cmd_mailbox(priv->mdev->dev, mailbox_out);
+	return 0;
+}
+
+static int mlx4_en_dcbnl_ieee_setqcn(struct net_device *dev,
+				     struct ieee_qcn *qcn)
+{
+	struct mlx4_en_priv *priv = netdev_priv(dev);
+	struct mlx4_congestion_control_mb_prio_802_1_qau_params *hw_qcn;
+	struct mlx4_cmd_mailbox *mailbox_in = NULL;
+	u64 mailbox_in_dma = 0;
+	u32 inmod = 0;
+	int i, err;
+#define MODIFY_ENABLE_HIGH_MASK 0xc0000000
+#define MODIFY_ENABLE_LOW_MASK 0xffc00000
+
+	if (!(priv->mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_QCN))
+		return -EOPNOTSUPP;
+
+	mailbox_in = mlx4_alloc_cmd_mailbox(priv->mdev->dev);
+	if (IS_ERR(mailbox_in))
+		return -ENOMEM;
+
+	mailbox_in_dma = mailbox_in->dma;
+	hw_qcn =
+	(struct mlx4_congestion_control_mb_prio_802_1_qau_params *)mailbox_in->buf;
+	for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
+		inmod = priv->port | ((1 << i) << 8) |
+			 (MLX4_CTRL_ALGO_802_1_QAU_REACTION_POINT << 16);
+
+		/* Before updating QCN parameter,
+		 * need to set its modify enable bit to 1
+		 */
+
+		hw_qcn->modify_enable_high = cpu_to_be32(
+						MODIFY_ENABLE_HIGH_MASK);
+		hw_qcn->modify_enable_low = cpu_to_be32(MODIFY_ENABLE_LOW_MASK);
+
+		hw_qcn->extended_enable = cpu_to_be32(qcn->rpg_enable[i] << RPG_ENABLE_BIT);
+		hw_qcn->rppp_max_rps = cpu_to_be32(qcn->rppp_max_rps[i]);
+		hw_qcn->rpg_time_reset = cpu_to_be32(qcn->rpg_time_reset[i]);
+		hw_qcn->rpg_byte_reset = cpu_to_be32(qcn->rpg_byte_reset[i]);
+		hw_qcn->rpg_threshold = cpu_to_be32(qcn->rpg_threshold[i]);
+		hw_qcn->rpg_max_rate = cpu_to_be32(qcn->rpg_max_rate[i]);
+		hw_qcn->rpg_ai_rate = cpu_to_be32(qcn->rpg_ai_rate[i]);
+		hw_qcn->rpg_hai_rate = cpu_to_be32(qcn->rpg_hai_rate[i]);
+		hw_qcn->rpg_gd = cpu_to_be32(qcn->rpg_gd[i]);
+		hw_qcn->rpg_min_dec_fac = cpu_to_be32(qcn->rpg_min_dec_fac[i]);
+		hw_qcn->rpg_min_rate = cpu_to_be32(qcn->rpg_min_rate[i]);
+		priv->cndd_state[i] = qcn->cndd_state_machine[i];
+		if (qcn->cndd_state_machine[i] == DCB_CNDD_INTERIOR_READY)
+			hw_qcn->extended_enable |= cpu_to_be32(1 << CN_TAG_BIT);
+
+		err = mlx4_cmd(priv->mdev->dev, mailbox_in_dma, inmod,
+			       MLX4_CONGESTION_CONTROL_SET_PARAMS,
+			       MLX4_CMD_CONGESTION_CTRL_OPCODE,
+			       MLX4_CMD_TIME_CLASS_C,
+			       MLX4_CMD_NATIVE);
+		if (err) {
+			mlx4_free_cmd_mailbox(priv->mdev->dev, mailbox_in);
+			return err;
+		}
+	}
+	mlx4_free_cmd_mailbox(priv->mdev->dev, mailbox_in);
+	return 0;
+}
+
+static int mlx4_en_dcbnl_ieee_getqcnstats(struct net_device *dev,
+					  struct ieee_qcn_stats *qcn_stats)
+{
+	struct mlx4_en_priv *priv = netdev_priv(dev);
+	struct mlx4_congestion_control_mb_prio_802_1_qau_statistics *hw_qcn_stats;
+	struct mlx4_cmd_mailbox *mailbox_out = NULL;
+	u64 mailbox_in_dma = 0;
+	u32 inmod = 0;
+	int i, err;
+
+	if (!(priv->mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_QCN))
+		return -EOPNOTSUPP;
+
+	mailbox_out = mlx4_alloc_cmd_mailbox(priv->mdev->dev);
+	if (IS_ERR(mailbox_out))
+		return -ENOMEM;
+
+	hw_qcn_stats =
+	(struct mlx4_congestion_control_mb_prio_802_1_qau_statistics *)
+	mailbox_out->buf;
+
+	for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
+		inmod = priv->port | ((1 << i) << 8) |
+			 (MLX4_CTRL_ALGO_802_1_QAU_REACTION_POINT << 16);
+		err = mlx4_cmd_box(priv->mdev->dev, mailbox_in_dma,
+				   mailbox_out->dma, inmod,
+				   MLX4_CONGESTION_CONTROL_GET_STATISTICS,
+				   MLX4_CMD_CONGESTION_CTRL_OPCODE,
+				   MLX4_CMD_TIME_CLASS_C,
+				   MLX4_CMD_NATIVE);
+		if (err) {
+			mlx4_free_cmd_mailbox(priv->mdev->dev, mailbox_out);
+			return err;
+		}
+		qcn_stats->rppp_rp_centiseconds[i] =
+			be64_to_cpu(hw_qcn_stats->rppp_rp_centiseconds);
+		qcn_stats->rppp_created_rps[i] =
+			be32_to_cpu(hw_qcn_stats->rppp_created_rps);
+	}
+	mlx4_free_cmd_mailbox(priv->mdev->dev, mailbox_out);
+	return 0;
+}
+
 const struct dcbnl_rtnl_ops mlx4_en_dcbnl_ops = {
 	.ieee_getets	= mlx4_en_dcbnl_ieee_getets,
 	.ieee_setets	= mlx4_en_dcbnl_ieee_setets,
@@ -252,6 +467,9 @@ const struct dcbnl_rtnl_ops mlx4_en_dcbnl_ops = {
 
 	.getdcbx	= mlx4_en_dcbnl_getdcbx,
 	.setdcbx	= mlx4_en_dcbnl_setdcbx,
+	.ieee_getqcn	= mlx4_en_dcbnl_ieee_getqcn,
+	.ieee_setqcn	= mlx4_en_dcbnl_ieee_setqcn,
+	.ieee_getqcnstats = mlx4_en_dcbnl_ieee_getqcnstats,
 };
 
 const struct dcbnl_rtnl_ops mlx4_en_dcbnl_pfc_ops = {
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 2a8268e..94553b5 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -608,6 +608,7 @@ struct mlx4_en_priv {
 #ifdef CONFIG_MLX4_EN_DCB
 	struct ieee_ets ets;
 	u16 maxrate[IEEE_8021QAZ_MAX_TCS];
+	enum dcbnl_cndd_states cndd_state[IEEE_8021QAZ_MAX_TCS];
 #endif
 #ifdef CONFIG_RFS_ACCEL
 	spinlock_t filters_lock;
-- 
1.7.1
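
The per-priority command encoding used by the three handlers above can be
sketched as standalone C. The helper names qcn_inmod and qcn_extended_enable
are hypothetical; the shifts and bit positions are taken from the diff. One
deliberate difference, which is a choice of this sketch and not something
the patch does: the enable value is masked to one bit and shifted as an
unsigned quantity, so the shift into bit 31 stays well defined.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TCS        8   /* mirrors IEEE_8021QAZ_MAX_TCS */
#define RPG_ENABLE_BIT 31
#define CN_TAG_BIT     30
#define ALGO_QAU_RP    0u  /* mirrors MLX4_CTRL_ALGO_802_1_QAU_REACTION_POINT */

/* Input modifier: port in the low bits, a one-hot priority mask shifted
 * left by 8, and the algorithm id shifted left by 16.
 */
static uint32_t qcn_inmod(uint8_t port, unsigned int tc)
{
	assert(tc < MAX_TCS);
	return (uint32_t)port | ((1u << tc) << 8) | (ALGO_QAU_RP << 16);
}

/* extended_enable in host byte order: bit 31 enables the RP, bit 30
 * requests CN-TAG insertion (the interior ready state).
 */
static uint32_t qcn_extended_enable(uint8_t rpg_enable, int interior_ready)
{
	uint32_t v = (uint32_t)(rpg_enable & 1u) << RPG_ENABLE_BIT;

	if (interior_ready)
		v |= 1u << CN_TAG_BIT;
	return v;
}
```

The driver converts the resulting word with cpu_to_be32() before writing
it to the mailbox; that step is omitted here.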

* Re: [PATCH net-next 1/3] net/dcb: Add IEEE QCN attribute
  2015-03-04 12:51 ` [PATCH net-next 1/3] net/dcb: Add IEEE QCN attribute Or Gerlitz
@ 2015-03-04 17:19   ` John Fastabend
  2015-03-05  6:53     ` Or Gerlitz
  0 siblings, 1 reply; 8+ messages in thread
From: John Fastabend @ 2015-03-04 17:19 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: David S. Miller, John Fastabend, netdev, Amir Vadai, Tal Alon,
	Shani Michaeli, Shachar Raindel

On 03/04/2015 04:51 AM, Or Gerlitz wrote:
> From: Shani Michaeli <shanim@mellanox.com>
>
> As specified in 802.1Qau spec. Add this optional attribute to the
> DCB netlink layer. To allow for application to use the new attribute,
> NIC drivers should implement and register the  callbacks ieee_getqcn,
> ieee_setqcn and ieee_getqcnstats.
>
> The QCN attribute holds a set of parameters for management, and
> a set of statistics to provide informative data on Congestion-Control
> defined by this spec.
>
> Signed-off-by: Shani Michaeli <shanim@mellanox.com>
> Signed-off-by: Shachar Raindel <raindel@mellanox.com>
> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
> ---

Looks good to me. Do you have a QCN-enabled switch? I looked at
implementing this a while ago but didn't have any switch support, so
I never did it.

Also do you have a user space client to configure this? I would like
it if someone wanted to add support to lldpad/dcbtool.

[...]

> +
> +/* This structure contains the IEEE 802.1Qau QCN managed object.
> + *
> + *@rpg_enable: enable QCN RP
> + *@rppp_max_rps: maximum number of RPs allowed for this CNPV on this port
> + *@rpg_time_reset: time between rate increases if no CNMs received.
> + *		   given in u-seconds
> + *@rpg_byte_reset: transmitted data between rate increases if no CNMs received.
> + *		   given in Bytes
> + *@rpg_threshold: The number of times rpByteStage or rpTimeStage can count
> + *		   before RP rate control state machine advances states
> + *@rpg_max_rate: the maxinun rate, in Mbits per second,
> + *		 at which an RP can transmit
> + *@rpg_ai_rate: The rate, in Mbits per second,
> + *		used to increase rpTargetRate in the RPR_ACTIVE_INCREASE
> + *@rpg_hai_rate: The rate, in Mbits per second,
> + *		 used to increase rpTargetRate in the RPR_HYPER_INCREASE state
> + *@rpg_gd: Upon CNM receive, flow rate is limited to (Fb/Gd)*CurrentRate.
> + *	   rpgGd is given as log2(Gd), where Gd may only be powers of 2
> + *@rpg_min_dec_fac: The minimum factor by which the current transmit rate
> + *		    can be changed by reception of a CNM.
> + *		    value is given as percentage (1-100)
> + *@rpg_min_rate: The minimum value, in bits per second, for rate to limit
> + *@cndd_state_machine: The state of the congestion notification domain
> + *		       defense state machine, as defined by IEEE 802.3Qau
> + *		       section 32.1.1. In the interior ready state,
> + *		       the QCN capable hardware may add CN-TAG TLV to the
> + *		       outgoing traffic, to specifically identify outgoing
> + *		       flows.
> + */

I'm assuming this structure maps to an IEEE MIB? It's a rather large
structure for a single netlink type, but this seems to be how we built
the dcbnl interface, and it does seem logical that the structure is
one logical block, meaning you need to supply all fields.
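
To put a number on "rather large", here is a userspace sketch that mirrors
the two structures with fixed-width types and computes their sizes. The
names qcn_mirror, qcn_stats_mirror, qcn_payload_size and qcn_payload_ok are
hypothetical. The length check reflects my understanding that a policy
entry with only .len set acts as a minimum payload length; treat that as an
assumption rather than a statement about the netlink core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TCS 8  /* mirrors IEEE_8021QAZ_MAX_TCS */

/* Field-for-field mirror of struct ieee_qcn from the patch. */
struct qcn_mirror {
	uint8_t  rpg_enable[MAX_TCS];
	uint32_t rppp_max_rps[MAX_TCS];
	uint32_t rpg_time_reset[MAX_TCS];
	uint32_t rpg_byte_reset[MAX_TCS];
	uint32_t rpg_threshold[MAX_TCS];
	uint32_t rpg_max_rate[MAX_TCS];
	uint32_t rpg_ai_rate[MAX_TCS];
	uint32_t rpg_hai_rate[MAX_TCS];
	uint32_t rpg_gd[MAX_TCS];
	uint32_t rpg_min_dec_fac[MAX_TCS];
	uint32_t rpg_min_rate[MAX_TCS];
	uint32_t cndd_state_machine[MAX_TCS];
};

/* Field-for-field mirror of struct ieee_qcn_stats from the patch. */
struct qcn_stats_mirror {
	uint64_t rppp_rp_centiseconds[MAX_TCS];
	uint32_t rppp_created_rps[MAX_TCS];
};

/* Eight one-byte enables followed by eleven arrays of eight u32 values. */
static size_t qcn_payload_size(void)
{
	size_t expected = MAX_TCS * sizeof(uint8_t) +
			  11 * MAX_TCS * sizeof(uint32_t);

	assert(sizeof(struct qcn_mirror) == expected);
	return sizeof(struct qcn_mirror);
}

/* A SET payload shorter than the whole block would be rejected. */
static int qcn_payload_ok(size_t len)
{
	return len >= sizeof(struct qcn_mirror);
}
```

On common ABIs this gives 360 bytes for the parameter block and 96 bytes
for the statistics block, all of which travel in a single attribute each.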

[...]

>
> +	if (ops->ieee_getqcn) {
> +		struct ieee_qcn qcn;

You might consider adding a newline here. It is the best practice for
new code, although dcbnl has plenty of examples that don't follow
this convention.


> +		memset(&qcn, 0, sizeof(qcn));
> +		err = ops->ieee_getqcn(netdev, &qcn);
> +		if (!err) {
> +			err = nla_put(skb, DCB_ATTR_IEEE_QCN,
> +				      sizeof(qcn), &qcn);
> +			if (err)
> +				return -EMSGSIZE;
> +		}
> +	}
> +
> +	if (ops->ieee_getqcnstats) {
> +		struct ieee_qcn_stats qcn_stats;

same here.

> +		memset(&qcn_stats, 0, sizeof(qcn_stats));
> +		err = ops->ieee_getqcnstats(netdev, &qcn_stats);
> +		if (!err) {
> +			err = nla_put(skb, DCB_ATTR_IEEE_QCN_STATS,
> +				      sizeof(qcn_stats), &qcn_stats);
> +			if (err)
> +				return -EMSGSIZE;
> +		}
> +	}
> +
>   	if (ops->ieee_getpfc) {
>   		struct ieee_pfc pfc;
>   		memset(&pfc, 0, sizeof(pfc));
> @@ -1379,8 +1405,9 @@ int dcbnl_cee_notify(struct net_device *dev, int event, int cmd,
>   }
>   EXPORT_SYMBOL(dcbnl_cee_notify);
>
> -/* Handle IEEE 802.1Qaz SET commands. If any requested operation can not
> - * be completed the entire msg is aborted and error value is returned.
> +/* Handle IEEE 802.1Qaz/802.1Qau/802.1Qbb SET commands.
> + * If any requested operation can not be completed
> + * the entire msg is aborted and error value is returned.
>    * No attempt is made to reconcile the case where only part of the
>    * cmd can be completed.
>    */
> @@ -1417,6 +1444,14 @@ static int dcbnl_ieee_set(struct net_device *netdev, struct nlmsghdr *nlh,
>   			goto err;
>   	}
>
> +	if (ieee[DCB_ATTR_IEEE_QCN] && ops->ieee_setqcn) {
> +		struct ieee_qcn *qcn =
> +			nla_data(ieee[DCB_ATTR_IEEE_QCN]);

same here.

> +		err = ops->ieee_setqcn(netdev, qcn);
> +		if (err)
> +			goto err;
> +	}
> +

[...]

Feel free to add my acked by if you respin it with the newlines.

Acked-by: John Fastabend <john.r.fastabend@intel.com>

-- 
John Fastabend         Intel Corporation

* Re: [PATCH net-next 1/3] net/dcb: Add IEEE QCN attribute
  2015-03-04 17:19   ` John Fastabend
@ 2015-03-05  6:53     ` Or Gerlitz
  2015-03-05  9:21       ` Shachar Raindel
  0 siblings, 1 reply; 8+ messages in thread
From: Or Gerlitz @ 2015-03-05  6:53 UTC (permalink / raw)
  To: John Fastabend
  Cc: Or Gerlitz, David S. Miller, John Fastabend, Linux Netdev List,
	Amir Vadai, Tal Alon, Shani Michaeli, Shachar Raindel

On Wed, Mar 4, 2015 at 7:19 PM, John Fastabend <john.fastabend@gmail.com> wrote:
> On 03/04/2015 04:51 AM, Or Gerlitz wrote:
>>
>> From: Shani Michaeli <shanim@mellanox.com>
>>
>> As specified in 802.1Qau spec. Add this optional attribute to the
>> DCB netlink layer. To allow for application to use the new attribute,
>> NIC drivers should implement and register the  callbacks ieee_getqcn,
>> ieee_setqcn and ieee_getqcnstats.
>>
>> The QCN attribute holds a set of parameters for management, and
>> a set of statistics to provide informative data on Congestion-Control
>> defined by this spec.
>>
>> Signed-off-by: Shani Michaeli <shanim@mellanox.com>
>> Signed-off-by: Shachar Raindel <raindel@mellanox.com>
>> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>

> Looks good to me. Do you have a QCN-enabled switch? I looked at
> implementing this a while ago but didn't have any switch support, so
> I never did it.

I'll let Shachar address the testing and the MIB questions.

> Also do you have a user space client to configure this? I would like
> it if someone wanted to add support to lldpad/dcbtool.

Sure, we have some netlink (python scripts) code to configure/read
this towards the kernel.
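
For readers without those scripts, the attribute framing a client would
need can be sketched in C (chosen to match the patches, not the Python
scripts mentioned above). This only packs one netlink attribute into a
buffer: a 4-byte header of length and type in host byte order, then the
payload, padded to a 4-byte boundary. The helper put_attr and the constant
EXAMPLE_QCN_ATTR are hypothetical; the value 8 is an assumption about where
DCB_ATTR_IEEE_QCN lands in enum ieee_attrs, and a real client should take
it from <linux/dcbnl.h> and nest the attribute inside DCB_ATTR_IEEE.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ATTR_HDRLEN      4u
#define ATTR_ALIGN(len)  (((len) + 3u) & ~(size_t)3u)
#define EXAMPLE_QCN_ATTR 8  /* assumption, see <linux/dcbnl.h> */

/* Pack one attribute into buf; returns bytes used, or 0 if it does not fit. */
static size_t put_attr(uint8_t *buf, size_t cap, uint16_t type,
		       const void *data, uint16_t len)
{
	uint16_t nla_len;
	size_t total;

	assert(len <= UINT16_MAX - ATTR_HDRLEN);
	nla_len = (uint16_t)(ATTR_HDRLEN + len);  /* header + payload, no padding */
	total = ATTR_ALIGN((size_t)nla_len);      /* padded size in the buffer */
	if (total > cap)
		return 0;

	memset(buf, 0, total);                    /* zero the padding bytes */
	memcpy(buf, &nla_len, sizeof(nla_len));
	memcpy(buf + 2, &type, sizeof(type));
	memcpy(buf + ATTR_HDRLEN, data, len);
	return total;
}
```

A 360-byte parameter block therefore occupies 364 bytes on the wire, before
the enclosing nested attribute and the netlink message header are added.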


> [...]
>> +
>> +/* This structure contains the IEEE 802.1Qau QCN managed object.
>> + *
>> + *@rpg_enable: enable QCN RP
>> + *@rppp_max_rps: maximum number of RPs allowed for this CNPV on this port
>> + *@rpg_time_reset: time between rate increases if no CNMs received.
>> + *                given in u-seconds
>> + *@rpg_byte_reset: transmitted data between rate increases if no CNMs
>> received.
>> + *                given in Bytes
>> + *@rpg_threshold: The number of times rpByteStage or rpTimeStage can
>> count
>> + *                before RP rate control state machine advances states
>> + *@rpg_max_rate: the maxinun rate, in Mbits per second,
>> + *              at which an RP can transmit
>> + *@rpg_ai_rate: The rate, in Mbits per second,
>> + *             used to increase rpTargetRate in the RPR_ACTIVE_INCREASE
>> + *@rpg_hai_rate: The rate, in Mbits per second,
>> + *              used to increase rpTargetRate in the RPR_HYPER_INCREASE state
>> + *@rpg_gd: Upon CNM receive, flow rate is limited to (Fb/Gd)*CurrentRate.
>> + *        rpgGd is given as log2(Gd), where Gd may only be powers of 2
>> + *@rpg_min_dec_fac: The minimum factor by which the current transmit rate
>> + *                 can be changed by reception of a CNM.
>> + *                 value is given as percentage (1-100)
>> + *@rpg_min_rate: The minimum value, in bits per second, for rate to limit
>> + *@cndd_state_machine: The state of the congestion notification domain
>> + *                    defense state machine, as defined by IEEE 802.1Qau
>> + *                    section 32.1.1. In the interior ready state,
>> + *                    the QCN capable hardware may add CN-TAG TLV to the
>> + *                    outgoing traffic, to specifically identify outgoing
>> + *                    flows.
>> + */
>
>
> I'm assuming this structure maps to an IEEE MIB?

yep, I guess so

> It's a rather large structure for a single netlink type, but this seems
> to be how we built the dcbnl interface, and it does seem logical that
> the structure is one logical block, meaning you need to supply all
> fields.

> [...]
>> +       if (ops->ieee_getqcn) {
>> +               struct ieee_qcn qcn;

> You might consider adding a newline here; it is the best practice for
> new code, although dcbnl has plenty of examples where it doesn't use
> this convention.

OK, will add this here and in the other places where that practice was not followed.
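
For illustration, the convention in question (a blank line between the local
declarations and the first statement) looks roughly like the sketch below. It
is a self-contained userspace mock, not the patch itself: the types
ieee_qcn_sketch and dcb_ops_sketch and the functions demo_getqcn and fill_qcn
are hypothetical stand-ins for the kernel structures and for the dcbnl fill
path. It also shows the optional-callback check, where a driver that does not
implement ieee_getqcn is simply skipped.

```c
#include <string.h>

/* Hypothetical stand-in for struct ieee_qcn; only two fields are shown. */
struct ieee_qcn_sketch {
	unsigned int rpg_gd;
	unsigned int rpg_min_dec_fac;
};

/* Hypothetical stand-in for the DCB ops table. */
struct dcb_ops_sketch {
	/* Optional callback: drivers without QCN support leave it NULL. */
	int (*ieee_getqcn)(struct ieee_qcn_sketch *qcn);
};

/* Example driver callback that reports fixed values. */
int demo_getqcn(struct ieee_qcn_sketch *qcn)
{
	qcn->rpg_gd = 7;
	qcn->rpg_min_dec_fac = 50;
	return 0;
}

/*
 * Returns 1 if QCN data was copied to *out, 0 if the driver has no
 * callback, or the callback's non-zero error code.
 */
int fill_qcn(const struct dcb_ops_sketch *ops, struct ieee_qcn_sketch *out)
{
	if (ops->ieee_getqcn) {
		struct ieee_qcn_sketch qcn;
		int err;

		/* The blank line above separates declarations from code. */
		memset(&qcn, 0, sizeof(qcn));
		err = ops->ieee_getqcn(&qcn);
		if (err)
			return err;
		*out = qcn;
		return 1;
	}

	return 0;
}
```

Calling fill_qcn() with an ops table whose ieee_getqcn is NULL returns 0 and
leaves *out untouched.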


* RE: [PATCH net-next 1/3] net/dcb: Add IEEE QCN attribute
  2015-03-05  6:53     ` Or Gerlitz
@ 2015-03-05  9:21       ` Shachar Raindel
  2015-03-05 15:05         ` John Fastabend
  0 siblings, 1 reply; 8+ messages in thread
From: Shachar Raindel @ 2015-03-05  9:21 UTC (permalink / raw)
  To: Or Gerlitz, John Fastabend
  Cc: Or Gerlitz, David S. Miller, John Fastabend, Linux Netdev List,
	Amir Vadai, Tal Alon, Shani Michaeli



> -----Original Message-----
> From: Or Gerlitz [mailto:gerlitz.or@gmail.com]
> Sent: Thursday, March 05, 2015 8:54 AM
> To: John Fastabend
> Cc: Or Gerlitz; David S. Miller; John Fastabend; Linux Netdev List; Amir
> Vadai; Tal Alon; Shani Michaeli; Shachar Raindel
> Subject: Re: [PATCH net-next 1/3] net/dcb: Add IEEE QCN attribute
> 
> On Wed, Mar 4, 2015 at 7:19 PM, John Fastabend
> <john.fastabend@gmail.com> wrote:
> > On 03/04/2015 04:51 AM, Or Gerlitz wrote:
> >>
> >> From: Shani Michaeli <shanim@mellanox.com>
> >>
> >> As specified in 802.1Qau spec. Add this optional attribute to the
> >> DCB netlink layer. To allow for application to use the new attribute,
> >> NIC drivers should implement and register the  callbacks ieee_getqcn,
> >> ieee_setqcn and ieee_getqcnstats.
> >>
> >> The QCN attribute holds a set of parameters for management, and
> >> a set of statistics to provide informative data on Congestion-Control
> >> defined by this spec.
> >>
> >> Signed-off-by: Shani Michaeli <shanim@mellanox.com>
> >> Signed-off-by: Shachar Raindel <raindel@mellanox.com>
> >> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
> 
> > Looks good to me. Do you have a QCN enabled switch? I looked at
> > implementing this awhile ago but didn't have any switch support so
> > I never did it.
> 
> I'll let Shachar address the testing and the MIB questions.

The Mellanox SwitchX-2 IC supports QCN. We were testing our NICs both
with this switch IC and using internal test fixtures where we were
injecting congestion notification messages as raw Ethernet packets from
another host.

> 
> > Also do you have a user space client to configure this? I would like
> > it if someone wanted to add support to lldpad/dcbtool.
> 
> Sure, we have some netlink (python scripts) code to configure/read
> this towards the kernel.
> 
> 
> > [...]
> >> +
> >> +/* This structure contains the IEEE 802.1Qau QCN managed object.
> >> + *
> >> + *@rpg_enable: enable QCN RP
> >> + *@rppp_max_rps: maximum number of RPs allowed for this CNPV on this port
> >> + *@rpg_time_reset: time between rate increases if no CNMs received.
> >> + *                given in u-seconds
> >> + *@rpg_byte_reset: transmitted data between rate increases if no CNMs received.
> >> + *                given in Bytes
> >> + *@rpg_threshold: The number of times rpByteStage or rpTimeStage can count
> >> + *                before RP rate control state machine advances states
> >> + *@rpg_max_rate: the maximum rate, in Mbits per second,
> >> + *              at which an RP can transmit
> >> + *@rpg_ai_rate: The rate, in Mbits per second,
> >> + *             used to increase rpTargetRate in the RPR_ACTIVE_INCREASE
> >> + *@rpg_hai_rate: The rate, in Mbits per second,
> >> + *              used to increase rpTargetRate in the RPR_HYPER_INCREASE state
> >> + *@rpg_gd: Upon CNM receive, flow rate is limited to (Fb/Gd)*CurrentRate.
> >> + *        rpgGd is given as log2(Gd), where Gd may only be powers of 2
> >> + *@rpg_min_dec_fac: The minimum factor by which the current transmit rate
> >> + *                 can be changed by reception of a CNM.
> >> + *                 value is given as percentage (1-100)
> >> + *@rpg_min_rate: The minimum value, in bits per second, for rate to limit
> >> + *@cndd_state_machine: The state of the congestion notification domain
> >> + *                    defense state machine, as defined by IEEE 802.1Qau
> >> + *                    section 32.1.1. In the interior ready state,
> >> + *                    the QCN capable hardware may add CN-TAG TLV to the
> >> + *                    outgoing traffic, to specifically identify outgoing
> >> + *                    flows.
> >> + */
> >
> >
> > I'm assuming this structure maps to an IEEE MIB?
> 
> yep, I guess so

The structures map to the IEEE MIB for QCN (part of IEEE 802.1Qau).
The fields rppp_max_rps and cndd_state_machine are in different sections
than the rest of the fields. However, it seems a bit redundant to define a
whole struct just for one field. The cndd_state_machine is not explicitly
defined in the MIB, as the LLDP negotiation, which the standard assumes,
is implemented by lldpad.

> 
> > It's a rather large structure for a single netlink type, but this seems
> > to be how we built the dcbnl interface, and it does seem logical that
> > the structure is one logical block, meaning you need to supply all
> > fields.
> 

This is the set of parameters defining how you will reduce or increase your
TX rate upon receiving a CNM from the network. I agree that it is a long list,
but this is how the standard was written...
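
As a rough illustration of how some of these parameters act together when a
CNM arrives, here is a minimal sketch. It rests on stated assumptions: the
decrease factor is taken as 1 - Fb/Gd with Gd = 2^rpg_gd, floored at
rpg_min_dec_fac percent, and the result is floored at rpg_min_rate. That
formula is an assumption on our side and is worded differently from the
comment in the patch. The struct qcn_rp_params and the function
qcn_rate_after_cnm are hypothetical names that are not part of the patch, and
the sketch says nothing about what the mlx4 firmware actually does.

```c
#include <stdint.h>

/* Hypothetical subset of the QCN reaction point parameters. */
struct qcn_rp_params {
	uint32_t rpg_gd;          /* log2(Gd), e.g. 7 means Gd = 128 */
	uint32_t rpg_min_dec_fac; /* floor for the decrease factor, percent (1-100) */
	uint64_t rpg_min_rate;    /* lowest rate to limit to, bits per second */
};

/*
 * Rate after one CNM with quantized feedback fb, assuming a decrease
 * factor of 1 - Fb/Gd computed in whole percent (an assumption, see above).
 */
uint64_t qcn_rate_after_cnm(const struct qcn_rp_params *p,
			    uint64_t cur_rate_bps, uint32_t fb)
{
	/* Clamp the exponent so the shift below stays well defined. */
	uint32_t shift = p->rpg_gd > 31 ? 31 : p->rpg_gd;
	uint64_t gd = 1ULL << shift;
	uint64_t dec_pct;
	uint64_t new_rate;

	/* Percentage of the current rate that survives this CNM. */
	dec_pct = (fb >= gd) ? 0 : 100 - (100ULL * fb) / gd;

	/* A single CNM may not cut the rate below rpg_min_dec_fac percent. */
	if (dec_pct < p->rpg_min_dec_fac)
		dec_pct = p->rpg_min_dec_fac;

	new_rate = cur_rate_bps * dec_pct / 100;

	/* Never limit the flow below the configured minimum rate. */
	if (new_rate < p->rpg_min_rate)
		new_rate = p->rpg_min_rate;

	return new_rate;
}
```

With rpg_gd = 7 and rpg_min_dec_fac = 50, a 10 Gb/s flow receiving Fb = 32
would drop to 7.5 Gb/s under these assumptions, and no single CNM could take
it below 5 Gb/s.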

> > [...]
> >> +       if (ops->ieee_getqcn) {
> >> +               struct ieee_qcn qcn;
> 
> > You might consider adding a newline here; it is the best practice for
> > new code, although dcbnl has plenty of examples where it doesn't use
> > this convention.
> 
> OK, will add this here and in the other places where that practice was not followed.


* Re: [PATCH net-next 1/3] net/dcb: Add IEEE QCN attribute
  2015-03-05  9:21       ` Shachar Raindel
@ 2015-03-05 15:05         ` John Fastabend
  0 siblings, 0 replies; 8+ messages in thread
From: John Fastabend @ 2015-03-05 15:05 UTC (permalink / raw)
  To: Shachar Raindel
  Cc: Or Gerlitz, Or Gerlitz, David S. Miller, John Fastabend,
	Linux Netdev List, Amir Vadai, Tal Alon, Shani Michaeli

[...]

>>
>>> Looks good to me. Do you have a QCN enabled switch? I looked at
>>> implementing this awhile ago but didn't have any switch support so
>>> I never did it.
>>
>> I'll let Shachar address the testing and the MIB questions.
>
> The Mellanox SwitchX-2 IC supports QCN. We were testing our NICs both
> with this switch IC and using internal test fixtures where we were
> injecting congestion notification messages as raw Ethernet packets from
> another host.

Ah great, I didn't know that switch supported QCN.

>
>>
>>> Also do you have a user space client to configure this? I would like
>>> it if someone wanted to add support to lldpad/dcbtool.
>>
>> Sure, we have some netlink (python scripts) code to configure/read
>> this towards the kernel.
>>
>>
>>> [...]

[...]

>>>
>>>
>>> I'm assuming this structure maps to an IEEE MIB?
>>
>> yep, I guess so
>
> The structures map to the IEEE MIB for QCN (part of IEEE 802.1Qau).
> The fields rppp_max_rps and cndd_state_machine are in different sections
> than the rest of the fields. However, it seems a bit redundant to define a
> whole struct just for one field. The cndd_state_machine is not explicitly
> defined in the MIB, as the LLDP negotiation, which the standard assumes,
> is implemented by lldpad.
>
>>
>>> It's a rather large structure for a single netlink type, but this seems
>>> to be how we built the dcbnl interface, and it does seem logical that
>>> the structure is one logical block, meaning you need to supply all
>>> fields.
>>
>
> This is the set of parameters defining how you will reduce or increase your
> TX rate upon receiving a CNM from the network. I agree that it is a long list,
> but this is how the standard was written...

Yep, works for me. I was just checking that my assumptions are correct and
admittedly being a bit lazy, so I didn't pull up the spec myself.

Thanks,
John

[...]

-- 
John Fastabend         Intel Corporation


end of thread, other threads:[~2015-03-05 15:06 UTC | newest]

Thread overview: 8+ messages
2015-03-04 12:51 [PATCH net-next 0/3] Add QCN support to the DCB NL layer Or Gerlitz
2015-03-04 12:51 ` [PATCH net-next 1/3] net/dcb: Add IEEE QCN attribute Or Gerlitz
2015-03-04 17:19   ` John Fastabend
2015-03-05  6:53     ` Or Gerlitz
2015-03-05  9:21       ` Shachar Raindel
2015-03-05 15:05         ` John Fastabend
2015-03-04 12:51 ` [PATCH net-next 2/3] net/mlx4_core: Add basic elements for QCN Or Gerlitz
2015-03-04 12:51 ` [PATCH net-next 3/3] net/mlx4_en: Add QCN parameters and statistics handling Or Gerlitz
