* [PATCH for-3.2] IBoE fixes
@ 2011-10-10 8:51 Or Gerlitz
[not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Or Gerlitz @ 2011-10-10 8:51 UTC (permalink / raw)
To: Roland Dreier; +Cc: linux-rdma
Roland,
This is a batch of IBoE related fixes plus addition of extended PMA counters.
Or.
Or Gerlitz (5):
ib/mlx4: enable 4K MTU for IBoE
ib/mlx4: remove setting of vlan in IBoE WQEs control segment
net/mlx4_core: remove module param controlling the vlan table size
ib/core: add support for extended performance counters in sysfs
ib/mlx4: add support for extended PMA counters under IBoE
drivers/infiniband/core/sysfs.c | 100 ++++++++++++++++++++++++++++++++++++-
drivers/infiniband/hw/mlx4/mad.c | 49 +++++++++++++++---
drivers/infiniband/hw/mlx4/main.c | 2 +-
drivers/infiniband/hw/mlx4/qp.c | 11 +---
drivers/net/mlx4/main.c | 12 +---
include/rdma/ib_pma.h | 1 +
6 files changed, 148 insertions(+), 27 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* [PATCH 1/5] ib/mlx4: enable 4K mtu for IBoE
[not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
@ 2011-10-10 8:53 ` Or Gerlitz
2011-10-10 8:54 ` [PATCH 2/5] ib/mlx4: remove setting of vlan in IBoE WQEs control segment Or Gerlitz
` (4 subsequent siblings)
5 siblings, 0 replies; 49+ messages in thread
From: Or Gerlitz @ 2011-10-10 8:53 UTC (permalink / raw)
To: Roland Dreier; +Cc: linux-rdma
The IBoE port mtu is derived from the corresponding EN netdevice
mtu, which can go upto supporting Jumbo-Frames of 9KB and hence
surely supports the max IB mtu of 4K
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/hw/mlx4/main.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index fa643f4..e953cf9 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -227,7 +227,7 @@ static int eth_link_query_port(struct ib_device *ibdev, u8 port,
props->pkey_tbl_len = 1;
props->bad_pkey_cntr = be16_to_cpup((__be16 *) (out_mad->data + 46));
props->qkey_viol_cntr = be16_to_cpup((__be16 *) (out_mad->data + 48));
- props->max_mtu = IB_MTU_2048;
+ props->max_mtu = IB_MTU_4096;
props->subnet_timeout = 0;
props->max_vl_num = out_mad->data[37] >> 4;
props->init_type_reply = 0;
--
1.7.4.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 49+ messages in thread
* [PATCH 2/5] ib/mlx4: remove setting of vlan in IBoE WQEs control segment
[not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
2011-10-10 8:53 ` [PATCH 1/5] ib/mlx4: enable 4K mtu for IBoE Or Gerlitz
@ 2011-10-10 8:54 ` Or Gerlitz
2011-10-10 8:55 ` [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size Or Gerlitz
` (3 subsequent siblings)
5 siblings, 0 replies; 49+ messages in thread
From: Or Gerlitz @ 2011-10-10 8:54 UTC (permalink / raw)
To: Roland Dreier; +Cc: linux-rdma
There's no need to set the vlan related fields in an IBoE send WQE
control segment. This b/c the vlan to be used by a UD QP is specified
through the the datagram segment, and with GSI (CM) QP, all the headers
down to the 8021q and mac ones are built by the software anyway.
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/hw/mlx4/qp.c | 11 ++---------
1 files changed, 2 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 3a91d9d..819b1a5 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -1547,14 +1547,13 @@ static void set_masked_atomic_seg(struct mlx4_wqe_masked_atomic_seg *aseg,
}
static void set_datagram_seg(struct mlx4_wqe_datagram_seg *dseg,
- struct ib_send_wr *wr, __be16 *vlan)
+ struct ib_send_wr *wr)
{
memcpy(dseg->av, &to_mah(wr->wr.ud.ah)->av, sizeof (struct mlx4_av));
dseg->dqpn = cpu_to_be32(wr->wr.ud.remote_qpn);
dseg->qkey = cpu_to_be32(wr->wr.ud.remote_qkey);
dseg->vlan = to_mah(wr->wr.ud.ah)->av.eth.vlan;
memcpy(dseg->mac, to_mah(wr->wr.ud.ah)->av.eth.mac, 6);
- *vlan = dseg->vlan;
}
static void set_mlx_icrc_seg(void *dseg)
@@ -1657,7 +1656,6 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
__be32 uninitialized_var(lso_hdr_sz);
__be32 blh;
int i;
- __be16 vlan = cpu_to_be16(0xffff);
spin_lock_irqsave(&qp->sq.lock, flags);
@@ -1761,7 +1759,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
break;
case IB_QPT_UD:
- set_datagram_seg(wqe, wr, &vlan);
+ set_datagram_seg(wqe, wr);
wqe += sizeof (struct mlx4_wqe_datagram_seg);
size += sizeof (struct mlx4_wqe_datagram_seg) / 16;
@@ -1824,11 +1822,6 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
ctrl->fence_size = (wr->send_flags & IB_SEND_FENCE ?
MLX4_WQE_CTRL_FENCE : 0) | size;
- if (be16_to_cpu(vlan) < 0x1000) {
- ctrl->ins_vlan = 1 << 6;
- ctrl->vlan_tag = vlan;
- }
-
/*
* Make sure descriptor is fully written before
* setting ownership bit (because HW can start
--
1.7.4.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 49+ messages in thread
* [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size
[not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
2011-10-10 8:53 ` [PATCH 1/5] ib/mlx4: enable 4K mtu for IBoE Or Gerlitz
2011-10-10 8:54 ` [PATCH 2/5] ib/mlx4: remove setting of vlan in IBoE WQEs control segment Or Gerlitz
@ 2011-10-10 8:55 ` Or Gerlitz
[not found] ` <alpine.LRH.2.00.1110101055050.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
2011-10-10 8:56 ` [PATCH 4/5] ib/core: add support for extended performance counters in sysfs Or Gerlitz
` (2 subsequent siblings)
5 siblings, 1 reply; 49+ messages in thread
From: Or Gerlitz @ 2011-10-10 8:55 UTC (permalink / raw)
To: Roland Dreier; +Cc: linux-rdma
Enable the maximum size (128) supported by the device for the shadow
vlans table, without a module param to override it. This table is only
used by the IBoE control plane for setting a vlan index into an RC/UC
QP context or UD Address Handle.
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/net/mlx4/main.c | 12 +++---------
1 files changed, 3 insertions(+), 9 deletions(-)
diff --git a/drivers/net/mlx4/main.c b/drivers/net/mlx4/main.c
index f0ee35d..0285c8a 100644
--- a/drivers/net/mlx4/main.c
+++ b/drivers/net/mlx4/main.c
@@ -93,9 +93,8 @@ static int log_num_mac = 2;
module_param_named(log_num_mac, log_num_mac, int, 0444);
MODULE_PARM_DESC(log_num_mac, "Log2 max number of MACs per ETH port (1-7)");
-static int log_num_vlan;
-module_param_named(log_num_vlan, log_num_vlan, int, 0444);
-MODULE_PARM_DESC(log_num_vlan, "Log2 max number of VLANs per ETH port (0-7)");
+/* Log2 max number of VLANs per ETH port (0-7) */
+#define MLX4_LOG_NUM_VLANS 7
static int use_prio;
module_param_named(use_prio, use_prio, bool, 0444);
@@ -230,7 +229,7 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
dev->caps.max_gso_sz = dev_cap->max_gso_sz;
dev->caps.log_num_macs = log_num_mac;
- dev->caps.log_num_vlans = log_num_vlan;
+ dev->caps.log_num_vlans = MLX4_LOG_NUM_VLANS;
dev->caps.log_num_prios = use_prio ? 3 : 0;
for (i = 1; i <= dev->caps.num_ports; ++i) {
@@ -1489,11 +1488,6 @@ static int __init mlx4_verify_params(void)
return -1;
}
- if ((log_num_vlan < 0) || (log_num_vlan > 7)) {
- pr_warning("mlx4_core: bad num_vlan: %d\n", log_num_vlan);
- return -1;
- }
-
if ((log_mtts_per_seg < 1) || (log_mtts_per_seg > 7)) {
pr_warning("mlx4_core: bad log_mtts_per_seg: %d\n", log_mtts_per_seg);
return -1;
--
1.7.4.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 49+ messages in thread
* [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
` (2 preceding siblings ...)
2011-10-10 8:55 ` [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size Or Gerlitz
@ 2011-10-10 8:56 ` Or Gerlitz
[not found] ` <alpine.LRH.2.00.1110101055570.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
2011-10-10 8:57 ` [PATCH 5/5] ib/mlx4: added support for extended PMA counters under IBoE Or Gerlitz
2011-10-26 7:04 ` [PATCH for-3.2] IBoE fixes Or Gerlitz
5 siblings, 1 reply; 49+ messages in thread
From: Or Gerlitz @ 2011-10-10 8:56 UTC (permalink / raw)
To: Roland Dreier; +Cc: linux-rdma
issue a PMA classportinfo query towards the HW driver, and if extended
counters are supported, expose new sysfs entries which allow to read them.
Signed-off-by: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
---
drivers/infiniband/core/sysfs.c | 100 ++++++++++++++++++++++++++++++++++++++-
include/rdma/ib_pma.h | 1 +
2 files changed, 100 insertions(+), 1 deletions(-)
diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index 9ab5df7..8bbed12 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -38,6 +38,7 @@
#include <linux/string.h>
#include <rdma/ib_mad.h>
+#include <rdma/ib_pma.h>
struct ib_port {
struct kobject kobj;
@@ -288,6 +289,13 @@ static ssize_t show_port_pkey(struct ib_port *p, struct port_attribute *attr,
return sprintf(buf, "0x%04x\n", pkey);
}
+enum {
+ COUNTERS_START = 0,
+ COUNTERS_LAST = 15,
+ EXT_COUNTERS_START = 16,
+ EXT_COUNTERS_LAST = 23
+};
+
#define PORT_PMA_ATTR(_name, _counter, _width, _offset) \
struct port_table_attribute port_pma_attr_##_name = { \
.attr = __ATTR(_name, S_IRUGO, show_pma_counter, NULL), \
@@ -301,6 +309,7 @@ static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute *attr,
container_of(attr, struct port_table_attribute, attr);
int offset = tab_attr->index & 0xffff;
int width = (tab_attr->index >> 16) & 0xff;
+ int counter = (tab_attr->index >> 24) & 0xff;
struct ib_mad *in_mad = NULL;
struct ib_mad *out_mad = NULL;
ssize_t ret;
@@ -319,7 +328,14 @@ static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute *attr,
in_mad->mad_hdr.mgmt_class = IB_MGMT_CLASS_PERF_MGMT;
in_mad->mad_hdr.class_version = 1;
in_mad->mad_hdr.method = IB_MGMT_METHOD_GET;
- in_mad->mad_hdr.attr_id = cpu_to_be16(0x12); /* PortCounters */
+ if (counter >= COUNTERS_START && counter <= COUNTERS_LAST)
+ in_mad->mad_hdr.attr_id = IB_PMA_PORT_COUNTERS;
+ else if (counter >= EXT_COUNTERS_START && counter <= EXT_COUNTERS_LAST)
+ in_mad->mad_hdr.attr_id = IB_PMA_PORT_COUNTERS_EXT;
+ else {
+ ret = -EINVAL;
+ goto out;
+ }
in_mad->data[41] = p->port_num; /* PortSelect field */
@@ -347,6 +363,10 @@ static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute *attr,
ret = sprintf(buf, "%u\n",
be32_to_cpup((__be32 *)(out_mad->data + 40 + offset / 8)));
break;
+ case 64:
+ ret = sprintf(buf, "%llu\n", be64_to_cpup(
+ (__be64 *)(out_mad->data + 40 + offset / 8)));
+ break;
default:
ret = 0;
}
@@ -358,6 +378,7 @@ out:
return ret;
}
+
static PORT_PMA_ATTR(symbol_error , 0, 16, 32);
static PORT_PMA_ATTR(link_error_recovery , 1, 8, 48);
static PORT_PMA_ATTR(link_downed , 2, 8, 56);
@@ -374,6 +395,15 @@ static PORT_PMA_ATTR(port_xmit_data , 12, 32, 192);
static PORT_PMA_ATTR(port_rcv_data , 13, 32, 224);
static PORT_PMA_ATTR(port_xmit_packets , 14, 32, 256);
static PORT_PMA_ATTR(port_rcv_packets , 15, 32, 288);
+/* definition for port counters extension */
+static PORT_PMA_ATTR(port_xmit_data_64 , 16, 64, 64);
+static PORT_PMA_ATTR(port_rcv_data_64 , 17, 64, 128);
+static PORT_PMA_ATTR(port_xmit_packets_64 , 18, 64, 192);
+static PORT_PMA_ATTR(port_rcv_packets_64 , 19, 64, 256);
+static PORT_PMA_ATTR(port_unicast_xmit_packets , 20, 64, 320);
+static PORT_PMA_ATTR(port_unicast_rcv_packets , 21, 64, 384);
+static PORT_PMA_ATTR(port_multicast_xmit_packets , 22, 64, 448);
+static PORT_PMA_ATTR(port_multicast_rcv_packets , 23, 64, 512);
static struct attribute *pma_attrs[] = {
&port_pma_attr_symbol_error.attr.attr,
@@ -400,6 +430,68 @@ static struct attribute_group pma_group = {
.attrs = pma_attrs
};
+static struct attribute *pma_attrs_ext[] = {
+ &port_pma_attr_port_xmit_data_64.attr.attr,
+ &port_pma_attr_port_rcv_data_64.attr.attr,
+ &port_pma_attr_port_xmit_packets_64.attr.attr,
+ &port_pma_attr_port_rcv_packets_64.attr.attr,
+ &port_pma_attr_port_unicast_xmit_packets.attr.attr,
+ &port_pma_attr_port_unicast_rcv_packets.attr.attr,
+ &port_pma_attr_port_multicast_xmit_packets.attr.attr,
+ &port_pma_attr_port_multicast_rcv_packets.attr.attr,
+ NULL
+};
+
+static struct attribute_group pma_ext_group = {
+ .name = "counters_ext",
+ .attrs = pma_attrs_ext
+};
+
+static int is_pma_class_cap_ext_width(struct ib_device *ibdev, int port_num)
+{
+ struct ib_mad *in_mad = NULL;
+ struct ib_mad *out_mad = NULL;
+ struct ib_class_port_info *cpi = NULL;
+ int ret = 0;
+
+ if (!ibdev->process_mad)
+ return -ENOSYS;
+
+ in_mad = kzalloc(sizeof *in_mad, GFP_KERNEL);
+ out_mad = kmalloc(sizeof *out_mad, GFP_KERNEL);
+ if (!in_mad || !out_mad) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ in_mad->mad_hdr.base_version = 1;
+ in_mad->mad_hdr.mgmt_class = IB_MGMT_CLASS_PERF_MGMT;
+ in_mad->mad_hdr.class_version = 1;
+ in_mad->mad_hdr.method = IB_MGMT_METHOD_GET;
+ in_mad->mad_hdr.attr_id = IB_PMA_CLASS_PORT_INFO;
+
+ in_mad->data[41] = port_num;
+
+ if ((ibdev->process_mad(ibdev, IB_MAD_IGNORE_MKEY, port_num,
+ NULL, NULL, in_mad, out_mad) &
+ (IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY)) !=
+ (IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY)) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ cpi = (struct ib_class_port_info *)(out_mad->data + 40);
+ if (!(cpi->capability_mask && IB_PMA_CLASS_CAP_EXT_WIDTH) &&
+ !(cpi->capability_mask && IB_PMA_CLASS_CAP_EXT_WIDTH_NOIETF))
+ ret = -ENOSYS;
+
+out:
+ kfree(in_mad);
+ kfree(out_mad);
+
+ return ret;
+}
+
static void ib_port_release(struct kobject *kobj)
{
struct ib_port *p = container_of(kobj, struct ib_port, kobj);
@@ -520,6 +612,12 @@ static int add_port(struct ib_device *device, int port_num,
if (ret)
goto err_put;
+ if (is_pma_class_cap_ext_width(device, port_num) == 0) {
+ ret = sysfs_create_group(&p->kobj, &pma_ext_group);
+ if (ret)
+ goto err_put;
+ }
+
p->gid_group.name = "gids";
p->gid_group.attrs = alloc_group_attrs(show_port_gid, attr.gid_tbl_len);
if (!p->gid_group.attrs)
diff --git a/include/rdma/ib_pma.h b/include/rdma/ib_pma.h
index a5889f1..2f8a65c 100644
--- a/include/rdma/ib_pma.h
+++ b/include/rdma/ib_pma.h
@@ -42,6 +42,7 @@
*/
#define IB_PMA_CLASS_CAP_ALLPORTSELECT cpu_to_be16(1 << 8)
#define IB_PMA_CLASS_CAP_EXT_WIDTH cpu_to_be16(1 << 9)
+#define IB_PMA_CLASS_CAP_EXT_WIDTH_NOIETF cpu_to_be16(1 << 10)
#define IB_PMA_CLASS_CAP_XMIT_WAIT cpu_to_be16(1 << 12)
#define IB_PMA_CLASS_PORT_INFO cpu_to_be16(0x0001)
--
1.7.4.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 49+ messages in thread
* [PATCH 5/5] ib/mlx4: added support for extended PMA counters under IBoE
[not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
` (3 preceding siblings ...)
2011-10-10 8:56 ` [PATCH 4/5] ib/core: add support for extended performance counters in sysfs Or Gerlitz
@ 2011-10-10 8:57 ` Or Gerlitz
2011-10-26 7:04 ` [PATCH for-3.2] IBoE fixes Or Gerlitz
5 siblings, 0 replies; 49+ messages in thread
From: Or Gerlitz @ 2011-10-10 8:57 UTC (permalink / raw)
To: Roland Dreier; +Cc: linux-rdma
With the HW counters being 64bit ones, use the existing IBoE
PMA counters to support also extended counters.
Signed-off-by: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/hw/mlx4/mad.c | 49 ++++++++++++++++++++++++++++++++-----
1 files changed, 42 insertions(+), 7 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index f36da99..7ccc9a0 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -303,13 +303,44 @@ static int ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
return IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY;
}
-static void edit_counter(struct mlx4_counter *cnt,
- struct ib_pma_portcounters *pma_cnt)
+static int pma_get_classportinfo(struct ib_mad *out_mad)
{
- pma_cnt->port_xmit_data = cpu_to_be32((be64_to_cpu(cnt->tx_bytes)>>2));
- pma_cnt->port_rcv_data = cpu_to_be32((be64_to_cpu(cnt->rx_bytes)>>2));
- pma_cnt->port_xmit_packets = cpu_to_be32(be64_to_cpu(cnt->tx_frames));
- pma_cnt->port_rcv_packets = cpu_to_be32(be64_to_cpu(cnt->rx_frames));
+ struct ib_class_port_info *cpi =
+ (struct ib_class_port_info *)(out_mad->data + 40);
+
+ memset(cpi, 0, sizeof(cpi));
+
+ cpi->capability_mask = IB_PMA_CLASS_CAP_EXT_WIDTH_NOIETF ;
+
+ return IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY;
+}
+
+static void edit_counter(struct mlx4_counter *cnt, void *counters,
+ __be16 attr_id)
+{
+ if (attr_id == IB_PMA_PORT_COUNTERS) {
+ struct ib_pma_portcounters *pma_cnt =
+ (struct ib_pma_portcounters *) counters;
+
+ pma_cnt->port_xmit_data =
+ cpu_to_be32((be64_to_cpu(cnt->tx_bytes)>>2));
+ pma_cnt->port_rcv_data =
+ cpu_to_be32((be64_to_cpu(cnt->rx_bytes)>>2));
+ pma_cnt->port_xmit_packets =
+ cpu_to_be32(be64_to_cpu(cnt->tx_frames));
+ pma_cnt->port_rcv_packets =
+ cpu_to_be32(be64_to_cpu(cnt->rx_frames));
+ } else {
+ struct ib_pma_portcounters_ext *pma_cnt_ext =
+ (struct ib_pma_portcounters_ext *) counters;
+
+ pma_cnt_ext->port_xmit_data =
+ cpu_to_be64((be64_to_cpu(cnt->tx_bytes)>>2));
+ pma_cnt_ext->port_rcv_data =
+ cpu_to_be64((be64_to_cpu(cnt->rx_bytes)>>2));
+ pma_cnt_ext->port_xmit_packets = cnt->tx_frames;
+ pma_cnt_ext->port_rcv_packets = cnt->rx_frames;
+ }
}
static int iboe_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
@@ -325,6 +356,9 @@ static int iboe_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
if (in_mad->mad_hdr.mgmt_class != IB_MGMT_CLASS_PERF_MGMT)
return -EINVAL;
+ if (in_mad->mad_hdr.attr_id == IB_PMA_CLASS_PORT_INFO)
+ return pma_get_classportinfo(out_mad);
+
mailbox = mlx4_alloc_cmd_mailbox(dev->dev);
if (IS_ERR(mailbox))
return IB_MAD_RESULT_FAILURE;
@@ -339,7 +373,8 @@ static int iboe_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
switch (mode & 0xf) {
case 0:
edit_counter(mailbox->buf,
- (void *)(out_mad->data + 40));
+ (void *)(out_mad->data + 40),
+ in_mad->mad_hdr.attr_id);
err = IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY;
break;
default:
--
1.7.4.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 49+ messages in thread
* Re: [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size
[not found] ` <alpine.LRH.2.00.1110101055050.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
@ 2011-10-10 17:00 ` Roland Dreier
[not found] ` <CAG4TOxNaQicA6ExuNsw8V95mJPD+AQX5Wrfg+rCoBxdr2E+2ww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-10-16 8:26 ` [PATCH V1 3/5] net/mlx4_core: deprecate " Or Gerlitz
1 sibling, 1 reply; 49+ messages in thread
From: Roland Dreier @ 2011-10-10 17:00 UTC (permalink / raw)
To: Or Gerlitz; +Cc: linux-rdma
> -static int log_num_vlan;
> -module_param_named(log_num_vlan, log_num_vlan, int, 0444);
> -MODULE_PARM_DESC(log_num_vlan, "Log2 max number of VLANs per ETH port (0-7)");
Can we really do this? Doesn't this break anyone who has this
option configured in modprobe.conf?
This is one reason why I hate adding module parameters -- you can't
just delete them later like this.
- R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size
[not found] ` <CAG4TOxNaQicA6ExuNsw8V95mJPD+AQX5Wrfg+rCoBxdr2E+2ww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-10-11 7:24 ` Or Gerlitz
[not found] ` <4E93EF38.5040707-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Or Gerlitz @ 2011-10-11 7:24 UTC (permalink / raw)
To: Roland Dreier; +Cc: linux-rdma
On 10/10/2011 7:00 PM, Roland Dreier wrote:
> Can we really do this? Doesn't this break anyone who has this option
> configured in modprobe.conf?
> This is one reason why I hate adding module parameters -- you can't
> just delete them later like this
Yes, I now understand even better why you hate adding module params...
specifically here, this table relates
only to IBoE users, which is relatively new (in since 2.6.37) and didn't
hit yet most (if not all) of the
distros, so I don't expect anything to break, makes sense?
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size
[not found] ` <4E93EF38.5040707-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2011-10-11 17:28 ` Roland Dreier
[not found] ` <CAL1RGDVk2EOmgWU_iZQWQqKJFPNYN4o_5_14Qbr_6wUzTJ1bOg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Roland Dreier @ 2011-10-11 17:28 UTC (permalink / raw)
To: Or Gerlitz; +Cc: linux-rdma
On Tue, Oct 11, 2011 at 12:24 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> Yes, I now understand even better why you hate adding module params...
> specifically here, this table relates
> only to IBoE users, which is relatively new (in since 2.6.37) and didn't hit
> yet most (if not all) of the
> distros, so I don't expect anything to break, makes sense?
No, I think we need to switch to ignoring the parameter and printing a
warning if someone sets it, and then we can remove it in a year or so.
- R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size
[not found] ` <CAL1RGDVk2EOmgWU_iZQWQqKJFPNYN4o_5_14Qbr_6wUzTJ1bOg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-10-11 19:18 ` Or Gerlitz
[not found] ` <CAJZOPZJnzUm2M1zm1564TUwKCN3r=y3g-4pwTL4bJPDo2V4s9w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-10-16 8:32 ` Or Gerlitz
1 sibling, 1 reply; 49+ messages in thread
From: Or Gerlitz @ 2011-10-11 19:18 UTC (permalink / raw)
To: Roland Dreier; +Cc: Or Gerlitz, linux-rdma
On Tue, Oct 11, 2011 at 7:28 PM, Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> No, I think we need to switch to ignoring the parameter and printing a
> warning if someone sets it, and then we can remove it in a year or so.
no problem, I don't think I will be able to send such patch before
Sunday, so in spite of the merge window coming soon and as the rest of
the series isn't dependant on that patch, if the other patches are
okay, they can be picked before this one is fixed.
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size
[not found] ` <CAJZOPZJnzUm2M1zm1564TUwKCN3r=y3g-4pwTL4bJPDo2V4s9w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-10-11 19:22 ` Roland Dreier
0 siblings, 0 replies; 49+ messages in thread
From: Roland Dreier @ 2011-10-11 19:22 UTC (permalink / raw)
To: Or Gerlitz; +Cc: Or Gerlitz, linux-rdma
On Tue, Oct 11, 2011 at 12:18 PM, Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> no problem, I don't think I will be able to send such patch before
> Sunday, so in spite of the merge window coming soon and as the rest of
> the series isn't dependant on that patch, if the other patches are
> okay, they can be picked before this one is fixed.
Sure.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* [PATCH V1 3/5] net/mlx4_core: deprecate module param controlling the vlan table size
[not found] ` <alpine.LRH.2.00.1110101055050.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
2011-10-10 17:00 ` Roland Dreier
@ 2011-10-16 8:26 ` Or Gerlitz
1 sibling, 0 replies; 49+ messages in thread
From: Or Gerlitz @ 2011-10-16 8:26 UTC (permalink / raw)
To: Roland Dreier; +Cc: linux-rdma
Enable the maximum size (128) supported by the device for the shadow
vlans table, igonring the module param which could override it.
This table is only used by the IBoE control plane for setting a vlan
index into an RC/UC QP context or UD Address Handle.
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
changes from V0:
- deprecate the module param instead of removing it, such that
users will get warning if they attempt to set it.
drivers/net/mlx4/main.c | 11 ++++++-----
1 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/net/mlx4/main.c b/drivers/net/mlx4/main.c
index f0ee35d..9c4ef20 100644
--- a/drivers/net/mlx4/main.c
+++ b/drivers/net/mlx4/main.c
@@ -96,6 +96,8 @@ MODULE_PARM_DESC(log_num_mac, "Log2 max number of MACs per ETH port (1-7)");
static int log_num_vlan;
module_param_named(log_num_vlan, log_num_vlan, int, 0444);
MODULE_PARM_DESC(log_num_vlan, "Log2 max number of VLANs per ETH port (0-7)");
+/* Log2 max number of VLANs per ETH port (0-7) */
+#define MLX4_LOG_NUM_VLANS 7
static int use_prio;
module_param_named(use_prio, use_prio, bool, 0444);
@@ -230,7 +232,7 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
dev->caps.max_gso_sz = dev_cap->max_gso_sz;
dev->caps.log_num_macs = log_num_mac;
- dev->caps.log_num_vlans = log_num_vlan;
+ dev->caps.log_num_vlans = MLX4_LOG_NUM_VLANS;
dev->caps.log_num_prios = use_prio ? 3 : 0;
for (i = 1; i <= dev->caps.num_ports; ++i) {
@@ -1489,10 +1491,9 @@ static int __init mlx4_verify_params(void)
return -1;
}
- if ((log_num_vlan < 0) || (log_num_vlan > 7)) {
- pr_warning("mlx4_core: bad num_vlan: %d\n", log_num_vlan);
- return -1;
- }
+ if (log_num_vlan != 0)
+ pr_warning("mlx4_core: log_num_vlan - obsoleted module param, using %d\n",
+ MLX4_LOG_NUM_VLANS);
if ((log_mtts_per_seg < 1) || (log_mtts_per_seg > 7)) {
pr_warning("mlx4_core: bad log_mtts_per_seg: %d\n", log_mtts_per_seg);
--
1.7.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 49+ messages in thread
* Re: [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size
[not found] ` <CAL1RGDVk2EOmgWU_iZQWQqKJFPNYN4o_5_14Qbr_6wUzTJ1bOg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-10-11 19:18 ` Or Gerlitz
@ 2011-10-16 8:32 ` Or Gerlitz
1 sibling, 0 replies; 49+ messages in thread
From: Or Gerlitz @ 2011-10-16 8:32 UTC (permalink / raw)
To: Roland Dreier; +Cc: linux-rdma
On 10/11/2011 7:28 PM, Roland Dreier wrote:
> No, I think we need to switch to ignoring the parameter and printing a
> warning if someone sets it, and then we can remove it in a year or so
done, I sent you V1 of patch 3/5 which fixes that, any other comment
for the rest of the series?
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH for-3.2] IBoE fixes
[not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
` (4 preceding siblings ...)
2011-10-10 8:57 ` [PATCH 5/5] ib/mlx4: added support for extended PMA counters under IBoE Or Gerlitz
@ 2011-10-26 7:04 ` Or Gerlitz
5 siblings, 0 replies; 49+ messages in thread
From: Or Gerlitz @ 2011-10-26 7:04 UTC (permalink / raw)
To: Roland Dreier; +Cc: linux-rdma, Marcel Apfelbaum
On 10/10/2011 10:51 AM, Or Gerlitz wrote:
> This is a batch of IBoE related fixes plus addition of extended PMA counters.
Hi Roland,
With the merge window opening around yesterday, just checking if we're
on track
for merging this patch set into 3.2 (I sent V1 for patch #3 as you asked).
Also, for the FDR set, Marcel has sent a fix to the "[... v2 3/5]
ib/mlx4: Add extended
port capabilities support" patch along the lines you were suggesting, so
we'll be
happy to know if you're fine with this patch and also the fourth patch
which depends
on it, thanks!
Or.
>
>
>
> Or Gerlitz (5):
> ib/mlx4: enable 4K MTU for IBoE
> ib/mlx4: remove setting of vlan in IBoE WQEs control segment
> net/mlx4_core: remove module param controlling the vlan table size
> ib/core: add support for extended performance counters in sysfs
> ib/mlx4: add support for extended PMA counters under IBoE
>
> drivers/infiniband/core/sysfs.c | 100 ++++++++++++++++++++++++++++++++++++-
> drivers/infiniband/hw/mlx4/mad.c | 49 +++++++++++++++---
> drivers/infiniband/hw/mlx4/main.c | 2 +-
> drivers/infiniband/hw/mlx4/qp.c | 11 +---
> drivers/net/mlx4/main.c | 12 +---
> include/rdma/ib_pma.h | 1 +
> 6 files changed, 148 insertions(+), 27 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <alpine.LRH.2.00.1110101055570.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
@ 2011-10-31 19:38 ` Roland Dreier
[not found] ` <CAL1RGDXdkiZNtAsnqDtgbCJGLXLFvk8s-YA5mR189kmg6FkabQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Roland Dreier @ 2011-10-31 19:38 UTC (permalink / raw)
To: Or Gerlitz; +Cc: linux-rdma
On Mon, Oct 10, 2011 at 1:56 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> +static struct attribute_group pma_ext_group = {
> + .name = "counters_ext",
> + .attrs = pma_attrs_ext
> +};
Sorry for the late review here, but does it seem like the best
approach to have a separate "counters_ext" directory for
some subset of performance counters? Instead we could
have two attribute_groups, one the basic counters and one
the basic and extended counters, and basically do
if (is_pma_class_cap_ext_width(device, port_num) == 0)
sysfs_create_group(...basic and extended counters...)
else
sysfs_create_group(...basic counters...)
(by the way, is_pma_class_cap_ext_width() seems backwards
since it returns 0 for true. How about bool pma_has_ext_width()
and have it return true if the extended counters ARE supported?)
Or is there some reason why users would want to make the
distinction between basic and extended counters?
- R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <CAL1RGDXdkiZNtAsnqDtgbCJGLXLFvk8s-YA5mR189kmg6FkabQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-11-01 6:40 ` Or Gerlitz
[not found] ` <CAJZOPZ+K3u+u5XutGfBhKh3DaBpNRKV_bw67CLK6jy3060OHqA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Or Gerlitz @ 2011-11-01 6:40 UTC (permalink / raw)
To: Roland Dreier; +Cc: Or Gerlitz, linux-rdma
On Mon, Oct 31, 2011 at 9:38 PM, Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> Sorry for the late review here
Oh yes... BTW this is patch 4/5, I don't see patches 1,2,3 on your for-next
tree/branch @ kernel.org, have you accepted them?
> Sorry for the late review here, but does it seem like the best
> approach to have a separate "counters_ext" directory for
> some subset of performance counters? Instead we could
> have two attribute_groups, one the basic counters and one
> the basic and extended counters, and basically do
>
> if (is_pma_class_cap_ext_width(device, port_num) == 0)
> sysfs_create_group(...basic and extended counters...)
> else
> sysfs_create_group(...basic counters...)
Basically, I don't see a problem to have one directory along the lines
of your suggestion
> Or is there some reason why users would want to make the
> distinction between basic and extended counters?
Today, e.g in some IBoE perf monitoring scripts we wrote, the
distinction is done by if (the ext counter directory exists) then go
and read the counters from there, else read from the non extended
counters directory. With the change you propose, that if (.) would
become a little less elegant and would check if this or that --file--
exists (e.g the 64 bit tx data counter) and if yes, would read the
four 64 bit counters (rx/tx packets/data) else the four 32 bits
counters, so from our user standpoint, diff dirs seems better, but we
can get along with same dir with diff contents depending on the
device.
> (by the way, is_pma_class_cap_ext_width() seems backwards
> since it returns 0 for true. How about bool pma_has_ext_width()
> and have it return true if the extended counters ARE supported?)
sure, I will be able to handle this little change Wed, but we should
be okay / enough time for the current merge window, correct?
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <CAJZOPZ+K3u+u5XutGfBhKh3DaBpNRKV_bw67CLK6jy3060OHqA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-11-01 16:22 ` Jason Gunthorpe
[not found] ` <20111101162218.GA10710-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Jason Gunthorpe @ 2011-11-01 16:22 UTC (permalink / raw)
To: Or Gerlitz; +Cc: Roland Dreier, Or Gerlitz, linux-rdma
On Tue, Nov 01, 2011 at 08:40:12AM +0200, Or Gerlitz wrote:
> Today, e.g in some IBoE perf monitoring scripts we wrote, the
> distinction is done by if (the ext counter directory exists) then go
> and read the counters from there, else read from the non extended
> counters directory. With the change you propose, that if (.) would
> become a little less elegant and would check if this or that --file--
> exists (e.g the 64 bit tx data counter) and if yes, would read the
> four 64 bit counters (rx/tx packets/data) else the four 32 bits
> counters, so from our user standpoint, diff dirs seems better, but we
> can get along with same dir with diff contents depending on the
> device.
Is there any reason to expose the 32 and 64 bit version of the same
counter? That seems needless. Emit the largest version available and
prepend 0's to fill out to the available width so that userspace can
know the counter size.
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111101162218.GA10710-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2011-11-01 17:23 ` Or Gerlitz
[not found] ` <CAJZOPZLXZL=vEDdXE3rE5Jzw840SrdY=73M7QgKSNzOysRE6MQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Or Gerlitz @ 2011-11-01 17:23 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Roland Dreier, Or Gerlitz, linux-rdma
Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> Is there any reason to expose the 32 and 64 bit version of the same
> counter? That seems needless. Emit the largest version available and
> prepend 0's to fill out to the available width so that userspace can
> know the counter size.
Basically, the approach you suggest seems fine for IBoE which is
pretty new, however,
the problem is that the 32 bit counters exists from kind of day one
AND have the same semantics either if returned through sysfs or
through perfquery and alike mad based apps.
In other words around IB there should be some legacy which exists
today, and I don't think it would be wise to touch that area such that
the 32 bit counters become embedded in 64 bit numbers, thoughts?
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <CAJZOPZLXZL=vEDdXE3rE5Jzw840SrdY=73M7QgKSNzOysRE6MQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-11-01 17:36 ` Jason Gunthorpe
[not found] ` <20111101173625.GH26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Jason Gunthorpe @ 2011-11-01 17:36 UTC (permalink / raw)
To: Or Gerlitz; +Cc: Roland Dreier, Or Gerlitz, linux-rdma
On Tue, Nov 01, 2011 at 07:23:52PM +0200, Or Gerlitz wrote:
> Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> > Is there any reason to expose the 32 and 64 bit version of the same
> > counter? That seems needless. Emit the largest version available and
> > prepend 0's to fill out to the available width so that userspace can
> > know the counter size.
>
> Basically, the approach you suggest seems fine for IBoE which is
> pretty new, however,
>
> the problem is that the 32 bit counters exists from kind of day one
> AND have the same semantics either if returned through sysfs or
> through perfquery and alike mad based apps.
> In other words around IB there should be some legacy which exists
> today, and I don't think it would be wise to touch that area such that
> the 32 bit counters become embedded in 64 bit numbers, thoughts?
I don't see a problem with having a sysfs counter file being extended
to return a 64 bit number.. I think that is within the purvue of
acceptable changes. Shame the counter wasn't exported as hex though -
makes it harder to signal if it is 32 or 64 bit.
Frankly, exporting these PMA counters as saturate on maximum via sysfs
is pretty useless. Does anyone actually use them aside from a few
scripts?
What would be useful is free running 64 bit sysfs counters that are
independent and not reset by PMA activity. Like all the other Linux
networking counters. That would be great. I hope that is what is done
for IBoE?
Unifying the counters to be semantically the same on IB and IBoE seems
like a very good idea.
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111101173625.GH26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2011-11-01 18:14 ` Or Gerlitz
[not found] ` <CAJZOPZJCY5ehTGw8htsxw0dJgMXJJAS-=LGJde9ddcMsmhqvMg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Or Gerlitz @ 2011-11-01 18:14 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Roland Dreier, Or Gerlitz, linux-rdma
Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
Guys (Roland, Jason), I'm open to any comments, any time, for any
patch, but for a patch which was posted weeks ago it's pretty unfair
to have your comments coming only eight days after the merge window
has been opened, lets try to come quickly to decision so I can fix
this up along those lines,
> I don't see a problem with having a sysfs counter file being extended
> to return a 64 bit number.. I think that is within the purvue of
> acceptable changes. Shame the counter wasn't exported as hex though -
> makes it harder to signal if it is 32 or 64 bit.
if I understand you right, we would have traffic counters exposed
through sysfs, where a counter is either a 32 zero-padded/embedded in
64bit one or true 64 bit one, a problem is that the four 32bit traffic
counters (rx/tx data/packets) are actually part of the IB port L2
basic counter set which includes about ten more counters to mark
different kinds of errors, wheres the 64bit counters are only traffic
counters, so what do you suggest for them? use the same approach for
the error counters as well even though IB doesn't define 64 bit
version for them? also zero padding for something which isn't exported
in hex is very ugly, isn't that?
> Frankly, exporting these PMA counters as saturate on maximum via sysfs
> is pretty useless. Does anyone actually use them aside from a few scripts?
under IB our monitorying code/scripts use perfquery/mads wheres under
IBoE we use sysfs, the mad approach allows to reset the counters, so
the 32 bit counters aren't useless, reset via sysfs isn't supported so
the 64 bit counter are kind of must, anyway,
> What would be useful is free running 64 bit sysfs counters that are
> independent and not reset by PMA activity. Like all the other Linux
> networking counters. That would be great. I hope that is what is done for IBoE?
yes this is what the 64 bit counters are
> Unifying the counters to be semantically the same on IB and IBoE seems
> like a very good idea.
yes, this is what we do here
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <CAJZOPZJCY5ehTGw8htsxw0dJgMXJJAS-=LGJde9ddcMsmhqvMg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-11-01 18:37 ` Jason Gunthorpe
[not found] ` <20111101183730.GI26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-01 21:49 ` Roland Dreier
1 sibling, 1 reply; 49+ messages in thread
From: Jason Gunthorpe @ 2011-11-01 18:37 UTC (permalink / raw)
To: Or Gerlitz; +Cc: Roland Dreier, Or Gerlitz, linux-rdma
> > I don't see a problem with having a sysfs counter file being extended
> > to return a 64 bit number.. I think that is within the purvue of
> > acceptable changes. Shame the counter wasn't exported as hex though -
> > makes it harder to signal if it is 32 or 64 bit.
>
> if I understand you right, we would have traffic counters exposed
> through sysfs, where a counter is either a 32 zero-padded/embedded in
> 64bit one or true 64 bit one, a problem is that the four 32bit traffic
> counters (rx/tx data/packets) are actually part of the IB port L2
> basic counter set which includes about ten more counters to mark
> different kinds of errors, wheres the 64bit counters are only traffic
> counters, so what do you suggest for them? use the same approach for
> the error counters as well even though IB doesn't define 64 bit
> version for them? also zero padding for something which isn't exported
> in hex is very ugly, isn't that?
Whats the problem here? If a 64 bit counter is available then export
it as 64 bit otherwise keep exporting something smaller.
I agree zero padding non-hex numbers isn't ideal. Export as hex?
Broadly, this is another problem with the sysfs interface because the
width matters for any kind of serious data collection, and IBA defined
interesting widths for many of the counters that was flowed right
through the sysfs interface, with no means of discovery.
> > Frankly, exporting these PMA counters as saturate on maximum via sysfs
> > is pretty useless. Does anyone actually use them aside from a few scripts?
>
> under IB our monitorying code/scripts use perfquery/mads wheres under
> IBoE we use sysfs, the mad approach allows to reset the counters, so
> the 32 bit counters aren't useless, reset via sysfs isn't supported so
> the 64 bit counter are kind of must, anyway,
I don't mean the 32 bit counters are useless, I mean exposing PMA
counters that saturate and can be randomly reset by external agents
through sysfs is useless. You can't make any kind of data collection
based on such a system.
Ideally the sysfs counters are all non-saturating, non-resetting
counters like everything else in the net stack. You need a different
interface to the chip firmware to implement this, can't use the
existing PMA stuff.
In the same vien adding saturating but non-resettable PMA-esque
counters for IBoE seems pretty hackish to me.. Though I agree it is
not terribly relevant for 64 bit counters.
> > What would be useful is free running 64 bit sysfs counters that
> > are independent and not reset by PMA activity. Like all the other
> > Linux networking counters. That would be great. I hope that is
> > what is done for IBoE?
>
> yes this is what the 64 bit counters are
IBA defined 64 bit counters are not free-running, they still
saturate. Does the firmware not do this in IBoE mode?
> > Unifying the counters to be semantically the same on IB and IBoE seems
> > like a very good idea.
> yes, this is what we do here
I disagree. Your IBoE counters cannot be reset, externally or
otherwise - aside from the saturating this makes them almost the same
as the usual Linux net counters. When the port is in IB mode the
counter doesn't have those properties.
That is a big semantic difference when compared to what these sysfs
files show for normal IB counters.
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111101183730.GI26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2011-11-01 21:42 ` Roland Dreier
2011-11-01 21:44 ` Or Gerlitz
` (2 more replies)
2011-11-01 21:46 ` Or Gerlitz
1 sibling, 3 replies; 49+ messages in thread
From: Roland Dreier @ 2011-11-01 21:42 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Or Gerlitz, Or Gerlitz, linux-rdma
On Tue, Nov 1, 2011 at 11:37 AM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> Whats the problem here? If a 64 bit counter is available then export
> it as 64 bit otherwise keep exporting something smaller.
>
> I agree zero padding non-hex numbers isn't ideal. Export as hex?
I agree that it definitely is more appealing, if we have a 64-bit
version of a counter, that we should just export that counter
where we used to export the 32-bit version.
But I'm not sure there's a feasible way to do this without
breaking old userspace. For sure we can't assume userspace
can cope with hex where we used to have decimal. And
I don't think it's even a safe assumption that userspace
can cope with a 64-bit quantity where we used to have a
32-bit quantity. It doesn't seem safe to assume that
userspace that used to work with 32-bit quantities can
cope with a 0-padded 20 digit value.
The least bad way forward does seem like it is probably
the separate new directory thing.
- R.
- R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
2011-11-01 21:42 ` Roland Dreier
@ 2011-11-01 21:44 ` Or Gerlitz
2011-11-01 21:52 ` Jason Gunthorpe
2011-11-02 17:16 ` Or Gerlitz
2 siblings, 0 replies; 49+ messages in thread
From: Or Gerlitz @ 2011-11-01 21:44 UTC (permalink / raw)
To: Roland Dreier; +Cc: Jason Gunthorpe, Or Gerlitz, linux-rdma
On Tue, Nov 1, 2011 at 11:42 PM, Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> The least bad way forward does seem like it is probably
> the separate new directory thing.
I agree
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111101183730.GI26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-01 21:42 ` Roland Dreier
@ 2011-11-01 21:46 ` Or Gerlitz
[not found] ` <CAJZOPZ+zmwTdHva5C7k0QhBmEzxSGTa1Ajszz3eSH=WpP3A4fg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
1 sibling, 1 reply; 49+ messages in thread
From: Or Gerlitz @ 2011-11-01 21:46 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Roland Dreier, Or Gerlitz, linux-rdma
Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> I don't mean the 32 bit counters are useless, I mean exposing PMA
> counters that saturate and can be randomly reset by external agents
> through sysfs is useless. You can't make any kind of data collection
> based on such a system.
> Ideally the sysfs counters are all non-saturating, non-resetting
> counters like everything else in the net stack. You need a different
> interface to the chip firmware to implement this, can't use the
> existing PMA stuff.
> In the same vien adding saturating but non-resettable PMA-esque
> counters for IBoE seems pretty hackish to me.. Though I agree it is
> not terribly relevant for 64 bit counters.
Jason,
To put things in place, the IB stack PMA counters aren't resettable
through sysfs, still, under IB, the same counter set is readable
through both mads and sysfs and resettable through mads.
As for the saturation thing, I didn't think about that, but you're
probably right and all the IBA PMA counters are saturating, but as
your comment said, the 64 bit case is practically okay
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <CAJZOPZJCY5ehTGw8htsxw0dJgMXJJAS-=LGJde9ddcMsmhqvMg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 18:37 ` Jason Gunthorpe
@ 2011-11-01 21:49 ` Roland Dreier
[not found] ` <CAG4TOxM4n3=51-3UqYF=iRS25OdRjTXoKDD6u1DukgRrwvTW1g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
1 sibling, 1 reply; 49+ messages in thread
From: Roland Dreier @ 2011-11-01 21:49 UTC (permalink / raw)
To: Or Gerlitz; +Cc: Jason Gunthorpe, Or Gerlitz, linux-rdma
On Tue, Nov 1, 2011 at 11:14 AM, Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Guys (Roland, Jason), I'm open to any comments, any time, for any
> patch, but for a patch which was posted weeks ago it's pretty unfair
> to have your comments coming only eight days after the merge window
> has been opened, lets try to come quickly to decision so I can fix
> this up along those lines,
Let's not get into fairness here... I'm trying to make progress on my backlog
but there are patches that for better or worse have been around for a year
or more. And for this merge window, I did manage to do
84 files changed, 4738 insertions(+), 984 deletions(-)
and include more than 60 patches.
There's no obligation to merge something just because you posted it before
the merge window, and in fact Linus's complaint at the kernel summit is
always that sub-maintainers don't say no enough.
And let's be honest in this specific case: the world is not going to end without
a few performance counters in sysfs.
- R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
2011-11-01 21:42 ` Roland Dreier
2011-11-01 21:44 ` Or Gerlitz
@ 2011-11-01 21:52 ` Jason Gunthorpe
[not found] ` <20111101215200.GN26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-02 17:16 ` Or Gerlitz
2 siblings, 1 reply; 49+ messages in thread
From: Jason Gunthorpe @ 2011-11-01 21:52 UTC (permalink / raw)
To: Roland Dreier; +Cc: Or Gerlitz, Or Gerlitz, linux-rdma
On Tue, Nov 01, 2011 at 02:42:33PM -0700, Roland Dreier wrote:
> I agree that it definitely is more appealing, if we have a 64-bit
> version of a counter, that we should just export that counter
> where we used to export the 32-bit version.
I think this falls under the 'undocumented, beware' API design. This
interface isn't specified so exactly as to have set out how many bytes
are in the files and how many bits are in the numbers.
If you wrote a reader that can't handle a 20 byte integer with leading
zeros then your user space isn't following the API.
If your reader doesn't elegantly handle overflow to whatever type your
reader picked, then you aren't following the API.
There are many examples of the kernel tweaking APIs along this
undocumented axis, and theoritically the text-free-form nature of
sysfs is supposed to save us from having to worry about exactly this
sort of case.
And again, this is a useless interface in IB. IBoE is going to be
first real, serious, long term user, let's make it saner instead of
keeping it as is forever?
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <CAJZOPZ+zmwTdHva5C7k0QhBmEzxSGTa1Ajszz3eSH=WpP3A4fg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-11-01 21:58 ` Jason Gunthorpe
[not found] ` <20111101215843.GO26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Jason Gunthorpe @ 2011-11-01 21:58 UTC (permalink / raw)
To: Or Gerlitz; +Cc: Roland Dreier, Or Gerlitz, linux-rdma
On Tue, Nov 01, 2011 at 11:46:08PM +0200, Or Gerlitz wrote:
> > In the same vien adding saturating but non-resettable PMA-esque
> > counters for IBoE seems pretty hackish to me.. Though I agree it is
> > not terribly relevant for 64 bit counters.
> To put things in place, the IB stack PMA counters aren't resettable
> through sysfs, still, under IB, the same counter set is readable
> through both mads and sysfs and resettable through mads.
Right, the sysfs interface is pretty much unusable for IB. Your work
to make it go on IBoE makes something is very nearly usuable, but you
can't write a tool that collects these counters from a port in IBoE
mode and also expect it to work in IB mode because the semantics are
different.
My argument here is that the semantics we have for the IB case are not
useful. Let us define sane semantics for the IBoE case and have a
longer term clean up to make the IB case follow them as well.
Sane semantics for a sysfs counter are:
- Free-running
- Non-saturating
- No reset
- 64 or 32 bit value, detectable by user-space
No 6 bit counters. No counters that saturate. No counters that
randomly reset.
To this end, I think exporting 64 bit and 32 bit counts of the same
value is not the way to go.
> As for the saturation thing, I didn't think about that, but you're
> probably right and all the IBA PMA counters are saturating, but as
> your comment said, the 64 bit case is practically okay
Will any counters that get exposed when IBoE is turned on not be 64
bits? There are not very many 64 bit PMA counters.
If yes, maybe you should patch to un-export them until things can be
fixed sanely...
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111101215200.GN26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2011-11-01 22:03 ` Ira Weiny
[not found] ` <20111101150358.17d232ee.weiny2-i2BcT+NCU+M@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Ira Weiny @ 2011-11-01 22:03 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Roland Dreier, Or Gerlitz, Or Gerlitz, linux-rdma
On Tue, 1 Nov 2011 14:52:00 -0700
Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Tue, Nov 01, 2011 at 02:42:33PM -0700, Roland Dreier wrote:
>
[snip]
>
> And again, this is a useless interface in IB.
Why do you mean by this?
Ira
> IBoE is going to be
> first real, serious, long term user, let's make it saner instead of
> keeping it as is forever?
>
> Jason
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Ira Weiny
Member of Technical Staff
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111101150358.17d232ee.weiny2-i2BcT+NCU+M@public.gmane.org>
@ 2011-11-01 22:11 ` Jason Gunthorpe
[not found] ` <20111101221135.GP26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Jason Gunthorpe @ 2011-11-01 22:11 UTC (permalink / raw)
To: Ira Weiny; +Cc: Roland Dreier, Or Gerlitz, Or Gerlitz, linux-rdma
On Tue, Nov 01, 2011 at 03:03:58PM -0700, Ira Weiny wrote:
> > And again, this is a useless interface in IB.
>
> Why do you mean by this?
A counter that is randomly reset by an external IB performance manager
is not useful for collecting local statistical information.
A counter that saturates and cannot be reset locally is not useful for
any automatic process.
Why have a sysfs counter at all when you can just ask the PMA and get
exactly the same data?
I can't make a SNMP MIB from this counter. I can't make graphs from
this counter. I can't do anything with it except show it to the user
and hope they understand when it was last reset, or understand what
to do when it is saturated.
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111101221135.GP26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2011-11-01 22:19 ` Ira Weiny
2011-11-01 22:34 ` Or Gerlitz
1 sibling, 0 replies; 49+ messages in thread
From: Ira Weiny @ 2011-11-01 22:19 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Roland Dreier, Or Gerlitz, Or Gerlitz, linux-rdma
On Tue, 1 Nov 2011 15:11:35 -0700
Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Tue, Nov 01, 2011 at 03:03:58PM -0700, Ira Weiny wrote:
>
> > > And again, this is a useless interface in IB.
> >
> > Why do you mean by this?
>
> A counter that is randomly reset by an external IB performance manager
> is not useful for collecting local statistical information.
>
> A counter that saturates and cannot be reset locally is not useful for
> any automatic process.
>
> Why have a sysfs counter at all when you can just ask the PMA and get
> exactly the same data?
I agree, but... sys admins like sysfs files... :-(
>
> I can't make a SNMP MIB from this counter. I can't make graphs from
> this counter. I can't do anything with it except show it to the user
> and hope they understand when it was last reset, or understand what
> to do when it is saturated.
Ok, just making sure we are on the same page.
Ira
>
> Jason
--
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <CAG4TOxM4n3=51-3UqYF=iRS25OdRjTXoKDD6u1DukgRrwvTW1g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-11-01 22:32 ` Or Gerlitz
2011-11-01 22:49 ` Hefty, Sean
1 sibling, 0 replies; 49+ messages in thread
From: Or Gerlitz @ 2011-11-01 22:32 UTC (permalink / raw)
To: Roland Dreier; +Cc: Jason Gunthorpe, Or Gerlitz, linux-rdma
On Tue, Nov 1, 2011 at 11:49 PM, Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> There's no obligation to merge something just because you posted it before
> the merge window, and in fact Linus's complaint at the kernel summit is
> always that sub-maintainers don't say no enough.
>
> And let's be honest in this specific case: the world is not going to end without
> a few performance counters in sysfs.
Agree on all, I just want to see progress here, its okay for this
discussion and its such
to miss this or that merge window, as long as at some point we get
into a resolution which allows to fix a patch and queue it for the
next merge window, e.g in this case, if we miss 3.2-rc1 and you don't
feel the patches are appropriate for -rc2, they can be queued for 3.3
BTW - this brings something I wanted to raise long ago... I would be
happy to see the for-next branch active at all times, e.g in the same
manner net-next is, which means that patches aren't reviewed/accepted
only/mostly in the weeks before the merge window opens but rather
constantly over time. This shouldn't increase the amortized load on
you and reduce the possible (maybe existing to some extent?)
frustration by people who submit patches long before the next window
opens and don't much feedback... (again, not relevant to this patch
and the next)
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111101221135.GP26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-01 22:19 ` Ira Weiny
@ 2011-11-01 22:34 ` Or Gerlitz
[not found] ` <CAJZOPZJh2T5nCsdU9B0anUnoB=SpRD2_yBLT620zEi0Kr12MFg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
1 sibling, 1 reply; 49+ messages in thread
From: Or Gerlitz @ 2011-11-01 22:34 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Ira Weiny, Roland Dreier, Or Gerlitz, linux-rdma
Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> Why have a sysfs counter at all when you can just ask the PMA and get exactly the
> same data?
The HW/FW PMA agent isn't supported for IBoE only for IB, the
counters are for both
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <CAJZOPZJh2T5nCsdU9B0anUnoB=SpRD2_yBLT620zEi0Kr12MFg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-11-01 22:46 ` Ira Weiny
[not found] ` <20111101154627.f3c847dd.weiny2-i2BcT+NCU+M@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Ira Weiny @ 2011-11-01 22:46 UTC (permalink / raw)
To: Or Gerlitz; +Cc: Jason Gunthorpe, Roland Dreier, Or Gerlitz, linux-rdma
On Tue, 1 Nov 2011 15:34:46 -0700
Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> > Why have a sysfs counter at all when you can just ask the PMA and get exactly the
> > same data?
>
> The HW/FW PMA agent isn't supported for IBoE only for IB, the
> counters are for both
What do you mean "for both"?
Ira
>
> Or.
--
Ira Weiny
Member of Technical Staff
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* RE: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <CAG4TOxM4n3=51-3UqYF=iRS25OdRjTXoKDD6u1DukgRrwvTW1g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 22:32 ` Or Gerlitz
@ 2011-11-01 22:49 ` Hefty, Sean
1 sibling, 0 replies; 49+ messages in thread
From: Hefty, Sean @ 2011-11-01 22:49 UTC (permalink / raw)
To: Roland Dreier, Or Gerlitz; +Cc: Jason Gunthorpe, Or Gerlitz, linux-rdma
> Let's not get into fairness here... I'm trying to make progress on my backlog
> but there are patches that for better or worse have been around for a year
> or more.
Along these lines, is there any news on when patchwork might be available again? I've been trying to help review some of the backlog patches, but it's hard to do without seeing them.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111101154627.f3c847dd.weiny2-i2BcT+NCU+M@public.gmane.org>
@ 2011-11-02 7:38 ` Or Gerlitz
0 siblings, 0 replies; 49+ messages in thread
From: Or Gerlitz @ 2011-11-02 7:38 UTC (permalink / raw)
To: Ira Weiny; +Cc: Jason Gunthorpe, Roland Dreier, linux-rdma
On 11/2/2011 12:46 AM, Ira Weiny wrote:
> What do you mean "for both"? Ira
The sysfs PMA counters are functional for both IB and IBoE, the latter uses
a HW counter allocated per device/port for which all the QPs created on
that
port are reporting their rx/tx bytes/packets, see commits
cfcde11c3d7ae175f49280bb6f913478c2f1bd8c and
c37791349cc79d025df6e9a4f896a7b0a97cdbd3
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111101215843.GO26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2011-11-02 8:27 ` Or Gerlitz
0 siblings, 0 replies; 49+ messages in thread
From: Or Gerlitz @ 2011-11-02 8:27 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: linux-rdma
On 11/1/2011 11:58 PM, Jason Gunthorpe wrote:
> maybe you should patch to un-export them until things can be fixed
> sanely...
Jason,
I don't think we want to work this way, bug are either opened or
attempted to be fixed or fixed.
You can't just come out of the the blue and remove functionality which
is buggy (BTW - to
your taste, I don't agree with your conclusive comments).
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
2011-11-01 21:42 ` Roland Dreier
2011-11-01 21:44 ` Or Gerlitz
2011-11-01 21:52 ` Jason Gunthorpe
@ 2011-11-02 17:16 ` Or Gerlitz
[not found] ` <4EB17AF9.3000608-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2 siblings, 1 reply; 49+ messages in thread
From: Or Gerlitz @ 2011-11-02 17:16 UTC (permalink / raw)
To: Roland Dreier; +Cc: Jason Gunthorpe, linux-rdma
On 11/1/2011 11:42 PM, Roland Dreier wrote:
> The least bad way forward does seem like it is probably the separate
> new directory thing
Hi Roland,
I suggest we go that least bad way along the lines of your comment.
If/when on some future point something constructive can be formed from
Jason's observations, changes will follow, agree?
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <4EB17AF9.3000608-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2011-11-08 0:54 ` Roland Dreier
[not found] ` <CAG4TOxOYAnuB8oKkJgr9_6EpF+YR-h9WNak+gSWwpV0Nz-LeZg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Roland Dreier @ 2011-11-08 0:54 UTC (permalink / raw)
To: Or Gerlitz; +Cc: Jason Gunthorpe, linux-rdma
On Wed, Nov 2, 2011 at 10:16 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> I suggest we go that least bad way along the lines of your comment.
>
> If/when on some future point something constructive can be formed from
> Jason's observations, changes will follow, agree?
Let's make sure we learn from our mistakes. Let's say we create a new
"ext_counters"
directory. What should the format of those files be? Should they be
assumed to be
64-bit quantities? Do we want to allow some way of indicating the
number of bits
(ie 0-padded hex entries?)?
- R>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <CAG4TOxOYAnuB8oKkJgr9_6EpF+YR-h9WNak+gSWwpV0Nz-LeZg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-11-08 1:09 ` Jason Gunthorpe
[not found] ` <20111108010952.GC4934-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-12-22 17:09 ` Or Gerlitz
1 sibling, 1 reply; 49+ messages in thread
From: Jason Gunthorpe @ 2011-11-08 1:09 UTC (permalink / raw)
To: Roland Dreier; +Cc: Or Gerlitz, linux-rdma
On Mon, Nov 07, 2011 at 04:54:42PM -0800, Roland Dreier wrote:
> On Wed, Nov 2, 2011 at 10:16 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> > I suggest we go that least bad way along the lines of your comment.
> >
> > If/when on some future point something constructive can be formed from
> > Jason's observations, changes will follow, agree?
>
> Let's make sure we learn from our mistakes. Let's say we create a
> new "ext_counters" directory. What should the format of those files
> be? Should they be assumed to be 64-bit quantities? Do we want to
> allow some way of indicating the number of bits (ie 0-padded hex
> entries?)?
That is a good idea. Let's require counters_ext to be sane:
1 Hex quantity of unspecified size
2 Prefixed with 0x and leading zeros to fill out to size and allow
userspace discovery of size
3 Size must be a multiple of 4 bits.
4 Counters do not saturate
5 Counters wrap around at all F's back to 0.
6 If the counter is resettable it is only via a local operation
through netlink or sysfs or something. Not PMA reset.
Certainly, aside from some minor details and different string
formatting, the 64 bit counters Or proposes to add meet these
requirements when the port is used in Ethernet mode.
How do you feel about having counters_ext appear in ethernet mode and
disappear in IB mode?
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111108010952.GC4934-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2011-11-08 16:06 ` Ira Weiny
[not found] ` <20111108080606.7c5dd62c.weiny2-i2BcT+NCU+M@public.gmane.org>
2011-12-20 12:03 ` Or Gerlitz
1 sibling, 1 reply; 49+ messages in thread
From: Ira Weiny @ 2011-11-08 16:06 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Roland Dreier, Or Gerlitz, linux-rdma
On Mon, 7 Nov 2011 17:09:52 -0800
Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Mon, Nov 07, 2011 at 04:54:42PM -0800, Roland Dreier wrote:
> > On Wed, Nov 2, 2011 at 10:16 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> > > I suggest we go that least bad way along the lines of your comment.
> > >
> > > If/when on some future point something constructive can be formed from
> > > Jason's observations, changes will follow, agree?
> >
> > Let's make sure we learn from our mistakes. Let's say we create a
> > new "ext_counters" directory. What should the format of those files
> > be? Should they be assumed to be 64-bit quantities? Do we want to
> > allow some way of indicating the number of bits (ie 0-padded hex
> > entries?)?
>
> That is a good idea. Let's require counters_ext to be sane:
>
> 1 Hex quantity of unspecified size
> 2 Prefixed with 0x and leading zeros to fill out to size and allow
> userspace discovery of size
> 3 Size must be a multiple of 4 bits.
> 4 Counters do not saturate
> 5 Counters wrap around at all F's back to 0.
> 6 If the counter is resettable it is only via a local operation
> through netlink or sysfs or something. Not PMA reset.
>
> Certainly, aside from some minor details and different string
> formatting, the 64 bit counters Or proposes to add meet these
> requirements when the port is used in Ethernet mode.
>
> How do you feel about having counters_ext appear in ethernet mode and
> disappear in IB mode?
That might get complicated with ports which can be either mode.
Ira
>
> Jason
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Ira Weiny
Member of Technical Staff
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111108080606.7c5dd62c.weiny2-i2BcT+NCU+M@public.gmane.org>
@ 2011-12-20 11:49 ` Or Gerlitz
2011-12-20 11:49 ` Or Gerlitz
1 sibling, 0 replies; 49+ messages in thread
From: Or Gerlitz @ 2011-12-20 11:49 UTC (permalink / raw)
To: Ira Weiny; +Cc: Jason Gunthorpe, Roland Dreier, linux-rdma
Ira Weiny wrote:
> Jason Gunthorpe<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> How do you feel about having counters_ext appear in ethernet mode and
> disappear in IB mode?
>
> That might get complicated with ports which can be either mode.
Ira (reviving this thread),
At the ib core level, the link layer sysfs attribute is read only.
At the mlx4 VPI level support, a port has given link layer at certain
point of time, and further, AFAIK, this isn't something that can be
changed in the life-cycle of the ib device exported by mlx4_ib, which is
deleted/re-added when the port link layer change, see commit
7ff93f8b7ecbc36e7ffc5c11a61643821c1bfee5 which states " When the type of
a port is changed, all mlx4 interfaces are unregistered, and then
registered again with the new port types"
Still, this patch makes the decision when the ib device is added, so you
have a point here w.r.t to future devices which could change their link
layer at run time, I'll look on that.
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111108080606.7c5dd62c.weiny2-i2BcT+NCU+M@public.gmane.org>
2011-12-20 11:49 ` Or Gerlitz
@ 2011-12-20 11:49 ` Or Gerlitz
1 sibling, 0 replies; 49+ messages in thread
From: Or Gerlitz @ 2011-12-20 11:49 UTC (permalink / raw)
To: Ira Weiny; +Cc: Jason Gunthorpe, Roland Dreier, linux-rdma
Ira Weiny wrote:
> Jason Gunthorpe<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> How do you feel about having counters_ext appear in ethernet mode and
> disappear in IB mode?
>
> That might get complicated with ports which can be either mode.
Ira (reviving this thread),
At the ib core level, the link layer sysfs attribute is read only.
At the mlx4 VPI level support, a port has given link layer at certain
point of time, and further, AFAIK, this isn't something that can be
changed in the life-cycle of the ib device exported by mlx4_ib, which is
deleted/re-added when the port link layer change, see commit
7ff93f8b7ecbc36e7ffc5c11a61643821c1bfee5 which states " When the type of
a port is changed, all mlx4 interfaces are unregistered, and then
registered again with the new port types"
Still, this patch makes the decision when the ib device is added, so you
have a point here w.r.t to future devices which could change their link
layer at run time, I'll look on that.
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111108010952.GC4934-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-08 16:06 ` Ira Weiny
@ 2011-12-20 12:03 ` Or Gerlitz
[not found] ` <4EF07993.2090900-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
1 sibling, 1 reply; 49+ messages in thread
From: Or Gerlitz @ 2011-12-20 12:03 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Roland Dreier, linux-rdma
On 11/8/2011 3:09 AM, Jason Gunthorpe wrote:
> Roland Dreier wrote:
>>
>> Let's make sure we learn from our mistakes. Let's say we create a
>> new "ext_counters" directory. What should the format of those files
>> be? Should they be assumed to be 64-bit quantities? Do we want to
>> allow some way of indicating the number of bits (ie 0-padded hex
>> entries?)?
>
> That is a good idea. Let's require counters_ext to be sane:
>
> 1 Hex quantity of unspecified size
> 2 Prefixed with 0x and leading zeros to fill out to size and allow
> userspace discovery of size
> 3 Size must be a multiple of 4 bits.
> 4 Counters do not saturate
> 5 Counters wrap around at all F's back to 0.
> 6 If the counter is resettable it is only via a local operation
> through netlink or sysfs or something. Not PMA reset.
>
> Certainly, aside from some minor details and different string
> formatting, the 64 bit counters Or proposes to add meet these
> requirements when the port is used in Ethernet mode.
Jason, in an earlier post of yours you mentioned the netdev counters as
something we should be following, well, I took a look - the counters are
read-only (see /sys/class/net/$DEV/statistics/*) they are displayed in
decimal/non-padded manner (see /sys/class/net/$DEV/statistics/* or
/proc/net/dev or ifconfig and friends)
Roland, I'd like to re-format these patches for 3.3, will be glad to
hear what you would like to see here,
> How do you feel about having counters_ext appear in ethernet mode and
> disappear in IB mode?
I think I'm fine with that, IB has its own MAD based means to query these counters.
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <4EF07993.2090900-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2011-12-20 17:46 ` Jason Gunthorpe
[not found] ` <20111220174639.GE25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Jason Gunthorpe @ 2011-12-20 17:46 UTC (permalink / raw)
To: Or Gerlitz; +Cc: Roland Dreier, linux-rdma
On Tue, Dec 20, 2011 at 02:03:31PM +0200, Or Gerlitz wrote:
> >That is a good idea. Let's require counters_ext to be sane:
> >
> > 1 Hex quantity of unspecified size
> > 2 Prefixed with 0x and leading zeros to fill out to size and allow
> > userspace discovery of size
> > 3 Size must be a multiple of 4 bits.
> > 4 Counters do not saturate
> > 5 Counters wrap around at all F's back to 0.
> > 6 If the counter is resettable it is only via a local operation
> > through netlink or sysfs or something. Not PMA reset.
> >
> >Certainly, aside from some minor details and different string
> >formatting, the 64 bit counters Or proposes to add meet these
> >requirements when the port is used in Ethernet mode.
>
> Jason, in an earlier post of yours you mentioned the netdev counters
> as something we should be following, well, I took a look - the
> counters are read-only (see /sys/class/net/$DEV/statistics/*) they
> are displayed in decimal/non-padded manner (see
> /sys/class/net/$DEV/statistics/* or /proc/net/dev or ifconfig and
> friends)
The netdev counters are all the same size and there is some other way
to discover what the size is. I'd like to see that for IB counters
too, but it is probably infeasible. So if we have counters that are
not the same size as netdev counters then some kind of size indicator
is required for sane userspace.
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111220174639.GE25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2011-12-20 19:40 ` Or Gerlitz
[not found] ` <CAJZOPZ+rBKck50kmyhLZ3VNLNmoS_K6cA3Pj3L6XQjcBQo6DjA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Or Gerlitz @ 2011-12-20 19:40 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Or Gerlitz, Roland Dreier, linux-rdma
Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> The netdev counters are all the same size and there is some other way
> to discover what the size is. I'd like to see that for IB counters
> too, but it is probably infeasible. So if we have counters that are
> not the same size as netdev counters then some kind of size indicator
> is required for sane userspace.
We're talking now only on the IB extended counters who are all 64 bits
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <CAJZOPZ+rBKck50kmyhLZ3VNLNmoS_K6cA3Pj3L6XQjcBQo6DjA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-12-20 19:50 ` Jason Gunthorpe
[not found] ` <20111220195014.GH25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Jason Gunthorpe @ 2011-12-20 19:50 UTC (permalink / raw)
To: Or Gerlitz; +Cc: Or Gerlitz, Roland Dreier, linux-rdma
On Tue, Dec 20, 2011 at 09:40:09PM +0200, Or Gerlitz wrote:
> Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> > The netdev counters are all the same size and there is some other way
> > to discover what the size is. I'd like to see that for IB counters
> > too, but it is probably infeasible. So if we have counters that are
> > not the same size as netdev counters then some kind of size indicator
> > is required for sane userspace.
>
> We're talking now only on the IB extended counters who are all 64 bits
netdev counters are 32 bit or 64 bit, depending on how the kernel was
compiled. I think indicating the size explicitly, or always being 64
bit (and extending all the lessor counters in future) is the way to go
for IB..
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111220195014.GH25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2011-12-20 19:56 ` Or Gerlitz
2012-01-04 19:29 ` Or Gerlitz
1 sibling, 0 replies; 49+ messages in thread
From: Or Gerlitz @ 2011-12-20 19:56 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Or Gerlitz, Roland Dreier, linux-rdma
Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
>> We're talking now only on the IB extended counters who are all 64 bits
> netdev counters are 32 bit or 64 bit, depending on how the kernel was
> compiled. I think indicating the size explicitly, or always being 64
> bit (and extending all the lessor counters in future) is the way to go for IB..
So for the extended counters which we are dealing with now, and who
are all 64bit, anything I should change in my patches?
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <CAG4TOxOYAnuB8oKkJgr9_6EpF+YR-h9WNak+gSWwpV0Nz-LeZg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-08 1:09 ` Jason Gunthorpe
@ 2011-12-22 17:09 ` Or Gerlitz
1 sibling, 0 replies; 49+ messages in thread
From: Or Gerlitz @ 2011-12-22 17:09 UTC (permalink / raw)
To: Roland Dreier; +Cc: Jason Gunthorpe, linux-rdma
On 11/8/2011 2:54 AM, Roland Dreier wrote:
> Let's make sure we learn from our mistakes. Let's say we create a new
> "ext_counters" directory. What should the format of those files be?
> Should they be assumed to be 64-bit quantities? Do we want to allow
> some way of indicating the number of bits (ie 0-padded hex entries?)?
Hi Roland,Jason - again, I'd like to re-submit that for kernel 3.3 - if
there's anything specific you think need to change, could you please
comment? thanks
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs
[not found] ` <20111220195014.GH25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-12-20 19:56 ` Or Gerlitz
@ 2012-01-04 19:29 ` Or Gerlitz
1 sibling, 0 replies; 49+ messages in thread
From: Or Gerlitz @ 2012-01-04 19:29 UTC (permalink / raw)
To: Roland Dreier, Jason Gunthorpe; +Cc: linux-rdma
On Tue, Dec 20, 2011 Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Tue, Dec 20, 2011 Or Gerlitz wrote:
>> Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
>> > The netdev counters are all the same size and there is some other way
>> > to discover what the size is. I'd like to see that for IB counters
>> > too, but it is probably infeasible. So if we have counters that are
>> > not the same size as netdev counters then some kind of size indicator
>> > is required for sane userspace.
>> We're talking now only on the IB extended counters who are all 64 bits
> netdev counters are 32 bit or 64 bit, depending on how the kernel was
> compiled. I think indicating the size explicitly, or always being 64
> bit (and extending all the lessor counters in future) is the way to go for IB..
Roland/Jason,
Any concrete preference here? I'd like to fix the patch so it can go
into 3.3-rc1
* do we want that directory to be present only when the port link
type is Ethernet (I assume the ib device will be re-created across
link type change as the per port HW elements need to be
re-initialized).
* decimal display as all the network counters and the other IB counters are?
* could clarify "indicating the size explicitly"
* "always being 64 bit" applies to hex decimal only, I guess
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread
end of thread, other threads:[~2012-01-04 19:29 UTC | newest]
Thread overview: 49+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-10 8:51 [PATCH for-3.2] IBoE fixes Or Gerlitz
[not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
2011-10-10 8:53 ` [PATCH 1/5] ib/mlx4: enable 4K mtu for IBoE Or Gerlitz
2011-10-10 8:54 ` [PATCH 2/5] ib/mlx4: remove setting of vlan in IBoE WQEs control segment Or Gerlitz
2011-10-10 8:55 ` [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size Or Gerlitz
[not found] ` <alpine.LRH.2.00.1110101055050.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
2011-10-10 17:00 ` Roland Dreier
[not found] ` <CAG4TOxNaQicA6ExuNsw8V95mJPD+AQX5Wrfg+rCoBxdr2E+2ww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-10-11 7:24 ` Or Gerlitz
[not found] ` <4E93EF38.5040707-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2011-10-11 17:28 ` Roland Dreier
[not found] ` <CAL1RGDVk2EOmgWU_iZQWQqKJFPNYN4o_5_14Qbr_6wUzTJ1bOg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-10-11 19:18 ` Or Gerlitz
[not found] ` <CAJZOPZJnzUm2M1zm1564TUwKCN3r=y3g-4pwTL4bJPDo2V4s9w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-10-11 19:22 ` Roland Dreier
2011-10-16 8:32 ` Or Gerlitz
2011-10-16 8:26 ` [PATCH V1 3/5] net/mlx4_core: deprecate " Or Gerlitz
2011-10-10 8:56 ` [PATCH 4/5] ib/core: add support for extended performance counters in sysfs Or Gerlitz
[not found] ` <alpine.LRH.2.00.1110101055570.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
2011-10-31 19:38 ` Roland Dreier
[not found] ` <CAL1RGDXdkiZNtAsnqDtgbCJGLXLFvk8s-YA5mR189kmg6FkabQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 6:40 ` Or Gerlitz
[not found] ` <CAJZOPZ+K3u+u5XutGfBhKh3DaBpNRKV_bw67CLK6jy3060OHqA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 16:22 ` Jason Gunthorpe
[not found] ` <20111101162218.GA10710-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-01 17:23 ` Or Gerlitz
[not found] ` <CAJZOPZLXZL=vEDdXE3rE5Jzw840SrdY=73M7QgKSNzOysRE6MQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 17:36 ` Jason Gunthorpe
[not found] ` <20111101173625.GH26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-01 18:14 ` Or Gerlitz
[not found] ` <CAJZOPZJCY5ehTGw8htsxw0dJgMXJJAS-=LGJde9ddcMsmhqvMg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 18:37 ` Jason Gunthorpe
[not found] ` <20111101183730.GI26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-01 21:42 ` Roland Dreier
2011-11-01 21:44 ` Or Gerlitz
2011-11-01 21:52 ` Jason Gunthorpe
[not found] ` <20111101215200.GN26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-01 22:03 ` Ira Weiny
[not found] ` <20111101150358.17d232ee.weiny2-i2BcT+NCU+M@public.gmane.org>
2011-11-01 22:11 ` Jason Gunthorpe
[not found] ` <20111101221135.GP26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-01 22:19 ` Ira Weiny
2011-11-01 22:34 ` Or Gerlitz
[not found] ` <CAJZOPZJh2T5nCsdU9B0anUnoB=SpRD2_yBLT620zEi0Kr12MFg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 22:46 ` Ira Weiny
[not found] ` <20111101154627.f3c847dd.weiny2-i2BcT+NCU+M@public.gmane.org>
2011-11-02 7:38 ` Or Gerlitz
2011-11-02 17:16 ` Or Gerlitz
[not found] ` <4EB17AF9.3000608-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2011-11-08 0:54 ` Roland Dreier
[not found] ` <CAG4TOxOYAnuB8oKkJgr9_6EpF+YR-h9WNak+gSWwpV0Nz-LeZg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-08 1:09 ` Jason Gunthorpe
[not found] ` <20111108010952.GC4934-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-08 16:06 ` Ira Weiny
[not found] ` <20111108080606.7c5dd62c.weiny2-i2BcT+NCU+M@public.gmane.org>
2011-12-20 11:49 ` Or Gerlitz
2011-12-20 11:49 ` Or Gerlitz
2011-12-20 12:03 ` Or Gerlitz
[not found] ` <4EF07993.2090900-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2011-12-20 17:46 ` Jason Gunthorpe
[not found] ` <20111220174639.GE25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-12-20 19:40 ` Or Gerlitz
[not found] ` <CAJZOPZ+rBKck50kmyhLZ3VNLNmoS_K6cA3Pj3L6XQjcBQo6DjA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-12-20 19:50 ` Jason Gunthorpe
[not found] ` <20111220195014.GH25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-12-20 19:56 ` Or Gerlitz
2012-01-04 19:29 ` Or Gerlitz
2011-12-22 17:09 ` Or Gerlitz
2011-11-01 21:46 ` Or Gerlitz
[not found] ` <CAJZOPZ+zmwTdHva5C7k0QhBmEzxSGTa1Ajszz3eSH=WpP3A4fg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 21:58 ` Jason Gunthorpe
[not found] ` <20111101215843.GO26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-02 8:27 ` Or Gerlitz
2011-11-01 21:49 ` Roland Dreier
[not found] ` <CAG4TOxM4n3=51-3UqYF=iRS25OdRjTXoKDD6u1DukgRrwvTW1g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 22:32 ` Or Gerlitz
2011-11-01 22:49 ` Hefty, Sean
2011-10-10 8:57 ` [PATCH 5/5] ib/mlx4: added support for extended PMA counters under IBoE Or Gerlitz
2011-10-26 7:04 ` [PATCH for-3.2] IBoE fixes Or Gerlitz
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox