* [PATCH for-3.2] IBoE fixes
@ 2011-10-10 8:51 Or Gerlitz
[not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
0 siblings, 1 reply; 49+ messages in thread
From: Or Gerlitz @ 2011-10-10 8:51 UTC (permalink / raw)
To: Roland Dreier; +Cc: linux-rdma
Roland,
This is a batch of IBoE related fixes plus addition of extended PMA counters.
Or.
Or Gerlitz (5):
ib/mlx4: enable 4K MTU for IBoE
ib/mlx4: remove setting of vlan in IBoE WQEs control segment
net/mlx4_core: remove module param controlling the vlan table size
ib/core: add support for extended performance counters in sysfs
ib/mlx4: add support for extended PMA counters under IBoE
drivers/infiniband/core/sysfs.c | 100 ++++++++++++++++++++++++++++++++++++-
drivers/infiniband/hw/mlx4/mad.c | 49 +++++++++++++++---
drivers/infiniband/hw/mlx4/main.c | 2 +-
drivers/infiniband/hw/mlx4/qp.c | 11 +---
drivers/net/mlx4/main.c | 12 +---
include/rdma/ib_pma.h | 1 +
6 files changed, 148 insertions(+), 27 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 49+ messages in thread[parent not found: <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>]
* [PATCH 1/5] ib/mlx4: enable 4K mtu for IBoE [not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org> @ 2011-10-10 8:53 ` Or Gerlitz 2011-10-10 8:54 ` [PATCH 2/5] ib/mlx4: remove setting of vlan in IBoE WQEs control segment Or Gerlitz ` (4 subsequent siblings) 5 siblings, 0 replies; 49+ messages in thread From: Or Gerlitz @ 2011-10-10 8:53 UTC (permalink / raw) To: Roland Dreier; +Cc: linux-rdma The IBoE port mtu is derived from the corresponding EN netdevice mtu, which can go upto supporting Jumbo-Frames of 9KB and hence surely supports the max IB mtu of 4K Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> --- drivers/infiniband/hw/mlx4/main.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index fa643f4..e953cf9 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -227,7 +227,7 @@ static int eth_link_query_port(struct ib_device *ibdev, u8 port, props->pkey_tbl_len = 1; props->bad_pkey_cntr = be16_to_cpup((__be16 *) (out_mad->data + 46)); props->qkey_viol_cntr = be16_to_cpup((__be16 *) (out_mad->data + 48)); - props->max_mtu = IB_MTU_2048; + props->max_mtu = IB_MTU_4096; props->subnet_timeout = 0; props->max_vl_num = out_mad->data[37] >> 4; props->init_type_reply = 0; -- 1.7.4.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 49+ messages in thread
* [PATCH 2/5] ib/mlx4: remove setting of vlan in IBoE WQEs control segment [not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org> 2011-10-10 8:53 ` [PATCH 1/5] ib/mlx4: enable 4K mtu for IBoE Or Gerlitz @ 2011-10-10 8:54 ` Or Gerlitz 2011-10-10 8:55 ` [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size Or Gerlitz ` (3 subsequent siblings) 5 siblings, 0 replies; 49+ messages in thread From: Or Gerlitz @ 2011-10-10 8:54 UTC (permalink / raw) To: Roland Dreier; +Cc: linux-rdma There's no need to set the vlan related fields in an IBoE send WQE control segment. This b/c the vlan to be used by a UD QP is specified through the the datagram segment, and with GSI (CM) QP, all the headers down to the 8021q and mac ones are built by the software anyway. Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> --- drivers/infiniband/hw/mlx4/qp.c | 11 ++--------- 1 files changed, 2 insertions(+), 9 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 3a91d9d..819b1a5 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -1547,14 +1547,13 @@ static void set_masked_atomic_seg(struct mlx4_wqe_masked_atomic_seg *aseg, } static void set_datagram_seg(struct mlx4_wqe_datagram_seg *dseg, - struct ib_send_wr *wr, __be16 *vlan) + struct ib_send_wr *wr) { memcpy(dseg->av, &to_mah(wr->wr.ud.ah)->av, sizeof (struct mlx4_av)); dseg->dqpn = cpu_to_be32(wr->wr.ud.remote_qpn); dseg->qkey = cpu_to_be32(wr->wr.ud.remote_qkey); dseg->vlan = to_mah(wr->wr.ud.ah)->av.eth.vlan; memcpy(dseg->mac, to_mah(wr->wr.ud.ah)->av.eth.mac, 6); - *vlan = dseg->vlan; } static void set_mlx_icrc_seg(void *dseg) @@ -1657,7 +1656,6 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, __be32 uninitialized_var(lso_hdr_sz); __be32 blh; int i; - __be16 vlan = cpu_to_be16(0xffff); spin_lock_irqsave(&qp->sq.lock, flags); @@ -1761,7 +1759,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, break; case IB_QPT_UD: - set_datagram_seg(wqe, wr, &vlan); + set_datagram_seg(wqe, wr); wqe += sizeof (struct mlx4_wqe_datagram_seg); size += sizeof (struct mlx4_wqe_datagram_seg) / 16; @@ -1824,11 +1822,6 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, ctrl->fence_size = (wr->send_flags & IB_SEND_FENCE ? MLX4_WQE_CTRL_FENCE : 0) | size; - if (be16_to_cpu(vlan) < 0x1000) { - ctrl->ins_vlan = 1 << 6; - ctrl->vlan_tag = vlan; - } - /* * Make sure descriptor is fully written before * setting ownership bit (because HW can start -- 1.7.4.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 49+ messages in thread
* [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size [not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org> 2011-10-10 8:53 ` [PATCH 1/5] ib/mlx4: enable 4K mtu for IBoE Or Gerlitz 2011-10-10 8:54 ` [PATCH 2/5] ib/mlx4: remove setting of vlan in IBoE WQEs control segment Or Gerlitz @ 2011-10-10 8:55 ` Or Gerlitz [not found] ` <alpine.LRH.2.00.1110101055050.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org> 2011-10-10 8:56 ` [PATCH 4/5] ib/core: add support for extended performance counters in sysfs Or Gerlitz ` (2 subsequent siblings) 5 siblings, 1 reply; 49+ messages in thread From: Or Gerlitz @ 2011-10-10 8:55 UTC (permalink / raw) To: Roland Dreier; +Cc: linux-rdma Enable the maximum size (128) supported by the device for the shadow vlans table, without a module param to override it. This table is only used by the IBoE control plane for setting a vlan index into an RC/UC QP context or UD Address Handle. Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> --- drivers/net/mlx4/main.c | 12 +++--------- 1 files changed, 3 insertions(+), 9 deletions(-) diff --git a/drivers/net/mlx4/main.c b/drivers/net/mlx4/main.c index f0ee35d..0285c8a 100644 --- a/drivers/net/mlx4/main.c +++ b/drivers/net/mlx4/main.c @@ -93,9 +93,8 @@ static int log_num_mac = 2; module_param_named(log_num_mac, log_num_mac, int, 0444); MODULE_PARM_DESC(log_num_mac, "Log2 max number of MACs per ETH port (1-7)"); -static int log_num_vlan; -module_param_named(log_num_vlan, log_num_vlan, int, 0444); -MODULE_PARM_DESC(log_num_vlan, "Log2 max number of VLANs per ETH port (0-7)"); +/* Log2 max number of VLANs per ETH port (0-7) */ +#define MLX4_LOG_NUM_VLANS 7 static int use_prio; module_param_named(use_prio, use_prio, bool, 0444); @@ -230,7 +229,7 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev->caps.max_gso_sz = dev_cap->max_gso_sz; dev->caps.log_num_macs = log_num_mac; - dev->caps.log_num_vlans = log_num_vlan; + dev->caps.log_num_vlans = MLX4_LOG_NUM_VLANS; dev->caps.log_num_prios = use_prio ? 3 : 0; for (i = 1; i <= dev->caps.num_ports; ++i) { @@ -1489,11 +1488,6 @@ static int __init mlx4_verify_params(void) return -1; } - if ((log_num_vlan < 0) || (log_num_vlan > 7)) { - pr_warning("mlx4_core: bad num_vlan: %d\n", log_num_vlan); - return -1; - } - if ((log_mtts_per_seg < 1) || (log_mtts_per_seg > 7)) { pr_warning("mlx4_core: bad log_mtts_per_seg: %d\n", log_mtts_per_seg); return -1; -- 1.7.4.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 49+ messages in thread
[parent not found: <alpine.LRH.2.00.1110101055050.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>]
* Re: [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size [not found] ` <alpine.LRH.2.00.1110101055050.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org> @ 2011-10-10 17:00 ` Roland Dreier [not found] ` <CAG4TOxNaQicA6ExuNsw8V95mJPD+AQX5Wrfg+rCoBxdr2E+2ww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2011-10-16 8:26 ` [PATCH V1 3/5] net/mlx4_core: deprecate " Or Gerlitz 1 sibling, 1 reply; 49+ messages in thread From: Roland Dreier @ 2011-10-10 17:00 UTC (permalink / raw) To: Or Gerlitz; +Cc: linux-rdma > -static int log_num_vlan; > -module_param_named(log_num_vlan, log_num_vlan, int, 0444); > -MODULE_PARM_DESC(log_num_vlan, "Log2 max number of VLANs per ETH port (0-7)"); Can we really do this? Doesn't this break anyone who has this option configured in modprobe.conf? This is one reason why I hate adding module parameters -- you can't just delete them later like this. - R. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <CAG4TOxNaQicA6ExuNsw8V95mJPD+AQX5Wrfg+rCoBxdr2E+2ww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size [not found] ` <CAG4TOxNaQicA6ExuNsw8V95mJPD+AQX5Wrfg+rCoBxdr2E+2ww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2011-10-11 7:24 ` Or Gerlitz [not found] ` <4E93EF38.5040707-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Or Gerlitz @ 2011-10-11 7:24 UTC (permalink / raw) To: Roland Dreier; +Cc: linux-rdma On 10/10/2011 7:00 PM, Roland Dreier wrote: > Can we really do this? Doesn't this break anyone who has this option > configured in modprobe.conf? > This is one reason why I hate adding module parameters -- you can't > just delete them later like this Yes, I now understand even better why you hate adding module params... specifically here, this table relates only to IBoE users, which is relatively new (in since 2.6.37) and didn't hit yet most (if not all) of the distros, so I don't expect anything to break, makes sense? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <4E93EF38.5040707-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size [not found] ` <4E93EF38.5040707-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2011-10-11 17:28 ` Roland Dreier [not found] ` <CAL1RGDVk2EOmgWU_iZQWQqKJFPNYN4o_5_14Qbr_6wUzTJ1bOg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Roland Dreier @ 2011-10-11 17:28 UTC (permalink / raw) To: Or Gerlitz; +Cc: linux-rdma On Tue, Oct 11, 2011 at 12:24 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote: > Yes, I now understand even better why you hate adding module params... > specifically here, this table relates > only to IBoE users, which is relatively new (in since 2.6.37) and didn't hit > yet most (if not all) of the > distros, so I don't expect anything to break, makes sense? No, I think we need to switch to ignoring the parameter and printing a warning if someone sets it, and then we can remove it in a year or so. - R. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <CAL1RGDVk2EOmgWU_iZQWQqKJFPNYN4o_5_14Qbr_6wUzTJ1bOg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size [not found] ` <CAL1RGDVk2EOmgWU_iZQWQqKJFPNYN4o_5_14Qbr_6wUzTJ1bOg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2011-10-11 19:18 ` Or Gerlitz [not found] ` <CAJZOPZJnzUm2M1zm1564TUwKCN3r=y3g-4pwTL4bJPDo2V4s9w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2011-10-16 8:32 ` Or Gerlitz 1 sibling, 1 reply; 49+ messages in thread From: Or Gerlitz @ 2011-10-11 19:18 UTC (permalink / raw) To: Roland Dreier; +Cc: Or Gerlitz, linux-rdma On Tue, Oct 11, 2011 at 7:28 PM, Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > No, I think we need to switch to ignoring the parameter and printing a > warning if someone sets it, and then we can remove it in a year or so. no problem, I don't think I will be able to send such patch before Sunday, so in spite of the merge window coming soon and as the rest of the series isn't dependant on that patch, if the other patches are okay, they can be picked before this one is fixed. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <CAJZOPZJnzUm2M1zm1564TUwKCN3r=y3g-4pwTL4bJPDo2V4s9w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size [not found] ` <CAJZOPZJnzUm2M1zm1564TUwKCN3r=y3g-4pwTL4bJPDo2V4s9w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2011-10-11 19:22 ` Roland Dreier 0 siblings, 0 replies; 49+ messages in thread From: Roland Dreier @ 2011-10-11 19:22 UTC (permalink / raw) To: Or Gerlitz; +Cc: Or Gerlitz, linux-rdma On Tue, Oct 11, 2011 at 12:18 PM, Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > no problem, I don't think I will be able to send such patch before > Sunday, so in spite of the merge window coming soon and as the rest of > the series isn't dependant on that patch, if the other patches are > okay, they can be picked before this one is fixed. Sure. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size [not found] ` <CAL1RGDVk2EOmgWU_iZQWQqKJFPNYN4o_5_14Qbr_6wUzTJ1bOg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2011-10-11 19:18 ` Or Gerlitz @ 2011-10-16 8:32 ` Or Gerlitz 1 sibling, 0 replies; 49+ messages in thread From: Or Gerlitz @ 2011-10-16 8:32 UTC (permalink / raw) To: Roland Dreier; +Cc: linux-rdma On 10/11/2011 7:28 PM, Roland Dreier wrote: > No, I think we need to switch to ignoring the parameter and printing a > warning if someone sets it, and then we can remove it in a year or so done, I sent you V1 of patch 3/5 which fixes that, any other comment for the rest of the series? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
* [PATCH V1 3/5] net/mlx4_core: deprecate module param controlling the vlan table size [not found] ` <alpine.LRH.2.00.1110101055050.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org> 2011-10-10 17:00 ` Roland Dreier @ 2011-10-16 8:26 ` Or Gerlitz 1 sibling, 0 replies; 49+ messages in thread From: Or Gerlitz @ 2011-10-16 8:26 UTC (permalink / raw) To: Roland Dreier; +Cc: linux-rdma Enable the maximum size (128) supported by the device for the shadow vlans table, igonring the module param which could override it. This table is only used by the IBoE control plane for setting a vlan index into an RC/UC QP context or UD Address Handle. Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> --- changes from V0: - deprecate the module param instead of removing it, such that users will get warning if they attempt to set it. drivers/net/mlx4/main.c | 11 ++++++----- 1 files changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/net/mlx4/main.c b/drivers/net/mlx4/main.c index f0ee35d..9c4ef20 100644 --- a/drivers/net/mlx4/main.c +++ b/drivers/net/mlx4/main.c @@ -96,6 +96,8 @@ MODULE_PARM_DESC(log_num_mac, "Log2 max number of MACs per ETH port (1-7)"); static int log_num_vlan; module_param_named(log_num_vlan, log_num_vlan, int, 0444); MODULE_PARM_DESC(log_num_vlan, "Log2 max number of VLANs per ETH port (0-7)"); +/* Log2 max number of VLANs per ETH port (0-7) */ +#define MLX4_LOG_NUM_VLANS 7 static int use_prio; module_param_named(use_prio, use_prio, bool, 0444); @@ -230,7 +232,7 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev->caps.max_gso_sz = dev_cap->max_gso_sz; dev->caps.log_num_macs = log_num_mac; - dev->caps.log_num_vlans = log_num_vlan; + dev->caps.log_num_vlans = MLX4_LOG_NUM_VLANS; dev->caps.log_num_prios = use_prio ? 3 : 0; for (i = 1; i <= dev->caps.num_ports; ++i) { @@ -1489,10 +1491,9 @@ static int __init mlx4_verify_params(void) return -1; } - if ((log_num_vlan < 0) || (log_num_vlan > 7)) { - pr_warning("mlx4_core: bad num_vlan: %d\n", log_num_vlan); - return -1; - } + if (log_num_vlan != 0) + pr_warning("mlx4_core: log_num_vlan - obsoleted module param, using %d\n", + MLX4_LOG_NUM_VLANS); if ((log_mtts_per_seg < 1) || (log_mtts_per_seg > 7)) { pr_warning("mlx4_core: bad log_mtts_per_seg: %d\n", log_mtts_per_seg); -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 49+ messages in thread
* [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org> ` (2 preceding siblings ...) 2011-10-10 8:55 ` [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size Or Gerlitz @ 2011-10-10 8:56 ` Or Gerlitz [not found] ` <alpine.LRH.2.00.1110101055570.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org> 2011-10-10 8:57 ` [PATCH 5/5] ib/mlx4: added support for extended PMA counters under IBoE Or Gerlitz 2011-10-26 7:04 ` [PATCH for-3.2] IBoE fixes Or Gerlitz 5 siblings, 1 reply; 49+ messages in thread From: Or Gerlitz @ 2011-10-10 8:56 UTC (permalink / raw) To: Roland Dreier; +Cc: linux-rdma issue a PMA classportinfo query towards the HW driver, and if extended counters are supported, expose new sysfs entries which allow to read them. Signed-off-by: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> --- --- drivers/infiniband/core/sysfs.c | 100 ++++++++++++++++++++++++++++++++++++++- include/rdma/ib_pma.h | 1 + 2 files changed, 100 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c index 9ab5df7..8bbed12 100644 --- a/drivers/infiniband/core/sysfs.c +++ b/drivers/infiniband/core/sysfs.c @@ -38,6 +38,7 @@ #include <linux/string.h> #include <rdma/ib_mad.h> +#include <rdma/ib_pma.h> struct ib_port { struct kobject kobj; @@ -288,6 +289,13 @@ static ssize_t show_port_pkey(struct ib_port *p, struct port_attribute *attr, return sprintf(buf, "0x%04x\n", pkey); } +enum { + COUNTERS_START = 0, + COUNTERS_LAST = 15, + EXT_COUNTERS_START = 16, + EXT_COUNTERS_LAST = 23 +}; + #define PORT_PMA_ATTR(_name, _counter, _width, _offset) \ struct port_table_attribute port_pma_attr_##_name = { \ .attr = __ATTR(_name, S_IRUGO, show_pma_counter, NULL), \ @@ -301,6 +309,7 @@ static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute *attr, container_of(attr, struct port_table_attribute, attr); int offset = tab_attr->index & 0xffff; int width = (tab_attr->index >> 16) & 0xff; + int counter = (tab_attr->index >> 24) & 0xff; struct ib_mad *in_mad = NULL; struct ib_mad *out_mad = NULL; ssize_t ret; @@ -319,7 +328,14 @@ static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute *attr, in_mad->mad_hdr.mgmt_class = IB_MGMT_CLASS_PERF_MGMT; in_mad->mad_hdr.class_version = 1; in_mad->mad_hdr.method = IB_MGMT_METHOD_GET; - in_mad->mad_hdr.attr_id = cpu_to_be16(0x12); /* PortCounters */ + if (counter >= COUNTERS_START && counter <= COUNTERS_LAST) + in_mad->mad_hdr.attr_id = IB_PMA_PORT_COUNTERS; + else if (counter >= EXT_COUNTERS_START && counter <= EXT_COUNTERS_LAST) + in_mad->mad_hdr.attr_id = IB_PMA_PORT_COUNTERS_EXT; + else { + ret = -EINVAL; + goto out; + } in_mad->data[41] = p->port_num; /* PortSelect field */ @@ -347,6 +363,10 @@ static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute *attr, ret = sprintf(buf, "%u\n", be32_to_cpup((__be32 *)(out_mad->data + 40 + offset / 8))); break; + case 64: + ret = sprintf(buf, "%llu\n", be64_to_cpup( + (__be64 *)(out_mad->data + 40 + offset / 8))); + break; default: ret = 0; } @@ -358,6 +378,7 @@ out: return ret; } + static PORT_PMA_ATTR(symbol_error , 0, 16, 32); static PORT_PMA_ATTR(link_error_recovery , 1, 8, 48); static PORT_PMA_ATTR(link_downed , 2, 8, 56); @@ -374,6 +395,15 @@ static PORT_PMA_ATTR(port_xmit_data , 12, 32, 192); static PORT_PMA_ATTR(port_rcv_data , 13, 32, 224); static PORT_PMA_ATTR(port_xmit_packets , 14, 32, 256); static PORT_PMA_ATTR(port_rcv_packets , 15, 32, 288); +/* definition for port counters extension */ +static PORT_PMA_ATTR(port_xmit_data_64 , 16, 64, 64); +static PORT_PMA_ATTR(port_rcv_data_64 , 17, 64, 128); +static PORT_PMA_ATTR(port_xmit_packets_64 , 18, 64, 192); +static PORT_PMA_ATTR(port_rcv_packets_64 , 19, 64, 256); +static PORT_PMA_ATTR(port_unicast_xmit_packets , 20, 64, 320); +static PORT_PMA_ATTR(port_unicast_rcv_packets , 21, 64, 384); +static PORT_PMA_ATTR(port_multicast_xmit_packets , 22, 64, 448); +static PORT_PMA_ATTR(port_multicast_rcv_packets , 23, 64, 512); static struct attribute *pma_attrs[] = { &port_pma_attr_symbol_error.attr.attr, @@ -400,6 +430,68 @@ static struct attribute_group pma_group = { .attrs = pma_attrs }; +static struct attribute *pma_attrs_ext[] = { + &port_pma_attr_port_xmit_data_64.attr.attr, + &port_pma_attr_port_rcv_data_64.attr.attr, + &port_pma_attr_port_xmit_packets_64.attr.attr, + &port_pma_attr_port_rcv_packets_64.attr.attr, + &port_pma_attr_port_unicast_xmit_packets.attr.attr, + &port_pma_attr_port_unicast_rcv_packets.attr.attr, + &port_pma_attr_port_multicast_xmit_packets.attr.attr, + &port_pma_attr_port_multicast_rcv_packets.attr.attr, + NULL +}; + +static struct attribute_group pma_ext_group = { + .name = "counters_ext", + .attrs = pma_attrs_ext +}; + +static int is_pma_class_cap_ext_width(struct ib_device *ibdev, int port_num) +{ + struct ib_mad *in_mad = NULL; + struct ib_mad *out_mad = NULL; + struct ib_class_port_info *cpi = NULL; + int ret = 0; + + if (!ibdev->process_mad) + return -ENOSYS; + + in_mad = kzalloc(sizeof *in_mad, GFP_KERNEL); + out_mad = kmalloc(sizeof *out_mad, GFP_KERNEL); + if (!in_mad || !out_mad) { + ret = -ENOMEM; + goto out; + } + + in_mad->mad_hdr.base_version = 1; + in_mad->mad_hdr.mgmt_class = IB_MGMT_CLASS_PERF_MGMT; + in_mad->mad_hdr.class_version = 1; + in_mad->mad_hdr.method = IB_MGMT_METHOD_GET; + in_mad->mad_hdr.attr_id = IB_PMA_CLASS_PORT_INFO; + + in_mad->data[41] = port_num; + + if ((ibdev->process_mad(ibdev, IB_MAD_IGNORE_MKEY, port_num, + NULL, NULL, in_mad, out_mad) & + (IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY)) != + (IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY)) { + ret = -EINVAL; + goto out; + } + + cpi = (struct ib_class_port_info *)(out_mad->data + 40); + if (!(cpi->capability_mask && IB_PMA_CLASS_CAP_EXT_WIDTH) && + !(cpi->capability_mask && IB_PMA_CLASS_CAP_EXT_WIDTH_NOIETF)) + ret = -ENOSYS; + +out: + kfree(in_mad); + kfree(out_mad); + + return ret; +} + static void ib_port_release(struct kobject *kobj) { struct ib_port *p = container_of(kobj, struct ib_port, kobj); @@ -520,6 +612,12 @@ static int add_port(struct ib_device *device, int port_num, if (ret) goto err_put; + if (is_pma_class_cap_ext_width(device, port_num) == 0) { + ret = sysfs_create_group(&p->kobj, &pma_ext_group); + if (ret) + goto err_put; + } + p->gid_group.name = "gids"; p->gid_group.attrs = alloc_group_attrs(show_port_gid, attr.gid_tbl_len); if (!p->gid_group.attrs) diff --git a/include/rdma/ib_pma.h b/include/rdma/ib_pma.h index a5889f1..2f8a65c 100644 --- a/include/rdma/ib_pma.h +++ b/include/rdma/ib_pma.h @@ -42,6 +42,7 @@ */ #define IB_PMA_CLASS_CAP_ALLPORTSELECT cpu_to_be16(1 << 8) #define IB_PMA_CLASS_CAP_EXT_WIDTH cpu_to_be16(1 << 9) +#define IB_PMA_CLASS_CAP_EXT_WIDTH_NOIETF cpu_to_be16(1 << 10) #define IB_PMA_CLASS_CAP_XMIT_WAIT cpu_to_be16(1 << 12) #define IB_PMA_CLASS_PORT_INFO cpu_to_be16(0x0001) -- 1.7.4.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 49+ messages in thread
[parent not found: <alpine.LRH.2.00.1110101055570.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <alpine.LRH.2.00.1110101055570.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org> @ 2011-10-31 19:38 ` Roland Dreier [not found] ` <CAL1RGDXdkiZNtAsnqDtgbCJGLXLFvk8s-YA5mR189kmg6FkabQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Roland Dreier @ 2011-10-31 19:38 UTC (permalink / raw) To: Or Gerlitz; +Cc: linux-rdma On Mon, Oct 10, 2011 at 1:56 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote: > +static struct attribute_group pma_ext_group = { > + .name = "counters_ext", > + .attrs = pma_attrs_ext > +}; Sorry for the late review here, but does it seem like the best approach to have a separate "counters_ext" directory for some subset of performance counters? Instead we could have two attribute_groups, one the basic counters and one the basic and extended counters, and basically do if (is_pma_class_cap_ext_width(device, port_num) == 0) sysfs_create_group(...basic and extended counters...) else sysfs_create_group(...basic counters...) (by the way, is_pma_class_cap_ext_width() seems backwards since it returns 0 for true. How about bool pma_has_ext_width() and have it return true if the extended counters ARE supported?) Or is there some reason why users would want to make the distinction between basic and extended counters? - R. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <CAL1RGDXdkiZNtAsnqDtgbCJGLXLFvk8s-YA5mR189kmg6FkabQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <CAL1RGDXdkiZNtAsnqDtgbCJGLXLFvk8s-YA5mR189kmg6FkabQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2011-11-01 6:40 ` Or Gerlitz [not found] ` <CAJZOPZ+K3u+u5XutGfBhKh3DaBpNRKV_bw67CLK6jy3060OHqA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Or Gerlitz @ 2011-11-01 6:40 UTC (permalink / raw) To: Roland Dreier; +Cc: Or Gerlitz, linux-rdma On Mon, Oct 31, 2011 at 9:38 PM, Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > Sorry for the late review here Oh yes... BTW this is patch 4/5, I don't see patches 1,2,3 on your for-next tree/branch @ kernel.org, have you accepted them? > Sorry for the late review here, but does it seem like the best > approach to have a separate "counters_ext" directory for > some subset of performance counters? Instead we could > have two attribute_groups, one the basic counters and one > the basic and extended counters, and basically do > > if (is_pma_class_cap_ext_width(device, port_num) == 0) > sysfs_create_group(...basic and extended counters...) > else > sysfs_create_group(...basic counters...) Basically, I don't see a problem to have one directory along the lines of your suggestion > Or is there some reason why users would want to make the > distinction between basic and extended counters? Today, e.g in some IBoE perf monitoring scripts we wrote, the distinction is done by if (the ext counter directory exists) then go and read the counters from there, else read from the non extended counters directory. With the change you propose, that if (.) would become a little less elegant and would check if this or that --file-- exists (e.g the 64 bit tx data counter) and if yes, would read the four 64 bit counters (rx/tx packets/data) else the four 32 bits counters, so from our user standpoint, diff dirs seems better, but we can get along with same dir with diff contents depending on the device. > (by the way, is_pma_class_cap_ext_width() seems backwards > since it returns 0 for true. How about bool pma_has_ext_width() > and have it return true if the extended counters ARE supported?) sure, I will be able to handle this little change Wed, but we should be okay / enough time for the current merge window, correct? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <CAJZOPZ+K3u+u5XutGfBhKh3DaBpNRKV_bw67CLK6jy3060OHqA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <CAJZOPZ+K3u+u5XutGfBhKh3DaBpNRKV_bw67CLK6jy3060OHqA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2011-11-01 16:22 ` Jason Gunthorpe [not found] ` <20111101162218.GA10710-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Jason Gunthorpe @ 2011-11-01 16:22 UTC (permalink / raw) To: Or Gerlitz; +Cc: Roland Dreier, Or Gerlitz, linux-rdma On Tue, Nov 01, 2011 at 08:40:12AM +0200, Or Gerlitz wrote: > Today, e.g in some IBoE perf monitoring scripts we wrote, the > distinction is done by if (the ext counter directory exists) then go > and read the counters from there, else read from the non extended > counters directory. With the change you propose, that if (.) would > become a little less elegant and would check if this or that --file-- > exists (e.g the 64 bit tx data counter) and if yes, would read the > four 64 bit counters (rx/tx packets/data) else the four 32 bits > counters, so from our user standpoint, diff dirs seems better, but we > can get along with same dir with diff contents depending on the > device. Is there any reason to expose the 32 and 64 bit version of the same counter? That seems needless. Emit the largest version available and prepend 0's to fill out to the available width so that userspace can know the counter size. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <20111101162218.GA10710-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111101162218.GA10710-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> @ 2011-11-01 17:23 ` Or Gerlitz [not found] ` <CAJZOPZLXZL=vEDdXE3rE5Jzw840SrdY=73M7QgKSNzOysRE6MQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Or Gerlitz @ 2011-11-01 17:23 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: Roland Dreier, Or Gerlitz, linux-rdma Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: > Is there any reason to expose the 32 and 64 bit version of the same > counter? That seems needless. Emit the largest version available and > prepend 0's to fill out to the available width so that userspace can > know the counter size. Basically, the approach you suggest seems fine for IBoE which is pretty new, however, the problem is that the 32 bit counters exists from kind of day one AND have the same semantics either if returned through sysfs or through perfquery and alike mad based apps. In other words around IB there should be some legacy which exists today, and I don't think it would be wise to touch that area such that the 32 bit counters become embedded in 64 bit numbers, thoughts? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <CAJZOPZLXZL=vEDdXE3rE5Jzw840SrdY=73M7QgKSNzOysRE6MQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <CAJZOPZLXZL=vEDdXE3rE5Jzw840SrdY=73M7QgKSNzOysRE6MQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2011-11-01 17:36 ` Jason Gunthorpe [not found] ` <20111101173625.GH26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Jason Gunthorpe @ 2011-11-01 17:36 UTC (permalink / raw) To: Or Gerlitz; +Cc: Roland Dreier, Or Gerlitz, linux-rdma On Tue, Nov 01, 2011 at 07:23:52PM +0200, Or Gerlitz wrote: > Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: > > Is there any reason to expose the 32 and 64 bit version of the same > > counter? That seems needless. Emit the largest version available and > > prepend 0's to fill out to the available width so that userspace can > > know the counter size. > > Basically, the approach you suggest seems fine for IBoE which is > pretty new, however, > > the problem is that the 32 bit counters exists from kind of day one > AND have the same semantics either if returned through sysfs or > through perfquery and alike mad based apps. > In other words around IB there should be some legacy which exists > today, and I don't think it would be wise to touch that area such that > the 32 bit counters become embedded in 64 bit numbers, thoughts? I don't see a problem with having a sysfs counter file being extended to return a 64 bit number.. I think that is within the purvue of acceptable changes. Shame the counter wasn't exported as hex though - makes it harder to signal if it is 32 or 64 bit. Frankly, exporting these PMA counters as saturate on maximum via sysfs is pretty useless. Does anyone actually use them aside from a few scripts? What would be useful is free running 64 bit sysfs counters that are independent and not reset by PMA activity. Like all the other Linux networking counters. That would be great. I hope that is what is done for IBoE? Unifying the counters to be semantically the same on IB and IBoE seems like a very good idea. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <20111101173625.GH26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111101173625.GH26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> @ 2011-11-01 18:14 ` Or Gerlitz [not found] ` <CAJZOPZJCY5ehTGw8htsxw0dJgMXJJAS-=LGJde9ddcMsmhqvMg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Or Gerlitz @ 2011-11-01 18:14 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: Roland Dreier, Or Gerlitz, linux-rdma Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: Guys (Roland, Jason), I'm open to any comments, any time, for any patch, but for a patch which was posted weeks ago it's pretty unfair to have your comments coming only eight days after the merge window has been opened, lets try to come quickly to decision so I can fix this up along those lines, > I don't see a problem with having a sysfs counter file being extended > to return a 64 bit number.. I think that is within the purvue of > acceptable changes. Shame the counter wasn't exported as hex though - > makes it harder to signal if it is 32 or 64 bit. if I understand you right, we would have traffic counters exposed through sysfs, where a counter is either a 32 zero-padded/embedded in 64bit one or true 64 bit one, a problem is that the four 32bit traffic counters (rx/tx data/packets) are actually part of the IB port L2 basic counter set which includes about ten more counters to mark different kinds of errors, wheres the 64bit counters are only traffic counters, so what do you suggest for them? use the same approach for the error counters as well even though IB doesn't define 64 bit version for them? also zero padding for something which isn't exported in hex is very ugly, isn't that? > Frankly, exporting these PMA counters as saturate on maximum via sysfs > is pretty useless. Does anyone actually use them aside from a few scripts? under IB our monitorying code/scripts use perfquery/mads wheres under IBoE we use sysfs, the mad approach allows to reset the counters, so the 32 bit counters aren't useless, reset via sysfs isn't supported so the 64 bit counter are kind of must, anyway, > What would be useful is free running 64 bit sysfs counters that are > independent and not reset by PMA activity. Like all the other Linux > networking counters. That would be great. I hope that is what is done for IBoE? yes this is what the 64 bit counters are > Unifying the counters to be semantically the same on IB and IBoE seems > like a very good idea. yes, this is what we do here -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <CAJZOPZJCY5ehTGw8htsxw0dJgMXJJAS-=LGJde9ddcMsmhqvMg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <CAJZOPZJCY5ehTGw8htsxw0dJgMXJJAS-=LGJde9ddcMsmhqvMg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2011-11-01 18:37 ` Jason Gunthorpe [not found] ` <20111101183730.GI26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> 2011-11-01 21:49 ` Roland Dreier 1 sibling, 1 reply; 49+ messages in thread From: Jason Gunthorpe @ 2011-11-01 18:37 UTC (permalink / raw) To: Or Gerlitz; +Cc: Roland Dreier, Or Gerlitz, linux-rdma > > I don't see a problem with having a sysfs counter file being extended > > to return a 64 bit number.. I think that is within the purvue of > > acceptable changes. Shame the counter wasn't exported as hex though - > > makes it harder to signal if it is 32 or 64 bit. > > if I understand you right, we would have traffic counters exposed > through sysfs, where a counter is either a 32 zero-padded/embedded in > 64bit one or true 64 bit one, a problem is that the four 32bit traffic > counters (rx/tx data/packets) are actually part of the IB port L2 > basic counter set which includes about ten more counters to mark > different kinds of errors, wheres the 64bit counters are only traffic > counters, so what do you suggest for them? use the same approach for > the error counters as well even though IB doesn't define 64 bit > version for them? also zero padding for something which isn't exported > in hex is very ugly, isn't that? Whats the problem here? If a 64 bit counter is available then export it as 64 bit otherwise keep exporting something smaller. I agree zero padding non-hex numbers isn't ideal. Export as hex? Broadly, this is another problem with the sysfs interface because the width matters for any kind of serious data collection, and IBA defined interesting widths for many of the counters that was flowed right through the sysfs interface, with no means of discovery. > > Frankly, exporting these PMA counters as saturate on maximum via sysfs > > is pretty useless. Does anyone actually use them aside from a few scripts? > > under IB our monitorying code/scripts use perfquery/mads wheres under > IBoE we use sysfs, the mad approach allows to reset the counters, so > the 32 bit counters aren't useless, reset via sysfs isn't supported so > the 64 bit counter are kind of must, anyway, I don't mean the 32 bit counters are useless, I mean exposing PMA counters that saturate and can be randomly reset by external agents through sysfs is useless. You can't make any kind of data collection based on such a system. Ideally the sysfs counters are all non-saturating, non-resetting counters like everything else in the net stack. You need a different interface to the chip firmware to implement this, can't use the existing PMA stuff. In the same vien adding saturating but non-resettable PMA-esque counters for IBoE seems pretty hackish to me.. Though I agree it is not terribly relevant for 64 bit counters. > > What would be useful is free running 64 bit sysfs counters that > > are independent and not reset by PMA activity. Like all the other > > Linux networking counters. That would be great. I hope that is > > what is done for IBoE? > > yes this is what the 64 bit counters are IBA defined 64 bit counters are not free-running, they still saturate. Does the firmware not do this in IBoE mode? > > Unifying the counters to be semantically the same on IB and IBoE seems > > like a very good idea. > yes, this is what we do here I disagree. Your IBoE counters cannot be reset, externally or otherwise - aside from the saturating this makes them almost the same as the usual Linux net counters. When the port is in IB mode the counter doesn't have those properties. That is a big semantic difference when compared to what these sysfs files show for normal IB counters. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <20111101183730.GI26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111101183730.GI26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> @ 2011-11-01 21:42 ` Roland Dreier 2011-11-01 21:44 ` Or Gerlitz ` (2 more replies) 2011-11-01 21:46 ` Or Gerlitz 1 sibling, 3 replies; 49+ messages in thread From: Roland Dreier @ 2011-11-01 21:42 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: Or Gerlitz, Or Gerlitz, linux-rdma On Tue, Nov 1, 2011 at 11:37 AM, Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: > Whats the problem here? If a 64 bit counter is available then export > it as 64 bit otherwise keep exporting something smaller. > > I agree zero padding non-hex numbers isn't ideal. Export as hex? I agree that it definitely is more appealing, if we have a 64-bit version of a counter, that we should just export that counter where we used to export the 32-bit version. But I'm not sure there's a feasible way to do this without breaking old userspace. For sure we can't assume userspace can cope with hex where we used to have decimal. And I don't think it's even a safe assumption that userspace can cope with a 64-bit quantity where we used to have a 32-bit quantity. It doesn't seem safe to assume that userspace that used to work with 32-bit quantities can cope with a 0-padded 20 digit value. The least bad way forward does seem like it is probably the separate new directory thing. - R. - R. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs 2011-11-01 21:42 ` Roland Dreier @ 2011-11-01 21:44 ` Or Gerlitz 2011-11-01 21:52 ` Jason Gunthorpe 2011-11-02 17:16 ` Or Gerlitz 2 siblings, 0 replies; 49+ messages in thread From: Or Gerlitz @ 2011-11-01 21:44 UTC (permalink / raw) To: Roland Dreier; +Cc: Jason Gunthorpe, Or Gerlitz, linux-rdma On Tue, Nov 1, 2011 at 11:42 PM, Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > The least bad way forward does seem like it is probably > the separate new directory thing. I agree Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs 2011-11-01 21:42 ` Roland Dreier 2011-11-01 21:44 ` Or Gerlitz @ 2011-11-01 21:52 ` Jason Gunthorpe [not found] ` <20111101215200.GN26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> 2011-11-02 17:16 ` Or Gerlitz 2 siblings, 1 reply; 49+ messages in thread From: Jason Gunthorpe @ 2011-11-01 21:52 UTC (permalink / raw) To: Roland Dreier; +Cc: Or Gerlitz, Or Gerlitz, linux-rdma On Tue, Nov 01, 2011 at 02:42:33PM -0700, Roland Dreier wrote: > I agree that it definitely is more appealing, if we have a 64-bit > version of a counter, that we should just export that counter > where we used to export the 32-bit version. I think this falls under the 'undocumented, beware' API design. This interface isn't specified so exactly as to have set out how many bytes are in the files and how many bits are in the numbers. If you wrote a reader that can't handle a 20 byte integer with leading zeros then your user space isn't following the API. If your reader doesn't elegantly handle overflow to whatever type your reader picked, then you aren't following the API. There are many examples of the kernel tweaking APIs along this undocumented axis, and theoritically the text-free-form nature of sysfs is supposed to save us from having to worry about exactly this sort of case. And again, this is a useless interface in IB. IBoE is going to be first real, serious, long term user, let's make it saner instead of keeping it as is forever? Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <20111101215200.GN26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111101215200.GN26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> @ 2011-11-01 22:03 ` Ira Weiny [not found] ` <20111101150358.17d232ee.weiny2-i2BcT+NCU+M@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Ira Weiny @ 2011-11-01 22:03 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: Roland Dreier, Or Gerlitz, Or Gerlitz, linux-rdma On Tue, 1 Nov 2011 14:52:00 -0700 Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: > On Tue, Nov 01, 2011 at 02:42:33PM -0700, Roland Dreier wrote: > [snip] > > And again, this is a useless interface in IB. Why do you mean by this? Ira > IBoE is going to be > first real, serious, long term user, let's make it saner instead of > keeping it as is forever? > > Jason > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Ira Weiny Member of Technical Staff Lawrence Livermore National Lab 925-423-8008 weiny2-i2BcT+NCU+M@public.gmane.org -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <20111101150358.17d232ee.weiny2-i2BcT+NCU+M@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111101150358.17d232ee.weiny2-i2BcT+NCU+M@public.gmane.org> @ 2011-11-01 22:11 ` Jason Gunthorpe [not found] ` <20111101221135.GP26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Jason Gunthorpe @ 2011-11-01 22:11 UTC (permalink / raw) To: Ira Weiny; +Cc: Roland Dreier, Or Gerlitz, Or Gerlitz, linux-rdma On Tue, Nov 01, 2011 at 03:03:58PM -0700, Ira Weiny wrote: > > And again, this is a useless interface in IB. > > Why do you mean by this? A counter that is randomly reset by an external IB performance manager is not useful for collecting local statistical information. A counter that saturates and cannot be reset locally is not useful for any automatic process. Why have a sysfs counter at all when you can just ask the PMA and get exactly the same data? I can't make a SNMP MIB from this counter. I can't make graphs from this counter. I can't do anything with it except show it to the user and hope they understand when it was last reset, or understand what to do when it is saturated. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <20111101221135.GP26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111101221135.GP26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> @ 2011-11-01 22:19 ` Ira Weiny 2011-11-01 22:34 ` Or Gerlitz 1 sibling, 0 replies; 49+ messages in thread From: Ira Weiny @ 2011-11-01 22:19 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: Roland Dreier, Or Gerlitz, Or Gerlitz, linux-rdma On Tue, 1 Nov 2011 15:11:35 -0700 Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: > On Tue, Nov 01, 2011 at 03:03:58PM -0700, Ira Weiny wrote: > > > > And again, this is a useless interface in IB. > > > > Why do you mean by this? > > A counter that is randomly reset by an external IB performance manager > is not useful for collecting local statistical information. > > A counter that saturates and cannot be reset locally is not useful for > any automatic process. > > Why have a sysfs counter at all when you can just ask the PMA and get > exactly the same data? I agree, but... sys admins like sysfs files... :-( > > I can't make a SNMP MIB from this counter. I can't make graphs from > this counter. I can't do anything with it except show it to the user > and hope they understand when it was last reset, or understand what > to do when it is saturated. Ok, just making sure we are on the same page. Ira > > Jason -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2-i2BcT+NCU+M@public.gmane.org -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111101221135.GP26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> 2011-11-01 22:19 ` Ira Weiny @ 2011-11-01 22:34 ` Or Gerlitz [not found] ` <CAJZOPZJh2T5nCsdU9B0anUnoB=SpRD2_yBLT620zEi0Kr12MFg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 1 sibling, 1 reply; 49+ messages in thread From: Or Gerlitz @ 2011-11-01 22:34 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: Ira Weiny, Roland Dreier, Or Gerlitz, linux-rdma Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: > Why have a sysfs counter at all when you can just ask the PMA and get exactly the > same data? The HW/FW PMA agent isn't supported for IBoE only for IB, the counters are for both Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <CAJZOPZJh2T5nCsdU9B0anUnoB=SpRD2_yBLT620zEi0Kr12MFg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <CAJZOPZJh2T5nCsdU9B0anUnoB=SpRD2_yBLT620zEi0Kr12MFg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2011-11-01 22:46 ` Ira Weiny [not found] ` <20111101154627.f3c847dd.weiny2-i2BcT+NCU+M@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Ira Weiny @ 2011-11-01 22:46 UTC (permalink / raw) To: Or Gerlitz; +Cc: Jason Gunthorpe, Roland Dreier, Or Gerlitz, linux-rdma On Tue, 1 Nov 2011 15:34:46 -0700 Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: > > Why have a sysfs counter at all when you can just ask the PMA and get exactly the > > same data? > > The HW/FW PMA agent isn't supported for IBoE only for IB, the > counters are for both What do you mean "for both"? Ira > > Or. -- Ira Weiny Member of Technical Staff Lawrence Livermore National Lab 925-423-8008 weiny2-i2BcT+NCU+M@public.gmane.org -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <20111101154627.f3c847dd.weiny2-i2BcT+NCU+M@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111101154627.f3c847dd.weiny2-i2BcT+NCU+M@public.gmane.org> @ 2011-11-02 7:38 ` Or Gerlitz 0 siblings, 0 replies; 49+ messages in thread From: Or Gerlitz @ 2011-11-02 7:38 UTC (permalink / raw) To: Ira Weiny; +Cc: Jason Gunthorpe, Roland Dreier, linux-rdma On 11/2/2011 12:46 AM, Ira Weiny wrote: > What do you mean "for both"? Ira The sysfs PMA counters are functional for both IB and IBoE, the latter uses a HW counter allocated per device/port for which all the QPs created on that port are reporting their rx/tx bytes/packets, see commits cfcde11c3d7ae175f49280bb6f913478c2f1bd8c and c37791349cc79d025df6e9a4f896a7b0a97cdbd3 Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs 2011-11-01 21:42 ` Roland Dreier 2011-11-01 21:44 ` Or Gerlitz 2011-11-01 21:52 ` Jason Gunthorpe @ 2011-11-02 17:16 ` Or Gerlitz [not found] ` <4EB17AF9.3000608-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2 siblings, 1 reply; 49+ messages in thread From: Or Gerlitz @ 2011-11-02 17:16 UTC (permalink / raw) To: Roland Dreier; +Cc: Jason Gunthorpe, linux-rdma On 11/1/2011 11:42 PM, Roland Dreier wrote: > The least bad way forward does seem like it is probably the separate > new directory thing Hi Roland, I suggest we go that least bad way along the lines of your comment. If/when on some future point something constructive can be formed from Jason's observations, changes will follow, agree? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <4EB17AF9.3000608-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <4EB17AF9.3000608-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2011-11-08 0:54 ` Roland Dreier [not found] ` <CAG4TOxOYAnuB8oKkJgr9_6EpF+YR-h9WNak+gSWwpV0Nz-LeZg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Roland Dreier @ 2011-11-08 0:54 UTC (permalink / raw) To: Or Gerlitz; +Cc: Jason Gunthorpe, linux-rdma On Wed, Nov 2, 2011 at 10:16 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote: > I suggest we go that least bad way along the lines of your comment. > > If/when on some future point something constructive can be formed from > Jason's observations, changes will follow, agree? Let's make sure we learn from our mistakes. Let's say we create a new "ext_counters" directory. What should the format of those files be? Should they be assumed to be 64-bit quantities? Do we want to allow some way of indicating the number of bits (ie 0-padded hex entries?)? - R> -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <CAG4TOxOYAnuB8oKkJgr9_6EpF+YR-h9WNak+gSWwpV0Nz-LeZg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <CAG4TOxOYAnuB8oKkJgr9_6EpF+YR-h9WNak+gSWwpV0Nz-LeZg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2011-11-08 1:09 ` Jason Gunthorpe [not found] ` <20111108010952.GC4934-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> 2011-12-22 17:09 ` Or Gerlitz 1 sibling, 1 reply; 49+ messages in thread From: Jason Gunthorpe @ 2011-11-08 1:09 UTC (permalink / raw) To: Roland Dreier; +Cc: Or Gerlitz, linux-rdma On Mon, Nov 07, 2011 at 04:54:42PM -0800, Roland Dreier wrote: > On Wed, Nov 2, 2011 at 10:16 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote: > > I suggest we go that least bad way along the lines of your comment. > > > > If/when on some future point something constructive can be formed from > > Jason's observations, changes will follow, agree? > > Let's make sure we learn from our mistakes. Let's say we create a > new "ext_counters" directory. What should the format of those files > be? Should they be assumed to be 64-bit quantities? Do we want to > allow some way of indicating the number of bits (ie 0-padded hex > entries?)? That is a good idea. Let's require counters_ext to be sane: 1 Hex quantity of unspecified size 2 Prefixed with 0x and leading zeros to fill out to size and allow userspace discovery of size 3 Size must be a multiple of 4 bits. 4 Counters do not saturate 5 Counters wrap around at all F's back to 0. 6 If the counter is resettable it is only via a local operation through netlink or sysfs or something. Not PMA reset. Certainly, aside from some minor details and different string formatting, the 64 bit counters Or proposes to add meet these requirements when the port is used in Ethernet mode. How do you feel about having counters_ext appear in ethernet mode and disappear in IB mode? Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <20111108010952.GC4934-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111108010952.GC4934-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> @ 2011-11-08 16:06 ` Ira Weiny [not found] ` <20111108080606.7c5dd62c.weiny2-i2BcT+NCU+M@public.gmane.org> 2011-12-20 12:03 ` Or Gerlitz 1 sibling, 1 reply; 49+ messages in thread From: Ira Weiny @ 2011-11-08 16:06 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: Roland Dreier, Or Gerlitz, linux-rdma On Mon, 7 Nov 2011 17:09:52 -0800 Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: > On Mon, Nov 07, 2011 at 04:54:42PM -0800, Roland Dreier wrote: > > On Wed, Nov 2, 2011 at 10:16 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote: > > > I suggest we go that least bad way along the lines of your comment. > > > > > > If/when on some future point something constructive can be formed from > > > Jason's observations, changes will follow, agree? > > > > Let's make sure we learn from our mistakes. Let's say we create a > > new "ext_counters" directory. What should the format of those files > > be? Should they be assumed to be 64-bit quantities? Do we want to > > allow some way of indicating the number of bits (ie 0-padded hex > > entries?)? > > That is a good idea. Let's require counters_ext to be sane: > > 1 Hex quantity of unspecified size > 2 Prefixed with 0x and leading zeros to fill out to size and allow > userspace discovery of size > 3 Size must be a multiple of 4 bits. > 4 Counters do not saturate > 5 Counters wrap around at all F's back to 0. > 6 If the counter is resettable it is only via a local operation > through netlink or sysfs or something. Not PMA reset. > > Certainly, aside from some minor details and different string > formatting, the 64 bit counters Or proposes to add meet these > requirements when the port is used in Ethernet mode. > > How do you feel about having counters_ext appear in ethernet mode and > disappear in IB mode? That might get complicated with ports which can be either mode. Ira > > Jason > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Ira Weiny Member of Technical Staff Lawrence Livermore National Lab 925-423-8008 weiny2-i2BcT+NCU+M@public.gmane.org -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <20111108080606.7c5dd62c.weiny2-i2BcT+NCU+M@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111108080606.7c5dd62c.weiny2-i2BcT+NCU+M@public.gmane.org> @ 2011-12-20 11:49 ` Or Gerlitz 2011-12-20 11:49 ` Or Gerlitz 1 sibling, 0 replies; 49+ messages in thread From: Or Gerlitz @ 2011-12-20 11:49 UTC (permalink / raw) To: Ira Weiny; +Cc: Jason Gunthorpe, Roland Dreier, linux-rdma Ira Weiny wrote: > Jason Gunthorpe<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: > How do you feel about having counters_ext appear in ethernet mode and > disappear in IB mode? > > That might get complicated with ports which can be either mode. Ira (reviving this thread), At the ib core level, the link layer sysfs attribute is read only. At the mlx4 VPI level support, a port has given link layer at certain point of time, and further, AFAIK, this isn't something that can be changed in the life-cycle of the ib device exported by mlx4_ib, which is deleted/re-added when the port link layer change, see commit 7ff93f8b7ecbc36e7ffc5c11a61643821c1bfee5 which states " When the type of a port is changed, all mlx4 interfaces are unregistered, and then registered again with the new port types" Still, this patch makes the decision when the ib device is added, so you have a point here w.r.t to future devices which could change their link layer at run time, I'll look on that. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111108080606.7c5dd62c.weiny2-i2BcT+NCU+M@public.gmane.org> 2011-12-20 11:49 ` Or Gerlitz @ 2011-12-20 11:49 ` Or Gerlitz 1 sibling, 0 replies; 49+ messages in thread From: Or Gerlitz @ 2011-12-20 11:49 UTC (permalink / raw) To: Ira Weiny; +Cc: Jason Gunthorpe, Roland Dreier, linux-rdma Ira Weiny wrote: > Jason Gunthorpe<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: > How do you feel about having counters_ext appear in ethernet mode and > disappear in IB mode? > > That might get complicated with ports which can be either mode. Ira (reviving this thread), At the ib core level, the link layer sysfs attribute is read only. At the mlx4 VPI level support, a port has given link layer at certain point of time, and further, AFAIK, this isn't something that can be changed in the life-cycle of the ib device exported by mlx4_ib, which is deleted/re-added when the port link layer change, see commit 7ff93f8b7ecbc36e7ffc5c11a61643821c1bfee5 which states " When the type of a port is changed, all mlx4 interfaces are unregistered, and then registered again with the new port types" Still, this patch makes the decision when the ib device is added, so you have a point here w.r.t to future devices which could change their link layer at run time, I'll look on that. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111108010952.GC4934-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> 2011-11-08 16:06 ` Ira Weiny @ 2011-12-20 12:03 ` Or Gerlitz [not found] ` <4EF07993.2090900-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 1 sibling, 1 reply; 49+ messages in thread From: Or Gerlitz @ 2011-12-20 12:03 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: Roland Dreier, linux-rdma On 11/8/2011 3:09 AM, Jason Gunthorpe wrote: > Roland Dreier wrote: >> >> Let's make sure we learn from our mistakes. Let's say we create a >> new "ext_counters" directory. What should the format of those files >> be? Should they be assumed to be 64-bit quantities? Do we want to >> allow some way of indicating the number of bits (ie 0-padded hex >> entries?)? > > That is a good idea. Let's require counters_ext to be sane: > > 1 Hex quantity of unspecified size > 2 Prefixed with 0x and leading zeros to fill out to size and allow > userspace discovery of size > 3 Size must be a multiple of 4 bits. > 4 Counters do not saturate > 5 Counters wrap around at all F's back to 0. > 6 If the counter is resettable it is only via a local operation > through netlink or sysfs or something. Not PMA reset. > > Certainly, aside from some minor details and different string > formatting, the 64 bit counters Or proposes to add meet these > requirements when the port is used in Ethernet mode. Jason, in an earlier post of yours you mentioned the netdev counters as something we should be following, well, I took a look - the counters are read-only (see /sys/class/net/$DEV/statistics/*) they are displayed in decimal/non-padded manner (see /sys/class/net/$DEV/statistics/* or /proc/net/dev or ifconfig and friends) Roland, I'd like to re-format these patches for 3.3, will be glad to hear what you would like to see here, > How do you feel about having counters_ext appear in ethernet mode and > disappear in IB mode? I think I'm fine with that, IB has its own MAD based means to query these counters. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <4EF07993.2090900-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <4EF07993.2090900-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2011-12-20 17:46 ` Jason Gunthorpe [not found] ` <20111220174639.GE25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Jason Gunthorpe @ 2011-12-20 17:46 UTC (permalink / raw) To: Or Gerlitz; +Cc: Roland Dreier, linux-rdma On Tue, Dec 20, 2011 at 02:03:31PM +0200, Or Gerlitz wrote: > >That is a good idea. Let's require counters_ext to be sane: > > > > 1 Hex quantity of unspecified size > > 2 Prefixed with 0x and leading zeros to fill out to size and allow > > userspace discovery of size > > 3 Size must be a multiple of 4 bits. > > 4 Counters do not saturate > > 5 Counters wrap around at all F's back to 0. > > 6 If the counter is resettable it is only via a local operation > > through netlink or sysfs or something. Not PMA reset. > > > >Certainly, aside from some minor details and different string > >formatting, the 64 bit counters Or proposes to add meet these > >requirements when the port is used in Ethernet mode. > > Jason, in an earlier post of yours you mentioned the netdev counters > as something we should be following, well, I took a look - the > counters are read-only (see /sys/class/net/$DEV/statistics/*) they > are displayed in decimal/non-padded manner (see > /sys/class/net/$DEV/statistics/* or /proc/net/dev or ifconfig and > friends) The netdev counters are all the same size and there is some other way to discover what the size is. I'd like to see that for IB counters too, but it is probably infeasible. So if we have counters that are not the same size as netdev counters then some kind of size indicator is required for sane userspace. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <20111220174639.GE25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111220174639.GE25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> @ 2011-12-20 19:40 ` Or Gerlitz [not found] ` <CAJZOPZ+rBKck50kmyhLZ3VNLNmoS_K6cA3Pj3L6XQjcBQo6DjA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Or Gerlitz @ 2011-12-20 19:40 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: Or Gerlitz, Roland Dreier, linux-rdma Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: > The netdev counters are all the same size and there is some other way > to discover what the size is. I'd like to see that for IB counters > too, but it is probably infeasible. So if we have counters that are > not the same size as netdev counters then some kind of size indicator > is required for sane userspace. We're talking now only on the IB extended counters who are all 64 bits Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <CAJZOPZ+rBKck50kmyhLZ3VNLNmoS_K6cA3Pj3L6XQjcBQo6DjA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <CAJZOPZ+rBKck50kmyhLZ3VNLNmoS_K6cA3Pj3L6XQjcBQo6DjA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2011-12-20 19:50 ` Jason Gunthorpe [not found] ` <20111220195014.GH25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Jason Gunthorpe @ 2011-12-20 19:50 UTC (permalink / raw) To: Or Gerlitz; +Cc: Or Gerlitz, Roland Dreier, linux-rdma On Tue, Dec 20, 2011 at 09:40:09PM +0200, Or Gerlitz wrote: > Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: > > The netdev counters are all the same size and there is some other way > > to discover what the size is. I'd like to see that for IB counters > > too, but it is probably infeasible. So if we have counters that are > > not the same size as netdev counters then some kind of size indicator > > is required for sane userspace. > > We're talking now only on the IB extended counters who are all 64 bits netdev counters are 32 bit or 64 bit, depending on how the kernel was compiled. I think indicating the size explicitly, or always being 64 bit (and extending all the lessor counters in future) is the way to go for IB.. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <20111220195014.GH25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111220195014.GH25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> @ 2011-12-20 19:56 ` Or Gerlitz 2012-01-04 19:29 ` Or Gerlitz 1 sibling, 0 replies; 49+ messages in thread From: Or Gerlitz @ 2011-12-20 19:56 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: Or Gerlitz, Roland Dreier, linux-rdma Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: >> We're talking now only on the IB extended counters who are all 64 bits > netdev counters are 32 bit or 64 bit, depending on how the kernel was > compiled. I think indicating the size explicitly, or always being 64 > bit (and extending all the lessor counters in future) is the way to go for IB.. So for the extended counters which we are dealing with now, and who are all 64bit, anything I should change in my patches? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111220195014.GH25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> 2011-12-20 19:56 ` Or Gerlitz @ 2012-01-04 19:29 ` Or Gerlitz 1 sibling, 0 replies; 49+ messages in thread From: Or Gerlitz @ 2012-01-04 19:29 UTC (permalink / raw) To: Roland Dreier, Jason Gunthorpe; +Cc: linux-rdma On Tue, Dec 20, 2011 Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: > On Tue, Dec 20, 2011 Or Gerlitz wrote: >> Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: >> > The netdev counters are all the same size and there is some other way >> > to discover what the size is. I'd like to see that for IB counters >> > too, but it is probably infeasible. So if we have counters that are >> > not the same size as netdev counters then some kind of size indicator >> > is required for sane userspace. >> We're talking now only on the IB extended counters who are all 64 bits > netdev counters are 32 bit or 64 bit, depending on how the kernel was > compiled. I think indicating the size explicitly, or always being 64 > bit (and extending all the lessor counters in future) is the way to go for IB.. Roland/Jason, Any concrete preference here? I'd like to fix the patch so it can go into 3.3-rc1 * do we want that directory to be present only when the port link type is Ethernet (I assume the ib device will be re-created across link type change as the per port HW elements need to be re-initialized). * decimal display as all the network counters and the other IB counters are? * could clarify "indicating the size explicitly" * "always being 64 bit" applies to hex decimal only, I guess Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <CAG4TOxOYAnuB8oKkJgr9_6EpF+YR-h9WNak+gSWwpV0Nz-LeZg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2011-11-08 1:09 ` Jason Gunthorpe @ 2011-12-22 17:09 ` Or Gerlitz 1 sibling, 0 replies; 49+ messages in thread From: Or Gerlitz @ 2011-12-22 17:09 UTC (permalink / raw) To: Roland Dreier; +Cc: Jason Gunthorpe, linux-rdma On 11/8/2011 2:54 AM, Roland Dreier wrote: > Let's make sure we learn from our mistakes. Let's say we create a new > "ext_counters" directory. What should the format of those files be? > Should they be assumed to be 64-bit quantities? Do we want to allow > some way of indicating the number of bits (ie 0-padded hex entries?)? Hi Roland,Jason - again, I'd like to re-submit that for kernel 3.3 - if there's anything specific you think need to change, could you please comment? thanks Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111101183730.GI26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> 2011-11-01 21:42 ` Roland Dreier @ 2011-11-01 21:46 ` Or Gerlitz [not found] ` <CAJZOPZ+zmwTdHva5C7k0QhBmEzxSGTa1Ajszz3eSH=WpP3A4fg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 1 sibling, 1 reply; 49+ messages in thread From: Or Gerlitz @ 2011-11-01 21:46 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: Roland Dreier, Or Gerlitz, linux-rdma Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote: > I don't mean the 32 bit counters are useless, I mean exposing PMA > counters that saturate and can be randomly reset by external agents > through sysfs is useless. You can't make any kind of data collection > based on such a system. > Ideally the sysfs counters are all non-saturating, non-resetting > counters like everything else in the net stack. You need a different > interface to the chip firmware to implement this, can't use the > existing PMA stuff. > In the same vien adding saturating but non-resettable PMA-esque > counters for IBoE seems pretty hackish to me.. Though I agree it is > not terribly relevant for 64 bit counters. Jason, To put things in place, the IB stack PMA counters aren't resettable through sysfs, still, under IB, the same counter set is readable through both mads and sysfs and resettable through mads. As for the saturation thing, I didn't think about that, but you're probably right and all the IBA PMA counters are saturating, but as your comment said, the 64 bit case is practically okay Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <CAJZOPZ+zmwTdHva5C7k0QhBmEzxSGTa1Ajszz3eSH=WpP3A4fg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <CAJZOPZ+zmwTdHva5C7k0QhBmEzxSGTa1Ajszz3eSH=WpP3A4fg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2011-11-01 21:58 ` Jason Gunthorpe [not found] ` <20111101215843.GO26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> 0 siblings, 1 reply; 49+ messages in thread From: Jason Gunthorpe @ 2011-11-01 21:58 UTC (permalink / raw) To: Or Gerlitz; +Cc: Roland Dreier, Or Gerlitz, linux-rdma On Tue, Nov 01, 2011 at 11:46:08PM +0200, Or Gerlitz wrote: > > In the same vien adding saturating but non-resettable PMA-esque > > counters for IBoE seems pretty hackish to me.. Though I agree it is > > not terribly relevant for 64 bit counters. > To put things in place, the IB stack PMA counters aren't resettable > through sysfs, still, under IB, the same counter set is readable > through both mads and sysfs and resettable through mads. Right, the sysfs interface is pretty much unusable for IB. Your work to make it go on IBoE makes something is very nearly usuable, but you can't write a tool that collects these counters from a port in IBoE mode and also expect it to work in IB mode because the semantics are different. My argument here is that the semantics we have for the IB case are not useful. Let us define sane semantics for the IBoE case and have a longer term clean up to make the IB case follow them as well. Sane semantics for a sysfs counter are: - Free-running - Non-saturating - No reset - 64 or 32 bit value, detectable by user-space No 6 bit counters. No counters that saturate. No counters that randomly reset. To this end, I think exporting 64 bit and 32 bit counts of the same value is not the way to go. > As for the saturation thing, I didn't think about that, but you're > probably right and all the IBA PMA counters are saturating, but as > your comment said, the 64 bit case is practically okay Will any counters that get exposed when IBoE is turned on not be 64 bits? There are not very many 64 bit PMA counters. If yes, maybe you should patch to un-export them until things can be fixed sanely... Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <20111101215843.GO26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <20111101215843.GO26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> @ 2011-11-02 8:27 ` Or Gerlitz 0 siblings, 0 replies; 49+ messages in thread From: Or Gerlitz @ 2011-11-02 8:27 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: linux-rdma On 11/1/2011 11:58 PM, Jason Gunthorpe wrote: > maybe you should patch to un-export them until things can be fixed > sanely... Jason, I don't think we want to work this way, bug are either opened or attempted to be fixed or fixed. You can't just come out of the the blue and remove functionality which is buggy (BTW - to your taste, I don't agree with your conclusive comments). Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <CAJZOPZJCY5ehTGw8htsxw0dJgMXJJAS-=LGJde9ddcMsmhqvMg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2011-11-01 18:37 ` Jason Gunthorpe @ 2011-11-01 21:49 ` Roland Dreier [not found] ` <CAG4TOxM4n3=51-3UqYF=iRS25OdRjTXoKDD6u1DukgRrwvTW1g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 1 sibling, 1 reply; 49+ messages in thread From: Roland Dreier @ 2011-11-01 21:49 UTC (permalink / raw) To: Or Gerlitz; +Cc: Jason Gunthorpe, Or Gerlitz, linux-rdma On Tue, Nov 1, 2011 at 11:14 AM, Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > Guys (Roland, Jason), I'm open to any comments, any time, for any > patch, but for a patch which was posted weeks ago it's pretty unfair > to have your comments coming only eight days after the merge window > has been opened, lets try to come quickly to decision so I can fix > this up along those lines, Let's not get into fairness here... I'm trying to make progress on my backlog but there are patches that for better or worse have been around for a year or more. And for this merge window, I did manage to do 84 files changed, 4738 insertions(+), 984 deletions(-) and include more than 60 patches. There's no obligation to merge something just because you posted it before the merge window, and in fact Linus's complaint at the kernel summit is always that sub-maintainers don't say no enough. And let's be honest in this specific case: the world is not going to end without a few performance counters in sysfs. - R. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <CAG4TOxM4n3=51-3UqYF=iRS25OdRjTXoKDD6u1DukgRrwvTW1g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <CAG4TOxM4n3=51-3UqYF=iRS25OdRjTXoKDD6u1DukgRrwvTW1g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2011-11-01 22:32 ` Or Gerlitz 2011-11-01 22:49 ` Hefty, Sean 1 sibling, 0 replies; 49+ messages in thread From: Or Gerlitz @ 2011-11-01 22:32 UTC (permalink / raw) To: Roland Dreier; +Cc: Jason Gunthorpe, Or Gerlitz, linux-rdma On Tue, Nov 1, 2011 at 11:49 PM, Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > There's no obligation to merge something just because you posted it before > the merge window, and in fact Linus's complaint at the kernel summit is > always that sub-maintainers don't say no enough. > > And let's be honest in this specific case: the world is not going to end without > a few performance counters in sysfs. Agree on all, I just want to see progress here, its okay for this discussion and its such to miss this or that merge window, as long as at some point we get into a resolution which allows to fix a patch and queue it for the next merge window, e.g in this case, if we miss 3.2-rc1 and you don't feel the patches are appropriate for -rc2, they can be queued for 3.3 BTW - this brings something I wanted to raise long ago... I would be happy to see the for-next branch active at all times, e.g in the same manner net-next is, which means that patches aren't reviewed/accepted only/mostly in the weeks before the merge window opens but rather constantly over time. This shouldn't increase the amortized load on you and reduce the possible (maybe existing to some extent?) frustration by people who submit patches long before the next window opens and don't much feedback... (again, not relevant to this patch and the next) Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
* RE: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs [not found] ` <CAG4TOxM4n3=51-3UqYF=iRS25OdRjTXoKDD6u1DukgRrwvTW1g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2011-11-01 22:32 ` Or Gerlitz @ 2011-11-01 22:49 ` Hefty, Sean 1 sibling, 0 replies; 49+ messages in thread From: Hefty, Sean @ 2011-11-01 22:49 UTC (permalink / raw) To: Roland Dreier, Or Gerlitz; +Cc: Jason Gunthorpe, Or Gerlitz, linux-rdma > Let's not get into fairness here... I'm trying to make progress on my backlog > but there are patches that for better or worse have been around for a year > or more. Along these lines, is there any news on when patchwork might be available again? I've been trying to help review some of the backlog patches, but it's hard to do without seeing them. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
* [PATCH 5/5] ib/mlx4: added support for extended PMA counters under IBoE [not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org> ` (3 preceding siblings ...) 2011-10-10 8:56 ` [PATCH 4/5] ib/core: add support for extended performance counters in sysfs Or Gerlitz @ 2011-10-10 8:57 ` Or Gerlitz 2011-10-26 7:04 ` [PATCH for-3.2] IBoE fixes Or Gerlitz 5 siblings, 0 replies; 49+ messages in thread From: Or Gerlitz @ 2011-10-10 8:57 UTC (permalink / raw) To: Roland Dreier; +Cc: linux-rdma With the HW counters being 64bit ones, use the existing IBoE PMA counters to support also extended counters. Signed-off-by: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> --- drivers/infiniband/hw/mlx4/mad.c | 49 ++++++++++++++++++++++++++++++++----- 1 files changed, 42 insertions(+), 7 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c index f36da99..7ccc9a0 100644 --- a/drivers/infiniband/hw/mlx4/mad.c +++ b/drivers/infiniband/hw/mlx4/mad.c @@ -303,13 +303,44 @@ static int ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num, return IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY; } -static void edit_counter(struct mlx4_counter *cnt, - struct ib_pma_portcounters *pma_cnt) +static int pma_get_classportinfo(struct ib_mad *out_mad) { - pma_cnt->port_xmit_data = cpu_to_be32((be64_to_cpu(cnt->tx_bytes)>>2)); - pma_cnt->port_rcv_data = cpu_to_be32((be64_to_cpu(cnt->rx_bytes)>>2)); - pma_cnt->port_xmit_packets = cpu_to_be32(be64_to_cpu(cnt->tx_frames)); - pma_cnt->port_rcv_packets = cpu_to_be32(be64_to_cpu(cnt->rx_frames)); + struct ib_class_port_info *cpi = + (struct ib_class_port_info *)(out_mad->data + 40); + + memset(cpi, 0, sizeof(cpi)); + + cpi->capability_mask = IB_PMA_CLASS_CAP_EXT_WIDTH_NOIETF ; + + return IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY; +} + +static void edit_counter(struct mlx4_counter *cnt, void *counters, + __be16 attr_id) +{ + if (attr_id == IB_PMA_PORT_COUNTERS) { + struct ib_pma_portcounters *pma_cnt = + (struct ib_pma_portcounters *) counters; + + pma_cnt->port_xmit_data = + cpu_to_be32((be64_to_cpu(cnt->tx_bytes)>>2)); + pma_cnt->port_rcv_data = + cpu_to_be32((be64_to_cpu(cnt->rx_bytes)>>2)); + pma_cnt->port_xmit_packets = + cpu_to_be32(be64_to_cpu(cnt->tx_frames)); + pma_cnt->port_rcv_packets = + cpu_to_be32(be64_to_cpu(cnt->rx_frames)); + } else { + struct ib_pma_portcounters_ext *pma_cnt_ext = + (struct ib_pma_portcounters_ext *) counters; + + pma_cnt_ext->port_xmit_data = + cpu_to_be64((be64_to_cpu(cnt->tx_bytes)>>2)); + pma_cnt_ext->port_rcv_data = + cpu_to_be64((be64_to_cpu(cnt->rx_bytes)>>2)); + pma_cnt_ext->port_xmit_packets = cnt->tx_frames; + pma_cnt_ext->port_rcv_packets = cnt->rx_frames; + } } static int iboe_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num, @@ -325,6 +356,9 @@ static int iboe_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num, if (in_mad->mad_hdr.mgmt_class != IB_MGMT_CLASS_PERF_MGMT) return -EINVAL; + if (in_mad->mad_hdr.attr_id == IB_PMA_CLASS_PORT_INFO) + return pma_get_classportinfo(out_mad); + mailbox = mlx4_alloc_cmd_mailbox(dev->dev); if (IS_ERR(mailbox)) return IB_MAD_RESULT_FAILURE; @@ -339,7 +373,8 @@ static int iboe_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num, switch (mode & 0xf) { case 0: edit_counter(mailbox->buf, - (void *)(out_mad->data + 40)); + (void *)(out_mad->data + 40), + in_mad->mad_hdr.attr_id); err = IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY; break; default: -- 1.7.4.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 49+ messages in thread
* Re: [PATCH for-3.2] IBoE fixes [not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org> ` (4 preceding siblings ...) 2011-10-10 8:57 ` [PATCH 5/5] ib/mlx4: added support for extended PMA counters under IBoE Or Gerlitz @ 2011-10-26 7:04 ` Or Gerlitz 5 siblings, 0 replies; 49+ messages in thread From: Or Gerlitz @ 2011-10-26 7:04 UTC (permalink / raw) To: Roland Dreier; +Cc: linux-rdma, Marcel Apfelbaum On 10/10/2011 10:51 AM, Or Gerlitz wrote: > This is a batch of IBoE related fixes plus addition of extended PMA counters. Hi Roland, With the merge window opening around yesterday, just checking if we're on track for merging this patch set into 3.2 (I sent V1 for patch #3 as you asked). Also, for the FDR set, Marcel has sent a fix to the "[... v2 3/5] ib/mlx4: Add extended port capabilities support" patch along the lines you were suggesting, so we'll be happy to know if you're fine with this patch and also the fourth patch which depends on it, thanks! Or. > > > > Or Gerlitz (5): > ib/mlx4: enable 4K MTU for IBoE > ib/mlx4: remove setting of vlan in IBoE WQEs control segment > net/mlx4_core: remove module param controlling the vlan table size > ib/core: add support for extended performance counters in sysfs > ib/mlx4: add support for extended PMA counters under IBoE > > drivers/infiniband/core/sysfs.c | 100 ++++++++++++++++++++++++++++++++++++- > drivers/infiniband/hw/mlx4/mad.c | 49 +++++++++++++++--- > drivers/infiniband/hw/mlx4/main.c | 2 +- > drivers/infiniband/hw/mlx4/qp.c | 11 +--- > drivers/net/mlx4/main.c | 12 +--- > include/rdma/ib_pma.h | 1 + > 6 files changed, 148 insertions(+), 27 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 49+ messages in thread
end of thread, other threads:[~2012-01-04 19:29 UTC | newest]
Thread overview: 49+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-10 8:51 [PATCH for-3.2] IBoE fixes Or Gerlitz
[not found] ` <alpine.LRH.2.00.1110101047420.10901-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
2011-10-10 8:53 ` [PATCH 1/5] ib/mlx4: enable 4K mtu for IBoE Or Gerlitz
2011-10-10 8:54 ` [PATCH 2/5] ib/mlx4: remove setting of vlan in IBoE WQEs control segment Or Gerlitz
2011-10-10 8:55 ` [PATCH 3/5] net/mlx4_core: remove module param controlling the vlan table size Or Gerlitz
[not found] ` <alpine.LRH.2.00.1110101055050.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
2011-10-10 17:00 ` Roland Dreier
[not found] ` <CAG4TOxNaQicA6ExuNsw8V95mJPD+AQX5Wrfg+rCoBxdr2E+2ww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-10-11 7:24 ` Or Gerlitz
[not found] ` <4E93EF38.5040707-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2011-10-11 17:28 ` Roland Dreier
[not found] ` <CAL1RGDVk2EOmgWU_iZQWQqKJFPNYN4o_5_14Qbr_6wUzTJ1bOg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-10-11 19:18 ` Or Gerlitz
[not found] ` <CAJZOPZJnzUm2M1zm1564TUwKCN3r=y3g-4pwTL4bJPDo2V4s9w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-10-11 19:22 ` Roland Dreier
2011-10-16 8:32 ` Or Gerlitz
2011-10-16 8:26 ` [PATCH V1 3/5] net/mlx4_core: deprecate " Or Gerlitz
2011-10-10 8:56 ` [PATCH 4/5] ib/core: add support for extended performance counters in sysfs Or Gerlitz
[not found] ` <alpine.LRH.2.00.1110101055570.11243-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
2011-10-31 19:38 ` Roland Dreier
[not found] ` <CAL1RGDXdkiZNtAsnqDtgbCJGLXLFvk8s-YA5mR189kmg6FkabQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 6:40 ` Or Gerlitz
[not found] ` <CAJZOPZ+K3u+u5XutGfBhKh3DaBpNRKV_bw67CLK6jy3060OHqA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 16:22 ` Jason Gunthorpe
[not found] ` <20111101162218.GA10710-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-01 17:23 ` Or Gerlitz
[not found] ` <CAJZOPZLXZL=vEDdXE3rE5Jzw840SrdY=73M7QgKSNzOysRE6MQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 17:36 ` Jason Gunthorpe
[not found] ` <20111101173625.GH26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-01 18:14 ` Or Gerlitz
[not found] ` <CAJZOPZJCY5ehTGw8htsxw0dJgMXJJAS-=LGJde9ddcMsmhqvMg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 18:37 ` Jason Gunthorpe
[not found] ` <20111101183730.GI26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-01 21:42 ` Roland Dreier
2011-11-01 21:44 ` Or Gerlitz
2011-11-01 21:52 ` Jason Gunthorpe
[not found] ` <20111101215200.GN26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-01 22:03 ` Ira Weiny
[not found] ` <20111101150358.17d232ee.weiny2-i2BcT+NCU+M@public.gmane.org>
2011-11-01 22:11 ` Jason Gunthorpe
[not found] ` <20111101221135.GP26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-01 22:19 ` Ira Weiny
2011-11-01 22:34 ` Or Gerlitz
[not found] ` <CAJZOPZJh2T5nCsdU9B0anUnoB=SpRD2_yBLT620zEi0Kr12MFg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 22:46 ` Ira Weiny
[not found] ` <20111101154627.f3c847dd.weiny2-i2BcT+NCU+M@public.gmane.org>
2011-11-02 7:38 ` Or Gerlitz
2011-11-02 17:16 ` Or Gerlitz
[not found] ` <4EB17AF9.3000608-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2011-11-08 0:54 ` Roland Dreier
[not found] ` <CAG4TOxOYAnuB8oKkJgr9_6EpF+YR-h9WNak+gSWwpV0Nz-LeZg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-08 1:09 ` Jason Gunthorpe
[not found] ` <20111108010952.GC4934-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-08 16:06 ` Ira Weiny
[not found] ` <20111108080606.7c5dd62c.weiny2-i2BcT+NCU+M@public.gmane.org>
2011-12-20 11:49 ` Or Gerlitz
2011-12-20 11:49 ` Or Gerlitz
2011-12-20 12:03 ` Or Gerlitz
[not found] ` <4EF07993.2090900-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2011-12-20 17:46 ` Jason Gunthorpe
[not found] ` <20111220174639.GE25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-12-20 19:40 ` Or Gerlitz
[not found] ` <CAJZOPZ+rBKck50kmyhLZ3VNLNmoS_K6cA3Pj3L6XQjcBQo6DjA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-12-20 19:50 ` Jason Gunthorpe
[not found] ` <20111220195014.GH25774-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-12-20 19:56 ` Or Gerlitz
2012-01-04 19:29 ` Or Gerlitz
2011-12-22 17:09 ` Or Gerlitz
2011-11-01 21:46 ` Or Gerlitz
[not found] ` <CAJZOPZ+zmwTdHva5C7k0QhBmEzxSGTa1Ajszz3eSH=WpP3A4fg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 21:58 ` Jason Gunthorpe
[not found] ` <20111101215843.GO26974-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-11-02 8:27 ` Or Gerlitz
2011-11-01 21:49 ` Roland Dreier
[not found] ` <CAG4TOxM4n3=51-3UqYF=iRS25OdRjTXoKDD6u1DukgRrwvTW1g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-01 22:32 ` Or Gerlitz
2011-11-01 22:49 ` Hefty, Sean
2011-10-10 8:57 ` [PATCH 5/5] ib/mlx4: added support for extended PMA counters under IBoE Or Gerlitz
2011-10-26 7:04 ` [PATCH for-3.2] IBoE fixes Or Gerlitz
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox