* [PATCH iproute2-next 1/4] libnetlink: Add rta_getattr_uint()
2024-03-08 22:29 [PATCH iproute2-next 0/4] Support for nexthop group statistics Petr Machata
@ 2024-03-08 22:29 ` Petr Machata
2024-03-08 22:58 ` Stephen Hemminger
2024-03-08 22:29 ` [PATCH iproute2-next 2/4] ip: ipnexthop: Support dumping next hop group stats Petr Machata
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: Petr Machata @ 2024-03-08 22:29 UTC (permalink / raw)
To: David Ahern, Stephen Hemminger, netdev; +Cc: Ido Schimmel, Petr Machata, mlxsw
NLA_UINT attributes have a 4- or 8-byte payload. Add a function to extract
these.
Signed-off-by: Petr Machata <petrm@nvidia.com>
---
include/libnetlink.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/include/libnetlink.h b/include/libnetlink.h
index ad7e7127..a1233659 100644
--- a/include/libnetlink.h
+++ b/include/libnetlink.h
@@ -260,6 +260,12 @@ static inline __u64 rta_getattr_u64(const struct rtattr *rta)
memcpy(&tmp, RTA_DATA(rta), sizeof(__u64));
return tmp;
}
+static inline __u64 rta_getattr_uint(const struct rtattr *rta)
+{
+ if (RTA_PAYLOAD(rta) == sizeof(__u32))
+ return rta_getattr_u32(rta);
+ return rta_getattr_u64(rta);
+}
static inline __s32 rta_getattr_s32(const struct rtattr *rta)
{
return *(__s32 *)RTA_DATA(rta);
--
2.43.0
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [PATCH iproute2-next 1/4] libnetlink: Add rta_getattr_uint()
2024-03-08 22:29 ` [PATCH iproute2-next 1/4] libnetlink: Add rta_getattr_uint() Petr Machata
@ 2024-03-08 22:58 ` Stephen Hemminger
2024-03-09 3:43 ` Jakub Kicinski
0 siblings, 1 reply; 11+ messages in thread
From: Stephen Hemminger @ 2024-03-08 22:58 UTC (permalink / raw)
To: Petr Machata; +Cc: David Ahern, netdev, Ido Schimmel, mlxsw
On Fri, 8 Mar 2024 23:29:06 +0100
Petr Machata <petrm@nvidia.com> wrote:
> +static inline __u64 rta_getattr_uint(const struct rtattr *rta)
> +{
> + if (RTA_PAYLOAD(rta) == sizeof(__u32))
> + return rta_getattr_u32(rta);
> + return rta_getattr_u64(rta);
Don't understand the use case here.
The kernel always sends the same payload size for the same attribute.
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH iproute2-next 1/4] libnetlink: Add rta_getattr_uint()
2024-03-08 22:58 ` Stephen Hemminger
@ 2024-03-09 3:43 ` Jakub Kicinski
2024-03-09 17:21 ` Stephen Hemminger
0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2024-03-09 3:43 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Petr Machata, David Ahern, netdev, Ido Schimmel, mlxsw
On Fri, 8 Mar 2024 14:58:59 -0800 Stephen Hemminger wrote:
> > +static inline __u64 rta_getattr_uint(const struct rtattr *rta)
> > +{
> > + if (RTA_PAYLOAD(rta) == sizeof(__u32))
> > + return rta_getattr_u32(rta);
> > + return rta_getattr_u64(rta);
>
> Don't understand the use case here.
> The kernel always sends the same payload size for the same attribute.
Please see commit 374d345d9b5e13380c in the kernel.
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH iproute2-next 1/4] libnetlink: Add rta_getattr_uint()
2024-03-09 3:43 ` Jakub Kicinski
@ 2024-03-09 17:21 ` Stephen Hemminger
2024-03-09 18:37 ` Jakub Kicinski
2024-03-11 10:53 ` Petr Machata
0 siblings, 2 replies; 11+ messages in thread
From: Stephen Hemminger @ 2024-03-09 17:21 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: Petr Machata, David Ahern, netdev, Ido Schimmel, mlxsw
On Fri, 8 Mar 2024 19:43:34 -0800
Jakub Kicinski <kuba@kernel.org> wrote:
> On Fri, 8 Mar 2024 14:58:59 -0800 Stephen Hemminger wrote:
> > > +static inline __u64 rta_getattr_uint(const struct rtattr *rta)
> > > +{
> > > + if (RTA_PAYLOAD(rta) == sizeof(__u32))
> > > + return rta_getattr_u32(rta);
> > > + return rta_getattr_u64(rta);
> >
> > Don't understand the use case here.
> > The kernel always sends the same payload size for the same attribute.
>
> Please see commit 374d345d9b5e13380c in the kernel.
Ok, but maybe go further and handle u16 and u8
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH iproute2-next 1/4] libnetlink: Add rta_getattr_uint()
2024-03-09 17:21 ` Stephen Hemminger
@ 2024-03-09 18:37 ` Jakub Kicinski
2024-03-11 10:53 ` Petr Machata
1 sibling, 0 replies; 11+ messages in thread
From: Jakub Kicinski @ 2024-03-09 18:37 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Petr Machata, David Ahern, netdev, Ido Schimmel, mlxsw
On Sat, 9 Mar 2024 09:21:58 -0800 Stephen Hemminger wrote:
> > > Don't understand the use case here.
> > > The kernel always sends the same payload size for the same attribute.
> >
> > Please see commit 374d345d9b5e13380c in the kernel.
>
> Ok, but maybe go further and handle u16 and u8
I guess you can if that's helpful in iproute2, as a "universal getter",
cost of a few branches. In the kernel I try hard to convince people to
never use u8 and u16 as they simply waste space on padding, and the bits
may turn out to be needed later.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH iproute2-next 1/4] libnetlink: Add rta_getattr_uint()
2024-03-09 17:21 ` Stephen Hemminger
2024-03-09 18:37 ` Jakub Kicinski
@ 2024-03-11 10:53 ` Petr Machata
1 sibling, 0 replies; 11+ messages in thread
From: Petr Machata @ 2024-03-11 10:53 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Jakub Kicinski, Petr Machata, David Ahern, netdev, Ido Schimmel,
mlxsw
Stephen Hemminger <stephen@networkplumber.org> writes:
> On Fri, 8 Mar 2024 19:43:34 -0800
> Jakub Kicinski <kuba@kernel.org> wrote:
>
>> On Fri, 8 Mar 2024 14:58:59 -0800 Stephen Hemminger wrote:
>> > > +static inline __u64 rta_getattr_uint(const struct rtattr *rta)
>> > > +{
>> > > + if (RTA_PAYLOAD(rta) == sizeof(__u32))
>> > > + return rta_getattr_u32(rta);
>> > > + return rta_getattr_u64(rta);
>> >
>> > Don't understand the use case here.
>> > The kernel always sends the same payload size for the same attribute.
>>
>> Please see commit 374d345d9b5e13380c in the kernel.
>
> Ok, but maybe go further and handle u16 and u8
I intended this as a getter of the corresponding kernel attribute type,
which only ever sends 4- and 8-byte payloads, but sure, I can add that.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH iproute2-next 2/4] ip: ipnexthop: Support dumping next hop group stats
2024-03-08 22:29 [PATCH iproute2-next 0/4] Support for nexthop group statistics Petr Machata
2024-03-08 22:29 ` [PATCH iproute2-next 1/4] libnetlink: Add rta_getattr_uint() Petr Machata
@ 2024-03-08 22:29 ` Petr Machata
2024-03-08 22:57 ` Stephen Hemminger
2024-03-08 22:29 ` [PATCH iproute2-next 3/4] ip: ipnexthop: Support dumping next hop group HW stats Petr Machata
2024-03-08 22:29 ` [PATCH iproute2-next 4/4] ip: ipnexthop: Allow toggling collection of nexthop group HW statistics Petr Machata
3 siblings, 1 reply; 11+ messages in thread
From: Petr Machata @ 2024-03-08 22:29 UTC (permalink / raw)
To: David Ahern, Stephen Hemminger, netdev; +Cc: Ido Schimmel, Petr Machata, mlxsw
Next hop group stats allow verification of balancedness of a next hop
group. The feature was merged in kernel commit 7cf497e5a122 ("Merge branch
'nexthop-group-stats'"). Add to ip the corresponding support. The
statistics are requested if "ip nexthop" is started with -s.
Signed-off-by: Petr Machata <petrm@nvidia.com>
---
ip/ipnexthop.c | 87 ++++++++++++++++++++++++++++++++++++++++++++++++++
ip/nh_common.h | 6 ++++
2 files changed, 93 insertions(+)
diff --git a/ip/ipnexthop.c b/ip/ipnexthop.c
index e946d6f9..8aa89546 100644
--- a/ip/ipnexthop.c
+++ b/ip/ipnexthop.c
@@ -25,6 +25,7 @@ static struct {
unsigned int fdb;
unsigned int id;
unsigned int nhid;
+ unsigned int op_flags;
} filter;
enum {
@@ -92,6 +93,14 @@ static int nh_dump_filter(struct nlmsghdr *nlh, int reqlen)
return err;
}
+ if (filter.op_flags) {
+ __u32 op_flags = filter.op_flags;
+
+ err = addattr32(nlh, reqlen, NHA_OP_FLAGS, op_flags);
+ if (err)
+ return err;
+ }
+
return 0;
}
@@ -296,6 +305,31 @@ static void parse_nh_res_group_rta(const struct rtattr *res_grp_attr,
}
}
+static void parse_nh_group_stats_rta(const struct rtattr *grp_stats_attr,
+ struct nh_entry *nhe)
+{
+ const struct rtattr *pos;
+ int i = 0;
+
+ rtattr_for_each_nested(pos, grp_stats_attr) {
+ struct nh_grp_stats *nh_grp_stats = &nhe->nh_grp_stats[i++];
+ struct rtattr *tb[NHA_GROUP_STATS_ENTRY_MAX + 1];
+ struct rtattr *rta;
+
+ parse_rtattr_nested(tb, NHA_GROUP_STATS_ENTRY_MAX, pos);
+
+ if (tb[NHA_GROUP_STATS_ENTRY_ID]) {
+ rta = tb[NHA_GROUP_STATS_ENTRY_ID];
+ nh_grp_stats->nh_id = rta_getattr_u32(rta);
+ }
+
+ if (tb[NHA_GROUP_STATS_ENTRY_PACKETS]) {
+ rta = tb[NHA_GROUP_STATS_ENTRY_PACKETS];
+ nh_grp_stats->packets = rta_getattr_uint(rta);
+ }
+ }
+}
+
static void print_nh_res_group(const struct nha_res_grp *res_grp)
{
struct timeval tv;
@@ -343,8 +377,33 @@ static void print_nh_res_bucket(FILE *fp, const struct rtattr *res_bucket_attr)
close_json_object();
}
+static void print_nh_grp_stats(const struct nh_entry *nhe)
+{
+ int i;
+
+ if (!show_stats)
+ return;
+
+ open_json_array(PRINT_JSON, "group_stats");
+ print_string(PRINT_FP, NULL, "\n stats:\n", NULL);
+ for (i = 0; i < nhe->nh_groups_cnt; i++) {
+ open_json_object(NULL);
+
+ print_uint(PRINT_ANY, "id", " id %u",
+ nhe->nh_grp_stats[i].nh_id);
+ print_u64(PRINT_ANY, "packets", " packets %llu",
+ nhe->nh_grp_stats[i].packets);
+
+ if (i != nhe->nh_groups_cnt - 1)
+ print_string(PRINT_FP, NULL, "\n", NULL);
+ close_json_object();
+ }
+ close_json_array(PRINT_JSON, NULL);
+}
+
static void ipnh_destroy_entry(struct nh_entry *nhe)
{
+ free(nhe->nh_grp_stats);
free(nhe->nh_encap);
free(nhe->nh_groups);
}
@@ -418,6 +477,16 @@ static int ipnh_parse_nhmsg(FILE *fp, const struct nhmsg *nhm, int len,
nhe->nh_has_res_grp = true;
}
+ if (tb[NHA_GROUP_STATS]) {
+ nhe->nh_grp_stats = calloc(nhe->nh_groups_cnt,
+ sizeof(*nhe->nh_grp_stats));
+ if (!nhe->nh_grp_stats) {
+ err = -ENOMEM;
+ goto out_err;
+ }
+ parse_nh_group_stats_rta(tb[NHA_GROUP_STATS], nhe);
+ }
+
nhe->nh_blackhole = !!tb[NHA_BLACKHOLE];
nhe->nh_fdb = !!tb[NHA_FDB];
@@ -484,9 +553,23 @@ static void __print_nexthop_entry(FILE *fp, const char *jsobj,
if (nhe->nh_fdb)
print_null(PRINT_ANY, "fdb", "fdb", NULL);
+ if (nhe->nh_grp_stats)
+ print_nh_grp_stats(nhe);
+
close_json_object();
}
+static __u32 ipnh_get_op_flags(void)
+{
+ __u32 op_flags = 0;
+
+ if (show_stats) {
+ op_flags |= NHA_OP_FLAG_DUMP_STATS;
+ }
+
+ return op_flags;
+}
+
static int __ipnh_get_id(struct rtnl_handle *rthp, __u32 nh_id,
struct nlmsghdr **answer)
{
@@ -500,8 +583,10 @@ static int __ipnh_get_id(struct rtnl_handle *rthp, __u32 nh_id,
.n.nlmsg_type = RTM_GETNEXTHOP,
.nhm.nh_family = preferred_family,
};
+ __u32 op_flags = ipnh_get_op_flags();
addattr32(&req.n, sizeof(req), NHA_ID, nh_id);
+ addattr32(&req.n, sizeof(req), NHA_OP_FLAGS, op_flags);
return rtnl_talk(rthp, &req.n, answer);
}
@@ -1093,6 +1178,8 @@ static int ipnh_list_flush(int argc, char **argv, int action)
argc--; argv++;
}
+ filter.op_flags = ipnh_get_op_flags();
+
if (action == IPNH_FLUSH)
return ipnh_flush(all);
diff --git a/ip/nh_common.h b/ip/nh_common.h
index 4d6677e6..e2f74ec5 100644
--- a/ip/nh_common.h
+++ b/ip/nh_common.h
@@ -13,6 +13,11 @@ struct nha_res_grp {
__u64 unbalanced_time;
};
+struct nh_grp_stats {
+ __u32 nh_id;
+ __u64 packets;
+};
+
struct nh_entry {
struct hlist_node nh_hash;
@@ -44,6 +49,7 @@ struct nh_entry {
int nh_groups_cnt;
struct nexthop_grp *nh_groups;
+ struct nh_grp_stats *nh_grp_stats;
};
void print_cache_nexthop_id(FILE *fp, const char *fp_prefix, const char *jsobj,
--
2.43.0
^ permalink raw reply related [flat|nested] 11+ messages in thread* [PATCH iproute2-next 3/4] ip: ipnexthop: Support dumping next hop group HW stats
2024-03-08 22:29 [PATCH iproute2-next 0/4] Support for nexthop group statistics Petr Machata
2024-03-08 22:29 ` [PATCH iproute2-next 1/4] libnetlink: Add rta_getattr_uint() Petr Machata
2024-03-08 22:29 ` [PATCH iproute2-next 2/4] ip: ipnexthop: Support dumping next hop group stats Petr Machata
@ 2024-03-08 22:29 ` Petr Machata
2024-03-08 22:29 ` [PATCH iproute2-next 4/4] ip: ipnexthop: Allow toggling collection of nexthop group HW statistics Petr Machata
3 siblings, 0 replies; 11+ messages in thread
From: Petr Machata @ 2024-03-08 22:29 UTC (permalink / raw)
To: David Ahern, Stephen Hemminger, netdev; +Cc: Ido Schimmel, Petr Machata, mlxsw
Besides SW datapath stats, the kernel also support collecting statistics
from HW datapath, for nexthop groups offloaded to HW. Request that these be
collected when ip is given "-s -s", similarly to how "ip link" shows more
statistics in that case.
Besides the statistics themselves, also show whether the collection of HW
statistics was in fact requested, and whether any driver actually
implemented the request.
Signed-off-by: Petr Machata <petrm@nvidia.com>
---
ip/ipnexthop.c | 28 ++++++++++++++++++++++++++++
ip/nh_common.h | 5 +++++
2 files changed, 33 insertions(+)
diff --git a/ip/ipnexthop.c b/ip/ipnexthop.c
index 8aa89546..573f1abb 100644
--- a/ip/ipnexthop.c
+++ b/ip/ipnexthop.c
@@ -327,6 +327,11 @@ static void parse_nh_group_stats_rta(const struct rtattr *grp_stats_attr,
rta = tb[NHA_GROUP_STATS_ENTRY_PACKETS];
nh_grp_stats->packets = rta_getattr_uint(rta);
}
+
+ if (tb[NHA_GROUP_STATS_ENTRY_PACKETS_HW]) {
+ rta = tb[NHA_GROUP_STATS_ENTRY_PACKETS_HW];
+ nh_grp_stats->packets_hw = rta_getattr_uint(rta);
+ }
}
}
@@ -393,6 +398,9 @@ static void print_nh_grp_stats(const struct nh_entry *nhe)
nhe->nh_grp_stats[i].nh_id);
print_u64(PRINT_ANY, "packets", " packets %llu",
nhe->nh_grp_stats[i].packets);
+ if (show_stats > 1)
+ print_u64(PRINT_ANY, "packets_hw", " packets_hw %llu",
+ nhe->nh_grp_stats[i].packets_hw);
if (i != nhe->nh_groups_cnt - 1)
print_string(PRINT_FP, NULL, "\n", NULL);
@@ -477,6 +485,15 @@ static int ipnh_parse_nhmsg(FILE *fp, const struct nhmsg *nhm, int len,
nhe->nh_has_res_grp = true;
}
+ if (tb[NHA_HW_STATS_ENABLE]) {
+ nhe->nh_hw_stats_supported = true;
+ nhe->nh_hw_stats_enabled =
+ !!rta_getattr_u32(tb[NHA_HW_STATS_ENABLE]);
+ }
+
+ if (tb[NHA_HW_STATS_USED])
+ nhe->nh_hw_stats_used = !!rta_getattr_u32(tb[NHA_HW_STATS_USED]);
+
if (tb[NHA_GROUP_STATS]) {
nhe->nh_grp_stats = calloc(nhe->nh_groups_cnt,
sizeof(*nhe->nh_grp_stats));
@@ -553,6 +570,15 @@ static void __print_nexthop_entry(FILE *fp, const char *jsobj,
if (nhe->nh_fdb)
print_null(PRINT_ANY, "fdb", "fdb", NULL);
+ if ((show_details > 0 || show_stats) && nhe->nh_hw_stats_supported) {
+ open_json_object("hw_stats");
+ print_on_off(PRINT_ANY, "enabled", "hw_stats %s ",
+ nhe->nh_hw_stats_enabled);
+ print_on_off(PRINT_ANY, "used", "used %s ",
+ nhe->nh_hw_stats_used);
+ close_json_object();
+ }
+
if (nhe->nh_grp_stats)
print_nh_grp_stats(nhe);
@@ -565,6 +591,8 @@ static __u32 ipnh_get_op_flags(void)
if (show_stats) {
op_flags |= NHA_OP_FLAG_DUMP_STATS;
+ if (show_stats > 1)
+ op_flags |= NHA_OP_FLAG_DUMP_HW_STATS;
}
return op_flags;
diff --git a/ip/nh_common.h b/ip/nh_common.h
index e2f74ec5..33b1d847 100644
--- a/ip/nh_common.h
+++ b/ip/nh_common.h
@@ -16,6 +16,7 @@ struct nha_res_grp {
struct nh_grp_stats {
__u32 nh_id;
__u64 packets;
+ __u64 packets_hw;
};
struct nh_entry {
@@ -32,6 +33,10 @@ struct nh_entry {
bool nh_blackhole;
bool nh_fdb;
+ bool nh_hw_stats_supported;
+ bool nh_hw_stats_enabled;
+ bool nh_hw_stats_used;
+
int nh_gateway_len;
union {
__be32 ipv4;
--
2.43.0
^ permalink raw reply related [flat|nested] 11+ messages in thread* [PATCH iproute2-next 4/4] ip: ipnexthop: Allow toggling collection of nexthop group HW statistics
2024-03-08 22:29 [PATCH iproute2-next 0/4] Support for nexthop group statistics Petr Machata
` (2 preceding siblings ...)
2024-03-08 22:29 ` [PATCH iproute2-next 3/4] ip: ipnexthop: Support dumping next hop group HW stats Petr Machata
@ 2024-03-08 22:29 ` Petr Machata
3 siblings, 0 replies; 11+ messages in thread
From: Petr Machata @ 2024-03-08 22:29 UTC (permalink / raw)
To: David Ahern, Stephen Hemminger, netdev; +Cc: Ido Schimmel, Petr Machata, mlxsw
Besides SW datapath stats, the kernel also support collecting statistics
from HW datapath, for nexthop groups offloaded to HW. Since collection of
these statistics may consume HW resources, there is an interface to request
that the HW stats be recorded. Add this toggle to "ip nexthop".
Signed-off-by: Petr Machata <petrm@nvidia.com>
---
ip/ipnexthop.c | 12 ++++++++++++
man/man8/ip-nexthop.8 | 2 ++
2 files changed, 14 insertions(+)
diff --git a/ip/ipnexthop.c b/ip/ipnexthop.c
index 573f1abb..74685c49 100644
--- a/ip/ipnexthop.c
+++ b/ip/ipnexthop.c
@@ -56,6 +56,7 @@ static void usage(void)
" [ encap ENCAPTYPE ENCAPHDR ] |\n"
" group GROUP [ fdb ] [ type TYPE [ TYPE_ARGS ] ] }\n"
"GROUP := [ <id[,weight]>/<id[,weight]>/... ]\n"
+ " [ hw_stats {off|on} ]\n"
"TYPE := { mpath | resilient }\n"
"TYPE_ARGS := [ RESILIENT_ARGS ]\n"
"RESILIENT_ARGS := [ buckets BUCKETS ] [ idle_timer IDLE ]\n"
@@ -1100,6 +1101,17 @@ static int ipnh_modify(int cmd, unsigned int flags, int argc, char **argv)
if (rtnl_rtprot_a2n(&prot, *argv))
invarg("\"protocol\" value is invalid\n", *argv);
req.nhm.nh_protocol = prot;
+ } else if (!strcmp(*argv, "hw_stats")) {
+ bool hw_stats;
+ int ret;
+
+ NEXT_ARG();
+ hw_stats = parse_on_off("hw_stats", *argv, &ret);
+ if (ret)
+ return ret;
+
+ addattr32(&req.n, sizeof(req), NHA_HW_STATS_ENABLE,
+ hw_stats);
} else if (strcmp(*argv, "help") == 0) {
usage();
} else {
diff --git a/man/man8/ip-nexthop.8 b/man/man8/ip-nexthop.8
index f81a5910..aad68696 100644
--- a/man/man8/ip-nexthop.8
+++ b/man/man8/ip-nexthop.8
@@ -68,6 +68,8 @@ ip-nexthop \- nexthop object management
.BR fdb " ] | "
.B group
.IR GROUP " [ "
+.BR hw_stats " { "
+.BR on " | " off " } ] [ "
.BR fdb " ] [ "
.B type
.IR TYPE " [ " TYPE_ARGS " ] ] }"
--
2.43.0
^ permalink raw reply related [flat|nested] 11+ messages in thread