* [PATCH net-next v12 0/4] net sched actions: improve dump performance
@ 2017-07-30 17:24 Jamal Hadi Salim
2017-07-30 17:24 ` [PATCH net-next v12 1/4] net netlink: Add new type NLA_BITFIELD32 Jamal Hadi Salim
` (4 more replies)
0 siblings, 5 replies; 19+ messages in thread
From: Jamal Hadi Salim @ 2017-07-30 17:24 UTC (permalink / raw)
To: davem
Cc: netdev, jiri, xiyou.wangcong, eric.dumazet, horms, dsahern,
Jamal Hadi Salim
From: Jamal Hadi Salim <jhs@mojatatu.com>
Changes since v11:
------------------
1) Jiri - renames: nla_value to value and nla_selector to selector
2) Jiri - rename: validate_nla_bitfield_32 to validate_nla_bitfield32
3) Jiri - rename: NLA_BITFIELD_32 to NLA_BITFIELD32
4) Jiri - remove unnecessary break when we return in case statement
5) Jiri - rename and move nla_get_bitfield_32 to an earlier patch
6) Jiri - xmas tree alignment of var declaration
7) Jiri - rename all declarations of bitfield 32 vars to be consistent ("bf")
8) Jiri - improve validate_nla_bitfield32() validation to disallow valid
bit values that are not selected by the selector
Changes since v10:
-----------------
1) Jiri: move type->validate_content() to its own patch
Jamal: decided to remove it altogether so we can get this patch set in.
2) Change name of NLA_FLAG_BITS to NLA_BITFIELD_32 based on discussions
with D. Ahern and Jiri. D. Ahern suggests making this a variable bitmap size.
My analysis at this point is that it is too complex and I only need a few bit
flags. If we run out of bits someone else can create a new NLA_BITFIELD_XXX
and start using that. So please let this go.
3) Jamal - Add Suggested-by: Jiri for type NLA_BITFIELD_32
4) Jiri: Change name allowed_flags to tcaa_root_flags_allowed
5) Jiri: Introduce nla_get_flag_bits_values() helper instead of using
memcpy for retrieving nla_bitfield_32 fields.
Changes since v9:
-----------------
1) General consensus:
- remove again the use of BIT() to maintain uapi consistency ;->
2) Jiri:
- Add a new netlink type NLA_FLAG_BITS to check for valid bits
and use it instead of inline vetting (patch 4/4 now)
Changes since v8:
-----------------
1) Jiri:
- Add back the use of BIT(). Eventually fix iproute2 instead
- Rename VALID_TCA_FLAGS to VALID_TCA_ROOT_FLAGS
Changes since v7:
-----------------
Jamal:
No changes.
Patch 1 went out twice. Resend without two copies of patch 1
Changes since v6:
-----------------
1) DaveM:
New rules for netlink messages. From now on we are going to start
checking for bits that are not used and rejecting anything we don't
understand. In the future this is going to require major changes
to user space code (tc etc). This is just a start.
To quote, David:
"
Again, bits you aren't using now, make sure userspace doesn't
set them. And if it does, reject.
"
Added checks for ensuring things work as above.
2) Jiri:
a)Fix the commit message to properly use "Fixes" description
b)Align assignments for nla_policy
Changes since v5:
----------------
0)
Remove use of BIT() because it is kernel specific. Requires a separate
patch (Jiri can submit that in his cleanups)
1)To paraphrase Eric D.
"memcpy(nla_data(count_attr), &cb->args[1], sizeof(u32));
wont work on 64bit BE machines because cb->args[1]
(which is 64 bit is larger in size than sizeof(u32))"
Fixed
2) Jiri Pirko
i) Spotted a bug fix mixed in the patch for wrong TLV
fix. Add patch 1/3 to address this. Make part of this
series because of dependencies.
ii) Rename ACT_LARGE_DUMP_ON -> TCA_FLAG_LARGE_DUMP_ON
iii) Satisfy Jiri's obsession against the noun "tcaa"
a)Rename struct nlattr *tcaa --> struct nlattr *tb
b)Rename TCAA_ACT_XXX -> TCA_ROOT_XXX
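Eric D.'s point in item 1 above can be illustrated with a minimal user
space sketch. It assumes a 64-bit slot standing in for cb->args[1]; the
function names are hypothetical and not part of the patch. Copying the
first four bytes of a 64-bit slot picks up the high-order half on a
big-endian machine, while narrowing to a u32 first (as v12 does with
act_count) is correct on both byte orders.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pattern that was objected to: copy the first 4 bytes of a 64-bit slot.
 * On big-endian hosts those bytes are the high-order half of the value.
 */
uint32_t copy_first_bytes(const uint64_t *slot)
{
	uint32_t out;

	assert(slot != NULL);
	memcpy(&out, slot, sizeof(out));
	return out;
}

/* Safer pattern: narrow by value into a u32, then copy that u32. */
uint32_t copy_via_u32(const uint64_t *slot)
{
	uint32_t tmp, out;

	assert(slot != NULL);
	tmp = (uint32_t)*slot;
	memcpy(&out, &tmp, sizeof(out));
	return out;
}

/* Helper for the sketch only: detect the byte order of the host. */
int host_is_big_endian(void)
{
	const uint16_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return first == 0;
}
```

With a slot holding 32, copy_via_u32() returns 32 on any host, whereas
copy_first_bytes() returns 32 only on little-endian hosts and 0 on
big-endian ones.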
Changes since v4:
-----------------
1) Eric D.
pointed out that when all skb space is used up by the dump
there will be no space to insert the TCAA_ACT_COUNT attribute.
2) Jiri:
i) Change:
enum {
TCAA_UNSPEC,
TCAA_ACT_TAB,
TCAA_ACT_FLAGS,
TCAA_ACT_COUNT,
TCAA_ACT_TIME_FILTER,
__TCAA_MAX
};
#define TCAA_MAX (__TCAA_MAX - 1)
#define ACT_LARGE_DUMP_ON (1 << 0)
to:
enum {
TCAA_UNSPEC,
TCAA_ACT_TAB,
#define TCA_ACT_TAB TCAA_ACT_TAB
TCAA_ACT_FLAGS,
TCAA_ACT_COUNT,
__TCAA_MAX,
#define TCAA_MAX (__TCAA_MAX - 1)
};
#define ACT_LARGE_DUMP_ON BIT(0)
Jiri plans to followup with the rest of the code to make the
style consistent.
ii) Rename attribute TCAA_ACT_TIME_FILTER --> TCAA_ACT_TIME_DELTA
iii) Rename variable jiffy_filter --> jiffy_since
iv) Rename msecs_filter --> msecs_since
v) get rid of unused cb->args[0] and rename cb->args[4] to cb->args[0]
Earlier Changes
----------------
- Jiri mostly on names of things.
Jamal Hadi Salim (4):
net netlink: Add new type NLA_BITFIELD32
net sched actions: Use proper root attribute table for actions
net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
net sched actions: add time filter for action dumping
include/net/netlink.h | 16 ++++++++++
include/uapi/linux/netlink.h | 17 ++++++++++
include/uapi/linux/rtnetlink.h | 23 ++++++++++++--
lib/nlattr.c | 30 ++++++++++++++++++
net/sched/act_api.c | 71 +++++++++++++++++++++++++++++++++++-------
5 files changed, 144 insertions(+), 13 deletions(-)
--
1.9.1
* [PATCH net-next v12 1/4] net netlink: Add new type NLA_BITFIELD32
2017-07-30 17:24 [PATCH net-next v12 0/4] net sched actions: improve dump performance Jamal Hadi Salim
@ 2017-07-30 17:24 ` Jamal Hadi Salim
2017-07-30 18:42 ` Jiri Pirko
2017-07-30 17:24 ` [PATCH net-next v12 2/4] net sched actions: Use proper root attribute table for actions Jamal Hadi Salim
` (3 subsequent siblings)
4 siblings, 1 reply; 19+ messages in thread
From: Jamal Hadi Salim @ 2017-07-30 17:24 UTC (permalink / raw)
To: davem
Cc: netdev, jiri, xiyou.wangcong, eric.dumazet, horms, dsahern,
Jamal Hadi Salim
From: Jamal Hadi Salim <jhs@mojatatu.com>
Generic bitflags attribute content sent to the kernel by user.
With this netlink attr type the user can either set or unset a
flag in the kernel.
The value is a bitmap that defines the bit values being set
The selector is a bitmask that defines which value bit is to be
considered.
A check is made to ensure that a kernel subsystem only ever accepts
bitflags the kernel already knows about, i.e.
if the user tries to set a bit flag that is not understood then
_it will be rejected_.
In the most basic form, the user specifies the attribute policy as:
[ATTR_GOO] = { .type = NLA_BITFIELD32, .validation_data = &myvalidflags },
where myvalidflags is the bit mask of the flags the kernel understands.
If the policy _does not_ provide myvalidflags (validation_data is NULL)
then the attribute will also be rejected.
Examples:
value = 0x0, and selector = 0x1
implies we are selecting bit 1 and we want to set its value to 0.
value = 0x2, and selector = 0x2
implies we are selecting bit 2 and we want to set its value to 1.
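The rules above can be sketched in plain user space C. This is only an
illustration under assumptions: struct bitfield32 and check_bitfield32()
are local stand-ins that mirror the three checks in
validate_nla_bitfield32(), and apply_bitfield32() is a hypothetical
helper showing how a consumer could apply the selected bits to an
existing flag word (the series itself only reads bf.value).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for struct nla_bitfield32 from this patch. */
struct bitfield32 {
	uint32_t value;    /* bit values being set */
	uint32_t selector; /* which bits of value are to be considered */
};

/* Mirrors the checks in validate_nla_bitfield32(). */
int check_bitfield32(const struct bitfield32 *bf, const uint32_t *valid)
{
	assert(bf != NULL);

	if (!valid)
		return -EINVAL; /* policy did not say which bits are known */
	if (bf->selector & ~*valid)
		return -EINVAL; /* selecting a bit the kernel does not know */
	if (bf->value & ~*valid)
		return -EINVAL; /* setting a bit the kernel does not know */
	if (bf->value & ~bf->selector)
		return -EINVAL; /* setting a bit that was not selected */
	return 0;
}

/* Hypothetical consumer: selected bits take the new value, others are kept. */
uint32_t apply_bitfield32(uint32_t old, const struct bitfield32 *bf)
{
	assert(bf != NULL);
	return (old & ~bf->selector) | (bf->value & bf->selector);
}
```

With a valid mask of 0x3, { value = 0x0, selector = 0x1 } is accepted and
clears the lowest flag, { value = 0x2, selector = 0x2 } is accepted and
sets the next one, and { value = 0x1, selector = 0x2 } is rejected
because the value bit is not selected.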
Suggested-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
include/net/netlink.h | 16 ++++++++++++++++
include/uapi/linux/netlink.h | 17 +++++++++++++++++
lib/nlattr.c | 30 ++++++++++++++++++++++++++++++
3 files changed, 63 insertions(+)
diff --git a/include/net/netlink.h b/include/net/netlink.h
index ef8e6c3..82dd298 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -178,6 +178,7 @@ enum {
NLA_S16,
NLA_S32,
NLA_S64,
+ NLA_BITFIELD32,
__NLA_TYPE_MAX,
};
@@ -206,6 +207,7 @@ enum {
* NLA_MSECS Leaving the length field zero will verify the
* given type fits, using it verifies minimum length
* just like "All other"
+ * NLA_BITFIELD32 A 32-bit bitmap/bitselector attribute
* All other Minimum length of attribute payload
*
* Example:
@@ -213,11 +215,13 @@ enum {
* [ATTR_FOO] = { .type = NLA_U16 },
* [ATTR_BAR] = { .type = NLA_STRING, .len = BARSIZ },
* [ATTR_BAZ] = { .len = sizeof(struct mystruct) },
+ * [ATTR_GOO] = { .type = NLA_BITFIELD32, .validation_data = &myvalidflags },
* };
*/
struct nla_policy {
u16 type;
u16 len;
+ void *validation_data;
};
/**
@@ -1203,6 +1207,18 @@ static inline struct in6_addr nla_get_in6_addr(const struct nlattr *nla)
}
/**
+ * nla_get_bitfield32 - return payload of 32 bitfield attribute
+ * @nla: nla_bitfield32 attribute
+ */
+static inline struct nla_bitfield32 nla_get_bitfield32(const struct nlattr *nla)
+{
+ struct nla_bitfield32 tmp;
+
+ nla_memcpy(&tmp, nla, sizeof(tmp));
+ return tmp;
+}
+
+/**
* nla_memdup - duplicate attribute memory (kmemdup)
* @src: netlink attribute to duplicate from
* @gfp: GFP mask
diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h
index f86127a..f4fc9c9 100644
--- a/include/uapi/linux/netlink.h
+++ b/include/uapi/linux/netlink.h
@@ -226,5 +226,22 @@ struct nlattr {
#define NLA_ALIGN(len) (((len) + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1))
#define NLA_HDRLEN ((int) NLA_ALIGN(sizeof(struct nlattr)))
+/* Generic 32 bitflags attribute content sent to the kernel.
+ *
+ * The value is a bitmap that defines the values being set
+ * The selector is a bitmask that defines which value is legit
+ *
+ * Examples:
+ * value = 0x0, and selector = 0x1
+ * implies we are selecting bit 1 and we want to set its value to 0.
+ *
+ * value = 0x2, and selector = 0x2
+ * implies we are selecting bit 2 and we want to set its value to 1.
+ *
+ */
+struct nla_bitfield32 {
+ __u32 value;
+ __u32 selector;
+};
#endif /* _UAPI__LINUX_NETLINK_H */
diff --git a/lib/nlattr.c b/lib/nlattr.c
index fb52435..ee79b7a 100644
--- a/lib/nlattr.c
+++ b/lib/nlattr.c
@@ -27,6 +27,30 @@
[NLA_S64] = sizeof(s64),
};
+static int validate_nla_bitfield32(const struct nlattr *nla,
+ u32 *valid_flags_allowed)
+{
+ const struct nla_bitfield32 *bf = nla_data(nla);
+ u32 *valid_flags_mask = valid_flags_allowed;
+
+ if (!valid_flags_allowed)
+ return -EINVAL;
+
+ /*disallow invalid bit selector */
+ if (bf->selector & ~*valid_flags_mask)
+ return -EINVAL;
+
+ /*disallow invalid bit values */
+ if (bf->value & ~*valid_flags_mask)
+ return -EINVAL;
+
+ /*disallow valid bit values that are not selected*/
+ if (bf->value & ~bf->selector)
+ return -EINVAL;
+
+ return 0;
+}
+
static int validate_nla(const struct nlattr *nla, int maxtype,
const struct nla_policy *policy)
{
@@ -46,6 +70,12 @@ static int validate_nla(const struct nlattr *nla, int maxtype,
return -ERANGE;
break;
+ case NLA_BITFIELD32:
+ if (attrlen != sizeof(struct nla_bitfield32))
+ return -ERANGE;
+
+ return validate_nla_bitfield32(nla, pt->validation_data);
+
case NLA_NUL_STRING:
if (pt->len)
minlen = min_t(int, attrlen, pt->len + 1);
--
1.9.1
* [PATCH net-next v12 2/4] net sched actions: Use proper root attribute table for actions
2017-07-30 17:24 [PATCH net-next v12 0/4] net sched actions: improve dump performance Jamal Hadi Salim
2017-07-30 17:24 ` [PATCH net-next v12 1/4] net netlink: Add new type NLA_BITFIELD32 Jamal Hadi Salim
@ 2017-07-30 17:24 ` Jamal Hadi Salim
2017-07-30 18:44 ` Jiri Pirko
2017-07-30 17:24 ` [PATCH net-next v12 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch Jamal Hadi Salim
` (2 subsequent siblings)
4 siblings, 1 reply; 19+ messages in thread
From: Jamal Hadi Salim @ 2017-07-30 17:24 UTC (permalink / raw)
To: davem
Cc: netdev, jiri, xiyou.wangcong, eric.dumazet, horms, dsahern,
Jamal Hadi Salim
From: Jamal Hadi Salim <jhs@mojatatu.com>
Bug fix for an issue which has been around for about a decade.
We got away with it because the enumeration was larger than needed.
Fixes: 7ba699c604ab ("[NET_SCHED]: Convert actions from rtnetlink to new netlink API")
Suggested-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
net/sched/act_api.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index f2e9ed3..848370e 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -1072,7 +1072,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
struct netlink_ext_ack *extack)
{
struct net *net = sock_net(skb->sk);
- struct nlattr *tca[TCA_ACT_MAX + 1];
+ struct nlattr *tca[TCAA_MAX + 1];
u32 portid = skb ? NETLINK_CB(skb).portid : 0;
int ret = 0, ovr = 0;
@@ -1080,7 +1080,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
!netlink_capable(skb, CAP_NET_ADMIN))
return -EPERM;
- ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCA_ACT_MAX, NULL,
+ ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCAA_MAX, NULL,
extack);
if (ret < 0)
return ret;
--
1.9.1
* [PATCH net-next v12 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
2017-07-30 17:24 [PATCH net-next v12 0/4] net sched actions: improve dump performance Jamal Hadi Salim
2017-07-30 17:24 ` [PATCH net-next v12 1/4] net netlink: Add new type NLA_BITFIELD32 Jamal Hadi Salim
2017-07-30 17:24 ` [PATCH net-next v12 2/4] net sched actions: Use proper root attribute table for actions Jamal Hadi Salim
@ 2017-07-30 17:24 ` Jamal Hadi Salim
2017-07-30 19:05 ` Jiri Pirko
2017-07-30 17:24 ` [PATCH net-next v12 4/4] net sched actions: add time filter for action dumping Jamal Hadi Salim
2017-07-31 2:28 ` [PATCH net-next v12 0/4] net sched actions: improve dump performance David Miller
4 siblings, 1 reply; 19+ messages in thread
From: Jamal Hadi Salim @ 2017-07-30 17:24 UTC (permalink / raw)
To: davem
Cc: netdev, jiri, xiyou.wangcong, eric.dumazet, horms, dsahern,
Jamal Hadi Salim
From: Jamal Hadi Salim <jhs@mojatatu.com>
When you dump hundreds of thousands of actions, getting only 32 per
dump batch even when the socket buffer and memory allocations allow
is inefficient.
With this change, the user will get as many as possibly fitting
within the given constraints available to the kernel.
The top level action TLV space is extended. An attribute
TCA_ROOT_FLAGS is used to carry flags; flag TCA_FLAG_LARGE_DUMP_ON
is set by the user indicating the user is capable of processing
these large dumps. Older user space which doesn't set this flag
doesn't get the larger (than 32) batches.
The kernel uses the TCA_ROOT_COUNT attribute to tell the user how many
actions are put in a single batch. As such, the user space app knows how
long to iterate (independent of the type of action being dumped)
instead of using a hardcoded maximum of 32, thus maintaining backward compat.
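A rough user space sketch of both directions, under stated assumptions:
the enum values, struct attr_hdr, struct bitfield32 and all function
names are local stand-ins (attr_hdr has the same layout as struct
nlattr), and a real application would use the uapi headers and a netlink
library instead. The request side appends a TCA_ROOT_FLAGS style
attribute carrying the large-dump bit in both value and selector; the
response side scans the top-level attributes for the count and treats 0
or a missing attribute as "fall back to the legacy limit of 32".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins for the uapi names added by this patch. */
enum { ROOT_UNSPEC, ROOT_TAB, ROOT_FLAGS, ROOT_COUNT };
#define FLAG_LARGE_DUMP_ON (1 << 0)

struct attr_hdr { uint16_t len; uint16_t type; }; /* layout of struct nlattr */
struct bitfield32 { uint32_t value; uint32_t selector; };

#define ATTR_ALIGN(len) (((len) + 3U) & ~3U)
#define ATTR_HDRLEN ATTR_ALIGN(sizeof(struct attr_hdr))

/* Append one attribute at offset off; returns the offset after it. */
size_t put_attr(unsigned char *buf, size_t off, uint16_t type,
		const void *data, uint16_t dlen)
{
	struct attr_hdr h = { .len = (uint16_t)(ATTR_HDRLEN + dlen),
			      .type = type };

	assert(buf != NULL && data != NULL);
	memcpy(buf + off, &h, sizeof(h));
	memcpy(buf + off + ATTR_HDRLEN, data, dlen);
	return off + ATTR_ALIGN(h.len);
}

/* Request side: ask for large dumps by selecting and setting the flag. */
size_t put_large_dump_flag(unsigned char *buf, size_t off)
{
	struct bitfield32 bf = { .value = FLAG_LARGE_DUMP_ON,
				 .selector = FLAG_LARGE_DUMP_ON };

	return put_attr(buf, off, ROOT_FLAGS, &bf, sizeof(bf));
}

/* Response side: return the action count, or 0 if none was found. */
uint32_t find_count(const unsigned char *buf, size_t len)
{
	size_t off = 0;

	assert(buf != NULL);
	while (off + ATTR_HDRLEN <= len) {
		struct attr_hdr h;

		memcpy(&h, buf + off, sizeof(h));
		if (h.len < ATTR_HDRLEN || off + h.len > len)
			break; /* malformed or truncated attribute */
		if (h.type == ROOT_COUNT &&
		    h.len == ATTR_HDRLEN + sizeof(uint32_t)) {
			uint32_t count;

			memcpy(&count, buf + off + ATTR_HDRLEN, sizeof(count));
			return count;
		}
		off += ATTR_ALIGN(h.len);
	}
	return 0;
}
```

A caller would use the returned count as the loop bound when walking the
nested action table, and keep the old bound of 32 when the count is 0.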
Some results dumping 1.5M actions below:
first an unpatched tc which doesn't understand these features...
prompt$ time -p tc actions ls action gact | grep index | wc -l
1500000
real 1388.43
user 2.07
sys 1386.79
Now let's see a patched tc which sets the correct flags when requesting
a dump:
prompt$ time -p updatedtc actions ls action gact | grep index | wc -l
1500000
real 178.13
user 2.02
sys 176.96
That is about an 8x performance improvement for a tc app which sets its
receive buffer to about 32K.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
include/uapi/linux/rtnetlink.h | 22 +++++++++++++++++--
net/sched/act_api.c | 50 +++++++++++++++++++++++++++++++++---------
2 files changed, 60 insertions(+), 12 deletions(-)
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index d148505..bfa80a6 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -683,10 +683,28 @@ struct tcamsg {
unsigned char tca__pad1;
unsigned short tca__pad2;
};
+
+enum {
+ TCA_ROOT_UNSPEC,
+ TCA_ROOT_TAB,
+#define TCA_ACT_TAB TCA_ROOT_TAB
+#define TCAA_MAX TCA_ROOT_TAB
+ TCA_ROOT_FLAGS,
+ TCA_ROOT_COUNT,
+ __TCA_ROOT_MAX,
+#define TCA_ROOT_MAX (__TCA_ROOT_MAX - 1)
+};
+
#define TA_RTA(r) ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct tcamsg))))
#define TA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct tcamsg))
-#define TCA_ACT_TAB 1 /* attr type must be >=1 */
-#define TCAA_MAX 1
+/* tcamsg flags stored in attribute TCA_ROOT_FLAGS
+ *
+ * TCA_FLAG_LARGE_DUMP_ON user->kernel to request for larger than TCA_ACT_MAX_PRIO
+ * actions in a dump. All dump responses will contain the number of actions
+ * being dumped stored in for user app's consumption in TCA_ROOT_COUNT
+ *
+ */
+#define TCA_FLAG_LARGE_DUMP_ON (1 << 0)
/* New extended info filters for IFLA_EXT_MASK */
#define RTEXT_FILTER_VF (1 << 0)
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 848370e..d53653a 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -110,6 +110,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
struct netlink_callback *cb)
{
int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
+ u32 act_flags = cb->args[2];
struct nlattr *nest;
spin_lock_bh(&hinfo->lock);
@@ -138,14 +139,18 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
}
nla_nest_end(skb, nest);
n_i++;
- if (n_i >= TCA_ACT_MAX_PRIO)
+ if (!(act_flags & TCA_FLAG_LARGE_DUMP_ON) &&
+ n_i >= TCA_ACT_MAX_PRIO)
goto done;
}
}
done:
spin_unlock_bh(&hinfo->lock);
- if (n_i)
+ if (n_i) {
cb->args[0] += n_i;
+ if (act_flags & TCA_FLAG_LARGE_DUMP_ON)
+ cb->args[1] = n_i;
+ }
return n_i;
nla_put_failure:
@@ -1068,11 +1073,17 @@ static int tcf_action_add(struct net *net, struct nlattr *nla,
return tcf_add_notify(net, n, &actions, portid);
}
+static u32 tcaa_root_flags_allowed = TCA_FLAG_LARGE_DUMP_ON;
+static const struct nla_policy tcaa_policy[TCA_ROOT_MAX + 1] = {
+ [TCA_ROOT_FLAGS] = { .type = NLA_BITFIELD32,
+ .validation_data = &tcaa_root_flags_allowed },
+};
+
static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
struct netlink_ext_ack *extack)
{
struct net *net = sock_net(skb->sk);
- struct nlattr *tca[TCAA_MAX + 1];
+ struct nlattr *tca[TCA_ROOT_MAX + 1];
u32 portid = skb ? NETLINK_CB(skb).portid : 0;
int ret = 0, ovr = 0;
@@ -1080,7 +1091,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
!netlink_capable(skb, CAP_NET_ADMIN))
return -EPERM;
- ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCAA_MAX, NULL,
+ ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCA_ROOT_MAX, NULL,
extack);
if (ret < 0)
return ret;
@@ -1121,16 +1132,12 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
return ret;
}
-static struct nlattr *find_dump_kind(const struct nlmsghdr *n)
+static struct nlattr *find_dump_kind(struct nlattr **nla)
{
struct nlattr *tb1, *tb2[TCA_ACT_MAX + 1];
struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
- struct nlattr *nla[TCAA_MAX + 1];
struct nlattr *kind;
- if (nlmsg_parse(n, sizeof(struct tcamsg), nla, TCAA_MAX,
- NULL, NULL) < 0)
- return NULL;
tb1 = nla[TCA_ACT_TAB];
if (tb1 == NULL)
return NULL;
@@ -1157,8 +1164,18 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
struct tc_action_ops *a_o;
int ret = 0;
struct tcamsg *t = (struct tcamsg *) nlmsg_data(cb->nlh);
- struct nlattr *kind = find_dump_kind(cb->nlh);
+ struct nlattr *tb[TCA_ROOT_MAX + 1];
+ struct nlattr *count_attr = NULL;
+ struct nlattr *kind = NULL;
+ struct nla_bitfield32 bf;
+ u32 act_count = 0;
+
+ ret = nlmsg_parse(cb->nlh, sizeof(struct tcamsg), tb, TCA_ROOT_MAX,
+ tcaa_policy, NULL);
+ if (ret < 0)
+ return ret;
+ kind = find_dump_kind(tb);
if (kind == NULL) {
pr_info("tc_dump_action: action bad kind\n");
return 0;
@@ -1168,14 +1185,24 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
if (a_o == NULL)
return 0;
+ cb->args[2] = 0;
+ if (tb[TCA_ROOT_FLAGS]) {
+ bf = nla_get_bitfield32(tb[TCA_ROOT_FLAGS]);
+ cb->args[2] = bf.value;
+ }
+
nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
cb->nlh->nlmsg_type, sizeof(*t), 0);
if (!nlh)
goto out_module_put;
+
t = nlmsg_data(nlh);
t->tca_family = AF_UNSPEC;
t->tca__pad1 = 0;
t->tca__pad2 = 0;
+ count_attr = nla_reserve(skb, TCA_ROOT_COUNT, sizeof(u32));
+ if (!count_attr)
+ goto out_module_put;
nest = nla_nest_start(skb, TCA_ACT_TAB);
if (nest == NULL)
@@ -1188,6 +1215,9 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
if (ret > 0) {
nla_nest_end(skb, nest);
ret = skb->len;
+ act_count = cb->args[1];
+ memcpy(nla_data(count_attr), &act_count, sizeof(u32));
+ cb->args[1] = 0;
} else
nlmsg_trim(skb, b);
--
1.9.1
* [PATCH net-next v12 4/4] net sched actions: add time filter for action dumping
2017-07-30 17:24 [PATCH net-next v12 0/4] net sched actions: improve dump performance Jamal Hadi Salim
` (2 preceding siblings ...)
2017-07-30 17:24 ` [PATCH net-next v12 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch Jamal Hadi Salim
@ 2017-07-30 17:24 ` Jamal Hadi Salim
2017-07-30 19:06 ` Jiri Pirko
2017-07-31 2:28 ` [PATCH net-next v12 0/4] net sched actions: improve dump performance David Miller
4 siblings, 1 reply; 19+ messages in thread
From: Jamal Hadi Salim @ 2017-07-30 17:24 UTC (permalink / raw)
To: davem
Cc: netdev, jiri, xiyou.wangcong, eric.dumazet, horms, dsahern,
Jamal Hadi Salim
From: Jamal Hadi Salim <jhs@mojatatu.com>
This patch adds support for filtering based on time since last used.
When we are dumping a large number of actions it is useful to
have the option of filtering based on when the action was last
used to reduce the amount of data crossing to user space.
With this patch the user space app sets the TCA_ROOT_TIME_DELTA
attribute with the value in milliseconds with "time of interest
since now". The kernel converts this to jiffies and does the
filtering comparison matching entries that have seen activity
since then and returns them to user space.
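The conversion and comparison described above can be sketched in user
space as follows. This is a sketch under assumptions, not the kernel
code: SKETCH_HZ is a hypothetical tick rate, msecs_to_ticks() is a
simplified round-up stand-in for msecs_to_jiffies(), and tick_after()
copies the wrap-safe idea behind time_after(). As in the patch, a cutoff
of 0 means "no filter".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_HZ 250UL /* hypothetical tick rate, for illustration only */

/* Wrap-safe "a is later than b", same idea as the kernel's time_after(). */
bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Simplified stand-in for msecs_to_jiffies(): round up to whole ticks. */
unsigned long msecs_to_ticks(uint32_t msecs)
{
	return (unsigned long)(((uint64_t)msecs * SKETCH_HZ + 999U) / 1000U);
}

/* Cutoff tick for "used within the last msecs_since ms"; 0 = no filter. */
unsigned long since_cutoff(unsigned long now, uint32_t msecs_since)
{
	return msecs_since ? now - msecs_to_ticks(msecs_since) : 0;
}

/* Dump an action unless a cutoff is set and the last use predates it. */
bool should_dump(unsigned long cutoff, unsigned long lastuse)
{
	if (cutoff && tick_after(cutoff, lastuse))
		return false;
	return true;
}

/* Small self-check, including a case where the tick counter has wrapped. */
void time_filter_example(void)
{
	unsigned long cutoff = since_cutoff(1000UL, 2000); /* 500 ticks back */

	assert(cutoff == 500UL);
	assert(!should_dump(cutoff, 400UL)); /* last used before the cutoff */
	assert(should_dump(cutoff, 600UL));  /* used since the cutoff */

	cutoff = since_cutoff(100UL, 2000);  /* cutoff wraps below zero */
	assert(should_dump(cutoff, 100UL - 110UL));
	assert(!should_dump(cutoff, 100UL - 600UL));
}
```

One consequence of using 0 as the "no filter" marker is that a computed
cutoff that happens to land exactly on tick 0 disables filtering for
that dump.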
Old kernels and old tc continue to work in legacy mode since
they don't specify this attribute.
An example (we have 400 actions bound to 400 filters) at
installation time, using an updated tc and setting the time of
interest to 120 seconds earlier (we see 400 actions):
prompt$ hackedtc actions ls action gact since 120000| grep index | wc -l
400
go get some coffee and wait for > 120 seconds and try again:
prompt$ hackedtc actions ls action gact since 120000 | grep index | wc -l
0
Let's see a filter bound to one of these actions:
....
filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 (rule hit 2 success 1)
match 7f000002/ffffffff at 12 (success 1 )
action order 1: gact action pass
random type none pass val 0
index 23 ref 2 bind 1 installed 1145 sec used 802 sec
Action statistics:
Sent 84 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
....
that coffee took long, no? It was good.
Now let's ping -c 1 127.0.0.2, then run the actions again:
prompt$ hackedtc actions ls action gact since 120 | grep index | wc -l
1
More details please:
prompt$ hackedtc -s actions ls action gact since 120000
action order 0: gact action pass
random type none pass val 0
index 23 ref 2 bind 1 installed 1270 sec used 30 sec
Action statistics:
Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
And the filter?
filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 (rule hit 4 success 2)
match 7f000002/ffffffff at 12 (success 2 )
action order 1: gact action pass
random type none pass val 0
index 23 ref 2 bind 1 installed 1324 sec used 84 sec
Action statistics:
Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
include/uapi/linux/rtnetlink.h | 1 +
net/sched/act_api.c | 21 ++++++++++++++++++++-
2 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index bfa80a6..dab7dad 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -691,6 +691,7 @@ enum {
#define TCAA_MAX TCA_ROOT_TAB
TCA_ROOT_FLAGS,
TCA_ROOT_COUNT,
+ TCA_ROOT_TIME_DELTA, /* in msecs */
__TCA_ROOT_MAX,
#define TCA_ROOT_MAX (__TCA_ROOT_MAX - 1)
};
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index d53653a..f19b118 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -111,6 +111,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
{
int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
u32 act_flags = cb->args[2];
+ unsigned long jiffy_since = cb->args[3];
struct nlattr *nest;
spin_lock_bh(&hinfo->lock);
@@ -128,6 +129,11 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
if (index < s_i)
continue;
+ if (jiffy_since &&
+ time_after(jiffy_since,
+ (unsigned long)p->tcfa_tm.lastuse))
+ continue;
+
nest = nla_nest_start(skb, n_i);
if (nest == NULL)
goto nla_put_failure;
@@ -145,9 +151,11 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
}
}
done:
+ if (index >= 0)
+ cb->args[0] = index + 1;
+
spin_unlock_bh(&hinfo->lock);
if (n_i) {
- cb->args[0] += n_i;
if (act_flags & TCA_FLAG_LARGE_DUMP_ON)
cb->args[1] = n_i;
}
@@ -1077,6 +1085,7 @@ static int tcf_action_add(struct net *net, struct nlattr *nla,
static const struct nla_policy tcaa_policy[TCA_ROOT_MAX + 1] = {
[TCA_ROOT_FLAGS] = { .type = NLA_BITFIELD32,
.validation_data = &tcaa_root_flags_allowed },
+ [TCA_ROOT_TIME_DELTA] = { .type = NLA_U32 },
};
static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
@@ -1166,8 +1175,10 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
struct tcamsg *t = (struct tcamsg *) nlmsg_data(cb->nlh);
struct nlattr *tb[TCA_ROOT_MAX + 1];
struct nlattr *count_attr = NULL;
+ unsigned long jiffy_since = 0;
struct nlattr *kind = NULL;
struct nla_bitfield32 bf;
+ u32 msecs_since = 0;
u32 act_count = 0;
ret = nlmsg_parse(cb->nlh, sizeof(struct tcamsg), tb, TCA_ROOT_MAX,
@@ -1191,15 +1202,23 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
cb->args[2] = bf.value;
}
+ if (tb[TCA_ROOT_TIME_DELTA]) {
+ msecs_since = nla_get_u32(tb[TCA_ROOT_TIME_DELTA]);
+ }
+
nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
cb->nlh->nlmsg_type, sizeof(*t), 0);
if (!nlh)
goto out_module_put;
+ if (msecs_since)
+ jiffy_since = jiffies - msecs_to_jiffies(msecs_since);
+
t = nlmsg_data(nlh);
t->tca_family = AF_UNSPEC;
t->tca__pad1 = 0;
t->tca__pad2 = 0;
+ cb->args[3] = jiffy_since;
count_attr = nla_reserve(skb, TCA_ROOT_COUNT, sizeof(u32));
if (!count_attr)
goto out_module_put;
--
1.9.1
* Re: [PATCH net-next v12 1/4] net netlink: Add new type NLA_BITFIELD32
2017-07-30 17:24 ` [PATCH net-next v12 1/4] net netlink: Add new type NLA_BITFIELD32 Jamal Hadi Salim
@ 2017-07-30 18:42 ` Jiri Pirko
2017-07-30 19:59 ` Jamal Hadi Salim
0 siblings, 1 reply; 19+ messages in thread
From: Jiri Pirko @ 2017-07-30 18:42 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, netdev, xiyou.wangcong, eric.dumazet, horms, dsahern
Sun, Jul 30, 2017 at 07:24:49PM CEST, jhs@mojatatu.com wrote:
>From: Jamal Hadi Salim <jhs@mojatatu.com>
>
>Generic bitflags attribute content sent to the kernel by user.
>With this netlink attr type the user can either set or unset a
>flag in the kernel.
>
>The value is a bitmap that defines the bit values being set
>The selector is a bitmask that defines which value bit is to be
>considered.
>
>A check is made to ensure the rules that a kernel subsystem always
>conforms to bitflags the kernel already knows about. i.e
>if the user tries to set a bit flag that is not understood then
>the _it will be rejected_.
>
>In the most basic form, the user specifies the attribute policy as:
>[ATTR_GOO] = { .type = NLA_BITFIELD32, .validation_data = &myvalidflags },
>
>where myvalidflags is the bit mask of the flags the kernel understands.
>
>If the user _does not_ provide myvalidflags then the attribute will
>also be rejected.
>
>Examples:
>value = 0x0, and selector = 0x1
>implies we are selecting bit 1 and we want to set its value to 0.
>
>value = 0x2, and selector = 0x2
>implies we are selecting bit 2 and we want to set its value to 1.
>
>Suggested-by: Jiri Pirko <jiri@mellanox.com>
>Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
>---
> include/net/netlink.h | 16 ++++++++++++++++
> include/uapi/linux/netlink.h | 17 +++++++++++++++++
> lib/nlattr.c | 30 ++++++++++++++++++++++++++++++
> 3 files changed, 63 insertions(+)
>
>diff --git a/include/net/netlink.h b/include/net/netlink.h
>index ef8e6c3..82dd298 100644
>--- a/include/net/netlink.h
>+++ b/include/net/netlink.h
>@@ -178,6 +178,7 @@ enum {
> NLA_S16,
> NLA_S32,
> NLA_S64,
>+ NLA_BITFIELD32,
> __NLA_TYPE_MAX,
> };
>
>@@ -206,6 +207,7 @@ enum {
> * NLA_MSECS Leaving the length field zero will verify the
> * given type fits, using it verifies minimum length
> * just like "All other"
>+ * NLA_BITFIELD32 A 32-bit bitmap/bitselector attribute
> * All other Minimum length of attribute payload
> *
> * Example:
>@@ -213,11 +215,13 @@ enum {
> * [ATTR_FOO] = { .type = NLA_U16 },
> * [ATTR_BAR] = { .type = NLA_STRING, .len = BARSIZ },
> * [ATTR_BAZ] = { .len = sizeof(struct mystruct) },
>+ * [ATTR_GOO] = { .type = NLA_BITFIELD32, .validation_data = &myvalidflags },
Checkpatch warns you about the line being too long, please wrap it.
Btw, I did not see you reached a consensus with DavidA regarding this.
Did I miss it?
> * };
> */
> struct nla_policy {
> u16 type;
> u16 len;
>+ void *validation_data;
> };
>
> /**
>@@ -1203,6 +1207,18 @@ static inline struct in6_addr nla_get_in6_addr(const struct nlattr *nla)
> }
>
> /**
>+ * nla_get_bitfield32 - return payload of 32 bitfield attribute
>+ * @nla: nla_bitfield32 attribute
>+ */
>+static inline struct nla_bitfield32 nla_get_bitfield32(const struct nlattr *nla)
>+{
>+ struct nla_bitfield32 tmp;
>+
>+ nla_memcpy(&tmp, nla, sizeof(tmp));
>+ return tmp;
>+}
>+
>+/**
> * nla_memdup - duplicate attribute memory (kmemdup)
> * @src: netlink attribute to duplicate from
> * @gfp: GFP mask
>diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h
>index f86127a..f4fc9c9 100644
>--- a/include/uapi/linux/netlink.h
>+++ b/include/uapi/linux/netlink.h
>@@ -226,5 +226,22 @@ struct nlattr {
> #define NLA_ALIGN(len) (((len) + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1))
> #define NLA_HDRLEN ((int) NLA_ALIGN(sizeof(struct nlattr)))
>
>+/* Generic 32 bitflags attribute content sent to the kernel.
>+ *
>+ * The value is a bitmap that defines the values being set
>+ * The selector is a bitmask that defines which value is legit
>+ *
>+ * Examples:
>+ * value = 0x0, and selector = 0x1
>+ * implies we are selecting bit 1 and we want to set its value to 0.
>+ *
>+ * value = 0x2, and selector = 0x2
>+ * implies we are selecting bit 2 and we want to set its value to 1.
>+ *
>+ */
>+struct nla_bitfield32 {
>+ __u32 value;
>+ __u32 selector;
>+};
>
> #endif /* _UAPI__LINUX_NETLINK_H */
>diff --git a/lib/nlattr.c b/lib/nlattr.c
>index fb52435..ee79b7a 100644
>--- a/lib/nlattr.c
>+++ b/lib/nlattr.c
>@@ -27,6 +27,30 @@
> [NLA_S64] = sizeof(s64),
> };
>
>+static int validate_nla_bitfield32(const struct nlattr *nla,
>+ u32 *valid_flags_allowed)
>+{
>+ const struct nla_bitfield32 *bf = nla_data(nla);
>+ u32 *valid_flags_mask = valid_flags_allowed;
I pointed this out already. This is weird.
You do *u32 = *u32, just with different name. Just use valid_flags_allowed
directly.
>+
>+ if (!valid_flags_allowed)
>+ return -EINVAL;
>+
>+ /*disallow invalid bit selector */
Fix all the comments in this function. Should be
/* something */
with spaces in front and at the end.
>+ if (bf->selector & ~*valid_flags_mask)
>+ return -EINVAL;
>+
>+ /*disallow invalid bit values */
>+ if (bf->value & ~*valid_flags_mask)
>+ return -EINVAL;
>+
>+ /*disallow valid bit values that are not selected*/
>+ if (bf->value & ~bf->selector)
>+ return -EINVAL;
>+
>+ return 0;
>+}
>+
> static int validate_nla(const struct nlattr *nla, int maxtype,
> const struct nla_policy *policy)
> {
>@@ -46,6 +70,12 @@ static int validate_nla(const struct nlattr *nla, int maxtype,
> return -ERANGE;
> break;
>
>+ case NLA_BITFIELD32:
>+ if (attrlen != sizeof(struct nla_bitfield32))
>+ return -ERANGE;
>+
>+ return validate_nla_bitfield32(nla, pt->validation_data);
>+
> case NLA_NUL_STRING:
> if (pt->len)
> minlen = min_t(int, attrlen, pt->len + 1);
>--
>1.9.1
>
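For readers following the review, the three checks in the quoted
validate_nla_bitfield32() can be tried outside the kernel. The following is
a minimal userspace sketch, not the patch itself: struct bitfield32 and
check_bitfield32() are hypothetical stand-ins for struct nla_bitfield32 and
the kernel function, and the allowed mask is passed by value here rather
than through the policy's validation_data pointer.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace mirror of the uapi struct quoted above (hypothetical name). */
struct bitfield32 {
	uint32_t value;
	uint32_t selector;
};

/*
 * Returns 0 if bf is acceptable under the allowed mask, -EINVAL otherwise.
 * Passing the mask by value is an assumption of this sketch.
 */
static int check_bitfield32(const struct bitfield32 *bf, uint32_t allowed)
{
	/* disallow selecting bits outside the allowed mask */
	if (bf->selector & ~allowed)
		return -EINVAL;

	/* disallow values outside the allowed mask */
	if (bf->value & ~allowed)
		return -EINVAL;

	/* disallow set values that are not selected */
	if (bf->value & ~bf->selector)
		return -EINVAL;

	return 0;
}
```

With an allowed mask of 0x1, {value 0x1, selector 0x1} and {value 0x0,
selector 0x1} pass, while {0x8, 0x8} (unknown bit) and {0x1, 0x0} (value set
but not selected) are rejected.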
* Re: [PATCH net-next v12 2/4] net sched actions: Use proper root attribute table for actions
2017-07-30 17:24 ` [PATCH net-next v12 2/4] net sched actions: Use proper root attribute table for actions Jamal Hadi Salim
@ 2017-07-30 18:44 ` Jiri Pirko
0 siblings, 0 replies; 19+ messages in thread
From: Jiri Pirko @ 2017-07-30 18:44 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, netdev, xiyou.wangcong, eric.dumazet, horms, dsahern
Sun, Jul 30, 2017 at 07:24:50PM CEST, jhs@mojatatu.com wrote:
>From: Jamal Hadi Salim <jhs@mojatatu.com>
>
>Bug fix for an issue which has been around for about a decade.
>We got away with it because the enumeration was larger than needed.
>
>Fixes: 7ba699c604ab ("[NET_SCHED]: Convert actions from rtnetlink to new netlink API")
>Suggested-by: Jiri Pirko <jiri@mellanox.com>
>Reviewed-by: Simon Horman <simon.horman@netronome.com>
>Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
* Re: [PATCH net-next v12 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
2017-07-30 17:24 ` [PATCH net-next v12 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch Jamal Hadi Salim
@ 2017-07-30 19:05 ` Jiri Pirko
0 siblings, 0 replies; 19+ messages in thread
From: Jiri Pirko @ 2017-07-30 19:05 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, netdev, xiyou.wangcong, eric.dumazet, horms, dsahern
Sun, Jul 30, 2017 at 07:24:51PM CEST, jhs@mojatatu.com wrote:
>From: Jamal Hadi Salim <jhs@mojatatu.com>
>
>When you dump hundreds of thousands of actions, getting only 32 per
>dump batch even when the socket buffer and memory allocations allow
>is inefficient.
>
>With this change, the user will get as many as possibly fitting
>within the given constraints available to the kernel.
>
>The top level action TLV space is extended. An attribute
>TCA_ROOT_FLAGS is used to carry flags; flag TCA_FLAG_LARGE_DUMP_ON
>is set by the user indicating the user is capable of processing
>these large dumps. Older user space which doesn't set this flag
>doesn't get the larger (than 32) batches.
>The kernel uses the TCA_ROOT_COUNT attribute to tell the user how many
>actions are put in a single batch. As such user space app knows how long
>to iterate (independent of the type of action being dumped)
>instead of hardcoded maximum of 32 thus maintaining backward compat.
>
>Some results dumping 1.5M actions below:
>first an unpatched tc which doesn't understand these features...
>
>prompt$ time -p tc actions ls action gact | grep index | wc -l
>1500000
>real 1388.43
>user 2.07
>sys 1386.79
>
>Now let's see a patched tc which sets the correct flags when requesting
>a dump:
>
>prompt$ time -p updatedtc actions ls action gact | grep index | wc -l
>1500000
>real 178.13
>user 2.02
>sys 176.96
>
>That is about 8x performance improvement for tc app which sets its
>receive buffer to about 32K.
>
>Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
If DavidA is ok with the "validation_data", I am fine with this patch.
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
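The backward-compatible iteration bound described in the commit message can
be sketched as follows. This is only an illustration under stated
assumptions: batch_action_count() and ACT_MAX_PRIO are hypothetical names,
with ACT_MAX_PRIO standing in for TCA_ACT_MAX_PRIO (the hardcoded 32
mentioned above), and the count pointer stands for the TCA_ROOT_COUNT
payload when the kernel sent one.

```c
#include <stddef.h>
#include <stdint.h>

#define ACT_MAX_PRIO 32	/* stand-in for TCA_ACT_MAX_PRIO */

/*
 * count points at the TCA_ROOT_COUNT payload, or is NULL when the
 * attribute is absent. Fall back to the legacy bound in that case so
 * that old and new kernels are handled by the same loop.
 */
static uint32_t batch_action_count(const uint32_t *count)
{
	if (count && *count)
		return *count;
	return ACT_MAX_PRIO;
}
```

A dump consumer would then iterate over that many action slots per batch
instead of assuming 32.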
* Re: [PATCH net-next v12 4/4] net sched actions: add time filter for action dumping
2017-07-30 17:24 ` [PATCH net-next v12 4/4] net sched actions: add time filter for action dumping Jamal Hadi Salim
@ 2017-07-30 19:06 ` Jiri Pirko
0 siblings, 0 replies; 19+ messages in thread
From: Jiri Pirko @ 2017-07-30 19:06 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, netdev, xiyou.wangcong, eric.dumazet, horms, dsahern
Sun, Jul 30, 2017 at 07:24:52PM CEST, jhs@mojatatu.com wrote:
>From: Jamal Hadi Salim <jhs@mojatatu.com>
>
>This patch adds support for filtering based on time since last used.
>When we are dumping a large number of actions it is useful to
>have the option of filtering based on when the action was last
>used to reduce the amount of data crossing to user space.
>
>With this patch the user space app sets the TCA_ROOT_TIME_DELTA
>attribute with the value in milliseconds with "time of interest
>since now". The kernel converts this to jiffies and does the
>filtering comparison matching entries that have seen activity
>since then and returns them to user space.
>Old kernels and old tc continue to work in legacy mode since
>they don't specify this attribute.
>
>An example (we have 400 actions bound to 400 filters) at
>installation time, using an updated tc and setting the time of
>interest to 120 seconds earlier (we see 400 actions):
>prompt$ hackedtc actions ls action gact since 120000| grep index | wc -l
>400
>
>go get some coffee and wait for > 120 seconds and try again:
>
>prompt$ hackedtc actions ls action gact since 120000 | grep index | wc -l
>0
>
>Let's see a filter bound to one of these actions:
>....
>filter pref 10 u32
>filter pref 10 u32 fh 800: ht divisor 1
>filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 (rule hit 2 success 1)
> match 7f000002/ffffffff at 12 (success 1 )
> action order 1: gact action pass
> random type none pass val 0
> index 23 ref 2 bind 1 installed 1145 sec used 802 sec
> Action statistics:
> Sent 84 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
>....
>
>that coffee took long, no? It was good.
>
>Now let's ping -c 1 127.0.0.2, then run the actions again:
>prompt$ hackedtc actions ls action gact since 120 | grep index | wc -l
>1
>
>More details please:
>prompt$ hackedtc -s actions ls action gact since 120000
>
> action order 0: gact action pass
> random type none pass val 0
> index 23 ref 2 bind 1 installed 1270 sec used 30 sec
> Action statistics:
> Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
>
>And the filter?
>
>filter pref 10 u32
>filter pref 10 u32 fh 800: ht divisor 1
>filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 (rule hit 4 success 2)
> match 7f000002/ffffffff at 12 (success 2 )
> action order 1: gact action pass
> random type none pass val 0
> index 23 ref 2 bind 1 installed 1324 sec used 84 sec
> Action statistics:
> Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
>
>Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
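The "time of interest since now" comparison described above can be
illustrated in plain C. This is a sketch under assumptions, not the kernel
code: msecs_to_ticks() and used_since() are hypothetical helpers, the tick
rate is passed in explicitly, and a 32-bit tick counter is used so the
wraparound handling is easy to see.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert milliseconds to ticks for a given tick rate, rounding up. */
static uint32_t msecs_to_ticks(uint32_t msecs, uint32_t hz)
{
	return (uint32_t)(((uint64_t)msecs * hz + 999) / 1000);
}

/*
 * True if an action last used at tick lastuse has seen activity within
 * the last delta ticks before now. The signed difference keeps the
 * comparison correct across counter wraparound.
 */
static bool used_since(uint32_t now, uint32_t lastuse, uint32_t delta)
{
	uint32_t since = now - delta;

	return (int32_t)(lastuse - since) >= 0;
}
```

In the transcript above, the action last used 802 seconds ago falls outside
a 120000 ms window and is skipped, while the same action 30 seconds after
the ping falls inside it and is returned.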
* Re: [PATCH net-next v12 1/4] net netlink: Add new type NLA_BITFIELD32
2017-07-30 18:42 ` Jiri Pirko
@ 2017-07-30 19:59 ` Jamal Hadi Salim
2017-07-31 2:27 ` David Ahern
2017-07-31 6:38 ` Jiri Pirko
0 siblings, 2 replies; 19+ messages in thread
From: Jamal Hadi Salim @ 2017-07-30 19:59 UTC (permalink / raw)
To: Jiri Pirko; +Cc: davem, netdev, xiyou.wangcong, eric.dumazet, horms, dsahern
Jiri,
This is getting exhausting, seriously.
I posted the code you are commenting on two days ago so I don't have to
repost.
On D. Ahern: I don't think we are disagreeing anymore on the need to
generalize the check. He is saying it should be a helper and I already
had the validation data; either works. I don't see the gaping need
to remove the validation data.
cheers,
jamal
On 17-07-30 02:42 PM, Jiri Pirko wrote:
> Sun, Jul 30, 2017 at 07:24:49PM CEST, jhs@mojatatu.com wrote:
>> From: Jamal Hadi Salim <jhs@mojatatu.com>
>>
>> Generic bitflags attribute content sent to the kernel by user.
>> With this netlink attr type the user can either set or unset a
>> flag in the kernel.
>>
>> The value is a bitmap that defines the bit values being set
>> The selector is a bitmask that defines which value bit is to be
>> considered.
>>
>> A check is made to ensure the rules that a kernel subsystem always
>> conforms to bitflags the kernel already knows about. i.e
>> if the user tries to set a bit flag that is not understood then
>> the _it will be rejected_.
>>
>> In the most basic form, the user specifies the attribute policy as:
>> [ATTR_GOO] = { .type = NLA_BITFIELD32, .validation_data = &myvalidflags },
>>
>> where myvalidflags is the bit mask of the flags the kernel understands.
>>
>> If the user _does not_ provide myvalidflags then the attribute will
>> also be rejected.
>>
>> Examples:
>> value = 0x0, and selector = 0x1
>> implies we are selecting bit 1 and we want to set its value to 0.
>>
>> value = 0x2, and selector = 0x2
>> implies we are selecting bit 2 and we want to set its value to 1.
>>
>> Suggested-by: Jiri Pirko <jiri@mellanox.com>
>> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
>> ---
>> include/net/netlink.h | 16 ++++++++++++++++
>> include/uapi/linux/netlink.h | 17 +++++++++++++++++
>> lib/nlattr.c | 30 ++++++++++++++++++++++++++++++
>> 3 files changed, 63 insertions(+)
>>
>> diff --git a/include/net/netlink.h b/include/net/netlink.h
>> index ef8e6c3..82dd298 100644
>> --- a/include/net/netlink.h
>> +++ b/include/net/netlink.h
>> @@ -178,6 +178,7 @@ enum {
>> NLA_S16,
>> NLA_S32,
>> NLA_S64,
>> + NLA_BITFIELD32,
>> __NLA_TYPE_MAX,
>> };
>>
>> @@ -206,6 +207,7 @@ enum {
>> * NLA_MSECS Leaving the length field zero will verify the
>> * given type fits, using it verifies minimum length
>> * just like "All other"
>> + * NLA_BITFIELD32 A 32-bit bitmap/bitselector attribute
>> * All other Minimum length of attribute payload
>> *
>> * Example:
>> @@ -213,11 +215,13 @@ enum {
>> * [ATTR_FOO] = { .type = NLA_U16 },
>> * [ATTR_BAR] = { .type = NLA_STRING, .len = BARSIZ },
>> * [ATTR_BAZ] = { .len = sizeof(struct mystruct) },
>> + * [ATTR_GOO] = { .type = NLA_BITFIELD32, .validation_data = &myvalidflags },
>
> Checkpatch warns you about the line being too long, please wrap it.
>
> Btw, I did not see you reached a consensus with DavidA regarding this.
> Did I miss it?
>
>
>> * };
>> */
>> struct nla_policy {
>> u16 type;
>> u16 len;
>> + void *validation_data;
>> };
>>
>> /**
>> @@ -1203,6 +1207,18 @@ static inline struct in6_addr nla_get_in6_addr(const struct nlattr *nla)
>> }
>>
>> /**
>> + * nla_get_bitfield32 - return payload of 32 bitfield attribute
>> + * @nla: nla_bitfield32 attribute
>> + */
>> +static inline struct nla_bitfield32 nla_get_bitfield32(const struct nlattr *nla)
>> +{
>> + struct nla_bitfield32 tmp;
>> +
>> + nla_memcpy(&tmp, nla, sizeof(tmp));
>> + return tmp;
>> +}
>> +
>> +/**
>> * nla_memdup - duplicate attribute memory (kmemdup)
>> * @src: netlink attribute to duplicate from
>> * @gfp: GFP mask
>> diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h
>> index f86127a..f4fc9c9 100644
>> --- a/include/uapi/linux/netlink.h
>> +++ b/include/uapi/linux/netlink.h
>> @@ -226,5 +226,22 @@ struct nlattr {
>> #define NLA_ALIGN(len) (((len) + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1))
>> #define NLA_HDRLEN ((int) NLA_ALIGN(sizeof(struct nlattr)))
>>
>> +/* Generic 32 bitflags attribute content sent to the kernel.
>> + *
>> + * The value is a bitmap that defines the values being set
>> + * The selector is a bitmask that defines which value is legit
>> + *
>> + * Examples:
>> + * value = 0x0, and selector = 0x1
>> + * implies we are selecting bit 1 and we want to set its value to 0.
>> + *
>> + * value = 0x2, and selector = 0x2
>> + * implies we are selecting bit 2 and we want to set its value to 1.
>> + *
>> + */
>> +struct nla_bitfield32 {
>> + __u32 value;
>> + __u32 selector;
>> +};
>>
>> #endif /* _UAPI__LINUX_NETLINK_H */
>> diff --git a/lib/nlattr.c b/lib/nlattr.c
>> index fb52435..ee79b7a 100644
>> --- a/lib/nlattr.c
>> +++ b/lib/nlattr.c
>> @@ -27,6 +27,30 @@
>> [NLA_S64] = sizeof(s64),
>> };
>>
>> +static int validate_nla_bitfield32(const struct nlattr *nla,
>> + u32 *valid_flags_allowed)
>> +{
>> + const struct nla_bitfield32 *bf = nla_data(nla);
>> + u32 *valid_flags_mask = valid_flags_allowed;
>
> I pointed this out already. This is weird.
> You do *u32 = *u32, just with a different name. Just use valid_flags_allowed
> directly.
>
>
>> +
>> + if (!valid_flags_allowed)
>> + return -EINVAL;
>> +
>> + /*disallow invalid bit selector */
>
> Fix all the comments in this function. Should be
> /* something */
> with spaces in front and at the end.
>
>
>> + if (bf->selector & ~*valid_flags_mask)
>> + return -EINVAL;
>> +
>> + /*disallow invalid bit values */
>> + if (bf->value & ~*valid_flags_mask)
>> + return -EINVAL;
>> +
>> + /*disallow valid bit values that are not selected*/
>> + if (bf->value & ~bf->selector)
>> + return -EINVAL;
>> +
>> + return 0;
>> +}
>> +
>> static int validate_nla(const struct nlattr *nla, int maxtype,
>> const struct nla_policy *policy)
>> {
>> @@ -46,6 +70,12 @@ static int validate_nla(const struct nlattr *nla, int maxtype,
>> return -ERANGE;
>> break;
>>
>> + case NLA_BITFIELD32:
>> + if (attrlen != sizeof(struct nla_bitfield32))
>> + return -ERANGE;
>> +
>> + return validate_nla_bitfield32(nla, pt->validation_data);
>> +
>> case NLA_NUL_STRING:
>> if (pt->len)
>> minlen = min_t(int, attrlen, pt->len + 1);
>> --
>> 1.9.1
>>
* Re: [PATCH net-next v12 1/4] net netlink: Add new type NLA_BITFIELD32
2017-07-30 19:59 ` Jamal Hadi Salim
@ 2017-07-31 2:27 ` David Ahern
2017-07-31 6:38 ` Jiri Pirko
1 sibling, 0 replies; 19+ messages in thread
From: David Ahern @ 2017-07-31 2:27 UTC (permalink / raw)
To: Jamal Hadi Salim, Jiri Pirko
Cc: davem, netdev, xiyou.wangcong, eric.dumazet, horms
On 7/30/17 1:59 PM, Jamal Hadi Salim wrote:
> On D. Ahern: I don't think we are disagreeing anymore on the need to
> generalize the check. He is saying it should be a helper and I already
> had the validation data; either works. I don't see the gaping need
> to remove the validation data.
I never disagreed on general code; I have always disagreed on validating
values as part of the policy check.
* Re: [PATCH net-next v12 0/4] net sched actions: improve dump performance
2017-07-30 17:24 [PATCH net-next v12 0/4] net sched actions: improve dump performance Jamal Hadi Salim
` (3 preceding siblings ...)
2017-07-30 17:24 ` [PATCH net-next v12 4/4] net sched actions: add time filter for action dumping Jamal Hadi Salim
@ 2017-07-31 2:28 ` David Miller
2017-07-31 12:06 ` Jamal Hadi Salim
4 siblings, 1 reply; 19+ messages in thread
From: David Miller @ 2017-07-31 2:28 UTC (permalink / raw)
To: jhs; +Cc: netdev, jiri, xiyou.wangcong, eric.dumazet, horms, dsahern
Series applied, thanks.
* Re: [PATCH net-next v12 1/4] net netlink: Add new type NLA_BITFIELD32
2017-07-30 19:59 ` Jamal Hadi Salim
2017-07-31 2:27 ` David Ahern
@ 2017-07-31 6:38 ` Jiri Pirko
2017-07-31 12:03 ` Jamal Hadi Salim
1 sibling, 1 reply; 19+ messages in thread
From: Jiri Pirko @ 2017-07-31 6:38 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, netdev, xiyou.wangcong, eric.dumazet, horms, dsahern
Sun, Jul 30, 2017 at 09:59:10PM CEST, jhs@mojatatu.com wrote:
>Jiri,
>
>This is getting exhausting, seriously.
>I posted the code you are commenting on two days ago so I don't have to
>repost.
And I commented on the "*u32 = *u32" thing. But you ignored it. Pardon
me for mentioning that again now :/
>
>On D. Ahern: I don't think we are disagreeing anymore on the need to
>generalize the check. He is saying it should be a helper and I already
>had the validation data; either works. I don't see the gaping need
>to remove the validation data.
DavidA? Your opinion.
* Re: [PATCH net-next v12 1/4] net netlink: Add new type NLA_BITFIELD32
2017-07-31 6:38 ` Jiri Pirko
@ 2017-07-31 12:03 ` Jamal Hadi Salim
2017-07-31 12:21 ` Jiri Pirko
0 siblings, 1 reply; 19+ messages in thread
From: Jamal Hadi Salim @ 2017-07-31 12:03 UTC (permalink / raw)
To: Jiri Pirko; +Cc: davem, netdev, xiyou.wangcong, eric.dumazet, horms, dsahern
On 17-07-31 02:38 AM, Jiri Pirko wrote:
> Sun, Jul 30, 2017 at 09:59:10PM CEST, jhs@mojatatu.com wrote:
>> Jiri,
>>
>> This is getting exhausting, seriously.
>> I posted the code you are commenting on two days ago so I don't have to
>> repost.
>
> And I commented on the "*u32 = *u32" thing. But you ignored it. Pardon
> me for mentioning that again now :/
>
You commented on *u32 assignment from *void which I fixed. I
intentionally selected the different assignment names to reflect
meaning. Had you commented earlier - although I would have found
it disagreeable - I would have fixed that too. Jiri, you need to be
more tolerant so progress can be made at times.
>
>>
>> On D. Ahern: I don't think we are disagreeing anymore on the need to
>> generalize the check. He is saying it should be a helper and I already
>> had the validation data; either works. I don't see the gaping need
>> to remove the validation data.
>
> DavidA? Your opinion.
>
With DavidA (reading his response) - the issue is one of taste.
Again either approach is fine. You can call helpers for every user
or make them invoked behind the scenes.
Again - like all your comments on code taste which I addressed, I
would have made that change if the comment had come in earlier. I got
exhausted. Imagine how a newbie corporate guy would've felt after this.
cheers,
jamal
* Re: [PATCH net-next v12 0/4] net sched actions: improve dump performance
2017-07-31 2:28 ` [PATCH net-next v12 0/4] net sched actions: improve dump performance David Miller
@ 2017-07-31 12:06 ` Jamal Hadi Salim
2017-07-31 23:43 ` Stephen Hemminger
2017-08-01 3:54 ` Stephen Hemminger
0 siblings, 2 replies; 19+ messages in thread
From: Jamal Hadi Salim @ 2017-07-31 12:06 UTC (permalink / raw)
To: David Miller
Cc: netdev, jiri, xiyou.wangcong, eric.dumazet, horms, dsahern,
Stephen Hemminger
[-- Attachment #1: Type: text/plain, Size: 232 bytes --]
On 17-07-30 10:28 PM, David Miller wrote:
>
> Series applied, thanks.
>
Thanks David.
Attaching the iproute2 patch. I will submit an official one with
man page changes later. Stephen - you take net-next changes?
cheers,
jamal
[-- Attachment #2: large-dump-patch --]
[-- Type: text/plain, Size: 13793 bytes --]
diff --git a/include/linux/netlink.h b/include/linux/netlink.h
index 3a53b9a..f4fc9c9 100644
--- a/include/linux/netlink.h
+++ b/include/linux/netlink.h
@@ -1,5 +1,5 @@
-#ifndef __LINUX_NETLINK_H
-#define __LINUX_NETLINK_H
+#ifndef _UAPI__LINUX_NETLINK_H
+#define _UAPI__LINUX_NETLINK_H
#include <linux/kernel.h>
#include <linux/socket.h> /* for __kernel_sa_family_t */
@@ -143,8 +143,10 @@ enum nlmsgerr_attrs {
#define NETLINK_PKTINFO 3
#define NETLINK_BROADCAST_ERROR 4
#define NETLINK_NO_ENOBUFS 5
+#ifndef __KERNEL__
#define NETLINK_RX_RING 6
#define NETLINK_TX_RING 7
+#endif
#define NETLINK_LISTEN_ALL_NSID 8
#define NETLINK_LIST_MEMBERSHIPS 9
#define NETLINK_CAP_ACK 10
@@ -171,6 +173,7 @@ struct nl_mmap_hdr {
__u32 nm_gid;
};
+#ifndef __KERNEL__
enum nl_mmap_status {
NL_MMAP_STATUS_UNUSED,
NL_MMAP_STATUS_RESERVED,
@@ -182,6 +185,7 @@ enum nl_mmap_status {
#define NL_MMAP_MSG_ALIGNMENT NLMSG_ALIGNTO
#define NL_MMAP_MSG_ALIGN(sz) __ALIGN_KERNEL(sz, NL_MMAP_MSG_ALIGNMENT)
#define NL_MMAP_HDRLEN NL_MMAP_MSG_ALIGN(sizeof(struct nl_mmap_hdr))
+#endif
#define NET_MAJOR 36 /* Major 36 is reserved for networking */
@@ -222,5 +226,22 @@ struct nlattr {
#define NLA_ALIGN(len) (((len) + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1))
#define NLA_HDRLEN ((int) NLA_ALIGN(sizeof(struct nlattr)))
+/* Generic 32 bitflags attribute content sent to the kernel.
+ *
+ * The value is a bitmap that defines the values being set
+ * The selector is a bitmask that defines which value is legit
+ *
+ * Examples:
+ * value = 0x0, and selector = 0x1
+ * implies we are selecting bit 1 and we want to set its value to 0.
+ *
+ * value = 0x2, and selector = 0x2
+ * implies we are selecting bit 2 and we want to set its value to 1.
+ *
+ */
+struct nla_bitfield32 {
+ __u32 value;
+ __u32 selector;
+};
-#endif /* __LINUX_NETLINK_H */
+#endif /* _UAPI__LINUX_NETLINK_H */
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 1d62dad..dab7dad 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -1,5 +1,5 @@
-#ifndef __LINUX_RTNETLINK_H
-#define __LINUX_RTNETLINK_H
+#ifndef _UAPI__LINUX_RTNETLINK_H
+#define _UAPI__LINUX_RTNETLINK_H
#include <linux/types.h>
#include <linux/netlink.h>
@@ -586,6 +586,7 @@ enum {
#define NDUSEROPT_MAX (__NDUSEROPT_MAX - 1)
+#ifndef __KERNEL__
/* RTnetlink multicast groups - backwards compatibility for userspace */
#define RTMGRP_LINK 1
#define RTMGRP_NOTIFY 2
@@ -606,6 +607,7 @@ enum {
#define RTMGRP_DECnet_ROUTE 0x4000
#define RTMGRP_IPV6_PREFIX 0x20000
+#endif
/* RTnetlink multicast groups */
enum rtnetlink_groups {
@@ -681,10 +683,29 @@ struct tcamsg {
unsigned char tca__pad1;
unsigned short tca__pad2;
};
+
+enum {
+ TCA_ROOT_UNSPEC,
+ TCA_ROOT_TAB,
+#define TCA_ACT_TAB TCA_ROOT_TAB
+#define TCAA_MAX TCA_ROOT_TAB
+ TCA_ROOT_FLAGS,
+ TCA_ROOT_COUNT,
+ TCA_ROOT_TIME_DELTA, /* in msecs */
+ __TCA_ROOT_MAX,
+#define TCA_ROOT_MAX (__TCA_ROOT_MAX - 1)
+};
+
#define TA_RTA(r) ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct tcamsg))))
#define TA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct tcamsg))
-#define TCA_ACT_TAB 1 /* attr type must be >=1 */
-#define TCAA_MAX 1
+/* tcamsg flags stored in attribute TCA_ROOT_FLAGS
+ *
+ * TCA_FLAG_LARGE_DUMP_ON user->kernel to request for larger than TCA_ACT_MAX_PRIO
+ * actions in a dump. All dump responses will contain the number of actions
+ * being dumped stored in for user app's consumption in TCA_ROOT_COUNT
+ *
+ */
+#define TCA_FLAG_LARGE_DUMP_ON (1 << 0)
/* New extended info filters for IFLA_EXT_MASK */
#define RTEXT_FILTER_VF (1 << 0)
@@ -696,4 +717,4 @@ struct tcamsg {
-#endif /* __LINUX_RTNETLINK_H */
+#endif /* _UAPI__LINUX_RTNETLINK_H */
diff --git a/tc/f_basic.c b/tc/f_basic.c
index d663668..8370ea6 100644
--- a/tc/f_basic.c
+++ b/tc/f_basic.c
@@ -135,7 +135,7 @@ static int basic_print_opt(struct filter_util *qu, FILE *f,
}
if (tb[TCA_BASIC_ACT]) {
- tc_print_action(f, tb[TCA_BASIC_ACT]);
+ tc_print_action(f, tb[TCA_BASIC_ACT], 0);
}
return 0;
diff --git a/tc/f_bpf.c b/tc/f_bpf.c
index 2f8d12a..c115409 100644
--- a/tc/f_bpf.c
+++ b/tc/f_bpf.c
@@ -239,7 +239,7 @@ static int bpf_print_opt(struct filter_util *qu, FILE *f,
}
if (tb[TCA_BPF_ACT])
- tc_print_action(f, tb[TCA_BPF_ACT]);
+ tc_print_action(f, tb[TCA_BPF_ACT], 0);
return 0;
}
diff --git a/tc/f_cgroup.c b/tc/f_cgroup.c
index ecf9909..633700e 100644
--- a/tc/f_cgroup.c
+++ b/tc/f_cgroup.c
@@ -102,7 +102,7 @@ static int cgroup_print_opt(struct filter_util *qu, FILE *f,
}
if (tb[TCA_CGROUP_ACT])
- tc_print_action(f, tb[TCA_CGROUP_ACT]);
+ tc_print_action(f, tb[TCA_CGROUP_ACT], 0);
return 0;
}
diff --git a/tc/f_flow.c b/tc/f_flow.c
index 09ddcaa..b157104 100644
--- a/tc/f_flow.c
+++ b/tc/f_flow.c
@@ -347,7 +347,7 @@ static int flow_print_opt(struct filter_util *fu, FILE *f, struct rtattr *opt,
tc_print_police(f, tb[TCA_FLOW_POLICE]);
if (tb[TCA_FLOW_ACT]) {
fprintf(f, "\n");
- tc_print_action(f, tb[TCA_FLOW_ACT]);
+ tc_print_action(f, tb[TCA_FLOW_ACT], 0);
}
return 0;
}
diff --git a/tc/f_flower.c b/tc/f_flower.c
index 5be693a..934832e 100644
--- a/tc/f_flower.c
+++ b/tc/f_flower.c
@@ -1316,7 +1316,7 @@ static int flower_print_opt(struct filter_util *qu, FILE *f,
}
if (tb[TCA_FLOWER_ACT])
- tc_print_action(f, tb[TCA_FLOWER_ACT]);
+ tc_print_action(f, tb[TCA_FLOWER_ACT], 0);
return 0;
}
diff --git a/tc/f_fw.c b/tc/f_fw.c
index 790bef9..c39789b 100644
--- a/tc/f_fw.c
+++ b/tc/f_fw.c
@@ -160,7 +160,7 @@ static int fw_print_opt(struct filter_util *qu, FILE *f, struct rtattr *opt, __u
if (tb[TCA_FW_ACT]) {
fprintf(f, "\n");
- tc_print_action(f, tb[TCA_FW_ACT]);
+ tc_print_action(f, tb[TCA_FW_ACT], 0);
}
return 0;
}
diff --git a/tc/f_matchall.c b/tc/f_matchall.c
index 5a51e75..d78660e 100644
--- a/tc/f_matchall.c
+++ b/tc/f_matchall.c
@@ -145,7 +145,7 @@ static int matchall_print_opt(struct filter_util *qu, FILE *f,
}
if (tb[TCA_MATCHALL_ACT])
- tc_print_action(f, tb[TCA_MATCHALL_ACT]);
+ tc_print_action(f, tb[TCA_MATCHALL_ACT], 0);
return 0;
}
diff --git a/tc/f_route.c b/tc/f_route.c
index 30514c4..e88313f 100644
--- a/tc/f_route.c
+++ b/tc/f_route.c
@@ -168,7 +168,7 @@ static int route_print_opt(struct filter_util *qu, FILE *f, struct rtattr *opt,
if (tb[TCA_ROUTE4_POLICE])
tc_print_police(f, tb[TCA_ROUTE4_POLICE]);
if (tb[TCA_ROUTE4_ACT])
- tc_print_action(f, tb[TCA_ROUTE4_ACT]);
+ tc_print_action(f, tb[TCA_ROUTE4_ACT], 0);
return 0;
}
diff --git a/tc/f_rsvp.c b/tc/f_rsvp.c
index 94bfbef..65caeb4 100644
--- a/tc/f_rsvp.c
+++ b/tc/f_rsvp.c
@@ -402,7 +402,7 @@ static int rsvp_print_opt(struct filter_util *qu, FILE *f, struct rtattr *opt, _
}
if (tb[TCA_RSVP_ACT]) {
- tc_print_action(f, tb[TCA_RSVP_ACT]);
+ tc_print_action(f, tb[TCA_RSVP_ACT], 0);
}
if (tb[TCA_RSVP_POLICE])
tc_print_police(f, tb[TCA_RSVP_POLICE]);
diff --git a/tc/f_tcindex.c b/tc/f_tcindex.c
index 784c890..dd1cb47 100644
--- a/tc/f_tcindex.c
+++ b/tc/f_tcindex.c
@@ -173,7 +173,7 @@ static int tcindex_print_opt(struct filter_util *qu, FILE *f,
}
if (tb[TCA_TCINDEX_ACT]) {
fprintf(f, "\n");
- tc_print_action(f, tb[TCA_TCINDEX_ACT]);
+ tc_print_action(f, tb[TCA_TCINDEX_ACT], 0);
}
return 0;
}
diff --git a/tc/f_u32.c b/tc/f_u32.c
index b272c2c..5815be9 100644
--- a/tc/f_u32.c
+++ b/tc/f_u32.c
@@ -1337,7 +1337,7 @@ static int u32_print_opt(struct filter_util *qu, FILE *f, struct rtattr *opt,
}
if (tb[TCA_U32_ACT])
- tc_print_action(f, tb[TCA_U32_ACT]);
+ tc_print_action(f, tb[TCA_U32_ACT], 0);
return 0;
}
diff --git a/tc/m_action.c b/tc/m_action.c
index 6ebe85e..123295c 100644
--- a/tc/m_action.c
+++ b/tc/m_action.c
@@ -346,21 +346,24 @@ tc_print_action_flush(FILE *f, const struct rtattr *arg)
}
int
-tc_print_action(FILE *f, const struct rtattr *arg)
+tc_print_action(FILE *f, const struct rtattr *arg, unsigned short tot_acts)
{
int i;
- struct rtattr *tb[TCA_ACT_MAX_PRIO + 1];
if (arg == NULL)
return 0;
- parse_rtattr_nested(tb, TCA_ACT_MAX_PRIO, arg);
+ if (!tot_acts)
+ tot_acts = TCA_ACT_MAX_PRIO;
+
+ struct rtattr *tb[tot_acts + 1];
+ parse_rtattr_nested(tb, tot_acts, arg);
if (tab_flush && NULL != tb[0] && NULL == tb[1])
return tc_print_action_flush(f, tb[0]);
- for (i = 0; i < TCA_ACT_MAX_PRIO; i++) {
+ for (i = 0; i < tot_acts; i++) {
if (tb[i]) {
fprintf(f, "\n\taction order %d: ", i);
if (tc_print_one_action(f, tb[i]) < 0) {
@@ -380,7 +383,8 @@ int print_action(const struct sockaddr_nl *who,
FILE *fp = (FILE *)arg;
struct tcamsg *t = NLMSG_DATA(n);
int len = n->nlmsg_len;
- struct rtattr *tb[TCAA_MAX+1];
+ __u32 *tot_acts = NULL;
+ struct rtattr *tb[TCA_ROOT_MAX+1];
len -= NLMSG_LENGTH(sizeof(*t));
@@ -389,8 +393,12 @@ int print_action(const struct sockaddr_nl *who,
return -1;
}
- parse_rtattr(tb, TCAA_MAX, TA_RTA(t), len);
+ parse_rtattr(tb, TCA_ROOT_MAX, TA_RTA(t), len);
+
+ if (tb[TCA_ROOT_COUNT])
+ tot_acts = RTA_DATA(tb[TCA_ROOT_COUNT]);
+ fprintf(fp, "total acts %d \n", tot_acts?*tot_acts:0);
if (tb[TCA_ACT_TAB] == NULL) {
if (n->nlmsg_type != RTM_GETACTION)
fprintf(stderr, "print_action: NULL kind\n");
@@ -414,7 +422,9 @@ int print_action(const struct sockaddr_nl *who,
fprintf(fp, "Replaced action ");
}
}
- tc_print_action(fp, tb[TCA_ACT_TAB]);
+
+
+ tc_print_action(fp, tb[TCA_ACT_TAB], tot_acts?*tot_acts:0);
return 0;
}
@@ -427,7 +437,7 @@ static int tc_action_gd(int cmd, unsigned int flags, int *argc_p, char ***argv_p
char **argv = *argv_p;
int prio = 0;
int ret = 0;
- __u32 i;
+ __u32 i = 0;
struct rtattr *tail;
struct rtattr *tail2;
struct nlmsghdr *ans = NULL;
@@ -498,7 +508,8 @@ static int tc_action_gd(int cmd, unsigned int flags, int *argc_p, char ***argv_p
tail2 = NLMSG_TAIL(&req.n);
addattr_l(&req.n, MAX_MSG, ++prio, NULL, 0);
addattr_l(&req.n, MAX_MSG, TCA_ACT_KIND, k, strlen(k) + 1);
- addattr32(&req.n, MAX_MSG, TCA_ACT_INDEX, i);
+ if (i > 0)
+ addattr32(&req.n, MAX_MSG, TCA_ACT_INDEX, i);
tail2->rta_len = (void *) NLMSG_TAIL(&req.n) - (void *) tail2;
}
@@ -561,12 +572,16 @@ static int tc_action_modify(int cmd, unsigned int flags, int *argc_p, char ***ar
return ret;
}
-static int tc_act_list_or_flush(int argc, char **argv, int event)
+static int tc_act_list_or_flush(int *argc_p, char ***argv_p, int event)
{
+ struct rtattr *tail, *tail2, *tail3, *tail4;
int ret = 0, prio = 0, msg_size = 0;
- char k[16];
- struct rtattr *tail, *tail2;
struct action_util *a = NULL;
+ struct nla_bitfield32 flag_select = { 0 };
+ char **argv = *argv_p;
+ __u32 msec_since = 0;
+ int argc = *argc_p;
+ char k[16];
struct {
struct nlmsghdr n;
struct tcamsg t;
@@ -597,11 +612,40 @@ static int tc_act_list_or_flush(int argc, char **argv, int event)
}
strncpy(k, *argv, sizeof(k) - 1);
+ argc -= 1;
+ argv += 1;
+
+ if (argc && (strcmp(*argv, "since") == 0)) {
+ NEXT_ARG();
+ if (get_u32(&msec_since, *argv, 0))
+ invarg("dump time \"since\" is invalid", *argv);
+ }
+
addattr_l(&req.n, MAX_MSG, ++prio, NULL, 0);
addattr_l(&req.n, MAX_MSG, TCA_ACT_KIND, k, strlen(k) + 1);
tail2->rta_len = (void *) NLMSG_TAIL(&req.n) - (void *) tail2;
tail->rta_len = (void *) NLMSG_TAIL(&req.n) - (void *) tail;
+ tail3 = NLMSG_TAIL(&req.n);
+#if 1
+ flag_select.value |= TCA_FLAG_LARGE_DUMP_ON;
+ flag_select.selector |= TCA_FLAG_LARGE_DUMP_ON;
+#endif
+#if 0
+ flag_select.value |= 8; /* test rejection */
+ flag_select.selector |= 8; /* test rejection */
+ flag_select.value = 0; /* test rejection */
+ flag_select.selector |= TCA_FLAG_LARGE_DUMP_ON; /* test rejection */
+#endif
+ addattr_l(&req.n, MAX_MSG, TCA_ROOT_FLAGS, &flag_select,
+ sizeof(struct nla_bitfield32));
+ tail3->rta_len = (void *) NLMSG_TAIL(&req.n) - (void *) tail3;
+ if (msec_since) {
+ fprintf(stderr, "XXX: since %d\n", msec_since);
+ tail4 = NLMSG_TAIL(&req.n);
+ addattr32(&req.n, MAX_MSG, TCA_ROOT_TIME_DELTA, msec_since);
+ tail4->rta_len = (void *) NLMSG_TAIL(&req.n) - (void *) tail4;
+ }
msg_size = NLMSG_ALIGN(req.n.nlmsg_len) - NLMSG_ALIGN(sizeof(struct nlmsghdr));
if (event == RTM_GETACTION) {
@@ -626,6 +670,8 @@ static int tc_act_list_or_flush(int argc, char **argv, int event)
bad_val:
+ *argc_p = argc;
+ *argv_p = argv;
return ret;
}
@@ -655,13 +701,21 @@ int do_action(int argc, char **argv)
act_usage();
return -1;
}
- return tc_act_list_or_flush(argc-2, argv+2, RTM_GETACTION);
+
+ argc -= 2;
+ argv += 2;
+ return tc_act_list_or_flush(&argc, &argv,
+ RTM_GETACTION);
} else if (matches(*argv, "flush") == 0) {
if (argc <= 2) {
act_usage();
return -1;
}
- return tc_act_list_or_flush(argc-2, argv+2, RTM_DELACTION);
+
+ argc -= 2;
+ argv += 2;
+ return tc_act_list_or_flush(&argc, &argv,
+ RTM_DELACTION);
} else if (matches(*argv, "help") == 0) {
act_usage();
return -1;
diff --git a/tc/tc_util.h b/tc/tc_util.h
index 5c54ad3..583a21a 100644
--- a/tc/tc_util.h
+++ b/tc/tc_util.h
@@ -113,7 +113,7 @@ int act_parse_police(struct action_util *a, int *argc_p,
char ***argv_p, int tca_id, struct nlmsghdr *n);
int print_police(struct action_util *a, FILE *f, struct rtattr *tb);
int police_print_xstats(struct action_util *a, FILE *f, struct rtattr *tb);
-int tc_print_action(FILE *f, const struct rtattr *tb);
+int tc_print_action(FILE *f, const struct rtattr *tb, unsigned short tot_acts);
int tc_print_ipt(FILE *f, const struct rtattr *tb);
int parse_action(int *argc_p, char ***argv_p, int tca_id, struct nlmsghdr *n);
void print_tm(FILE *f, const struct tcf_t *tm);
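The flag_select handling in the attached patch sets the same bit in both
value and selector. As a sketch of what such a pair means to a receiver that
keeps a flags word, the selected bits can be applied as shown below. This is
an assumption for illustration: apply_bitfield32() and struct bitfield32 are
hypothetical names, and nothing in this thread says the kernel stores the
flags this way.

```c
#include <stdint.h>

/* Userspace mirror of struct nla_bitfield32 (hypothetical name). */
struct bitfield32 {
	uint32_t value;
	uint32_t selector;
};

/* Selected bits take their value from bf; unselected bits keep cur. */
static uint32_t apply_bitfield32(uint32_t cur, struct bitfield32 bf)
{
	return (cur & ~bf.selector) | (bf.value & bf.selector);
}
```

So value = 0x0 with selector = 0x1 clears the lowest flag and leaves the
rest alone, while value = 0x2 with selector = 0x2 sets the second flag.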
* Re: [PATCH net-next v12 1/4] net netlink: Add new type NLA_BITFIELD32
2017-07-31 12:03 ` Jamal Hadi Salim
@ 2017-07-31 12:21 ` Jiri Pirko
0 siblings, 0 replies; 19+ messages in thread
From: Jiri Pirko @ 2017-07-31 12:21 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, netdev, xiyou.wangcong, eric.dumazet, horms, dsahern
Mon, Jul 31, 2017 at 02:03:55PM CEST, jhs@mojatatu.com wrote:
>On 17-07-31 02:38 AM, Jiri Pirko wrote:
>> Sun, Jul 30, 2017 at 09:59:10PM CEST, jhs@mojatatu.com wrote:
>> > Jiri,
>> >
>> > This is getting exhausting, seriously.
>> > I posted the code you are commenting on two days ago so I don't have to
>> > repost.
>>
>> And I commented on the "*u32 = *u32" thing. But you ignored it. Pardon
>> me for mentioning that again now :/
>>
>
>You commented on *u32 assignment from *void which i fixed. I
>intentionally selected the different assignment names to reflect
>meaning. Had you commented earlier - although I would have found
Yep, I don't understand why the function arg cannot have the desired
name right away. Also, I don't understand why you don't just have u32
instead of pointer as a local variable, if you really needed this local
variable. Ok, I admit that ":)" is probably not an intuitive comment.
Will be more blunt next time.
>it disagreeable - I would have fixed that too. Jiri, you need to be
>more tolerant so progress can be made at times.
I don't think so. I believe that it is really important that code can be
read nicely. If we don't do it, it will just be a mess (like it is in a lot
of net/sched/ places).
>
>>
>> >
>> > On D. Ahern: I don't think we are disagreeing anymore on the need to
>> > generalize the check. He is saying it should be a helper and I already
>> > had the validation data; either works. I don't see the gaping need
>> > to remove the validation data.
>>
>> DavidA? Your opinion.
>>
>
>With DavidA(reading his response) - the issue is one of taste.
>Again either approach is fine. You can call helpers for every user
>or make them invoked behind the scenes.
>Again - like all your comments on code taste which I addressed, I
>would have made that change if the comment had come in earlier. I got
>exhausted. Imagine how a newbie corporate guy would've felt after this.
That's how it is.
* Re: [PATCH net-next v12 0/4] net sched actions: improve dump performance
2017-07-31 12:06 ` Jamal Hadi Salim
@ 2017-07-31 23:43 ` Stephen Hemminger
2017-08-01 3:54 ` Stephen Hemminger
1 sibling, 0 replies; 19+ messages in thread
From: Stephen Hemminger @ 2017-07-31 23:43 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: David Miller, netdev, jiri, xiyou.wangcong, eric.dumazet, horms,
dsahern
On Mon, 31 Jul 2017 08:06:42 -0400
Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 17-07-30 10:28 PM, David Miller wrote:
> >
> > Series applied, thanks.
> >
>
> Thanks David.
>
> Attaching the iproute2 patch. I will submit an official one with
> man page changes later. Stephen - you take net-next changes?
>
> cheers,
> jamal
I will fix this up. The kernel headers for iproute2 come from sanitized
kernel headers (not direct copy).
* Re: [PATCH net-next v12 0/4] net sched actions: improve dump performance
2017-07-31 12:06 ` Jamal Hadi Salim
2017-07-31 23:43 ` Stephen Hemminger
@ 2017-08-01 3:54 ` Stephen Hemminger
2017-08-01 11:05 ` Jamal Hadi Salim
1 sibling, 1 reply; 19+ messages in thread
From: Stephen Hemminger @ 2017-08-01 3:54 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: David Miller, netdev, jiri, xiyou.wangcong, eric.dumazet, horms,
dsahern
On Mon, 31 Jul 2017 08:06:42 -0400
Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 17-07-30 10:28 PM, David Miller wrote:
> >
> > Series applied, thanks.
> >
>
> Thanks David.
>
> Attaching the iproute2 patch. I will submit an official one with
> man page changes later. Stephen - you take net-next changes?
>
> cheers,
> jamal
Please clean up and resubmit for net-next.
The header files have been updated in iproute2 net-next branch.
It is not clear to me that the new code is backward compatible.
Will new versions of tc work on old kernels and vice versa?
Also, no #ifdef's
* Re: [PATCH net-next v12 0/4] net sched actions: improve dump performance
2017-08-01 3:54 ` Stephen Hemminger
@ 2017-08-01 11:05 ` Jamal Hadi Salim
0 siblings, 0 replies; 19+ messages in thread
From: Jamal Hadi Salim @ 2017-08-01 11:05 UTC (permalink / raw)
To: Stephen Hemminger
Cc: David Miller, netdev, jiri, xiyou.wangcong, eric.dumazet, horms,
dsahern
On 17-07-31 11:54 PM, Stephen Hemminger wrote:
> On Mon, 31 Jul 2017 08:06:42 -0400
> Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
[..]
> Please clean up and resubmit for net-next.
>
Will do.
> The header files have been updated in iproute2 net-next branch.
>
When does net-next show up? I noticed some changes are missing - for example,
Jiri's multi-table changes are not in the tree (I believe they were submitted
as part of net-next).
> It is not clear to me that the new code is backward compatible.
> Will new versions of tc work on old kernels and vice versa?
>
AFAIK, and as tested, it is.
>
> Also, no #ifdef's
Those will go away. The intention was to test things which will be
rejected (in case some other app in the future uses this feature).
cheers,
jamal
Thread overview: 19+ messages
2017-07-30 17:24 [PATCH net-next v12 0/4] net sched actions: improve dump performance Jamal Hadi Salim
2017-07-30 17:24 ` [PATCH net-next v12 1/4] net netlink: Add new type NLA_BITFIELD32 Jamal Hadi Salim
2017-07-30 18:42 ` Jiri Pirko
2017-07-30 19:59 ` Jamal Hadi Salim
2017-07-31 2:27 ` David Ahern
2017-07-31 6:38 ` Jiri Pirko
2017-07-31 12:03 ` Jamal Hadi Salim
2017-07-31 12:21 ` Jiri Pirko
2017-07-30 17:24 ` [PATCH net-next v12 2/4] net sched actions: Use proper root attribute table for actions Jamal Hadi Salim
2017-07-30 18:44 ` Jiri Pirko
2017-07-30 17:24 ` [PATCH net-next v12 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch Jamal Hadi Salim
2017-07-30 19:05 ` Jiri Pirko
2017-07-30 17:24 ` [PATCH net-next v12 4/4] net sched actions: add time filter for action dumping Jamal Hadi Salim
2017-07-30 19:06 ` Jiri Pirko
2017-07-31 2:28 ` [PATCH net-next v12 0/4] net sched actions: improve dump performance David Miller
2017-07-31 12:06 ` Jamal Hadi Salim
2017-07-31 23:43 ` Stephen Hemminger
2017-08-01 3:54 ` Stephen Hemminger
2017-08-01 11:05 ` Jamal Hadi Salim