* [PATCH net-next 0/6] net_sched: act: lockless operation
@ 2015-07-02 13:07 Eric Dumazet
2015-07-02 13:07 ` [PATCH net-next 1/6] net: sched: extend percpu stats helpers Eric Dumazet
` (5 more replies)
0 siblings, 6 replies; 27+ messages in thread
From: Eric Dumazet @ 2015-07-02 13:07 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, John Fastabend,
Eric Dumazet, Eric Dumazet
As mentioned by Alexei last week in Budapest, it is a bit weird
to take a spinlock in order to drop a packet in a tc filter...
Lets add percpu infra for tc actions and use it for gact.
Before changes, my host with 8 RX queues was handling 5 Mpps,
and more than 10 Mpps after.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Eric Dumazet (6):
net: sched: extend percpu stats helpers
net: sched: add percpu stats to actions
net_sched: act: make tcfg_pval non zero
net_sched: act: use a separate packet counters for gact_determ()
net_sched: act: read tcfg_ptype once
net_sched: act: remove spinlock in fast path
include/net/act_api.h | 4 +++-
include/net/sch_generic.h | 27 +++++++++++++++++----------
include/net/tc_act/tc_gact.h | 7 ++++---
net/core/dev.c | 4 ++--
net/sched/act_api.c | 44 ++++++++++++++++++++++++++++++++++----------
net/sched/act_bpf.c | 2 +-
net/sched/act_connmark.c | 3 ++-
net/sched/act_csum.c | 3 ++-
net/sched/act_gact.c | 35 ++++++++++++++++++-----------------
net/sched/act_ipt.c | 2 +-
net/sched/act_mirred.c | 3 ++-
net/sched/act_nat.c | 3 ++-
net/sched/act_pedit.c | 3 ++-
net/sched/act_simple.c | 3 ++-
net/sched/act_skbedit.c | 3 ++-
net/sched/act_vlan.c | 3 ++-
16 files changed, 96 insertions(+), 53 deletions(-)
--
2.4.3.573.g4eafbef
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH net-next 1/6] net: sched: extend percpu stats helpers
2015-07-02 13:07 [PATCH net-next 0/6] net_sched: act: lockless operation Eric Dumazet
@ 2015-07-02 13:07 ` Eric Dumazet
2015-07-02 15:19 ` John Fastabend
2015-07-03 10:44 ` Jamal Hadi Salim
2015-07-02 13:07 ` [PATCH net-next 2/6] net: sched: add percpu stats to actions Eric Dumazet
` (4 subsequent siblings)
5 siblings, 2 replies; 27+ messages in thread
From: Eric Dumazet @ 2015-07-02 13:07 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, John Fastabend,
Eric Dumazet, Eric Dumazet
qdisc_bstats_update_cpu() and other helpers were added to support
percpu stats for qdisc.
We want to add percpu stats for tc action, so this patch add common
helpers.
qdisc_bstats_update_cpu() is renamed to qdisc_bstats_cpu_update()
qdisc_qstats_drop_cpu() is renamed to qdisc_qstats_cpu_drop()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.fastabend@gmail.com>
---
include/net/sch_generic.h | 27 +++++++++++++++++----------
net/core/dev.c | 4 ++--
2 files changed, 19 insertions(+), 12 deletions(-)
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 2738f6f87908..0cd49b21b211 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -513,17 +513,21 @@ static inline void bstats_update(struct gnet_stats_basic_packed *bstats,
bstats->packets += skb_is_gso(skb) ? skb_shinfo(skb)->gso_segs : 1;
}
-static inline void qdisc_bstats_update_cpu(struct Qdisc *sch,
- const struct sk_buff *skb)
+static inline void bstats_cpu_update(struct gnet_stats_basic_cpu *bstats,
+ const struct sk_buff *skb)
{
- struct gnet_stats_basic_cpu *bstats =
- this_cpu_ptr(sch->cpu_bstats);
-
u64_stats_update_begin(&bstats->syncp);
bstats_update(&bstats->bstats, skb);
u64_stats_update_end(&bstats->syncp);
}
+static inline void qdisc_bstats_cpu_update(struct Qdisc *sch,
+ const struct sk_buff *skb)
+{
+ bstats_cpu_update(this_cpu_ptr(sch->cpu_bstats), skb);
+
+}
+
static inline void qdisc_bstats_update(struct Qdisc *sch,
const struct sk_buff *skb)
{
@@ -547,16 +551,19 @@ static inline void __qdisc_qstats_drop(struct Qdisc *sch, int count)
sch->qstats.drops += count;
}
-static inline void qdisc_qstats_drop(struct Qdisc *sch)
+static inline void qstats_drop_inc(struct gnet_stats_queue *qstats)
{
- sch->qstats.drops++;
+ qstats->drops++;
}
-static inline void qdisc_qstats_drop_cpu(struct Qdisc *sch)
+static inline void qdisc_qstats_drop(struct Qdisc *sch)
{
- struct gnet_stats_queue *qstats = this_cpu_ptr(sch->cpu_qstats);
+ qstats_drop_inc(&sch->qstats);
+}
- qstats->drops++;
+static inline void qdisc_qstats_cpu_drop(struct Qdisc *sch)
+{
+ qstats_drop_inc(this_cpu_ptr(sch->cpu_qstats));
}
static inline void qdisc_qstats_overlimit(struct Qdisc *sch)
diff --git a/net/core/dev.c b/net/core/dev.c
index 6778a9999d52..e0d270143fc7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3646,7 +3646,7 @@ static inline struct sk_buff *handle_ing(struct sk_buff *skb,
qdisc_skb_cb(skb)->pkt_len = skb->len;
skb->tc_verd = SET_TC_AT(skb->tc_verd, AT_INGRESS);
- qdisc_bstats_update_cpu(cl->q, skb);
+ qdisc_bstats_cpu_update(cl->q, skb);
switch (tc_classify(skb, cl, &cl_res)) {
case TC_ACT_OK:
@@ -3654,7 +3654,7 @@ static inline struct sk_buff *handle_ing(struct sk_buff *skb,
skb->tc_index = TC_H_MIN(cl_res.classid);
break;
case TC_ACT_SHOT:
- qdisc_qstats_drop_cpu(cl->q);
+ qdisc_qstats_cpu_drop(cl->q);
case TC_ACT_STOLEN:
case TC_ACT_QUEUED:
kfree_skb(skb);
--
2.4.3.573.g4eafbef
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH net-next 2/6] net: sched: add percpu stats to actions
2015-07-02 13:07 [PATCH net-next 0/6] net_sched: act: lockless operation Eric Dumazet
2015-07-02 13:07 ` [PATCH net-next 1/6] net: sched: extend percpu stats helpers Eric Dumazet
@ 2015-07-02 13:07 ` Eric Dumazet
2015-07-02 15:31 ` John Fastabend
2015-07-03 10:48 ` Jamal Hadi Salim
2015-07-02 13:07 ` [PATCH net-next 3/6] net_sched: act: make tcfg_pval non zero Eric Dumazet
` (3 subsequent siblings)
5 siblings, 2 replies; 27+ messages in thread
From: Eric Dumazet @ 2015-07-02 13:07 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, John Fastabend,
Eric Dumazet, Eric Dumazet
Reuse existing percpu infrastructure Jonh added for qdisc.
This patch adds a new cpustats parameter to tcf_hash_create() and all
actions pass false, meaning this patch should have no effect yet.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.fastabend@gmail.com>
---
include/net/act_api.h | 4 +++-
net/sched/act_api.c | 44 ++++++++++++++++++++++++++++++++++----------
net/sched/act_bpf.c | 2 +-
net/sched/act_connmark.c | 3 ++-
net/sched/act_csum.c | 3 ++-
net/sched/act_gact.c | 3 ++-
net/sched/act_ipt.c | 2 +-
net/sched/act_mirred.c | 3 ++-
net/sched/act_nat.c | 3 ++-
net/sched/act_pedit.c | 3 ++-
net/sched/act_simple.c | 3 ++-
net/sched/act_skbedit.c | 3 ++-
net/sched/act_vlan.c | 3 ++-
13 files changed, 57 insertions(+), 22 deletions(-)
diff --git a/include/net/act_api.h b/include/net/act_api.h
index 3ee4c92afd1b..db2063ffd181 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -21,6 +21,8 @@ struct tcf_common {
struct gnet_stats_rate_est64 tcfc_rate_est;
spinlock_t tcfc_lock;
struct rcu_head tcfc_rcu;
+ struct gnet_stats_basic_cpu __percpu *cpu_bstats;
+ struct gnet_stats_queue __percpu *cpu_qstats;
};
#define tcf_head common.tcfc_head
#define tcf_index common.tcfc_index
@@ -103,7 +105,7 @@ int tcf_hash_release(struct tc_action *a, int bind);
u32 tcf_hash_new_index(struct tcf_hashinfo *hinfo);
int tcf_hash_check(u32 index, struct tc_action *a, int bind);
int tcf_hash_create(u32 index, struct nlattr *est, struct tc_action *a,
- int size, int bind);
+ int size, int bind, bool cpustats);
void tcf_hash_cleanup(struct tc_action *a, struct nlattr *est);
void tcf_hash_insert(struct tc_action *a);
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index af427a3dbcba..074a32f466f8 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -27,6 +27,15 @@
#include <net/act_api.h>
#include <net/netlink.h>
+static void free_tcf(struct rcu_head *head)
+{
+ struct tcf_common *p = container_of(head, struct tcf_common, tcfc_rcu);
+
+ free_percpu(p->cpu_bstats);
+ free_percpu(p->cpu_qstats);
+ kfree(p);
+}
+
void tcf_hash_destroy(struct tc_action *a)
{
struct tcf_common *p = a->priv;
@@ -41,7 +50,7 @@ void tcf_hash_destroy(struct tc_action *a)
* gen_estimator est_timer() might access p->tcfc_lock
* or bstats, wait a RCU grace period before freeing p
*/
- kfree_rcu(p, tcfc_rcu);
+ call_rcu(&p->tcfc_rcu, free_tcf);
}
EXPORT_SYMBOL(tcf_hash_destroy);
@@ -230,15 +239,16 @@ void tcf_hash_cleanup(struct tc_action *a, struct nlattr *est)
if (est)
gen_kill_estimator(&pc->tcfc_bstats,
&pc->tcfc_rate_est);
- kfree_rcu(pc, tcfc_rcu);
+ call_rcu(&pc->tcfc_rcu, free_tcf);
}
EXPORT_SYMBOL(tcf_hash_cleanup);
int tcf_hash_create(u32 index, struct nlattr *est, struct tc_action *a,
- int size, int bind)
+ int size, int bind, bool cpustats)
{
struct tcf_hashinfo *hinfo = a->ops->hinfo;
struct tcf_common *p = kzalloc(size, GFP_KERNEL);
+ int err = -ENOMEM;
if (unlikely(!p))
return -ENOMEM;
@@ -246,18 +256,32 @@ int tcf_hash_create(u32 index, struct nlattr *est, struct tc_action *a,
if (bind)
p->tcfc_bindcnt = 1;
+ if (cpustats) {
+ p->cpu_bstats = netdev_alloc_pcpu_stats(struct gnet_stats_basic_cpu);
+ if (!p->cpu_bstats) {
+err1:
+ kfree(p);
+ return err;
+ }
+ p->cpu_qstats = alloc_percpu(struct gnet_stats_queue);
+ if (!p->cpu_qstats) {
+err2:
+ free_percpu(p->cpu_bstats);
+ goto err1;
+ }
+ }
spin_lock_init(&p->tcfc_lock);
INIT_HLIST_NODE(&p->tcfc_head);
p->tcfc_index = index ? index : tcf_hash_new_index(hinfo);
p->tcfc_tm.install = jiffies;
p->tcfc_tm.lastuse = jiffies;
if (est) {
- int err = gen_new_estimator(&p->tcfc_bstats, NULL,
- &p->tcfc_rate_est,
- &p->tcfc_lock, est);
+ err = gen_new_estimator(&p->tcfc_bstats, p->cpu_bstats,
+ &p->tcfc_rate_est,
+ &p->tcfc_lock, est);
if (err) {
- kfree(p);
- return err;
+ free_percpu(p->cpu_qstats);
+ goto err2;
}
}
@@ -615,10 +639,10 @@ int tcf_action_copy_stats(struct sk_buff *skb, struct tc_action *a,
if (err < 0)
goto errout;
- if (gnet_stats_copy_basic(&d, NULL, &p->tcfc_bstats) < 0 ||
+ if (gnet_stats_copy_basic(&d, p->cpu_bstats, &p->tcfc_bstats) < 0 ||
gnet_stats_copy_rate_est(&d, &p->tcfc_bstats,
&p->tcfc_rate_est) < 0 ||
- gnet_stats_copy_queue(&d, NULL,
+ gnet_stats_copy_queue(&d, p->cpu_qstats,
&p->tcfc_qstats,
p->tcfc_qstats.qlen) < 0)
goto errout;
diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c
index 1d56903fd4c7..99aa271633e9 100644
--- a/net/sched/act_bpf.c
+++ b/net/sched/act_bpf.c
@@ -281,7 +281,7 @@ static int tcf_bpf_init(struct net *net, struct nlattr *nla,
if (!tcf_hash_check(parm->index, act, bind)) {
ret = tcf_hash_create(parm->index, est, act,
- sizeof(*prog), bind);
+ sizeof(*prog), bind, false);
if (ret < 0)
goto destroy_fp;
diff --git a/net/sched/act_connmark.c b/net/sched/act_connmark.c
index 295d14bd6c67..f2b540220ad0 100644
--- a/net/sched/act_connmark.c
+++ b/net/sched/act_connmark.c
@@ -108,7 +108,8 @@ static int tcf_connmark_init(struct net *net, struct nlattr *nla,
parm = nla_data(tb[TCA_CONNMARK_PARMS]);
if (!tcf_hash_check(parm->index, a, bind)) {
- ret = tcf_hash_create(parm->index, est, a, sizeof(*ci), bind);
+ ret = tcf_hash_create(parm->index, est, a, sizeof(*ci),
+ bind, false);
if (ret)
return ret;
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index 4cd5cf1aedf8..b07c535ba8e7 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -62,7 +62,8 @@ static int tcf_csum_init(struct net *n, struct nlattr *nla, struct nlattr *est,
parm = nla_data(tb[TCA_CSUM_PARMS]);
if (!tcf_hash_check(parm->index, a, bind)) {
- ret = tcf_hash_create(parm->index, est, a, sizeof(*p), bind);
+ ret = tcf_hash_create(parm->index, est, a, sizeof(*p),
+ bind, false);
if (ret)
return ret;
ret = ACT_P_CREATED;
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
index 7fffc2272701..a4f8af29ee30 100644
--- a/net/sched/act_gact.c
+++ b/net/sched/act_gact.c
@@ -85,7 +85,8 @@ static int tcf_gact_init(struct net *net, struct nlattr *nla,
#endif
if (!tcf_hash_check(parm->index, a, bind)) {
- ret = tcf_hash_create(parm->index, est, a, sizeof(*gact), bind);
+ ret = tcf_hash_create(parm->index, est, a, sizeof(*gact),
+ bind, false);
if (ret)
return ret;
ret = ACT_P_CREATED;
diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c
index cbc8dd7dd48a..99c9cc1c7af9 100644
--- a/net/sched/act_ipt.c
+++ b/net/sched/act_ipt.c
@@ -114,7 +114,7 @@ static int tcf_ipt_init(struct net *net, struct nlattr *nla, struct nlattr *est,
index = nla_get_u32(tb[TCA_IPT_INDEX]);
if (!tcf_hash_check(index, a, bind) ) {
- ret = tcf_hash_create(index, est, a, sizeof(*ipt), bind);
+ ret = tcf_hash_create(index, est, a, sizeof(*ipt), bind, false);
if (ret)
return ret;
ret = ACT_P_CREATED;
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index a42a3b257226..002cd6c83dc6 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -93,7 +93,8 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
if (!tcf_hash_check(parm->index, a, bind)) {
if (dev == NULL)
return -EINVAL;
- ret = tcf_hash_create(parm->index, est, a, sizeof(*m), bind);
+ ret = tcf_hash_create(parm->index, est, a, sizeof(*m),
+ bind, false);
if (ret)
return ret;
ret = ACT_P_CREATED;
diff --git a/net/sched/act_nat.c b/net/sched/act_nat.c
index 270a030d5fd0..5be0b3c1c5b0 100644
--- a/net/sched/act_nat.c
+++ b/net/sched/act_nat.c
@@ -55,7 +55,8 @@ static int tcf_nat_init(struct net *net, struct nlattr *nla, struct nlattr *est,
parm = nla_data(tb[TCA_NAT_PARMS]);
if (!tcf_hash_check(parm->index, a, bind)) {
- ret = tcf_hash_create(parm->index, est, a, sizeof(*p), bind);
+ ret = tcf_hash_create(parm->index, est, a, sizeof(*p),
+ bind, false);
if (ret)
return ret;
ret = ACT_P_CREATED;
diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index 17e6d6669c7f..ce8676ad892f 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -57,7 +57,8 @@ static int tcf_pedit_init(struct net *net, struct nlattr *nla,
if (!tcf_hash_check(parm->index, a, bind)) {
if (!parm->nkeys)
return -EINVAL;
- ret = tcf_hash_create(parm->index, est, a, sizeof(*p), bind);
+ ret = tcf_hash_create(parm->index, est, a, sizeof(*p),
+ bind, false);
if (ret)
return ret;
p = to_pedit(a);
diff --git a/net/sched/act_simple.c b/net/sched/act_simple.c
index 6a8d9488613a..d6b708d6afdf 100644
--- a/net/sched/act_simple.c
+++ b/net/sched/act_simple.c
@@ -103,7 +103,8 @@ static int tcf_simp_init(struct net *net, struct nlattr *nla,
defdata = nla_data(tb[TCA_DEF_DATA]);
if (!tcf_hash_check(parm->index, a, bind)) {
- ret = tcf_hash_create(parm->index, est, a, sizeof(*d), bind);
+ ret = tcf_hash_create(parm->index, est, a, sizeof(*d),
+ bind, false);
if (ret)
return ret;
diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c
index fcfeeaf838be..6751b5f8c046 100644
--- a/net/sched/act_skbedit.c
+++ b/net/sched/act_skbedit.c
@@ -99,7 +99,8 @@ static int tcf_skbedit_init(struct net *net, struct nlattr *nla,
parm = nla_data(tb[TCA_SKBEDIT_PARMS]);
if (!tcf_hash_check(parm->index, a, bind)) {
- ret = tcf_hash_create(parm->index, est, a, sizeof(*d), bind);
+ ret = tcf_hash_create(parm->index, est, a, sizeof(*d),
+ bind, false);
if (ret)
return ret;
diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
index d735ecf0b1a7..796785e0bf96 100644
--- a/net/sched/act_vlan.c
+++ b/net/sched/act_vlan.c
@@ -116,7 +116,8 @@ static int tcf_vlan_init(struct net *net, struct nlattr *nla,
action = parm->v_action;
if (!tcf_hash_check(parm->index, a, bind)) {
- ret = tcf_hash_create(parm->index, est, a, sizeof(*v), bind);
+ ret = tcf_hash_create(parm->index, est, a, sizeof(*v),
+ bind, false);
if (ret)
return ret;
--
2.4.3.573.g4eafbef
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH net-next 3/6] net_sched: act: make tcfg_pval non zero
2015-07-02 13:07 [PATCH net-next 0/6] net_sched: act: lockless operation Eric Dumazet
2015-07-02 13:07 ` [PATCH net-next 1/6] net: sched: extend percpu stats helpers Eric Dumazet
2015-07-02 13:07 ` [PATCH net-next 2/6] net: sched: add percpu stats to actions Eric Dumazet
@ 2015-07-02 13:07 ` Eric Dumazet
2015-07-02 15:33 ` John Fastabend
` (2 more replies)
2015-07-02 13:07 ` [PATCH net-next 4/6] net_sched: act: use a separate packet counters for gact_determ() Eric Dumazet
` (2 subsequent siblings)
5 siblings, 3 replies; 27+ messages in thread
From: Eric Dumazet @ 2015-07-02 13:07 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, John Fastabend,
Eric Dumazet, Eric Dumazet
First step for gact RCU operation :
Instead of testing if tcfg_pval is zero or not, just make it 1.
No change in behavior, but slightly faster code.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.fastabend@gmail.com>
---
net/sched/act_gact.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
index a4f8af29ee30..42284aad77dd 100644
--- a/net/sched/act_gact.c
+++ b/net/sched/act_gact.c
@@ -28,14 +28,14 @@
#ifdef CONFIG_GACT_PROB
static int gact_net_rand(struct tcf_gact *gact)
{
- if (!gact->tcfg_pval || prandom_u32() % gact->tcfg_pval)
+ if (prandom_u32() % gact->tcfg_pval)
return gact->tcf_action;
return gact->tcfg_paction;
}
static int gact_determ(struct tcf_gact *gact)
{
- if (!gact->tcfg_pval || gact->tcf_bstats.packets % gact->tcfg_pval)
+ if (gact->tcf_bstats.packets % gact->tcfg_pval)
return gact->tcf_action;
return gact->tcfg_paction;
}
@@ -105,7 +105,7 @@ static int tcf_gact_init(struct net *net, struct nlattr *nla,
#ifdef CONFIG_GACT_PROB
if (p_parm) {
gact->tcfg_paction = p_parm->paction;
- gact->tcfg_pval = p_parm->pval;
+ gact->tcfg_pval = max_t(u16, 1, p_parm->pval);
gact->tcfg_ptype = p_parm->ptype;
}
#endif
--
2.4.3.573.g4eafbef
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH net-next 4/6] net_sched: act: use a separate packet counters for gact_determ()
2015-07-02 13:07 [PATCH net-next 0/6] net_sched: act: lockless operation Eric Dumazet
` (2 preceding siblings ...)
2015-07-02 13:07 ` [PATCH net-next 3/6] net_sched: act: make tcfg_pval non zero Eric Dumazet
@ 2015-07-02 13:07 ` Eric Dumazet
2015-07-02 15:44 ` John Fastabend
2015-07-03 10:50 ` Jamal Hadi Salim
2015-07-02 13:07 ` [PATCH net-next 5/6] net_sched: act: read tcfg_ptype once Eric Dumazet
2015-07-02 13:07 ` [PATCH net-next 6/6] net_sched: act: remove spinlock in fast path Eric Dumazet
5 siblings, 2 replies; 27+ messages in thread
From: Eric Dumazet @ 2015-07-02 13:07 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, John Fastabend,
Eric Dumazet, Eric Dumazet
Second step for gact RCU operation :
We want to get rid of the spinlock protecting gact operations.
Stats (packets/bytes) will soon be per cpu.
gact_determ() would not work without a central packet counter,
so lets add it for this mode.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.fastabend@gmail.com>
---
include/net/tc_act/tc_gact.h | 7 ++++---
net/sched/act_gact.c | 4 +++-
2 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/include/net/tc_act/tc_gact.h b/include/net/tc_act/tc_gact.h
index 9fc9b578908a..592a6bc02b0b 100644
--- a/include/net/tc_act/tc_gact.h
+++ b/include/net/tc_act/tc_gact.h
@@ -6,9 +6,10 @@
struct tcf_gact {
struct tcf_common common;
#ifdef CONFIG_GACT_PROB
- u16 tcfg_ptype;
- u16 tcfg_pval;
- int tcfg_paction;
+ u16 tcfg_ptype;
+ u16 tcfg_pval;
+ int tcfg_paction;
+ atomic_t packets;
#endif
};
#define to_gact(a) \
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
index 42284aad77dd..e968290e8378 100644
--- a/net/sched/act_gact.c
+++ b/net/sched/act_gact.c
@@ -35,7 +35,9 @@ static int gact_net_rand(struct tcf_gact *gact)
static int gact_determ(struct tcf_gact *gact)
{
- if (gact->tcf_bstats.packets % gact->tcfg_pval)
+ u32 pack = atomic_inc_return(&gact->packets);
+
+ if (pack % gact->tcfg_pval)
return gact->tcf_action;
return gact->tcfg_paction;
}
--
2.4.3.573.g4eafbef
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH net-next 5/6] net_sched: act: read tcfg_ptype once
2015-07-02 13:07 [PATCH net-next 0/6] net_sched: act: lockless operation Eric Dumazet
` (3 preceding siblings ...)
2015-07-02 13:07 ` [PATCH net-next 4/6] net_sched: act: use a separate packet counters for gact_determ() Eric Dumazet
@ 2015-07-02 13:07 ` Eric Dumazet
2015-07-02 15:47 ` John Fastabend
2015-07-03 10:57 ` Jamal Hadi Salim
2015-07-02 13:07 ` [PATCH net-next 6/6] net_sched: act: remove spinlock in fast path Eric Dumazet
5 siblings, 2 replies; 27+ messages in thread
From: Eric Dumazet @ 2015-07-02 13:07 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, John Fastabend,
Eric Dumazet, Eric Dumazet
Third step for gact RCU operation :
Following patch will get rid of spinlock protection,
so we need to read tcfg_ptype once.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.fastabend@gmail.com>
---
net/sched/act_gact.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
index e968290e8378..7c7e72e95943 100644
--- a/net/sched/act_gact.c
+++ b/net/sched/act_gact.c
@@ -121,16 +121,16 @@ static int tcf_gact(struct sk_buff *skb, const struct tc_action *a,
struct tcf_result *res)
{
struct tcf_gact *gact = a->priv;
- int action = TC_ACT_SHOT;
+ int action = gact->tcf_action;
spin_lock(&gact->tcf_lock);
#ifdef CONFIG_GACT_PROB
- if (gact->tcfg_ptype)
- action = gact_rand[gact->tcfg_ptype](gact);
- else
- action = gact->tcf_action;
-#else
- action = gact->tcf_action;
+ {
+ u32 ptype = READ_ONCE(gact->tcfg_ptype);
+
+ if (ptype)
+ action = gact_rand[ptype](gact);
+ }
#endif
gact->tcf_bstats.bytes += qdisc_pkt_len(skb);
gact->tcf_bstats.packets++;
--
2.4.3.573.g4eafbef
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH net-next 6/6] net_sched: act: remove spinlock in fast path
2015-07-02 13:07 [PATCH net-next 0/6] net_sched: act: lockless operation Eric Dumazet
` (4 preceding siblings ...)
2015-07-02 13:07 ` [PATCH net-next 5/6] net_sched: act: read tcfg_ptype once Eric Dumazet
@ 2015-07-02 13:07 ` Eric Dumazet
2015-07-02 16:35 ` John Fastabend
` (2 more replies)
5 siblings, 3 replies; 27+ messages in thread
From: Eric Dumazet @ 2015-07-02 13:07 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, John Fastabend,
Eric Dumazet, Eric Dumazet
Final step for gact RCU operation :
1) Use percpu stats
2) update lastuse only every clock tick
3) Remove spinlock acquisition, as it is no longer needed.
Since this is the last contended lock in packet RX when tc gact is used,
this gives impressive gain.
My host with 8 RX queues was handling 5 Mpps before the patch,
and more than 10 Mpps after patch.
Tested:
On receiver :
IP=ip
TC=tc
dev=eth0
$TC qdisc del dev $dev ingress 2>/dev/null
$TC qdisc add dev $dev ingress
$TC filter del dev $dev root pref 10 2>/dev/null
$TC filter del dev $dev pref 10 2>/dev/null
tc filter add dev $dev est 1sec 4sec parent ffff: protocol ip prio 1 \
u32 match ip src 7.0.0.0/8 flowid 1:15 action drop
Sender sends packets flood from 7/8 network
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.fastabend@gmail.com>
---
net/sched/act_gact.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
index 7c7e72e95943..e054e5630aab 100644
--- a/net/sched/act_gact.c
+++ b/net/sched/act_gact.c
@@ -88,7 +88,7 @@ static int tcf_gact_init(struct net *net, struct nlattr *nla,
if (!tcf_hash_check(parm->index, a, bind)) {
ret = tcf_hash_create(parm->index, est, a, sizeof(*gact),
- bind, false);
+ bind, true);
if (ret)
return ret;
ret = ACT_P_CREATED;
@@ -121,9 +121,8 @@ static int tcf_gact(struct sk_buff *skb, const struct tc_action *a,
struct tcf_result *res)
{
struct tcf_gact *gact = a->priv;
- int action = gact->tcf_action;
+ int action = READ_ONCE(gact->tcf_action);
- spin_lock(&gact->tcf_lock);
#ifdef CONFIG_GACT_PROB
{
u32 ptype = READ_ONCE(gact->tcfg_ptype);
@@ -132,12 +131,11 @@ static int tcf_gact(struct sk_buff *skb, const struct tc_action *a,
action = gact_rand[ptype](gact);
}
#endif
- gact->tcf_bstats.bytes += qdisc_pkt_len(skb);
- gact->tcf_bstats.packets++;
+ bstats_cpu_update(this_cpu_ptr(gact->common.cpu_bstats), skb);
if (action == TC_ACT_SHOT)
- gact->tcf_qstats.drops++;
- gact->tcf_tm.lastuse = jiffies;
- spin_unlock(&gact->tcf_lock);
+ qstats_drop_inc(this_cpu_ptr(gact->common.cpu_qstats));
+ if (gact->tcf_tm.lastuse != jiffies)
+ gact->tcf_tm.lastuse = jiffies;
return action;
}
--
2.4.3.573.g4eafbef
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 1/6] net: sched: extend percpu stats helpers
2015-07-02 13:07 ` [PATCH net-next 1/6] net: sched: extend percpu stats helpers Eric Dumazet
@ 2015-07-02 15:19 ` John Fastabend
2015-07-03 10:44 ` Jamal Hadi Salim
1 sibling, 0 replies; 27+ messages in thread
From: John Fastabend @ 2015-07-02 15:19 UTC (permalink / raw)
To: Eric Dumazet, David S. Miller
Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, Eric Dumazet
On 15-07-02 06:07 AM, Eric Dumazet wrote:
> qdisc_bstats_update_cpu() and other helpers were added to support
> percpu stats for qdisc.
>
> We want to add percpu stats for tc action, so this patch add common
> helpers.
>
> qdisc_bstats_update_cpu() is renamed to qdisc_bstats_cpu_update()
> qdisc_qstats_drop_cpu() is renamed to qdisc_qstats_cpu_drop()
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> ---
Acked-by: John Fastabend <john.r.fastabend@intel.com>
stupid nit below,
> include/net/sch_generic.h | 27 +++++++++++++++++----------
> net/core/dev.c | 4 ++--
> 2 files changed, 19 insertions(+), 12 deletions(-)
>
> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> index 2738f6f87908..0cd49b21b211 100644
> --- a/include/net/sch_generic.h
> +++ b/include/net/sch_generic.h
> @@ -513,17 +513,21 @@ static inline void bstats_update(struct gnet_stats_basic_packed *bstats,
> bstats->packets += skb_is_gso(skb) ? skb_shinfo(skb)->gso_segs : 1;
> }
>
> -static inline void qdisc_bstats_update_cpu(struct Qdisc *sch,
> - const struct sk_buff *skb)
> +static inline void bstats_cpu_update(struct gnet_stats_basic_cpu *bstats,
> + const struct sk_buff *skb)
> {
> - struct gnet_stats_basic_cpu *bstats =
> - this_cpu_ptr(sch->cpu_bstats);
> -
> u64_stats_update_begin(&bstats->syncp);
> bstats_update(&bstats->bstats, skb);
> u64_stats_update_end(&bstats->syncp);
> }
>
> +static inline void qdisc_bstats_cpu_update(struct Qdisc *sch,
> + const struct sk_buff *skb)
> +{
> + bstats_cpu_update(this_cpu_ptr(sch->cpu_bstats), skb);
> +
spurious new line.
> +}
> +
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 2/6] net: sched: add percpu stats to actions
2015-07-02 13:07 ` [PATCH net-next 2/6] net: sched: add percpu stats to actions Eric Dumazet
@ 2015-07-02 15:31 ` John Fastabend
2015-07-03 10:48 ` Jamal Hadi Salim
1 sibling, 0 replies; 27+ messages in thread
From: John Fastabend @ 2015-07-02 15:31 UTC (permalink / raw)
To: Eric Dumazet, David S. Miller
Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, Eric Dumazet
On 15-07-02 06:07 AM, Eric Dumazet wrote:
> Reuse existing percpu infrastructure Jonh added for qdisc.
>
> This patch adds a new cpustats parameter to tcf_hash_create() and all
> actions pass false, meaning this patch should have no effect yet.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> ---
> include/net/act_api.h | 4 +++-
> net/sched/act_api.c | 44 ++++++++++++++++++++++++++++++++++----------
> net/sched/act_bpf.c | 2 +-
> net/sched/act_connmark.c | 3 ++-
> net/sched/act_csum.c | 3 ++-
> net/sched/act_gact.c | 3 ++-
> net/sched/act_ipt.c | 2 +-
> net/sched/act_mirred.c | 3 ++-
> net/sched/act_nat.c | 3 ++-
> net/sched/act_pedit.c | 3 ++-
> net/sched/act_simple.c | 3 ++-
> net/sched/act_skbedit.c | 3 ++-
> net/sched/act_vlan.c | 3 ++-
> 13 files changed, 57 insertions(+), 22 deletions(-)
>
Nice.
Acked-by: John Fastabend <john.r.fastabend@intel.com>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 3/6] net_sched: act: make tcfg_pval non zero
2015-07-02 13:07 ` [PATCH net-next 3/6] net_sched: act: make tcfg_pval non zero Eric Dumazet
@ 2015-07-02 15:33 ` John Fastabend
2015-07-02 16:58 ` Alexei Starovoitov
2015-07-03 10:49 ` Jamal Hadi Salim
2 siblings, 0 replies; 27+ messages in thread
From: John Fastabend @ 2015-07-02 15:33 UTC (permalink / raw)
To: Eric Dumazet, David S. Miller
Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, Eric Dumazet
On 15-07-02 06:07 AM, Eric Dumazet wrote:
> First step for gact RCU operation :
>
> Instead of testing if tcfg_pval is zero or not, just make it 1.
>
> No change in behavior, but slightly faster code.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> ---
> net/sched/act_gact.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
> index a4f8af29ee30..42284aad77dd 100644
> --- a/net/sched/act_gact.c
> +++ b/net/sched/act_gact.c
Acked-by: John Fastabend <john.r.fastabend@intel.com>
> @@ -28,14 +28,14 @@
> #ifdef CONFIG_GACT_PROB
> static int gact_net_rand(struct tcf_gact *gact)
> {
> - if (!gact->tcfg_pval || prandom_u32() % gact->tcfg_pval)
> + if (prandom_u32() % gact->tcfg_pval)
> return gact->tcf_action;
> return gact->tcfg_paction;
> }
>
> static int gact_determ(struct tcf_gact *gact)
> {
> - if (!gact->tcfg_pval || gact->tcf_bstats.packets % gact->tcfg_pval)
> + if (gact->tcf_bstats.packets % gact->tcfg_pval)
> return gact->tcf_action;
> return gact->tcfg_paction;
> }
> @@ -105,7 +105,7 @@ static int tcf_gact_init(struct net *net, struct nlattr *nla,
> #ifdef CONFIG_GACT_PROB
> if (p_parm) {
> gact->tcfg_paction = p_parm->paction;
> - gact->tcfg_pval = p_parm->pval;
> + gact->tcfg_pval = max_t(u16, 1, p_parm->pval);
> gact->tcfg_ptype = p_parm->ptype;
> }
> #endif
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 4/6] net_sched: act: use a separate packet counters for gact_determ()
2015-07-02 13:07 ` [PATCH net-next 4/6] net_sched: act: use a separate packet counters for gact_determ() Eric Dumazet
@ 2015-07-02 15:44 ` John Fastabend
2015-07-03 10:50 ` Jamal Hadi Salim
1 sibling, 0 replies; 27+ messages in thread
From: John Fastabend @ 2015-07-02 15:44 UTC (permalink / raw)
To: Eric Dumazet, David S. Miller
Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, Eric Dumazet
On 15-07-02 06:07 AM, Eric Dumazet wrote:
> Second step for gact RCU operation :
>
> We want to get rid of the spinlock protecting gact operations.
> Stats (packets/bytes) will soon be per cpu.
>
> gact_determ() would not work without a central packet counter,
> so lets add it for this mode.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> ---
Acked-by: John Fastabend <john.r.fastabend@intel.com>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 5/6] net_sched: act: read tcfg_ptype once
2015-07-02 13:07 ` [PATCH net-next 5/6] net_sched: act: read tcfg_ptype once Eric Dumazet
@ 2015-07-02 15:47 ` John Fastabend
2015-07-03 10:57 ` Jamal Hadi Salim
1 sibling, 0 replies; 27+ messages in thread
From: John Fastabend @ 2015-07-02 15:47 UTC (permalink / raw)
To: Eric Dumazet, David S. Miller
Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, Eric Dumazet
On 15-07-02 06:07 AM, Eric Dumazet wrote:
> Third step for gact RCU operation :
>
> Following patch will get rid of spinlock protection,
> so we need to read tcfg_ptype once.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> ---
> net/sched/act_gact.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
> index e968290e8378..7c7e72e95943 100644
> --- a/net/sched/act_gact.c
> +++ b/net/sched/act_gact.c
> @@ -121,16 +121,16 @@ static int tcf_gact(struct sk_buff *skb, const struct tc_action *a,
> struct tcf_result *res)
> {
> struct tcf_gact *gact = a->priv;
> - int action = TC_ACT_SHOT;
> + int action = gact->tcf_action;
>
> spin_lock(&gact->tcf_lock);
> #ifdef CONFIG_GACT_PROB
> - if (gact->tcfg_ptype)
> - action = gact_rand[gact->tcfg_ptype](gact);
> - else
> - action = gact->tcf_action;
> -#else
> - action = gact->tcf_action;
> + {
> + u32 ptype = READ_ONCE(gact->tcfg_ptype);
> +
> + if (ptype)
> + action = gact_rand[ptype](gact);
> + }
> #endif
> gact->tcf_bstats.bytes += qdisc_pkt_len(skb);
> gact->tcf_bstats.packets++;
>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 6/6] net_sched: act: remove spinlock in fast path
2015-07-02 13:07 ` [PATCH net-next 6/6] net_sched: act: remove spinlock in fast path Eric Dumazet
@ 2015-07-02 16:35 ` John Fastabend
2015-07-02 20:59 ` Eric Dumazet
2015-07-02 17:13 ` Alexei Starovoitov
2015-07-03 11:07 ` Jamal Hadi Salim
2 siblings, 1 reply; 27+ messages in thread
From: John Fastabend @ 2015-07-02 16:35 UTC (permalink / raw)
To: Eric Dumazet, David S. Miller
Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, Eric Dumazet
On 15-07-02 06:07 AM, Eric Dumazet wrote:
> Final step for gact RCU operation :
>
> 1) Use percpu stats
> 2) update lastuse only every clock tick
> 3) Remove spinlock acquisition, as it is no longer needed.
>
> Since this is the last contended lock in packet RX when tc gact is used,
> this gives impressive gain.
>
> My host with 8 RX queues was handling 5 Mpps before the patch,
> and more than 10 Mpps after patch.
>
> Tested:
>
> On receiver :
> IP=ip
> TC=tc
> dev=eth0
>
> $TC qdisc del dev $dev ingress 2>/dev/null
> $TC qdisc add dev $dev ingress
> $TC filter del dev $dev root pref 10 2>/dev/null
> $TC filter del dev $dev pref 10 2>/dev/null
> tc filter add dev $dev est 1sec 4sec parent ffff: protocol ip prio 1 \
> u32 match ip src 7.0.0.0/8 flowid 1:15 action drop
>
> Sender sends packets flood from 7/8 network
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> ---
> net/sched/act_gact.c | 14 ++++++--------
> 1 file changed, 6 insertions(+), 8 deletions(-)
[...]
> @@ -121,9 +121,8 @@ static int tcf_gact(struct sk_buff *skb, const struct tc_action *a,
> struct tcf_result *res)
> {
> struct tcf_gact *gact = a->priv;
> - int action = gact->tcf_action;
> + int action = READ_ONCE(gact->tcf_action);
>
> - spin_lock(&gact->tcf_lock);
> #ifdef CONFIG_GACT_PROB
> {
> u32 ptype = READ_ONCE(gact->tcfg_ptype);
> @@ -132,12 +131,11 @@ static int tcf_gact(struct sk_buff *skb, const struct tc_action *a,
> action = gact_rand[ptype](gact);
> }
> #endif
> - gact->tcf_bstats.bytes += qdisc_pkt_len(skb);
> - gact->tcf_bstats.packets++;
> + bstats_cpu_update(this_cpu_ptr(gact->common.cpu_bstats), skb);
> if (action == TC_ACT_SHOT)
> - gact->tcf_qstats.drops++;
> - gact->tcf_tm.lastuse = jiffies;
> - spin_unlock(&gact->tcf_lock);
> + qstats_drop_inc(this_cpu_ptr(gact->common.cpu_qstats));
> + if (gact->tcf_tm.lastuse != jiffies)
> + gact->tcf_tm.lastuse = jiffies;
I'm missing the point of the if block. Is that really good enough
for the 32bit system case? I would have expected some wrapper to
handle it here something like u64_stats_() maybe _u64_jiffies(). Maybe
after a coffee I'll make sense of it.
>
> return action;
> }
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 3/6] net_sched: act: make tcfg_pval non zero
2015-07-02 13:07 ` [PATCH net-next 3/6] net_sched: act: make tcfg_pval non zero Eric Dumazet
2015-07-02 15:33 ` John Fastabend
@ 2015-07-02 16:58 ` Alexei Starovoitov
2015-07-03 10:49 ` Jamal Hadi Salim
2 siblings, 0 replies; 27+ messages in thread
From: Alexei Starovoitov @ 2015-07-02 16:58 UTC (permalink / raw)
To: Eric Dumazet, David S. Miller
Cc: netdev, Jamal Hadi Salim, John Fastabend, Eric Dumazet
On 7/2/15 6:07 AM, Eric Dumazet wrote:
> First step for gact RCU operation :
>
> Instead of testing if tcfg_pval is zero or not, just make it 1.
>
> No change in behavior, but slightly faster code.
>
> Signed-off-by: Eric Dumazet<edumazet@google.com>
Nice trick!
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 6/6] net_sched: act: remove spinlock in fast path
2015-07-02 13:07 ` [PATCH net-next 6/6] net_sched: act: remove spinlock in fast path Eric Dumazet
2015-07-02 16:35 ` John Fastabend
@ 2015-07-02 17:13 ` Alexei Starovoitov
2015-07-03 11:07 ` Jamal Hadi Salim
2 siblings, 0 replies; 27+ messages in thread
From: Alexei Starovoitov @ 2015-07-02 17:13 UTC (permalink / raw)
To: Eric Dumazet, David S. Miller
Cc: netdev, Jamal Hadi Salim, John Fastabend, Eric Dumazet
On 7/2/15 6:07 AM, Eric Dumazet wrote:
> Final step for gact RCU operation :
>
> 1) Use percpu stats
> 2) update lastuse only every clock tick
> 3) Remove spinlock acquisition, as it is no longer needed.
>
> Since this is the last contended lock in packet RX when tc gact is used,
> this gives impressive gain.
>
> My host with 8 RX queues was handling 5 Mpps before the patch,
> and more than 10 Mpps after patch.
>
> Signed-off-by: Eric Dumazet<edumazet@google.com>
Great stuff. Thank you for fixing it!
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 6/6] net_sched: act: remove spinlock in fast path
2015-07-02 16:35 ` John Fastabend
@ 2015-07-02 20:59 ` Eric Dumazet
2015-07-03 11:25 ` Jamal Hadi Salim
0 siblings, 1 reply; 27+ messages in thread
From: Eric Dumazet @ 2015-07-02 20:59 UTC (permalink / raw)
To: John Fastabend
Cc: Eric Dumazet, David S. Miller, netdev, Alexei Starovoitov,
Jamal Hadi Salim, Eric Dumazet
On Thu, 2015-07-02 at 09:35 -0700, John Fastabend wrote:
> > + if (gact->tcf_tm.lastuse != jiffies)
> > + gact->tcf_tm.lastuse = jiffies;
>
> I'm missing the point of the if block. Is that really good enough
> for the 32bit system case? I would have expected some wrapper to
> handle it here something like u64_stats_() maybe _u64_jiffies(). Maybe
> after a coffee I'll make sense of it.
>
Point is to not dirty cache line for every packet ?
Doing the test means we attempt dirtying only ~HZ times per second,
which really matters to handle millions of packets per second.
My tests show a good enough performance, not sure we want a percpu thing
for this lastuse field.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 1/6] net: sched: extend percpu stats helpers
2015-07-02 13:07 ` [PATCH net-next 1/6] net: sched: extend percpu stats helpers Eric Dumazet
2015-07-02 15:19 ` John Fastabend
@ 2015-07-03 10:44 ` Jamal Hadi Salim
1 sibling, 0 replies; 27+ messages in thread
From: Jamal Hadi Salim @ 2015-07-03 10:44 UTC (permalink / raw)
To: Eric Dumazet, David S. Miller
Cc: netdev, Alexei Starovoitov, John Fastabend, Eric Dumazet
On 07/02/15 09:07, Eric Dumazet wrote:
> qdisc_bstats_update_cpu() and other helpers were added to support
> percpu stats for qdisc.
>
> We want to add percpu stats for tc action, so this patch add common
> helpers.
>
> qdisc_bstats_update_cpu() is renamed to qdisc_bstats_cpu_update()
> qdisc_qstats_drop_cpu() is renamed to qdisc_qstats_cpu_drop()
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
cheers,
jamal
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 2/6] net: sched: add percpu stats to actions
2015-07-02 13:07 ` [PATCH net-next 2/6] net: sched: add percpu stats to actions Eric Dumazet
2015-07-02 15:31 ` John Fastabend
@ 2015-07-03 10:48 ` Jamal Hadi Salim
1 sibling, 0 replies; 27+ messages in thread
From: Jamal Hadi Salim @ 2015-07-03 10:48 UTC (permalink / raw)
To: Eric Dumazet, David S. Miller
Cc: netdev, Alexei Starovoitov, John Fastabend, Eric Dumazet
On 07/02/15 09:07, Eric Dumazet wrote:
> Reuse existing percpu infrastructure Jonh added for qdisc.
>
I am not big on typos - s/Jonh/John ? ;->
> This patch adds a new cpustats parameter to tcf_hash_create() and all
> actions pass false, meaning this patch should have no effect yet.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
cheers,
jamal
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 3/6] net_sched: act: make tcfg_pval non zero
2015-07-02 13:07 ` [PATCH net-next 3/6] net_sched: act: make tcfg_pval non zero Eric Dumazet
2015-07-02 15:33 ` John Fastabend
2015-07-02 16:58 ` Alexei Starovoitov
@ 2015-07-03 10:49 ` Jamal Hadi Salim
2015-07-04 7:18 ` Eric Dumazet
2 siblings, 1 reply; 27+ messages in thread
From: Jamal Hadi Salim @ 2015-07-03 10:49 UTC (permalink / raw)
To: Eric Dumazet, David S. Miller
Cc: netdev, Alexei Starovoitov, John Fastabend, Eric Dumazet
On 07/02/15 09:07, Eric Dumazet wrote:
> First step for gact RCU operation :
>
> Instead of testing if tcfg_pval is zero or not, just make it 1.
>
> No change in behavior, but slightly faster code.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
cheers,
jamal
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 4/6] net_sched: act: use a separate packet counters for gact_determ()
2015-07-02 13:07 ` [PATCH net-next 4/6] net_sched: act: use a separate packet counters for gact_determ() Eric Dumazet
2015-07-02 15:44 ` John Fastabend
@ 2015-07-03 10:50 ` Jamal Hadi Salim
1 sibling, 0 replies; 27+ messages in thread
From: Jamal Hadi Salim @ 2015-07-03 10:50 UTC (permalink / raw)
To: Eric Dumazet, David S. Miller
Cc: netdev, Alexei Starovoitov, John Fastabend, Eric Dumazet
On 07/02/15 09:07, Eric Dumazet wrote:
> Second step for gact RCU operation :
>
> We want to get rid of the spinlock protecting gact operations.
> Stats (packets/bytes) will soon be per cpu.
>
> gact_determ() would not work without a central packet counter,
> so lets add it for this mode.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
cheers,
jamal
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 5/6] net_sched: act: read tcfg_ptype once
2015-07-02 13:07 ` [PATCH net-next 5/6] net_sched: act: read tcfg_ptype once Eric Dumazet
2015-07-02 15:47 ` John Fastabend
@ 2015-07-03 10:57 ` Jamal Hadi Salim
1 sibling, 0 replies; 27+ messages in thread
From: Jamal Hadi Salim @ 2015-07-03 10:57 UTC (permalink / raw)
To: Eric Dumazet, David S. Miller
Cc: netdev, Alexei Starovoitov, John Fastabend, Eric Dumazet
On 07/02/15 09:07, Eric Dumazet wrote:
> Third step for gact RCU operation :
>
> Following patch will get rid of spinlock protection,
> so we need to read tcfg_ptype once.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
cheers,
jamal
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 6/6] net_sched: act: remove spinlock in fast path
2015-07-02 13:07 ` [PATCH net-next 6/6] net_sched: act: remove spinlock in fast path Eric Dumazet
2015-07-02 16:35 ` John Fastabend
2015-07-02 17:13 ` Alexei Starovoitov
@ 2015-07-03 11:07 ` Jamal Hadi Salim
2015-07-04 5:45 ` Eric Dumazet
2 siblings, 1 reply; 27+ messages in thread
From: Jamal Hadi Salim @ 2015-07-03 11:07 UTC (permalink / raw)
To: Eric Dumazet, David S. Miller
Cc: netdev, Alexei Starovoitov, John Fastabend, Eric Dumazet,
Gregoire Baron, Jiri Pirko, alexander.h.duyck@redhat.com
On 07/02/15 09:07, Eric Dumazet wrote:
> Final step for gact RCU operation :
>
> 1) Use percpu stats
> 2) update lastuse only every clock tick
> 3) Remove spinlock acquisition, as it is no longer needed.
>
> Since this is the last contended lock in packet RX when tc gact is used,
> this gives impressive gain.
>
> My host with 8 RX queues was handling 5 Mpps before the patch,
> and more than 10 Mpps after patch.
>
> Tested:
>
> On receiver :
> IP=ip
> TC=tc
> dev=eth0
>
> $TC qdisc del dev $dev ingress 2>/dev/null
> $TC qdisc add dev $dev ingress
> $TC filter del dev $dev root pref 10 2>/dev/null
> $TC filter del dev $dev pref 10 2>/dev/null
> tc filter add dev $dev est 1sec 4sec parent ffff: protocol ip prio 1 \
> u32 match ip src 7.0.0.0/8 flowid 1:15 action drop
>
> Sender sends packets flood from 7/8 network
>
Very nice Eric;-> thanks.
So now basic accept/drop should be flying ;->
Other really low hanging fruit is act_csum, vlan and skbedit.
CCing the respective authors.
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
cheers,
jamal
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 6/6] net_sched: act: remove spinlock in fast path
2015-07-02 20:59 ` Eric Dumazet
@ 2015-07-03 11:25 ` Jamal Hadi Salim
2015-07-03 12:08 ` Eric Dumazet
0 siblings, 1 reply; 27+ messages in thread
From: Jamal Hadi Salim @ 2015-07-03 11:25 UTC (permalink / raw)
To: Eric Dumazet, John Fastabend
Cc: Eric Dumazet, David S. Miller, netdev, Alexei Starovoitov,
Eric Dumazet
On 07/02/15 16:59, Eric Dumazet wrote:
> On Thu, 2015-07-02 at 09:35 -0700, John Fastabend wrote:
>
> Point is to not dirty cache line for every packet ?
>
> Doing the test means we attempt dirtying only ~HZ times per second,
> which really matters to handle millions of packets per second.
>
> My tests show a good enough performance, not sure we want a percpu thing
> for this lastuse field.
>
Does it harm to always set gact->tcf_tm.lastuse ?
cheers,
jamal
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 6/6] net_sched: act: remove spinlock in fast path
2015-07-03 11:25 ` Jamal Hadi Salim
@ 2015-07-03 12:08 ` Eric Dumazet
2015-07-03 17:21 ` John Fastabend
0 siblings, 1 reply; 27+ messages in thread
From: Eric Dumazet @ 2015-07-03 12:08 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: John Fastabend, Eric Dumazet, David S. Miller, netdev,
Alexei Starovoitov, Eric Dumazet
On Fri, 2015-07-03 at 07:25 -0400, Jamal Hadi Salim wrote:
> On 07/02/15 16:59, Eric Dumazet wrote:
> > On Thu, 2015-07-02 at 09:35 -0700, John Fastabend wrote:
>
> >
> > Point is to not dirty cache line for every packet ?
> >
> > Doing the test means we attempt dirtying only ~HZ times per second,
> > which really matters to handle millions of packets per second.
> >
> > My tests show a good enough performance, not sure we want a percpu thing
> > for this lastuse field.
> >
>
> Does it harm to always set gact->tcf_tm.lastuse ?
Yes it harms, because of one false sharing on a cache line, for every
packet...
Samples: 20K of event 'cycles:pp', Event count (approx.): 15451660731
32.50% ksoftirqd/1 [kernel.kallsyms] [k] tcf_gact
19.43% ksoftirqd/1 [kernel.kallsyms] [k] put_compound_page
4.34% ksoftirqd/1 [kernel.kallsyms] [k] __skb_get_hash
3.88% ksoftirqd/1 [kernel.kallsyms] [k] memcpy_erms
2.95% ksoftirqd/1 [kernel.kallsyms] [k] get_rps_cpu
2.81% ksoftirqd/1 [kernel.kallsyms] [k] __build_skb
Note that gact->tcf_action & gact->tcf_tm.lastuse share same cache line
│ Disassembly of section load0:
│
│ ffffffffa016f040 <load0>:
0.11 │ nop
0.06 │ push %rbp
0.14 │ mov %rsp,%rbp
0.06 │ push %r12
│ mov %rdi,%r12
│ push %rbx
0.16 │ mov (%rsi),%rbx
98.46 │ mov 0x20(%rbx),%eax action = READ_ONCE(gact->tcf_action);
_huge_ stall here because another cpu dirtied gact->tcf_tm.lastuse
0.33 │ movzwl 0x98(%rbx),%edx
│ test %edx,%edx
│ jne 87
│20: mov 0x88(%rbx),%rdx
0.11 │ add %gs:0x5fe9b0b9(%rip),%rdx
│ mov 0x28(%r12),%ecx
0.02 │ mov 0x8(%rdx),%edi
│ mov $0x1,%esi
0.11 │ add %rcx,(%rdx)
│ mov 0xc8(%r12),%ecx
│ add 0xd0(%r12),%rcx
0.08 │ cmpw $0x0,0x2(%rcx)
│ je 5a
│ movzwl 0x4(%rcx),%esi
│5a: add %edi,%esi
│ cmp $0x2,%eax
0.14 │ mov %esi,0x8(%rdx)
│ jne 77
│ mov 0x90(%rbx),%rdx
│ add %gs:0x5fe9b075(%rip),%rdx
0.11 │ addl $0x1,0x8(%rdx)
│77: mov -0x1e5640be(%rip),%rdx
│ mov %rdx,0x30(%rbx) gact->tcf_tm.lastuse = jiffies;
0.08 │ pop %rbx
0.02 │ pop %r12
│ pop %rbp
│ retq
│87: mov %edx,%edx
│ mov %rbx,%rdi
│ callq *-0x5fe90b70(,%rdx,8)
│ jmp 20
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 6/6] net_sched: act: remove spinlock in fast path
2015-07-03 12:08 ` Eric Dumazet
@ 2015-07-03 17:21 ` John Fastabend
0 siblings, 0 replies; 27+ messages in thread
From: John Fastabend @ 2015-07-03 17:21 UTC (permalink / raw)
To: Eric Dumazet, Jamal Hadi Salim
Cc: Eric Dumazet, David S. Miller, netdev, Alexei Starovoitov,
Eric Dumazet
On 15-07-03 05:08 AM, Eric Dumazet wrote:
> On Fri, 2015-07-03 at 07:25 -0400, Jamal Hadi Salim wrote:
>> On 07/02/15 16:59, Eric Dumazet wrote:
>>> On Thu, 2015-07-02 at 09:35 -0700, John Fastabend wrote:
>>
>>>
>>> Point is to not dirty cache line for every packet ?
>>>
>>> Doing the test means we attempt dirtying only ~HZ times per second,
>>> which really matters to handle millions of packets per second.
>>>
>>> My tests show a good enough performance, not sure we want a percpu thing
>>> for this lastuse field.
Agreed percpu is overkill thanks for the explanation. Looks good
to me.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 6/6] net_sched: act: remove spinlock in fast path
2015-07-03 11:07 ` Jamal Hadi Salim
@ 2015-07-04 5:45 ` Eric Dumazet
0 siblings, 0 replies; 27+ messages in thread
From: Eric Dumazet @ 2015-07-04 5:45 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: David S. Miller, netdev, Alexei Starovoitov, John Fastabend,
Eric Dumazet, Gregoire Baron, Jiri Pirko,
alexander.h.duyck@redhat.com
On Fri, Jul 3, 2015 at 1:07 PM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> Very nice Eric;-> thanks.
> So now basic accept/drop should be flying ;->
> Other really low hanging fruit is act_csum, vlan and skbedit.
> CCing the respective authors.
Note that act_mirred can be converted as well. I will do this because
this is my main usage.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH net-next 3/6] net_sched: act: make tcfg_pval non zero
2015-07-03 10:49 ` Jamal Hadi Salim
@ 2015-07-04 7:18 ` Eric Dumazet
0 siblings, 0 replies; 27+ messages in thread
From: Eric Dumazet @ 2015-07-04 7:18 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: David S. Miller, netdev, Alexei Starovoitov, John Fastabend,
Eric Dumazet
Thanks guys for the review.
For completeness, I'll add smp_wmb() here :
gact->tcfg_pval = max_t(u16, 1, p_parm->pval);
smp_wmb();
gact->tcfg_ptype = p_parm->ptype;
And corresponding smp_rmb()
On Fri, Jul 3, 2015 at 12:49 PM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 07/02/15 09:07, Eric Dumazet wrote:
>>
>> First step for gact RCU operation :
>>
>> Instead of testing if tcfg_pval is zero or not, just make it 1.
>>
>> No change in behavior, but slightly faster code.
>>
>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>
>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
>
> cheers,
> jamal
>
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2015-07-05 10:55 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-07-02 13:07 [PATCH net-next 0/6] net_sched: act: lockless operation Eric Dumazet
2015-07-02 13:07 ` [PATCH net-next 1/6] net: sched: extend percpu stats helpers Eric Dumazet
2015-07-02 15:19 ` John Fastabend
2015-07-03 10:44 ` Jamal Hadi Salim
2015-07-02 13:07 ` [PATCH net-next 2/6] net: sched: add percpu stats to actions Eric Dumazet
2015-07-02 15:31 ` John Fastabend
2015-07-03 10:48 ` Jamal Hadi Salim
2015-07-02 13:07 ` [PATCH net-next 3/6] net_sched: act: make tcfg_pval non zero Eric Dumazet
2015-07-02 15:33 ` John Fastabend
2015-07-02 16:58 ` Alexei Starovoitov
2015-07-03 10:49 ` Jamal Hadi Salim
2015-07-04 7:18 ` Eric Dumazet
2015-07-02 13:07 ` [PATCH net-next 4/6] net_sched: act: use a separate packet counters for gact_determ() Eric Dumazet
2015-07-02 15:44 ` John Fastabend
2015-07-03 10:50 ` Jamal Hadi Salim
2015-07-02 13:07 ` [PATCH net-next 5/6] net_sched: act: read tcfg_ptype once Eric Dumazet
2015-07-02 15:47 ` John Fastabend
2015-07-03 10:57 ` Jamal Hadi Salim
2015-07-02 13:07 ` [PATCH net-next 6/6] net_sched: act: remove spinlock in fast path Eric Dumazet
2015-07-02 16:35 ` John Fastabend
2015-07-02 20:59 ` Eric Dumazet
2015-07-03 11:25 ` Jamal Hadi Salim
2015-07-03 12:08 ` Eric Dumazet
2015-07-03 17:21 ` John Fastabend
2015-07-02 17:13 ` Alexei Starovoitov
2015-07-03 11:07 ` Jamal Hadi Salim
2015-07-04 5:45 ` Eric Dumazet
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).