* [PATCH v2 net-next 0/7] net_sched: act: lockless operation
@ 2015-07-06 6:41 Eric Dumazet
2015-07-06 6:41 ` [PATCH v2 net-next 1/7] net: sched: extend percpu stats helpers Eric Dumazet
` (6 more replies)
0 siblings, 7 replies; 9+ messages in thread
From: Eric Dumazet @ 2015-07-06 6:41 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, John Fastabend,
Eric Dumazet, Eric Dumazet
As mentioned by Alexei last week in Budapest, it is a bit weird
to take a spinlock in order to drop a packet in a tc filter...
Lets add percpu infra for tc actions and use it for gact & mirred.
Before changes, my host with 8 RX queues was handling 5 Mpps with gact,
and more than 11 Mpps after.
Mirred change is not yet visible if ifb+qdisc is used, as ifb is
not yet multi queue enabled, but is a step forward.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Eric Dumazet (7):
net: sched: extend percpu stats helpers
net: sched: add percpu stats to actions
net_sched: act_gact: make tcfg_pval non zero
net_sched: act_gact: use a separate packet counters for gact_determ()
net_sched: act_gact: read tcfg_ptype once
net_sched: act_gact: remove spinlock in fast path
net_sched: act_mirred: remove spinlock in fast path
include/net/act_api.h | 15 ++++++++++-
include/net/sch_generic.h | 31 ++++++++++++++--------
include/net/tc_act/tc_gact.h | 7 ++---
include/net/tc_act/tc_mirred.h | 2 +-
net/core/dev.c | 4 +--
net/sched/act_api.c | 44 ++++++++++++++++++++++++--------
net/sched/act_bpf.c | 2 +-
net/sched/act_connmark.c | 3 ++-
net/sched/act_csum.c | 3 ++-
net/sched/act_gact.c | 44 ++++++++++++++++++--------------
net/sched/act_ipt.c | 2 +-
net/sched/act_mirred.c | 58 ++++++++++++++++++++++--------------------
net/sched/act_nat.c | 3 ++-
net/sched/act_pedit.c | 3 ++-
net/sched/act_simple.c | 3 ++-
net/sched/act_skbedit.c | 3 ++-
net/sched/act_vlan.c | 3 ++-
17 files changed, 148 insertions(+), 82 deletions(-)
--
2.4.3.573.g4eafbef
^ permalink raw reply [flat|nested] 9+ messages in thread* [PATCH v2 net-next 1/7] net: sched: extend percpu stats helpers 2015-07-06 6:41 [PATCH v2 net-next 0/7] net_sched: act: lockless operation Eric Dumazet @ 2015-07-06 6:41 ` Eric Dumazet 2015-07-06 6:41 ` [PATCH v2 net-next 2/7] net: sched: add percpu stats to actions Eric Dumazet ` (5 subsequent siblings) 6 siblings, 0 replies; 9+ messages in thread From: Eric Dumazet @ 2015-07-06 6:41 UTC (permalink / raw) To: David S. Miller Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, John Fastabend, Eric Dumazet, Eric Dumazet qdisc_bstats_update_cpu() and other helpers were added to support percpu stats for qdisc. We want to add percpu stats for tc action, so this patch add common helpers. qdisc_bstats_update_cpu() is renamed to qdisc_bstats_cpu_update() qdisc_qstats_drop_cpu() is renamed to qdisc_qstats_cpu_drop() Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> --- include/net/sch_generic.h | 31 +++++++++++++++++++++---------- net/core/dev.c | 4 ++-- 2 files changed, 23 insertions(+), 12 deletions(-) diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 2738f6f87908..2eab08c38e32 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -513,17 +513,20 @@ static inline void bstats_update(struct gnet_stats_basic_packed *bstats, bstats->packets += skb_is_gso(skb) ? skb_shinfo(skb)->gso_segs : 1; } -static inline void qdisc_bstats_update_cpu(struct Qdisc *sch, - const struct sk_buff *skb) +static inline void bstats_cpu_update(struct gnet_stats_basic_cpu *bstats, + const struct sk_buff *skb) { - struct gnet_stats_basic_cpu *bstats = - this_cpu_ptr(sch->cpu_bstats); - u64_stats_update_begin(&bstats->syncp); bstats_update(&bstats->bstats, skb); u64_stats_update_end(&bstats->syncp); } +static inline void qdisc_bstats_cpu_update(struct Qdisc *sch, + const struct sk_buff *skb) +{ + bstats_cpu_update(this_cpu_ptr(sch->cpu_bstats), skb); +} + static inline void qdisc_bstats_update(struct Qdisc *sch, const struct sk_buff *skb) { @@ -547,16 +550,24 @@ static inline void __qdisc_qstats_drop(struct Qdisc *sch, int count) sch->qstats.drops += count; } -static inline void qdisc_qstats_drop(struct Qdisc *sch) +static inline void qstats_drop_inc(struct gnet_stats_queue *qstats) { - sch->qstats.drops++; + qstats->drops++; } -static inline void qdisc_qstats_drop_cpu(struct Qdisc *sch) +static inline void qstats_overlimit_inc(struct gnet_stats_queue *qstats) { - struct gnet_stats_queue *qstats = this_cpu_ptr(sch->cpu_qstats); + qstats->overlimits++; +} - qstats->drops++; +static inline void qdisc_qstats_drop(struct Qdisc *sch) +{ + qstats_drop_inc(&sch->qstats); +} + +static inline void qdisc_qstats_cpu_drop(struct Qdisc *sch) +{ + qstats_drop_inc(this_cpu_ptr(sch->cpu_qstats)); } static inline void qdisc_qstats_overlimit(struct Qdisc *sch) diff --git a/net/core/dev.c b/net/core/dev.c index 6778a9999d52..e0d270143fc7 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -3646,7 +3646,7 @@ static inline struct sk_buff *handle_ing(struct sk_buff *skb, qdisc_skb_cb(skb)->pkt_len = skb->len; skb->tc_verd = SET_TC_AT(skb->tc_verd, AT_INGRESS); - qdisc_bstats_update_cpu(cl->q, skb); + qdisc_bstats_cpu_update(cl->q, skb); switch (tc_classify(skb, cl, &cl_res)) { case TC_ACT_OK: @@ -3654,7 +3654,7 @@ static inline struct sk_buff *handle_ing(struct sk_buff *skb, skb->tc_index = TC_H_MIN(cl_res.classid); break; case TC_ACT_SHOT: - qdisc_qstats_drop_cpu(cl->q); + qdisc_qstats_cpu_drop(cl->q); case TC_ACT_STOLEN: case TC_ACT_QUEUED: kfree_skb(skb); -- 2.4.3.573.g4eafbef ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v2 net-next 2/7] net: sched: add percpu stats to actions 2015-07-06 6:41 [PATCH v2 net-next 0/7] net_sched: act: lockless operation Eric Dumazet 2015-07-06 6:41 ` [PATCH v2 net-next 1/7] net: sched: extend percpu stats helpers Eric Dumazet @ 2015-07-06 6:41 ` Eric Dumazet 2015-07-06 6:41 ` [PATCH v2 net-next 3/7] net_sched: act_gact: make tcfg_pval non zero Eric Dumazet ` (4 subsequent siblings) 6 siblings, 0 replies; 9+ messages in thread From: Eric Dumazet @ 2015-07-06 6:41 UTC (permalink / raw) To: David S. Miller Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, John Fastabend, Eric Dumazet, Eric Dumazet Reuse existing percpu infrastructure John Fastabend added for qdisc. This patch adds a new cpustats parameter to tcf_hash_create() and all actions pass false, meaning this patch should have no effect yet. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> --- include/net/act_api.h | 4 +++- net/sched/act_api.c | 44 ++++++++++++++++++++++++++++++++++---------- net/sched/act_bpf.c | 2 +- net/sched/act_connmark.c | 3 ++- net/sched/act_csum.c | 3 ++- net/sched/act_gact.c | 3 ++- net/sched/act_ipt.c | 2 +- net/sched/act_mirred.c | 3 ++- net/sched/act_nat.c | 3 ++- net/sched/act_pedit.c | 3 ++- net/sched/act_simple.c | 3 ++- net/sched/act_skbedit.c | 3 ++- net/sched/act_vlan.c | 3 ++- 13 files changed, 57 insertions(+), 22 deletions(-) diff --git a/include/net/act_api.h b/include/net/act_api.h index 3ee4c92afd1b..db2063ffd181 100644 --- a/include/net/act_api.h +++ b/include/net/act_api.h @@ -21,6 +21,8 @@ struct tcf_common { struct gnet_stats_rate_est64 tcfc_rate_est; spinlock_t tcfc_lock; struct rcu_head tcfc_rcu; + struct gnet_stats_basic_cpu __percpu *cpu_bstats; + struct gnet_stats_queue __percpu *cpu_qstats; }; #define tcf_head common.tcfc_head #define tcf_index common.tcfc_index @@ -103,7 +105,7 @@ int tcf_hash_release(struct tc_action *a, int bind); u32 tcf_hash_new_index(struct tcf_hashinfo *hinfo); int tcf_hash_check(u32 index, struct tc_action *a, int bind); int tcf_hash_create(u32 index, struct nlattr *est, struct tc_action *a, - int size, int bind); + int size, int bind, bool cpustats); void tcf_hash_cleanup(struct tc_action *a, struct nlattr *est); void tcf_hash_insert(struct tc_action *a); diff --git a/net/sched/act_api.c b/net/sched/act_api.c index af427a3dbcba..074a32f466f8 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -27,6 +27,15 @@ #include <net/act_api.h> #include <net/netlink.h> +static void free_tcf(struct rcu_head *head) +{ + struct tcf_common *p = container_of(head, struct tcf_common, tcfc_rcu); + + free_percpu(p->cpu_bstats); + free_percpu(p->cpu_qstats); + kfree(p); +} + void tcf_hash_destroy(struct tc_action *a) { struct tcf_common *p = a->priv; @@ -41,7 +50,7 @@ void tcf_hash_destroy(struct tc_action *a) * gen_estimator est_timer() might access p->tcfc_lock * or bstats, wait a RCU grace period before freeing p */ - kfree_rcu(p, tcfc_rcu); + call_rcu(&p->tcfc_rcu, free_tcf); } EXPORT_SYMBOL(tcf_hash_destroy); @@ -230,15 +239,16 @@ void tcf_hash_cleanup(struct tc_action *a, struct nlattr *est) if (est) gen_kill_estimator(&pc->tcfc_bstats, &pc->tcfc_rate_est); - kfree_rcu(pc, tcfc_rcu); + call_rcu(&pc->tcfc_rcu, free_tcf); } EXPORT_SYMBOL(tcf_hash_cleanup); int tcf_hash_create(u32 index, struct nlattr *est, struct tc_action *a, - int size, int bind) + int size, int bind, bool cpustats) { struct tcf_hashinfo *hinfo = a->ops->hinfo; struct tcf_common *p = kzalloc(size, GFP_KERNEL); + int err = -ENOMEM; if (unlikely(!p)) return -ENOMEM; @@ -246,18 +256,32 @@ int tcf_hash_create(u32 index, struct nlattr *est, struct tc_action *a, if (bind) p->tcfc_bindcnt = 1; + if (cpustats) { + p->cpu_bstats = netdev_alloc_pcpu_stats(struct gnet_stats_basic_cpu); + if (!p->cpu_bstats) { +err1: + kfree(p); + return err; + } + p->cpu_qstats = alloc_percpu(struct gnet_stats_queue); + if (!p->cpu_qstats) { +err2: + free_percpu(p->cpu_bstats); + goto err1; + } + } spin_lock_init(&p->tcfc_lock); INIT_HLIST_NODE(&p->tcfc_head); p->tcfc_index = index ? index : tcf_hash_new_index(hinfo); p->tcfc_tm.install = jiffies; p->tcfc_tm.lastuse = jiffies; if (est) { - int err = gen_new_estimator(&p->tcfc_bstats, NULL, - &p->tcfc_rate_est, - &p->tcfc_lock, est); + err = gen_new_estimator(&p->tcfc_bstats, p->cpu_bstats, + &p->tcfc_rate_est, + &p->tcfc_lock, est); if (err) { - kfree(p); - return err; + free_percpu(p->cpu_qstats); + goto err2; } } @@ -615,10 +639,10 @@ int tcf_action_copy_stats(struct sk_buff *skb, struct tc_action *a, if (err < 0) goto errout; - if (gnet_stats_copy_basic(&d, NULL, &p->tcfc_bstats) < 0 || + if (gnet_stats_copy_basic(&d, p->cpu_bstats, &p->tcfc_bstats) < 0 || gnet_stats_copy_rate_est(&d, &p->tcfc_bstats, &p->tcfc_rate_est) < 0 || - gnet_stats_copy_queue(&d, NULL, + gnet_stats_copy_queue(&d, p->cpu_qstats, &p->tcfc_qstats, p->tcfc_qstats.qlen) < 0) goto errout; diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c index 1d56903fd4c7..99aa271633e9 100644 --- a/net/sched/act_bpf.c +++ b/net/sched/act_bpf.c @@ -281,7 +281,7 @@ static int tcf_bpf_init(struct net *net, struct nlattr *nla, if (!tcf_hash_check(parm->index, act, bind)) { ret = tcf_hash_create(parm->index, est, act, - sizeof(*prog), bind); + sizeof(*prog), bind, false); if (ret < 0) goto destroy_fp; diff --git a/net/sched/act_connmark.c b/net/sched/act_connmark.c index 295d14bd6c67..f2b540220ad0 100644 --- a/net/sched/act_connmark.c +++ b/net/sched/act_connmark.c @@ -108,7 +108,8 @@ static int tcf_connmark_init(struct net *net, struct nlattr *nla, parm = nla_data(tb[TCA_CONNMARK_PARMS]); if (!tcf_hash_check(parm->index, a, bind)) { - ret = tcf_hash_create(parm->index, est, a, sizeof(*ci), bind); + ret = tcf_hash_create(parm->index, est, a, sizeof(*ci), + bind, false); if (ret) return ret; diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c index 4cd5cf1aedf8..b07c535ba8e7 100644 --- a/net/sched/act_csum.c +++ b/net/sched/act_csum.c @@ -62,7 +62,8 @@ static int tcf_csum_init(struct net *n, struct nlattr *nla, struct nlattr *est, parm = nla_data(tb[TCA_CSUM_PARMS]); if (!tcf_hash_check(parm->index, a, bind)) { - ret = tcf_hash_create(parm->index, est, a, sizeof(*p), bind); + ret = tcf_hash_create(parm->index, est, a, sizeof(*p), + bind, false); if (ret) return ret; ret = ACT_P_CREATED; diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c index 7fffc2272701..a4f8af29ee30 100644 --- a/net/sched/act_gact.c +++ b/net/sched/act_gact.c @@ -85,7 +85,8 @@ static int tcf_gact_init(struct net *net, struct nlattr *nla, #endif if (!tcf_hash_check(parm->index, a, bind)) { - ret = tcf_hash_create(parm->index, est, a, sizeof(*gact), bind); + ret = tcf_hash_create(parm->index, est, a, sizeof(*gact), + bind, false); if (ret) return ret; ret = ACT_P_CREATED; diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c index cbc8dd7dd48a..99c9cc1c7af9 100644 --- a/net/sched/act_ipt.c +++ b/net/sched/act_ipt.c @@ -114,7 +114,7 @@ static int tcf_ipt_init(struct net *net, struct nlattr *nla, struct nlattr *est, index = nla_get_u32(tb[TCA_IPT_INDEX]); if (!tcf_hash_check(index, a, bind) ) { - ret = tcf_hash_create(index, est, a, sizeof(*ipt), bind); + ret = tcf_hash_create(index, est, a, sizeof(*ipt), bind, false); if (ret) return ret; ret = ACT_P_CREATED; diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c index a42a3b257226..002cd6c83dc6 100644 --- a/net/sched/act_mirred.c +++ b/net/sched/act_mirred.c @@ -93,7 +93,8 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla, if (!tcf_hash_check(parm->index, a, bind)) { if (dev == NULL) return -EINVAL; - ret = tcf_hash_create(parm->index, est, a, sizeof(*m), bind); + ret = tcf_hash_create(parm->index, est, a, sizeof(*m), + bind, false); if (ret) return ret; ret = ACT_P_CREATED; diff --git a/net/sched/act_nat.c b/net/sched/act_nat.c index 270a030d5fd0..5be0b3c1c5b0 100644 --- a/net/sched/act_nat.c +++ b/net/sched/act_nat.c @@ -55,7 +55,8 @@ static int tcf_nat_init(struct net *net, struct nlattr *nla, struct nlattr *est, parm = nla_data(tb[TCA_NAT_PARMS]); if (!tcf_hash_check(parm->index, a, bind)) { - ret = tcf_hash_create(parm->index, est, a, sizeof(*p), bind); + ret = tcf_hash_create(parm->index, est, a, sizeof(*p), + bind, false); if (ret) return ret; ret = ACT_P_CREATED; diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c index 17e6d6669c7f..ce8676ad892f 100644 --- a/net/sched/act_pedit.c +++ b/net/sched/act_pedit.c @@ -57,7 +57,8 @@ static int tcf_pedit_init(struct net *net, struct nlattr *nla, if (!tcf_hash_check(parm->index, a, bind)) { if (!parm->nkeys) return -EINVAL; - ret = tcf_hash_create(parm->index, est, a, sizeof(*p), bind); + ret = tcf_hash_create(parm->index, est, a, sizeof(*p), + bind, false); if (ret) return ret; p = to_pedit(a); diff --git a/net/sched/act_simple.c b/net/sched/act_simple.c index 6a8d9488613a..d6b708d6afdf 100644 --- a/net/sched/act_simple.c +++ b/net/sched/act_simple.c @@ -103,7 +103,8 @@ static int tcf_simp_init(struct net *net, struct nlattr *nla, defdata = nla_data(tb[TCA_DEF_DATA]); if (!tcf_hash_check(parm->index, a, bind)) { - ret = tcf_hash_create(parm->index, est, a, sizeof(*d), bind); + ret = tcf_hash_create(parm->index, est, a, sizeof(*d), + bind, false); if (ret) return ret; diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c index fcfeeaf838be..6751b5f8c046 100644 --- a/net/sched/act_skbedit.c +++ b/net/sched/act_skbedit.c @@ -99,7 +99,8 @@ static int tcf_skbedit_init(struct net *net, struct nlattr *nla, parm = nla_data(tb[TCA_SKBEDIT_PARMS]); if (!tcf_hash_check(parm->index, a, bind)) { - ret = tcf_hash_create(parm->index, est, a, sizeof(*d), bind); + ret = tcf_hash_create(parm->index, est, a, sizeof(*d), + bind, false); if (ret) return ret; diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c index d735ecf0b1a7..796785e0bf96 100644 --- a/net/sched/act_vlan.c +++ b/net/sched/act_vlan.c @@ -116,7 +116,8 @@ static int tcf_vlan_init(struct net *net, struct nlattr *nla, action = parm->v_action; if (!tcf_hash_check(parm->index, a, bind)) { - ret = tcf_hash_create(parm->index, est, a, sizeof(*v), bind); + ret = tcf_hash_create(parm->index, est, a, sizeof(*v), + bind, false); if (ret) return ret; -- 2.4.3.573.g4eafbef ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v2 net-next 3/7] net_sched: act_gact: make tcfg_pval non zero 2015-07-06 6:41 [PATCH v2 net-next 0/7] net_sched: act: lockless operation Eric Dumazet 2015-07-06 6:41 ` [PATCH v2 net-next 1/7] net: sched: extend percpu stats helpers Eric Dumazet 2015-07-06 6:41 ` [PATCH v2 net-next 2/7] net: sched: add percpu stats to actions Eric Dumazet @ 2015-07-06 6:41 ` Eric Dumazet 2015-07-06 6:41 ` [PATCH v2 net-next 3/7] net_sched: act: " Eric Dumazet ` (3 subsequent siblings) 6 siblings, 0 replies; 9+ messages in thread From: Eric Dumazet @ 2015-07-06 6:41 UTC (permalink / raw) To: David S. Miller Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, John Fastabend, Eric Dumazet, Eric Dumazet First step for gact RCU operation : Instead of testing if tcfg_pval is zero or not, just make it 1. No change in behavior, but slightly faster code. The smp_rmb()/smp_wmb() barriers, while not strictly needed at this stage are added for upcoming spinlock removal. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> --- net/sched/act_gact.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c index a4f8af29ee30..22a3a61aa090 100644 --- a/net/sched/act_gact.c +++ b/net/sched/act_gact.c @@ -28,14 +28,16 @@ #ifdef CONFIG_GACT_PROB static int gact_net_rand(struct tcf_gact *gact) { - if (!gact->tcfg_pval || prandom_u32() % gact->tcfg_pval) + smp_rmb(); /* coupled with smp_wmb() in tcf_gact_init() */ + if (prandom_u32() % gact->tcfg_pval) return gact->tcf_action; return gact->tcfg_paction; } static int gact_determ(struct tcf_gact *gact) { - if (!gact->tcfg_pval || gact->tcf_bstats.packets % gact->tcfg_pval) + smp_rmb(); /* coupled with smp_wmb() in tcf_gact_init() */ + if (gact->tcf_bstats.packets % gact->tcfg_pval) return gact->tcf_action; return gact->tcfg_paction; } @@ -105,7 +107,11 @@ static int tcf_gact_init(struct net *net, struct nlattr *nla, #ifdef CONFIG_GACT_PROB if (p_parm) { gact->tcfg_paction = p_parm->paction; - gact->tcfg_pval = p_parm->pval; + gact->tcfg_pval = max_t(u16, 1, p_parm->pval); + /* Make sure tcfg_pval is written before tcfg_ptype + * coupled with smp_rmb() in gact_net_rand() & gact_determ() + */ + smp_wmb(); gact->tcfg_ptype = p_parm->ptype; } #endif -- 2.4.3.573.g4eafbef ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v2 net-next 3/7] net_sched: act: make tcfg_pval non zero 2015-07-06 6:41 [PATCH v2 net-next 0/7] net_sched: act: lockless operation Eric Dumazet ` (2 preceding siblings ...) 2015-07-06 6:41 ` [PATCH v2 net-next 3/7] net_sched: act_gact: make tcfg_pval non zero Eric Dumazet @ 2015-07-06 6:41 ` Eric Dumazet 2015-07-06 6:41 ` [PATCH v2 net-next 4/7] net_sched: act_gact: use a separate packet counters for gact_determ() Eric Dumazet ` (2 subsequent siblings) 6 siblings, 0 replies; 9+ messages in thread From: Eric Dumazet @ 2015-07-06 6:41 UTC (permalink / raw) To: David S. Miller Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, John Fastabend, Eric Dumazet, Eric Dumazet First step for gact RCU operation : Instead of testing if tcfg_pval is zero or not, just make it 1. No change in behavior, but slightly faster code. The smp_rmb()/smp_wmb() barriers, while not strictly needed at this stage are added for upcoming spinlock removal. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> --- net/sched/act_gact.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c index a4f8af29ee30..22a3a61aa090 100644 --- a/net/sched/act_gact.c +++ b/net/sched/act_gact.c @@ -28,14 +28,16 @@ #ifdef CONFIG_GACT_PROB static int gact_net_rand(struct tcf_gact *gact) { - if (!gact->tcfg_pval || prandom_u32() % gact->tcfg_pval) + smp_rmb(); /* coupled with smp_wmb() in tcf_gact_init() */ + if (prandom_u32() % gact->tcfg_pval) return gact->tcf_action; return gact->tcfg_paction; } static int gact_determ(struct tcf_gact *gact) { - if (!gact->tcfg_pval || gact->tcf_bstats.packets % gact->tcfg_pval) + smp_rmb(); /* coupled with smp_wmb() in tcf_gact_init() */ + if (gact->tcf_bstats.packets % gact->tcfg_pval) return gact->tcf_action; return gact->tcfg_paction; } @@ -105,7 +107,11 @@ static int tcf_gact_init(struct net *net, struct nlattr *nla, #ifdef CONFIG_GACT_PROB if (p_parm) { gact->tcfg_paction = p_parm->paction; - gact->tcfg_pval = p_parm->pval; + gact->tcfg_pval = max_t(u16, 1, p_parm->pval); + /* Make sure tcfg_pval is written before tcfg_ptype + * coupled with smp_rmb() in gact_net_rand() & gact_determ() + */ + smp_wmb(); gact->tcfg_ptype = p_parm->ptype; } #endif -- 2.4.3.573.g4eafbef ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v2 net-next 4/7] net_sched: act_gact: use a separate packet counters for gact_determ() 2015-07-06 6:41 [PATCH v2 net-next 0/7] net_sched: act: lockless operation Eric Dumazet ` (3 preceding siblings ...) 2015-07-06 6:41 ` [PATCH v2 net-next 3/7] net_sched: act: " Eric Dumazet @ 2015-07-06 6:41 ` Eric Dumazet 2015-07-06 6:49 ` [PATCH v2 net-next 0/7] net_sched: act: lockless operation Eric Dumazet 2015-07-06 12:12 ` Jamal Hadi Salim 6 siblings, 0 replies; 9+ messages in thread From: Eric Dumazet @ 2015-07-06 6:41 UTC (permalink / raw) To: David S. Miller Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, John Fastabend, Eric Dumazet, Eric Dumazet Second step for gact RCU operation : We want to get rid of the spinlock protecting gact operations. Stats (packets/bytes) will soon be per cpu. gact_determ() would not work without a central packet counter, so lets add it for this mode. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> --- include/net/tc_act/tc_gact.h | 7 ++++--- net/sched/act_gact.c | 4 +++- 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/include/net/tc_act/tc_gact.h b/include/net/tc_act/tc_gact.h index 9fc9b578908a..592a6bc02b0b 100644 --- a/include/net/tc_act/tc_gact.h +++ b/include/net/tc_act/tc_gact.h @@ -6,9 +6,10 @@ struct tcf_gact { struct tcf_common common; #ifdef CONFIG_GACT_PROB - u16 tcfg_ptype; - u16 tcfg_pval; - int tcfg_paction; + u16 tcfg_ptype; + u16 tcfg_pval; + int tcfg_paction; + atomic_t packets; #endif }; #define to_gact(a) \ diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c index 22a3a61aa090..2f9bec584b3f 100644 --- a/net/sched/act_gact.c +++ b/net/sched/act_gact.c @@ -36,8 +36,10 @@ static int gact_net_rand(struct tcf_gact *gact) static int gact_determ(struct tcf_gact *gact) { + u32 pack = atomic_inc_return(&gact->packets); + smp_rmb(); /* coupled with smp_wmb() in tcf_gact_init() */ - if (gact->tcf_bstats.packets % gact->tcfg_pval) + if (pack % gact->tcfg_pval) return gact->tcf_action; return gact->tcfg_paction; } -- 2.4.3.573.g4eafbef ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v2 net-next 0/7] net_sched: act: lockless operation 2015-07-06 6:41 [PATCH v2 net-next 0/7] net_sched: act: lockless operation Eric Dumazet ` (4 preceding siblings ...) 2015-07-06 6:41 ` [PATCH v2 net-next 4/7] net_sched: act_gact: use a separate packet counters for gact_determ() Eric Dumazet @ 2015-07-06 6:49 ` Eric Dumazet 2015-07-06 12:12 ` Jamal Hadi Salim 6 siblings, 0 replies; 9+ messages in thread From: Eric Dumazet @ 2015-07-06 6:49 UTC (permalink / raw) To: David S. Miller Cc: netdev, Alexei Starovoitov, Jamal Hadi Salim, John Fastabend, Eric Dumazet, Eric Dumazet On Mon, Jul 6, 2015 at 8:41 AM, Eric Dumazet <edumazet@google.com> wrote: > As mentioned by Alexei last week in Budapest, it is a bit weird > to take a spinlock in order to drop a packet in a tc filter... > Arg, please ignore v2, I messed my git send-email command ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 net-next 0/7] net_sched: act: lockless operation 2015-07-06 6:41 [PATCH v2 net-next 0/7] net_sched: act: lockless operation Eric Dumazet ` (5 preceding siblings ...) 2015-07-06 6:49 ` [PATCH v2 net-next 0/7] net_sched: act: lockless operation Eric Dumazet @ 2015-07-06 12:12 ` Jamal Hadi Salim 2015-07-06 12:17 ` Eric Dumazet 6 siblings, 1 reply; 9+ messages in thread From: Jamal Hadi Salim @ 2015-07-06 12:12 UTC (permalink / raw) To: Eric Dumazet, David S. Miller Cc: netdev, Alexei Starovoitov, John Fastabend, Eric Dumazet On 07/06/15 02:41, Eric Dumazet wrote: > As mentioned by Alexei last week in Budapest, it is a bit weird > to take a spinlock in order to drop a packet in a tc filter... > > Lets add percpu infra for tc actions and use it for gact & mirred. > > Before changes, my host with 8 RX queues was handling 5 Mpps with gact, > and more than 11 Mpps after. > > Mirred change is not yet visible if ifb+qdisc is used, as ifb is > not yet multi queue enabled, but is a step forward. > on mirred: The main thing that is read-write is the stats. lastuse timestamp as well, but that one doesnt have to be accurate (used mainly to help decide when to flush out policies). so the lock around all that code is not necessary. BTW: I was wondering based on what you said earlier on false sharing if tcf_common can be re-arranged to avoid that? For mirred, you may have to add for overlimit stat in the common structure (we should really increment drop stats when we return TC_ACT_SHOT). Having said that: we want to maintain the fact that an action instance (based on index) can be used across different policies. i dont see any issue with the way you are proceeding for that to continue to work. i.e i can say something like: - create an action to redirect or mirror to port blah index 10 - create filter one on eth1 with 1 of the actions being mirred index 10 - create filter two on eth0 with 1 of the actions being mirred index 10 On IFB: we use tasklets originally to avoid re-ordering. I think tasklets would still be useful to keep, with each tied to a queue? cheers, jamal ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 net-next 0/7] net_sched: act: lockless operation 2015-07-06 12:12 ` Jamal Hadi Salim @ 2015-07-06 12:17 ` Eric Dumazet 0 siblings, 0 replies; 9+ messages in thread From: Eric Dumazet @ 2015-07-06 12:17 UTC (permalink / raw) To: Jamal Hadi Salim Cc: David S. Miller, netdev, Alexei Starovoitov, John Fastabend, Eric Dumazet On Mon, Jul 6, 2015 at 2:12 PM, Jamal Hadi Salim <jhs@mojatatu.com> wrote: > on mirred: > The main thing that is read-write is the stats. lastuse timestamp > as well, but that one doesnt have to be accurate (used mainly > to help decide when to flush out policies). so the lock around > all that code is not necessary. > > BTW: I was wondering based on what you said earlier on false sharing > if tcf_common can be re-arranged to avoid that? Not really : You would still have a stall on lastuse ultimately, even if it sits in a cache line of its own. I mentioned that it shared a cache line to explain with "perf" was showing the stall on reading another field. > > For mirred, you may have to add for overlimit stat in the common > structure (we should really increment drop stats when we return > TC_ACT_SHOT). > > Having said that: we want to maintain the fact that an action > instance (based on index) can be used across different policies. > i dont see any issue with the way you are proceeding for that > to continue to work. i.e i can say something like: > - create an action to redirect or mirror to port blah index 10 > - create filter one on eth1 with 1 of the actions being mirred index 10 > - create filter two on eth0 with 1 of the actions being mirred index 10 I do not see any issues either. > > On IFB: > we use tasklets originally to avoid re-ordering. I think tasklets would > still be useful to keep, with each tied to a queue? Sure. We do not care about reorders if packets were delivered to different RX queues. ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2015-07-06 12:17 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-07-06 6:41 [PATCH v2 net-next 0/7] net_sched: act: lockless operation Eric Dumazet 2015-07-06 6:41 ` [PATCH v2 net-next 1/7] net: sched: extend percpu stats helpers Eric Dumazet 2015-07-06 6:41 ` [PATCH v2 net-next 2/7] net: sched: add percpu stats to actions Eric Dumazet 2015-07-06 6:41 ` [PATCH v2 net-next 3/7] net_sched: act_gact: make tcfg_pval non zero Eric Dumazet 2015-07-06 6:41 ` [PATCH v2 net-next 3/7] net_sched: act: " Eric Dumazet 2015-07-06 6:41 ` [PATCH v2 net-next 4/7] net_sched: act_gact: use a separate packet counters for gact_determ() Eric Dumazet 2015-07-06 6:49 ` [PATCH v2 net-next 0/7] net_sched: act: lockless operation Eric Dumazet 2015-07-06 12:12 ` Jamal Hadi Salim 2015-07-06 12:17 ` Eric Dumazet
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).