* [PATCH net-next v2] net: core: set and refine qdisc pkt len before tc_classify
@ 2015-05-13 20:50 Daniel Borkmann
2015-05-13 21:07 ` Alexei Starovoitov
2015-05-13 21:20 ` Eric Dumazet
0 siblings, 2 replies; 9+ messages in thread
From: Daniel Borkmann @ 2015-05-13 20:50 UTC (permalink / raw)
To: davem; +Cc: fw, ast, jhs, eric.dumazet, netdev, Daniel Borkmann
From: Florian Westphal <fw@strlen.de>
Commit d2788d34885d4ce5ba ("net: sched: further simplify handle_ing")
removed the call to qdisc_enqueue_root(). However, after this removal
we no longer set qdisc pkt length, which this patch fixes. We make use
of qdisc_pkt_len_init() from 1def9238d4aa ("net_sched: more precise
pkt_len computation"), which was suggested as the current approach so
far does not take into account when GRO builds up GSO packets.
Fixes: d2788d34885d ("net: sched: further simplify handle_ing")
Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
v1->v2:
- use qdisc_pkt_len_init() as suggested, thanks!
net/core/dev.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 15e51a9..6e19521 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2787,8 +2787,8 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
qdisc_skb_cb(skb)->pkt_len = skb->len;
- /* To get more precise estimation of bytes sent on wire,
- * we add to pkt_len the headers size of all segments
+ /* To get more precise estimation of bytes sent/received on
+ * wire, we add to pkt_len the headers size of all segments.
*/
if (shinfo->gso_size) {
unsigned int hdr_len;
@@ -3541,9 +3541,11 @@ static inline struct sk_buff *handle_ing(struct sk_buff *skb,
*pt_prev = NULL;
}
- qdisc_bstats_update_cpu(cl->q, skb);
+ qdisc_pkt_len_init(skb);
skb->tc_verd = SET_TC_AT(skb->tc_verd, AT_INGRESS);
+ qdisc_bstats_update_cpu(cl->q, skb);
+
switch (tc_classify(skb, cl, &cl_res)) {
case TC_ACT_OK:
case TC_ACT_RECLASSIFY:
--
1.9.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH net-next v2] net: core: set and refine qdisc pkt len before tc_classify
2015-05-13 20:50 [PATCH net-next v2] net: core: set and refine qdisc pkt len before tc_classify Daniel Borkmann
@ 2015-05-13 21:07 ` Alexei Starovoitov
2015-05-13 21:20 ` Eric Dumazet
1 sibling, 0 replies; 9+ messages in thread
From: Alexei Starovoitov @ 2015-05-13 21:07 UTC (permalink / raw)
To: Daniel Borkmann, davem; +Cc: fw, jhs, eric.dumazet, netdev
On 5/13/15 1:50 PM, Daniel Borkmann wrote:
> From: Florian Westphal <fw@strlen.de>
>
> Commit d2788d34885d4ce5ba ("net: sched: further simplify handle_ing")
> removed the call to qdisc_enqueue_root(). However, after this removal
> we no longer set qdisc pkt length, which this patch fixes. We make use
> of qdisc_pkt_len_init() from 1def9238d4aa ("net_sched: more precise
> pkt_len computation"), which was suggested as the current approach so
> far does not take into account when GRO builds up GSO packets.
>
> Fixes: d2788d34885d ("net: sched: further simplify handle_ing")
> Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Florian Westphal <fw@strlen.de>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> ---
Looks good as quick fix for now. Long term I still want to try to reuse
stats from underlying netdev ;)
For TX we don't have this option and have to do qdisc_pkt_len_init(),
but for RX we should be able to do something smarter.
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next v2] net: core: set and refine qdisc pkt len before tc_classify
2015-05-13 20:50 [PATCH net-next v2] net: core: set and refine qdisc pkt len before tc_classify Daniel Borkmann
2015-05-13 21:07 ` Alexei Starovoitov
@ 2015-05-13 21:20 ` Eric Dumazet
2015-05-13 21:33 ` Florian Westphal
` (2 more replies)
1 sibling, 3 replies; 9+ messages in thread
From: Eric Dumazet @ 2015-05-13 21:20 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: davem, fw, ast, jhs, netdev
On Wed, 2015-05-13 at 22:50 +0200, Daniel Borkmann wrote:
> From: Florian Westphal <fw@strlen.de>
>
> Commit d2788d34885d4ce5ba ("net: sched: further simplify handle_ing")
> removed the call to qdisc_enqueue_root(). However, after this removal
> we no longer set qdisc pkt length, which this patch fixes. We make use
> of qdisc_pkt_len_init() from 1def9238d4aa ("net_sched: more precise
> pkt_len computation"), which was suggested as the current approach so
> far does not take into account when GRO builds up GSO packets.
>
> Fixes: d2788d34885d ("net: sched: further simplify handle_ing")
> Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Florian Westphal <fw@strlen.de>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> ---
> v1->v2:
> - use qdisc_pkt_len_init() as suggested, thanks!
>
> net/core/dev.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 15e51a9..6e19521 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2787,8 +2787,8 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
>
> qdisc_skb_cb(skb)->pkt_len = skb->len;
>
> - /* To get more precise estimation of bytes sent on wire,
> - * we add to pkt_len the headers size of all segments
> + /* To get more precise estimation of bytes sent/received on
> + * wire, we add to pkt_len the headers size of all segments.
> */
> if (shinfo->gso_size) {
> unsigned int hdr_len;
> @@ -3541,9 +3541,11 @@ static inline struct sk_buff *handle_ing(struct sk_buff *skb,
> *pt_prev = NULL;
> }
>
> - qdisc_bstats_update_cpu(cl->q, skb);
> + qdisc_pkt_len_init(skb);
> skb->tc_verd = SET_TC_AT(skb->tc_verd, AT_INGRESS);
>
> + qdisc_bstats_update_cpu(cl->q, skb);
> +
> switch (tc_classify(skb, cl, &cl_res)) {
> case TC_ACT_OK:
> case TC_ACT_RECLASSIFY:
Is qdisc_pkt_len_init() still inlined, now it has 2 callers ?
People using policers on ingress are very often disabling GRO for
various historical reasons.
Since handle_ing() is very uncommon, I would prefer not slowing down tx
just so that GRO ~5% error in ingress is fixed.
I noticed you did not CC me, don't be afraid to include me next time one
of my commit is referred in a changelog ;)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next v2] net: core: set and refine qdisc pkt len before tc_classify
2015-05-13 21:20 ` Eric Dumazet
@ 2015-05-13 21:33 ` Florian Westphal
2015-05-13 21:38 ` Eric Dumazet
2015-05-13 21:36 ` Daniel Borkmann
2015-05-13 21:42 ` Eric Dumazet
2 siblings, 1 reply; 9+ messages in thread
From: Florian Westphal @ 2015-05-13 21:33 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Daniel Borkmann, davem, fw, ast, jhs, netdev
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > Commit d2788d34885d4ce5ba ("net: sched: further simplify handle_ing")
> > removed the call to qdisc_enqueue_root(). However, after this removal
> > we no longer set qdisc pkt length, which this patch fixes. We make use
> > of qdisc_pkt_len_init() from 1def9238d4aa ("net_sched: more precise
> > pkt_len computation"), which was suggested as the current approach so
> > far does not take into account when GRO builds up GSO packets.
> >
> > Fixes: d2788d34885d ("net: sched: further simplify handle_ing")
> > Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
> > Signed-off-by: Florian Westphal <fw@strlen.de>
> > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> > ---
> > v1->v2:
> > - use qdisc_pkt_len_init() as suggested, thanks!
> >
> > @@ -3541,9 +3541,11 @@ static inline struct sk_buff *handle_ing(struct sk_buff *skb,
> > *pt_prev = NULL;
> > }
> >
> > - qdisc_bstats_update_cpu(cl->q, skb);
> > + qdisc_pkt_len_init(skb);
[..]
>
> Is qdisc_pkt_len_init() still inlined, now it has 2 callers ?
gcc 4.8.4, OPTIMIZE_FOR_SIZE=n: nope, callq.
> People using policers on ingress are very often disabling GRO for
> various historical reasons.
right, ingress policing is rather poor with GRO (or you need to set mtu 64000
on the tc police cmd line).
> Since handle_ing() is very uncommon, I would prefer not slowing down tx
> just so that GRO ~5% error in ingress is fixed.
Hmm, so whats you suggestion? Just use
qdisc_skb_cb(skb)->pkt_len = skb->len;
like in v1 of the patch set?
resurrect qdisc_enqueue_root?
set always_inline (i'd prefer not, its not very small)?
Any other idea?
Thanks!
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next v2] net: core: set and refine qdisc pkt len before tc_classify
2015-05-13 21:20 ` Eric Dumazet
2015-05-13 21:33 ` Florian Westphal
@ 2015-05-13 21:36 ` Daniel Borkmann
2015-05-13 21:42 ` Eric Dumazet
2 siblings, 0 replies; 9+ messages in thread
From: Daniel Borkmann @ 2015-05-13 21:36 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem, fw, ast, jhs, netdev
On 05/13/2015 11:20 PM, Eric Dumazet wrote:
...
> Is qdisc_pkt_len_init() still inlined, now it has 2 callers ?
>
> People using policers on ingress are very often disabling GRO for
> various historical reasons.
Yes, tested policer and for getting it working on my side, GRO needs
to be off.
> Since handle_ing() is very uncommon, I would prefer not slowing down tx
> just so that GRO ~5% error in ingress is fixed.
I see a callq dc0 <qdisc_pkt_len_init> now. Ok, in that case Florian's
original version is fine:
http://patchwork.ozlabs.org/patch/471346/
> I noticed you did not CC me, don't be afraid to include me next time one
> of my commit is referred in a changelog ;)
Hmm, I did Cc you. ;)
To: davem@davemloft.net
Cc: fw@strlen.de, ast@plumgrid.com, jhs@mojatatu.com,
eric.dumazet@gmail.com, netdev@vger.kernel.org,
daniel@iogearbox.net
Thanks,
Daniel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next v2] net: core: set and refine qdisc pkt len before tc_classify
2015-05-13 21:33 ` Florian Westphal
@ 2015-05-13 21:38 ` Eric Dumazet
2015-05-13 22:27 ` Alexei Starovoitov
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2015-05-13 21:38 UTC (permalink / raw)
To: Florian Westphal; +Cc: Daniel Borkmann, davem, ast, jhs, netdev
On Wed, 2015-05-13 at 23:33 +0200, Florian Westphal wrote:
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > > Commit d2788d34885d4ce5ba ("net: sched: further simplify handle_ing")
> > > removed the call to qdisc_enqueue_root(). However, after this removal
> > > we no longer set qdisc pkt length, which this patch fixes. We make use
> > > of qdisc_pkt_len_init() from 1def9238d4aa ("net_sched: more precise
> > > pkt_len computation"), which was suggested as the current approach so
> > > far does not take into account when GRO builds up GSO packets.
> > >
> > > Fixes: d2788d34885d ("net: sched: further simplify handle_ing")
> > > Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
> > > Signed-off-by: Florian Westphal <fw@strlen.de>
> > > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> > > ---
> > > v1->v2:
> > > - use qdisc_pkt_len_init() as suggested, thanks!
> > >
> > > @@ -3541,9 +3541,11 @@ static inline struct sk_buff *handle_ing(struct sk_buff *skb,
> > > *pt_prev = NULL;
> > > }
> > >
> > > - qdisc_bstats_update_cpu(cl->q, skb);
> > > + qdisc_pkt_len_init(skb);
> [..]
> >
> > Is qdisc_pkt_len_init() still inlined, now it has 2 callers ?
>
> gcc 4.8.4, OPTIMIZE_FOR_SIZE=n: nope, callq.
>
> > People using policers on ingress are very often disabling GRO for
> > various historical reasons.
>
> right, ingress policing is rather poor with GRO (or you need to set mtu 64000
> on the tc police cmd line).
>
> > Since handle_ing() is very uncommon, I would prefer not slowing down tx
> > just so that GRO ~5% error in ingress is fixed.
>
> Hmm, so whats you suggestion? Just use
>
> qdisc_skb_cb(skb)->pkt_len = skb->len;
>
> like in v1 of the patch set?
> resurrect qdisc_enqueue_root?
> set always_inline (i'd prefer not, its not very small)?
>
> Any other idea?
One possibility would be to set it in GRO engine.
But then we would have to set qdisc_skb_cb(skb)->pkt_len = skb->len for
non GRO packets.
Also note the DODGY thing might be not needed for ingress,
and I am not sure we can currently deliver UDP aggregated packets.
Hmm....
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next v2] net: core: set and refine qdisc pkt len before tc_classify
2015-05-13 21:20 ` Eric Dumazet
2015-05-13 21:33 ` Florian Westphal
2015-05-13 21:36 ` Daniel Borkmann
@ 2015-05-13 21:42 ` Eric Dumazet
2 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2015-05-13 21:42 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: davem, fw, ast, jhs, netdev
On Wed, 2015-05-13 at 14:20 -0700, Eric Dumazet wrote:
>
> I noticed you did not CC me, don't be afraid to include me next time one
> of my commit is referred in a changelog ;)
Sorry, I was wrong on this ;)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next v2] net: core: set and refine qdisc pkt len before tc_classify
2015-05-13 21:38 ` Eric Dumazet
@ 2015-05-13 22:27 ` Alexei Starovoitov
2015-05-13 22:43 ` Eric Dumazet
0 siblings, 1 reply; 9+ messages in thread
From: Alexei Starovoitov @ 2015-05-13 22:27 UTC (permalink / raw)
To: Eric Dumazet, Florian Westphal; +Cc: Daniel Borkmann, davem, jhs, netdev
On 5/13/15 2:38 PM, Eric Dumazet wrote:
>
> One possibility would be to set it in GRO engine.
>
> But then we would have to set qdisc_skb_cb(skb)->pkt_len = skb->len for
> non GRO packets.
>
> Also note the DODGY thing might be not needed for ingress,
> and I am not sure we can currently deliver UDP aggregated packets.
yeah, current qdisc_pkt_len_init() is certainly TX oriented.
I'm not sure we can use tcp_hdrlen(skb) on RX which is called as
part of qdisc_pkt_len_init().
I think it's probably better to take Florian's V1 with:
qdisc_skb_cb(skb)->pkt_len = skb->len;
to fix immediate breakage of act police and then proceed further?
Like doing qdisc_skb_cb(skb)->pkt_len = skb->len inside GRO engine
is probably not possible, since cb is used by napi_gro_cb ?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next v2] net: core: set and refine qdisc pkt len before tc_classify
2015-05-13 22:27 ` Alexei Starovoitov
@ 2015-05-13 22:43 ` Eric Dumazet
0 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2015-05-13 22:43 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: Florian Westphal, Daniel Borkmann, davem, jhs, netdev
On Wed, 2015-05-13 at 15:27 -0700, Alexei Starovoitov wrote:
> On 5/13/15 2:38 PM, Eric Dumazet wrote:
> >
> > One possibility would be to set it in GRO engine.
> >
> > But then we would have to set qdisc_skb_cb(skb)->pkt_len = skb->len for
> > non GRO packets.
> >
> > Also note the DODGY thing might be not needed for ingress,
> > and I am not sure we can currently deliver UDP aggregated packets.
>
> yeah, current qdisc_pkt_len_init() is certainly TX oriented.
> I'm not sure we can use tcp_hdrlen(skb) on RX which is called as
> part of qdisc_pkt_len_init().
We can, because GRO leave transport header set at the right place.
> I think it's probably better to take Florian's V1 with:
> qdisc_skb_cb(skb)->pkt_len = skb->len;
> to fix immediate breakage of act police and then proceed further?
Yes.
> Like doing qdisc_skb_cb(skb)->pkt_len = skb->len inside GRO engine
> is probably not possible, since cb is used by napi_gro_cb ?
Everything is possible ;)
At the time tcp_gro_complete() is called, we certainly can overwrite
NAPI_GRO_CB(skb)->frag0, with good documentation and BUILD_BUG_ON()
safet nets.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2015-05-13 22:43 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-13 20:50 [PATCH net-next v2] net: core: set and refine qdisc pkt len before tc_classify Daniel Borkmann
2015-05-13 21:07 ` Alexei Starovoitov
2015-05-13 21:20 ` Eric Dumazet
2015-05-13 21:33 ` Florian Westphal
2015-05-13 21:38 ` Eric Dumazet
2015-05-13 22:27 ` Alexei Starovoitov
2015-05-13 22:43 ` Eric Dumazet
2015-05-13 21:36 ` Daniel Borkmann
2015-05-13 21:42 ` Eric Dumazet
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).