netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] pkt_sched: fq: change classification of control packets
@ 2013-11-14 16:50 Eric Dumazet
  2013-11-14 18:58 ` [PATCH] pkt_sched: fq: fix pacing for small frames Eric Dumazet
  2013-11-14 22:16 ` [PATCH] pkt_sched: fq: change classification of control packets David Miller
  0 siblings, 2 replies; 5+ messages in thread
From: Eric Dumazet @ 2013-11-14 16:50 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Maciej Żenczykowski, Willem de Bruijn, Yuchung Cheng,
	Stephen Hemminger

From: Maciej Żenczykowski <maze@google.com>

Initial sch_fq implementation copied code from pfifo_fast to classify
a packet as a high prio packet.

This clashes with setups using PRIO with say 7 bands, as one of the
band could be incorrectly (mis)classified by FQ.

Packets would be queued in the 'internal' queue, and no pacing ever
happen for this special queue.

Fixes: afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler")
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
---
 net/sched/sch_fq.c |    8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index fdc041c..d4fa38e 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -209,21 +209,15 @@ static void fq_gc(struct fq_sched_data *q,
 	}
 }
 
-static const u8 prio2band[TC_PRIO_MAX + 1] = {
-	1, 2, 2, 2, 1, 2, 0, 0 , 1, 1, 1, 1, 1, 1, 1, 1
-};
-
 static struct fq_flow *fq_classify(struct sk_buff *skb, struct fq_sched_data *q)
 {
 	struct rb_node **p, *parent;
 	struct sock *sk = skb->sk;
 	struct rb_root *root;
 	struct fq_flow *f;
-	int band;
 
 	/* warning: no starvation prevention... */
-	band = prio2band[skb->priority & TC_PRIO_MAX];
-	if (unlikely(band == 0))
+	if (unlikely((skb->priority & TC_PRIO_MAX) == TC_PRIO_CONTROL))
 		return &q->internal;
 
 	if (unlikely(!sk)) {

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH] pkt_sched: fq: fix pacing for small frames
  2013-11-14 16:50 [PATCH] pkt_sched: fq: change classification of control packets Eric Dumazet
@ 2013-11-14 18:58 ` Eric Dumazet
  2013-11-14 22:22   ` David Miller
  2013-11-14 22:16 ` [PATCH] pkt_sched: fq: change classification of control packets David Miller
  1 sibling, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2013-11-14 18:58 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Maciej Żenczykowski, Willem de Bruijn, Yuchung Cheng,
	Neal Cardwell

From: Eric Dumazet <edumazet@google.com>

For performance reasons, sch_fq tried hard to not setup timers for every
sent packet, using a quantum based heuristic : A delay is setup only if
the flow exhausted its credit.

Problem is that application limited flows can refill their credit
for every queued packet, and they can evade pacing.

This problem can also be triggered when TCP flows use small MSS values,
as TSO auto sizing builds packets that are smaller than the default fq
quantum (3028 bytes) 

This patch adds a 40 ms delay to guard flow credit refill.

We'll send a patch adding ability to tune this threshold for linux-3.14,
but this value should be good enough for most workloads.

Fixes: afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
---
 net/sched/sch_fq.c |   28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index fdc041c57853..46a0cacd7280 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -88,7 +88,7 @@ struct fq_sched_data {
 	struct fq_flow	internal;	/* for non classified or high prio packets */
 	u32		quantum;
 	u32		initial_quantum;
-	u32		flow_default_rate;/* rate per flow : bytes per second */
+	u32		flow_refill_delay;
 	u32		flow_max_rate;	/* optional max rate per flow */
 	u32		flow_plimit;	/* max packets per flow */
 	struct rb_root	*fq_root;
@@ -115,6 +115,7 @@ static struct fq_flow detached, throttled;
 static void fq_flow_set_detached(struct fq_flow *f)
 {
 	f->next = &detached;
+	f->age = jiffies;
 }
 
 static bool fq_flow_is_detached(const struct fq_flow *f)
@@ -160,8 +161,12 @@ static void fq_flow_add_tail(struct fq_flow_head *head, struct fq_flow *flow)
 }
 
 /* limit number of collected flows per round */
-#define FQ_GC_MAX 8
-#define FQ_GC_AGE (3*HZ)
+#define FQ_GC_MAX		8
+
+#define FQ_GC_AGE		(3*HZ)
+
+/* Flow credit is refilled after 40ms idle time */
+#define FQ_REFILL_DELAY		msecs_to_jiffies(40)
 
 static bool fq_gc_candidate(const struct fq_flow *f)
 {
@@ -373,17 +378,22 @@ static int fq_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 	}
 
 	f->qlen++;
-	flow_queue_add(f, skb);
 	if (skb_is_retransmit(skb))
 		q->stat_tcp_retrans++;
 	sch->qstats.backlog += qdisc_pkt_len(skb);
 	if (fq_flow_is_detached(f)) {
 		fq_flow_add_tail(&q->new_flows, f);
-		if (q->quantum > f->credit)
-			f->credit = q->quantum;
+
+		if (time_after(jiffies, f->age + q->flow_refill_delay))
+			f->credit = max_t(u32, f->credit, q->quantum);
+
 		q->inactive_flows--;
 		qdisc_unthrottled(sch);
 	}
+
+	/* Note: this overwrites f->age */
+	flow_queue_add(f, skb);
+
 	if (unlikely(f == &q->internal)) {
 		q->stat_internal_packets++;
 		qdisc_unthrottled(sch);
@@ -461,7 +471,6 @@ begin:
 			fq_flow_add_tail(&q->old_flows, f);
 		} else {
 			fq_flow_set_detached(f);
-			f->age = jiffies;
 			q->inactive_flows++;
 		}
 		goto begin;
@@ -655,9 +664,6 @@ static int fq_change(struct Qdisc *sch, struct nlattr *opt)
 	if (tb[TCA_FQ_INITIAL_QUANTUM])
 		q->initial_quantum = nla_get_u32(tb[TCA_FQ_INITIAL_QUANTUM]);
 
-	if (tb[TCA_FQ_FLOW_DEFAULT_RATE])
-		q->flow_default_rate = nla_get_u32(tb[TCA_FQ_FLOW_DEFAULT_RATE]);
-
 	if (tb[TCA_FQ_FLOW_MAX_RATE])
 		q->flow_max_rate = nla_get_u32(tb[TCA_FQ_FLOW_MAX_RATE]);
 
@@ -705,7 +711,7 @@ static int fq_init(struct Qdisc *sch, struct nlattr *opt)
 	q->flow_plimit		= 100;
 	q->quantum		= 2 * psched_mtu(qdisc_dev(sch));
 	q->initial_quantum	= 10 * psched_mtu(qdisc_dev(sch));
-	q->flow_default_rate	= 0;
+	q->flow_refill_delay	= FQ_REFILL_DELAY;
 	q->flow_max_rate	= ~0U;
 	q->rate_enable		= 1;
 	q->new_flows.first	= NULL;

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] pkt_sched: fq: change classification of control packets
  2013-11-14 16:50 [PATCH] pkt_sched: fq: change classification of control packets Eric Dumazet
  2013-11-14 18:58 ` [PATCH] pkt_sched: fq: fix pacing for small frames Eric Dumazet
@ 2013-11-14 22:16 ` David Miller
  1 sibling, 0 replies; 5+ messages in thread
From: David Miller @ 2013-11-14 22:16 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, maze, willemb, ycheng, stephen

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 14 Nov 2013 08:50:43 -0800

> From: Maciej Żenczykowski <maze@google.com>
> 
> Initial sch_fq implementation copied code from pfifo_fast to classify
> a packet as a high prio packet.
> 
> This clashes with setups using PRIO with say 7 bands, as one of the
> band could be incorrectly (mis)classified by FQ.
> 
> Packets would be queued in the 'internal' queue, and no pacing ever
> happen for this special queue.
> 
> Fixes: afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler")
> Signed-off-by: Maciej Żenczykowski <maze@google.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied and queued up for -stable, thanks Eric.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] pkt_sched: fq: fix pacing for small frames
  2013-11-14 18:58 ` [PATCH] pkt_sched: fq: fix pacing for small frames Eric Dumazet
@ 2013-11-14 22:22   ` David Miller
  2013-11-14 22:51     ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: David Miller @ 2013-11-14 22:22 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, maze, willemb, ycheng, ncardwell

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 14 Nov 2013 10:58:16 -0800

> @@ -655,9 +664,6 @@ static int fq_change(struct Qdisc *sch, struct nlattr *opt)
>  	if (tb[TCA_FQ_INITIAL_QUANTUM])
>  		q->initial_quantum = nla_get_u32(tb[TCA_FQ_INITIAL_QUANTUM]);
>  
> -	if (tb[TCA_FQ_FLOW_DEFAULT_RATE])
> -		q->flow_default_rate = nla_get_u32(tb[TCA_FQ_FLOW_DEFAULT_RATE]);
> -
>  	if (tb[TCA_FQ_FLOW_MAX_RATE])
>  		q->flow_max_rate = nla_get_u32(tb[TCA_FQ_FLOW_MAX_RATE]);
>  

I think it's at best confusing to suddenly stop ignoring a configuration
parameter the user is giving us.

Can you at least ratelimit warn if the parameter is specified so the user
has some chance to figure out what is happening?

Thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] pkt_sched: fq: fix pacing for small frames
  2013-11-14 22:22   ` David Miller
@ 2013-11-14 22:51     ` Eric Dumazet
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Dumazet @ 2013-11-14 22:51 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, maze, willemb, ycheng, ncardwell

On Thu, 2013-11-14 at 17:22 -0500, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Thu, 14 Nov 2013 10:58:16 -0800
> 
> > @@ -655,9 +664,6 @@ static int fq_change(struct Qdisc *sch, struct nlattr *opt)
> >  	if (tb[TCA_FQ_INITIAL_QUANTUM])
> >  		q->initial_quantum = nla_get_u32(tb[TCA_FQ_INITIAL_QUANTUM]);
> >  
> > -	if (tb[TCA_FQ_FLOW_DEFAULT_RATE])
> > -		q->flow_default_rate = nla_get_u32(tb[TCA_FQ_FLOW_DEFAULT_RATE]);
> > -
> >  	if (tb[TCA_FQ_FLOW_MAX_RATE])
> >  		q->flow_max_rate = nla_get_u32(tb[TCA_FQ_FLOW_MAX_RATE]);
> >  
> 
> I think it's at best confusing to suddenly stop ignoring a configuration
> parameter the user is giving us.
> 
> Can you at least ratelimit warn if the parameter is specified so the user
> has some chance to figure out what is happening?

Oh this parameter was removed in 7eec4174ff29cd
("pkt_sched: fq: fix non TCP flows pacing"), I probably should
have added this warning at that time...

OK, we can warn the user, will send a v2.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2013-11-14 22:52 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-11-14 16:50 [PATCH] pkt_sched: fq: change classification of control packets Eric Dumazet
2013-11-14 18:58 ` [PATCH] pkt_sched: fq: fix pacing for small frames Eric Dumazet
2013-11-14 22:22   ` David Miller
2013-11-14 22:51     ` Eric Dumazet
2013-11-14 22:16 ` [PATCH] pkt_sched: fq: change classification of control packets David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).