netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v1 1/4] etf: Cancel timer if there are no pending skbs
@ 2018-11-15  1:26 Vinicius Costa Gomes
  2018-11-15  1:26 ` [PATCH net-next v1 2/4] etf: Use cached rb_root Vinicius Costa Gomes
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Vinicius Costa Gomes @ 2018-11-15  1:26 UTC (permalink / raw)
  To: netdev
  Cc: Jesus Sanchez-Palencia, jhs, xiyou.wangcong, jiri,
	jesus.s.palencia, ilias.apalodimas

From: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>

There is no point in firing the qdisc watchdog if there are no future
skbs pending in the queue and the watchdog had been set previously.

Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com>
---
 net/sched/sch_etf.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/sched/sch_etf.c b/net/sched/sch_etf.c
index 1538d6fa8165..fa85b24ac794 100644
--- a/net/sched/sch_etf.c
+++ b/net/sched/sch_etf.c
@@ -117,8 +117,10 @@ static void reset_watchdog(struct Qdisc *sch)
 	struct sk_buff *skb = etf_peek_timesortedlist(sch);
 	ktime_t next;
 
-	if (!skb)
+	if (!skb) {
+		qdisc_watchdog_cancel(&q->watchdog);
 		return;
+	}
 
 	next = ktime_sub_ns(skb->tstamp, q->delta);
 	qdisc_watchdog_schedule_ns(&q->watchdog, ktime_to_ns(next));
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next v1 2/4] etf: Use cached rb_root
  2018-11-15  1:26 [PATCH net-next v1 1/4] etf: Cancel timer if there are no pending skbs Vinicius Costa Gomes
@ 2018-11-15  1:26 ` Vinicius Costa Gomes
  2018-11-17  4:40   ` David Miller
  2018-11-15  1:26 ` [PATCH net-next v1 3/4] etf: Split timersortedlist_erase() Vinicius Costa Gomes
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Vinicius Costa Gomes @ 2018-11-15  1:26 UTC (permalink / raw)
  To: netdev
  Cc: Jesus Sanchez-Palencia, jhs, xiyou.wangcong, jiri,
	jesus.s.palencia, ilias.apalodimas

From: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>

ETF's peek() operation is heavily used so use an rb_root_cached instead
and leverage rb_first_cached() which will run in O(1) instead of
O(log n).

Even if on 'timesortedlist_clear()' we could be using rb_erase(), we
choose to use rb_erase_cached(), because if in the future we allow
runtime changes to ETF parameters, and need to do a '_clear()', this
might cause some hard to debug issues.

Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com>
---
 net/sched/sch_etf.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/net/sched/sch_etf.c b/net/sched/sch_etf.c
index fa85b24ac794..52452b546564 100644
--- a/net/sched/sch_etf.c
+++ b/net/sched/sch_etf.c
@@ -30,7 +30,7 @@ struct etf_sched_data {
 	int queue;
 	s32 delta; /* in ns */
 	ktime_t last; /* The txtime of the last skb sent to the netdevice. */
-	struct rb_root head;
+	struct rb_root_cached head;
 	struct qdisc_watchdog watchdog;
 	ktime_t (*get_time)(void);
 };
@@ -104,7 +104,7 @@ static struct sk_buff *etf_peek_timesortedlist(struct Qdisc *sch)
 	struct etf_sched_data *q = qdisc_priv(sch);
 	struct rb_node *p;
 
-	p = rb_first(&q->head);
+	p = rb_first_cached(&q->head);
 	if (!p)
 		return NULL;
 
@@ -156,8 +156,9 @@ static int etf_enqueue_timesortedlist(struct sk_buff *nskb, struct Qdisc *sch,
 				      struct sk_buff **to_free)
 {
 	struct etf_sched_data *q = qdisc_priv(sch);
-	struct rb_node **p = &q->head.rb_node, *parent = NULL;
+	struct rb_node **p = &q->head.rb_root.rb_node, *parent = NULL;
 	ktime_t txtime = nskb->tstamp;
+	bool leftmost = true;
 
 	if (!is_packet_valid(sch, nskb)) {
 		report_sock_error(nskb, EINVAL,
@@ -170,13 +171,15 @@ static int etf_enqueue_timesortedlist(struct sk_buff *nskb, struct Qdisc *sch,
 
 		parent = *p;
 		skb = rb_to_skb(parent);
-		if (ktime_after(txtime, skb->tstamp))
+		if (ktime_after(txtime, skb->tstamp)) {
 			p = &parent->rb_right;
-		else
+			leftmost = false;
+		} else {
 			p = &parent->rb_left;
+		}
 	}
 	rb_link_node(&nskb->rbnode, parent, p);
-	rb_insert_color(&nskb->rbnode, &q->head);
+	rb_insert_color_cached(&nskb->rbnode, &q->head, leftmost);
 
 	qdisc_qstats_backlog_inc(sch, nskb);
 	sch->q.qlen++;
@@ -192,7 +195,7 @@ static void timesortedlist_erase(struct Qdisc *sch, struct sk_buff *skb,
 {
 	struct etf_sched_data *q = qdisc_priv(sch);
 
-	rb_erase(&skb->rbnode, &q->head);
+	rb_erase_cached(&skb->rbnode, &q->head);
 
 	/* The rbnode field in the skb re-uses these fields, now that
 	 * we are done with the rbnode, reset them.
@@ -388,14 +391,14 @@ static int etf_init(struct Qdisc *sch, struct nlattr *opt,
 static void timesortedlist_clear(struct Qdisc *sch)
 {
 	struct etf_sched_data *q = qdisc_priv(sch);
-	struct rb_node *p = rb_first(&q->head);
+	struct rb_node *p = rb_first_cached(&q->head);
 
 	while (p) {
 		struct sk_buff *skb = rb_to_skb(p);
 
 		p = rb_next(p);
 
-		rb_erase(&skb->rbnode, &q->head);
+		rb_erase_cached(&skb->rbnode, &q->head);
 		rtnl_kfree_skbs(skb, skb);
 		sch->q.qlen--;
 	}
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next v1 3/4] etf: Split timersortedlist_erase()
  2018-11-15  1:26 [PATCH net-next v1 1/4] etf: Cancel timer if there are no pending skbs Vinicius Costa Gomes
  2018-11-15  1:26 ` [PATCH net-next v1 2/4] etf: Use cached rb_root Vinicius Costa Gomes
@ 2018-11-15  1:26 ` Vinicius Costa Gomes
  2018-11-17  4:40   ` David Miller
  2018-11-15  1:26 ` [PATCH net-next v1 4/4] etf: Drop all expired packets Vinicius Costa Gomes
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Vinicius Costa Gomes @ 2018-11-15  1:26 UTC (permalink / raw)
  To: netdev
  Cc: Jesus Sanchez-Palencia, jhs, xiyou.wangcong, jiri,
	jesus.s.palencia, ilias.apalodimas

From: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>

This is just a refactor that will simplify the implementation of the
next patch in this series which will drop all expired packets on the
dequeue flow.

Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com>
---
 net/sched/sch_etf.c | 44 +++++++++++++++++++++++++++++---------------
 1 file changed, 29 insertions(+), 15 deletions(-)

diff --git a/net/sched/sch_etf.c b/net/sched/sch_etf.c
index 52452b546564..bfe04748d5f0 100644
--- a/net/sched/sch_etf.c
+++ b/net/sched/sch_etf.c
@@ -190,10 +190,10 @@ static int etf_enqueue_timesortedlist(struct sk_buff *nskb, struct Qdisc *sch,
 	return NET_XMIT_SUCCESS;
 }
 
-static void timesortedlist_erase(struct Qdisc *sch, struct sk_buff *skb,
-				 bool drop)
+static void timesortedlist_drop(struct Qdisc *sch, struct sk_buff *skb)
 {
 	struct etf_sched_data *q = qdisc_priv(sch);
+	struct sk_buff *to_free = NULL;
 
 	rb_erase_cached(&skb->rbnode, &q->head);
 
@@ -206,19 +206,33 @@ static void timesortedlist_erase(struct Qdisc *sch, struct sk_buff *skb,
 
 	qdisc_qstats_backlog_dec(sch, skb);
 
-	if (drop) {
-		struct sk_buff *to_free = NULL;
+	report_sock_error(skb, ECANCELED, SO_EE_CODE_TXTIME_MISSED);
 
-		report_sock_error(skb, ECANCELED, SO_EE_CODE_TXTIME_MISSED);
+	qdisc_drop(skb, sch, &to_free);
+	kfree_skb_list(to_free);
+	qdisc_qstats_overlimit(sch);
 
-		qdisc_drop(skb, sch, &to_free);
-		kfree_skb_list(to_free);
-		qdisc_qstats_overlimit(sch);
-	} else {
-		qdisc_bstats_update(sch, skb);
+	sch->q.qlen--;
+}
 
-		q->last = skb->tstamp;
-	}
+static void timesortedlist_remove(struct Qdisc *sch, struct sk_buff *skb)
+{
+	struct etf_sched_data *q = qdisc_priv(sch);
+
+	rb_erase_cached(&skb->rbnode, &q->head);
+
+	/* The rbnode field in the skb re-uses these fields, now that
+	 * we are done with the rbnode, reset them.
+	 */
+	skb->next = NULL;
+	skb->prev = NULL;
+	skb->dev = qdisc_dev(sch);
+
+	qdisc_qstats_backlog_dec(sch, skb);
+
+	qdisc_bstats_update(sch, skb);
+
+	q->last = skb->tstamp;
 
 	sch->q.qlen--;
 }
@@ -237,7 +251,7 @@ static struct sk_buff *etf_dequeue_timesortedlist(struct Qdisc *sch)
 
 	/* Drop if packet has expired while in queue. */
 	if (ktime_before(skb->tstamp, now)) {
-		timesortedlist_erase(sch, skb, true);
+		timesortedlist_drop(sch, skb);
 		skb = NULL;
 		goto out;
 	}
@@ -246,7 +260,7 @@ static struct sk_buff *etf_dequeue_timesortedlist(struct Qdisc *sch)
 	 * txtime from deadline to (now + delta).
 	 */
 	if (q->deadline_mode) {
-		timesortedlist_erase(sch, skb, false);
+		timesortedlist_remove(sch, skb);
 		skb->tstamp = now;
 		goto out;
 	}
@@ -255,7 +269,7 @@ static struct sk_buff *etf_dequeue_timesortedlist(struct Qdisc *sch)
 
 	/* Dequeue only if now is within the [txtime - delta, txtime] range. */
 	if (ktime_after(now, next))
-		timesortedlist_erase(sch, skb, false);
+		timesortedlist_remove(sch, skb);
 	else
 		skb = NULL;
 
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next v1 4/4] etf: Drop all expired packets
  2018-11-15  1:26 [PATCH net-next v1 1/4] etf: Cancel timer if there are no pending skbs Vinicius Costa Gomes
  2018-11-15  1:26 ` [PATCH net-next v1 2/4] etf: Use cached rb_root Vinicius Costa Gomes
  2018-11-15  1:26 ` [PATCH net-next v1 3/4] etf: Split timersortedlist_erase() Vinicius Costa Gomes
@ 2018-11-15  1:26 ` Vinicius Costa Gomes
  2018-11-17  4:40   ` David Miller
  2018-11-17  4:39 ` [PATCH net-next v1 1/4] etf: Cancel timer if there are no pending skbs David Miller
  2018-12-03 19:19 ` Vinicius Costa Gomes
  4 siblings, 1 reply; 9+ messages in thread
From: Vinicius Costa Gomes @ 2018-11-15  1:26 UTC (permalink / raw)
  To: netdev
  Cc: Jesus Sanchez-Palencia, jhs, xiyou.wangcong, jiri,
	jesus.s.palencia, ilias.apalodimas

From: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>

Currently on dequeue() ETF only drops the first expired packet, which
causes a problem if the next packet is already expired. When this
happens, the watchdog will be configured with a time in the past, fire
straight way and the packet will finally be dropped once the dequeue()
function of the qdisc is called again.

We can save quite a few cycles and improve the overall behavior of the
qdisc if we drop all expired packets if the next packet is expired.
This should allow ETF to recover faster from bad situations. But
packet drops are still a very serious warning that the requirements
imposed on the system aren't reasonable.

This was inspired by how the implementation of hrtimers use the
rb_tree inside the kernel.

Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com>
---
 net/sched/sch_etf.c | 36 +++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/net/sched/sch_etf.c b/net/sched/sch_etf.c
index bfe04748d5f0..1150f22983df 100644
--- a/net/sched/sch_etf.c
+++ b/net/sched/sch_etf.c
@@ -190,29 +190,35 @@ static int etf_enqueue_timesortedlist(struct sk_buff *nskb, struct Qdisc *sch,
 	return NET_XMIT_SUCCESS;
 }
 
-static void timesortedlist_drop(struct Qdisc *sch, struct sk_buff *skb)
+static void timesortedlist_drop(struct Qdisc *sch, struct sk_buff *skb,
+				ktime_t now)
 {
 	struct etf_sched_data *q = qdisc_priv(sch);
 	struct sk_buff *to_free = NULL;
+	struct sk_buff *tmp = NULL;
 
-	rb_erase_cached(&skb->rbnode, &q->head);
+	skb_rbtree_walk_from_safe(skb, tmp) {
+		if (ktime_after(skb->tstamp, now))
+			break;
 
-	/* The rbnode field in the skb re-uses these fields, now that
-	 * we are done with the rbnode, reset them.
-	 */
-	skb->next = NULL;
-	skb->prev = NULL;
-	skb->dev = qdisc_dev(sch);
+		rb_erase_cached(&skb->rbnode, &q->head);
 
-	qdisc_qstats_backlog_dec(sch, skb);
+		/* The rbnode field in the skb re-uses these fields, now that
+		 * we are done with the rbnode, reset them.
+		 */
+		skb->next = NULL;
+		skb->prev = NULL;
+		skb->dev = qdisc_dev(sch);
 
-	report_sock_error(skb, ECANCELED, SO_EE_CODE_TXTIME_MISSED);
+		report_sock_error(skb, ECANCELED, SO_EE_CODE_TXTIME_MISSED);
 
-	qdisc_drop(skb, sch, &to_free);
-	kfree_skb_list(to_free);
-	qdisc_qstats_overlimit(sch);
+		qdisc_qstats_backlog_dec(sch, skb);
+		qdisc_drop(skb, sch, &to_free);
+		qdisc_qstats_overlimit(sch);
+		sch->q.qlen--;
+	}
 
-	sch->q.qlen--;
+	kfree_skb_list(to_free);
 }
 
 static void timesortedlist_remove(struct Qdisc *sch, struct sk_buff *skb)
@@ -251,7 +257,7 @@ static struct sk_buff *etf_dequeue_timesortedlist(struct Qdisc *sch)
 
 	/* Drop if packet has expired while in queue. */
 	if (ktime_before(skb->tstamp, now)) {
-		timesortedlist_drop(sch, skb);
+		timesortedlist_drop(sch, skb, now);
 		skb = NULL;
 		goto out;
 	}
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next v1 1/4] etf: Cancel timer if there are no pending skbs
  2018-11-15  1:26 [PATCH net-next v1 1/4] etf: Cancel timer if there are no pending skbs Vinicius Costa Gomes
                   ` (2 preceding siblings ...)
  2018-11-15  1:26 ` [PATCH net-next v1 4/4] etf: Drop all expired packets Vinicius Costa Gomes
@ 2018-11-17  4:39 ` David Miller
  2018-12-03 19:19 ` Vinicius Costa Gomes
  4 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2018-11-17  4:39 UTC (permalink / raw)
  To: vinicius.gomes
  Cc: netdev, jesus.sanchez-palencia, jhs, xiyou.wangcong, jiri,
	jesus.s.palencia, ilias.apalodimas

From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date: Wed, 14 Nov 2018 17:26:32 -0800

> From: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
> 
> There is no point in firing the qdisc watchdog if there are no future
> skbs pending in the queue and the watchdog had been set previously.
> 
> Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com>

Applied.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next v1 2/4] etf: Use cached rb_root
  2018-11-15  1:26 ` [PATCH net-next v1 2/4] etf: Use cached rb_root Vinicius Costa Gomes
@ 2018-11-17  4:40   ` David Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2018-11-17  4:40 UTC (permalink / raw)
  To: vinicius.gomes
  Cc: netdev, jesus.sanchez-palencia, jhs, xiyou.wangcong, jiri,
	jesus.s.palencia, ilias.apalodimas

From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date: Wed, 14 Nov 2018 17:26:33 -0800

> From: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
> 
> ETF's peek() operation is heavily used so use an rb_root_cached instead
> and leverage rb_first_cached() which will run in O(1) instead of
> O(log n).
> 
> Even if on 'timesortedlist_clear()' we could be using rb_erase(), we
> choose to use rb_erase_cached(), because if in the future we allow
> runtime changes to ETF parameters, and need to do a '_clear()', this
> might cause some hard to debug issues.
> 
> Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com>

Applied.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next v1 3/4] etf: Split timersortedlist_erase()
  2018-11-15  1:26 ` [PATCH net-next v1 3/4] etf: Split timersortedlist_erase() Vinicius Costa Gomes
@ 2018-11-17  4:40   ` David Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2018-11-17  4:40 UTC (permalink / raw)
  To: vinicius.gomes
  Cc: netdev, jesus.sanchez-palencia, jhs, xiyou.wangcong, jiri,
	jesus.s.palencia, ilias.apalodimas

From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date: Wed, 14 Nov 2018 17:26:34 -0800

> From: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
> 
> This is just a refactor that will simplify the implementation of the
> next patch in this series which will drop all expired packets on the
> dequeue flow.
> 
> Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com>

Applied.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next v1 4/4] etf: Drop all expired packets
  2018-11-15  1:26 ` [PATCH net-next v1 4/4] etf: Drop all expired packets Vinicius Costa Gomes
@ 2018-11-17  4:40   ` David Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2018-11-17  4:40 UTC (permalink / raw)
  To: vinicius.gomes
  Cc: netdev, jesus.sanchez-palencia, jhs, xiyou.wangcong, jiri,
	jesus.s.palencia, ilias.apalodimas

From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date: Wed, 14 Nov 2018 17:26:35 -0800

> From: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
> 
> Currently on dequeue() ETF only drops the first expired packet, which
> causes a problem if the next packet is already expired. When this
> happens, the watchdog will be configured with a time in the past, fire
> straight way and the packet will finally be dropped once the dequeue()
> function of the qdisc is called again.
> 
> We can save quite a few cycles and improve the overall behavior of the
> qdisc if we drop all expired packets if the next packet is expired.
> This should allow ETF to recover faster from bad situations. But
> packet drops are still a very serious warning that the requirements
> imposed on the system aren't reasonable.
> 
> This was inspired by how the implementation of hrtimers use the
> rb_tree inside the kernel.
> 
> Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com>

Applied.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next v1 1/4] etf: Cancel timer if there are no pending skbs
  2018-11-15  1:26 [PATCH net-next v1 1/4] etf: Cancel timer if there are no pending skbs Vinicius Costa Gomes
                   ` (3 preceding siblings ...)
  2018-11-17  4:39 ` [PATCH net-next v1 1/4] etf: Cancel timer if there are no pending skbs David Miller
@ 2018-12-03 19:19 ` Vinicius Costa Gomes
  4 siblings, 0 replies; 9+ messages in thread
From: Vinicius Costa Gomes @ 2018-12-03 19:19 UTC (permalink / raw)
  To: netdev
  Cc: Jesus Sanchez-Palencia, jhs, xiyou.wangcong, jiri,
	jesus.s.palencia, ilias.apalodimas

Hi,

Vinicius Costa Gomes <vinicius.gomes@intel.com> writes:

> From: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
>
> There is no point in firing the qdisc watchdog if there are no future
> skbs pending in the queue and the watchdog had been set previously.
>
> Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com>

It seems that I made a mistake when submitting this series.

It should have been proposed to the net tree (instead of net-next) and
should have included a "Fixes" tag.

Is there a way to have this series considered for the net tree at this
point (given that it already landed on net-next)?


Cheers,

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-12-03 19:19 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-11-15  1:26 [PATCH net-next v1 1/4] etf: Cancel timer if there are no pending skbs Vinicius Costa Gomes
2018-11-15  1:26 ` [PATCH net-next v1 2/4] etf: Use cached rb_root Vinicius Costa Gomes
2018-11-17  4:40   ` David Miller
2018-11-15  1:26 ` [PATCH net-next v1 3/4] etf: Split timersortedlist_erase() Vinicius Costa Gomes
2018-11-17  4:40   ` David Miller
2018-11-15  1:26 ` [PATCH net-next v1 4/4] etf: Drop all expired packets Vinicius Costa Gomes
2018-11-17  4:40   ` David Miller
2018-11-17  4:39 ` [PATCH net-next v1 1/4] etf: Cancel timer if there are no pending skbs David Miller
2018-12-03 19:19 ` Vinicius Costa Gomes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).