[PATCH net-next v2] net: sched: fix skb leak in dev_requeue

Netdev List
 help / color / mirror / Atom feed

* [PATCH net-next v2] net: sched: fix skb leak in dev_requeue_skb()
@ 2017-12-25  3:49 Wei Yongjun
  2017-12-27  5:24 ` John Fastabend
  0 siblings, 1 reply; 3+ messages in thread
From: Wei Yongjun @ 2017-12-25  3:49 UTC (permalink / raw)
  To: John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko
  Cc: Wei Yongjun, netdev

When dev_requeue_skb() is called with bluked skb list, only the
first skb of the list will be requeued to qdisc layer, and leak
the others without free them.

TCP is broken due to skb leak since no free skb will be considered
as still in the host queue and never be retransmitted. This happend
when dev_requeue_skb() called from qdisc_restart().
  qdisc_restart
  |-- dequeue_skb
  |-- sch_direct_xmit()
      |-- dev_requeue_skb() <-- skb may bluked

Fix dev_requeue_skb() to requeue the full bluked list.

Fixes: a53851e2c321 ("net: sched: explicit locking in gso_cpu fallback")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
---
v1 -> v2: add net-next prefix
---
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 981c08f..0df2dbf 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -111,10 +111,16 @@ static inline void qdisc_enqueue_skb_bad_txq(struct Qdisc *q,
 
 static inline int __dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
 {
-	__skb_queue_head(&q->gso_skb, skb);
-	q->qstats.requeues++;
-	qdisc_qstats_backlog_inc(q, skb);
-	q->q.qlen++;	/* it's still part of the queue */
+	while (skb) {
+		struct sk_buff *next = skb->next;
+
+		__skb_queue_tail(&q->gso_skb, skb);
+		q->qstats.requeues++;
+		qdisc_qstats_backlog_inc(q, skb);
+		q->q.qlen++;	/* it's still part of the queue */
+
+		skb = next;
+	}
 	__netif_schedule(q);
 
 	return 0;
@@ -124,13 +130,20 @@ static inline int dev_requeue_skb_locked(struct sk_buff *skb, struct Qdisc *q)
 {
 	spinlock_t *lock = qdisc_lock(q);
 
-	spin_lock(lock);
-	__skb_queue_tail(&q->gso_skb, skb);
-	spin_unlock(lock);
+	while (skb) {
+		struct sk_buff *next = skb->next;
+
+		spin_lock(lock);
+		__skb_queue_tail(&q->gso_skb, skb);
+		spin_unlock(lock);
+
+		qdisc_qstats_cpu_requeues_inc(q);
+		qdisc_qstats_cpu_backlog_inc(q, skb);
+		qdisc_qstats_cpu_qlen_inc(q);
+
+		skb = next;
+	}
 
-	qdisc_qstats_cpu_requeues_inc(q);
-	qdisc_qstats_cpu_backlog_inc(q, skb);
-	qdisc_qstats_cpu_qlen_inc(q);
 	__netif_schedule(q);
 
 	return 0;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH net-next v2] net: sched: fix skb leak in dev_requeue_skb()
  2017-12-25  3:49 [PATCH net-next v2] net: sched: fix skb leak in dev_requeue_skb() Wei Yongjun
@ 2017-12-27  5:24 ` John Fastabend
  2017-12-27  8:02   ` weiyongjun (A)
  0 siblings, 1 reply; 3+ messages in thread
From: John Fastabend @ 2017-12-27  5:24 UTC (permalink / raw)
  To: Wei Yongjun, Jamal Hadi Salim, Cong Wang, Jiri Pirko; +Cc: netdev

On 12/24/2017 07:49 PM, Wei Yongjun wrote:
> When dev_requeue_skb() is called with bluked skb list, only the
> first skb of the list will be requeued to qdisc layer, and leak
> the others without free them.
> 
> TCP is broken due to skb leak since no free skb will be considered
> as still in the host queue and never be retransmitted. This happend
> when dev_requeue_skb() called from qdisc_restart().
>   qdisc_restart
>   |-- dequeue_skb
>   |-- sch_direct_xmit()
>       |-- dev_requeue_skb() <-- skb may bluked
> 
> Fix dev_requeue_skb() to requeue the full bluked list.
> 
> Fixes: a53851e2c321 ("net: sched: explicit locking in gso_cpu fallback")
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
> ---
> v1 -> v2: add net-next prefix
> ---

First, thanks for tracking this down.

> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 981c08f..0df2dbf 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -111,10 +111,16 @@ static inline void qdisc_enqueue_skb_bad_txq(struct Qdisc *q,
>  
>  static inline int __dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
>  {
> -	__skb_queue_head(&q->gso_skb, skb);
> -	q->qstats.requeues++;
> -	qdisc_qstats_backlog_inc(q, skb);
> -	q->q.qlen++;	/* it's still part of the queue */
> +	while (skb) {
> +		struct sk_buff *next = skb->next;
> +
> +		__skb_queue_tail(&q->gso_skb, skb);

Was the change from __skb_queue_head to __skb_queue_tail here
intentional? We should re-queue packets to the head of the list.

> +		q->qstats.requeues++;
> +		qdisc_qstats_backlog_inc(q, skb);
> +		q->q.qlen++;	/* it's still part of the queue */
> +
> +		skb = next;
> +	}
>  	__netif_schedule(q);
>  
>  	return 0;
> @@ -124,13 +130,20 @@ static inline int dev_requeue_skb_locked(struct sk_buff *skb, struct Qdisc *q)
>  {
>  	spinlock_t *lock = qdisc_lock(q);
>  
> -	spin_lock(lock);
> -	__skb_queue_tail(&q->gso_skb, skb);
> -	spin_unlock(lock);
> +	while (skb) {
> +		struct sk_buff *next = skb->next;
> +
> +		spin_lock(lock);

In this case I suspect its better to move the lock to be around the
while loop rather than grab and drop it repeatedly. I don't have
any data at this point so OK either way. Assuming other head/tail
comment is addressed.

> +		__skb_queue_tail(&q->gso_skb, skb);

Same here *_tail should be *_head?

> +		spin_unlock(lock);
> +
> +		qdisc_qstats_cpu_requeues_inc(q);
> +		qdisc_qstats_cpu_backlog_inc(q, skb);
> +		qdisc_qstats_cpu_qlen_inc(q);
> +
> +		skb = next;
> +	}
>  
> -	qdisc_qstats_cpu_requeues_inc(q);
> -	qdisc_qstats_cpu_backlog_inc(q, skb);
> -	qdisc_qstats_cpu_qlen_inc(q);
>  	__netif_schedule(q);
>  
>  	return 0;
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* RE: [PATCH net-next v2] net: sched: fix skb leak in dev_requeue_skb()
  2017-12-27  5:24 ` John Fastabend
@ 2017-12-27  8:02   ` weiyongjun (A)
  0 siblings, 0 replies; 3+ messages in thread
From: weiyongjun (A) @ 2017-12-27  8:02 UTC (permalink / raw)
  To: John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko
  Cc: netdev@vger.kernel.org

On 12/27/2017 1:24 PM, John Fastabend wrote:
> On 12/24/2017 07:49 PM, Wei Yongjun wrote:
> > When dev_requeue_skb() is called with bluked skb list, only the
> > first skb of the list will be requeued to qdisc layer, and leak
> > the others without free them.
> >
> > TCP is broken due to skb leak since no free skb will be considered
> > as still in the host queue and never be retransmitted. This happend
> > when dev_requeue_skb() called from qdisc_restart().
> >   qdisc_restart
> >   |-- dequeue_skb
> >   |-- sch_direct_xmit()
> >       |-- dev_requeue_skb() <-- skb may bluked
> >
> > Fix dev_requeue_skb() to requeue the full bluked list.
> >
> > Fixes: a53851e2c321 ("net: sched: explicit locking in gso_cpu fallback")
> > Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
> > ---
> > v1 -> v2: add net-next prefix
> > ---
> 
> First, thanks for tracking this down.
> 
> > diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> > index 981c08f..0df2dbf 100644
> > --- a/net/sched/sch_generic.c
> > +++ b/net/sched/sch_generic.c
> > @@ -111,10 +111,16 @@ static inline void
> qdisc_enqueue_skb_bad_txq(struct Qdisc *q,
> >
> >  static inline int __dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
> >  {
> > -	__skb_queue_head(&q->gso_skb, skb);
> > -	q->qstats.requeues++;
> > -	qdisc_qstats_backlog_inc(q, skb);
> > -	q->q.qlen++;	/* it's still part of the queue */
> > +	while (skb) {
> > +		struct sk_buff *next = skb->next;
> > +
> > +		__skb_queue_tail(&q->gso_skb, skb);
> 
> Was the change from __skb_queue_head to __skb_queue_tail here
> intentional? We should re-queue packets to the head of the list.

skb is dequeue from gso_skb in order, so if packets re-queue to the head
of the list, skb will be sent in reverse order, so I think __skb_queue_tail is
better here. Tcpdump show those skbs are out of order packets.

> 
> > +		q->qstats.requeues++;
> > +		qdisc_qstats_backlog_inc(q, skb);
> > +		q->q.qlen++;	/* it's still part of the queue */
> > +
> > +		skb = next;
> > +	}
> >  	__netif_schedule(q);
> >
> >  	return 0;
> > @@ -124,13 +130,20 @@ static inline int dev_requeue_skb_locked(struct
> sk_buff *skb, struct Qdisc *q)
> >  {
> >  	spinlock_t *lock = qdisc_lock(q);
> >
> > -	spin_lock(lock);
> > -	__skb_queue_tail(&q->gso_skb, skb);
> > -	spin_unlock(lock);
> > +	while (skb) {
> > +		struct sk_buff *next = skb->next;
> > +
> > +		spin_lock(lock);
> 
> In this case I suspect its better to move the lock to be around the
> while loop rather than grab and drop it repeatedly. I don't have
> any data at this point so OK either way. Assuming other head/tail
> comment is addressed.

I will fix it in v3.

> 
> > +		__skb_queue_tail(&q->gso_skb, skb);
> 
> Same here *_tail should be *_head?
> 
> > +		spin_unlock(lock);
> > +
> > +		qdisc_qstats_cpu_requeues_inc(q);
> > +		qdisc_qstats_cpu_backlog_inc(q, skb);
> > +		qdisc_qstats_cpu_qlen_inc(q);
> > +
> > +		skb = next;
> > +	}
> >
> > -	qdisc_qstats_cpu_requeues_inc(q);
> > -	qdisc_qstats_cpu_backlog_inc(q, skb);
> > -	qdisc_qstats_cpu_qlen_inc(q);
> >  	__netif_schedule(q);
> >
> >  	return 0;
> >


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-12-27  8:03 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-12-25  3:49 [PATCH net-next v2] net: sched: fix skb leak in dev_requeue_skb() Wei Yongjun
2017-12-27  5:24 ` John Fastabend
2017-12-27  8:02   ` weiyongjun (A)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox