From mboxrd@z Thu Jan 1 00:00:00 1970 From: John Fastabend Subject: Re: [PATCH net-next v2] net: sched: fix skb leak in dev_requeue_skb() Date: Tue, 26 Dec 2017 21:24:07 -0800 Message-ID: <42a543e3-82ff-2421-186d-cc0957ea45ed@gmail.com> References: <1514173746-165282-1-git-send-email-weiyongjun1@huawei.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Cc: netdev@vger.kernel.org To: Wei Yongjun , Jamal Hadi Salim , Cong Wang , Jiri Pirko Return-path: Received: from mail-pg0-f67.google.com ([74.125.83.67]:43052 "EHLO mail-pg0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750762AbdL0FYW (ORCPT ); Wed, 27 Dec 2017 00:24:22 -0500 Received: by mail-pg0-f67.google.com with SMTP id f18so1817738pga.10 for ; Tue, 26 Dec 2017 21:24:21 -0800 (PST) In-Reply-To: <1514173746-165282-1-git-send-email-weiyongjun1@huawei.com> Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: On 12/24/2017 07:49 PM, Wei Yongjun wrote: > When dev_requeue_skb() is called with bluked skb list, only the > first skb of the list will be requeued to qdisc layer, and leak > the others without free them. > > TCP is broken due to skb leak since no free skb will be considered > as still in the host queue and never be retransmitted. This happend > when dev_requeue_skb() called from qdisc_restart(). > qdisc_restart > |-- dequeue_skb > |-- sch_direct_xmit() > |-- dev_requeue_skb() <-- skb may bluked > > Fix dev_requeue_skb() to requeue the full bluked list. > > Fixes: a53851e2c321 ("net: sched: explicit locking in gso_cpu fallback") > Signed-off-by: Wei Yongjun > --- > v1 -> v2: add net-next prefix > --- First, thanks for tracking this down. > diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c > index 981c08f..0df2dbf 100644 > --- a/net/sched/sch_generic.c > +++ b/net/sched/sch_generic.c > @@ -111,10 +111,16 @@ static inline void qdisc_enqueue_skb_bad_txq(struct Qdisc *q, > > static inline int __dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q) > { > - __skb_queue_head(&q->gso_skb, skb); > - q->qstats.requeues++; > - qdisc_qstats_backlog_inc(q, skb); > - q->q.qlen++; /* it's still part of the queue */ > + while (skb) { > + struct sk_buff *next = skb->next; > + > + __skb_queue_tail(&q->gso_skb, skb); Was the change from __skb_queue_head to __skb_queue_tail here intentional? We should re-queue packets to the head of the list. > + q->qstats.requeues++; > + qdisc_qstats_backlog_inc(q, skb); > + q->q.qlen++; /* it's still part of the queue */ > + > + skb = next; > + } > __netif_schedule(q); > > return 0; > @@ -124,13 +130,20 @@ static inline int dev_requeue_skb_locked(struct sk_buff *skb, struct Qdisc *q) > { > spinlock_t *lock = qdisc_lock(q); > > - spin_lock(lock); > - __skb_queue_tail(&q->gso_skb, skb); > - spin_unlock(lock); > + while (skb) { > + struct sk_buff *next = skb->next; > + > + spin_lock(lock); In this case I suspect its better to move the lock to be around the while loop rather than grab and drop it repeatedly. I don't have any data at this point so OK either way. Assuming other head/tail comment is addressed. > + __skb_queue_tail(&q->gso_skb, skb); Same here *_tail should be *_head? > + spin_unlock(lock); > + > + qdisc_qstats_cpu_requeues_inc(q); > + qdisc_qstats_cpu_backlog_inc(q, skb); > + qdisc_qstats_cpu_qlen_inc(q); > + > + skb = next; > + } > > - qdisc_qstats_cpu_requeues_inc(q); > - qdisc_qstats_cpu_backlog_inc(q, skb); > - qdisc_qstats_cpu_qlen_inc(q); > __netif_schedule(q); > > return 0; >