From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Miller
Subject: Re: [net PATCH v2] net: sched, fix OOO packets with pfifo_fast
Date: Mon, 26 Mar 2018 12:36:43 -0400 (EDT)
Message-ID: <20180326.123643.803872307508307757.davem@davemloft.net>
References: <20180325052505.4098.36912.stgit@john-Precision-Tower-5810>
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: eric.dumazet@gmail.com, xiyou.wangcong@gmail.com, jiri@resnulli.us, netdev@vger.kernel.org
To: john.fastabend@gmail.com
Return-path:
Received: from shards.monkeyblade.net ([184.105.139.130]:53002 "EHLO shards.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751099AbeCZQgq (ORCPT ); Mon, 26 Mar 2018 12:36:46 -0400
In-Reply-To: <20180325052505.4098.36912.stgit@john-Precision-Tower-5810>
Sender: netdev-owner@vger.kernel.org
List-ID:

From: John Fastabend
Date: Sat, 24 Mar 2018 22:25:06 -0700

> After the qdisc lock was dropped in pfifo_fast we allow multiple
> enqueue threads and dequeue threads to run in parallel. On the
> enqueue side the skb bit ooo_okay is used to ensure all related
> skbs are enqueued in-order. On the dequeue side though there is
> no similar logic. What we observe is with fewer queues than CPUs
> it is possible to re-order packets when two instances of
> __qdisc_run() are running in parallel. Each thread will dequeue
> a skb and then whichever thread calls the ndo op first will
> be sent on the wire. This doesn't typically happen because
> qdisc_run() is usually triggered by the same core that did the
> enqueue. However, drivers will trigger __netif_schedule()
> when queues are transitioning from stopped to awake using the
> netif_tx_wake_* APIs. When this happens netif_schedule() calls
> qdisc_run() on the same CPU that did the netif_tx_wake_* which
> is usually done in the interrupt completion context. This CPU
> is selected with the irq affinity which is unrelated to the
> enqueue operations.
>
> To resolve this we add a RUNNING bit to the qdisc to ensure
> only a single dequeue per qdisc is running. Enqueue and dequeue
> operations can still run in parallel and also on multi queue
> NICs we can still have a dequeue in-flight per qdisc, which
> is typically per CPU.
>
> Fixes: c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array")
> Reported-by: Jakob Unterwurzacher
> Signed-off-by: John Fastabend

Applied, thanks John.
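
For readers following the thread, the RUNNING-bit idea described in the
quoted changelog can be sketched in userspace C roughly as below. This is
only an illustrative model under stated assumptions, not the code from the
patch: struct model_qdisc and every model_* function are hypothetical
names, a C11 atomic_bool stands in for the RUNNING bit, and a plain int
array stands in for the skb queue (so it is not safe for truly concurrent
enqueuers the way the real queue is).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_QLEN 16

/* Hypothetical userspace model, not the kernel's struct Qdisc. */
struct model_qdisc {
	atomic_bool running;	/* stands in for the RUNNING bit */
	int pkts[MODEL_QLEN];	/* stands in for the queued skbs */
	size_t head, tail;	/* dequeue at head, enqueue at tail */
};

static void model_init(struct model_qdisc *q)
{
	atomic_init(&q->running, false);
	q->head = q->tail = 0;
}

/* Enqueue never looks at the RUNNING bit, so an active dequeuer
 * does not block it. (This toy array assumes a single enqueuer.) */
static bool model_enqueue(struct model_qdisc *q, int pkt)
{
	if (q->tail == MODEL_QLEN)
		return false;
	q->pkts[q->tail++] = pkt;
	return true;
}

/* Try to become the single dequeuer; true means we own the bit. */
static bool model_run_begin(struct model_qdisc *q)
{
	/* exchange returns the old value: false means we set it first */
	return !atomic_exchange_explicit(&q->running, true,
					 memory_order_acquire);
}

static void model_run_end(struct model_qdisc *q)
{
	atomic_store_explicit(&q->running, false, memory_order_release);
}

/* Drain up to cap packets into out[] in FIFO order. Returns the
 * number drained, or -1 if another dequeuer already holds the bit. */
static int model_qdisc_run(struct model_qdisc *q, int *out, size_t cap)
{
	size_t n = 0;

	assert(out != NULL);
	if (!model_run_begin(q))
		return -1;
	while (n < cap && q->head < q->tail)
		out[n++] = q->pkts[q->head++];
	model_run_end(q);
	return (int)n;
}
```

In this model a second caller of model_qdisc_run() backs off with -1
while the bit is held, so only one context ever hands packets to the
"wire" at a time, while model_enqueue() stays usable throughout. Each
model_qdisc carries its own bit, which mirrors the point that separate
qdiscs can still each have a dequeue in flight.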