netdev.vger.kernel.org archive mirror
From: John Fastabend <john.fastabend@gmail.com>
To: David Miller <davem@davemloft.net>
Cc: eric.dumazet@gmail.com, xiyou.wangcong@gmail.com,
	jiri@resnulli.us, netdev@vger.kernel.org,
	Jakob Unterwurzacher <jakob.unterwurzacher@theobroma-systems.com>
Subject: Re: [net PATCH v2] net: sched, fix OOO packets with pfifo_fast
Date: Mon, 26 Mar 2018 10:10:06 -0700
Message-ID: <03243235-f9ae-e44b-a0d7-0b8f3294dd2a@gmail.com>
In-Reply-To: <20180326.123643.803872307508307757.davem@davemloft.net>

On 03/26/2018 09:36 AM, David Miller wrote:
> From: John Fastabend <john.fastabend@gmail.com>
> Date: Sat, 24 Mar 2018 22:25:06 -0700
> 
>> After the qdisc lock was dropped in pfifo_fast we allow multiple
>> enqueue threads and dequeue threads to run in parallel. On the
>> enqueue side the skb bit ooo_okay is used to ensure all related
>> skbs are enqueued in-order. On the dequeue side though there is
>> no similar logic. What we observe is with fewer queues than CPUs
>> it is possible to re-order packets when two instances of
>> __qdisc_run() are running in parallel. Each thread will dequeue
>> a skb and then whichever thread calls the ndo op first will
>> be sent on the wire. This doesn't typically happen because
>> qdisc_run() is usually triggered by the same core that did the
>> enqueue. However, drivers will trigger __netif_schedule()
>> when queues are transitioning from stopped to awake using the
>> netif_tx_wake_* APIs. When this happens netif_schedule() calls
>> qdisc_run() on the same CPU that did the netif_tx_wake_* which
>> is usually done in the interrupt completion context. This CPU
>> is selected with the irq affinity which is unrelated to the
>> enqueue operations.
>>
>> To resolve this we add a RUNNING bit to the qdisc to ensure
>> only a single dequeue per qdisc is running. Enqueue and dequeue
>> operations can still run in parallel and also on multi queue
>> NICs we can still have a dequeue in-flight per qdisc, which
>> is typically per CPU.
>>
>> Fixes: c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array")
>> Reported-by: Jakob Unterwurzacher <jakob.unterwurzacher@theobroma-systems.com>
>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> 
> Applied, thanks John.
> 

Great. Jakob also emailed me off-list (I forgot to add him to the
CC list here, oops) and asked me to add:

Tested-by: Jakob Unterwurzacher <jakob.unterwurzacher@theobroma-systems.com>

Also in net-next I'll look to see if we can avoid doing the extra
atomics, especially in cases where they are not actually needed, for
example with 1:1 qdisc to txq mappings. That seems a bit invasive
for net, though.
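
As a rough illustration of the RUNNING bit described in the commit
message above, and of where the extra atomics come from, here is a
minimal userspace sketch using C11 atomics. It is not the kernel code:
struct toy_qdisc, toy_run_begin(), toy_run_end() and toy_qdisc_run()
are hypothetical names, and the toy FIFO models only the dequeue side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a qdisc; not the kernel's struct Qdisc. */
struct toy_qdisc {
	atomic_bool running;	/* true while one thread is dequeuing */
	int pkts[8];		/* toy FIFO contents */
	int head, tail;		/* touched only by the active dequeuer here */
};

/* Try to become the single dequeuer; false means someone else is running. */
static bool toy_run_begin(struct toy_qdisc *q)
{
	/* exchange returns the previous value: false means we won the bit */
	return !atomic_exchange_explicit(&q->running, true,
					 memory_order_acquire);
}

/* Drop the RUNNING bit so another thread may dequeue. */
static void toy_run_end(struct toy_qdisc *q)
{
	atomic_store_explicit(&q->running, false, memory_order_release);
}

/*
 * Dequeue up to max packets into out[], in FIFO order.
 * Returns the number dequeued, or -1 if another dequeuer holds the bit.
 */
static int toy_qdisc_run(struct toy_qdisc *q, int *out, int max)
{
	int n = 0;

	if (!toy_run_begin(q))
		return -1;
	while (q->head != q->tail && n < max)
		out[n++] = q->pkts[q->head++];
	toy_run_end(q);
	return n;
}
```

Every call pays for one atomic exchange and one atomic store even when
no second dequeuer can exist, which is the overhead a 1:1 qdisc to txq
mapping might be able to skip.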

Finally, just an FYI: I think I'll look at a distributed counter
soon so we can get a lockless token bucket. I need the counter for
BPF as well, so it should be coming soon.
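
The mail does not say what the distributed counter would look like. One
common shape, shown here purely as an assumption, is a set of
cache-line-padded slots (for example one per CPU) that writers update
independently and readers sum. All names and the slot count below are
hypothetical, and this is a userspace C11 sketch, not kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TOY_NSLOTS 4	/* assumed slot count, e.g. one slot per CPU */

/* Each slot gets its own 64-byte-aligned storage to limit false sharing. */
struct toy_slot {
	_Alignas(64) atomic_uint_fast64_t val;
};

struct toy_dcounter {
	struct toy_slot slot[TOY_NSLOTS];
};

/* Add n to the caller's slot; writers on different slots do not contend. */
static void toy_dcounter_add(struct toy_dcounter *c, unsigned int slot,
			     uint64_t n)
{
	atomic_fetch_add_explicit(&c->slot[slot % TOY_NSLOTS].val, n,
				  memory_order_relaxed);
}

/* Sum all slots; the result is approximate while writers are active. */
static uint64_t toy_dcounter_read(struct toy_dcounter *c)
{
	uint64_t sum = 0;

	for (int i = 0; i < TOY_NSLOTS; i++)
		sum += atomic_load_explicit(&c->slot[i].val,
					    memory_order_relaxed);
	return sum;
}
```

The trade-off in this shape is cheap, uncontended updates against a
read that walks every slot and is not a consistent snapshot.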

Thanks,
John

Thread overview: 9+ messages
2018-03-25  5:25 [net PATCH v2] net: sched, fix OOO packets with pfifo_fast John Fastabend
2018-03-26 16:36 ` David Miller
2018-03-26 17:10   ` John Fastabend [this message]
2018-03-26 17:30 ` Cong Wang
2018-03-26 18:16   ` John Fastabend
2018-04-18  7:28     ` Paolo Abeni
2018-04-18 16:44       ` John Fastabend
2018-04-19  8:00         ` Paolo Abeni
2018-05-08 16:17         ` Paolo Abeni