From: Paolo Abeni <pabeni@redhat.com>
To: John Fastabend <john.fastabend@gmail.com>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	Jiri Pirko <jiri@resnulli.us>, David Miller <davem@davemloft.net>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>
Subject: Re: [net PATCH v2] net: sched, fix OOO packets with pfifo_fast
Date: Tue, 08 May 2018 18:17:14 +0200
Message-ID: <1525796234.2723.21.camel@redhat.com>
In-Reply-To: <36a89ed1-d6ff-ddad-c736-4e68909d61c4@gmail.com>

Hi all,

I'm still banging my head against this item...

On Wed, 2018-04-18 at 09:44 -0700, John Fastabend wrote:
> There is a set of conditions
> that if met we can run without the lock. Possibly ONETXQUEUE and
> aligned cpu_map is sufficient. We could detect this case and drop
> the locking. For existing systems and high Gbps NICs I think (feel
> free to correct me) assuming a core per cpu is OK. At some point
> though we probably need to revisit this assumption.

I think we can improve things measurably by moving the
__QDISC_STATE_RUNNING bit fiddling so that it wraps the __qdisc_run()
call in the 'lockless' path, instead of keeping it inside
__qdisc_restart().

Currently, in the single-sender scenario with a packet rate below the
link limit, we pay the atomic bit overhead twice per transmitted
packet: once for the packet's own dequeue, plus once more for the
subsequent, failing dequeue attempt. With the wider scope we would pay
it only once.
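
Something along these lines, as a userspace-flavoured sketch; this is
not the actual kernel code, and dequeue_and_xmit_one() is a made-up
stand-in for the dequeue plus driver-xmit step:

#include <stdatomic.h>
#include <stdbool.h>

struct qdisc {
	atomic_flag running;	/* stands in for __QDISC_STATE_RUNNING */
};

/* Made-up stub for "dequeue one packet and hand it to the driver";
 * returns true while the queue is non-empty. */
static bool dequeue_and_xmit_one(struct qdisc *q)
{
	(void)q;
	return false;
}

/* Current shape: the atomic set/clear pair lives inside the
 * per-packet restart step, so the last, failing dequeue attempt
 * pays for one more pair. */
static bool qdisc_restart_narrow(struct qdisc *q)
{
	bool more;

	if (atomic_flag_test_and_set(&q->running))
		return false;	/* another CPU owns the qdisc */
	more = dequeue_and_xmit_one(q);
	atomic_flag_clear(&q->running);
	return more;
}

static void qdisc_run_narrow(struct qdisc *q)
{
	while (qdisc_restart_narrow(q))
		;
}

/* Proposed shape: a single atomic pair wraps the whole run loop,
 * no matter how many dequeue attempts it makes. */
static void qdisc_run_wide(struct qdisc *q)
{
	if (atomic_flag_test_and_set(&q->running))
		return;		/* another CPU owns the qdisc */
	while (dequeue_and_xmit_one(q))
		;
	atomic_flag_clear(&q->running);
}

int main(void)
{
	struct qdisc q = { .running = ATOMIC_FLAG_INIT };

	qdisc_run_narrow(&q);	/* one set/clear pair per dequeue attempt */
	qdisc_run_wide(&q);	/* one set/clear pair per run */
	return 0;
}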

After that change, __QDISC_STATE_RUNNING usage will look a bit like
qdisc_lock(), at least for the dequeue part. So I'm wondering: could
we replace __QDISC_STATE_RUNNING with spin_trylock(qdisc_lock()) _and_
keep that lock held for the whole qdisc_run()?
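
Roughly like this; again a self-contained userspace sketch, with
pthread spinlocks standing in for qdisc_lock() and the xmit lock, and
dequeue_one()/driver_xmit() as made-up stubs:

#include <pthread.h>
#include <stddef.h>

struct sq {
	pthread_spinlock_t lock;	/* stands in for qdisc_lock(q) */
};

struct stxq {
	pthread_spinlock_t xmit_lock;	/* stands in for the xmit lock */
};

/* Made-up stub: pop one packet, NULL when the queue is empty. */
static void *dequeue_one(struct sq *q)
{
	(void)q;
	return NULL;
}

/* Made-up stub for the driver transmit path. */
static void driver_xmit(struct stxq *txq, void *pkt)
{
	(void)txq;
	(void)pkt;
}

/* spin_trylock(qdisc_lock()) replaces the RUNNING bit, and the lock
 * stays held for the whole run; the xmit lock is only ever taken
 * nested inside it, in the same order on every CPU. */
static void qdisc_run_locked(struct sq *q, struct stxq *txq)
{
	void *pkt;

	if (pthread_spin_trylock(&q->lock))
		return;		/* another CPU is running this qdisc */

	while ((pkt = dequeue_one(q)) != NULL) {
		pthread_spin_lock(&txq->xmit_lock);
		driver_xmit(txq, pkt);
		pthread_spin_unlock(&txq->xmit_lock);
	}
	pthread_spin_unlock(&q->lock);
}

int main(void)
{
	struct sq q;
	struct stxq txq;

	pthread_spin_init(&q.lock, PTHREAD_PROCESS_PRIVATE);
	pthread_spin_init(&txq.xmit_lock, PTHREAD_PROCESS_PRIVATE);
	qdisc_run_locked(&q, &txq);
	return 0;
}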

The comment above qdisc_restart() clearly states we can't, but I don't
see why. Always acquiring qdisc_lock() and the xmit lock in that same
order looks safe to me. Can someone please explain? Is there some
possible deadlock condition I'm missing?
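
For reference, the invariant I have in mind is the textbook one: if
every CPU takes the two locks in the same fixed order, the wait-for
graph cannot form a cycle. A toy pthread program (nothing
kernel-specific, names are illustrative) showing that a fixed
A-then-B order cannot deadlock:

#include <pthread.h>
#include <stdio.h>

/* A stands in for qdisc_lock(), B for the device xmit lock. */
static pthread_mutex_t A = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t B = PTHREAD_MUTEX_INITIALIZER;

/* Every thread takes A before B; a deadlock would need a second
 * code path taking B and then A. */
static void *sender(void *unused)
{
	long i;

	(void)unused;
	for (i = 0; i < 1000000; i++) {
		pthread_mutex_lock(&A);		/* qdisc_lock() */
		pthread_mutex_lock(&B);		/* xmit lock, nested */
		pthread_mutex_unlock(&B);
		pthread_mutex_unlock(&A);
	}
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, sender, NULL);
	pthread_create(&t2, NULL, sender, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	puts("done: fixed A->B ordering cannot deadlock");
	return 0;
}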

It looks like the comment itself comes directly from the pre-BitKeeper
era (modulo lock name changes).

Performance-wise, acquiring qdisc_lock() only once per transmitted
packet should considerably improve 'locked' qdisc performance, in both
the contended and the uncontended scenarios (and some quick
experiments seem to confirm that).

Thanks,

Paolo


Thread overview: 9+ messages
2018-03-25  5:25 [net PATCH v2] net: sched, fix OOO packets with pfifo_fast John Fastabend
2018-03-26 16:36 ` David Miller
2018-03-26 17:10   ` John Fastabend
2018-03-26 17:30 ` Cong Wang
2018-03-26 18:16   ` John Fastabend
2018-04-18  7:28     ` Paolo Abeni
2018-04-18 16:44       ` John Fastabend
2018-04-19  8:00         ` Paolo Abeni
2018-05-08 16:17         ` Paolo Abeni [this message]
