From: Paolo Abeni
Subject: Re: [net PATCH v2] net: sched, fix OOO packets with pfifo_fast
Date: Tue, 08 May 2018 18:17:14 +0200
Message-ID: <1525796234.2723.21.camel@redhat.com>
In-Reply-To: <36a89ed1-d6ff-ddad-c736-4e68909d61c4@gmail.com>
References: <20180325052505.4098.36912.stgit@john-Precision-Tower-5810>
 <7f8636e3-c04f-18b6-7e6c-0f28bc54edbb@gmail.com>
 <1524036512.2599.4.camel@redhat.com>
 <36a89ed1-d6ff-ddad-c736-4e68909d61c4@gmail.com>
To: John Fastabend, Cong Wang, Jamal Hadi Salim
Cc: Eric Dumazet, Jiri Pirko, David Miller, Linux Kernel Network Developers

Hi all,

I'm still banging my head against this item...

On Wed, 2018-04-18 at 09:44 -0700, John Fastabend wrote:
> There is a set of conditions that if met we can run without the lock.
> Possibly ONETXQUEUE and aligned cpu_map is sufficient. We could detect
> this case and drop the locking. For existing systems and high Gbps NICs
> I think (feel free to correct me) assuming a core per cpu is OK. At some
> point though we probably need to revisit this assumption.

I think we can measurably improve things by moving the __QDISC_STATE_RUNNING
bit fiddling around the __qdisc_run() call in the 'lockless' path, instead
of keeping it inside __qdisc_restart().

Currently, in the single-sender scenario with a packet rate below the link
limit, we pay the atomic bit overhead twice per transmitted packet: once for
the dequeue itself, plus once more for the next, failing, dequeue attempt.
With the wider scope we would always pay it only once.

After that change, the __QDISC_STATE_RUNNING usage will look a lot like
qdisc_lock(), for the dequeue part at least. So I'm wondering if we could
replace __QDISC_STATE_RUNNING with spin_trylock(qdisc_lock()) _and_ keep
that lock held for the whole qdisc_run() (see the rough sketch after my
signature).

The comment above qdisc_restart() clearly states we can't, but I don't see
why. Acquiring qdisc_lock() and the xmit lock always in that order looks
safe to me. Can someone please explain? Is there some possible deadlock
condition I'm missing? It looks like the comment itself comes directly from
the pre-BitKeeper era (modulo lock name changes).

Performance-wise, acquiring qdisc_lock() only once per transmitted packet
should considerably improve 'locked' qdisc performance, in both the
contended and the uncontended scenarios (and some quick experiments seem to
confirm that).

Thanks,

Paolo
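
P.S.: to make the above a bit more concrete, here is a rough, completely
untested sketch of the two variants I have in mind. qdisc_run_nolock() and
qdisc_run_trylock() are made-up names just for illustration, not existing
helpers; only the primitives they call (test_and_set_bit()/clear_bit(),
spin_trylock()/spin_unlock(), qdisc_lock(), __qdisc_run()) are the real
kernel ones.

/* variant 1: keep the NOLOCK bit, but flip it once around the whole
 * dequeue loop instead of once per dequeue attempt
 */
static inline void qdisc_run_nolock(struct Qdisc *q)
{
	if (test_and_set_bit(__QDISC_STATE_RUNNING, &q->state))
		return;		/* another CPU is already draining this qdisc */

	__qdisc_run(q);		/* dequeue loop, no per-packet bit fiddling */

	clear_bit(__QDISC_STATE_RUNNING, &q->state);
}

/* variant 2: drop the bit entirely and keep the root lock across the
 * whole run; the xmit lock is always acquired after qdisc_lock() here,
 * which is the ordering that looks safe to me - this is exactly the
 * point I'd like to see confirmed or disproved
 */
static inline void qdisc_run_trylock(struct Qdisc *q)
{
	spinlock_t *root_lock = qdisc_lock(q);

	if (!spin_trylock(root_lock))
		return;		/* the current owner will drain the qdisc */

	__qdisc_run(q);

	spin_unlock(root_lock);
}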