From mboxrd@z Thu Jan  1 00:00:00 1970
From: Paolo Abeni <pabeni@redhat.com>
Subject: Re: [net PATCH v2] net: sched, fix OOO packets with pfifo_fast
Date: Thu, 19 Apr 2018 10:00:54 +0200
Message-ID: <1524124854.3160.25.camel@redhat.com>
References: <20180325052505.4098.36912.stgit@john-Precision-Tower-5810>
         <CAM_iQpWNX-9p-bo+caUyJ8yfsNDS1a2pV9LNvHK4=y3ec4qRVw@mail.gmail.com>
         <7f8636e3-c04f-18b6-7e6c-0f28bc54edbb@gmail.com>
         <1524036512.2599.4.camel@redhat.com>
         <36a89ed1-d6ff-ddad-c736-4e68909d61c4@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
        Jiri Pirko <jiri@resnulli.us>,
        David Miller <davem@davemloft.net>,
        Linux Kernel Network Developers <netdev@vger.kernel.org>
To: John Fastabend <john.fastabend@gmail.com>,
        Cong Wang <xiyou.wangcong@gmail.com>
Return-path: <netdev-owner@vger.kernel.org>
Received: from mx3-rdu2.redhat.com ([66.187.233.73]:58456 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
        id S1751010AbeDSIA5 (ORCPT <rfc822;netdev@vger.kernel.org>);
        Thu, 19 Apr 2018 04:00:57 -0400
In-Reply-To: <36a89ed1-d6ff-ddad-c736-4e68909d61c4@gmail.com>
Sender: netdev-owner@vger.kernel.org
List-ID: <netdev.vger.kernel.org>

On Wed, 2018-04-18 at 09:44 -0700, John Fastabend wrote:
> Thanks for bringing this up. I'll think about it for a bit maybe
> there is something we can do here. There is a set of conditions
> that if met we can run without the lock. Possibly ONETXQUEUE and
> aligned cpu_map is sufficient. 

I think you mean "root qdisc is mq and aligned cpu_map": AFAICS we can
have ONETXQUEUE when the root qdisc is e.g. pfifo_fast, which would not
help here.

> We could detect this case and drop
> the locking. For existing systems and high Gbps NICs I think (feel
> free to correct me) assuming a core per cpu is OK. 

I'm sorry, I'm lost. Do you mean "a tx queue per core" instead?

I'm not sure we can assume the above. In my experiments, at least in
some scenarios, it's preferable to configure a limited number of rx/tx
queues, confine BH processing to the related cores and let user-space
processes run on the others, with a many-to-one relationship between the
cores "assigned" to user space and the cores "assigned" to BH
processing.

Can't we somehow try to leverage TCQ_F_CAN_BYPASS even with a NOLOCK
qdisc? I *think* we can avoid the qdisc_run() call after
sch_direct_xmit() in the bypass scenario, and that will avoid the
atomic ops blamed above.
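
To make the idea a bit more concrete, here is a rough user-space toy
model, not kernel code. Every name in it (toy_qdisc, toy_xmit,
TOY_F_CAN_BYPASS, ...) is made up for illustration, and it assumes that
"queue is empty" is a good enough bypass condition. It only shows the
control flow I have in mind: when the bypass is taken, the packet goes
straight to the driver and the dequeue loop is not entered afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag for this sketch; not the kernel's definition. */
#define TOY_F_CAN_BYPASS 0x1u

struct toy_qdisc {
	unsigned int flags;
	unsigned int qlen;      /* packets currently queued */
	unsigned int sent;      /* packets handed to the "driver" */
	unsigned int run_calls; /* times the dequeue loop was entered */
};

/* Stand-in for the dequeue loop: drains everything that is queued. */
static void toy_qdisc_run(struct toy_qdisc *q)
{
	q->run_calls++;
	q->sent += q->qlen;
	q->qlen = 0;
}

/*
 * Stand-in for a direct transmit: on success the packet is sent,
 * otherwise it is requeued and left for a later run.
 */
static bool toy_direct_xmit(struct toy_qdisc *q, bool driver_ok)
{
	if (driver_ok) {
		q->sent++;
		return true;
	}
	q->qlen++;
	return false;
}

/* Transmit one packet, taking the bypass when the queue is empty. */
static void toy_xmit(struct toy_qdisc *q, bool driver_ok)
{
	assert(q != NULL);

	if ((q->flags & TOY_F_CAN_BYPASS) && q->qlen == 0) {
		/* Bypass: no enqueue and no dequeue loop afterwards. */
		toy_direct_xmit(q, driver_ok);
		return;
	}

	/* Slow path: enqueue, then run the dequeue loop. */
	q->qlen++;
	toy_qdisc_run(q);
}
```

In this model an uncontended sender never enters toy_qdisc_run(); the
dequeue loop only runs once something has actually been left in the
queue.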

Cheers,

Paolo