From: John Fastabend <john.fastabend@gmail.com>
To: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>
Subject: Re: "lockless" qdisc breaks tx_queue_len change too?
Date: Wed, 3 Jan 2018 19:03:30 -0800
Message-ID: <9bac96cd-055c-f794-75a6-7f87abf07aa4@gmail.com>
In-Reply-To: <CAM_iQpV2re1q1B+NQJtOPJRE28edpzQMEDot04fxLHdqFe5s1Q@mail.gmail.com>
On 01/03/2018 03:41 PM, Cong Wang wrote:
> On Wed, Jan 3, 2018 at 10:09 AM, John Fastabend
> <john.fastabend@gmail.com> wrote:
>> On 01/02/2018 08:41 PM, Cong Wang wrote:
>>> Hi, John
>>>
>>> While reviewing your ptr_ring fix again today, it looks like your
>>> "lockless" qdisc patchset breaks dev->tx_queue_len behavior.
>>>
>>> Before your patchset, dev->tx_queue_len is merely an integer to read,
>>> after your patchset, the skb array has to be resized when
>>> dev->tx_queue_len changes, but I don't see any qdisc code handles
>>> this...
>>>
>>> Also, because of that, I doubt __skb_array_empty() in
>>> pfifo_fast_dequeue() can be safe any more even with your ptr_ring fix.
>>>
>>> What am I missing?
>>>
>>
>> I dropped support for tx_queue_len changes after qdisc has been
>> created. The only check is at init time when building the qdisc.
>
> This is where it breaks.
>
>
>>
>> Before this series teql and pfifo_fast were the only qdiscs that
>> used tx_queue_len; other qdiscs used other mechanisms or copied
>> tx_queue_len at init time. So the API is inconsistent.
>
> Yeah, pfifo_fast was able to drop based on latest value of tx_queue_len
> before your patchset, this is why I am complaining.
>
Yep, good complaint.
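
To make the behavior change concrete, here is a minimal user-space
sketch (not kernel code). All toy_* names are hypothetical. It assumes
the old path compared the queue length against the device's current
tx_queue_len on each enqueue, while the ring-based path only takes a
snapshot of tx_queue_len when the qdisc is initialized.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct net_device: only the field discussed. */
struct toy_dev {
	unsigned int tx_queue_len;
};

/* Old-style queue: the limit is re-read from the device on every enqueue. */
struct toy_list_q {
	const struct toy_dev *dev;
	unsigned int qlen;
};

static bool toy_list_enqueue(struct toy_list_q *q)
{
	if (q->qlen >= q->dev->tx_queue_len)
		return false;	/* drop: over the current limit */
	q->qlen++;
	return true;
}

/* Ring-style queue: capacity is fixed when the queue is created. */
struct toy_ring_q {
	unsigned int size;
	unsigned int qlen;
};

static void toy_ring_init(struct toy_ring_q *q, const struct toy_dev *dev)
{
	q->size = dev->tx_queue_len;	/* snapshot, never refreshed */
	q->qlen = 0;
}

static bool toy_ring_enqueue(struct toy_ring_q *q)
{
	if (q->qlen >= q->size)
		return false;	/* drop: ring is full */
	q->qlen++;
	return true;
}
```

In this model, raising tx_queue_len after init lets the list-style
queue accept more packets right away, while the ring-style queue keeps
dropping at its original size until something resizes it.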
>
>>
>> OK, but arguably it's kAPI now and needs to be supported on live
>> qdiscs. So a couple of options: drop the __skb_array_empty() check,
>> stop supporting changes on running qdiscs, or do a qdisc swap
>> with the new array.
>
> I don't think we can break the old behavior of tx_queue_len change
> for pfifo_fast, people may already rely on it.
>
Agreed, this is needed for legacy support.
> Doing a swap seems reasonable.
>
>>
>> I'm tempted to make the qdisc swap work, still need benchmarks
>> I guess without the empty check. Either way to get it working
>> we need a callback from tx_queue_len code paths.
>
> Right, probably need a new ops in Qdisc_ops.
>
Maybe instead of a Qdisc op, just do a direct call to avoid
encouraging users to use this code path. Either way is
probably fine; we can just watch any future patches and
have users add a specific attribute for it, like codel.
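
The two options can be sketched side by side in user-space C. This is
an illustration only: the toy_* types and the tx_len_changed member
are hypothetical names, not existing kernel API. The first variant
adds an optional callback to the ops table, the second special-cases
one known qdisc with a direct call.

```c
#include <stddef.h>

struct toy_qdisc;

/* Hypothetical ops table; tx_len_changed is an invented member name. */
struct toy_qdisc_ops {
	const char *id;
	int (*tx_len_changed)(struct toy_qdisc *q, unsigned int new_len);
};

struct toy_qdisc {
	const struct toy_qdisc_ops *ops;
	unsigned int limit;
};

/* Stand-in for the pfifo_fast-specific resize work. */
static int toy_pfifo_fast_resize(struct toy_qdisc *q, unsigned int new_len)
{
	q->limit = new_len;
	return 0;
}

static const struct toy_qdisc_ops toy_pfifo_fast_ops = {
	.id = "pfifo_fast",
	.tx_len_changed = toy_pfifo_fast_resize,
};

/* Option 1: generic callback; any qdisc may opt in by filling the op. */
static int toy_notify_via_ops(struct toy_qdisc *q, unsigned int new_len)
{
	if (!q->ops->tx_len_changed)
		return 0;	/* this qdisc ignores tx_queue_len changes */
	return q->ops->tx_len_changed(q, new_len);
}

/* Option 2: direct call; only the one known qdisc is ever resized. */
static int toy_notify_direct(struct toy_qdisc *q, unsigned int new_len)
{
	if (q->ops != &toy_pfifo_fast_ops)
		return 0;
	return toy_pfifo_fast_resize(q, new_len);
}
```

The direct call keeps the hook out of the generic ops table, so new
qdiscs are not invited to depend on tx_queue_len. The callback is more
uniform but makes the path available to every qdisc.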
>
>>
>> Unfortunately, I guess someone somewhere probably uses pfifo_fast
>> and changes their queue length with a script after creating the
>> qdisc and expects it to work.
>>
>
> This is my concern as well. I will work on some patches, this doesn't
> look trivial to solve at all.
How about a dev_deactivate_many() variant that, instead of replacing
with the noop qdisc, replaces with a new updated qdisc? Seems like
it might work.
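
A rough user-space model of the swap idea, with hypothetical toy_*
names throughout: build a fresh ring at the new size, move the queued
packets across (dropping whatever no longer fits), then switch the
active pointer. It assumes the queue is already quiesced while the
swap runs, which is the part a dev_deactivate_many()-style path would
have to provide in the kernel; no locking or RCU is modeled here.

```c
#include <stdlib.h>

/* Toy fixed-size FIFO of packet ids; stands in for an skb array. */
struct toy_ring {
	unsigned int size, head, len;
	int *slot;
};

static struct toy_ring *toy_ring_alloc(unsigned int size)
{
	struct toy_ring *r = malloc(sizeof(*r));

	if (!r)
		return NULL;
	r->slot = calloc(size ? size : 1, sizeof(*r->slot));
	if (!r->slot) {
		free(r);
		return NULL;
	}
	r->size = size;
	r->head = 0;
	r->len = 0;
	return r;
}

static void toy_ring_free(struct toy_ring *r)
{
	if (r) {
		free(r->slot);
		free(r);
	}
}

static int toy_ring_push(struct toy_ring *r, int pkt)
{
	if (r->len >= r->size)
		return -1;	/* full */
	r->slot[(r->head + r->len) % r->size] = pkt;
	r->len++;
	return 0;
}

static int toy_ring_pop(struct toy_ring *r, int *pkt)
{
	if (r->len == 0)
		return -1;	/* empty */
	*pkt = r->slot[r->head];
	r->head = (r->head + 1) % r->size;
	r->len--;
	return 0;
}

struct toy_txq {
	struct toy_ring *active;
};

/*
 * Swap in a ring of new_size. Returns the number of packets dropped
 * because they did not fit, or -1 if allocation failed (in which case
 * the old ring stays active and untouched).
 */
static int toy_txq_swap(struct toy_txq *txq, unsigned int new_size)
{
	struct toy_ring *fresh = toy_ring_alloc(new_size);
	struct toy_ring *old = txq->active;
	int pkt, dropped = 0;

	if (!fresh)
		return -1;
	while (toy_ring_pop(old, &pkt) == 0)
		if (toy_ring_push(fresh, pkt) != 0)
			dropped++;
	txq->active = fresh;
	toy_ring_free(old);
	return dropped;
}
```

Packets are moved oldest first, so shrinking the queue drops the
newest entries in this model. Whether to drop or requeue the excess is
a policy choice the sketch does not try to settle.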
Thanks,
John
>
> Thanks.
>
Thread overview: 4+ messages
2018-01-03 4:41 "lockless" qdisc breaks tx_queue_len change too? Cong Wang
2018-01-03 18:09 ` John Fastabend
2018-01-03 23:41 ` Cong Wang
2018-01-04 3:03 ` John Fastabend [this message]