Re: [PATCH net-next v2 01/17] net: sched: refactor mini_qdisc_pair_swap() to use workqueue

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Vlad Buslov <vladbu@mellanox.com>
To: Cong Wang <xiyou.wangcong@gmail.com>, Jiri Pirko <jiri@resnulli.us>
Cc: Jiri Pirko <jiri@resnulli.us>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	David Miller <davem@davemloft.net>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>
Subject: Re: [PATCH net-next v2 01/17] net: sched: refactor mini_qdisc_pair_swap() to use workqueue
Date: Mon, 14 Jan 2019 19:00:48 +0000	[thread overview]
Message-ID: <vbfef9fvzc3.fsf@mellanox.com> (raw)
In-Reply-To: <vbf1s6c5mlz.fsf@mellanox.com>


On Thu 20 Dec 2018 at 13:55, Vlad Buslov <vladbu@mellanox.com> wrote:
> On Wed 19 Dec 2018 at 04:27, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> On Mon, Dec 17, 2018 at 2:30 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>>
>>> Sun, Dec 16, 2018 at 07:52:18PM CET, xiyou.wangcong@gmail.com wrote:
>>> >On Sun, Dec 16, 2018 at 8:32 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>>> >>
>>> >> On Thu 13 Dec 2018 at 23:32, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>> >> > On Tue, Dec 11, 2018 at 2:19 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>>> >> >>
>>> >> >> As a part of the effort to remove dependency on rtnl lock, cls API is being
>>> >> >> converted to use fine-grained locking mechanisms instead of global rtnl
>>> >> >> lock. However, chain_head_change callback for ingress Qdisc is a sleeping
>>> >> >> function and cannot be executed while holding a spinlock.
>>> >> >
>>> >> >
>>> >> > Why does it have to be a spinlock not a mutex?
>>> >> >
>>> >> > I've read your cover letter and this changelog, I don't find any
>>> >> > answer.
>>> >>
>>> >> My initial implementation used mutex. However, it was changed to
>>> >> spinlock by Jiri's request during internal review.
>>> >>
>>> >
>>> >So what's the answer to my question? :)
>>>
>>> Yeah, my concern agains mutexes was that it would be needed to have one
>>> per every block and per every chain. I find it quite heavy and I believe
>>> it is better to use spinlock in those cases. This patch is a side effect
>>> of that. Do you think it would be better to have mutexes instead of
>>> spinlocks?
>>
>> My only concern with spinlock is we have to go async as we
>> can't block. This is almost always error-prone especially
>> when dealing with tcf block. I had to give up with spinlock
>> for idrinfo->lock, please take a look at:
>>
>> commit 95278ddaa15cfa23e4a06ee9ed7b6ee0197c500b
>> Author: Cong Wang <xiyou.wangcong@gmail.com>
>> Date:   Tue Oct 2 12:50:19 2018 -0700
>>
>>     net_sched: convert idrinfo->lock from spinlock to a mutex
>>
>>
>> There are indeed some cases in kernel we do take multiple
>> mutex'es, for example,
>>
>> /*
>>  * bd_mutex locking:
>>  *
>>  *  mutex_lock(part->bd_mutex)
>>  *    mutex_lock_nested(whole->bd_mutex, 1)
>>  */
>>
>> So, how heavy are they comparing with spinlocks?
>>
>> Thanks.
>
> Hi Cong,
>
> I did quick-and-dirty mutex-based implementation of my cls API patchset
> (removed async miniqp refactoring, changed block and chain lock types to
> mutex lock) and performed some benchmarks to determine exact mutex
> impact on cls API performance (both rule and chains API).
>
> 1. Parallel flower insertion rate (create 5 million flower filters on
> same chain/prio with 10 tc instances) - same performance:
>
> SPINLOCK
> $ time ls add_sw* | xargs -n 1 -P 100 sudo tc -force -b
>
> real    0m12.183s
> user    0m35.013s
> sys     1m24.947s
>
> MUTEX
> $ time ls add_sw* | xargs -n 1 -P 100 sudo tc -force -b
>
> real    0m12.246s
> user    0m35.417s
> sys     1m25.330s
>
> 2. Parallel flower insertion rate (create 100k flower filters, each on
> new instance of chain/tp, 10 tc instances) - mutex implementation is
> ~17% slower:
>
> SPINLOCK:
> $ time ls add_sw* | xargs -n 1 -P 100 sudo tc -force -b
>
> real    2m19.285s
> user    0m1.863s
> sys     5m41.157s
>
> MUTEX:
> multichain$ time ls add_sw* | xargs -n 1 -P 100 sudo tc -force -b
>
> real    2m43.495s
> user    0m1.664s
> sys     8m32.483s
>
> 3. Chains insertion rate with cls chain API (create 100k empty chains, 1
> tc instance) - ~3% difference:
>
> SPINLOCK:
> $ time sudo tc -b add_chains
>
> real    0m46.822s
> user    0m0.239s
> sys     0m46.437s
>
> MUTEX:
> $ time sudo tc -b add_chains
>
> real    0m48.600s
> user    0m0.230s
> sys     0m48.227s
>
>
> Only case where performance is significantly impacted by mutex is second
> test. This happens because chain lookup is a linear search that is
> performed while holding highly contested block lock. Perf profile for
> mutex version confirms this:
>
> +   91.83%     0.00%  tc               [kernel.vmlinux]            [k] entry_SYSCALL_64_after_hwframe
> +   91.83%     0.00%  tc               [kernel.vmlinux]            [k] do_syscall_64
> +   91.80%     0.00%  tc               libc-2.25.so                [.] __libc_sendmsg
> +   91.78%     0.00%  tc               [kernel.vmlinux]            [k] __sys_sendmsg
> +   91.77%     0.00%  tc               [kernel.vmlinux]            [k] ___sys_sendmsg
> +   91.77%     0.00%  tc               [kernel.vmlinux]            [k] sock_sendmsg
> +   91.77%     0.00%  tc               [kernel.vmlinux]            [k] netlink_sendmsg
> +   91.77%     0.01%  tc               [kernel.vmlinux]            [k] netlink_unicast
> +   91.77%     0.00%  tc               [kernel.vmlinux]            [k] netlink_rcv_skb
> +   91.71%     0.01%  tc               [kernel.vmlinux]            [k] rtnetlink_rcv_msg
> +   91.69%     0.03%  tc               [kernel.vmlinux]            [k] tc_new_tfilter
> +   90.96%    26.45%  tc               [kernel.vmlinux]            [k] __tcf_chain_get
> +   64.30%     0.01%  tc               [kernel.vmlinux]            [k] __mutex_lock.isra.7
> +   39.36%    39.18%  tc               [kernel.vmlinux]            [k] osq_lock
> +   24.92%    24.81%  tc               [kernel.vmlinux]            [k] mutex_spin_on_owner
>
> Based on these results we can conclude that use-cases with significant
> amount of chains on single block will be affected by using mutex in cls
> API.
>
> Regards,
> Vlad

Hi Cong,

Any comments regarding benchmark results? Are you okay with
spinlock-based implementation?

Thanks,
Vlad

next prev parent reply	other threads:[~2019-01-14 19:00 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-11 10:11 [PATCH net-next v2 00/17] Refactor classifier API to work with chain/classifiers without rtnl lock Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 01/17] net: sched: refactor mini_qdisc_pair_swap() to use workqueue Vlad Buslov
2018-12-12  3:23   ` kbuild test robot
2018-12-13 23:32   ` Cong Wang
2018-12-16 16:32     ` Vlad Buslov
2018-12-16 18:52       ` Cong Wang
2018-12-17 10:22         ` Jiri Pirko
2018-12-19  4:27           ` Cong Wang
2018-12-20 13:55             ` Vlad Buslov
2019-01-14 19:00               ` Vlad Buslov [this message]
2019-01-14 23:00                 ` Cong Wang
2018-12-19  8:12   ` kbuild test robot
2018-12-11 10:11 ` [PATCH net-next v2 02/17] net: sched: protect block state with spinlock Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 03/17] net: sched: refactor tc_ctl_chain() to use block->lock Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 04/17] net: sched: protect block->chain0 with block->lock Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 05/17] net: sched: traverse chains in block with tcf_get_next_chain() Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 06/17] net: sched: protect chain template accesses with block lock Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 07/17] net: sched: lock the chain when accessing filter_chain list Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 08/17] net: sched: introduce reference counting for tcf proto Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 09/17] net: sched: traverse classifiers in chain with tcf_get_next_proto() Vlad Buslov
2018-12-14 10:15   ` kbuild test robot
2018-12-11 10:11 ` [PATCH net-next v2 10/17] net: sched: refactor tp insert/delete for concurrent execution Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 11/17] net: sched: prevent insertion of new classifiers during chain flush Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 12/17] net: sched: track rtnl lock status when validating extensions Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 13/17] net: sched: extend proto ops with 'put' callback Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 14/17] net: sched: extend proto ops to support unlocked classifiers Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 15/17] net: sched: add flags to Qdisc class ops struct Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 16/17] net: sched: refactor tcf_block_find() into standalone functions Vlad Buslov
2018-12-11 10:12 ` [PATCH net-next v2 17/17] net: sched: unlock rules update API Vlad Buslov
2018-12-20  0:15   ` kbuild test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=vbfef9fvzc3.fsf@mellanox.com \
    --to=vladbu@mellanox.com \
    --cc=ast@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=jhs@mojatatu.com \
    --cc=jiri@resnulli.us \
    --cc=netdev@vger.kernel.org \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.