From: Vlad Buslov <vlad@buslov.dev>
To: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>,
netdev@vger.kernel.org, Cong Wang <cong.wang@bytedance.com>,
syzbot+82752bc5331601cf4899@syzkaller.appspotmail.com,
syzbot+b3b63b6bff456bd95294@syzkaller.appspotmail.com,
syzbot+ba67b12b1ca729912834@syzkaller.appspotmail.com,
Jiri Pirko <jiri@resnulli.us>,
Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>,
Davide Caratti <dcaratti@redhat.com>,
Briana Oursler <briana.oursler@gmail.com>
Subject: Re: [Patch net-next] net_sched: fix RTNL deadlock again caused by request_module()
Date: Mon, 18 Jan 2021 16:31:14 +0200 [thread overview]
Message-ID: <87im7u9v0t.fsf@buslov.dev> (raw)
In-Reply-To: <acba35f6-2e29-903f-6eb8-a50dde25a147@mojatatu.com>
On Sun 17 Jan 2021 at 17:15, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2021-01-16 7:56 p.m., Cong Wang wrote:
>> From: Cong Wang <cong.wang@bytedance.com>
>> tcf_action_init_1() loads tc action modules automatically with
>> request_module() after parsing the tc action names, and it drops RTNL
>> lock and re-holds it before and after request_module(). This causes a
>> lot of troubles, as discovered by syzbot, because we can be in the
>> middle of batch initializations when we create an array of tc actions.
>> One of the problem is deadlock:
>> CPU 0 CPU 1
>> rtnl_lock();
>> for (...) {
>> tcf_action_init_1();
>> -> rtnl_unlock();
>> -> request_module();
>> rtnl_lock();
>> for (...) {
>> tcf_action_init_1();
>> -> tcf_idr_check_alloc();
>> // Insert one action into idr,
>> // but it is not committed until
>> // tcf_idr_insert_many(), then drop
>> // the RTNL lock in the _next_
>> // iteration
>> -> rtnl_unlock();
>> -> rtnl_lock();
>> -> a_o->init();
>> -> tcf_idr_check_alloc();
>> // Now waiting for the same index
>> // to be committed
>> -> request_module();
>> -> rtnl_lock()
>> // Now waiting for RTNL lock
>> }
>> rtnl_unlock();
>> }
>> rtnl_unlock();
>> This is not easy to solve, we can move the request_module() before
>> this loop and pre-load all the modules we need for this netlink
>> message and then do the rest initializations. So the loop breaks down
>> to two now:
>> for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
>> struct tc_action_ops *a_o;
>> a_o = tc_action_load_ops(name, tb[i]...);
>> ops[i - 1] = a_o;
>> }
>> for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
>> act = tcf_action_init_1(ops[i - 1]...);
>> }
>> Although this looks serious, it only has been reported by syzbot, so it
>> seems hard to trigger this by humans. And given the size of this patch,
>> I'd suggest to make it to net-next and not to backport to stable.
>> This patch has been tested by syzbot and tested with tdc.py by me.
>>
>
> LGTM.
> Initially i was worried about performance impact but i found nothing
> observable. We need to add a tdc test for batch (I can share how i did
> batch testing at next meet).
>
> Tested-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
>
> cheers,
> jamal
Hi,
Thanks for adding me to the thread!
I ran our performance tests with the patch applied and didn't observe
any regression.
Regards,
Vlad
next prev parent reply other threads:[~2021-01-18 14:32 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-01-17 0:56 [Patch net-next] net_sched: fix RTNL deadlock again caused by request_module() Cong Wang
2021-01-17 15:15 ` Jamal Hadi Salim
2021-01-18 14:31 ` Vlad Buslov [this message]
2021-01-19 4:30 ` patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87im7u9v0t.fsf@buslov.dev \
--to=vlad@buslov.dev \
--cc=briana.oursler@gmail.com \
--cc=cong.wang@bytedance.com \
--cc=dcaratti@redhat.com \
--cc=jhs@mojatatu.com \
--cc=jiri@resnulli.us \
--cc=marcelo.leitner@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=syzbot+82752bc5331601cf4899@syzkaller.appspotmail.com \
--cc=syzbot+b3b63b6bff456bd95294@syzkaller.appspotmail.com \
--cc=syzbot+ba67b12b1ca729912834@syzkaller.appspotmail.com \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.