From: Daniel Borkmann <daniel@iogearbox.net>
To: Shahar Klein <shahark@mellanox.com>, davem@davemloft.net
Cc: xiyou.wangcong@gmail.com, gerlitz.or@gmail.com,
roid@mellanox.com, jiri@mellanox.com, john.fastabend@gmail.com,
netdev@vger.kernel.org
Subject: Re: [PATCH net] net, sched: fix soft lockup in tc_classify
Date: Fri, 23 Dec 2016 00:20:18 +0100
Message-ID: <585C5FB2.2000909@iogearbox.net>
In-Reply-To: <7033ed3d-0665-1a38-7631-a1346d46e1b4@mellanox.com>
On 12/22/2016 02:16 PM, Shahar Klein wrote:
> On 12/21/2016 7:04 PM, Daniel Borkmann wrote:
>> Shahar reported a soft lockup in tc_classify(), where we run into an
>> endless loop when walking the classifier chain due to tp->next == tp
>> which is a state we should never run into. The issue only seems to
>> trigger under load in the tc control path.
>>
>> What happens is that in tc_ctl_tfilter(), thread A allocates a new
>> tp, initializes it, sets tp_created to 1, and calls into tp->ops->change()
>> with it. In that classifier callback we had to unlock/lock the rtnl
>> mutex and returned with -EAGAIN. One reason why we need to drop there
>> is, for example, that we need to request an action module to be loaded.
>>
>> This happens via tcf_exts_validate() -> tcf_action_init/_1() meaning
>> after we loaded and found the requested action, we need to redo the
>> whole request so we don't race against others. While we had to unlock
>> rtnl in that time, thread B's request was processed next on that CPU.
>> Thread B added a new tp instance successfully to the classifier chain.
>> When thread A returned grabbing the rtnl mutex again, propagating -EAGAIN
>> and destroying its tp instance which never got linked, we goto replay
>> and redo A's request.
>>
>> This time when walking the classifier chain in tc_ctl_tfilter() for
>> checking for existing tp instances we had a priority match and found
>> the tp instance that was created and linked by thread B. Now calling
>> again into tp->ops->change() with that tp was successful and returned
>> without error.
>>
>> tp_created was never cleared in the second round, thus kernel thinks
>> that we need to link it into the classifier chain (once again). tp and
>> *back point to the same object due to the match we had earlier on. Thus
>> for thread B's already public tp, we reset tp->next to tp itself and
>> link it into the chain, which eventually causes the mentioned endless
>> loop in tc_classify() once a packet hits the data path.
>>
>> Fix is to clear tp_created at the beginning of each request, also when
>> we replay it. On the paths that can cause -EAGAIN we already destroy
>> the original tp instance we had and on replay we really need to start
>> from scratch. It seems that this issue was first introduced in commit
>> 12186be7d2e1 ("net_cls: fix unconfigured struct tcf_proto keeps chaining
>> and avoid kernel panic when we use cls_cgroup").
>>
>> Fixes: 12186be7d2e1 ("net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel panic when we use cls_cgroup")
>> Reported-by: Shahar Klein <shahark@mellanox.com>
>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>> Cc: Cong Wang <xiyou.wangcong@gmail.com>
[...]
> I've tested this specific patch (without cong's patch and without the many prints) many times and in many test permutations today and it didn't lock
>
> Thanks again Daniel!
Just catching up with email (traveling the whole day), thanks a bunch for
your effort, Shahar! In that case, let's add:
Tested-by: Shahar Klein <shahark@mellanox.com>
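
For readers following the quoted commit message, the replay race can be
modelled in user space. This is only a minimal sketch under stated
assumptions: struct tp, thread_a_request(), chain_loops() and
scenario_loops() are hypothetical stand-ins and not the kernel's
struct tcf_proto or tc_ctl_tfilter(); locking, ->change() and the
-EAGAIN path are reduced to a single simulated "first attempt" during
which thread B links its tp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct tcf_proto. */
struct tp {
	int prio;
	struct tp *next;
};

/* Bounded walk: reports a cycle instead of spinning forever the way
 * the classifier walk would on tp->next == tp. */
static bool chain_loops(const struct tp *head, int max_steps)
{
	for (const struct tp *p = head; p != NULL; p = p->next)
		if (max_steps-- == 0)
			return true;
	return false;
}

/* Simplified model of thread A's request. a_tp is the storage A
 * "allocates"; b_tp is what thread B links while A has dropped the
 * lock. clear_on_replay selects the fixed behaviour. */
static void thread_a_request(struct tp **chain, int prio,
			     struct tp *a_tp, struct tp *b_tp,
			     bool clear_on_replay)
{
	struct tp **back, *tp;
	bool first_attempt = true;
	int tp_created = 0;

	assert(chain != NULL && a_tp != NULL && b_tp != NULL);

replay:
	if (clear_on_replay)
		tp_created = 0;	/* the fix: each attempt starts from scratch */

	/* Look for an existing tp with the same priority. */
	for (back = chain; (tp = *back) != NULL; back = &tp->next)
		if (tp->prio == prio)
			break;

	if (tp == NULL) {
		/* Nothing found: create a new, not yet linked tp. */
		tp = a_tp;
		tp->prio = prio;
		tp->next = NULL;
		tp_created = 1;
	}

	if (first_attempt) {
		first_attempt = false;
		/* Lock dropped: thread B links its own tp meanwhile. */
		b_tp->prio = prio;
		b_tp->next = *chain;
		*chain = b_tp;
		/* -EAGAIN: A's unlinked tp is destroyed (a no-op in this
		 * model) and the whole request is replayed. */
		goto replay;
	}

	if (tp_created) {
		/* With a stale tp_created, tp == *back here, so this
		 * sets tp->next = tp. */
		tp->next = *back;
		*back = tp;
	}
}

/* Runs the scenario once and reports whether the chain ends up cyclic. */
static bool scenario_loops(bool clear_on_replay)
{
	struct tp a_tp = {0}, b_tp = {0};
	struct tp *chain = NULL;

	thread_a_request(&chain, 10, &a_tp, &b_tp, clear_on_replay);
	return chain_loops(chain, 16);
}
```

In this model, scenario_loops(false) leaves thread B's tp pointing at
itself, while scenario_loops(true) leaves a single, properly terminated
entry in the chain.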