netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Daniel Borkmann <daniel@iogearbox.net>
To: Cong Wang <xiyou.wangcong@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	Shahar Klein <shahark@mellanox.com>,
	Or Gerlitz <gerlitz.or@gmail.com>, Roi Dayan <roid@mellanox.com>,
	Jiri Pirko <jiri@mellanox.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>
Subject: Re: [PATCH net] net, sched: fix soft lockup in tc_classify
Date: Sat, 24 Dec 2016 22:03:43 +0100	[thread overview]
Message-ID: <585EE2AF.2000100@iogearbox.net> (raw)
In-Reply-To: <CAM_iQpX2X-WHbf1VxfQzh_-YUEqk=o6B+uYfYhj_45jJGaFSfQ@mail.gmail.com>

On 12/24/2016 08:34 AM, Cong Wang wrote:
> On Thu, Dec 22, 2016 at 4:26 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 12/22/2016 08:05 PM, Cong Wang wrote:
>>> On Wed, Dec 21, 2016 at 1:07 PM, Daniel Borkmann <daniel@iogearbox.net>
>>> wrote:
>>>>
>>>> Ok, you mean for net. In that case I prefer the smaller sized fix to be
>>>> honest. It also covers everything from the point where we fetch the chain
>>>> via cops->tcf_chain() to the end of the function, which is where most of
>>>> the complexity resides, and only the two mentioned commits do the relock,
>>>
>>> I really wish the problem is only about relocking, but look at the code,
>>> the deeper reason why we have this bug is the complexity of the logic
>>> inside tc_ctl_tfilter(): 1) the replay logic is hard, we have to make it
>>> idempotent; 2) the request logic itself is hard, because of tc filter
>>> design
>>> and implementation.
>>>
>>> This is why I worry more than just relocking.
>>
>> But do you have a concrete 2nd issue/bug you're seeing? It rather sounds to
>> me your argument is more about fear of complexity on tc framework itself.
>> I agree it's complex, and tc_ctl_tfilter() is quite big in itself, where it
>> would be good to reduce it's complexity into smaller pieces. But it's not
>> really related to the fix itself, reducing complexity requires significantly
>> more and deeper work on the code. We can rework tc_ctl_tfilter() in net-next
>> to try to simplify it, sure, but I don't get why we have to discuss so much
>> on this matter in this context, really.
>
> Thanks for ignoring my point 1) above... You are dragging the discussion
> further.

I don't think so. The analysis and patch I proposed provides an explanation
of how we get into the seen endless loop, it provides a logical fix for it,
which has been reviewed by others and it has been tested extensively that it
resolves the issue, which was easily reproducible for the reporter and that
after the fix it never occurred again. The delta is absolutely simple and
really low risk. Given this function has not much changed over time, also
distros could pick it up that have a much older base kernel than current
stable ones. This initiated follow-up discussion we're having here in general
is dragging the focus away for everyone, and quite frankly I'm getting tired
of discussing it. I have stated my preferences, you have stated yours, and
we're only repeating ourselves in circles which isn't helpful in any way,
the discussion is not about some concrete bug in the logic to fix anymore
(otherwise please name it). Hence my proposal that everything else can wait
and be done in net-next.

  reply	other threads:[~2016-12-24 21:03 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-21 17:04 [PATCH net] net, sched: fix soft lockup in tc_classify Daniel Borkmann
2016-12-21 17:37 ` Eric Dumazet
2016-12-21 18:51 ` Cong Wang
2016-12-21 19:10   ` Cong Wang
2016-12-21 20:02     ` Daniel Borkmann
2016-12-21 20:47       ` Cong Wang
2016-12-21 21:07         ` Daniel Borkmann
2016-12-22 16:53           ` David Miller
2016-12-22 17:50             ` John Fastabend
2016-12-22 23:21               ` Daniel Borkmann
2016-12-22 19:05           ` Cong Wang
2016-12-23  0:26             ` Daniel Borkmann
2016-12-24  7:34               ` Cong Wang
2016-12-24 21:03                 ` Daniel Borkmann [this message]
2016-12-21 19:16   ` Daniel Borkmann
2016-12-22 13:16 ` Shahar Klein
2016-12-22 23:20   ` Daniel Borkmann
2016-12-26 16:24 ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=585EE2AF.2000100@iogearbox.net \
    --to=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=gerlitz.or@gmail.com \
    --cc=jiri@mellanox.com \
    --cc=john.fastabend@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=roid@mellanox.com \
    --cc=shahark@mellanox.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).