From: Krister Johansen <kjlx@templeofstupid.com>
To: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Krister Johansen <kjlx@templeofstupid.com>,
Jamal Hadi Salim <jhs@mojatatu.com>,
Linux Kernel Network Developers <netdev@vger.kernel.org>
Subject: Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.
Date: Wed, 5 Oct 2016 23:11:50 -0700 [thread overview]
Message-ID: <20161006061150.GA2525@templeofstupid.com> (raw)
In-Reply-To: <CAM_iQpUya1ORUUNC9-spLKHjsjNyj6xHO79kvoDGaZcWT_mEwg@mail.gmail.com>
On Wed, Oct 05, 2016 at 11:01:38AM -0700, Cong Wang wrote:
> On Tue, Oct 4, 2016 at 11:52 PM, Krister Johansen
> <kjlx@templeofstupid.com> wrote:
> > On Mon, Oct 03, 2016 at 11:22:33AM -0700, Cong Wang wrote:
> >> Please try the attached patch. I also convert the read path to RCU
> >> to avoid a possible deadlock. A quick test shows no lockdep splat.
> >
> > I tried this patch, but it doesn't solve the problem. I got a panic on
> > my very first try:
>
> Thanks for testing it.
Absolutely; thanks for helping to try to simplify this fix.
> > The problem here is the same as before: by using RCU the race isn't
> > fixed because the module is still discoverable from act_base before the
> > pernet initialization is completed.
> >
> > You can see from the trap frame that the first two arguments to
> > tcf_hash_check were 0. It couldn't look up the correct per-subsystem
> > pointer because the id hadn't yet been registered.
>
> I thought the problem is that we don't do pernet ops registration and
> action ops registration atomically therefore chose to use mutex+RCU,
> but I was wrong, the problem here is just ordering, we need to finish
> the pernet initialization before making action ops visible.
>
> If so, why not just reorder them? Does the attached patch make any
> sense now? Our pernet init doesn't rely on act_base, so even we have
> some race, the worst case is after we initialize the pernet netns for an
> action but its ops still not visible, which seems fine (at least no crash).
>
> Or I still miss something here?
I'm not sure. The reason I didn't take this approach from the outset is
that all of TC's callers of tcf_register_action pass a pointer to a
static structure as their *ops argument. The existence of code that
checks the action for uniqueness suggests that it's possible for
tcf_register_action to get passed two identical tc_action_ops. If that
happens in the current code base, we'll also get passed a duplicate
pernet_operations pointer. The code in register_pernet_subsys() makes
no attempt to check for duplicates. If we add a pointer that's already
in the list, and subsequently call unregister, the results seem
undefined. It looks like we'll remove the pernet_operations for the
existing action, assuming we don't corrupt the list in the process.
Is this actually safe? If so, what corner case is the act->type /
act->kind protecting us from?
> (Sorry that I don't have the environment to reproduce your bug)
I'm sorry that I didn't do a good job of explaining how we end up in
this situation in the first place. I can give a few more details,
because it may explain some of my concern about the request_module()
call.
The system that encounters this bug launches a bunch of containers from
systemd on boot. Each container creates a new user, net, pid, and mount
namespace and begins its setup. When the networking in all of these
containers, each in a new netns, try to configure TC and no modules are
loaded we end up with this race.
I can also reproduce by unloading the modules, and then launching a
bunch of processes that configure tc in new namespaces.
Part of the desire to inhibit extra modprobe calls is that if hundreds
of these all start at once on boot, it's really unnecessary to have all
of the rest of them wait while lots of extra modprobe calls are forked
by the kernel.
> Thanks for your patience and testing!
Thank you for taking the time to look through the fix and discuss
alternatives.
-K
next prev parent reply other threads:[~2016-10-06 6:11 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-10-02 3:13 [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action Krister Johansen
2016-10-03 1:18 ` Jamal Hadi Salim
2016-10-04 6:38 ` Krister Johansen
2016-10-03 18:22 ` Cong Wang
2016-10-04 6:39 ` Krister Johansen
2016-10-05 6:52 ` Krister Johansen
2016-10-05 18:01 ` Cong Wang
2016-10-05 18:07 ` Cong Wang
2016-10-06 6:11 ` Krister Johansen [this message]
2016-10-06 19:01 ` Cong Wang
2016-10-09 6:13 ` Krister Johansen
2016-10-11 17:36 ` Cong Wang
2016-10-11 9:28 ` Krister Johansen
2016-10-11 17:51 ` Cong Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161006061150.GA2525@templeofstupid.com \
--to=kjlx@templeofstupid.com \
--cc=jhs@mojatatu.com \
--cc=netdev@vger.kernel.org \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.