From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vlad Yasevich Subject: Re: [PATCH net] sctp: fix race on protocol/netns initialization Date: Thu, 10 Sep 2015 15:14:41 -0400 Message-ID: <55F1D6A1.6060605@gmail.com> References: <27f20cc781cac28bc2ce7229719748a2f6824bd8.1441827818.git.marcelo.leitner@gmail.com> <55F096EB.7000204@gmail.com> <55F09F57.6080602@gmail.com> <55F1847A.9010801@gmail.com> <55F19243.3010006@gmail.com> <55F1A6AE.2010804@gmail.com> <20150910162454.GB4496@localhost.localdomain> <20150910183520.GD4496@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Cc: netdev@vger.kernel.org, Neil Horman , linux-sctp@vger.kernel.org To: Marcelo Ricardo Leitner Return-path: Received: from mail-qg0-f46.google.com ([209.85.192.46]:33870 "EHLO mail-qg0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752365AbbIJTOo (ORCPT ); Thu, 10 Sep 2015 15:14:44 -0400 In-Reply-To: <20150910183520.GD4496@localhost.localdomain> Sender: netdev-owner@vger.kernel.org List-ID: On 09/10/2015 02:35 PM, Marcelo Ricardo Leitner wrote: > On Thu, Sep 10, 2015 at 01:24:54PM -0300, Marcelo Ricardo Leitner wrote: >> On Thu, Sep 10, 2015 at 11:50:06AM -0400, Vlad Yasevich wrote: >>> On 09/10/2015 10:22 AM, Marcelo Ricardo Leitner wrote: >>>> Em 10-09-2015 10:24, Vlad Yasevich escreveu: >> ... >>>>> Then you can order sctp_net_init() such that it happens first, then protosw registration >>>>> happens, then control socket initialization happens, then inet protocol registration >>>>> happens. >>>>> >>>>> This way, we are always guaranteed that by the time user calls socket(), protocol >>>>> defaults are fully initialized. >>>> >>>> Okay, that works for module loading stage, but then how would we handle new netns's? We >>>> have to create the control socket per netns and AFAICT sctp_net_init() is the only hook >>>> called when a new netns is being created. >>>> >>>> Then if we move it a workqueue that is scheduled by sctp_net_init(), we loose the ability >>>> to handle its errors by propagating through sctp_net_init() return value, not good. >>> >>> Here is kind of what I had in mind. It's incomplete and completely untested (not even >>> compiled), but good enough to describe the idea: >> ... >> >> Ohh, ok now I get it, thanks. If having two pernet_subsys for a given >> module is fine, that works for me. It's clearer and has no moment of >> temporary failure. >> >> I can finish this patch if everybody agrees with it. >> >>>>>> I used the list pointer because that's null as that memory is entirely zeroed when alloced >>>>>> and, after initialization, it's never null again. Works like a lock/condition without >>>>>> using an extra field. >>>>>> >>>>> >>>>> I understand this a well. What I don't particularly like is that we are re-using >>>>> a list without really stating why it's now done this way. Additionally, it's not really >>>>> the last that happens so it's seems kind of hacky... If we need to add new >>>>> per-net initializers, we now need to make sure that the code is put in the right >>>>> place. I'd just really like to have a cleaner solution... >>>> >>>> Ok, got you. We could add a dedicated flag/bit for that then, if reusing the list is not >>>> clear enough. Or, as we are discussing on the other part of thread, we could make it block >>>> and wait for the initialization, probably using some wait_queue. I'm still thinking on >>>> something this way, likely something more below than sctp then. >>>> >>> >>> I think if we don the above, the second process calling socket() will either find the >>> the protosw or will try to load the module also. I think either is ok after >>> request_module returns we'll look at the protosw and will find find things. >> >> Seems so, yes. Nice. > > I was testing with it, something is not good. I finished your patch and > testing with a flooder like: > # for j in {1..5}; do for i in {1234..1280}; do \ > sctp_darn -H 192.168.122.147 -P $j$i -l & done & done > ... snip... > > It seems that request_module will not serialize it as we wanted and we > would be putting unexpected pressure on it, yet it fixes the original > issue. So, wouldn't the same issue exist when running the above with DCCP sockets? > Maybe we can place a semaphore at inet_create(), protecting the > request_module()s so only one socket can do it at a time and, after it > is released, whoever was blocked on it re-checks if the module isn't > already loaded before attempting again. It makes the loading of > different modules slower, though, but I'm not sure if that's really a > problem. Not many modules are loaded at the same time like that. What do > you think? I think this is a different issue. The fact that we keep trying to probe the same module is silly. May be a per proto semaphore so that SCTP doesn't block DCCP for example. -vlad > > Marcelo >