From mboxrd@z Thu Jan 1 00:00:00 1970
From: Heiko Carstens
Subject: Re: [patch] ipv4: initialize arp_tbl rw lock
Date: Mon, 24 Apr 2006 12:18:52 +0200
Message-ID: <20060424101852.GA16007@osiris.boeblingen.de.ibm.com>
References: <20060408100213.GA9412@osiris.boeblingen.de.ibm.com> <20060408.031404.111884281.davem@davemloft.net> <20060415072745.GA17011@osiris.boeblingen.de.ibm.com> <20060415.003457.103031290.davem@davemloft.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: shemminger@osdl.org, jgarzik@pobox.com, akpm@osdl.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, fpavlic@de.ibm.com, davem@sunset.davemloft.net
Return-path:
To: "David S. Miller"
Content-Disposition: inline
In-Reply-To: <20060415.003457.103031290.davem@davemloft.net>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

> > Tried to figure out what is causing the delays I experienced when I replaced
> > module_init() in af_inet.c with fs_initcall(). After all it turned out that
> > synchronize_net() which is basically nothing else than synchronize_rcu()
> > sometimes takes several seconds to complete?! No idea why that is...
> >
> > callchain: inet_init() -> inet_register_protosw() -> synchronize_net()
>
> The problem can't be rcu_init(), that gets done very early
> in init/main.c
>
> Maybe it's some timer or something else specific to s390?
>
> It could also be that there's perhaps nothing to context
> switch to, thus the RCU takes forever to "happen".

Yes, it's more or less s390 specific. What happens is the following:
synchronize_rcu() enqueues an RCU callback on cpu 0. Later on cpu 0
handles a bunch of RCU batches, but without handling this specific
request (it's in rdp->curlist). Since this cpu has nothing else to do
it enters cpu_idle() (it's a nohz idle, therefore it may stay in idle
state for quite a long time).
While cpu 0 is in idle state cpu 2 calls cpu_quiet() which in turn will
call rcu_start_batch().
If cpu 0 ran now, it would notice that rdp->curlist moved to
rdp->donelist and that there is something to do. Unfortunately it
doesn't get notified by rcu_start_batch(). That's why I ended up
waiting several seconds until finally some interrupt arrived at cpu 0,
which made things go on.

This could be avoided by looking at rdp->curlist before going into a
nohz idle wait, or by sending an interprocessor interrupt to idle cpus.
Sending an interrupt while looking only at nohz_cpu_mask seems to be a
bit racy, since other cpus might have entered cpu idle after
nohz_cpu_mask has been read...

At least the initcall change for inet_init() can go in, since it just
revealed a problem that we have anyway.
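
To make the stall and the first option (looking at rdp->curlist before
going into a nohz idle wait) a bit more concrete, here is a rough
user-space toy model. It is not kernel code and not a patch: only the
names curlist and donelist are taken from the description above, and
everything prefixed toy_ (including the tick-based simulation itself)
is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only. All toy_* names are hypothetical, not kernel API. */
struct toy_cpu {
	int  curlist;    /* callbacks waiting for grace period 'batch' */
	int  donelist;   /* callbacks whose grace period has ended */
	long batch;      /* grace period number curlist is waiting for */
	bool nohz_idle;  /* true: cpu sleeps without a periodic tick */
};

static long toy_completed;  /* number of the last completed grace period */

/* What the cpu does whenever it actually runs: move finished callbacks
 * from curlist to donelist and invoke them. Returns number invoked. */
static int toy_process(struct toy_cpu *c)
{
	int invoked;

	if (c->curlist > 0 && toy_completed >= c->batch) {
		c->donelist += c->curlist;
		c->curlist = 0;
	}
	invoked = c->donelist;
	c->donelist = 0;
	return invoked;
}

/* Idle entry. With check_curlist set, the cpu refuses to enter nohz
 * idle while callbacks are still queued (the first option above). */
static void toy_enter_idle(struct toy_cpu *c, bool check_curlist)
{
	c->nohz_idle = !(check_curlist && c->curlist > 0);
}

/* Simulate cpu 0 with one queued callback. Another cpu ends the grace
 * period at gp_end_tick; an unrelated interrupt hits cpu 0 at irq_tick.
 * Returns the tick at which the callback is invoked, or -1 if never. */
int toy_simulate(bool check_curlist, int gp_end_tick, int irq_tick,
		 int max_ticks)
{
	struct toy_cpu cpu0 = { .curlist = 1, .batch = 1 };
	int t;

	assert(max_ticks > 0);
	toy_completed = 0;
	toy_enter_idle(&cpu0, check_curlist);

	for (t = 1; t <= max_ticks; t++) {
		if (t == gp_end_tick)
			toy_completed = 1;       /* cpu 0 is not notified */
		if (t == irq_tick)
			cpu0.nohz_idle = false;  /* interrupt wakes cpu 0 */
		if (!cpu0.nohz_idle) {
			if (toy_process(&cpu0) > 0)
				return t;
			toy_enter_idle(&cpu0, check_curlist);
		}
	}
	return -1;
}
```

In this model toy_simulate(false, 3, 500, 1000) returns 500: the grace
period is over at tick 3, but the callback only runs once the unrelated
interrupt arrives. toy_simulate(true, 3, 500, 1000) returns 3, since
cpu 0 never goes into nohz idle while curlist is non-empty. The model
says nothing about the interprocessor interrupt variant or the
nohz_cpu_mask race.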