From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1760572AbYDNPqx (ORCPT ); Mon, 14 Apr 2008 11:46:53 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1753892AbYDNPqq (ORCPT ); Mon, 14 Apr 2008 11:46:46 -0400 Received: from E23SMTP05.au.ibm.com ([202.81.18.174]:48563 "EHLO e23smtp05.au.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752603AbYDNPqp (ORCPT ); Mon, 14 Apr 2008 11:46:45 -0400 Date: Mon, 14 Apr 2008 21:16:42 +0530 From: Gautham R Shenoy To: Heiko Carstens Cc: Peter Zijlstra , Miles Lane , LKML , "Rafael J. Wysocki" , Ingo Molnar Subject: Re: 2.6.25-rc9 -- INFO: possible circular locking dependency detected Message-ID: <20080414154642.GA30018@in.ibm.com> Reply-To: ego@in.ibm.com References: <1208156045.7427.16.camel@twins> <1208174767.7427.60.camel@twins> <20080414122741.GB7416@in.ibm.com> <1208176947.7427.68.camel@twins> <20080414144801.GC20671@in.ibm.com> <20080414151946.GA17995@osiris.boeblingen.de.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080414151946.GA17995@osiris.boeblingen.de.ibm.com> User-Agent: Mutt/1.5.15+20070412 (2007-04-11) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Apr 14, 2008 at 05:19:46PM +0200, Heiko Carstens wrote: > On Mon, Apr 14, 2008 at 08:18:01PM +0530, Gautham R Shenoy wrote: > > On Mon, Apr 14, 2008 at 02:42:27PM +0200, Peter Zijlstra wrote: > > While you're fixing the cpu hotplug stuff anyway, there's still a bug > present in a few modules init code: > > Usually they do something like: > > register_hotcpu_notifier(...); > for_each_online_cpu(i) > ... > > A module's init functions gets called from sys_init_module and there is nothing > that would protect from cpu hotplug. > Therefore the sequence of for_each_online_cpu() and register_hotcpu_notifier() > better should be protected by a surrounding get/put_online_cpus() like this: > > get_online_cpus(); > register_hotcpu_notifier(...); > for_each_online_cpu(i) > ... > put_online_cpus(); But shouldn't this be: register_hotcpu_notifier(...); get_online_cpus(); for_each_online_cpus() ... put_online_cpus(); What's the problem with this ordering? > > But as far as I can see that can lead to a deadlock if e.g. cpu 0 would > execute the code above whild cpu 1 is executing some cpu hotplug code: > > cpu0: get_online_cpus() > -> increase cpu_hotplug.refcount > cpu1: cpu_down() > -> cpu_maps_update_begin() > -> grab cpu_add_remove_lock > -> wait for cpu_hotplug.refcount to drop to zero again > cpu0: register_hotcpu_notifier() > -> cpu_maps_update_begin > -> tries to grab cpu_add_remove_lock that cpu 1 holds already > -> dead -- Thanks and Regards gautham