From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754558AbcBPItl (ORCPT ); Tue, 16 Feb 2016 03:49:41 -0500 Received: from mga11.intel.com ([192.55.52.93]:56900 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754291AbcBPItk (ORCPT ); Tue, 16 Feb 2016 03:49:40 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.22,454,1449561600"; d="scan'208";a="915834851" Message-ID: <1455612576.4977.11.camel@linux.intel.com> Subject: Re: [PATCH] [RFC] kernel/cpu: Use lockref for online CPU reference counting From: Joonas Lahtinen To: Peter Zijlstra Cc: Intel graphics driver community testing & development , Linux kernel development , Ingo Molnar , David Hildenbrand , "Paul E. McKenney" , "Gautham R. Shenoy" , Chris Wilson , Daniel Vetter Date: Tue, 16 Feb 2016 10:49:36 +0200 In-Reply-To: <20160215170618.GL6375@twins.programming.kicks-ass.net> References: <1455539803-13913-1-git-send-email-joonas.lahtinen@linux.intel.com> <20160215141755.GG6357@twins.programming.kicks-ass.net> <20160215170618.GL6375@twins.programming.kicks-ass.net> Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.18.4 (3.18.4-1.fc23) Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, On ma, 2016-02-15 at 18:06 +0100, Peter Zijlstra wrote: > On Mon, Feb 15, 2016 at 03:17:55PM +0100, Peter Zijlstra wrote: > > On Mon, Feb 15, 2016 at 02:36:43PM +0200, Joonas Lahtinen wrote: > > > Instead of implementing a custom locked reference counting, use lockref. > > > > > > Current implementation leads to a deadlock splat on Intel SKL platforms > > > when lockdep debugging is enabled. > > > > > > This is due to few of CPUfreq drivers (including Intel P-state) having this; > > > policy->rwsem is locked during driver initialization and the functions called > > > during init that actually apply CPU limits use get_online_cpus (because they > > > have other calling paths too), which will briefly lock cpu_hotplug.lock to > > > increase cpu_hotplug.refcount. > > > > > > On later calling path, when doing a suspend, when cpu_hotplug_begin() is called > > > in disable_nonboot_cpus(), callbacks to CPUfreq functions get called after, > > > which will lock policy->rwsem and cpu_hotplug.lock is already held by > > > cpu_hotplug_begin() and we do have a potential deadlock scenario reported by > > > our CI system (though it is a very unlikely one). See the Bugzilla link for more > > > details. > > > > I've been meaning to change the thing into a percpu-rwsem, I just > > haven't had time to look into the lockdep splat that generated. > > > The below has plenty lockdep issues because percpu-rwsem is > reader-writer fair (like the regular rwsem), so it does throw up a fair > number of very icky issues. > I originally thought of implementing this more similar to what you specify, but then I came across a discussion in the mailing list where it was NAKed adding more members to task_struct; http://comments.gmane.org/gmane.linux.kernel/970273 Adding proper recursion (the way my initial implementation was going) got ugly without modifying task_struct becauseĀ get_online_cpus() is a speed critical code path. So I'm all for fixing the current code in a different way if that will then be merged. Regards, Joonas > If at all possible, I'd really rather fix those and have a 'saner' > hotplug lock, rather than muddle on with open-coded horror lock we have > now. > > -- Joonas Lahtinen Open Source Technology Center Intel Corporation