From mboxrd@z Thu Jan 1 00:00:00 1970 From: Joonas Lahtinen Subject: Re: [PATCH] [RFC] kernel/cpu: Use lockref for online CPU reference counting Date: Tue, 16 Feb 2016 12:51:03 +0200 Message-ID: <1455619863.4977.29.camel@linux.intel.com> References: <1455539803-13913-1-git-send-email-joonas.lahtinen@linux.intel.com> <20160215141755.GG6357@twins.programming.kicks-ass.net> <20160215170618.GL6375@twins.programming.kicks-ass.net> <1455612576.4977.11.camel@linux.intel.com> <20160216091440.GT6357@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Return-path: Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by gabe.freedesktop.org (Postfix) with ESMTP id B4A9E6E781 for ; Tue, 16 Feb 2016 10:51:10 +0000 (UTC) In-Reply-To: <20160216091440.GT6357@twins.programming.kicks-ass.net> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" To: Peter Zijlstra Cc: "Gautham R. Shenoy" , Intel graphics driver community testing & development , Linux kernel development , David Hildenbrand , "Paul E. McKenney" , Ingo Molnar List-Id: intel-gfx@lists.freedesktop.org T24gdGksIDIwMTYtMDItMTYgYXQgMTA6MTQgKzAxMDAsIFBldGVyIFppamxzdHJhIHdyb3RlOgo+ IE9uIFR1ZSwgRmViIDE2LCAyMDE2IGF0IDEwOjQ5OjM2QU0gKzAyMDAsIEpvb25hcyBMYWh0aW5l biB3cm90ZToKPiA+IEkgb3JpZ2luYWxseSB0aG91Z2h0IG9mIGltcGxlbWVudGluZyB0aGlzIG1v cmUgc2ltaWxhciB0byB3aGF0IHlvdQo+ID4gc3BlY2lmeSwgYnV0IHRoZW4gSSBjYW1lIGFjcm9z cyBhIGRpc2N1c3Npb24gaW4gdGhlIG1haWxpbmcgbGlzdCB3aGVyZQo+ID4gaXQgd2FzIE5BS2Vk IGFkZGluZyBtb3JlIG1lbWJlcnMgdG8gdGFza19zdHJ1Y3Q7Cj4gPiAKPiA+IGh0dHA6Ly9jb21t ZW50cy5nbWFuZS5vcmcvZ21hbmUubGludXgua2VybmVsLzk3MDI3Mwo+ID4gCj4gPiBBZGRpbmcg cHJvcGVyIHJlY3Vyc2lvbiAodGhlIHdheSBteSBpbml0aWFsIGltcGxlbWVudGF0aW9uIHdhcyBn b2luZykKPiA+IGdvdCB1Z2x5IHdpdGhvdXQgbW9kaWZ5aW5nIHRhc2tfc3RydWN0IGJlY2F1c2XC oGdldF9vbmxpbmVfY3B1cygpIGlzIGEKPiA+IHNwZWVkIGNyaXRpY2FsIGNvZGUgcGF0aC4KPiAK PiBZZWFoLCBqdXN0IGRvbid0IGxldCBMaW51cyBoZWFyIHlvdSBzYXkgdGhhdC4gZ2V0X29ubGlu ZV9jcHVzKCkgaXMgX25vdF8KPiBjb25zaWRlcmVkIHBlcmZvcm1hbmNlIGNyaXRpY2FsLgoKT2gg d2VsbCwgYXQgbGVhc3QgY2hhbmdlcyB0byBpdCBhZGRlZCBxdWl0ZSBub3RpY2VhYmx5IHRvIHRo ZSBib290dXAKdGltZSBvZiBhIHN5c3RlbS4KCj4gCj4gPiBTbyBJJ20gYWxsIGZvciBmaXhpbmcg dGhlIGN1cnJlbnQgY29kZSBpbiBhIGRpZmZlcmVudCB3YXkgaWYgdGhhdCB3aWxsCj4gPiB0aGVu IGJlIG1lcmdlZC4KPiAKPiBTbyBJJ20gbm90IHN1cmUgd2h5IHlvdSdyZSBwb2tpbmcgYXQgdGhp cyBob3Jyb3Igc2hvdyB0byBiZWdpbiB3aXRoLgo+IElTVFIgeW91IG1lbnRpb25pbmcgYSBsb2Nr ZGVwIHNwbGF0IGZvciBTS0wsIGJ1dCBmYWlsZWQgdG8gcHJvdmlkZQo+IGRldGFpbC4KPiAKClF1 b3RpbmcgbXkgb3JpZ2luYWwgcGF0Y2g7CgoiU2VlIHRoZSBCdWd6aWxsYSBsaW5rIGZvciBtb3Jl IGRldGFpbHMuCgpCdWd6aWxsYTogaHR0cHM6Ly9idWdzLmZyZWVkZXNrdG9wLm9yZy9zaG93X2J1 Zy5jZ2k/aWQ9OTMyOTQiCgpUaGUgaW1wcm92ZW1lbnQgbXkgcGF0Y2ggaW1wbGVtZW50cyBpcyB0 byB1c2UgbG9ja3JlZiBmb3IgbG9ja2VkCnJlZmVyZW5jZSBjb3VudGluZyAoaG90cGx1ZyBjb2Rl IHByZXZpb3VzbHkgcm9sbGVkIGl0cyBvd24gbXV0ZXggKwphdG9taWMgY29tYm8pLCB3aGljaCBn ZXRzIHJpZCBvZiB0aGUgZGVhZGxvY2sgc2NlbmFyaW8gZGVzY3JpYmVkIGFuZApsaW5rZWQgaW4g dGhlIGluaXRpYWwgcGF0Y2guIFRyYWNlIGZvciB0aGUgc2NlbmFyaW87CgpodHRwczovL2J1Z3Mu ZnJlZWRlc2t0b3Aub3JnL2F0dGFjaG1lbnQuY2dpP2lkPTEyMTQ5MAoKSSB0aGluayB1c2luZyBs b2NrcmVmIG1ha2VzIGl0IHN1YnN0YW50aWFsbHkgbGVzcyBzcGVjaWFsLCBsb2NrcmVmIGNvZGUK YmVpbmcgYSBsb3QgbW9yZSBiYXR0bGUtdGVzdGVkIGluIHRoZSBGUyBjb2RlIHRoYW4gdGhlIHBy ZXZpb3VzCmNwdV9ob3RwbHVnLmxvY2sgbWVzcy4KCj4gTWFraW5nIHRoZSBob3RwbHVnIGxvY2sg X21vcmVfIHNwZWNpYWwgdG8gZml4IHRoYXQgaXMganVzdCB3cm9uZy4gRml4Cj4gdGhlIHJldGFy ZGVkIGxvY2tpbmcgdGhhdCBsZWFkIHRvIGl0Lgo+IAoKSSBkbyBhZ3JlZSB0aGF0IGl0J3Mgc3Rp bGwgbm90IHByZXR0eSwgYnV0IG5vdyBpdCBkb2VzIGNvcnJlY3RseSB3aGF0CnRoZSBwcmV2aW91 cyBjb2RlIHdhcyB0cnlpbmcgdG8gZG8gd2l0aCBjdXN0b20gbXV0ZXggKyBhdG9taWMuCgpJJ20g YWxsIGZvciBmaXhpbmcgdGhlIGNvZGUgZnVydGhlciwgYnV0IHByaW9yIHRvIHByb2NlZWRpbmcg dGhlcmUKbmVlZHMgdG8gYmUgc29tZSBzb3J0IG9mIGFuIGFncmVlbWVudCBvbiBlaXRoZXIgbWFr aW5nCmdldF9vbmxpbmVfY3B1cygpIHNsb3dlciAod2hpY2ggZG9lcyBub3Qgc2VlbSBsaWtlIGEg Z29vZCBpZGVhKSBvcgphZGRpbmcgbW9yZSBtZW1iZXJzIHRvIHRhc2tfc3RydWN0LgoKUmVnYXJk cywgSm9vbmFzCgo+IAotLSAKSm9vbmFzIExhaHRpbmVuCk9wZW4gU291cmNlIFRlY2hub2xvZ3kg Q2VudGVyCkludGVsIENvcnBvcmF0aW9uCl9fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f X19fX19fX19fX19fX19fCkludGVsLWdmeCBtYWlsaW5nIGxpc3QKSW50ZWwtZ2Z4QGxpc3RzLmZy ZWVkZXNrdG9wLm9yZwpodHRwczovL2xpc3RzLmZyZWVkZXNrdG9wLm9yZy9tYWlsbWFuL2xpc3Rp bmZvL2ludGVsLWdmeAo= From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754835AbcBPKvM (ORCPT ); Tue, 16 Feb 2016 05:51:12 -0500 Received: from mga03.intel.com ([134.134.136.65]:40402 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750927AbcBPKvK (ORCPT ); Tue, 16 Feb 2016 05:51:10 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.22,455,1449561600"; d="scan'208";a="885913966" Message-ID: <1455619863.4977.29.camel@linux.intel.com> Subject: Re: [PATCH] [RFC] kernel/cpu: Use lockref for online CPU reference counting From: Joonas Lahtinen To: Peter Zijlstra Cc: Intel graphics driver community testing & development , Linux kernel development , Ingo Molnar , David Hildenbrand , "Paul E. McKenney" , "Gautham R. Shenoy" , Chris Wilson , Daniel Vetter Date: Tue, 16 Feb 2016 12:51:03 +0200 In-Reply-To: <20160216091440.GT6357@twins.programming.kicks-ass.net> References: <1455539803-13913-1-git-send-email-joonas.lahtinen@linux.intel.com> <20160215141755.GG6357@twins.programming.kicks-ass.net> <20160215170618.GL6375@twins.programming.kicks-ass.net> <1455612576.4977.11.camel@linux.intel.com> <20160216091440.GT6357@twins.programming.kicks-ass.net> Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.18.4 (3.18.4-1.fc23) Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On ti, 2016-02-16 at 10:14 +0100, Peter Zijlstra wrote: > On Tue, Feb 16, 2016 at 10:49:36AM +0200, Joonas Lahtinen wrote: > > I originally thought of implementing this more similar to what you > > specify, but then I came across a discussion in the mailing list where > > it was NAKed adding more members to task_struct; > > > > http://comments.gmane.org/gmane.linux.kernel/970273 > > > > Adding proper recursion (the way my initial implementation was going) > > got ugly without modifying task_struct becauseĀ get_online_cpus() is a > > speed critical code path. > > Yeah, just don't let Linus hear you say that. get_online_cpus() is _not_ > considered performance critical. Oh well, at least changes to it added quite noticeably to the bootup time of a system. > > > So I'm all for fixing the current code in a different way if that will > > then be merged. > > So I'm not sure why you're poking at this horror show to begin with. > ISTR you mentioning a lockdep splat for SKL, but failed to provide > detail. > Quoting my original patch; "See the Bugzilla link for more details. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93294" The improvement my patch implements is to use lockref for locked reference counting (hotplug code previously rolled its own mutex + atomic combo), which gets rid of the deadlock scenario described and linked in the initial patch. Trace for the scenario; https://bugs.freedesktop.org/attachment.cgi?id=121490 I think using lockref makes it substantially less special, lockref code being a lot more battle-tested in the FS code than the previous cpu_hotplug.lock mess. > Making the hotplug lock _more_ special to fix that is just wrong. Fix > the retarded locking that lead to it. > I do agree that it's still not pretty, but now it does correctly what the previous code was trying to do with custom mutex + atomic. I'm all for fixing the code further, but prior to proceeding there needs to be some sort of an agreement on either making get_online_cpus() slower (which does not seem like a good idea) or adding more members to task_struct. Regards, Joonas > -- Joonas Lahtinen Open Source Technology Center Intel Corporation