From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>,
Intel graphics driver community testing & development
<intel-gfx@lists.freedesktop.org>,
Linux kernel development <linux-kernel@vger.kernel.org>,
David Hildenbrand <dahi@linux.vnet.ibm.com>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH] [RFC] kernel/cpu: Use lockref for online CPU reference counting
Date: Tue, 16 Feb 2016 10:49:36 +0200
Message-ID: <1455612576.4977.11.camel@linux.intel.com>
In-Reply-To: <20160215170618.GL6375@twins.programming.kicks-ass.net>
Hi,
On Mon, 2016-02-15 at 18:06 +0100, Peter Zijlstra wrote:
> On Mon, Feb 15, 2016 at 03:17:55PM +0100, Peter Zijlstra wrote:
> > On Mon, Feb 15, 2016 at 02:36:43PM +0200, Joonas Lahtinen wrote:
> > > Instead of implementing a custom locked reference counting, use lockref.
> > >
> > > Current implementation leads to a deadlock splat on Intel SKL platforms
> > > when lockdep debugging is enabled.
> > >
> > > This is due to few of CPUfreq drivers (including Intel P-state) having this;
> > > policy->rwsem is locked during driver initialization and the functions called
> > > during init that actually apply CPU limits use get_online_cpus (because they
> > > have other calling paths too), which will briefly lock cpu_hotplug.lock to
> > > increase cpu_hotplug.refcount.
> > >
> > > On later calling path, when doing a suspend, when cpu_hotplug_begin() is called
> > > in disable_nonboot_cpus(), callbacks to CPUfreq functions get called after,
> > > which will lock policy->rwsem and cpu_hotplug.lock is already held by
> > > cpu_hotplug_begin() and we do have a potential deadlock scenario reported by
> > > our CI system (though it is a very unlikely one). See the Bugzilla link for more
> > > details.
> >
> > I've been meaning to change the thing into a percpu-rwsem, I just
> > haven't had time to look into the lockdep splat that generated.
>
>
> The below has plenty lockdep issues because percpu-rwsem is
> reader-writer fair (like the regular rwsem), so it does throw up a fair
> number of very icky issues.
>
I originally thought of implementing this along the lines of what you
describe, but then I came across a mailing list discussion in which
adding more members to task_struct was NAKed:
http://comments.gmane.org/gmane.linux.kernel/970273
Adding proper recursion (the direction my initial implementation was
taking) got ugly without modifying task_struct, because
get_online_cpus() is a speed-critical code path.
So I'm all for fixing the current code in a different way, if that will
then be merged.
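For reference, the pattern in the current code (readers briefly take
cpu_hotplug.lock to bump cpu_hotplug.refcount, the writer keeps the lock
held) can be sketched roughly as below. This is a userspace
approximation using pthreads, not the kernel code: the sketch_* names
and the hotplug_state struct are made up for illustration, and the
writer here fails instead of sleeping until the readers drain.

```c
#include <pthread.h>

/* Hypothetical userspace stand-in for the kernel's cpu_hotplug state. */
struct hotplug_state {
	pthread_mutex_t lock;	/* plays the role of cpu_hotplug.lock */
	int refcount;		/* plays the role of cpu_hotplug.refcount */
};

static struct hotplug_state hp = { PTHREAD_MUTEX_INITIALIZER, 0 };

/* Reader side: hold the lock only long enough to bump the count. */
static void sketch_get_online_cpus(void)
{
	pthread_mutex_lock(&hp.lock);
	hp.refcount++;
	pthread_mutex_unlock(&hp.lock);
}

/* Reader side: drop the reference, again under the lock. */
static void sketch_put_online_cpus(void)
{
	pthread_mutex_lock(&hp.lock);
	hp.refcount--;
	pthread_mutex_unlock(&hp.lock);
}

/*
 * Writer side: returns 0 with the lock still held when there are no
 * readers, or -1 with the lock released when readers are active.
 * (Assumption for the sketch: the real writer waits and retries.)
 */
static int sketch_hotplug_try_begin(void)
{
	pthread_mutex_lock(&hp.lock);
	if (hp.refcount == 0)
		return 0;	/* held until sketch_hotplug_done() */
	pthread_mutex_unlock(&hp.lock);
	return -1;
}

/* Writer side: release the lock taken by sketch_hotplug_try_begin(). */
static void sketch_hotplug_done(void)
{
	pthread_mutex_unlock(&hp.lock);
}
```

Because the writer holds the lock for the whole hotplug operation, any
callback that takes another lock (policy->rwsem in the report) nests it
inside this one, while a reader path that already holds that other lock
and then calls the get function nests them the other way around.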
Regards, Joonas
> If at all possible, I'd really rather fix those and have a 'saner'
> hotplug lock, rather than muddle on with open-coded horror lock we have
> now.
>
>
<SNIP>
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation