Re: [PATCH] intel_rapl: Correct hotplug correction

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Jacob Pan <jacob.jun.pan@linux.intel.com>,
	LKML <linux-kernel@vger.kernel.org>, Borislav Petkov <bp@suse.de>,
	Ingo Molnar <mingo@kernel.org>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"ego@linux.vnet.ibm.com" <ego@linux.vnet.ibm.com>,
	Oleg Nesterov <oleg@redhat.com>
Subject: Re: [PATCH] intel_rapl: Correct hotplug correction
Date: Thu, 22 May 2014 17:24:33 +0530	[thread overview]
Message-ID: <537DE579.6000505@linux.vnet.ibm.com> (raw)
In-Reply-To: <20140522100820.GE4383@pd.tnic>

On 05/22/2014 03:38 PM, Borislav Petkov wrote:
> On Thu, May 22, 2014 at 03:13:46PM +0530, Srivatsa S. Bhat wrote:
>> On 05/22/2014 02:53 PM, Borislav Petkov wrote:
>>> From: Borislav Petkov <bp@suse.de>
>>>
>>> So 009f225ef050 ("powercap, intel-rapl: Fix CPU hotplug callback
>>> registration") says how get_/put_online_cpus() should be replaced with
>>> this cpu_notifier_register_begin/_done().
>>>
>>> But they're still there so what's up?
>>>
>>
>> Ok, so I retained that because the comments in the code said that
>> the caller of rapl_cleanup_data() should hold the hotplug lock.
>>
>> Here is the snippet from the patch's changelog:
>>
>>     ...
>>     Fix the intel-rapl code in the powercap driver by using this latter form
>>     of callback registration. But retain the calls to get/put_online_cpus(),
>>     since they also protect the function rapl_cleanup_data(). By nesting
>>     get/put_online_cpus() *inside* cpu_notifier_register_begin/done(), we avoid
>>     the ABBA deadlock possibility mentioned above.
> 
> My bad, I missed that part.
> 
>> But looking closer at the code, I think the only requirement is that
>> rapl_cleanup_data() should be protected against CPU hotplug, and we
>> don't actually need to hold the cpu_hotplug.lock per-se.
> 
> What is the difference between "CPU hotplug" and cpu_hotplug.lock?
> From looking at the code those are two different mutexes with
> cpu_hotplug.lock, i.e. get_online_cpus() having a preemption point.
> 
> And yet, you want to replace get_/put_online_cpus() with
> cpu_notifier_register_begin/_done()

Its not a plain replacement; it is only where both get/put_online_cpus()
as well as notifier-[un]registration is involved. I was trying to overcome
the problems with the existing notifier registration APIs in cpu hotplug,
which were simply impossible to use in a race-free way. But we can certainly
make this much better with a fresh redesign of the whole thing.

I had proposed some other ways of fixing this here:
http://www.kernelhub.org/?msg=26421&p=2

> which is kinda the same thing but
> not really. The one protects against hotplug operations and the other
> protects against cpu hotplug notifier registration.
> 
> Oh, and there's a third one, aliased to the notifier one, which
> is "attempting to serialize the updates to cpu_online_mask &
> cpu_present_mask."
> 
> So why, oh why do we need three? This is absolutely insane. Do we have
> at least one sensible reason why cpu hotplug users should need to know
> all that gunk?
> 

Yeah, its complicated and perhaps we can do much better than that. But I'll
try to explain why there are so many different locks in the existing code.

get/put_online_cpus() uses cpu_hotplug.lock underneath. Although that's just
a mutex, its not used in the usual way (more about that below).

cpu_maps_update_begin(), cpu_notifier_register_begin/done,
register/unregister_cpu_notifier  -- all of these use the cpu_add_remove_lock.

The fundamental difference between these 2 categories is the concurrency
allowed with the lock :-
get_online_cpus() is like a read_lock(). That is, it allows any number
of concurrent CPU hotplug readers (tasks that want to block hotplug), but it
ensures that a writer won't run concurrently with any reader.

Hence, this won't work for notifier registrations because register/unregister
has to _mutate_ the notifier linked-list, and obviously we can't have multiple
tasks trying to do this at the same time. So we need a proper mutex that
allows only 1 task at a time into the critical section. That's why the
cpu_add_remove_lock is used in all the register/unregister paths.

Now, let's look at why the CPU hotplug writer path (the task that actually
does hotplug) takes cpu_add_remove_lock in addition to the cpu_hotplug.lock.
First reason is simple: you don't want tasks that are doing notifier
[un]registrations to race with hotplug. But the second, although subtle reason
is that put_online_cpus() and cpu_hotplug_writer_begin()/done() require that
there is only 1 CPU hotplug writer at a time. The cpu_add_remove_lock provides
this guarantee, since it is taken in the writer path. (The long comment above
cpu_hotplug_begin() mentions this as well).

And then there is powerpc which uses cpu_maps_update_begin/done() to protect
dlpar operations (operations that change the cpu_present_mask etc). But ideally
it should have just used cpu_hotplug_begin() itself, like what driver/acpi/
processor_driver.c does, but then it would have to hold cpu_add_remove_lock
as well, due to the current design :(

That was just me trying to explain the current mess, not justifying it! :-/

I think Oleg had a proposed patch to use per-cpu rwsem in CPU hotplug to
drastically simplify this whole locking scheme. I think we could look at
that again.

Regards,
Srivatsa S. Bhat

next prev parent reply	other threads:[~2014-05-22 11:55 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-22  9:23 [PATCH] intel_rapl: Correct hotplug correction Borislav Petkov
2014-05-22  9:43 ` Srivatsa S. Bhat
2014-05-22 10:08   ` Borislav Petkov
2014-05-22 11:54     ` Srivatsa S. Bhat [this message]
2014-05-22 12:13       ` Borislav Petkov
2014-05-22 12:32       ` Peter Zijlstra
2014-05-22 15:30         ` [PATCH] x86, MCE: Kill CPU_POST_DEAD Borislav Petkov
2014-05-22 15:50           ` Luck, Tony
2014-05-22 19:55             ` Borislav Petkov
2014-05-22 21:13               ` Srivatsa S. Bhat
2014-05-22 21:31                 ` Borislav Petkov
2014-05-22 21:40                   ` Srivatsa S. Bhat
2014-05-22 21:43               ` Srivatsa S. Bhat
2014-05-26 20:01               ` Borislav Petkov
2014-05-22 21:31         ` [PATCH] intel_rapl: Correct hotplug correction Srivatsa S. Bhat

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=537DE579.6000505@linux.vnet.ibm.com \
    --to=srivatsa.bhat@linux.vnet.ibm.com \
    --cc=bp@alien8.de \
    --cc=bp@suse.de \
    --cc=ego@linux.vnet.ibm.com \
    --cc=jacob.jun.pan@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rafael.j.wysocki@intel.com \
    --cc=srinivas.pandruvada@linux.intel.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.