All of lore.kernel.org
 help / color / mirror / Atom feed
From: Viresh Kumar <viresh.kumar@linaro.org>
To: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: rjw@rjwysocki.net, ego@linux.vnet.ibm.com, paulus@samba.org,
	linux-kernel@vger.kernel.org, shilpa.bhat@linux.vnet.ibm.com,
	linux-pm@vger.kernel.org
Subject: Re: [RFC PATCH] cpufreq/hotplug: Fix cpu-hotplug cpufreq race conditions
Date: Mon, 1 Jun 2015 12:49:34 +0530	[thread overview]
Message-ID: <20150601071934.GC4242@linux> (raw)
In-Reply-To: <20150601064031.2972.59208.stgit@perfhull-ltc.austin.ibm.com>

On 01-06-15, 01:40, Preeti U Murthy wrote:

I have to mention that this is somewhat inspired by:

https://git.linaro.org/people/viresh.kumar/linux.git/commit/1e37f1d6ae12f5896e4e216f986762c3050129a5

and I was waiting to finish some core-changes to make all this simple.

I am fine to you trying to finish it though :)

> The problem showed up when running hotplug operations and changing
> governors in parallel. The crash would be at:
> 
> [  174.319645] Unable to handle kernel paging request for data at address 0x00000000
> [  174.319782] Faulting instruction address: 0xc00000000053b3e0
> cpu 0x1: Vector: 300 (Data Access) at [c000000003fdb870]
>     pc: c00000000053b3e0: __bitmap_weight+0x70/0x100
>     lr: c00000000085a338: need_load_eval+0x38/0xf0
>     sp: c000000003fdbaf0
>    msr: 9000000100009033
>    dar: 0
>  dsisr: 40000000
>   current = 0xc000000003151a40
>   paca    = 0xc000000007da0980	 softe: 0	 irq_happened: 0x01
>     pid   = 842, comm = kworker/1:2
> enter ? for help
> [c000000003fdbb40] c00000000085a338 need_load_eval+0x38/0xf0
> [c000000003fdbb70] c000000000856a10 od_dbs_timer+0x90/0x1e0
> [c000000003fdbbe0] c0000000000f489c process_one_work+0x24c/0x910
> [c000000003fdbc90] c0000000000f50dc worker_thread+0x17c/0x540
> [c000000003fdbd20] c0000000000fed70 kthread+0x120/0x140
> [c000000003fdbe30] c000000000009678 ret_from_kernel_thread+0x5c/0x64
> 
> While debugging the issue, other problems in this area were uncovered,
> all of them necessitating serialized calls to __cpufreq_governor().  One
> potential race condition that can happen today is the below:
> 
> CPU0					CPU1
> 
> cpufreq_set_policy()
> 
> __cpufreq_governor
> (CPUFREQ_GOV_POLICY_EXIT)
> 				__cpufreq_remove_dev_finish()
> 
> free(dbs_data)			__cpufreq_governor
> 				(CPUFRQ_GOV_START)
> 
> 				dbs_data->mutex <= NULL dereference
> 
> The issue here is that calls to cpufreq_governor_dbs() is not serialized
> and they can conflict with each other in numerous ways. One way to sort
> this out would be to serialize all calls to cpufreq_governor_dbs()
> by setting the governor busy if a call is in progress and
> blocking all other calls. But this approach will not cover all loop
> holes. Take the above scenario: CPU1 will still hit a NULL dereference if
> care is not taken to check for a NULL dbs_data.
> 
> To sort such scenarios, we could filter out the sequence of events: A
> CPUFREQ_GOV_START cannot be called without an INIT, if the previous
> event was an EXIT. However this results in analysing all possible
> sequence of events and adding each of them as a filter. This results in
> unmanagable code. There is high probability of missing out on a race
> condition. Both the above approaches were tried out earlier [1]

I agree.

> Let us therefore look at the heart of the issue.

Yeah, we should :)

> It is not really about
> serializing calls to cpufreq_governor_dbs(), it seems to be about
> serializing entire sequence of CPUFREQ_GOV* operations. For instance, in
> cpufreq_set_policy(), we STOP,EXIT the old policy and INIT and START the
> new policy. Between the EXIT and INIT, there must not be
> anybody else starting the policy. And between INIT and START, there must
> be nobody stopping the policy.

Hmm..

> A similar argument holds for the CPUFREQ_GOV* operations in
>  __cpufreq_policy_dev_{prepare|finish} and cpufreq_add_policy(). Hence
> until each of these functions complete in totality, none of the others
> should run in parallel. The interleaving of the individual calls to
> cpufreq_governor_dbs() is resulting in invalid operations. This patch
> therefore tries to serialize entire cpufreq functions calling CPUFREQ_GOV*
> operations, with respect to each other.

We were forced to put band-aids until this time and I am really
looking into getting this fixed at the root.

The problem is that we drop policy locks before calling
__cpufreq_governor() and that's the root cause of all these problems
we are facing. We did that because we were getting warnings about
circular locks (955ef4833574 ("cpufreq: Drop rwsem lock around
CPUFREQ_GOV_POLICY_EXIT"))..

I have explained that problem here (Never sent this upstream, as I was
waiting for some other patches to get included first):
https://git.linaro.org/people/viresh.kumar/linux.git/commit/57714d5b1778f2f610bcc5c74d85b29ba1cc1995

The actual problem was:
If we hold any locks, that the attribute operations grab, when
removing the attribute, then it can result in a ABBA deadlock.

show()/store() holds the policy->rwsem lock while accessing any sysfs
attributes under cpu/cpuX/cpufreq/ directory.

But something like what I have done is the real way to tackle all
these problems.

These band-aid wouldn't take us anywhere.

-- 
viresh

  reply	other threads:[~2015-06-01  7:19 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-01  6:40 [RFC PATCH] cpufreq/hotplug: Fix cpu-hotplug cpufreq race conditions Preeti U Murthy
2015-06-01  7:19 ` Viresh Kumar [this message]
2015-06-01  7:55   ` Preeti U Murthy
2015-06-02  5:31   ` Preeti U Murthy
2015-06-02  5:39     ` Viresh Kumar
2015-06-02  6:03       ` Preeti U Murthy
2015-06-02  6:11         ` Viresh Kumar
2015-06-02  6:20           ` Preeti U Murthy
2015-06-02  6:27             ` Viresh Kumar
2015-06-02  6:56               ` Preeti U Murthy
2015-06-02  7:07                 ` Viresh Kumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150601071934.GC4242@linux \
    --to=viresh.kumar@linaro.org \
    --cc=ego@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=paulus@samba.org \
    --cc=preeti@linux.vnet.ibm.com \
    --cc=rjw@rjwysocki.net \
    --cc=shilpa.bhat@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.