From: Preeti U Murthy <preeti@linux.vnet.ibm.com>
To: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Rafael Wysocki <rjw@rjwysocki.net>,
linaro-kernel@lists.linaro.org, linux-pm@vger.kernel.org,
ego@linux.vnet.ibm.com, paulus@samba.org,
shilpa.bhat@linux.vnet.ibm.com, prarit@redhat.com,
robert.schoene@tu-dresden.de, skannan@codeaurora.org
Subject: Re: [PATCH 0/3] cpufreq: governor: Fix potential races
Date: Thu, 04 Jun 2015 12:06:49 +0530 [thread overview]
Message-ID: <556FF201.8080100@linux.vnet.ibm.com> (raw)
In-Reply-To: <20150604061128.GF11325@linux>
On 06/04/2015 11:41 AM, Viresh Kumar wrote:
> On 04-06-15, 11:38, Preeti U Murthy wrote:
>> And a crash at the cpufreq worker thread again due to data access
>> exception when I change governors in parallel on a single core.
>>
>> cpu 0x3: Vector: 300 (Data Access) at [c000000fedb538f0]
>> pc: c000000000856750: od_dbs_timer+0x60/0x1e0
>> lr: c0000000000f489c: process_one_work+0x24c/0x910
>> sp: c000000fedb53b70
>> msr: 9000000100009033
>> dar: 10
>> dsisr: 40000000
>> current = 0xc000000fe3d128e0
>> paca = 0xc000000007da1c80 softe: 0 irq_happened: 0x01
>> pid = 17227, comm = kworker/3:1
>>
>> With the backtrace being:
>>
>> [c000000fedb53be0] c0000000000f489c process_one_work+0x24c/0x910
>> [c000000fedb53c90] c0000000000f50dc worker_thread+0x17c/0x540
>> [c000000fedb53d20] c0000000000fed70 kthread+0x120/0x140
>> [c000000fedb53e30] c000000000009678 ret_from_kernel_thread+0x5c/0x64
>>
>> But the kernel stays sane longer than before with the patchset. The
>> above crash happens around 15 seconds after the test begins, while
>> earlier it wouldn't survive 2 seconds even.
>
> I haven't attempted to solve the race between the worker threads and
> governor-callbacks yet. What I have tried to solve is the race between
> different callbacks. And you shouldn't see a race there for now. For
> example a race between INIT/EXIT/START/STOP/LIMITS.
Your fix may not be complete and here is why. The reason we see the crash
is because we have *only* attempted to serialize calls to cpufreq_governor_dbs()
and not attempted to serialize *entire logical sequence of operations*. Let's
take a look at what is happening as a consequence.
CPU0 CPU1
store_scaling_governor() __cpufreq_remove_dev_finish()
__cpufreq_governor(CPUFREQ_GOV_STOP) __cpufreq_governor(CPUFREQ_GOV_START)
policy->governor_enabled = false
cpufreq_governor_dbs() policy->governor_enabled = true
mutex_lock()
gov_cancel_work() cpufreq_governor_dbs()
wait on lock
may call gov_queue_work()
if (!policy->enabled) : fails
and we end up queuing work
mutex_unlock()
mutex_lock()
gov_queue_work()
mutex_unlock()
__cpufreq_governor(CPUFREQ_GOV_POLICY_EXIT)
mutex_lock()
cpufreq_governor_dbs()
kfree(dbs_data)
timer fires and od_dbs_timer/cs_dbs_timer() runs
References governor data structures which are freed
The issue as I see it is one set of operations must be allowed to run *completely*
before another begins. When store_scaling_governor() says STOP, all governor operations
must be stopped, till the time store_scaling_governor() itself gives permission
to restart. Somebody else, in this case __cpufreq_remove_dev_finish() cannot overrule
this if it arrives late. This is what is happening above.
So if store_scaling_governor() arrives first, STOP|EXIT|START|LIMIT must complete before
START|LIMIT of __cpufreq_remove_dev_finish() is allowed to run. So it is just not about
serializing, its about *what needs to be serialized*.
Regards
Preeti U Murthy
>
next prev parent reply other threads:[~2015-06-04 6:37 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-06-03 10:27 [PATCH 0/3] cpufreq: governor: Fix potential races Viresh Kumar
2015-06-03 10:27 ` [PATCH 1/3] cpufreq: governor: register notifier from cs_init() Viresh Kumar
2015-06-04 5:38 ` Preeti U Murthy
2015-06-04 6:02 ` Viresh Kumar
2015-06-04 7:33 ` Preeti U Murthy
2015-06-03 10:27 ` [PATCH 2/3] cpufreq: governor: split cpufreq_governor_dbs() Viresh Kumar
2015-06-04 10:04 ` Preeti U Murthy
2015-06-04 10:17 ` Viresh Kumar
2015-06-04 11:13 ` [PATCH V2 " Viresh Kumar
2015-06-05 2:51 ` Preeti U Murthy
2015-06-03 10:27 ` [PATCH 3/3] cpufreq: governor: Serialize governor callbacks Viresh Kumar
2015-06-04 10:47 ` Preeti U Murthy
2015-06-04 5:14 ` [PATCH 0/3] cpufreq: governor: Fix potential races Preeti U Murthy
2015-06-04 6:08 ` Preeti U Murthy
2015-06-04 6:11 ` Viresh Kumar
2015-06-04 6:36 ` Preeti U Murthy [this message]
2015-06-04 6:42 ` Viresh Kumar
2015-06-04 7:04 ` Preeti U Murthy
2015-06-04 7:13 ` Viresh Kumar
2015-06-04 7:27 ` Preeti U Murthy
2015-06-05 3:00 ` Viresh Kumar
2015-06-05 3:04 ` Preeti U Murthy
2015-06-05 4:05 ` Preeti U Murthy
2015-06-15 23:48 ` Rafael J. Wysocki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=556FF201.8080100@linux.vnet.ibm.com \
--to=preeti@linux.vnet.ibm.com \
--cc=ego@linux.vnet.ibm.com \
--cc=linaro-kernel@lists.linaro.org \
--cc=linux-pm@vger.kernel.org \
--cc=paulus@samba.org \
--cc=prarit@redhat.com \
--cc=rjw@rjwysocki.net \
--cc=robert.schoene@tu-dresden.de \
--cc=shilpa.bhat@linux.vnet.ibm.com \
--cc=skannan@codeaurora.org \
--cc=viresh.kumar@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.