All of lore.kernel.org
 help / color / mirror / Atom feed
From: Prarit Bhargava <prarit@redhat.com>
To: "Viresh Kumar" <viresh.kumar@linaro.org>,
	"Saravana Kannan" <skannan@codeaurora.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	"Linux Kernel" <linux-kernel@vger.kernel.org>,
	"Robert Schöne" <robert.schoene@tu-dresden.de>
Subject: Re: Locking issues with cpufreq and sysfs
Date: Mon, 13 Oct 2014 09:22:49 -0400	[thread overview]
Message-ID: <543BD229.4020207@redhat.com> (raw)
In-Reply-To: <543BCF9A.1060603@redhat.com>



On 10/13/2014 09:11 AM, Prarit Bhargava wrote:
> There are several issues with the current locking design of cpufreq.  Most
> notably is the panic reported here:
> 
> http://marc.info/?l=linux-kernel&m=140622451625236&w=2
> 
> which was introduced by commit 955ef4833574636819cd269cfbae12f79cbde63a,
> cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT, which introduces
> a race in the changing of the cpufreq policy.  This change was introduced
> because of a lockdep deadlock warning that can be reproduced (on x86 with
> the acpi_cpufreq driver) via the following debug patch:
> 
> iff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
> index b0c18ed..366cfb7 100644
> --- a/drivers/cpufreq/acpi-cpufreq.c
> +++ b/drivers/cpufreq/acpi-cpufreq.c
> @@ -885,6 +885,7 @@ static struct freq_attr *acpi_cpufreq_attr[] = {
> 
>  static struct cpufreq_driver acpi_cpufreq_driver = {
>         .verify         = cpufreq_generic_frequency_table_verify,
> +       .flags          = CPUFREQ_HAVE_GOVERNOR_PER_POLICY,
>         .target_index   = acpi_cpufreq_target,
>         .bios_limit     = acpi_processor_get_bios_limit,
>         .init           = acpi_cpufreq_cpu_init,
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 61190f6..4cb488a 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -2195,9 +2195,7 @@ static int cpufreq_set_policy(struct cpufreq_policy *polic
>         /* end old governor */
>         if (old_gov) {
>                 __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
> -               up_write(&policy->rwsem);
>                 __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
> -               down_write(&policy->rwsem);
>         }
> 
>         /* start new governor */
> @@ -2206,9 +2204,7 @@ static int cpufreq_set_policy(struct cpufreq_policy *polic
>                 if (!__cpufreq_governor(policy, CPUFREQ_GOV_START))
>                         goto out;
> 
> -               up_write(&policy->rwsem);
>                 __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
> -               down_write(&policy->rwsem);
>         }
> 
>         /* new governor failed, so re-start old one */
> 
> (which causes the acpi-cpufreq driver to emulate the behaviour of the arm
> cpufreq driver), and by doing
> 
> echo ondemand > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
> cat /sys/devices/system/cpu/cpu5/cpufreq/ondemand/*
> echo conservative > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
> exit 0
> 
> [Question: was the original reported deadlock "real"?  Did it really happen or
> did lockdep only report it (I may have asked this question previously and
> forgot the answer)?  The reason I ask is that this situation is very similar to
> USB's device removal in which the sysfs attributes are removed for a device but
> not the device it was called for.  I actually think that's part of the problem
> here.]
> 
> The above, obviously, is a complete hack of the code but in a sense does
> mimic a proper locking fix.  However, even with this fix we are still left
> with a race in accessing the sysfs files.  Consider the following example,
> 
> CPU 1: accesses scaling_setspeed to set cpu speed
> 
> simultaneously,
> 
> CPU 2: accesses scaling_governor to set governor to ondemand
> 
> CPU 1 & 2 race ... and this can result in different critical situations.
> The first is that CPU 1 holds the scalling_setspeed open while CPU attempts
> to change the governor.  This results in a syfs warning about creating a
> file with an existing file name which in some cases can lead to additional
> corruption and a panic.  The second case is that CPU 1's setting of the speed
> is now done on the new governor -- which may or may not be correct.  In any
> case an argument could be made that the userspace program doing this type
> of action should be "smart" enough to confirm simultaneous changes... but
> in any case the kernel should not panic or corrupt data.
> 
> The locking is insufficient here, Viresh.  I no longer believe that fixes
> to this locking scheme are the right way to move forward here.  I'm wondering
> if we can look at other alternatives such as maintaining a refcount or
> perhaps using a queuing mechanism for governor and policy related changes.
> 

Uh ... I meant this as "I'm willing to modify the code to do this but I'd like
to know what everyone else thinks before I do anything" ;)

P.

> P.
> 

  reply	other threads:[~2014-10-13 13:22 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-13 13:11 Locking issues with cpufreq and sysfs Prarit Bhargava
2014-10-13 13:22 ` Prarit Bhargava [this message]
2014-10-13 15:09   ` Rafael J. Wysocki
2014-10-14  7:10 ` Viresh Kumar
2014-10-14 18:24   ` Prarit Bhargava
2014-10-14 19:18     ` Elliott, Robert (Server Storage)
2014-10-14 19:18       ` Elliott, Robert (Server Storage)
2014-10-16 11:23     ` Viresh Kumar
2014-10-17 12:14       ` Prarit Bhargava
2014-10-17 11:38 ` Viresh Kumar
2014-10-17 12:15   ` Prarit Bhargava
2014-10-17 13:25     ` Viresh Kumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=543BD229.4020207@redhat.com \
    --to=prarit@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=rjw@rjwysocki.net \
    --cc=robert.schoene@tu-dresden.de \
    --cc=skannan@codeaurora.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.