linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Dave Jones <davej@redhat.com>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	Dave Young <hidave.darkstar@gmail.com>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Kernel Testers List <kernel-testers@vger.kernel.org>,
	cpufreq@vger.kernel.org, Rusty Russell <rusty@rustcorp.com.au>,
	trenn@suse.de, sven.wegener@stealer.net,
	Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: rjw@sisk.pl, mingo@elte.hu, Shaohua Li <shaohua.li@intel.com>,
	cpufreq@vger.kernel.org
Subject: [PATCH] remove rwsem lock from CPUFREQ_GOV_STOP call (second call site)
Date: Mon, 8 Jun 2009 11:23:16 -0400	[thread overview]
Message-ID: <20090608152316.GA21033@Krystal> (raw)
In-Reply-To: <20090608143220.GC2516@redhat.com>

* Dave Jones (davej@redhat.com) wrote:
> On Mon, Jun 08, 2009 at 08:48:45AM -0400, Mathieu Desnoyers wrote:
>  
>  > > > >> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13475
>  > > > >> Subject         : suspend/hibernate lockdep warning
>  > > > >> References      : http://marc.info/?l=linux-kernel&m=124393723321241&w=4
>  > > > 
>  > > > I suspect the following commit, after revert this patch I test 5 times
>  > > > without lockdep warnings.
>  > > > 
>  > > > commit b14893a62c73af0eca414cfed505b8c09efc613c
>  > > > Author: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
>  > > > Date:   Sun May 17 10:30:45 2009 -0400
>  > > > 
>  > > > 	[CPUFREQ] fix timer teardown in ondemand governor
>  > > 
>  > > The patch is probably not at fault here. I suspect it's some latent bug
>  > > that simply got exposed by the change to cancel_delayed_work_sync(). In
>  > > any case, Mathieu, can you take a look at this please?
>  > 
>  > Yes, it's been looked at and discussed on the cpufreq ML. The short
>  > answer is that they plan to re-engineer cpufreq and remove the policy
>  > rwlock taken around almost every operations at the cpufreq level.
>  > 
>  > The short-term solution, which is recognised as ugly, would be do to the
>  > following before doing the cancel_delayed_work_sync() :
>  > 
>  > unlock policy rwlock write lock
>  > 
>  > lock policy rwlock write lock
>  > 
>  > It basically works because this rwlock is unneeded for teardown, hence
>  > the future re-work planned.
>  > 
>  > I'm sorry I cannot prepare a patch current... I've got quite a few pages
>  > of Ph.D. thesis due for the beginning of July.
>  
> I'm kinda scared to touch this code at all for .30 due to the number of
> unexpected gotchas we seem to run into every time we touch something
> locking related.  So I'm inclined to just live with the lockdep warning
> for .30, and see how the real fixes look for .31, and push them back
> as -stable updates if they work out.
> 
> 
> Venki, what are your thoughts?
> 

Hi Dave,

I've looked through the cpufreq code, and the following patch should
address the call site I've missed in commit 
42a06f2166f2f6f7bf04f32b4e823eacdceafdc9. I've followed all
__cpufreq_set_policy call sites within cpufreq.c to make sure they all
hold the rwsem write lock. An extra round of review would be good
though.

Can someone try the following patch and see if it fixes the regression ?
My test machine is currently busy running long formal verifications, and
therefore unavailable for kernel patch testing. It compiles fine on a
2.6.30-rc5 kernel with my (now mainlined) cpufreq patches applied.

Mathieu


remove rwsem lock from CPUFREQ_GOV_STOP call (second call site)

commit	42a06f2166f2f6f7bf04f32b4e823eacdceafdc9

Missed a call site for CPUFREQ_GOV_STOP to remove the rwlock taken around the
teardown. To make a long story short, the rwlock write-lock causes a circular
dependency with cancel_delayed_work_sync(), because the timer handler takes the
read lock.

Note that all callers to __cpufreq_set_policy are taking the rwsem. All sysfs
callers (writers) hold the write rwsem at the earliest sysfs calling stage.

However, the rwlock write-lock is not needed upon governor stop.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: rjw@sisk.pl
CC: mingo@elte.hu
CC: Shaohua Li <shaohua.li@intel.com>
CC: Pekka Enberg <penberg@cs.helsinki.fi>
CC: Dave Young <hidave.darkstar@gmail.com>
CC: "Rafael J. Wysocki" <rjw@sisk.pl>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: trenn@suse.de
CC: sven.wegener@stealer.net
CC: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
CC: cpufreq@vger.kernel.org
---
 drivers/cpufreq/cpufreq.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-2.6-lttng.orig/drivers/cpufreq/cpufreq.c	2009-06-08 10:20:48.000000000 -0400
+++ linux-2.6-lttng/drivers/cpufreq/cpufreq.c	2009-06-08 10:48:52.000000000 -0400
@@ -1697,8 +1697,17 @@ static int __cpufreq_set_policy(struct c
 			dprintk("governor switch\n");
 
 			/* end old governor */
-			if (data->governor)
+			if (data->governor) {
+				/*
+				 * Need to release the rwsem around governor
+				 * stop due to lock dependency between
+				 * cancel_delayed_work_sync and the read lock
+				 * taken in the delayed work handler.
+				 */
+				unlock_policy_rwsem_write(data->cpu);
 				__cpufreq_governor(data, CPUFREQ_GOV_STOP);
+				lock_policy_rwsem_write(data->cpu);
+			}
 
 			/* start new governor */
 			data->governor = policy->governor;


-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

  reply	other threads:[~2009-06-08 15:23 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-06-07  9:47 2.6.30-rc8-git4: Reported regressions from 2.6.29 Rafael J. Wysocki
2009-06-07  9:47 ` [Bug #13109] High latency on /sys/class/thermal Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13219] Since kernel 2.6.30-rc1, computers hangs randomly Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13119] Trouble with make-install from a NFS mount Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13277] 2.6.30 regression - unreliable resume - bisected - Thinkpad X40 Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13180] 2.6.30-rc2: WARNING at i915_gem.c for i915_gem_idle Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13179] CD-R: wodim intermittent failures Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13116] Can't boot with nosmp Rafael J. Wysocki
2009-06-08 16:15   ` Stephen Hemminger
2009-06-08 16:29     ` Dan Williams
2009-06-09  0:04       ` Stephen Hemminger
2009-06-09 17:20         ` Dan Williams
2009-06-09 18:30           ` Avi Kivity
2009-06-09 18:36             ` Stephen Hemminger
2009-06-09 18:42               ` Avi Kivity
2009-06-09 20:58                 ` Stephen Hemminger
2009-06-09 23:19                   ` Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13306] hibernate slow on _second_ run Rafael J. Wysocki
2009-06-08  6:36   ` Johannes Berg
2009-06-08 11:14     ` Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13313] vm86old oops Rafael J. Wysocki
2009-06-11 13:02   ` Sergey Senozhatsky
2009-06-07  9:52 ` [Bug #13319] Page allocation failures with b43 and p54usb Rafael J. Wysocki
2009-06-07 13:10   ` Larry Finger
2009-06-07 13:40     ` Pekka Enberg
2009-06-07 14:19       ` Rik van Riel
2009-06-07 14:32         ` Pekka Enberg
2009-06-07 16:35           ` Larry Finger
2009-06-08  8:32             ` KAMEZAWA Hiroyuki
2009-06-08 17:20               ` Larry Finger
2009-06-08 10:17           ` Mel Gorman
2009-06-08 10:52             ` Pekka Enberg
2009-06-08 11:03               ` Mel Gorman
2009-06-08 13:58                 ` Pekka J Enberg
2009-06-08 14:12                   ` Mel Gorman
2009-06-08 14:42                     ` Christoph Lameter
2009-06-09  7:06                     ` Pekka Enberg
2009-06-09  7:54                       ` David Rientjes
2009-06-09  7:58                         ` Pekka Enberg
2009-06-09  8:14                           ` David Rientjes
2009-06-09  8:28                             ` Pekka Enberg
2009-06-10 14:41                               ` Larry Finger
2009-06-10 15:44                                 ` Pekka Enberg
2009-06-10 15:49                                   ` Pekka Enberg
2009-06-10 15:52                                     ` Johannes Berg
2009-06-10 16:06                                       ` Pekka Enberg
2009-06-10 16:16                                       ` Pekka Enberg
2009-06-10 16:10                                     ` Larry Finger
2009-06-11 14:41                                   ` Christoph Lameter
2009-06-11 15:09                                     ` Pekka Enberg
2009-06-11 18:41                                       ` Johannes Berg
2009-06-10 15:56                       ` Mel Gorman
2009-06-10 18:03                         ` Pekka Enberg
2009-06-09  7:50                     ` Pekka Enberg
2009-06-08 13:20             ` Rik van Riel
2009-06-08 13:35               ` Mel Gorman
2009-06-08 13:34             ` Larry Finger
2009-06-07  9:52 ` [Bug #13318] AGP doesn't work anymore on nforce2 Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13330] nfs4 NULL pointer dereference in _nfs4_do_setlk Rafael J. Wysocki
2009-06-07 19:28   ` Trond Myklebust
2009-06-07 21:04     ` Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13337] [post 2.6.29 regression] hang during suspend of b44/b43 modules Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13341] Random Oops at boot at loading ip6tables rules Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13328] b44: eth0: BUG! Timeout waiting for bit 00000002 of register 42c to clear Rafael J. Wysocki
2009-06-08  7:29   ` Francis Moreau
2009-06-12 13:27     ` Francis Moreau
2009-06-12 19:14       ` Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13366] About 80% of shutdowns fail (blocking) Rafael J. Wysocki
2009-06-07 16:02   ` Martin Bammer
2009-06-07 21:09     ` Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13362] rt2x00: slow wifi with correct basic rate bitmap Rafael J. Wysocki
2009-06-07 12:58   ` Alejandro Riveira Fernández
2009-06-07 21:05     ` Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13351] 2.6.30 corrupts my system after suspend resume with readonly mounted hard disk Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13373] fbcon, intelfb, i915: INFO: possible circular locking dependency detected Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13389] Warning 'Invalid throttling state, reset' gets displayed when it should not be Rafael J. Wysocki
2009-06-08 11:31   ` Frans Pop
2009-06-07  9:52 ` [Bug #13391] Kernel boot hangs at about every second start when kms is activated Rafael J. Wysocki
2009-06-07 16:04   ` Martin Bammer
2009-06-07 21:11     ` Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13374] reiserfs blocked for more than 120secs Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13401] pktcdvd writing is really slow with CFQ scheduler (bisected) Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13408] Performance regression in 2.6.30-rc7 Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13424] possible deadlock when doing governor switching Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13423] JMicron SATA controller not available Rafael J. Wysocki
2009-06-07 15:23   ` Marc Dionne
2009-06-07 21:13     ` Rafael J. Wysocki
2009-06-08  2:12       ` Marc Dionne
2009-06-07  9:52 ` [Bug #13407] adb trackpad disappears after suspend to ram Rafael J. Wysocki
2009-06-25 15:07   ` Jan Scholz
2009-06-07  9:52 ` [Bug #13462] Unused bands in intefb console and smaller 180x56 -> 128x48 Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13446] resume after suspend-to-ram broken on Toshiba Satellite A100 with 2.6.30-rc8 (works in 2.6.28) Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13470] Machine doesn't boot due to mmconfig detection problem Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13473] Bug while trying to launch a KVM guest Rafael J. Wysocki
2009-06-08  4:26   ` Sachin Sant
2009-06-08 11:16     ` Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13475] suspend/hibernate lockdep warning Rafael J. Wysocki
2009-06-07 13:21   ` Pekka Enberg
2009-06-08  7:35     ` Dave Young
2009-06-08  7:49       ` Pekka Enberg
2009-06-08 12:48         ` Mathieu Desnoyers
2009-06-08 14:32           ` Dave Jones
2009-06-08 15:23             ` Mathieu Desnoyers [this message]
2009-06-08 16:57               ` [PATCH] remove rwsem lock from CPUFREQ_GOV_STOP call (second call site) Pallipadi, Venkatesh
2009-06-08 17:17                 ` Mathieu Desnoyers
2009-06-09  1:15               ` Dave Young
2009-06-09 15:23                 ` Mathieu Desnoyers
2009-06-11  4:46                   ` Dave Young
2009-06-11 13:39             ` [Bug #13475] suspend/hibernate lockdep warning Simon Holm Thøgersen
2009-06-11 15:23               ` Mathieu Desnoyers
2009-06-17  0:39                 ` Pallipadi, Venkatesh
2009-06-17  1:05                   ` Mathieu Desnoyers
2009-06-17 15:29                   ` Thomas Renninger
2009-06-17 17:03                     ` Pallipadi, Venkatesh
2009-06-18  5:46                   ` Dave Young
2009-06-07  9:52 ` [Bug #13472] Oops with minicom and USB serial Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13471] Loading parport_pc kills the keyboard if ACPI is enabled Rafael J. Wysocki
2009-06-07 13:25   ` Ozan Çağlayan
2009-06-07 21:14     ` Rafael J. Wysocki
2009-06-07  9:52 ` [Bug #13474] Oops whilst booting Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090608152316.GA21033@Krystal \
    --to=mathieu.desnoyers@polymtl.ca \
    --cc=cpufreq@vger.kernel.org \
    --cc=davej@redhat.com \
    --cc=hidave.darkstar@gmail.com \
    --cc=kernel-testers@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=penberg@cs.helsinki.fi \
    --cc=rjw@sisk.pl \
    --cc=rusty@rustcorp.com.au \
    --cc=shaohua.li@intel.com \
    --cc=sven.wegener@stealer.net \
    --cc=trenn@suse.de \
    --cc=venkatesh.pallipadi@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).