linux-kernel.vger.kernel.org archive mirror
From: Ming Lei <tom.leiming@gmail.com>
To: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Johannes Berg <johannes@sipsolutions.net>,
	Alan Stern <stern@rowland.harvard.edu>,
	Oleg Nesterov <oleg@redhat.com>, Ingo Molnar <mingo@elte.hu>,
	Zdenek Kabelac <zdenek.kabelac@gmail.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	pm list <linux-pm@lists.linux-foundation.org>
Subject: Re: INFO: possible circular locking dependency at cleanup_workqueue_thread
Date: Sun, 24 May 2009 11:29:43 +0800
Message-ID: <20090524112943.041935da@linux-lm>
In-Reply-To: <200905240120.30336.rjw@sisk.pl>

On Sun, 24 May 2009 01:20:29 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Saturday 23 May 2009, Johannes Berg wrote:
> > On Sat, 2009-05-23 at 00:23 +0200, Rafael J. Wysocki wrote:
> > 
> > > > I just arrived at the same conclusion, heh. I can't say I
> > > > understand these changes though, the part about calling the
> > > > platform differently may make sense, but why disable
> > > > non-boot CPUs at a different place?
> > > 
> > > Because the ordering of platform callbacks and cpu[_up()|_down()]
> > > is also important, at least on resume.
> > > 
> > > In principle we can call device_pm_unlock() right before calling
> > > disable_nonboot_cpus() and take the lock again right after calling
> > > enable_nonboot_cpus(), if that helps.
> > 
> > Probably, unless the cpu_add_remove_lock wasn't a red herring after
> > all. I'd test, but I don't have much time today, will be travelling
> > tomorrow and be at UDS all week next week so I don't know when I'll
> > get to it -- could you provide a patch and also attach it to
> > http://bugzilla.kernel.org/show_bug.cgi?id=13245 please? Miles (the
> > reporter of that bug) has been very helpful in testing before.
> 
> OK
> 
> The patch is appended for reference (Alan, please have a look; I
> can't recall why exactly we have called device_pm_lock() from the
> core suspend/hibernation code instead of acquiring the lock locally
> in drivers/base/power/main.c) and I'll attach it to the bug entry too.
> 
> Thanks,
> Rafael
> 
> ---
> From: Rafael J. Wysocki <rjw@sisk.pl>
> Subject: PM: Do not hold dpm_list_mtx while disabling/enabling
> nonboot CPUs
> 
> We shouldn't hold dpm_list_mtx while executing
> [disable|enable]_nonboot_cpus(), because theoretically this may lead
> to a deadlock as shown by the following example (provided by Johannes
> Berg):
> 
> CPU 3       CPU 2                     CPU 1
>                                       suspend/hibernate
>             something:
>             rtnl_lock()               device_pm_lock()
>                                        -> mutex_lock(&dpm_list_mtx)
> 
>             mutex_lock(&dpm_list_mtx)
> 
> linkwatch_work
>  -> rtnl_lock()
>                                       disable_nonboot_cpus()
>                                        -> flush CPU 3 workqueue
> 
> Fortunately, device drivers are supposed to stop any activities that
> might lead to the registration of new device objects and/or to the
> removal of the existing ones way before disable_nonboot_cpus() is
> called, so it shouldn't be necessary to hold dpm_list_mtx over the
> entire late part of device suspend and early part of device resume.
> 
> Thus, during the late suspend and the early resume of devices acquire
> dpm_list_mtx only when dpm_list is going to be traversed and release
> it right after that.
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
>  drivers/base/power/main.c |    4 ++++
>  kernel/kexec.c            |    2 --
>  kernel/power/disk.c       |   21 +++------------------
>  kernel/power/main.c       |    7 +------
>  4 files changed, 8 insertions(+), 26 deletions(-)
> 

I tried to apply the patch against the latest next tree (2009-05-22), but
"patch -p1" failed:


[lm@linux-lm linux-2.6]$ patch -p1 <  ../patch_rx/INFO_possible_circular_locking_dependency_at_cleanup_workqueue_thread.patch 
patching file kernel/power/disk.c
Hunk #1 succeeded at 215 with fuzz 2.
Hunk #3 succeeded at 278 with fuzz 1.
Hunk #4 FAILED at 343.
Hunk #5 succeeded at 396 with fuzz 2 (offset -4 lines).
Hunk #6 FAILED at 454.
Hunk #7 succeeded at 485 with fuzz 2.
2 out of 7 hunks FAILED -- saving rejects to file kernel/power/disk.c.rej
patching file kernel/power/main.c
Hunk #1 succeeded at 289 with fuzz 1 (offset 18 lines).
patching file drivers/base/power/main.c
Hunk #3 succeeded at 616 with fuzz 2.
Hunk #4 succeeded at 625 with fuzz 2.
patching file kernel/kexec.c
Hunk #1 succeeded at 1451 with fuzz 2.
Hunk #2 succeeded at 1488 with fuzz 2.



> Index: linux-2.6/kernel/power/disk.c
> ===================================================================
> --- linux-2.6.orig/kernel/power/disk.c
> +++ linux-2.6/kernel/power/disk.c
> @@ -215,8 +215,6 @@ static int create_image(int platform_mod
>  	if (error)
>  		return error;
>  
> -	device_pm_lock();
> -
>  	/* At this point, device_suspend() has been called, but *not*
>  	 * device_power_down(). We *must* call device_power_down() now.
>  	 * Otherwise, drivers for some devices (e.g. interrupt controllers)
> @@ -227,7 +225,7 @@ static int create_image(int platform_mod
>  	if (error) {
>  		printk(KERN_ERR "PM: Some devices failed to power down, "
>  			"aborting hibernation\n");
> -		goto Unlock;
> +		return error;
>  	}
>  
>  	error = platform_pre_snapshot(platform_mode);
> @@ -280,9 +278,6 @@ static int create_image(int platform_mod
>  	device_power_up(in_suspend ?
>  		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
>  
> - Unlock:
> -	device_pm_unlock();
> -
>  	return error;
>  }
>  
> @@ -348,13 +343,11 @@ static int resume_target_kernel(bool pla
>  {
>  	int error;
>  
> -	device_pm_lock();
> -
>  	error = device_power_down(PMSG_QUIESCE);
>  	if (error) {
>  		printk(KERN_ERR "PM: Some devices failed to power down, "
>  			"aborting resume\n");
> -		goto Unlock;
> +		return error;
>  	}
>  
>  	error = platform_pre_restore(platform_mode);
> @@ -407,9 +400,6 @@ static int resume_target_kernel(bool pla
>  
>  	device_power_up(PMSG_RECOVER);
>  
> - Unlock:
> -	device_pm_unlock();
> -
>  	return error;
>  }
>  
> @@ -468,11 +458,9 @@ int hibernation_platform_enter(void)
>  		goto Resume_devices;
>  	}
>  
> -	device_pm_lock();
> -
>  	error = device_power_down(PMSG_HIBERNATE);
>  	if (error)
> -		goto Unlock;
> +		goto Resume_devices;
>  
>  	error = hibernation_ops->prepare();
>  	if (error)
> @@ -497,9 +485,6 @@ int hibernation_platform_enter(void)
>  
>  	device_power_up(PMSG_RESTORE);
>  
> - Unlock:
> -	device_pm_unlock();
> -
>   Resume_devices:
>  	entering_platform_hibernation = false;
>  	device_resume(PMSG_RESTORE);
> Index: linux-2.6/kernel/power/main.c
> ===================================================================
> --- linux-2.6.orig/kernel/power/main.c
> +++ linux-2.6/kernel/power/main.c
> @@ -271,12 +271,10 @@ static int suspend_enter(suspend_state_t
>  {
>  	int error;
>  
> -	device_pm_lock();
> -
>  	if (suspend_ops->prepare) {
>  		error = suspend_ops->prepare();
>  		if (error)
> -			goto Done;
> +			return error;
>  	}
>  
>  	error = device_power_down(PMSG_SUSPEND);
> @@ -325,9 +323,6 @@ static int suspend_enter(suspend_state_t
>  	if (suspend_ops->finish)
>  		suspend_ops->finish();
>  
> - Done:
> -	device_pm_unlock();
> -
>  	return error;
>  }
>  
> Index: linux-2.6/drivers/base/power/main.c
> ===================================================================
> --- linux-2.6.orig/drivers/base/power/main.c
> +++ linux-2.6/drivers/base/power/main.c
> @@ -357,6 +357,7 @@ static void dpm_power_up(pm_message_t st
>  {
>  	struct device *dev;
>  
> +	mutex_lock(&dpm_list_mtx);
>  	list_for_each_entry(dev, &dpm_list, power.entry)
>  		if (dev->power.status > DPM_OFF) {
>  			int error;
> @@ -366,6 +367,7 @@ static void dpm_power_up(pm_message_t st
>  			if (error)
>  				pm_dev_err(dev, state, " early", error);
>  		}
> +	mutex_unlock(&dpm_list_mtx);
>  }
>  
>  /**
> @@ -614,6 +616,7 @@ int device_power_down(pm_message_t state
>  	int error = 0;
>  
>  	suspend_device_irqs();
> +	mutex_lock(&dpm_list_mtx);
>  	list_for_each_entry_reverse(dev, &dpm_list, power.entry) {
>  		error = suspend_device_noirq(dev, state);
>  		if (error) {
> @@ -622,6 +625,7 @@ int device_power_down(pm_message_t state
>  		}
>  		dev->power.status = DPM_OFF_IRQ;
>  	}
> +	mutex_unlock(&dpm_list_mtx);
>  	if (error)
>  		device_power_up(resume_event(state));
>  	return error;
> Index: linux-2.6/kernel/kexec.c
> ===================================================================
> --- linux-2.6.orig/kernel/kexec.c
> +++ linux-2.6/kernel/kexec.c
> @@ -1451,7 +1451,6 @@ int kernel_kexec(void)
>  		error = device_suspend(PMSG_FREEZE);
>  		if (error)
>  			goto Resume_console;
> -		device_pm_lock();
>  		/* At this point, device_suspend() has been called,
>  		 * but *not* device_power_down(). We *must*
>  		 * device_power_down() now.  Otherwise, drivers for
> @@ -1489,7 +1488,6 @@ int kernel_kexec(void)
>  		enable_nonboot_cpus();
>  		device_power_up(PMSG_RESTORE);
>   Resume_devices:
> -		device_pm_unlock();
>  		device_resume(PMSG_RESTORE);
>   Resume_console:
>  		resume_console();
> --
> To unsubscribe from this list: send the line "unsubscribe
> linux-kernel" in the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



-- 
Lei Ming
