From: Maxim <maximlevitsky@gmail.com>
To: nigel@nigel.suspend2.net
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>, linux-kernel@vger.kernel.org
Subject: Re: [BUG] Code reordering in swsusp breaks suspend on SMP systems
Date: Thu, 22 Mar 2007 02:32:30 +0200
Message-ID: <200703220232.30694.maximlevitsky@gmail.com>
In-Reply-To: <200703220114.05228.maximlevitsky@gmail.com>

On Thursday 22 March 2007 01:14:05 Maxim wrote:
> On Wednesday 21 March 2007 23:22:40 Nigel Cunningham wrote:
> > Hi.
> > 
> > On Wed, 2007-03-21 at 18:40 +0200, Maxim Levitsky wrote:
> > > Hi,
> > > 
> > > Starting with 2.6.21-rc1, suspend to RAM and suspend to disk no longer work on my system.
> > > 
> > > I did a git-bisect and found that these commits break it:
> > > 
> > > e3c7db621bed4afb8e231cb005057f2feb5db557 - [PATCH] [PATCH] PM: Change code ordering in main.c
> > > ed746e3b18f4df18afa3763155972c5835f284c5 - [PATCH] [PATCH] swsusp: Change code ordering in disk.c
> > > 259130526c267550bc365d3015917d90667732f1 - [PATCH] [PATCH] swsusp: Change code ordering in user.c
> > > 
> > > I already reported this, but now I know why suspend breaks.
> > > 
> > > The problem is that both cpu_up and cpu_down have been allowed to sleep until now,
> > > and this worked because those functions could only be called in process context
> > > (either the process that writes to /sys/devices/system/cpu/cpu*/online, or the idle thread that does smp_init()).
> > > 
> > > But now they are called _after_ all tasks have been frozen, so if cpu_down tries, for example, to take a lock
> > > that is held by a different process, it can't, since that process is frozen and can't release the lock.
> > > 
> > > I tested this, and all the results confirm it:
> > > 
> > > I disabled the 2nd CPU by hand, and then suspend to RAM was successful.
> > > Suspend to disk went correctly, but it hung on resume, and I know why.
> > > It hung in the old kernel while trying to disable the 2nd CPU that it had enabled.
> > > 
> > > Using kdb, I was able to confirm this, because it was still possible to enter kdb and see that
> > > the idle thread (swapper) was active and uswsusp was waiting on a mutex inside workqueue_cpu_callback.
> > > 
> > > The solution seems to be either a complete audit of the code that uses register_cpu_notifier,
> > > to ensure that it doesn't sleep (the documentation should also be updated to say so),
> > > or simply reverting this change.
> > 
> > Do you know exactly which mutex was being waited on and where it was
> > taken? If you can say that, it would be much more helpful.
> > 
> > Regards,
> > 
> > Nigel
> > 
> > 
> 
> Hello,
> 
> 	It is workqueue_mutex
> 	and it is taken in kernel/workqueue.c:797
> 
> 	This is the fault of freezable workqueues, which XFS uses (and I use XFS).
> 
> 	Thanks to Rafael J. Wysocki for pointing it out to me.
> 
> 	Regards,
> 		Maxim Levitsky
> 


I think I made a mistake.

I have now reverted the patch that fixes the above error and wrote down the backtrace from this hang.
It appears that the system hangs in kthread_stop.

This is what I wrote down on paper, using kdb:

workqueue_cpu_callback ->
cleanup_workqueue_thread ->
kthread_stop ->

and then wake_up_process (I think; I didn't write this one down on paper, so I will check again).


	Regards,
		Maxim Levitsky
