public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed
From: Johannes Berg <johannes@sipsolutions.net>
To: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Andrew Morton <akpm@osdl.org>, Linus Torvalds <torvalds@osdl.org>,
	Dave Vasilevsky <djvasi@gmail.com>, Pavel Machek <pavel@ucw.cz>,
	Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>,
	Nigel Cunningham <ncunningham@cyclades.com>,
	linux-pm <linux-pm@lists.osdl.org>
Subject: SMP suspend broken due to "swsusp: Change code ordering in disk.c"	et al.
Date: Fri, 23 Feb 2007 04:29:45 +0100	[thread overview]
Message-ID: <1172201385.15769.32.camel@johannes.berg> (raw)


[-- Attachment #1.1: Type: text/plain, Size: 2439 bytes --]

Hi,

After first debugging a while and then bisecting I found out why my quad
G5 won't suspend any longer.

Let me explain. The patch in question (committed as
ed746e3b18f4df18afa3763155972c5835f284c5, but the other ones around that
for other suspend methods will have the same problems) modifies the
suspend sequence to be like this:

freeze_processes
swsusp_shrink_memory
platform_prepare
device_suspend
disable_nonboot_cpus
[...]

while previously it was

disable_nonboot_cpus
freeze_processes
platform_prepare
swsusp_shrink_memory
[...]


The only thing I'm worried about here is the ordering of
freeze_processes vs. disable_nonboot_cpus. The problem with this new
ordering is with workqueues, specifically freezable per-CPU workqueues
which consist of one kthread per CPU, bound to a single CPU. Now, when
CPUs are hot-unplugged, the workqueue code (by having a cpu notifier
called) will kill the thread for the CPU that is being unplugged. If you
look into kernel/workqueue.c, you'll notice that this is done by a
regular kthread_stop() as one might expect.

However, and this is the problem, for any freezable workqueue, the
workqueue kthread will be frozen at this point! Hence, kthread_stop()
will wait forever for the thread to finish, blocking the suspend
process.

Now, as for a solution, I don't really have a great idea yet. We have a
bunch of things we could do:
 (1) simply change the ordering to disable nonboot CPUs much earlier
 (2a) teach kthread_stop() about frozen processes and that it doesn't
      need to wait for them because they'll die once they wake up again
 (2b) teach kthread_stop() about frozen processes modify the freezer to
      allow waking up a process that is destined to die
 (3) teach the workqueue code about suspend

Of these options,

(1) would work, but also only punts the problem until someone wants to
do multi-threaded suspend (as if...).

(2a) would sort-of work, but what if someone unplugs a CPU while the
system is suspended [will that even work]? the thread would get really
stuck there, bound to a CPU that no longer exists.

(2b) should be possible, but would require some sort of per-thread
exit-the-freezer API

(3) is icky

I think I prefer (2b) or alternatively (1). In any case, with the commit
mentioned above reverted, my quad G5 can suspend to disk again and I'm
happy that it isn't my fault ;)

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



             reply	other threads:[~2007-02-23  3:29 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-02-23  3:29 Johannes Berg [this message]
2007-02-23 11:54 ` SMP suspend broken due to "swsusp: Change code ordering in disk.c" et al Rafael J. Wysocki
2007-02-23 12:17   ` Johannes Berg
2007-02-23 13:25     ` Rafael J. Wysocki
2007-02-23 20:23       ` Johannes Berg
2007-02-24  0:01         ` Rafael J. Wysocki
2007-02-24  0:31           ` Johannes Berg
2007-02-24  8:57             ` Rafael J. Wysocki
2007-02-24 20:54               ` Rafael J. Wysocki
2007-02-24 21:07                 ` Johannes Berg
2007-03-12 16:57                 ` Roman Jarosz
2007-03-12 18:14                   ` Rafael J. Wysocki
2007-02-23 13:31     ` Johannes Berg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1172201385.15769.32.camel@johannes.berg \
    --to=johannes@sipsolutions.net \
    --cc=akpm@osdl.org \
    --cc=alexey.y.starikovskiy@linux.intel.com \
    --cc=djvasi@gmail.com \
    --cc=linux-pm@lists.osdl.org \
    --cc=ncunningham@cyclades.com \
    --cc=pavel@ucw.cz \
    --cc=rjw@sisk.pl \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox