All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: Linux-pm mailing list <linux-pm@lists.linux-foundation.org>,
	Kernel development list <linux-kernel@vger.kernel.org>
Subject: Re: [linux-pm] Fundamental flaw in system suspend, exposed by freezer removal
Date: Wed, 27 Feb 2008 20:50:39 +0100	[thread overview]
Message-ID: <200802272050.39769.rjw@sisk.pl> (raw)
In-Reply-To: <Pine.LNX.4.44L0.0802271029280.4920-100000@iolanthe.rowland.org>

On Wednesday, 27 of February 2008, Alan Stern wrote:
> On Wed, 27 Feb 2008, Rafael J. Wysocki wrote:
> 
> > > I've got some ideas on how to implement this.
> > > 
> > > We can add a new field "suspend_called" to dev->power.
> > 
> > I'd call it "sleeping" or something like this, for it will also be used by
> > hibernation callbacks.
> 
> The name refers to the "suspend" method, not the type of sleep being
> carried out.  We use the same method for both suspend and hibernation.

We won't in the future.

> But maybe "sleeping" would be better.
> 
> > > It would be owned by the PM core (protect by dpm_list_mtx) and read-only to
> > > drivers.  Normally it will contain 0, but when the suspend method is
> > > running we set it to SUSPEND_RUNNING and when the method returns
> > > successfully we set it to SUSPEND_DONE.  Before calling the resume
> > > method we set it back to 0.
> > 
> > Why before?  I'd think that any non-suspended children should not be visible
> > by the partent's ->resume().
> 
> All right, we can set it to RESUME_RUNNING before calling the resume
> method and then set it to 0 afterwards.  The point is that the value
> shouldn't remain SUSPEND_DONE while resume runs, because it should be
> legal for resume to register new children.

I'm not sure.  The core moves the device to dpm_active only after ->resume()
has run.  Thus, if ->resume() registers new children, the ordering of
dpm_active will be wrong.

> > > When a new device is registered we check its parent's suspend_called
> > > value.  If it is SUSPEND_DONE then the caller has a bug and we have to
> > > fail the registration.  If it is SUSPEND_RUNNING then the registration
> > > is legal, but we remember what happened.
> > 
> > This seems to require some trickery.  Namely, device_add() will notice that
> > the registration is done concurrently with the running ->suspend() of the
> > parent and will have to communicate that to dpm_suspend() which is supposed
> > to resume the master in the next step.
> 
> It will get noticed in device_pm_add() while holding dpm_list_mtx.  
> The information can be stored in a static private flag
> "child_added_while_parent_suspends" (or maybe something more terse!).

Hmm, yes, we can do it this way.

> > > Then when the currently-running suspend method returns and we reacquire the
> > > dpm_list_mtx, we will realize that a race was lost.
> > 
> > How exactly do you want to check that?
> 
> Check whether child_added_while_parent_suspends is nonzero.
> 
> > > If the method completed successfully (which it shouldn't) we can resume that
> > > device immediately without ever taking it off the dpm_active list; but either
> > > way we should continue the suspend loop.  Now the new child will be at
> > > the end of the dpm_active_list, so it will be suspended before the
> > > parent is reached again.
> > > 
> > > This way we can recover from drivers that are willing to suspend their 
> > > device even though there are unsuspended children.  The only drawback 
> > > will be that for a short time the child will be active while its parent 
> > > is suspended.
> > 
> > Well, if the parent is a bus, that will be a problem.
> 
> Sure.  But it won't be the PM core's problem; it will be a bug in the
> bus's driver.  We will print a warning in the log so the bug can be 
> tracked down.
> 
> > > We should not abort the entire sleep transition simply because we lost 
> > > a race.
> > 
> > I don't agree here.  If we require drivers to prevent such races from happening
> > and they don't comply, we can give up instead of trying to work around the
> > non-compilance.
> 
> You misunderstand.

Well, I misunderstood indeed.

> We can't require drivers to prevent these races entirely.  As an example, a
> properly-written, compliant driver might work like this:
> 
> 	Task 0				Task 1
> 	------				------
> 	dev->power.sleeping =
> 	  SUSPEND_RUNNING;
> 	Call (drv->suspend)(dev)
> 					Register a child below dev
> 	suspend method prevents new
> 	  child registrations
> 	suspend method waits for
> 	  existing registration to
> 	  finish
> 					Check dev->power.sleeping and set
> 					  child_added_while_parent_suspends
> 					Registration completes successfully
> 	suspend method sees there is
> 	  an unsuspended child and
> 	  returns -EBUSY
> 
> 	Check child_added_while_parent_suspends
> 	  and realize that we lost the race
> 
> There's nothing illegal about this; it's just an accident of timing.  
> Nothing has gone wrong and we shouldn't abort the sleep.  We should
> continue where we left off, by suspending the new child and then trying
> to suspend the parent again.
> 
> > > With this scheme we won't even need the pm_sleep_rwsem; the  
> > > dpm_list_mtx will provide all the necessary protection.
> > > 
> > > This is more intricate than it should be.  It would have been better to
> > > have had "disable_new_children" and "enable_new_children" methods from
> > > the beginning; then there wouldn't be any races at all.  That's life...
> > > 
> > > The one tricky thing to watch out for is when a suspend or resume 
> > > method wants to unregister the device being suspended or resumed.
> > 
> > That can't happen, because dev->sem is taken by suspend_device() and
> > device_del() would lock up attempting to acquire it once again.
> 
> We'll have to fix device_del() to prevent that from happening.  Your 
> in_sleep_context() approach should work.

I'm not sure if we need to do it.  It's always been like this, so the current
drivers' ->suspend() and ->resume() don't unregister the device they're called
for.  I don't see any advantage from doing that for future drivers.

> > > Unregistration should always be allowed, and registration should be 
> > > allowed whenever the parent isn't suspended.
> > 
> > I'm still thinking that registering while the parent is suspending should not
> > be allowed.
> 
> Unfortunately the lack of "prevent_new_children" and 
> "allow_new_children" methods gives us no choice.  The example above 
> shows why.

Yes, it does.

Thanks,
Rafael

  reply	other threads:[~2008-02-27 19:52 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-25 15:39 Fundamental flaw in system suspend, exposed by freezer removal Alan Stern
2008-02-25 19:46 ` Alan Stern
2008-02-25 19:46 ` [linux-pm] " Alan Stern
2008-02-25 22:25   ` Rafael J. Wysocki
2008-02-25 23:37     ` Alan Stern
2008-02-25 23:37     ` [linux-pm] " Alan Stern
2008-02-26  0:07       ` Rafael J. Wysocki
2008-02-26 15:49         ` Alan Stern
2008-02-26 15:49         ` [linux-pm] " Alan Stern
2008-02-26 23:17           ` Rafael J. Wysocki
2008-02-27 16:03             ` Alan Stern
2008-02-27 16:03               ` [linux-pm] " Alan Stern
2008-02-27 19:50               ` Rafael J. Wysocki [this message]
2008-02-27 20:15                 ` Alan Stern
2008-02-27 20:15                 ` Alan Stern
2008-02-28 22:49                 ` Alan Stern
2008-02-28 22:49                 ` [linux-pm] " Alan Stern
2008-02-29  0:01                   ` Rafael J. Wysocki
2008-02-29  0:01                   ` Rafael J. Wysocki
2008-02-29 14:26                   ` Rafael J. Wysocki
2008-02-29 14:26                   ` [linux-pm] " Rafael J. Wysocki
2008-02-29 15:53                     ` Alan Stern
2008-02-29 15:53                     ` [linux-pm] " Alan Stern
2008-02-29 17:02                       ` Rafael J. Wysocki
2008-02-29 17:02                       ` [linux-pm] " Rafael J. Wysocki
2008-02-29 18:42                         ` Alan Stern
2008-02-29 18:42                         ` [linux-pm] " Alan Stern
2008-02-29 21:57                           ` Rafael J. Wysocki
2008-02-29 21:57                           ` [linux-pm] " Rafael J. Wysocki
2008-02-29 22:46                             ` Alan Stern
2008-03-01  0:13                               ` Rafael J. Wysocki
2008-03-01 15:30                                 ` Alan Stern
2008-03-01 15:30                                 ` [linux-pm] " Alan Stern
2008-03-02 13:37                                   ` Rafael J. Wysocki
2008-03-02 13:37                                   ` [linux-pm] " Rafael J. Wysocki
2008-03-02 16:22                                     ` Alan Stern
2008-03-02 19:11                                       ` Rafael J. Wysocki
2008-03-02 19:11                                       ` [linux-pm] " Rafael J. Wysocki
2008-03-03  3:54                                         ` Alan Stern
2008-03-03 16:32                                           ` Rafael J. Wysocki
2008-03-03 17:43                                             ` Alan Stern
2008-03-03 17:43                                               ` [linux-pm] " Alan Stern
2008-03-03 20:47                                               ` Rafael J. Wysocki
2008-03-03 20:47                                               ` [linux-pm] " Rafael J. Wysocki
2008-03-03 22:48                                                 ` Alan Stern
2008-03-03 22:48                                                 ` [linux-pm] " Alan Stern
2008-03-03 22:56                                                   ` Rafael J. Wysocki
2008-03-03 22:56                                                   ` [linux-pm] " Rafael J. Wysocki
2008-03-03 23:12                                                     ` Alan Stern
2008-03-03 23:18                                                       ` Rafael J. Wysocki
2008-03-03 23:18                                                       ` Rafael J. Wysocki
2008-03-03 23:12                                                     ` Alan Stern
2008-03-03 16:32                                           ` Rafael J. Wysocki
2008-03-03  3:54                                         ` Alan Stern
2008-03-02 16:22                                     ` Alan Stern
2008-03-01  0:13                               ` Rafael J. Wysocki
2008-02-29 22:46                             ` Alan Stern
2008-02-27 19:50               ` Rafael J. Wysocki
2008-02-26 23:17           ` Rafael J. Wysocki
2008-02-26  0:07       ` Rafael J. Wysocki
2008-02-26  7:13     ` David Brownell
2008-02-26  7:13     ` [linux-pm] " David Brownell
2008-02-26  8:25       ` David Newall
2008-02-26  8:25       ` [linux-pm] " David Newall
2008-02-26  9:16         ` David Brownell
2008-02-26  9:16         ` [linux-pm] " David Brownell
2008-02-26 13:36           ` David Newall
2008-02-26 15:58             ` Alan Stern
2008-02-26 15:58             ` Alan Stern
2008-02-26 13:36           ` David Newall
2008-02-25 22:25   ` Rafael J. Wysocki
2008-02-25 22:24 ` Rafael J. Wysocki
2008-02-25 22:24 ` Rafael J. Wysocki
2008-02-27 20:36 ` Benjamin Herrenschmidt
2008-02-27 20:36 ` Benjamin Herrenschmidt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200802272050.39769.rjw@sisk.pl \
    --to=rjw@sisk.pl \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@lists.linux-foundation.org \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.