public inbox for linux-pm@vger.kernel.org
* Suspended devices and drivers
@ 2005-09-04 16:13 Alan Stern
  2005-09-04 20:46 ` David Brownell
  0 siblings, 1 reply; 20+ messages in thread
From: Alan Stern @ 2005-09-04 16:13 UTC
  To: Linux-pm mailing list

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1626 bytes --]

Is there any recommendation about what to do when a driver is probed
for a suspended device?  Probably most drivers don't bother to check
the device state; if they encounter errors because the device doesn't
respond the way they expect, the probe will simply fail.  (The USB
subsystem, for example, doesn't take this into account.)  One possible
answer is that this should be handled at the bus level -- the bus
subsystem could be responsible for setting a device to full power
before probing, or it could rely on individual device drivers doing
whatever they need.  It would be nicer if a common solution could be
found that would work uniformly for all devices and buses.  (As part
of such a solution, buses could have a standard policy that devices
with no driver should be left in a low-power state.)

A related problem is faced by USB drivers in a boot kernel.  The
current design relies on USB devices maintaining their state across a
suspend/resume, even suspend to disk.  This makes things difficult
when resuming from disk; the boot kernel has to realize that it
shouldn't disturb the state of any USB devices.  At the moment this
isn't handled very well.  For instance, it would be largely a matter
of luck if you could do STD with the image stored in a swap partition
on a USB storage device.

In fact, maybe it's a mistake to expect USB devices to maintain their
state across STD.  After all, devices on the motherboard aren't
expected to; it's generally accepted that drivers will restore
whatever state is necessary as part of their resume procedure.  Why
shouldn't USB devices behave the same way?

Alan Stern



* Re: Suspended devices and drivers
  2005-09-04 16:13 Suspended devices and drivers Alan Stern
@ 2005-09-04 20:46 ` David Brownell
  2005-09-05  1:25   ` Alan Stern
  0 siblings, 1 reply; 20+ messages in thread
From: David Brownell @ 2005-09-04 20:46 UTC
  To: stern, linux-pm

[-- Attachment #1: Type: text/plain, Size: 4987 bytes --]

> Date: Sun, 4 Sep 2005 12:13:58 -0400 (EDT)
> From: Alan Stern <stern@rowland.harvard.edu>
>
> Is there any recommendation about what to do when a driver is probed
> for a suspended device?

Sure; see below ("common solution").


>	Probably most drivers don't bother to check
> the device state; if they encounter errors because the device doesn't
> respond the way they expect, the probe will simply fail.  (The USB
> subsystem, for example, doesn't take this into account.)

Actually the USB HCDs do take that into account.  The "subsystem"
does as much as it can:  PCI HCDs have utilities to share.  The first
thing the usb_hcd_pci_probe() routine does is call pci_enable_device(),
which puts the HC in D0 state (assuming the HC supports PCI PM at all).
In fact that's much of _why_ that call is made so early ...

The non-PCI HCDs necessarily work a bit differently, and can't share
any bus-specific code; there can't be much of a "subsystem" role.


>		One possible
> answer is that this should be handled at the bus level -- the bus
> subsystem could be responsible for setting a device to full power
> before probing, or it could rely on individual device drivers doing
> whatever they need.  It would be nicer if a common solution could be
> found that would work uniformly for all devices and buses.

The common solution is what it's always been:  drivers make no
assumptions about initial device state, and are ready to initialize
from scratch in their probe() routines.

But as Ben and others may point out, that's not always possible.
On PCs, video drivers often rely on proprietary BIOS setup; and many
other drivers use information passed from boot firmware through chip
registers.  So there will be a few exceptions to any such rule.
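
As a rough illustration of that rule, here is a toy model in Python (not
kernel code).  The Device class, its power_state field, and the "ctrl"
register are hypothetical names made up for the sketch; the only point is
that probe() forces full power and reprograms everything rather than
trusting what it finds.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical stand-in for a piece of hardware."""
    power_state: str = "D3"                        # may have been left suspended
    registers: dict = field(default_factory=dict)  # may hold stale settings


def probe(dev: Device) -> bool:
    """Bind to the device without trusting the state it was left in."""
    dev.power_state = "D0"       # go to full power first
    dev.registers.clear()        # throw away whatever was configured before
    dev.registers["ctrl"] = 0x1  # program known-good defaults from scratch
    return True


# A device left suspended, with leftover register contents.
stale = Device(power_state="D3", registers={"ctrl": 0xFF, "junk": 7})
probe(stale)
```

The exceptions mentioned above (firmware-provided video setup, values
handed over in chip registers) are exactly the cases where the clear()
step would destroy information the driver cannot recreate.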


>	(As part
> of such a solution, buses could have a standard policy that devices
> with no driver should be left in a low-power state.)

I'd like to see that "left in low power state" become a common policy,
but suspect it'd be awkward as a "must do".  "Should do", fine.

Sometimes the boot firmware won't implement that policy.  (So Linux
would need device-specific code to do that; oops that's a driver!)
Also, sometimes it'd involve chip docs Linux developers don't have.


> A related problem is faced by USB drivers in a boot kernel.  The
> current design relies on USB devices maintaining their state across a
> suspend/resume, even suspend to disk.

I don't know any particular way they'd "rely" on that.  Modulo any
recent bugs, the transitions from a USB suspend state are either
(a) resuming, or (b) disconnecting.  You imply (b) won't happen.


>	This makes things difficult
> when resuming from disk; the boot kernel has to realize that it
> shouldn't disturb the state of any USB devices.  At the moment this
> isn't handled very well.  For instance, it would be largely a matter
> of luck if you could do STD with the image stored in a swap partition
> on a USB storage device.

I'm told that boot-from-USB (or reboot) still needs some work.  :)

Also, some of the suspend/resume scenarios that worked a few kernels ago
stopped working.  We'll need those to work again first before adding new
configurations.


> In fact, maybe it's a mistake to expect USB devices to maintain their
> state across STD.  After all, devices on the motherboard aren't
> expected to;

They're certainly _allowed_ to do so though.  Some are quite expected
to, like the RTC ... and on many platforms, there'll be some SRAM
(typically 16+ KB) maintaining state during suspend/resume transitions
even when DRAM goes unpowered.

(Of course, powercycling is another story entirely.  And it's very
easy lately to confuse STD with the powercycling version of swsusp
snapshot-resume.  Those are quite different.)


>	it's generally accepted that drivers will restore
> whatever state is necessary as part of their resume procedure.

If the USB stack has actually suspended that device, that's appropriate.

If its port has been power cycled for any reason (including even just
unplug/replug) then it's impossible to resume; so usbcore must then
disconnect() not resume().  It'll later re-enumerate that port.


> Why shouldn't USB devices behave the same way?

USB _devices_ don't know STD from Adam.  All they know is whether they're
still suspended or not ... and the answer is NO unless both (a) VBUS
connectivity from the root hub has powered the link the whole time, which
is easy with true STD but not snapshot-resume; and (b) nobody's reset their
port, e.g. by power cycling the root hub or by some software initiated reset.

So a true system _suspend_ state should already be doing that.
Even suspend-to-disk.

But the poweroff snapshot-resume thing isn't the same thing; USB devices 
can't maintain suspend states when VBUS power gets dropped.  Those get
disconnected ... just like happens on _any_ hotpluggable bus segment.

- Dave



> Alan Stern
>


---
After the 1906 San Francisco quake, the first Federal aid arrived
within two hours.



* Re: Suspended devices and drivers
  2005-09-04 20:46 ` David Brownell
@ 2005-09-05  1:25   ` Alan Stern
  2005-09-05 17:09     ` Alan Stern
  2005-09-06  0:13     ` David Brownell
  0 siblings, 2 replies; 20+ messages in thread
From: Alan Stern @ 2005-09-05  1:25 UTC
  To: David Brownell; +Cc: linux-pm

[-- Attachment #1: Type: TEXT/PLAIN, Size: 5175 bytes --]

On Sun, 4 Sep 2005, David Brownell wrote:

> > Date: Sun, 4 Sep 2005 12:13:58 -0400 (EDT)
> > From: Alan Stern <stern@rowland.harvard.edu>
> >
> > Is there any recommendation about what to do when a driver is probed
> > for a suspended device?
> 
> Sure; see below ("common solution").
> 
> 
> >	Probably most drivers don't bother to check
> > the device state; if they encounter errors because the device doesn't
> > respond the way they expect, the probe will simply fail.  (The USB
> > subsystem, for example, doesn't take this into account.)
> 
> Actually the USB HCDs do take that into account.  The "subsystem"
> does as much as it can:  PCI HCDs have utilities to share.  The first
> thing the usb_hcd_pci_probe() routine does is call pci_enable_device(),
> which puts the HC in D0 state (assuming the HC supports PCI PM at all).
> In fact that's much of _why_ that call is made so early ...
> 
> The non-PCI HCDs necessarily work a bit differently, and can't share
> any bus-specific code; there can't be much of a "subsystem" role.

It's true that the HCDs are reasonably capable in this respect.  I was
speaking more of the USB device drivers.  Right now usb_probe_interface
returns -EHOSTUNREACH when an interface on a suspended device is probed.  
It doesn't try to resume the device.  Is this something we should add?


> > A related problem is faced by USB drivers in a boot kernel.  The
> > current design relies on USB devices maintaining their state across a
> > suspend/resume, even suspend to disk.
> 
> I don't know any particular way they'd "rely" on that.  Modulo any
> recent bugs, the transitions from a USB suspend state are either
> (a) resuming, or (b) disconnecting.  You imply (b) won't happen.

> > In fact, maybe it's a mistake to expect USB devices to maintain their
> > state across STD.  After all, devices on the motherboard aren't
> > expected to;
> 
> They're certainly _allowed_ to do so though.  Some are quite expected
> to, like the RTC ... and on many platforms, there'll be some SRAM
> (typically 16+ KB) maintaining state during suspend/resume transitions
> even when DRAM goes unpowered.
> 
> (Of course, powercycling is another story entirely.  And it's very
> easy lately to confuse STD with the powercycling version of swsusp
> snapshot-resume.  Those are quite different.)

You're right, I've been confusing the two.  Okay, let's focus our 
attention on the powercycling version of swsusp.

> >	it's generally accepted that drivers will restore
> > whatever state is necessary as part of their resume procedure.
> 
> If the USB stack has actually suspended that device, that's appropriate.
> 
> If its port has been power cycled for any reason (including even just
> unplug/replug) then it's impossible to resume; so usbcore must then
> disconnect() not resume().  It'll later re-enumerate that port.

While it is impossible to "resume" in the sense used by the USB spec, it
is still possible to get much the same effect by re-enumerating.  That's
pretty much what usb_reset_device does right now (although it doesn't ever
drop the VBUS power).

> > Why shouldn't USB devices behave the same way?
> 
> USB _devices_ don't know STD from Adam.  All they know is whether they're
> still suspended or not ... and the answer is NO unless both (a) VBUS
> connectivity from the root hub has powered the link the whole time, which
> is easy with true STD but not snapshot-resume; and (b) nobody's reset their
> port, e.g. by power cycling the root hub or by some software initiated reset.
> 
> So a true system _suspend_ state should already be doing that.
> Even suspend-to-disk.
> 
> But the poweroff snapshot-resume thing isn't the same thing; USB devices 
> can't maintain suspend states when VBUS power gets dropped.  Those get
> disconnected ... just like happens on _any_ hotpluggable bus segment.

That's my point: The devices can't maintain state, but why should they
when the driver is supposed to maintain it for them?  (Yes, some aspects
of state, like firmware updates, might be _too_ difficult for the
driver to preserve.  Such things should be in the minority, however.)

People expect that following a powercycled swsusp, their memory mappings 
to disks on the motherboard will remain intact.  Why shouldn't mappings to 
hotpluggable disks remain intact as well, provided the disk is still 
plugged in?

Admittedly, that last proviso is a tricky point.  It seems to me that it
ought to be sufficient to make a good-faith attempt at verifying that the
device present on a port after a resume is the same device that was there
before the suspend.  For example, comparing USB descriptors (possibly
including some string descriptors as well).  Sure, you could fool such a
test -- but if you do, you get what you deserve.
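
To make the idea concrete, here is a minimal sketch in Python (a toy
model, not usbcore code).  The Fingerprint type and the function name are
hypothetical; the fields are modeled on the idVendor, idProduct, and
bcdDevice values in the USB device descriptor, plus the serial-number
string when the device provides one.

```python
from typing import NamedTuple, Optional


class Fingerprint(NamedTuple):
    """Identity fields read back from a device after re-enumeration."""
    id_vendor: int         # idVendor from the device descriptor
    id_product: int        # idProduct
    bcd_device: int        # bcdDevice (device release number)
    serial: Optional[str]  # serial-number string descriptor, if any


def probably_same_device(before: Fingerprint, after: Fingerprint) -> bool:
    """Good-faith check only: equal fingerprints do not prove identity."""
    return before == after


saved = Fingerprint(0x1234, 0x5678, 0x0100, "A1B2C3")  # recorded at suspend
same = Fingerprint(0x1234, 0x5678, 0x0100, "A1B2C3")   # seen at resume
other = Fingerprint(0x1234, 0x5678, 0x0100, "ZZ9999")  # same model, other unit
```

Two units of the same model that both lack a serial number would compare
equal here, which is the "you could fool such a test" case.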

My intuition says that users won't care about the technical differences 
between hotpluggable and non-hotpluggable devices, when it comes to the 
behavior of snapshot-poweroff-resume.  They will want things Just To 
Work.

Alan Stern

P.S.: Does anyone know how Windows behaves if a running program has files
open on a floppy disk (or a USB disk) across a Hibernate?  I should try
the experiment...



* Re: Suspended devices and drivers
  2005-09-05  1:25   ` Alan Stern
@ 2005-09-05 17:09     ` Alan Stern
  2005-09-05 23:27       ` David Brownell
  2005-09-11  1:55       ` Leo L. Schwab
  2005-09-06  0:13     ` David Brownell
  1 sibling, 2 replies; 20+ messages in thread
From: Alan Stern @ 2005-09-05 17:09 UTC
  To: David Brownell; +Cc: linux-pm

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1517 bytes --]

On Sun, 4 Sep 2005, Alan Stern wrote:

> P.S.: Does anyone know how Windows behaves if a running program has files
> open on a floppy disk (or a USB disk) across a Hibernate?  I should try
> the experiment...

I tried it on my laptop, which has an old copy of Windows ME.

The floppy disk situation is about what you'd expect.  Even without 
putting the computer to sleep, you can remove a floppy disk while a 
program has an open file on it.  Windows just puts up a screen asking you 
to replace the floppy, and things continue as if nothing happened.  This 
is obviously an historical relic going back to the days of DOS.

With the USB device, things are more interesting.  If you unplug the 
device (even while it's not in use), Windows warns you not to do this 
without first getting permission by using the "Eject Removable Devices" 
button.  If you try to press that button while a program has a file open 
on the device, Windows says that you can't remove the device right now and 
advises you to try again later.

However...  I put the laptop into Hibernate mode.  To be absolutely sure 
this was a true snapshot-poweroff-resume cycle, I also unplugged the power 
cord and removed the battery.  Then I removed and replaced the USB device 
and restarted the laptop.  Everything worked smoothly; the file remained 
open and the program was able to continue reading it, well past the point 
where the I/O buffers needed to be refilled.

If Windows ME can do this, Linux should be able to do it too.

Alan Stern



* Re: Suspended devices and drivers
  2005-09-05 17:09     ` Alan Stern
@ 2005-09-05 23:27       ` David Brownell
  2005-09-06 14:50         ` Alan Stern
  2005-09-11  1:55       ` Leo L. Schwab
  1 sibling, 1 reply; 20+ messages in thread
From: David Brownell @ 2005-09-05 23:27 UTC
  To: stern; +Cc: linux-pm

[-- Attachment #1: Type: text/plain, Size: 1741 bytes --]

> With the USB device, things are more interesting.  If you unplug the
> device (even while it's not in use), Windows warns you not to do this
> without first getting permission by using the "Eject Removable Devices"
> button.  If you try to press that button while a program has a file open
> on the device, Windows says that you can't remove the device right now and
> advises you to try again later.

So it's inconsistent in behavior, since this isn't how it handles
the same thing during resume-from-hibernate ...


> However...  I put the laptop into Hibernate mode.  To be absolutely sure
> this was a true snapshot-poweroff-resume cycle, I also unplugged the power
> cord and removed the battery.  Then I removed and replaced the USB device
> and restarted the laptop.  Everything worked smoothly; the file remained
> open and the program was able to continue reading it, well past the point
> where the I/O buffers needed to be refilled.

So basically there's a special case somewhere to treat _this_ disconnect
differently than other ones.

How does real suspend behave (like STR)?  How does it handle cases where you
plug in a different instance of the same device ... example, different CF
card in a CF reader?  Or when you move the device to another port?  And
does XP behave identically?


> If Windows ME can do this, Linux should be able to do it too.

That argument can be stretched too far!  Though from time to time I have
indeed wished for something more like a BSOD.  An oops hidden in a logfile
that never gets flushed to disk, with an X desktop, gives no clues ... :)

Linux certainly _could_ try to emulate all the fault handling of some
version of Windows.  But whether it _should_ is a different story.

- Dave




* Re: Suspended devices and drivers
  2005-09-05  1:25   ` Alan Stern
  2005-09-05 17:09     ` Alan Stern
@ 2005-09-06  0:13     ` David Brownell
  2005-09-06 15:19       ` Alan Stern
  1 sibling, 1 reply; 20+ messages in thread
From: David Brownell @ 2005-09-06  0:13 UTC
  To: stern; +Cc: linux-pm

[-- Attachment #1: Type: text/plain, Size: 3397 bytes --]

> Date: Sun, 4 Sep 2005 21:25:35 -0400 (EDT)
> From: Alan Stern <stern@rowland.harvard.edu>
>
> It's true that the HCDs are reasonably capable in this respect.  I was
> speaking more of the USB device drivers.  Right now usb_probe_interface
> returns -EHOSTUNREACH when an interface on a suspended device is probed.  
> It doesn't try to resume the device.  Is this something we should add?

Probably.  That may need to work down starting at the root hub ... 
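
Roughly what "work down starting at the root hub" could look like, as a
toy Python model rather than usbcore code.  UsbNode, resume_path(), and
probe_interface() are hypothetical names for the sketch; the real change
would replace the -EHOSTUNREACH return with a resume of the device and
its ancestors.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UsbNode:
    """Hypothetical node in the USB tree (root hub, hub, or device)."""
    name: str
    parent: Optional["UsbNode"] = None
    suspended: bool = True


def resume_path(node: UsbNode, log: List[str]) -> None:
    """Resume suspended ancestors first, root hub downward, then the node."""
    if node.parent is not None:
        resume_path(node.parent, log)
    if node.suspended:
        node.suspended = False
        log.append(node.name)  # record the order in which things woke up


def probe_interface(node: UsbNode, log: List[str]) -> int:
    """Wake the whole path before probing instead of failing outright."""
    resume_path(node, log)
    return 0  # the driver's probe would run here, on a powered device
```

A parent has to be awake before its child can be reached, hence the
recursion into the parent before touching the node itself.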


> > >	it's generally accepted that drivers will restore
> > > whatever state is necessary as part of their resume procedure.
> > 
> > If the USB stack has actually suspended that device, that's appropriate.
> > 
> > If its port has been power cycled for any reason (including even just
> > unplug/replug) then it's impossible to resume; so usbcore must then
> > disconnect() not resume().  It'll later re-enumerate that port.
>
> While it is impossible to "resume" in the sense used by the USB spec, it
> is still possible to get much the same effect by re-enumerating.  That's
> pretty much what usb_reset_device does right now (although it doesn't ever
> drop the VBUS power).

The "dropped VBUS connection" link is pretty fundamental though.  The only
ways you can know the same device is there later are (a) if it's the same
VBUS power session, or (b) if none of the parent hub ports is removable.

If both of those are false, there's no way for usbcore to guarantee that the
same device is connected as was there when you suspended.
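
Spelled out as a predicate, in a small Python sketch (the function and
parameter names are hypothetical, chosen only to mirror (a) and (b)):

```python
from typing import Iterable


def same_device_guaranteed(same_vbus_session: bool,
                           parent_ports_removable: Iterable[bool]) -> bool:
    """True only when usbcore could be certain the device was not swapped.

    (a) VBUS power was never dropped, or
    (b) no port on the path up to the root hub is user-removable.
    """
    return same_vbus_session or not any(parent_ports_removable)
```

For a snapshot-poweroff-resume with the device on an external port, both
conditions fail and the function returns False.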

By the way, (b) surfaces something that might be useful for Linux at
some point.  For hotpluggable busses, not all segments are necessarily
hotpluggable.  We waste a certain amount of resources, such as memory
to store remove() methods, for hardware that's not actually removable.
Also, the reason IDE drives come back in the "same" state is that they
all assume the IDE analogue of (b).  That's fair, since re-cabling things
normally involves opening the case, implying a full reset/reboot.


> > But the poweroff snapshot-resume thing isn't the same thing; USB devices 
> > can't maintain suspend states when VBUS power gets dropped.  Those get
> > disconnected ... just like happens on _any_ hotpluggable bus segment.
>
> That's my point: The devices can't maintain state, but why should they
> when the driver is supposed to maintain it for them?  (Yes, some aspects
> of state, like firmware updates, might be _too_ difficult for the
> driver to preserve.  Such things should be in the minority, however.)

The driver is supposed to maintain driver state; but
the device is supposed to maintain device state.

Resuming implies _both_ of them have been preserved.

The kind of disk snapshot-resume that you're talking about (involving
users potentially swapping removable media) worries me.  I'd want to
know that for example the filesystem code would be made to verify that
the filesystems were in the state they're supposed to be in, otherwise
the wrong kind of resume could easily trash disks... and right now,
I don't think those checks are done in the kernel.


> People expect that following a powercycled swsusp, their memory mappings 
> to disks on the motherboard will remain intact.  Why shouldn't mappings to 
> hotpluggable disks remain intact as well, provided the disk is still 
> plugged in?

If either (a) or eventually (b) above are true, that should "just work".

- Dave




* Re: Suspended devices and drivers
  2005-09-05 23:27       ` David Brownell
@ 2005-09-06 14:50         ` Alan Stern
  2005-09-06 15:18           ` David Brownell
  2005-09-06 19:47           ` Rafael J. Wysocki
  0 siblings, 2 replies; 20+ messages in thread
From: Alan Stern @ 2005-09-06 14:50 UTC
  To: David Brownell; +Cc: linux-pm

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2537 bytes --]

On Mon, 5 Sep 2005, David Brownell wrote:

> > With the USB device, things are more interesting.  If you unplug the
> > device (even while it's not in use), Windows warns you not to do this
> > without first getting permission by using the "Eject Removable Devices"
> > button.  If you try to press that button while a program has a file open
> > on the device, Windows says that you can't remove the device right now and
> > advises you to try again later.
> 
> So it's inconsistent in behavior, since this isn't how it handles
> the same thing during resume-from-hibernate ...

Obviously because Windows isn't aware of anything that happens during
hibernate while the power is turned off.  How could it possibly warn you
about unplugging a device when it doesn't know whether you unplugged the
device or not?

If you describe the behavior as "Windows warns you whenever it learns that
you ejected removable media without permission", then Windows _is_
consistent.

> > However...  I put the laptop into Hibernate mode.  To be absolutely sure
> > this was a true snapshot-poweroff-resume cycle, I also unplugged the power
> > cord and removed the battery.  Then I removed and replaced the USB device
> > and restarted the laptop.  Everything worked smoothly; the file remained
> > open and the program was able to continue reading it, well past the point
> > where the I/O buffers needed to be refilled.
> 
> So basically there's a special case somewhere to treat _this_ disconnect
> differently than other ones.

Yes, there must be.

> How does real suspend behave (like STR)?  How does it handle cases where you
> plug in a different instance of the same device ... example, different CF
> card in a CF reader?  Or when you move the device to another port?  And
> does XP behave identically?

I'll try doing some of those experiments when I have a chance.

> > If Windows ME can do this, Linux should be able to do it too.
> 
> That argument can be stretched too far!  Though from time to time I have
> indeed wished for something more like a BSOD.  An oops hidden in a logfile
> that never gets flushed to disk, with an X desktop, gives no clues ... :)
> 
> Linux certainly _could_ try to emulate all the fault handling of some
> version of Windows.  But whether it _should_ is a different story.

I say this is a case where we should try, at least to some extent; users 
will feel that "Powerdown-swsusp automatically removes all hot-pluggable 
devices" is too Draconian.

What do other people on the PM list think?

Alan Stern



* Re: Suspended devices and drivers
  2005-09-06 14:50         ` Alan Stern
@ 2005-09-06 15:18           ` David Brownell
  2005-09-06 15:46             ` Alan Stern
  2005-09-06 19:47           ` Rafael J. Wysocki
  1 sibling, 1 reply; 20+ messages in thread
From: David Brownell @ 2005-09-06 15:18 UTC
  To: stern; +Cc: linux-pm

[-- Attachment #1: Type: text/plain, Size: 1822 bytes --]

> > > With the USB device, things are more interesting.  If you unplug the
> > > device (even while it's not in use), Windows warns you not to do this
> > > ...
> > 
> > So it's inconsistent in behavior, since this isn't how it handles
> > the same thing during resume-from-hibernate ...
>
> ...
> If you describe the behavior as "Windows warns you whenever it learns that
> you ejected removable media without permission", then Windows _is_
> consistent.

No, it's inconsistent ... when you ejected the media before that
resume-from-snapshot, it could tell.  And it ignored it, even though
you'd not told Windows you were going to do that.


> > > If Windows ME can do this, Linux should be able to do it too.
> > 
> > That argument can be stretched too far!  Though from time to time I have
> > indeed wished for something more like a BSOD.  An oops hidden in a logfile
> > that never gets flushed to disk, with an X desktop, gives no clues ... :)
> > 
> > Linux certainly _could_ try to emulate all the fault handling of some
> > version of Windows.  But whether it _should_ is a different story.
>
> I say this is a case where we should try, at least to some extent; users 
> will feel that "Powerdown-swsusp automatically removes all hot-pluggable 
> devices" is too Draconian.

The simple solution is to use _real_ suspend states if you're going
to expect real suspend/resume behavior ... :)

Note that for things like HID devices (mice etc), users rarely notice
this since the HID driver masks enumeration through /dev/input/mice.

Also, note that we've been aiming at "removes devices" as the default
for drivers without suspend()/resume() support, because it's the ONLY
choice that's both reliable (all drivers can handle it) and safe (since
users and applications must already handle "live" unplug).

- Dave



* Re: Suspended devices and drivers
  2005-09-06  0:13     ` David Brownell
@ 2005-09-06 15:19       ` Alan Stern
  2005-09-11 19:03         ` David Brownell
  0 siblings, 1 reply; 20+ messages in thread
From: Alan Stern @ 2005-09-06 15:19 UTC
  To: David Brownell; +Cc: linux-pm

[-- Attachment #1: Type: TEXT/PLAIN, Size: 5517 bytes --]

On Mon, 5 Sep 2005, David Brownell wrote:

> > Date: Sun, 4 Sep 2005 21:25:35 -0400 (EDT)
> > From: Alan Stern <stern@rowland.harvard.edu>
> >
> > It's true that the HCDs are reasonably capable in this respect.  I was
> > speaking more of the USB device drivers.  Right now usb_probe_interface
> > returns -EHOSTUNREACH when an interface on a suspended device is probed.  
> > It doesn't try to resume the device.  Is this something we should add?
> 
> Probably.  That may need to work down starting at the root hub ... 

It will be part of the RTPM framework I've described earlier.  (BTW, I'm
waiting for your USB suspend/resume-recursion patch before starting to
work on the RTPM stuff...)


> > > If its port has been power cycled for any reason (including even just
> > > unplug/replug) then it's impossible to resume; so usbcore must then
> > > disconnect() not resume().  It'll later re-enumerate that port.
> >
> > While it is impossible to "resume" in the sense used by the USB spec, it
> > is still possible to get much the same effect by re-enumerating.  That's
> > pretty much what usb_reset_device does right now (although it doesn't ever
> > drop the VBUS power).
> 
> The "dropped VBUS connection" link is pretty fundamental though.  The only
> ways you can know the same device is there later are (a) if it's the same
> VBUS power session, or (b) if none of the parent hub ports is removable.
> 
> If both of those are false, there's no way for usbcore to guarantee that the
> same device is connected as was there when you suspended.

No way to _guarantee_ it, agreed.  But we can try to do an acceptable job
of making sure.

It's not just USB, obviously.  What happens (and what _should_ happen) if 
you swap parallel-SCSI drives while the power is off?  Or if you exchange 
floppy disks or CD-ROM discs?

> By the way, (b) surfaces something that might be useful for Linux at
> some point.  For hotpluggable busses, not all segments are necessarily
> hotpluggable.  We waste a certain amount of resources, such as memory
> to store remove() methods, for hardware that's not actually removable.

Do you know of any examples of wasted code?  Even if one instance of a bus
has non-hotpluggable segments, other instances might still be
hotpluggable.  In such cases the remove methods would indeed be necessary,
even if they weren't always used.

If you want to stretch a point, you could say that all the USB disconnect
methods are wasted for users who never unplug their USB devices while the
system is on!  :-)

> Also, the reason IDE drives come back in the "same" state is that they
> all assume the IDE analogue of (b).  That's fair, since re-cabling things
> normally involves opening the case, implying a full reset/reboot.

But it _doesn't_ imply "a full reset/reboot", if by that phrase you mean
omitting to restore a swsusp image.  There's nothing to stop you from
doing swsusp to disk with powerdown, monkeying around with your IDE drives
& cables, and then trying to do a resume from disk.  It won't work,
clearly -- but you certainly can try it anyway.


> > That's my point: The devices can't maintain state, but why should they
> > when the driver is supposed to maintain it for them?  (Yes, some aspects
> > of state, like firmware updates, might be _too_ difficult for the
> > driver to preserve.  Such things should be in the minority, however.)
> 
> The driver is supposed to maintain driver state; but
> the device is supposed to maintain device state.

Even during snapshot-poweroff-resume?  How is the device supposed to
manage that?  Unless by "device state" you mean non-volatile things, like
the contents of flash memory.  Such things present no problem for
snapshot-resume.

> Resuming implies _both_ of them have been preserved.
> 
> The kind of disk snapshot-resume that you're talking about (involving
> users potentially swapping removable media) worries me.  I'd want to
> know that for example the filesystem code would be made to verify that
> the filesystems were in the state they're supposed to be in, otherwise
> the wrong kind of resume could easily trash disks... and right now,
> I don't think those checks are done in the kernel.

Okay, that's a valid issue to be concerned about.

(Actually, I suspect we don't need to worry about things at the filesystem 
level.  If the same device is still attached, and if nobody has altered 
its contents since the snapshot was made (a reasonable assumption), then 
the filesystems will be okay.  As will the contents of raw partitions.)

The tricky part is making sure that the same device is still attached and 
it still contains the same medium.  Two different problems, which need to 
be handled at two different levels.  Checking for the same device should 
be done at the bus level, and checking for the same medium should be done 
at the disk-driver level.  In each case it's not possible to be 100% 
certain, but it is possible to detect the most likely sorts of changes.  
I.e., unless the user deliberately tries to fool the system, things should 
work correctly.
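
A minimal sketch of the two-level check described above, written as a
userspace Python model rather than kernel code.  Every name here
(DeviceIdentity, MediumIdentity, may_keep_mappings) is hypothetical, and
the fields compared are assumptions about what a bus or disk driver might
have recorded before the snapshot.  A match means only "no change
detected"; it is not proof that nothing changed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceIdentity:
    """What a bus might record about a device (illustrative fields only)."""
    vendor_id: int
    product_id: int
    serial: str  # may be empty or non-unique on real hardware


@dataclass(frozen=True)
class MediumIdentity:
    """What a disk driver might record about a medium (illustrative only)."""
    capacity_sectors: int
    volume_id: str


def same_device(before: DeviceIdentity, after: DeviceIdentity) -> bool:
    # Bus-level check: compare the identity recorded before the snapshot
    # with what the bus reports after the restore.
    return before == after


def same_medium(before: MediumIdentity, after: MediumIdentity) -> bool:
    # Disk-driver-level check: the device may be the same while the
    # medium inside it was swapped.
    return before == after


def may_keep_mappings(dev_before, dev_after, med_before, med_after) -> bool:
    # Both levels must agree before existing mappings are kept.
    return same_device(dev_before, dev_after) and same_medium(med_before, med_after)
```

If either level reports a mismatch, the conservative fallback is to treat
the device as unplugged.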


> > People expect that following a powercycled swsusp, their memory mappings 
> > to disks on the motherboard will remain intact.  Why shouldn't mappings to 
> > hotpluggable disks remain intact as well, provided the disk is still 
> > plugged in?
> 
> If either (a) or eventually (b) above are true, that should "just work".

That's the easy part.  It _should_ "just work" even if (a) and (b) aren't 
true -- that's the hard part.

Alan Stern



* Re: Suspended devices and drivers
  2005-09-06 15:18           ` David Brownell
@ 2005-09-06 15:46             ` Alan Stern
  2005-09-11 19:06               ` David Brownell
  0 siblings, 1 reply; 20+ messages in thread
From: Alan Stern @ 2005-09-06 15:46 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-pm

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2468 bytes --]

On Tue, 6 Sep 2005, David Brownell wrote:

> > > > With the USB device, things are more interesting.  If you unplug the
> > > > device (even while it's not in use), Windows warns you not to do this
> > > > ...
> > > 
> > > So it's inconsistent in behavior, since this isn't how it handles
> > > the same thing during resume-from-hibernate ...
> >
> > ...
> > If you describe the behavior as "Windows warns you whenever it learns that
> > you ejected removable media without permission", then Windows _is_
> > consistent.
> 
> No, it's inconsistent ... when you ejected the media before that
> resume-from-snapshot, it could tell.  And it ignored it, even though
> you'd not told Windows you were going to do that.

It sounds like we've got a failure to communicate.  There were two 
experiments.  In the first, I simply unplugged the device.  Windows did 
detect this and did warn me.  In the second experiment, I first did the 
snapshot-and-poweroff, then removed the device, then replaced the device, 
then did resume-from-snapshot.  In that experiment Windows could _not_ 
tell and did not warn me.

In none of the experiments was there a case where Windows could have
realized that I unplugged the device and failed to warn me about it.


> > I say this is a case where we should try, at least to some extent; users 
> > will feel that "Powerdown-swsusp automatically removes all hot-pluggable 
> > devices" is too Draconian.
> 
> The simple solution is to use _real_ suspend states if you're going
> to expect real suspend/resume behavior ... :)

It will probably be easier to add the code needed to verify devices and 
media after restore-from-snapshot than it would be to educate users about 
the difference between suspend/resume vs. snapshot/restore!  :-)

> Note that for things like HID devices (mice etc), users rarely notice
> this since the HID driver masks enumeration through /dev/input/mice.

Yes, certainly.  Mostly I'm concerned about storage devices.  Although the 
same concerns affect other things too, like audio/video devices.

> Also, that we've been aiming at that "removes devices" as the default
> for drivers without suspend()/resume() support, because it's the ONLY
> choice that's both reliable (all drivers can handle it) and safe (since
> users and applications must already handle "live" unplug).

Yes.  Eventually it would be good to get suspend/resume support into all 
the drivers that could stand to benefit from it.

Alan Stern



* Re: Suspended devices and drivers
  2005-09-06 14:50         ` Alan Stern
  2005-09-06 15:18           ` David Brownell
@ 2005-09-06 19:47           ` Rafael J. Wysocki
  1 sibling, 0 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2005-09-06 19:47 UTC (permalink / raw)
  To: linux-pm; +Cc: David Brownell

[-- Attachment #1: Type: text/plain, Size: 3327 bytes --]

Hi,

On Tuesday, 6 of September 2005 16:50, Alan Stern wrote:
> On Mon, 5 Sep 2005, David Brownell wrote:
> 
> > > With the USB device, things are more interesting.  If you unplug the
> > > device (even while it's not in use), Windows warns you not to do this
> > > without first getting permission by using the "Eject Removable Devices"
> > > button.  If you try to press that button while a program has a file open
> > > on the device, Windows says that you can't remove the device right now and
> > > advises you to try again later.
> > 
> > So it's inconsistent in behavior, since this isn't how it handles
> > the same thing during resume-from-hibernate ...
> 
> Obviously because Windows isn't aware of anything that happens during
> hibernate while the power is turned off.  How could it possibly warn you
> about unplugging a device when it doesn't know whether you unplugged the
> device or not?
> 
> If you describe the behavior as "Windows warns you whenever it learns that
> you ejected removable media without permission", then Windows _is_
> consistent.
> 
> > > However...  I put the laptop into Hibernate mode.  To be absolutely sure
> > > this was a true snapshot-poweroff-resume cycle, I also unplugged the power
> > > cord and removed the battery.  Then I removed and replaced the USB device
> > > and restarted the laptop.  Everything worked smoothly; the file remained
> > > open and the program was able to continue reading it, well past the point
> > > where the I/O buffers needed to be refilled.
> > 
> > So basically there's a special case somewhere to treat _this_ disconnect
> > differently than other ones.
> 
> Yes, there must be.
> 
> > How does real suspend behave (like STR)?  How does it handle cases where you
> > plug in a different instance of the same device ... example, different CF
> > card in a CF reader?  Or when you move the device to another port?  And
> > does XP behave identically?
> 
> I'll try doing some of those experiments when I have a chance.
> 
> > > If Windows ME can do this, Linux should be able to do it too.
> > 
> > That argument can be stretched too far!  Though from time to time I have
> > indeed wished for something more like a BSOD.  An oops hidden in a logfile
> > that never gets flushed to disk, with an X desktop, gives no clues ... :)
> > 
> > Linux certainly _could_ try to emulate up all the fault handling of some
> > version of Windows.  But whether it _should_ is a different story.
> 
> I say this is a case where we should try, at least to some extent; users 
> will feel that "Powerdown-swsusp automatically removes all hot-pluggable 
> devices" is too Draconian.
> 
> What do other people on the PM list think?

I think it is analogous to leaving a mounted CD in the drive and suspending.

If you resume the box without ejecting the CD, it remains mounted and all is fine
(I actually tested this).  I don't know what happens if you remove the CD from
the drive while suspended (from Linux's point of view), but this is a different
kettle of fish.  Anyway, IMO, the behavior should be consistent across all
devices or users will get confused.

Greetings,
Rafael


-- 
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
		-- Lewis Carroll "Alice's Adventures in Wonderland"


* Re: Suspended devices and drivers
  2005-09-05 17:09     ` Alan Stern
  2005-09-05 23:27       ` David Brownell
@ 2005-09-11  1:55       ` Leo L. Schwab
  2005-09-11 19:03         ` Alan Stern
  1 sibling, 1 reply; 20+ messages in thread
From: Leo L. Schwab @ 2005-09-11  1:55 UTC (permalink / raw)
  To: linux-pm

[-- Attachment #1: Type: text/plain, Size: 3439 bytes --]

	Apologies for the delay in response:

On Mon, Sep 05, 2005 at 01:09:04PM -0400, Alan Stern wrote:
> The floppy disk situation is about what you'd expect.  Even without 
> putting the computer to sleep, you can remove a floppy disk while a 
> program has an open file on it.  Windows just puts up a screen asking you 
> to replace the floppy, and things continue as if nothing happened.  This 
> is obviously an historical relic going back to the days of DOS.
>
	Well, not exactly.  This is what you have to do when dealing with
media that can be yanked out from under you without warning or veto.  You
have to run the filesystem with a write-through cache or no cache at all.
Thus, when you attempt to read, you simply notice there's no media present.
Throw a dialog, retry.  Is it the same volume ID?  Nope, throw a dialog,
retry.

	If you're uber-studly, all this happens completely inside the
filesystem, and the app never notices anything odd happened (except perhaps
by noticing the read() took a long time to complete).

> With the USB device, things are more interesting.  If you unplug the 
> device (even while it's not in use), Windows warns you not to do this 
> without first getting permission by using the "Eject Removable Devices" 
> button.

	What happens if you attempt to continue after this warning without
replacing the device?  What happens if you put the USB device in a
*different machine* after doing this?  Is the filesystem consistent?

> If you try to press that button while a program has a file open 
> on the device, Windows says that you can't remove the device right now and 
> advises you to try again later.
>
	Leave it to Micros~1 to fsck up something even this basic.  They're
trying to use an ungainly UI to fake a media eject request event.  If they
mounted the USB volume in write-through mode like they were supposed to,
then you should be able to yank the media any time the activity light is
off.  Further read/write attempts would notice the media was missing and
simply generate a "please put that back" dialog.

	The fact is, if you have media that can be yanked by the user
without the OS being able to stop it, then your default option should be to
mount the filesystem as write-through or uncached, or you risk filesystem
corruption.  (If the *user* requests full caching with deferred writes, then
the user takes responsibility for making sure the media doesn't go anywhere
until it's explicitly unmounted.)  You should only enable deferred writes by
default if you have high confidence that either A) the medium isn't going
anywhere; or B) you have full eject control, and can properly flush and
unmount the volume when the user presses Eject.
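
	The default-mount rule above can be modelled in a few lines of
Python.  This is only a sketch of the policy as stated in this message;
CacheMode, default_cache_mode and the parameter names are made up for
illustration and do not correspond to any real mount option or kernel
interface.

```python
from enum import Enum


class CacheMode(Enum):
    WRITE_THROUGH = "write-through"
    DEFERRED = "deferred writes"


def default_cache_mode(medium_is_fixed: bool,
                       has_eject_control: bool,
                       user_requested_deferred: bool = False) -> CacheMode:
    """Pick a cache mode for a newly mounted volume (hypothetical policy)."""
    # The user may opt in to deferred writes and accept the risk.
    if user_requested_deferred:
        return CacheMode.DEFERRED
    # Case A: the medium is not going anywhere.
    # Case B: the OS can flush and unmount before the medium is released.
    if medium_is_fixed or has_eject_control:
        return CacheMode.DEFERRED
    # Otherwise the medium can be yanked without warning, so stay safe.
    return CacheMode.WRITE_THROUGH
```

	Under this model a USB stick with no eject interlock gets
write-through by default, while an internal disk or a drive with a
software-controlled eject gets deferred writes.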

> However...  I put the laptop into Hibernate mode.  To be absolutely sure 
> this was a true snapshot-poweroff-resume cycle, I also unplugged the power 
> cord and removed the battery.  Then I removed and replaced the USB device 
> and restarted the laptop.  Everything worked smoothly; the file remained 
> open and the program was able to continue reading it, well past the point 
> where the I/O buffers needed to be refilled.
>
	A far more interesting experiment would be to hibernate the machine,
remove the USB device, resume the machine, then replace the USB device and
see how Windows reacts.  (Still more interesting tests: Attempt to continue
without replacing the device, replace the USB device in a different socket.)

					Schwab


* Re: Suspended devices and drivers
  2005-09-06 15:19       ` Alan Stern
@ 2005-09-11 19:03         ` David Brownell
  2005-09-11 20:06           ` Alan Stern
  0 siblings, 1 reply; 20+ messages in thread
From: David Brownell @ 2005-09-11 19:03 UTC (permalink / raw)
  To: stern; +Cc: linux-pm

[-- Attachment #1: Type: text/plain, Size: 5325 bytes --]

> > > 	  Right now usb_probe_interface
> > > returns -EHOSTUNREACH when an interface on a suspended device is probed.  
> > > It doesn't try to resume the device.  Is this something we should add?
> > 
> > Probably.  That may need to work down starting at the root hub ... 
>
> It will be part of the RTPM framework I've described earlier.  (BTW, I'm
> waiting for your USB suspend/resume-recursion patch before starting to
> work on the RTPM stuff...)

I've refreshed those patches and split them out into more digestible
chunks, against 2.6 GIT; I'll post at least a few in the next few days,
after retesting.
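
For illustration, here is a toy Python model of the two probe behaviors
being discussed: failing with -EHOSTUNREACH when the device is suspended,
versus resuming from the root hub downward before probing.  It is not the
usbcore code; Device, resume_chain and probe are hypothetical names, and
the resume_first flag is an assumption standing in for whatever policy is
eventually chosen.

```python
import errno


class Device:
    """Toy device-tree node; parent is None for the root hub."""

    def __init__(self, name, parent=None, suspended=False):
        self.name = name
        self.parent = parent
        self.suspended = suspended


def resume_chain(dev):
    """Resume dev and its suspended ancestors, starting at the root hub."""
    chain = []
    node = dev
    while node is not None:
        chain.append(node)
        node = node.parent
    resumed = []
    for node in reversed(chain):  # root first, target device last
        if node.suspended:
            node.suspended = False
            resumed.append(node.name)
    return resumed


def probe(dev, resume_first):
    """Return 0 on success or a negative errno, kernel style."""
    if dev.suspended:
        if not resume_first:
            # Behavior described above: refuse to probe a suspended device.
            return -errno.EHOSTUNREACH
        resume_chain(dev)
    return 0
```

The top-down order matters because a child cannot be brought to full
power while the hub above it is still suspended.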


> > The "dropped VBUS connection" link is pretty fundamental though.  The only
> > ways you can know the same device is there later are (a) if it's the same
> > VBUS power session, or (b) if none of the parent hub ports is removable.
> > 
> > If both of those are false, there's no way for usbcore to guarantee that the
> > same device is connected as was there when you suspended.
>
> No way to _guarantee_ it, agreed.  But we can try to do an acceptable job
> of making sure.

Most usable definitions of "acceptable" will end up needing to be very
driver-specific ... or vendor-specific, or system-specific, etc.

Some usb mass storage devices have non-unique serial numbers, for example,
so there's no way that even usb-storage could always give the right answer.
Maybe something interesting could be done with a certificate that "vendor
XYZ guarantees unique serial numbers" or "vendor ZZZ supports the 'green'
device authentication protocol".
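
As a rough sketch, rules (a) and (b) plus a weaker "descriptors still
match" fallback could be modelled like this in Python.  The Verdict names
and the classify function are hypothetical, and the fallback tier is an
assumption added for illustration: it corresponds to "acceptable" rather
than "guaranteed", and non-unique serial numbers can defeat it.

```python
from enum import Enum


class Verdict(Enum):
    GUARANTEED = "same device, guaranteed"
    PLAUSIBLE = "descriptors match, identity not guaranteed"
    CHANGED = "treat as disconnected"


def classify(same_vbus_session, removable_ports_on_path, descriptors_match):
    """Classify a device seen at resume time.

    removable_ports_on_path holds one boolean per hub port between the
    root hub and the device: True if a user could unplug at that port.
    """
    # Rule (a): the VBUS power session was never interrupted.
    if same_vbus_session:
        return Verdict.GUARANTEED
    # Rule (b): no port on the path is user-removable.
    if not any(removable_ports_on_path):
        return Verdict.GUARANTEED
    # Neither rule holds: the best available is a heuristic comparison.
    if descriptors_match:
        return Verdict.PLAUSIBLE
    return Verdict.CHANGED
```

What to do with a PLAUSIBLE result is exactly the policy question in
dispute here; the model only makes the three cases explicit.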


(out of order:)
> > If either (a) or eventually (b) above are true, that should "just work".
>
> That's the easy part.  It _should_ "just work" even if (a) and (b) aren't 
> true -- that's the hard part.

I disagree.  "Should" implies "could work robustly", for which I just gave
another counter-example.  (Don't forget the filesystem corruption one too.)

To me this is a correctness issue.  One that certainly has some user
interface implications;  UNIX (and hence Linux) is still on a long trek
to get rid of assumptions that once devices exist, they continue to exist.
Such bad assumptions have shown up in many user interfaces, and evidently
even in versions of the SCSI stack.  :(

But it's just the flip side of hotplugging (hot UNplugging) ... and in the
same way that userspace needed to learn that devices may appear (and need
to be set up, used, managed, etc), so must it also learn they may disappear
at various times.  Including "while suspended".


> It's not just USB, obviously.  What happens (and what _should_ happen) if 
> you swap parallel-SCSI drives while the power is off?

It's common for specs (like for your stereo, and ISTR ACPI) to emphasize
that once you open the case (e.g. to swap SCSI or IDE drives), resume
behavior is no longer specified (e.g. to behave sanely) ...

What "should" happen in such cases is as much a function of your own
knowledge and skills, and the whole-system spec, as of kernel behavior. :)



> > > That's my point: The devices can't maintain state, but why should they
> > > when the driver is supposed to maintain it for them?  (Yes, some aspects
> > > of state, like firmware updates, might be _too_ difficult for the
> > > driver to preserve.  Such things should be in the minority, however.)
> > 
> > The driver is supposed to maintain driver state; but
> > the device is supposed to maintain device state.
>
> Even during snapshot-poweroff-resume?  How is the device supposed to
> manage that?

That's something I'd expect its driver to know.  One answer is:

>	Unless by "device state" you mean non-volatile things, like
> the contents of flash memory.  Such things present no problem for
> snapshot-resume.

But there's also volatile state; consider bus powered static RAM.
Or that tamper detection circuit, reset when you opened the case.  :)


> > Resuming implies _both_ of them have been preserved.
> ...


> The tricky part is making sure that the same device is still attached and 
> it still contains the same medium.  Two different problems, which need to 
> be handled at two different levels.  Checking for the same device should 
> be done at the bus level, and checking for the same medium should be done 
> at the disk-driver level.

I summarized the USB bus rules -- (a) and (b) -- above.

At least for wired USB.  Wireless USB would often need to re-authenticate,
implying access to the linux/security/keys store, and thus userspace.
(Wired devices could implement those authentication protocols too...)

Right now the "same medium" checks are done in userspace.  And I recall
folk arguing about that, and deciding that userspace was the only
workable place for them ...


> 	  In each case it's not possible to be 100% 
> certain, but it is possible to detect the most likely sorts of changes.  
> I.e., unless the user deliberately tries to fool the system, things should 
> work correctly.

Oddly enough, for wired USB (a) *IS* 100% certain (given no hardware bugs).
There's wriggle room in (b) ... we don't check it right now, but it'd make
sense for that to be manageable from userspace.

And I couldn't support the assumption of "non-hostile environment" for
most any general purpose system ... they'll surely get used in places
where that's an unsafe assumption!

- Dave




* Re: Suspended devices and drivers
  2005-09-11  1:55       ` Leo L. Schwab
@ 2005-09-11 19:03         ` Alan Stern
  0 siblings, 0 replies; 20+ messages in thread
From: Alan Stern @ 2005-09-11 19:03 UTC (permalink / raw)
  To: Leo L. Schwab; +Cc: linux-pm

[-- Attachment #1: Type: TEXT/PLAIN, Size: 6212 bytes --]

On Sat, 10 Sep 2005, Leo L. Schwab wrote:

> 	Apologies for the delay in response:
> 
> On Mon, Sep 05, 2005 at 01:09:04PM -0400, Alan Stern wrote:
> > The floppy disk situation is about what you'd expect.  Even without 
> > putting the computer to sleep, you can remove a floppy disk while a 
> > program has an open file on it.  Windows just puts up a screen asking you 
> > to replace the floppy, and things continue as if nothing happened.  This 
> > is obviously an historical relic going back to the days of DOS.
> >
> 	Well, not exactly.

Why do you say "not exactly"?  All your comments below are consistent with 
what I wrote.

>  This is what you have to do when dealing with
> media that can be yanked out from under you without warning or veto.  You
> have to run the filesystem with a write-through cache or no cache at all.
> Thus, when you attempt to read, you simply notice there's no media present.
> Throw a dialog, retry.  Is it the same volume ID?  Nope, throw a dialog,
> retry.

Yes, but it's not really a dialog.  That is, Windows does not display a
dialog window or a message box.  Instead it goes to a white-on-blue screen
in text mode (rather resembling the BSOD).  It's sort of a souped-up
version of the old "Abort, retry, or ignore?" message.

> 	If you're uber-studly, all this happens completely inside the
> filesystem, and the app never notices anything odd happened (except perhaps
> by noticing the read() took a long time to complete).
> 
> > With the USB device, things are more interesting.  If you unplug the 
> > device (even while it's not in use), Windows warns you not to do this 
> > without first getting permission by using the "Eject Removable Devices" 
> > button.
> 
> 	What happens if you attempt to continue after this warning without
> replacing the device?

The program continued reading until it exhausted some buffer, then it
stopped exactly as though it had reached end-of-file.  Maybe it got an
error return from the OS and didn't tell me about it, or maybe the file
descriptor was summarily closed.

>  What happens if you put the USB device in a
> *different machine* after doing this?  Is the filesystem consistent?

I don't know, and I don't really care.  With this particular experiment it 
certainly would be consistent, because I wasn't doing any writing, only 
reading.

> > If you try to press that button while a program has a file open 
> > on the device, Windows says that you can't remove the device right now and 
> > advises you to try again later.
> >
> 	Leave it to Micros~1 to fsck up something even this basic.  They're
> trying to use an ungainly UI to fake a media eject request event.  If they
> mounted the USB volume in write-through mode like they were supposed to,
> then you should be able to yank the media any time the activity light is
> off.  Further read/write attempts would notice the media was missing and
> simply generate a "please put that back" dialog.
> 
> 	The fact is, if you have media that can be yanked by the user
> without the OS being able to stop it, then your default option should be to
> mount the filesystem as write-through or uncached, or you risk filesystem
> corruption.  (If the *user* requests full caching with deferred writes, then
> the user takes responsibility for making sure the media doesn't go anywhere
> until it's explicitly unmounted.)  You should only enable deferred writes by
> default if you have high confidence that either A) the medium isn't going
> anywhere; or B) you have full eject control, and can properly flush and
> unmount the volume when the user presses Eject.

Things aren't that simple.  First, bear in mind that most removable USB
devices used with Windows have VFAT filesystems, which require constant 
updating of the FATs while writing new files.  Next, bear in mind that 
flash memory devices tend to behave very badly when the same sectors are 
written over and over again.  Performance suffers and eventually those 
sectors stop working entirely.

The upshot is that you're much better off caching the metadata updates, at
least for short periods or until a file is closed.  There are messages in
the email archives describing the problems people started to encounter
when the VFAT driver in Linux finally implemented "-o sync" correctly.  
For a good summary look at bug #4882 in the OSDL Bugzilla.
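
A back-of-the-envelope model of the effect, in Python.  The numbers are
assumptions chosen for illustration (FAT16-style table with 256 two-byte
entries per 512-byte sector, two FAT copies, contiguous allocation), and
the function name is made up; it does not describe what any particular
VFAT driver actually does.

```python
def fat_sector_writes(n_clusters, sync, entries_per_sector=256,
                      fat_copies=2, first_cluster=2):
    """Estimate FAT sector writes for one newly written contiguous file."""
    if sync:
        # Synchronous metadata: every cluster allocation rewrites the FAT
        # sector holding that entry, in each copy of the FAT.
        return n_clusters * fat_copies
    # Cached until close: each FAT sector touched is written once per copy.
    first = first_cluster // entries_per_sector
    last = (first_cluster + n_clusters - 1) // entries_per_sector
    return (last - first + 1) * fat_copies


print(fat_sector_writes(1000, sync=True))   # 2000
print(fat_sector_writes(1000, sync=False))  # 8
```

Under these assumptions a 1000-cluster file costs 2000 FAT sector writes
with synchronous updates and 8 when the updates are held until close, all
of them landing on the same few sectors.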

> > However...  I put the laptop into Hibernate mode.  To be absolutely sure 
> > this was a true snapshot-poweroff-resume cycle, I also unplugged the power 
> > cord and removed the battery.  Then I removed and replaced the USB device 
> > and restarted the laptop.  Everything worked smoothly; the file remained 
> > open and the program was able to continue reading it, well past the point 
> > where the I/O buffers needed to be refilled.
> >
> 	A far more interesting experiment would be to hibernate the machine,
> remove the USB device, resume the machine, then replace the USB device and
> see how Windows reacts.

Just tried it, several times.  On some of the occasions the hibernate
failed, even before I removed the device (the computer never shut down
fully -- don't ask me why not).  On the occasions when it did go to sleep
properly, resuming without the device plugged in succeeded provided I
wasn't running a program with an open file on the device.  There wasn't
even a message box warning about my failure to use the "Eject removable
media" button.

If a program _was_ running with an open file on the device, then after 
unplugging the device the resume from hibernate failed.  It hung up at 
some stage after restoring the memory image, and I gave up on it after two 
minutes.  Even plugging the device back in didn't help.  However, cycling 
the power, plugging the device back in, and trying to resume over again 
did work.

>  (Still more interesting tests: Attempt to continue
> without replacing the device, replace the USB device in a different socket.)

Not feasible on my system.  Still, this is only Windows ME.  I would
expect XP to be considerably more robust.  I haven't had a chance yet to
try doing these sorts of things on an XP system.  _You_ could do some
experiments, if you have access to such a machine.

Alan Stern



* Re: Suspended devices and drivers
  2005-09-06 15:46             ` Alan Stern
@ 2005-09-11 19:06               ` David Brownell
  2005-09-11 19:16                 ` Alan Stern
  0 siblings, 1 reply; 20+ messages in thread
From: David Brownell @ 2005-09-11 19:06 UTC (permalink / raw)
  To: stern; +Cc: linux-pm

[-- Attachment #1: Type: text/plain, Size: 870 bytes --]

> > > > > With the USB device, things are more interesting.  If you unplug the
> > > > > device (even while it's not in use), Windows warns you not to do this
> > > > > ...
> > > > 
> > > > So it's inconsistent in behavior, since this isn't how it handles
> > > > the same thing during resume-from-hibernate ...
> > > ...
>
> It sounds like we've got a failure to communicate.  There were two 
> experiments.  In the first, I simply unplugged the device.  Windows did 
> detect this and did warn me.  In the second experiment, I first did the 
> snapshot-and-poweroff, then removed the device, then replaced the device, 
> then did resume-from-snapshot.  In that experiment Windows could _not_ 
> tell and did not warn me.

I'm saying that the *HARDWARE* absolutely did tell ... but Windows
discarded that information, creating the inconsistency I highlighted.

- Dave



* Re: Suspended devices and drivers
  2005-09-11 19:06               ` David Brownell
@ 2005-09-11 19:16                 ` Alan Stern
  2005-09-11 21:42                   ` David Brownell
  2005-09-12  2:33                   ` Dmitry Torokhov
  0 siblings, 2 replies; 20+ messages in thread
From: Alan Stern @ 2005-09-11 19:16 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-pm

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2027 bytes --]

On Sun, 11 Sep 2005, David Brownell wrote:

> > > > > > With the USB device, things are more interesting.  If you unplug the
> > > > > > device (even while it's not in use), Windows warns you not to do this
> > > > > > ...
> > > > > 
> > > > > So it's inconsistent in behavior, since this isn't how it handles
> > > > > the same thing during resume-from-hibernate ...
> > > > ...
> >
> > It sounds like we've got a failure to communicate.  There were two 
> > experiments.  In the first, I simply unplugged the device.  Windows did 
> > detect this and did warn me.  In the second experiment, I first did the 
> > snapshot-and-poweroff, then removed the device, then replaced the device, 
> > then did resume-from-snapshot.  In that experiment Windows could _not_ 
> > tell and did not warn me.
> 
> I'm saying that the *HARDWARE* absolutely did tell ... but Windows
> discarded that information, creating the inconsistency I highlighted.

What do you mean?  _How_ was the hardware able to tell (considering that
the computer was powered off at the time) that I had unplugged and
replugged the device?

Or do you mean to say that any poweroff event should be considered an 
unplug, always?  I'm sure there are people at Microsoft who would argue 
about that.

If we're going to be careful about use of words, then "unplug" and
"replug" should refer to physical actions taken on cables and connectors.  
What you seem to have in mind could better be described as "breaking and
restoring connectivity", or even "interrupting and restoring Vbus power".

So yes, Windows did behave inconsistently with respect to interruptions of 
Vbus.  But it did not behave inconsistently (to within the limits of its 
knowledge) with respect to plugging and unplugging.

(Although my latest experiment _does_ show inconsistent behavior in this 
regard when I unplugged the device while it was not in use and Windows was 
asleep.  Maybe this was regarded as such a likely occurrence that there 
was no point in warning against it.)

Alan Stern



* Re: Suspended devices and drivers
  2005-09-11 19:03         ` David Brownell
@ 2005-09-11 20:06           ` Alan Stern
  2005-09-11 22:27             ` David Brownell
  0 siblings, 1 reply; 20+ messages in thread
From: Alan Stern @ 2005-09-11 20:06 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-pm

[-- Attachment #1: Type: TEXT/PLAIN, Size: 6281 bytes --]

I suspect this is a losing argument, but I'll carry on a bit more just to 
be perverse...


On Sun, 11 Sep 2005, David Brownell wrote:

> > > If both of those are false, there's no way for usbcore to guarantee that the
> > > same device is connected as was there when you suspended.
> >
> > No way to _guarantee_ it, agreed.  But we can try to do an acceptable job
> > of making sure.
> 
> Most usable definitions of "acceptable" will end up needing to be very
> driver-specific ... or vendor-specific, or system-specific, etc.

I see nothing wrong in that.  And we can have minimal standards that are 
more general, say filesystem-specific.

> Some usb mass storage devices have non-unique serial numbers, for example,
> so there's no way that even usb-storage could always give the right answer.

That's why I said "acceptable job" instead of "perfect job".

> Maybe something interesting could be done with a certificate that "vendor
> XYZ guarantees unique serial numbers" or "vendor ZZZ supports the 'green'
> device authentication protocol".

Maybe.  But I would like to see this available even in situations where 
such certificates do not exist.


> > > If either (a) or eventually (b) above are true, that should "just work".
> >
> > That's the easy part.  It _should_ "just work" even if (a) and (b) aren't 
> > true -- that's the hard part.
> 
> I disagree.  "Should" implies "could work robustly", for which I just gave
> another counter-example.  (Don't forget the filesystem corruption one too.)

One always has to balance robustness against usability.  I think you're
pushing it too far in one direction.  There could, for example, be a
configuration option (or kernel attribute) for "forced removal of
hot-unpluggable devices on poweroff-suspend".  Then people could go either
way, according to their own preferences.

> To me this is a correctness issue.  One that certainly has some user
> interface implications;  UNIX (and hence Linux) is still on a long trek
> to get rid of assumptions that once devices exist, they continue to exist.
> Such bad assumptions have shown up in many user interfaces, and evidently
> even in versions of the SCSI stack.  :(
> 
> But it's just the flip side of hotplugging (hot UNplugging) ... and in the
> same way that userspace needed to learn that devices may appear (and need
> to be set up, used, managed, etc), so must it also learn they may disappear
> at various times.  Including "while suspended".

Nobody is denying that.  But there's a big difference between learning 
that devices _may_ disappear and _forcing_ them to disappear needlessly.


> > It's not just USB, obviously.  What happens (and what _should_ happen) if 
> > you swap parallel-SCSI drives while the power is off?
> 
> It's common for specs (like for your stereo, and ISTR ACPI) to emphasize
> that once you open the case (e.g. to swap SCSI or IDE drives),

Or change external connectors or cables...

>   resume
> behavior is no longer specified (e.g. to behave sanely) ...
> 
> What "should" happen in such cases is as much a function of your own
> knowledge and skills, and the whole-system spec, as of kernel behavior. :)

Isn't that consistent with what I've been saying all along?  If you don't 
change the connections during snapshot-poweroff-resume then the devices 
remain intact; if you do then anything might happen (but usually the 
devices will simply disappear).


> > > The driver is supposed to maintain driver state; but
> > > the device is supposed to maintain device state.
> >
> > Even during snapshot-poweroff-resume?  How is the device supposed to
> > manage that?
> 
> That's something I'd expect its driver to know.  One answer is:
> 
> >	Unless by "device state" you mean non-volatile things, like
> > the contents of flash memory.  Such things present no problem for
> > snapshot-resume.
> 
> But there's also volatile state; consider bus powered static RAM.
> Or that tamper detection circuit, reset when you opened the case.  :)
> 
> 
> > > Resuming implies _both_ of them have been preserved.
> > ...

In cases where state cannot be preserved, then obviously the device can't 
hope to survive snapshot-poweroff-resume.  But you're distorting matters 
by implying that the driver can not be responsible for maintaining device 
state.  Of course it can; and if the driver and device _together_ maintain 
sufficient state then the device should survive.


> > The tricky part is making sure that the same device is still attached and 
> > it still contains the same medium.  Two different problems, which need to 
> > be handled at two different levels.  Checking for the same device should 
> > be done at the bus level, and checking for the same medium should be done 
> > at the disk-driver level.
> 
> I summarized the USB bus rules -- (a) and (b) -- above.

Those are the rules for _guaranteeing_ that the USB device hasn't changed.  
What about rules for saying that as far as you can tell the device hasn't 
changed?
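
A rough sketch of what such an "as far as you can tell" rule might look
like, in C.  Everything here is hypothetical and for illustration only:
the struct, its fields, and compare_identity() are not kernel APIs.  It
assumes the bus code saved a few identifying fields at suspend time and
reads them again at resume.  A match proves nothing; a mismatch is
conclusive.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record of what the bus saw before suspend. */
struct dev_identity {
	uint16_t vendor_id;
	uint16_t product_id;
	uint16_t device_release;
	char serial[64];	/* empty string if the device has none */
};

enum identity_verdict {
	IDENTITY_CHANGED,	/* definitely a different device */
	IDENTITY_PROBABLY_SAME	/* no difference detected; not a guarantee */
};

/* Compare the saved identity with the one read back after resume. */
static enum identity_verdict
compare_identity(const struct dev_identity *saved,
		 const struct dev_identity *now)
{
	assert(saved != NULL && now != NULL);

	if (saved->vendor_id != now->vendor_id ||
	    saved->product_id != now->product_id ||
	    saved->device_release != now->device_release)
		return IDENTITY_CHANGED;

	/* Two units of the same model differ only in the serial string. */
	if (strncmp(saved->serial, now->serial, sizeof(saved->serial)) != 0)
		return IDENTITY_CHANGED;

	return IDENTITY_PROBABLY_SAME;
}
```

Devices without a serial number would compare equal to any other unit of
the same model, which is exactly the kind of uncertainty at issue here.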

> Right now the "same medium" checks are done in userspace.  And I recall
> folk arguing about that, and deciding that userspace was the only
> workable place for them ...

That may be so; I don't know.  The thought occurs that the filesystem 
ought to get involved, at least to the extent of checking a hash of the 
superblock...
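
To make the idea concrete, here is a minimal sketch in C.  It assumes the
raw superblock bytes are already in a buffer, and it uses FNV-1a purely
because it is short; the choice of hash and both function names are
assumptions of this example, not anything a filesystem currently does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a over a byte buffer. */
static uint64_t fnv1a64(const unsigned char *buf, size_t len)
{
	uint64_t h = 0xcbf29ce484222325ULL;	/* FNV offset basis */
	size_t i;

	assert(buf != NULL || len == 0);

	for (i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 0x100000001b3ULL;		/* FNV prime */
	}
	return h;
}

/*
 * Hypothetical resume-time check: saved_hash was computed from the
 * superblock before suspend, sb holds the bytes read back afterwards.
 */
static bool medium_probably_unchanged(uint64_t saved_hash,
				      const unsigned char *sb, size_t len)
{
	return fnv1a64(sb, len) == saved_hash;
}
```

A mismatch means the medium was certainly modified or swapped.  A match
only says the superblock looks the same; it says nothing about the rest
of the disk.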

> > 	  In each case it's not possible to be 100% 
> > certain, but it is possible to detect the most likely sorts of changes.  
> > I.e., unless the user deliberately tries to fool the system, things should 
> > work correctly.
> 
> Oddly enough, for wired USB (a) *IS* 100% certain (given no hardware bugs).

Yep.  But for other systems, certainty is impossible.

> There's wriggle room in (b) ... we don't check it right now, but it'd make
> sense for that to be manageable from userspace.
> 
> And I couldn't support the assumption of "non-hostile environment" for
> most any general purpose system ... they'll surely get used in places
> where that's an unsafe assumption!

I hardly think that matters.  Anyone who has sufficient physical access to
the computer to disconnect and reconnect peripheral devices is already in
a position to do a lot of damage.  You might just as well go around trying
to protect against hostile actions by root!  :-)

Alan Stern



* Re: Suspended devices and drivers
  2005-09-11 19:16                 ` Alan Stern
@ 2005-09-11 21:42                   ` David Brownell
  2005-09-12  2:33                   ` Dmitry Torokhov
  1 sibling, 0 replies; 20+ messages in thread
From: David Brownell @ 2005-09-11 21:42 UTC (permalink / raw)
  To: stern; +Cc: linux-pm

[-- Attachment #1: Type: text/plain, Size: 726 bytes --]

> > I'm saying that the *HARDWARE* absolutely did tell ... but Windows
> > discarded that information, creating the inconsistency I highlighted.
>
> What do you mean?  _How_ was the hardware able to tell (considering that
> the computer was powered off at the time) that I had unplugged and
> replugged the device?

The power session was lost, that root hub port was reset.


> If we're going to be careful about use of words, then "unplug" and
> "replug" should refer to physical actions taken on cables and connectors.  
> What you seem to have in mind could better be described as "breaking and
> restoring connectivity", or even "interrupting and restoring Vbus power".

The latter, since that's what's testable.

- Dave



* Re: Suspended devices and drivers
  2005-09-11 20:06           ` Alan Stern
@ 2005-09-11 22:27             ` David Brownell
  0 siblings, 0 replies; 20+ messages in thread
From: David Brownell @ 2005-09-11 22:27 UTC (permalink / raw)
  To: stern; +Cc: linux-pm

[-- Attachment #1: Type: text/plain, Size: 3295 bytes --]

> > Most usable definitions of "acceptable" will end up needing to be very
> > driver-specific ... or vendor-specific, or system-specific, etc.
>
> I see nothing wrong in that.  And we can have minimal standards that are 
> more general, say filesystem-specific.

The tricky part would then be how to connect all those customized
parts together.  It's hard enough even for non-custom ones!  :)


> > > > The driver is supposed to maintain driver state; but
> > > > the device is supposed to maintain device state.
> > >
> > > ...
> > > > Resuming implies _both_ of them have been preserved.
> > > ...
>
> In cases where state cannot be preserved, then obviously the device can't 
> hope to survive snapshot-poweroff-resume.  But you're distorting matters 
> by implying that the driver can not be responsible for maintaining device 
> state.

I'm not implying that; I'm _saying_ that's the general case, and
gave examples where clearly drivers can not maintain device state.

There's no distortion involved in recognizing that loss of power will
intentionally trigger device state changes (start with internal reset,
and go on from there), or that it'd take work -- which most vendors
won't invest! -- to ensure that those changes can't matter to drivers.


>	Of course it can; and if the driver and device _together_ maintain 
> sufficient state then the device should survive.

So the question becomes how to accommodate such special cases.
Agreed:  some devices and drivers could do that.


> > I summarized the USB bus rules -- (a) and (b) -- above.
>
> Those are the rules for _guaranteeing_ that the USB device hasn't changed.  
> What about rules for saying that as far as you can tell the device hasn't 
> changed?

That "as far as I can tell" can cover arbitrary amounts of work!
Userspace is the best place to provide such subjective judgments...


> > Right now the "same medium" checks are done in userspace.  And I recall
> > folk arguing about that, and deciding that userspace was the only
> > workable place for them ...
>
> That may be so; I don't know.

Have a look at how UDEV systems determine disk identities and thus choose
how to name and mount disks.


>	The thought occurs that the filesystem 
> ought to get involved, at least to the extent of checking a hash of the 
> superblock...

Which superblock(s)?  And what about the non-filesystem data?  :)


> > And I couldn't support the assumption of "non-hostile environment" for
> > most any general purpose system ... they'll surely get used in places
> > where that's an unsafe assumption!
>
> I hardly think that matters.  Anyone who has sufficient physical access to
> the computer to disconnect and reconnect peripheral devices is already in
> a position to do a lot of damage.  You might just as well go around trying
> to protect against hostile actions by root!  :-)

Whipping out the Leatherman and having at a box may mean ten minutes' work,
and would be quite visible.  (That's assuming the system is already down.)

Swapping a USB device is maybe ten seconds' work ... if someone turned
their back to you for that long, they might never notice what you did.
USB based attacks could be _really_ easy to apply.  (Doesn't Windows
have "autorun" capability?  That could just start a big spam job...)

- Dave



* Re: Suspended devices and drivers
  2005-09-11 19:16                 ` Alan Stern
  2005-09-11 21:42                   ` David Brownell
@ 2005-09-12  2:33                   ` Dmitry Torokhov
  1 sibling, 0 replies; 20+ messages in thread
From: Dmitry Torokhov @ 2005-09-12  2:33 UTC (permalink / raw)
  To: linux-pm; +Cc: David Brownell

[-- Attachment #1: Type: text/plain, Size: 543 bytes --]

On Sunday 11 September 2005 14:16, Alan Stern wrote:
> 
> So yes, Windows did behave inconsistently with respect to interruptions of 
> Vbus.  But it did not behave inconsistently (to within the limits of its 
> knowledge) with respect to plugging and unplugging.
> 

FWIW, it seems this is not considered an inconsistency in the Windows
world.  Quite a few docking stations only support J3, so the normal
course of action is to suspend, unplug, and then, at resume, re-enumerate
all devices to pick up the new ones and remove the ones that were
disconnected.

-- 
Dmitry


end of thread, other threads:[~2005-09-12  2:33 UTC | newest]

Thread overview: 20+ messages
2005-09-04 16:13 Suspended devices and drivers Alan Stern
2005-09-04 20:46 ` David Brownell
2005-09-05  1:25   ` Alan Stern
2005-09-05 17:09     ` Alan Stern
2005-09-05 23:27       ` David Brownell
2005-09-06 14:50         ` Alan Stern
2005-09-06 15:18           ` David Brownell
2005-09-06 15:46             ` Alan Stern
2005-09-11 19:06               ` David Brownell
2005-09-11 19:16                 ` Alan Stern
2005-09-11 21:42                   ` David Brownell
2005-09-12  2:33                   ` Dmitry Torokhov
2005-09-06 19:47           ` Rafael J. Wysocki
2005-09-11  1:55       ` Leo L. Schwab
2005-09-11 19:03         ` Alan Stern
2005-09-06  0:13     ` David Brownell
2005-09-06 15:19       ` Alan Stern
2005-09-11 19:03         ` David Brownell
2005-09-11 20:06           ` Alan Stern
2005-09-11 22:27             ` David Brownell
