public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed
* Re: uhci-hcd suspend/resume under the new driver model
       [not found] ` <20050313101453.GA4820-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
@ 2005-03-13 18:12   ` Pavel Machek
       [not found]     ` <20050313181225.GC1579-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Pavel Machek @ 2005-03-13 18:12 UTC (permalink / raw)
  To: Bernard Blackham
  Cc: ncunningham-3EexvZdKGZRWk0Htik3J/w, Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 1537 bytes --]

Hi!

> One of the problems Suspend2 is running into with the driver model
> changes, is uhci-hcd doesn't suspend and resume correctly. The
> problem is that in the changes to usb_hcd_pci_suspend in hcd-pci.c,
> we now use pci_choose_state to obtain the PCI power state to put the
> machine into. It affects uhci-hcd because it does not have the
> PCI_CAP_ID_PM capability bit set.
> 
> Before the new driver model, state would be 3 (passed directly to
> the function), and the function would run with state D3hot, and
> has_pci_pm false.  With these changes however, pci_choose_state sees
> the lack of PCI_CAP_ID_PM and returns state D0 instead of D3hot.
> Hence the function runs as if suspending into D0, and doesn't do the
> suspend as it should.
> 
> My hack (attached), fixes the suspend function to proceed with the
> suspend even if state is D0, if it sees that has_pci_pm is false. It
> resolves the suspend/resume issues, but it's probably not the
> cleanest solution, as it relies on the fact that the suspend
> function is only there to put the device into D3hot (and not any
> other power state).
> 
> Other drivers that lack PCI_CAP_ID_PM probably have the same
> problem, and I thought you might have a cleaner solution. Any
> suggestions? :)

I guess has_pci_pm check should be killed from pci_choose_state. It is
probably going to create some problems elsewhere....

								Pavel


-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

[-- Attachment #2: Type: text/plain, Size: 171 bytes --]

_______________________________________________
linux-pm mailing list
linux-pm-qjLDD68F18O7TbgM5vRIOg@public.gmane.org
http://lists.osdl.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]     ` <20050313181225.GC1579-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-13 22:57       ` Benjamin Herrenschmidt
  2005-03-13 23:20         ` Pavel Machek
  0 siblings, 1 reply; 31+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-13 22:57 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Bernard Blackham, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 153 bytes --]


> 
> I guess has_pci_pm check should be killed from pci_choose_state. It is
> probably going to create some problems elsewhere....

Definitely.

Ben.



[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
  2005-03-13 22:57       ` Benjamin Herrenschmidt
@ 2005-03-13 23:20         ` Pavel Machek
       [not found]           ` <20050313232002.GC22635-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Pavel Machek @ 2005-03-13 23:20 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Bernard Blackham, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 572 bytes --]

Hi!

> > I guess has_pci_pm check should be killed from pci_choose_state. It is
> > probably going to create some problems elsewhere....
> 
> Definitely.

I fixed it in my tree, but I can't easily generate diffs just now (and
do not get as much testing as suspend2 :-). Bernard, could you remove
first two lines of pci_choose_state, verify it fixes uhci and submit
it as a patch? Oh and thanks for great detective work.

								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]           ` <20050313232002.GC22635-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-14  2:23             ` Bernard Blackham
       [not found]               ` <20050314022308.GD6008-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Bernard Blackham @ 2005-03-14  2:23 UTC (permalink / raw)
  To: Pavel Machek; +Cc: ncunningham-3EexvZdKGZRWk0Htik3J/w, Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 1260 bytes --]

On Mon, Mar 14, 2005 at 12:20:02AM +0100, Pavel Machek wrote:
> > > I guess has_pci_pm check should be killed from pci_choose_state. It is
> > > probably going to create some problems elsewhere....
> > 
> > Definitely.
> 
> I fixed it in my tree, but I can't easily generate diffs just now (and
> do not get as much testing as suspend2 :-). Bernard, could you remove
> first two lines of pci_choose_state, verify it fixes uhci and submit
> it as a patch?

Not happy yet. Passing D0 to uhci suspend's method is the cause of
the issue (it doesn't handle it particularly well), and D0 is also
passed when PMSG_FREEZE is sent. So while it works for entering S3,
it fails in the same way when preparing for the atomic copy in S4.

Hacking around the issue by always using D3hot should return it back
to the pre-drivermodel behaviour and mostly works. However at least
one person has reported that even doing this, uhci_hcd alone
suspends fine, but if both uhci_hcd and ehci_hcd are loaded, USB is
dead on resume until reloading both modules. I'm still digging
deeper and may need to enlist the help of linux-usb-devel...

I'll let you know when I have something cleaner that works for
everyone.

Bernard.

-- 
 Bernard Blackham <bernard at blackham dot com dot au>

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]               ` <20050314022308.GD6008-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
@ 2005-03-14  4:03                 ` David Brownell
       [not found]                   ` <200503132003.27555.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
  2005-03-14 22:31                 ` Alan Stern
  2005-03-15 21:48                 ` Alan Stern
  2 siblings, 1 reply; 31+ messages in thread
From: David Brownell @ 2005-03-14  4:03 UTC (permalink / raw)
  To: linux-pm-qjLDD68F18O7TbgM5vRIOg
  Cc: Bernard Blackham, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	Pavel Machek

[-- Attachment #1: Type: text/plain, Size: 2456 bytes --]

On Sunday 13 March 2005 6:23 pm, Bernard Blackham wrote:
> On Mon, Mar 14, 2005 at 12:20:02AM +0100, Pavel Machek wrote:
> > > > I guess has_pci_pm check should be killed from pci_choose_state. It is
> > > > probably going to create some problems elsewhere....

What tree are you talking about?  The code I'm seeing doesn't use that
anywhere near USB, so by definition it can't matter.  That's the
latest linux-2.5 tree... where FWIW I observe that pci_choose_state()
is unusably stupid, it will gladly return power states that the
hardware doesn't support.  That is **NOT** what the original patch
with that routine did.


You need to change the USB PM stuff with extreme caution, else you'll
break it's ability to be a wakeup source, or handle selective suspend.
As well as its ability to work with the various different PM configs
that are possible with even minimal selection of BIOS, kernel, hardware,
and module config options.

(If you spend less than a couple weeks full time testing out those
different combinations, you'll surely miss some essential ones...)


> > > Definitely.
> > 
> > I fixed it in my tree, but I can't easily generate diffs just now (and
> > do not get as much testing as suspend2 :-). Bernard, could you remove
> > first two lines of pci_choose_state, verify it fixes uhci and submit
> > it as a patch?
> 
> Not happy yet. Passing D0 to uhci suspend's method is the cause of
> the issue (it doesn't handle it particularly well), and D0 is also
> passed when PMSG_FREEZE is sent. So while it works for entering S3,
> it fails in the same way when preparing for the atomic copy in S4.

Last I heard, UHCI still wasn't expected to behave with PM active.
OHCI and EHCI were, although there are likely some platforms that
have issues.  And various changes to usbcore may have borked some
things; it's clearly time to spend another couple weeks testing.

- Dave



> Hacking around the issue by always using D3hot should return it back
> to the pre-drivermodel behaviour and mostly works. However at least
> one person has reported that even doing this, uhci_hcd alone
> suspends fine, but if both uhci_hcd and ehci_hcd are loaded, USB is
> dead on resume until reloading both modules. I'm still digging
> deeper and may need to enlist the help of linux-usb-devel...
> 
> I'll let you know when I have something cleaner that works for
> everyone.
> 
> Bernard.
> 
> -- 
>  Bernard Blackham <bernard at blackham dot com dot au>
> 

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                   ` <200503132003.27555.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-14  8:08                     ` Pavel Machek
       [not found]                       ` <20050314080827.GG22635-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Pavel Machek @ 2005-03-14  8:08 UTC (permalink / raw)
  To: David Brownell
  Cc: Bernard Blackham, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	linux-pm-qjLDD68F18O7TbgM5vRIOg

[-- Attachment #1: Type: text/plain, Size: 1319 bytes --]

On Ne 13-03-05 20:03:27, David Brownell wrote:
> On Sunday 13 March 2005 6:23 pm, Bernard Blackham wrote:
> > On Mon, Mar 14, 2005 at 12:20:02AM +0100, Pavel Machek wrote:
> > > > > I guess has_pci_pm check should be killed from pci_choose_state. It is
> > > > > probably going to create some problems elsewhere....
> 
> What tree are you talking about?  The code I'm seeing doesn't use that
> anywhere near USB, so by definition it can't matter.  That's the
> latest linux-2.5 tree... where FWIW I observe that pci_choose_state()
> is unusably stupid, it will gladly return power states that the
> hardware doesn't support.  That is **NOT** what the original patch
> with that routine did.
> 
> 
> You need to change the USB PM stuff with extreme caution, else you'll
> break it's ability to be a wakeup source, or handle selective suspend.
> As well as its ability to work with the various different PM configs
> that are possible with even minimal selection of BIOS, kernel, hardware,
> and module config options.

UHCI worked okay before. I'll really need to make pci_choose_state
"NOP" in the current code, so that it can be safely added to the
existing drivers.
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                       ` <20050314080827.GG22635-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-14 18:17                         ` David Brownell
       [not found]                           ` <200503141017.17305.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: David Brownell @ 2005-03-14 18:17 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Bernard Blackham, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	linux-pm-qjLDD68F18O7TbgM5vRIOg

[-- Attachment #1: Type: text/plain, Size: 912 bytes --]

On Monday 14 March 2005 12:08 am, Pavel Machek wrote:
> 	 I'll really need to make pci_choose_state
> "NOP" in the current code, so that it can be safely added to the
> existing drivers.

There's no point in adding NOPs.  Why bother?

Remember, the original idea behind pci_choose_state() was that
PCI drivers for PM-aware devices could use that instead of their
own dumb mappings between system suspend states (S0, S1, S2, S3,
S4, etc) and PCI states ... which in too many cases was just to
re-use the same numeric value.  That's _never_ going to be a NOP,
unless Linux only ever supports PCI D3hot (ACPI D2) and S3; and
likewise will never be needed for PM-stupid devices, or for those
drivers with actual intelligence (e.g. Ben's video examples).


My original question is unanswered though:  what tree is it that
made these (broken) changes to USB?  It's not BK-current, and it's
not the USB tree...

- Dave

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                           ` <200503141017.17305.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-14 18:44                             ` Pavel Machek
       [not found]                               ` <20050314184454.GL5461-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
  2005-03-14 22:21                             ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 31+ messages in thread
From: Pavel Machek @ 2005-03-14 18:44 UTC (permalink / raw)
  To: David Brownell
  Cc: Bernard Blackham, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	linux-pm-qjLDD68F18O7TbgM5vRIOg

[-- Attachment #1: Type: text/plain, Size: 1303 bytes --]

Hi!

> > 	 I'll really need to make pci_choose_state
> > "NOP" in the current code, so that it can be safely added to the
> > existing drivers.
> 
> There's no point in adding NOPs.  Why bother?
> 
> Remember, the original idea behind pci_choose_state() was that
> PCI drivers for PM-aware devices could use that instead of their
> own dumb mappings between system suspend states (S0, S1, S2, S3,
> S4, etc) and PCI states ... which in too many cases was just to
> re-use the same numeric value.  That's _never_ going to be a NOP,
> unless Linux only ever supports PCI D3hot (ACPI D2) and S3; and
> likewise will never be needed for PM-stupid devices, or for those
> drivers with actual intelligence (e.g. Ben's video examples).

When I added pci_choose_state to drivers, I commented it as "no code
changes". I was wrong. pci_choose_state() needs to return d3hot even
for pmsg_freeze, because that's what old code did, and I did not audit
all the drivers.

> My original question is unanswered though:  what tree is it that
> made these (broken) changes to USB?  It's not BK-current, and it's
> not the USB tree...

suspend2 tree and my personal tree.
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                               ` <20050314184454.GL5461-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-14 18:59                                 ` David Brownell
       [not found]                                   ` <200503141059.26107.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
  2005-03-14 22:22                                 ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 31+ messages in thread
From: David Brownell @ 2005-03-14 18:59 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Bernard Blackham, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	linux-pm-qjLDD68F18O7TbgM5vRIOg

[-- Attachment #1: Type: text/plain, Size: 1953 bytes --]

On Monday 14 March 2005 10:44 am, Pavel Machek wrote:
> Hi!
> 
> > > 	 I'll really need to make pci_choose_state
> > > "NOP" in the current code, so that it can be safely added to the
> > > existing drivers.
> > 
> > There's no point in adding NOPs.  Why bother?
> > 
> > Remember, the original idea behind pci_choose_state() was that
> > PCI drivers for PM-aware devices could use that instead of their
> > own dumb mappings between system suspend states (S0, S1, S2, S3,
> > S4, etc) and PCI states ... which in too many cases was just to
> > re-use the same numeric value.  That's _never_ going to be a NOP,
> > unless Linux only ever supports PCI D3hot (ACPI D2) and S3; and
> > likewise will never be needed for PM-stupid devices, or for those
> > drivers with actual intelligence (e.g. Ben's video examples).
> 
> When I added pci_choose_state to drivers, I commented it as "no code
> changes". I was wrong.

That's sort of what I said.  If it's a NOP, why bother?  The
idea behind such an API was to support S1/S2/S3 suspend levels,
where D3hot may be sub-optimal.  That's not NOP-ville.  And it
probably ought to consult ACPI tables on some systems...


> pci_choose_state() needs to return d3hot even 
> for pmsg_freeze, because that's what old code did, and I did not audit
> all the drivers.

Seems like a problem to me.  Do S1/S2/S3 transitions even
need a "freeze" transition?  I thought we'd agreed they don't;
that such transitions were needed only under swsusp models,
ensuring the image written to swap was internally consistent.
(Non-swsusp models only have the memory image, which will
by definition be internally consistent.)


> > My original question is unanswered though:  what tree is it that
> > made these (broken) changes to USB?  It's not BK-current, and it's
> > not the USB tree...
> 
> suspend2 tree and my personal tree.

OK, so nothing for me to worry about unless someone tries pushing
those changes upstream.

- Dave


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                                   ` <200503141059.26107.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-14 19:30                                     ` Pavel Machek
       [not found]                                       ` <20050314193054.GO5461-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Pavel Machek @ 2005-03-14 19:30 UTC (permalink / raw)
  To: David Brownell
  Cc: Bernard Blackham, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	linux-pm-qjLDD68F18O7TbgM5vRIOg

[-- Attachment #1: Type: text/plain, Size: 2063 bytes --]

On Po 14-03-05 10:59:25, David Brownell wrote:
> On Monday 14 March 2005 10:44 am, Pavel Machek wrote:
> > Hi!
> > 
> > > > 	 I'll really need to make pci_choose_state
> > > > "NOP" in the current code, so that it can be safely added to the
> > > > existing drivers.
> > > 
> > > There's no point in adding NOPs.  Why bother?
> > > 
> > > Remember, the original idea behind pci_choose_state() was that
> > > PCI drivers for PM-aware devices could use that instead of their
> > > own dumb mappings between system suspend states (S0, S1, S2, S3,
> > > S4, etc) and PCI states ... which in too many cases was just to
> > > re-use the same numeric value.  That's _never_ going to be a NOP,
> > > unless Linux only ever supports PCI D3hot (ACPI D2) and S3; and
> > > likewise will never be needed for PM-stupid devices, or for those
> > > drivers with actual intelligence (e.g. Ben's video examples).
> > 
> > When I added pci_choose_state to drivers, I commented it as "no code
> > changes". I was wrong.
> 
> That's sort of what I said.  If it's a NOP, why bother?  The
> idea behind such an API was to support S1/S2/S3 suspend levels,
> where D3hot may be sub-optimal.  That's not NOP-ville.  And it
> probably ought to consult ACPI tables on some systems...

Because pm_message_t is not compatible with pci_power_t, you need
something to convert. And that something is pci_choose_state. If more
magic is needed, we'll need new function.

> > pci_choose_state() needs to return d3hot even 
> > for pmsg_freeze, because that's what old code did, and I did not audit
> > all the drivers.
> 
> Seems like a problem to me.  Do S1/S2/S3 transitions even
> need a "freeze" transition?  I thought we'd agreed they don't;

That's not the problem.

Old code put devices into D3hot in swsusp "freeze" case. We'll need to
do the same now, slowly auditing the drivers and removing that
unneccessary D0->D3hot->D0 transition.

								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                           ` <200503141017.17305.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
  2005-03-14 18:44                             ` Pavel Machek
@ 2005-03-14 22:21                             ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 31+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-14 22:21 UTC (permalink / raw)
  To: David Brownell
  Cc: Bernard Blackham, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 1209 bytes --]

On Mon, 2005-03-14 at 10:17 -0800, David Brownell wrote:
> On Monday 14 March 2005 12:08 am, Pavel Machek wrote:
> > 	 I'll really need to make pci_choose_state
> > "NOP" in the current code, so that it can be safely added to the
> > existing drivers.
> 
> There's no point in adding NOPs.  Why bother?
> 
> Remember, the original idea behind pci_choose_state() was that
> PCI drivers for PM-aware devices could use that instead of their
> own dumb mappings between system suspend states (S0, S1, S2, S3,
> S4, etc) and PCI states ... which in too many cases was just to
> re-use the same numeric value.  That's _never_ going to be a NOP,
> unless Linux only ever supports PCI D3hot (ACPI D2) and S3; and
> likewise will never be needed for PM-stupid devices, or for those
> drivers with actual intelligence (e.g. Ben's video examples).

Note that the apple OHCI cell in Apple ASICs, despite beeing a PCI
device, doesn't always expose PCI PM capabilities. We still need to do
the proper suspend stuff on it. So be careful with pci_choose_state().

> My original question is unanswered though:  what tree is it that
> made these (broken) changes to USB?  It's not BK-current, and it's
> not the USB tree...




[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                               ` <20050314184454.GL5461-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
  2005-03-14 18:59                                 ` David Brownell
@ 2005-03-14 22:22                                 ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 31+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-14 22:22 UTC (permalink / raw)
  To: Pavel Machek
  Cc: David Brownell, Bernard Blackham,
	ncunningham-3EexvZdKGZRWk0Htik3J/w, Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 1679 bytes --]

On Mon, 2005-03-14 at 19:44 +0100, Pavel Machek wrote:
> Hi!
> 
> > > 	 I'll really need to make pci_choose_state
> > > "NOP" in the current code, so that it can be safely added to the
> > > existing drivers.
> > 
> > There's no point in adding NOPs.  Why bother?
> > 
> > Remember, the original idea behind pci_choose_state() was that
> > PCI drivers for PM-aware devices could use that instead of their
> > own dumb mappings between system suspend states (S0, S1, S2, S3,
> > S4, etc) and PCI states ... which in too many cases was just to
> > re-use the same numeric value.  That's _never_ going to be a NOP,
> > unless Linux only ever supports PCI D3hot (ACPI D2) and S3; and
> > likewise will never be needed for PM-stupid devices, or for those
> > drivers with actual intelligence (e.g. Ben's video examples).
> 
> When I added pci_choose_state to drivers, I commented it as "no code
> changes". I was wrong. pci_choose_state() needs to return d3hot even
> for pmsg_freeze, because that's what old code did, and I did not audit
> all the drivers.

Yes, we should default to D3 I suppose. Only drivers that directly
understand the new semantics may do fancy/different things for
pmsg_freeze.

> > My original question is unanswered though:  what tree is it that
> > made these (broken) changes to USB?  It's not BK-current, and it's
> > not the USB tree...
> 
> suspend2 tree and my personal tree.
> 								Pavel
> _______________________________________________
> linux-pm mailing list
> linux-pm-qjLDD68F18O7TbgM5vRIOg@public.gmane.org
> http://lists.osdl.org/mailman/listinfo/linux-pm
-- 
Benjamin Herrenschmidt <benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]               ` <20050314022308.GD6008-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
  2005-03-14  4:03                 ` David Brownell
@ 2005-03-14 22:31                 ` Alan Stern
       [not found]                   ` <Pine.LNX.4.44L0.0503141717240.620-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
  2005-03-15 21:48                 ` Alan Stern
  2 siblings, 1 reply; 31+ messages in thread
From: Alan Stern @ 2005-03-14 22:31 UTC (permalink / raw)
  To: Bernard Blackham
  Cc: ncunningham-3EexvZdKGZRWk0Htik3J/w, Linux-pm mailing list,
	Pavel Machek

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1921 bytes --]

On Mon, 14 Mar 2005, Bernard Blackham wrote:

> Not happy yet. Passing D0 to uhci suspend's method is the cause of
> the issue (it doesn't handle it particularly well), and D0 is also
> passed when PMSG_FREEZE is sent. So while it works for entering S3,
> it fails in the same way when preparing for the atomic copy in S4.
> 
> Hacking around the issue by always using D3hot should return it back
> to the pre-drivermodel behaviour and mostly works. However at least
> one person has reported that even doing this, uhci_hcd alone
> suspends fine, but if both uhci_hcd and ehci_hcd are loaded, USB is
> dead on resume until reloading both modules. I'm still digging
> deeper and may need to enlist the help of linux-usb-devel...

I'm seeing similar problems on my system, using the gregkh-2.6 kernel.  
The motherboard is a PIIX4 and the builtin UHCI controller doesn't support 
PCI PM.  The following sequence of commands provokes the problem (with no 
USB devices attached):

	cd /sys/device/pci*/*7.2	# The UHCI controller's directory
	echo -n 3 >usb1/power/state	# Suspend the root hub
	echo -n 3 >power/state		# Suspend the controller
	echo -n 0 >power/state		# Resume the controller

(The log indicates at this point that it's resuming from D0.)

	echo -n 0 >usb1/power/state	# Try to resume the root hub

Now things go crazy.  For some reason the controller is left in an
unconfigured state, as revealed by lspci -vvb (no I/O mapping, IRQs not
routed anywhere).  I'm not sure if that happens before or after the final
command.  I'm also not sure if it's caused by the resume routines in
hcd-pci.c or the resume routines in uhci-hcd.c.

The whole suspend/resume support in uhci-hcd is due for a drastic 
overhaul.  I've just done the work leading up to it; finishing it will 
take some time -- maybe a few weeks.

I'll try running some more tests when there's a chance, and let you know 
what turns up.

Alan Stern


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                   ` <Pine.LNX.4.44L0.0503141717240.620-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
@ 2005-03-14 22:38                     ` Nigel Cunningham
  0 siblings, 0 replies; 31+ messages in thread
From: Nigel Cunningham @ 2005-03-14 22:38 UTC (permalink / raw)
  To: Alan Stern; +Cc: Bernard Blackham, Linux-pm mailing list, Pavel Machek

[-- Attachment #1: Type: text/plain, Size: 503 bytes --]

Hi.

On Tue, 2005-03-15 at 09:31, Alan Stern wrote:
> The whole suspend/resume support in uhci-hcd is due for a drastic 
> overhaul.  I've just done the work leading up to it; finishing it will 
> take some time -- maybe a few weeks.

Alan, you'll be my hero when you're done! :>

Nigel
-- 
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028;  Mob: +61 (417) 100 574

Maintainer of Suspend2 Kernel Patches http://suspend2.net


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]               ` <20050314022308.GD6008-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
  2005-03-14  4:03                 ` David Brownell
  2005-03-14 22:31                 ` Alan Stern
@ 2005-03-15 21:48                 ` Alan Stern
       [not found]                   ` <Pine.LNX.4.44L0.0503151636290.711-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
  2 siblings, 1 reply; 31+ messages in thread
From: Alan Stern @ 2005-03-15 21:48 UTC (permalink / raw)
  To: Bernard Blackham
  Cc: David Brownell, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	Linux-pm mailing list, Pavel Machek

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3842 bytes --]

On Mon, 14 Mar 2005, Bernard Blackham wrote:

> Not happy yet. Passing D0 to uhci suspend's method is the cause of
> the issue (it doesn't handle it particularly well), and D0 is also
> passed when PMSG_FREEZE is sent. So while it works for entering S3,
> it fails in the same way when preparing for the atomic copy in S4.
> 
> Hacking around the issue by always using D3hot should return it back
> to the pre-drivermodel behaviour and mostly works. However at least
> one person has reported that even doing this, uhci_hcd alone
> suspends fine, but if both uhci_hcd and ehci_hcd are loaded, USB is
> dead on resume until reloading both modules. I'm still digging
> deeper and may need to enlist the help of linux-usb-devel...

It turns out there are a few problems in the hcd-pci.c suspend and resume
routines; mostly that the resume routine doesn't do the reverse of what
the suspend routine does.  One particularly bad symptom is that when
"resuming" from D0 it does pci_restore_state() even though
pci_save_state() was never called.  That's why my controller ended up
unconfigured.

One other point: Suspend calls free_irq() and resume calls request_irq().  
This doesn't seem necessary to me since the common IRQ handler will reject
interrupts occuring while the controller is suspended.  Also it's asking
for trouble if the driver is unloaded before the controller is resumed,
since the remove routine will call free_irq() again on its own.  I've
#ifdef'ed out those calls below, but this deserves closer attention.

Another thing deserving closer attention is the various error pathways and
what state they end up leaving the hardware and the data structures in.  
The patch below doesn't address this.

Anyway, here are the changes I made.  Now things are working better again.  
Bernard, does this help you?

Alan Stern



===== drivers/usb/core/hcd-pci.c 1.76 vs edited =====
--- 1.76/drivers/usb/core/hcd-pci.c	2005-03-11 02:15:12 -05:00
+++ edited/drivers/usb/core/hcd-pci.c	2005-03-15 16:26:14 -05:00
@@ -256,7 +256,9 @@
 		if (!dev->current_state) {
 			pci_save_state (dev);
 			pci_disable_device (dev);
+#if 0
 			free_irq (hcd->irq, hcd);
+#endif
 		}
 
 		if (!has_pci_pm) {
@@ -294,7 +296,6 @@
 		break;
 	}
 
-	/* update power_state **ONLY** to make sysfs happier */
 	if (retval == 0)
 		dev->dev.power.power_state = state;
 	return retval;
@@ -328,22 +329,38 @@
 
 	hcd->state = USB_STATE_RESUMING;
 
-	if (has_pci_pm)
-		pci_set_power_state (dev, 0);
-	dev->dev.power.power_state = PMSG_ON;
-	retval = request_irq (dev->irq, usb_hcd_irq, SA_SHIRQ,
-				hcd->driver->description, hcd);
-	if (retval < 0) {
-		dev_err (hcd->self.controller,
-			"can't restore IRQ after resume!\n");
-		return retval;
-	}
-	hcd->saw_irq = 0;
-	pci_restore_state (dev);
+	if (dev->current_state > PCI_D0) {
+		if (has_pci_pm) {
 #ifdef	CONFIG_USB_SUSPEND
-	pci_enable_wake (dev, dev->current_state, 0);
-	pci_enable_wake (dev, 4, 0);
+			pci_enable_wake (dev, dev->current_state, 0);
+			pci_enable_wake (dev, 4, 0);
+#endif
+			retval = pci_set_power_state (dev, PCI_D0);
+			if (retval < 0) {
+				dev_dbg (&dev->dev, "PCI resume fail, %d\n",
+						retval);
+				return retval;
+			}
+		}
+		dev->dev.power.power_state = PMSG_ON;
+#if 0
+		retval = request_irq (dev->irq, usb_hcd_irq, SA_SHIRQ,
+				hcd->irq_descr, hcd);
+		if (retval < 0) {
+			dev_err (hcd->self.controller,
+				"can't restore IRQ after resume!\n");
+			return retval;
+		}
 #endif
+		hcd->saw_irq = 0;
+		retval = pci_enable_device (dev);
+		if (retval < 0) {
+			dev_err (hcd->self.controller,
+				"can't enable device after resume!\n");
+			return retval;
+		}
+		pci_restore_state (dev);
+	}
 
 	retval = hcd->driver->resume (hcd);
 	if (!HCD_IS_RUNNING (hcd->state)) {
@@ -357,5 +374,3 @@
 EXPORT_SYMBOL (usb_hcd_pci_resume);
 
 #endif	/* CONFIG_PM */
-
-


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                   ` <Pine.LNX.4.44L0.0503151636290.711-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
@ 2005-03-15 21:52                     ` Pavel Machek
  2005-03-15 22:11                     ` David Brownell
  2005-03-16  3:04                     ` Bernard Blackham
  2 siblings, 0 replies; 31+ messages in thread
From: Pavel Machek @ 2005-03-15 21:52 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Bernard Blackham,
	ncunningham-3EexvZdKGZRWk0Htik3J/w, Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 793 bytes --]

Hi!

> One other point: Suspend calls free_irq() and resume calls request_irq().  
> This doesn't seem necessary to me since the common IRQ handler will reject
> interrupts occuring while the controller is suspended.  Also it's asking
> for trouble if the driver is unloaded before the controller is resumed,
> since the remove routine will call free_irq() again on its own.  I've
> #ifdef'ed out those calls below, but this deserves closer attention.

Be carefull with free_irq() / request_irq(). For some reason I needed
to add them to b44 driver. I'm not sure what is going on, perhaps we
are not saving interrupt controller state right?
								Pavel

-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                   ` <Pine.LNX.4.44L0.0503151636290.711-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
  2005-03-15 21:52                     ` Pavel Machek
@ 2005-03-15 22:11                     ` David Brownell
  2005-03-16  3:04                     ` Bernard Blackham
  2 siblings, 0 replies; 31+ messages in thread
From: David Brownell @ 2005-03-15 22:11 UTC (permalink / raw)
  To: Alan Stern
  Cc: Bernard Blackham, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	Linux-pm mailing list, Pavel Machek

[-- Attachment #1: Type: text/plain, Size: 1796 bytes --]

On Tuesday 15 March 2005 1:48 pm, Alan Stern wrote:

> It turns out there are a few problems in the hcd-pci.c suspend and resume
> routines; mostly that the resume routine doesn't do the reverse of what
> the suspend routine does.  One particularly bad symptom is that when
> "resuming" from D0 it does pci_restore_state() even though
> pci_save_state() was never called.  That's why my controller ended up
> unconfigured.

I noticed that a while back, and had a patch for it.  Didn't seem to
matter on anything I tested; sorry, I should have posted the patch
just for comments, in case it mattered on other hardware!  Attached;
applies to current code with some fuzz, not guaranteed to behave.


> One other point: Suspend calls free_irq() and resume calls request_irq().  
> This doesn't seem necessary to me since the common IRQ handler will reject
> interrupts occuring while the controller is suspended.  Also it's asking
> for trouble if the driver is unloaded before the controller is resumed,
> since the remove routine will call free_irq() again on its own.  I've
> #ifdef'ed out those calls below, but this deserves closer attention.

Those request/free calls moved up out of PPC-specific code; Ben can
comment on why at least some of the Apple hardware needed it.  The
free_irq() should be safe, since it should be a NOP if there's no
handler associated with that cookie.


> Another thing deserving closer attention is the various error pathways and
> what state they end up leaving the hardware and the data structures in.  

As always!  My rule of thumb is that if there aren't three lines of
fault handling code for every "productive" line, it's likely there's
something wrong in a fault path.  That's not always accurate, but it's
surprisingly close on stable codebases.

- Dave


[-- Attachment #2: hcd.patch --]
[-- Type: text/x-diff, Size: 732 bytes --]

--- 1.50/drivers/usb/core/hcd-pci.c	Sun Nov  7 05:06:51 2004
+++ edited/drivers/usb/core/hcd-pci.c	Wed Nov 10 09:55:10 2004
@@ -407,9 +407,9 @@
 
 	hcd->state = USB_STATE_RESUMING;
 
-	if (has_pci_pm)
-		pci_set_power_state (dev, 0);
-	dev->dev.power.power_state = 0;
+	pci_enable_device (dev);
+	pci_restore_state (dev);
+	pci_set_master (dev);
 	retval = request_irq (dev->irq, usb_hcd_irq, SA_SHIRQ,
 				hcd->description, hcd);
 	if (retval < 0) {
@@ -418,7 +418,9 @@
 		return retval;
 	}
 	hcd->saw_irq = 0;
-	pci_restore_state (dev);
+	if (has_pci_pm)
+		pci_set_power_state (dev, 0);
+	dev->dev.power.power_state = 0;
 #ifdef	CONFIG_USB_SUSPEND
 	pci_enable_wake (dev, dev->current_state, 0);
 	pci_enable_wake (dev, 4, 0);

[-- Attachment #3: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                   ` <Pine.LNX.4.44L0.0503151636290.711-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
  2005-03-15 21:52                     ` Pavel Machek
  2005-03-15 22:11                     ` David Brownell
@ 2005-03-16  3:04                     ` Bernard Blackham
       [not found]                       ` <20050316030448.GA8588-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
  2 siblings, 1 reply; 31+ messages in thread
From: Bernard Blackham @ 2005-03-16  3:04 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	Linux-pm mailing list, Pavel Machek

[-- Attachment #1: Type: text/plain, Size: 3863 bytes --]

On Tue, Mar 15, 2005 at 04:48:19PM -0500, Alan Stern wrote:
> One other point: Suspend calls free_irq() and resume calls request_irq().  
> This doesn't seem necessary to me since the common IRQ handler will reject
> interrupts occuring while the controller is suspended.  Also it's asking
> for trouble if the driver is unloaded before the controller is resumed,
> since the remove routine will call free_irq() again on its own.  I've
> #ifdef'ed out those calls below, but this deserves closer attention.
> 
> Another thing deserving closer attention is the various error pathways and
> what state they end up leaving the hardware and the data structures in.  
> The patch below doesn't address this.
> 
> Anyway, here are the changes I made.  Now things are working better again.  
> Bernard, does this help you?

Somewhat. I had to tweak the patches for the new driver model
(specifically, change "power_state = state" to "power_state.event =
msg.event"), and I'm still using the patch to remove the
PCI_CAP_ID_PM check from pci_choose_state.

Suspend and resume into both S3, and suspend-to-disk (where
PMSG_FREEZE and PMSG_ON are used), are fine if there are no devices
plugged in. If there are devices plugged in, it's not so good.
(Using a boring USB mouse for testing here). They'll suspend and
resume the first time, suspend a second time, then fail on resume -
I get a stream of messages:

Mar 16 10:07:30 amidala kernel: uhci_hcd 0000:00:1d.1: suspend_hc
Mar 16 10:07:30 amidala kernel: uhci_hcd 0000:00:1d.1: wakeup_hc
Mar 16 10:07:31 amidala kernel: uhci_hcd 0000:00:1d.1: host controller process error, something bad happened!
Mar 16 10:07:31 amidala kernel: uhci_hcd 0000:00:1d.1: host controller halted, very bad!
Mar 16 10:07:31 amidala kernel: usb 3-1: gpilotd timed out on ep0in
Mar 16 10:07:33 amidala kernel: uhci_hcd 0000:00:1d.1: suspend_hc
Mar 16 10:07:33 amidala kernel: uhci_hcd 0000:00:1d.1: wakeup_hc
Mar 16 10:07:33 amidala kernel: uhci_hcd 0000:00:1d.1: host controller process error, something bad happened!
Mar 16 10:07:33 amidala kernel: uhci_hcd 0000:00:1d.1: host controller halted, very bad!
Mar 16 10:07:35 amidala kernel: uhci_hcd 0000:00:1d.1: suspend_hc
Mar 16 10:07:35 amidala kernel: uhci_hcd 0000:00:1d.1: wakeup_hc
Mar 16 10:07:36 amidala kernel: uhci_hcd 0000:00:1d.1: host controller process error, something bad happened!

repeating over and over again. USB is dead, and any program that
tries to touch USB will hang:

Mar 16 10:08:55 amidala kernel: Xorg          D C041E3E0     0  3214   3118          3952       (NOTLB)
Mar 16 10:08:55 amidala kernel: f5fb5ee8 00003082 f5ae0570 c041e3e0 f5a2fb80 f5fb5ee8 c026271a c1bfbe00 
Mar 16 10:08:55 amidala kernel:        f5a2fb80 c01e3f43 f6c08020 00000c34 c53639fe 00000017 f5ae06c4 f5a2fb80 
Mar 16 10:08:55 amidala kernel:        f5fb5f18 f5fb5f0c f5fb5f44 c026321d f5a2fb80 fffffffe 00000000 f5ae0570 
Mar 16 10:08:55 amidala kernel: Call Trace:
Mar 16 10:08:55 amidala kernel:  [usb_kill_urb+221/304] usb_kill_urb+0xdd/0x130
Mar 16 10:08:55 amidala kernel:  [input_close_device+57/64] input_close_device+0x39/0x40
Mar 16 10:08:55 amidala kernel:  [mixdev_release+58/112] mixdev_release+0x3a/0x70
Mar 16 10:08:55 amidala kernel:  [__fput+301/320] __fput+0x12d/0x140
Mar 16 10:08:55 amidala kernel:  [filp_close+87/144] filp_close+0x57/0x90
Mar 16 10:08:55 amidala kernel:  [sys_close+97/160] sys_close+0x61/0xa0
Mar 16 10:08:55 amidala kernel:  [syscall_call+7/11] syscall_call+0x7/0xb

and strangely enough, soon every process in the system becomes hung.
I haven't tried David's patch yet - the only differences are it
seems to use a slightly different order, and also calls
pci_set_master. I'll give that a spin later today combined with and
without yours to see what works reliably.

Thanks,

Bernard.

-- 
 Bernard Blackham <bernard at blackham dot com dot au>

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                       ` <20050316030448.GA8588-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
@ 2005-03-16 15:45                         ` Alan Stern
       [not found]                           ` <Pine.LNX.4.44L0.0503161024360.1040-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Alan Stern @ 2005-03-16 15:45 UTC (permalink / raw)
  To: Bernard Blackham
  Cc: David Brownell, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	Linux-pm mailing list, Pavel Machek

[-- Attachment #1: Type: TEXT/PLAIN, Size: 5554 bytes --]

On Wed, 16 Mar 2005, Bernard Blackham wrote:

> > Bernard, does this help you?
> 
> Somewhat. I had to tweak the patches for the new driver model
> (specifically, change "power_state = state" to "power_state.event =
> msg.event"), and I'm still using the patch to remove the
> PCI_CAP_ID_PM check from pci_choose_state.

The power_state.event change shouldn't affect the code's functioning.  
Neither should the PCI_CAP_ID_PM check provided it still ends up deciding 
to "suspend" to D0.

> Suspend and resume into both S3, and suspend-to-disk (where
> PMSG_FREEZE and PMSG_ON are used), are fine if there are no devices
> plugged in. If there are devices plugged in, it's not so good.
> (Using a boring USB mouse for testing here). They'll suspend and
> resume the first time, suspend a second time, then fail on resume -
> I get a stream of messages:
> 
> Mar 16 10:07:30 amidala kernel: uhci_hcd 0000:00:1d.1: suspend_hc
> Mar 16 10:07:30 amidala kernel: uhci_hcd 0000:00:1d.1: wakeup_hc
> Mar 16 10:07:31 amidala kernel: uhci_hcd 0000:00:1d.1: host controller process error, something bad happened!
> Mar 16 10:07:31 amidala kernel: uhci_hcd 0000:00:1d.1: host controller halted, very bad!
> Mar 16 10:07:31 amidala kernel: usb 3-1: gpilotd timed out on ep0in
> Mar 16 10:07:33 amidala kernel: uhci_hcd 0000:00:1d.1: suspend_hc
> Mar 16 10:07:33 amidala kernel: uhci_hcd 0000:00:1d.1: wakeup_hc
> Mar 16 10:07:33 amidala kernel: uhci_hcd 0000:00:1d.1: host controller process error, something bad happened!
> Mar 16 10:07:33 amidala kernel: uhci_hcd 0000:00:1d.1: host controller halted, very bad!
> Mar 16 10:07:35 amidala kernel: uhci_hcd 0000:00:1d.1: suspend_hc
> Mar 16 10:07:35 amidala kernel: uhci_hcd 0000:00:1d.1: wakeup_hc
> Mar 16 10:07:36 amidala kernel: uhci_hcd 0000:00:1d.1: host controller process error, something bad happened!

The repeating over-and-over results from the uhci-hcd driver's lack of
proper error handling; it's perfectly happy restarting a controller that
has died.  There's not much to do about it at the moment except prevent
the condition that triggered it initially.  Those "process error"s are
bad; they should never happen.

> I haven't tried David's patch yet - the only differences are it
> seems to use a slightly different order, and also calls
> pci_set_master. I'll give that a spin later today combined with and
> without yours to see what works reliably.

The missing pci_set_master in my patch probably explains the errors you 
got.

David's patch doesn't make all the necessary changes that my patch does.  
In particular it doesn't avoid doing pci_restore_state when there was no
prior pci_save_state, so it will continue to exhibit some of the problems
we were seeing before.


On Tue, 15 Mar 2005, Pavel Machek wrote:

> Be carefull with free_irq() / request_irq(). For some reason I needed
> to add them to b44 driver. I'm not sure what is going on, perhaps we 
> are not saving interrupt controller state right?

On Tue, 15 Mar 2005, David Brownell wrote:

> Those request/free calls moved up out of PPC-specific code; Ben can
> comment on why at least some of the Apple hardware needed it.  The 
> free_irq() should be safe, since it should be a NOP if there's no  
> handler associated with that cookie.

It's safe to remove the "#if 0 ... #endif" pairs that my patch leaves
around free_irq and request_irq.  Note however that the original code was 
passing an incorrect label to request_irq!  My fault; I didn't realize 
this code path existed when I changed the label earlier.

Below is a new version of my patch with the IRQ handling reinstated and
the missing pci_set_master included.  I'll do some more testing and get
back to you...

Alan Stern



===== drivers/usb/core/hcd-pci.c 1.76 vs edited =====
--- 1.76/drivers/usb/core/hcd-pci.c	2005-03-11 02:15:12 -05:00
+++ edited/drivers/usb/core/hcd-pci.c	2005-03-16 10:43:50 -05:00
@@ -294,7 +294,6 @@
 		break;
 	}
 
-	/* update power_state **ONLY** to make sysfs happier */
 	if (retval == 0)
 		dev->dev.power.power_state = state;
 	return retval;
@@ -328,22 +327,37 @@
 
 	hcd->state = USB_STATE_RESUMING;
 
-	if (has_pci_pm)
-		pci_set_power_state (dev, 0);
-	dev->dev.power.power_state = PMSG_ON;
-	retval = request_irq (dev->irq, usb_hcd_irq, SA_SHIRQ,
-				hcd->driver->description, hcd);
-	if (retval < 0) {
-		dev_err (hcd->self.controller,
-			"can't restore IRQ after resume!\n");
-		return retval;
-	}
-	hcd->saw_irq = 0;
-	pci_restore_state (dev);
+	if (dev->current_state > PCI_D0) {
+		if (has_pci_pm) {
 #ifdef	CONFIG_USB_SUSPEND
-	pci_enable_wake (dev, dev->current_state, 0);
-	pci_enable_wake (dev, 4, 0);
+			pci_enable_wake (dev, dev->current_state, 0);
+			pci_enable_wake (dev, 4, 0);
 #endif
+			retval = pci_set_power_state (dev, PCI_D0);
+			if (retval < 0) {
+				dev_dbg (&dev->dev, "PCI resume fail, %d\n",
+						retval);
+				return retval;
+			}
+		}
+		dev->dev.power.power_state = PMSG_ON;
+		retval = request_irq (dev->irq, usb_hcd_irq, SA_SHIRQ,
+				hcd->irq_descr, hcd);
+		if (retval < 0) {
+			dev_err (hcd->self.controller,
+				"can't restore IRQ after resume!\n");
+			return retval;
+		}
+		hcd->saw_irq = 0;
+		retval = pci_enable_device (dev);
+		if (retval < 0) {
+			dev_err (hcd->self.controller,
+				"can't enable device after resume!\n");
+			return retval;
+		}
+		pci_set_master (dev);
+		pci_restore_state (dev);
+	}
 
 	retval = hcd->driver->resume (hcd);
 	if (!HCD_IS_RUNNING (hcd->state)) {
@@ -357,5 +371,3 @@
 EXPORT_SYMBOL (usb_hcd_pci_resume);
 
 #endif	/* CONFIG_PM */
-
-


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                           ` <Pine.LNX.4.44L0.0503161024360.1040-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
@ 2005-03-16 16:52                             ` Bernard Blackham
       [not found]                               ` <20050316165239.GA10545-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
  2005-03-16 21:09                             ` David Brownell
  1 sibling, 1 reply; 31+ messages in thread
From: Bernard Blackham @ 2005-03-16 16:52 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	Linux-pm mailing list, Pavel Machek

[-- Attachment #1: Type: text/plain, Size: 1863 bytes --]

On Wed, Mar 16, 2005 at 10:45:57AM -0500, Alan Stern wrote:
> The power_state.event change shouldn't affect the code's functioning.  
> Neither should the PCI_CAP_ID_PM check provided it still ends up deciding 
> to "suspend" to D0.

Can you elaborate on '"suspend" to D0' ? If the PCI_CAP_ID_PM check
is in place, the state variable in usb_hcd_pci_suspend is set to 0,
which means the if statement here:

    case HCD_STATE_SUSPENDED:
        if (state <= dev->current_state)
            break;

is hit, and pci_save_state, pci_disable_device, and free_irq are
never called.

If, on the other hand the PCI_CAP_ID_PM check is removed from
pci_choose_state, "state" is D3hot, so the PCI disabling code *is*
called, and the function bails out when it hits if (!has_pci_pm) {...

> Below is a new version of my patch with the IRQ handling reinstated and
> the missing pci_set_master included.  I'll do some more testing and get
> back to you...

Needed an extra patch to compile, attached. However, it still
doesn't survive more than one suspend/resume with devices plugged
in. On the second resume, USB is dead, and it enters the same
suspend_hc/wakeup_hc loop. I traced this back to the hcd never
really being resumed, as the pci power state never changes from D0
as pci_set_power_state is never called here. The attached patch also
addresses this - I'm not sure if it's the right way, but it
works(ish):

Suspend and resume for S3 or suspend to disk, without devices
plugged in, works fine. If my USB mouse is plugged in, it is resumed
and powered up, but nothing is sent out /dev/input/mice until I
unplug and replug the device (but USB is not dead :)

I'll forward this (combined) patch to the user on the Suspend2 list
having issues suspending when both uhci and ehci were loaded.

Thanks,

Bernard.

-- 
 Bernard Blackham <bernard at blackham dot com dot au>

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                               ` <20050316165239.GA10545-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
@ 2005-03-16 18:44                                 ` Alan Stern
       [not found]                                   ` <Pine.LNX.4.44L0.0503161336500.1040-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Alan Stern @ 2005-03-16 18:44 UTC (permalink / raw)
  To: Bernard Blackham
  Cc: David Brownell, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	Linux-pm mailing list, Pavel Machek

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2321 bytes --]

On Thu, 17 Mar 2005, Bernard Blackham wrote:

> On Wed, Mar 16, 2005 at 10:45:57AM -0500, Alan Stern wrote:
> > The power_state.event change shouldn't affect the code's functioning.  
> > Neither should the PCI_CAP_ID_PM check provided it still ends up deciding 
> > to "suspend" to D0.
> 
> Can you elaborate on '"suspend" to D0' ? If the PCI_CAP_ID_PM check
> is in place, the state variable in usb_hcd_pci_suspend is set to 0,
> which means the if statement here:
> 
>     case HCD_STATE_SUSPENDED:
>         if (state <= dev->current_state)
>             break;
> 
> is hit, and pci_save_state, pci_disable_device, and free_irq are
> never called.

Yes, that's what was happening to me.

> If, on the other hand the PCI_CAP_ID_PM check is removed from
> pci_choose_state, "state" is D3hot, so the PCI disabling code *is*
> called, and the function bails out when it hits if (!has_pci_pm) {...

That's different then, and it should be less troublesome.

> > Below is a new version of my patch with the IRQ handling reinstated and
> > the missing pci_set_master included.  I'll do some more testing and get
> > back to you...
> 
> Needed an extra patch to compile, attached.

I didn't get your attachment.

>  However, it still
> doesn't survive more than one suspend/resume with devices plugged
> in. On the second resume, USB is dead, and it enters the same
> suspend_hc/wakeup_hc loop. I traced this back to the hcd never
> really being resumed, as the pci power state never changes from D0
> as pci_set_power_state is never called here. The attached patch also
> addresses this - I'm not sure if it's the right way, but it
> works(ish):

Can _you_ elaborate on "never really being resumed"?  Since the state 
never changes from D0 much of the code in usb_hcd_pci_suspend never runs, 
as you said above.  In fact about the only thing that does happen is the 
call to hcd->driver->suspend.  However the reverse call to 
hcd->driver->resume does get made in usb_hcd_pci_resume even though the 
state upon entry is D0.

> Suspend and resume for S3 or suspend to disk, without devices
> plugged in, works fine. If my USB mouse is plugged in, it is resumed
> and powered up, but nothing is sent out /dev/input/mice until I
> unplug and replug the device (but USB is not dead :)

I'll try this out right now.

Alan Stern


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                                       ` <20050314193054.GO5461-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-16 21:05                                         ` David Brownell
       [not found]                                           ` <200503161305.57698.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: David Brownell @ 2005-03-16 21:05 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Bernard Blackham, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	linux-pm-qjLDD68F18O7TbgM5vRIOg

[-- Attachment #1: Type: text/plain, Size: 1978 bytes --]

On Monday 14 March 2005 11:30 am, Pavel Machek wrote:
> On Po 14-03-05 10:59:25, David Brownell wrote:
> 
> > > pci_choose_state() needs to return d3hot even 
> > > for pmsg_freeze, because that's what old code did, and I did not audit
> > > all the drivers.
> > 
> > Seems like a problem to me.  Do S1/S2/S3 transitions even
> > need a "freeze" transition?  I thought we'd agreed they don't;
> 
> That's not the problem.

It's _a_ problem, maybe not the one you want to focus on right now.


> Old code put devices into D3hot in swsusp "freeze" case. We'll need to
> do the same now, slowly auditing the drivers and removing that
> unneccessary D0->D3hot->D0 transition.

That's true.  The extra suspend/resume transition is trouble;
it's not necessary for the checkpoint stage, or system poweroff.


For the record, I've recently observed that all the swsusp issues
start making sense to me when I start thinking of swsusp as being
completely unrelated to suspend states.  (S4bios aside...)  And if
I don't think of it that way, I keep tripping over complications
where it's fighting against "real" suspend states.

The thing is, swsusp in normal usage does not involve system
suspend states like S1/S2/S3, or their analogues in non-ACPI
embedded systems.  Neither does it involve wakeup from those
states ... in fact, it fights against addressing all those.


If swsusp were called a system checkpoint/restore mechanism, it'd
have a much clearer relationship to power management:  enabling
system power-off is a useful side effect, but it's not exactly
the point of a checkpoint mechanism.  I suspect that if it were
packaged as halt-after-checkpoint, plus resume-from-checkpoint,
a lot of technical issues would start shaking out better.  Also
maybe some political/funding ones ... checkpointing has much
value for enterprise server operations.


OK, that's enough of a radical perspective for the moment.
I'm not sure I'll throw any more monkey wrenches today.  ;)

- Dave

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                           ` <Pine.LNX.4.44L0.0503161024360.1040-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
  2005-03-16 16:52                             ` Bernard Blackham
@ 2005-03-16 21:09                             ` David Brownell
       [not found]                               ` <200503161309.34142.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
  1 sibling, 1 reply; 31+ messages in thread
From: David Brownell @ 2005-03-16 21:09 UTC (permalink / raw)
  To: Alan Stern
  Cc: Bernard Blackham, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	Linux-pm mailing list, Pavel Machek

[-- Attachment #1: Type: text/plain, Size: 875 bytes --]

On Wednesday 16 March 2005 7:45 am, Alan Stern wrote:
> On Wed, 16 Mar 2005, Bernard Blackham wrote:
>
> > I haven't tried David's patch yet - the only differences are it
> > seems to use a slightly different order, and also calls
> > pci_set_master. I'll give that a spin later today combined with and
> > without yours to see what works reliably.
> 
> The missing pci_set_master in my patch probably explains the errors you 
> got.
> 
> David's patch doesn't make all the necessary changes that my patch does.  
> In particular it doesn't avoid doing pci_restore_state when there was no
> prior pci_save_state, so it will continue to exhibit some of the problems
> we were seeing before.

Well, as I said it wasn't quite ready for prime time.  It's possible that
would explain some other USB related resume issues.

I'm not sure why your patch is so big/complex t hough...

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                                           ` <200503161305.57698.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-16 21:44                                             ` Nigel Cunningham
       [not found]                                               ` <1111009442.3240.28.camel-r49W/1Cwd2ff0s6lnCXPX/uOuaPYTxhvJwvTLr3MMZM@public.gmane.org>
  2005-03-16 21:57                                             ` Pavel Machek
  1 sibling, 1 reply; 31+ messages in thread
From: Nigel Cunningham @ 2005-03-16 21:44 UTC (permalink / raw)
  To: David Brownell; +Cc: Bernard Blackham, Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 3708 bytes --]

Hi.

On Thu, 2005-03-17 at 08:05, David Brownell wrote:
> On Monday 14 March 2005 11:30 am, Pavel Machek wrote:
> > On Po 14-03-05 10:59:25, David Brownell wrote:
> > 
> > > > pci_choose_state() needs to return d3hot even 
> > > > for pmsg_freeze, because that's what old code did, and I did not audit
> > > > all the drivers.
> > > 
> > > Seems like a problem to me.  Do S1/S2/S3 transitions even
> > > need a "freeze" transition?  I thought we'd agreed they don't;
> > 
> > That's not the problem.
> 
> It's _a_ problem, maybe not the one you want to focus on right now.
> 
> 
> > Old code put devices into D3hot in swsusp "freeze" case. We'll need to
> > do the same now, slowly auditing the drivers and removing that
> > unneccessary D0->D3hot->D0 transition.
> 
> That's true.  The extra suspend/resume transition is trouble;
> it's not necessary for the checkpoint stage, or system poweroff.
> 
> 
> For the record, I've recently observed that all the swsusp issues
> start making sense to me when I start thinking of swsusp as being
> completely unrelated to suspend states.  (S4bios aside...)  And if
> I don't think of it that way, I keep tripping over complications
> where it's fighting against "real" suspend states.
> 
> The thing is, swsusp in normal usage does not involve system
> suspend states like S1/S2/S3, or their analogues in non-ACPI
> embedded systems.  Neither does it involve wakeup from those
> states ... in fact, it fights against addressing all those.

Both swsusp and suspend2 can enter S4 as their method of powering down,
and do use the prepare, enter and finish methods when doing so.

> If swsusp were called a system checkpoint/restore mechanism, it'd
> have a much clearer relationship to power management:  enabling
> system power-off is a useful side effect, but it's not exactly
> the point of a checkpoint mechanism.  I suspect that if it were
> packaged as halt-after-checkpoint, plus resume-from-checkpoint,
> a lot of technical issues would start shaking out better.  Also
> maybe some political/funding ones ... checkpointing has much
> value for enterprise server operations.

Checkpointing and restoring is very different from what swsusp and
suspend2 do. I've been asked a number of times if I could make Suspend2
do checkpointing as well as suspending, and it's simply not possible.
The reason is that we really are suspending to disk. We're writing the
entire contents of memory, and those contents are only valid while the
disk is in exactly the state it is in when we suspend. As soon as you
change a file on the disk (as you would after checkpointing), the image
is invalid and will create corruption if restored. To do checkpointing,
we'd at least have to  modify the implementations to be able to reverse
on disk changes back to the point where the checkpoint was taken, and
probably also add a mechanism for tracking in-memory changes and
updating the image to reflect them. It's not impossible, but it's not
what swsusp and Suspend2 do now.

I realise you're writing in the context of freezing drivers, and may
simply be thinking in terms of the action being the same as would be
required if you were checkpointing. From that point of view I'd probably
agree. But in the first instance, I'm reacting to you speaking of
calling swsusp a system checkpoint/restore mechanism.

Hope that helps.

Nigel

> OK, that's enough of a radical perspective for the moment.
> I'm not sure I'll throw any more monkey wrenches today.  ;)
> 
> - Dave
-- 
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028;  Mob: +61 (417) 100 574

Maintainer of Suspend2 Kernel Patches http://suspend2.net


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                                           ` <200503161305.57698.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
  2005-03-16 21:44                                             ` Nigel Cunningham
@ 2005-03-16 21:57                                             ` Pavel Machek
  1 sibling, 0 replies; 31+ messages in thread
From: Pavel Machek @ 2005-03-16 21:57 UTC (permalink / raw)
  To: David Brownell
  Cc: Bernard Blackham, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	linux-pm-qjLDD68F18O7TbgM5vRIOg

[-- Attachment #1: Type: text/plain, Size: 2519 bytes --]

On St 16-03-05 13:05:57, David Brownell wrote:
> On Monday 14 March 2005 11:30 am, Pavel Machek wrote:
> > On Po 14-03-05 10:59:25, David Brownell wrote:
> > 
> > > > pci_choose_state() needs to return d3hot even 
> > > > for pmsg_freeze, because that's what old code did, and I did not audit
> > > > all the drivers.
> > > 
> > > Seems like a problem to me.  Do S1/S2/S3 transitions even
> > > need a "freeze" transition?  I thought we'd agreed they don't;
> > 
> > That's not the problem.
> 
> It's _a_ problem, maybe not the one you want to focus on right now.

We do not do "device_suspend(...FREEZE)" for suspend-to-RAM (= S3). We
freeze processes in order to make driver's life easier.

> > Old code put devices into D3hot in swsusp "freeze" case. We'll need to
> > do the same now, slowly auditing the drivers and removing that
> > unneccessary D0->D3hot->D0 transition.
> 
> That's true.  The extra suspend/resume transition is trouble;
> it's not necessary for the checkpoint stage, or system poweroff.
> 
> 
> For the record, I've recently observed that all the swsusp issues
> start making sense to me when I start thinking of swsusp as being
> completely unrelated to suspend states.  (S4bios aside...)  And if
> I don't think of it that way, I keep tripping over complications
> where it's fighting against "real" suspend states.
> 
> The thing is, swsusp in normal usage does not involve system
> suspend states like S1/S2/S3, or their analogues in non-ACPI
> embedded systems.  Neither does it involve wakeup from those
> states ... in fact, it fights against addressing all those.
> 
> 
> If swsusp were called a system checkpoint/restore mechanism, it'd
> have a much clearer relationship to power management:  enabling
> system power-off is a useful side effect, but it's not exactly
> the point of a checkpoint mechanism.  I suspect that if it were
> packaged as halt-after-checkpoint, plus resume-from-checkpoint,
> a lot of technical issues would start shaking out better.  Also
> maybe some political/funding ones ... checkpointing has much
> value for enterprise server operations.

Yes, I pretty much agree with this one. OTOH drivers must be aware
that powerdown *is* eventually going to happen, and that state of
their devices is going to be lost. And they can't just store it just
before powerdown, because that context is going to be lost.
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                               ` <200503161309.34142.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-16 22:10                                 ` Alan Stern
  0 siblings, 0 replies; 31+ messages in thread
From: Alan Stern @ 2005-03-16 22:10 UTC (permalink / raw)
  To: David Brownell
  Cc: Bernard Blackham, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	Linux-pm mailing list, Pavel Machek

[-- Attachment #1: Type: TEXT/PLAIN, Size: 412 bytes --]

On Wed, 16 Mar 2005, David Brownell wrote:

> I'm not sure why your patch is so big/complex t hough...

It's not as big as it looks; a large part of it just changes indentation
levels.  And it certainly isn't complex.  All it does is add a missing
test and rearrange some lines of code.  The idea is that the events in the 
resume routine should be the reverse of the events in the suspend routine.

Alan Stern


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                                               ` <1111009442.3240.28.camel-r49W/1Cwd2ff0s6lnCXPX/uOuaPYTxhvJwvTLr3MMZM@public.gmane.org>
@ 2005-03-16 22:12                                                 ` David Brownell
       [not found]                                                   ` <200503161412.42387.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: David Brownell @ 2005-03-16 22:12 UTC (permalink / raw)
  To: ncunningham-3EexvZdKGZRWk0Htik3J/w
  Cc: Bernard Blackham, Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 4221 bytes --]

Hi,

On Wednesday 16 March 2005 1:44 pm, Nigel Cunningham wrote:
> On Thu, 2005-03-17 at 08:05, David Brownell wrote:
> > For the record, I've recently observed that all the swsusp issues
> > start making sense to me when I start thinking of swsusp as being
> > completely unrelated to suspend states.  (S4bios aside...)  And if
> > I don't think of it that way, I keep tripping over complications
> > where it's fighting against "real" suspend states.
> > 
> > The thing is, swsusp in normal usage does not involve system
> > suspend states like S1/S2/S3, or their analogues in non-ACPI
> > embedded systems.  Neither does it involve wakeup from those
> > states ... in fact, it fights against addressing all those.
> 
> Both swsusp and suspend2 can enter S4 as their method of powering down,
> and do use the prepare, enter and finish methods when doing so.

As I said, S4 aside.  The history I recall is that swsusp came
out a fair degree of frustration with getting Linux to work
with the BIOS support ... and lack of firmware init for video,
etc.  And certainly S4 modes don't seem to be the default, or
widely used/tested.


> > If swsusp were called a system checkpoint/restore mechanism, it'd
> > have a much clearer relationship to power management:  enabling
> > system power-off is a useful side effect, but it's not exactly
> > the point of a checkpoint mechanism.  I suspect that if it were
> > packaged as halt-after-checkpoint, plus resume-from-checkpoint,
> > a lot of technical issues would start shaking out better.  Also
> > maybe some political/funding ones ... checkpointing has much
> > value for enterprise server operations.
> 
> Checkpointing and restoring is very different from what swsusp and
> suspend2 do. I've been asked a number of times if I could make Suspend2
> do checkpointing as well as suspending, and it's simply not possible.

That is, other folk have noticed that it's essentially the same
problem...

> The reason is that we really are suspending to disk. We're writing the
> entire contents of memory, and those contents are only valid while the
> disk is in exactly the state it is in when we suspend. As soon as you
> change a file on the disk (as you would after checkpointing), the image
> is invalid and will create corruption if restored. To do checkpointing,
> we'd at least have to  modify the implementations to be able to reverse
> on disk changes back to the point where the checkpoint was taken, and
> probably also add a mechanism for tracking in-memory changes and
> updating the image to reflect them. It's not impossible, but it's not
> what swsusp and Suspend2 do now.

True, not what they do now.  I didn't intend to imply they did.
A checkpoint package would have key differences, including those.

That wasn't my point:  that swsusp, in normal usage, is more of a
checkpoint/restore than a suspend/resume.  Which is why some of what
it wants is different from suspend/resume.  In particular, since
the devices are goin to be powered off, the resume paths are VERY
different.  They are in fact reset paths, not resume paths.


> I realise you're writing in the context of freezing drivers, and may
> simply be thinking in terms of the action being the same as would be
> required if you were checkpointing. From that point of view I'd probably
> agree. But in the first instance, I'm reacting to you speaking of
> calling swsusp a system checkpoint/restore mechanism.

I'm trying to articulate why so much of the swsusp stuff seems (to me)
to be fighting against "normal" suspend/resume and wakeup mechanisms,
which can more naturally be built piecemeal from selective suspend.
(And why there's any pushback on selective suspend, given that it's so
basic to "normal" suspend/resume/wakeup.)

So yes, that's from the driver perspective.

- Dave



> Hope that helps.
> 
> Nigel
> 
> > OK, that's enough of a radical perspective for the moment.
> > I'm not sure I'll throw any more monkey wrenches today.  ;)
> > 
> > - Dave
> -- 
> Nigel Cunningham
> Software Engineer, Canberra, Australia
> http://www.cyclades.com
> Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028;  Mob: +61 (417) 100 574
> 
> Maintainer of Suspend2 Kernel Patches http://suspend2.net
> 
> 

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                                                   ` <200503161412.42387.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-16 22:19                                                     ` Pavel Machek
  2005-03-16 22:48                                                     ` Nigel Cunningham
  1 sibling, 0 replies; 31+ messages in thread
From: Pavel Machek @ 2005-03-16 22:19 UTC (permalink / raw)
  To: David Brownell
  Cc: Bernard Blackham, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 2007 bytes --]

Hi!

> > > For the record, I've recently observed that all the swsusp issues
> > > start making sense to me when I start thinking of swsusp as being
> > > completely unrelated to suspend states.  (S4bios aside...)  And if
> > > I don't think of it that way, I keep tripping over complications
> > > where it's fighting against "real" suspend states.
> > > 
> > > The thing is, swsusp in normal usage does not involve system
> > > suspend states like S1/S2/S3, or their analogues in non-ACPI
> > > embedded systems.  Neither does it involve wakeup from those
> > > states ... in fact, it fights against addressing all those.
> > 
> > Both swsusp and suspend2 can enter S4 as their method of powering down,
> > and do use the prepare, enter and finish methods when doing so.
> 
> As I said, S4 aside.  The history I recall is that swsusp came
> out a fair degree of frustration with getting Linux to work
> with the BIOS support ... and lack of firmware init for video,
> etc.  And certainly S4 modes don't seem to be the default, or
> widely used/tested.

Careful, S4 *is* being used/tested. But it is very similar to swsusp
with powerdown. S4bios is *not* used. It is going to die real soon.

> True, not what they do now.  I didn't intend to imply they did.
> A checkpoint package would have key differences, including those.
> 
> That wasn't my point:  that swsusp, in normal usage, is more of a
> checkpoint/restore than a suspend/resume.  Which is why some of what
> it wants is different from suspend/resume.  In particular, since
> the devices are goin to be powered off, the resume paths are VERY
> different.  They are in fact reset paths, not resume paths.

Well, hardware may have been initialized by normal boot. So if drivers
are not modular, it will work okay with suspend/resume paths in
drivers. If drivers are modular, then you are right.

								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                                                   ` <200503161412.42387.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
  2005-03-16 22:19                                                     ` Pavel Machek
@ 2005-03-16 22:48                                                     ` Nigel Cunningham
  1 sibling, 0 replies; 31+ messages in thread
From: Nigel Cunningham @ 2005-03-16 22:48 UTC (permalink / raw)
  To: David Brownell; +Cc: Bernard Blackham, Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 5547 bytes --]

Hi David.

On Thu, 2005-03-17 at 09:12, David Brownell wrote:
> Hi,
> 
> On Wednesday 16 March 2005 1:44 pm, Nigel Cunningham wrote:
> > On Thu, 2005-03-17 at 08:05, David Brownell wrote:
> > > For the record, I've recently observed that all the swsusp issues
> > > start making sense to me when I start thinking of swsusp as being
> > > completely unrelated to suspend states.  (S4bios aside...)  And if
> > > I don't think of it that way, I keep tripping over complications
> > > where it's fighting against "real" suspend states.
> > > 
> > > The thing is, swsusp in normal usage does not involve system
> > > suspend states like S1/S2/S3, or their analogues in non-ACPI
> > > embedded systems.  Neither does it involve wakeup from those
> > > states ... in fact, it fights against addressing all those.
> > 
> > Both swsusp and suspend2 can enter S4 as their method of powering down,
> > and do use the prepare, enter and finish methods when doing so.
> 
> As I said, S4 aside.  The history I recall is that swsusp came
> out a fair degree of frustration with getting Linux to work
> with the BIOS support ... and lack of firmware init for video,
> etc.  And certainly S4 modes don't seem to be the default, or
> widely used/tested.

I think we're confusing things somewhere. You said "S4bios aside", and
that's fine. I wasn't talking about S4 bios support at all. Instead, I
was talking about the support swsusp and suspend2 have for invoking the
ACPI S4 sleep state. (S4 is suspend-to-disk. The problems you mention
sound more like S3 - suspend to ram).

> > > If swsusp were called a system checkpoint/restore mechanism, it'd
> > > have a much clearer relationship to power management:  enabling
> > > system power-off is a useful side effect, but it's not exactly
> > > the point of a checkpoint mechanism.  I suspect that if it were
> > > packaged as halt-after-checkpoint, plus resume-from-checkpoint,
> > > a lot of technical issues would start shaking out better.  Also
> > > maybe some political/funding ones ... checkpointing has much
> > > value for enterprise server operations.
> > 
> > Checkpointing and restoring is very different from what swsusp and
> > suspend2 do. I've been asked a number of times if I could make Suspend2
> > do checkpointing as well as suspending, and it's simply not possible.
> 
> That is, other folk have noticed that it's essentially the same
> problem...

Only from the point of view of quiescing drivers.

> > The reason is that we really are suspending to disk. We're writing the
> > entire contents of memory, and those contents are only valid while the
> > disk is in exactly the state it is in when we suspend. As soon as you
> > change a file on the disk (as you would after checkpointing), the image
> > is invalid and will create corruption if restored. To do checkpointing,
> > we'd at least have to  modify the implementations to be able to reverse
> > on disk changes back to the point where the checkpoint was taken, and
> > probably also add a mechanism for tracking in-memory changes and
> > updating the image to reflect them. It's not impossible, but it's not
> > what swsusp and Suspend2 do now.
> 
> True, not what they do now.  I didn't intend to imply they did.
> A checkpoint package would have key differences, including those.
> 
> That wasn't my point:  that swsusp, in normal usage, is more of a
> checkpoint/restore than a suspend/resume.  Which is why some of what
> it wants is different from suspend/resume.  In particular, since
> the devices are goin to be powered off, the resume paths are VERY
> different.  They are in fact reset paths, not resume paths.

Ok. I agree suspend to disk is by definition akin to a
checkpoint/restore operation. That's simply because of the nature of the
beast - devices get powered down in this state.

Can we nevertheless please still call it suspend/resume? Checkpointing
is a different creature, and we don't want to confuse the issue.

> > I realise you're writing in the context of freezing drivers, and may
> > simply be thinking in terms of the action being the same as would be
> > required if you were checkpointing. From that point of view I'd probably
> > agree. But in the first instance, I'm reacting to you speaking of
> > calling swsusp a system checkpoint/restore mechanism.
> 
> I'm trying to articulate why so much of the swsusp stuff seems (to me)
> to be fighting against "normal" suspend/resume and wakeup mechanisms,
> which can more naturally be built piecemeal from selective suspend.
> (And why there's any pushback on selective suspend, given that it's so
> basic to "normal" suspend/resume/wakeup.)

Not sure I understand everything you're saying here, so can I see if my
understanding matches/helps clarify?

I see the system states as being like a big clamp over the run-time
power management. To enter the PM_FREEZE state, for example, is to force
all the drivers into a lower state of activity & preparedness than they
would otherwise occupy. In normal operation, they'll happily interact
with each other and float between the various run-time states according
to whatever policy, but when PM_FREEZE comes along, they're collectively
forced to a lowest common denominator. Is that akin to the "fighting
again 'normal' suspend/resume and wakeup mechanisms" you speak of?

Regards,

Nigel 
-- 
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028;  Mob: +61 (417) 100 574

Maintainer of Suspend2 Kernel Patches http://suspend2.net


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                                   ` <Pine.LNX.4.44L0.0503161336500.1040-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
@ 2005-03-17  1:10                                     ` Bernard Blackham
       [not found]                                       ` <20050317011013.GD10545-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Bernard Blackham @ 2005-03-17  1:10 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	Linux-pm mailing list, Pavel Machek

[-- Attachment #1: Type: text/plain, Size: 1352 bytes --]

On Wed, Mar 16, 2005 at 01:44:00PM -0500, Alan Stern wrote:
> > Needed an extra patch to compile, attached.
> 
> I didn't get your attachment.

D'oh. For real this time.

> >  However, it still doesn't survive more than one suspend/resume
> >  with devices plugged in. On the second resume, USB is dead, and
> >  it enters the same suspend_hc/wakeup_hc loop. I traced this
> >  back to the hcd never really being resumed, as the pci power
> >  state never changes from D0 as pci_set_power_state is never
> >  called here. The attached patch also addresses this - I'm not
> >  sure if it's the right way, but it works(ish):
> 
> Can _you_ elaborate on "never really being resumed"?  Since the
> state never changes from D0 much of the code in
> usb_hcd_pci_suspend never runs, as you said above.  In fact about
> the only thing that does happen is the call to
> hcd->driver->suspend.  However the reverse call to
> hcd->driver->resume does get made in usb_hcd_pci_resume even
> though the state upon entry is D0.

I meant specifically in the resume path, there's an if statement to
check if the pci power state is > D0, and only acts if that's the
case. Hopefully the first hunk of the patch attached makes this more
clear. None of the code inside that if statement was being called.

Bernard

-- 
 Bernard Blackham <bernard at blackham dot com dot au>

[-- Attachment #2: alan-usb-fixes-2-fixes.diff --]
[-- Type: text/plain, Size: 794 bytes --]

Index: linux/drivers/usb/core/hcd-pci.c
===================================================================
--- linux.orig/drivers/usb/core/hcd-pci.c	2005-03-16 23:49:50.000000000 +0800
+++ linux/drivers/usb/core/hcd-pci.c	2005-03-17 00:28:10.000000000 +0800
@@ -396,7 +396,7 @@
 
 	hcd->state = USB_STATE_RESUMING;
 
-	if (dev->current_state > PCI_D0) {
+	if (dev->dev.power.power_state.event != EVENT_ON) {
 		if (has_pci_pm) {
 #ifdef	CONFIG_USB_SUSPEND
 			pci_enable_wake (dev, dev->current_state, 0);
@@ -411,7 +411,7 @@
 		}
 		dev->dev.power.power_state = PMSG_ON;
 		retval = request_irq (dev->irq, usb_hcd_irq, SA_SHIRQ,
-				hcd->irq_descr, hcd);
+				hcd->driver->description, hcd);
 		if (retval < 0) {
 			dev_err (hcd->self.controller,
 				"can't restore IRQ after resume!\n");

[-- Attachment #3: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Re: uhci-hcd suspend/resume under the new driver model
       [not found]                                       ` <20050317011013.GD10545-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
@ 2005-03-17  3:57                                         ` Alan Stern
  0 siblings, 0 replies; 31+ messages in thread
From: Alan Stern @ 2005-03-17  3:57 UTC (permalink / raw)
  To: Bernard Blackham
  Cc: David Brownell, ncunningham-3EexvZdKGZRWk0Htik3J/w,
	Linux-pm mailing list, Pavel Machek

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1769 bytes --]

On Thu, 17 Mar 2005, Bernard Blackham wrote:

> > I didn't get your attachment.
> 
> D'oh. For real this time.

Of your two changes, it looks like the first one is an unimportant change
to use the new PM types.  The second is to revert something that hasn't
yet migrated out of the USB development tree.  So neither one is really
important.

> > Can _you_ elaborate on "never really being resumed"?  Since the
> > state never changes from D0 much of the code in
> > usb_hcd_pci_suspend never runs, as you said above.  In fact about
> > the only thing that does happen is the call to
> > hcd->driver->suspend.  However the reverse call to
> > hcd->driver->resume does get made in usb_hcd_pci_resume even
> > though the state upon entry is D0.
> 
> I meant specifically in the resume path, there's an if statement to
> check if the pci power state is > D0, and only acts if that's the
> case. Hopefully the first hunk of the patch attached makes this more
> clear. None of the code inside that if statement was being called.

Yes, that's right.  There's a corresponding if statement in the suspend
routine as well, and none of the code inside it was being run either.  In
fact, the stuff inside the resume "if" block undoes the actions taken by
the stuff inside the suspend "if" block.

So I don't think this is related to your mouse not functioning after 
resume.  More likely that was caused by some HID driver not handling the 
suspend/resume sequence correctly.

I tried a similar test on my system, not with a mouse but with a mass 
storage device.  There was no problem about binding the usb-storage driver 
to it after the resume.  This adds to my suspicion that your problem was 
caused by a mouse driver and not by the USB core or UHCI host driver.

Alan Stern


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2005-03-17  3:57 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20050313101453.GA4820@blackham.com.au>
     [not found] ` <20050313101453.GA4820-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
2005-03-13 18:12   ` uhci-hcd suspend/resume under the new driver model Pavel Machek
     [not found]     ` <20050313181225.GC1579-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-13 22:57       ` Benjamin Herrenschmidt
2005-03-13 23:20         ` Pavel Machek
     [not found]           ` <20050313232002.GC22635-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-14  2:23             ` Bernard Blackham
     [not found]               ` <20050314022308.GD6008-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
2005-03-14  4:03                 ` David Brownell
     [not found]                   ` <200503132003.27555.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-14  8:08                     ` Pavel Machek
     [not found]                       ` <20050314080827.GG22635-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-14 18:17                         ` David Brownell
     [not found]                           ` <200503141017.17305.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-14 18:44                             ` Pavel Machek
     [not found]                               ` <20050314184454.GL5461-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-14 18:59                                 ` David Brownell
     [not found]                                   ` <200503141059.26107.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-14 19:30                                     ` Pavel Machek
     [not found]                                       ` <20050314193054.GO5461-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-16 21:05                                         ` David Brownell
     [not found]                                           ` <200503161305.57698.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-16 21:44                                             ` Nigel Cunningham
     [not found]                                               ` <1111009442.3240.28.camel-r49W/1Cwd2ff0s6lnCXPX/uOuaPYTxhvJwvTLr3MMZM@public.gmane.org>
2005-03-16 22:12                                                 ` David Brownell
     [not found]                                                   ` <200503161412.42387.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-16 22:19                                                     ` Pavel Machek
2005-03-16 22:48                                                     ` Nigel Cunningham
2005-03-16 21:57                                             ` Pavel Machek
2005-03-14 22:22                                 ` Benjamin Herrenschmidt
2005-03-14 22:21                             ` Benjamin Herrenschmidt
2005-03-14 22:31                 ` Alan Stern
     [not found]                   ` <Pine.LNX.4.44L0.0503141717240.620-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
2005-03-14 22:38                     ` Nigel Cunningham
2005-03-15 21:48                 ` Alan Stern
     [not found]                   ` <Pine.LNX.4.44L0.0503151636290.711-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
2005-03-15 21:52                     ` Pavel Machek
2005-03-15 22:11                     ` David Brownell
2005-03-16  3:04                     ` Bernard Blackham
     [not found]                       ` <20050316030448.GA8588-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
2005-03-16 15:45                         ` Alan Stern
     [not found]                           ` <Pine.LNX.4.44L0.0503161024360.1040-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
2005-03-16 16:52                             ` Bernard Blackham
     [not found]                               ` <20050316165239.GA10545-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
2005-03-16 18:44                                 ` Alan Stern
     [not found]                                   ` <Pine.LNX.4.44L0.0503161336500.1040-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
2005-03-17  1:10                                     ` Bernard Blackham
     [not found]                                       ` <20050317011013.GD10545-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
2005-03-17  3:57                                         ` Alan Stern
2005-03-16 21:09                             ` David Brownell
     [not found]                               ` <200503161309.34142.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-16 22:10                                 ` Alan Stern

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox