* A reference implementation of PCI suspend/resume?
@ 2005-05-12 1:24 Shaohua Li
2005-05-12 6:21 ` Adam Belay
2005-05-27 6:49 ` Shaohua Li
0 siblings, 2 replies; 20+ messages in thread
From: Shaohua Li @ 2005-05-12 1:24 UTC (permalink / raw)
To: linux-pm
[-- Attachment #1: Type: text/plain, Size: 1299 bytes --]
Hi,
There are still many PCI drivers don't implement their .suspend/.resume
methods correctly. I felt it would be quite helpful there is a reference
implementation. Here is my thought:
.suspend()
{
driver specific operations;
pci_save_state();
pci_enable_wake();
/* as a result, PIC/IOAPIC pin is disabled */
free_irq();
/* as a result, bus master/irq router are disabled */
pci_disable_device();
pci_set_power_state();
}
.resume()
{
pci_set_power_state(PCI_D0);
pci_restore_state();
/* device's irq possibly is changed, driver should take care */
pci_enable_device();
pci_set_master();
request_irq();
driver specific operations;
}
Currently many drivers don't call
free_irq/request_irq/pci_disable_device, which makes unexpected
interrupt occur and even break suspend/resume in some systems. The
reason is we disable PIC/IOAPIC/irq router very later (they are treated
as a sysdev), so there is a window between when a device is suspend and
when its PIC/IOAPIC/irq router pin are disabled. The ideal way is
PIC/IOAPIC/irq router's pins are disabled soon after no device is
referencing them. I'm now working on disabling irq router if no
reference on it, but first I'd like to know your opinions about the
proposal (the reference .suspend/.resume implementation).
Thanks,
Shaohua
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: A reference implementation of PCI suspend/resume?
2005-05-12 1:24 A reference implementation of PCI suspend/resume? Shaohua Li
@ 2005-05-12 6:21 ` Adam Belay
2005-05-12 7:03 ` Shaohua Li
2005-05-27 6:49 ` Shaohua Li
1 sibling, 1 reply; 20+ messages in thread
From: Adam Belay @ 2005-05-12 6:21 UTC (permalink / raw)
To: Shaohua Li; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 2659 bytes --]
On Thu, May 12, 2005 at 09:24:47AM +0800, Shaohua Li wrote:
> Hi,
> There are still many PCI drivers don't implement their .suspend/.resume
> methods correctly. I felt it would be quite helpful there is a reference
> implementation. Here is my thought:
>
> .suspend()
> {
> driver specific operations;
> pci_save_state();
> pci_enable_wake();
> /* as a result, PIC/IOAPIC pin is disabled */
> free_irq();
We can do this earlier, under driver specific operations, right?
> /* as a result, bus master/irq router are disabled */
> pci_disable_device();
> pci_set_power_state();
> }
>
> .resume()
> {
> pci_set_power_state(PCI_D0);
> pci_restore_state();
> /* device's irq possibly is changed, driver should take care */
So what do you have in mind here? Could/should the device's resource
configuration also possibly change.
> pci_enable_device();
> pci_set_master();
> request_irq();
> driver specific operations;
> }
MSI support may require something here too, I need to look into it.
Also, we should disable the irq bit in the PCI configuration header, I'm not
sure if pci_disable_device() does this now.
Finally, I'm not sure if pci_save_state() and pci_restore_state() should be
part of the default procedure. I'd rather see drivers restoring the specific
configuration areas they need.
>
> Currently many drivers don't call
> free_irq/request_irq/pci_disable_device, which makes unexpected
> interrupt occur and even break suspend/resume in some systems. The
> reason is we disable PIC/IOAPIC/irq router very later (they are treated
> as a sysdev), so there is a window between when a device is suspend and
> when its PIC/IOAPIC/irq router pin are disabled. The ideal way is
> PIC/IOAPIC/irq router's pins are disabled soon after no device is
> referencing them. I'm now working on disabling irq router if no
> reference on it, but first I'd like to know your opinions about the
> proposal (the reference .suspend/.resume implementation).
>
> Thanks,
> Shaohua
So if we find that over 99% of PCI device drivers need to do exactly this
sequence, then why not just have the ->suspend and ->resume wrappers do it
automatically?
And let's say there's a good argument or it's better coding style to allow
drivers do it on thier own. We still have two problems:
a.) many drivers will get it wrong
b.) if we want to change this sequence, every driver must be changed
But say we added the following functions:
pci_suspend_power(PCI-[D1-D3]);
pci_restore_power();
These could do exactly what we have above, and the majority of PCI drivers
could utilize them. I think this would simplify things.
Just some thoughts.
Thanks,
Adam
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: A reference implementation of PCI suspend/resume?
2005-05-12 6:21 ` Adam Belay
@ 2005-05-12 7:03 ` Shaohua Li
0 siblings, 0 replies; 20+ messages in thread
From: Shaohua Li @ 2005-05-12 7:03 UTC (permalink / raw)
To: Adam Belay; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 3666 bytes --]
On Thu, 2005-05-12 at 14:21 +0800, Adam Belay wrote:
> On Thu, May 12, 2005 at 09:24:47AM +0800, Shaohua Li wrote:
> > Hi,
> > There are still many PCI drivers don't implement
> their .suspend/.resume
> > methods correctly. I felt it would be quite helpful there is a
> reference
> > implementation. Here is my thought:
> >
> > .suspend()
> > {
> > driver specific operations;
> > pci_save_state();
> > pci_enable_wake();
> > /* as a result, PIC/IOAPIC pin is disabled */
> > free_irq();
>
> We can do this earlier, under driver specific operations, right?
Right, I just list some operations. Some operations are optional and
some order can be slightly adjusted.
>
> > /* as a result, bus master/irq router are disabled */
> > pci_disable_device();
> > pci_set_power_state();
> > }
> >
> > .resume()
> > {
> > pci_set_power_state(PCI_D0);
> > pci_restore_state();
> > /* device's irq possibly is changed, driver should take care
> */
>
> So what do you have in mind here? Could/should the device's resource
> configuration also possibly change.
Resources balance need it. For suspend/resume, it's unnecessary
(pci_restore_state will restore the resource configuration). But we must
clean irq to avoid unexpected interrupt issue. Currently, I'd like we
temporarily ignore resources balance.
> > pci_enable_device();
> > pci_set_master();
> > request_irq();
> > driver specific operations;
> > }
>
> MSI support may require something here too, I need to look into it.
>
> Also, we should disable the irq bit in the PCI configuration header,
> I'm not
> sure if pci_disable_device() does this now.
Disabling the INTX for MSI? pci_enable_msi isn't called at
pci_enable_device, so pci_disable_msi should be the same. Yes,
pci_enable_msi/pci_disable_msi should be added in the suspend/resume
methods.
>
> Finally, I'm not sure if pci_save_state() and pci_restore_state()
> should be
> part of the default procedure. I'd rather see drivers restoring the
> specific
> configuration areas they need.
I think pci_save_state and pci_store_state are mandatory and drivers can
do extra config save/restore.
> >
> > Currently many drivers don't call
> > free_irq/request_irq/pci_disable_device, which makes unexpected
> > interrupt occur and even break suspend/resume in some systems. The
> > reason is we disable PIC/IOAPIC/irq router very later (they are
> treated
> > as a sysdev), so there is a window between when a device is suspend
> and
> > when its PIC/IOAPIC/irq router pin are disabled. The ideal way is
> > PIC/IOAPIC/irq router's pins are disabled soon after no device is
> > referencing them. I'm now working on disabling irq router if no
> > reference on it, but first I'd like to know your opinions about the
> > proposal (the reference .suspend/.resume implementation).
> So if we find that over 99% of PCI device drivers need to do exactly
> this
> sequence, then why not just have the ->suspend and ->resume wrappers
> do it
> automatically?
>
> And let's say there's a good argument or it's better coding style to
> allow
> drivers do it on thier own. We still have two problems:
> a.) many drivers will get it wrong
> b.) if we want to change this sequence, every driver must be changed
>
> But say we added the following functions:
> pci_suspend_power(PCI-[D1-D3]);
> pci_restore_power();
>
> These could do exactly what we have above, and the majority of PCI
> drivers
> could utilize them. I think this would simplify things.
It's ok. The question is how many drivers can use it.
Thanks,
Shaohua
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: A reference implementation of PCI suspend/resume?
2005-05-12 1:24 A reference implementation of PCI suspend/resume? Shaohua Li
2005-05-12 6:21 ` Adam Belay
@ 2005-05-27 6:49 ` Shaohua Li
2005-05-27 7:20 ` Pavel Machek
1 sibling, 1 reply; 20+ messages in thread
From: Shaohua Li @ 2005-05-27 6:49 UTC (permalink / raw)
To: linux-pm; +Cc: Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 2226 bytes --]
>
> Currently many drivers don't call
> free_irq/request_irq/pci_disable_device, which makes unexpected
> interrupt occur and even break suspend/resume in some systems. The
> reason is we disable PIC/IOAPIC/irq router very later (they are
> treated
> as a sysdev), so there is a window between when a device is suspend
> and
> when its PIC/IOAPIC/irq router pin are disabled. The ideal way is
> PIC/IOAPIC/irq router's pins are disabled soon after no device is
> referencing them. I'm now working on disabling irq router if no
> reference on it, but first I'd like to know your opinions about the
> proposal (the reference .suspend/.resume implementation).
Hi Pavel,
Could this be put into your tree? We should fix many PCI drivers, this
possibly will help us.
Thanks,
Shaohua
---
linux-2.6.11-rc5-mm1-root/Documentation/power/pci.txt | 38 ++++++++++++++++++
1 files changed, 38 insertions(+)
diff -puN Documentation/power/pci.txt~int-doc Documentation/power/pci.txt
--- linux-2.6.11-rc5-mm1/Documentation/power/pci.txt~int-doc 2005-05-27 10:43:51.893015912 +0800
+++ linux-2.6.11-rc5-mm1-root/Documentation/power/pci.txt 2005-05-27 14:26:19.492872304 +0800
@@ -291,6 +291,44 @@ a request to enable wake events from D3,
pci_enable_wake (one for both D3hot and D3cold).
+A reference implementation
+-------------------------
+.suspend()
+{
+ /* driver specific operations */
+
+ /* Disable IRQ */
+ free_irq();
+ /* If using MSI */
+ pci_disable_msi();
+
+ pci_save_state();
+ pci_enable_wake();
+ /* Disable IO/bus master/irq router */
+ pci_disable_device();
+ pci_set_power_state(pci_choose_state());
+}
+
+.resume()
+{
+ pci_set_power_state(PCI_D0);
+ pci_restore_state();
+ /* device's irq possibly is changed, driver should take care */
+ pci_enable_device();
+ pci_set_master();
+
+ /* if using MSI, device's vector possibly is changed */
+ pci_enable_msi();
+
+ request_irq();
+ /* driver specific operations; */
+}
+
+This is a typical implementation. Drivers can slightly change the order
+of the operations in the implementation, ignore some operations or add
+more deriver specific operations in it, but drivers should do something like
+this on the whole.
+
5. Resources
~~~~~~~~~~~~
_
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: A reference implementation of PCI suspend/resume?
2005-05-27 6:49 ` Shaohua Li
@ 2005-05-27 7:20 ` Pavel Machek
0 siblings, 0 replies; 20+ messages in thread
From: Pavel Machek @ 2005-05-27 7:20 UTC (permalink / raw)
To: Shaohua Li; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 835 bytes --]
Hi!
> > Currently many drivers don't call
> > free_irq/request_irq/pci_disable_device, which makes unexpected
> > interrupt occur and even break suspend/resume in some systems. The
> > reason is we disable PIC/IOAPIC/irq router very later (they are
> > treated
> > as a sysdev), so there is a window between when a device is suspend
> > and
> > when its PIC/IOAPIC/irq router pin are disabled. The ideal way is
> > PIC/IOAPIC/irq router's pins are disabled soon after no device is
> > referencing them. I'm now working on disabling irq router if no
> > reference on it, but first I'd like to know your opinions about the
> > proposal (the reference .suspend/.resume implementation).
> Hi Pavel,
> Could this be put into your tree? We should fix many PCI drivers, this
> possibly will help us.
Thanks, applied.
Pavel
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: A reference implementation of PCI suspend/resume?
@ 2005-05-14 5:06 Fabrice Gautier
2005-05-16 4:53 ` David Brownell
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Fabrice Gautier @ 2005-05-14 5:06 UTC (permalink / raw)
To: Shaohua Li, linux-pm
[-- Attachment #1: Type: text/plain, Size: 1303 bytes --]
Hi,
> From: Shaohua Li [mailto:shaohua.li@intel.com]
> Sent: Wednesday, May 11, 2005 6:25 PM
> To: linux-pm
> Subject: [linux-pm] A reference implementation of PCI suspend/resume?
>
> Hi,
> There are still many PCI drivers don't implement their
> .suspend/.resume
> methods correctly. I felt it would be quite helpful there is
> a reference
> implementation. Here is my thought:
>
> [...snip...]
>
> Currently many drivers don't call
> free_irq/request_irq/pci_disable_device, which makes unexpected
> interrupt occur and even break suspend/resume in some systems.
I don't understand why not calling free_irq would cause unexpected
interrupts.
The way I understand the PCI spec is: if you suspend a PCI device, its IRQ
should be disabled. If an IRQ come from this device anyway (in violation of
the spec) then the driver for this device should handle it so you should NOT
unregister its irq_handler.
(How does such a device would handle such a irq is another matter - a device
specific one I guess)
I'm not understanding where your unexpected interrupt is coming from.
Care to elaborate?
Btw: I'm suspending PCI devices not in the context of a full suspend
situation, maybe that's why I'm missing your point. But your sample
implementation should work in both cases.
Thanks
-- Fabrice
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: A reference implementation of PCI suspend/resume?
2005-05-14 5:06 Fabrice Gautier
@ 2005-05-16 4:53 ` David Brownell
2005-05-16 8:55 ` Shaohua Li
2005-05-16 20:16 ` Adam Belay
2 siblings, 0 replies; 20+ messages in thread
From: David Brownell @ 2005-05-16 4:53 UTC (permalink / raw)
To: linux-pm; +Cc: Fabrice Gautier
[-- Attachment #1: Type: text/plain, Size: 570 bytes --]
On Friday 13 May 2005 10:06 pm, Fabrice Gautier wrote:
> The way I understand the PCI spec is: if you suspend a PCI device, its IRQ
> should be disabled. If an IRQ come from this device anyway (in violation of
> the spec) then the driver for this device should handle it so you should NOT
> unregister its irq_handler.
There's PCI ... and PCI with PM capabilities. That rule about only
issuing PME transactions while suspended can only apply in the latter
case. That is, not all PCI devices support the PCI PM rules; drivers
may need to help the hardware.
- Dave
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: A reference implementation of PCI suspend/resume?
2005-05-14 5:06 Fabrice Gautier
2005-05-16 4:53 ` David Brownell
@ 2005-05-16 8:55 ` Shaohua Li
2005-05-16 20:16 ` Adam Belay
2 siblings, 0 replies; 20+ messages in thread
From: Shaohua Li @ 2005-05-16 8:55 UTC (permalink / raw)
To: Fabrice Gautier; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 1435 bytes --]
On Fri, 2005-05-13 at 22:06 -0700, Fabrice Gautier wrote:
> > There are still many PCI drivers don't implement their
> > .suspend/.resume
> > methods correctly. I felt it would be quite helpful there is
> > a reference
> > implementation. Here is my thought:
> >
> > [...snip...]
> >
> > Currently many drivers don't call
> > free_irq/request_irq/pci_disable_device, which makes unexpected
> > interrupt occur and even break suspend/resume in some systems.
>
> I don't understand why not calling free_irq would cause unexpected
> interrupts.
>
> The way I understand the PCI spec is: if you suspend a PCI device, its IRQ
> should be disabled. If an IRQ come from this device anyway (in violation of
> the spec) then the driver for this device should handle it so you should NOT
> unregister its irq_handler.
> (How does such a device would handle such a irq is another matter - a device
> specific one I guess)
>
> I'm not understanding where your unexpected interrupt is coming from.
>
> Care to elaborate?
I actually don't know the exact reason but we did notice disabling a
device isn't sufficient (an 'irq nobody card' is reported after a device
is disabled). I think we should not trust device.
Another point is resume is just like boot. At boot time, PIC/IOAPIC/irq
router are disabled by default. They are enabled only when a device
really starts using them. suspend/resume should be the same manner.
Thanks,
Shaohua
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: A reference implementation of PCI suspend/resume?
2005-05-14 5:06 Fabrice Gautier
2005-05-16 4:53 ` David Brownell
2005-05-16 8:55 ` Shaohua Li
@ 2005-05-16 20:16 ` Adam Belay
2005-05-16 20:56 ` Rafael J. Wysocki
2005-05-16 21:19 ` Jordan Crouse
2 siblings, 2 replies; 20+ messages in thread
From: Adam Belay @ 2005-05-16 20:16 UTC (permalink / raw)
To: Fabrice Gautier; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 2626 bytes --]
On Fri, May 13, 2005 at 10:06:42PM -0700, Fabrice Gautier wrote:
> Hi,
>
> > From: Shaohua Li [mailto:shaohua.li@intel.com]
> > Sent: Wednesday, May 11, 2005 6:25 PM
> > To: linux-pm
> > Subject: [linux-pm] A reference implementation of PCI suspend/resume?
> >
> > Hi,
> > There are still many PCI drivers don't implement their
> > .suspend/.resume
> > methods correctly. I felt it would be quite helpful there is
> > a reference
> > implementation. Here is my thought:
> >
> > [...snip...]
> >
> > Currently many drivers don't call
> > free_irq/request_irq/pci_disable_device, which makes unexpected
> > interrupt occur and even break suspend/resume in some systems.
>
> I don't understand why not calling free_irq would cause unexpected
> interrupts.
I don't think it would. However, there are three reasons in favor of
unregister PCI interrupts:
a.) the kernel will complain if the irq isn't handled, revealing a possible
problem with the driver's suspend routine.
b.) You're assuming every device has its own interrupt. This may not be the
case. Let's say one device was sharing interrupts with a few other devices.
We don't want it's interrupt handler to mess up things when it tries to read
hardware registers from a physically "off" device, as it may possibly generate
errors. Almost every interrupt handler assumes the device is "on", as it
should. Each interrupt handler has to ask its hardware if it generated the
interrupt.
c.) Selective device suspending: obviously we don't want to leave stale
interrupt handlers around.
>
> The way I understand the PCI spec is: if you suspend a PCI device, its IRQ
> should be disabled. If an IRQ come from this device anyway (in violation of
> the spec) then the driver for this device should handle it so you should NOT
> unregister its irq_handler.
This would be a quirk.
> (How does such a device would handle such a irq is another matter - a device
> specific one I guess)
It would need a special case for handling suspend time interrupts. This
would vary between situation and brokeness.
>
> I'm not understanding where your unexpected interrupt is coming from.
>
> Care to elaborate?
Like I said, another device on the same link is one possible problem.
>
> Btw: I'm suspending PCI devices not in the context of a full suspend
> situation, maybe that's why I'm missing your point. But your sample
> implementation should work in both cases.
Yes, although I think we need to specify the situation to the driver in case
it needs to do something special, in general, the full and partial device
suspends should be very similar.
Thanks,
Adam
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: A reference implementation of PCI suspend/resume?
2005-05-16 20:16 ` Adam Belay
@ 2005-05-16 20:56 ` Rafael J. Wysocki
2005-05-16 21:19 ` Jordan Crouse
1 sibling, 0 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2005-05-16 20:56 UTC (permalink / raw)
To: linux-pm; +Cc: Fabrice Gautier
[-- Attachment #1: Type: text/plain, Size: 2025 bytes --]
Hi,
On Monday, 16 of May 2005 22:16, Adam Belay wrote:
> On Fri, May 13, 2005 at 10:06:42PM -0700, Fabrice Gautier wrote:
> > Hi,
> >
> > > From: Shaohua Li [mailto:shaohua.li@intel.com]
> > > Sent: Wednesday, May 11, 2005 6:25 PM
> > > To: linux-pm
> > > Subject: [linux-pm] A reference implementation of PCI suspend/resume?
> > >
> > > Hi,
> > > There are still many PCI drivers don't implement their
> > > .suspend/.resume
> > > methods correctly. I felt it would be quite helpful there is
> > > a reference
> > > implementation. Here is my thought:
> > >
> > > [...snip...]
> > >
> > > Currently many drivers don't call
> > > free_irq/request_irq/pci_disable_device, which makes unexpected
> > > interrupt occur and even break suspend/resume in some systems.
> >
> > I don't understand why not calling free_irq would cause unexpected
> > interrupts.
>
> I don't think it would.
Please have a look at http://bugzilla.kernel.org/show_bug.cgi?id=4416
Perhaps you can figure out what else is going on there ...
> However, there are three reasons in favor of unregister PCI interrupts:
>
> a.) the kernel will complain if the irq isn't handled, revealing a possible
> problem with the driver's suspend routine.
>
> b.) You're assuming every device has its own interrupt. This may not be the
> case. Let's say one device was sharing interrupts with a few other devices.
> We don't want it's interrupt handler to mess up things when it tries to read
> hardware registers from a physically "off" device, as it may possibly generate
> errors. Almost every interrupt handler assumes the device is "on", as it
> should. Each interrupt handler has to ask its hardware if it generated the
> interrupt.
>
> c.) Selective device suspending: obviously we don't want to leave stale
> interrupt handlers around.
I agree.
Greets,
Rafael
--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: A reference implementation of PCI suspend/resume?
2005-05-16 20:16 ` Adam Belay
2005-05-16 20:56 ` Rafael J. Wysocki
@ 2005-05-16 21:19 ` Jordan Crouse
2005-05-17 0:26 ` David Brownell
1 sibling, 1 reply; 20+ messages in thread
From: Jordan Crouse @ 2005-05-16 21:19 UTC (permalink / raw)
To: linux-pm
[-- Attachment #1: Type: text/plain, Size: 1616 bytes --]
On Mon, 16 May 2005 16:16:06 -0400
"Adam Belay" <ambx1@neo.rr.com> wrote:
> b.) You're assuming every device has its own interrupt. This may not
> be the case. Let's say one device was sharing interrupts with a few
> other devices. We don't want it's interrupt handler to mess up things
> when it tries to read hardware registers from a physically "off"
> device, as it may possibly generate errors. Almost every interrupt
> handler assumes the device is "on", as it should. Each interrupt
> handler has to ask its hardware if it generated the interrupt.
I ran into this very problem a while back on 2.4. On my platform, the
EHCI and OCHI host controllers share an interrupt. I was insmoding the
EHCI module before the OHCI module, but it turned out that the OHCI
controller was first on the bus, so it went down first during suspend,
and up first on resume. Long story short, it starting firing interrupts
which went to the EHCI driver first (since it was earlier on the
interrupt handler chain), and the device faithfully hung waiting to read
hardware that was still powered off.
Now, granted that particular scenario is a corner case, and the PCI
hardware should have been configured better, but it shows that you
can't assume that any given interrupt handler will only be called with
active hardware underneath.
So either an "active" bit gets added to each and every driver to be
checked in an interrupt routine, or we bail on the handler all together
- I fancy the second.
Jordan
--
Jordan Crouse
Senior Linux Engineer
AMD - Personal Connectivity Solutions Group
<www.amd.com/embeddedprocessors>
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: A reference implementation of PCI suspend/resume?
2005-05-16 21:19 ` Jordan Crouse
@ 2005-05-17 0:26 ` David Brownell
2005-05-17 9:10 ` Rafael J. Wysocki
0 siblings, 1 reply; 20+ messages in thread
From: David Brownell @ 2005-05-17 0:26 UTC (permalink / raw)
To: linux-pm
[-- Attachment #1: Type: text/plain, Size: 1239 bytes --]
On Monday 16 May 2005 2:19 pm, Jordan Crouse wrote:
> On Mon, 16 May 2005 16:16:06 -0400
> "Adam Belay" <ambx1@neo.rr.com> wrote:
>
> > b.) You're assuming every device has its own interrupt. This may not
> > be the case. Let's say one device was sharing interrupts with a few
> > other devices...
>
> I ran into this very problem a while back on 2.4. On my platform, the
> EHCI and OCHI host controllers share an interrupt. ...
>
> Now, granted that particular scenario is a corner case, and the PCI
> hardware should have been configured better, but it shows that you
> can't assume that any given interrupt handler will only be called with
> active hardware underneath.
Very true, and that's part of why drivers/usb/core/hcd-pci.c frees
IRQs during suspend, and reclaims them later.
AFAICT that code is the most generic PCI suspend/resume code now
available ... since it's got to work with dozens of different
implementations of EHCI, OHCI, and UHCI; with widely varying
BIOS versions and bugs; and where not all controllers support
the PCI PM extensions.
Note the wierdnesses there ... cases where the PCI API calls are
wierdly asymmetric, and where "swsusp" can really bork things
by entering bizarre "resume" paths.
- Dave
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: A reference implementation of PCI suspend/resume?
2005-05-17 0:26 ` David Brownell
@ 2005-05-17 9:10 ` Rafael J. Wysocki
2005-05-17 18:35 ` David Brownell
0 siblings, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2005-05-17 9:10 UTC (permalink / raw)
To: linux-pm; +Cc: David Brownell
[-- Attachment #1: Type: text/plain, Size: 1741 bytes --]
Hi,
On Tuesday, 17 of May 2005 02:26, David Brownell wrote:
> On Monday 16 May 2005 2:19 pm, Jordan Crouse wrote:
> > On Mon, 16 May 2005 16:16:06 -0400
> > "Adam Belay" <ambx1@neo.rr.com> wrote:
> >
> > > b.) You're assuming every device has its own interrupt. This may not
> > > be the case. Let's say one device was sharing interrupts with a few
> > > other devices...
> >
> > I ran into this very problem a while back on 2.4. On my platform, the
> > EHCI and OCHI host controllers share an interrupt. ...
> >
> > Now, granted that particular scenario is a corner case, and the PCI
> > hardware should have been configured better, but it shows that you
> > can't assume that any given interrupt handler will only be called with
> > active hardware underneath.
>
> Very true, and that's part of why drivers/usb/core/hcd-pci.c frees
> IRQs during suspend, and reclaims them later.
>
> AFAICT that code is the most generic PCI suspend/resume code now
> available ... since it's got to work with dozens of different
> implementations of EHCI, OHCI, and UHCI; with widely varying
> BIOS versions and bugs; and where not all controllers support
> the PCI PM extensions.
>
> Note the wierdnesses there ... cases where the PCI API calls are
> wierdly asymmetric, and where "swsusp" can really bork things
> by entering bizarre "resume" paths.
Could you please have a look at http://bugzilla.kernel.org/show_bug.cgi?id=4416
There seems to be a problem with OHCI and EHCI drivers wrt suspend/resume
(swsusp) in the current -mm on that box.
Greets,
Rafael
--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: A reference implementation of PCI suspend/resume?
2005-05-17 9:10 ` Rafael J. Wysocki
@ 2005-05-17 18:35 ` David Brownell
2005-05-17 19:28 ` Rafael J. Wysocki
0 siblings, 1 reply; 20+ messages in thread
From: David Brownell @ 2005-05-17 18:35 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 524 bytes --]
On Tuesday 17 May 2005 2:10 am, Rafael J. Wysocki wrote:
> Could you please have a look at http://bugzilla.kernel.org/show_bug.cgi?id=4416
Sorry, I don't have much time for such things; especially for
problems that only appear in the MM tree (with so many non-USB
changes) ... and which already appear to be board-specific, and
tied to ACPI changes.
> There seems to be a problem with OHCI and EHCI drivers wrt suspend/resume
> (swsusp) in the current -mm on that box.
Is that with or without those ACPI fixes?
- Dave
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: A reference implementation of PCI suspend/resume?
2005-05-17 18:35 ` David Brownell
@ 2005-05-17 19:28 ` Rafael J. Wysocki
2005-05-18 1:26 ` Adam Belay
0 siblings, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2005-05-17 19:28 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm
On Tuesday, 17 of May 2005 20:35, David Brownell wrote:
> On Tuesday 17 May 2005 2:10 am, Rafael J. Wysocki wrote:
>
> > Could you please have a look at http://bugzilla.kernel.org/show_bug.cgi?id=4416
>
> Sorry, I don't have much time for such things; especially for
> problems that only appear in the MM tree (with so many non-USB
> changes) ... and which already appear to be board-specific, and
> tied to ACPI changes.
Never mind. :-)
> > There seems to be a problem with OHCI and EHCI drivers wrt suspend/resume
> > (swsusp) in the current -mm on that box.
>
> Is that with or without those ACPI fixes?
Both ...
Rafael
--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: A reference implementation of PCI suspend/resume?
2005-05-17 19:28 ` Rafael J. Wysocki
@ 2005-05-18 1:26 ` Adam Belay
2005-05-18 17:40 ` Rafael J. Wysocki
0 siblings, 1 reply; 20+ messages in thread
From: Adam Belay @ 2005-05-18 1:26 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: David Brownell, linux-pm
[-- Attachment #1: Type: text/plain, Size: 1070 bytes --]
On Tue, May 17, 2005 at 09:28:10PM +0200, Rafael J. Wysocki wrote:
> On Tuesday, 17 of May 2005 20:35, David Brownell wrote:
> > On Tuesday 17 May 2005 2:10 am, Rafael J. Wysocki wrote:
> >
> > > Could you please have a look at http://bugzilla.kernel.org/show_bug.cgi?id=4416
> >
> > Sorry, I don't have much time for such things; especially for
> > problems that only appear in the MM tree (with so many non-USB
> > changes) ... and which already appear to be board-specific, and
> > tied to ACPI changes.
>
> Never mind. :-)
>
> > > There seems to be a problem with OHCI and EHCI drivers wrt suspend/resume
> > > (swsusp) in the current -mm on that box.
> >
> > Is that with or without those ACPI fixes?
>
> Both ...
>
> Rafael
>
Have you checked which of these device drivers are calling free_irq() in
their suspend routines. Also could I see /proc/interrupts (with an offending
combination of drivers loaded) and a list of the hardware you think might be
involved. Have you been able to find a pair of drivers that will cause this
problem?
Thanks,
Adam
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: A reference implementation of PCI suspend/resume?
2005-05-18 1:26 ` Adam Belay
@ 2005-05-18 17:40 ` Rafael J. Wysocki
2005-05-18 18:32 ` Alan Stern
0 siblings, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2005-05-18 17:40 UTC (permalink / raw)
To: Adam Belay; +Cc: David Brownell, linux-pm
Hi,
On Wednesday, 18 of May 2005 03:26, Adam Belay wrote:
> On Tue, May 17, 2005 at 09:28:10PM +0200, Rafael J. Wysocki wrote:
> > On Tuesday, 17 of May 2005 20:35, David Brownell wrote:
> > > On Tuesday 17 May 2005 2:10 am, Rafael J. Wysocki wrote:
> > >
> > > > Could you please have a look at http://bugzilla.kernel.org/show_bug.cgi?id=4416
> > >
> > > Sorry, I don't have much time for such things; especially for
> > > problems that only appear in the MM tree (with so many non-USB
> > > changes) ... and which already appear to be board-specific, and
> > > tied to ACPI changes.
> >
> > Never mind. :-)
> >
> > > > There seems to be a problem with OHCI and EHCI drivers wrt suspend/resume
> > > > (swsusp) in the current -mm on that box.
> > >
> > > Is that with or without those ACPI fixes?
> >
> > Both ...
> >
> > Rafael
> >
>
> Have you checked which of these device drivers are calling free_irq() in
> their suspend routines.
AFAICS, the EHCI and OHCI drivers do, but the snd_intel8x0 driver does not.
Anyway this does not seem to be relevant, as the box hangs when the
drivers' resume routines are called (ie on resume-during-resume).
All of the drivers that cause the problem are compiled as modules, so their
suspend routines are not called during swsusp resume.
> Also could I see /proc/interrupts (with an offending
> combination of drivers loaded) and a list of the hardware you think might be
> involved.
The offending devices are (from lspci):
0000:00:02.0 USB Controller: nVidia Corporation nForce3 USB 1.1 (rev a5)
0000:00:02.1 USB Controller: nVidia Corporation nForce3 USB 1.1 (rev a5)
0000:00:02.2 USB Controller: nVidia Corporation nForce3 USB 2.0 (rev a2)
0000:00:06.0 Multimedia audio controller: nVidia Corporation nForce3 Audio (rev a2)
IOW, it's the nForce3 chipset.
Also, there are multiple configurations in which the problem appears (please
see below).
> Have you been able to find a pair of drivers that will cause this
> problem?
The drivers that cause this problem (on this particular box) are:
ehci_hcd
ohci_hcd
snd_intel8x0
If any of them is loaded on suspend, the box hangs solid during resume (please
note that I was unable to reproduce the problem on a different box, using the
same drivers with the AMD chipset and the NEC USB 2.0).
Here go various /proc/interrupts that correspond to the "hanging" driver
configurations:
1) whith snd_intel8x0
CPU0
0: 1145893 XT-PIC timer
1: 1187 XT-PIC i8042
2: 0 XT-PIC cascade
8: 0 XT-PIC rtc
9: 428 XT-PIC acpi
10: 4898 XT-PIC NVidia nForce3
11: 0 XT-PIC yenta, yenta
12: 119 XT-PIC i8042
14: 9787 XT-PIC ide0
15: 7294 XT-PIC ide1
NMI: 0
LOC: 1145536
ERR: 0
MIS: 0
2) with EHCI
CPU0
0: 146184 XT-PIC timer
1: 243 XT-PIC i8042
2: 0 XT-PIC cascade
5: 5 XT-PIC ehci_hcd:usb1
8: 0 XT-PIC rtc
9: 76 XT-PIC acpi
11: 0 XT-PIC yenta, yenta
12: 119 XT-PIC i8042
14: 759 XT-PIC ide0
15: 7240 XT-PIC ide1
NMI: 0
LOC: 146005
ERR: 0
MIS: 0
3) with OHCI
CPU0
0: 152128 XT-PIC timer
1: 205 XT-PIC i8042
2: 0 XT-PIC cascade
8: 0 XT-PIC rtc
9: 77 XT-PIC acpi
11: 25 XT-PIC yenta, yenta, ohci_hcd:usb1, ohci_hcd:usb2
12: 119 XT-PIC i8042
14: 813 XT-PIC ide0
15: 7310 XT-PIC ide1
NMI: 0
LOC: 151961
ERR: 0
MIS: 0
Of course any combination of the three causes the problem.
Greets,
Rafael
--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: A reference implementation of PCI suspend/resume?
2005-05-18 17:40 ` Rafael J. Wysocki
@ 2005-05-18 18:32 ` Alan Stern
0 siblings, 0 replies; 20+ messages in thread
From: Alan Stern @ 2005-05-18 18:32 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 672 bytes --]
On Wed, 18 May 2005, Rafael J. Wysocki wrote:
> AFAICS, the EHCI and OHCI drivers do, but the snd_intel8x0 driver does not.
> Anyway this does not seem to be relevant, as the box hangs when the
> drivers' resume routines are called (ie on resume-during-resume).
>
> All of the drivers that cause the problem are compiled as modules, so their
> suspend routines are not called during swsusp resume.
This looks more like a IRQ-routing kind of problem and less like something
specifically to do with resume. Here's a bug report that superficially
resembles what you've got, although this one comes up during boot:
http://bugme.osdl.org/show_bug.cgi?id=4276
Alan Stern
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: A reference implementation of PCI suspend/resume?
@ 2005-05-16 19:02 Fabrice Gautier
2005-05-16 19:39 ` David Brownell
0 siblings, 1 reply; 20+ messages in thread
From: Fabrice Gautier @ 2005-05-16 19:02 UTC (permalink / raw)
To: David Brownell, linux-pm; +Cc: Fabrice Gautier
[-- Attachment #1: Type: text/plain, Size: 1234 bytes --]
> From: David Brownell [mailto:david-b@pacbell.net]
> Sent: Sunday, May 15, 2005 9:54 PM
> To: linux-pm@lists.osdl.org
> Cc: Fabrice Gautier; Shaohua Li
> Subject: Re: [linux-pm] A reference implementation of PCI
> suspend/resume?
>
> On Friday 13 May 2005 10:06 pm, Fabrice Gautier wrote:
>
> > The way I understand the PCI spec is: if you suspend a PCI
> > device, its IRQ should be disabled. If an IRQ come from
> > this device anyway (in violation of
> > the spec) then the driver for this device should handle it
> > so you should NOT unregister its irq_handler.
>
> There's PCI ... and PCI with PM capabilities. That rule about only
> issuing PME transactions while suspended can only apply in the latter
> case. That is, not all PCI devices support the PCI PM rules; drivers
> may need to help the hardware.
You mean, you support some kind of power management on PCI devices without
PM capabilities ? How are you even sure that the pci_set_power_state will
work in those kind of devices.
In any case, anyway it works I would say this is very much device specific
and not really part of a reference implementation. This is supposed to be a
reference implementation for "PCI PM compliant" devices right ?
-- Fabrice
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: A reference implementation of PCI suspend/resume?
2005-05-16 19:02 Fabrice Gautier
@ 2005-05-16 19:39 ` David Brownell
0 siblings, 0 replies; 20+ messages in thread
From: David Brownell @ 2005-05-16 19:39 UTC (permalink / raw)
To: Fabrice Gautier; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 1526 bytes --]
On Monday 16 May 2005 12:02 pm, Fabrice Gautier wrote:
> >
> > There's PCI ... and PCI with PM capabilities. That rule about only
> > issuing PME transactions while suspended can only apply in the latter
> > case. That is, not all PCI devices support the PCI PM rules; drivers
> > may need to help the hardware.
>
> You mean, you support some kind of power management on PCI devices without
> PM capabilities ? How are you even sure that the pci_set_power_state will
> work in those kind of devices.
Linux must certainly support suspend() and resume() methods for such
devices. In terms of PCI PM terminology, they will either stay in
the PCI_D0 state during suspend, or go into PCI_D3cold ... and in
the latter case, the device driver's resume() is going to have to
know how to reinit from a reset state. (They already have code to
init from that state, so it should be no problem.)
By definition, pci_set_power_state() will be a NOP. There might
be firmware that knows how to switch power on/off, or maybe not;
that would be board-specific.
> In any case, anyway it works I would say this is very much device specific
> and not really part of a reference implementation. This is supposed to be a
> reference implementation for "PCI PM compliant" devices right ?
It was supposed to be a reference implementation, yes. But if it was
only for one subset of PCI devices, I didn't notice any suggestion of
that before now. The "works for small subset" implementation wouldn't
be anywhere near as useful.
- Dave
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2005-05-27 7:20 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-05-12 1:24 A reference implementation of PCI suspend/resume? Shaohua Li
2005-05-12 6:21 ` Adam Belay
2005-05-12 7:03 ` Shaohua Li
2005-05-27 6:49 ` Shaohua Li
2005-05-27 7:20 ` Pavel Machek
-- strict thread matches above, loose matches on Subject: below --
2005-05-14 5:06 Fabrice Gautier
2005-05-16 4:53 ` David Brownell
2005-05-16 8:55 ` Shaohua Li
2005-05-16 20:16 ` Adam Belay
2005-05-16 20:56 ` Rafael J. Wysocki
2005-05-16 21:19 ` Jordan Crouse
2005-05-17 0:26 ` David Brownell
2005-05-17 9:10 ` Rafael J. Wysocki
2005-05-17 18:35 ` David Brownell
2005-05-17 19:28 ` Rafael J. Wysocki
2005-05-18 1:26 ` Adam Belay
2005-05-18 17:40 ` Rafael J. Wysocki
2005-05-18 18:32 ` Alan Stern
2005-05-16 19:02 Fabrice Gautier
2005-05-16 19:39 ` David Brownell
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox