public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* revert yenta free_irq on suspend
@ 2005-07-30 19:10 Hugh Dickins
  2005-07-30 20:03 ` Russell King
                   ` (2 more replies)
  0 siblings, 3 replies; 58+ messages in thread
From: Hugh Dickins @ 2005-07-30 19:10 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Dominik Brodowski, Daniel Ritz, linux-kernel

Please revert the yenta free_irq on suspend patch (below)
which went into 2.6.13-rc4 after 2.6.13-rc3-git9.

Sorry Daniel, you may have a box on which resume doesn't work without
it, but on my laptop APM resume from RAM now fails to work because of
it - locks up solid.  The patch sounded rather fishy when it went in,
but I've done an unprejudiced bisection and this turns out to be the
culprit.  Perhaps it needs something more (I can try further patches),
but as it stands it's unsuitable for 2.6.13.

Do I recall a worry about shared interrupts?  I indeed have
PCI: Found IRQ 11 for device 0000:00:1f.1
PCI: Sharing IRQ 11 with 0000:02:00.0
PCI: Found IRQ 11 for device 0000:02:00.0
PCI: Sharing IRQ 11 with 0000:00:1f.1
PCI: Found IRQ 11 for device 0000:02:01.0
PCI: Sharing IRQ 11 with 0000:02:01.1
PCI: Found IRQ 11 for device 0000:02:01.1
PCI: Sharing IRQ 11 with 0000:02:01.0
on resume before that patch, and again now I've backed it out.

Thanks,
Hugh

From: Daniel Ritz <daniel.ritz@gmx.ch>

Resume doesn't seem to work without.

Signed-off-by: Daniel Ritz <daniel.ritz@gmx.ch>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 drivers/pcmcia/yenta_socket.c |    9 +++++++++
 1 files changed, 9 insertions(+)

diff -puN drivers/pcmcia/yenta_socket.c~yenta-free_irq-on-suspend drivers/pcmcia/yenta_socket.c
--- devel/drivers/pcmcia/yenta_socket.c~yenta-free_irq-on-suspend	2005-07-28 01:05:52.000000000 -0700
+++ devel-akpm/drivers/pcmcia/yenta_socket.c	2005-07-28 01:05:52.000000000 -0700
@@ -1107,6 +1107,8 @@ static int yenta_dev_suspend (struct pci
 		pci_read_config_dword(dev, 17*4, &socket->saved_state[1]);
 		pci_disable_device(dev);
 
+		free_irq(dev->irq, socket);
+
 		/*
 		 * Some laptops (IBM T22) do not like us putting the Cardbus
 		 * bridge into D3.  At a guess, some other laptop will
@@ -1132,6 +1134,13 @@ static int yenta_dev_resume (struct pci_
 		pci_enable_device(dev);
 		pci_set_master(dev);
 
+		if (socket->cb_irq)
+			if (request_irq(socket->cb_irq, yenta_interrupt,
+			                SA_SHIRQ, "yenta", socket)) {
+				printk(KERN_WARNING "Yenta: request_irq() failed on resume!\n");
+				socket->cb_irq = 0;
+			}
+
 		if (socket->type && socket->type->restore_state)
 			socket->type->restore_state(socket);
 	}

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 19:10 Hugh Dickins
@ 2005-07-30 20:03 ` Russell King
  2005-07-30 20:36   ` Linus Torvalds
  2005-07-30 20:34 ` Linus Torvalds
  2005-07-30 20:49 ` Rafael J. Wysocki
  2 siblings, 1 reply; 58+ messages in thread
From: Russell King @ 2005-07-30 20:03 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Linus Torvalds, Andrew Morton, Dominik Brodowski, Daniel Ritz,
	linux-kernel

On Sat, Jul 30, 2005 at 08:10:33PM +0100, Hugh Dickins wrote:
> Please revert the yenta free_irq on suspend patch (below)
> which went into 2.6.13-rc4 after 2.6.13-rc3-git9.
> 
> Sorry Daniel, you may have a box on which resume doesn't work without
> it, but on my laptop APM resume from RAM now fails to work because of
> it - locks up solid.  The patch sounded rather fishy when it went in,
> but I've done an unprejudiced bisection and this turns out to be the
> culprit.  Perhaps it needs something more (I can try further patches),
> but as it stands it's unsuitable for 2.6.13.

What this probably means is that we need some way to turn off interrupts
from devices on suspend, and on resume, keep them off until drivers
have had a chance to quiesce all devices, turn them back on, and then
do full resume.

The "quiesce" stage needs to take account of whether devices are
accessible (eg, USB mice and keyboards won't be because the USB host
controller isn't resumed.)

(and no I don't have a patch for this - I think this requires another
rework of the PM subsystem and drivers.) ;(

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 19:10 Hugh Dickins
  2005-07-30 20:03 ` Russell King
@ 2005-07-30 20:34 ` Linus Torvalds
  2005-07-31 13:29   ` Pavel Machek
  2005-07-30 20:49 ` Rafael J. Wysocki
  2 siblings, 1 reply; 58+ messages in thread
From: Linus Torvalds @ 2005-07-30 20:34 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Dominik Brodowski, Daniel Ritz,
	Linux Kernel Mailing List, Len Brown



On Sat, 30 Jul 2005, Hugh Dickins wrote:
>
> Please revert the yenta free_irq on suspend patch (below)
> which went into 2.6.13-rc4 after 2.6.13-rc3-git9.

Ok. Will do.

And the ACPI people had better stop doing this crazy thing in the first 
place. There's really no point at all to freeing and re-requesting the 
interrupts, and the IRQ controller should just re-initialize itself 
instead.

The excuse for not doing so was totally ludicrous in the first place, if I
recall correctly. Use GFP_ATOMIC in ACPI if you do things with interrupts
disabled, don't break a million drivers.

		Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 20:03 ` Russell King
@ 2005-07-30 20:36   ` Linus Torvalds
  2005-07-30 20:54     ` Russell King
  0 siblings, 1 reply; 58+ messages in thread
From: Linus Torvalds @ 2005-07-30 20:36 UTC (permalink / raw)
  To: Russell King
  Cc: Hugh Dickins, Andrew Morton, Dominik Brodowski, Daniel Ritz,
	linux-kernel



On Sat, 30 Jul 2005, Russell King wrote:
> 
> What this probably means is that we need some way to turn off interrupts
> from devices on suspend, and on resume, keep them off until drivers
> have had a chance to quiesce all devices, turn them back on, and then
> do full resume.

No, we just need to suspend and resume the interrupt controller properly.  
Which we had the technology for, and we actually used to do, but for some
(incorrect) reason ACPI people thought it should be up to individual
drivers.

			Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 19:10 Hugh Dickins
  2005-07-30 20:03 ` Russell King
  2005-07-30 20:34 ` Linus Torvalds
@ 2005-07-30 20:49 ` Rafael J. Wysocki
  2005-07-30 21:08   ` Daniel Ritz
  2005-07-30 21:32   ` Hugh Dickins
  2 siblings, 2 replies; 58+ messages in thread
From: Rafael J. Wysocki @ 2005-07-30 20:49 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Linus Torvalds, Andrew Morton, Dominik Brodowski, Daniel Ritz,
	linux-kernel

On Saturday, 30 of July 2005 21:10, Hugh Dickins wrote:
> Please revert the yenta free_irq on suspend patch (below)
> which went into 2.6.13-rc4 after 2.6.13-rc3-git9.
> 
> Sorry Daniel, you may have a box on which resume doesn't work without
> it, but on my laptop APM resume from RAM now fails to work because of
> it - locks up solid.

Well, the patch is needed on other boxes too (eg. mine :-)) due to the recent
changes in ACPI.

Could you please send the /proc/interrupts from your box?

Rafael
 

-- 
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
		-- Lewis Carroll "Alice's Adventures in Wonderland"

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 20:36   ` Linus Torvalds
@ 2005-07-30 20:54     ` Russell King
  2005-07-30 21:10       ` Linus Torvalds
  2005-07-30 21:20       ` Rafael J. Wysocki
  0 siblings, 2 replies; 58+ messages in thread
From: Russell King @ 2005-07-30 20:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Hugh Dickins, Andrew Morton, Dominik Brodowski, Daniel Ritz,
	linux-kernel

On Sat, Jul 30, 2005 at 01:36:24PM -0700, Linus Torvalds wrote:
> On Sat, 30 Jul 2005, Russell King wrote:
> > 
> > What this probably means is that we need some way to turn off interrupts
> > from devices on suspend, and on resume, keep them off until drivers
> > have had a chance to quiesce all devices, turn them back on, and then
> > do full resume.
> 
> No, we just need to suspend and resume the interrupt controller properly.  
> Which we had the technology for, and we actually used to do, but for some
> (incorrect) reason ACPI people thought it should be up to individual
> drivers.

I don't think so - I believe one of the problem cases is where you
have a screaming interrupt caused by an improperly setup device.

Consider the case where you have a shared interrupt line and you're
partially through resuming devices, when one unresumed device (setup
by the BIOS) suddenly starts asserting its interrupt.

The kernel then disables the source.  Unfortunately, that was the IRQ
for your USB host, which has your USB keyboard and mouse attached.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 20:49 ` Rafael J. Wysocki
@ 2005-07-30 21:08   ` Daniel Ritz
  2005-07-30 21:32   ` Hugh Dickins
  1 sibling, 0 replies; 58+ messages in thread
From: Daniel Ritz @ 2005-07-30 21:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Hugh Dickins, Linus Torvalds, Andrew Morton, Dominik Brodowski,
	linux-kernel

On Saturday 30 July 2005 22.49, Rafael J. Wysocki wrote:
> On Saturday, 30 of July 2005 21:10, Hugh Dickins wrote:
> > Please revert the yenta free_irq on suspend patch (below)
> > which went into 2.6.13-rc4 after 2.6.13-rc3-git9.
> > 
> > Sorry Daniel, you may have a box on which resume doesn't work without
> > it, but on my laptop APM resume from RAM now fails to work because of
> > it - locks up solid.
> 
> Well, the patch is needed on other boxes too (eg. mine :-)) due to the recent
> changes in ACPI.
> 

well, i have been told so. but when i asked 'why?' nobody answered something
else than because of a ACPI change.

the patch already died in the git tree which is good.

rgds
-daniel

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 20:54     ` Russell King
@ 2005-07-30 21:10       ` Linus Torvalds
  2005-07-30 21:30         ` Russell King
                           ` (2 more replies)
  2005-07-30 21:20       ` Rafael J. Wysocki
  1 sibling, 3 replies; 58+ messages in thread
From: Linus Torvalds @ 2005-07-30 21:10 UTC (permalink / raw)
  To: Russell King
  Cc: Hugh Dickins, Andrew Morton, Dominik Brodowski, Daniel Ritz,
	linux-kernel



On Sat, 30 Jul 2005, Russell King wrote:
> 
> I don't think so - I believe one of the problem cases is where you
> have a screaming interrupt caused by an improperly setup device.

Not a problem.

The thing is, this is trivially solved by
 - disable irq controller last on shutdown
 - re-enable irq controller last on resume

Think about it. Even if you have screaming irq's (and thus we'll shut
things down somewhere during the resume), when we then get to re-enable
the irq controller thing, we'll have them all back again at that point.
Problem solved.

You can have variations on this, of course - you can enable the irq
controller early _and_ late in the resume process. Ie do a minimal "get
the basics working" early - you might want to make sure that timers etc
work early on, for example, and then a "fix up the details" thing late.

An interrupt controller is clearly a special case, so it's worth spending 
some effort on handling it.

In contrast, what is _not_ worth doing is screweing over every single
driver we have.

		Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 20:54     ` Russell King
  2005-07-30 21:10       ` Linus Torvalds
@ 2005-07-30 21:20       ` Rafael J. Wysocki
  1 sibling, 0 replies; 58+ messages in thread
From: Rafael J. Wysocki @ 2005-07-30 21:20 UTC (permalink / raw)
  To: Russell King
  Cc: Linus Torvalds, Hugh Dickins, Andrew Morton, Dominik Brodowski,
	Daniel Ritz, linux-kernel, Li Shaohua, Len Brown

On Saturday, 30 of July 2005 22:54, Russell King wrote:
> On Sat, Jul 30, 2005 at 01:36:24PM -0700, Linus Torvalds wrote:
> > On Sat, 30 Jul 2005, Russell King wrote:
> > > 
> > > What this probably means is that we need some way to turn off interrupts
> > > from devices on suspend, and on resume, keep them off until drivers
> > > have had a chance to quiesce all devices, turn them back on, and then
> > > do full resume.
> > 
> > No, we just need to suspend and resume the interrupt controller properly.  
> > Which we had the technology for, and we actually used to do, but for some
> > (incorrect) reason ACPI people thought it should be up to individual
> > drivers.
> 
> I don't think so - I believe one of the problem cases is where you
> have a screaming interrupt caused by an improperly setup device.

This is happening in the real life, it seems (eg please see
http://bugzilla.kernel.org/show_bug.cgi?id=4416#c36).

> Consider the case where you have a shared interrupt line and you're
> partially through resuming devices, when one unresumed device (setup
> by the BIOS) suddenly starts asserting its interrupt.
> 
> The kernel then disables the source.  Unfortunately, that was the IRQ
> for your USB host, which has your USB keyboard and mouse attached.

Also, this has been discussed in this thread on the linux-pm list:
http://lists.osdl.org/pipermail/linux-pm/2005-May/000955.html

Greets,
Rafael
 

-- 
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
		-- Lewis Carroll "Alice's Adventures in Wonderland"

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 21:10       ` Linus Torvalds
@ 2005-07-30 21:30         ` Russell King
  2005-07-30 22:28         ` Rafael J. Wysocki
  2005-08-01  9:06         ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 58+ messages in thread
From: Russell King @ 2005-07-30 21:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Hugh Dickins, Andrew Morton, Dominik Brodowski, Daniel Ritz,
	linux-kernel

On Sat, Jul 30, 2005 at 02:10:41PM -0700, Linus Torvalds wrote:
> On Sat, 30 Jul 2005, Russell King wrote:
> > I don't think so - I believe one of the problem cases is where you
> > have a screaming interrupt caused by an improperly setup device.
> 
> Not a problem.
> 
> The thing is, this is trivially solved by
>  - disable irq controller last on shutdown
>  - re-enable irq controller last on resume

I reserve further judgement on this idea until it's had some field
testing. 8)

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 20:49 ` Rafael J. Wysocki
  2005-07-30 21:08   ` Daniel Ritz
@ 2005-07-30 21:32   ` Hugh Dickins
  2005-07-30 22:00     ` Rafael J. Wysocki
  1 sibling, 1 reply; 58+ messages in thread
From: Hugh Dickins @ 2005-07-30 21:32 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linus Torvalds, Andrew Morton, Dominik Brodowski, Daniel Ritz,
	linux-kernel

On Sat, 30 Jul 2005, Rafael J. Wysocki wrote:
> 
> Well, the patch is needed on other boxes too (eg. mine :-)) due to the recent
> changes in ACPI.
> 
> Could you please send the /proc/interrupts from your box?

           CPU0       
  0:    2818513          XT-PIC  timer
  1:      56790          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  8:          2          XT-PIC  rtc
 11:      57443          XT-PIC  yenta, yenta, eth0
 12:     110579          XT-PIC  i8042
 14:      31332          XT-PIC  ide0
 15:     100988          XT-PIC  ide1
NMI:          0 
LOC:          0 
ERR:          0
MIS:          0

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 21:32   ` Hugh Dickins
@ 2005-07-30 22:00     ` Rafael J. Wysocki
  2005-07-30 22:24       ` Hugh Dickins
  0 siblings, 1 reply; 58+ messages in thread
From: Rafael J. Wysocki @ 2005-07-30 22:00 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Linus Torvalds, Andrew Morton, Dominik Brodowski, Daniel Ritz,
	linux-kernel

On Saturday, 30 of July 2005 23:32, Hugh Dickins wrote:
> On Sat, 30 Jul 2005, Rafael J. Wysocki wrote:
> > 
> > Well, the patch is needed on other boxes too (eg. mine :-)) due to the recent
> > changes in ACPI.
> > 
> > Could you please send the /proc/interrupts from your box?
> 
>            CPU0       
>   0:    2818513          XT-PIC  timer
>   1:      56790          XT-PIC  i8042
>   2:          0          XT-PIC  cascade
>   8:          2          XT-PIC  rtc
>  11:      57443          XT-PIC  yenta, yenta, eth0
>  12:     110579          XT-PIC  i8042
>  14:      31332          XT-PIC  ide0
>  15:     100988          XT-PIC  ide1
> NMI:          0 
> LOC:          0 
> ERR:          0
> MIS:          0

Thanks.  It looks like eth0 gets a yenta's interrupt and goes awry.
Could you please tell me what driver the eth0 is?

Rafael


-- 
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
		-- Lewis Carroll "Alice's Adventures in Wonderland"

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 22:00     ` Rafael J. Wysocki
@ 2005-07-30 22:24       ` Hugh Dickins
  2005-07-30 23:09         ` Rafael J. Wysocki
  0 siblings, 1 reply; 58+ messages in thread
From: Hugh Dickins @ 2005-07-30 22:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linus Torvalds, Andrew Morton, Dominik Brodowski, Daniel Ritz,
	linux-kernel

On Sun, 31 Jul 2005, Rafael J. Wysocki wrote:
> On Saturday, 30 of July 2005 23:32, Hugh Dickins wrote:
> > On Sat, 30 Jul 2005, Rafael J. Wysocki wrote:
> > > 
> > > Could you please send the /proc/interrupts from your box?
> > 
> >  11:      57443          XT-PIC  yenta, yenta, eth0
> 
> Thanks.  It looks like eth0 gets a yenta's interrupt and goes awry.
> Could you please tell me what driver the eth0 is?

CONFIG_VORTEX drivers/net/3c59x.c:
0000:02:00.0: 3Com PCI 3c905C Tornado at 0xec80. Vers LK1.1.19

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 21:10       ` Linus Torvalds
  2005-07-30 21:30         ` Russell King
@ 2005-07-30 22:28         ` Rafael J. Wysocki
  2005-07-31  4:49           ` Linus Torvalds
  2005-08-01  9:06         ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 58+ messages in thread
From: Rafael J. Wysocki @ 2005-07-30 22:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Russell King, Hugh Dickins, Andrew Morton,
	Dominik Brodowski, Daniel Ritz, Len Brown

On Saturday, 30 of July 2005 23:10, Linus Torvalds wrote:
> 
> On Sat, 30 Jul 2005, Russell King wrote:
> > 
> > I don't think so - I believe one of the problem cases is where you
> > have a screaming interrupt caused by an improperly setup device.
> 
> Not a problem.
> 
> The thing is, this is trivially solved by
>  - disable irq controller last on shutdown
>  - re-enable irq controller last on resume
> 
> Think about it. Even if you have screaming irq's (and thus we'll shut
> things down somewhere during the resume), when we then get to re-enable
> the irq controller thing, we'll have them all back again at that point.
> Problem solved.
> 
> You can have variations on this, of course - you can enable the irq
> controller early _and_ late in the resume process. Ie do a minimal "get
> the basics working" early - you might want to make sure that timers etc
> work early on, for example, and then a "fix up the details" thing late.
> 
> An interrupt controller is clearly a special case, so it's worth spending 
> some effort on handling it.
> 
> In contrast, what is _not_ worth doing is screweing over every single
> driver we have.

Well, they have _already_ been screwed by the following patch that went
to your tree with the ACPI update.  If you drop it, all problems related to
freeing/requesting IRQs on suspend/resume will be gone.

Please note, however, that what it does is to get rid of unconditional setting
ACPI PCI links on resume in hope that someone will take care of them later,
which is not quite right, so to speak.

In my opinion the time for introducing this change is not the best, to put it
lightly, but it looks like it will have to be made at some point.    Still I'm not
an ACPI expert, so Len could you please advise?

Greets,
Rafael


diff -Naru a/drivers/acpi/pci_link.c b/drivers/acpi/pci_link.c
--- a/drivers/acpi/pci_link.c	2005-03-24 04:57:27 -08:00
+++ b/drivers/acpi/pci_link.c	2005-03-24 04:57:27 -08:00
@@ -72,10 +72,12 @@
 	u8			active;			/* Current IRQ */
 	u8			edge_level;		/* All IRQs */
 	u8			active_high_low;	/* All IRQs */
-	u8			initialized;
 	u8			resource_type;
 	u8			possible_count;
 	u8			possible[ACPI_PCI_LINK_MAX_POSSIBLE];
+	u8			initialized:1;
+	u8			suspend_resume:1;
+	u8			reserved:6;
 };
 
 struct acpi_pci_link {
@@ -530,6 +532,10 @@
 
 	ACPI_FUNCTION_TRACE("acpi_pci_link_allocate");
 
+	if (link->irq.suspend_resume) {
+		acpi_pci_link_set(link, link->irq.active);
+		link->irq.suspend_resume = 0;
+	}
 	if (link->irq.initialized)
 		return_VALUE(0);
 
@@ -713,38 +719,24 @@
 	return_VALUE(result);
 }
 
-
-static int
-acpi_pci_link_resume (
-	struct acpi_pci_link	*link)
-{
-	ACPI_FUNCTION_TRACE("acpi_pci_link_resume");
-	
-	if (link->irq.active && link->irq.initialized)
-		return_VALUE(acpi_pci_link_set(link, link->irq.active));
-	else
-		return_VALUE(0);
-}
-
-
 static int
-irqrouter_resume(
-	struct sys_device *dev)
+irqrouter_suspend(
+	struct sys_device *dev,
+	u32	state)
 {
 	struct list_head        *node = NULL;
 	struct acpi_pci_link    *link = NULL;
 
-	ACPI_FUNCTION_TRACE("irqrouter_resume");
+	ACPI_FUNCTION_TRACE("irqrouter_suspend");
 
 	list_for_each(node, &acpi_link.entries) {
-
 		link = list_entry(node, struct acpi_pci_link, node);
 		if (!link) {
 			ACPI_DEBUG_PRINT((ACPI_DB_ERROR, "Invalid link context\n"));
 			continue;
 		}
-
-		acpi_pci_link_resume(link);
+		if (link->irq.active && link->irq.initialized)
+			link->irq.suspend_resume = 1;
 	}
 	return_VALUE(0);
 }
@@ -856,7 +848,7 @@
 
 static struct sysdev_class irqrouter_sysdev_class = {
         set_kset_name("irqrouter"),
-        .resume = irqrouter_resume,
+        .suspend = irqrouter_suspend,
 };
 
 

-- 
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
		-- Lewis Carroll "Alice's Adventures in Wonderland"

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 22:24       ` Hugh Dickins
@ 2005-07-30 23:09         ` Rafael J. Wysocki
  2005-07-31 20:15           ` Rafael J. Wysocki
  0 siblings, 1 reply; 58+ messages in thread
From: Rafael J. Wysocki @ 2005-07-30 23:09 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Linus Torvalds, Andrew Morton, Dominik Brodowski, Daniel Ritz,
	linux-kernel

On Sunday, 31 of July 2005 00:24, Hugh Dickins wrote:
> On Sun, 31 Jul 2005, Rafael J. Wysocki wrote:
> > On Saturday, 30 of July 2005 23:32, Hugh Dickins wrote:
> > > On Sat, 30 Jul 2005, Rafael J. Wysocki wrote:
> > > > 
> > > > Could you please send the /proc/interrupts from your box?
> > > 
> > >  11:      57443          XT-PIC  yenta, yenta, eth0
> > 
> > Thanks.  It looks like eth0 gets a yenta's interrupt and goes awry.
> > Could you please tell me what driver the eth0 is?
> 
> CONFIG_VORTEX drivers/net/3c59x.c:
> 0000:02:00.0: 3Com PCI 3c905C Tornado at 0xec80. Vers LK1.1.19

Thanks again.  From the first look the suspend/resume routines of the driver
are missing some calls.  In particular, with the IRQ-freeing patch for yenta it is
likely to get an out-of-order interrupt as I suspected.

Linus has apparently dropped that patch for yenta, but in case it is
reintroduced in the future you will probably need a patch to make the network
driver cooperate.  I'll try to prepare one tomorrow, if I can, but I have no hardware
to test it.

Greets,
Rafael


-- 
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
		-- Lewis Carroll "Alice's Adventures in Wonderland"

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 22:28         ` Rafael J. Wysocki
@ 2005-07-31  4:49           ` Linus Torvalds
  0 siblings, 0 replies; 58+ messages in thread
From: Linus Torvalds @ 2005-07-31  4:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-kernel, Russell King, Hugh Dickins, Andrew Morton,
	Dominik Brodowski, Daniel Ritz, Len Brown



On Sun, 31 Jul 2005, Rafael J. Wysocki wrote:
> 
> Well, they have _already_ been screwed by the following patch that went
> to your tree with the ACPI update.  If you drop it, all problems related to
> freeing/requesting IRQs on suspend/resume will be gone.

Yes. I really think we need to revert that patch, we just can't fix this 
any other way in the short run - the "request_irq/free_irq()" thing has 
broken too many setups. And we _need_ an answer for the short run, since I 
want to be able to release 2.6.13 without having lots of peoples laptop 
suspends be broken.

Yes, I realize that it fixed suspend for some people, but the thing is,
anything that expects lots of drivers to change just is fundamentally
broken. In a perfect world we might be able to get all drivers to do the
right thing, but dammit, in a perfect world we wouldn't have ACPI in the
first place.

So I guess I'll just have to revert the ACPI change that caused drivers to
do request_irq/free_irq. I'd prefer it if the ACPI people did that revert 
themselves, though.

		Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: revert yenta free_irq on suspend
@ 2005-07-31  5:03 Brown, Len
  2005-07-31  5:31 ` Linus Torvalds
                   ` (3 more replies)
  0 siblings, 4 replies; 58+ messages in thread
From: Brown, Len @ 2005-07-31  5:03 UTC (permalink / raw)
  To: Linus Torvalds, Rafael J. Wysocki
  Cc: linux-kernel, Russell King, Hugh Dickins, Andrew Morton,
	Dominik Brodowski, Daniel Ritz

>So I guess I'll just have to revert the ACPI change that 
>caused drivers to do request_irq/free_irq. I'd prefer it
>if the ACPI people did that revert themselves, though.

If that is what you want, I'll be happy to do it.

If one believes that suspend/resume is working on a large number of
systems -- working to a level that a distro can acutally support it,
then restoring our temporary resume IRQ router hack to make many systems
work
is clearly the right thing to do.

But that believe would be total fantasy -- supsend/resume is not
working on a large number of machines, and no distro is currently
able to support it.  (I'm talking about S3 suspend to RAM primarily,
suspend to disk is less interesting -- though Red Hat doesn't
even support _that_)

We can got back to the old hack, but it will probably just delay
the day that suspend/resume is working broadly, and actually
can be deployed and supported by distros.

-Len

^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: revert yenta free_irq on suspend
  2005-07-31  5:03 Brown, Len
@ 2005-07-31  5:31 ` Linus Torvalds
  2005-07-31  9:49 ` Rafael J. Wysocki
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 58+ messages in thread
From: Linus Torvalds @ 2005-07-31  5:31 UTC (permalink / raw)
  To: Brown, Len
  Cc: Rafael J. Wysocki, linux-kernel, Russell King, Hugh Dickins,
	Andrew Morton, Dominik Brodowski, Daniel Ritz



On Sun, 31 Jul 2005, Brown, Len wrote:
> 
> If one believes that suspend/resume is working on a large number of
> systems -- working to a level that a distro can acutally support it,
> then restoring our temporary resume IRQ router hack to make many systems
> work is clearly the right thing to do.

I don't believe that it works on a huge number of devices as-is, no.

But I'm definitely even less likely to believe in this "two steps forward,
one step back" dance. Because as far as I can tell, it's equally often
"one step forward, two steps back", and nobody can tell when we go forware 
more than we go backwards.

So I'd rather have "one tenth of a step forward, but if we see even a 
_hint_ of a step back, we revert immediately".

I realize that sounds damn timid and boring, but the thing is:

 - even _if_ (and quite frankly, judging by the complaints, I find that 
   unlikely) we're doing more forward progress than backwards progress 
   ("backwards progress? you moron!"), the "one step back" thing is really 
   doing a _huge_ amount of psychological damage to the whole thing.

   The thing is, we're better off making very very slow progress that is 
   _steady_, than having people who _used_ to have things work for them 
   suddenly break.

   So I believe that if we fix two machines and break one machine, we've 
   actually regressed. It doesn't matter that we fixed more than we broke: 
   we _still_ regressed. Because it means that people can't trust the 
   progress we make!

So this is why I'm a very strong proponent of the fact that if we _ever_
have anybody complain that a patch broke things, we should immediately
revert it, unless we can fix it asap.

Btw, this argument is much less true in areas where we can "think" about
the problems. In non-driver/non-firmware cases.

When we can logically argue from a purely theoretical standpoint for a
"known correct solution", and expect the theoretical argument to actually
be reflected in practice, I'm much more open to an argument of "ok, we
know where we are going, and we'll have to break a few eggs just because
the changes are extensive".

But when it comes to device drivers and badly documented stuff that
developers can usually not even reproduce, our #1 strength is the people
for whom it works, and when something breaks, that's a huge big red flag,
and then I really think that "revert or fix immediately" is the only
reasonable alternative. Otherwise we'll just oscillate about a point that
we don't even know where it is, and have no way to judge if the
oscillations are getting more violent or are dampening out - we don't have
a reference point.

But as with everything, there is no total black-and-white case. Things 
_do_ break occasionally, and clearly we can't guarantee nonbreakage or 
we'll end up being static.

But this particular thing has clearly caused a _lot_ of noise, with
second-order breakage from fixes from the first order one. At the very 
least alternatives should be tried, and I think there _are_ much less 
intrusive alternatives that are _much_ less likely to have these kinds of 
negative side effects.

		Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31  5:03 Brown, Len
  2005-07-31  5:31 ` Linus Torvalds
@ 2005-07-31  9:49 ` Rafael J. Wysocki
  2005-07-31 22:27 ` Dave Jones
  2005-08-01  8:51 ` Matthew Garrett
  3 siblings, 0 replies; 58+ messages in thread
From: Rafael J. Wysocki @ 2005-07-31  9:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Brown, Len, Linus Torvalds, Russell King, Hugh Dickins,
	Andrew Morton, Dominik Brodowski, Daniel Ritz, Li Shaohua

On Sunday, 31 of July 2005 07:03, Brown, Len wrote:
> >So I guess I'll just have to revert the ACPI change that 
> >caused drivers to do request_irq/free_irq. I'd prefer it
> >if the ACPI people did that revert themselves, though.
> 
> If that is what you want, I'll be happy to do it.
> 
> If one believes that suspend/resume is working on a large number of
> systems -- working to a level that a distro can acutally support it,
> then restoring our temporary resume IRQ router hack to make many systems
> work
> is clearly the right thing to do.
> 
> But that believe would be total fantasy -- supsend/resume is not
> working on a large number of machines, and no distro is currently
> able to support it.  (I'm talking about S3 suspend to RAM primarily,
> suspend to disk is less interesting -- though Red Hat doesn't
> even support _that_)
> 
> We can got back to the old hack, but it will probably just delay
> the day that suspend/resume is working broadly, and actually
> can be deployed and supported by distros.

May I propose to keep this change in -mm?

This issue has already been discussed on the linux-pm list and the people
there seem to agree thet the way to go is to convert all PCI drivers to the
model in which they drop their IRQs on suspend and request them back on
resume (ref. http://lists.osdl.org/pipermail/linux-pm/2005-May/000955.html).

There are some drivers that already do it (eg the USB drivers), but there are
many drivers that don't and in fact the recent problems have been related to
them.  If the change stays in -mm we will be able to convert the drivers
gradually and they will hopefully get some testing.  When it's done, it
will be safe to push the change along with the converted drivers to
mainline.

Greets,
Rafael


-- 
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
		-- Lewis Carroll "Alice's Adventures in Wonderland"

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 20:34 ` Linus Torvalds
@ 2005-07-31 13:29   ` Pavel Machek
  2005-07-31 15:53     ` Linus Torvalds
  0 siblings, 1 reply; 58+ messages in thread
From: Pavel Machek @ 2005-07-31 13:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Hugh Dickins, Andrew Morton, Dominik Brodowski, Daniel Ritz,
	Linux Kernel Mailing List, Len Brown

Hi!

> > Please revert the yenta free_irq on suspend patch (below)
> > which went into 2.6.13-rc4 after 2.6.13-rc3-git9.
> 
> Ok. Will do.

> And the ACPI people had better stop doing this crazy thing in the first 
> place. There's really no point at all to freeing and re-requesting the 
> interrupts, and the IRQ controller should just re-initialize itself 
> instead.

Well, on some machines interrupts can change during suspend (or so I
was told). I did not like the ACPI change at one point, but it is very
wrong to revert PCMCIA fix without also fixing ACPI interpretter.

And it indeed seems that ACPI interpretter is hard to fix in the right
way.

								Pavel 
-- 
if you have sharp zaurus hardware you don't need... you know my address

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 13:29   ` Pavel Machek
@ 2005-07-31 15:53     ` Linus Torvalds
  2005-07-31 17:09       ` Linus Torvalds
  0 siblings, 1 reply; 58+ messages in thread
From: Linus Torvalds @ 2005-07-31 15:53 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Hugh Dickins, Andrew Morton, Dominik Brodowski, Daniel Ritz,
	Linux Kernel Mailing List, Len Brown



On Sun, 31 Jul 2005, Pavel Machek wrote:
> 
> Well, on some machines interrupts can change during suspend (or so I
> was told). I did not like the ACPI change at one point, but it is very
> wrong to revert PCMCIA fix without also fixing ACPI interpretter.

We _are_ going to fix the ACPI interpreter.

As to irq's changing during suspend - I'll believe that when I see it, not 
when some chicken little runs around worrying about it. I doubt anybody 
has ever seen it, and I'm 100% sure that we have serious breakage right 
now on machines where it definitely doesn't happen.

> And it indeed seems that ACPI interpretter is hard to fix in the right
> way.

We'll revert to the behaviour that it has traditionally had, and start 
working forwards in a more careful manner. Where we don't break working 
setups.

			Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 15:53     ` Linus Torvalds
@ 2005-07-31 17:09       ` Linus Torvalds
  0 siblings, 0 replies; 58+ messages in thread
From: Linus Torvalds @ 2005-07-31 17:09 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Hugh Dickins, Andrew Morton, Dominik Brodowski, Daniel Ritz,
	Linux Kernel Mailing List, Len Brown



On Sun, 31 Jul 2005, Linus Torvalds wrote:
> 
> We'll revert to the behaviour that it has traditionally had, and start 
> working forwards in a more careful manner. Where we don't break working 
> setups.

Here's a suggested revert (a pure "patch -R" won't work, since there's 
been other differences since). It works for me, and suspends/resumes from 
RAM on my EVO with all the irq links getting re-programmed (dmesg is very 
clear about that ;).

It's pretty much the old 2.6.12 code, updated for the refcounting etc.

		Linus

----
diff --git a/drivers/acpi/pci_link.c b/drivers/acpi/pci_link.c
--- a/drivers/acpi/pci_link.c
+++ b/drivers/acpi/pci_link.c
@@ -776,15 +776,25 @@ end:
 }
 
 static int
-irqrouter_suspend(
-	struct sys_device *dev,
-	u32	state)
+acpi_pci_link_resume(
+	struct acpi_pci_link *link)
+{
+	ACPI_FUNCTION_TRACE("acpi_pci_link_resume");
+
+	if (link->refcnt && link->irq.active && link->irq.initialized)
+		return_VALUE(acpi_pci_link_set(link, link->irq.active));
+	else
+		return_VALUE(0);
+}
+
+static int
+irqrouter_resume(
+	struct sys_device *dev)
 {
 	struct list_head        *node = NULL;
 	struct acpi_pci_link    *link = NULL;
-	int			ret = 0;
 
-	ACPI_FUNCTION_TRACE("irqrouter_suspend");
+	ACPI_FUNCTION_TRACE("irqrouter_resume");
 
 	list_for_each(node, &acpi_link.entries) {
 		link = list_entry(node, struct acpi_pci_link, node);
@@ -793,24 +803,11 @@ irqrouter_suspend(
 				"Invalid link context\n"));
 			continue;
 		}
-		if (link->irq.initialized && link->refcnt != 0
-			/* We ignore legacy IDE device irq */
-			&& link->irq.active != 14 && link->irq.active !=15) {
-			printk(KERN_WARNING PREFIX
-				"%d drivers with interrupt %d neglected to call"
-				" pci_disable_device at .suspend\n",
-				link->refcnt,
-				link->irq.active);
-			printk(KERN_WARNING PREFIX
-				"Fix the driver, or rmmod before suspend\n");
-			link->refcnt = 0;
-			ret = -EINVAL;
-		}
+		acpi_pci_link_resume(link);
 	}
-	return_VALUE(ret);
+	return_VALUE(0);
 }
 
-
 static int
 acpi_pci_link_remove (
 	struct acpi_device	*device,
@@ -922,7 +919,7 @@ __setup("acpi_irq_balance", acpi_irq_bal
 /* FIXME: we will remove this interface after all drivers call pci_disable_device */
 static struct sysdev_class irqrouter_sysdev_class = {
         set_kset_name("irqrouter"),
-        .suspend = irqrouter_suspend,
+        .resume = irqrouter_resume,
 };
 
 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 23:09         ` Rafael J. Wysocki
@ 2005-07-31 20:15           ` Rafael J. Wysocki
  2005-08-01 20:34             ` Hugh Dickins
  0 siblings, 1 reply; 58+ messages in thread
From: Rafael J. Wysocki @ 2005-07-31 20:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hugh Dickins, Linus Torvalds, Andrew Morton, Dominik Brodowski,
	Daniel Ritz

On Sunday, 31 of July 2005 01:09, Rafael J. Wysocki wrote:
> On Sunday, 31 of July 2005 00:24, Hugh Dickins wrote:
> > On Sun, 31 Jul 2005, Rafael J. Wysocki wrote:
> > > On Saturday, 30 of July 2005 23:32, Hugh Dickins wrote:
> > > > On Sat, 30 Jul 2005, Rafael J. Wysocki wrote:
> > > > > 
> > > > > Could you please send the /proc/interrupts from your box?
> > > > 
> > > >  11:      57443          XT-PIC  yenta, yenta, eth0
> > > 
> > > Thanks.  It looks like eth0 gets a yenta's interrupt and goes awry.
> > > Could you please tell me what driver the eth0 is?
> > 
> > CONFIG_VORTEX drivers/net/3c59x.c:
> > 0000:02:00.0: 3Com PCI 3c905C Tornado at 0xec80. Vers LK1.1.19
> 
> Thanks again.  From the first look the suspend/resume routines of the driver
> are missing some calls.  In particular, with the IRQ-freeing patch for yenta it is
> likely to get an out-of-order interrupt as I suspected.
> 
> Linus has apparently dropped that patch for yenta, but in case it is
> reintroduced in the future you will probably need a patch to make the network
> driver cooperate.  I'll try to prepare one tomorrow, if I can, but I have no hardware
> to test it.

The patch follows.  It compiles and should work, though I haven't tested it.

Greets,
Rafael

Index: linux-2.6.13-rc4-git3/drivers/net/3c59x.c
===================================================================
--- linux-2.6.13-rc4-git3.orig/drivers/net/3c59x.c
+++ linux-2.6.13-rc4-git3/drivers/net/3c59x.c
@@ -973,6 +973,11 @@ static int vortex_suspend (struct pci_de
 			netif_device_detach(dev);
 			vortex_down(dev, 1);
 		}
+		pci_save_state(pdev);
+		pci_enable_wake(pdev, pci_choose_state(pdev, state), 0);
+		free_irq(dev->irq, dev);
+		pci_disable_device(pdev);
+		pci_set_power_state(pdev, pci_choose_state(pdev, state));
 	}
 	return 0;
 }
@@ -980,8 +985,19 @@ static int vortex_suspend (struct pci_de
 static int vortex_resume (struct pci_dev *pdev)
 {
 	struct net_device *dev = pci_get_drvdata(pdev);
+	struct vortex_private *vp = netdev_priv(dev);
 
-	if (dev && dev->priv) {
+	if (dev && vp) {
+		pci_set_power_state(pdev, PCI_D0);
+		pci_restore_state(pdev);
+		pci_enable_device(pdev);
+		pci_set_master(pdev);
+		if (request_irq(dev->irq, vp->full_bus_master_rx ?
+				&boomerang_interrupt : &vortex_interrupt, SA_SHIRQ, dev->name, dev)) {
+			printk(KERN_WARNING "%s: Could not reserve IRQ %d\n", dev->name, dev->irq);
+			pci_disable_device(pdev);
+			return -EBUSY;
+		}
 		if (netif_running(dev)) {
 			vortex_up(dev);
 			netif_device_attach(dev);


-- 
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
		-- Lewis Carroll "Alice's Adventures in Wonderland"

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
@ 2005-07-31 20:34 ambx1
  2005-07-31 21:20 ` Pavel Machek
                   ` (2 more replies)
  0 siblings, 3 replies; 58+ messages in thread
From: ambx1 @ 2005-07-31 20:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, Hugh Dickins, Andrew Morton, Dominik Brodowski,
	Daniel Ritz, Linux Kernel Mailing List, Len Brown

----- Original Message ----- 
From: Linus Torvalds <torvalds@osdl.org> 
Date: Sunday, July 31, 2005 11:53 am 
Subject: Re: revert yenta free_irq on suspend 

> 
> 
> On Sun, 31 Jul 2005, Pavel Machek wrote: 
> > 
> > Well, on some machines interrupts can change during suspend (or 
> so I 
> > was told). I did not like the ACPI change at one point, but it 
> is very 
> > wrong to revert PCMCIA fix without also fixing ACPI interpretter. 
> 
> We _are_ going to fix the ACPI interpreter. 
> 
> As to irq's changing during suspend - I'll believe that when I see 
> it, not 
> when some chicken little runs around worrying about it. I doubt 
> anybody 
> has ever seen it, and I'm 100% sure that we have serious breakage 
> right 
> now on machines where it definitely doesn't happen. 
> 
> > And it indeed seems that ACPI interpretter is hard to fix in the 
> right> way. 
> 
> We'll revert to the behaviour that it has traditionally had, and 
> start 
> working forwards in a more careful manner. Where we don't break 
> working 
> setups. 
> 
> Linus 

Hi Linus,

In general, I think that calling free_irq is the right behavior.
Although irqs changing after suspend is rare, there are also some
more serious issues.  This has been discussed in the past, and a
summary is as follows:

1.) race conditions:
Consider the case where several PCI devices are sharing a single PCI
interrupt.  One device is suspended, but doesn't call free_irq.  Now
another properly behaving device on the same interrupt line generates
an interrupt.  Let's say the suspended device had its handler
registered first, and therefore is called first.  The handler will
attempt to access registers on the physically-off device to determine
if it generated the interrupt.  The PCI bus will issue a master abort.
The driver's interrupt handler will likely interpret the read
incorrectly.

Every interrupt handler could be modified to check if the device is
available, but it would be cleaner and more efficient to unregister
the interrupt.  Either way, every driver has to be changed to support
PM correctly.

2.) runtime power management:
We don't want to leave stale interrupt handlers registered when only
suspending a specific device.  As we move toward supporting runtime
power management it will be important to ensure every driver calls
free_irq() in its suspend() (or whatever we're using at that point)
routine.  This avoids interrupt handler bugs and extra interrupt
overhead.


Also I'd like to point out that this patch broke APM suspend-to-ram,
not ACPI S3.  IMO, it may not be possible to support both APM and ACPI
on every system, as their specs are not intended to be compatible.
Progress toward proper suspend-to-ram support will, in many cases, be
a small setback for APM.  This really can't be avoided.

There are, however, some things we can do to mitigate the breakage
toward APM.  Specifically, we should indicate the type of suspend state,
including if it's an ACPI or APM state, for each driver's ->suspend()
routine.  This will give drivers the opportunity to act differently
for APM when necessary.  I'm currenlty working on this issue.

APM is useful for legacy hardware and systems with blacklisted ACPI
support.  I don't think we should attempt to support APM on any system
with working ACPI suspend/resume.

Thanks,
Adam


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 20:34 revert yenta free_irq on suspend ambx1
@ 2005-07-31 21:20 ` Pavel Machek
  2005-08-01  8:56   ` Benjamin Herrenschmidt
  2005-07-31 22:55 ` Linus Torvalds
  2005-08-01  1:59 ` Shaohua Li
  2 siblings, 1 reply; 58+ messages in thread
From: Pavel Machek @ 2005-07-31 21:20 UTC (permalink / raw)
  To: ambx1
  Cc: Linus Torvalds, Hugh Dickins, Andrew Morton, Dominik Brodowski,
	Daniel Ritz, Linux Kernel Mailing List, Len Brown

Hi!

> Also I'd like to point out that this patch broke APM suspend-to-ram,
> not ACPI S3.  IMO, it may not be possible to support both APM and ACPI
> on every system, as their specs are not intended to be compatible.
> Progress toward proper suspend-to-ram support will, in many cases, be
> a small setback for APM.  This really can't be avoided.

Actually, for APM, OS theoretically does *not* need to FREEZE the
devices (or do anything else). "Doing nothing" should be easy...
								Pavel
-- 
if you have sharp zaurus hardware you don't need... you know my address

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31  5:03 Brown, Len
  2005-07-31  5:31 ` Linus Torvalds
  2005-07-31  9:49 ` Rafael J. Wysocki
@ 2005-07-31 22:27 ` Dave Jones
  2005-08-01  0:00   ` Andreas Steinmetz
  2005-08-03  9:23   ` Pavel Machek
  2005-08-01  8:51 ` Matthew Garrett
  3 siblings, 2 replies; 58+ messages in thread
From: Dave Jones @ 2005-07-31 22:27 UTC (permalink / raw)
  To: Brown, Len
  Cc: Linus Torvalds, Rafael J. Wysocki, linux-kernel, Russell King,
	Hugh Dickins, Andrew Morton, Dominik Brodowski, Daniel Ritz

On Sun, Jul 31, 2005 at 01:03:56AM -0400, Brown, Len wrote:

 > But that believe would be total fantasy -- supsend/resume is not
 > working on a large number of machines, and no distro is currently
 > able to support it.  (I'm talking about S3 suspend to RAM primarily,
 > suspend to disk is less interesting -- though Red Hat doesn't
 > even support _that_)

After the 'swsusp works just fine' lovefest at OLS, I spent a little
while playing with the current in-tree swsusp implementation last week.

The outcome: I'm no more enthusiastic about enabling this in Red Hat
kernels than I ever was before.  It seems to have real issues with LVM
setups (which is default on Red Hat/Fedora installs these days).
After convincing it where to suspend/resume from by feeding it
the major/minor of my swap partition, it did actually seem
to suspend. And resume (though it did spew lots of 'sleeping whilst
atomic warnings, but thats trivial compared to whats coming up next).

I rebooted, and fsck found all sorts of damage on my / partition.
After spending 30 minutes pressing 'y', to fix things up, it failed
to boot after lots of files were missing.
Why it wrote anything to completely different lv to where I told it
(and yes, I did get the major:minor right) I have no idea, but
as it stands, it definitly isn't production-ready.

I'll look into it again sometime soon, but not until after I've
reinstalled my laptop.  (I'm just thankful I had the sense not to
try this whilst I was at OLS).

		Dave


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 20:34 revert yenta free_irq on suspend ambx1
  2005-07-31 21:20 ` Pavel Machek
@ 2005-07-31 22:55 ` Linus Torvalds
  2005-07-31 23:05   ` Pavel Machek
  2005-07-31 23:10   ` Dave Airlie
  2005-08-01  1:59 ` Shaohua Li
  2 siblings, 2 replies; 58+ messages in thread
From: Linus Torvalds @ 2005-07-31 22:55 UTC (permalink / raw)
  To: ambx1
  Cc: Pavel Machek, Hugh Dickins, Andrew Morton, Dominik Brodowski,
	Daniel Ritz, Linux Kernel Mailing List, Len Brown



On Sun, 31 Jul 2005 ambx1@neo.rr.com wrote:
> 
> In general, I think that calling free_irq is the right behavior.

I DO NOT CARE!

It breaks hundreds of drivers. End of discussion.

You can do the free_irq() and request_irq() changes _without_ breaking 
hundreds of drivers by just doing one driver at a time. 

And if ACPI then restores the irq controller state, the drivers that 
_don't_ do this will _also_ continue to work.

Let me re-iterate: the ACPI changes provably BROKE REAL PEOPLES SETUPS. 

For absolutely _zero_ gain. Drivers that want to free and re-aquire an 
interrupt can do so _regardless_ of whether ACPI restores irq routings 
automatically or not.

And that's my argument. We don't do stupid things that break peoples 
existing setups in ways that nobody can debug. 

		Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 22:55 ` Linus Torvalds
@ 2005-07-31 23:05   ` Pavel Machek
  2005-07-31 23:24     ` Linus Torvalds
  2005-07-31 23:10   ` Dave Airlie
  1 sibling, 1 reply; 58+ messages in thread
From: Pavel Machek @ 2005-07-31 23:05 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: ambx1, Hugh Dickins, Andrew Morton, Dominik Brodowski,
	Daniel Ritz, Linux Kernel Mailing List, Len Brown

Hi!

> > In general, I think that calling free_irq is the right behavior.
> 
> I DO NOT CARE!
> 
> It breaks hundreds of drivers. End of discussion.
> 
> You can do the free_irq() and request_irq() changes _without_ breaking 
> hundreds of drivers by just doing one driver at a time. 
> 
> And if ACPI then restores the irq controller state, the drivers that 
> _don't_ do this will _also_ continue to work.
> 
> Let me re-iterate: the ACPI changes provably BROKE REAL PEOPLES SETUPS. 
> 
> For absolutely _zero_ gain. Drivers that want to free and re-aquire an 
> interrupt can do so _regardless_ of whether ACPI restores irq routings 
> automatically or not.
> 
> And that's my argument. We don't do stupid things that break peoples 
> existing setups in ways that nobody can debug. 

Ok, so we'll keep adding those free_irq/request_irq pairs, and
re-introduce that ACPI change when we are ready? It would be helpfull
to keep the "right thing" in -mm, so there's real motivation to add
free_irq/request_irq.
								Pavel
-- 
if you have sharp zaurus hardware you don't need... you know my address

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 22:55 ` Linus Torvalds
  2005-07-31 23:05   ` Pavel Machek
@ 2005-07-31 23:10   ` Dave Airlie
  1 sibling, 0 replies; 58+ messages in thread
From: Dave Airlie @ 2005-07-31 23:10 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: ambx1, Pavel Machek, Hugh Dickins, Andrew Morton,
	Dominik Brodowski, Daniel Ritz, Linux Kernel Mailing List,
	Len Brown

> >
> > In general, I think that calling free_irq is the right behavior.
> 
> I DO NOT CARE!
> 
> It breaks hundreds of drivers. End of discussion.
> 
> You can do the free_irq() and request_irq() changes _without_ breaking
> hundreds of drivers by just doing one driver at a time.
> 

So are driver writers supposed to start accepting patches to
free/request irqs? in a sense of making it all go forward so at some
point we can switch over to doing the correct thing? but my patch for
yenta breaks setups for some strange reason..... (maybe just APM
ones..)

If so we should put the patch to break links into -mm and then start
fixing up drivers...

Dave.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 23:05   ` Pavel Machek
@ 2005-07-31 23:24     ` Linus Torvalds
  2005-07-31 23:27       ` Pavel Machek
  0 siblings, 1 reply; 58+ messages in thread
From: Linus Torvalds @ 2005-07-31 23:24 UTC (permalink / raw)
  To: Pavel Machek
  Cc: ambx1, Hugh Dickins, Andrew Morton, Dominik Brodowski,
	Daniel Ritz, Linux Kernel Mailing List, Len Brown



On Mon, 1 Aug 2005, Pavel Machek wrote:
> 
> Ok, so we'll keep adding those free_irq/request_irq pairs

I would suggest doing so only if you have a case where it can matter.

> and re-introduce that ACPI change when we are ready?

Why do it _ever_? There is _zero_ upside to doing it, I don't see why you 
want to.

Just make ACPI restore the dang thing. It's the right thing to do.

			Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 23:24     ` Linus Torvalds
@ 2005-07-31 23:27       ` Pavel Machek
  2005-07-31 23:44         ` Linus Torvalds
  0 siblings, 1 reply; 58+ messages in thread
From: Pavel Machek @ 2005-07-31 23:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: ambx1, Hugh Dickins, Andrew Morton, Dominik Brodowski,
	Daniel Ritz, Linux Kernel Mailing List, Len Brown

Hi!

> > Ok, so we'll keep adding those free_irq/request_irq pairs
> 
> I would suggest doing so only if you have a case where it can matter.
> 
> > and re-introduce that ACPI change when we are ready?
> 
> Why do it _ever_? There is _zero_ upside to doing it, I don't see why you 
> want to.

Being able to turn off your soundcard at runtime when you are not
using it was one of examples...

> Just make ACPI restore the dang thing. It's the right thing to do.

Requires interpretter running with interrupts disabled => ugly :-(.

								Pavel
-- 
if you have sharp zaurus hardware you don't need... you know my address

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 23:27       ` Pavel Machek
@ 2005-07-31 23:44         ` Linus Torvalds
  2005-07-31 23:59           ` Dave Airlie
  2005-08-01  7:25           ` Pavel Machek
  0 siblings, 2 replies; 58+ messages in thread
From: Linus Torvalds @ 2005-07-31 23:44 UTC (permalink / raw)
  To: Pavel Machek
  Cc: ambx1, Hugh Dickins, Andrew Morton, Dominik Brodowski,
	Daniel Ritz, Linux Kernel Mailing List, Len Brown



On Mon, 1 Aug 2005, Pavel Machek wrote:
> > 
> > Why do it _ever_? There is _zero_ upside to doing it, I don't see why you 
> > want to.
> 
> Being able to turn off your soundcard at runtime when you are not
> using it was one of examples...

I meant the "ACPI restores irq controller state" thing.

Just leave it in. There's never any downside. If all the drivers end up 
doing free_irq/request_irq(), restoring the irq controller state still 
won't have any negative effect, and it solves the case where you have 
drivers that don't do it.

> > Just make ACPI restore the dang thing. It's the right thing to do.
> 
> Requires interpretter running with interrupts disabled => ugly :-(.

I don't see that. What's ugly? I agree that ACPI is ugly, but I do _not_ 
agree that it's ugly to restore irq controller state with interrupts 
disabled. It MakesSense(tm) to do so.

The fact that ACPI was designed by a group of monkeys high on LSD, and is 
some of the worst designs in the industry obviously makes running it at 
_any_ point pretty damn ugly. And the fact that MB vendors don't test it 
with anything else than Windows (and sometimes you wonder whether they do 
even that) doesn't help. So hell yes, it's ugly, and nasty. But interrupts 
disabled has nothing to do with any of it.

Besides, there's no real reason why you'd even have to do it with 
interrupts disabled. I personally think that it makes _sense_ to try to 
restore the irq controller state with irq's off, but as I made clear 
earlier in this flame-fest, there's no real reason why you couldn't just 
run with interrupts on.

If an interrupt is screaming due to lack of initialization and gets turned
off, just make sure it gets re-enabled when it is being initialized.

		Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 23:44         ` Linus Torvalds
@ 2005-07-31 23:59           ` Dave Airlie
  2005-08-01  0:19             ` Linus Torvalds
  2005-08-01  7:25           ` Pavel Machek
  1 sibling, 1 reply; 58+ messages in thread
From: Dave Airlie @ 2005-07-31 23:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, ambx1, Hugh Dickins, Andrew Morton,
	Dominik Brodowski, Daniel Ritz, Linux Kernel Mailing List,
	Len Brown

> 
> If an interrupt is screaming due to lack of initialization and gets turned
> off, just make sure it gets re-enabled when it is being initialized.
> 

That still doesn't handle the case where a device has an interrupt
handler on a shared IRQ and another device on the chain interrupts it
after it has suspended its device,

we need to either fix *for all drivers* (otherwise people sharing IRQs
will have breakages that people not sharing them won't see ... )

a) add request/free irq sets
b) add code to the interrupt handlers to make sure we aren't in a
powerdown state...

I don't really mind which is the recommended one I'd just prefer we do
it the same way everwhere... so I still believe the yenta_irq patch is
correct if we are doing a, or if not we need to do b....

Dave.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 22:27 ` Dave Jones
@ 2005-08-01  0:00   ` Andreas Steinmetz
  2005-08-01  0:06     ` Dave Jones
  2005-08-03  9:23   ` Pavel Machek
  1 sibling, 1 reply; 58+ messages in thread
From: Andreas Steinmetz @ 2005-08-01  0:00 UTC (permalink / raw)
  To: Dave Jones
  Cc: Brown, Len, Linus Torvalds, Rafael J. Wysocki, linux-kernel,
	Russell King, Hugh Dickins, Andrew Morton, Dominik Brodowski,
	Daniel Ritz

Dave Jones wrote:
> On Sun, Jul 31, 2005 at 01:03:56AM -0400, Brown, Len wrote:
> 
>  > But that believe would be total fantasy -- supsend/resume is not
>  > working on a large number of machines, and no distro is currently
>  > able to support it.  (I'm talking about S3 suspend to RAM primarily,
>  > suspend to disk is less interesting -- though Red Hat doesn't
>  > even support _that_)
> 
> After the 'swsusp works just fine' lovefest at OLS, I spent a little
> while playing with the current in-tree swsusp implementation last week.
> 
> The outcome: I'm no more enthusiastic about enabling this in Red Hat
> kernels than I ever was before.  It seems to have real issues with LVM
> setups (which is default on Red Hat/Fedora installs these days).
> After convincing it where to suspend/resume from by feeding it
> the major/minor of my swap partition, it did actually seem
> to suspend. And resume (though it did spew lots of 'sleeping whilst
> atomic warnings, but thats trivial compared to whats coming up next).
> 
> I rebooted, and fsck found all sorts of damage on my / partition.
> After spending 30 minutes pressing 'y', to fix things up, it failed
> to boot after lots of files were missing.
> Why it wrote anything to completely different lv to where I told it
> (and yes, I did get the major:minor right) I have no idea, but
> as it stands, it definitly isn't production-ready.
> 
> I'll look into it again sometime soon, but not until after I've
> reinstalled my laptop.  (I'm just thankful I had the sense not to
> try this whilst I was at OLS).

Hmm,
I'm using swsusp on my x86_64 laptop with lvm and dm-crypt. Works just
fine except for nasty spontaneous reboots from time to time caused by
yenta_socket (I do get these since I started to access my pcmcia flash
disk from initrd to retrieve the dm-crypt swap key). It does work since
at least 2.6.11 up to 2.6.13-rc4 and if the yenta_socket caused
spontaneous reboots after resume could be fixed I'd call it production
ready.

gringo:~ # fdisk -l /dev/hda

Disk /dev/hda: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot   Start      End      Blocks   Id  System
/dev/hda1            1      244     1959898+  83  Linux
/dev/hda2          245      488     1959930   82  Linux swap / Solaris
/dev/hda3          489      732     1959930   82  Linux swap / Solaris
/dev/hda4          733     9729    72268402+   5  Extended
/dev/hda5          733      976     1959898+  82  Linux swap / Solaris
/dev/hda6          977     9729    70308441   88  Linux plaintext (*)

(*) dm-crypt :-)

gringo:~ # vgdisplay
  --- Volume group ---
  VG Name               rootvg
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  27
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                8
  Open LV               8
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               67.05 GB
  PE Size               4.00 MB
  Total PE              17165
  Alloc PE / Size       14464 / 56.50 GB
  Free  PE / Size       2701 / 10.55 GB
  VG UUID               oHluq0-H5Nd-90dU-psLn-ygNT-u4GJ-D8aJhG

All filesystems are ext3 as I did have nasty experiences with reiserfs
on lvm+raid on another 2.6 system without ever using swsusp there.
-- 
Andreas Steinmetz                       SPAMmers use robotrap@domdv.de

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-08-01  0:00   ` Andreas Steinmetz
@ 2005-08-01  0:06     ` Dave Jones
  2005-08-01  0:09       ` Andreas Steinmetz
  0 siblings, 1 reply; 58+ messages in thread
From: Dave Jones @ 2005-08-01  0:06 UTC (permalink / raw)
  To: Andreas Steinmetz
  Cc: Brown, Len, Linus Torvalds, Rafael J. Wysocki, linux-kernel,
	Russell King, Hugh Dickins, Andrew Morton, Dominik Brodowski,
	Daniel Ritz

On Mon, Aug 01, 2005 at 02:00:16AM +0200, Andreas Steinmetz wrote:

 > gringo:~ # fdisk -l /dev/hda
 > 
 > Disk /dev/hda: 80.0 GB, 80026361856 bytes
 > 255 heads, 63 sectors/track, 9729 cylinders
 > Units = cylinders of 16065 * 512 = 8225280 bytes
 > 
 >    Device Boot   Start      End      Blocks   Id  System
 > /dev/hda1            1      244     1959898+  83  Linux
 > /dev/hda2          245      488     1959930   82  Linux swap / Solaris
 > /dev/hda3          489      732     1959930   82  Linux swap / Solaris
 > /dev/hda4          733     9729    72268402+   5  Extended
 > /dev/hda5          733      976     1959898+  82  Linux swap / Solaris
 > /dev/hda6          977     9729    70308441   88  Linux plaintext (*)
 > 
 > (*) dm-crypt :-)

Your swap partitions are outside of your lv's.

		Dave


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-08-01  0:06     ` Dave Jones
@ 2005-08-01  0:09       ` Andreas Steinmetz
  0 siblings, 0 replies; 58+ messages in thread
From: Andreas Steinmetz @ 2005-08-01  0:09 UTC (permalink / raw)
  To: Dave Jones
  Cc: Brown, Len, Linus Torvalds, Rafael J. Wysocki, linux-kernel,
	Russell King, Hugh Dickins, Andrew Morton, Dominik Brodowski,
	Daniel Ritz

Dave Jones wrote:
> On Mon, Aug 01, 2005 at 02:00:16AM +0200, Andreas Steinmetz wrote:
> 
>  > gringo:~ # fdisk -l /dev/hda
>  > 
>  > Disk /dev/hda: 80.0 GB, 80026361856 bytes
>  > 255 heads, 63 sectors/track, 9729 cylinders
>  > Units = cylinders of 16065 * 512 = 8225280 bytes
>  > 
>  >    Device Boot   Start      End      Blocks   Id  System
>  > /dev/hda1            1      244     1959898+  83  Linux
>  > /dev/hda2          245      488     1959930   82  Linux swap / Solaris
>  > /dev/hda3          489      732     1959930   82  Linux swap / Solaris
>  > /dev/hda4          733     9729    72268402+   5  Extended
>  > /dev/hda5          733      976     1959898+  82  Linux swap / Solaris
>  > /dev/hda6          977     9729    70308441   88  Linux plaintext (*)
>  > 
>  > (*) dm-crypt :-)
> 
> Your swap partitions are outside of your lv's.

Right, then this could be the problem you encountered. However the swap
partitions are set up with dm-crypt including the partition I do resume
from so I'm using device mapper to resume which is quite close to LVM.

-- 
Andreas Steinmetz                       SPAMmers use robotrap@domdv.de

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 23:59           ` Dave Airlie
@ 2005-08-01  0:19             ` Linus Torvalds
  2005-08-01  0:44               ` Dave Airlie
  2005-08-01  7:01               ` Sanjoy Mahajan
  0 siblings, 2 replies; 58+ messages in thread
From: Linus Torvalds @ 2005-08-01  0:19 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Pavel Machek, ambx1, Hugh Dickins, Andrew Morton,
	Dominik Brodowski, Daniel Ritz, Linux Kernel Mailing List,
	Len Brown



On Mon, 1 Aug 2005, Dave Airlie wrote:
> 
> That still doesn't handle the case where a device has an interrupt
> handler on a shared IRQ and another device on the chain interrupts it
> after it has suspended its device,

I don't know why people bother arguing about this. Face the facts: we have 
to do things incrementally. Any design that breaks unmodified drivers is 
by _definition_ broken. You can't fix all drivers, and anybody who starts 
their arguments with "we should just fix drivers" is living in la-la-land.

Anyway.

My original PM suggestion handled that case perfectly well. The rule was 
to make "go to sleep" be a two-phase operation:

 - phase 1: save state, and possibly return errors
 - phase 2: turn off

where the second stage was done with interrupts disabled and atomically.

And the first phase was done without actually changing the state of the
device at all (so that if some device said "I can't do that, Dave", there
is no need to go back and wake anything up at all), and we could allocate 
memory freely, because the disk was still working etc etc.

For some totally inexplicable reason, the PM people never liked this, and 
ended up doing a single "power off" setup. Which was always known to be 
broken, and caused tons of problems, like the fact that save-to-disk had 
to wake up a device that had already been shut off etc.

So the fact that the PM layer was "simplified" to a single-phase setup 
causes a lot of problems, but hey, stupid is as stupid does. I've given up 
worrying about what crazy things the PM list comes up with, and instead I 
now have a hard rule: a patch that breaks some machine gets reverted.

Is that too hard to understand?

I'll repeat it again:

	A patch that breaks some machine gets reverted.

and maybe somebody on the PM list will one day understand what it means to
have slow and steady progress instead of dicking around with the idea of 
the week.

			Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-08-01  0:19             ` Linus Torvalds
@ 2005-08-01  0:44               ` Dave Airlie
  2005-08-01  1:07                 ` Linus Torvalds
  2005-08-01  7:01               ` Sanjoy Mahajan
  1 sibling, 1 reply; 58+ messages in thread
From: Dave Airlie @ 2005-08-01  0:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Machek, ambx1, Hugh Dickins, Andrew Morton,
	Dominik Brodowski, Daniel Ritz, Linux Kernel Mailing List,
	Len Brown

> On Mon, 1 Aug 2005, Dave Airlie wrote:
> >
> > That still doesn't handle the case where a device has an interrupt
> > handler on a shared IRQ and another device on the chain interrupts it
> > after it has suspended its device,
> 
> I don't know why people bother arguing about this. Face the facts: we have
> to do things incrementally. Any design that breaks unmodified drivers is
> by _definition_ broken. You can't fix all drivers, and anybody who starts
> their arguments with "we should just fix drivers" is living in la-la-land.
> 
>

I'm not arguing at all I agree with you, but I think you are missing
the point I'm trying to make,

You said earlier we only should fix drivers that need fixing, but they
all need fixing, I'm trying to see which way they should be fixed,
either the PM people say we need request/free irq pairs or say we need
to put support code in the interrupt handlers,

I fail to see how I can fix this very well incrementally, on my
hardware I have a yenta, my patch fixes it to work, on Hughs hardware
he has his yenta sharing IRQs with another driver which doesn't do the
request/free and it breaks, it isn't the yenta drivers fault the other
driver causes the issue, so therefore in order to apply the yenta fix
I need to go around and fix all the other drivers that might share an
interrupt with it before I can get patch accepted so that I don't
break someones machine? this is irrelevant to whether the ACPI link
change is in or not, adding the request/free pairs to one driver can
show up issues in other drivers that share that IRQ..

This has nothing to do with the issues with highlevel PM interfaces
for shutting down hardware, this is do with fixing the drivers in the
kernel currently and what the correct way to do it is without breaking
someone elses hardware....

Dave.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-08-01  0:44               ` Dave Airlie
@ 2005-08-01  1:07                 ` Linus Torvalds
  2005-08-01  7:15                   ` Pavel Machek
  0 siblings, 1 reply; 58+ messages in thread
From: Linus Torvalds @ 2005-08-01  1:07 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Pavel Machek, ambx1, Hugh Dickins, Andrew Morton,
	Dominik Brodowski, Daniel Ritz, Linux Kernel Mailing List,
	Len Brown



On Mon, 1 Aug 2005, Dave Airlie wrote:
> 
> You said earlier we only should fix drivers that need fixing, but they
> all need fixing

I think you're still talking from a theoretical standpoing, while all my 
arguments are practical.

In _practice_, I hope that

 (a) we don't see that very much (ie the people for whom things work 
     already are a strong argument that this is less of a problem in
     practice than people try to make it appear)

 (b) drivers, _especially_ on notebooks, are already able to handle an 
     incoming interrupt with the device in D3 state and returning 0xff
     for all reads.

     In particular, this is exactly the same thing that you get on a 
     surprise device removal too.

iow it really _really_ shouldn't matter that a shared interrupt comes in
after (or before) a device has gone to sleep. Because a driver that can't
handle that schenario is buggy for totally unrelated reasons, and doing a 
"free_irq()/request_irq()" pair at suspend time is _not_ the solution!

> I'm trying to see which way they should be fixed, either the PM people
> say we need request/free irq pairs or say we need to put support code in
> the interrupt handlers,

And I say that's not true. See (b) above. As far as I can tell, the 
"interrupt when in D3" really looks _exactly_ the same as "interrupt when 
device was just removed by the user", and nobody will hopefully argue that 
free_irq/request_irq can protect against forced removal?

> This has nothing to do with the issues with highlevel PM interfaces
> for shutting down hardware, this is do with fixing the drivers in the
> kernel currently and what the correct way to do it is without breaking
> someone elses hardware....

... and I don't think this has _anything_ to do with free_irq/request_irq, 
and everything to do with the fact that we can try to make at least the 
common drivers "hardened" for unexpected interrupts coming in when the hw 
might not be ready for them.

		Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 20:34 revert yenta free_irq on suspend ambx1
  2005-07-31 21:20 ` Pavel Machek
  2005-07-31 22:55 ` Linus Torvalds
@ 2005-08-01  1:59 ` Shaohua Li
  2005-08-01  2:06   ` Andrew Morton
  2 siblings, 1 reply; 58+ messages in thread
From: Shaohua Li @ 2005-08-01  1:59 UTC (permalink / raw)
  To: lkml
  Cc: Adam Belay, Linus Torvalds, Pavel Machek, Hugh Dickins,
	Andrew Morton, Dominik Brodowski, Daniel Ritz, Len Brown

Hi,
> In general, I think that calling free_irq is the right behavior.
> Although irqs changing after suspend is rare, there are also some
> more serious issues.  This has been discussed in the past, and a
> summary is as follows:
irqs actually isn't changed after suspend currently, it's a considering
for future usage like hotplug.
Calling free_irq actually isn't a complete ACPI issue, but ACPI requires
it to solve nasty 'sleep in atomic' warning. You will find such break
with swsusp without ACPI. Could we revert the ACPI change in Linus's
tree but keep it in -mm tree? So we get a chance to fix drivers.

Thanks,
Shaohua


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-08-01  1:59 ` Shaohua Li
@ 2005-08-01  2:06   ` Andrew Morton
  2005-08-01  2:22     ` Shaohua Li
                       ` (2 more replies)
  0 siblings, 3 replies; 58+ messages in thread
From: Andrew Morton @ 2005-08-01  2:06 UTC (permalink / raw)
  To: Shaohua Li
  Cc: linux-kernel, ambx1, torvalds, pavel, hugh, linux, daniel.ritz,
	len.brown

Shaohua Li <shaohua.li@intel.com> wrote:
>
> Hi,
> > In general, I think that calling free_irq is the right behavior.
> > Although irqs changing after suspend is rare, there are also some
> > more serious issues.  This has been discussed in the past, and a
> > summary is as follows:
>
> irqs actually isn't changed after suspend currently, it's a considering
> for future usage like hotplug.
> Calling free_irq actually isn't a complete ACPI issue, but ACPI requires
> it to solve nasty 'sleep in atomic' warning.

Is that the only problem?  If so, then surely we can make free_irq() run
happily with interrupts disabled: unlink the IRQ handler synchronously,
defer the /proc teardown or something like that.

> You will find such break
> with swsusp without ACPI. Could we revert the ACPI change in Linus's
> tree but keep it in -mm tree? So we get a chance to fix drivers.

That depends on the amount of brokenness involved: if it's significant then
I'll get a ton of bug reports concerning something which we already know is
broken and we'll drive away our long-suffering testers.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-08-01  2:06   ` Andrew Morton
@ 2005-08-01  2:22     ` Shaohua Li
  2005-08-01  7:19     ` Pavel Machek
  2005-08-01 21:38     ` Rafael J. Wysocki
  2 siblings, 0 replies; 58+ messages in thread
From: Shaohua Li @ 2005-08-01  2:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, ambx1, torvalds, pavel, hugh, linux, daniel.ritz,
	len.brown

On Sun, 2005-07-31 at 19:06 -0700, Andrew Morton wrote:
> Shaohua Li <shaohua.li@intel.com> wrote:
> >
> > Hi,
> > > In general, I think that calling free_irq is the right behavior.
> > > Although irqs changing after suspend is rare, there are also some
> > > more serious issues.  This has been discussed in the past, and a
> > > summary is as follows:
> >
> > irqs actually isn't changed after suspend currently, it's a considering
> > for future usage like hotplug.
> > Calling free_irq actually isn't a complete ACPI issue, but ACPI requires
> > it to solve nasty 'sleep in atomic' warning.
> 
> Is that the only problem?  If so, then surely we can make free_irq() run
> happily with interrupts disabled: unlink the IRQ handler synchronously,
> defer the /proc teardown or something like that.
The problem is we are going to use ACPI interpreter with interrupt
disabled. The interpreter itself might use kmalloc, semaphore, iomap,
msleep and etc, depends on BIOS. Fixing interpreter is hard. Originally
we think to introduce a new system state for suspend/resume to avoid
warning, but later we found drivers calling pci_disable_irq/free_irq is
better and safer not just for the issue at hand. We might reconsider
previous option (a new system state) if free_irq is rejected.

Thanks,
Shaohua


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
@ 2005-08-01  3:03 ambx1
  2005-08-01  4:53 ` Linus Torvalds
  2005-08-01  8:49 ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 58+ messages in thread
From: ambx1 @ 2005-08-01  3:03 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Airlie, Pavel Machek, Hugh Dickins, Andrew Morton,
	Dominik Brodowski, Daniel Ritz, Linux Kernel Mailing List,
	Len Brown

----- Original Message -----
From: Linus Torvalds <torvalds@osdl.org>
Date: Sunday, July 31, 2005 9:07 pm
Subject: Re: revert yenta free_irq on suspend

> 
> 
> On Mon, 1 Aug 2005, Dave Airlie wrote:
> > 
> > You said earlier we only should fix drivers that need fixing, but 
> they> all need fixing
> 
> I think you're still talking from a theoretical standpoing, while 
> all my 
> arguments are practical.
> 
> In _practice_, I hope that
> 
> (a) we don't see that very much (ie the people for whom things 
> work 
>     already are a strong argument that this is less of a problem in
>     practice than people try to make it appear)

They may not always be triggered, but they are real bugs.  We can't
ensure stable PM until they are fixed.  If they weren't a real issue,
then calling free_irq() in pcmcia probably wouldn't have broken the
sound driver we saw in this case.

> 
> (b) drivers, _especially_ on notebooks, are already able to handle 
> an 
>     incoming interrupt with the device in D3 state and returning 0xff
>     for all reads.
> 
>     In particular, this is exactly the same thing that you get on 
> a 
>     surprise device removal too.
> 
> iow it really _really_ shouldn't matter that a shared interrupt 
> comes in
> after (or before) a device has gone to sleep. Because a driver that 
> can'thandle that schenario is buggy for totally unrelated reasons, 
> and doing a 
> "free_irq()/request_irq()" pair at suspend time is _not_ the solution!

I understand this.  But calling free_irq()/request_irq() still seems
to make sense.  It's the cleanest and most straightforward approach.
It's the easiest way we can ensure there will not be race conditions.
An interrupt could be triggered during the device's power transition
when it's in between on and off.  More importantly, we need this change
for runtime power management anyway.

> 
> > This has nothing to do with the issues with highlevel PM interfaces
> > for shutting down hardware, this is do with fixing the drivers in 
> the> kernel currently and what the correct way to do it is without 
> breaking> someone elses hardware....
> 
> ... and I don't think this has _anything_ to do with 
> free_irq/request_irq, 
> and everything to do with the fact that we can try to make at least 
> the 
> common drivers "hardened" for unexpected interrupts coming in when 
> the hw 
> might not be ready for them.

We either need to change every driver to free irqs or "harden" each
of them.  Freeing irqs obviously seems easier.  I propose we make
this change in -mm in one pass to avoid bugs like this.

Also, as I said earlier, the better we support OSPM initiated power
management, the more likely APM will break.  This may be technically
unavoidable on some isolated boxes without quirks.  I agree with
Pavel that "do nothing" may make sense, but it seems some devices
may still need to be disabled by the OS.  As a real world example,
we currently can't turn off cardbus bridges because it breaks APM
on a couple of older laptops.

Thanks,
Adam


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-08-01  3:03 ambx1
@ 2005-08-01  4:53 ` Linus Torvalds
  2005-08-01  8:49 ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 58+ messages in thread
From: Linus Torvalds @ 2005-08-01  4:53 UTC (permalink / raw)
  To: ambx1
  Cc: Dave Airlie, Pavel Machek, Hugh Dickins, Andrew Morton,
	Dominik Brodowski, Daniel Ritz, Linux Kernel Mailing List,
	Len Brown



On Sun, 31 Jul 2005 ambx1@neo.rr.com wrote:
> 
> We either need to change every driver to free irqs or "harden" each
> of them.

No. No "either". 

Drivers need to be safe from the hw going away, whether it be physically 
or because it got shut down. 

>  Freeing irqs obviously seems easier.

No.

Freeing irq's DOES NOT HELP.

Listen to me. You need the hardening of the driver anyway for the hotplug 
case. Freeing irq's doesn't do anything for it, it's just shuffling the 
real problem under the carpet.

So fix the damn problem _right_, and suddenly freeing the irq doesn't make 
any difference at all. It's just unnecessary and incorrect complexity.

			Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-08-01  0:19             ` Linus Torvalds
  2005-08-01  0:44               ` Dave Airlie
@ 2005-08-01  7:01               ` Sanjoy Mahajan
  1 sibling, 0 replies; 58+ messages in thread
From: Sanjoy Mahajan @ 2005-08-01  7:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Airlie, Pavel Machek, ambx1, Hugh Dickins, Andrew Morton,
	Dominik Brodowski, Daniel Ritz, Linux Kernel Mailing List,
	Len Brown

> slow and steady progress

The oscillations are indeed discouraging.  For S3 sleep/wake on my TP
600X:

2.6.11.4        : works well (the console was hosed with jittering text, but
                  X restores fine), which hugely improved using my laptop.
2.6.12.3        : ditto

But:

2.6.13-rc3      : goes to sleep but hangs when waking up
      -rc3-mm2  : same
      -rc3-mm3  : same
      -rc4-git3 : same

With those 2.6.13 variants, once it has hung, the power switch will
turn it off, but to turn it on I first have to unplug the power and
remove both batteries.

On the good side, swsusp seems to be improving with later kernel
versions.  No luck with 2.6.11.4; reasonable luck with 2.6.12.3; and
good results with 2.6.13*.

-Sanjoy

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-08-01  1:07                 ` Linus Torvalds
@ 2005-08-01  7:15                   ` Pavel Machek
  0 siblings, 0 replies; 58+ messages in thread
From: Pavel Machek @ 2005-08-01  7:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Airlie, ambx1, Hugh Dickins, Andrew Morton,
	Dominik Brodowski, Daniel Ritz, Linux Kernel Mailing List,
	Len Brown

Hi!

> > You said earlier we only should fix drivers that need fixing, but they
> > all need fixing
> 
> I think you're still talking from a theoretical standpoing, while all my 
> arguments are practical.
> 
> In _practice_, I hope that
> 
>  (a) we don't see that very much (ie the people for whom things work 
>      already are a strong argument that this is less of a problem in
>      practice than people try to make it appear)
> 
>  (b) drivers, _especially_ on notebooks, are already able to handle an 
>      incoming interrupt with the device in D3 state and returning 0xff
>      for all reads.
> 
>      In particular, this is exactly the same thing that you get on a 
>      surprise device removal too.

Not all devices that we want power-managed are hot-pluggable, even on
notebooks. If I need to go all over the tree fixing drivers, I'd much
rather add free_irq/request_irq than mess with interrupt routines on
hardware I don't have...
								Pavel
-- 
if you have sharp zaurus hardware you don't need... you know my address

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-08-01  2:06   ` Andrew Morton
  2005-08-01  2:22     ` Shaohua Li
@ 2005-08-01  7:19     ` Pavel Machek
  2005-08-01 21:38     ` Rafael J. Wysocki
  2 siblings, 0 replies; 58+ messages in thread
From: Pavel Machek @ 2005-08-01  7:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Shaohua Li, linux-kernel, ambx1, torvalds, hugh, linux,
	daniel.ritz, len.brown

Hi!

> > > In general, I think that calling free_irq is the right behavior.
> > > Although irqs changing after suspend is rare, there are also some
> > > more serious issues.  This has been discussed in the past, and a
> > > summary is as follows:
> >
> > irqs actually isn't changed after suspend currently, it's a considering
> > for future usage like hotplug.
> > Calling free_irq actually isn't a complete ACPI issue, but ACPI requires
> > it to solve nasty 'sleep in atomic' warning.
> 
> Is that the only problem?  If so, then surely we can make free_irq() run
> happily with interrupts disabled: unlink the IRQ handler synchronously,
> defer the /proc teardown or something like that.

No, the problem is that

a) restoring interrupt links needs interrupts enabled [or rewriting
half of ACPI interpretter]

b) to solve a) [and to solve other stuff, too], we need
free_irq/request_irq all over the tree.

> > You will find such break
> > with swsusp without ACPI. Could we revert the ACPI change in Linus's
> > tree but keep it in -mm tree? So we get a chance to fix drivers.
> 
> That depends on the amount of brokenness involved: if it's significant then
> I'll get a ton of bug reports concerning something which we already know is
> broken and we'll drive away our long-suffering testers.

The amount of brokenness is not that bad, and it fixes some machines,
too.
									Pavel
-- 
if you have sharp zaurus hardware you don't need... you know my address

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 23:44         ` Linus Torvalds
  2005-07-31 23:59           ` Dave Airlie
@ 2005-08-01  7:25           ` Pavel Machek
  1 sibling, 0 replies; 58+ messages in thread
From: Pavel Machek @ 2005-08-01  7:25 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: ambx1, Hugh Dickins, Andrew Morton, Dominik Brodowski,
	Daniel Ritz, Linux Kernel Mailing List, Len Brown

Hi!

> > > Why do it _ever_? There is _zero_ upside to doing it, I don't see why you 
> > > want to.
> > 
> > Being able to turn off your soundcard at runtime when you are not
> > using it was one of examples...
> 
> I meant the "ACPI restores irq controller state" thing.
> 
> Just leave it in. There's never any downside. If all the drivers end up 
> doing free_irq/request_irq(), restoring the irq controller state still 
> won't have any negative effect, and it solves the case where you have 
> drivers that don't do it.
> 
> > > Just make ACPI restore the dang thing. It's the right thing to do.
> > 
> > Requires interpretter running with interrupts disabled => ugly :-(.
> 
> I don't see that. What's ugly? I agree that ACPI is ugly, but I do _not_ 
> agree that it's ugly to restore irq controller state with interrupts 
> disabled. It MakesSense(tm) to do so.

ACPI people claim it would mean rewritting lot of code in
interpretter. And they are probably right: restoring irq states is
done via running ACPI AML code, and it may ask you to grab semaphore;
which is something you can't do with interrupts disabled.

So you'd have to teach ACPI interpretter completely different "irq
disabled" mode, and even then it is not clear what to do if AML code
asks you for some "sleeping operation".

[Hey, it should be Len doing this debate, not me. I hope I'll never
have to touch ACPI interpretter.]
								Pavel
-- 
if you have sharp zaurus hardware you don't need... you know my address

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-08-01  3:03 ambx1
  2005-08-01  4:53 ` Linus Torvalds
@ 2005-08-01  8:49 ` Benjamin Herrenschmidt
  2005-08-02 10:56   ` Pavel Machek
  1 sibling, 1 reply; 58+ messages in thread
From: Benjamin Herrenschmidt @ 2005-08-01  8:49 UTC (permalink / raw)
  To: abelay
  Cc: Len Brown, Linux Kernel Mailing List, Daniel Ritz,
	Dominik Brodowski, Andrew Morton, Hugh Dickins, Pavel Machek,
	Dave Airlie, Linus Torvalds

On Sun, 2005-07-31 at 23:03 -0400, ambx1@neo.rr.com wrote:

> Also, as I said earlier, the better we support OSPM initiated power
> management, the more likely APM will break.  This may be technically
> unavoidable on some isolated boxes without quirks.  I agree with
> Pavel that "do nothing" may make sense, but it seems some devices
> may still need to be disabled by the OS.  As a real world example,
> we currently can't turn off cardbus bridges because it breaks APM
> on a couple of older laptops.

Won't freeing of IRQs cause problems with things like handhelds that
actually rely on an interrupt to wake up ?

Ben.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31  5:03 Brown, Len
                   ` (2 preceding siblings ...)
  2005-07-31 22:27 ` Dave Jones
@ 2005-08-01  8:51 ` Matthew Garrett
  3 siblings, 0 replies; 58+ messages in thread
From: Matthew Garrett @ 2005-08-01  8:51 UTC (permalink / raw)
  To: Brown, Len
  Cc: linux-kernel, Russell King, Hugh Dickins, Andrew Morton,
	Dominik Brodowski, Daniel Ritz, torvalds

Brown, Len <len.brown@intel.com> wrote:

> But that believe would be total fantasy -- supsend/resume is not
> working on a large number of machines, and no distro is currently
> able to support it.  (I'm talking about S3 suspend to RAM primarily,
> suspend to disk is less interesting -- though Red Hat doesn't
> even support _that_)

It's still failing on a large number of machines, but it's working on a
larger set. We (Ubuntu) shipped with suspend to RAM support in our
previous release. Frankly, at this stage the three biggest problems are:

a) resuming video (not going to be fixed in the ACPI core)
b) SATA resume (not going to be fixed in the ACPI core)
c) Motherboard chipsets that don't seem to be programmed to handle
memory refresh (not going to be fixed in the ACPI core)

People /do/ use this code, and breaking a large number of working setups
in order to fix a bug that appears on a small number of setups (and can
be worked around in a rather less invasive way) isn't sensible.
-- 
Matthew Garrett | mjg59-chiark.mail.linux-rutgers.kernel@srcf.ucam.org

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 21:20 ` Pavel Machek
@ 2005-08-01  8:56   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 58+ messages in thread
From: Benjamin Herrenschmidt @ 2005-08-01  8:56 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Len Brown, Linux Kernel Mailing List, Daniel Ritz,
	Dominik Brodowski, Andrew Morton, Hugh Dickins, Linus Torvalds

On Sun, 2005-07-31 at 23:20 +0200, Pavel Machek wrote:
> Hi!
> 
> > Also I'd like to point out that this patch broke APM suspend-to-ram,
> > not ACPI S3.  IMO, it may not be possible to support both APM and ACPI
> > on every system, as their specs are not intended to be compatible.
> > Progress toward proper suspend-to-ram support will, in many cases, be
> > a small setback for APM.  This really can't be avoided.
> 
> Actually, for APM, OS theoretically does *not* need to FREEZE the
> devices (or do anything else). "Doing nothing" should be easy...

Bullshit. See what happens if you try to APM suspend while a IDE DMA
transfer is in progress ... 

Ben.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-30 21:10       ` Linus Torvalds
  2005-07-30 21:30         ` Russell King
  2005-07-30 22:28         ` Rafael J. Wysocki
@ 2005-08-01  9:06         ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 58+ messages in thread
From: Benjamin Herrenschmidt @ 2005-08-01  9:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Russell King, Hugh Dickins, Andrew Morton, Dominik Brodowski,
	Daniel Ritz, linux-kernel

On Sat, 2005-07-30 at 14:10 -0700, Linus Torvalds wrote:
> 
> On Sat, 30 Jul 2005, Russell King wrote:
> > 
> > I don't think so - I believe one of the problem cases is where you
> > have a screaming interrupt caused by an improperly setup device.
> 
> Not a problem.
> 
> The thing is, this is trivially solved by
>  - disable irq controller last on shutdown
>  - re-enable irq controller last on resume
> 
> Think about it. Even if you have screaming irq's (and thus we'll shut
> things down somewhere during the resume), when we then get to re-enable
> the irq controller thing, we'll have them all back again at that point.
> Problem solved.

That is nasty. That means every single driver must be careful not to
ever "wait" for the device interrupt to happen at resume. I'm not
talking about adding a timeout or such (that should be done anyway to
deal with hot unplug) but having something like IDE -> queues wakeup
request and blocks til it's dealt with. That sort of thing. It's
annoying, it means no driver can rely on IRQs actually working in their
resume callback. Not nice.

Also, once a driver has woken up, it doesn't really have an idea that
the rest of the system is not yet fully up (or that it actually is), and
thus doesn't really know if interrupts are still globally off or not if
we implement your scheme. That may cause nasty effects too I suppose...

> You can have variations on this, of course - you can enable the irq
> controller early _and_ late in the resume process. Ie do a minimal "get
> the basics working" early - you might want to make sure that timers etc
> work early on, for example, and then a "fix up the details" thing late.
> 
> An interrupt controller is clearly a special case, so it's worth spending 
> some effort on handling it.
> 
> In contrast, what is _not_ worth doing is screweing over every single
> driver we have.

I'm not fan of the free/request irq on suspend & resume for various
reasons, and I don't think it actually fixes the problem discussed
anyway because of shared interrupts, but heh, the problem is real on x86
it seems (BIOS crap !). Not sure what the best fix is.

Ben.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 20:15           ` Rafael J. Wysocki
@ 2005-08-01 20:34             ` Hugh Dickins
  2005-08-01 21:54               ` Rafael J. Wysocki
  0 siblings, 1 reply; 58+ messages in thread
From: Hugh Dickins @ 2005-08-01 20:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-kernel, Linus Torvalds, Andrew Morton, Dominik Brodowski,
	Daniel Ritz

On Sun, 31 Jul 2005, Rafael J. Wysocki wrote:
> On Sunday, 31 of July 2005 01:09, Rafael J. Wysocki wrote:
> > 
> > Linus has apparently dropped that patch for yenta, but in case it is
> > reintroduced in the future you will probably need a patch to make the network
> > driver cooperate.  I'll try to prepare one tomorrow, if I can, but I have no hardware
> > to test it.
> 
> The patch follows.  It compiles and should work, though I haven't tested it.

Thanks for making the effort, Rafael,
but I'm afraid your patch does not solve it.

Prior to -rc4, or in current -git which has the yenta patch reverted,
my laptop manages APM resume from RAM with the following 8 messages
(I won't complain that it could list even more permutations!)

PCI: Found IRQ 11 for device 0000:00:1f.1
PCI: Sharing IRQ 11 with 0000:02:00.0
PCI: Found IRQ 11 for device 0000:02:00.0
PCI: Sharing IRQ 11 with 0000:00:1f.1
PCI: Found IRQ 11 for device 0000:02:01.0
PCI: Sharing IRQ 11 with 0000:02:01.1
PCI: Found IRQ 11 for device 0000:02:01.1
PCI: Sharing IRQ 11 with 0000:02:01.0

Unpatched -rc4 locks up on resume, showing none of those messages.
-rc4 with your drivers/net/3c59x.c patch locks up on resume,
after showing just the first four of those messages.

Whatever, I very much share the position Linus has expressed so
forcefully: it's foolish suddenly to demand changes in an indeterminate
number of drivers (surely yenta and 3c59x aren't the end of it?),
especially in the final days leading up to a release.

I surely would not have asked him to revert the yenta patch, nor would
he have done so (thank you, Linus), if my machine were the only problem.
It's very easy for me to carry my own patches to get working, but we
fear the trouble seen here gives a foretaste of others' trouble if
the changes were to remain in the release.

As to whether the changes should be retained in -mm, I simply don't
have an appreciation of the scope of the problems either way,
to have a worthwhile opinion.

Hugh

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-08-01  2:06   ` Andrew Morton
  2005-08-01  2:22     ` Shaohua Li
  2005-08-01  7:19     ` Pavel Machek
@ 2005-08-01 21:38     ` Rafael J. Wysocki
  2 siblings, 0 replies; 58+ messages in thread
From: Rafael J. Wysocki @ 2005-08-01 21:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Shaohua Li, linux-kernel, ambx1, torvalds, pavel, hugh, linux,
	daniel.ritz, len.brown

Hi,

On Monday, 1 of August 2005 04:06, Andrew Morton wrote:
> Shaohua Li <shaohua.li@intel.com> wrote:
> >
> > Hi,
> > > In general, I think that calling free_irq is the right behavior.
> > > Although irqs changing after suspend is rare, there are also some
> > > more serious issues.  This has been discussed in the past, and a
> > > summary is as follows:
> >
> > irqs actually isn't changed after suspend currently, it's a considering
> > for future usage like hotplug.
> > Calling free_irq actually isn't a complete ACPI issue, but ACPI requires
> > it to solve nasty 'sleep in atomic' warning.
> 
> Is that the only problem?  If so, then surely we can make free_irq() run
> happily with interrupts disabled: unlink the IRQ handler synchronously,
> defer the /proc teardown or something like that.
> 
> > You will find such break
> > with swsusp without ACPI. Could we revert the ACPI change in Linus's
> > tree but keep it in -mm tree? So we get a chance to fix drivers.
> 
> That depends on the amount of brokenness involved: if it's significant then
> I'll get a ton of bug reports concerning something which we already know is
> broken and we'll drive away our long-suffering testers.

Well, IMO there's no such danger.  ;-)  The relevent ACPI change has
been in -mm since 2.6.12-rc1-mm1 and _nobody_ except for me has reported
_any_ problems with it. :-)

OTOH, if the change is kept in -mm, we hopefully will be able to identify
the drivers that need to be converted first, on the basis of the bug reports.

And the testers need not be driven away if the issues they report are fixed
in a timely manner, which is quite simple in this particuar case, because we
know what's potentially broken, we know how to fix it, and the fix is quite not
that complicated.

Greets,
Rafael


-- 
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
		-- Lewis Carroll "Alice's Adventures in Wonderland"

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-08-01 20:34             ` Hugh Dickins
@ 2005-08-01 21:54               ` Rafael J. Wysocki
  0 siblings, 0 replies; 58+ messages in thread
From: Rafael J. Wysocki @ 2005-08-01 21:54 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: linux-kernel, Linus Torvalds, Andrew Morton, Dominik Brodowski,
	Daniel Ritz

On Monday, 1 of August 2005 22:34, Hugh Dickins wrote:
> On Sun, 31 Jul 2005, Rafael J. Wysocki wrote:
> > On Sunday, 31 of July 2005 01:09, Rafael J. Wysocki wrote:
> > > 
> > > Linus has apparently dropped that patch for yenta, but in case it is
> > > reintroduced in the future you will probably need a patch to make the network
> > > driver cooperate.  I'll try to prepare one tomorrow, if I can, but I have no hardware
> > > to test it.
> > 
> > The patch follows.  It compiles and should work, though I haven't tested it.
> 
> Thanks for making the effort, Rafael,
> but I'm afraid your patch does not solve it.
> 
> Prior to -rc4, or in current -git which has the yenta patch reverted,
> my laptop manages APM resume from RAM with the following 8 messages
> (I won't complain that it could list even more permutations!)
> 
> PCI: Found IRQ 11 for device 0000:00:1f.1
> PCI: Sharing IRQ 11 with 0000:02:00.0
> PCI: Found IRQ 11 for device 0000:02:00.0
> PCI: Sharing IRQ 11 with 0000:00:1f.1
> PCI: Found IRQ 11 for device 0000:02:01.0
> PCI: Sharing IRQ 11 with 0000:02:01.1
> PCI: Found IRQ 11 for device 0000:02:01.1
> PCI: Sharing IRQ 11 with 0000:02:01.0
> 
> Unpatched -rc4 locks up on resume, showing none of those messages.
> -rc4 with your drivers/net/3c59x.c patch locks up on resume,
> after showing just the first four of those messages.

Thanks for testing.  The results you observe mean that the problem is
in fact more complicated than I thought.  It seems to make up a good
test case but I wouldn't like to bother you any more. :-)

> Whatever, I very much share the position Linus has expressed so
> forcefully: it's foolish suddenly to demand changes in an indeterminate
> number of drivers (surely yenta and 3c59x aren't the end of it?),
> especially in the final days leading up to a release.

Fully agreed.

> I surely would not have asked him to revert the yenta patch, nor would
> he have done so (thank you, Linus), if my machine were the only problem.
> It's very easy for me to carry my own patches to get working, but we
> fear the trouble seen here gives a foretaste of others' trouble if
> the changes were to remain in the release.

Indeed.

Greets,
Rafael
 

-- 
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
		-- Lewis Carroll "Alice's Adventures in Wonderland"

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-08-01  8:49 ` Benjamin Herrenschmidt
@ 2005-08-02 10:56   ` Pavel Machek
  2005-08-03 11:42     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 58+ messages in thread
From: Pavel Machek @ 2005-08-02 10:56 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: abelay, Len Brown, Linux Kernel Mailing List, Daniel Ritz,
	Dominik Brodowski, Andrew Morton, Hugh Dickins, Dave Airlie,
	Linus Torvalds

Hi!

> > Also, as I said earlier, the better we support OSPM initiated power
> > management, the more likely APM will break.  This may be technically
> > unavoidable on some isolated boxes without quirks.  I agree with
> > Pavel that "do nothing" may make sense, but it seems some devices
> > may still need to be disabled by the OS.  As a real world example,
> > we currently can't turn off cardbus bridges because it breaks APM
> > on a couple of older laptops.
> 
> Won't freeing of IRQs cause problems with things like handhelds that
> actually rely on an interrupt to wake up ?

Well, you probably don't want to free IRQ that is used for wakeup; but
if driver is used for wakeup, it probably needs some special handling,
anyway (right?).
								Pavel
-- 
teflon -- maybe it is a trademark, but it should not be.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-07-31 22:27 ` Dave Jones
  2005-08-01  0:00   ` Andreas Steinmetz
@ 2005-08-03  9:23   ` Pavel Machek
  1 sibling, 0 replies; 58+ messages in thread
From: Pavel Machek @ 2005-08-03  9:23 UTC (permalink / raw)
  To: Dave Jones, Brown, Len, Linus Torvalds, Rafael J. Wysocki,
	linux-kernel, Russell King, Hugh Dickins, Andrew Morton,
	Dominik Brodowski, Daniel Ritz

Hi!

>  > But that believe would be total fantasy -- supsend/resume is not
>  > working on a large number of machines, and no distro is currently
>  > able to support it.  (I'm talking about S3 suspend to RAM primarily,
>  > suspend to disk is less interesting -- though Red Hat doesn't
>  > even support _that_)
> 
> After the 'swsusp works just fine' lovefest at OLS, I spent a little
> while playing with the current in-tree swsusp implementation last week.
> 
> The outcome: I'm no more enthusiastic about enabling this in Red Hat
> kernels than I ever was before.  It seems to have real issues with LVM
> setups (which is default on Red Hat/Fedora installs these days).
> After convincing it where to suspend/resume from by feeding it
> the major/minor of my swap partition, it did actually seem
> to suspend. And resume (though it did spew lots of 'sleeping whilst
> atomic warnings, but thats trivial compared to whats coming up
> next).

I do not know much about LVM. How exactly did you resume= command line
look like? You were not resuming from initrd, right?

You did not boot *anything* between suspend and resume, right?

> I rebooted, and fsck found all sorts of damage on my / partition.
> After spending 30 minutes pressing 'y', to fix things up, it failed
> to boot after lots of files were missing.
> Why it wrote anything to completely different lv to where I told it
> (and yes, I did get the major:minor right) I have no idea, but
> as it stands, it definitly isn't production-ready.

Could you try to suspend on plain old swap-partition, first, to verify
that your drivers cooperate?
								Pavel
-- 
teflon -- maybe it is a trademark, but it should not be.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: revert yenta free_irq on suspend
  2005-08-02 10:56   ` Pavel Machek
@ 2005-08-03 11:42     ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 58+ messages in thread
From: Benjamin Herrenschmidt @ 2005-08-03 11:42 UTC (permalink / raw)
  To: Pavel Machek
  Cc: abelay, Len Brown, Linux Kernel Mailing List, Daniel Ritz,
	Dominik Brodowski, Andrew Morton, Hugh Dickins, Dave Airlie,
	Linus Torvalds

On Tue, 2005-08-02 at 12:56 +0200, Pavel Machek wrote:
> Hi!
> 
> > > Also, as I said earlier, the better we support OSPM initiated power
> > > management, the more likely APM will break.  This may be technically
> > > unavoidable on some isolated boxes without quirks.  I agree with
> > > Pavel that "do nothing" may make sense, but it seems some devices
> > > may still need to be disabled by the OS.  As a real world example,
> > > we currently can't turn off cardbus bridges because it breaks APM
> > > on a couple of older laptops.
> > 
> > Won't freeing of IRQs cause problems with things like handhelds that
> > actually rely on an interrupt to wake up ?
> 
> Well, you probably don't want to free IRQ that is used for wakeup; but
> if driver is used for wakeup, it probably needs some special handling,
> anyway (right?).

Not necessarily something the driver itself knows about ... For example,
some platforms do a hardware OR between PME and INTA ...

Ben.


^ permalink raw reply	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2005-08-03 11:49 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-07-31 20:34 revert yenta free_irq on suspend ambx1
2005-07-31 21:20 ` Pavel Machek
2005-08-01  8:56   ` Benjamin Herrenschmidt
2005-07-31 22:55 ` Linus Torvalds
2005-07-31 23:05   ` Pavel Machek
2005-07-31 23:24     ` Linus Torvalds
2005-07-31 23:27       ` Pavel Machek
2005-07-31 23:44         ` Linus Torvalds
2005-07-31 23:59           ` Dave Airlie
2005-08-01  0:19             ` Linus Torvalds
2005-08-01  0:44               ` Dave Airlie
2005-08-01  1:07                 ` Linus Torvalds
2005-08-01  7:15                   ` Pavel Machek
2005-08-01  7:01               ` Sanjoy Mahajan
2005-08-01  7:25           ` Pavel Machek
2005-07-31 23:10   ` Dave Airlie
2005-08-01  1:59 ` Shaohua Li
2005-08-01  2:06   ` Andrew Morton
2005-08-01  2:22     ` Shaohua Li
2005-08-01  7:19     ` Pavel Machek
2005-08-01 21:38     ` Rafael J. Wysocki
  -- strict thread matches above, loose matches on Subject: below --
2005-08-01  3:03 ambx1
2005-08-01  4:53 ` Linus Torvalds
2005-08-01  8:49 ` Benjamin Herrenschmidt
2005-08-02 10:56   ` Pavel Machek
2005-08-03 11:42     ` Benjamin Herrenschmidt
2005-07-31  5:03 Brown, Len
2005-07-31  5:31 ` Linus Torvalds
2005-07-31  9:49 ` Rafael J. Wysocki
2005-07-31 22:27 ` Dave Jones
2005-08-01  0:00   ` Andreas Steinmetz
2005-08-01  0:06     ` Dave Jones
2005-08-01  0:09       ` Andreas Steinmetz
2005-08-03  9:23   ` Pavel Machek
2005-08-01  8:51 ` Matthew Garrett
2005-07-30 19:10 Hugh Dickins
2005-07-30 20:03 ` Russell King
2005-07-30 20:36   ` Linus Torvalds
2005-07-30 20:54     ` Russell King
2005-07-30 21:10       ` Linus Torvalds
2005-07-30 21:30         ` Russell King
2005-07-30 22:28         ` Rafael J. Wysocki
2005-07-31  4:49           ` Linus Torvalds
2005-08-01  9:06         ` Benjamin Herrenschmidt
2005-07-30 21:20       ` Rafael J. Wysocki
2005-07-30 20:34 ` Linus Torvalds
2005-07-31 13:29   ` Pavel Machek
2005-07-31 15:53     ` Linus Torvalds
2005-07-31 17:09       ` Linus Torvalds
2005-07-30 20:49 ` Rafael J. Wysocki
2005-07-30 21:08   ` Daniel Ritz
2005-07-30 21:32   ` Hugh Dickins
2005-07-30 22:00     ` Rafael J. Wysocki
2005-07-30 22:24       ` Hugh Dickins
2005-07-30 23:09         ` Rafael J. Wysocki
2005-07-31 20:15           ` Rafael J. Wysocki
2005-08-01 20:34             ` Hugh Dickins
2005-08-01 21:54               ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox