* Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
[not found] <200712231419.40207.carlos@strangeworlds.co.uk>
@ 2007-12-23 16:30 ` Rafael J. Wysocki
2007-12-23 17:57 ` Ingo Molnar
` (2 more replies)
2007-12-23 17:53 ` [Bug 9528] " Linus Torvalds
1 sibling, 3 replies; 29+ messages in thread
From: Rafael J. Wysocki @ 2007-12-23 16:30 UTC (permalink / raw)
To: Carlos Corbacho
Cc: linux-kernel, Linus Torvalds, Greg KH, Ingo Molnar,
Thomas Gleixner, Len Brown, Andrew Morton
On Sunday, 23 of December 2007, Carlos Corbacho wrote:
> Fix suspend-to-RAM on nForce 4 (CK804) boards by increasing
> PCIBIOS_MIN_IO.
>
> Fixes kernel bugzilla #9528
>
> Problem:
>
> Linus' patch (52ade9b3b97fd3bea42842a056fe0786c28d0555) to re-order
> suspend (and fix fall out from Rafael's earlier suspend reordering work)
> broke suspend-to-RAM on nForce 4 (CK804) boards.
>
> Why:
>
> After debugging _PTS() in the DSDT, it turns out these nVidia boards are
> trying to write to an IO port > 0x1000 (0x142E) during suspend. Before the
> re-ordering, we got away with this.
>
> After the aforementioned commit, we started hitting the PCIBIOS_MIN_IO
> limit and suspend then broke on these machines (the machine simply hangs
> when it reaches the 0x142E IO port write during suspend-to-RAM).
>
> There was some previous work in the PCIBIOS_MIN_IO area over two years ago
> (71db63acff69618b3d9d3114bd061938150e146b) which bumped this to 0x4000,
> but this was reverted (2ba84684e8cf6f980e4e95a2300f53a505eb794e) after
> causing new and entirely different problems on another nForce board.
>
> 0x1500 has been picked here as a nice, round and more conservative value
> than 0x4000, and which covers 0x142E.
The patch is fine by me, so if anyone has objections, please speak up.
Thanks,
Rafael
> Tested on x86-64.
>
> Signed-off-by: Carlos Corbacho <carlos@strangeworlds.co.uk>
> CC: Rafael J. Wysocki <rjw@sisk.pl>
> CC: Linus Torvalds <torvalds@linux-foundation.org>
> CC: Greg KH <gregkh@suse.de>
> CC: Ingo Molnar <mingo@elte.hu>
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Len Brown <lenb@kernel.org>
> ---
> Since it's not entirely clear who is responsible for what in this file,
> and given what it fixes, I'm CC'ing you all in the hope that someone
> will handle this.
>
> include/asm-x86/pci.h | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
>
> diff --git a/include/asm-x86/pci.h b/include/asm-x86/pci.h
> index e883619..03cb123 100644
> --- a/include/asm-x86/pci.h
> +++ b/include/asm-x86/pci.h
> @@ -46,7 +46,7 @@ extern unsigned int pcibios_assign_all_busses(void);
> #define pcibios_scan_all_fns(a, b) 0
>
> extern unsigned long pci_mem_start;
> -#define PCIBIOS_MIN_IO 0x1000
> +#define PCIBIOS_MIN_IO 0x1500
> #define PCIBIOS_MIN_MEM (pci_mem_start)
>
> #define PCIBIOS_MIN_CARDBUS_IO 0x4000
* Re: [Bug 9528] x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
[not found] <200712231419.40207.carlos@strangeworlds.co.uk>
2007-12-23 16:30 ` x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM Rafael J. Wysocki
@ 2007-12-23 17:53 ` Linus Torvalds
2007-12-23 17:58 ` Linus Torvalds
` (2 more replies)
1 sibling, 3 replies; 29+ messages in thread
From: Linus Torvalds @ 2007-12-23 17:53 UTC (permalink / raw)
To: Carlos Corbacho
Cc: Linux Kernel Mailing List, Rafael J. Wysocki, Greg KH,
Ingo Molnar, Thomas Gleixner, Len Brown, bugme-daemon
On Sun, 23 Dec 2007, Carlos Corbacho wrote:
>
> Fix suspend-to-RAM on nForce 4 (CK804) boards by increasing
> PCIBIOS_MIN_IO.
>
> Fixes kernel bugzilla #9528
>
> Problem:
>
> Linus' patch (52ade9b3b97fd3bea42842a056fe0786c28d0555) to re-order
> suspend (and fix fall out from Rafael's earlier suspend reordering work)
> broke suspend-to-RAM on nForce 4 (CK804) boards.
>
> Why:
>
> After debugging _PTS() in the DSDT, it turns out these nVidia boards are
> trying to write to an IO port > 0x1000 (0x142E) during suspend. Before the
> re-ordering, we got away with this.
Very interesting.
HOWEVER.
I'd much rather figure out what the magic IO resource is that clashes.
It's almost certainly some hidden and undocumented (or badly documented)
ACPI IO area that the kernel doesn't know about, because it's not a
regular PCI BAR resource, but some northbridge (or southbridge) magic
register range.
Those ranges *should* be reserved by the BIOS in the ACPI tables, but this
would definitely not be the first time that doesn't happen.
But the right fix would be for us to just figure out what the range is as
a PCI quirk, and just know to avoid it on purpose, rather than just being
lucky and happening to avoid it because PCIBIOS_MIN_IO just happens to be
bigger than the particular address.
So can you:
- show what your /proc/ioports contains (*with* the bug triggering, ie
non-working suspend, so we see what it is that actually ends up using
that area)
- send out 'dmesg' for a boot (same deal)
- add "lspci -xxxvv" output to the deal too.
and also make them part of the bugzilla history (I'm cc'ing bugzilla here,
and added the bug number to the subject, so hopefully this thread ends up
being archived there too).
> There was some previous work in the PCIBIOS_MIN_IO area over two years ago
> (71db63acff69618b3d9d3114bd061938150e146b) which bumped this to 0x4000,
> but this was reverted (2ba84684e8cf6f980e4e95a2300f53a505eb794e) after
> causing new and entirely different problems on another nForce board.
The problem here is classic: these magic ranges tend to be *different* on
different boards (because they don't tend to be fixed by hardware, they
are programmed regions set up by firmware), so trying to change
PCIBIOS_MIN_IO to avoid a problem on one board is almost certain to just
introduce it on another board instead.
On *your* particular board, 0x142E is used for something, but on somebody
else's board it might be 0x162E, and now changing PCIBIOS_MIN_IO to 0x1500
might make that other board hang instead.
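To make that fragility concrete, here is a minimal user-space sketch (not kernel code). The struct and function names are hypothetical, the window bounds are illustrative ranges placed around the two ports mentioned, and the model assumes the allocator simply never hands out ports below the floor.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a firmware-programmed I/O window that the kernel
 * has no resource entry for. Bounds are inclusive. */
struct hidden_window {
    uint16_t start;
    uint16_t end;
};

/*
 * Returns true if an allocation floor of min_io keeps every dynamically
 * assigned port above the hidden window. The window is then avoided only
 * because of the constant, not because anything reserved it.
 */
bool floor_avoids_window(uint16_t min_io, struct hidden_window w)
{
    return w.end < min_io;
}
```

With a floor of 0x1500, a window around 0x142E is avoided while one around 0x162E is not, so the constant only moves the problem to a different board.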
So you seem to have debugged this very successfully, and I'm wondering if
you might be able to find out where that 0x142e comes from, and we could
fix it for *all* boards using that chipset by just figuring out what the
*hardware* rules (rather than the random firmware setup that will be
different on different boards) for that chipset actually are!
For an example of what I mean, see the file "drivers/pci/quirks.c", and
check out the quirks for various chipsets:
- quirk_ali7101_acpi()
Knows about the magic ALI ACPI and SMB IO regions
- quirk_piix4_acpi(), quirk_ich6_lpc_acpi(), quirk_ich4_lpc_acpi()
Same thing for the Intel chipsets
- quirk_vt82c586_acpi(), quirk_vt82c686_acpi()
VIA chipsets
etc etc.
It would be *wonderful* if somebody could figure out what the equivalent
quirks for nVidia chipsets are! Because otherwise we'll just end up
bouncing back and forth between different random IO allocations, and they
are all almost guaranteed to cause the same problems, just on different
boards!
It's sometimes possible to even just guess what the registers are, even if
things are undocumented. In particular, that 142E range is almost
certainly programmed into the host bridge or possibly a "LPC controller"
or similar, and it will probably show up as the bytes "20 14" in the
output from lspci, so we can guess which register it is that sets the
base. That's not *always* how it works, but it's sometimes possible to
guess (although you usually need to see a few different cases of the same
chipset to have any kind of confirmation of the guess).
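The byte-pattern guess can be sketched as a small search over a raw config-space dump, such as the bytes printed by "lspci -xxx". This is a user-space illustration only: find_le16 is a hypothetical helper, and treating the base as a plain little-endian 16-bit value is an assumption (a real register may carry flag bits in its low bits, or encode the base differently).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Scan a config-space dump for a 16-bit value stored little-endian
 * (low byte first), e.g. 0x1420 appears as the bytes "20 14".
 * Up to max_offsets match positions are written to offsets[];
 * the return value is the total number of matches found.
 */
size_t find_le16(const uint8_t *cfg, size_t len, uint16_t value,
                 size_t *offsets, size_t max_offsets)
{
    size_t found = 0;
    uint8_t lo = (uint8_t)(value & 0xff);
    uint8_t hi = (uint8_t)(value >> 8);

    for (size_t i = 0; i + 1 < len; i++) {
        if (cfg[i] == lo && cfg[i + 1] == hi) {
            if (found < max_offsets)
                offsets[found] = i;
            found++;
        }
    }
    return found;
}
```

A match is only a candidate register; as noted above, dumps from several boards with the same chipset are needed before the guess means anything.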
Linus
* Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-23 16:30 ` x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM Rafael J. Wysocki
@ 2007-12-23 17:57 ` Ingo Molnar
2007-12-23 18:00 ` Linus Torvalds
2007-12-25 12:12 ` x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM Pavel Machek
2 siblings, 0 replies; 29+ messages in thread
From: Ingo Molnar @ 2007-12-23 17:57 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Carlos Corbacho, linux-kernel, Linus Torvalds, Greg KH,
Thomas Gleixner, Len Brown, Andrew Morton
* Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > 0x1500 has been picked here as a nice, round and more conservative
> > value than 0x4000, and which covers 0x142E.
>
> The patch is fine by me, so if anyone has objections, please speak up.
i'm quite nervous about that approach, partly due to the "black magic"
interaction of commit 2ba84684e8c. Could we try to figure out what's
really going on here, instead of playing with a general limit (which
might break other boxes)?
Ingo
* Re: [Bug 9528] x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-23 17:53 ` [Bug 9528] " Linus Torvalds
@ 2007-12-23 17:58 ` Linus Torvalds
2007-12-23 19:19 ` Ingo Molnar
2007-12-23 20:43 ` Yinghai Lu
2 siblings, 0 replies; 29+ messages in thread
From: Linus Torvalds @ 2007-12-23 17:58 UTC (permalink / raw)
To: Carlos Corbacho
Cc: Linux Kernel Mailing List, Rafael J. Wysocki, Greg KH,
Ingo Molnar, Thomas Gleixner, Len Brown, bugme-daemon,
Brice Goglin
On Sun, 23 Dec 2007, Linus Torvalds wrote:
>
> For an example of what I mean, see the file "drivers/pci/quirks.c", and
> check out the quirks for various chipsets:
Side note - we already do have some quirks for the CK804 chipset, we
probably just don't have enough (ie we have it for some HT stuff, but
there are probably different ranges for ACPI and other registers).
I'm adding Brice Goglin to the Cc, since he was the one that created those
quirks, which implies that he probably has access to documentation on that
thing.
Brice: see the unfolding sad story on
http://bugzilla.kernel.org/show_bug.cgi?id=9528
which doesn't include Carlos' previous email, but does include my reply,
so you can see what the resource allocation issue is..
Linus
* Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-23 16:30 ` x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM Rafael J. Wysocki
2007-12-23 17:57 ` Ingo Molnar
@ 2007-12-23 18:00 ` Linus Torvalds
2007-12-23 22:20 ` Rafael J. Wysocki
2007-12-25 12:12 ` x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM Pavel Machek
2 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2007-12-23 18:00 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Carlos Corbacho, linux-kernel, Greg KH, Ingo Molnar,
Thomas Gleixner, Len Brown, Andrew Morton
On Sun, 23 Dec 2007, Rafael J. Wysocki wrote:
>
> The patch is fine by me, so if anyone has objections, please speak up.
There is absolutely *no* way I will apply this in an -rc6 release.
The number of machines this will break is totally unknown. It might be
zero. It might be hundreds. We just don't know. We might hit another
unlucky allocation that we just happened to avoid before.
Linus
* Re: [Bug 9528] x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-23 17:53 ` [Bug 9528] " Linus Torvalds
2007-12-23 17:58 ` Linus Torvalds
@ 2007-12-23 19:19 ` Ingo Molnar
2007-12-23 19:29 ` Linus Torvalds
2007-12-23 20:43 ` Yinghai Lu
2 siblings, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2007-12-23 19:19 UTC (permalink / raw)
To: Linus Torvalds
Cc: Carlos Corbacho, Linux Kernel Mailing List, Rafael J. Wysocki,
Greg KH, Thomas Gleixner, Len Brown, bugme-daemon
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > Why:
> >
> > After debugging _PTS() in the DSDT, it turns out these nVidia boards are
> > trying to write to an IO port > 0x1000 (0x142E) during suspend. Before the
> > re-ordering, we got away with this.
>
> Very interesting.
>
> HOWEVER.
>
> I'd much rather figure out what the magic IO resource is that clashes.
Carlos, could you please run the following script as root:
http://redhat.com/~mingo/misc/probe-ports.sh
and send us the resulting probe-ports.txt file?
This script will probe all unused ports as per /proc/ioports and will
list "suspect" IO port areas: ones that do not produce the expected 0xff
default reply from unclaimed IO ports. Magic chipset register areas can
potentially be mapped this way.
[ CAREFUL: This probes IO ports which might in theory trigger various
nastiness such as lockups. I tried this on a few boxes and the script
worked, but save any work in case you get lockups. ]
Ingo
* Re: [Bug 9528] x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-23 19:19 ` Ingo Molnar
@ 2007-12-23 19:29 ` Linus Torvalds
0 siblings, 0 replies; 29+ messages in thread
From: Linus Torvalds @ 2007-12-23 19:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: Carlos Corbacho, Linux Kernel Mailing List, Rafael J. Wysocki,
Greg KH, Thomas Gleixner, Len Brown, bugme-daemon
On Sun, 23 Dec 2007, Ingo Molnar wrote:
>
> This script will probe all unused ports as per /proc/ioports and will
> list "suspect" IO port areas: ones that do not produce the expected 0xff
> default reply from unclaimed IO ports. Magic chipset register areas can
> potentially be mapped this way.
This probably won't work, if the ACPI ports aren't reserved.
The way suspend-to-ram works is that there's a magic port that the CPU
reads from, which just basically turns the CPU off.
Same goes for C-states, and while we should recover from that gracefully
(ie the CPU comes back at wakeup events), if you don't do the right setup,
that can also just hang the machine..
So this script sounds rather dangerous for this case (while it probably
tends to work fine for the case of the ACPI ports being properly reserved:
*most* IO devices tend to try to avoid having too many side effects on
normal reads - the most common one tends to be to reset any pending
interrupts from a device, but that one won't matter if we don't have a
driver listening to that device).
Linus
* Re: [Bug 9528] x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-23 17:53 ` [Bug 9528] " Linus Torvalds
2007-12-23 17:58 ` Linus Torvalds
2007-12-23 19:19 ` Ingo Molnar
@ 2007-12-23 20:43 ` Yinghai Lu
2 siblings, 0 replies; 29+ messages in thread
From: Yinghai Lu @ 2007-12-23 20:43 UTC (permalink / raw)
To: Linus Torvalds
Cc: Carlos Corbacho, Linux Kernel Mailing List, Rafael J. Wysocki,
Greg KH, Ingo Molnar, Thomas Gleixner, Len Brown, bugme-daemon
On Dec 23, 2007 9:53 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>
> On Sun, 23 Dec 2007, Carlos Corbacho wrote:
> >
> > Fix suspend-to-RAM on nForce 4 (CK804) boards by increasing
> > PCIBIOS_MIN_IO.
> >
> > Fixes kernel bugzilla #9528
> >
> > Problem:
> >
> > Linus' patch (52ade9b3b97fd3bea42842a056fe0786c28d0555) to re-order
> > suspend (and fix fall out from Rafael's earlier suspend reordering work)
> > broke suspend-to-RAM on nForce 4 (CK804) boards.
> >
> > Why:
> >
> > After debugging _PTS() in the DSDT, it turns out these nVidia boards are
> > trying to write to an IO port > 0x1000 (0x142E) during suspend. Before the
> > re-ordering, we got away with this.
>
> Very interesting.
>
> HOWEVER.
>
> I'd much rather figure out what the magic IO resource is that clashes.
>
> It's almost certainly some hidden and undocumented (or badly documented)
> ACPI IO area that the kernel doesn't know about, because it's not a
> regular PCI BAR resource, but some northbridge (or southbridge) magic
> register range.
>
> Those ranges *should* be reserved by the BIOS in the ACPI tables, but this
> would definitely not be the first time that doesn't happen.
>
> But the right fix would be for us to just figure out what the range is as
> a PCI quirk, and just know to avoid it on purpose, rather than just being
> lucky and happening to avoid it because PCIBIOS_MIN_IO just happens to be
> bigger than the particular address.
>
> So can you:
> - show what your /proc/ioports contains (*with* the bug triggering, ie
> non-working suspend, so we see what it is that actually ends up using
> that area)
> - send out 'dmesg' for a boot (same deal)
> - add "lspci -xxxvv" output to the deal too.
>
It looks like the BIOS doesn't assign an IO port to some device on bus 0
(the PMU? or 00:01.1?), so the kernel tries to assign one according to
PCIBIOS_MIN_IO.
Some systems can have several HT chains:
bus: [00,08] on node 0 link 1
bus: 00 index 0 io port: [5000, dfff]
bus: 00 index 1 io port: [e000, efff]
bus: 00 index 2 io port: [0, fff]
bus: 00 index 3 mmio: [de000000, dfffffff]
bus: 00 index 4 mmio: [e0000000, e7ffffff]
bus: 00 index 5 mmio: [a0000, bffff]
bus: 00 index 6 mmio: [f0000000, ffffffff]
bus: [80,86] on node 1 link 2
bus: 80 index 0 io port: [1000, 4fff]
bus: 80 index 1 io port: [f000, ffff]
bus: 80 index 2 mmio: [c0000000, ddffffff]
bus: 80 index 3 mmio: [e8000000, efffffff]
Currently all the buses use ioport_resource:
@@ -1158,6 +1162,8 @@ struct pci_bus * pci_create_bus(struct d
b->resource[0] = &ioport_resource;
b->resource[1] = &iomem_resource;
The kernel could therefore try to allocate a resource from [0x1000, 0x4fff]
for a device in the first HT chain...
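The routing listed above can be turned into a small lookup to see which root bus a given port decodes to. This is a user-space sketch with hypothetical names. The table values are copied from the listing, and the lookup is only an illustration of the point, not how the kernel represents these windows.

```c
#include <stddef.h>
#include <stdint.h>

/* One I/O routing window: ports start..end (inclusive) go to root_bus. */
struct io_route {
    uint8_t  root_bus;
    uint16_t start;
    uint16_t end;
};

/* I/O port windows from the listing for the two HT chains. */
static const struct io_route io_routes[] = {
    { 0x00, 0x5000, 0xdfff },
    { 0x00, 0xe000, 0xefff },
    { 0x00, 0x0000, 0x0fff },
    { 0x80, 0x1000, 0x4fff },
    { 0x80, 0xf000, 0xffff },
};

/* Returns the root bus number a port is routed to, or -1 if none matches. */
int root_bus_for_port(uint16_t port)
{
    for (size_t i = 0; i < sizeof(io_routes) / sizeof(io_routes[0]); i++) {
        if (port >= io_routes[i].start && port <= io_routes[i].end)
            return io_routes[i].root_bus;
    }
    return -1;
}
```

On this example system, a port just above a 0x1000 floor is routed to bus 80 rather than bus 00, so a device on bus 00 given such a port from the shared ioport_resource would not see its accesses.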
..
I met one such case: with some cards inserted, I could not use the MCP55
on-die NIC. I then made a patch that reads the K8 northbridge PCI config
space so that each peer root bus gets the right io/iomem resource range.
With that, pci_assign_resource can pick a correct value for devices that
were not assigned an IO range by the BIOS.
YH
* Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-23 18:00 ` Linus Torvalds
@ 2007-12-23 22:20 ` Rafael J. Wysocki
2007-12-23 23:12 ` H. Peter Anvin
0 siblings, 1 reply; 29+ messages in thread
From: Rafael J. Wysocki @ 2007-12-23 22:20 UTC (permalink / raw)
To: Linus Torvalds
Cc: Carlos Corbacho, linux-kernel, Greg KH, Ingo Molnar,
Thomas Gleixner, Len Brown, Andrew Morton
On Sunday, 23 of December 2007, Linus Torvalds wrote:
>
> On Sun, 23 Dec 2007, Rafael J. Wysocki wrote:
> >
> > The patch is fine by me, so if anyone has objections, please speak up.
>
> There is absolutely *no* way I will apply this in an -rc6 release.
>
> The number of machines this will break is totally unknown. It might be
> zero. It might be hundreds. We just don't know. We might hit another
> unlucky allocation that we just happened to avoid before.
I was rather thinking of putting it into -mm for some time and target for
2.6.25 if possible.
If it breaks systems, we can always revert before 2.6.25 final.
Thanks,
Rafael
* Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-23 22:20 ` Rafael J. Wysocki
@ 2007-12-23 23:12 ` H. Peter Anvin
2007-12-24 0:09 ` Carlos Corbacho
0 siblings, 1 reply; 29+ messages in thread
From: H. Peter Anvin @ 2007-12-23 23:12 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Linus Torvalds, Carlos Corbacho, linux-kernel, Greg KH,
Ingo Molnar, Thomas Gleixner, Len Brown, Andrew Morton
Rafael J. Wysocki wrote:
> On Sunday, 23 of December 2007, Linus Torvalds wrote:
>> On Sun, 23 Dec 2007, Rafael J. Wysocki wrote:
>>> The patch is fine by me, so if anyone has objections, please speak up.
>> There is absolutely *no* way I will apply this in an -rc6 release.
>>
>> The number of machines this will break is totally unknown. It might be
>> zero. It might be hundreds. We just don't know. We might hit another
>> unlucky allocation that we just happened to avoid before.
>
> I was rather thinking of putting it into -mm for some time and target for
> 2.6.25 if possible.
>
> If it breaks systems, we can always revert before 2.6.25 final.
>
This is totally the wrong way to go about it.
Instead, it should detect this particular chipset and reserve relevant
ports. Even better would be if we can find out what reserves these
ports and mark it as a quirk.
That being said, I have seen other chipsets allocate ports in the low
0x1XXX range using non-BAR methods. They *should* reserve them in ACPI,
of course.
-hpa
* Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-23 23:12 ` H. Peter Anvin
@ 2007-12-24 0:09 ` Carlos Corbacho
2007-12-24 0:56 ` Linus Torvalds
0 siblings, 1 reply; 29+ messages in thread
From: Carlos Corbacho @ 2007-12-24 0:09 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Rafael J. Wysocki, Linus Torvalds, linux-kernel, Greg KH,
Ingo Molnar, Thomas Gleixner, Len Brown, Andrew Morton
On Sunday 23 December 2007 23:12:47 H. Peter Anvin wrote:
> Rafael J. Wysocki wrote:
> > On Sunday, 23 of December 2007, Linus Torvalds wrote:
> >> On Sun, 23 Dec 2007, Rafael J. Wysocki wrote:
> >>> The patch is fine by me, so if anyone has objections, please speak up.
> >>
> >> There is absolutely *no* way I will apply this in an -rc6 release.
> >>
> This is totally the wrong way to go about it.
Please disregard the patch anyway - my test system was still using the custom
DSDT - it doesn't fix anything.
Regardless, Linus' patch in question (in combination with Rafael's suspend
reordering work) still broke suspend, and the port 0x142E write is still the
offender, so something is still not playing nice - I'm just now at a complete
loss as to what.
(PNPACPI came to mind as a suspect, but even with that disabled, this board/
chipset still wedges on suspend).
-Carlos
--
E-Mail: carlos@strangeworlds.co.uk
Web: strangeworlds.co.uk
GPG Key ID: 0x23EE722D
* Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-24 0:09 ` Carlos Corbacho
@ 2007-12-24 0:56 ` Linus Torvalds
2007-12-24 1:14 ` Linus Torvalds
0 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2007-12-24 0:56 UTC (permalink / raw)
To: Carlos Corbacho
Cc: H. Peter Anvin, Rafael J. Wysocki, linux-kernel, Greg KH,
Ingo Molnar, Thomas Gleixner, Len Brown, Andrew Morton
On Mon, 24 Dec 2007, Carlos Corbacho wrote:
>
> Please disregard the patch anyway - my test system was still using the custom
> DSDT - it doesn't fix anything.
Ok, so it's not a simple IO port conflict.
And the range 0x1400-0x147f (which is apparently the ACPI block) is
properly marked as reserved.
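A reservation like that can be checked mechanically against /proc/ioports output. The sketch below is a user-space illustration only: the helper name is hypothetical, and it assumes the usual "start-end : name" hex layout of each line.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Returns true if a single /proc/ioports-style line ("1400-147f : name")
 * describes a range that contains the given port.
 */
bool ioports_line_covers(const char *line, unsigned int port)
{
    unsigned int start, end;

    /* %x skips leading whitespace, so indented (nested) entries parse too. */
    if (sscanf(line, "%x-%x", &start, &end) != 2)
        return false;
    return start <= port && port <= end;
}
```

Feeding each line of the file through this helper with port 0x142E shows whether any entry claims it. If a line covering the port exists, the kernel will not hand that port to another device.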
So the IO write to 1428 is a red herring. It's just part of the normal
sequence to suspend-to-ram, and while it failing probably has something to
do with the failure to s2ram, it's simply a result of ACPI doing something
insane, and probably making some assumptions that we've broken by mistake
when we do the higher-level device_suspend() before we do the low-level
one.
IOW, it looks like the normal kind of ACPI mess. Color me not in the least
surprised, and it needs somebody who understands AML and what the heck is
supposed to happen to figure out.
The kernel doesn't do anything at all to port 1428 (since it is reserved),
so if any write to it "fails" (how?) it's probably because the writer made
some assumptions about system state.
Linus
* Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-24 0:56 ` Linus Torvalds
@ 2007-12-24 1:14 ` Linus Torvalds
2007-12-24 3:05 ` Carlos Corbacho
0 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2007-12-24 1:14 UTC (permalink / raw)
To: Carlos Corbacho
Cc: H. Peter Anvin, Rafael J. Wysocki, Linux Kernel Mailing List,
Linux PM List, Greg KH, Ingo Molnar, Thomas Gleixner, Len Brown,
Andrew Morton
On Sun, 23 Dec 2007, Linus Torvalds wrote:
>
> IOW, it looks like the normal kind of ACPI mess. Color me not in the least
> surprised, and it needs somebody who understands AML and what the heck is
> supposed to happen to figure out.
Side note: we could obviously undo the commit that triggered this for you
(ie 52ade9b3b97fd3bea42842a056fe0786c28d0555), but then we have to undo
also the commit that caused us to do that commit in the first place, and
change the ordering on resume too (that would be commit
e3c7db621bed4afb8e231cb005057f2feb5db557 - the commit that moved the
"pm_ops->finish()" call to before the call to device_resume())
In other words, we'd have to go back to our original ordering, which Len
said was fundamentally wrong. I don't think anybody really wants that.
It would be better to figure out why "device_suspend()" apparently causes
problems for your AML crud.
Oh, and why is linux-kernel cc'd, but not linux-pm? Are all the relevant
people from linux-pm cc'd, or should somebody who is on that list please
try to condense this down?
(For linux-pm: see
http://bugzilla.kernel.org/show_bug.cgi?id=9528
for some more details, including a few red herrings like the whole subject
line of this email thread which turned out to not be valid after all).
Linus
* Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-24 1:14 ` Linus Torvalds
@ 2007-12-24 3:05 ` Carlos Corbacho
2007-12-24 13:44 ` Rafael J. Wysocki
0 siblings, 1 reply; 29+ messages in thread
From: Carlos Corbacho @ 2007-12-24 3:05 UTC (permalink / raw)
To: Linus Torvalds
Cc: H. Peter Anvin, Rafael J. Wysocki, Linux Kernel Mailing List,
Linux PM List, Greg KH, Ingo Molnar, Thomas Gleixner, Len Brown,
Andrew Morton
On Monday 24 December 2007 01:14:34 Linus Torvalds wrote:
> Side note: we could obviously undo the commit that triggered this for you
> [..]
> In other words, we'd have to go back to our original ordering, which Len
> said was fundamentally wrong. I don't think anybody really wants that.
Nor would I argue to do so.
> It would be better to figure out why "device_suspend()" apparently causes
> problems for your AML crud.
Will do - thanks for the pointer.
> Oh, and why is linux-kernel cc'd, but not linux-pm?
Because I like to compound my errors (I mean, if you're going to screw up, you
might as well _really_ go for it).
-Carlos
--
E-Mail: carlos@strangeworlds.co.uk
Web: strangeworlds.co.uk
GPG Key ID: 0x23EE722D
* Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-24 3:05 ` Carlos Corbacho
@ 2007-12-24 13:44 ` Rafael J. Wysocki
2007-12-24 18:34 ` Linus Torvalds
0 siblings, 1 reply; 29+ messages in thread
From: Rafael J. Wysocki @ 2007-12-24 13:44 UTC (permalink / raw)
To: Carlos Corbacho
Cc: Linus Torvalds, H. Peter Anvin, Linux Kernel Mailing List,
Greg KH, Ingo Molnar, Thomas Gleixner, Len Brown, Andrew Morton,
pm list
On Monday, 24 of December 2007, Carlos Corbacho wrote:
> On Monday 24 December 2007 01:14:34 Linus Torvalds wrote:
> > Side note: we could obviously undo the commit that triggered this for you
> > [..]
> > In other words, we'd have to go back to our original ordering, which Len
> > said was fundamentally wrong. I don't think anybody really wants that.
>
> Nor would I argue to do so.
>
> > It would be better to figure out why "device_suspend()" apparently causes
> > problems for your AML crud.
>
> Will do - thanks for the pointer.
Well, having considered that for a longer while, I think the AML code is
referring to a device that we have suspended already, and since it's in a low
power state, it just can't handle the reference.
If that is the case, we'll have to find the device (that should be possible
using some code instrumentation) and move the suspending of it into the late
stage.
> > Oh, and why is linux-kernel cc'd, but not linux-pm?
>
> Because I like to compound my errors (I mean, if you're going to screw up, you
> might as well _really_ go for it).
BTW, linux-pm is on linux-foundation, changed the CC. ;-)
Thanks,
Rafael
* Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-24 13:44 ` Rafael J. Wysocki
@ 2007-12-24 18:34 ` Linus Torvalds
2007-12-24 21:53 ` Carlos Corbacho
2007-12-25 16:13 ` Suspend code ordering (again) (was: Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM) Rafael J. Wysocki
0 siblings, 2 replies; 29+ messages in thread
From: Linus Torvalds @ 2007-12-24 18:34 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Carlos Corbacho, H. Peter Anvin, Linux Kernel Mailing List,
Greg KH, Ingo Molnar, Thomas Gleixner, Len Brown, Andrew Morton,
pm list
On Mon, 24 Dec 2007, Rafael J. Wysocki wrote:
>
> Well, having considered that for a longer while, I think the AML code is
> referring to a device that we have suspended already, and since it's in a low
> power state, it just can't handle the reference.
>
> If that is the case, we'll have to find the device (that should be possible
> using some code instrumentation) and move the suspending of it into the late
> stage.
Yes.
In general, I'm personally of the opinion that drivers should *not*
actually go into D3 at all in the regular "->suspend()" phase. It should
be done in ->suspend_late. The early suspend is for saving state and
returning errors.
Sadly, we've made it a bit too inconvenient to actually do that. Almost
all drivers only do the "->suspend" thing, and the default PCI behaviour
doesn't help us in any way either.
Anyway, I wonder if a patch like this could make it easier for driver
writers to handle things. It basically does:
- if you don't have a regular "suspend()" function, we'll just save state
at suspend time.
- if you don't have a "suspend_late()" function, we'll look at the
current state, and if it's still in PCI_D0, we'll suspend to PCI_D3hot
if it's a regular PCI device (ie not a bridge or something else odd).
- then, at resume time, by default we don't do anything in the early
resume, but in the late resume we'll undo everything, of course.
Anyway, with this, most drivers could just remove the
"pci_set_power_state()" call *entirely*, and let the default
suspend_late action power the device down. But if you want to override
that default action, you can either:
- set the power state in your own ->suspend() routine (either by using
pci_set_power_state(), or by just explicitly setting the state to
unknown with "dev->current_state = PCI_UNKNOWN")
- have a "suspend_late()" action, which obviously will override the
default action entirely.
Hmm?
In the case of the NVidia issue, one thing to try might be to remove the
current call to "pci_set_power_state(pdev, 3);" in agp_nvidia_suspend() in
drivers/char/agp/nvidia-agp.c. That sounds like the most likely culprit
for something that ACPI might want to shut down.
NOTE! This following patch is just for discussion, and while I think it's
conceptually a good thing to try, I don't think it will help Carlos'
problem. But removing the "pci_set_power_state()" in agp_nvidia_suspend()
might.
Linus
---
drivers/pci/pci-driver.c | 32 +++++++++++++++++++++++++-------
1 files changed, 25 insertions(+), 7 deletions(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 6d1a216..6992f73 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -264,6 +264,28 @@ static int pci_device_remove(struct device * dev)
 	return 0;
 }
 
+static void pci_default_suspend(struct pci_dev *dev, pm_message_t state)
+{
+	pci_save_state(dev);
+}
+
+static void pci_default_suspend_late(struct pci_dev *dev, pm_message_t state)
+{
+	/* Something has already suspended it? Never mind then.. */
+	if (dev->current_state != PCI_D0)
+		return;
+
+	/* We avoid powering down bridges by default.. */
+	if (dev->hdr_type == PCI_HEADER_TYPE_NORMAL)
+		pci_set_power_state(dev, PCI_D3hot);
+
+	/*
+	 * mark its power state as "unknown", since we don't know if
+	 * e.g. the BIOS will change its device state when we suspend.
+	 */
+	dev->current_state = PCI_UNKNOWN;
+}
+
 static int pci_device_suspend(struct device * dev, pm_message_t state)
 {
 	struct pci_dev * pci_dev = to_pci_dev(dev);
@@ -274,13 +296,7 @@ static int pci_device_suspend(struct device * dev, pm_message_t state)
 		i = drv->suspend(pci_dev, state);
 		suspend_report_result(drv->suspend, i);
 	} else {
-		pci_save_state(pci_dev);
-		/*
-		 * mark its power state as "unknown", since we don't know if
-		 * e.g. the BIOS will change its device state when we suspend.
-		 */
-		if (pci_dev->current_state == PCI_D0)
-			pci_dev->current_state = PCI_UNKNOWN;
+		pci_default_suspend(pci_dev, state);
 	}
 	return i;
 }
@@ -294,6 +310,8 @@ static int pci_device_suspend_late(struct device * dev, pm_message_t state)
 	if (drv && drv->suspend_late) {
 		i = drv->suspend_late(pci_dev, state);
 		suspend_report_result(drv->suspend_late, i);
+	} else {
+		pci_default_suspend_late(pci_dev, state);
 	}
 	return i;
 }
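For illustration only, the default suspend_late policy in the patch above can
be modelled in plain userspace C. This is a sketch, not kernel code: the
model_* types, the powered_down flag and model_default_suspend_late() are
hypothetical stand-ins for struct pci_dev, pci_set_power_state() and
pci_default_suspend_late().

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's PCI power states and header types. */
enum model_pci_state { MODEL_D0, MODEL_D3HOT, MODEL_UNKNOWN };
enum model_hdr_type { MODEL_HDR_NORMAL, MODEL_HDR_BRIDGE };

struct model_pci_dev {
	enum model_pci_state current_state;
	enum model_hdr_type hdr_type;
	bool powered_down;	/* set when the model "hardware" is put into D3hot */
};

/* Mirrors the decision logic of pci_default_suspend_late() in the patch. */
static void model_default_suspend_late(struct model_pci_dev *dev)
{
	assert(dev);

	/* A driver (or an earlier phase) already changed the state: leave it. */
	if (dev->current_state != MODEL_D0)
		return;

	/* Only ordinary devices are powered down; bridges are skipped. */
	if (dev->hdr_type == MODEL_HDR_NORMAL)
		dev->powered_down = true;

	/* Either way the recorded state becomes "unknown". */
	dev->current_state = MODEL_UNKNOWN;
}
```

A device that its driver has already moved out of D0 is left alone, which is
what lets a driver opt out of the default simply by setting the state itself
in ->suspend().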
* Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-24 18:34 ` Linus Torvalds
@ 2007-12-24 21:53 ` Carlos Corbacho
2007-12-25 16:13 ` Suspend code ordering (again) (was: Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM) Rafael J. Wysocki
1 sibling, 0 replies; 29+ messages in thread
From: Carlos Corbacho @ 2007-12-24 21:53 UTC (permalink / raw)
To: Linus Torvalds
Cc: Rafael J. Wysocki, H. Peter Anvin, Linux Kernel Mailing List,
Greg KH, Ingo Molnar, Thomas Gleixner, Len Brown, Andrew Morton,
pm list
On Monday 24 December 2007 18:34:21 Linus Torvalds wrote:
> On Mon, 24 Dec 2007, Rafael J. Wysocki wrote:
> > Well, having considered that for a longer while, I think the AML code is
> > referring to a device that we have suspended already, and since it's in a
> > low power state, it just can't handle the reference.
> >
> > If that is the case, we'll have to find the device (that should be
> > possible using some code instrumentation) and move the suspending of it
> > into the late stage.
>
> Yes.
My own experimentation (in device_suspend(), calling _PTS() in the AML after
each suspend_device() runs, until one device causes it to hang) points to
ohci_hcd being the culprit here (with or without any devices attached). With
the ohci_hcd module unloaded, the machine suspends just fine[1].
Of course, I'm at a complete loss as to why suspending OHCI would cause a
problem for an IO port write.
> NOTE! This following patch is just for discussion, and while I think it's
> conceptually a good thing to try, I don't think it will help Carlos'
> problem. But removing the "pci_set_power_state()" in agp_nvidia_suspend()
> might.
nvidia-agp cannot be built on x86-64, so it's not the culprit in this case.
-Carlos
[1] And yes, I double checked the custom DSDT is not loaded this time.
--
E-Mail: carlos@strangeworlds.co.uk
Web: strangeworlds.co.uk
GPG Key ID: 0x23EE722D
* Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-23 16:30 ` x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM Rafael J. Wysocki
2007-12-23 17:57 ` Ingo Molnar
2007-12-23 18:00 ` Linus Torvalds
@ 2007-12-25 12:12 ` Pavel Machek
2007-12-25 12:28 ` Carlos Corbacho
2 siblings, 1 reply; 29+ messages in thread
From: Pavel Machek @ 2007-12-25 12:12 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Carlos Corbacho, linux-kernel, Linus Torvalds, Greg KH,
Ingo Molnar, Thomas Gleixner, Len Brown, Andrew Morton
Hi!
> On Sunday, 23 of December 2007, Carlos Corbacho wrote:
> > Fix suspend-to-RAM on nForce 4 (CK804) boards by increasing
> > PCIBIOS_MIN_IO.
> >
> > Fixes kernel bugzilla #9528
> >
> > Problem:
> >
> > Linus' patch (52ade9b3b97fd3bea42842a056fe0786c28d0555) to re-order
> > suspend (and fix fall out from Rafael's earlier suspend reordering work)
> > broke suspend-to-RAM on nForce 4 (CK804) boards.
> >
> > Why:
> >
> > After debugging _PTS() in the DSDT, it turns out these nVidia boards are
> > trying to write to an IO port > 0x1000 (0x142E) during suspend. Before the
> > re-ordering, we got away with this.
> >
> > > After the aforementioned commit, we started hitting the PCIBIOS_MIN_IO
> > limit and suspend then broke on these machines (the machine simply hangs
> > when it reaches the 0x142E IO port write during suspend-to-RAM).
> >
> > There was some previous work in the PCIBIOS_MIN_IO area over two years ago
> > (71db63acff69618b3d9d3114bd061938150e146b) which bumped this to 0x4000,
> > but this was reverted (2ba84684e8cf6f980e4e95a2300f53a505eb794e) after
> > causing new and entirely different problems on another nForce board.
> >
> > 0x1500 has been picked here as a nice, round and more conservative value
> > than 0x4000, and which covers 0x142E.
>
> The patch is fine by me, so if anyone has objections, please speak up.
Just.. I have been running with a very similar patch for a few years... it
fixes a few prototype machines I have here.
> > diff --git a/include/asm-x86/pci.h b/include/asm-x86/pci.h
> > index e883619..03cb123 100644
> > --- a/include/asm-x86/pci.h
> > +++ b/include/asm-x86/pci.h
> > @@ -46,7 +46,7 @@ extern unsigned int pcibios_assign_all_busses(void);
> > #define pcibios_scan_all_fns(a, b) 0
> >
> > extern unsigned long pci_mem_start;
> > -#define PCIBIOS_MIN_IO 0x1000
> > +#define PCIBIOS_MIN_IO 0x1500
> > #define PCIBIOS_MIN_MEM (pci_mem_start)
> >
> > #define PCIBIOS_MIN_CARDBUS_IO 0x4000
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
2007-12-25 12:12 ` x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM Pavel Machek
@ 2007-12-25 12:28 ` Carlos Corbacho
0 siblings, 0 replies; 29+ messages in thread
From: Carlos Corbacho @ 2007-12-25 12:28 UTC (permalink / raw)
To: Pavel Machek
Cc: Rafael J. Wysocki, linux-kernel, Linus Torvalds, Greg KH,
Ingo Molnar, Thomas Gleixner, Len Brown, Andrew Morton
Pavel,
On Tuesday 25 December 2007 12:12:31 Pavel Machek wrote:
> > The patch is fine by me, so if anyone has objections, please speak up.
>
> Just.. I have been running with a very similar patch for a few years... it
> fixes a few prototype machines I have here.
I've withdrawn the patch since it doesn't actually fix the issue (turns out
it's actually a bug in Linux's handling of suspend-to-RAM for ACPI 1.0 and
2.0 systems).
-Carlos
--
E-Mail: carlos@strangeworlds.co.uk
Web: strangeworlds.co.uk
GPG Key ID: 0x23EE722D
* Suspend code ordering (again) (was: Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM)
2007-12-24 18:34 ` Linus Torvalds
2007-12-24 21:53 ` Carlos Corbacho
@ 2007-12-25 16:13 ` Rafael J. Wysocki
2007-12-26 4:11 ` Linus Torvalds
1 sibling, 1 reply; 29+ messages in thread
From: Rafael J. Wysocki @ 2007-12-25 16:13 UTC (permalink / raw)
To: Linus Torvalds
Cc: Carlos Corbacho, H. Peter Anvin, Linux Kernel Mailing List,
Greg KH, Ingo Molnar, Thomas Gleixner, Len Brown, Andrew Morton,
pm list, ACPI Devel Maling List
On Monday, 24 of December 2007, Linus Torvalds wrote:
>
> On Mon, 24 Dec 2007, Rafael J. Wysocki wrote:
> >
> > Well, having considered that for a longer while, I think the AML code is
> > referring to a device that we have suspended already, and since it's in a low
> > power state, it just can't handle the reference.
> >
> > If that is the case, we'll have to find the device (that should be possible
> > using some code instrumentation) and move the suspending of it into the late
> > stage.
>
> Yes.
>
> In general, I'm personally of the opinion that drivers should *not*
> actually go into D3 at all in the regular "->suspend()" phase. It should
> be done in ->suspend_late. The early suspend is for saving state and
> returning errors.
>
> Sadly, we've made it a bit too inconvenient to actually do that. Almost
> all drivers only do the "->suspend" thing, and the default PCI behaviour
> doesn't help us in any way either.
>
> Anyway, I wonder if a patch like this could make it easier for driver
> writers to handle things. It basically does:
>
> - if you don't have a regular "suspend()" function, we'll just save state
> at suspend time.
>
> - if you don't have a "suspend_late()" function, we'll look at the
> current state, and if it's still in PCI_D0, we'll suspend to PCI_D3hot
> if it's a regular PCI device (ie not a bridge or something else odd).
>
> - then, at resume time, by default we don't do anything in the early
> resume, but in the late resume we'll undo everything, of course.
>
> Anyway, with this, most drivers could just remove the
> "pci_set_power_state()" call *entirely*, and let the default
> suspend_late action power the device down. But if you want to override
> that default action, you can either:
>
> - set the power state in your own ->suspend() routine (either by using
> pci_set_power_state(), or by just explicitly setting the state to
> unknown with "dev->current_state = PCI_UNKNOWN")
>
> - have a "late_suspend()" action, which obviously will override the
> default action entirely.
>
> Hmm?
Well, as Carlos correctly noticed, the problem is related to the change in
the ACPI specification between versions 1.0x and 2.0. Namely, while ACPI
2.0 and later wants us to put devices into low power states before calling
_PTS, ACPI 1.0x wants us to do that after calling _PTS. Since we're following
the 2.0 and later specifications right now, we're not doing the right thing for
the (strictly) ACPI 1.0x-compliant systems.
We ought to be able to fix things on the high level, by calling _PTS earlier on
systems that claim to be ACPI 1.0x-compliant. That will require us to modify
the generic suspend code quite a bit and will need to be tested for some time.
I'm going to prepare patches to implement this idea targeted for the 2.6.25
time frame.
Greetings,
Rafael
* Re: Suspend code ordering (again) (was: Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM)
2007-12-25 16:13 ` Suspend code ordering (again) (was: Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM) Rafael J. Wysocki
@ 2007-12-26 4:11 ` Linus Torvalds
2007-12-26 15:07 ` Rafael J. Wysocki
0 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2007-12-26 4:11 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Carlos Corbacho, H. Peter Anvin, Linux Kernel Mailing List,
Greg KH, Ingo Molnar, Thomas Gleixner, Len Brown, Andrew Morton,
pm list, ACPI Devel Maling List
On Tue, 25 Dec 2007, Rafael J. Wysocki wrote:
>
> the ACPI specification between versions 1.0x and 2.0. Namely, while ACPI
> 2.0 and later wants us to put devices into low power states before calling
> _PTS, ACPI 1.0x wants us to do that after calling _PTS. Since we're following
> the 2.0 and later specifications right now, we're not doing the right thing for
> the (strictly) ACPI 1.0x-compliant systems.
>
> We ought to be able to fix things on the high level, by calling _PTS earlier on
> systems that claim to be ACPI 1.0x-compliant. That will require us to modify
> the generic suspend code quite a bit and will need to be tested for some time.
That's insane. Are you really saying that ACPI wants totally different
orderings for different versions of the spec? And does Windows really do
that?
Please don't make lots of modifications to the generic suspend code. The
only thing that is worth doing is to just have a firmware callback before
the "device_suspend()" thing (and then on a ACPI-1.0 system, call _PTS
*there*), and on an ACPI-2.0 system, call _PTS *after* device_suspend().
Still, the fact is, some (most, I think) drivers *should* put themselves
into D3 only in "late_suspend()", so if ACPI-2.0 really expects _PTS to be
called after that, we're just screwed. That's when the system is really
down, interrupts disabled etc, we don't want to call anything but the
final ACPI "turn us off" stuff there!
Linus
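The ordering proposed above can be sketched as a small userspace model. It is
only an illustration under assumptions: plan_suspend(), the STEP_* names and
the acpi_major parameter are hypothetical, and the thread does not settle how
the kernel would decide which spec revision a given firmware follows.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the phases discussed in this thread. */
enum suspend_step {
	STEP_PTS,		/* evaluate the ACPI _PTS method */
	STEP_DEVICE_SUSPEND,	/* device_suspend(): the ->suspend() callbacks */
	STEP_SUSPEND_LATE,	/* ->suspend_late() callbacks, interrupts off */
	STEP_ENTER_SLEEP	/* final ACPI "turn us off" step */
};

/*
 * Fill out[] with the order of steps and return how many were written.
 * acpi_major is an assumed input: the major ACPI revision the firmware
 * is taken to follow.
 */
static size_t plan_suspend(int acpi_major, enum suspend_step out[4])
{
	size_t n = 0;

	assert(out);

	/* ACPI 1.0x: firmware callback before devices are suspended. */
	if (acpi_major < 2)
		out[n++] = STEP_PTS;

	out[n++] = STEP_DEVICE_SUSPEND;

	/* ACPI 2.0 and later: _PTS after device_suspend(). */
	if (acpi_major >= 2)
		out[n++] = STEP_PTS;

	out[n++] = STEP_SUSPEND_LATE;
	out[n++] = STEP_ENTER_SLEEP;
	return n;
}
```

Note that in both orderings _PTS still runs before the suspend_late phase, so
drivers that only enter D3 in suspend_late would not yet be powered down when
_PTS executes on a 2.0 system. That is the conflict described above.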
* Re: Suspend code ordering (again) (was: Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM)
2007-12-26 4:11 ` Linus Torvalds
@ 2007-12-26 15:07 ` Rafael J. Wysocki
2007-12-26 15:24 ` Suspend code ordering (again) Alexey Starikovskiy
0 siblings, 1 reply; 29+ messages in thread
From: Rafael J. Wysocki @ 2007-12-26 15:07 UTC (permalink / raw)
To: Linus Torvalds
Cc: Carlos Corbacho, H. Peter Anvin, Linux Kernel Mailing List,
Greg KH, Ingo Molnar, Thomas Gleixner, Len Brown, Andrew Morton,
pm list, ACPI Devel Maling List
On Wednesday, 26 of December 2007, Linus Torvalds wrote:
>
> On Tue, 25 Dec 2007, Rafael J. Wysocki wrote:
> >
> > the ACPI specification between versions 1.0x and 2.0. Namely, while ACPI
> > 2.0 and later wants us to put devices into low power states before calling
> > _PTS, ACPI 1.0x wants us to do that after calling _PTS. Since we're following
> > the 2.0 and later specifications right now, we're not doing the right thing for
> > the (strictly) ACPI 1.0x-compliant systems.
> >
> > We ought to be able to fix things on the high level, by calling _PTS earlier on
> > systems that claim to be ACPI 1.0x-compliant. That will require us to modify
> > > the generic suspend code quite a bit and will need to be tested for some time.
>
> That's insane. Are you really saying that ACPI wants totally different
> orderings for different versions of the spec?
Yes, I am.
> And does Windows really do that?
I don't know.
> Please don't make lots of modifications to the generic suspend code. The
> only thing that is worth doing is to just have a firmware callback before
> the "device_suspend()" thing (and then on a ACPI-1.0 system, call _PTS
> *there*), and on an ACPI-2.0 system, call _PTS *after* device_suspend().
Yes, that's what I'm going to do, but I need to untangle some ACPI code for
this purpose.
> Still, the fact is, some (most, I think) drivers *should* put themselves
> into D3 only in "late_suspend()", so if ACPI-2.0 really expects _PTS to be
> called after that, we're just screwed.
Well, section 9.1.6 of ACPI 2.0 specifies the suspend ordering directly and
says exactly that _PTS is to be executed after putting devices into respective
D states.
> That's when the system is really down, interrupts disabled etc, we don't want
> to call anything but the final ACPI "turn us off" stuff there!
OTOH, we ought to be able to put devices into low power states at any time, for
example when they are not used, without any problems and having to put them
back into D0 just in order to execute _PTS doesn't seem very logical to me. ;-)
Greetings,
Rafael
* Re: Suspend code ordering (again)
2007-12-26 15:07 ` Rafael J. Wysocki
@ 2007-12-26 15:24 ` Alexey Starikovskiy
2007-12-26 17:50 ` H. Peter Anvin
0 siblings, 1 reply; 29+ messages in thread
From: Alexey Starikovskiy @ 2007-12-26 15:24 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Linus Torvalds, Carlos Corbacho, H. Peter Anvin,
Linux Kernel Mailing List, Greg KH, Ingo Molnar, Thomas Gleixner,
Len Brown, Andrew Morton, pm list, ACPI Devel Maling List
Rafael J. Wysocki wrote:
> On Wednesday, 26 of December 2007, Linus Torvalds wrote:
>
>> On Tue, 25 Dec 2007, Rafael J. Wysocki wrote:
>>
>>> the ACPI specification between versions 1.0x and 2.0. Namely, while ACPI
>>> 2.0 and later wants us to put devices into low power states before calling
>>> _PTS, ACPI 1.0x wants us to do that after calling _PTS. Since we're following
>>> the 2.0 and later specifications right now, we're not doing the right thing for
>>> the (strictly) ACPI 1.0x-compliant systems.
>>>
>>> We ought to be able to fix things on the high level, by calling _PTS earlier on
>>> systems that claim to be ACPI 1.0x-compliant. That will require us to modify
>>> the generic suspend code quite a bit and will need to be tested for some time.
>>>
>> That's insane. Are you really saying that ACPI wants totally different
>> orderings for different versions of the spec?
>>
>
> Yes, I am.
>
>
>> And does Windows really do that?
>>
>
> I don't know.
>
Windows was compliant only with the 1.x spec until Vista.
With Vista, the claim is 3.x compliance.
* Re: Suspend code ordering (again)
2007-12-26 15:24 ` Suspend code ordering (again) Alexey Starikovskiy
@ 2007-12-26 17:50 ` H. Peter Anvin
0 siblings, 0 replies; 29+ messages in thread
From: H. Peter Anvin @ 2007-12-26 17:50 UTC (permalink / raw)
To: Alexey Starikovskiy
Cc: Rafael J. Wysocki, Linus Torvalds, Carlos Corbacho,
Linux Kernel Mailing List, Greg KH, Ingo Molnar, Thomas Gleixner,
Len Brown, Andrew Morton, pm list, ACPI Devel Maling List
Alexey Starikovskiy wrote:
>>
>> I don't know.
>>
> Windows was compliant only with the 1.x spec until Vista.
> With Vista, the claim is 3.x compliance.
>
In other words, the 1.x spec is the only thing that matters, at least in
the short term (*no one* is giving up XP compatibility at this point.)
-hpa
* Re: Suspend code ordering (again)
[not found] ` <fa.XycBwhGuyvtVl/QW5HONqLwOags@ifi.uio.no>
@ 2007-12-27 18:07 ` Robert Hancock
2007-12-27 20:00 ` Rafael J. Wysocki
0 siblings, 1 reply; 29+ messages in thread
From: Robert Hancock @ 2007-12-27 18:07 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Linus Torvalds, Carlos Corbacho, H. Peter Anvin,
Linux Kernel Mailing List, Greg KH, Ingo Molnar, Thomas Gleixner,
Len Brown, Andrew Morton, pm list, ACPI Devel Maling List
Rafael J. Wysocki wrote:
> On Wednesday, 26 of December 2007, Linus Torvalds wrote:
>> On Tue, 25 Dec 2007, Rafael J. Wysocki wrote:
>>> the ACPI specification between versions 1.0x and 2.0. Namely, while ACPI
>>> 2.0 and later wants us to put devices into low power states before calling
>>> _PTS, ACPI 1.0x wants us to do that after calling _PTS. Since we're following
>>> the 2.0 and later specifications right now, we're not doing the right thing for
>>> the (strictly) ACPI 1.0x-compliant systems.
>>>
>>> We ought to be able to fix things on the high level, by calling _PTS earlier on
>>> systems that claim to be ACPI 1.0x-compliant. That will require us to modify
>>> the generic suspend code quite a bit and will need to be tested for some time.
>> That's insane. Are you really saying that ACPI wants totally different
>> orderings for different versions of the spec?
>
> Yes, I am.
>
>> And does Windows really do that?
>
> I don't know.
>
>> Please don't make lots of modifications to the generic suspend code. The
>> only thing that is worth doing is to just have a firmware callback before
>> the "device_suspend()" thing (and then on a ACPI-1.0 system, call _PTS
>> *there*), and on an ACPI-2.0 system, call _PTS *after* device_suspend().
>
> Yes, that's what I'm going to do, but I need to untangle some ACPI code for
> this purpose.
>
>> Still, the fact is, some (most, I think) drivers *should* put themselves
>> into D3 only in "late_suspend()", so if ACPI-2.0 really expects _PTS to be
>> called after that, we're just screwed.
>
> Well, section 9.1.6 of ACPI 2.0 specifies the suspend ordering directly and
> says exactly that _PTS is to be executed after putting devices into respective
> D states.
I would not take those sections as gospel, they're really an example
only. It's quite possible that Windows does not follow that ordering.
Also, as was pointed out, pre-Vista versions of Windows follow ACPI 1.0
and Vista follows 3.0, so 2.0 doesn't really matter since BIOS people
won't test against it. 1.0 specifies that _PTS is to be called before
suspending devices and 3.0 says that the AML must not depend on any
specific device power state, so in both cases it should be safe to call
_PTS before suspending, no?
--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@nospamshaw.ca
Home Page: http://www.roberthancock.com/
* Re: Suspend code ordering (again)
2007-12-27 18:07 ` Suspend code ordering (again) Robert Hancock
@ 2007-12-27 20:00 ` Rafael J. Wysocki
2007-12-28 0:25 ` Robert Hancock
0 siblings, 1 reply; 29+ messages in thread
From: Rafael J. Wysocki @ 2007-12-27 20:00 UTC (permalink / raw)
To: Robert Hancock
Cc: Linus Torvalds, Carlos Corbacho, H. Peter Anvin,
Linux Kernel Mailing List, Greg KH, Ingo Molnar, Thomas Gleixner,
Len Brown, Andrew Morton, pm list, ACPI Devel Maling List
On Thursday, 27 of December 2007, Robert Hancock wrote:
> Rafael J. Wysocki wrote:
> > On Wednesday, 26 of December 2007, Linus Torvalds wrote:
> >> On Tue, 25 Dec 2007, Rafael J. Wysocki wrote:
> >>> the ACPI specification between versions 1.0x and 2.0. Namely, while ACPI
> >>> 2.0 and later wants us to put devices into low power states before calling
> >>> _PTS, ACPI 1.0x wants us to do that after calling _PTS. Since we're following
> >>> the 2.0 and later specifications right now, we're not doing the right thing for
> >>> the (strictly) ACPI 1.0x-compliant systems.
> >>>
> >>> We ought to be able to fix things on the high level, by calling _PTS earlier on
> >>> systems that claim to be ACPI 1.0x-compliant. That will require us to modify
> >>> the generic suspend code quite a bit and will need to be tested for some time.
> >> That's insane. Are you really saying that ACPI wants totally different
> >> orderings for different versions of the spec?
> >
> > Yes, I am.
> >
> >> And does Windows really do that?
> >
> > I don't know.
> >
> >> Please don't make lots of modifications to the generic suspend code. The
> >> only thing that is worth doing is to just have a firmware callback before
> >> the "device_suspend()" thing (and then on a ACPI-1.0 system, call _PTS
> >> *there*), and on an ACPI-2.0 system, call _PTS *after* device_suspend().
> >
> > Yes, that's what I'm going to do, but I need to untangle some ACPI code for
> > this purpose.
> >
> >> Still, the fact is, some (most, I think) drivers *should* put themselves
> >> into D3 only in "late_suspend()", so if ACPI-2.0 really expects _PTS to be
> >> called after that, we're just screwed.
> >
> > Well, section 9.1.6 of ACPI 2.0 specifies the suspend ordering directly and
> > says exactly that _PTS is to be executed after putting devices into respective
> > D states.
>
> I would not take those sections as gospel, they're really an example
> only. It's quite possible that Windows does not follow that ordering.
>
> Also, as was pointed out, pre-Vista versions of Windows follow ACPI 1.0
> and Vista follows 3.0, so 2.0 doesn't really matter since BIOS people
> won't test against it. 1.0 specifies that _PTS is to be called before
> suspending devices and 3.0 says that the AML must not depend on any
> specific device power state, so in both cases it should be safe to call
> _PTS before suspending, no?
Well, IMO, if we take one option only (whichever that is) and there are systems
that follow the other one, they will likely break.
Apart from this, there are BIOSes that openly claim ACPI 2.0 support (for
example, the one in my HP nx6325 does that) and they may actually prefer the
post-ACPI-1.0 ordering even if they work with the pre-ACPI-2.0 one.
Greetings,
Rafael
* Re: Suspend code ordering (again)
2007-12-27 20:00 ` Rafael J. Wysocki
@ 2007-12-28 0:25 ` Robert Hancock
2007-12-28 5:41 ` Linus Torvalds
2008-01-08 3:03 ` Shaohua Li
0 siblings, 2 replies; 29+ messages in thread
From: Robert Hancock @ 2007-12-28 0:25 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Linus Torvalds, Carlos Corbacho, H. Peter Anvin,
Linux Kernel Mailing List, Greg KH, Ingo Molnar, Thomas Gleixner,
Len Brown, Andrew Morton, pm list, ACPI Devel Maling List
Rafael J. Wysocki wrote:
>> Also, as was pointed out, pre-Vista versions of Windows follow ACPI 1.0
>> and Vista follows 3.0, so 2.0 doesn't really matter since BIOS people
>> won't test against it. 1.0 specifies that _PTS is to be called before
>> suspending devices and 3.0 says that the AML must not depend on any
>> specific device power state, so in both cases it should be safe to call
>> _PTS before suspending, no?
>
> Well, IMO, if we take one option only (whichever that is) and there are systems
> that follow the other one, they will likely break.
>
> Apart from this, there are BIOSes that openly claim ACPI 2.0 support (for
> example, the one in my HP nx6325 does that) and they may actually prefer the
> post-ACPI-1.0 ordering even if they work with the pre-ACPI-2.0 one.
I doubt they would prefer the later ordering in any way that matters, if
the Windows version they were designed for uses the earlier ordering.
It would be best if somebody could manage to find out what ordering
Windows XP (and Windows Vista, for good measure) actually use, then we
could just use that. Virtual machine trickery might be an option - the
only complication being that it'll be using the DSDT for the fake
machine and not the real one..
--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@nospamshaw.ca
Home Page: http://www.roberthancock.com/
* Re: Suspend code ordering (again)
2007-12-28 0:25 ` Robert Hancock
@ 2007-12-28 5:41 ` Linus Torvalds
2008-01-08 3:03 ` Shaohua Li
1 sibling, 0 replies; 29+ messages in thread
From: Linus Torvalds @ 2007-12-28 5:41 UTC (permalink / raw)
To: Robert Hancock
Cc: Rafael J. Wysocki, Carlos Corbacho, H. Peter Anvin,
Linux Kernel Mailing List, Greg KH, Ingo Molnar, Thomas Gleixner,
Len Brown, Andrew Morton, pm list, ACPI Devel Maling List
On Thu, 27 Dec 2007, Robert Hancock wrote:
>
> I doubt they would prefer the later ordering in any way that matters, if the
> Windows version they were designed for uses the earlier ordering.
Well, I wouldn't say it's about "preferring" one over the other. It's very
possible that the BIOS writers were *intending* to prefer ACPI 2.0, and it
may even be likely that they thought that they wrote it that way, but the
real issue is that it has apparently never ever been *tested* that way.
So yes, maybe the vendors actually thought they were a good ACPI-2.0
implementation, but if Windows doesn't do the ordering that the 2.0 spec
expects, then that is pretty much just a theoretical thing.
But yeah, it would be really nice to have this verified some way. Somebody
must already know (whether it's a VM person or a BIOS writer, and whether
they'd tell us, is obviously another issue).
Linus
* Re: Suspend code ordering (again)
2007-12-28 0:25 ` Robert Hancock
2007-12-28 5:41 ` Linus Torvalds
@ 2008-01-08 3:03 ` Shaohua Li
1 sibling, 0 replies; 29+ messages in thread
From: Shaohua Li @ 2008-01-08 3:03 UTC (permalink / raw)
To: Robert Hancock
Cc: Rafael J. Wysocki, Linus Torvalds, Carlos Corbacho,
H. Peter Anvin, Linux Kernel Mailing List, Greg KH, Ingo Molnar,
Thomas Gleixner, Len Brown, Andrew Morton, pm list,
ACPI Devel Maling List
[-- Attachment #1: Type: text/plain, Size: 2169 bytes --]
On Fri, 2007-12-28 at 08:25 +0800, Robert Hancock wrote:
> Rafael J. Wysocki wrote:
> >> Also, as was pointed out, pre-Vista versions of Windows follow ACPI 1.0
> >> and Vista follows 3.0, so 2.0 doesn't really matter since BIOS people
> >> won't test against it. 1.0 specifies that _PTS is to be called before
> >> suspending devices and 3.0 says that the AML must not depend on any
> >> specific device power state, so in both cases it should be safe to call
> >> _PTS before suspending, no?
> >
> > Well, IMO, if we take one option only (whichever that is) and there are systems
> > that follow the other one, they will likely break.
> >
> > Apart from this, there are BIOSes that openly claim ACPI 2.0 support (for
> > example, the one in my HP nx6325 does that) and they may actually prefer the
> > post-ACPI-1.0 ordering even if they work with the pre-ACPI-2.0 one.
>
> I doubt they would prefer the later ordering in any way that matters, if
> the Windows version they were designed for uses the earlier ordering.
>
> It would be best if somebody could manage to find out what ordering
> Windows XP (and Windows Vista, for good measure) actually use, then we
> could just use that. Virtual machine trickery might be an option - the
> only complication being that it'll be using the DSDT for the fake
> machine and not the real one..
I modified Qemu and used it to observe how winxp does suspend/resume. So
far, I have only got data for S4 suspend. I did make some interesting
findings.
1. xp does not seem to save pci config space. Or it appears to save only
PCICMD.
2. the order winxp follows looks like
a. save config (PCICMD), put device into D3 (it appears only for the ne2000
NIC)
b. _PTS
c. write mem to disk
d. write ACPI PM1_control register, then system shutdown
3. xp writes the ACPI GBL_EN bit just after _PTS (for both S4/S5), don't know
why
Attached is the log of winxp doing S4 suspend; it only includes pci config
reads/writes and ACPI register reads/writes.
I managed to make xp try to enter S3, but it fails, so I can't get the data
for S3 so far. If anybody has other ideas that need verifying on winxp, pls
let me know.
Thanks,
Shaohua
[-- Attachment #2: xplog --]
[-- Type: text/plain, Size: 2050 bytes --]
PCI NE2000 read addr 4, val 7
PCI NE2000 read addr 50, val 1
PCI NE2000 read addr 52, val c9c2
PCI NE2000 read addr 54, val 8000
PCI NE2000 write addr 54, val 8003
PCI NE2000 read addr 54, val 8003
PCI NE2000 read addr 4, val 7
PCI NE2000 write addr 4, val 0
PCI PIIX3 read addr 0, val 70008086
PCI PIIX3 read addr 4, val 7
PCI PIIX3 read addr 8, val 6010000
PCI PIIX3 read addr c, val 800000
PCI PM read addr 0, val 71138086
PCI PIIX3 read addr 0, val 70008086
PCI PIIX3 read addr 4, val 7
PCI PIIX3 read addr 8, val 6010000
PCI PIIX3 read addr c, val 800000
PCI PM read addr 64, val 8000000
PCI PIIX3 read addr 0, val 70008086
PCI PIIX3 read addr 4, val 7
PCI PIIX3 read addr 8, val 6010000
PCI PIIX3 read addr c, val 800000
PCI PM read addr 0, val 71138086
PCI PIIX3 read addr 0, val 70008086
PCI PIIX3 read addr 4, val 7
PCI PIIX3 read addr 8, val 6010000
PCI PIIX3 read addr c, val 800000
PCI PM read addr 64, val 8000000
PCI Cirrus VGA read addr 4, val 7
PCI PIIX3 read addr 0, val 70008086
PCI PIIX3 read addr 4, val 7
PCI PIIX3 read addr 8, val 6010000
PCI PIIX3 read addr c, val 800000
PCI PIIX3 IDE read addr 4, val 7
PCI PIIX3 read addr 4, val 7
PCI PIIX3 read addr 0, val 70008086
PCI PIIX3 read addr 4, val 7
PCI PIIX3 read addr 8, val 6010000
PCI PIIX3 read addr c, val 800000
PCI PM read addr 0, val 71138086
PCI PIIX3 read addr 0, val 70008086
PCI PIIX3 read addr 4, val 7
PCI PIIX3 read addr 8, val 6010000
PCI PIIX3 read addr c, val 800000
PCI PM read addr 5c, val 90000000
PCI PIIX3 read addr 0, val 70008086
PCI PIIX3 read addr 4, val 7
PCI PIIX3 read addr 8, val 6010000
PCI PIIX3 read addr c, val 800000
PCI PM read addr 0, val 71138086
PCI PIIX3 read addr 0, val 70008086
PCI PIIX3 read addr 4, val 7
PCI PIIX3 read addr 8, val 6010000
PCI PIIX3 read addr c, val 800000
PCI PM read addr 5c, val 90000000
ACPI: DBG: 0x00000002
PM readw port=0x0002 val=0xb002
PM writew port=0x0002 val=0x0020
PM readw port=0x0002 val=0xb002
PM readw port=0x0004 val=0xb004
PM readw port=0x0004 val=0xb004
PM writew port=0x0004 val=0x2801
type 2
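The two PM register writes at the end of the log can be decoded with the bit
layout the ACPI specification gives for the PM1 fixed-hardware registers. The
sketch below assumes that "port=0x0002" and "port=0x0004" are offsets of the
PM1 enable and PM1 control registers from the emulated PM base; the struct and
function names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined for the ACPI PM1 enable and control registers. */
#define PM1_EN_GBL_EN		(1u << 5)
#define PM1_CNT_SCI_EN		(1u << 0)
#define PM1_CNT_SLP_TYP_SHIFT	10
#define PM1_CNT_SLP_TYP_MASK	(7u << PM1_CNT_SLP_TYP_SHIFT)
#define PM1_CNT_SLP_EN		(1u << 13)

/* Hypothetical decoded view of a PM1 control register value. */
struct pm1_cnt {
	bool sci_en;		/* SCI_EN: ACPI mode enabled */
	unsigned int slp_typ;	/* SLP_TYPx: requested sleep type, 0..7 */
	bool slp_en;		/* SLP_EN: writing 1 triggers the transition */
};

static struct pm1_cnt decode_pm1_cnt(uint16_t val)
{
	struct pm1_cnt r;

	r.sci_en = (val & PM1_CNT_SCI_EN) != 0;
	r.slp_typ = (val & PM1_CNT_SLP_TYP_MASK) >> PM1_CNT_SLP_TYP_SHIFT;
	r.slp_en = (val & PM1_CNT_SLP_EN) != 0;
	assert(r.slp_typ <= 7);	/* three-bit field */
	return r;
}

/* True if a value written to the PM1 enable register sets GBL_EN. */
static bool pm1_en_has_gbl_en(uint16_t val)
{
	return (val & PM1_EN_GBL_EN) != 0;
}
```

Under those assumptions, the write of 0x0020 sets GBL_EN, and the final write
of 0x2801 sets SLP_EN with SLP_TYP = 2, which would agree with the "type 2"
line that ends the log.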