* Re: [ACPI] Re: Cause of Suspend2 resume failures on Toshiba laptops
2005-03-05 17:56 ` Re: [Suspend2-devel] " Pavel Machek
@ 2005-03-05 20:35 ` Bernard Blackham
2005-03-06 12:33 ` Pavel Machek
2005-03-07 16:48 ` Re: [Suspend2-devel] " John M Flinchbaugh
0 siblings, 2 replies; 10+ messages in thread
From: Bernard Blackham @ 2005-03-05 20:35 UTC (permalink / raw)
To: Pavel Machek; +Cc: acpi-devel, suspend2-devel
[-- Attachment #1: Type: text/plain, Size: 1788 bytes --]
On Sat, Mar 05, 2005 at 06:56:41PM +0100, Pavel Machek wrote:
> > After more investigation, it seems that the issue is the GPE is
> > fired but not serviced because kacpid is frozen. This in itself
> > would be okay, however, the GPE isn't being disabled before the
> > method executes (despite there being code there to do so), and hence
> > fires continuously.
>
> Perhaps you should fix that, too? It is going to cause ugly
> performance problems.
Yep, been looking into it, and I think I've got it. The GPE in
question is fired periodically, about every 5 seconds. It fires when
suspending but kacpid is stopped, so the GPE is simply disabled and
never serviced. However, the state of it being disabled is recorded
in the atomic copy.
Upon resume, after restoring the atomic copy, the code at the top of
acpi_ev_disable_gpe believes that the GPE is already disabled (as it
was when we suspended) even though it's not. Hardware state is out
of sync with what the kernel thinks, and badness ensues. The relevant
difference between 2.6.8 and 2.6.9-rc1 is that 2.6.8 disabled the
GPE unconditionally, whereas 2.6.9-rc1 first checks its last known
state, which is stale upon resuming.
Removing the check resolves this issue (patch attached). Is this an
adequate fix?
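To make the mismatch concrete, here is a minimal user-space sketch, not kernel code. The struct and function names are hypothetical stand-ins rather than the ACPICA API, and the step where the resume boot leaves the GPE enabled in hardware is an assumption about how the states diverge.

```c
#include <stdbool.h>

/* Hypothetical model of one GPE; not the real ACPICA structures. */
struct gpe_model {
    bool sw_enabled;  /* what the kernel believes (part of the memory image) */
    bool hw_enabled;  /* what the enable register actually holds */
};

/* 2.6.9-rc1 style: skip the hardware write if the flag says "disabled". */
static void disable_checked(struct gpe_model *g)
{
    if (!g->sw_enabled)
        return;
    g->sw_enabled = false;
    g->hw_enabled = false;
}

/* 2.6.8 style: always write the hardware. */
static void disable_unconditional(struct gpe_model *g)
{
    g->sw_enabled = false;
    g->hw_enabled = false;
}

/*
 * Walk through suspend and resume with the given disable routine.
 * Returns true if the GPE is still enabled in hardware after the
 * post-resume disable attempt, i.e. it would keep firing.
 */
static bool gpe_stuck_after_resume(void (*disable)(struct gpe_model *))
{
    struct gpe_model g = { .sw_enabled = true, .hw_enabled = true };

    disable(&g);                     /* GPE fires while suspending */
    bool saved_flag = g.sw_enabled;  /* atomic copy records "disabled" */

    g.hw_enabled = true;             /* assumption: resume boot re-enables it */
    g.sw_enabled = saved_flag;       /* image restore brings back stale flag */

    disable(&g);                     /* GPE fires again after resume */
    return g.hw_enabled;
}
```

Under these assumptions, gpe_stuck_after_resume(disable_checked) reports the GPE as stuck, while gpe_stuck_after_resume(disable_unconditional) does not.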
> Looks good to me. It is a good idea for other reasons, too: for
> example, we'll be able to fight overheating while writing pages.
Fair enough. Even with kacpid running, we may still potentially run
into the same issue if a GPE occurs at the instant prior to the
atomic copy (and gets disabled in the atomic copy).
So, combining this patch with the previous kacpid NOFREEZE patch...
Does this make people happy? :) (And should it be submitted via Len
or akpm?)
Bernard.
--
Bernard Blackham <bernard at blackham dot com dot au>
[-- Attachment #2: toshiba-acpi-hack-3.diff --]
[-- Type: text/plain, Size: 461 bytes --]
--- linux-2.6.9-rc1/drivers/acpi/events/evgpe.c.orig 2005-03-06 04:29:40.000000000 +0800
+++ linux-2.6.9-rc1/drivers/acpi/events/evgpe.c 2005-03-06 04:29:49.000000000 +0800
@@ -253,10 +253,6 @@
 	ACPI_FUNCTION_TRACE ("ev_disable_gpe");
 
 
-	if (!(gpe_event_info->flags & ACPI_GPE_ENABLE_MASK)) {
-		return_ACPI_STATUS (AE_OK);
-	}
-
 	/* Make sure HW enable masks are updated */
 
 	status = acpi_ev_update_gpe_enable_masks (gpe_event_info, ACPI_GPE_DISABLE);
* Re: [ACPI] Re: Cause of Suspend2 resume failures on Toshiba laptops
2005-03-05 20:35 ` [ACPI] " Bernard Blackham
@ 2005-03-06 12:33 ` Pavel Machek
2005-03-07 16:48 ` Re: [Suspend2-devel] " John M Flinchbaugh
1 sibling, 0 replies; 10+ messages in thread
From: Pavel Machek @ 2005-03-06 12:33 UTC (permalink / raw)
To: Pavel Machek, Dumitru Ciobarcianu, acpi-devel, suspend2-devel
Hi!
> > Perhaps you should fix that, too? It is going to cause ugly
> > performance problems.
>
> Yep, been looking into it, and I think I've got it. The GPE in
> question is fired periodically, about every 5 seconds. It fires when
> suspending but kacpid is stopped, so the GPE is simply disabled and
> never serviced. However, the state of it being disabled is recorded
> in the atomic copy.
>
> Upon resume, after restoring the atomic copy, the code at the top of
> acpi_ev_disable_gpe believes that the GPE is already disabled (as it
> was when we suspended) even though it's not. Hardware state is out
> of sync with what the kernel thinks and badness ensues. The culprit
> difference between 2.6.8 and 2.6.9-rc1 is that 2.6.8 disabled the
> GPE unconditionally, 2.6.9-rc1 checks against its last known state
> which is incorrect upon resuming.
>
> Removing the check resolves this issue (patch attached). Is this an
> adequate fix?
Doesn't removing the check cause problems elsewhere? Bring it up
on the linux-acpi list...
The obvious solution would be to add suspend/resume routines to ACPI
and properly set the flag during resume.
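A rough sketch of what such a resume routine might do, written as a user-space model. The names gpe_state and gpe_resume_resync are hypothetical, and reading the hardware bit back is only one possible way to "properly set the flag"; a real implementation would have to go through the ACPI register access code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-GPE bookkeeping; not the real ACPICA structures. */
struct gpe_state {
    bool flag_enabled;  /* software flag restored from the image */
    bool hw_enabled;    /* stand-in for the bit in the enable register */
};

/*
 * Hypothetical resume hook: make each software flag agree with the
 * hardware bit. Returns how many flags had to be corrected.
 */
static size_t gpe_resume_resync(struct gpe_state *gpes, size_t count)
{
    size_t fixed = 0;

    for (size_t i = 0; i < count; i++) {
        if (gpes[i].flag_enabled != gpes[i].hw_enabled) {
            gpes[i].flag_enabled = gpes[i].hw_enabled;
            fixed++;
        }
    }
    return fixed;
}
```

With the flags back in sync, the "already disabled" check at the top of the disable path would no longer skip the hardware write for a GPE that is in fact enabled.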
> So, combining this patch with the previous kacpid NOFREEZE patch...
> Does this make people happy? :) (And should it be submitted via Len
> or akpm?)
Len.
--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms
* Re: [ACPI] Re: Cause of Suspend2 resume failures on Toshiba laptops
2005-03-07 16:48 ` Re: [Suspend2-devel] " John M Flinchbaugh
@ 2005-03-07 16:54 ` Bernard Blackham
0 siblings, 0 replies; 10+ messages in thread
From: Bernard Blackham @ 2005-03-07 16:54 UTC (permalink / raw)
To: John M Flinchbaugh; +Cc: acpi-devel, suspend2-devel
On Mon, Mar 07, 2005 at 11:48:00AM -0500, John M Flinchbaugh wrote:
> On Sun, Mar 06, 2005 at 04:35:33AM +0800, Bernard Blackham wrote:
> > So, combining this patch with the previous kacpid NOFREEZE patch...
> > Does this make people happy? :) (And should it be submitted via Len
> > or akpm?)
> > --
> > Bernard Blackham <bernard at blackham dot com dot au>
>
> > --- linux-2.6.9-rc1/drivers/acpi/events/evgpe.c.orig 2005-03-06 04:29:40.000000000 +0800
> > +++ linux-2.6.9-rc1/drivers/acpi/events/evgpe.c 2005-03-06 04:29:49.000000000 +0800
> > @@ -253,10 +253,6 @@
> > ACPI_FUNCTION_TRACE ("ev_disable_gpe");
> >
> >
> > - if (!(gpe_event_info->flags & ACPI_GPE_ENABLE_MASK)) {
> > - return_ACPI_STATUS (AE_OK);
> > - }
> > -
> > /* Make sure HW enable masks are updated */
> >
> > status = acpi_ev_update_gpe_enable_masks (gpe_event_info, ACPI_GPE_DISABLE);
>
> I'm looking into your patches in hopes to eliminate system freezes
> related to ACPI and swsusp on my Thinkpad R40.
>
> I can apply this patch to 2.6.11.1, but the previous PF_NOFREEZE patch
> won't compile. Would you mind providing a proper patch for 2.6.11.1
> that makes all the changes you intend?
The kacpid NOFREEZE patch is only relevant for Software Suspend 2.
Bernard.
--
Bernard Blackham <bernard at blackham dot com dot au>
* RE: [ACPI] Re: Cause of Suspend2 resume failures on Toshiba laptops
2005-03-08 3:25 Re: [Suspend2-devel] " Li, Shaohua
@ 2005-03-08 5:28 ` Nigel Cunningham
2005-03-08 6:12 ` Li Shaohua
0 siblings, 1 reply; 10+ messages in thread
From: Nigel Cunningham @ 2005-03-08 5:28 UTC (permalink / raw)
To: Li, Shaohua; +Cc: Bernard Blackham, ACPI List, Suspend2 Development
Hi.
On Tue, 2005-03-08 at 14:25, Li, Shaohua wrote:
> >Ahh, d'oh. Try this patch instead.
> >
> >After more investigation, it seems that the issue is the GPE is
> >fired but not serviced because kacpid is frozen. This in itself
> >would be okay, however, the GPE isn't being disabled before the
> >method executes (despite there being code there to do so), and hence
> >fires continuously.
> >
> >The attached patch makes kacpid NOFREEZE so the GPE does get
> >serviced (eventually), but probably still isn't the correct
> >solution. (A better solution being making sure the GPE gets disabled
> >in the first place ... still looking into this).
> Hi,
> Did you guys try the suspend method 'platform'? In this case, ACPI will
> disable all non-wakeup GPEs.
Suspend2 supports entering S3, S4 or S5 once the image has been written.
For S3 or S4, we use the platform method, but when entering S5, we don't
and neither does Pavel.
Looking at the code, I can see that we could probably rework our support
a little as we're calling the prepare method immediately before powering
down rather than at the start of the process.
One question though: do you see any issues with calling prepare() and
finish() with the PM_DISK_PLATFORM parameter (as if entering S4 sleep)
when we're actually going to do an S5 powerdown? If this will allow the
wake events to fire (as they would for S4), it could be good for
machines where S4 isn't working properly.
The general flow of events for Suspend2 goes
1. Initial preparation
2. Freeze processes.
3. Write LRU pages.
4. Do & write atomic copy of remainder of memory.
5. Powerdown as requested, including the pm_ops->prepare() & enter().
Looking at 2.6.11, I see that Pavel does:
1. Initial preparation
2. Freeze processes.
3. pm_ops->prepare().
4. Do & write atomic copy.
5. pm_ops->enter() (PM_DISK_PLATFORM only, otherwise machine_power_off
or restart).
(Probably post resume)
6. Cleanup, including pm_ops->finish()
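The two orderings above can be written down as data to make the difference explicit. This is only an illustrative sketch: the enum values and array names are hypothetical labels for the steps listed, not kernel identifiers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the steps in the two lists above. */
enum step {
    STEP_INIT, STEP_FREEZE, STEP_WRITE_LRU, STEP_PREPARE,
    STEP_ATOMIC_COPY, STEP_ENTER, STEP_FINISH
};

#define FLOW_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Suspend2: prepare() only happens as part of the final powerdown. */
static const enum step suspend2_flow[] = {
    STEP_INIT, STEP_FREEZE, STEP_WRITE_LRU,
    STEP_ATOMIC_COPY, STEP_PREPARE, STEP_ENTER
};

/* swsusp in 2.6.11: prepare() happens before the atomic copy. */
static const enum step swsusp_flow[] = {
    STEP_INIT, STEP_FREEZE, STEP_PREPARE,
    STEP_ATOMIC_COPY, STEP_ENTER, STEP_FINISH
};

/* True if prepare() has already run when the atomic copy is taken. */
static bool prepare_precedes_copy(const enum step *flow, size_t n)
{
    bool prepared = false;

    for (size_t i = 0; i < n; i++) {
        if (flow[i] == STEP_PREPARE)
            prepared = true;
        if (flow[i] == STEP_ATOMIC_COPY)
            return prepared;
    }
    return false;
}
```

prepare_precedes_copy() is true for swsusp_flow and false for suspend2_flow, which is the reordering being considered here.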
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028; Mob: +61 (417) 100 574
Maintainer of Suspend2 Kernel Patches http://softwaresuspend.berlios.de
* RE: [ACPI] Re: Cause of Suspend2 resume failures on Toshiba laptops
2005-03-08 5:28 ` [ACPI] " Nigel Cunningham
@ 2005-03-08 6:12 ` Li Shaohua
2005-03-08 6:55 ` Nigel Cunningham
0 siblings, 1 reply; 10+ messages in thread
From: Li Shaohua @ 2005-03-08 6:12 UTC (permalink / raw)
To: ncunningham; +Cc: Bernard Blackham, ACPI List, Suspend2 Development
Hi,
On Tue, 2005-03-08 at 13:28, Nigel Cunningham wrote:
> > >
> > >After more investigation, it seems that the issue is the GPE is
> > >fired but not serviced because kacpid is frozen. This in itself
> > >would be okay, however, the GPE isn't being disabled before the
> > >method executes (despite there being code there to do so), and hence
> > >fires continuously.
> > >
> > >The attached patch makes kacpid NOFREEZE so the GPE does get
> > >serviced (eventually), but probably still isn't the correct
> > >solution. (A better solution being making sure the GPE gets disabled
> > >in the first place ... still looking into this).
> > Hi,
> > Did you guys try the suspend method 'platform'? In this case, ACPI will
> > disable all non-wakeup GPEs.
>
> Suspend2 supports entering S3, S4 or S5 once the image has been written.
> For S3 or S4, we use the platform method, but when entering S5, we don't
> and neither does Pavel.
ACPI also disables all non-wakeup GPEs in S5. I guess the problem is that
we probably disable the GPEs too late in swsusp2. swsusp writes the image
after the GPEs are disabled, but swsusp2 doesn't (sorry, I'm not familiar
with swsusp2, but it looks that way from the steps you wrote down).
> Looking at the code, I can see that we could probably rework our support
> a little as we're calling the prepare method immediately before powering
> down rather than at the start of the process.
>
> One question though: do you see any issues with calling prepare() and
> finish() with the PM_DISK_PLATFORM parameter (as if entering S4 sleep)
> when we're actually going to do an S5 powerdown? If this will allow the
> wake events to fire (as they would for S4), it could be good for
> machines where S4 isn't working properly..
This possibly won't work. We must give the BIOS the correct sleep state
(4 or 5).
Thanks,
Shaohua
* RE: [ACPI] Re: Cause of Suspend2 resume failures on Toshiba laptops
2005-03-08 6:12 ` Li Shaohua
@ 2005-03-08 6:55 ` Nigel Cunningham
0 siblings, 0 replies; 10+ messages in thread
From: Nigel Cunningham @ 2005-03-08 6:55 UTC (permalink / raw)
To: Li Shaohua; +Cc: Bernard Blackham, ACPI List, Suspend2 Development
Hi.
On Tue, 2005-03-08 at 17:12, Li Shaohua wrote:
> ACPI also disables all non-wakeup GPEs in S5. I guess the problem is we
> probably disable the GPEs too lately in swsusp2. swsusp writes image
> after GPEs are disabled, but swsusp2 isn't (Sorry, I'm not familiar with
> swsusp2, but it looks like this according to the steps you write down).
Okay. Bernard is just changing this at the moment.
> > Looking at the code, I can see that we could probably rework our support
> > a little as we're calling the prepare method immediately before powering
> > down rather than at the start of the process.
> >
> > One question though: do you see any issues with calling prepare() and
> > finish() with the PM_DISK_PLATFORM parameter (as if entering S4 sleep)
> > when we're actually going to do an S5 powerdown? If this will allow the
> > wake events to fire (as they would for S4), it could be good for
> > machines where S4 isn't working properly..
> This possibly can't work. We must give the BIOS a correct sleep state (4
> or 5).
Ok. Since S5 enabled the GPEs, we can do what we ought to anyway :>
Nigel
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028; Mob: +61 (417) 100 574
Maintainer of Suspend2 Kernel Patches http://softwaresuspend.berlios.de
* RE: [ACPI] Re: Cause of Suspend2 resume failures on Toshiba laptops
@ 2005-03-08 6:58 Li, Shaohua
2005-03-08 9:47 ` Bernard Blackham
0 siblings, 1 reply; 10+ messages in thread
From: Li, Shaohua @ 2005-03-08 6:58 UTC (permalink / raw)
To: Bernard Blackham, Pavel Machek; +Cc: acpi-devel, suspend2-devel
>>
>> Perhaps you should fix that, too? It is going to cause ugly
>> performance problems.
>
>Yep, been looking into it, and I think I've got it. The GPE in
>question is fired periodically, about every 5 seconds. It fires when
>suspending but kacpid is stopped, so the GPE is simply disabled and
>never serviced. However, the state of it being disabled is recorded
>in the atomic copy.
>
>Upon resume, after restoring the atomic copy, the code at the top of
>acpi_ev_disable_gpe believes that the GPE is already disabled (as it
>was when we suspended) even though it's not. Hardware state is out
>of sync with what the kernel thinks and badness ensues. The culprit
>difference between 2.6.8 and 2.6.9-rc1 is that 2.6.8 disabled the
>GPE unconditionally, 2.6.9-rc1 checks against its last known state
>which is incorrect upon resuming.
>
>Removing the check resolves this issue (patch attached). Is this an
>adequate fix?
Great analysis! I guess S3, swsusp, and swsusp2 can't suffer from
this bug. Sure, this patch doesn't have any side effects. But I wonder if
it's the right fix. Maybe we should disable all GPEs which have been
disabled very early in resume (that is, restore the ACPI hardware's
state). This is very useful for suspend-to-disk. If you find a similar bug
in S3, it's possibly a BIOS bug (the OS didn't enable any GPE on resume
for S3, but for S4 we follow the boot-time code path, so we possibly
enabled some).
Thanks,
Shaohua
* RE: [ACPI] Re: Cause of Suspend2 resume failures on Toshiba laptops
@ 2005-03-08 7:24 Li, Shaohua
0 siblings, 0 replies; 10+ messages in thread
From: Li, Shaohua @ 2005-03-08 7:24 UTC (permalink / raw)
To: Li, Shaohua, Bernard Blackham, Pavel Machek; +Cc: acpi-devel, suspend2-devel
>>Yep, been looking into it, and I think I've got it. The GPE in
>>question is fired periodically, about every 5 seconds. It fires when
>>suspending but kacpid is stopped, so the GPE is simply disabled and
>>never serviced. However, the state of it being disabled is recorded
>>in the atomic copy.
>>
>>Upon resume, after restoring the atomic copy, the code at the top of
>>acpi_ev_disable_gpe believes that the GPE is already disabled (as it
>>was when we suspended) even though it's not. Hardware state is out
>>of sync with what the kernel thinks and badness ensues. The culprit
>>difference between 2.6.8 and 2.6.9-rc1 is that 2.6.8 disabled the
>>GPE unconditionally, 2.6.9-rc1 checks against its last known state
>>which is incorrect upon resuming.
>>
>>Removing the check resolves this issue (patch attached). Is this an
>>adequate fix?
>Great analyzing! I guess Both S3, swsusp, and swsusp2 can't suffer from
>this bug. Sure this patch hasn't any side effect. But I wonder if it's
>the right fix. Maybe we should disable all GPEs which have been disabled
>on very early of resume (that is to restore ACPI hardware's state). This
>is very useful for suspend-to-disk. If you found a similar bug in S3, it
>possibly it's BIOS bug (OS didn't enable GPE on resume for S3, but for
>S4, we follow boot time code path, so we possibly enabled some).
Oops, my bad. ACPI already does this, so the 'platform' method should
survive. For the S5 method, we should disable the disabled GPEs. Leaving
the GPE on will have other side effects. The S5 method also has other side
effects (e.g. it doesn't invoke _WAK), so I suggest always using the
'platform' method.
Shaohua
* Re: [ACPI] Re: Cause of Suspend2 resume failures on Toshiba laptops
2005-03-08 6:58 [ACPI] Re: Cause of Suspend2 resume failures on Toshiba laptops Li, Shaohua
@ 2005-03-08 9:47 ` Bernard Blackham
[not found] ` <20050308094711.GC20154-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
0 siblings, 1 reply; 10+ messages in thread
From: Bernard Blackham @ 2005-03-08 9:47 UTC (permalink / raw)
To: Li, Shaohua; +Cc: Pavel Machek, acpi-devel, suspend2-devel
On Tue, Mar 08, 2005 at 02:58:01PM +0800, Li, Shaohua wrote:
> >Removing the check resolves this issue (patch attached). Is this an
> >adequate fix?
>
> Great analyzing! I guess Both S3, swsusp, and swsusp2 can't suffer from
> this bug. Sure this patch hasn't any side effect.
At least one user has reported that this patch makes the battery
status unreadable. - http://lists.suspend2.net/lurker/message/20050306.201243.cbbca9ea.en.html
I don't know why this is the case yet, but I'd hesitate to use it
until this is sorted out.
> But I wonder if it's the right fix. Maybe we should disable all
> GPEs which have been disabled on very early of resume (that is to
> restore ACPI hardware's state).
Does calling acpi_enter_sleep_state_prep through pm_ops->prepare()
disable runtime GPEs? I can only see that done in
acpi_enter_sleep_state called through pm_ops->enter(). But calling
enter() will physically enter the sleep state too which isn't what
we want at that stage.
Hence I'm still unclear on how to disable the GPEs from swsusp2
without delving into drivers/acpi/. Should runtime GPEs be disabled
in prepare() ? The ACPI spec doesn't mention what to do with runtime
GPEs on suspend, but Pavel suggested that keeping kacpid running
during suspend would be a good idea in order to respond to things
like thermal events.
Vanilla swsusp seems to call pm_ops->prepare() on resume before
restoring the atomic copy, if the platform method is used, and later
call pm_ops->finish() once resumed. Is this valid? I gather that it
means that the entire sequence goes:
- prepare()
- snapshot & write to disk
- enter()
- wakeup & boot
- read image from disk
- prepare()
- restore snapshot
- finish()
Are the two calls to prepare() intentional or safe? To me it seems
that GPEs aren't disabled until enter() is called so currently
calling prepare() on resume wouldn't synchronise the GPE states.
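The sequence listed above can be tallied mechanically. This is a small illustrative sketch only; the enum labels are hypothetical names for the listed steps and say nothing about what each callback does internally.

```c
#include <stddef.h>

/* Hypothetical labels for the steps of the platform-method cycle. */
enum cycle_step {
    CYCLE_PREPARE, CYCLE_SNAPSHOT_AND_WRITE, CYCLE_ENTER,
    CYCLE_WAKEUP_BOOT, CYCLE_READ_IMAGE, CYCLE_RESTORE_SNAPSHOT,
    CYCLE_FINISH
};

/* The full suspend/resume sequence as listed above. */
static const enum cycle_step platform_cycle[] = {
    CYCLE_PREPARE, CYCLE_SNAPSHOT_AND_WRITE, CYCLE_ENTER,
    CYCLE_WAKEUP_BOOT, CYCLE_READ_IMAGE, CYCLE_PREPARE,
    CYCLE_RESTORE_SNAPSHOT, CYCLE_FINISH
};

#define CYCLE_LEN (sizeof(platform_cycle) / sizeof(platform_cycle[0]))

/* Count how often a given step occurs in a sequence. */
static size_t count_calls(const enum cycle_step *seq, size_t n,
                          enum cycle_step which)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++)
        if (seq[i] == which)
            hits++;
    return hits;
}
```

Counting over platform_cycle gives two prepare() calls against a single finish(), which is the asymmetry in question.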
Bernard.
--
Bernard Blackham <bernard at blackham dot com dot au>
* Re: Re: [Suspend2-devel] Cause of Suspend2 resume failures on Toshiba laptops
[not found] ` <20050308094711.GC20154-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
@ 2005-03-08 10:02 ` Sebastian Kügler
0 siblings, 0 replies; 10+ messages in thread
From: Sebastian Kügler @ 2005-03-08 10:02 UTC (permalink / raw)
To: suspend2-devel-g/J2nbn7YhmWn91e4EydUaxOck334EZe
Cc: Bernard Blackham, Li, Shaohua, Pavel Machek,
acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
[-- Attachment #1: Type: text/plain, Size: 1097 bytes --]
On Tuesday 08 March 2005 10:47, Bernard Blackham wrote:
> On Tue, Mar 08, 2005 at 02:58:01PM +0800, Li, Shaohua wrote:
> > >Removing the check resolves this issue (patch attached). Is this an
> > >adequate fix?
> >
> > Great analyzing! I guess Both S3, swsusp, and swsusp2 can't suffer from
> > this bug. Sure this patch hasn't any side effect.
>
> At least one user has reported that this patch makes the battery
> status unreadable. -
> http://lists.suspend2.net/lurker/message/20050306.201243.cbbca9ea.en.html
>
> I don't know why this is the case yet, but I'd hesitate to use it
> until this is sorted out.
That's me, and I'm prepared to run tests for you to help resolve the issue.
Please shoot right away...
Cheers,
sebas
--
http://vizZzion.org | GPG Key ID: 9119 0EF9
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
I contend that we are both atheists. I just believe in one fewer god than you
do. When you understand why you dismiss all the other possible gods, you will
understand why I dismiss yours. - Sir Stephen Henry Roberts
end of thread
Thread overview: 10+ messages
2005-03-08 6:58 [ACPI] Re: Cause of Suspend2 resume failures on Toshiba laptops Li, Shaohua
2005-03-08 9:47 ` Bernard Blackham
[not found] ` <20050308094711.GC20154-4vSAtV5O1nc0n/F98K4Iww@public.gmane.org>
2005-03-08 10:02 ` Re: [Suspend2-devel] " Sebastian Kügler
-- strict thread matches above, loose matches on Subject: below --
2005-03-08 7:24 [ACPI] " Li, Shaohua
2005-03-08 3:25 Re: [Suspend2-devel] " Li, Shaohua
2005-03-08 5:28 ` [ACPI] " Nigel Cunningham
2005-03-08 6:12 ` Li Shaohua
2005-03-08 6:55 ` Nigel Cunningham
[not found] <20050304175058.GA4042@blackham.com.au>
[not found] ` <1110012298.6028.10.camel@DustPuppy.LNX.RO>
[not found] ` <20050305100254.GH4042@blackham.com.au>
2005-03-05 17:56 ` Re: [Suspend2-devel] " Pavel Machek
2005-03-05 20:35 ` [ACPI] " Bernard Blackham
2005-03-06 12:33 ` Pavel Machek
2005-03-07 16:48 ` Re: [Suspend2-devel] " John M Flinchbaugh
2005-03-07 16:54 ` [ACPI] " Bernard Blackham