From: Thomas Gleixner <tglx@linutronix.de>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>,
Len Brown <lenb@kernel.org>,
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>,
"Rafael J. Wysocki" <rjw@sisk.pl>
Subject: Re: [patch 0/2] suspend/resume regression fixes
Date: Sun, 23 Sep 2007 01:30:44 +0200 [thread overview]
Message-ID: <1190503844.4035.136.camel@chaos> (raw)
In-Reply-To: <alpine.LFD.0.999.0709221546250.16478@woody.linux-foundation.org>
Linus,
On Sat, 2007-09-22 at 15:59 -0700, Linus Torvalds wrote:
> > My final enlightment was, when I removed the ACPI processor module,
> > which controls the lower idle C-states, right before resume; this
> > worked fine all the time even without all the workaround hacks.
> >
> > I really hope that this two patches finally set an end to the "jinxed
> > VAIO heisenbug series", which started when we removed the periodic
> > tick with the clockevents/dyntick patches.
>
> Ok, so the patches look fine, but I somehow have this slight feeling that
> you gave up a bit too soon on the "*why* does this happen?" question.
Yeah, I gave up at the point where I was not longer able to dig
deeper :)
> I realize that the answer is easily "because ACPI screwed up", but I'm
> wondering if there's something we do to trigger that screw-up.
Fair enough.
> In particular, I also suspect that this may not really fix the problem -
> maybe it just makes the window sufficiently small that it no longer
> triggers. Because we don't necessarily understand what the real background
> for the problem is, I'm not sure we can say that it is solved.
>
> The reason I say this is that I have a suspicion on what triggers it.
>
> I suspect that the problem is that we do
>
> pm_ops->prepare();
> disable_nonboot_cpus()
> suspend_enter();
> enable_nonboot_cpus()
> pm_finish()
>
> and here the big thing to notice is that "pm_ops->prepare()" call, which
> sets the wakup vector etc etc.
>
> So maybe the real problem here is that once we've done the "->prepare()"
> call and ACPI has set up various stuff, we MUST NOT do any calls to any
> ACPI routines to set low-power states, because the stupid firmware isn't
> expecting it.
That's what I suspect and deduced from the various experiments including
a force the cpu into a lower c-state one, which triggered the problem
fully reproducible. Note that in case of the "force a lower c-state" I
verified, that the PIT was activated to avoid the local apic stops in c3
issue. But I never got an PIT interrupt. Either the box was completely
stuck or I was able to recover by hitting a key, which is as well one of
the unexplained phenomenons.
> Now, if this is the cause, then I think your patch should indeed fix it,
> since you get called by the early-suspend code (which happens *before* the
> "->prepare()" call), but at the same time, I wonder if maybe it would be
> slightly "more correct" to instead of using the suspend/resume callbacks,
> simply do this in the "acpi_pm_prepare()" stage, since that is likely the
> thing that triggers it?
Yeah, probably that's the correct point, but I leave this to the ACPI
wizards.
> But hey, I think I'll apply the patches as-is. I'd just feel even better
> if we actually understood *why* doing the CPU Cx states is not something
> we can do around the suspend code!
That needs some explanation of the folks who can actually look beyond
the ACPI/BIOS internals.
tglx
next prev parent reply other threads:[~2007-09-22 23:30 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-09-22 22:29 [patch 0/2] suspend/resume regression fixes Thomas Gleixner
2007-09-22 22:29 ` [patch 1/2] ACPI: disable lower idle C-states across suspend/resume Thomas Gleixner
2007-10-01 10:11 ` Andi Kleen
2007-09-22 22:29 ` [patch 2/2] clockevents: remove the suspend/resume workaround^Wthinko Thomas Gleixner
2007-09-22 22:59 ` [patch 0/2] suspend/resume regression fixes Linus Torvalds
2007-09-22 23:30 ` Thomas Gleixner [this message]
2007-09-23 1:20 ` Oleg Verych
2007-09-23 3:11 ` Linus Torvalds
2007-09-23 5:24 ` Mihai Donțu
2007-09-23 12:30 ` Alan Cox
2007-09-23 13:00 ` Mihai Donțu
2007-09-23 14:06 ` Matthew Garrett
2007-09-23 10:29 ` Rafael J. Wysocki
2007-09-28 20:27 ` Mark Lord
2007-09-28 20:33 ` Thomas Gleixner
2007-09-28 21:17 ` Mark Lord
2007-09-28 21:40 ` Rafael J. Wysocki
2007-09-28 21:04 ` Alan Cox
2007-09-29 17:12 ` Bill Davidsen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1190503844.4035.136.camel@chaos \
--to=tglx@linutronix.de \
--cc=akpm@linux-foundation.org \
--cc=lenb@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=rjw@sisk.pl \
--cc=torvalds@linux-foundation.org \
--cc=venkatesh.pallipadi@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.