xen-devel.lists.xenproject.org archive mirror
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Marek Marczykowski <marmarek@invisiblethingslab.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: High CPU temp, suspend problem - xen 4.1.5-pre, linux 3.7.x
Date: Fri, 22 Mar 2013 12:56:51 -0400
Message-ID: <20130322165651.GA4827@phenom.dumpdata.com>
In-Reply-To: <514C79F3.5050504@invisiblethingslab.com>

On Fri, Mar 22, 2013 at 04:34:11PM +0100, Marek Marczykowski wrote:
> On 15.03.2013 14:02, Konrad Rzeszutek Wilk wrote:
> > On Wed, Mar 13, 2013 at 09:50:39PM +0100, Marek Marczykowski wrote:
> >> Hi,
> >>
> >> I still have problems with ACPI(?) on Xen. After some system startups or
> >> resumes the CPU temperature goes high although all domUs (and dom0) are idle. On
> >> a "good" system startup it is about 50-55C, on a "bad" one above 67C (most of the
> >> time above 70C). I've noticed a difference in the C-states reported by Xen (attached
> >> files). On "bad" startups, in addition, suspend doesn't work - the system restarts
> >> during suspend (I still haven't managed to get console messages - I don't have
> >> a serial port on this system). Note that sometimes the system boots fine ("good"
> >> state), but the problem occurs after some suspend/resume cycles. Some time ago
> >> I got other symptoms: only CPU0 was used - for all VCPUs (according to xl
> >> vcpu-list). Maybe it is related?
> >>
> >> Hardware: Dell Latitude E6420
> >> CPU: Intel i5-2520M
> >>
> >> Software:
> >> xen stable-4.1 as of 15.02 (last commit: "xen: sched_credit: improve picking
> >> up the idle CPU for a VCPU"), with reverted commit "Introduce system_state
> >> variable."
> >> But the same problem on vanilla xen 4.1.2.
> >>
> >> Linux 3.7.6 - happens on almost every boot. On Linux 3.7.4 it happens much
> >> more rarely (but still occurs).
> >> Kernel config:
> >> http://git.qubes-os.org/gitweb/?p=marmarek/kernel.git;a=blob;f=config-pvops;h=a6e953f71cdc84556571b592b8af87a5a4f9a8d0;hb=HEAD
> >> I've tried bisecting from 3.7.4 to 3.7.6, but without success because the
> >> problem isn't 100% reproducible.
> >>
> >> Any ideas?
> > 
> > That C-states difference is important. The SYSIO part on your box means that the
> > CPU ends up doing an MWAIT. A HALT, on the other hand, is not so power-saving
> > friendly.
> > 
> > Looking at this:
> >> (XEN) no cpu_id for acpi_id 5
> >> (XEN) no cpu_id for acpi_id 6
> >> (XEN) no cpu_id for acpi_id 7
> >> (XEN) no cpu_id for acpi_id 8
> > 
> > .. means that xen-acpi-processor was trying to probe for the ACPI IDs of the
> > other CPUs that the machine theoretically can support. That means it got
> > the ACPI information for the first four CPUs (which is good).
> > 
> > As a first step in trying to figure this out, you can add #define DEBUG 1
> > in xen-acpi-processor.c right before any of the #includes, and also boot
> > Xen with 'cpufreq=verbose'. That should tell you what kind of C-states
> > xen-acpi-processor uploaded (and whether it did so for all of the vCPUs).
> > 
> > If both bootups show that we do upload the C-states for all the CPUs but they
> > vary, that means digging a bit deeper in the ACPI code, specifically in
> > acpi_processor_get_power_info_cst, and seeing if it hits any of the 'continue'
> > statements.
> > 
> > Then I would say also take the DSDT from both bootups and compare them. It might
> > be that the BIOS is using a scratch register at reboot to construct the C-states
> > and somehow it ends up being corrupted, which means that on the next warm reboot
> > the C-states have bogus data. This does show up in the field :-(
> 
> Finally I've found some time to debug this further. And it looks like
> some deeper ACPI code problem...
> 
> I've switched to 3.8.4, on which the problem is much easier to reproduce (almost
> every startup).
> 
> On a bad bootup, xen-acpi-processor didn't find any C-states: for each CPU
> _pr.flags.power and _pr->power.count were 0 (but flags.power_setup_done=1). In
> this case suspend (or shutdown) always ends up with a reset.

Is this with the machine booting from a cold state or a warm one?

There are some BIOSes out there that I know use the scratchpad registers in the
IOH (depending on the platform that can be 0:0e.1, Reg 0x84). If Xen or Linux
touches them then the P-states and C-states that the BIOS generates are buggy.

But that is not the case here - you are saying that after disassembling the DSDT
(so cat /sys/firmware/acpi/tables/DSDT, or SSDT*, and then iasl -d on them), the
_PSD, _PSS, and _PCT look the same?

You could also look at the FACP table and see if it differs between the two bootups.
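
A minimal sketch of that comparison, assuming the raw tables from
/sys/firmware/acpi/tables were copied into one directory per bootup (the
directory names and the helper names table_digests and compare_dumps are
hypothetical, not part of any Xen or ACPI tool). It only compares the raw bytes
of each table file; subdirectories are skipped, and it does not replace reading
the iasl -d output:

```python
import hashlib
from pathlib import Path


def table_digests(dump_dir):
    """Map each table file name in dump_dir to the SHA-256 of its raw bytes."""
    digests = {}
    for path in sorted(Path(dump_dir).iterdir()):
        # Only regular files are hashed; subdirectories are ignored.
        if path.is_file():
            digests[path.name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_dumps(good_dir, bad_dir):
    """Return sorted names of tables that differ or exist in only one dump."""
    good = table_digests(good_dir)
    bad = table_digests(bad_dir)
    return sorted(
        name for name in good.keys() | bad.keys()
        if good.get(name) != bad.get(name)
    )
```

For example, compare_dumps("tables-good", "tables-bad") would return an empty
list if every table (DSDT, SSDT*, FACP, ...) is byte-identical across the two
bootups, and otherwise the names of the tables worth disassembling first.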
> 
> On a good one xen-acpi-processor got C1-C3 states for each CPU, then suspend
> succeeded, but after resume CPU0 had C1-C3 while the others had only C1. Reloading
> xen-acpi-processor (rmmod -f...) fixes this (according to xl debug-key c), but
> the temperature still stays high. Regardless of xen-acpi-processor reloading, the
> next suspend always fails.

If you reload, and look at the runqueues, are all of them using the ACPI
idler or the default one?

> 
> Not sure how C-states can be related to S3 suspend, but perhaps something more
> general with ACPI is wrong?

This reminds me of something. I recall seeing something like this a long, long time ago....
Completely forgot about this until now. The difference was whether Xen's cpu_idle
was running a) the acpi_idle (so using the different C-states), or b) the default one
(so just using HLT).

With the b), during resume it would get half-way through
(http://darnok.org/xen/devel.acpi-s3.v1.serial.log) while with a) it would actually
continue on - http://darnok.org/xen/devel.acpi-s3.v0.serial.log

This was on some MSI MS-7680/H61M-P23 (MS-7680) motherboard.

Oh look: http://lists.xen.org/archives/html/xen-devel/2011-06/msg02059.html

And it looks like Kevin's recommendation was to use the a) case with max_cstates=1
to narrow it down.

> 
> Each time the DSDT (taken from /sys/firmware/acpi/tables) is exactly the same.
> 
> -- 
> Best Regards / Pozdrawiam,
> Marek Marczykowski
> Invisible Things Lab
> 
