From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marek Marczykowski
Subject: Re: High CPU temp, suspend problem - xen 4.1.5-pre, linux 3.7.x
Date: Tue, 16 Apr 2013 03:02:53 +0200
Message-ID: <516CA33D.3090503@invisiblethingslab.com>
References: <5140E69F.9090803@invisiblethingslab.com>
 <20130322165651.GA4827@phenom.dumpdata.com>
 <515036BF.10105@invisiblethingslab.com>
 <20130325141701.GI11546@phenom.dumpdata.com>
 <515191CC.6060609@invisiblethingslab.com>
 <5151AC8C02000078000C88B9@nat28.tlf.novell.com>
 <5151A788.809@invisiblethingslab.com>
 <5151D4CC02000078000C8A1C@nat28.tlf.novell.com>
 <5151D0A9.7070100@invisiblethingslab.com>
 <5151D49C.2000809@citrix.com>
 <5151DE1C.1020307@invisiblethingslab.com>
 <5151E0D5.3050707@citrix.com>
 <5151E72D.30205@invisiblethingslab.com>
 <5151EE0B.9030605@citrix.com>
 <5152C16E02000078000C8CB8@nat28.tlf.novell.com>
 <515302C3.3000607@invisiblethingslab.com>
 <51547A5302000078000C962C@nat28.tlf.novell.com>
 <515493F0.6030708@invisiblethingslab.com>
 <515A30D4.90003@invisiblethingslab.com>
 <516C7AB2.8050406@invisiblethingslab.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============7005348631475412505=="
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ben Guthro
Cc: Andrew Cooper, "xen-devel@lists.xen.org", Jan Beulich, Konrad Rzeszutek Wilk
List-Id: xen-devel@lists.xenproject.org

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--===============7005348631475412505==
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature";
 boundary="------------enig948A0DC48C69D87B78886A16"

--------------enig948A0DC48C69D87B78886A16
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On 16.04.2013 01:36, Ben Guthro wrote:
> On Mon, Apr 15, 2013 at 11:09 PM, Marek Marczykowski
> wrote:
>> On
02.04.2013 03:13, Marek Marczykowski wrote:
>>> On 01.04.2013 15:53, Ben Guthro wrote:
>>>> On Thu, Mar 28, 2013 at 3:03 PM, Marek Marczykowski
>>>> wrote:
>>>>> (XEN) Restoring affinity for d2v3
>>>>> (XEN) Assertion '!cpus_empty(cpus) && cpu_isset(cpu, cpus)' failed at
>>>>> sched_credit.c:481
>>>>
>>>>
>>>> I think the "fix-suspend-scheduler-*" patches posted here are applicable here:
>>>> http://markmail.org/message/llj3oyhgjzvw3t23
>>>>
>>>>
>>>> Specifically, I think you need this bit:
>>>>
>>>> diff --git a/xen/common/cpu.c b/xen/common/cpu.c
>>>> index 630881e..e20868c 100644
>>>> --- a/xen/common/cpu.c
>>>> +++ b/xen/common/cpu.c
>>>> @@ -5,6 +5,7 @@
>>>>  #include
>>>>  #include
>>>>  #include
>>>> +#include
>>>>
>>>>  unsigned int __read_mostly nr_cpu_ids = NR_CPUS;
>>>>  #ifndef nr_cpumask_bits
>>>> @@ -212,6 +213,8 @@ void enable_nonboot_cpus(void)
>>>>              BUG_ON(error == -EBUSY);
>>>>              printk("Error taking CPU%d up: %d\n", cpu, error);
>>>>          }
>>>> +        if (system_state == SYS_STATE_resume)
>>>> +            cpumask_set_cpu(cpu, cpupool0->cpu_valid);
>>>>      }
>>>>
>>>>      cpumask_clear(&frozen_cpus);
>>>>
>>>
>>> Indeed, this makes things better, but still not ideal.
>>> Now after resume all CPUs are in Pool-0, which is good. But CPU0 is much more
>>> preferred than others (xl vcpu-list).
For example if I start 4 busy loops in
>>> dom0, I got (even after some time):
>>> [user@dom0 ~]$ xl vcpu-list
>>> Name          ID  VCPU   CPU State   Time(s) CPU Affinity
>>> dom0           0     0     0   r--      98.5 any cpu
>>> dom0           0     1     0   ---     181.3 any cpu
>>> dom0           0     2     2   r--     262.4 any cpu
>>> dom0           0     3     3   r--     230.8 any cpu
>>> netvm          1     0     0   -b-      18.4 any cpu
>>> netvm          1     1     0   -b-       9.1 any cpu
>>> netvm          1     2     0   -b-       7.1 any cpu
>>> netvm          1     3     0   -b-       5.4 any cpu
>>> firewallvm     2     0     0   -b-      10.7 any cpu
>>> firewallvm     2     1     0   -b-       3.0 any cpu
>>> firewallvm     2     2     0   -b-       2.5 any cpu
>>> firewallvm     2     3     3   -b-       3.6 any cpu
>>>
>>> If I remove some CPU from Pool-0 and re-add it, things back to normal for this
>>> particular CPU (so I got two equally used CPUs) - to fully restore system I
>>> must remove all but CPU0 from Pool-0 and add it again.
>>>
>>> Also still only CPU0 have all C-states (C0-C3), all others have only C0-C1.
>>> This probably could be fixed by your "xen: Re-upload processor PM data to
>>> hypervisor after S3 resume" patch (reload of xen-acpi-processor module helps
>>> here). But I don't think it is a right way. It isn't necessary on other
>>> systems (with somehow older hardware). It must be something missing on resume
>>> path. The question is what...
>>>
>>> Perhaps someone need to go through enable_nonboot_cpus() (__cpu_up?) and check
>>> if it restore all things disabled in disable_nonboot_cpus() (__cpu_disable?).
>>> Unfortunately I don't know x86 details so good to follow that code...
>>
>> Summarize ACPI S3 issues:
>>
>> I. Fixed issues:
>>
>> 1. IRQ problem fixed by "x86: irq_move_cleanup_interrupt() must ignore legacy
>> vectors" commit
>> 2. Assertion failure on resume with vcpu affinity used, fixes by "x86/S3:
>> Restore broken vcpu affinity on resume" commit
>>
>>
>> II. Not (fully) fixed issues:
>>
>> 1. CPU Pool-0 contains only CPU0 after resume - patch quoted above fixes the
>> issue, but it isn't applied to xen-unstable
>> 2.
After resume scheduler chooses (almost) only CPU0 (above quoted listing).
>> Removing and re-adding all CPUs to Pool-0 solves the problem. Perhaps some
>> timers are not restarted after resume?
>
> Marek,
> Please try the patch from this thread to see if it solves your 2 issues above:
> http://markmail.org/thread/35ecqimv7bwq3k6d
>
> This patch was NAK'ed due to cpupool breakage...but in my testing, it
> solved both of these problems.
>
> I don't know how to properly solve it in a cpupool compatible way...
> but I also haven't put much additional effort into doing so.

Indeed this makes problem disappear.

-- 
Best Regards / Pozdrawiam,
Marek Marczykowski
Invisible Things Lab

--------------enig948A0DC48C69D87B78886A16
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
Comment: Using GnuPG with undefined - http://www.enigmail.net/

iQEcBAEBAgAGBQJRbKM9AAoJENuP0xzK19csPoAIAJSgVwfJu6IwksnjSfAmKHwU
KQEYPivETawIs9vwoP2oKhFq0LIeq6eBi3kEdx+/QBkCcVeH9NvV2fvHfHGVPcQd
EOoArgX1FjEZ52fbWAC7zsrRcUlSi2s5EZHXA+P3/nqJKFl2gxxRYU5yizRp+27p
DU4qEovIUeYfxFVWoF19+reSnX+EvXnYTBJRArNAe15Xjn/Yo1KKi+XUsk5o1nwd
0WSQx7+t/fia/Fv5tYZXKS9DpTN5KZ7QVBvncTHJPe/adarI3EaTC9FMnZUDqvp9
/qwN1FlpuDnzIDsrQ1WD8OPqIe2WcwalPuHqhydSzTZWe5ndlujrAL6GhUIjis0=
=FAXT
-----END PGP SIGNATURE-----

--------------enig948A0DC48C69D87B78886A16--

--===============7005348631475412505==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

--===============7005348631475412505==--
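
A rough way to quantify the scheduling skew visible in the `xl vcpu-list` listing quoted earlier in this message is to count how many vCPUs sit on each physical CPU. The sketch below does that for the rows copied from that listing. It assumes the column order shown there (Name, ID, VCPU, CPU, State, Time(s), CPU Affinity); the helper name `vcpus_per_cpu` is hypothetical and not part of any Xen tool.

```python
from collections import Counter

# Rows copied from the `xl vcpu-list` output quoted in this message.
VCPU_LIST = """\
Name          ID  VCPU   CPU State   Time(s) CPU Affinity
dom0           0     0     0   r--      98.5 any cpu
dom0           0     1     0   ---     181.3 any cpu
dom0           0     2     2   r--     262.4 any cpu
dom0           0     3     3   r--     230.8 any cpu
netvm          1     0     0   -b-      18.4 any cpu
netvm          1     1     0   -b-       9.1 any cpu
netvm          1     2     0   -b-       7.1 any cpu
netvm          1     3     0   -b-       5.4 any cpu
firewallvm     2     0     0   -b-      10.7 any cpu
firewallvm     2     1     0   -b-       3.0 any cpu
firewallvm     2     2     0   -b-       2.5 any cpu
firewallvm     2     3     3   -b-       3.6 any cpu
"""


def vcpus_per_cpu(listing):
    """Count vCPUs per physical CPU (hypothetical helper, assumed column order)."""
    counts = Counter()
    for line in listing.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # The fourth column is the physical CPU; skip rows without a numeric value.
        if len(fields) < 6 or not fields[3].isdigit():
            continue
        counts[int(fields[3])] += 1
    return counts


counts = vcpus_per_cpu(VCPU_LIST)
for cpu in sorted(counts):
    print(f"CPU{cpu}: {counts[cpu]} vCPU(s)")
```

For the quoted listing this reports 9 of the 12 vCPUs on CPU0, 1 on CPU2, 2 on CPU3 and none on CPU1, which matches the observation that the scheduler picks (almost) only CPU0 after resume. It is a snapshot of placement only and says nothing about why the scheduler behaves this way.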