From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: Xen4.2 S3 regression?
Date: Thu, 20 Sep 2012 14:07:45 +0100
To: Ben Guthro, Jan Beulich
Cc: Konrad Rzeszutek Wilk, john.baboval@citrix.com, Thomas Goetz, xen-devel
Sender: xen-devel-bounces@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

disable_nonboot_cpus() -> cpu_down(1) -> ...

...and from there it is the same as the xen-hptool case:
arch_do_sysctl() -> cpu_down_helper() -> cpu_down(1) -> stop_machine_run(take_cpu_down, 1) -> [on CPU#1] take_cpu_down() -> __cpu_disable()
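
To make the shape of the two paths concrete, here is a minimal toy model in C. It is not Xen code: every `model_*` function is a hypothetical stand-in named after the function it imitates, the signatures are invented for illustration, and the loop over all non-boot CPUs is an assumption (the chain above only shows CPU#1). The per-CPU hit counter is a non-fatal substitute for a CPU-id-conditional BUG() probe.

```c
#include <assert.h>
#include <stdio.h>

#define NR_CPUS 4   /* arbitrary size for the model */

/* Per-CPU hit counter: a non-fatal stand-in for a CPU-id-conditional BUG(). */
unsigned int cpu_disable_hits[NR_CPUS];

/* Stand-in for __cpu_disable(); in the model it "runs on" the CPU going down. */
static void model_cpu_disable(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    cpu_disable_hits[cpu]++;
    printf("  __cpu_disable() reached on CPU#%u\n", cpu);
}

/* Stand-in for take_cpu_down(), the callback given to stop_machine_run(). */
static int model_take_cpu_down(unsigned int cpu)
{
    model_cpu_disable(cpu);
    return 0;
}

/* Stand-in for stop_machine_run(fn, cpu): the model simply calls fn directly. */
static int model_stop_machine_run(int (*fn)(unsigned int), unsigned int cpu)
{
    return fn(cpu);
}

/* Shared tail of both paths. Rejecting CPU#0 is an assumption of the model. */
int model_cpu_down(unsigned int cpu)
{
    if (cpu == 0 || cpu >= NR_CPUS)
        return -1;
    return model_stop_machine_run(model_take_cpu_down, cpu);
}

/* Hotplug path: arch_do_sysctl() -> cpu_down_helper() -> cpu_down(). */
int model_hotplug_offline(unsigned int cpu)
{
    puts("sysctl path:");
    return model_cpu_down(cpu);
}

/* S3 path: disable_nonboot_cpus() -> cpu_down() for each non-boot CPU. */
int model_disable_nonboot_cpus(void)
{
    int err = 0;

    puts("S3 path:");
    for (unsigned int cpu = 1; cpu < NR_CPUS && !err; cpu++)
        err = model_cpu_down(cpu);
    return err;
}
```

In this model both entry points funnel into `model_cpu_down()`, so a probe in the `__cpu_disable()` stand-in fires on either path. If a probe in the real function fires for the hotplug path but not for S3, the divergence would have to be upstream of the shared tail or in something the model leaves out.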

On 20/09/2012 13:56, "Ben Guthro" <ben@guthro.net> wrote:

It appears __cpu_disable() is not getting reached at all, for CPU1

I put a cpu id conditional BUG() call in there, to verify - and while it is reached when using
xen-hptool cpu-offline 1
It never seems to be reached from the S3 path.


What is the expected call chain to get into this code during S3?


On Thu, Sep 20, 2012 at 4:03 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>> On 20.09.12 at 08:13, Keir Fraser <keir.xen@gmail.com> wrote:
> CPU#1 got stuck in a loop in cpu_init() as it appears to be 'already
> initialised' in the cpu_initialized bitmap. CPU#0 detects it is stuck and
> carries on, but the resume code assumes all CPUs are brought back online and
> crashes later.

So this would suggest play_dead() (-> cpu_exit_clear() ->
cpu_uninit()) not getting reached during the suspend cycle.
That should be fairly easy to verify, as the serial console
ought to still work when the secondary CPUs get offlined.

That might imply cpumask_clear_cpu(cpu, &cpu_online_map)
not getting reached in __cpu_disable(), which would be in line
with the observation that none of the logs provided so far
showed anything being done by fixup_irqs() (called right
after clearing the online bit).

Jan
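
The reasoning in the quoted messages can be sketched as a small state model in C. This is an illustration under stated assumptions, not the Xen implementation: the two masks are plain `unsigned long` stand-ins for cpu_initialized and cpu_online_map, all `model_*` names and their parameters are hypothetical, and a `false` return marks the point where, per the description above, the real cpu_init() would loop instead of returning.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4   /* arbitrary size for the model */

/* One bit per CPU; stand-ins for cpu_initialized and cpu_online_map. */
unsigned long cpu_initialized_mask;
unsigned long cpu_online_mask = 1UL;    /* boot CPU#0 starts online */

/* Set the CPU's bit and report whether it was already set. */
static bool model_test_and_set(unsigned int cpu, unsigned long *mask)
{
    bool was_set = (*mask >> cpu) & 1UL;

    *mask |= 1UL << cpu;
    return was_set;
}

/* Bring-up: returns false where the thread says cpu_init() gets stuck. */
bool model_cpu_init(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    if (model_test_and_set(cpu, &cpu_initialized_mask))
        return false;   /* "already initialised": CPU never comes online */
    cpu_online_mask |= 1UL << cpu;
    return true;
}

/* Offline: the two flags say which teardown steps were actually reached. */
void model_cpu_offline(unsigned int cpu, bool reached_disable,
                       bool reached_play_dead)
{
    assert(cpu < NR_CPUS);
    if (reached_disable)
        cpu_online_mask &= ~(1UL << cpu);       /* step in __cpu_disable() */
    if (reached_play_dead)
        cpu_initialized_mask &= ~(1UL << cpu);  /* step in cpu_uninit() */
}
```

With both teardown steps reached, an offline/online cycle for CPU#1 succeeds in the model. With neither reached, the initialised bit stays set and the next `model_cpu_init(1)` returns `false`, which matches the symptom reported on resume.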

