From: Juergen Gross <jgross@suse.com>
To: Jan Beulich <JBeulich@suse.com>,
Dario Faggioli <dario.faggioli@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Keir Fraser <keir@xen.org>,
xen-devel@lists.xen.org
Subject: Re: [PATCH 3/3] xen: cpupools: avoid crashing if shutting down with free CPUs
Date: Fri, 08 May 2015 12:47:44 +0200
Message-ID: <554C9450.6080405@suse.com>
In-Reply-To: <554CAD480200007800078288@suse.com>

On 05/08/2015 12:34 PM, Jan Beulich wrote:
>>>> On 08.05.15 at 12:20, <JGross@suse.com> wrote:
>> On 05/06/2015 05:10 PM, Dario Faggioli wrote:
>>> in fact, before this change, shutting down or suspending the
>>> system with some CPUs not assigned to any cpupool, would
>>> crash as follows:
>>>
>>> (XEN) Xen call trace:
>>> (XEN) [<ffff82d080101757>] disable_nonboot_cpus+0xb5/0x138
>>> (XEN) [<ffff82d0801a8824>] enter_state_helper+0xbd/0x369
>>> (XEN) [<ffff82d08010614a>] continue_hypercall_tasklet_handler+0x4a/0xb1
>>> (XEN) [<ffff82d0801320bd>] do_tasklet_work+0x78/0xab
>>> (XEN) [<ffff82d0801323f3>] do_tasklet+0x5e/0x8a
>>> (XEN) [<ffff82d080163cb6>] idle_loop+0x56/0x6b
>>> (XEN)
>>> (XEN)
>>> (XEN) ****************************************
>>> (XEN) Panic on CPU 0:
>>> (XEN) Xen BUG at cpu.c:191
>>> (XEN) ****************************************
>>>
>>> This is because, for free CPUs, -EBUSY was being returned
>>> when trying to tear them down, making cpu_down() unhappy.
>>>
>>> It is certainly impractical to forbid shutting down or
>>> suspending if there are unassigned CPUs, so this change
>>> fixes the above by just avoiding returning -EBUSY for those
>>> CPUs. If shutting off, that does not matter much anyway. If
>>> suspending, we make sure that the CPUs remain unassigned
>>> when resuming.
>>>
>>> While there, take the chance to:
>>> - fix the doc comment of cpupool_cpu_remove() (it was
>>> wrong);
>>> - improve comments in general around and in cpupool_cpu_remove()
>>> and cpupool_cpu_add();
>>> - add a couple of ASSERT()-s for checking consistency.
>>
>> I did a test with the patches applied.
>>
>> # xl cpupool-cpu-remove Pool-0 2
>> # echo mem >/sys/power/state
>>
>> When resuming this resulted in:
>>
>> (XEN) mce_intel.c:735: MCA Capability: BCAST 1 SER 0 CMCI 1 firstbank 0 extended MCE MSR 0
>> (XEN) CPU0 CMCI LVT vector (0xf2) already installed
>> (XEN) Finishing wakeup from ACPI S3 state.
>> (XEN) Enabling non-boot CPUs ...
>> (XEN) Xen BUG at cpu.c:149
>> (XEN) ----[ Xen-4.6-unstable x86_64 debug=y Tainted: C ]----
>> (XEN) CPU: 0
>> (XEN) RIP: e008:[<ffff82d080101531>] cpu_up+0xaf/0xfe
>> (XEN) RFLAGS: 0000000000010202 CONTEXT: hypervisor
>> (XEN) rax: 0000000000008016 rbx: 0000000000000000 rcx: 0000000000000000
> [...]
>> (XEN) Xen call trace:
>> (XEN) [<ffff82d080101531>] cpu_up+0xaf/0xfe
>> (XEN) [<ffff82d080101733>] enable_nonboot_cpus+0x4f/0xfc
>> (XEN) [<ffff82d0801a6a8d>] enter_state_helper+0x2cb/0x370
>> (XEN) [<ffff82d08010615f>] continue_hypercall_tasklet_handler+0x4a/0xb1
>> (XEN) [<ffff82d08013101d>] do_tasklet_work+0x78/0xab
>> (XEN) [<ffff82d08013134c>] do_tasklet+0x5e/0x8a
>> (XEN) [<ffff82d080161bcb>] idle_loop+0x56/0x70
>> (XEN)
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 0:
>> (XEN) Xen BUG at cpu.c:149
>> (XEN) ****************************************
>
> Which would seem more likely to be a result of patch 2. Having
> taken a closer look - is setting ret to -EINVAL at the top of
> cpupool_cpu_add() really correct? I.e. is it guaranteed that
> at least one of the two places altering ret will always be
> reached? If it is, then I'd still suspect one of the two
> cpupool_assign_cpu_locked() invocations to be failing.

Indeed. Setting ret to 0 initially does the trick. With this
modification, suspend/resume and power off work with CPUs
not assigned to any cpupool.
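
To illustrate the failure mode, here is a simplified stand-alone sketch.
It is not the actual cpupool.c code: sketch_cpu_add(), cpu_was_free[] and
cpu_in_pool0[] are made-up names, and it assumes that a CPU which was free
before suspend takes a path on resume where nothing assigns ret, so the
function returns whatever ret was initialized to.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical stand-ins for the real cpupool state. */
static bool cpu_was_free[NR_CPUS];  /* CPU was in no pool before suspend */
static bool cpu_in_pool0[NR_CPUS];  /* simplified: a single pool only */

/*
 * Simplified "CPU comes back online" callback. initial_ret models the
 * value ret is initialized to at the top of the function.
 */
static int sketch_cpu_add(unsigned int cpu, int initial_ret)
{
    int ret = initial_ret;

    assert(cpu < NR_CPUS);

    if ( !cpu_was_free[cpu] )
    {
        /* The only branch in this sketch that assigns ret. */
        cpu_in_pool0[cpu] = true;
        ret = 0;
    }
    /* Otherwise the CPU stays unassigned and ret is never touched. */

    return ret;
}
```

With cpu_was_free[2] set, sketch_cpu_add(2, -EINVAL) returns -EINVAL even
though nothing went wrong, which a caller that treats any non-zero result
as fatal would turn into a crash. sketch_cpu_add(2, 0) returns 0 and leaves
the CPU unassigned.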

Dario, I suggest you write another patch to correct patch 2.

For patch 3 with patch 2 corrected:

Reviewed-by: Juergen Gross <jgross@suse.com>
Tested-by: Juergen Gross <jgross@suse.com>

Juergen