From: Juergen Gross <jgross@suse.com>
To: Dario Faggioli <dario.faggioli@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Keir Fraser <keir@xen.org>, Jan Beulich <JBeulich@suse.com>,
xen-devel@lists.xen.org
Subject: Re: [PATCH 3/3] xen: cpupools: avoid crashing if shutting down with free CPUs
Date: Fri, 08 May 2015 15:18:53 +0200
Message-ID: <554CB7BD.3010201@suse.com>
In-Reply-To: <1431090740.4957.90.camel@citrix.com>

On 05/08/2015 03:12 PM, Dario Faggioli wrote:
> On Fri, 2015-05-08 at 12:47 +0200, Juergen Gross wrote:
>> On 05/08/2015 12:34 PM, Jan Beulich wrote:
>
>>>> (XEN) Xen call trace:
>>>> (XEN) [<ffff82d080101531>] cpu_up+0xaf/0xfe
>>>> (XEN) [<ffff82d080101733>] enable_nonboot_cpus+0x4f/0xfc
>>>> (XEN) [<ffff82d0801a6a8d>] enter_state_helper+0x2cb/0x370
>>>> (XEN) [<ffff82d08010615f>] continue_hypercall_tasklet_handler+0x4a/0xb1
>>>> (XEN) [<ffff82d08013101d>] do_tasklet_work+0x78/0xab
>>>> (XEN) [<ffff82d08013134c>] do_tasklet+0x5e/0x8a
>>>> (XEN) [<ffff82d080161bcb>] idle_loop+0x56/0x70
>>>> (XEN)
>>>> (XEN)
>>>> (XEN) ****************************************
>>>> (XEN) Panic on CPU 0:
>>>> (XEN) Xen BUG at cpu.c:149
>>>> (XEN) ****************************************
>>>
>>> Which would seem more likely to be a result of patch 2. Having
>>> taken a closer look - is setting ret to -EINVAL at the top of
>>> cpupool_cpu_add() really correct? I.e. is it guaranteed that
>>> at least one of the two places altering ret will always be
>>> reached? If it is, then I'd still suspect one of the two
>>> cpupool_assign_cpu_locked() invocations to be failing.
>>
>> Indeed.
>>
> Not really.
>
> Well, the problem is, of course, related, as your test shows, and I now
> see why this happens, but it's all patch 3's fault (see below).
>
> So what's in tree right now is ok and there is no need to revert. I
> believe the best thing to do is for me to send a new, fixed, version of
> patch 3. The fix would probably still be just changing "int ret =
> -EINVAL" to "int ret = 0" in cpupool_cpu_add(), but that should be done
> within patch 3, not as a fix to patch 2, which was indeed right.
>
> What do you both think?
>
>> Setting ret to 0 initially does the trick.
>>
> Yes. However, as far as patch 2 is concerned, that initialization to
> -EINVAL is ok, as it is guaranteed that at least one of
> the two places altering ret is executed, as Jan was wondering. (Well,
> because of that, the initialization is not that important, I just added
> it to be extra-cautious.)
>
> The problem is, in patch 3, when that code becomes:
>
>     int ret = -EINVAL;
>
>     if ( system_state == SYS_STATE_resume )
>     {
>         <look for the cpu>
>         ret = cpupool_assign_cpu_locked(*c, cpu);
>     }
>     else
>     {
>
>         ret = cpupool_assign_cpu_locked(cpupool0, cpu);
>     }
>
> In fact, now, if the cpu was free when suspending, we won't find it
> anywhere when looking for it in the system_state==SYS_STATE_resume case,
> and hence we won't call cpupool_assign_cpu_locked(). Then, because of
> the 'if() else', we don't call it below either (as we did before), and
> hence no one alters 'ret'.
>
> That is my point, actually: in patch 2, we are sure ret will be altered.
> In patch 3, it's no longer guaranteed that we alter ret, and the case in
> which we don't is perfectly fine, so ret should be initialized to 0.
>
>> With this
>> modification suspend/resume and power off are working with cpus
>> not allocated to any cpupool.
>>
> Great to know, thanks for testing... and sorry for not having been able
> to do so myself. My test box allows me to "echo mem >/sys/power/state",
> and it seems to suspend ok (e.g., power led is blinking)... but then it
> just does not resume. :-/
>
>> Dario, I suggest you write another patch to correct patch 2.
>>
>> For patch 3 with patch 2 corrected:
>>
>> Reviewed-by: Juergen Gross <jgross@suse.com>
>> Tested-by: Juergen Gross <jgross@suse.com>
>>
> If you agree with my plan of sending v2 of patch 3, and if that will really
> be just the same as v1, but with "int ret = 0", I'll stick these tags
> there, unless you tell me not to.

I don't mind how you do it. The machine crashed even without
patch 2 when suspending with at least one free cpu, so this patch isn't
making anything worse.

You can still apply my two *-by: tags, of course.

Juergen
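
For readers following the thread, the control-flow point about the
initial value of ret can be illustrated with a small standalone sketch.
This is not the Xen code: cpu_add(), assign_cpu(), saved_pool[], NO_POOL
and the state enum are hypothetical stand-ins, and the initial value of
ret is passed in as a parameter only so that both variants can be
compared. It assumes, as described in the quoted mail, that on resume
the assignment happens only if the CPU is found in some pool.

```c
#include <assert.h>
#include <errno.h>

#define NR_CPUS 4
#define NO_POOL (-1)

/* Hypothetical stand-in for system_state. */
enum sys_state { STATE_ACTIVE, STATE_RESUME };

/* Pool each CPU belonged to at suspend time; NO_POOL means it was free. */
static const int saved_pool[NR_CPUS] = { 0, 0, 1, NO_POOL };

/* Stand-in for cpupool_assign_cpu_locked(); it always succeeds here. */
static int assign_cpu(int pool, int cpu)
{
    (void)pool;
    (void)cpu;
    return 0;
}

/*
 * Sketch of the if/else shape discussed for patch 3. initial_ret is the
 * value ret starts with (-EINVAL in v1, 0 in the proposed fix).
 */
static int cpu_add(enum sys_state state, int cpu, int initial_ret)
{
    int ret = initial_ret;

    assert(cpu >= 0 && cpu < NR_CPUS);

    if (state == STATE_RESUME) {
        /* Only a CPU that was in a pool at suspend time gets reassigned. */
        if (saved_pool[cpu] != NO_POOL)
            ret = assign_cpu(saved_pool[cpu], cpu);
        /* A CPU that was free falls through: nothing touches ret. */
    } else {
        /* Outside of resume, the CPU always goes to pool 0. */
        ret = assign_cpu(0, cpu);
    }

    return ret;
}
```

In this sketch, resuming CPU 3 (free at suspend time) returns whatever
ret was initialized with, so starting from -EINVAL reports a failure for
a case that is actually fine, while starting from 0 does not.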

Thread overview: 13+ messages
2015-05-06 15:10 [PATCH 0/3] xen: cpupools: avoid crashing when shutting down/suspending with free CPUs Dario Faggioli
2015-05-06 15:10 ` [PATCH 1/3] xen: always print offending CPU on bringup/teardown failure Dario Faggioli
2015-05-07 13:17 ` Jan Beulich
2015-05-06 15:10 ` [PATCH 2/3] xen: cpupool: assigning a CPU to a pool can fail Dario Faggioli
2015-05-07 4:52 ` Juergen Gross
2015-05-06 15:10 ` [PATCH 3/3] xen: cpupools: avoid crashing if shutting down with free CPUs Dario Faggioli
2015-05-08 10:20 ` Juergen Gross
2015-05-08 10:34 ` Jan Beulich
[not found] ` <554CAD480200007800078288@suse.com>
2015-05-08 10:47 ` Juergen Gross
2015-05-08 13:12 ` Dario Faggioli
2015-05-08 13:18 ` Juergen Gross [this message]
2015-05-08 13:32 ` Dario Faggioli
2015-05-08 13:57 ` Jan Beulich