From: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
Juergen Gross <juergen.gross@ts.fujitsu.com>,
"Keir (Xen.org)" <keir@xen.org>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH v2] Fix scheduler crash after s3 resume
Date: Thu, 24 Jan 2013 17:25:09 +0100
Message-ID: <51016065.3080902@citrix.com>
In-Reply-To: <5101630D02000078000B93AD@nat28.tlf.novell.com>
On 24/01/13 16:36, Jan Beulich wrote:
>>>> On 24.01.13 at 15:26, Tomasz Wroblewski<tomasz.wroblewski@citrix.com> wrote:
>>>>
>> @@ -212,6 +213,8 @@
>> BUG_ON(error == -EBUSY);
>> printk("Error taking CPU%d up: %d\n", cpu, error);
>> }
>> + if (system_state == SYS_STATE_resume)
>> + cpumask_set_cpu(cpu, cpupool0->cpu_valid);
>>
> This can't be right: What tells you that all CPUs were in pool 0?
>
>
You're right. In my simple tests this was the case, but in general it
might not be. Would an approach based on saving the cpupool0 mask in
disable_nonboot_cpus() and restoring it in enable_nonboot_cpus() be
more acceptable?
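
To make the save/restore idea concrete, here is a minimal standalone
sketch. It is only a model, not Xen code: cpu_mask_t, struct pool_model,
saved_valid and the model_* functions are all hypothetical names, and a
plain 64-bit integer stands in for the real cpumask type. The real
disable_nonboot_cpus()/enable_nonboot_cpus() do much more than this.

```c
#include <stdint.h>

/* Standalone model: one bit per CPU. This is NOT Xen's cpumask_t. */
typedef uint64_t cpu_mask_t;

/* Hypothetical stand-in for a cpupool; only the valid-CPU mask is modelled. */
struct pool_model {
    cpu_mask_t cpu_valid;
};

/* Hypothetical snapshot of the pool's mask, taken on the suspend side. */
static cpu_mask_t saved_valid;

/* Suspend side: remember the membership, then keep only the boot CPU. */
void model_disable_nonboot_cpus(struct pool_model *pool, unsigned int boot_cpu)
{
    saved_valid = pool->cpu_valid;
    pool->cpu_valid &= (cpu_mask_t)1 << boot_cpu;
}

/*
 * Resume side: re-add only CPUs that were in the pool before suspend
 * AND actually came back up (came_up has one bit per such CPU).
 */
void model_enable_nonboot_cpus(struct pool_model *pool, cpu_mask_t came_up)
{
    pool->cpu_valid |= saved_valid & came_up;
}
```

The point of intersecting with the saved mask is that a CPU which
belonged to some other pool before suspend is not silently pulled into
pool 0 on resume, which was the objection to the unconditional
cpumask_set_cpu().
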
> Also, for the future - generating patches with -p helps quite
> a bit in reviewing them.
>
>
Ok, thanks!
>> --- a/xen/common/schedule.c Mon Jan 21 17:03:10 2013 +0000
>> +++ b/xen/common/schedule.c Thu Jan 24 13:40:31 2013 +0000
>> @@ -545,7 +545,7 @@
>> int ret = 0;
>>
>> c = per_cpu(cpupool, cpu);
>> - if ( (c == NULL) || (system_state == SYS_STATE_suspend) )
>> + if ( c == NULL )
>> return ret;
>>
>> for_each_domain_in_cpupool ( d, c )
>> @@ -556,7 +556,8 @@
>>
>> cpumask_and(&online_affinity, v->cpu_affinity, c->cpu_valid);
>> if ( cpumask_empty(&online_affinity)&&
>> - cpumask_test_cpu(cpu, v->cpu_affinity) )
>> + cpumask_test_cpu(cpu, v->cpu_affinity)&&
>> + system_state != SYS_STATE_suspend )
>> {
>> printk("Breaking vcpu affinity for domain %d vcpu %d\n",
>> v->domain->domain_id, v->vcpu_id);
>>
> I doubt this is correct, as you don't restore any of the settings
> during resume that you tear down here.
>
>
Is the objection about the affinity part, or also about the (c == NULL)
bit? The cpu_disable_scheduler() function is currently part of the
regular CPU down process, and was also part of the suspend process before
the "system state variable" changeset, which regressed it. So the
(c == NULL) hunk mostly just returns to the previous state, in which this
worked a lot better (judging by empirical testing). But I am no expert on
this, so I would be grateful for ideas on how this could be fixed in a
better way!
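
For reference, the condition that the second schedule.c hunk ends up
with can be modelled outside Xen roughly as below. This is a sketch under
assumptions: cpu_mask_t, enum model_state and
model_should_break_affinity() are made-up names, a 64-bit integer replaces
the real cpumask, and pool_valid is assumed to already exclude the CPU
being taken down.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standalone model: one bit per CPU. This is NOT Xen's cpumask_t. */
typedef uint64_t cpu_mask_t;

/* Hypothetical stand-in for the two system states relevant here. */
enum model_state { MODEL_STATE_active, MODEL_STATE_suspend };

/*
 * Mirrors the patched test: break a vcpu's affinity only if none of its
 * allowed CPUs remain valid in the pool, the CPU going down is one of
 * its allowed CPUs, and we are not in the suspend path.
 */
bool model_should_break_affinity(cpu_mask_t affinity, cpu_mask_t pool_valid,
                                 unsigned int cpu, enum model_state state)
{
    cpu_mask_t online_affinity = affinity & pool_valid;

    return online_affinity == 0 &&
           (affinity & ((cpu_mask_t)1 << cpu)) != 0 &&
           state != MODEL_STATE_suspend;
}
```

So with v2 the affinity is left alone during suspend, but the rest of
the function still runs in that case.
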
Just to recap: the current problem boils down, I believe, to the fact
that the vcpu_wake() function (schedule.c) occasionally still gets called
during the S3 path for CPUs whose per_cpu data has already been freed,
causing a crash. The safest way of fixing it seemed to be to send the
suspend case through the regular cpu_disable_scheduler() path again,
though that probably isn't the best approach.