From: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
Juergen Gross <juergen.gross@ts.fujitsu.com>,
"Keir (Xen.org)" <keir@xen.org>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH v2] Fix scheduler crash after s3 resume
Date: Thu, 24 Jan 2013 17:25:09 +0100
Message-ID: <51016065.3080902@citrix.com>
In-Reply-To: <5101630D02000078000B93AD@nat28.tlf.novell.com>
On 24/01/13 16:36, Jan Beulich wrote:
>> On 24.01.13 at 15:26, Tomasz Wroblewski <tomasz.wroblewski@citrix.com> wrote:
>> @@ -212,6 +213,8 @@
>> BUG_ON(error == -EBUSY);
>> printk("Error taking CPU%d up: %d\n", cpu, error);
>> }
>> + if (system_state == SYS_STATE_resume)
>> + cpumask_set_cpu(cpu, cpupool0->cpu_valid);
>>
> This can't be right: What tells you that all CPUs were in pool 0?
>
>
You're right. In my simple tests this was the case, but in general it
might not be. Would an approach based on storing the cpupool0 mask in
disable_nonboot_cpus() and restoring it in enable_nonboot_cpus() be
more acceptable?
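
A minimal standalone sketch of that save/restore idea follows. It is a
toy model, not Xen code: toy_cpumask_t, struct toy_cpupool, saved_valid
and the toy_* functions are all hypothetical names invented for
illustration, and the real disable_nonboot_cpus()/enable_nonboot_cpus()
take no such arguments. It only shows that restoring from a snapshot
re-adds exactly the CPUs that were in the pool before suspend, rather
than every CPU that comes back up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t toy_cpumask_t;

/* Hypothetical, heavily reduced model of a cpupool. */
struct toy_cpupool {
    toy_cpumask_t cpu_valid;
};

/* Hypothetical snapshot of pool 0's membership, taken before suspend. */
static toy_cpumask_t saved_valid;

/* Suspend side: remember the membership, then drop every CPU but CPU0. */
static void toy_disable_nonboot_cpus(struct toy_cpupool *pool0)
{
    assert(pool0 != NULL);
    saved_valid = pool0->cpu_valid;
    pool0->cpu_valid &= (toy_cpumask_t)1;
}

/*
 * Resume side: re-add only CPUs that both came back up and were members
 * of pool 0 before suspend. CPUs that belonged elsewhere are left alone.
 */
static void toy_enable_nonboot_cpus(struct toy_cpupool *pool0,
                                    toy_cpumask_t came_up)
{
    assert(pool0 != NULL);
    pool0->cpu_valid |= saved_valid & came_up;
}
```

With CPUs 0-2 in the pool and CPU3 outside it, a suspend/resume cycle
in which CPUs 1-3 all come back leaves the pool with CPUs 0-2 again.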
> Also, for the future - generating patches with -p helps quite
> a bit in reviewing them.
>
>
Ok, thanks!
>> --- a/xen/common/schedule.c Mon Jan 21 17:03:10 2013 +0000
>> +++ b/xen/common/schedule.c Thu Jan 24 13:40:31 2013 +0000
>> @@ -545,7 +545,7 @@
>> int ret = 0;
>>
>> c = per_cpu(cpupool, cpu);
>> - if ( (c == NULL) || (system_state == SYS_STATE_suspend) )
>> + if ( c == NULL )
>> return ret;
>>
>> for_each_domain_in_cpupool ( d, c )
>> @@ -556,7 +556,8 @@
>>
>> cpumask_and(&online_affinity, v->cpu_affinity, c->cpu_valid);
>> if ( cpumask_empty(&online_affinity)&&
>> - cpumask_test_cpu(cpu, v->cpu_affinity) )
>> + cpumask_test_cpu(cpu, v->cpu_affinity)&&
>> + system_state != SYS_STATE_suspend )
>> {
>> printk("Breaking vcpu affinity for domain %d vcpu %d\n",
>> v->domain->domain_id, v->vcpu_id);
>>
> I doubt this is correct, as you don't restore any of the settings
> during resume that you tear down here.
>
>
Is the objection about the affinity part, or also about the (c == NULL)
bit? The cpu_disable_scheduler() function is currently part of the
regular CPU down process, and it was also part of the suspend process
before the "system state variable" changeset, which regressed it. So the
(c == NULL) hunk mostly just returns to the previous state, in which
this was working a lot better (judging by empirical testing). But I am
no expert on this, so I would be grateful for ideas on how this could be
fixed in a better way!
Just to recap, the current problem boils down, I believe, to the fact
that the vcpu_wake() function (schedule.c) still gets called
occasionally during the S3 path for CPUs whose per_cpu data has already
been freed, causing a crash. The safest way of fixing it seemed to be to
put the suspend-time cpu_disable_scheduler() call back under the regular
path, although that probably isn't the best solution.
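
For reference, the affinity condition from the second schedule.c hunk
quoted above can be modelled in isolation. This is a sketch under
stated assumptions, not the Xen implementation: masks are plain 64-bit
integers, and enum toy_state and toy_should_break_affinity() are
hypothetical names. It only illustrates the intended behaviour of the
proposed hunk, namely that a vcpu pinned solely to the CPU going down
keeps its affinity when the CPU is going down for suspend.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-value stand-in for the hypervisor's system state. */
enum toy_state { TOY_ACTIVE, TOY_SUSPEND };

/*
 * Returns true when the vcpu's affinity would be broken: none of the
 * CPUs it is allowed on remain valid in the pool, the CPU going down is
 * one it was pinned to, and this is not a suspend.
 */
static bool toy_should_break_affinity(uint64_t affinity,
                                      uint64_t pool_valid,
                                      unsigned int cpu,
                                      enum toy_state state)
{
    uint64_t online_affinity;

    assert(cpu < 64); /* toy masks only cover 64 CPUs */
    online_affinity = affinity & pool_valid;

    return online_affinity == 0 &&
           (affinity & (UINT64_C(1) << cpu)) != 0 &&
           state != TOY_SUSPEND;
}
```

For a vcpu pinned only to CPU1, with CPU1 already cleared from the
pool's valid mask, the function returns true in the active state and
false during suspend.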