From: Juergen Gross <juergen.gross@ts.fujitsu.com>
To: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
Cc: george.dunlap@eu.citrix.com, keir@xen.org,
Jan Beulich <JBeulich@suse.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH] Fix scheduler crash after s3 resume
Date: Thu, 24 Jan 2013 07:18:17 +0100
Message-ID: <5100D229.4030906@ts.fujitsu.com>
In-Reply-To: <5100070F.7010808@citrix.com>
On 23.01.2013 16:51, Tomasz Wroblewski wrote:
> Hi all,
>
> This was also discussed earlier, for example here
> http://xen.markmail.org/thread/iqvkylp3mclmsnbw
>
> Changeset 25079:d5ccb2d1dbd1 (Introduce system_state variable) added a
> global variable which, among other things, is used on suspend to avoid
> disabling the cpu scheduler, breaking vcpu affinities, and removing the
> cpu from its cpupool. However, it missed one place where the cpu is
> removed from the cpupool's valid cpus mask: smpboot.c, __cpu_disable(),
> line 840:
>
> cpumask_clear_cpu(cpu, cpupool0->cpu_valid);
>
> This causes the vcpu in the default pool to be considered inactive, and
> the following assertion is violated in sched_credit.c soon after resume
> transitions out of xen, causing a platform reboot:
>
> (XEN) Finishing wakeup from ACPI S3 state.
> (XEN) Enabling non-boot CPUs ...
> (XEN) Assertion '!cpumask_empty(&cpus) && cpumask_test_cpu(cpu, &cpus)'
> failed at sched_credit.c:507
> (XEN) ----[ Xen-4.3-unstable x86_64 debug=y Tainted: C ]----
> (XEN) CPU: 1
> (XEN) RIP: e008:[<ffff82c480119e9e>] _csched_cpu_pick+0x155/0x5fd
> (XEN) RFLAGS: 0000000000010202 CONTEXT: hypervisor
> (XEN) rax: 0000000000000001 rbx: 0000000000000008 rcx: 0000000000000008
> (XEN) rdx: 00000000000000ff rsi: 0000000000000008 rdi: 0000000000000000
> (XEN) rbp: ffff83011415fdd8 rsp: ffff83011415fcf8 r8: 0000000000000000
> (XEN) r9: 000000000000003e r10: 00000008f3de731f r11: ffffea0000063800
> (XEN) r12: ffff82c480261720 r13: ffff830137b4d950 r14: ffff830137beb010
> (XEN) r15: ffff82c480261720 cr0: 0000000080050033 cr4: 00000000000026f0
> (XEN) cr3: 000000013c17d000 cr2: ffff8800ac6ef8f0
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> (XEN) Xen stack trace from rsp=ffff83011415fcf8:
> (XEN) 00000000000af257 0000000800000001 ffff8300ba4fd000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000002 ffff8800ac6ef8f0
> (XEN) 0000000800000000 00000001318e0025 0000000000000087 ffff83011415fd68
> (XEN) ffff82c480124f79 ffff83011415fd98 ffff83011415fda8 00007fda88d1e790
> (XEN) ffff8800ac6ef8f0 00000001318e0025 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000146 ffff830137b4d940
> (XEN) 0000000000000001 ffff830137b4d950 ffff830137beb010 ffff82c480261720
> (XEN) ffff83011415fe48 ffff82c48011a51b 0002000e00000007 ffffffff81009071
> (XEN) 000000000000e033 ffff83013a805360 ffff880002bb3c28 000000000000e02b
> (XEN) e4d87248e7ca5f52 ffff830102ae2200 0000000000000001 ffff82c48011a356
> (XEN) 00000008efa1f543 00007fda88d1e790 ffff83011415fe78 ffff82c48012748f
> (XEN) 0000000000000002 ffff830137beb028 ffff830102ae2200 ffff830137beb8d0
> (XEN) ffff83011415fec8 ffff82c48012758b ffff830114150000 ffff8800ac6ef8f0
> (XEN) 80100000ae86d065 ffff82c4802e0080 ffff82c4802e0000 ffff830114158000
> (XEN) ffffffffffffffff 00007fda88d1e790 ffff83011415fef8 ffff82c480124b4e
> (XEN) ffff8300ba4fd000 ffffea0000063800 00000001318e0025 ffff8800ac6ef8f0
> (XEN) ffff83011415ff08 ffff82c480124bb4 00007cfeebea00c7 ffff82c480226a71
> (XEN) 00007fda88d1e790 ffff8800ac6ef8f0 00000001318e0025 ffffea0000063800
> (XEN) ffff880002bb3c78 00000001318e0025 ffffea0000063800 0000000000000146
> (XEN) 00003ffffffff000 ffffea0002b1bbf0 0000000000000000 00000001318e0025
> (XEN) Xen call trace:
> (XEN) [<ffff82c480119e9e>] _csched_cpu_pick+0x155/0x5fd
> (XEN) [<ffff82c48011a51b>] csched_tick+0x1c5/0x342
> (XEN) [<ffff82c48012748f>] execute_timer+0x4e/0x6c
> (XEN) [<ffff82c48012758b>] timer_softirq_action+0xde/0x206
> (XEN) [<ffff82c480124b4e>] __do_softirq+0x8e/0x99
> (XEN) [<ffff82c480124bb4>] do_softirq+0x13/0x15
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 1:
> (XEN) Assertion '!cpumask_empty(&cpus) && cpumask_test_cpu(cpu, &cpus)'
> failed at sched_credit.c:507
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
>
> The reason for the above is that the "cpus" cpumask is empty: it is the
> logical "and" of the cpupool's valid cpus (from which the cpu was
> removed) and the vcpu's affinity mask.
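
To make the failure mode concrete, here is a minimal standalone sketch of
that check. It is not code from the Xen tree: toy_cpumask_t, toy_mask_of(),
toy_mask_test() and toy_pick_ok() are hypothetical names, and a plain
64-bit word stands in for the real cpumask type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a cpumask: bit n set means CPU n is in the mask. */
typedef uint64_t toy_cpumask_t;

/* Mask containing only the given CPU (hypothetical helper). */
static inline toy_cpumask_t toy_mask_of(unsigned int cpu)
{
    assert(cpu < 64);
    return (toy_cpumask_t)1 << cpu;
}

/* True if the given CPU is present in the mask. */
static inline bool toy_mask_test(unsigned int cpu, toy_cpumask_t mask)
{
    return cpu < 64 && (mask & toy_mask_of(cpu)) != 0;
}

/*
 * Models the asserted condition: the candidate set is the logical AND of
 * the pool's valid CPUs and the vcpu's affinity, and the pick is only
 * acceptable if that set is non-empty and contains the chosen CPU.
 */
static inline bool toy_pick_ok(unsigned int cpu, toy_cpumask_t pool_valid,
                               toy_cpumask_t affinity)
{
    toy_cpumask_t cpus = pool_valid & affinity;

    return cpus != 0 && toy_mask_test(cpu, cpus);
}
```

With four CPUs in the pool (0xF) and a vcpu pinned to CPU 1 (0x2),
toy_pick_ok(1, 0xF, 0x2) holds. Once CPU 1 has been cleared from the pool
mask (0xD), the intersection is empty and the same call fails, which is
the situation the assertion reports.
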
>
> The attached patch follows the spirit of changeset 25079:d5ccb2d1dbd1
> (which blocked removal of the cpu from the cpupool in cpupool.c) by also
> blocking its removal from the cpupool's valid cpumask. So cpu
> affinities are still preserved across suspend/resume, and the scheduler
> does not need to be disabled, as per the original intent (I think). Would
> welcome comments.
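
The idea described above can be sketched as follows. This is only an
illustration under assumptions, not the attached patch: the enum, the
struct and toy_cpu_disable() are hypothetical names, and a 64-bit word
again stands in for the pool's valid mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a global system state; not the Xen definition. */
enum toy_system_state { TOY_STATE_ACTIVE, TOY_STATE_SUSPEND, TOY_STATE_RESUME };

/* Hypothetical stand-in for a cpupool; only the valid mask is modelled. */
struct toy_cpupool {
    uint64_t cpu_valid;   /* bit n set: CPU n is valid for this pool */
};

/*
 * Take a CPU offline. The pool's valid mask is only updated when the CPU
 * is really being removed; during suspend the bit is left alone, so it is
 * still set when the CPU is brought back up on resume.
 */
static inline void toy_cpu_disable(struct toy_cpupool *pool, unsigned int cpu,
                                   enum toy_system_state state)
{
    assert(cpu < 64);
    if (state != TOY_STATE_SUSPEND)
        pool->cpu_valid &= ~((uint64_t)1 << cpu);
}
```

Starting from a valid mask of 0xF, disabling CPU 1 in the suspend state
leaves the mask at 0xF, while the same call in the active state yields
0xD.
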
>
> Signed-off-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
>
> Commit message:
> Fix s3 resume regression (crash in scheduler) after c-s
> 25079:d5ccb2d1dbd1 by also blocking removal of the cpu from the
> cpupool's cpu_valid mask - in the spirit of mentioned c-s.
>
>
>
--
Juergen Gross Principal Developer Operating Systems
PBG PDG ES&S SWE OS6 Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28 Internet: ts.fujitsu.com
D-80807 Muenchen Company details: ts.fujitsu.com/imprint.html