xen-devel.lists.xenproject.org archive mirror
From: Andre Przywara <andre.przywara@amd.com>
To: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Juergen Gross <juergen.gross@ts.fujitsu.com>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	"Diestelhorst, Stephan" <Stephan.Diestelhorst@amd.com>
Subject: Re: Hypervisor crash(!) on xl cpupool-numa-split
Date: Tue, 8 Feb 2011 17:33:21 +0100	[thread overview]
Message-ID: <4D517051.10402@amd.com> (raw)
In-Reply-To: <AANLkTinP0z9GynF1RFd8RwzWuqvxYdb+UBE+7xKpX6D4@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1935 bytes --]

George Dunlap wrote:
> Andre,
> 
> Can you try again with the attached patch?
Sure. Unfortunately (or is this a good sign?) the "Migration failed" 
message didn't trigger; I only saw various instances of the other 
printk, see the attached log file.
Migration happens quite often, because Dom0 has 48 vCPUs and in the 
end they are squashed onto fewer and fewer pCPUs. I guess that is the 
reason why I see it on my machine.

Regards,
Andre.

> 
> Thanks,
>  -George
> 
> On Tue, Feb 8, 2011 at 12:08 PM, George Dunlap
> <George.Dunlap@eu.citrix.com> wrote:
>> On Tue, Feb 8, 2011 at 5:43 AM, Juergen Gross
>> <juergen.gross@ts.fujitsu.com> wrote:
>>> On 02/07/11 16:55, George Dunlap wrote:
>>>> Juergen,
>>>>
>>>> What is supposed to happen if a domain is in cpupool0, and then all of
>>>> the cpus are taken out of cpupool0?  Is that possible?
>>> No. Cpupool0 can't be without any cpu, as Dom0 is always a member of cpupool0.
>> If that's the case, then since Andre is running this immediately after
>> boot, he shouldn't be seeing any vcpus in the new pools; and all of
>> the dom0 vcpus should be migrated to cpupool0, right?  Is it possible
>> that the migration process isn't happening properly?
>>
>> It looks like schedule.c:cpu_disable_scheduler() will try to migrate
>> all vcpus, and if it fails to migrate, it returns -EAGAIN so that the
>> tools will try again.  It's probably worth instrumenting that whole
>> code-path to make sure it actually happens as we expect.  Are we
>> certain, for example, that a hypercall continued on another cpu will
>> actually return the new error value properly?
>>
>> Another minor thing: In cpupool.c:cpupool_unassign_cpu_helper(), why
>> is the cpu's bit set in cpupool_free_cpus without checking to see if
>> the cpu_disable_scheduler() call actually worked?  Shouldn't that also
>> be inside the if() statement?
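
(Concretely, the reordering suggested here would look something like
this -- a sketch against the 4.1-era cpupool.c with the locking and
bookkeeping elided, not a tested patch:)

  static long cpupool_unassign_cpu_helper(void *info)
  {
      int cpu = cpupool_moving_cpu;
      long ret;

      ret = cpu_disable_scheduler(cpu);
      if ( ret == 0 )
      {
          /* Moved inside the if(): only mark the cpu as free once the
           * scheduler has actually let go of it. */
          cpu_set(cpu, cpupool_free_cpus);
          /* ... remaining teardown for the now-free cpu ... */
      }
      /* ... unlock / retry bookkeeping elided ... */
      return ret;
  }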
>>
>>  -George
>>


-- 
Andre Przywara
AMD-OSRC (Dresden)
Tel: x29712

[-- Attachment #2: george_debug.log --]
[-- Type: text/plain, Size: 8076 bytes --]

root@dosorca:/data/images# sh numasplit.sh
Removing CPUs from Pool 0
(XEN) cpu_disable_scheduler: Migrating d0v14 from cpu 6
(XEN) cpu_disable_scheduler: Migrating d0v26 from cpu 6
(XEN) cpu_disable_scheduler: Migrating d0v9 from cpu 7
(XEN) cpu_disable_scheduler: Migrating d0v23 from cpu 7
(XEN) cpu_disable_scheduler: Migrating d0v9 from cpu 8
(XEN) cpu_disable_scheduler: Migrating d0v19 from cpu 8
(XEN) cpu_disable_scheduler: Migrating d0v0 from cpu 9
(XEN) cpu_disable_scheduler: Migrating d0v9 from cpu 9
(XEN) cpu_disable_scheduler: Migrating d0v19 from cpu 9
(XEN) cpu_disable_scheduler: Migrating d0v0 from cpu 10
(XEN) cpu_disable_scheduler: Migrating d0v9 from cpu 10
(XEN) cpu_disable_scheduler: Migrating d0v19 from cpu 10
(XEN) cpu_disable_scheduler: Migrating d0v0 from cpu 11
(XEN) cpu_disable_scheduler: Migrating d0v9 from cpu 11
(XEN) cpu_disable_scheduler: Migrating d0v19 from cpu 11
(XEN) cpu_disable_scheduler: Migrating d0v31 from cpu 11
Rewriting config file
Creating new pool
Using config file "cpupool.test"
cpupool name:   Pool-node1
scheduler:      credit
number of cpus: 1
Populating new pool
Removing CPUs from Pool 0
(XEN) cpu_disable_scheduler: Migrating d0v44 from cpu 12
(XEN) cpu_disable_scheduler: Migrating d0v14 from cpu 13
(XEN) cpu_disable_scheduler: Migrating d0v33 from cpu 13
(XEN) cpu_disable_scheduler: Migrating d0v44 from cpu 13
(XEN) cpu_disable_scheduler: Migrating d0v10 from cpu 14
(XEN) cpu_disable_scheduler: Migrating d0v33 from cpu 14
(XEN) cpu_disable_scheduler: Migrating d0v44 from cpu 14
(XEN) cpu_disable_scheduler: Migrating d0v10 from cpu 15
(XEN) cpu_disable_scheduler: Migrating d0v33 from cpu 15
(XEN) cpu_disable_scheduler: Migrating d0v44 from cpu 15
(XEN) cpu_disable_scheduler: Migrating d0v10 from cpu 16
(XEN) cpu_disable_scheduler: Migrating d0v33 from cpu 16
(XEN) cpu_disable_scheduler: Migrating d0v41 from cpu 16
(XEN) cpu_disable_scheduler: Migrating d0v10 from cpu 17
(XEN) cpu_disable_scheduler: Migrating d0v32 from cpu 17
(XEN) cpu_disable_scheduler: Migrating d0v41 from cpu 17
Rewriting config file
Creating new pool
Using config file "cpupool.test"
cpupool name:   Pool-node2
scheduler:      credit
number of cpus: 1
Populating new pool
Removing CPUs from Pool 0
(XEN) cpu_disable_scheduler: Migrating d0v10 from cpu 18
(XEN) cpu_disable_scheduler: Migrating d0v29 from cpu 18
(XEN) cpu_disable_scheduler: Migrating d0v41 from cpu 18
(XEN) cpu_disable_scheduler: Migrating d0v29 from cpu 19
(XEN) cpu_disable_scheduler: Migrating d0v41 from cpu 19
(XEN) cpu_disable_scheduler: Migrating d0v6 from cpu 20
(XEN) cpu_disable_scheduler: Migrating d0v29 from cpu 20
(XEN) cpu_disable_scheduler: Migrating d0v41 from cpu 20
(XEN) cpu_disable_scheduler: Migrating d0v3 from cpu 21
(XEN) cpu_disable_scheduler: Migrating d0v14 from cpu 21
(XEN) cpu_disable_scheduler: Migrating d0v29 from cpu 21
(XEN) cpu_disable_scheduler: Migrating d0v41 from cpu 21
(XEN) cpu_disable_scheduler: Migrating d0v3 from cpu 22
(XEN) cpu_disable_scheduler: Migrating d0v14 from cpu 22
(XEN) cpu_disable_scheduler: Migrating d0v23 from cpu 22
(XEN) cpu_disable_scheduler: Migrating d0v29 from cpu 22
(XEN) cpu_disable_scheduler: Migrating d0v41 from cpu 22
(XEN) cpu_disable_scheduler: Migrating d0v3 from cpu 23
(XEN) cpu_disable_scheduler: Migrating d0v14 from cpu 23
(XEN) cpu_disable_scheduler: Migrating d0v23 from cpu 23
(XEN) cpu_disable_scheduler: Migrating d0v29 from cpu 23
Rewriting config file
Creating new pool
Using config file "cpupool.test"
cpupool name:   Pool-node3
scheduler:      credit
number of cpus: 1
Populating new pool
Removing CPUs from Pool 0
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 24
(XEN) cpu_disable_scheduler: Migrating d0v34 from cpu 24
(XEN) cpu_disable_scheduler: Migrating d0v42 from cpu 24
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 25
(XEN) cpu_disable_scheduler: Migrating d0v34 from cpu 25
(XEN) cpu_disable_scheduler: Migrating d0v42 from cpu 25
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 26
(XEN) cpu_disable_scheduler: Migrating d0v32 from cpu 26
(XEN) cpu_disable_scheduler: Migrating d0v42 from cpu 26
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 27
(XEN) cpu_disable_scheduler: Migrating d0v24 from cpu 27
(XEN) cpu_disable_scheduler: Migrating d0v32 from cpu 27
(XEN) cpu_disable_scheduler: Migrating d0v42 from cpu 27
(XEN) cpu_disable_scheduler: Migrating d0v3 from cpu 28
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 28
(XEN) cpu_disable_scheduler: Migrating d0v25 from cpu 28
(XEN) cpu_disable_scheduler: Migrating d0v32 from cpu 28
(XEN) cpu_disable_scheduler: Migrating d0v39 from cpu 28
(XEN) cpu_disable_scheduler: Migrating d0v3 from cpu 29
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 29
(XEN) cpu_disable_scheduler: Migrating d0v25 from cpu 29
(XEN) cpu_disable_scheduler: Migrating d0v32 from cpu 29
(XEN) cpu_disable_scheduler: Migrating d0v39 from cpu 29
Rewriting config file
Creating new pool
Using config file "cpupool.test"
cpupool name:   Pool-node4
scheduler:      credit
number of cpus: 1
(XEN) Xen BUG at sched_credit.c:384
(XEN) ----[ Xen-4.1.0-rc3-pre  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    32
(XEN) RIP:    e008:[<ffff82c480117fa0>] csched_alloc_pdata+0x146/0x17f
(XEN) RFLAGS: 0000000000010093   CONTEXT: hypervisor
(XEN) rax: ffff830434322000   rbx: ffff830a3800f1e8   rcx: 0000000000000018
(XEN) rdx: ffff82c4802d3ec0   rsi: 0000000000000002   rdi: ffff83043445e100
(XEN) rbp: ffff8304343efce8   rsp: ffff8304343efca8   r8:  0000000000000001
(XEN) r9:  ffff830a3800f1e8   r10: ffff82c480219dc0   r11: 0000000000000286
(XEN) r12: 0000000000000018   r13: ffff8310341a7d50   r14: ffff830a3800f1d0
(XEN) r15: 0000000000000018   cr0: 000000008005003b   cr4: 00000000000006f0
(XEN) cr3: 0000000806aed000   cr2: 00007f50c671def5
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen stack trace from rsp=ffff8304343efca8:
(XEN)    ffff8304343efcb8 ffff8310341a7d50 0000000000000282 0000000000000018
(XEN)    ffff830a3800f460 ffff8310341a7c60 0000000000000018 ffff82c4802b0880
(XEN)    ffff8304343efd58 ffff82c48011fa63 ffff82f601024d80 000000000008126c
(XEN)    ffff8300c7e42000 0000000000000000 0000080000000000 ffff82c480248b80
(XEN)    0000000000000002 0000000000000018 ffff830a3800f460 0000000000305000
(XEN)    ffff82c4802550e4 ffff82c4802b0880 ffff8304343efd78 ffff82c48010188c
(XEN)    ffff8304343efe40 0000000000000018 ffff8304343efdb8 ffff82c480101b94
(XEN)    ffff8304343efdb8 ffff82c480183562 fffffffe00000286 ffff8304343eff18
(XEN)    000000000066e004 0000000000305000 ffff8304343efef8 ffff82c4801252a1
(XEN)    ffff8304343efdd8 0000000180153c8d 0000000000000000 ffff82c4801068f8
(XEN)    0000000000000296 ffff8300c7e1e1c8 aaaaaaaaaaaaaaaa 0000000000000000
(XEN)    ffff88007d094170 ffff88007d094170 ffff8304343efef8 ffff82c480113d8a
(XEN)    ffff8304343efe78 ffff8304343efe88 0000000800000012 0000000400000004
(XEN)    00007fff00000001 0000000000000018 00000000000000b3 0000000000000072
(XEN)    00007f50c64e5960 0000000000000018 00007fff85f117c0 00007f50c6b48342
(XEN)    0000000000000001 0000000000000000 0000000000000018 0000000000000004
(XEN)    000000000066d050 000000000066e000 85f1189c00000000 0000000000000033
(XEN)    ffff8304343efed8 ffff8300c7e1e000 00007fff85f11600 0000000000305000
(XEN)    0000000000000003 0000000000000003 00007cfbcbc100c7 ffff82c480207be8
(XEN)    ffffffff8100946a 0000000000000023 0000000000000003 0000000000000003
(XEN) Xen call trace:
(XEN)    [<ffff82c480117fa0>] csched_alloc_pdata+0x146/0x17f
(XEN)    [<ffff82c48011fa63>] schedule_cpu_switch+0x75/0x1cd
(XEN)    [<ffff82c48010188c>] cpupool_assign_cpu_locked+0x44/0x8b
(XEN)    [<ffff82c480101b94>] cpupool_do_sysctl+0x1fb/0x461
(XEN)    [<ffff82c4801252a1>] do_sysctl+0x921/0xa30
(XEN)    [<ffff82c480207be8>] syscall_enter+0xc8/0x122
(XEN)    
(XEN) 
(XEN) ****************************************
(XEN) Panic on CPU 32:
(XEN) Xen BUG at sched_credit.c:384
(XEN) ****************************************
(XEN) 
(XEN) Reboot in five seconds...

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel


Thread overview: 53+ messages
2011-01-27 23:18 Hypervisor crash(!) on xl cpupool-numa-split Andre Przywara
2011-01-28  6:47 ` Juergen Gross
2011-01-28 11:07   ` Andre Przywara
2011-01-28 11:44     ` Juergen Gross
2011-01-28 13:14       ` Andre Przywara
2011-01-31  7:04         ` Juergen Gross
2011-01-31 14:59           ` Andre Przywara
2011-01-31 15:28             ` George Dunlap
2011-02-01 16:32               ` Andre Przywara
2011-02-02  6:27                 ` Juergen Gross
2011-02-02  8:49                   ` Juergen Gross
2011-02-02 10:05                     ` Juergen Gross
2011-02-02 10:59                       ` Andre Przywara
2011-02-02 14:39                 ` Stephan Diestelhorst
2011-02-02 15:14                   ` Juergen Gross
2011-02-02 16:01                     ` Stephan Diestelhorst
2011-02-03  5:57                       ` Juergen Gross
2011-02-03  9:18                         ` Juergen Gross
2011-02-04 14:09                           ` Andre Przywara
2011-02-07 12:38                             ` Andre Przywara
2011-02-07 13:32                               ` Juergen Gross
2011-02-07 15:55                                 ` George Dunlap
2011-02-08  5:43                                   ` Juergen Gross
2011-02-08 12:08                                     ` George Dunlap
2011-02-08 12:14                                       ` George Dunlap
2011-02-08 16:33                                         ` Andre Przywara [this message]
2011-02-09 12:27                                           ` George Dunlap
2011-02-09 12:27                                             ` George Dunlap
2011-02-09 13:04                                               ` Juergen Gross
2011-02-09 13:39                                                 ` Andre Przywara
2011-02-09 13:51                                               ` Andre Przywara
2011-02-09 14:21                                                 ` Juergen Gross
2011-02-10  6:42                                                   ` Juergen Gross
2011-02-10  9:25                                                     ` Andre Przywara
2011-02-10 14:18                                                       ` Andre Przywara
2011-02-11  6:17                                                         ` Juergen Gross
2011-02-11  7:39                                                           ` Andre Przywara
2011-02-14 17:57                                                             ` George Dunlap
2011-02-15  7:22                                                               ` Juergen Gross
2011-02-16  9:47                                                                 ` Juergen Gross
2011-02-16 13:54                                                                   ` George Dunlap
     [not found]                                                                     ` <4D6237C6.1050206@amd.com>
2011-02-16 14:11                                                                     ` Juergen Gross
2011-02-16 14:28                                                                       ` Juergen Gross
2011-02-17  0:05                                                                       ` André Przywara
2011-02-17  7:05                                                                     ` Juergen Gross
2011-02-17  9:11                                                                       ` Juergen Gross
2011-02-21 10:00                                                                     ` Andre Przywara
2011-02-21 13:19                                                                       ` Juergen Gross
2011-02-21 14:45                                                                         ` Andre Przywara
2011-02-21 14:50                                                                           ` Juergen Gross
2011-02-08 12:23                                       ` Juergen Gross
2011-01-28 11:13   ` George Dunlap
2011-01-28 13:05     ` Andre Przywara
