From: Juergen Gross <juergen.gross@ts.fujitsu.com>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
Xen-devel <xen-devel@lists.xensource.com>
Subject: Re: credit2 crash
Date: Fri, 07 May 2010 09:19:49 +0200 [thread overview]
Message-ID: <4BE3BF15.2020404@ts.fujitsu.com> (raw)
In-Reply-To: <4BE3BD3F.6010409@ts.fujitsu.com>
[-- Attachment #1: Type: text/plain, Size: 11907 bytes --]
Sorry, please use the corrected patch instead.
The previous one contained an error introduced by applying the patch on top of
an already modified source tree...
Juergen
On 05/07/2010 09:11 AM, Juergen Gross wrote:
> Hi,
>
> the attached patch solves the credit2 scheduler problem with cpupools.
> The system no longer crashes.
>
>
> Juergen
>
> On 04/30/2010 07:58 AM, Juergen Gross wrote:
>> Hi,
>>
>> It seems my cpupool changes introduced this one.
>> The attached patch repairs this bug, but my test machine now crashes slightly
>> later:
>>
>> (XEN) EPT support 2M super page.
>> (XEN) EPT support 2M super page.
>> (XEN) EPT support 2M super page.
>> (XEN) Total of 4 processors activated.
>> (XEN) ENABLING IO-APIC IRQs
>> (XEN) -> Using new ACK method
>> (XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
>> (XEN) TSC is reliable, synchronization unnecessary
>> (XEN) Platform timer is 14.318MHz HPET
>> (XEN) Allocated console ring of 32 KiB.
>> (XEN) microcode.c:73:d32767 microcode: CPU1 resumed
>> (XEN) microcode.c:73:d32767 microcode: CPU3 resumed
>> (XEN) Brought up 4 CPUs
>> (XEN) ----[ Xen-4.1-unstable x86_64 debug=y Tainted: C ]----
>> (XEN) microcode.c:73:d32767 microcode: CPU2 resumed
>> (XEN) CPU: 1
>> (XEN) RIP: e008:[<ffff82c48011b030>] csched_schedule+0x161/0x3d9
>> (XEN) RFLAGS: 0000000000010002 CONTEXT: hypervisor
>> (XEN) rax: ffff83033ff70000 rbx: 0000000000000000 rcx: ffff8300bf2fc000
>> (XEN) rdx: 0000000000000000 rsi: ffff83033fff4d30 rdi: 0000000000000000
>> (XEN) rbp: ffff83033ff37e10 rsp: ffff83033ff37d80 r8: 00000000074fc690
>> (XEN) r9: 0000000000000004 r10: 0000000000000004 r11: 0000000000000001
>> (XEN) r12: ffff82c48039a9e0 r13: 0000000000000001 r14: ffff83033fff4d30
>> (XEN) r15: ffffffffffffffe0 cr0: 000000008005003b cr4: 00000000000026f0
>> (XEN) cr3: 00000000bf58c000 cr2: 0000000000000018
>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
>> (XEN) Xen stack trace from rsp=ffff83033ff37d80:
>> (XEN) ffff82c48024bec0 ffff82c48022ed16 0000000000000049 00000000074fc690
>> (XEN) ffff82c480249dc0 000000000000003e 0000000000000006 ffff83033ff37dd0
>> (XEN) ffff82c480121c41 ffff82c48027a180 ffff83033ff37e10 0000000000000082
>> (XEN) 0000000000000082 ffff83033ff37f28 ffff8300bf2fc000 ffff82c480249dc0
>> (XEN) ffff82c48027c0b0 ffff82c48027c080 ffff83033ff37e90 ffff82c48012032d
>> (XEN) ffff82c480121c99 00000000074fc690 ffff83033ff37e90 ffff82c480124431
>> (XEN) 0000000000000001 ffff82c48027ae50 ffff83033ff37e90 ffff82c48027c180
>> (XEN) ffff82c4803a0a68 0000000000000001 ffff82c48024c6c0 ffff82c48039c880
>> (XEN) ffff83033ff37f28 ffffffffffffffff ffff83033ff37ed0 ffff82c480121b4f
>> (XEN) ffff82c480121c99 ffff83033ff37f28 ffff82c48024c6c0 ffff83033ff37f28
>> (XEN) ffff82c48027ab80 ffff82c48024d138 ffff83033ff37ee0 ffff82c480121b6e
>> (XEN) ffff83033ff37f20 ffff82c480152b8a 0000000000000000 ffff83033ff37f28
>> (XEN) 0000000000002000 0000000000000001 ffff82c48027ae50 ffff82c48024bec0
>> (XEN) ffff83033ff37e90 0000000000000000 0000000000000000 0000000000000000
>> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> (XEN) 0000000000000000 0000000000000000 0000000000000001 ffff8300bf2fc000
>> (XEN) Xen call trace:
>> (XEN) [<ffff82c48011b030>] csched_schedule+0x161/0x3d9
>> (XEN) [<ffff82c48012032d>] schedule+0xf0/0x5f0
>> (XEN) [<ffff82c480121b4f>] __do_softirq+0x74/0x85
>> (XEN) [<ffff82c480121b6e>] do_softirq+0xe/0x10
>> (XEN) [<ffff82c480152b8a>] idle_loop+0x92/0x94
>> (XEN)
>> (XEN) Pagetable walk from 0000000000000018:
>> (XEN) L4[0x000] = 000000033ffee063 5555555555555555
>> (XEN) L3[0x000] = 000000033ffed063 5555555555555555
>> (XEN) L2[0x000] = 000000033ffec063 5555555555555555
>> (XEN) L1[0x000] = 0000000000000000 ffffffffffffffff
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 1:
>> (XEN) FATAL PAGE FAULT
>> (XEN) [error_code=0000]
>> (XEN) Faulting linear address: 0000000000000018
>> (XEN) ****************************************
>>
>>
>>
>> On 04/30/2010 12:27 AM, Jeremy Fitzhardinge wrote:
>>> I'm seeing this crash when I boot with sched=credit2:
>>>
>>> __ __ _ _ _ _ _ _
>>> \ \/ /___ _ __ | || | / | _ _ _ __ ___| |_ __ _| |__ | | ___
>>> \ // _ \ '_ \ | || |_ | |__| | | | '_ \/ __| __/ _` | '_ \| |/ _ \
>>> / \ __/ | | | |__ _|| |__| |_| | | | \__ \ || (_| | |_) | | __/
>>> /_/\_\___|_| |_| |_|(_)_| \__,_|_| |_|___/\__\__,_|_.__/|_|\___|
>>>
>>> (XEN) Xen version 4.1-unstable (jeremy@) (gcc version 4.4.3 20100127
>>> (Red Hat 4.4.3-4) (GCC) ) Wed Apr 28 17:22:29 PDT 2010
>>> (XEN) Latest ChangeSet: Wed Apr 28 17:21:55 2010 -0700
>>> 21244:b0fbcf6cbf51
>>> (XEN) Command line: com2=115200,8n1,0x3e8,5 console=com2,vga
>>> cpufreq=xen iommu=pv sched=credit2
>>> (XEN) Video information:
>>> (XEN) VGA is text mode 80x25, font 8x16
>>> (XEN) VBE/DDC methods: none; EDID transfer time: 0 seconds
>>> (XEN) EDID info not retrieved because no DDC retrieval method detected
>>> (XEN) Disc information:
>>> (XEN) Found 4 MBR signatures
>>> (XEN) Found 4 EDD information structures
>>> (XEN) Xen-e820 RAM map:
>>> (XEN) 0000000000000000 - 000000000009d800 (usable)
>>> (XEN) 000000000009d800 - 00000000000a0000 (reserved)
>>> (XEN) 00000000000e0000 - 0000000000100000 (reserved)
>>> (XEN) 0000000000100000 - 00000000bf790000 (usable)
>>> (XEN) 00000000bf79e000 - 00000000bf7a0000 type 9
>>> (XEN) 00000000bf7a0000 - 00000000bf7ae000 (ACPI data)
>>> (XEN) 00000000bf7ae000 - 00000000bf7d0000 (ACPI NVS)
>>> (XEN) 00000000bf7d0000 - 00000000bf7e0000 (reserved)
>>> (XEN) 00000000bf7ed000 - 00000000c0000000 (reserved)
>>> (XEN) 00000000e0000000 - 00000000f0000000 (reserved)
>>> (XEN) 00000000fed20000 - 00000000fed40000 (reserved)
>>> (XEN) 00000000fee00000 - 00000000fee01000 (reserved)
>>> (XEN) 0000000100000000 - 0000000140000000 (usable)
>>> (XEN) ACPI: RSDP 000FA110, 0024 (r2 ACPIAM)
>>> (XEN) ACPI: XSDT BF7A0100, 0084 (r1 SMCI 20100225 MSFT 97)
>>> (XEN) ACPI: FACP BF7A0290, 00F4 (r4 022510 FACP1918 20100225 MSFT 97)
>>> (XEN) ACPI: DSDT BF7A05F0, 6C55 (r2 10605 10605000 0 INTL 20051117)
>>> (XEN) ACPI: FACS BF7AE000, 0040
>>> (XEN) ACPI: APIC BF7A0390, 0092 (r2 022510 APIC1918 20100225 MSFT 97)
>>> (XEN) ACPI: MCFG BF7A0430, 003C (r1 022510 OEMMCFG 20100225 MSFT 97)
>>> (XEN) ACPI: OEMB BF7AE040, 0073 (r1 022510 OEMB1918 20100225 MSFT 97)
>>> (XEN) ACPI: HPET BF7AA5F0, 0038 (r1 022510 OEMHPET 20100225 MSFT 97)
>>> (XEN) ACPI: GSCI BF7AE0C0, 2024 (r1 022510 GMCHSCI 20100225 MSFT 97)
>>> (XEN) ACPI: DMAR BF7B00F0, 0090 (r1 AMI OEMDMAR 1 MSFT 97)
>>> (XEN) ACPI: SSDT BF7B1580, 0363 (r1 DpgPmm CpuPm 12 INTL 20051117)
>>> (XEN) ACPI: EINJ BF7AA630, 0130 (r1 AMIER AMI_EINJ 20100225 MSFT 97)
>>> (XEN) ACPI: BERT BF7AA7C0, 0030 (r1 AMIER AMI_BERT 20100225 MSFT 97)
>>> (XEN) ACPI: ERST BF7AA7F0, 01B0 (r1 AMIER AMI_ERST 20100225 MSFT 97)
>>> (XEN) ACPI: HEST BF7AA9A0, 00A8 (r1 AMIER ABC_HEST 20100225 MSFT 97)
>>> (XEN) System RAM: 4087MB (4185268kB)
>>> (XEN) No NUMA configuration found
>>> (XEN) Faking a node at 0000000000000000-0000000140000000
>>> (XEN) Domain heap initialised
>>> (XEN) found SMP MP-table at 000ff780
>>> (XEN) DMI present.
>>> (XEN) Using APIC driver default
>>> (XEN) ACPI: PM-Timer IO Port: 0x808
>>> (XEN) ACPI: ACPI SLEEP INFO: pm1x_cnt[804,0], pm1x_evt[800,0]
>>> (XEN) ACPI: wakeup_vec[bf7ae00c], vec_size[20]
>>> (XEN) ACPI: Local APIC address 0xfee00000
>>> (XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
>>> (XEN) Processor #0 7:14 APIC version 21
>>> (XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
>>> (XEN) Processor #2 7:14 APIC version 21
>>> (XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
>>> (XEN) Processor #4 7:14 APIC version 21
>>> (XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
>>> (XEN) Processor #6 7:14 APIC version 21
>>> (XEN) ACPI: LAPIC (acpi_id[0x05] lapic_id[0x84] disabled)
>>> (XEN) ACPI: LAPIC (acpi_id[0x06] lapic_id[0x85] disabled)
>>> (XEN) ACPI: LAPIC (acpi_id[0x07] lapic_id[0x86] disabled)
>>> (XEN) ACPI: LAPIC (acpi_id[0x08] lapic_id[0x87] disabled)
>>> (XEN) ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
>>> (XEN) ACPI: IOAPIC (id[0x07] address[0xfec00000] gsi_base[0])
>>> (XEN) IOAPIC[0]: apic_id 7, version 32, address 0xfec00000, GSI 0-23
>>> (XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
>>> (XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
>>> (XEN) ACPI: IRQ0 used by override.
>>> (XEN) ACPI: IRQ2 used by override.
>>> (XEN) ACPI: IRQ9 used by override.
>>> (XEN) Enabling APIC mode: Flat. Using 1 I/O APICs
>>> (XEN) ACPI: HPET id: 0x8086a701 base: 0xfed00000
>>> (XEN) PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 255
>>> (XEN) PCI: MCFG area at e0000000 reserved in E820
>>> (XEN) Using ACPI (MADT) for SMP configuration information
>>> (XEN) Using scheduler: SMP Credit Scheduler rev2 (credit2)
>>> (XEN) Initializing Credit2 scheduler
>>> (XEN) WARNING: This is experimental software in development.
>>> (XEN) Use at your own risk.
>>> (XEN) csched_dom_init: Initializing domain 32767
>>> (XEN) Unknown interrupt (cr2=0000000000000000)
>>> (XEN) ffff82c480275000 ffff830000087f40 ffff830000087fc0
>>> 0000000000000080 ffff82c48037ff18 0000000000000080 0080000000000000
>>> 0000000000000001 0200000000000000 0000000000000000 0000000000000000
>>> ffff82f6017e5f80 00000000000bf5ba 00000000000bf5b8 ffff82c48023e038
>>> ffff82c48026000b 000000000000e008 0000000000010046 ffff82c48037fe58
>>> 0000000000000000 ffff82c480260006 0000000000000000 0000000000000000
>>> 0000000000000000 0000000001af2c70 ffff8300bf585ff8 ffff830000087fc0
>>> ffff830000087f40 0000100000000000 ffff830000087f40 ffff8300bf584ff8
>>> ffff82c4803a2e74 0000000000000000 0000000000cbec01 ffff830000000011
>>> 0000000800000000 000000010000006e 0000000000000003 00000000000002f8
>>> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> 0000000000000000 0000000000000000 0000000000067ddc ffff82c4801000b5
>>> 0000000000000000 0000000000000000
>>> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> 00000000fffff000
>>>
>>>
>>> This is a single socket 4 core Nehalem machine.
>>>
>>> J
>>>
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@lists.xensource.com
>>> http://lists.xensource.com/xen-devel
>>>
>>>
>>
>>
>>
>>
>
>
>
>
--
Juergen Gross                    Principal Developer Operating Systems
TSP ES&S SWE OS6                 Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions     e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                    Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html
[-- Attachment #2: pool-credit2.patch --]
[-- Type: text/x-patch, Size: 3098 bytes --]
diff -r ccae861f52f7 xen/common/sched_credit2.c
--- a/xen/common/sched_credit2.c Thu May 06 11:59:55 2010 +0100
+++ b/xen/common/sched_credit2.c Fri May 07 09:05:17 2010 +0200
@@ -532,6 +532,9 @@ csched_alloc_vdata(const struct schedule
     if ( svc == NULL )
         return NULL;
     memset(svc, 0, sizeof(*svc));
+
+    printk("%s: Allocating vcpu d%dv%d\n",
+           __func__, vc->domain->domain_id, vc->vcpu_id);
 
     INIT_LIST_HEAD(&svc->rqd_elem);
     INIT_LIST_HEAD(&svc->sdom_elem);
@@ -1093,10 +1096,44 @@ csched_dump(const struct scheduler *ops)
 }
 
 static void
-make_runq_map(const struct scheduler *ops)
+csched_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
+{
+    unsigned long flags;
+    struct csched_private *prv = CSCHED_PRIV(ops);
+
+    spin_lock_irqsave(&prv->lock, flags);
+    prv->ncpus--;
+    spin_unlock_irqrestore(&prv->lock, flags);
+
+    return;
+}
+
+static void *
+csched_alloc_pdata(const struct scheduler *ops, int cpu)
+{
+    spinlock_t *new_lock;
+    spinlock_t *old_lock = per_cpu(schedule_data, cpu).schedule_lock;
+    unsigned long flags;
+    struct csched_private *prv = CSCHED_PRIV(ops);
+
+    printk("%s: Allocating pdata %d\n", __func__, cpu);
+
+    spin_lock_irqsave(old_lock, flags);
+    new_lock = &per_cpu(schedule_data, prv->runq_map[cpu])._lock;
+    per_cpu(schedule_data, cpu).schedule_lock = new_lock;
+    spin_unlock_irqrestore(old_lock, flags);
+
+    spin_lock_irqsave(&prv->lock, flags);
+    prv->ncpus++;
+    spin_unlock_irqrestore(&prv->lock, flags);
+
+    return (void *)1;
+}
+
+static void
+make_runq_map(struct csched_private *prv)
 {
     int cpu, cpu_count=0;
-    struct csched_private *prv = CSCHED_PRIV(ops);
 
     /* FIXME: Read pcpu layout and do this properly */
     for_each_possible_cpu( cpu )
@@ -1125,13 +1162,14 @@ csched_init(struct scheduler *ops, int p
     if ( prv == NULL )
         return 1;
     memset(prv, 0, sizeof(*prv));
+    ops->sched_data = prv;
 
     spin_lock_init(&prv->lock);
     INIT_LIST_HEAD(&prv->sdom);
 
     prv->ncpus = 0;
-    make_runq_map(ops);
+    make_runq_map(prv);
 
     for ( i=0; i<prv->runq_count ; i++ )
     {
@@ -1141,21 +1179,6 @@ csched_init(struct scheduler *ops, int p
         rqd->id = i;
         INIT_LIST_HEAD(&rqd->svc);
         INIT_LIST_HEAD(&rqd->runq);
-    }
-
-    /* Initialize pcpu structures */
-    for_each_possible_cpu(i)
-    {
-        int runq_id;
-        spinlock_t *lock;
-
-        /* Point the per-cpu schedule lock to the runq_id lock */
-        runq_id = prv->runq_map[i];
-        lock = &per_cpu(schedule_data, runq_id)._lock;
-
-        per_cpu(schedule_data, i).schedule_lock = lock;
-
-        prv->ncpus++;
     }
 
     return 0;
@@ -1201,6 +1224,8 @@ const struct scheduler sched_credit2_def
     .deinit         = csched_deinit,
 
     .alloc_vdata    = csched_alloc_vdata,
     .free_vdata     = csched_free_vdata,
+    .alloc_pdata    = csched_alloc_pdata,
+    .free_pdata     = csched_free_pdata,
     .alloc_domdata  = csched_alloc_domdata,
     .free_domdata   = csched_free_domdata,
 };
Thread overview: 4+ messages
2010-04-29 22:27 credit2 crash Jeremy Fitzhardinge
2010-04-30 5:58 ` Juergen Gross
2010-05-07 7:11 ` Juergen Gross
2010-05-07 7:19 ` Juergen Gross [this message]