From mboxrd@z Thu Jan 1 00:00:00 1970
From: Juergen Gross
Subject: Re: credit2 crash
Date: Fri, 07 May 2010 09:11:59 +0200
Message-ID: <4BE3BD3F.6010409@ts.fujitsu.com>
References: <4BDA07B8.5020303@goop.org> <4BDA717C.4040209@ts.fujitsu.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------020405010303000502080705"
Return-path:
In-Reply-To: <4BDA717C.4040209@ts.fujitsu.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jeremy Fitzhardinge
Cc: George Dunlap, Xen-devel
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.
--------------020405010303000502080705
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable

Hi,

attached patch solves the credit2 scheduler problem with cpupools.
The system isn't crashing any more.

Juergen

On 04/30/2010 07:58 AM, Juergen Gross wrote:
> Hi,
>
> seems as if my cpupool stuff introduced this one.
> Attached patch repairs this bug, but my test machine crashes a little bit
> later:
>
> (XEN) EPT support 2M super page.
> (XEN) EPT support 2M super page.
> (XEN) EPT support 2M super page.
> (XEN) Total of 4 processors activated.
> (XEN) ENABLING IO-APIC IRQs
> (XEN) -> Using new ACK method
> (XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
> (XEN) TSC is reliable, synchronization unnecessary
> (XEN) Platform timer is 14.318MHz HPET
> (XEN) Allocated console ring of 32 KiB.
> (XEN) microcode.c:73:d32767 microcode: CPU1 resumed
> (XEN) microcode.c:73:d32767 microcode: CPU3 resumed
> (XEN) Brought up 4 CPUs
> (XEN) ----[ Xen-4.1-unstable x86_64 debug=y Tainted: C ]----
> (XEN) microcode.c:73:d32767 microcode: CPU2 resumed
> (XEN) CPU: 1
> (XEN) RIP: e008:[] csched_schedule+0x161/0x3d9
> (XEN) RFLAGS: 0000000000010002 CONTEXT: hypervisor
> (XEN) rax: ffff83033ff70000 rbx: 0000000000000000 rcx: ffff8300bf2fc000
> (XEN) rdx: 0000000000000000 rsi: ffff83033fff4d30 rdi: 0000000000000000
> (XEN) rbp: ffff83033ff37e10 rsp: ffff83033ff37d80 r8: 00000000074fc690
> (XEN) r9: 0000000000000004 r10: 0000000000000004 r11: 0000000000000001
> (XEN) r12: ffff82c48039a9e0 r13: 0000000000000001 r14: ffff83033fff4d30
> (XEN) r15: ffffffffffffffe0 cr0: 000000008005003b cr4: 00000000000026f0
> (XEN) cr3: 00000000bf58c000 cr2: 0000000000000018
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> (XEN) Xen stack trace from rsp=ffff83033ff37d80:
> (XEN) ffff82c48024bec0 ffff82c48022ed16 0000000000000049 00000000074fc690
> (XEN) ffff82c480249dc0 000000000000003e 0000000000000006 ffff83033ff37dd0
> (XEN) ffff82c480121c41 ffff82c48027a180 ffff83033ff37e10 0000000000000082
> (XEN) 0000000000000082 ffff83033ff37f28 ffff8300bf2fc000 ffff82c480249dc0
> (XEN) ffff82c48027c0b0 ffff82c48027c080 ffff83033ff37e90 ffff82c48012032d
> (XEN) ffff82c480121c99 00000000074fc690 ffff83033ff37e90 ffff82c480124431
> (XEN) 0000000000000001 ffff82c48027ae50 ffff83033ff37e90 ffff82c48027c180
> (XEN) ffff82c4803a0a68 0000000000000001 ffff82c48024c6c0 ffff82c48039c880
> (XEN) ffff83033ff37f28 ffffffffffffffff ffff83033ff37ed0 ffff82c480121b4f
> (XEN) ffff82c480121c99 ffff83033ff37f28 ffff82c48024c6c0 ffff83033ff37f28
> (XEN) ffff82c48027ab80 ffff82c48024d138 ffff83033ff37ee0 ffff82c480121b6e
> (XEN) ffff83033ff37f20 ffff82c480152b8a 0000000000000000 ffff83033ff37f28
> (XEN) 0000000000002000 0000000000000001 ffff82c48027ae50 ffff82c48024bec0
> (XEN) ffff83033ff37e90 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000001 ffff8300bf2fc000
> (XEN) Xen call trace:
> (XEN) [] csched_schedule+0x161/0x3d9
> (XEN) [] schedule+0xf0/0x5f0
> (XEN) [] __do_softirq+0x74/0x85
> (XEN) [] do_softirq+0xe/0x10
> (XEN) [] idle_loop+0x92/0x94
> (XEN)
> (XEN) Pagetable walk from 0000000000000018:
> (XEN) L4[0x000] = 000000033ffee063 5555555555555555
> (XEN) L3[0x000] = 000000033ffed063 5555555555555555
> (XEN) L2[0x000] = 000000033ffec063 5555555555555555
> (XEN) L1[0x000] = 0000000000000000 ffffffffffffffff
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 1:
> (XEN) FATAL PAGE FAULT
> (XEN) [error_code=0000]
> (XEN) Faulting linear address: 0000000000000018
> (XEN) ****************************************
>
>
>
> On 04/30/2010 12:27 AM, Jeremy Fitzhardinge wrote:
>> I'm seeing this crash when I boot with sched=credit2:
>>
>> __ __ _ _ _ _ _ _
>> \ \/ /___ _ __ | || | / | _ _ _ __ ___| |_ __ _| |__ | | ___
>> \ // _ \ '_ \ | || |_ | |__| | | | '_ \/ __| __/ _` | '_ \| |/ _ \
>> / \ __/ | | | |__ _|| |__| |_| | | | \__ \ || (_| | |_) | | __/
>> /_/\_\___|_| |_| |_|(_)_| \__,_|_| |_|___/\__\__,_|_.__/|_|\___|
>>
>> (XEN) Xen version 4.1-unstable (jeremy@) (gcc version 4.4.3 20100127
>> (Red Hat 4.4.3-4) (GCC) ) Wed Apr 28 17:22:29 PDT 2010
>> (XEN) Latest ChangeSet: Wed Apr 28 17:21:55 2010 -0700 21244:b0fbcf6cbf51
>> (XEN) Command line: com2=115200,8n1,0x3e8,5 console=com2,vga
>> cpufreq=xen iommu=pv sched=credit2
>> (XEN) Video information:
>> (XEN) VGA is text mode 80x25, font 8x16
>> (XEN) VBE/DDC methods: none; EDID transfer time: 0 seconds
>> (XEN) EDID info not retrieved because no DDC retrieval method detected
>> (XEN) Disc information:
>> (XEN) Found 4 MBR signatures
>> (XEN) Found 4 EDD information structures
>> (XEN) Xen-e820 RAM map:
>> (XEN) 0000000000000000 - 000000000009d800 (usable)
>> (XEN) 000000000009d800 - 00000000000a0000 (reserved)
>> (XEN) 00000000000e0000 - 0000000000100000 (reserved)
>> (XEN) 0000000000100000 - 00000000bf790000 (usable)
>> (XEN) 00000000bf79e000 - 00000000bf7a0000 type 9
>> (XEN) 00000000bf7a0000 - 00000000bf7ae000 (ACPI data)
>> (XEN) 00000000bf7ae000 - 00000000bf7d0000 (ACPI NVS)
>> (XEN) 00000000bf7d0000 - 00000000bf7e0000 (reserved)
>> (XEN) 00000000bf7ed000 - 00000000c0000000 (reserved)
>> (XEN) 00000000e0000000 - 00000000f0000000 (reserved)
>> (XEN) 00000000fed20000 - 00000000fed40000 (reserved)
>> (XEN) 00000000fee00000 - 00000000fee01000 (reserved)
>> (XEN) 0000000100000000 - 0000000140000000 (usable)
>> (XEN) ACPI: RSDP 000FA110, 0024 (r2 ACPIAM)
>> (XEN) ACPI: XSDT BF7A0100, 0084 (r1 SMCI 20100225 MSFT 97)
>> (XEN) ACPI: FACP BF7A0290, 00F4 (r4 022510 FACP1918 20100225 MSFT 97)
>> (XEN) ACPI: DSDT BF7A05F0, 6C55 (r2 10605 10605000 0 INTL 20051117)
>> (XEN) ACPI: FACS BF7AE000, 0040
>> (XEN) ACPI: APIC BF7A0390, 0092 (r2 022510 APIC1918 20100225 MSFT 97)
>> (XEN) ACPI: MCFG BF7A0430, 003C (r1 022510 OEMMCFG 20100225 MSFT 97)
>> (XEN) ACPI: OEMB BF7AE040, 0073 (r1 022510 OEMB1918 20100225 MSFT 97)
>> (XEN) ACPI: HPET BF7AA5F0, 0038 (r1 022510 OEMHPET 20100225 MSFT 97)
>> (XEN) ACPI: GSCI BF7AE0C0, 2024 (r1 022510 GMCHSCI 20100225 MSFT 97)
>> (XEN) ACPI: DMAR BF7B00F0, 0090 (r1 AMI OEMDMAR 1 MSFT 97)
>> (XEN) ACPI: SSDT BF7B1580, 0363 (r1 DpgPmm CpuPm 12 INTL 20051117)
>> (XEN) ACPI: EINJ BF7AA630, 0130 (r1 AMIER AMI_EINJ 20100225 MSFT 97)
>> (XEN) ACPI: BERT BF7AA7C0, 0030 (r1 AMIER AMI_BERT 20100225 MSFT 97)
>> (XEN) ACPI: ERST BF7AA7F0, 01B0 (r1 AMIER AMI_ERST 20100225 MSFT 97)
>> (XEN) ACPI: HEST BF7AA9A0, 00A8 (r1 AMIER ABC_HEST 20100225 MSFT 97)
>> (XEN) System RAM: 4087MB (4185268kB)
>> (XEN) No NUMA configuration found
>> (XEN) Faking a node at 0000000000000000-0000000140000000
>> (XEN) Domain heap initialised
>> (XEN) found SMP MP-table at 000ff780
>> (XEN) DMI present.
>> (XEN) Using APIC driver default
>> (XEN) ACPI: PM-Timer IO Port: 0x808
>> (XEN) ACPI: ACPI SLEEP INFO: pm1x_cnt[804,0], pm1x_evt[800,0]
>> (XEN) ACPI: wakeup_vec[bf7ae00c], vec_size[20]
>> (XEN) ACPI: Local APIC address 0xfee00000
>> (XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
>> (XEN) Processor #0 7:14 APIC version 21
>> (XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
>> (XEN) Processor #2 7:14 APIC version 21
>> (XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
>> (XEN) Processor #4 7:14 APIC version 21
>> (XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
>> (XEN) Processor #6 7:14 APIC version 21
>> (XEN) ACPI: LAPIC (acpi_id[0x05] lapic_id[0x84] disabled)
>> (XEN) ACPI: LAPIC (acpi_id[0x06] lapic_id[0x85] disabled)
>> (XEN) ACPI: LAPIC (acpi_id[0x07] lapic_id[0x86] disabled)
>> (XEN) ACPI: LAPIC (acpi_id[0x08] lapic_id[0x87] disabled)
>> (XEN) ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
>> (XEN) ACPI: IOAPIC (id[0x07] address[0xfec00000] gsi_base[0])
>> (XEN) IOAPIC[0]: apic_id 7, version 32, address 0xfec00000, GSI 0-23
>> (XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
>> (XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
>> (XEN) ACPI: IRQ0 used by override.
>> (XEN) ACPI: IRQ2 used by override.
>> (XEN) ACPI: IRQ9 used by override.
>> (XEN) Enabling APIC mode: Flat. Using 1 I/O APICs
>> (XEN) ACPI: HPET id: 0x8086a701 base: 0xfed00000
>> (XEN) PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 255
>> (XEN) PCI: MCFG area at e0000000 reserved in E820
>> (XEN) Using ACPI (MADT) for SMP configuration information
>> (XEN) Using scheduler: SMP Credit Scheduler rev2 (credit2)
>> (XEN) Initializing Credit2 scheduler
>> (XEN) WARNING: This is experimental software in development.
>> (XEN) Use at your own risk.
>> (XEN) csched_dom_init: Initializing domain 32767
>> (XEN) Unknown interrupt (cr2=0000000000000000)
>> (XEN) ffff82c480275000 ffff830000087f40 ffff830000087fc0
>> 0000000000000080 ffff82c48037ff18 0000000000000080 0080000000000000
>> 0000000000000001 0200000000000000 0000000000000000 0000000000000000
>> ffff82f6017e5f80 00000000000bf5ba 00000000000bf5b8 ffff82c48023e038
>> ffff82c48026000b 000000000000e008 0000000000010046 ffff82c48037fe58
>> 0000000000000000 ffff82c480260006 0000000000000000 0000000000000000
>> 0000000000000000 0000000001af2c70 ffff8300bf585ff8 ffff830000087fc0
>> ffff830000087f40 0000100000000000 ffff830000087f40 ffff8300bf584ff8
>> ffff82c4803a2e74 0000000000000000 0000000000cbec01 ffff830000000011
>> 0000000800000000 000000010000006e 0000000000000003 00000000000002f8
>> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> 0000000000000000 0000000000000000 0000000000067ddc ffff82c4801000b5
>> 0000000000000000 0000000000000000
>
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> 00000000fffff000
>>
>>
>> This is a single socket 4 core Nehalem machine.
>>
>> J
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
>>
>>
>
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html

--------------020405010303000502080705
Content-Type: text/x-patch; name="pool-credit2.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="pool-credit2.patch"

diff -r ccae861f52f7 xen/common/cpupool.c
--- a/xen/common/cpupool.c Thu May 06 11:59:55 2010 +0100
+++ b/xen/common/cpupool.c Fri May 07 09:05:17 2010 +0200
@@ -559,6 +559,8 @@ addcpu_out:
     spin_unlock(&cpupool_ctl_lock);
+    spin_unlock(&cpupool_ctl_lock);
+
     return ret;
 }
diff -r ccae861f52f7 xen/common/sched_credit2.c
--- a/xen/common/sched_credit2.c Thu May 06 11:59:55 2010 +0100
+++ b/xen/common/sched_credit2.c Fri May 07 09:05:17 2010 +0200
@@ -532,6 +532,9 @@ csched_alloc_vdata(const struct schedule
     if ( svc == NULL )
         return NULL;
     memset(svc, 0, sizeof(*svc));
+
+    printk("%s: Allocating vcpu d%dv%d\n",
+           __func__, vc->domain->domain_id, vc->vcpu_id);
 
     INIT_LIST_HEAD(&svc->rqd_elem);
     INIT_LIST_HEAD(&svc->sdom_elem);
@@ -1093,10 +1096,44 @@ csched_dump(const struct scheduler *ops)
 }
 
 static void
-make_runq_map(const struct scheduler *ops)
+csched_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
+{
+    unsigned long flags;
+    struct csched_private *prv = CSCHED_PRIV(ops);
+
+    spin_lock_irqsave(&prv->lock, flags);
+    prv->ncpus--;
+    spin_unlock_irqrestore(&prv->lock, flags);
+
+    return;
+}
+
+static void *
+csched_alloc_pdata(const struct scheduler *ops, int cpu)
+{
+    spinlock_t *new_lock;
+    spinlock_t *old_lock = per_cpu(schedule_data, cpu).schedule_lock;
+    unsigned long flags;
+    struct csched_private *prv = CSCHED_PRIV(ops);
+
+    printk("%s: Allocating pdata %d\n", __func__, cpu);
+
+    spin_lock_irqsave(old_lock, flags);
+    new_lock = &per_cpu(schedule_data, prv->runq_map[cpu])._lock;
+    per_cpu(schedule_data, cpu).schedule_lock = new_lock;
+    spin_unlock_irqrestore(old_lock, flags);
+
+    spin_lock_irqsave(&prv->lock, flags);
+    prv->ncpus++;
+    spin_unlock_irqrestore(&prv->lock, flags);
+
+    return (void *)1;
+}
+
+static void
+make_runq_map(struct csched_private *prv)
 {
     int cpu, cpu_count=0;
-    struct csched_private *prv = CSCHED_PRIV(ops);
 
     /* FIXME: Read pcpu layout and do this properly */
     for_each_possible_cpu( cpu )
@@ -1125,13 +1162,14 @@ csched_init(struct scheduler *ops, int p
     if ( prv == NULL )
         return 1;
     memset(prv, 0, sizeof(*prv));
+    ops->sched_data = prv;
 
     spin_lock_init(&prv->lock);
     INIT_LIST_HEAD(&prv->sdom);
 
     prv->ncpus = 0;
 
-    make_runq_map(ops);
+    make_runq_map(prv);
 
     for ( i=0; i<prv->runq_count ; i++ )
     {
@@ -1141,21 +1179,6 @@ csched_init(struct scheduler *ops, int p
         rqd->id = i;
         INIT_LIST_HEAD(&rqd->svc);
         INIT_LIST_HEAD(&rqd->runq);
-    }
-
-    /* Initialize pcpu structures */
-    for_each_possible_cpu(i)
-    {
-        int runq_id;
-        spinlock_t *lock;
-
-        /* Point the per-cpu schedule lock to the runq_id lock */
-        runq_id = prv->runq_map[i];
-        lock = &per_cpu(schedule_data, runq_id)._lock;
-
-        per_cpu(schedule_data, i).schedule_lock = lock;
-
-        prv->ncpus++;
     }
 
     return 0;
@@ -1201,6 +1224,8 @@ const struct scheduler sched_credit2_def
     .deinit         = csched_deinit,
     .alloc_vdata    = csched_alloc_vdata,
     .free_vdata     = csched_free_vdata,
+    .alloc_pdata    = csched_alloc_pdata,
+    .free_pdata     = csched_free_pdata,
     .alloc_domdata  = csched_alloc_domdata,
     .free_domdata   = csched_free_domdata,
 };

--------------020405010303000502080705
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

--------------020405010303000502080705--
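
Note for readers of the archive: the sketch below is a standalone illustration of the idea in pool-credit2.patch, not Xen code. As far as the diff shows, csched_init() no longer walks every possible CPU; the redirection of each CPU's schedule_lock to the lock of its runqueue, and the ncpus accounting, move into per-CPU alloc_pdata/free_pdata hooks. Everything named here is an assumption made for illustration: NR_CPUS, struct sched_data, struct sched_private, the atomic_flag based spinlock, and the choice to map every CPU to runqueue 0.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS 4  /* assumption: small fixed CPU count for the sketch */

/* Hypothetical minimal spinlock; Xen's spinlock_t is more involved. */
typedef struct { atomic_flag f; } spinlock_t;

static void spin_lock_init(spinlock_t *l) { atomic_flag_clear(&l->f); }
static void spin_lock(spinlock_t *l)      { while (atomic_flag_test_and_set(&l->f)) { } }
static void spin_unlock(spinlock_t *l)    { atomic_flag_clear(&l->f); }

/* Hypothetical stand-in for the per-CPU schedule_data. */
struct sched_data {
    spinlock_t  _lock;          /* lock storage owned by this CPU */
    spinlock_t *schedule_lock;  /* lock actually taken when scheduling */
};

/* Hypothetical stand-in for struct csched_private. */
struct sched_private {
    spinlock_t lock;            /* protects ncpus */
    int ncpus;                  /* CPUs currently handed to the scheduler */
    int runq_map[NR_CPUS];      /* cpu -> CPU whose _lock guards its runqueue */
};

static struct sched_data sd[NR_CPUS];

/* Scheduler-wide setup: no per-CPU work here, so ncpus starts at 0. */
static void sched_init(struct sched_private *prv)
{
    spin_lock_init(&prv->lock);
    prv->ncpus = 0;
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        spin_lock_init(&sd[cpu]._lock);
        sd[cpu].schedule_lock = &sd[cpu]._lock;  /* default: own lock */
        prv->runq_map[cpu] = 0;  /* assumption: one shared runqueue */
    }
}

/* Per-CPU hook: run when a CPU is given to this scheduler. */
static void alloc_pdata(struct sched_private *prv, int cpu)
{
    spinlock_t *old_lock = sd[cpu].schedule_lock;

    /* Swap the pointer while holding the lock it currently points to. */
    spin_lock(old_lock);
    sd[cpu].schedule_lock = &sd[prv->runq_map[cpu]]._lock;
    spin_unlock(old_lock);

    spin_lock(&prv->lock);
    prv->ncpus++;
    spin_unlock(&prv->lock);
}

/* Per-CPU hook: run when a CPU is taken away from this scheduler. */
static void free_pdata(struct sched_private *prv, int cpu)
{
    (void)cpu;  /* only the counter changes in this sketch */
    spin_lock(&prv->lock);
    assert(prv->ncpus > 0);
    prv->ncpus--;
    spin_unlock(&prv->lock);
}
```

Calling sched_init() once and then alloc_pdata() for each CPU that joins leaves all of those CPUs pointing at the same lock and ncpus equal to the number of CPUs added, which is the bookkeeping the removed for_each_possible_cpu loop used to do unconditionally at init time.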