From mboxrd@z Thu Jan 1 00:00:00 1970
From: George Dunlap
Subject: Re: PVHVM VCPU hotplug mechanism via ACPI hotplug doesn't work in Xen 4.[1, 2, 3]
Date: Wed, 8 May 2013 14:43:12 +0100
Message-ID: <518A5670.7010303@eu.citrix.com>
References: <20130506184509.GA21497@phenom.dumpdata.com>
 <92B37F2487AE0841841737618F25AC1A0FF5A4F3@FTLPEX01CL03.citrite.net>
 <92B37F2487AE0841841737618F25AC1A0FF5A51B@FTLPEX01CL03.citrite.net>
 <20130507194621.GA4776@phenom.dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Stefano Stabellini
Cc: "jinsong.liu@intel.com", "xen-devel@lists.xensource.com", Ian Campbell, Konrad Rzeszutek Wilk, Ian Jackson, Ross Philipson
List-Id: xen-devel@lists.xenproject.org

On 08/05/13 12:14, Stefano Stabellini wrote:
> On Tue, 7 May 2013, Konrad Rzeszutek Wilk wrote:
>> On Tue, May 07, 2013 at 12:16:25AM +0000, Ross Philipson wrote:
>>>> -----Original Message-----
>>>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Ross Philipson
>>>> Sent: Monday, May 06, 2013 5:07 PM
>>>> To: Konrad Rzeszutek Wilk; jinsong.liu@intel.com; Stefano Stabellini;
>>>> xen-devel@lists.xensource.com
>>>> Subject: Re: [Xen-devel] PVHVM VCPU hotplug mechanism via ACPI hotplug
>>>> doesn't work in Xen 4.[1, 2, 3]
>>>>
>>>>> -----Original Message-----
>>>>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Konrad Rzeszutek Wilk
>>>>> Sent: Monday, May 06, 2013 11:45 AM
>>>>> To: jinsong.liu@intel.com; Stefano Stabellini; xen-devel@lists.xensource.com
>>>>> Subject: [Xen-devel] PVHVM VCPU hotplug mechanism via ACPI hotplug
>>>>> doesn't work in Xen 4.[1, 2, 3]
>>>>>
>>>>> Which is probably the case b/c the Linux side implementation for
>>>>> such simple operation as :
>>>>>
>>>>> echo 0 > /sys/devices/system/cpu/cpu1/online
>>>>> echo 1 > /sys/devices/system/cpu/cpu1/online
>>>>>
>>>>> would have blown up so nobody tested it?
>>>>>
>>>>> Now that is fixed (v3.10 + http://lists.xen.org/archives/html/xen-devel/2013-05/msg00503.html)
>>>>> I tried to do 'xm vcpu-set latest X' for a PVHVM guest.
>>>>>
>>>>> The first iteration was using this simple guest config:
>>>>> builder='hvm'
>>>>> disk = [ 'file:/mnt/lab/latest/root_image.iso,hdc:cdrom,r']
>>>>> memory = 2048
>>>>> boot="d"
>>>>> maxvcpus=4
>>>>> vcpus=4
>>>>> serial='pty'
>>>>> vnclisten="0.0.0.0"
>>>>> name="latest"
>>>>> vif = [ 'mac=00:0F:4B:00:00:68, bridge=switch' ]
>>>>>
>>>>> And I tried simple combinations of 'x[m|l] vcpu-set latest 2|3|4' and
>>>>> none of them worked.
>>>>>
>>>>> In Xen 4.1 (and Xen 4.3 if I use:device_model_version="qemu-xen-traditional")
>>>>> I saw that the qemu log does the right thing:
>>>>> .. snip..
>>>>> Remove vcpu 2
>>>>> Remove vcpu 1
>>>>>
>>>>> and the guest's ACPI SCI is incrementing:
>>>>> # cat /proc/interrupts |grep acpi
>>>>> 9: 1 0 0 IO-APIC-fasteoi acpi
>>>>> # cat /proc/interrupts |grep acpi
>>>>> 9: 2 0 0 IO-APIC-fasteoi acpi
>>>>>
>>>>> But nothing looks to be happening. Where should I look?
>>>>> The ACPI DSDT is a bit daunting. Has this ever worked in the past?
>>>>> If so, what code ran to hotplug CPUs?
>>>> I am looking at qemu-traditional (I believe it is traditional
>>>> here: http://xenbits.xen.org/gitweb/?p=qemu-xen-unstable.git;a=summary).
>>>>
>>>> The code for raising the SCI for processor hotplug doesn't
>>>> look quite right to me. This is the code I see in
>>>> qemu_cpu_add_remove:
>>>>
>>>> if (gpe_state.gpe0_en[0] & 4) {
>>>>     qemu_set_irq(sci_irq, 1);
>>>>     qemu_set_irq(sci_irq, 0);
>>>> }
>>>>
>>>> I would expect the code to actually set the GPE status bit for
>>>> the GPE register in question (in this case bit 2 for _L02), maybe:
>>>>
>>>> if ((gpe_state.gpe0_en[0] & 4)&&
>>>>     ((gpe_state.gpe0_sts[0] & 4) == 0)) {
>>>>     gpe_state.gpe0_sts[0] &= 4;
>>>>     qemu_irq_raise(sci_irq);
>>>> }
>>>>
>>>> I also don't understand why it is raising the SCI as level and
>>>> then edge. That GPE is defined in the DSDT as level:
>>>>
>>>> /* Define GPE control method '_L02'. */
>>>> push_block("Scope", "\\_GPE");
>>>> push_block("Method", "_L02");
>>>> stmt("Return", "\\_SB.PRSC()");
>>>>
>>>> Maybe look at how the PCI hotplug GPE's occur (ACPI_PHP_GPE_BIT).
>>>> Without setting a status register the SCI may just be going in the bit bucket.
>>>>
>>>> Anyway hope this is helpful at all...
>>> Never mind, I missed the calls above to enable_processor
>>> and disable_processor which seem to be doing the right
>>> thing (long day on a plane). My sample code above had a
>>> type-o but g->gpe0_sts[0] |= 4; looks right.
>>>
>>> I still don't understand why it raises the SCI level then edge though.
>> Neither do I but it actually looks to work. The problem I had was that
>> the $@$@#( generic hotplug code has a race, which means that udev
>> never ends up onlining the CPU (https://lkml.org/lkml/2012/4/30/198)
>> So it looks like the CPU nevers gets onlined - and it did not until
>> I manually wrote a script to online the CPUs.
>>
>> Now it can online, albeit I am hitting another bug :
>>
>> [ 41.465296] CPU 1 got hotplugged
>> [ 41.469607] installing Xen timer for CPU 1
>> [ 41.475935] SMP alternatives: lockdep: fixing up alternatives
>> [ 41.483014] SMP alternatives: switching to SMP code
>> [ 41.499193] smpboot: Booting Node 0 Processor 1 APIC 0x4
>> [ 41.517099] ------------[ cut here ]------------
>> [ 41.519532] kernel BUG at /home/konrad/ssd/konrad/linux/arch/x86/xen/time.c:337!
>> [ 41.519532] invalid opcode: 0000 [#1] SMP
>> [ 41.519532] Modules linked in: dm_multipath dm_mod iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi libcrc32c crc32c sg sr_mod cdrom ata_generic crc32c_intel ata_piix libata scsi_mod xen_blkfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea xen_kbdfront xenfs xen_privcmd
>> [ 41.519532] CPU 1
>> [ 41.519532] Pid: 0, comm: swapper/1 Not tainted 3.9.0upstream-00022-g49c1bf1-dirty #3 Xen HVM domU
>> [ 41.519532] RIP: 0010:[] [] xen_vcpuop_set_mode+0x3c/0x80
>> [ 41.519532] RSP: 0000:ffff880074467db8 EFLAGS: 00010082
>> [ 41.519532] RAX: ffffffffffffffea RBX: ffff880074a2be40 RCX: 0000000000000001
>> [ 41.519532] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000009
>> [ 41.519532] RBP: ffff880074467db8 R08: 0000000000000001 R09: 0000000000000000
>> [ 41.519532] R10: 0000000000000008 R11: 0000000000000001 R12: 0000000000000001
>> [ 41.519532] R13: 0000000000000082 R14: 0000000000000000 R15: 0000000000000082
>> [ 41.519532] FS: 0000000000000000(0000) GS:ffff880074a20000(0000) knlGS:0000000000000000
>> [ 41.519532] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 41.519532] CR2: 0000000000000000 CR3: 0000000001c0c000 CR4: 00000000000406e0
>> [ 41.519532] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 41.519532] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> [ 41.519532] Process swapper/1 (pid: 0, threadinfo ffff880074466000, task ffff880074464300)
>> [ 41.519532] Stack:
>> [ 41.519532] ffff880074467dd8 ffffffff810e8385 ffff880074a2be40 ffff880074a2be40
>> [ 41.519532] ffff880074467df8 ffffffff810e83e6 ffff880074467df8 0000000000000000
>> [ 41.519532] ffff880074467e28 ffffffff810e84b7 ffffffff810e8ae4 ffff880074a2be40
>> [ 41.519532] Call Trace:
>> [ 41.519532] [] clockevents_set_mode+0x25/0x70
>> [ 41.519532] [] clockevents_shutdown+0x16/0x30
>> [ 41.519532] [] clockevents_exchange_device+0xb7/0x110
>> [ 41.519532] [] ? tick_notify+0x114/0x420
>> [ 41.519532] [] tick_notify+0x1c9/0x420
>> [ 41.519532] [] ? clockevents_register_device+0x31/0x170
>> [ 41.519532] [] notifier_call_chain+0x4d/0x70
>> [ 41.519532] [] raw_notifier_call_chain+0x11/0x20
>> [ 41.519532] [] clockevents_register_device+0xe0/0x170
>> [ 41.519532] [] xen_setup_cpu_clockevents+0x2c/0x50
>> [ 41.519532] [] xen_hvm_setup_cpu_clockevents+0x16/0x20
>> [ 41.519532] [] start_secondary+0x1ea/0x1f9
>> [ 41.519532] Code: 73 2d 48 63 c9 bf 09 00 00 00 31 d2 48 89 ce e8 bb c3 fb ff 85 c0 75 13 bf 07 00 00 00 48 89 ce 31 d2 e8 a8 c3 fb ff 85 c0 74 09 <0f> 0b eb fe 83 ff 03 74 02 c9 c3 bf 07 00 00 00 48 63 f1 31 d2
>> [ 41.519532] RIP [] xen_vcpuop_set_mode+0x3c/0x80
>> [ 41.519532] RSP
>> [ 41.519532] ---[ end trace a182694869545b1a ]---
>> [ 41.519532] Kernel panic - not syncing: Attempted to kill the idle task!
>>
>> Thought this is using xend, as you cannot in xl have maxvcpus != vcpus.

Did you mean maxvcpus != vcpus, or maxvcpus > pcpus?

 -George
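
For readers following the GPE discussion quoted above, here is a minimal sketch in C of how a level-triggered GPE is generally expected to behave: the status bit is latched with |= (as in the corrected g->gpe0_sts[0] |= 4), and the SCI stays asserted while any enabled status bit is set, until the guest clears it. This is not the qemu-traditional code. struct gpe_regs, sci_level(), gpe_signal_cpu_hotplug() and gpe_guest_ack() are hypothetical names, and the four-byte register width is an assumption made only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 2 of GPE0, the bit handled by _L02 in the thread above. */
#define CPU_HOTPLUG_GPE_BIT 0x04

/* Hypothetical stand-in for a device model's GPE0 register block.
 * The width of four bytes is an assumption for this sketch. */
struct gpe_regs {
    uint8_t gpe0_sts[4]; /* status: latched by the device model */
    uint8_t gpe0_en[4];  /* enable: programmed by the guest */
};

/* A level-triggered SCI is asserted while any enabled status bit is set. */
static bool sci_level(const struct gpe_regs *g)
{
    for (int i = 0; i < 4; i++) {
        if (g->gpe0_sts[i] & g->gpe0_en[i])
            return true;
    }
    return false;
}

/* Latch a CPU hotplug event and return the resulting SCI level.
 * The status bit is latched even if the event is not enabled yet. */
static bool gpe_signal_cpu_hotplug(struct gpe_regs *g)
{
    g->gpe0_sts[0] |= CPU_HOTPLUG_GPE_BIT;
    return sci_level(g);
}

/* Guest acknowledgement: GPE status registers are write-1-to-clear,
 * so every bit set in val clears the matching status bit. */
static bool gpe_guest_ack(struct gpe_regs *g, unsigned idx, uint8_t val)
{
    g->gpe0_sts[idx] &= (uint8_t)~val;
    return sci_level(g);
}
```

In this model the interrupt line is derived from the register state instead of being pulsed, so an event signalled before the guest enables the GPE is still delivered once the enable bit is written.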