From mboxrd@z Thu Jan 1 00:00:00 1970
From: Gleb Natapov
Subject: Re: [PATCH] cpu hotplug issue
Date: Sun, 24 Jul 2011 14:56:47 +0300
Message-ID: <20110724115647.GR3044@redhat.com>
References: <20110720083507.GS2400@redhat.com>
 <20110721113342.GB3044@redhat.com>
 <4E281090.9070300@siemens.com>
 <20110721115118.GD3044@redhat.com>
 <20110721124512.GI3044@redhat.com>
 <4E29577A.9080909@siemens.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Vasilis Liaskovitis, "kvm@vger.kernel.org", Markus Armbruster
To: Jan Kiszka
Received: from mx1.redhat.com ([209.132.183.28]:52533 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1752515Ab1GXL46 (ORCPT ); Sun, 24 Jul 2011 07:56:58 -0400
Content-Disposition: inline
In-Reply-To: <4E29577A.9080909@siemens.com>
Sender: kvm-owner@vger.kernel.org

On Fri, Jul 22, 2011 at 12:56:58PM +0200, Jan Kiszka wrote:
> On 2011-07-21 14:45, Gleb Natapov wrote:
> > On Thu, Jul 21, 2011 at 02:51:18PM +0300, Gleb Natapov wrote:
> >>>> Jan can you look at this please?
> >>>
> >>> I can't promise to do debugging myself.
> >>>
> >>> Also, as I never succeeded in getting anything working with CPU hotplug,
> >>> even back in the days it was supposed to work, I'm a bit clueless /wrt
> >>> to the right test cases.
> >>>
> >> CPU hotplug for Linux is supposed to be easy (with allow_hotplug patch
> >> applied). But we have two bugs currently. One is that the ACPI interrupt
> >> is not sent when a cpu is onlined (at least this appears to be the case).
> >> I will look at that one. Another is that after a new cpu is detected it
> >> can't be onlined.
> >>
> >> After fixing the first bug the test should look like this:
> >> 1. start vm with -smp 1,maxcpus=2
> >> 2. wait for it to boot
> >> 3. do "cpu 1 online" in monitor.
> >> 4. do "echo 1 > /sys/devices/system/cpu/cpu1/online"
> >>
> >> Step 4 should succeed. It fails now.
> >>
> > The first one was easy to solve. See patch below. Step 3 should be
> > "cpu_set 1 online".
> >
> > ---
> >
> > Trigger sci interrupt after cpu hotplug/unplug event.
> >
> > Signed-off-by: Gleb Natapov
> > diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
> > index c30a050..40f3fcd 100644
> > --- a/hw/acpi_piix4.c
> > +++ b/hw/acpi_piix4.c
> > @@ -92,7 +92,8 @@ static void pm_update_sci(PIIX4PMState *s)
> >                     ACPI_BITMASK_POWER_BUTTON_ENABLE |
> >                     ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
> >                     ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
> > -        (((s->gpe.sts[0] & s->gpe.en[0]) & PIIX4_PCI_HOTPLUG_STATUS) != 0);
> > +        (((s->gpe.sts[0] & s->gpe.en[0]) &
> > +          (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0);
> >
> >      qemu_set_irq(s->irq, sci_level);
> >      /* schedule a timer interruption if needed */
> > --
> > 			Gleb.
>
> I had a closer look and identified two further issues, one generic, one
> CPU-hotplug-specific:
>  - (qdev) devices that are hotplugged do not receive any reset. That
>    does not only apply to the APIC in case of CPU hotplugging, it is
>    also broken for NICs, storage controllers, etc. when doing PCI
>    hot-add as I just checked via gdb.
>  - CPU hotplugging was always (or at least for a fairly long time),
>    well, fragile as it failed to make CPU thread creation and CPU
>    initialization atomic against APIC addition and other initialization
>    steps. IOW, we need to create CPUs stopped, finish all init work,
>    sync their states completely to the kernel
>    (cpu_synchronize_post_init), and then kick them off. Actually I'm
Syncing the state to the kernel should be done by the vcpu thread, so it
cannot be stopped while the sync is done. Maybe I misunderstood what you
mean here.

>    considering to stop all CPUs during that short phase to make things
>    simpler and future-proof (when we reduce qemu_global_mutex
>    dependencies).
>
> Still, something else must be different for hotplugged CPUs as they fail
> to come up properly every 2 or 3 system resets or online transitions of
> the Linux guest. Will try to understand that once time permits.
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux

--
			Gleb.