From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39810) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cTnPH-0002xK-Rv for qemu-devel@nongnu.org; Wed, 18 Jan 2017 05:23:56 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cTnPE-0006dH-Me for qemu-devel@nongnu.org; Wed, 18 Jan 2017 05:23:55 -0500 Received: from mx1.redhat.com ([209.132.183.28]:43452) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1cTnPE-0006cy-DL for qemu-devel@nongnu.org; Wed, 18 Jan 2017 05:23:52 -0500 References: <20170112182446.9600-1-lersek@redhat.com> <20170112182446.9600-3-lersek@redhat.com> <20170113140914.62af8755@nial.brq.redhat.com> <32d3f5cf-9428-29db-45bb-d3a35c6cbd04@redhat.com> <20170117122131.466d4147@nial.brq.redhat.com> <20170117142059.3a7689de@nial.brq.redhat.com> <17a61164-0cb8-98fd-b340-8cb3a27edb0e@redhat.com> <20170117152056.4e37468e@nial.brq.redhat.com> <20170118110353.3f2ac8a3@nial.brq.redhat.com> <98895612-58dd-f0cb-e761-515f9e29c373@redhat.com> From: Laszlo Ersek Message-ID: Date: Wed, 18 Jan 2017 11:23:48 +0100 MIME-Version: 1.0 In-Reply-To: <98895612-58dd-f0cb-e761-515f9e29c373@redhat.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v6 wave 2 2/3] hw/isa/lpc_ich9: add broadcast SMI feature List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Igor Mammedov Cc: Michael Kinney , Paolo Bonzini , Gerd Hoffmann , qemu devel list , "Michael S. Tsirkin" On 01/18/17 11:19, Laszlo Ersek wrote: > On 01/18/17 11:03, Igor Mammedov wrote: >> On Tue, 17 Jan 2017 19:53:21 +0100 >> Laszlo Ersek wrote: >> [snip] >>> This is the code I wrote (extracting a new ich9_apm_broadcast_smi() >>> function from ich9_apm_ctrl_changed(), due to size reasons): >>> >>>> static void ich9_apm_broadcast_smi(void) >>>> { >>>> CPUState *cs; >>>> >>>> cpu_synchronize_all_states(); /* <--------- mark this */ >> it probably should be executed after pause_all_vcpus(), >> maybe it will help. >> >>>> CPU_FOREACH(cs) { >>>> X86CPU *cpu = X86_CPU(cs); >>>> CPUX86State *env = &cpu->env; >>>> >>>> if (env->smbase == 0x30000 && env->eip == 0xfff0) { >>>> CPUClass *k = CPU_GET_CLASS(cs); >>>> uint64_t cpu_arch_id = k->get_arch_id(cs); >>>> >>>> trace_ich9_apm_broadcast_smi_skip(cpu_arch_id); >>>> continue; >>>> } >>>> >>>> cpu_interrupt(cs, CPU_INTERRUPT_SMI); >>>> } >>>> } >>> >> [...] >>> (b) If I add the cpu_synchronize_all_states() call before the loop, then >>> the incorrect filter matches go away; QEMU sees the KVM VCPU states >>> correctly, and the SMI is broad-cast. >>> >>> However, in this case, the boot slows to a *crawl*. It's unbearable. All >>> VCPUs are pegged at 100%, and you can easily follow the OVMF debug log >>> with the naked eye, almost line by line. >>> I didn't expect that cpu_synchronize_all_states() would be so costly, >>> but it makes the idea of checking VCPU state in the APM_CNT handler a >>> non-starter. >> I wonder were bottleneck is to slow down guest so much. >> What is actual cost of calling cpu_synchronize_all_states()? >> >> Maybe calling pause_all_vcpus() before cpu_synchronize_all_states() >> would help. > > If I prepend just a pause_all_vcpus() function call, at the top, then > the VM freezes (quite understandably) when the first SMI is raised via > APM_CNT. > > If I, instead, bracket the function body with pause_all_vcpus() and > resume_all_vcpus(), like this: > > static void ich9_apm_broadcast_smi(void) > { > CPUState *cs; > > pause_all_vcpus(); > cpu_synchronize_all_states(); > CPU_FOREACH(cs) { > X86CPU *cpu = X86_CPU(cs); > CPUX86State *env = &cpu->env; > > if (env->smbase == 0x30000 && env->eip == 0xfff0) { > CPUClass *k = CPU_GET_CLASS(cs); > uint64_t cpu_arch_id = k->get_arch_id(cs); > > trace_ich9_apm_broadcast_smi_skip(cpu_arch_id); > continue; > } > > cpu_interrupt(cs, CPU_INTERRUPT_SMI); > } > resume_all_vcpus(); > } > > then I see the following symptoms: > - rather than 4 VCPUs being pegged at 100%, only one VCPU is pegged at > 100% > - the boot process, as observed from the OVMF log, is just as slow > (= crawling) as without the VCPU pausing/resuming (i.e., like with > cpu_synchronize_all_states() only); so ultimately the > pausing/resuming doesn't help. I also tried this variant (bracketing only the sync call with pause/resume): static void ich9_apm_broadcast_smi(void) { CPUState *cs; pause_all_vcpus(); cpu_synchronize_all_states(); resume_all_vcpus(); CPU_FOREACH(cs) { X86CPU *cpu = X86_CPU(cs); CPUX86State *env = &cpu->env; if (env->smbase == 0x30000 && env->eip == 0xfff0) { CPUClass *k = CPU_GET_CLASS(cs); uint64_t cpu_arch_id = k->get_arch_id(cs); trace_ich9_apm_broadcast_smi_skip(cpu_arch_id); continue; } cpu_interrupt(cs, CPU_INTERRUPT_SMI); } } This behaves identically to the version where pause/resume bracket the entire function body (i.e., 1 VCPU pegged, super-slow boot progress). Laszlo