From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Michael S. Tsirkin" Subject: Re: [PATCHv0 dont apply] RFC: kvm eoi PV using shared memory Date: Tue, 10 Apr 2012 17:53:51 +0300 Message-ID: <20120410145351.GE19556@redhat.com> References: <20120410132756.GA14101@redhat.com> <4F843DAA.1000808@redhat.com> <20120410142616.GB19556@redhat.com> <4F8444AE.4020504@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: kvm@vger.kernel.org To: Avi Kivity Return-path: Received: from mx1.redhat.com ([209.132.183.28]:37877 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752751Ab2DJOxi (ORCPT ); Tue, 10 Apr 2012 10:53:38 -0400 Received: from int-mx12.intmail.prod.int.phx2.redhat.com (int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.25]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id q3AErcmY028752 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 10 Apr 2012 10:53:38 -0400 Content-Disposition: inline In-Reply-To: <4F8444AE.4020504@redhat.com> Sender: kvm-owner@vger.kernel.org List-ID: On Tue, Apr 10, 2012 at 05:33:18PM +0300, Avi Kivity wrote: > On 04/10/2012 05:26 PM, Michael S. Tsirkin wrote: > > > > u64 status; > > > > } osvw; > > > > + > > > > + struct { > > > > + u64 msr_val; > > > > + struct gfn_to_hva_cache data; > > > > + int vector; > > > > + } eoi; > > > > }; > > > > > > Needs to be cleared on INIT. > > > > You mean kvm_arch_vcpu_reset? > > Yes, or kvm_lapic_reset(). > > > > > > > > > > > > > @@ -307,6 +308,9 @@ void __cpuinit kvm_guest_cpu_init(void) > > > > smp_processor_id()); > > > > } > > > > > > > > + wrmsrl(MSR_KVM_EOI_EN, __pa(this_cpu_ptr(apic_eoi)) | > > > > + MSR_KVM_EOI_ENABLED); > > > > + > > > > > > Clear on kexec. > > > > With register_reboot_notifier? > > Yes, we already clear some kvm msrs there. > > > > > > > > > - apic_set_vector(vector, apic->regs + APIC_ISR); > > > > + if (eoi_enabled(vcpu)) { > > > > + /* Anything pending? If yes disable eoi optimization. */ > > > > + if (unlikely(apic_find_highest_isr(apic) >= 0)) { > > > > + int v = eoi_clr_pending_vector(vcpu); > > > > > > ISR != pending, that's IRR. If ISR vector has lower priority than the > > > new vector, then we don't need to disable eoi avoidance. > > > > Yes. But we can and it's easier than figuring out priorities. > > I am guessing such collisions are rare, right? > > It's pretty easy, if there is something in IRR but > kvm_lapic_has_interrupt() returns -1, then we need to disable eoi avoidance. I only see kvm_apic_has_interrupt - is this what you mean? > > I'll add a trace to make sure. > > > > > > + if (v != -1) > > > > + apic_set_vector(v, apic->regs + APIC_ISR); > > > > + } else { > > > > + eoi_set_pending_vector(vcpu, vector); > > > > + set_isr = false; > > > > > > Weird. Just set it normally. Remember that reading the ISR needs to > > > return the correct value. > > > > Marcelo said linux does not normally read ISR - not true? > > It's true and it's irrelevant. We aren't coding a feature to what linux > does now, but for what linux or another guest may do in the future. Right. So you think reading ISR has value in combination with PV EOI for future guests? I'm not arguing either way just curious. > > Note this has no effect if the PV optimization is not enabled. > > > > > We need to process the avoided EOI before any APIC read/writes, to be > > > sure the guest sees the updated values. Same for IOAPIC, EOI affects > > > remote_irr. That may been we need to sample it after every exit, or > > > perhaps disable the feature for level-triggered interrupts. > > > > Disabling would be very sad. Can we sample on remote irr read? > > That can be done from another vcpu. We still can handle it, right? Where's the code that handles that read? > Why do we care about > level-triggered interrupts? Everything uses MSI or edge-triggered > IOAPIC interrupts these days. Well lots of emulated devices don't yet. They probably should but it's nice to be able to test with e.g. e1000 emulation not just virtio. Besides, kvm_get_apic_interrupt simply doesn't know about the triggering mode at the moment. -- MST