From: Xiao Guangrong
Subject: Re: [PATCH v2] kvm: x86: fix stale mmio cache bug
Date: Tue, 05 Aug 2014 11:36:15 +0800
Message-ID: <53E0512F.2020309@linux.vnet.ibm.com>
In-Reply-To: <1407186620-1999-1-git-send-email-dmatlack@google.com>
References: <1407186620-1999-1-git-send-email-dmatlack@google.com>
To: David Matlack, Gleb Natapov, Paolo Bonzini, kvm@vger.kernel.org, x86@kernel.org
Cc: Eric Northup

On 08/05/2014 05:10 AM, David Matlack wrote:
> The following events can lead to an incorrect KVM_EXIT_MMIO bubbling
> up to userspace:
>
> (1) Guest accesses gpa X without a memory slot. The gfn is cached in
> struct kvm_vcpu_arch (mmio_gfn). On Intel EPT-enabled hosts, KVM sets
> the SPTE write-execute-noread so that future accesses cause
> EPT_MISCONFIGs.
>
> (2) Host userspace creates a memory slot via KVM_SET_USER_MEMORY_REGION
> covering the page just accessed.
>
> (3) Guest attempts to read or write to gpa X again. On Intel, this
> generates an EPT_MISCONFIG. The memory slot generation number that
> was incremented in (2) would normally take care of this but we fast
> path mmio faults through quickly_check_mmio_pf(), which only checks
> the per-vcpu mmio cache. Since we hit the cache, KVM passes a
> KVM_EXIT_MMIO up to userspace.
>
> This patch fixes the issue by doing the following:
>   - Tag the mmio cache with the memslot generation and use it to
>     validate mmio cache lookups.
>   - Extend vcpu_clear_mmio_info to clear mmio_gfn in addition to
>     mmio_gva, since both can be used to fast path mmio faults.
>   - In mmu_sync_roots, unconditionally clear the mmio cache since
>     even direct_map (e.g. tdp) hosts use it.

It's not needed.

Direct map only uses the gpa (it never caches a gva), and
vcpu_clear_mmio_info only clears the gva.

> +static inline void vcpu_cache_mmio_info(struct kvm_vcpu *vcpu,
> +					gva_t gva, gfn_t gfn, unsigned access)
> +{
> +	vcpu->arch.mmio_gen = kvm_current_mmio_generation(vcpu->kvm);
> +
> +	/*
> +	 * Ensure that the mmio_gen is set before the rest of the cache entry.
> +	 * Otherwise we might see a new generation number attached to an old
> +	 * cache entry if creating/deleting a memslot races with mmio caching.
> +	 * The inverse case is possible (old generation number with new cache
> +	 * info), but that is safe. The next access will just miss the cache
> +	 * when it should have hit.
> +	 */
> +	smp_wmb();

The memory barrier can't help us. Consider this scenario:

CPU 0                                  CPU 1
page-fault
see gpa is not mapped in memslot

                                       create new memslot containing gpa
                                       (from Qemu)
                                       update the slots' generation number
cache mmio info

!!! Later, when the vcpu accesses gpa again,
    it will cause an mmio-exit.

The easy way to fix this is to update the slots' generation number
after synchronize_srcu_expedited when a memslot is being updated. That
ensures other sides can see the new generation number only after the
update has finished.
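
To make the interleaving above concrete, here is a minimal userspace
sketch of it. This is only a model, not kernel code: struct memslots,
struct mmio_cache, and every function name below are hypothetical, and
the two "CPUs" are replayed sequentially in the order shown in the
scenario. The second function models the suggested fix under the
assumption that waiting for readers (what synchronize_srcu_expedited
would do) lets CPU 0 finish caching before the generation is bumped.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the memslot array seen by all vcpus. */
struct memslots {
	uint64_t generation;	/* bumped when the slot layout changes */
	bool gpa_has_slot;	/* does a memslot cover the gpa in question? */
};

/* Hypothetical model of the per-vcpu mmio cache entry. */
struct mmio_cache {
	uint64_t gen;		/* generation the entry was tagged with */
	uint64_t gfn;
	bool valid;
};

static void publish_slot(struct memslots *s)    { s->gpa_has_slot = true; }
static void bump_generation(struct memslots *s) { s->generation++; }

/* Tag the cache entry with the generation that is current right now. */
static void cache_mmio(struct mmio_cache *c, const struct memslots *s,
		       uint64_t gfn)
{
	c->gen = s->generation;
	c->gfn = gfn;
	c->valid = true;
}

/* A lookup only hits if the tag matches the current generation. */
static bool cache_hit(const struct mmio_cache *c, const struct memslots *s,
		      uint64_t gfn)
{
	return c->valid && c->gfn == gfn && c->gen == s->generation;
}

/* Returns true if a later access wrongly hits the mmio cache. */
bool racy_order(void)
{
	struct memslots slots = { .generation = 1, .gpa_has_slot = false };
	struct mmio_cache cache = { 0 };
	const uint64_t gfn = 0x1234;

	bool mapped = slots.gpa_has_slot;	/* CPU 0: gpa not in a memslot */
	publish_slot(&slots);			/* CPU 1: create new memslot */
	bump_generation(&slots);		/* CPU 1: update generation */
	if (!mapped)
		cache_mmio(&cache, &slots, gfn); /* CPU 0: old decision, new tag */

	/* Later access: the tag matches, although a slot now covers the gpa. */
	return cache_hit(&cache, &slots, gfn) && slots.gpa_has_slot;
}

/* Same race, but the generation is bumped only after readers are done. */
bool deferred_bump_order(void)
{
	struct memslots slots = { .generation = 1, .gpa_has_slot = false };
	struct mmio_cache cache = { 0 };
	const uint64_t gfn = 0x1234;

	bool mapped = slots.gpa_has_slot;	/* CPU 0: gpa not in a memslot */
	publish_slot(&slots);			/* CPU 1: create new memslot */
	if (!mapped)
		cache_mmio(&cache, &slots, gfn); /* CPU 0: tagged with old gen */
	/* CPU 1: the wait for readers is modeled as having completed here. */
	bump_generation(&slots);		/* CPU 1: now update generation */

	/* Later access: the tag is stale, so the lookup misses. */
	return cache_hit(&cache, &slots, gfn) && slots.gpa_has_slot;
}
```

In this model racy_order() reports a stale hit, and no write ordering
inside cache_mmio() changes that, because the memslot lookup happened
before the generation moved. deferred_bump_order() reports a miss,
since the entry cached from the old lookup carries the old generation.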