Re: [PATCH] KVM: x86/mmu: fix potential races when walking host page table

public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Mingwei Zhang <mizhang@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] KVM: x86/mmu: fix potential races when walking host page table
Date: Fri, 29 Apr 2022 15:33:09 +0000	[thread overview]
Message-ID: <YmwFNbYy4tlxOJRG@google.com> (raw)
In-Reply-To: <4b0936bf-fd3e-950a-81af-fd393475553f@redhat.com>

On Fri, Apr 29, 2022, Paolo Bonzini wrote:
> On 4/29/22 05:17, Mingwei Zhang wrote:
> > +	ptep = pte_offset_map(&pmd, address);
> > +	pte = ptep_get(ptep);
> > +	if (pte_present(pte)) {
> > +		pte_unmap(ptep);
> > +		level = PG_LEVEL_4K;
> > +		goto out;
> > +	}
> > +	pte_unmap(ptep);
> 
> Not needed as long as PG_LEVEL_4K is returned for a non-present PTE.
> 
> > +out:
> > +	local_irq_restore(flags);
> > +	return level;
> > +}
> > +EXPORT_SYMBOL_GPL(kvm_lookup_address_level_in_mm);
> 
> Exporting is not needed.
> 
> Thanks for writing the walk code though.  I'll adapt it and integrate the
> patch.

Can you also remove the intermediate pointers?  They are at best a wash in terms
of readability (net negative IMO), and introduce unnecessary risk by opening up
the possibility of using the wrong variable and/or re-reading an entry.

Case in point, this code subtly re-reads the upper level entries when getting
the next level down, which can run afould of huge page promotion and use the
wrong address (huge pfn instead of page table address).

The alternative is to use the p*d_offset_lockless() helper, but IMO that's
unnecessary and pointless because it just does the same thing under the hood.

E.g. (compile tested only at this point)

static int host_pfn_mapping_level(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
				  const struct kvm_memory_slot *slot)
{
	unsigned long flags;
	unsigned long hva;
	int level;
	pgd_t pgd;
	p4d_t p4d;
	pud_t pud;
	pmd_t pmd;

	if (!PageCompound(pfn_to_page(pfn)) && !kvm_is_zone_device_pfn(pfn))
		return PG_LEVEL_4K;

	/*
	 * Note, using the already-retrieved memslot and __gfn_to_hva_memslot()
	 * is not solely for performance, it's also necessary to avoid the
	 * "writable" check in __gfn_to_hva_many(), which will always fail on
	 * read-only memslots due to gfn_to_hva() assuming writes.  Earlier
	 * page fault steps have already verified the guest isn't writing a
	 * read-only memslot.
	 */
	hva = __gfn_to_hva_memslot(slot, gfn);

	level = PG_LEVEL_4K;

	/*
	 * Disable IRQs to block TLB shootdown and thus prevent user page tables
	 * from being freed.
	 */
	local_irq_save(flags);

	pgd = READ_ONCE(*pgd_offset(kvm->mm, hva));
	if (pgd_none(pgd))
		goto out;

	p4d = READ_ONCE(*p4d_offset(&pgd, hva));
	if (p4d_none(p4d) || !p4d_present(p4d))
		goto out;

	if (p4d_large(p4d)) {
		level = PG_LEVEL_512G;
		goto out;
	}

	pud = READ_ONCE(*pud_offset(&p4d, hva));
	if (pud_none(pud) || !pud_present(pud))
		goto out;

	if (pud_large(pud)) {
		level = PG_LEVEL_1G;
		goto out;
	}

	pmd = READ_ONCE(*pmd_offset(&pud, hva));
	if (pmd_none(pmd) || !pmd_present(pmd))
		goto out;

	if (pmd_large(pmd))
		level = PG_LEVEL_2M;
out:
	local_irq_restore(flags);
	return level;
}

     prev parent reply	other threads:[~2022-04-29 15:33 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-29  3:17 [PATCH] KVM: x86/mmu: fix potential races when walking host page table Mingwei Zhang
2022-04-29  9:10 ` Paolo Bonzini
2022-04-29 14:35   ` Sean Christopherson
2022-04-29 14:45     ` Paolo Bonzini
2022-04-29 15:47       ` Sean Christopherson
2022-04-29 15:33   ` Sean Christopherson [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YmwFNbYy4tlxOJRG@google.com \
    --to=seanjc@google.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mizhang@google.com \
    --cc=pbonzini@redhat.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox