kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: gleb@kernel.org, avi.kivity@gmail.com, pbonzini@redhat.com,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH 5/5] KVM: MMU: flush tlb out of mmu lock when write-protect the sptes
Date: Tue, 15 Apr 2014 19:11:24 +0800	[thread overview]
Message-ID: <534D13DC.7040400@linux.vnet.ibm.com> (raw)
In-Reply-To: <20140409145137.GA23449@amt.cnet>


Hi Marcelo,

Thanks your time to review it.

On 04/09/2014 10:51 PM, Marcelo Tosatti wrote:

>> +/*
>> + * Please note PT_WRITABLE_MASK is not stable since
>> + * 1) fast_page_fault() sets spte from readonly to writable out of mmu-lock or
>> + * 2) kvm_mmu_slot_remove_write_access() and kvm_vm_ioctl_get_dirty_log()
>> + *    can write protect sptes but flush tlb out mmu-lock that means we may use
>> + *    the corrupt tlb entries which depend on this bit.
>> + *
>> + * Both cases do not modify the status of  spte_is_locklessly_modifiable() so
>> + * if you want to check whether the spte is writable on MMU you can check
>> + * SPTE_MMU_WRITEABLE instead. If you want to update spte without losing
>> + * A/D bits and tlb flush, you can check spte_is_locklessly_modifiable()
>> + * instead. See the comments in spte_has_volatile_bits() and
>> + * mmu_spte_update().
>> + */
>>  static inline int is_writable_pte(unsigned long pte)
>>  {
> 
> Xiao,
> 
> Can't get the SPTE_MMU_WRITEABLE part.
> 
> So assume you are writing code to perform some action after guest memory
> has been write protected. You would
> 
> spin_lock(mmu_lock);
> 
> if (writeable spte bit is set)
>     remove writeable spte bit from spte
> remote TLB flush            (*)
> action
> 
> spin_unlock(mmu_lock);
> 
> (*) is necessary because reading the writeable spte bit as zero
> does not guarantee remote TLBs have been flushed.
> 
> Now what SPTE_MMU_WRITEABLE has to do with it ?

It is contained in "remove writeable spte bit from spte" which
is done by mmu_spte_update():

	if (spte_is_locklessly_modifiable(old_spte) &&
	      !is_writable_pte(new_spte))
		ret = true;
> 
> Perhaps a recipe like that (or just the rules) would be useful.

Okay, i am considering to improve this comments, how about like
this:

Currently, we have two sorts of write-protection, a) the first
one write-protects guest page to sync the guest modification,
b) another one is used to sync dirty bitmap when we do
KVM_GET_DIRTY_LOG. The differences between these two sorts are:
1) the first case clears SPTE_MMU_WRITEABLE bit.
2) the first case requires flushing tlb immediately avoiding
   corrupting shadow page table between all vcpus so it should
   be in the protection of mmu-lock. And the another case does
   not need to flush tlb until returning the dirty bitmap to
   userspace since it only write-protects the page logged in
   the bitmap, that means the page in the dirty bitmap is not
   missed, so it can flush tlb out of mmu-lock.

So, there is the problem: the first case can meet the corrupted
tlb caused by another case which write-protects pages but without
flush tlb immediately. In order to making the first case be aware
this problem we let it flush tlb if we try to write-protect
a spte whose SPTE_MMU_WRITEABLE bit is set, it works since another
case never touches SPTE_MMU_WRITEABLE bit.

Anyway, whenever a spte is updated (only permission and status bits
are changed) we need to check whether the spte with SPTE_MMU_WRITEABLE
becomes readonly, if that happens, we need to flush tlb. Fortunately,
mmu_spte_update() has already handled it perfectly.

The rules to use SPTE_MMU_WRITEABLE and PT_WRITABLE_MASK:
- if we want to see if it has writable tlb entry or if the spte can
  be writable on the mmu mapping, check SPTE_MMU_WRITEABLE, this is
  the most case, otherwise
- if we fix page fault on the spte or do write-protection by dirty logging,
  check PT_WRITABLE_MASK.

TODO: introduce APIs to split these two cases.

> 
> The remaining patches look good.

Thank you.

  reply	other threads:[~2014-04-15 11:11 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-10 14:01 [PATCH v4 0/5] KVM: x86: flush tlb out of mmu-lock after write protection Xiao Guangrong
2014-03-10 14:01 ` [PATCH 1/5] Revert "KVM: Simplify kvm->tlbs_dirty handling" Xiao Guangrong
2014-03-10 14:01 ` [PATCH 2/5] KVM: MMU: properly check last spte in fast_page_fault() Xiao Guangrong
2014-03-10 14:01 ` [PATCH 3/5] KVM: MMU: lazily drop large spte Xiao Guangrong
2014-03-10 14:01 ` [PATCH 4/5] KVM: MMU: flush tlb if the spte can be locklessly modified Xiao Guangrong
2014-03-10 14:01 ` [PATCH 5/5] KVM: MMU: flush tlb out of mmu lock when write-protect the sptes Xiao Guangrong
2014-04-09 14:51   ` Marcelo Tosatti
2014-04-15 11:11     ` Xiao Guangrong [this message]
2014-03-25  3:29 ` [PATCH v4 0/5] KVM: x86: flush tlb out of mmu-lock after write protection Xiao Guangrong
2014-03-25 10:22   ` Paolo Bonzini
2014-03-25 16:25 ` Hu Yaohui
2014-03-26  4:40   ` Hu Yaohui
2014-03-26  5:07     ` Xiao Guangrong
2014-03-26  5:30       ` Hu Yaohui
2014-03-26  4:54   ` Xiao Guangrong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=534D13DC.7040400@linux.vnet.ibm.com \
    --to=xiaoguangrong@linux.vnet.ibm.com \
    --cc=avi.kivity@gmail.com \
    --cc=gleb@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).