kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Xiao Guangrong <xiaoguangrong.eric@gmail.com>
Cc: Gleb Natapov <gleb@redhat.com>,
	avi.kivity@gmail.com,
	"pbonzini@redhat.com Bonzini" <pbonzini@redhat.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH 04/12] KVM: MMU: log dirty page after marking spte writable
Date: Wed, 20 Nov 2013 17:47:27 -0200	[thread overview]
Message-ID: <20131120194727.GA3353@amt.cnet> (raw)
In-Reply-To: <B166B522-2CDB-44CA-B594-687B45A1CF01@linux.vnet.ibm.com>

On Wed, Nov 20, 2013 at 10:20:09PM +0800, Xiao Guangrong wrote:
> > But what guarantee does userspace require, from GET_DIRTY_LOG, while vcpus are
> > executing? 
> 
> Aha. Single calling GET_DIRTY_LOG is useless since new dirty page can be generated
> when GET_DIRTY_LOG is being returned. If user wants to get exact dirty pages the vcpus
> should be stopped. 
> 
> > 
> > With fast page fault:
> > 
> >      if (cmpxchg64(sptep, spte, spte | PT_WRITABLE_MASK) == spte)
> >              /* The window in here... */
> >              mark_page_dirty(vcpu->kvm, gfn);
> > 
> > And the $SUBJECT set_spte reordering, the rule becomes
> > 
> > A call to GET_DIRTY_LOG guarantees to return correct information about 
> > dirty pages before invocation of the previous GET_DIRTY_LOG call.
> > 
> > (see example 1: the next GET_DIRTY_LOG will return the dirty information
> > there).
> > 
> 
> It seems no.
> 
> The first GET_DIRTY_LOG can happen before fast-page-fault,
> the second GET_DIRTY_LOG happens in the window between cmpxchg()
> and mark_page_dirty(), for the second one, the information is still “incorrect”.

Its correct for the previous GET_DIRTY_LOG call.

> > The rule for sptes that is, because kvm_write_guest does not match the
> > documentation at all.
> 
> You mean the case of “kvm_write_guest” is valid (I do not know why it is)?
> Or anything else?
> 
> > 
> > So before example 1 and this patch, the rule (well for sptes at least) was
> > 
> > "Given a memory slot, return a bitmap containing any pages dirtied
> > since the last call to this ioctl.  Bit 0 is the first page in the
> > memory slot.  Ensure the entire structure is cleared to avoid padding
> > issues."
> > 
> > Can you explain why it is OK to relax this rule?
> 
> It’s because:
> 1) it doesn’t break current use cases, i.e. Live migration and FB-flushing.
> 2) the current code, like kvm_write_guest  has already broken the documentation
>    (the guest page has been written but missed in the dirty bitmap).
> 3) it’s needless to implement a exact get-dirty-pages since the dirty pages can
>    no be exactly got except stopping vcpus. 
> 
> So i think we'd document this case instead. No?

Lets figure out the requirements, then. I don't understand why
FB-flushing is safe (think kvm-autotest: one pixel off the entire
test fails).

Before fast page fault: Pages are either write protected or the
corresponding dirty bitmap bit is set. Any write faults to dirty logged
sptes while GET_DIRTY log is executing in the protected section are
allowed to instantiate writeable spte after GET_DIRTY log is finished.

After fast page fault: Pages can be writeable and the dirty bitmap not
set. Therefore data can be dirty before GET_DIRTY executes and still
fail to be returned in the bitmap.

Since this patchset does not introduce change in behaviour (fast pf
did), no reason to avoid merging this.

BTW, since GET_DIRTY log does not require to report concurrent (to
GET_DIRTY_LOG) write faults, it is not necessary to rewalk the spte
list, is it?

  reply	other threads:[~2013-11-20 19:47 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-30 13:01 [RFC PATCH 00/12] KVM: MMU: locklessly wirte-protect Xiao Guangrong
2013-07-30 13:01 ` [PATCH 01/12] KVM: MMU: remove unused parameter Xiao Guangrong
2013-08-29  7:22   ` Gleb Natapov
2013-07-30 13:02 ` [PATCH 02/12] KVM: MMU: properly check last spte in fast_page_fault() Xiao Guangrong
2013-07-30 13:02 ` [PATCH 03/12] KVM: MMU: lazily drop large spte Xiao Guangrong
2013-08-02 14:55   ` Marcelo Tosatti
2013-08-02 15:42     ` Xiao Guangrong
2013-08-02 20:27       ` Marcelo Tosatti
2013-08-02 22:56         ` Xiao Guangrong
2013-07-30 13:02 ` [PATCH 04/12] KVM: MMU: log dirty page after marking spte writable Xiao Guangrong
2013-07-30 13:26   ` Paolo Bonzini
2013-07-31  7:25     ` Xiao Guangrong
2013-08-07  1:48   ` Marcelo Tosatti
2013-08-07  4:06     ` Xiao Guangrong
2013-08-08 15:06       ` Marcelo Tosatti
2013-08-08 16:26         ` Xiao Guangrong
2013-11-20  0:29       ` Marcelo Tosatti
2013-11-20  0:35         ` Marcelo Tosatti
2013-11-20 14:20         ` Xiao Guangrong
2013-11-20 19:47           ` Marcelo Tosatti [this message]
2013-11-21  4:26             ` Xiao Guangrong
2013-07-30 13:02 ` [PATCH 05/12] KVM: MMU: add spte into rmap before logging dirty page Xiao Guangrong
2013-07-30 13:27   ` Paolo Bonzini
2013-07-31  7:33     ` Xiao Guangrong
2013-07-30 13:02 ` [PATCH 06/12] KVM: MMU: flush tlb if the spte can be locklessly modified Xiao Guangrong
2013-08-28  7:23   ` Gleb Natapov
2013-08-28  7:50     ` Xiao Guangrong
2013-07-30 13:02 ` [PATCH 07/12] KVM: MMU: redesign the algorithm of pte_list Xiao Guangrong
2013-08-28  8:12   ` Gleb Natapov
2013-08-28  8:37     ` Xiao Guangrong
2013-08-28  8:58       ` Gleb Natapov
2013-08-28  9:19         ` Xiao Guangrong
2013-07-30 13:02 ` [PATCH 08/12] KVM: MMU: introduce nulls desc Xiao Guangrong
2013-08-28  8:40   ` Gleb Natapov
2013-08-28  8:54     ` Xiao Guangrong
2013-07-30 13:02 ` [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker Xiao Guangrong
2013-08-28  9:20   ` Gleb Natapov
2013-08-28  9:33     ` Xiao Guangrong
2013-08-28  9:46       ` Gleb Natapov
2013-08-28 10:13         ` Xiao Guangrong
2013-08-28 10:49           ` Gleb Natapov
2013-08-28 12:15             ` Xiao Guangrong
2013-08-28 13:36               ` Gleb Natapov
2013-08-29  6:50                 ` Xiao Guangrong
2013-08-29  9:08                   ` Gleb Natapov
2013-08-29  9:31                     ` Xiao Guangrong
2013-08-29  9:51                       ` Gleb Natapov
2013-08-29 11:26                         ` Xiao Guangrong
2013-08-30 11:38                           ` Gleb Natapov
2013-09-02  7:02                             ` Xiao Guangrong
2013-08-29  9:31                   ` Gleb Natapov
2013-08-29 11:33                     ` Xiao Guangrong
2013-08-29 12:02                       ` Xiao Guangrong
2013-08-30 11:44                         ` Gleb Natapov
2013-09-02  8:50                           ` Xiao Guangrong
2013-07-30 13:02 ` [PATCH 10/12] KVM: MMU: allow locklessly access shadow page table out of vcpu thread Xiao Guangrong
2013-08-07 13:09   ` Takuya Yoshikawa
2013-08-07 13:19     ` Xiao Guangrong
2013-08-29  9:10   ` Gleb Natapov
2013-08-29  9:25     ` Xiao Guangrong
2013-07-30 13:02 ` [PATCH 11/12] KVM: MMU: locklessly write-protect the page Xiao Guangrong
2013-07-30 13:02 ` [PATCH 12/12] KVM: MMU: clean up spte_write_protect Xiao Guangrong
2013-07-30 13:11 ` [RFC PATCH 00/12] KVM: MMU: locklessly wirte-protect Xiao Guangrong
2013-08-03  5:09 ` Takuya Yoshikawa
2013-08-04 14:15   ` Xiao Guangrong
2013-08-29  7:16   ` Gleb Natapov
2013-08-06 13:16 ` Xiao Guangrong
2013-08-08 17:38   ` Paolo Bonzini
2013-08-09  4:51     ` Xiao Guangrong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131120194727.GA3353@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=avi.kivity@gmail.com \
    --cc=gleb@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=xiaoguangrong.eric@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).