From: Gleb Natapov <gleb@redhat.com>
To: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	avi.kivity@gmail.com, pbonzini@redhat.com,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Anthony Liguori <anthony@codemonkey.ws>
Subject: Re: [PATCH v6 3/7] KVM: MMU: fast invalidate all pages
Date: Wed, 22 May 2013 16:17:20 +0300
Message-ID: <20130522131720.GO14287@redhat.com>
In-Reply-To: <519C92B6.1020405@linux.vnet.ibm.com>

On Wed, May 22, 2013 at 05:41:10PM +0800, Xiao Guangrong wrote:
> On 05/22/2013 04:54 PM, Gleb Natapov wrote:
> > On Wed, May 22, 2013 at 04:46:04PM +0800, Xiao Guangrong wrote:
> >> On 05/22/2013 02:34 PM, Gleb Natapov wrote:
> >>> On Tue, May 21, 2013 at 10:33:30PM -0300, Marcelo Tosatti wrote:
> >>>> On Tue, May 21, 2013 at 11:39:03AM +0300, Gleb Natapov wrote:
> >>>>>> Any pages with stale information will be zapped by kvm_mmu_zap_all().
> >>>>>> When that happens, page faults will take place which will automatically 
> >>>>>> use the new generation number.
> >>>>>>
> >>>>>> So it is still not clear why this is necessary.
> >>>>>>
> >>>>> This is not, strictly speaking, necessary, but it is the sane thing to do.
> >>>>> You cannot update a page's generation number to prevent it from being
> >>>>> destroyed, since after kvm_mmu_zap_all() completes, stale ptes in the
> >>>>> shadow page may point to a now-deleted memslot. So why build a shadow page
> >>>>> table with a page that is in the process of being destroyed?
> >>>>
> >>>> OK, can this be introduced separately, in a later patch, with separate
> >>>> justification, then?
> >>>>
> >>>> Xiao please have the first patches of the patchset focus on the problem
> >>>> at hand: fix long mmu_lock hold times.
> >>>>
> >>>>> Not sure what you mean again. We flush TLB once before entering this function.
> >>>>> kvm_reload_remote_mmus() does this for us, no?
> >>>>
> >>>> kvm_reload_remote_mmus() is used as an optimization; it's separate from the
> >>>> problem solution.
> >>>>
> >>>>>>
> >>>>>> What was suggested was... go to phrase which starts with "The only purpose
> >>>>>> of the generation number should be to".
> >>>>>>
> >>>>>> The comment quoted here does not match that description.
> >>>>>>
> >>>>> The comment describes what code does and in this it is correct.
> >>>>>
> >>>>> You propose to not reload roots right away and do it only when root sp
> >>>>> is encountered, right? So my question is what's the point? There are,
> >>>>> obviously, root sps with invalid generation number at this point, so
> >>>>> reload will happen regardless in kvm_mmu_prepare_zap_page(). So why not
> >>>>> do it here right away and avoid it in kvm_mmu_prepare_zap_page() for
> >>>>> invalid and obsolete sps as I proposed in one of my emails?
> >>>>
> >>>> Sure. But Xiao please introduce that TLB collapsing optimization as a
> >>>> later patch, so we can reason about it in a more organized fashion.
> >>>
> >>> So, if I understand correctly, you are asking to move is_obsolete_sp()
> >>> check from kvm_mmu_get_page() and kvm_reload_remote_mmus() from
> >>> kvm_mmu_invalidate_all_pages() to a separate patch. Fine by me, but if
> >>> we drop kvm_reload_remote_mmus() from kvm_mmu_invalidate_all_pages() the
> >>> call to kvm_mmu_invalidate_all_pages() in emulator_fix_hypercall() will
> >>> become a nop. But I question the need to zap all shadow page tables there
> >>> in the first place; why is kvm_flush_remote_tlbs() not enough?
> >>
> >> I do not know either... I do not even know why kvm_flush_remote_tlbs
> >> is needed. :(
> > We changed the content of an executable page, so we need to flush the
> > instruction cache of all vcpus to not use stale data, so my suggestion to call
> 
> I thought the reason is about icache too but icache is automatically
> flushed on x86, we only need to invalidate the prefetched instructions by
> executing a serializing operation.
> 
> See the SDM in the chapter of
> "8.1.3 Handling Self- and Cross-Modifying Code"
> 
Right, so we do cross-modifying code here and we need to make sure no
vcpu is running in guest mode while this happens, but
kvm_mmu_zap_all() does not provide this guarantee since vcpus will
continue running after reloading roots!
 
> > kvm_flush_remote_tlbs() is obviously incorrect since this flushes the TLB,
> > not the instruction cache, but why would kvm_reload_remote_mmus() flush
> > the instruction cache?
> 
> kvm_reload_remote_mmus() does not help here, I think.
> 
> I find that this change is introduced by commit: 7aa81cc0
> and I have added Anthony in the CC.
> 
> I also find some discussions related to calling
> kvm_reload_remote_mmus():
> 
> >
> > But if the instruction is architecture dependent, and you run on the
> > wrong architecture, now you have to patch many locations at fault time,
> > introducing some nasty runtime code / data cache overlap performance
> > problems.  Granted, they go away eventually.
> >
> 
> We're addressing that by blowing away the shadow cache and holding the
> big kvm lock to ensure SMP safety.  Not a great thing to do from a
> performance perspective but the whole point of patching is that the cost
> is amortized.
> 
> (http://kerneltrap.org/mailarchive/linux-kernel/2007/9/14/260288)
> 
> But I cannot understand...
Back then, kvm->lock protected memslot access, so code like:

mutex_lock(&vcpu->kvm->lock);
kvm_mmu_zap_all(vcpu->kvm);
mutex_unlock(&vcpu->kvm->lock);

which is what 7aa81cc0 does, was enough to guarantee that no vcpu would
run while the code is patched. This is no longer the case:
mutex_lock(&vcpu->kvm->lock); was removed from that code path a long time
ago, so now kvm_mmu_zap_all() there is useless and the code is incorrect.
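
To make the argument concrete, here is a minimal single-threaded toy
model in C (userspace, not kernel code). It is only a sketch of one
reading of the reasoning above: it assumes that in the old code the path
a faulting vcpu takes to rebuild its shadow pages also needed kvm->lock,
and that the patching itself happened inside the locked region. All
toy_* names are hypothetical and do not exist in KVM.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in KVM. */
struct toy_vm {
	bool lock_held;             /* stands in for kvm->lock */
	bool fault_path_takes_lock; /* assumption: true models the old code */
	bool shadow_zapped;         /* set by the toy "zap all" */
};

/*
 * One scheduling step of another vcpu. Returns true if the vcpu managed
 * to execute guest code during this step.
 */
static bool toy_vcpu_step(struct toy_vm *vm)
{
	if (vm->shadow_zapped) {
		/* The vcpu faults and must rebuild its shadow page tables. */
		if (vm->fault_path_takes_lock && vm->lock_held)
			return false;   /* a real vcpu would sleep on the lock */
		vm->shadow_zapped = false;
	}
	return true;                    /* back in guest mode */
}

/*
 * Patch guest code while another vcpu gets `steps` chances to run.
 * Returns how many of those steps executed guest code inside the
 * patch window.
 */
static int toy_patch_hypercall(struct toy_vm *vm, int steps)
{
	int ran_in_window = 0;

	assert(!vm->lock_held);
	vm->lock_held = true;           /* mutex_lock(&vcpu->kvm->lock) */
	vm->shadow_zapped = true;       /* kvm_mmu_zap_all(vcpu->kvm) */

	for (int i = 0; i < steps; i++) /* patch window (assumed locked) */
		if (toy_vcpu_step(vm))
			ran_in_window++;

	vm->lock_held = false;          /* mutex_unlock(&vcpu->kvm->lock) */
	return ran_in_window;
}
```

With fault_path_takes_lock set, toy_patch_hypercall() reports zero guest
steps inside the patch window. With it cleared (modelling a fault path
that no longer takes the lock), the other vcpu reloads its roots and
keeps running while the code is patched, so the zap buys nothing.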

Let's drop kvm_mmu_zap_all() there (in a separate patch) and fix the
patching properly later.
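
For reference, the generation-number scheme debated in the quoted part
of this thread can be sketched as a small userspace toy in C. This is an
illustration under stated assumptions, not the patch itself: the toy_*
names and the fixed-size array are hypothetical, and only the idea that
each shadow page records the generation it was created in, and that an
is_obsolete_sp()-style check compares it with the current one, comes
from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PAGES 8

/* Toy stand-in for a shadow page; not the kernel's structure. */
struct toy_sp {
	unsigned long gfn;
	unsigned long valid_gen;    /* generation the page was created in */
	bool in_use;
};

struct toy_mmu {
	unsigned long valid_gen;    /* current generation of the VM */
	struct toy_sp pages[TOY_MAX_PAGES];
};

/* A live page is obsolete once its generation differs from the VM's. */
static bool toy_is_obsolete(const struct toy_mmu *mmu, const struct toy_sp *sp)
{
	assert(sp->in_use);
	return sp->valid_gen != mmu->valid_gen;
}

/* O(1) "invalidate all": bump the generation, zap pages lazily later. */
static void toy_invalidate_all(struct toy_mmu *mmu)
{
	mmu->valid_gen++;
}

/* Lookup that skips obsolete pages and creates a fresh one if needed. */
static struct toy_sp *toy_get_page(struct toy_mmu *mmu, unsigned long gfn)
{
	struct toy_sp *free_slot = NULL;

	for (size_t i = 0; i < TOY_MAX_PAGES; i++) {
		struct toy_sp *sp = &mmu->pages[i];

		if (!sp->in_use) {
			if (!free_slot)
				free_slot = sp;
			continue;
		}
		if (sp->gfn == gfn && !toy_is_obsolete(mmu, sp))
			return sp;
	}
	if (!free_slot)
		return NULL;            /* toy table is full */
	free_slot->gfn = gfn;
	free_slot->valid_gen = mmu->valid_gen;
	free_slot->in_use = true;
	return free_slot;
}

/* Deferred cleanup: free every obsolete page, return how many went away. */
static int toy_zap_obsolete(struct toy_mmu *mmu)
{
	int zapped = 0;

	for (size_t i = 0; i < TOY_MAX_PAGES; i++) {
		struct toy_sp *sp = &mmu->pages[i];

		if (sp->in_use && toy_is_obsolete(mmu, sp)) {
			sp->in_use = false;
			zapped++;
		}
	}
	return zapped;
}
```

In this toy, a lookup after toy_invalidate_all() never returns a page
from the old generation, which mirrors the point that there is little
sense in building new shadow page tables on top of a page that is about
to be destroyed.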

--
			Gleb.
