From mboxrd@z Thu Jan  1 00:00:00 1970
From: Gleb Natapov
Subject: Re: [patch 3/4] KVM: MMU: reload request from GET_DIRTY_LOG path
Date: Wed, 15 Oct 2014 11:03:54 +0300
Message-ID: <20141015080353.GX26540@minantech.com>
References: <20140721131424.GZ18167@minantech.com>
 <20140909152811.GA4153@amt.cnet>
 <20141004072332.GS26540@minantech.com>
 <20141006171932.GA1011@amt.cnet>
 <20141008065636.GU26540@minantech.com>
 <20141008171534.GA2651@amt.cnet>
 <20141008175937.GV26540@minantech.com>
 <20141008192231.GA5866@amt.cnet>
 <20141010130928.GW26540@minantech.com>
 <20141013085238.GB6957@amt.cnet>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm@vger.kernel.org, ak@linux.intel.com, pbonzini@redhat.com,
 xiaoguangrong@linux.vnet.ibm.com, avi.kivity@gmail.com
To: Marcelo Tosatti
Return-path:
Received: from mail-wi0-f172.google.com ([209.85.212.172]:40086 "EHLO
 mail-wi0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1750775AbaJOIEN (ORCPT ); Wed, 15 Oct 2014 04:04:13 -0400
Received: by mail-wi0-f172.google.com with SMTP id n3so12176031wiv.5 for ;
 Wed, 15 Oct 2014 01:04:12 -0700 (PDT)
Content-Disposition: inline
In-Reply-To: <20141013085238.GB6957@amt.cnet>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Mon, Oct 13, 2014 at 05:52:38AM -0300, Marcelo Tosatti wrote:
> On Fri, Oct 10, 2014 at 04:09:29PM +0300, Gleb Natapov wrote:
> > On Wed, Oct 08, 2014 at 04:22:31PM -0300, Marcelo Tosatti wrote:
> > > > >
> > > > > Argh, lets try again:
> > > > >
> > > > > skip_pinned = true
> > > > > ------------------
> > > > >
> > > > > mark page dirty, keep spte intact
> > > > >
> > > > > called from get dirty log path.
> > > > >
> > > > > skip_pinned = false
> > > > > -------------------
> > > > > reload remote mmu
> > > > > destroy pinned spte.
> > > > >
> > > > > called from: dirty log enablement, rmap write protect (unused for pinned
> > > > > sptes)
> > > > >
> > > > >
> > > > > Note this behaviour is your suggestion:
> > > >
> > > > Yes, I remember that and I thought we will not need this skip_pinned
> > > > at all. For rmap write protect case there shouldn't be any pinned pages,
> > > > but why dirty log enablement sets skip_pinned to false? Why not mark
> > > > pinned pages as dirty just like you do in get dirty log path?
> > >
> > > Because if its a large spte, it must be nuked (or marked read-only,
> > > which for pinned sptes, is not possible).
> > >
> > If a large page has one small page pinned inside it its spte will
> > be marked as pinned, correct?
>
> Correct.
>
> > We did nuke large ptes here until very
> > recently: c126d94f2c90ed9d, but we cannot drop a pte here anyway without
> > kicking all vcpu from a guest mode, but do you need additional skip_pinned
> > parameter? Why not check if spte is large instead?
>
> Nuke only if large spte is found? Can do that, instead.
>
> > So why not have per slot pinned page list (Xiao suggested the same) and do:
>
> The interface is per-vcpu (that is registration of pinned pages is
> performed on a per-vcpu basis).
>
PEBS is per cpu, but it does not mean that pinning should be per cpu, it
can be done globally with ref counting.

> > spte_write_protect() {
> >   if (is_pinned(spte) {
> >     if (large(spte))
> >        // cannot drop while vcpu are running
> >        mmu_reload_pinned_vcpus();
> >     else
> >        return false;
> > }
> >
> >
> > get_dirty_log() {
> >  for_each(pinned pages i)
> >     makr_dirty(i);
> > }
>
> That is effectively the same this patchset does, except that the spte
> pinned bit is checked at spte_write_protect, instead of looping over
> page pinned list. Fail to see huge advantage there.
>
I think spte_write_protect is a strange place to mark pages dirty, but
otherwise yes the effect is the same, so definitely not a huge difference.
If global pinned list is a PITA in your opinion leave it as is.

> I'll drop the skip_pinned parameter and use is_large_pte check instead.
>
Thanks,

--
			Gleb.
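
For readers following the thread, the outcome agreed above (no skip_pinned
parameter, with the large-spte check deciding what happens to a pinned spte)
can be sketched roughly as below. This is only a toy model under stated
assumptions, not the patch itself: the flag bits, struct toy_mmu and
toy_spte_write_protect() are hypothetical names invented for illustration and
do not match the kernel's real spte layout or helpers. It also keeps the dirty
marking inside the write-protect helper, as the patchset does, rather than
walking a pinned-page list.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for a toy spte; NOT the kernel's real layout. */
#define TOY_SPTE_WRITABLE (1u << 0)
#define TOY_SPTE_LARGE    (1u << 1)
#define TOY_SPTE_PINNED   (1u << 2)

/* Hypothetical bookkeeping that stands in for the real MMU state. */
struct toy_mmu {
    bool reload_requested;  /* models "reload vcpus that have pinned pages" */
    unsigned dirty_marks;   /* number of pages reported dirty */
};

/*
 * Write-protect one toy spte without a skip_pinned argument.
 * Returns true if the spte was modified (a TLB flush would be needed).
 */
static bool toy_spte_write_protect(struct toy_mmu *mmu, uint32_t *sptep)
{
    uint32_t spte = *sptep;

    if (!(spte & TOY_SPTE_WRITABLE))
        return false;               /* already read-only, nothing to do */

    if (spte & TOY_SPTE_PINNED) {
        if (spte & TOY_SPTE_LARGE) {
            /* Cannot drop while vcpus are running: request a reload,
             * then destroy the large pinned spte. */
            mmu->reload_requested = true;
            *sptep = 0;
            return true;
        }
        /* Small pinned spte: keep it intact and report the page dirty. */
        mmu->dirty_marks++;
        return false;
    }

    /* Ordinary spte: clear the writable bit. */
    *sptep = spte & ~TOY_SPTE_WRITABLE;
    return true;
}
```

In this model a small pinned spte is never modified, so the caller only needs
a flush (and a vcpu reload) for the large pinned case and for ordinary sptes.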