From: "Huang, Kai"
Subject: Re: [PATCH 3/6] KVM: Dirty memory tracking for performant checkpointing and improved live migration
Date: Wed, 4 May 2016 19:45:08 +1200
Message-ID: <32d8060e-648c-cf99-970a-3ddadc6a501a@linux.intel.com>
References: <201604261855.u3QItn85024244@dev1.sn.stratus.com> <33d8668e-2bba-af91-069e-6452609a6ff0@linux.intel.com> <20160429181911.GA2687@potion> <20160503141118.GA27975@potion>
In-Reply-To: <20160503141118.GA27975@potion>
To: Radim Krčmář
Cc: "Cao, Lei", Paolo Bonzini, "kvm@vger.kernel.org"

On 5/4/2016 2:11 AM, Radim Krčmář wrote:
> 2016-05-03 18:06+1200, Huang, Kai:
>> Actually my concern is, with your new mechanism to track guest dirty pages,
>> there will be two logdirty mechanisms (using bitmap and your per-vcpu list),
>> which I think is not good as it's a little bit redundant, given both
>> mechanisms are used for dirty page logging.
>>
>> I think your main concern of current bitmap mechanism is scanning bitmap
>> takes lots of time, especially when only few pages get dirty, you still have
>> to scan the entire bitmap, which results in bad performance if you runs
>> checkpoint very frequently. My suggestion is, instead of introducing two
>> logdirty data structures, maybe you can try to use another more efficient
>> data structure instead of bitmap for both current logdirty mechanism and
>> your new interfaces. Maybe Xen's log-dirty tree is a good reference.
>
> A sparse structure (buffer, tree, ...) also needs a mechanism to grow
> (store new entries), so concurrent accesses become a problem, because
> there has to be synchronization. I think that per-vcpu structure
> becomes mandatory when thousands VCPUs dirty memory at the same time.

Yes, synchronization will be needed. But even with a per-vcpu structure,
we still need a per-vcpu lock to access, say, gfn_list, right? For
example, a userspace thread trying to get and clear dirty pages would
need to loop over all vcpus and acquire each vcpu's lock for gfn_list
(see function mt_reset_all_gfns in patch 3/6). It looks like this is not
scalable either?

>
>> Maybe Xen's log-dirty tree is a good reference.
>
> Is there some top-level overview?
>
> From a glance at the code, it looked like GPA bitmap sparsified with
> radix tree in a manner similar to the page table hierarchy.

Yes, it is just a radix tree. The point is that the tree will be pretty
small if there are few dirty pages, so scanning it will be very quick
compared to scanning the bitmap.

>
>> Of course this is just my concern and I'll leave it to maintainers.
>
> I too would prefer if both userspace interfaces used a common backend.
> A possible backend for that is
>
>   vcpu -> memslot -> sparse dirty log

I think this is the most reasonable proposal, at least as a first step
to improve performance.

>
> We should have dynamic sparse dirty log, to avoid wasting memory when
> there are many small memslots, but a linear structure is probably still
> fine.

The sparse dirty log structure can be allocated when necessary, so I
don't think it will waste memory. Take a radix tree as an example: if
there's no dirty page in the slot, the pointer to the radix tree can be
NULL, or just a root entry.

>
> We don't care which vcpu dirtied the page, so it seems like a waste to
> have them in the hierarchy, but I can't think of designs where the
> sparse dirty log is rooted in memslot and its updates scale well.
>
> 'memslot -> sparse dirty log' usually evolve into buffering on the VCPU
> side before writing to the memslot or aren't efficient for sparse
> dataset.
>
> Where do you think is the balance between 'memslot -> bitmap' and
> 'vcpu -> memslot -> dirty buffer'?

In my opinion, we can first try 'memslot -> sparse dirty log'. Cao, Lei
mentioned that there were two bottlenecks: the bitmap, and bad
multithreaded performance due to mmu_lock. I think
'memslot -> sparse dirty log' might help to improve or solve the bitmap
one.

>
> Thanks.
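
To make the 'memslot -> sparse dirty log' idea concrete, here is a
minimal userspace sketch, assuming a two-level radix-style layout with
lazily allocated leaves. It is not code from the patch series, from KVM,
or from Xen; every name in it (sparse_dirty_log, sdl_mark_dirty,
sdl_get_and_clear, the 512-entry fan-out) is hypothetical. It is
single-threaded and takes no locks, so it says nothing about the
synchronization problem raised above; it only illustrates that a clean
slot costs one NULL pointer and that collecting dirty pages touches only
populated leaves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sizes, chosen only for illustration. */
#define LEAF_BITS  512u                 /* gfn offsets covered by one leaf */
#define LEAF_WORDS (LEAF_BITS / 64u)
#define ROOT_SLOTS 512u                 /* leaf pointers held by the root */
#define MAX_GFNS   ((uint64_t)LEAF_BITS * ROOT_SLOTS)

struct dirty_leaf {
	uint64_t bits[LEAF_WORDS];
};

/* One log per memslot; root stays NULL while the slot has no dirty page. */
struct sparse_dirty_log {
	struct dirty_leaf **root;
	size_t nr_leaves;
};

/* Mark one gfn offset dirty. Returns 0 on success, -1 on bad offset or OOM. */
static int sdl_mark_dirty(struct sparse_dirty_log *log, uint64_t gfn_off)
{
	assert(log != NULL);
	if (gfn_off >= MAX_GFNS)
		return -1;

	if (!log->root) {
		log->root = calloc(ROOT_SLOTS, sizeof(*log->root));
		if (!log->root)
			return -1;
	}

	size_t idx = (size_t)(gfn_off / LEAF_BITS);
	if (!log->root[idx]) {
		log->root[idx] = calloc(1, sizeof(struct dirty_leaf));
		if (!log->root[idx])
			return -1;
		log->nr_leaves++;
	}

	unsigned int bit = (unsigned int)(gfn_off % LEAF_BITS);
	log->root[idx]->bits[bit / 64] |= UINT64_C(1) << (bit % 64);
	return 0;
}

/*
 * Copy up to max dirty offsets into out (ascending order), clearing each
 * one as it is reported. Fully drained leaves and the root are freed.
 * If out fills up, the remaining dirty offsets stay in the log.
 */
static size_t sdl_get_and_clear(struct sparse_dirty_log *log,
				uint64_t *out, size_t max)
{
	size_t n = 0;

	assert(log != NULL);
	if (!log->root)
		return 0;		/* clean slot: nothing to scan */

	for (size_t i = 0; i < ROOT_SLOTS; i++) {
		struct dirty_leaf *leaf = log->root[i];

		if (!leaf)
			continue;	/* skip ranges never dirtied */

		for (unsigned int b = 0; b < LEAF_BITS; b++) {
			uint64_t mask = UINT64_C(1) << (b % 64);

			if (!(leaf->bits[b / 64] & mask))
				continue;
			if (n == max)
				return n;	/* buffer full */
			out[n++] = (uint64_t)i * LEAF_BITS + b;
			leaf->bits[b / 64] &= ~mask;
		}

		free(leaf);
		log->root[i] = NULL;
		log->nr_leaves--;
	}

	free(log->root);
	log->root = NULL;
	return n;
}
```

With a zero-initialized struct sparse_dirty_log, marking offsets 3 and
70000 allocates the root plus two leaves, and a following
sdl_get_and_clear() returns those two offsets and leaves root NULL
again. A real implementation would also need the locking (or per-vcpu
buffering) discussed in this thread.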