Date: Thu, 24 Mar 2016 08:52:56 +0800
From: Wei Yang
Subject: Re: [Qemu-devel] [RFC Design Doc] Speed up live migration by skipping free pages
Message-ID: <20160324005256.GA14956@linux-gk3p>
References: <1458632629-4649-1-git-send-email-liang.z.li@intel.com>
 <20160323013715.GB13750@linux-gk3p>
 <20160323094643.GA18660@linux-gk3p>
To: "Li, Liang Z"
Cc: "rkagan@virtuozzo.com", "linux-kernel@vger.kernel.org", "ehabkost@redhat.com",
 "kvm@vger.kernel.org", "mst@redhat.com", "simhan@hpe.com", "quintela@redhat.com",
 "qemu-devel@nongnu.org", "dgilbert@redhat.com", "jitendra.kolhe@hpe.com",
 "mohan_parthasarathy@hpe.com", "amit.shah@redhat.com", "pbonzini@redhat.com",
 Wei Yang, "rth@twiddle.net"

On Wed, Mar 23, 2016 at 02:35:42PM +0000, Li, Liang Z wrote:
>> >No special purpose. Maybe it's caused by the email client. I didn't
>> >find the character in the original doc.
>> >
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg00715.html
>>
>> You could take a look at this link; there is a '>' before From.
>
>Yes, there is.
>
>> >> >
>> >> >6. Handling page cache in the guest
>> >> >The memory used for page cache in the guest will change depending on
>> >> >the workload; if the guest runs some block-I/O-intensive workload, there
>> >> >will
>> >>
>> >> Would this improvement still benefit a lot when the guest has only a
>> >> few free pages?
>> >
>> >Yes, the improvement is very obvious.
>> >
>>
>> Good to know this.
>>
>> >> In your performance data, Case 2, I think, mimics this kind of case,
>> >> though the memory-consuming task is stopped before migration. If it
>> >> continues, would we still perform better than before?
>> >
>> >Actually, my RFC patch didn't consider the page cache; Roman raised this
>> >issue, so I added this part in this doc.
>> >
>> >Case 2 didn't mimic this kind of scenario. The workload is a
>> >memory-consuming workload, not a block-I/O-intensive workload, so there
>> >is not much page cache in this case.
>> >
>> >If the workload in Case 2 continues, as long as it does not write all
>> >the memory it allocates, we can still get benefits.
>> >
>>
>> It sounds like I have little knowledge of the page cache and its
>> relationship with free pages and I/O-intensive work.
>>
>> Here is some personal understanding; I would appreciate it if you could
>> correct me.
>>
>> +---------+
>> |PageCache|
>> +---------+
>> +---------+---------+---------+---------+
>> |Page     |Page     |Free Page|Page     |
>> +---------+---------+---------+---------+
>>
>> A Free Page is a page in the free_list; is PageCache some page cached in
>> the CPU's cache line?
>
>No, the page cache is quite different from a CPU cache line.
>"In computing, a page cache, sometimes also called disk cache, is a
>transparent cache for the pages originating from a secondary storage
>device such as a hard disk drive (HDD).
>The operating system keeps a page cache in otherwise unused portions of the
>main memory (RAM), resulting in quicker access to the contents of cached
>pages and overall performance improvements."
>You can refer to https://en.wikipedia.org/wiki/Page_cache for more details.
>

My poor knowledge~ I should have googled it before imagining the meaning of
the terminology.

If my understanding is correct, the page cache is counted as free pages,
while actually we should migrate those pages instead of filtering them out.

>
>> When a memory-consuming task runs, it leads to few free pages in the
>> whole system. What's the consequence when an I/O-intensive workload runs?
>> I guess it still leads to few free pages. And will there be some problem
>> in syncing the page cache?
>>
>> >>
>> >> I am thinking, is it possible to have a threshold, or a configurable
>> >> threshold, to decide whether to utilize the free page bitmap
>> >> optimization?
>> >>
>> >
>> >Could you elaborate your idea? How does it work?
>> >
>>
>> Let's go back to Case 2. We run a memory-consuming task which leads to
>> few free pages in the whole system, which means, from QEMU's perspective,
>> little of the dirty_memory is filtered out by the free page list. My
>> original question was whether your solution benefits in this scenario.
>> As you mentioned, it works fine, so maybe this threshold is not necessary.
>>
>I didn't quite understand your question before.
>The benefit we get depends on the number of free pages we can filter out.
>This is always true.
>
>> My original idea is that in QEMU we can calculate the percentage of free
>> pages in the whole system. If it finds there is only a small percentage
>> of free pages, then we don't need to bother to use this method.
>>
>
>I got you. The threshold can be used for optimization, but the effect is
>very limited. If there are only a few free pages, the process of
>constructing the free page bitmap is very quick.
>But we can stop doing the following things, e.g.
>sending the free page bitmap and doing the bitmap operation; theoretically,
>that may help to save some time, maybe several ms.
>

Ha, you got what I mean.

>I think a VM that has no free pages at all is very rare; in the worst case,
>there are still several MB of free pages. The proper threshold should be
>determined by comparing the extra time spent on processing the free page
>bitmap with the time spent on sending the several MB of free pages through
>the network. If the former is longer, we can stop using this method. So we
>should take the network bandwidth into consideration; it's too complicated
>and not worth doing.
>

Yes, after some thinking, it may not be that easy or worthwhile to do this
optimization.

>Thanks
>
>Liang
>
>> Have a nice day~
>>
>> >Liang
>> >
>>
>> >>
>> >> --
>> >> Richard Yang\nHelp you, Help me
>>
>> --
>> Richard Yang\nHelp you, Help me

--
Richard Yang\nHelp you, Help me
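[Editor's note] The trade-off discussed in the thread above (extra time spent
building and sending the free page bitmap vs. transfer time saved by not
sending free pages over the network) can be sketched roughly as below. This is
an illustrative back-of-the-envelope model only; the function name, parameters,
and numbers are assumptions, not anything from QEMU or the RFC patch.

```python
# Sketch of the threshold comparison Liang describes: filtering free pages
# pays off only when the transfer time avoided exceeds the bitmap overhead.
# All names and numbers are illustrative, not QEMU API.

def filtering_pays_off(free_bytes, bandwidth_bytes_per_s, bitmap_overhead_s):
    """Return True if skipping free pages saves net migration time."""
    # Time we would have spent sending the free pages over the wire.
    time_saved_s = free_bytes / bandwidth_bytes_per_s
    return time_saved_s > bitmap_overhead_s

# Example: 4 MB of free pages on a ~1.25 GB/s (10 Gb/s) link takes ~3.1 ms
# to send, so with ~5 ms of bitmap overhead, filtering would not pay off.
print(filtering_pays_off(4 * 1024**2, 1.25 * 1024**3, 0.005))
```

This matches the thread's conclusion: with only a few MB of free pages the
savings are on the order of milliseconds, so the threshold also depends on
network bandwidth and is likely not worth the added complexity.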