From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wei Yang
Subject: Re: [RFC Design Doc]Speed up live migration by skipping free pages
Date: Thu, 24 Mar 2016 08:52:56 +0800
Message-ID: <20160324005256.GA14956@linux-gk3p>
References: <1458632629-4649-1-git-send-email-liang.z.li@intel.com> <20160323013715.GB13750@linux-gk3p> <20160323094643.GA18660@linux-gk3p>
Reply-To: Wei Yang
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: Wei Yang , "qemu-devel@nongnu.org" , "kvm@vger.kernel.org" , "linux-kernel@vger.kenel.org" , "pbonzini@redhat.com" , "rth@twiddle.net" , "ehabkost@redhat.com" , "mst@redhat.com" , "amit.shah@redhat.com" , "quintela@redhat.com" , "dgilbert@redhat.com" , "mohan_parthasarathy@hpe.com" , "jitendra.kolhe@hpe.com" , "simhan@hpe.com" , "rkagan@virtuozzo.com" , "riel@redhat.com"
To: "Li, Liang Z"
Return-path:
Received: from szxga03-in.huawei.com ([119.145.14.66]:8492 "EHLO szxga03-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751157AbcCXAys convert rfc822-to-8bit (ORCPT ); Wed, 23 Mar 2016 20:54:48 -0400
Content-Disposition: inline
In-Reply-To:
Sender: kvm-owner@vger.kernel.org
List-ID:

On Wed, Mar 23, 2016 at 02:35:42PM +0000, Li, Liang Z wrote:
>> >No special purpose. Maybe it's caused by the email client. I didn't
>> >find the character in the original doc.
>> >
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg00715.html
>>
>> You could take a look at this link, there is a '>' before From.
>
>Yes, there is.
>
>> >> >
>> >> >6. Handling page cache in the guest
>> >> >The memory used for page cache in the guest will change depends on
>> >> >the workload, if guest run some block IO intensive work load, there
>> >> >will
>> >>
>> >> Would this improvement benefit a lot when guest only has little free page?
>> >
>> >Yes, the improvement is very obvious.
>> >
>>
>> Good to know this.
>>
>> >> In your Performance data Case 2, I think it mimic this kind of case.
>> >> While the memory consuming task is stopped before migration. If it
>> >> continues, would we still perform better than before?
>> >
>> >Actually, my RFC patch didn't consider the page cache, Roman raised this
>> >issue.
>> >so I add this part in this doc.
>> >
>> >Case 2 didn't mimic this kind of scenario, the work load is an memory
>> >consuming work load, not an block IO intensive work load, so there are
>> >not many page cache in this case.
>> >
>> >If the work load in case 2 continues, as long as it not write all the
>> >memory it allocates, we still can get benefits.
>> >
>>
>> Sounds I have little knowledge on page cache, and its relationship between
>> free page and I/O intensive work.
>>
>> Here is some personal understanding, I would appreciate if you could correct
>> me.
>>
>> +---------+
>> |PageCache|
>> +---------+
>> +---------+---------+---------+---------+
>> |Page     |Page     |Free Page|Page     |
>> +---------+---------+---------+---------+
>>
>> Free Page is a page in the free_list, PageCache is some page cached in CPU's
>> cache line?
>
>No, page cache is quite different with CPU cache line.
>" In computing, a page cache, sometimes also called disk cache,[2] is a transparent cache
> for the pages originating from a secondary storage device such as a hard disk drive (HDD).
> The operating system keeps a page cache in otherwise unused portions of the main
> memory (RAM), resulting in quicker access to the contents of cached pages and
>overall performance improvements "
>you can refer to https://en.wikipedia.org/wiki/Page_cache
>for more details.
>

My poor knowledge~ Should google it before I imagine the meaning of the
terminology.

If my understanding is correct, the Page Cache is counted as Free Page, while
actually we should migrate them instead of filter them.
>
>> When memory consuming task runs, it leads to little Free Page in the whole
>> system. What's the consequence when I/O intensive work runs? I guess it
>> still leads to little Free Page. And will have some problem in sync on
>> PageCache?
>>
>> >>
>> >> I am thinking is it possible to have a threshold or configurable
>> >> threshold to utilize free page bitmap optimization?
>> >>
>> >
>> >Could you elaborate your idea? How does it work?
>> >
>>
>> Let's back to Case 2. We run a memory consuming task which will leads to
>> little Free Page in the whole system. Which means from Qemu perspective,
>> little of the dirty_memory is filtered by Free Page list. My original question is
>> whether your solution benefits in this scenario. As you mentioned it works
>> fine. So maybe this threshold is not necessary.
>>
>I didn't quite understand your question before.
>The benefits we get depends on the count of free pages we can filter out.
>This is always true.
>
>> My original idea is in Qemu we can calculate the percentage of the Free Page
>> in the whole system. If it finds there is only little percentage of Free Page,
>> then we don't need to bother to use this method.
>>
>
>I got you. The threshold can be used for optimization, but the effect is very limited.
>If there are only a few of free pages, the process of constructing the free page
>bitmap is very quick.
>But we can stop doing the following things, e.g. sending the free page bitmap and doing
>the bitmap operation, theoretically, that may help to save some time, maybe several ms.
>

Ha, you got what I mean.

>I think a VM has no free pages at all is very rare, in the worst case, there are still several
> MB of free pages. The proper threshold should be determined by comparing the extra
> time spends on processing the free page bitmap and the time spends on sending
>the several MB of free pages through the network.
>If the former is longer, we can stop
>using this method. So we should take the network bandwidth into consideration, it's
>too complicated and not worth to do.
>

Yes, after some thinking, it maybe not that easy and worth to do this
optimization.

>Thanks
>
>Liang
>> Have a nice day~
>>
>> >Liang
>> >
>> >>
>> >> --
>> >> Richard Yang\nHelp you, Help me
>>
>> --
>> Richard Yang\nHelp you, Help me

--
Richard Yang\nHelp you, Help me
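
For reference, the threshold comparison discussed in this message can be sketched roughly as follows. This is only an illustration, not code from the RFC patch or from QEMU: the function name, its parameters, and the sample numbers are all hypothetical, and it assumes the bitmap overhead and the usable bandwidth can be measured or estimated up front.

```python
def free_page_filter_pays_off(free_bytes, bandwidth_bytes_per_s, bitmap_overhead_s):
    """Return True if skipping free pages is expected to save time.

    free_bytes            -- amount of guest memory currently free (hypothetical input)
    bandwidth_bytes_per_s -- usable migration bandwidth (hypothetical input)
    bitmap_overhead_s     -- extra time spent building, sending and applying
                             the free page bitmap (hypothetical input)
    """
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    # Time it would take to simply send the free pages over the network.
    transfer_time_s = free_bytes / bandwidth_bytes_per_s
    # The optimization only helps when the bitmap work is cheaper than the transfer.
    return transfer_time_s > bitmap_overhead_s


# Example with made-up numbers: 8 MiB of free pages on a 10 Gbit/s link.
link = 1.25e9  # bytes per second
print(free_page_filter_pays_off(8 * 2**20, link, 0.005))  # bitmap costs 5 ms
print(free_page_filter_pays_off(8 * 2**20, link, 0.010))  # bitmap costs 10 ms
```

With only a few MB free, the two times are within a few milliseconds of each other, which matches the conclusion above that the threshold is probably not worth the added complexity.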