Re: [RFC Design Doc]Speed up live migration by skipping free pages

From: Wei Yang <richard.weiyang@huawei.com>
To: "Li, Liang Z" <liang.z.li@intel.com>
Cc: Wei Yang <richard.weiyang@huawei.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kenel.org" <linux-kernel@vger.kenel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"rth@twiddle.net" <rth@twiddle.net>,
	"ehabkost@redhat.com" <ehabkost@redhat.com>,
	"mst@redhat.com" <mst@redhat.com>,
	"amit.shah@redhat.com" <amit.shah@redhat.com>,
	"quintela@redhat.com" <quintela@redhat.com>,
	"dgilbert@redhat.com" <dgilbert@redhat.com>,
	"mohan_parthasarathy@hpe.com" <mohan_parthasarathy@hpe.com>,
	"jitendra.kolhe@hpe.com" <jitendra.kolhe@hpe.com>,
	"simhan@hpe.com" <simhan@hpe.com>,
	"rkagan@virtuozzo.com" <rkagan@virtuozzo.com>,
	"riel@redhat.com" <riel@redhat.com>
Subject: Re: [RFC Design Doc]Speed up live migration by skipping free pages
Date: Thu, 24 Mar 2016 08:52:56 +0800
Message-ID: <20160324005256.GA14956@linux-gk3p>
In-Reply-To: <F2CBF3009FA73547804AE4C663CAB28E0415AABD@shsmsx102.ccr.corp.intel.com>

On Wed, Mar 23, 2016 at 02:35:42PM +0000, Li, Liang Z wrote:
>> >No special purpose. Maybe it's caused by the email client. I didn't
>> >find the character in the original doc.
>> >
>> 
>> https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg00715.html
>> 
>> You could take a look at this link, there is a '>' before From.
>
>Yes, there is. 
>
>> >> >
>> >> >6. Handling page cache in the guest
>> >> >The memory used for page cache in the guest will change depending on
>> >> >the workload; if the guest runs some block-IO-intensive workload, there
>> >> >will
>> >>
>> >> Would this improvement still help much when the guest has only a few free pages?
>> >
>> >Yes, the improvement is very obvious.
>> >
>> 
>> Good to know this.
>> 
>> >> I think Case 2 in your performance data mimics this kind of case,
>> >> except that the memory-consuming task is stopped before migration. If it
>> >> continued, would we still perform better than before?
>> >
>> >Actually, my RFC patch didn't consider the page cache; Roman raised this
>> >issue, so I added this part to the doc.
>> >
>> >Case 2 didn't mimic this kind of scenario. The workload is a
>> >memory-consuming workload, not a block-IO-intensive one, so there is
>> >not much page cache in this case.
>> >
>> >If the workload in Case 2 continues, as long as it does not write to all the
>> >memory it allocates, we can still benefit.
>> >
>> 
>> It sounds like I have little knowledge of the page cache and its relationship
>> to free pages and I/O-intensive work.
>> 
>> Here is my personal understanding; I would appreciate it if you could correct
>> me.
>> 
>>                 +---------+
>>                 |PageCache|
>>                 +---------+
>>       +---------+---------+---------+---------+
>>       |Page     |Page     |Free Page|Page     |
>>       +---------+---------+---------+---------+
>> 
>> A Free Page is a page on the free_list; is PageCache a page cached in the CPU's
>> cache lines?
>
>No, the page cache is quite different from the CPU cache line.
>"In computing, a page cache, sometimes also called disk cache, is a transparent cache
> for the pages originating from a secondary storage device such as a hard disk drive (HDD).
> The operating system keeps a page cache in otherwise unused portions of the main
> memory (RAM), resulting in quicker access to the contents of cached pages and
> overall performance improvements."
>You can refer to https://en.wikipedia.org/wiki/Page_cache
>for more details.
>

My knowledge here is poor~ I should have googled it before guessing at the
meaning of the term.

If my understanding is correct, the Page Cache is counted as Free Page, while
actually we should migrate those pages instead of filtering them out.

>
>> When a memory-consuming task runs, it leaves few Free Pages in the whole
>> system. What is the consequence when I/O-intensive work runs? I guess it
>> still leaves few Free Pages. And will there be some problem syncing the
>> PageCache?
>> 
>> >>
>> >> I am wondering whether it is possible to have a threshold, or a configurable
>> >> threshold, for using the free page bitmap optimization?
>> >>
>> >
>> >Could you elaborate on your idea? How does it work?
>> >
>> 
>> Let's go back to Case 2. We run a memory-consuming task, which leaves
>> few Free Pages in the whole system. From Qemu's perspective, this means
>> little of the dirty_memory is filtered by the Free Page list. My original question was
>> whether your solution helps in this scenario. As you mentioned, it works
>> fine, so maybe this threshold is not necessary.
>> 
>I didn't quite understand your question before.
>The benefit we get depends on the number of free pages we can filter out.
>This is always true.
>
>> My original idea was that Qemu could calculate the percentage of Free Pages
>> in the whole system. If it finds only a small percentage of Free Pages,
>> then we don't need to bother using this method.
>> 
>
>I got you. The threshold can be used for optimization, but the effect is very limited.
>If there are only a few free pages, constructing the free page
>bitmap is very quick.
>But we could skip the subsequent steps, e.g. sending the free page bitmap and doing
>the bitmap operation. Theoretically, that may save some time, maybe several ms.
>

Ha, you got what I meant.
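
A minimal sketch of the "bitmap operation" discussed above, assuming one bit
per guest page in both the dirty bitmap and the free page bitmap. The function
names and the use of a Python integer as a bitmap are hypothetical, for
illustration only; they are not taken from the RFC patch or from Qemu.

```python
def filter_free_pages(dirty_bitmap: int, free_bitmap: int, nr_pages: int) -> int:
    """Clear the dirty bit of every page the guest reports as free.

    Bit i of each bitmap stands for guest page i (an assumption of this sketch).
    """
    mask = (1 << nr_pages) - 1          # keep only the bits of real pages
    return dirty_bitmap & ~free_bitmap & mask


def count_pages(bitmap: int) -> int:
    """Number of pages still marked in the bitmap."""
    return bin(bitmap).count("1")


# Four pages, all dirty at the start of migration; page 2 is on the free list.
to_send = filter_free_pages(0b1111, 0b0100, 4)
print(bin(to_send), count_pages(to_send))   # 0b1011 3
```

The fewer bits set in the free page bitmap, the fewer pages this removes from
the set to send, which is why the benefit scales with the number of free pages.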

>I think a VM with no free pages at all is very rare; even in the worst case, there are still several
> MB of free pages. The proper threshold should be determined by comparing the extra
> time spent on processing the free page bitmap with the time spent on sending
>the several MB of free pages through the network. If the former is longer, we can stop
>using this method. So we would have to take the network bandwidth into consideration; it's
>too complicated and not worth doing.
>

Yes, after some thought, this optimization may not be that easy, or worth
doing.
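
For reference, the comparison described above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the function
name, its parameters, and the example numbers (a 10 Gbit/s link, 5 ms of
bitmap overhead) are hypothetical and do not come from the RFC or from any
measurement.

```python
def worth_filtering(free_bytes: int,
                    bandwidth_bytes_per_s: float,
                    bitmap_overhead_s: float) -> bool:
    """Return True if skipping the free pages saves more time than the
    free page bitmap costs to build, send and apply."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    # Time it would take to send the free pages if they were not skipped.
    send_time_s = free_bytes / bandwidth_bytes_per_s
    return send_time_s > bitmap_overhead_s


MiB = 1 << 20
link = 1.25e9  # assumed 10 Gbit/s link, in bytes per second

print(worth_filtering(8 * MiB, link, 0.005))  # True:  ~6.7 ms saved vs 5 ms spent
print(worth_filtering(2 * MiB, link, 0.005))  # False: ~1.7 ms saved vs 5 ms spent
```

Both inputs other than free_bytes have to be measured at run time, which is
the complication mentioned above.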

>Thanks
>
>Liang
>> Have a nice day~
>> 
>> >Liang
>> >
>> >>
>> >> --
>> >> Richard Yang
>> >> Help you, Help me
>> 
>> --
>> Richard Yang
>> Help you, Help me
-- 
Richard Yang
Help you, Help me
