qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH] migratioin/ram.c: reset complete_round when we gets a queued page
@ 2019-06-05  1:08 Wei Yang
  2019-06-05  6:41 ` Peter Xu
  2019-06-05 12:27 ` Philippe Mathieu-Daudé
  0 siblings, 2 replies; 10+ messages in thread
From: Wei Yang @ 2019-06-05  1:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Wei Yang, dgilbert, quintela

When we get a queued page, the linear order of the block search is
interrupted. We can then no longer rely on the complete_round flag to say
that we have already searched all the blocks on the list.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
---
 migration/ram.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index d881981876..e9b40d636d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2290,6 +2290,12 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss)
          */
         pss->block = block;
         pss->page = offset >> TARGET_PAGE_BITS;
+
+        /*
+         * This unqueued page would break the "one round" check, even if
+         * that is really rare.
+         */
+        pss->complete_round = false;
     }
 
     return !!block;
-- 
2.19.1




* Re: [Qemu-devel] [PATCH] migratioin/ram.c: reset complete_round when we gets a queued page
  2019-06-05  1:08 [Qemu-devel] [PATCH] migratioin/ram.c: reset complete_round when we gets a queued page Wei Yang
@ 2019-06-05  6:41 ` Peter Xu
  2019-06-05  8:52   ` Wei Yang
  2019-06-05 12:27 ` Philippe Mathieu-Daudé
  1 sibling, 1 reply; 10+ messages in thread
From: Peter Xu @ 2019-06-05  6:41 UTC (permalink / raw)
  To: Wei Yang; +Cc: quintela, qemu-devel, dgilbert

On Wed, Jun 05, 2019 at 09:08:28AM +0800, Wei Yang wrote:
> In case we gets a queued page, the order of block is interrupted. We may
> not rely on the complete_round flag to say we have already searched the
> whole blocks on the list.
> 
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> ---
>  migration/ram.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index d881981876..e9b40d636d 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2290,6 +2290,12 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss)
>           */
>          pss->block = block;
>          pss->page = offset >> TARGET_PAGE_BITS;
> +
> +        /*
> +         * This unqueued page would break the "one round" check, even is
> +         * really rare.

Why is this needed?  Could you help explain the problem first?

Thanks,

-- 
Peter Xu



* Re: [Qemu-devel] [PATCH] migratioin/ram.c: reset complete_round when we gets a queued page
  2019-06-05  6:41 ` Peter Xu
@ 2019-06-05  8:52   ` Wei Yang
  2019-06-05  9:38     ` Peter Xu
  0 siblings, 1 reply; 10+ messages in thread
From: Wei Yang @ 2019-06-05  8:52 UTC (permalink / raw)
  To: Peter Xu; +Cc: quintela, Wei Yang, dgilbert, qemu-devel

On Wed, Jun 05, 2019 at 02:41:08PM +0800, Peter Xu wrote:
>On Wed, Jun 05, 2019 at 09:08:28AM +0800, Wei Yang wrote:
>> In case we gets a queued page, the order of block is interrupted. We may
>> not rely on the complete_round flag to say we have already searched the
>> whole blocks on the list.
>> 
>> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> ---
>>  migration/ram.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>> 
>> diff --git a/migration/ram.c b/migration/ram.c
>> index d881981876..e9b40d636d 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -2290,6 +2290,12 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss)
>>           */
>>          pss->block = block;
>>          pss->page = offset >> TARGET_PAGE_BITS;
>> +
>> +        /*
>> +         * This unqueued page would break the "one round" check, even is
>> +         * really rare.
>
>Why this is needed?  Could you help explain the problem first?

Peter, thanks for your question.

I found this issue during code review, and I believe it is a corner case.

Below is a rough outline of ram_find_and_save_block():

    ram_find_and_save_block
        do
            get_queued_page()
            find_dirty_block()
            ram_save_host_page()
        while

The basic logic is: find a page that needs to be migrated, then migrate it.

Without get_queued_page(), find_dirty_block() searches the whole of
ram_list.blocks in order, and pss->complete_round indicates whether this
search has wrapped around.

Everything works fine until get_queued_page() gets involved. The block
unqueued in get_queued_page() could be any block in ram_list.blocks, so there
is a small chance that it breaks the "looped" indicator.

                           unqueue_page()  last_seen_block
                                     |     |
    ram_list.blocks                  v     v
    ---------------------------------+=====+---


This is just a rough picture to illustrate the corner case.

For example, we start from last_seen_block and search to the end of
ram_list.blocks. At this moment, pss->complete_round is set to true. Then we
get a queued page from unqueue_page() at the point marked above. The loop may
then continue only over the range marked with "=", and we will skip all the
other ranges.

This really is a corner case, since ram_save_host_page() would have to return
0 and there would have to be no dirty page in this range. But I don't see how
we can rule this case out.

Please let me know if I am wrong :-)

>
>Thanks,
>
>-- 
>Peter Xu

-- 
Wei Yang
Help you, Help me



* Re: [Qemu-devel] [PATCH] migratioin/ram.c: reset complete_round when we gets a queued page
  2019-06-05  8:52   ` Wei Yang
@ 2019-06-05  9:38     ` Peter Xu
  2019-06-05 10:33       ` Juan Quintela
  2019-06-05 13:41       ` Wei Yang
  0 siblings, 2 replies; 10+ messages in thread
From: Peter Xu @ 2019-06-05  9:38 UTC (permalink / raw)
  To: Wei Yang; +Cc: quintela, qemu-devel, dgilbert

On Wed, Jun 05, 2019 at 04:52:07PM +0800, Wei Yang wrote:
> On Wed, Jun 05, 2019 at 02:41:08PM +0800, Peter Xu wrote:
> >On Wed, Jun 05, 2019 at 09:08:28AM +0800, Wei Yang wrote:
> >> In case we gets a queued page, the order of block is interrupted. We may
> >> not rely on the complete_round flag to say we have already searched the
> >> whole blocks on the list.
> >> 
> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> >> ---
> >>  migration/ram.c | 6 ++++++
> >>  1 file changed, 6 insertions(+)
> >> 
> >> diff --git a/migration/ram.c b/migration/ram.c
> >> index d881981876..e9b40d636d 100644
> >> --- a/migration/ram.c
> >> +++ b/migration/ram.c
> >> @@ -2290,6 +2290,12 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss)
> >>           */
> >>          pss->block = block;
> >>          pss->page = offset >> TARGET_PAGE_BITS;
> >> +
> >> +        /*
> >> +         * This unqueued page would break the "one round" check, even is
> >> +         * really rare.
> >
> >Why this is needed?  Could you help explain the problem first?
> 
> Peter, Thanks for your question.
> 
> I found this issue during code review and I believe this is a corner case.
> 
> Below is a draft chart for ram_find_and_save_block:
> 
>     ram_find_and_save_block
>         do
>             get_queued_page()
>             find_dirty_block()
>             ram_save_host_page()
>         while
> 
> The basic logic here is : get a page need to migrate and migrate it.
> 
> In case we don't have get_queued_page(), find_dirty_block() will search the
> whole ram_list.blocks by order. pss->complete_round is used to indicate
> whether this search has looped.
> 
> Everything works fine after get_queued_page() involved. The block unqueued in
> get_queued_page() could be any block in the ram_list.blocks. This means we
> have very little chance to break the looped indicator.
> 
>                            unqueue_page()  last_seen_block
>                                      |     |
>     ram_list.blocks                  v     v
>     ---------------------------------+=====+---
> 
> 
> Just draw a raw picture to demonstrate a corner case.
> 
> For example, we start from last_seen_block and search till the end of
> ram_list.blocks. At this moment, pss->complete_round is set to true. Then we
> get a queued page from unqueue_page() at the point I pointed. So the loop
> continues may just continue the range as I marked as "=". We will skip all the
> other ranges.

Ah, I see your point, but I don't think there is a problem. Note that
complete_round is reset on each call to ram_find_and_save_block(), so
even if that call to ram_find_and_save_block() returns early, we'll
still know we have dirty pages to migrate, and the next call will be
fine, no?

-- 
Peter Xu
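
A minimal sketch of the corner case and of the objection above, using a toy model rather than the real migration/ram.c code. Everything in it is an assumption made for illustration: the names toy_state, toy_find_and_save and TOY_NPAGES are hypothetical, the block list is flattened into one array of pages, a single queued request arrives right after the linear search wraps, and that request sends nothing (the rare "ram_save_host_page() returns 0" case).

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NPAGES 16 /* hypothetical size of the flattened page list */

struct toy_state {
    bool dirty[TOY_NPAGES]; /* stand-in for the per-block dirty bitmaps */
    int last_page;          /* where the previous call stopped searching */
};

/*
 * Toy version of the do/while loop in ram_find_and_save_block().
 *
 * queued_after_wrap: page index of a request that arrives right after the
 *   linear search wraps, or -1 for none. The request is assumed to send
 *   nothing, so the search simply continues from that page.
 * reset_on_queue: clear complete_round on a queued page, as the patch does.
 *
 * Returns the page that was "sent", or -1 if the search gave up.
 */
static int toy_find_and_save(struct toy_state *s, int queued_after_wrap,
                             bool reset_on_queue)
{
    int page = s->last_page;
    bool complete_round = false; /* starts out false on every call */
    int sent = -1;

    assert(queued_after_wrap < TOY_NPAGES);
    for (;;) {
        /* Linear search for the next dirty page. */
        while (page < TOY_NPAGES && !s->dirty[page]) {
            page++;
        }
        if (complete_round && page >= s->last_page) {
            break; /* believed to have gone once around: give up */
        }
        if (page < TOY_NPAGES) {
            s->dirty[page] = false; /* "send" the page */
            sent = page;
            break;
        }
        page = 0; /* hit the end of the list: wrap around */
        complete_round = true;
        if (queued_after_wrap >= 0) {
            page = queued_after_wrap; /* the request moves the cursor */
            queued_after_wrap = -1;
            if (reset_on_queue) {
                complete_round = false;
            }
        }
    }
    s->last_page = page;
    return sent;
}
```

With last_page = 12, only page 3 dirty, and a request for page 8 arriving after the wrap, the variant without the reset gives up and returns -1, because pages 0 to 7 are never scanned. Calling it a second time with no request finds page 3, since complete_round starts out false again; that is the point of the objection. With reset_on_queue set, the first call already finds page 3.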



* Re: [Qemu-devel] [PATCH] migratioin/ram.c: reset complete_round when we gets a queued page
  2019-06-05  9:38     ` Peter Xu
@ 2019-06-05 10:33       ` Juan Quintela
  2019-06-05 13:39         ` Wei Yang
  2019-06-05 13:41       ` Wei Yang
  1 sibling, 1 reply; 10+ messages in thread
From: Juan Quintela @ 2019-06-05 10:33 UTC (permalink / raw)
  To: Peter Xu; +Cc: Wei Yang, dgilbert, qemu-devel

Peter Xu <peterx@redhat.com> wrote:
> On Wed, Jun 05, 2019 at 04:52:07PM +0800, Wei Yang wrote:
>> On Wed, Jun 05, 2019 at 02:41:08PM +0800, Peter Xu wrote:
>> >On Wed, Jun 05, 2019 at 09:08:28AM +0800, Wei Yang wrote:
>> >> In case we gets a queued page, the order of block is interrupted. We may
>> >> not rely on the complete_round flag to say we have already searched the
>> >> whole blocks on the list.
>> >> 
>> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> >> ---
>> >>  migration/ram.c | 6 ++++++
>> >>  1 file changed, 6 insertions(+)
>> >> 
>> >> diff --git a/migration/ram.c b/migration/ram.c
>> >> index d881981876..e9b40d636d 100644
>> >> --- a/migration/ram.c
>> >> +++ b/migration/ram.c
>> >> @@ -2290,6 +2290,12 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss)
>> >>           */
>> >>          pss->block = block;
>> >>          pss->page = offset >> TARGET_PAGE_BITS;
>> >> +
>> >> +        /*
>> >> +         * This unqueued page would break the "one round" check, even is
>> >> +         * really rare.
>> >


> Ah I see your point, but I don't think there is a problem - note that
> complete_round will be reset for each ram_find_and_save_block(), so
> even if we have that iteration of ram_find_and_save_block() to return
> we'll still know we have dirty pages to migrate and in the next call
> we'll be fine, no?

Reviewed-by: Juan Quintela <quintela@redhat.com>

I *think* that Peter is perhaps right, but it is not clear at all, and
it is easier to be safe.  I think that the only case where this could
matter is if:
- all pages are clean (so complete_round will be set to true)
- we then get a queue_page request

Is that possible?  I am not completely sure after looking at the code.
It *could* be if the page that got queued is the last page remaining,
but ...  I fully agree that the case where _almost all_ pages are
clean and we get a request for a queued page is really rare, so it
should not matter in real life, but ...

Later, Juan.



* Re: [Qemu-devel] [PATCH] migratioin/ram.c: reset complete_round when we gets a queued page
  2019-06-05  1:08 [Qemu-devel] [PATCH] migratioin/ram.c: reset complete_round when we gets a queued page Wei Yang
  2019-06-05  6:41 ` Peter Xu
@ 2019-06-05 12:27 ` Philippe Mathieu-Daudé
  2019-06-05 13:39   ` Wei Yang
  1 sibling, 1 reply; 10+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-06-05 12:27 UTC (permalink / raw)
  To: Wei Yang, qemu-devel; +Cc: dgilbert, quintela

migratioin -> migration



* Re: [Qemu-devel] [PATCH] migratioin/ram.c: reset complete_round when we gets a queued page
  2019-06-05 12:27 ` Philippe Mathieu-Daudé
@ 2019-06-05 13:39   ` Wei Yang
  2019-06-05 14:11     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 10+ messages in thread
From: Wei Yang @ 2019-06-05 13:39 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé; +Cc: quintela, Wei Yang, dgilbert, qemu-devel

On Wed, Jun 05, 2019 at 02:27:11PM +0200, Philippe Mathieu-Daudé wrote:
>migratioin -> migration

Ah... I should take an English lesson...

Thanks

-- 
Wei Yang
Help you, Help me



* Re: [Qemu-devel] [PATCH] migratioin/ram.c: reset complete_round when we gets a queued page
  2019-06-05 10:33       ` Juan Quintela
@ 2019-06-05 13:39         ` Wei Yang
  0 siblings, 0 replies; 10+ messages in thread
From: Wei Yang @ 2019-06-05 13:39 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, Wei Yang, Peter Xu, dgilbert

On Wed, Jun 05, 2019 at 12:33:39PM +0200, Juan Quintela wrote:
>Peter Xu <peterx@redhat.com> wrote:
>> On Wed, Jun 05, 2019 at 04:52:07PM +0800, Wei Yang wrote:
>>> On Wed, Jun 05, 2019 at 02:41:08PM +0800, Peter Xu wrote:
>>> >On Wed, Jun 05, 2019 at 09:08:28AM +0800, Wei Yang wrote:
>>> >> In case we gets a queued page, the order of block is interrupted. We may
>>> >> not rely on the complete_round flag to say we have already searched the
>>> >> whole blocks on the list.
>>> >> 
>>> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>>> >> ---
>>> >>  migration/ram.c | 6 ++++++
>>> >>  1 file changed, 6 insertions(+)
>>> >> 
>>> >> diff --git a/migration/ram.c b/migration/ram.c
>>> >> index d881981876..e9b40d636d 100644
>>> >> --- a/migration/ram.c
>>> >> +++ b/migration/ram.c
>>> >> @@ -2290,6 +2290,12 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss)
>>> >>           */
>>> >>          pss->block = block;
>>> >>          pss->page = offset >> TARGET_PAGE_BITS;
>>> >> +
>>> >> +        /*
>>> >> +         * This unqueued page would break the "one round" check, even is
>>> >> +         * really rare.
>>> >
>
>
>> Ah I see your point, but I don't think there is a problem - note that
>> complete_round will be reset for each ram_find_and_save_block(), so
>> even if we have that iteration of ram_find_and_save_block() to return
>> we'll still know we have dirty pages to migrate and in the next call
>> we'll be fine, no?
>
>Reviewed-by: Juan Quintela <quintela@redhat.com>
>
>I *think* that peter is perhaps right, but it is not clear at all, and
>it is easier to be safe.  I think that the only case that this could
>matter is if:
>- all pages are clean (so complete_round will get as true)
>- we went a queue_page request
>
>Is that possible?  I am not completely sure after looking at the code.
>It *could* be if the page that got queued is the last page remaining,
>but ......  I fully agree that the case that _almost all_ pages are
>clean and we get a request for a queued page is really rare, so it
>should not matter in real life, but ....
>

Agree

>Later, Juan.

-- 
Wei Yang
Help you, Help me



* Re: [Qemu-devel] [PATCH] migratioin/ram.c: reset complete_round when we gets a queued page
  2019-06-05  9:38     ` Peter Xu
  2019-06-05 10:33       ` Juan Quintela
@ 2019-06-05 13:41       ` Wei Yang
  1 sibling, 0 replies; 10+ messages in thread
From: Wei Yang @ 2019-06-05 13:41 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Wei Yang, dgilbert, quintela

On Wed, Jun 05, 2019 at 05:38:19PM +0800, Peter Xu wrote:
>On Wed, Jun 05, 2019 at 04:52:07PM +0800, Wei Yang wrote:
>> On Wed, Jun 05, 2019 at 02:41:08PM +0800, Peter Xu wrote:
>> >On Wed, Jun 05, 2019 at 09:08:28AM +0800, Wei Yang wrote:
>> >> In case we gets a queued page, the order of block is interrupted. We may
>> >> not rely on the complete_round flag to say we have already searched the
>> >> whole blocks on the list.
>> >> 
>> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> >> ---
>> >>  migration/ram.c | 6 ++++++
>> >>  1 file changed, 6 insertions(+)
>> >> 
>> >> diff --git a/migration/ram.c b/migration/ram.c
>> >> index d881981876..e9b40d636d 100644
>> >> --- a/migration/ram.c
>> >> +++ b/migration/ram.c
>> >> @@ -2290,6 +2290,12 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss)
>> >>           */
>> >>          pss->block = block;
>> >>          pss->page = offset >> TARGET_PAGE_BITS;
>> >> +
>> >> +        /*
>> >> +         * This unqueued page would break the "one round" check, even is
>> >> +         * really rare.
>> >
>> >Why this is needed?  Could you help explain the problem first?
>> 
>> Peter, Thanks for your question.
>> 
>> I found this issue during code review and I believe this is a corner case.
>> 
>> Below is a draft chart for ram_find_and_save_block:
>> 
>>     ram_find_and_save_block
>>         do
>>             get_queued_page()
>>             find_dirty_block()
>>             ram_save_host_page()
>>         while
>> 
>> The basic logic here is : get a page need to migrate and migrate it.
>> 
>> In case we don't have get_queued_page(), find_dirty_block() will search the
>> whole ram_list.blocks by order. pss->complete_round is used to indicate
>> whether this search has looped.
>> 
>> Everything works fine after get_queued_page() involved. The block unqueued in
>> get_queued_page() could be any block in the ram_list.blocks. This means we
>> have very little chance to break the looped indicator.
>> 
>>                            unqueue_page()  last_seen_block
>>                                      |     |
>>     ram_list.blocks                  v     v
>>     ---------------------------------+=====+---
>> 
>> 
>> Just draw a raw picture to demonstrate a corner case.
>> 
>> For example, we start from last_seen_block and search till the end of
>> ram_list.blocks. At this moment, pss->complete_round is set to true. Then we
>> get a queued page from unqueue_page() at the point I pointed. So the loop
>> continues may just continue the range as I marked as "=". We will skip all the
>> other ranges.
>
>Ah I see your point, but I don't think there is a problem - note that
>complete_round will be reset for each ram_find_and_save_block(), so
>even if we have that iteration of ram_find_and_save_block() to return
>we'll still know we have dirty pages to migrate and in the next call
>we'll be fine, no?
>

This is a really rare case, and it is hard to say whether it would be harmful.

Still, the chance exists.

>-- 
>Peter Xu

-- 
Wei Yang
Help you, Help me



* Re: [Qemu-devel] [PATCH] migratioin/ram.c: reset complete_round when we gets a queued page
  2019-06-05 13:39   ` Wei Yang
@ 2019-06-05 14:11     ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 10+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-06-05 14:11 UTC (permalink / raw)
  To: Wei Yang; +Cc: quintela, Wei Yang, dgilbert, qemu-devel

On 6/5/19 3:39 PM, Wei Yang wrote:
> On Wed, Jun 05, 2019 at 02:27:11PM +0200, Philippe Mathieu-Daudé wrote:
>> migratioin -> migration
> 
> Ah... I should take an English lesson...

Your English is fine, I believe this is just a typo that slipped in ;)



