qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Yong Huang <yong.huang@smartx.com>,
	qemu-devel@nongnu.org, Fabiano Rosas <farosas@suse.de>,
	Wei Wang <wei.w.wang@intel.com>,
	"Michael S . Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH v1 2/2] migration: Do not perform RAMBlock dirty sync during the first iteration
Date: Fri, 15 Nov 2024 16:49:56 +0100	[thread overview]
Message-ID: <91a7fcc0-f4b6-4129-bde2-c776c1e09859@redhat.com> (raw)
In-Reply-To: <ZzdrjTXgNFlrqqEu@x1n>

>> We better double check and document that, because it must be guaranteed, not
>> "let's cross fingers".
> 
> Yes, we should double check on at least known good use cases, maybe not
> all.
> 
> E.g., I see nvmm_log_sync() and whpx_log_sync() unconditionally set dirty
> to all mem always.  I actually don't know how some of these trackers work
> for live migration, or if it works at all.
> 
> If by accident this change breaks something, I also wonder whether we
> should fix the patch that breaks it, or fix the assumption that the 1st
> sync must happen at setup.  That's simply wrong to me.. and not all dirty
> track users have such either.  E.g. vga tracking won't even have a SETUP
> phase at all.
> 
> The simplest solution for any potential breakage is to provide a
> log_global_start() and sync once there, that's exactly what SETUP should
> do.  But I hope it won't happen at all..
> 
> This is legacy tech debt to me.  The earlier we understand it the better,
> so I'm personally ok if someone would try to do this as long as major archs
> will be safe.
> 
>>
>> Also, I'm not 100% sure how KVM internals implement that (I recall some
>> s390x oddities, but might be wrong ...).
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> virtio-balloon() calls qemu_guest_free_page_hint() which calls
>>>>>>
>>>>>> migration_clear_memory_region_dirty_bitmap_range()
>>>>>> bitmap_clear()
>>>>>>
>>>>>> So it would maybe have the same issue.
>>>>>
>>>>> Should virtio-balloon do the same?
>>>>
>>>> virtio-balloon is more interesting, because I assume here we could run after
>>>> the "initial clearing" and would want to mark it clean everywhere.
>>>
>>> Yes, if it does what I mentioned below, IIUC it'll clear all dirty bits
>>> across the whole stack.  Only the ram_list bitmap is missing.  IIUC it
>>> could mean it could stop working for at least tcg, as tcg sololy uses
>>> it.. even with kvm some MRs may use it.  Maybe we want to fix it
>>> separately.
>>
>> Yes, that definitely needs care.
>>
>> [...]
>>
>>>> (but I'm confused because we have way too many bitmaps, and maybe the KVM
>>>> one could be problematic without an initial sync ... we'd want an initial
>>>> clearing for that as well ...)
>>>
>>> So IMHO most of the bitmaps should be initialized with zeros, not
>>> ones.. like I mentioned above.
>>>
>>> Migration bitmap is special, because it's not about dirty tracking
>>> capability / reporting but that we know we need to migrate the first round.
>>> So setting all ones makes sense for migration only, not a reporting
>>> facility.  While KVM_DIRTY_LOG_INITIALLY_SET existed for its own reasoning
>>> on speeding up migration starts..
>>>
>>> So, now I am thinking whether we should not set all ones in ram_list bitmap
>>> at all, corresponds to this change:
>>>
>>> ===8<===
>>> diff --git a/system/physmem.c b/system/physmem.c
>>> index dc1db3a384..10966fa68c 100644
>>> --- a/system/physmem.c
>>> +++ b/system/physmem.c
>>> @@ -1913,10 +1913,6 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>>>        ram_list.version++;
>>>        qemu_mutex_unlock_ramlist();
>>> -    cpu_physical_memory_set_dirty_range(new_block->offset,
>>> -                                        new_block->used_length,
>>> -                                        DIRTY_CLIENTS_ALL);
>>> -
>>>        if (new_block->host) {
>>>            qemu_ram_setup_dump(new_block->host, new_block->max_length);
>>>            qemu_madvise(new_block->host, new_block->max_length, QEMU_MADV_HUGEPAGE);
>>> ===8<===
>>>
>>
>> It will leave things in an inconsistent state with regards of
>> qemu_ram_resize() though. So we'd want to do the same thing there as well.
>>
>> I don't know about DIRTY_CLIENTS_ALL, though ... no expert on that, which
>> side effects it has.
>>
>> Because saying "initially, all memory is dirty" can make sense, depending on
>> from which angle you look at it.
> 
> Migration definitely doesn't need it, so to be safe we could also make
> CLIENT_ALL to CODE|VGA.
> 
> And if we agree virtio-mem will need to punch holes on all the bitmaps,
> then this patch is even not needed.  For that, more below.

Yes, it's fine for me as a fix. It's not optimal, but I don't really 
care as long as it works :)

> 
>>
>>> I'm guessing whether above could fix the virtio-mem regression after
>>> Hyman's current patch applied.
>>>
>>> Said that, IMHO virtio-mem should still use the same helper just like
>>> virtio-balloon as I discussed previously, so as to reset bitmap for the
>>> whole stack (which seems to always be the right thing to do to not miss one
>>> layer of them)?
>>
>> Well, the difference is that virtio-mem really only gets called exactly once
>> initially. If we can just reset all bitmaps initially, then virtio-mem
>> really only has to zap the rb->bmap.
> 
> I think virtio-mem should better always punch through, regardless of
> whether we can have above change. Consider cases like KVM "initial-all-set"
> feature I mentioned above, when that feature bit is set by QEMU, KVM sets
> all ones to the bitmap initially.  So that may be required for virtio-mem
> to clear 1s in KVM at least.

Yes.

Anyhow, I'm happy as long as we don't break virtio-mem. Clearing all 
bitmaps will work.

Getting virtio-mem to clear discarded after every bitmap sync would be 
even more robust, for example, if we'd had some stupid sync() 
implementation that simply always sets all memory dirty ...

-- 
Cheers,

David / dhildenb



      reply	other threads:[~2024-11-15 15:50 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-09  4:59 [PATCH v1 0/2] migration: Skip the first dirty sync Hyman Huang
2024-11-09  4:59 ` [PATCH v1 1/2] virtio-balloon: Enable free page hinting during PRECOPY_NOTIFY_SETUP Hyman Huang
2024-11-12 10:10   ` David Hildenbrand
2024-11-09  4:59 ` [PATCH v1 2/2] migration: Do not perform RAMBlock dirty sync during the first iteration Hyman Huang
2024-11-11  9:07   ` Wang, Wei W
2024-11-11 10:20     ` Yong Huang
2024-11-11  9:27   ` David Hildenbrand
2024-11-11 10:08     ` Yong Huang
2024-11-11 10:42       ` David Hildenbrand
2024-11-11 11:14         ` Yong Huang
2024-11-11 11:37         ` Yong Huang
2024-11-12 10:08           ` David Hildenbrand
2024-11-13 17:40             ` Peter Xu
2024-11-13 18:49               ` David Hildenbrand
2024-11-13 20:12                 ` Peter Xu
2024-11-14  9:02                   ` David Hildenbrand
2024-11-14 19:28                     ` Peter Xu
2024-11-14 21:16                       ` David Hildenbrand
2024-11-14 22:40                         ` Peter Xu
2024-11-15  9:11                           ` David Hildenbrand
2024-11-15 15:41                             ` Peter Xu
2024-11-15 15:49                               ` David Hildenbrand [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=91a7fcc0-f4b6-4129-bde2-c776c1e09859@redhat.com \
    --to=david@redhat.com \
    --cc=farosas@suse.de \
    --cc=mst@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=wei.w.wang@intel.com \
    --cc=yong.huang@smartx.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).