From: Peter Xu <peterx@redhat.com>
To: Catherine Ho <catherine.hecx@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
Ard Biesheuvel <ard.biesheuvel@linaro.org>,
Juan Quintela <quintela@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
QEMU Developers <qemu-devel@nongnu.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Richard Henderson <rth@twiddle.net>,
Yury Kotov <yury-kotov@yandex-team.ru>
Subject: Re: [Qemu-devel] [PATCH] migration: avoid copying ignore-shared ramblock when in incoming migration
Date: Tue, 2 Apr 2019 20:36:59 +0800
Message-ID: <20190402123659.GG11008@xz-x1>
In-Reply-To: <CAEn6zmEjYh6uK3Y5cyenLKDKe+cCpjC5HOnNNoM6SniDAKPwzQ@mail.gmail.com>
On Tue, Apr 02, 2019 at 05:06:15PM +0800, Catherine Ho wrote:
> On Tue, 2 Apr 2019 at 15:58, Peter Xu <peterx@redhat.com> wrote:
>
> > On Tue, Apr 02, 2019 at 03:47:16PM +0800, Catherine Ho wrote:
> > > Hi Peter Maydell
> > >
> > > On Tue, 2 Apr 2019 at 11:05, Peter Maydell <peter.maydell@linaro.org>
> > wrote:
> > >
> > > > On Tue, 2 Apr 2019 at 09:57, Catherine Ho <catherine.hecx@gmail.com>
> > > > wrote:
> > > > > The root cause is the used idx is moved forward after 1st time
> > incoming,
> > > > and in 2nd time incoming,
> > > > > the last_avail_idx will be incorrectly restored from the saved device
> > > > state file(not in the ram).
> > > > >
> > > > > I watched this even on x86 for a virtio-scsi disk
> > > > >
> > > > > Any ideas for supporting 2nd time, 3rd time... incoming restoring?
> > > >
> > > > Does the destination end go through reset between the 1st and 2nd
> > > >
> > > seems not, please see my step below
> > >
> > > > incoming attempts? I'm not a migration expert, but I thought that
> > > > devices were allowed to assume that their state is "state of the
> > > > device following QEMU reset" before the start of an incoming
> > > > migration attempt.
> > > >
> > >
> > > Here are my steps:
> > > 1. start the guest normally with qemu, using a shared memory-backend file
> > > 2. stop the vm, then save the device state to another file via monitor migrate
> > > "exec: cat>..."
> > > 3. quit the vm
> > > 4. restore the vm by qemu -incoming "exec:cat ..."
> > > 5. continue the vm via the monitor; the 1st incoming works fine
> > > 6. quit the vm
> > > 7. restore the vm by qemu -incoming "exec:cat ..." for the 2nd time
> > > 8. continue -> error happened
> > > Actually, this can be fixed by forcibly restoring the idx with
> > > virtio_queue_restore_last_avail_idx()
> > > But I am not sure whether it is reasonable.
> >
> > Yeah I really suspect its validity.
> >
> > IMHO normal migration streams keep the device state and RAM data
> > together in the dumped file, so they always match.
> >
> > In your shared case, the device states are in the dumped file however
> > the RAM data is located somewhere else. After you quit the VM from
> > the 1st incoming migration the RAM is new (because that's a shared
> > memory file) and the device data is still old. They do not match
> > already, then I'd say you can't migrate with that any more.
> >
> > If you want to do that, you'd better take a snapshot of the RAM backend
> > file if your filesystem supports it (or even simpler, back it up
> > beforehand) before you start any incoming migration. Then with the
> > dumped file (which contains the device states) and that snapshot file
> > (which contains the exact RAM data that matches the device states)
> > you'll always be able to migrate as many times as you want.
> >
>
> Understood, thanks Peter Xu
> Is there any feasible way to indicate that the snapshot of the RAM backend
> file matches the device data?
> >VQ 2 size 0x400 < last_avail_idx 0x1639 - used_idx 0x2688
> >Failed to load virtio-scsi:virtio
>
> Because I think reporting the above error is not so friendly. Could we add a
> version id in both the RAM backend file and the device data file?

It would be non-trivial I'd say - AFAIK we don't have an existing way
to tag the memory-backend-file content (IIUC that's what you use).
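
To illustrate what such tagging could look like if done outside QEMU
(only a sketch, not an existing QEMU feature): record a digest of the
RAM backend file and of the device state file at the moment they are
known to match, then compare both before each incoming migration. The
function names, the JSON manifest layout and the choice of SHA-256 are
all assumptions made for this example, and hashing a large RAM file
will be slow.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def current_digests(ram_file, device_state_file):
    """Digest both files; the keys are hypothetical labels, not QEMU names."""
    return {
        "ram": file_digest(ram_file),
        "device_state": file_digest(device_state_file),
    }


def write_manifest(manifest_path, ram_file, device_state_file):
    """Record the digests at a point where both files are known to match."""
    manifest = current_digests(ram_file, device_state_file)
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))
    return manifest


def check_manifest(manifest_path, ram_file, device_state_file):
    """Return the labels whose file content changed since the manifest."""
    manifest = json.loads(Path(manifest_path).read_text())
    current = current_digests(ram_file, device_state_file)
    return [name for name in manifest if manifest[name] != current[name]]
```

A wrapper script could call check_manifest() before starting qemu with
-incoming and refuse to continue if the returned list is not empty,
which would at least be friendlier than the virtio load failure.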

And since you mentioned versioning of these states, I just
remembered that even with this you may not be able to get a completely
matched state of the VM, because AFAICT besides the RAM state and the
device state, you probably also need to consider the disk state.
After you start the VM from the 1st incoming migration, there could be
data flushed to the VM backend disk, and then that state has changed as
well. So even if you snapshot the RAM file you'll still lose the
disk state IIUC, so it could still be broken. In other words, to make
a migration/snapshot work you'll need all three of these states to
match.
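
For reference, this is roughly why a stale device state file combined
with newer shared RAM produces the "VQ 2 size 0x400 < last_avail_idx
0x1639 - used_idx 0x2688" message quoted above. IIUC last_avail_idx is
loaded from the device state file while used_idx is read back from the
ring in guest RAM, and the load path checks that their 16-bit
difference fits in the queue. The helper names below are made up for
illustration; this is a sketch of the arithmetic, not the actual QEMU
code.

```python
def virtqueue_inuse(last_avail_idx, used_idx):
    """In-flight request count, using 16-bit wrapping index arithmetic."""
    return (last_avail_idx - used_idx) & 0xFFFF


def indices_consistent(vq_size, last_avail_idx, used_idx):
    """True if the in-flight count fits in a queue of vq_size entries."""
    return virtqueue_inuse(last_avail_idx, used_idx) <= vq_size


# Values taken from the error message in this thread.
print(hex(virtqueue_inuse(0x1639, 0x2688)))        # 0xefb1
print(indices_consistent(0x400, 0x1639, 0x2688))   # False
```

Since used_idx in RAM has moved past the saved last_avail_idx, the
difference wraps around to a huge value and the check fails, which is
the same mismatch between the two states described above.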

Before we discuss the topic further... could you share your
requirement with me first? I'm getting a bit confused now, since when I
thought about shared mem I was thinking about migrating within the
same host to e.g. upgrade the hypervisor, but that obviously does not
need you to do incoming migration multiple times. So what do
you ultimately want to achieve?

Regards,

--
Peter Xu