qemu-devel.nongnu.org archive mirror
* Migration vmdesc and xen-save-devices-state
From: Jason Andryuk @ 2020-06-24 13:28 UTC
  To: QEMU, xen-devel, zhang.zhanghailiang

Hi,

At some point, QEMU changed to add a json VM description (vmdesc)
after the migration data.  The vmdesc is not needed to restore the
migration data, but qemu_loadvm_state() will read and discard the
vmdesc to clear the stream when should_send_vmdesc() is true.

xen-save-devices-state generates its migration data without a vmdesc.
xen-load-devices-state in turn calls qemu_loadvm_state() which tries
to load vmdesc since should_send_vmdesc() is true for Xen.  When
restoring from a file, this is fine since it'll return EOF, print
"Expected vmdescription section, but got 0" and end the restore
successfully.

Linux stubdoms load their migration data over a console, so they don't
hit the EOF and end up waiting.  There does seem to be a timeout
though and restore continues after a delay, but we'd like to eliminate
the delay.
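
To make the file vs. console difference concrete, here is a toy model
(not QEMU code) of the step that follows the device state.  Every name
in it (toy_stream, toy_after_device_state, TOY_VMDESC_MARKER and its
value) is made up for illustration; the only point is that a reader
which sees EOF can finish at once, while a reader on a still-open
console has nothing to read and no EOF, so it stalls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder marker value; an assumption for this sketch only. */
#define TOY_VMDESC_MARKER 0x06

enum toy_result {
    TOY_NO_VMDESC,   /* read gave 0 or another byte: warn and finish */
    TOY_GOT_VMDESC,  /* marker seen: vmdesc would be read and discarded */
    TOY_WOULD_WAIT   /* no data and no EOF: reader stalls until a timeout */
};

/* Hypothetical stand-in for the incoming migration stream. */
struct toy_stream {
    const unsigned char *data;
    size_t len;
    size_t pos;
    bool eof_visible;  /* true for a file, false for an open console */
};

/* Models the single trailing read attempted after the device state. */
enum toy_result toy_after_device_state(struct toy_stream *s)
{
    assert(s != NULL);
    if (s->pos < s->len) {
        unsigned char b = s->data[s->pos++];
        return b == TOY_VMDESC_MARKER ? TOY_GOT_VMDESC : TOY_NO_VMDESC;
    }
    /* Nothing left to read: only a visible EOF lets the loader move on. */
    return s->eof_visible ? TOY_NO_VMDESC : TOY_WOULD_WAIT;
}
```

With an empty stream, eof_visible = true models the file case and
yields TOY_NO_VMDESC; eof_visible = false models the stubdom console
and yields TOY_WOULD_WAIT.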

Two options to address this are to either:
1) set suppress_vmdesc for the Xen machines to bypass the
should_send_vmdesc() check.
or
2) just send the vmdesc data.

Since vmdesc is just discarded, maybe #1 should be followed.
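
As a rough illustration of #1, the sketch below is a toy model and not
the QEMU source.  struct toy_machine and toy_should_send_vmdesc() are
hypothetical names, and the real should_send_vmdesc() may take other
conditions into account.  It only shows how a per-machine
suppress_vmdesc flag would make the loader skip the trailing read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the machine state; only the flag discussed here. */
struct toy_machine {
    bool suppress_vmdesc;
};

/* A vmdesc is expected after the device state unless it is suppressed. */
bool toy_should_send_vmdesc(const struct toy_machine *m)
{
    assert(m != NULL);
    return !m->suppress_vmdesc;
}
```

In this model a Xen machine with suppress_vmdesc left false still
expects a vmdesc, and setting the flag to true removes the expectation,
so neither the save side nor the load side would touch a vmdesc.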

If going with #2, qemu_save_device_state() needs to generate the
vmdesc data.  Looking at qemu_save_device_state() and
qemu_savevm_state_complete_precopy_non_iterable(), they are both very
similar and could seemingly be merged.  qmp_xen_save_devices_state()
could even leverage the bdrv_inactivate_all() call in
qemu_savevm_state_complete_precopy_non_iterable().

That would make qemu_save_device_state() a little more heavyweight, which
could impact COLO.  I'm not sure how performance sensitive the COLO
code is, and I haven't measured anything.

Does anyone have thoughts or opinions on the subject?

Thanks,
Jason



* Re: Migration vmdesc and xen-save-devices-state
From: Dr. David Alan Gilbert @ 2020-06-24 17:56 UTC
  To: Jason Andryuk; +Cc: xen-devel, QEMU, zhang.zhanghailiang

* Jason Andryuk (jandryuk@gmail.com) wrote:
> Hi,
> 
> At some point, QEMU changed to add a json VM description (vmdesc)
> after the migration data.  The vmdesc is not needed to restore the
> migration data, but qemu_loadvm_state() will read and discard the
> vmdesc to clear the stream when should_send_vmdesc() is true.

About 5 years ago :-)

> xen-save-devices-state generates its migration data without a vmdesc.
> xen-load-devices-state in turn calls qemu_loadvm_state() which tries
> to load vmdesc since should_send_vmdesc() is true for Xen.  When
> restoring from a file, this is fine since it'll return EOF, print
> "Expected vmdescription section, but got 0" and end the restore
> successfully.
> 
> Linux stubdoms load their migration data over a console, so they don't
> hit the EOF and end up waiting.  There does seem to be a timeout
> though and restore continues after a delay, but we'd like to eliminate
> the delay.
> 
> Two options to address this are to either:
> 1) set suppress_vmdesc for the Xen machines to bypass the
> should_send_vmdesc() check.
> or
> 2) just send the vmdesc data.
> 
> Since vmdesc is just discarded, maybe #1 should be followed.

#1 does sound simple!

> If going with #2, qemu_save_device_state() needs to generate the
> vmdesc data.  Looking at qemu_save_device_state() and
> qemu_savevm_state_complete_precopy_non_iterable(), they are both very
> similar and could seemingly be merged.  qmp_xen_save_devices_state()
> could even leverage the bdrv_inactivate_all() call in
> qemu_savevm_state_complete_precopy_non_iterable().
> 
> That would make qemu_save_device_state() a little more heavyweight, which
> could impact COLO.  I'm not sure how performance sensitive the COLO
> code is, and I haven't measured anything.

COLO snapshots are potentially quite sensitive; although we've got a
load of other things we could do with speeding up, we could do without
making them noticeably heavier.

Dave

> Does anyone have thoughts or opinions on the subject?
> 
> Thanks,
> Jason
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: Migration vmdesc and xen-save-devices-state
From: Jason Andryuk @ 2020-06-25  2:44 UTC
  To: Dr. David Alan Gilbert; +Cc: xen-devel, QEMU, zhang.zhanghailiang

On Wed, Jun 24, 2020 at 1:57 PM Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
>
> * Jason Andryuk (jandryuk@gmail.com) wrote:
> > Hi,
> >
> > At some point, QEMU changed to add a json VM description (vmdesc)
> > after the migration data.  The vmdesc is not needed to restore the
> > migration data, but qemu_loadvm_state() will read and discard the
> > vmdesc to clear the stream when should_send_vmdesc() is true.
>
> About 5 years ago :-)

:)

> > xen-save-devices-state generates its migration data without a vmdesc.
> > xen-load-devices-state in turn calls qemu_loadvm_state() which tries
> > to load vmdesc since should_send_vmdesc() is true for Xen.  When
> > restoring from a file, this is fine since it'll return EOF, print
> > "Expected vmdescription section, but got 0" and end the restore
> > successfully.
> >
> > Linux stubdoms load their migration data over a console, so they don't
> > hit the EOF and end up waiting.  There does seem to be a timeout
> > though and restore continues after a delay, but we'd like to eliminate
> > the delay.
> >
> > Two options to address this are to either:
> > 1) set suppress_vmdesc for the Xen machines to bypass the
> > should_send_vmdesc() check.
> > or
> > 2) just send the vmdesc data.
> >
> > Since vmdesc is just discarded, maybe #1 should be followed.
>
> #1 does sound simple!
>
> > If going with #2, qemu_save_device_state() needs to generate the
> > vmdesc data.  Looking at qemu_save_device_state() and
> > qemu_savevm_state_complete_precopy_non_iterable(), they are both very
> > similar and could seemingly be merged.  qmp_xen_save_devices_state()
> > could even leverage the bdrv_inactivate_all() call in
> > qemu_savevm_state_complete_precopy_non_iterable().
> >
> > That would make qemu_save_device_state() a little more heavyweight, which
> > could impact COLO.  I'm not sure how performance sensitive the COLO
> > code is, and I haven't measured anything.
>
> COLO snapshots are potentially quite sensitive; although we've got a
> load of other things we could do with speeding up, we could do without
> making them noticeably heavier.

qemu_savevm_state_complete_precopy_non_iterable() generates the vmdesc
json and just discards it if not needed.  How much overhead that adds
is the question.

Thanks,
Jason


