All of lore.kernel.org
 help / color / mirror / Atom feed
From: Cornelia Huck <cohuck@redhat.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: "Jason J. Herne" <jjherne@linux.ibm.com>,
	Fam Zheng <fam@euphon.net>, Thomas Huth <thuth@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	Peter Maydell <peter.maydell@linaro.org>,
	qemu-devel <qemu-devel@nongnu.org>,
	Claudio Fontana <cfontana@suse.de>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
Date: Mon, 13 Jul 2020 13:39:57 +0200	[thread overview]
Message-ID: <20200713133957.148716a7.cohuck@redhat.com> (raw)
In-Reply-To: <20200713110333.GE3122@work-vm>

On Mon, 13 Jul 2020 12:03:33 +0100
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Claudio Fontana (cfontana@suse.de) wrote:

> > The following workarounds hide the problem (make the test pass):
> > 
> > 1) always including the icount field in the (unrelated) timers field that are sent before in the migration stream (ie not applying the reproducer patch).
> > 
> > 2) increasing the IO_BUF_SIZE also hides the problem:
> > 
> > ----------------------cut--------------------------
> > diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> > index be21518c57..f81d1272eb 100644
> > --- a/migration/qemu-file.c
> > +++ b/migration/qemu-file.c
> > @@ -30,7 +30,7 @@
> >  #include "trace.h"
> >  #include "qapi/error.h"
> >  
> > -#define IO_BUF_SIZE 32768
> > +#define IO_BUF_SIZE 65536
> >  #define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 64)
> >  
> >  struct QEMUFile {
> > ----------------------cut--------------------------
> > 
> > 3) adding a qemu_fflush in hw/s390x/s390-skeys.c after EOS also "fixes" the problem:
> > 
> > ----------------------cut--------------------------
> > diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
> > index 1e036cc602..47c9a015af 100644
> > --- a/hw/s390x/s390-skeys.c
> > +++ b/hw/s390x/s390-skeys.c
> > @@ -252,6 +252,8 @@ static const TypeInfo qemu_s390_skeys_info = {
> >      .class_size    = sizeof(S390SKeysClass),
> >  };
> >  
> > +extern void qemu_fflush(QEMUFile *f);
> > +
> >  static void s390_storage_keys_save(QEMUFile *f, void *opaque)
> >  {
> >      S390SKeysState *ss = S390_SKEYS(opaque);
> > @@ -302,6 +304,7 @@ static void s390_storage_keys_save(QEMUFile *f, void *opaque)
> >      g_free(buf);
> >  end_stream:
> >      qemu_put_be64(f, eos);
> > +    qemu_fflush(f);
> >  }
> >  
> >  static int s390_storage_keys_load(QEMUFile *f, void *opaque, int version_id)
> > ----------------------cut--------------------------
> > 
> > Do any of you with better understanding of migration/, block and s390 have a suggestion on what could be the issue here,
> > and what could be the next step in the investigation?
> > 
> > Is the fact that migration/ram.c always does fflush after writing the EOS have any relevance here? why does it do it,
> > and should s390 code also follow the same pattern?  
> 
> I didn't think it was required.
> And qemu_put_buffer loops if needed and calls qemu_fflush internally.
> It's possible here that the storage key code is just the canary - the
> first thing that detects that the stream is invalid after it all goes
> wrong.

Yes, that seems possible. Especially as we end up with all zeroes after
the skeys section in the bad case -- it seems like weird problem to
have to be cured by an individual device. No good idea *what* actually
goes wrong, though.



  reply	other threads:[~2020-07-13 11:41 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-12 10:00 migration: broken snapshot saves appear on s390 when small fields in migration stream removed Claudio Fontana
2020-07-12 16:11 ` Paolo Bonzini
2020-07-13  9:11   ` Claudio Fontana
2020-07-14 14:29     ` Claudio Fontana
2020-07-14 14:35       ` Thomas Huth
2020-07-15 11:10         ` Claudio Fontana
2020-07-15 12:25           ` Claudio Fontana
2020-07-16 12:58           ` Claudio Fontana
2020-07-20 18:24             ` Claudio Fontana
2020-07-21  8:22               ` Claudio Fontana
2020-07-27 23:09                 ` Bruce Rogers
2020-07-28  8:15                   ` Vladimir Sementsov-Ogievskiy
2020-07-28  8:43                   ` Vladimir Sementsov-Ogievskiy
2020-07-28 13:23                     ` Bruce Rogers
2020-07-28 11:10                   ` Max Reitz
2020-07-28 11:27                     ` Vladimir Sementsov-Ogievskiy
2020-07-28 11:33                     ` Vladimir Sementsov-Ogievskiy
2020-07-28 11:35                       ` Paolo Bonzini
2020-07-28 11:45                         ` Max Reitz
2020-07-28 12:09                           ` Paolo Bonzini
2020-07-28 12:47                     ` Claudio Fontana
2020-07-13 11:03 ` Dr. David Alan Gilbert
2020-07-13 11:39   ` Cornelia Huck [this message]
2020-07-13 11:39   ` Claudio Fontana
2020-07-13 11:45     ` Claudio Fontana

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200713133957.148716a7.cohuck@redhat.com \
    --to=cohuck@redhat.com \
    --cc=cfontana@suse.de \
    --cc=dgilbert@redhat.com \
    --cc=fam@euphon.net \
    --cc=jjherne@linux.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.