From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: "Thomas Huth" <thuth@redhat.com>,
	"Christian Borntraeger" <borntraeger@linux.ibm.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Ilya Leoshkevich" <iii@linux.ibm.com>,
	"Juan Quintela" <quintela@redhat.com>,
	s.reiter@proxmox.com, "QEMU Developers" <qemu-devel@nongnu.org>,
	"Peter Xu" <peterx@redhat.com>,
	"open list:S390 general arch..." <qemu-s390x@nongnu.org>,
	"Philippe Mathieu-Daudé" <philippe.mathieu.daude@gmail.com>,
	"Hanna Reitz" <hreitz@redhat.com>,
	f.ebner@proxmox.com, "Jinpu Wang" <jinpu.wang@ionos.com>
Subject: Re: multifd/tcp/zlib intermittent abort (was: Re: [PULL 00/18] migration queue)
Date: Tue, 15 Mar 2022 16:14:52 +0000
Message-ID: <YjC7fLmIvB+o95nA@work-vm>
In-Reply-To: <CAFEAcA_SUCgXCL3yE9e2H=ZUwn24uLvqSeTQVKuT+RUukOKrEQ@mail.gmail.com>

* Peter Maydell (peter.maydell@linaro.org) wrote:
> On Tue, 15 Mar 2022 at 14:39, Peter Maydell <peter.maydell@linaro.org> wrote:
> >
> > On Mon, 14 Mar 2022 at 19:44, Peter Maydell <peter.maydell@linaro.org> wrote:
> > > On Mon, 14 Mar 2022 at 18:58, Peter Maydell <peter.maydell@linaro.org> wrote:
> > > > I just hit the abort case, narrowing it down to the
> > > > /i386/migration/multifd/tcp/zlib case, which can hit this without
> > > > any other tests being run:
> > >
> > > > This test seems to fail fairly frequently. I'll try a bisect...
> > >
> > > On this s390 machine, this test has been intermittent since
> > > it was first added in commit 7ec2c2b3c1 ("multifd: Add zlib compression
> > > multifd support") in 2019.
> >
> > I have tried (on current master) runs of various of the other
> > migration tests, and:
> >  * /i386/migration/multifd/tcp/zstd completed 1170 iterations without
> >    failing
> >  * /i386/migration/precopy/tcp completed 4669 iterations without
> >    failing
> >  * /i386/migration/multifd/tcp/zlib fails usually within the first
> >    10 iterations (the most I ever saw it manage was 32)
> >
> > So whatever this is, it seems like it might be specific to the
> > zlib code somehow ?
> 
> Maybe we're running into this bug
> https://bugs.launchpad.net/ubuntu/+source/zlib/+bug/1961427
> ("zlib: compressBound() returns an incorrect result on z15") ?

The initial description, that compressBound() returns the wrong result,
doesn't feel like it would cause this abort; the report claims it would
trigger an error (I'm not sure how good we are at spotting that!). But
later in the description it says:

'Mistakes in dfltcc_free_window OF and especially DEFLATE_BOUND_COMPLEN,
  (incl. the bit definitions), may cause various and unforseen defects'

Certainly looks like a 'various and unforseen defect'.

Dave

> That bug report claims it doesn't affect focal, though, which
> is what we're running on this box (specifically, the zlib1g
> package is version 1:1.2.11.dfsg-2ubuntu1.2).
> 
> A run with DFLTCC=0 has made it past 60 iterations so far, which
> suggests that that does serve as a workaround for the bug.
> 
> thanks
> -- PMM
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
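
The iteration counts quoted above (failures within the first 10 runs, versus
more than 60 clean runs with DFLTCC=0) come from re-running a single test until
it fails. A minimal sketch of such a loop follows. The helper name
run_until_failure and the placeholder command in the usage comment are
assumptions for illustration, not anything from the QEMU tree; only the
DFLTCC=0 environment setting is taken from the thread.

```python
import os
import subprocess


def run_until_failure(cmd, max_iters, env_overrides=None):
    """Run cmd up to max_iters times, stopping at the first non-zero exit.

    Returns (completed_iterations, failing_returncode); the return code is
    None when every run exited with status 0. On POSIX, a process killed by
    a signal (for example an abort) reports a negative return code, which
    also counts as a failure here.
    """
    env = dict(os.environ)
    env.update(env_overrides or {})
    for completed in range(max_iters):
        result = subprocess.run(
            cmd,
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            return completed, result.returncode
    return max_iters, None


# Hypothetical usage; substitute the real test command for the placeholder:
#   test_cmd = ["/path/to/test-binary", "..."]
#   print(run_until_failure(test_cmd, 100))                   # default zlib path
#   print(run_until_failure(test_cmd, 100, {"DFLTCC": "0"}))  # workaround run
```

Comparing the two results on the same machine gives the same kind of evidence
as the runs described above. A clean run with DFLTCC=0 only suggests that the
hardware-accelerated deflate path is involved; it does not prove it.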


