From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	Zhang Chen <zhangckid@gmail.com>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Li Zhijian <lizhijian@cn.fujitsu.com>,
	Paolo Bonzini <pbonzini@redhat.com>, Peter Xu <peterx@redhat.com>
Subject: Re: [Qemu-devel] test-filter-mirror hangs
Date: Wed, 23 Jan 2019 19:53:46 +0000
Message-ID: <20190123195345.GI2193@work-vm>
In-Reply-To: <64411f3f-0071-fc94-945c-af16cf5edc77@redhat.com>

* Jason Wang (jasowang@redhat.com) wrote:
> 
> On 2019/1/22 2:56 AM, Peter Maydell wrote:
> > On Thu, 17 Jan 2019 at 09:46, Jason Wang <jasowang@redhat.com> wrote:
> > > 
> > > On 2019/1/15 12:33 AM, Zhang Chen wrote:
> > > > 
> > > > On Sat, Jan 12, 2019 at 12:15 AM Dr. David Alan Gilbert
> > > > <dgilbert@redhat.com> wrote:
> > > > 
> > > >      * Peter Maydell (peter.maydell@linaro.org) wrote:
> > > >      > Recently I've noticed that test-filter-mirror has been hanging
> > > >      > intermittently, typically when run on some other TCG architecture.
> > > >      > In the instance I've just looked at, this was with s390x guest on
> > > >      > x86-64 host, though I've also seen it on other host archs and
> > > >      > perhaps with other guests.
> > > > 
> > > >      Watch out to see if you really do see it for other guests;
> > > >      it carefully avoids using virtio-net to avoid vhost; but on s390x it
> > > >      uses virtio-net-ccw - could that hit the vhost it was trying to avoid?
> > > > 
> > > >      > Below is a backtrace, though it seems to be pretty unhelpful.
> > > >      > Anybody got any theories ? Does the mirror test rely on dirty
> > > >      > memory bitmaps like the migration test (which also hangs
> > > >      > occasionally with TCG due to some bug I'm sure we've investigated
> > > >      > in the past) ?
> > > > 
> > > >      I don't think it relies on the CPU at all.
> > > > 
> > > > I have no idea about this at the moment, but Jason and I designed the
> > > > test case.
> > > > Adding Jason: do you have any comments on this?
> > > 
> > > I can't reproduce this locally with s390x-softmmu. It looks to me like
> > > the test should be independent of any kind of emulation; it should pass
> > > as long as the main loop works.
> > I've just seen a hang with ppc64 guest on s390x host, so it is
> > indeed not specific to s390x guest (and so not specific to
> > virtio-net either, since the ppc64 guest setup uses e1000).
> > 
> > thanks
> > -- PMM
> 
> 
> Finally reproduced locally after hundreds (sometimes thousands) of runs.
> 
> Bisection points to the OOB monitor change [1].
> 
> It looks to me that, now that OOB is used unconditionally, we have lost the
> barrier that made sure the socket is connected before test-filter-mirror.c
> sends packets. Is there another similarly simple thing we could do to kick
> the main loop?

Do you mean this:

    /* send a qmp command to guarantee that 'connected' is setting to true. */
    qmp_discard_response(qts, "{ 'execute' : 'query-status'}");

Why was that ever sufficient to know the socket was ready?
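
To illustrate the question, here is a minimal sketch (not the QEMU test
code) of an explicit readiness barrier, as opposed to inferring readiness
from an unrelated QMP round trip. The helper names wait_ready() and
send_after_ready(), the one-byte "R" token and the socketpair() setup are
all assumptions made for this example:

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for a one-byte "ready" token.
 * Returns 1 if the token arrived, 0 on timeout, -1 on error.
 */
static int wait_ready(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    char token;
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc <= 0) {
        return rc;              /* 0: timed out, -1: poll() failed */
    }
    return read(fd, &token, 1) == 1 ? 1 : -1;
}

/*
 * Hypothetical demo: the peer announces readiness explicitly and the
 * sender transmits only after it has seen that announcement.
 * Returns the number of payload bytes sent, or -1 if the barrier was
 * never passed.
 */
static int send_after_ready(int peer_is_ready)
{
    int sv[2];
    int sent = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    if (peer_is_ready && write(sv[1], "R", 1) != 1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    /* The barrier: do not send until the peer has said it is ready. */
    if (wait_ready(sv[0], 100) == 1) {
        sent = (int)write(sv[0], "packet", 6);
    }
    close(sv[0]);
    close(sv[1]);
    return sent;
}
```

Here send_after_ready(1) sends the 6-byte payload, while
send_after_ready(0) times out after 100 ms and sends nothing. The point is
only that the sender waits on a signal that comes from the peer itself,
rather than on a side effect of some other code path.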

Dave

> Thanks
> 
> [1]
> 
> commit 8258292e18c39480b64eba9f3551ab772ce29b5d (HEAD, refs/bisect/bad)
> Author: Peter Xu <peterx@redhat.com>
> Date:   Tue Oct 9 14:27:15 2018 +0800
> 
>     monitor: Remove "x-oob", offer capability "oob" unconditionally
> 
>     Out-of-band command execution was introduced in commit cf869d53172.
>     Unfortunately, we ran into a regression, and had to turn it into an
>     experimental option for 2.12 (commit be933ffc23).
> 
>     http://lists.gnu.org/archive/html/qemu-devel/2018-03/msg06231.html
> 
>     The regression has since been fixed (commit 951702f39c7 "monitor: bind
>     dispatch bh to iohandler context").  A thorough re-review of OOB
>     commands led to a few more issues, which have also been addressed.
> 
>     This patch partly reverts be933ffc23 (monitor: new parameter "x-oob"),
>     and makes QMP monitors again offer capability "oob" whenever they can
>     provide it, i.e. when the monitor's character device is capable of
>     running in an I/O thread.
> 
>     Some trivial touch-up in the test code is required to make sure qmp-test
>     won't break.
> 
>     Reviewed-by: Markus Armbruster <armbru@redhat.com>
>     Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>     Signed-off-by: Peter Xu <peterx@redhat.com>
>     Message-Id: <20181009062718.1914-4-peterx@redhat.com>
>     [Conflict with "monitor: check if chardev can switch gcontext for OOB"
>     resolved, commit message updated]
>     Signed-off-by: Markus Armbruster <armbru@redhat.com>
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
