qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Jason Wang <jasowang@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	Li Zhijian <lizhijian@cn.fujitsu.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Peter Xu <peterx@redhat.com>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Zhang Chen <zhangckid@gmail.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] test-filter-mirror hangs
Date: Fri, 25 Jan 2019 16:12:24 +0800	[thread overview]
Message-ID: <c5c6c8cb-0cfd-323e-a291-f098ea4b8ec2@redhat.com> (raw)
In-Reply-To: <87a7jptdla.fsf@dusky.pond.sub.org>


On 2019/1/25 下午3:12, Markus Armbruster wrote:
> Jason Wang <jasowang@redhat.com> writes:
>
>> On 2019/1/24 下午5:47, Markus Armbruster wrote:
>>> Please cc: me on QMP issues.
>>
>> Ok.
>>
>>
>>> Jason Wang <jasowang@redhat.com> writes:
>>>
>>>> On 2019/1/24 上午3:53, Dr. David Alan Gilbert wrote:
>>>>> * Jason Wang (jasowang@redhat.com) wrote:
>>>>>> On 2019/1/22 上午2:56, Peter Maydell wrote:
>>>>>>> On Thu, 17 Jan 2019 at 09:46, Jason Wang<jasowang@redhat.com>  wrote:
>>>>>>>> On 2019/1/15 上午12:33, Zhang Chen wrote:
>>>>>>>>> On Sat, Jan 12, 2019 at 12:15 AM Dr. David Alan Gilbert
>>>>>>>>> <dgilbert@redhat.com  <mailto:dgilbert@redhat.com>> wrote:
>>>>>>>>>
>>>>>>>>>         * Peter Maydell (peter.maydell@linaro.org
>>>>>>>>>         <mailto:peter.maydell@linaro.org>) wrote:
>>>>>>>>>         > Recently I've noticed that test-filter-mirror has been hanging
>>>>>>>>>         > intermittently, typically when run on some other TCG architecture.
>>>>>>>>>         > In the instance I've just looked at, this was with s390x guest on
>>>>>>>>>         > x86-64 host, though I've also seen it on other host archs and
>>>>>>>>>         > perhaps with other guests.
>>>>>>>>>
>>>>>>>>>         Watch out to see if you really do see it for other guests;
>>>>>>>>>         it carefully avoids using virtio-net to avoid vhost; but on s390x it
>>>>>>>>>         uses virtio-net-ccw - could that hit the vhost it was trying to avoid?
>>>>>>>>>
>>>>>>>>>         > Below is a backtrace, though it seems to be pretty unhelpful.
>>>>>>>>>         > Anybody got any theories ? Does the mirror test rely on dirty
>>>>>>>>>         > memory bitmaps like the migration test (which also hangs
>>>>>>>>>         > occasionally with TCG due to some bug I'm sure we've investigated
>>>>>>>>>         > in the past) ?
>>>>>>>>>
>>>>>>>>>         I don't think it relies on the CPU at all.
>>>>>>>>>      I have no idea about this currently, but Jason and I designed the
>>>>>>>>> test case.
>>>>>>>>> Add Jason: Have any comments about this ?
>>>>>>>> I can't reproduce this locally with s390x-softmmu. It looks to me the
>>>>>>>> test should be independent to any kinds of emulation. It should pass
>>>>>>>> when mainloop work.
>>>>>>> I've just seen a hang with ppc64 guest on s390x host, so it is
>>>>>>> indeed not specific to s390x guest (and so not specific to
>>>>>>> virtio-net either, since the ppc64 guest setup uses e1000).
>>>>>>>
>>>>>>> thanks
>>>>>>> -- PMM
>>>>>> Finally reproduced locally after hundreds (sometimes thousands) times of
>>>>>> running.
>>>>>>
>>>>>> Bisection points to OOB monitor[1].
>>>>>>
>>>>>> It looks to me after OOB is used unconditionally we lose a barrier to make
>>>>>> sure socket is connected before sending packets in test-filter-mirror.c. Is
>>>>>> there any other similar and simple thing that we could do to kick the
>>>>>> mainloop?
>>>>> Do you mean the:
>>>>>
>>>>>        /* send a qmp command to guarantee that 'connected' is setting to true. */
>>>>>        qmp_discard_response(qts, "{ 'execute' : 'query-status'}");
>>>> Yes.
>>>>
>>>>
>>>>> why was that ever sufficient to know the socket was ready?
>>>> It was suggested by Fam, I don't remember the details. Can we make
>>>> sure all pending events has been processed (UNIX socket was set to
>>>> connected) after query-status is returned with an non OOB monitor?
>>> I'm afraid I lack context.  Which socket are you talking about?  The
>>> test has at least the QMP socket, the send_sock[], and recv_sock.  What
>>> exactly are you trying to accomplish?
>>
>> I mean recv_sock. If mirror tries to send a packet to it before its
>> is_connected is set to true, packet will be dropped.
> So the *socket* is connected (in the TCP sense),


UNIX domain socket actually in the case of this test.


> but something else
> (whatever owns is_connected) is not.  Can you point me to where
> is_connected is set to true?


Sorry, should be "connected". It was set in tcp_chr_connect(). So if 
filter want to send a packet to socket chardev before tcp_chr_connect() 
is called, the packet will be dropped silently by tcp_chr_write(). This 
will fail this unit-test.


>
>>> By the way, mkstemp(sock_path) followed by unix_connect(sock_path, NULL)
>>> looks rather fishy.  Why create a temporary file only to create a Unix
>>> domain socket right over it?
>>
>> I vaguely remember passing fd created by unix domain socket doesn't
>> work when the test is introduced. So my understanding is the author
>> needs a way to create a unique file name which will be used b Unix
>> domain socket at that time.
> We should really, really, really improve the test harness to run each
> test program in its very own temporary directory.  Then tests can simply
> create files with fixed names, and leave cleanup to the test harness.


Agree, but for this test, since passing fd works now. I tend to using 
socketpair().


>>>    Why is ignoring errors a good idea?
>>
>> I don't get, which error is missed, it checks the return value of both
>> mkstemp() and unix_connect().
> Now I neglected to provide enough context for you :)
>
> I read
>
>      recv_sock = unix_connect(sock_path, NULL);
>
> and immediately went "why are errors ignored".  If I had read on (as I
> should've), I would've seen the are not:
>
>      g_assert_cmpint(recv_sock, !=, -1);
>
> Sorry for the noise.
>
> I'd replace both lines by
>
>      recv_sock = unix_connect(sock_path, &error_abort);
>
> Reports the actual error, which is an obvious improvement, with the
> location pointing to the failing spot within unix_connect().  To find
> where unix_connect() was called, you need to examine the stack
> backtrace.  Strictly more information, but your actual mileage may vary.
>

I see.

Thanks.

  reply	other threads:[~2019-01-25  8:28 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-11 15:01 [Qemu-devel] test-filter-mirror hangs Peter Maydell
2019-01-11 16:15 ` Dr. David Alan Gilbert
2019-01-14 16:33   ` Zhang Chen
2019-01-17  9:46     ` Jason Wang
2019-01-21 18:56       ` Peter Maydell
2019-01-21 20:01         ` Dr. David Alan Gilbert
2019-01-22  9:06           ` Peter Maydell
2019-01-23  2:43         ` Jason Wang
2019-01-23 19:53           ` Dr. David Alan Gilbert
2019-01-24  4:01             ` Jason Wang
2019-01-24  9:11               ` Dr. David Alan Gilbert
2019-01-24  9:51                 ` Peter Xu
2019-01-25  3:55                   ` Jason Wang
2019-01-25  7:14                     ` Markus Armbruster
2019-01-25  3:45                 ` Jason Wang
2019-01-24  9:47               ` Markus Armbruster
2019-01-25  3:56                 ` Jason Wang
2019-01-25  7:12                   ` Markus Armbruster
2019-01-25  8:12                     ` Jason Wang [this message]
2019-01-25  8:44                       ` Markus Armbruster
2019-01-24 10:11             ` Daniel P. Berrangé
2019-01-24 10:30               ` Daniel P. Berrangé
2019-01-24 11:01                 ` Daniel P. Berrangé
2019-01-25  7:12                   ` Jason Wang
2019-01-25  7:00                 ` Jason Wang
2019-01-15 10:28   ` Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c5c6c8cb-0cfd-323e-a291-f098ea4b8ec2@redhat.com \
    --to=jasowang@redhat.com \
    --cc=armbru@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=lizhijian@cn.fujitsu.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=zhangckid@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).