From: Jason Wang <jasowang@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
Li Zhijian <lizhijian@cn.fujitsu.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
Peter Xu <peterx@redhat.com>,
QEMU Developers <qemu-devel@nongnu.org>,
Zhang Chen <zhangckid@gmail.com>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] test-filter-mirror hangs
Date: Fri, 25 Jan 2019 16:12:24 +0800
Message-ID: <c5c6c8cb-0cfd-323e-a291-f098ea4b8ec2@redhat.com>
In-Reply-To: <87a7jptdla.fsf@dusky.pond.sub.org>
On 2019/1/25 3:12 PM, Markus Armbruster wrote:
> Jason Wang <jasowang@redhat.com> writes:
>
>> On 2019/1/24 5:47 PM, Markus Armbruster wrote:
>>> Please cc: me on QMP issues.
>>
>> Ok.
>>
>>
>>> Jason Wang <jasowang@redhat.com> writes:
>>>
>>>> On 2019/1/24 3:53 AM, Dr. David Alan Gilbert wrote:
>>>>> * Jason Wang (jasowang@redhat.com) wrote:
>>>>>> On 2019/1/22 2:56 AM, Peter Maydell wrote:
>>>>>>> On Thu, 17 Jan 2019 at 09:46, Jason Wang<jasowang@redhat.com> wrote:
>>>>>>>> On 2019/1/15 12:33 AM, Zhang Chen wrote:
>>>>>>>>> On Sat, Jan 12, 2019 at 12:15 AM Dr. David Alan Gilbert
>>>>>>>>> <dgilbert@redhat.com <mailto:dgilbert@redhat.com>> wrote:
>>>>>>>>>
>>>>>>>>> * Peter Maydell (peter.maydell@linaro.org
>>>>>>>>> <mailto:peter.maydell@linaro.org>) wrote:
>>>>>>>>> > Recently I've noticed that test-filter-mirror has been hanging
>>>>>>>>> > intermittently, typically when run on some other TCG architecture.
>>>>>>>>> > In the instance I've just looked at, this was with s390x guest on
>>>>>>>>> > x86-64 host, though I've also seen it on other host archs and
>>>>>>>>> > perhaps with other guests.
>>>>>>>>>
>>>>>>>>> Watch out to see if you really do see it for other guests;
>>>>>>>>> it carefully avoids using virtio-net to avoid vhost; but on s390x it
>>>>>>>>> uses virtio-net-ccw - could that hit the vhost it was trying to avoid?
>>>>>>>>>
>>>>>>>>> > Below is a backtrace, though it seems to be pretty unhelpful.
>>>>>>>>> > Anybody got any theories ? Does the mirror test rely on dirty
>>>>>>>>> > memory bitmaps like the migration test (which also hangs
>>>>>>>>> > occasionally with TCG due to some bug I'm sure we've investigated
>>>>>>>>> > in the past) ?
>>>>>>>>>
>>>>>>>>> I don't think it relies on the CPU at all.
>>>>>>>>> I have no idea about this currently, but Jason and I designed the
>>>>>>>>> test case.
>>>>>>>>> Add Jason: Have any comments about this ?
>>>>>>>> I can't reproduce this locally with s390x-softmmu. It looks to me
>>>>>>>> like the test should be independent of any kind of emulation. It
>>>>>>>> should pass as long as the mainloop works.
>>>>>>> I've just seen a hang with ppc64 guest on s390x host, so it is
>>>>>>> indeed not specific to s390x guest (and so not specific to
>>>>>>> virtio-net either, since the ppc64 guest setup uses e1000).
>>>>>>>
>>>>>>> thanks
>>>>>>> -- PMM
>>>>>> Finally reproduced locally after hundreds (sometimes thousands) of
>>>>>> runs.
>>>>>>
>>>>>> Bisection points to OOB monitor[1].
>>>>>>
>>>>>> It looks to me like, once OOB is used unconditionally, we lose the
>>>>>> barrier that made sure the socket is connected before sending packets
>>>>>> in test-filter-mirror.c. Is there any other similarly simple thing
>>>>>> that we could do to kick the mainloop?
>>>>> Do you mean the:
>>>>>
>>>>> /* send a qmp command to guarantee that 'connected' is setting to true. */
>>>>> qmp_discard_response(qts, "{ 'execute' : 'query-status'}");
>>>> Yes.
>>>>
>>>>
>>>>> why was that ever sufficient to know the socket was ready?
>>>> It was suggested by Fam, I don't remember the details. Can we make
>>>> sure all pending events have been processed (UNIX socket was set to
>>>> connected) after query-status has returned with a non-OOB monitor?
>>> I'm afraid I lack context. Which socket are you talking about? The
>>> test has at least the QMP socket, the send_sock[], and recv_sock. What
>>> exactly are you trying to accomplish?
>>
>> I mean recv_sock. If the mirror tries to send a packet to it before its
>> is_connected is set to true, the packet will be dropped.
> So the *socket* is connected (in the TCP sense),
Actually a UNIX domain socket in the case of this test.
> but something else
> (whatever owns is_connected) is not. Can you point me to where
> is_connected is set to true?
Sorry, it should be "connected". It is set in tcp_chr_connect(). So if
the filter tries to send a packet to the socket chardev before
tcp_chr_connect() is called, the packet is silently dropped by
tcp_chr_write(), which makes this unit test fail.
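To make the failure mode concrete, here is a toy model in C. It is not
QEMU's code: struct toy_chardev, toy_chr_connect() and toy_chr_write()
are hypothetical names, and reporting the dropped bytes as "written" is
an assumption of this sketch rather than a statement about the real
tcp_chr_write().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket chardev; not QEMU's real structure. */
struct toy_chardev {
    bool connected;    /* set once the connect handler has run */
    size_t delivered;  /* bytes that were actually forwarded */
};

/* Models the connect handler that flips the flag. */
void toy_chr_connect(struct toy_chardev *chr)
{
    assert(chr != NULL);
    chr->connected = true;
}

/*
 * Models a write path that discards data while the flag is still false.
 * Assumption: the caller is told that len bytes were consumed either way,
 * so the drop is invisible to the sender.
 */
size_t toy_chr_write(struct toy_chardev *chr, const void *buf, size_t len)
{
    assert(chr != NULL);
    (void)buf;
    if (!chr->connected) {
        return len;    /* looks like success, but nothing was delivered */
    }
    chr->delivered += len;
    return len;
}
```

A write issued before toy_chr_connect() returns the full length yet
leaves delivered at 0, which is the kind of silent loss the test would
only notice by waiting forever on recv_sock.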
>
>>> By the way, mkstemp(sock_path) followed by unix_connect(sock_path, NULL)
>>> looks rather fishy. Why create a temporary file only to create a Unix
>>> domain socket right over it?
>>
>> I vaguely remember that passing an fd created by a unix domain socket
>> didn't work when the test was introduced. So my understanding is that
>> the author needed a way to create a unique file name to be used by the
>> Unix domain socket at that time.
> We should really, really, really improve the test harness to run each
> test program in its very own temporary directory. Then tests can simply
> create files with fixed names, and leave cleanup to the test harness.
Agreed, but for this test, since passing an fd works now, I tend towards
using socketpair().
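A minimal sketch of why socketpair() is attractive here, assuming plain
POSIX and nothing QEMU-specific: both descriptors are already connected
when the call returns, and no file name is involved, so there is nothing
to create with mkstemp() and nothing to clean up. The helper name
socketpair_roundtrip() is made up for illustration; how one end would be
handed to QEMU is left out.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send a short message through one end of a socketpair and read it back
 * from the other end. Returns the number of bytes received, or -1 on error.
 */
ssize_t socketpair_roundtrip(const char *msg, char *out, size_t outlen)
{
    int fds[2];
    ssize_t n;
    size_t len = strlen(msg);

    assert(outlen > len);    /* room for the payload plus a NUL */

    /* Both ends are connected as soon as socketpair() succeeds. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
        return -1;
    }

    n = write(fds[0], msg, len);
    if (n == (ssize_t)len) {
        n = read(fds[1], out, outlen - 1);
    } else {
        n = -1;
    }
    if (n >= 0) {
        out[n] = '\0';
    }

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Note that this only removes the temporary path; it does not by itself
guarantee that the chardev has marked the fd as "connected" before the
filter sends.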
>>> Why is ignoring errors a good idea?
>>
>> I don't get it: which error is missed? It checks the return value of
>> both mkstemp() and unix_connect().
> Now I neglected to provide enough context for you :)
>
> I read
>
> recv_sock = unix_connect(sock_path, NULL);
>
> and immediately went "why are errors ignored". If I had read on (as I
> should've), I would've seen they are not:
>
> g_assert_cmpint(recv_sock, !=, -1);
>
> Sorry for the noise.
>
> I'd replace both lines by
>
> recv_sock = unix_connect(sock_path, &error_abort);
>
> Reports the actual error, which is an obvious improvement, with the
> location pointing to the failing spot within unix_connect(). To find
> where unix_connect() was called, you need to examine the stack
> backtrace. Strictly more information, but your actual mileage may vary.
>
I see.
Thanks.