From: Jason Wang <jasowang@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
Li Zhijian <lizhijian@cn.fujitsu.com>,
QEMU Developers <qemu-devel@nongnu.org>,
Peter Xu <peterx@redhat.com>, Zhang Chen <zhangckid@gmail.com>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] test-filter-mirror hangs
Date: Fri, 25 Jan 2019 15:12:51 +0800 [thread overview]
Message-ID: <1ee4b9df-bd9c-0af8-e4a4-dbce44485f50@redhat.com> (raw)
In-Reply-To: <20190124110146.GC7953@redhat.com>
On 2019/1/24 下午7:01, Daniel P. Berrangé wrote:
> On Thu, Jan 24, 2019 at 10:30:23AM +0000, Daniel P. Berrangé wrote:
>> On Thu, Jan 24, 2019 at 10:11:55AM +0000, Daniel P. Berrangé wrote:
>>> On Wed, Jan 23, 2019 at 07:53:46PM +0000, Dr. David Alan Gilbert wrote:
>>>> Do you mean the:
>>>>
>>>> /* send a qmp command to guarantee that 'connected' is setting to true. */
>>>> qmp_discard_response(qts, "{ 'execute' : 'query-status'}");
>>>>
>>>> why was that ever sufficient to know the socket was ready?
>>> This doesn't make any sense to me.
>>>
>>> There's the netdev socket, which has been passed in as a pre-opened socket
>>> FD, so that's guaranteed connected.
>>>
>>> There's the chardev server socket, to which we've just done a unix_connect()
>>> call to establish a connection. If unix_connect() has succeeded, then at least
>>> the socket is connected & ready for I/O from the test's side. This is a
>>> reliable stream socket, so even if the test sends data on the socket right away
>>> and QEMU isn't ready, it won't be lost. It'll be buffered and received by QEMU
>>> as soon as QEMU starts to monitor for incoming data on the socket.
>>>
>>> So I don't get what trying to wait for a "connected" state actually achieves.
>>> It feels like a mistaken attempt to paper over some other unknown flaw that
>>> just worked by some lucky side-effect.
>> Immediately after writing that, I see what's happened.
>>
>> The filter_redirector_receive_iov() method is triggered when QEMU reads
>> from the -netdev socket (which we passed in as an FD and immediately
>> write to).
>>
>> This method will discard all data, however, if the chr_out -chardev is
>> not in a connected state. So we do indeed have a race condition in this
>> test suite.
>>
>> In fact I'd say this filter-mirror object is racy by design even when
>> run in normal usage, if your chardev is a server mode with "nowait" set,
>> or is a client mode with "reconnect" set. It will simply discard data.
>>
>> We can fix the test suite by using FD passing for the -chardev
>> too, so we're guaranteed to be connected immediately. It might be
>> possible to remove "nowait" flag, but I'm not sure if that will cause
>> problems with the qtest handshake as it might block QEMU at startup
>> preventing qtest handshake from being performed.
>>
>> If we care about the race in real QEMU execution, then we must either
>> document that "nowait" or "reconnect" should never be used with
>> filter-mirror, or perhaps can make use of "qemu_chr_wait_connected"
>> to synchronize startup fo the filter-mirror object with the chardev
>> initialization. That could fix the test suite too
> Actually using qemu_chr_wait_connected would cause the test suite to
> hang, and it wouldn't fix data loss in the case where the chardev
> disconnected and then waited to connect again.
>
> I think the core problem here is that the netdev code assumes that the
> filters are always able to process packets. A proper solution would
> involve the filters having a "bool ready" state and callback to notify
> the netdev anytime this state changes.
>
> The filter-mirror should *not* report ready until the chardev has been
> opened.
>
> The netdevs should then not read packets off the wire unless all the
> regsitered filters are reporting that they are ready.
Netdev should know nothing about filters. And there will be still a race
between iterating all filters and handling disconnection if we did this.
> If a filter then
> transitions to not-ready, the netdev should again stop reading packets
> off the wire & queue any that it might have had in flight, until the
> filter becomes ready again.
I agree to queue the packets in this case.
Thanks
>
> Without this kind of setup the filters are inherantly racy in several
> of the possible -chardev configurations.
>
> In that sense the flaky test has actually done us a favour showing that
> the code is broken. It is not in fact the test that is broken, and though
> we could workaround it in the test that doens't fix the root cause problem.
>
> Regards,
> Daniel
next prev parent reply other threads:[~2019-01-25 7:13 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-11 15:01 [Qemu-devel] test-filter-mirror hangs Peter Maydell
2019-01-11 16:15 ` Dr. David Alan Gilbert
2019-01-14 16:33 ` Zhang Chen
2019-01-17 9:46 ` Jason Wang
2019-01-21 18:56 ` Peter Maydell
2019-01-21 20:01 ` Dr. David Alan Gilbert
2019-01-22 9:06 ` Peter Maydell
2019-01-23 2:43 ` Jason Wang
2019-01-23 19:53 ` Dr. David Alan Gilbert
2019-01-24 4:01 ` Jason Wang
2019-01-24 9:11 ` Dr. David Alan Gilbert
2019-01-24 9:51 ` Peter Xu
2019-01-25 3:55 ` Jason Wang
2019-01-25 7:14 ` Markus Armbruster
2019-01-25 3:45 ` Jason Wang
2019-01-24 9:47 ` Markus Armbruster
2019-01-25 3:56 ` Jason Wang
2019-01-25 7:12 ` Markus Armbruster
2019-01-25 8:12 ` Jason Wang
2019-01-25 8:44 ` Markus Armbruster
2019-01-24 10:11 ` Daniel P. Berrangé
2019-01-24 10:30 ` Daniel P. Berrangé
2019-01-24 11:01 ` Daniel P. Berrangé
2019-01-25 7:12 ` Jason Wang [this message]
2019-01-25 7:00 ` Jason Wang
2019-01-15 10:28 ` Peter Maydell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1ee4b9df-bd9c-0af8-e4a4-dbce44485f50@redhat.com \
--to=jasowang@redhat.com \
--cc=berrange@redhat.com \
--cc=dgilbert@redhat.com \
--cc=lizhijian@cn.fujitsu.com \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=zhangckid@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).