qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Bin Meng <bmeng.cn@gmail.com>
To: qemu-devel@nongnu.org
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Bin Meng" <bin.meng@windriver.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>
Subject: [PATCH v5 16/18] io/channel-watch: Fix socket watch on Windows
Date: Thu,  6 Oct 2022 23:19:25 +0800	[thread overview]
Message-ID: <20221006151927.2079583-17-bmeng.cn@gmail.com> (raw)
In-Reply-To: <20221006151927.2079583-1-bmeng.cn@gmail.com>

From: Bin Meng <bin.meng@windriver.com>

Random failure was observed when running qtests on Windows due to
"Broken pipe" detected by qmp_fd_receive(). What happened is that
the qtest executable sends testing data over a socket to the QEMU
under test but no response is received. The errno of the recv()
call from the qtest executable indicates ETIMEOUT, due to the qmp
chardev's tcp_chr_read() is never called to receive testing data
hence no response is sent to the other side.

tcp_chr_read() is registered as the callback of the socket watch
GSource. The reason of the callback not being called by glib, is
that the source check fails to indicate the source is ready. There
are two socket watch sources created to monitor the same socket
event object from the char-socket backend in update_ioc_handlers().
During the source check phase, qio_channel_socket_source_check()
calls WSAEnumNetworkEvents() to discover occurrences of network
events for the indicated socket, clear internal network event records,
and reset the event object. Testing shows that if we don't reset the
event object by not passing the event handle to WSAEnumNetworkEvents()
the symptom goes away and qtest runs very stably.

It seems we don't need to call WSAEnumNetworkEvents() at all, as we
don't parse the result of WSANETWORKEVENTS returned from this API.
We use select() to poll the socket status. Fix this instability by
dropping the WSAEnumNetworkEvents() call.

Some side notes:

During the testing, I removed the following codes in update_ioc_handlers():

  remove_hup_source(s);
  s->hup_source = qio_channel_create_watch(s->ioc, G_IO_HUP);
  g_source_set_callback(s->hup_source, (GSourceFunc)tcp_chr_hup,
                        chr, NULL);
  g_source_attach(s->hup_source, chr->gcontext);

and such change also makes the symptom go away.

And if I moved the above codes to the beginning, before the call to
io_add_watch_poll(), the symptom also goes away.

It seems two sources watching on the same socket event object is
the key that leads to the instability. The order of adding a source
watch seems to also play a role but I can't explain why.
Hopefully a Windows and glib expert could explain this behavior.

Signed-off-by: Bin Meng <bin.meng@windriver.com>
---

(no changes since v1)

 io/channel-watch.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/io/channel-watch.c b/io/channel-watch.c
index 43d38494f7..ad7c568a84 100644
--- a/io/channel-watch.c
+++ b/io/channel-watch.c
@@ -115,17 +115,13 @@ static gboolean
 qio_channel_socket_source_check(GSource *source)
 {
     static struct timeval tv0;
-
     QIOChannelSocketSource *ssource = (QIOChannelSocketSource *)source;
-    WSANETWORKEVENTS ev;
     fd_set rfds, wfds, xfds;
 
     if (!ssource->condition) {
         return 0;
     }
 
-    WSAEnumNetworkEvents(ssource->socket, ssource->ioc->event, &ev);
-
     FD_ZERO(&rfds);
     FD_ZERO(&wfds);
     FD_ZERO(&xfds);
-- 
2.34.1



  parent reply	other threads:[~2022-10-06 16:24 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-06 15:19 [PATCH v5 00/18] tests/qtest: Enable running qtest on Windows Bin Meng
2022-10-06 15:19 ` [PATCH v5 01/18] semihosting/arm-compat-semi: Avoid using hardcoded /tmp Bin Meng
2022-10-06 15:19 ` [PATCH v5 02/18] tcg: " Bin Meng
2022-10-06 15:19 ` [PATCH v5 03/18] util/qemu-sockets: Use g_get_tmp_dir() to get the directory for temporary files Bin Meng
2022-10-06 15:19 ` [PATCH v5 04/18] tests/qtest: migration-test: Avoid using hardcoded /tmp Bin Meng
2022-10-25 23:44   ` Richard Henderson
2022-10-26  0:44     ` Bin Meng
2022-10-26  1:25       ` Richard Henderson
2022-10-06 15:19 ` [PATCH v5 05/18] block/vvfat: Unify the mkdir() call Bin Meng
2022-10-06 15:19 ` [PATCH v5 06/18] fsdev/virtfs-proxy-helper: Use g_mkdir() Bin Meng
2022-10-06 15:19 ` [PATCH v5 07/18] hw/usb: dev-mtp: " Bin Meng
2022-10-06 15:19 ` [PATCH v5 08/18] accel/qtest: Support qtest accelerator for Windows Bin Meng
2022-10-11 12:32   ` Thomas Huth
2022-10-11 12:37     ` Bin Meng
2022-10-06 15:19 ` [PATCH v5 09/18] tests/qtest: Use send/recv for socket communication Bin Meng
2022-10-18 16:47   ` Alex Bennée
2022-10-19  7:09     ` Bin Meng
2022-10-19  9:28       ` Thomas Huth
2022-10-18 17:09   ` Peter Maydell
2022-10-18 17:14     ` Daniel P. Berrangé
2022-10-06 15:19 ` [PATCH v5 10/18] tests/qtest: libqtest: Install signal handler via signal() Bin Meng
2022-10-11 14:14   ` Thomas Huth
2022-10-11 14:54     ` Bin Meng
2022-10-06 15:19 ` [PATCH v5 11/18] tests/qtest: Support libqtest to build and run on Windows Bin Meng
2022-10-12  7:43   ` Marc-André Lureau
2022-10-06 15:19 ` [PATCH v5 12/18] tests/qtest: migration-test: Make sure QEMU process "to" exited after migration is canceled Bin Meng
2022-10-12  7:57   ` Marc-André Lureau
2022-10-06 15:19 ` [PATCH v5 13/18] tests/qtest: libqtest: Correct the timeout unit of blocking receive calls for win32 Bin Meng
2022-10-06 15:19 ` [PATCH v5 14/18] io/channel-watch: Drop a superfluous '#ifdef WIN32' Bin Meng
2022-10-06 15:19 ` [PATCH v5 15/18] io/channel-watch: Drop the unnecessary cast Bin Meng
2022-10-06 15:19 ` Bin Meng [this message]
2022-10-06 15:19 ` [PATCH v5 17/18] .gitlab-ci.d/windows.yml: Increase the timeout to 90 minutes Bin Meng
2022-10-07  9:57   ` Marc-André Lureau
2022-10-08  1:11     ` Bin Meng
2022-10-11 14:41   ` Thomas Huth
2022-10-11 15:10     ` Bin Meng
2022-10-06 15:19 ` [PATCH v5 18/18] tests/qtest: Enable qtest build on Windows Bin Meng
2022-10-06 20:33 ` [PATCH v5 00/18] tests/qtest: Enable running qtest " Alex Bennée
2022-10-07  5:31   ` Bin Meng
2022-10-17 15:00     ` Bin Meng
2022-10-17 16:40       ` Christian Schoenebeck
2022-10-18 15:59       ` Alex Bennée
2022-10-19  7:07         ` Bin Meng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221006151927.2079583-17-bmeng.cn@gmail.com \
    --to=bmeng.cn@gmail.com \
    --cc=berrange@redhat.com \
    --cc=bin.meng@windriver.com \
    --cc=marcandre.lureau@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).