Message-ID: <1394542311.3981.37.camel@localhost.localdomain>
From: Marcel Apfelbaum
Date: Tue, 11 Mar 2014 14:51:51 +0200
In-Reply-To: <20140311124022.GA7761@stefanha-thinkpad.redhat.com>
References: <1394532550-21857-1-git-send-email-marcel.a@redhat.com> <1394532550-21857-2-git-send-email-marcel.a@redhat.com> <20140311124022.GA7761@stefanha-thinkpad.redhat.com>
Subject: Re: [Qemu-devel] [PATCH V2 1/2] tests/libqtest: Fix possible deadlock in qtest initialization
To: Stefan Hajnoczi
Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org, armbru@redhat.com, aliguori@amazon.com, afaerber@suse.de

On Tue, 2014-03-11 at 13:40 +0100, Stefan Hajnoczi wrote:
> On Tue, Mar 11, 2014 at 12:09:09PM +0200, Marcel Apfelbaum wrote:
> > @@ -78,12 +79,16 @@ static int socket_accept(int sock)
> >      struct sockaddr_un addr;
> >      socklen_t addrlen;
> >      int ret;
> > +    struct timeval timeout = { .tv_sec = SOCKET_TIMEOUT,
> > +                               .tv_usec = 0 };
> > +
> > +    setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, (void *)&timeout,
> > +               sizeof(timeout));
> >
> >      addrlen = sizeof(addr);
> >      do {
> >          ret = accept(sock, (struct sockaddr *)&addr, &addrlen);
> >      } while (ret == -1 && errno == EINTR);
> > -    g_assert_no_errno(ret);
> >      close(sock);
> >
> Did you mean to leave SO_RCVTIMEO set after this function completes?
Yes, I don't think it hurts. A 5 sec timeout should be effectively
infinite with Qemu running on the same machine. If you think
>
> > @@ -91,7 +96,7 @@ static int socket_accept(int sock)
> >
> >  static void kill_qemu(QTestState *s)
> >  {
> > -    if (s->qemu_pid != -1) {
> > +    if (s && s->qemu_pid != -1) {
> >          kill(s->qemu_pid, SIGTERM);
> >          waitpid(s->qemu_pid, NULL, 0);
> >      }
>
> This is a bug in libqtest.c, please don't silence the crash.
I didn't see it as hiding a crash. I thought that if there is any
problem during init, it is because Qemu failed to start, meaning
that you don't have a process to kill (Qemu exited already).
If all of the above happens, you don't have a global state.

Anyway, if you have a better way to deal with it, I have nothing
against it :)
Thanks,
Marcel

>
> kill_qemu() gets called from the SIGABRT signal handler but I forgot
> that global_qtest isn't initialized yet while qtest_init() executes.
>
> In other words, the cleanup is broken if we fail inside qtest_init().
> Can you drop this hunk and I'll send a patch to fix the underlying
> issue?
>
> > @@ -153,6 +158,8 @@ QTestState *qtest_init(const char *extra_args)
> >      g_free(socket_path);
> >      g_free(qmp_socket_path);
> >
> > +    g_assert(s->fd >= 0 && s->qmp_fd >= 0);
> > +
>
> We probably shouldn't socket_accept() s->qmp_fd if s->fd already failed.
> Otherwise we'll wait another 5 seconds for the timeout to expire:
Yes, I already had this chunk, I have no idea why I dropped it.
I'll restore it, thanks.

Thanks,
Marcel

>
>     s->fd = socket_accept(sock);
>     if (s->fd >= 0) {
>         s->qmp_fd = socket_accept(qmpsock);
>     }
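
For reference, here is a minimal standalone sketch of the two ideas
discussed above: an accept() bounded by SO_RCVTIMEO, and skipping the
second accept when the first one has already failed. It is not the
libqtest code. The helper names (listen_unix, accept_with_timeout,
accept_pair) are hypothetical, the timeout is passed as a parameter
instead of using SOCKET_TIMEOUT, and it assumes Linux, where
SO_RCVTIMEO also applies to accept(). Closing the unused second
listening socket on failure is my own addition.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create a listening UNIX socket bound to `path`. */
static int listen_unix(const char *path)
{
    struct sockaddr_un addr;
    int sock;

    assert(strlen(path) < sizeof(addr.sun_path));

    sock = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sock < 0) {
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);
    unlink(path); /* remove a stale socket file, if any */

    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(sock, 1) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}

/*
 * Accept one connection, giving up after `timeout_sec` seconds.
 * Assumption: Linux semantics, where SO_RCVTIMEO bounds accept().
 * The listening socket is closed in every case; returns the
 * connected fd, or -1 on error or timeout.
 */
static int accept_with_timeout(int sock, int timeout_sec)
{
    struct sockaddr_un addr;
    socklen_t addrlen;
    struct timeval timeout = { .tv_sec = timeout_sec, .tv_usec = 0 };
    int ret;

    setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout));

    do {
        addrlen = sizeof(addr);
        ret = accept(sock, (struct sockaddr *)&addr, &addrlen);
    } while (ret == -1 && errno == EINTR);
    close(sock);

    return ret;
}

/*
 * Accept on both sockets, but skip the second accept if the first
 * failed, so a dead peer costs one timeout instead of two.
 * Returns 0 if both connections were accepted, -1 otherwise.
 */
static int accept_pair(int sock, int qmpsock, int timeout_sec,
                       int *fd, int *qmp_fd)
{
    *qmp_fd = -1;
    *fd = accept_with_timeout(sock, timeout_sec);
    if (*fd >= 0) {
        *qmp_fd = accept_with_timeout(qmpsock, timeout_sec);
    } else {
        close(qmpsock); /* not in the thread; avoids leaking the fd */
    }
    return (*fd >= 0 && *qmp_fd >= 0) ? 0 : -1;
}
```

With no peer connecting, accept_pair() should fail after a single
timeout, and the caller can then decide how to report the error
instead of blocking forever.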