From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:49435)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1V5Ek4-0003rk-LA for qemu-devel@nongnu.org;
 Fri, 02 Aug 2013 08:46:07 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1V5Ejx-0007D8-2e for qemu-devel@nongnu.org;
 Fri, 02 Aug 2013 08:46:00 -0400
Received: from goliath.siemens.de ([192.35.17.28]:31948)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1V5Ejw-0007CW-Px for qemu-devel@nongnu.org;
 Fri, 02 Aug 2013 08:45:53 -0400
Message-ID: <51FBA9FE.9050505@siemens.com>
Date: Fri, 02 Aug 2013 14:45:50 +0200
From: Jan Kiszka
MIME-Version: 1.0
References: <51FA97CA.7050905@siemens.com>
 <20130802114652.GA342@stefanha-thinkpad.redhat.com>
In-Reply-To: <20130802114652.GA342@stefanha-thinkpad.redhat.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] net/tap.c: Possibly a way to stall tap input
To: Stefan Hajnoczi
Cc: qemu-devel

On 2013-08-02 13:46, Stefan Hajnoczi wrote:
> On Thu, Aug 01, 2013 at 07:15:54PM +0200, Jan Kiszka wrote:
>> I was digging into the involved code and found something fishy:
>>
>> net/tap.c:
>> static void tap_send(void *opaque)
>> {
>>     ...
>>         size = qemu_send_packet_async(&s->nc, buf, size,
>>                                       tap_send_completed);
>>         if (size == 0) {
>>             tap_read_poll(s, false);
>>         }
>>
>> So, if tap_send is registered for the mainloop polling (ie. can_receive
>> returned true before starting to poll) but qemu_send_packet_async
>> returns 0 now as qemu_can_send_packet/can_receive happens to report
>> false in the meantime, we will disable read polling. If also write
>> polling is off, the fd will be completely removed from the iohandler
>> list. But even if write polling remains on, I wonder what should bring
>> read polling back?
>
> This behavior seems fine to me.  Once the peer (pcnet) is able to
> receive again it must flush the queue, this will re-enable
> tap_read_poll().
>
> Can you explain a bit more why this would be a problem?

The problem is that I don't see at all what will call
tap_read_poll(s, 1), neither in theory nor in reality. As long as the
real test case is out of reach, I tried to emulate the faulty behaviour
by letting tap_can_send always return 1. Result: reception stalls during
boot as even qemu_flush_queued_packets cannot get it running again once
tap_read_poll(s, 0) was called.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
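
The question in this exchange is which code path turns read polling back on after tap_send has turned it off. The following is a minimal standalone sketch of that handshake, not QEMU code: every toy_* type and function is a hypothetical stand-in, the queue holds a single packet, and it assumes that flushing the queue runs the completion callback that was stored when the packet was queued. Under that assumption, read polling comes back only when something actually performs the flush. The peer becoming ready to receive is not enough on its own.

```c
/* Toy model of the read-polling handshake. NOT QEMU code:
 * all names below are hypothetical and chosen for illustration. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*toy_sent_cb)(void *opaque);

struct toy_tap {
    bool read_poll;          /* is the fd registered for read polling? */
};

struct toy_peer {
    bool can_receive;        /* what the peer's can_receive reports */
    bool has_queued;         /* one-slot packet queue */
    toy_sent_cb queued_cb;   /* completion callback stored with the packet */
    void *queued_opaque;
    int delivered;           /* packets that reached the peer */
};

/* Completion callback: stands in for re-enabling read polling. */
static void toy_send_completed(void *opaque)
{
    struct toy_tap *t = opaque;
    t->read_poll = true;
}

/* Returns 1 if the packet was delivered at once, 0 if it was queued. */
static int toy_send_async(struct toy_peer *p, toy_sent_cb cb, void *opaque)
{
    if (p->can_receive) {
        p->delivered++;
        return 1;
    }
    assert(!p->has_queued);  /* the sender stops polling after one queued packet */
    p->has_queued = true;
    p->queued_cb = cb;
    p->queued_opaque = opaque;
    return 0;
}

/* Read handler: on a queued packet, stop polling the fd for input. */
static void toy_tap_send(struct toy_tap *t, struct toy_peer *p)
{
    if (!t->read_poll) {
        return;              /* handler is not registered, nothing is read */
    }
    if (toy_send_async(p, toy_send_completed, t) == 0) {
        t->read_poll = false;
    }
}

/* Flush: deliver the queued packet and run its completion callback.
 * Returns true if a packet was flushed. */
static bool toy_flush(struct toy_peer *p)
{
    if (!p->has_queued || !p->can_receive) {
        return false;
    }
    p->has_queued = false;
    p->delivered++;
    if (p->queued_cb) {
        p->queued_cb(p->queued_opaque);
    }
    return true;
}
```

To try it, start with read_poll set and can_receive clear, call toy_tap_send() once, then set can_receive. In this model read_poll stays false until toy_flush() is called. Whether the real code paths behave the same way is exactly what the thread is trying to establish.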