From: Paolo Bonzini
Date: Mon, 14 Jan 2013 10:09:40 +0100
Message-ID: <50F3CB54.6080506@redhat.com>
In-Reply-To: <50F3BEE2.5090802@gmail.com>
To: Liu Yuan
Cc: qemu-devel@nongnu.org, Stefan Hajnoczi
Subject: Re: [Qemu-devel] 100% CPU when sockfd is half-closed and unexpected behavior for qemu_co_send()

On 14/01/2013 09:16, Liu Yuan wrote:
> Hi List,
>   This problem can be reproduced by:
>   1. start a sheepdog cluster and create a volume 'test'*
>   2. attach 'test' to a bootable image like
>      $ qemu -hda image -drive if=virtio,file=sheepdog:test
>   3. pkill sheep # create a half-closed situation
>
> strace shows that QEMU is busy issuing pointless read()/write() calls
> after select() in os_host_main_loop_wait(). I have no knowledge of
> glib_select_xxx, so someone please help fix it.

The read()/write() calls are not made by os_host_main_loop_wait().  They
must come from qemu_co_send()/qemu_co_recv() after the handler has
reentered the coroutine.

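As a rough illustration of why the loop keeps waking up: once the peer
has closed its end, select() reports the socket as readable on every
call, and recv() returns zero bytes each time. A handler that does not
treat that as end-of-stream is re-entered on every iteration. The Python
sketch below uses a local socket pair, not QEMU or Sheepdog code, and
count_ready_wakeups is a made-up name. Whether this is exactly what
happens in the Sheepdog driver is an assumption.

```python
import select
import socket


def count_ready_wakeups(rounds=5):
    """Close one end of a socket pair, then poll the other end `rounds` times.

    Returns (number of times select() reported the socket readable,
             result of the last recv()).
    """
    ours, peer = socket.socketpair()
    peer.close()  # the peer goes away, similar in spirit to `pkill sheep`
    wakeups = 0
    last = None
    try:
        for _ in range(rounds):
            readable, _w, _x = select.select([ours], [], [], 1.0)
            if ours in readable:
                wakeups += 1
                # At end of stream recv() returns b"" immediately; the
                # socket stays "readable" until the caller closes it.
                last = ours.recv(4096)
    finally:
        ours.close()
    return wakeups, last
```

Every round wakes up and reads nothing, so a caller that only waits for
"readable" and never closes the descriptor will spin.
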
> Another unexpected behavior is that qemu_co_send() will send data
> successfully for the half-closed situation, even though the other end is
> completely down. I think the *expected* behavior is that we get notified
> by a HUP and close the affected sockfd, then qemu_co_send() will not
> send any data, then the caller of qemu_co_send() can handle the error case.

qemu_co_send() should get an EPIPE or similar error.  The first time it
will report a partial send, the second time it will report the error
directly to the caller.

Please check whether this is a bug in the Sheepdog driver.

Paolo

> I don't know whom I should Cc, so I only included Stefan.
>
> * You can easily start up a one-node sheepdog cluster as follows:
> $ git clone https://github.com/collie/sheepdog.git
> $ cd sheepdog
> $ apt-get install liburcu-dev
> $ ./autogen.sh; ./configure --disable-corosync; make
> # start up a one-node sheep cluster
> $ mkdir store; ./sheep/sheep store -c local
> $ collie/collie cluster format -c 1
> # create a volume named test
> $ collie/collie vdi create test 1G
>
> Thanks,
> Yuan
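
For reference, the "partial send first, error second" behavior
mentioned above can be sketched as follows. This is a Python
illustration of the general send-all pattern, not QEMU's actual
qemu_co_send() code. send_all, DyingPeer and the two demo functions are
hypothetical names, and DyingPeer is a stand-in that fails on purpose
after a fixed number of bytes.

```python
import errno
import socket


def send_all(send, data):
    """Call send(chunk) until all of `data` is written.

    If an error hits after some bytes went out, return the partial
    count; if nothing was sent at all, let the error propagate.
    """
    done = 0
    while done < len(data):
        try:
            done += send(data[done:])
        except OSError:
            if done == 0:
                raise
            break
    return done


class DyingPeer:
    """Hypothetical socket stand-in: accepts `budget` bytes, then EPIPE."""

    def __init__(self, budget):
        self.budget = budget

    def send(self, chunk):
        if self.budget == 0:
            raise BrokenPipeError(errno.EPIPE, "Broken pipe")
        n = min(self.budget, len(chunk))
        self.budget -= n
        return n


def demo_partial_then_error():
    """First call reports a short count, second call reports the errno."""
    peer = DyingPeer(budget=3)
    first = send_all(peer.send, b"abcdefgh")
    try:
        send_all(peer.send, b"abcdefgh")
        second = None
    except OSError as exc:
        second = exc.errno
    return first, second


def demo_real_socket():
    """Send on a real socket pair whose other end is already closed."""
    ours, peer = socket.socketpair()
    peer.close()
    try:
        ours.send(b"x")
        return None
    except OSError as exc:
        return exc.errno
    finally:
        ours.close()
```

A caller that ignores the short count from the first call, or never
issues a second one, will not see the error at all, which is one way a
driver could miss the dead peer.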