From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:35694)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1V21Dd-0003TI-QD
    for qemu-devel@nongnu.org; Wed, 24 Jul 2013 11:43:16 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from )
    id 1V21Dc-0004NZ-6e
    for qemu-devel@nongnu.org; Wed, 24 Jul 2013 11:43:13 -0400
Received: from mail-pd0-x232.google.com ([2607:f8b0:400e:c02::232]:33999)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1V21Db-0004NN-VO
    for qemu-devel@nongnu.org; Wed, 24 Jul 2013 11:43:12 -0400
Received: by mail-pd0-f178.google.com with SMTP id w11so526847pde.37
    for ; Wed, 24 Jul 2013 08:43:11 -0700 (PDT)
Date: Wed, 24 Jul 2013 23:42:49 +0800
From: Liu Yuan
Message-ID: <20130724154249.GA16838@ubuntu-precise>
References: <1374652593-7242-1-git-send-email-morita.kazutaka@lab.ntt.co.jp>
    <20130724082830.GA11913@ubuntu-precise>
    <87ip00gwc6.wl%morita.kazutaka@lab.ntt.co.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <87ip00gwc6.wl%morita.kazutaka@lab.ntt.co.jp>
Subject: Re: [Qemu-devel] [sheepdog] [PATCH v2 0/9] sheepdog: reconnect server after connection failure
To: MORITA Kazutaka
Cc: Kevin Wolf, Paolo Bonzini, sheepdog@lists.wpkg.org, qemu-devel@nongnu.org, Stefan Hajnoczi

On Wed, Jul 24, 2013 at 06:07:21PM +0900, MORITA Kazutaka wrote:
> At Wed, 24 Jul 2013 16:28:30 +0800,
> Liu Yuan wrote:
> >
> > On Wed, Jul 24, 2013 at 04:56:24PM +0900, MORITA Kazutaka wrote:
> > > Currently, if a sheepdog server exits, all the connecting VMs need to
> > > be restarted. This series implements a feature to reconnect the
> > > server, and enables us to do online sheepdog upgrade and avoid
> > > restarting VMs when sheepdog servers crash unexpectedly.
> > >
> >
> > It doesn't work on my test.
> > I tried start linux-0.2.img stored in sheepdog
> > cluster and then
> >
> > 1. did some buffered writes
> > 2. restart sheep that this QEMU VM connected to.
> > 3. $ sync
> >
> > I got following error:
> >
> > $ ../qemu/x86_64-softmmu/qemu-system-x86_64 --enable-kvm -m 1024 -hda sheepdog:test
> > qemu-system-x86_64: failed to get the header, Resource temporarily unavailable
> > qemu-system-x86_64: Failed to connect to socket: Connection refused
> > qemu-system-x86_64: Failed to connect to socket: Connection refused
> > qemu-system-x86_64: Failed to connect to socket: Connection refused
> > qemu-system-x86_64: Failed to connect to socket: Connection refused
> > qemu-system-x86_64: Failed to connect to socket: Connection refused
> > ...repeat...
> >
> > QEMU version is master tip
>
> Your sheep daemon looks like unreachable from qemu. I tried the same
> procedure, but couldn't reproduce it.
>
> Is the problem reproducible? Can you make sure that you can connect
> to the sheep daemon from collie while the error message shows up?
>

Yes. I tried to reproduce it with the following steps:

1. did some buffered writes
2. kill the sheep
3. $ sync    # in the guest; 'sync' now hangs waiting for a response
4. restart sheep

After step 4, 'sync' still hangs until it times out with the message
"hda:dma_timer_expiry: dma status == 0x21", and the guest ends up frozen.

QEMU output is the same:

qemu-system-x86_64: failed to get the header, Resource temporarily unavailable
qemu-system-x86_64: Failed to connect to socket: Connection refused
qemu-system-x86_64: Failed to connect to socket: Connection refused
qemu-system-x86_64: Failed to connect to socket: Connection refused
qemu-system-x86_64: Failed to connect to socket: Connection refused

But note that if I restart sheep while the guest is doing nothing, your
patch set works like a charm.

Thanks
Yuan