From mboxrd@z Thu Jan 1 00:00:00 1970
From: Roland Paterson-Jones
Subject: Re: Re: [Xen-devel] tap:qcow causes dom-U to hang in 3.0.3
Date: Fri, 10 Nov 2006 16:17:01 +0200
Message-ID: <455489DD.50002@rolandpj.com>
References: <4551EEC3.3010308@rolandpj.com> <20061108151133.GE3507@leeni.uk.xensource.com> <4552DF8D.6060600@rolandpj.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: xen-users-bounces@lists.xensource.com
Errors-To: xen-users-bounces@lists.xensource.com
To: Julian Chesterfield
Cc: Xen Devel, xen-users@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

Julian Chesterfield wrote:
> Roland,
>
> Can you also verify whether there's an active tapdisk process running
> in Dom0 for each tap:{aio,qcow} vbd. We are aware of a bug with the
> qcow implementation that we hope to submit a fix for very soon. It's
> likely that you are seeing the same issue.

To answer your question: yes, it does appear that a tapdisk process is still running (this is after the dom-U has hung):

[root@dom0-0-50-45-5d-6a-bc ~]# ps -aef | grep tapdisk
root      4135     1  0 15:42 ?        00:00:01 tapdisk /dev/xen/tapctrlwrite1 /dev/xen/tapctrlread1

There is only one tap device, and the pid is the same as that of the single candidate while the dom-U was still reachable.

The hang seems to occur on the first (significant?) disk write inside the dom-U. For example:

-bash-3.00# dd if=/dev/zero of=./test-10MB bs=1k count=$((10*1024))

This hung the dom-U, and I can no longer console or ssh into it.

Interestingly, on dom-0, the qcow file has shrunk from its previous peak of > 1TB, and now appears modestly as:

[root@dom0-0-50-45-5d-6a-bc ~]# ls -als /mnt/instance_image_store_0/
total 1564432
      4 drwxr-xr-x 2 root root       4096 Nov 10 15:42 .
      8 drwxr-xr-x 8 root root       4096 Nov  7 17:56 ..
1563132 -rw-r--r-- 1 root root 1599078400 Nov 10 15:42 2
   1288 -rw-r--r-- 1 root root    2466816 Nov 10 16:09 2.qcow

It's all very confusing. I'd love it to work, of course. Let me know what I can do to help with a diagnosis.

I'm running the (binary) PAE-enabled 3.0.3 release.

Thanks and kind regards
Roland