Message-ID: <52D8D459.2050205@inktank.com>
Date: Thu, 16 Jan 2014 22:57:29 -0800
From: Josh Durgin
References: <20140108044407.GG30570@stefanha-thinkpad.redhat.com> <52D4D83B.4040803@inktank.com> <52D4EE16.5060703@inktank.com> <52D74336.70708@inktank.com>
Subject: Re: [Qemu-devel] Re: Re: Re: Re: Re: Re: qemu vm big network latency when met heavy io
To: 叶绍琛
Cc: qemu-devel

On 01/15/2014 10:12 PM, 叶绍琛 wrote:
> Hi Josh
>
> # strings /usr/bin/qemu-system-x86_64 | grep rbd_aio
> rbd_aio_write
> rbd_aio_flush
> rbd_aio_read
> rbd_aio_create_completion
> rbd_aio_release
> rbd_aio_discard
> rbd_aio_get_return_value
>
> So, librbd's asynchronous flush is being used.
>
> I set log settings, fetch the log and ping log with timestamped, all
> logs are in the attachment.

Excellent, thanks.

> It seems that I doesn't have permission to generate a new issue
> on tracker.ceph.com, when I click the 'Register' it show a 'internal
> error' page.

Seems to be working for me. In any case, I created
http://tracker.ceph.com/issues/7165 to track the problem.

Looking through the logs, it may already have been solved by a couple
of commits made after 0.67.5. Namely, the cache was starting the flush
of too much dirty data at once while holding a lock, which prevented
other I/O from the guest from starting and thus blocked the qemu thread
handling the I/O.

I added the relevant commits to the wip-objectcacher-flusher-dumpling
branch in ceph.git. Could you install librbd from that branch and see
if it fixes the problem? Instructions for getting these packages are at:

http://ceph.com/docs/master/install/get-packages/#add-ceph-development

Thanks,
Josh
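
For illustration only, here is a toy Python model of the locking pattern
described above. It is not librbd's ObjectCacher code; the class name
ToyCache and the max_per_hold parameter are hypothetical. It only shows
why starting writeback for all dirty data under a single lock hold can
starve other I/O, and how capping the work done per hold gives waiters
a chance to take the lock in between.

```python
import threading


class ToyCache:
    """Hypothetical toy model of a write-back cache flusher (not librbd)."""

    def __init__(self, dirty_objects, max_per_hold=None):
        self.lock = threading.Lock()
        self.dirty = list(dirty_objects)
        # None means "no cap": everything is started under one lock hold.
        self.max_per_hold = max_per_hold
        # Number of writebacks started during each lock hold.
        self.holds = []

    def flush(self):
        while True:
            with self.lock:
                batch = self.dirty[:self.max_per_hold]
                del self.dirty[:self.max_per_hold]
                if not batch:
                    return
                self.holds.append(len(batch))
            # The lock is released here, so a thread submitting guest I/O
            # could acquire it before the next batch is started.


unbounded = ToyCache(range(10))
unbounded.flush()

bounded = ToyCache(range(10), max_per_hold=4)
bounded.flush()

print("unbounded lock holds:", unbounded.holds)
print("bounded lock holds:  ", bounded.holds)
```

In the unbounded case all ten writebacks are started under one lock
hold; with a cap of four the same work is split across three shorter
holds.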