From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1Ka8I2-0008Hf-6P for qemu-devel@nongnu.org;
	Mon, 01 Sep 2008 08:13:50 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1Ka8I0-0008HF-7o for qemu-devel@nongnu.org;
	Mon, 01 Sep 2008 08:13:49 -0400
Received: from [199.232.76.173] (port=52815 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1Ka8Hz-0008HB-SI
	for qemu-devel@nongnu.org; Mon, 01 Sep 2008 08:13:47 -0400
Received: from host36-195-149-62.serverdedicati.aruba.it ([62.149.195.36]:58620 helo=mx.cpushare.com)
	by monty-python.gnu.org with esmtps (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32)
	(Exim 4.60) (envelope-from ) id 1Ka8Hz-0005bB-SH
	for qemu-devel@nongnu.org; Mon, 01 Sep 2008 08:13:48 -0400
Date: Mon, 1 Sep 2008 14:13:41 +0200
From: Andrea Arcangeli
Subject: Re: [Qemu-devel] [PATCH] ide_dma_cancel will result in partial DMA transfer
Message-ID: <20080901121341.GF25764@duo.random>
References: <20080829135249.GI24884@duo.random> <18619.53330.436656.737975@mariner.uk.xensource.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <18619.53330.436656.737975@mariner.uk.xensource.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org

On Mon, Sep 01, 2008 at 12:21:54PM +0100, Ian Jackson wrote:
> So I think the host can't safely assume anything about which parts of
> the transfer have completed. `First' must surely mean temporally

The point is that the guest will never issue those dma cancel ops in real hardware. So if it doesn't notice its own block-write operation has failed, it will go ahead without retrying the write, and fs corruption will be generated silently without any apparent error message or fs shutdown.

With qemu the aio thread can stall indefinitely if there's heavy disk (or cpu) load; such an IRQ-completion delay could never happen on real hardware. I can't see why anybody should take any risk by aborting such a timed-out I/O in the middle, given that we know perfectly well these timeouts never trigger on real hardware, and that not all filesystem software is perfect and checks for I/O errors in every single path (this includes userland that may not notice a -EIO out of a write() syscall).