Date: Wed, 13 Aug 2008 21:35:44 +0300
From: Gleb Natapov
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] Flush pending AIO on reboot and shutdown.
Message-ID: <20080813183544.GA29998@minantech.com>
References: <20080813132358.17672.68212.stgit@gleb-debian.qumranet.com.qumranet.com>
 <48A2E732.4040501@codemonkey.ws> <20080813141346.GC17567@minantech.com>
 <48A2F89A.20609@codemonkey.ws> <20080813152329.GG17567@minantech.com>
 <48A30395.6060706@codemonkey.ws>
In-Reply-To: <48A30395.6060706@codemonkey.ws>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org

On Wed, Aug 13, 2008 at 10:53:57AM -0500, Anthony Liguori wrote:
> Now that I think about it, I think your fixing the wrong problem.
> The issue isn't that the IO requests need to be completed, but that they
> *will* complete which means that the IDE driver will receive a callback
> for a request that it no longer knows about (because it was reset). So

Not exactly. IDE still knows about the request after reset (actually,
IDE currently doesn't know about reset), but the IDE code is written in
such a way that there can be only one outstanding request in progress.
When the guest issues another request before the previous one has
completed, global data is modified, the first request starts using the
wrong data, and the consequences are unpredictable: it may be a crash,
image corruption, or infinite recursion. The fact that the IDE code
allows another request to be issued while IDE is still busy is also a
bug.

> what we really need to do is modify the IDE device such that when it is
> reset, it cancels any pending requests.
>
> The fact that this reset happens as a consequence of a system reset is
> really just a coincidence.

That will also solve the problem, of course, but what bothers me is that
we consciously drop user data that we could easily save. Why? Real HW
tries hard to save every bit of user data, and we just decide to drop
it. The difference between cancelling and completing a request may be a
corrupted or an uncorrupted file system after a crash.

I'll send an updated patch.

--
			Gleb.
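
To make the flush-versus-cancel distinction concrete, here is a minimal
toy model in C. It is only a sketch and is not QEMU code: ToyDisk,
toy_submit(), toy_complete(), toy_reset_flush() and toy_reset_cancel()
are all hypothetical names invented for illustration. It assumes a
device with a single request slot whose write reaches the image only
when the completion callback runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model only: none of these names exist in QEMU. */
enum { SECTOR = 4, NSECTORS = 8 };

typedef struct ToyDisk {
    bool busy;                     /* single request slot */
    int  sector;                   /* target of the in-flight write */
    char data[SECTOR];             /* shared buffer used by the in-flight write */
    char image[NSECTORS][SECTOR];  /* backing "disk image" */
} ToyDisk;

/* Start a write; data reaches the image only when toy_complete() runs. */
static bool toy_submit(ToyDisk *d, int sector, const char *buf)
{
    if (d->busy || sector < 0 || sector >= NSECTORS) {
        return false;              /* refuse rather than clobber shared state */
    }
    d->busy = true;
    d->sector = sector;
    memcpy(d->data, buf, SECTOR);
    return true;
}

/* Completion callback: commits the in-flight write to the image. */
static void toy_complete(ToyDisk *d)
{
    assert(d->busy);
    memcpy(d->image[d->sector], d->data, SECTOR);
    d->busy = false;
}

/* Reset policy 1: finish the pending write, so the guest's data is kept. */
static void toy_reset_flush(ToyDisk *d)
{
    if (d->busy) {
        toy_complete(d);
    }
}

/* Reset policy 2: cancel the pending write, so the guest's data is dropped. */
static void toy_reset_cancel(ToyDisk *d)
{
    d->busy = false;
}
```

In this model both reset policies leave the device idle and ready for a
new request, but only toy_reset_flush() leaves the written sector in the
image. The busy check in toy_submit() is the guard whose absence is
described above as a separate bug.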