Message-ID: <46966132.5010402@qumranet.com>
Date: Thu, 12 Jul 2007 20:13:22 +0300
From: Avi Kivity
Subject: Re: [Qemu-devel] Crash: When Host HDD is full
References: <7fac565a0707110819k635d398fl273d8d5a0afd2d3f@mail.gmail.com> <200707121717.32145.paul@codesourcery.com> <46965902.6030305@qumranet.com> <200707121803.50105.paul@codesourcery.com>
In-Reply-To: <200707121803.50105.paul@codesourcery.com>
Reply-To: qemu-devel@nongnu.org
To: Paul Brook
Cc: qemu-devel@nongnu.org

Paul Brook wrote:
>>>> Qemu might freeze the guest when it gets -ENOSPC, and say, retry every
>>>> second or wait for user input on the monitor.
>>>>
>>> Better would IMHO be to report an IO error to the guest and allow that to
>>> decide what to do. If you're bothered about robustness and reliability
>>> then arbitrarily stopping the guest is not acceptable behaviour. There's
>>> no guarantee that space will become available in a finite timeframe.
>>>
>> I've considered that, and I'm not sure.
>> You will likely get a storm of I/O errors on ENOSPC; with several ways
>> for disaster to strike:
>> - the guest doesn't handle I/O errors well, and keeps writing. some of
>>   the writes are overwrites so they hit the disk and data is corrupted
>>
>
> If a guest OS ignores IO write errors it's just plain broken.
>

Linux 2.4 ignores IO write errors under certain conditions. Yes, it's
broken. But you're making the user suffer for this brokenness even if
the only thing wrong is a temporary shortage of disk space.

>> - the guest decides the disk is bad because it has too many errors and
>>   initiates some recovery procedure
>>
>> Stopping the guest at least guarantees nothing unexpected happens. If
>> it's part of a managed solution we can output a message to the monitor
>> which eventually finds its way to the operator.
>>
>
> I don't buy this argument. If you don't want "unexpected" things to happen
> then the solution is simple: Make sure you never run out of disk space.
>

That's unrealistic, at least for the casual user running qemu. A
managed solution can probably work around this. Qemu should be more
user friendly.

> The fact is that your (virtual) disk *is* broken at this point. The guest OS
> is in a much better position to decide on an appropriate course of action,
> either by retrying or some other recovery mechanism.
>

I don't see why it is broken. The disk contents have not changed since
the last successful write. Once you free some space you can continue
writing.

Note that a recovery mechanism that involves writing will likely fail as
well, possibly corrupting the disk in the process.

> There are various error conditions that could be used, for example
> write-protect.
>

The guest would most likely be surprised at getting a write-protect
error on its hard disk, and then the disk *would* be broken.

-- 
error compiling committee.c: too many arguments to function