From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4974C896.7030100@redhat.com>
Date: Mon, 19 Jan 2009 20:38:14 +0200
From: Avi Kivity
Subject: Re: [Qemu-devel] [PATCH v3] Stop VM on ENOSPC error.
References: <20090118110509.GG11299@redhat.com> <18804.27240.886522.337700@mariner.uk.xensource.com> <4974A704.3070605@codemonkey.ws> <18804.46780.936806.748045@mariner.uk.xensource.com>
In-Reply-To: <18804.46780.936806.748045@mariner.uk.xensource.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org

Ian Jackson wrote:
> Anthony Liguori writes ("Re: [Qemu-devel] [PATCH v3] Stop VM on ENOSPC error."):
>
>> Ian Jackson wrote:
>>
>>> Once again, this feature should be optional.
>>>
>> Why?
>>
>
> Well, three reasons, one general and theoretical, and two practical
> and rather Xen-specific.
>

This has been tried before, but...
> The theoretical reason is that a guest is in a better position to deal
> with the situation because it knows its access patterns. Often the
> response to a failing write in a mission-critical system will be some
> kind of fallback behaviour, which is likely to work.

A situation where many writes fail and many writes succeed is unlikely
to have been tested and is therefore unlikely to work, particularly as,
some time afterwards, all writes start to succeed again as if nothing
had happened.

A single-disk guest will thrash its disk, eventually remounting it
read-only (in the case of Linux) and then failing left and right. A
multiple-disk guest in a RAID 5 configuration will enter degraded mode,
and then corrupt data: RAID 5 wasn't designed for multiple disk
failures. By induction, RAID 6 fails as well.

> Stopping the VM unconditionally is not something that the guest can
> cope with.

The guest doesn't need to cope with it; the management system does.

> The practical reasons are that we would want to retain existing
> behaviour unless it was clearly broken (which we don't think it is),
> and that we don't currently have any useful mechanism for reporting
> and dealing with the problem.
>
> Fundamentally I think we're seeing this differently because the way
> that Xen uses qemu is contextually quite different to the
> `traditional' qemu. Traditionally qemu is used as a subprogram of
> other tasks, as an interactive debugging or GUI tool, or whatever.
>
> But in the Xen context, a Xen VM is not a `task' in the same way.
> (Xen users make much less use of the built-in cow formats for this
> reason, often preferring LVM snapshots or even deeper storage magic.)
> We expect the VM to be up and stay up, and if it can't continue it
> needs to fail or crash.

You can resume the guest over the monitor (or xenstore if you insist)
once more storage is allocated, same as everyone else. I don't see how
qemu's role in Xen makes a difference.
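For what it's worth, the management-side handling described here is simple to script. A minimal sketch in Python, assuming the management tool already captures the reply to the monitor's "info status" command (the helper names are illustrative; "info status" and "cont" are the monitor commands involved):

```python
import re

def parse_vm_status(info_status_output):
    """Extract the run state from the monitor's 'info status' reply.

    Typical replies look like 'VM status: running' or
    'VM status: paused'. Returns the state string, or None if the
    reply is not recognised.
    """
    m = re.search(r"VM status:\s*(\w+)", info_status_output)
    return m.group(1) if m else None

def resume_command_if_stopped(info_status_output):
    """Return the monitor command to issue once more storage has been
    allocated ('cont' resumes a stopped guest), or None if the guest
    is already running."""
    state = parse_vm_status(info_status_output)
    if state is not None and state != "running":
        return "cont"
    return None
```

So after growing the backing storage (e.g. extending the LVM volume), the tool checks the status and sends "cont" if the guest is paused; nothing about this is specific to Xen or to traditional qemu.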
The only alternative I see to stopping the VM is to offline the disk
for both reads and writes. This at least protects data, and is similar
to a controller or cable failure, which guests may have been tested
with. An advantage is that if an unimportant disk fails, the guest can
continue to work.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
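The three positions in this thread amount to a per-write-error policy choice. A hypothetical sketch of that decision, in Python; the policy names and action strings are illustrative, not qemu's actual configuration syntax:

```python
import errno

# Hypothetical policy names modelling the options discussed in the
# thread; not qemu's actual option names.
STOP, REPORT, OFFLINE = "stop", "report", "offline"

def handle_write_error(policy, err):
    """Decide what the VMM should do about a failed guest write.

    policy -- STOP: pause the whole VM so management can grow storage,
                    then resume it over the monitor
              REPORT: surface the I/O error to the guest and continue
                      (the guest must cope with intermittent failures)
              OFFLINE: detach the disk for reads and writes, which
                       protects its data and resembles a controller or
                       cable failure
    err    -- the errno from the host-side write, e.g. errno.ENOSPC
    Returns the chosen action as a string.
    """
    if policy == STOP and err == errno.ENOSPC:
        return "pause-vm"        # resumable once space is allocated
    if policy == OFFLINE:
        return "offline-disk"    # fail all further reads and writes
    return "report-to-guest"     # guest sees the error and (maybe) copes
```

Making the policy selectable per drive would satisfy both camps: Xen setups keep the report-to-guest behaviour they have tested, while everyone else stops the VM on ENOSPC and resumes after allocating storage.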