From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: problem about blocked monitor when disk image on NFS can not be reached.
Date: Tue, 01 Mar 2011 17:23:23 +0200
Message-ID: <4D6D0F6B.7030609@redhat.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: ya su , "kvm@vger.kernel.org" , Kevin Wolf
To: Stefan Hajnoczi
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:26730 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752175Ab1CAPX1 (ORCPT ); Tue, 1 Mar 2011 10:23:27 -0500
In-Reply-To:
Sender: kvm-owner@vger.kernel.org
List-ID:

On 03/01/2011 05:01 PM, Stefan Hajnoczi wrote:
> On Tue, Mar 1, 2011 at 12:39 PM, ya su wrote:
> > How about moving kvm_handle_io/handle_mmio out of the kvm_run
> > function and into kvm_main_loop? These operations are I/O
> > operations, and this would remove the qemu_mutex contention between
> > the 2 threads. Is this a reasonable thought?
> >
> > To keep the monitor responsive to the user in this situation, an
> > easier way is to take monitor I/O out of qemu_mutex protection.
> > This includes the vnc/serial/telnet I/O related to the monitor. As
> > this I/O does not affect the running of the VM itself, it does not
> > need such strict protection.
>
> The qemu_mutex protects all QEMU global state. The monitor does some
> I/O and parsing which is not necessarily global state but once it
> begins actually performing the command you sent, access to global
> state will be required (pretty much any monitor command will operate
> on global state).
>
> I think there are two options for handling NFS hangs:
> 1. Ensure that QEMU is never put to sleep by NFS for disk images. The
> guest continues executing, may time out and notice that storage is
> unavailable.

That's the NFS soft mount option.

> 2. Pause the VM but keep the monitor running if a timeout error
> occurs. Not sure if there is a timeout from NFS that we can detect.

The default setting (hard mount) will retry forever in the kernel.
Moreover, the other default setting (nointr) means we can't even signal
the hung thread.

> For I/O errors (e.g. running out of disk space on the host) there is a
> configurable policy. You can choose whether to return an error to the
> guest or to pause the VM. I think we should treat NFS hangs as an
> extension to this and as a block layer problem rather than an io
> thread problem.

I agree. Mount the share as a soft,intr mount and let the kernel time
out and return an I/O error.

> Can you get backtraces when KVM hangs (gdb command: thread apply all
> bt)? It would be interesting to see some of the blocking cases that
> you are hitting.

Won't work (at least under the default configuration) since those
threads are uninterruptible. At the very least you need an
interruptible mount.

-- 
error compiling committee.c: too many arguments to function
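
To check whether a host still uses the default hard mount discussed above, something like the following Python sketch may help. It parses text in the /proc/mounts layout and reports whether an NFS mount point has the soft option set. The function names, server name and paths are hypothetical, and whether intr shows up in the option list depends on the kernel, so treat this as an illustration and not a definitive check.

```python
def nfs_mount_options(mounts_text, mount_point):
    """Return the set of mount options for an NFS mount point, or None."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts layout: device mountpoint fstype options dump pass
        if len(fields) < 4:
            continue
        _device, path, fstype, options = fields[:4]
        if path == mount_point and fstype.startswith("nfs"):
            return set(options.split(","))
    return None


def can_hang_forever(options):
    """True if the options leave the default hard-mount behaviour in place."""
    # "soft" must be requested explicitly; its absence means a hard mount.
    return "soft" not in options


# Hypothetical sample in /proc/mounts layout; server and paths are made up.
SAMPLE = """\
filer:/export/images /var/lib/images nfs rw,soft,intr,timeo=100,retrans=3 0 0
filer:/export/iso /var/lib/iso nfs rw,hard 0 0
"""

images = nfs_mount_options(SAMPLE, "/var/lib/images")
iso = nfs_mount_options(SAMPLE, "/var/lib/iso")
print(can_hang_forever(images))  # False
print(can_hang_forever(iso))     # True
```

On a real host the same functions could be fed the contents of /proc/mounts instead of SAMPLE.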