Message-ID: <48FDEDEE.9000501@codemonkey.ws>
Date: Tue, 21 Oct 2008 09:57:50 -0500
From: Anthony Liguori
Subject: Re: [Qemu-devel] Re: [PATCH]: fix QEMU SCSI lock up
References: <20080924225946.GA28588@dmt.cnet> <48DACF6F.40301@us.ibm.com> <48DB4519.6040902@redhat.com> <48F75D05.3010601@redhat.com>
In-Reply-To: <48F75D05.3010601@redhat.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org
Cc: Marcelo Tosatti, Avi Kivity

Avi Kivity wrote:
> Avi Kivity wrote:
>> Anthony Liguori wrote:
>>
>>>> For reasons that I do not fully understand, bdrv_aio_read() does
>>>> not return immediately, but instead it calls scsi_read_data()
>>>> recursively.
>>>>
>>> This bothers me. bdrv_aio_read() should never immediately invoke the
>>> callback to prevent exactly this sort of problem. Perhaps this was a
>>> bug that has since been fixed? Is this still reproducible?
>>>
>>
>> qcow2 metadata is synchronous, and if the disk is empty, there will be
>> no data I/O, so bdrv_aio_read() will never be invoked.
>>
>> Maybe we should fix this in qcow2 (and the other block formats) by
>> scheduling a BH.
>>
>
> FWIW, I was told this reproduces on kvm-77 (which has the latest qemu
> scsi bits).

qemu_aio_wait() will run bottom halves when emulating synchronous IO. I
don't think this is exploitable practically speaking but it seems to me
like a major flaw.

I think the proper fix is what you describe, modifying qcow2 to schedule
a bottom half to read metadata. Better yet, a full conversion to make
the meta data reading/writing asynchronous.

Regards,

Anthony Liguori
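
To make the recursion concrete, here is a minimal standalone sketch of the idea, not QEMU code. Every toy_* name is hypothetical: toy_bh_schedule() and toy_bh_poll() stand in for a bottom-half queue, toy_aio_read() for a block-layer read that has no real I/O to wait on, and toy_read_data() for the device-side function that issues the next read from its completion callback. The sketch only measures how deeply toy_read_data() nests when the completion runs inline versus when it is deferred to the queue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bottom-half queue; not QEMU's implementation. */
typedef void (*toy_bh_func)(void *opaque);

typedef struct ToyBH {
    toy_bh_func cb;
    void *opaque;
} ToyBH;

#define TOY_BH_MAX 16
static ToyBH toy_bh_queue[TOY_BH_MAX];
static size_t toy_bh_count;

/* Queue cb so that it runs later from toy_bh_poll(), not from the caller. */
static void toy_bh_schedule(toy_bh_func cb, void *opaque)
{
    assert(toy_bh_count < TOY_BH_MAX);
    toy_bh_queue[toy_bh_count].cb = cb;
    toy_bh_queue[toy_bh_count].opaque = opaque;
    toy_bh_count++;
}

/* Run every pending entry, including entries queued while polling.
 * The queue is only reset at the end, which is enough for this demo. */
static void toy_bh_poll(void)
{
    size_t i;
    for (i = 0; i < toy_bh_count; i++) {
        toy_bh_queue[i].cb(toy_bh_queue[i].opaque);
    }
    toy_bh_count = 0;
}

typedef struct ToyRequest {
    int remaining;  /* chunks still to read */
    int depth;      /* current nesting of toy_read_data() */
    int max_depth;  /* deepest nesting seen so far */
    int deferred;   /* nonzero: complete through the queue */
} ToyRequest;

static void toy_read_data(ToyRequest *r);

/* Completion callback: account for one chunk, then ask for the next. */
static void toy_read_complete(void *opaque)
{
    ToyRequest *r = opaque;
    r->remaining--;
    if (r->remaining > 0) {
        toy_read_data(r);
    }
}

/* A read with nothing to wait on: the result is known immediately. */
static void toy_aio_read(ToyRequest *r)
{
    if (r->deferred) {
        toy_bh_schedule(toy_read_complete, r);  /* caller returns first */
    } else {
        toy_read_complete(r);                   /* re-enters the caller */
    }
}

static void toy_read_data(ToyRequest *r)
{
    r->depth++;
    if (r->depth > r->max_depth) {
        r->max_depth = r->depth;
    }
    toy_aio_read(r);
    r->depth--;
}

/* Read `chunks` chunks and report the deepest toy_read_data() nesting. */
static int toy_max_depth(int chunks, int deferred)
{
    ToyRequest r = { chunks, 0, 0, deferred };
    toy_bh_count = 0;
    toy_read_data(&r);
    toy_bh_poll();
    return r.max_depth;
}
```

With the inline completion, toy_max_depth(8, 0) nests once per chunk, so the depth grows with the request size. With the deferred completion, toy_max_depth(8, 1) never nests: each toy_read_data() returns before its callback runs from toy_bh_poll().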