From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
    id 1JhDWI-0003hn-UH
    for qemu-devel@nongnu.org; Wed, 02 Apr 2008 20:41:35 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
    id 1JhDWI-0003hF-1X
    for qemu-devel@nongnu.org; Wed, 02 Apr 2008 20:41:34 -0400
Received: from [199.232.76.173] (helo=monty-python.gnu.org)
    by lists.gnu.org with esmtp (Exim 4.43)
    id 1JhDWH-0003h5-N2
    for qemu-devel@nongnu.org; Wed, 02 Apr 2008 20:41:33 -0400
Received: from fftw.vpsland.com ([69.61.62.151] helo=fftw.org)
    by monty-python.gnu.org with esmtps (TLS-1.0:RSA_AES_256_CBC_SHA1:32)
    (Exim 4.60) (envelope-from )
    id 1JhDWH-000815-Ed
    for qemu-devel@nongnu.org; Wed, 02 Apr 2008 20:41:33 -0400
Received: from pool-96-237-13-71.bstnma.east.verizon.net ([96.237.13.71] helo=thinkpad)
    by fftw.org with esmtpsa (TLS-1.0:RSA_AES_256_CBC_SHA1:32)
    (Exim 4.68) (envelope-from )
    id 1JhDWF-0000Dx-1C
    for qemu-devel@nongnu.org; Wed, 02 Apr 2008 20:41:31 -0400
Received: from athena by thinkpad with local (Exim 4.69)
    (envelope-from )
    id 1JhDW8-0001Mv-9i
    for qemu-devel@nongnu.org; Wed, 02 Apr 2008 20:41:24 -0400
From: Matteo Frigo
Date: Wed, 02 Apr 2008 20:41:24 -0400
Message-ID: <87k5jfixcb.fsf@fftw.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Subject: [Qemu-devel] QEMU/KVM SCSI lock up
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org

--=-=-=

kvm-64 hangs under heavy disk I/O with scsi disks.  To reproduce,
create a fresh qcow2 disk, boot linux, and execute

    dd if=/dev/sdX of=/dev/null bs=1M

on the fresh disk.  See also
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1895893&group_id=180599

I have attached a patch that appears to fix the problem.

The bug seems to be the following.
scsi_read_data() does the following:

    bdrv_aio_read()
    r->sector += n;
    r->sector_count -= n;

For reasons that I do not fully understand, bdrv_aio_read() does not
return immediately, but instead it calls scsi_read_data() recursively.
Since ``r->sector += n;'' has not been executed yet, the re-entrant
call triggers a read of the same sector, which breaks the
producer-consumer lockstep.

The fix is to swap the operations as follows:

    r->sector += n;
    r->sector_count -= n;
    bdrv_aio_read()

A similar fix applies to scsi_write_data().

Thanks for developing kvm, it is truly an amazing piece of software.

Regards,
Matteo Frigo

--=-=-=
Content-Type: application/octet-stream
Content-Disposition: attachment; filename=scsi-patch

diff -aur kvm-64.old/qemu/hw/scsi-disk.c kvm-64.new/qemu/hw/scsi-disk.c
--- kvm-64.old/qemu/hw/scsi-disk.c	2008-03-26 08:49:35.000000000 -0400
+++ kvm-64.new/qemu/hw/scsi-disk.c	2008-03-30 08:37:25.000000000 -0400
@@ -196,12 +196,12 @@
         n = SCSI_DMA_BUF_SIZE / 512;
 
     r->buf_len = n * 512;
-    r->aiocb = bdrv_aio_read(s->bdrv, r->sector, r->dma_buf, n,
+    r->sector += n;
+    r->sector_count -= n;
+    r->aiocb = bdrv_aio_read(s->bdrv, r->sector - n, r->dma_buf, n,
                              scsi_read_complete, r);
     if (r->aiocb == NULL)
         scsi_command_complete(r, SENSE_HARDWARE_ERROR);
-    r->sector += n;
-    r->sector_count -= n;
 }
 
 static void scsi_write_complete(void * opaque, int ret)
@@ -248,12 +248,12 @@
         BADF("Data transfer already in progress\n");
     n = r->buf_len / 512;
     if (n) {
-        r->aiocb = bdrv_aio_write(s->bdrv, r->sector, r->dma_buf, n,
+        r->sector += n;
+        r->sector_count -= n;
+        r->aiocb = bdrv_aio_write(s->bdrv, r->sector - n, r->dma_buf, n,
                                  scsi_write_complete, r);
         if (r->aiocb == NULL)
             scsi_command_complete(r, SENSE_HARDWARE_ERROR);
-        r->sector += n;
-        r->sector_count -= n;
     } else {
         /* Invoke completion routine to fetch data from host. */
         scsi_write_complete(r, 0);
--=-=-=--
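
The ordering problem described in this message can be reproduced
outside QEMU.  What follows is a minimal sketch, not QEMU code: struct
req, mock_aio_read(), read_data(), simulate() and MAX_LOG are all
hypothetical names made up for illustration, and the mock assumes the
case the message describes, where the "asynchronous" read completes
synchronously and re-enters the caller before returning.

```c
#define MAX_LOG 16

/* Hypothetical stand-in for the per-request state (not the QEMU struct). */
struct req {
    int sector;        /* next sector to read */
    int sector_count;  /* sectors still to transfer */
    int update_first;  /* 1 = ordering from the patch, 0 = original ordering */
    int log[MAX_LOG];  /* starting sector of every read that was issued */
    int nlog;          /* number of entries used in log[] */
};

static void read_data(struct req *r);

/*
 * Mock of an async read that completes before it returns: it records
 * the requested sector and immediately re-enters read_data().
 */
static void mock_aio_read(struct req *r, int sector)
{
    if (r->nlog >= MAX_LOG)
        return;            /* cap so the original ordering terminates here */
    r->log[r->nlog++] = sector;
    read_data(r);          /* synchronous completion re-enters the caller */
}

static void read_data(struct req *r)
{
    int n = 1;             /* one sector per request keeps the trace short */

    if (r->sector_count <= 0)
        return;

    if (r->update_first) {
        /* Patched ordering: state is consistent before the call. */
        r->sector += n;
        r->sector_count -= n;
        mock_aio_read(r, r->sector - n);
    } else {
        /* Original ordering: the re-entrant call sees stale state. */
        mock_aio_read(r, r->sector);
        r->sector += n;
        r->sector_count -= n;
    }
}

/* Run a 3-sector transfer starting at sector 0 and return the final state. */
static struct req simulate(int update_first)
{
    struct req r = { .sector = 0, .sector_count = 3,
                     .update_first = update_first };
    read_data(&r);
    return r;
}
```

With this mock, simulate(1) issues reads for sectors 0, 1 and 2 and
then stops, while simulate(0) keeps issuing reads for sector 0 until
the MAX_LOG cap cuts the recursion off.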