From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1JhKxy-0004V6-6A for qemu-devel@nongnu.org; Thu, 03 Apr 2008 04:38:38 -0400 Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1JhKxx-0004TT-0F for qemu-devel@nongnu.org; Thu, 03 Apr 2008 04:38:37 -0400 Received: from [199.232.76.173] (helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1JhKxw-0004T2-Ij for qemu-devel@nongnu.org; Thu, 03 Apr 2008 04:38:36 -0400 Received: from bzq-179-150-194.static.bezeqint.net ([212.179.150.194] helo=il.qumranet.com) by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1JhKxw-0002iE-04 for qemu-devel@nongnu.org; Thu, 03 Apr 2008 04:38:36 -0400 Message-ID: <47F49770.1010906@qumranet.com> Date: Thu, 03 Apr 2008 11:38:08 +0300 From: Avi Kivity MIME-Version: 1.0 Subject: Re: [Qemu-devel] QEMU/KVM SCSI lock up References: <87k5jfixcb.fsf@fftw.org> In-Reply-To: <87k5jfixcb.fsf@fftw.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Reply-To: qemu-devel@nongnu.org List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Matteo Frigo Cc: qemu-devel@nongnu.org Matteo Frigo wrote: > kvm-64 hangs under heavy disk I/O with scsi disks. To reproduce, > create a fresh qcow2 disk, boot linux, and execute > > dd if=/dev/sdX of=/dev/null bs=1M > > on the fresh disk. See also https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1895893&group_id=180599 > > I have attached a patch that appears to fix the problem. The bug > seems to be the following. scsi_read_data() does the following > > bdrv_aio_read() > r->sector += n; > r->sector_count -= n; > > For reasons that I do not fully understand, bdrv_aio_read() does > not return immediately, but instead it calls scsi_read_data() > recursively. What happens (I think) is that bdrv_aio_read() completes immediately, calls the completion callback, which starts a read for the next batch of sectors. > Since ``r->sector += n;'' has not been executed > yet, the re-entrant call triggers a read of the same sector, which > breaks the producer-consumer lockstep. The fix is to swap the operations > as follows: > > r->sector += n; > r->sector_count -= n; > bdrv_aio_read() > > A similar fix applies to scsi_write_data(). > > Will that not issue the read for the wrong sector? I think the correct fix is to move r->sector and r->sector_count adjustment into scsi_read_complete() and scsi_write_complete(). Long term we want to replace the recursion by queuing. -- error compiling committee.c: too many arguments to function