From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
    id 1N8F5s-0002zh-Jf
    for qemu-devel@nongnu.org; Wed, 11 Nov 2009 10:26:48 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
    id 1N8F5o-0002z3-O3
    for qemu-devel@nongnu.org; Wed, 11 Nov 2009 10:26:48 -0500
Received: from [199.232.76.173] (port=37966 helo=monty-python.gnu.org)
    by lists.gnu.org with esmtp (Exim 4.43) id 1N8F5o-0002z0-Cs
    for qemu-devel@nongnu.org; Wed, 11 Nov 2009 10:26:44 -0500
Received: from mx1.redhat.com ([209.132.183.28]:16922)
    by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from )
    id 1N8F5n-0004rf-Ta
    for qemu-devel@nongnu.org; Wed, 11 Nov 2009 10:26:44 -0500
Message-ID: <4AFAD7A8.70707@redhat.com>
Date: Wed, 11 Nov 2009 16:26:32 +0100
From: Gerd Hoffmann
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [sneak preview] major scsi overhaul
References: <4AF4ACA5.2090701@redhat.com> <200911110406.31319.paul@codesourcery.com>
    <4AFA86C5.6020000@redhat.com> <200911111413.09320.paul@codesourcery.com>
In-Reply-To: <200911111413.09320.paul@codesourcery.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
To: Paul Brook
Cc: qemu-devel@nongnu.org

On 11/11/09 15:13, Paul Brook wrote:
>> The current qemu code *does* cache the response. scsi-disk caps the
>> buffer at 128k (which is big enough for any request I've seen in my
>> testing). scsi-generic has no cap.
>
> That cap is important.
> For scsi-generic you probably don't have a choice because of the way the
> kernel interface works.

Exactly. And why is the cap important for scsi-disk if scsi-generic does
fine without?

>> scsi-disk: dma_bdrv_{read,write} will split it into smaller chunks if
>> needed.
>
> You seem to be assuming the HBA knows where it's going to put the data before
> it issues the command.
> This is not true (see below).

You are talking about a real HBA I guess?

>> lsi (only one in-tree with TCQ support) works like this:
>>
>> - allocate + parse scsi command.           scsi_req_get+scsi_req_parse
>> - continue script processing, collect
>>   DMA addresses and stick them into
>>   a scatter list until it is complete.
>> - queue command and disconnect.
>> - submit I/O to the qemu block layer       scsi_req_sgl
>>
>> *can process more scsi commands here*
>>
>> - when I/O is finished reselect tag
>> - return status, release request.          scsi_req_put
>
> I'm pretty sure this is wrong,

This describes what the lsi emulation does with my patches applied.

> and what actually happens is:
>
> 1) Wait for device to reconnect (goto 5), or commands from host (goto 2).
>
> 2) SCRIPTS connect to device, and send command.
> 3) If device has data immediately (metadata command) then goto 6
> 4) Device disconnects. goto 1
>
> 5) Device has data ready, and reconnects
> 6) SCRIPTS locate the next DMA block for this command, and initiate a (linear)
> DMA transfer.
> 7) DATA is transferred. Note that DMA stalls the SCRIPTS processor until the
> transfer completes.
> 8) If the device still has data then goto 6.
> 9) If the device runs out of data before the command completes then goto 3.
> 10) Command complete. goto 1

This is what a real HBA does.

> Note that the IO command is parsed at stage 2, but the data transfer is not
> requested until stage 6. i.e. after the command has partially completed. This
> window between issue and data transfer is where other commands are issued.

Or when the device disconnects again in the middle of a transfer.

> The only way to make your API work is to skip straight from step 3 to step 6,
> which effectively loses the command queueing capability.

It doesn't. The disconnect, and thus the opportunity to submit more
commands while the device is busy doing the actual I/O, is there.

> It may be that it's
> hard/impossible to get both command queueing and zero-copy.
I have it working.

cheers,
  Gerd
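
For illustration only, the lsi flow quoted above (get + parse, collect the scatter list, queue and disconnect, submit, reselect on completion, put) can be modelled with a toy request pool. This is a minimal sketch and not the code from the patches: every `toy_*` name, the structure layout, and the fixed-size pool are assumptions made up for the example, and the real scsi_req_get/scsi_req_parse/scsi_req_sgl/scsi_req_put signatures are not shown in this thread. The point it tries to make is only that a complete scatter list at submit time still leaves a window in which a second tagged command can be issued, while completed data is written straight into the listed segments with no intermediate buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_SG  4
#define TOY_MAX_REQ 4

/* One scatter-list element: a chunk of (toy) guest memory. */
struct toy_sg { unsigned char *base; size_t len; };

/* Hypothetical request object, NOT the real QEMU structure. */
struct toy_req {
    int used;       /* slot is allocated */
    int in_flight;  /* submitted, not yet completed */
    int tag;        /* queue tag chosen by the initiator */
    int nsg;        /* number of valid scatter-list entries */
    struct toy_sg sg[TOY_MAX_SG];
};

static struct toy_req toy_pool[TOY_MAX_REQ];
static unsigned char toy_guest[64];          /* stand-in for guest RAM */
static int toy_in_flight, toy_max_in_flight; /* current and peak overlap */

/* Allocate a request for a tag (stands in for the get + parse step). */
static struct toy_req *toy_req_get(int tag)
{
    for (int i = 0; i < TOY_MAX_REQ; i++) {
        if (!toy_pool[i].used) {
            memset(&toy_pool[i], 0, sizeof(toy_pool[i]));
            toy_pool[i].used = 1;
            toy_pool[i].tag = tag;
            return &toy_pool[i];
        }
    }
    return NULL; /* pool exhausted */
}

/* Script processing hands over one DMA address at a time. */
static int toy_req_add_sg(struct toy_req *r, unsigned char *base, size_t len)
{
    if (r->nsg == TOY_MAX_SG)
        return -1;
    r->sg[r->nsg].base = base;
    r->sg[r->nsg].len = len;
    r->nsg++;
    return 0;
}

/* Queue + disconnect + submit: the bus is free for other commands. */
static void toy_req_submit(struct toy_req *r)
{
    r->in_flight = 1;
    if (++toy_in_flight > toy_max_in_flight)
        toy_max_in_flight = toy_in_flight;
}

/* I/O finished: data lands directly in the listed segments. */
static void toy_req_complete(struct toy_req *r, unsigned char fill)
{
    for (int i = 0; i < r->nsg; i++)
        memset(r->sg[i].base, fill, r->sg[i].len);
    r->in_flight = 0;
    toy_in_flight--;
}

/* Return status and release the request. */
static void toy_req_put(struct toy_req *r)
{
    r->used = 0;
}

/* Two tagged commands overlap; returns the peak number in flight. */
static int toy_demo(void)
{
    struct toy_req *a = toy_req_get(1);
    struct toy_req *b = toy_req_get(2);
    assert(a != NULL && b != NULL);

    toy_req_add_sg(a, toy_guest, 16);       /* tag 1: two segments */
    toy_req_add_sg(a, toy_guest + 32, 16);
    toy_req_submit(a);                      /* disconnect */

    toy_req_add_sg(b, toy_guest + 16, 16);  /* tag 2 issued meanwhile */
    toy_req_submit(b);

    toy_req_complete(b, 0xBB);              /* completions may reorder */
    toy_req_put(b);
    toy_req_complete(a, 0xAA);
    toy_req_put(a);
    return toy_max_in_flight;
}
```

Calling `toy_demo()` from a `main` submits both commands before either completes. The model deliberately leaves out the mid-transfer disconnects of a real HBA (steps 8 and 9 in the quoted list), so it says nothing about that part of the disagreement.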