From: Paul Brook <paul@codesourcery.com>
To: Gerd Hoffmann <kraxel@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [sneak preview] major scsi overhaul
Date: Wed, 11 Nov 2009 14:13:09 +0000
Message-ID: <200911111413.09320.paul@codesourcery.com>
In-Reply-To: <4AFA86C5.6020000@redhat.com>
> The current qemu code *does* cache the response. scsi-disk caps the
> buffer at 128k (which is big enough for any request I've seen in my
> testing). scsi-generic has no cap.
That cap is important.
For scsi-generic you probably don't have a choice because of the way the
kernel interface works.
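To illustrate why the cap matters, here is a minimal sketch (not qemu code) of moving an arbitrarily large transfer through a fixed-size buffer, so host memory use stays bounded regardless of the request size. The names BOUNCE_CAP and copy_capped are hypothetical; the 128k value just mirrors the scsi-disk figure quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cap, mirroring the 128k figure quoted for scsi-disk. */
#define BOUNCE_CAP (128 * 1024)

/*
 * Copy len bytes from src to dst through a fixed-size bounce buffer,
 * never holding more than BOUNCE_CAP bytes at once.
 * Returns the number of chunks that were needed.
 */
static size_t copy_capped(unsigned char *dst, const unsigned char *src,
                          size_t len)
{
    static unsigned char bounce[BOUNCE_CAP];
    size_t done = 0;
    size_t chunks = 0;

    while (done < len) {
        size_t n = len - done;
        if (n > BOUNCE_CAP) {
            n = BOUNCE_CAP;
        }
        memcpy(bounce, src + done, n);  /* "device" fills the buffer */
        memcpy(dst + done, bounce, n);  /* buffer drains to the "guest" */
        done += n;
        chunks++;
    }
    assert(done == len);
    return chunks;
}
```

With this sketch a 300k request is moved in three chunks, and the buffer itself never grows beyond BOUNCE_CAP.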
> With the new interface the HBA has to handle the caching if needed. But
> the HBA also has the option to pass scatter lists, in which case qemu
> doesn't do any caching, the data is transferred directly from/to guest
> memory. Which is clearly an improvement IMO.
> > Remember that a single
> > transfer can be very large (greater than available ram on the host). Even
> > with a SG capable HBA, you can not assume all the data will be
> > transferred in a single request.
>
> scsi-generic: It must be a single request anyway, and it already is today.
>
> scsi-disk: dma_bdrv_{read,write} will split it into smaller chunks if
> needed.
You seem to be assuming the HBA knows where it's going to put the data before
it issues the command. This is not true (see below).
> > You should also consider how this interacts with command queueing.
> > IIUC an Initiator (HBA) typically sends a tagged command, then
> > disconnects from the target (disk). At some later time the target
> > reconnects, and the initiator starts the DMA transfer. By my reading your
> > code does not issue any IO requests until after the HBA starts
> > transferring data.
>
> Hmm? What code are you looking at?
>
> For esp and usb-storage reads and writes are handled slightly differently.
> They roughly work like this:
Neither ESP nor usb-storage implements command queueing, so they aren't
interesting here.
> lsi (only one in-tree with TCQ support) works like this:
>
> - allocate + parse scsi command. scsi_req_get+scsi_req_parse
> - continue script processing, collect
> DMA addresses and stick them into
> a scatter list until it is complete.
> - queue command and disconnect.
> - submit I/O to the qemu block layer scsi_req_sgl
>
> *can process more scsi commands here*
>
> - when I/O is finished reselect tag
> - return status, release request. scsi_req_put
I'm pretty sure this is wrong, and what actually happens is:
1) Wait for device to reconnect (goto 5), or commands from host (goto 2).
2) SCRIPTS connect to device, and send command.
3) If device has data immediately (metadata command) then goto 6
4) Device disconnects. goto 1
5) Device has data ready, and reconnects
6) SCRIPTS locate the next DMA block for this command, and initiate a (linear)
DMA transfer.
7) DATA is transferred. Note that DMA stalls the SCRIPTS processor until the
transfer completes.
8) If the device still has data then goto 6.
9) If the device runs out of data before the command completes then goto 3.
10) Command complete. goto 1
Note that the IO command is parsed at stage 2, but the data transfer is not
requested until stage 6. i.e. after the command has partially completed. This
window between issue and data transfer is where other commands are issued.
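The sequence above can be sketched as a small state machine. This is only an illustration under my own assumptions, not the lsi emulation: struct cmd, cmd_issue and cmd_reconnect are hypothetical names, and a "block" stands for one linear DMA transfer. The point it shows is that a command sits disconnected after being issued, and the DMA destination is only looked up once the device reconnects.

```c
#include <assert.h>
#include <stddef.h>

enum cmd_state { CMD_DISCONNECTED, CMD_TRANSFER, CMD_DONE };

struct cmd {
    enum cmd_state state;
    size_t blocks_total;   /* linear DMA blocks the command needs */
    size_t blocks_done;    /* blocks transferred so far */
};

/* Steps 2 and 4: SCRIPTS send the command, then the device disconnects.
 * No DMA address is known or requested at this point. */
static void cmd_issue(struct cmd *c, size_t blocks)
{
    c->blocks_total = blocks;
    c->blocks_done = 0;
    c->state = blocks ? CMD_DISCONNECTED : CMD_DONE;
}

/* Steps 5 to 10: the device reconnects with `ready` blocks of data.
 * Returns 1 if the command completed, 0 if the device disconnected again. */
static int cmd_reconnect(struct cmd *c, size_t ready)
{
    assert(c->state == CMD_DISCONNECTED);
    c->state = CMD_TRANSFER;

    while (ready > 0 && c->blocks_done < c->blocks_total) {
        /* Step 6: only now would SCRIPTS locate the next DMA block. */
        c->blocks_done++;   /* step 7: one linear transfer */
        ready--;            /* step 8: loop while data remains */
    }

    if (c->blocks_done == c->blocks_total) {
        c->state = CMD_DONE;            /* step 10 */
        return 1;
    }
    c->state = CMD_DISCONNECTED;        /* steps 9, 3, 4: out of data */
    return 0;
}
```

Issuing two commands back to back and then reconnecting them in either order works in this model, because nothing in cmd_issue depends on knowing where the data will go.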
The only way to make your API work is to skip straight from step 3 to step 6,
which effectively loses the command queueing capability. It may be that it's
hard/impossible to get both command queueing and zero-copy. In that case I say
command queueing wins.
Also note that use of self-modifying SCRIPTS is common.
Paul