qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Paul Brook <paul@codesourcery.com>
To: Gerd Hoffmann <kraxel@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [sneak preview] major scsi overhaul
Date: Wed, 11 Nov 2009 14:13:09 +0000	[thread overview]
Message-ID: <200911111413.09320.paul@codesourcery.com> (raw)
In-Reply-To: <4AFA86C5.6020000@redhat.com>

> The current qemu code *does* cache the response.  scsi-disk caps the
> buffer at 128k (which is big enough for any request I've seen in my
> testing).  scsi-generic has no cap.

That cap is important.
For scsi-generic you probably don't have a choice because of the way the 
kernel interface works.
 
> With the new interface the HBA has to handle the caching if needed.  But
> the HBA also has the option to pass scatter lists, in which case qemu
> doesn't do any caching, the data is transfered directly from/to guest
> memory.  Which is clearly an improvement IMO.

> > Remember that a single
> > transfer can be very large (greater than available ram on the host). Even
> > with a SG capable HBA, you can not assume all the data will be
> > transferred in a single request.
> 
> scsi-generic: It must be a single request anyway, and it already is today.
> 
> scsi-disk: dma_bdrv_{read,write} will split it into smaller chunks if
> needed.

You seem to be assuming the HBA knows where it's going to put the data before 
it issues the command. This is not true (see blow).

> > You should also consider how this interacts with command queueing.
> > IIUC an Initiator (HBA) typically sends a tagged command, then
> > disconnects from the target (disk). At some later time the target
> > reconnects, and the initiator starts the DMA transfer. By my reading your
> > code does not issue any IO requests until after the HBA starts
> > transferring data.
> 
> Hmm?  What code you are looking at?
> 
> For esp and usb-storage reads and writes are handles slightly different.
>   They roughly works like this:

Neither ESP nor usb-storage implement command queueing, so aren't interesting.

> lsi (only one in-tree with TCQ support) works like this:
> 
> - allocate + parse scsi command.        scsi_req_get+scsi_req_parse
> - continue script processing, collect
>    DMA addresses and stick them into
>    a scatter list until it is complete.
> - queue command and disconnect.
> - submit I/O to the qemu block layer    scsi_req_sgl
> 
> *can process more scsi commands here*
>
> - when I/O is finished reselect tag
> - return status, release request.       scsi_req_put

I'm pretty sure this is wrong, and what actually happens is:

1) Wait for device to reconnect (goto 5), or commands from host (goto 2).

2) SCRIPTS connect to device, and send command.
3) If device has data immediately (metadata command) then goto 6
4) Device disconnects. goto 1

5) Device has data ready, and reconnects
6) SCRIPTS locate the next DMA block for this command, and initiate a (linear) 
DMA transfer.
7) DATA is transferred. Note that DMA stalls the SCRIPTS processor until the 
transfer completes.
8) If the device still has data then goto 6.
9) If the device runs out of data before the command completes then goto 3.
10) Command complete. goto 1

Note that the IO command is parsed at stage 2, but the data transfer is not 
requested until stage 6. i.e. after the command has partially completed. This 
window between issue and data transfer is where other commands are issued.

The only way to make your API work is to skip straight from step 3 to step 6, 
which effectively looses the command queueing capability. It may be that it's 
hard/impossible to get both command queueing and zero-copy. In that case I say 
command queueing wins.

Also note that use of self-modifying SCRIPTS is common.

Paul

  reply	other threads:[~2009-11-11 14:13 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-11-06 23:09 [Qemu-devel] [sneak preview] major scsi overhaul Gerd Hoffmann
2009-11-07 15:22 ` Blue Swirl
2009-11-09  9:08   ` Gerd Hoffmann
2009-11-09 12:37     ` Avi Kivity
2009-11-09 13:03       ` Gerd Hoffmann
2009-11-09 13:17         ` Avi Kivity
2009-11-09 13:39           ` Gerd Hoffmann
2009-11-09 13:48             ` Avi Kivity
2009-11-09 20:38       ` Blue Swirl
2009-11-09 21:25         ` Gerd Hoffmann
2009-11-11  4:06 ` Paul Brook
2009-11-11  9:41   ` Gerd Hoffmann
2009-11-11 14:13     ` Paul Brook [this message]
2009-11-11 15:26       ` Gerd Hoffmann
2009-11-11 16:38         ` Paul Brook
2009-11-16 16:35           ` Gerd Hoffmann
2009-11-16 18:53             ` Paul Brook
2009-11-16 21:50               ` Gerd Hoffmann
2009-11-24 11:59               ` Gerd Hoffmann
2009-11-24 13:51                 ` Paul Brook
2009-11-25 16:37                   ` Gerd Hoffmann
2009-11-26  7:31                     ` Hannes Reinecke
2009-11-26  8:25                       ` Gerd Hoffmann
2009-11-26 10:57                         ` Hannes Reinecke
2009-11-26 11:04                           ` Gerd Hoffmann
2009-11-26 11:20                             ` Hannes Reinecke
2009-11-26 14:21                               ` Gerd Hoffmann
2009-11-26 14:27                                 ` Hannes Reinecke
2009-11-26 14:37                                   ` Gerd Hoffmann
2009-11-26 15:50                                     ` Hannes Reinecke
2009-11-27 11:08                                       ` Gerd Hoffmann
2009-12-02 13:47                                         ` Gerd Hoffmann
2009-12-07  8:28                                           ` Hannes Reinecke
2009-12-07  8:50                                             ` Gerd Hoffmann
2009-11-16 19:08             ` Ryan Harper
2009-11-16 20:40               ` Gerd Hoffmann
2009-11-16 21:45                 ` Ryan Harper
2009-11-11 11:21 ` [Qemu-devel] " Gerd Hoffmann
2009-11-11 11:52   ` Hannes Reinecke
2009-11-11 13:02     ` Gerd Hoffmann
2009-11-11 13:30       ` Hannes Reinecke
2009-11-11 14:37         ` Gerd Hoffmann
2009-11-12  9:54           ` Hannes Reinecke

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200911111413.09320.paul@codesourcery.com \
    --to=paul@codesourcery.com \
    --cc=kraxel@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).