From: Hannes Reinecke <hare@suse.de>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kevin Wolf <kwolf@redhat.com>, qemu-devel <qemu-devel@nongnu.org>,
	ronnie sahlberg <ronniesahlberg@gmail.com>
Subject: Re: [Qemu-devel] scsi-generic and max request size
Date: Tue, 21 Dec 2010 09:44:19 +0100
Message-ID: <4D1068E3.4080000@suse.de>
In-Reply-To: <1292903541.16694.695.camel@pasglop>

On 12/21/2010 04:52 AM, Benjamin Herrenschmidt wrote:
> On Tue, 2010-12-21 at 14:38 +1100, ronnie sahlberg wrote:
>> Ben,
>>
>> Since it is a SCSI device you can try the Inquiry command with
>> page code 0xb0: the Block Limits VPD page.
>> That page shows the optimal and maximum request sizes.
>>
>> This is for SBC, in the Vital Product Data chapter.
>>
>> Unfortunately this page is not mandatory so some devices might not
>> understand it. :-(
>>
>> sg_inq --page=0x00 /dev/sg?
>> will show you what inq pages your device supports.
> 
> Well, that won't help much figuring what the limit is since in most cases
> the limit seems to come from the host linux HBA (ie, usb-storage for
> example artificially clamps the max request size to deal with bogus
> USB-ATA bridges).
> 
Indeed. The request size is pretty much limited by the driver/scsi
layer, so the above page won't help much here.

> As for using this to try to "inform" the guest OS as to what the limit
> is, this could be done by "patching" the result of that command on the
> fly in qemu, but that is nasty, and would only work if the guest OS
> actually uses the said command in the first place. AFAIK, neither sr.c
> nor sd.c do in Linux.
> 
And you'll be getting yelled at by hch to boot.

> So back to square 1 ... my vscsi (and virtio-blk too btw) can
> technically pass a max size to the guest, but we don't have a way to
> interrogate scsi-generic (and the underlying block driver) which is the
> main issue (that plus the fact that the ioctl seems to be broken in
> "compat" mode for /dev/sg specifically)...
> 
Ah, the warm and fuzzy feeling of knowing I'm not alone in this ...

This is basically the same issue I brought up with the first
submission round of my megasas emulation.

As we're passing scatter-gather lists directly to the underlying
device we might end up sending a request which is improperly
formatted. The Linux block layer has three limits which a
request has to conform to:
- Max overall request size (max_sectors)
- Max length of the scatter-gather list (max_segments)
- Max length of individual sg elements (max_segment_size)

Newer kernels export these limits via sysfs; this was added with
commit c77a5710b7e23847bfdb81fcaa10b585f65c960a.
For older kernels, however, we're left in the dark here.
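As a rough sketch of what reading those limits could look like (in
Python for brevity, not QEMU code): the helper name read_queue_limits
and its sysfs_root parameter are made up for illustration, and the
attribute names assume the usual /sys/block/<dev>/queue/ layout.
Attributes an older kernel does not export come back as None.

```python
import os

def read_queue_limits(dev, sysfs_root="/sys/block"):
    """Return the queue limits for block device `dev` as a dict.

    Hypothetical helper. max_sectors_kb is in KiB, max_segments is a
    count, max_segment_size is in bytes. A limit the running kernel
    does not export (or that cannot be parsed) is reported as None.
    """
    queue_dir = os.path.join(sysfs_root, dev, "queue")
    limits = {}
    for name in ("max_sectors_kb", "max_segments", "max_segment_size"):
        try:
            with open(os.path.join(queue_dir, name)) as f:
                limits[name] = int(f.read().strip())
        except (OSError, ValueError):
            limits[name] = None
    return limits
```

For example, read_queue_limits("sda") would return a dict with the
three keys above; which values are present depends on the host kernel.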

So on newer kernels we could probably do a quick check against the
block queue limits and reformat the I/O if required.
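The quick check could be as simple as the following sketch. It models
an sg list as a plain list of element lengths in bytes; request_fits
and its parameters are hypothetical names, and max_bytes is assumed to
be the overall size limit already converted to bytes.

```python
def request_fits(sg_lengths, max_bytes, max_segments, max_segment_size):
    """Return True if the sg list respects all three queue limits.

    sg_lengths       -- list of sg element lengths in bytes
    max_bytes        -- max overall request size in bytes
    max_segments     -- max number of sg elements
    max_segment_size -- max length of one sg element in bytes
    """
    return (len(sg_lengths) <= max_segments
            and sum(sg_lengths) <= max_bytes
            and all(n <= max_segment_size for n in sg_lengths))
```

Only a request failing this check would need to be reformatted;
everything else could be passed through unchanged.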

Instead of reformatting we could send each element of an sg
list individually. That would introduce some slowdown, as
the sg lists have to be reassembled again by the lower layers, but
we would be insulated from any sg list mismatch.
However, this won't cover requests with sg elements that are too large.
For those we could probably use some simple divide-by-two algorithm
on the elements to make them fit.
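A minimal sketch of such a divide-by-two step on a single element,
working on lengths only: halve_until_fits is a hypothetical name, and
the sketch ignores the sector alignment a real implementation would
have to preserve when choosing the split point.

```python
def halve_until_fits(length, max_segment_size):
    """Split one sg element length into pieces <= max_segment_size.

    The element is halved recursively until every piece fits.
    Alignment of the split point is deliberately ignored here.
    """
    if max_segment_size < 1:
        raise ValueError("max_segment_size must be at least 1")
    if length <= max_segment_size:
        return [length]
    half = length // 2
    return (halve_until_fits(half, max_segment_size)
            + halve_until_fits(length - half, max_segment_size))
```

The pieces always add up to the original length, so the data covered
by the element stays the same; only the number of elements grows.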

But seeing we have to split the I/O requests anyway we might as well
use the divide-by-two algorithm for the sg lists, too.
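Applied to the sg list itself, the same idea might look like the
sketch below: keep cutting the list in half until every sub-request
respects both the element count and the overall size limit.
split_request is a hypothetical name, and the sketch assumes no single
element is larger than max_bytes on its own.

```python
def split_request(sg_lengths, max_bytes, max_segments):
    """Halve an sg list until each sub-request fits the limits.

    Returns a list of sg lists (element lengths in bytes), in the
    original order. Raises ValueError if one element alone exceeds
    max_bytes, since halving the list cannot fix that.
    """
    if max_segments < 1:
        raise ValueError("max_segments must be at least 1")
    if len(sg_lengths) <= max_segments and sum(sg_lengths) <= max_bytes:
        return [sg_lengths]
    if len(sg_lengths) == 1:
        raise ValueError("single sg element exceeds max_bytes")
    mid = len(sg_lengths) // 2
    return (split_request(sg_lengths[:mid], max_bytes, max_segments)
            + split_request(sg_lengths[mid:], max_bytes, max_segments))
```

Halving is not optimal in the number of sub-requests it produces, but
it is simple and needs no knowledge beyond the two limits.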

Easiest would be if we could just transfer the available bits and
push the request back to the guest as a partial completion.
Sadly the I/O stack on the guest will choose to interpret this as an
I/O error instead of retrying the remainder :-(

So in the long run I fear we have to implement some sort of I/O
request splitting in Qemu, using the values from sysfs.

Cheers,

Hannes
--
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
