From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Hannes Reinecke <hare@suse.de>
Cc: Kevin Wolf <kwolf@redhat.com>,
Christoph Hellwig <hch@infradead.org>,
qemu-devel <qemu-devel@nongnu.org>,
ronnie sahlberg <ronniesahlberg@gmail.com>
Subject: Re: [Qemu-devel] scsi-generic and max request size
Date: Thu, 23 Dec 2010 08:59:10 +1100
Message-ID: <1293055150.16694.779.camel@pasglop>
In-Reply-To: <4D12032E.6040602@suse.de>
On Wed, 2010-12-22 at 14:54 +0100, Hannes Reinecke wrote:
> Well, sort of. 'sg' doesn't have any block queue limits directly as the
> block queue is attached to the block device (surprise, surprise :-).
> But nevertheless any commands sent via SG_IO are being placed on the
> block queue, hence the same limits apply here, too.
Right, tho is there a "simple" way to map sg to the appropriate block
driver to retrieve the info via sysfs? It looks possible from a quick
peek there but it also looks like an ungodly mess.
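To make the sysfs question concrete, here is a minimal sketch of one way
the mapping could be walked. It assumes a layout in which
/sys/class/scsi_generic/sgN/device/block/<name>/queue holds the limit
files (max_sectors_kb, max_hw_sectors_kb, max_segments); that layout is
an assumption about the running kernel, not something established in
this thread, and the function name and sysfs_root parameter are made up
for illustration.

```python
import os

# Limit files assumed to live under <block device>/queue in sysfs.
LIMIT_FILES = ("max_sectors_kb", "max_hw_sectors_kb", "max_segments")


def sg_queue_limits(sg_name, sysfs_root="/sys"):
    """Return {block_dev: {limit_name: int}} for the block device(s)
    found behind an sg node, or {} if there are none.

    Hypothetical helper: the path layout below is an assumption.
    """
    block_dir = os.path.join(
        sysfs_root, "class", "scsi_generic", sg_name, "device", "block"
    )
    limits = {}
    if not os.path.isdir(block_dir):
        # No block device behind this sg node (or a different layout),
        # so there is no request queue to read limits from here.
        return limits
    for blk in sorted(os.listdir(block_dir)):
        queue_dir = os.path.join(block_dir, blk, "queue")
        values = {}
        for name in LIMIT_FILES:
            try:
                with open(os.path.join(queue_dir, name)) as f:
                    values[name] = int(f.read().strip())
            except (OSError, ValueError):
                # Missing or unparsable file: skip it rather than guess.
                continue
        limits[blk] = values
    return limits
```

Something like sg_queue_limits("sg0") would then return whatever limits
are exposed for the disk behind /dev/sg0, and an empty dict when nothing
is found, which is exactly the case where a caller would still have to
fall back on something else.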
> If it were me I would be using
I think you meant to type more here :-)
> > However, I can't quite figure out how to reliably obtain that
> > information in my driver since on one hand, the ioctl doesn't seem to
> > work in mixed 32/64-bit environments, and on the other hand, sysfs
> > doesn't seem to have anything for "sg" in /sys/class/block... Besides,
> > those are both Linux-isms... so we'd have to be extra careful there too.
> >
> Yes. I've been bashing my head against this, too.
Christoph, any suggestion there ?
> IMO the whole problem arises from the fact that we're deliberately
> destroying information here.
> Most modern HBAs are using separate codepaths for streaming/block I/O
> anyway, but when using 'scsi-generic' we are forced to discard this
> information. We have to fake a SCSI READ/WRITE command, and send it via
> SG_IO to the underlying device and keep fingers crossed that we're not
> exceeding any device limitations.
I wouldn't say it like that no.
It's a transport problem. In my case I'm not "faking" anything, vscsi is
just a transport (a variant of SRP). The problem is that when
'emulating' a HW HBA, you have no way to express the intrinsic
limitations of the underlying HBA, but that's not a problem I have with
vscsi which is meant to be a transport and as such does have means to
convey that sort of information (tho in my case, I have some issues due
to assumptions/bugs in the existing ibm vscsi client driver but that's a
different topic).
So I think there's a significant difference here between emulating a HW
HBA and doing something like vscsi. The former has problems that cannot
be easily solved, I believe. The latter's problems, on the other hand,
can be solved; the means to do so are there, but we have to deal with
"interface" issues ... plumbing problems.
The non-working compat ioctl is one, the fact that "sg" has
no /sys/class/block (or /sys/block) entries is another, etc... I.e., we
are faced with the problem that Linux does not expose that information
in an easy-to-retrieve way, and there is no proper cross-platform way to
obtain it either.
> The whole problem would just go away if we could use the standard block
> read()/write() calls here. Then the iovec would be placed _as
> scatter-gather list_ on the request-queue and the block layer would take
> care of the whole issue.
That would be somewhat cheating with the concept of just being a SCSI
transport layer :-) You would interpret some requests and turn them into
something else. That would be "interesting" when your user starts using
tags and makes assumptions about what's in flight and what's not etc...
> I've tried to advocate this approach once, but (again) was being told
> that it's a misuse of scsi-generic and I should be using scsi-disk instead.
>
> However, since Alex Graf is facing similar problems with the AHCI HBA of
> his maybe we could retry again ...
Again, I'd say different problems :-) To some extent scsi-disk will
solve the issues with basic read/write operations, but there are some
nastier SCSI commands that you want to pass through for things like DVD
burning, for example, unless we start building higher level abstractions
into the kernel. So you -still- end up acting somewhat as a SCSI
transport layer, and potentially hit the problem with limits again.
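As an illustration of what "hitting the limits" means for a pass-through
transport, here is a minimal sketch that reads the transfer length out
of a READ(10)/WRITE(10) CDB and compares it against a limit expressed in
KiB (the unit used by max_sectors_kb). The function names and the
512-byte default block size are assumptions for the example; other
opcodes are deliberately left uninterpreted, since a transport should
not need to understand them.

```python
import struct

READ_10 = 0x28
WRITE_10 = 0x2A


def transfer_bytes(cdb, block_size=512):
    """Return the data size in bytes requested by a READ(10)/WRITE(10)
    CDB, or None for any other opcode (left uninterpreted)."""
    if len(cdb) < 10 or cdb[0] not in (READ_10, WRITE_10):
        return None
    # Transfer length is a big-endian 16-bit block count at bytes 7-8.
    (blocks,) = struct.unpack_from(">H", cdb, 7)
    return blocks * block_size


def exceeds_limit(cdb, max_sectors_kb, block_size=512):
    """True if the CDB asks for more data than max_sectors_kb allows."""
    size = transfer_bytes(cdb, block_size)
    return size is not None and size > max_sectors_kb * 1024
```

This only detects an oversized request; what to do about it (reject it,
or advertise the limit to the client up front so it never happens) is
the actual plumbing problem discussed above.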
Cheers,
Ben.
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke zSeries & Storage
> hare@suse.de +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Markus Rex, HRB 16746 (AG Nürnberg)