From: Jens Axboe <axboe@suse.de>
To: Douglas Gilbert <dougg@torque.net>
Cc: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: [PATCH] SG_IO readcd and various bugs
Date: Tue, 3 Jun 2003 10:29:38 +0200 [thread overview]
Message-ID: <20030603082938.GG482@suse.de> (raw)
In-Reply-To: <3EDC30C7.5060804@torque.net>
On Tue, Jun 03 2003, Douglas Gilbert wrote:
> Jens Axboe wrote:
> >On Sun, Jun 01 2003, Douglas Gilbert wrote:
> <snip>
> >>The block layer SG_IO ioctl passes through the SCSI
> >>command set to a device that understands it
> >>(i.e. not necessarily a "SCSI" device in the traditional
> >>sense). Other pass throughs exist (or may be needed) for
> >>ATA's task file interface and SAS's management protocol.
> >>
> >>Even though my tests, shown earlier in this thread, indicated
> >>that the SG_IO ioctl might be a shade faster than O_DIRECT,
> >>the main reason for having it is to pass through "non-block"
> >>commands to a device. Some examples:
> >> - special writes (e.g. formating a disk, writing a CD/DVD)
> >> - uploading firmware
> >> - reading the defect table from a disk
> >> - reading and writing special areas on a disk
> >> (e.g. application client log page)
> >>
> >>The reason for choosing this list is that all these
> >>operations potentially move large amounts of data in a
> >>single operation. For such data transfers to be constrained
> >>by max_sectors is questionable. Putting a block paradigm
> >>bypass in the block layer is an interesting design :-)
> >
> >
> >I think this is nonsense. The block layer will not accept commands
> >that it cannot handle in one go, what would the point of that be?
> >There's no way for us to break down a single command into pieces,
> >we have no idea how to do that. max_sectors _is_ the natural
> >constraint, it's the hardware limit not something I impose through
> >policy. For SCSI it could be bigger in some cases, that's up to the
> >lldd to set though.
> <snip>
>
> Jens,
> Reviewing the linix-scsi archives, max_sectors was
> introduced around lk 2.4.7 and you were quite active
> in its promotion. There are also posts about problems
> with qlogic HBAs and their need for a limit to maximum
> transfer length. So there is some hardware justification.
>
> On 11th April 2002 Justin Gibbs posted this in a mail
> about aic7xxx version 6.2.6:
> "2) Set max_sectors to a sane value. The aic7xxx driver was not
> updated when this value was added to the host template structure.
> In more recent kernels, the default setting for this field, 255,
> can limit our transaction size to 127K. This often causes the
> scsi_merge routines to generate 127k followed by 1k I/Os to complete
> a client transaction. The command overhead of such small
> transactions
> can severely impact performance. The driver now sets max_sectors to
> 8192 which equates to the 16MB S/G element limit for these cards as
> expressed in 2K sectors."
>
> At the time max_sectors defaulted to 255, later it was
> bumped to 256 and is now 1024 in lk 2.5. However Justin's
> post is saying the hardware limit for a data transfer
> associated with a single SCSI command in the aic7xxx
> driver is:
> sg_tablesize * (2 ** 24) bytes == 2 GB
> as the aic7xxx driver sets sg_tablesize to 128.
> Taking into account the largest practical kmalloc of 128 KB
> (which is not a hardware limitation) this number comes down
> to 16 MB. The 8192 figure that Justin chose is still in place
> in the aic7xxx driver in lk 2.5 and it limits maximum transfer
> size to 4 MB since the unit of max_sectors is now 512 bytes.
>
> Various projects have reported to me success in transferring
> 8 and 16 MB individual WRITE commands through the sg driver,
> usually with LSI or Adaptec HBAs. The max_sectors==8192
> set by the aic7xxx is the maximum of any driver in the
> ide or the scsi subsystems (both in lk 2.4 and lk 2.5)
> currently. Most drivers are picking up the default value.
> The definition of "max_sectors" states in
> drivers/scsi/hosts.h:
> "if the host adapter has limitations beside segment count"
> That could be taken to imply if a LLD does not define
> max_sectors then there is no limit.
>
> In summary, from a HBA drivers point of view, "max_sectors"
> is misnamed (since they transfer bytes) and not precise
> enough to describe any limitations on data transfers they
> may have.
>
> Apologies in advance for propagating further nonsense.
Wow, that was a long email. I don't have good connectivity these days,
so mail collisions are bound to happen.
As I wrote in the last email to you, it might make sense to introduce a
hard upper limit and a preferred limit. I think you are missing the
point with the latency requirements. If you allow 16MB sg requests in
the queue, you will both have pinned down a _lot_ of memory (that's one
problem) and build up a huge latency queue. And that's not even
considering that it makes _no_ sense from a performance pov to go as
high as 16MB, zero.
The fact is that for some drivers, max_sectors is a hard limit. There's
no way that scsi_ioctl will pass down requests bigger than this, period.
If 512KB is too small for some operations (your firmware case, it makes
sense), then I'm all for fixing that up.
--
Jens Axboe
next prev parent reply other threads:[~2003-06-03 8:16 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2003-05-31 8:23 [PATCH] SG_IO readcd and various bugs Douglas Gilbert
2003-05-31 10:57 ` Jens Axboe
2003-06-01 7:39 ` Douglas Gilbert
2003-06-02 7:27 ` Jens Axboe
2003-06-03 5:23 ` Douglas Gilbert
2003-06-03 5:39 ` Nick Piggin
2003-06-03 8:29 ` Jens Axboe [this message]
-- strict thread matches above, loose matches on Subject: below --
2003-05-30 13:02 Jens Axboe
2003-05-30 13:47 ` Markus Plail
2003-05-30 13:52 ` Markus Plail
2003-05-30 14:58 ` Jens Axboe
2003-05-30 16:57 ` Markus Plail
2003-06-01 7:50 ` uaca
2003-06-01 9:18 ` Markus Plail
2003-06-01 10:19 ` uaca
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20030603082938.GG482@suse.de \
--to=axboe@suse.de \
--cc=dougg@torque.net \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox