public inbox for linux-kernel@vger.kernel.org
From: Jens Axboe <axboe@suse.de>
To: Douglas Gilbert <dougg@torque.net>
Cc: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: [PATCH] SG_IO readcd and various bugs
Date: Tue, 3 Jun 2003 10:29:38 +0200
Message-ID: <20030603082938.GG482@suse.de>
In-Reply-To: <3EDC30C7.5060804@torque.net>

On Tue, Jun 03 2003, Douglas Gilbert wrote:
> Jens Axboe wrote:
> >On Sun, Jun 01 2003, Douglas Gilbert wrote:
> <snip>
> >>The block layer SG_IO ioctl passes through the SCSI
> >>command set to a device that understands it
> >>(i.e. not necessarily a "SCSI" device in the traditional
> >>sense). Other pass throughs exist (or may be needed) for
> >>ATA's task file interface and SAS's management protocol.
> >>
> >>Even though my tests, shown earlier in this thread, indicated
> >>that the SG_IO ioctl might be a shade faster than O_DIRECT,
> >>the main reason for having it is to pass through "non-block"
> >>commands to a device. Some examples:
> >> - special writes (e.g. formatting a disk, writing a CD/DVD)
> >> - uploading firmware
> >> - reading the defect table from a disk
> >> - reading and writing special areas on a disk
> >>   (e.g. application client log page)
> >>
> >>The reason for choosing this list is that all these
> >>operations potentially move large amounts of data in a
> >>single operation. For such data transfers to be constrained
> >>by max_sectors is questionable. Putting a block paradigm
> >>bypass in the block layer is an interesting design :-)
> >
> >
> >I think this is nonsense. The block layer will not accept commands
> >that it cannot handle in one go, what would the point of that be?
> >There's no way for us to break down a single command into pieces,
> >we have no idea how to do that. max_sectors _is_ the natural
> >constraint, it's the hardware limit not something I impose through
> >policy. For SCSI it could be bigger in some cases, that's up to the
> >lldd to set though.
> <snip>
> 
> Jens,
> Reviewing the linux-scsi archives, max_sectors was
> introduced around lk 2.4.7 and you were quite active
> in its promotion. There are also posts about problems
> with qlogic HBAs and their need for a limit to maximum
> transfer length. So there is some hardware justification.
> 
> On 11th April 2002 Justin Gibbs posted this in a mail
> about aic7xxx version 6.2.6:
> "2) Set max_sectors to a sane value.  The aic7xxx driver was not
>    updated when this value was added to the host template structure.
>    In more recent kernels, the default setting for this field, 255,
>    can limit our transaction size to 127K.  This often causes the
>    scsi_merge routines to generate 127k followed by 1k I/Os to complete
>    a client transaction.  The command overhead of such small
>    transactions can severely impact performance.  The driver now sets
>    max_sectors to 8192 which equates to the 16MB S/G element limit for
>    these cards as expressed in 2K sectors."
> 
> At the time max_sectors defaulted to 255; later it was
> bumped to 256 and is now 1024 in lk 2.5. However, Justin's
> post is saying the hardware limit for a data transfer
> associated with a single SCSI command in the aic7xxx
> driver is:
>   sg_tablesize * (2 ** 24) bytes == 2 GB
> as the aic7xxx driver sets sg_tablesize to 128.
> Taking into account the largest practical kmalloc of 128 KB
> (which is not a hardware limitation) this number comes down
> to 16 MB. The 8192 figure that Justin chose is still in place
> in the aic7xxx driver in lk 2.5 and it limits maximum transfer
> size to 4 MB since the unit of max_sectors is now 512 bytes.
> 
> Various projects have reported to me success in transferring
> 8 and 16 MB individual WRITE commands through the sg driver,
> usually with LSI or Adaptec HBAs. The max_sectors==8192
> set by the aic7xxx is the maximum of any driver in the
> ide or the scsi subsystems (both in lk 2.4 and lk 2.5)
> currently. Most drivers are picking up the default value.
> The definition of "max_sectors" states in
> drivers/scsi/hosts.h:
>   "if the host adapter has limitations beside segment count"
> That could be taken to imply if a LLD does not define
> max_sectors then there is no limit.
> 
> In summary, from an HBA driver's point of view, "max_sectors"
> is misnamed (since they transfer bytes) and not precise
> enough to describe any limitations on data transfers they
> may have.
> 
> Apologies in advance for propagating further nonsense.
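
The figures quoted above can be checked with a little arithmetic. The
following is only a sketch in Python, not kernel code: the function names
are made up for illustration, and the inputs are the values cited in the
quoted mail (max_sectors of 255, 1024 and 8192, sg_tablesize of 128, a
16 MB S/G element limit, a 128 KB kmalloc ceiling).

```python
KIB = 1024
MIB = 1024 * KIB
SECTOR_BYTES = 512  # unit of max_sectors in lk 2.5, per the quoted mail


def max_transfer_bytes(max_sectors, sector_bytes=SECTOR_BYTES):
    """Largest single transfer allowed by a given max_sectors value."""
    return max_sectors * sector_bytes


def split_transfer(total_sectors, max_sectors):
    """Split a transfer into chunks no larger than max_sectors (illustrative)."""
    chunks = []
    remaining = total_sectors
    while remaining > 0:
        chunk = min(remaining, max_sectors)
        chunks.append(chunk)
        remaining -= chunk
    return chunks


def sg_limit_bytes(sg_tablesize, element_bytes):
    """Upper bound implied by the scatter/gather table alone."""
    return sg_tablesize * element_bytes


# Old default of 255 sectors: 127.5 KiB per command, so a 128 KiB
# (256-sector) transaction splits into a 255-sector and a 1-sector I/O.
old_default = max_transfer_bytes(255)
split_128k = split_transfer(256, 255)

# aic7xxx value of 8192: 16 MiB if counted in 2K sectors, 4 MiB in 512-byte ones.
aic_2k = max_transfer_bytes(8192, 2048)
aic_512 = max_transfer_bytes(8192)

# S/G bounds: 128 elements of 2**24 bytes, or of 128 KiB kmalloc buffers.
sg_hw = sg_limit_bytes(128, 2 ** 24)
sg_kmalloc = sg_limit_bytes(128, 128 * KIB)

# lk 2.5 default of 1024 sectors.
lk25_default = max_transfer_bytes(1024)
```

With these inputs the 2.5 default of 1024 sectors works out to 512 KiB,
which is the "512KB" figure in the reply below.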

Wow, that was a long email. I don't have good connectivity these days,
so mail collisions are bound to happen.

As I wrote in the last email to you, it might make sense to introduce a
hard upper limit and a preferred limit. I think you are missing the
point about the latency requirements. If you allow 16MB sg requests in
the queue, you will both pin down a _lot_ of memory (that's one
problem) and build up a huge latency queue. And that's not even
considering that it makes _no_ sense from a performance point of view
to go as high as 16MB, none at all.
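
To put rough numbers on the pinned-memory and latency point, here is a
small sketch in Python. The queue depth (32 requests) and the sustained
throughput (40 MiB/s) are assumed values picked for illustration, not
figures from this thread, and queue_cost is a hypothetical helper.

```python
KIB = 1024
MIB = 1024 * KIB


def queue_cost(request_bytes, queued_requests, throughput_bytes_per_s):
    """Return (pinned bytes, seconds to drain) for a queue full of requests.

    Assumes every queued request pins its whole buffer and that the device
    drains the queue at a constant rate, which is a simplification.
    """
    pinned = request_bytes * queued_requests
    drain_seconds = pinned / throughput_bytes_per_s
    return pinned, drain_seconds


ASSUMED_DEPTH = 32             # hypothetical number of queued requests
ASSUMED_THROUGHPUT = 40 * MIB  # hypothetical device throughput, bytes/s

# 16 MiB requests versus the 512 KiB limit discussed in this mail.
big_pinned, big_wait = queue_cost(16 * MIB, ASSUMED_DEPTH, ASSUMED_THROUGHPUT)
small_pinned, small_wait = queue_cost(512 * KIB, ASSUMED_DEPTH, ASSUMED_THROUGHPUT)

print(f"16 MiB requests:  {big_pinned // MIB} MiB pinned, {big_wait:.1f} s to drain")
print(f"512 KiB requests: {small_pinned // MIB} MiB pinned, {small_wait:.1f} s to drain")
```

Under those assumptions a request arriving behind a full queue of 16 MiB
transfers waits 32 times longer than behind 512 KiB ones, and 32 times
as much memory stays pinned in the meantime.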

The fact is that for some drivers, max_sectors is a hard limit. There's
no way that scsi_ioctl will pass down requests bigger than this, period.
If 512KB is too small for some operations (your firmware case, it makes
sense), then I'm all for fixing that up.

-- 
Jens Axboe


