linux-kernel.vger.kernel.org archive mirror
From: Douglas Gilbert <dgilbert@interlog.com>
To: "Elliott, Robert (Server Storage)" <Elliott@hp.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Chris Friesen <chris.friesen@windriver.com>
Cc: Jens Axboe <axboe@kernel.dk>, lkml <linux-kernel@vger.kernel.org>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	Mike Snitzer <snitzer@redhat.com>
Subject: Re: absurdly high "optimal_io_size" on Seagate SAS disk
Date: Fri, 07 Nov 2014 15:15:01 -0500
Message-ID: <545D2845.5020904@interlog.com>
In-Reply-To: <94D0CD8314A33A4D9D801C0FE68B40295937AE5D@G4W3202.americas.hpqcorp.net>

On 14-11-07 12:10 PM, Elliott, Robert (Server Storage) wrote:
>> commit 87c0103ea3f96615b8a9816b8aee8a7ccdf55d50
>> Author: Martin K. Petersen <martin.petersen@oracle.com>
>> Date:   Thu Nov 6 12:31:43 2014 -0500
>>
>>      [SCSI] sd: Sanity check the optimal I/O size
>>
>>      We have come across a couple of devices that report crackpot
>>      values in the optimal I/O size in the Block Limits VPD page.
>>      Since this is a 32-bit entity that gets multiplied by the
>>      logical block size we can get disproportionately large values
>>      reported to the block layer.
>>
>>      Cap io_opt at 1 GB.
>
> Another reasonable cap is the maximum transfer size.
> There are lots of them:
>
> * the block layer BIO_MAX_PAGES value of 256 limits IOs
>    to a maximum of 1 MiB
> * SCSI LLDs report their maximum transfer size in
>    /sys/block/sdNN/queue/max_hw_sectors_kb
> * the SCSI midlayer maximum transfer size is set/reported
>    in /sys/block/sdNN/queue/max_sectors_kb
>    and the default is 512 KiB
> * the SCSI LLD maximum number of scatter gather entries
>    reported in /sys/block/sdNN/queue/max_segments and
>    /sys/block/sdNN/queue/max_segment_size creates a
>    limit based on how fragmented the data buffer is
>    in virtual memory
> * the Block Limits VPD page MAXIMUM TRANSFER LENGTH field
>    indicates the maximum transfer size for one command over
>    the SCSI transport protocol supported by the drive itself
>
> It is risky to use transfer sizes larger than Linux and
> Windows can generate, since drives are probably tested in
> those environments.
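
To make the two capping ideas above concrete, here is a minimal sketch
in C. It is not the code from the quoted commit: the function name
sane_io_opt(), the max_xfer_bytes parameter and the choice of 1 GiB for
the "1 GB" cap are all assumptions made for illustration. It does the
multiplication in 64 bits so the 32-bit block count cannot overflow,
clamps to the 1 GB cap, and then optionally clamps to a maximum
transfer size as suggested in the quoted text.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cap: the commit says "1 GB"; 1 GiB is used here for illustration. */
#define IO_OPT_CAP_BYTES (1024ULL * 1024ULL * 1024ULL)

/*
 * Hypothetical helper, not a kernel function.
 * opt_xfer_blocks:    32-bit optimal transfer length from the Block Limits VPD page
 * logical_block_size: bytes per logical block (must be non-zero)
 * max_xfer_bytes:     additional upper bound in bytes, or 0 for "no extra bound"
 * Returns the optimal I/O size in bytes after clamping.
 */
static uint64_t sane_io_opt(uint32_t opt_xfer_blocks,
                            uint32_t logical_block_size,
                            uint64_t max_xfer_bytes)
{
    uint64_t io_opt;

    assert(logical_block_size != 0);

    /* Widen before multiplying so the product cannot wrap. */
    io_opt = (uint64_t)opt_xfer_blocks * logical_block_size;

    if (io_opt > IO_OPT_CAP_BYTES)
        io_opt = IO_OPT_CAP_BYTES;
    if (max_xfer_bytes != 0 && io_opt > max_xfer_bytes)
        io_opt = max_xfer_bytes;

    return io_opt;
}
```

With a device reporting 0xFFFFFFFF blocks of 512 bytes, this returns
the 1 GiB cap, or 512 KiB if that is passed as max_xfer_bytes.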

After being burnt by a (virtual) SCSI disk recently, my
utilities now take a more aggressive approach to the data-in
buffer received from INQUIRY, MODE SENSE and LOG SENSE (and
I should probably add a few more commands):

At a low level, after the command is completed, the data-in
buffer is post-filled with zeros following the last valid
byte as indicated by resid, until the end of that buffer.
Then it is passed back for higher level processing of the
command including its data-in buffer.
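
The post-fill step described above can be sketched roughly as follows.
This is an illustration only, not the code in my utilities: the
function name zero_fill_after_resid() is hypothetical, and treating a
zero or negative resid as "whole buffer valid" is an assumption of the
sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: zero the part of a data-in buffer that the
 * device did not fill, based on the residual count (bytes requested
 * minus bytes actually transferred).
 * Returns the number of bytes considered valid.
 */
static size_t zero_fill_after_resid(unsigned char *buf, size_t buf_len,
                                    int resid)
{
    size_t valid;

    assert(buf != NULL);

    /* Assumption: resid <= 0 means the whole buffer was filled. */
    if (resid <= 0)
        return buf_len;

    /* A resid larger than the buffer means nothing in it is valid. */
    if ((size_t)resid >= buf_len)
        valid = 0;
    else
        valid = buf_len - (size_t)resid;

    /* Zero from the first invalid byte to the end of the buffer. */
    memset(buf + valid, 0, buf_len - valid);
    return valid;
}
```

Note what happens when resid is too high: valid data at the end of the
buffer gets zeroed, so the higher level code sees a truncated response.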

Pre-filling the data-in buffer with zeros has been in place
for a long time, but I don't think it helps much.


So if there are any HBA drivers that set resid higher than it
should be, expect some pain soon.

Doug Gilbert



