From: John Garry <john.g.garry@oracle.com>
To: Damien Le Moal <dlemoal@kernel.org>,
"Martin K . Petersen" <martin.petersen@oracle.com>,
linux-scsi@vger.kernel.org
Subject: Re: [PATCH v4 2/2] scsi: sd: Set a default optimal IO size if one is not defined
Date: Fri, 13 Jun 2025 15:31:47 +0100
Message-ID: <b20bde78-5f11-4700-9f99-e9bf4bc31e85@oracle.com>
In-Reply-To: <20250613062909.2505759-3-dlemoal@kernel.org>
On 13/06/2025 07:29, Damien Le Moal wrote:
> Introduce the helper function sd_set_io_opt() to set a disk io_opt
> limit. This new way of setting this limit falls back to using the
> max_sectors limit if the host does not define an optimal sector limit
> and the device did not indicate an optimal transfer size (e.g. as is
> the case for ATA devices). io_opt calculation is done using a local
> 64-bit variable to avoid overflows. The final value is clamped to
> UINT_MAX aligned down to the device physical block size.
>
> This fallback io_opt limit avoids setting up the disk with a zero
> io_opt limit, which results in the rather small 128 KB read_ahead_kb
> attribute. The larger read_ahead_kb value set with the default non-zero
> io_opt limit significantly improves buffered read performance with file
> systems without any intervention from the user.
Out of curiosity, why do this just for sd.c and not always set up the
default like this in blk_validate_limits()?
>
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
> ---
> drivers/scsi/sd.c | 45 +++++++++++++++++++++++++++++++++++----------
> 1 file changed, 35 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index daddef2e9e87..8070356285a7 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3681,6 +3681,40 @@ static void sd_read_block_zero(struct scsi_disk *sdkp)
> kfree(buffer);
> }
>
> +/*
> + * Set the optimal I/O size: limit the default to the SCSI host optimal sector
> + * limit if it is set. There may be an impact on performance when the size of
> + * a request exceeds this host limit. If the host did not set any optimal
> + * sector limit and the device did not indicate an optimal transfer size
> + * (e.g. ATA devices), default to using the device max_sectors limit.
> + */
> +static void sd_set_io_opt(struct scsi_disk *sdkp, unsigned int dev_max,
> + struct queue_limits *lim)
> +{
> + struct scsi_device *sdp = sdkp->device;
> + struct Scsi_Host *shost = sdp->host;
> + u64 io_opt;
> +
> + io_opt = (u64)shost->opt_sectors << SECTOR_SHIFT;
> + if (sd_validate_opt_xfer_size(sdkp, dev_max))
> + io_opt = min_not_zero(io_opt,
> + logical_to_bytes(sdp, sdkp->opt_xfer_blocks));
> + if (io_opt) {
> + lim->io_opt = ALIGN_DOWN(min_t(u64, io_opt, UINT_MAX),
> + sdkp->physical_block_size - 1);
> + return;
> + }
> +
> + /* Set default */
> + io_opt = (u64)lim->max_sectors << SECTOR_SHIFT;
> + lim->io_opt = ALIGN_DOWN(min_t(u64, io_opt, UINT_MAX),
Can lim->max_sectors << SECTOR_SHIFT really overflow? I guess
that is the reason for the min_t() call.
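
To make the overflow question concrete, here is a minimal userspace sketch,
not kernel code. The function names are made up for illustration, and it
assumes max_sectors is a 32-bit unsigned value and SECTOR_SHIFT is 9. A
32-bit shift loses the high bits once max_sectors reaches 2^23 sectors
(4 GiB), whereas widening to 64 bits first and then clamping keeps the
result within 32 bits:

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors, as in the kernel */

/* Hypothetical helper: shift done in 32 bits, so high bits are lost. */
static uint32_t io_opt_bytes_wrapped(uint32_t max_sectors)
{
	return max_sectors << SECTOR_SHIFT;
}

/* Hypothetical helper: widen to 64 bits first, then clamp to UINT32_MAX. */
static uint32_t io_opt_bytes_clamped(uint32_t max_sectors)
{
	uint64_t bytes = (uint64_t)max_sectors << SECTOR_SHIFT;

	return bytes > UINT32_MAX ? UINT32_MAX : (uint32_t)bytes;
}

/* Worked examples; call this from a test harness or main(). */
static void io_opt_overflow_examples(void)
{
	/* 2560 sectors (1280 KiB): both variants agree. */
	assert(io_opt_bytes_wrapped(2560) == 1310720u);
	assert(io_opt_bytes_clamped(2560) == 1310720u);

	/* 2^23 sectors (4 GiB): the 32-bit shift wraps to 0. */
	assert(io_opt_bytes_wrapped(1u << 23) == 0u);
	assert(io_opt_bytes_clamped(1u << 23) == UINT32_MAX);
}
```

So the clamp only matters if max_sectors can reach 2^23 or more.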
> + sdkp->physical_block_size - 1);
blk_validate_limits() has the following:

	lim->io_opt = round_down(lim->io_opt, lim->physical_block_size);

Does that already do what we want? I do realize that we want to print
the final value of lim->io_opt below.
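
For reference, a rough userspace sketch of the rounding I have in mind. The
helper below is my own stand-in, not the kernel macro. It assumes the
physical block size is a power of two and takes the block size itself as
the alignment, which is how I understand round_down() to be called:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-in for a power-of-two round-down (illustrative only).
 * 'align' must be a power of two and is the alignment itself, not a mask.
 */
static uint32_t round_down_pow2(uint32_t x, uint32_t align)
{
	return x & ~(align - 1);
}

/* Worked examples with a 4096-byte physical block size. */
static void round_down_examples(void)
{
	/* A clamped UINT32_MAX becomes the largest 4 KiB multiple. */
	assert(round_down_pow2(UINT32_MAX, 4096) == 0xFFFFF000u);

	/* An already aligned value is unchanged. */
	assert(round_down_pow2(1310720u, 4096) == 1310720u);

	/* A value 512 bytes past a boundary is pulled back to it. */
	assert(round_down_pow2(1310720u + 512u, 4096) == 1310720u);
}
```

If blk_validate_limits() applies the same rounding later, the only thing
the early alignment in sd.c buys is that the printed value matches the
final one.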
> +
> + sd_first_printk(KERN_INFO, sdkp,
> + "Using default optimal transfer size of %u bytes\n",
> + lim->io_opt);
> +}
> +
> /**
> * sd_revalidate_disk - called the first time a new disk is seen,
> * performs disk spin up, read_capacity, etc.
> @@ -3777,16 +3811,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
> else
> lim.io_min = 0;
>
> - /*
> - * Limit default to SCSI host optimal sector limit if set. There may be
> - * an impact on performance for when the size of a request exceeds this
> - * host limit.
> - */
> - lim.io_opt = sdp->host->opt_sectors << SECTOR_SHIFT;
> - if (sd_validate_opt_xfer_size(sdkp, dev_max)) {
> - lim.io_opt = min_not_zero(lim.io_opt,
> - logical_to_bytes(sdp, sdkp->opt_xfer_blocks));
> - }
> + sd_set_io_opt(sdkp, dev_max, &lim);
>
> sdkp->first_scan = 0;
>