From: Damien Le Moal <dlemoal@kernel.org>
To: David Laight <david.laight.linux@gmail.com>
Cc: Ranjan Kumar <ranjan.kumar@broadcom.com>,
linux-scsi@vger.kernel.org, martin.petersen@oracle.com,
sathya.prakash@broadcom.com, chandrakanth.patil@broadcom.com,
stable@vger.kernel.org, Mira Limbeck <m.limbeck@proxmox.com>,
Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCH v3] mpt3sas: Limit NVMe request size to 2 MiB
Date: Tue, 14 Apr 2026 09:21:15 +0200
Message-ID: <a7101526-3b23-4474-afc8-bd39e7e3646f@kernel.org>
In-Reply-To: <20260414081210.2b63e350@pumpkin>

On 4/14/26 09:12, David Laight wrote:
> On Tue, 14 Apr 2026 05:41:59 +0200
> Damien Le Moal <dlemoal@kernel.org> wrote:
>
>> On 2026/04/13 22:33, David Laight wrote:
>>> On Mon, 13 Apr 2026 23:30:03 +0530
>>> Ranjan Kumar <ranjan.kumar@broadcom.com> wrote:
>>>
>>>> The HBA firmware reports NVMe MDTS values based on the underlying drive
>>>> capability. However, due to the 4K PRP page size and a limit of
>>>> 512 entries, the driver supports a maximum I/O transfer size of 2 MiB.
>>>>
>>>> Limit max_hw_sectors to the smaller of the reported MDTS and the
>>>> 2 MiB driver limit to prevent issuing oversized I/O that may lead
>>>> to a kernel oops.
>>>>
>>>> Cc: stable@vger.kernel.org
>>>> Fixes: 9b8b84879d4a ("block: Increase BLK_DEF_MAX_SECTORS_CAP")
>>>> Reported-by: Mira Limbeck <m.limbeck@proxmox.com>
>>>> Closes: https://lore.kernel.org/r/291f78bf-4b4a-40dd-867d-053b36c564b3@proxmox.com
>>>> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9b8b84879d4a
>>>> Suggested-by: Keith Busch <kbusch@kernel.org>
>>>> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
>>>> ---
>>>> drivers/scsi/mpt3sas/mpt3sas_scsih.c | 14 +++++++++++++-
>>>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>>>> index 6ff788557294..44dd439e6f17 100644
>>>> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>>>> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>>>> @@ -2738,8 +2738,20 @@ scsih_sdev_configure(struct scsi_device *sdev, struct queue_limits *lim)
>>>> pcie_device->enclosure_level,
>>>> pcie_device->connector_name);
>>>>
>>>> + /*
>>>> + * The HBA firmware passes the NVMe drive's MDTS
>>>> + * (Maximum Data Transfer Size) up to the driver. However,
>>>> + * the driver hardcodes a 4K page size for the PRP list,
>>> ^ buffer ?
>>>> + * accommodating at most 512 entries. This strictly limits
>>>> + * the maximum supported NVMe I/O transfer to 2 MiB.
>>>
>>> Doesn't that make max_fw_entries 4096/8.
>>
>> What is max_fw_entries ?
>
> A mistype for max_hw_sectors :-(
>
>> What the above explains is that a single NVMe page (4K) can store 512 (4096/8)
>> PRP entries, each pointing at a 4K nvme page, so 512*4096=2M maximum size.
>>
>>> Assuming 4096 byte sectors the longest transfer is then 4096/8*4096.
>>
>> Yes, that's the SZ_2M Bytes.
>
> So write it as (4096/8)*4096

See below.

>>> So none of this has anything to do with SECTOR_SHIFT.
>>
>> Apparently, nvme_mdts is in bytes, even though the documentation in
>> mpt3sas_base.h does not mention anything about its unit. So yes, we need a
>> SECTOR_SHIFT to convert that to 512B sectors unit.
>
> It is all very confusing because of the 4k and 512 byte sectors and there
> being another 512 constant.
> Perhaps the best expression is:
> (4096 /* NVMe page */ / 8) * (4096 /* hw sector size */ >> SECTOR_SHIFT)

Yes, we could, but I think the comment is clear enough, so I have no issue with
the code as it is. I will not fight this though; I will let Martin & James
decide.
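
For reference, the arithmetic being discussed can be checked with a small
user-space sketch. This is not kernel code: SECTOR_SHIFT and SZ_2M are redefined
locally as stand-ins for the kernel macros of the same name, and NVME_PAGE_SIZE,
PRP_ENTRY_SIZE and the prp_*() helpers are hypothetical names used only for
illustration. It assumes 8-byte PRP entries and a single 4K PRP list page, as
described in the patch comment.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the kernel macros of the same name. */
#define SECTOR_SHIFT	9		/* 512-byte block layer sectors */
#define SZ_2M		0x00200000

/* Hypothetical names, for illustration only. */
#define NVME_PAGE_SIZE	4096u		/* fixed PRP page size assumed by the driver */
#define PRP_ENTRY_SIZE	8u		/* one 64-bit PRP entry */

/* Number of PRP entries that fit in one 4K page: 4096 / 8. */
static inline uint32_t prp_entries_per_page(void)
{
	return NVME_PAGE_SIZE / PRP_ENTRY_SIZE;
}

/* Largest transfer a single PRP list page can describe, in bytes. */
static inline uint32_t prp_max_bytes(void)
{
	return prp_entries_per_page() * NVME_PAGE_SIZE;
}

/* The same limit in 512-byte sectors, the unit of max_hw_sectors. */
static inline uint32_t prp_max_sectors(void)
{
	return prp_max_bytes() >> SECTOR_SHIFT;
}

/* Both spellings from the thread give the same limit. */
static_assert((4096 / 8) * 4096 == SZ_2M,
	      "512 PRP entries of 4K each cover 2 MiB");
static_assert((4096 / 8) * (4096 >> SECTOR_SHIFT) == (SZ_2M >> SECTOR_SHIFT),
	      "expanded expression matches SZ_2M >> SECTOR_SHIFT");
```

So SZ_2M >> SECTOR_SHIFT and the expanded expression are the same value; the
choice between them is only a matter of readability.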
>>>> + *
>>>> + * Cap max_hw_sectors to the smaller of the drive's reported
>>>> + * MDTS or the 2 MiB driver limit to prevent kernel oopses.
>>>> + */
>>>> + lim->max_hw_sectors = SZ_2M >> SECTOR_SHIFT;
>>>> if (pcie_device->nvme_mdts)
>>>> - lim->max_hw_sectors = pcie_device->nvme_mdts / 512;
>>>> + lim->max_hw_sectors = min_t(u32, lim->max_hw_sectors,
>>>> + pcie_device->nvme_mdts >> SECTOR_SHIFT);
>>>
>>> Why min_t() ?
>>
>> max_hw_sectors is unsigned int and nvme_mdts is u32. Not sure if that bothers
>> min(). Worth trying.
>
> It doesn't bother it (any more).

OK. Let's drop the min_t() then and use min().
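
As a rough illustration of the resulting clamp, here is a user-space model, not
the actual patch. It assumes nvme_mdts is in bytes, as discussed above.
SECTOR_SHIFT and SZ_2M are local stand-ins for the kernel macros, min_u32() is a
simplified stand-in for the kernel's min(), and nvme_max_hw_sectors() is a
hypothetical helper that does not exist in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the kernel macros of the same name. */
#define SECTOR_SHIFT	9
#define SZ_2M		0x00200000

/* Simplified stand-in for the kernel's min(); both operands are u32 here. */
static inline uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper modelling the limit computed in
 * scsih_sdev_configure(): start from the 2 MiB driver limit and only
 * lower it if the firmware reports a smaller, non-zero MDTS.
 */
static inline uint32_t nvme_max_hw_sectors(uint32_t nvme_mdts)
{
	uint32_t max_hw_sectors = SZ_2M >> SECTOR_SHIFT;

	if (nvme_mdts)
		max_hw_sectors = min_u32(max_hw_sectors,
					 nvme_mdts >> SECTOR_SHIFT);

	return max_hw_sectors;
}
```

With this model, an MDTS of 0 (not reported) or anything above 2 MiB yields the
2 MiB cap, and a smaller MDTS such as 1 MiB is used as is.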
--
Damien Le Moal
Western Digital Research
Thread overview: 6+ messages
2026-04-13 18:00 [PATCH v3] mpt3sas: Limit NVMe request size to 2 MiB Ranjan Kumar
2026-04-13 20:33 ` David Laight
2026-04-14 3:41 ` Damien Le Moal
2026-04-14 7:12 ` David Laight
2026-04-14 7:21 ` Damien Le Moal [this message]
2026-04-14 9:19 ` Ranjan Kumar