public inbox for stable@vger.kernel.org
From: Damien Le Moal <dlemoal@kernel.org>
To: David Laight <david.laight.linux@gmail.com>
Cc: Ranjan Kumar <ranjan.kumar@broadcom.com>,
	linux-scsi@vger.kernel.org, martin.petersen@oracle.com,
	sathya.prakash@broadcom.com, chandrakanth.patil@broadcom.com,
	stable@vger.kernel.org, Mira Limbeck <m.limbeck@proxmox.com>,
	Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCH v3] mpt3sas: Limit NVMe request size to 2 MiB
Date: Tue, 14 Apr 2026 09:21:15 +0200
Message-ID: <a7101526-3b23-4474-afc8-bd39e7e3646f@kernel.org>
In-Reply-To: <20260414081210.2b63e350@pumpkin>

On 4/14/26 09:12, David Laight wrote:
> On Tue, 14 Apr 2026 05:41:59 +0200
> Damien Le Moal <dlemoal@kernel.org> wrote:
> 
>> On 2026/04/13 22:33, David Laight wrote:
>>> On Mon, 13 Apr 2026 23:30:03 +0530
>>> Ranjan Kumar <ranjan.kumar@broadcom.com> wrote:
>>>   
>>>> The HBA firmware reports NVMe MDTS values based on the underlying drive
>>>> capability. However, due to the 4K PRP page size and a limit of
>>>> 512 entries, the driver supports a maximum I/O transfer size of 2 MiB.
>>>>
>>>> Limit max_hw_sectors to the smaller of the reported MDTS and the
>>>> 2 MiB driver limit to prevent issuing oversized I/O that may lead
>>>> to a kernel oops.
>>>>
>>>> Cc: stable@vger.kernel.org
>>>> Fixes: 9b8b84879d4a ("block: Increase BLK_DEF_MAX_SECTORS_CAP")
>>>> Reported-by: Mira Limbeck <m.limbeck@proxmox.com>
>>>> Closes: https://lore.kernel.org/r/291f78bf-4b4a-40dd-867d-053b36c564b3@proxmox.com
>>>> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9b8b84879d4a
>>>> Suggested-by: Keith Busch <kbusch@kernel.org>
>>>> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
>>>> ---
>>>>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 14 +++++++++++++-
>>>>  1 file changed, 13 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>>>> index 6ff788557294..44dd439e6f17 100644
>>>> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>>>> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>>>> @@ -2738,8 +2738,20 @@ scsih_sdev_configure(struct scsi_device *sdev, struct queue_limits *lim)
>>>>  				pcie_device->enclosure_level,
>>>>  				pcie_device->connector_name);
>>>>  
>>>> +		/*
>>>> +		 * The HBA firmware passes the NVMe drive's MDTS
>>>> +		 * (Maximum Data Transfer Size) up to the driver. However,
>>>> +		 * the driver hardcodes a 4K page size for the PRP list,  
>>>                                              ^ buffer ?   
>>>> +		 * accommodating at most 512 entries. This strictly limits
>>>> +		 * the maximum supported NVMe I/O transfer to 2 MiB.  
>>>
>>> Doesn't that make max_fw_entries 4096/8.  
>>
>> What is max_fw_entries ?
> 
> A mistype for max_hw_sectors :-(
> 
>> What the above explains is that a single NVMe page (4K) can store 512 (4096/8)
>> PRP entries, each pointing at a 4K nvme page, so 512*4096=2M maximum size.
>>
>>> Assuming 4096 byte sectors the longest transfer is then 4096/8*4096.  
>>
>> Yes, that's the SZ_2M Bytes.
> 
> So write it as (4096/8)*4096

See below.

>>> So none of this has anything to do with SECTOR_SHIFT.
>>
>> Apparently, nvme_mdts is in bytes, even though the documentation in
>> mpt3sas_base.h does not mention anything about its unit. So yes, we need a
>> SECTOR_SHIFT to convert that to 512B sectors unit.
> 
> It is all very confusing because of the 4k and 512 byte sectors and there
> being another 512 constant.
> Perhaps the best expression is:
> 	(4096 /* NVMe page */ / 8) * (4096 /* hw sector size */ >> SECTOR_SHIFT)

Yes, we could, but I think the comment is clear enough, so I have no issue with
the code as it is. I will not fight this, though: I will let Martin & James
decide.
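
For reference, the arithmetic discussed above can be spelled out as a small
userspace sketch. This is not driver code: NVME_PAGE_SIZE, PRP_ENTRY_SIZE and
the helper names are made up for illustration, and SECTOR_SHIFT is redefined
locally with the value the block layer uses (512-byte sectors).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants, not taken from mpt3sas. */
#define NVME_PAGE_SIZE  4096u   /* PRP page size hardcoded by the driver */
#define PRP_ENTRY_SIZE  8u      /* one PRP entry is a 64-bit address */
#define SECTOR_SHIFT    9       /* block layer sectors are 512 bytes */

/* Number of PRP entries that fit in a single 4K PRP list page. */
static inline uint32_t prp_entries_per_page(void)
{
	return NVME_PAGE_SIZE / PRP_ENTRY_SIZE;
}

/* Largest transfer in bytes: each entry points at one 4K page. */
static inline uint32_t prp_max_bytes(void)
{
	return prp_entries_per_page() * NVME_PAGE_SIZE;
}

/* The same limit expressed in 512-byte sectors, the unit of max_hw_sectors. */
static inline uint32_t prp_max_sectors(void)
{
	return prp_max_bytes() >> SECTOR_SHIFT;
}

/* Compile-time check that the product is exactly 2 MiB. */
static_assert((NVME_PAGE_SIZE / PRP_ENTRY_SIZE) * NVME_PAGE_SIZE ==
	      2u * 1024u * 1024u, "PRP list limit should be 2 MiB");
```

The two 512s are unrelated: one is the number of PRP entries per page, the
other is the sector size hidden in SECTOR_SHIFT.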

>>>> +		 *
>>>> +		 * Cap max_hw_sectors to the smaller of the drive's reported
>>>> +		 * MDTS or the 2 MiB driver limit to prevent kernel oopses.
>>>> +		 */
>>>> +		lim->max_hw_sectors = SZ_2M >> SECTOR_SHIFT;
>>>>  		if (pcie_device->nvme_mdts)
>>>> -			lim->max_hw_sectors = pcie_device->nvme_mdts / 512;
>>>> +			lim->max_hw_sectors = min_t(u32, lim->max_hw_sectors,
>>>> +					pcie_device->nvme_mdts >> SECTOR_SHIFT);  
>>>
>>> Why min_t() ?  
>>
>> max_hw_sectors is unsigned int and nvme_mdts is u32. Not sure if that bothers
>> min(). Worth trying.
> 
> It doesn't bother it (any more).

OK. Let's drop the min_t() then and use min().
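
Something along these lines, written here as a standalone userspace sketch
rather than the actual hunk. The function name cap_max_hw_sectors() and the
min_u32() helper are hypothetical stand-ins (the kernel would use its own
min() macro), and SZ_2M and SECTOR_SHIFT are redefined locally.

```c
#include <stdint.h>

#define SZ_2M         (2u * 1024u * 1024u)  /* local stand-in for the kernel's SZ_2M */
#define SECTOR_SHIFT  9                     /* 512-byte sectors */

/* Userspace stand-in for the kernel's min() on two same-typed values. */
static inline uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Return the max_hw_sectors value for a device whose firmware reports
 * nvme_mdts bytes (0 meaning "not reported"): start from the 2 MiB
 * driver limit and only ever lower it.
 */
static inline uint32_t cap_max_hw_sectors(uint32_t nvme_mdts)
{
	uint32_t max_hw_sectors = SZ_2M >> SECTOR_SHIFT;

	if (nvme_mdts)
		max_hw_sectors = min_u32(max_hw_sectors,
					 nvme_mdts >> SECTOR_SHIFT);
	return max_hw_sectors;
}
```

With this, an MDTS of 1 MiB gives 2048 sectors, while 4 MiB or an unreported
MDTS both give the 4096-sector driver limit.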

-- 
Damien Le Moal
Western Digital Research


Thread overview: 6+ messages
2026-04-13 18:00 [PATCH v3] mpt3sas: Limit NVMe request size to 2 MiB Ranjan Kumar
2026-04-13 20:33 ` David Laight
2026-04-14  3:41   ` Damien Le Moal
2026-04-14  7:12     ` David Laight
2026-04-14  7:21       ` Damien Le Moal [this message]
2026-04-14  9:19         ` Ranjan Kumar
