From: Keith Busch <kbusch@kernel.org>
To: Damien Le Moal <dlemoal@kernel.org>
Cc: Friedrich Weber <f.weber@proxmox.com>,
Mira Limbeck <m.limbeck@proxmox.com>,
hch@lst.de, martin.petersen@oracle.com,
Sathya Prakash <sathya.prakash@broadcom.com>,
Sreekanth Reddy <sreekanth.reddy@broadcom.com>,
Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>,
Ranjan Kumar <ranjan.kumar@broadcom.com>,
linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH v2] block: Increase BLK_DEF_MAX_SECTORS_CAP
Date: Fri, 3 Apr 2026 07:51:30 -0600 [thread overview]
Message-ID: <ac_F4nIj6vVm9c42@kbusch-mbp> (raw)
In-Reply-To: <8198c919-1f4d-4d18-925b-f6c0e80d8b3e@kernel.org>
On Fri, Apr 03, 2026 at 08:25:04AM +0900, Damien Le Moal wrote:
> Thanks for this. But where do you see that the DMA pool size is 2M ?
It's not that the DMA pool size is 2M. An NVMe PRP list can describe 2M
of data with 4k worth of PRP entries.
I was thinking it's the "page_size", assuming it was 4k, but I misread
the argument order:
ioc->pcie_sgl_dma_pool =
dma_pool_create("PCIe SGL pool", &ioc->pdev->dev, sz,
ioc->page_size, 0);
The DMA element size is whatever "sz" is, and ioc->page_size is just the
alignment. It still doesn't seem like it's big enough, though.
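For context, the kernel's dma_pool_create() prototype (a declaration fragment from include/linux/dmapool.h, shown here to make the argument order explicit) is:

```c
/* name, device, element size, alignment, boundary */
struct dma_pool *dma_pool_create(const char *name, struct device *dev,
				 size_t size, size_t align,
				 size_t boundary);
```

so in the call above, "sz" is the per-element size and ioc->page_size only constrains alignment.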
The function _base_build_nvme_prp() takes a pointer to the pcie_sgl that
was allocated from that pool and writes the PRP entries to it without
chaining a new PRP list segment from the end of the current one, so it
looks like it just overruns the allocation if you have a large transfer.
> Looking at the code, it seems that ioc->pcie_sgl_dma_pool is created using
> _base_allocate_pcie_sgl_pool() with a size that is calculated as:
>
> /*
> * The number of NVMe page sized blocks needed is:
> * (((sg_tablesize * 8) - 1) / (page_size - 8)) + 1
> * ((sg_tablesize * 8) - 1) is the max PRP's minus the first PRP entry
> * that is placed in the main message frame. 8 is the size of each PRP
> * entry or PRP list pointer entry. 8 is subtracted from page_size
> * because of the PRP list pointer entry at the end of a page, so this
> * is not counted as a PRP entry. The 1 added page is a round up.
This doesn't sound right because sg_tablesize refers to a scatterlist
whose entries may each span multiple pages, but NVMe PRPs need a single
entry per page, so the calculated size is too low when the transfer
includes contiguous multi-page memory.
2026-04-02 23:25 ` [PATCH v2] block: Increase BLK_DEF_MAX_SECTORS_CAP Damien Le Moal
2026-04-03 13:51 ` Keith Busch [this message]