* [PATCH v2 0/2] block: Generalize physical entry definition
@ 2025-11-17 19:22 Leon Romanovsky
2025-11-17 19:22 ` [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes Leon Romanovsky
2025-11-17 19:22 ` [PATCH v2 2/2] types: move phys_vec definition to common header Leon Romanovsky
0 siblings, 2 replies; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-17 19:22 UTC (permalink / raw)
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg
Cc: linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni
Changelog:
v2:
* Added Chaitanya's Reviewed-by tags.
* Removed explicit casting from size_t to unsigned int.
v1: https://patch.msgid.link/20251115-nvme-phys-types-v1-0-c0f2e5e9163d@kernel.org
--------------------------------------------------------------------------------
The block layer code is declared "struct phys_vec" entry which describes
contiguous chunk of physical memory. That definition is useful for all
possible users of DMA physical address-based API.
This series changes NVMe code to support larger chunks of memory by changing
length field from u32 to be size_t, which will be u64 on 64-bits platforms,
and promotes "struct phys_vec" to general place.
---
Leon Romanovsky (2):
nvme-pci: Use size_t for length fields to handle larger sizes
types: move phys_vec definition to common header
block/blk-mq-dma.c | 11 +++++------
drivers/nvme/host/pci.c | 4 ++--
include/linux/types.h | 5 +++++
3 files changed, 12 insertions(+), 8 deletions(-)
---
base-commit: 5674abb82e2b74205a6a5cd1ffd79a3ba48a469d
change-id: 20251030-nvme-phys-types-988893249454
Best regards,
--
Leon Romanovsky <leonro@nvidia.com>
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
2025-11-17 19:22 [PATCH v2 0/2] block: Generalize physical entry definition Leon Romanovsky
@ 2025-11-17 19:22 ` Leon Romanovsky
2025-11-17 19:35 ` Keith Busch
2025-11-18 5:03 ` Christoph Hellwig
2025-11-17 19:22 ` [PATCH v2 2/2] types: move phys_vec definition to common header Leon Romanovsky
1 sibling, 2 replies; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-17 19:22 UTC (permalink / raw)
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg
Cc: linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni
From: Leon Romanovsky <leonro@nvidia.com>
This patch changes the length variables from unsigned int to size_t.
Using size_t ensures that we can handle larger sizes, as size_t is
always equal to or larger than the previously used u32 type.
Originally, u32 was used because blk-mq-dma code evolved from
scatter-gather implementation, which uses unsigned int to describe length.
This change will also allow us to reuse the existing struct phys_vec in places
that don't need scatter-gather.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
---
block/blk-mq-dma.c | 8 ++++++--
drivers/nvme/host/pci.c | 4 ++--
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index e9108ccaf4b0..e7d9b54c3eed 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -8,7 +8,7 @@
struct phys_vec {
phys_addr_t paddr;
- u32 len;
+ size_t len;
};
static bool __blk_map_iter_next(struct blk_map_iter *iter)
@@ -112,8 +112,8 @@ static bool blk_rq_dma_map_iova(struct request *req, struct device *dma_dev,
struct phys_vec *vec)
{
enum dma_data_direction dir = rq_dma_dir(req);
- unsigned int mapped = 0;
unsigned int attrs = 0;
+ size_t mapped = 0;
int error;
iter->addr = state->addr;
@@ -296,6 +296,8 @@ int __blk_rq_map_sg(struct request *rq, struct scatterlist *sglist,
blk_rq_map_iter_init(rq, &iter);
while (blk_map_iter_next(rq, &iter, &vec)) {
*last_sg = blk_next_sg(last_sg, sglist);
+
+ WARN_ON_ONCE(overflows_type(vec.len, unsigned int));
sg_set_page(*last_sg, phys_to_page(vec.paddr), vec.len,
offset_in_page(vec.paddr));
nsegs++;
@@ -416,6 +418,8 @@ int blk_rq_map_integrity_sg(struct request *rq, struct scatterlist *sglist)
while (blk_map_iter_next(rq, &iter, &vec)) {
sg = blk_next_sg(&sg, sglist);
+
+ WARN_ON_ONCE(overflows_type(vec.len, unsigned int));
sg_set_page(sg, phys_to_page(vec.paddr), vec.len,
offset_in_page(vec.paddr));
segments++;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index e5ca8301bb8b..b61ec62b0ec6 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -290,14 +290,14 @@ struct nvme_iod {
u8 flags;
u8 nr_descriptors;
- unsigned int total_len;
+ size_t total_len;
struct dma_iova_state dma_state;
void *descriptors[NVME_MAX_NR_DESCRIPTORS];
struct nvme_dma_vec *dma_vecs;
unsigned int nr_dma_vecs;
dma_addr_t meta_dma;
- unsigned int meta_total_len;
+ size_t meta_total_len;
struct dma_iova_state meta_dma_state;
struct nvme_sgl_desc *meta_descriptor;
};
--
2.51.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v2 2/2] types: move phys_vec definition to common header
2025-11-17 19:22 [PATCH v2 0/2] block: Generalize physical entry definition Leon Romanovsky
2025-11-17 19:22 ` [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes Leon Romanovsky
@ 2025-11-17 19:22 ` Leon Romanovsky
1 sibling, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-17 19:22 UTC (permalink / raw)
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg
Cc: linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni
From: Leon Romanovsky <leonro@nvidia.com>
Move the struct phys_vec definition from block/blk-mq-dma.c to
include/linux/types.h to make it available for use across the kernel.
The phys_vec structure represents a physical address range with a
length, which is used by the new physical address-based DMA mapping
API. This structure is already used by the block layer and will be
needed for DMA phys API users.
Moving this definition to types.h provides a centralized location
for this common data structure and eliminates code duplication
across subsystems that need to work with physical address ranges.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
---
block/blk-mq-dma.c | 5 -----
include/linux/types.h | 5 +++++
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index e7d9b54c3eed..be2ccf224f9c 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -6,11 +6,6 @@
#include <linux/blk-mq-dma.h>
#include "blk.h"
-struct phys_vec {
- phys_addr_t paddr;
- size_t len;
-};
-
static bool __blk_map_iter_next(struct blk_map_iter *iter)
{
if (iter->iter.bi_size)
diff --git a/include/linux/types.h b/include/linux/types.h
index 6dfdb8e8e4c3..6cc2d7cba9b3 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -170,6 +170,11 @@ typedef u64 phys_addr_t;
typedef u32 phys_addr_t;
#endif
+struct phys_vec {
+ phys_addr_t paddr;
+ size_t len;
+};
+
typedef phys_addr_t resource_size_t;
/*
--
2.51.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
2025-11-17 19:22 ` [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes Leon Romanovsky
@ 2025-11-17 19:35 ` Keith Busch
2025-11-17 20:01 ` Leon Romanovsky
2025-11-18 5:18 ` Christoph Hellwig
2025-11-18 5:03 ` Christoph Hellwig
1 sibling, 2 replies; 15+ messages in thread
From: Keith Busch @ 2025-11-17 19:35 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-block,
linux-kernel, linux-nvme, Chaitanya Kulkarni
On Mon, Nov 17, 2025 at 09:22:43PM +0200, Leon Romanovsky wrote:
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index e5ca8301bb8b..b61ec62b0ec6 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -290,14 +290,14 @@ struct nvme_iod {
> u8 flags;
> u8 nr_descriptors;
>
> - unsigned int total_len;
> + size_t total_len;
Changing the generic phys_vec sounds fine, but the nvme driver has a 8MB
limitation on how large an IO can be, so I don't think the driver's
length needs to match the phys_vec type.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
2025-11-17 19:35 ` Keith Busch
@ 2025-11-17 20:01 ` Leon Romanovsky
2025-11-18 5:18 ` Christoph Hellwig
1 sibling, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-17 20:01 UTC (permalink / raw)
To: Keith Busch
Cc: Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-block,
linux-kernel, linux-nvme, Chaitanya Kulkarni
On Mon, Nov 17, 2025 at 12:35:40PM -0700, Keith Busch wrote:
> On Mon, Nov 17, 2025 at 09:22:43PM +0200, Leon Romanovsky wrote:
> > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > index e5ca8301bb8b..b61ec62b0ec6 100644
> > --- a/drivers/nvme/host/pci.c
> > +++ b/drivers/nvme/host/pci.c
> > @@ -290,14 +290,14 @@ struct nvme_iod {
> > u8 flags;
> > u8 nr_descriptors;
> >
> > - unsigned int total_len;
> > + size_t total_len;
>
> Changing the generic phys_vec sounds fine, but the nvme driver has a 8MB
> limitation on how large an IO can be, so I don't think the driver's
> length needs to match the phys_vec type.
I'm big fan of keeping same types in all places, but can drop nvme changes,
if you think that it is right thing to do.
Thanks
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
2025-11-17 19:22 ` [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes Leon Romanovsky
2025-11-17 19:35 ` Keith Busch
@ 2025-11-18 5:03 ` Christoph Hellwig
2025-11-19 9:55 ` Leon Romanovsky
1 sibling, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2025-11-18 5:03 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg,
linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni
On Mon, Nov 17, 2025 at 09:22:43PM +0200, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
>
> This patch changes the length variables from unsigned int to size_t.
> Using size_t ensures that we can handle larger sizes, as size_t is
> always equal to or larger than the previously used u32 type.
>
> Originally, u32 was used because blk-mq-dma code evolved from
> scatter-gather implementation, which uses unsigned int to describe length.
> This change will also allow us to reuse the existing struct phys_vec in places
> that don't need scatter-gather.
>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
> ---
> block/blk-mq-dma.c | 8 ++++++--
> drivers/nvme/host/pci.c | 4 ++--
> 2 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
> index e9108ccaf4b0..e7d9b54c3eed 100644
> --- a/block/blk-mq-dma.c
> +++ b/block/blk-mq-dma.c
> @@ -8,7 +8,7 @@
>
> struct phys_vec {
> phys_addr_t paddr;
> - u32 len;
> + size_t len;
> };
So we're now going to increase memory usage by 50% again after just
reducing it by removing the scatterlist?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
2025-11-17 19:35 ` Keith Busch
2025-11-17 20:01 ` Leon Romanovsky
@ 2025-11-18 5:18 ` Christoph Hellwig
2025-11-18 23:10 ` Keith Busch
1 sibling, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2025-11-18 5:18 UTC (permalink / raw)
To: Keith Busch
Cc: Leon Romanovsky, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni
On Mon, Nov 17, 2025 at 12:35:40PM -0700, Keith Busch wrote:
> > + size_t total_len;
>
> Changing the generic phys_vec sounds fine, but the nvme driver has a 8MB
> limitation on how large an IO can be, so I don't think the driver's
> length needs to match the phys_vec type.
With the new dma mapping interface we could lift that limits for
SGL-based controllers as we basically only have a nr_segments limit now.
Not that I'm trying to argue for multi-GB I/O..
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
2025-11-18 5:18 ` Christoph Hellwig
@ 2025-11-18 23:10 ` Keith Busch
0 siblings, 0 replies; 15+ messages in thread
From: Keith Busch @ 2025-11-18 23:10 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Leon Romanovsky, Jens Axboe, Sagi Grimberg, linux-block,
linux-kernel, linux-nvme, Chaitanya Kulkarni
On Tue, Nov 18, 2025 at 06:18:23AM +0100, Christoph Hellwig wrote:
> On Mon, Nov 17, 2025 at 12:35:40PM -0700, Keith Busch wrote:
> > > + size_t total_len;
> >
> > Changing the generic phys_vec sounds fine, but the nvme driver has a 8MB
> > limitation on how large an IO can be, so I don't think the driver's
> > length needs to match the phys_vec type.
>
> With the new dma mapping interface we could lift that limits for
> SGL-based controllers as we basically only have a nr_segments limit now.
> Not that I'm trying to argue for multi-GB I/O..
It's not a bad idea. The tricky part is in the timeout handling. If
we allow very large IO, I think we need a dynamic timeout value to
account for the link's throughput. We can already trigger blk-mq
timeouts if you saturate enough queues with max sized IO, despite
everything else working-as-designed.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
2025-11-18 5:03 ` Christoph Hellwig
@ 2025-11-19 9:55 ` Leon Romanovsky
2025-11-19 10:10 ` Christoph Hellwig
2025-11-19 13:36 ` David Laight
0 siblings, 2 replies; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-19 9:55 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Keith Busch, Sagi Grimberg, linux-block, linux-kernel,
linux-nvme, Chaitanya Kulkarni
On Tue, Nov 18, 2025 at 06:03:11AM +0100, Christoph Hellwig wrote:
> On Mon, Nov 17, 2025 at 09:22:43PM +0200, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > This patch changes the length variables from unsigned int to size_t.
> > Using size_t ensures that we can handle larger sizes, as size_t is
> > always equal to or larger than the previously used u32 type.
> >
> > Originally, u32 was used because blk-mq-dma code evolved from
> > scatter-gather implementation, which uses unsigned int to describe length.
> > This change will also allow us to reuse the existing struct phys_vec in places
> > that don't need scatter-gather.
> >
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
> > ---
> > block/blk-mq-dma.c | 8 ++++++--
> > drivers/nvme/host/pci.c | 4 ++--
> > 2 files changed, 8 insertions(+), 4 deletions(-)
> >
> > diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
> > index e9108ccaf4b0..e7d9b54c3eed 100644
> > --- a/block/blk-mq-dma.c
> > +++ b/block/blk-mq-dma.c
> > @@ -8,7 +8,7 @@
> >
> > struct phys_vec {
> > phys_addr_t paddr;
> > - u32 len;
> > + size_t len;
> > };
>
> So we're now going to increase memory usage by 50% again after just
> reducing it by removing the scatterlist?
It is slightly less.
Before this change: 96 bits
After this change (on 64bits system): 128 bits.
It is 33% increase per-structure.
So what is the resolution? Should I drop this patch or not?
Thanks
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
2025-11-19 9:55 ` Leon Romanovsky
@ 2025-11-19 10:10 ` Christoph Hellwig
2025-11-19 11:06 ` Leon Romanovsky
2025-11-19 14:44 ` Leon Romanovsky
2025-11-19 13:36 ` David Laight
1 sibling, 2 replies; 15+ messages in thread
From: Christoph Hellwig @ 2025-11-19 10:10 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg,
linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni
On Wed, Nov 19, 2025 at 11:55:16AM +0200, Leon Romanovsky wrote:
> So what is the resolution? Should I drop this patch or not?
I think it should be dropped. I don't think having to use one 96-bit
structure per 4GB worth of memory should be a deal breaker.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
2025-11-19 10:10 ` Christoph Hellwig
@ 2025-11-19 11:06 ` Leon Romanovsky
2025-11-19 14:44 ` Leon Romanovsky
1 sibling, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-19 11:06 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Keith Busch, Sagi Grimberg, linux-block, linux-kernel,
linux-nvme, Chaitanya Kulkarni
On Wed, Nov 19, 2025 at 11:10:48AM +0100, Christoph Hellwig wrote:
> On Wed, Nov 19, 2025 at 11:55:16AM +0200, Leon Romanovsky wrote:
> > So what is the resolution? Should I drop this patch or not?
>
> I think it should be dropped. I don't think having to use one 96-bit
> structure per 4GB worth of memory should be a deal breaker.
No problem, will drop.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
2025-11-19 9:55 ` Leon Romanovsky
2025-11-19 10:10 ` Christoph Hellwig
@ 2025-11-19 13:36 ` David Laight
2025-11-19 13:58 ` Leon Romanovsky
1 sibling, 1 reply; 15+ messages in thread
From: David Laight @ 2025-11-19 13:36 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg,
linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni
On Wed, 19 Nov 2025 11:55:16 +0200
Leon Romanovsky <leon@kernel.org> wrote:
> On Tue, Nov 18, 2025 at 06:03:11AM +0100, Christoph Hellwig wrote:
> > On Mon, Nov 17, 2025 at 09:22:43PM +0200, Leon Romanovsky wrote:
> > > From: Leon Romanovsky <leonro@nvidia.com>
> > >
> > > This patch changes the length variables from unsigned int to size_t.
> > > Using size_t ensures that we can handle larger sizes, as size_t is
> > > always equal to or larger than the previously used u32 type.
> > >
> > > Originally, u32 was used because blk-mq-dma code evolved from
> > > scatter-gather implementation, which uses unsigned int to describe length.
> > > This change will also allow us to reuse the existing struct phys_vec in places
> > > that don't need scatter-gather.
> > >
> > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
> > > ---
> > > block/blk-mq-dma.c | 8 ++++++--
> > > drivers/nvme/host/pci.c | 4 ++--
> > > 2 files changed, 8 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
> > > index e9108ccaf4b0..e7d9b54c3eed 100644
> > > --- a/block/blk-mq-dma.c
> > > +++ b/block/blk-mq-dma.c
> > > @@ -8,7 +8,7 @@
> > >
> > > struct phys_vec {
> > > phys_addr_t paddr;
> > > - u32 len;
> > > + size_t len;
> > > };
> >
> > So we're now going to increase memory usage by 50% again after just
> > reducing it by removing the scatterlist?
>
> It is slightly less.
>
> Before this change: 96 bits
Did you actually look?
There will normally be 4 bytes of padding at the end of the structure.
About the only place where it will be 12 bytes is a 32bit system with
64bit phyaddr that aligns 64bit items on 32bit boundaries - so x86.
David
> After this change (on 64bits system): 128 bits.
>
> It is 33% increase per-structure.
>
> So what is the resolution? Should I drop this patch or not?
>
> Thanks
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
2025-11-19 13:36 ` David Laight
@ 2025-11-19 13:58 ` Leon Romanovsky
2025-11-19 14:13 ` David Laight
0 siblings, 1 reply; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-19 13:58 UTC (permalink / raw)
To: David Laight
Cc: Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg,
linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni
On Wed, Nov 19, 2025 at 01:36:15PM +0000, David Laight wrote:
> On Wed, 19 Nov 2025 11:55:16 +0200
> Leon Romanovsky <leon@kernel.org> wrote:
>
> > On Tue, Nov 18, 2025 at 06:03:11AM +0100, Christoph Hellwig wrote:
> > > On Mon, Nov 17, 2025 at 09:22:43PM +0200, Leon Romanovsky wrote:
> > > > From: Leon Romanovsky <leonro@nvidia.com>
> > > >
> > > > This patch changes the length variables from unsigned int to size_t.
> > > > Using size_t ensures that we can handle larger sizes, as size_t is
> > > > always equal to or larger than the previously used u32 type.
> > > >
> > > > Originally, u32 was used because blk-mq-dma code evolved from
> > > > scatter-gather implementation, which uses unsigned int to describe length.
> > > > This change will also allow us to reuse the existing struct phys_vec in places
> > > > that don't need scatter-gather.
> > > >
> > > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > > Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
> > > > ---
> > > > block/blk-mq-dma.c | 8 ++++++--
> > > > drivers/nvme/host/pci.c | 4 ++--
> > > > 2 files changed, 8 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
> > > > index e9108ccaf4b0..e7d9b54c3eed 100644
> > > > --- a/block/blk-mq-dma.c
> > > > +++ b/block/blk-mq-dma.c
> > > > @@ -8,7 +8,7 @@
> > > >
> > > > struct phys_vec {
> > > > phys_addr_t paddr;
> > > > - u32 len;
> > > > + size_t len;
> > > > };
> > >
> > > So we're now going to increase memory usage by 50% again after just
> > > reducing it by removing the scatterlist?
> >
> > It is slightly less.
> >
> > Before this change: 96 bits
>
> Did you actually look?
No, I simply performed sizeof(phys_addr_t) + sizeof(size_t).
> There will normally be 4 bytes of padding at the end of the structure.
>
> About the only place where it will be 12 bytes is a 32bit system with
> 64bit phyaddr that aligns 64bit items on 32bit boundaries - so x86.
So does it mean that Christoph's comment about size increase is not correct?
Thanks
>
> David
>
> > After this change (on 64bits system): 128 bits.
> >
> > It is 33% increase per-structure.
> >
> > So what is the resolution? Should I drop this patch or not?
> >
> > Thanks
> >
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
2025-11-19 13:58 ` Leon Romanovsky
@ 2025-11-19 14:13 ` David Laight
0 siblings, 0 replies; 15+ messages in thread
From: David Laight @ 2025-11-19 14:13 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg,
linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni
On Wed, 19 Nov 2025 15:58:21 +0200
Leon Romanovsky <leon@kernel.org> wrote:
> On Wed, Nov 19, 2025 at 01:36:15PM +0000, David Laight wrote:
> > On Wed, 19 Nov 2025 11:55:16 +0200
> > Leon Romanovsky <leon@kernel.org> wrote:
> >
> > > On Tue, Nov 18, 2025 at 06:03:11AM +0100, Christoph Hellwig wrote:
> > > > On Mon, Nov 17, 2025 at 09:22:43PM +0200, Leon Romanovsky wrote:
> > > > > From: Leon Romanovsky <leonro@nvidia.com>
> > > > >
> > > > > This patch changes the length variables from unsigned int to size_t.
> > > > > Using size_t ensures that we can handle larger sizes, as size_t is
> > > > > always equal to or larger than the previously used u32 type.
> > > > >
> > > > > Originally, u32 was used because blk-mq-dma code evolved from
> > > > > scatter-gather implementation, which uses unsigned int to describe length.
> > > > > This change will also allow us to reuse the existing struct phys_vec in places
> > > > > that don't need scatter-gather.
> > > > >
> > > > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > > > Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
> > > > > ---
> > > > > block/blk-mq-dma.c | 8 ++++++--
> > > > > drivers/nvme/host/pci.c | 4 ++--
> > > > > 2 files changed, 8 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
> > > > > index e9108ccaf4b0..e7d9b54c3eed 100644
> > > > > --- a/block/blk-mq-dma.c
> > > > > +++ b/block/blk-mq-dma.c
> > > > > @@ -8,7 +8,7 @@
> > > > >
> > > > > struct phys_vec {
> > > > > phys_addr_t paddr;
> > > > > - u32 len;
> > > > > + size_t len;
> > > > > };
> > > >
> > > > So we're now going to increase memory usage by 50% again after just
> > > > reducing it by removing the scatterlist?
> > >
> > > It is slightly less.
> > >
> > > Before this change: 96 bits
> >
> > Did you actually look?
>
> No, I simply performed sizeof(phys_addr_t) + sizeof(size_t).
>
> > There will normally be 4 bytes of padding at the end of the structure.
> >
> > About the only place where it will be 12 bytes is a 32bit system with
> > 64bit phyaddr that aligns 64bit items on 32bit boundaries - so x86.
>
> So does it mean that Christoph's comment about size increase is not correct?
Correct - ie there is no size increase.
>
> Thanks
>
> >
> > David
> >
> > > After this change (on 64bits system): 128 bits.
> > >
> > > It is 33% increase per-structure.
> > >
> > > So what is the resolution? Should I drop this patch or not?
> > >
> > > Thanks
> > >
> >
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
2025-11-19 10:10 ` Christoph Hellwig
2025-11-19 11:06 ` Leon Romanovsky
@ 2025-11-19 14:44 ` Leon Romanovsky
1 sibling, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-19 14:44 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Keith Busch, Sagi Grimberg, linux-block, linux-kernel,
linux-nvme, Chaitanya Kulkarni
On Wed, Nov 19, 2025 at 11:10:48AM +0100, Christoph Hellwig wrote:
> On Wed, Nov 19, 2025 at 11:55:16AM +0200, Leon Romanovsky wrote:
> > So what is the resolution? Should I drop this patch or not?
>
> I think it should be dropped. I don't think having to use one 96-bit
> structure per 4GB worth of memory should be a deal breaker.
Christoph,
It seems like no size change will before and after my change.
https://lore.kernel.org/linux-nvme/20251117-nvme-phys-types-v2-0-c75a60a2c468@nvidia.com/T/#ma575c050517e91e7630683cf193e39d812338fa4
Thanks
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2025-11-19 14:44 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-17 19:22 [PATCH v2 0/2] block: Generalize physical entry definition Leon Romanovsky
2025-11-17 19:22 ` [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes Leon Romanovsky
2025-11-17 19:35 ` Keith Busch
2025-11-17 20:01 ` Leon Romanovsky
2025-11-18 5:18 ` Christoph Hellwig
2025-11-18 23:10 ` Keith Busch
2025-11-18 5:03 ` Christoph Hellwig
2025-11-19 9:55 ` Leon Romanovsky
2025-11-19 10:10 ` Christoph Hellwig
2025-11-19 11:06 ` Leon Romanovsky
2025-11-19 14:44 ` Leon Romanovsky
2025-11-19 13:36 ` David Laight
2025-11-19 13:58 ` Leon Romanovsky
2025-11-19 14:13 ` David Laight
2025-11-17 19:22 ` [PATCH v2 2/2] types: move phys_vec definition to common header Leon Romanovsky
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.