linux-block.vger.kernel.org archive mirror
* [PATCH v2 0/2] block: Generalize physical entry definition
@ 2025-11-17 19:22 Leon Romanovsky
  2025-11-17 19:22 ` [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes Leon Romanovsky
  2025-11-17 19:22 ` [PATCH v2 2/2] types: move phys_vec definition to common header Leon Romanovsky
  0 siblings, 2 replies; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-17 19:22 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg
  Cc: linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni

Changelog:
v2:
 * Added Chaitanya's Reviewed-by tags.
 * Removed explicit casting from size_t to unsigned int.
v1: https://patch.msgid.link/20251115-nvme-phys-types-v1-0-c0f2e5e9163d@kernel.org

--------------------------------------------------------------------------------
The block layer declares a "struct phys_vec" entry which describes a
contiguous chunk of physical memory. That definition is useful for all
users of the DMA physical address-based API.

This series changes the NVMe code to support larger chunks of memory by
changing the length field from u32 to size_t, which is 64 bits wide on
64-bit platforms, and promotes "struct phys_vec" to a common header.

---
Leon Romanovsky (2):
      nvme-pci: Use size_t for length fields to handle larger sizes
      types: move phys_vec definition to common header

 block/blk-mq-dma.c      | 11 +++++------
 drivers/nvme/host/pci.c |  4 ++--
 include/linux/types.h   |  5 +++++
 3 files changed, 12 insertions(+), 8 deletions(-)
---
base-commit: 5674abb82e2b74205a6a5cd1ffd79a3ba48a469d
change-id: 20251030-nvme-phys-types-988893249454

Best regards,
--  
Leon Romanovsky <leonro@nvidia.com>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
  2025-11-17 19:22 [PATCH v2 0/2] block: Generalize physical entry definition Leon Romanovsky
@ 2025-11-17 19:22 ` Leon Romanovsky
  2025-11-17 19:35   ` Keith Busch
  2025-11-18  5:03   ` Christoph Hellwig
  2025-11-17 19:22 ` [PATCH v2 2/2] types: move phys_vec definition to common header Leon Romanovsky
  1 sibling, 2 replies; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-17 19:22 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg
  Cc: linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni

From: Leon Romanovsky <leonro@nvidia.com>

This patch changes the length variables from unsigned int to size_t.
Using size_t ensures that we can handle larger sizes, as size_t is
always equal to or larger than the previously used u32 type.

Originally, u32 was used because the blk-mq-dma code evolved from a
scatter-gather implementation, which uses unsigned int to describe length.
This change will also allow us to reuse the existing struct phys_vec in places
that don't need scatter-gather.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
---
 block/blk-mq-dma.c      | 8 ++++++--
 drivers/nvme/host/pci.c | 4 ++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index e9108ccaf4b0..e7d9b54c3eed 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -8,7 +8,7 @@
 
 struct phys_vec {
 	phys_addr_t	paddr;
-	u32		len;
+	size_t		len;
 };
 
 static bool __blk_map_iter_next(struct blk_map_iter *iter)
@@ -112,8 +112,8 @@ static bool blk_rq_dma_map_iova(struct request *req, struct device *dma_dev,
 		struct phys_vec *vec)
 {
 	enum dma_data_direction dir = rq_dma_dir(req);
-	unsigned int mapped = 0;
 	unsigned int attrs = 0;
+	size_t mapped = 0;
 	int error;
 
 	iter->addr = state->addr;
@@ -296,6 +296,8 @@ int __blk_rq_map_sg(struct request *rq, struct scatterlist *sglist,
 	blk_rq_map_iter_init(rq, &iter);
 	while (blk_map_iter_next(rq, &iter, &vec)) {
 		*last_sg = blk_next_sg(last_sg, sglist);
+
+		WARN_ON_ONCE(overflows_type(vec.len, unsigned int));
 		sg_set_page(*last_sg, phys_to_page(vec.paddr), vec.len,
 				offset_in_page(vec.paddr));
 		nsegs++;
@@ -416,6 +418,8 @@ int blk_rq_map_integrity_sg(struct request *rq, struct scatterlist *sglist)
 
 	while (blk_map_iter_next(rq, &iter, &vec)) {
 		sg = blk_next_sg(&sg, sglist);
+
+		WARN_ON_ONCE(overflows_type(vec.len, unsigned int));
 		sg_set_page(sg, phys_to_page(vec.paddr), vec.len,
 				offset_in_page(vec.paddr));
 		segments++;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index e5ca8301bb8b..b61ec62b0ec6 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -290,14 +290,14 @@ struct nvme_iod {
 	u8 flags;
 	u8 nr_descriptors;
 
-	unsigned int total_len;
+	size_t total_len;
 	struct dma_iova_state dma_state;
 	void *descriptors[NVME_MAX_NR_DESCRIPTORS];
 	struct nvme_dma_vec *dma_vecs;
 	unsigned int nr_dma_vecs;
 
 	dma_addr_t meta_dma;
-	unsigned int meta_total_len;
+	size_t meta_total_len;
 	struct dma_iova_state meta_dma_state;
 	struct nvme_sgl_desc *meta_descriptor;
 };

-- 
2.51.1



* [PATCH v2 2/2] types: move phys_vec definition to common header
  2025-11-17 19:22 [PATCH v2 0/2] block: Generalize physical entry definition Leon Romanovsky
  2025-11-17 19:22 ` [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes Leon Romanovsky
@ 2025-11-17 19:22 ` Leon Romanovsky
  1 sibling, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-17 19:22 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg
  Cc: linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni

From: Leon Romanovsky <leonro@nvidia.com>

Move the struct phys_vec definition from block/blk-mq-dma.c to
include/linux/types.h to make it available for use across the kernel.

The phys_vec structure represents a physical address range with a
length, which is used by the new physical address-based DMA mapping
API. This structure is already used by the block layer and will be
needed for DMA phys API users.

Moving this definition to types.h provides a centralized location
for this common data structure and eliminates code duplication
across subsystems that need to work with physical address ranges.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
---
 block/blk-mq-dma.c    | 5 -----
 include/linux/types.h | 5 +++++
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index e7d9b54c3eed..be2ccf224f9c 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -6,11 +6,6 @@
 #include <linux/blk-mq-dma.h>
 #include "blk.h"
 
-struct phys_vec {
-	phys_addr_t	paddr;
-	size_t		len;
-};
-
 static bool __blk_map_iter_next(struct blk_map_iter *iter)
 {
 	if (iter->iter.bi_size)
diff --git a/include/linux/types.h b/include/linux/types.h
index 6dfdb8e8e4c3..6cc2d7cba9b3 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -170,6 +170,11 @@ typedef u64 phys_addr_t;
 typedef u32 phys_addr_t;
 #endif
 
+struct phys_vec {
+	phys_addr_t	paddr;
+	size_t		len;
+};
+
 typedef phys_addr_t resource_size_t;
 
 /*

-- 
2.51.1



* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
  2025-11-17 19:22 ` [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes Leon Romanovsky
@ 2025-11-17 19:35   ` Keith Busch
  2025-11-17 20:01     ` Leon Romanovsky
  2025-11-18  5:18     ` Christoph Hellwig
  2025-11-18  5:03   ` Christoph Hellwig
  1 sibling, 2 replies; 15+ messages in thread
From: Keith Busch @ 2025-11-17 19:35 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-block,
	linux-kernel, linux-nvme, Chaitanya Kulkarni

On Mon, Nov 17, 2025 at 09:22:43PM +0200, Leon Romanovsky wrote:
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index e5ca8301bb8b..b61ec62b0ec6 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -290,14 +290,14 @@ struct nvme_iod {
>  	u8 flags;
>  	u8 nr_descriptors;
>  
> -	unsigned int total_len;
> +	size_t total_len;

Changing the generic phys_vec sounds fine, but the nvme driver has an 8MB
limitation on how large an IO can be, so I don't think the driver's
length needs to match the phys_vec type.


* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
  2025-11-17 19:35   ` Keith Busch
@ 2025-11-17 20:01     ` Leon Romanovsky
  2025-11-18  5:18     ` Christoph Hellwig
  1 sibling, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-17 20:01 UTC (permalink / raw)
  To: Keith Busch
  Cc: Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-block,
	linux-kernel, linux-nvme, Chaitanya Kulkarni

On Mon, Nov 17, 2025 at 12:35:40PM -0700, Keith Busch wrote:
> On Mon, Nov 17, 2025 at 09:22:43PM +0200, Leon Romanovsky wrote:
> > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > index e5ca8301bb8b..b61ec62b0ec6 100644
> > --- a/drivers/nvme/host/pci.c
> > +++ b/drivers/nvme/host/pci.c
> > @@ -290,14 +290,14 @@ struct nvme_iod {
> >  	u8 flags;
> >  	u8 nr_descriptors;
> >  
> > -	unsigned int total_len;
> > +	size_t total_len;
> 
> Changing the generic phys_vec sounds fine, but the nvme driver has an 8MB
> limitation on how large an IO can be, so I don't think the driver's
> length needs to match the phys_vec type.

I'm a big fan of keeping the same types in all places, but I can drop the
nvme changes if you think that is the right thing to do.

Thanks


* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
  2025-11-17 19:22 ` [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes Leon Romanovsky
  2025-11-17 19:35   ` Keith Busch
@ 2025-11-18  5:03   ` Christoph Hellwig
  2025-11-19  9:55     ` Leon Romanovsky
  1 sibling, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2025-11-18  5:03 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg,
	linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni

On Mon, Nov 17, 2025 at 09:22:43PM +0200, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> This patch changes the length variables from unsigned int to size_t.
> Using size_t ensures that we can handle larger sizes, as size_t is
> always equal to or larger than the previously used u32 type.
> 
> Originally, u32 was used because blk-mq-dma code evolved from
> scatter-gather implementation, which uses unsigned int to describe length.
> This change will also allow us to reuse the existing struct phys_vec in places
> that don't need scatter-gather.
> 
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
> ---
>  block/blk-mq-dma.c      | 8 ++++++--
>  drivers/nvme/host/pci.c | 4 ++--
>  2 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
> index e9108ccaf4b0..e7d9b54c3eed 100644
> --- a/block/blk-mq-dma.c
> +++ b/block/blk-mq-dma.c
> @@ -8,7 +8,7 @@
>  
>  struct phys_vec {
>  	phys_addr_t	paddr;
> -	u32		len;
> +	size_t		len;
>  };

So we're now going to increase memory usage by 50% again after just
reducing it by removing the scatterlist?



* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
  2025-11-17 19:35   ` Keith Busch
  2025-11-17 20:01     ` Leon Romanovsky
@ 2025-11-18  5:18     ` Christoph Hellwig
  2025-11-18 23:10       ` Keith Busch
  1 sibling, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2025-11-18  5:18 UTC (permalink / raw)
  To: Keith Busch
  Cc: Leon Romanovsky, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
	linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni

On Mon, Nov 17, 2025 at 12:35:40PM -0700, Keith Busch wrote:
> > +	size_t total_len;
> 
> Changing the generic phys_vec sounds fine, but the nvme driver has an 8MB
> limitation on how large an IO can be, so I don't think the driver's
> length needs to match the phys_vec type.

With the new dma mapping interface we could lift that limit for
SGL-based controllers, as we basically only have a nr_segments limit now.
Not that I'm trying to argue for multi-GB I/O...


* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
  2025-11-18  5:18     ` Christoph Hellwig
@ 2025-11-18 23:10       ` Keith Busch
  0 siblings, 0 replies; 15+ messages in thread
From: Keith Busch @ 2025-11-18 23:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Leon Romanovsky, Jens Axboe, Sagi Grimberg, linux-block,
	linux-kernel, linux-nvme, Chaitanya Kulkarni

On Tue, Nov 18, 2025 at 06:18:23AM +0100, Christoph Hellwig wrote:
> On Mon, Nov 17, 2025 at 12:35:40PM -0700, Keith Busch wrote:
> > > +	size_t total_len;
> > 
> > Changing the generic phys_vec sounds fine, but the nvme driver has an 8MB
> > limitation on how large an IO can be, so I don't think the driver's
> > length needs to match the phys_vec type.
> 
> With the new dma mapping interface we could lift that limit for
> SGL-based controllers as we basically only have a nr_segments limit now.
> Not that I'm trying to argue for multi-GB I/O..

It's not a bad idea. The tricky part is in the timeout handling. If
we allow very large IO, I think we need a dynamic timeout value to
account for the link's throughput. We can already trigger blk-mq
timeouts if you saturate enough queues with max-sized IO, despite
everything else working as designed.


* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
  2025-11-18  5:03   ` Christoph Hellwig
@ 2025-11-19  9:55     ` Leon Romanovsky
  2025-11-19 10:10       ` Christoph Hellwig
  2025-11-19 13:36       ` David Laight
  0 siblings, 2 replies; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-19  9:55 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Sagi Grimberg, linux-block, linux-kernel,
	linux-nvme, Chaitanya Kulkarni

On Tue, Nov 18, 2025 at 06:03:11AM +0100, Christoph Hellwig wrote:
> On Mon, Nov 17, 2025 at 09:22:43PM +0200, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> > 
> > This patch changes the length variables from unsigned int to size_t.
> > Using size_t ensures that we can handle larger sizes, as size_t is
> > always equal to or larger than the previously used u32 type.
> > 
> > Originally, u32 was used because blk-mq-dma code evolved from
> > scatter-gather implementation, which uses unsigned int to describe length.
> > This change will also allow us to reuse the existing struct phys_vec in places
> > that don't need scatter-gather.
> > 
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
> > ---
> >  block/blk-mq-dma.c      | 8 ++++++--
> >  drivers/nvme/host/pci.c | 4 ++--
> >  2 files changed, 8 insertions(+), 4 deletions(-)
> > 
> > diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
> > index e9108ccaf4b0..e7d9b54c3eed 100644
> > --- a/block/blk-mq-dma.c
> > +++ b/block/blk-mq-dma.c
> > @@ -8,7 +8,7 @@
> >  
> >  struct phys_vec {
> >  	phys_addr_t	paddr;
> > -	u32		len;
> > +	size_t		len;
> >  };
> 
> So we're now going to increase memory usage by 50% again after just
> reducing it by removing the scatterlist?

It is slightly less.

Before this change: 96 bits
After this change (on a 64-bit system): 128 bits.

It is a 33% increase per structure.

So what is the resolution? Should I drop this patch or not?

Thanks 


* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
  2025-11-19  9:55     ` Leon Romanovsky
@ 2025-11-19 10:10       ` Christoph Hellwig
  2025-11-19 11:06         ` Leon Romanovsky
  2025-11-19 14:44         ` Leon Romanovsky
  2025-11-19 13:36       ` David Laight
  1 sibling, 2 replies; 15+ messages in thread
From: Christoph Hellwig @ 2025-11-19 10:10 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg,
	linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni

On Wed, Nov 19, 2025 at 11:55:16AM +0200, Leon Romanovsky wrote:
> So what is the resolution? Should I drop this patch or not?

I think it should be dropped.  I don't think having to use one 96-bit
structure per 4GB worth of memory should be a deal breaker.




* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
  2025-11-19 10:10       ` Christoph Hellwig
@ 2025-11-19 11:06         ` Leon Romanovsky
  2025-11-19 14:44         ` Leon Romanovsky
  1 sibling, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-19 11:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Sagi Grimberg, linux-block, linux-kernel,
	linux-nvme, Chaitanya Kulkarni

On Wed, Nov 19, 2025 at 11:10:48AM +0100, Christoph Hellwig wrote:
> On Wed, Nov 19, 2025 at 11:55:16AM +0200, Leon Romanovsky wrote:
> > So what is the resolution? Should I drop this patch or not?
> 
> I think it should be dropped.  I don't think having to use one 96-bit
> structure per 4GB worth of memory should be a deal breaker.

No problem, will drop.


* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
  2025-11-19  9:55     ` Leon Romanovsky
  2025-11-19 10:10       ` Christoph Hellwig
@ 2025-11-19 13:36       ` David Laight
  2025-11-19 13:58         ` Leon Romanovsky
  1 sibling, 1 reply; 15+ messages in thread
From: David Laight @ 2025-11-19 13:36 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg,
	linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni

On Wed, 19 Nov 2025 11:55:16 +0200
Leon Romanovsky <leon@kernel.org> wrote:

> On Tue, Nov 18, 2025 at 06:03:11AM +0100, Christoph Hellwig wrote:
> > On Mon, Nov 17, 2025 at 09:22:43PM +0200, Leon Romanovsky wrote:  
> > > From: Leon Romanovsky <leonro@nvidia.com>
> > > 
> > > This patch changes the length variables from unsigned int to size_t.
> > > Using size_t ensures that we can handle larger sizes, as size_t is
> > > always equal to or larger than the previously used u32 type.
> > > 
> > > Originally, u32 was used because blk-mq-dma code evolved from
> > > scatter-gather implementation, which uses unsigned int to describe length.
> > > This change will also allow us to reuse the existing struct phys_vec in places
> > > that don't need scatter-gather.
> > > 
> > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
> > > ---
> > >  block/blk-mq-dma.c      | 8 ++++++--
> > >  drivers/nvme/host/pci.c | 4 ++--
> > >  2 files changed, 8 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
> > > index e9108ccaf4b0..e7d9b54c3eed 100644
> > > --- a/block/blk-mq-dma.c
> > > +++ b/block/blk-mq-dma.c
> > > @@ -8,7 +8,7 @@
> > >  
> > >  struct phys_vec {
> > >  	phys_addr_t	paddr;
> > > -	u32		len;
> > > +	size_t		len;
> > >  };  
> > 
> > So we're now going to increase memory usage by 50% again after just
> > reducing it by removing the scatterlist?  
> 
> It is slightly less.
> 
> Before this change: 96 bits

Did you actually look?
There will normally be 4 bytes of padding at the end of the structure.

About the only place where it will be 12 bytes is a 32-bit system with a
64-bit phys_addr_t that aligns 64-bit items on 32-bit boundaries - so x86.

	David

> After this change (on 64bits system): 128 bits.
> 
> It is 33% increase per-structure.
> 
> So what is the resolution? Should I drop this patch or not?
> 
> Thanks 
> 



* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
  2025-11-19 13:36       ` David Laight
@ 2025-11-19 13:58         ` Leon Romanovsky
  2025-11-19 14:13           ` David Laight
  0 siblings, 1 reply; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-19 13:58 UTC (permalink / raw)
  To: David Laight
  Cc: Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg,
	linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni

On Wed, Nov 19, 2025 at 01:36:15PM +0000, David Laight wrote:
> On Wed, 19 Nov 2025 11:55:16 +0200
> Leon Romanovsky <leon@kernel.org> wrote:
> 
> > On Tue, Nov 18, 2025 at 06:03:11AM +0100, Christoph Hellwig wrote:
> > > On Mon, Nov 17, 2025 at 09:22:43PM +0200, Leon Romanovsky wrote:  
> > > > From: Leon Romanovsky <leonro@nvidia.com>
> > > > 
> > > > This patch changes the length variables from unsigned int to size_t.
> > > > Using size_t ensures that we can handle larger sizes, as size_t is
> > > > always equal to or larger than the previously used u32 type.
> > > > 
> > > > Originally, u32 was used because blk-mq-dma code evolved from
> > > > scatter-gather implementation, which uses unsigned int to describe length.
> > > > This change will also allow us to reuse the existing struct phys_vec in places
> > > > that don't need scatter-gather.
> > > > 
> > > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > > Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
> > > > ---
> > > >  block/blk-mq-dma.c      | 8 ++++++--
> > > >  drivers/nvme/host/pci.c | 4 ++--
> > > >  2 files changed, 8 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
> > > > index e9108ccaf4b0..e7d9b54c3eed 100644
> > > > --- a/block/blk-mq-dma.c
> > > > +++ b/block/blk-mq-dma.c
> > > > @@ -8,7 +8,7 @@
> > > >  
> > > >  struct phys_vec {
> > > >  	phys_addr_t	paddr;
> > > > -	u32		len;
> > > > +	size_t		len;
> > > >  };  
> > > 
> > > So we're now going to increase memory usage by 50% again after just
> > > reducing it by removing the scatterlist?  
> > 
> > It is slightly less.
> > 
> > Before this change: 96 bits
> 
> Did you actually look?

No, I simply computed sizeof(phys_addr_t) + sizeof(size_t).

> There will normally be 4 bytes of padding at the end of the structure.
> 
> About the only place where it will be 12 bytes is a 32bit system with
> 64bit phyaddr that aligns 64bit items on 32bit boundaries - so x86.

So does that mean Christoph's comment about the size increase is not correct?

Thanks

> 
> 	David
> 
> > After this change (on 64bits system): 128 bits.
> > 
> > It is 33% increase per-structure.
> > 
> > So what is the resolution? Should I drop this patch or not?
> > 
> > Thanks 
> > 
> 


* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
  2025-11-19 13:58         ` Leon Romanovsky
@ 2025-11-19 14:13           ` David Laight
  0 siblings, 0 replies; 15+ messages in thread
From: David Laight @ 2025-11-19 14:13 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg,
	linux-block, linux-kernel, linux-nvme, Chaitanya Kulkarni

On Wed, 19 Nov 2025 15:58:21 +0200
Leon Romanovsky <leon@kernel.org> wrote:

> On Wed, Nov 19, 2025 at 01:36:15PM +0000, David Laight wrote:
> > On Wed, 19 Nov 2025 11:55:16 +0200
> > Leon Romanovsky <leon@kernel.org> wrote:
> >   
> > > On Tue, Nov 18, 2025 at 06:03:11AM +0100, Christoph Hellwig wrote:  
> > > > On Mon, Nov 17, 2025 at 09:22:43PM +0200, Leon Romanovsky wrote:    
> > > > > From: Leon Romanovsky <leonro@nvidia.com>
> > > > > 
> > > > > This patch changes the length variables from unsigned int to size_t.
> > > > > Using size_t ensures that we can handle larger sizes, as size_t is
> > > > > always equal to or larger than the previously used u32 type.
> > > > > 
> > > > > Originally, u32 was used because blk-mq-dma code evolved from
> > > > > scatter-gather implementation, which uses unsigned int to describe length.
> > > > > This change will also allow us to reuse the existing struct phys_vec in places
> > > > > that don't need scatter-gather.
> > > > > 
> > > > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > > > Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
> > > > > ---
> > > > >  block/blk-mq-dma.c      | 8 ++++++--
> > > > >  drivers/nvme/host/pci.c | 4 ++--
> > > > >  2 files changed, 8 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
> > > > > index e9108ccaf4b0..e7d9b54c3eed 100644
> > > > > --- a/block/blk-mq-dma.c
> > > > > +++ b/block/blk-mq-dma.c
> > > > > @@ -8,7 +8,7 @@
> > > > >  
> > > > >  struct phys_vec {
> > > > >  	phys_addr_t	paddr;
> > > > > -	u32		len;
> > > > > +	size_t		len;
> > > > >  };    
> > > > 
> > > > So we're now going to increase memory usage by 50% again after just
> > > > reducing it by removing the scatterlist?    
> > > 
> > > It is slightly less.
> > > 
> > > Before this change: 96 bits  
> > 
> > Did you actually look?  
> 
> No, I simply performed sizeof(phys_addr_t) + sizeof(size_t).
> 
> > There will normally be 4 bytes of padding at the end of the structure.
> > 
> > About the only place where it will be 12 bytes is a 32bit system with
> > 64bit phyaddr that aligns 64bit items on 32bit boundaries - so x86.  
> 
> So does it mean that Christoph's comment about size increase is not correct?

Correct - i.e., there is no size increase.

> 
> Thanks
> 
> > 
> > 	David
> >   
> > > After this change (on 64bits system): 128 bits.
> > > 
> > > It is 33% increase per-structure.
> > > 
> > > So what is the resolution? Should I drop this patch or not?
> > > 
> > > Thanks 
> > >   
> >   



* Re: [PATCH v2 1/2] nvme-pci: Use size_t for length fields to handle larger sizes
  2025-11-19 10:10       ` Christoph Hellwig
  2025-11-19 11:06         ` Leon Romanovsky
@ 2025-11-19 14:44         ` Leon Romanovsky
  1 sibling, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2025-11-19 14:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Sagi Grimberg, linux-block, linux-kernel,
	linux-nvme, Chaitanya Kulkarni

On Wed, Nov 19, 2025 at 11:10:48AM +0100, Christoph Hellwig wrote:
> On Wed, Nov 19, 2025 at 11:55:16AM +0200, Leon Romanovsky wrote:
> > So what is the resolution? Should I drop this patch or not?
> 
> I think it should be dropped.  I don't think having to use one 96-bit
> structure per 4GB worth of memory should be a deal breaker.

Christoph,

It seems that there is no size change before and after my patch.
https://lore.kernel.org/linux-nvme/20251117-nvme-phys-types-v2-0-c75a60a2c468@nvidia.com/T/#ma575c050517e91e7630683cf193e39d812338fa4

Thanks

