* [PATCH 0/3] block: Enable proper MMIO memory handling for P2P DMA
@ 2025-10-17 5:31 Leon Romanovsky
From: Leon Romanovsky @ 2025-10-17 5:31 UTC (permalink / raw)
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg
Cc: linux-block, linux-kernel, linux-nvme
Changelog:
v1:
* Reordered patches.
* Dropped patch which tried to unify unmap flow.
* Set MMIO flag separately for data and integrity payloads.
v0: https://lore.kernel.org/all/cover.1760369219.git.leon@kernel.org/
----------------------------------------------------------------------
This patch series improves block layer and NVMe driver support for MMIO
memory regions, particularly for peer-to-peer (P2P) DMA transfers that
go through the host bridge.
The series addresses a critical gap where P2P transfers through the host
bridge (PCI_P2PDMA_MAP_THRU_HOST_BRIDGE) were not properly marked as
MMIO memory, leading to potential issues with:
- Inappropriate CPU cache synchronization operations on MMIO regions
- Incorrect DMA mapping/unmapping that doesn't respect MMIO semantics
- Missing IOMMU configuration for MMIO memory handling
This work is extracted from the larger DMA physical API improvement
series [1] and focuses specifically on block layer and NVMe requirements
for MMIO memory support.
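To illustrate the intended handling, the sketch below shows how an MMIO-backed
range routed through the host bridge would be mapped and unmapped once the
series is applied. This is a hypothetical example, not code from the series;
only dma_map_phys(), dma_unmap_phys() and DMA_ATTR_MMIO are interfaces the
patches actually rely on.

#include <linux/dma-mapping.h>

/* Hypothetical helper: map one MMIO range for DMA and unmap it again.
 * DMA_ATTR_MMIO tells the DMA core to skip CPU cache maintenance and to
 * program the IOMMU for MMIO instead of treating the range as struct-page
 * backed memory.
 */
static int mmio_dma_example(struct device *dev, phys_addr_t paddr, size_t len)
{
	dma_addr_t addr;

	addr = dma_map_phys(dev, paddr, len, DMA_TO_DEVICE, DMA_ATTR_MMIO);
	if (dma_mapping_error(dev, addr))
		return -ENOMEM;

	/* ... hand 'addr' to the device and wait for the transfer ... */

	dma_unmap_phys(dev, addr, len, DMA_TO_DEVICE, DMA_ATTR_MMIO);
	return 0;
}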
Thanks
[1] https://lore.kernel.org/all/cover.1757423202.git.leonro@nvidia.com/
---
Leon Romanovsky (3):
blk-mq-dma: migrate to dma_map_phys instead of map_page
nvme-pci: unmap MMIO pages with appropriate interface
block-dma: properly take MMIO path
block/blk-mq-dma.c | 12 +++++++++---
drivers/nvme/host/pci.c | 18 +++++++++++++-----
include/linux/bio-integrity.h | 1 +
include/linux/blk-integrity.h | 3 ++-
include/linux/blk-mq-dma.h | 14 +++++++++++---
include/linux/blk_types.h | 2 ++
6 files changed, 38 insertions(+), 12 deletions(-)
---
base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
change-id: 20251016-block-with-mmio-02acf4285427
Best regards,
--
Leon Romanovsky <leonro@nvidia.com>
* [PATCH 1/3] blk-mq-dma: migrate to dma_map_phys instead of map_page
From: Leon Romanovsky @ 2025-10-17 5:31 UTC (permalink / raw)
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg
Cc: linux-block, linux-kernel, linux-nvme
From: Leon Romanovsky <leonro@nvidia.com>
After the introduction of dma_map_phys(), there is no need to convert a
physical address to a struct page in order to map it. So let's use the
physical address directly.
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
block/blk-mq-dma.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index 449950029872..4ba7b0323da4 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -93,8 +93,8 @@ static bool blk_dma_map_bus(struct blk_dma_iter *iter, struct phys_vec *vec)
static bool blk_dma_map_direct(struct request *req, struct device *dma_dev,
struct blk_dma_iter *iter, struct phys_vec *vec)
{
- iter->addr = dma_map_page(dma_dev, phys_to_page(vec->paddr),
- offset_in_page(vec->paddr), vec->len, rq_dma_dir(req));
+ iter->addr = dma_map_phys(dma_dev, vec->paddr, vec->len,
+ rq_dma_dir(req), 0);
if (dma_mapping_error(dma_dev, iter->addr)) {
iter->status = BLK_STS_RESOURCE;
return false;
--
2.51.0
* [PATCH 2/3] nvme-pci: unmap MMIO pages with appropriate interface
From: Leon Romanovsky @ 2025-10-17 5:31 UTC (permalink / raw)
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg
Cc: linux-block, linux-kernel, linux-nvme
From: Leon Romanovsky <leonro@nvidia.com>
The block layer maps MMIO memory through the dma_map_phys() interface
with the help of the DMA_ATTR_MMIO attribute. That memory needs to be
unmapped with the matching unmap function, something which wasn't
possible before the new REQ attribute was added to the block layer in
the previous patch.
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
drivers/nvme/host/pci.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index c916176bd9f0..2e9fb3c7bc09 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -689,11 +689,15 @@ static void nvme_free_prps(struct request *req)
{
struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
struct nvme_queue *nvmeq = req->mq_hctx->driver_data;
+ unsigned int attrs = 0;
unsigned int i;
+ if (req->cmd_flags & REQ_MMIO)
+ attrs |= DMA_ATTR_MMIO;
+
for (i = 0; i < iod->nr_dma_vecs; i++)
- dma_unmap_page(nvmeq->dev->dev, iod->dma_vecs[i].addr,
- iod->dma_vecs[i].len, rq_dma_dir(req));
+ dma_unmap_phys(nvmeq->dev->dev, iod->dma_vecs[i].addr,
+ iod->dma_vecs[i].len, rq_dma_dir(req), attrs);
mempool_free(iod->dma_vecs, nvmeq->dev->dmavec_mempool);
}
@@ -704,16 +708,20 @@ static void nvme_free_sgls(struct request *req, struct nvme_sgl_desc *sge,
enum dma_data_direction dir = rq_dma_dir(req);
unsigned int len = le32_to_cpu(sge->length);
struct device *dma_dev = nvmeq->dev->dev;
+ unsigned int attrs = 0;
unsigned int i;
+ if (req->cmd_flags & REQ_MMIO)
+ attrs |= DMA_ATTR_MMIO;
+
if (sge->type == (NVME_SGL_FMT_DATA_DESC << 4)) {
- dma_unmap_page(dma_dev, le64_to_cpu(sge->addr), len, dir);
+ dma_unmap_phys(dma_dev, le64_to_cpu(sge->addr), len, dir, attrs);
return;
}
for (i = 0; i < len / sizeof(*sg_list); i++)
- dma_unmap_page(dma_dev, le64_to_cpu(sg_list[i].addr),
- le32_to_cpu(sg_list[i].length), dir);
+ dma_unmap_phys(dma_dev, le64_to_cpu(sg_list[i].addr),
+ le32_to_cpu(sg_list[i].length), dir, attrs);
}
static void nvme_unmap_metadata(struct request *req)
--
2.51.0
* [PATCH 3/3] block-dma: properly take MMIO path
From: Leon Romanovsky @ 2025-10-17 5:32 UTC (permalink / raw)
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg
Cc: linux-block, linux-kernel, linux-nvme
From: Leon Romanovsky <leonro@nvidia.com>
Make sure that CPU cache syncing is skipped and the IOMMU is configured
to take the MMIO path by providing the newly introduced DMA_ATTR_MMIO
attribute.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
block/blk-mq-dma.c | 10 ++++++++--
include/linux/bio-integrity.h | 1 +
include/linux/blk-integrity.h | 3 ++-
include/linux/blk-mq-dma.h | 14 +++++++++++---
include/linux/blk_types.h | 2 ++
5 files changed, 24 insertions(+), 6 deletions(-)
diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index 4ba7b0323da4..e1f460da95d7 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -94,7 +94,7 @@ static bool blk_dma_map_direct(struct request *req, struct device *dma_dev,
struct blk_dma_iter *iter, struct phys_vec *vec)
{
iter->addr = dma_map_phys(dma_dev, vec->paddr, vec->len,
- rq_dma_dir(req), 0);
+ rq_dma_dir(req), iter->iter.attrs);
if (dma_mapping_error(dma_dev, iter->addr)) {
iter->status = BLK_STS_RESOURCE;
return false;
@@ -116,7 +116,7 @@ static bool blk_rq_dma_map_iova(struct request *req, struct device *dma_dev,
do {
error = dma_iova_link(dma_dev, state, vec->paddr, mapped,
- vec->len, dir, 0);
+ vec->len, dir, iter->iter.attrs);
if (error)
break;
mapped += vec->len;
@@ -184,6 +184,12 @@ static bool blk_dma_map_iter_start(struct request *req, struct device *dma_dev,
* P2P transfers through the host bridge are treated the
* same as non-P2P transfers below and during unmap.
*/
+ if (iter->iter.is_integrity)
+ bio_integrity(req->bio)->bip_flags |= BIP_MMIO;
+ else
+ req->cmd_flags |= REQ_MMIO;
+ iter->iter.attrs |= DMA_ATTR_MMIO;
+ fallthrough;
case PCI_P2PDMA_MAP_NONE:
break;
default:
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index 851254f36eb3..b77b2cfb7b0f 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -14,6 +14,7 @@ enum bip_flags {
BIP_CHECK_REFTAG = 1 << 6, /* reftag check */
BIP_CHECK_APPTAG = 1 << 7, /* apptag check */
BIP_P2P_DMA = 1 << 8, /* using P2P address */
+ BIP_MMIO = 1 << 9, /* contains MMIO memory */
};
struct bio_integrity_payload {
diff --git a/include/linux/blk-integrity.h b/include/linux/blk-integrity.h
index b659373788f6..34648d6c14d7 100644
--- a/include/linux/blk-integrity.h
+++ b/include/linux/blk-integrity.h
@@ -33,7 +33,8 @@ static inline bool blk_rq_integrity_dma_unmap(struct request *req,
size_t mapped_len)
{
return blk_dma_unmap(req, dma_dev, state, mapped_len,
- bio_integrity(req->bio)->bip_flags & BIP_P2P_DMA);
+ bio_integrity(req->bio)->bip_flags & BIP_P2P_DMA,
+ bio_integrity(req->bio)->bip_flags & BIP_MMIO);
}
int blk_rq_count_integrity_sg(struct request_queue *, struct bio *);
diff --git a/include/linux/blk-mq-dma.h b/include/linux/blk-mq-dma.h
index 51829958d872..916ca1deaf2c 100644
--- a/include/linux/blk-mq-dma.h
+++ b/include/linux/blk-mq-dma.h
@@ -10,6 +10,7 @@ struct blk_map_iter {
struct bio *bio;
struct bio_vec *bvecs;
bool is_integrity;
+ unsigned int attrs;
};
struct blk_dma_iter {
@@ -49,19 +50,25 @@ static inline bool blk_rq_dma_map_coalesce(struct dma_iova_state *state)
* @state: DMA IOVA state
* @mapped_len: number of bytes to unmap
* @is_p2p: true if mapped with PCI_P2PDMA_MAP_BUS_ADDR
+ * @is_mmio: true if mapped with PCI_P2PDMA_MAP_THRU_HOST_BRIDGE
*
* Returns %false if the callers need to manually unmap every DMA segment
* mapped using @iter or %true if no work is left to be done.
*/
static inline bool blk_dma_unmap(struct request *req, struct device *dma_dev,
- struct dma_iova_state *state, size_t mapped_len, bool is_p2p)
+ struct dma_iova_state *state, size_t mapped_len, bool is_p2p,
+ bool is_mmio)
{
if (is_p2p)
return true;
if (dma_use_iova(state)) {
+ unsigned int attrs = 0;
+
+ if (is_mmio)
+ attrs = DMA_ATTR_MMIO;
dma_iova_destroy(dma_dev, state, mapped_len, rq_dma_dir(req),
- 0);
+ attrs);
return true;
}
@@ -72,7 +79,8 @@ static inline bool blk_rq_dma_unmap(struct request *req, struct device *dma_dev,
struct dma_iova_state *state, size_t mapped_len)
{
return blk_dma_unmap(req, dma_dev, state, mapped_len,
- req->cmd_flags & REQ_P2PDMA);
+ req->cmd_flags & REQ_P2PDMA,
+ req->cmd_flags & REQ_MMIO);
}
#endif /* BLK_MQ_DMA_H */
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 8e8d1cc8b06c..9affa3b2d047 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -382,6 +382,7 @@ enum req_flag_bits {
__REQ_FS_PRIVATE, /* for file system (submitter) use */
__REQ_ATOMIC, /* for atomic write operations */
__REQ_P2PDMA, /* contains P2P DMA pages */
+ __REQ_MMIO, /* contains MMIO memory */
/*
* Command specific flags, keep last:
*/
@@ -415,6 +416,7 @@ enum req_flag_bits {
#define REQ_FS_PRIVATE (__force blk_opf_t)(1ULL << __REQ_FS_PRIVATE)
#define REQ_ATOMIC (__force blk_opf_t)(1ULL << __REQ_ATOMIC)
#define REQ_P2PDMA (__force blk_opf_t)(1ULL << __REQ_P2PDMA)
+#define REQ_MMIO (__force blk_opf_t)(1ULL << __REQ_MMIO)
#define REQ_NOUNMAP (__force blk_opf_t)(1ULL << __REQ_NOUNMAP)
--
2.51.0
* Re: [PATCH 1/3] blk-mq-dma: migrate to dma_map_phys instead of map_page
From: Christoph Hellwig @ 2025-10-17 6:18 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg,
linux-block, linux-kernel, linux-nvme
On Fri, Oct 17, 2025 at 08:31:58AM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
>
> After the introduction of dma_map_phys(), there is no need to convert a
> physical address to a struct page in order to map it. So let's use the
> physical address directly.
>
> Reviewed-by: Keith Busch <kbusch@kernel.org>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
You forgot to pick up my review from last round.
* Re: [PATCH 2/3] nvme-pci: unmap MMIO pages with appropriate interface
From: Christoph Hellwig @ 2025-10-17 6:20 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg,
linux-block, linux-kernel, linux-nvme
On Fri, Oct 17, 2025 at 08:31:59AM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
>
> The block layer maps MMIO memory through the dma_map_phys() interface
> with the help of the DMA_ATTR_MMIO attribute. That memory needs to be
> unmapped with the matching unmap function, something which wasn't
> possible before the new REQ attribute was added to the block layer in
> the previous patch.
DMA_ATTR_MMIO only gets set in the following patch as far as I can
tell.
The more logical way would be to simply convert to dma_unmap_phys
here and then add the flag in one go as suggested last round.
* Re: [PATCH 3/3] block-dma: properly take MMIO path
From: Christoph Hellwig @ 2025-10-17 6:25 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg,
linux-block, linux-kernel, linux-nvme
On Fri, Oct 17, 2025 at 08:32:00AM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
>
> Make sure that CPU cache syncing is skipped and the IOMMU is configured
> to take the MMIO path by providing the newly introduced DMA_ATTR_MMIO
> attribute.
Please write a commit log that explains this. Where was DMA_ATTR_MMIO
recently introduced? Why? What does this actually fix or improve?
> @@ -184,6 +184,12 @@ static bool blk_dma_map_iter_start(struct request *req, struct device *dma_dev,
> * P2P transfers through the host bridge are treated the
> * same as non-P2P transfers below and during unmap.
> */
> + if (iter->iter.is_integrity)
> + bio_integrity(req->bio)->bip_flags |= BIP_MMIO;
> + else
> + req->cmd_flags |= REQ_MMIO;
> + iter->iter.attrs |= DMA_ATTR_MMIO;
REQ_MMIO / BIP_MMIO is not block layer state, but driver state resulting
from the dma mapping. Reflecting it in block layer data structures
is not a good idea. This is really something that just needs to be
communicated outward and recorded in the driver. For nvme I suspect
two new flags in nvme_iod_flags would be the right place, assuming
we actually need it. But do we need it? If REQ_/BIP_P2PDMA is set,
these are always true.
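A minimal sketch of that direction, with placeholder bit values (the names
happen to match what the v2 further down the thread ends up using; nothing
here is taken from the tree):

/* Hypothetical per-I/O driver state in drivers/nvme/host/pci.c: remember
 * whether the data / metadata payload was mapped with DMA_ATTR_MMIO so
 * that the unmap path can pass the same attribute back.
 */
#define IOD_DATA_MMIO	(1U << 6)
#define IOD_META_MMIO	(1U << 7)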
* Re: [PATCH 1/3] blk-mq-dma: migrate to dma_map_phys instead of map_page
From: Leon Romanovsky @ 2025-10-19 14:40 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Keith Busch, Sagi Grimberg, linux-block, linux-kernel,
linux-nvme
On Fri, Oct 17, 2025 at 08:18:48AM +0200, Christoph Hellwig wrote:
> On Fri, Oct 17, 2025 at 08:31:58AM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > After the introduction of dma_map_phys(), there is no need to convert a
> > physical address to a struct page in order to map it. So let's use the
> > physical address directly.
> >
> > Reviewed-by: Keith Busch <kbusch@kernel.org>
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>
> You forgot to pick up my review from last round.
I'm sorry that I missed your Reviewed-by tag.
Thanks
* Re: [PATCH 2/3] nvme-pci: unmap MMIO pages with appropriate interface
From: Leon Romanovsky @ 2025-10-20 7:53 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Keith Busch, Sagi Grimberg, linux-block, linux-kernel,
linux-nvme
On Fri, Oct 17, 2025 at 08:20:08AM +0200, Christoph Hellwig wrote:
> On Fri, Oct 17, 2025 at 08:31:59AM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > The block layer maps MMIO memory through the dma_map_phys() interface
> > with the help of the DMA_ATTR_MMIO attribute. That memory needs to be
> > unmapped with the matching unmap function, something which wasn't
> > possible before the new REQ attribute was added to the block layer in
> > the previous patch.
>
> DMA_ATTR_MMIO only gets set in the following patch as far as I can
> tell.
>
> The more logical way would be to simply convert to dma_unmap_phys
> here and then add the flag in one go as suggested last round.
Done, thanks
* Re: [PATCH 3/3] block-dma: properly take MMIO path
From: Leon Romanovsky @ 2025-10-20 8:52 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Keith Busch, Sagi Grimberg, linux-block, linux-kernel,
linux-nvme
On Fri, Oct 17, 2025 at 08:25:19AM +0200, Christoph Hellwig wrote:
> On Fri, Oct 17, 2025 at 08:32:00AM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > Make sure that CPU cache syncing is skipped and the IOMMU is configured
> > to take the MMIO path by providing the newly introduced DMA_ATTR_MMIO
> > attribute.
>
> Please write a commit log that explains this. Where was DMA_ATTR_MMIO
> recently introduced? Why? What does this actually fix or improve?
What about this commit message?
Author: Leon Romanovsky <leonro@nvidia.com>
Date: Mon Oct 13 18:34:12 2025 +0300
block-dma: properly take MMIO path
In commit eadaa8b255f3 ("dma-mapping: introduce new DMA attribute to
indicate MMIO memory"), the DMA_ATTR_MMIO attribute was added to describe
MMIO addresses, which require skipping any memory cache flushing, as an
outcome of the discussion referenced in the Link tag below.
For a PCI_P2PDMA_MAP_THRU_HOST_BRIDGE transfer, the blk-mq-dma logic
treated the memory as regular pages and relied on the "struct page" DMA
flow. That flow performs CPU cache flushing, which shouldn't be done
here, and doesn't set the IOMMU_MMIO flag in the DMA-IOMMU case.
Link: https://lore.kernel.org/all/f912c446-1ae9-4390-9c11-00dce7bf0fd3@arm.com/
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
* Re: [PATCH 3/3] block-dma: properly take MMIO path
From: Leon Romanovsky @ 2025-10-20 8:56 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Keith Busch, Sagi Grimberg, linux-block, linux-kernel,
linux-nvme
On Fri, Oct 17, 2025 at 08:25:19AM +0200, Christoph Hellwig wrote:
> On Fri, Oct 17, 2025 at 08:32:00AM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > Make sure that CPU cache syncing is skipped and the IOMMU is configured
> > to take the MMIO path by providing the newly introduced DMA_ATTR_MMIO
> > attribute.
<...>
> > + if (iter->iter.is_integrity)
> > + bio_integrity(req->bio)->bip_flags |= BIP_MMIO;
> > + else
> > + req->cmd_flags |= REQ_MMIO;
> > + iter->iter.attrs |= DMA_ATTR_MMIO;
>
> REQ_MMIO / BIP_MMIO is not block layer state, but driver state resulting
> from the dma mapping. Reflecting it in block layer data structures
> is not a good idea. This is really something that just needs to be
> communicated outward and recorded in the driver. For nvme I suspect
> two new flags in nvme_iod_flags would be the right place, assuming
> we actually need it. But do we need it? If REQ_/BIP_P2PDMA is set,
> these are always true.
We have three different flows:
1. The regular one, backed by struct page, e.g. dma_map_page()
2. PCI_P2PDMA_MAP_BUS_ADDR - non-DMA flow
3. PCI_P2PDMA_MAP_THRU_HOST_BRIDGE - DMA without struct page, e.g. dma_map_resource()
Two bits are needed to represent them.
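Roughly, a sketch of how one segment would be dispatched across those three
flows (helper names follow the p2pdma and dma-mapping interfaces this series
builds on; the wrapper function itself is hypothetical):

#include <linux/dma-mapping.h>
#include <linux/pci-p2pdma.h>

/* Sketch only: dispatch one physical segment according to its P2P mapping
 * type. Flows 1 and 3 both go through dma_map_phys(); flow 3 adds
 * DMA_ATTR_MMIO, flow 2 bypasses DMA mapping entirely.
 */
static dma_addr_t map_one_segment(struct device *dev,
				  struct pci_p2pdma_map_state *p2p,
				  struct page *page, phys_addr_t paddr,
				  size_t len, enum dma_data_direction dir)
{
	switch (pci_p2pdma_state(p2p, dev, page)) {
	case PCI_P2PDMA_MAP_BUS_ADDR:
		/* 2: peer-to-peer bus address, no DMA mapping at all */
		return pci_p2pdma_bus_addr_map(p2p, paddr);
	case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
		/* 3: MMIO through the host bridge, no struct page semantics */
		return dma_map_phys(dev, paddr, len, dir, DMA_ATTR_MMIO);
	case PCI_P2PDMA_MAP_NONE:
		/* 1: regular struct-page backed memory */
		return dma_map_phys(dev, paddr, len, dir, 0);
	default:
		return DMA_MAPPING_ERROR;
	}
}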
Thanks
* Re: [PATCH 3/3] block-dma: properly take MMIO path
From: Christoph Hellwig @ 2025-10-20 12:30 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg,
linux-block, linux-kernel, linux-nvme
On Mon, Oct 20, 2025 at 11:52:31AM +0300, Leon Romanovsky wrote:
> What about this commit message?
Much better. Btw, what is the plan for getting rid of the "automatic"
p2p handling, which would be the logical conclusion from this?
* Re: [PATCH 3/3] block-dma: properly take MMIO path
From: Leon Romanovsky @ 2025-10-20 14:53 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Keith Busch, Sagi Grimberg, linux-block, linux-kernel,
linux-nvme
On Mon, Oct 20, 2025 at 02:30:27PM +0200, Christoph Hellwig wrote:
> On Mon, Oct 20, 2025 at 11:52:31AM +0300, Leon Romanovsky wrote:
> > What about this commit message?
>
> Much better. Btw, what is the plan for getting rid of the "automatic"
> p2p handling, which would be the logical conclusion from this?
I continued with the "automatic" p2p code and think that it is structured
pretty well. Why do you want to remove it?
The code in v2 looks like this:
@@ -184,6 +184,8 @@ static bool blk_dma_map_iter_start(struct request *req, struct device *dma_dev,
* P2P transfers through the host bridge are treated the
* same as non-P2P transfers below and during unmap.
*/
+ iter->attrs |= DMA_ATTR_MMIO;
+ fallthrough;
case PCI_P2PDMA_MAP_NONE:
break;
default:
...
@@ -1038,6 +1051,9 @@ static blk_status_t nvme_map_data(struct request *req)
if (!blk_rq_dma_map_iter_start(req, dev->dev, &iod->dma_state, &iter))
return iter.status;
+ if (iter.attrs & DMA_ATTR_MMIO)
+ iod->flags |= IOD_DATA_MMIO;
+
if (use_sgl == SGL_FORCED ||
(use_sgl == SGL_SUPPORTED &&
(sgl_threshold && nvme_pci_avg_seg_size(req) >= sgl_threshold)))
@@ -1060,6 +1076,9 @@ static blk_status_t nvme_pci_setup_meta_sgls(struct request *req)
&iod->meta_dma_state, &iter))
return iter.status;
+ if (iter.attrs & DMA_ATTR_MMIO)
+ iod->flags |= IOD_META_MMIO;
+
if (blk_rq_dma_map_coalesce(&iod->meta_dma_state))
entries = 1;
...
@@ -733,8 +739,11 @@ static void nvme_unmap_metadata(struct request *req)
return;
}
+ if (iod->flags & IOD_META_MMIO)
+ attrs |= DMA_ATTR_MMIO;
+
if (!blk_rq_integrity_dma_unmap(req, dma_dev, &iod->meta_dma_state,
- iod->meta_total_len)) {
+ iod->meta_total_len, attrs)) {
if (nvme_pci_cmd_use_meta_sgl(&iod->cmd))
nvme_free_sgls(req, sge, &sge[1], attrs);
else
The code is here (waiting for kbuild results) https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/log/?h=block-with-mmio-v2
Thanks