From: Christoph Hellwig <hch@lst.de>
To: Jens Axboe <axboe@kernel.dk>
Cc: Keith Busch <kbusch@kernel.org>, Sagi Grimberg <sagi@grimberg.me>,
Chaitanya Kulkarni <kch@nvidia.com>,
Kanchan Joshi <joshi.k@samsung.com>,
Leon Romanovsky <leon@kernel.org>,
Nitesh Shetty <nj.shetty@samsung.com>,
Logan Gunthorpe <logang@deltatee.com>,
linux-block@vger.kernel.org, linux-nvme@lists.infradead.org
Subject: [PATCH 1/9] block: don't merge different kinds of P2P transfers in a single bio
Date: Tue, 10 Jun 2025 07:06:39 +0200
Message-ID: <20250610050713.2046316-2-hch@lst.de>
In-Reply-To: <20250610050713.2046316-1-hch@lst.de>
To save the DMA mapping helpers from having to check every segment for
its P2P status, ensure that bios contain either P2P transfers or non-P2P
transfers, and that a P2P bio only contains ranges from a single device.
This means we do the page zone access in the bio add path, where the page
should still be hot, and only have to do the fairly expensive P2P topology
lookup once per bio down in the DMA mapping path, and only for already
marked bios.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
block/bio-integrity.c | 3 +++
block/bio.c | 20 +++++++++++++-------
include/linux/blk_types.h | 2 ++
3 files changed, 18 insertions(+), 7 deletions(-)
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 10912988c8f5..6b077ca937f6 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -128,6 +128,9 @@ int bio_integrity_add_page(struct bio *bio, struct page *page,
if (bip->bip_vcnt > 0) {
struct bio_vec *bv = &bip->bip_vec[bip->bip_vcnt - 1];
+ if (!zone_device_pages_have_same_pgmap(bv->bv_page, page))
+ return 0;
+
if (bvec_try_merge_hw_page(q, bv, page, len, offset)) {
bip->bip_iter.bi_size += len;
return len;
diff --git a/block/bio.c b/block/bio.c
index 3c0a558c90f5..92c512e876c8 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -930,8 +930,6 @@ static bool bvec_try_merge_page(struct bio_vec *bv, struct page *page,
return false;
if (xen_domain() && !xen_biovec_phys_mergeable(bv, page))
return false;
- if (!zone_device_pages_have_same_pgmap(bv->bv_page, page))
- return false;
if ((vec_end_addr & PAGE_MASK) != ((page_addr + off) & PAGE_MASK)) {
if (IS_ENABLED(CONFIG_KMSAN))
@@ -982,6 +980,9 @@ void __bio_add_page(struct bio *bio, struct page *page,
WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));
WARN_ON_ONCE(bio_full(bio, len));
+ if (is_pci_p2pdma_page(page))
+ bio->bi_opf |= REQ_P2PDMA | REQ_NOMERGE;
+
bvec_set_page(&bio->bi_io_vec[bio->bi_vcnt], page, len, off);
bio->bi_iter.bi_size += len;
bio->bi_vcnt++;
@@ -1022,11 +1023,16 @@ int bio_add_page(struct bio *bio, struct page *page,
if (bio->bi_iter.bi_size > UINT_MAX - len)
return 0;
- if (bio->bi_vcnt > 0 &&
- bvec_try_merge_page(&bio->bi_io_vec[bio->bi_vcnt - 1],
- page, len, offset)) {
- bio->bi_iter.bi_size += len;
- return len;
+ if (bio->bi_vcnt > 0) {
+ struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
+
+ if (!zone_device_pages_have_same_pgmap(bv->bv_page, page))
+ return 0;
+
+ if (bvec_try_merge_page(bv, page, len, offset)) {
+ bio->bi_iter.bi_size += len;
+ return len;
+ }
}
if (bio->bi_vcnt >= bio->bi_max_vecs)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 3d1577f07c1c..2a02972dc17c 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -386,6 +386,7 @@ enum req_flag_bits {
__REQ_DRV, /* for driver use */
__REQ_FS_PRIVATE, /* for file system (submitter) use */
__REQ_ATOMIC, /* for atomic write operations */
+ __REQ_P2PDMA, /* contains P2P DMA pages */
/*
* Command specific flags, keep last:
*/
@@ -418,6 +419,7 @@ enum req_flag_bits {
#define REQ_DRV (__force blk_opf_t)(1ULL << __REQ_DRV)
#define REQ_FS_PRIVATE (__force blk_opf_t)(1ULL << __REQ_FS_PRIVATE)
#define REQ_ATOMIC (__force blk_opf_t)(1ULL << __REQ_ATOMIC)
+#define REQ_P2PDMA (__force blk_opf_t)(1ULL << __REQ_P2PDMA)
#define REQ_NOUNMAP (__force blk_opf_t)(1ULL << __REQ_NOUNMAP)
--
2.47.2
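For readers who want to see the resulting invariant in isolation, here is a minimal user-space sketch of the add-path rule described in the commit message. It is not kernel code: `model_page`, `model_bio`, the `MODEL_*` constants and all function names are hypothetical stand-ins, and a NULL `pgmap` is assumed to mean "ordinary, non-ZONE_DEVICE memory". It only models the pgmap check and the flag marking, not bvec merging or size accounting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct page / struct bio, for illustration only. */
struct model_page {
	const void *pgmap;	/* assumed: NULL for ordinary memory, else the P2P provider */
};

#define MODEL_REQ_P2PDMA	(1u << 0)	/* models REQ_P2PDMA */
#define MODEL_REQ_NOMERGE	(1u << 1)	/* models REQ_NOMERGE */
#define MODEL_MAX_VECS		4

struct model_bio {
	unsigned int opf;
	unsigned int vcnt;
	const struct model_page *vec[MODEL_MAX_VECS];
};

/* Models zone_device_pages_have_same_pgmap(): both ordinary, or same provider. */
static bool model_same_pgmap(const struct model_page *a,
			     const struct model_page *b)
{
	return a->pgmap == b->pgmap;
}

/* Returns 1 if the page was added, 0 if the caller has to start a new bio. */
static int model_bio_add_page(struct model_bio *bio,
			      const struct model_page *page)
{
	/* Refuse to mix P2P with non-P2P pages, or P2P pages of two providers. */
	if (bio->vcnt > 0 &&
	    !model_same_pgmap(bio->vec[bio->vcnt - 1], page))
		return 0;
	if (bio->vcnt >= MODEL_MAX_VECS)
		return 0;

	/* Mark the bio once at add time so later stages only test a flag. */
	if (page->pgmap)
		bio->opf |= MODEL_REQ_P2PDMA | MODEL_REQ_NOMERGE;

	bio->vec[bio->vcnt++] = page;
	return 1;
}

/* Small usage check: a P2P bio accepts only pages from its own provider. */
void model_self_check(void)
{
	static const int dev_a = 0, dev_b = 0;	/* two distinct fake providers */
	struct model_page ram = { NULL }, pa = { &dev_a }, pb = { &dev_b };
	struct model_bio bio = { 0 };

	assert(model_bio_add_page(&bio, &pa) == 1);
	assert(model_bio_add_page(&bio, &pa) == 1);
	assert(model_bio_add_page(&bio, &pb) == 0);
	assert(model_bio_add_page(&bio, &ram) == 0);
	assert(bio.opf & MODEL_REQ_P2PDMA);
}
```

Because every page in a marked bio is known to come from the same provider, a consumer in this model would only need to inspect the first vector of a bio that has the flag set, which is the "once per bio" property the commit message is after.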