From: Chaitanya Kulkarni <kch@nvidia.com>
To: <song@kernel.org>, <yukuai@fnnas.com>, <linan122@huawei.com>,
<kbusch@kernel.org>, <axboe@kernel.dk>, <hch@lst.de>,
<sagi@grimberg.me>
Cc: <linux-raid@vger.kernel.org>, <linux-nvme@lists.infradead.org>,
<kmodukuri@nvidia.com>, Chaitanya Kulkarni <kch@nvidia.com>
Subject: [PATCH V2 1/2] md: propagate BLK_FEAT_PCI_P2PDMA from member devices
Date: Wed, 8 Apr 2026 00:25:36 -0700 [thread overview]
Message-ID: <20260408072537.46540-2-kch@nvidia.com> (raw)
In-Reply-To: <20260408072537.46540-1-kch@nvidia.com>

From: Kiran Kumar Modukuri <kmodukuri@nvidia.com>

MD RAID does not propagate BLK_FEAT_PCI_P2PDMA from member devices to
the RAID device, preventing peer-to-peer DMA through the RAID layer even
when all underlying devices support it.

Enable BLK_FEAT_PCI_P2PDMA in the raid0, raid1, and raid10 personalities
during queue limits setup, and clear it in mddev_stack_rdev_limits()
during array initialization and in mddev_stack_new_rdev() during hot-add
if any member device lacks support. The parity RAID personalities
(raid4/5/6) are excluded because they need CPU access to data pages for
parity computation, which is incompatible with P2P mappings.

Tested with RAID0/1/10 arrays containing multiple NVMe devices with
P2PDMA support, confirming that peer-to-peer transfers work correctly
through the RAID layer.

Signed-off-by: Kiran Kumar Modukuri <kmodukuri@nvidia.com>
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
drivers/md/md.c | 4 ++++
drivers/md/raid0.c | 1 +
drivers/md/raid1.c | 1 +
drivers/md/raid10.c | 1 +
4 files changed, 7 insertions(+)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 521d9b34cd9e..48d7a3ca8c66 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6176,6 +6176,8 @@ int mddev_stack_rdev_limits(struct mddev *mddev, struct queue_limits *lim,
if ((flags & MDDEV_STACK_INTEGRITY) &&
!queue_limits_stack_integrity_bdev(lim, rdev->bdev))
return -EINVAL;
+ if (!blk_queue_pci_p2pdma(rdev->bdev->bd_disk->queue))
+ lim->features &= ~BLK_FEAT_PCI_P2PDMA;
}
/*
@@ -6231,6 +6233,8 @@ int mddev_stack_new_rdev(struct mddev *mddev, struct md_rdev *rdev)
lim = queue_limits_start_update(mddev->gendisk->queue);
queue_limits_stack_bdev(&lim, rdev->bdev, rdev->data_offset,
mddev->gendisk->disk_name);
+ if (!blk_queue_pci_p2pdma(rdev->bdev->bd_disk->queue))
+ lim.features &= ~BLK_FEAT_PCI_P2PDMA;
if (!queue_limits_stack_integrity_bdev(&lim, rdev->bdev)) {
pr_err("%s: incompatible integrity profile for %pg\n",
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index ef0045db409f..1cdcafd31744 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -392,6 +392,7 @@ static int raid0_set_limits(struct mddev *mddev)
lim.io_opt = lim.io_min * mddev->raid_disks;
lim.chunk_sectors = mddev->chunk_sectors;
lim.features |= BLK_FEAT_ATOMIC_WRITES;
+ lim.features |= BLK_FEAT_PCI_P2PDMA;
err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
if (err)
return err;
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 16f671ab12c0..b25e661e9738 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -3192,6 +3192,7 @@ static int raid1_set_limits(struct mddev *mddev)
lim.max_hw_wzeroes_unmap_sectors = 0;
lim.logical_block_size = mddev->logical_block_size;
lim.features |= BLK_FEAT_ATOMIC_WRITES;
+ lim.features |= BLK_FEAT_PCI_P2PDMA;
err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
if (err)
return err;
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 4901ebe45c87..07a5b734c8f3 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3939,6 +3939,7 @@ static int raid10_set_queue_limits(struct mddev *mddev)
lim.chunk_sectors = mddev->chunk_sectors;
lim.io_opt = lim.io_min * raid10_nr_stripes(conf);
lim.features |= BLK_FEAT_ATOMIC_WRITES;
+ lim.features |= BLK_FEAT_PCI_P2PDMA;
err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
if (err)
return err;
--
2.39.5
Thread overview: 3+ messages
2026-04-08 7:25 [PATCH V2 0/2] md/nvme: Enable PCI P2PDMA support for RAID0 and NVMe Multipath Chaitanya Kulkarni
2026-04-08 7:25 ` Chaitanya Kulkarni [this message]
2026-04-08 7:25 ` [PATCH V2 2/2] nvme-multipath: enable PCI P2PDMA for multipath devices Chaitanya Kulkarni