Linux block layer
* [PATCH V4 0/3] md/nvme: Enable PCI P2PDMA support for RAID0 and NVMe Multipath
@ 2026-05-13 18:51 Chaitanya Kulkarni
  2026-05-13 18:51 ` [PATCH V4 1/3] block: clear BLK_FEAT_PCI_P2PDMA in blk_stack_limits() for non-supporting devices Chaitanya Kulkarni
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Chaitanya Kulkarni @ 2026-05-13 18:51 UTC (permalink / raw)
  To: song, yukuai, linan122, kbusch, axboe, hch, sagi
  Cc: linux-block, linux-raid, linux-nvme, kmodukuri,
	Chaitanya Kulkarni

Hi,

This patch series extends PCI peer-to-peer DMA (P2PDMA) support to enable
direct data transfers between PCIe devices through RAID and NVMe multipath
block layers.

Current Linux kernel P2PDMA infrastructure supports direct peer-to-peer
transfers, but this support is not propagated through certain storage
stacks such as MD RAID and NVMe multipath. This series adds a block-layer
prep patch plus patches for MD RAID 0/1/10 and NVMe multipath to
propagate P2PDMA support through the storage stack.
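
At a high level all three patches follow the existing block feature
stacking pattern; excerpted from the diffs below, the relevant pieces
are:

	/* stacking driver (md raid0/1/10, nvme multipath): advertise support */
	lim.features |= BLK_FEAT_PCI_P2PDMA;

	/* block core, blk_stack_limits(): drop it if any member lacks it */
	if (!(b->features & BLK_FEAT_PCI_P2PDMA))
		t->features &= ~BLK_FEAT_PCI_P2PDMA;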

All four test scenarios demonstrate that P2PDMA capabilities are correctly
propagated through both the MD RAID layer (patch 2/3) and NVMe multipath
layer (patch 3/3). Direct peer-to-peer transfers complete successfully with
full data integrity verification, confirming that:

1. RAID devices properly inherit P2PDMA capability from member devices
2. NVMe multipath devices correctly expose P2PDMA support
3. P2P memory buffers can be used for transfers involving both types
4. Data integrity is maintained across all transfer combinations
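
As a rough illustration of the kind of setup used for the scenarios
above (device names are placeholders, not the exact test configuration;
the P2P transfers themselves are not shown here):

	# RAID0 across two P2PDMA-capable NVMe namespaces
	mdadm --create /dev/md0 --level=0 --raid-devices=2 \
		/dev/nvme0n1 /dev/nvme1n1
	cat /proc/mdstat

	# NVMe multipath: list the paths behind each head node
	nvme list-subsys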

I've added the blktests log at the end as well.

Repo:-

git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux.git

Branch HEAD:-

commit 5d75bcab6af0916283c2675d831d39fe4d360b11 (origin/for-next)
Merge: feab7e5ae6ee 87d0740b7c4c
Author: Jens Axboe <axboe@kernel.dk>
Date:   Wed May 13 07:55:56 2026 -0600

    Merge branch 'block-7.1' into for-next
    
    * block-7.1:
      selftests: ublk: cap nthreads to kernel's actual nr_hw_queues

-ck

Changes from V3:-

1. Rebased on the latest code and added Reviewed-by and Tested-by tags.

Changes from V2:-

1. Unconditionally set BLK_FEAT_PCI_P2PDMA for md and nvme multipath.
   (Christoph)
2. Add a prep patch to clear BLK_FEAT_PCI_P2PDMA in blk_stack_limits().
   (Christoph)

Changes from V1:-

- Update patch 1 to explicitly support MD RAID 0/1/10.
- Fix signoff chain order for patch 2.
- Clear BLK_FEAT_PCI_P2PDMA in nvme_mpath_add_disk() when a newly
  added path does not support it, to handle multipath across different
  transports.
- Add nvme multipath test log for mixed transport TCP and PCIe.

Chaitanya Kulkarni (1):
  block: clear BLK_FEAT_PCI_P2PDMA in blk_stack_limits() for
    non-supporting devices

Kiran Kumar Modukuri (2):
  md: propagate BLK_FEAT_PCI_P2PDMA from member devices to RAID device
  nvme-multipath: enable PCI P2PDMA for multipath devices

 block/blk-settings.c          | 2 ++
 drivers/md/raid0.c            | 1 +
 drivers/md/raid1.c            | 1 +
 drivers/md/raid10.c           | 1 +
 drivers/nvme/host/multipath.c | 2 +-
 5 files changed, 6 insertions(+), 1 deletion(-)

++ for t in loop tcp
++ echo '################NVMET_TRTYPES=loop############'
################NVMET_TRTYPES=loop############
++ NVMET_TRTYPES=loop
++ ./check nvme
nvme/002 (tr=loop) (create many subsystems and test discovery) [passed]
    runtime  33.808s  ...  34.634s
nvme/003 (tr=loop) (test if we're sending keep-alives to a discovery controller) [passed]
    runtime  10.222s  ...  10.239s
nvme/004 (tr=loop) (test nvme and nvmet UUID NS descriptors) [passed]
    runtime  0.683s  ...  0.651s
nvme/005 (tr=loop) (reset local loopback target)             [passed]
    runtime  0.974s  ...  1.001s
nvme/006 (tr=loop bd=device) (create an NVMeOF target)       [passed]
    runtime  0.088s  ...  0.094s
nvme/006 (tr=loop bd=file) (create an NVMeOF target)         [passed]
    runtime  0.082s  ...  0.087s
nvme/008 (tr=loop bd=device) (create an NVMeOF host)         [passed]
    runtime  0.652s  ...  0.658s
nvme/008 (tr=loop bd=file) (create an NVMeOF host)           [passed]
    runtime  0.634s  ...  0.658s
nvme/010 (tr=loop bd=device) (run data verification fio job) [passed]
    runtime  9.426s  ...  9.469s
nvme/010 (tr=loop bd=file) (run data verification fio job)   [passed]
    runtime  41.443s  ...  43.716s
nvme/012 (tr=loop bd=device) (run mkfs and data verification fio) [passed]
    runtime  47.971s  ...  51.404s
nvme/012 (tr=loop bd=file) (run mkfs and data verification fio) [passed]
    runtime  38.937s  ...  38.971s
nvme/014 (tr=loop bd=device) (flush a command from host)     [passed]
    runtime  9.257s  ...  9.208s
nvme/014 (tr=loop bd=file) (flush a command from host)       [passed]
    runtime  8.874s  ...  8.863s
nvme/016 (tr=loop) (create/delete many NVMeOF block device-backed ns and test discovery) [passed]
    runtime  20.101s  ...  19.387s
nvme/017 (tr=loop) (create/delete many file-ns and test discovery) [passed]
    runtime  22.534s  ...  21.919s
nvme/018 (tr=loop) (unit test NVMe-oF out of range access on a file backend) [passed]
    runtime  0.629s  ...  0.653s
nvme/019 (tr=loop bd=device) (test NVMe DSM Discard command) [passed]
    runtime  0.638s  ...  0.657s
nvme/019 (tr=loop bd=file) (test NVMe DSM Discard command)   [passed]
    runtime  0.624s  ...  0.611s
nvme/021 (tr=loop bd=device) (test NVMe list command)        [passed]
    runtime  0.633s  ...  0.639s
nvme/021 (tr=loop bd=file) (test NVMe list command)          [passed]
    runtime  0.647s  ...  0.632s
nvme/022 (tr=loop bd=device) (test NVMe reset command)       [passed]
    runtime  1.008s  ...  1.011s
nvme/022 (tr=loop bd=file) (test NVMe reset command)         [passed]
    runtime  1.010s  ...  1.006s
nvme/023 (tr=loop bd=device) (test NVMe smart-log command)   [passed]
    runtime  0.628s  ...  0.623s
nvme/023 (tr=loop bd=file) (test NVMe smart-log command)     [passed]
    runtime  0.624s  ...  0.627s
nvme/025 (tr=loop bd=device) (test NVMe effects-log)         [passed]
    runtime  0.647s  ...  0.647s
nvme/025 (tr=loop bd=file) (test NVMe effects-log)           [passed]
    runtime  0.638s  ...  0.636s
nvme/026 (tr=loop bd=device) (test NVMe ns-descs)            [passed]
    runtime  0.628s  ...  0.647s
nvme/026 (tr=loop bd=file) (test NVMe ns-descs)              [passed]
    runtime  0.623s  ...  0.637s
nvme/027 (tr=loop bd=device) (test NVMe ns-rescan command)   [passed]
    runtime  0.657s  ...  0.658s
nvme/027 (tr=loop bd=file) (test NVMe ns-rescan command)     [passed]
    runtime  0.650s  ...  0.641s
nvme/028 (tr=loop bd=device) (test NVMe list-subsys)         [passed]
    runtime  0.618s  ...  0.642s
nvme/028 (tr=loop bd=file) (test NVMe list-subsys)           [passed]
    runtime  0.631s  ...  0.627s
nvme/029 (tr=loop) (test userspace IO via nvme-cli read/write interface) [passed]
    runtime  0.954s  ...  0.952s
nvme/030 (tr=loop) (ensure the discovery generation counter is updated appropriately) [passed]
    runtime  0.432s  ...  0.437s
nvme/031 (tr=loop) (test deletion of NVMeOF controllers immediately after setup) [passed]
    runtime  5.813s  ...  5.999s
nvme/038 (tr=loop) (test deletion of NVMeOF subsystem without enabling) [passed]
    runtime  0.035s  ...  0.034s
nvme/040 (tr=loop) (test nvme fabrics controller reset/disconnect operation during I/O) [passed]
    runtime  7.029s  ...  7.022s
nvme/041 (tr=loop) (Create authenticated connections)        [passed]
    runtime  0.654s  ...  0.684s
nvme/042 (tr=loop) (Test dhchap key types for authenticated connections) [passed]
    runtime  3.866s  ...  3.856s
nvme/043 (tr=loop) (Test hash and DH group variations for authenticated connections) [passed]
    runtime  4.834s  ...  4.886s
nvme/044 (tr=loop) (Test bi-directional authentication)      [passed]
    runtime  1.237s  ...  1.328s
nvme/045 (tr=loop) (Test re-authentication)                  [passed]
    runtime  1.570s  ...  1.609s
nvme/047 (tr=loop) (test different queue types for fabric transports) [not run]
    nvme_trtype=loop is not supported in this test
nvme/048 (tr=loop) (Test queue count changes on reconnect)   [not run]
    nvme_trtype=loop is not supported in this test
nvme/051 (tr=loop) (test nvmet concurrent ns enable/disable) [passed]
    runtime  1.362s  ...  1.407s
nvme/052 (tr=loop) (Test file-ns creation/deletion under one subsystem) [passed]
    runtime  6.258s  ...  6.282s
nvme/054 (tr=loop) (Test the NVMe reservation feature)       [passed]
    runtime  0.754s  ...  0.769s
nvme/055 (tr=loop) (Test nvme write to a loop target ns just after ns is disabled) [not run]
    kernel option DEBUG_ATOMIC_SLEEP has not been enabled
nvme/056 (tr=loop) (enable zero copy offload and run rw traffic) [not run]
    Remote target required but NVME_TARGET_CONTROL is not set
    nvme_trtype=loop is not supported in this test
    kernel option ULP_DDP has not been enabled
    module nvme_tcp does not have parameter ddp_offload
    KERNELSRC not set
    Kernel sources do not have tools/net/ynl/cli.py
    NVME_IFACE not set
nvme/057 (tr=loop) (test nvme fabrics controller ANA failover during I/O) [passed]
    runtime  27.227s  ...  27.146s
nvme/058 (tr=loop) (test rapid namespace remapping)          [passed]
    runtime  4.646s  ...  4.746s
nvme/060 (tr=loop) (test nvme fabrics target reset)          [not run]
    nvme_trtype=loop is not supported in this test
nvme/061 (tr=loop) (test fabric target teardown and setup during I/O) [not run]
    nvme_trtype=loop is not supported in this test
nvme/062 (tr=loop) (Create TLS-encrypted connections)        [not run]
    nvme_trtype=loop is not supported in this test
nvme/063 (tr=loop) (Create authenticated TCP connections with secure concatenation) [not run]
    nvme_trtype=loop is not supported in this test
nvme/065 (tr=loop) (test unmap write zeroes sysfs interface with nvmet devices) [passed]
    runtime  2.310s  ...  2.270s
++ for t in loop tcp
++ echo '################NVMET_TRTYPES=tcp############'
################NVMET_TRTYPES=tcp############
++ NVMET_TRTYPES=tcp
++ ./check nvme
nvme/002 (tr=tcp) (create many subsystems and test discovery) [not run]
    nvme_trtype=tcp is not supported in this test
nvme/003 (tr=tcp) (test if we're sending keep-alives to a discovery controller) [passed]
    runtime  10.279s  ...  10.271s
nvme/004 (tr=tcp) (test nvme and nvmet UUID NS descriptors)  [passed]
    runtime  0.350s  ...  0.404s
nvme/005 (tr=tcp) (reset local loopback target)              [passed]
    runtime  0.434s  ...  0.466s
nvme/006 (tr=tcp bd=device) (create an NVMeOF target)        [passed]
    runtime  0.097s  ...  0.098s
nvme/006 (tr=tcp bd=file) (create an NVMeOF target)          [passed]
    runtime  0.089s  ...  0.094s
nvme/008 (tr=tcp bd=device) (create an NVMeOF host)          [passed]
    runtime  0.348s  ...  0.361s
nvme/008 (tr=tcp bd=file) (create an NVMeOF host)            [passed]
    runtime  0.370s  ...  0.395s
nvme/010 (tr=tcp bd=device) (run data verification fio job)  [passed]
    runtime  76.430s  ...  78.689s
nvme/010 (tr=tcp bd=file) (run data verification fio job)    [passed]
    runtime  120.407s  ...  116.963s
nvme/012 (tr=tcp bd=device) (run mkfs and data verification fio) [passed]
    runtime  85.918s  ...  84.974s
nvme/012 (tr=tcp bd=file) (run mkfs and data verification fio) [passed]
    runtime  120.326s  ...  120.123s
nvme/014 (tr=tcp bd=device) (flush a command from host)      [passed]
    runtime  9.603s  ...  9.522s
nvme/014 (tr=tcp bd=file) (flush a command from host)        [passed]
    runtime  9.319s  ...  9.297s
nvme/016 (tr=tcp) (create/delete many NVMeOF block device-backed ns and test discovery) [not run]
    nvme_trtype=tcp is not supported in this test
nvme/017 (tr=tcp) (create/delete many file-ns and test discovery) [not run]
    nvme_trtype=tcp is not supported in this test
nvme/018 (tr=tcp) (unit test NVMe-oF out of range access on a file backend) [passed]
    runtime  0.334s  ...  0.384s
nvme/019 (tr=tcp bd=device) (test NVMe DSM Discard command)  [passed]
    runtime  0.342s  ...  0.377s
nvme/019 (tr=tcp bd=file) (test NVMe DSM Discard command)    [passed]
    runtime  0.334s  ...  0.368s
nvme/021 (tr=tcp bd=device) (test NVMe list command)         [passed]
    runtime  0.357s  ...  0.400s
nvme/021 (tr=tcp bd=file) (test NVMe list command)           [passed]
    runtime  0.356s  ...  0.397s
nvme/022 (tr=tcp bd=device) (test NVMe reset command)        [passed]
    runtime  0.469s  ...  0.504s
nvme/022 (tr=tcp bd=file) (test NVMe reset command)          [passed]
    runtime  0.441s  ...  0.512s
nvme/023 (tr=tcp bd=device) (test NVMe smart-log command)    [passed]
    runtime  0.348s  ...  0.387s
nvme/023 (tr=tcp bd=file) (test NVMe smart-log command)      [passed]
    runtime  0.330s  ...  0.372s
nvme/025 (tr=tcp bd=device) (test NVMe effects-log)          [passed]
    runtime  0.351s  ...  0.384s
nvme/025 (tr=tcp bd=file) (test NVMe effects-log)            [passed]
    runtime  0.362s  ...  0.377s
nvme/026 (tr=tcp bd=device) (test NVMe ns-descs)             [passed]
    runtime  0.343s  ...  0.359s
nvme/026 (tr=tcp bd=file) (test NVMe ns-descs)               [passed]
    runtime  0.317s  ...  0.361s
nvme/027 (tr=tcp bd=device) (test NVMe ns-rescan command)    [passed]
    runtime  0.375s  ...  0.412s
nvme/027 (tr=tcp bd=file) (test NVMe ns-rescan command)      [passed]
    runtime  0.388s  ...  0.392s
nvme/028 (tr=tcp bd=device) (test NVMe list-subsys)          [passed]
    runtime  0.355s  ...  0.362s
nvme/028 (tr=tcp bd=file) (test NVMe list-subsys)            [passed]
    runtime  0.342s  ...  0.371s
nvme/029 (tr=tcp) (test userspace IO via nvme-cli read/write interface) [passed]
    runtime  0.698s  ...  0.742s
nvme/030 (tr=tcp) (ensure the discovery generation counter is updated appropriately) [passed]
    runtime  0.402s  ...  0.416s
nvme/031 (tr=tcp) (test deletion of NVMeOF controllers immediately after setup) [passed]
    runtime  3.068s  ...  3.161s
nvme/038 (tr=tcp) (test deletion of NVMeOF subsystem without enabling) [passed]
    runtime  0.041s  ...  0.042s
nvme/040 (tr=tcp) (test nvme fabrics controller reset/disconnect operation during I/O) [passed]
    runtime  6.442s  ...  6.478s
nvme/041 (tr=tcp) (Create authenticated connections)         [passed]
    runtime  0.408s  ...  0.407s
nvme/042 (tr=tcp) (Test dhchap key types for authenticated connections) [passed]
    runtime  1.803s  ...  1.879s
nvme/043 (tr=tcp) (Test hash and DH group variations for authenticated connections) [passed]
    runtime  2.484s  ...  2.515s
nvme/044 (tr=tcp) (Test bi-directional authentication)       [passed]
    runtime  0.732s  ...  0.732s
nvme/045 (tr=tcp) (Test re-authentication)                   [passed]
    runtime  1.322s  ...  1.310s
nvme/047 (tr=tcp) (test different queue types for fabric transports) [passed]
    runtime  1.863s  ...  1.879s
nvme/048 (tr=tcp) (Test queue count changes on reconnect)    [passed]
    runtime  6.525s  ...  5.523s
nvme/051 (tr=tcp) (test nvmet concurrent ns enable/disable)  [passed]
    runtime  1.336s  ...  1.361s
nvme/052 (tr=tcp) (Test file-ns creation/deletion under one subsystem) [not run]
    nvme_trtype=tcp is not supported in this test
nvme/054 (tr=tcp) (Test the NVMe reservation feature)        [passed]
    runtime  0.493s  ...  0.491s
nvme/055 (tr=tcp) (Test nvme write to a loop target ns just after ns is disabled) [not run]
    nvme_trtype=tcp is not supported in this test
    kernel option DEBUG_ATOMIC_SLEEP has not been enabled
nvme/056 (tr=tcp) (enable zero copy offload and run rw traffic) [not run]
    Remote target required but NVME_TARGET_CONTROL is not set
    kernel option ULP_DDP has not been enabled
    module nvme_tcp does not have parameter ddp_offload
    KERNELSRC not set
    Kernel sources do not have tools/net/ynl/cli.py
    NVME_IFACE not set
nvme/057 (tr=tcp) (test nvme fabrics controller ANA failover during I/O) [passed]
    runtime  25.906s  ...  25.963s
nvme/058 (tr=tcp) (test rapid namespace remapping)           [passed]
    runtime  3.145s  ...  3.126s
nvme/060 (tr=tcp) (test nvme fabrics target reset)           [passed]
    runtime  20.395s  ...  19.399s
nvme/061 (tr=tcp) (test fabric target teardown and setup during I/O) [passed]
    runtime  8.549s  ...  8.617s
nvme/062 (tr=tcp) (Create TLS-encrypted connections)         [failed]
    runtime  1.608s  ...  5.200s
    --- tests/nvme/062.out	2026-01-28 12:04:48.888356244 -0800
    +++ /mnt/sda/blktests/results/nodev_tr_tcp/nvme/062.out.bad	2026-05-13 11:33:33.891029529 -0700
    @@ -2,9 +2,13 @@
     Test unencrypted connection w/ tls not required
     disconnected 1 controller(s)
     Test encrypted connection w/ tls not required
    -disconnected 1 controller(s)
    +FAIL: nvme connect return error code
    +WARNING: connection is not encrypted
    +disconnected 0 controller(s)
    ...
    (Run 'diff -u tests/nvme/062.out /mnt/sda/blktests/results/nodev_tr_tcp/nvme/062.out.bad' to see the entire diff)
nvme/063 (tr=tcp) (Create authenticated TCP connections with secure concatenation) [passed]
    runtime  1.838s  ...  2.011s
nvme/065 (tr=tcp) (test unmap write zeroes sysfs interface with nvmet devices) [passed]
    runtime  1.749s  ...  1.758s
++ ./manage-rdma-nvme.sh --cleanup
====== RDMA NVMe Setup ======
RDMA Type: siw
Interface: auto-detect

[INFO] Checking prerequisites...
[INFO] Prerequisites check passed
[INFO] Loading RDMA module: siw
[INFO] Module siw loaded successfully
[INFO] Creating RDMA links...
[INFO] Creating RDMA link: ens5_siw
[INFO] Created RDMA link: ens5_siw -> ens5
++ ./manage-rdma-nvme.sh --status
====== RDMA Configuration Status ======

====== RDMA Network Configuration Status ======

Loaded Modules:
  siw                      217088
  nvmet                    258048

RDMA Links:
  link ens5_siw/1 state ACTIVE physical_state LINK_UP netdev ens5 

Network Interfaces (RDMA-capable):
  Interface: ens5
    IPv4: 192.168.0.46
    IPv6: fe80::5054:98ff:fe76:5440%ens5

blktests Configuration:
  Transport Address: 192.168.0.46:4420
  Transport Type: rdma
  Command: NVMET_TRTYPES=rdma ./check nvme/

NVMe RDMA Controllers:
  None

=================================================
++ echo '################NVMET_TRTYPES=rdma############'
################NVMET_TRTYPES=rdma############
++ NVMET_TRTYPES=rdma
++ ./check nvme
nvme/002 (tr=rdma) (create many subsystems and test discovery) [not run]
    nvme_trtype=rdma is not supported in this test
nvme/003 (tr=rdma) (test if we're sending keep-alives to a discovery controller) [passed]
    runtime  10.315s  ...  10.293s
nvme/004 (tr=rdma) (test nvme and nvmet UUID NS descriptors) [passed]
    runtime  0.695s  ...  0.750s
nvme/005 (tr=rdma) (reset local loopback target)             [passed]
    runtime  0.974s  ...  1.046s
nvme/006 (tr=rdma bd=device) (create an NVMeOF target)       [passed]
    runtime  0.136s  ...  0.149s
nvme/006 (tr=rdma bd=file) (create an NVMeOF target)         [passed]
    runtime  0.127s  ...  0.140s
nvme/008 (tr=rdma bd=device) (create an NVMeOF host)         [passed]
    runtime  0.684s  ...  0.685s
nvme/008 (tr=rdma bd=file) (create an NVMeOF host)           [passed]
    runtime  0.672s  ...  0.729s
nvme/010 (tr=rdma bd=device) (run data verification fio job) [passed]
    runtime  35.255s  ...  35.280s
nvme/010 (tr=rdma bd=file) (run data verification fio job)   [passed]
    runtime  68.570s  ...  66.260s
nvme/012 (tr=rdma bd=device) (run mkfs and data verification fio) [passed]
    runtime  47.482s  ...  42.242s
nvme/012 (tr=rdma bd=file) (run mkfs and data verification fio) [passed]
    runtime  61.407s  ...  62.511s
nvme/014 (tr=rdma bd=device) (flush a command from host)     [passed]
    runtime  9.855s  ...  9.321s
nvme/014 (tr=rdma bd=file) (flush a command from host)       [passed]
    runtime  9.919s  ...  9.175s
nvme/016 (tr=rdma) (create/delete many NVMeOF block device-backed ns and test discovery) [not run]
    nvme_trtype=rdma is not supported in this test
nvme/017 (tr=rdma) (create/delete many file-ns and test discovery) [not run]
    nvme_trtype=rdma is not supported in this test
nvme/018 (tr=rdma) (unit test NVMe-oF out of range access on a file backend) [passed]
    runtime  0.654s  ...  0.737s
nvme/019 (tr=rdma bd=device) (test NVMe DSM Discard command) [passed]
    runtime  0.664s  ...  0.738s
nvme/019 (tr=rdma bd=file) (test NVMe DSM Discard command)   [passed]
    runtime  0.649s  ...  0.730s
nvme/021 (tr=rdma bd=device) (test NVMe list command)        [passed]
    runtime  0.676s  ...  0.729s
nvme/021 (tr=rdma bd=file) (test NVMe list command)          [passed]
    runtime  0.673s  ...  0.720s
nvme/022 (tr=rdma bd=device) (test NVMe reset command)       [passed]
    runtime  1.021s  ...  1.066s
nvme/022 (tr=rdma bd=file) (test NVMe reset command)         [passed]
    runtime  1.015s  ...  1.055s
nvme/023 (tr=rdma bd=device) (test NVMe smart-log command)   [passed]
    runtime  0.633s  ...  0.718s
nvme/023 (tr=rdma bd=file) (test NVMe smart-log command)     [passed]
    runtime  0.656s  ...  0.673s
nvme/025 (tr=rdma bd=device) (test NVMe effects-log)         [passed]
    runtime  0.706s  ...  0.715s
nvme/025 (tr=rdma bd=file) (test NVMe effects-log)           [passed]
    runtime  0.688s  ...  0.733s
nvme/026 (tr=rdma bd=device) (test NVMe ns-descs)            [passed]
    runtime  0.686s  ...  0.690s
nvme/026 (tr=rdma bd=file) (test NVMe ns-descs)              [passed]
    runtime  0.669s  ...  0.684s
nvme/027 (tr=rdma bd=device) (test NVMe ns-rescan command)   [passed]
    runtime  0.703s  ...  0.731s
nvme/027 (tr=rdma bd=file) (test NVMe ns-rescan command)     [passed]
    runtime  0.707s  ...  0.772s
nvme/028 (tr=rdma bd=device) (test NVMe list-subsys)         [passed]
    runtime  0.680s  ...  0.712s
nvme/028 (tr=rdma bd=file) (test NVMe list-subsys)           [passed]
    runtime  0.670s  ...  0.709s
nvme/029 (tr=rdma) (test userspace IO via nvme-cli read/write interface) [passed]
    runtime  1.077s  ...  1.087s
nvme/030 (tr=rdma) (ensure the discovery generation counter is updated appropriately) [passed]
    runtime  0.511s  ...  0.560s
nvme/031 (tr=rdma) (test deletion of NVMeOF controllers immediately after setup) [passed]
    runtime  5.869s  ...  6.106s
nvme/038 (tr=rdma) (test deletion of NVMeOF subsystem without enabling) [passed]
    runtime  0.084s  ...  0.085s
nvme/040 (tr=rdma) (test nvme fabrics controller reset/disconnect operation during I/O) [passed]
    runtime  7.024s  ...  6.994s
nvme/041 (tr=rdma) (Create authenticated connections)        [passed]
    runtime  0.718s  ...  0.769s
nvme/042 (tr=rdma) (Test dhchap key types for authenticated connections) [passed]
    runtime  3.759s  ...  3.968s
nvme/043 (tr=rdma) (Test hash and DH group variations for authenticated connections) [passed]
    runtime  4.487s  ...  4.880s
nvme/044 (tr=rdma) (Test bi-directional authentication)      [passed]
    runtime  1.279s  ...  1.393s
nvme/045 (tr=rdma) (Test re-authentication)                  [passed]
    runtime  1.837s  ...  1.892s
nvme/047 (tr=rdma) (test different queue types for fabric transports) [passed]
    runtime  2.642s  ...  2.700s
nvme/048 (tr=rdma) (Test queue count changes on reconnect)   [passed]
    runtime  5.810s  ...  6.860s
nvme/051 (tr=rdma) (test nvmet concurrent ns enable/disable) [passed]
    runtime  1.479s  ...  1.408s
nvme/052 (tr=rdma) (Test file-ns creation/deletion under one subsystem) [not run]
    nvme_trtype=rdma is not supported in this test
nvme/054 (tr=rdma) (Test the NVMe reservation feature)       [passed]
    runtime  0.799s  ...  0.854s
nvme/055 (tr=rdma) (Test nvme write to a loop target ns just after ns is disabled) [not run]
    nvme_trtype=rdma is not supported in this test
    kernel option DEBUG_ATOMIC_SLEEP has not been enabled
nvme/056 (tr=rdma) (enable zero copy offload and run rw traffic) [not run]
    Remote target required but NVME_TARGET_CONTROL is not set
    nvme_trtype=rdma is not supported in this test
    kernel option ULP_DDP has not been enabled
    module nvme_tcp does not have parameter ddp_offload
    KERNELSRC not set
    Kernel sources do not have tools/net/ynl/cli.py
    NVME_IFACE not set
nvme/057 (tr=rdma) (test nvme fabrics controller ANA failover during I/O) [passed]
    runtime  26.984s  ...  27.074s
nvme/058 (tr=rdma) (test rapid namespace remapping)          [passed]
    runtime  4.444s  ...  4.553s
nvme/060 (tr=rdma) (test nvme fabrics target reset)          [passed]
    runtime  20.696s  ...  22.308s
nvme/061 (tr=rdma) (test fabric target teardown and setup during I/O) [passed]
    runtime  15.375s  ...  15.348s
nvme/062 (tr=rdma) (Create TLS-encrypted connections)        [not run]
    nvme_trtype=rdma is not supported in this test
nvme/063 (tr=rdma) (Create authenticated TCP connections with secure concatenation) [not run]
    nvme_trtype=rdma is not supported in this test
nvme/065 (tr=rdma) (test unmap write zeroes sysfs interface with nvmet devices) [passed]
    runtime  2.369s  ...  2.466s
++ ./manage-rdma-nvme.sh --cleanup
====== RDMA NVMe Cleanup ======

[INFO] Disconnecting NVMe RDMA controllers...
[INFO] No NVMe RDMA controllers to disconnect
[INFO] Removing RDMA links...
[INFO] No RDMA links to remove
[INFO] Unloading NVMe RDMA modules...
[INFO] Unloading module: nvme_rdma
[INFO] Module nvme_rdma unloaded
[INFO] Unloading module: nvmet_rdma
[INFO] Module nvmet_rdma unloaded
[INFO] Unloading module: nvmet


-- 
2.39.5


* [PATCH V4 1/3] block: clear BLK_FEAT_PCI_P2PDMA in blk_stack_limits() for non-supporting devices
  2026-05-13 18:51 [PATCH V4 0/3] md/nvme: Enable PCI P2PDMA support for RAID0 and NVMe Multipath Chaitanya Kulkarni
@ 2026-05-13 18:51 ` Chaitanya Kulkarni
  2026-05-13 18:51 ` [PATCH V4 2/3] md: propagate BLK_FEAT_PCI_P2PDMA from member devices to RAID device Chaitanya Kulkarni
  2026-05-13 18:51 ` [PATCH V4 3/3] nvme-multipath: enable PCI P2PDMA for multipath devices Chaitanya Kulkarni
  2 siblings, 0 replies; 4+ messages in thread
From: Chaitanya Kulkarni @ 2026-05-13 18:51 UTC (permalink / raw)
  To: song, yukuai, linan122, kbusch, axboe, hch, sagi
  Cc: linux-block, linux-raid, linux-nvme, kmodukuri,
	Chaitanya Kulkarni, Pranjal Shrivastava, Nitesh Shetty

BLK_FEAT_NOWAIT and BLK_FEAT_POLL are cleared in blk_stack_limits()
when an underlying device does not support them.  Apply the same
treatment to BLK_FEAT_PCI_P2PDMA: stacking drivers set it
unconditionally and rely on the core to clear it whenever a
non-supporting member device is stacked.

Tested-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
 block/blk-settings.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 78c83817b9d3..8274631290db 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -795,6 +795,8 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 		t->features &= ~BLK_FEAT_NOWAIT;
 	if (!(b->features & BLK_FEAT_POLL))
 		t->features &= ~BLK_FEAT_POLL;
+	if (!(b->features & BLK_FEAT_PCI_P2PDMA))
+		t->features &= ~BLK_FEAT_PCI_P2PDMA;
 
 	t->flags |= (b->flags & BLK_FLAG_MISALIGNED);
 
-- 
2.39.5


* [PATCH V4 2/3] md: propagate BLK_FEAT_PCI_P2PDMA from member devices to RAID device
  2026-05-13 18:51 [PATCH V4 0/3] md/nvme: Enable PCI P2PDMA support for RAID0 and NVMe Multipath Chaitanya Kulkarni
  2026-05-13 18:51 ` [PATCH V4 1/3] block: clear BLK_FEAT_PCI_P2PDMA in blk_stack_limits() for non-supporting devices Chaitanya Kulkarni
@ 2026-05-13 18:51 ` Chaitanya Kulkarni
  2026-05-13 18:51 ` [PATCH V4 3/3] nvme-multipath: enable PCI P2PDMA for multipath devices Chaitanya Kulkarni
  2 siblings, 0 replies; 4+ messages in thread
From: Chaitanya Kulkarni @ 2026-05-13 18:51 UTC (permalink / raw)
  To: song, yukuai, linan122, kbusch, axboe, hch, sagi
  Cc: linux-block, linux-raid, linux-nvme, kmodukuri,
	Pranjal Shrivastava, Xiao Ni, Chaitanya Kulkarni

From: Kiran Kumar Modukuri <kmodukuri@nvidia.com>

MD RAID does not propagate BLK_FEAT_PCI_P2PDMA from member devices to
the RAID device, preventing peer-to-peer DMA through the RAID layer even
when all underlying devices support it.

Enable BLK_FEAT_PCI_P2PDMA unconditionally in raid0, raid1 and raid10
personalities during queue limits setup.  blk_stack_limits() clears it
automatically if any member device lacks support, consistent with how
BLK_FEAT_NOWAIT and BLK_FEAT_POLL are handled in the block core.

Parity RAID personalities (raid4/5/6) are excluded because they require
CPU access to data pages for parity computation, which is incompatible
with P2P mappings.

Tested with RAID0/1/10 arrays containing multiple NVMe devices with
P2PDMA support, confirming that peer-to-peer transfers work correctly
through the RAID layer.

Tested-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Kiran Kumar Modukuri <kmodukuri@nvidia.com>
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
 drivers/md/raid0.c  | 1 +
 drivers/md/raid1.c  | 1 +
 drivers/md/raid10.c | 1 +
 3 files changed, 3 insertions(+)

diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 5e38a51e349a..2cdaf7495d92 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -392,6 +392,7 @@ static int raid0_set_limits(struct mddev *mddev)
 	lim.io_opt = lim.io_min * mddev->raid_disks;
 	lim.chunk_sectors = mddev->chunk_sectors;
 	lim.features |= BLK_FEAT_ATOMIC_WRITES;
+	lim.features |= BLK_FEAT_PCI_P2PDMA;
 	err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
 	if (err)
 		return err;
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 64d970e2ef50..cc628a1be52c 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -3208,6 +3208,7 @@ static int raid1_set_limits(struct mddev *mddev)
 	lim.max_hw_wzeroes_unmap_sectors = 0;
 	lim.logical_block_size = mddev->logical_block_size;
 	lim.features |= BLK_FEAT_ATOMIC_WRITES;
+	lim.features |= BLK_FEAT_PCI_P2PDMA;
 	err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
 	if (err)
 		return err;
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 39085e7dd6d2..f905dc391b74 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3941,6 +3941,7 @@ static int raid10_set_queue_limits(struct mddev *mddev)
 	lim.chunk_sectors = mddev->chunk_sectors;
 	lim.io_opt = lim.io_min * raid10_nr_stripes(conf);
 	lim.features |= BLK_FEAT_ATOMIC_WRITES;
+	lim.features |= BLK_FEAT_PCI_P2PDMA;
 	err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
 	if (err)
 		return err;
-- 
2.39.5


* [PATCH V4 3/3] nvme-multipath: enable PCI P2PDMA for multipath devices
  2026-05-13 18:51 [PATCH V4 0/3] md/nvme: Enable PCI P2PDMA support for RAID0 and NVMe Multipath Chaitanya Kulkarni
  2026-05-13 18:51 ` [PATCH V4 1/3] block: clear BLK_FEAT_PCI_P2PDMA in blk_stack_limits() for non-supporting devices Chaitanya Kulkarni
  2026-05-13 18:51 ` [PATCH V4 2/3] md: propagate BLK_FEAT_PCI_P2PDMA from member devices to RAID device Chaitanya Kulkarni
@ 2026-05-13 18:51 ` Chaitanya Kulkarni
  2 siblings, 0 replies; 4+ messages in thread
From: Chaitanya Kulkarni @ 2026-05-13 18:51 UTC (permalink / raw)
  To: song, yukuai, linan122, kbusch, axboe, hch, sagi
  Cc: linux-block, linux-raid, linux-nvme, kmodukuri,
	Pranjal Shrivastava, Nitesh Shetty, Chaitanya Kulkarni

From: Kiran Kumar Modukuri <kmodukuri@nvidia.com>

NVMe multipath does not expose BLK_FEAT_PCI_P2PDMA on the head disk
even when all underlying controllers support it.

Set BLK_FEAT_PCI_P2PDMA unconditionally in nvme_mpath_alloc_disk()
alongside the other features.  nvme_update_ns_info_block() already
calls queue_limits_stack_bdev() to stack each path's limits onto the
head disk, which routes through blk_stack_limits().  The core now
clears BLK_FEAT_PCI_P2PDMA automatically if any path (e.g., FC) does
not support it, consistent with how BLK_FEAT_NOWAIT and BLK_FEAT_POLL
are handled.

Tested-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Kiran Kumar Modukuri <kmodukuri@nvidia.com>
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
 drivers/nvme/host/multipath.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 263161cb8ac0..ff442bbf2937 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -730,7 +730,7 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head)
 	blk_set_stacking_limits(&lim);
 	lim.dma_alignment = 3;
 	lim.features |= BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT |
-		BLK_FEAT_POLL | BLK_FEAT_ATOMIC_WRITES;
+		BLK_FEAT_POLL | BLK_FEAT_ATOMIC_WRITES | BLK_FEAT_PCI_P2PDMA;
 	if (head->ids.csi == NVME_CSI_ZNS)
 		lim.features |= BLK_FEAT_ZONED;
 
-- 
2.39.5

