Linux NFS development
Subject: [PATCH v1 0/7] nfs: Modernize Direct I/O path
From: Pranjal Shrivastava @ 2026-06-03  5:30 UTC (permalink / raw)
  To: linux-nfs, linux-kernel
  Cc: Trond Myklebust, Anna Schumaker, Christoph Hellwig,
	Shivaji Kant, Pranjal Shrivastava

Modernize the NFS Direct I/O path as a preparatory step to enable PCI
Peer-to-Peer DMA (P2PDMA) support. Following feedback on the initial
RFC [1], the modernization and architectural changes are split into
this standalone series.

Currently, NFS O_DIRECT relies on the legacy iov_iter_get_pages_alloc2()
API, which does not support the pinning requirements for P2P memory.
This series moves NFS to the modern iov_iter_extract_pages() API and
migrates NFS direct I/O from pages to folios.

Design
======

1. Pin-Awareness
Standard NFS requests use get_page() and put_page() for memory
management. However, memory extracted via iov_iter_extract_pages()
requires explicit pinning.

Add a PG_PINNED flag and a wb_nr_pinned count to struct nfs_page.
These let the request lifecycle track ownership of physical pins and
ensure that unpinning is performed only when the I/O is complete.
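
The ownership rule above can be modeled in user space. The following is
a minimal sketch under stated assumptions, not the code from the
patches: struct model_page, struct model_req, MODEL_PG_PINNED and the
model_req_*() functions are hypothetical stand-ins for a page,
struct nfs_page, PG_PINNED/wb_nr_pinned and the request release path.
It only illustrates that a pinned request drops pins (never plain
references) and only after the I/O has completed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page: counts references and pins separately. */
struct model_page {
	int refs;	/* models get_page()/put_page() */
	int pins;	/* models pins taken during extraction */
};

#define MODEL_PG_PINNED	(1UL << 0)	/* assumed bit value, illustration only */

/* Hypothetical stand-in for struct nfs_page. */
struct model_req {
	struct model_page *pages;	/* first page backing the request */
	unsigned long flags;		/* MODEL_PG_PINNED if the request owns pins */
	unsigned int nr_pinned;		/* models wb_nr_pinned */
	bool io_done;			/* set once the I/O has completed */
};

/* Models extraction: pin nr pages and hand the pins to the request. */
void model_req_init_pinned(struct model_req *req, struct model_page *pages,
			   unsigned int nr)
{
	for (unsigned int i = 0; i < nr; i++)
		pages[i].pins++;
	req->pages = pages;
	req->flags = MODEL_PG_PINNED;
	req->nr_pinned = nr;
	req->io_done = false;
}

/* Models the classic path: one refcounted page, no pins. */
void model_req_init_ref(struct model_req *req, struct model_page *page)
{
	page->refs++;
	req->pages = page;
	req->flags = 0;
	req->nr_pinned = 0;
	req->io_done = false;
}

/* Release the memory held by a request; returns -1 if I/O is still pending. */
int model_req_release(struct model_req *req)
{
	if (!req->io_done)
		return -1;		/* pins must outlive the I/O */
	if (req->pages == NULL)
		return 0;		/* already released */

	if (req->flags & MODEL_PG_PINNED) {
		/* Pinned request: drop exactly the pins it recorded. */
		for (unsigned int i = 0; i < req->nr_pinned; i++) {
			assert(req->pages[i].pins > 0);
			req->pages[i].pins--;
		}
		req->nr_pinned = 0;
		req->flags &= ~MODEL_PG_PINNED;
	} else {
		/* Refcounted request: drop the single page reference. */
		assert(req->pages->refs > 0);
		req->pages->refs--;
	}
	req->pages = NULL;
	return 0;
}
```

In this model the flag selects the release primitive and the count
bounds how many pins are dropped, so a request never unpins memory it
did not pin and never releases anything before completion.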

2. API Migration
Migrate the Direct I/O path to the modern iov_iter_extract_pages()
API. This aligns NFS with the modern extraction model and serves as
the foundation for passing ITER_ALLOW_P2PDMA in a follow-up series.

3. Extraction Helper and Folio Support
Introduce a new extraction helper in direct.c to group contiguous
pages from the same folio into a single struct nfs_page. This
effectively migrates the Direct I/O path from being page-based to being
folio-based.

Note: zone_device_pages_have_same_pgmap() checks are intentionally
omitted in the extraction helper since P2PDMA enablement will be
introduced in a follow-up series.
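
The grouping step described in item 3 can be sketched as a user-space
model. This is an illustration under stated assumptions, not the helper
added to direct.c: struct ext_page, struct ext_seg and
model_group_pages() are hypothetical names, a page is reduced to a page
frame number plus an opaque folio identifier, and (as in the note
above) no pgmap comparison is modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of an extracted page. */
struct ext_page {
	unsigned long pfn;	/* page frame number */
	unsigned long folio_id;	/* opaque identifier of the owning folio */
};

/* One group of pages that a single request could cover. */
struct ext_seg {
	size_t first;		/* index of the first page in the group */
	size_t nr_pages;	/* number of pages in the group */
};

/*
 * Walk the extracted pages in order and merge each page into the current
 * group when it belongs to the same folio as the previous page and is
 * physically contiguous with it. Otherwise start a new group.
 *
 * segs must have room for nr entries (worst case: no merging).
 * Returns the number of groups written.
 */
size_t model_group_pages(const struct ext_page *pages, size_t nr,
			 struct ext_seg *segs)
{
	size_t nsegs = 0;

	for (size_t i = 0; i < nr; i++) {
		if (nsegs > 0) {
			const struct ext_page *prev = &pages[i - 1];

			if (pages[i].folio_id == prev->folio_id &&
			    pages[i].pfn == prev->pfn + 1) {
				segs[nsegs - 1].nr_pages++;
				continue;
			}
		}
		segs[nsegs].first = i;
		segs[nsegs].nr_pages = 1;
		nsegs++;
	}
	return nsegs;
}
```

With this rule, a run of pages from one large folio collapses into a
single group, while pages that are adjacent in the array but belong to
different folios, or are not physically contiguous, stay separate.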

Bisectability
=============
The series attempts to remain bisectable.

[Patches 1-2] Introduce pin-aware infrastructure and accounting.
[Patch 3] Adds a centralized request release helper.
[Patch 4] Migrates the Direct I/O path to iov_iter_extract_pages().
[Patches 5-6] Implement the extraction helper and folio-based grouping.
[Patch 7] Removes orphaned page-based helpers.

Testing
=======
The series was lightly tested using fio (bs=1M, size=1G) on a small
(non-server) machine running Linux 7.1-rc6. Some test logs from a run:

nfs-test: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=1
fio-3.42-37-g5b47
Starting 1 process

nfs-test: (groupid=0, jobs=1): err= 0: pid=33264: Tue Jun  2 23:50:15 2026
  read: IOPS=5145, BW=5146MiB/s (5396MB/s)(1024MiB/199msec)
    slat (usec): min=8, max=168, avg=11.12, stdev= 5.16
    clat (usec): min=153, max=628, avg=182.20, stdev=24.15
     lat (usec): min=165, max=796, avg=193.33, stdev=27.64
    clat percentiles (usec):
     |  1.00th=[  159],  5.00th=[  163], 10.00th=[  165], 20.00th=[  169],
     | 30.00th=[  172], 40.00th=[  176], 50.00th=[  178], 60.00th=[  182],
     | 70.00th=[  186], 80.00th=[  194], 90.00th=[  202], 95.00th=[  215],
     | 99.00th=[  229], 99.50th=[  334], 99.90th=[  408], 99.95th=[  627],
     | 99.99th=[  627]
  lat (usec)   : 250=99.32%, 500=0.59%, 750=0.10%
  cpu          : usr=1.01%, sys=5.56%, ctx=1025, majf=0, minf=265
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=1024,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0.00ns, window=0.00ns, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=5146MiB/s (5396MB/s), 5146MiB/s-5146MiB/s (5396MB/s-5396MB/s), io=1024MiB (1074MB), run=199-199msec

Pranjal Shrivastava (7):
  nfs: make nfs_page pin-aware
  nfs: Track number of pinned pages in nfs_page
  nfs: Introduce nfs_release_request_list helper
  nfs: migrate direct I/O to iov_iter_extract_pages
  nfs: introduce nfs_direct_extract_pages helper
  nfs: Optimize direct I/O to use folios for requests
  nfs: Cleanup the nfs_page_create_from_page helper

 fs/nfs/direct.c          | 160 ++++++++++++++++++++++-----------------
 fs/nfs/pagelist.c        |  86 +++++++++++----------
 fs/nfs/read.c            |   2 +-
 fs/nfs/write.c           |   2 +-
 include/linux/nfs_page.h |  12 ++-
 5 files changed, 144 insertions(+), 118 deletions(-)

base-commit: 2c9eb6f2c18bff4cf3ddeab96db5137cc2b2572b
-- 
2.54.0.1013.g208068f2d8-goog



Thread overview: 13+ messages
2026-06-03  5:30 [PATCH v1 0/7] nfs: Modernize Direct I/O path Pranjal Shrivastava
2026-06-03  5:30 ` [PATCH v1 1/7] nfs: make nfs_page pin-aware Pranjal Shrivastava
2026-06-03 16:39   ` Anna Schumaker
2026-06-04  7:51     ` Pranjal Shrivastava
2026-06-03  5:30 ` [PATCH v1 2/7] nfs: Track number of pinned pages in nfs_page Pranjal Shrivastava
2026-06-03  5:30 ` [PATCH v1 3/7] nfs: Introduce nfs_release_request_list helper Pranjal Shrivastava
2026-06-03  5:30 ` [PATCH v1 4/7] nfs: migrate direct I/O to iov_iter_extract_pages Pranjal Shrivastava
2026-06-03  5:30 ` [PATCH v1 5/7] nfs: introduce nfs_direct_extract_pages helper Pranjal Shrivastava
2026-06-03  5:30 ` [PATCH v1 6/7] nfs: Optimize direct I/O to use folios for requests Pranjal Shrivastava
2026-06-03 19:14   ` Anna Schumaker
2026-06-04  7:59     ` Pranjal Shrivastava
2026-06-03  5:30 ` [PATCH v1 7/7] nfs: Cleanup the nfs_page_create_from_page helper Pranjal Shrivastava
2026-06-03  5:33 ` [PATCH v1 0/7] nfs: Modernize Direct I/O path Pranjal Shrivastava
