* [PATCH v3 0/4] NFSD: Impl multiple extents in block/scsi layoutget
@ 2025-10-03 9:11 Sergey Bashirov
2025-10-03 9:11 ` [PATCH v3 1/4] NFSD/blocklayout: Fix minlength check in proc_layoutget Sergey Bashirov
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Sergey Bashirov @ 2025-10-03 9:11 UTC (permalink / raw)
To: Chuck Lever, Christoph Hellwig, Dai Ngo, Jeff Layton, NeilBrown,
Olga Kornievskaia, Tom Talpey
Cc: linux-nfs, linux-kernel, Sergey Bashirov
Implement support for multiple extents in the LAYOUTGET response
for two main reasons.
First, it avoids unnecessary RPC calls. For files consisting of many
extents, especially large ones, too many LAYOUTGET requests are observed
in Wireshark traces.
Second, because the server currently returns only a single extent,
the client can only reliably request layouts with the minimum length set
to 4K. Otherwise, NFS4ERR_LAYOUTUNAVAILABLE may be returned if XFS
allocated a 4K extent within the requested range.
We use the ability to request layouts with a minimum length
greater than 4K to fix or work around a bug in the client. I will
send the client patch for review as well.
Below is an example of multiple extents in the LAYOUTGET response
captured using Wireshark.
Network File System, Ops(3): SEQUENCE, PUTFH, LAYOUTGET
[Program Version: 4]
[V4 Procedure: COMPOUND (1)]
Tag: <EMPTY>
length: 0
contents: <EMPTY>
minorversion: 2
Operations (count: 3): SEQUENCE, PUTFH, LAYOUTGET
Opcode: SEQUENCE (53)
Opcode: PUTFH (22)
Opcode: LAYOUTGET (50)
layout available?: No
layout type: LAYOUT4_BLOCK_VOLUME (3)
IO mode: IOMODE_RW (2)
offset: 0
length: 10485760
min length: 16384
StateID
maxcount: 4096
[Main Opcode: LAYOUTGET (50)]
Network File System, Ops(3): SEQUENCE PUTFH LAYOUTGET
[Program Version: 4]
[V4 Procedure: COMPOUND (1)]
Status: NFS4_OK (0)
Tag: <EMPTY>
length: 0
contents: <EMPTY>
Operations (count: 3)
Opcode: SEQUENCE (53)
Opcode: PUTFH (22)
Opcode: LAYOUTGET (50)
Status: NFS4_OK (0)
return on close?: Yes
StateID
Layout Segment (count: 1)
offset: 0
length: 385024
IO mode: IOMODE_RW (2)
layout type: LAYOUT4_BLOCK_VOLUME (3)
layout: <DATA>
length: 4052
contents: <DATA>
[Main Opcode: LAYOUTGET (50)]
pNFS Block Layout Extents
bex_count: 92
BEX[0]
BEX[1]
BEX[2]
...
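The numbers in this trace are consistent with the wire sizes used later in the series. The following is only a sketch of that arithmetic, assuming 44 bytes per encoded extent (a 16-byte device ID, three 8-byte hypers, and a 4-byte extent state), a 32-byte fixed layout4 part, and a 4 KiB page. The EX_* names and helper functions are hypothetical and are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed wire sizes; the names are illustrative, not kernel macros. */
#define EX_LAYOUT4_SIZE 32u   /* offset4, length4, iomode, type, byte count, extent count */
#define EX_EXTENT_SIZE  44u   /* deviceid4 (16) + foff, len, soff (3 * 8) + state (4) */
#define EX_PAGE_SIZE    4096u /* assumed page size */

/* Largest number of extents that fits in maxcount, capped at one page. */
static uint32_t ex_max_extents(uint32_t maxcount)
{
	uint32_t cap = maxcount < EX_PAGE_SIZE ? maxcount : EX_PAGE_SIZE;

	/* Smaller values would be rejected with NFS4ERR_TOOSMALL. */
	assert(cap >= EX_LAYOUT4_SIZE + EX_EXTENT_SIZE);
	return (cap - EX_LAYOUT4_SIZE) / EX_EXTENT_SIZE;
}

/* Length of the opaque layout body: extent count word plus the extents. */
static uint32_t ex_body_len(uint32_t nr_extents)
{
	return 4u + nr_extents * EX_EXTENT_SIZE;
}
```

Under these assumptions, a maxcount of 4096 allows 92 extents and a 4052-byte body, which matches bex_count and the layout length shown in the reply above.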
Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
---
Changes in v3:
- Added a Fixes tag
- Removed an unnecessary sentence from the commit message
Sergey Bashirov (4):
NFSD/blocklayout: Fix minlength check in proc_layoutget
NFSD/blocklayout: Extract extent mapping from proc_layoutget
NFSD/blocklayout: Introduce layout content structure
NFSD/blocklayout: Support multiple extents per LAYOUTGET
fs/nfsd/blocklayout.c | 154 +++++++++++++++++++++++++++------------
fs/nfsd/blocklayoutxdr.c | 36 ++++++---
fs/nfsd/blocklayoutxdr.h | 14 ++++
3 files changed, 147 insertions(+), 57 deletions(-)
--
2.43.0
* [PATCH v3 1/4] NFSD/blocklayout: Fix minlength check in proc_layoutget
2025-10-03 9:11 [PATCH v3 0/4] NFSD: Impl multiple extents in block/scsi layoutget Sergey Bashirov
@ 2025-10-03 9:11 ` Sergey Bashirov
2025-10-03 9:11 ` [PATCH v3 2/4] NFSD/blocklayout: Extract extent mapping from proc_layoutget Sergey Bashirov
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Sergey Bashirov @ 2025-10-03 9:11 UTC (permalink / raw)
To: Chuck Lever, Christoph Hellwig, Dai Ngo, Jeff Layton, NeilBrown,
Olga Kornievskaia, Tom Talpey
Cc: linux-nfs, linux-kernel, Sergey Bashirov, Christoph Hellwig
The extent returned by the file system may have a smaller offset than
the segment offset requested by the client. In this case, the minimum
segment length must be checked against the requested range. Otherwise,
the client may not be able to continue the read/write operation.
Fixes: 8650b8a05850 ("nfsd: pNFS block layout driver")
Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
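To illustrate the difference between the old and the new check, here is a small userspace sketch. It is not the kernel code: the ex_* helpers are hypothetical, and it assumes the mapped extent starts at or below the requested offset and contains it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Bytes of the mapped extent that lie at or after the requested offset.
 * Assumes ext_off <= req_off < ext_off + ext_len.
 */
static uint64_t ex_usable_length(uint64_t ext_off, uint64_t ext_len,
				 uint64_t req_off)
{
	assert(ext_off <= req_off && req_off < ext_off + ext_len);
	return ext_off + ext_len - req_off;
}

/* Old check: compares the whole extent length against minlength. */
static bool ex_old_check_ok(uint64_t ext_len, uint64_t minlength)
{
	return ext_len >= minlength;
}

/* New check: compares only the part inside the requested range. */
static bool ex_new_check_ok(uint64_t ext_off, uint64_t ext_len,
			    uint64_t req_off, uint64_t minlength)
{
	return ex_usable_length(ext_off, ext_len, req_off) >= minlength;
}
```

For an extent covering bytes 0..16383 and a request at offset 12288 with a minimum length of 8192, the old check passes (16384 >= 8192) although only 4096 bytes of the extent are usable from the requested offset; the new check rejects it.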
fs/nfsd/blocklayout.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/nfsd/blocklayout.c b/fs/nfsd/blocklayout.c
index fde5539cf6a6..425648565ab2 100644
--- a/fs/nfsd/blocklayout.c
+++ b/fs/nfsd/blocklayout.c
@@ -23,6 +23,7 @@ nfsd4_block_proc_layoutget(struct svc_rqst *rqstp, struct inode *inode,
{
struct nfsd4_layout_seg *seg = &args->lg_seg;
struct super_block *sb = inode->i_sb;
+ u64 length;
u32 block_size = i_blocksize(inode);
struct pnfs_block_extent *bex;
struct iomap iomap;
@@ -56,7 +57,8 @@ nfsd4_block_proc_layoutget(struct svc_rqst *rqstp, struct inode *inode,
goto out_error;
}
- if (iomap.length < args->lg_minlength) {
+ length = iomap.offset + iomap.length - seg->offset;
+ if (length < args->lg_minlength) {
dprintk("pnfsd: extent smaller than minlength\n");
goto out_layoutunavailable;
}
--
2.43.0
* [PATCH v3 2/4] NFSD/blocklayout: Extract extent mapping from proc_layoutget
2025-10-03 9:11 [PATCH v3 0/4] NFSD: Impl multiple extents in block/scsi layoutget Sergey Bashirov
2025-10-03 9:11 ` [PATCH v3 1/4] NFSD/blocklayout: Fix minlength check in proc_layoutget Sergey Bashirov
@ 2025-10-03 9:11 ` Sergey Bashirov
2025-10-03 9:11 ` [PATCH v3 3/4] NFSD/blocklayout: Introduce layout content structure Sergey Bashirov
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Sergey Bashirov @ 2025-10-03 9:11 UTC (permalink / raw)
To: Chuck Lever, Christoph Hellwig, Dai Ngo, Jeff Layton, NeilBrown,
Olga Kornievskaia, Tom Talpey
Cc: linux-nfs, linux-kernel, Sergey Bashirov, Christoph Hellwig
No changes in functionality. Split the proc_layoutget function to
create a helper function that maps a single extent to the requested
range. This helper function is then used to implement support for
multiple extents per LAYOUTGET.
Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
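The shape of the split can be sketched in userspace as follows. This is only an illustration of the control flow after the refactoring: the helper returns a status directly, while the caller presets the status before each check and leaves through a single error label. All ex_* names, the status enum, and the fake mapping rule are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum ex_status { EX_OK = 0, EX_UNAVAILABLE };

struct ex_seg { uint64_t offset, length; };

/* Helper: returns a status directly and owns no cleanup. */
static enum ex_status ex_map_extent(uint64_t offset, uint64_t *foff,
				    uint64_t *len)
{
	if (offset >= (UINT64_C(1) << 20)) /* pretend nothing is mapped past 1 MiB */
		return EX_UNAVAILABLE;
	*foff = offset & ~UINT64_C(4095);  /* pretend every extent is one 4K block */
	*len = 4096;
	return EX_OK;
}

/* Caller: presets the status before each check, one exit label. */
static enum ex_status ex_layoutget(struct ex_seg *seg, uint64_t minlength)
{
	enum ex_status st;
	uint64_t foff, len;

	assert(seg);

	st = EX_UNAVAILABLE;
	if (seg->offset & 4095)            /* misaligned request */
		goto out_error;

	st = ex_map_extent(seg->offset, &foff, &len);
	if (st != EX_OK)
		goto out_error;

	st = EX_UNAVAILABLE;
	if (foff + len - seg->offset < minlength)
		goto out_error;

	seg->offset = foff;
	seg->length = len;
	return EX_OK;

out_error:
	seg->length = 0;
	return st;
}
```

With a single label, every failure path zeroes the segment length in one place, and the returned status is whatever was set most recently.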
fs/nfsd/blocklayout.c | 115 ++++++++++++++++++++++++------------------
1 file changed, 66 insertions(+), 49 deletions(-)
diff --git a/fs/nfsd/blocklayout.c b/fs/nfsd/blocklayout.c
index 425648565ab2..35a95501db63 100644
--- a/fs/nfsd/blocklayout.c
+++ b/fs/nfsd/blocklayout.c
@@ -17,68 +17,44 @@
#define NFSDDBG_FACILITY NFSDDBG_PNFS
+/*
+ * Get an extent from the file system that starts at offset or below
+ * and may be shorter than the requested length.
+ */
static __be32
-nfsd4_block_proc_layoutget(struct svc_rqst *rqstp, struct inode *inode,
- const struct svc_fh *fhp, struct nfsd4_layoutget *args)
+nfsd4_block_map_extent(struct inode *inode, const struct svc_fh *fhp,
+ u64 offset, u64 length, u32 iomode, u64 minlength,
+ struct pnfs_block_extent *bex)
{
- struct nfsd4_layout_seg *seg = &args->lg_seg;
struct super_block *sb = inode->i_sb;
- u64 length;
- u32 block_size = i_blocksize(inode);
- struct pnfs_block_extent *bex;
struct iomap iomap;
u32 device_generation = 0;
int error;
- if (locks_in_grace(SVC_NET(rqstp)))
- return nfserr_grace;
-
- if (seg->offset & (block_size - 1)) {
- dprintk("pnfsd: I/O misaligned\n");
- goto out_layoutunavailable;
- }
-
- /*
- * Some clients barf on non-zero block numbers for NONE or INVALID
- * layouts, so make sure to zero the whole structure.
- */
- error = -ENOMEM;
- bex = kzalloc(sizeof(*bex), GFP_KERNEL);
- if (!bex)
- goto out_error;
- args->lg_content = bex;
-
- error = sb->s_export_op->map_blocks(inode, seg->offset, seg->length,
- &iomap, seg->iomode != IOMODE_READ,
- &device_generation);
+ error = sb->s_export_op->map_blocks(inode, offset, length, &iomap,
+ iomode != IOMODE_READ, &device_generation);
if (error) {
if (error == -ENXIO)
- goto out_layoutunavailable;
- goto out_error;
- }
-
- length = iomap.offset + iomap.length - seg->offset;
- if (length < args->lg_minlength) {
- dprintk("pnfsd: extent smaller than minlength\n");
- goto out_layoutunavailable;
+ return nfserr_layoutunavailable;
+ return nfserrno(error);
}
switch (iomap.type) {
case IOMAP_MAPPED:
- if (seg->iomode == IOMODE_READ)
+ if (iomode == IOMODE_READ)
bex->es = PNFS_BLOCK_READ_DATA;
else
bex->es = PNFS_BLOCK_READWRITE_DATA;
bex->soff = iomap.addr;
break;
case IOMAP_UNWRITTEN:
- if (seg->iomode & IOMODE_RW) {
+ if (iomode & IOMODE_RW) {
/*
* Crack monkey special case from section 2.3.1.
*/
- if (args->lg_minlength == 0) {
+ if (minlength == 0) {
dprintk("pnfsd: no soup for you!\n");
- goto out_layoutunavailable;
+ return nfserr_layoutunavailable;
}
bex->es = PNFS_BLOCK_INVALID_DATA;
@@ -87,7 +63,7 @@ nfsd4_block_proc_layoutget(struct svc_rqst *rqstp, struct inode *inode,
}
fallthrough;
case IOMAP_HOLE:
- if (seg->iomode == IOMODE_READ) {
+ if (iomode == IOMODE_READ) {
bex->es = PNFS_BLOCK_NONE_DATA;
break;
}
@@ -95,27 +71,68 @@ nfsd4_block_proc_layoutget(struct svc_rqst *rqstp, struct inode *inode,
case IOMAP_DELALLOC:
default:
WARN(1, "pnfsd: filesystem returned %d extent\n", iomap.type);
- goto out_layoutunavailable;
+ return nfserr_layoutunavailable;
}
error = nfsd4_set_deviceid(&bex->vol_id, fhp, device_generation);
if (error)
- goto out_error;
+ return nfserrno(error);
+
bex->foff = iomap.offset;
bex->len = iomap.length;
+ return nfs_ok;
+}
+
+static __be32
+nfsd4_block_proc_layoutget(struct svc_rqst *rqstp, struct inode *inode,
+ const struct svc_fh *fhp, struct nfsd4_layoutget *args)
+{
+ struct nfsd4_layout_seg *seg = &args->lg_seg;
+ struct pnfs_block_extent *bex;
+ u64 length;
+ u32 block_size = i_blocksize(inode);
+ __be32 nfserr;
+
+ if (locks_in_grace(SVC_NET(rqstp)))
+ return nfserr_grace;
- seg->offset = iomap.offset;
- seg->length = iomap.length;
+ nfserr = nfserr_layoutunavailable;
+ if (seg->offset & (block_size - 1)) {
+ dprintk("pnfsd: I/O misaligned\n");
+ goto out_error;
+ }
+
+ /*
+ * Some clients barf on non-zero block numbers for NONE or INVALID
+ * layouts, so make sure to zero the whole structure.
+ */
+ nfserr = nfserrno(-ENOMEM);
+ bex = kzalloc(sizeof(*bex), GFP_KERNEL);
+ if (!bex)
+ goto out_error;
+ args->lg_content = bex;
+
+ nfserr = nfsd4_block_map_extent(inode, fhp, seg->offset, seg->length,
+ seg->iomode, args->lg_minlength, bex);
+ if (nfserr != nfs_ok)
+ goto out_error;
+
+ nfserr = nfserr_layoutunavailable;
+ length = bex->foff + bex->len - seg->offset;
+ if (length < args->lg_minlength) {
+ dprintk("pnfsd: extent smaller than minlength\n");
+ goto out_error;
+ }
+
+ seg->offset = bex->foff;
+ seg->length = bex->len;
dprintk("GET: 0x%llx:0x%llx %d\n", bex->foff, bex->len, bex->es);
- return 0;
+ return nfs_ok;
out_error:
seg->length = 0;
- return nfserrno(error);
-out_layoutunavailable:
- seg->length = 0;
- return nfserr_layoutunavailable;
+ return nfserr;
}
static __be32
--
2.43.0
* [PATCH v3 3/4] NFSD/blocklayout: Introduce layout content structure
2025-10-03 9:11 [PATCH v3 0/4] NFSD: Impl multiple extents in block/scsi layoutget Sergey Bashirov
2025-10-03 9:11 ` [PATCH v3 1/4] NFSD/blocklayout: Fix minlength check in proc_layoutget Sergey Bashirov
2025-10-03 9:11 ` [PATCH v3 2/4] NFSD/blocklayout: Extract extent mapping from proc_layoutget Sergey Bashirov
@ 2025-10-03 9:11 ` Sergey Bashirov
2025-10-03 9:11 ` [PATCH v3 4/4] NFSD/blocklayout: Support multiple extents per LAYOUTGET Sergey Bashirov
2025-10-04 17:20 ` [PATCH v3 0/4] NFSD: Impl multiple extents in block/scsi layoutget Chuck Lever
4 siblings, 0 replies; 6+ messages in thread
From: Sergey Bashirov @ 2025-10-03 9:11 UTC (permalink / raw)
To: Chuck Lever, Christoph Hellwig, Dai Ngo, Jeff Layton, NeilBrown,
Olga Kornievskaia, Tom Talpey
Cc: linux-nfs, linux-kernel, Sergey Bashirov, Christoph Hellwig
Add a layout content structure instead of a single extent. The ability
to store and encode an array of extents is then used to implement support
for multiple extents per LAYOUTGET.
Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
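For readers unfamiliar with the pattern, here is a userspace sketch of a counted extent array with a flexible array member and a big-endian encoder for it. It mirrors the idea of the patch but is not the kernel code: calloc() stands in for kzalloc() with struct_size() (which additionally checks for overflow), the ex_* names are hypothetical, and the 44-byte extent size is an assumption derived from the fields being encoded.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define EX_DEVICEID_SIZE 16u
#define EX_EXTENT_SIZE   (EX_DEVICEID_SIZE + 3u * 8u + 4u) /* 44 bytes, assumed */

struct ex_extent {
	uint8_t  vol_id[EX_DEVICEID_SIZE];
	uint64_t foff, len, soff;
	uint32_t es;
};

struct ex_layout {
	uint32_t nr_extents;
	struct ex_extent extents[]; /* flexible array member */
};

/* Allocate a zeroed layout with room for nr extents. */
static struct ex_layout *ex_layout_alloc(uint32_t nr)
{
	struct ex_layout *bl;

	bl = calloc(1, sizeof(*bl) + (size_t)nr * sizeof(bl->extents[0]));
	if (bl)
		bl->nr_extents = nr;
	return bl;
}

static uint8_t *ex_put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
	return p + 4;
}

static uint8_t *ex_put_be64(uint8_t *p, uint64_t v)
{
	p = ex_put_be32(p, (uint32_t)(v >> 32));
	return ex_put_be32(p, (uint32_t)v);
}

/* Encode length, count, and extents; returns bytes written, 0 if too small. */
static size_t ex_encode_layout(const struct ex_layout *bl, uint8_t *buf,
			       size_t buflen)
{
	uint32_t i, len;
	uint8_t *p = buf;

	assert(bl && buf);
	len = 4u + bl->nr_extents * EX_EXTENT_SIZE;
	if (buflen < 4u + (size_t)len)
		return 0;

	p = ex_put_be32(p, len);
	p = ex_put_be32(p, bl->nr_extents);
	for (i = 0; i < bl->nr_extents; i++) {
		const struct ex_extent *bex = &bl->extents[i];

		memcpy(p, bex->vol_id, EX_DEVICEID_SIZE);
		p += EX_DEVICEID_SIZE;
		p = ex_put_be64(p, bex->foff);
		p = ex_put_be64(p, bex->len);
		p = ex_put_be64(p, bex->soff);
		p = ex_put_be32(p, bex->es);
	}
	return (size_t)(p - buf);
}
```

Two extents encode to 4 + 4 + 2 * 44 = 96 bytes under these assumptions; a buffer smaller than that makes the sketch return 0, which corresponds to the nfserr_toosmall path in the patch.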
fs/nfsd/blocklayout.c | 26 ++++++++++++++++++++++----
fs/nfsd/blocklayoutxdr.c | 36 +++++++++++++++++++++++++++---------
fs/nfsd/blocklayoutxdr.h | 14 ++++++++++++++
3 files changed, 63 insertions(+), 13 deletions(-)
diff --git a/fs/nfsd/blocklayout.c b/fs/nfsd/blocklayout.c
index 35a95501db63..6d29ea5e8623 100644
--- a/fs/nfsd/blocklayout.c
+++ b/fs/nfsd/blocklayout.c
@@ -88,9 +88,10 @@ nfsd4_block_proc_layoutget(struct svc_rqst *rqstp, struct inode *inode,
const struct svc_fh *fhp, struct nfsd4_layoutget *args)
{
struct nfsd4_layout_seg *seg = &args->lg_seg;
+ struct pnfs_block_layout *bl;
struct pnfs_block_extent *bex;
u64 length;
- u32 block_size = i_blocksize(inode);
+ u32 nr_extents_max = 1, block_size = i_blocksize(inode);
__be32 nfserr;
if (locks_in_grace(SVC_NET(rqstp)))
@@ -102,16 +103,33 @@ nfsd4_block_proc_layoutget(struct svc_rqst *rqstp, struct inode *inode,
goto out_error;
}
+ /*
+ * RFC 8881, section 3.3.17:
+ * The layout4 data type defines a layout for a file.
+ *
+ * RFC 8881, section 18.43.3:
+ * The loga_maxcount field specifies the maximum layout size
+ * (in bytes) that the client can handle. If the size of the
+ * layout structure exceeds the size specified by maxcount,
+ * the metadata server will return the NFS4ERR_TOOSMALL error.
+ */
+ nfserr = nfserr_toosmall;
+ if (args->lg_maxcount < PNFS_BLOCK_LAYOUT4_SIZE +
+ PNFS_BLOCK_EXTENT_SIZE)
+ goto out_error;
+
/*
* Some clients barf on non-zero block numbers for NONE or INVALID
* layouts, so make sure to zero the whole structure.
*/
nfserr = nfserrno(-ENOMEM);
- bex = kzalloc(sizeof(*bex), GFP_KERNEL);
- if (!bex)
+ bl = kzalloc(struct_size(bl, extents, nr_extents_max), GFP_KERNEL);
+ if (!bl)
goto out_error;
- args->lg_content = bex;
+ bl->nr_extents = nr_extents_max;
+ args->lg_content = bl;
+ bex = &bl->extents[0];
nfserr = nfsd4_block_map_extent(inode, fhp, seg->offset, seg->length,
seg->iomode, args->lg_minlength, bex);
if (nfserr != nfs_ok)
diff --git a/fs/nfsd/blocklayoutxdr.c b/fs/nfsd/blocklayoutxdr.c
index e50afe340737..196ef4245604 100644
--- a/fs/nfsd/blocklayoutxdr.c
+++ b/fs/nfsd/blocklayoutxdr.c
@@ -14,12 +14,25 @@
#define NFSDDBG_FACILITY NFSDDBG_PNFS
+/**
+ * nfsd4_block_encode_layoutget - encode block/scsi layout extent array
+ * @xdr: stream for data encoding
+ * @lgp: layoutget content, actually an array of extents to encode
+ *
+ * Encode the opaque loc_body field in the layoutget response. Since the
+ * pnfs_block_layout4 and pnfs_scsi_layout4 structures on the wire are
+ * the same, this function is used by both layout drivers.
+ *
+ * Return values:
+ * %nfs_ok: Success, all extents encoded into @xdr
+ * %nfserr_toosmall: Not enough space in @xdr to encode all the data
+ */
__be32
nfsd4_block_encode_layoutget(struct xdr_stream *xdr,
const struct nfsd4_layoutget *lgp)
{
- const struct pnfs_block_extent *b = lgp->lg_content;
- int len = sizeof(__be32) + 5 * sizeof(__be64) + sizeof(__be32);
+ const struct pnfs_block_layout *bl = lgp->lg_content;
+ u32 i, len = sizeof(__be32) + bl->nr_extents * PNFS_BLOCK_EXTENT_SIZE;
__be32 *p;
p = xdr_reserve_space(xdr, sizeof(__be32) + len);
@@ -27,14 +40,19 @@ nfsd4_block_encode_layoutget(struct xdr_stream *xdr,
return nfserr_toosmall;
*p++ = cpu_to_be32(len);
- *p++ = cpu_to_be32(1); /* we always return a single extent */
+ *p++ = cpu_to_be32(bl->nr_extents);
- p = svcxdr_encode_deviceid4(p, &b->vol_id);
- p = xdr_encode_hyper(p, b->foff);
- p = xdr_encode_hyper(p, b->len);
- p = xdr_encode_hyper(p, b->soff);
- *p++ = cpu_to_be32(b->es);
- return 0;
+ for (i = 0; i < bl->nr_extents; i++) {
+ const struct pnfs_block_extent *bex = bl->extents + i;
+
+ p = svcxdr_encode_deviceid4(p, &bex->vol_id);
+ p = xdr_encode_hyper(p, bex->foff);
+ p = xdr_encode_hyper(p, bex->len);
+ p = xdr_encode_hyper(p, bex->soff);
+ *p++ = cpu_to_be32(bex->es);
+ }
+
+ return nfs_ok;
}
static int
diff --git a/fs/nfsd/blocklayoutxdr.h b/fs/nfsd/blocklayoutxdr.h
index 7d25ef689671..2e0c6c7d2b42 100644
--- a/fs/nfsd/blocklayoutxdr.h
+++ b/fs/nfsd/blocklayoutxdr.h
@@ -8,6 +8,15 @@
struct iomap;
struct xdr_stream;
+/* On the wire size of the layout4 struct with zero number of extents */
+#define PNFS_BLOCK_LAYOUT4_SIZE \
+ (sizeof(__be32) * 2 + /* offset4 */ \
+ sizeof(__be32) * 2 + /* length4 */ \
+ sizeof(__be32) + /* layoutiomode4 */ \
+ sizeof(__be32) + /* layouttype4 */ \
+ sizeof(__be32) + /* number of bytes */ \
+ sizeof(__be32)) /* number of extents */
+
struct pnfs_block_extent {
struct nfsd4_deviceid vol_id;
u64 foff;
@@ -21,6 +30,11 @@ struct pnfs_block_range {
u64 len;
};
+struct pnfs_block_layout {
+ u32 nr_extents;
+ struct pnfs_block_extent extents[] __counted_by(nr_extents);
+};
+
/*
* Random upper cap for the uuid length to avoid unbounded allocation.
* Not actually limited by the protocol.
--
2.43.0
* [PATCH v3 4/4] NFSD/blocklayout: Support multiple extents per LAYOUTGET
2025-10-03 9:11 [PATCH v3 0/4] NFSD: Impl multiple extents in block/scsi layoutget Sergey Bashirov
` (2 preceding siblings ...)
2025-10-03 9:11 ` [PATCH v3 3/4] NFSD/blocklayout: Introduce layout content structure Sergey Bashirov
@ 2025-10-03 9:11 ` Sergey Bashirov
2025-10-04 17:20 ` [PATCH v3 0/4] NFSD: Impl multiple extents in block/scsi layoutget Chuck Lever
4 siblings, 0 replies; 6+ messages in thread
From: Sergey Bashirov @ 2025-10-03 9:11 UTC (permalink / raw)
To: Chuck Lever, Christoph Hellwig, Dai Ngo, Jeff Layton, NeilBrown,
Olga Kornievskaia, Tom Talpey
Cc: linux-nfs, linux-kernel, Sergey Bashirov, Christoph Hellwig
Allow the pNFS server to respond with multiple extents to a LAYOUTGET
request, thereby avoiding unnecessary load on the server and improving
performance for the client. The number of LAYOUTGET requests is
significantly reduced for various file access patterns, including
random and parallel writes.
Additionally, this change allows the client to request layouts with a
loga_minlength value greater than the minimum possible length of a single
extent in XFS. We use this functionality to fix a livelock in the client.
Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
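The collection loop can be modelled in userspace roughly as follows. This is a sketch under simplifying assumptions, not the kernel code: ex_map() is a hypothetical stand-in for the file system's map_blocks() callback that looks up a static extent table, and error handling is reduced to returning 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ex_extent { uint64_t foff, len; };

/* Hypothetical stand-in for map_blocks(): find the extent containing offset. */
static int ex_map(const struct ex_extent *fs, size_t nr_fs, uint64_t offset,
		  struct ex_extent *out)
{
	for (size_t i = 0; i < nr_fs; i++) {
		if (offset >= fs[i].foff && offset < fs[i].foff + fs[i].len) {
			*out = fs[i];
			return 0;
		}
	}
	return -1;
}

/*
 * Collect up to max extents covering [offset, offset + length).
 * Returns the number of extents stored in out, or 0 on a mapping failure.
 */
static size_t ex_collect(const struct ex_extent *fs, size_t nr_fs,
			 uint64_t offset, uint64_t length,
			 struct ex_extent *out, size_t max)
{
	size_t i;

	assert(out || max == 0);
	for (i = 0; i < max; i++) {
		uint64_t covered;

		if (ex_map(fs, nr_fs, offset, &out[i]) != 0)
			return 0;

		/* Part of this extent at or after the current offset. */
		covered = out[i].len - (offset - out[i].foff);
		if (covered >= length)
			return i + 1; /* request fully covered */

		offset = out[i].foff + out[i].len;
		length -= covered;
	}
	return max; /* array full; the rest of the range is not returned */
}
```

The granted segment then starts at the first extent's offset and ends at the end of the last extent, so its length is last.foff - first.foff + last.len. When the array fills up before the request is covered, the segment is simply shorter than requested, and the minlength check decides whether that is acceptable.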
fs/nfsd/blocklayout.c | 47 +++++++++++++++++++++++++++++++------------
1 file changed, 34 insertions(+), 13 deletions(-)
diff --git a/fs/nfsd/blocklayout.c b/fs/nfsd/blocklayout.c
index 6d29ea5e8623..101cccbee4a3 100644
--- a/fs/nfsd/blocklayout.c
+++ b/fs/nfsd/blocklayout.c
@@ -89,9 +89,9 @@ nfsd4_block_proc_layoutget(struct svc_rqst *rqstp, struct inode *inode,
{
struct nfsd4_layout_seg *seg = &args->lg_seg;
struct pnfs_block_layout *bl;
- struct pnfs_block_extent *bex;
- u64 length;
- u32 nr_extents_max = 1, block_size = i_blocksize(inode);
+ struct pnfs_block_extent *first_bex, *last_bex;
+ u64 offset = seg->offset, length = seg->length;
+ u32 i, nr_extents_max, block_size = i_blocksize(inode);
__be32 nfserr;
if (locks_in_grace(SVC_NET(rqstp)))
@@ -118,6 +118,13 @@ nfsd4_block_proc_layoutget(struct svc_rqst *rqstp, struct inode *inode,
PNFS_BLOCK_EXTENT_SIZE)
goto out_error;
+ /*
+ * Limit the maximum layout size to avoid allocating
+ * a large buffer on the server for each layout request.
+ */
+ nr_extents_max = (min(args->lg_maxcount, PAGE_SIZE) -
+ PNFS_BLOCK_LAYOUT4_SIZE) / PNFS_BLOCK_EXTENT_SIZE;
+
/*
* Some clients barf on non-zero block numbers for NONE or INVALID
* layouts, so make sure to zero the whole structure.
@@ -129,23 +136,37 @@ nfsd4_block_proc_layoutget(struct svc_rqst *rqstp, struct inode *inode,
bl->nr_extents = nr_extents_max;
args->lg_content = bl;
- bex = &bl->extents[0];
- nfserr = nfsd4_block_map_extent(inode, fhp, seg->offset, seg->length,
- seg->iomode, args->lg_minlength, bex);
- if (nfserr != nfs_ok)
- goto out_error;
+ for (i = 0; i < bl->nr_extents; i++) {
+ struct pnfs_block_extent *bex = bl->extents + i;
+ u64 bex_length;
+
+ nfserr = nfsd4_block_map_extent(inode, fhp, offset, length,
+ seg->iomode, args->lg_minlength, bex);
+ if (nfserr != nfs_ok)
+ goto out_error;
+
+ bex_length = bex->len - (offset - bex->foff);
+ if (bex_length >= length) {
+ bl->nr_extents = i + 1;
+ break;
+ }
+
+ offset = bex->foff + bex->len;
+ length -= bex_length;
+ }
+
+ first_bex = bl->extents;
+ last_bex = bl->extents + bl->nr_extents - 1;
nfserr = nfserr_layoutunavailable;
- length = bex->foff + bex->len - seg->offset;
+ length = last_bex->foff + last_bex->len - seg->offset;
if (length < args->lg_minlength) {
dprintk("pnfsd: extent smaller than minlength\n");
goto out_error;
}
- seg->offset = bex->foff;
- seg->length = bex->len;
-
- dprintk("GET: 0x%llx:0x%llx %d\n", bex->foff, bex->len, bex->es);
+ seg->offset = first_bex->foff;
+ seg->length = last_bex->foff - first_bex->foff + last_bex->len;
return nfs_ok;
out_error:
--
2.43.0
* Re: [PATCH v3 0/4] NFSD: Impl multiple extents in block/scsi layoutget
2025-10-03 9:11 [PATCH v3 0/4] NFSD: Impl multiple extents in block/scsi layoutget Sergey Bashirov
` (3 preceding siblings ...)
2025-10-03 9:11 ` [PATCH v3 4/4] NFSD/blocklayout: Support multiple extents per LAYOUTGET Sergey Bashirov
@ 2025-10-04 17:20 ` Chuck Lever
4 siblings, 0 replies; 6+ messages in thread
From: Chuck Lever @ 2025-10-04 17:20 UTC (permalink / raw)
To: Christoph Hellwig, Dai Ngo, Jeff Layton, NeilBrown,
Olga Kornievskaia, Tom Talpey, Sergey Bashirov
Cc: Chuck Lever, linux-nfs, linux-kernel
From: Chuck Lever <chuck.lever@oracle.com>
On Fri, 03 Oct 2025 12:11:02 +0300, Sergey Bashirov wrote:
> Implement support for multiple extents in the LAYOUTGET response
> for two main reasons.
>
> First, it avoids unnecessary RPC calls. For files consisting of many
> extents, especially large ones, too many LAYOUTGET requests are observed
> in Wireshark traces.
>
> [...]
Applied to nfsd-testing, thanks!
[1/4] NFSD/blocklayout: Fix minlength check in proc_layoutget
commit: b94708d49420881366669b7010269f159a6e1b70
[2/4] NFSD/blocklayout: Extract extent mapping from proc_layoutget
commit: 88f8b3f8c4fc8c351aaae49d0fec4e7b5e6ad0db
[3/4] NFSD/blocklayout: Introduce layout content structure
commit: 76fc273123889e9b1629fc9f1ec40465dbda1a73
[4/4] NFSD/blocklayout: Support multiple extents per LAYOUTGET
commit: 8a3c46f07fb5c3cd6c1cc807d9a22e1531100625
--
Chuck Lever <chuck.lever@oracle.com>