Re: [PATCH 1/1] xfs: fix overlapping extents returned for pNFS LAYOUTGET

From: Dai Ngo <dai.ngo@oracle.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: cem@kernel.org, linux-xfs@vger.kernel.org, linux-nfs@vger.kernel.org
Subject: Re: [PATCH 1/1] xfs: fix overlapping extents returned for pNFS LAYOUTGET
Date: Wed, 13 May 2026 10:28:31 -0700
Message-ID: <06d9b1ae-e46f-459c-bcb4-1a5ca4ded4b0@oracle.com>
In-Reply-To: <961eb355-2f52-47a0-9399-e050a4e535a2@oracle.com>

Hi Christoph,

On 5/13/26 8:50 AM, Dai Ngo wrote:
>
> On 5/13/26 12:01 AM, Christoph Hellwig wrote:
>> On Tue, May 12, 2026 at 10:21:53AM -0700, Dai Ngo wrote:
>>> A single LAYOUTGET request from the client can cause the server to
>>> issue multiple calls to xfs_fs_map_blocks() for different offsets
>>> within the same extent. Because of the use of the XFS_BMAPI_ENTIRE
>>> flag, these calls can produce overlapping mappings.
>>>
>>> As a result, the LAYOUTGET reply sent to the NFS client may contain
>>> overlapping extents. This creates ambiguity in extent selection for a
>>> given file range, which can lead to incorrect device selection,
>>> inconsistent handling of datastate, and ultimately data corruption or
>>> protocol violations on the client side.
>> Please also add a check to the client that catches this and doesn't
>> use the layout that has extents outside the requested range. And maybe
>> warn about it as well.
>
> The returned extents cover exactly the range requested in the LAYOUTGET
> op. However these extents are overlapping. For example, here is the
> on-the-wire capture of the LAYOUTGET operation and reply showing the
> overlapping extents:
>
>     Network File System, Ops(3): SEQUENCE, PUTFH, LAYOUTGET
>         [Program Version: 4]
>         [V4 Procedure: COMPOUND (1)]
>         Tag: <EMPTY>
>         minorversion: 2
>         Operations (count: 3): SEQUENCE, PUTFH, LAYOUTGET
>             Opcode: SEQUENCE (53)
>             Opcode: PUTFH (22)
>             Opcode: LAYOUTGET (50)
>                 layout available?: No
>                 layout type: LAYOUT4_SCSI (5)
>                 IO mode: IOMODE_RW (2)
>                 offset: 122880
>                 length: 65536
>                 min length: 4096
>                 StateID
>                 maxcount: 4096
>         [Main Opcode: LAYOUTGET (50)]
>         Network File System, Ops(3): SEQUENCE PUTFH LAYOUTGET
>         [Program Version: 4]
>         [V4 Procedure: COMPOUND (1)]
>         Status: NFS4_OK (0)
>         Tag: <EMPTY>
>         Operations (count: 3)
>             Opcode: SEQUENCE (53)
>             Opcode: PUTFH (22)
>             Opcode: LAYOUTGET (50)
>                 Status: NFS4_OK (0)
>                 return on close?: Yes
>                 StateID
>                 Layout Segment (count: 1)
>                     offset: 122880
>                     length: 77824
>                     IO mode: IOMODE_RW (2)
>                     layout type: LAYOUT4_SCSI (5)
>                     SCSI Extents (count: 2)
>                         extent 0
>                             device ID: 01000000000000000000000000000000
>                             file offset: 122880
>                             length: 53248
>                             volume offset: 339460096
>                             extent state: INVALID_DATA (2)
>                         extent 1
>                             device ID: 01000000000000000000000000000000
>                             file offset: 122880
>                             length: 77824
>                             volume offset: 339460096
>                             extent state: INVALID_DATA (2)
>         [Main Opcode: LAYOUTGET (50)]
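
To make the overlap in the capture above concrete, here is a minimal
user-space sketch, not kernel code. The struct and function names
(demo_extent, demo_extents_overlap, demo_count_overlaps) are made up for
illustration; only the offsets and lengths come from the capture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one extent in a LAYOUTGET reply (bytes). */
struct demo_extent {
	uint64_t file_offset;
	uint64_t length;
};

/*
 * Two half-open ranges [off, off + len) overlap when each one starts
 * before the other one ends.
 */
static bool demo_extents_overlap(const struct demo_extent *a,
				 const struct demo_extent *b)
{
	assert(a->length > 0 && b->length > 0);
	return a->file_offset < b->file_offset + b->length &&
	       b->file_offset < a->file_offset + a->length;
}

/* Count overlapping pairs; O(n^2) is fine for a handful of extents. */
static size_t demo_count_overlaps(const struct demo_extent *ext, size_t n)
{
	size_t i, j, hits = 0;

	for (i = 0; i < n; i++)
		for (j = i + 1; j < n; j++)
			if (demo_extents_overlap(&ext[i], &ext[j]))
				hits++;
	return hits;
}

/* The two SCSI extents from the capture. */
static const struct demo_extent demo_capture[] = {
	{ .file_offset = 122880, .length = 53248 },
	{ .file_offset = 122880, .length = 77824 },
};
```

Both extents start at file offset 122880, so extent 0 lies entirely
inside extent 1 and the check reports one overlapping pair.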

After reviewing ext_tree_insert(), with assistance from Codex, I think this
function handles overlapping extents properly. The only issue I see in
ext_tree_insert() is an inaccurate error code: it returns EINVAL instead
of ENOMEM when kmemdup() fails.

Since ext_tree_insert() seems to handle overlapping extents fine, do you
think it's still worth fixing xfs_fs_map_blocks() to avoid returning
overlapping extents?

IMHO, we should still fix xfs_fs_map_blocks() to avoid the overhead and
complication of ext_tree_insert() having to handle overlapping extents.

-Dai

>
> -Dai
>
>>
>>> Also drop the check for (!error) since it was already checked after
>>> the call to xfs_bmapi_read().
>>>
>>> Fixes: cc6c40e09d7b1 ("NFSD/blocklayout: Support multiple extents per LAYOUTGET").
>>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>>> ---
>>>   fs/xfs/xfs_pnfs.c | 6 +++---
>>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> - This patch is based on top of the patch:
>>>    xfs: fix use of uninitialized imap in xfs_fs_map_blocks error path
>> The error changes should go into that patch, so please resend it with
>> those fixes.  Maybe as a series together with this patch to keep them
>> together.
>>
>>> @@ -172,6 +172,7 @@ xfs_fs_map_blocks(
>>>       offset_fsb = XFS_B_TO_FSBT(mp, offset);
>>>         lock_flags = xfs_ilock_data_map_shared(ip);
>>> +    bmapi_flags = 0;    /* return map for requested range only */
>> Just remove the variable and hard code the 0 in the xfs_bmapi_read call.
>>
>
