public inbox for linux-nfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Mark Bloch <mbloch@nvidia.com>
To: Trond Myklebust <trondmy@kernel.org>
Cc: Linoy Ganti <lganti@nvidia.com>,
	Bar Friedman <bfriedman@nvidia.com>,
	linux-nfs@vger.kernel.org, Maor Gottlieb <maorg@nvidia.com>
Subject: Re: Possible regression after NFS eof page pollution fix (ext4 checksum errors)
Date: Thu, 22 Jan 2026 12:00:40 +0200	[thread overview]
Message-ID: <9871411e-8272-4f95-80f0-2b86e55231b8@nvidia.com> (raw)
In-Reply-To: <391d9e32-afef-4b1c-adf9-422204360c77@nvidia.com>



On 11/01/2026 9:03, Mark Bloch wrote:
> Hi Trond,
> 
> On 11/01/2026 2:24, Trond Myklebust wrote:
>> Hi Mark,
>>
>> On Mon, 2026-01-05 at 10:20 -0500, Trond Myklebust wrote:
>>>
>>> OK so if I'm understanding correctly, this is organised as ext4
>>> partitions that are stored in qcow2 images that are again stored on a
>>> NFSv4.2 partition.
>>>
>>> Do these qcow2 images have a file size that is fixed at creation
>>> time,
>>> or is the file size dynamic?
> 
> The file size is dynamic (with a fixed maximum of 35 GB).
> 
>>> Also, does changing the "discard" option from "unmap" to "ignore"
>>> make
>>> any difference to the outcome?
> 
> The discard option is already set to "ignore" in the image.
> Do you want us to test the other options just to see if it makes
> a difference?
> 
>>
>> I've been staring at this for several days now, and the only candidate
>> for a bug in the NFS client that I can see is this one. Can you please
>> check if the following patch helps?
> 
> Thanks for the patch, I'll let the team dealing with the issue know
> and let them test the patch.
> I'll update once I know anything.

We've been testing your patch for some time now and didn't hit the issue.
Feel free to add Bar's tested by tag as she was the one
that actually tested the fix. Thanks for looking into this.

Tested-by: Bar Friedman <bfriedman@nvidia.com>

Mark

> 
> Mark
> 
>>
>> Thanks
>>   Trond
>>
>> 8<------------------------------------------------------------------
>> From 18acd9e2652d44bcb8a48bc4643ab006787b809a Mon Sep 17 00:00:00 2001
>> Message-ID: <18acd9e2652d44bcb8a48bc4643ab006787b809a.1768091015.git.trond.myklebust@hammerspace.com>
>> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>> Date: Sat, 10 Jan 2026 18:53:34 -0500
>> Subject: [PATCH] NFSv4.2: Fix size read races in fallocate and copy offload
>>
>> If the pre-operation file size is read before locking the inode and
>> quiescing O_DIRECT writes, then nfs_truncate_last_folio() might end up
>> overwriting valid file data.
>>
>> Fixes: b1817b18ff20 ("NFS: Protect against 'eof page pollution'")
>> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
>> ---
>>  fs/nfs/io.c        |  2 ++
>>  fs/nfs/nfs42proc.c | 29 +++++++++++++++++++----------
>>  2 files changed, 21 insertions(+), 10 deletions(-)
>>
>> diff --git a/fs/nfs/io.c b/fs/nfs/io.c
>> index d275b0a250bf..8337f0ae852d 100644
>> --- a/fs/nfs/io.c
>> +++ b/fs/nfs/io.c
>> @@ -84,6 +84,7 @@ nfs_start_io_write(struct inode *inode)
>>  		nfs_file_block_o_direct(NFS_I(inode));
>>  	return err;
>>  }
>> +EXPORT_SYMBOL_GPL(nfs_start_io_write);
>>  
>>  /**
>>   * nfs_end_io_write - declare that the buffered write operation is done
>> @@ -97,6 +98,7 @@ nfs_end_io_write(struct inode *inode)
>>  {
>>  	up_write(&inode->i_rwsem);
>>  }
>> +EXPORT_SYMBOL_GPL(nfs_end_io_write);
>>  
>>  /* Call with exclusively locked inode->i_rwsem */
>>  static void nfs_block_buffered(struct nfs_inode *nfsi, struct inode *inode)
>> diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
>> index d537fb0c230e..c08520828708 100644
>> --- a/fs/nfs/nfs42proc.c
>> +++ b/fs/nfs/nfs42proc.c
>> @@ -114,7 +114,6 @@ static int nfs42_proc_fallocate(struct rpc_message *msg, struct file *filep,
>>  	exception.inode = inode;
>>  	exception.state = lock->open_context->state;
>>  
>> -	nfs_file_block_o_direct(NFS_I(inode));
>>  	err = nfs_sync_inode(inode);
>>  	if (err)
>>  		goto out;
>> @@ -138,13 +137,17 @@ int nfs42_proc_allocate(struct file *filep, loff_t offset, loff_t len)
>>  		.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_ALLOCATE],
>>  	};
>>  	struct inode *inode = file_inode(filep);
>> -	loff_t oldsize = i_size_read(inode);
>> +	loff_t oldsize;
>>  	int err;
>>  
>>  	if (!nfs_server_capable(inode, NFS_CAP_ALLOCATE))
>>  		return -EOPNOTSUPP;
>>  
>> -	inode_lock(inode);
>> +	err = nfs_start_io_write(inode);
>> +	if (err)
>> +		return err;
>> +
>> +	oldsize = i_size_read(inode);
>>  
>>  	err = nfs42_proc_fallocate(&msg, filep, offset, len);
>>  
>> @@ -155,7 +158,7 @@ int nfs42_proc_allocate(struct file *filep, loff_t offset, loff_t len)
>>  		NFS_SERVER(inode)->caps &= ~(NFS_CAP_ALLOCATE |
>>  					     NFS_CAP_ZERO_RANGE);
>>  
>> -	inode_unlock(inode);
>> +	nfs_end_io_write(inode);
>>  	return err;
>>  }
>>  
>> @@ -170,7 +173,9 @@ int nfs42_proc_deallocate(struct file *filep, loff_t offset, loff_t len)
>>  	if (!nfs_server_capable(inode, NFS_CAP_DEALLOCATE))
>>  		return -EOPNOTSUPP;
>>  
>> -	inode_lock(inode);
>> +	err = nfs_start_io_write(inode);
>> +	if (err)
>> +		return err;
>>  
>>  	err = nfs42_proc_fallocate(&msg, filep, offset, len);
>>  	if (err == 0)
>> @@ -179,7 +184,7 @@ int nfs42_proc_deallocate(struct file *filep, loff_t offset, loff_t len)
>>  		NFS_SERVER(inode)->caps &= ~(NFS_CAP_DEALLOCATE |
>>  					     NFS_CAP_ZERO_RANGE);
>>  
>> -	inode_unlock(inode);
>> +	nfs_end_io_write(inode);
>>  	return err;
>>  }
>>  
>> @@ -189,14 +194,17 @@ int nfs42_proc_zero_range(struct file *filep, loff_t offset, loff_t len)
>>  		.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_ZERO_RANGE],
>>  	};
>>  	struct inode *inode = file_inode(filep);
>> -	loff_t oldsize = i_size_read(inode);
>> +	loff_t oldsize;
>>  	int err;
>>  
>>  	if (!nfs_server_capable(inode, NFS_CAP_ZERO_RANGE))
>>  		return -EOPNOTSUPP;
>>  
>> -	inode_lock(inode);
>> +	err = nfs_start_io_write(inode);
>> +	if (err)
>> +		return err;
>>  
>> +	oldsize = i_size_read(inode);
>>  	err = nfs42_proc_fallocate(&msg, filep, offset, len);
>>  	if (err == 0) {
>>  		nfs_truncate_last_folio(inode->i_mapping, oldsize,
>> @@ -205,7 +213,7 @@ int nfs42_proc_zero_range(struct file *filep, loff_t offset, loff_t len)
>>  	} else if (err == -EOPNOTSUPP)
>>  		NFS_SERVER(inode)->caps &= ~NFS_CAP_ZERO_RANGE;
>>  
>> -	inode_unlock(inode);
>> +	nfs_end_io_write(inode);
>>  	return err;
>>  }
>>  
>> @@ -416,7 +424,7 @@ static ssize_t _nfs42_proc_copy(struct file *src,
>>  	struct nfs_server *src_server = NFS_SERVER(src_inode);
>>  	loff_t pos_src = args->src_pos;
>>  	loff_t pos_dst = args->dst_pos;
>> -	loff_t oldsize_dst = i_size_read(dst_inode);
>> +	loff_t oldsize_dst;
>>  	size_t count = args->count;
>>  	ssize_t status;
>>  
>> @@ -461,6 +469,7 @@ static ssize_t _nfs42_proc_copy(struct file *src,
>>  		&src_lock->open_context->state->flags);
>>  	set_bit(NFS_CLNT_DST_SSC_COPY_STATE,
>>  		&dst_lock->open_context->state->flags);
>> +	oldsize_dst = i_size_read(dst_inode);
>>  
>>  	status = nfs4_call_sync(dst_server->client, dst_server, &msg,
>>  				&args->seq_args, &res->seq_res, 0);
> 


  parent reply	other threads:[~2026-01-22 10:00 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-04  9:16 Possible regression after NFS eof page pollution fix (ext4 checksum errors) Mark Bloch
2026-01-04 15:36 ` Trond Myklebust
2026-01-05 14:00   ` Mark Bloch
2026-01-05 15:20     ` Trond Myklebust
2026-01-06 10:12       ` Bar Friedman
2026-01-11  0:24       ` Trond Myklebust
2026-01-11  7:03         ` Mark Bloch
2026-01-11 14:59           ` Trond Myklebust
2026-01-22 10:00           ` Mark Bloch [this message]
2026-01-22 21:13             ` Trond Myklebust

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9871411e-8272-4f95-80f0-2b86e55231b8@nvidia.com \
    --to=mbloch@nvidia.com \
    --cc=bfriedman@nvidia.com \
    --cc=lganti@nvidia.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=maorg@nvidia.com \
    --cc=trondmy@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox