Linux NFS development
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: Li Lingfeng <lilingfeng3@huawei.com>,
	trondmy@kernel.org, anna@kernel.org, 	bcodding@redhat.com
Cc: linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	 yukuai1@huaweicloud.com, houtao1@huawei.com,
	yi.zhang@huawei.com,  yangerkun@huawei.com,
	lilingfeng@huaweicloud.com
Subject: Re: [PATCH] nfs: handle failure of nfs_get_lock_context in unlock path
Date: Sat, 19 Apr 2025 08:25:02 -0400	[thread overview]
Message-ID: <2463b2c4db0494d49e15b91d75d1692e00426b56.camel@kernel.org> (raw)
In-Reply-To: <21817f2c-2971-4568-9ae4-1ccc25f7f1ef@huawei.com>

On Sat, 2025-04-19 at 16:28 +0800, Li Lingfeng wrote:
> 在 2025/4/17 20:43, Jeff Layton 写道:
> > On Thu, 2025-04-17 at 20:24 +0800, Li Lingfeng wrote:
> > > 在 2025/4/17 18:29, Jeff Layton 写道:
> > > > On Thu, 2025-04-17 at 15:25 +0800, Li Lingfeng wrote:
> > > > > When memory is insufficient, the allocation of nfs_lock_context in
> > > > > nfs_get_lock_context() fails and returns -ENOMEM. If we mistakenly treat
> > > > > an nfs4_unlockdata structure (whose l_ctx member has been set to -ENOMEM)
> > > > > as valid and proceed to execute rpc_run_task(), this will trigger a NULL
> > > > > pointer dereference in nfs4_locku_prepare. For example:
> > > > > 
> > > > > BUG: kernel NULL pointer dereference, address: 000000000000000c
> > > > > PGD 0 P4D 0
> > > > > Oops: Oops: 0000 [#1] SMP PTI
> > > > > CPU: 15 UID: 0 PID: 12 Comm: kworker/u64:0 Not tainted 6.15.0-rc2-dirty #60
> > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40
> > > > > Workqueue: rpciod rpc_async_schedule
> > > > > RIP: 0010:nfs4_locku_prepare+0x35/0xc2
> > > > > Code: 89 f2 48 89 fd 48 c7 c7 68 69 ef b5 53 48 8b 8e 90 00 00 00 48 89 f3
> > > > > RSP: 0018:ffffbbafc006bdb8 EFLAGS: 00010246
> > > > > RAX: 000000000000004b RBX: ffff9b964fc1fa00 RCX: 0000000000000000
> > > > > RDX: 0000000000000000 RSI: fffffffffffffff4 RDI: ffff9ba53fddbf40
> > > > > RBP: ffff9ba539934000 R08: 0000000000000000 R09: ffffbbafc006bc38
> > > > > R10: ffffffffb6b689c8 R11: 0000000000000003 R12: ffff9ba539934030
> > > > > R13: 0000000000000001 R14: 0000000004248060 R15: ffffffffb56d1c30
> > > > > FS: 0000000000000000(0000) GS:ffff9ba5881f0000(0000) knlGS:00000000
> > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > CR2: 000000000000000c CR3: 000000093f244000 CR4: 00000000000006f0
> > > > > Call Trace:
> > > > >    <TASK>
> > > > >    __rpc_execute+0xbc/0x480
> > > > >    rpc_async_schedule+0x2f/0x40
> > > > >    process_one_work+0x232/0x5d0
> > > > >    worker_thread+0x1da/0x3d0
> > > > >    ? __pfx_worker_thread+0x10/0x10
> > > > >    kthread+0x10d/0x240
> > > > >    ? __pfx_kthread+0x10/0x10
> > > > >    ret_from_fork+0x34/0x50
> > > > >    ? __pfx_kthread+0x10/0x10
> > > > >    ret_from_fork_asm+0x1a/0x30
> > > > >    </TASK>
> > > > > Modules linked in:
> > > > > CR2: 000000000000000c
> > > > > ---[ end trace 0000000000000000 ]---
> > > > > 
> > > > > Free the allocated nfs4_unlockdata when nfs_get_lock_context() fails and
> > > > > return NULL to terminate subsequent rpc_run_task, preventing NULL pointer
> > > > > dereference.
> > > > > 
> > > > > Fixes: f30cb757f680 ("NFS: Always wait for I/O completion before unlock")
> > > > > Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
> > > > > ---
> > > > >    fs/nfs/nfs4proc.c | 9 ++++++++-
> > > > >    1 file changed, 8 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > > > > index 970f28dbf253..9f5689c43a50 100644
> > > > > --- a/fs/nfs/nfs4proc.c
> > > > > +++ b/fs/nfs/nfs4proc.c
> > > > > @@ -7074,10 +7074,18 @@ static struct nfs4_unlockdata *nfs4_alloc_unlockdata(struct file_lock *fl,
> > > > >    	struct nfs4_unlockdata *p;
> > > > >    	struct nfs4_state *state = lsp->ls_state;
> > > > >    	struct inode *inode = state->inode;
> > > > > +	struct nfs_lock_context *l_ctx;
> > > > >    
> > > > >    	p = kzalloc(sizeof(*p), GFP_KERNEL);
> > > > >    	if (p == NULL)
> > > > >    		return NULL;
> > > > > +	l_ctx = nfs_get_lock_context(ctx);
> > > > > +	if (!IS_ERR(l_ctx)) {
> > > > > +		p->l_ctx = l_ctx;
> > > > > +	} else {
> > > > > +		kfree(p);
> > > > > +		return NULL;
> > > > > +	}
> > > > >    	p->arg.fh = NFS_FH(inode);
> > > > >    	p->arg.fl = &p->fl;
> > > > >    	p->arg.seqid = seqid;
> > > > > @@ -7085,7 +7093,6 @@ static struct nfs4_unlockdata *nfs4_alloc_unlockdata(struct file_lock *fl,
> > > > >    	p->lsp = lsp;
> > > > >    	/* Ensure we don't close file until we're done freeing locks! */
> > > > >    	p->ctx = get_nfs_open_context(ctx);
> > > > Not exactly the same problem, but get_nfs_open_context() can fail too.
> > > > Does it need error handling for that as well?
> > > Hi,
> > > 
> > > IIUC, nfs_open_context is allocated during file open and attached to
> > > filp->private_data. Upon successful file opening, the context remains valid.
> > > Post-lock acquisition, nfs_open_context can be retrieved via
> > > file_lock->file->nfs_open_context chain. Thus get_nfs_open_context() here
> > > should have non-failure guarantee in standard code paths.
> > 
> > I'm not so sure. This function can get called from the rpc_release
> > callback for a LOCK request:
> > 
> > ->rpc_release
> >      nfs4_lock_release
> > 	nfs4_do_unlck
> > 	    nfs4_alloc_unlockdata
> > 
> > Can that happen after the open_ctx->lock_context.count goes to 0?
> > 
> > Given that we have a safe failure path in this code, it seems like we
> > ought to check for that here, just to be safe. If it really shouldn't
> > happen like you say, then we could throw in a WARN_ON_ONCE() too.
> Thank you for raising this concern.
> During file open, the nfs_open_context is allocated, and
> open_ctx->lock_context.count is initialized to 1. Based on the current
> flow, I think it's unlikely for this counter to reach 0 during lock/unlock
> operations since its decrement is tied to file closure.
> 
> However, I agree with your suggestion to add checks when
> get_nfs_open_context fails. Furthermore, this check might also be
> necessary not only in the unlock path but potentially in the lock path if
> get_nfs_open_contextb fails there as well.
> 
> Additionally, I noticed that both the lock and unlock release callbacks
> dereference nfs_open_context. If get_nfs_open_context were to fail
> (assuming such a scenario is possible), this could lead to a NULL pointer
> dereference. Instead of relying solely on WARN_ON_ONCE(), it might be
> safer to halt the operation immediately upon detecting a failure in
> get_nfs_open_context.
> 
> // unlock
> nfs4_locku_release_calldata
>   put_nfs_open_context
>     __put_nfs_open_context
>      // dereference nfs_open_context
> 
> // lock
> nfs4_lock_release
>   nfs4_do_unlck
>    // dereference nfs_open_context
>   put_nfs_open_context
>    // dereference nfs_open_context
> 
> I'll incorporate your feedback and send a patchset soon.

Thanks. I think that it pays to be safe here. If this scenario turns
out to not be possible, then the alternative would be to add a comment
that explains why.

Just handling the error case is probably best though as this is not a
particularly hot codepath, and that would help safeguard against future
changes.

> > 
> > > > > -	p->l_ctx = nfs_get_lock_context(ctx);
> > > > >    	locks_init_lock(&p->fl);
> > > > >    	locks_copy_lock(&p->fl, fl);
> > > > >    	p->server = NFS_SERVER(inode);
> > > > Good catch:
> > > > 
> > > > Reviewed-by: Jeff Layton <jlayton@kernel.org>
> > > > 

-- 
Jeff Layton <jlayton@kernel.org>

      reply	other threads:[~2025-04-19 12:25 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-17  7:25 [PATCH] nfs: handle failure of nfs_get_lock_context in unlock path Li Lingfeng
2025-04-17 10:29 ` Jeff Layton
2025-04-17 12:24   ` Li Lingfeng
2025-04-17 12:43     ` Jeff Layton
2025-04-19  8:28       ` Li Lingfeng
2025-04-19 12:25         ` Jeff Layton [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2463b2c4db0494d49e15b91d75d1692e00426b56.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=anna@kernel.org \
    --cc=bcodding@redhat.com \
    --cc=houtao1@huawei.com \
    --cc=lilingfeng3@huawei.com \
    --cc=lilingfeng@huaweicloud.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trondmy@kernel.org \
    --cc=yangerkun@huawei.com \
    --cc=yi.zhang@huawei.com \
    --cc=yukuai1@huaweicloud.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox