From: yangerkun <yangerkun@huawei.com>
To: Chuck Lever <cel@kernel.org>,
Misbah Anjum N <misanjum@linux.ibm.com>,
Jeff Layton <jlayton@kernel.org>, NeilBrown <neil@brown.name>,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
Trond Myklebust <trondmy@kernel.org>,
Anna Schumaker <anna@kernel.org>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Simon Horman <horms@kernel.org>, <yi.zhang@huawei.com>
Cc: <linux-nfs@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<netdev@vger.kernel.org>, Chuck Lever <chuck.lever@oracle.com>
Subject: Re: [PATCH 0/6] SUNRPC: Address remaining cache_check_rcu() UAF in cache content files
Date: Fri, 8 May 2026 11:08:00 +0800
Message-ID: <05f93fc4-59d7-4735-bc7d-a00d1497687a@huawei.com>
In-Reply-To: <4bb9ed6b-1a64-406a-9239-b0560ca963cc@huawei.com>
On 2026/5/8 10:45, yangerkun wrote:
> Hello Chuck,
>
> On 2026/5/8 0:12, Chuck Lever wrote:
>> Hello Erkun -
>>
>> On Thu, May 7, 2026, at 11:09 AM, yangerkun wrote:
>>> Hi,
>>>
>>> On 2026/5/1 22:51, Chuck Lever wrote:
>>>> Misbah Anjum reported a use-after-free in cache_check_rcu()
>>>> reached through e_show() while sosreport was reading
>>>> /proc/fs/nfsd/exports on ppc64le. Two fixes for that report
>>>> landed in v7.0:
>>>>
>>>> 48db892356d6 ("NFSD: Defer sub-object cleanup in export put
>>>> callbacks")
>>>> e7fcf179b82d ("NFSD: Hold net reference for the lifetime of
>>>> /proc/fs/nfs/exports fd")
>>>
>>> Back to the problem fixed by these patches: I'm a little confused about
>>> why this UAF can be triggered.
>>>
>>> Before these patches, svc_export_put() looked like this:
>>>
>>> static void svc_export_put(struct kref *ref)
>>> {
>>> 	struct svc_export *exp = container_of(ref, struct svc_export, h.ref);
>>>
>>> 	path_put(&exp->ex_path);
>>> 	auth_domain_put(exp->ex_client);
>>> 	call_rcu(&exp->ex_rcu, svc_export_release);
>>> }
>>>
>>> The auth_domain_put function releases ->name using call_rcu, and
>>> path_put may release the dentry also via call_rcu. All of this seems to
>>> prevent e_show from causing a UAF. Could you point out which line in
>>> d_path triggers the issue?
>>
>> The dentry, the mount, and the auth_domain ->name buffer all
>> end up RCU-freed (dentry_free() and delayed_free_vfsmnt in
>> fs/, svcauth_unix_domain_release_rcu() in svcauth_unix.c).
>> The eventual kfree isn't the problem.
>>
>> The problem is the synchronous teardown inside path_put(),
>> which runs before svc_export_put() ever reaches its own
>> call_rcu():
>>
>> path_put(&exp->ex_path)
>>   -> dput(dentry)
>>     -> __dentry_kill()            [if last ref]
>>       -> __d_drop()               /* unhashes */
>>       -> dentry_unlink_inode()    /* d_inode = NULL */
>>       -> d_op->d_release()        if set
>>       -> drops parent d_lockref   /* may cascade up */
>>       -> dentry_free()            /* call_rcu deferred */
>>   -> mntput(mnt)                  /* deferred via task_work */
>>
>> The dentry pointer itself is RCU-safe, so prepend_path()'s walk
>> of d_parent and d_name doesn't read freed memory. But by the
>> time the reader gets there, __d_clear_type_and_inode() has
>> already stored NULL into d_inode, __d_drop() has broken the
>> hash linkage, and the parent's d_lockref has been decremented
>> -- which can in turn fire __dentry_kill() on the parent, and
>> on up the tree. An e_show() that's still inside its cache RCU
>> read section walks into that half-dismantled state through
>> seq_path(), and that's the NULL deref Misbah reported.
>
> Thank you for your detailed explanation! Yes, e_show might be called
> when the state is partially dismantled, but after carefully reviewing
> the code with dput up to __dentry_kill, I still cannot find anything
> that could cause this issue. Additionally, the comments for prepend_path
> indicate that they have already taken into account that the dentry can
> be removed concurrently. I have also run some tests on my arm64 QEMU,
> but I couldn't reproduce the problem either. Could you please help me
> identify the specific line or pointer in the dentry that triggers this
> use-after-free or null pointer issue?
>
> Maybe I am not very familiar with the code, which is why I failed
> to identify the real root cause. I'm sorry about that.
>
>
> char *d_path(const struct path *path, char *buf, int buflen)
> {
> 	DECLARE_BUFFER(b, buf, buflen);
> 	struct path root;
>
> 	/*
> 	 * We have various synthetic filesystems that never get mounted. On
> 	 * these filesystems dentries are never used for lookup purposes, and
> 	 * thus don't need to be hashed. They also don't need a name until a
> 	 * user wants to identify the object in /proc/pid/fd/. The little hack
> 	 * below allows us to generate a name for these objects on demand:
> 	 *
> 	 * Some pseudo inodes are mountable. When they are mounted
> 	 * path->dentry == path->mnt->mnt_root. In that case don't call d_dname
> 	 * and instead have d_path return the mounted path.
> 	 */
> 	if (path->dentry->d_op && path->dentry->d_op->d_dname &&
> 	    (!IS_ROOT(path->dentry) || path->dentry != path->mnt->mnt_root))
> 		return path->dentry->d_op->d_dname(path->dentry, buf, buflen);
>
> 	rcu_read_lock();
> 	get_fs_root_rcu(current->fs, &root);
> 	if (unlikely(d_unlinked(path->dentry)))
> 		prepend(&b, " (deleted)", 11);
> 	else
> 		prepend_char(&b, 0);
> 	prepend_path(path, &root, &b);
> 	rcu_read_unlock();
>
> 	return extract_string(&b);
> }
>
>
>>
>> The earlier fix (2530766492ec, "nfsd: fix UAF when access
>> ex_uuid or ex_stats") moved the kfree of ex_uuid and ex_stats
>> into svc_export_release() so those are RCU-safe now.
>> path_put() and auth_domain_put() couldn't go in there because
>> both may sleep, and call_rcu callbacks run in softirq context.
>> This series uses queue_rcu_work() instead: it defers past the
>> grace period AND runs the callback in process context, so the
>> sleeping puts move into the deferred path and the window
>> closes.
>
> Yes, I understand this now. Thanks again for your detailed explanation!
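
To check my understanding of the ordering described above, here is a toy
userspace model. It is not kernel code: struct toy_export and every toy_*
function are names I made up for illustration, and the "grace period" is just
an explicit function call. It only contrasts "tear the path down now, defer
the free" with "defer both past the grace period".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct svc_export; not the kernel type. */
struct toy_export {
	bool path_live;      /* false once the modelled path_put() has run */
	bool freed;          /* true once the modelled kfree() has run */
	bool release_queued; /* a deferred release is pending */
	bool defer_path_put; /* true: path teardown runs in the deferred step */
};

/* Models the synchronous part of path_put(): the path is dismantled now. */
void toy_path_put(struct toy_export *exp)
{
	exp->path_live = false;
}

/* Old ordering: dismantle the path immediately, defer only the free. */
void toy_put_sync(struct toy_export *exp)
{
	toy_path_put(exp);
	exp->defer_path_put = false;
	exp->release_queued = true;
}

/* New ordering: defer both the path teardown and the free. */
void toy_put_deferred(struct toy_export *exp)
{
	exp->defer_path_put = true;
	exp->release_queued = true;
}

/* Models the end of the grace period: run the queued release, if any. */
void toy_grace_period_end(struct toy_export *exp)
{
	if (!exp->release_queued)
		return;
	assert(!exp->freed); /* a release must never run twice */
	if (exp->defer_path_put)
		toy_path_put(exp);
	exp->freed = true;
	exp->release_queued = false;
}

/* Models a reader such as e_show(): it needs the object and its path. */
bool toy_reader_sees_intact(const struct toy_export *exp)
{
	return !exp->freed && exp->path_live;
}
```

In this model, a reader that runs between the put and toy_grace_period_end()
sees an intact path only with toy_put_deferred(). With toy_put_sync() the
object is still allocated but its path is already dismantled, which is how I
read the half-dismantled state described above.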
Also, could the scenario described in this commit be triggered again?

commit 69d803c40edeaf94089fbc8751c9b746cdc35044
Author: Yang Erkun <yangerkun@huawei.com>
Date:   Mon Dec 16 22:21:52 2024 +0800

    nfsd: Revert "nfsd: release svc_expkey/svc_export with rcu_work"

    This reverts commit f8c989a0c89a75d30f899a7cabdc14d72522bb8d.

    Before this commit, svc_export_put or expkey_put called path_put
    synchronously. After this commit, path_put is called asynchronously,
    and this can lead to the unexpected result shown below.

        mkfs.xfs -f /dev/sda
        echo "/ *(rw,no_root_squash,fsid=0)" > /etc/exports
        echo "/mnt *(rw,no_root_squash,fsid=1)" >> /etc/exports
        exportfs -ra
        service nfs-server start
        mount -t nfs -o vers=4.0 127.0.0.1:/mnt /mnt1
        mount /dev/sda /mnt/sda
        touch /mnt1/sda/file
        exportfs -r
        umount /mnt/sda # fails unexpectedly

    The touch will finally call nfsd_cross_mnt, add a refcount to the mount,
    and then add a cache_head. Before this commit, exportfs -r called
    cache_flush to clean up all cache_heads, and path_put in
    svc_export_put/expkey_put finished synchronously, so the later umount
    always succeeded. However, after this commit, path_put is called
    asynchronously, so the later umount may fail; if we add some delay,
    umount succeeds too. Personally I think this is a bug and should be
    fixed. We first revert the earlier bugfix patch, and then fix the
    original bug in a different way.

    Fixes: f8c989a0c89a ("nfsd: release svc_expkey/svc_export with rcu_work")
    Signed-off-by: Yang Erkun <yangerkun@huawei.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
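
To make the concern concrete, here is a toy userspace model of the sequence
in that commit message. Again this is not kernel code: struct toy_mount,
struct toy_cache_entry and the toy_* functions are hypothetical, and I am
assuming the failed umount surfaces as -EBUSY. The point is only that a
deferred path_put leaves the mount pinned until the deferred work has run.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mounted filesystem; not struct vfsmount. */
struct toy_mount {
	int refs;      /* references held by users such as cache entries */
	bool mounted;
};

/* Hypothetical stand-in for a cache entry pinning the mount via its path. */
struct toy_cache_entry {
	struct toy_mount *mnt;
	bool put_pending; /* path_put queued but not yet run */
};

/* Models nfsd_cross_mnt() adding a cache entry that pins the mount. */
void toy_cache_add(struct toy_cache_entry *e, struct toy_mount *mnt)
{
	e->mnt = mnt;
	e->put_pending = false;
	mnt->refs++;
}

/* Models cache_flush() when path_put() runs synchronously. */
void toy_flush_sync(struct toy_cache_entry *e)
{
	assert(e->mnt != NULL);
	e->mnt->refs--;
	e->mnt = NULL;
}

/* Models cache_flush() when path_put() is deferred to a work item. */
void toy_flush_async(struct toy_cache_entry *e)
{
	assert(e->mnt != NULL);
	e->put_pending = true;
}

/* Models the deferred work item eventually running. */
void toy_run_deferred(struct toy_cache_entry *e)
{
	if (!e->put_pending)
		return;
	e->mnt->refs--;
	e->mnt = NULL;
	e->put_pending = false;
}

/* Models a plain umount: it fails while anything still pins the mount. */
int toy_umount(struct toy_mount *mnt)
{
	if (mnt->refs > 0)
		return -EBUSY; /* assumed error code for the failed umount */
	mnt->mounted = false;
	return 0;
}
```

With toy_flush_sync() the following toy_umount() succeeds immediately. With
toy_flush_async() it fails until toy_run_deferred() has been called, which
matches the "add some delay and umount succeeds" behaviour in the commit
message. My question is whether the new series has something that plays the
role of toy_run_deferred() before exportfs -r returns.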
>
> Thanks,
> Erkun.
>
>>
>>
>