From: Chuck Lever <chuck.lever@oracle.com>
To: Li Lingfeng <lilingfeng3@huawei.com>,
cve@kernel.org, linux-kernel@vger.kernel.org,
linux-cve-announce@vger.kernel.org,
"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Olga Kornievskaia <okorniev@redhat.com>,
Jeff Layton <jlayton@kernel.org>, NeilBrown <neilb@suse.de>,
yangerkun <yangerkun@huawei.com>,
"zhangyi (F)" <yi.zhang@huawei.com>, Hou Tao <houtao1@huawei.com>,
"yukuai (C)" <yukuai3@huawei.com>,
"chengzhihao1@huawei.com" <chengzhihao1@huawei.com>,
ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Subject: Re: CVE-2024-50106: nfsd: fix race between laundromat and free_stateid
Date: Tue, 17 Dec 2024 13:25:43 -0500
Message-ID: <7b0ec3c4-77a1-49cf-aadf-7d393c750f8e@oracle.com>
In-Reply-To: <ef9774e3-572b-427f-99e9-c6a456ffe4fc@huawei.com>
On 12/17/24 10:30 AM, Li Lingfeng wrote:
> Hi,
> After analysis, we think that this issue was not introduced by commit
> 2d4a532d385f ("nfsd: ensure that clp->cl_revoked list is protected by
> clp->cl_lock") but by commit 83e733161fde ("nfsd: avoid race after
> unhash_delegation_locked()").
> Therefore, kernel versions earlier than 6.9 are not affected by this issue.
A more practical question is: has anyone reproduced the reported crash
on a pre-v6.9 kernel?
I recall (dimly) that we knew that 8dd91e8d31fe ("nfsd: fix race between
laundromat and free_stateid") could not be cleanly applied before v6.9.
It was less clear at the time whether a more extensive LTS backport
would be required.
> // normal case 1 -- free deleg by delegreturn
> 1) OP_DELEGRETURN
>  nfsd4_delegreturn
>   nfsd4_lookup_stateid
>   destroy_delegation
>    destroy_unhashed_deleg
>     nfs4_unlock_deleg_lease
>      vfs_setlease // unlock
>     nfs4_put_stid // put last refcount
>      idr_remove // remove from cl_stateids
>      s->sc_free // free deleg
>
> 2) OP_FREE_STATEID
>  nfsd4_free_stateid
>   find_stateid_locked // can not find the deleg in cl_stateids
>
>
> // normal case 2 -- free deleg by laundromat
> nfs4_laundromat
>  state_expired
>  unhash_delegation_locked // set NFS4_REVOKED_DELEG_STID
>  list_add // add the deleg to reaplist
>  list_first_entry // get the deleg from reaplist
>  revoke_delegation
>   destroy_unhashed_deleg
>    nfs4_unlock_deleg_lease
>    nfs4_put_stid
>
>
> // abnormal case
> nfs4_laundromat
>  state_expired
>  unhash_delegation_locked
>   // set NFS4_REVOKED_DELEG_STID
>  list_add
>   // add the deleg to reaplist
>                              1) OP_DELEGRETURN
>                               nfsd4_delegreturn
>                               nfsd4_lookup_stateid
>                               nfsd4_stid_check_stateid_generation
>                               nfsd4_verify_open_stid
>                                // check NFS4_REVOKED_DELEG_STID
>                                // and return nfserr_deleg_revoked
>                               // skip destroy_delegation
>
>                              2) OP_FREE_STATEID
>                               nfsd4_free_stateid
>                                // check NFS4_REVOKED_DELEG_STID
>                               list_del_init
>                                // remove deleg from reaplist
>                               nfs4_put_stid
>                                // free deleg
>  list_first_entry
>   // cannot get the deleg from reaplist
>
>
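
For anyone following along, the abnormal interleaving above can be replayed
in a small user-space model. This is only a sketch under stated assumptions:
struct toy_deleg and every toy_* function are hypothetical stand-ins, not
nfsd code, and the "freed" flag replaces a real free() so the demo never
touches released memory. It models the behavior described in the trace
(stid marked revoked before the reaplist is drained), nothing more.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a delegation stid; none of this is kernel code. */
struct toy_deleg {
	int refcount;
	bool revoked;      /* models NFS4_REVOKED_DELEG_STID */
	bool on_reaplist;  /* models membership in the laundromat's reaplist */
	bool lease_held;   /* models the file lease backing the delegation */
	bool freed;        /* set instead of calling free() */
};

static void toy_put(struct toy_deleg *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		d->freed = true;
}

/* Laundromat, first half: runs under the state lock. */
static void toy_laundromat_mark(struct toy_deleg *d)
{
	d->revoked = true;
	d->on_reaplist = true;
}

/* FREE_STATEID arriving after the state lock was dropped. */
static void toy_free_stateid(struct toy_deleg *d)
{
	if (!d->revoked)
		return;
	d->on_reaplist = false;  /* models list_del_init() */
	toy_put(d);              /* drops the last reference */
}

/* Laundromat, second half: runs without the state lock. */
static void toy_laundromat_reap(struct toy_deleg *d)
{
	if (!d->on_reaplist)
		return;              /* entry is gone, so revoke is skipped */
	d->on_reaplist = false;
	d->lease_held = false;   /* models nfs4_unlock_deleg_lease() */
	toy_put(d);
}

/* Returns true when a lease still refers to a freed delegation. */
static bool toy_run_race(void)
{
	struct toy_deleg d = { .refcount = 1, .lease_held = true };

	toy_laundromat_mark(&d);
	toy_free_stateid(&d);
	toy_laundromat_reap(&d);
	return d.freed && d.lease_held;
}
```

In this model toy_run_race() returns true: the stid is gone but the lease
is still held, which is the state a later open would trip over.
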
> Before commit 83e733161fde ("nfsd: avoid race after
> unhash_delegation_locked()"), nfs4_laundromat --> unhash_delegation_locked
> would not set NFS4_REVOKED_DELEG_STID for the deleg.
> So the description "it marks the delegation stid revoked" in the CVE fix
> patch does not hold true for those kernels, and the OP_FREE_STATEID
> operation will not release the deleg.
>
> Thanks.
>
> On 2024/11/6 1:10, Greg Kroah-Hartman wrote:
>> Description
>> ===========
>>
>> In the Linux kernel, the following vulnerability has been resolved:
>>
>> nfsd: fix race between laundromat and free_stateid
>>
>> There is a race between laundromat handling of revoked delegations
>> and a client sending a free_stateid operation. The laundromat thread
>> finds that a delegation has expired and needs to be revoked, so it
>> marks the delegation stid revoked and puts it on a reaper list,
>> but then it unlocks the state lock and the actual delegation revocation
>> happens without the lock. Once the stid is marked revoked, a racing
>> free_stateid processing thread does the following: (1) it calls
>> list_del_init(), which removes the stid from the reaper list, and (2) it
>> frees the delegation stid structure. The laundromat thread ends up not
>> calling the revoke_delegation() function for this particular delegation,
>> which means it will not release the lease that exists on
>> the file.
>>
>> Now, a new open for this file comes in, finds that the
>> lease list isn't empty, and calls nfsd_breaker_owns_lease(), which ends
>> up trying to dereference a freed delegation stateid, leading to the
>> following use-after-free KASAN warning:
>>
>> kernel: ==================================================================
>> kernel: BUG: KASAN: slab-use-after-free in nfsd_breaker_owns_lease+0x140/0x160 [nfsd]
>> kernel: Read of size 8 at addr ffff0000e73cd0c8 by task nfsd/6205
>> kernel:
>> kernel: CPU: 2 UID: 0 PID: 6205 Comm: nfsd Kdump: loaded Not tainted 6.11.0-rc7+ #9
>> kernel: Hardware name: Apple Inc. Apple Virtualization Generic Platform, BIOS 2069.0.0.0.0 08/03/2024
>> kernel: Call trace:
>> kernel: dump_backtrace+0x98/0x120
>> kernel: show_stack+0x1c/0x30
>> kernel: dump_stack_lvl+0x80/0xe8
>> kernel: print_address_description.constprop.0+0x84/0x390
>> kernel: print_report+0xa4/0x268
>> kernel: kasan_report+0xb4/0xf8
>> kernel: __asan_report_load8_noabort+0x1c/0x28
>> kernel: nfsd_breaker_owns_lease+0x140/0x160 [nfsd]
>> kernel: nfsd_file_do_acquire+0xb3c/0x11d0 [nfsd]
>> kernel: nfsd_file_acquire_opened+0x84/0x110 [nfsd]
>> kernel: nfs4_get_vfs_file+0x634/0x958 [nfsd]
>> kernel: nfsd4_process_open2+0xa40/0x1a40 [nfsd]
>> kernel: nfsd4_open+0xa08/0xe80 [nfsd]
>> kernel: nfsd4_proc_compound+0xb8c/0x2130 [nfsd]
>> kernel: nfsd_dispatch+0x22c/0x718 [nfsd]
>> kernel: svc_process_common+0x8e8/0x1960 [sunrpc]
>> kernel: svc_process+0x3d4/0x7e0 [sunrpc]
>> kernel: svc_handle_xprt+0x828/0xe10 [sunrpc]
>> kernel: svc_recv+0x2cc/0x6a8 [sunrpc]
>> kernel: nfsd+0x270/0x400 [nfsd]
>> kernel: kthread+0x288/0x310
>> kernel: ret_from_fork+0x10/0x20
>>
>> This patch proposes a fix based on adding two new
>> stid sc_status values that help coordinate between the laundromat
>> and other operations (nfsd4_free_stateid() and nfsd4_delegreturn()).
>>
>> First, to make sure that once the stid is marked revoked it is not
>> removed by nfsd4_free_stateid(), the laundromat takes a reference
>> on the stateid. Then, to coordinate whether the stid has been put
>> on the cl_revoked list or we are processing FREE_STATEID and need to
>> make sure to remove it from the list, each side checks that state and
>> acts accordingly. If the laundromat has added the stid to the cl_revoked
>> list before the arrival of FREE_STATEID, then nfsd4_free_stateid() knows
>> to remove it from the list. If nfsd4_free_stateid() finds that the
>> operation arrived before the laundromat has placed the stid on the
>> cl_revoked list, it marks the state freed and the laundromat will then
>> no longer add it to the list.
>>
>> Also, when nfsd4_delegreturn() looks for the specified stid,
>> it needs to access stids that are marked removed or freeable. Such a
>> mark means the laundromat has started processing the stid but hasn't
>> finished, and this delegreturn needs to return nfserr_deleg_revoked and
>> not nfserr_bad_stateid. The latter will not trigger a FREE_STATEID, and
>> the lack of it will leave this stid on the cl_revoked list indefinitely.
>>
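
The coordination described above can also be sketched as a toy state
machine. Again, this is a hedged user-space model, not the patch itself:
TOY_FREEABLE and TOY_FREED are hypothetical names standing in for the two
new sc_status values, and struct toy_stid and the toy_* helpers are
invented for illustration. The point is only that, with the extra
reference and the two flags, either arrival order ends in the same state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag names; stand-ins for the new sc_status values. */
enum {
	TOY_REVOKED  = 1 << 0,
	TOY_FREEABLE = 1 << 1,  /* laundromat has put the stid on cl_revoked */
	TOY_FREED    = 1 << 2,  /* FREE_STATEID has already processed the stid */
};

struct toy_stid {
	unsigned int status;
	int refcount;
	bool on_cl_revoked;
	bool lease_held;
	bool freed;             /* set instead of calling free() */
};

static void toy_put(struct toy_stid *s)
{
	assert(s->refcount > 0);
	if (--s->refcount == 0)
		s->freed = true;
}

/* Laundromat, under the state lock: mark revoked and pin the stid. */
static void toy_mark(struct toy_stid *s)
{
	s->status |= TOY_REVOKED;
	s->refcount++;          /* extra reference held by the laundromat */
}

/* Laundromat, after dropping the state lock. */
static void toy_revoke(struct toy_stid *s)
{
	if (!(s->status & TOY_FREED)) {
		s->on_cl_revoked = true;
		s->status |= TOY_FREEABLE;
	}
	s->lease_held = false;  /* the lease is released in both orderings */
	toy_put(s);             /* drop the laundromat's reference */
}

/* FREE_STATEID from the client. */
static void toy_free_stateid(struct toy_stid *s)
{
	if (!(s->status & TOY_REVOKED))
		return;
	if (s->status & TOY_FREEABLE)
		s->on_cl_revoked = false;  /* take it back off cl_revoked */
	s->status |= TOY_FREED;
	toy_put(s);
}

/* True if the stid ends up freed, off cl_revoked, with no lease left. */
static bool toy_consistent(bool free_first)
{
	struct toy_stid s = { .refcount = 1, .lease_held = true };

	toy_mark(&s);
	if (free_first) {
		toy_free_stateid(&s);
		toy_revoke(&s);
	} else {
		toy_revoke(&s);
		toy_free_stateid(&s);
	}
	return s.freed && !s.lease_held && !s.on_cl_revoked;
}
```

Both toy_consistent(true) and toy_consistent(false) return true in this
model. The real code does the flag checks under locks, which the sketch
leaves out since it runs the two sides sequentially.
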
>> The Linux kernel CVE team has assigned CVE-2024-50106 to this issue.
>>
>>
>> Affected and fixed versions
>> ===========================
>>
>> Issue introduced in 3.17 with commit 2d4a532d385f and fixed in
>> 6.11.6 with commit 967faa26f313
>> Issue introduced in 3.17 with commit 2d4a532d385f and fixed in
>> 6.12-rc5 with commit 8dd91e8d31fe
>>
>> Please see https://www.kernel.org for a full list of currently supported
>> kernel versions by the kernel community.
>>
>> Unaffected versions might change over time as fixes are backported to
>> older supported kernel versions. The official CVE entry at
>> https://cve.org/CVERecord/?id=CVE-2024-50106
>> will be updated if fixes are backported, please check that for the most
>> up to date information about this issue.
>>
>>
>> Affected files
>> ==============
>>
>> The file(s) affected by this issue are:
>> fs/nfsd/nfs4state.c
>> fs/nfsd/state.h
>>
>>
>> Mitigation
>> ==========
>>
>> The Linux kernel CVE team recommends that you update to the latest
>> stable kernel version for this, and many other bugfixes. Individual
>> changes are never tested alone, but rather are part of a larger kernel
>> release. Cherry-picking individual commits is not recommended or
>> supported by the Linux kernel community at all. If however, updating to
>> the latest release is impossible, the individual changes to resolve this
>> issue can be found at these commits:
>> https://git.kernel.org/stable/c/967faa26f313a62e7bebc55d5b8122eaee43b929
>> https://git.kernel.org/stable/c/8dd91e8d31febf4d9cca3ae1bb4771d33ae7ee5a
--
Chuck Lever
Thread overview: 5+ messages
[not found] <2024110553-CVE-2024-50106-c095@gregkh>
2024-12-17 15:30 ` CVE-2024-50106: nfsd: fix race between laundromat and free_stateid Li Lingfeng
2024-12-17 15:59 ` Greg Kroah-Hartman
2024-12-18 1:55 ` Chuck Lever
2025-01-06 16:27 ` Greg Kroah-Hartman
2024-12-17 18:25 ` Chuck Lever [this message]