* [PATCH] nfsd: free nfsd_file by gc after adding it to lru list
@ 2025-01-13 2:59 Li Lingfeng
2025-01-13 14:07 ` Chuck Lever
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Li Lingfeng @ 2025-01-13 2:59 UTC (permalink / raw)
To: chuck.lever, jlayton, neilb, okorniev, Dai.Ngo, tom
Cc: linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun,
lilingfeng, lilingfeng3
In nfsd_file_put, after inserting the nfsd_file into the nfsd_file_lru
list, gc may be triggered in another thread and immediately release this
nfsd_file, which will lead to a UAF when accessing this nfsd_file again.
All the places where unhash is done will also perform lru_remove, so there
is no need to do lru_remove separately here. After inserting the nfsd_file
into the nfsd_file_lru list, it can be released by relying on gc.
Fixes: 4a0e73e635e3 ("NFSD: Leave open files out of the filecache LRU")
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
---
fs/nfsd/filecache.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index a1cdba42c4fa..37b65cb1579a 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -372,18 +372,10 @@ nfsd_file_put(struct nfsd_file *nf)
/* Try to add it to the LRU. If that fails, decrement. */
if (nfsd_file_lru_add(nf)) {
/* If it's still hashed, we're done */
- if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
+ if (list_lru_count(&nfsd_file_lru))
nfsd_file_schedule_laundrette();
- return;
- }
- /*
- * We're racing with unhashing, so try to remove it from
- * the LRU. If removal fails, then someone else already
- * has our reference.
- */
- if (!nfsd_file_lru_remove(nf))
- return;
+ return;
}
}
if (refcount_dec_and_test(&nf->nf_ref))
--
2.31.1
^ permalink raw reply related [flat|nested] 17+ messages in thread* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-13 2:59 [PATCH] nfsd: free nfsd_file by gc after adding it to lru list Li Lingfeng @ 2025-01-13 14:07 ` Chuck Lever 2025-01-14 1:54 ` Li Lingfeng 2025-01-14 19:27 ` Jeff Layton 2025-01-14 19:40 ` cel 2 siblings, 1 reply; 17+ messages in thread From: Chuck Lever @ 2025-01-13 14:07 UTC (permalink / raw) To: Li Lingfeng, jlayton, neilb, okorniev, Dai.Ngo, tom Cc: linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun, lilingfeng Hello! On 1/12/25 9:59 PM, Li Lingfeng wrote: > In nfsd_file_put, after inserting the nfsd_file into the nfsd_file_lru > list, gc may be triggered in another thread and immediately release this > nfsd_file, which will lead to a UAF when accessing this nfsd_file again. Do you happen to have a reproducer that can trigger this issue? > All the places where unhash is done will also perform lru_remove, so there > is no need to do lru_remove separately here. After inserting the nfsd_file > into the nfsd_file_lru list, it can be released by relying on gc. > > Fixes: 4a0e73e635e3 ("NFSD: Leave open files out of the filecache LRU") The code that is being replaced below was introduced in ac3a2585f018 ("nfsd: rework refcounting in filecache"). This fix won't apply to kernels that have only 4a0e73e635e3 but not ac3a2585f018, for instance. At the very least we need to add "Cc: stable@vger.kernel.org # v6.2" but I'm not convinced "Fixes: 4a0e73e635e3" is correct. > Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> > --- > fs/nfsd/filecache.c | 12 ++---------- > 1 file changed, 2 insertions(+), 10 deletions(-) > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > index a1cdba42c4fa..37b65cb1579a 100644 > --- a/fs/nfsd/filecache.c > +++ b/fs/nfsd/filecache.c > @@ -372,18 +372,10 @@ nfsd_file_put(struct nfsd_file *nf) > /* Try to add it to the LRU. If that fails, decrement. */ > if (nfsd_file_lru_add(nf)) { > /* If it's still hashed, we're done */ > - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { > + if (list_lru_count(&nfsd_file_lru)) > nfsd_file_schedule_laundrette(); > - return; > - } The above change does not seem to be related to the fix described in the patch description. Can you help me understand why this is needed? > - /* > - * We're racing with unhashing, so try to remove it from > - * the LRU. If removal fails, then someone else already > - * has our reference. > - */ > - if (!nfsd_file_lru_remove(nf)) > - return; > + return; > } > } > if (refcount_dec_and_test(&nf->nf_ref)) -- Chuck Lever ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-13 14:07 ` Chuck Lever @ 2025-01-14 1:54 ` Li Lingfeng 2025-01-14 19:17 ` Chuck Lever 0 siblings, 1 reply; 17+ messages in thread From: Li Lingfeng @ 2025-01-14 1:54 UTC (permalink / raw) To: Chuck Lever, jlayton, neilb, okorniev, Dai.Ngo, tom Cc: linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun, lilingfeng 在 2025/1/13 22:07, Chuck Lever 写道: > Hello! > > On 1/12/25 9:59 PM, Li Lingfeng wrote: >> In nfsd_file_put, after inserting the nfsd_file into the nfsd_file_lru >> list, gc may be triggered in another thread and immediately release this >> nfsd_file, which will lead to a UAF when accessing this nfsd_file again. > > Do you happen to have a reproducer that can trigger this issue? > Hi, thanks for your reply. Here is the reproducer: Patch: diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c index 585163b4e11c..8a8245b0ce32 100644 --- a/fs/nfsd/filecache.c +++ b/fs/nfsd/filecache.c @@ -356,6 +356,7 @@ nfsd_file_get(struct nfsd_file *nf) void nfsd_file_put(struct nfsd_file *nf) { + static int count = 0; might_sleep(); trace_nfsd_file_put(nf); @@ -370,6 +371,11 @@ nfsd_file_put(struct nfsd_file *nf) /* Try to add it to the LRU. If that fails, decrement. */ if (nfsd_file_lru_add(nf)) { + if (count++ % 3 == 0) { + printk("%s %d sleep before test...\n", __func__, __LINE__); + msleep(5000); + printk("%s %d sleep done\n", __func__, __LINE__); + } /* If it's still hashed, we're done */ if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { nfsd_file_schedule_laundrette(); Steps: mkfs.ext4 -F /dev/sdb mount /dev/sdb /mnt/sdb echo "/mnt *(rw,no_root_squash,fsid=0)" > /etc/exports echo "/mnt/sdb *(rw,no_root_squash,fsid=1)" >> /etc/exports systemctl restart nfs-server for i in {1..10}; do filename="file$i.txt"; echo "123" > /mnt/sdb/"$filename"; done mount -t nfs -o rw,relatime,vers=3 127.0.0.1:/mnt/sdb /mnt/sdbb sh test.sh test.sh: for i in {1..10}; do filename="file$i.txt" cat /mnt/sdbb/"$filename" & done [ 205.114996][ T2409] ================================================================== [ 205.115006][ T2409] BUG: KASAN: slab-use-after-free in nfsd_file_put+0x190/0x3b0 [ 205.115055][ T2409] Read of size 8 at addr ffff88810798f3e8 by task nfsd/2409 [ 205.115062][ T2409] [ 205.115068][ T2409] CPU: 1 UID: 0 PID: 2409 Comm: nfsd Not tainted 6.13.0-rc6-00038-g09a0fa92e5b4-dirty #83 [ 205.115078][ T2409] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 123[ 205.115085][ T2409] Call Trace: [ 205.115095][ T2409] <TASK> [ 205.115101][ T2409] dump_stack_lvl+0x68/0xa0 [ 205.115117][ T2409] print_address_description.constprop.0+0x2c/0x3d0 [ 205.115132][ T2409] ? nfsd_file_put+0x190/0x3b0 [ 205.115140][ T2409] print_report+0xb3/0x270 123[ 205.115147][ T2409] ? kasan_addr_to_slab+0xd/0xa0 [ 205.115160][ T2409] kasan_report+0x93/0xc0 [ 205.115168][ T2409] ? nfsd_file_put+0x190/0x3b0 [ 205.115196][ T2409] kasan_check_range+0xf6/0x1b0 [ 205.115207][ T2409] nfsd_file_put+0x190/0x3b0 [ 205.115217][ T2409] nfsd_read+0x151/0x3b0 [ 205.115229][ T2409] ? __pfx_nfsd_read+0x10/0x10 [ 205.115238][ T2409] ? lock_acquire+0x15c/0x3f0 [ 205.115249][ T2409] ? nfsd_init_request+0x6b/0x300 [ 205.115259][ T2409] ? rcu_is_watching+0x20/0x50 [ 205.115272][ T2409] nfsd3_proc_read+0x1c9/0x280 [ 205.115285][ T2409] nfsd_dispatch+0x1b7/0x4c0 [ 205.115294][ T2409] ? __pfx_nfsd_dispatch+0x10/0x10 [ 205.115305][ T2409] ? __asan_memset+0x24/0x50 [ 205.115315][ T2409] ? rcu_is_watching+0x20/0x50 [ 205.115325][ T2409] svc_process_common+0x615/0xd20 [ 205.115346][ T2409] ? __pfx_svc_process_common+0x10/0x10 [ 205.115359][ T2409] ? __pfx_nfsd_dispatch+0x10/0x10 [ 205.115374][ T2409] ? mark_held_locks+0x24/0x90 [ 205.115388][ T2409] ? xdr_init_decode+0x11d/0x190 [ 205.115410][ T2409] svc_process+0x2a8/0x430 [ 205.115435][ T2409] svc_handle_xprt+0x71c/0xb40 [ 205.115458][ T2409] svc_recv+0x5f1/0x9b0 [ 205.115477][ T2409] ? svc_recv+0xab/0x9b0 [ 205.115501][ T2409] nfsd+0x1e7/0x390 [ 205.115522][ T2409] ? __pfx_nfsd+0x10/0x10 [ 205.115536][ T2409] kthread+0x1a7/0x1f0 [ 205.115551][ T2409] ? kthread+0xfb/0x1f0 [ 205.139288][ T2409] ? __pfx_kthread+0x10/0x10 [ 205.139769][ T2409] ret_from_fork+0x34/0x60 [ 205.140238][ T2409] ? __pfx_kthread+0x10/0x10 [ 205.140701][ T2409] ret_from_fork_asm+0x1a/0x30 [ 205.141226][ T2409] </TASK> [ 205.141548][ T2409] [ 205.141809][ T2409] Allocated by task 2409: [ 205.142254][ T2409] kasan_save_stack+0x24/0x50 [ 205.142726][ T2409] kasan_save_track+0x14/0x30 [ 205.143214][ T2409] __kasan_slab_alloc+0x87/0x90 [ 205.143694][ T2409] kmem_cache_alloc_noprof+0x162/0x4a0 [ 205.144268][ T2409] nfsd_file_do_acquire+0x3a2/0x1420 [ 205.144812][ T2409] nfsd_file_acquire_gc+0x5a/0x80 [ 205.145338][ T2409] nfsd_read+0xc6/0x3b0 [ 205.145766][ T2409] nfsd3_proc_read+0x1c9/0x280 [ 205.146226][ T2409] nfsd_dispatch+0x1b7/0x4c0 [ 205.146682][ T2409] svc_process_common+0x615/0xd20 [ 205.147212][ T2409] svc_process+0x2a8/0x430 [ 205.147652][ T2409] svc_handle_xprt+0x71c/0xb40 [ 205.148140][ T2409] svc_recv+0x5f1/0x9b0 [ 205.148569][ T2409] nfsd+0x1e7/0x390 [ 205.148966][ T2409] kthread+0x1a7/0x1f0 [ 205.149388][ T2409] ret_from_fork+0x34/0x60 [ 205.149838][ T2409] ret_from_fork_asm+0x1a/0x30 [ 205.150330][ T2409] [ 205.150646][ T2409] Freed by task 0: [ 205.151120][ T2409] kasan_save_stack+0x24/0x50 [ 205.151778][ T2409] kasan_save_track+0x14/0x30 [ 205.152409][ T2409] kasan_save_free_info+0x3a/0x60 [ 205.153073][ T2409] __kasan_slab_free+0x54/0x70 [ 205.153677][ T2409] kmem_cache_free+0x159/0x5f0 [ 205.154169][ T2409] rcu_do_batch+0x311/0x900 [ 205.154643][ T2409] rcu_core+0x58c/0x6f0 [ 205.155065][ T2409] handle_softirqs+0xf8/0x570 [ 205.155526][ T2409] irq_exit_rcu+0x141/0x160 [ 205.156004][ T2409] sysvec_apic_timer_interrupt+0x76/0x90 [ 205.156575][ T2409] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 205.157196][ T2409] [ 205.157416][ T2409] Last potentially related work creation: [ 205.157986][ T2409] kasan_save_stack+0x24/0x50 [ 205.158404][ T2409] __kasan_record_aux_stack+0xa6/0xc0 [ 205.158893][ T2409] __call_rcu_common.constprop.0+0xa5/0x8f0 [ 205.159515][ T2409] nfsd_file_dispose_list+0x93/0xc0 [ 205.160067][ T2409] nfsd_file_net_dispose+0x1ad/0x1f0 [ 205.160613][ T2409] nfsd+0x1ef/0x390 [ 205.161021][ T2409] kthread+0x1a7/0x1f0 [ 205.161451][ T2409] ret_from_fork+0x34/0x60 [ 205.161908][ T2409] ret_from_fork_asm+0x1a/0x30 [ 205.162407][ T2409] [ 205.162663][ T2409] The buggy address belongs to the object at ffff88810798f3b8 [ 205.162663][ T2409] which belongs to the cache nfsd_file of size 128 [ 205.164061][ T2409] The buggy address is located 48 bytes inside of [ 205.164061][ T2409] freed 128-byte region [ffff88810798f3b8, ffff88810798f438) [ 205.165446][ T2409] [ 205.165678][ T2409] The buggy address belongs to the physical page: [ 205.166340][ T2409] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88810798f4a8 pfn:0x10798e [ 205.167356][ T2409] head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 205.168204][ T2409] flags: 0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff) [ 205.169071][ T2409] page_type: f5(slab) [ 205.169483][ T2409] raw: 0017ffffc0000240 ffff888443a06500 ffff8881214b4bc8 ffff8881214b4bc8 [ 205.170431][ T2409] raw: ffff88810798f4a8 0000000000220005 00000001f5000000 0000000000000000 [ 205.171554][ T2409] head: 0017ffffc0000240 ffff888443a06500 ffff8881214b4bc8 ffff8881214b4bc8 [ 205.172630][ T2409] head: ffff88810798f4a8 0000000000220005 00000001f5000000 0000000000000000 [ 205.173509][ T2409] head: 0017ffffc0000001 ffffea00041e6381 ffffffffffffffff 0000000000000000 [ 205.174587][ T2409] head: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000 [ 205.175538][ T2409] page dumped because: kasan: bad access detected [ 205.176193][ T2409] [ 205.176427][ T2409] Memory state around the buggy address: [ 205.176997][ T2409] ffff88810798f280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 205.177810][ T2409] ffff88810798f300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 205.178601][ T2409] >ffff88810798f380: fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb fb [ 205.179420][ T2409] ^ [ 205.180190][ T2409] ffff88810798f400: fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc [ 205.181000][ T2409] ffff88810798f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 205.181844][ T2409] ================================================================== [ 205.182677][ T2409] Disabling lock debugging due to kernel taint > >> All the places where unhash is done will also perform lru_remove, so >> there >> is no need to do lru_remove separately here. After inserting the >> nfsd_file >> into the nfsd_file_lru list, it can be released by relying on gc. >> >> Fixes: 4a0e73e635e3 ("NFSD: Leave open files out of the filecache LRU") > > The code that is being replaced below was introduced in ac3a2585f018 > ("nfsd: rework refcounting in filecache"). This fix won't apply to > kernels that have only 4a0e73e635e3 but not ac3a2585f018, for instance. > > At the very least we need to add "Cc: stable@vger.kernel.org # v6.2" but > I'm not convinced "Fixes: 4a0e73e635e3" is correct. The replaced code is indeed introduced by ac3a2585f018 ("nfsd: rework refcounting in filecache"). However, commit 4a0e73e635e3 ("NFSD: Leave open files out of the filecache LRU") added the nfsd_file_lru_add operation in nfsd_file_put, which enables concurrent removal of the current nfsd_file by gc, potentially triggering UAF when accessing nfsd_file below. Therefore, I use 4a0e73e635e3 as the fix tag. > > >> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> >> --- >> fs/nfsd/filecache.c | 12 ++---------- >> 1 file changed, 2 insertions(+), 10 deletions(-) >> >> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c >> index a1cdba42c4fa..37b65cb1579a 100644 >> --- a/fs/nfsd/filecache.c >> +++ b/fs/nfsd/filecache.c >> @@ -372,18 +372,10 @@ nfsd_file_put(struct nfsd_file *nf) >> /* Try to add it to the LRU. If that fails, decrement. */ >> if (nfsd_file_lru_add(nf)) { >> /* If it's still hashed, we're done */ >> - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { >> + if (list_lru_count(&nfsd_file_lru)) >> nfsd_file_schedule_laundrette(); >> - return; >> - } > > The above change does not seem to be related to the fix described > in the patch description. Can you help me understand why this is > needed? Firstly, I removed the access operations on nfsd_file, including checking nf->nf_flags and nfsd_file_lru_remove, to prevent triggering UAF after concurrent release of nfsd_file by gc. Secondly, I kept nfsd_file_schedule_laundrette and used list_lru_count to determine whether to trigger gc, with the aim of initiating gc and reclaiming nfsd_file as soon as possible even if no other process triggers gc. > > >> - /* >> - * We're racing with unhashing, so try to remove it from >> - * the LRU. If removal fails, then someone else already >> - * has our reference. >> - */ >> - if (!nfsd_file_lru_remove(nf)) >> - return; >> + return; >> } >> } >> if (refcount_dec_and_test(&nf->nf_ref)) > > ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-14 1:54 ` Li Lingfeng @ 2025-01-14 19:17 ` Chuck Lever 0 siblings, 0 replies; 17+ messages in thread From: Chuck Lever @ 2025-01-14 19:17 UTC (permalink / raw) To: Li Lingfeng, jlayton Cc: linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun, lilingfeng, neilb, okorniev, Dai.Ngo, tom On 1/13/25 8:54 PM, Li Lingfeng wrote: > > 在 2025/1/13 22:07, Chuck Lever 写道: >> Hello! >> >> On 1/12/25 9:59 PM, Li Lingfeng wrote: >>> In nfsd_file_put, after inserting the nfsd_file into the nfsd_file_lru >>> list, gc may be triggered in another thread and immediately release this >>> nfsd_file, which will lead to a UAF when accessing this nfsd_file again. >> >> Do you happen to have a reproducer that can trigger this issue? >> > Hi, thanks for your reply. > > Here is the reproducer: > > Patch: > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > > index 585163b4e11c..8a8245b0ce32 100644 > --- a/fs/nfsd/filecache.c > +++ b/fs/nfsd/filecache.c > @@ -356,6 +356,7 @@ nfsd_file_get(struct nfsd_file *nf) > void > nfsd_file_put(struct nfsd_file *nf) > { > + static int count = 0; > might_sleep(); > trace_nfsd_file_put(nf); > > @@ -370,6 +371,11 @@ nfsd_file_put(struct nfsd_file *nf) > > /* Try to add it to the LRU. If that fails, decrement. */ > if (nfsd_file_lru_add(nf)) { > + if (count++ % 3 == 0) { > + printk("%s %d sleep before test...\n", __func__, __LINE__); > + msleep(5000); This will certainly misbehave because garbage collection evicts files after 2 seconds, and here the reproducer waits /5/ seconds. But I can see how it is unsafe to continue dereferencing @nf after the reference count is dropped. Jeff, any comments about this failure mode or the proposed fix? More below. > + printk("%s %d sleep done\n", __func__, __LINE__); > + } > /* If it's still hashed, we're done */ > if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { > nfsd_file_schedule_laundrette(); > > Steps: > mkfs.ext4 -F /dev/sdb > mount /dev/sdb /mnt/sdb > echo "/mnt *(rw,no_root_squash,fsid=0)" > /etc/exports > echo "/mnt/sdb *(rw,no_root_squash,fsid=1)" >> /etc/exports > systemctl restart nfs-server > > for i in {1..10}; do filename="file$i.txt"; echo "123" > /mnt/ > sdb/"$filename"; done > mount -t nfs -o rw,relatime,vers=3 127.0.0.1:/mnt/sdb /mnt/sdbb > sh test.sh > > test.sh: > for i in {1..10}; do > filename="file$i.txt" > cat /mnt/sdbb/"$filename" & > done > > [ 205.114996][ T2409] > ================================================================== > > [ 205.115006][ T2409] BUG: KASAN: slab-use-after-free in > nfsd_file_put+0x190/0x3b0 > [ 205.115055][ T2409] Read of size 8 at addr ffff88810798f3e8 by task > nfsd/2409 > [ 205.115062][ T2409] > [ 205.115068][ T2409] CPU: 1 UID: 0 PID: 2409 Comm: nfsd Not tainted > 6.13.0-rc6-00038-g09a0fa92e5b4-dirty #83 > [ 205.115078][ T2409] Hardware name: QEMU Standard PC (i440FX + PIIX, > 1996), BIOS ?-20190727_073836-buildvm- > ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 > 123[ 205.115085][ T2409] Call Trace: > [ 205.115095][ T2409] <TASK> > > [ 205.115101][ T2409] dump_stack_lvl+0x68/0xa0 > [ 205.115117][ T2409] print_address_description.constprop.0+0x2c/0x3d0 > [ 205.115132][ T2409] ? nfsd_file_put+0x190/0x3b0 > [ 205.115140][ T2409] print_report+0xb3/0x270 > 123[ 205.115147][ T2409] ? kasan_addr_to_slab+0xd/0xa0 > [ 205.115160][ T2409] kasan_report+0x93/0xc0 > > [ 205.115168][ T2409] ? nfsd_file_put+0x190/0x3b0 > [ 205.115196][ T2409] kasan_check_range+0xf6/0x1b0 > [ 205.115207][ T2409] nfsd_file_put+0x190/0x3b0 > [ 205.115217][ T2409] nfsd_read+0x151/0x3b0 > [ 205.115229][ T2409] ? __pfx_nfsd_read+0x10/0x10 > [ 205.115238][ T2409] ? lock_acquire+0x15c/0x3f0 > [ 205.115249][ T2409] ? nfsd_init_request+0x6b/0x300 > [ 205.115259][ T2409] ? rcu_is_watching+0x20/0x50 > [ 205.115272][ T2409] nfsd3_proc_read+0x1c9/0x280 > [ 205.115285][ T2409] nfsd_dispatch+0x1b7/0x4c0 > [ 205.115294][ T2409] ? __pfx_nfsd_dispatch+0x10/0x10 > [ 205.115305][ T2409] ? __asan_memset+0x24/0x50 > [ 205.115315][ T2409] ? rcu_is_watching+0x20/0x50 > [ 205.115325][ T2409] svc_process_common+0x615/0xd20 > [ 205.115346][ T2409] ? __pfx_svc_process_common+0x10/0x10 > [ 205.115359][ T2409] ? __pfx_nfsd_dispatch+0x10/0x10 > [ 205.115374][ T2409] ? mark_held_locks+0x24/0x90 > [ 205.115388][ T2409] ? xdr_init_decode+0x11d/0x190 > [ 205.115410][ T2409] svc_process+0x2a8/0x430 > [ 205.115435][ T2409] svc_handle_xprt+0x71c/0xb40 > [ 205.115458][ T2409] svc_recv+0x5f1/0x9b0 > [ 205.115477][ T2409] ? svc_recv+0xab/0x9b0 > [ 205.115501][ T2409] nfsd+0x1e7/0x390 > [ 205.115522][ T2409] ? __pfx_nfsd+0x10/0x10 > [ 205.115536][ T2409] kthread+0x1a7/0x1f0 > [ 205.115551][ T2409] ? kthread+0xfb/0x1f0 > [ 205.139288][ T2409] ? __pfx_kthread+0x10/0x10 > [ 205.139769][ T2409] ret_from_fork+0x34/0x60 > [ 205.140238][ T2409] ? __pfx_kthread+0x10/0x10 > [ 205.140701][ T2409] ret_from_fork_asm+0x1a/0x30 > [ 205.141226][ T2409] </TASK> > [ 205.141548][ T2409] > [ 205.141809][ T2409] Allocated by task 2409: > [ 205.142254][ T2409] kasan_save_stack+0x24/0x50 > [ 205.142726][ T2409] kasan_save_track+0x14/0x30 > [ 205.143214][ T2409] __kasan_slab_alloc+0x87/0x90 > [ 205.143694][ T2409] kmem_cache_alloc_noprof+0x162/0x4a0 > [ 205.144268][ T2409] nfsd_file_do_acquire+0x3a2/0x1420 > [ 205.144812][ T2409] nfsd_file_acquire_gc+0x5a/0x80 > [ 205.145338][ T2409] nfsd_read+0xc6/0x3b0 > [ 205.145766][ T2409] nfsd3_proc_read+0x1c9/0x280 > [ 205.146226][ T2409] nfsd_dispatch+0x1b7/0x4c0 > [ 205.146682][ T2409] svc_process_common+0x615/0xd20 > [ 205.147212][ T2409] svc_process+0x2a8/0x430 > [ 205.147652][ T2409] svc_handle_xprt+0x71c/0xb40 > [ 205.148140][ T2409] svc_recv+0x5f1/0x9b0 > [ 205.148569][ T2409] nfsd+0x1e7/0x390 > [ 205.148966][ T2409] kthread+0x1a7/0x1f0 > [ 205.149388][ T2409] ret_from_fork+0x34/0x60 > [ 205.149838][ T2409] ret_from_fork_asm+0x1a/0x30 > [ 205.150330][ T2409] > [ 205.150646][ T2409] Freed by task 0: > [ 205.151120][ T2409] kasan_save_stack+0x24/0x50 > [ 205.151778][ T2409] kasan_save_track+0x14/0x30 > [ 205.152409][ T2409] kasan_save_free_info+0x3a/0x60 > [ 205.153073][ T2409] __kasan_slab_free+0x54/0x70 > [ 205.153677][ T2409] kmem_cache_free+0x159/0x5f0 > [ 205.154169][ T2409] rcu_do_batch+0x311/0x900 > [ 205.154643][ T2409] rcu_core+0x58c/0x6f0 > [ 205.155065][ T2409] handle_softirqs+0xf8/0x570 > [ 205.155526][ T2409] irq_exit_rcu+0x141/0x160 > [ 205.156004][ T2409] sysvec_apic_timer_interrupt+0x76/0x90 > [ 205.156575][ T2409] asm_sysvec_apic_timer_interrupt+0x1a/0x20 > [ 205.157196][ T2409] > [ 205.157416][ T2409] Last potentially related work creation: > [ 205.157986][ T2409] kasan_save_stack+0x24/0x50 > [ 205.158404][ T2409] __kasan_record_aux_stack+0xa6/0xc0 > [ 205.158893][ T2409] __call_rcu_common.constprop.0+0xa5/0x8f0 > [ 205.159515][ T2409] nfsd_file_dispose_list+0x93/0xc0 > [ 205.160067][ T2409] nfsd_file_net_dispose+0x1ad/0x1f0 > [ 205.160613][ T2409] nfsd+0x1ef/0x390 > [ 205.161021][ T2409] kthread+0x1a7/0x1f0 > [ 205.161451][ T2409] ret_from_fork+0x34/0x60 > [ 205.161908][ T2409] ret_from_fork_asm+0x1a/0x30 > [ 205.162407][ T2409] > [ 205.162663][ T2409] The buggy address belongs to the object at > ffff88810798f3b8 > [ 205.162663][ T2409] which belongs to the cache nfsd_file of size 128 > [ 205.164061][ T2409] The buggy address is located 48 bytes inside of > [ 205.164061][ T2409] freed 128-byte region [ffff88810798f3b8, > ffff88810798f438) > [ 205.165446][ T2409] > [ 205.165678][ T2409] The buggy address belongs to the physical page: > [ 205.166340][ T2409] page: refcount:1 mapcount:0 > mapping:0000000000000000 index:0xffff88810798f4a8 pfn:0x10798e > [ 205.167356][ T2409] head: order:1 mapcount:0 entire_mapcount:0 > nr_pages_mapped:0 pincount:0 > [ 205.168204][ T2409] flags: 0x17ffffc0000240(workingset|head|node=0| > zone=2|lastcpupid=0x1fffff) > [ 205.169071][ T2409] page_type: f5(slab) > [ 205.169483][ T2409] raw: 0017ffffc0000240 ffff888443a06500 > ffff8881214b4bc8 ffff8881214b4bc8 > [ 205.170431][ T2409] raw: ffff88810798f4a8 0000000000220005 > 00000001f5000000 0000000000000000 > [ 205.171554][ T2409] head: 0017ffffc0000240 ffff888443a06500 > ffff8881214b4bc8 ffff8881214b4bc8 > [ 205.172630][ T2409] head: ffff88810798f4a8 0000000000220005 > 00000001f5000000 0000000000000000 > [ 205.173509][ T2409] head: 0017ffffc0000001 ffffea00041e6381 > ffffffffffffffff 0000000000000000 > [ 205.174587][ T2409] head: 0000000000000002 0000000000000000 > 00000000ffffffff 0000000000000000 > [ 205.175538][ T2409] page dumped because: kasan: bad access detected > [ 205.176193][ T2409] > [ 205.176427][ T2409] Memory state around the buggy address: > [ 205.176997][ T2409] ffff88810798f280: fc fc fc fc fc fc fc fc fc fc > fc fc fc fc fc fc > [ 205.177810][ T2409] ffff88810798f300: fc fc fc fc fc fc fc fc fc fc > fc fc fc fc fc fc > [ 205.178601][ T2409] >ffff88810798f380: fc fc fc fc fc fc fc fa fb fb > fb fb fb fb fb fb > [ 205.179420] > [ T2409] ^ > [ 205.180190][ T2409] ffff88810798f400: fb fb fb fb fb fb fb fc fc fc > fc fc fc fc fc fc > [ 205.181000][ T2409] ffff88810798f480: fc fc fc fc fc fc fc fc fc fc > fc fc fc fc fc fc > [ 205.181844][ T2409] > ================================================================== > [ 205.182677][ T2409] Disabling lock debugging due to kernel taint > >> >>> All the places where unhash is done will also perform lru_remove, so >>> there >>> is no need to do lru_remove separately here. After inserting the >>> nfsd_file >>> into the nfsd_file_lru list, it can be released by relying on gc. >>> >>> Fixes: 4a0e73e635e3 ("NFSD: Leave open files out of the filecache LRU") >> >> The code that is being replaced below was introduced in ac3a2585f018 >> ("nfsd: rework refcounting in filecache"). This fix won't apply to >> kernels that have only 4a0e73e635e3 but not ac3a2585f018, for instance. >> >> At the very least we need to add "Cc: stable@vger.kernel.org # v6.2" but >> I'm not convinced "Fixes: 4a0e73e635e3" is correct. > The replaced code is indeed introduced by ac3a2585f018 ("nfsd: rework > refcounting in filecache"). > However, commit 4a0e73e635e3 ("NFSD: Leave open files out of the filecache > LRU") added the nfsd_file_lru_add operation in nfsd_file_put, which > enables concurrent removal of the current nfsd_file by gc, potentially > triggering UAF when accessing nfsd_file below. > Therefore, I use 4a0e73e635e3 as the fix tag. >> >> >>> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> >>> --- >>> fs/nfsd/filecache.c | 12 ++---------- >>> 1 file changed, 2 insertions(+), 10 deletions(-) >>> >>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c >>> index a1cdba42c4fa..37b65cb1579a 100644 >>> --- a/fs/nfsd/filecache.c >>> +++ b/fs/nfsd/filecache.c >>> @@ -372,18 +372,10 @@ nfsd_file_put(struct nfsd_file *nf) >>> /* Try to add it to the LRU. If that fails, decrement. */ >>> if (nfsd_file_lru_add(nf)) { >>> /* If it's still hashed, we're done */ >>> - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { >>> + if (list_lru_count(&nfsd_file_lru)) >>> nfsd_file_schedule_laundrette(); >>> - return; >>> - } >> >> The above change does not seem to be related to the fix described >> in the patch description. Can you help me understand why this is >> needed? > Firstly, I removed the access operations on nfsd_file, including checking > nf->nf_flags and nfsd_file_lru_remove, to prevent triggering UAF after > concurrent release of nfsd_file by gc. Understood and agreed. > Secondly, I kept nfsd_file_schedule_laundrette and used list_lru_count to > determine whether to trigger gc, with the aim of initiating gc and > reclaiming nfsd_file as soon as possible even if no other process triggers > gc. The list_lru_count() remains non-zero for extended periods which might cause the proposed code to trigger the laundrette more often than is optimal. >>> - /* >>> - * We're racing with unhashing, so try to remove it from >>> - * the LRU. If removal fails, then someone else already >>> - * has our reference. >>> - */ >>> - if (!nfsd_file_lru_remove(nf)) >>> - return; >>> + return; >>> } >>> } >>> if (refcount_dec_and_test(&nf->nf_ref)) >> >> -- Chuck Lever ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-13 2:59 [PATCH] nfsd: free nfsd_file by gc after adding it to lru list Li Lingfeng 2025-01-13 14:07 ` Chuck Lever @ 2025-01-14 19:27 ` Jeff Layton 2025-01-14 19:39 ` Jeff Layton 2025-01-14 19:40 ` cel 2 siblings, 1 reply; 17+ messages in thread From: Jeff Layton @ 2025-01-14 19:27 UTC (permalink / raw) To: Li Lingfeng, chuck.lever, neilb, okorniev, Dai.Ngo, tom Cc: linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun, lilingfeng On Mon, 2025-01-13 at 10:59 +0800, Li Lingfeng wrote: > In nfsd_file_put, after inserting the nfsd_file into the nfsd_file_lru > list, gc may be triggered in another thread and immediately release this > nfsd_file, which will lead to a UAF when accessing this nfsd_file again. > > All the places where unhash is done will also perform lru_remove, so there > is no need to do lru_remove separately here. After inserting the nfsd_file > into the nfsd_file_lru list, it can be released by relying on gc. > > Fixes: 4a0e73e635e3 ("NFSD: Leave open files out of the filecache LRU") > Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> > --- > fs/nfsd/filecache.c | 12 ++---------- > 1 file changed, 2 insertions(+), 10 deletions(-) > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > index a1cdba42c4fa..37b65cb1579a 100644 > --- a/fs/nfsd/filecache.c > +++ b/fs/nfsd/filecache.c > @@ -372,18 +372,10 @@ nfsd_file_put(struct nfsd_file *nf) > /* Try to add it to the LRU. If that fails, decrement. */ > if (nfsd_file_lru_add(nf)) { > /* If it's still hashed, we're done */ > - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { > + if (list_lru_count(&nfsd_file_lru)) > nfsd_file_schedule_laundrette(); > - return; > - } > > - /* > - * We're racing with unhashing, so try to remove it from > - * the LRU. If removal fails, then someone else already > - * has our reference. > - */ > - if (!nfsd_file_lru_remove(nf)) > - return; > + return; > } > } > if (refcount_dec_and_test(&nf->nf_ref)) I think this looks OK. Filecache bugs are particularly nasty though, so let's run this through a nice long testing cycle. Reviewed-by: Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-14 19:27 ` Jeff Layton @ 2025-01-14 19:39 ` Jeff Layton 2025-01-15 15:03 ` Chuck Lever 0 siblings, 1 reply; 17+ messages in thread From: Jeff Layton @ 2025-01-14 19:39 UTC (permalink / raw) To: Li Lingfeng, chuck.lever, neilb, okorniev, Dai.Ngo, tom Cc: linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun, lilingfeng On Tue, 2025-01-14 at 14:27 -0500, Jeff Layton wrote: > On Mon, 2025-01-13 at 10:59 +0800, Li Lingfeng wrote: > > In nfsd_file_put, after inserting the nfsd_file into the nfsd_file_lru > > list, gc may be triggered in another thread and immediately release this > > nfsd_file, which will lead to a UAF when accessing this nfsd_file again. > > > > All the places where unhash is done will also perform lru_remove, so there > > is no need to do lru_remove separately here. After inserting the nfsd_file > > into the nfsd_file_lru list, it can be released by relying on gc. > > > > Fixes: 4a0e73e635e3 ("NFSD: Leave open files out of the filecache LRU") > > Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> > > --- > > fs/nfsd/filecache.c | 12 ++---------- > > 1 file changed, 2 insertions(+), 10 deletions(-) > > > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > > index a1cdba42c4fa..37b65cb1579a 100644 > > --- a/fs/nfsd/filecache.c > > +++ b/fs/nfsd/filecache.c > > @@ -372,18 +372,10 @@ nfsd_file_put(struct nfsd_file *nf) > > /* Try to add it to the LRU. If that fails, decrement. */ > > if (nfsd_file_lru_add(nf)) { > > /* If it's still hashed, we're done */ > > - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { > > + if (list_lru_count(&nfsd_file_lru)) > > nfsd_file_schedule_laundrette(); > > - return; > > - } > > > > - /* > > - * We're racing with unhashing, so try to remove it from > > - * the LRU. If removal fails, then someone else already > > - * has our reference. > > - */ > > - if (!nfsd_file_lru_remove(nf)) > > - return; > > + return; > > } > > } > > if (refcount_dec_and_test(&nf->nf_ref)) > > I think this looks OK. Filecache bugs are particularly nasty though, so > let's run this through a nice long testing cycle. > > Reviewed-by: Jeff Layton <jlayton@kernel.org> Actually, I take it back. This is problematic in another way. In this case, we're racing with another task that is unhashing the object, but we've put it on the LRU ourselves. What guarantee do we have that the unhashing and removal from the LRU didn't occur before this task called nfsd_file_lru_add()? That's why we attempt to remove it here -- we can't rely on the task that unhashed it to do so at that point. What might be best is to take and hold the rcu_read_lock() before doing the nfsd_file_lru_add, and just release it after we do these racy checks. That should make it safe to access the object. Thoughts? -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-14 19:39 ` Jeff Layton @ 2025-01-15 15:03 ` Chuck Lever 2025-01-15 15:27 ` Jeff Layton 2025-01-21 20:50 ` Jeff Layton 0 siblings, 2 replies; 17+ messages in thread From: Chuck Lever @ 2025-01-15 15:03 UTC (permalink / raw) To: Jeff Layton, Li Lingfeng, neilb, okorniev, Dai.Ngo, tom Cc: linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun, lilingfeng On 1/14/25 2:39 PM, Jeff Layton wrote: > On Tue, 2025-01-14 at 14:27 -0500, Jeff Layton wrote: >> On Mon, 2025-01-13 at 10:59 +0800, Li Lingfeng wrote: >>> In nfsd_file_put, after inserting the nfsd_file into the nfsd_file_lru >>> list, gc may be triggered in another thread and immediately release this >>> nfsd_file, which will lead to a UAF when accessing this nfsd_file again. >>> >>> All the places where unhash is done will also perform lru_remove, so there >>> is no need to do lru_remove separately here. After inserting the nfsd_file >>> into the nfsd_file_lru list, it can be released by relying on gc. >>> >>> Fixes: 4a0e73e635e3 ("NFSD: Leave open files out of the filecache LRU") >>> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> >>> --- >>> fs/nfsd/filecache.c | 12 ++---------- >>> 1 file changed, 2 insertions(+), 10 deletions(-) >>> >>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c >>> index a1cdba42c4fa..37b65cb1579a 100644 >>> --- a/fs/nfsd/filecache.c >>> +++ b/fs/nfsd/filecache.c >>> @@ -372,18 +372,10 @@ nfsd_file_put(struct nfsd_file *nf) >>> /* Try to add it to the LRU. If that fails, decrement. */ >>> if (nfsd_file_lru_add(nf)) { >>> /* If it's still hashed, we're done */ >>> - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { >>> + if (list_lru_count(&nfsd_file_lru)) >>> nfsd_file_schedule_laundrette(); >>> - return; >>> - } >>> >>> - /* >>> - * We're racing with unhashing, so try to remove it from >>> - * the LRU. If removal fails, then someone else already >>> - * has our reference. >>> - */ >>> - if (!nfsd_file_lru_remove(nf)) >>> - return; >>> + return; >>> } >>> } >>> if (refcount_dec_and_test(&nf->nf_ref)) >> >> I think this looks OK. Filecache bugs are particularly nasty though, so >> let's run this through a nice long testing cycle. >> >> Reviewed-by: Jeff Layton <jlayton@kernel.org> > > Actually, I take it back. This is problematic in another way. > > In this case, we're racing with another task that is unhashing the > object, but we've put it on the LRU ourselves. What guarantee do we > have that the unhashing and removal from the LRU didn't occur before > this task called nfsd_file_lru_add()? That's why we attempt to remove > it here -- we can't rely on the task that unhashed it to do so at that > point. > > What might be best is to take and hold the rcu_read_lock() before doing > the nfsd_file_lru_add, and just release it after we do these racy > checks. That should make it safe to access the object. > > Thoughts? Holding the RCU read lock will keep the dereferences safe since nfsd_file objects are freed only after an RCU grace period. But will the logic in nfsd_file_put() work properly on totally dead nfsd_file objects? I don't see a specific failure mode there, but I'm not very imaginative. Overall, I think RCU would help. -- Chuck Lever ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-15 15:03 ` Chuck Lever @ 2025-01-15 15:27 ` Jeff Layton 2025-01-22 1:33 ` Li Lingfeng 2025-01-21 20:50 ` Jeff Layton 1 sibling, 1 reply; 17+ messages in thread From: Jeff Layton @ 2025-01-15 15:27 UTC (permalink / raw) To: Chuck Lever, Li Lingfeng, neilb, okorniev, Dai.Ngo, tom Cc: linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun, lilingfeng On Wed, 2025-01-15 at 10:03 -0500, Chuck Lever wrote: > On 1/14/25 2:39 PM, Jeff Layton wrote: > > On Tue, 2025-01-14 at 14:27 -0500, Jeff Layton wrote: > > > On Mon, 2025-01-13 at 10:59 +0800, Li Lingfeng wrote: > > > > In nfsd_file_put, after inserting the nfsd_file into the nfsd_file_lru > > > > list, gc may be triggered in another thread and immediately release this > > > > nfsd_file, which will lead to a UAF when accessing this nfsd_file again. > > > > > > > > All the places where unhash is done will also perform lru_remove, so there > > > > is no need to do lru_remove separately here. After inserting the nfsd_file > > > > into the nfsd_file_lru list, it can be released by relying on gc. > > > > > > > > Fixes: 4a0e73e635e3 ("NFSD: Leave open files out of the filecache LRU") > > > > Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> > > > > --- > > > > fs/nfsd/filecache.c | 12 ++---------- > > > > 1 file changed, 2 insertions(+), 10 deletions(-) > > > > > > > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > > > > index a1cdba42c4fa..37b65cb1579a 100644 > > > > --- a/fs/nfsd/filecache.c > > > > +++ b/fs/nfsd/filecache.c > > > > @@ -372,18 +372,10 @@ nfsd_file_put(struct nfsd_file *nf) > > > > /* Try to add it to the LRU. If that fails, decrement. */ > > > > if (nfsd_file_lru_add(nf)) { > > > > /* If it's still hashed, we're done */ > > > > - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { > > > > + if (list_lru_count(&nfsd_file_lru)) > > > > nfsd_file_schedule_laundrette(); > > > > - return; > > > > - } > > > > > > > > - /* > > > > - * We're racing with unhashing, so try to remove it from > > > > - * the LRU. If removal fails, then someone else already > > > > - * has our reference. > > > > - */ > > > > - if (!nfsd_file_lru_remove(nf)) > > > > - return; > > > > + return; > > > > } > > > > } > > > > if (refcount_dec_and_test(&nf->nf_ref)) > > > > > > I think this looks OK. Filecache bugs are particularly nasty though, so > > > let's run this through a nice long testing cycle. > > > > > > Reviewed-by: Jeff Layton <jlayton@kernel.org> > > > > Actually, I take it back. This is problematic in another way. > > > > In this case, we're racing with another task that is unhashing the > > object, but we've put it on the LRU ourselves. What guarantee do we > > have that the unhashing and removal from the LRU didn't occur before > > this task called nfsd_file_lru_add()? That's why we attempt to remove > > it here -- we can't rely on the task that unhashed it to do so at that > > point. > > > > What might be best is to take and hold the rcu_read_lock() before doing > > the nfsd_file_lru_add, and just release it after we do these racy > > checks. That should make it safe to access the object. > > > > Thoughts? > > Holding the RCU read lock will keep the dereferences safe since > nfsd_file objects are freed only after an RCU grace period. But will the > logic in nfsd_file_put() work properly on totally dead nfsd_file > objects? I don't see a specific failure mode there, but I'm not very > imaginative. > > Overall, I think RCU would help. > It should be safe to call nfsd_file_lru_add() with the rcu_read_lock() held. After that we're just looking at the nf_flags() and the nf_lru list head. On a dead file, HASHED will be clear and the nfsd_file_lru_remove() call will be a no-op (the list_head will be empty). Li Lingfeng, do you want to propose a patch for this? Unfortunately, your reproducer won't work after that, since you can't sleep with the rcu_read_lock held. -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-15 15:27 ` Jeff Layton @ 2025-01-22 1:33 ` Li Lingfeng 0 siblings, 0 replies; 17+ messages in thread From: Li Lingfeng @ 2025-01-22 1:33 UTC (permalink / raw) To: Jeff Layton, Chuck Lever, neilb, okorniev, Dai.Ngo, tom Cc: linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun, lilingfeng 在 2025/1/15 23:27, Jeff Layton 写道: > On Wed, 2025-01-15 at 10:03 -0500, Chuck Lever wrote: >> On 1/14/25 2:39 PM, Jeff Layton wrote: >>> On Tue, 2025-01-14 at 14:27 -0500, Jeff Layton wrote: >>>> On Mon, 2025-01-13 at 10:59 +0800, Li Lingfeng wrote: >>>>> In nfsd_file_put, after inserting the nfsd_file into the nfsd_file_lru >>>>> list, gc may be triggered in another thread and immediately release this >>>>> nfsd_file, which will lead to a UAF when accessing this nfsd_file again. >>>>> >>>>> All the places where unhash is done will also perform lru_remove, so there >>>>> is no need to do lru_remove separately here. After inserting the nfsd_file >>>>> into the nfsd_file_lru list, it can be released by relying on gc. >>>>> >>>>> Fixes: 4a0e73e635e3 ("NFSD: Leave open files out of the filecache LRU") >>>>> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> >>>>> --- >>>>> fs/nfsd/filecache.c | 12 ++---------- >>>>> 1 file changed, 2 insertions(+), 10 deletions(-) >>>>> >>>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c >>>>> index a1cdba42c4fa..37b65cb1579a 100644 >>>>> --- a/fs/nfsd/filecache.c >>>>> +++ b/fs/nfsd/filecache.c >>>>> @@ -372,18 +372,10 @@ nfsd_file_put(struct nfsd_file *nf) >>>>> /* Try to add it to the LRU. If that fails, decrement. */ >>>>> if (nfsd_file_lru_add(nf)) { >>>>> /* If it's still hashed, we're done */ >>>>> - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { >>>>> + if (list_lru_count(&nfsd_file_lru)) >>>>> nfsd_file_schedule_laundrette(); >>>>> - return; >>>>> - } >>>>> >>>>> - /* >>>>> - * We're racing with unhashing, so try to remove it from >>>>> - * the LRU. If removal fails, then someone else already >>>>> - * has our reference. >>>>> - */ >>>>> - if (!nfsd_file_lru_remove(nf)) >>>>> - return; >>>>> + return; >>>>> } >>>>> } >>>>> if (refcount_dec_and_test(&nf->nf_ref)) >>>> I think this looks OK. Filecache bugs are particularly nasty though, so >>>> let's run this through a nice long testing cycle. >>>> >>>> Reviewed-by: Jeff Layton <jlayton@kernel.org> >>> Actually, I take it back. This is problematic in another way. >>> >>> In this case, we're racing with another task that is unhashing the >>> object, but we've put it on the LRU ourselves. What guarantee do we >>> have that the unhashing and removal from the LRU didn't occur before >>> this task called nfsd_file_lru_add()? That's why we attempt to remove >>> it here -- we can't rely on the task that unhashed it to do so at that >>> point. >>> >>> What might be best is to take and hold the rcu_read_lock() before doing >>> the nfsd_file_lru_add, and just release it after we do these racy >>> checks. That should make it safe to access the object. >>> >>> Thoughts? >> Holding the RCU read lock will keep the dereferences safe since >> nfsd_file objects are freed only after an RCU grace period. But will the >> logic in nfsd_file_put() work properly on totally dead nfsd_file >> objects? I don't see a specific failure mode there, but I'm not very >> imaginative. >> >> Overall, I think RCU would help. >> > It should be safe to call nfsd_file_lru_add() with the rcu_read_lock() > held. After that we're just looking at the nf_flags() and the nf_lru > list head. On a dead file, HASHED will be clear and the > nfsd_file_lru_remove() call will be a no-op (the list_head will be > empty). > > Li Lingfeng, do you want to propose a patch for this? Unfortunately, > your reproducer won't work after that, since you can't sleep with the > rcu_read_lock held. Sorry for the delay. Of course I'm willing to do it. Maybe I can reproduce the problem by changing msleep to mdelay? I will try. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-15 15:03 ` Chuck Lever 2025-01-15 15:27 ` Jeff Layton @ 2025-01-21 20:50 ` Jeff Layton 2025-01-22 1:15 ` NeilBrown 1 sibling, 1 reply; 17+ messages in thread From: Jeff Layton @ 2025-01-21 20:50 UTC (permalink / raw) To: Chuck Lever, Li Lingfeng, neilb, okorniev, Dai.Ngo, tom Cc: linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun, lilingfeng On Wed, 2025-01-15 at 10:03 -0500, Chuck Lever wrote: > On 1/14/25 2:39 PM, Jeff Layton wrote: > > On Tue, 2025-01-14 at 14:27 -0500, Jeff Layton wrote: > > > On Mon, 2025-01-13 at 10:59 +0800, Li Lingfeng wrote: > > > > In nfsd_file_put, after inserting the nfsd_file into the nfsd_file_lru > > > > list, gc may be triggered in another thread and immediately release this > > > > nfsd_file, which will lead to a UAF when accessing this nfsd_file again. > > > > > > > > All the places where unhash is done will also perform lru_remove, so there > > > > is no need to do lru_remove separately here. After inserting the nfsd_file > > > > into the nfsd_file_lru list, it can be released by relying on gc. > > > > > > > > Fixes: 4a0e73e635e3 ("NFSD: Leave open files out of the filecache LRU") > > > > Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> > > > > --- > > > > fs/nfsd/filecache.c | 12 ++---------- > > > > 1 file changed, 2 insertions(+), 10 deletions(-) > > > > > > > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > > > > index a1cdba42c4fa..37b65cb1579a 100644 > > > > --- a/fs/nfsd/filecache.c > > > > +++ b/fs/nfsd/filecache.c > > > > @@ -372,18 +372,10 @@ nfsd_file_put(struct nfsd_file *nf) > > > > /* Try to add it to the LRU. If that fails, decrement. */ > > > > if (nfsd_file_lru_add(nf)) { > > > > /* If it's still hashed, we're done */ > > > > - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { > > > > + if (list_lru_count(&nfsd_file_lru)) > > > > nfsd_file_schedule_laundrette(); > > > > - return; > > > > - } > > > > > > > > - /* > > > > - * We're racing with unhashing, so try to remove it from > > > > - * the LRU. If removal fails, then someone else already > > > > - * has our reference. > > > > - */ > > > > - if (!nfsd_file_lru_remove(nf)) > > > > - return; > > > > + return; > > > > } > > > > } > > > > if (refcount_dec_and_test(&nf->nf_ref)) > > > > > > I think this looks OK. Filecache bugs are particularly nasty though, so > > > let's run this through a nice long testing cycle. > > > > > > Reviewed-by: Jeff Layton <jlayton@kernel.org> > > > > Actually, I take it back. This is problematic in another way. > > > > In this case, we're racing with another task that is unhashing the > > object, but we've put it on the LRU ourselves. What guarantee do we > > have that the unhashing and removal from the LRU didn't occur before > > this task called nfsd_file_lru_add()? That's why we attempt to remove > > it here -- we can't rely on the task that unhashed it to do so at that > > point. > > > > What might be best is to take and hold the rcu_read_lock() before doing > > the nfsd_file_lru_add, and just release it after we do these racy > > checks. That should make it safe to access the object. > > > > Thoughts? > > Holding the RCU read lock will keep the dereferences safe since > nfsd_file objects are freed only after an RCU grace period. But will the > logic in nfsd_file_put() work properly on totally dead nfsd_file > objects? I don't see a specific failure mode there, but I'm not very > imaginative. > > Overall, I think RCU would help. > > To be clear, I think we need to drop e57420be100ab from your nfsd- testing branch. The race I identified above is quite likely to occur and could lead to leaks. If Li Lingfeng doesn't propose a patch, I'll spin one up tomorrow. I think the RCU approach is safe. Thanks, -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-21 20:50 ` Jeff Layton @ 2025-01-22 1:15 ` NeilBrown 2025-01-22 1:43 ` Li Lingfeng 2025-01-22 3:48 ` NeilBrown 0 siblings, 2 replies; 17+ messages in thread From: NeilBrown @ 2025-01-22 1:15 UTC (permalink / raw) To: Jeff Layton Cc: Chuck Lever, Li Lingfeng, okorniev, Dai.Ngo, tom, linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun, lilingfeng On Wed, 22 Jan 2025, Jeff Layton wrote: > On Wed, 2025-01-15 at 10:03 -0500, Chuck Lever wrote: > > On 1/14/25 2:39 PM, Jeff Layton wrote: > > > On Tue, 2025-01-14 at 14:27 -0500, Jeff Layton wrote: > > > > On Mon, 2025-01-13 at 10:59 +0800, Li Lingfeng wrote: > > > > > In nfsd_file_put, after inserting the nfsd_file into the nfsd_file_lru > > > > > list, gc may be triggered in another thread and immediately release this > > > > > nfsd_file, which will lead to a UAF when accessing this nfsd_file again. > > > > > > > > > > All the places where unhash is done will also perform lru_remove, so there > > > > > is no need to do lru_remove separately here. After inserting the nfsd_file > > > > > into the nfsd_file_lru list, it can be released by relying on gc. > > > > > > > > > > Fixes: 4a0e73e635e3 ("NFSD: Leave open files out of the filecache LRU") > > > > > Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> > > > > > --- > > > > > fs/nfsd/filecache.c | 12 ++---------- > > > > > 1 file changed, 2 insertions(+), 10 deletions(-) > > > > > > > > > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > > > > > index a1cdba42c4fa..37b65cb1579a 100644 > > > > > --- a/fs/nfsd/filecache.c > > > > > +++ b/fs/nfsd/filecache.c > > > > > @@ -372,18 +372,10 @@ nfsd_file_put(struct nfsd_file *nf) > > > > > /* Try to add it to the LRU. If that fails, decrement. */ > > > > > if (nfsd_file_lru_add(nf)) { > > > > > /* If it's still hashed, we're done */ > > > > > - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { > > > > > + if (list_lru_count(&nfsd_file_lru)) > > > > > nfsd_file_schedule_laundrette(); > > > > > - return; > > > > > - } > > > > > > > > > > - /* > > > > > - * We're racing with unhashing, so try to remove it from > > > > > - * the LRU. If removal fails, then someone else already > > > > > - * has our reference. > > > > > - */ > > > > > - if (!nfsd_file_lru_remove(nf)) > > > > > - return; > > > > > + return; > > > > > } > > > > > } > > > > > if (refcount_dec_and_test(&nf->nf_ref)) > > > > > > > > I think this looks OK. Filecache bugs are particularly nasty though, so > > > > let's run this through a nice long testing cycle. > > > > > > > > Reviewed-by: Jeff Layton <jlayton@kernel.org> > > > > > > Actually, I take it back. This is problematic in another way. > > > > > > In this case, we're racing with another task that is unhashing the > > > object, but we've put it on the LRU ourselves. What guarantee do we > > > have that the unhashing and removal from the LRU didn't occur before > > > this task called nfsd_file_lru_add()? That's why we attempt to remove > > > it here -- we can't rely on the task that unhashed it to do so at that > > > point. > > > > > > What might be best is to take and hold the rcu_read_lock() before doing > > > the nfsd_file_lru_add, and just release it after we do these racy > > > checks. That should make it safe to access the object. > > > > > > Thoughts? > > > > Holding the RCU read lock will keep the dereferences safe since > > nfsd_file objects are freed only after an RCU grace period. But will the > > logic in nfsd_file_put() work properly on totally dead nfsd_file > > objects? I don't see a specific failure mode there, but I'm not very > > imaginative. > > > > Overall, I think RCU would help. > > > > > > To be clear, I think we need to drop e57420be100ab from your nfsd- > testing branch. The race I identified above is quite likely to occur > and could lead to leaks. > > If Li Lingfeng doesn't propose a patch, I'll spin one up tomorrow. I > think the RCU approach is safe. I'm not convinced this is the right approach. I cannot see how nfsd_file_put() can race with unhashing. If it cannot then we can simply unconditionally call nfsd_file_schedule_laundrette(). Can describe how the race can happen - if indeed it can. Note that we also need rcu protection in nfsd_file_lru_add() so that the nf doesn't get freed after it is added the the lru and before the trace point. If we don't end up needing rcu round the call, we will need it in the call. Thanks, NeilBrown ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-22 1:15 ` NeilBrown @ 2025-01-22 1:43 ` Li Lingfeng 2025-01-22 2:21 ` Li Lingfeng 2025-01-22 3:48 ` NeilBrown 1 sibling, 1 reply; 17+ messages in thread From: Li Lingfeng @ 2025-01-22 1:43 UTC (permalink / raw) To: NeilBrown, Jeff Layton Cc: Chuck Lever, okorniev, Dai.Ngo, tom, linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun, lilingfeng 在 2025/1/22 9:15, NeilBrown 写道: > On Wed, 22 Jan 2025, Jeff Layton wrote: >> On Wed, 2025-01-15 at 10:03 -0500, Chuck Lever wrote: >>> On 1/14/25 2:39 PM, Jeff Layton wrote: >>>> On Tue, 2025-01-14 at 14:27 -0500, Jeff Layton wrote: >>>>> On Mon, 2025-01-13 at 10:59 +0800, Li Lingfeng wrote: >>>>>> In nfsd_file_put, after inserting the nfsd_file into the nfsd_file_lru >>>>>> list, gc may be triggered in another thread and immediately release this >>>>>> nfsd_file, which will lead to a UAF when accessing this nfsd_file again. >>>>>> >>>>>> All the places where unhash is done will also perform lru_remove, so there >>>>>> is no need to do lru_remove separately here. After inserting the nfsd_file >>>>>> into the nfsd_file_lru list, it can be released by relying on gc. >>>>>> >>>>>> Fixes: 4a0e73e635e3 ("NFSD: Leave open files out of the filecache LRU") >>>>>> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> >>>>>> --- >>>>>> fs/nfsd/filecache.c | 12 ++---------- >>>>>> 1 file changed, 2 insertions(+), 10 deletions(-) >>>>>> >>>>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c >>>>>> index a1cdba42c4fa..37b65cb1579a 100644 >>>>>> --- a/fs/nfsd/filecache.c >>>>>> +++ b/fs/nfsd/filecache.c >>>>>> @@ -372,18 +372,10 @@ nfsd_file_put(struct nfsd_file *nf) >>>>>> /* Try to add it to the LRU. If that fails, decrement. */ >>>>>> if (nfsd_file_lru_add(nf)) { >>>>>> /* If it's still hashed, we're done */ >>>>>> - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { >>>>>> + if (list_lru_count(&nfsd_file_lru)) >>>>>> nfsd_file_schedule_laundrette(); >>>>>> - return; >>>>>> - } >>>>>> >>>>>> - /* >>>>>> - * We're racing with unhashing, so try to remove it from >>>>>> - * the LRU. If removal fails, then someone else already >>>>>> - * has our reference. >>>>>> - */ >>>>>> - if (!nfsd_file_lru_remove(nf)) >>>>>> - return; >>>>>> + return; >>>>>> } >>>>>> } >>>>>> if (refcount_dec_and_test(&nf->nf_ref)) >>>>> I think this looks OK. Filecache bugs are particularly nasty though, so >>>>> let's run this through a nice long testing cycle. >>>>> >>>>> Reviewed-by: Jeff Layton <jlayton@kernel.org> >>>> Actually, I take it back. This is problematic in another way. >>>> >>>> In this case, we're racing with another task that is unhashing the >>>> object, but we've put it on the LRU ourselves. What guarantee do we >>>> have that the unhashing and removal from the LRU didn't occur before >>>> this task called nfsd_file_lru_add()? That's why we attempt to remove >>>> it here -- we can't rely on the task that unhashed it to do so at that >>>> point. >>>> >>>> What might be best is to take and hold the rcu_read_lock() before doing >>>> the nfsd_file_lru_add, and just release it after we do these racy >>>> checks. That should make it safe to access the object. >>>> >>>> Thoughts? >>> Holding the RCU read lock will keep the dereferences safe since >>> nfsd_file objects are freed only after an RCU grace period. But will the >>> logic in nfsd_file_put() work properly on totally dead nfsd_file >>> objects? I don't see a specific failure mode there, but I'm not very >>> imaginative. >>> >>> Overall, I think RCU would help. >>> >>> >> To be clear, I think we need to drop e57420be100ab from your nfsd- >> testing branch. The race I identified above is quite likely to occur >> and could lead to leaks. >> >> If Li Lingfeng doesn't propose a patch, I'll spin one up tomorrow. I >> think the RCU approach is safe. > I'm not convinced this is the right approach. > I cannot see how nfsd_file_put() can race with unhashing. If it cannot > then we can simply unconditionally call nfsd_file_schedule_laundrette(). > > Can describe how the race can happen - if indeed it can. At the beginning, our testing team discovered this issue on 5.10 using their own test cases (I'm not clear on how their test cases were constructed). Then, I found that there might be a problem here, so I reproduced the phenomenon in the above way. I will ask if their test cases can be shared externally or if they can describe what their test cases did. > Note that we also need rcu protection in nfsd_file_lru_add() so that the > nf doesn't get freed after it is added the the lru and before the trace > point. If we don't end up needing rcu round the call, we will need it > in the call. > > Thanks, > NeilBrown ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-22 1:43 ` Li Lingfeng @ 2025-01-22 2:21 ` Li Lingfeng 0 siblings, 0 replies; 17+ messages in thread From: Li Lingfeng @ 2025-01-22 2:21 UTC (permalink / raw) To: Li Lingfeng, NeilBrown, Jeff Layton Cc: Chuck Lever, okorniev, Dai.Ngo, tom, linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun 在 2025/1/22 9:43, Li Lingfeng 写道: > > 在 2025/1/22 9:15, NeilBrown 写道: >> On Wed, 22 Jan 2025, Jeff Layton wrote: >>> On Wed, 2025-01-15 at 10:03 -0500, Chuck Lever wrote: >>>> On 1/14/25 2:39 PM, Jeff Layton wrote: >>>>> On Tue, 2025-01-14 at 14:27 -0500, Jeff Layton wrote: >>>>>> On Mon, 2025-01-13 at 10:59 +0800, Li Lingfeng wrote: >>>>>>> In nfsd_file_put, after inserting the nfsd_file into the >>>>>>> nfsd_file_lru >>>>>>> list, gc may be triggered in another thread and immediately >>>>>>> release this >>>>>>> nfsd_file, which will lead to a UAF when accessing this >>>>>>> nfsd_file again. >>>>>>> >>>>>>> All the places where unhash is done will also perform >>>>>>> lru_remove, so there >>>>>>> is no need to do lru_remove separately here. After inserting the >>>>>>> nfsd_file >>>>>>> into the nfsd_file_lru list, it can be released by relying on gc. >>>>>>> >>>>>>> Fixes: 4a0e73e635e3 ("NFSD: Leave open files out of the >>>>>>> filecache LRU") >>>>>>> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> >>>>>>> --- >>>>>>> fs/nfsd/filecache.c | 12 ++---------- >>>>>>> 1 file changed, 2 insertions(+), 10 deletions(-) >>>>>>> >>>>>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c >>>>>>> index a1cdba42c4fa..37b65cb1579a 100644 >>>>>>> --- a/fs/nfsd/filecache.c >>>>>>> +++ b/fs/nfsd/filecache.c >>>>>>> @@ -372,18 +372,10 @@ nfsd_file_put(struct nfsd_file *nf) >>>>>>> /* Try to add it to the LRU. If that fails, >>>>>>> decrement. */ >>>>>>> if (nfsd_file_lru_add(nf)) { >>>>>>> /* If it's still hashed, we're done */ >>>>>>> - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { >>>>>>> + if (list_lru_count(&nfsd_file_lru)) >>>>>>> nfsd_file_schedule_laundrette(); >>>>>>> - return; >>>>>>> - } >>>>>>> - /* >>>>>>> - * We're racing with unhashing, so try to remove it >>>>>>> from >>>>>>> - * the LRU. If removal fails, then someone else >>>>>>> already >>>>>>> - * has our reference. >>>>>>> - */ >>>>>>> - if (!nfsd_file_lru_remove(nf)) >>>>>>> - return; >>>>>>> + return; >>>>>>> } >>>>>>> } >>>>>>> if (refcount_dec_and_test(&nf->nf_ref)) >>>>>> I think this looks OK. Filecache bugs are particularly nasty >>>>>> though, so >>>>>> let's run this through a nice long testing cycle. >>>>>> >>>>>> Reviewed-by: Jeff Layton <jlayton@kernel.org> >>>>> Actually, I take it back. This is problematic in another way. >>>>> >>>>> In this case, we're racing with another task that is unhashing the >>>>> object, but we've put it on the LRU ourselves. What guarantee do we >>>>> have that the unhashing and removal from the LRU didn't occur before >>>>> this task called nfsd_file_lru_add()? That's why we attempt to remove >>>>> it here -- we can't rely on the task that unhashed it to do so at >>>>> that >>>>> point. >>>>> >>>>> What might be best is to take and hold the rcu_read_lock() before >>>>> doing >>>>> the nfsd_file_lru_add, and just release it after we do these racy >>>>> checks. That should make it safe to access the object. >>>>> >>>>> Thoughts? >>>> Holding the RCU read lock will keep the dereferences safe since >>>> nfsd_file objects are freed only after an RCU grace period. But >>>> will the >>>> logic in nfsd_file_put() work properly on totally dead nfsd_file >>>> objects? I don't see a specific failure mode there, but I'm not very >>>> imaginative. >>>> >>>> Overall, I think RCU would help. >>>> >>>> >>> To be clear, I think we need to drop e57420be100ab from your nfsd- >>> testing branch. The race I identified above is quite likely to occur >>> and could lead to leaks. >>> >>> If Li Lingfeng doesn't propose a patch, I'll spin one up tomorrow. I >>> think the RCU approach is safe. >> I'm not convinced this is the right approach. >> I cannot see how nfsd_file_put() can race with unhashing. If it cannot >> then we can simply unconditionally call nfsd_file_schedule_laundrette(). >> >> Can describe how the race can happen - if indeed it can. > At the beginning, our testing team discovered this issue on 5.10 using > their own test cases (I'm not clear on how their test cases were > constructed). > Then, I found that there might be a problem here, so I reproduced the > phenomenon in the above way. > I will ask if their test cases can be shared externally or if they can > describe what their test cases did. They used three machines for the test: one server and two clients. ClientA continuously creates and deletes a file, while ClientB uses fio to perform random writes on this file. I found the original stack trace (since I only have a screenshot, I can only extract some important information from it): list_del corruption. prev-next should be ffff000052be6828, but was ffff000052221e48 WARNING:CPU: 1 PID: 12224at lib/list_debug.c:95 list_del_entry_valid+Oxe0/0x10 ... Call trace: list_del_entry_valid+Oxe0/0x10c list_lru_del+0xb4/0x19c nfsd_file_put+0x134/0xla0 [nfsd] nfsd_read+0xc0/0xldo [nfsd nfsd3_proc_read+0xl44/0xla0 [nfsd] nfsd_dispatch+0x190/0x254 [nfsd] svc_process_common+Ox52c/0x780 [sunrpc] svc_process+Ox88/0xe4 [sunrpc] nfsd+0xb4/0x160 [nfsd] kthread+0x108/0x134 ret_from_fork+ох10/0xl8 ---[ end trace f87713f92e49e854 ]--- refcount_t: underflow;use-after-free. WARNING:CPU: 1 PID: 12220 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0xl44 ... Call trace: refcount_warn_saturate+0xf4/0x144 nfsd_file_put+0x160/0xla0 [nfsd] nfsd_read+oxco/0xld0 [nfsd] nfsd3_proc_read+0xl44/0xla0 [nfsd] nfsd_dispatch+0x190/0x254 [nfsd] svc_process_common+Ox52c/0x780[sunrpc] svc_process+0x88/0xe4_ [sunrpc] nfsd+0xb4/0x160 [nfsd] kthread+0x108/0x134 ret_from_fork+0x10/0x18 ---[ end tracef87713f92e49e853 ]--- After applying this modification, this test case no longer encountered any issues. >> Note that we also need rcu protection in nfsd_file_lru_add() so that the >> nf doesn't get freed after it is added the the lru and before the trace >> point. If we don't end up needing rcu round the call, we will need it >> in the call. >> >> Thanks, >> NeilBrown ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-22 1:15 ` NeilBrown 2025-01-22 1:43 ` Li Lingfeng @ 2025-01-22 3:48 ` NeilBrown 2025-01-22 7:31 ` Li Lingfeng 2025-01-22 12:31 ` Jeff Layton 1 sibling, 2 replies; 17+ messages in thread From: NeilBrown @ 2025-01-22 3:48 UTC (permalink / raw) To: Jeff Layton Cc: Chuck Lever, Li Lingfeng, okorniev, Dai.Ngo, tom, linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun, lilingfeng On Wed, 22 Jan 2025, NeilBrown wrote: > On Wed, 22 Jan 2025, Jeff Layton wrote: > > To be clear, I think we need to drop e57420be100ab from your nfsd- > > testing branch. The race I identified above is quite likely to occur > > and could lead to leaks. > > > > If Li Lingfeng doesn't propose a patch, I'll spin one up tomorrow. I > > think the RCU approach is safe. > > I'm not convinced this is the right approach. > I cannot see how nfsd_file_put() can race with unhashing. If it cannot > then we can simply unconditionally call nfsd_file_schedule_laundrette(). > > Can describe how the race can happen - if indeed it can. I thought I should explore this more and explain what I think actually happens ... Certainly nfsd_file_unhash() might race with nfsd_file_put(). At this point in nfsd_file_put() we have the only reference but a hash lookup could gain another reference and the immediately unhash it. nfsd_file_queue_for_close() can do this. There might be other paths. But why does this mean we need to remove it from the lru and free it immediately? If we leave it on the lru it will be freed in a couple of seconds. The reason might be nfsd_file_close_inode_sync(). This needs to close files before returning. But if nfsd_file_close_inode_sync() is called while some other thread holds a reference to the file and might want to call nfsd_file_put(), then it isn't going to succeed anyway so any race here doesn't make any difference. So I think the following might be the best fix ??? Thanks, NeilBrown diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c index fcd751cb7c76..773788a50e56 100644 --- a/fs/nfsd/filecache.c +++ b/fs/nfsd/filecache.c @@ -322,10 +322,13 @@ nfsd_file_check_writeback(struct nfsd_file *nf) static bool nfsd_file_lru_add(struct nfsd_file *nf) { set_bit(NFSD_FILE_REFERENCED, &nf->nf_flags); + rcu_read_lock(); if (list_lru_add_obj(&nfsd_file_lru, &nf->nf_lru)) { trace_nfsd_file_lru_add(nf); + rcu_read_unlock(); return true; } + rcu_read_unlock(); return false; } @@ -371,19 +374,8 @@ nfsd_file_put(struct nfsd_file *nf) /* Try to add it to the LRU. If that fails, decrement. */ if (nfsd_file_lru_add(nf)) { - /* If it's still hashed, we're done */ - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { - nfsd_file_schedule_laundrette(); - return; - } - - /* - * We're racing with unhashing, so try to remove it from - * the LRU. If removal fails, then someone else already - * has our reference. - */ - if (!nfsd_file_lru_remove(nf)) - return; + nfsd_file_schedule_laundrette(); + return; } } if (refcount_dec_and_test(&nf->nf_ref)) ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-22 3:48 ` NeilBrown @ 2025-01-22 7:31 ` Li Lingfeng 2025-01-22 12:31 ` Jeff Layton 1 sibling, 0 replies; 17+ messages in thread From: Li Lingfeng @ 2025-01-22 7:31 UTC (permalink / raw) To: NeilBrown, Jeff Layton Cc: Chuck Lever, okorniev, Dai.Ngo, tom, linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun, lilingfeng 在 2025/1/22 11:48, NeilBrown 写道: > On Wed, 22 Jan 2025, NeilBrown wrote: >> On Wed, 22 Jan 2025, Jeff Layton wrote: >>> To be clear, I think we need to drop e57420be100ab from your nfsd- >>> testing branch. The race I identified above is quite likely to occur >>> and could lead to leaks. >>> >>> If Li Lingfeng doesn't propose a patch, I'll spin one up tomorrow. I >>> think the RCU approach is safe. >> I'm not convinced this is the right approach. >> I cannot see how nfsd_file_put() can race with unhashing. If it cannot >> then we can simply unconditionally call nfsd_file_schedule_laundrette(). >> >> Can describe how the race can happen - if indeed it can. > I thought I should explore this more and explain what I think actually > happens ... > > Certainly nfsd_file_unhash() might race with nfsd_file_put(). At this > point in nfsd_file_put() we have the only reference but a hash lookup > could gain another reference and the immediately unhash it. > nfsd_file_queue_for_close() can do this. There might be other paths. > > But why does this mean we need to remove it from the lru and free it > immediately? If we leave it on the lru it will be freed in a couple of > seconds. > > The reason might be nfsd_file_close_inode_sync(). This needs to close > files before returning. > But if nfsd_file_close_inode_sync() is called while some other thread > holds a reference to the file and might want to call nfsd_file_put(), > then it isn't going to succeed anyway so any race here doesn't make any > difference. > > So I think the following might be the best fix Agree. I ignored the trace point in nfsd_file_lru_add(), this one looks better. And I have tested it. Thanks. > > ??? > > Thanks, > NeilBrown > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > index fcd751cb7c76..773788a50e56 100644 > --- a/fs/nfsd/filecache.c > +++ b/fs/nfsd/filecache.c > @@ -322,10 +322,13 @@ nfsd_file_check_writeback(struct nfsd_file *nf) > static bool nfsd_file_lru_add(struct nfsd_file *nf) > { > set_bit(NFSD_FILE_REFERENCED, &nf->nf_flags); > + rcu_read_lock(); > if (list_lru_add_obj(&nfsd_file_lru, &nf->nf_lru)) { > trace_nfsd_file_lru_add(nf); > + rcu_read_unlock(); > return true; > } > + rcu_read_unlock(); > return false; > } > > @@ -371,19 +374,8 @@ nfsd_file_put(struct nfsd_file *nf) > > /* Try to add it to the LRU. If that fails, decrement. */ > if (nfsd_file_lru_add(nf)) { > - /* If it's still hashed, we're done */ > - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { > - nfsd_file_schedule_laundrette(); > - return; > - } > - > - /* > - * We're racing with unhashing, so try to remove it from > - * the LRU. If removal fails, then someone else already > - * has our reference. > - */ > - if (!nfsd_file_lru_remove(nf)) > - return; > + nfsd_file_schedule_laundrette(); > + return; > } > } > if (refcount_dec_and_test(&nf->nf_ref)) > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-22 3:48 ` NeilBrown 2025-01-22 7:31 ` Li Lingfeng @ 2025-01-22 12:31 ` Jeff Layton 1 sibling, 0 replies; 17+ messages in thread From: Jeff Layton @ 2025-01-22 12:31 UTC (permalink / raw) To: NeilBrown Cc: Chuck Lever, Li Lingfeng, okorniev, Dai.Ngo, tom, linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun, lilingfeng On Wed, 2025-01-22 at 14:48 +1100, NeilBrown wrote: > On Wed, 22 Jan 2025, NeilBrown wrote: > > On Wed, 22 Jan 2025, Jeff Layton wrote: > > > To be clear, I think we need to drop e57420be100ab from your nfsd- > > > testing branch. The race I identified above is quite likely to occur > > > and could lead to leaks. > > > > > > If Li Lingfeng doesn't propose a patch, I'll spin one up tomorrow. I > > > think the RCU approach is safe. > > > > I'm not convinced this is the right approach. > > I cannot see how nfsd_file_put() can race with unhashing. If it cannot > > then we can simply unconditionally call nfsd_file_schedule_laundrette(). > > > > Can describe how the race can happen - if indeed it can. > > I thought I should explore this more and explain what I think actually > happens ... > > Certainly nfsd_file_unhash() might race with nfsd_file_put(). At this > point in nfsd_file_put() we have the only reference but a hash lookup > could gain another reference and the immediately unhash it. > nfsd_file_queue_for_close() can do this. There might be other paths. > > But why does this mean we need to remove it from the lru and free it > immediately? If we leave it on the lru it will be freed in a couple of > seconds. > > The reason might be nfsd_file_close_inode_sync(). This needs to close > files before returning. > But if nfsd_file_close_inode_sync() is called while some other thread > holds a reference to the file and might want to call nfsd_file_put(), > then it isn't going to succeed anyway so any race here doesn't make any > difference. > > So I think the following might be the best fix > > ??? > > Thanks, > NeilBrown > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > index fcd751cb7c76..773788a50e56 100644 > --- a/fs/nfsd/filecache.c > +++ b/fs/nfsd/filecache.c > @@ -322,10 +322,13 @@ nfsd_file_check_writeback(struct nfsd_file *nf) > static bool nfsd_file_lru_add(struct nfsd_file *nf) > { > set_bit(NFSD_FILE_REFERENCED, &nf->nf_flags); > + rcu_read_lock(); > if (list_lru_add_obj(&nfsd_file_lru, &nf->nf_lru)) { > trace_nfsd_file_lru_add(nf); > + rcu_read_unlock(); > return true; > } > + rcu_read_unlock(); > return false; > } > I think that I'm now convinced that it's OK to remove the code in the if block below. But if we do that, then I don't think you need the rcu_read_lock() in nfsd_file_lru_add(). It's just handing off the reference to the LRU at that point and once that's done, it doesn't need to look at it again. That makes the rcu_read_lock() unnecessary. Given that, Li Lingfeng's original patch is OK after all. Am I missing something? > @@ -371,19 +374,8 @@ nfsd_file_put(struct nfsd_file *nf) > > /* Try to add it to the LRU. If that fails, decrement. */ > if (nfsd_file_lru_add(nf)) { > - /* If it's still hashed, we're done */ > - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) { > - nfsd_file_schedule_laundrette(); > - return; > - } > - > - /* > - * We're racing with unhashing, so try to remove it from > - * the LRU. If removal fails, then someone else already > - * has our reference. > - */ > - if (!nfsd_file_lru_remove(nf)) > - return; > + nfsd_file_schedule_laundrette(); > + return; > } > } > if (refcount_dec_and_test(&nf->nf_ref)) > -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list 2025-01-13 2:59 [PATCH] nfsd: free nfsd_file by gc after adding it to lru list Li Lingfeng 2025-01-13 14:07 ` Chuck Lever 2025-01-14 19:27 ` Jeff Layton @ 2025-01-14 19:40 ` cel 2 siblings, 0 replies; 17+ messages in thread From: cel @ 2025-01-14 19:40 UTC (permalink / raw) To: jlayton, neilb, okorniev, Dai.Ngo, tom, Li Lingfeng Cc: Chuck Lever, linux-nfs, linux-kernel, yukuai1, houtao1, yi.zhang, yangerkun, lilingfeng From: Chuck Lever <chuck.lever@oracle.com> On Mon, 13 Jan 2025 10:59:57 +0800, Li Lingfeng wrote: > In nfsd_file_put, after inserting the nfsd_file into the nfsd_file_lru > list, gc may be triggered in another thread and immediately release this > nfsd_file, which will lead to a UAF when accessing this nfsd_file again. > > All the places where unhash is done will also perform lru_remove, so there > is no need to do lru_remove separately here. After inserting the nfsd_file > into the nfsd_file_lru list, it can be released by relying on gc. > > [...] Applied to nfsd-testing, thanks! Cc-stable tag added. [1/1] nfsd: free nfsd_file by gc after adding it to lru list commit: e57420be100ab3ff6d42992a37ce34f3e03d8d91 -- Chuck Lever ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2025-01-22 12:31 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-01-13 2:59 [PATCH] nfsd: free nfsd_file by gc after adding it to lru list Li Lingfeng 2025-01-13 14:07 ` Chuck Lever 2025-01-14 1:54 ` Li Lingfeng 2025-01-14 19:17 ` Chuck Lever 2025-01-14 19:27 ` Jeff Layton 2025-01-14 19:39 ` Jeff Layton 2025-01-15 15:03 ` Chuck Lever 2025-01-15 15:27 ` Jeff Layton 2025-01-22 1:33 ` Li Lingfeng 2025-01-21 20:50 ` Jeff Layton 2025-01-22 1:15 ` NeilBrown 2025-01-22 1:43 ` Li Lingfeng 2025-01-22 2:21 ` Li Lingfeng 2025-01-22 3:48 ` NeilBrown 2025-01-22 7:31 ` Li Lingfeng 2025-01-22 12:31 ` Jeff Layton 2025-01-14 19:40 ` cel
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox