Message-ID: <240dcf4e-b716-446a-9b9d-d232e26a58b5@huawei.com>
Date: Wed, 22 Jan 2025 09:33:32 +0800
Subject: Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list
From: Li Lingfeng
To: Jeff Layton, Chuck Lever
X-Mailing-List: linux-nfs@vger.kernel.org
References: <20250113025957.1253214-1-lilingfeng3@huawei.com>
 <8864761c99553a7f18adc13e98b4ef6255da1d9e.camel@kernel.org>

On 2025/1/15 23:27, Jeff Layton wrote:
> On Wed, 2025-01-15 at 10:03 -0500, Chuck Lever wrote:
>> On 1/14/25 2:39 PM, Jeff Layton wrote:
>>> On Tue, 2025-01-14 at 14:27 -0500, Jeff Layton wrote:
>>>> On Mon, 2025-01-13 at 10:59 +0800, Li Lingfeng wrote:
>>>>> In nfsd_file_put, after inserting the nfsd_file into the nfsd_file_lru
>>>>> list, gc may be triggered in another thread and immediately release this
>>>>> nfsd_file, which leads to a use-after-free (UAF) when this nfsd_file is
>>>>> accessed again.
>>>>>
>>>>> Every place that unhashes an nfsd_file also removes it from the LRU, so
>>>>> there is no need to do the lru_remove separately here. After inserting
>>>>> the nfsd_file into the nfsd_file_lru list, it can be released by relying
>>>>> on gc.
>>>>>
>>>>> Fixes: 4a0e73e635e3 ("NFSD: Leave open files out of the filecache LRU")
>>>>> Signed-off-by: Li Lingfeng
>>>>> ---
>>>>>  fs/nfsd/filecache.c | 12 ++----------
>>>>>  1 file changed, 2 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>>>>> index a1cdba42c4fa..37b65cb1579a 100644
>>>>> --- a/fs/nfsd/filecache.c
>>>>> +++ b/fs/nfsd/filecache.c
>>>>> @@ -372,18 +372,10 @@ nfsd_file_put(struct nfsd_file *nf)
>>>>>  		/* Try to add it to the LRU. If that fails, decrement. */
>>>>>  		if (nfsd_file_lru_add(nf)) {
>>>>>  			/* If it's still hashed, we're done */
>>>>> -			if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
>>>>> +			if (list_lru_count(&nfsd_file_lru))
>>>>>  				nfsd_file_schedule_laundrette();
>>>>> -				return;
>>>>> -			}
>>>>>
>>>>> -			/*
>>>>> -			 * We're racing with unhashing, so try to remove it from
>>>>> -			 * the LRU. If removal fails, then someone else already
>>>>> -			 * has our reference.
>>>>> -			 */
>>>>> -			if (!nfsd_file_lru_remove(nf))
>>>>> -				return;
>>>>> +			return;
>>>>>  		}
>>>>>  	}
>>>>>  	if (refcount_dec_and_test(&nf->nf_ref))
>>>> I think this looks OK. Filecache bugs are particularly nasty though, so
>>>> let's run this through a nice long testing cycle.
>>>>
>>>> Reviewed-by: Jeff Layton
>>> Actually, I take it back. This is problematic in another way.
>>>
>>> In this case, we're racing with another task that is unhashing the
>>> object, but we've put it on the LRU ourselves. What guarantee do we
>>> have that the unhashing and removal from the LRU didn't occur before
>>> this task called nfsd_file_lru_add()? That's why we attempt to remove
>>> it here -- we can't rely on the task that unhashed it to do so at that
>>> point.
>>>
>>> What might be best is to take and hold the rcu_read_lock() before doing
>>> the nfsd_file_lru_add, and just release it after we do these racy
>>> checks. That should make it safe to access the object.
>>>
>>> Thoughts?
>> Holding the RCU read lock will keep the dereferences safe, since
>> nfsd_file objects are freed only after an RCU grace period. But will the
>> logic in nfsd_file_put() work properly on totally dead nfsd_file
>> objects? I don't see a specific failure mode there, but I'm not very
>> imaginative.
>>
>> Overall, I think RCU would help.
>>
> It should be safe to call nfsd_file_lru_add() with the rcu_read_lock()
> held. After that we're just looking at nf_flags and the nf_lru
> list head. On a dead file, HASHED will be clear and the
> nfsd_file_lru_remove() call will be a no-op (the list_head will be
> empty).
>
> Li Lingfeng, do you want to propose a patch for this? Unfortunately,
> your reproducer won't work after that, since you can't sleep with the
> rcu_read_lock held.

Sorry for the delay. Of course I'm willing to do it.

Maybe I can reproduce the problem by changing msleep to mdelay, since
mdelay() busy-waits instead of sleeping and so is legal under
rcu_read_lock(). I will try.
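
For reference, here is a rough, untested sketch of how I read the
suggestion: take rcu_read_lock() before nfsd_file_lru_add() and drop it
only after the racy checks, so the nfsd_file cannot be freed underneath
us (per Chuck's point that nfsd_file objects are freed only after an RCU
grace period). The surrounding NFSD_FILE_GC/refcount logic is from my
reading of fs/nfsd/filecache.c and may not match the current tree
exactly; this is a sketch, not the final patch:

void
nfsd_file_put(struct nfsd_file *nf)
{
	might_sleep();

	if (test_bit(NFSD_FILE_GC, &nf->nf_flags) &&
	    test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
		/*
		 * If this is the last reference (nf_ref == 1), then try
		 * to transfer it to the LRU.
		 */
		if (refcount_dec_not_one(&nf->nf_ref))
			return;

		/*
		 * Hold the RCU read lock across the LRU insertion and the
		 * racy checks below. Even if the gc thread unhashes the
		 * file and drops the last reference right after
		 * nfsd_file_lru_add(), the object stays valid until
		 * rcu_read_unlock(). Nothing in this section sleeps.
		 */
		rcu_read_lock();

		/* Try to add it to the LRU. If that fails, decrement. */
		if (nfsd_file_lru_add(nf)) {
			/* If it's still hashed, we're done */
			if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
				nfsd_file_schedule_laundrette();
				rcu_read_unlock();
				return;
			}

			/*
			 * We're racing with unhashing, so try to remove
			 * it from the LRU. If removal fails, then someone
			 * else already has our reference. On a dead file
			 * this is a no-op: the list_head is empty.
			 */
			if (!nfsd_file_lru_remove(nf)) {
				rcu_read_unlock();
				return;
			}
		}
		rcu_read_unlock();
	}
	if (refcount_dec_and_test(&nf->nf_ref))
		nfsd_file_free(nf);
}

If that looks roughly right, I will turn it into a proper patch against
the current tree and retest with the mdelay()-based reproducer.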