Re: [PATCH] Revert "libfs: Use d_children list to iterate simple_offset directories"

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Chuck Lever <chuck.lever@oracle.com>
To: Sun Yongjian <sunyongjian1@huawei.com>
Cc: linux-fsdevel@vger.kernel.org, yangerkun@huawei.com
Subject: Re: [PATCH] Revert "libfs: Use d_children list to iterate simple_offset directories"
Date: Tue, 11 Mar 2025 11:23:04 -0400	[thread overview]
Message-ID: <f73e4e72-c46d-499b-a5d6-bf469331d496@oracle.com> (raw)
In-Reply-To: <691e95db-112e-4276-9de4-03a383ff4bfe@huawei.com>

On 3/11/25 9:55 AM, Sun Yongjian wrote:
> 
> 
> 在 2025/3/11 1:30, Chuck Lever 写道:
>> On 3/10/25 12:29 PM, Greg Kroah-Hartman wrote:
>>> On Wed, Feb 26, 2025 at 03:33:56PM -0500, Chuck Lever wrote:
>>>> On 2/26/25 2:13 PM, Greg Kroah-Hartman wrote:
>>>>> On Wed, Feb 26, 2025 at 11:28:35AM -0500, Chuck Lever wrote:
>>>>>> On 2/26/25 11:21 AM, Greg Kroah-Hartman wrote:
>>>>>>> On Wed, Feb 26, 2025 at 10:57:48AM -0500, Chuck Lever wrote:
>>>>>>>> On 2/26/25 9:29 AM, Greg Kroah-Hartman wrote:
>>>>>>>>> This reverts commit b9b588f22a0c049a14885399e27625635ae6ef91.
>>>>>>>>>
>>>>>>>>> There are reports of this commit breaking Chrome's rendering
>>>>>>>>> mode.  As
>>>>>>>>> no one seems to want to do a root-cause, let's just revert it
>>>>>>>>> for now as
>>>>>>>>> it is affecting people using the latest release as well as the
>>>>>>>>> stable
>>>>>>>>> kernels that it has been backported to.
>>>>>>>>
>>>>>>>> NACK. This re-introduces a CVE.
>>>>>>>
>>>>>>> As I said elsewhere, when a commit that is assigned a CVE is
>>>>>>> reverted,
>>>>>>> then the CVE gets revoked.  But I don't see this commit being
>>>>>>> assigned
>>>>>>> to a CVE, so what CVE specifically are you referring to?
>>>>>>
>>>>>> https://nvd.nist.gov/vuln/detail/CVE-2024-46701
>>>>>
>>>>> That refers to commit 64a7ce76fb90 ("libfs: fix infinite directory
>>>>> reads
>>>>> for offset dir"), which showed up in 6.11 (and only backported to
>>>>> 6.10.7
>>>>> (which is long end-of-life).  Commit b9b588f22a0c ("libfs: Use
>>>>> d_children list to iterate simple_offset directories") is in 6.14-rc1
>>>>> and has been backported to 6.6.75, 6.12.12, and 6.13.1.
>>>>>
>>>>> I don't understand the interaction here, sorry.
>>>>
>>>> Commit 64a7ce76fb90 is an attempt to fix the infinite loop, but can
>>>> not be applied to kernels before 0e4a862174f2 ("libfs: Convert simple
>>>> directory offsets to use a Maple Tree"), even though those kernels also
>>>> suffer from the looping symptoms described in the CVE.
>>>>
>>>> There was significant controversy (which you responded to) when Yu Kuai
>>>> <yukuai3@huawei.com> attempted a backport of 64a7ce76fb90 to address
>>>> this CVE in v6.6 by first applying all upstream mtree patches to v6.6.
>>>> That backport was roundly rejected by Liam and Lorenzo.
>>>>
>>>> Commit b9b588f22a0c is a second attempt to fix the infinite loop
>>>> problem
>>>> that does not depend on having a working Maple tree implementation.
>>>> b9b588f22a0c is a fix that can work properly with the older xarray
>>>> mechanism that 0e4a862174f2 replaced, so it can be backported (with
>>>> certain adjustments) to kernels before 0e4a862174f2.
>>>>
>>>> Note that as part of the series where b9b588f22a0c was applied,
>>>> 64a7ce76fb90 is reverted (v6.10 and forward). Reverting b9b588f22a0c
>>>> leaves LTS kernels from v6.6 forward with the infinite loop problem
>>>> unfixed entirely because 64a7ce76fb90 has also now been reverted.
>>>>
>>>>
>>>>>> The guideline that "regressions are more important than CVEs" is
>>>>>> interesting. I hadn't heard that before.
>>>>>
>>>>> CVEs should not be relevant for development given that we create 10-11
>>>>> of them a day.  Treat them like any other public bug list please.
>>>>>
>>>>> But again, I don't understand how reverting this commit relates to the
>>>>> CVE id you pointed at, what am I missing?
>>>>>
>>>>>> Still, it seems like we haven't had a chance to actually work on this
>>>>>> issue yet. It could be corrected by a simple fix. Reverting seems
>>>>>> premature to me.
>>>>>
>>>>> I'll let that be up to the vfs maintainers, but I'd push for reverting
>>>>> first to fix the regression and then taking the time to find the real
>>>>> change going forward to make our user's lives easier.  Especially as I
>>>>> don't know who is working on that "simple fix" :)
>>>>
>>>> The issue is that we need the Chrome team to tell us what new system
>>>> behavior is causing Chrome to malfunction. None of us have expertise to
>>>> examine as complex an application as Chrome to nail the one small
>>>> change
>>>> that is causing the problem. This could even be a latent bug in Chrome.
>>>>
>>>> As soon as they have reviewed the bug and provided a simple reproducer,
>>>> I will start active triage.
>>>
>>> What ever happened with all of this?
>>
>> https://issuetracker.google.com/issues/396434686?pli=1
>>
>> The Chrome engineer chased this into the Mesa library, but since then
>> progress has slowed. We still don't know why GPU acceleration is not
>> being detected on certain devices.
>>
>>
> Hello,
> 
> 
> I recently conducted an experiment after applying the patch "libfs: Use
> d_children
> 
> list to iterate simple_offset directories."  In a directory under tmpfs,
> I created 1026
> 
> files using the following commands:
> for i in {1..1026}; do
>     echo "This is file $i" > /tmp/dir/file$i
> done
> 
> When I use the ls to read the contents of the dir, I find that glibc
> performs two
> 
> rounds of readdir calls due to the large number of files. The first
> readdir populates
> 
> dirent with file1026 through file5, and the second readdir populates it
> with file4
> 
> through file1, which are then returned to user space.
> 
> If an unlink file4 operation is inserted between these two readdir
> calls, the second
> 
> readdir will return file5, file3, file2, and file1, causing ls to
> display two instances of
> 
> file5. However, if we replace mas_find with mas_find_rev in the
> offset_dir_lookup
> 
> function, the results become normal.
> 
> I'm not sure whether this experiment could shed light on a potential
> fix.

Thanks for the report. Directory contents cached in glibc make this
stack more brittle than it needs to be, certainly. Your issue does
look like a bug that is related to the commit.

We believe the GPU acceleration bug is related to directory order,
but I don't think libdrm is removing an entry from /dev/dri, so I
am a little skeptical this is the cause of the GPU acceleration issue
(could be wrong though).

What I recommend you do is:

 a. Create a full patch (with S-o-b) that replaces mas_find() with
    mas_find_rev() in offset_dir_lookup()

 b. Construct a new fstests test that looks for this problem (and
    it would be good to investigate fstests to see if it already
    looks for this issue, but I bet it does not)

 c. Run the full fstests suite against a kernel before and after you
    apply a. and confirm that the problem goes away and does not
    introduce new test failures when a. is applied

 d. If all goes to plan, post a. to linux-fsdevel and linux-mm.


-- 
Chuck Lever

next prev parent reply	other threads:[~2025-03-11 15:23 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-26 14:29 [PATCH] Revert "libfs: Use d_children list to iterate simple_offset directories" Greg Kroah-Hartman
2025-02-26 15:57 ` Chuck Lever
2025-02-26 16:21   ` Greg Kroah-Hartman
2025-02-26 16:28     ` Chuck Lever
2025-02-26 19:13       ` Greg Kroah-Hartman
2025-02-26 20:33         ` Chuck Lever
2025-03-10 16:29           ` Greg Kroah-Hartman
2025-03-10 17:30             ` Chuck Lever
2025-03-11 13:55               ` Sun Yongjian
2025-03-11 15:23                 ` Chuck Lever [this message]
2025-03-11 18:11                   ` Darrick J. Wong
2025-03-11 18:25                     ` Chuck Lever
2025-03-11 18:52                       ` Darrick J. Wong
2025-03-11 19:39                         ` Chuck Lever
2025-03-11 21:47                           ` Kent Overstreet
2025-03-12 15:39                             ` Christoph Hellwig
2025-03-12 15:47                           ` Christoph Hellwig
2025-03-13 13:45                   ` Chuck Lever
2025-03-18 11:38                     ` Sun Yongjian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f73e4e72-c46d-499b-a5d6-bf469331d496@oracle.com \
    --to=chuck.lever@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=sunyongjian1@huawei.com \
    --cc=yangerkun@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.