* [RFC] weird stuff in exfat_lookup()
@ 2025-02-27 22:48 Al Viro
2025-02-28 5:44 ` Namjae Jeon
0 siblings, 1 reply; 6+ messages in thread
From: Al Viro @ 2025-02-27 22:48 UTC (permalink / raw)
To: Namjae Jeon; +Cc: linux-fsdevel
There's a really odd comment in that thing:
/*
* Unhashed alias is able to exist because of revalidate()
* called by lookup_fast. You can easily make this status
* by calling create and lookup concurrently
* In such case, we reuse an alias instead of new dentry
*/
and AFAICS it had been there since the original merge. What I don't
understand is how the hell could revalidate result in that -
exfat_d_revalidate() always returns 1 on any positive dentry and alias is
obviously positive (it has the same inode as the one we are about to use).
It mentions a way to reproduce that, but I don't understand what does
that refer to; could you give details?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] weird stuff in exfat_lookup()
2025-02-27 22:48 [RFC] weird stuff in exfat_lookup() Al Viro
@ 2025-02-28 5:44 ` Namjae Jeon
2025-02-28 16:03 ` Sungjong Seo
0 siblings, 1 reply; 6+ messages in thread
From: Namjae Jeon @ 2025-02-28 5:44 UTC (permalink / raw)
To: Al Viro, Sungjong Seo; +Cc: linux-fsdevel
On Fri, Feb 28, 2025 at 7:48 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> There's a really odd comment in that thing:
> /*
> * Unhashed alias is able to exist because of revalidate()
> * called by lookup_fast. You can easily make this status
> * by calling create and lookup concurrently
> * In such case, we reuse an alias instead of new dentry
> */
> and AFAICS it had been there since the original merge. What I don't
> understand is how the hell could revalidate result in that -
> exfat_d_revalidate() always returns 1 on any positive dentry and alias is
> obviously positive (it has the same inode as the one we are about to use).
>
> It mentions a way to reproduce that, but I don't understand what does
> that refer to; could you give details?
We need to find out the history of it.
Sungjong, Could you please check the history of how this code came in?
Thanks.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] weird stuff in exfat_lookup()
2025-02-28 5:44 ` Namjae Jeon
@ 2025-02-28 16:03 ` Sungjong Seo
2025-03-13 12:39 ` Sungjong Seo
0 siblings, 1 reply; 6+ messages in thread
From: Sungjong Seo @ 2025-02-28 16:03 UTC (permalink / raw)
To: Namjae Jeon, Al Viro; +Cc: linux-fsdevel
Hello? This is Sungjong. Currently, I am unable to reply using my
samsung.com email, so I am responding with my other Gmail account.
On 2/28/25 14:44, Namjae Jeon wrote:
> On Fri, Feb 28, 2025 at 7:48 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>>
>> There's a really odd comment in that thing:
>> /*
>> * Unhashed alias is able to exist because of revalidate()
>> * called by lookup_fast. You can easily make this status
>> * by calling create and lookup concurrently
>> * In such case, we reuse an alias instead of new dentry
>> */
>> and AFAICS it had been there since the original merge. What I don't
>> understand is how the hell could revalidate result in that -
>> exfat_d_revalidate() always returns 1 on any positive dentry and alias is
>> obviously positive (it has the same inode as the one we are about to use).
>>
>> It mentions a way to reproduce that, but I don't understand what does
>> that refer to; could you give details?
> We need to find out the history of it.
> Sungjong, Could you please check the history of how this code came in?
I believe this code is intended to address issues that could arise from
the stacked FS nested mount structure used in older versions of Android,
which are unlikely to occur in the general Linux VFS. However, I will
need to look into the modification history to confirm this, and it might
take some time.
Additionally, there is unnecessary code remaining in `exfat_lookup`.
This is because linux-exfat is based on Samsung's fat/exfat integrated
implementation, sdfat. We need to address these legacy issues one by one.
Thank you.
>
> Thanks.
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: [RFC] weird stuff in exfat_lookup()
2025-02-28 16:03 ` Sungjong Seo
@ 2025-03-13 12:39 ` Sungjong Seo
2026-04-03 19:54 ` Al Viro
0 siblings, 1 reply; 6+ messages in thread
From: Sungjong Seo @ 2025-03-13 12:39 UTC (permalink / raw)
To: 'Namjae Jeon', 'Al Viro'
Cc: linux-fsdevel, sj1557.seo, sjdev.seo
Hello,
> Hello? This is Sungjong. Currently, I am unable to reply using my
> samsung.com email, so I am responding with my other Gmail account.
>
> On 2/28/25 14:44, Namjae Jeon wrote:
> > On Fri, Feb 28, 2025 at 7:48 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >>
> >> There's a really odd comment in that thing:
> >> /*
> >> * Unhashed alias is able to exist because of revalidate()
> >> * called by lookup_fast. You can easily make this status
> >> * by calling create and lookup concurrently
> >> * In such case, we reuse an alias instead of new dentry
> >> */
> >> and AFAICS it had been there since the original merge. What I don't
> >> understand is how the hell could revalidate result in that -
> >> exfat_d_revalidate() always returns 1 on any positive dentry and alias
> is
> >> obviously positive (it has the same inode as the one we are about to
> use).
> >>
> >> It mentions a way to reproduce that, but I don't understand what does
> >> that refer to; could you give details?
I tested it on an arm64 device running on Android with kernel v5.15, v6.6.
And I can see unhashed-positive dentry with the following simple TC,
even if it is not an Android stacked fs environment.
* TestCase
- thread1: while(1) { mkdir(A) and rmdir(A) }
- thread2: while(1) { stat(A) }
This is due to the characteristics of exfat allowing negative dentry and
considering CI in d_revalidate. As mentioned in the comment,
unhashed-positive dentry can exist in a situation where mkdir
and stat are competing, and it can be dropped, but exfat_lookup has
been implemented to reuse(rehash) this dentry.
I hope the following callstack will help you understand.
Thank you.
<CPU 0> <CPU 1>
do_mkdirat
user_path_create
*filename_create
inode_lock(I_MUTEX_PARENT)
__lookup_hash(LOOKUP_CREATE | LOOKUP_EXCL)
dentry = d_alloc
exfat_lookup
lock(sbi->s_lock)
no_exist_file
unlock(sbi->s_lock)
d_version_set
d_splice_alias
__d_add() //hashed-negative dentry here
*vfs_mkdir
exfat_mkdir
lock(sbi->s_lock)
inc_iversion(dir)//iversion diff occured
vfs_statx
filename_lookup
path_lookupat
lookup_last
work_component
lookup_fast
dentry = __d_lookup
d_revalidate(dentry) //because of iversion diff
d_invalidate(dentry)
__d_drop(dentry)
lookup_slow
inode_lock_shared(dir)
inc_nlink(dir)
inode = build_inode
inc_iversion(inode)
d_instantiate(dentry, inode) //* unhashed-positive dentry here
unlock(sbi->s_lock)
*done_path_create
inode_unlock(I_MUTEX_PARENT)
__lookup_slow
exfat_lookup
lock(sbi->s_lock)
inode = exfat_build_inode
alias = d_find_alias(inode)
d_unhashed(alias)? // * found unhashed positive dentry *
d_rehash(alias)
...
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] weird stuff in exfat_lookup()
2025-03-13 12:39 ` Sungjong Seo
@ 2026-04-03 19:54 ` Al Viro
2026-04-03 20:02 ` Al Viro
0 siblings, 1 reply; 6+ messages in thread
From: Al Viro @ 2026-04-03 19:54 UTC (permalink / raw)
To: Sungjong Seo; +Cc: 'Namjae Jeon', linux-fsdevel, sjdev.seo
[with apologies for very late reply]
On Thu, Mar 13, 2025 at 09:39:39PM +0900, Sungjong Seo wrote:
> - thread1: while(1) { mkdir(A) and rmdir(A) }
> - thread2: while(1) { stat(A) }
>
> This is due to the characteristics of exfat allowing negative dentry and
> considering CI in d_revalidate. As mentioned in the comment,
> unhashed-positive dentry can exist in a situation where mkdir
> and stat are competing, and it can be dropped, but exfat_lookup has
> been implemented to reuse(rehash) this dentry.
That's an interesting scenario, but I still don't see why would we bother.
Note that in your example we don't need to even look for aliases - it's
a directory inode, so d_splice_alias() would do the right thing, no matter
what. And for non-directories you
* already have d_move(alias, dentry) there, which would do the right
thing as well and
* won't get an unhashed alias from d_find_alias() to start with.
Frankly, I would skip the entire "look for aliases" thing in case of
directory inodes - just let d_splice_alias() handle it. That has
another fun benefit - exfat_d_anon_disconn() check becomes completely
pointless. By the time we call it we have already verified that
alias->d_parent == dentry->d_parent, so the only way to get
exfat_d_anon_disconn(alias) to be true is to have IS_ROOT(alias),
i.e. alias->d_parent == alias and thus alias == dentry->d_parent
and at the very least the inode is a directory one. We obvously
want to have it fail with -ELOOP in such case and d_splice_alias()
does just that, so if we bypass the entire "look for an alias"
thing for directories, the check becomes identical to
if (alias && alias->d_parent == dentry->d_parent)
since the last term in the current variant (!exfat_d_anon_disconn(...))
can be dropped, along with the helper itself.
Does anybody have a problem with patch below?
[PATCH] simplify exfat_lookup()
1) d_splice_alias() handles ERR_PTR() for inode just fine
2) no need to even look for existing aliases in case of directory inodes;
just punt to d_splice_alias(), it'll do the right thing
3) no need to bother with 'd_unhashed(alias)' case - d_find_alias()
would've returned that only in case of a directory, and d_splice_alias()
will handle that just fine on its own.
4) exfat_d_anon_disconn() is entirely pointless now - we only get to
evaluating it in case dentry->d_parent == alias->d_parent and
alias being a non-directory. But in that case IS_ROOT(alias) can't
possibly be true - that would've reqiured alias == alias->d_parent,
i.e alias == dentry->d_parent and dentry->d_parent is guaranteed to
be a directory. So exfat_d_anon_disconn() would always return false
when it's called, which makes && !exfat_d_anon_disconn(alias)
a no-op.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/fs/exfat/namei.c b/fs/exfat/namei.c
index 670116ae9ec8..8fac39f2bcb3 100644
--- a/fs/exfat/namei.c
+++ b/fs/exfat/namei.c
@@ -711,71 +711,44 @@ static int exfat_find(struct inode *dir, const struct qstr *qname,
return 0;
}
-static int exfat_d_anon_disconn(struct dentry *dentry)
-{
- return IS_ROOT(dentry) && (dentry->d_flags & DCACHE_DISCONNECTED);
-}
-
static struct dentry *exfat_lookup(struct inode *dir, struct dentry *dentry,
unsigned int flags)
{
struct super_block *sb = dir->i_sb;
- struct inode *inode;
+ struct inode *inode = NULL;
struct dentry *alias;
struct exfat_dir_entry info;
int err;
loff_t i_pos;
- mode_t i_mode;
mutex_lock(&EXFAT_SB(sb)->s_lock);
err = exfat_find(dir, &dentry->d_name, &info);
if (err) {
- if (err == -ENOENT) {
- inode = NULL;
- goto out;
- }
- goto unlock;
+ if (unlikely(err != -ENOENT))
+ inode = ERR_PTR(err);
+ goto out;
}
i_pos = exfat_make_i_pos(&info);
inode = exfat_build_inode(sb, &info, i_pos);
- err = PTR_ERR_OR_ZERO(inode);
- if (err)
- goto unlock;
+ if (IS_ERR(inode) || S_ISDIR(inode->i_mode))
+ goto out;
- i_mode = inode->i_mode;
alias = d_find_alias(inode);
/*
* Checking "alias->d_parent == dentry->d_parent" to make sure
* FS is not corrupted (especially double linked dir).
*/
- if (alias && alias->d_parent == dentry->d_parent &&
- !exfat_d_anon_disconn(alias)) {
-
+ if (alias && alias->d_parent == dentry->d_parent) {
/*
- * Unhashed alias is able to exist because of revalidate()
- * called by lookup_fast. You can easily make this status
- * by calling create and lookup concurrently
- * In such case, we reuse an alias instead of new dentry
+ * This inode has non anonymous-DCACHE_DISCONNECTED
+ * dentry. This means, the user did ->lookup() by an
+ * another name (longname vs 8.3 alias of it) in past.
+ *
+ * Switch to new one for reason of locality if possible.
*/
- if (d_unhashed(alias)) {
- WARN_ON(alias->d_name.hash_len !=
- dentry->d_name.hash_len);
- exfat_info(sb, "rehashed a dentry(%p) in read lookup",
- alias);
- d_drop(dentry);
- d_rehash(alias);
- } else if (!S_ISDIR(i_mode)) {
- /*
- * This inode has non anonymous-DCACHE_DISCONNECTED
- * dentry. This means, the user did ->lookup() by an
- * another name (longname vs 8.3 alias of it) in past.
- *
- * Switch to new one for reason of locality if possible.
- */
- d_move(alias, dentry);
- }
+ d_move(alias, dentry);
iput(inode);
mutex_unlock(&EXFAT_SB(sb)->s_lock);
return alias;
@@ -787,9 +760,6 @@ static struct dentry *exfat_lookup(struct inode *dir, struct dentry *dentry,
exfat_d_version_set(dentry, inode_query_iversion(dir));
return d_splice_alias(inode, dentry);
-unlock:
- mutex_unlock(&EXFAT_SB(sb)->s_lock);
- return ERR_PTR(err);
}
/* remove an entry, BUT don't truncate */
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [RFC] weird stuff in exfat_lookup()
2026-04-03 19:54 ` Al Viro
@ 2026-04-03 20:02 ` Al Viro
0 siblings, 0 replies; 6+ messages in thread
From: Al Viro @ 2026-04-03 20:02 UTC (permalink / raw)
To: Sungjong Seo; +Cc: 'Namjae Jeon', linux-fsdevel, sjdev.seo
On Fri, Apr 03, 2026 at 08:54:08PM +0100, Al Viro wrote:
> + * This inode has non anonymous-DCACHE_DISCONNECTED
> + * dentry. This means, the user did ->lookup() by an
That phrase probably should be "This inode has a hashed alias dentry
with different name.", while we are at it - DCACHE_DISCONNECTED is
quite irrelevant...
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-04-03 19:58 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-27 22:48 [RFC] weird stuff in exfat_lookup() Al Viro
2025-02-28 5:44 ` Namjae Jeon
2025-02-28 16:03 ` Sungjong Seo
2025-03-13 12:39 ` Sungjong Seo
2026-04-03 19:54 ` Al Viro
2026-04-03 20:02 ` Al Viro
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox