* [RFC] weird stuff in exfat_lookup()
@ 2025-02-27 22:48 Al Viro
2025-02-28 5:44 ` Namjae Jeon
0 siblings, 1 reply; 6+ messages in thread
From: Al Viro @ 2025-02-27 22:48 UTC (permalink / raw)
To: Namjae Jeon; +Cc: linux-fsdevel
There's a really odd comment in that thing:
/*
* Unhashed alias is able to exist because of revalidate()
* called by lookup_fast. You can easily make this status
* by calling create and lookup concurrently
* In such case, we reuse an alias instead of new dentry
*/
and AFAICS it had been there since the original merge. What I don't
understand is how the hell could revalidate result in that -
exfat_d_revalidate() always returns 1 on any positive dentry and alias is
obviously positive (it has the same inode as the one we are about to use).
It mentions a way to reproduce that, but I don't understand what does
that refer to; could you give details?
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [RFC] weird stuff in exfat_lookup() 2025-02-27 22:48 [RFC] weird stuff in exfat_lookup() Al Viro @ 2025-02-28 5:44 ` Namjae Jeon 2025-02-28 16:03 ` Sungjong Seo 0 siblings, 1 reply; 6+ messages in thread From: Namjae Jeon @ 2025-02-28 5:44 UTC (permalink / raw) To: Al Viro, Sungjong Seo; +Cc: linux-fsdevel On Fri, Feb 28, 2025 at 7:48 AM Al Viro <viro@zeniv.linux.org.uk> wrote: > > There's a really odd comment in that thing: > /* > * Unhashed alias is able to exist because of revalidate() > * called by lookup_fast. You can easily make this status > * by calling create and lookup concurrently > * In such case, we reuse an alias instead of new dentry > */ > and AFAICS it had been there since the original merge. What I don't > understand is how the hell could revalidate result in that - > exfat_d_revalidate() always returns 1 on any positive dentry and alias is > obviously positive (it has the same inode as the one we are about to use). > > It mentions a way to reproduce that, but I don't understand what does > that refer to; could you give details? We need to find out the history of it. Sungjong, Could you please check the history of how this code came in? Thanks. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] weird stuff in exfat_lookup() 2025-02-28 5:44 ` Namjae Jeon @ 2025-02-28 16:03 ` Sungjong Seo 2025-03-13 12:39 ` Sungjong Seo 0 siblings, 1 reply; 6+ messages in thread From: Sungjong Seo @ 2025-02-28 16:03 UTC (permalink / raw) To: Namjae Jeon, Al Viro; +Cc: linux-fsdevel Hello? This is Sungjong. Currently, I am unable to reply using my samsung.com email, so I am responding with my other Gmail account. On 2/28/25 14:44, Namjae Jeon wrote: > On Fri, Feb 28, 2025 at 7:48 AM Al Viro <viro@zeniv.linux.org.uk> wrote: >> >> There's a really odd comment in that thing: >> /* >> * Unhashed alias is able to exist because of revalidate() >> * called by lookup_fast. You can easily make this status >> * by calling create and lookup concurrently >> * In such case, we reuse an alias instead of new dentry >> */ >> and AFAICS it had been there since the original merge. What I don't >> understand is how the hell could revalidate result in that - >> exfat_d_revalidate() always returns 1 on any positive dentry and alias is >> obviously positive (it has the same inode as the one we are about to use). >> >> It mentions a way to reproduce that, but I don't understand what does >> that refer to; could you give details? > We need to find out the history of it. > Sungjong, Could you please check the history of how this code came in? I believe this code is intended to address issues that could arise from the stacked FS nested mount structure used in older versions of Android, which are unlikely to occur in the general Linux VFS. However, I will need to look into the modification history to confirm this, and it might take some time. Additionally, there is unnecessary code remaining in `exfat_lookup`. This is because linux-exfat is based on Samsung's fat/exfat integrated implementation, sdfat. We need to address these legacy issues one by one. Thank you. > > Thanks. > ^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: [RFC] weird stuff in exfat_lookup() 2025-02-28 16:03 ` Sungjong Seo @ 2025-03-13 12:39 ` Sungjong Seo 2026-04-03 19:54 ` Al Viro 0 siblings, 1 reply; 6+ messages in thread From: Sungjong Seo @ 2025-03-13 12:39 UTC (permalink / raw) To: 'Namjae Jeon', 'Al Viro' Cc: linux-fsdevel, sj1557.seo, sjdev.seo Hello, > Hello? This is Sungjong. Currently, I am unable to reply using my > samsung.com email, so I am responding with my other Gmail account. > > On 2/28/25 14:44, Namjae Jeon wrote: > > On Fri, Feb 28, 2025 at 7:48 AM Al Viro <viro@zeniv.linux.org.uk> wrote: > >> > >> There's a really odd comment in that thing: > >> /* > >> * Unhashed alias is able to exist because of revalidate() > >> * called by lookup_fast. You can easily make this status > >> * by calling create and lookup concurrently > >> * In such case, we reuse an alias instead of new dentry > >> */ > >> and AFAICS it had been there since the original merge. What I don't > >> understand is how the hell could revalidate result in that - > >> exfat_d_revalidate() always returns 1 on any positive dentry and alias > is > >> obviously positive (it has the same inode as the one we are about to > use). > >> > >> It mentions a way to reproduce that, but I don't understand what does > >> that refer to; could you give details? I tested it on an arm64 device running on Android with kernel v5.15, v6.6. And I can see unhashed-positive dentry with the following simple TC, even if it is not an Android stacked fs environment. * TestCase - thread1: while(1) { mkdir(A) and rmdir(A) } - thread2: while(1) { stat(A) } This is due to the characteristics of exfat allowing negative dentry and considering CI in d_revalidate. As mentioned in the comment, unhashed-positive dentry can exist in a situation where mkdir and stat are competing, and it can be dropped, but exfat_lookup has been implemented to reuse(rehash) this dentry. I hope the following callstack will help you understand. Thank you. <CPU 0> <CPU 1> do_mkdirat user_path_create *filename_create inode_lock(I_MUTEX_PARENT) __lookup_hash(LOOKUP_CREATE | LOOKUP_EXCL) dentry = d_alloc exfat_lookup lock(sbi->s_lock) no_exist_file unlock(sbi->s_lock) d_version_set d_splice_alias __d_add() //hashed-negative dentry here *vfs_mkdir exfat_mkdir lock(sbi->s_lock) inc_iversion(dir)//iversion diff occured vfs_statx filename_lookup path_lookupat lookup_last work_component lookup_fast dentry = __d_lookup d_revalidate(dentry) //because of iversion diff d_invalidate(dentry) __d_drop(dentry) lookup_slow inode_lock_shared(dir) inc_nlink(dir) inode = build_inode inc_iversion(inode) d_instantiate(dentry, inode) //* unhashed-positive dentry here unlock(sbi->s_lock) *done_path_create inode_unlock(I_MUTEX_PARENT) __lookup_slow exfat_lookup lock(sbi->s_lock) inode = exfat_build_inode alias = d_find_alias(inode) d_unhashed(alias)? // * found unhashed positive dentry * d_rehash(alias) ... ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] weird stuff in exfat_lookup() 2025-03-13 12:39 ` Sungjong Seo @ 2026-04-03 19:54 ` Al Viro 2026-04-03 20:02 ` Al Viro 0 siblings, 1 reply; 6+ messages in thread From: Al Viro @ 2026-04-03 19:54 UTC (permalink / raw) To: Sungjong Seo; +Cc: 'Namjae Jeon', linux-fsdevel, sjdev.seo [with apologies for very late reply] On Thu, Mar 13, 2025 at 09:39:39PM +0900, Sungjong Seo wrote: > - thread1: while(1) { mkdir(A) and rmdir(A) } > - thread2: while(1) { stat(A) } > > This is due to the characteristics of exfat allowing negative dentry and > considering CI in d_revalidate. As mentioned in the comment, > unhashed-positive dentry can exist in a situation where mkdir > and stat are competing, and it can be dropped, but exfat_lookup has > been implemented to reuse(rehash) this dentry. That's an interesting scenario, but I still don't see why would we bother. Note that in your example we don't need to even look for aliases - it's a directory inode, so d_splice_alias() would do the right thing, no matter what. And for non-directories you * already have d_move(alias, dentry) there, which would do the right thing as well and * won't get an unhashed alias from d_find_alias() to start with. Frankly, I would skip the entire "look for aliases" thing in case of directory inodes - just let d_splice_alias() handle it. That has another fun benefit - exfat_d_anon_disconn() check becomes completely pointless. By the time we call it we have already verified that alias->d_parent == dentry->d_parent, so the only way to get exfat_d_anon_disconn(alias) to be true is to have IS_ROOT(alias), i.e. alias->d_parent == alias and thus alias == dentry->d_parent and at the very least the inode is a directory one. We obvously want to have it fail with -ELOOP in such case and d_splice_alias() does just that, so if we bypass the entire "look for an alias" thing for directories, the check becomes identical to if (alias && alias->d_parent == dentry->d_parent) since the last term in the current variant (!exfat_d_anon_disconn(...)) can be dropped, along with the helper itself. Does anybody have a problem with patch below? [PATCH] simplify exfat_lookup() 1) d_splice_alias() handles ERR_PTR() for inode just fine 2) no need to even look for existing aliases in case of directory inodes; just punt to d_splice_alias(), it'll do the right thing 3) no need to bother with 'd_unhashed(alias)' case - d_find_alias() would've returned that only in case of a directory, and d_splice_alias() will handle that just fine on its own. 4) exfat_d_anon_disconn() is entirely pointless now - we only get to evaluating it in case dentry->d_parent == alias->d_parent and alias being a non-directory. But in that case IS_ROOT(alias) can't possibly be true - that would've reqiured alias == alias->d_parent, i.e alias == dentry->d_parent and dentry->d_parent is guaranteed to be a directory. So exfat_d_anon_disconn() would always return false when it's called, which makes && !exfat_d_anon_disconn(alias) a no-op. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> --- diff --git a/fs/exfat/namei.c b/fs/exfat/namei.c index 670116ae9ec8..8fac39f2bcb3 100644 --- a/fs/exfat/namei.c +++ b/fs/exfat/namei.c @@ -711,71 +711,44 @@ static int exfat_find(struct inode *dir, const struct qstr *qname, return 0; } -static int exfat_d_anon_disconn(struct dentry *dentry) -{ - return IS_ROOT(dentry) && (dentry->d_flags & DCACHE_DISCONNECTED); -} - static struct dentry *exfat_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags) { struct super_block *sb = dir->i_sb; - struct inode *inode; + struct inode *inode = NULL; struct dentry *alias; struct exfat_dir_entry info; int err; loff_t i_pos; - mode_t i_mode; mutex_lock(&EXFAT_SB(sb)->s_lock); err = exfat_find(dir, &dentry->d_name, &info); if (err) { - if (err == -ENOENT) { - inode = NULL; - goto out; - } - goto unlock; + if (unlikely(err != -ENOENT)) + inode = ERR_PTR(err); + goto out; } i_pos = exfat_make_i_pos(&info); inode = exfat_build_inode(sb, &info, i_pos); - err = PTR_ERR_OR_ZERO(inode); - if (err) - goto unlock; + if (IS_ERR(inode) || S_ISDIR(inode->i_mode)) + goto out; - i_mode = inode->i_mode; alias = d_find_alias(inode); /* * Checking "alias->d_parent == dentry->d_parent" to make sure * FS is not corrupted (especially double linked dir). */ - if (alias && alias->d_parent == dentry->d_parent && - !exfat_d_anon_disconn(alias)) { - + if (alias && alias->d_parent == dentry->d_parent) { /* - * Unhashed alias is able to exist because of revalidate() - * called by lookup_fast. You can easily make this status - * by calling create and lookup concurrently - * In such case, we reuse an alias instead of new dentry + * This inode has non anonymous-DCACHE_DISCONNECTED + * dentry. This means, the user did ->lookup() by an + * another name (longname vs 8.3 alias of it) in past. + * + * Switch to new one for reason of locality if possible. */ - if (d_unhashed(alias)) { - WARN_ON(alias->d_name.hash_len != - dentry->d_name.hash_len); - exfat_info(sb, "rehashed a dentry(%p) in read lookup", - alias); - d_drop(dentry); - d_rehash(alias); - } else if (!S_ISDIR(i_mode)) { - /* - * This inode has non anonymous-DCACHE_DISCONNECTED - * dentry. This means, the user did ->lookup() by an - * another name (longname vs 8.3 alias of it) in past. - * - * Switch to new one for reason of locality if possible. - */ - d_move(alias, dentry); - } + d_move(alias, dentry); iput(inode); mutex_unlock(&EXFAT_SB(sb)->s_lock); return alias; @@ -787,9 +760,6 @@ static struct dentry *exfat_lookup(struct inode *dir, struct dentry *dentry, exfat_d_version_set(dentry, inode_query_iversion(dir)); return d_splice_alias(inode, dentry); -unlock: - mutex_unlock(&EXFAT_SB(sb)->s_lock); - return ERR_PTR(err); } /* remove an entry, BUT don't truncate */ ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [RFC] weird stuff in exfat_lookup() 2026-04-03 19:54 ` Al Viro @ 2026-04-03 20:02 ` Al Viro 0 siblings, 0 replies; 6+ messages in thread From: Al Viro @ 2026-04-03 20:02 UTC (permalink / raw) To: Sungjong Seo; +Cc: 'Namjae Jeon', linux-fsdevel, sjdev.seo On Fri, Apr 03, 2026 at 08:54:08PM +0100, Al Viro wrote: > + * This inode has non anonymous-DCACHE_DISCONNECTED > + * dentry. This means, the user did ->lookup() by an That phrase probably should be "This inode has a hashed alias dentry with different name.", while we are at it - DCACHE_DISCONNECTED is quite irrelevant... ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-04-03 19:58 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-02-27 22:48 [RFC] weird stuff in exfat_lookup() Al Viro 2025-02-28 5:44 ` Namjae Jeon 2025-02-28 16:03 ` Sungjong Seo 2025-03-13 12:39 ` Sungjong Seo 2026-04-03 19:54 ` Al Viro 2026-04-03 20:02 ` Al Viro
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox