From: Al Viro <viro@zeniv.linux.org.uk>
To: "zhengbin (A)" <zhengbin13@huawei.com>
Cc: jack@suse.cz, akpm@linux-foundation.org,
linux-fsdevel@vger.kernel.org,
"zhangyi (F)" <yi.zhang@huawei.com>,
renxudong1@huawei.com, Li Jun <jun.li@nxp.com>
Subject: Re: Possible FS race condition between iterate_dir and d_alloc_parallel
Date: Thu, 5 Sep 2019 18:47:44 +0100 [thread overview]
Message-ID: <20190905174744.GP1131@ZenIV.linux.org.uk> (raw)
In-Reply-To: <b5876e84-853c-e1f6-4fef-83d3d45e1767@huawei.com>
On Wed, Sep 04, 2019 at 02:15:58PM +0800, zhengbin (A) wrote:
> >>
> >> Confused... OTOH, I might be misreading that table of yours -
> >> it's about 30% wider than the widest xterm I can get while still
> >> being able to read the font...
>
> The table is my guess. This oops happens sometimes
>
> (We have one vmcore, others just have log, and the backtrace is same with vmcore, so the reason should be same).
>
> Unfortunately, we do not know how to reproduce it. The vmcore has such a law:
>
> 1、dirA has 177 files, and it is OK
>
> 2、dirB has 25 files, and it is OK
>
> 3、When we ls dirA, it begins with ".", "..", dirB's first file, second file... last file, last file->next = &(dirB->d_subdirs)
Hmm... Now, that is interesting. I'm not sure it has anything to do
with that bug, but lockless loops over d_subdirs can run into trouble.
Look: dentry_unlist() leaves the ->d_child.next pointing to the next
non-cursor list element (or parent's ->d_subdir, if there's nothing
else left). It works in pair with d_walk(): there we have
struct dentry *child = this_parent;
this_parent = child->d_parent;
spin_unlock(&child->d_lock);
spin_lock(&this_parent->d_lock);
/* might go back up the wrong parent if we have had a rename. */
if (need_seqretry(&rename_lock, seq))
goto rename_retry;
/* go into the first sibling still alive */
do {
next = child->d_child.next;
if (next == &this_parent->d_subdirs)
goto ascend;
child = list_entry(next, struct dentry, d_child);
} while (unlikely(child->d_flags & DCACHE_DENTRY_KILLED));
rcu_read_unlock();
Note the recheck of rename_lock there - it does guarantee that even if
child has been killed off between unlocking it and locking this_parent,
whatever it has ended up with in its ->d_child->next has *not* been
moved elsewhere. It might, in turn, have been killed off. In that
case its ->d_child.next points to the next surviving non-cursor, also
guaranteed to remain in the same directory, etc.
However, lose that rename_lock recheck and we'd get screwed, unless
there's some other d_move() prevention in effect.
Note that all libfs.c users (next_positive(), move_cursor(),
dcache_dir_lseek(), dcache_readdir(), simple_empty()) should be
safe - dcache_readdir() is called with directory locked at least
shared, uses in dcache_dir_lseek() are surrounded by the same,
move_cursor() and simple_empty() hold ->d_lock on parent,
next_positive() is called only under the lock on directory's
inode (at least shared). Any of those should prevent any
kind of cross-directory moves - both into and out of.
<greps for d_subdirs/d_child users>
Huh?
In drivers/usb/typec/tcpm/tcpm.c:
static void tcpm_debugfs_exit(struct tcpm_port *port)
{
int i;
mutex_lock(&port->logbuffer_lock);
for (i = 0; i < LOG_BUFFER_ENTRIES; i++) {
kfree(port->logbuffer[i]);
port->logbuffer[i] = NULL;
}
mutex_unlock(&port->logbuffer_lock);
debugfs_remove(port->dentry);
if (list_empty(&rootdir->d_subdirs)) {
debugfs_remove(rootdir);
rootdir = NULL;
}
}
Unrelated, but obviously broken. Not only the locking is
deeply suspect, but it's trivially confused by open() on
the damn directory. It will definitely have ->d_subdirs
non-empty.
Came in "usb: typec: tcpm: remove tcpm dir if no children",
author Cc'd... Why not remove the directory on rmmod?
And create on insmod, initially empty...
fs/nfsd/nfsctl.c:
static void nfsdfs_remove_files(struct dentry *root)
{
struct dentry *dentry, *tmp;
list_for_each_entry_safe(dentry, tmp, &root->d_subdirs, d_child) {
if (!simple_positive(dentry)) {
WARN_ON_ONCE(1); /* I think this can't happen? */
continue;
It can happen - again, just have it opened and it bloody well will.
Locking is OK, though - parent's inode is locked, so we are
safe from d_move() playing silly buggers there.
fs/autofs/root.c:
static void autofs_clear_leaf_automount_flags(struct dentry *dentry)
{
...
/* Set parent managed if it's becoming empty */
if (d_child->next == &parent->d_subdirs &&
d_child->prev == &parent->d_subdirs)
managed_dentry_set_managed(parent);
Same bogosity regarding the check for emptiness (that one might've been my
fault). Locking is safe... Not sure if all places in autofs/expire.c
are careful enough...
So it doesn't look like this theory holds. Which filesystem had that
been on and what about ->d_parent of dentries in dirA and dirB
->d_subdirs?
next prev parent reply other threads:[~2019-09-05 17:47 UTC|newest]
Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-03 14:44 Possible FS race condition between iterate_dir and d_alloc_parallel zhengbin (A)
2019-09-03 15:40 ` Al Viro
2019-09-03 15:41 ` Al Viro
2019-09-04 6:15 ` zhengbin (A)
2019-09-05 17:47 ` Al Viro [this message]
2019-09-06 0:55 ` Jun Li
2019-09-06 2:00 ` Al Viro
2019-09-06 2:32 ` zhengbin (A)
2019-09-09 14:10 ` zhengbin (A)
2019-09-09 14:59 ` Al Viro
2019-09-09 15:10 ` zhengbin (A)
[not found] ` <7e32cda5-dc89-719d-9651-cf2bd06ae728@huawei.com>
2019-09-10 21:53 ` Al Viro
2019-09-10 22:17 ` Al Viro
2019-09-14 16:16 ` [PATCH] " Al Viro
2019-09-14 16:49 ` Linus Torvalds
2019-09-14 17:01 ` Al Viro
2019-09-14 17:15 ` Linus Torvalds
2019-09-14 20:04 ` Al Viro
2019-09-14 22:57 ` Linus Torvalds
2019-09-15 0:50 ` Al Viro
2019-09-15 1:41 ` Linus Torvalds
2019-09-15 16:02 ` Al Viro
2019-09-15 17:58 ` Linus Torvalds
2019-09-21 14:07 ` Al Viro
2019-09-21 16:21 ` Linus Torvalds
2019-09-21 17:18 ` Al Viro
2019-09-21 17:38 ` Linus Torvalds
2019-09-24 2:52 ` Al Viro
2019-09-24 13:30 ` Josef Bacik
2019-09-24 14:51 ` Al Viro
2019-09-24 15:01 ` Josef Bacik
2019-09-24 15:11 ` Al Viro
2019-09-24 15:26 ` Josef Bacik
2019-09-24 16:33 ` Al Viro
[not found] ` <CAHk-=wiJ1eY7y6r_cFNRPCqD+BJZS7eJeQFO6OrXxRFjDAipsQ@mail.gmail.com>
2019-09-29 5:29 ` Al Viro
2019-09-25 11:59 ` Amir Goldstein
2019-09-25 12:22 ` Al Viro
2019-09-25 12:34 ` Amir Goldstein
2019-09-22 21:29 ` Al Viro
2019-09-23 3:32 ` zhengbin (A)
2019-09-23 5:08 ` Al Viro
2019-09-16 2:04 ` 266a9a8b41: WARNING:possible_recursive_locking_detected kernel test robot
2019-09-16 2:58 ` Al Viro
2019-09-16 2:58 ` Al Viro
2019-09-16 3:03 ` Al Viro
2019-09-16 3:03 ` Al Viro
2019-09-16 3:44 ` Linus Torvalds
2019-09-16 3:44 ` Linus Torvalds
2019-09-16 17:16 ` Al Viro
2019-09-16 17:16 ` Al Viro
2019-09-16 17:29 ` Al Viro
2019-09-16 17:29 ` Al Viro
2019-09-17 8:59 ` Rong Chen
2019-09-17 7:03 ` zhengbin
2019-09-17 12:01 ` Al Viro
2019-09-17 12:01 ` Al Viro
2019-09-19 3:36 ` zhengbin
2019-09-19 3:36 ` zhengbin (A)
2019-09-19 3:55 ` Al Viro
2019-09-19 3:55 ` Al Viro
2019-09-19 4:16 ` zhengbin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190905174744.GP1131@ZenIV.linux.org.uk \
--to=viro@zeniv.linux.org.uk \
--cc=akpm@linux-foundation.org \
--cc=jack@suse.cz \
--cc=jun.li@nxp.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=renxudong1@huawei.com \
--cc=yi.zhang@huawei.com \
--cc=zhengbin13@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.