* [BUG] fs/dcache: might_sleep is called under a spinlock
@ 2017-10-03 2:38 Jia-Ju Bai
2017-10-03 3:19 ` Al Viro
0 siblings, 1 reply; 3+ messages in thread
From: Jia-Ju Bai @ 2017-10-03 2:38 UTC (permalink / raw)
To: viro; +Cc: linux-fsdevel, linux-kernel
According to fs/dcache.c, might_sleep is called under a spinlock,
and the function call path is:
d_prune_aliases (acquire the spinlock)
dput
might_sleep
This bug is found by my static analysis tool and my code review.
A possible fix is to remove might_sleep in dput.
Thanks,
Jia-Ju Bai
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [BUG] fs/dcache: might_sleep is called under a spinlock 2017-10-03 2:38 [BUG] fs/dcache: might_sleep is called under a spinlock Jia-Ju Bai @ 2017-10-03 3:19 ` Al Viro 2017-10-03 8:46 ` Jia-Ju Bai 0 siblings, 1 reply; 3+ messages in thread From: Al Viro @ 2017-10-03 3:19 UTC (permalink / raw) To: Jia-Ju Bai; +Cc: linux-fsdevel, linux-kernel On Tue, Oct 03, 2017 at 10:38:25AM +0800, Jia-Ju Bai wrote: > According to fs/dcache.c, might_sleep is called under a spinlock, > and the function call path is: > d_prune_aliases (acquire the spinlock) > dput > might_sleep > > This bug is found by my static analysis tool and my code review. > A possible fix is to remove might_sleep in dput. ... or to fix your static analysis tool. First of all, that call of dput() really *can* block and if we had inode->i_lock or dentry->d_lock still held at that point we'd have a real bug. However, __dentry_kill() there is called with dentry->d_inode == inode and inode->i_lock held, so dentry->d_inode is stable until inode->i_lock is dropped. Said __dentry_kill() contains if (dentry->d_inode) dentry_unlink_inode(dentry); with inode->i_lock held until that point. dentry_unlink_inode() starts with struct inode *inode = dentry->d_inode; bool hashed = !d_unhashed(dentry); if (hashed) raw_write_seqcount_begin(&dentry->d_seq); __d_clear_type_and_inode(dentry); hlist_del_init(&dentry->d_u.d_alias); if (hashed) raw_write_seqcount_end(&dentry->d_seq); spin_unlock(&dentry->d_lock); spin_unlock(&inode->i_lock); so 1) inode in there is guaranteed to be equal to the argument of d_prune_aliases() and 2) both dentry->d_lock and inode->i_lock are dropped before dentry_unlink_inode() returns. inode->i_lock is not regained in the rest of __dentry_kill(); dentry->d_lock is regained and dropped before __dentry_kill() returns. IOW, we are fine - dput() in d_prune_aliases() is called without any spinlocks held. That, BTW, is the reason for goto restart; in there, instead of just continuing the loop - if we get to that point, the list of aliases might have changed. Removing might_sleep() in dput() would've been wrong - it really might sleep when called from that point. Here's how: we used to have two links to the same file - foo/bar and baz/barf. baz/barf used to be opened, then rm -rf baz happened and later we'd called d_prune_aliases() on the inode of foo/bar. And as the loop had been executed on one CPU, on another the opened file got closed, dropping the last reference to dentry that used to be baz/barf. Note that its parent (the thing that used to be dentry of baz) is unhashed and the only contributor to its refcount is our dentry, so dput(parent) *does* drop the last remaining reference, triggering the final iput() on inode of baz, along with freeing on-disk inode, doing disk IO, etc. Again, it's not that we can't block in that dput() - it's that __dentry_kill() drops all spinlocks. ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [BUG] fs/dcache: might_sleep is called under a spinlock 2017-10-03 3:19 ` Al Viro @ 2017-10-03 8:46 ` Jia-Ju Bai 0 siblings, 0 replies; 3+ messages in thread From: Jia-Ju Bai @ 2017-10-03 8:46 UTC (permalink / raw) To: Al Viro; +Cc: linux-fsdevel, linux-kernel Thanks for your detailed explanation :) I will improve my static analysis tool. Thanks, Jia-Ju Bai On 2017/10/3 11:19, Al Viro wrote: > On Tue, Oct 03, 2017 at 10:38:25AM +0800, Jia-Ju Bai wrote: >> According to fs/dcache.c, might_sleep is called under a spinlock, >> and the function call path is: >> d_prune_aliases (acquire the spinlock) >> dput >> might_sleep >> >> This bug is found by my static analysis tool and my code review. >> A possible fix is to remove might_sleep in dput. > ... or to fix your static analysis tool. First of all, that call > of dput() really *can* block and if we had inode->i_lock or dentry->d_lock > still held at that point we'd have a real bug. However, __dentry_kill() > there is called with dentry->d_inode == inode and inode->i_lock held, > so dentry->d_inode is stable until inode->i_lock is dropped. Said > __dentry_kill() contains > if (dentry->d_inode) > dentry_unlink_inode(dentry); > with inode->i_lock held until that point. dentry_unlink_inode() starts > with > struct inode *inode = dentry->d_inode; > bool hashed = !d_unhashed(dentry); > > if (hashed) > raw_write_seqcount_begin(&dentry->d_seq); > __d_clear_type_and_inode(dentry); > hlist_del_init(&dentry->d_u.d_alias); > if (hashed) > raw_write_seqcount_end(&dentry->d_seq); > spin_unlock(&dentry->d_lock); > spin_unlock(&inode->i_lock); > so > 1) inode in there is guaranteed to be equal to the argument of > d_prune_aliases() and > 2) both dentry->d_lock and inode->i_lock are dropped before > dentry_unlink_inode() returns. inode->i_lock is not regained in the > rest of __dentry_kill(); dentry->d_lock is regained and dropped before > __dentry_kill() returns. IOW, we are fine - dput() in d_prune_aliases() > is called without any spinlocks held. > > That, BTW, is the reason for > goto restart; > in there, instead of just continuing the loop - if we get to that point, > the list of aliases might have changed. > > Removing might_sleep() in dput() would've been wrong - it really might > sleep when called from that point. Here's how: we used to have two > links to the same file - foo/bar and baz/barf. baz/barf used to be > opened, then rm -rf baz happened and later we'd called d_prune_aliases() > on the inode of foo/bar. And as the loop had been executed on one CPU, > on another the opened file got closed, dropping the last reference to > dentry that used to be baz/barf. Note that its parent (the thing that > used to be dentry of baz) is unhashed and the only contributor to its > refcount is our dentry, so dput(parent) *does* drop the last remaining > reference, triggering the final iput() on inode of baz, along with > freeing on-disk inode, doing disk IO, etc. > > Again, it's not that we can't block in that dput() - it's that __dentry_kill() > drops all spinlocks. ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2017-10-03 8:46 UTC | newest] Thread overview: 3+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-10-03 2:38 [BUG] fs/dcache: might_sleep is called under a spinlock Jia-Ju Bai 2017-10-03 3:19 ` Al Viro 2017-10-03 8:46 ` Jia-Ju Bai
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).