From: Al Viro <viro@ZenIV.linux.org.uk>
To: Dave Chinner <david@fromorbit.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Djalal Harouni <tixxdz@opendz.org>,
Hugh Dickins <hughd@google.com>,
Minchan Kim <minchan.kim@gmail.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Wu Fengguang <fengguang.wu@intel.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
"J. Bruce Fields" <bfields@fieldses.org>,
Neil Brown <neilb@suse.de>,
Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>,
Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH] mm: add missing mutex lock arround notify_change
Date: Mon, 19 Dec 2011 02:03:40 +0000
Message-ID: <20111219020340.GG2203@ZenIV.linux.org.uk>
In-Reply-To: <20111219014343.GK23662@dastard>
On Mon, Dec 19, 2011 at 12:43:43PM +1100, Dave Chinner wrote:
> > We have a shitload of deadlocks on very common paths with that patch. What
> > of the paths that do lead to file_remove_suid() without i_mutex?
> > * xfs_file_aio_write_checks(): we drop i_mutex (via xfs_rw_iunlock())
> > just before calling file_remove_suid(). Racy, the fix is obvious - move
> > file_remove_suid() call before unlocking.
>
> Not exactly. xfs_rw_iunlock() is not doing what you think it's doing
> there.....
Huh? It is called as
> > - xfs_rw_iunlock(ip, XFS_ILOCK_EXCL);
and thus in
static inline void
xfs_rw_iunlock(
	struct xfs_inode	*ip,
	int			type)
{
	xfs_iunlock(ip, type);
	if (type & XFS_IOLOCK_EXCL)
		mutex_unlock(&VFS_I(ip)->i_mutex);
}
we are guaranteed to hit i_mutex.
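Whether that branch is taken depends on how the XFS_ILOCK_* and XFS_IOLOCK_* flags are encoded, which this message does not show. A minimal user-space sketch, assuming each lock mode occupies its own bit (the values below are assumptions for illustration, and rw_iunlock_drops_i_mutex() is a hypothetical helper, not an XFS function):

```c
#include <stdbool.h>

/* Assumed flag encoding: one distinct bit per lock mode (illustrative values). */
#define XFS_IOLOCK_EXCL   (1 << 0)
#define XFS_IOLOCK_SHARED (1 << 1)
#define XFS_ILOCK_EXCL    (1 << 2)
#define XFS_ILOCK_SHARED  (1 << 3)

/*
 * Hypothetical helper: evaluates the same test that xfs_rw_iunlock()
 * applies before it calls mutex_unlock() on i_mutex.
 */
static bool rw_iunlock_drops_i_mutex(int type)
{
	return (type & XFS_IOLOCK_EXCL) != 0;
}
```

Under these assumed values the helper returns false for XFS_ILOCK_EXCL and true for XFS_IOLOCK_EXCL; if the two flags shared a bit, it would return true for both.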
> Wrong lock. That's dropping the internal XFS inode metadata lock,
> but the VFS i_mutex is associated with the internal XFS inode IO
> lock, which is accessed via XFS_IOLOCK_*. Only if we take the iolock
> via XFS_IOLOCK_EXCL do we actually take the i_mutex.
> Now it gets complex. For buffered IO, we are guaranteed to already
> be holding the i_mutex because we do:
>
> 	*iolock = XFS_IOLOCK_EXCL;
> 	xfs_rw_ilock(ip, *iolock);
>
> 	ret = xfs_file_aio_write_checks(file, &pos, &count, new_size, iolock);
>
> So that is safe and non-racy right now.
No, it is not - we *drop* it before calling file_remove_suid(). Explicitly.
Again, look at that xfs_rw_iunlock() call there - it does drop i_mutex
(which is to say, you'd better have taken it prior to that, or you have
far worse problems).
> For direct IO, however, we don't always take the IOLOCK exclusively.
> Indeed, we try really, really hard not to do this so we can do
> concurrent reads and writes to the inode, and that results
> in a bunch of lock juggling when we actually need the IOLOCK
> exclusive (like in xfs_file_aio_write_checks()). It sounds like we
> need to know if we are going to have to remove the SUID bit ahead of
> time so that we can take the correct lock up front. I haven't
> looked at what is needed to do that yet.
OK, I'm definitely missing something. The very first thing
xfs_file_aio_write_checks() does is
	xfs_rw_ilock(ip, XFS_ILOCK_EXCL);
which really makes me wonder how the hell does that manage to avoid an
instant deadlock in case of call via xfs_file_buffered_aio_write()
where we have:
	struct address_space	*mapping = file->f_mapping;
	struct inode		*inode = mapping->host;
	struct xfs_inode	*ip = XFS_I(inode);
	*iolock = XFS_IOLOCK_EXCL;
	xfs_rw_ilock(ip, *iolock);
	ret = xfs_file_aio_write_checks(file, &pos, &count, new_size, iolock);
which leads to
	struct inode		*inode = file->f_mapping->host;
	struct xfs_inode	*ip = XFS_I(inode);
(IOW, inode and ip are the same as in the caller) followed by
	xfs_rw_ilock(ip, XFS_ILOCK_EXCL);
and with both xfs_rw_ilock() calls turning into
	mutex_lock(&VFS_I(ip)->i_mutex);
	xfs_ilock(ip, XFS_ILOCK_EXCL);
we ought to deadlock on that i_mutex. What am I missing and how do we manage
to survive that?
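The sequence above can be traced in a small user-space model. This is a sketch under stated assumptions, not the XFS code: it assumes xfs_rw_ilock() tests XFS_IOLOCK_EXCL the same way the xfs_rw_iunlock() quoted earlier does, and that the two flags are distinct bits. The flag values, struct model_inode, and the model_* functions are all hypothetical.

```c
#include <stdbool.h>

#define XFS_IOLOCK_EXCL (1 << 0)	/* assumed value */
#define XFS_ILOCK_EXCL  (1 << 2)	/* assumed value */

/* Hypothetical stand-in for the inode: tracks only the modelled i_mutex. */
struct model_inode {
	bool i_mutex_held;
};

/*
 * Model of xfs_rw_ilock(), assuming it takes i_mutex only for
 * XFS_IOLOCK_EXCL. Returns false where a real non-recursive mutex
 * would self-deadlock.
 */
static bool model_rw_ilock(struct model_inode *ip, int type)
{
	if (type & XFS_IOLOCK_EXCL) {
		if (ip->i_mutex_held)
			return false;
		ip->i_mutex_held = true;
	}
	/* xfs_ilock(ip, type) would go here; it is not modelled. */
	return true;
}

/* Buffered write path: the caller locks first, then the write checks lock. */
static bool model_buffered_write_locks(struct model_inode *ip)
{
	if (!model_rw_ilock(ip, XFS_IOLOCK_EXCL))	/* xfs_file_buffered_aio_write() */
		return false;
	return model_rw_ilock(ip, XFS_ILOCK_EXCL);	/* xfs_file_aio_write_checks() */
}
```

Under those assumptions the second call never touches the modelled i_mutex, so the path completes. If xfs_rw_ilock() took i_mutex unconditionally, as described above, the second call would block.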