public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Christoph Hellwig <hch@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Dave Jones <davej@redhat.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	xfs@oss.sgi.com
Subject: Re: fs/attr.c:notify_change locking warning.
Date: Tue, 15 Oct 2013 13:19:05 -0700	[thread overview]
Message-ID: <20131015201905.GA7509@infradead.org> (raw)
In-Reply-To: <20131005031918.GL4446@dastard>

On Sat, Oct 05, 2013 at 01:19:18PM +1000, Dave Chinner wrote:
> Yup, we don't hold the i_mutex *at all* through the fast path for
> direct IO writes. Having to grab the i_mutex on every IO just for
> the extremely unlikely case we need to remove a suid bit on the file
> would add a significant serialisation point into the direct Io model
> that XFS uses, and is the difference between 50,000 and 2+ million
> direct IO IOPS to a single file.
> 
> I'm unwilling to sacrifice the concurrency of direct IO writes just
> to shut up ths warning, especially as the actual modifications that
> are made to remove SUID bits are correctly serialised within XFS
> once notify_change() calls ->setattr(). If it really matters, I'll
> just open code file_remove_suid() into XFS like ocfs2 does just so
> we don't get that warning being emitted by trinity.

But the i_lock doesn't synchronize against the VFS modifying various
struct inode fields.  The right fix is to take i_mutex just in case
we actually need to remove the suid bit.  The patch below should fix it,
although I need to write a testcase that actually exercises it first.

Dave (J.): if you have time to try the patch below please go ahead,
if not I'll make sure to write an isolated test ASAP to verify it and
will then submit the change.

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 4c749ab..e879f96 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -590,8 +590,22 @@ restart:
 	 * If we're writing the file then make sure to clear the setuid and
 	 * setgid bits if the process is not being run by root.  This keeps
 	 * people from modifying setuid and setgid binaries.
+	 *
+	 * Note that file_remove_suid must be called with the i_mutex held,
+	 * so we have to go through some hoops here to make sure we hold it.
 	 */
-	return file_remove_suid(file);
+	if (!IS_NOSEC(inode) && should_remove_suid(file->f_path.dentry)) {
+		if (*iolock == XFS_IOLOCK_SHARED) {
+			mutex_lock(&inode->i_mutex);
+			error = file_remove_suid(file);
+			mutex_unlock(&inode->i_mutex);
+		} else {
+			error = file_remove_suid(file);
+		}
+
+	}
+
+	return error;
 }
 
 /*

  reply	other threads:[~2013-10-15 20:19 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-05  0:52 fs/attr.c:notify_change locking warning Dave Jones
2013-10-05  3:19 ` Dave Chinner
2013-10-15 20:19   ` Christoph Hellwig [this message]
2013-10-15 21:36     ` Dave Chinner
2013-10-16  7:05       ` Christoph Hellwig
2013-10-16 10:26         ` Dave Chinner
2013-10-16 18:12           ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131015201905.GA7509@infradead.org \
    --to=hch@infradead.org \
    --cc=davej@redhat.com \
    --cc=david@fromorbit.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox