All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Mahoney <jeffm@jeffm.io>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Roman Lebedev <lebedev.ri@gmail.com>,
	David Howells <dhowells@redhat.com>,
	linux-btrfs@vger.kernel.org, linux-unionfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, fstests@vger.kernel.org,
	Filipe Manana <fdmanana@suse.com>
Subject: Re: kernel BUG when fsync'ing file in a overlayfs merged dir, located on btrfs
Date: Thu, 5 Nov 2015 23:03:58 -0500	[thread overview]
Message-ID: <563C26AE.1020403@jeffm.io> (raw)
In-Reply-To: <20151106031845.GV22011@ZenIV.linux.org.uk>

[-- Attachment #1: Type: text/plain, Size: 2837 bytes --]

On 11/5/15 10:18 PM, Al Viro wrote:
> On Thu, Nov 05, 2015 at 09:57:35PM -0500, Jeff Mahoney wrote:
> 
>> So now file_operations callbacks can't assume that file->f_path.dentry
>> belongs to the same file system that implements the callback.  More than
>> that, any code that could ultimately get a dentry that comes from an
>> open file can't trust that it's from the same file system.
> 
> Use file_inode() for inode.
> 
>> This crash is due to this issue.  Unlike xfs and ext2/3/4, we use
>> file->f_path.dentry->d_inode to resolve the inode.  Using file_inode()
>> is an easy enough fix here, but we run into trouble later.  We have
>> logic in the btrfs fsync() call path (check_parent_dirs_for_sync) that
>> walks back up the dentry chain examining the inode's last transaction
>> and last unlink transaction to determine whether a full transaction
>> commit is required.  This obviously doesn't work if we're walking the
>> overlayfs path instead.  Regardless of any argument over whether that's
>> doing the right thing, it's a pretty common pattern to assume that
>> file->f_path.dentry comes from the same file system when using a
>> file_operation.  Is it intended that that assumption is no longer valid?
> 
> It's actually rare, and your example is a perfect demonstration of the
> reasons why it is so rare.  What's to protect btrfs_log_dentry_safe()
> from racing with rename(2)?  Sure, you do dget_parent().  Which protects
> you from having one-time parent dentry freed under you.  What it doesn't
> do is making any promises about its relationship with your file.

I suppose the irony here is that, AFAIK, that code is to ensure a file
doesn't get lost between transactions due to rename.

Isn't the file->f_path.dentry relationship stable otherwise, though? The
name might change and the parent might change but the dentry that the
file points to won't.

I did find a few other places where that assumption happens without any
questionable traversals.  Sure, all three are in file systems unlikely
to be used with overlayfs.

ocfs2_prepare_inode_for_write uses file->f_path.dentry for
should_remove_suid (due to needing to do it early since cluster locking
is unknown in setattr, according to the commit).  Having
should_remove_suid operate on an inode would solve that easily.

fat_ioctl_set_attributes uses it to call fat_setattr, but that only uses
the inode and could have the inode_operation use a wrapper.

cifs_new_fileinfo keeps a reference to the dentry but it seems to be
used mostly to access the inode except for the nasty-looking call to
build_path_from_dentry in cifs_reopen_file, which I won't be touching.
That does look like a questionable traversal, especially with the "we
can't take the rename lock here" comment.

-Jeff

-- 
Jeff Mahoney


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 827 bytes --]

  reply	other threads:[~2015-11-06  4:03 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-30 19:57 kernel BUG when fsync'ing file in a overlayfs merged dir, located on btrfs Roman Lebedev
2015-09-30 19:57 ` Roman Lebedev
2015-09-30 19:57 ` [RFC PATCH] fstests: generic: Test that fsync works on file in overlayfs merged directory Roman Lebedev
2015-09-30 21:56   ` Dave Chinner
2015-09-30 22:07     ` Eric Sandeen
2015-11-06  2:57 ` kernel BUG when fsync'ing file in a overlayfs merged dir, located on btrfs Jeff Mahoney
2015-11-06  3:18   ` Al Viro
2015-11-06  4:03     ` Jeff Mahoney [this message]
2015-11-06 14:46       ` Jeff Mahoney
2016-03-24 15:20       ` Al Viro
2016-03-24 15:25         ` Al Viro
2016-03-24 15:31         ` Jeff Mahoney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=563C26AE.1020403@jeffm.io \
    --to=jeffm@jeffm.io \
    --cc=dhowells@redhat.com \
    --cc=fdmanana@suse.com \
    --cc=fstests@vger.kernel.org \
    --cc=lebedev.ri@gmail.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.