From: Jeff Mahoney <jeffm@jeffm.io>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Roman Lebedev <lebedev.ri@gmail.com>,
David Howells <dhowells@redhat.com>,
linux-btrfs@vger.kernel.org, linux-unionfs@vger.kernel.org,
linux-fsdevel@vger.kernel.org, fstests@vger.kernel.org,
Filipe Manana <fdmanana@suse.com>
Subject: Re: kernel BUG when fsync'ing file in a overlayfs merged dir, located on btrfs
Date: Thu, 5 Nov 2015 23:03:58 -0500 [thread overview]
Message-ID: <563C26AE.1020403@jeffm.io> (raw)
In-Reply-To: <20151106031845.GV22011@ZenIV.linux.org.uk>
[-- Attachment #1: Type: text/plain, Size: 2837 bytes --]
On 11/5/15 10:18 PM, Al Viro wrote:
> On Thu, Nov 05, 2015 at 09:57:35PM -0500, Jeff Mahoney wrote:
>
>> So now file_operations callbacks can't assume that file->f_path.dentry
>> belongs to the same file system that implements the callback. More than
>> that, any code that could ultimately get a dentry that comes from an
>> open file can't trust that it's from the same file system.
>
> Use file_inode() for inode.
>
>> This crash is due to this issue. Unlike xfs and ext2/3/4, we use
>> file->f_path.dentry->d_inode to resolve the inode. Using file_inode()
>> is an easy enough fix here, but we run into trouble later. We have
>> logic in the btrfs fsync() call path (check_parent_dirs_for_sync) that
>> walks back up the dentry chain examining the inode's last transaction
>> and last unlink transaction to determine whether a full transaction
>> commit is required. This obviously doesn't work if we're walking the
>> overlayfs path instead. Regardless of any argument over whether that's
>> doing the right thing, it's a pretty common pattern to assume that
>> file->f_path.dentry comes from the same file system when using a
>> file_operation. Is it intended that that assumption is no longer valid?
>
> It's actually rare, and your example is a perfect demonstration of the
> reasons why it is so rare. What's to protect btrfs_log_dentry_safe()
> from racing with rename(2)? Sure, you do dget_parent(). Which protects
> you from having one-time parent dentry freed under you. What it doesn't
> do is making any promises about its relationship with your file.
I suppose the irony here is that, AFAIK, that code is to ensure a file
doesn't get lost between transactions due to rename.
Isn't the file->f_path.dentry relationship stable otherwise, though? The
name might change and the parent might change but the dentry that the
file points to won't.
I did find a few other places where that assumption happens without any
questionable traversals. Sure, all three are in file systems unlikely
to be used with overlayfs.
ocfs2_prepare_inode_for_write uses file->f_path.dentry for
should_remove_suid (due to needing to do it early since cluster locking
is unknown in setattr, according to the commit). Having
should_remove_suid operate on an inode would solve that easily.
fat_ioctl_set_attributes uses it to call fat_setattr, but that only uses
the inode and could have the inode_operation use a wrapper.
cifs_new_fileinfo keeps a reference to the dentry but it seems to be
used mostly to access the inode except for the nasty-looking call to
build_path_from_dentry in cifs_reopen_file, which I won't be touching.
That does look like a questionable traversal, especially with the "we
can't take the rename lock here" comment.
-Jeff
--
Jeff Mahoney
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 827 bytes --]
next prev parent reply other threads:[~2015-11-06 4:03 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-30 19:57 kernel BUG when fsync'ing file in a overlayfs merged dir, located on btrfs Roman Lebedev
2015-09-30 19:57 ` [RFC PATCH] fstests: generic: Test that fsync works on file in overlayfs merged directory Roman Lebedev
2015-09-30 21:56 ` Dave Chinner
2015-09-30 22:07 ` Eric Sandeen
2015-11-06 2:57 ` kernel BUG when fsync'ing file in a overlayfs merged dir, located on btrfs Jeff Mahoney
2015-11-06 3:18 ` Al Viro
2015-11-06 4:03 ` Jeff Mahoney [this message]
2015-11-06 14:46 ` Jeff Mahoney
2016-03-24 15:20 ` Al Viro
2016-03-24 15:25 ` Al Viro
2016-03-24 15:31 ` Jeff Mahoney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=563C26AE.1020403@jeffm.io \
--to=jeffm@jeffm.io \
--cc=dhowells@redhat.com \
--cc=fdmanana@suse.com \
--cc=fstests@vger.kernel.org \
--cc=lebedev.ri@gmail.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-unionfs@vger.kernel.org \
--cc=viro@ZenIV.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).