From: Al Viro <viro@ZenIV.linux.org.uk>
To: "Suzuki K. Poulose" <suzuki.poulose@arm.com>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
marc.zyngier@arm.com, torvalds@linux-foundation.org,
Tejun Heo <tj@kernel.org>,
stable@vger.kernel.org
Subject: Re: [PATCH] blkdev: Fix blkdev_open to release the bdev on error
Date: Tue, 8 Dec 2015 07:25:08 +0000
Message-ID: <20151208072508.GM20997@ZenIV.linux.org.uk>
In-Reply-To: <1449511503-7543-1-git-send-email-suzuki.poulose@arm.com>

On Mon, Dec 07, 2015 at 06:05:03PM +0000, Suzuki K. Poulose wrote:
> blkdev_open() doesn't release the bdev, it attached to a given
> inode, if blkdev_get() fails (e.g, due to absence of a device).
> This can cause kernel crashes when the original filesystem
> tries to flush the data during evict_inode.
>
> This can be triggered easily with virtio-9p fs using the following
> simple steps.

???

How can filesystem type affect the behaviour of block devices?
Having "mknod /tmp/splat b 8 1; rm /tmp/splat" try to evict the pagecache
of /dev/sda1 is simply wrong, no matter what type /tmp happens to have.
And they must share pagecache, or you'll get one hell of a cache coherency
problem.  As it is, that pagecache belongs to the inode on bdevfs (see
fs/block_dev.c; not mountable anywhere visible, the one and only mount is
internal).  That inode is tied to struct block_device, ditto for its lifetime.
Block device inodes on anything else have their ->i_mapping pointing to
the address_space of the corresponding (unique for given major/minor) inode
on bdevfs; that gives us the coherency, but that also means that their *own*
pagecache (->i_data) is empty.  Which is just fine, since inode eviction
should get rid of everything in its own embedded struct address_space and
nothing else.  In case of block device inodes on ext2, 9p, etc. that amounts
to no pages at all.  In case of bdevfs, it contains the page cache of the
block device.
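
To make the ->i_mapping versus ->i_data distinction concrete, here is a toy
userspace model in C.  It is not kernel code: struct toy_mapping,
struct toy_inode and the toy_* helpers are names made up for this sketch, and
the "pagecache" is reduced to a page counter.  It only illustrates which
object ends up holding the pages.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct address_space: just counts cached pages. */
struct toy_mapping {
	unsigned long nrpages;
};

/* Toy stand-in for struct inode. */
struct toy_inode {
	struct toy_mapping i_data;	/* embedded cache, owned by this inode */
	struct toy_mapping *i_mapping;	/* cache that I/O is actually routed to */
};

/* A fresh inode routes I/O to its own embedded cache. */
static void toy_inode_init(struct toy_inode *inode)
{
	inode->i_data.nrpages = 0;
	inode->i_mapping = &inode->i_data;
}

/*
 * Associate an alias (a device node living on ext2, 9p, ...) with the
 * one bdevfs inode for that major/minor: the alias borrows its mapping.
 */
static void toy_alias_attach(struct toy_inode *alias,
			     struct toy_inode *bdev_inode)
{
	alias->i_mapping = bdev_inode->i_mapping;
}

/* Cache one page through whatever mapping the inode's I/O goes to. */
static void toy_cache_page(struct toy_inode *inode)
{
	assert(inode->i_mapping != NULL);
	inode->i_mapping->nrpages++;
}
```

With two aliases attached to the same toy bdevfs inode, pages cached through
either alias are counted in the bdevfs inode's i_data, while each alias's own
i_data stays at zero pages.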

<looks>

Aha...
	truncate_inode_pages_final(inode->i_mapping);
	clear_inode(inode);
	filemap_fdatawrite(inode->i_mapping);
in v9fs_evict_inode() is obviously wrong - it should be
	truncate_inode_pages_final(&inode->i_data);
	clear_inode(inode);
	filemap_fdatawrite(&inode->i_data);
and if you check other filesystems' ->evict_inode() you'll see that they
use &inode->i_data there.

We should not do bd_forget() upon failing open() - what for?  As long as
->i_rdev remains the same, the pointer to struct block_device is valid.  It
doesn't pin bdev down; having it (or any other alias) opened does.  When
we decide to evict bdev, *all* aliasing inodes are dissociated from it;
none of them is open at that point, so we are OK.  When an aliasing inode
gets evicted, we have it dissociated from its ->i_bdev (if any).  Since we
only access the ->i_mapping of an aliasing inode while it's open, those places
are fine and anything that wants ->i_data of an alias will simply find it empty.
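
The difference between the two eviction styles can be sketched with the same
kind of toy userspace model (again not kernel code; struct toy_mapping,
struct toy_inode and the toy_* functions are hypothetical names, and
"truncate" just zeroes a page counter).  It shows only which object each
variant touches; in the real kernel the object behind a stale ->i_mapping may
already be gone, which this model does not try to reproduce.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct address_space: just counts cached pages. */
struct toy_mapping {
	unsigned long nrpages;
};

/* Toy stand-in for struct inode. */
struct toy_inode {
	struct toy_mapping i_data;	/* embedded cache, owned by this inode */
	struct toy_mapping *i_mapping;	/* cache that I/O is actually routed to */
};

/* Toy truncate: drop every page in the given mapping. */
static void toy_truncate(struct toy_mapping *mapping)
{
	assert(mapping != NULL);
	mapping->nrpages = 0;
}

/* Eviction that follows ->i_mapping: may reach into another inode's cache. */
static void toy_evict_via_i_mapping(struct toy_inode *inode)
{
	toy_truncate(inode->i_mapping);
}

/* Eviction restricted to the inode's own embedded cache. */
static void toy_evict_via_i_data(struct toy_inode *inode)
{
	toy_truncate(&inode->i_data);
}
```

For an alias whose ->i_mapping still points at the toy bdevfs inode's i_data,
toy_evict_via_i_data() leaves the shared cache alone, while
toy_evict_via_i_mapping() wipes pages that belong to the block device.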

AFAICS, the cause of your oopsen is that the 9p ->evict_inode() is accessing
an object it has no business touching.

Could you confirm that the patch below fixes your problem?

diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 699941e..5110785 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -451,9 +451,9 @@ void v9fs_evict_inode(struct inode *inode)
 {
 	struct v9fs_inode *v9inode = V9FS_I(inode);
 
-	truncate_inode_pages_final(inode->i_mapping);
+	truncate_inode_pages_final(&inode->i_data);
 	clear_inode(inode);
-	filemap_fdatawrite(inode->i_mapping);
+	filemap_fdatawrite(&inode->i_data);
 
 	v9fs_cache_inode_put_cookie(inode);
 	/* clunk the fid stashed in writeback_fid */