From: Matthew Wilcox <willy@linux.intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
Ross Zwisler <ross.zwisler@linux.intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"J. Bruce Fields" <bfields@fieldses.org>,
Andrew Morton <akpm@linux-foundation.org>,
Dave Chinner <david@fromorbit.com>, Jan Kara <jack@suse.com>,
Jeff Layton <jlayton@poochiereds.net>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
linux-nvdimm <linux-nvdimm@ml01.01.org>,
XFS Developers <xfs@oss.sgi.com>,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] dax: allow DAX to look up an inode's block device
Date: Tue, 2 Feb 2016 18:52:43 -0500 [thread overview]
Message-ID: <20160202235243.GC3260@linux.intel.com> (raw)
In-Reply-To: <CAPcyv4gqq0guubnddRPmDVA0N1=vfh3w6jPf4GsuYs0D29nS4w@mail.gmail.com>
On Tue, Feb 02, 2016 at 03:39:15PM -0800, Dan Williams wrote:
> On Tue, Feb 2, 2016 at 3:19 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > On Tue, Feb 02, 2016 at 04:11:42PM -0700, Ross Zwisler wrote:
> >> However, for raw block devices and for XFS with a real-time device, the
> >> value in inode->i_sb->s_bdev is not correct. With the code as it is
> >> currently written, an fsync or msync to a DAX enabled raw block device will
> >> cause a NULL pointer dereference kernel BUG. For this to work correctly we
> >> need to ask the block device or filesystem what struct block_device is
> >> appropriate for our inode.
> >>
> >> To that end, add a get_bdev(struct inode *) entry point to struct
> >> super_operations. If this function pointer is non-NULL, this notifies DAX
> >> that it needs to use it to look up the correct block_device. If
> >> i_sb->get_bdev() is NULL DAX will default to inode->i_sb->s_bdev.
> >
> > Umm... It assumes that bdev will stay pinned for as long as inode is
> > referenced, presumably? If so, that needs to be documented (and verified
> > for existing fs instances). In principle, multi-disk fs might want to
> > support things like "silently move the inodes backed by that disk to other
> > ones"...
>
> I assume btrfs is the only fs we have that might reassign the bdev for
> a given inode on the fly? Hopefully we don't need anything stronger
> than rcu_read_lock() to pin the result as valid.
>
> At least in this case the initial user is dax-fsync where the
> ->get_bdev() answer should be static for the life of the inode, and
> btrfs does not currently interface with dax. But yes, we need to get
> the expected semantics clear.
Let's be clear though. ->get_bdev is a temporary hack. The need for
it goes away when DAX doesn't rely on being on a block_device any more.
I don't expect it to live longer than six months.
WARNING: multiple messages have this Message-ID (diff)
From: Matthew Wilcox <willy@linux.intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Jeff Layton <jlayton@poochiereds.net>,
linux-nvdimm <linux-nvdimm@ml01.01.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
XFS Developers <xfs@oss.sgi.com>,
"J. Bruce Fields" <bfields@fieldses.org>,
Al Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.com>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Ross Zwisler <ross.zwisler@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] dax: allow DAX to look up an inode's block device
Date: Tue, 2 Feb 2016 18:52:43 -0500 [thread overview]
Message-ID: <20160202235243.GC3260@linux.intel.com> (raw)
In-Reply-To: <CAPcyv4gqq0guubnddRPmDVA0N1=vfh3w6jPf4GsuYs0D29nS4w@mail.gmail.com>
On Tue, Feb 02, 2016 at 03:39:15PM -0800, Dan Williams wrote:
> On Tue, Feb 2, 2016 at 3:19 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > On Tue, Feb 02, 2016 at 04:11:42PM -0700, Ross Zwisler wrote:
> >> However, for raw block devices and for XFS with a real-time device, the
> >> value in inode->i_sb->s_bdev is not correct. With the code as it is
> >> currently written, an fsync or msync to a DAX enabled raw block device will
> >> cause a NULL pointer dereference kernel BUG. For this to work correctly we
> >> need to ask the block device or filesystem what struct block_device is
> >> appropriate for our inode.
> >>
> >> To that end, add a get_bdev(struct inode *) entry point to struct
> >> super_operations. If this function pointer is non-NULL, this notifies DAX
> >> that it needs to use it to look up the correct block_device. If
> >> i_sb->get_bdev() is NULL DAX will default to inode->i_sb->s_bdev.
> >
> > Umm... It assumes that bdev will stay pinned for as long as inode is
> > referenced, presumably? If so, that needs to be documented (and verified
> > for existing fs instances). In principle, multi-disk fs might want to
> > support things like "silently move the inodes backed by that disk to other
> > ones"...
>
> I assume btrfs is the only fs we have that might reassign the bdev for
> a given inode on the fly? Hopefully we don't need anything stronger
> than rcu_read_lock() to pin the result as valid.
>
> At least in this case the initial user is dax-fsync where the
> ->get_bdev() answer should be static for the life of the inode, and
> btrfs does not currently interface with dax. But yes, we need to get
> the expected semantics clear.
Let's be clear though. ->get_bdev is a temporary hack. The need for
it goes away when DAX doesn't rely on being on a block_device any more.
I don't expect it to live longer than six months.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2016-02-02 23:52 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-02-02 23:11 [PATCH] dax: allow DAX to look up an inode's block device Ross Zwisler
2016-02-02 23:11 ` Ross Zwisler
2016-02-02 23:11 ` Ross Zwisler
2016-02-02 23:19 ` Al Viro
2016-02-02 23:19 ` Al Viro
2016-02-02 23:36 ` Jared Hulbert
2016-02-02 23:36 ` Jared Hulbert
2016-02-02 23:41 ` Dan Williams
2016-02-02 23:41 ` Dan Williams
2016-02-03 0:33 ` Jared Hulbert
2016-02-03 0:33 ` Jared Hulbert
2016-02-03 7:54 ` Dave Chinner
2016-02-02 23:38 ` Dan Williams
2016-02-02 23:38 ` Dan Williams
2016-02-02 23:39 ` Dan Williams
2016-02-02 23:39 ` Dan Williams
2016-02-02 23:52 ` Matthew Wilcox [this message]
2016-02-02 23:52 ` Matthew Wilcox
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160202235243.GC3260@linux.intel.com \
--to=willy@linux.intel.com \
--cc=akpm@linux-foundation.org \
--cc=bfields@fieldses.org \
--cc=dan.j.williams@intel.com \
--cc=david@fromorbit.com \
--cc=jack@suse.com \
--cc=jlayton@poochiereds.net \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nvdimm@ml01.01.org \
--cc=ross.zwisler@linux.intel.com \
--cc=viro@zeniv.linux.org.uk \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.