From: Jan Kara <jack@suse.cz>
To: Liu Bo <bo.li.liu@oracle.com>
Cc: Jan Kara <jack@suse.cz>, Chris Mason <clm@fb.com>,
linux-btrfs@vger.kernel.org, David Sterba <dsterba@suse.cz>
Subject: Re: [PATCH 4/6] Btrfs: add DAX support for nocow btrfs
Date: Fri, 9 Dec 2016 13:31:03 +0100
Message-ID: <20161209123103.GA10957@quack2.suse.cz>
In-Reply-To: <20161208164539.GB20111@localhost.localdomain>

On Thu 08-12-16 08:45:39, Liu Bo wrote:
> On Thu, Dec 08, 2016 at 11:47:41AM +0100, Jan Kara wrote:
> > On Wed 07-12-16 17:15:42, Chris Mason wrote:
> > > On 12/07/2016 04:45 PM, Liu Bo wrote:
> > > > >This implements DAX support for btrfs, limited to nocow and a single device.
> > > >
> > > >DAX is developed for block devices that are memory-like in order to avoid
> > > > >double buffering in both the page cache and the storage, so DAX can perform reads and
> > > > >writes directly to the storage device, and for those who prefer to use a
> > > > >filesystem, filesystem DAX support can help to map the storage into userspace
> > > >for file-mapping.
> > > >
> > > > >Since I haven't figured out how to map multiple devices to userspace without
> > > > >pagecache, this DAX support is only for single-device, and since I don't think
> > > > >DAX (Direct Access) can work with cow, it is limited to the nocow case. I did
> > > > >this by setting nodatacow in the dax mount option.
> > >
> > > Interesting, this is a nice small start. It might make more sense to limit
> > > snapshots to readonly in DAX mode until we can figure out how to cow
> > > properly. I think it can be done, I just need to sit down with the dax code
> > > to do a good review.
> > >
> > > But bigger picture, if we can't cow and we can't crc and we can't
> > > multi-device, I'd rather let XFS/ext4 sort out the dax space until we pull
> > > in more of the btrfs features too.
> >
> > So normal DAX IO (via read(2) and write(2)) is very similar to direct IO so
> > I don't think there would be any obstacle to support all the features with
> > that.
>
> For DAX IO via read(2)/write(2), cow is OK, while multiple devices are
> a problem as currently iomap_dax_actor only takes one <device, blocknum>
> pair:
>
> - raid 0, one device is written at a time
> - raid 1/10 and others, 2 or more devices need to be written each time

OK, but how do you cope with direct IO for multiple devices then? Do you
just disallow it? That's the same issue AFAICS.
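
To make the quoted single <device, blocknum> limitation concrete, here is a
toy user-space C sketch. It is not btrfs or iomap code: struct target,
enum profile and map_write() are made-up names, and the one-block stripe
width is an assumption chosen only to keep the arithmetic simple. It shows
that a striped write resolves to one target while a mirrored write resolves
to one target per device, which a single-pair interface cannot express.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, NOT btrfs code. */
struct target {
	int dev;              /* index of the device to write */
	unsigned long block;  /* block number on that device */
};

enum profile { PROFILE_RAID0, PROFILE_RAID1 };

/*
 * Map one logical block to the device blocks a write has to touch.
 * out[] must have room for ndevs entries. Returns the number of
 * targets filled in.
 */
static size_t map_write(enum profile p, unsigned long logical,
			int ndevs, struct target *out)
{
	size_t n = 0;

	assert(ndevs > 0);

	if (p == PROFILE_RAID0) {
		/* Striping (assumed stripe width: one block): one target. */
		out[n].dev = (int)(logical % (unsigned long)ndevs);
		out[n].block = logical / (unsigned long)ndevs;
		n++;
	} else {
		/* Mirroring: the same block goes to every device. */
		for (int d = 0; d < ndevs; d++) {
			out[n].dev = d;
			out[n].block = logical;
			n++;
		}
	}
	return n;
}
```

With two devices, the striped case returns a single target and the mirrored
case returns two, so a caller that can carry only one pair would have to loop
or be extended for the mirrored profiles.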
> > For mmap(2) things get more difficult but still: The filesystem gets
> > normal ->fault notifications when the page is first faulted in. So you
> > can COW if you need to at that moment.
>
> Right.
>
> > Also DAX PTEs can be write-protected (well, as of the coming merge
> > window) as normal PTEs and then you'll get ->pfn_mkwrite /
> > ->page_mkwrite notification when someone tries to write via mmap and
> > you can do your stuff at that point.
>
> That's right, but I think the problem comes from the fact that only
> ->fault with FAULT_FLAG_WRITE gets to space allocation where we could
> cow to new location.
>
> For page_mkwrite, btrfs does cow while writing back a dirty page, but
> dax doesn't do delayed allocation so dax_writeback_one doesn't have
> a place to do cow.

Yes, so you'd have to change this logic so that, for DAX, COW already happens
at page_mkwrite() time (when the iomap_begin() handler is called to prepare
blocks for writing at the given file offset) and not at writeback time.
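
As a rough illustration of that ordering, here is a toy user-space C model.
It is not kernel code: struct toy_extent, toy_prepare_write() and the trivial
next_free allocator are hypothetical stand-ins, and the data copy is omitted.
The point is only that the block is redirected to a private copy before the
caller is handed something writable, so nothing is left to do at writeback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, NOT kernel code: one file block. */
struct toy_extent {
	unsigned long phys;  /* block the file currently points at */
	bool shared;         /* true if a snapshot also references phys */
};

/* Trivial stand-in allocator: hands out fresh block numbers. */
static unsigned long next_free = 1000;

/*
 * Stand-in for the "prepare blocks for writing" step run on a write
 * fault. If the block is shared, allocate a new one and repoint the
 * file to it (copying the old data is omitted here). Returns the block
 * that is safe to map writable.
 */
static unsigned long toy_prepare_write(struct toy_extent *e)
{
	assert(e != NULL);

	if (e->shared) {
		e->phys = next_free++;  /* COW happens here, at fault time */
		e->shared = false;
	}
	return e->phys;
}
```

A second write fault on the same block finds it already private and returns
the same block, so the COW cost is paid once per shared block.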
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR