From: Dave Chinner <david@fromorbit.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Jan Kara <jack@suse.cz>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"jmoyer@redhat.com" <jmoyer@redhat.com>,
"hch@lst.de" <hch@lst.de>, "axboe@fb.com" <axboe@fb.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
"willy@linux.intel.com" <willy@linux.intel.com>,
"ross.zwisler@linux.intel.com" <ross.zwisler@linux.intel.com>
Subject: Re: [PATCH 5/5] block: enable dax for raw block devices
Date: Tue, 27 Oct 2015 09:19:30 +1100
Message-ID: <20151026221930.GL19199@dastard>
In-Reply-To: <CAPcyv4iaKDYuP6ppAVk5UOhzmOEO4Q=N_+osB2D+aPoAzpeHvw@mail.gmail.com>
On Mon, Oct 26, 2015 at 05:56:30PM +0900, Dan Williams wrote:
> On Mon, Oct 26, 2015 at 3:23 PM, Dave Chinner <david@fromorbit.com> wrote:
> > Also, DAX access isn't a property of mmap - it's a property
> > of the inode. We cannot do DAX access via mmap while mixing page
> > cache based access through file descriptor based interfaces. This
> > is why I'm adding an inode attribute (on disk) to enable per-file DAX
> > capabilities - either everything is via the DAX paths, or nothing
> > is.
> >
>
> Per-inode control sounds very useful, I'll look at a similar mechanism
> for the raw block case.
>
> However, still not quite convinced page-cache control is an inode-only
> property, especially when direct-i/o is not an inode-property. That
> said, I agree the complexity of handling mixed mappings of the same
> file is prohibitive.
We didn't get that choice with direct IO - support via O_DIRECT was
kinda inherited from other OSes(*). We still have all sorts of
coherency problems between buffered/mmap/direct IO on the same file,
and I'd really, really like to avoid making that same mistake again
with DAX.
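
To make the O_DIRECT point concrete: the flag lives on the open file
description, not on the inode, so two descriptors on the same file can
use different IO paths at the same time. The sketch below is only an
illustration under assumptions; the helper name same_inode_mixed_modes()
is hypothetical, and it assumes Linux with glibc, where _GNU_SOURCE
exposes O_DIRECT.

```c
#define _GNU_SOURCE           /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` twice, once buffered and once with
 * O_DIRECT, and report whether both descriptors refer to the same inode
 * while carrying different IO modes.
 *
 * Returns 1 if so, 0 if not, and -1 if any call fails (for example,
 * a filesystem may reject O_DIRECT with EINVAL).
 */
int same_inode_mixed_modes(const char *path)
{
    int result = -1;
    int buffered = open(path, O_RDONLY);
    int direct = open(path, O_RDONLY | O_DIRECT);
    struct stat sb, sd;

    if (buffered >= 0 && direct >= 0 &&
        fstat(buffered, &sb) == 0 && fstat(direct, &sd) == 0) {
        /* F_GETFL returns the status flags of each open file description. */
        int fl_b = fcntl(buffered, F_GETFL);
        int fl_d = fcntl(direct, F_GETFL);

        if (fl_b >= 0 && fl_d >= 0) {
            int same_inode = sb.st_dev == sd.st_dev && sb.st_ino == sd.st_ino;
            int modes_differ = !(fl_b & O_DIRECT) && (fl_d & O_DIRECT);
            result = same_inode && modes_differ;
        }
    }
    if (buffered >= 0)
        close(buffered);
    if (direct >= 0)
        close(direct);
    return result;
}
```

A return value of 1 means one inode is being reached through both the
page cache and the direct path, which is the mixed access the coherency
problems above come from.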
i.e. we have a choice with DAX right now that will allow us to avoid
coherency problems that we know exist and can't solve right now.
Making DAX an inode property rather than an application context
property avoids those coherence problems as all access will play by
the same rules....
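
As a rough sketch of what an inode-level property looks like from
userspace: later kernels expose a per-inode DAX flag, FS_XFLAG_DAX,
through the FS_IOC_FSGETXATTR ioctl in <linux/fs.h>. That this matches
the on-disk attribute being discussed here is an assumption, and the
helper name inode_dax_flag() is hypothetical. The point is only that
the answer is the same for every descriptor opened on the file.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>            /* struct fsxattr, FS_IOC_FSGETXATTR */

#ifndef FS_XFLAG_DAX
#define FS_XFLAG_DAX 0x00008000  /* value from <linux/fs.h>, for older headers */
#endif

/*
 * Hypothetical helper: report the per-inode DAX flag of `path`.
 *
 * Returns 1 if FS_XFLAG_DAX is set, 0 if it is clear, and -1 if the
 * file cannot be opened or the filesystem does not support the ioctl.
 */
int inode_dax_flag(const char *path)
{
    struct fsxattr fsx;
    int ret = -1;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    /* The flag is read from the inode, so it does not depend on how
     * this particular descriptor was opened. */
    if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) == 0)
        ret = (fsx.fsx_xflags & FS_XFLAG_DAX) ? 1 : 0;

    close(fd);
    return ret;
}
```

Contrast this with O_DIRECT, where the mode is chosen per open() call
and nothing stops two callers from choosing differently.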
(*) That said, some other OSes did O_DIRECT as an inode property (e.g.
Solaris) where O_DIRECT was only done if no other cached operations
were required (e.g. mmap), and so the fd would transparently shift
between buffered and O_DIRECT depending on external accesses to the
inode. This was not liked because of its unpredictable effect on
CPU usage and IO latency....
> Sounds good, get blkdev_issue_flush() functional first and then worry
> about building a more efficient solution on top.
*nod*
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
Thread overview: 46+ messages
2015-10-22 6:41 [PATCH 0/5] block, dax: updates for 4.4 Dan Williams
2015-10-22 6:41 ` [PATCH 1/5] pmem, dax: clean up clear_pmem() Dan Williams
2015-10-22 6:41 ` [PATCH 2/5] dax: increase granularity of dax_clear_blocks() operations Dan Williams
2015-10-22 9:26 ` Jan Kara
2015-10-22 6:41 ` [PATCH 3/5] block, dax: fix lifetime of in-kernel dax mappings with dax_map_atomic() Dan Williams
2015-10-22 6:42 ` [PATCH 4/5] block: introduce file_bd_inode() Dan Williams
2015-10-22 9:45 ` Jan Kara
2015-10-22 15:41 ` Dan Williams
2015-10-22 6:42 ` [PATCH 5/5] block: enable dax for raw block devices Dan Williams
2015-10-22 9:35 ` Jan Kara
2015-10-22 16:05 ` Williams, Dan J
2015-10-22 21:08 ` Jan Kara
2015-10-22 23:41 ` Williams, Dan J
2015-10-24 12:21 ` Jan Kara
2015-10-23 23:32 ` Dan Williams
2015-10-24 14:49 ` Jan Kara
2015-10-25 21:22 ` Dave Chinner
2015-10-26 2:48 ` Dan Williams
2015-10-26 6:23 ` Dave Chinner
2015-10-26 7:20 ` Jan Kara
2015-10-26 8:56 ` Dan Williams
2015-10-26 22:19 ` Dave Chinner [this message]
2015-10-27 22:55 ` Ross Zwisler