From: Jan Kara <jack@suse.cz>
To: Dave Chinner <david@fromorbit.com>
Cc: Dan Williams <dan.j.williams@intel.com>, Jan Kara <jack@suse.cz>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"jmoyer@redhat.com" <jmoyer@redhat.com>,
"hch@lst.de" <hch@lst.de>, "axboe@fb.com" <axboe@fb.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
"willy@linux.intel.com" <willy@linux.intel.com>,
"ross.zwisler@linux.intel.com" <ross.zwisler@linux.intel.com>
Subject: Re: [PATCH 5/5] block: enable dax for raw block devices
Date: Mon, 26 Oct 2015 08:20:57 +0100
Message-ID: <20151026072057.GA11450@quack.suse.cz>
In-Reply-To: <20151026062319.GJ19199@dastard>
On Mon 26-10-15 17:23:19, Dave Chinner wrote:
> On Mon, Oct 26, 2015 at 11:48:06AM +0900, Dan Williams wrote:
> > 2/ Even if we get a new flag that lets the kernel know the app
> > understands DAX mappings, we shouldn't leave fsync broken. Can we
> > instead get by with a simple / big hammer solution? I.e.
>
> Because we don't physically have to write back data the problem is
> both simpler and more complex. The simplest solution is for the
> underlying block device to implement blkdev_issue_flush() correctly.
>
> i.e. if blkdev_issue_flush() behaves according to its required
> semantics - that all volatile cached data is flushed to stable
> storage - then fsync-on-DAX will work appropriately. As it is, this is
> needed for journal based filesystems to work correctly, as they are
> assuming that their journal writes are being treated correctly as
> REQ_FLUSH | REQ_FUA to ensure correct data/metadata/journal
> ordering is maintained....
>
> So, to begin with, this problem needs to be solved at the block
> device level. That's the simple, brute-force, big hammer solution to
> the problem, and it requires no changes at the filesystem level at
> all.
Completely agreed. Just make sure REQ_FLUSH and REQ_FUA work correctly for
pmem, and the fsync(2) / sync(2) issues go away. Fs freezing is a
different story; it will likely need some coordination from the
filesystem layer (although with some luck we could keep it hidden in
fs/super.c and fs/block_dev.c). I can have a look at that once ext4 dax
support works, unless someone beats me to it...
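To make the userspace side of the fsync(2) question concrete, here is a
minimal sketch of the pattern being discussed: an application stores data
through a shared mapping and then asks the kernel to make it durable. The
helper name store_and_sync() and its arguments are hypothetical, chosen
only for illustration; nothing here comes from the patch series. The
sketch runs on any file. Whether the sync calls actually reach stable
storage on a DAX mapping is exactly what depends on the device handling
flush requests as described above.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): copy `len` bytes of `msg`
 * into a MAP_SHARED mapping of `path`, then request durability with
 * msync(MS_SYNC) followed by fsync(). Returns 0 on success, -1 on error.
 */
int store_and_sync(const char *path, const char *msg, size_t len)
{
    char *p;
    int ret = -1;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return -1;

    /* Size the file so the whole mapping is backed by it. */
    if (ftruncate(fd, (off_t)len) != 0)
        goto out_close;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        goto out_close;

    /* Plain CPU stores into the mapping; no write(2) is involved. */
    memcpy(p, msg, len);

    /* Ask the kernel to make the stores durable before returning. */
    if (msync(p, len, MS_SYNC) == 0 && fsync(fd) == 0)
        ret = 0;

    munmap(p, len);
out_close:
    close(fd);
    return ret;
}
```

A caller would treat a 0 return as "the data is on stable storage", which
is only true if the layers below honour the flush semantics discussed in
this thread.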
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR