From: Dave Chinner <david@fromorbit.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Jens Axboe <axboe@fb.com>, Jan Kara <jack@suse.cz>,
"linux-nvdimm@lists.01.org" <linux-nvdimm@ml01.01.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Jeff Moyer <jmoyer@redhat.com>, Jan Kara <jack@suse.com>,
Ross Zwisler <ross.zwisler@linux.intel.com>,
Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v3 02/15] dax: increase granularity of dax_clear_blocks() operations
Date: Tue, 3 Nov 2015 16:52:06 +1100
Message-ID: <20151103055206.GR10656@dastard>
In-Reply-To: <CAPcyv4iwiTMMWGE63KX_tzrH1_pEpPxzAvRNgpaDEXAOhXU1BA@mail.gmail.com>
On Mon, Nov 02, 2015 at 09:31:11PM -0800, Dan Williams wrote:
> On Mon, Nov 2, 2015 at 8:48 PM, Dave Chinner <david@fromorbit.com> wrote:
> > On Mon, Nov 02, 2015 at 07:27:26PM -0800, Dan Williams wrote:
> >> On Mon, Nov 2, 2015 at 4:51 PM, Dave Chinner <david@fromorbit.com> wrote:
> >> > On Sun, Nov 01, 2015 at 11:29:53PM -0500, Dan Williams wrote:
> >> > The zeroing (and the data, for that matter) doesn't need to be
> >> > committed to persistent store until the allocation is written and
> >> > committed to the journal - that will happen with a REQ_FLUSH|REQ_FUA
> >> > write, so it makes sense to deploy the big hammer and delay the
> >> > blocking CPU cache flushes until the last possible moment in cases
> >> > like this.
> >>
> >> In pmem terms that would be a non-temporal memset plus a delayed
> >> wmb_pmem at REQ_FLUSH time. Better to write around the cache than
> >> loop over the dirty-data issuing flushes after the fact. We'll bump
> >> the priority of the non-temporal memset implementation.
> >
> > Why is it better to do two synchronous physical writes to memory
> > within a couple of microseconds of CPU time rather than writing them
> > through the cache and, in most cases, only doing one physical write
> > to memory in a separate context that expects to wait for a flush
> > to complete?
>
> With a switch to non-temporal writes they wouldn't be synchronous,
> although it's doubtful that the subsequent writes after zeroing would
> also hit the store buffer.
>
> If we had a method to flush by physical-cache-way rather than a
> virtual address then it would indeed be better to save up for one
> final flush, but when we need to resort to looping through all the
> virtual addresses that might have touched it gets expensive.
msync() is for flushing userspace mmap address ranges back to
physical memory. fsync() is for flushing kernel addresses (i.e. as
returned by bdev_direct_access()) back to physical memory.
msync() calls ->fsync() as part of its operation; fsync() does not
care whether mmap has been sync'd first or not.
i.e. we don't care about random dirty userspace virtual mappings in
fsync() - if you have them then you need to call msync() first. So
we shouldn't ever be having to walk virtual addresses in fsync -
just the kaddr returned by bdev_direct_access() is all that fsync
needs to flush...
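To make the ordering concrete from the userspace side, here is a minimal
sketch, assuming an ordinary file on any filesystem (it does not touch
bdev_direct_access() or any DAX internals). The helper name
write_and_sync() is hypothetical and exists only for illustration. It
dirties a shared mapping, calls msync() on that range first, and only
then calls fsync() on the descriptor:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch series): store msg into a
 * shared mapping of path, then flush it. Returns 0 on success, -1 on error.
 */
int write_and_sync(const char *path, const char *msg)
{
	assert(path != NULL && msg != NULL);

	size_t len = strlen(msg) + 1;
	int rc = 0;

	int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* Size the file so the whole mapping is backed by it. */
	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		return -1;
	}

	char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/* Dirty the userspace mapping. */
	memcpy(map, msg, len);

	/* Step 1: flush the userspace mmap range. */
	if (msync(map, len, MS_SYNC) < 0)
		rc = -1;

	/* Step 2: fsync() does not look at userspace mappings itself. */
	if (fsync(fd) < 0)
		rc = -1;

	munmap(map, len);
	close(fd);
	return rc;
}
```

Since msync() already calls ->fsync() as described above, the explicit
fsync() is only there to show the order an application would use when it
also has data written through other paths.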
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com