From: Dan Williams <dan.j.williams@intel.com>
To: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-nvdimm <linux-nvdimm@lists.01.org>
Subject: Re: [PATCH 2/2] libnvdimm: don't flush power-fail protected CPU caches
Date: Tue, 5 Jun 2018 15:12:20 -0700
Message-ID: <CAPcyv4h4U_nocetqT-ejjc5O7gzmAh6oShvb7CQ+3eVM1qw6tA@mail.gmail.com>
In-Reply-To: <20180605215926.GA16066@linux.intel.com>

On Tue, Jun 5, 2018 at 2:59 PM, Ross Zwisler
<ross.zwisler@linux.intel.com> wrote:
> On Tue, Jun 05, 2018 at 02:20:38PM -0700, Dan Williams wrote:
>> On Tue, Jun 5, 2018 at 1:58 PM, Ross Zwisler
>> <ross.zwisler@linux.intel.com> wrote:
>> > This commit:
>> >
>> > 5fdf8e5ba566 ("libnvdimm: re-enable deep flush for pmem devices via fsync()")
>> >
>> > intended to make sure that deep flush was always available even on
>> > platforms which support a power-fail protected CPU cache. An unintended
>> > side effect of this change was that we also lost the ability to skip
>> > flushing CPU caches on platforms with power-fail protected CPU caches.
>> >
>> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
>> > Fixes: 5fdf8e5ba566 ("libnvdimm: re-enable deep flush for pmem devices via fsync()")
>> > ---
>> > drivers/dax/super.c | 20 +++++++++++++++++++-
>> > drivers/nvdimm/pmem.c | 2 ++
>> > include/linux/dax.h | 9 +++++++++
>> > 3 files changed, 30 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/drivers/dax/super.c b/drivers/dax/super.c
>> > index c2c46f96b18c..457e0bb6c936 100644
>> > --- a/drivers/dax/super.c
>> > +++ b/drivers/dax/super.c
>> > @@ -152,6 +152,8 @@ enum dax_device_flags {
>> >         DAXDEV_ALIVE,
>> >         /* gate whether dax_flush() calls the low level flush routine */
>> >         DAXDEV_WRITE_CACHE,
>> > +       /* only flush the CPU caches if they are not power fail protected */
>> > +       DAXDEV_FLUSH_ON_SYNC,
>> >  };
>> >
>> > /**
>> > @@ -283,7 +285,8 @@ EXPORT_SYMBOL_GPL(dax_copy_from_iter);
>> > void arch_wb_cache_pmem(void *addr, size_t size);
>> > void dax_flush(struct dax_device *dax_dev, void *addr, size_t size)
>> >  {
>> > -       if (unlikely(!dax_write_cache_enabled(dax_dev)))
>> > +       if (unlikely(!dax_write_cache_enabled(dax_dev)) ||
>> > +                       !dax_flush_on_sync_enabled(dax_dev))
>>
>> This seems backwards. I think we should teach the pmem driver to still
>> issue deep flush even when dax_write_cache_enabled() is false.
>
> That does still happen. Deep flush is essentially controlled by the 'wbc'
> variable in pmem_attach_disk(), which we use to set blk_queue_write_cache().

Right, what I'm trying to kill is the need to add
dax_flush_on_sync_enabled(). I think we can handle this locally in the
pmem driver and not extend the 'dax' API.

> My understanding is that this causes the block layer to send down
> REQ_FUA/REQ_PREFLUSH BIOs, and it's in response to these that we do a deep
> flush via nvdimm_flush(). Whether this happens is totally up to the device's
> write cache setting, and doesn't look at whether the platform has
> flush-on-fail CPU caches.
>
> This does bring up another wrinkle, though: we export a write_cache sysfs
> entry that you can use to change the write cache setting of a namespace:
>
> i.e.:
> /sys/bus/nd/devices/pfn0.1/block/pmem0/dax/write_cache
>
> This changes whether or not the DAXDEV_WRITE_CACHE flag is set, but does *not*
> change whether the block queue says it supports a write cache
> (blk_queue_write_cache()). So, the sysfs entry ends up controlling whether or
> not we do CPU cache flushing via DAX, but does not do anything with the deep
> flush code.
>
> I'm guessing this should be fixed? I'll go take a look...

I think we need to disconnect DAXDEV_WRITE_CACHE from the indication
of the filesystem triggering nvdimm_flush() via REQ_{FUA,PREFLUSH}.
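
To make the proposed split concrete, here is a rough user-space model of
the behavior discussed above. It is only a sketch and not kernel code:
every model_* name and the MODEL_REQ_* constants are hypothetical
stand-ins, and it assumes the driver decides at attach time whether CPU
cache writeback is needed while deep flush follows the request flags
only. As a simplification it counts one deep flush per flagged request.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the request flags named in the thread. */
enum { MODEL_REQ_PREFLUSH = 1 << 0, MODEL_REQ_FUA = 1 << 1 };

struct model_pmem {
	bool dax_write_cache;   /* models DAXDEV_WRITE_CACHE */
	bool queue_write_cache; /* models the blk_queue_write_cache() setting */
	int cpu_cache_flushes;  /* counts modeled CPU cache writebacks */
	int deep_flushes;       /* counts modeled nvdimm_flush() calls */
};

/* Attach-time policy kept local to the driver; no extra dax flag. */
void model_attach(struct model_pmem *p, bool cpu_cache_flush_on_fail)
{
	/* Skip CPU cache writeback only if the platform protects the caches. */
	p->dax_write_cache = !cpu_cache_flush_on_fail;
	/* Assumption: deep flush stays available on every platform. */
	p->queue_write_cache = true;
	p->cpu_cache_flushes = 0;
	p->deep_flushes = 0;
}

/* Models the sysfs write_cache knob: it touches the dax flag only. */
void model_set_dax_write_cache(struct model_pmem *p, bool enabled)
{
	p->dax_write_cache = enabled;
}

/* Models dax_flush(): gated by the dax write-cache flag alone. */
void model_dax_flush(struct model_pmem *p)
{
	if (!p->dax_write_cache)
		return;
	p->cpu_cache_flushes++;
}

/* Models the bio path: deep flush depends on the request flags alone. */
void model_submit(struct model_pmem *p, unsigned int flags)
{
	/* Without a queue write cache, no flush requests reach the driver. */
	if (!p->queue_write_cache)
		flags &= ~(unsigned int)(MODEL_REQ_PREFLUSH | MODEL_REQ_FUA);
	if (flags & (MODEL_REQ_PREFLUSH | MODEL_REQ_FUA))
		p->deep_flushes++;
}
```

In this model, model_attach(&p, true) followed by model_dax_flush(&p)
and model_submit(&p, MODEL_REQ_PREFLUSH) leaves cpu_cache_flushes at 0
and deep_flushes at 1, which is the combination the patch is after
without adding a new helper to the dax API.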
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
Thread overview:
2018-06-05 20:58 [PATCH 1/2] libnvdimm: use dax_write_cache* helpers Ross Zwisler
2018-06-05 20:58 ` [PATCH 2/2] libnvdimm: don't flush power-fail protected CPU caches Ross Zwisler
2018-06-05 21:20 ` Dan Williams
2018-06-05 21:59 ` Ross Zwisler
2018-06-05 22:12 ` Dan Williams [this message]