From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: alistair@popple.id.au,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
linux-nvdimm <linux-nvdimm@lists.01.org>
Subject: Re: [PATCH v2 3/5] libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier
Date: Thu, 21 May 2020 00:13:42 +0530 [thread overview]
Message-ID: <87v9kqqwf5.fsf@linux.ibm.com> (raw)
In-Reply-To: <CAPcyv4jZhYXEmYGzqGPjPtq9ZWJNtQyszN0V0Xcv0qtByK_KCw@mail.gmail.com>
Dan Williams <dan.j.williams@intel.com> writes:
> On Tue, May 19, 2020 at 6:53 AM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:
>>
>> Dan Williams <dan.j.williams@intel.com> writes:
>>
>> > On Mon, May 18, 2020 at 10:30 PM Aneesh Kumar K.V
>> > <aneesh.kumar@linux.ibm.com> wrote:
>>
>> ...
>>
>> >> Applications using new instructions will behave as expected when running
>> >> on P8 and P9. Only future hardware will differentiate between 'dcbf' and
>> >> 'dcbfps'
>> >
>> > Right, this is the problem. Applications using new instructions behave
>> > as expected, the kernel has been shipping of_pmem and papr_scm for
>> > several cycles now, you're saying that the DAX applications written
>> > against those platforms are going to be broken on P8 and P9?
>>
>> The expecation is that both kernel and userspace would get upgraded to
>> use the new instruction before actual persistent memory devices are
>> made available.
>>
>> >
>> >> > I'm thinking the kernel
>> >> > should go as far as to disable DAX operation by default on new
>> >> > hardware until userspace asserts that it is prepared to switch to the
>> >> > new implementation. Is there any other way to ensure the forward
>> >> > compatibility of deployed ppc64 DAX applications?
>> >>
>> >> AFAIU there is no released persistent memory hardware on ppc64 platform
>> >> and we need to make sure before applications get enabled to use these
>> >> persistent memory devices, they should switch to use the new
>> >> instruction?
>> >
>> > Right, I want the kernel to offer some level of safety here because
>> > everything you are describing sounds like a flag day conversion. Am I
>> > misreading? Is there some other gate that prevents existing users of
>> > of_pmem and papr_scm from having their expectations violated when
>> > running on P8 / P9 hardware? Maybe there's tighter ecosystem control
>> > that I'm just not familiar with, I'm only going off the fact that the
>> > kernel has shipped a non-zero number of NVDIMM drivers that build with
>> > ARCH=ppc64 for several cycles.
>>
>> If we are looking at adding changes to kernel that will prevent a kernel
>> from running on newer hardware in a specific case, we could as well take
>> the changes to get the kernel use the newer instructions right?
>
> Oh, no, I'm not talking about stopping the kernel from running. I'm
> simply recommending that support for MAP_SYNC mappings (userspace
> managed flushing) be disabled by default on PPC with either a
> compile-time or run-time default to assert that userspace has been
> audited for legacy applications or that the platform owner is
> otherwise willing to take the risk.
>
>> But I agree with your concern that if we have older kernel/applications
>> that continue to use `dcbf` on future hardware we will end up
>> having issues w.r.t powerfail consistency. The plan is what you outlined
>> above as tighter ecosystem control. Considering we don't have a pmem
>> device generally available, we get both kernel and userspace upgraded
>> to use these new instructions before such a device is made available.
>
> Ok, I think a compile time kernel option with a runtime override
> satisfies my concern. Does that work for you?
something like below? But this still won't handle devdax mmap right?
diff --git a/arch/arm64/include/asm/libnvdimm.h b/arch/arm64/include/asm/libnvdimm.h
new file mode 100644
index 000000000000..aee697a72537
--- /dev/null
+++ b/arch/arm64/include/asm/libnvdimm.h
@@ -0,0 +1,6 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ARCH_LIBNVDIMM_H__
+#define __ARCH_LIBNVDIMM_H__
+
+#endif /* __ARCH_LIBNVDIMM_H__ */
diff --git a/arch/powerpc/include/asm/libnvdimm.h b/arch/powerpc/include/asm/libnvdimm.h
new file mode 100644
index 000000000000..da479200bfb8
--- /dev/null
+++ b/arch/powerpc/include/asm/libnvdimm.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ARCH_LIBNVDIMM_H__
+#define __ARCH_LIBNVDIMM_H__
+
+#define arch_disable_sync_nvdimm arch_disable_sync_nvdimm
+extern bool arch_disable_sync_nvdimm(void);
+
+#endif /* __ARCH_LIBNVDIMM_H__ */
diff --git a/arch/powerpc/lib/pmem.c b/arch/powerpc/lib/pmem.c
index 0666a8d29596..3ce4fb4f167b 100644
--- a/arch/powerpc/lib/pmem.c
+++ b/arch/powerpc/lib/pmem.c
@@ -9,6 +9,8 @@
#include <asm/cacheflush.h>
+
+static bool sync_fault = IS_ENABLED(CONFIG_PPC_NVDIMM_SYNC_FAULT);
/*
* CONFIG_ARCH_HAS_PMEM_API symbols
*/
@@ -57,3 +59,16 @@ void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
memcpy_flushcache(to, page_to_virt(page) + offset, len);
}
EXPORT_SYMBOL(memcpy_page_flushcache);
+
+bool arch_disable_sync_nvdimm(void)
+{
+ return !sync_fault;
+}
+
+static int __init parse_sync_fault(char *p)
+{
+ sync_fault = true;
+ return 0;
+}
+early_param("enable_sync_fault", parse_sync_fault);
+
diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index 27a81c291be8..dde11d75a746 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -383,6 +383,15 @@ config PPC_KUEP
If you're unsure, say Y.
+config PPC_NVDIMM_SYNC_FAULT
+ bool "Synchronous fault support (MAP_SYNC)"
+ default n
+ help
+ Enable support for synchronous fault with nvdimm namespaces.
+
+ If you're unsure, say N.
+
+
config PPC_HAVE_KUAP
bool
diff --git a/arch/x86/include/asm/libnvdimm.h b/arch/x86/include/asm/libnvdimm.h
new file mode 100644
index 000000000000..aee697a72537
--- /dev/null
+++ b/arch/x86/include/asm/libnvdimm.h
@@ -0,0 +1,6 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ARCH_LIBNVDIMM_H__
+#define __ARCH_LIBNVDIMM_H__
+
+#endif /* __ARCH_LIBNVDIMM_H__ */
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index ccbb5b43b8b2..74a0809491af 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -1278,6 +1278,13 @@ bool is_nvdimm_sync(struct nd_region *nd_region)
if (is_nd_volatile(&nd_region->dev))
return true;
+ /*
+ * If arch is forcing a synchronous fault
+ * disable.
+ */
+ if (arch_disable_sync_nvdimm())
+ return false;
+
return is_nd_pmem(&nd_region->dev) &&
!test_bit(ND_REGION_ASYNC, &nd_region->flags);
}
diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index 18da4059be09..891449aebe91 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -13,6 +13,8 @@
#include <linux/spinlock.h>
#include <linux/bio.h>
+#include <asm/libnvdimm.h>
+
struct badrange_entry {
u64 start;
u64 length;
@@ -286,4 +288,12 @@ static inline void arch_invalidate_pmem(void *addr, size_t size)
}
#endif
+#ifndef arch_disable_sync_nvdimm
+#define arch_disable_sync_nvdimm arch_disable_sync_nvdimm
+static inline bool arch_disable_sync_nvdimm()
+{
+ return false;
+}
+#endif
+
#endif /* __LIBNVDIMM_H__ */
next prev parent reply other threads:[~2020-05-20 18:49 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-13 3:47 [PATCH v2 1/5] powerpc/pmem: Add new instructions for persistent storage and sync Aneesh Kumar K.V
2020-05-13 3:47 ` [PATCH v2 2/5] powerpc/pmem: Add flush routines using new pmem store and sync instruction Aneesh Kumar K.V
2020-05-13 3:47 ` [PATCH v2 3/5] libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier Aneesh Kumar K.V
2020-05-13 16:14 ` Dan Williams
2020-05-19 5:30 ` Aneesh Kumar K.V
2020-05-19 7:09 ` Dan Williams
2020-05-19 13:52 ` Aneesh Kumar K.V
2020-05-19 18:59 ` Dan Williams
2020-05-20 18:43 ` Aneesh Kumar K.V [this message]
2020-05-21 14:38 ` Jeff Moyer
2020-05-21 17:02 ` Aneesh Kumar K.V
2020-05-21 18:25 ` Dan Williams
2020-05-21 18:52 ` Mikulas Patocka
2020-05-22 9:31 ` Michal Suchánek
2020-05-22 10:08 ` Aneesh Kumar K.V
2020-05-22 13:01 ` Mikulas Patocka
2020-06-26 10:20 ` Michal Suchánek
2020-05-21 18:34 ` Dan Williams
2020-05-13 3:47 ` [PATCH v2 4/5] powerpc/pmem/of_pmem: Update of_pmem to use the new barrier instruction Aneesh Kumar K.V
2020-05-13 6:44 ` kbuild test robot
2020-05-13 3:47 ` [PATCH v2 5/5] powerpc/pmem: Avoid the barrier in flush routines Aneesh Kumar K.V
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87v9kqqwf5.fsf@linux.ibm.com \
--to=aneesh.kumar@linux.ibm.com \
--cc=alistair@popple.id.au \
--cc=dan.j.williams@intel.com \
--cc=linux-nvdimm@lists.01.org \
--cc=linuxppc-dev@lists.ozlabs.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).