* [PATCH 0/6] pmem, dax: I/O path enhancements
From: Ross Zwisler @ 2015-08-06 17:43 UTC
To: linux-kernel, linux-nvdimm, dan.j.williams
Cc: Ross Zwisler, Alexander Viro, Borislav Petkov, H. Peter Anvin,
Ingo Molnar, Juergen Gross, Len Brown, linux-acpi, linux-fsdevel,
Luis R. Rodriguez, Matthew Wilcox, Rafael J. Wysocki,
Thomas Gleixner, Toshi Kani, x86

Patch 5 adds support for the "read flush" _DSM flag, allowing us to change
the ND BLK aperture mapping from write-combining to write-back via
memremap_pmem().

Patch 6 updates the DAX I/O path so that all operations that store data
(I/O writes, zeroing blocks, punching holes, etc.) properly synchronize
those stores to media using the PMEM API.  This ensures that the data DAX
writes is durable on media before the operation completes.
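
In sketch form, the store-then-sync pattern patch 6 applies looks like the
below (the wrapper is illustrative only and not a function added by this
series, but memcpy_to_pmem() and wmb_pmem() are the existing PMEM API
entry points):

	/* illustrative wrapper only -- not added by this series */
	static void dax_store_durable(void __pmem *dst, const void *src,
			size_t len)
	{
		/* non-temporal copy into the persistent-memory mapping */
		memcpy_to_pmem(dst, src, len);

		/* drain queued stores so the data is durable on media */
		wmb_pmem();
	}
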
Patches 1-4 are cleanup patches and additions to the PMEM API that make
patches 5 and 6 possible.

Regarding the choice to add both flush_cache_pmem() and wb_cache_pmem() to
the PMEM API: I had initially implemented flush_cache_pmem() as a generic
function, flush_io_cache_range(), in the spirit of flush_cache_range() and
friends in cacheflush.h.  I eventually moved it into the PMEM API because
a) it makes common, consistent use of the __pmem annotation, and b) it has
a clear fallback path for architectures that don't support it, whereas an
API in cacheflush.h would need to be added individually to every other
architecture.  It can be argued that the flush API could apply to uses
beyond PMEM, such as flushing cache lines associated with other types of
sliding MMIO windows.  At this point I'm inclined to keep it as part of
the PMEM API, and then take on the effort of making it a general cache
flushing API if other users come along.
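
The fallback shape I have in mind looks roughly like this (a sketch, not
lifted verbatim from the patches; CONFIG_ARCH_HAS_PMEM_API is the existing
kconfig symbol):

	#ifdef CONFIG_ARCH_HAS_PMEM_API
	void flush_cache_pmem(void __pmem *addr, size_t size); /* arch impl */
	#else
	static inline void flush_cache_pmem(void __pmem *addr, size_t size)
	{
		/*
		 * No PMEM API: memremap_pmem() will have produced an
		 * uncached/WC mapping, so there is nothing to invalidate.
		 */
	}
	#endif
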
Ross Zwisler (6):
pmem: remove indirection layer arch_has_pmem_api()
x86: clean up conditional pmem includes
x86: add clwb_cache_range()
pmem: Add wb_cache_pmem() and flush_cache_pmem()
nd_blk: add support for "read flush" DSM flag
dax: update I/O path to do proper PMEM flushing
arch/x86/include/asm/cacheflush.h | 24 +++++++++--------
arch/x86/mm/pageattr.c | 23 ++++++++++++++++
drivers/acpi/nfit.c | 18 ++++++-------
drivers/acpi/nfit.h | 6 ++++-
fs/dax.c | 55 +++++++++++++++++++++++++++++++--------
include/linux/pmem.h | 36 ++++++++++++++++++-------
6 files changed, 120 insertions(+), 42 deletions(-)
--
2.1.0
* [PATCH 5/6] nd_blk: add support for "read flush" DSM flag
From: Ross Zwisler @ 2015-08-06 17:43 UTC
To: linux-kernel, linux-nvdimm, dan.j.williams
Cc: Ross Zwisler, Rafael J. Wysocki, Len Brown, linux-acpi
Add support for the "read flush" _DSM flag, as outlined in the DSM spec:

http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf

This flag tells the ND BLK driver that it needs to flush the cache lines
associated with the aperture after the aperture is moved but before any
new data is read.  This ensures that any stale cache lines from the
previous contents of the aperture will be discarded from the processor
cache, and the new data will be read properly from the DIMM.  We know that
the cache lines are clean and will be discarded without any writeback
because either a) the previous aperture operation was a read, and we never
modified the contents of the aperture, or b) the previous aperture
operation was a write and we must have written back the dirtied contents
of the aperture to the DIMM before the I/O was completed.

By supporting the "read flush" flag we can also change the ND BLK aperture
mapping from write-combining to write-back via memremap_pmem().

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
drivers/acpi/nfit.c | 18 +++++++++---------
drivers/acpi/nfit.h | 6 +++++-
2 files changed, 14 insertions(+), 10 deletions(-)
diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
index 7c2638f..5bd6819 100644
--- a/drivers/acpi/nfit.c
+++ b/drivers/acpi/nfit.c
@@ -1080,9 +1080,13 @@ static int acpi_nfit_blk_single_io(struct nfit_blk *nfit_blk,
 		if (rw)
 			memcpy_to_pmem(mmio->aperture + offset,
 					iobuf + copied, c);
-		else
+		else {
+			if (nfit_blk->dimm_flags & ND_BLK_READ_FLUSH)
+				flush_cache_pmem(mmio->aperture + offset, c);
+
 			memcpy_from_pmem(iobuf + copied,
 					mmio->aperture + offset, c);
+		}
 
 		copied += c;
 		len -= c;
@@ -1191,13 +1195,9 @@ static void __iomem *__nfit_spa_map(struct acpi_nfit_desc *acpi_desc,
 	if (!res)
 		goto err_mem;
 
-	if (type == SPA_MAP_APERTURE) {
-		/*
-		 * TODO: memremap_pmem() support, but that requires cache
-		 * flushing when the aperture is moved.
-		 */
-		spa_map->iomem = ioremap_wc(start, n);
-	} else
+	if (type == SPA_MAP_APERTURE)
+		spa_map->aperture = memremap_pmem(start, n);
+	else
 		spa_map->iomem = ioremap_nocache(start, n);
 
 	if (!spa_map->iomem)
@@ -1267,7 +1267,7 @@ static int acpi_nfit_blk_get_flags(struct nvdimm_bus_descriptor *nd_desc,
 		nfit_blk->dimm_flags = flags.flags;
 	else if (rc == -ENOTTY) {
 		/* fall back to a conservative default */
-		nfit_blk->dimm_flags = ND_BLK_DCR_LATCH;
+		nfit_blk->dimm_flags = ND_BLK_DCR_LATCH | ND_BLK_READ_FLUSH;
 		rc = 0;
 	} else
 		rc = -ENXIO;
diff --git a/drivers/acpi/nfit.h b/drivers/acpi/nfit.h
index f2c2bb7..7c6990e 100644
--- a/drivers/acpi/nfit.h
+++ b/drivers/acpi/nfit.h
@@ -41,6 +41,7 @@ enum nfit_uuids {
 };
 
 enum {
+	ND_BLK_READ_FLUSH = 1,
 	ND_BLK_DCR_LATCH = 2,
 };
 
@@ -149,7 +150,10 @@ struct nfit_spa_mapping {
 	struct acpi_nfit_system_address *spa;
 	struct list_head list;
 	struct kref kref;
-	void __iomem *iomem;
+	union {
+		void __iomem *iomem;
+		void __pmem *aperture;
+	};
 };
 
 static inline struct nfit_spa_mapping *to_spa_map(struct kref *kref)
--
2.1.0

* Re: [PATCH 0/6] pmem, dax: I/O path enhancements
From: Dan Williams @ 2015-08-07 16:47 UTC
To: Ross Zwisler
Cc: linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org,
Alexander Viro, Borislav Petkov, H. Peter Anvin, Ingo Molnar,
Juergen Gross, Len Brown, Linux ACPI, linux-fsdevel,
Luis R. Rodriguez, Matthew Wilcox, Rafael J. Wysocki,
Thomas Gleixner, Toshi Kani, X86 ML
On Thu, Aug 6, 2015 at 10:43 AM, Ross Zwisler
<ross.zwisler@linux.intel.com> wrote:
> [..]
>
> Regarding the choice to add both flush_cache_pmem() and wb_cache_pmem() to
> the PMEM API: I had initially implemented flush_cache_pmem() as a generic
> function, flush_io_cache_range(), in the spirit of flush_cache_range() and
> friends in cacheflush.h.  I eventually moved it into the PMEM API because
> a) it makes common, consistent use of the __pmem annotation, and b) it has
> a clear fallback path for architectures that don't support it, whereas an
> API in cacheflush.h would need to be added individually to every other
> architecture.  It can be argued that the flush API could apply to uses
> beyond PMEM, such as flushing cache lines associated with other types of
> sliding MMIO windows.  At this point I'm inclined to keep it as part of
> the PMEM API, and then take on the effort of making it a general cache
> flushing API if other users come along.
> [..]

I'm not convinced.  There are already existing users that need to
invalidate a cpu cache, and they currently jump through hoops to get
cross-arch flushing; see drm_clflush_pages().  What the NFIT-BLK driver
brings to the table is just one more instance where the cpu cache needs
to be invalidated, and for something so fundamental it is time we had a
cross-arch generic helper.
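
Something along these lines would do (the name and placement are
placeholders; clflush_cache_range() is the helper x86 already exports):

	/* placeholder sketch of a generic helper -- not an existing API */
	static inline void flush_io_cache_range(void *vaddr, size_t size)
	{
	#ifdef CONFIG_X86
		/* clflush/clflushopt loop over the range */
		clflush_cache_range(vaddr, size);
	#else
		/* other archs would plug in their own invalidate, the way
		 * drm_clflush_pages() open-codes it per-arch today */
		WARN_ONCE(1, "cache invalidate not implemented\n");
	#endif
	}
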
The cache-writeback case is different.  To date we've only used writeback
for I/O-incoherent archs.  x86 now for the first time (potentially) needs
a writeback API specifically for guaranteeing persistence.  I say
"potentially" because all the cases where we need to guarantee persistence
could be handled with non-temporal stores.
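
For example, roughly what the existing x86 non-temporal copy path already
does (a simplified sketch, with the error handling of the real arch code
elided):

	static inline void arch_memcpy_to_pmem(void __pmem *dst,
			const void *src, size_t n)
	{
		/*
		 * __copy_from_user_inatomic_nocache() uses movnti, so the
		 * stores bypass the cache and never leave dirty lines that
		 * would need an explicit writeback before wmb_pmem().
		 */
		if (__copy_from_user_inatomic_nocache((void __force *) dst,
				(void __user *) src, n))
			BUG();	/* kernel-to-kernel copy cannot fault */
	}
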
The __pmem annotation is a separate issue that we need to tackle. I
think Christoph is already on team "__pmem is a mistake", but I think
we should walk through what carrying it forward would look like. The
__pfn_t patches allow for flags to be attached to the pfn(s) returned
from ->direct_access(). We could add a PFN_PMEM flag and teach
kmap_atomic_pfn_t() to only operate on !PFN_PMEM pfns. A new
"kmap_atomic_pmem()" would be needed to map pfns from the pmem
driver's ->direct_access() and that would return "void __pmem *". I
think this would force DAX to always be "__pmem clean" regardless of
whether we got the pfns from BRD or PMEM. It becomes messy when we
consider carrying __pfn_t in a bio_vec.  But I think it becomes messy in
precisely the right way, in that drivers that want to set up DMA-to-pmem
should consciously be handling the __pmem annotation and
the resulting side effects.
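
In sketch form (every identifier below is hypothetical -- this is the
proposed interface, not merged code):

	#define PFN_PMEM (1UL << 0)	/* pfn refers to persistent memory */

	/* would refuse PFN_PMEM pfns, pushing callers to the pmem variant */
	void *kmap_atomic_pfn_t(__pfn_t pfn);

	/*
	 * pmem-aware variant: returns a __pmem-annotated pointer so sparse
	 * catches callers that touch it without going through the PMEM API,
	 * keeping DAX "__pmem clean" whether the pfns came from BRD or PMEM.
	 */
	void __pmem *kmap_atomic_pmem(__pfn_t pfn);
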
* Re: [PATCH 0/6] pmem, dax: I/O path enhancements
From: Ross Zwisler @ 2015-08-07 19:06 UTC
To: Dan Williams
Cc: linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org,
Alexander Viro, Borislav Petkov, H. Peter Anvin, Ingo Molnar,
Juergen Gross, Len Brown, Linux ACPI, linux-fsdevel,
Luis R. Rodriguez, Matthew Wilcox, Rafael J. Wysocki,
Thomas Gleixner, Toshi Kani, X86 ML
On Fri, 2015-08-07 at 09:47 -0700, Dan Williams wrote:
> [..]
>
> I'm not convinced.  There are already existing users that need to
> invalidate a cpu cache, and they currently jump through hoops to get
> cross-arch flushing; see drm_clflush_pages().  What the NFIT-BLK driver
> brings to the table is just one more instance where the cpu cache needs
> to be invalidated, and for something so fundamental it is time we had a
> cross-arch generic helper.

Fair enough. I'll move back to the flush_io_cache_range() solution.