* [PATCH v7 0/7] mm: Merge hmm into devm_memremap_pages, mark GPL-only
@ 2018-10-12 17:47 Dan Williams
0 siblings, 0 replies; 4+ messages in thread
From: Dan Williams @ 2018-10-12 17:47 UTC (permalink / raw)
To: akpm
Cc: stable, Balbir Singh, Logan Gunthorpe, Christoph Hellwig,
Jérôme Glisse, Michal Hocko, Jérôme Glisse,
linux-mm, linux-kernel
Changes since v6 [1]:
* Rebase on next-20181008 and fixup conflicts with the xarray conversion
and hotplug optimizations
* It has soaked on a 0day visible branch for a few days without any
reports.
[1]: https://lkml.org/lkml/2018/9/13/104
---
Hi Andrew,
Jérôme has reviewed the cleanups, thanks Jérôme. We still disagree on
the EXPORT_SYMBOL_GPL status of the core HMM implementation, but Logan,
Christoph and I continue to support marking all devm_memremap_pages()
derivatives EXPORT_SYMBOL_GPL.
HMM has been upstream for over a year, with no in-tree users it is clear
it was designed first and foremost for out of tree drivers. It takes
advantage of a facility Christoph and I spearheaded to support
persistent memory. It continues to see expanding use cases with no clear
end date when it will stop attracting features / revisions. It is not
suitable to export devm_memremap_pages() as a stable 3rd party driver
api.
devm_memremap_pages() is a facility that can create struct page entries
for any arbitrary range and give out-of-tree drivers the ability to
subvert core aspects of page management. It, and anything derived from
it (e.g. hmm, pcip2p, etc...), is a deep integration point into the core
kernel, and an EXPORT_SYMBOL_GPL() interface.
Commit 31c5bda3a656 "mm: fix exports that inadvertently make put_page()
EXPORT_SYMBOL_GPL" was merged ahead of this series to relieve some of
the pressure from innocent consumers of put_page(), but now we need this
series to address *producers* of device pages.
More details and justification in the changelogs. The 0day
infrastructure has reported success across 152 configs and this survives
the libnvdimm unit test suite. Aside from the controversial bits the
diffstat is compelling at:
7 files changed, 126 insertions(+), 323 deletions(-)
Note that the series has some minor collisions with Alex's recent series
to improve devm_memremap_pages() scalability [2]. So, whichever you take
first the other will need a minor rebase.
[2]: https://www.lkml.org/lkml/2018/9/11/10
Dan Williams (7):
mm, devm_memremap_pages: Mark devm_memremap_pages() EXPORT_SYMBOL_GPL
mm, devm_memremap_pages: Kill mapping "System RAM" support
mm, devm_memremap_pages: Fix shutdown handling
mm, devm_memremap_pages: Add MEMORY_DEVICE_PRIVATE support
mm, hmm: Use devm semantics for hmm_devmem_{add,remove}
mm, hmm: Replace hmm_devmem_pages_create() with devm_memremap_pages()
mm, hmm: Mark hmm_devmem_{add,add_resource} EXPORT_SYMBOL_GPL
drivers/dax/pmem.c | 14 --
drivers/nvdimm/pmem.c | 13 +-
include/linux/hmm.h | 4
include/linux/memremap.h | 2
kernel/memremap.c | 94 +++++++----
mm/hmm.c | 305 +++++--------------------------------
tools/testing/nvdimm/test/iomap.c | 17 ++
7 files changed, 126 insertions(+), 323 deletions(-)
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v7 0/7] mm: Merge hmm into devm_memremap_pages, mark GPL-only
@ 2018-10-12 17:49 Dan Williams
2018-10-12 17:49 ` [PATCH v7 3/7] mm, devm_memremap_pages: Fix shutdown handling Dan Williams
2018-10-12 18:14 ` [PATCH v7 0/7] mm: Merge hmm into devm_memremap_pages, mark GPL-only Jerome Glisse
0 siblings, 2 replies; 4+ messages in thread
From: Dan Williams @ 2018-10-12 17:49 UTC (permalink / raw)
To: akpm
Cc: stable, Balbir Singh, Logan Gunthorpe, Christoph Hellwig,
Jérôme Glisse, Michal Hocko, Jérôme Glisse,
linux-mm, linux-kernel
[ apologies for the resend, script error ]
Changes since v6 [1]:
* Rebase on next-20181008 and fixup conflicts with the xarray conversion
and hotplug optimizations
* It has soaked on a 0day visible branch for a few days without any
reports.
[1]: https://lkml.org/lkml/2018/9/13/104
---
Hi Andrew,
Jérôme has reviewed the cleanups, thanks Jérôme. We still disagree on
the EXPORT_SYMBOL_GPL status of the core HMM implementation, but Logan,
Christoph and I continue to support marking all devm_memremap_pages()
derivatives EXPORT_SYMBOL_GPL.
HMM has been upstream for over a year, with no in-tree users it is clear
it was designed first and foremost for out of tree drivers. It takes
advantage of a facility Christoph and I spearheaded to support
persistent memory. It continues to see expanding use cases with no clear
end date when it will stop attracting features / revisions. It is not
suitable to export devm_memremap_pages() as a stable 3rd party driver
api.
devm_memremap_pages() is a facility that can create struct page entries
for any arbitrary range and give out-of-tree drivers the ability to
subvert core aspects of page management. It, and anything derived from
it (e.g. hmm, pcip2p, etc...), is a deep integration point into the core
kernel, and an EXPORT_SYMBOL_GPL() interface.
Commit 31c5bda3a656 "mm: fix exports that inadvertently make put_page()
EXPORT_SYMBOL_GPL" was merged ahead of this series to relieve some of
the pressure from innocent consumers of put_page(), but now we need this
series to address *producers* of device pages.
More details and justification in the changelogs. The 0day
infrastructure has reported success across 152 configs and this survives
the libnvdimm unit test suite. Aside from the controversial bits the
diffstat is compelling at:
7 files changed, 126 insertions(+), 323 deletions(-)
Note that the series has some minor collisions with Alex's recent series
to improve devm_memremap_pages() scalability [2]. So, whichever you take
first the other will need a minor rebase.
[2]: https://www.lkml.org/lkml/2018/9/11/10
Dan Williams (7):
mm, devm_memremap_pages: Mark devm_memremap_pages() EXPORT_SYMBOL_GPL
mm, devm_memremap_pages: Kill mapping "System RAM" support
mm, devm_memremap_pages: Fix shutdown handling
mm, devm_memremap_pages: Add MEMORY_DEVICE_PRIVATE support
mm, hmm: Use devm semantics for hmm_devmem_{add,remove}
mm, hmm: Replace hmm_devmem_pages_create() with devm_memremap_pages()
mm, hmm: Mark hmm_devmem_{add,add_resource} EXPORT_SYMBOL_GPL
drivers/dax/pmem.c | 14 --
drivers/nvdimm/pmem.c | 13 +-
include/linux/hmm.h | 4
include/linux/memremap.h | 2
kernel/memremap.c | 94 +++++++----
mm/hmm.c | 305 +++++--------------------------------
tools/testing/nvdimm/test/iomap.c | 17 ++
7 files changed, 126 insertions(+), 323 deletions(-)
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v7 3/7] mm, devm_memremap_pages: Fix shutdown handling
2018-10-12 17:49 [PATCH v7 0/7] mm: Merge hmm into devm_memremap_pages, mark GPL-only Dan Williams
@ 2018-10-12 17:49 ` Dan Williams
2018-10-12 18:14 ` [PATCH v7 0/7] mm: Merge hmm into devm_memremap_pages, mark GPL-only Jerome Glisse
1 sibling, 0 replies; 4+ messages in thread
From: Dan Williams @ 2018-10-12 17:49 UTC (permalink / raw)
To: akpm
Cc: stable, Jérôme Glisse, Logan Gunthorpe, Logan Gunthorpe,
Christoph Hellwig, linux-mm, linux-kernel
The last step before devm_memremap_pages() returns success is to
allocate a release action, devm_memremap_pages_release(), to tear the
entire setup down. However, the result from devm_add_action() is not
checked.
Checking the error from devm_add_action() is not enough. The api
currently relies on the fact that the percpu_ref it is using is killed
by the time the devm_memremap_pages_release() is run. Rather than
continue this awkward situation, offload the responsibility of killing
the percpu_ref to devm_memremap_pages_release() directly. This allows
devm_memremap_pages() to do the right thing relative to init failures
and shutdown.
Without this change we could fail to register the teardown of
devm_memremap_pages(). The likelihood of hitting this failure is tiny as
small memory allocations almost always succeed. However, the impact of
the failure is large given any future reconfiguration, or
disable/enable, of an nvdimm namespace will fail forever as subsequent
calls to devm_memremap_pages() will fail to setup the pgmap_radix since
there will be stale entries for the physical address range.
An argument could be made to require that the ->kill() operation be set
in the @pgmap arg rather than passed in separately. However, it helps
code readability, tracking the lifetime of a given instance, to be able
to grep the kill routine directly at the devm_memremap_pages() call
site.
Cc: <stable@vger.kernel.org>
Fixes: e8d513483300 ("memremap: change devm_memremap_pages interface...")
Reviewed-by: "Jérôme Glisse" <jglisse@redhat.com>
Reported-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/dax/pmem.c | 14 +++-----------
drivers/nvdimm/pmem.c | 13 +++++--------
include/linux/memremap.h | 2 ++
kernel/memremap.c | 30 ++++++++++++++----------------
tools/testing/nvdimm/test/iomap.c | 15 ++++++++++++++-
5 files changed, 38 insertions(+), 36 deletions(-)
diff --git a/drivers/dax/pmem.c b/drivers/dax/pmem.c
index 99e2aace8078..2c1f459c0c63 100644
--- a/drivers/dax/pmem.c
+++ b/drivers/dax/pmem.c
@@ -48,9 +48,8 @@ static void dax_pmem_percpu_exit(void *data)
percpu_ref_exit(ref);
}
-static void dax_pmem_percpu_kill(void *data)
+static void dax_pmem_percpu_kill(struct percpu_ref *ref)
{
- struct percpu_ref *ref = data;
struct dax_pmem *dax_pmem = to_dax_pmem(ref);
dev_dbg(dax_pmem->dev, "trace\n");
@@ -112,17 +111,10 @@ static int dax_pmem_probe(struct device *dev)
}
dax_pmem->pgmap.ref = &dax_pmem->ref;
+ dax_pmem->pgmap.kill = dax_pmem_percpu_kill;
addr = devm_memremap_pages(dev, &dax_pmem->pgmap);
- if (IS_ERR(addr)) {
- devm_remove_action(dev, dax_pmem_percpu_exit, &dax_pmem->ref);
- percpu_ref_exit(&dax_pmem->ref);
+ if (IS_ERR(addr))
return PTR_ERR(addr);
- }
-
- rc = devm_add_action_or_reset(dev, dax_pmem_percpu_kill,
- &dax_pmem->ref);
- if (rc)
- return rc;
/* adjust the dax_region resource to the start of data */
memcpy(&res, &dax_pmem->pgmap.res, sizeof(res));
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index a75d10c23d80..c7548e712a16 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -309,8 +309,11 @@ static void pmem_release_queue(void *q)
blk_cleanup_queue(q);
}
-static void pmem_freeze_queue(void *q)
+static void pmem_freeze_queue(struct percpu_ref *ref)
{
+ struct request_queue *q;
+
+ q = container_of(ref, typeof(*q), q_usage_counter);
blk_freeze_queue_start(q);
}
@@ -402,6 +405,7 @@ static int pmem_attach_disk(struct device *dev,
pmem->pfn_flags = PFN_DEV;
pmem->pgmap.ref = &q->q_usage_counter;
+ pmem->pgmap.kill = pmem_freeze_queue;
if (is_nd_pfn(dev)) {
if (setup_pagemap_fsdax(dev, &pmem->pgmap))
return -ENOMEM;
@@ -425,13 +429,6 @@ static int pmem_attach_disk(struct device *dev,
addr = devm_memremap(dev, pmem->phys_addr,
pmem->size, ARCH_MEMREMAP_PMEM);
- /*
- * At release time the queue must be frozen before
- * devm_memremap_pages is unwound
- */
- if (devm_add_action_or_reset(dev, pmem_freeze_queue, q))
- return -ENOMEM;
-
if (IS_ERR(addr))
return PTR_ERR(addr);
pmem->virt_addr = addr;
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index f91f9e763557..a84572cdc438 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -106,6 +106,7 @@ typedef void (*dev_page_free_t)(struct page *page, void *data);
* @altmap: pre-allocated/reserved memory for vmemmap allocations
* @res: physical address range covered by @ref
* @ref: reference count that pins the devm_memremap_pages() mapping
+ * @kill: callback to transition @ref to the dead state
* @dev: host device of the mapping for debug
* @data: private data pointer for page_free()
* @type: memory type: see MEMORY_* in memory_hotplug.h
@@ -117,6 +118,7 @@ struct dev_pagemap {
bool altmap_valid;
struct resource res;
struct percpu_ref *ref;
+ void (*kill)(struct percpu_ref *ref);
struct device *dev;
void *data;
enum memory_type type;
diff --git a/kernel/memremap.c b/kernel/memremap.c
index 871d81bf0c69..ec7d8ca8db2c 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -88,14 +88,10 @@ static void devm_memremap_pages_release(void *data)
resource_size_t align_start, align_size;
unsigned long pfn;
+ pgmap->kill(pgmap->ref);
for_each_device_pfn(pfn, pgmap)
put_page(pfn_to_page(pfn));
- if (percpu_ref_tryget_live(pgmap->ref)) {
- dev_WARN(dev, "%s: page mapping is still live!\n", __func__);
- percpu_ref_put(pgmap->ref);
- }
-
/* pages are dead and unused, undo the arch mapping */
align_start = res->start & ~(SECTION_SIZE - 1);
align_size = ALIGN(res->start + resource_size(res), SECTION_SIZE)
@@ -116,7 +112,7 @@ static void devm_memremap_pages_release(void *data)
/**
* devm_memremap_pages - remap and provide memmap backing for the given resource
* @dev: hosting device for @res
- * @pgmap: pointer to a struct dev_pgmap
+ * @pgmap: pointer to a struct dev_pagemap
*
* Notes:
* 1/ At a minimum the res, ref and type members of @pgmap must be initialized
@@ -125,11 +121,8 @@ static void devm_memremap_pages_release(void *data)
* 2/ The altmap field may optionally be initialized, in which case altmap_valid
* must be set to true
*
- * 3/ pgmap.ref must be 'live' on entry and 'dead' before devm_memunmap_pages()
- * time (or devm release event). The expected order of events is that ref has
- * been through percpu_ref_kill() before devm_memremap_pages_release(). The
- * wait for the completion of all references being dropped and
- * percpu_ref_exit() must occur after devm_memremap_pages_release().
+ * 3/ pgmap->ref must be 'live' on entry and will be killed at
+ * devm_memremap_pages_release() time, or if this routine fails.
*
* 4/ res is expected to be a host memory range that could feasibly be
* treated as a "System RAM" range, i.e. not a device mmio range, but
@@ -145,6 +138,9 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
pgprot_t pgprot = PAGE_KERNEL;
int error, nid, is_ram;
+ if (!pgmap->ref || !pgmap->kill)
+ return ERR_PTR(-EINVAL);
+
align_start = res->start & ~(SECTION_SIZE - 1);
align_size = ALIGN(res->start + resource_size(res), SECTION_SIZE)
- align_start;
@@ -170,12 +166,10 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
if (is_ram != REGION_DISJOINT) {
WARN_ONCE(1, "%s attempted on %s region %pr\n", __func__,
is_ram == REGION_MIXED ? "mixed" : "ram", res);
- return ERR_PTR(-ENXIO);
+ error = -ENXIO;
+ goto err_array;
}
- if (!pgmap->ref)
- return ERR_PTR(-EINVAL);
-
pgmap->dev = dev;
error = xa_err(xa_store_range(&pgmap_array, PHYS_PFN(res->start),
@@ -216,7 +210,10 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
align_start >> PAGE_SHIFT,
align_size >> PAGE_SHIFT, pgmap);
- devm_add_action(dev, devm_memremap_pages_release, pgmap);
+ error = devm_add_action_or_reset(dev, devm_memremap_pages_release,
+ pgmap);
+ if (error)
+ return ERR_PTR(error);
return __va(res->start);
@@ -227,6 +224,7 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
err_pfn_remap:
pgmap_array_delete(res);
err_array:
+ pgmap->kill(pgmap->ref);
return ERR_PTR(error);
}
EXPORT_SYMBOL_GPL(devm_memremap_pages);
diff --git a/tools/testing/nvdimm/test/iomap.c b/tools/testing/nvdimm/test/iomap.c
index ed18a0cbc0c8..c6635fee27d8 100644
--- a/tools/testing/nvdimm/test/iomap.c
+++ b/tools/testing/nvdimm/test/iomap.c
@@ -104,13 +104,26 @@ void *__wrap_devm_memremap(struct device *dev, resource_size_t offset,
}
EXPORT_SYMBOL(__wrap_devm_memremap);
+static void nfit_test_kill(void *_pgmap)
+{
+ struct dev_pagemap *pgmap = _pgmap;
+
+ pgmap->kill(pgmap->ref);
+}
+
void *__wrap_devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
{
resource_size_t offset = pgmap->res.start;
struct nfit_test_resource *nfit_res = get_nfit_res(offset);
- if (nfit_res)
+ if (nfit_res) {
+ int rc;
+
+ rc = devm_add_action_or_reset(dev, nfit_test_kill, pgmap);
+ if (rc)
+ return ERR_PTR(rc);
return nfit_res->buf + offset - nfit_res->res.start;
+ }
return devm_memremap_pages(dev, pgmap);
}
EXPORT_SYMBOL_GPL(__wrap_devm_memremap_pages);
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v7 0/7] mm: Merge hmm into devm_memremap_pages, mark GPL-only
2018-10-12 17:49 [PATCH v7 0/7] mm: Merge hmm into devm_memremap_pages, mark GPL-only Dan Williams
2018-10-12 17:49 ` [PATCH v7 3/7] mm, devm_memremap_pages: Fix shutdown handling Dan Williams
@ 2018-10-12 18:14 ` Jerome Glisse
1 sibling, 0 replies; 4+ messages in thread
From: Jerome Glisse @ 2018-10-12 18:14 UTC (permalink / raw)
To: Dan Williams
Cc: akpm, stable, Balbir Singh, Logan Gunthorpe, Christoph Hellwig,
Michal Hocko, linux-mm, linux-kernel
On Fri, Oct 12, 2018 at 10:49:31AM -0700, Dan Williams wrote:
> [ apologies for the resend, script error ]
>
> Changes since v6 [1]:
> * Rebase on next-20181008 and fixup conflicts with the xarray conversion
> and hotplug optimizations
> * It has soaked on a 0day visible branch for a few days without any
> reports.
>
> [1]: https://lkml.org/lkml/2018/9/13/104
>
> ---
>
> Hi Andrew,
>
> J�r�me has reviewed the cleanups, thanks J�r�me. We still disagree on
> the EXPORT_SYMBOL_GPL status of the core HMM implementation, but Logan,
> Christoph and I continue to support marking all devm_memremap_pages()
> derivatives EXPORT_SYMBOL_GPL.
>
> HMM has been upstream for over a year, with no in-tree users it is clear
> it was designed first and foremost for out of tree drivers. It takes
> advantage of a facility Christoph and I spearheaded to support
> persistent memory. It continues to see expanding use cases with no clear
> end date when it will stop attracting features / revisions. It is not
> suitable to export devm_memremap_pages() as a stable 3rd party driver
> api.
>
> devm_memremap_pages() is a facility that can create struct page entries
> for any arbitrary range and give out-of-tree drivers the ability to
> subvert core aspects of page management. It, and anything derived from
> it (e.g. hmm, pcip2p, etc...), is a deep integration point into the core
> kernel, and an EXPORT_SYMBOL_GPL() interface.
>
> Commit 31c5bda3a656 "mm: fix exports that inadvertently make put_page()
> EXPORT_SYMBOL_GPL" was merged ahead of this series to relieve some of
> the pressure from innocent consumers of put_page(), but now we need this
> series to address *producers* of device pages.
>
> More details and justification in the changelogs. The 0day
> infrastructure has reported success across 152 configs and this survives
> the libnvdimm unit test suite. Aside from the controversial bits the
> diffstat is compelling at:
>
> 7 files changed, 126 insertions(+), 323 deletions(-)
>
> Note that the series has some minor collisions with Alex's recent series
> to improve devm_memremap_pages() scalability [2]. So, whichever you take
> first the other will need a minor rebase.
>
> [2]: https://www.lkml.org/lkml/2018/9/11/10
I am fine with this patchset going in (i reviewed it and tested it with
HMM on nouveau), Dan and Christoph did author the original code around
devm_memremap_pages() and thus ultimately they are the one who should
decide over GPL export or not.
Cheers,
J�r�me
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2018-10-12 18:14 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-10-12 17:49 [PATCH v7 0/7] mm: Merge hmm into devm_memremap_pages, mark GPL-only Dan Williams
2018-10-12 17:49 ` [PATCH v7 3/7] mm, devm_memremap_pages: Fix shutdown handling Dan Williams
2018-10-12 18:14 ` [PATCH v7 0/7] mm: Merge hmm into devm_memremap_pages, mark GPL-only Jerome Glisse
-- strict thread matches above, loose matches on Subject: below --
2018-10-12 17:47 Dan Williams
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).