From: Alistair Popple <apopple@nvidia.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-mm@kvack.org, Matthew Wilcox <willy@infradead.org>,
Jan Kara <jack@suse.cz>, "Darrick J. Wong" <djwong@kernel.org>,
Christoph Hellwig <hch@lst.de>,
John Hubbard <jhubbard@nvidia.com>,
Jason Gunthorpe <jgg@nvidia.com>,
david@fromorbit.com, nvdimm@lists.linux.dev,
akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v3 23/25] mm/memremap_pages: Initialize all ZONE_DEVICE pages to start at refcount 0
Date: Mon, 17 Oct 2022 18:04:12 +1100 [thread overview]
Message-ID: <8735bn0w2l.fsf@nvidia.com> (raw)
In-Reply-To: <166579195218.2236710.8731183545033177929.stgit@dwillia2-xfh.jf.intel.com>
The DEVICE_PRIVATE/COHERENT changes look good to me so feel free to add:
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Dan Williams <dan.j.williams@intel.com> writes:
> The initial memremap_pages() implementation inherited the
> __init_single_page() default of pages starting life with an elevated
> reference count. This originally allowed for the page->pgmap pointer to
> alias with the storage for page->lru since a page was only allowed to be
> on an lru list when its reference count was zero.
>
> Since then, 'struct page' definition cleanups have arranged for
> dedicated space for the ZONE_DEVICE page metadata, the
> MEMORY_DEVICE_{PRIVATE,COHERENT} work has arranged for the 1 -> 0
> page->_refcount transition to route the page to free_zone_device_page()
> and not the core-mm page-free, and MEMORY_DEVICE_{PRIVATE,COHERENT} now
> arranges for its ZONE_DEVICE pages to start at _refcount 0. With those
> cleanups in place and with filesystem-dax and device-dax now converted
> to take and drop references at map and truncate time, it is possible to
> start MEMORY_DEVICE_FS_DAX and MEMORY_DEVICE_GENERIC reference counts at
> 0 as well.
>
> This conversion also unifies all @pgmap accounting to be relative to
> pgmap_request_folio() and the paired folio_put() calls for those
> requested folios. This allows pgmap_release_folios() to be simplified to
> just a folio_put() helper.
>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Jan Kara <jack@suse.cz>
> Cc: "Darrick J. Wong" <djwong@kernel.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: John Hubbard <jhubbard@nvidia.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
> drivers/dax/mapping.c | 2 +-
> include/linux/dax.h | 2 +-
> include/linux/memremap.h | 6 ++----
> mm/memremap.c | 36 ++++++++++++++++--------------------
> mm/page_alloc.c | 9 +--------
> 5 files changed, 21 insertions(+), 34 deletions(-)
>
> diff --git a/drivers/dax/mapping.c b/drivers/dax/mapping.c
> index 07caaa23d476..ca06f2515644 100644
> --- a/drivers/dax/mapping.c
> +++ b/drivers/dax/mapping.c
> @@ -691,7 +691,7 @@ static struct page *dax_zap_pages(struct xa_state *xas, void *entry)
>
> dax_for_each_folio(entry, folio, i) {
> if (zap)
> - pgmap_release_folios(folio_pgmap(folio), folio, 1);
> + pgmap_release_folios(folio, 1);
> if (!ret && !dax_folio_idle(folio))
> ret = folio_page(folio, 0);
> }
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index f2fbb5746ffa..f4fc37933fc2 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -235,7 +235,7 @@ static inline void dax_unlock_mapping_entry(struct address_space *mapping,
> */
> static inline bool dax_page_idle(struct page *page)
> {
> - return page_ref_count(page) == 1;
> + return page_ref_count(page) == 0;
> }
>
> static inline bool dax_folio_idle(struct folio *folio)
> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> index 3fb3809d71f3..ddb196ae0696 100644
> --- a/include/linux/memremap.h
> +++ b/include/linux/memremap.h
> @@ -195,8 +195,7 @@ struct dev_pagemap *get_dev_pagemap(unsigned long pfn,
> struct dev_pagemap *pgmap);
> bool pgmap_request_folios(struct dev_pagemap *pgmap, struct folio *folio,
> int nr_folios);
> -void pgmap_release_folios(struct dev_pagemap *pgmap, struct folio *folio,
> - int nr_folios);
> +void pgmap_release_folios(struct folio *folio, int nr_folios);
> bool pgmap_pfn_valid(struct dev_pagemap *pgmap, unsigned long pfn);
>
> unsigned long vmem_altmap_offset(struct vmem_altmap *altmap);
> @@ -238,8 +237,7 @@ static inline bool pgmap_request_folios(struct dev_pagemap *pgmap,
> return false;
> }
>
> -static inline void pgmap_release_folios(struct dev_pagemap *pgmap,
> - struct folio *folio, int nr_folios)
> +static inline void pgmap_release_folios(struct folio *folio, int nr_folios)
> {
> }
>
> diff --git a/mm/memremap.c b/mm/memremap.c
> index c46e700f5245..368ff41c560b 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -469,8 +469,10 @@ EXPORT_SYMBOL_GPL(get_dev_pagemap);
>
> void free_zone_device_page(struct page *page)
> {
> - if (WARN_ON_ONCE(!page->pgmap->ops || !page->pgmap->ops->page_free))
> - return;
> + struct dev_pagemap *pgmap = page->pgmap;
> +
> + /* wake filesystem 'break dax layouts' waiters */
> + wake_up_var(page);
>
> mem_cgroup_uncharge(page_folio(page));
>
> @@ -505,17 +507,9 @@ void free_zone_device_page(struct page *page)
> * to clear page->mapping.
> */
> page->mapping = NULL;
> - page->pgmap->ops->page_free(page);
> -
> - if (page->pgmap->type != MEMORY_DEVICE_PRIVATE &&
> - page->pgmap->type != MEMORY_DEVICE_COHERENT)
> - /*
> - * Reset the page count to 1 to prepare for handing out the page
> - * again.
> - */
> - set_page_count(page, 1);
> - else
> - put_dev_pagemap(page->pgmap);
> + if (pgmap->ops && pgmap->ops->page_free)
> + pgmap->ops->page_free(page);
> + put_dev_pagemap(page->pgmap);
> }
>
> static bool folio_span_valid(struct dev_pagemap *pgmap, struct folio *folio,
> @@ -576,17 +570,19 @@ bool pgmap_request_folios(struct dev_pagemap *pgmap, struct folio *folio,
> }
> EXPORT_SYMBOL_GPL(pgmap_request_folios);
>
> -void pgmap_release_folios(struct dev_pagemap *pgmap, struct folio *folio, int nr_folios)
> +/*
> + * A symmetric helper to undo the page references acquired by
> + * pgmap_request_folios(), but the caller can also just arrange
> + * folio_put() on all the folios it acquired previously for the same
> + * effect.
> + */
> +void pgmap_release_folios(struct folio *folio, int nr_folios)
> {
> struct folio *iter;
> int i;
>
> - for (iter = folio, i = 0; i < nr_folios; iter = folio_next(iter), i++) {
> - if (!put_devmap_managed_page(&iter->page))
> - folio_put(iter);
> - if (!folio_ref_count(iter))
> - put_dev_pagemap(pgmap);
> - }
> + for (iter = folio, i = 0; i < nr_folios; iter = folio_next(folio), i++)
> + folio_put(iter);
> }
>
> #ifdef CONFIG_FS_DAX
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 8e9b7f08a32c..e35d1eb3308d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6787,6 +6787,7 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
> {
>
> __init_single_page(page, pfn, zone_idx, nid);
> + set_page_count(page, 0);
>
> /*
> * Mark page reserved as it will need to wait for onlining
> @@ -6819,14 +6820,6 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
> set_pageblock_migratetype(page, MIGRATE_MOVABLE);
> cond_resched();
> }
> -
> - /*
> - * ZONE_DEVICE pages are released directly to the driver page allocator
> - * which will set the page count to 1 when allocating the page.
> - */
> - if (pgmap->type == MEMORY_DEVICE_PRIVATE ||
> - pgmap->type == MEMORY_DEVICE_COHERENT)
> - set_page_count(page, 0);
> }
>
> /*
next prev parent reply other threads:[~2022-10-17 7:05 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-14 23:56 [PATCH v3 00/25] Fix the DAX-gup mistake Dan Williams
2022-10-14 23:57 ` [PATCH v3 01/25] fsdax: Wait on @page not @page->_refcount Dan Williams
2022-10-14 23:57 ` [PATCH v3 02/25] fsdax: Use dax_page_idle() to document DAX busy page checking Dan Williams
2022-10-14 23:57 ` [PATCH v3 03/25] fsdax: Include unmapped inodes for page-idle detection Dan Williams
2022-10-14 23:57 ` [PATCH v3 04/25] fsdax: Introduce dax_zap_mappings() Dan Williams
2022-11-02 13:04 ` Aneesh Kumar K.V
2022-10-14 23:57 ` [PATCH v3 05/25] fsdax: Wait for pinned pages during truncate_inode_pages_final() Dan Williams
2022-10-14 23:57 ` [PATCH v3 06/25] fsdax: Validate DAX layouts broken before truncate Dan Williams
2022-10-14 23:57 ` [PATCH v3 07/25] fsdax: Hold dax lock over mapping insertion Dan Williams
2022-10-17 19:31 ` Jason Gunthorpe
2022-10-17 20:17 ` Dan Williams
2022-10-18 5:26 ` Christoph Hellwig
2022-10-18 17:30 ` Dan Williams
2022-10-14 23:57 ` [PATCH v3 08/25] fsdax: Update dax_insert_entry() calling convention to return an error Dan Williams
2022-10-14 23:57 ` [PATCH v3 09/25] fsdax: Rework for_each_mapped_pfn() to dax_for_each_folio() Dan Williams
2022-10-14 23:57 ` [PATCH v3 10/25] fsdax: Introduce pgmap_request_folios() Dan Williams
2022-10-17 6:31 ` Alistair Popple
2022-10-17 20:06 ` Dan Williams
2022-10-17 20:11 ` Jason Gunthorpe
2022-10-17 20:51 ` Dan Williams
2022-10-17 23:57 ` Jason Gunthorpe
2022-10-18 0:19 ` Dan Williams
2022-10-17 19:41 ` Jason Gunthorpe
2022-10-14 23:58 ` [PATCH v3 11/25] fsdax: Rework dax_insert_entry() calling convention Dan Williams
2022-10-14 23:58 ` [PATCH v3 12/25] fsdax: Cleanup dax_associate_entry() Dan Williams
2022-10-14 23:58 ` [PATCH v3 13/25] devdax: Minor warning fixups Dan Williams
2022-10-14 23:58 ` [PATCH v3 14/25] devdax: Fix sparse lock imbalance warning Dan Williams
2022-10-14 23:58 ` [PATCH v3 15/25] libnvdimm/pmem: Support pmem block devices without dax Dan Williams
2022-10-14 23:58 ` [PATCH v3 16/25] devdax: Move address_space helpers to the DAX core Dan Williams
2022-10-14 23:58 ` [PATCH v3 17/25] devdax: Sparse fixes for xarray locking Dan Williams
2022-10-14 23:58 ` [PATCH v3 18/25] devdax: Sparse fixes for vmfault_t / dax-entry conversions Dan Williams
2022-10-14 23:58 ` [PATCH v3 19/25] devdax: Sparse fixes for vm_fault_t in tracepoints Dan Williams
2022-10-14 23:58 ` [PATCH v3 20/25] devdax: add PUD support to the DAX mapping infrastructure Dan Williams
2022-10-14 23:59 ` [PATCH v3 21/25] devdax: Use dax_insert_entry() + dax_delete_mapping_entry() Dan Williams
2022-10-14 23:59 ` [PATCH v3 22/25] mm/memremap_pages: Replace zone_device_page_init() with pgmap_request_folios() Dan Williams
2022-10-17 19:17 ` Lyude Paul
2022-10-14 23:59 ` [PATCH v3 23/25] mm/memremap_pages: Initialize all ZONE_DEVICE pages to start at refcount 0 Dan Williams
2022-10-17 7:04 ` Alistair Popple [this message]
2022-10-17 19:48 ` Jason Gunthorpe
2022-10-14 23:59 ` [PATCH v3 24/25] mm/meremap_pages: Delete put_devmap_managed_page_refs() Dan Williams
2022-10-17 7:08 ` Alistair Popple
2022-10-14 23:59 ` [PATCH v3 25/25] mm/gup: Drop DAX pgmap accounting Dan Williams
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8735bn0w2l.fsf@nvidia.com \
--to=apopple@nvidia.com \
--cc=akpm@linux-foundation.org \
--cc=dan.j.williams@intel.com \
--cc=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=hch@lst.de \
--cc=jack@suse.cz \
--cc=jgg@nvidia.com \
--cc=jhubbard@nvidia.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nvdimm@lists.linux.dev \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.