* [RFC PATCH 0/3] expose mapping wrprotect, fix fb_defio use
@ 2025-01-08 16:18 Lorenzo Stoakes
2025-01-08 16:18 ` [RFC PATCH 1/3] mm: refactor rmap_walk_file() to separate out traversal logic Lorenzo Stoakes
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-01-08 16:18 UTC (permalink / raw)
To: Andrew Morton
Cc: Jaya Kumar, Simona Vetter, Helge Deller, linux-fbdev, dri-devel,
linux-kernel, linux-mm, Matthew Wilcox, David Hildenbrand
Right now the only means by which we can write-protect a range using the
reverse mapping is via folio_mkclean().
However, this is not always the appropriate means of doing so, specifically
in the case of the framebuffer deferred I/O logic (fb_defio, enabled by
CONFIG_FB_DEFERRED_IO). There, kernel pages are mapped read-only and
write-protect faults are used to batch up I/O operations.
Each time the deferred work is done, folio_mkclean() is used to mark the
framebuffer page as having had I/O performed on it. However doing so
requires the kernel page (perhaps allocated via vmalloc()) to have its
page->mapping, index fields set so the rmap can find everything that maps
it in order to write-protect.
This is problematic: firstly, these fields should not be set for
kernel-allocated memory, and secondly, these are not folios (it's not user
memory), and the page->index, mapping fields are now deprecated and soon to
be removed.
The implementers cannot be blamed for having used this, however, as there
was simply no other way of performing this operation correctly.
This series fixes this - we provide the rmap_wrprotect_page() function to
allow the reverse mapping to be used to look up mappings from the page
cache object (i.e. its address_space pointer) at a specific offset.
The fb_defio logic already stores this offset, and can simply be expanded
to keep track of the page cache object, so the change then becomes
straightforward.
This series should have no functional change.
*** REVIEWERS NOTES: ***
I do not have any hardware that uses fb_defio, so I'm asking for help with
testing this series from those who do :) I have tested the mm side of this,
and done a quick compile smoke test of the fb_defio side but this _very
much_ requires testing on actual hardware to ensure everything behaves as
expected.
This is based on Andrew's tree [0] in the mm-unstable branch - I was
thinking it'd be best to go through the mm tree (with fb_defio maintainer
approval, of course!) as it relies upon the mm changes to work correctly.
[0]: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/
Lorenzo Stoakes (3):
mm: refactor rmap_walk_file() to separate out traversal logic
mm: provide rmap_wrprotect_file_page() function
fb_defio: do not use deprecated page->mapping, index fields
drivers/video/fbdev/core/fb_defio.c | 34 +++----
include/linux/fb.h | 1 +
include/linux/rmap.h | 20 +++++
mm/rmap.c | 135 ++++++++++++++++++++++------
4 files changed, 141 insertions(+), 49 deletions(-)
--
2.47.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* [RFC PATCH 1/3] mm: refactor rmap_walk_file() to separate out traversal logic
2025-01-08 16:18 [RFC PATCH 0/3] expose mapping wrprotect, fix fb_defio use Lorenzo Stoakes
@ 2025-01-08 16:18 ` Lorenzo Stoakes
2025-01-08 16:38 ` Matthew Wilcox
2025-01-08 16:18 ` [RFC PATCH 2/3] mm: provide rmap_wrprotect_file_page() function Lorenzo Stoakes
2025-01-08 16:18 ` [RFC PATCH 3/3] fb_defio: do not use deprecated page->mapping, index fields Lorenzo Stoakes
2 siblings, 1 reply; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-01-08 16:18 UTC (permalink / raw)
To: Andrew Morton
Cc: Jaya Kumar, Simona Vetter, Helge Deller, linux-fbdev, dri-devel,
linux-kernel, linux-mm, Matthew Wilcox, David Hildenbrand
In order to permit the traversal of the reverse mapping at a specified
mapping and offset rather than those specified by an input folio, we need
to separate out the portion of the rmap file logic which deals with this
traversal from those parts of the logic which interact with the folio.
This patch achieves this by adding a new static __rmap_walk_file() function
which rmap_walk_file() invokes.
This function permits a NULL folio to be passed, on the assumption that the
caller has accounted for this correctly in the callbacks specified in the
rmap_walk_control object.
Though it provides for this, and adds debug asserts to ensure that, should
a folio be specified, the mapping, offset and size parameters match those
of the folio, there should be no functional change as a result of this
patch.
The reason for adding this is to enable future changes permitting users to
traverse mappings of userland-mapped kernel memory, write-protecting those
mappings so that page_mkwrite() or pfn_mkwrite() fault handlers are
retriggered on a subsequent write.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
mm/rmap.c | 81 +++++++++++++++++++++++++++++++++++++------------------
1 file changed, 55 insertions(+), 26 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index 227c60e38261..effafdb44365 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2710,35 +2710,37 @@ static void rmap_walk_anon(struct folio *folio,
anon_vma_unlock_read(anon_vma);
}
-/*
- * rmap_walk_file - do something to file page using the object-based rmap method
- * @folio: the folio to be handled
- * @rwc: control variable according to each walk type
- * @locked: caller holds relevant rmap lock
+/**
+ * __rmap_walk_file() - Traverse the reverse mapping for a file-backed mapping
+ * of a page mapped within a specified page cache object at a specified offset.
*
- * Find all the mappings of a folio using the mapping pointer and the vma chains
- * contained in the address_space struct it points to.
+ * @folio: Either the folio whose mappings to traverse, or if NULL,
+ * the callbacks specified in @rwc will be configured such
+ * as to be able to look up mappings correctly.
+ * @mapping: The page cache object whose mapping VMAs we intend to
+ * traverse. If @folio is non-NULL, this should be equal to
+ * folio_mapping(folio).
+ * @pgoff_start: The offset within @mapping of the page which we are
+ * looking up. If @folio is non-NULL, this should be equal
+ * to folio_pgoff(folio).
+ * @nr_pages: The number of pages mapped by the mapping. If @folio is
+ * non-NULL, this should be equal to folio_nr_pages(folio).
+ * @rwc: The reverse mapping walk control object describing how
+ * the traversal should proceed.
+ * @locked: Is the @mapping already locked? If not, we acquire the
+ * lock.
*/
-static void rmap_walk_file(struct folio *folio,
- struct rmap_walk_control *rwc, bool locked)
+static void __rmap_walk_file(struct folio *folio, struct address_space *mapping,
+ pgoff_t pgoff_start, unsigned long nr_pages,
+ struct rmap_walk_control *rwc, bool locked)
{
- struct address_space *mapping = folio_mapping(folio);
- pgoff_t pgoff_start, pgoff_end;
+ pgoff_t pgoff_end = pgoff_start + nr_pages - 1;
struct vm_area_struct *vma;
- /*
- * The page lock not only makes sure that page->mapping cannot
- * suddenly be NULLified by truncation, it makes sure that the
- * structure at mapping cannot be freed and reused yet,
- * so we can safely take mapping->i_mmap_rwsem.
- */
- VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
+ VM_WARN_ON_FOLIO(folio && mapping != folio_mapping(folio), folio);
+ VM_WARN_ON_FOLIO(folio && pgoff_start != folio_pgoff(folio), folio);
+ VM_WARN_ON_FOLIO(folio && nr_pages != folio_nr_pages(folio), folio);
- if (!mapping)
- return;
-
- pgoff_start = folio_pgoff(folio);
- pgoff_end = pgoff_start + folio_nr_pages(folio) - 1;
if (!locked) {
if (i_mmap_trylock_read(mapping))
goto lookup;
@@ -2753,8 +2755,7 @@ static void rmap_walk_file(struct folio *folio,
lookup:
vma_interval_tree_foreach(vma, &mapping->i_mmap,
pgoff_start, pgoff_end) {
- unsigned long address = vma_address(vma, pgoff_start,
- folio_nr_pages(folio));
+ unsigned long address = vma_address(vma, pgoff_start, nr_pages);
VM_BUG_ON_VMA(address == -EFAULT, vma);
cond_resched();
@@ -2767,12 +2768,40 @@ static void rmap_walk_file(struct folio *folio,
if (rwc->done && rwc->done(folio))
goto done;
}
-
done:
if (!locked)
i_mmap_unlock_read(mapping);
}
+/*
+ * rmap_walk_file - do something to file page using the object-based rmap method
+ * @folio: the folio to be handled
+ * @rwc: control variable according to each walk type
+ * @locked: caller holds relevant rmap lock
+ *
+ * Find all the mappings of a folio using the mapping pointer and the vma chains
+ * contained in the address_space struct it points to.
+ */
+static void rmap_walk_file(struct folio *folio,
+ struct rmap_walk_control *rwc, bool locked)
+{
+ struct address_space *mapping = folio_mapping(folio);
+
+ /*
+ * The page lock not only makes sure that page->mapping cannot
+ * suddenly be NULLified by truncation, it makes sure that the
+ * structure at mapping cannot be freed and reused yet,
+ * so we can safely take mapping->i_mmap_rwsem.
+ */
+ VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
+
+ if (!mapping)
+ return;
+
+ __rmap_walk_file(folio, mapping, folio_pgoff(folio),
+ folio_nr_pages(folio), rwc, locked);
+}
+
void rmap_walk(struct folio *folio, struct rmap_walk_control *rwc)
{
if (unlikely(folio_test_ksm(folio)))
--
2.47.1
* [RFC PATCH 2/3] mm: provide rmap_wrprotect_file_page() function
2025-01-08 16:18 [RFC PATCH 0/3] expose mapping wrprotect, fix fb_defio use Lorenzo Stoakes
2025-01-08 16:18 ` [RFC PATCH 1/3] mm: refactor rmap_walk_file() to separate out traversal logic Lorenzo Stoakes
@ 2025-01-08 16:18 ` Lorenzo Stoakes
2025-01-08 17:25 ` Matthew Wilcox
2025-01-08 16:18 ` [RFC PATCH 3/3] fb_defio: do not use deprecated page->mapping, index fields Lorenzo Stoakes
2 siblings, 1 reply; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-01-08 16:18 UTC (permalink / raw)
To: Andrew Morton
Cc: Jaya Kumar, Simona Vetter, Helge Deller, linux-fbdev, dri-devel,
linux-kernel, linux-mm, Matthew Wilcox, David Hildenbrand
In the fb_defio video driver, page dirty state is used to determine when
framebuffer pages have been changed, allowing for batched, deferred I/O to
be performed for efficiency.
This implementation had only one means of doing so effectively - the use of
the folio_mkclean() function.
However, this use of the function is inappropriate, as the fb_defio
implementation allocates kernel memory to back the framebuffer, and is then
forced to set page->index, mapping fields in order to permit the
folio_mkclean() rmap traversal to proceed correctly.
It is not correct to set these fields on kernel-allocated memory; moreover,
since these are not folios, the page->index, mapping fields are deprecated
and soon to be removed.
We therefore need to provide a means by which we can correctly traverse the
reverse mapping and write-protect mappings for a page backing an
address_space page cache object at a given offset.
This patch provides this - rmap_wrprotect_file_page() allows for this
operation to be performed for a specified address_space, offset and PFN,
without requiring a folio nor, of course, an inappropriate use of
page->index, mapping.
With this provided, we can subsequently adjust the fb_defio implementation
to make use of this function and avoid incorrect invocation of
folio_mkclean() and, more importantly, incorrect manipulation of
page->index, mapping fields.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
include/linux/rmap.h | 20 ++++++++++++++++
mm/rmap.c | 54 ++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 74 insertions(+)
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 4509a43fe59f..9d80b09e58ae 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -754,6 +754,26 @@ unsigned long page_address_in_vma(const struct folio *folio,
*/
int folio_mkclean(struct folio *);
+/**
+ * rmap_wrprotect_file_page() - Traverses the reverse mapping, finding all VMAs
+ * which contain a shared mapping of the single page at PFN @pfn in @mapping at
+ * offset @pgoff and write-protecting the mappings.
+ *
+ * The PFN mapped does not have to be a folio, but rather can be a kernel
+ * allocation that is mapped into userland. We therefore do not require that the
+ * PFN maps to a folio with a valid mapping or index field, rather these are
+ * specified in @mapping and @pgoff.
+ *
+ * @mapping: The mapping whose reverse mapping should be traversed.
+ * @pgoff: The page offset at which @pfn is mapped within @mapping.
+ * @nr_pages: The number of physically contiguous base pages spanned.
+ * @pfn: The PFN of the memory mapped in @mapping at @pgoff.
+ *
+ * Return the number of write-protected PTEs, or an error.
+ */
+int rmap_wrprotect_file_page(struct address_space *mapping, pgoff_t pgoff,
+ unsigned long nr_pages, unsigned long pfn);
+
int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
struct vm_area_struct *vma);
diff --git a/mm/rmap.c b/mm/rmap.c
index effafdb44365..46474343116c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1127,6 +1127,60 @@ int folio_mkclean(struct folio *folio)
}
EXPORT_SYMBOL_GPL(folio_mkclean);
+struct wrprotect_file_state {
+ int cleaned;
+ pgoff_t pgoff;
+ unsigned long pfn;
+ unsigned long nr_pages;
+};
+
+static bool rmap_wrprotect_file_one(struct folio *folio, struct vm_area_struct *vma,
+ unsigned long address, void *arg)
+{
+ struct wrprotect_file_state *state = (struct wrprotect_file_state *)arg;
+ struct page_vma_mapped_walk pvmw = {
+ .pfn = state->pfn,
+ .nr_pages = state->nr_pages,
+ .pgoff = state->pgoff,
+ .vma = vma,
+ .address = address,
+ .flags = PVMW_SYNC,
+ };
+
+ state->cleaned += page_vma_mkclean_one(&pvmw);
+
+ return true;
+}
+
+static void __rmap_walk_file(struct folio *folio, struct address_space *mapping,
+ pgoff_t pgoff_start, unsigned long nr_pages,
+ struct rmap_walk_control *rwc, bool locked);
+
+int rmap_wrprotect_file_page(struct address_space *mapping, pgoff_t pgoff,
+ unsigned long nr_pages, unsigned long pfn)
+{
+ struct wrprotect_file_state state = {
+ .cleaned = 0,
+ .pgoff = pgoff,
+ .pfn = pfn,
+ .nr_pages = nr_pages,
+ };
+ struct rmap_walk_control rwc = {
+ .arg = (void *)&state,
+ .rmap_one = rmap_wrprotect_file_one,
+ .invalid_vma = invalid_mkclean_vma,
+ };
+
+ if (!mapping)
+ return 0;
+
+ __rmap_walk_file(/* folio = */NULL, mapping, pgoff, nr_pages, &rwc,
+ /* locked = */false);
+
+ return state.cleaned;
+}
+EXPORT_SYMBOL_GPL(rmap_wrprotect_file_page);
+
/**
* pfn_mkclean_range - Cleans the PTEs (including PMDs) mapped with range of
* [@pfn, @pfn + @nr_pages) at the specific offset (@pgoff)
--
2.47.1
* [RFC PATCH 3/3] fb_defio: do not use deprecated page->mapping, index fields
2025-01-08 16:18 [RFC PATCH 0/3] expose mapping wrprotect, fix fb_defio use Lorenzo Stoakes
2025-01-08 16:18 ` [RFC PATCH 1/3] mm: refactor rmap_walk_file() to separate out traversal logic Lorenzo Stoakes
2025-01-08 16:18 ` [RFC PATCH 2/3] mm: provide rmap_wrprotect_file_page() function Lorenzo Stoakes
@ 2025-01-08 16:18 ` Lorenzo Stoakes
2025-01-08 17:32 ` Matthew Wilcox
2 siblings, 1 reply; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-01-08 16:18 UTC (permalink / raw)
To: Andrew Morton
Cc: Jaya Kumar, Simona Vetter, Helge Deller, linux-fbdev, dri-devel,
linux-kernel, linux-mm, Matthew Wilcox, David Hildenbrand
With the introduction of rmap_wrprotect_file_page() there is no need to use
folio_mkclean() in order to write-protect mappings of frame buffer pages,
and therefore no need to inappropriately set kernel-allocated page->index,
mapping fields to permit this operation.
Instead, store the pointer to the page cache object for the mapped driver
in the fb_deferred_io object, and use the already stored page offset from
the pageref object to look up mappings in order to write-protect them.
This is justified: for the page objects to store a mapping pointer at the
point of assignment of pages, they must all reference the same underlying
address_space object. Since the lifetime of the pagerefs is also the
lifetime of the fb_deferred_io object, storing the pointer here makes
sense.
This eliminates the need for all of the logic around setting and
maintaining page->index, mapping fields, which we remove.
This eliminates the use of folio_mkclean() entirely but otherwise should
have no functional change.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
drivers/video/fbdev/core/fb_defio.c | 34 ++++++++++-------------------
include/linux/fb.h | 1 +
2 files changed, 12 insertions(+), 23 deletions(-)
diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
index 65363df8e81b..186ff902da5f 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -69,14 +69,6 @@ static struct fb_deferred_io_pageref *fb_deferred_io_pageref_lookup(struct fb_in
return pageref;
}
-static void fb_deferred_io_pageref_clear(struct fb_deferred_io_pageref *pageref)
-{
- struct page *page = pageref->page;
-
- if (page)
- page->mapping = NULL;
-}
-
static struct fb_deferred_io_pageref *fb_deferred_io_pageref_get(struct fb_info *info,
unsigned long offset,
struct page *page)
@@ -140,13 +132,10 @@ static vm_fault_t fb_deferred_io_fault(struct vm_fault *vmf)
if (!page)
return VM_FAULT_SIGBUS;
- if (vmf->vma->vm_file)
- page->mapping = vmf->vma->vm_file->f_mapping;
- else
+ if (!vmf->vma->vm_file)
printk(KERN_ERR "no mapping available\n");
- BUG_ON(!page->mapping);
- page->index = vmf->pgoff; /* for folio_mkclean() */
+ BUG_ON(!info->fbdefio->mapping);
vmf->page = page;
return 0;
@@ -194,9 +183,9 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long
/*
* We want the page to remain locked from ->page_mkwrite until
- * the PTE is marked dirty to avoid folio_mkclean() being called
- * before the PTE is updated, which would leave the page ignored
- * by defio.
+ * the PTE is marked dirty to avoid rmap_wrprotect_file_page()
+ * being called before the PTE is updated, which would leave
+ * the page ignored by defio.
* Do this by locking the page here and informing the caller
* about it with VM_FAULT_LOCKED.
*/
@@ -280,7 +269,10 @@ static void fb_deferred_io_work(struct work_struct *work)
struct folio *folio = page_folio(pageref->page);
folio_lock(folio);
- folio_mkclean(folio);
+ rmap_wrprotect_file_page(fbdefio->mapping,
+ pageref->offset >> PAGE_SHIFT,
+ compound_nr(pageref->page),
+ page_to_pfn(pageref->page));
folio_unlock(folio);
}
@@ -337,6 +329,7 @@ void fb_deferred_io_open(struct fb_info *info,
{
struct fb_deferred_io *fbdefio = info->fbdefio;
+ fbdefio->mapping = file->f_mapping;
file->f_mapping->a_ops = &fb_deferred_io_aops;
fbdefio->open_count++;
}
@@ -344,13 +337,7 @@ EXPORT_SYMBOL_GPL(fb_deferred_io_open);
static void fb_deferred_io_lastclose(struct fb_info *info)
{
- unsigned long i;
-
flush_delayed_work(&info->deferred_work);
-
- /* clear out the mapping that we setup */
- for (i = 0; i < info->npagerefs; ++i)
- fb_deferred_io_pageref_clear(&info->pagerefs[i]);
}
void fb_deferred_io_release(struct fb_info *info)
@@ -370,5 +357,6 @@ void fb_deferred_io_cleanup(struct fb_info *info)
kvfree(info->pagerefs);
mutex_destroy(&fbdefio->lock);
+ fbdefio->mapping = NULL;
}
EXPORT_SYMBOL_GPL(fb_deferred_io_cleanup);
diff --git a/include/linux/fb.h b/include/linux/fb.h
index 5ba187e08cf7..cd653862ab99 100644
--- a/include/linux/fb.h
+++ b/include/linux/fb.h
@@ -225,6 +225,7 @@ struct fb_deferred_io {
int open_count; /* number of opened files; protected by fb_info lock */
struct mutex lock; /* mutex that protects the pageref list */
struct list_head pagereflist; /* list of pagerefs for touched pages */
+ struct address_space *mapping; /* page cache object for fb device */
/* callback */
struct page *(*get_page)(struct fb_info *info, unsigned long offset);
void (*deferred_io)(struct fb_info *info, struct list_head *pagelist);
--
2.47.1
* Re: [RFC PATCH 1/3] mm: refactor rmap_walk_file() to separate out traversal logic
2025-01-08 16:18 ` [RFC PATCH 1/3] mm: refactor rmap_walk_file() to separate out traversal logic Lorenzo Stoakes
@ 2025-01-08 16:38 ` Matthew Wilcox
2025-01-08 19:23 ` Lorenzo Stoakes
0 siblings, 1 reply; 18+ messages in thread
From: Matthew Wilcox @ 2025-01-08 16:38 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Jaya Kumar, Simona Vetter, Helge Deller,
linux-fbdev, dri-devel, linux-kernel, linux-mm, David Hildenbrand
On Wed, Jan 08, 2025 at 04:18:40PM +0000, Lorenzo Stoakes wrote:
> +/*
> + * rmap_walk_file - do something to file page using the object-based rmap method
> + * @folio: the folio to be handled
> + * @rwc: control variable according to each walk type
> + * @locked: caller holds relevant rmap lock
> + *
> + * Find all the mappings of a folio using the mapping pointer and the vma chains
> + * contained in the address_space struct it points to.
> + */
> +static void rmap_walk_file(struct folio *folio,
> + struct rmap_walk_control *rwc, bool locked)
> +{
> + struct address_space *mapping = folio_mapping(folio);
I'm unconvinced this shouldn't be just folio->mapping. On the face of
it, we're saying that we're walking a file, and file folios just want
to use folio->mapping. But let's dig a little deeper.
The folio passed in is locked, so it can't be changed during this call.
In folio_mapping(), folio_test_slab() is guaranteed untrue.
folio_test_swapcache() doesn't seem likely to be true either; unless
it's shmem, it can't be in the swapcache, and if it's shmem and in the
swap cache, it can't be mapped to userspace (they're swizzled back from
the swapcache to the pagecache before being mapped). And then the
check for PAGE_MAPPING_FLAGS is guaranteed to be untrue (we know it's
not anon/ksm/movable). So I think this should just be folio->mapping.
> + /*
> + * The page lock not only makes sure that page->mapping cannot
> + * suddenly be NULLified by truncation, it makes sure that the
> + * structure at mapping cannot be freed and reused yet,
> + * so we can safely take mapping->i_mmap_rwsem.
> + */
I know you only moved this comment, but please fix it to refer to
folios, not pages.
> + VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
> +
> + if (!mapping)
> + return;
Maybe make this a WARN_ON_ONCE?
> + __rmap_walk_file(folio, mapping, folio_pgoff(folio),
> + folio_nr_pages(folio), rwc, locked);
folio_pgoff() can go too. Just use folio->index.
* Re: [RFC PATCH 2/3] mm: provide rmap_wrprotect_file_page() function
2025-01-08 16:18 ` [RFC PATCH 2/3] mm: provide rmap_wrprotect_file_page() function Lorenzo Stoakes
@ 2025-01-08 17:25 ` Matthew Wilcox
2025-01-08 19:35 ` Lorenzo Stoakes
0 siblings, 1 reply; 18+ messages in thread
From: Matthew Wilcox @ 2025-01-08 17:25 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Jaya Kumar, Simona Vetter, Helge Deller,
linux-fbdev, dri-devel, linux-kernel, linux-mm, David Hildenbrand
On Wed, Jan 08, 2025 at 04:18:41PM +0000, Lorenzo Stoakes wrote:
> +++ b/include/linux/rmap.h
> @@ -754,6 +754,26 @@ unsigned long page_address_in_vma(const struct folio *folio,
> */
> int folio_mkclean(struct folio *);
>
> +/**
The kerneldoc comment should be with the implementation, not the
prototype.
> + * rmap_wrprotect_file_page() - Traverses the reverse mapping, finding all VMAs
> + * which contain a shared mapping of the single page at PFN @pfn in @mapping at
> + * offset @pgoff and write-protecting the mappings.
After the '-' should come a _short_ description ... maybe "Write protect
all mappings of this page".
> + * The PFN mapped does not have to be a folio, but rather can be a kernel
> + * allocation that is mapped into userland. We therefore do not require that the
> + * PFN maps to a folio with a valid mapping or index field, rather these are
> + * specified in @mapping and @pgoff.
> + *
> + * @mapping: The mapping whose reverse mapping should be traversed.
> + * @pgoff: The page offset at which @pfn is mapped within @mapping.
> + * @nr_pages: The number of physically contiguous base pages spanned.
> + * @pfn: The PFN of the memory mapped in @mapping at @pgoff.
The description of the params comes between the short and full
description of the function.
> + * Return the number of write-protected PTEs, or an error.
colon after Return: so it becomes a section.
> +int rmap_wrprotect_file_page(struct address_space *mapping, pgoff_t pgoff,
> + unsigned long nr_pages, unsigned long pfn)
> +{
> + struct wrprotect_file_state state = {
> + .cleaned = 0,
> + .pgoff = pgoff,
> + .pfn = pfn,
> + .nr_pages = nr_pages,
> + };
> + struct rmap_walk_control rwc = {
> + .arg = (void *)&state,
> + .rmap_one = rmap_wrprotect_file_one,
> + .invalid_vma = invalid_mkclean_vma,
> + };
> +
> + if (!mapping)
> + return 0;
Should it be valid to pass in NULL?
* Re: [RFC PATCH 3/3] fb_defio: do not use deprecated page->mapping, index fields
2025-01-08 16:18 ` [RFC PATCH 3/3] fb_defio: do not use deprecated page->mapping, index fields Lorenzo Stoakes
@ 2025-01-08 17:32 ` Matthew Wilcox
2025-01-08 19:41 ` Lorenzo Stoakes
2025-01-08 20:14 ` David Hildenbrand
0 siblings, 2 replies; 18+ messages in thread
From: Matthew Wilcox @ 2025-01-08 17:32 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Jaya Kumar, Simona Vetter, Helge Deller,
linux-fbdev, dri-devel, linux-kernel, linux-mm, David Hildenbrand
On Wed, Jan 08, 2025 at 04:18:42PM +0000, Lorenzo Stoakes wrote:
> @@ -280,7 +269,10 @@ static void fb_deferred_io_work(struct work_struct *work)
> struct folio *folio = page_folio(pageref->page);
>
> folio_lock(folio);
> - folio_mkclean(folio);
> + rmap_wrprotect_file_page(fbdefio->mapping,
> + pageref->offset >> PAGE_SHIFT,
> + compound_nr(pageref->page),
> + page_to_pfn(pageref->page));
> folio_unlock(folio);
Why do we need to lock the folio? (since this isn't necessarily a
folio) Also, do we need compound_nr() here? I _think_ for defio,
the number of pages allocated per object are fixed, so this should be
an fbdefio->nr_pages field?
(something that's always troubled me about compound_nr() is that it
returns 1 for tail pages and the number you actually expect for head
pages)
* Re: [RFC PATCH 1/3] mm: refactor rmap_walk_file() to separate out traversal logic
2025-01-08 16:38 ` Matthew Wilcox
@ 2025-01-08 19:23 ` Lorenzo Stoakes
0 siblings, 0 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-01-08 19:23 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Andrew Morton, Jaya Kumar, Simona Vetter, Helge Deller,
linux-fbdev, dri-devel, linux-kernel, linux-mm, David Hildenbrand
On Wed, Jan 08, 2025 at 04:38:57PM +0000, Matthew Wilcox wrote:
> On Wed, Jan 08, 2025 at 04:18:40PM +0000, Lorenzo Stoakes wrote:
> > +/*
> > + * rmap_walk_file - do something to file page using the object-based rmap method
> > + * @folio: the folio to be handled
> > + * @rwc: control variable according to each walk type
> > + * @locked: caller holds relevant rmap lock
> > + *
> > + * Find all the mappings of a folio using the mapping pointer and the vma chains
> > + * contained in the address_space struct it points to.
> > + */
> > +static void rmap_walk_file(struct folio *folio,
> > + struct rmap_walk_control *rwc, bool locked)
> > +{
> > + struct address_space *mapping = folio_mapping(folio);
>
> I'm unconvinced this shouldn't be just folio->mapping. On the face of
> it, we're saying that we're walking a file, and file folios just want
> to use folio->mapping. But let's dig a little deeper.
>
> The folio passed in is locked, so it can't be changed during this call.
> In folio_mapping(), folio_test_slab() is guaranteed untrue.
> folio_test_swapcache() doesn't seem likely to be true either; unless
> it's shmem, it can't be in the swapcache, and if it's shmem and in the
> swap cache, it can't be mapped to userspace (they're swizzled back from
> the swapcache to the pagecache before being mapped). And then the
> check for PAGE_MAPPING_FLAGS is guaranteed to be untrue (we know it's
> not anon/ksm/movable). So I think this should just be folio->mapping.
Ack, and we assert that it is indeed locked first. We will have checked
that this is not anon, and with the lock we shouldn't see it disappear
under us to be slab, we have also explicitly checked for ksm so that's out.
Wasn't aware of that swizzling actually... good to know! But I guess that
makes sense since you'd hit a swap entry in the fault code and trigger all
that fun stuff (hm let me go read the swap chapter in my book again :P)
TL;DR - will change. But will add a comment saying we can do it safely.
>
> > + /*
> > + * The page lock not only makes sure that page->mapping cannot
> > + * suddenly be NULLified by truncation, it makes sure that the
> > + * structure at mapping cannot be freed and reused yet,
> > + * so we can safely take mapping->i_mmap_rwsem.
> > + */
>
> I know you only moved this comment, but please fix it to refer to
> folios, not pages.
Ack will do.
>
> > + VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
> > +
> > + if (!mapping)
> > + return;
>
> Maybe make this a WARN_ON_ONCE?
I'm not sure if this isn't actually a vaguely possible scenario? Though
hm. I'm not 100% certain it's not expected to happen _sometimes_.
Perhaps one to do as a follow up in case it turns out this is sometimes
expected due to timing issues with a truncate?
But I may be wrong and this should demonstrably not happen other than in
case of programmatic error?
>
> > + __rmap_walk_file(folio, mapping, folio_pgoff(folio),
> > + folio_nr_pages(folio), rwc, locked);
>
> folio_pgoff() can go too. Just use folio->index.
>
Ack. Will change.
* Re: [RFC PATCH 2/3] mm: provide rmap_wrprotect_file_page() function
2025-01-08 17:25 ` Matthew Wilcox
@ 2025-01-08 19:35 ` Lorenzo Stoakes
0 siblings, 0 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-01-08 19:35 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Andrew Morton, Jaya Kumar, Simona Vetter, Helge Deller,
linux-fbdev, dri-devel, linux-kernel, linux-mm, David Hildenbrand
On Wed, Jan 08, 2025 at 05:25:01PM +0000, Matthew Wilcox wrote:
> On Wed, Jan 08, 2025 at 04:18:41PM +0000, Lorenzo Stoakes wrote:
> > +++ b/include/linux/rmap.h
> > @@ -754,6 +754,26 @@ unsigned long page_address_in_vma(const struct folio *folio,
> > */
> > int folio_mkclean(struct folio *);
> >
> > +/**
>
> The kerneldoc comment should be with the implementation, not the
> prototype.
>
> > + * rmap_wrprotect_file_page() - Traverses the reverse mapping, finding all VMAs
> > + * which contain a shared mapping of the single page at PFN @pfn in @mapping at
> > + * offset @pgoff and write-protecting the mappings.
>
> After the '-' should come a _short_ description ... maybe "Write protect
> all mappings of this page".
As you _well_ know Matthew, brevity is not my strong suit ;)
But sure, will cut this down to size...
>
> > + * The PFN mapped does not have to be a folio, but rather can be a kernel
> > + * allocation that is mapped into userland. We therefore do not require that the
> > + * PFN maps to a folio with a valid mapping or index field, rather these are
> > + * specified in @mapping and @pgoff.
> > + *
> > + * @mapping: The mapping whose reverse mapping should be traversed.
> > + * @pgoff: The page offset at which @pfn is mapped within @mapping.
> > + * @nr_pages: The number of physically contiguous base pages spanned.
> > + * @pfn: The PFN of the memory mapped in @mapping at @pgoff.
>
> The description of the params comes between the short and full
> description of the function.
Ack
>
> > + * Return the number of write-protected PTEs, or an error.
>
> colon after Return: so it becomes a section.
Ack will do
>
> > +int rmap_wrprotect_file_page(struct address_space *mapping, pgoff_t pgoff,
> > + unsigned long nr_pages, unsigned long pfn)
> > +{
> > + struct wrprotect_file_state state = {
> > + .cleaned = 0,
> > + .pgoff = pgoff,
> > + .pfn = pfn,
> > + .nr_pages = nr_pages,
> > + };
> > + struct rmap_walk_control rwc = {
> > + .arg = (void *)&state,
> > + .rmap_one = rmap_wrprotect_file_one,
> > + .invalid_vma = invalid_mkclean_vma,
> > + };
> > +
> > + if (!mapping)
> > + return 0;
>
> Should it be valid to pass in NULL?
>
I think it's ok for it to be, as in that case it's valid to say 'ok we
write-protected everything mapped by mapping - which was nothing'.
It's a bit blurry though.
* Re: [RFC PATCH 3/3] fb_defio: do not use deprecated page->mapping, index fields
2025-01-08 17:32 ` Matthew Wilcox
@ 2025-01-08 19:41 ` Lorenzo Stoakes
2025-01-13 23:01 ` Lorenzo Stoakes
2025-01-08 20:14 ` David Hildenbrand
1 sibling, 1 reply; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-01-08 19:41 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Andrew Morton, Jaya Kumar, Simona Vetter, Helge Deller,
linux-fbdev, dri-devel, linux-kernel, linux-mm, David Hildenbrand
On Wed, Jan 08, 2025 at 05:32:54PM +0000, Matthew Wilcox wrote:
> On Wed, Jan 08, 2025 at 04:18:42PM +0000, Lorenzo Stoakes wrote:
> > @@ -280,7 +269,10 @@ static void fb_deferred_io_work(struct work_struct *work)
> > struct folio *folio = page_folio(pageref->page);
> >
> > folio_lock(folio);
> > - folio_mkclean(folio);
> > + rmap_wrprotect_file_page(fbdefio->mapping,
> > + pageref->offset >> PAGE_SHIFT,
> > + compound_nr(pageref->page),
> > + page_to_pfn(pageref->page));
> > folio_unlock(folio);
>
> Why do we need to lock the folio? (since this isn't necessarily a
> folio) Also, do we need compound_nr() here? I _think_ for defio,
> the number of pages allocated per object are fixed, so this should be
> an fbdefio->nr_pages field?
I'm trying to keep the code as similar as possible to the way it was before,
even if there are questionable parts.
There is a comment about some timing issue around the locks and so there appears
to be an assumption about that.
As to compound_nr(), we're not write protecting everything, just each individual
page in the list that needs it, so we only want to do one at a time. I strongly
suspect it's a single base page each time, but for belts + braces I'm doing
compound_nr().
See below, this is wrong, it should just be '1'.
So this is iterating through a list of pagerefs that can be in any random order.
>
> (something that's always troubled me about compound_nr() is that it
> returns 1 for tail pages and the number you actually expect for head
> pages)
>
OK I changed this from '1' to compound_nr() out of an (apparent) abundance of
caution, but I was wrong:
npagerefs = DIV_ROUND_UP(info->fix.smem_len, PAGE_SIZE);
There are page refs for each PAGE_SIZE (i.e. base page size), so there is no way
anything is compound.
Will switch this to 1.
* Re: [RFC PATCH 3/3] fb_defio: do not use deprecated page->mapping, index fields
2025-01-08 17:32 ` Matthew Wilcox
2025-01-08 19:41 ` Lorenzo Stoakes
@ 2025-01-08 20:14 ` David Hildenbrand
2025-01-08 20:54 ` Matthew Wilcox
1 sibling, 1 reply; 18+ messages in thread
From: David Hildenbrand @ 2025-01-08 20:14 UTC (permalink / raw)
To: Matthew Wilcox, Lorenzo Stoakes
Cc: Andrew Morton, Jaya Kumar, Simona Vetter, Helge Deller,
linux-fbdev, dri-devel, linux-kernel, linux-mm
On 08.01.25 18:32, Matthew Wilcox wrote:
> On Wed, Jan 08, 2025 at 04:18:42PM +0000, Lorenzo Stoakes wrote:
>> @@ -280,7 +269,10 @@ static void fb_deferred_io_work(struct work_struct *work)
>> struct folio *folio = page_folio(pageref->page);
>>
>> folio_lock(folio);
>> - folio_mkclean(folio);
>> + rmap_wrprotect_file_page(fbdefio->mapping,
>> + pageref->offset >> PAGE_SHIFT,
>> + compound_nr(pageref->page),
>> + page_to_pfn(pageref->page));
>> folio_unlock(folio);
>
> Why do we need to lock the folio? (since this isn't necessarily a
> folio)
Can you clarify the "since this isn't necessarily a folio" part ? Do you
mean in the future, when we split "struct page" and "struct folio"?
Doing an rmap walk on something that won't be a folio is ... sounds odd
(->wrong :) )
--
Cheers,
David / dhildenb
* Re: [RFC PATCH 3/3] fb_defio: do not use deprecated page->mapping, index fields
2025-01-08 20:14 ` David Hildenbrand
@ 2025-01-08 20:54 ` Matthew Wilcox
2025-01-08 21:12 ` David Hildenbrand
0 siblings, 1 reply; 18+ messages in thread
From: Matthew Wilcox @ 2025-01-08 20:54 UTC (permalink / raw)
To: David Hildenbrand
Cc: Lorenzo Stoakes, Andrew Morton, Jaya Kumar, Simona Vetter,
Helge Deller, linux-fbdev, dri-devel, linux-kernel, linux-mm
On Wed, Jan 08, 2025 at 09:14:53PM +0100, David Hildenbrand wrote:
> On 08.01.25 18:32, Matthew Wilcox wrote:
> > On Wed, Jan 08, 2025 at 04:18:42PM +0000, Lorenzo Stoakes wrote:
> > > @@ -280,7 +269,10 @@ static void fb_deferred_io_work(struct work_struct *work)
> > > struct folio *folio = page_folio(pageref->page);
> > > folio_lock(folio);
> > > - folio_mkclean(folio);
> > > + rmap_wrprotect_file_page(fbdefio->mapping,
> > > + pageref->offset >> PAGE_SHIFT,
> > > + compound_nr(pageref->page),
> > > + page_to_pfn(pageref->page));
> > > folio_unlock(folio);
> >
> > Why do we need to lock the folio? (since this isn't necessarily a
> > folio)
>
> Can you clarify the "since this isn't necessarily a folio" part ? Do you
> mean in the future, when we split "struct page" and "struct folio"?
Right. I need to finish the email that explains where I think we're
going in 2025 ...
> Doing an rmap walk on something that won't be a folio is ... sounds odd
> (->wrong :) )
Not necessarily! We already do that (since 2022) for DAX (see
6a8e0596f004). rmap lets you find every place that a given range
of a file is mapped into user address spaces; but that file might be a
device file, and so it's not just pagecache but also (in this case)
fb memory, and whatever else device drivers decide to mmap.
* Re: [RFC PATCH 3/3] fb_defio: do not use deprecated page->mapping, index fields
2025-01-08 20:54 ` Matthew Wilcox
@ 2025-01-08 21:12 ` David Hildenbrand
2025-01-08 21:55 ` Matthew Wilcox
2025-01-13 17:48 ` Lorenzo Stoakes
0 siblings, 2 replies; 18+ messages in thread
From: David Hildenbrand @ 2025-01-08 21:12 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Lorenzo Stoakes, Andrew Morton, Jaya Kumar, Simona Vetter,
Helge Deller, linux-fbdev, dri-devel, linux-kernel, linux-mm
On 08.01.25 21:54, Matthew Wilcox wrote:
> On Wed, Jan 08, 2025 at 09:14:53PM +0100, David Hildenbrand wrote:
>> On 08.01.25 18:32, Matthew Wilcox wrote:
>>> On Wed, Jan 08, 2025 at 04:18:42PM +0000, Lorenzo Stoakes wrote:
>>>> @@ -280,7 +269,10 @@ static void fb_deferred_io_work(struct work_struct *work)
>>>> struct folio *folio = page_folio(pageref->page);
>>>> folio_lock(folio);
>>>> - folio_mkclean(folio);
>>>> + rmap_wrprotect_file_page(fbdefio->mapping,
>>>> + pageref->offset >> PAGE_SHIFT,
>>>> + compound_nr(pageref->page),
>>>> + page_to_pfn(pageref->page));
>>>> folio_unlock(folio);
>>>
>>> Why do we need to lock the folio? (since this isn't necessarily a
>>> folio)
>>
>> Can you clarify the "since this isn't necessarily a folio" part ? Do you
>> mean in the future, when we split "struct page" and "struct folio"?
>
> Right. I need to finish the email that explains where I think we're
> going in 2025 ...
>
>> Doing an rmap walk on something that won't be a folio is ... sounds odd
>> (->wrong :) )
>
> Not necessarily! We already do that (since 2022) for DAX (see
> 6a8e0596f004). rmap lets you find every place that a given range
> of a file is mapped into user address spaces; but that file might be a
> device file, and so it's not just pagecache but also (in this case)
> fb memory, and whatever else device drivers decide to mmap.
Yes, that part I remember.
I thought we would be passing in a page into rmap_wrprotect_file_page(),
and was wondering what we would do to "struct page" that won't be a
folio in there.
Probably, because the "_page" in rmap_wrprotect_file_page() is misleading :)
... should it be "file_range" ? (but we also pass the pfn ... )
--
Cheers,
David / dhildenb
* Re: [RFC PATCH 3/3] fb_defio: do not use deprecated page->mapping, index fields
2025-01-08 21:12 ` David Hildenbrand
@ 2025-01-08 21:55 ` Matthew Wilcox
2025-01-08 22:02 ` David Hildenbrand
2025-01-13 17:48 ` Lorenzo Stoakes
1 sibling, 1 reply; 18+ messages in thread
From: Matthew Wilcox @ 2025-01-08 21:55 UTC (permalink / raw)
To: David Hildenbrand
Cc: Lorenzo Stoakes, Andrew Morton, Jaya Kumar, Simona Vetter,
Helge Deller, linux-fbdev, dri-devel, linux-kernel, linux-mm
On Wed, Jan 08, 2025 at 10:12:36PM +0100, David Hildenbrand wrote:
> On 08.01.25 21:54, Matthew Wilcox wrote:
> > Not necessarily! We already do that (since 2022) for DAX (see
> > 6a8e0596f004). rmap lets you find every place that a given range
> > of a file is mapped into user address spaces; but that file might be a
> > device file, and so it's not just pagecache but also (in this case)
> > fb memory, and whatever else device drivers decide to mmap.
>
> Yes, that part I remember.
>
> I thought we would be passing in a page into rmap_wrprotect_file_page(), and
> was wondering what we would do to "struct page" that won't be a folio in
> there.
>
> Probably, because the "_page" in rmap_wrprotect_file_page() is misleading :)
>
> ... should it be "file_range" ? (but we also pass the pfn ... )
I don't think it's unprecedented for us to identify a page by its pfn.
After all, the acronym stands for "page frame number". That said, for
the one caller of this, it has the struct page and passes in the result
from page_to_pfn(). So no harm in passing in the struct page directly.
I would not like to see this function called "rmap_wrprotect_file_pfn".
Files don't have pfns, so that's a bad name.
* Re: [RFC PATCH 3/3] fb_defio: do not use deprecated page->mapping, index fields
2025-01-08 21:55 ` Matthew Wilcox
@ 2025-01-08 22:02 ` David Hildenbrand
2025-01-13 17:18 ` Lorenzo Stoakes
0 siblings, 1 reply; 18+ messages in thread
From: David Hildenbrand @ 2025-01-08 22:02 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Lorenzo Stoakes, Andrew Morton, Jaya Kumar, Simona Vetter,
Helge Deller, linux-fbdev, dri-devel, linux-kernel, linux-mm
On 08.01.25 22:55, Matthew Wilcox wrote:
> On Wed, Jan 08, 2025 at 10:12:36PM +0100, David Hildenbrand wrote:
>> On 08.01.25 21:54, Matthew Wilcox wrote:
>>> Not necessarily! We already do that (since 2022) for DAX (see
>>> 6a8e0596f004). rmap lets you find every place that a given range
>>> of a file is mapped into user address spaces; but that file might be a
>>> device file, and so it's not just pagecache but also (in this case)
>>> fb memory, and whatever else device drivers decide to mmap.
>>
>> Yes, that part I remember.
>>
>> I thought we would be passing in a page into rmap_wrprotect_file_page(), and
>> was wondering what we would do to "struct page" that won't be a folio in
>> there.
>>
>> Probably, because the "_page" in rmap_wrprotect_file_page() is misleading :)
>>
>> ... should it be "file_range" ? (but we also pass the pfn ... )
>
> I don't think it's unprecedented for us to identify a page by its pfn.
> After all, the acronym stands for "page frame number". That said, for
> the one caller of this, it has the struct page and passes in the result
> from page_to_pfn(). So no harm in passing in the struct page directly.
>
> I would not like to see this function called "rmap_wrprotect_file_pfn".
> Files don't have pfns, so that's a bad name.
Agreed.
(it's too late in the evening for me to give any good suggestions :) )
--
Cheers,
David / dhildenb
* Re: [RFC PATCH 3/3] fb_defio: do not use deprecated page->mapping, index fields
2025-01-08 22:02 ` David Hildenbrand
@ 2025-01-13 17:18 ` Lorenzo Stoakes
0 siblings, 0 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-01-13 17:18 UTC (permalink / raw)
To: David Hildenbrand
Cc: Matthew Wilcox, Andrew Morton, Simona Vetter, Helge Deller,
linux-fbdev, dri-devel, linux-kernel, linux-mm
On Wed, Jan 08, 2025 at 11:02:57PM +0100, David Hildenbrand wrote:
> On 08.01.25 22:55, Matthew Wilcox wrote:
> > On Wed, Jan 08, 2025 at 10:12:36PM +0100, David Hildenbrand wrote:
> > > On 08.01.25 21:54, Matthew Wilcox wrote:
> > > > Not necessarily! We already do that (since 2022) for DAX (see
> > > > 6a8e0596f004). rmap lets you find every place that a given range
> > > > of a file is mapped into user address spaces; but that file might be a
> > > > device file, and so it's not just pagecache but also (in this case)
> > > > fb memory, and whatever else device drivers decide to mmap.
> > >
> > > Yes, that part I remember.
> > >
> > > I thought we would be passing in a page into rmap_wrprotect_file_page(), and
> > > was wondering what we would do to "struct page" that won't be a folio in
> > > there.
> > >
> > > Probably, because the "_page" in rmap_wrprotect_file_page() is misleading :)
> > >
> > > ... should it be "file_range" ? (but we also pass the pfn ... )
> >
> > I don't think it's unprecedented for us to identify a page by its pfn.
> > After all, the acronym stands for "page frame number". That said, for
> > the one caller of this, it has the struct page and passes in the result
> > from page_to_pfn(). So no harm in passing in the struct page directly.
> >
> > I would not like to see this function called "rmap_wrprotect_file_pfn".
> > Files don't have pfns, so that's a bad name.
>
> Agreed.
>
> (it's too late in the evening for me to give any good suggestions :) )
Matthew pinged me on irc with mapping_wrprotect_page() :>)
Am happy to do that, will respin in a bit anyway...
>
> --
> Cheers,
>
> David / dhildenb
>
* Re: [RFC PATCH 3/3] fb_defio: do not use deprecated page->mapping, index fields
2025-01-08 21:12 ` David Hildenbrand
2025-01-08 21:55 ` Matthew Wilcox
@ 2025-01-13 17:48 ` Lorenzo Stoakes
1 sibling, 0 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-01-13 17:48 UTC (permalink / raw)
To: David Hildenbrand
Cc: Matthew Wilcox, Andrew Morton, Jaya Kumar, Simona Vetter,
Helge Deller, linux-fbdev, dri-devel, linux-kernel, linux-mm
On Wed, Jan 08, 2025 at 10:12:36PM +0100, David Hildenbrand wrote:
> On 08.01.25 21:54, Matthew Wilcox wrote:
> > On Wed, Jan 08, 2025 at 09:14:53PM +0100, David Hildenbrand wrote:
> > > On 08.01.25 18:32, Matthew Wilcox wrote:
> > > > On Wed, Jan 08, 2025 at 04:18:42PM +0000, Lorenzo Stoakes wrote:
> > > > > @@ -280,7 +269,10 @@ static void fb_deferred_io_work(struct work_struct *work)
> > > > > struct folio *folio = page_folio(pageref->page);
> > > > > folio_lock(folio);
> > > > > - folio_mkclean(folio);
> > > > > + rmap_wrprotect_file_page(fbdefio->mapping,
> > > > > + pageref->offset >> PAGE_SHIFT,
> > > > > + compound_nr(pageref->page),
> > > > > + page_to_pfn(pageref->page));
> > > > > folio_unlock(folio);
> > > >
> > > > Why do we need to lock the folio? (since this isn't necessarily a
> > > > folio)
> > >
> > > Can you clarify the "since this isn't necessarily a folio" part ? Do you
> > > mean in the future, when we split "struct page" and "struct folio"?
> >
> > Right. I need to finish the email that explains where I think we're
> > going in 2025 ...
> >
> > > Doing an rmap walk on something that won't be a folio is ... sounds odd
> > > (->wrong :) )
> >
> > Not necessarily! We already do that (since 2022) for DAX (see
> > 6a8e0596f004). rmap lets you find every place that a given range
> > of a file is mapped into user address spaces; but that file might be a
> > device file, and so it's not just pagecache but also (in this case)
> > fb memory, and whatever else device drivers decide to mmap.
>
> Yes, that part I remember.
>
> I thought we would be passing in a page into rmap_wrprotect_file_page(), and
> was wondering what we would do to "struct page" that won't be a folio in
> there.
The reason I provide a PFN is that we internally use a PFN for the walk, and
everything else in the rmap code is folio-centric, which doesn't suit memory
that isn't necessarily a folio.
However it does seem silly to have to page_to_pfn() a page that we pass in, so I
will update to accept a page and do this bit in the function itself.
>
> Probably, because the "_page" in rmap_wrprotect_file_page() is misleading :)
>
> ... should it be "file_range" ? (but we also pass the pfn ... )
>
> --
> Cheers,
>
> David / dhildenb
>
* Re: [RFC PATCH 3/3] fb_defio: do not use deprecated page->mapping, index fields
2025-01-08 19:41 ` Lorenzo Stoakes
@ 2025-01-13 23:01 ` Lorenzo Stoakes
0 siblings, 0 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-01-13 23:01 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Andrew Morton, Jaya Kumar, Simona Vetter, Helge Deller,
linux-fbdev, dri-devel, linux-kernel, linux-mm, David Hildenbrand
On Wed, Jan 08, 2025 at 07:41:31PM +0000, Lorenzo Stoakes wrote:
> On Wed, Jan 08, 2025 at 05:32:54PM +0000, Matthew Wilcox wrote:
> > On Wed, Jan 08, 2025 at 04:18:42PM +0000, Lorenzo Stoakes wrote:
> > > @@ -280,7 +269,10 @@ static void fb_deferred_io_work(struct work_struct *work)
> > > struct folio *folio = page_folio(pageref->page);
> > >
> > > folio_lock(folio);
> > > - folio_mkclean(folio);
> > > + rmap_wrprotect_file_page(fbdefio->mapping,
> > > + pageref->offset >> PAGE_SHIFT,
> > > + compound_nr(pageref->page),
> > > + page_to_pfn(pageref->page));
> > > folio_unlock(folio);
> >
> > Why do we need to lock the folio? (since this isn't necessarily a
> > folio) Also, do we need compound_nr() here? I _think_ for defio,
> > the number of pages allocated per object are fixed, so this should be
> > an fbdefio->nr_pages field?
>
> I'm trying to keep the code as similar as possible to the way it was before,
> even if there are questionable parts.
>
> There is a comment about some timing issue around the locks and so there appears
> to be an assumption about that.
Actually, reading through the code, I think the comment is with regards to
page_mkwrite(), so we should be ok, in fb_deferred_io_track_page():
/*
* We want the page to remain locked from ->page_mkwrite until
* the PTE is marked dirty to avoid mapping_wrprotect_page()
* being called before the PTE is updated, which would leave
* the page ignored by defio.
* Do this by locking the page here and informing the caller
* about it with VM_FAULT_LOCKED.
*/
lock_page(pageref->page);
I don't think we need to lock the page (which is managed as kernel memory so
doesn't require it).
So will remove.
>
> As to compound_nr(), we're not write protecting everything, just each individual
> page in the list that needs it, so we only want to do one at a time. I strongly
> suspect it's a single base page each time, but for belts + braces I'm doing
> compound_nr().
>
> See below, this is wrong, it should just be '1'.
>
> So this is iterating through a list of pagerefs that can be in any random order.
>
> >
> > (something that's always troubled me about compound_nr() is that it
> > returns 1 for tail pages and the number you actually expect for head
> > pages)
> >
>
> OK I changed this from '1' to compound_nr() out of an (apparent) abundance of
> caution, but I was wrong:
>
> npagerefs = DIV_ROUND_UP(info->fix.smem_len, PAGE_SIZE);
>
> There are page refs for each PAGE_SIZE (i.e. base page size), so there is no way
> anything is compound.
>
> Will switch this to 1.
Thread overview: 18+ messages
2025-01-08 16:18 [RFC PATCH 0/3] expose mapping wrprotect, fix fb_defio use Lorenzo Stoakes
2025-01-08 16:18 ` [RFC PATCH 1/3] mm: refactor rmap_walk_file() to separate out traversal logic Lorenzo Stoakes
2025-01-08 16:38 ` Matthew Wilcox
2025-01-08 19:23 ` Lorenzo Stoakes
2025-01-08 16:18 ` [RFC PATCH 2/3] mm: provide rmap_wrprotect_file_page() function Lorenzo Stoakes
2025-01-08 17:25 ` Matthew Wilcox
2025-01-08 19:35 ` Lorenzo Stoakes
2025-01-08 16:18 ` [RFC PATCH 3/3] fb_defio: do not use deprecated page->mapping, index fields Lorenzo Stoakes
2025-01-08 17:32 ` Matthew Wilcox
2025-01-08 19:41 ` Lorenzo Stoakes
2025-01-13 23:01 ` Lorenzo Stoakes
2025-01-08 20:14 ` David Hildenbrand
2025-01-08 20:54 ` Matthew Wilcox
2025-01-08 21:12 ` David Hildenbrand
2025-01-08 21:55 ` Matthew Wilcox
2025-01-08 22:02 ` David Hildenbrand
2025-01-13 17:18 ` Lorenzo Stoakes
2025-01-13 17:48 ` Lorenzo Stoakes