* [PATCH 01/11] f2fs: Remove check for ->writepage
2025-03-07 13:54 [PATCH 00/11] Remove aops->writepage Matthew Wilcox (Oracle)
@ 2025-03-07 13:54 ` Matthew Wilcox (Oracle)
2025-03-07 13:54 ` [PATCH 02/11] f2fs: Remove f2fs_write_data_page() Matthew Wilcox (Oracle)
` (10 subsequent siblings)
11 siblings, 0 replies; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2025-03-07 13:54 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Matthew Wilcox (Oracle), linux-mm, intel-gfx
We're almost able to remove a_ops->writepage. This check is unnecessary
as we'll never call into __f2fs_write_data_pages() for character
devices.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/f2fs/data.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index c82d949709f4..a80d5ef9acbb 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -3280,10 +3280,6 @@ static int __f2fs_write_data_pages(struct address_space *mapping,
int ret;
bool locked = false;
- /* deal with chardevs and other special file */
- if (!mapping->a_ops->writepage)
- return 0;
-
/* skip writing if there is no dirty page in this inode */
if (!get_dirty_pages(inode) && wbc->sync_mode == WB_SYNC_NONE)
return 0;
--
2.47.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 02/11] f2fs: Remove f2fs_write_data_page()
2025-03-07 13:54 [PATCH 00/11] Remove aops->writepage Matthew Wilcox (Oracle)
2025-03-07 13:54 ` [PATCH 01/11] f2fs: Remove check for ->writepage Matthew Wilcox (Oracle)
@ 2025-03-07 13:54 ` Matthew Wilcox (Oracle)
2025-03-07 13:54 ` [PATCH 03/11] f2fs: Remove f2fs_write_meta_page() Matthew Wilcox (Oracle)
` (9 subsequent siblings)
11 siblings, 0 replies; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2025-03-07 13:54 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Matthew Wilcox (Oracle), linux-mm, intel-gfx
Mappings which implement writepages should not implement writepage
as it can only harm writeback patterns.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/f2fs/data.c | 24 ------------------------
1 file changed, 24 deletions(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index a80d5ef9acbb..cdd63e8ad42e 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2935,29 +2935,6 @@ int f2fs_write_single_data_page(struct folio *folio, int *submitted,
return err;
}
-static int f2fs_write_data_page(struct page *page,
- struct writeback_control *wbc)
-{
- struct folio *folio = page_folio(page);
-#ifdef CONFIG_F2FS_FS_COMPRESSION
- struct inode *inode = folio->mapping->host;
-
- if (unlikely(f2fs_cp_error(F2FS_I_SB(inode))))
- goto out;
-
- if (f2fs_compressed_file(inode)) {
- if (f2fs_is_compressed_cluster(inode, folio->index)) {
- folio_redirty_for_writepage(wbc, folio);
- return AOP_WRITEPAGE_ACTIVATE;
- }
- }
-out:
-#endif
-
- return f2fs_write_single_data_page(folio, NULL, NULL, NULL,
- wbc, FS_DATA_IO, 0, true);
-}
-
/*
* This function was copied from write_cache_pages from mm/page-writeback.c.
* The major change is making write step of cold data page separately from
@@ -4111,7 +4088,6 @@ static void f2fs_swap_deactivate(struct file *file)
const struct address_space_operations f2fs_dblock_aops = {
.read_folio = f2fs_read_data_folio,
.readahead = f2fs_readahead,
- .writepage = f2fs_write_data_page,
.writepages = f2fs_write_data_pages,
.write_begin = f2fs_write_begin,
.write_end = f2fs_write_end,
--
2.47.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 03/11] f2fs: Remove f2fs_write_meta_page()
2025-03-07 13:54 [PATCH 00/11] Remove aops->writepage Matthew Wilcox (Oracle)
2025-03-07 13:54 ` [PATCH 01/11] f2fs: Remove check for ->writepage Matthew Wilcox (Oracle)
2025-03-07 13:54 ` [PATCH 02/11] f2fs: Remove f2fs_write_data_page() Matthew Wilcox (Oracle)
@ 2025-03-07 13:54 ` Matthew Wilcox (Oracle)
2025-03-07 13:54 ` [PATCH 04/11] f2fs: Remove f2fs_write_node_page() Matthew Wilcox (Oracle)
` (8 subsequent siblings)
11 siblings, 0 replies; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2025-03-07 13:54 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Matthew Wilcox (Oracle), linux-mm, intel-gfx
Mappings which implement writepages should not implement writepage
as it can only harm writeback patterns.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/f2fs/checkpoint.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index a35595f8d3f5..412282f50cbb 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -381,12 +381,6 @@ static int __f2fs_write_meta_page(struct page *page,
return AOP_WRITEPAGE_ACTIVATE;
}
-static int f2fs_write_meta_page(struct page *page,
- struct writeback_control *wbc)
-{
- return __f2fs_write_meta_page(page, wbc, FS_META_IO);
-}
-
static int f2fs_write_meta_pages(struct address_space *mapping,
struct writeback_control *wbc)
{
@@ -507,7 +501,6 @@ static bool f2fs_dirty_meta_folio(struct address_space *mapping,
}
const struct address_space_operations f2fs_meta_aops = {
- .writepage = f2fs_write_meta_page,
.writepages = f2fs_write_meta_pages,
.dirty_folio = f2fs_dirty_meta_folio,
.invalidate_folio = f2fs_invalidate_folio,
--
2.47.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 04/11] f2fs: Remove f2fs_write_node_page()
2025-03-07 13:54 [PATCH 00/11] Remove aops->writepage Matthew Wilcox (Oracle)
` (2 preceding siblings ...)
2025-03-07 13:54 ` [PATCH 03/11] f2fs: Remove f2fs_write_meta_page() Matthew Wilcox (Oracle)
@ 2025-03-07 13:54 ` Matthew Wilcox (Oracle)
2025-03-07 13:54 ` [PATCH 05/11] vboxsf: Convert to writepages Matthew Wilcox (Oracle)
` (7 subsequent siblings)
11 siblings, 0 replies; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2025-03-07 13:54 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Matthew Wilcox (Oracle), linux-mm, intel-gfx
Mappings which implement writepages should not implement writepage
as it can only harm writeback patterns.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/f2fs/node.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 36614a1c2590..b78c1f95bc04 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1784,13 +1784,6 @@ int f2fs_move_node_page(struct page *node_page, int gc_type)
return err;
}
-static int f2fs_write_node_page(struct page *page,
- struct writeback_control *wbc)
-{
- return __write_node_page(page, false, NULL, wbc, false,
- FS_NODE_IO, NULL);
-}
-
int f2fs_fsync_node_pages(struct f2fs_sb_info *sbi, struct inode *inode,
struct writeback_control *wbc, bool atomic,
unsigned int *seq_id)
@@ -2217,7 +2210,6 @@ static bool f2fs_dirty_node_folio(struct address_space *mapping,
* Structure of the f2fs node operations
*/
const struct address_space_operations f2fs_node_aops = {
- .writepage = f2fs_write_node_page,
.writepages = f2fs_write_node_pages,
.dirty_folio = f2fs_dirty_node_folio,
.invalidate_folio = f2fs_invalidate_folio,
--
2.47.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 05/11] vboxsf: Convert to writepages
2025-03-07 13:54 [PATCH 00/11] Remove aops->writepage Matthew Wilcox (Oracle)
` (3 preceding siblings ...)
2025-03-07 13:54 ` [PATCH 04/11] f2fs: Remove f2fs_write_node_page() Matthew Wilcox (Oracle)
@ 2025-03-07 13:54 ` Matthew Wilcox (Oracle)
2025-03-07 13:54 ` [PATCH 06/11] migrate: Remove call to ->writepage Matthew Wilcox (Oracle)
` (6 subsequent siblings)
11 siblings, 0 replies; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2025-03-07 13:54 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Matthew Wilcox (Oracle), linux-mm, intel-gfx
If we add a migrate_folio operation, we can convert the writepage
operation to writepages. Further, this lets us optimise by using
the same write handle for multiple folios. The large folio support here
is illusory; we would need to kmap each page in turn for proper support.
But we do remove a few hidden calls to compound_head().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/vboxsf/file.c | 47 +++++++++++++++++++++++++----------------------
1 file changed, 25 insertions(+), 22 deletions(-)
diff --git a/fs/vboxsf/file.c b/fs/vboxsf/file.c
index b780deb81b02..b492794f8e9a 100644
--- a/fs/vboxsf/file.c
+++ b/fs/vboxsf/file.c
@@ -262,40 +262,42 @@ static struct vboxsf_handle *vboxsf_get_write_handle(struct vboxsf_inode *sf_i)
return sf_handle;
}
-static int vboxsf_writepage(struct page *page, struct writeback_control *wbc)
+static int vboxsf_writepages(struct address_space *mapping,
+ struct writeback_control *wbc)
{
- struct inode *inode = page->mapping->host;
+ struct inode *inode = mapping->host;
+ struct folio *folio = NULL;
struct vboxsf_inode *sf_i = VBOXSF_I(inode);
struct vboxsf_handle *sf_handle;
- loff_t off = page_offset(page);
loff_t size = i_size_read(inode);
- u32 nwrite = PAGE_SIZE;
- u8 *buf;
- int err;
-
- if (off + PAGE_SIZE > size)
- nwrite = size & ~PAGE_MASK;
+ int error;
sf_handle = vboxsf_get_write_handle(sf_i);
if (!sf_handle)
return -EBADF;
- buf = kmap(page);
- err = vboxsf_write(sf_handle->root, sf_handle->handle,
- off, &nwrite, buf);
- kunmap(page);
+ while ((folio = writeback_iter(mapping, wbc, folio, &error))) {
+ loff_t off = folio_pos(folio);
+ u32 nwrite = folio_size(folio);
+ u8 *buf;
- kref_put(&sf_handle->refcount, vboxsf_handle_release);
+ if (nwrite > size - off)
+ nwrite = size - off;
- if (err == 0) {
- /* mtime changed */
- sf_i->force_restat = 1;
- } else {
- ClearPageUptodate(page);
+ buf = kmap_local_folio(folio, 0);
+ error = vboxsf_write(sf_handle->root, sf_handle->handle,
+ off, &nwrite, buf);
+ kunmap_local(buf);
+
+ folio_unlock(folio);
}
- unlock_page(page);
- return err;
+ kref_put(&sf_handle->refcount, vboxsf_handle_release);
+
+ /* mtime changed */
+ if (error == 0)
+ sf_i->force_restat = 1;
+ return error;
}
static int vboxsf_write_end(struct file *file, struct address_space *mapping,
@@ -347,10 +349,11 @@ static int vboxsf_write_end(struct file *file, struct address_space *mapping,
*/
const struct address_space_operations vboxsf_reg_aops = {
.read_folio = vboxsf_read_folio,
- .writepage = vboxsf_writepage,
+ .writepages = vboxsf_writepages,
.dirty_folio = filemap_dirty_folio,
.write_begin = simple_write_begin,
.write_end = vboxsf_write_end,
+ .migrate_folio = filemap_migrate_folio,
};
static const char *vboxsf_get_link(struct dentry *dentry, struct inode *inode,
--
2.47.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 06/11] migrate: Remove call to ->writepage
2025-03-07 13:54 [PATCH 00/11] Remove aops->writepage Matthew Wilcox (Oracle)
` (4 preceding siblings ...)
2025-03-07 13:54 ` [PATCH 05/11] vboxsf: Convert to writepages Matthew Wilcox (Oracle)
@ 2025-03-07 13:54 ` Matthew Wilcox (Oracle)
2025-03-27 15:04 ` Zi Yan
2025-03-07 13:54 ` [PATCH 07/11] writeback: Remove writeback_use_writepage() Matthew Wilcox (Oracle)
` (5 subsequent siblings)
11 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2025-03-07 13:54 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Matthew Wilcox (Oracle), linux-mm, intel-gfx
The writepage callback is going away; filesystems must implement
migrate_folio or else dirty folios will not be migratable.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/migrate.c | 57 ++++------------------------------------------------
1 file changed, 4 insertions(+), 53 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index c0adea67cd62..3d1d9d49fb8e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -944,67 +944,18 @@ int filemap_migrate_folio(struct address_space *mapping,
}
EXPORT_SYMBOL_GPL(filemap_migrate_folio);
-/*
- * Writeback a folio to clean the dirty state
- */
-static int writeout(struct address_space *mapping, struct folio *folio)
-{
- struct writeback_control wbc = {
- .sync_mode = WB_SYNC_NONE,
- .nr_to_write = 1,
- .range_start = 0,
- .range_end = LLONG_MAX,
- .for_reclaim = 1
- };
- int rc;
-
- if (!mapping->a_ops->writepage)
- /* No write method for the address space */
- return -EINVAL;
-
- if (!folio_clear_dirty_for_io(folio))
- /* Someone else already triggered a write */
- return -EAGAIN;
-
- /*
- * A dirty folio may imply that the underlying filesystem has
- * the folio on some queue. So the folio must be clean for
- * migration. Writeout may mean we lose the lock and the
- * folio state is no longer what we checked for earlier.
- * At this point we know that the migration attempt cannot
- * be successful.
- */
- remove_migration_ptes(folio, folio, 0);
-
- rc = mapping->a_ops->writepage(&folio->page, &wbc);
-
- if (rc != AOP_WRITEPAGE_ACTIVATE)
- /* unlocked. Relock */
- folio_lock(folio);
-
- return (rc < 0) ? -EIO : -EAGAIN;
-}
-
/*
* Default handling if a filesystem does not provide a migration function.
*/
static int fallback_migrate_folio(struct address_space *mapping,
struct folio *dst, struct folio *src, enum migrate_mode mode)
{
- if (folio_test_dirty(src)) {
- /* Only writeback folios in full synchronous migration */
- switch (mode) {
- case MIGRATE_SYNC:
- break;
- default:
- return -EBUSY;
- }
- return writeout(mapping, src);
- }
+ if (folio_test_dirty(src))
+ return -EBUSY;
/*
- * Buffers may be managed in a filesystem specific way.
- * We must have no buffers or drop them.
+ * Filesystem may have private data at folio->private that we
+ * can't migrate automatically.
*/
if (!filemap_release_folio(src, GFP_KERNEL))
return mode == MIGRATE_SYNC ? -EAGAIN : -EBUSY;
--
2.47.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 06/11] migrate: Remove call to ->writepage
2025-03-07 13:54 ` [PATCH 06/11] migrate: Remove call to ->writepage Matthew Wilcox (Oracle)
@ 2025-03-27 15:04 ` Zi Yan
2025-03-27 16:52 ` Matthew Wilcox
0 siblings, 1 reply; 25+ messages in thread
From: Zi Yan @ 2025-03-27 15:04 UTC (permalink / raw)
To: Matthew Wilcox (Oracle), linux-fsdevel, David Hildenbrand,
Joanne Koong
Cc: linux-mm, intel-gfx
On Fri Mar 7, 2025 at 8:54 AM EST, Matthew Wilcox (Oracle) wrote:
> The writepage callback is going away; filesystems must implement
> migrate_folio or else dirty folios will not be migratable.
What is the impact of this? Are there any filesystem that has
a_ops->writepage() without migrate_folio()? I wonder if it could make
the un-migratable problem worse[1] when such FS exists.
[1] https://lore.kernel.org/linux-mm/882b566c-34d6-4e68-9447-6c74a0693f18@redhat.com/
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
> mm/migrate.c | 57 ++++------------------------------------------------
> 1 file changed, 4 insertions(+), 53 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index c0adea67cd62..3d1d9d49fb8e 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -944,67 +944,18 @@ int filemap_migrate_folio(struct address_space *mapping,
> }
> EXPORT_SYMBOL_GPL(filemap_migrate_folio);
>
> -/*
> - * Writeback a folio to clean the dirty state
> - */
> -static int writeout(struct address_space *mapping, struct folio *folio)
> -{
> - struct writeback_control wbc = {
> - .sync_mode = WB_SYNC_NONE,
> - .nr_to_write = 1,
> - .range_start = 0,
> - .range_end = LLONG_MAX,
> - .for_reclaim = 1
> - };
> - int rc;
> -
> - if (!mapping->a_ops->writepage)
> - /* No write method for the address space */
> - return -EINVAL;
> -
> - if (!folio_clear_dirty_for_io(folio))
> - /* Someone else already triggered a write */
> - return -EAGAIN;
> -
> - /*
> - * A dirty folio may imply that the underlying filesystem has
> - * the folio on some queue. So the folio must be clean for
> - * migration. Writeout may mean we lose the lock and the
> - * folio state is no longer what we checked for earlier.
> - * At this point we know that the migration attempt cannot
> - * be successful.
> - */
> - remove_migration_ptes(folio, folio, 0);
> -
> - rc = mapping->a_ops->writepage(&folio->page, &wbc);
> -
> - if (rc != AOP_WRITEPAGE_ACTIVATE)
> - /* unlocked. Relock */
> - folio_lock(folio);
> -
> - return (rc < 0) ? -EIO : -EAGAIN;
> -}
> -
> /*
> * Default handling if a filesystem does not provide a migration function.
> */
> static int fallback_migrate_folio(struct address_space *mapping,
> struct folio *dst, struct folio *src, enum migrate_mode mode)
> {
> - if (folio_test_dirty(src)) {
> - /* Only writeback folios in full synchronous migration */
> - switch (mode) {
> - case MIGRATE_SYNC:
> - break;
> - default:
> - return -EBUSY;
> - }
> - return writeout(mapping, src);
> - }
Now fallback_migrate_folio() no longer writes out page for FS, so it is
the responsibilty of migrate_folio()?
+Joanne, since she is touching the above code in the FUSE temp page removal
patchset.
> + if (folio_test_dirty(src))
> + return -EBUSY;
>
> /*
> - * Buffers may be managed in a filesystem specific way.
> - * We must have no buffers or drop them.
> + * Filesystem may have private data at folio->private that we
> + * can't migrate automatically.
> */
> if (!filemap_release_folio(src, GFP_KERNEL))
> return mode == MIGRATE_SYNC ? -EAGAIN : -EBUSY;
--
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 06/11] migrate: Remove call to ->writepage
2025-03-27 15:04 ` Zi Yan
@ 2025-03-27 16:52 ` Matthew Wilcox
2025-03-27 17:22 ` Zi Yan
0 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox @ 2025-03-27 16:52 UTC (permalink / raw)
To: Zi Yan; +Cc: linux-fsdevel, David Hildenbrand, Joanne Koong, linux-mm,
intel-gfx
On Thu, Mar 27, 2025 at 11:04:57AM -0400, Zi Yan wrote:
> On Fri Mar 7, 2025 at 8:54 AM EST, Matthew Wilcox (Oracle) wrote:
> > The writepage callback is going away; filesystems must implement
> > migrate_folio or else dirty folios will not be migratable.
>
> What is the impact of this? Are there any filesystem that has
> a_ops->writepage() without migrate_folio()? I wonder if it could make
> the un-migratable problem worse[1] when such FS exists.
As Christoph and I have been going through filesystems removing their
->writepage operations, we've been careful to add ->migrate_folio
callbacks at the same time. But we haven't fixed any out-of-tree
filesystems, and we can't fix the filesystems which will be written in
the future.
So maybe what we should do is WARN_ON_ONCE() for filesystems which
have a ->writepages, but do not have a ->migrate_folio()?
> > static int fallback_migrate_folio(struct address_space *mapping,
> > struct folio *dst, struct folio *src, enum migrate_mode mode)
> > {
> > - if (folio_test_dirty(src)) {
> > - /* Only writeback folios in full synchronous migration */
> > - switch (mode) {
> > - case MIGRATE_SYNC:
> > - break;
> > - default:
> > - return -EBUSY;
> > - }
> > - return writeout(mapping, src);
> > - }
>
> Now fallback_migrate_folio() no longer writes out page for FS, so it is
> the responsibilty of migrate_folio()?
->migrate_folio() doesn't need to write out the page. It can migrate
dirty folios (just not folios currently under writeback, obviously)
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 06/11] migrate: Remove call to ->writepage
2025-03-27 16:52 ` Matthew Wilcox
@ 2025-03-27 17:22 ` Zi Yan
2025-04-01 13:32 ` David Hildenbrand
0 siblings, 1 reply; 25+ messages in thread
From: Zi Yan @ 2025-03-27 17:22 UTC (permalink / raw)
To: Matthew Wilcox
Cc: linux-fsdevel, David Hildenbrand, Joanne Koong, linux-mm,
intel-gfx
On Thu Mar 27, 2025 at 12:52 PM EDT, Matthew Wilcox wrote:
> On Thu, Mar 27, 2025 at 11:04:57AM -0400, Zi Yan wrote:
>> On Fri Mar 7, 2025 at 8:54 AM EST, Matthew Wilcox (Oracle) wrote:
>> > The writepage callback is going away; filesystems must implement
>> > migrate_folio or else dirty folios will not be migratable.
>>
>> What is the impact of this? Are there any filesystem that has
>> a_ops->writepage() without migrate_folio()? I wonder if it could make
>> the un-migratable problem worse[1] when such FS exists.
>
> As Christoph and I have been going through filesystems removing their
> ->writepage operations, we've been careful to add ->migrate_folio
> callbacks at the same time. But we haven't fixed any out-of-tree
> filesystems, and we can't fix the filesystems which will be written in
> the future.
>
> So maybe what we should do is WARN_ON_ONCE() for filesystems which
> have a ->writepages, but do not have a ->migrate_folio()?
Sounds good to me. Oh, ->writepage is removed and there is still
->writepages. Presumably, it is possible to use ->writepages in place of
->writepage in the removed writeout(), but that is meaningless since
->migrate_folio should be used.
>
>> > static int fallback_migrate_folio(struct address_space *mapping,
>> > struct folio *dst, struct folio *src, enum migrate_mode mode)
>> > {
>> > - if (folio_test_dirty(src)) {
>> > - /* Only writeback folios in full synchronous migration */
>> > - switch (mode) {
>> > - case MIGRATE_SYNC:
>> > - break;
>> > - default:
>> > - return -EBUSY;
>> > - }
>> > - return writeout(mapping, src);
>> > - }
>>
>> Now fallback_migrate_folio() no longer writes out page for FS, so it is
>> the responsibilty of migrate_folio()?
>
> ->migrate_folio() doesn't need to write out the page. It can migrate
> dirty folios (just not folios currently under writeback, obviously)
Got it. And I just noticed that Joanne's change is in
migrate_folio_unmap() for folios under writeback and irrelevant to this
change.
--
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 06/11] migrate: Remove call to ->writepage
2025-03-27 17:22 ` Zi Yan
@ 2025-04-01 13:32 ` David Hildenbrand
0 siblings, 0 replies; 25+ messages in thread
From: David Hildenbrand @ 2025-04-01 13:32 UTC (permalink / raw)
To: Zi Yan, Matthew Wilcox; +Cc: linux-fsdevel, Joanne Koong, linux-mm, intel-gfx
On 27.03.25 18:22, Zi Yan wrote:
> On Thu Mar 27, 2025 at 12:52 PM EDT, Matthew Wilcox wrote:
>> On Thu, Mar 27, 2025 at 11:04:57AM -0400, Zi Yan wrote:
>>> On Fri Mar 7, 2025 at 8:54 AM EST, Matthew Wilcox (Oracle) wrote:
>>>> The writepage callback is going away; filesystems must implement
>>>> migrate_folio or else dirty folios will not be migratable.
>>>
>>> What is the impact of this? Are there any filesystem that has
>>> a_ops->writepage() without migrate_folio()? I wonder if it could make
>>> the un-migratable problem worse[1] when such FS exists.
>>
>> As Christoph and I have been going through filesystems removing their
>> ->writepage operations, we've been careful to add ->migrate_folio
>> callbacks at the same time. But we haven't fixed any out-of-tree
>> filesystems, and we can't fix the filesystems which will be written in
>> the future.
>>
>> So maybe what we should do is WARN_ON_ONCE() for filesystems which
>> have a ->writepages, but do not have a ->migrate_folio()?
>
> Sounds good to me.
Agreed, that will also make it clear what our expectation towards
filesystems is.
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 07/11] writeback: Remove writeback_use_writepage()
2025-03-07 13:54 [PATCH 00/11] Remove aops->writepage Matthew Wilcox (Oracle)
` (5 preceding siblings ...)
2025-03-07 13:54 ` [PATCH 06/11] migrate: Remove call to ->writepage Matthew Wilcox (Oracle)
@ 2025-03-07 13:54 ` Matthew Wilcox (Oracle)
2025-03-07 13:54 ` [PATCH 08/11] shmem: Add shmem_writeout() Matthew Wilcox (Oracle)
` (4 subsequent siblings)
11 siblings, 0 replies; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2025-03-07 13:54 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Matthew Wilcox (Oracle), linux-mm, intel-gfx
The ->writepage operation is going away. Remove this alternative to
calling ->writepages.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/page-writeback.c | 28 ++--------------------------
1 file changed, 2 insertions(+), 26 deletions(-)
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 18456ddd463b..3cf7ae45be58 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2621,27 +2621,6 @@ int write_cache_pages(struct address_space *mapping,
}
EXPORT_SYMBOL(write_cache_pages);
-static int writeback_use_writepage(struct address_space *mapping,
- struct writeback_control *wbc)
-{
- struct folio *folio = NULL;
- struct blk_plug plug;
- int err;
-
- blk_start_plug(&plug);
- while ((folio = writeback_iter(mapping, wbc, folio, &err))) {
- err = mapping->a_ops->writepage(&folio->page, wbc);
- if (err == AOP_WRITEPAGE_ACTIVATE) {
- folio_unlock(folio);
- err = 0;
- }
- mapping_set_error(mapping, err);
- }
- blk_finish_plug(&plug);
-
- return err;
-}
-
int do_writepages(struct address_space *mapping, struct writeback_control *wbc)
{
int ret;
@@ -2652,14 +2631,11 @@ int do_writepages(struct address_space *mapping, struct writeback_control *wbc)
wb = inode_to_wb_wbc(mapping->host, wbc);
wb_bandwidth_estimate_start(wb);
while (1) {
- if (mapping->a_ops->writepages) {
+ if (mapping->a_ops->writepages)
ret = mapping->a_ops->writepages(mapping, wbc);
- } else if (mapping->a_ops->writepage) {
- ret = writeback_use_writepage(mapping, wbc);
- } else {
+ else
/* deal with chardevs and other special files */
ret = 0;
- }
if (ret != -ENOMEM || wbc->sync_mode != WB_SYNC_ALL)
break;
--
2.47.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 08/11] shmem: Add shmem_writeout()
2025-03-07 13:54 [PATCH 00/11] Remove aops->writepage Matthew Wilcox (Oracle)
` (6 preceding siblings ...)
2025-03-07 13:54 ` [PATCH 07/11] writeback: Remove writeback_use_writepage() Matthew Wilcox (Oracle)
@ 2025-03-07 13:54 ` Matthew Wilcox (Oracle)
2025-03-08 5:31 ` Baolin Wang
2025-03-07 13:54 ` [PATCH 09/11] i915: Use writeback_iter() Matthew Wilcox (Oracle)
` (3 subsequent siblings)
11 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2025-03-07 13:54 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Matthew Wilcox (Oracle), linux-mm, intel-gfx
This will be the replacement for shmem_writepage().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/shmem_fs.h | 7 ++++---
mm/shmem.c | 20 ++++++++++++++------
2 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 0b273a7b9f01..5f03a39a26f7 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -104,10 +104,11 @@ static inline bool shmem_mapping(struct address_space *mapping)
return false;
}
#endif /* CONFIG_SHMEM */
-extern void shmem_unlock_mapping(struct address_space *mapping);
-extern struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
+void shmem_unlock_mapping(struct address_space *mapping);
+struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
pgoff_t index, gfp_t gfp_mask);
-extern void shmem_truncate_range(struct inode *inode, loff_t start, loff_t end);
+int shmem_writeout(struct folio *folio, struct writeback_control *wbc);
+void shmem_truncate_range(struct inode *inode, loff_t start, loff_t end);
int shmem_unuse(unsigned int type);
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
diff --git a/mm/shmem.c b/mm/shmem.c
index ba162e991285..427b7f70fffb 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1536,12 +1536,20 @@ int shmem_unuse(unsigned int type)
return error;
}
-/*
- * Move the page from the page cache to the swap cache.
- */
static int shmem_writepage(struct page *page, struct writeback_control *wbc)
{
- struct folio *folio = page_folio(page);
+ return shmem_writeout(page_folio(page), wbc);
+}
+
+/**
+ * shmem_writeout - Write the folio to swap
+ * @folio: The folio to write
+ * @wbc: How writeback is to be done
+ *
+ * Move the folio from the page cache to the swap cache.
+ */
+int shmem_writeout(struct folio *folio, struct writeback_control *wbc)
+{
struct address_space *mapping = folio->mapping;
struct inode *inode = mapping->host;
struct shmem_inode_info *info = SHMEM_I(inode);
@@ -1586,9 +1594,8 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
try_split:
/* Ensure the subpages are still dirty */
folio_test_set_dirty(folio);
- if (split_huge_page_to_list_to_order(page, wbc->list, 0))
+ if (split_folio_to_list(folio, wbc->list))
goto redirty;
- folio = page_folio(page);
folio_clear_dirty(folio);
}
@@ -1660,6 +1667,7 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
folio_unlock(folio);
return 0;
}
+EXPORT_SYMBOL_GPL(shmem_writeout);
#if defined(CONFIG_NUMA) && defined(CONFIG_TMPFS)
static void shmem_show_mpol(struct seq_file *seq, struct mempolicy *mpol)
--
2.47.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 08/11] shmem: Add shmem_writeout()
2025-03-07 13:54 ` [PATCH 08/11] shmem: Add shmem_writeout() Matthew Wilcox (Oracle)
@ 2025-03-08 5:31 ` Baolin Wang
0 siblings, 0 replies; 25+ messages in thread
From: Baolin Wang @ 2025-03-08 5:31 UTC (permalink / raw)
To: Matthew Wilcox (Oracle), linux-fsdevel; +Cc: linux-mm, intel-gfx
On 2025/3/7 21:54, Matthew Wilcox (Oracle) wrote:
> This will be the replacement for shmem_writepage().
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
LGTM.
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
> include/linux/shmem_fs.h | 7 ++++---
> mm/shmem.c | 20 ++++++++++++++------
> 2 files changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
> index 0b273a7b9f01..5f03a39a26f7 100644
> --- a/include/linux/shmem_fs.h
> +++ b/include/linux/shmem_fs.h
> @@ -104,10 +104,11 @@ static inline bool shmem_mapping(struct address_space *mapping)
> return false;
> }
> #endif /* CONFIG_SHMEM */
> -extern void shmem_unlock_mapping(struct address_space *mapping);
> -extern struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
> +void shmem_unlock_mapping(struct address_space *mapping);
> +struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
> pgoff_t index, gfp_t gfp_mask);
> -extern void shmem_truncate_range(struct inode *inode, loff_t start, loff_t end);
> +int shmem_writeout(struct folio *folio, struct writeback_control *wbc);
> +void shmem_truncate_range(struct inode *inode, loff_t start, loff_t end);
> int shmem_unuse(unsigned int type);
>
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> diff --git a/mm/shmem.c b/mm/shmem.c
> index ba162e991285..427b7f70fffb 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1536,12 +1536,20 @@ int shmem_unuse(unsigned int type)
> return error;
> }
>
> -/*
> - * Move the page from the page cache to the swap cache.
> - */
> static int shmem_writepage(struct page *page, struct writeback_control *wbc)
> {
> - struct folio *folio = page_folio(page);
> + return shmem_writeout(page_folio(page), wbc);
> +}
> +
> +/**
> + * shmem_writeout - Write the folio to swap
> + * @folio: The folio to write
> + * @wbc: How writeback is to be done
> + *
> + * Move the folio from the page cache to the swap cache.
> + */
> +int shmem_writeout(struct folio *folio, struct writeback_control *wbc)
> +{
> struct address_space *mapping = folio->mapping;
> struct inode *inode = mapping->host;
> struct shmem_inode_info *info = SHMEM_I(inode);
> @@ -1586,9 +1594,8 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
> try_split:
> /* Ensure the subpages are still dirty */
> folio_test_set_dirty(folio);
> - if (split_huge_page_to_list_to_order(page, wbc->list, 0))
> + if (split_folio_to_list(folio, wbc->list))
> goto redirty;
> - folio = page_folio(page);
> folio_clear_dirty(folio);
> }
>
> @@ -1660,6 +1667,7 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
> folio_unlock(folio);
> return 0;
> }
> +EXPORT_SYMBOL_GPL(shmem_writeout);
>
> #if defined(CONFIG_NUMA) && defined(CONFIG_TMPFS)
> static void shmem_show_mpol(struct seq_file *seq, struct mempolicy *mpol)
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 09/11] i915: Use writeback_iter()
2025-03-07 13:54 [PATCH 00/11] Remove aops->writepage Matthew Wilcox (Oracle)
` (7 preceding siblings ...)
2025-03-07 13:54 ` [PATCH 08/11] shmem: Add shmem_writeout() Matthew Wilcox (Oracle)
@ 2025-03-07 13:54 ` Matthew Wilcox (Oracle)
2025-03-07 13:54 ` [PATCH 10/11] mm: Remove swap_writepage() and shmem_writepage() Matthew Wilcox (Oracle)
` (2 subsequent siblings)
11 siblings, 0 replies; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2025-03-07 13:54 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Matthew Wilcox (Oracle), linux-mm, intel-gfx
Convert from an inefficient loop to the standard writeback iterator.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 32 ++++++-----------------
1 file changed, 8 insertions(+), 24 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index ae3343c81a64..5e784db9f315 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -305,36 +305,20 @@ void __shmem_writeback(size_t size, struct address_space *mapping)
.range_end = LLONG_MAX,
.for_reclaim = 1,
};
- unsigned long i;
+ struct folio *folio = NULL;
+ int error = 0;
/*
* Leave mmapings intact (GTT will have been revoked on unbinding,
- * leaving only CPU mmapings around) and add those pages to the LRU
+ * leaving only CPU mmapings around) and add those folios to the LRU
* instead of invoking writeback so they are aged and paged out
* as normal.
*/
-
- /* Begin writeback on each dirty page */
- for (i = 0; i < size >> PAGE_SHIFT; i++) {
- struct page *page;
-
- page = find_lock_page(mapping, i);
- if (!page)
- continue;
-
- if (!page_mapped(page) && clear_page_dirty_for_io(page)) {
- int ret;
-
- SetPageReclaim(page);
- ret = mapping->a_ops->writepage(page, &wbc);
- if (!PageWriteback(page))
- ClearPageReclaim(page);
- if (!ret)
- goto put;
- }
- unlock_page(page);
-put:
- put_page(page);
+ while ((folio = writeback_iter(mapping, &wbc, folio, &error))) {
+ if (folio_mapped(folio))
+ folio_redirty_for_writepage(&wbc, folio);
+ else
+ error = shmem_writeout(folio, &wbc);
}
}
--
2.47.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 10/11] mm: Remove swap_writepage() and shmem_writepage()
2025-03-07 13:54 [PATCH 00/11] Remove aops->writepage Matthew Wilcox (Oracle)
` (8 preceding siblings ...)
2025-03-07 13:54 ` [PATCH 09/11] i915: Use writeback_iter() Matthew Wilcox (Oracle)
@ 2025-03-07 13:54 ` Matthew Wilcox (Oracle)
2025-03-08 5:34 ` Baolin Wang
2025-03-07 13:54 ` [PATCH 11/11] fs: Remove aops->writepage Matthew Wilcox (Oracle)
2025-03-28 9:40 ` [PATCH 00/11] " Christian Brauner
11 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2025-03-07 13:54 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Matthew Wilcox (Oracle), linux-mm, intel-gfx
Call swap_writeout() and shmem_writeout() from pageout() instead.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
block/blk-wbt.c | 2 +-
mm/page_io.c | 3 +--
mm/shmem.c | 23 +++++------------------
mm/swap.h | 4 ++--
mm/swap_state.c | 1 -
mm/swapfile.c | 2 +-
mm/vmscan.c | 28 ++++++++++++++++------------
7 files changed, 26 insertions(+), 37 deletions(-)
diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index f1754d07f7e0..60885731e8ab 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -37,7 +37,7 @@
enum wbt_flags {
WBT_TRACKED = 1, /* write, tracked for throttling */
WBT_READ = 2, /* read */
- WBT_SWAP = 4, /* write, from swap_writepage() */
+ WBT_SWAP = 4, /* write, from swap_writeout() */
WBT_DISCARD = 8, /* discard */
WBT_NR_BITS = 4, /* number of bits */
diff --git a/mm/page_io.c b/mm/page_io.c
index 9b983de351f9..e9151952c514 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -237,9 +237,8 @@ static void swap_zeromap_folio_clear(struct folio *folio)
* We may have stale swap cache pages in memory: notice
* them here and get rid of the unnecessary final write.
*/
-int swap_writepage(struct page *page, struct writeback_control *wbc)
+int swap_writeout(struct folio *folio, struct writeback_control *wbc)
{
- struct folio *folio = page_folio(page);
int ret;
if (folio_free_swap(folio)) {
diff --git a/mm/shmem.c b/mm/shmem.c
index 427b7f70fffb..a786b94a468a 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -98,7 +98,7 @@ static struct vfsmount *shm_mnt __ro_after_init;
#define SHORT_SYMLINK_LEN 128
/*
- * shmem_fallocate communicates with shmem_fault or shmem_writepage via
+ * shmem_fallocate communicates with shmem_fault or shmem_writeout via
* inode->i_private (with i_rwsem making sure that it has only one user at
* a time): we would prefer not to enlarge the shmem inode just for that.
*/
@@ -107,7 +107,7 @@ struct shmem_falloc {
pgoff_t start; /* start of range currently being fallocated */
pgoff_t next; /* the next page offset to be fallocated */
pgoff_t nr_falloced; /* how many new pages have been fallocated */
- pgoff_t nr_unswapped; /* how often writepage refused to swap out */
+ pgoff_t nr_unswapped; /* how often writeout refused to swap out */
};
struct shmem_options {
@@ -446,7 +446,7 @@ static void shmem_recalc_inode(struct inode *inode, long alloced, long swapped)
/*
* Special case: whereas normally shmem_recalc_inode() is called
* after i_mapping->nrpages has already been adjusted (up or down),
- * shmem_writepage() has to raise swapped before nrpages is lowered -
+ * shmem_writeout() has to raise swapped before nrpages is lowered -
* to stop a racing shmem_recalc_inode() from thinking that a page has
* been freed. Compensate here, to avoid the need for a followup call.
*/
@@ -1536,11 +1536,6 @@ int shmem_unuse(unsigned int type)
return error;
}
-static int shmem_writepage(struct page *page, struct writeback_control *wbc)
-{
- return shmem_writeout(page_folio(page), wbc);
-}
-
/**
* shmem_writeout - Write the folio to swap
* @folio: The folio to write
@@ -1558,13 +1553,6 @@ int shmem_writeout(struct folio *folio, struct writeback_control *wbc)
int nr_pages;
bool split = false;
- /*
- * Our capabilities prevent regular writeback or sync from ever calling
- * shmem_writepage; but a stacking filesystem might use ->writepage of
- * its underlying filesystem, in which case tmpfs should write out to
- * swap only in response to memory pressure, and not for the writeback
- * threads or sync.
- */
if (WARN_ON_ONCE(!wbc->for_reclaim))
goto redirty;
@@ -1653,7 +1641,7 @@ int shmem_writeout(struct folio *folio, struct writeback_control *wbc)
mutex_unlock(&shmem_swaplist_mutex);
BUG_ON(folio_mapped(folio));
- return swap_writepage(&folio->page, wbc);
+ return swap_writeout(folio, wbc);
}
list_del_init(&info->swaplist);
@@ -3780,7 +3768,7 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset,
index--;
/*
- * Inform shmem_writepage() how far we have reached.
+ * Inform shmem_writeout() how far we have reached.
* No need for lock or barrier: we have the page lock.
*/
if (!folio_test_uptodate(folio))
@@ -5203,7 +5191,6 @@ static int shmem_error_remove_folio(struct address_space *mapping,
}
static const struct address_space_operations shmem_aops = {
- .writepage = shmem_writepage,
.dirty_folio = noop_dirty_folio,
#ifdef CONFIG_TMPFS
.write_begin = shmem_write_begin,
diff --git a/mm/swap.h b/mm/swap.h
index 6f4a3f927edb..aa62463976d5 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -20,7 +20,7 @@ static inline void swap_read_unplug(struct swap_iocb *plug)
__swap_read_unplug(plug);
}
void swap_write_unplug(struct swap_iocb *sio);
-int swap_writepage(struct page *page, struct writeback_control *wbc);
+int swap_writeout(struct folio *folio, struct writeback_control *wbc);
void __swap_writepage(struct folio *folio, struct writeback_control *wbc);
/* linux/mm/swap_state.c */
@@ -141,7 +141,7 @@ static inline struct folio *swapin_readahead(swp_entry_t swp, gfp_t gfp_mask,
return NULL;
}
-static inline int swap_writepage(struct page *p, struct writeback_control *wbc)
+static inline int swap_writeout(struct folio *f, struct writeback_control *wbc)
{
return 0;
}
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 68fd981b514f..ec2b1c9c9926 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -30,7 +30,6 @@
* vmscan's shrink_folio_list.
*/
static const struct address_space_operations swap_aops = {
- .writepage = swap_writepage,
.dirty_folio = noop_dirty_folio,
#ifdef CONFIG_MIGRATION
.migrate_folio = migrate_folio,
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 628f67974a7c..60c994f84842 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2360,7 +2360,7 @@ static int try_to_unuse(unsigned int type)
* Limit the number of retries? No: when mmget_not_zero()
* above fails, that mm is likely to be freeing swap from
* exit_mmap(), which proceeds at its own independent pace;
- * and even shmem_writepage() could have been preempted after
+ * and even shmem_writeout() could have been preempted after
* folio_alloc_swap(), temporarily hiding that swap. It's easy
* and robust (though cpu-intensive) just to keep retrying.
*/
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 34410d24dc15..e9f84fa31b9a 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -648,16 +648,16 @@ typedef enum {
static pageout_t pageout(struct folio *folio, struct address_space *mapping,
struct swap_iocb **plug, struct list_head *folio_list)
{
+ int (*writeout)(struct folio *, struct writeback_control *);
+
/*
- * If the folio is dirty, only perform writeback if that write
- * will be non-blocking. To prevent this allocation from being
- * stalled by pagecache activity. But note that there may be
- * stalls if we need to run get_block(). We could test
- * PagePrivate for that.
- *
- * If this process is currently in __generic_file_write_iter() against
- * this folio's queue, we can perform writeback even if that
- * will block.
+ * We no longer attempt to writeback filesystem folios here, other
+ * than tmpfs/shmem. That's taken care of in page-writeback.
+ * If we find a dirty filesystem folio at the end of the LRU list,
+ * typically that means the filesystem is saturating the storage
+ * with contiguous writes and telling it to write a folio here
+ * would only make the situation worse by injecting an element
+ * of random access.
*
* If the folio is swapcache, write it back even if that would
* block, for some throttling. This happens by accident, because
@@ -680,7 +680,11 @@ static pageout_t pageout(struct folio *folio, struct address_space *mapping,
}
return PAGE_KEEP;
}
- if (mapping->a_ops->writepage == NULL)
+ if (shmem_mapping(mapping))
+ writeout = shmem_writeout;
+ else if (folio_test_anon(folio))
+ writeout = swap_writeout;
+ else
return PAGE_ACTIVATE;
if (folio_clear_dirty_for_io(folio)) {
@@ -703,7 +707,7 @@ static pageout_t pageout(struct folio *folio, struct address_space *mapping,
wbc.list = folio_list;
folio_set_reclaim(folio);
- res = mapping->a_ops->writepage(&folio->page, &wbc);
+ res = writeout(folio, &wbc);
if (res < 0)
handle_write_error(mapping, folio, res);
if (res == AOP_WRITEPAGE_ACTIVATE) {
@@ -712,7 +716,7 @@ static pageout_t pageout(struct folio *folio, struct address_space *mapping,
}
if (!folio_test_writeback(folio)) {
- /* synchronous write or broken a_ops? */
+ /* synchronous write? */
folio_clear_reclaim(folio);
}
trace_mm_vmscan_write_folio(folio);
--
2.47.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 11/11] fs: Remove aops->writepage
2025-03-07 13:54 [PATCH 00/11] Remove aops->writepage Matthew Wilcox (Oracle)
` (9 preceding siblings ...)
2025-03-07 13:54 ` [PATCH 10/11] mm: Remove swap_writepage() and shmem_writepage() Matthew Wilcox (Oracle)
@ 2025-03-07 13:54 ` Matthew Wilcox (Oracle)
2025-03-17 1:08 ` Fan Ni
2025-03-28 9:40 ` [PATCH 00/11] " Christian Brauner
11 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox (Oracle) @ 2025-03-07 13:54 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Matthew Wilcox (Oracle), linux-mm, intel-gfx
All callers and implementations are now removed, so remove the operation
and update the documentation to match.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
Documentation/admin-guide/cgroup-v2.rst | 2 +-
Documentation/filesystems/fscrypt.rst | 2 +-
Documentation/filesystems/locking.rst | 54 +------------------------
Documentation/filesystems/vfs.rst | 39 +++++-------------
fs/buffer.c | 4 +-
include/linux/fs.h | 1 -
mm/vmscan.c | 1 -
7 files changed, 15 insertions(+), 88 deletions(-)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 77d80a7e975b..4e10b4084381 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -3028,7 +3028,7 @@ Filesystem Support for Writeback
--------------------------------
A filesystem can support cgroup writeback by updating
-address_space_operations->writepage[s]() to annotate bio's using the
+address_space_operations->writepages() to annotate bio's using the
following two functions.
wbc_init_bio(@wbc, @bio)
diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index e80329908549..3d22e2db732d 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -1409,7 +1409,7 @@ read the ciphertext into the page cache and decrypt it in-place. The
folio lock must be held until decryption has finished, to prevent the
folio from becoming visible to userspace prematurely.
-For the write path (->writepage()) of regular files, filesystems
+For the write path (->writepages()) of regular files, filesystems
cannot encrypt data in-place in the page cache, since the cached
plaintext must be preserved. Instead, filesystems must encrypt into a
temporary buffer or "bounce page", then write out the temporary
diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 0ec0bb6eb0fb..2e567e341c3b 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -249,7 +249,6 @@ address_space_operations
========================
prototypes::
- int (*writepage)(struct page *page, struct writeback_control *wbc);
int (*read_folio)(struct file *, struct folio *);
int (*writepages)(struct address_space *, struct writeback_control *);
bool (*dirty_folio)(struct address_space *, struct folio *folio);
@@ -280,7 +279,6 @@ locking rules:
====================== ======================== ========= ===============
ops folio locked i_rwsem invalidate_lock
====================== ======================== ========= ===============
-writepage: yes, unlocks (see below)
read_folio: yes, unlocks shared
writepages:
dirty_folio: maybe
@@ -309,54 +307,6 @@ completion.
->readahead() unlocks the folios that I/O is attempted on like ->read_folio().
-->writepage() is used for two purposes: for "memory cleansing" and for
-"sync". These are quite different operations and the behaviour may differ
-depending upon the mode.
-
-If writepage is called for sync (wbc->sync_mode != WBC_SYNC_NONE) then
-it *must* start I/O against the page, even if that would involve
-blocking on in-progress I/O.
-
-If writepage is called for memory cleansing (sync_mode ==
-WBC_SYNC_NONE) then its role is to get as much writeout underway as
-possible. So writepage should try to avoid blocking against
-currently-in-progress I/O.
-
-If the filesystem is not called for "sync" and it determines that it
-would need to block against in-progress I/O to be able to start new I/O
-against the page the filesystem should redirty the page with
-redirty_page_for_writepage(), then unlock the page and return zero.
-This may also be done to avoid internal deadlocks, but rarely.
-
-If the filesystem is called for sync then it must wait on any
-in-progress I/O and then start new I/O.
-
-The filesystem should unlock the page synchronously, before returning to the
-caller, unless ->writepage() returns special WRITEPAGE_ACTIVATE
-value. WRITEPAGE_ACTIVATE means that page cannot really be written out
-currently, and VM should stop calling ->writepage() on this page for some
-time. VM does this by moving page to the head of the active list, hence the
-name.
-
-Unless the filesystem is going to redirty_page_for_writepage(), unlock the page
-and return zero, writepage *must* run set_page_writeback() against the page,
-followed by unlocking it. Once set_page_writeback() has been run against the
-page, write I/O can be submitted and the write I/O completion handler must run
-end_page_writeback() once the I/O is complete. If no I/O is submitted, the
-filesystem must run end_page_writeback() against the page before returning from
-writepage.
-
-That is: after 2.5.12, pages which are under writeout are *not* locked. Note,
-if the filesystem needs the page to be locked during writeout, that is ok, too,
-the page is allowed to be unlocked at any point in time between the calls to
-set_page_writeback() and end_page_writeback().
-
-Note, failure to run either redirty_page_for_writepage() or the combination of
-set_page_writeback()/end_page_writeback() on a page submitted to writepage
-will leave the page itself marked clean but it will be tagged as dirty in the
-radix tree. This incoherency can lead to all sorts of hard-to-debug problems
-in the filesystem like having dirty inodes at umount and losing written data.
-
->writepages() is used for periodic writeback and for syscall-initiated
sync operations. The address_space should start I/O against at least
``*nr_to_write`` pages. ``*nr_to_write`` must be decremented for each page
@@ -364,8 +314,8 @@ which is written. The address_space implementation may write more (or less)
pages than ``*nr_to_write`` asks for, but it should try to be reasonably close.
If nr_to_write is NULL, all dirty pages must be written.
-writepages should _only_ write pages which are present on
-mapping->io_pages.
+writepages should _only_ write pages which are present in
+mapping->i_pages.
->dirty_folio() is called from various places in the kernel when
the target folio is marked as needing writeback. The folio cannot be
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index ae79c30b6c0c..f66a4e706b17 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -716,9 +716,8 @@ page lookup by address, and keeping track of pages tagged as Dirty or
Writeback.
The first can be used independently to the others. The VM can try to
-either write dirty pages in order to clean them, or release clean pages
-in order to reuse them. To do this it can call the ->writepage method
-on dirty pages, and ->release_folio on clean folios with the private
+release clean pages in order to reuse them. To do this it can call
+->release_folio on clean folios with the private
flag set. Clean pages without PagePrivate and with no external references
will be released without notice being given to the address_space.
@@ -731,8 +730,8 @@ maintains information about the PG_Dirty and PG_Writeback status of each
page, so that pages with either of these flags can be found quickly.
The Dirty tag is primarily used by mpage_writepages - the default
-->writepages method. It uses the tag to find dirty pages to call
-->writepage on. If mpage_writepages is not used (i.e. the address
+->writepages method. It uses the tag to find dirty pages to
+write back. If mpage_writepages is not used (i.e. the address
provides its own ->writepages) , the PAGECACHE_TAG_DIRTY tag is almost
unused. write_inode_now and sync_inode do use it (through
__sync_single_inode) to check if ->writepages has been successful in
@@ -756,23 +755,23 @@ pages, however the address_space has finer control of write sizes.
The read process essentially only requires 'read_folio'. The write
process is more complicated and uses write_begin/write_end or
-dirty_folio to write data into the address_space, and writepage and
+dirty_folio to write data into the address_space, and
writepages to writeback data to storage.
Adding and removing pages to/from an address_space is protected by the
inode's i_mutex.
When data is written to a page, the PG_Dirty flag should be set. It
-typically remains set until writepage asks for it to be written. This
+typically remains set until writepages asks for it to be written. This
should clear PG_Dirty and set PG_Writeback. It can be actually written
at any point after PG_Dirty is clear. Once it is known to be safe,
PG_Writeback is cleared.
Writeback makes use of a writeback_control structure to direct the
-operations. This gives the writepage and writepages operations some
+operations. This gives the writepages operation some
information about the nature of and reason for the writeback request,
and the constraints under which it is being done. It is also used to
-return information back to the caller about the result of a writepage or
+return information back to the caller about the result of a
writepages request.
@@ -819,7 +818,6 @@ cache in your filesystem. The following members are defined:
.. code-block:: c
struct address_space_operations {
- int (*writepage)(struct page *page, struct writeback_control *wbc);
int (*read_folio)(struct file *, struct folio *);
int (*writepages)(struct address_space *, struct writeback_control *);
bool (*dirty_folio)(struct address_space *, struct folio *);
@@ -848,25 +846,6 @@ cache in your filesystem. The following members are defined:
int (*swap_rw)(struct kiocb *iocb, struct iov_iter *iter);
};
-``writepage``
- called by the VM to write a dirty page to backing store. This
- may happen for data integrity reasons (i.e. 'sync'), or to free
- up memory (flush). The difference can be seen in
- wbc->sync_mode. The PG_Dirty flag has been cleared and
- PageLocked is true. writepage should start writeout, should set
- PG_Writeback, and should make sure the page is unlocked, either
- synchronously or asynchronously when the write operation
- completes.
-
- If wbc->sync_mode is WB_SYNC_NONE, ->writepage doesn't have to
- try too hard if there are problems, and may choose to write out
- other pages from the mapping if that is easier (e.g. due to
- internal dependencies). If it chooses not to start writeout, it
- should return AOP_WRITEPAGE_ACTIVATE so that the VM will not
- keep calling ->writepage on that page.
-
- See the file "Locking" for more details.
-
``read_folio``
Called by the page cache to read a folio from the backing store.
The 'file' argument supplies authentication information to network
@@ -909,7 +888,7 @@ cache in your filesystem. The following members are defined:
given and that many pages should be written if possible. If no
->writepages is given, then mpage_writepages is used instead.
This will choose pages from the address space that are tagged as
- DIRTY and will pass them to ->writepage.
+ DIRTY and will write them back.
``dirty_folio``
called by the VM to mark a folio as dirty. This is particularly
diff --git a/fs/buffer.c b/fs/buffer.c
index c7abb4a029dc..b99dc69dba37 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2695,7 +2695,7 @@ int block_truncate_page(struct address_space *mapping,
EXPORT_SYMBOL(block_truncate_page);
/*
- * The generic ->writepage function for buffer-backed address_spaces
+ * The generic write folio function for buffer-backed address_spaces
*/
int block_write_full_folio(struct folio *folio, struct writeback_control *wbc,
void *get_block)
@@ -2715,7 +2715,7 @@ int block_write_full_folio(struct folio *folio, struct writeback_control *wbc,
/*
* The folio straddles i_size. It must be zeroed out on each and every
- * writepage invocation because it may be mmapped. "A file is mapped
+ * writeback invocation because it may be mmapped. "A file is mapped
* in multiples of the page size. For a file that is not a multiple of
* the page size, the remaining memory is zeroed when mapped, and
* writes to that region are not written out to the file."
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 110d95d04299..26ce65c4a003 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -433,7 +433,6 @@ static inline bool is_sync_kiocb(struct kiocb *kiocb)
}
struct address_space_operations {
- int (*writepage)(struct page *page, struct writeback_control *wbc);
int (*read_folio)(struct file *, struct folio *);
/* Write back some dirty pages from this mapping. */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index e9f84fa31b9a..7e79ca975c9d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -643,7 +643,6 @@ typedef enum {
/*
* pageout is called by shrink_folio_list() for each dirty folio.
- * Calls ->writepage().
*/
static pageout_t pageout(struct folio *folio, struct address_space *mapping,
struct swap_iocb **plug, struct list_head *folio_list)
--
2.47.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 11/11] fs: Remove aops->writepage
2025-03-07 13:54 ` [PATCH 11/11] fs: Remove aops->writepage Matthew Wilcox (Oracle)
@ 2025-03-17 1:08 ` Fan Ni
2025-03-17 3:22 ` Matthew Wilcox
0 siblings, 1 reply; 25+ messages in thread
From: Fan Ni @ 2025-03-17 1:08 UTC (permalink / raw)
To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm, intel-gfx
On Fri, Mar 07, 2025 at 01:54:11PM +0000, Matthew Wilcox (Oracle) wrote:
> All callers and implementations are now removed, so remove the operation
> and update the documentation to match.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
Hi Matthew,
Tried to apply the remaining patches in the patchest (Patch 5-11) which
have not picked up by linux-next. It seems we have more to cleanup.
For example, I hit the following issue when try to compile
----------------------------------------------------------------
drivers/gpu/drm/ttm/ttm_backup.c: In function ‘ttm_backup_backup_page’:
drivers/gpu/drm/ttm/ttm_backup.c:139:39: error: ‘const struct address_space_operations’ has no member named ‘writepage’; did you mean ‘writepages’?
139 | ret = mapping->a_ops->writepage(folio_file_page(to_folio, idx), &wbc);
| ^~~~~~~~~
| writepages
----------------------------------------------------------------
Fan
> Documentation/admin-guide/cgroup-v2.rst | 2 +-
> Documentation/filesystems/fscrypt.rst | 2 +-
> Documentation/filesystems/locking.rst | 54 +------------------------
> Documentation/filesystems/vfs.rst | 39 +++++-------------
> fs/buffer.c | 4 +-
> include/linux/fs.h | 1 -
> mm/vmscan.c | 1 -
> 7 files changed, 15 insertions(+), 88 deletions(-)
>
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 77d80a7e975b..4e10b4084381 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -3028,7 +3028,7 @@ Filesystem Support for Writeback
> --------------------------------
>
> A filesystem can support cgroup writeback by updating
> -address_space_operations->writepage[s]() to annotate bio's using the
> +address_space_operations->writepages() to annotate bio's using the
> following two functions.
>
> wbc_init_bio(@wbc, @bio)
> diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
> index e80329908549..3d22e2db732d 100644
> --- a/Documentation/filesystems/fscrypt.rst
> +++ b/Documentation/filesystems/fscrypt.rst
> @@ -1409,7 +1409,7 @@ read the ciphertext into the page cache and decrypt it in-place. The
> folio lock must be held until decryption has finished, to prevent the
> folio from becoming visible to userspace prematurely.
>
> -For the write path (->writepage()) of regular files, filesystems
> +For the write path (->writepages()) of regular files, filesystems
> cannot encrypt data in-place in the page cache, since the cached
> plaintext must be preserved. Instead, filesystems must encrypt into a
> temporary buffer or "bounce page", then write out the temporary
> diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
> index 0ec0bb6eb0fb..2e567e341c3b 100644
> --- a/Documentation/filesystems/locking.rst
> +++ b/Documentation/filesystems/locking.rst
> @@ -249,7 +249,6 @@ address_space_operations
> ========================
> prototypes::
>
> - int (*writepage)(struct page *page, struct writeback_control *wbc);
> int (*read_folio)(struct file *, struct folio *);
> int (*writepages)(struct address_space *, struct writeback_control *);
> bool (*dirty_folio)(struct address_space *, struct folio *folio);
> @@ -280,7 +279,6 @@ locking rules:
> ====================== ======================== ========= ===============
> ops folio locked i_rwsem invalidate_lock
> ====================== ======================== ========= ===============
> -writepage: yes, unlocks (see below)
> read_folio: yes, unlocks shared
> writepages:
> dirty_folio: maybe
> @@ -309,54 +307,6 @@ completion.
>
> ->readahead() unlocks the folios that I/O is attempted on like ->read_folio().
>
> -->writepage() is used for two purposes: for "memory cleansing" and for
> -"sync". These are quite different operations and the behaviour may differ
> -depending upon the mode.
> -
> -If writepage is called for sync (wbc->sync_mode != WBC_SYNC_NONE) then
> -it *must* start I/O against the page, even if that would involve
> -blocking on in-progress I/O.
> -
> -If writepage is called for memory cleansing (sync_mode ==
> -WBC_SYNC_NONE) then its role is to get as much writeout underway as
> -possible. So writepage should try to avoid blocking against
> -currently-in-progress I/O.
> -
> -If the filesystem is not called for "sync" and it determines that it
> -would need to block against in-progress I/O to be able to start new I/O
> -against the page the filesystem should redirty the page with
> -redirty_page_for_writepage(), then unlock the page and return zero.
> -This may also be done to avoid internal deadlocks, but rarely.
> -
> -If the filesystem is called for sync then it must wait on any
> -in-progress I/O and then start new I/O.
> -
> -The filesystem should unlock the page synchronously, before returning to the
> -caller, unless ->writepage() returns special WRITEPAGE_ACTIVATE
> -value. WRITEPAGE_ACTIVATE means that page cannot really be written out
> -currently, and VM should stop calling ->writepage() on this page for some
> -time. VM does this by moving page to the head of the active list, hence the
> -name.
> -
> -Unless the filesystem is going to redirty_page_for_writepage(), unlock the page
> -and return zero, writepage *must* run set_page_writeback() against the page,
> -followed by unlocking it. Once set_page_writeback() has been run against the
> -page, write I/O can be submitted and the write I/O completion handler must run
> -end_page_writeback() once the I/O is complete. If no I/O is submitted, the
> -filesystem must run end_page_writeback() against the page before returning from
> -writepage.
> -
> -That is: after 2.5.12, pages which are under writeout are *not* locked. Note,
> -if the filesystem needs the page to be locked during writeout, that is ok, too,
> -the page is allowed to be unlocked at any point in time between the calls to
> -set_page_writeback() and end_page_writeback().
> -
> -Note, failure to run either redirty_page_for_writepage() or the combination of
> -set_page_writeback()/end_page_writeback() on a page submitted to writepage
> -will leave the page itself marked clean but it will be tagged as dirty in the
> -radix tree. This incoherency can lead to all sorts of hard-to-debug problems
> -in the filesystem like having dirty inodes at umount and losing written data.
> -
> ->writepages() is used for periodic writeback and for syscall-initiated
> sync operations. The address_space should start I/O against at least
> ``*nr_to_write`` pages. ``*nr_to_write`` must be decremented for each page
> @@ -364,8 +314,8 @@ which is written. The address_space implementation may write more (or less)
> pages than ``*nr_to_write`` asks for, but it should try to be reasonably close.
> If nr_to_write is NULL, all dirty pages must be written.
>
> -writepages should _only_ write pages which are present on
> -mapping->io_pages.
> +writepages should _only_ write pages which are present in
> +mapping->i_pages.
>
> ->dirty_folio() is called from various places in the kernel when
> the target folio is marked as needing writeback. The folio cannot be
> diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
> index ae79c30b6c0c..f66a4e706b17 100644
> --- a/Documentation/filesystems/vfs.rst
> +++ b/Documentation/filesystems/vfs.rst
> @@ -716,9 +716,8 @@ page lookup by address, and keeping track of pages tagged as Dirty or
> Writeback.
>
> The first can be used independently to the others. The VM can try to
> -either write dirty pages in order to clean them, or release clean pages
> -in order to reuse them. To do this it can call the ->writepage method
> -on dirty pages, and ->release_folio on clean folios with the private
> +release clean pages in order to reuse them. To do this it can call
> +->release_folio on clean folios with the private
> flag set. Clean pages without PagePrivate and with no external references
> will be released without notice being given to the address_space.
>
> @@ -731,8 +730,8 @@ maintains information about the PG_Dirty and PG_Writeback status of each
> page, so that pages with either of these flags can be found quickly.
>
> The Dirty tag is primarily used by mpage_writepages - the default
> -->writepages method. It uses the tag to find dirty pages to call
> -->writepage on. If mpage_writepages is not used (i.e. the address
> +->writepages method. It uses the tag to find dirty pages to
> +write back. If mpage_writepages is not used (i.e. the address
> provides its own ->writepages) , the PAGECACHE_TAG_DIRTY tag is almost
> unused. write_inode_now and sync_inode do use it (through
> __sync_single_inode) to check if ->writepages has been successful in
> @@ -756,23 +755,23 @@ pages, however the address_space has finer control of write sizes.
>
> The read process essentially only requires 'read_folio'. The write
> process is more complicated and uses write_begin/write_end or
> -dirty_folio to write data into the address_space, and writepage and
> +dirty_folio to write data into the address_space, and
> writepages to writeback data to storage.
>
> Adding and removing pages to/from an address_space is protected by the
> inode's i_mutex.
>
> When data is written to a page, the PG_Dirty flag should be set. It
> -typically remains set until writepage asks for it to be written. This
> +typically remains set until writepages asks for it to be written. This
> should clear PG_Dirty and set PG_Writeback. It can be actually written
> at any point after PG_Dirty is clear. Once it is known to be safe,
> PG_Writeback is cleared.
>
> Writeback makes use of a writeback_control structure to direct the
> -operations. This gives the writepage and writepages operations some
> +operations. This gives the writepages operation some
> information about the nature of and reason for the writeback request,
> and the constraints under which it is being done. It is also used to
> -return information back to the caller about the result of a writepage or
> +return information back to the caller about the result of a
> writepages request.
>
>
> @@ -819,7 +818,6 @@ cache in your filesystem. The following members are defined:
> .. code-block:: c
>
> struct address_space_operations {
> - int (*writepage)(struct page *page, struct writeback_control *wbc);
> int (*read_folio)(struct file *, struct folio *);
> int (*writepages)(struct address_space *, struct writeback_control *);
> bool (*dirty_folio)(struct address_space *, struct folio *);
> @@ -848,25 +846,6 @@ cache in your filesystem. The following members are defined:
> int (*swap_rw)(struct kiocb *iocb, struct iov_iter *iter);
> };
>
> -``writepage``
> - called by the VM to write a dirty page to backing store. This
> - may happen for data integrity reasons (i.e. 'sync'), or to free
> - up memory (flush). The difference can be seen in
> - wbc->sync_mode. The PG_Dirty flag has been cleared and
> - PageLocked is true. writepage should start writeout, should set
> - PG_Writeback, and should make sure the page is unlocked, either
> - synchronously or asynchronously when the write operation
> - completes.
> -
> - If wbc->sync_mode is WB_SYNC_NONE, ->writepage doesn't have to
> - try too hard if there are problems, and may choose to write out
> - other pages from the mapping if that is easier (e.g. due to
> - internal dependencies). If it chooses not to start writeout, it
> - should return AOP_WRITEPAGE_ACTIVATE so that the VM will not
> - keep calling ->writepage on that page.
> -
> - See the file "Locking" for more details.
> -
> ``read_folio``
> Called by the page cache to read a folio from the backing store.
> The 'file' argument supplies authentication information to network
> @@ -909,7 +888,7 @@ cache in your filesystem. The following members are defined:
> given and that many pages should be written if possible. If no
> ->writepages is given, then mpage_writepages is used instead.
> This will choose pages from the address space that are tagged as
> - DIRTY and will pass them to ->writepage.
> + DIRTY and will write them back.
>
> ``dirty_folio``
> called by the VM to mark a folio as dirty. This is particularly
> diff --git a/fs/buffer.c b/fs/buffer.c
> index c7abb4a029dc..b99dc69dba37 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -2695,7 +2695,7 @@ int block_truncate_page(struct address_space *mapping,
> EXPORT_SYMBOL(block_truncate_page);
>
> /*
> - * The generic ->writepage function for buffer-backed address_spaces
> + * The generic write folio function for buffer-backed address_spaces
> */
> int block_write_full_folio(struct folio *folio, struct writeback_control *wbc,
> void *get_block)
> @@ -2715,7 +2715,7 @@ int block_write_full_folio(struct folio *folio, struct writeback_control *wbc,
>
> /*
> * The folio straddles i_size. It must be zeroed out on each and every
> - * writepage invocation because it may be mmapped. "A file is mapped
> + * writeback invocation because it may be mmapped. "A file is mapped
> * in multiples of the page size. For a file that is not a multiple of
> * the page size, the remaining memory is zeroed when mapped, and
> * writes to that region are not written out to the file."
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 110d95d04299..26ce65c4a003 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -433,7 +433,6 @@ static inline bool is_sync_kiocb(struct kiocb *kiocb)
> }
>
> struct address_space_operations {
> - int (*writepage)(struct page *page, struct writeback_control *wbc);
> int (*read_folio)(struct file *, struct folio *);
>
> /* Write back some dirty pages from this mapping. */
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index e9f84fa31b9a..7e79ca975c9d 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -643,7 +643,6 @@ typedef enum {
>
> /*
> * pageout is called by shrink_folio_list() for each dirty folio.
> - * Calls ->writepage().
> */
> static pageout_t pageout(struct folio *folio, struct address_space *mapping,
> struct swap_iocb **plug, struct list_head *folio_list)
> --
> 2.47.2
>
--
Fan Ni
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 11/11] fs: Remove aops->writepage
2025-03-17 1:08 ` Fan Ni
@ 2025-03-17 3:22 ` Matthew Wilcox
2025-03-17 22:30 ` Matthew Wilcox
0 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox @ 2025-03-17 3:22 UTC (permalink / raw)
To: Fan Ni, Thomas Hellström; +Cc: linux-fsdevel, linux-mm, intel-gfx
On Sun, Mar 16, 2025 at 06:08:52PM -0700, Fan Ni wrote:
> On Fri, Mar 07, 2025 at 01:54:11PM +0000, Matthew Wilcox (Oracle) wrote:
> > All callers and implementations are now removed, so remove the operation
> > and update the documentation to match.
> >
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
>
> Hi Matthew,
>
> Tried to apply the remaining patches in the patchest (Patch 5-11) which
> have not picked up by linux-next. It seems we have more to cleanup.
>
> For example, I hit the following issue when try to compile
> ----------------------------------------------------------------
> drivers/gpu/drm/ttm/ttm_backup.c: In function ‘ttm_backup_backup_page’:
> drivers/gpu/drm/ttm/ttm_backup.c:139:39: error: ‘const struct address_space_operations’ has no member named ‘writepage’; did you mean ‘writepages’?
> 139 | ret = mapping->a_ops->writepage(folio_file_page(to_folio, idx), &wbc);
Looks like that was added to linux-next after I completed the removal of
->writepage. Thomas, what's going on here?
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 11/11] fs: Remove aops->writepage
2025-03-17 3:22 ` Matthew Wilcox
@ 2025-03-17 22:30 ` Matthew Wilcox
2025-03-18 8:10 ` Thomas Hellström
0 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox @ 2025-03-17 22:30 UTC (permalink / raw)
To: Fan Ni, Thomas Hellström; +Cc: linux-fsdevel, linux-mm, intel-gfx
On Mon, Mar 17, 2025 at 03:22:30AM +0000, Matthew Wilcox wrote:
> On Sun, Mar 16, 2025 at 06:08:52PM -0700, Fan Ni wrote:
> > On Fri, Mar 07, 2025 at 01:54:11PM +0000, Matthew Wilcox (Oracle) wrote:
> > > All callers and implementations are now removed, so remove the operation
> > > and update the documentation to match.
> > >
> > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > ---
> >
> > Hi Matthew,
> >
> > Tried to apply the remaining patches in the patchest (Patch 5-11) which
> > have not picked up by linux-next. It seems we have more to cleanup.
> >
> > For example, I hit the following issue when try to compile
> > ----------------------------------------------------------------
> > drivers/gpu/drm/ttm/ttm_backup.c: In function ‘ttm_backup_backup_page’:
> > drivers/gpu/drm/ttm/ttm_backup.c:139:39: error: ‘const struct address_space_operations’ has no member named ‘writepage’; did you mean ‘writepages’?
> > 139 | ret = mapping->a_ops->writepage(folio_file_page(to_folio, idx), &wbc);
>
> Looks like that was added to linux-next after I completed the removal of
> ->writepage. Thomas, what's going on here?
This patch fixes the compilation problem. But I don't understand why
it's messing with the reclaim flag. Thomas, can you explain?
+++ b/drivers/gpu/drm/ttm/ttm_backup.c
@@ -136,13 +136,13 @@ ttm_backup_backup_page(struct ttm_backup *backup, struct page *page,
.for_reclaim = 1,
};
folio_set_reclaim(to_folio);
- ret = mapping->a_ops->writepage(folio_file_page(to_folio, idx), &wbc);
+ ret = shmem_writeout(to_folio, &wbc);
if (!folio_test_writeback(to_folio))
folio_clear_reclaim(to_folio);
/*
- * If writepage succeeds, it unlocks the folio.
- * writepage() errors are otherwise dropped, since writepage()
- * is only best effort here.
+ * If writeout succeeds, it unlocks the folio. errors
+ * are otherwise dropped, since writeout is only best
+ * effort here.
*/
if (ret)
folio_unlock(to_folio);
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 11/11] fs: Remove aops->writepage
2025-03-17 22:30 ` Matthew Wilcox
@ 2025-03-18 8:10 ` Thomas Hellström
2025-04-01 16:26 ` Matthew Wilcox
0 siblings, 1 reply; 25+ messages in thread
From: Thomas Hellström @ 2025-03-18 8:10 UTC (permalink / raw)
To: Matthew Wilcox, Fan Ni; +Cc: linux-fsdevel, linux-mm, intel-gfx
On Mon, 2025-03-17 at 22:30 +0000, Matthew Wilcox wrote:
> On Mon, Mar 17, 2025 at 03:22:30AM +0000, Matthew Wilcox wrote:
> > On Sun, Mar 16, 2025 at 06:08:52PM -0700, Fan Ni wrote:
> > > On Fri, Mar 07, 2025 at 01:54:11PM +0000, Matthew Wilcox (Oracle)
> > > wrote:
> > > > All callers and implementations are now removed, so remove the
> > > > operation
> > > > and update the documentation to match.
> > > >
> > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > > ---
> > >
> > > Hi Matthew,
> > >
> > > Tried to apply the remaining patches in the patchest (Patch 5-
> > > 11) which
> > > have not picked up by linux-next. It seems we have more to
> > > cleanup.
> > >
> > > For example, I hit the following issue when try to compile
> > > ----------------------------------------------------------------
> > > drivers/gpu/drm/ttm/ttm_backup.c: In function
> > > ‘ttm_backup_backup_page’:
> > > drivers/gpu/drm/ttm/ttm_backup.c:139:39: error: ‘const struct
> > > address_space_operations’ has no member named ‘writepage’; did
> > > you mean ‘writepages’?
> > > 139 | ret = mapping->a_ops-
> > > >writepage(folio_file_page(to_folio, idx), &wbc);
> >
> > Looks like that was added to linux-next after I completed the
> > removal of
> > ->writepage. Thomas, what's going on here?
>
> This patch fixes the compilation problem. But I don't understand why
> it's messing with the reclaim flag. Thomas, can you explain?
Hi, Sorry for not responding earlier. The patch that uses writepage()
here has been around for quite some time waiting for reviews / acks so
I failed to notice that it's going away.
Anyway the reclaim flag clearing follows that of pageout() in vmscan.c
which was also the case for the i915_gem_shmem.c usage in
__shmem_writeback(). My understanding was that if the writeback was
already completed at that point, the reclaim flag was no longer
desirable.
Let me know if this requires some action on my side. Unfortunately
freedesktop.org is down for maintainance, possibly for the whole week,
so there will be no drm subsystem PRs this week AFAICT.
The fix below looks good to me, BTW.
Thanks,
Thomas
>
> +++ b/drivers/gpu/drm/ttm/ttm_backup.c
> @@ -136,13 +136,13 @@ ttm_backup_backup_page(struct ttm_backup
> *backup, struct page *page,
> .for_reclaim = 1,
> };
> folio_set_reclaim(to_folio);
> - ret = mapping->a_ops-
> >writepage(folio_file_page(to_folio, idx), &wbc);
> + ret = shmem_writeout(to_folio, &wbc);
> if (!folio_test_writeback(to_folio))
> folio_clear_reclaim(to_folio);
> /*
> - * If writepage succeeds, it unlocks the folio.
> - * writepage() errors are otherwise dropped, since
> writepage()
> - * is only best effort here.
> + * If writeout succeeds, it unlocks the folio.
> errors
> + * are otherwise dropped, since writeout is only best
> + * effort here.
> */
> if (ret)
> folio_unlock(to_folio);
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 11/11] fs: Remove aops->writepage
2025-03-18 8:10 ` Thomas Hellström
@ 2025-04-01 16:26 ` Matthew Wilcox
2025-05-02 14:33 ` Thomas Hellström
0 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox @ 2025-04-01 16:26 UTC (permalink / raw)
To: Thomas Hellström; +Cc: Fan Ni, linux-fsdevel, linux-mm, intel-gfx
On Tue, Mar 18, 2025 at 09:10:38AM +0100, Thomas Hellström wrote:
> On Mon, 2025-03-17 at 22:30 +0000, Matthew Wilcox wrote:
> > This patch fixes the compilation problem. But I don't understand why
> > it's messing with the reclaim flag. Thomas, can you explain?
>
> Hi, Sorry for not responding earlier. The patch that uses writepage()
> here has been around for quite some time waiting for reviews / acks so
> I failed to notice that it's going away.
My turn to be sorry for dropping this conversation ...
> Anyway the reclaim flag clearing follows that of pageout() in vmscan.c
> which was also the case for the i915_gem_shmem.c usage in
> __shmem_writeback(). My understanding was that if the writeback was
> already completed at that point, the reclaim flag was no longer
> desirable.
I think the question is really why you're setting it in the first place.
Setting the reclaim flag indicates that you want the folio removed from
the page cache as soon as possible. Other changes in flight are about
to make this more aggressive -- instead of waiting for the folio to
reach the end of the writeout queue, it'll be removed upon I/O completion.
It doesn't seem to me that this is what you actually want for TTM,
but perhaps I've misunderstood the intent of the code.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 11/11] fs: Remove aops->writepage
2025-04-01 16:26 ` Matthew Wilcox
@ 2025-05-02 14:33 ` Thomas Hellström
0 siblings, 0 replies; 25+ messages in thread
From: Thomas Hellström @ 2025-05-02 14:33 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Fan Ni, linux-fsdevel, linux-mm, intel-gfx
Hi, Matthew,
On Tue, 2025-04-01 at 17:26 +0100, Matthew Wilcox wrote:
> On Tue, Mar 18, 2025 at 09:10:38AM +0100, Thomas Hellström wrote:
> > On Mon, 2025-03-17 at 22:30 +0000, Matthew Wilcox wrote:
> > > This patch fixes the compilation problem. But I don't understand
> > > why
> > > it's messing with the reclaim flag. Thomas, can you explain?
> >
> > Hi, Sorry for not responding earlier. The patch that uses
> > writepage()
> > here has been around for quite some time waiting for reviews / acks
> > so
> > I failed to notice that it's going away.
>
> My turn to be sorry for dropping this conversation ...
Once again my turn. This disappeared in the mail flood. Sorry about
that.
>
> > Anyway the reclaim flag clearing follows that of pageout() in
> > vmscan.c
> > which was also the case for the i915_gem_shmem.c usage in
> > __shmem_writeback(). My understanding was that if the writeback was
> > already completed at that point, the reclaim flag was no longer
> > desirable.
>
> I think the question is really why you're setting it in the first
> place.
> Setting the reclaim flag indicates that you want the folio removed
> from
> the page cache as soon as possible.
So when the shmem swapout has been called, My understanding was that
the page had been moved from the page cache to the swap cache and now
written out to disc and this is all part of reclaim.
When TTM reaches this part of the code, it's always called from a
shrinker with the aim of freeing up memory as soon as possible.
So if this is incorrect or unsuitable usage of the reclaim flag, we
should of course remove the manipulation of it. (IIRC I was also a bit
confused as to why it didn't seem to be protected by a lock in the
callsites I looked at)
__shmem_writeback() in i915_gem_shmem.c and
pageout() in mm/vmscan.c
Thanks,
Thomas
> Other changes in flight are about
> to make this more aggressive -- instead of waiting for the folio to
> reach the end of the writeout queue, it'll be removed upon I/O
> completion.
>
> It doesn't seem to me that this is what you actually want for TTM,
> but perhaps I've misunderstood the intent of the code.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 00/11] Remove aops->writepage
2025-03-07 13:54 [PATCH 00/11] Remove aops->writepage Matthew Wilcox (Oracle)
` (10 preceding siblings ...)
2025-03-07 13:54 ` [PATCH 11/11] fs: Remove aops->writepage Matthew Wilcox (Oracle)
@ 2025-03-28 9:40 ` Christian Brauner
11 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2025-03-28 9:40 UTC (permalink / raw)
To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm, intel-gfx
On Fri, Mar 07, 2025 at 01:54:00PM +0000, Matthew Wilcox wrote:
> I was preparing for LSFMM and noticed that actually we're almost done
> with the writepage conversion. This patchset finishes it off.
> Something changed in my test environment and now it crashes before
> even starting a run, so this is only build tested.
>
> The first five patches (f2fs and vboxsf) are uninteresting. I'll try
> and get those into linux-next for the imminent merge window. I think
> the migrate and writeback patches are good, but maybe I've missed
> something. Then we come to i915 needing to tell shmem to do writeout,
> so I added a module-accessible function to do that. I also removed
> the setting/clearing of reclaim, which would be easy to bring back if
> it's really needed. Patch 10 is probably the exciting one where
> pageout() calls swap or shmem directly. And then patch 11 really just
> removes the op itself and the documentation for it. I may have
> over-trimmed here, but some of the documentation was so out of date it
> was hard to tell what was worth preserving.
>
> Anyway, let's see what the bots make of this. This is against
> next-20250307.
Once you're ready it would be nice if you could resend this based on
mainline (can be today) and then I'll pick it up.
Christian
^ permalink raw reply [flat|nested] 25+ messages in thread