* [PATCH v2 0/2] Avoid setting folio->private
@ 2025-03-10 19:29 Goldwyn Rodrigues
2025-03-10 19:29 ` [PATCH 1/2] btrfs: add mapping_set_release_always to inode's mapping Goldwyn Rodrigues
2025-03-10 19:29 ` [PATCH 2/2] btrfs: kill EXTENT_FOLIO_PRIVATE Goldwyn Rodrigues
0 siblings, 2 replies; 7+ messages in thread
From: Goldwyn Rodrigues @ 2025-03-10 19:29 UTC (permalink / raw)
To: linux-btrfs; +Cc: Goldwyn Rodrigues
From: Goldwyn Rodrigues <rgoldwyn@suse.com>
folio->private was set in order to get release_folio() callbacks.
Instead, if we set the address_space flag AS_RELEASE_ALWAYS, there is no
need to store EXTENT_FOLIO_PRIVATE in every folio->private.
These patches are posted so we don't face conflicts with iomap's
iomap_folio_state, which also resides in folio->private.
Changes since v1:
- Incorporated Dave Sterba's comments
Goldwyn Rodrigues (2):
btrfs: add mapping_set_release_always to inode's mapping
btrfs: kill EXTENT_FOLIO_PRIVATE
fs/btrfs/compression.c | 2 +-
fs/btrfs/defrag.c | 2 +-
fs/btrfs/extent_io.c | 61 ++++++-------------------------------
fs/btrfs/extent_io.h | 10 ++----
fs/btrfs/file.c | 6 ++--
fs/btrfs/free-space-cache.c | 9 ------
fs/btrfs/inode.c | 13 +++++---
fs/btrfs/reflink.c | 2 +-
fs/btrfs/relocation.c | 4 +--
9 files changed, 27 insertions(+), 82 deletions(-)
--
2.48.1
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/2] btrfs: add mapping_set_release_always to inode's mapping
2025-03-10 19:29 [PATCH v2 0/2] Avoid setting folio->private Goldwyn Rodrigues
@ 2025-03-10 19:29 ` Goldwyn Rodrigues
2025-03-10 19:29 ` [PATCH 2/2] btrfs: kill EXTENT_FOLIO_PRIVATE Goldwyn Rodrigues
1 sibling, 0 replies; 7+ messages in thread
From: Goldwyn Rodrigues @ 2025-03-10 19:29 UTC (permalink / raw)
To: linux-btrfs; +Cc: Goldwyn Rodrigues
From: Goldwyn Rodrigues <rgoldwyn@suse.com>
BTRFS has been setting folio->private in order to get a callback to
release_folio() before a folio is released. This can also be achieved
by setting the AS_RELEASE_ALWAYS flag in mapping->flags.
Calling mapping_set_release_always() on the inode's address_space during
inode initialization makes it unnecessary to set up folio->private for
every folio.
With the flag set, release_folio() is called for every folio
under the mapping without the need to set folio->private. This will
eventually help btrfs move to iomap, since iomap also uses folio->private
for iomap_folio_state.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
fs/btrfs/extent_io.c | 2 +-
fs/btrfs/file.c | 2 +-
fs/btrfs/inode.c | 3 +++
3 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index a78ff093ea37..297b7168a7d6 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -491,7 +491,7 @@ static void begin_folio_read(struct btrfs_fs_info *fs_info, struct folio *folio)
if (!btrfs_is_subpage(fs_info, folio->mapping))
return;
- ASSERT(folio_test_private(folio));
+ ASSERT(mapping_release_always(folio->mapping));
btrfs_folio_set_lock(fs_info, folio, folio_pos(folio), PAGE_SIZE);
}
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 643f101c7340..160e4030ca60 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -833,7 +833,7 @@ static int prepare_uptodate_folio(struct inode *inode, struct folio *folio, u64
* The private flag check is essential for subpage as we need to store
* extra bitmap using folio private.
*/
- if (folio->mapping != inode->i_mapping || !folio_test_private(folio)) {
+ if (folio->mapping != inode->i_mapping || !mapping_release_always(folio->mapping)) {
folio_unlock(folio);
return -EAGAIN;
}
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e7e6accbaf6c..02ff9c449b35 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5616,6 +5616,8 @@ static int btrfs_init_locked_inode(struct inode *inode, void *p)
btrfs_set_inode_number(BTRFS_I(inode), args->ino);
BTRFS_I(inode)->root = btrfs_grab_root(args->root);
+ mapping_set_release_always(inode->i_mapping);
+
if (args->root && args->root == args->root->fs_info->tree_root &&
args->ino != BTRFS_BTREE_INODE_OBJECTID)
set_bit(BTRFS_INODE_FREE_SPACE_INODE,
@@ -6698,6 +6700,7 @@ static int btrfs_create(struct mnt_idmap *idmap, struct inode *dir,
inode->i_fop = &btrfs_file_operations;
inode->i_op = &btrfs_file_inode_operations;
inode->i_mapping->a_ops = &btrfs_aops;
+ mapping_set_release_always(inode->i_mapping);
return btrfs_create_common(dir, dentry, inode);
}
--
2.48.1
* [PATCH 2/2] btrfs: kill EXTENT_FOLIO_PRIVATE
2025-03-10 19:29 [PATCH v2 0/2] Avoid setting folio->private Goldwyn Rodrigues
2025-03-10 19:29 ` [PATCH 1/2] btrfs: add mapping_set_release_always to inode's mapping Goldwyn Rodrigues
@ 2025-03-10 19:29 ` Goldwyn Rodrigues
2025-03-12 14:08 ` David Sterba
1 sibling, 1 reply; 7+ messages in thread
From: Goldwyn Rodrigues @ 2025-03-10 19:29 UTC (permalink / raw)
To: linux-btrfs; +Cc: Goldwyn Rodrigues
From: Goldwyn Rodrigues <rgoldwyn@suse.com>
Since mapping_set_release_always() now provides the release_folio()
callback, remove EXTENT_FOLIO_PRIVATE and all of its helper functions.
This affects how we handle subpage, so rename
set_folio_extent_mapped() to btrfs_set_folio_subpage() and
clear_folio_extent_mapped() to btrfs_clear_folio_subpage().
These functions are now just wrappers around btrfs_attach_subpage() and
btrfs_detach_subpage().
The free space cache inode does not use subpage, so skip calling
btrfs_set_folio_subpage() for it.
This reverts part of
fd1d7f44b352 ("btrfs: reject out-of-band dirty folios during writeback")
because it is no longer relevant: whether a folio belongs to btrfs is now
determined at the inode level rather than at the folio level.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
fs/btrfs/compression.c | 2 +-
fs/btrfs/defrag.c | 2 +-
fs/btrfs/extent_io.c | 59 +++++--------------------------------
fs/btrfs/extent_io.h | 10 ++-----
fs/btrfs/file.c | 4 +--
fs/btrfs/free-space-cache.c | 9 ------
fs/btrfs/inode.c | 10 +++----
fs/btrfs/reflink.c | 2 +-
fs/btrfs/relocation.c | 4 +--
9 files changed, 22 insertions(+), 80 deletions(-)
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index e7f8ee5d48a4..85162ea47a7f 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -491,7 +491,7 @@ static noinline int add_ra_bio_pages(struct inode *inode,
*memstall = 1;
}
- ret = set_folio_extent_mapped(folio);
+ ret = btrfs_set_folio_subpage(folio);
if (ret < 0) {
folio_unlock(folio);
folio_put(folio);
diff --git a/fs/btrfs/defrag.c b/fs/btrfs/defrag.c
index ae0b92b96345..6a980cf2acf7 100644
--- a/fs/btrfs/defrag.c
+++ b/fs/btrfs/defrag.c
@@ -883,7 +883,7 @@ static struct folio *defrag_prepare_one_folio(struct btrfs_inode *inode, pgoff_t
return ERR_PTR(-ETXTBSY);
}
- ret = set_folio_extent_mapped(folio);
+ ret = btrfs_set_folio_subpage(folio);
if (ret < 0) {
folio_unlock(folio);
folio_put(folio);
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 297b7168a7d6..3afbb754c248 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -859,38 +859,18 @@ static int attach_extent_buffer_folio(struct extent_buffer *eb,
return ret;
}
-int set_folio_extent_mapped(struct folio *folio)
+int btrfs_set_folio_subpage(struct folio *folio)
{
- struct btrfs_fs_info *fs_info;
-
- ASSERT(folio->mapping);
-
- if (folio_test_private(folio))
- return 0;
-
- fs_info = folio_to_fs_info(folio);
-
- if (btrfs_is_subpage(fs_info, folio->mapping))
- return btrfs_attach_subpage(fs_info, folio, BTRFS_SUBPAGE_DATA);
+ struct btrfs_fs_info *fs_info = folio_to_fs_info(folio);
- folio_attach_private(folio, (void *)EXTENT_FOLIO_PRIVATE);
- return 0;
+ return btrfs_attach_subpage(fs_info, folio, BTRFS_SUBPAGE_DATA);
}
-void clear_folio_extent_mapped(struct folio *folio)
+void btrfs_clear_folio_subpage(struct folio *folio)
{
- struct btrfs_fs_info *fs_info;
-
- ASSERT(folio->mapping);
-
- if (!folio_test_private(folio))
- return;
-
- fs_info = folio_to_fs_info(folio);
- if (btrfs_is_subpage(fs_info, folio->mapping))
- return btrfs_detach_subpage(fs_info, folio, BTRFS_SUBPAGE_DATA);
+ struct btrfs_fs_info *fs_info = folio_to_fs_info(folio);
- folio_detach_private(folio);
+ btrfs_detach_subpage(fs_info, folio, BTRFS_SUBPAGE_DATA);
}
static struct extent_map *get_extent_map(struct btrfs_inode *inode,
@@ -942,7 +922,7 @@ static int btrfs_do_readpage(struct folio *folio, struct extent_map **em_cached,
int ret = 0;
const size_t blocksize = fs_info->sectorsize;
- ret = set_folio_extent_mapped(folio);
+ ret = btrfs_set_folio_subpage(folio);
if (ret < 0) {
folio_unlock(folio);
return ret;
@@ -1731,30 +1711,7 @@ static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl
*/
bio_ctrl->submit_bitmap = (unsigned long)-1;
- /*
- * If the page is dirty but without private set, it's marked dirty
- * without informing the fs.
- * Nowadays that is a bug, since the introduction of
- * pin_user_pages*().
- *
- * So here we check if the page has private set to rule out such
- * case.
- * But we also have a long history of relying on the COW fixup,
- * so here we only enable this check for experimental builds until
- * we're sure it's safe.
- */
- if (IS_ENABLED(CONFIG_BTRFS_EXPERIMENTAL) &&
- unlikely(!folio_test_private(folio))) {
- WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
- btrfs_err_rl(fs_info,
- "root %lld ino %llu folio %llu is marked dirty without notifying the fs",
- inode->root->root_key.objectid,
- btrfs_ino(inode), folio_pos(folio));
- ret = -EUCLEAN;
- goto done;
- }
-
- ret = set_folio_extent_mapped(folio);
+ ret = btrfs_set_folio_subpage(folio);
if (ret < 0)
goto done;
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 2e261892c7bc..f4077941810f 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -65,12 +65,6 @@ enum {
ENUM_BIT(PAGE_SET_ORDERED),
};
-/*
- * Folio private values. Every page that is controlled by the extent map has
- * folio private set to this value.
- */
-#define EXTENT_FOLIO_PRIVATE 1
-
/*
* The extent buffer bitmap operations are done with byte granularity instead of
* word granularity for two reasons:
@@ -247,8 +241,8 @@ int btrfs_writepages(struct address_space *mapping, struct writeback_control *wb
int btree_write_cache_pages(struct address_space *mapping,
struct writeback_control *wbc);
void btrfs_readahead(struct readahead_control *rac);
-int set_folio_extent_mapped(struct folio *folio);
-void clear_folio_extent_mapped(struct folio *folio);
+int btrfs_set_folio_subpage(struct folio *folio);
+void btrfs_clear_folio_subpage(struct folio *folio);
struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
u64 start, u64 owner_root, int level);
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 160e4030ca60..12d7a98c50a9 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -877,7 +877,7 @@ static noinline int prepare_one_folio(struct inode *inode, struct folio **folio_
}
/* Only support page sized folio yet. */
ASSERT(folio_order(folio) == 0);
- ret = set_folio_extent_mapped(folio);
+ ret = btrfs_set_folio_subpage(folio);
if (ret < 0) {
folio_unlock(folio);
folio_put(folio);
@@ -1835,7 +1835,7 @@ static vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
folio_wait_writeback(folio);
lock_extent(io_tree, page_start, page_end, &cached_state);
- ret2 = set_folio_extent_mapped(folio);
+ ret2 = btrfs_set_folio_subpage(folio);
if (ret2 < 0) {
ret = vmf_error(ret2);
unlock_extent(io_tree, page_start, page_end, &cached_state);
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 3095cce904b5..4dacf8386050 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -450,7 +450,6 @@ static int io_ctl_prepare_pages(struct btrfs_io_ctl *io_ctl, bool uptodate)
int i;
for (i = 0; i < io_ctl->num_pages; i++) {
- int ret;
folio = __filemap_get_folio(inode->i_mapping, i,
FGP_LOCK | FGP_ACCESSED | FGP_CREAT,
@@ -460,14 +459,6 @@ static int io_ctl_prepare_pages(struct btrfs_io_ctl *io_ctl, bool uptodate)
return -ENOMEM;
}
- ret = set_folio_extent_mapped(folio);
- if (ret < 0) {
- folio_unlock(folio);
- folio_put(folio);
- io_ctl_drop_pages(io_ctl);
- return ret;
- }
-
io_ctl->pages[i] = &folio->page;
if (uptodate && !folio_test_uptodate(folio)) {
btrfs_read_folio(NULL, folio);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 02ff9c449b35..aba7e6099180 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4873,11 +4873,11 @@ int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t len,
/*
* We unlock the page after the io is completed and then re-lock it
- * above. release_folio() could have come in between that and cleared
- * folio private, but left the page in the mapping. Set the page mapped
+ * above. release_folio() could have come in between that,
+ * but left the page in the mapping. Set the page mapped
* here to make sure it's properly set for the subpage stuff.
*/
- ret = set_folio_extent_mapped(folio);
+ ret = btrfs_set_folio_subpage(folio);
if (ret < 0)
goto out_unlock;
@@ -7317,7 +7317,7 @@ static bool __btrfs_release_folio(struct folio *folio, gfp_t gfp_flags)
{
if (try_release_extent_mapping(folio, gfp_flags)) {
wait_subpage_spinlock(folio);
- clear_folio_extent_mapped(folio);
+ btrfs_clear_folio_subpage(folio);
return true;
}
return false;
@@ -7515,7 +7515,7 @@ static void btrfs_invalidate_folio(struct folio *folio, size_t offset,
btrfs_folio_clear_checked(fs_info, folio, folio_pos(folio), folio_size(folio));
if (!inode_evicting)
__btrfs_release_folio(folio, GFP_NOFS);
- clear_folio_extent_mapped(folio);
+ btrfs_clear_folio_subpage(folio);
}
static int btrfs_truncate(struct btrfs_inode *inode, bool skip_writeback)
diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c
index 15c296cb4dac..e27486fec156 100644
--- a/fs/btrfs/reflink.c
+++ b/fs/btrfs/reflink.c
@@ -91,7 +91,7 @@ static int copy_inline_to_page(struct btrfs_inode *inode,
goto out_unlock;
}
- ret = set_folio_extent_mapped(folio);
+ ret = btrfs_set_folio_subpage(folio);
if (ret < 0)
goto out_unlock;
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index af0969b70b53..a9e62fe6c8af 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2870,10 +2870,10 @@ static int relocate_one_folio(struct reloc_control *rc,
/*
* We could have lost folio private when we dropped the lock to read the
- * folio above, make sure we set_folio_extent_mapped() here so we have any
+ * folio above, make sure we btrfs_set_folio_subpage() here so we have any
* of the subpage blocksize stuff we need in place.
*/
- ret = set_folio_extent_mapped(folio);
+ ret = btrfs_set_folio_subpage(folio);
if (ret < 0)
goto release_folio;
--
2.48.1
* Re: [PATCH 2/2] btrfs: kill EXTENT_FOLIO_PRIVATE
2025-03-10 19:29 ` [PATCH 2/2] btrfs: kill EXTENT_FOLIO_PRIVATE Goldwyn Rodrigues
@ 2025-03-12 14:08 ` David Sterba
2025-03-12 23:54 ` Qu Wenruo
2025-03-22 22:19 ` Qu Wenruo
0 siblings, 2 replies; 7+ messages in thread
From: David Sterba @ 2025-03-12 14:08 UTC (permalink / raw)
To: Goldwyn Rodrigues; +Cc: linux-btrfs, Goldwyn Rodrigues, wqu
On Mon, Mar 10, 2025 at 03:29:07PM -0400, Goldwyn Rodrigues wrote:
> @@ -1731,30 +1711,7 @@ static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl
> */
> bio_ctrl->submit_bitmap = (unsigned long)-1;
>
> - /*
> - * If the page is dirty but without private set, it's marked dirty
> - * without informing the fs.
> - * Nowadays that is a bug, since the introduction of
> - * pin_user_pages*().
> - *
> - * So here we check if the page has private set to rule out such
> - * case.
> - * But we also have a long history of relying on the COW fixup,
> - * so here we only enable this check for experimental builds until
> - * we're sure it's safe.
> - */
> - if (IS_ENABLED(CONFIG_BTRFS_EXPERIMENTAL) &&
> - unlikely(!folio_test_private(folio))) {
> - WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
> - btrfs_err_rl(fs_info,
> - "root %lld ino %llu folio %llu is marked dirty without notifying the fs",
> - inode->root->root_key.objectid,
> - btrfs_ino(inode), folio_pos(folio));
> - ret = -EUCLEAN;
> - goto done;
> - }
> -
> - ret = set_folio_extent_mapped(folio);
> + ret = btrfs_set_folio_subpage(folio);
The removed part is from a recent patch "btrfs: reject out-of-band dirty
folios during writeback" to make sure we don't really need the fixup
worker. So the rework to remove EXTENT_FOLIO_PRIVATE is changing
the logic yet again. I'm not sure we should do both in one release, purely
out of caution about making two changes at once.
I can keep the 2 patches in misc-next and move them to for-next once the
6.15 pull request is out so you don't have to track them yourself.
Also question to Qu, if you think the risk is minimal we can take both
changes now.
* Re: [PATCH 2/2] btrfs: kill EXTENT_FOLIO_PRIVATE
2025-03-12 14:08 ` David Sterba
@ 2025-03-12 23:54 ` Qu Wenruo
2025-03-17 14:56 ` David Sterba
2025-03-22 22:19 ` Qu Wenruo
1 sibling, 1 reply; 7+ messages in thread
From: Qu Wenruo @ 2025-03-12 23:54 UTC (permalink / raw)
To: dsterba, Goldwyn Rodrigues; +Cc: linux-btrfs, Goldwyn Rodrigues, wqu
On 2025/3/13 00:38, David Sterba wrote:
> On Mon, Mar 10, 2025 at 03:29:07PM -0400, Goldwyn Rodrigues wrote:
>> @@ -1731,30 +1711,7 @@ static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl
>> */
>> bio_ctrl->submit_bitmap = (unsigned long)-1;
>>
>> - /*
>> - * If the page is dirty but without private set, it's marked dirty
>> - * without informing the fs.
>> - * Nowadays that is a bug, since the introduction of
>> - * pin_user_pages*().
>> - *
>> - * So here we check if the page has private set to rule out such
>> - * case.
>> - * But we also have a long history of relying on the COW fixup,
>> - * so here we only enable this check for experimental builds until
>> - * we're sure it's safe.
>> - */
>> - if (IS_ENABLED(CONFIG_BTRFS_EXPERIMENTAL) &&
>> - unlikely(!folio_test_private(folio))) {
>> - WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
>> - btrfs_err_rl(fs_info,
>> - "root %lld ino %llu folio %llu is marked dirty without notifying the fs",
>> - inode->root->root_key.objectid,
>> - btrfs_ino(inode), folio_pos(folio));
>> - ret = -EUCLEAN;
>> - goto done;
>> - }
>> -
>> - ret = set_folio_extent_mapped(folio);
>> + ret = btrfs_set_folio_subpage(folio);
>
> The removed part is from a recent patch "btrfs: reject out-of-band dirty
> folios during writeback" to make sure we don't really need the fixup
> worker. So with the rework to remove EXTENT_FOLIO_PRIVATE it's changing
> the logic a again. I'm not sure we should do both in one release, merely
> from the point of caution and making 2 changes at once.
>
> I can keep the 2 patches in misc-next and move them to for-next once the
> 6.15 pull request is out so you don't have to track them yourself.
>
> Also question to Qu, if you think the risk is minimal we can take both
> changes now.
>
I'm hiding the new check behind EXPERIMENTAL to get more testing
before we're 100% sure.
Personally speaking, I'm confident the COW fixup is useless and
is just hiding other bugs (e.g. the ones I fixed in various error
paths), but the most common objection I can remember is some
hardware-specific behavior that I cannot verify.
So just to be extra safe, I prefer to get it tested on all architectures
for at least 2 releases before moving it out from behind the
experimental flag.
I totally understand Goldwyn's push for iomap integration, but I believe
this one can be pushed a little later, as we have more low-hanging fruit:
- Fully remove the usage of the folio ordered and checked flags.
  That depends on the removal of the COW fixup.
- Remove the async extent behavior,
  so that the compression happens after the blocks are submitted as
  bbios.
  This allows us to handle compressed writes just like any regular
  write, and allows us to align the writeback path to iomap.
So I still tend to delay this change.
Thanks,
Qu
* Re: [PATCH 2/2] btrfs: kill EXTENT_FOLIO_PRIVATE
2025-03-12 23:54 ` Qu Wenruo
@ 2025-03-17 14:56 ` David Sterba
0 siblings, 0 replies; 7+ messages in thread
From: David Sterba @ 2025-03-17 14:56 UTC (permalink / raw)
To: Qu Wenruo; +Cc: dsterba, Goldwyn Rodrigues, linux-btrfs, Goldwyn Rodrigues, wqu
On Thu, Mar 13, 2025 at 10:24:32AM +1030, Qu Wenruo wrote:
> On 2025/3/13 00:38, David Sterba wrote:
> > On Mon, Mar 10, 2025 at 03:29:07PM -0400, Goldwyn Rodrigues wrote:
> >> @@ -1731,30 +1711,7 @@ static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl
> >> */
> >> bio_ctrl->submit_bitmap = (unsigned long)-1;
> >>
> >> - /*
> >> - * If the page is dirty but without private set, it's marked dirty
> >> - * without informing the fs.
> >> - * Nowadays that is a bug, since the introduction of
> >> - * pin_user_pages*().
> >> - *
> >> - * So here we check if the page has private set to rule out such
> >> - * case.
> >> - * But we also have a long history of relying on the COW fixup,
> >> - * so here we only enable this check for experimental builds until
> >> - * we're sure it's safe.
> >> - */
> >> - if (IS_ENABLED(CONFIG_BTRFS_EXPERIMENTAL) &&
> >> - unlikely(!folio_test_private(folio))) {
> >> - WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
> >> - btrfs_err_rl(fs_info,
> >> - "root %lld ino %llu folio %llu is marked dirty without notifying the fs",
> >> - inode->root->root_key.objectid,
> >> - btrfs_ino(inode), folio_pos(folio));
> >> - ret = -EUCLEAN;
> >> - goto done;
> >> - }
> >> -
> >> - ret = set_folio_extent_mapped(folio);
> >> + ret = btrfs_set_folio_subpage(folio);
> >
> > The removed part is from a recent patch "btrfs: reject out-of-band dirty
> > folios during writeback" to make sure we don't really need the fixup
> > worker. So with the rework to remove EXTENT_FOLIO_PRIVATE it's changing
> > the logic a again. I'm not sure we should do both in one release, merely
> > from the point of caution and making 2 changes at once.
> >
> > I can keep the 2 patches in misc-next and move them to for-next once the
> > 6.15 pull request is out so you don't have to track them yourself.
> >
> > Also question to Qu, if you think the risk is minimal we can take both
> > changes now.
> >
>
> I'm hiding the new change behind EXPERIMENTAL is to get more testing
> before we're 100% sure.
>
> Although personally speaking I'm confident the cow fixup is useless and
> are just hiding other bugs (e.g. the one I fixed for various error
> paths), but the most common objection I can remember is some hardware
> specific behaviors, that I can not verify.
>
> So just to be extra safe, I prefer to get it tested on all architectures
> for at least 2 releases, before moving it out of experimental flag.
>
> I totally understand Goldwyn's push for iomap integration, but I believe
> this one can be pushed a little later, as we have more low-hanging fruits:
>
> - Fully remove the usage of folio ordered and checked
> That's depending on the removal of cow fixup
>
> - Remove the async extent behavior
> So that the compression happens after the blocks are submitted as
> bbios.
> This allows us to handle compressed write just like any regular
> writes, and allows us to align the writeback path to iomap.
>
> So I still tend to delay this change.
Thanks, so I'll keep the patch in misc-next and move it to for-next
eventually.
* Re: [PATCH 2/2] btrfs: kill EXTENT_FOLIO_PRIVATE
2025-03-12 14:08 ` David Sterba
2025-03-12 23:54 ` Qu Wenruo
@ 2025-03-22 22:19 ` Qu Wenruo
1 sibling, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2025-03-22 22:19 UTC (permalink / raw)
To: dsterba, Goldwyn Rodrigues; +Cc: linux-btrfs, Goldwyn Rodrigues, wqu
On 2025/3/13 00:38, David Sterba wrote:
> On Mon, Mar 10, 2025 at 03:29:07PM -0400, Goldwyn Rodrigues wrote:
>> @@ -1731,30 +1711,7 @@ static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl
>> */
>> bio_ctrl->submit_bitmap = (unsigned long)-1;
>>
>> - /*
>> - * If the page is dirty but without private set, it's marked dirty
>> - * without informing the fs.
>> - * Nowadays that is a bug, since the introduction of
>> - * pin_user_pages*().
>> - *
>> - * So here we check if the page has private set to rule out such
>> - * case.
>> - * But we also have a long history of relying on the COW fixup,
>> - * so here we only enable this check for experimental builds until
>> - * we're sure it's safe.
>> - */
>> - if (IS_ENABLED(CONFIG_BTRFS_EXPERIMENTAL) &&
>> - unlikely(!folio_test_private(folio))) {
>> - WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
>> - btrfs_err_rl(fs_info,
>> - "root %lld ino %llu folio %llu is marked dirty without notifying the fs",
>> - inode->root->root_key.objectid,
>> - btrfs_ino(inode), folio_pos(folio));
>> - ret = -EUCLEAN;
>> - goto done;
>> - }
>> -
>> - ret = set_folio_extent_mapped(folio);
>> + ret = btrfs_set_folio_subpage(folio);
>
> The removed part is from a recent patch "btrfs: reject out-of-band dirty
> folios during writeback" to make sure we don't really need the fixup
> worker. So with the rework to remove EXTENT_FOLIO_PRIVATE it's changing
> the logic a again. I'm not sure we should do both in one release, merely
> from the point of caution and making 2 changes at once.
>
> I can keep the 2 patches in misc-next and move them to for-next once the
> 6.15 pull request is out so you don't have to track them yourself.
>
> Also question to Qu, if you think the risk is minimal we can take both
> changes now.
>
BTW, it looks like this series causes crashes in subpage cases, as
reported by the IBM folks:
https://lore.kernel.org/linux-btrfs/87h63ms7gk.fsf@gmail.com/
Meanwhile our current for-next no longer crashes, as these two patches
have been removed.
So I'm afraid there are still some conflicts between this series and the
subpage cases.
Thanks,
Qu