* [PATCH 0/2] btrfs: do not commit transaction to avoid deadly ENOSPC trap
@ 2023-07-27 6:07 Qu Wenruo
2023-07-27 6:07 ` [PATCH 1/2] btrfs: do not commit transaction after adding one device Qu Wenruo
2023-07-27 6:07 ` [PATCH 2/2] btrfs: do not commit transaction canceling a suspended replace Qu Wenruo
0 siblings, 2 replies; 5+ messages in thread
From: Qu Wenruo @ 2023-07-27 6:07 UTC (permalink / raw)
To: linux-btrfs
There is a report that a user hits a deadly ENOSPC trap, where the fs
would immediately falls read-only when commit any transaction.
To make the case worse, the fs is using RAID1C4 (later converted to
RAID1C3), this makes it impossible to add any extra disks.
As the kernel would commit the current transaction after just adding the
first device, while RAID1C4 needs 4 new disks, this would still trigger
ENOSPC during transaction committing.
The first patch would address the problem by not committing the
transaction adding a new device.
This would cause a behavior change, would need to co-operate with
btrfs-progs to allow end users to determine if they want to sync the fs
after adding all devices.
The second patch is to address a rare corner case hit by the same
reporter, that canceling a suspended replace would also trigger the
deadly ENOSPC trap.
Qu Wenruo (2):
btrfs: do not commit transaction after adding one device
btrfs: do not commit transaction canceling a suspended replace
fs/btrfs/dev-replace.c | 10 ----------
fs/btrfs/volumes.c | 12 ++----------
2 files changed, 2 insertions(+), 20 deletions(-)
--
2.41.0
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/2] btrfs: do not commit transaction after adding one device
2023-07-27 6:07 [PATCH 0/2] btrfs: do not commit transaction to avoid deadly ENOSPC trap Qu Wenruo
@ 2023-07-27 6:07 ` Qu Wenruo
2023-07-27 6:07 ` [PATCH 2/2] btrfs: do not commit transaction canceling a suspended replace Qu Wenruo
1 sibling, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2023-07-27 6:07 UTC (permalink / raw)
To: linux-btrfs
[BUG]
There is at least one report that btrfs falls to deadly ENOSPC
situation, where without adding new space btrfs would fail to commit any
transaction due to ENOSPC.
However this last resort device adding would not work if the filesystem
is using multi-device metadata profiles (RAID1*, RAID0, RAID5/6), as
after each device added, btrfs would commit the transaction, falling
back to read-only due to ENOSPC.
[CAUSE]
Of course we should not fail a transaction due to ENOSPC in the first
place, as this means at metadata reservation stage we did wrong
calculation.
But the current behavior of device add is also a little too strict,
considering device adding is the last-resort for ENOSPC, we should at
least allow end users to determine if they want to commit transaction or
not.
[FIX]
Instead of committing transaction, just end the current one.
Whether the fs would be synced would be determined at user space
(normally by btrfs-progs) instead.
Link: https://lore.kernel.org/linux-btrfs/CA+W5K0r4Lv4fPf+mWWf-ppgsjyz+wOKdBRgBR6UnQafwL7HPtg@mail.gmail.com/
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/volumes.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 70d69d4b44d2..eda90d1c881f 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2789,7 +2789,8 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
btrfs_sysfs_update_sprout_fsid(fs_devices);
}
- ret = btrfs_commit_transaction(trans);
+ ret = btrfs_end_transaction(trans);
+ trans = NULL;
if (seeding_dev) {
mutex_unlock(&uuid_mutex);
@@ -2803,15 +2804,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
if (ret < 0)
btrfs_handle_fs_error(fs_info, ret,
"Failed to relocate sys chunks after device initialization. This can be fixed using the \"btrfs balance\" command.");
- trans = btrfs_attach_transaction(root);
- if (IS_ERR(trans)) {
- if (PTR_ERR(trans) == -ENOENT)
- return 0;
- ret = PTR_ERR(trans);
- trans = NULL;
- goto error_sysfs;
- }
- ret = btrfs_commit_transaction(trans);
}
/*
--
2.41.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 2/2] btrfs: do not commit transaction canceling a suspended replace
2023-07-27 6:07 [PATCH 0/2] btrfs: do not commit transaction to avoid deadly ENOSPC trap Qu Wenruo
2023-07-27 6:07 ` [PATCH 1/2] btrfs: do not commit transaction after adding one device Qu Wenruo
@ 2023-07-27 6:07 ` Qu Wenruo
2023-07-27 12:09 ` David Sterba
1 sibling, 1 reply; 5+ messages in thread
From: Qu Wenruo @ 2023-07-27 6:07 UTC (permalink / raw)
To: linux-btrfs
[BUG]
There is a very rare corner case that, if the filesystem falls into a
deadly ENOSPC trap (metadata is so full that committing a transaction
would trigger transaction abort and falls RO), and the user needs to
cancel a suspended dev-replace, it would fail.
This is because the dev-replace canceling itself would commit the
current transaction, and falls RO first.
[CAUSE]
There are two involved situations:
- Cancel a running dev-replace
We just call btrfs_scrub_cancel(), it doesn't commit transaction
anyway.
- Cancel a suspended dev-replace
We only need to cleanup the various in-memory replace structure, which
is no difference than the previous situation.
But in this case we commit transaction, and may trigger the deadly
ENOSPC trap.
[FIX]
Just follow the first case, do not commit transaction when not needed.
Link: https://lore.kernel.org/linux-btrfs/CA+W5K0pQyJH5zWxs4JxfHR06DSUWDOcDPNsKxbdKQ_CiUtpyUg@mail.gmail.com/
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/dev-replace.c | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index fff22ed55c42..35590f17a5d7 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -1056,8 +1056,6 @@ int btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info)
struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace;
struct btrfs_device *tgt_device = NULL;
struct btrfs_device *src_device = NULL;
- struct btrfs_trans_handle *trans;
- struct btrfs_root *root = fs_info->tree_root;
int result;
int ret;
@@ -1112,14 +1110,6 @@ int btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info)
/* Scrub for replace must not be running in suspended state */
btrfs_scrub_cancel(fs_info);
- trans = btrfs_start_transaction(root, 0);
- if (IS_ERR(trans)) {
- mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
- return PTR_ERR(trans);
- }
- ret = btrfs_commit_transaction(trans);
- WARN_ON(ret);
-
btrfs_info_in_rcu(fs_info,
"suspended dev_replace from %s (devid %llu) to %s canceled",
btrfs_dev_name(src_device), src_device->devid,
--
2.41.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] btrfs: do not commit transaction canceling a suspended replace
2023-07-27 6:07 ` [PATCH 2/2] btrfs: do not commit transaction canceling a suspended replace Qu Wenruo
@ 2023-07-27 12:09 ` David Sterba
2023-07-27 22:26 ` Qu Wenruo
0 siblings, 1 reply; 5+ messages in thread
From: David Sterba @ 2023-07-27 12:09 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Thu, Jul 27, 2023 at 02:07:54PM +0800, Qu Wenruo wrote:
> [BUG]
> There is a very rare corner case that, if the filesystem falls into a
> deadly ENOSPC trap (metadata is so full that committing a transaction
> would trigger transaction abort and falls RO), and the user needs to
> cancel a suspended dev-replace, it would fail.
>
> This is because the dev-replace canceling itself would commit the
> current transaction, and falls RO first.
>
> [CAUSE]
> There are two involved situations:
>
> - Cancel a running dev-replace
> We just call btrfs_scrub_cancel(), it doesn't commit transaction
> anyway.
>
> - Cancel a suspended dev-replace
> We only need to cleanup the various in-memory replace structure, which
> is no difference than the previous situation.
There's dev_replace->item_needs_writeback = 1, and somewhere in
transaction commit it's synced to disk.
btrfs_commit_transaction
commit_cowonly_roots
btrfs_run_dev_replace
item_needs_writeback is checked
so it's not just cleaning the memory structures, it also stores the
state on disk. A delayed commit will have to write it again, which means
that the problem is only postponed.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] btrfs: do not commit transaction canceling a suspended replace
2023-07-27 12:09 ` David Sterba
@ 2023-07-27 22:26 ` Qu Wenruo
0 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2023-07-27 22:26 UTC (permalink / raw)
To: dsterba, Qu Wenruo; +Cc: linux-btrfs
On 2023/7/27 20:09, David Sterba wrote:
> On Thu, Jul 27, 2023 at 02:07:54PM +0800, Qu Wenruo wrote:
>> [BUG]
>> There is a very rare corner case that, if the filesystem falls into a
>> deadly ENOSPC trap (metadata is so full that committing a transaction
>> would trigger transaction abort and falls RO), and the user needs to
>> cancel a suspended dev-replace, it would fail.
>>
>> This is because the dev-replace canceling itself would commit the
>> current transaction, and falls RO first.
>>
>> [CAUSE]
>> There are two involved situations:
>>
>> - Cancel a running dev-replace
>> We just call btrfs_scrub_cancel(), it doesn't commit transaction
>> anyway.
>>
>> - Cancel a suspended dev-replace
>> We only need to cleanup the various in-memory replace structure, which
>> is no difference than the previous situation.
>
> There's dev_replace->item_needs_writeback = 1, and somewhere in
> transaction commit it's synced to disk.
>
> btrfs_commit_transaction
> commit_cowonly_roots
> btrfs_run_dev_replace
> item_needs_writeback is checked
>
> so it's not just cleaning the memory structures, it also stores the
> state on disk. A delayed commit will have to write it again, which means
> that the problem is only postponed.
That may be exactly what the end users wants.
Delaying the commit so that they can add the extra device (with patch 1)
to escape the deadly ENOSPC trap.
Thanks,
Qu
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2023-07-27 22:26 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-27 6:07 [PATCH 0/2] btrfs: do not commit transaction to avoid deadly ENOSPC trap Qu Wenruo
2023-07-27 6:07 ` [PATCH 1/2] btrfs: do not commit transaction after adding one device Qu Wenruo
2023-07-27 6:07 ` [PATCH 2/2] btrfs: do not commit transaction canceling a suspended replace Qu Wenruo
2023-07-27 12:09 ` David Sterba
2023-07-27 22:26 ` Qu Wenruo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).