From: Qu Wenruo <wqu@suse.com>
To: dsterba@suse.cz, linux-btrfs@vger.kernel.org,
Greed Rong <greedrong@gmail.com>
Subject: Re: [PATCH 3/4] btrfs: preallocate anon_dev for subvolume and snapshot creation
Date: Wed, 17 Jun 2020 06:54:34 +0800 [thread overview]
Message-ID: <de0a6073-36e9-9b6a-18e2-21a4df793157@suse.com> (raw)
In-Reply-To: <20200616151004.GE27795@twin.jikos.cz>
On 2020/6/16 下午11:10, David Sterba wrote:
> On Tue, Jun 16, 2020 at 10:17:36AM +0800, Qu Wenruo wrote:
>> [BUG]
>> When a lot of subvolumes are created, there is a user report about
>> transaction aborted:
>>
>> ------------[ cut here ]------------
>> BTRFS: Transaction aborted (error -24)
>> WARNING: CPU: 17 PID: 17041 at fs/btrfs/transaction.c:1576 create_pending_snapshot+0xbc4/0xd10 [btrfs]
>> RIP: 0010:create_pending_snapshot+0xbc4/0xd10 [btrfs]
>> Call Trace:
>> create_pending_snapshots+0x82/0xa0 [btrfs]
>> btrfs_commit_transaction+0x275/0x8c0 [btrfs]
>> btrfs_mksubvol+0x4b9/0x500 [btrfs]
>> btrfs_ioctl_snap_create_transid+0x174/0x180 [btrfs]
>> btrfs_ioctl_snap_create_v2+0x11c/0x180 [btrfs]
>> btrfs_ioctl+0x11a4/0x2da0 [btrfs]
>> do_vfs_ioctl+0xa9/0x640
>> ksys_ioctl+0x67/0x90
>> __x64_sys_ioctl+0x1a/0x20
>> do_syscall_64+0x5a/0x110
>> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> ---[ end trace 33f2f83f3d5250e9 ]---
>> BTRFS: error (device sda1) in create_pending_snapshot:1576: errno=-24 unknown
>> BTRFS info (device sda1): forced readonly
>> BTRFS warning (device sda1): Skipping commit of aborted transaction.
>> BTRFS: error (device sda1) in cleanup_transaction:1831: errno=-24 unknown
>>
>> [CAUSE]
>> When the global anonymous block device pool is exhausted, the following
>> call chain will fail, and lead to transaction abort:
>>
>> btrfs_ioctl_snap_create_v2()
>> |- btrfs_ioctl_snap_create_transid()
>> |- btrfs_mksubvol()
>> |- btrfs_commit_transaction()
>> |- create_pending_snapshot()
>> |- btrfs_get_fs_root()
>> |- btrfs_init_fs_root()
>> |- get_anon_bdev()
>>
>> [FIX]
>> Although we can't enlarge the anonymous block device pool, at least we
>> can preallocate anon_dev for subvolume/snapshot creation.
>> So that when the pool is exhausted, user will get an error other than
>> aborting transaction later.
>>
>> Reported-by: Greed Rong <greedrong@gmail.com>
>> Link: https://lore.kernel.org/linux-btrfs/CA+UqX+NTrZ6boGnWHhSeZmEY5J76CTqmYjO2S+=tHJX7nb9DPw@mail.gmail.com/
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> fs/btrfs/disk-io.c | 51 ++++++++++++++++++++++++++++++++++++------
>> fs/btrfs/disk-io.h | 2 ++
>> fs/btrfs/ioctl.c | 21 ++++++++++++++++-
>> fs/btrfs/transaction.c | 3 ++-
>> fs/btrfs/transaction.h | 2 ++
>> 5 files changed, 70 insertions(+), 9 deletions(-)
>>
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index cfc0ff288238..14fd69b71cb8 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -1395,7 +1395,7 @@ struct btrfs_root *btrfs_read_tree_root(struct btrfs_root *tree_root,
>> goto out;
>> }
>>
>> -static int btrfs_init_fs_root(struct btrfs_root *root)
>> +static int btrfs_init_fs_root(struct btrfs_root *root, dev_t anon_dev)
>> {
>> int ret;
>> unsigned int nofs_flag;
>> @@ -1435,9 +1435,13 @@ static int btrfs_init_fs_root(struct btrfs_root *root)
>> */
>> if (is_fstree(root->root_key.objectid) &&
>> btrfs_root_refs(&root->root_item)) {
>> - ret = get_anon_bdev(&root->anon_dev);
>> - if (ret)
>> - goto fail;
>> + if (!anon_dev) {
>> + ret = get_anon_bdev(&root->anon_dev);
>> + if (ret)
>> + goto fail;
>> + } else {
>> + root->anon_dev = anon_dev;
>> + }
>> }
>>
>> mutex_lock(&root->objectid_mutex);
>> @@ -1542,8 +1546,27 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info)
>> }
>>
>>
>> -struct btrfs_root *btrfs_get_fs_root(struct btrfs_fs_info *fs_info,
>> - u64 objectid, bool check_ref)
>> +/*
>> + * Get a fs root.
>> + *
>> + * For essential trees like root/extent tree, we grab it from fs_info directly.
>> + * For subvolume trees, we check the cached fs roots first. If miss then
>> + * read it from disk and add it to cached fs roots.
>> + *
>> + * Caller should release the root by calling btrfs_put_root() after the usage.
>> + *
>> + * NOTE: Reloc and log trees can't be read by this function as they share the
>> + * same root objectid.
>> + *
>> + * @objectid: Root (subvolume) id
>> + * @anon_dev: Preallocated anonymous block device number for new roots.
>> + * Pass 0 for automatic allocation.
>> + * @check_ref: Whether to check root refs. If true, return -ENOENT for orphan
>> + * roots.
>> + */
>> +static struct btrfs_root *__get_fs_root(struct btrfs_fs_info *fs_info,
>> + u64 objectid, dev_t anon_dev,
>> + bool check_ref)
>
>
>> +struct btrfs_root *btrfs_get_fs_root(struct btrfs_fs_info *fs_info,
>> + u64 objectid, bool check_ref)
>> +{
>> + return __get_fs_root(fs_info, objectid, 0, check_ref);
>> +}
>> +
>> +struct btrfs_root *btrfs_get_new_fs_root(struct btrfs_fs_info *fs_info,
>> + u64 objectid, dev_t anon_dev)
>> +{
>> + return __get_fs_root(fs_info, objectid, anon_dev, true);
>> +}
>
> This does not look like a good API, we should keep btrfs_get_fs_root and
> add the anon_bdev initialization to the callers, there are only a few.
>
OK, I'd go that direction.
Thanks,
Qu
next prev parent reply other threads:[~2020-06-16 22:55 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-16 2:17 [PATCH 0/4] btrfs: workaround exhausted anonymous block device pool Qu Wenruo
2020-06-16 2:17 ` [PATCH 1/4] btrfs: disk-io: don't allocate anonymous block device for user invisible roots Qu Wenruo
2020-06-16 19:21 ` Josef Bacik
2020-06-16 2:17 ` [PATCH 2/4] btrfs: detect uninitialized btrfs_root::anon_dev for user visible subvolumes Qu Wenruo
2020-06-16 19:25 ` Josef Bacik
2020-06-16 22:49 ` Qu Wenruo
2020-06-16 23:32 ` Josef Bacik
2020-06-16 23:49 ` Qu Wenruo
2020-06-17 11:31 ` David Sterba
2020-06-17 13:37 ` Josef Bacik
2020-06-17 23:39 ` Qu Wenruo
2020-06-16 2:17 ` [PATCH 3/4] btrfs: preallocate anon_dev for subvolume and snapshot creation Qu Wenruo
2020-06-16 15:10 ` David Sterba
2020-06-16 22:54 ` Qu Wenruo [this message]
2020-07-01 3:25 ` Qu Wenruo
2020-07-01 17:39 ` David Sterba
2020-07-01 23:56 ` Qu Wenruo
2020-07-02 16:08 ` David Sterba
2020-07-02 23:46 ` David Sterba
2020-07-03 5:19 ` Qu Wenruo
2020-07-03 12:29 ` David Sterba
2020-07-03 12:39 ` Qu Wenruo
2020-06-16 2:17 ` [PATCH 4/4] btrfs: free anon_dev earlier to prevent exhausting anonymous block device pool Qu Wenruo
2020-06-16 19:23 ` Josef Bacik
2020-06-16 22:48 ` David Sterba
2020-06-16 23:31 ` Josef Bacik
2020-06-30 14:14 ` [PATCH 0/4] btrfs: workaround exhausted " David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=de0a6073-36e9-9b6a-18e2-21a4df793157@suse.com \
--to=wqu@suse.com \
--cc=dsterba@suse.cz \
--cc=greedrong@gmail.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox