From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Su Yue <suy.fnst@cn.fujitsu.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk()
Date: Tue, 26 Dec 2017 18:28:32 +0800 [thread overview]
Message-ID: <802a5de7-c058-ac24-2ba7-dbe911f858a3@gmx.com> (raw)
In-Reply-To: <262f2016-4b63-49db-11d1-83e4ab964957@cn.fujitsu.com>
[-- Attachment #1.1: Type: text/plain, Size: 8672 bytes --]
On 2017年12月26日 16:17, Su Yue wrote:
>
>
> On 12/21/2017 03:12 PM, Qu Wenruo wrote:
>>
>>
>> On 2017年12月21日 15:09, Su Yue wrote:
>>>
>>>
>>> On 12/21/2017 02:51 PM, Qu Wenruo wrote:
>>>>
>>>>
>>>> On 2017年12月20日 16:37, Qu Wenruo wrote:
>>>>>
>>>>>
>>>>> On 2017年12月20日 16:21, Su Yue wrote:
>>>>>>
>>>>>>
>>>>>> On 12/20/2017 01:41 PM, Qu Wenruo wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2017年12月20日 12:57, Su Yue wrote:
>>>>>>>> Introduce create_chunk_and_block_block_group() to allocate new
>>>>>>>> chunk
>>>>>>>> and corresponding block group.
>>>>>>>>
>>>>>>>> The new function force_cow_in_new_chunk() first allocates new chunk
>>>>>>>> and records its start.
>>>>>>>> Then it modifies all metadata block groups cached and full.
>>>>>>>> Finally it marks the new block group uncached and unfree.
>>>>>>>> In the next CoW, extents states will be updated automatically by
>>>>>>>> cache_block_group().
>>>>>>>>
>>>>>>>> Suggested-by: Qu Wenruo <wqu@suse.com>
>>>>>>>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>>>>>>>> ---
>>>>>>>> cmds-check.c | 80
>>>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>> 1 file changed, 80 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/cmds-check.c b/cmds-check.c
>>>>>>>> index d98d4bda7357..311c8a9f45e8 100644
>>>>>>>> --- a/cmds-check.c
>>>>>>>> +++ b/cmds-check.c
>>>>>>>> @@ -10911,6 +10911,86 @@ out:
>>>>>>>> return ret;
>>>>>>>> }
>>>>>>>> +static int create_chunk_and_block_group(struct btrfs_fs_info
>>>>>>>> *fs_info,
>>>>>>>> + u64 flags, u64 *start, u64 *nbytes)
>>>>>>>> +{
>>>>>>>> + struct btrfs_trans_handle *trans;
>>>>>>>> + struct btrfs_root *root = fs_info->extent_root;
>>>>>>>> + int ret;
>>>>>>>> +
>>>>>>>> + if ((flags & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0)
>>>>>>>> + return -EINVAL;
>>>>>>>> +
>>>>>>>> + trans = btrfs_start_transaction(root, 1);
>>>>>>>> + if (IS_ERR(trans)) {
>>>>>>>> + ret = PTR_ERR(trans);
>>>>>>>> + error("error starting transaction %s", strerror(-ret));
>>>>>>>> + return ret;
>>>>>>>> + }
>>>>>>>> + ret = btrfs_alloc_chunk(trans, fs_info, start, nbytes, flags);
>>>>>>>> + if (ret) {
>>>>>>>> + error("fail to allocate new chunk %s", strerror(-ret));
>>>>>>>> + goto out;
>>>>>>>> + }
>>>>>>>> + ret = btrfs_make_block_group(trans, fs_info, 0, flags,
>>>>>>>> + BTRFS_FIRST_CHUNK_TREE_OBJECTID, *start,
>>>>>>>> *nbytes);
>>>>>>>> + if (ret) {
>>>>>>>> + error("fail to make block group for chunk %llu %llu %s",
>>>>>>>> + *start, *nbytes, strerror(-ret));
>>>>>>>> + goto out;
>>>>>>>> + }
>>>>>>>> +out:
>>>>>>>> + btrfs_commit_transaction(trans, root);
>>>>>>>> + return ret;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static int force_cow_in_new_chunk(struct btrfs_fs_info *fs_info)
>>>>>>>> +{
>>>>>>>> + struct btrfs_block_group_cache *bg;
>>>>>>>> + u64 start;
>>>>>>>> + u64 nbytes;
>>>>>>>> + u64 alloc_profile;
>>>>>>>> + u64 flags;
>>>>>>>> + int ret;
>>>>>>>> +
>>>>>>>> + alloc_profile = (fs_info->avail_metadata_alloc_bits &
>>>>>>>> + fs_info->metadata_alloc_profile);
>>>>>>>> + flags = BTRFS_BLOCK_GROUP_METADATA | alloc_profile;
>>>>>>>> + if (btrfs_fs_incompat(fs_info, MIXED_GROUPS))
>>>>>>>> + flags |= BTRFS_BLOCK_GROUP_DATA;
>>>>>>>> +
>>>>>>>> + ret = create_chunk_and_block_group(fs_info, flags, &start,
>>>>>>>> &nbytes);
>>>>>>>
>>>>>>> Why bother allocating a new chunk by yourself?
>>>>>>>
>>>>>>> Just mark all block groups full and that's all.
>>>>>>>
>>>>>>> Any later tree block allocation like btrfs_search_slot() with
>>>>>>> @cow = 1
>>>>>>> will trigger chunk allocation automatically.
>>>>>>>
>>>>>>
>>>>>> Tried to let it happen but BUG_ON with -ENOSPC.
>>>>>
>>>>> Then fix it.
>>>>>
>>>>> It's not a normal behavior in this case.
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>
>>>> And I think the fix to allow btrfs_reserve_extent() to allocate new
>>>> chunk will solve a lot of strange BUG_ON().
>>>>
>>>> it just occurred to me that, a lot of use cases relies on the
>>>> assumption
>>>> that btrfs_reserve_extent() will try to allocate new chunks.
>>>>
>>>> Especially for case like convert, certain btrfs check --repair, some
>>>> rescue tools.
>>>>
>>> Sorry for the previous wrong format mail.
>>>
>>> Yes, it has many dependency so I considered to do chunk allocation
>>> manually in the patchset. If fix is not good enough, many funtions
>>> of btrfs-progs will behave abnormal.
>>> Since you ask it, I will go to fix it.
>>
>> Manually allocation in advance has its advantage, like we can determine
>> if there is enough space for new chunk instead of checking every return
>> value with ENOSPC.
>>
>>
>> However in current case, your metadata usage is limited to the new chunk
>> only.
>> If there extent tree has quite a lot of problem, and the chunk allocated
>> is small (if using single profile and small fs), it can easily hit
>> ENOSPC again, since btrfs doesn't allocate new chunk for later metadata
>> write.
>>
> SAD. After I tried to implement above nice idea, infinite recursive
> brings me back to the reality.
>
> Here is the reason why btrfs_reserve_extent can not allocate chunk
> by itself if ENOSPC hints:
>
> btrfs_cow_block
> ...
> btrfs_reserve_extent
> btrfs_alloc_chunk
> btrfs_alloc_dev_extent
> btrfs_insert_empty_item
> ...
> btrfs_cow_block
The order is just wrong.
We should first check for the new chunk space, not doing the insert
right now.
I'll take over the work if you're OK with it.
Thanks,
Qu
>
> Thanks,
> Su
>
>> So here, we still need to allow btrfs allocate new meta chunk, even we
>> pre-allocate one meta chunk in advance.
>>
>> Thanks,
>> Qu
>>>
>>> Thanks,
>>> Su
>>>
>>>> So this would be a quite nice start point for such fix.
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>>
>>>>>> Do you mean do_chunk_alloc() in btrfs_reserve_extent() which is
>>>>>> called
>>>>>> while doing CoW?> In progs, allocation of new chunk during CoW
>>>>>> depends @root->ref_cows.
>>>>>> However, @root->ref_cows should be set only if @root is root of a fs
>>>>>> trees.
>>>>>>
>>>>>> Thanks,
>>>>>> Su
>>>>>>
>>>>>>> Thanks,
>>>>>>> Qu
>>>>>>>
>>>>>>>> + if (!ret)
>>>>>>>> + printf("Create new chunk %llu %llu\n", start, nbytes);
>>>>>>>> + else
>>>>>>>> + goto err;
>>>>>>>> +
>>>>>>>> + flags = BTRFS_BLOCK_GROUP_METADATA;
>>>>>>>> + /* Mark all metadata block groups cached and full in free
>>>>>>>> space*/
>>>>>>>> + ret = modify_block_groups_cache(fs_info, flags, 1);
>>>>>>>> + if (ret)
>>>>>>>> + goto clear_bg_cache;
>>>>>>>> +
>>>>>>>> + bg = btrfs_lookup_block_group(fs_info, start);
>>>>>>>> + if (!bg) {
>>>>>>>> + ret = -ENOENT;
>>>>>>>> + error("fail to look up block group %llu %llu", start,
>>>>>>>> nbytes);
>>>>>>>> + goto clear_bg_cache;
>>>>>>>> + }
>>>>>>>> +
>>>>>>>> + /* Clear block group cache just allocated */
>>>>>>>> + ret = modify_block_group_cache(fs_info, bg, 0);
>>>>>>>> + if (ret)
>>>>>>>> + goto clear_bg_cache;
>>>>>>>> +
>>>>>>>> + return 0;
>>>>>>>> +
>>>>>>>> +clear_bg_cache:
>>>>>>>> + modify_block_groups_cache(fs_info, flags, 0);
>>>>>>>> +err:
>>>>>>>> + return ret;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> static int check_extent_refs(struct btrfs_root *root,
>>>>>>>> struct cache_tree *extent_cache)
>>>>>>>> {
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe
>>>>>> linux-btrfs" in
>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>>
>>>>
>>>
>>>
>>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]
next prev parent reply other threads:[~2017-12-26 10:28 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-12-20 4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
2017-12-20 4:57 ` [PATCH v2 01/17] btrfs-progs: lowmem check: release path in repair_extent_data_item() Su Yue
2017-12-20 4:57 ` [PATCH v2 02/17] btrfs-progs: lowmem check: record returned errors after walk_down_tree_v2() Su Yue
2017-12-29 11:17 ` Nikolay Borisov
2018-01-02 1:44 ` Su Yue
2018-01-02 1:50 ` Su Yue
2017-12-20 4:57 ` [PATCH v2 03/17] btrfs-progs: lowmem check: assign @parent early in repair_extent_data_item() Su Yue
2017-12-20 4:57 ` [PATCH v2 04/17] btrfs-progs: lowmem check: exclude extents of metadata blocks Su Yue
2017-12-20 4:57 ` [PATCH v2 05/17] btrfs-progs: lowmem check: introduce modify_block_groups_cache() Su Yue
2017-12-20 5:38 ` Qu Wenruo
2017-12-20 4:57 ` [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk() Su Yue
2017-12-20 5:41 ` Qu Wenruo
2017-12-20 8:21 ` Su Yue
2017-12-20 8:37 ` Qu Wenruo
2017-12-21 6:51 ` Qu Wenruo
2017-12-21 7:06 ` Su Yue
2017-12-21 7:09 ` Su Yue
2017-12-21 7:12 ` Qu Wenruo
2017-12-26 8:17 ` Su Yue
2017-12-26 10:28 ` Qu Wenruo [this message]
2017-12-27 1:11 ` Su Yue
2018-01-09 7:44 ` Qu Wenruo
2017-12-20 4:57 ` [PATCH v2 07/17] btrfs-progs: lowmem check: introduce try_avoid_extents_overwrite() Su Yue
2017-12-20 5:46 ` Qu Wenruo
2017-12-20 4:57 ` [PATCH v2 08/17] btrfs-progs: lowmem check: exclude extents if init-extent-tree in lowmem Su Yue
2017-12-20 4:57 ` [PATCH v2 09/17] btrfs-progs: lowmem check: start to remove parameters @trans " Su Yue
2017-12-20 5:51 ` Qu Wenruo
2017-12-20 4:57 ` [PATCH v2 10/17] btrfs-progs: lowmem check: remove parameter @trans of delete_extent_item() Su Yue
2017-12-20 4:57 ` [PATCH v2 11/17] btrfs-progs: lowmem check: remove parameter @trans of repair_chunk_item() Su Yue
2017-12-20 4:57 ` [PATCH v2 12/17] btrfs-progs: lowmem check: remove parameter @trans of repair_extent_item() Su Yue
2017-12-20 4:57 ` [PATCH v2 13/17] btrfs-progs: lowmem check: remove parameter @trans of check_leaf_items() Su Yue
2017-12-20 4:57 ` [PATCH v2 14/17] btrfs-progs: lowmem check: remove parameter @trans of repair_tree_back_ref() Su Yue
2017-12-20 4:57 ` [PATCH v2 15/17] btrfs-progs: lowmem check: remove parameter @trans of check_btrfs_root() Su Yue
2017-12-20 4:57 ` [PATCH v2 16/17] btrfs-progs: lowmem check: introduce repair_block_accounting() Su Yue
2017-12-20 4:57 ` [PATCH v2 17/17] btrfs-progs: lowmem check: end of removing parameters @trans in lowmem Su Yue
2017-12-20 5:59 ` [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Qu Wenruo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=802a5de7-c058-ac24-2ba7-dbe911f858a3@gmx.com \
--to=quwenruo.btrfs@gmx.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=suy.fnst@cn.fujitsu.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox