From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.cn.fujitsu.com ([183.91.158.132]:21629 "EHLO heian.cn.fujitsu.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1750705AbdLZINJ (ORCPT ); Tue, 26 Dec 2017 03:13:09 -0500 Subject: Re: [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk() To: Qu Wenruo , References: <20171220045731.19343-1-suy.fnst@cn.fujitsu.com> <20171220045731.19343-7-suy.fnst@cn.fujitsu.com> <37268331-348d-fe29-db97-e289bb2dea1c@gmx.com> <3d829151-9b9a-9eae-e76d-b5bcd6e8c5fd@cn.fujitsu.com> <399b00c9-3ab5-e777-a69d-4cbaf3641ddb@gmx.com> <45d9878a-b6cd-fd0d-5309-05e17c9db51a@cn.fujitsu.com> <87e23c3c-73a9-47c6-e68c-b79186678bfa@gmx.com> From: Su Yue Message-ID: <262f2016-4b63-49db-11d1-83e4ab964957@cn.fujitsu.com> Date: Tue, 26 Dec 2017 16:17:11 +0800 MIME-Version: 1.0 In-Reply-To: <87e23c3c-73a9-47c6-e68c-b79186678bfa@gmx.com> Content-Type: text/plain; charset="utf-8"; format=flowed Sender: linux-btrfs-owner@vger.kernel.org List-ID: On 12/21/2017 03:12 PM, Qu Wenruo wrote: > > > On 2017年12月21日 15:09, Su Yue wrote: >> >> >> On 12/21/2017 02:51 PM, Qu Wenruo wrote: >>> >>> >>> On 2017年12月20日 16:37, Qu Wenruo wrote: >>>> >>>> >>>> On 2017年12月20日 16:21, Su Yue wrote: >>>>> >>>>> >>>>> On 12/20/2017 01:41 PM, Qu Wenruo wrote: >>>>>> >>>>>> >>>>>> On 2017年12月20日 12:57, Su Yue wrote: >>>>>>> Introduce create_chunk_and_block_block_group() to allocate new chunk >>>>>>> and corresponding block group. >>>>>>> >>>>>>> The new function force_cow_in_new_chunk() first allocates new chunk >>>>>>> and records its start. >>>>>>> Then it modifies all metadata block groups cached and full. >>>>>>> Finally it marks the new block group uncached and unfree. >>>>>>> In the next CoW, extents states will be updated automatically by >>>>>>> cache_block_group(). >>>>>>> >>>>>>> Suggested-by: Qu Wenruo >>>>>>> Signed-off-by: Su Yue >>>>>>> --- >>>>>>>    cmds-check.c | 80 >>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >>>>>>>    1 file changed, 80 insertions(+) >>>>>>> >>>>>>> diff --git a/cmds-check.c b/cmds-check.c >>>>>>> index d98d4bda7357..311c8a9f45e8 100644 >>>>>>> --- a/cmds-check.c >>>>>>> +++ b/cmds-check.c >>>>>>> @@ -10911,6 +10911,86 @@ out: >>>>>>>        return ret; >>>>>>>    } >>>>>>>    +static int create_chunk_and_block_group(struct btrfs_fs_info >>>>>>> *fs_info, >>>>>>> +                    u64 flags, u64 *start, u64 *nbytes) >>>>>>> +{ >>>>>>> +    struct btrfs_trans_handle *trans; >>>>>>> +    struct btrfs_root *root = fs_info->extent_root; >>>>>>> +    int ret; >>>>>>> + >>>>>>> +    if ((flags & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0) >>>>>>> +        return -EINVAL; >>>>>>> + >>>>>>> +    trans = btrfs_start_transaction(root, 1); >>>>>>> +    if (IS_ERR(trans)) { >>>>>>> +        ret = PTR_ERR(trans); >>>>>>> +        error("error starting transaction %s", strerror(-ret)); >>>>>>> +        return ret; >>>>>>> +    } >>>>>>> +    ret = btrfs_alloc_chunk(trans, fs_info, start, nbytes, flags); >>>>>>> +    if (ret) { >>>>>>> +        error("fail to allocate new chunk %s", strerror(-ret)); >>>>>>> +        goto out; >>>>>>> +    } >>>>>>> +    ret = btrfs_make_block_group(trans, fs_info, 0, flags, >>>>>>> +                 BTRFS_FIRST_CHUNK_TREE_OBJECTID, *start, *nbytes); >>>>>>> +    if (ret) { >>>>>>> +        error("fail to make block group for chunk %llu %llu %s", >>>>>>> +              *start, *nbytes, strerror(-ret)); >>>>>>> +        goto out; >>>>>>> +    } >>>>>>> +out: >>>>>>> +    btrfs_commit_transaction(trans, root); >>>>>>> +    return ret; >>>>>>> +} >>>>>>> + >>>>>>> +static int force_cow_in_new_chunk(struct btrfs_fs_info *fs_info) >>>>>>> +{ >>>>>>> +    struct btrfs_block_group_cache *bg; >>>>>>> +    u64 start; >>>>>>> +    u64 nbytes; >>>>>>> +    u64 alloc_profile; >>>>>>> +    u64 flags; >>>>>>> +    int ret; >>>>>>> + >>>>>>> +    alloc_profile = (fs_info->avail_metadata_alloc_bits & >>>>>>> +             fs_info->metadata_alloc_profile); >>>>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA | alloc_profile; >>>>>>> +    if (btrfs_fs_incompat(fs_info, MIXED_GROUPS)) >>>>>>> +        flags |= BTRFS_BLOCK_GROUP_DATA; >>>>>>> + >>>>>>> +    ret = create_chunk_and_block_group(fs_info, flags, &start, >>>>>>> &nbytes); >>>>>> >>>>>> Why bother allocating a new chunk by yourself? >>>>>> >>>>>> Just mark all block groups full and that's all. >>>>>> >>>>>> Any later tree block allocation like btrfs_search_slot() with @cow = 1 >>>>>> will trigger chunk allocation automatically. >>>>>> >>>>> >>>>> Tried to let it happen but BUG_ON with -ENOSPC. >>>> >>>> Then fix it. >>>> >>>> It's not a normal behavior in this case. >>>> >>>> Thanks, >>>> Qu >>> >>> And I think the fix to allow btrfs_reserve_extent() to allocate new >>> chunk will solve a lot of strange BUG_ON(). >>> >>> it just occurred to me that, a lot of use cases relies on the assumption >>> that btrfs_reserve_extent() will try to allocate new chunks. >>> >>> Especially for case like convert, certain btrfs check --repair, some >>> rescue tools. >>> >> Sorry for the previous wrong format mail. >> >> Yes, it has many dependency so I considered to do chunk allocation >> manually in the patchset. If fix is not good enough, many funtions >> of btrfs-progs will behave abnormal. >> Since you ask it, I will go to fix it. > > Manually allocation in advance has its advantage, like we can determine > if there is enough space for new chunk instead of checking every return > value with ENOSPC. > > > However in current case, your metadata usage is limited to the new chunk > only. > If there extent tree has quite a lot of problem, and the chunk allocated > is small (if using single profile and small fs), it can easily hit > ENOSPC again, since btrfs doesn't allocate new chunk for later metadata > write. > SAD. After I tried to implement above nice idea, infinite recursive brings me back to the reality. Here is the reason why btrfs_reserve_extent can not allocate chunk by itself if ENOSPC hints: btrfs_cow_block ... btrfs_reserve_extent btrfs_alloc_chunk btrfs_alloc_dev_extent btrfs_insert_empty_item ... btrfs_cow_block Thanks, Su > So here, we still need to allow btrfs allocate new meta chunk, even we > pre-allocate one meta chunk in advance. > > Thanks, > Qu >> >> Thanks, >> Su >> >>> So this would be a quite nice start point for such fix. >>> >>> Thanks, >>> Qu >>> >>>> >>>>> Do you mean do_chunk_alloc() in btrfs_reserve_extent() which is called >>>>> while doing CoW?> In progs, allocation of new chunk during CoW >>>>> depends @root->ref_cows. >>>>> However, @root->ref_cows should be set only if @root is root of a fs >>>>> trees. >>>>> >>>>> Thanks, >>>>> Su >>>>> >>>>>> Thanks, >>>>>> Qu >>>>>> >>>>>>> +    if (!ret) >>>>>>> +        printf("Create new chunk %llu %llu\n", start, nbytes); >>>>>>> +    else >>>>>>> +        goto err; >>>>>>> + >>>>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA; >>>>>>> +    /* Mark all metadata block groups cached and full in free >>>>>>> space*/ >>>>>>> +    ret = modify_block_groups_cache(fs_info, flags, 1); >>>>>>> +    if (ret) >>>>>>> +        goto clear_bg_cache; >>>>>>> + >>>>>>> +    bg = btrfs_lookup_block_group(fs_info, start); >>>>>>> +    if (!bg) { >>>>>>> +        ret = -ENOENT; >>>>>>> +        error("fail to look up block group %llu %llu", start, >>>>>>> nbytes); >>>>>>> +        goto clear_bg_cache; >>>>>>> +    } >>>>>>> + >>>>>>> +    /* Clear block group cache just allocated */ >>>>>>> +    ret = modify_block_group_cache(fs_info, bg, 0); >>>>>>> +    if (ret) >>>>>>> +        goto clear_bg_cache; >>>>>>> + >>>>>>> +    return 0; >>>>>>> + >>>>>>> +clear_bg_cache: >>>>>>> +    modify_block_groups_cache(fs_info, flags, 0); >>>>>>> +err: >>>>>>> +    return ret; >>>>>>> +} >>>>>>> + >>>>>>>    static int check_extent_refs(struct btrfs_root *root, >>>>>>>                     struct cache_tree *extent_cache) >>>>>>>    { >>>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe >>>>> linux-btrfs" in >>>>> the body of a message to majordomo@vger.kernel.org >>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html >>>> >>> >> >> >