* [PATCH] btrfs: delete chunk allocation attemp when setting block group ro
From: Shaohua Li @ 2015-01-08 21:23 UTC
To: linux-btrfs; +Cc: Kernel-team

The test below currently fails:
mkfs.ext4 -F /dev/sda
btrfs-convert /dev/sda
mount /dev/sda /mnt
btrfs device add -f /dev/sdb /mnt
btrfs balance start -v -dconvert=raid1 -mconvert=raid1 /mnt

The reason is that some block groups have usage 0, but the whole disk
has no free space to allocate a new chunk, so we cannot even set such a
block group read-only. This patch deletes the chunk allocation when
setting a block group ro. For META, we already have a reserve. For
SYSTEM we do not, so check_system_chunk is still required.

Signed-off-by: Shaohua Li <shli@fb.com>
---
 fs/btrfs/extent-tree.c | 31 +++++++------------------------
 1 file changed, 7 insertions(+), 24 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a80b971..430101b6 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -8493,22 +8493,8 @@ static int set_block_group_ro(struct btrfs_block_group_cache *cache, int force)
 {
 	struct btrfs_space_info *sinfo = cache->space_info;
 	u64 num_bytes;
-	u64 min_allocable_bytes;
 	int ret = -ENOSPC;

-
-	/*
-	 * We need some metadata space and system metadata space for
-	 * allocating chunks in some corner cases until we force to set
-	 * it to be readonly.
-	 */
-	if ((sinfo->flags &
-	     (BTRFS_BLOCK_GROUP_SYSTEM | BTRFS_BLOCK_GROUP_METADATA)) &&
-	    !force)
-		min_allocable_bytes = 1 * 1024 * 1024;
-	else
-		min_allocable_bytes = 0;
-
 	spin_lock(&sinfo->lock);
 	spin_lock(&cache->lock);

@@ -8521,8 +8507,8 @@ static int set_block_group_ro(struct btrfs_block_group_cache *cache, int force)
 		    cache->bytes_super - btrfs_block_group_used(&cache->item);

 	if (sinfo->bytes_used + sinfo->bytes_reserved + sinfo->bytes_pinned +
-	    sinfo->bytes_may_use + sinfo->bytes_readonly + num_bytes +
-	    min_allocable_bytes <= sinfo->total_bytes) {
+	    sinfo->bytes_may_use + sinfo->bytes_readonly + num_bytes
+	    <= sinfo->total_bytes) {
 		sinfo->bytes_readonly += num_bytes;
 		cache->ro = 1;
 		list_add_tail(&cache->ro_list, &sinfo->ro_bgs);
@@ -8548,14 +8534,6 @@ int btrfs_set_block_group_ro(struct btrfs_root *root,
 	if (IS_ERR(trans))
 		return PTR_ERR(trans);

-	alloc_flags = update_block_group_flags(root, cache->flags);
-	if (alloc_flags != cache->flags) {
-		ret = do_chunk_alloc(trans, root, alloc_flags,
-				     CHUNK_ALLOC_FORCE);
-		if (ret < 0)
-			goto out;
-	}
-
 	ret = set_block_group_ro(cache, 0);
 	if (!ret)
 		goto out;
@@ -8566,6 +8544,11 @@ int btrfs_set_block_group_ro(struct btrfs_root *root,
 		goto out;
 	ret = set_block_group_ro(cache, 0);
 out:
+	if (cache->flags & BTRFS_BLOCK_GROUP_SYSTEM) {
+		alloc_flags = update_block_group_flags(root, cache->flags);
+		check_system_chunk(trans, root, alloc_flags);
+	}
+
 	btrfs_end_transaction(trans, root);
 	return ret;
 }
-- 
1.8.1

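To make the arithmetic in the second hunk of the diff above concrete, here is a minimal user-space sketch of the space check in set_block_group_ro(). It is only a model under stated assumptions: struct space_info_model and can_set_ro() are hypothetical stand-ins rather than kernel code, and only the counter names and the 1 MiB constant are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for the counters used from struct btrfs_space_info. */
struct space_info_model {
	uint64_t total_bytes;
	uint64_t bytes_used;
	uint64_t bytes_reserved;
	uint64_t bytes_pinned;
	uint64_t bytes_may_use;
	uint64_t bytes_readonly;
};

/* Reserve the pre-patch code held back for METADATA/SYSTEM groups. */
#define MIN_ALLOCABLE_BYTES (1ULL * 1024 * 1024)

/*
 * Model of the condition in the diff: the block group may go read-only
 * only if everything already accounted for, plus the unused bytes of the
 * group (num_bytes), plus an optional reserve, still fits in total_bytes.
 * Pass 0 as min_allocable_bytes to model the patched check.
 */
bool can_set_ro(const struct space_info_model *s, uint64_t num_bytes,
		uint64_t min_allocable_bytes)
{
	assert(s != NULL);

	return s->bytes_used + s->bytes_reserved + s->bytes_pinned +
	       s->bytes_may_use + s->bytes_readonly + num_bytes +
	       min_allocable_bytes <= s->total_bytes;
}
```

For example, with 8 MiB total, 4 MiB used, and an empty 4 MiB block group (num_bytes of 4 MiB), the patched check passes, while the old check with the 1 MiB reserve does not.
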
* Re: [PATCH] btrfs: delete chunk allocation attemp when setting block group ro
From: Miao Xie @ 2015-01-09 1:01 UTC
To: Shaohua Li, linux-btrfs; +Cc: Kernel-team

On Thu, 08 Jan 2015 13:23:13 -0800, Shaohua Li wrote:
> The test below currently fails:
> mkfs.ext4 -F /dev/sda
> btrfs-convert /dev/sda
> mount /dev/sda /mnt
> btrfs device add -f /dev/sdb /mnt
> btrfs balance start -v -dconvert=raid1 -mconvert=raid1 /mnt
>
> The reason is that some block groups have usage 0, but the whole disk
> has no free space to allocate a new chunk, so we cannot even set such a
> block group read-only. This patch deletes the chunk allocation when
> setting a block group ro. For META, we already have a reserve. For
> SYSTEM we do not, so check_system_chunk is still required.
>
> Signed-off-by: Shaohua Li <shli@fb.com>
> ---
>  fs/btrfs/extent-tree.c | 31 +++++++------------------------
>  1 file changed, 7 insertions(+), 24 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index a80b971..430101b6 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -8493,22 +8493,8 @@ static int set_block_group_ro(struct btrfs_block_group_cache *cache, int force)
>  {
>  	struct btrfs_space_info *sinfo = cache->space_info;
>  	u64 num_bytes;
> -	u64 min_allocable_bytes;
>  	int ret = -ENOSPC;
>
> -
> -	/*
> -	 * We need some metadata space and system metadata space for
> -	 * allocating chunks in some corner cases until we force to set
> -	 * it to be readonly.
> -	 */
> -	if ((sinfo->flags &
> -	     (BTRFS_BLOCK_GROUP_SYSTEM | BTRFS_BLOCK_GROUP_METADATA)) &&
> -	    !force)
> -		min_allocable_bytes = 1 * 1024 * 1024;
> -	else
> -		min_allocable_bytes = 0;
> -
>  	spin_lock(&sinfo->lock);
>  	spin_lock(&cache->lock);
>
> @@ -8521,8 +8507,8 @@ static int set_block_group_ro(struct btrfs_block_group_cache *cache, int force)
>  		    cache->bytes_super - btrfs_block_group_used(&cache->item);
>
>  	if (sinfo->bytes_used + sinfo->bytes_reserved + sinfo->bytes_pinned +
> -	    sinfo->bytes_may_use + sinfo->bytes_readonly + num_bytes +
> -	    min_allocable_bytes <= sinfo->total_bytes) {
> +	    sinfo->bytes_may_use + sinfo->bytes_readonly + num_bytes
> +	    <= sinfo->total_bytes) {
>  		sinfo->bytes_readonly += num_bytes;
>  		cache->ro = 1;
>  		list_add_tail(&cache->ro_list, &sinfo->ro_bgs);
> @@ -8548,14 +8534,6 @@ int btrfs_set_block_group_ro(struct btrfs_root *root,
>  	if (IS_ERR(trans))
>  		return PTR_ERR(trans);
>
> -	alloc_flags = update_block_group_flags(root, cache->flags);
> -	if (alloc_flags != cache->flags) {
> -		ret = do_chunk_alloc(trans, root, alloc_flags,
> -				     CHUNK_ALLOC_FORCE);
> -		if (ret < 0)
> -			goto out;
> -	}
> -
>  	ret = set_block_group_ro(cache, 0);
>  	if (!ret)
>  		goto out;
> @@ -8566,6 +8544,11 @@ int btrfs_set_block_group_ro(struct btrfs_root *root,
>  		goto out;
>  	ret = set_block_group_ro(cache, 0);
>  out:
> +	if (cache->flags & BTRFS_BLOCK_GROUP_SYSTEM) {
> +		alloc_flags = update_block_group_flags(root, cache->flags);
> +		check_system_chunk(trans, root, alloc_flags);

Please consider the case that the following patch fixed:
	199c36eaa95077a47ae1bc55532fc0fbeb80cc95

If there is no free device space, check_system_chunk cannot allocate a
new system metadata chunk, so when we run the final step of the chunk
allocation to update the device item and insert the new chunk item, we
would fail.

Thanks
Miao

> +	}
> +
>  	btrfs_end_transaction(trans, root);
>  	return ret;
>  }
>

* Re: [PATCH] btrfs: delete chunk allocation attemp when setting block group ro
From: Shaohua Li @ 2015-01-09 2:06 UTC
To: Miao Xie; +Cc: linux-btrfs, Kernel-team

On Fri, Jan 09, 2015 at 09:01:57AM +0800, Miao Xie wrote:
> On Thu, 08 Jan 2015 13:23:13 -0800, Shaohua Li wrote:
> > The test below currently fails:
> > mkfs.ext4 -F /dev/sda
> > btrfs-convert /dev/sda
> > mount /dev/sda /mnt
> > btrfs device add -f /dev/sdb /mnt
> > btrfs balance start -v -dconvert=raid1 -mconvert=raid1 /mnt
> >
> > The reason is that some block groups have usage 0, but the whole disk
> > has no free space to allocate a new chunk, so we cannot even set such a
> > block group read-only. This patch deletes the chunk allocation when
> > setting a block group ro. For META, we already have a reserve. For
> > SYSTEM we do not, so check_system_chunk is still required.
> >
> > Signed-off-by: Shaohua Li <shli@fb.com>
> > ---
> >  fs/btrfs/extent-tree.c | 31 +++++++------------------------
> >  1 file changed, 7 insertions(+), 24 deletions(-)
> >
> > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> > index a80b971..430101b6 100644
> > --- a/fs/btrfs/extent-tree.c
> > +++ b/fs/btrfs/extent-tree.c
> > @@ -8493,22 +8493,8 @@ static int set_block_group_ro(struct btrfs_block_group_cache *cache, int force)
> >  {
> >  	struct btrfs_space_info *sinfo = cache->space_info;
> >  	u64 num_bytes;
> > -	u64 min_allocable_bytes;
> >  	int ret = -ENOSPC;
> >
> > -
> > -	/*
> > -	 * We need some metadata space and system metadata space for
> > -	 * allocating chunks in some corner cases until we force to set
> > -	 * it to be readonly.
> > -	 */
> > -	if ((sinfo->flags &
> > -	     (BTRFS_BLOCK_GROUP_SYSTEM | BTRFS_BLOCK_GROUP_METADATA)) &&
> > -	    !force)
> > -		min_allocable_bytes = 1 * 1024 * 1024;
> > -	else
> > -		min_allocable_bytes = 0;
> > -
> >  	spin_lock(&sinfo->lock);
> >  	spin_lock(&cache->lock);
> >
> > @@ -8521,8 +8507,8 @@ static int set_block_group_ro(struct btrfs_block_group_cache *cache, int force)
> >  		    cache->bytes_super - btrfs_block_group_used(&cache->item);
> >
> >  	if (sinfo->bytes_used + sinfo->bytes_reserved + sinfo->bytes_pinned +
> > -	    sinfo->bytes_may_use + sinfo->bytes_readonly + num_bytes +
> > -	    min_allocable_bytes <= sinfo->total_bytes) {
> > +	    sinfo->bytes_may_use + sinfo->bytes_readonly + num_bytes
> > +	    <= sinfo->total_bytes) {
> >  		sinfo->bytes_readonly += num_bytes;
> >  		cache->ro = 1;
> >  		list_add_tail(&cache->ro_list, &sinfo->ro_bgs);
> > @@ -8548,14 +8534,6 @@ int btrfs_set_block_group_ro(struct btrfs_root *root,
> >  	if (IS_ERR(trans))
> >  		return PTR_ERR(trans);
> >
> > -	alloc_flags = update_block_group_flags(root, cache->flags);
> > -	if (alloc_flags != cache->flags) {
> > -		ret = do_chunk_alloc(trans, root, alloc_flags,
> > -				     CHUNK_ALLOC_FORCE);
> > -		if (ret < 0)
> > -			goto out;
> > -	}
> > -
> >  	ret = set_block_group_ro(cache, 0);
> >  	if (!ret)
> >  		goto out;
> > @@ -8566,6 +8544,11 @@ int btrfs_set_block_group_ro(struct btrfs_root *root,
> >  		goto out;
> >  	ret = set_block_group_ro(cache, 0);
> >  out:
> > +	if (cache->flags & BTRFS_BLOCK_GROUP_SYSTEM) {
> > +		alloc_flags = update_block_group_flags(root, cache->flags);
> > +		check_system_chunk(trans, root, alloc_flags);
>
> Please consider the case that the following patch fixed:
> 	199c36eaa95077a47ae1bc55532fc0fbeb80cc95
>
> If there is no free device space, check_system_chunk cannot allocate a
> new system metadata chunk, so when we run the final step of the chunk
> allocation to update the device item and insert the new chunk item, we
> would fail.

So the relocation will always fail in this case. The check just makes
the failure happen earlier, right? We don't have the BUG_ON in
do_chunk_alloc() currently.

Thanks,
Shaohua

* Re: [PATCH] btrfs: delete chunk allocation attemp when setting block group ro
From: Miao Xie @ 2015-01-09 2:50 UTC
To: Shaohua Li; +Cc: linux-btrfs, Kernel-team

On Thu, 08 Jan 2015 18:06:50 -0800, Shaohua Li wrote:
> On Fri, Jan 09, 2015 at 09:01:57AM +0800, Miao Xie wrote:
>> On Thu, 08 Jan 2015 13:23:13 -0800, Shaohua Li wrote:
>>> The test below currently fails:
>>> mkfs.ext4 -F /dev/sda
>>> btrfs-convert /dev/sda
>>> mount /dev/sda /mnt
>>> btrfs device add -f /dev/sdb /mnt
>>> btrfs balance start -v -dconvert=raid1 -mconvert=raid1 /mnt
>>>
>>> The reason is that some block groups have usage 0, but the whole disk
>>> has no free space to allocate a new chunk, so we cannot even set such a
>>> block group read-only. This patch deletes the chunk allocation when
>>> setting a block group ro. For META, we already have a reserve. For
>>> SYSTEM we do not, so check_system_chunk is still required.
>>>
>>> Signed-off-by: Shaohua Li <shli@fb.com>
>>> ---
>>>  fs/btrfs/extent-tree.c | 31 +++++++------------------------
>>>  1 file changed, 7 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>>> index a80b971..430101b6 100644
>>> --- a/fs/btrfs/extent-tree.c
>>> +++ b/fs/btrfs/extent-tree.c
>>> @@ -8493,22 +8493,8 @@ static int set_block_group_ro(struct btrfs_block_group_cache *cache, int force)
>>>  {
>>>  	struct btrfs_space_info *sinfo = cache->space_info;
>>>  	u64 num_bytes;
>>> -	u64 min_allocable_bytes;
>>>  	int ret = -ENOSPC;
>>>
>>> -
>>> -	/*
>>> -	 * We need some metadata space and system metadata space for
>>> -	 * allocating chunks in some corner cases until we force to set
>>> -	 * it to be readonly.
>>> -	 */
>>> -	if ((sinfo->flags &
>>> -	     (BTRFS_BLOCK_GROUP_SYSTEM | BTRFS_BLOCK_GROUP_METADATA)) &&
>>> -	    !force)
>>> -		min_allocable_bytes = 1 * 1024 * 1024;
>>> -	else
>>> -		min_allocable_bytes = 0;
>>> -
>>>  	spin_lock(&sinfo->lock);
>>>  	spin_lock(&cache->lock);
>>>
[SNIP]
>>>  	ret = set_block_group_ro(cache, 0);
>>>  	if (!ret)
>>>  		goto out;
>>> @@ -8566,6 +8544,11 @@ int btrfs_set_block_group_ro(struct btrfs_root *root,
>>>  		goto out;
>>>  	ret = set_block_group_ro(cache, 0);
>>>  out:
>>> +	if (cache->flags & BTRFS_BLOCK_GROUP_SYSTEM) {
>>> +		alloc_flags = update_block_group_flags(root, cache->flags);
>>> +		check_system_chunk(trans, root, alloc_flags);
>>
>> Please consider the case that the following patch fixed:
>> 	199c36eaa95077a47ae1bc55532fc0fbeb80cc95
>>
>> If there is no free device space, check_system_chunk cannot allocate a
>> new system metadata chunk, so when we run the final step of the chunk
>> allocation to update the device item and insert the new chunk item, we
>> would fail.
>
> So the relocation will always fail in this case. The check just makes
> the failure happen earlier, right? We don't have the BUG_ON in
> do_chunk_alloc() currently.

The final step of the chunk allocation is a delayed operation. We must
make sure it can complete successfully; otherwise we would abort the
transaction, make the filesystem read-only, and lose the data written
into the filesystem before we do the balance, which would make users
uncomfortable.

With this patch, the first call to set_block_group_ro() will succeed in
setting the block group read-only. But if the block group being set to
RO is the only system metadata block group in the filesystem, and there
is no device space to allocate a new one, then we have no space to deal
with the pending final step of the chunk allocation, so the problem I
described above will happen.

Thanks
Miao

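The scenario described in the message above can be written down as a small decision function. This is a toy model only, not btrfs code: struct system_space_model, its fields, and pending_items_survive_ro() are hypothetical names, and it assumes that any other writable SYSTEM block group has room for the queued items.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical summary of the state that matters for the scenario. */
struct system_space_model {
	unsigned int writable_system_bgs;  /* SYSTEM block groups not yet ro */
	uint64_t unallocated_dev_bytes;    /* device space outside any chunk */
	unsigned int pending_chunk_items;  /* delayed final steps still queued */
};

/*
 * Returns true if the queued chunk items would still have somewhere to
 * go after one SYSTEM block group is marked read-only.
 */
bool pending_items_survive_ro(const struct system_space_model *m,
			      uint64_t system_chunk_bytes)
{
	assert(m != NULL);
	assert(m->writable_system_bgs >= 1);

	if (m->pending_chunk_items == 0)
		return true;	/* nothing queued, nothing to lose */
	if (m->writable_system_bgs > 1)
		return true;	/* assumed: another SYSTEM group has room */
	/* Last writable SYSTEM group: a replacement chunk must be allocatable. */
	return m->unallocated_dev_bytes >= system_chunk_bytes;
}
```

In this model the failing case is exactly the one raised in the message: one writable SYSTEM block group, no unallocated device space, and at least one pending chunk item.
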
* Re: [PATCH] btrfs: delete chunk allocation attemp when setting block group ro
From: Shaohua Li @ 2015-01-09 18:40 UTC
To: Miao Xie; +Cc: linux-btrfs, Kernel-team

On Fri, Jan 09, 2015 at 10:50:17AM +0800, Miao Xie wrote:
> On Thu, 08 Jan 2015 18:06:50 -0800, Shaohua Li wrote:
> > On Fri, Jan 09, 2015 at 09:01:57AM +0800, Miao Xie wrote:
> >> On Thu, 08 Jan 2015 13:23:13 -0800, Shaohua Li wrote:
> >>> The test below currently fails:
> >>> mkfs.ext4 -F /dev/sda
> >>> btrfs-convert /dev/sda
> >>> mount /dev/sda /mnt
> >>> btrfs device add -f /dev/sdb /mnt
> >>> btrfs balance start -v -dconvert=raid1 -mconvert=raid1 /mnt
> >>>
> >>> The reason is that some block groups have usage 0, but the whole disk
> >>> has no free space to allocate a new chunk, so we cannot even set such a
> >>> block group read-only. This patch deletes the chunk allocation when
> >>> setting a block group ro. For META, we already have a reserve. For
> >>> SYSTEM we do not, so check_system_chunk is still required.
> >>>
> >>> Signed-off-by: Shaohua Li <shli@fb.com>
> >>> ---
> >>>  fs/btrfs/extent-tree.c | 31 +++++++------------------------
> >>>  1 file changed, 7 insertions(+), 24 deletions(-)
> >>>
> >>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> >>> index a80b971..430101b6 100644
> >>> --- a/fs/btrfs/extent-tree.c
> >>> +++ b/fs/btrfs/extent-tree.c
> >>> @@ -8493,22 +8493,8 @@ static int set_block_group_ro(struct btrfs_block_group_cache *cache, int force)
> >>>  {
> >>>  	struct btrfs_space_info *sinfo = cache->space_info;
> >>>  	u64 num_bytes;
> >>> -	u64 min_allocable_bytes;
> >>>  	int ret = -ENOSPC;
> >>>
> >>> -
> >>> -	/*
> >>> -	 * We need some metadata space and system metadata space for
> >>> -	 * allocating chunks in some corner cases until we force to set
> >>> -	 * it to be readonly.
> >>> -	 */
> >>> -	if ((sinfo->flags &
> >>> -	     (BTRFS_BLOCK_GROUP_SYSTEM | BTRFS_BLOCK_GROUP_METADATA)) &&
> >>> -	    !force)
> >>> -		min_allocable_bytes = 1 * 1024 * 1024;
> >>> -	else
> >>> -		min_allocable_bytes = 0;
> >>> -
> >>>  	spin_lock(&sinfo->lock);
> >>>  	spin_lock(&cache->lock);
> >>>
> [SNIP]
> >>>  	ret = set_block_group_ro(cache, 0);
> >>>  	if (!ret)
> >>>  		goto out;
> >>> @@ -8566,6 +8544,11 @@ int btrfs_set_block_group_ro(struct btrfs_root *root,
> >>>  		goto out;
> >>>  	ret = set_block_group_ro(cache, 0);
> >>>  out:
> >>> +	if (cache->flags & BTRFS_BLOCK_GROUP_SYSTEM) {
> >>> +		alloc_flags = update_block_group_flags(root, cache->flags);
> >>> +		check_system_chunk(trans, root, alloc_flags);
> >>
> >> Please consider the case that the following patch fixed:
> >> 	199c36eaa95077a47ae1bc55532fc0fbeb80cc95
> >>
> >> If there is no free device space, check_system_chunk cannot allocate a
> >> new system metadata chunk, so when we run the final step of the chunk
> >> allocation to update the device item and insert the new chunk item, we
> >> would fail.
> >
> > So the relocation will always fail in this case. The check just makes
> > the failure happen earlier, right? We don't have the BUG_ON in
> > do_chunk_alloc() currently.
>
> The final step of the chunk allocation is a delayed operation. We must
> make sure it can complete successfully; otherwise we would abort the
> transaction, make the filesystem read-only, and lose the data written
> into the filesystem before we do the balance, which would make users
> uncomfortable.
>
> With this patch, the first call to set_block_group_ro() will succeed in
> setting the block group read-only. But if the block group being set to
> RO is the only system metadata block group in the filesystem, and there
> is no device space to allocate a new one, then we have no space to deal
> with the pending final step of the chunk allocation, so the problem I
> described above will happen.

Ok, makes sense.
Updated patch:

commit d8c84ce6d40b471f5d0fb3d4e8b76d5ba424dca4
Author: Shaohua Li <shli@fb.com>
Date:   Wed Jan 7 14:29:49 2015 -0800

    btrfs: delete chunk allocation attemp when setting block group ro

    The test below currently fails:
    mkfs.ext4 -F /dev/sda
    btrfs-convert /dev/sda
    mount /dev/sda /mnt
    btrfs device add -f /dev/sdb /mnt
    btrfs balance start -v -dconvert=raid1 -mconvert=raid1 /mnt

    The reason is that some block groups have usage 0, but the whole disk
    has no free space to allocate a new chunk, so we cannot even set such a
    block group read-only. This patch deletes the chunk allocation when
    setting a block group ro. For META, we already have a reserve. For
    SYSTEM we do not, so check_system_chunk is still required.

    Signed-off-by: Shaohua Li <shli@fb.com>

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a80b971..be1aac6 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -8548,14 +8548,6 @@ int btrfs_set_block_group_ro(struct btrfs_root *root,
 	if (IS_ERR(trans))
 		return PTR_ERR(trans);

-	alloc_flags = update_block_group_flags(root, cache->flags);
-	if (alloc_flags != cache->flags) {
-		ret = do_chunk_alloc(trans, root, alloc_flags,
-				     CHUNK_ALLOC_FORCE);
-		if (ret < 0)
-			goto out;
-	}
-
 	ret = set_block_group_ro(cache, 0);
 	if (!ret)
 		goto out;
@@ -8566,6 +8558,11 @@ int btrfs_set_block_group_ro(struct btrfs_root *root,
 		goto out;
 	ret = set_block_group_ro(cache, 0);
 out:
+	if (cache->flags & BTRFS_BLOCK_GROUP_SYSTEM) {
+		alloc_flags = update_block_group_flags(root, cache->flags);
+		check_system_chunk(trans, root, alloc_flags);
+	}
+
 	btrfs_end_transaction(trans, root);
 	return ret;
 }