public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Lai Wei-Hwa <whlai@robco.com>, Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH RFC] btrfs: volumes: Check if we're hitting sys chunk array size limit before allocating new sys chunks
Date: Wed, 18 Sep 2019 08:07:27 +0800	[thread overview]
Message-ID: <a8b472ff-8681-aa65-41fd-6549d9cba178@gmx.com> (raw)
In-Reply-To: <337881026.86319.1568741259357.JavaMail.zimbra@robco.com>


[-- Attachment #1.1: Type: text/plain, Size: 6675 bytes --]



On 2019/9/18 上午1:27, Lai Wei-Hwa wrote:
> After upgrading to 18.04 4.15.0-62-generic, the problem disappears.

Aha, that looks like something related to empty block groups auto removal.

Then at least the patch is less urgent, but still makes some sense.

Thanks,
Qu
> 
> Thanks! 
> Lai
> 
> ----- Original Message -----
> From: "Qu Wenruo" <wqu@suse.com>
> To: "linux-btrfs" <linux-btrfs@vger.kernel.org>
> Cc: "Lai Wei-Hwa" <whlai@robco.com>
> Sent: Tuesday, September 17, 2019 2:57:30 AM
> Subject: [PATCH RFC] btrfs: volumes: Check if we're hitting sys chunk array size limit before allocating new sys chunks
> 
> [BUG]
> There is a user reporting strange EFBIG error causing transaction to be
> aborted.
> 
> [Sep14 20:02] ------------[ cut here ]------------
> [ +0.000042] WARNING: CPU: 18 PID: 28882 at linux-4.4.0/fs/btrfs/extent-tree.c:10046 btrfs_create_pending_block_groups+0x144/0x1f0 [btrfs]()
> [ +0.000002] BTRFS: Transaction aborted (error -27)
> [ +0.000002] Call Trace:
> [ +0.000008] [<ffffffff8140c9a1>] dump_stack+0x63/0x82
> [ +0.000007] [<ffffffff810864d2>] warn_slowpath_common+0x82/0xc0
> [ +0.000002] [<ffffffff8108656c>] warn_slowpath_fmt+0x5c/0x80
> [ +0.000014] [<ffffffffc01f31c4>] ? btrfs_finish_chunk_alloc+0x204/0x5a0 [btrfs]
> [ +0.000011] [<ffffffffc01b1d24>] btrfs_create_pending_block_groups+0x144/0x1f0 [btrfs]
> [ +0.000012] [<ffffffffc01c7ed3>] __btrfs_end_transaction+0x93/0x340 [btrfs]
> [ +0.000013] [<ffffffffc01c8190>] btrfs_end_transaction+0x10/0x20 [btrfs]
> [ +0.000010] [<ffffffffc01b5a4d>] btrfs_inc_block_group_ro+0xed/0x1b0 [btrfs]
> [ +0.000014] [<ffffffffc02253bf>] scrub_enumerate_chunks+0x21f/0x580 [btrfs]
> [ +0.000004] [<ffffffff810cb700>] ? wake_atomic_t_function+0x60/0x60
> [ +0.000013] [<ffffffffc0226d0c>] btrfs_scrub_dev+0x1bc/0x530 [btrfs]
> [ +0.000004] [<ffffffff8123f306>] ? __mnt_want_write+0x56/0x60
> [ +0.000013] [<ffffffffc0202408>] btrfs_ioctl+0x1ac8/0x28c0 [btrfs]
> [ +0.000003] [<ffffffff8119a3b9>] ? unlock_page+0x69/0x70
> [ +0.000002] [<ffffffff8119a654>] ? filemap_map_pages+0x224/0x230
> [ +0.000004] [<ffffffff811cdb77>] ? handle_mm_fault+0x10f7/0x1b80
> [ +0.000002] [<ffffffff811fb77b>] ? kmem_cache_alloc_node+0xbb/0x210
> [ +0.000003] [<ffffffff813e13e3>] ? create_task_io_context+0x23/0x100
> [ +0.000003] [<ffffffff812318ef>] do_vfs_ioctl+0x2af/0x4b0
> [ +0.000002] [<ffffffff813e1510>] ? get_task_io_context+0x50/0x90
> [ +0.000003] [<ffffffff813f0936>] ? set_task_ioprio+0x86/0xa0
> [ +0.000002] [<ffffffff81231b69>] SyS_ioctl+0x79/0x90
> [ +0.000004] [<ffffffff81864f1b>] entry_SYSCALL_64_fastpath+0x22/0xcb
> [ +0.000002] ---[ end trace 13fce4e84d9b6aed ]---
> [ +0.000003] BTRFS: error (device sda1) in btrfs_create_pending_block_groups:10046: errno=-27 unknown
> [ +0.003942] BTRFS info (device sda1): forced readonly
> 
> [CAUSE]
> From the backtrace, the EFBIG is from btrfs_add_system_chunk() where the
> new system chunk is unable to be inserted in super block.
> 
> Indeed we can't do much to help such problem, but at least we can avoid
> such situation when allocating new chunk.
> 
> [FIX]
> At chunk allocation time, we iterate through the new_bgs list which
> records all new chunks allocated in current transaction.
> 
> And account all new system chunks and its space to be used in super block,
> along with the size of the to-be-allocated chunk to see if it exceeds
> the sys chunk size limit.
> 
> Such early check will make __btrfs_alloc_chunk() return -EFBIG, and
> prevent transaction abort in btrfs_create_pending_block_groups().
> 
> Reported-by: Lai Wei-Hwa <whlai@robco.com>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Reason for RFC:
> This patch is only to provide early graceful exit, the root reason for
> the initial report is still not fully discovered.
> 
> So I keep the RFC tag until the initial report can be solved.
> ---
>  fs/btrfs/volumes.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 52 insertions(+)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index a447d3ec48d5..05d328ce229f 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -4901,6 +4901,51 @@ static void check_raid56_incompat_flag(struct btrfs_fs_info *info, u64 type)
>  	btrfs_set_fs_incompat(info, RAID56);
>  }
>  
> +static bool check_syschunk_array_size(struct btrfs_trans_handle *trans,
> +				      int num_stripes)
> +{
> +	struct btrfs_fs_info *fs_info = trans->fs_info;
> +	struct btrfs_block_group_cache *cache;
> +	u32 sb_array_size;
> +	u32 needed = 0;
> +
> +	lockdep_assert_held(&fs_info->chunk_mutex);
> +	sb_array_size = btrfs_super_sys_array_size(fs_info->super_copy);
> +
> +	/*
> +	 * Check and calculate all existing sys chunks in new_bgs.
> +	 * As new system chunks will take up sys chunk array in super block, we
> +	 * want to error out early before we ate up all sys chunk array.
> +	 *
> +	 * This list is only modified by btrfs_make_block_group() and
> +	 * btrfs_create_pending_block_groups().
> +	 *
> +	 * The former is only called in __btrfs_alloc_chunk() and protected
> +	 * by fs_info->chunk_mutex.
> +	 * The later is called when the last trans handle get ended in
> +	 * __btrfs_end_transaction() or btrfs_commit_transaction(), thus there
> +	 * is no race as long as we hold a trans handle.
> +	 */
> +	list_for_each_entry(cache, &trans->new_bgs, bg_list) {
> +		if (cache->flags & BTRFS_BLOCK_GROUP_SYSTEM) {
> +			struct extent_map *em;
> +
> +			em = btrfs_get_chunk_map(fs_info, cache->key.objectid,
> +						 1);
> +			/* Can't get a chunk map? It's a problem by all means */
> +			if (IS_ERR(em))
> +				return false;
> +			needed += btrfs_chunk_item_size(
> +					em->map_lookup->num_stripes);
> +			needed += sizeof(struct btrfs_disk_key);
> +			free_extent_map(em);
> +		}
> +	}
> +	if (sb_array_size + needed > BTRFS_SYSTEM_CHUNK_ARRAY_SIZE)
> +		return false;
> +	return true;
> +}
> +
>  static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
>  			       u64 start, u64 type)
>  {
> @@ -5071,6 +5116,13 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
>  	stripe_size = div_u64(devices_info[ndevs - 1].max_avail, dev_stripes);
>  	num_stripes = ndevs * dev_stripes;
>  
> +	if (type & BTRFS_BLOCK_GROUP_SYSTEM &&
> +	    !check_syschunk_array_size(trans, num_stripes)) {
> +		/* Use the unique errno to distinguish from ordinary ENOSPC */
> +		ret = -EFBIG;
> +		goto error;
> +	}
> +
>  	/*
>  	 * this will have to be fixed for RAID1 and RAID10 over
>  	 * more drives
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

      reply	other threads:[~2019-09-18  0:07 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-17  6:57 [PATCH RFC] btrfs: volumes: Check if we're hitting sys chunk array size limit before allocating new sys chunks Qu Wenruo
2019-09-17 17:27 ` Lai Wei-Hwa
2019-09-18  0:07   ` Qu Wenruo [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a8b472ff-8681-aa65-41fd-6549d9cba178@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=whlai@robco.com \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox