Re: [PATCH] btrfs: qgroups, properly handle no reservations

linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Jeff Mahoney <jeffm@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: qgroups, properly handle no reservations
Date: Thu, 22 Feb 2018 10:05:36 +0800	[thread overview]
Message-ID: <ac3ffb4b-c60a-fd1e-be50-db99376a6e1e@gmx.com> (raw)
In-Reply-To: <f0c73e34-dc33-d2eb-56d6-2b936f1f967b@suse.com>


[-- Attachment #1.1: Type: text/plain, Size: 6957 bytes --]



On 2018年02月22日 09:50, Jeff Mahoney wrote:
> On 2/21/18 8:36 PM, Qu Wenruo wrote:
>>
>>
>> On 2018年02月22日 04:19, jeffm@suse.com wrote:
>>> From: Jeff Mahoney <jeffm@suse.com>
>>>
>>> There are several places where we call btrfs_qgroup_reserve_meta and
>>> assume that a return value of 0 means that the reservation was successful.
>>>
>>> Later, we use the original bytes value passed to that call to free
>>> bytes during error handling or to pass the number of bytes reserved to
>>> the caller.
>>>
>>> This patch returns -ENODATA when we don't perform a reservation so that
>>> callers can make the distinction.  This also lets call sites not
>>> necessarily care whether qgroups are enabled.
>>
>> IMHO if we don't need to reserve, returning 0 seems good enough.
>> Caller doesn't really need to care if it has reserved some bytes.
>>
>> Or is there any special case where we need to distinguish such case?
> 
> Anywhere where the reservation takes place prior to the transaction
> starting, which is pretty much everywhere.  We wait until transaction
> commit to flip the bit to turn on quotas, which means that if a
> transaction commits that enables quotas lands in between the reservation
> being take and any error handling that involves freeing the reservation,
> we'll end up with an underflow.

So the same case as btrfs_qgroup_reserve_data().

In that case we could use ret > 0 to indicates the real bytes we
reserved, instead of -ENODATA which normally means error.

> 
> This is the first patch of a series I'm working on, but it can stand
> alone.  The rest is the patch set I mentioned when we talked a few
> months ago where the lifetimes of reservations are incorrect.  We can't
> just drop all the reservations at the end of the transaction because 1)
> the lifetime of some reservations can cross transactions and 2) because
> especially in the start_transaction case, we do the reservation prior to
> waiting to join the transaction.  So if the transaction we're waiting on
> commits, our reservation goes away with it but we continue on as if we
> still have it.

Right, the same problem I also addressed in patchset "[PATCH v2 00/10]
Use split qgroup rsv type".

In 6th patch, "[PATCH v2 06/10] btrfs: qgroup: Use
root->qgroup_meta_rsv_* to record qgroup meta reserved space" qgroup
meta reserve will only be increased if we succeeded in reserving
metadata, so later free won't underflow that number.

Thanks,
Qu


> 
> -Jeff
> 
>> Thanks,
>> Qu
>>
>>>
>>> Signed-off-by: Jeff Mahoney <jeffm@suse.com>
>>> ---
>>>  fs/btrfs/extent-tree.c | 33 ++++++++++++++++-----------------
>>>  fs/btrfs/qgroup.c      |  4 ++--
>>>  fs/btrfs/transaction.c |  5 ++++-
>>>  3 files changed, 22 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>>> index c1618ab9fecf..2d5e963fae88 100644
>>> --- a/fs/btrfs/extent-tree.c
>>> +++ b/fs/btrfs/extent-tree.c
>>> @@ -5988,20 +5988,18 @@ int btrfs_subvolume_reserve_metadata(struct btrfs_root *root,
>>>  				     u64 *qgroup_reserved,
>>>  				     bool use_global_rsv)
>>>  {
>>> -	u64 num_bytes;
>>>  	int ret;
>>>  	struct btrfs_fs_info *fs_info = root->fs_info;
>>>  	struct btrfs_block_rsv *global_rsv = &fs_info->global_block_rsv;
>>> +	/* One for parent inode, two for dir entries */
>>> +	u64 num_bytes = 3 * fs_info->nodesize;
>>>  
>>> -	if (test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags)) {
>>> -		/* One for parent inode, two for dir entries */
>>> -		num_bytes = 3 * fs_info->nodesize;
>>> -		ret = btrfs_qgroup_reserve_meta(root, num_bytes, true);
>>> -		if (ret)
>>> -			return ret;
>>> -	} else {
>>> +	ret = btrfs_qgroup_reserve_meta(root, num_bytes, true);
>>> +	if (ret == -ENODATA) {
>>>  		num_bytes = 0;
>>> -	}
>>> +		ret = 0;
>>> +	} else if (ret)
>>> +		return ret;
>>>  
>>>  	*qgroup_reserved = num_bytes;
>>>  
>>> @@ -6057,6 +6055,7 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes)
>>>  	enum btrfs_reserve_flush_enum flush = BTRFS_RESERVE_FLUSH_ALL;
>>>  	int ret = 0;
>>>  	bool delalloc_lock = true;
>>> +	u64 qgroup_reserved;
>>>  
>>>  	/* If we are a free space inode we need to not flush since we will be in
>>>  	 * the middle of a transaction commit.  We also don't need the delalloc
>>> @@ -6090,17 +6089,17 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes)
>>>  	btrfs_calculate_inode_block_rsv_size(fs_info, inode);
>>>  	spin_unlock(&inode->lock);
>>>  
>>> -	if (test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags)) {
>>> -		ret = btrfs_qgroup_reserve_meta(root,
>>> -				nr_extents * fs_info->nodesize, true);
>>> -		if (ret)
>>> -			goto out_fail;
>>> -	}
>>> +	qgroup_reserved = nr_extents * fs_info->nodesize;
>>> +	ret = btrfs_qgroup_reserve_meta(root, qgroup_reserved, true);
>>> +	if (ret == -ENODATA) {
>>> +		ret = 0;
>>> +		qgroup_reserved = 0;
>>> +	} if (ret)
>>> +		goto out_fail;
>>>  
>>>  	ret = btrfs_inode_rsv_refill(inode, flush);
>>>  	if (unlikely(ret)) {
>>> -		btrfs_qgroup_free_meta(root,
>>> -				       nr_extents * fs_info->nodesize);
>>> +		btrfs_qgroup_free_meta(root, qgroup_reserved);
>>>  		goto out_fail;
>>>  	}
>>>  
>>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
>>> index aa259d6986e1..5d9e011243c6 100644
>>> --- a/fs/btrfs/qgroup.c
>>> +++ b/fs/btrfs/qgroup.c
>>> @@ -3025,7 +3025,7 @@ int btrfs_qgroup_reserve_meta(struct btrfs_root *root, int num_bytes,
>>>  
>>>  	if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags) ||
>>>  	    !is_fstree(root->objectid) || num_bytes == 0)
>>> -		return 0;
>>> +		return -ENODATA;
>>>  
>>>  	BUG_ON(num_bytes != round_down(num_bytes, fs_info->nodesize));
>>>  	trace_qgroup_meta_reserve(root, (s64)num_bytes);
>>> @@ -3057,7 +3057,7 @@ void btrfs_qgroup_free_meta(struct btrfs_root *root, int num_bytes)
>>>  	struct btrfs_fs_info *fs_info = root->fs_info;
>>>  
>>>  	if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags) ||
>>> -	    !is_fstree(root->objectid))
>>> +	    !is_fstree(root->objectid) || num_bytes == 0)
>>>  		return;
>>>  
>>>  	BUG_ON(num_bytes != round_down(num_bytes, fs_info->nodesize));
>>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
>>> index 04f07144b45c..ab67b73bd7fa 100644
>>> --- a/fs/btrfs/transaction.c
>>> +++ b/fs/btrfs/transaction.c
>>> @@ -510,7 +510,10 @@ start_transaction(struct btrfs_root *root, unsigned int num_items,
>>>  		qgroup_reserved = num_items * fs_info->nodesize;
>>>  		ret = btrfs_qgroup_reserve_meta(root, qgroup_reserved,
>>>  						enforce_qgroups);
>>> -		if (ret)
>>> +		if (ret == -ENODATA) {
>>> +			ret = 0;
>>> +			qgroup_reserved = 0;
>>> +		} else if (ret)
>>>  			return ERR_PTR(ret);
>>>  
>>>  		num_bytes = btrfs_calc_trans_metadata_size(fs_info, num_items);
>>>
>>
> 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

next prev parent reply	other threads:[~2018-02-22  2:05 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-21 20:19 [PATCH] btrfs: qgroups, properly handle no reservations jeffm
2018-02-22  1:36 ` Qu Wenruo
2018-02-22  1:50   ` Jeff Mahoney
2018-02-22  2:05     ` Qu Wenruo [this message]
2018-03-07 16:02       ` David Sterba
2018-03-08  1:02         ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ac3ffb4b-c60a-fd1e-be50-db99376a6e1e@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=jeffm@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).