From: Mark Fasheh <mfasheh@suse.de>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>
Cc: linux-btrfs@vger.kernel.org, Chris Mason <clm@fb.com>,
Josef Bacik <jbacik@fb.com>
Subject: Re: [PATCH RFC 00/14] Accurate qgroup reserve framework
Date: Thu, 10 Sep 2015 14:01:05 -0700
Message-ID: <20150910210104.GS1145@wotan.suse.de>
In-Reply-To: <1441702615-18333-1-git-send-email-quwenruo@cn.fujitsu.com>

Hi Qu,

On Tue, Sep 08, 2015 at 04:56:52PM +0800, Qu Wenruo wrote:
> [[BUG]]
> One of the most common case to trigger the bug is the following method:
> 1) Enable quota
> 2) Limit excl of qgroup 5 to 16M
> 3) Write [0,2M) of a file inside subvol 5 10 times without sync
>
> EDQUOT will be triggered at about the 8th write.

Does this happen on all kernels with qgroups, or is it related to your
recent rewrite?

> [[CAUSE]]
> The problem is caused by the fact that qgroup will reserve space even
> the data space is already reserved.
>
> In above reproducer, each time we buffered write [0,2M) qgroup will
> reserve 2M space, but in fact, at the 1st time, we have already reserved
> 2M and from then on, we don't need to reserve any data space as we are
> only writing [0,2M).
>
> Also, the reserved space will only be freed *ONCE* when its backref is
> run at commit_transaction() time.
>
> That's causing the reserved space leaking.
>
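
To check my reading of the cause, here is a toy user-space model in Python.
It is not kernel code, and naive_reserve and its parameters are made-up
names. It assumes the only check is "reserved + need > limit" and that
nothing is freed before commit. It ignores metadata reservations, so the
write at which it fails need not match the "about the 8th" seen on a real
filesystem.

```python
MiB = 1024 * 1024


def naive_reserve(writes, limit):
    """Charge the full length of every buffered write.

    Toy model only: a range that was already reserved is charged again,
    and nothing is released until "commit" (which never happens here).
    Returns (index of the first failing write or None, bytes reserved).
    """
    reserved = 0
    for index, (start, end) in enumerate(writes, 1):
        need = end - start
        if reserved + need > limit:
            # In this model the write would fail with EDQUOT.
            return index, reserved
        reserved += need
    return None, reserved


# Ten buffered writes of [0, 2M) against a 16M limit, no sync in between.
writes = [(0, 2 * MiB)] * 10
failed_at, reserved = naive_reserve(writes, 16 * MiB)
print(failed_at, reserved // MiB)
```

The file never needs more than 2M of data space, yet the model runs out of
quota because every rewrite of the same range is charged again.
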
> [[FIX]]
> The fix is not a simple one, as currently btrfs_qgroup_reserve() follow

Indeed, this is quite a large patch series, and I see no testing details from
you. Can you please at least provide a single reproducer in a form that can
be added to xfstests?

> the very bad btrfs space allocating principle:
> Allocate as much as you needed, even it's not fully used.
>
> So for accurate qgroup reserve, we introduce a completely new framework
> for data and metadata.
> 1) Per-inode data reserve map
> Now, each inode will have a data reserve map, recording which range
> of data is already reserved.
> If we are writing a range which is already reserved, we won't need to
> reserve space again.
>
> Also, for the fact that qgroup is only accounted at commit_trans(),
> for data commit into disc and its metadata is also inserted into
> current tree, we should free the data reserved range, but still keep
> the reserved space until commit_trans().
>
> So delayed_ref_head will have new members to record how much space is
> reserved and free them at commit_trans() time.
>
> 2) Per-root metadata reserve counter
> For metadata(tree block), it's impossible to know how much space it
> will use exactly in advance.
> And due to the new qgroup accounting framework, the old
> free-at-end-trans may lead to exceeding limit.
>
> So we record how much metadata space is reserved for each root, and
> free them at commit_trans() time.
> This method is not perfect, but thanks to the compared small size of
> metadata, it should be quite good.
>
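
As I understand 1), the data reserve map boils down to charging only the
bytes of a write that are not already covered by a reserved range. A minimal
Python sketch of that idea follows. It assumes a sorted list of [start, end)
ranges rather than whatever structure the patches actually use, and
DataReserveMap and reserve are hypothetical names.

```python
MiB = 1024 * 1024


class DataReserveMap:
    """Hypothetical per-inode map of already-reserved byte ranges."""

    def __init__(self):
        # Sorted, non-overlapping (start, end) tuples, end exclusive.
        self.ranges = []

    def reserve(self, start, end):
        """Mark [start, end) reserved; return the newly reserved byte count."""
        new_bytes = end - start
        merged_start, merged_end = start, end
        kept = []
        for s, e in self.ranges:
            if e < start or s > end:
                # Neither overlapping nor adjacent: keep unchanged.
                kept.append((s, e))
            else:
                # Subtract the bytes this existing range already covers,
                # then fold it into the merged range.
                new_bytes -= max(0, min(e, end) - max(s, start))
                merged_start = min(merged_start, s)
                merged_end = max(merged_end, e)
        kept.append((merged_start, merged_end))
        self.ranges = sorted(kept)
        return new_bytes


# The reproducer's ten writes of [0, 2M) are charged only once.
rsv = DataReserveMap()
charged = sum(rsv.reserve(0, 2 * MiB) for _ in range(10))
print(charged // MiB)  # 2
```

A linear scan is enough for a sketch; a real implementation would want a
tree or similar so that lookups do not walk every range.
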
> More detailed info can be found in each commit message and source
> comment.
>
> Qu Wenruo (19):
> btrfs: qgroup: New function declaration for new reserve implement
> btrfs: qgroup: Implement data_rsv_map init/free functions
> btrfs: qgroup: Introduce new function to search most left reserve
> range
> btrfs: qgroup: Introduce function to insert non-overlap reserve range
> btrfs: qgroup: Introduce function to reserve data range per inode
> btrfs: qgroup: Introduce btrfs_qgroup_reserve_data function
> btrfs: qgroup: Introduce function to release reserved range
> btrfs: qgroup: Introduce function to release/free reserved data range
> btrfs: delayed_ref: Add new function to record reserved space into
> delayed ref
> btrfs: delayed_ref: release and free qgroup reserved at proper timing
> btrfs: qgroup: Introduce new functions to reserve/free metadata
> btrfs: qgroup: Use new metadata reservation.
> btrfs: extent-tree: Add new verions of btrfs_check_data_free_space
> btrfs: Switch to new check_data_free_space
> btrfs: fallocate: Add support to accurate qgroup reserve
> btrfs: extent-tree: Add new version of btrfs_delalloc_reserve_space
> btrfs: extent-tree: Use new __btrfs_delalloc_reserve_space function
> btrfs: qgroup: Cleanup old inaccurate facilities
> btrfs: qgroup: Add handler for NOCOW and inline

I took a quick look through a few of these. None of them have any trace_*
functions, yet you're adding several new entrypoints to the qgroup code.
Tracepoints are incredibly useful for debugging on live systems, and in fact
I've got a patch which reintroduces the ones you removed in your last patch
series ;)

This time around, can you please provide tracepoints for at least your new
high-level entrypoint functions into the qgroup code?

Thanks,
--Mark
--
Mark Fasheh