From: Nikolay Borisov <nborisov@suse.com>
To: dsterba@suse.cz, linux-btrfs@vger.kernel.org,
Jeff Mahoney <jeffm@suse.com>
Subject: Re: [PATCH v3 07/12] btrfs: replace pending/pinned chunks lists with io tree
Date: Mon, 25 Mar 2019 18:43:49 +0200
Message-ID: <75db110a-db25-e1ff-f74d-997b61532bf7@suse.com>
In-Reply-To: <20190325162651.GI10640@twin.jikos.cz>
On 25.03.19 at 18:26, David Sterba wrote:
> On Mon, Mar 25, 2019 at 02:31:27PM +0200, Nikolay Borisov wrote:
>> From: Jeff Mahoney <jeffm@suse.com>
>>
>> The pending chunks list contains chunks that are allocated in the
>> current transaction but haven't been created yet. The pinned chunks
>> list contains chunks that are being released in the current transaction.
>> Both describe chunks that are not reflected on disk as in use but are
>> unavailable just the same.
>>
>> The pending chunks list is anchored by the transaction handle, which
>> means that we need to hold a reference to a transaction when working
>> with the list.
>>
>> We use these lists to ensure that we don't end up discarding chunks
>> that are allocated or released in the current transaction. What we r
>>
>> The way we use them is by iterating over both lists to perform
>> comparisons on the stripes they describe for each device. This is
>> backwards and requires that we keep a transaction handle open while
>> we're trimming.
>>
>> This patchset adds an extent_io_tree to btrfs_device that maintains
>> the allocation state of the device. Extents are set dirty when
>> chunks are first allocated -- when the extent maps are added to the
>> mapping tree. They're cleared when last removed -- when the extent
>> maps are removed from the mapping tree. This matches the lifespan
>> of the pending and pinned chunks list and allows us to do trims
>> on unallocated space safely without pinning the transaction for what
>> may be a lengthy operation. We can also use this io tree to mark
>> which chunks have already been trimmed so we don't repeat the operation.
>>
>> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
>> ---
>> fs/btrfs/ctree.h | 6 ---
>> fs/btrfs/disk-io.c | 11 -----
>> fs/btrfs/extent-tree.c | 28 -----------
>> fs/btrfs/extent_io.c | 2 +-
>> fs/btrfs/extent_io.h | 6 ++-
>> fs/btrfs/extent_map.c | 36 ++++++++++++++
>> fs/btrfs/extent_map.h | 1 -
>> fs/btrfs/free-space-cache.c | 4 --
>> fs/btrfs/transaction.c | 9 ----
>> fs/btrfs/transaction.h | 1 -
>> fs/btrfs/volumes.c | 96 +++++++++++++------------------------
>> fs/btrfs/volumes.h | 2 +
>> 12 files changed, 76 insertions(+), 126 deletions(-)
>>
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -918,7 +918,7 @@ static void cache_state(struct extent_state *state,
>> * [start, end] is inclusive This takes the tree lock.
>> */
>>
>> -static int __must_check
>> +int __must_check
>> __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
>> unsigned bits, unsigned exclusive_bits,
>> u64 *failed_start, struct extent_state **cached_state,
>
> Does this really need to be exported again? There are helpers that can
> wrap specific combinations of parameters.
It is exported so that __set_extent_bit can be called with GFP_NOWAIT.
Otherwise I was getting lockdep splats: extent_map_device_set_bits
(called from add_extent_mapping) runs under a write_lock, so we can't
sleep there.
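For illustration, a helper of the kind suggested in the review could look roughly like the userspace sketch below. The types, the flag value and the wrapper name set_extent_bits_nowait are assumptions made up for this example. The __set_extent_bit stub only records its arguments, and its parameters after cached_state are assumed because the quoted hunk is cut off at that point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for kernel types and flags (values are made up). */
typedef uint64_t u64;
typedef unsigned int gfp_t;
#define GFP_NOWAIT 0x2u	/* stand-in: the allocation must not sleep */

struct extent_state;
struct extent_changeset;
struct extent_io_tree {
	u64 start, end;		/* last range passed in */
	unsigned bits;		/* last bits passed in */
	gfp_t mask;		/* last allocation mask passed in */
};

/*
 * Stub that only records its arguments. The parameters after
 * cached_state are an assumption; the quoted hunk is cut off there.
 */
int __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
		     unsigned bits, unsigned exclusive_bits,
		     u64 *failed_start, struct extent_state **cached_state,
		     gfp_t mask, struct extent_changeset *changeset)
{
	(void)exclusive_bits;
	(void)failed_start;
	(void)cached_state;
	(void)changeset;
	assert(start <= end);	/* [start, end] is inclusive */
	tree->start = start;
	tree->end = end;
	tree->bits = bits;
	tree->mask = mask;
	return 0;
}

/* Hypothetical wrapper for callers that hold a lock and cannot sleep. */
int set_extent_bits_nowait(struct extent_io_tree *tree, u64 start, u64 end,
			   unsigned bits)
{
	return __set_extent_bit(tree, start, end, bits, 0, NULL, NULL,
				GFP_NOWAIT, NULL);
}
```

With a wrapper like this, only the one parameter combination needed under the write_lock is exposed, and __set_extent_bit itself could stay static.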
>
>> @@ -335,6 +335,8 @@ void btrfs_free_device(struct btrfs_device *device)
>> {
>> WARN_ON(!list_empty(&device->post_commit_list));
>> rcu_string_free(device->name);
>> + if (!in_softirq())
>> + extent_io_tree_release(&device->alloc_state);
The check distinguishes two kinds of callers. btrfs_free_device can be
called from btrfs_close_devices in close_ctree or from any of the error
handlers, i.e. in non-RCU (hence non-softirq) context, or it can be
called from free_device_rcu. In the latter case the extent io tree has
already been released in btrfs_close_one_device, so there is no need to
do it again in the RCU callback.
Furthermore, there is also a comment noting that the extent io tree
cannot be destroyed in RCU context: extent_io_tree_release calls
cond_resched_lock, which may sleep, and sleeping is forbidden in RCU
context.
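The two paths can be modelled in userspace as in the sketch below. Everything here is a stand-in: fake_softirq replaces the real context check, extent_io_tree_release merely counts calls, and model_direct_free and model_rcu_free are hypothetical names for the two call chains described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace stand-ins; none of this is the kernel implementation. */
static bool fake_softirq;	/* models "running as an RCU callback" */
static int release_calls;	/* counts extent_io_tree_release() calls */

struct extent_io_tree { int unused; };
struct btrfs_device { struct extent_io_tree alloc_state; };

static bool in_softirq(void)
{
	return fake_softirq;
}

static void extent_io_tree_release(struct extent_io_tree *tree)
{
	(void)tree;
	release_calls++;	/* the real function may sleep */
}

/* Mirrors the guard from the quoted hunk. */
void btrfs_free_device(struct btrfs_device *device)
{
	if (!in_softirq())
		extent_io_tree_release(&device->alloc_state);
	free(device);
}

/* Direct path (close_ctree or an error handler): free releases the tree. */
int model_direct_free(void)
{
	struct btrfs_device *dev = calloc(1, sizeof(*dev));

	assert(dev);
	release_calls = 0;
	btrfs_free_device(dev);
	return release_calls;
}

/* RCU path: the tree is released first, the callback must not repeat it. */
int model_rcu_free(void)
{
	struct btrfs_device *dev = calloc(1, sizeof(*dev));

	assert(dev);
	release_calls = 0;
	extent_io_tree_release(&dev->alloc_state); /* as in btrfs_close_one_device */
	fake_softirq = true;			   /* the callback runs in softirq */
	btrfs_free_device(dev);
	fake_softirq = false;
	return release_calls;
}
```

In this model each path releases the tree exactly once, and the release never happens while fake_softirq is set.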
>
> This needs a comment
>
>> bio_put(device->flush_bio);
>> kfree(device);
>> }
>
> The commit is quite big but I don't see how to shrink it, the changes
> need to be done in several places. So, probably ok.
>
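As a rough illustration of the per-device allocation state described in the quoted commit message, the sketch below uses a small fixed array in place of an extent_io_tree. DEV_UNITS, struct dev_alloc_state and the three helpers are hypothetical names, and the real tree tracks byte ranges rather than fixed-size units.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEV_UNITS 16	/* hypothetical device size in fixed-size units */

struct dev_alloc_state {
	bool allocated[DEV_UNITS];	/* true while a chunk covers the unit */
};

/* Called when a chunk's extent map is added to the mapping tree. */
void mark_allocated(struct dev_alloc_state *s, size_t start, size_t len)
{
	for (size_t i = start; i < start + len && i < DEV_UNITS; i++)
		s->allocated[i] = true;
}

/* Called when the chunk's extent map is removed from the mapping tree. */
void mark_free(struct dev_alloc_state *s, size_t start, size_t len)
{
	for (size_t i = start; i < start + len && i < DEV_UNITS; i++)
		s->allocated[i] = false;
}

/*
 * Find the first run of unallocated units at or after 'from'. A trim
 * loop could discard [*start, *start + *len) without consulting any
 * transaction. Returns false when nothing is free.
 */
bool find_first_clear(const struct dev_alloc_state *s, size_t from,
		      size_t *start, size_t *len)
{
	size_t i = from;

	while (i < DEV_UNITS && s->allocated[i])
		i++;
	if (i >= DEV_UNITS)
		return false;
	*start = i;
	while (i < DEV_UNITS && !s->allocated[i])
		i++;
	*len = i - *start;
	return true;
}
```

The point of the model is that the trim side only reads device-local state, so it does not need to hold a transaction handle while it works.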