linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Josef Bacik <josef@toxicpanda.com>, Qu Wenruo <wqu@suse.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 11/42] btrfs: refactor btrfs_invalidatepage()
Date: Sat, 17 Apr 2021 08:13:15 +0800	[thread overview]
Message-ID: <143e0141-c740-9d77-21db-ef7eb1a9500c@gmx.com> (raw)
In-Reply-To: <0babca4e-325a-88a7-bbfc-c810a5bedbeb@toxicpanda.com>



On 2021/4/16 下午10:42, Josef Bacik wrote:
> On 4/15/21 1:04 AM, Qu Wenruo wrote:
>> This patch will refactor btrfs_invalidatepage() for the incoming subpage
>> support.
>>
>> The invovled modifcations are:
>> - Use while() loop instead of "goto again;"
>> - Use single variable to determine whether to delete extent states
>>    Each branch will also have comments why we can or cannot delete the
>>    extent states
>> - Do qgroup free and extent states deletion per-loop
>>    Current code can only work for PAGE_SIZE == sectorsize case.
>>
>> This refactor also makes it clear what we do for different sectors:
>> - Sectors without ordered extent
>>    We're completely safe to remove all extent states for the sector(s)
>>
>> - Sectors with ordered extent, but no Private2 bit
>>    This means the endio has already been executed, we can't remove all
>>    extent states for the sector(s).
>>
>> - Sectors with ordere extent, still has Private2 bit
>>    This means we need to decrease the ordered extent accounting.
>>    And then it comes to two different variants:
>>    * We have finished and removed the ordered extent
>>      Then it's the same as "sectors without ordered extent"
>>    * We didn't finished the ordered extent
>>      We can remove some extent states, but not all.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   fs/btrfs/inode.c | 173 +++++++++++++++++++++++++----------------------
>>   1 file changed, 94 insertions(+), 79 deletions(-)
>>
>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> index 4c894de2e813..93bb7c0482ba 100644
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -8320,15 +8320,12 @@ static void btrfs_invalidatepage(struct page 
>> *page, unsigned int offset,
>>   {
>>       struct btrfs_inode *inode = BTRFS_I(page->mapping->host);
>>       struct extent_io_tree *tree = &inode->io_tree;
>> -    struct btrfs_ordered_extent *ordered;
>>       struct extent_state *cached_state = NULL;
>>       u64 page_start = page_offset(page);
>>       u64 page_end = page_start + PAGE_SIZE - 1;
>> -    u64 start;
>> -    u64 end;
>> +    u64 cur;
>> +    u32 sectorsize = inode->root->fs_info->sectorsize;
>>       int inode_evicting = inode->vfs_inode.i_state & I_FREEING;
>> -    bool found_ordered = false;
>> -    bool completed_ordered = false;
>>       /*
>>        * We have page locked so no new ordered extent can be created on
>> @@ -8352,96 +8349,114 @@ static void btrfs_invalidatepage(struct page 
>> *page, unsigned int offset,
>>       if (!inode_evicting)
>>           lock_extent_bits(tree, page_start, page_end, &cached_state);
>> -    start = page_start;
>> -again:
>> -    ordered = btrfs_lookup_ordered_range(inode, start, page_end - 
>> start + 1);
>> -    if (ordered) {
>> -        found_ordered = true;
>> -        end = min(page_end,
>> -              ordered->file_offset + ordered->num_bytes - 1);
>> +    cur = page_start;
>> +    while (cur < page_end) {
>> +        struct btrfs_ordered_extent *ordered;
>> +        bool delete_states = false;
>> +        u64 range_end;
>> +
>> +        /*
>> +         * Here we can't pass "file_offset = cur" and
>> +         * "len = page_end + 1 - cur", as btrfs_lookup_ordered_range()
>> +         * may not return the first ordered extent after @file_offset.
>> +         *
>> +         * Here we want to iterate through the range in byte order.
>> +         * This is slower but definitely correct.
>> +         *
>> +         * TODO: Make btrfs_lookup_ordered_range() to return the
>> +         * first ordered extent in the range to reduce the number
>> +         * of loops.
>> +         */
>> +        ordered = btrfs_lookup_ordered_range(inode, cur, sectorsize);
> 
> How does it not find the first ordered extent after file_offset?  
> Looking at the code it just loops through and returns the first thing it 
> finds that overlaps our range.  Is there a bug in 
> btrfs_lookup_ordered_range()?

btrfs_lookup_ordered_range() does two search:
node = tree_search(tree, file_offset);
if (!node) {
	node = tree_search(tree, file_offset + len);
}

That means for the following seach pattern, it will not return the first OE:
start					end
|	|///////|	|///////|	|


> 
> We should add some self tests to make sure these helpers are doing the 
> right thing if there is in fact a bug.

It's not a bug, as most call sites for btrfs_lookup_ordered_range() will 
wait for the ordered extent to finish, then re-search until all ordered 
extent is exhausted.

In that case, they don't care the order of returned OE.

It's really the first time we need a specific ordered.

Since you're already complaining, I guess I'd either add a new function 
to make the existing one to follow the order.

> 
>> +        if (!ordered) {
>> +            range_end = cur + sectorsize - 1;
>> +            /*
>> +             * No ordered extent covering this sector, we are safe
>> +             * to delete all extent states in the range.
>> +             */
>> +            delete_states = true;
>> +            goto next;
>> +        }
>> +
>> +        range_end = min(ordered->file_offset + ordered->num_bytes - 1,
>> +                page_end);
>> +        if (!PagePrivate2(page)) {
>> +            /*
>> +             * If Private2 is cleared, it means endio has already
>> +             * been executed for the range.
>> +             * We can't delete the extent states as
>> +             * btrfs_finish_ordered_io() may still use some of them.
>> +             */
>> +            delete_states = false;
> 
> delete_states is already false.

Yes, but I want to put some comment here.

I can just remove the initialization value.

> 
>> +            goto next;
>> +        }
>> +        ClearPagePrivate2(page);
>> +
>>           /*
>>            * IO on this page will never be started, so we need to account
>>            * for any ordered extents now. Don't clear EXTENT_DELALLOC_NEW
>>            * here, must leave that up for the ordered extent completion.
>> +         *
>> +         * This will also unlock the range for incoming
>> +         * btrfs_finish_ordered_io().
>>            */
>>           if (!inode_evicting)
>> -            clear_extent_bit(tree, start, end,
>> +            clear_extent_bit(tree, cur, range_end,
>>                        EXTENT_DELALLOC |
>>                        EXTENT_LOCKED | EXTENT_DO_ACCOUNTING |
>>                        EXTENT_DEFRAG, 1, 0, &cached_state);
>> +
>> +        spin_lock_irq(&inode->ordered_tree.lock);
>> +        set_bit(BTRFS_ORDERED_TRUNCATED, &ordered->flags);
>> +        ASSERT(cur - ordered->file_offset < U32_MAX);
>> +        ordered->truncated_len = min_t(u32, ordered->truncated_len,
>> +                           cur - ordered->file_offset);
> 
> I've realized my previous comment about this needing to be u64 was 
> wrong, I'm starting to wake up now.  However I still don't see the value 
> in saving the space, as we can just leave everything u64 and the math 
> all works out cleanly.

No problem, I can drop that patch.

> 
>> +        spin_unlock_irq(&inode->ordered_tree.lock);
>> +
>> +        ASSERT(range_end + 1 - cur < U32_MAX);
> 
> And we don't have to pollute the code with all of these checks.

That's indeed to the point.

Thanks,
Qu

> 
>> +        if (btrfs_dec_test_ordered_pending(inode, &ordered,
>> +                    cur, range_end + 1 - cur, 1)) {
>> +            btrfs_finish_ordered_io(ordered);
>> +            /*
>> +             * The ordered extent has finished, now we're again
>> +             * safe to delete all extent states of the range.
>> +             */
>> +            delete_states = true;
>> +        } else {
>> +            /*
>> +             * btrfs_finish_ordered_io() will get executed by endio of
>> +             * other pages, thus we can't delete extent states any more
>> +             */
>> +            delete_states = false;
> 
> This is already false.  Thanks,
> 
> Josef

  reply	other threads:[~2021-04-17  0:13 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-15  5:04 [PATCH 00/42] btrfs: add full read-write support for subpage Qu Wenruo
2021-04-15  5:04 ` [PATCH 01/42] btrfs: introduce end_bio_subpage_eb_writepage() function Qu Wenruo
2021-04-15 18:50   ` Josef Bacik
2021-04-15 23:21     ` Qu Wenruo
2021-04-15  5:04 ` [PATCH 02/42] btrfs: introduce write_one_subpage_eb() function Qu Wenruo
2021-04-15 19:03   ` Josef Bacik
2021-04-15 23:25     ` Qu Wenruo
2021-04-16 13:26       ` Josef Bacik
2021-04-18 19:45       ` Thiago Jung Bauermann
2021-04-15  5:04 ` [PATCH 03/42] btrfs: make lock_extent_buffer_for_io() to be subpage compatible Qu Wenruo
2021-04-15 19:04   ` Josef Bacik
2021-04-15  5:04 ` [PATCH 04/42] btrfs: introduce submit_eb_subpage() to submit a subpage metadata page Qu Wenruo
2021-04-15 19:27   ` Josef Bacik
2021-04-15 23:28     ` Qu Wenruo
2021-04-16 13:25       ` Josef Bacik
2021-04-15  5:04 ` [PATCH 05/42] btrfs: remove the unused parameter @len for btrfs_bio_fits_in_stripe() Qu Wenruo
2021-04-16 13:46   ` Josef Bacik
2021-04-15  5:04 ` [PATCH 06/42] btrfs: allow btrfs_bio_fits_in_stripe() to accept bio without any page Qu Wenruo
2021-04-16 13:50   ` Josef Bacik
2021-04-15  5:04 ` [PATCH 07/42] btrfs: use u32 for length related members of btrfs_ordered_extent Qu Wenruo
2021-04-16 13:54   ` Josef Bacik
2021-04-16 23:59     ` Qu Wenruo
2021-04-15  5:04 ` [PATCH 08/42] btrfs: pass btrfs_inode into btrfs_writepage_endio_finish_ordered() Qu Wenruo
2021-04-16 13:58   ` Josef Bacik
2021-04-17  0:02     ` Qu Wenruo
2021-04-15  5:04 ` [PATCH 09/42] btrfs: refactor how we finish ordered extent io for endio functions Qu Wenruo
2021-04-16 14:09   ` Josef Bacik
2021-04-17  0:06     ` Qu Wenruo
2021-04-15  5:04 ` [PATCH 10/42] btrfs: update the comments in btrfs_invalidatepage() Qu Wenruo
2021-04-16 14:32   ` Josef Bacik
2021-04-15  5:04 ` [PATCH 11/42] btrfs: refactor btrfs_invalidatepage() Qu Wenruo
2021-04-16 14:42   ` Josef Bacik
2021-04-17  0:13     ` Qu Wenruo [this message]
2021-04-15  5:04 ` [PATCH 12/42] btrfs: make Private2 lifespan more consistent Qu Wenruo
2021-04-16 14:43   ` Josef Bacik
2021-04-15  5:04 ` [PATCH 13/42] btrfs: rename PagePrivate2 to PageOrdered inside btrfs Qu Wenruo
2021-04-16 14:49   ` Josef Bacik
2021-04-15  5:04 ` [PATCH 14/42] btrfs: pass bytenr directly to __process_pages_contig() Qu Wenruo
2021-04-16 14:58   ` Josef Bacik
2021-04-17  0:15     ` Qu Wenruo
2021-04-15  5:04 ` [PATCH 15/42] btrfs: refactor the page status update into process_one_page() Qu Wenruo
2021-04-16 15:06   ` Josef Bacik
2021-04-15  5:04 ` [PATCH 16/42] btrfs: provide btrfs_page_clamp_*() helpers Qu Wenruo
2021-04-16 15:09   ` Josef Bacik
2021-04-15  5:04 ` [PATCH 17/42] btrfs: only require sector size alignment for end_bio_extent_writepage() Qu Wenruo
2021-04-16 15:13   ` Josef Bacik
2021-04-17  0:16     ` Qu Wenruo
2021-04-15  5:04 ` [PATCH 18/42] btrfs: make btrfs_dirty_pages() to be subpage compatible Qu Wenruo
2021-04-16 15:14   ` Josef Bacik
2021-04-15  5:04 ` [PATCH 19/42] btrfs: make __process_pages_contig() to handle subpage dirty/error/writeback status Qu Wenruo
2021-04-16 15:20   ` Josef Bacik
2021-04-15  5:04 ` [PATCH 20/42] btrfs: make end_bio_extent_writepage() to be subpage compatible Qu Wenruo
2021-04-16 15:21   ` Josef Bacik
2021-04-15  5:04 ` [PATCH 21/42] btrfs: make process_one_page() to handle subpage locking Qu Wenruo
2021-04-16 15:36   ` Josef Bacik
2021-04-15  5:04 ` [PATCH 22/42] btrfs: introduce helpers for subpage ordered status Qu Wenruo
2021-04-15  5:04 ` [PATCH 23/42] btrfs: make page Ordered bit to be subpage compatible Qu Wenruo
2021-04-15  5:04 ` [PATCH 24/42] btrfs: update locked page dirty/writeback/error bits in __process_pages_contig Qu Wenruo
2021-04-15  5:04 ` [PATCH 25/42] btrfs: prevent extent_clear_unlock_delalloc() to unlock page not locked by __process_pages_contig() Qu Wenruo
2021-04-15  5:04 ` [PATCH 26/42] btrfs: make btrfs_set_range_writeback() subpage compatible Qu Wenruo
2021-04-15  5:04 ` [PATCH 27/42] btrfs: make __extent_writepage_io() only submit dirty range for subpage Qu Wenruo
2021-04-15  5:04 ` [PATCH 28/42] btrfs: add extra assert for submit_extent_page() Qu Wenruo
2021-04-15  5:04 ` [PATCH 29/42] btrfs: make btrfs_truncate_block() to be subpage compatible Qu Wenruo
2021-04-15  5:04 ` [PATCH 30/42] btrfs: make btrfs_page_mkwrite() " Qu Wenruo
2021-04-15  5:04 ` [PATCH 31/42] btrfs: reflink: make copy_inline_to_page() " Qu Wenruo
2021-04-15  5:04 ` [PATCH 32/42] btrfs: fix the filemap_range_has_page() call in btrfs_punch_hole_lock_range() Qu Wenruo
2021-04-15  5:04 ` [PATCH 33/42] btrfs: don't clear page extent mapped if we're not invalidating the full page Qu Wenruo
2021-04-15  5:04 ` [PATCH 34/42] btrfs: extract relocation page read and dirty part into its own function Qu Wenruo
2021-04-15  5:04 ` [PATCH 35/42] btrfs: make relocate_one_page() to handle subpage case Qu Wenruo
2021-04-15  5:04 ` [PATCH 36/42] btrfs: fix wild subpage writeback which does not have ordered extent Qu Wenruo
2021-04-15  5:04 ` [PATCH 37/42] btrfs: disable inline extent creation for subpage Qu Wenruo
2021-04-15  5:04 ` [PATCH 38/42] btrfs: skip validation for subpage read repair Qu Wenruo
2021-04-15  5:04 ` [PATCH 39/42] btrfs: make free space cache size consistent across different PAGE_SIZE Qu Wenruo
2021-04-15  5:04 ` [PATCH 40/42] btrfs: refactor submit_extent_page() to make bio and its flag tracing easier Qu Wenruo
2021-04-15  5:04 ` [PATCH 41/42] btrfs: allow submit_extent_page() to do bio split for subpage Qu Wenruo
2021-04-15  5:04 ` [PATCH 42/42] btrfs: allow read-write for 4K sectorsize on 64K page size systems Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=143e0141-c740-9d77-21db-ef7eb1a9500c@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).