From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v3 08/22] btrfs: extent_io: make attach_extent_buffer_page() to handle subpage case
Date: Wed, 6 Jan 2021 14:54:46 +0800 [thread overview]
Message-ID: <844a6c6b-c5e9-c0c8-0177-fe3ae015c108@suse.com> (raw)
In-Reply-To: <20210106010201.37864-9-wqu@suse.com>
On 2021/1/6 9:01 AM, Qu Wenruo wrote:
> For subpage case, we need to allocate new memory for each metadata page.
>
> So we need to:
> - Allow attach_extent_buffer_page() to return int
> To indicate allocation failure
>
> - Prealloc page->private for alloc_extent_buffer()
> We don't want to allocate memory while holding a spinlock, so
> do the preallocation before we acquire the spinlock.
>
> - Handle subpage and regular case differently in
> attach_extent_buffer_page()
> For regular case, just do the usual thing.
> For subpage case, allocate new memory and update the tree_block
> bitmap.
>
> The bitmap update will be handled by new subpage specific helper,
> btrfs_subpage_set_tree_block().
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> fs/btrfs/extent_io.c | 74 ++++++++++++++++++++++++++++++++++----------
> fs/btrfs/subpage.h | 50 ++++++++++++++++++++++++++++++
> 2 files changed, 108 insertions(+), 16 deletions(-)
>
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index d60f1837f8fb..2eeff925450f 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -24,6 +24,7 @@
> #include "rcu-string.h"
> #include "backref.h"
> #include "disk-io.h"
> +#include "subpage.h"
>
> static struct kmem_cache *extent_state_cache;
> static struct kmem_cache *extent_buffer_cache;
> @@ -3140,22 +3141,41 @@ static int submit_extent_page(unsigned int opf,
> return ret;
> }
>
> -static void attach_extent_buffer_page(struct extent_buffer *eb,
> +static int attach_extent_buffer_page(struct extent_buffer *eb,
> struct page *page)
> {
> - /*
> - * If the page is mapped to btree inode, we should hold the private
> - * lock to prevent race.
> - * For cloned or dummy extent buffers, their pages are not mapped and
> - * will not race with any other ebs.
> - */
> - if (page->mapping)
> - lockdep_assert_held(&page->mapping->private_lock);
> + struct btrfs_fs_info *fs_info = eb->fs_info;
> + int ret;
>
> - if (!PagePrivate(page))
> - attach_page_private(page, eb);
> - else
> - WARN_ON(page->private != (unsigned long)eb);
> + if (fs_info->sectorsize == PAGE_SIZE) {
> + /*
> + * If the page is mapped to btree inode, we should hold the
> + * private lock to prevent race.
> + * For cloned or dummy extent buffers, their pages are not
> + * mapped and will not race with any other ebs.
> + */
> + if (page->mapping)
> + lockdep_assert_held(&page->mapping->private_lock);
> +
> + if (!PagePrivate(page))
> + attach_page_private(page, eb);
> + else
> + WARN_ON(page->private != (unsigned long)eb);
> + return 0;
> + }
> +
> + /* Already mapped, just update the existing range */
> + if (PagePrivate(page))
> + goto update_bitmap;
> +
> + /* Do new allocation to attach subpage */
> + ret = btrfs_attach_subpage(fs_info, page);
> + if (ret < 0)
> + return ret;
> +
> +update_bitmap:
> + btrfs_subpage_set_tree_block(fs_info, page, eb->start, eb->len);
> + return 0;
> }
>
> void set_page_extent_mapped(struct page *page)
> @@ -5063,21 +5083,29 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src)
> if (new == NULL)
> return NULL;
>
> + set_bit(EXTENT_BUFFER_UPTODATE, &new->bflags);
> + set_bit(EXTENT_BUFFER_UNMAPPED, &new->bflags);
> +
> for (i = 0; i < num_pages; i++) {
> + int ret;
> +
> p = alloc_page(GFP_NOFS);
> if (!p) {
> btrfs_release_extent_buffer(new);
> return NULL;
> }
> - attach_extent_buffer_page(new, p);
> + ret = attach_extent_buffer_page(new, p);
> + if (ret < 0) {
> + put_page(p);
> + btrfs_release_extent_buffer(new);
> + return NULL;
> + }
> WARN_ON(PageDirty(p));
> SetPageUptodate(p);
> new->pages[i] = p;
> copy_page(page_address(p), page_address(src->pages[i]));
> }
>
> - set_bit(EXTENT_BUFFER_UPTODATE, &new->bflags);
> - set_bit(EXTENT_BUFFER_UNMAPPED, &new->bflags);
>
> return new;
> }
> @@ -5316,6 +5344,18 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
> goto free_eb;
> }
>
> + /*
> + * Preallocate page->private for subpage case, so that
> + * we won't allocate memory while holding private_lock.
> + */
> + ret = btrfs_attach_subpage(fs_info, p);
Although we try to preallocate the subpage structure here, it's not
reliable.
I hit a case just minutes ago where we still try to allocate memory
inside the private_lock spinlock, causing a sleep-in-atomic warning.
The problem is a race where the page has one existing eb that is being
freed.
At the point we check, page::private is still set, so we skip the
preallocation.
But before we acquire private_lock, that eb gets freed, and since it is
the only eb of the page (our eb has not been added to the page yet), the
release path detaches page private.
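The race can be sketched as the following interleaving (reconstructed
from the description above, not from an actual trace):

```
Thread A (alloc_extent_buffer)        Thread B (release path)
------------------------------        -----------------------
PagePrivate(page) == true
(subpage struct already attached,
 so the preallocation is skipped)
                                      free the last eb of the page
                                      detach_page_private()
spin_lock(&private_lock)
attach_extent_buffer_page()
  -> PagePrivate(page) now false
  -> btrfs_attach_subpage()
     allocates under the spinlock  <-- sleep-in-atomic warning
```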
> + if (ret < 0) {
> + unlock_page(p);
> + put_page(p);
> + exists = ERR_PTR(-ENOMEM);
> + goto free_eb;
> + }
> +
> spin_lock(&mapping->private_lock);
> exists = grab_extent_buffer(p);
> if (exists) {
> @@ -5325,8 +5365,10 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
> mark_extent_buffer_accessed(exists, p);
> goto free_eb;
> }
> + /* Should not fail, as we have attached the subpage already */
> attach_extent_buffer_page(eb, p);
So here we cannot rely on any check made before we acquire
private_lock.
Thus I guess we have to preallocate the memory manually and pass the
pointer in.
I wonder why I didn't hit the bug before sending the patches...
Thanks,
Qu
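A minimal sketch of that idea, assuming hypothetical helpers
btrfs_alloc_subpage()/btrfs_free_subpage() and an extra parameter to
attach_extent_buffer_page() (none of which exist in this series):

```c
/*
 * Hypothetical flow: preallocate the btrfs_subpage structure before
 * taking private_lock, then attach the already-allocated structure
 * under the lock, so no allocation can sleep inside the spinlock.
 */
struct btrfs_subpage *prealloc;

prealloc = btrfs_alloc_subpage(fs_info);	/* assumed helper, GFP_NOFS */
if (IS_ERR(prealloc))
	return ERR_CAST(prealloc);

spin_lock(&mapping->private_lock);
exists = grab_extent_buffer(p);
if (exists) {
	spin_unlock(&mapping->private_lock);
	btrfs_free_subpage(prealloc);		/* assumed helper */
	/* ... existing "eb already exists" handling ... */
}
/* Attach consumes @prealloc; never allocates under the spinlock. */
attach_extent_buffer_page(eb, p, prealloc);
spin_unlock(&mapping->private_lock);
```

Since the attach path only consumes memory handed to it, the race on
page::private no longer matters for allocation context.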
> spin_unlock(&mapping->private_lock);
> +
> WARN_ON(PageDirty(p));
> eb->pages[i] = p;
> if (!PageUptodate(p))
> diff --git a/fs/btrfs/subpage.h b/fs/btrfs/subpage.h
> index 96f3b226913e..e49d4a7329e1 100644
> --- a/fs/btrfs/subpage.h
> +++ b/fs/btrfs/subpage.h
> @@ -23,9 +23,59 @@
> struct btrfs_subpage {
> /* Common members for both data and metadata pages */
> spinlock_t lock;
> + union {
> + /* Structures only used by metadata */
> + struct {
> + u16 tree_block_bitmap;
> + };
> + /* structures only used by data */
> + };
> };
>
> int btrfs_attach_subpage(struct btrfs_fs_info *fs_info, struct page *page);
> void btrfs_detach_subpage(struct btrfs_fs_info *fs_info, struct page *page);
>
> +/*
> + * Convert the [start, start + len) range into a u16 bitmap
> + *
> + * E.g. if start == page_offset() + 16K, len = 16K, we get 0x00f0.
> + */
> +static inline u16 btrfs_subpage_calc_bitmap(struct btrfs_fs_info *fs_info,
> + struct page *page, u64 start, u32 len)
> +{
> + int bit_start = offset_in_page(start) >> fs_info->sectorsize_bits;
> + int nbits = len >> fs_info->sectorsize_bits;
> +
> + /* Basic checks */
> + ASSERT(PagePrivate(page) && page->private);
> + ASSERT(IS_ALIGNED(start, fs_info->sectorsize) &&
> + IS_ALIGNED(len, fs_info->sectorsize));
> +
> + /*
> + * The range check only works for mapped pages; we can
> + * still have unmapped pages, like dummy extent buffer pages.
> + */
> + if (page->mapping)
> + ASSERT(page_offset(page) <= start &&
> + start + len <= page_offset(page) + PAGE_SIZE);
> + /*
> + * Here nbits can be 16, so the shifted value can go beyond the u16
> + * range. Do the left shift in unsigned long first, then truncate
> + * the result to u16.
> + */
> + return (u16)(((1UL << nbits) - 1) << bit_start);
> +}
> +
> +static inline void btrfs_subpage_set_tree_block(struct btrfs_fs_info *fs_info,
> + struct page *page, u64 start, u32 len)
> +{
> + struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private;
> + unsigned long flags;
> + u16 tmp = btrfs_subpage_calc_bitmap(fs_info, page, start, len);
> +
> + spin_lock_irqsave(&subpage->lock, flags);
> + subpage->tree_block_bitmap |= tmp;
> + spin_unlock_irqrestore(&subpage->lock, flags);
> +}
> +
> #endif /* BTRFS_SUBPAGE_H */
>