Linux Btrfs filesystem development
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Josef Bacik <josef@toxicpanda.com>, Qu Wenruo <wqu@suse.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v4 16/18] btrfs: introduce btrfs_subpage for data inodes
Date: Tue, 26 Jan 2021 15:05:54 +0800	[thread overview]
Message-ID: <f8b931b1-d6bd-2b1d-0f00-74dcc5775dbe@gmx.com> (raw)
In-Reply-To: <886e0c40-67e6-9700-1373-b29de2e3be95@toxicpanda.com>



On 2021/1/20 11:28 PM, Josef Bacik wrote:
> On 1/16/21 2:15 AM, Qu Wenruo wrote:
>> To support subpage sector size, data inodes also need extra info to
>> track which sectors in a page are uptodate/dirty/...
>>
>> This patch makes pages of data inodes get a btrfs_subpage structure
>> attached, which is detached when the page is freed.
>>
>> This patch also slightly changes the timing of when
>> set_page_extent_mapped() is called, to make sure:
>>
>> - We have page->mapping set
>>    page->mapping->host is used to grab btrfs_fs_info, thus we can only
>>    call this function after the page is mapped to an inode.
>>
>>    One call site attaches pages to the inode manually, thus we have to
>>    modify the timing of set_page_extent_mapped() a little.
>>
>> - As soon as possible, before other operations
>>    Since memory allocation can fail, we have to do extra error handling.
>>    Calling set_page_extent_mapped() as soon as possible simplifies the
>>    error handling for several call sites.
>>
>> The idea is pretty much the same as iomap_page, but with more bitmaps
>> for btrfs-specific cases.
>>
>> Currently the plan is to switch to iomap if iomap can provide sector
>> aligned writeback (writing back only the dirty sectors rather than the
>> full page; data balance requires this feature).
>>
>> So we will stick to the btrfs-specific bitmap for now.
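The per-sector bookkeeping described above boils down to one bitmap word per page with one bit per sector. A minimal userspace C sketch of the idea (type and function names here are illustrative, not the actual kernel structures):

```c
#include <stdint.h>

#define PAGE_SIZE_SK   65536u  /* e.g. a 64K page */
#define SECTORSIZE_SK  4096u   /* a 4K sector */
#define SECTORS_PER_PAGE (PAGE_SIZE_SK / SECTORSIZE_SK)

struct subpage_sketch {
	uint32_t uptodate_bitmap;  /* bit n set == sector n is uptodate */
};

/* Mark the sector-aligned range [start, start + len) uptodate. */
static void subpage_set_uptodate(struct subpage_sketch *sp,
				 uint32_t start, uint32_t len)
{
	uint32_t first = start / SECTORSIZE_SK;
	uint32_t nr = len / SECTORSIZE_SK;

	for (uint32_t i = 0; i < nr; i++)
		sp->uptodate_bitmap |= 1u << (first + i);
}

/* The whole page counts as uptodate only when every sector bit is set. */
static int subpage_page_uptodate(const struct subpage_sketch *sp)
{
	return sp->uptodate_bitmap == (1u << SECTORS_PER_PAGE) - 1u;
}
```

With 64K pages and 4K sectors this needs only 16 bits per state, which is why a plain bitmap word per tracked state is enough.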
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   fs/btrfs/compression.c      | 10 ++++++--
>>   fs/btrfs/extent_io.c        | 46 +++++++++++++++++++++++++++++++++----
>>   fs/btrfs/extent_io.h        |  3 ++-
>>   fs/btrfs/file.c             | 24 ++++++++-----------
>>   fs/btrfs/free-space-cache.c | 15 +++++++++---
>>   fs/btrfs/inode.c            | 12 ++++++----
>>   fs/btrfs/ioctl.c            |  5 +++-
>>   fs/btrfs/reflink.c          |  5 +++-
>>   fs/btrfs/relocation.c       | 12 ++++++++--
>>   9 files changed, 99 insertions(+), 33 deletions(-)
>>
>> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
>> index 5ae3fa0386b7..6d203acfdeb3 100644
>> --- a/fs/btrfs/compression.c
>> +++ b/fs/btrfs/compression.c
>> @@ -542,13 +542,19 @@ static noinline int add_ra_bio_pages(struct inode *inode,
>>               goto next;
>>           }
>> -        end = last_offset + PAGE_SIZE - 1;
>>           /*
>>            * at this point, we have a locked page in the page cache
>>            * for these bytes in the file.  But, we have to make
>>            * sure they map to this compressed extent on disk.
>>            */
>> -        set_page_extent_mapped(page);
>> +        ret = set_page_extent_mapped(page);
>> +        if (ret < 0) {
>> +            unlock_page(page);
>> +            put_page(page);
>> +            break;
>> +        }
>> +
>> +        end = last_offset + PAGE_SIZE - 1;
>>           lock_extent(tree, last_offset, end);
>>           read_lock(&em_tree->lock);
>>           em = lookup_extent_mapping(em_tree, last_offset,
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index 35fbef15d84e..4bce03fed205 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -3194,10 +3194,39 @@ static int attach_extent_buffer_page(struct extent_buffer *eb,
>>       return 0;
>>   }
>> -void set_page_extent_mapped(struct page *page)
>> +int __must_check set_page_extent_mapped(struct page *page)
>>   {
>> +    struct btrfs_fs_info *fs_info;
>> +
>> +    ASSERT(page->mapping);
>> +
>> +    if (PagePrivate(page))
>> +        return 0;
>> +
>> +    fs_info = btrfs_sb(page->mapping->host->i_sb);
>> +
>> +    if (fs_info->sectorsize < PAGE_SIZE)
>> +        return btrfs_attach_subpage(fs_info, page);
>> +
>> +    attach_page_private(page, (void *)EXTENT_PAGE_PRIVATE);
>> +    return 0;
>> +
>> +}
>> +
>> +void clear_page_extent_mapped(struct page *page)
>> +{
>> +    struct btrfs_fs_info *fs_info;
>> +
>> +    ASSERT(page->mapping);
>> +
>>       if (!PagePrivate(page))
>> -        attach_page_private(page, (void *)EXTENT_PAGE_PRIVATE);
>> +        return;
>> +
>> +    fs_info = btrfs_sb(page->mapping->host->i_sb);
>> +    if (fs_info->sectorsize < PAGE_SIZE)
>> +        return btrfs_detach_subpage(fs_info, page);
>> +
>> +    detach_page_private(page);
>>   }
>>   static struct extent_map *
>> @@ -3254,7 +3283,12 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
>>       unsigned long this_bio_flag = 0;
>>       struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
>> -    set_page_extent_mapped(page);
>> +    ret = set_page_extent_mapped(page);
>> +    if (ret < 0) {
>> +        unlock_extent(tree, start, end);
>> +        SetPageError(page);
>> +        goto out;
>> +    }
>>       if (!PageUptodate(page)) {
>>           if (cleancache_get_page(page) == 0) {
>> @@ -3694,7 +3728,11 @@ static int __extent_writepage(struct page *page, struct writeback_control *wbc,
>>           flush_dcache_page(page);
>>       }
>> -    set_page_extent_mapped(page);
>> +    ret = set_page_extent_mapped(page);
>> +    if (ret < 0) {
>> +        SetPageError(page);
>> +        goto done;
>> +    }
>>       if (!epd->extent_locked) {
>>           ret = writepage_delalloc(BTRFS_I(inode), page, wbc, start,
>> diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
>> index bedf761a0300..357a3380cd42 100644
>> --- a/fs/btrfs/extent_io.h
>> +++ b/fs/btrfs/extent_io.h
>> @@ -178,7 +178,8 @@ int btree_write_cache_pages(struct address_space *mapping,
>>   void extent_readahead(struct readahead_control *rac);
>>   int extent_fiemap(struct btrfs_inode *inode, struct fiemap_extent_info *fieinfo,
>>             u64 start, u64 len);
>> -void set_page_extent_mapped(struct page *page);
>> +int __must_check set_page_extent_mapped(struct page *page);
>> +void clear_page_extent_mapped(struct page *page);
>>   struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
>>                         u64 start, u64 owner_root, int level);
>> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
>> index d81ae1f518f2..63b290210eaa 100644
>> --- a/fs/btrfs/file.c
>> +++ b/fs/btrfs/file.c
>> @@ -1369,6 +1369,12 @@ static noinline int prepare_pages(struct inode *inode, struct page **pages,
>>               goto fail;
>>           }
>> +        err = set_page_extent_mapped(pages[i]);
>> +        if (err < 0) {
>> +            faili = i;
>> +            goto fail;
>> +        }
>> +
>>           if (i == 0)
>>               err = prepare_uptodate_page(inode, pages[i], pos,
>>                               force_uptodate);
>> @@ -1453,23 +1459,11 @@ lock_and_cleanup_extent_if_need(struct btrfs_inode *inode, struct page **pages,
>>       }
>>       /*
>> -     * It's possible the pages are dirty right now, but we don't want
>> -     * to clean them yet because copy_from_user may catch a page fault
>> -     * and we might have to fall back to one page at a time.  If that
>> -     * happens, we'll unlock these pages and we'd have a window where
>> -     * reclaim could sneak in and drop the once-dirty page on the floor
>> -     * without writing it.
>> -     *
>> -     * We have the pages locked and the extent range locked, so there's
>> -     * no way someone can start IO on any dirty pages in this range.
>> -     *
>> -     * We'll call btrfs_dirty_pages() later on, and that will flip around
>> -     * delalloc bits and dirty the pages as required.
>> +     * We should be called after prepare_pages() which should have
>> +     * locked all pages in the range.
>>        */
>> -    for (i = 0; i < num_pages; i++) {
>> -        set_page_extent_mapped(pages[i]);
>> +    for (i = 0; i < num_pages; i++)
>>           WARN_ON(!PageLocked(pages[i]));
>> -    }
>>       return ret;
>>   }
>> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
>> index fd6ddd6b8165..379bef967e1d 100644
>> --- a/fs/btrfs/free-space-cache.c
>> +++ b/fs/btrfs/free-space-cache.c
>> @@ -431,11 +431,22 @@ static int io_ctl_prepare_pages(struct btrfs_io_ctl *io_ctl, bool uptodate)
>>       int i;
>>       for (i = 0; i < io_ctl->num_pages; i++) {
>> +        int ret;
>> +
>>           page = find_or_create_page(inode->i_mapping, i, mask);
>>           if (!page) {
>>               io_ctl_drop_pages(io_ctl);
>>               return -ENOMEM;
>>           }
>> +
>> +        ret = set_page_extent_mapped(page);
>> +        if (ret < 0) {
>> +            unlock_page(page);
>> +            put_page(page);
>> +            io_ctl_drop_pages(io_ctl);
>> +            return -ENOMEM;
>> +        }
>
> If we're going to declare ret here we might as well
>
> return ret;
>
> otherwise we could just lose the error if we add some other error in the
> future.
>
> <snip>
>
>> @@ -8345,7 +8347,9 @@ vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
>>       wait_on_page_writeback(page);
>>       lock_extent_bits(io_tree, page_start, page_end, &cached_state);
>> -    set_page_extent_mapped(page);
>> +    ret2 = set_page_extent_mapped(page);
>> +    if (ret2 < 0)
>> +        goto out_unlock;
>
> We lose the error in this case, you need
>
> if (ret2 < 0) {
>      ret = vmf_error(ret2);
>      goto out_unlock;
> }
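For reference, vmf_error() conceptually maps a negative errno to a fault status so the failure is propagated rather than silently dropped. A userspace sketch of that mapping (the constants below stand in for the kernel's vm_fault_t flags and are not the real values):

```c
#include <errno.h>

/* Illustrative stand-ins for the kernel's vm_fault_t flags. */
#define VM_FAULT_OOM_SK    0x0001u
#define VM_FAULT_SIGBUS_SK 0x0002u

/* Map a negative errno to a fault-handler status, as vmf_error() does:
 * out-of-memory becomes an OOM fault, anything else becomes SIGBUS. */
static unsigned int vmf_error_sketch(int err)
{
	if (err == -ENOMEM)
		return VM_FAULT_OOM_SK;
	return VM_FAULT_SIGBUS_SK;
}
```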
>
>>       /*
>>        * we can't set the delalloc bits if there are pending ordered
>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>> index 7f2935ea8d3a..50a9d784bdc2 100644
>> --- a/fs/btrfs/ioctl.c
>> +++ b/fs/btrfs/ioctl.c
>> @@ -1314,6 +1314,10 @@ static int cluster_pages_for_defrag(struct inode *inode,
>>           if (!page)
>>               break;
>> +        ret = set_page_extent_mapped(page);
>> +        if (ret < 0)
>> +            break;
>> +
>
> You are leaving a page locked and leaving it referenced here, you need
>
> if (ret < 0) {
>      unlock_page(page);
>      put_page(page);
>      break;
> }
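The underlying rule generalizes beyond this call site: once a page has been found, locked, and referenced, every later error exit must unlock and unreference it before bailing out. A small userspace sketch of that pattern (all names hypothetical):

```c
struct fake_page {
	int refcount;
	int locked;
};

/* Simulate find_or_create_page(): the page comes back locked and referenced. */
static void fake_get_locked_page(struct fake_page *p)
{
	p->refcount++;
	p->locked = 1;
}

static void fake_unlock_page(struct fake_page *p)
{
	p->locked = 0;
}

static void fake_put_page(struct fake_page *p)
{
	p->refcount--;
}

/*
 * Returns 0 on success.  On failure, both the lock and the reference
 * taken above must be dropped before returning, otherwise the page is
 * leaked locked and referenced.
 */
static int fake_prepare_page(struct fake_page *p, int simulate_failure)
{
	fake_get_locked_page(p);
	if (simulate_failure) {
		fake_unlock_page(p);
		fake_put_page(p);
		return -1;
	}
	return 0;
}
```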

Awesome review!

My gut feeling was telling me something might go wrong with this change,
but I didn't check it carefully enough...

Thank you very much for catching these error-path bugs,
Qu

>
> thanks,
>
> Josef

