linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: riteshh <riteshh@linux.ibm.com>
To: Qu Wenruo <wqu@suse.com>
Cc: Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Ritesh Harjani <ritesh.list@gmail.com>,
	Neal Gompa <ngompa13@gmail.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH v3 00/13] btrfs: support read-write for subpage metadata
Date: Wed, 21 Apr 2021 16:43:10 +0530	[thread overview]
Message-ID: <20210421111310.q2g2fhzhlcoaykff@riteshh-domain> (raw)
In-Reply-To: <ca952f24-d0cd-fda7-c9c5-85eba3e7d04a@suse.com>

On 21/04/21 04:26PM, Qu Wenruo wrote:
>
>
> On 2021/4/21 下午3:30, riteshh wrote:
> > On 21/04/21 12:33PM, riteshh wrote:
> > > On 21/04/19 09:24PM, Qu Wenruo wrote:
> > > > [...]
> > > > > >
> > > > > > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > > > > > index 45ec3f5ef839..49f78d643392 100644
> > > > > > --- a/fs/btrfs/file.c
> > > > > > +++ b/fs/btrfs/file.c
> > > > > > @@ -1341,7 +1341,17 @@ static int prepare_uptodate_page(struct inode
> > > > > > *inode,
> > > > > >                          unlock_page(page);
> > > > > >                          return -EIO;
> > > > > >                  }
> > > > > > -               if (page->mapping != inode->i_mapping) {
> > > > > > +
> > > > > > +               /*
> > > > > > +                * Since btrfs_readpage() will get the page unlocked, we
> > > > > > have
> > > > > > +                * a window where fadvice() can try to release the page.
> > > > > > +                * Here we check both inode mapping and PagePrivate() to
> > > > > > +                * make sure the page is not released.
> > > > > > +                *
> > > > > > +                * The priavte flag check is essential for subpage as we
> > > > > > need
> > > > > > +                * to store extra bitmap using page->private.
> > > > > > +                */
> > > > > > +               if (page->mapping != inode->i_mapping ||
> > > > > > PagePrivate(page)) {
> > > > >    ^ Obviously it should be !PagePrivate(page).
> > > >
> > > > Hi Ritesh,
> > > >
> > > > Mind to have another try on generic/095?
> > > >
> > > > This time the branch is updated with the following commit at top:
> > > >
> > > > commit d700b16dced6f2e2b47e1ca5588a92216ce84dfb (HEAD -> subpage,
> > > > github/subpage)
> > > > Author: Qu Wenruo <wqu@suse.com>
> > > > Date:   Mon Apr 19 13:41:31 2021 +0800
> > > >
> > > >      btrfs: fix a crash caused by race between prepare_pages() and
> > > >      btrfs_releasepage()
> > > >
> > > > The fix uses the PagePrivate() check to avoid the problem, and passes
> > > > several generic/auto loops without any sign of crash.
> > > >
> > > > But considering I always have difficult in reproducing the bug with previous
> > > > improper fix, your verification would be very helpful.
> > > >
> > >
> > > Hi Qu,
> > >
> > > Thanks for the patch. I did try above patch but even with this I could still
> > > reproduce the issue.
> > >
> > > 1. I think the original problem could be due to below logs.
> > > 	[   79.079641] run fstests generic/095 at 2021-04-21 06:46:23
> > > 	<...>
> > > 	[   83.634710] Page cache invalidation failure on direct I/O.  Possible data corruption due to collision with buffered I/O!
> > >
> > > Meaning, there might be a race here between DIO and buffered IO.
> > > So from DIO path we call invalidate_inode_pages2_range(). Somehow this maybe
> > > causing call of btrfs_releasepage().
> > >
> > > Now from code, invalidate_inode_pages2_range() can be called from both
> > > __iomap_dio_rw() and from iomap_dio_complete(). So it is not clear as to from
> > > where this might be triggering this bug.
> >
> > I think I got one of the problem.
> > 1. we use page->private pointer as btrfs_subpage struct which also happens to
> >     hold spinlock within it.
> >
> >     Now in btrfs_subpage_clear_writeback()
> >     -> we take this spinlock  spin_lock_irqsave(&subpage->lock, flags);
> >     -> we call end_page_writeback(page);
> >     		  -> this may end up waking up invalidate_inode_pages2_range()
> > 		  which is waiting for writeback to complete.
> > 			  -> this then may also call btrfs_releasepage() on the
> > 			  same page and also free the subpage structure.
>
> This indeeds looks like a problem.
>
> This really means we need to have such a small race window below:
> (btrfs_invalidatepage() doesn't seem to be possible to race considering
>  how much work needed to be done in that function)
>
> 	Thread 1		|		Thread 2
> --------------------------------+------------------------------------
>  end_bio_extent_writepage()	| btrfs_releasepage()
>  |- spin_lock_irqsave()		| |
>  |- end_page_writeback()	| |
>  |				| |- if (PageWriteback() ||...)
>  |				| |- clear_page_extent_mapped()
>  |- spin_unlock_irqrestore().
>
> It looks like my arm boards are not fast enough to trigger the race.
>
> Although it can be fixed by doing the same thing as dirty bit, by checking
> the bitmap first and then call end_page_writeback() with spinlock unlocked.
>
> Would you please try the following fix? (based on the latest branch, which
> already has the previous fixes included).
>
> I'm also running the tests on all my arm boards to make sure it doesn't
> cause extra problem, so far so good, but my board is far from fast, thus not
> yet 100% tested.
>
> Thanks,
> Qu
>
> diff --git a/fs/btrfs/subpage.c b/fs/btrfs/subpage.c
> index 696485ab68a2..c5abf9745c10 100644
> --- a/fs/btrfs/subpage.c
> +++ b/fs/btrfs/subpage.c
> @@ -420,13 +420,16 @@ void btrfs_subpage_clear_writeback(const struct
> btrfs_fs_info *fs_info,
>  {
>         struct btrfs_subpage *subpage = (struct btrfs_subpage
> *)page->private;
>         u16 tmp = btrfs_subpage_calc_bitmap(fs_info, page, start, len);
> +       bool finished = false;
>         unsigned long flags;
>
>         spin_lock_irqsave(&subpage->lock, flags);
>         subpage->writeback_bitmap &= ~tmp;
>         if (subpage->writeback_bitmap == 0)
> -               end_page_writeback(page);
> +               finished = true;
>         spin_unlock_irqrestore(&subpage->lock, flags);
> +       if (finished)
> +               end_page_writeback(page);
>  }
>
>  void btrfs_subpage_set_ordered(const struct btrfs_fs_info *fs_info,

Thanks for this patch. I have re-tested generic/095 with 100 iterations and -g
quick (with both of your patches). I don't see this issue anymore.
So with the two patches (including above one) the race with
btrfs_releasepage() is now fixed.


For both of these patches, please feel free to add:

Reported-by: Ritesh Harjani <riteshh@linux.ibm.com>
Tested-by: Ritesh Harjani <riteshh@linux.ibm.com>

-ritesh

  reply	other threads:[~2021-04-21 11:13 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-25  7:14 [PATCH v3 00/13] btrfs: support read-write for subpage metadata Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 01/13] btrfs: add sysfs interface for supported sectorsize Qu Wenruo
2021-03-25 14:41   ` Anand Jain
2021-03-29 18:20     ` David Sterba
2021-04-01 22:32       ` Anand Jain
2021-04-01 17:56   ` David Sterba
2021-03-25  7:14 ` [PATCH v3 02/13] btrfs: use min() to replace open-code in btrfs_invalidatepage() Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 03/13] btrfs: remove unnecessary variable shadowing " Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 04/13] btrfs: refactor how we iterate ordered extent " Qu Wenruo
2021-04-02  1:15   ` Anand Jain
2021-04-02  3:33     ` Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 05/13] btrfs: introduce helpers for subpage dirty status Qu Wenruo
2021-04-01 18:11   ` David Sterba
2021-03-25  7:14 ` [PATCH v3 06/13] btrfs: introduce helpers for subpage writeback status Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 07/13] btrfs: allow btree_set_page_dirty() to do more sanity check on subpage metadata Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 08/13] btrfs: support subpage metadata csum calculation at write time Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 09/13] btrfs: make alloc_extent_buffer() check subpage dirty bitmap Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 10/13] btrfs: make the page uptodate assert to be subpage compatible Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 11/13] btrfs: make set/clear_extent_buffer_dirty() " Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 12/13] btrfs: make set_btree_ioerr() accept extent buffer and " Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 13/13] btrfs: add subpage overview comments Qu Wenruo
2021-03-25 12:20 ` [PATCH v3 00/13] btrfs: support read-write for subpage metadata Neal Gompa
2021-03-25 13:16   ` Qu Wenruo
2021-03-28 20:02     ` Ritesh Harjani
2021-03-29  2:01       ` Qu Wenruo
2021-04-02  1:39         ` Anand Jain
2021-04-02  3:26           ` Qu Wenruo
2021-04-02  8:33         ` Ritesh Harjani
2021-04-02  8:36           ` Qu Wenruo
2021-04-02  8:46             ` Ritesh Harjani
2021-04-02  8:52               ` Qu Wenruo
2021-04-12 11:33                 ` Qu Wenruo
2021-04-15  3:44                   ` riteshh
2021-04-15 14:52                     ` riteshh
2021-04-15 23:19                       ` Qu Wenruo
2021-04-15 23:34                         ` Qu Wenruo
2021-04-16  1:34                           ` Qu Wenruo
2021-04-16  5:50                             ` riteshh
2021-04-16  6:14                               ` Qu Wenruo
2021-04-16 16:52                                 ` riteshh
2021-04-19  5:59                                   ` riteshh
2021-04-19  6:16                                     ` Qu Wenruo
2021-04-19  7:04                                       ` riteshh
2021-04-19  7:19                                       ` Qu Wenruo
2021-04-19 13:24                                         ` Qu Wenruo
2021-04-21  7:03                                           ` riteshh
2021-04-21  7:15                                             ` Qu Wenruo
2021-04-21  7:30                                             ` riteshh
2021-04-21  8:26                                               ` Qu Wenruo
2021-04-21 11:13                                                 ` riteshh [this message]
2021-04-21 11:42                                                   ` Qu Wenruo
2021-04-21 12:15                                                     ` riteshh
2021-03-29 18:53 ` David Sterba
2021-04-01  5:36   ` Qu Wenruo
2021-04-01 17:55     ` David Sterba
2021-04-02  1:27     ` Anand Jain
2021-04-03 11:08 ` David Sterba
2021-04-05  6:14   ` Qu Wenruo
2021-04-06  2:31     ` Anand Jain
2021-04-06 19:20       ` David Sterba
2021-04-06 23:59       ` Qu Wenruo
2021-04-06 19:13     ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210421111310.q2g2fhzhlcoaykff@riteshh-domain \
    --to=riteshh@linux.ibm.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=ngompa13@gmail.com \
    --cc=quwenruo.btrfs@gmx.com \
    --cc=ritesh.list@gmail.com \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).