linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: riteshh <riteshh@linux.ibm.com>
To: Qu Wenruo <wqu@suse.com>
Cc: Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Ritesh Harjani <ritesh.list@gmail.com>,
	Neal Gompa <ngompa13@gmail.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH v3 00/13] btrfs: support read-write for subpage metadata
Date: Mon, 19 Apr 2021 11:29:15 +0530	[thread overview]
Message-ID: <20210419055915.valqohgrgrdy4pvf@riteshh-domain> (raw)
In-Reply-To: <20210416165253.mexotq7ixpcvcvov@riteshh-domain>

On 21/04/16 10:22PM, riteshh wrote:
> On 21/04/16 02:14PM, Qu Wenruo wrote:
> >
> >
> > On 2021/4/16 下午1:50, riteshh wrote:
> > > On 21/04/16 09:34AM, Qu Wenruo wrote:
> > > >
> > > >
> > > > On 2021/4/16 上午7:34, Qu Wenruo wrote:
> > > > >
> > > > >
> > > > > On 2021/4/16 上午7:19, Qu Wenruo wrote:
> > > > > >
> > > > > >
> > > > > > On 2021/4/15 下午10:52, riteshh wrote:
> > > > > > > On 21/04/15 09:14AM, riteshh wrote:
> > > > > > > > On 21/04/12 07:33PM, Qu Wenruo wrote:
> > > > > > > > > Good news, you can fetch the subpage branch for better test results.
> > > > > > > > >
> > > > > > > > > Now the branch should pass all generic tests, except defrag and known
> > > > > > > > > failures.
> > > > > > > > > And no more random crash during the tests.
> > > > > > > >
> > > > > > > > Thanks, let me test it on PPC64 box.
> > > > > > >
> > > > > > > I do see some failures remaining with the patch series.
> > > > > > > However the one which is blocking my testing is the tests/generic/095
> > > > > > > I see kernel BUG hitting with below signature.
> > > > > >
> > > > > > That's pretty different from my tests.
> > > > > >
> > > > > > As I haven't seen such BUG_ON() for a while.
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > Please let me know if this a known failure?
> > > > > > >
> > > > > > > <xfstests config>
> > > > > > > #:~/work-tools/xfstests$ sudo ./check -g auto
> > > > > > > SECTION       -- btrfs_4k
> > > > > > > FSTYP         -- btrfs
> > > > > > > PLATFORM      -- Linux/ppc64le qemu 5.12.0-rc7-02316-g3490dae50c0 #73
> > > > > > > SMP Thu Apr 15 07:29:23 CDT 2021
> > > > > > > MKFS_OPTIONS  -- -f -s 4096 -n 4096 /dev/loop3
> > > > > >
> > > > > > I see you're using -n 4096, not the default -n 16K, let me see if I can
> > > > > > reproduce that.
> > > > > >
> > > > > > But from the backtrace, it doesn't look like the case,
> > > > > > as it happens for data path, which means it's only related to sectorsize.
> > > > > >
> > > > > > > MOUNT_OPTIONS -- /dev/loop3 /mnt1/scratch
> > > > > > >
> > > > > > >
> > > > > > > <kernel logs>
> > > > > > > [ 6057.560580] BTRFS warning (device loop3): read-write for sector
> > > > > > > size 4096 with page size 65536 is experimental
> > > > > > > [ 6057.861383] run fstests generic/095 at 2021-04-15 14:12:10
> > > > > > > [ 6058.345127] BTRFS info (device loop2): disk space caching is enabled
> > > > > > > [ 6058.348910] BTRFS info (device loop2): has skinny extents
> > > > > > > [ 6058.351930] BTRFS warning (device loop2): read-write for sector
> > > > > > > size 4096 with page size 65536 is experimental
> > > > > > > [ 6059.896382] BTRFS: device fsid 43ec9cdf-c124-4460-ad93-933bfd5ddbbd
> > > > > > > devid 1 transid 5 /dev/loop3 scanned by mkfs.btrfs (739641)
> > > > > > > [ 6060.225107] BTRFS info (device loop3): disk space caching is enabled
> > > > > > > [ 6060.226213] BTRFS info (device loop3): has skinny extents
> > > > > > > [ 6060.227084] BTRFS warning (device loop3): read-write for sector
> > > > > > > size 4096 with page size 65536 is experimental
> > > > > > > [ 6060.234537] BTRFS info (device loop3): checking UUID tree
> > > > > > > [ 6061.375902] assertion failed: PagePrivate(page) && page->private,
> > > > > > > in fs/btrfs/subpage.c:171
> > > > > > > [ 6061.378296] ------------[ cut here ]------------
> > > > > > > [ 6061.379422] kernel BUG at fs/btrfs/ctree.h:3403!
> > > > > > > cpu 0x5: Vector: 700 (Program Check) at [c0000000260d7490]
> > > > > > >       pc: c000000000a9370c: assertfail.constprop.11+0x34/0x48
> > > > > > >       lr: c000000000a93708: assertfail.constprop.11+0x30/0x48
> > > > > > >       sp: c0000000260d7730
> > > > > > >      msr: 800000000282b033
> > > > > > >     current = 0xc0000000260c0080
> > > > > > >     paca    = 0xc00000003fff8a00   irqmask: 0x03   irq_happened: 0x01
> > > > > > >       pid   = 739712, comm = fio
> > > > > > > kernel BUG at fs/btrfs/ctree.h:3403!
> > > > > > > Linux version 5.12.0-rc7-02316-g3490dae50c0 (riteshh@xxxx) (gcc
> > > > > > > (Ubuntu 8.4.0-1ubuntu1~18.04) 8.4.0, GNU ld (GNU Binutils for Ubuntu)
> > > > > > > 2.30) #73 SMP Thu Apr 15 07:29:23 CDT 2021
> > > > > > > enter ? for help
> > > > > > > [c0000000260d7790] c000000000a90280
> > > > > > > btrfs_subpage_assert.isra.9+0x70/0x110
> > > > > > > [c0000000260d77b0] c000000000a91064
> > > > > > > btrfs_subpage_set_uptodate+0x54/0x110
> > > > > > > [c0000000260d7800] c0000000009c6d0c btrfs_dirty_pages+0x1bc/0x2c0
> > > > > >
> > > > > > This is very strange.
> > > > > > As in btrfs_dirty_pages(), the pages passed in are already prepared by
> > > > > > prepare_pages(), which means all of them should have Private set.
> > > > > >
> > > > > > Can you reproduce the bug reliable?
> > >
> > > Yes. almost reliably on my PPC box.
> > >
> > > > >
> > > > > OK, I got it reproduced.
> > > > >
> > > > > It's not a reliable BUG_ON(), but can be reproduced.
> > > > > The test get skipped for all my boards as it requires fio tool, thus I
> > > > > didn't get it triggered for all previous runs.
> > > > >
> > > > > I'll take a look into the case.
> > > >
> > > > This exposed an interesting race window in btrfs_buffered_write():
> > > >          Writer                    |             fadvice
> > > > ----------------------------------+-------------------------------
> > > > btrfs_buffered_write()            |
> > > > |- prepare_pages()                |
> > > > |  |- Now all pages involved get  |
> > > > |     Private set                 |
> > > > |                                 | btrfs_release_page()
> > > > |                                 | |- Clear page Private
> > > > |- lock_extent()                  |
> > > > |  |- This would prevent          |
> > > > |     btrfs_release_page() to     |
> > > > |     clear the page Private      |
> > > > |
> > > > |- btrfs_dirty_page()
> > > >     |- Will trigger the BUG_ON()
> > >
> > >
> > > Sorry about the silly query. But help me understand how is above race possible?
> > > Won't prepare_pages() will lock all the pages first. The same requirement
> > > of locked page should be with btrfs_releasepage() too no?
> >
> > releasepage() call can easily got a page locked and release it.
> >
> > For call sites like btrfs_invalidatepage(), the page is already locked.
> >
> > btrfs_releasepage() will not to try to release the page if the extent is
> > locked (any extent range inside the page has EXTENT_LOCK bit).
> >
> > >
> > > I see only two paths which could result into btrfs_releasepage()
> > > 1. one via try_to_release_pages -> releasepage()
> >
> > This is the race one, called from fadvice() to release pages.
> >
> > > 2. writeback path calling btrfs_writepage or btrfs_writepages
> > > 	which may result into calling of btrfs_invalidatepage()
> >
> > Not this one.
> >
> > >
> > > Although I am not sure which one this is racing with.
> > >
> > > >
> > > > This only happens for subpage, because subpage introduces new ASSERT()
> > > > to do extra check.
> > > >
> > > > If we want to speak strictly, regular sector size should also report
> > > > this problem.
> > > > But regular sector size case doesn't really care about page Private, as
> > > > it just set page->private to a constant value, unlike subpage case which
> > > > stores important value.
> > > >
> > > > The fix will just re-set page Private and needed structures in
> > > > btrfs_dirty_page(), under extent locked so no btrfs_releasepage() is
> > > > able to release it anymore.
> > >
> > > With above fix I see a different issue with below signature.
> > >
> > > [  130.272410] BTRFS warning (device loop2): read-write for sector size 4096 with page size 65536 is experimental
> > > [  130.387470] run fstests generic/095 at 2021-04-16 05:04:09
> > > [  132.042532] BTRFS: device fsid 642daee0-165a-4271-b6f3-728f215c5348 devid 1 transid 5 /dev/loop3 scanned by mkfs.btrfs (5226)
> > > [  132.146892] BTRFS info (device loop3): disk space caching is enabled
> > > [  132.147831] BTRFS info (device loop3): has skinny extents
> > > [  132.148491] BTRFS warning (device loop3): read-write for sector size 4096 with page size 65536 is experimental
> > > [  132.158228] BTRFS info (device loop3): checking UUID tree
> > > [  133.931695] BUG: spinlock bad magic on CPU#4, swapper/4/0
> > > [  133.932874] BUG: Unable to handle kernel data access on write at 0x6b6b6b6b6b6b725b
> >
> > That looks like some poisoned memory.
> >
> > I have run 128 runs of generic/095 locally on my Arm board during the fix,
> > unable to reproduce the crash anymore.
> >
> > And this call site is even harder to get race, as in endio context, the page
> > still has PageWriteback until the last bio finished in the page.
> >
> > This means btrfs_releasepage() will not even try to release the page, while
> > btrfs_invalidatepage() will wait the page to finish its writeback before
> > doing anything.
> >
> > So this is very strange to me.
> >
> > Any reproducibility on your side? Or something specific to Power is related
> > to this case? (IIRC some page flag operation is not atomic, maybe that is
> > related?)
>
> I doubt if this is Power related. And yes, I can reproduce the issue fairly
> easily. For now I will exclude the test from my run to get a overall run with

Here, are some other failures that I noticed during testing on Power.
Thanks for looking into this.

1. tests/btrfs/052
btrfs/052       [failed, exit status 1]- output mismatch (see /home/qemu/work-tools/xfstests/results//btrfs_4k/btrfs/052.out.bad)
    --- tests/btrfs/052.out     2020-08-04 09:59:08.328299552 +0000
    +++ /home/qemu/work-tools/xfstests/results//btrfs_4k/btrfs/052.out.bad      2021-04-16 17:18:17.762928432 +0000
    @@ -91,553 +91,5 @@
     23 05 05 05 05 05 05 05 05 05 05 05 05 05 05 05 05
     *
     30
    -0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    -*
    -2 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02
    -*
    ...
    (Run 'diff -u /home/qemu/work-tools/xfstests/tests/btrfs/052.out /home/qemu/work-tools/xfstests/results//btrfs_4k/btrfs/052.out.bad'  to see the entire diff)

^^^ this could also be due to below error found in 052.full
	ERROR: defrag range ioctl not supported in this kernel version, 2.6.33 and newer is required
	total 1 failures
	failed: '/usr/local/bin/btrfs filesystem defragment /mnt1/scratch/foo'

2. tests/btrfs/076 => looks a genuine failure.
btrfs/076       - output mismatch (see /home/qemu/work-tools/xfstests/results//btrfs_4k/btrfs/076.out.bad)
    --- tests/btrfs/076.out     2020-08-04 09:59:08.338299786 +0000
    +++ /home/qemu/work-tools/xfstests/results//btrfs_4k/btrfs/076.out.bad      2021-04-16 17:19:33.344981383 +0000
    @@ -1,3 +1,3 @@
     QA output created by 076
    -80
    -80
    +1
    +1
    ...
    (Run 'diff -u /home/qemu/work-tools/xfstests/tests/btrfs/076.out /home/qemu/work-tools/xfstests/results//btrfs_4k/btrfs/076.out.bad'  to see the entire diff)

3. tests/btrfs/106  => looks a genuine failure.
btrfs/106       - output mismatch (see /home/qemu/work-tools/xfstests/results//btrfs_4k/btrfs/106.out.bad)
    --- tests/btrfs/106.out     2020-08-04 09:59:08.348300020 +0000
    +++ /home/qemu/work-tools/xfstests/results//btrfs_4k/btrfs/106.out.bad      2021-04-16 17:49:27.296128823 +0000
    @@ -5,19 +5,19 @@
     File contents before unmount:
     0 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
     *
    -40
    +1000
     File contents after remount:
     0 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
    ...
    (Run 'diff -u /home/qemu/work-tools/xfstests/tests/btrfs/106.out /home/qemu/work-tools/xfstests/results//btrfs_4k/btrfs/106.out.bad'  to see the entire diff)

> these patches. Later will try and debug what is going on.
>
> But if you need any debug logs - do let me know, as it is fairly easily
> reproducible.

For tests/generic/095 can you pls retry reproducing the issue (with your latest
patch) on your setup with below configs enabled?
1. CONFIG_PAGE_OWNER, CONFIG_PAGE_POISONING, CONFIG_SLUB_DEBUG_ON,
   CONFIG_SCHED_STACK_END_CHECK, CONFIG_DEBUG_VM, CONFIG_DEBUG_STACKOVERFLOW,
   CONFIG_DEBUG_VM_PGFLAGS, CONFIG_DEBUG_SPINLOCK, CONFIG_PROVE_LOCKING


-ritesh

  reply	other threads:[~2021-04-19  5:59 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-25  7:14 [PATCH v3 00/13] btrfs: support read-write for subpage metadata Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 01/13] btrfs: add sysfs interface for supported sectorsize Qu Wenruo
2021-03-25 14:41   ` Anand Jain
2021-03-29 18:20     ` David Sterba
2021-04-01 22:32       ` Anand Jain
2021-04-01 17:56   ` David Sterba
2021-03-25  7:14 ` [PATCH v3 02/13] btrfs: use min() to replace open-code in btrfs_invalidatepage() Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 03/13] btrfs: remove unnecessary variable shadowing " Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 04/13] btrfs: refactor how we iterate ordered extent " Qu Wenruo
2021-04-02  1:15   ` Anand Jain
2021-04-02  3:33     ` Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 05/13] btrfs: introduce helpers for subpage dirty status Qu Wenruo
2021-04-01 18:11   ` David Sterba
2021-03-25  7:14 ` [PATCH v3 06/13] btrfs: introduce helpers for subpage writeback status Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 07/13] btrfs: allow btree_set_page_dirty() to do more sanity check on subpage metadata Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 08/13] btrfs: support subpage metadata csum calculation at write time Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 09/13] btrfs: make alloc_extent_buffer() check subpage dirty bitmap Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 10/13] btrfs: make the page uptodate assert to be subpage compatible Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 11/13] btrfs: make set/clear_extent_buffer_dirty() " Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 12/13] btrfs: make set_btree_ioerr() accept extent buffer and " Qu Wenruo
2021-03-25  7:14 ` [PATCH v3 13/13] btrfs: add subpage overview comments Qu Wenruo
2021-03-25 12:20 ` [PATCH v3 00/13] btrfs: support read-write for subpage metadata Neal Gompa
2021-03-25 13:16   ` Qu Wenruo
2021-03-28 20:02     ` Ritesh Harjani
2021-03-29  2:01       ` Qu Wenruo
2021-04-02  1:39         ` Anand Jain
2021-04-02  3:26           ` Qu Wenruo
2021-04-02  8:33         ` Ritesh Harjani
2021-04-02  8:36           ` Qu Wenruo
2021-04-02  8:46             ` Ritesh Harjani
2021-04-02  8:52               ` Qu Wenruo
2021-04-12 11:33                 ` Qu Wenruo
2021-04-15  3:44                   ` riteshh
2021-04-15 14:52                     ` riteshh
2021-04-15 23:19                       ` Qu Wenruo
2021-04-15 23:34                         ` Qu Wenruo
2021-04-16  1:34                           ` Qu Wenruo
2021-04-16  5:50                             ` riteshh
2021-04-16  6:14                               ` Qu Wenruo
2021-04-16 16:52                                 ` riteshh
2021-04-19  5:59                                   ` riteshh [this message]
2021-04-19  6:16                                     ` Qu Wenruo
2021-04-19  7:04                                       ` riteshh
2021-04-19  7:19                                       ` Qu Wenruo
2021-04-19 13:24                                         ` Qu Wenruo
2021-04-21  7:03                                           ` riteshh
2021-04-21  7:15                                             ` Qu Wenruo
2021-04-21  7:30                                             ` riteshh
2021-04-21  8:26                                               ` Qu Wenruo
2021-04-21 11:13                                                 ` riteshh
2021-04-21 11:42                                                   ` Qu Wenruo
2021-04-21 12:15                                                     ` riteshh
2021-03-29 18:53 ` David Sterba
2021-04-01  5:36   ` Qu Wenruo
2021-04-01 17:55     ` David Sterba
2021-04-02  1:27     ` Anand Jain
2021-04-03 11:08 ` David Sterba
2021-04-05  6:14   ` Qu Wenruo
2021-04-06  2:31     ` Anand Jain
2021-04-06 19:20       ` David Sterba
2021-04-06 23:59       ` Qu Wenruo
2021-04-06 19:13     ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210419055915.valqohgrgrdy4pvf@riteshh-domain \
    --to=riteshh@linux.ibm.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=ngompa13@gmail.com \
    --cc=quwenruo.btrfs@gmx.com \
    --cc=ritesh.list@gmail.com \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).