From: Su Yue <l@damenly.org>
To: Wang Yugui <wangyugui@e16-tech.com>
Cc: Qu Wenruo <quwenruo.btrfs@gmx.com>, linux-btrfs@vger.kernel.org
Subject: Re: fstests btrfs/042 triggle 'qgroup reserved space leaked'
Date: Sat, 24 Sep 2022 17:02:11 +0800
Message-ID: <v8pd2lch.fsf@damenly.org>
In-Reply-To: <20220924144106.E3BE.409509F4@e16-tech.com>
On Sat 24 Sep 2022 at 14:41, Wang Yugui <wangyugui@e16-tech.com>
wrote:
> Hi,
>
>>
>> On 2022/9/24 12:07, Wang Yugui wrote:
>> > Hi,
>> >
>> >> On 2022/9/24 10:11, Wang Yugui wrote:
>> >>> Hi,
>> >>>
>> >>>>
>> >>>> On 2022/9/24 07:43, Wang Yugui wrote:
>> >>>>> Hi,
>> >>>>>
>> >>>>> fstests btrfs/042 triggers 'qgroup reserved space leaked'
>> >>>>>
>> >>>>> kernel source: btrfs misc-next
>> >>>>
>> >>>> Which commit HEAD?
>> >>>>
>> >>>> As I can not reproduce using a somewhat older misc-next.
>> >>>>
>> >>>> The HEAD I'm on is
>> >>>> 2d1aef6504bf8bdd7b6ca9fa4c0c5ab32f4da2a8 ("btrfs: stop
>> >>>> tracking failed reads in the I/O tree").
>> >>>>
>> >>>> If it's a regression it can be much easier to pin down.
>> >>>>
>> >>>>> kernel config:
>> >>>>> memory debug: CONFIG_KASAN/CONFIG_DEBUG_KMEMLEAK/...
>> >>>>> lock debug: CONFIG_PROVE_LOCKING/...
>> >>>>
>> >>>> And how reproducible is it? 16 runs here, no reproduction.
>> >>>
>> >>> btrfs source version: misc-next: bf940dd88f48,
>> >>> plus some minor local patch(no qgroup related)
>> >>> kernel: 6.0-rc6
>> >>>
>> >>> reproduce rate:
>> >>> 1) 100%(3/3) when local debug config **1
>> >>> 2) 0% (0/3) when local release config
>> >>>
>> >>> **1: local debug config, about 100x slower than release config
>> >>> a) memory debug
>> >>> CONFIG_KASAN/CONFIG_DEBUG_KMEMLEAK/...
>> >>> b) lockdep debug
>> >>> CONFIG_PROVE_LOCKING/...
>> >>> c) btrfs debug
>> >>> CONFIG_BTRFS_FS_CHECK_INTEGRITY=y
>> >>> CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y
>> >>> CONFIG_BTRFS_DEBUG=y
>> >>> CONFIG_BTRFS_ASSERT=y
>> >>> CONFIG_BTRFS_FS_REF_VERIFY=y
>> >>
>> >> I always run with all btrfs features enabled.
>> >>
>> >> Lockdep as well.
>> >>
>> >> KASAN is known to be slow, thus it is only enabled when
>> >> there is suspicion of memory corruption caused by some wild
>> >> pointer.
>> >>
>> >>>
>> >>>
>> >>> From source:
>> >>> fs/btrfs/disk-io.c:4668
>> >>> if (btrfs_check_quota_leak(fs_info)) {
>> >>> L4668 WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
>> >>> btrfs_err(fs_info, "qgroup reserved space leaked");
>> >>> }
>> >>>
>> >>> Does this problem cause fstests btrfs/042 to fail only when
>> >>> CONFIG_BTRFS_DEBUG=y?
>> >>>
>> >>>
>> >>> Possibly related issue:
>> >>> when lockdep debug is enabled, the following issue becomes
>> >>> very easy to reproduce too.
>> >>> https://lore.kernel.org/linux-nfs/3E21DFEA-8DF7-484B-8122-D578BFF7F9E0@oracle.com/
>> >>> So there may be some problem in kernel 6.0 that is related to
>> >>> lockdep debug rather than to btrfs.
>> >>>
>> >>>
>> >>> I will run more tests with the minor local patches removed
>> >>> (none of them are qgroup related), and then report the result.
>> >>
>> >> Better to provide the patches, as I just finished 16 runs
>> >> of btrfs/042 with no reproduction.
>> >>
>> >> Thus I'm starting to suspect the off-tree patches.
>> >
>> > This problem happens on linux 6.0-rc6+ (master a63f2e7cb110,
>> > without the btrfs misc-next patches and without any local
>> > off-tree patch).
>>
>> Same base, still nope.
>>
>> > so this problem is not related to the patches still in btrfs
>> > misc-next.
>> >
>> > reproduce rate:
>> > 100%(3/3) when local debug config
>> > and the whole config file is attached.
>> >
>>
>> I don't think the config makes much difference, as the main
>> difference
>> is in KASAN and KMEMLEAK, which should not impact the test
>> result.
>>
>> And are you running just that test, or with the full auto
>> group?
>
> For 6.0-rc6 with btrfs misc-next, I tried to run full auto
> group.
> btrfs/042 failed, others(btrfs/001 ~ btrfs/157) are OK, and then
> I
> rebooted the test machine.
>
> for 6.0-rc6 without btrfs misc-next, I tested btrfs/042 and
> btrfs/001 on
> the same machine.
>
May I ask if "6.0-rc6 without btrfs misc-next" is 6.0-rc6 *without
any other changes*?
It looks weird because I can't reproduce the issue either.
--
Su
> then I tested 6.0-rc6 without btrfs misc-next on another 2
> servers.
>
> reproduce rate:
> server1 3/3
> server2 2/3
> server3 3/3
> total rate: 8/9
>
> All 3 servers are in good health (ECC memory status and SSD
> status).
>
> Best Regards
> Wang Yugui (wangyugui@e16-tech.com)
> 2022/09/24
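
A minimal userspace sketch of the check quoted from fs/btrfs/disk-io.c earlier in this thread, offered only to illustrate the question of whether btrfs/042 fails solely with CONFIG_BTRFS_DEBUG=y. It models the two effects separately: the WARN_ON() is gated on the debug option, while the btrfs_err() message is unconditional. All names here (demo_report, demo_close_ctree, demo_selfcheck) are hypothetical and are not kernel APIs, and the debug option is passed as a plain parameter instead of a Kconfig symbol.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record of what would show up in the kernel log. */
struct demo_report {
	bool warned;   /* a WARNING splat, as produced by WARN_ON() */
	bool errored;  /* the "qgroup reserved space leaked" error line */
};

/*
 * Hypothetical model of the quoted check, not the kernel function.
 * quota_leaked stands in for btrfs_check_quota_leak(fs_info), and
 * debug_enabled stands in for IS_ENABLED(CONFIG_BTRFS_DEBUG).
 */
struct demo_report demo_close_ctree(bool quota_leaked, bool debug_enabled)
{
	struct demo_report r = { false, false };

	if (quota_leaked) {
		r.warned = debug_enabled; /* WARN_ON(IS_ENABLED(...)) */
		r.errored = true;         /* btrfs_err() is unconditional */
	}
	return r;
}

/* Small self-check covering both configs; call it from main(). */
void demo_selfcheck(void)
{
	/* Debug config: both the warning and the error line appear. */
	assert(demo_close_ctree(true, true).warned);
	assert(demo_close_ctree(true, true).errored);

	/* Release config: the leak is still logged, but without a warning. */
	assert(!demo_close_ctree(true, false).warned);
	assert(demo_close_ctree(true, false).errored);

	/* No leak: nothing is reported in either config. */
	assert(!demo_close_ctree(false, true).errored);
}
```

If the test harness fails a test on a kernel WARNING in dmesg but not on a plain error line (an assumption about the harness, not something stated in this thread), then a release config could hit the same leak and still pass, which would be consistent with the 0/3 rate reported for the release config.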