All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Zorro Lang <zlang@kernel.org>,
	linux-ext4@vger.kernel.org, fstests@vger.kernel.org,
	regressions@lists.linux.dev,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails
Date: Tue, 5 Sep 2023 23:11:02 +0100	[thread overview]
Message-ID: <ZPendrb8gSbAC6fM@casper.infradead.org> (raw)
In-Reply-To: <20230904060819.GB701295@mit.edu>

On Mon, Sep 04, 2023 at 02:08:19AM -0400, Theodore Ts'o wrote:
> #regzbot introduced: 8147c4c4546f9f05ef03bb839b741473b28bb560 ^
> 
> OK, I've isolated the regression of generic/455 failing with ext4/1k
> to this commit, which came in via the mm tree.  Nothing seems
> *obviously* wrong, but I'm not sure if there are any differences in
> the semantics of the new folio functions such as kmap_local_folio,
> offset_in_folio, set_folio_bh() which might be making a difference.

Thanks for the cc,  Let's see what we can do ...

virt_to_folio() - For an order-0 page, there is no difference.
offset_in_folio() - Ditto
bh->b_page vs bh->b_folio - Ditto
virt_to_folio() - Ditto
folio_set_bh() - Ditto

kmap_local_folio() vs kmap_atomic - Here, we have a difference.
memcpy_from_folio() - Same difference as above.

I suppose it must be this, and yet I cannot understand how it would
make a difference.  Perhaps you can help me?

static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot)
{
        if (IS_ENABLED(CONFIG_PREEMPT_RT))
                migrate_disable();
        else
                preempt_disable();

        pagefault_disable();
        return __kmap_local_page_prot(page, prot);
}

vs

static inline void *kmap_local_folio(struct folio *folio, size_t offset)
{
        struct page *page = folio_page(folio, offset / PAGE_SIZE);
        return __kmap_local_page_prot(page, kmap_prot) + offset % PAGE_SIZE;
}

I don't believe that returning the address with the offset included
is the problem here.  It must be disabling preemption / migration.
There's no chace this funcation accesses userspace (... is there?) so
it can't be the pagefault_disable().

We can try splitting this up into tiny commits and figuring out which
of them is the problem.  I'll be back at work tomorrow and can look
more deeply then.

> Using kvm-xfstests[1] I bisected this via the command:
> 
> % install-kconfig ; kbuild ; kvm-xfstests -c ext4/1k -C 10 generic/455
> 
> [1] https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-quickstart.md
> 
> 
> And the bisection pointed me at this commit:
> 
>     commit 8147c4c4546f9f05ef03bb839b741473b28bb560 (refs/bisect/bad)
>     Author: Matthew Wilcox (Oracle) <willy@infradead.org>
>     AuthorDate: Thu Jul 13 04:55:11 2023 +0100
>     Commit: Andrew Morton <akpm@linux-foundation.org>
>     CommitDate: Fri Aug 18 10:12:30 2023 -0700
> 
>         jbd2: use a folio in jbd2_journal_write_metadata_buffer()
>     
> During the bisection, I treated a commit with 3+ failures as "bad",
> and 0-2 commits as "good".  Running generic/455 50 times to get a
> sense of the failure, with the first bad commit (8147c4c4546f), I got:
> 
>     ext4/1k: 50 tests, 21 failures, 223 seconds
>       Flaky: generic/455: 42% (21/50)
>     Totals: 50 tests, 0 skipped, 21 failures, 0 errors, 223s
> 
> While with the immediately preceding commit (07811230c3cd), I got:
> 
>     ext4/1k: 50 tests, 4 failures, 235 seconds
>       Flaky: generic/455:  8% (4/50)
>     Totals: 50 tests, 0 skipped, 4 failures, 0 errors, 235s
> 
> 
> 
> Comparing these two commits (8147c4c4546f vs 07811230c3cd) using the
> ext4 with a 4k block size, I get:
> 
>     ext4/4k: 50 tests, 2 failures, 365 seconds
>       Flaky: generic/455:  4% (2/50)
>     Totals: 50 tests, 0 skipped, 2 failures, 0 errors, 365s
> 
> vs
> 
>     ext4/4k: 50 tests, 2 failures, 349 seconds
>       Flaky: generic/455:  4% (2/50)
>     Totals: 50 tests, 0 skipped, 2 failures, 0 errors, 349s
> 
> So issue seems to be specifically with a sub-page size block size,
> since ext4/4k doesn't show any issues, while ext4/1k does.

I doubt I tried it with a 1kB block size, so I'll focus on that too.

  reply	other threads:[~2023-09-05 22:11 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-03 12:00 [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails Zorro Lang
2023-09-03 20:40 ` Theodore Ts'o
2023-09-04  6:08   ` Theodore Ts'o
2023-09-05 22:11     ` Matthew Wilcox [this message]
2023-09-06 11:03       ` Ritesh Harjani
2023-09-06 12:38         ` Matthew Wilcox
2023-09-06 19:51           ` Matthew Wilcox
2023-09-07  2:56             ` Ritesh Harjani
2023-09-07  3:47               ` Matthew Wilcox
2023-09-07 13:35                 ` Ritesh Harjani
2023-09-07 14:15                   ` Matthew Wilcox
2023-09-07 14:59                     ` Ritesh Harjani
2023-09-10  9:26                       ` Linux regression tracking (Thorsten Leemhuis)
2023-09-11  3:43                         ` Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZPendrb8gSbAC6fM@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=fstests@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=regressions@lists.linux.dev \
    --cc=tytso@mit.edu \
    --cc=zlang@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.