From: Dave Chinner <david@fromorbit.com>
To: Oliver Sang <oliver.sang@intel.com>
Cc: Guo Xuenan <guoxuenan@huawei.com>,
lkp@lists.01.org, lkp@intel.com, Hou Tao <houtao1@huawei.com>,
linux-xfs@vger.kernel.org
Subject: Re: [xfs] a1df10d42b: xfstests.generic.31*.fail
Date: Mon, 10 Oct 2022 11:07:40 +1100
Message-ID: <20221010000740.GU3600936@dread.disaster.area>
In-Reply-To: <Y0J1oxBFwW53udvJ@xsang-OptiPlex-9020>

On Sun, Oct 09, 2022 at 03:17:55PM +0800, Oliver Sang wrote:
> Hi Dave,
>
> On Thu, Oct 06, 2022 at 08:35:43AM +1100, Dave Chinner wrote:
> > On Wed, Oct 05, 2022 at 09:45:12PM +0800, kernel test robot wrote:
> > >
> > > Greeting,
> > >
> > > FYI, we noticed the following commit (built with gcc-11):
> > >
> > > commit: a1df10d42ba99c946f6a574d4d31951bc0a57e33 ("xfs: fix exception caused by unexpected illegal bestcount in leaf dir")
> > > url: https://github.com/intel-lab-lkp/linux/commits/UPDATE-20220929-162751/Guo-Xuenan/xfs-fix-uaf-when-leaf-dir-bestcount-not-match-with-dir-data-blocks/20220831-195920
> > >
> > > in testcase: xfstests
> > > version: xfstests-x86_64-5a5e419-1_20220927
> > > with following parameters:
> > >
> > > disk: 4HDD
> > > fs: xfs
> > > test: generic-group-15
> > >
> > > test-description: xfstests is a regression test suite for xfs and other filesystems.
> > > test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
> > >
> > >
> > > on test machine: 4 threads 1 sockets Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz (Ivy Bridge) with 8G memory
> > >
> > > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> >
> > The attached dmesg ends at:
> >
> > [...]
> > [ 102.727610][ T315] generic/309 IPMI BMC is not supported on this machine, skip bmc-watchdog setup!
> > [ 102.727630][ T315]
> > [ 103.884498][ T7407] XFS (sda1): EXPERIMENTAL online scrub feature in use. Use at your own risk!
> > [ 103.993962][ T7431] XFS (sda1): Unmounting Filesystem
> > [ 104.193659][ T7580] XFS (sda1): Mounting V5 Filesystem
> > [ 104.221178][ T7580] XFS (sda1): Ending clean mount
> > [ 104.223821][ T7580] xfs filesystem being mounted at /fs/sda1 supports timestamps until 2038 (0x7fffffff)
> > [ 104.285615][ T315] 2s
> > [ 104.285629][ T315]
> > [ 104.339232][ T1469] run fstests generic/310 at 2022-10-01 13:36:36
> > (END)
> >
> > The start of the failed test. Do you have the logs from generic/310
> > so we might have some idea what corruption/shutdown event occurred
> > during that test run?
>
> sorry for that. I attached dmesg for another run.
[ 109.424124][ T1474] run fstests generic/310 at 2022-10-01 10:14:01
[ 169.865043][ T7563] XFS (sda1): Metadata corruption detected at xfs_dir3_leaf_check_int+0x381/0x600 [xfs], xfs_dir3_leafn block 0x4000088
[ 169.865406][ T7563] XFS (sda1): Unmount and run xfs_repair
[ 169.865510][ T7563] XFS (sda1): First 128 bytes of corrupted metadata buffer:
[ 169.865639][ T7563] 00000000: 00 80 00 01 00 00 00 00 3d ff 00 00 00 00 00 00 ........=.......
[ 169.865793][ T7563] 00000010: 00 00 00 00 04 00 00 88 00 00 00 00 00 00 00 00 ................
[ 169.865945][ T7563] 00000020: 27 64 dd b1 81 61 45 2b 86 66 64 67 56 f2 40 58 'd...aE+.fdgV.@X
[ 169.866122][ T7563] 00000030: 00 00 00 00 00 00 00 87 00 fc 00 00 00 00 00 00 ................
[ 169.866293][ T7563] 00000040: 00 00 00 2e 00 00 00 08 00 00 00 31 00 00 00 0c ...........1....
[ 169.866467][ T7563] 00000050: 00 00 00 32 00 00 00 0e 00 00 00 33 00 00 00 10 ...2.......3....
[ 169.866640][ T7563] 00000060: 00 00 00 34 00 00 00 12 00 00 00 35 00 00 00 14 ...4.......5....
[ 169.866816][ T7563] 00000070: 00 00 00 36 00 00 00 16 00 00 00 37 00 00 00 18 ...6.......7....
[ 169.867002][ T7563] XFS (sda1): Corruption of in-memory data (0x8) detected at _xfs_buf_ioapply+0x508/0x600 [xfs] (fs/xfs/xfs_buf.c:1552). Shutting down filesystem.

I don't see any corruption in the leafn header or the first few hash
entries there. It does say it has 0xfc entries in the block, which
is correct for a full leaf of hash pointers. It has no stale
entries, which is correct according to what the test does (it
does not remove directory entries at all). It has a forward pointer
but no backwards pointer, which is expected as the hash values tell
me this should be the left-most leaf block in the tree.

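The reading of the hex dump above can be reproduced with a short sketch. It assumes the v5 leafn header layout of struct xfs_dir3_leaf_hdr as found in fs/xfs/libxfs/xfs_da_format.h (a 56-byte big-endian blkinfo followed by count, stale and pad, then 8-byte hash/address entries). The field names and the format string are an illustration of that layout, not something stated in this thread.

```python
import struct

# First 128 bytes of the buffer, copied from the dmesg hex dump above.
DUMP = """
00 80 00 01 00 00 00 00 3d ff 00 00 00 00 00 00
00 00 00 00 04 00 00 88 00 00 00 00 00 00 00 00
27 64 dd b1 81 61 45 2b 86 66 64 67 56 f2 40 58
00 00 00 00 00 00 00 87 00 fc 00 00 00 00 00 00
00 00 00 2e 00 00 00 08 00 00 00 31 00 00 00 0c
00 00 00 32 00 00 00 0e 00 00 00 33 00 00 00 10
00 00 00 34 00 00 00 12 00 00 00 35 00 00 00 14
00 00 00 36 00 00 00 16 00 00 00 37 00 00 00 18
"""
raw = bytes.fromhex("".join(DUMP.split()))

# Assumed layout: forw, back, magic, pad, crc, blkno, lsn, uuid, owner,
# then count, stale, pad. All fields are big-endian on disk.
HDR = struct.Struct(">IIHHIQQ16sQHHI")
# Assumed leaf entry layout: hashval, address.
ENT = struct.Struct(">II")

(forw, back, magic, _pad, crc, blkno, lsn, uuid,
 owner, count, stale, _pad2) = HDR.unpack_from(raw, 0)

# Only the entries that fit in the 128 dumped bytes are available here.
entries = [ENT.unpack_from(raw, off)
           for off in range(HDR.size, len(raw), ENT.size)]

print(f"magic={magic:#x} forw={forw:#x} back={back:#x} blkno={blkno:#x}")
print(f"count={count:#x} stale={stale}")
for hashval, address in entries:
    print(f"hash={hashval:#x} address={address:#x}")
```

Under that assumed layout the decoded values line up with the description above: a leafn magic, 0xfc entries, no stale entries, a forward pointer with no back pointer, and hash values in ascending order.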
The error has been detected at write time, which means the problem
was detected before it got written to disk. But I don't see what
code in xfs_dir3_leaf_check_int() is even triggering a warning on a
leafn block here - what line of code does
xfs_dir3_leaf_check_int+0x381/0x600 actually resolve to?
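
One way to answer that kind of question is to feed the symbol+offset token to the kernel tree's scripts/faddr2line, run against the exact build that produced the log. The sketch below only extracts the token from a dmesg line and prints a candidate command. The module object path fs/xfs/xfs.ko and the helper names are assumptions made for illustration.

```python
import re

# Matches "symbol+0xOFFSET/0xSIZE", optionally followed by " [module]".
SYM_RE = re.compile(r"(\w+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)(?: \[(\w+)\])?")

# Hypothetical mapping from module name to its object file in the build tree.
MODULE_OBJECTS = {"xfs": "fs/xfs/xfs.ko"}


def symbol_refs(line):
    """Return every symbol+offset/size reference found in a log line."""
    return [
        {
            "symbol": m.group(1),
            "offset": int(m.group(2), 16),
            "size": int(m.group(3), 16),
            "module": m.group(4),
        }
        for m in SYM_RE.finditer(line)
    ]


def faddr2line_command(ref):
    """Build a candidate scripts/faddr2line invocation for one reference."""
    obj = MODULE_OBJECTS.get(ref["module"], "vmlinux")
    return (f"./scripts/faddr2line {obj} "
            f"{ref['symbol']}+{ref['offset']:#x}/{ref['size']:#x}")


LINE = ("[ 169.865043][ T7563] XFS (sda1): Metadata corruption detected at "
        "xfs_dir3_leaf_check_int+0x381/0x600 [xfs], xfs_dir3_leafn block 0x4000088")

refs = symbol_refs(LINE)
for ref in refs:
    print(faddr2line_command(ref))
```

The result is only meaningful when the object file comes from the same source and compiler as the kernel under test, which is exactly what is in doubt in this report.
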

.....

<nnngggghhh>

No wonder I can't reproduce this locally.

commit a1df10d42ba99c946f6a574d4d31951bc0a57e33 *does not exist in
the upstream xfs-dev tree*. The URL provided pointing to the commit
above resolves to a "404 page not found" error, so I have no idea
what code was even being tested here.

AFAICT, the patch being tested is this one (based on the github url
matching the patch title):

https://lore.kernel.org/linux-xfs/20220831121639.3060527-1-guoxuenan@huawei.com/

Which I NACKed almost a whole month ago! The latest revision of the
patch was posted 2 days ago here:

https://lore.kernel.org/linux-xfs/20221008033624.1237390-1-guoxuenan@huawei.com/

Intel kernel robot maintainers: I've just wasted the best part of 2
hours trying to reproduce and track down a corruption bug that this
report led me to believe was in the upstream XFS tree.

You need to make it very clear that your bug report is for a commit
that *hasn't been merged into an upstream tree*. The CI robot
noticed a bug in an *old* NACKed patch, not a bug in a new upstream
commit. Please make it *VERY CLEAR* where the code the CI robot is
testing has come from.

Not happy.

--
Dave Chinner
david@fromorbit.com