* Corruption of in-memory data (0x8) detected at xfs_defer_finish_noroll on kernel 6.3
@ 2023-05-02 19:14 Mike Pastore
2023-05-02 22:02 ` Dave Chinner
2023-05-25 2:15 ` Eric Sandeen
0 siblings, 2 replies; 7+ messages in thread
From: Mike Pastore @ 2023-05-02 19:14 UTC (permalink / raw)
To: linux-xfs
Hi folks,
I was playing around with some blockchain projects yesterday and had
some curious crashes while syncing blockchain databases on XFS
filesystems under kernel 6.3.
* kernel 6.3.0 and 6.3.1 (ubuntu mainline)
* w/ and w/o the discard mount flag
* w/ and w/o -m crc=0
* ironfish (nodejs) and ergo (jvm)
The hardware is as follows:
* Asus PRIME H670-PLUS D4
* Intel Core i5-12400
* 32GB DDR4-3200 Non-ECC UDIMM
In all cases the filesystems were newly-created under kernel 6.3 on an
LVM2 stripe and mounted with the noatime flag. Here is the output of
the mkfs.xfs command (after reverting back to 6.2.14—which I realize
may not be the most helpful thing, but here it is anyway):
$ sudo lvremove -f vgtethys/ironfish
$ sudo lvcreate -n ironfish -L 10G -i2 vgtethys /dev/nvme[12]n1p3
Using default stripesize 64.00 KiB.
Logical volume "ironfish" created.
$ sudo mkfs.xfs -m crc=0 -m uuid=b4725d43-a12d-42df-981a-346af2809fad
-s size=4096 /dev/vgtethys/ironfish
meta-data=/dev/vgtethys/ironfish isize=256 agcount=16, agsize=163824 blks
= sectsz=4096 attr=2, projid32bit=1
= crc=0 finobt=0, sparse=0, rmapbt=0
= reflink=0 bigtime=0 inobtcount=0
data = bsize=4096 blocks=2621184, imaxpct=25
= sunit=16 swidth=32 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=2560, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Discarding blocks...Done.
The applications crash with I/O errors. Here's what I see in dmesg:
May 01 18:56:59 tethys kernel: XFS (dm-28): Internal error bno + len >
gtbno at line 1908 of file fs/xfs/libxfs/xfs_alloc.c. Caller
xfs_free_ag_extent+0x14e/0x950 [xfs]
May 01 18:56:59 tethys kernel: CPU: 2 PID: 48657 Comm: node Tainted: P
OE 6.3.1-060301-generic #202304302031
May 01 18:56:59 tethys kernel: Hardware name: ASUS System Product
Name/PRIME H670-PLUS D4, BIOS 2014 10/14/2022
May 01 18:56:59 tethys kernel: Call Trace:
May 01 18:56:59 tethys kernel: <TASK>
May 01 18:56:59 tethys kernel: dump_stack_lvl+0x48/0x70
May 01 18:56:59 tethys kernel: dump_stack+0x10/0x20
May 01 18:56:59 tethys kernel: xfs_corruption_error+0x9e/0xb0 [xfs]
May 01 18:56:59 tethys kernel: ? xfs_free_ag_extent+0x14e/0x950 [xfs]
May 01 18:56:59 tethys kernel: xfs_free_ag_extent+0x17c/0x950 [xfs]
May 01 18:56:59 tethys kernel: ? xfs_free_ag_extent+0x14e/0x950 [xfs]
May 01 18:56:59 tethys kernel: __xfs_free_extent+0xee/0x1e0 [xfs]
May 01 18:56:59 tethys kernel: xfs_trans_free_extent+0xad/0x1a0 [xfs]
May 01 18:56:59 tethys kernel: xfs_extent_free_finish_item+0x14/0x40 [xfs]
May 01 18:56:59 tethys kernel: xfs_defer_finish_one+0xd9/0x280 [xfs]
May 01 18:56:59 tethys kernel: xfs_defer_finish_noroll+0xab/0x280 [xfs]
May 01 18:56:59 tethys kernel: xfs_defer_finish+0x16/0x80 [xfs]
May 01 18:56:59 tethys kernel: xfs_itruncate_extents_flags+0xe3/0x270 [xfs]
May 01 18:56:59 tethys kernel: xfs_free_eofblocks+0xe3/0x130 [xfs]
May 01 18:56:59 tethys kernel: xfs_release+0x153/0x190 [xfs]
May 01 18:56:59 tethys kernel: xfs_file_release+0x15/0x20 [xfs]
May 01 18:56:59 tethys kernel: __fput+0x95/0x270
May 01 18:56:59 tethys kernel: ____fput+0xe/0x20
May 01 18:56:59 tethys kernel: task_work_run+0x5e/0xa0
May 01 18:56:59 tethys kernel: exit_to_user_mode_loop+0x136/0x160
May 01 18:56:59 tethys kernel: exit_to_user_mode_prepare+0xff/0x110
May 01 18:56:59 tethys kernel: syscall_exit_to_user_mode+0x1b/0x50
May 01 18:56:59 tethys kernel: do_syscall_64+0x67/0x90
May 01 18:56:59 tethys kernel: ? syscall_exit_to_user_mode+0x44/0x50
May 01 18:56:59 tethys kernel: ? do_syscall_64+0x67/0x90
May 01 18:56:59 tethys kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc
May 01 18:56:59 tethys kernel: RIP: 0033:0x7f8fce72c6a7
May 01 18:56:59 tethys kernel: Code: 44 00 00 48 8b 15 e9 d7 0d 00 f7
d8 64 89 02 b8 ff ff ff ff eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f
00 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 >
May 01 18:56:59 tethys kernel: RSP: 002b:00007f8fb2a67a78 EFLAGS:
00000202 ORIG_RAX: 0000000000000003
May 01 18:56:59 tethys kernel: RAX: 0000000000000000 RBX:
00007f8f98019420 RCX: 00007f8fce72c6a7
May 01 18:56:59 tethys kernel: RDX: 00007f8fce806880 RSI:
00007f8f982a9b40 RDI: 000000000000004c
May 01 18:56:59 tethys kernel: RBP: 0000000000000000 R08:
0000000000000000 R09: 00007f8fc02c5520
May 01 18:56:59 tethys kernel: R10: 0000000000000064 R11:
0000000000000202 R12: 00007f8fce807480
May 01 18:56:59 tethys kernel: R13: 0000000000006be1 R14:
0000000000000019 R15: 00007f8f980a8b50
May 01 18:56:59 tethys kernel: </TASK>
May 01 18:56:59 tethys kernel: XFS (dm-28): Corruption detected.
Unmount and run xfs_repair
May 01 18:56:59 tethys kernel: XFS (dm-28): Corruption of in-memory
data (0x8) detected at xfs_defer_finish_noroll+0x130/0x280 [xfs]
(fs/xfs/libxfs/xfs_defer.c:573). Shutting down filesystem.
May 01 18:56:59 tethys kernel: XFS (dm-28): Please unmount the
filesystem and rectify the problem(s)
And here's what I see in dmesg after rebooting and attempting to mount
the filesystem to replay the log:
May 01 21:34:15 tethys kernel: XFS (dm-35): Metadata corruption
detected at xfs_inode_buf_verify+0x168/0x190 [xfs], xfs_inode block
0x1405a0 xfs_inode_buf_verify
May 01 21:34:15 tethys kernel: XFS (dm-35): Unmount and run xfs_repair
May 01 21:34:15 tethys kernel: XFS (dm-35): First 128 bytes of
corrupted metadata buffer:
May 01 21:34:15 tethys kernel: 00000000: 5b 40 e2 3a ae 52 a0 7a 17 1d 5a f6 f0 de 4c 62  [@.:.R.z..Z...Lb
May 01 21:34:15 tethys kernel: 00000010: d6 31 8b 51 ca 6e ad a2 7e f5 18 65 6e 8a 41 3f  .1.Q.n..~..en.A?
May 01 21:34:15 tethys kernel: 00000020: 68 b5 02 16 2c 84 5d 33 ac 46 fc c9 da 93 af 3f  h...,.]3.F.....?
May 01 21:34:15 tethys kernel: 00000030: a0 3e b7 9c b4 99 5a 45 8c 2f 13 ed bb 07 57 e1  .>....ZE./....W.
May 01 21:34:15 tethys kernel: 00000040: bc 96 aa d7 00 2a 81 65 e6 3b 86 9d b5 0a 63 bd  .....*.e.;....c.
May 01 21:34:15 tethys kernel: 00000050: 38 e5 63 1a 09 42 36 4c b8 e8 7c 92 73 01 04 da  8.c..B6L..|.s...
May 01 21:34:15 tethys kernel: 00000060: 27 df 43 92 b1 ad ba ec 7a 02 3f 8e 84 3a bb cc  '.C.....z.?..:..
May 01 21:34:15 tethys kernel: 00000070: 39 06 74 d1 8b 04 b7 f2 62 c1 c4 f0 3c 5c 54 4f  9.t.....b...<\TO
May 01 21:34:15 tethys kernel: XFS (dm-35): metadata I/O error in
"xlog_recover_items_pass2+0x56/0xf0 [xfs]" at daddr 0x1405a0 len 32
error 117
May 01 21:34:15 tethys kernel: XFS (dm-35): log mount/recovery failed:
error -117
May 01 21:34:15 tethys kernel: XFS (dm-35): log mount failed
Blockchain projects tend to generate pathological filesystem loads;
the sustained random write activity and constant (re)allocations must
be pushing on some soft spot here. Reverting to kernel 6.2.14 and
recreating the filesystems seems to have resolved the issue—so far, at
least—but obviously this is less than ideal. If someone would be
willing to provide a targeted list of desired artifacts I'd be happy
to boot back into kernel 6.3.1 to reproduce the issue and collect
them. Alternatively I can try to eliminate some variables (like LVM2,
potential hardware instabilities, etc.) and provide step-by-step
directions for reproducing the issue on another machine.
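For example, if a metadata dump and a read-only repair pass are the kind
of artifacts that would help, I could grab something along these lines
from the broken filesystem (this is just my guess at what's useful, and
the output file names are placeholders):

$ sudo xfs_metadump /dev/vgtethys/ironfish ironfish.metadump
$ sudo xfs_repair -n /dev/vgtethys/ironfish > repair-dry-run.txt 2>&1
$ sudo dmesg > dmesg-after-shutdown.txt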
Thank you,
Mike
* Re: Corruption of in-memory data (0x8) detected at xfs_defer_finish_noroll on kernel 6.3
2023-05-02 19:14 Corruption of in-memory data (0x8) detected at xfs_defer_finish_noroll on kernel 6.3 Mike Pastore
@ 2023-05-02 22:02 ` Dave Chinner
[not found] ` <CAP_NaWZEcv3B0nPEFguxVuQ8m93mO7te-bZDfwo-C8eN+f_KNA@mail.gmail.com>
2023-05-25 2:15 ` Eric Sandeen
1 sibling, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2023-05-02 22:02 UTC (permalink / raw)
To: Mike Pastore; +Cc: linux-xfs
On Tue, May 02, 2023 at 02:14:34PM -0500, Mike Pastore wrote:
> Hi folks,
>
> I was playing around with some blockchain projects yesterday and had
> some curious crashes while syncing blockchain databases on XFS
> filesystems under kernel 6.3.
>
> * kernel 6.3.0 and 6.3.1 (ubuntu mainline)
> * w/ and w/o the discard mount flag
> * w/ and w/o -m crc=0
> * ironfish (nodejs) and ergo (jvm)
>
> The hardware is as follows:
>
> * Asus PRIME H670-PLUS D4
> * Intel Core i5-12400
> * 32GB DDR4-3200 Non-ECC UDIMM
>
> In all cases the filesystems were newly-created under kernel 6.3 on an
> LVM2 stripe and mounted with the noatime flag. Here is the output of
> the mkfs.xfs command (after reverting back to 6.2.14—which I realize
> may not be the most helpful thing, but here it is anyway):
>
> $ sudo lvremove -f vgtethys/ironfish
> $ sudo lvcreate -n ironfish -L 10G -i2 vgtethys /dev/nvme[12]n1p3
> Using default stripesize 64.00 KiB.
> Logical volume "ironfish" created.
> $ sudo mkfs.xfs -m crc=0 -m uuid=b4725d43-a12d-42df-981a-346af2809fad
> -s size=4096 /dev/vgtethys/ironfish
> meta-data=/dev/vgtethys/ironfish isize=256 agcount=16, agsize=163824 blks
> = sectsz=4096 attr=2, projid32bit=1
> = crc=0 finobt=0, sparse=0, rmapbt=0
> = reflink=0 bigtime=0 inobtcount=0
> data = bsize=4096 blocks=2621184, imaxpct=25
> = sunit=16 swidth=32 blks
Stripe aligned allocation is enabled. Does the problem go away
when you use mkfs.xfs -d noalign .... ?
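i.e. something along these lines, keeping the rest of your original
mkfs command line the same (sketch only, adjust to taste):

$ sudo mkfs.xfs -d noalign -s size=4096 /dev/vgtethys/ironfish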
> The applications crash with I/O errors. Here's what I see in dmesg:
>
> May 01 18:56:59 tethys kernel: XFS (dm-28): Internal error bno + len >
> gtbno at line 1908 of file fs/xfs/libxfs/xfs_alloc.c. Caller
> xfs_free_ag_extent+0x14e/0x950 [xfs]
/*
* If this failure happens the request to free this
* space was invalid, it's (partly) already free.
* Very bad.
*/
if (XFS_IS_CORRUPT(mp, ltbno + ltlen > bno)) {
	error = -EFSCORRUPTED;
	goto error0;
}
That failure implies the btree records are corrupt in memory,
possibly due to memory corruption from something outside the XFS
code (e.g. use after free).
> May 01 18:56:59 tethys kernel: CPU: 2 PID: 48657 Comm: node Tainted: P
> OE 6.3.1-060301-generic #202304302031
The kernel being run has been tainted by out of tree proprietary
drivers (a common source of memory corruption bugs in my
experience). Can you reproduce this problem with an untainted
kernel?
....
> And here's what I see in dmesg after rebooting and attempting to mount
> the filesystem to replay the log:
>
> May 01 21:34:15 tethys kernel: XFS (dm-35): Metadata corruption
> detected at xfs_inode_buf_verify+0x168/0x190 [xfs], xfs_inode block
> 0x1405a0 xfs_inode_buf_verify
> May 01 21:34:15 tethys kernel: XFS (dm-35): Unmount and run xfs_repair
> May 01 21:34:15 tethys kernel: XFS (dm-35): First 128 bytes of
> corrupted metadata buffer:
> May 01 21:34:15 tethys kernel: 00000000: 5b 40 e2 3a ae 52 a0 7a 17 1d
That's not an inode buffer. It's not recognisable as XFS metadata at
all, which indicates some other problem.
Oh, this was from a test with "mkfs.xfs -m crc=0 ...", right? Please
don't use "-m crc=0" - that format is deprecated partly because it
has unfixable on-disk format recovery issues. One of those issues
manifests as an inode recovery failure because the underlying inode
buffer allocation/init does not get replayed correctly before we
attempt to replay inode changes into the buffer (that has not been
initialised)....
i.e. one of those unfixable issues manifests exactly like the
recovery failure being reported here.
> Blockchain projects tend to generate pathological filesystem loads;
> the sustained random write activity and constant (re)allocations must
> be pushing on some soft spot here.
There was a significant allocator infrastructure rewrite in 6.3. If
running an untainted kernel on an unaligned, CRC enabled filesystem
makes the problems go away, then it rules out known issues with the
rewrite.
Alternatively, if it is reproducible in a short time, you may be
able to bisect the XFS changes that landed between 6.2 and 6.3 to
find which change triggers the problem.
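Roughly (only a sketch; the kernel build/install steps and running your
ironfish/ergo sync workload at each step are left out):

$ git bisect start v6.3 v6.2 -- fs/xfs
  <build, boot, run the workload>
$ git bisect bad     # if the filesystem shuts down with the corruption
$ git bisect good    # if the workload completes cleanly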
> Reverting to kernel 6.2.14 and
> recreating the filesystems seems to have resolved the issue—so far, at
> least—but obviously this is less than ideal. If someone would be
> willing to provide a targeted list of desired artifacts I'd be happy
> to boot back into kernel 6.3.1 to reproduce the issue and collect
> them. Alternatively I can try to eliminate some variables (like LVM2,
> potential hardware instabilities, etc.) and provide step-by-step
> directions for reproducing the issue on another machine.
If you can find a minimal reproducer, that would help a lot in
diagnosing the issue.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Corruption of in-memory data (0x8) detected at xfs_defer_finish_noroll on kernel 6.3
[not found] ` <CAP_NaWZEcv3B0nPEFguxVuQ8m93mO7te-bZDfwo-C8eN+f_KNA@mail.gmail.com>
@ 2023-05-02 23:13 ` Dave Chinner
2023-05-23 21:32 ` Justin Forbes
0 siblings, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2023-05-02 23:13 UTC (permalink / raw)
To: Mike Pastore; +Cc: linux-xfs
On Tue, May 02, 2023 at 05:13:09PM -0500, Mike Pastore wrote:
> On Tue, May 2, 2023, 5:03 PM Dave Chinner <david@fromorbit.com> wrote:
>
> >
> > If you can find a minimal reproducer, that would help a lot in
> > diagnosing the issue.
> >
>
> This is great, thank you. I'll get to work.
>
> One note: the problem occurred with and without crc=0, so we can rule that
> out at least.
Yes, I noticed that. My point was more that we have much more
confidence in crc=1 filesystems because they have much more robust
verification of the on-disk format and won't fail log recovery in
the way you noticed. The verification with crc=1 configured
filesystems is also known to catch issues caused by
memory corruption more frequently, often preventing such occurrences
from corrupting the on-disk filesystem.
Hence if you are seeing corruption events, you really want to be
using "-m crc=1" (default config) filesystems...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Corruption of in-memory data (0x8) detected at xfs_defer_finish_noroll on kernel 6.3
2023-05-02 23:13 ` Dave Chinner
@ 2023-05-23 21:32 ` Justin Forbes
2023-05-24 5:42 ` Dave Chinner
2023-05-25 2:24 ` Eric Sandeen
0 siblings, 2 replies; 7+ messages in thread
From: Justin Forbes @ 2023-05-23 21:32 UTC (permalink / raw)
To: Dave Chinner; +Cc: Mike Pastore, linux-xfs
On Wed, May 03, 2023 at 09:13:18AM +1000, Dave Chinner wrote:
> On Tue, May 02, 2023 at 05:13:09PM -0500, Mike Pastore wrote:
> > On Tue, May 2, 2023, 5:03 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > >
> > > If you can find a minimal reproducer, that would help a lot in
> > > diagnosing the issue.
> > >
> >
> > This is great, thank you. I'll get to work.
> >
> > One note: the problem occurred with and without crc=0, so we can rule that
> > out at least.
>
> Yes, I noticed that. My point was more that we have much more
> confidence in crc=1 filesystems because they have much more robust
> verification of the on-disk format and won't fail log recovery in
> the way you noticed. The verification with crc=1 configured
> filesystems is also known to catch issues caused by
> memory corruption more frequently, often preventing such occurrences
> from corrupting the on-disk filesystem.
>
> Hence if you are seeing corruption events, you really want to be
> using "-m crc=1" (default config) filesystems...
Upon trying to roll out 6.3.3 to Fedora users, it seems that we have a
few hitting this reliably with 6.3 kernels. It is certainly not all
users of XFS though, as I use it extensively and haven't run across it.
The most responsive users who can reproduce all seem to be running on
xfs filesystems that were created a few years ago, and some even can't
reproduce it on their newer systems. Either way, it is a widespread
enough problem that I can't roll out 6.3 kernels to stable releases
until it is fixed.
https://bugzilla.redhat.com/show_bug.cgi?id=2208553
Justin
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
* Re: Corruption of in-memory data (0x8) detected at xfs_defer_finish_noroll on kernel 6.3
2023-05-23 21:32 ` Justin Forbes
@ 2023-05-24 5:42 ` Dave Chinner
2023-05-25 2:24 ` Eric Sandeen
1 sibling, 0 replies; 7+ messages in thread
From: Dave Chinner @ 2023-05-24 5:42 UTC (permalink / raw)
To: Justin Forbes; +Cc: Mike Pastore, linux-xfs
On Tue, May 23, 2023 at 04:32:11PM -0500, Justin Forbes wrote:
> On Wed, May 03, 2023 at 09:13:18AM +1000, Dave Chinner wrote:
> > On Tue, May 02, 2023 at 05:13:09PM -0500, Mike Pastore wrote:
> > > On Tue, May 2, 2023, 5:03 PM Dave Chinner <david@fromorbit.com> wrote:
> > >
> > > >
> > > > If you can find a minimal reproducer, that would help a lot in
> > > > diagnosing the issue.
> > > >
> > >
> > > This is great, thank you. I'll get to work.
> > >
> > > One note: the problem occurred with and without crc=0, so we can rule that
> > > out at least.
> >
> > Yes, I noticed that. My point was more that we have much more
> > confidence in crc=1 filesystems because they have much more robust
> > verification of the on-disk format and won't fail log recovery in
> > the way you noticed. The verification with crc=1 configured
> > filesystems is also known to catch issues caused by
> > memory corruption more frequently, often preventing such occurrences
> > from corrupting the on-disk filesystem.
> >
> > Hence if you are seeing corruption events, you really want to be
> > using "-m crc=1" (default config) filesystems...
>
> Upon trying to roll out 6.3.3 to Fedora users, it seems that we have a
> few hitting this reliably with 6.3 kernels. It is certainly not all
> users of XFS though, as I use it extensively and haven't run across it.
Has anyone who is hitting this bisected the failure to a commit
between 6.2 and 6.3? Has anyone who is hitting it tried a 6.4-rc3
kernel to see if the problem is already fixed?
> The most responsive users who can reproduce all seem to be running on
> xfs filesystems that were created a few years ago, and some even can't
> reproduce it on their newer systems. Either way, it is a widespread
> enough problem that I can't roll out 6.3 kernels to stable releases
> until it is fixed.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=2208553
I only see one person reporting the issue in that bug, but you
implied that it is a widespread and easily reproducible issue. Where
can I find all the other bug reports so I can look through them for
hints as to what might be causing this?
Right now I only have two individual reports of the issue - the OP
and the user that reported the above bug. Both show a shutdown that
occurred due to a metadata corruption being detected when reading
metadata, followed by a shutdown in recovery caused by reading an
inode buffer that doesn't actually contain inodes.
Both reports are from filesystems on LVM, both likely have stripe
units defined. The Fedora case is on RAID5+LVM, no idea what the OP
was using. Neither report gives us a workload description that we
can use to attempt to reproduce this.
Given that it's not widespread (i.e. only a small number of users
are seeing this issue) and we have very few details to go on, we
can't even be certain that the corruption is a result of an XFS
issue - it may be a problem in the layers below XFS (lvm, md raid,
drivers, etc) and XFS is simply the first thing to trip over it...
We really need more information to make any progress here. Can you
ask everyone who has reported the issue to you to supply us with
their full hardware config (CPU, memory, storage devices,
hardware RAID cache settings, storage configuration, lvm/crypt/md
setup, filesystem configuration (xfs_info), mount options, etc) as
well as what they are doing on their machines and what workloads are
running in the background when the problem manifests.
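i.e. the sort of thing that could be collected with commands along
these lines (only a starting point - adjust mount points and VG names
to whatever the reporter is actually using):

$ xfs_info /path/to/mountpoint
$ grep xfs /proc/mounts
$ lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT
$ sudo lvs -a -o +stripes,stripe_size,devices
$ sudo dmidecode -t memory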
We need to work out how to reproduce this issue so we can triage it,
but right now we have nothing we can actually work with....
-Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Corruption of in-memory data (0x8) detected at xfs_defer_finish_noroll on kernel 6.3
2023-05-02 19:14 Corruption of in-memory data (0x8) detected at xfs_defer_finish_noroll on kernel 6.3 Mike Pastore
2023-05-02 22:02 ` Dave Chinner
@ 2023-05-25 2:15 ` Eric Sandeen
1 sibling, 0 replies; 7+ messages in thread
From: Eric Sandeen @ 2023-05-25 2:15 UTC (permalink / raw)
To: Mike Pastore, linux-xfs
On 5/2/23 2:14 PM, Mike Pastore wrote:
> Hi folks,
>
> I was playing around with some blockchain projects yesterday and had
> some curious crashes while syncing blockchain databases on XFS
> filesystems under kernel 6.3.
>
> * kernel 6.3.0 and 6.3.1 (ubuntu mainline)
> * w/ and w/o the discard mount flag
> * w/ and w/o -m crc=0
> * ironfish (nodejs) and ergo (jvm)
>
> The hardware is as follows:
>
> * Asus PRIME H670-PLUS D4
> * Intel Core i5-12400
> * 32GB DDR4-3200 Non-ECC UDIMM
>
> In all cases the filesystems were newly-created under kernel 6.3 on an
> LVM2 stripe and mounted with the noatime flag. Here is the output of
> the mkfs.xfs command (after reverting back to 6.2.14—which I realize
> may not be the most helpful thing, but here it is anyway):
>
> $ sudo lvremove -f vgtethys/ironfish
> $ sudo lvcreate -n ironfish -L 10G -i2 vgtethys /dev/nvme[12]n1p3
> Using default stripesize 64.00 KiB.
> Logical volume "ironfish" created.
> $ sudo mkfs.xfs -m crc=0 -m uuid=b4725d43-a12d-42df-981a-346af2809fad
> -s size=4096 /dev/vgtethys/ironfish
> meta-data=/dev/vgtethys/ironfish isize=256 agcount=16, agsize=163824 blks
> = sectsz=4096 attr=2, projid32bit=1
> = crc=0 finobt=0, sparse=0, rmapbt=0
> = reflink=0 bigtime=0 inobtcount=0
> data = bsize=4096 blocks=2621184, imaxpct=25
> = sunit=16 swidth=32 blks
> naming =version 2 bsize=4096 ascii-ci=0, ftype=1
> log =internal log bsize=4096 blocks=2560, version=2
> = sectsz=4096 sunit=1 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
> Discarding blocks...Done.
>
> The applications crash with I/O errors. Here's what I see in dmesg:
>
> May 01 18:56:59 tethys kernel: XFS (dm-28): Internal error bno + len >
> gtbno at line 1908 of file fs/xfs/libxfs/xfs_alloc.c. Caller
> xfs_free_ag_extent+0x14e/0x950 [xfs]
> May 01 18:56:59 tethys kernel: CPU: 2 PID: 48657 Comm: node Tainted: P
What proprietary module do you have loaded?
Does the problem reproduce without it?
-Eric
* Re: Corruption of in-memory data (0x8) detected at xfs_defer_finish_noroll on kernel 6.3
2023-05-23 21:32 ` Justin Forbes
2023-05-24 5:42 ` Dave Chinner
@ 2023-05-25 2:24 ` Eric Sandeen
1 sibling, 0 replies; 7+ messages in thread
From: Eric Sandeen @ 2023-05-25 2:24 UTC (permalink / raw)
To: Justin Forbes, Dave Chinner; +Cc: Mike Pastore, linux-xfs
On 5/23/23 4:32 PM, Justin Forbes wrote:
> On Wed, May 03, 2023 at 09:13:18AM +1000, Dave Chinner wrote:
>> On Tue, May 02, 2023 at 05:13:09PM -0500, Mike Pastore wrote:
>>> On Tue, May 2, 2023, 5:03 PM Dave Chinner <david@fromorbit.com> wrote:
>>>
>>>>
>>>> If you can find a minimal reproducer, that would help a lot in
>>>> diagnosing the issue.
>>>>
>>>
>>> This is great, thank you. I'll get to work.
>>>
>>> One note: the problem occurred with and without crc=0, so we can rule that
>>> out at least.
>>
>> Yes, I noticed that. My point was more that we have much more
>> confidence in crc=1 filesystems because they have much more robust
>> verification of the on-disk format and won't fail log recovery in
>> the way you noticed. The verification with crc=1 configured
>> filesystems is also known to catch issues caused by
>> memory corruption more frequently, often preventing such occurrences
>> from corrupting the on-disk filesystem.
>>
>> Hence if you are seeing corruption events, you really want to be
>> using "-m crc=1" (default config) filesystems...
>
> Upon trying to roll out 6.3.3 to Fedora users, it seems that we have a
> few hitting this reliably with 6.3 kernels. It is certainly not all
> users of XFS though, as I use it extensively and haven't run across it.
> The most responsive users who can reproduce all seem to be running on
> xfs filesystems that were created a few years ago, and some even can't
> reproduce it on their newer systems. Either way, it is a widespread
> enough problem that I can't roll out 6.3 kernels to stable releases
> until it is fixed.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=2208553
The two cases in that bug look very similar, and are on similar
hardware, and they also look (to me) like different problems than the
one reported here.
Those reporters are reading garbage data from disk; this one seems to be
in-memory corruption of an inode down an xfs_free_eofblocks() path...
I could be wrong, but I'm not seeing a connection between this report
and the bugzilla report, at first glance.
Thanks,
-Eric