public inbox for linux-xfs@vger.kernel.org
From: "Darrick J. Wong" <djwong@kernel.org>
To: Jose M Calhariz <jose.calhariz@tecnico.ulisboa.pt>
Cc: linux-xfs@vger.kernel.org
Subject: Re: Data corruption with XFS on Debian 11 and 12 under heavy load.
Date: Tue, 29 Aug 2023 16:41:36 -0700
Message-ID: <20230829234136.GF28186@frogsfrogsfrogs>
In-Reply-To: <ZO4nuHNg+KFzZ2Qz@calhariz.com>

On Tue, Aug 29, 2023 at 06:15:36PM +0100, Jose M Calhariz wrote:
> 
> Hi,
> 
> I have been chasing a data corruption problem under heavy load on 4
> servers in my care.  At first I suspected a hardware problem because
> it only happens with RAID 6 disks, so I reported it to Debian:
> 
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1032391
> 
> Further research pointed to XFS as the common factor, not a
> hardware issue.  So I informally asked a friend at a software house
> that relies heavily on XFS for his thoughts on this issue.  He
> pointed to several problems fixed in kernel 6.2 and a discussion on
> this mailing list about backporting the fixes to the 6.1 kernel.
> 
> With this information I tried the latest kernel available at the
> time from Debian testing on Debian 12 and I could not reproduce the
> problem.  So I filed another bug report:
> 
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040416
> 
> My questions to this mailing list:
> 
>   - Has anyone experienced corruption under heavy load on XFS,
>   under Debian or with vanilla kernels?

Yes.  There was a rash of corruption problems that got fixed in 6.2:
https://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git/tag/?h=xfs-6.2-merge-8
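
A rough way to tell whether a machine is still on a pre-6.2 kernel
series is to compare the running release against 6.2.  This is only a
sketch: the helper names are made up for illustration, and a version
comparison cannot tell whether a distro kernel carries backported
fixes.

```python
import platform
import re

# Kernel series in which the fixes were merged, per the tag linked above.
FIXED_IN = (6, 2)


def parse_kernel_release(release):
    """Return (major, minor) from a release string like '6.1.0-11-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def predates_fixes(release):
    """True if the release belongs to a series older than FIXED_IN.

    Hypothetical helper: it only compares version numbers and knows
    nothing about patches a distribution may have backported.
    """
    return parse_kernel_release(release) < FIXED_IN


if __name__ == "__main__":
    running = platform.release()
    if predates_fixes(running):
        print(f"{running}: older than 6.2, fixes are likely missing")
    else:
        print(f"{running}: 6.2 or newer")
```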

My guess, with no other information, is either the write invalidation
problem in iomap or maybe COW extent allocations racing with the log.

Most of these haven't been backported to 6.1 because our only choices as
a community were (a) let a dumb bot shovel in patches with zero QA or
(b) try to scare up volunteers to backport things to LTS kernels.  (a)
wasn't acceptable, but then with (b)...

>   - Should I stop waiting for the fixes to be backported to vanilla
>   6.1 and run the latest kernel from Debian testing anyway?  Note
>   that kernels from testing get security updates less promptly than
>   stable kernels, especially for security issues with limited
>   disclosure.

...there isn't really a designated 6.1 LTS backport engineer right now.
A couple of folks from Cloudflare, Amir Goldstein, and Ted Ts'o have
been sharing the work when they have spare time.

--D

> I am happy to provide more info about my setup or my stability tests
> that fail under XFS.
> 
> 
> Kind regards
> Jose M Calhariz
> 
> -- 
> --
> A false friend never curses at you
> 
> A true friend has already called you every swear word
> that exists - and even invented a few new ones


