From: Gao Xiang <hsiangkao@linux.alibaba.com>
To: Juhyung Park <qkrwngud825@gmail.com>,
Gao Xiang <xiang@kernel.org>,
linux-erofs@lists.ozlabs.org
Cc: linux-f2fs-devel@lists.sourceforge.net,
linux-crypto@vger.kernel.org,
Yann Collet <yann.collet.73@gmail.com>
Subject: Re: Weird EROFS data corruption
Date: Mon, 4 Dec 2023 00:52:31 +0800
Message-ID: <5a0e8b44-6feb-b489-cdea-e3be3811804a@linux.alibaba.com>
In-Reply-To: <CAD14+f2AVKf8Fa2OO1aAUdDNTDsVzzR6ctU_oJSmTyd6zSYR2Q@mail.gmail.com>
Hi Juhyung,
On 2023/12/4 00:22, Juhyung Park wrote:
> (Cc'ing f2fs and crypto as I've noticed something similar with f2fs a
> while ago, which may mean that this is not specific to EROFS:
> https://lore.kernel.org/all/CAD14+f2nBZtLfLC6CwNjgCOuRRRjwzttp3D3iK4Of+1EEjK+cw@mail.gmail.com/
> )
>
> Hi.
>
> I'm encountering a very weird EROFS data corruption.
>
> I noticed when I build an EROFS image for AOSP development, the device
> would randomly not boot from a certain build.
> After inspecting the log, I noticed that a file got corrupted.
Is this observed on your laptop (i7-1185G7), or on some arm64
device?
>
> After adding a hash check during the build flow, I noticed that EROFS
> would randomly read data wrong.
>
> I now have a reliable method of reproducing the issue, but here's the
> funny/weird part: it's only happening on my laptop (i7-1185G7). This
> is not happening with my 128 cores buildfarm machine (Threadripper
> 3990X).
>
> I first suspected a hardware issue, but:
> a. The laptop had its motherboard replaced recently (due to a failing
> physical Type-C port).
> b. The laptop passes memory test (memtest86).
> c. This happens on all kernel versions from v5.4 to the latest v6.6
> including my personal custom builds and Canonical's official Ubuntu
> kernels.
> d. This happens on different host SSDs and file-system combinations.
> e. This only happens on LZ4. LZ4HC doesn't trigger the issue.
> f. This only happens when mounting the image natively by the kernel.
> Using fuse with erofsfuse is fine.
I suspect this is an odd issue with in-place decompression, since you
said it depends on the hardware. Unfortunately, I cannot reproduce it
with your dataset on my local server (Xeon(R) CPU E5-2682 v4).

What is the difference between these two machines? Is it just the CPU,
or do they differ in other ways, such as using different compilers?
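
To compare the two machines on equal terms, the hash check you mentioned
could be run the same way on both. Below is a minimal sketch of one, not
your actual build-flow check: it assumes the image is already mounted and
that the source tree used to build it is still available. The names
sha256_of and find_mismatches, and both directory arguments, are
hypothetical.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_root, mounted_root):
    """Compare regular files under source_root with the mounted copy.

    Both arguments are hypothetical paths: the directory the image was
    built from, and the directory where the image is mounted.
    Returns a sorted list of relative paths that differ or are missing.
    """
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            # Skip symlinks and special files; only file contents are hashed.
            if os.path.islink(src) or not os.path.isfile(src):
                continue
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(mounted_root, rel)
            if not os.path.isfile(dst) or sha256_of(src) != sha256_of(dst):
                mismatches.append(rel)
    return sorted(mismatches)
```

Calling find_mismatches() with the source directory and the mount point
returns the relative paths that read back differently. Reads may be
served from the page cache, so remounting the image between runs makes
each run go through decompression again.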
Thanks,
Gao Xiang