From: Gao Xiang <hsiangkao@linux.alibaba.com>
To: Arseniy Krasnov <avkrasnov@salutedevices.com>
Cc: oxffffaa@gmail.com, linux-erofs@lists.ozlabs.org,
linux-kernel@vger.kernel.org, kernel@salutedevices.com,
Gao Xiang <xiang@kernel.org>
Subject: Re: erofs pointer corruption and kernel crash
Date: Fri, 10 Apr 2026 23:41:14 +0800
Message-ID: <8c0bdfab-dbf2-4f1e-8e2a-ce18f166d841@linux.alibaba.com>
In-Reply-To: <97ca00c7-822d-4b57-9dc0-9b396049adc9@salutedevices.com>
Hi Arseniy,
On 2026/4/10 21:27, Arseniy Krasnov wrote:
>
>
> 10.04.2026 15:20, Gao Xiang wrote:
>>
>>
>> On 2026/4/10 19:37, Arseniy Krasnov wrote:
>>
>> (drop unrelated folks since they all subscribed erofs mailing list)
>>
>>>
>>>
>>> 10.04.2026 11:31, Gao Xiang wrote:
>>>> Hi,
>>>>
>>>> On 2026/4/10 16:13, Arseniy Krasnov wrote:
>>>>> Hi,
>>>>>
>>>>> We found unexpected behaviour of erofs:
>>>>>
>>>>> There is function in erofs - 'erofs_onlinefolio_end()'. It has pointer to
>>>>> 'struct folio' as first argument, and there is loop inside this function,
>>>>> which updates 'private' field of provided folio:
>>>>>
>>>>> do {
>>>>> orig = atomic_read((atomic_t *)&folio->private);
>>>>> DBG_BUGON(orig <= 0);
>>>>> v = dirty << EROFS_ONLINEFOLIO_DIRTY;
>>>>> v |= (orig - 1) | (!!err << EROFS_ONLINEFOLIO_EIO);
>>>>> } while (atomic_cmpxchg((atomic_t *)&folio->private, orig, v) != orig);
>>>>>
>>>>> Now, we see that in some rare case, this function processes folio, where
>>>>> 'private' is pointer, and thus this loop will update some bits in this
>>>>> pointer. Then later kernel dereferences such pointer and crashes.
>>>>>
>>>>> To catch this, the following small debug patch was used (e.g. we check that 'private' field is pointer):
>>>>>
>>>>> diff --git a/fs/erofs/data.c b/fs/erofs/data.c
>>>>> index 33cb0a7330d2..b1d8deffec4d 100644
>>>>> --- a/fs/erofs/data.c
>>>>> +++ b/fs/erofs/data.c
>>>>> @@ -238,6 +238,11 @@ void erofs_onlinefolio_end(struct folio *folio, int err, bool dirty)
>>>>> {
>>>>> int orig, v;
>>>>> + if (((uintptr_t)folio->private) & 0xffff000000000000) {
>>>>
>>>> No, if erofs_onlinefolio_end() is called, `folio->private`
>>>> shouldn't be a pointer, it's just a counter inside, and
>>>> storing a pointer is unexpected.
>>>>
>>>> And since the folio is locked, it shouldn't call into
>>>> try_to_free_buffers().
>>>>
>>>> Is it easy to reproduce? If yes, can you print other
>>>> values like `folio->mapping` and `folio->index` as
>>>> well?
>>>>
>>>> I need more information to find some clues.
>>>
>>>
>>>
>>> So reproduced again with this debug patch which adds magic to 'struct z_erofs_pcluster' and prints 'struct folio'
>>> when pointer in 'private' is passed to 'erofs_onlinefolio_end()'. In short - 'private' points to 'struct z_erofs_pcluster'.
>> First, erofs-utils 1.8.10 doesn't support `-E48bit`:
>> only erofs-utils 1.9+ ship it as an experimental
>> feature, see Changelog; so I think you're using
>> modified erofs-utils 1.8.10:
>> https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git/tree/ChangeLog
>>
>> ```
>> erofs-utils 1.9
>>
>> * This release includes the following updates:
>> - Add 48-bit layout support for larger filesystems (EXPERIMENTAL);
>> ```
>>
>> Second, I'm pretty sure this issue is related to
>> experimental `-E48bit`, and this information is
>> not enough for me to find the root cause, so I
>> need to find a way to reproduce it myself. It may
>> take time; you could debug it yourself, but I don't
>> think it's an easy task if you aren't quite familiar
>> with the EROFS codebase.
>>
>> Anyway I really suggest if you need a rush solution
>> for production, don't use `-E48bit + zstd` like
>> this for now: try to use other options like
>> `-zzstd -C65536 -Efragments` instead since those
>> are common production choices.
>
> Ok thanks for this advice! One more question: currently we use these options:
> "zstd,22 --max-extent-bytes 65536 -E48bit". Ok we remove "zstd,22" and "E48bit",
> but what about "--max-extent-bytes 65536" - is it considered a stable option?
> Or is it better to use your version: "-zzstd -C65536 -Efragments" ?
I'm not sure how you found this
"zstd,22 --max-extent-bytes 65536 -E48bit" combination.
My suggestion, based on production use, is that as long as
you don't combine `-zzstd` with `-E48bit`, it should be fine.
If you need smaller images, I suggest `-zlzma,9 -C65536 -Efragments`,
or `-zlz4hc`, which is what Android uses,
or zstd, but without `-E48bit`.
As for "--max-extent-bytes 65536", it can be dropped:
if `-E48bit` is not used, it only has negative
impacts.
In short, `-E48bit` + `-zzstd` + `--max-extent-bytes`
enables the new unaligned compression for zstd, but it's
a relatively new feature. I still need some time to
stabilize it, but my own time is limited and everything
has to be prioritized.
Thanks,
Gao Xiang
>
> Thanks
>
>>
>> Thanks,
>> Gao Xiang
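As a side note on the original report: the effect of running the quoted
erofs_onlinefolio_end() loop on a `private` word that actually holds a
pointer can be modeled in userspace. The sketch below is only an
illustration under stated assumptions. The bit positions 30 and 29 for the
EIO and DIRTY flags, the little-endian 64-bit layout (so the atomic_t cast
covers the low 32 bits of the word), and the helper names
onlinefolio_end_model() and looks_like_pointer() are not taken from this
thread. The model is single-threaded, so the cmpxchg retry is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration; check fs/erofs/data.c for the real ones. */
#define ONLINEFOLIO_EIO   30
#define ONLINEFOLIO_DIRTY 29

/* Same mask as the debug patch in the report: any of the top 16 bits set. */
static bool looks_like_pointer(uint64_t private_word)
{
    return (private_word & 0xffff000000000000ULL) != 0;
}

/*
 * Single-threaded model of one pass of the quoted loop.
 * Only the low 32 bits of the word are read and rewritten, which is what
 * the (atomic_t *) cast amounts to on a little-endian 64-bit machine.
 * Returns the new value of the whole 64-bit word.
 */
static uint64_t onlinefolio_end_model(uint64_t private_word, int err, bool dirty)
{
    uint32_t orig = (uint32_t)private_word;            /* "counter" half */
    uint32_t v = (uint32_t)dirty << ONLINEFOLIO_DIRTY;

    v |= (orig - 1) | ((uint32_t)!!err << ONLINEFOLIO_EIO);

    /* The high half is untouched; the low half is replaced by v. */
    return (private_word & 0xffffffff00000000ULL) | v;
}
```

Under these assumptions a counter value of 1 becomes 0, as intended, while
a pointer-like value such as 0xffff000012345678 comes back as
0xffff000012345677: the "decrement" lands in the low bits of the address,
which is consistent with the later crash on dereference described in the
report.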