From: Gao Xiang <hsiangkao@linux.alibaba.com>
To: Arseniy Krasnov <avkrasnov@salutedevices.com>
Cc: oxffffaa@gmail.com, linux-erofs@lists.ozlabs.org,
	linux-kernel@vger.kernel.org, kernel@salutedevices.com,
	Gao Xiang <xiang@kernel.org>
Subject: Re: erofs pointer corruption and kernel crash
Date: Fri, 10 Apr 2026 20:20:59 +0800
Message-ID: <c2d7d5ff-237d-48b5-82a2-ac4618f055cc@linux.alibaba.com>
In-Reply-To: <2e916997-0557-45e7-831a-b436c07c5ba4@salutedevices.com>



On 2026/4/10 19:37, Arseniy Krasnov wrote:

(dropping unrelated folks since they are all subscribed to the erofs mailing list)

> 
> 
> 10.04.2026 11:31, Gao Xiang wrote:
>> Hi,
>>
>> On 2026/4/10 16:13, Arseniy Krasnov wrote:
>>> Hi,
>>>
>>> We found unexpected behaviour of erofs:
>>>
>>> There is a function in erofs, 'erofs_onlinefolio_end()'. It takes a pointer to
>>> 'struct folio' as its first argument, and there is a loop inside this function
>>> which updates the 'private' field of the provided folio:
>>>
>>>     do {
>>>             orig = atomic_read((atomic_t *)&folio->private);
>>>             DBG_BUGON(orig <= 0);
>>>             v = dirty << EROFS_ONLINEFOLIO_DIRTY;
>>>             v |= (orig - 1) | (!!err << EROFS_ONLINEFOLIO_EIO);
>>>     } while (atomic_cmpxchg((atomic_t *)&folio->private, orig, v) != orig);
>>>
>>> Now we see that, in some rare cases, this function processes a folio whose
>>> 'private' is a pointer, and thus this loop updates some bits in this
>>> pointer. Later the kernel dereferences that pointer and crashes.
>>>
>>> To catch this, the following small debug patch was used (i.e. we check whether the 'private' field is a pointer):
>>>
>>> diff --git a/fs/erofs/data.c b/fs/erofs/data.c
>>> index 33cb0a7330d2..b1d8deffec4d 100644
>>> --- a/fs/erofs/data.c
>>> +++ b/fs/erofs/data.c
>>> @@ -238,6 +238,11 @@ void erofs_onlinefolio_end(struct folio *folio, int err, bool dirty)
>>>    {
>>>        int orig, v;
>>>    +    if (((uintptr_t)folio->private) & 0xffff000000000000) {
>>
>> No, if erofs_onlinefolio_end() is called, `folio->private`
>> shouldn't be a pointer: it's just a counter, and
>> storing a pointer there is unexpected.
>>
>> And since the folio is locked, it shouldn't call into
>> try_to_free_buffers().
>>
>> Is it easy to reproduce? If yes, can you print other
>> values like `folio->mapping` and `folio->index` as
>> well?
>>
>> I need more information to find some clues.
> 
> 
> 
> So I reproduced it again with this debug patch, which adds a magic to 'struct z_erofs_pcluster' and prints 'struct folio'
> when a pointer in 'private' is passed to 'erofs_onlinefolio_end()'. In short, 'private' points to 'struct z_erofs_pcluster'.
First, erofs-utils 1.8.10 doesn't support `-E48bit`:
it only ships as an experimental feature in
erofs-utils 1.9+ (see the ChangeLog), so I think
you're using a modified erofs-utils 1.8.10:
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git/tree/ChangeLog

```
erofs-utils 1.9

  * This release includes the following updates:
    - Add 48-bit layout support for larger filesystems (EXPERIMENTAL);
```

Second, I'm pretty sure this issue is related to
the experimental `-E48bit` feature. The information
so far is not enough for me to find the root cause,
so I need to find a way to reproduce it myself,
which may take time. You could debug it yourself,
but I don't think it's an easy task if you aren't
familiar with the EROFS codebase.

Anyway, if you need a quick solution for production,
I strongly suggest not using `-E48bit + zstd` like
this for now: try other options such as
`-zzstd -C65536 -Efragments` instead, since those
are common production choices.
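
For anyone following along, here is a rough user-space model of
the symptom reported at the top of the thread: what the counter
update in erofs_onlinefolio_end() does to a `private` word that
happens to hold a pointer. It is only a sketch, not the kernel
code. It assumes a little-endian 64-bit machine, where the
`(atomic_t *)` cast covers the low 32 bits of `private`, and it
assumes the flag bit positions 30 (EIO) and 29 (DIRTY); the
function names are made up for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions; check fs/erofs/data.c in your tree. */
#define ONLINEFOLIO_EIO   30
#define ONLINEFOLIO_DIRTY 29

/* Same heuristic as the debug patch: any of the top 16 bits set. */
static bool looks_like_pointer(uint64_t private_val)
{
	return (private_val & 0xffff000000000000ULL) != 0;
}

/*
 * Hypothetical model of one erofs_onlinefolio_end() call: the
 * cmpxchg loop only sees the low 32 bits of `private`, decrements
 * them and ORs in the flag bits; the high 32 bits are untouched.
 */
static uint64_t model_private_after_end(uint64_t private_val, int err,
					bool dirty)
{
	_Atomic uint32_t low = (uint32_t)private_val;
	uint32_t orig = atomic_load(&low);
	uint32_t v;

	do {
		v = (uint32_t)dirty << ONLINEFOLIO_DIRTY;
		v |= (orig - 1) | ((uint32_t)!!err << ONLINEFOLIO_EIO);
		/* On failure, orig is reloaded and v is recomputed. */
	} while (!atomic_compare_exchange_weak(&low, &orig, v));

	return (private_val & 0xffffffff00000000ULL) | atomic_load(&low);
}
```

With a real counter, `model_private_after_end(2, 0, false)` gives
1 as expected. With a pointer-like value such as
0xffff000012345680 it gives 0xffff00001234567f, i.e. a pointer
that is off by one byte, which matches the kind of corruption
described in the report.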

Thanks,
Gao Xiang


Thread overview: 25+ messages
2026-04-10  8:13 erofs pointer corruption and kernel crash Arseniy Krasnov
2026-04-10  8:31 ` Gao Xiang
2026-04-10  8:42   ` Gao Xiang
2026-04-10  8:51     ` Arseniy Krasnov
2026-04-10  8:59       ` Gao Xiang
2026-04-10  8:55   ` Arseniy Krasnov
2026-04-10  9:20     ` Gao Xiang
2026-04-10  9:59       ` Arseniy Krasnov
2026-04-10 10:01         ` Gao Xiang
2026-04-10 10:03           ` Arseniy Krasnov
2026-04-10 10:06             ` Gao Xiang
2026-04-10 10:10               ` Arseniy Krasnov
2026-04-10 10:22                 ` Gao Xiang
2026-04-10 10:31                   ` Arseniy Krasnov
2026-04-10 11:37   ` Arseniy Krasnov
2026-04-10 12:20     ` Gao Xiang [this message]
2026-04-10 13:27       ` Arseniy Krasnov
2026-04-10 15:41         ` Gao Xiang
2026-04-11 15:10           ` Arseniy Krasnov
2026-04-13  7:08             ` Gao Xiang
2026-04-13  7:20               ` Arseniy Krasnov
2026-04-25 15:29                 ` Gao Xiang
2026-04-26 11:42                   ` Arseniy Krasnov
2026-04-27 14:45                     ` Arseniy Krasnov
2026-04-10 13:35       ` Arseniy Krasnov
