From: Yeoreum Yun <yeoreum.yun@arm.com>
To: David Hildenbrand <david@redhat.com>
Cc: Yunseong Kim <ysk@kzalloc.com>, Byungchul Park <byungchul@sk.com>,
Hillf Danton <hdanton@sina.com>,
akpm@linux-foundation.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, kernel_team@skhynix.com
Subject: Re: [RFC] mm/migrate: make sure folio_unlock() before folio_wait_writeback()
Date: Tue, 7 Oct 2025 08:53:17 +0100
Message-ID: <aOTG7VTk4s9WfrMN@e129823.arm.com>
In-Reply-To: <deb6c0a2-e166-4c91-9736-276c9f1741c9@redhat.com>
Hi David,
> On 07.10.25 08:32, Yunseong Kim wrote:
> > Hi Hillf,
> >
> > Here are the syzlang and kernel log, and you can also find the gist snippet
> > in the body of the first RFC mail:
> >
> > https://gist.github.com/kzall0c/a6091bb2fd536865ca9aabfd017a1fc5
> >
> > I am reviewing this issue again on the v6.17, The issue is always reproducible,
> > usually occurring within about 10k attempts with the 8 procs.
>
> I can see a DEPT splat and I wonder what happens if DEPT is disabled.
>
> Will the machine actually deadlock or is this just DEPT complaining (and
> probably getting something wrong)?
>
As Pedro mentioned [0], I believe this DEPT splat is a false positive.
The folio targeted by __find_get_block_slow() belongs to bd_mapping,
which is not the same folio whose writeback flag gets cleared
in ext4_end_io_end().
Since DEPT currently does not distinguish regular-file data folios from
the corresponding block-device folios,
such false positives are a known issue, and we plan to fix them.
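To illustrate why merging the two kinds of folio into one class can report a cycle that does not exist, here is a toy model in Python. It is only a sketch and not how DEPT is implemented: the class names, the `has_cycle` and `edges` helpers, and the two dependency edges are assumptions made for illustration, loosely following the scenario discussed in this thread.

```python
from collections import defaultdict


def has_cycle(edge_list):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = defaultdict(list)
    for src, dst in edge_list:
        graph[src].append(dst)

    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)  # every node starts WHITE (unvisited)

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                return True  # back edge: a cycle exists
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(graph))


def edges(distinguish_mapping):
    """Hypothetical dependencies; "A -> B" means "progress on A waits for B"."""
    # Without the distinction, file and block-device folios share one class.
    file_tag = "file" if distinguish_mapping else "any"
    bdev_tag = "bdev" if distinguish_mapping else "any"
    return [
        # Assumed: migration holds a data folio's lock and waits for writeback.
        (("folio_lock", file_tag), ("writeback", file_tag)),
        # Assumed: the completion path waits on a bd_mapping folio before
        # it ends writeback on the data folio.
        (("writeback", file_tag), ("folio_lock", bdev_tag)),
    ]


collapsed = has_cycle(edges(distinguish_mapping=False))
separated = has_cycle(edges(distinguish_mapping=True))
print(collapsed, separated)
```

In this model the cycle is reported only when both folio kinds are tracked under the same class; once they are kept apart, the chain ends at the block-device folio and no cycle is found.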
Also, looking at the log Yunseong shared (hung.log),
I can see that the migration is stuck waiting for a buffer_head lock:
...
[ 3123.713542][ T89] INFO: task syz.4.2628:42733 blocked for more than 143 seconds.
[ 3123.713550][ T89] Not tainted 6.15.11-00046-g2c223fa7bd9a-dirty #13
[ 3123.713557][ T89] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3123.713562][ T89] task:syz.4.2628 state:D stack:0 pid:42733 tgid:42732 ppid:41804 task_flags:0x400040 flags:0x00000009
[ 3123.713577][ T89] Call trace:
[ 3123.713582][ T89] __switch_to+0x19c/0x2c0 (T)
[ 3123.713598][ T89] __schedule+0x514/0x1208
[ 3123.713614][ T89] schedule+0x40/0x164
[ 3123.713629][ T89] io_schedule+0x3c/0x5c
[ 3123.713644][ T89] bit_wait_io+0x14/0x70
[ 3123.713662][ T89] __wait_on_bit_lock+0xa0/0x120
[ 3123.713678][ T89] out_of_line_wait_on_bit_lock+0x8c/0xc0
[ 3123.713695][ T89] __lock_buffer+0x74/0xb8
[ 3123.713720][ T89] __buffer_migrate_folio+0x190/0x504
[ 3123.713747][ T89] buffer_migrate_folio_norefs+0x30/0x3c
[ 3123.713764][ T89] move_to_new_folio+0xe4/0x528
[ 3123.713779][ T89] migrate_pages_batch+0xee0/0x1788
[ 3123.713795][ T89] migrate_pages+0x15c4/0x1840
[ 3123.713810][ T89] compact_zone+0x9c8/0x1d20
[ 3123.713822][ T89] compact_node+0xd4/0x27c
[ 3123.713832][ T89] sysctl_compaction_handler+0x104/0x194
[ 3123.713843][ T89] proc_sys_call_handler+0x25c/0x3f8
[ 3123.713865][ T89] proc_sys_write+0x20/0x2c
[ 3123.713878][ T89] do_iter_readv_writev+0x350/0x448
[ 3123.713897][ T89] vfs_writev+0x1ac/0x44c
[ 3123.713913][ T89] do_pwritev+0x100/0x15c
[ 3123.713929][ T89] __arm64_sys_pwritev2+0x6c/0xcc
[ 3123.713945][ T89] invoke_syscall.constprop.0+0x64/0x18c
[ 3123.713961][ T89] el0_svc_common.constprop.0+0x80/0x198
[ 3123.713978][ T89] do_el0_svc+0x28/0x3c
[ 3123.713993][ T89] el0_svc+0x50/0x220
[ 3123.714004][ T89] el0t_64_sync_handler+0x10c/0x140
[ 3123.714017][ T89] el0t_64_sync+0x1b8/0x1bc
...
which is different from the description "stuck on writeback".
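For anyone who wants to check this against a longer log, a small Python sketch along the following lines can pull the function names out of a hung-task call trace and test which wait the task is actually sitting in. The helper names `frames` and `blocked_in` are hypothetical, and the regular expression assumes the "[ time][ Txx] symbol+0xoff/0xsize" line format seen in the trace above.

```python
import re

# Matches lines such as "[ 3123.713695][ T89] __lock_buffer+0x74/0xb8".
FRAME_RE = re.compile(r"\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frames(log_text):
    """Return function names from call-trace lines, innermost frame first."""
    return [m.group(1) for line in log_text.splitlines()
            if (m := FRAME_RE.search(line))]


def blocked_in(log_text, symbol):
    """True if `symbol` appears anywhere in the extracted call trace."""
    return symbol in frames(log_text)


# A few lines copied from the trace above.
sample = """\
[ 3123.713678][ T89] out_of_line_wait_on_bit_lock+0x8c/0xc0
[ 3123.713695][ T89] __lock_buffer+0x74/0xb8
[ 3123.713720][ T89] __buffer_migrate_folio+0x190/0x504
[ 3123.713747][ T89] buffer_migrate_folio_norefs+0x30/0x3c
"""

names = frames(sample)
print(names)
print(blocked_in(sample, "__lock_buffer"),
      blocked_in(sample, "folio_wait_writeback"))
```

On the excerpt, `__lock_buffer` is found and `folio_wait_writeback` is not, which matches the reading that the task waits on a buffer lock rather than on writeback.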
Unfortunately, I couldn't analyse it further with the log he shared
since it was truncated.
@Yunseong, could you reproduce this without DEPT and share
the full log for further analysis?
Thanks.
[0] https://lore.kernel.org/all/dglxbwe2i5ubofefdxwo5jvyhdfjov37z5jzc5guedhe4dl6ia@pmkjkec3isb4/
--
Sincerely,
Yeoreum Yun