From: "Lai, Yi" <yi1.lai@linux.intel.com>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Zhang Yi <yi.zhang@huaweicloud.com>,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, willy@infradead.org,
	adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com,
	libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com,
	yi1.lai@intel.com
Subject: Re: [PATCH v2 8/8] ext4: enable large folio for regular file
Date: Thu, 26 Jun 2025 11:35:06 +0800
Message-ID: <aFy/6oD/AZeTjwAs@ly-workstation>
In-Reply-To: <20250625131545.GD28249@mit.edu>

On Wed, Jun 25, 2025 at 09:15:45AM -0400, Theodore Ts'o wrote:
> It looks like this failure requires using madvise() with MADV_HWPOISON
> (which requires root) and MADV_PAGEOUT, and the stack trace is deep
> in an mm codepath:
> 
>    madvise_cold_or_pageout_pte_range+0x1cac/0x2800
>       reclaim_pages+0x393/0x560
>          reclaim_folio_list+0xe2/0x4c0
>             shrink_folio_list+0x44f/0x3d90
>                 unmap_poisoned_folio+0x130/0x500
>                     try_to_unmap+0x12f/0x140
>                        rmap_walk+0x16b/0x1f0
> 		       ...
> 
> The bisected commit is the one which enables using large folios, so
> while it's possible that this is due to ext4 doing something not quite
> right when using large folios, it's also possible that this might be a
> bug in the folio/mm code paths.
> 
> Does this reproduce on other file systems, such as XFS?
>

Indeed, this issue also reproduces on XFS. Thanks for the advice. Next
time I hit an ext4 issue, I will check whether it reproduces on other
file systems before reporting it. The oops from the XFS run:

[  395.888267] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASI
[  395.888767] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[  395.889150] CPU: 2 UID: 0 PID: 7420 Comm: repro Not tainted 6.16.0-rc3-86731a2a651e #1 PREEMPT(voluntary)
[  395.889620] Hardware name: Red Hat KVM/RHEL, BIOS edk2-20241117-3.el9 11/17/2024
[  395.889967] RIP: 0010:try_to_unmap_one+0x4ef/0x3860
[  395.890230] Code: f5 a5 ff 48 8b 9d 78 ff ff ff 49 8d 46 18 48 89 85 70 fe ff ff 48 85 db 0f 84 96 1a 00 00 e8 c8 f58
[  395.891081] RSP: 0018:ff1100011869ebc0 EFLAGS: 00010246
[  395.891337] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81e1a1a1
[  395.891676] RDX: ff11000130330000 RSI: ffffffff81e186c8 RDI: 0000000000000005
[  395.892018] RBP: ff1100011869ed90 R08: 0000000000000001 R09: ffe21c00230d3d3b
[  395.892356] R10: 0000000000000000 R11: ff11000130330e58 R12: 0000000020e00000
[  395.892691] R13: ffd40000043c8000 R14: ffd40000043c8000 R15: dffffc0000000000
[  395.893043] FS:  00007fbd34523740(0000) GS:ff110004a4e62000(0000) knlGS:0000000000000000
[  395.893437] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  395.893718] CR2: 0000000021000000 CR3: 000000010f8bf004 CR4: 0000000000771ef0
[  395.894060] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  395.894398] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[  395.894732] PKRU: 55555554
[  395.894868] Call Trace:
[  395.894991]  <TASK>
[  395.895109]  ? __pfx_try_to_unmap_one+0x10/0x10
[  395.895337]  __rmap_walk_file+0x2a5/0x4a0
[  395.895538]  rmap_walk+0x16b/0x1f0
[  395.895706]  try_to_unmap+0x12f/0x140
[  395.895853]  ? __pfx_try_to_unmap+0x10/0x10
[  395.896061]  ? __pfx_try_to_unmap_one+0x10/0x10
[  395.896284]  ? __pfx_folio_not_mapped+0x10/0x10
[  395.896504]  ? __pfx_folio_lock_anon_vma_read+0x10/0x10
[  395.896758]  ? __sanitizer_cov_trace_const_cmp8+0x1c/0x30
[  395.897025]  unmap_poisoned_folio+0x130/0x500
[  395.897251]  shrink_folio_list+0x44f/0x3d90
[  395.897476]  ? __pfx_shrink_folio_list+0x10/0x10
[  395.897719]  ? is_bpf_text_address+0x94/0x1b0
[  395.897941]  ? debug_smp_processor_id+0x20/0x30
[  395.898172]  ? is_bpf_text_address+0x9e/0x1b0
[  395.898387]  ? kernel_text_address+0xd3/0xe0
[  395.898604]  ? __kernel_text_address+0x16/0x50
[  395.898827]  ? unwind_get_return_address+0x65/0xb0
[  395.899066]  ? __pfx_stack_trace_consume_entry+0x10/0x10
[  395.899326]  ? arch_stack_walk+0xa1/0xf0
[  395.899530]  reclaim_folio_list+0xe2/0x4c0
[  395.899733]  ? check_path.constprop.0+0x28/0x50
[  395.899963]  ? __pfx_reclaim_folio_list+0x10/0x10
[  395.900198]  ? folio_isolate_lru+0x38c/0x590
[  395.900412]  reclaim_pages+0x393/0x560
[  395.900606]  ? __pfx_reclaim_pages+0x10/0x10
[  395.900824]  ? do_raw_spin_unlock+0x15c/0x210
[  395.901044]  madvise_cold_or_pageout_pte_range+0x1cac/0x2800
[  395.901326]  ? __pfx_madvise_cold_or_pageout_pte_range+0x10/0x10
[  395.901631]  ? lock_is_held_type+0xef/0x150
[  395.901852]  ? __pfx_madvise_cold_or_pageout_pte_range+0x10/0x10
[  395.902158]  walk_pgd_range+0xe2d/0x2420
[  395.902373]  ? __pfx_walk_pgd_range+0x10/0x10
[  395.902593]  __walk_page_range+0x177/0x810
[  395.902799]  ? find_vma+0xc4/0x140
[  395.902977]  ? __pfx_find_vma+0x10/0x10
[  395.903176]  ? __this_cpu_preempt_check+0x21/0x30
[  395.903401]  ? __sanitizer_cov_trace_const_cmp8+0x1c/0x30
[  395.903667]  walk_page_range_mm+0x39f/0x770
[  395.903877]  ? __pfx_walk_page_range_mm+0x10/0x10
[  395.904109]  ? __this_cpu_preempt_check+0x21/0x30
[  395.904340]  ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
[  395.904606]  ? mlock_drain_local+0x27f/0x4b0
[  395.904826]  walk_page_range+0x70/0xa0
[  395.905013]  ? __kasan_check_write+0x18/0x20
[  395.905227]  madvise_do_behavior+0x13e3/0x35f0
[  395.905453]  ? copy_vma_and_data+0x353/0x7d0
[  395.905674]  ? __pfx_madvise_do_behavior+0x10/0x10
[  395.905922]  ? __pfx_arch_get_unmapped_area_topdown+0x10/0x10
[  395.906219]  ? __this_cpu_preempt_check+0x21/0x30
[  395.906455]  ? lock_is_held_type+0xef/0x150
[  395.906665]  ? __lock_acquire+0x412/0x22a0
[  395.906875]  ? __this_cpu_preempt_check+0x21/0x30
[  395.907105]  ? lock_acquire+0x180/0x310
[  395.907306]  ? __pfx_down_read+0x10/0x10
[  395.907503]  ? __lock_acquire+0x412/0x22a0
[  395.907707]  ? __pfx___do_sys_mremap+0x10/0x10
[  395.907929]  ? __sanitizer_cov_trace_switch+0x58/0xa0
[  395.908186]  do_madvise+0x193/0x2b0
[  395.908363]  ? do_madvise+0x193/0x2b0
[  395.908550]  ? __pfx_do_madvise+0x10/0x10
[  395.908801]  ? __this_cpu_preempt_check+0x21/0x30
[  395.909036]  ? seqcount_lockdep_reader_access.constprop.0+0xb4/0xd0
[  395.909335]  ? lockdep_hardirqs_on+0x89/0x110
[  395.909556]  ? trace_hardirqs_on+0x51/0x60
[  395.909763]  ? seqcount_lockdep_reader_access.constprop.0+0xc0/0xd0
[  395.910073]  ? __sanitizer_cov_trace_cmp4+0x1a/0x20
[  395.910332]  ? ktime_get_coarse_real_ts64+0xad/0xf0
[  395.910578]  ? __audit_syscall_entry+0x39c/0x500
[  395.910812]  __x64_sys_madvise+0xb2/0x120
[  395.911016]  ? syscall_trace_enter+0x14d/0x280
[  395.911240]  x64_sys_call+0x19ac/0x2150
[  395.911431]  do_syscall_64+0x6d/0x2e0
[  395.911619]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  395.911865] RIP: 0033:0x7fbd3430756d
[  395.912046] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d8
[  395.912905] RSP: 002b:00007ffe6486ec48 EFLAGS: 00000217 ORIG_RAX: 000000000000001c
[  395.913267] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd3430756d
[  395.913603] RDX: 0000000000000015 RSI: 0000000000c00000 RDI: 0000000020400000
[  395.913941] RBP: 00007ffe6486ec60 R08: 00007ffe6486ec60 R09: 00007ffe6486ec60
[  395.914280] R10: 0000000020fc6000 R11: 0000000000000217 R12: 00007ffe6486edb8
[  395.914629] R13: 00000000004018e5 R14: 0000000000403e08 R15: 00007fbd3456a000
[  395.914989]  </TASK>
[  395.915111] Modules linked in:
[  395.915296] ---[ end trace 0000000000000000 ]---
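
For readers following the trace, here is a small sketch that decodes the user-space register dump and the KASAN fault address above. The register values are copied from the oops. The syscall number and MADV_* values are the x86_64 UAPI constants as I understand them, and the KASAN shadow offset is assumed to be the usual x86_64 default, which also matches R15 in the dump. The function names are only illustrative.

```python
# Values copied from the user-space register dump in the oops above.
ORIG_RAX = 0x1C        # syscall number
RDI = 0x20400000       # 1st syscall argument
RSI = 0xC00000         # 2nd syscall argument
RDX = 0x15             # 3rd syscall argument

# x86_64 syscall number and madvise() advice values (Linux UAPI headers).
SYS_MADVISE_X86_64 = 28
MADV_ADVICE = {20: "MADV_COLD", 21: "MADV_PAGEOUT", 100: "MADV_HWPOISON"}

# Assumption: default x86_64 KASAN shadow offset and generic-KASAN scale.
KASAN_SHADOW_OFFSET = 0xDFFFFC0000000000
KASAN_SHADOW_SCALE_SHIFT = 3
U64_MASK = (1 << 64) - 1


def decode_madvise(orig_rax, rdi, rsi, rdx):
    """Interpret syscall-entry registers as madvise(addr, length, advice)."""
    if orig_rax != SYS_MADVISE_X86_64:
        raise ValueError(f"syscall {orig_rax} is not madvise on x86_64")
    return {
        "addr": rdi,
        "length": rsi,
        "end": rdi + rsi,
        "advice": MADV_ADVICE.get(rdx, f"unknown({rdx})"),
    }


def kasan_shadow(addr):
    """Shadow byte address that generic KASAN checks for a kernel access."""
    return ((addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET) & U64_MASK


if __name__ == "__main__":
    call = decode_madvise(ORIG_RAX, RDI, RSI, RDX)
    print(f"madvise({call['addr']:#x}, {call['length']:#x}, {call['advice']})")
    print(f"range end: {call['end']:#x}")
    # Accesses to 0x0..0x7 all share one shadow byte.
    print(f"shadow of 0x0: {kasan_shadow(0x0):#x}")
    print(f"shadow of 0x7: {kasan_shadow(0x7):#x}")
```

Under those assumptions, the faulting call is madvise() with MADV_PAGEOUT over a 12 MiB range starting at 0x20400000, and the non-canonical address 0xdffffc0000000000 is the shadow byte for addresses 0x0 to 0x7, which is why KASAN reports a null-ptr-deref in that range.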

FYI, there is an ongoing discussion of this problem on the folio/mm side:
https://lore.kernel.org/all/20250611074643.250837-1-tujinjiang@huawei.com/T/

Regards,
Yi Lai

 
>      	  	       	     	  	   	- Ted
