From: Byungchul Park <byungchul@sk.com>
To: Gavin Guo <gavinguo@igalia.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	muchun.song@linux.dev, osalvador@suse.de,
	akpm@linux-foundation.org, mike.kravetz@oracle.com,
	kernel-dev@igalia.com, stable@vger.kernel.org,
	Hugh Dickins <hughd@google.com>,
	Florent Revest <revest@google.com>, Gavin Shan <gshan@redhat.com>,
	kernel_team@skhynix.com
Subject: Re: [PATCH] mm/hugetlb: fix a deadlock with pagecache_folio and hugetlb_fault_mutex_table
Date: Fri, 16 May 2025 16:43:36 +0900
Message-ID: <20250516074336.GA42829@system.software.com>
In-Reply-To: <b6e00e77-4a8c-4e05-ab79-266bf05fcc2d@igalia.com>

On Fri, May 16, 2025 at 03:32:35PM +0800, Gavin Guo wrote:
> On 5/16/25 14:03, Byungchul Park wrote:
> > On Wed, May 14, 2025 at 04:10:12PM +0800, Gavin Guo wrote:
> > > Hi Byungchul,
> > > 
> > > On 5/14/25 14:47, Byungchul Park wrote:
> > > > On Tue, May 13, 2025 at 05:34:48PM +0800, Gavin Guo wrote:
> > > > > The patch fixes a deadlock which can be triggered by an internal
> > > > > syzkaller [1] reproducer and captured by bpftrace script [2] and its log
> > > > 
> > > > Hi,
> > > > 
> > > > I'm trying to reproduce it using the test program [1], but haven't
> > > > managed to yet.  I see a lot of segfaults while running [1], so I guess
> > > > something is going wrong.  Is there any prerequisite for reproducing
> > > > it?  Let me know if so.  Or can you try DEPT15 with your config and
> > > > environment by the following steps:
> > > > 
> > > >      1. Apply the patchset on v6.15-rc6.
> > > >         https://lkml.kernel.org/r/20250513100730.12664-1-byungchul@sk.com
> > > >      2. Turn on CONFIG_DEPT.
> > > >      3. Run test program reproducing the deadlock.
> > > >      4. Check dmesg to see if dept reported the dependency.
> > > > 
> > > > 	Byungchul
> > > 
> > > I have enabled the patchset and successfully reproduced the bug. It
> > > seems that there is no warning or error log related to the lock. Did I
> > > miss anything? This is the console log:
> > > https://drive.google.com/file/d/1dxWNiO71qE-H-e5NMPqj7W-aW5CkGSSF/view?usp=sharing
> > 
> > My bad.  I think I found why dept didn't report it.  You might see
> > the report with the following patch applied on top, though there
> > might be a lot of annoying false positives along with it.
> > 
> > Some of my efforts to suppress false positives suppressed the real one.
> > 
> > Do you mind if I ask you to run the test with the following patch
> > applied?  It'd be appreciated if you do and share the result with me.
> > 
> > 	Byungchul
> > 
> > ---
> > diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> > index f31cd68f2935..fd7559e663c5 100644
> > --- a/include/linux/pagemap.h
> > +++ b/include/linux/pagemap.h
> > @@ -1138,6 +1138,7 @@ static inline bool trylock_page(struct page *page)
> >   static inline void folio_lock(struct folio *folio)
> >   {
> >   	might_sleep();
> > +	dept_page_wait_on_bit(&folio->page, PG_locked);
> >   	if (!folio_trylock(folio))
> >   		__folio_lock(folio);
> >   }
> > diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
> > index b2fa96d984bc..4e96a6a72d02 100644
> > --- a/kernel/dependency/dept.c
> > +++ b/kernel/dependency/dept.c
> > @@ -931,7 +931,6 @@ static void print_circle(struct dept_class *c)
> >   	dept_outworld_exit();
> >   	do {
> > -		tc->reported = true;
> >   		tc = fc;
> >   		fc = fc->bfs_parent;
> >   	} while (tc != c);
> > diff --git a/kernel/dependency/dept_unit_test.c b/kernel/dependency/dept_unit_test.c
> > index 88e846b9f876..496149f31fb3 100644
> > --- a/kernel/dependency/dept_unit_test.c
> > +++ b/kernel/dependency/dept_unit_test.c
> > @@ -125,6 +125,8 @@ static int __init dept_ut_init(void)
> >   {
> >   	int i;
> > +	return 0;
> > +
> >   	lockdep_off();
> >   	dept_ut_results.ecxt_stack_valid_cnt = 0;
> > --
> 
> Please see the test result:
> https://drive.google.com/file/d/1B20Gu3wLFbAeaXXb7aSQP5T6aeN9Mext/view?usp=sharing
> 
> It seems that after the first round, the deadlock is captured:

Thank you for the testing again!

Yeah, dept works as I expected.  I shouldn't have suppressed dept
reports so aggressively.  Instead, I (or we, if anyone else is willing)
need to deal with the existing false positives one by one using dept
annotations.

Thanks again for confirming it.

	Byungchul

> ubuntu@localhost:~$ ./repro_20250402_0225_154f8fb0580000
> executing program
> [   80.425842][ T3416] ===================================================
> [   80.426707][ T3416] DEPT: Circular dependency has been detected.
> [   80.427497][ T3416] 6.15.0-rc6+ #31 Not tainted
> [   80.428084][ T3416] ---------------------------------------------------
> [   80.428964][ T3416] summary
> [   80.429330][ T3416] ---------------------------------------------------
> [   80.430078][ T3416] *** DEADLOCK ***
> [   80.430078][ T3416]
> [   80.430736][ T3416] context A
> [   80.431076][ T3416]    [S] (unknown)(pg_locked_map:0)
> [   80.431637][ T3416]    [W] lock(&hugetlb_fault_mutex_table[i]:0)
> [   80.432312][ T3416]    [E] dept_page_clear_bit(pg_locked_map:0)
> [   80.432977][ T3416]
> [   80.433246][ T3416] context B
> [   80.433595][ T3416]    [S] lock(&hugetlb_fault_mutex_table[i]:0)
> [   80.434245][ T3416]    [W] dept_page_wait_on_bit(pg_locked_map:0)
> [   80.434880][ T3416]    [E] unlock(&hugetlb_fault_mutex_table[i]:0)
> [   80.435592][ T3416]
> [   80.435852][ T3416] [S]: start of the event context
> [   80.436369][ T3416] [W]: the wait blocked
> [   80.436789][ T3416] [E]: the event not reachable
> [   80.437275][ T3416] ---------------------------------------------------
> [   80.437950][ T3416] context A's detail
> [   80.438367][ T3416] ---------------------------------------------------
> [   80.439006][ T3416] context A
> [   80.439337][ T3416]    [S] (unknown)(pg_locked_map:0)
> [   80.439883][ T3416]    [W] lock(&hugetlb_fault_mutex_table[i]:0)
> [   80.440489][ T3416]    [E] dept_page_clear_bit(pg_locked_map:0)
> [   80.441075][ T3416]
> [   80.441318][ T3416] [S] (unknown)(pg_locked_map:0):
> [   80.441816][ T3416] (N/A)
> [   80.442077][ T3416]
> [   80.442309][ T3416] [W] lock(&hugetlb_fault_mutex_table[i]:0):
> [   80.442872][ T3416] [<ffffffff82144644>] hugetlb_wp+0xfa4/0x3490
> [   80.443502][ T3416] stacktrace:
> [   80.443810][ T3416]       hugetlb_wp+0xfa4/0x3490
> [   80.444267][ T3416]       hugetlb_fault+0x1505/0x2c70
> [   80.444776][ T3416]       handle_mm_fault+0x1845/0x1ab0
> [   80.445275][ T3416]       do_user_addr_fault+0x637/0x1450
> [   80.445779][ T3416]       exc_page_fault+0x67/0x110
> [   80.446239][ T3416]       asm_exc_page_fault+0x26/0x30
> [   80.446722][ T3416]       __put_user_4+0xd/0x20
> [   80.447157][ T3416]       copy_process+0x1f64/0x3d80
> [   80.447621][ T3416]       kernel_clone+0x216/0x940
> [   80.448068][ T3416]       __x64_sys_clone+0x18d/0x1f0
> [   80.448548][ T3416]       do_syscall_64+0x6f/0x120
> [   80.448999][ T3416]       entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [   80.449556][ T3416]
> [   80.449765][ T3416] [E] dept_page_clear_bit(pg_locked_map:0):
> [   80.450272][ T3416] [<ffffffff8214263b>] hugetlb_fault+0x1ccb/0x2c70
> [   80.450861][ T3416] stacktrace:
> [   80.451148][ T3416]       hugetlb_fault+0x1ccb/0x2c70
> [   80.451611][ T3416]       handle_mm_fault+0x1845/0x1ab0
> [   80.452080][ T3416]       do_user_addr_fault+0x637/0x1450
> [   80.452566][ T3416]       exc_page_fault+0x67/0x110
> [   80.453014][ T3416]       asm_exc_page_fault+0x26/0x30
> [   80.453497][ T3416]       __put_user_4+0xd/0x20
> [   80.453923][ T3416]       copy_process+0x1f64/0x3d80
> [   80.454379][ T3416]       kernel_clone+0x216/0x940
> [   80.454817][ T3416]       __x64_sys_clone+0x18d/0x1f0
> [   80.455277][ T3416]       do_syscall_64+0x6f/0x120
> [   80.455722][ T3416]       entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [   80.456253][ T3416] ---------------------------------------------------
> [   80.456842][ T3416] context B's detail
> [   80.457198][ T3416] ---------------------------------------------------
> [   80.457842][ T3416] context B
> [   80.458122][ T3416]    [S] lock(&hugetlb_fault_mutex_table[i]:0)
> [   80.458661][ T3416]    [W] dept_page_wait_on_bit(pg_locked_map:0)
> [   80.459187][ T3416]    [E] unlock(&hugetlb_fault_mutex_table[i]:0)
> [   80.459763][ T3416]
> [   80.459988][ T3416] [S] lock(&hugetlb_fault_mutex_table[i]:0):
> [   80.460509][ T3416] [<ffffffff82140d36>] hugetlb_fault+0x3c6/0x2c70
> [   80.461074][ T3416] stacktrace:
> [   80.461374][ T3416]       hugetlb_fault+0x3c6/0x2c70
> [   80.461812][ T3416]       handle_mm_fault+0x1845/0x1ab0
> [   80.462281][ T3416]       do_user_addr_fault+0x637/0x1450
> [   80.462775][ T3416]       exc_page_fault+0x67/0x110
> [   80.463220][ T3416]       asm_exc_page_fault+0x26/0x30
> [   80.463694][ T3416]       __put_user_4+0xd/0x20
> [   80.464129][ T3416]       copy_process+0x1f64/0x3d80
> [   80.464577][ T3416]       kernel_clone+0x216/0x940
> [   80.464994][ T3416]       __x64_sys_clone+0x18d/0x1f0
> [   80.465466][ T3416]       do_syscall_64+0x6f/0x120
> [   80.465909][ T3416]       entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [   80.466457][ T3416]
> [   80.466660][ T3416] [W] dept_page_wait_on_bit(pg_locked_map:0):
> [   80.467177][ T3416] [<ffffffff82141187>] hugetlb_fault+0x817/0x2c70
> [   80.467740][ T3416] stacktrace:
> [   80.468032][ T3416]       hugetlb_fault+0x817/0x2c70
> [   80.468479][ T3416]       handle_mm_fault+0x1845/0x1ab0
> [   80.468947][ T3416]       do_user_addr_fault+0x637/0x1450
> [   80.469428][ T3416]       exc_page_fault+0x67/0x110
> [   80.469865][ T3416]       asm_exc_page_fault+0x26/0x30
> [   80.470332][ T3416]       __put_user_4+0xd/0x20
> [   80.470742][ T3416]       copy_process+0x1f64/0x3d80
> [   80.471186][ T3416]       kernel_clone+0x216/0x940
> [   80.471616][ T3416]       __x64_sys_clone+0x18d/0x1f0
> [   80.472060][ T3416]       do_syscall_64+0x6f/0x120
> [   80.472492][ T3416]       entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [   80.473040][ T3416]
> [   80.473271][ T3416] [E] unlock(&hugetlb_fault_mutex_table[i]:0):
> [   80.473863][ T3416] (N/A)
> [   80.474124][ T3416] ---------------------------------------------------
> [   80.474738][ T3416] information that might be helpful
> [   80.475210][ T3416] ---------------------------------------------------
> [   80.475820][ T3416] CPU: 1 UID: 1000 PID: 3416 Comm: repro_20250402_ Not tainted 6.15.0-rc6+ #31 NONE
> [   80.475831][ T3416] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [   80.475837][ T3416] Call Trace:
> [   80.475841][ T3416]  <TASK>
> [   80.475845][ T3416]  dump_stack_lvl+0x1ad/0x280
> [   80.475858][ T3416]  ? __pfx_dump_stack_lvl+0x10/0x10
> [   80.475867][ T3416]  ? __pfx__printk+0x10/0x10
> [   80.475883][ T3416]  cb_check_dl+0x24a8/0x2530
> [   80.475897][ T3416]  ? bfs_extend_dep+0x271/0x290
> [   80.475909][ T3416]  bfs+0x464/0x5e0
> [   80.475921][ T3416]  ? __pfx_bfs+0x10/0x10
> [   80.475931][ T3416]  ? add_dep+0x387/0x710
> [   80.475943][ T3416]  add_dep+0x3d0/0x710
> [   80.475953][ T3416]  ? __pfx_from_pool+0x10/0x10
> [   80.475963][ T3416]  ? __pfx_bfs_init_check_dl+0x10/0x10
> [   80.475972][ T3416]  ? __pfx_bfs_extend_dep+0x10/0x10
> [   80.475981][ T3416]  ? __pfx_bfs_dequeue_dep+0x10/0x10
> [   80.475990][ T3416]  ? __pfx_cb_check_dl+0x10/0x10
> [   80.475999][ T3416]  ? __pfx_add_dep+0x10/0x10
> [   80.476011][ T3416]  ? put_ecxt+0xda/0x4b0
> [   80.476024][ T3416]  __dept_event+0xee8/0x1590
> [   80.476038][ T3416]  dept_event+0x166/0x240
> [   80.476047][ T3416]  ? hugetlb_fault+0x1ccb/0x2c70
> [   80.476057][ T3416]  folio_unlock+0xb8/0x190
> [   80.476071][ T3416]  hugetlb_fault+0x1ccb/0x2c70
> [   80.476085][ T3416]  ? __pfx_hugetlb_fault+0x10/0x10
> [   80.476100][ T3416]  ? mt_find+0x15a/0x5f0
> [   80.476110][ T3416]  handle_mm_fault+0x1845/0x1ab0
> [   80.476125][ T3416]  ? handle_mm_fault+0xdb/0x1ab0
> [   80.476142][ T3416]  ? __pfx_handle_mm_fault+0x10/0x10
> [   80.476156][ T3416]  ? find_vma+0xec/0x160
> [   80.476164][ T3416]  ? __pfx_find_vma+0x10/0x10
> [   80.476172][ T3416]  ? dept_on+0x1c/0x30
> [   80.476179][ T3416]  ? dept_exit+0x1c5/0x2c0
> [   80.476186][ T3416]  ? lockdep_hardirqs_on_prepare+0x21/0x280
> [   80.476197][ T3416]  ? lock_mm_and_find_vma+0xa1/0x300
> [   80.476211][ T3416]  do_user_addr_fault+0x637/0x1450
> [   80.476219][ T3416]  ? mntput_no_expire+0xc0/0x870
> [   80.476235][ T3416]  ? __pfx_do_user_addr_fault+0x10/0x10
> [   80.476246][ T3416]  ? trace_irq_disable+0x60/0x180
> [   80.476258][ T3416]  exc_page_fault+0x67/0x110
> [   80.476272][ T3416]  asm_exc_page_fault+0x26/0x30
> [   80.476280][ T3416] RIP: 0010:__put_user_4+0xd/0x20
> [   80.476293][ T3416] Code: 66 89 01 31 c9 0f 1f 00 c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 89 cb 48 c1 fb 3f 48 09 d9 0f 1f 00 <89> 01 31 c9 0
> [   80.476312][ T3416] RSP: 0018:ffffc90004dffa38 EFLAGS: 00010206
> [   80.476322][ T3416] RAX: 000000000000000c RBX: 0000000000000000 RCX: 0000200000000200
> [   80.476329][ T3416] RDX: 0000000000000000 RSI: ffff888016abe300 RDI: ffff888017878c20
> [   80.476335][ T3416] RBP: ffffc90004dffc10 R08: 0000000000000000 R09: 0000000000000000
> [   80.476340][ T3416] R10: 0000000000000000 R11: ffffffff82034b65 R12: ffff888017c0a1e8
> [   80.476346][ T3416] R13: ffff88800d6a8200 R14: 0000000000000000 R15: ffff888017c08a38
> [   80.476354][ T3416]  ? __might_fault+0xb5/0x130
> [   80.476367][ T3416]  copy_process+0x1f64/0x3d80
> [   80.476375][ T3416]  ? lockdep_hardirqs_on_prepare+0x21/0x280
> [   80.476388][ T3416]  ? copy_process+0x996/0x3d80
> [   80.476399][ T3416]  ? __pfx_copy_process+0x10/0x10
> [   80.476406][ T3416]  ? from_pool+0x1e1/0x750
> [   80.476416][ T3416]  ? handle_mm_fault+0x122e/0x1ab0
> [   80.476432][ T3416]  kernel_clone+0x216/0x940
> [   80.476440][ T3416]  ? __pfx_llist_del_first+0x10/0x10
> [   80.476448][ T3416]  ? check_new_class+0x28a/0xe90
> [   80.476458][ T3416]  ? __pfx_kernel_clone+0x10/0x10
> [   80.476468][ T3416]  ? from_pool+0x1e1/0x750
> [   80.476478][ T3416]  ? __pfx_from_pool+0x10/0x10
> [   80.476487][ T3416]  ? __pfx_from_pool+0x10/0x10
> [   80.476502][ T3416]  __x64_sys_clone+0x18d/0x1f0
> [   80.476512][ T3416]  ? __pfx___x64_sys_clone+0x10/0x10
> [   80.476520][ T3416]  ? llist_add_batch+0x111/0x1f0
> [   80.476532][ T3416]  ? dept_task+0x5/0x20
> [   80.476539][ T3416]  ? dept_on+0x1c/0x30
> [   80.476545][ T3416]  ? dept_exit+0x1c5/0x2c0
> [   80.476553][ T3416]  ? lockdep_hardirqs_on_prepare+0x21/0x280
> [   80.476565][ T3416]  do_syscall_64+0x6f/0x120
> [   80.476573][ T3416]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [   80.476580][ T3416] RIP: 0033:0x41b26d
> [   80.476588][ T3416] Code: b3 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 8
> [   80.476595][ T3416] RSP: 002b:00007ffa1ad2d198 EFLAGS: 00000206 ORIG_RAX: 0000000000000038
> [   80.476604][ T3416] RAX: ffffffffffffffda RBX: 00007ffa1ad2dcdc RCX: 000000000041b26d
> [   80.476610][ T3416] RDX: 0000200000000200 RSI: 0000000000000000 RDI: 0000000000001200
> [   80.476616][ T3416] RBP: 00007ffa1ad2d1e0 R08: 0000000000000000 R09: 0000000000000000
> [   80.476621][ T3416] R10: 0000000000000000 R11: 0000000000000206 R12: 00007ffa1ad2d6c0
> [   80.476626][ T3416] R13: ffffffffffffffb8 R14: 0000000000000002 R15: 00007ffd95d76940
> [   80.476638][ T3416]  </TASK>
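
The report above comes down to two dependencies between the same pair
of classes: context A holds the folio lock (pg_locked_map) while waiting
for hugetlb_fault_mutex_table[i], and context B holds that mutex while
waiting on the folio lock.  As a minimal userspace sketch of that idea,
and not dept's actual implementation, the C below records each "held
while waiting" pair as a directed edge and checks whether a new edge
would close a cycle.  All names here (dep_graph, dep_add,
dep_would_deadlock and the enum ids) are hypothetical, and the search is
a plain depth-first walk chosen for brevity, whereas the call trace
above goes through a function named bfs().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8

/* Hypothetical class ids, named after the two classes in the report. */
enum { PG_LOCKED_MAP = 0, HUGETLB_FAULT_MUTEX = 1 };

/* edge[a][b] is true if some context waited for b while holding a. */
struct dep_graph {
	bool edge[MAX_CLASSES][MAX_CLASSES];
};

static void dep_graph_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Record "while holding `held`, a context waited for `waited`". */
static void dep_add(struct dep_graph *g, int held, int waited)
{
	assert(held >= 0 && held < MAX_CLASSES);
	assert(waited >= 0 && waited < MAX_CLASSES);
	g->edge[held][waited] = true;
}

/* Depth-first search: can `target` be reached from `from`? */
static bool reaches(const struct dep_graph *g, int from, int target,
		    bool *seen)
{
	if (from == target)
		return true;
	if (seen[from])
		return false;
	seen[from] = true;

	for (int next = 0; next < MAX_CLASSES; next++) {
		if (g->edge[from][next] && reaches(g, next, target, seen))
			return true;
	}
	return false;
}

/*
 * Would a new dependency held -> waited close a cycle?  It does if
 * `held` is already reachable from `waited` through recorded edges.
 */
static bool dep_would_deadlock(const struct dep_graph *g, int held,
			       int waited)
{
	bool seen[MAX_CLASSES] = { false };

	return reaches(g, waited, held, seen);
}
```

In this model, recording context B's dependency with
dep_add(&g, HUGETLB_FAULT_MUTEX, PG_LOCKED_MAP) and then asking
dep_would_deadlock(&g, PG_LOCKED_MAP, HUGETLB_FAULT_MUTEX) for context A
returns true, which corresponds to the circular dependency in the
summary.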

Thread overview: 14+ messages
2025-05-13  9:34 [PATCH] mm/hugetlb: fix a deadlock with pagecache_folio and hugetlb_fault_mutex_table Gavin Guo
2025-05-14  0:56 ` Andrew Morton
2025-05-14  4:33   ` Byungchul Park
2025-05-14  6:47 ` Byungchul Park
2025-05-14  8:10   ` Gavin Guo
2025-05-15  2:22     ` Byungchul Park
2025-05-16  6:03     ` Byungchul Park
2025-05-16  7:32       ` Gavin Guo
2025-05-16  7:43         ` Byungchul Park [this message]
2025-05-20 19:53 ` Oscar Salvador
2025-05-21 11:12   ` Gavin Guo
2025-05-26  4:41 ` Gavin Shan
2025-05-27  9:59   ` Gavin Guo
2025-05-27 10:59     ` Gavin Shan