linux-mm.kvack.org archive mirror
* lockdep issue with per-vma locking
@ 2023-07-12  2:26 Liam R. Howlett
  2023-07-12 15:15 ` Suren Baghdasaryan
  0 siblings, 1 reply; 5+ messages in thread
From: Liam R. Howlett @ 2023-07-12  2:26 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: linux-mm, willy, Liam R. Howlett, Laurent Dufour,
	Michel Lespinasse, Jerome Glisse, Vlastimil Babka,
	Paul E. McKenney

Suren,

When running kselftest mm, I believe I've come across a lockdep issue
with the per-vma locking pagefault:

[  226.105499] WARNING: CPU: 1 PID: 1907 at include/linux/mmap_lock.h:71 handle_userfault+0x34d/0xff0
[  226.106517] Modules linked in:
[  226.107060] CPU: 1 PID: 1907 Comm: uffd-unit-tests Not tainted 6.5.0-rc1+ #636
[  226.108099] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-5 04/01/2014
[  226.109626] RIP: 0010:handle_userfault+0x34d/0xff0
[  226.113056] Code: 00 48 85 c0 0f 85 d4 fe ff ff 4c 89 f7 e8 bb 58 ea ff 0f 0b 31 f6 49 8d be a0 01 00 00 e8 0b 8b 53 01 85 c0 0f 85 00 fe ff ff <0f> 0b e9 f9 fd ff ff 49 8d be a0 01 00 00 be ff ff ff ff e8 eb 8a
[  226.115798] RSP: 0000:ffff888113a8fbf0 EFLAGS: 00010246
[  226.116570] RAX: 0000000000000000 RBX: ffff888113a8fdc8 RCX: 0000000000000001
[  226.117630] RDX: 0000000000000000 RSI: ffffffff97a70220 RDI: ffffffff97c316e0
[  226.118654] RBP: ffff88811de7c1e0 R08: 0000000000000000 R09: ffffed1022991400
[  226.119508] R10: ffff888114c8a003 R11: 0000000000000000 R12: 0000000000000200
[  226.120471] R13: ffff88811de7c1f0 R14: ffff888106ebec00 R15: 0000000000001000
[  226.121521] FS:  00007f226ec0f740(0000) GS:ffff88836f280000(0000) knlGS:0000000000000000
[  226.122543] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  226.123242] CR2: 00007f226ac0f028 CR3: 00000001088a4001 CR4: 0000000000370ee0
[  226.124075] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  226.125073] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  226.126308] Call Trace:
[  226.127473]  <TASK>
[  226.128001]  ? __warn+0x9c/0x1f0
[  226.129005]  ? handle_userfault+0x34d/0xff0
[  226.129940]  ? report_bug+0x1f2/0x220
[  226.130700]  ? handle_bug+0x3c/0x70
[  226.131234]  ? exc_invalid_op+0x13/0x40
[  226.131827]  ? asm_exc_invalid_op+0x16/0x20
[  226.132516]  ? handle_userfault+0x34d/0xff0
[  226.133193]  ? __pfx_do_raw_spin_lock+0x10/0x10
[  226.133862]  ? find_held_lock+0x83/0xa0
[  226.134602]  ? do_anonymous_page+0x81f/0x870
[  226.135314]  ? __pfx_handle_userfault+0x10/0x10
[  226.136226]  ? __pte_offset_map_lock+0xd4/0x160
[  226.136958]  ? do_raw_spin_unlock+0x92/0xf0
[  226.137547]  ? preempt_count_sub+0xf/0xc0
[  226.138011]  ? _raw_spin_unlock+0x24/0x40
[  226.138594]  ? do_anonymous_page+0x81f/0x870
[  226.139239]  __handle_mm_fault+0x40a/0x470
[  226.139749]  ? __pfx___handle_mm_fault+0x10/0x10
[  226.140516]  handle_mm_fault+0xe9/0x270
[  226.141015]  do_user_addr_fault+0x1a9/0x810
[  226.141638]  exc_page_fault+0x58/0xe0
[  226.142101]  asm_exc_page_fault+0x22/0x30
[  226.142713] RIP: 0033:0x561107c4967e
[  226.143391] Code: 48 89 85 18 ff ff ff e9 e2 00 00 00 48 8b 15 49 a0 00 00 48 8b 05 2a a0 00 00 48 0f af 45 f8 48 83 c0 2f 48 01 d0 48 83 e0 f8 <48> 8b 00 48 89 45 c8 48 8b 05 54 a0 00 00 48 8b 55 f8 48 c1 e2 03
[  226.145946] RSP: 002b:00007ffee4f22120 EFLAGS: 00010206
[  226.146745] RAX: 00007f226ac0f028 RBX: 00007ffee4f22448 RCX: 00007f226eca1bb4
[  226.147912] RDX: 00007f226ac0f000 RSI: 0000000000000001 RDI: 0000000000000000
[  226.149093] RBP: 00007ffee4f22220 R08: 0000000000000000 R09: 0000000000000000
[  226.150218] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000000
[  226.151313] R13: 00007ffee4f22458 R14: 0000561107c52dd8 R15: 00007f226ee34020
[  226.152464]  </TASK>
[  226.152802] irq event stamp: 3177751
[  226.153348] hardirqs last  enabled at (3177761): [<ffffffff95d9fa69>] __up_console_sem+0x59/0x80
[  226.154679] hardirqs last disabled at (3177772): [<ffffffff95d9fa4e>] __up_console_sem+0x3e/0x80
[  226.155998] softirqs last  enabled at (3177676): [<ffffffff95ccea54>] irq_exit_rcu+0x94/0xf0
[  226.157364] softirqs last disabled at (3177667): [<ffffffff95ccea54>] irq_exit_rcu+0x94/0xf0
[  226.158721] ---[ end trace 0000000000000000 ]---


With CONFIG_PER_VMA_LOCK, the per-VMA locking fault path calls
handle_mm_fault() in mm/memory.c.  handle_mm_fault() may have an
outdated comment, depending on what "mm semaphore" means:

* By the time we get here, we already hold the mm semaphore

__handle_mm_fault+0x40a/0x470:
do_pte_missing at mm/memory.c:3672
(inlined by) handle_pte_fault at mm/memory.c:4955
(inlined by) __handle_mm_fault at mm/memory.c:5095

handle_userfault+0x34d/0xff0:
mmap_assert_write_locked at include/linux/mmap_lock.h:71
(inlined by) __is_vma_write_locked at include/linux/mm.h:673
(inlined by) vma_assert_locked at include/linux/mm.h:714
(inlined by) assert_fault_locked at include/linux/mm.h:747
(inlined by) handle_userfault at fs/userfaultfd.c:440

It looks like vma_assert_locked() triggers this warning when the
mmap_lock is not held in write mode.

It looks to be an easy fix: check that the mmap_lock is held in write
mode at every other call site BUT the vma_assert_locked() path?
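
For reference, here is a toy userspace model of the chain decoded above.
The model_* helpers mirror the kernel names, but their bodies and the
ordering of the checks are simplified assumptions made only to illustrate
why the warning can fire on the per-VMA-lock fault path; they are not the
real definitions in include/linux/mm.h, include/linux/mmap_lock.h or
fs/userfaultfd.c.

#include <stdbool.h>
#include <stdio.h>

struct mm_model {
        bool mmap_write_locked;         /* is mmap_lock held for write? */
};

struct vma_model {
        struct mm_model *mm;
        bool vma_read_locked;           /* is the per-VMA lock read-held? */
};

/* Models mmap_assert_write_locked(): the lockdep WARN at mmap_lock.h:71. */
static void model_mmap_assert_write_locked(struct mm_model *mm)
{
        if (!mm->mmap_write_locked)
                printf("WARNING: mmap_lock is not held for write\n");
}

/* Models __is_vma_write_locked(): it asserts on mmap_lock as a side effect. */
static bool model_is_vma_write_locked(struct vma_model *vma)
{
        model_mmap_assert_write_locked(vma->mm);  /* assertion, not just a test */
        return vma->mm->mmap_write_locked;
}

/* Models vma_assert_locked() as reached from assert_fault_locked(). */
static void model_vma_assert_locked(struct vma_model *vma)
{
        /*
         * Using the write-locked helper as a plain predicate drags its
         * mmap_lock assertion in, even when only the per-VMA read lock
         * is held.
         */
        if (!model_is_vma_write_locked(vma) && !vma->vma_read_locked)
                printf("BUG: vma is not locked at all\n");
}

int main(void)
{
        struct mm_model mm = { .mmap_write_locked = false };
        /* Per-VMA-lock page fault: VMA read-locked, mmap_lock untouched. */
        struct vma_model vma = { .mm = &mm, .vma_read_locked = true };

        model_vma_assert_locked(&vma);  /* prints the spurious warning */
        return 0;
}

Running the model prints the warning even though the VMA is validly
read-locked, which looks like the situation in the trace above.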

Thanks,
Liam



* Re: lockdep issue with per-vma locking
  2023-07-12  2:26 lockdep issue with per-vma locking Liam R. Howlett
@ 2023-07-12 15:15 ` Suren Baghdasaryan
  2023-07-12 15:30   ` Liam R. Howlett
  0 siblings, 1 reply; 5+ messages in thread
From: Suren Baghdasaryan @ 2023-07-12 15:15 UTC (permalink / raw)
  To: Liam R. Howlett, Suren Baghdasaryan, linux-mm, willy,
	Laurent Dufour, Michel Lespinasse, Jerome Glisse, Vlastimil Babka,
	Paul E. McKenney

On Tue, Jul 11, 2023 at 7:26 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> Suren,
>
> When running kselftest mm, I believe I've come across a lockdep issue
> with the per-vma locking pagefault:
>
> [  226.105499] WARNING: CPU: 1 PID: 1907 at include/linux/mmap_lock.h:71 handle_userfault+0x34d/0xff0
> [  226.106517] Modules linked in:
> [  226.107060] CPU: 1 PID: 1907 Comm: uffd-unit-tests Not tainted 6.5.0-rc1+ #636
> [  226.108099] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-5 04/01/2014
> [  226.109626] RIP: 0010:handle_userfault+0x34d/0xff0
> [  226.113056] Code: 00 48 85 c0 0f 85 d4 fe ff ff 4c 89 f7 e8 bb 58 ea ff 0f 0b 31 f6 49 8d be a0 01 00 00 e8 0b 8b 53 01 85 c0 0f 85 00 fe ff ff <0f> 0b e9 f9 fd ff ff 49 8d be a0 01 00 00 be ff ff ff ff e8 eb 8a
> [  226.115798] RSP: 0000:ffff888113a8fbf0 EFLAGS: 00010246
> [  226.116570] RAX: 0000000000000000 RBX: ffff888113a8fdc8 RCX: 0000000000000001
> [  226.117630] RDX: 0000000000000000 RSI: ffffffff97a70220 RDI: ffffffff97c316e0
> [  226.118654] RBP: ffff88811de7c1e0 R08: 0000000000000000 R09: ffffed1022991400
> [  226.119508] R10: ffff888114c8a003 R11: 0000000000000000 R12: 0000000000000200
> [  226.120471] R13: ffff88811de7c1f0 R14: ffff888106ebec00 R15: 0000000000001000
> [  226.121521] FS:  00007f226ec0f740(0000) GS:ffff88836f280000(0000) knlGS:0000000000000000
> [  226.122543] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  226.123242] CR2: 00007f226ac0f028 CR3: 00000001088a4001 CR4: 0000000000370ee0
> [  226.124075] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  226.125073] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  226.126308] Call Trace:
> [  226.127473]  <TASK>
> [  226.128001]  ? __warn+0x9c/0x1f0
> [  226.129005]  ? handle_userfault+0x34d/0xff0
> [  226.129940]  ? report_bug+0x1f2/0x220
> [  226.130700]  ? handle_bug+0x3c/0x70
> [  226.131234]  ? exc_invalid_op+0x13/0x40
> [  226.131827]  ? asm_exc_invalid_op+0x16/0x20
> [  226.132516]  ? handle_userfault+0x34d/0xff0
> [  226.133193]  ? __pfx_do_raw_spin_lock+0x10/0x10
> [  226.133862]  ? find_held_lock+0x83/0xa0
> [  226.134602]  ? do_anonymous_page+0x81f/0x870
> [  226.135314]  ? __pfx_handle_userfault+0x10/0x10
> [  226.136226]  ? __pte_offset_map_lock+0xd4/0x160
> [  226.136958]  ? do_raw_spin_unlock+0x92/0xf0
> [  226.137547]  ? preempt_count_sub+0xf/0xc0
> [  226.138011]  ? _raw_spin_unlock+0x24/0x40
> [  226.138594]  ? do_anonymous_page+0x81f/0x870
> [  226.139239]  __handle_mm_fault+0x40a/0x470
> [  226.139749]  ? __pfx___handle_mm_fault+0x10/0x10
> [  226.140516]  handle_mm_fault+0xe9/0x270
> [  226.141015]  do_user_addr_fault+0x1a9/0x810
> [  226.141638]  exc_page_fault+0x58/0xe0
> [  226.142101]  asm_exc_page_fault+0x22/0x30
> [  226.142713] RIP: 0033:0x561107c4967e
> [  226.143391] Code: 48 89 85 18 ff ff ff e9 e2 00 00 00 48 8b 15 49 a0 00 00 48 8b 05 2a a0 00 00 48 0f af 45 f8 48 83 c0 2f 48 01 d0 48 83 e0 f8 <48> 8b 00 48 89 45 c8 48 8b 05 54 a0 00 00 48 8b 55 f8 48 c1 e2 03
> [  226.145946] RSP: 002b:00007ffee4f22120 EFLAGS: 00010206
> [  226.146745] RAX: 00007f226ac0f028 RBX: 00007ffee4f22448 RCX: 00007f226eca1bb4
> [  226.147912] RDX: 00007f226ac0f000 RSI: 0000000000000001 RDI: 0000000000000000
> [  226.149093] RBP: 00007ffee4f22220 R08: 0000000000000000 R09: 0000000000000000
> [  226.150218] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000000
> [  226.151313] R13: 00007ffee4f22458 R14: 0000561107c52dd8 R15: 00007f226ee34020
> [  226.152464]  </TASK>
> [  226.152802] irq event stamp: 3177751
> [  226.153348] hardirqs last  enabled at (3177761): [<ffffffff95d9fa69>] __up_console_sem+0x59/0x80
> [  226.154679] hardirqs last disabled at (3177772): [<ffffffff95d9fa4e>] __up_console_sem+0x3e/0x80
> [  226.155998] softirqs last  enabled at (3177676): [<ffffffff95ccea54>] irq_exit_rcu+0x94/0xf0
> [  226.157364] softirqs last disabled at (3177667): [<ffffffff95ccea54>] irq_exit_rcu+0x94/0xf0
> [  226.158721] ---[ end trace 0000000000000000 ]---
>
>
> With CONFIG_PER_VMA_LOCK, the per-VMA locking fault path calls
> handle_mm_fault() in mm/memory.c.  handle_mm_fault() may have an
> outdated comment, depending on what "mm semaphore" means:
>
> * By the time we get here, we already hold the mm semaphore
>
> __handle_mm_fault+0x40a/0x470:
> do_pte_missing at mm/memory.c:3672
> (inlined by) handle_pte_fault at mm/memory.c:4955
> (inlined by) __handle_mm_fault at mm/memory.c:5095
>
> handle_userfault+0x34d/0xff0:
> mmap_assert_write_locked at include/linux/mmap_lock.h:71
> (inlined by) __is_vma_write_locked at include/linux/mm.h:673
> (inlined by) vma_assert_locked at include/linux/mm.h:714
> (inlined by) assert_fault_locked at include/linux/mm.h:747
> (inlined by) handle_userfault at fs/userfaultfd.c:440
>
> It looks like vma_assert_locked() triggers this warning when the
> mmap_lock is not held in write mode.
>
> It looks to be an easy fix: check that the mmap_lock is held in write
> mode at every other call site BUT the vma_assert_locked() path?

Thanks Liam! Yes, the fix is indeed very simple. I missed the fact
that __is_vma_write_locked() generates an assertion, which should
probably be changed. I believe the same assertion is found by syzbot
here: https://lore.kernel.org/all/0000000000002db68f05ffb791bc@google.com/#t
I'll post a fix shortly.
BTW, this is happening only in mm-unstable, right?
Thanks,
Suren.
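
Continuing the toy model from the first message, and reusing its struct
definitions: one possible shape of "changing" that assertion is to keep
the mmap_lock assert out of the pure predicate, so the per-VMA-lock path
can be checked without dragging it in.  This is only an illustrative
sketch under the same toy-model assumptions; the actual fix posted later
in the thread may look quite different.

/* Pure predicate: no WARN side effect. */
static bool model_is_vma_write_locked_pure(struct vma_model *vma)
{
        return vma->mm->mmap_write_locked;
}

static void model_vma_assert_locked_fixed(struct vma_model *vma)
{
        /* Accept either the per-VMA read lock or the mmap write lock. */
        if (!vma->vma_read_locked && !model_is_vma_write_locked_pure(vma))
                printf("BUG: vma is not locked at all\n");
}

/* Calling model_vma_assert_locked_fixed() from main() above stays silent. */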


>
> Thanks,
> Liam



* Re: lockdep issue with per-vma locking
  2023-07-12 15:15 ` Suren Baghdasaryan
@ 2023-07-12 15:30   ` Liam R. Howlett
  2023-07-12 15:42     ` Suren Baghdasaryan
  0 siblings, 1 reply; 5+ messages in thread
From: Liam R. Howlett @ 2023-07-12 15:30 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: linux-mm, willy, Laurent Dufour, Michel Lespinasse, Jerome Glisse,
	Vlastimil Babka, Paul E. McKenney

* Suren Baghdasaryan <surenb@google.com> [230712 11:15]:
> On Tue, Jul 11, 2023 at 7:26 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> >
> > Suren,
> >
> > When running kselftest mm, I believe I've come across a lockdep issue
> > with the per-vma locking pagefault:
> >
> > [  226.105499] WARNING: CPU: 1 PID: 1907 at include/linux/mmap_lock.h:71 handle_userfault+0x34d/0xff0
> > [  226.106517] Modules linked in:
> > [  226.107060] CPU: 1 PID: 1907 Comm: uffd-unit-tests Not tainted 6.5.0-rc1+ #636
> > [  226.108099] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-5 04/01/2014
> > [  226.109626] RIP: 0010:handle_userfault+0x34d/0xff0
> > [  226.113056] Code: 00 48 85 c0 0f 85 d4 fe ff ff 4c 89 f7 e8 bb 58 ea ff 0f 0b 31 f6 49 8d be a0 01 00 00 e8 0b 8b 53 01 85 c0 0f 85 00 fe ff ff <0f> 0b e9 f9 fd ff ff 49 8d be a0 01 00 00 be ff ff ff ff e8 eb 8a
> > [  226.115798] RSP: 0000:ffff888113a8fbf0 EFLAGS: 00010246
> > [  226.116570] RAX: 0000000000000000 RBX: ffff888113a8fdc8 RCX: 0000000000000001
> > [  226.117630] RDX: 0000000000000000 RSI: ffffffff97a70220 RDI: ffffffff97c316e0
> > [  226.118654] RBP: ffff88811de7c1e0 R08: 0000000000000000 R09: ffffed1022991400
> > [  226.119508] R10: ffff888114c8a003 R11: 0000000000000000 R12: 0000000000000200
> > [  226.120471] R13: ffff88811de7c1f0 R14: ffff888106ebec00 R15: 0000000000001000
> > [  226.121521] FS:  00007f226ec0f740(0000) GS:ffff88836f280000(0000) knlGS:0000000000000000
> > [  226.122543] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  226.123242] CR2: 00007f226ac0f028 CR3: 00000001088a4001 CR4: 0000000000370ee0
> > [  226.124075] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [  226.125073] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [  226.126308] Call Trace:
> > [  226.127473]  <TASK>
> > [  226.128001]  ? __warn+0x9c/0x1f0
> > [  226.129005]  ? handle_userfault+0x34d/0xff0
> > [  226.129940]  ? report_bug+0x1f2/0x220
> > [  226.130700]  ? handle_bug+0x3c/0x70
> > [  226.131234]  ? exc_invalid_op+0x13/0x40
> > [  226.131827]  ? asm_exc_invalid_op+0x16/0x20
> > [  226.132516]  ? handle_userfault+0x34d/0xff0
> > [  226.133193]  ? __pfx_do_raw_spin_lock+0x10/0x10
> > [  226.133862]  ? find_held_lock+0x83/0xa0
> > [  226.134602]  ? do_anonymous_page+0x81f/0x870
> > [  226.135314]  ? __pfx_handle_userfault+0x10/0x10
> > [  226.136226]  ? __pte_offset_map_lock+0xd4/0x160
> > [  226.136958]  ? do_raw_spin_unlock+0x92/0xf0
> > [  226.137547]  ? preempt_count_sub+0xf/0xc0
> > [  226.138011]  ? _raw_spin_unlock+0x24/0x40
> > [  226.138594]  ? do_anonymous_page+0x81f/0x870
> > [  226.139239]  __handle_mm_fault+0x40a/0x470
> > [  226.139749]  ? __pfx___handle_mm_fault+0x10/0x10
> > [  226.140516]  handle_mm_fault+0xe9/0x270
> > [  226.141015]  do_user_addr_fault+0x1a9/0x810
> > [  226.141638]  exc_page_fault+0x58/0xe0
> > [  226.142101]  asm_exc_page_fault+0x22/0x30
> > [  226.142713] RIP: 0033:0x561107c4967e
> > [  226.143391] Code: 48 89 85 18 ff ff ff e9 e2 00 00 00 48 8b 15 49 a0 00 00 48 8b 05 2a a0 00 00 48 0f af 45 f8 48 83 c0 2f 48 01 d0 48 83 e0 f8 <48> 8b 00 48 89 45 c8 48 8b 05 54 a0 00 00 48 8b 55 f8 48 c1 e2 03
> > [  226.145946] RSP: 002b:00007ffee4f22120 EFLAGS: 00010206
> > [  226.146745] RAX: 00007f226ac0f028 RBX: 00007ffee4f22448 RCX: 00007f226eca1bb4
> > [  226.147912] RDX: 00007f226ac0f000 RSI: 0000000000000001 RDI: 0000000000000000
> > [  226.149093] RBP: 00007ffee4f22220 R08: 0000000000000000 R09: 0000000000000000
> > [  226.150218] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000000
> > [  226.151313] R13: 00007ffee4f22458 R14: 0000561107c52dd8 R15: 00007f226ee34020
> > [  226.152464]  </TASK>
> > [  226.152802] irq event stamp: 3177751
> > [  226.153348] hardirqs last  enabled at (3177761): [<ffffffff95d9fa69>] __up_console_sem+0x59/0x80
> > [  226.154679] hardirqs last disabled at (3177772): [<ffffffff95d9fa4e>] __up_console_sem+0x3e/0x80
> > [  226.155998] softirqs last  enabled at (3177676): [<ffffffff95ccea54>] irq_exit_rcu+0x94/0xf0
> > [  226.157364] softirqs last disabled at (3177667): [<ffffffff95ccea54>] irq_exit_rcu+0x94/0xf0
> > [  226.158721] ---[ end trace 0000000000000000 ]---
> >
> >
> > With CONFIG_PER_VMA_LOCK, the per-VMA locking fault path calls
> > handle_mm_fault() in mm/memory.c.  handle_mm_fault() may have an
> > outdated comment, depending on what "mm semaphore" means:
> >
> > * By the time we get here, we already hold the mm semaphore
> >
> > __handle_mm_fault+0x40a/0x470:
> > do_pte_missing at mm/memory.c:3672
> > (inlined by) handle_pte_fault at mm/memory.c:4955
> > (inlined by) __handle_mm_fault at mm/memory.c:5095
> >
> > handle_userfault+0x34d/0xff0:
> > mmap_assert_write_locked at include/linux/mmap_lock.h:71
> > (inlined by) __is_vma_write_locked at include/linux/mm.h:673
> > (inlined by) vma_assert_locked at include/linux/mm.h:714
> > (inlined by) assert_fault_locked at include/linux/mm.h:747
> > (inlined by) handle_userfault at fs/userfaultfd.c:440
> >
> > It looks like vma_assert_locked() triggers this warning when the
> > mmap_lock is not held in write mode.
> >
> > It looks to be an easy fix: check that the mmap_lock is held in write
> > mode at every other call site BUT the vma_assert_locked() path?
> 
> Thanks Liam! Yes, the fix is indeed very simple. I missed the fact
> that __is_vma_write_locked() generates an assertion, which should
> probably be changed. I believe the same assertion is found by syzbot
> here: https://lore.kernel.org/all/0000000000002db68f05ffb791bc@google.com/#t

Yeah, looks the same.  Sorry for the noise.

> I'll post a fix shortly.
> BTW, this is happening only in mm-unstable, right?

Well, I tested it only in mm-unstable...

It came up while I was testing an unrelated fix for another kselftest:mm
test that I broke.


Thanks,
Liam



* Re: lockdep issue with per-vma locking
  2023-07-12 15:30   ` Liam R. Howlett
@ 2023-07-12 15:42     ` Suren Baghdasaryan
  2023-07-12 19:59       ` Suren Baghdasaryan
  0 siblings, 1 reply; 5+ messages in thread
From: Suren Baghdasaryan @ 2023-07-12 15:42 UTC (permalink / raw)
  To: Liam R. Howlett, Suren Baghdasaryan, linux-mm, willy,
	Laurent Dufour, Michel Lespinasse, Jerome Glisse, Vlastimil Babka,
	Paul E. McKenney

On Wed, Jul 12, 2023 at 8:30 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> * Suren Baghdasaryan <surenb@google.com> [230712 11:15]:
> > On Tue, Jul 11, 2023 at 7:26 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > >
> > > Suren,
> > >
> > > When running kselftest mm, I believe I've come across a lockdep issue
> > > with the per-vma locking pagefault:
> > >
> > > [  226.105499] WARNING: CPU: 1 PID: 1907 at include/linux/mmap_lock.h:71 handle_userfault+0x34d/0xff0
> > > [  226.106517] Modules linked in:
> > > [  226.107060] CPU: 1 PID: 1907 Comm: uffd-unit-tests Not tainted 6.5.0-rc1+ #636
> > > [  226.108099] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-5 04/01/2014
> > > [  226.109626] RIP: 0010:handle_userfault+0x34d/0xff0
> > > [  226.113056] Code: 00 48 85 c0 0f 85 d4 fe ff ff 4c 89 f7 e8 bb 58 ea ff 0f 0b 31 f6 49 8d be a0 01 00 00 e8 0b 8b 53 01 85 c0 0f 85 00 fe ff ff <0f> 0b e9 f9 fd ff ff 49 8d be a0 01 00 00 be ff ff ff ff e8 eb 8a
> > > [  226.115798] RSP: 0000:ffff888113a8fbf0 EFLAGS: 00010246
> > > [  226.116570] RAX: 0000000000000000 RBX: ffff888113a8fdc8 RCX: 0000000000000001
> > > [  226.117630] RDX: 0000000000000000 RSI: ffffffff97a70220 RDI: ffffffff97c316e0
> > > [  226.118654] RBP: ffff88811de7c1e0 R08: 0000000000000000 R09: ffffed1022991400
> > > [  226.119508] R10: ffff888114c8a003 R11: 0000000000000000 R12: 0000000000000200
> > > [  226.120471] R13: ffff88811de7c1f0 R14: ffff888106ebec00 R15: 0000000000001000
> > > [  226.121521] FS:  00007f226ec0f740(0000) GS:ffff88836f280000(0000) knlGS:0000000000000000
> > > [  226.122543] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [  226.123242] CR2: 00007f226ac0f028 CR3: 00000001088a4001 CR4: 0000000000370ee0
> > > [  226.124075] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > [  226.125073] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > [  226.126308] Call Trace:
> > > [  226.127473]  <TASK>
> > > [  226.128001]  ? __warn+0x9c/0x1f0
> > > [  226.129005]  ? handle_userfault+0x34d/0xff0
> > > [  226.129940]  ? report_bug+0x1f2/0x220
> > > [  226.130700]  ? handle_bug+0x3c/0x70
> > > [  226.131234]  ? exc_invalid_op+0x13/0x40
> > > [  226.131827]  ? asm_exc_invalid_op+0x16/0x20
> > > [  226.132516]  ? handle_userfault+0x34d/0xff0
> > > [  226.133193]  ? __pfx_do_raw_spin_lock+0x10/0x10
> > > [  226.133862]  ? find_held_lock+0x83/0xa0
> > > [  226.134602]  ? do_anonymous_page+0x81f/0x870
> > > [  226.135314]  ? __pfx_handle_userfault+0x10/0x10
> > > [  226.136226]  ? __pte_offset_map_lock+0xd4/0x160
> > > [  226.136958]  ? do_raw_spin_unlock+0x92/0xf0
> > > [  226.137547]  ? preempt_count_sub+0xf/0xc0
> > > [  226.138011]  ? _raw_spin_unlock+0x24/0x40
> > > [  226.138594]  ? do_anonymous_page+0x81f/0x870
> > > [  226.139239]  __handle_mm_fault+0x40a/0x470
> > > [  226.139749]  ? __pfx___handle_mm_fault+0x10/0x10
> > > [  226.140516]  handle_mm_fault+0xe9/0x270
> > > [  226.141015]  do_user_addr_fault+0x1a9/0x810
> > > [  226.141638]  exc_page_fault+0x58/0xe0
> > > [  226.142101]  asm_exc_page_fault+0x22/0x30
> > > [  226.142713] RIP: 0033:0x561107c4967e
> > > [  226.143391] Code: 48 89 85 18 ff ff ff e9 e2 00 00 00 48 8b 15 49 a0 00 00 48 8b 05 2a a0 00 00 48 0f af 45 f8 48 83 c0 2f 48 01 d0 48 83 e0 f8 <48> 8b 00 48 89 45 c8 48 8b 05 54 a0 00 00 48 8b 55 f8 48 c1 e2 03
> > > [  226.145946] RSP: 002b:00007ffee4f22120 EFLAGS: 00010206
> > > [  226.146745] RAX: 00007f226ac0f028 RBX: 00007ffee4f22448 RCX: 00007f226eca1bb4
> > > [  226.147912] RDX: 00007f226ac0f000 RSI: 0000000000000001 RDI: 0000000000000000
> > > [  226.149093] RBP: 00007ffee4f22220 R08: 0000000000000000 R09: 0000000000000000
> > > [  226.150218] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000000
> > > [  226.151313] R13: 00007ffee4f22458 R14: 0000561107c52dd8 R15: 00007f226ee34020
> > > [  226.152464]  </TASK>
> > > [  226.152802] irq event stamp: 3177751
> > > [  226.153348] hardirqs last  enabled at (3177761): [<ffffffff95d9fa69>] __up_console_sem+0x59/0x80
> > > [  226.154679] hardirqs last disabled at (3177772): [<ffffffff95d9fa4e>] __up_console_sem+0x3e/0x80
> > > [  226.155998] softirqs last  enabled at (3177676): [<ffffffff95ccea54>] irq_exit_rcu+0x94/0xf0
> > > [  226.157364] softirqs last disabled at (3177667): [<ffffffff95ccea54>] irq_exit_rcu+0x94/0xf0
> > > [  226.158721] ---[ end trace 0000000000000000 ]---
> > >
> > >
> > > With CONFIG_PER_VMA_LOCK, the per-VMA locking fault path calls
> > > handle_mm_fault() in mm/memory.c.  handle_mm_fault() may have an
> > > outdated comment, depending on what "mm semaphore" means:
> > >
> > > * By the time we get here, we already hold the mm semaphore
> > >
> > > __handle_mm_fault+0x40a/0x470:
> > > do_pte_missing at mm/memory.c:3672
> > > (inlined by) handle_pte_fault at mm/memory.c:4955
> > > (inlined by) __handle_mm_fault at mm/memory.c:5095
> > >
> > > handle_userfault+0x34d/0xff0:
> > > mmap_assert_write_locked at include/linux/mmap_lock.h:71
> > > (inlined by) __is_vma_write_locked at include/linux/mm.h:673
> > > (inlined by) vma_assert_locked at include/linux/mm.h:714
> > > (inlined by) assert_fault_locked at include/linux/mm.h:747
> > > (inlined by) handle_userfault at fs/userfaultfd.c:440
> > >
> > > It looks like vma_assert_locked() triggers this warning when the
> > > mmap_lock is not held in write mode.
> > >
> > > It looks to be an easy fix: check that the mmap_lock is held in write
> > > mode at every other call site BUT the vma_assert_locked() path?
> >
> > Thanks Liam! Yes, the fix is indeed very simple. I missed the fact
> > that __is_vma_write_locked() generates an assertion, which should
> > probably be changed. I believe the same assertion is found by syzbot
> > here: https://lore.kernel.org/all/0000000000002db68f05ffb791bc@google.com/#t
>
> Yeah, looks the same.  Sorry for the noise.

Not at all! I wouldn't have looked for it if you hadn't reported it :)
Thanks!

>
> > I'll post a fix shortly.
> > BTW, this is happening only in mm-unstable, right?
>
> Well, I tested it only in mm-unstable...
>
> It came up while I was testing an unrelated fix for another kselftest:mm
> test that I broke.
>
>
> Thanks,
> Liam



* Re: lockdep issue with per-vma locking
  2023-07-12 15:42     ` Suren Baghdasaryan
@ 2023-07-12 19:59       ` Suren Baghdasaryan
  0 siblings, 0 replies; 5+ messages in thread
From: Suren Baghdasaryan @ 2023-07-12 19:59 UTC (permalink / raw)
  To: Liam R. Howlett, Suren Baghdasaryan, linux-mm, willy,
	Laurent Dufour, Michel Lespinasse, Jerome Glisse, Vlastimil Babka,
	Paul E. McKenney

On Wed, Jul 12, 2023 at 8:42 AM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Wed, Jul 12, 2023 at 8:30 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> >
> > * Suren Baghdasaryan <surenb@google.com> [230712 11:15]:
> > > On Tue, Jul 11, 2023 at 7:26 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > > >
> > > > Suren,
> > > >
> > > > When running kselftest mm, I believe I've come across a lockdep issue
> > > > with the per-vma locking pagefault:
> > > >
> > > > [  226.105499] WARNING: CPU: 1 PID: 1907 at include/linux/mmap_lock.h:71 handle_userfault+0x34d/0xff0
> > > > [  226.106517] Modules linked in:
> > > > [  226.107060] CPU: 1 PID: 1907 Comm: uffd-unit-tests Not tainted 6.5.0-rc1+ #636
> > > > [  226.108099] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-5 04/01/2014
> > > > [  226.109626] RIP: 0010:handle_userfault+0x34d/0xff0
> > > > [  226.113056] Code: 00 48 85 c0 0f 85 d4 fe ff ff 4c 89 f7 e8 bb 58 ea ff 0f 0b 31 f6 49 8d be a0 01 00 00 e8 0b 8b 53 01 85 c0 0f 85 00 fe ff ff <0f> 0b e9 f9 fd ff ff 49 8d be a0 01 00 00 be ff ff ff ff e8 eb 8a
> > > > [  226.115798] RSP: 0000:ffff888113a8fbf0 EFLAGS: 00010246
> > > > [  226.116570] RAX: 0000000000000000 RBX: ffff888113a8fdc8 RCX: 0000000000000001
> > > > [  226.117630] RDX: 0000000000000000 RSI: ffffffff97a70220 RDI: ffffffff97c316e0
> > > > [  226.118654] RBP: ffff88811de7c1e0 R08: 0000000000000000 R09: ffffed1022991400
> > > > [  226.119508] R10: ffff888114c8a003 R11: 0000000000000000 R12: 0000000000000200
> > > > [  226.120471] R13: ffff88811de7c1f0 R14: ffff888106ebec00 R15: 0000000000001000
> > > > [  226.121521] FS:  00007f226ec0f740(0000) GS:ffff88836f280000(0000) knlGS:0000000000000000
> > > > [  226.122543] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [  226.123242] CR2: 00007f226ac0f028 CR3: 00000001088a4001 CR4: 0000000000370ee0
> > > > [  226.124075] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > [  226.125073] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > [  226.126308] Call Trace:
> > > > [  226.127473]  <TASK>
> > > > [  226.128001]  ? __warn+0x9c/0x1f0
> > > > [  226.129005]  ? handle_userfault+0x34d/0xff0
> > > > [  226.129940]  ? report_bug+0x1f2/0x220
> > > > [  226.130700]  ? handle_bug+0x3c/0x70
> > > > [  226.131234]  ? exc_invalid_op+0x13/0x40
> > > > [  226.131827]  ? asm_exc_invalid_op+0x16/0x20
> > > > [  226.132516]  ? handle_userfault+0x34d/0xff0
> > > > [  226.133193]  ? __pfx_do_raw_spin_lock+0x10/0x10
> > > > [  226.133862]  ? find_held_lock+0x83/0xa0
> > > > [  226.134602]  ? do_anonymous_page+0x81f/0x870
> > > > [  226.135314]  ? __pfx_handle_userfault+0x10/0x10
> > > > [  226.136226]  ? __pte_offset_map_lock+0xd4/0x160
> > > > [  226.136958]  ? do_raw_spin_unlock+0x92/0xf0
> > > > [  226.137547]  ? preempt_count_sub+0xf/0xc0
> > > > [  226.138011]  ? _raw_spin_unlock+0x24/0x40
> > > > [  226.138594]  ? do_anonymous_page+0x81f/0x870
> > > > [  226.139239]  __handle_mm_fault+0x40a/0x470
> > > > [  226.139749]  ? __pfx___handle_mm_fault+0x10/0x10
> > > > [  226.140516]  handle_mm_fault+0xe9/0x270
> > > > [  226.141015]  do_user_addr_fault+0x1a9/0x810
> > > > [  226.141638]  exc_page_fault+0x58/0xe0
> > > > [  226.142101]  asm_exc_page_fault+0x22/0x30
> > > > [  226.142713] RIP: 0033:0x561107c4967e
> > > > [  226.143391] Code: 48 89 85 18 ff ff ff e9 e2 00 00 00 48 8b 15 49 a0 00 00 48 8b 05 2a a0 00 00 48 0f af 45 f8 48 83 c0 2f 48 01 d0 48 83 e0 f8 <48> 8b 00 48 89 45 c8 48 8b 05 54 a0 00 00 48 8b 55 f8 48 c1 e2 03
> > > > [  226.145946] RSP: 002b:00007ffee4f22120 EFLAGS: 00010206
> > > > [  226.146745] RAX: 00007f226ac0f028 RBX: 00007ffee4f22448 RCX: 00007f226eca1bb4
> > > > [  226.147912] RDX: 00007f226ac0f000 RSI: 0000000000000001 RDI: 0000000000000000
> > > > [  226.149093] RBP: 00007ffee4f22220 R08: 0000000000000000 R09: 0000000000000000
> > > > [  226.150218] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000000
> > > > [  226.151313] R13: 00007ffee4f22458 R14: 0000561107c52dd8 R15: 00007f226ee34020
> > > > [  226.152464]  </TASK>
> > > > [  226.152802] irq event stamp: 3177751
> > > > [  226.153348] hardirqs last  enabled at (3177761): [<ffffffff95d9fa69>] __up_console_sem+0x59/0x80
> > > > [  226.154679] hardirqs last disabled at (3177772): [<ffffffff95d9fa4e>] __up_console_sem+0x3e/0x80
> > > > [  226.155998] softirqs last  enabled at (3177676): [<ffffffff95ccea54>] irq_exit_rcu+0x94/0xf0
> > > > [  226.157364] softirqs last disabled at (3177667): [<ffffffff95ccea54>] irq_exit_rcu+0x94/0xf0
> > > > [  226.158721] ---[ end trace 0000000000000000 ]---
> > > >
> > > >
> > > > With CONFIG_PER_VMA_LOCK, the per-VMA locking fault path calls
> > > > handle_mm_fault() in mm/memory.c.  handle_mm_fault() may have an
> > > > outdated comment, depending on what "mm semaphore" means:
> > > >
> > > > * By the time we get here, we already hold the mm semaphore
> > > >
> > > > __handle_mm_fault+0x40a/0x470:
> > > > do_pte_missing at mm/memory.c:3672
> > > > (inlined by) handle_pte_fault at mm/memory.c:4955
> > > > (inlined by) __handle_mm_fault at mm/memory.c:5095
> > > >
> > > > handle_userfault+0x34d/0xff0:
> > > > mmap_assert_write_locked at include/linux/mmap_lock.h:71
> > > > (inlined by) __is_vma_write_locked at include/linux/mm.h:673
> > > > (inlined by) vma_assert_locked at include/linux/mm.h:714
> > > > (inlined by) assert_fault_locked at include/linux/mm.h:747
> > > > (inlined by) handle_userfault at fs/userfaultfd.c:440
> > > >
> > > > It looks like vma_assert_locked() triggers this warning when the
> > > > mmap_lock is not held in write mode.
> > > >
> > > > It looks to be an easy fix: check that the mmap_lock is held in write
> > > > mode at every other call site BUT the vma_assert_locked() path?
> > >
> > > Thanks Liam! Yes, the fix is indeed very simple. I missed the fact
> > > that __is_vma_write_locked() generates an assertion, which should
> > > probably be changed. I believe the same assertion is found by syzbot
> > > here: https://lore.kernel.org/all/0000000000002db68f05ffb791bc@google.com/#t
> >
> > Yeah, looks the same.  Sorry for the noise.
>
> Not at all! I wouldn't have looked for it if you hadn't reported it :)
> Thanks!
>
> >
> > > I'll post a fix shortly.
> > > BTW, this is happening only in mm-unstable, right?
> >
> > Well, I tested it only in mm-unstable...

https://lore.kernel.org/all/20230712195652.969194-1-surenb@google.com/
should fix the issue. I was able to reproduce the warning and did not
see it after applying the fix.

> >
> > It came up while I was testing an unrelated fix for another kselftest:mm
> > test that I broke.
> >
> >
> > Thanks,
> > Liam


