[PATCH] mm: Fix mmap_assert_locked() in follow

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
@ 2024-07-11  5:13 Pei Li
  2024-07-11 21:22 ` Andrew Morton
  2024-07-11 21:33 ` David Hildenbrand
  0 siblings, 2 replies; 9+ messages in thread
From: Pei Li @ 2024-07-11  5:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, skhan, syzkaller-bugs,
	linux-kernel-mentees, syzbot+35a4414f6e247f515443, Pei Li

This patch fixes this warning by acquiring read lock before entering
untrack_pfn() while write lock is not held.

syzbot has tested the proposed patch and the reproducer did not
trigger any issue.

Reported-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=35a4414f6e247f515443
Tested-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
Signed-off-by: Pei Li <peili.dev@gmail.com>
---
Syzbot reported the following warning in follow_pte():

WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 rwsem_assert_held include/linux/rwsem.h:195 [inline]
WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 mmap_assert_locked include/linux/mmap_lock.h:65 [inline]
WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 follow_pte+0x414/0x4c0 mm/memory.c:5980

This is because we are assuming that mm->mmap_lock should be held when
entering follow_pte(). This is added in commit c5541ba378e3 (mm:
follow_pte() improvements).

However, in the following call stack, we are not acquring the lock:
 follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
 get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
 untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
 unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
 zap_page_range_single+0x326/0x560 mm/memory.c:1920

In zap_page_range_single(), we passed mm_wr_locked as false, as we do
not expect write lock to be held.
In the special case where vma->vm_flags is set as VM_PFNMAP, we are
hitting untrack_pfn() which eventually calls into follow_phys.

This patch fixes this warning by acquiring read lock before entering
untrack_pfn() while write lock is not held. 

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Tested on:

commit:         9d9a2f29 Merge tag 'mm-hotfixes-stable-2024-07-10-13-1..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13be8021980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=3456bae478301dc8
dashboard link: https://syzkaller.appspot.com/bug?extid=35a4414f6e247f515443
compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=145e3441980000

Note: testing is done by a robot and is best-effort only.
---
 mm/memory.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index d10e616d7389..75d7959b835b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1815,9 +1815,16 @@ static void unmap_single_vma(struct mmu_gather *tlb,
 	if (vma->vm_file)
 		uprobe_munmap(vma, start, end);

-	if (unlikely(vma->vm_flags & VM_PFNMAP))
+	if (unlikely(vma->vm_flags & VM_PFNMAP)) {
+		if (!mm_wr_locked)
+			mmap_read_lock(vma->vm_mm);
+
 		untrack_pfn(vma, 0, 0, mm_wr_locked);

+		if (!mm_wr_locked)
+			mmap_read_unlock(vma->vm_mm);
+	}
+
 	if (start != end) {
 		if (unlikely(is_vm_hugetlb_page(vma))) {
 			/*

---
base-commit: 734610514cb0234763cc97ddbd235b7981889445
change-id: 20240710-bug12-fecfe4bb0dcb

Best regards,
-- 
Pei Li <peili.dev@gmail.com>

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
  2024-07-11  5:13 [PATCH] mm: Fix mmap_assert_locked() in follow_pte() Pei Li
@ 2024-07-11 21:22 ` Andrew Morton
  2024-07-11 21:33 ` David Hildenbrand
  1 sibling, 0 replies; 9+ messages in thread
From: Andrew Morton @ 2024-07-11 21:22 UTC (permalink / raw)
  To: Pei Li
  Cc: linux-mm, linux-kernel, skhan, syzkaller-bugs,
	linux-kernel-mentees, syzbot+35a4414f6e247f515443,
	David Hildenbrand

On Wed, 10 Jul 2024 22:13:17 -0700 Pei Li <peili.dev@gmail.com> wrote:

> This patch fixes this warning by acquiring read lock before entering
> untrack_pfn() while write lock is not held.
> 
> syzbot has tested the proposed patch and the reproducer did not
> trigger any issue.
> 

Thanks.

> ---
> Syzbot reported the following warning in follow_pte():
> 
> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 rwsem_assert_held include/linux/rwsem.h:195 [inline]
> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 mmap_assert_locked include/linux/mmap_lock.h:65 [inline]
> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 follow_pte+0x414/0x4c0 mm/memory.c:5980
> 
> This is because we are assuming that mm->mmap_lock should be held when
> entering follow_pte(). This is added in commit c5541ba378e3 (mm:
> follow_pte() improvements).
> 
> However, in the following call stack, we are not acquring the lock:
>  follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
>  get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
>  untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
>  unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
>  zap_page_range_single+0x326/0x560 mm/memory.c:1920
> 
> In zap_page_range_single(), we passed mm_wr_locked as false, as we do
> not expect write lock to be held.
> In the special case where vma->vm_flags is set as VM_PFNMAP, we are
> hitting untrack_pfn() which eventually calls into follow_phys.

I included the above (very relevant) info in the changelog.

And I added 

	Fixes: c5541ba378e3 ("mm: follow_pte() improvements")

and queued the patch for 6.10-rc7.  Hopefully David can review it for us.




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
  2024-07-11  5:13 [PATCH] mm: Fix mmap_assert_locked() in follow_pte() Pei Li
  2024-07-11 21:22 ` Andrew Morton
@ 2024-07-11 21:33 ` David Hildenbrand
  2024-07-11 21:45   ` David Hildenbrand
  2024-07-12  8:04   ` Sergey Senozhatsky
  1 sibling, 2 replies; 9+ messages in thread
From: David Hildenbrand @ 2024-07-11 21:33 UTC (permalink / raw)
  To: Pei Li, Andrew Morton
  Cc: linux-mm, linux-kernel, skhan, syzkaller-bugs,
	linux-kernel-mentees, syzbot+35a4414f6e247f515443

On 11.07.24 07:13, Pei Li wrote:
> This patch fixes this warning by acquiring read lock before entering
> untrack_pfn() while write lock is not held.
> 
> syzbot has tested the proposed patch and the reproducer did not
> trigger any issue.
> 
> Reported-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=35a4414f6e247f515443
> Tested-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
> Signed-off-by: Pei Li <peili.dev@gmail.com>
> ---
> Syzbot reported the following warning in follow_pte():
> 
> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 rwsem_assert_held include/linux/rwsem.h:195 [inline]
> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 mmap_assert_locked include/linux/mmap_lock.h:65 [inline]
> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 follow_pte+0x414/0x4c0 mm/memory.c:5980
> 
> This is because we are assuming that mm->mmap_lock should be held when
> entering follow_pte(). This is added in commit c5541ba378e3 (mm:
> follow_pte() improvements).
> 
> However, in the following call stack, we are not acquring the lock:
>   follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
>   get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
>   untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
>   unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
>   zap_page_range_single+0x326/0x560 mm/memory.c:1920

That implies that unmap_vmas() is called without the mmap lock in read 
mode, correct?

Do we know how this happens?

* exit_mmap() holds the mmap lock in read mode
* unmap_region is documented to hold the mmap lock in read mode

> 
> In zap_page_range_single(), we passed mm_wr_locked as false, as we do
> not expect write lock to be held.
> In the special case where vma->vm_flags is set as VM_PFNMAP, we are
> hitting untrack_pfn() which eventually calls into follow_phys.
> 
> This patch fixes this warning by acquiring read lock before entering
> untrack_pfn() while write lock is not held.
> 
> syzbot has tested the proposed patch and the reproducer did not trigger any issue:
> 
> Tested on:
> 
> commit:         9d9a2f29 Merge tag 'mm-hotfixes-stable-2024-07-10-13-1..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13be8021980000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=3456bae478301dc8
> dashboard link: https://syzkaller.appspot.com/bug?extid=35a4414f6e247f515443
> compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> patch:          https://syzkaller.appspot.com/x/patch.diff?x=145e3441980000
> 
> Note: testing is done by a robot and is best-effort only.
> ---
>   mm/memory.c | 9 ++++++++-
>   1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index d10e616d7389..75d7959b835b 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1815,9 +1815,16 @@ static void unmap_single_vma(struct mmu_gather *tlb,
>   	if (vma->vm_file)
>   		uprobe_munmap(vma, start, end);
>   
> -	if (unlikely(vma->vm_flags & VM_PFNMAP))
> +	if (unlikely(vma->vm_flags & VM_PFNMAP)) {
> +		if (!mm_wr_locked)
> +			mmap_read_lock(vma->vm_mm);
> +
>   		untrack_pfn(vma, 0, 0, mm_wr_locked);
>   
> +		if (!mm_wr_locked)
> +			mmap_read_unlock(vma->vm_mm);
> +	}
> +
>   	if (start != end) {
>   		if (unlikely(is_vm_hugetlb_page(vma))) {

I'm not sure if this is the right fix. I like to understand how we end 
up without the mmap lock at least in read mode in that path?

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
  2024-07-11 21:33 ` David Hildenbrand
@ 2024-07-11 21:45   ` David Hildenbrand
  2024-07-11 21:51     ` David Hildenbrand
  2024-07-12  8:04   ` Sergey Senozhatsky
  1 sibling, 1 reply; 9+ messages in thread
From: David Hildenbrand @ 2024-07-11 21:45 UTC (permalink / raw)
  To: Pei Li, Andrew Morton
  Cc: linux-mm, linux-kernel, skhan, syzkaller-bugs,
	linux-kernel-mentees, syzbot+35a4414f6e247f515443, Peter Xu

On 11.07.24 23:33, David Hildenbrand wrote:
> On 11.07.24 07:13, Pei Li wrote:
>> This patch fixes this warning by acquiring read lock before entering
>> untrack_pfn() while write lock is not held.
>>
>> syzbot has tested the proposed patch and the reproducer did not
>> trigger any issue.
>>
>> Reported-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
>> Closes: https://syzkaller.appspot.com/bug?extid=35a4414f6e247f515443
>> Tested-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
>> Signed-off-by: Pei Li <peili.dev@gmail.com>
>> ---
>> Syzbot reported the following warning in follow_pte():
>>
>> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 rwsem_assert_held include/linux/rwsem.h:195 [inline]
>> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 mmap_assert_locked include/linux/mmap_lock.h:65 [inline]
>> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 follow_pte+0x414/0x4c0 mm/memory.c:5980
>>
>> This is because we are assuming that mm->mmap_lock should be held when
>> entering follow_pte(). This is added in commit c5541ba378e3 (mm:
>> follow_pte() improvements).
>>
>> However, in the following call stack, we are not acquring the lock:
>>    follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
>>    get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
>>    untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
>>    unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
>>    zap_page_range_single+0x326/0x560 mm/memory.c:1920
> 
> That implies that unmap_vmas() is called without the mmap lock in read
> mode, correct?
> 
> Do we know how this happens?
> 
> * exit_mmap() holds the mmap lock in read mode
> * unmap_region is documented to hold the mmap lock in read mode

I think this is it (missed the call from zap_page_range_single()):

  follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
  get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
  untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
  unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
  zap_page_range_single+0x326/0x560 mm/memory.c:1920
  unmap_mapping_range_vma mm/memory.c:3684 [inline]
  unmap_mapping_range_tree mm/memory.c:3701 [inline]
  unmap_mapping_pages mm/memory.c:3767 [inline]
  unmap_mapping_range+0x1ee/0x280 mm/memory.c:3804
  truncate_pagecache+0x53/0x90 mm/truncate.c:731
  simple_setattr+0xf2/0x120 fs/libfs.c:886
  notify_change+0xec6/0x11f0 fs/attr.c:499
  do_truncate+0x15c/0x220 fs/open.c:65
  handle_truncate fs/namei.c:3308 [inline]

I think Peter recently questioned whether untrack_pfn() should be even 
called from the place, but I might misremember things.

Fix should work (I suspect we are not violating some locking rules?), 
PFNMAP should not happen there too often that we really care.

If everything fails, we could drop the assert, but I'm hoping we can 
avoid that. We really want most users of follow_pte() to do the right thing.

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
  2024-07-11 21:45   ` David Hildenbrand
@ 2024-07-11 21:51     ` David Hildenbrand
  0 siblings, 0 replies; 9+ messages in thread
From: David Hildenbrand @ 2024-07-11 21:51 UTC (permalink / raw)
  To: Pei Li, Andrew Morton
  Cc: linux-mm, linux-kernel, skhan, syzkaller-bugs,
	linux-kernel-mentees, syzbot+35a4414f6e247f515443, Peter Xu

On 11.07.24 23:45, David Hildenbrand wrote:
> On 11.07.24 23:33, David Hildenbrand wrote:
>> On 11.07.24 07:13, Pei Li wrote:
>>> This patch fixes this warning by acquiring read lock before entering
>>> untrack_pfn() while write lock is not held.
>>>
>>> syzbot has tested the proposed patch and the reproducer did not
>>> trigger any issue.
>>>
>>> Reported-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
>>> Closes: https://syzkaller.appspot.com/bug?extid=35a4414f6e247f515443
>>> Tested-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
>>> Signed-off-by: Pei Li <peili.dev@gmail.com>
>>> ---
>>> Syzbot reported the following warning in follow_pte():
>>>
>>> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 rwsem_assert_held include/linux/rwsem.h:195 [inline]
>>> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 mmap_assert_locked include/linux/mmap_lock.h:65 [inline]
>>> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 follow_pte+0x414/0x4c0 mm/memory.c:5980
>>>
>>> This is because we are assuming that mm->mmap_lock should be held when
>>> entering follow_pte(). This is added in commit c5541ba378e3 (mm:
>>> follow_pte() improvements).
>>>
>>> However, in the following call stack, we are not acquring the lock:
>>>     follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
>>>     get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
>>>     untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
>>>     unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
>>>     zap_page_range_single+0x326/0x560 mm/memory.c:1920
>>
>> That implies that unmap_vmas() is called without the mmap lock in read
>> mode, correct?
>>
>> Do we know how this happens?
>>
>> * exit_mmap() holds the mmap lock in read mode
>> * unmap_region is documented to hold the mmap lock in read mode
> 
> I think this is it (missed the call from zap_page_range_single()):
> 
>    follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
>    get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
>    untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
>    unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
>    zap_page_range_single+0x326/0x560 mm/memory.c:1920
>    unmap_mapping_range_vma mm/memory.c:3684 [inline]
>    unmap_mapping_range_tree mm/memory.c:3701 [inline]
>    unmap_mapping_pages mm/memory.c:3767 [inline]
>    unmap_mapping_range+0x1ee/0x280 mm/memory.c:3804
>    truncate_pagecache+0x53/0x90 mm/truncate.c:731
>    simple_setattr+0xf2/0x120 fs/libfs.c:886
>    notify_change+0xec6/0x11f0 fs/attr.c:499
>    do_truncate+0x15c/0x220 fs/open.c:65
>    handle_truncate fs/namei.c:3308 [inline]
> 
> I think Peter recently questioned whether untrack_pfn() should be even
> called from the place, but I might misremember things.
> 
> Fix should work (I suspect we are not violating some locking rules?),
> PFNMAP should not happen there too often that we really care.

... thinking again, likely we reach this point with "!mm_wr_locked" and 
the mmap lock already held in read mode. So I suspect the fix won't work 
as is.

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
  2024-07-11 21:33 ` David Hildenbrand
  2024-07-11 21:45   ` David Hildenbrand
@ 2024-07-12  8:04   ` Sergey Senozhatsky
  1 sibling, 0 replies; 9+ messages in thread
From: Sergey Senozhatsky @ 2024-07-12  8:04 UTC (permalink / raw)
  To: Pei Li, Andrew Morton, David Hildenbrand
  Cc: linux-mm, linux-kernel, skhan, syzkaller-bugs,
	linux-kernel-mentees, syzbot+35a4414f6e247f515443

On (24/07/11 23:33), David Hildenbrand wrote:
[..]
> > @@ -1815,9 +1815,16 @@ static void unmap_single_vma(struct mmu_gather *tlb,
> >   	if (vma->vm_file)
> >   		uprobe_munmap(vma, start, end);
> > -	if (unlikely(vma->vm_flags & VM_PFNMAP))
> > +	if (unlikely(vma->vm_flags & VM_PFNMAP)) {
> > +		if (!mm_wr_locked)
> > +			mmap_read_lock(vma->vm_mm);
> > +
> >   		untrack_pfn(vma, 0, 0, mm_wr_locked);
> > +		if (!mm_wr_locked)
> > +			mmap_read_unlock(vma->vm_mm);
> > +	}
> > +
> >   	if (start != end) {
> >   		if (unlikely(is_vm_hugetlb_page(vma))) {
>
> I'm not sure if this is the right fix. I like to understand how we end up
> without the mmap lock at least in read mode in that path?

I suspect this is causing a deadlock:

[   10.263161] ============================================
[   10.263165] WARNING: possible recursive locking detected
[   10.263170] 6.10.0-rc7-next-20240712+ #645 Tainted: G                 N
[   10.263177] --------------------------------------------
[   10.263179] (direxec)/166 is trying to acquire lock:
[   10.263184] ffff88810b4f0198 (&mm->mmap_lock){++++}-{3:3}, at: mmap_read_lock+0x12/0x40
[   10.263217]
[   10.263217] but task is already holding lock:
[   10.263219] ffff88810b4f0198 (&mm->mmap_lock){++++}-{3:3}, at: exit_mmap+0x9c/0x830
[   10.263238]
[   10.263238] other info that might help us debug this:
[   10.263241]  Possible unsafe locking scenario:
[   10.263241]
[   10.263243]        CPU0
[   10.263245]        ----
[   10.263247]   lock(&mm->mmap_lock);
[   10.263252]   lock(&mm->mmap_lock);
[   10.263257]
[   10.263257]  *** DEADLOCK ***
[   10.263257]
[   10.263259]  May be due to missing lock nesting notation
[   10.263259]
[   10.263262] 3 locks held by (direxec)/166:
[   10.263267]  #0: ffff88810b4e8548 (&sig->cred_guard_mutex){+.+.}-{3:3}, at: bprm_execve+0x70/0x1110
[   10.263286]  #1: ffff88810b4e85e0 (&sig->exec_update_lock){+.+.}-{3:3}, at: exec_mmap+0x9f/0x510
[   10.263302]  #2: ffff88810b4f0198 (&mm->mmap_lock){++++}-{3:3}, at: exit_mmap+0x9c/0x830
[   10.263318]
[   10.263318] stack backtrace:
[   10.263329] CPU: 6 UID: 0 PID: 166 Comm: (direxec) Tainted: G                 N 6.10.0-rc7-next-20240712+ #645
[   10.263340] Tainted: [N]=TEST
[   10.263349] Call Trace:
[   10.263355]  <TASK>
[   10.263360]  dump_stack_lvl+0xa3/0xeb
[   10.263375]  print_deadlock_bug+0x4d5/0x680
[   10.263387]  __lock_acquire+0x65fb/0x7830
[   10.263408]  ? lock_is_held_type+0xdd/0x150
[   10.263425]  lock_acquire+0x14c/0x3e0
[   10.263433]  ? mmap_read_lock+0x12/0x40
[   10.263445]  ? lock_is_held_type+0xdd/0x150
[   10.263454]  down_read+0x58/0x9a0
[   10.263461]  ? mmap_read_lock+0x12/0x40
[   10.263476]  mmap_read_lock+0x12/0x40
[   10.263485]  unmap_single_vma+0x1bf/0x240
[   10.263497]  unmap_vmas+0x146/0x1c0
[   10.263511]  exit_mmap+0x13d/0x830
[   10.263533]  __mmput+0xc2/0x2c0
[   10.263556]  exec_mmap+0x4cb/0x510
[   10.263580]  begin_new_exec+0xfe6/0x1ba0
[   10.263612]  load_elf_binary+0x797/0x22a0
[   10.263637]  ? load_misc_binary+0x53a/0x930
[   10.263656]  ? lock_release+0x50f/0x830
[   10.263673]  ? bprm_execve+0x6d7/0x1110
[   10.263693]  bprm_execve+0x70d/0x1110
[   10.263730]  do_execveat_common+0x44b/0x600
[   10.263745]  __x64_sys_execve+0x8e/0xa0
[   10.263754]  do_syscall_64+0x71/0x110
[   10.263764]  entry_SYSCALL_64_after_hwframe+0x4b/0x53


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
@ 2024-07-12 12:17 Bert Karwatzki
  0 siblings, 0 replies; 9+ messages in thread
From: Bert Karwatzki @ 2024-07-12 12:17 UTC (permalink / raw)
  To: Pei Li
  Cc: Bert Karwatzki, Andrew Morton, Shuah Khan, David Hildenbrand,
	linux-mm, linux-kernel, linux-next, syzbot+35a4414f6e247f515443,
	syzkaller-bugs, linux-kernel-mentees, senozhatsky

This is causing a deadlock for me, too. Since linux-next-20240712 I observe
a hang when starting the gui. I bisected this to commit a13252049629 and
reverting this commit in linux-next-20240712 fixes the issue for me.
I do not have any log messages to prove the dead lock, though, as I compiled
everything without CONFIG_LOCKDEP.

Bert Karwatzki

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
@ 2024-07-12 12:43 Bert Karwatzki
  0 siblings, 0 replies; 9+ messages in thread
From: Bert Karwatzki @ 2024-07-12 12:43 UTC (permalink / raw)
  To: Pei Li
  Cc: Bert Karwatzki, Andrew Morton, Shuah Khan, David Hildenbrand,
	linux-mm, linux-kernel, linux-next, syzbot+35a4414f6e247f515443,
	syzkaller-bugs, linux-kernel-mentees, senozhatsky

diff --git a/mm/memory.c b/mm/memory.c
index 282203363177..2f4b4322ec0e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1817,6 +1817,7 @@ static void unmap_single_vma(struct mmu_gather *tlb,
 {
 	unsigned long start = max(vma->vm_start, start_addr);
 	unsigned long end;
+	bool mm_read_locked;

 	if (start >= vma->vm_end)
 		return;
@@ -1829,11 +1830,11 @@ static void unmap_single_vma(struct mmu_gather *tlb,

 	if (unlikely(vma->vm_flags & VM_PFNMAP)) {
 		if (!mm_wr_locked)
-			mmap_read_lock(vma->vm_mm);
+			mm_read_locked = !mmap_read_trylock(vma->vm_mm);

 		untrack_pfn(vma, 0, 0, mm_wr_locked);

-		if (!mm_wr_locked)
+		if (!mm_wr_locked && !mm_read_locked)
 			mmap_read_unlock(vma->vm_mm);
 	}


This seems to fix the issue without completely removing the locking.

Bert Karwatzki


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
       [not found] <ZpBhCHsG39wIVnQN@x1n>
@ 2024-07-12 13:19 ` David Wang
  0 siblings, 0 replies; 9+ messages in thread
From: David Wang @ 2024-07-12 13:19 UTC (permalink / raw)
  To: peterx
  Cc: akpm, david, linux-kernel-mentees, linux-kernel, linux-mm,
	peili.dev, skhan, syzbot+35a4414f6e247f515443, syzkaller-bugs

Hi,

> Ah yes, I had one rfc patch for that, I temporarily put that aside as it
> seemed nobody cared except myself.. it's here:
> 
> https://lore.kernel.org/all/20240523223745.395337-2-peterx@redhat.com
> 
> I didn't know it can already cause real trouble.  It looks like that patch
> should fix this.
> 
> Thanks,
> 
> -- 
> Peter Xu

Just add another user scenario concering this kernel warning.
Ever since 6.10-rc1, when I suspend my system via `systemctl suspend`, nvidia gpu driver would trigger a warning:

             	 Call Trace:
             	  <TASK>
             	  ? __warn+0x7c/0x120
             	  ? follow_pte+0x15b/0x170
             	  ? report_bug+0x18d/0x1c0
             	  ? handle_bug+0x3c/0x80
             	  ? exc_invalid_op+0x13/0x60
             	  ? asm_exc_invalid_op+0x16/0x20
             	  ? follow_pte+0x15b/0x170
             	  follow_phys+0x3a/0xf0
             	  untrack_pfn+0x53/0x120
             	  unmap_single_vma+0xa6/0xe0
             	  zap_page_range_single+0xe4/0x190
             	  ? _nv002569kms+0x17b/0x210 [nvidia_modeset]
             	  ? srso_return_thunk+0x5/0x5f
             	  ? kfree+0x257/0x290
             	  unmap_mapping_range+0x10d/0x130
             	  nv_revoke_gpu_mappings_locked+0x43/0x70 [nvidia]
             	  nv_set_system_power_state+0x1c9/0x470 [nvidia]
             	  nv_procfs_write_suspend+0xd3/0x140 [nvidia]
             	  proc_reg_write+0x58/0xa0
             	  ? srso_return_thunk+0x5/0x5f
             	  vfs_write+0xf6/0x440
             	  ? __count_memcg_events+0x73/0x110
             	  ? srso_return_thunk+0x5/0x5f
             	  ? count_memcg_events.constprop.0+0x1a/0x30
             	  ? srso_return_thunk+0x5/0x5f
             	  ? handle_mm_fault+0xa9/0x2d0
             	  ? srso_return_thunk+0x5/0x5f
             	  ? preempt_count_add+0x47/0xa0
             	  ksys_write+0x63/0xe0
             	  do_syscall_64+0x4b/0x110
             	  entry_SYSCALL_64_after_hwframe+0x76/0x7e
             	 RIP: 0033:0x7f34a3914240
             	 Code: 40 00 48 8b 15 c1 9b 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 80 3d a1 23 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89
             	 RSP: 002b:00007ffca2aa2688 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
             	 RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f34a3914240
             	 RDX: 0000000000000008 RSI: 000055a02968ed80 RDI: 0000000000000001
             	 RBP: 000055a02968ed80 R08: 00007f34a39eecd0 R09: 00007f34a39eecd0
             	 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000008
             	 R13: 00007f34a39ef760 R14: 0000000000000008 R15: 00007f34a39ea9e0
             	  </TASK>
             	 ---[ end trace 0000000000000000 ]---
             	 PM: suspend entry (deep)

Considering out-of-tree nature of nvidia gpu driver, and nobody reported this kernel warning before with in-trees,
 I had almost convinced myself that nvidia driver may need "big" rework to live with those "PTE" changes.
So glad to see this thread of discussion/issue/fix now, I have been patching my system manually ever since 6.10-rc1,
hope things got fixed soon...


FYI
David



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-07-12 13:19 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-11  5:13 [PATCH] mm: Fix mmap_assert_locked() in follow_pte() Pei Li
2024-07-11 21:22 ` Andrew Morton
2024-07-11 21:33 ` David Hildenbrand
2024-07-11 21:45   ` David Hildenbrand
2024-07-11 21:51     ` David Hildenbrand
2024-07-12  8:04   ` Sergey Senozhatsky
  -- strict thread matches above, loose matches on Subject: below --
2024-07-12 12:17 Bert Karwatzki
2024-07-12 12:43 Bert Karwatzki
     [not found] <ZpBhCHsG39wIVnQN@x1n>
2024-07-12 13:19 ` David Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).