* [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
@ 2024-07-11 5:13 Pei Li
2024-07-11 21:22 ` Andrew Morton
2024-07-11 21:33 ` David Hildenbrand
0 siblings, 2 replies; 9+ messages in thread
From: Pei Li @ 2024-07-11 5:13 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, skhan, syzkaller-bugs,
linux-kernel-mentees, syzbot+35a4414f6e247f515443, Pei Li
This patch fixes this warning by acquiring read lock before entering
untrack_pfn() while write lock is not held.
syzbot has tested the proposed patch and the reproducer did not
trigger any issue.
Reported-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=35a4414f6e247f515443
Tested-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
Signed-off-by: Pei Li <peili.dev@gmail.com>
---
Syzbot reported the following warning in follow_pte():
WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 rwsem_assert_held include/linux/rwsem.h:195 [inline]
WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 mmap_assert_locked include/linux/mmap_lock.h:65 [inline]
WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 follow_pte+0x414/0x4c0 mm/memory.c:5980
This is because we are assuming that mm->mmap_lock should be held when
entering follow_pte(). This is added in commit c5541ba378e3 (mm:
follow_pte() improvements).
However, in the following call stack, we are not acquring the lock:
follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
zap_page_range_single+0x326/0x560 mm/memory.c:1920
In zap_page_range_single(), we passed mm_wr_locked as false, as we do
not expect write lock to be held.
In the special case where vma->vm_flags is set as VM_PFNMAP, we are
hitting untrack_pfn() which eventually calls into follow_phys.
This patch fixes this warning by acquiring read lock before entering
untrack_pfn() while write lock is not held.
syzbot has tested the proposed patch and the reproducer did not trigger any issue:
Tested on:
commit: 9d9a2f29 Merge tag 'mm-hotfixes-stable-2024-07-10-13-1..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13be8021980000
kernel config: https://syzkaller.appspot.com/x/.config?x=3456bae478301dc8
dashboard link: https://syzkaller.appspot.com/bug?extid=35a4414f6e247f515443
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=145e3441980000
Note: testing is done by a robot and is best-effort only.
---
mm/memory.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/mm/memory.c b/mm/memory.c
index d10e616d7389..75d7959b835b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1815,9 +1815,16 @@ static void unmap_single_vma(struct mmu_gather *tlb,
if (vma->vm_file)
uprobe_munmap(vma, start, end);
- if (unlikely(vma->vm_flags & VM_PFNMAP))
+ if (unlikely(vma->vm_flags & VM_PFNMAP)) {
+ if (!mm_wr_locked)
+ mmap_read_lock(vma->vm_mm);
+
untrack_pfn(vma, 0, 0, mm_wr_locked);
+ if (!mm_wr_locked)
+ mmap_read_unlock(vma->vm_mm);
+ }
+
if (start != end) {
if (unlikely(is_vm_hugetlb_page(vma))) {
/*
---
base-commit: 734610514cb0234763cc97ddbd235b7981889445
change-id: 20240710-bug12-fecfe4bb0dcb
Best regards,
--
Pei Li <peili.dev@gmail.com>
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
2024-07-11 5:13 Pei Li
@ 2024-07-11 21:22 ` Andrew Morton
2024-07-11 21:33 ` David Hildenbrand
1 sibling, 0 replies; 9+ messages in thread
From: Andrew Morton @ 2024-07-11 21:22 UTC (permalink / raw)
To: Pei Li
Cc: linux-mm, linux-kernel, skhan, syzkaller-bugs,
linux-kernel-mentees, syzbot+35a4414f6e247f515443,
David Hildenbrand
On Wed, 10 Jul 2024 22:13:17 -0700 Pei Li <peili.dev@gmail.com> wrote:
> This patch fixes this warning by acquiring read lock before entering
> untrack_pfn() while write lock is not held.
>
> syzbot has tested the proposed patch and the reproducer did not
> trigger any issue.
>
Thanks.
> ---
> Syzbot reported the following warning in follow_pte():
>
> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 rwsem_assert_held include/linux/rwsem.h:195 [inline]
> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 mmap_assert_locked include/linux/mmap_lock.h:65 [inline]
> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 follow_pte+0x414/0x4c0 mm/memory.c:5980
>
> This is because we are assuming that mm->mmap_lock should be held when
> entering follow_pte(). This is added in commit c5541ba378e3 (mm:
> follow_pte() improvements).
>
> However, in the following call stack, we are not acquring the lock:
> follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
> get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
> untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
> unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
> zap_page_range_single+0x326/0x560 mm/memory.c:1920
>
> In zap_page_range_single(), we passed mm_wr_locked as false, as we do
> not expect write lock to be held.
> In the special case where vma->vm_flags is set as VM_PFNMAP, we are
> hitting untrack_pfn() which eventually calls into follow_phys.
I included the above (very relevant) info in the changelog.
And I added
Fixes: c5541ba378e3 ("mm: follow_pte() improvements")
and queued the patch for 6.10-rc7. Hopefully David can review it for us.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
2024-07-11 5:13 Pei Li
2024-07-11 21:22 ` Andrew Morton
@ 2024-07-11 21:33 ` David Hildenbrand
2024-07-11 21:45 ` David Hildenbrand
2024-07-12 8:04 ` Sergey Senozhatsky
1 sibling, 2 replies; 9+ messages in thread
From: David Hildenbrand @ 2024-07-11 21:33 UTC (permalink / raw)
To: Pei Li, Andrew Morton
Cc: linux-mm, linux-kernel, skhan, syzkaller-bugs,
linux-kernel-mentees, syzbot+35a4414f6e247f515443
On 11.07.24 07:13, Pei Li wrote:
> This patch fixes this warning by acquiring read lock before entering
> untrack_pfn() while write lock is not held.
>
> syzbot has tested the proposed patch and the reproducer did not
> trigger any issue.
>
> Reported-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=35a4414f6e247f515443
> Tested-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
> Signed-off-by: Pei Li <peili.dev@gmail.com>
> ---
> Syzbot reported the following warning in follow_pte():
>
> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 rwsem_assert_held include/linux/rwsem.h:195 [inline]
> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 mmap_assert_locked include/linux/mmap_lock.h:65 [inline]
> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 follow_pte+0x414/0x4c0 mm/memory.c:5980
>
> This is because we are assuming that mm->mmap_lock should be held when
> entering follow_pte(). This is added in commit c5541ba378e3 (mm:
> follow_pte() improvements).
>
> However, in the following call stack, we are not acquring the lock:
> follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
> get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
> untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
> unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
> zap_page_range_single+0x326/0x560 mm/memory.c:1920
That implies that unmap_vmas() is called without the mmap lock in read
mode, correct?
Do we know how this happens?
* exit_mmap() holds the mmap lock in read mode
* unmap_region is documented to hold the mmap lock in read mode
>
> In zap_page_range_single(), we passed mm_wr_locked as false, as we do
> not expect write lock to be held.
> In the special case where vma->vm_flags is set as VM_PFNMAP, we are
> hitting untrack_pfn() which eventually calls into follow_phys.
>
> This patch fixes this warning by acquiring read lock before entering
> untrack_pfn() while write lock is not held.
>
> syzbot has tested the proposed patch and the reproducer did not trigger any issue:
>
> Tested on:
>
> commit: 9d9a2f29 Merge tag 'mm-hotfixes-stable-2024-07-10-13-1..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13be8021980000
> kernel config: https://syzkaller.appspot.com/x/.config?x=3456bae478301dc8
> dashboard link: https://syzkaller.appspot.com/bug?extid=35a4414f6e247f515443
> compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> patch: https://syzkaller.appspot.com/x/patch.diff?x=145e3441980000
>
> Note: testing is done by a robot and is best-effort only.
> ---
> mm/memory.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index d10e616d7389..75d7959b835b 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1815,9 +1815,16 @@ static void unmap_single_vma(struct mmu_gather *tlb,
> if (vma->vm_file)
> uprobe_munmap(vma, start, end);
>
> - if (unlikely(vma->vm_flags & VM_PFNMAP))
> + if (unlikely(vma->vm_flags & VM_PFNMAP)) {
> + if (!mm_wr_locked)
> + mmap_read_lock(vma->vm_mm);
> +
> untrack_pfn(vma, 0, 0, mm_wr_locked);
>
> + if (!mm_wr_locked)
> + mmap_read_unlock(vma->vm_mm);
> + }
> +
> if (start != end) {
> if (unlikely(is_vm_hugetlb_page(vma))) {
I'm not sure if this is the right fix. I like to understand how we end
up without the mmap lock at least in read mode in that path?
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
2024-07-11 21:33 ` David Hildenbrand
@ 2024-07-11 21:45 ` David Hildenbrand
2024-07-11 21:51 ` David Hildenbrand
2024-07-12 8:04 ` Sergey Senozhatsky
1 sibling, 1 reply; 9+ messages in thread
From: David Hildenbrand @ 2024-07-11 21:45 UTC (permalink / raw)
To: Pei Li, Andrew Morton
Cc: linux-mm, linux-kernel, skhan, syzkaller-bugs,
linux-kernel-mentees, syzbot+35a4414f6e247f515443, Peter Xu
On 11.07.24 23:33, David Hildenbrand wrote:
> On 11.07.24 07:13, Pei Li wrote:
>> This patch fixes this warning by acquiring read lock before entering
>> untrack_pfn() while write lock is not held.
>>
>> syzbot has tested the proposed patch and the reproducer did not
>> trigger any issue.
>>
>> Reported-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
>> Closes: https://syzkaller.appspot.com/bug?extid=35a4414f6e247f515443
>> Tested-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
>> Signed-off-by: Pei Li <peili.dev@gmail.com>
>> ---
>> Syzbot reported the following warning in follow_pte():
>>
>> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 rwsem_assert_held include/linux/rwsem.h:195 [inline]
>> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 mmap_assert_locked include/linux/mmap_lock.h:65 [inline]
>> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 follow_pte+0x414/0x4c0 mm/memory.c:5980
>>
>> This is because we are assuming that mm->mmap_lock should be held when
>> entering follow_pte(). This is added in commit c5541ba378e3 (mm:
>> follow_pte() improvements).
>>
>> However, in the following call stack, we are not acquring the lock:
>> follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
>> get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
>> untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
>> unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
>> zap_page_range_single+0x326/0x560 mm/memory.c:1920
>
> That implies that unmap_vmas() is called without the mmap lock in read
> mode, correct?
>
> Do we know how this happens?
>
> * exit_mmap() holds the mmap lock in read mode
> * unmap_region is documented to hold the mmap lock in read mode
I think this is it (missed the call from zap_page_range_single()):
follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
zap_page_range_single+0x326/0x560 mm/memory.c:1920
unmap_mapping_range_vma mm/memory.c:3684 [inline]
unmap_mapping_range_tree mm/memory.c:3701 [inline]
unmap_mapping_pages mm/memory.c:3767 [inline]
unmap_mapping_range+0x1ee/0x280 mm/memory.c:3804
truncate_pagecache+0x53/0x90 mm/truncate.c:731
simple_setattr+0xf2/0x120 fs/libfs.c:886
notify_change+0xec6/0x11f0 fs/attr.c:499
do_truncate+0x15c/0x220 fs/open.c:65
handle_truncate fs/namei.c:3308 [inline]
I think Peter recently questioned whether untrack_pfn() should be even
called from the place, but I might misremember things.
Fix should work (I suspect we are not violating some locking rules?),
PFNMAP should not happen there too often that we really care.
If everything fails, we could drop the assert, but I'm hoping we can
avoid that. We really want most users of follow_pte() to do the right thing.
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
2024-07-11 21:45 ` David Hildenbrand
@ 2024-07-11 21:51 ` David Hildenbrand
0 siblings, 0 replies; 9+ messages in thread
From: David Hildenbrand @ 2024-07-11 21:51 UTC (permalink / raw)
To: Pei Li, Andrew Morton
Cc: linux-mm, linux-kernel, skhan, syzkaller-bugs,
linux-kernel-mentees, syzbot+35a4414f6e247f515443, Peter Xu
On 11.07.24 23:45, David Hildenbrand wrote:
> On 11.07.24 23:33, David Hildenbrand wrote:
>> On 11.07.24 07:13, Pei Li wrote:
>>> This patch fixes this warning by acquiring read lock before entering
>>> untrack_pfn() while write lock is not held.
>>>
>>> syzbot has tested the proposed patch and the reproducer did not
>>> trigger any issue.
>>>
>>> Reported-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
>>> Closes: https://syzkaller.appspot.com/bug?extid=35a4414f6e247f515443
>>> Tested-by: syzbot+35a4414f6e247f515443@syzkaller.appspotmail.com
>>> Signed-off-by: Pei Li <peili.dev@gmail.com>
>>> ---
>>> Syzbot reported the following warning in follow_pte():
>>>
>>> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 rwsem_assert_held include/linux/rwsem.h:195 [inline]
>>> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 mmap_assert_locked include/linux/mmap_lock.h:65 [inline]
>>> WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 follow_pte+0x414/0x4c0 mm/memory.c:5980
>>>
>>> This is because we are assuming that mm->mmap_lock should be held when
>>> entering follow_pte(). This is added in commit c5541ba378e3 (mm:
>>> follow_pte() improvements).
>>>
>>> However, in the following call stack, we are not acquring the lock:
>>> follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
>>> get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
>>> untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
>>> unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
>>> zap_page_range_single+0x326/0x560 mm/memory.c:1920
>>
>> That implies that unmap_vmas() is called without the mmap lock in read
>> mode, correct?
>>
>> Do we know how this happens?
>>
>> * exit_mmap() holds the mmap lock in read mode
>> * unmap_region is documented to hold the mmap lock in read mode
>
> I think this is it (missed the call from zap_page_range_single()):
>
> follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
> get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
> untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
> unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
> zap_page_range_single+0x326/0x560 mm/memory.c:1920
> unmap_mapping_range_vma mm/memory.c:3684 [inline]
> unmap_mapping_range_tree mm/memory.c:3701 [inline]
> unmap_mapping_pages mm/memory.c:3767 [inline]
> unmap_mapping_range+0x1ee/0x280 mm/memory.c:3804
> truncate_pagecache+0x53/0x90 mm/truncate.c:731
> simple_setattr+0xf2/0x120 fs/libfs.c:886
> notify_change+0xec6/0x11f0 fs/attr.c:499
> do_truncate+0x15c/0x220 fs/open.c:65
> handle_truncate fs/namei.c:3308 [inline]
>
> I think Peter recently questioned whether untrack_pfn() should be even
> called from the place, but I might misremember things.
>
> Fix should work (I suspect we are not violating some locking rules?),
> PFNMAP should not happen there too often that we really care.
... thinking again, likely we reach this point with "!mm_wr_locked" and
the mmap lock already held in read mode. So I suspect the fix won't work
as is.
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
2024-07-11 21:33 ` David Hildenbrand
2024-07-11 21:45 ` David Hildenbrand
@ 2024-07-12 8:04 ` Sergey Senozhatsky
1 sibling, 0 replies; 9+ messages in thread
From: Sergey Senozhatsky @ 2024-07-12 8:04 UTC (permalink / raw)
To: Pei Li, Andrew Morton, David Hildenbrand
Cc: linux-mm, linux-kernel, skhan, syzkaller-bugs,
linux-kernel-mentees, syzbot+35a4414f6e247f515443
On (24/07/11 23:33), David Hildenbrand wrote:
[..]
> > @@ -1815,9 +1815,16 @@ static void unmap_single_vma(struct mmu_gather *tlb,
> > if (vma->vm_file)
> > uprobe_munmap(vma, start, end);
> > - if (unlikely(vma->vm_flags & VM_PFNMAP))
> > + if (unlikely(vma->vm_flags & VM_PFNMAP)) {
> > + if (!mm_wr_locked)
> > + mmap_read_lock(vma->vm_mm);
> > +
> > untrack_pfn(vma, 0, 0, mm_wr_locked);
> > + if (!mm_wr_locked)
> > + mmap_read_unlock(vma->vm_mm);
> > + }
> > +
> > if (start != end) {
> > if (unlikely(is_vm_hugetlb_page(vma))) {
>
> I'm not sure if this is the right fix. I like to understand how we end up
> without the mmap lock at least in read mode in that path?
I suspect this is causing a deadlock:
[ 10.263161] ============================================
[ 10.263165] WARNING: possible recursive locking detected
[ 10.263170] 6.10.0-rc7-next-20240712+ #645 Tainted: G N
[ 10.263177] --------------------------------------------
[ 10.263179] (direxec)/166 is trying to acquire lock:
[ 10.263184] ffff88810b4f0198 (&mm->mmap_lock){++++}-{3:3}, at: mmap_read_lock+0x12/0x40
[ 10.263217]
[ 10.263217] but task is already holding lock:
[ 10.263219] ffff88810b4f0198 (&mm->mmap_lock){++++}-{3:3}, at: exit_mmap+0x9c/0x830
[ 10.263238]
[ 10.263238] other info that might help us debug this:
[ 10.263241] Possible unsafe locking scenario:
[ 10.263241]
[ 10.263243] CPU0
[ 10.263245] ----
[ 10.263247] lock(&mm->mmap_lock);
[ 10.263252] lock(&mm->mmap_lock);
[ 10.263257]
[ 10.263257] *** DEADLOCK ***
[ 10.263257]
[ 10.263259] May be due to missing lock nesting notation
[ 10.263259]
[ 10.263262] 3 locks held by (direxec)/166:
[ 10.263267] #0: ffff88810b4e8548 (&sig->cred_guard_mutex){+.+.}-{3:3}, at: bprm_execve+0x70/0x1110
[ 10.263286] #1: ffff88810b4e85e0 (&sig->exec_update_lock){+.+.}-{3:3}, at: exec_mmap+0x9f/0x510
[ 10.263302] #2: ffff88810b4f0198 (&mm->mmap_lock){++++}-{3:3}, at: exit_mmap+0x9c/0x830
[ 10.263318]
[ 10.263318] stack backtrace:
[ 10.263329] CPU: 6 UID: 0 PID: 166 Comm: (direxec) Tainted: G N 6.10.0-rc7-next-20240712+ #645
[ 10.263340] Tainted: [N]=TEST
[ 10.263349] Call Trace:
[ 10.263355] <TASK>
[ 10.263360] dump_stack_lvl+0xa3/0xeb
[ 10.263375] print_deadlock_bug+0x4d5/0x680
[ 10.263387] __lock_acquire+0x65fb/0x7830
[ 10.263408] ? lock_is_held_type+0xdd/0x150
[ 10.263425] lock_acquire+0x14c/0x3e0
[ 10.263433] ? mmap_read_lock+0x12/0x40
[ 10.263445] ? lock_is_held_type+0xdd/0x150
[ 10.263454] down_read+0x58/0x9a0
[ 10.263461] ? mmap_read_lock+0x12/0x40
[ 10.263476] mmap_read_lock+0x12/0x40
[ 10.263485] unmap_single_vma+0x1bf/0x240
[ 10.263497] unmap_vmas+0x146/0x1c0
[ 10.263511] exit_mmap+0x13d/0x830
[ 10.263533] __mmput+0xc2/0x2c0
[ 10.263556] exec_mmap+0x4cb/0x510
[ 10.263580] begin_new_exec+0xfe6/0x1ba0
[ 10.263612] load_elf_binary+0x797/0x22a0
[ 10.263637] ? load_misc_binary+0x53a/0x930
[ 10.263656] ? lock_release+0x50f/0x830
[ 10.263673] ? bprm_execve+0x6d7/0x1110
[ 10.263693] bprm_execve+0x70d/0x1110
[ 10.263730] do_execveat_common+0x44b/0x600
[ 10.263745] __x64_sys_execve+0x8e/0xa0
[ 10.263754] do_syscall_64+0x71/0x110
[ 10.263764] entry_SYSCALL_64_after_hwframe+0x4b/0x53
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
@ 2024-07-12 12:17 Bert Karwatzki
0 siblings, 0 replies; 9+ messages in thread
From: Bert Karwatzki @ 2024-07-12 12:17 UTC (permalink / raw)
To: Pei Li
Cc: Bert Karwatzki, Andrew Morton, Shuah Khan, David Hildenbrand,
linux-mm, linux-kernel, linux-next, syzbot+35a4414f6e247f515443,
syzkaller-bugs, linux-kernel-mentees, senozhatsky
This is causing a deadlock for me, too. Since linux-next-20240712 I observe
a hang when starting the gui. I bisected this to commit a13252049629 and
reverting this commit in linux-next-20240712 fixes the issue for me.
I do not have any log messages to prove the dead lock, though, as I compiled
everything without CONFIG_LOCKDEP.
Bert Karwatzki
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
@ 2024-07-12 12:43 Bert Karwatzki
0 siblings, 0 replies; 9+ messages in thread
From: Bert Karwatzki @ 2024-07-12 12:43 UTC (permalink / raw)
To: Pei Li
Cc: Bert Karwatzki, Andrew Morton, Shuah Khan, David Hildenbrand,
linux-mm, linux-kernel, linux-next, syzbot+35a4414f6e247f515443,
syzkaller-bugs, linux-kernel-mentees, senozhatsky
diff --git a/mm/memory.c b/mm/memory.c
index 282203363177..2f4b4322ec0e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1817,6 +1817,7 @@ static void unmap_single_vma(struct mmu_gather *tlb,
{
unsigned long start = max(vma->vm_start, start_addr);
unsigned long end;
+ bool mm_read_locked;
if (start >= vma->vm_end)
return;
@@ -1829,11 +1830,11 @@ static void unmap_single_vma(struct mmu_gather *tlb,
if (unlikely(vma->vm_flags & VM_PFNMAP)) {
if (!mm_wr_locked)
- mmap_read_lock(vma->vm_mm);
+ mm_read_locked = !mmap_read_trylock(vma->vm_mm);
untrack_pfn(vma, 0, 0, mm_wr_locked);
- if (!mm_wr_locked)
+ if (!mm_wr_locked && !mm_read_locked)
mmap_read_unlock(vma->vm_mm);
}
This seems to fix the issue without completely removing the locking.
Bert Karwatzki
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: Fix mmap_assert_locked() in follow_pte()
[not found] <ZpBhCHsG39wIVnQN@x1n>
@ 2024-07-12 13:19 ` David Wang
0 siblings, 0 replies; 9+ messages in thread
From: David Wang @ 2024-07-12 13:19 UTC (permalink / raw)
To: peterx
Cc: akpm, david, linux-kernel-mentees, linux-kernel, linux-mm,
peili.dev, skhan, syzbot+35a4414f6e247f515443, syzkaller-bugs
Hi,
> Ah yes, I had one rfc patch for that, I temporarily put that aside as it
> seemed nobody cared except myself.. it's here:
>
> https://lore.kernel.org/all/20240523223745.395337-2-peterx@redhat.com
>
> I didn't know it can already cause real trouble. It looks like that patch
> should fix this.
>
> Thanks,
>
> --
> Peter Xu
Just add another user scenario concering this kernel warning.
Ever since 6.10-rc1, when I suspend my system via `systemctl suspend`, nvidia gpu driver would trigger a warning:
Call Trace:
<TASK>
? __warn+0x7c/0x120
? follow_pte+0x15b/0x170
? report_bug+0x18d/0x1c0
? handle_bug+0x3c/0x80
? exc_invalid_op+0x13/0x60
? asm_exc_invalid_op+0x16/0x20
? follow_pte+0x15b/0x170
follow_phys+0x3a/0xf0
untrack_pfn+0x53/0x120
unmap_single_vma+0xa6/0xe0
zap_page_range_single+0xe4/0x190
? _nv002569kms+0x17b/0x210 [nvidia_modeset]
? srso_return_thunk+0x5/0x5f
? kfree+0x257/0x290
unmap_mapping_range+0x10d/0x130
nv_revoke_gpu_mappings_locked+0x43/0x70 [nvidia]
nv_set_system_power_state+0x1c9/0x470 [nvidia]
nv_procfs_write_suspend+0xd3/0x140 [nvidia]
proc_reg_write+0x58/0xa0
? srso_return_thunk+0x5/0x5f
vfs_write+0xf6/0x440
? __count_memcg_events+0x73/0x110
? srso_return_thunk+0x5/0x5f
? count_memcg_events.constprop.0+0x1a/0x30
? srso_return_thunk+0x5/0x5f
? handle_mm_fault+0xa9/0x2d0
? srso_return_thunk+0x5/0x5f
? preempt_count_add+0x47/0xa0
ksys_write+0x63/0xe0
do_syscall_64+0x4b/0x110
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f34a3914240
Code: 40 00 48 8b 15 c1 9b 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 80 3d a1 23 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89
RSP: 002b:00007ffca2aa2688 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f34a3914240
RDX: 0000000000000008 RSI: 000055a02968ed80 RDI: 0000000000000001
RBP: 000055a02968ed80 R08: 00007f34a39eecd0 R09: 00007f34a39eecd0
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000008
R13: 00007f34a39ef760 R14: 0000000000000008 R15: 00007f34a39ea9e0
</TASK>
---[ end trace 0000000000000000 ]---
PM: suspend entry (deep)
Considering out-of-tree nature of nvidia gpu driver, and nobody reported this kernel warning before with in-trees,
I had almost convinced myself that nvidia driver may need "big" rework to live with those "PTE" changes.
So glad to see this thread of discussion/issue/fix now, I have been patching my system manually ever since 6.10-rc1,
hope things got fixed soon...
FYI
David
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-07-12 13:19 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-12 12:43 [PATCH] mm: Fix mmap_assert_locked() in follow_pte() Bert Karwatzki
[not found] <ZpBhCHsG39wIVnQN@x1n>
2024-07-12 13:19 ` David Wang
-- strict thread matches above, loose matches on Subject: below --
2024-07-12 12:17 Bert Karwatzki
2024-07-11 5:13 Pei Li
2024-07-11 21:22 ` Andrew Morton
2024-07-11 21:33 ` David Hildenbrand
2024-07-11 21:45 ` David Hildenbrand
2024-07-11 21:51 ` David Hildenbrand
2024-07-12 8:04 ` Sergey Senozhatsky
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).