From: Balbir Singh <balbirs@nvidia.com>
To: "Mika Penttilä" <mpenttil@redhat.com>
Cc: linux-mm@kvack.org, dri-devel@lists.freedesktop.org,
intel-xe@lists.freedesktop.org, linux-kernel@vger.kernel.org,
David Hildenbrand <david@kernel.org>,
Jason Gunthorpe <jgg@nvidia.com>,
Leon Romanovsky <leonro@nvidia.com>,
Alistair Popple <apopple@nvidia.com>, Zi Yan <ziy@nvidia.com>,
Matthew Brost <matthew.brost@intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH v10 0/5] Migrate on fault for device pages
Date: Fri, 15 May 2026 17:33:52 +1000
Message-ID: <d355e30e-a7d1-4da1-83d7-29131ddc845c@nvidia.com>
In-Reply-To: <2fe8d022-a414-4b23-af4f-9cecf1aac3d1@redhat.com>

On 5/15/26 14:05, Mika Penttilä wrote:
> Hi,
>
>> FYI: While testing with hmm_tests I ran into
>>
>> [ 107.866004] ============================================
>> [ 107.866284] WARNING: possible recursive locking detected
>> [ 107.866577] 7.1.0-rc3-00311-g4277273ca0e1 #12 Not tainted
>> [ 107.866877] --------------------------------------------
>> [ 107.867217] hmm-tests/1098 is trying to acquire lock:
>> [ 107.867491] ffff888113571b38 (&mm->mmap_lock){++++}-{4:4}, at: dmirror_range_fault+0x147/0x610 [test_hmm] <- line 368 of lib/test_hmm.c
>> [ 107.868076]
>> [ 107.868076] but task is already holding lock:
>> [ 107.868383] ffff888113571b38 (&mm->mmap_lock){++++}-{4:4}, at: dmirror_fault_and_migrate_to_device.constprop.0+0x3aa/0x6a0 [test_hmm] <- line 1267 of lib/test_hmm.c
>> [ 107.869076]
>> [ 107.869076] other info that might help us debug this:
>> [ 107.869415] Possible unsafe locking scenario:
>> [ 107.869415]
>> [ 107.869729] CPU0
>> [ 107.869866] ----
>> [ 107.870054] lock(&mm->mmap_lock);
>> [ 107.870247] lock(&mm->mmap_lock);
>> [ 107.870436]
>> [ 107.870436] *** DEADLOCK ***
>> [ 107.870436]
>> [ 107.870743] May be due to missing lock nesting notation
>> [ 107.870743]
>> [ 107.871158] 1 lock held by hmm-tests/1098:
>> [ 107.871377] #0: ffff888113571b38 (&mm->mmap_lock){++++}-{4:4}, at: dmirror_fault_and_migrate_to_device.constprop.0+0x3aa/0x6a0 [test_hmm]
>> [ 107.872081]
>> [ 107.872081] stack backtrace:
>> [ 107.872348] CPU: 1 UID: 0 PID: 1098 Comm: hmm-tests Not tainted 7.1.0-rc3-00311-g4277273ca0e1 #12 PREEMPT(full)
>> [ 107.872350] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20260213-6.fc44 02/13/2026
>> [ 107.872354] Call Trace:
>> [ 107.872357] <TASK>
>> [ 107.872358] dump_stack_lvl+0x5d/0x80
>> [ 107.872385] print_deadlock_bug.cold+0xc0/0xe2
>> [ 107.872393] __lock_acquire+0x10cf/0x1b90
>> [ 107.872400] lock_acquire+0x189/0x2f0
>> [ 107.872401] ? dmirror_range_fault+0x147/0x610 [test_hmm]
>> [ 107.872404] down_read+0x9b/0x4b0
>> [ 107.872420] ? dmirror_range_fault+0x147/0x610 [test_hmm]
>> [ 107.872421] ? lock_acquire+0x189/0x2f0
>> [ 107.872422] ? __pfx_down_read+0x10/0x10
>> [ 107.872424] ? __lock_acquire+0x3c2/0x1b90
>> [ 107.872425] dmirror_range_fault+0x147/0x610 [test_hmm]
>> [ 107.872427] ? __pfx_down_read+0x10/0x10
>> [ 107.872429] ? __pfx_dmirror_range_fault+0x10/0x10 [test_hmm]
>> [ 107.872430] ? __lock_acquire+0x3c2/0x1b90
>> [ 107.872434] dmirror_fault_and_migrate_to_device.constprop.0+0x3bf/0x6a0 [test_hmm]
>> [ 107.872436] ? __pfx_dmirror_fault_and_migrate_to_device.constprop.0+0x10/0x10 [test_hmm]
>> [ 107.872439] ? find_held_lock+0x2b/0x80
>> [ 107.872444] ? dmirror_device_remove_chunks+0x5b8/0xa00 [test_hmm]
>> [ 107.872445] ? __is_insn_slot_addr+0xee/0x1f0
>> [ 107.872458] ? lock_acquire+0x189/0x2f0
>> [ 107.872460] ? avc_has_extended_perms+0x234/0x1350
>> [ 107.872476] ? __might_fault+0x89/0x150
>> [ 107.872484] ? lock_release+0xe1/0x320
>> [ 107.872486] dmirror_fops_unlocked_ioctl+0x9ba/0xdb0 [test_hmm]
>> [ 107.872488] ? ioctl_has_perm.constprop.0.isra.0+0x2fe/0x6c0
>> [ 107.872494] ? __pfx_dmirror_fops_unlocked_ioctl+0x10/0x10 [test_hmm]
>> [ 107.872498] ? count_memcg_events_mm.constprop.0+0x22/0x1a0
>> [ 107.872499] ? __pfx_ioctl_has_perm.constprop.0.isra.0+0x10/0x10
>> [ 107.872501] ? count_memcg_events_mm.constprop.0+0xaa/0x1a0
>> [ 107.872503] ? lock_release+0xe1/0x320
>> [ 107.872504] ? find_held_lock+0x2b/0x80
>> [ 107.872506] ? exc_page_fault+0x7e/0xf0
>> [ 107.872510] __x64_sys_ioctl+0x13c/0x1d0
>> [ 107.872521] ? lockdep_hardirqs_on_prepare+0xd9/0x190
>> [ 107.872523] do_syscall_64+0xf3/0x6a0
>> [ 107.872526] ? exc_page_fault+0xde/0xf0
>> [ 107.872528] entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> [ 107.872529] RIP: 0033:0x7f7381c543ad
>> [ 107.872531] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
>> [ 107.872532] RSP: 002b:00007ffc3160a9b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>> [ 107.872539] RAX: ffffffffffffffda RBX: 00007f7381b44000 RCX: 00007f7381c543ad
>> [ 107.872540] RDX: 00007ffc3160aa30 RSI: 00000000c0284803 RDI: 0000000000000022
>> [ 107.872541] RBP: 00007ffc3160aa00 R08: 00000000ffffffff R09: 0000000000000000
>> [ 107.872541] R10: 0000000000000022 R11: 0000000000000246 R12: 00007ffc3160aa24
>> [ 107.872542] R13: 000000000041f380 R14: 0000000000000200 R15: 00007f7381200000
>> [ 107.872544] </TASK>
>>
>>
>> Thanks,
>> Balbir
>>
> Thanks, I could reproduce it. I had lockdep turned off, so it went unnoticed. The test suite is nesting mmap_read_lock; I will change that in the next version.
>
> --Mika
>
>
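For anyone following along, the nesting in the splat is the usual "outer path takes mmap_lock, then calls a helper that takes it again" pattern. A nested down_read() on the same rwsem can deadlock if a writer queues between the two acquisitions, which is why lockdep flags it. Below is a minimal userspace sketch of the pattern and one common way to restructure it (a "_locked" helper that expects the caller to hold the lock). Every name here (toy_mm, toy_range_fault, and so on) is hypothetical, the lock is a plain counter rather than a real rwsem, and this is not necessarily how the next version of the series will fix it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for mm->mmap_lock, tracked for a single task only. */
struct toy_mm {
	int read_depth;       /* how many times this task holds the read lock */
	bool recursion_seen;  /* set when a nested acquisition is attempted */
};

static void toy_read_lock(struct toy_mm *mm)
{
	/* A second acquisition by the same task is what lockdep reported. */
	if (mm->read_depth > 0)
		mm->recursion_seen = true;
	mm->read_depth++;
}

static void toy_read_unlock(struct toy_mm *mm)
{
	assert(mm->read_depth > 0);
	mm->read_depth--;
}

/* "_locked" flavour: the caller must already hold the lock. */
static int toy_range_fault_locked(struct toy_mm *mm)
{
	assert(mm->read_depth > 0);
	return 0; /* the actual fault-in work would go here */
}

/* Self-locking flavour, for callers that do not hold the lock. */
static int toy_range_fault(struct toy_mm *mm)
{
	int ret;

	toy_read_lock(mm);
	ret = toy_range_fault_locked(mm);
	toy_read_unlock(mm);
	return ret;
}

/* Problematic shape: lock held, then the self-locking helper is called. */
static int toy_fault_and_migrate_nested(struct toy_mm *mm)
{
	int ret;

	toy_read_lock(mm);
	ret = toy_range_fault(mm); /* second acquisition of the same lock */
	toy_read_unlock(mm);
	return ret;
}

/* Restructured shape: lock held once, "_locked" helper does the work. */
static int toy_fault_and_migrate_fixed(struct toy_mm *mm)
{
	int ret;

	toy_read_lock(mm);
	ret = toy_range_fault_locked(mm);
	toy_read_unlock(mm);
	return ret;
}
```

Calling toy_fault_and_migrate_nested() on a zero-initialised toy_mm sets recursion_seen, while toy_fault_and_migrate_fixed() leaves it clear. The alternative of dropping the lock before calling the self-locking helper also avoids the nesting, at the cost of a window in which the address space can change.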
I'll wait for the next version.

Balbir