From: "Mika Penttilä" <mpenttil@redhat.com>
To: Balbir Singh <balbirs@nvidia.com>
Cc: linux-mm@kvack.org, dri-devel@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	David Hildenbrand <david@kernel.org>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Leon Romanovsky <leonro@nvidia.com>,
	Alistair Popple <apopple@nvidia.com>, Zi Yan <ziy@nvidia.com>,
	Matthew Brost <matthew.brost@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH v10 0/5] Migrate on fault for device pages
Date: Fri, 15 May 2026 07:05:52 +0300
Message-ID: <2fe8d022-a414-4b23-af4f-9cecf1aac3d1@redhat.com>
In-Reply-To: <agaN29W7hAklrCxz@parvat>

Hi,

> FYI: While testing with hmm_tests I ran into
>
> [  107.866004] ============================================
> [  107.866284] WARNING: possible recursive locking detected
> [  107.866577] 7.1.0-rc3-00311-g4277273ca0e1 #12 Not tainted
> [  107.866877] --------------------------------------------
> [  107.867217] hmm-tests/1098 is trying to acquire lock:
> [  107.867491] ffff888113571b38 (&mm->mmap_lock){++++}-{4:4}, at: dmirror_range_fault+0x147/0x610 [test_hmm] <- line 368 of lib/test_hmm.c
> [  107.868076] 
> [  107.868076] but task is already holding lock:
> [  107.868383] ffff888113571b38 (&mm->mmap_lock){++++}-{4:4}, at: dmirror_fault_and_migrate_to_device.constprop.0+0x3aa/0x6a0 [test_hmm] <- line 1267 of lib/test_hmm.c
> [  107.869076] 
> [  107.869076] other info that might help us debug this:
> [  107.869415]  Possible unsafe locking scenario:
> [  107.869415] 
> [  107.869729]        CPU0
> [  107.869866]        ----
> [  107.870054]   lock(&mm->mmap_lock);
> [  107.870247]   lock(&mm->mmap_lock);
> [  107.870436] 
> [  107.870436]  *** DEADLOCK ***
> [  107.870436] 
> [  107.870743]  May be due to missing lock nesting notation
> [  107.870743] 
> [  107.871158] 1 lock held by hmm-tests/1098:
> [  107.871377]  #0: ffff888113571b38 (&mm->mmap_lock){++++}-{4:4}, at: dmirror_fault_and_migrate_to_device.constprop.0+0x3aa/0x6a0 [test_hmm]
> [  107.872081] 
> [  107.872081] stack backtrace:
> [  107.872348] CPU: 1 UID: 0 PID: 1098 Comm: hmm-tests Not tainted 7.1.0-rc3-00311-g4277273ca0e1 #12 PREEMPT(full) 
> [  107.872350] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20260213-6.fc44 02/13/2026
> [  107.872354] Call Trace:
> [  107.872357]  <TASK>
> [  107.872358]  dump_stack_lvl+0x5d/0x80
> [  107.872385]  print_deadlock_bug.cold+0xc0/0xe2
> [  107.872393]  __lock_acquire+0x10cf/0x1b90
> [  107.872400]  lock_acquire+0x189/0x2f0
> [  107.872401]  ? dmirror_range_fault+0x147/0x610 [test_hmm]
> [  107.872404]  down_read+0x9b/0x4b0
> [  107.872420]  ? dmirror_range_fault+0x147/0x610 [test_hmm]
> [  107.872421]  ? lock_acquire+0x189/0x2f0
> [  107.872422]  ? __pfx_down_read+0x10/0x10
> [  107.872424]  ? __lock_acquire+0x3c2/0x1b90
> [  107.872425]  dmirror_range_fault+0x147/0x610 [test_hmm]
> [  107.872427]  ? __pfx_down_read+0x10/0x10
> [  107.872429]  ? __pfx_dmirror_range_fault+0x10/0x10 [test_hmm]
> [  107.872430]  ? __lock_acquire+0x3c2/0x1b90
> [  107.872434]  dmirror_fault_and_migrate_to_device.constprop.0+0x3bf/0x6a0 [test_hmm]
> [  107.872436]  ? __pfx_dmirror_fault_and_migrate_to_device.constprop.0+0x10/0x10 [test_hmm]
> [  107.872439]  ? find_held_lock+0x2b/0x80
> [  107.872444]  ? dmirror_device_remove_chunks+0x5b8/0xa00 [test_hmm]
> [  107.872445]  ? __is_insn_slot_addr+0xee/0x1f0
> [  107.872458]  ? lock_acquire+0x189/0x2f0
> [  107.872460]  ? avc_has_extended_perms+0x234/0x1350
> [  107.872476]  ? __might_fault+0x89/0x150
> [  107.872484]  ? lock_release+0xe1/0x320
> [  107.872486]  dmirror_fops_unlocked_ioctl+0x9ba/0xdb0 [test_hmm]
> [  107.872488]  ? ioctl_has_perm.constprop.0.isra.0+0x2fe/0x6c0
> [  107.872494]  ? __pfx_dmirror_fops_unlocked_ioctl+0x10/0x10 [test_hmm]
> [  107.872498]  ? count_memcg_events_mm.constprop.0+0x22/0x1a0
> [  107.872499]  ? __pfx_ioctl_has_perm.constprop.0.isra.0+0x10/0x10
> [  107.872501]  ? count_memcg_events_mm.constprop.0+0xaa/0x1a0
> [  107.872503]  ? lock_release+0xe1/0x320
> [  107.872504]  ? find_held_lock+0x2b/0x80
> [  107.872506]  ? exc_page_fault+0x7e/0xf0
> [  107.872510]  __x64_sys_ioctl+0x13c/0x1d0
> [  107.872521]  ? lockdep_hardirqs_on_prepare+0xd9/0x190
> [  107.872523]  do_syscall_64+0xf3/0x6a0
> [  107.872526]  ? exc_page_fault+0xde/0xf0
> [  107.872528]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> [  107.872529] RIP: 0033:0x7f7381c543ad
> [  107.872531] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
> [  107.872532] RSP: 002b:00007ffc3160a9b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [  107.872539] RAX: ffffffffffffffda RBX: 00007f7381b44000 RCX: 00007f7381c543ad
> [  107.872540] RDX: 00007ffc3160aa30 RSI: 00000000c0284803 RDI: 0000000000000022
> [  107.872541] RBP: 00007ffc3160aa00 R08: 00000000ffffffff R09: 0000000000000000
> [  107.872541] R10: 0000000000000022 R11: 0000000000000246 R12: 00007ffc3160aa24
> [  107.872542] R13: 000000000041f380 R14: 0000000000000200 R15: 00007f7381200000
> [  107.872544]  </TASK>
>
>
> Thanks,
> Balbir
>
Thanks, I could reproduce it. I had lockdep turned off, so this went unnoticed. The test suite is nesting mmap_read_lock; I will change that in the next version.
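
To illustrate the pattern lockdep is flagging, here is a minimal userspace sketch in C. It is not the test_hmm code: it only models a per-task held-lock list and refuses to take a lock the task already holds, which is roughly what the report above says about mmap_lock. All names (held_locks, toy_read_lock, toy_read_unlock, inner_fault, outer_nested, outer_fixed) are hypothetical, and dropping the lock before the inner call is just one possible way to avoid the nesting, not necessarily what the next version will do.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HELD 8

/* Toy stand-in for a per-task list of held locks (hypothetical). */
struct held_locks {
	const void *lock[MAX_HELD];
	size_t depth;
};

/* Returns 0 on success, -1 if this lock is already held or the list is full. */
int toy_read_lock(struct held_locks *h, const void *lock)
{
	for (size_t i = 0; i < h->depth; i++) {
		if (h->lock[i] == lock)
			return -1;	/* recursive acquisition detected */
	}
	if (h->depth == MAX_HELD)
		return -1;
	h->lock[h->depth++] = lock;
	return 0;
}

/* Releases the most recently taken lock; it must match. */
void toy_read_unlock(struct held_locks *h, const void *lock)
{
	assert(h->depth > 0 && h->lock[h->depth - 1] == lock);
	h->depth--;
}

/* Models a callee that takes the lock itself. */
int inner_fault(struct held_locks *h, const void *mmap_lock)
{
	if (toy_read_lock(h, mmap_lock))
		return -1;
	/* ... fault in the range here ... */
	toy_read_unlock(h, mmap_lock);
	return 0;
}

/* Nested pattern: the caller still holds the lock when calling in. */
int outer_nested(struct held_locks *h, const void *mmap_lock)
{
	int ret;

	if (toy_read_lock(h, mmap_lock))
		return -1;
	ret = inner_fault(h, mmap_lock);	/* second acquisition fails */
	toy_read_unlock(h, mmap_lock);
	return ret;
}

/* One possible fix: drop the lock before calling the callee. */
int outer_fixed(struct held_locks *h, const void *mmap_lock)
{
	if (toy_read_lock(h, mmap_lock))
		return -1;
	/* ... work that needs the lock ... */
	toy_read_unlock(h, mmap_lock);
	return inner_fault(h, mmap_lock);
}
```

In this model outer_nested() fails at the inner acquisition while outer_fixed() succeeds. In the kernel, a recursive down_read() on the same rwsem can deadlock if a writer queues up between the two acquisitions, which is why lockdep reports it even though both are read locks.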

--Mika




Thread overview: 13+ messages
2026-05-05 18:44 [PATCH v10 0/5] Migrate on fault for device pages mpenttil
2026-05-05 18:44 ` [PATCH v10 1/5] mm/Kconfig: changes for migrate " mpenttil
2026-05-05 18:44 ` [PATCH v10 2/5] mm: Add helper to convert HMM pfn to migrate pfn mpenttil
2026-05-12 11:43   ` David Hildenbrand (Arm)
2026-05-12 12:08     ` Mika Penttilä
2026-05-12 12:44       ` David Hildenbrand (Arm)
2026-05-05 18:44 ` [PATCH v10 3/5] mm/hmm: do the plumbing for HMM to participate in migration mpenttil
2026-05-05 18:44 ` [PATCH v10 4/5] mm: setup device page migration in HMM pagewalk mpenttil
2026-05-05 18:44 ` [PATCH v10 5/5] lib/test_hmm:: add a new testcase for the migrate on fault mpenttil
2026-05-05 18:50 ` ✗ CI.checkpatch: warning for Migrate on fault for device pages (rev2) Patchwork
2026-05-05 18:52 ` ✓ CI.KUnit: success " Patchwork
2026-05-15  3:07 ` [PATCH v10 0/5] Migrate on fault for device pages Balbir Singh
2026-05-15  4:05   ` Mika Penttilä [this message]
