From: Lance Yang <lance.yang@linux.dev>
To: mike.kravetz@oracle.com
Cc: Harry Yoo <harry.yoo@oracle.com>,
akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, Hillf Danton <hdanton@sina.com>,
muchun.song@linux.dev, osalvador@suse.de, david@kernel.org
Subject: Re: [PATCH v2 1/1] mm/hugetlb: fix possible deadlocks in hugetlb VMA unmap paths
Date: Tue, 11 Nov 2025 11:25:42 +0800 [thread overview]
Message-ID: <0432bee6-b384-4cd7-ac1c-e9123c7b393f@linux.dev> (raw)
In-Reply-To: <aef27438-7636-4c38-a5c7-beda95efa02b@linux.dev>
Cc: HugeTLB folks
On 2025/11/11 11:20, Lance Yang wrote:
> +Mike
>
> On 2025/11/11 07:07, Hillf Danton wrote:
>> On Tue, 11 Nov 2025 00:39:29 +0800 Lance Yang wrote:
>>> On 2025/11/10 20:17, Harry Yoo wrote:
>>>> On Mon, Nov 10, 2025 at 07:15:53PM +0800, Lance Yang wrote:
>>>>> From: Lance Yang <lance.yang@linux.dev>
>>>>>
>>>>> The hugetlb VMA unmap path contains several potential deadlocks, as
>>>>> reported by syzbot. These deadlocks occur in __hugetlb_zap_begin(),
>>>>> move_hugetlb_page_tables(), and the retry path of
>>>>> hugetlb_unmap_file_folio() (affecting remove_inode_hugepages() and
>>>>> unmap_vmas()), where vma_lock is acquired before i_mmap_lock. This
>>>>> lock ordering conflicts with other paths like hugetlb_fault(), which
>>>>> establish the correct dependency as i_mmap_lock -> vma_lock.
>>>>>
>>>>> Possible unsafe locking scenario:
>>>>>
>>>>>        CPU0                    CPU1
>>>>>        ----                    ----
>>>>>   lock(&vma_lock->rw_sema);
>>>>>                                lock(&i_mmap_lock);
>>>>>                                lock(&vma_lock->rw_sema);
>>>>>   lock(&i_mmap_lock);
>>>>>
>>>>> Resolve the circular dependencies reported by syzbot across multiple
>>>>> call chains by reordering the locks in all conflicting paths to
>>>>> consistently follow the established i_mmap_lock -> vma_lock order.
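
As a userspace toy (pthread mutexes standing in for the kernel rwsems; not
actual hugetlb code, and lock_a/lock_b are invented names), the "pick one
order and stick to it" rule looks like this — whichever single order ends
up being the documented one:

```c
#include <pthread.h>

/* Toy model: two "unmap paths" that both take the same pair of locks.
 * As long as every path uses one agreed order -- whichever order that
 * is -- the circular wait shown in the lockdep report above cannot
 * form. */

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static int completed;

static void *unmap_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock_a);	/* every path: lock_a first */
	pthread_mutex_lock(&lock_b);	/* ... then lock_b */
	completed++;			/* protected by both locks */
	pthread_mutex_unlock(&lock_b);
	pthread_mutex_unlock(&lock_a);
	return NULL;
}

static int run_both_paths(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, unmap_path, NULL);
	pthread_create(&t2, NULL, unmap_path, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return completed;	/* 2 means both finished: no deadlock */
}
```

If one of the two threads instead took lock_b before lock_a, the two-CPU
interleaving quoted above becomes possible.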
>>>>
>>>> But mm/rmap.c says:
>>>>> * hugetlbfs PageHuge() take locks in this order:
>>>>> * hugetlb_fault_mutex (hugetlbfs specific page fault mutex)
>>>>> * vma_lock (hugetlb specific lock for pmd_sharing)
>>>>> * mapping->i_mmap_rwsem (also used for hugetlb pmd sharing)
>>>>> * folio_lock
>>>>> */
>>>
>>> Thanks! You are right, I was mistaken ...
>>>
>>>>
>>>> I think the commit message should explain why the locking order
>>>> described above is incorrect (or when it became incorrect) and fix
>>>> the comment?
>>>
>>> I think the locking order documented in mm/rmap.c
>>> (vma_lock -> i_mmap_lock) is indeed the correct one to follow.
>
> Looking at the commit[1] that introduced the vma_lock, it seems
> possible that the deadlock reported by syzbot[2] is a false positive ...
>
> From the commit message:
>
> ```
> The vma_lock is used as follows:
> - During fault processing. The lock is acquired in read mode before
> doing a page table lock and allocation (huge_pte_alloc). The lock is
> held until code is finished with the page table entry (ptep).
> - The lock must be held in write mode whenever huge_pmd_unshare is
> called.
>
> Lock ordering issues come into play when unmapping a page from all
> vmas mapping the page. The i_mmap_rwsem must be held to search for the
> vmas, and the vma lock must be held before calling unmap which will
> call huge_pmd_unshare. This is done today in:
> - try_to_migrate_one and try_to_unmap_ for page migration and memory
> error handling. In these routines we 'try' to obtain the vma lock and
> fail to unmap if unsuccessful. Calling routines already deal with the
> failure of unmapping.
> - hugetlb_vmdelete_list for truncation and hole punch. This routine
> also tries to acquire the vma lock. If it fails, it skips the
> unmapping. However, we can not have file truncation or hole punch
> fail because of contention. After hugetlb_vmdelete_list, truncation
> and hole punch call remove_inode_hugepages. remove_inode_hugepages
> checks for mapped pages and call hugetlb_unmap_file_page to unmap them.
> hugetlb_unmap_file_page is designed to drop locks and reacquire in the
> correct order to guarantee unmap success.
> ```
>
> The locking logic is a bit tricky; some paths can't follow a strict
> lock order and must use trylock or a drop/retry pattern to avoid
> deadlocking :)
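
A rough userspace sketch of those two strategies (invented names, pthread
mutexes standing in for the kernel rwsems; not actual kernel code):

```c
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Stand-ins for the real kernel locks, not the actual objects. */
static pthread_mutex_t i_mmap_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t vma_lock    = PTHREAD_MUTEX_INITIALIZER;

/* Strategy 1: paths that are allowed to fail (migration, memory error
 * handling) take the outer lock, then only trylock the inner one and
 * skip the unmap on contention. */
static bool try_unmap_one(void)
{
	pthread_mutex_lock(&i_mmap_lock);	/* outer lock, fixed order */
	if (pthread_mutex_trylock(&vma_lock)) {	/* contended: give up */
		pthread_mutex_unlock(&i_mmap_lock);
		return false;			/* caller handles failure */
	}
	/* ... unmap work ... */
	pthread_mutex_unlock(&vma_lock);
	pthread_mutex_unlock(&i_mmap_lock);
	return true;
}

/* Strategy 2: paths that must not fail (truncate, hole punch) drop
 * everything on contention and retry; since nothing is held between
 * attempts, reacquiring in order cannot deadlock. */
static void unmap_must_succeed(void)
{
	while (!try_unmap_one())
		sched_yield();	/* back off, then retry from scratch */
}
```

Here try_unmap_one mirrors the trylock-and-skip idea and
unmap_must_succeed the drop/retry idea; the real code obviously
reacquires at a different granularity.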
>
> Hoping Mike can take a look and confirm!
>
> [1] https://lore.kernel.org/all/20220914221810.95771-9-mike.kravetz@oracle.com/
> [2] https://lore.kernel.org/linux-mm/69113a97.a70a0220.22f260.00ca.GAE@google.com/
>
> Thanks,
> Lance
>
>>>
>>> This fix has it backwards then. I'll rework it to fix the actual
>>> violations.
>>>
>> Break a leg, better after taking a look at ffa1e7ada456 ("block: Make
>> request_queue lockdep splats show up earlier")
>
Thread overview: 7+ messages
2025-11-10 11:15 [PATCH v2 1/1] mm/hugetlb: fix possible deadlocks in hugetlb VMA unmap paths Lance Yang
2025-11-10 12:17 ` Harry Yoo
2025-11-10 16:39   ` Lance Yang
2025-11-10 23:07     ` Hillf Danton
2025-11-11  3:20       ` Lance Yang
2025-11-11  3:25         ` Lance Yang [this message]
2025-11-10 15:19 ` [syzbot ci] " syzbot ci