From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from out30-124.freemail.mail.aliyun.com (out30-124.freemail.mail.aliyun.com [115.124.30.124])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by smtp.subspace.kernel.org (Postfix) with ESMTPS id 34AB82DB7AC
    for ; Fri, 9 Jan 2026 06:37:33 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.124
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
    t=1767940656; cv=none;
    b=fXRWMlP5FvXSXFNlDi3Dms5Uk/lKrMGiZxalberDI+fEDRgYuWAkZphf6Gx2hArrzhFiOApmtEGPZtCxGTKlSXgrKRvmNntJa8T2iucZeDSnePLVrM++B/HFkZG8GcbCon97SQ+Urk77O/EQ5nniyO2BcEV7LzN6wmsvU9YiPUY=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
    s=arc-20240116; t=1767940656; c=relaxed/simple;
    bh=zYM19bQWJ0jJXyjJlTg7SXkDXhcgoTxKStHresD2gcU=;
    h=From:To:Cc:Subject:In-Reply-To:References:Date:Message-ID:
    MIME-Version:Content-Type;
    b=Acs/yf6s8vy4LQyKI0LVWFjuESbs5g6pZFgeac2kBfmyIGQfnA7RNF41EulRSmAJj49OBRpZduiIyUqsbxIIZlnIyJOSIF96AYUTW3VOLzXg5EoOV4mRXj38XuaR1RcefpNDvtRSldkCkYDDZI7Sa+oW2dVpk8T+pvAV7fzuIlI=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org;
    dmarc=pass (p=none dis=none) header.from=linux.alibaba.com;
    spf=pass smtp.mailfrom=linux.alibaba.com;
    dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=sfvAF5SU;
    arc=none smtp.client-ip=115.124.30.124
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com
Authentication-Results: smtp.subspace.kernel.org;
    dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="sfvAF5SU"
DKIM-Signature:v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=linux.alibaba.com; s=default;
    t=1767940652; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type;
    bh=5g+AYkU4WS8Q3yfVSitZSnUSiX3Qp5+cKaw82EFjEr0=;
    b=sfvAF5SUOt+pfIZLhY9PU2Yd5xOvL833h1xhC3xR226sgFZ2dPxdLltRKRriiUH+tqqUQVtKmwFF09QXcrAmEUEWD+XrDIjI+LikNZUzVZzQlHxGeFfe9Gj2GhmEQCFtwb4r3E0PjTCW18zjb1he6DwIbfEBoelmmqDSNydWFoA=
Received: from DESKTOP-5N7EMDA(mailfrom:ying.huang@linux.alibaba.com fp:SMTPD_---0WwfI-4q_1767940650 cluster:ay36)
    by smtp.aliyun-inc.com;
    Fri, 09 Jan 2026 14:37:31 +0800
From: "Huang, Ying"
To: Jinchao Wang
Cc: Matthew Wilcox, Andrew Morton, David Hildenbrand, Zi Yan,
    Matthew Brost, Joshua Hahn, Rakie Kim, Byungchul Park,
    Gregory Price, Alistair Popple, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org,
    syzbot+2d9c96466c978346b55f@syzkaller.appspotmail.com
Subject: Re: [PATCH] mm/migrate: fix hugetlbfs deadlock by respecting lock ordering
In-Reply-To: <20260109034723.1342798-1-wangjinchao600@gmail.com> (Jinchao Wang's message of "Fri, 9 Jan 2026 11:47:16 +0800")
References: <20260109034723.1342798-1-wangjinchao600@gmail.com>
Date: Fri, 09 Jan 2026 14:37:28 +0800
Message-ID: <87secfqok7.fsf@DESKTOP-5N7EMDA>
User-Agent: Gnus/5.13 (Gnus v5.13)
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset=ascii

Jinchao Wang writes:

> Fix an AB-BA deadlock between hugetlbfs_punch_hole() and page migration.
>
> The deadlock occurs because migration violates the lock ordering defined
> in mm/rmap.c for hugetlbfs:
>
>  * hugetlbfs PageHuge() take locks in this order:
>  *   hugetlb_fault_mutex
>  *     vma_lock
>  *       mapping->i_mmap_rwsem
>  *         folio_lock
>
> The following trace illustrates the inversion:
>
> Task A (punch_hole):             Task B (migration):
> --------------------             -------------------
> 1. i_mmap_lock_write(mapping)    1. folio_lock(folio)
> 2. folio_lock(folio)             2. i_mmap_lock_read(mapping)
>    (blocks waiting for B)           (blocks waiting for A)
>
> Task A is blocked in the punch-hole path:
>   hugetlbfs_fallocate
>     hugetlbfs_punch_hole
>       hugetlbfs_zero_partial_page
>         folio_lock
>
> Task B is blocked in the migration path:
>   migrate_pages
>     unmap_and_move_huge_page
>       remove_migration_ptes
>         __rmap_walk_file
>           i_mmap_lock_read
>
> To fix this, adjust unmap_and_move_huge_page() to respect the established
> hierarchy. If i_mmap_rwsem is acquired during try_to_migrate(), hold it
> until remove_migration_ptes() completes.
>
> This utilizes the existing retry logic, which unlocks the folio and
> returns -EAGAIN if hugetlb_folio_mapping_lock_write() fails.
>
> Link: https://lore.kernel.org/all/68e9715a.050a0220.1186a4.000d.GAE@google.com/
> Link: https://lore.kernel.org/all/20260108123957.1123502-2-wangjinchao600@gmail.com
> Reported-by: syzbot+2d9c96466c978346b55f@syzkaller.appspotmail.com
> Suggested-by: Matthew Wilcox
> Signed-off-by: Jinchao Wang

Can you provide a "Fixes:" tag?  That is helpful for backporting the bug
fix.

---
Best Regards,
Huang, Ying
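
For readers following the thread, the locking pattern described in the quoted commit message can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code or the patch itself: `mapping_lock` and `folio_lock` are hypothetical pthread mutexes standing in for mapping->i_mmap_rwsem and the folio lock, and `punch_hole_like()` and `migrate_like_once()` are made-up names. The migration-like path already holds the inner lock, so it only try-locks the outer one, backs off with -EAGAIN on failure, and otherwise keeps the outer lock held across the whole operation.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins: the outer lock plays mapping->i_mmap_rwsem,
 * the inner lock plays the folio lock. Documented order: outer, then inner. */
static pthread_mutex_t mapping_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t folio_lock = PTHREAD_MUTEX_INITIALIZER;
static int work_done;

/* Punch-hole-like path: takes the locks in the documented order. */
int punch_hole_like(void)
{
    pthread_mutex_lock(&mapping_lock);
    pthread_mutex_lock(&folio_lock);
    work_done++;
    pthread_mutex_unlock(&folio_lock);
    pthread_mutex_unlock(&mapping_lock);
    return 0;
}

/* Migration-like path: it starts with the inner lock held, so it must not
 * block on the outer lock. It try-locks instead and backs off on failure. */
int migrate_like_once(void)
{
    int ret;

    pthread_mutex_lock(&folio_lock);
    if (pthread_mutex_trylock(&mapping_lock) != 0) {
        /* Outer lock is busy: drop the inner lock and let the caller retry. */
        pthread_mutex_unlock(&folio_lock);
        return -EAGAIN;
    }

    /* The outer lock stays held for the unmap, move, and restore steps,
     * so no later step has to re-acquire it while the inner lock is held. */
    work_done++;

    ret = pthread_mutex_unlock(&mapping_lock);
    assert(ret == 0);
    ret = pthread_mutex_unlock(&folio_lock);
    assert(ret == 0);
    return 0;
}
```

A caller would loop on `migrate_like_once()` while it returns -EAGAIN, which mirrors the retry behaviour the commit message refers to. Blocking on the outer lock at that point instead of try-locking would reintroduce the AB-BA inversion shown in the trace.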