From: Usama Arif
To: Andrew Morton, apopple@nvidia.com, byungchul@sk.com,
	david@kernel.org, gourry@gourry.net, joshua.hahnjy@gmail.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	matthew.brost@intel.com, rakie.kim@sk.com,
	ying.huang@linux.alibaba.com, ziy@nvidia.com
Cc: shakeel.butt@linux.dev, hannes@cmpxchg.org, kernel-team@meta.com,
	Usama Arif, sashiko-bot
Subject: [PATCH] mm/migrate_device: pin large folios before splitting
Date: Wed, 1 Jul 2026 07:06:38 -0700
Message-ID: <20260701140638.840773-1-usama.arif@linux.dev>
X-Mailing-List: linux-kernel@vger.kernel.org

migrate_vma_collect_pmd() can detect a large folio while holding the PTE
lock, then drop the PTE lock before calling migrate_vma_split_folio().
The split helper took its own reference, but only after the lock had
already been dropped.

One way to hit this is device migration over a range that contains a
large folio. The walker reads the PTE while holding the PTE lock and
derives the folio either from a present PTE via vm_normal_page(), or
from a non-present PTE that encodes a device-private softleaf entry. It
then has to drop the PTE lock because split_folio() can block. Before
migrate_vma_split_folio() gets a folio reference, concurrent reclaim,
migration, or truncation can replace or clear the entry and drop the
last reference to the folio.
The split helper would then take a reference and lock on a stale folio
pointer.

Take a temporary reference before dropping the PTE lock and pass that
reference into migrate_vma_split_folio(). The helper consumes the
reference, so split_folio() still sees only the expected caller pin
instead of an extra pin that could make the split fail.

Reported-by: sashiko-bot
Link: https://sashiko.dev/#/patchset/20260630164143.1595669-1-usama.arif%40linux.dev
Fixes: 022a12deda53 ("mm/migrate_device: handle partially mapped folios during collection")
Signed-off-by: Usama Arif
---
 mm/migrate_device.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 2f8b646302c2..f5a5f699e98e 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -77,6 +77,9 @@ static int migrate_vma_collect_hole(unsigned long start,
  * @folio: the folio to split
  * @fault_page: struct page associated with the fault if any
  *
+ * If @folio is not the folio containing @fault_page, the caller must hold a
+ * reference on @folio. The helper consumes that reference.
+ *
  * Returns 0 on success
  */
 static int migrate_vma_split_folio(struct folio *folio,
@@ -86,10 +89,8 @@ static int migrate_vma_split_folio(struct folio *folio,
 	struct folio *fault_folio = fault_page ? page_folio(fault_page) : NULL;
 	struct folio *new_fault_folio = NULL;
 
-	if (folio != fault_folio) {
-		folio_get(folio);
+	if (folio != fault_folio)
 		folio_lock(folio);
-	}
 
 	ret = split_folio(folio);
 	if (ret) {
@@ -310,6 +311,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 			if (folio_test_large(folio)) {
 				int ret;
 
+				/*
+				 * Keep the folio stable after dropping the PTE
+				 * lock. migrate_vma_split_folio() consumes this
+				 * reference.
+				 */
+				if (folio != fault_folio)
+					folio_get(folio);
 				lazy_mmu_mode_disable();
 				pte_unmap_unlock(ptep, ptl);
 				ret = migrate_vma_split_folio(folio,
@@ -353,6 +361,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 				if (folio && folio_test_large(folio)) {
 					int ret;
 
+					/*
+					 * Keep the folio stable after dropping the
+					 * PTE lock. migrate_vma_split_folio() consumes
+					 * this reference.
+					 */
+					if (folio != fault_folio)
+						folio_get(folio);
 					lazy_mmu_mode_disable();
 					pte_unmap_unlock(ptep, ptl);
 					ret = migrate_vma_split_folio(folio,
-- 
2.53.0-Meta
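
For readers who want to see the ordering problem in isolation, here is
a minimal userspace sketch of the pattern the patch describes: take the
reference while the lock still protects the entry, then let the helper
consume it. This is not kernel code. Every toy_* name is hypothetical,
the "lock" is a plain flag, and the concurrent unmap is replayed
deterministically on one thread right after the unlock, which is an
assumption made to keep the example reproducible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: a refcount and a "freed" marker. */
struct toy_folio {
	int refcount;
	bool freed;
};

/* Hypothetical stand-in for a PTE slot and its lock. */
struct toy_slot {
	bool locked;
	struct toy_folio *folio;	/* the mapping owns one reference */
};

static void toy_folio_get(struct toy_folio *f)
{
	f->refcount++;
}

static void toy_folio_put(struct toy_folio *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0)
		f->freed = true;	/* last reference is gone */
}

static void toy_lock(struct toy_slot *s)
{
	assert(!s->locked);
	s->locked = true;
}

static void toy_unlock(struct toy_slot *s)
{
	assert(s->locked);
	s->locked = false;
}

/* Models reclaim or truncation: clear the entry, drop the mapping's ref. */
static void toy_unmap(struct toy_slot *s)
{
	struct toy_folio *f;

	toy_lock(s);
	f = s->folio;
	s->folio = NULL;
	toy_unlock(s);
	if (f)
		toy_folio_put(f);
}

/* Consumes the caller's reference; reports whether the folio was live. */
static bool toy_split(struct toy_folio *f)
{
	bool alive = !f->freed;

	toy_folio_put(f);
	return alive;
}

static bool toy_collect(struct toy_slot *s, bool pin_before_unlock)
{
	struct toy_folio *f;

	toy_lock(s);
	f = s->folio;
	if (f && pin_before_unlock)
		toy_folio_get(f);	/* new ordering: pin under the lock */
	toy_unlock(s);
	if (!f)
		return false;

	toy_unmap(s);			/* replayed "concurrent" unmap */

	if (!pin_before_unlock)
		toy_folio_get(f);	/* old ordering: pin taken too late */
	return toy_split(f);
}

/* Returns true when the split helper saw a live folio. */
static bool toy_demo(bool pin_before_unlock)
{
	struct toy_folio folio = { .refcount = 1, .freed = false };
	struct toy_slot slot = { .locked = false, .folio = &folio };

	return toy_collect(&slot, pin_before_unlock);
}
```

With pin_before_unlock set, the pin taken under the lock keeps the toy
folio alive across the unmap, and toy_split() drops it afterwards. With
it clear, the unmap drops the last reference first, so the late
toy_folio_get() operates on an object already marked freed.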