From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from out-178.mta0.migadu.com (out-178.mta0.migadu.com [91.218.175.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9C6CC28F5 for ; Thu, 2 Jul 2026 10:47:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.178 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1782989282; cv=none; b=JgEmB81oISKuj4hxjrAOQnwMEdZFH/sa9ezjVNbSzEt7K2Whb/8xPHL6HuOeqy7APpuaqOeccTiFfiUgxUspky3dSL0nSMIPRq0W0rYOrSU5vISARQqkcZPes/a53HmyGnvsdQg2lhvn6/7dHXHtooU5XPsUIiwBtMfxI6CvBdM= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1782989282; c=relaxed/simple; bh=l964X5ZhmLszJq2BedPdDQlVAH/Wekp9VtPQvnoC/Yw=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=ALKUwV3qiV40Hep1vS1QjzbPwGXTyxtemzQzJr9aJ8WQayN+GXni2t6oimmOtXxeF1kWX3dYDn+XMbHh7OMS9UzHy9K2ZH1h5Ly8NDYq1cV/iXSwAOxpdfuHAzuJeMzSOUSWT0T3O35SQ9sk1vziUp2LqtbH0yRh+aelHXRJwPU= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=kxBSli5m; arc=none smtp.client-ip=91.218.175.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="kxBSli5m" Message-ID: <44bda521-6184-4130-836a-6d677b183468@linux.dev> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1782989277; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zMBzbB0u23/cpFESkSHmcGjbPgrXe8WIxLCT+sY8QpY=; b=kxBSli5mPyGfg/rvmP68YMGaRYahT4p6G+4UM48R9pZAD4AuDvTnQOcQ6qfzvhplUrzHBv 76SECz7HIO8mTJohiSi1FHMPKcBr2vlQIiEY27QoGPS8mJY6R1egx4vwdCJZLDAYhR3PlH oIHpdgnk3S5gygg+Npkm7gCPna+AAVM= Date: Thu, 2 Jul 2026 11:47:52 +0100 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Subject: Re: [PATCH] mm/migrate_device: pin large folios before splitting To: SJ Park Cc: Andrew Morton , apopple@nvidia.com, byungchul@sk.com, david@kernel.org, gourry@gourry.net, joshua.hahnjy@gmail.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, matthew.brost@intel.com, rakie.kim@sk.com, ying.huang@linux.alibaba.com, ziy@nvidia.com, shakeel.butt@linux.dev, hannes@cmpxchg.org, kernel-team@meta.com, sashiko-bot References: <20260701232936.85398-1-sj@kernel.org> Content-Language: en-US X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Usama Arif In-Reply-To: <20260701232936.85398-1-sj@kernel.org> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Migadu-Flow: FLOW_OUT On 02/07/2026 00:29, SJ Park wrote: > On Wed, 1 Jul 2026 07:06:38 -0700 Usama Arif wrote: > >> migrate_vma_collect_pmd() can detect a large folio while holding the PTE >> lock, then drop the PTE lock before calling migrate_vma_split_folio(). The >> split helper took its own reference, but only after the lock had already >> been dropped. >> >> One way to hit this is device migration over a range that contains a large >> folio. The walker reads the PTE while holding the PTE lock and derives the >> folio either from a present PTE via vm_normal_page(), or from a non-present >> PTE that encodes a device-private softleaf entry. It then has to drop the >> PTE lock because split_folio() can block. Before migrate_vma_split_folio() >> gets a folio reference, concurrent reclaim, migration, or truncation can >> replace or clear the entry and drop the last reference to the folio. The >> split helper would then take a reference and lock on a stale folio pointer. >> >> Take a temporary reference before dropping the PTE lock and pass that >> reference into migrate_vma_split_folio(). The helper consumes the >> reference, so split_folio() still sees only the expected caller pin instead >> of an extra pin that could make the split fail. >> >> Reported-by: sashiko-bot > > Just curious. Has the bot sent the review directly to you? Or, you read it > from sashiko.dev web site but kindly giving the credit to the bot? I read it from sashiko.dev! > >> Link: https://sashiko.dev/#/patchset/20260630164143.1595669-1-usama.arif%40linux.dev > > The link shows the comments for the entire series, so I was little bit confused > where the real finding is. Seems like that is on the reply to the patch 5? > And today I learned Sashiko.dev provides a way to point a reply to a specific > patch, like this: > > https://sashiko.dev/#/patchset/20260630164143.1595669-1-usama.arif%40linux.dev?part=5 > Thanks for this! >> Fixes: 022a12deda53 ("mm/migrate_device: handle partially mapped folios during collection") > > Seems merged in 6.19. Should we Cc stable@ ? > >> Signed-off-by: Usama Arif > > Reviewed-by: SJ Park Thanks SJ! > >> --- >> mm/migrate_device.c | 21 ++++++++++++++++++--- >> 1 file changed, 18 insertions(+), 3 deletions(-) >> >> diff --git a/mm/migrate_device.c b/mm/migrate_device.c >> index 2f8b646302c2..f5a5f699e98e 100644 >> --- a/mm/migrate_device.c >> +++ b/mm/migrate_device.c >> @@ -77,6 +77,9 @@ static int migrate_vma_collect_hole(unsigned long start, >> * @folio: the folio to split >> * @fault_page: struct page associated with the fault if any >> * >> + * If @folio is not the folio containing @fault_page, the caller must hold a >> + * reference on @folio. The helper consumes that reference. >> + * >> * Returns 0 on success >> */ >> static int migrate_vma_split_folio(struct folio *folio, >> @@ -86,10 +89,8 @@ static int migrate_vma_split_folio(struct folio *folio, >> struct folio *fault_folio = fault_page ? page_folio(fault_page) : NULL; >> struct folio *new_fault_folio = NULL; >> >> - if (folio != fault_folio) { >> - folio_get(folio); >> + if (folio != fault_folio) >> folio_lock(folio); >> - } >> >> ret = split_folio(folio); >> if (ret) { >> @@ -310,6 +311,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, >> if (folio_test_large(folio)) { >> int ret; >> >> + /* >> + * Keep the folio stable after dropping the PTE >> + * lock. migrate_vma_split_folio() consumes this >> + * reference. >> + */ >> + if (folio != fault_folio) >> + folio_get(folio); >> lazy_mmu_mode_disable(); >> pte_unmap_unlock(ptep, ptl); >> ret = migrate_vma_split_folio(folio, >> @@ -353,6 +361,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, >> if (folio && folio_test_large(folio)) { >> int ret; >> >> + /* >> + * Keep the folio stable after dropping the >> + * PTE lock. migrate_vma_split_folio() consumes >> + * this reference. >> + */ > > I don't mind having the above comments. Nonetheless, having same content but > different wrapping look special to me. > > If people really hate having two (incompletely) duplicated comments, what about > separating the recurring code out and adding the coment once on the function? > Maybe having a hotfix version without the comment and later refactoring with > the function can also be an option? Just a pure loud thinking. > > Anyway, thank you for making Linux safer! :) > >> + if (folio != fault_folio) >> + folio_get(folio); >> lazy_mmu_mode_disable(); >> pte_unmap_unlock(ptep, ptl); >> ret = migrate_vma_split_folio(folio, >> -- >> 2.53.0-Meta > > > Thanks, > SJ