From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 68BBF2C9D for ; Wed, 1 Jul 2026 23:29:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1782948586; cv=none; b=PPTbkfi4fky3LlBCZ+AYrr8xQp6vmwl5mSsEdWiUjWJmcMXr3+Cyg1XO2PDFUgemBbACoWtrS8f4lESS1Kg0TsdOe7t6+pDPtZo8XbJW2xzyNaf2G2CFuc5OKTZkCegrA00tAopHOq350P6sNWrL/cLkMCYqv/wy4sWB/XbgTSo= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1782948586; c=relaxed/simple; bh=tg4tnZlrEiZpc5cLzV2nwjku53yX+qldZZ+KLpbwoXE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=NGRcOeNI1nrUbs33Kn98YGoUUw7ho4nu3flj/cqszNlrOtxsjjektN+dKjApFV/FHmyWItAIvoiIwMLQBU/H7DDcfaMfIrQVxK6xoBUl8vohtUQGHgr4uav1tniIVU2PdBCiZgkYxHINgHRAPJ1EapqwuksyHs0lqblhVJCRHPU= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=BD/vWVtR; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="BD/vWVtR" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A465A1F000E9; Wed, 1 Jul 2026 23:29:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1782948585; bh=R4Q8EIyzKjGiuA36+iLs5iGLzb0r1XX60du/1F2NN18=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=BD/vWVtRo0Rnc/TXGOBKm19zBg+LpdTzTodtmBuVC2mwFYMFL7FMUSeW6vvhiftzu ZazfvZffYYkiEZS6ehbZz0gKn+LYfpkPZOZduAGTHT2EJBrm3cpqDjh6LBBl/ET983 2Gfl3ww3e6YarOaE1CpFj8CFdommg5njKv2PbSlm4b9G53esxjVZkldjGL3mzAZtHv kKPCulsBL9G/05AL7NqB6mrhdur0SAcX1Vl+dZuzhDG9mOOyEJuSQQpy5bVJP7M5ew B5MJRpuG5CoXQquPbfFTt7ZeDjsRScGrbD2jWfQwhinf6W+ZxMU52qh4F5VxAhuluX 0LHHYV8EUuCfw== From: SJ Park To: Usama Arif Cc: SJ Park , Andrew Morton , apopple@nvidia.com, byungchul@sk.com, david@kernel.org, gourry@gourry.net, joshua.hahnjy@gmail.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, matthew.brost@intel.com, rakie.kim@sk.com, ying.huang@linux.alibaba.com, ziy@nvidia.com, shakeel.butt@linux.dev, hannes@cmpxchg.org, kernel-team@meta.com, sashiko-bot Subject: Re: [PATCH] mm/migrate_device: pin large folios before splitting Date: Wed, 1 Jul 2026 16:29:36 -0700 Message-ID: <20260701232936.85398-1-sj@kernel.org> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20260701140638.840773-1-usama.arif@linux.dev> References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit On Wed, 1 Jul 2026 07:06:38 -0700 Usama Arif wrote: > migrate_vma_collect_pmd() can detect a large folio while holding the PTE > lock, then drop the PTE lock before calling migrate_vma_split_folio(). The > split helper took its own reference, but only after the lock had already > been dropped. > > One way to hit this is device migration over a range that contains a large > folio. The walker reads the PTE while holding the PTE lock and derives the > folio either from a present PTE via vm_normal_page(), or from a non-present > PTE that encodes a device-private softleaf entry. It then has to drop the > PTE lock because split_folio() can block. Before migrate_vma_split_folio() > gets a folio reference, concurrent reclaim, migration, or truncation can > replace or clear the entry and drop the last reference to the folio. The > split helper would then take a reference and lock on a stale folio pointer. > > Take a temporary reference before dropping the PTE lock and pass that > reference into migrate_vma_split_folio(). The helper consumes the > reference, so split_folio() still sees only the expected caller pin instead > of an extra pin that could make the split fail. > > Reported-by: sashiko-bot Just curious. Has the bot sent the review directly to you? Or, you read it from sashiko.dev web site but kindly giving the credit to the bot? > Link: https://sashiko.dev/#/patchset/20260630164143.1595669-1-usama.arif%40linux.dev The link shows the comments for the entire series, so I was little bit confused where the real finding is. Seems like that is on the reply to the patch 5? And today I learned Sashiko.dev provides a way to point a reply to a specific patch, like this: https://sashiko.dev/#/patchset/20260630164143.1595669-1-usama.arif%40linux.dev?part=5 > Fixes: 022a12deda53 ("mm/migrate_device: handle partially mapped folios during collection") Seems merged in 6.19. Should we Cc stable@ ? > Signed-off-by: Usama Arif Reviewed-by: SJ Park > --- > mm/migrate_device.c | 21 ++++++++++++++++++--- > 1 file changed, 18 insertions(+), 3 deletions(-) > > diff --git a/mm/migrate_device.c b/mm/migrate_device.c > index 2f8b646302c2..f5a5f699e98e 100644 > --- a/mm/migrate_device.c > +++ b/mm/migrate_device.c > @@ -77,6 +77,9 @@ static int migrate_vma_collect_hole(unsigned long start, > * @folio: the folio to split > * @fault_page: struct page associated with the fault if any > * > + * If @folio is not the folio containing @fault_page, the caller must hold a > + * reference on @folio. The helper consumes that reference. > + * > * Returns 0 on success > */ > static int migrate_vma_split_folio(struct folio *folio, > @@ -86,10 +89,8 @@ static int migrate_vma_split_folio(struct folio *folio, > struct folio *fault_folio = fault_page ? page_folio(fault_page) : NULL; > struct folio *new_fault_folio = NULL; > > - if (folio != fault_folio) { > - folio_get(folio); > + if (folio != fault_folio) > folio_lock(folio); > - } > > ret = split_folio(folio); > if (ret) { > @@ -310,6 +311,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, > if (folio_test_large(folio)) { > int ret; > > + /* > + * Keep the folio stable after dropping the PTE > + * lock. migrate_vma_split_folio() consumes this > + * reference. > + */ > + if (folio != fault_folio) > + folio_get(folio); > lazy_mmu_mode_disable(); > pte_unmap_unlock(ptep, ptl); > ret = migrate_vma_split_folio(folio, > @@ -353,6 +361,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, > if (folio && folio_test_large(folio)) { > int ret; > > + /* > + * Keep the folio stable after dropping the > + * PTE lock. migrate_vma_split_folio() consumes > + * this reference. > + */ I don't mind having the above comments. Nonetheless, having same content but different wrapping look special to me. If people really hate having two (incompletely) duplicated comments, what about separating the recurring code out and adding the coment once on the function? Maybe having a hotfix version without the comment and later refactoring with the function can also be an option? Just a pure loud thinking. Anyway, thank you for making Linux safer! :) > + if (folio != fault_folio) > + folio_get(folio); > lazy_mmu_mode_disable(); > pte_unmap_unlock(ptep, ptl); > ret = migrate_vma_split_folio(folio, > -- > 2.53.0-Meta Thanks, SJ