From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5527A3655DB for ; Tue, 2 Jun 2026 17:26:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780421214; cv=none; b=k+z+8ZjKHTknpj5WYKXuqmir8pdtZllE0EERsKfk3zJfIp9Uu/ax2Dk2AHaCORr5JCC2g19XM8y4Q/pBUL7NhT3cpDhvoThW+evIaQHRymqGvygPwJECW21QnL9KoM0QhkMZ5lpE9d34TR6eVPwJGv/6fBVH0Zd6KZdbo06U+ok= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780421214; c=relaxed/simple; bh=UTjqVIUJLKUC5TeyyrI849hE5ljmJcEAC4j0KDDQ27k=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=bMFb1FchCsdnTDDt6s8aNJB2QYkGRbSoF504o8pEkh1hOKI/GQKmdCtVTPWcEJWocVNOfHTp6+FVEiKgXQ2Xb75UhCzyJlOHJwHUWk6r6/47gRaSb8Rfbd7dLjOiwkWR7I/XV/t6t2Zs84JTS/8lVp6/1tk0zm4ZAvBsXfN0Px0= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=LQoZygeA; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="LQoZygeA" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 089E51F00893; Tue, 2 Jun 2026 17:26:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780421213; bh=J1xSzs0yfom2BvFAzfRHJ/gg+qIG4fJVebZ4d/4+CrI=; h=Date:From:To:Cc:Subject:References:In-Reply-To; b=LQoZygeAKaOcrmNpUPIdb41xUyr98Ikld1wVHD529rM41xxur/djs8xTQ7xY1XUuC GQoZPkBrSUPk7NAd4oomb3DuYXyx4PCDU2bpfGckfNxm6MAk6m1bcP0FfJcgBe+yMN 5s8NbV8+7tJJnbUtZH5ATphrwIlIt7izN41ik+wDzllCh2b4tbG2C/v/c5HGZiclBV R6CM/K2fpgJuhcy9ybFoL6LgGDRbXYLv89C461AbIhAPrqZ5KgRJ/IAYWCGXCeJ3ye pUBsUZfN0q8i8Z9gUC+1d8gB9RXchbUfJAjukQpWQDpheIheinAEpkHou+zsJAc13k Y47G/jqdjdcnA== Date: Tue, 2 Jun 2026 18:26:46 +0100 From: Lorenzo Stoakes To: Zi Yan Cc: Andrew Morton , David Hildenbrand , Baolin Wang , "Liam R . Howlett" , Nico Pache , Ryan Roberts , Dev Jain , Barry Song , Lance Yang , SeongJae Park , Balbir Singh , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH mm-hotfixes] mm/huge_memory: use correct flags for device private PMD entry Message-ID: References: <20260601083044.57132-1-ljs@kernel.org> <263FB5F0-AA3C-4885-86E2-9EDB030A0CDF@nvidia.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <263FB5F0-AA3C-4885-86E2-9EDB030A0CDF@nvidia.com> On Tue, Jun 02, 2026 at 10:40:16AM -0400, Zi Yan wrote: > On 1 Jun 2026, at 4:30, Lorenzo Stoakes wrote: > > > Commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration support > > device-private entries") updated set_pmd_migration_entry() to use > > pmdp_huge_get_and_clear() in the softleaf case, but made no further > > adjustments to the function itself. > > > > Therefore this function continues to incorrectly use pmd_write(), > > pmd_soft_dirty() and pmd_uffd_wp() to determine whether the installed > > migration entry should be marked writable, softdirty or uffd-wp > > respectively. > > > > Whilst all are incorrect, the most problematic of these is pmd_write(), as > > this can lead to corrupted rmap state. > > > > On x86-64 _PAGE_SWP_SOFT_DIRTY is aliased to _PAGE_RW. So calling > > pmd_write() on a softleaf will return the softdirty state encoded in the > > entry, assuming CONFIG_MEM_SOFT_DIRTY was enabled. > > > > This was observed when running the hmm.hmm_device_private.anon_write_child > > selftest: > > > > 1. The test faults in a range then migrates it such that a device-private > > THP range is established. > > > > 2. The parent then migrates it to a device-private writable PMD entry whose > > folio is entirely AnonExclusive with entire_mapcount=1, softdirty set > > (accidentally correct write state). > > > > 3. The parent forks and the PMD entries are set to device-private read only > > entries, entire_mapcount=2, softdirty still set. > > > > 4. [BUG] The child writes to the range then migrates to RAM - intending to > > install non-writable migration entries - but replacing parent and child > > PMD mappings with WRITABLE entries due to misinterpreting the softdirty > > bit. > > > > 5. In remove_migration_pmd(), if !softleaf_is_migration_read(entry) we > > set the RMAP_EXCLUSIVE flag when calling folio_add_anon_rmap_pmd() for > > both parent and child, which are therefore AnonExclusive. > > > > 6. [SPLAT] Child sets migrated folio entire_mapcount=1, parent sets > > entire_mapcount=2 and we end up with an AnonExclusive folio with > > entire_mapcount=2! Assert fires in __folio_add_anon_rmap(): > > > > VM_WARN_ON_FOLIO(folio_test_large(folio) && > > folio_entire_mapcount(folio) > 1 && > > PageAnonExclusive(cur_page), folio) > > > > This patch fixes the issue by correctly referencing the softleaf entry > > fields for writable, softdirty and uffd-wp in set_pmd_migration_entry(). > > > > It also only updates A/D flags if the entry is present as these are > > otherwise not meaningful for a softleaf entry. > > > > This patch also flips the if (!present) { ... } else { ... } logic in > > set_pmd_migration_entry() so it is easier to understand, and adds some > > comments to make things clearer. > > > > I was able to bisect this to commit 775465fd26a3 ("lib/test_hmm: add zone > > device private THP test infrastructure") which first exposes this bug as it > > was the commit that permitted test_hmm to generate the test. > > > > However commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration support > > device-private entries") is the commit that actually enabled this > > behaviour. > > Thanks for the detailed explanation. > > > > Fixes: 65edfda6f3f2 ("mm/rmap: extend rmap and migration support device-private entries") > > Cc: stable@vger.kernel.org > > Signed-off-by: Lorenzo Stoakes > > --- > > mm/huge_memory.c | 45 +++++++++++++++++++++++++++++++++------------ > > 1 file changed, 33 insertions(+), 12 deletions(-) > > > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > > index bf9b480bb3b0..79463c709c98 100644 > > --- a/mm/huge_memory.c > > +++ b/mm/huge_memory.c > > @@ -4982,7 +4982,7 @@ int set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw, > > struct vm_area_struct *vma = pvmw->vma; > > struct mm_struct *mm = vma->vm_mm; > > unsigned long address = pvmw->address; > > - bool anon_exclusive; > > + bool anon_exclusive, present, writable, softdirty, uffd_wp; > > pmd_t pmdval; > > swp_entry_t entry; > > pmd_t pmdswp; > > @@ -4990,12 +4990,26 @@ int set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw, > > if (!(pvmw->pmd && !pvmw->pte)) > > return 0; > > > > - flush_cache_range(vma, address, address + HPAGE_PMD_SIZE); > > - if (unlikely(!pmd_present(*pvmw->pmd))) > > - pmdval = pmdp_huge_get_and_clear(vma->vm_mm, address, pvmw->pmd); > > - else > > + present = pmd_present(*pvmw->pmd); > > + if (likely(present)) { > > + flush_cache_range(vma, address, address + HPAGE_PMD_SIZE); > > + > > pmdval = pmdp_invalidate(vma, address, pvmw->pmd); > > > > + writable = pmd_write(pmdval); > > + softdirty = pmd_soft_dirty(pmdval); > > + uffd_wp = pmd_uffd_wp(pmdval); > > + } else { > > + softleaf_t old_entry; > > + > > + pmdval = pmdp_huge_get_and_clear(vma->vm_mm, address, pvmw->pmd); > > + old_entry = softleaf_from_pmd(pmdval); > > + > > + writable = softleaf_is_device_private_write(old_entry); > > Just to make sure I get it. This means the only possible writable > non present/softleaf entry is device private writable. There is > writable migration entry, but since we are setting a migration entry > here, that should not be possible. Yes :) This is doing the same as try_to_migrate_one(), e.g.: if (folio_test_hugetlb(folio)) { ... } else if (likely(pte_present(pteval))) { ... } else { const softleaf_t entry = softleaf_from_pte(pteval); pte_clear(mm, address, pvmw.pte); writable = softleaf_is_device_private_write(entry); } > > The patch LGTM. Thanks. > > Reviewed-by: Zi Yan Thanks! > > > Best Regards, > Yan, Zi Cheers, Lorenzo