Subject: Re: [v1 resend 03/12] mm/thp: zone_device awareness in THP handling code
From: Mika Penttilä <mpenttil@redhat.com>
Date: Mon, 7 Jul 2025 06:49:32 +0300
To: Balbir Singh, linux-mm@kvack.org
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, Karol Herbst,
 Lyude Paul, Danilo Krummrich, David Airlie, Simona Vetter, Jérôme Glisse,
 Shuah Khan, David Hildenbrand, Barry Song, Baolin Wang, Ryan Roberts,
 Matthew Wilcox, Peter Xu, Zi Yan, Kefeng Wang, Jane Chu, Alistair Popple,
 Donet Tom
References: <20250703233511.2028395-1-balbirs@nvidia.com> <20250703233511.2028395-4-balbirs@nvidia.com>
In-Reply-To: <20250703233511.2028395-4-balbirs@nvidia.com>

On 7/4/25 02:35, Balbir Singh wrote:
> Make the THP handling code in the mm subsystem aware of zone
> device pages. Although the code is designed to be generic when
> it comes to splitting pages, it currently works only for THP
> page sizes corresponding to HPAGE_PMD_NR.
>
> Modify page_vma_mapped_walk() to return true when a zone device
> huge entry is present, enabling try_to_migrate() and other
> migration code paths to process the entry appropriately.
>
> pmd_pfn() does not work well with zone device entries; use
> pfn_pmd_entry_to_swap() for checking and comparison of zone
> device entries instead.
>
> try_to_map_to_unused_zeropage() does not apply to zone device
> entries; such entries are ignored in the call.
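One note on the pmd_pfn() paragraph, mostly for readers following the
series: a device private PMD is a non-present swap PMD, so the pfn has
to be taken from the swap entry rather than from the pmd bits. A minimal
sketch with existing swapops helpers (pfn_pmd_entry_to_swap() is the
helper this series introduces; device_private_pmd_pfn() below is only an
illustrative name, not code from the patch):

        /*
         * Sketch only: pmd_pfn() is meaningful only for present pmds.
         * For a zone device private THP the pfn is encoded in the swap
         * entry carried by the non-present pmd.
         */
        static unsigned long device_private_pmd_pfn(pmd_t pmd)
        {
                swp_entry_t entry = pmd_to_swp_entry(pmd);

                VM_WARN_ON_ONCE(!is_device_private_entry(entry));
                return swp_offset_pfn(entry);
        }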
>
> Cc: Karol Herbst
> Cc: Lyude Paul
> Cc: Danilo Krummrich
> Cc: David Airlie
> Cc: Simona Vetter
> Cc: "Jérôme Glisse"
> Cc: Shuah Khan
> Cc: David Hildenbrand
> Cc: Barry Song
> Cc: Baolin Wang
> Cc: Ryan Roberts
> Cc: Matthew Wilcox
> Cc: Peter Xu
> Cc: Zi Yan
> Cc: Kefeng Wang
> Cc: Jane Chu
> Cc: Alistair Popple
> Cc: Donet Tom
>
> Signed-off-by: Balbir Singh
> ---
>  mm/huge_memory.c     | 153 +++++++++++++++++++++++++++++++------------
>  mm/migrate.c         |   2 +
>  mm/page_vma_mapped.c |  10 +++
>  mm/pgtable-generic.c |   6 ++
>  mm/rmap.c            |  19 +++++-
>  5 files changed, 146 insertions(+), 44 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index ce130225a8e5..e6e390d0308f 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1711,7 +1711,8 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
>  	if (unlikely(is_swap_pmd(pmd))) {
>  		swp_entry_t entry = pmd_to_swp_entry(pmd);
>
> -		VM_BUG_ON(!is_pmd_migration_entry(pmd));
> +		VM_BUG_ON(!is_pmd_migration_entry(pmd) &&
> +			  !is_device_private_entry(entry));
>  		if (!is_readable_migration_entry(entry)) {
>  			entry = make_readable_migration_entry(
>  							swp_offset(entry));
> @@ -2222,10 +2223,17 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  	} else if (thp_migration_supported()) {
>  		swp_entry_t entry;
>
> -		VM_BUG_ON(!is_pmd_migration_entry(orig_pmd));
>  		entry = pmd_to_swp_entry(orig_pmd);
>  		folio = pfn_swap_entry_folio(entry);
>  		flush_needed = 0;
> +
> +		VM_BUG_ON(!is_pmd_migration_entry(*pmd) &&
> +			  !folio_is_device_private(folio));
> +
> +		if (folio_is_device_private(folio)) {
> +			folio_remove_rmap_pmd(folio, folio_page(folio, 0), vma);
> +			WARN_ON_ONCE(folio_mapcount(folio) < 0);
> +		}
>  	} else
>  		WARN_ONCE(1, "Non present huge pmd without pmd migration enabled!");
>
> @@ -2247,6 +2255,15 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  		folio_mark_accessed(folio);
>  	}
>
> +	/*
> +	 * Do a folio put on zone device private pages after
> +	 * changes to mm_counter, because the folio_put() will
> +	 * clean folio->mapping and the folio_test_anon() check
> +	 * will not be usable.
> +	 */
> +	if (folio_is_device_private(folio))
> +		folio_put(folio);
> +
>  	spin_unlock(ptl);
>  	if (flush_needed)
>  		tlb_remove_page_size(tlb, &folio->page, HPAGE_PMD_SIZE);
> @@ -2375,7 +2392,8 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  		struct folio *folio = pfn_swap_entry_folio(entry);
>  		pmd_t newpmd;
>
> -		VM_BUG_ON(!is_pmd_migration_entry(*pmd));
> +		VM_BUG_ON(!is_pmd_migration_entry(*pmd) &&
> +			  !folio_is_device_private(folio));
>  		if (is_writable_migration_entry(entry)) {
>  			/*
>  			 * A protection check is difficult so
> @@ -2388,9 +2406,11 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  			newpmd = swp_entry_to_pmd(entry);
>  			if (pmd_swp_soft_dirty(*pmd))
>  				newpmd = pmd_swp_mksoft_dirty(newpmd);
> -		} else {
> +		} else if (is_writable_device_private_entry(entry)) {
> +			newpmd = swp_entry_to_pmd(entry);
> +			entry = make_device_exclusive_entry(swp_offset(entry));
> +		} else
>  			newpmd = *pmd;
> -		}
>
>  		if (uffd_wp)
>  			newpmd = pmd_swp_mkuffd_wp(newpmd);
> @@ -2842,16 +2862,20 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>  	struct page *page;
>  	pgtable_t pgtable;
>  	pmd_t old_pmd, _pmd;
> -	bool young, write, soft_dirty, pmd_migration = false, uffd_wp = false;
> -	bool anon_exclusive = false, dirty = false;
> +	bool young, write, soft_dirty, uffd_wp = false;
> +	bool anon_exclusive = false, dirty = false, present = false;
>  	unsigned long addr;
>  	pte_t *pte;
>  	int i;
> +	swp_entry_t swp_entry;
>
>  	VM_BUG_ON(haddr & ~HPAGE_PMD_MASK);
>  	VM_BUG_ON_VMA(vma->vm_start > haddr, vma);
>  	VM_BUG_ON_VMA(vma->vm_end < haddr + HPAGE_PMD_SIZE, vma);
> -	VM_BUG_ON(!is_pmd_migration_entry(*pmd) && !pmd_trans_huge(*pmd));
> +
> +	VM_BUG_ON(!is_pmd_migration_entry(*pmd) && !pmd_trans_huge(*pmd)
> +		  && !(is_swap_pmd(*pmd) &&
> +		       is_device_private_entry(pmd_to_swp_entry(*pmd))));
>
>  	count_vm_event(THP_SPLIT_PMD);
>
> @@ -2899,20 +2923,25 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>  		return __split_huge_zero_page_pmd(vma, haddr, pmd);
>  	}
>
> -	pmd_migration = is_pmd_migration_entry(*pmd);
> -	if (unlikely(pmd_migration)) {
> -		swp_entry_t entry;
>
> +	present = pmd_present(*pmd);
> +	if (unlikely(!present)) {
> +		swp_entry = pmd_to_swp_entry(*pmd);
>  		old_pmd = *pmd;
> -		entry = pmd_to_swp_entry(old_pmd);
> -		page = pfn_swap_entry_to_page(entry);
> -		write = is_writable_migration_entry(entry);
> +
> +		folio = pfn_swap_entry_folio(swp_entry);
> +		VM_BUG_ON(!is_migration_entry(swp_entry) &&
> +			  !is_device_private_entry(swp_entry));
> +		page = pfn_swap_entry_to_page(swp_entry);
> +		write = is_writable_migration_entry(swp_entry);
> +
>  		if (PageAnon(page))
> -			anon_exclusive = is_readable_exclusive_migration_entry(entry);
> -		young = is_migration_entry_young(entry);
> -		dirty = is_migration_entry_dirty(entry);
> +			anon_exclusive =
> +				is_readable_exclusive_migration_entry(swp_entry);
>  		soft_dirty = pmd_swp_soft_dirty(old_pmd);
>  		uffd_wp = pmd_swp_uffd_wp(old_pmd);
> +		young = is_migration_entry_young(swp_entry);
> +		dirty = is_migration_entry_dirty(swp_entry);
>  	} else {

This is where folio_try_share_anon_rmap_pmd() is skipped for device
private pages, to which I referred in
https://lore.kernel.org/linux-mm/f1e26e18-83db-4c0e-b8d8-0af8ffa8a206@redhat.com/

--Mika
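P.S. To make the above concrete: in the present-PMD branch (the
"} else {" the quoted hunk ends on), mainline does roughly the following
before freezing the entry. This is paraphrased from my reading of
__split_huge_pmd_locked(), not copied verbatim:

        /*
         * Present anon THP: before the PMD can be frozen for the split,
         * an exclusive anon page must be successfully marked shared,
         * otherwise the freeze is abandoned and only the PMD is split.
         */
        anon_exclusive = PageAnon(page) && PageAnonExclusive(page);
        if (freeze && anon_exclusive &&
            folio_try_share_anon_rmap_pmd(folio, page))
                freeze = false;

With this patch, a device private PMD takes the !present branch above and
never reaches this call, so the anon_exclusive handling for such pages
rests entirely on the entry bits set at migration time.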