Date: Fri, 19 Sep 2025 18:53:05 +0800
Subject: Re: [PATCH v5 2/6] mm: remap unused subpages to shared zeropage when splitting isolated thp
From: Lance Yang
To: David Hildenbrand
Cc: Qun-wei Lin, catalin.marinas@arm.com, usamaarif642@gmail.com,
 linux-mm@kvack.org, yuzhao@google.com, akpm@linux-foundation.org,
 corbet@lwn.net, Andrew Yang, npache@redhat.com, rppt@kernel.org,
 willy@infradead.org, kernel-team@meta.com, roman.gushchin@linux.dev,
 hannes@cmpxchg.org, cerasuolodomenico@gmail.com,
 linux-kernel@vger.kernel.org, ryncsn@gmail.com, surenb@google.com,
 riel@surriel.com, shakeel.butt@linux.dev, Chinwen Chang,
 linux-doc@vger.kernel.org, Casper Li, ryan.roberts@arm.com,
 linux-mediatek@lists.infradead.org, baohua@kernel.org,
 kaleshsingh@google.com, zhais@google.com,
 linux-arm-kernel@lists.infradead.org
References: <20240830100438.3623486-1-usamaarif642@gmail.com>
 <20240830100438.3623486-3-usamaarif642@gmail.com>
 <434c092b-0f19-47bf-a5fa-ea5b4b36c35e@redhat.com>
 <120445c8-7250-42e0-ad6a-978020c8fad3@linux.dev>
 <9d2c3e3e-439d-4695-b7c9-21fa52f48ced@redhat.com>
 <4cf41cd5-e93a-412b-b209-4180bd2d4015@linux.dev>
In-Reply-To: <4cf41cd5-e93a-412b-b209-4180bd2d4015@linux.dev>

On 2025/9/19 16:14, Lance Yang wrote:
>
>
> On 2025/9/19 15:55, David Hildenbrand wrote:
>>>> I think where possible we really only want to identify problematic
>>>> (tagged) pages and skip them. And we should either look into fixing KSM
>>>> as well or finding out why KSM is not affected.
>>>
>>> Yeah. Seems like we could introduce a new helper,
>>> folio_test_mte_tagged(struct folio *folio). By default, it would
>>> return false, and architectures like arm64 can override it.
>>
>> If we add a new helper it should instead express the semantics that we
>> cannot deduplicate.
>
> Agreed.
>
>>
>> For THP, I recall that only some pages might be tagged. So likely we
>> want to check per page.
>
> Yes, a per-page check would be simpler.
>
>>
>>>
>>> Looking at the code, the PG_mte_tagged flag is not set for regular THP.
>>
>> I think it's supported for THP per page. Only for hugetlb we tag the
>> whole thing through the head page instead of individual pages.
>
> Right. That's exactly what I meant.
>
>>
>>> The MTE status actually comes from the VM_MTE flag in the VMA that
>>> maps it.
>>>
>>
>> During the rmap walk we could check the VMA flag, but there would be
>> no way to just stop the THP shrinker scanning this page early.
>>
>>> static inline bool folio_test_hugetlb_mte_tagged(struct folio *folio)
>>> {
>>>     bool ret = test_bit(PG_mte_tagged, &folio->flags.f);
>>>
>>>     VM_WARN_ON_ONCE(!folio_test_hugetlb(folio));
>>>
>>>     /*
>>>      * If the folio is tagged, ensure ordering with a likely subsequent
>>>      * read of the tags.
>>>      */
>>>     if (ret)
>>>         smp_rmb();
>>>     return ret;
>>> }
>>>
>>> static inline bool page_mte_tagged(struct page *page)
>>> {
>>>     bool ret = test_bit(PG_mte_tagged, &page->flags.f);
>>>
>>>     VM_WARN_ON_ONCE(folio_test_hugetlb(page_folio(page)));
>>>
>>>     /*
>>>      * If the page is tagged, ensure ordering with a likely subsequent
>>>      * read of the tags.
>>>      */
>>>     if (ret)
>>>         smp_rmb();
>>>     return ret;
>>> }
>>>
>>> contpte_set_ptes()
>>>     __set_ptes()
>>>         __set_ptes_anysz()
>>>             __sync_cache_and_tags()
>>>                 mte_sync_tags()
>>>                     set_page_mte_tagged()
>>>
>>> Then, having the THP shrinker skip any folios that are identified as
>>> MTE-tagged.
>>
>> Likely we should just do something like (maybe we want better naming)
>>
>> #ifndef page_is_mergable
>> #define page_is_mergable(page) (true)
>> #endif
>
>
> Maybe something like page_is_optimizable()? Just a thought ;p
>
>>
>> And for arm64 have it be
>>
>> #define page_is_mergable(page) (!page_mte_tagged(page))
>>
>>
>> And then do
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 1f0813b956436..1cac9093918d6 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -4251,7 +4251,8 @@ static bool thp_underused(struct folio *folio)
>>
>>          for (i = 0; i < folio_nr_pages(folio); i++) {
>>                  kaddr = kmap_local_folio(folio, i * PAGE_SIZE);
>> -               if (!memchr_inv(kaddr, 0, PAGE_SIZE)) {
>> +               if (page_is_mergable(folio_page(folio, i)) &&
>> +                   !memchr_inv(kaddr, 0, PAGE_SIZE)) {
>>                          num_zero_pages++;
>>                          if (num_zero_pages > khugepaged_max_ptes_none) {
>>                                  kunmap_local(kaddr);
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 946253c398072..476a9a9091bd3 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -306,6 +306,8 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
>>
>>          if (PageCompound(page))
>>                  return false;
>> +       if (!page_is_mergable(page))
>> +               return false;
>>          VM_BUG_ON_PAGE(!PageAnon(page), page);
>>          VM_BUG_ON_PAGE(!PageLocked(page), page);
>>          VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
>
> Looks good to me!
>
>>
>>
>> For KSM, similarly just bail out early. But still wondering if this is
>> already checked somehow for KSM.
>
> +1 I'm looking for a machine to test it on.

Interestingly, it seems KSM is already skipping MTE-tagged pages. My test,
running on a v6.8.0 kernel inside QEMU (with MTE enabled), shows no merging
activity for those pages ...
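
One possible explanation, and this is only my guess until I can confirm it
in the tree (if I remember right the override lives in
arch/arm64/kernel/mte.c), is that arm64 already provides its own
memcmp_pages(), which reports a mismatch whenever either page carries MTE
tags. Since KSM's pages_identical() is built on memcmp_pages(), that alone
would keep tagged pages from ever being merged. A rough sketch of what I
have in mind; the in-tree version may differ in the details:

/*
 * Sketch of an arm64 memcmp_pages() override (assumption on my side).
 * Returning non-zero here makes pages_identical() fail, so KSM never
 * treats a tagged page as a merge candidate.
 */
int memcmp_pages(struct page *page1, struct page *page2)
{
	char *addr1 = page_address(page1);
	char *addr2 = page_address(page2);
	int ret = memcmp(addr1, addr2, PAGE_SIZE);

	if (!system_supports_mte() || ret)
		return ret;

	/*
	 * The data matches, but if either page is tagged we still report a
	 * difference: merging could otherwise discard or change the tags of
	 * the page that gets thrown away.
	 */
	if (page_mte_tagged(page1) || page_mte_tagged(page2))
		return addr1 != addr2;

	return ret;
}

If that is indeed what stops KSM, then KSM may already be covered and only
the THP shrinker path would need the explicit page_is_mergable() check from
the diff above. I'll verify once I have the test setup running.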