From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: Muchun Song <songmuchun@bytedance.com>
Cc: catalin.marinas@arm.com, will@kernel.org, arnd@arndb.de,
mike.kravetz@oracle.com, akpm@linux-foundation.org,
sj@kernel.org, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH 0/3] Introduce new huge_ptep_get_access_flags() interface
Date: Mon, 9 May 2022 09:34:51 +0800
Message-ID: <97bb8f7e-38ce-4a21-fb76-4bd040ec00b7@linux.alibaba.com>
In-Reply-To: <YnfhHejDgjgyqEcb@FVFYT0MHHV2J.usts.net>
On 5/8/2022 11:26 PM, Muchun Song wrote:
> On Sun, May 08, 2022 at 04:58:51PM +0800, Baolin Wang wrote:
>> Hi,
>>
>> As Mike pointed out [1], the huge_ptep_get() will only return one specific
>> pte value for the CONT-PTE or CONT-PMD size hugetlb on ARM64 system, which
>> will not take into account the subpages' dirty or young bits of a CONT-PTE/PMD
>> size hugetlb page. That will make us miss dirty or young flags of a CONT-PTE/PMD
>> size hugetlb page for those functions that want to check the dirty or
>> young flags of a hugetlb page. For example, the gather_hugetlb_stats() will
>> get inaccurate dirty hugetlb page statistics, and the DAMON for hugetlb monitoring
>> will also get inaccurate access statistics.
>>
>> To fix this issue, one approach is that we can define an ARM64 specific huge_ptep_get()
>> implementation, which will take into account any subpages' dirty or young bits.
>
> IIUC, we could get the page size by page_size(pte_page(pte)).
> So, how about the following implementation of huge_ptep_get()?
> Does this work for you?
>
> pte_t huge_ptep_get(pte_t *ptep)
> {
> 	int ncontig, i;
> 	size_t pgsize;
> 	pte_t orig_pte = ptep_get(ptep);
>
> 	if (!pte_present(orig_pte) || !pte_cont(orig_pte))
> 		return orig_pte;
>
> 	ncontig = num_contig_ptes(page_size(pte_page(orig_pte)), &pgsize);
>
> 	for (i = 0; i < ncontig; i++, ptep++) {
> 		pte_t pte = ptep_get(ptep);
>
> 		if (pte_dirty(pte))
> 			orig_pte = pte_mkdirty(orig_pte);
>
> 		if (pte_young(pte))
> 			orig_pte = pte_mkyoung(orig_pte);
> 	}
>
> 	return orig_pte;
> }

Thanks for your suggestion. I think this works for me, and it looks more
straightforward in case other functions using huge_ptep_get() come to care
about the young or dirty bits in the future.

My only concern is that every caller of huge_ptep_get() will now walk the
whole contPTE range to collect the dirty and accessed bits, even though
most callers do not care about those bits, so the call becomes a bit more
expensive for them. Matthew also mentioned this in his comments. Anyway, I
still think your suggestion is straightforward, and I can switch to it in
the next version if there are no other objections.
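
To make the cost concern concrete, here is a minimal userspace sketch of the
folding idea, not kernel code. The names model_pte_t, MODEL_PTE_YOUNG,
MODEL_PTE_DIRTY, model_fold_access_flags() and model_get_first() are all
hypothetical, and the flag bit positions are made up for the model rather
than taken from the arm64 PTE layout. It only illustrates that a folding
getter reads every entry in the contiguous range, while a caller that ignores
the young/dirty bits needs just one read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical flag bits for this userspace model only; they are NOT
 * the real arm64 PTE bit positions.
 */
#define MODEL_PTE_YOUNG (UINT64_C(1) << 0)
#define MODEL_PTE_DIRTY (UINT64_C(1) << 1)

typedef uint64_t model_pte_t;

/*
 * Return the first entry of a contiguous range with the young and dirty
 * bits of every entry in the range folded in. This costs ncontig reads.
 */
static model_pte_t model_fold_access_flags(const model_pte_t *ptep,
					   size_t ncontig)
{
	model_pte_t folded;
	size_t i;

	assert(ptep != NULL && ncontig > 0);
	folded = ptep[0];

	for (i = 1; i < ncontig; i++)
		folded |= ptep[i] & (MODEL_PTE_YOUNG | MODEL_PTE_DIRTY);

	return folded;
}

/* What a caller that ignores young/dirty needs: a single read. */
static model_pte_t model_get_first(const model_pte_t *ptep)
{
	return ptep[0];
}
```

In this model, a range whose first entry is clean but whose later entries
carry the dirty or young bit only reports those bits through the folding
getter, which is the behaviour the suggestion above is after, at the price of
the extra reads for every caller.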