public inbox for linux-kernel@vger.kernel.org
From: Yang Shi <yang@os.amperecomputing.com>
To: Catalin Marinas <catalin.marinas@arm.com>,
	David Hildenbrand <david@redhat.com>
Cc: Sasha Levin <sashal@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Will Deacon <will@kernel.org>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [GIT PULL] arm64 updates for 6.13-rc1
Date: Wed, 4 Dec 2024 08:00:16 -0800
Message-ID: <5f99ff7d-67dc-41da-8a90-a1a5e76b8daa@os.amperecomputing.com>
In-Reply-To: <Z1B6OMqEZitgBVEx@arm.com>



On 12/4/24 7:50 AM, Catalin Marinas wrote:
> On Wed, Dec 04, 2024 at 04:32:11PM +0100, David Hildenbrand wrote:
>> On 04.12.24 16:29, Catalin Marinas wrote:
>>> On Mon, Dec 02, 2024 at 08:22:57AM -0800, Yang Shi wrote:
>>>> On 11/28/24 1:56 AM, David Hildenbrand wrote:
>>>>> On 28.11.24 02:21, Yang Shi wrote:
>>>>>>>> diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
>>>>>>>> index 87b3f1a25535..ef303a2262c5 100644
>>>>>>>> --- a/arch/arm64/mm/copypage.c
>>>>>>>> +++ b/arch/arm64/mm/copypage.c
>>>>>>>> @@ -30,9 +30,9 @@ void copy_highpage(struct page *to, struct page *from)
>>>>>>>>           if (!system_supports_mte())
>>>>>>>>               return;
>>>>>>>> -    if (folio_test_hugetlb(src) &&
>>>>>>>> -        folio_test_hugetlb_mte_tagged(src)) {
>>>>>>>> -        if (!folio_try_hugetlb_mte_tagging(dst))
>>>>>>>> +    if (folio_test_hugetlb(src)) {
>>>>>>>> +        if (!folio_test_hugetlb_mte_tagged(src) ||
>>>>>>>> +            !folio_try_hugetlb_mte_tagging(dst))
>>>>>>>>                   return;
>>>>>>>>               /*
>>>>>>> I wonder why we had a 'return' here originally rather than a
>>>>>>> WARN_ON_ONCE() as we do further down for the page case. Do you see any
>>>>>>> issue with the hunk below? The destination should be a new folio and
>>>>>>> not tagged yet:
>>>>>> Yes, I did see a problem. We copy the tags for all subpages and then
>>>>>> set the folio as MTE tagged when copying the data for the first
>>>>>> subpage, so the warning is triggered when we copy the second subpage.
>>>>> It's rather weird, though. We're instructed to copy a single page, yet
>>>>> copy tags for all pages.
>>>>>
>>>>> This really only makes sense when called from folio_copy(), where we are
>>>>> guaranteed to copy all pages.
>>>>>
>>>>> I'm starting to wonder if we should be able to hook into / overload
>>>>> folio_copy() instead, to just handle the complete hugetlb copy ourselves
>>>>> in one shot, and assume that copy_highpage() will never be called for
>>>>> hugetlb pages (WARN and don't copy tags).
>>>> Actually folio_copy() is only called by migration. Copying a huge page
>>>> in CoW is more complicated and uses
>>>> copy_user_highpage()->copy_highpage() instead of folio_copy(). It may
>>>> start the page copy from any subpage. For example, if the CoW is
>>>> triggered by an access to an address in the middle of the 2M range, the
>>>> kernel may copy the second half first and then the first half, so that
>>>> the accessed data ends up in cache.
>>> Still trying to understand the possible call paths here. If we get a
>>> write fault on a large folio, does the core code allocate a folio of the
>>> same size for CoW or does it start with smaller ones? wp_page_copy()
>>> allocates order 0 AFAICT, though if it was a pmd fault, it takes a
>>> different path in handle_mm_fault(). But we also have huge pages using
>>> contiguous ptes.
>>>
>>> Unless the source and destination folios are exactly the same size, it
>>> will break many assumptions in the code above. The other way around is
>>> also wrong: if dst is larger than src, we are not initialising the
>>> whole dst folio.
>>>
>>> Maybe going back to per-page PG_mte_tagged flag rather than per-folio
>>> would keep things simple, less risk of wrong assumptions.
>> I think the magic bit here is that for hugetlb, we only get hugetlb folios
>> of the same size, and no mixtures.

Yes, hugetlb always allocates a folio of the same order for CoW. The
hugetlb CoW path is:

handle_mm_fault() ->
   hugetlb_fault() ->
     hugetlb_wp()

> Ah, ok, we do check for this and only do the advance copy for hugetlb
> folios. I'd add a check for folio size just in case, something like
> below (I'll add some description and post it properly):
>
> diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
> index 87b3f1a25535..c3a83db46ec6 100644
> --- a/arch/arm64/mm/copypage.c
> +++ b/arch/arm64/mm/copypage.c
> @@ -30,11 +30,14 @@ void copy_highpage(struct page *to, struct page *from)
>   	if (!system_supports_mte())
>   		return;
>   
> -	if (folio_test_hugetlb(src) &&
> -	    folio_test_hugetlb_mte_tagged(src)) {
> -		if (!folio_try_hugetlb_mte_tagging(dst))
> +	if (folio_test_hugetlb(src)) {
> +		if (!folio_test_hugetlb_mte_tagged(src) ||
> +		    from != folio_page(src, 0) ||
> +		    WARN_ON_ONCE(folio_nr_pages(src) != folio_nr_pages(dst)))

The check is ok, but TBH I don't see much benefit. The same order is
guaranteed by the hugetlb fault handler, and I don't think we will
support mixed orders for hugetlb in the foreseeable future.

>   			return;
>   
> +		WARN_ON_ONCE(!folio_try_hugetlb_mte_tagging(dst));
> +
>   		/*
>   		 * Populate tags for all subpages.
>   		 *
>
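
For illustration only, the guard in the quoted hunk can be modelled in
userspace roughly as below. This is a sketch under stated assumptions, not
kernel code: struct toy_folio, toy_copy_tags() and TOY_MAX_PAGES are
hypothetical names, the tags are plain bytes, and the WARN_ON_ONCE() and
folio_try_hugetlb_mte_tagging() steps are reduced to plain checks and a
flag assignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PAGES 8

/* Hypothetical userspace stand-in for a folio; not a kernel type. */
struct toy_folio {
	size_t nr_pages;                   /* number of subpages */
	bool hugetlb;                      /* models folio_test_hugetlb() */
	bool mte_tagged;                   /* models the per-folio tagged flag */
	unsigned char tags[TOY_MAX_PAGES]; /* one toy tag per subpage */
};

/*
 * Models only the tag handling when subpage 'idx' of src is copied.
 * Returns the number of subpage tags copied by this call.
 */
static size_t toy_copy_tags(struct toy_folio *dst,
			    const struct toy_folio *src, size_t idx)
{
	assert(src->nr_pages <= TOY_MAX_PAGES);

	if (!src->hugetlb)
		return 0;	/* the per-page path is not modelled here */

	/* Skip untagged sources, non-head subpages and size mismatches. */
	if (!src->mte_tagged || idx != 0 || src->nr_pages != dst->nr_pages)
		return 0;

	/* Populate tags for all subpages in one shot, then mark dst tagged. */
	for (size_t i = 0; i < src->nr_pages; i++)
		dst->tags[i] = src->tags[i];
	dst->mte_tagged = true;

	return src->nr_pages;
}
```

In this model, a CoW-style copy that visits the subpages in the order
2, 3, 0, 1 copies the tags exactly once, when subpage 0 is copied, and a
destination with a different page count is left untagged.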


