linux-arm-kernel.lists.infradead.org archive mirror
From: Yang Shi <yang@os.amperecomputing.com>
To: Dev Jain <dev.jain@arm.com>, Will Deacon <will@kernel.org>
Cc: catalin.marinas@arm.com, ryan.roberts@arm.com, rppt@kernel.org,
	shijie@os.amperecomputing.com,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH] arm64/pageattr: Propagate return value from __change_memory_common
Date: Mon, 10 Nov 2025 20:17:03 -0800	[thread overview]
Message-ID: <c1701ce9-c8b7-4ac8-8dd4-930af3dad7d2@os.amperecomputing.com> (raw)
In-Reply-To: <17eed751-e1c5-4ea5-af1d-e96da16d5e26@arm.com>



On 11/10/25 7:39 PM, Dev Jain wrote:
>
> On 05/11/25 9:27 am, Dev Jain wrote:
>>
>> On 04/11/25 6:26 pm, Will Deacon wrote:
>>> On Tue, Nov 04, 2025 at 09:06:12AM +0530, Dev Jain wrote:
>>>> On 04/11/25 12:15 am, Yang Shi wrote:
>>>>> On 11/3/25 7:16 AM, Will Deacon wrote:
>>>>>> On Mon, Nov 03, 2025 at 11:43:06AM +0530, Dev Jain wrote:
>>>>>>> Post a166563e7ec3 ("arm64: mm: support large block mapping when
>>>>>>> rodata=full"), __change_memory_common has a real chance of failing
>>>>>>> due to split failure. Before that commit, this line was introduced
>>>>>>> in c55191e96caa, still having a chance of failing if it needs to
>>>>>>> allocate pagetable memory in apply_to_page_range, although that has
>>>>>>> never been observed to be true. In general, we should always
>>>>>>> propagate the return value to the caller.
>>>>>>>
>>>>>>> Cc: stable@vger.kernel.org
>>>>>>> Fixes: c55191e96caa ("arm64: mm: apply r/o permissions of VM
>>>>>>> areas to its linear alias as well")
>>>>>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>>>>>> ---
>>>>>>> Based on Linux 6.18-rc4.
>>>>>>>
>>>>>>>    arch/arm64/mm/pageattr.c | 5 ++++-
>>>>>>>    1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
>>>>>>> index 5135f2d66958..b4ea86cd3a71 100644
>>>>>>> --- a/arch/arm64/mm/pageattr.c
>>>>>>> +++ b/arch/arm64/mm/pageattr.c
>>>>>>> @@ -148,6 +148,7 @@ static int change_memory_common(unsigned long addr, int numpages,
>>>>>>>  	unsigned long size = PAGE_SIZE * numpages;
>>>>>>>  	unsigned long end = start + size;
>>>>>>>  	struct vm_struct *area;
>>>>>>> +	int ret;
>>>>>>>  	int i;
>>>>>>>  
>>>>>>>  	if (!PAGE_ALIGNED(addr)) {
>>>>>>> @@ -185,8 +186,10 @@ static int change_memory_common(unsigned long addr, int numpages,
>>>>>>>  	if (rodata_full && (pgprot_val(set_mask) == PTE_RDONLY ||
>>>>>>>  			    pgprot_val(clear_mask) == PTE_RDONLY)) {
>>>>>>>  		for (i = 0; i < area->nr_pages; i++) {
>>>>>>> -			__change_memory_common((u64)page_address(area->pages[i]),
>>>>>>> +			ret = __change_memory_common((u64)page_address(area->pages[i]),
>>>>>>>  					       PAGE_SIZE, set_mask, clear_mask);
>>>>>>> +			if (ret)
>>>>>>> +				return ret;
>>>>>> Hmm, this means we can return failure half-way through the operation.
>>>>>> Is that something callers are expecting to handle? If so, how can
>>>>>> they tell how far we got?
>>>>> IIUC the callers don't have to know whether it stopped half-way or
>>>>> not, because they will change the permissions back (e.g. to RW) for
>>>>> the whole range when freeing the memory.
>>>> Yes, it is the caller's responsibility to set the VM_FLUSH_RESET_PERMS
>>>> flag. Upon vfree(), it will change the direct map permissions back to RW.
>>> Ok, but vfree() ends up using update_range_prot() to do that and if we
>>> need to worry about that failing (as per your commit message), then
>>> we're in trouble because the calls to set_area_direct_map() are unchecked.
>>>
>>> In other words, this patch is either not necessary or it is incomplete.
>>
>> Here is the relevant email, in the discussion between Ryan and Yang:
>>
>> https://lore.kernel.org/all/fe52a1d8-5211-4962-afc8-c3f9caf64119@os.amperecomputing.com/
>>
>> We had concluded that all callers of set_memory_ro() or set_memory_rox()
>> (which require the linear map permissions to be changed back to default
>> upon vfree()) will call it for the entire region (vm_struct). So, when
>> we do the set_direct_map_invalid_noflush(), it is guaranteed that the
>> region has already been split, and therefore this call cannot fail.
>>
>> https://lore.kernel.org/all/f8898c87-8f49-4ef2-86ae-b60bcf67658c@os.amperecomputing.com/
>>
>> This email notes that there is some code doing set_memory_rw() and
>> unnecessarily setting the VM_FLUSH_RESET_PERMS flag, but in that case we
>> don't care about the set_direct_map_invalid_noflush() call failing,
>> because the protections are already RW.
>>
>> We had also observed, though, that all of this is fragile and depends
>> on the caller doing the correct thing. The real solution would be to
>> somehow get rid of the BBM-style invalidation; Ryan had proposed some
>> approaches in that email thread.
>>
>> One solution I had thought of: observe that it is overkill to set the
>> linear map to invalid and then back to default for the *entire* region.
>> What we could do instead is iterate over the linear map alias of the
>> vm_struct *area and only change permissions back to RW for the pages
>> which are *not* RW. Those relevant mappings are guaranteed to already
>> be split, because they were changed from RW to non-RW in the first place.
>
> @Yang and Ryan,
>
> I saw Yang's patch here:
> https://lore.kernel.org/all/20251023204428.477531-1-yang@os.amperecomputing.com/
>
> and realized that we currently split the linear map alias of the *entire*
> region. Shouldn't that imply that set_direct_map_invalid_noflush() will
> never fail, since even a set_memory_rox() call on a single page will
> split the linear map for the entire region, and thus there is none of
> the fragility we were discussing? I may be forgetting something; this
> linear map stuff is confusing enough already.

It may still fail due to a page table allocation failure when doing the
split, but that is still fine. We may run into three cases (see also the
sketch below):

1. set_memory_rox() succeeds in splitting the whole range; then
   set_direct_map_invalid_noflush() will succeed too.
2. set_memory_rox() fails to split, e.g. it only changes the permissions
   of a partial range due to a page table allocation failure; then
   set_direct_map_invalid_noflush() may
    a. successfully change the permissions back to default up to the point
       where set_memory_rox() failed, since that range has already been
       split. This is OK, because the remaining range was never changed to
       RO by set_memory_rox() in the first place.
    b. successfully change the permissions back to default for the whole
       range (for example, if the memory pressure has eased by the time
       set_direct_map_invalid_noflush() is called). That is definitely
       fine as well.

Hopefully I'm not missing anything.

Thanks,
Yang


>
>
>>
>>>
>>> Will
>>
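
For reference, the alternative Dev describes further up (iterating over
the area and only resetting the linear-map aliases that were actually
changed away from RW) might look roughly like the sketch below. Note that
linear_alias_is_writable() is a hypothetical predicate, not an existing
kernel helper; arm64 would need to walk the linear map (or track the
state) to implement it:

#include <linux/set_memory.h>
#include <linux/vmalloc.h>

bool linear_alias_is_writable(struct page *page);	/* hypothetical */

/* Sketch only; linear_alias_is_writable() does not exist today. */
static void reset_changed_aliases_sketch(struct vm_struct *area)
{
	unsigned int i;

	for (i = 0; i < area->nr_pages; i++) {
		struct page *page = area->pages[i];

		/* Skip pages whose alias was never made RO/ROX. */
		if (linear_alias_is_writable(page))
			continue;

		/*
		 * This alias was changed away from RW, so it is already
		 * split to page granularity and restoring the default
		 * permissions cannot fail for lack of page table memory.
		 */
		set_direct_map_default_noflush(page);
	}
}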




Thread overview: 22+ messages
2025-11-03  6:13 [PATCH] arm64/pageattr: Propagate return value from __change_memory_common Dev Jain
2025-11-03  7:48 ` Anshuman Khandual
2025-11-03  8:34   ` Dev Jain
2025-11-03 15:16 ` Will Deacon
2025-11-03 18:45   ` Yang Shi
2025-11-04  3:36     ` Dev Jain
2025-11-04 12:56       ` Will Deacon
2025-11-04 13:22         ` Dev Jain
2025-11-05  3:57         ` Dev Jain
2025-11-11  3:39           ` Dev Jain
2025-11-11  4:17             ` Yang Shi [this message]
2025-11-11  4:37               ` Dev Jain
2025-11-11  4:44                 ` Yang Shi
2025-11-11  4:55                   ` Dev Jain
2025-11-11  5:08                     ` Yang Shi
2025-11-11  5:12                       ` Dev Jain
2025-11-11 17:52                         ` Ryan Roberts
2025-11-11 23:59                           ` Yang Shi
2025-11-12  3:50                           ` Dev Jain
2025-11-11 23:45                         ` Yang Shi
2025-11-12  3:47                           ` Dev Jain
2025-11-12  5:59                 ` Dev Jain
