[PATCH] arm64: Allow vmalloc regions to be set with set_memory_*

All of lore.kernel.org
 help / color / mirror / Atom feed

From: qiuxishi@huawei.com (Xishi Qiu)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] arm64: Allow vmalloc regions to be set with set_memory_*
Date: Thu, 28 Jan 2016 09:47:20 +0800	[thread overview]
Message-ID: <56A97328.9070003@huawei.com> (raw)
In-Reply-To: <20160118115640.GK21067@leverpostej>

On 2016/1/18 19:56, Mark Rutland wrote:

> On Wed, Jan 13, 2016 at 05:10:31PM +0100, Ard Biesheuvel wrote:
>> On 13 January 2016 at 15:03, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>>> On 12 January 2016 at 22:46, Laura Abbott <labbott@fedoraproject.org> wrote:
>>>>
>>>> The range of set_memory_* is currently restricted to the module address
>>>> range because of difficulties in breaking down larger block sizes.
>>>> vmalloc maps PAGE_SIZE pages so it is safe to use as well. Update the
>>>> function ranges and add a comment explaining why the range is restricted
>>>> the way it is.
>>>>
>>>> Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
>>>> ---
>>>> This should let the protections for the eBPF work as expected, I don't
>>>> know if there is some sort of self test for thatL.
>>>
>>>
>>> This is going to conflict with my KASLR implementation, since it puts
>>> the kernel image right in the middle of the vmalloc area, and the
>>> kernel is obviously mapped with block mappings. In fact, I am
>>> proposing enabling huge-vmap for arm64 as well, since it seems an
>>> improvement generally, but also specifically allows me to unmap the
>>> __init section using the generic vunmap code (remove_vm_area). But in
>>> general, I think the assumption that the whole vmalloc area is mapped
>>> using pages is not tenable.
>>>
>>> AFAICT, vmalloc still use pages exclusively even with huge-vmap (but
>>> ioremap does not). So perhaps it would make sense to check for the
>>> VM_ALLOC bit in the VMA flags (which I will not set for the kernel
>>> regions either)
>>>
>>
>> Something along these lines, perhaps?
>>
>> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
>> index 3571c7309c5e..bda0a776c58e 100644
>> --- a/arch/arm64/mm/pageattr.c
>> +++ b/arch/arm64/mm/pageattr.c
>> @@ -13,6 +13,7 @@
>>  #include <linux/kernel.h>
>>  #include <linux/mm.h>
>>  #include <linux/module.h>
>> +#include <linux/vmalloc.h>
>>  #include <linux/sched.h>
>>
>>  #include <asm/pgtable.h>
>> @@ -44,6 +45,7 @@ static int change_memory_common(unsigned long addr
>>         unsigned long end = start + size;
>>         int ret;
>>         struct page_change_data data;
>> +       struct vm_struct *area;
>>
>>         if (!PAGE_ALIGNED(addr)) {
>>                 start &= PAGE_MASK;
>> @@ -51,10 +53,14 @@ static int change_memory_common(unsigned long addr,
>>                 WARN_ON_ONCE(1);
>>         }
>>
>> -       if (start < MODULES_VADDR || start >= MODULES_END)
>> -               return -EINVAL;
>> -
>> -       if (end < MODULES_VADDR || end >= MODULES_END)
>> +       /*
>> +        * Check whether the [addr, addr + size) interval is entirely
>> +        * covered by precisely one VM area that has the VM_ALLOC flag set
>> +        */
>> +       area = find_vm_area((void *)addr);
>> +       if (!area ||
>> +           end > (unsigned long)area->addr + area->size ||
>> +           !(area->flags & VM_ALLOC))
>>                 return -EINVAL;
>>
>>         data.set_mask = set_mask;
> 
> Neat. That fixes the fencepost bug too.
> 
> Looks good to me, though as Laura suggested we should have a comment as
> to why we limit changes to such regions. Fancy taking her wording below
> and spinning this as a patch?
> 
>>>> +       /*
>>>> +        * This check explicitly excludes most kernel memory. Most kernel
>>>> +        * memory is mapped with a larger page size and breaking down the
>>>> +        * larger page size without causing TLB conflicts is very difficult.
>>>> +        *
>>>> +        * If you need to call set_memory_* on a range, the recommendation is
>>>> +        * to use vmalloc since that range is mapped with pages.
>>>> +        */
> 
> Thanks,
> Mark.
> 

Hi Mark,

After change the flag, it calls only flush_tlb_kernel_range(), so why not use 
cpu_replace_ttbr1(swapper_pg_dir)? 

One more question, does TLB conflict only affect kernel page talbe?
There is no problem when spliting the transparent hugepage, right?

Thanks,
Xishi Qiu

> .
>

WARNING: multiple messages have this Message-ID (diff)

From: Xishi Qiu <qiuxishi@huawei.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Laura Abbott <labbott@fedoraproject.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Kees Cook <keescook@chromium.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	zhong jiang <zhongjiang@huawei.com>
Subject: Re: [PATCH] arm64: Allow vmalloc regions to be set with set_memory_*
Date: Thu, 28 Jan 2016 09:47:20 +0800	[thread overview]
Message-ID: <56A97328.9070003@huawei.com> (raw)
In-Reply-To: <20160118115640.GK21067@leverpostej>

On 2016/1/18 19:56, Mark Rutland wrote:

> On Wed, Jan 13, 2016 at 05:10:31PM +0100, Ard Biesheuvel wrote:
>> On 13 January 2016 at 15:03, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>>> On 12 January 2016 at 22:46, Laura Abbott <labbott@fedoraproject.org> wrote:
>>>>
>>>> The range of set_memory_* is currently restricted to the module address
>>>> range because of difficulties in breaking down larger block sizes.
>>>> vmalloc maps PAGE_SIZE pages so it is safe to use as well. Update the
>>>> function ranges and add a comment explaining why the range is restricted
>>>> the way it is.
>>>>
>>>> Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
>>>> ---
>>>> This should let the protections for the eBPF work as expected, I don't
>>>> know if there is some sort of self test for thatL.
>>>
>>>
>>> This is going to conflict with my KASLR implementation, since it puts
>>> the kernel image right in the middle of the vmalloc area, and the
>>> kernel is obviously mapped with block mappings. In fact, I am
>>> proposing enabling huge-vmap for arm64 as well, since it seems an
>>> improvement generally, but also specifically allows me to unmap the
>>> __init section using the generic vunmap code (remove_vm_area). But in
>>> general, I think the assumption that the whole vmalloc area is mapped
>>> using pages is not tenable.
>>>
>>> AFAICT, vmalloc still use pages exclusively even with huge-vmap (but
>>> ioremap does not). So perhaps it would make sense to check for the
>>> VM_ALLOC bit in the VMA flags (which I will not set for the kernel
>>> regions either)
>>>
>>
>> Something along these lines, perhaps?
>>
>> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
>> index 3571c7309c5e..bda0a776c58e 100644
>> --- a/arch/arm64/mm/pageattr.c
>> +++ b/arch/arm64/mm/pageattr.c
>> @@ -13,6 +13,7 @@
>>  #include <linux/kernel.h>
>>  #include <linux/mm.h>
>>  #include <linux/module.h>
>> +#include <linux/vmalloc.h>
>>  #include <linux/sched.h>
>>
>>  #include <asm/pgtable.h>
>> @@ -44,6 +45,7 @@ static int change_memory_common(unsigned long addr
>>         unsigned long end = start + size;
>>         int ret;
>>         struct page_change_data data;
>> +       struct vm_struct *area;
>>
>>         if (!PAGE_ALIGNED(addr)) {
>>                 start &= PAGE_MASK;
>> @@ -51,10 +53,14 @@ static int change_memory_common(unsigned long addr,
>>                 WARN_ON_ONCE(1);
>>         }
>>
>> -       if (start < MODULES_VADDR || start >= MODULES_END)
>> -               return -EINVAL;
>> -
>> -       if (end < MODULES_VADDR || end >= MODULES_END)
>> +       /*
>> +        * Check whether the [addr, addr + size) interval is entirely
>> +        * covered by precisely one VM area that has the VM_ALLOC flag set
>> +        */
>> +       area = find_vm_area((void *)addr);
>> +       if (!area ||
>> +           end > (unsigned long)area->addr + area->size ||
>> +           !(area->flags & VM_ALLOC))
>>                 return -EINVAL;
>>
>>         data.set_mask = set_mask;
> 
> Neat. That fixes the fencepost bug too.
> 
> Looks good to me, though as Laura suggested we should have a comment as
> to why we limit changes to such regions. Fancy taking her wording below
> and spinning this as a patch?
> 
>>>> +       /*
>>>> +        * This check explicitly excludes most kernel memory. Most kernel
>>>> +        * memory is mapped with a larger page size and breaking down the
>>>> +        * larger page size without causing TLB conflicts is very difficult.
>>>> +        *
>>>> +        * If you need to call set_memory_* on a range, the recommendation is
>>>> +        * to use vmalloc since that range is mapped with pages.
>>>> +        */
> 
> Thanks,
> Mark.
> 

Hi Mark,

After change the flag, it calls only flush_tlb_kernel_range(), so why not use 
cpu_replace_ttbr1(swapper_pg_dir)? 

One more question, does TLB conflict only affect kernel page talbe?
There is no problem when spliting the transparent hugepage, right?

Thanks,
Xishi Qiu

> .
>

next prev parent reply	other threads:[~2016-01-28  1:47 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-12 21:46 [PATCH] arm64: Allow vmalloc regions to be set with set_memory_* Laura Abbott
2016-01-12 21:46 ` Laura Abbott
2016-01-13  0:01 ` Alexei Starovoitov
2016-01-13  0:01   ` Alexei Starovoitov
2016-01-13  0:31   ` Daniel Borkmann
2016-01-13  0:31     ` Daniel Borkmann
2016-01-13 14:03 ` Ard Biesheuvel
2016-01-13 14:03   ` Ard Biesheuvel
2016-01-13 16:10   ` Ard Biesheuvel
2016-01-13 16:10     ` Ard Biesheuvel
2016-01-14 23:01     ` Laura Abbott
2016-01-14 23:01       ` Laura Abbott
2016-01-18 11:56     ` Mark Rutland
2016-01-18 11:56       ` Mark Rutland
2016-01-28  1:47       ` Xishi Qiu [this message]
2016-01-28  1:47         ` Xishi Qiu
2016-01-28 10:51         ` Mark Rutland
2016-01-28 10:51           ` Mark Rutland
2016-01-28 11:47           ` Xishi Qiu
2016-01-28 11:47             ` Xishi Qiu
2016-01-28 14:27             ` Mark Rutland
2016-01-28 14:27               ` Mark Rutland
2016-01-29  1:21               ` Xishi Qiu
2016-01-29  1:21                 ` Xishi Qiu
2016-01-29 11:02                 ` Mark Rutland
2016-01-29 11:02                   ` Mark Rutland
2016-01-30  2:48                   ` Xishi Qiu
2016-01-30  2:48                     ` Xishi Qiu
2016-02-03 13:43                     ` Mark Rutland
2016-02-03 13:43                       ` Mark Rutland
  -- strict thread matches above, loose matches on Subject: below --
2016-01-18 14:01 [PATCH] arm64: allow " Ard Biesheuvel
2016-01-18 15:05 ` Mark Rutland
2016-01-28 15:08 ` Will Deacon
2016-01-28 16:40   ` Ard Biesheuvel
2016-01-28 16:43   ` Ard Biesheuvel
2016-01-28 18:10     ` Will Deacon
2016-01-28 19:07       ` Ard Biesheuvel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56A97328.9070003@huawei.com \
    --to=qiuxishi@huawei.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.