From: Christophe Leroy <christophe.leroy@csgroup.eu>
To: Kefeng Wang <wangkefeng.wang@huawei.com>,
Dave Hansen <dave.hansen@intel.com>,
Jonathan Corbet <corbet@lwn.net>,
Andrew Morton <akpm@linux-foundation.org>,
"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"x86@kernel.org" <x86@kernel.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
"H. Peter Anvin" <hpa@zytor.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Paul Mackerras <paulus@samba.org>,
Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH v2 3/3] x86: Support huge vmalloc mappings
Date: Sat, 15 Jan 2022 10:11:18 +0000
Message-ID: <21d6fc65-d9d1-66bb-9bea-a4bad78c7aac@csgroup.eu>
In-Reply-To: <3858de1f-cdbc-ff52-2890-4254d0f48b0a@huawei.com>
On 28/12/2021 at 11:26, Kefeng Wang wrote:
>
> On 2021/12/27 23:56, Dave Hansen wrote:
>> On 12/27/21 6:59 AM, Kefeng Wang wrote:
>>> This patch select HAVE_ARCH_HUGE_VMALLOC to let X86_64 and X86_PAE
>>> support huge vmalloc mappings.
>> In general, this seems interesting and the diff is simple. But, I don't
>> see _any_ x86-specific data. I think the bare minimum here would be a
>> few kernel compiles and some 'perf stat' data for some TLB events.
>
> When the feature supported on ppc,
>
> commit 8abddd968a303db75e4debe77a3df484164f1f33
> Author: Nicholas Piggin <npiggin@gmail.com>
> Date: Mon May 3 19:17:55 2021 +1000
>
> powerpc/64s/radix: Enable huge vmalloc mappings
>
> This reduces TLB misses by nearly 30x on a `git diff` workload on a
> 2-node POWER9 (59,800 -> 2,100) and reduces CPU cycles by 0.54%, due
> to vfs hashes being allocated with 2MB pages.
>
> But the data could be different on different machine/arch.
>
>>> diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
>>> index 95fa745e310a..6bf5cb7d876a 100644
>>> --- a/arch/x86/kernel/module.c
>>> +++ b/arch/x86/kernel/module.c
>>> @@ -75,8 +75,8 @@ void *module_alloc(unsigned long size)
>>> p = __vmalloc_node_range(size, MODULE_ALIGN,
>>> MODULES_VADDR + get_module_load_offset(),
>>> - MODULES_END, gfp_mask,
>>> - PAGE_KERNEL, VM_DEFER_KMEMLEAK, NUMA_NO_NODE,
>>> + MODULES_END, gfp_mask, PAGE_KERNEL,
>>> + VM_DEFER_KMEMLEAK | VM_NO_HUGE_VMAP, NUMA_NO_NODE,
>>> __builtin_return_address(0));
>>> if (p && (kasan_module_alloc(p, size, gfp_mask) < 0)) {
>>> vfree(p);
>> To figure out what's going on in this hunk, I had to look at the cover
>> letter (which I wasn't cc'd on). That's not great and it means that
>> somebody who stumbles upon this in the code is going to have a really
>> hard time figuring out what is going on. Cover letters don't make it
>> into git history.
> Sorry for that, will add more into arch's patch changelog.
>> This desperately needs a comment and some changelog material in *this*
>> patch.
>>
>> But, even the description from the cover letter is sparse:
>>
>>> There are some disadvantages about this feature[2], one of the main
>>> concerns is the possible memory fragmentation/waste in some scenarios,
>>> also archs must ensure that any arch specific vmalloc allocations that
>>> require PAGE_SIZE mappings(eg, module alloc with STRICT_MODULE_RWX)
>>> use the VM_NO_HUGE_VMAP flag to inhibit larger mappings.
>> That just says that x86 *needs* PAGE_SIZE allocations. But, what
>> happens if VM_NO_HUGE_VMAP is not passed (like it was in v1)? Will the
>> subsequent permission changes just fragment the 2M mapping?
>> .
>
> Yes, without VM_NO_HUGE_VMAP, it could fragment the 2M mapping.
>
> When module alloc with STRICT_MODULE_RWX on x86, it calls
> __change_page_attr() from set_memory_ro/rw/nx which will split large
> page, so there is no need to make module alloc with HUGE_VMALLOC.
>
Maybe there is no need to perform the module alloc with HUGE_VMALLOC,
but at least it would still work if you did so.
Powerpc added VM_NO_HUGE_VMAP only temporarily, for a reason that is
explained in a comment.
If x86 already has the necessary logic to handle it, why add
VM_NO_HUGE_VMAP?
Christophe
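The splitting behaviour discussed above (a permission change on part of a 2M mapping forcing it to be broken into PAGE_SIZE entries) can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: it is not kernel code, and `struct region`, `enum perm`, `split_huge()` and `set_perm()` are hypothetical names invented for the illustration, not the kernel's `__change_page_attr()` logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE      4096UL
#define HUGE_SIZE      (2UL * 1024 * 1024)
#define PTES_PER_HUGE  (HUGE_SIZE / PAGE_SIZE)   /* 512 small pages per 2M */

/* Hypothetical permission values, for illustration only. */
enum perm { PERM_RW, PERM_RO, PERM_RX };

/* Toy model of one 2M-aligned region: one huge entry or 512 small ones. */
struct region {
    bool is_huge;
    enum perm huge_perm;                 /* valid while is_huge is true  */
    enum perm pte_perm[PTES_PER_HUGE];   /* valid once the entry is split */
};

/* Split the huge entry into small entries that inherit its permission. */
static void split_huge(struct region *r)
{
    if (!r->is_huge)
        return;
    for (size_t i = 0; i < PTES_PER_HUGE; i++)
        r->pte_perm[i] = r->huge_perm;
    r->is_huge = false;
}

/*
 * Change the permission of `count` pages starting at page index `first`.
 * Returns true if the huge entry had to be split to honour the request.
 */
static bool set_perm(struct region *r, size_t first, size_t count, enum perm p)
{
    bool split = false;

    assert(first + count <= PTES_PER_HUGE);

    if (r->is_huge) {
        /* A change covering the whole region keeps the huge entry intact. */
        if (first == 0 && count == PTES_PER_HUGE) {
            r->huge_perm = p;
            return false;
        }
        /* A partial change cannot be expressed by one entry: split first. */
        split_huge(r);
        split = true;
    }

    for (size_t i = first; i < first + count; i++)
        r->pte_perm[i] = p;
    return split;
}
```

In this model, changing the permission of the whole region leaves the single huge entry in place, while changing only the first 16 pages splits it into 512 small entries. That is the fragmentation being asked about: the operation still works, but the benefit of the large mapping is lost for that region.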
Thread overview: 75+ messages
2021-12-27 14:59 [PATCH v2 0/3] mm: support huge vmalloc mapping on arm64/x86 Kefeng Wang
2021-12-27 14:59 ` [PATCH v2 1/3] mm: vmalloc: Let user to control huge vmalloc default behavior Kefeng Wang
2022-01-18 2:52 ` Nicholas Piggin
2022-01-19 12:57 ` Kefeng Wang
2022-01-19 13:22 ` Matthew Wilcox
2022-01-19 13:44 ` Kefeng Wang
2022-01-19 13:48 ` Matthew Wilcox
2021-12-27 14:59 ` [PATCH v2 2/3] arm64: Support huge vmalloc mappings Kefeng Wang
2021-12-27 17:35 ` (No subject) William Kucharski
2021-12-28 1:36 ` Kefeng Wang
2022-01-15 10:05 ` [PATCH v2 2/3] arm64: Support huge vmalloc mappings Christophe Leroy
2021-12-27 14:59 ` [PATCH v2 3/3] x86: " Kefeng Wang
2021-12-27 15:56 ` Dave Hansen
2021-12-28 10:26 ` Kefeng Wang
2021-12-28 16:14 ` Dave Hansen
2021-12-29 11:01 ` Kefeng Wang
2022-01-15 10:17 ` Christophe Leroy
2022-01-15 10:15 ` Christophe Leroy
2022-01-18 2:46 ` Nicholas Piggin
2022-01-18 17:28 ` Dave Hansen
2022-01-19 4:17 ` Nicholas Piggin
2022-01-19 13:32 ` Kefeng Wang
2022-01-15 10:11 ` Christophe Leroy [this message]
2022-01-15 10:06 ` Christophe Leroy
2022-01-15 10:07 ` [PATCH v2 0/3] mm: support huge vmalloc mapping on arm64/x86 Christophe Leroy