From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "linux-mm@kvack.org" <linux-mm@kvack.org>,
"song@kernel.org" <song@kernel.org>,
"bpf@vger.kernel.org" <bpf@vger.kernel.org>
Cc: "hch@lst.de" <hch@lst.de>,
"mcgrof@kernel.org" <mcgrof@kernel.org>,
"peterz@infradead.org" <peterz@infradead.org>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"x86@kernel.org" <x86@kernel.org>,
"Hansen, Dave" <dave.hansen@intel.com>
Subject: Re: [PATCH bpf-next v1 RESEND 5/5] x86: use register_text_tail_vm
Date: Wed, 2 Nov 2022 22:24:22 +0000
Message-ID: <cc240325744f6b9237c433fb74aac28f34a3a8cf.camel@intel.com>
In-Reply-To: <20221031222541.1773452-6-song@kernel.org>
On Mon, 2022-10-31 at 15:25 -0700, Song Liu wrote:
> Allocate 2MB pages up to round_up(_etext, 2MB), and register memory
> [round_up(_etext, 4kb), round_up(_etext, 2MB)] with
> register_text_tail_vm so that we can use this part of memory for
> dynamic kernel text (BPF programs, etc.).
>
> Here is an example:
>
> [root@eth50-1 ~]# grep _etext /proc/kallsyms
> ffffffff82202a08 T _etext
>
> [root@eth50-1 ~]# grep bpf_prog_ /proc/kallsyms | tail -n 3
> ffffffff8220f920 t bpf_prog_cc61a5364ac11d93_handle__sched_wakeup [bpf]
> ffffffff8220fa28 t bpf_prog_cc61a5364ac11d93_handle__sched_wakeup_new [bpf]
> ffffffff8220fad4 t bpf_prog_3bf73fa16f5e3d92_handle__sched_switch [bpf]
>
> [root@eth50-1 ~]# grep 0xffffffff82200000 /sys/kernel/debug/page_tables/kernel
> 0xffffffff82200000-0xffffffff82400000 2M ro PSE x pmd
>
> ffffffff82200000-ffffffff82400000 is a 2MB page, serving kernel text,
> and bpf programs.
>
> Signed-off-by: Song Liu <song@kernel.org>
> ---
> arch/x86/include/asm/pgtable_64_types.h | 1 +
> arch/x86/mm/init_64.c | 4 +++-
> include/linux/vmalloc.h | 4 ++++
> 3 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
> index 04f36063ad54..c0f9cceb109a 100644
> --- a/arch/x86/include/asm/pgtable_64_types.h
> +++ b/arch/x86/include/asm/pgtable_64_types.h
> @@ -101,6 +101,7 @@ extern unsigned int ptrs_per_p4d;
> #define PUD_MASK (~(PUD_SIZE - 1))
> #define PGDIR_SIZE (_AC(1, UL) << PGDIR_SHIFT)
> #define PGDIR_MASK (~(PGDIR_SIZE - 1))
> +#define PMD_ALIGN(x) (((unsigned long)(x) + (PMD_SIZE - 1)) & PMD_MASK)
>
> /*
> * See Documentation/x86/x86_64/mm.rst for a description of the memory map.
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 3f040c6e5d13..5b42fc0c6099 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1373,7 +1373,7 @@ void mark_rodata_ro(void)
> unsigned long start = PFN_ALIGN(_text);
> unsigned long rodata_start = PFN_ALIGN(__start_rodata);
> unsigned long end = (unsigned long)__end_rodata_hpage_align;
> - unsigned long text_end = PFN_ALIGN(_etext);
> + unsigned long text_end = PMD_ALIGN(_etext);
> unsigned long rodata_end = PFN_ALIGN(__end_rodata);
> unsigned long all_end;
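
As a sanity check on the range being registered, here is a minimal
user-space sketch of the rounding, assuming 4 KB base pages and 2 MB PMD
mappings as on x86-64. The EX_* macros and the helper names are made up
for illustration and only mirror what PFN_ALIGN() and the new PMD_ALIGN()
compute. The _etext value is the one from the example in the commit
message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 sizes: 4 KB base pages, 2 MB PMD-mapped pages. */
#define EX_PAGE_SIZE (UINT64_C(1) << 12)
#define EX_PMD_SIZE  (UINT64_C(1) << 21)

/* Round x up to the next multiple of size; size must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t size)
{
	return (x + size - 1) & ~(size - 1);
}

/* Start of the text tail: first 4 KB boundary at or after _etext. */
static uint64_t text_tail_start(uint64_t etext)
{
	return align_up(etext, EX_PAGE_SIZE);
}

/* End of the text tail: first 2 MB boundary at or after _etext. */
static uint64_t text_tail_end(uint64_t etext)
{
	return align_up(etext, EX_PMD_SIZE);
}

/* Check the rounding against the _etext from the commit message example. */
static void example_checks(void)
{
	const uint64_t etext = 0xffffffff82202a08ULL;

	assert(text_tail_start(etext) == 0xffffffff82203000ULL);
	assert(text_tail_end(etext) == 0xffffffff82400000ULL);
}
```

With that _etext, the tail is [0xffffffff82203000, 0xffffffff82400000),
which is 509 4 KB pages and is consistent with the bpf_prog_* addresses in
the kallsyms output above.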
Check out is_errata93(). Right now it assumes all text lies either between
_text and _etext or between MODULES_VADDR and MODULES_END. It's quite an
old erratum, but it would be nice to have an is_text_addr() helper or
something similar, to help keep track of the places where text might pop
up.

Speaking of which, it might be nice to update
Documentation/x86/x86_64/mm.rst with some hints that this area exists.
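
On the is_text_addr() idea, something along these lines is what I have in
mind. This is only a rough user-space sketch, not a proposed
implementation: struct text_range, the table, and the half-open range
convention are all assumptions for illustration. The text tail bounds come
from the example in the commit message, while the _text and module area
bounds are assumed typical non-KASLR x86-64 values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One half-open range [start, end) that may hold executable kernel text. */
struct text_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Illustrative table only. A real helper would use _text, _etext,
 * MODULES_VADDR and MODULES_END rather than hard-coded numbers.
 */
static const struct text_range text_ranges[] = {
	{ 0xffffffff81000000ULL, 0xffffffff82202a08ULL }, /* _text .. _etext (assumed _text) */
	{ 0xffffffff82203000ULL, 0xffffffff82400000ULL }, /* text tail from the example */
	{ 0xffffffffa0000000ULL, 0xffffffffff000000ULL }, /* modules area (assumed bounds) */
};

/* Return true if addr falls inside any known text range. */
static bool is_text_addr(uint64_t addr)
{
	size_t n = sizeof(text_ranges) / sizeof(text_ranges[0]);

	for (size_t i = 0; i < n; i++) {
		if (addr >= text_ranges[i].start && addr < text_ranges[i].end)
			return true;
	}
	return false;
}

/* A BPF program address from the example is text; the tail end is not. */
static void is_text_addr_example(void)
{
	assert(is_text_addr(0xffffffff8220f920ULL));
	assert(!is_text_addr(0xffffffff82400000ULL));
}
```

The point is just to have a single place that knows about every region
where text can live, so callers like is_errata93() do not each carry their
own list.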
>
> @@ -1414,6 +1414,8 @@ void mark_rodata_ro(void)
> (void *)rodata_end, (void *)_sdata);
>
> debug_checkwx();
> + register_text_tail_vm(PFN_ALIGN((unsigned long)_etext),
> + PMD_ALIGN((unsigned long)_etext));
> }
>
> int kern_addr_valid(unsigned long addr)
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index 9b2042313c12..7365cf9c4e7f 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -132,11 +132,15 @@ extern void vm_unmap_aliases(void);
> #ifdef CONFIG_MMU
> extern void __init vmalloc_init(void);
> extern unsigned long vmalloc_nr_pages(void);
> +void register_text_tail_vm(unsigned long start, unsigned long end);
> #else
> static inline void vmalloc_init(void)
> {
> }
> static inline unsigned long vmalloc_nr_pages(void) { return 0; }
> +void register_text_tail_vm(unsigned long start, unsigned long end)
> +{
> +}
> #endif
This looks like it should be in the previous patch.
>
> extern void *vmalloc(unsigned long size) __alloc_size(1);