From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "luto@amacapital.net" <luto@amacapital.net>
Cc: "songliubraving@fb.com" <songliubraving@fb.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
"keescook@chromium.org" <keescook@chromium.org>,
"jeyu@kernel.org" <jeyu@kernel.org>,
"ast@kernel.org" <ast@kernel.org>,
"kuznet@ms2.inr.ac.ru" <kuznet@ms2.inr.ac.ru>,
"daniel@iogearbox.net" <daniel@iogearbox.net>,
"mjg59@google.com" <mjg59@google.com>,
"thgarnie@chromium.org" <thgarnie@chromium.org>,
"kpsingh@chromium.org" <kpsingh@chromium.org>,
"linux-security-module@vger.kernel.org"
<linux-security-module@vger.kernel.org>,
"x86@kernel.org" <x86@kernel.org>,
"revest@chromium.org" <revest@chromium.org>,
"jannh@google.com" <jannh@google.com>,
"namit@vmware.com" <namit@vmware.com>,
"jackmanb@chromium.org" <jackmanb@chromium.org>,
"kafai@fb.com" <kafai@fb.com>, "yhs@fb.com" <yhs@fb.com>,
"davem@davemloft.net" <davem@davemloft.net>,
"yoshfuji@linux-ipv6.org" <yoshfuji@linux-ipv6.org>,
"mhalcrow@google.com" <mhalcrow@google.com>,
"andriin@fb.com" <andriin@fb.com>
Subject: Re: [PATCH bpf-next] bpf: Make trampolines W^X
Date: Tue, 7 Jan 2020 19:01:44 +0000
Message-ID: <cdd157ef011efda92c9434f76141fc3aef174d85.camel@intel.com>
In-Reply-To: <DB882EE8-20B2-4631-A808-E5C968B24CEB@amacapital.net>
CC Nadav and Jessica.
On Mon, 2020-01-06 at 15:36 -1000, Andy Lutomirski wrote:
> > On Jan 6, 2020, at 12:25 PM, Edgecombe, Rick P <rick.p.edgecombe@intel.com>
> > wrote:
> >
> > On Sat, 2020-01-04 at 09:49 +0900, Andy Lutomirski wrote:
> > > > > On Jan 4, 2020, at 8:47 AM, KP Singh <kpsingh@chromium.org> wrote:
> > > >
> > > > From: KP Singh <kpsingh@google.com>
> > > >
> > > > The image for the BPF trampolines is allocated with
> > > > bpf_jit_alloc_exe_page which marks this allocated page executable. This
> > > > means that the allocated memory is W and X at the same time making it
> > > > susceptible to WX based attacks.
> > > >
> > > > Since the allocated memory is shared between two trampolines (the
> > > > current and the next), 2 pages must be allocated to adhere to W^X and
> > > > the following sequence is obeyed where trampolines are modified:
> > >
> > > Can we please do better rather than piling garbage on top of garbage?
> > >
> > > >
> > > > - Mark memory as non executable (set_memory_nx). While module_alloc for
> > > > x86 allocates the memory as PAGE_KERNEL and not PAGE_KERNEL_EXEC, not
> > > > all implementations of module_alloc do so
> > >
> > > How about fixing this instead?
> > >
> > > > - Mark the memory as read/write (set_memory_rw)
> > >
> > > Probably harmless, but see above about fixing it.
> > >
> > > > - Modify the trampoline
> > >
> > > Seems reasonable. It’s worth noting that this whole approach is
> > > suboptimal: the “module” allocator should really be returning a list of
> > > pages to be written (not at the final address!) with the actual
> > > executable mapping to be materialized later, but that’s a bigger project
> > > that you’re welcome to ignore for now. (Concretely, it should produce a
> > > vmap address with backing pages but with the vmap alias either entirely
> > > unmapped or read-only. A subsequent healer would, all at once, make the
> > > direct map pages RO or not-present and make the vmap alias RX.)
> > >
> > > > - Mark the memory as read-only (set_memory_ro)
> > > > - Mark the memory as executable (set_memory_x)
> > >
> > > No, thanks. There’s very little excuse for doing two IPI flushes when one
> > > would suffice.
> > >
> > > As far as I know, all architectures can do this with a single flush
> > > without races; x86 certainly can. The module freeing code gets this
> > > sequence right. Please reuse its mechanism or, if needed, export the
> > > relevant interfaces.
> >
> > So if I understand this right, some trampolines have been added that are
> > currently set as RWX at modification time AND left that way during runtime?
> > The discussion on the order of set_memory_() calls in the commit message
> > made me think that this was just a modification time thing at first.
>
> I’m not sure what the status quo is.
>
> We really ought to have a genuinely good API for allocation and initialization
> of text. We can do so much better than set_memory_blahblah.
>
> FWIW, I have some ideas about making kernel flushes cheaper. It’s currently
> blocked on finding some time and on tglx’s irqtrace work.
>
Makes sense to me. I guess there are 6 types of text allocations now:
- These two BPF trampolines
- BPF JITs
- Modules
- Kprobes
- Ftrace
All doing (or should be doing) pretty much the same thing. I believe Jessica had
said at one point that she didn't like all the other features using
module_alloc() as it was supposed to be just for real modules. Where would the
API live?
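
For illustration only, the "same thing" these users all want can be sketched as a rough userspace analogy: allocate a page that is writable but not executable, fill it, then drop W and add X in one permission change so that it is never W and X at once. This is not kernel code. mmap()/mprotect() stand in for module_alloc()/set_memory_*(), and the names text_alloc_rw(), text_finalize_rx() and text_demo() are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not a kernel API): return one page that is
 * readable and writable but NOT executable, or NULL on failure. */
static void *text_alloc_rw(size_t *len_out)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p;

	if (page <= 0)
		return NULL;
	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;
	*len_out = (size_t)page;
	return p;
}

/* Hypothetical helper: copy the image in while the page is still RW,
 * then go from RW to RX with a single mprotect() call instead of a
 * separate "make read-only" step followed by "make executable". */
static int text_finalize_rx(void *page, size_t page_len,
			    const void *image, size_t image_len)
{
	if (image_len > page_len)
		return -1;
	memcpy(page, image, image_len);
	return mprotect(page, page_len, PROT_READ | PROT_EXEC);
}

/* Returns 0 if the page ended up RX and still holds the image. */
static int text_demo(void)
{
	/* Placeholder bytes; they are only read back, never executed. */
	static const unsigned char image[] = { 0xde, 0xad, 0xbe, 0xef };
	size_t len = 0;
	unsigned char *page = text_alloc_rw(&len);
	int ret;

	if (!page)
		return -1;
	assert(len >= sizeof(image));
	ret = text_finalize_rx(page, len, image, sizeof(image));
	if (ret == 0 && memcmp(page, image, sizeof(image)) != 0)
		ret = -1;
	munmap(page, len);
	return ret;
}
```

A shared kernel API would presumably hide both steps behind one allocate/finalize pair, so each caller would not have to open-code the permission sequence.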
> >
> > Also, is there a reason you couldn't use text_poke() to modify the
> > trampoline
> > with a single flush?
> >
>
> Does text_poke do an IPI these days?
I don't think so since the RW mapping is just on a single CPU. That was one of
the benefits of the temporary mm struct based thing Nadav did. I haven't looked
into PeterZ's changes though.
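
To make the alias idea concrete, here is a minimal userspace sketch under stated assumptions. The long-lived mapping is never writable, and the write goes through a short-lived RW alias of the same backing page. It is only an analogy: a temporary file stands in for the backing pages, plain PROT_READ stands in for the kernel's RX text mapping, and poke_via_alias() and alias_demo() are hypothetical names. Unlike the temporary-mm approach described above, this alias is visible to the whole process, not just one CPU.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for text_poke(): map a temporary RW alias of
 * the backing page, write through it, and unmap it again. */
static int poke_via_alias(int fd, size_t page_len, size_t off,
			  const void *src, size_t n)
{
	unsigned char *alias;

	if (off > page_len || n > page_len - off)
		return -1;
	alias = mmap(NULL, page_len, PROT_READ | PROT_WRITE,
		     MAP_SHARED, fd, 0);
	if (alias == MAP_FAILED)
		return -1;
	memcpy(alias + off, src, n);
	return munmap(alias, page_len);
}

/* Returns 0 if the write made through the alias is visible through
 * the mapping that was never writable. */
static int alias_demo(void)
{
	char path[] = "/tmp/alias_demo_XXXXXX";
	static const unsigned char patch[] = { 1, 2, 3, 4 };
	long page = sysconf(_SC_PAGESIZE);
	const unsigned char *text;
	int fd, ret = -1;

	if (page <= 0)
		return -1;
	assert(sizeof(patch) + 16 <= (size_t)page);
	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the open fd keeps the backing page alive */
	if (ftruncate(fd, (off_t)page) != 0)
		goto out;
	/* Long-lived mapping: readable only, never writable. */
	text = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
	if (text == MAP_FAILED)
		goto out;
	if (poke_via_alias(fd, (size_t)page, 16, patch, sizeof(patch)) == 0 &&
	    memcmp(text + 16, patch, sizeof(patch)) == 0)
		ret = 0;
	munmap((void *)text, (size_t)page);
out:
	close(fd);
	return ret;
}
```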