From: Eric Dumazet <eric.dumazet@gmail.com>
To: Indan Zupancic <indan@nul.nu>
Cc: Will Drewry <wad@chromium.org>,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
linux-doc@vger.kernel.org, kernel-hardening@lists.openwall.com,
netdev@vger.kernel.org, x86@kernel.org, arnd@arndb.de,
davem@davemloft.net, hpa@zytor.com, mingo@redhat.com,
oleg@redhat.com, peterz@infradead.org, rdunlap@xenotime.net,
mcgrathr@chromium.org, tglx@linutronix.de, luto@mit.edu,
eparis@redhat.com, serge.hallyn@canonical.com, djm@mindrot.org,
scarybeasts@gmail.com, pmoore@redhat.com,
akpm@linux-foundation.org, corbet@lwn.net, markus@chromium.org,
coreyb@linux.vnet.ibm.com, keescook@chromium.org
Subject: Re: [PATCH v14 01/13] sk_run_filter: add BPF_S_ANC_SECCOMP_LD_W
Date: Sat, 17 Mar 2012 06:49:44 -0700
Message-ID: <1331992184.2466.45.camel@edumazet-laptop>
In-Reply-To: <7a1c4974e8fbc3b82ead0bfb18224d5b.squirrel@webmail.greenhost.nl>
On Saturday, 17 March 2012 at 21:14 +1100, Indan Zupancic wrote:
> On Wed, March 14, 2012 19:05, Eric Dumazet wrote:
> > On Wednesday, 14 March 2012 at 08:59 +0100, Indan Zupancic wrote:
> >
> >> The only remaining question is, is it worth the extra code to release
> >> up to 32kB of unused memory? It seems a waste to not free it, but if
> >> people think it's not worth it then let's just leave it around.
> >
> > Quite frankly it's not an issue, given JIT BPF is not yet enabled
> > by default.
>
> And what if we assume JIT BPF were enabled by default?
>
OK, so here are the reasons why I chose not to do this:
---------------------------------------------------------
1) When I wrote this code, I _wanted_ to keep the original BPF around
for post-mortem analysis. When we are 100% confident the code is bug-free,
we might remove the "BPF source code", but I am not convinced.
2) Most filters are smaller than 1 Kbyte, and who runs thousands of BPF
network filters on a machine? Do you have real cases? Because in these
cases, the vmalloc() PAGE granularity might be a problem anyway.
Some filters are set up for a very short period of time
(tcpdump, for example, sets up a "ret 0" at the very beginning of a
capture). Doing the extra kmalloc()/copy/kfree() is a loss.
tcpdump -n -s 0 -c 1000 arp
[29211.083449] JIT code: ffffffffa0cbe000: 31 c0 c3
[29211.083481] flen=4 proglen=55 pass=3 image=ffffffffa0cc0000
[29211.083487] JIT code: ffffffffa0cc0000: 55 48 89 e5 48 83 ec 60 48 89 5d f8 44 8b 4f 68
[29211.083494] JIT code: ffffffffa0cc0010: 44 2b 4f 6c 4c 8b 87 e0 00 00 00 be 0c 00 00 00
[29211.083500] JIT code: ffffffffa0cc0020: e8 04 32 38 e0 3d 06 08 00 00 75 07 b8 ff ff 00
[29211.083506] JIT code: ffffffffa0cc0030: 00 eb 02 31 c0 c9 c3
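
For reference, the first image above (31 c0 c3) decodes to
"xor %eax,%eax; ret", i.e. the "ret 0" filter. The second, 55-byte image
compares a 16-bit load at offset 12 (0x0c) against 0x0806 and returns
either 0xffff or 0. Below is a minimal user-space sketch of that logic,
assuming the 4-instruction program is the usual
"ldh [12]; jeq #0x806; ret #65535; ret #0". That listing is inferred from
the dump, not stated in this mail, and every demo_* name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of the flen=4 "arp" filter.
 * Assumed program (inferred from the JIT dump, not quoted from the mail):
 *   ldh [12]; jeq #0x806, jt 0, jf 1; ret #65535; ret #0
 */
enum demo_op { DEMO_LDH_ABS, DEMO_JEQ_K, DEMO_RET_K };

struct demo_insn {
	enum demo_op op;
	uint8_t jt, jf;		/* relative jump targets, as in classic BPF */
	uint32_t k;		/* constant operand */
};

static const struct demo_insn demo_arp_filter[4] = {
	{ DEMO_LDH_ABS, 0, 0, 12 },	/* A = ethertype */
	{ DEMO_JEQ_K,   0, 1, 0x0806 },	/* ARP ? */
	{ DEMO_RET_K,   0, 0, 0xffff },	/* accept, keep up to 65535 bytes */
	{ DEMO_RET_K,   0, 0, 0 },	/* drop */
};

/* Interprets the toy program; returns bytes to keep (0 means drop). */
static uint32_t demo_run_filter(const struct demo_insn *prog, size_t flen,
				const uint8_t *pkt, size_t len)
{
	uint32_t a = 0;
	size_t pc;

	for (pc = 0; pc < flen; pc++) {
		const struct demo_insn *in = &prog[pc];

		switch (in->op) {
		case DEMO_LDH_ABS:
			if ((size_t)in->k + 2 > len)
				return 0;	/* load out of bounds: drop */
			a = ((uint32_t)pkt[in->k] << 8) | pkt[in->k + 1];
			break;
		case DEMO_JEQ_K:
			/* skip jt or jf instructions; pc++ then moves on */
			pc += (a == in->k) ? in->jt : in->jf;
			break;
		case DEMO_RET_K:
			return in->k;
		}
	}
	return 0;
}
```

Calling demo_run_filter(demo_arp_filter, 4, frame, frame_len) on an
Ethernet frame keeps the frame only when bytes 12-13 hold 0x0806.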
> The current JIT doesn't handle negative offsets: The stuff that's handled
> by __load_pointer(). Easiest solution would be to make it non-static and
> call it instead of doing bpf_error. I guess __load_pointer was added later
> and the JIT code didn't get updated.
I don't think so; check the git history if you want :)
>
> But gcc refuses to inline load_pointer, instead it inlines __load_pointer
> and does the important checks first. Considering the current assembly code
> does a call too, it could as well call load_pointer() directly. That would
> save a lot of assembly code, handle all negative cases too and be pretty
> much the same speed. The only question is if this slows down some other
> archs than x86. What do you think?
You miss the point: 99.999% of offsets in filters are positive.
Best is to not call load_pointer() and only call skb_copy_bits() if the
data is not in the skb head, but in some fragment.
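
The fast-path/slow-path split described above can be sketched in user
space as follows. This is only an illustration under stated assumptions:
struct demo_skb is a hypothetical stand-in with one linear head and one
fragment, and demo_copy_bits() merely plays the role that skb_copy_bits()
plays in the kernel. It is not the JIT's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an skb: linear head bytes plus one fragment. */
struct demo_skb {
	const uint8_t *head;
	size_t headlen;
	const uint8_t *frag;
	size_t fraglen;
};

/* Counts slow-path calls so the two paths can be told apart. */
static int demo_slow_calls;

/* Slow path: byte-wise copy that may cross the head/fragment boundary. */
static int demo_copy_bits(const struct demo_skb *skb, size_t off,
			  void *to, size_t len)
{
	size_t total = skb->headlen + skb->fraglen;
	uint8_t *dst = to;
	size_t i;

	demo_slow_calls++;
	if (off > total || len > total - off)
		return -1;
	for (i = 0; i < len; i++, off++)
		dst[i] = off < skb->headlen ? skb->head[off]
					    : skb->frag[off - skb->headlen];
	return 0;
}

/* Loads a big-endian 16-bit value at a non-negative offset. */
static int demo_load_half(const struct demo_skb *skb, size_t off,
			  uint16_t *val)
{
	uint8_t tmp[2];

	if (off <= skb->headlen && skb->headlen - off >= 2) {
		/* Fast path: both bytes are in the linear head. */
		*val = (uint16_t)((skb->head[off] << 8) | skb->head[off + 1]);
		return 0;
	}
	/* Slow path: data is (partly) in the fragment. */
	if (demo_copy_bits(skb, off, tmp, sizeof(tmp)) < 0)
		return -1;
	*val = (uint16_t)((tmp[0] << 8) | tmp[1]);
	return 0;
}
```

With typical filters the ethertype and IP header sit in the head, so the
slow helper is not called at all for those loads.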
I don't know, I never had to use negative offsets in my own filters.
So in the BPF JIT I said: if we have a negative offset in a filter,
just disable JIT code completely for this filter (lines 478-479).
Same for fancy instructions like BPF_S_ANC_NLATTR /
BPF_S_ANC_NLATTR_NEST.
Show me a real use first.
I am pragmatic: I spend time coding stuff if there is a real need.
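
The "negative offset disables the JIT for this filter" rule above can be
sketched as a simple pre-scan. This is a hedged illustration, not the code
at lines 478-479: struct demo_insn, demo_can_jit() and the DEMO_NET_OFF
constant are hypothetical names, and the only assumption is that a
"negative" offset is a 32-bit constant with the sign bit set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example of a negative offset stored in an unsigned field. */
#define DEMO_NET_OFF ((uint32_t)-0x100000)

/* Hypothetical, simplified instruction: only what the check needs. */
struct demo_insn {
	bool is_pkt_load;	/* true for absolute packet loads */
	uint32_t k;		/* constant offset, unsigned like BPF's k */
};

/*
 * Returns false if any packet load uses an offset that is negative when
 * viewed as a signed 32-bit value; the caller would then keep using the
 * interpreter for the whole filter.
 */
static bool demo_can_jit(const struct demo_insn *prog, size_t flen)
{
	size_t i;

	for (i = 0; i < flen; i++) {
		if (prog[i].is_pkt_load && prog[i].k > (uint32_t)INT32_MAX)
			return false;
	}
	return true;
}
```

Giving up on the whole filter keeps the generated code simple; the
interpreter still handles the rare filter that needs such offsets.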
>
> The EMIT_COND_JMP(f_op, f_offset); should be in an else case, otherwise
> it's superfluous. It's a harmless bug though. I haven't spotted anything
> else yet.
It's not superfluous; see my comment at the end of this mail.
>
> You can get rid of all the "if (is_imm8(offsetof(struct sk_buff, len)))"
> code by making sure everything is near: Somewhere at the start, just
> add 127 to %rdi and a BUILD_BUG_ON(sizeof(struct sk_buff) > 255).
>
This code is optimized away by the compiler, you know that?
Adding "add 127 to rdi" is one more instruction, adding dependencies and
making our slow path code more complex (calls to skb_copy_bits() in
bpf_jit.S ...). That's a bad idea.
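
To illustrate why such a test costs nothing at run time: offsetof() is a
compile-time constant, so a range check on it folds to a constant once the
helper is inlined, and only one branch survives. The sketch below assumes
a hypothetical struct demo_skb laid out so that its fields sit at the
offsets visible in the dump earlier in this mail (0x68, 0x6c, 0xe0);
demo_is_imm8() and demo_emit_load_r9d() are illustrative names, not the
kernel's.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout; it only mimics the offsets seen in the JIT dump. */
struct demo_skb {
	char pad1[104];
	unsigned int len;	/* offset 0x68 with 4-byte unsigned int */
	unsigned int data_len;	/* offset 0x6c */
	char pad2[112];
	void *data;		/* offset 0xe0 */
};

/* True if the value fits in a signed 8-bit displacement. */
static inline bool demo_is_imm8(int value)
{
	return value <= 127 && value >= -128;
}

/*
 * Emits "mov off(%rdi),%r9d" into buf and returns the number of bytes:
 * 4 bytes with an 8-bit displacement, 7 bytes with a 32-bit one.
 * When 'off' is an offsetof() constant and the call is inlined, the
 * compiler can evaluate the test and keep a single branch.
 */
static inline size_t demo_emit_load_r9d(unsigned char *buf, int off)
{
	size_t n = 0;
	int i;

	buf[n++] = 0x44;	/* REX.R */
	buf[n++] = 0x8b;	/* mov r/m32 -> r32 */
	if (demo_is_imm8(off)) {
		buf[n++] = 0x4f;	/* modrm: disp8(%rdi), %r9d */
		buf[n++] = (unsigned char)off;
	} else {
		buf[n++] = 0x8f;	/* modrm: disp32(%rdi), %r9d */
		for (i = 0; i < 4; i++)
			buf[n++] = (unsigned char)((unsigned int)off >> (8 * i));
	}
	return n;
}
```

For the len field this produces 44 8b 4f 68, the same four bytes that
appear in the dump.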
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 7c1b765..7e0f575 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -581,8 +581,9 @@ cond_branch: f_offset = addrs[i + filter[i].jf] - addrs[i];
> if (filter[i].jf)
> EMIT_JMP(f_offset);
> break;
> + } else {
> + EMIT_COND_JMP(f_op, f_offset);
> }
> - EMIT_COND_JMP(f_op, f_offset);
> break;
> default:
> /* hmm, too complex filter, give up with jit compiler */
>
>
>
I see no change in the code generation with your patch.
If filter[i].jt == 0, we want to EMIT_COND_JMP(f_op, f_offset),
because we know at this point that filter[i].jf != 0 [ line 536 ].
If filter[i].jt != 0, the break; in line 583 prevents the
EMIT_COND_JMP(f_op, f_offset);
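
The argument can be checked with a small model of the branch logic. This
is a simplified sketch, not the JIT source: the emit macros are replaced
by flag bits, the do/while(0) stands in for the enclosing switch case, and
all demo_* names are hypothetical. It compares the original flow with the
patched flow for every jt/jf pair.

```c
#include <stdint.h>

/* Flags recording which (modelled) emit macros were reached. */
enum {
	DEMO_COND_T = 1,	/* EMIT_COND_JMP(t_op, t_offset) */
	DEMO_JMP_F  = 2,	/* EMIT_JMP(f_offset) */
	DEMO_COND_F = 4,	/* EMIT_COND_JMP(f_op, f_offset) */
};

/* Original flow: the trailing emit is only reached when jt == 0. */
static unsigned int demo_emit_original(uint8_t jt, uint8_t jf)
{
	unsigned int emitted = 0;

	do {
		if (jt != 0) {
			emitted |= DEMO_COND_T;
			if (jf)
				emitted |= DEMO_JMP_F;
			break;
		}
		emitted |= DEMO_COND_F;
	} while (0);
	return emitted;
}

/* Patched flow: the same emit moved into an explicit else branch. */
static unsigned int demo_emit_patched(uint8_t jt, uint8_t jf)
{
	unsigned int emitted = 0;

	do {
		if (jt != 0) {
			emitted |= DEMO_COND_T;
			if (jf)
				emitted |= DEMO_JMP_F;
			break;
		} else {
			emitted |= DEMO_COND_F;
		}
	} while (0);
	return emitted;
}

/* Returns 1 if both variants emit the same thing for all jt/jf pairs. */
static int demo_variants_agree(void)
{
	unsigned int jt, jf;

	for (jt = 0; jt < 256; jt++)
		for (jf = 0; jf < 256; jf++)
			if (demo_emit_original((uint8_t)jt, (uint8_t)jf) !=
			    demo_emit_patched((uint8_t)jt, (uint8_t)jf))
				return 0;
	return 1;
}
```

Because the break already leaves the case when jt != 0, the else only
restates the existing control flow.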
Thanks!