From: Kees Cook <keescook@chromium.org>
To: "zhujianwei (C)" <zhujianwei7@huawei.com>
Cc: "Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"linux-security-module@vger.kernel.org"
	<linux-security-module@vger.kernel.org>,
	Hehuazhen <hehuazhen@huawei.com>,
	"Lennart Poettering" <lennart@poettering.net>,
	"Christian Ehrhardt" <christian.ehrhardt@canonical.com>,
	"Zbigniew Jędrzejewski-Szmek" <zbyszek@in.waw.pl>
Subject: Re: Re: Re: new seccomp mode aims to improve performance
Date: Tue, 2 Jun 2020 11:32:25 -0700
Message-ID: <202006021111.947830EC@keescook>
In-Reply-To: <07ce4c1273054955a350e67f2dc35812@huawei.com>

On Tue, Jun 02, 2020 at 11:34:04AM +0000, zhujianwei (C) wrote:
> And in many scenarios, the requirement for syscall filter is usually
> simple, and does not need complex filter rules, for example, just
> configure a syscall black or white list. However, we have noticed that
> seccomp will have a performance overhead that cannot be ignored in this
> simple scenario. For example, referring to Kees's test data, this cost
> is almost 41/636 = 6.5%, and Alex's data is 17/226 = 7.5%, based on
> single rule of filtering (getpid); Our data for this overhead is 19.8%
> (refer to the previous 'orignal' test results), filtering based on our
> 20 rules (unixbench syscall).

I wonder if aarch64 has higher overhead for calling into the TIF_WORK
trace stuff? (Or if aarch64's BPF JIT is not as efficient as x86?)

> // kernel modification
> --- linux-5.7-rc7_1/arch/arm64/kernel/ptrace.c 2020-05-25 06:32:54.000000000 +0800
> +++ linux-5.7-rc7/arch/arm64/kernel/ptrace.c 2020-06-02 12:35:04.412000000 +0800
> @@ -1827,6 +1827,46 @@
> regs->regs[regno] = saved_reg;
> }
>
> +#define PID_MAX 1000000
> #define SYSNUM_MAX 0x220

You can use NR_syscalls here, I think.

> +
> +/* all zero*/
> +bool g_light_filter_switch[PID_MAX] = {0};
> +bool g_light_filter_bitmap[PID_MAX][SYSNUM_MAX] = {0};

These can be static, and I would double-check your allocation size -- I
suspect this is allocating a byte for each bool. I would recommend
DECLARE_BITMAP() and friends.
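
To make the size difference concrete, here is a minimal userspace sketch
(not kernel code). BITS_TO_LONGS(), DECLARE_BITMAP(), set_bit() and
test_bit() below are simplified, non-atomic stand-ins written for this
example so that it builds outside the kernel; the real helpers come from
<linux/bitmap.h> and <linux/bitops.h>. SYSNUM_MAX is the value from the
patch; struct deny_bools, struct deny_bitmap, deny_syscall() and
syscall_denied() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define SYSNUM_MAX 0x220	/* value taken from the quoted patch */

/* Userspace stand-ins for the kernel helpers (assumption: simplified, non-atomic). */
#define BITS_PER_LONG		(sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n)	(((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)
#define DECLARE_BITMAP(name, bits) unsigned long name[BITS_TO_LONGS(bits)]

static inline void set_bit(unsigned int nr, unsigned long *map)
{
	map[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

static inline bool test_bit(unsigned int nr, const unsigned long *map)
{
	return (map[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL;
}

/* One task's deny list: one bool per syscall vs. one bit per syscall. */
struct deny_bools  { bool denied[SYSNUM_MAX]; };
struct deny_bitmap { DECLARE_BITMAP(denied, SYSNUM_MAX); };

/* Hypothetical helper: mark syscall nr as denied. */
static inline void deny_syscall(struct deny_bitmap *f, int nr)
{
	assert(nr >= 0 && nr < SYSNUM_MAX);
	set_bit((unsigned int)nr, f->denied);
}

/* Hypothetical helper: out-of-range numbers are treated as denied. */
static inline bool syscall_denied(const struct deny_bitmap *f, int nr)
{
	if (nr < 0 || nr >= SYSNUM_MAX)
		return true;
	return test_bit((unsigned int)nr, f->denied);
}
```

Assuming sizeof(bool) is 1 and a 64-bit unsigned long, one row is 544
bytes as bools and 72 bytes as a bitmap, so with PID_MAX at 1000000 the
table is roughly 544 MB versus 72 MB.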
> +static int __light_syscall_filter(void) {
> +	int pid;
> +	int this_syscall;
> +
> +	pid = current->pid;
> +	this_syscall = syscall_get_nr(current, task_pt_regs(current));
> +
> +	if(g_light_filter_bitmap[pid][this_syscall] == true) {
> +		printk(KERN_ERR "light syscall filter: syscall num %d denied.\n", this_syscall);
> +		goto skip;
> +	}
> +
> +	return 0;
> +skip:
> +	return -1;
> +}
> +
> +static inline int light_syscall_filter(void) {
> +	if (unlikely(test_thread_flag(TIF_SECCOMP))) {
> +		return __light_syscall_filter();
> +	}
> +
> +	return 0;
> +}
> +
>  int syscall_trace_enter(struct pt_regs *regs)
>  {
>  	unsigned long flags = READ_ONCE(current_thread_info()->flags);
> @@ -1837,9 +1877,10 @@
>  			return -1;
>  	}
>
> -	/* Do the secure computing after ptrace; failures should be fast. */
> -	if (secure_computing() == -1)
> +	/* light check for syscall-num-only rule. */
> +	if (light_syscall_filter() == -1) {
>  		return -1;
> +	}
>
>  	if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
>  		trace_sys_enter(regs, regs->syscallno);

Given that you're still doing this in syscall_trace_enter(), I imagine
it could live in secure_computing().

Anyway, the functionality here is similar to what I've been working
on for bitmaps (having a global preallocated bitmap isn't going to be
upstreamable, but it's good for a PoC). The complications are with handling
differing architectures (for compat systems), tracking/choosing between
the various basic SECCOMP_RET_* behaviors, etc.
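
A minimal userspace sketch of one way a bitmap could accommodate those
complications, under stated assumptions and not as the actual
implementation: the bitmap belongs to the filter instead of a global
per-PID table, there is one allow-bitmap per architecture so native and
compat syscall numbers never share bits, and only the "always allow"
outcome is cached, with every other SECCOMP_RET_* result deferred to the
full filter. All names here (struct light_filter, ARCH_NATIVE,
ARCH_COMPAT, NR_SYSCALLS_MAX, light_filter_allow(), light_filter_check())
are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SYSCALLS_MAX 512	/* hypothetical bound; the kernel would use NR_syscalls */

enum light_arch { ARCH_NATIVE, ARCH_COMPAT, ARCH_COUNT };	/* hypothetical */
enum light_verdict { LIGHT_ALLOW, LIGHT_RUN_FILTER };		/* hypothetical */

#define LONG_BITS (sizeof(unsigned long) * CHAR_BIT)
#define MAP_LONGS ((NR_SYSCALLS_MAX + LONG_BITS - 1) / LONG_BITS)

/* Per-filter state: one allow-bitmap for each architecture. */
struct light_filter {
	unsigned long allow[ARCH_COUNT][MAP_LONGS];
};

/* Record that syscall nr on the given architecture is always allowed. */
static inline void light_filter_allow(struct light_filter *f,
				      unsigned int arch, int nr)
{
	if (arch < ARCH_COUNT && nr >= 0 && nr < NR_SYSCALLS_MAX)
		f->allow[arch][nr / LONG_BITS] |= 1UL << (nr % LONG_BITS);
}

/*
 * Fast path: a set bit means "allow without running the filter".
 * Anything else (unknown arch, out-of-range number, or any other
 * SECCOMP_RET_* outcome) falls back to the full filter.
 */
static inline enum light_verdict
light_filter_check(const struct light_filter *f, unsigned int arch, int nr)
{
	if (arch >= ARCH_COUNT || nr < 0 || nr >= NR_SYSCALLS_MAX)
		return LIGHT_RUN_FILTER;
	if ((f->allow[arch][nr / LONG_BITS] >> (nr % LONG_BITS)) & 1UL)
		return LIGHT_ALLOW;
	return LIGHT_RUN_FILTER;
}
```

Caching only the allow case keeps the fast path to a single bit test and
leaves kill, errno, trap, trace and logging decisions to the existing
filter code.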
-Kees
--
Kees Cook