* [PATCH bpf 1/2] bpf: fix panic in prog load calls cleanup
2018-06-15 0:30 [PATCH bpf 0/2] Two bpf fixes Daniel Borkmann
@ 2018-06-15 0:30 ` Daniel Borkmann
2018-06-15 15:56 ` Martin KaFai Lau
2018-06-15 0:30 ` [PATCH bpf 2/2] bpf: reject any prog that failed read-only lock Daniel Borkmann
2018-06-15 19:31 ` [PATCH bpf 0/2] Two bpf fixes Alexei Starovoitov
2 siblings, 1 reply; 7+ messages in thread
From: Daniel Borkmann @ 2018-06-15 0:30 UTC (permalink / raw)
To: ast; +Cc: netdev, Daniel Borkmann
While testing I found that when hitting error path in bpf_prog_load()
where we jump to free_used_maps and prog contained BPF to BPF calls
that were JITed earlier, then we never clean up the bpf_prog_kallsyms_add()
done under jit_subprogs(). Add proper API to make BPF kallsyms deletion
more clear and fix that.
Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
include/linux/filter.h | 3 +++
kernel/bpf/core.c | 14 ++++++++++++++
kernel/bpf/syscall.c | 8 ++------
3 files changed, 19 insertions(+), 6 deletions(-)
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 45fc0f5..297c56f 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -961,6 +961,9 @@ static inline void bpf_prog_kallsyms_del(struct bpf_prog *fp)
}
#endif /* CONFIG_BPF_JIT */
+void bpf_prog_kallsyms_del_subprogs(struct bpf_prog *fp);
+void bpf_prog_kallsyms_del_all(struct bpf_prog *fp);
+
#define BPF_ANC BIT(15)
static inline bool bpf_needs_clear_a(const struct sock_filter *first)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 9f14937..1061968 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -350,6 +350,20 @@ struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
return prog_adj;
}
+void bpf_prog_kallsyms_del_subprogs(struct bpf_prog *fp)
+{
+ int i;
+
+ for (i = 0; i < fp->aux->func_cnt; i++)
+ bpf_prog_kallsyms_del(fp->aux->func[i]);
+}
+
+void bpf_prog_kallsyms_del_all(struct bpf_prog *fp)
+{
+ bpf_prog_kallsyms_del_subprogs(fp);
+ bpf_prog_kallsyms_del(fp);
+}
+
#ifdef CONFIG_BPF_JIT
/* All BPF JIT sysctl knobs here. */
int bpf_jit_enable __read_mostly = IS_BUILTIN(CONFIG_BPF_JIT_ALWAYS_ON);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 0fa2062..0f62692 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1034,14 +1034,9 @@ static void __bpf_prog_put_rcu(struct rcu_head *rcu)
static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock)
{
if (atomic_dec_and_test(&prog->aux->refcnt)) {
- int i;
-
/* bpf_prog_free_id() must be called first */
bpf_prog_free_id(prog, do_idr_lock);
-
- for (i = 0; i < prog->aux->func_cnt; i++)
- bpf_prog_kallsyms_del(prog->aux->func[i]);
- bpf_prog_kallsyms_del(prog);
+ bpf_prog_kallsyms_del_all(prog);
call_rcu(&prog->aux->rcu, __bpf_prog_put_rcu);
}
@@ -1384,6 +1379,7 @@ static int bpf_prog_load(union bpf_attr *attr)
return err;
free_used_maps:
+ bpf_prog_kallsyms_del_subprogs(prog);
free_used_maps(prog->aux);
free_prog:
bpf_prog_uncharge_memlock(prog);
--
2.9.5
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH bpf 2/2] bpf: reject any prog that failed read-only lock
2018-06-15 0:30 [PATCH bpf 0/2] Two bpf fixes Daniel Borkmann
2018-06-15 0:30 ` [PATCH bpf 1/2] bpf: fix panic in prog load calls cleanup Daniel Borkmann
@ 2018-06-15 0:30 ` Daniel Borkmann
2018-06-15 16:51 ` Martin KaFai Lau
2018-06-15 19:31 ` [PATCH bpf 0/2] Two bpf fixes Alexei Starovoitov
2 siblings, 1 reply; 7+ messages in thread
From: Daniel Borkmann @ 2018-06-15 0:30 UTC (permalink / raw)
To: ast; +Cc: netdev, Daniel Borkmann
We currently lock any JITed image as read-only via bpf_jit_binary_lock_ro()
as well as the BPF image as read-only through bpf_prog_lock_ro(). In
the case any of these would fail we throw a WARN_ON_ONCE() in order to
yell loudly to the log. Perhaps, to some extend, this may be comparable
to an allocation where __GFP_NOWARN is explicitly not set.
Added via 65869a47f348 ("bpf: improve read-only handling"), this behavior
is slightly different compared to any of the other in-kernel set_memory_ro()
users who do not check the return code of set_memory_ro() and friends /at
all/ (e.g. in the case of module_enable_ro() / module_disable_ro()). Given
in BPF this is mandatory hardening step, we want to know whether there
are any issues that would leave both BPF data writable. So it happens
that syzkaller enabled fault injection and it triggered memory allocation
failure deep inside x86's change_page_attr_set_clr() which was triggered
from set_memory_ro().
Now, there are two options: i) leaving everything as is, and ii) reworking
the image locking code in order to have a final checkpoint out of the
central bpf_prog_select_runtime() which probes whether any of the calls
during prog setup weren't successful, and then bailing out with an error.
Option ii) is a better approach since this additional paranoia avoids
altogether leaving any potential W+X pages from BPF side in the system.
Therefore, lets be strict about it, and reject programs in such unlikely
occasion. While testing I noticed also that one bpf_prog_lock_ro()
call was missing on the outer dummy prog in case of calls, e.g. in the
destructor we call bpf_prog_free_deferred() on the main prog where we
try to bpf_prog_unlock_free() the program, and since we go via
bpf_prog_select_runtime() do that as well.
Reported-by: syzbot+3b889862e65a98317058@syzkaller.appspotmail.com
Reported-by: syzbot+9e762b52dd17e616a7a5@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
include/linux/filter.h | 60 ++++++++++++++++++++++++++++++++------------------
kernel/bpf/core.c | 55 +++++++++++++++++++++++++++++++++++++++------
kernel/bpf/syscall.c | 4 +---
3 files changed, 87 insertions(+), 32 deletions(-)
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 297c56f..108f981 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -469,7 +469,8 @@ struct sock_fprog_kern {
};
struct bpf_binary_header {
- unsigned int pages;
+ u16 pages;
+ u16 locked:1;
u8 image[];
};
@@ -671,15 +672,18 @@ bpf_ctx_narrow_access_ok(u32 off, u32 size, u32 size_default)
#define bpf_classic_proglen(fprog) (fprog->len * sizeof(fprog->filter[0]))
-#ifdef CONFIG_ARCH_HAS_SET_MEMORY
static inline void bpf_prog_lock_ro(struct bpf_prog *fp)
{
+#ifdef CONFIG_ARCH_HAS_SET_MEMORY
fp->locked = 1;
- WARN_ON_ONCE(set_memory_ro((unsigned long)fp, fp->pages));
+ if (set_memory_ro((unsigned long)fp, fp->pages))
+ fp->locked = 0;
+#endif
}
static inline void bpf_prog_unlock_ro(struct bpf_prog *fp)
{
+#ifdef CONFIG_ARCH_HAS_SET_MEMORY
if (fp->locked) {
WARN_ON_ONCE(set_memory_rw((unsigned long)fp, fp->pages));
/* In case set_memory_rw() fails, we want to be the first
@@ -687,34 +691,30 @@ static inline void bpf_prog_unlock_ro(struct bpf_prog *fp)
*/
fp->locked = 0;
}
+#endif
}
static inline void bpf_jit_binary_lock_ro(struct bpf_binary_header *hdr)
{
- WARN_ON_ONCE(set_memory_ro((unsigned long)hdr, hdr->pages));
-}
-
-static inline void bpf_jit_binary_unlock_ro(struct bpf_binary_header *hdr)
-{
- WARN_ON_ONCE(set_memory_rw((unsigned long)hdr, hdr->pages));
-}
-#else
-static inline void bpf_prog_lock_ro(struct bpf_prog *fp)
-{
-}
-
-static inline void bpf_prog_unlock_ro(struct bpf_prog *fp)
-{
-}
-
-static inline void bpf_jit_binary_lock_ro(struct bpf_binary_header *hdr)
-{
+#ifdef CONFIG_ARCH_HAS_SET_MEMORY
+ hdr->locked = 1;
+ if (set_memory_ro((unsigned long)hdr, hdr->pages))
+ hdr->locked = 0;
+#endif
}
static inline void bpf_jit_binary_unlock_ro(struct bpf_binary_header *hdr)
{
+#ifdef CONFIG_ARCH_HAS_SET_MEMORY
+ if (hdr->locked) {
+ WARN_ON_ONCE(set_memory_rw((unsigned long)hdr, hdr->pages));
+ /* In case set_memory_rw() fails, we want to be the first
+ * to crash here instead of some random place later on.
+ */
+ hdr->locked = 0;
+ }
+#endif
}
-#endif /* CONFIG_ARCH_HAS_SET_MEMORY */
static inline struct bpf_binary_header *
bpf_jit_binary_hdr(const struct bpf_prog *fp)
@@ -725,6 +725,22 @@ bpf_jit_binary_hdr(const struct bpf_prog *fp)
return (void *)addr;
}
+#ifdef CONFIG_ARCH_HAS_SET_MEMORY
+static inline int bpf_prog_check_pages_ro_single(const struct bpf_prog *fp)
+{
+ if (!fp->locked)
+ return -ENOLCK;
+ if (fp->jited) {
+ const struct bpf_binary_header *hdr = bpf_jit_binary_hdr(fp);
+
+ if (!hdr->locked)
+ return -ENOLCK;
+ }
+
+ return 0;
+}
+#endif
+
int sk_filter_trim_cap(struct sock *sk, struct sk_buff *skb, unsigned int cap);
static inline int sk_filter(struct sock *sk, struct sk_buff *skb)
{
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 1061968..a9e6c04 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -598,6 +598,8 @@ bpf_jit_binary_alloc(unsigned int proglen, u8 **image_ptr,
bpf_fill_ill_insns(hdr, size);
hdr->pages = size / PAGE_SIZE;
+ hdr->locked = 0;
+
hole = min_t(unsigned int, size - (proglen + sizeof(*hdr)),
PAGE_SIZE - sizeof(*hdr));
start = (get_random_int() % hole) & ~(alignment - 1);
@@ -1448,6 +1450,33 @@ static int bpf_check_tail_call(const struct bpf_prog *fp)
return 0;
}
+static int bpf_prog_check_pages_ro_locked(const struct bpf_prog *fp)
+{
+#ifdef CONFIG_ARCH_HAS_SET_MEMORY
+ int i, err;
+
+ for (i = 0; i < fp->aux->func_cnt; i++) {
+ err = bpf_prog_check_pages_ro_single(fp->aux->func[i]);
+ if (err)
+ return err;
+ }
+
+ return bpf_prog_check_pages_ro_single(fp);
+#endif
+ return 0;
+}
+
+static void bpf_prog_select_func(struct bpf_prog *fp)
+{
+#ifndef CONFIG_BPF_JIT_ALWAYS_ON
+ u32 stack_depth = max_t(u32, fp->aux->stack_depth, 1);
+
+ fp->bpf_func = interpreters[(round_up(stack_depth, 32) / 32) - 1];
+#else
+ fp->bpf_func = __bpf_prog_ret0_warn;
+#endif
+}
+
/**
* bpf_prog_select_runtime - select exec runtime for BPF program
* @fp: bpf_prog populated with internal BPF program
@@ -1458,13 +1487,13 @@ static int bpf_check_tail_call(const struct bpf_prog *fp)
*/
struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
{
-#ifndef CONFIG_BPF_JIT_ALWAYS_ON
- u32 stack_depth = max_t(u32, fp->aux->stack_depth, 1);
+ /* In case of BPF to BPF calls, verifier did all the prep
+ * work with regards to JITing, etc.
+ */
+ if (fp->bpf_func)
+ goto finalize;
- fp->bpf_func = interpreters[(round_up(stack_depth, 32) / 32) - 1];
-#else
- fp->bpf_func = __bpf_prog_ret0_warn;
-#endif
+ bpf_prog_select_func(fp);
/* eBPF JITs can rewrite the program in case constant
* blinding is active. However, in case of error during
@@ -1485,6 +1514,8 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
if (*err)
return fp;
}
+
+finalize:
bpf_prog_lock_ro(fp);
/* The tail call compatibility check can only be done at
@@ -1493,7 +1524,17 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
* all eBPF JITs might immediately support all features.
*/
*err = bpf_check_tail_call(fp);
-
+ if (*err)
+ return fp;
+
+ /* Checkpoint: at this point onwards any cBPF -> eBPF or
+ * native eBPF program is read-only. If we failed to change
+ * the page attributes (e.g. allocation failure from
+ * splitting large pages), then reject the whole program
+ * in order to guarantee not ending up with any W+X pages
+ * from BPF side in kernel.
+ */
+ *err = bpf_prog_check_pages_ro_locked(fp);
return fp;
}
EXPORT_SYMBOL_GPL(bpf_prog_select_runtime);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 0f62692..35dc466 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1353,9 +1353,7 @@ static int bpf_prog_load(union bpf_attr *attr)
if (err < 0)
goto free_used_maps;
- /* eBPF program is ready to be JITed */
- if (!prog->bpf_func)
- prog = bpf_prog_select_runtime(prog, &err);
+ prog = bpf_prog_select_runtime(prog, &err);
if (err < 0)
goto free_used_maps;
--
2.9.5
^ permalink raw reply related [flat|nested] 7+ messages in thread