From: Alexei Starovoitov <ast@kernel.org>
To: <davem@davemloft.net>
Cc: <daniel@iogearbox.net>, <peterz@infradead.org>,
<edumazet@google.com>, <jannh@google.com>,
<netdev@vger.kernel.org>, <kernel-team@fb.com>
Subject: [PATCH bpf-next 4/4] bpf: Fix syscall's stackmap lookup potential deadlock
Date: Tue, 29 Jan 2019 20:04:58 -0800
Message-ID: <20190130040458.2544340-5-ast@kernel.org>
In-Reply-To: <20190130040458.2544340-1-ast@kernel.org>
From: Martin KaFai Lau <kafai@fb.com>

The syscall's map_lookup_elem used to not acquire a spinlock
in order to optimize the reader.

That was true until commit 557c0c6e7df8 ("bpf: convert stackmap to pre-allocation").
The syscall's map_lookup_elem(stackmap) calls bpf_stackmap_copy().
bpf_stackmap_copy() may find the elem no longer needed after the copy is done.
If that is the case, pcpu_freelist_push() saves this elem for reuse later.
This push requires a spinlock.

If a tracing bpf_prog runs in the middle of the syscall's
map_lookup_elem(stackmap) and this tracing bpf_prog calls
bpf_get_stackid(stackmap), which also requires the same pcpu_freelist's
spinlock, it may end up in a deadlock, as reported by
Eric Dumazet in https://patchwork.ozlabs.org/patch/1030266/

The situation is the same as the syscall's map_update_elem(), which
needs to acquire the pcpu_freelist's spinlock and could race
with a tracing bpf_prog. Hence, this patch fixes it by protecting
bpf_stackmap_copy() with this_cpu_inc(bpf_prog_active)
to prevent a tracing bpf_prog from running.

A later commit, f1a2e44a3aec ("bpf: add queue and stack maps"), makes
the syscall's map_lookup_elem acquire a spinlock as well, and it races
with a tracing bpf_prog similarly.

Hence, this patch is forward looking and protects the majority
of the map lookups. bpf_map_offload_lookup_elem() is the exception
since it is for network bpf_prog only (i.e. never called by tracing
bpf_prog).

Fixes: 557c0c6e7df8 ("bpf: convert stackmap to pre-allocation")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
kernel/bpf/syscall.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index b155cd17c1bd..8577bb7f8be6 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -713,8 +713,13 @@ static int map_lookup_elem(union bpf_attr *attr)
 
 	if (bpf_map_is_dev_bound(map)) {
 		err = bpf_map_offload_lookup_elem(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
-		   map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
+		goto done;
+	}
+
+	preempt_disable();
+	this_cpu_inc(bpf_prog_active);
+	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
+	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
 		err = bpf_percpu_hash_copy(map, key, value);
 	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
 		err = bpf_percpu_array_copy(map, key, value);
@@ -744,7 +749,10 @@ static int map_lookup_elem(union bpf_attr *attr)
 		}
 		rcu_read_unlock();
 	}
+	this_cpu_dec(bpf_prog_active);
+	preempt_enable();
 
+done:
 	if (err)
 		goto free_value;
--
2.20.0