* [PATCH trace/for-next 1/3] bpf: put bpf_link's program when link is safe to be deallocated
@ 2024-10-31 21:09 Andrii Nakryiko
2024-10-31 21:09 ` [PATCH trace/for-next 2/3] bpf: decouple BPF link/attach hook and BPF program sleepable semantics Andrii Nakryiko
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2024-10-31 21:09 UTC (permalink / raw)
To: linux-trace-kernel, bpf, rostedt, ast, daniel, martin.lau
Cc: mathieu.desnoyers, linux-kernel, mhiramat, peterz, paulmck, jrife,
Andrii Nakryiko
In general, BPF link's underlying BPF program should be considered to be
reachable through attach hook -> link -> prog chain, and, pessimistically,
we have to assume that as long as link's memory is not safe to free,
attach hook's code might hold a pointer to BPF program and use it.
As such, it's not (generally) correct to put link's program early before
waiting for RCU GPs to go through. More eager bpf_prog_put() that we
currently do is mostly correct due to BPF program's release code doing
similar RCU GP waiting, but as will be shown in the following patches,
BPF program can be non-sleepable (and, thus, reliant on only "classic"
RCU GP), while BPF link's attach hook can have sleepable semantics and
needs to be protected by RCU Tasks Trace, and for such cases BPF link
has to go through RCU Tasks Trace + "classic" RCU GPs before being
deallocated. And so, if we put BPF program early, we might free BPF
program before we free BPF link, leading to use-after-free situation.
So, this patch defers bpf_prog_put() until we are ready to perform
bpf_link's deallocation. At worst, this delays BPF program freeing by
one extra RCU GP, but that seems completely acceptable. Alternatively,
we'd need more elaborate ways to determine BPF hook, BPF link, and BPF
program lifetimes, and how they relate to each other, which seems like
an unnecessary complication.
Note, for most BPF links we still will perform eager bpf_prog_put() and
link dealloc, so for those BPF links there are no observable changes
whatsoever. Only BPF links that use deferred dealloc might notice
slightly delayed freeing of BPF programs.
Also, to reduce code and logic duplication, extract program put + link
dealloc logic into bpf_link_dealloc() helper.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
kernel/bpf/syscall.c | 22 +++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index a8f1808a1ca5..aa7246a399f3 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2976,12 +2976,24 @@ void bpf_link_inc(struct bpf_link *link)
atomic64_inc(&link->refcnt);
}
+static void bpf_link_dealloc(struct bpf_link *link)
+{
+ /* now that we know that bpf_link itself can't be reached, put underlying BPF program */
+ if (link->prog)
+ bpf_prog_put(link->prog);
+
+ /* free bpf_link and its containing memory */
+ if (link->ops->dealloc_deferred)
+ link->ops->dealloc_deferred(link);
+ else
+ link->ops->dealloc(link);
+}
+
static void bpf_link_defer_dealloc_rcu_gp(struct rcu_head *rcu)
{
struct bpf_link *link = container_of(rcu, struct bpf_link, rcu);
- /* free bpf_link and its containing memory */
- link->ops->dealloc_deferred(link);
+ bpf_link_dealloc(link);
}
static void bpf_link_defer_dealloc_mult_rcu_gp(struct rcu_head *rcu)
@@ -3003,7 +3015,6 @@ static void bpf_link_free(struct bpf_link *link)
sleepable = link->prog->sleepable;
/* detach BPF program, clean up used resources */
ops->release(link);
- bpf_prog_put(link->prog);
}
if (ops->dealloc_deferred) {
/* schedule BPF link deallocation; if underlying BPF program
@@ -3014,8 +3025,9 @@ static void bpf_link_free(struct bpf_link *link)
call_rcu_tasks_trace(&link->rcu, bpf_link_defer_dealloc_mult_rcu_gp);
else
call_rcu(&link->rcu, bpf_link_defer_dealloc_rcu_gp);
- } else if (ops->dealloc)
- ops->dealloc(link);
+ } else if (ops->dealloc) {
+ bpf_link_dealloc(link);
+ }
}
static void bpf_link_put_deferred(struct work_struct *work)
--
2.43.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH trace/for-next 2/3] bpf: decouple BPF link/attach hook and BPF program sleepable semantics
2024-10-31 21:09 [PATCH trace/for-next 1/3] bpf: put bpf_link's program when link is safe to be deallocated Andrii Nakryiko
@ 2024-10-31 21:09 ` Andrii Nakryiko
2024-11-01 5:07 ` kernel test robot
2024-11-01 16:26 ` Alexei Starovoitov
2024-10-31 21:09 ` [PATCH trace/for-next 3/3] bpf: ensure RCU Tasks Trace GP for sleepable raw tracepoint BPF links Andrii Nakryiko
2024-11-01 10:55 ` [PATCH trace/for-next 1/3] bpf: put bpf_link's program when link is safe to be deallocated Steven Rostedt
2 siblings, 2 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2024-10-31 21:09 UTC (permalink / raw)
To: linux-trace-kernel, bpf, rostedt, ast, daniel, martin.lau
Cc: mathieu.desnoyers, linux-kernel, mhiramat, peterz, paulmck, jrife,
Andrii Nakryiko
BPF link's lifecycle protection scheme depends on both BPF hook and BPF
program. If *either* of those require RCU Tasks Trace GP, then we need
to go through a chain of GPs before putting BPF program refcount and
deallocating BPF link memory.
This patch adds bpf_link-specific sleepable flag, which can be set to
true even if underlying BPF program is not sleepable itself. If either
link->sleepable or link->prog->sleepable is true, we'll go through
a chain of RCU Tasks Trace GP and RCU GP before putting BPF program and
freeing memory.
This will be used to protect BPF link for sleepable (faultable) raw
tracepoints in the next patch.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
include/linux/bpf.h | 22 +++++++++++++++++++---
kernel/bpf/syscall.c | 39 ++++++++++++++++++++++++++++-----------
2 files changed, 47 insertions(+), 14 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 19d8ca8ac960..83f44caf47e1 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1598,6 +1598,11 @@ struct bpf_link {
enum bpf_link_type type;
const struct bpf_link_ops *ops;
struct bpf_prog *prog;
+ /* whether BPF link itself has "sleepable" semantics, which can differ
+ * from underlying BPF program having a "sleepable" semantics, as BPF
+ * link's semantics is determined by target attach hook
+ */
+ bool sleepable;
/* rcu is used before freeing, work can be used to schedule that
* RCU-based freeing before that, so they never overlap
*/
@@ -1614,8 +1619,10 @@ struct bpf_link_ops {
*/
void (*dealloc)(struct bpf_link *link);
/* deallocate link resources callback, called after RCU grace period;
- * if underlying BPF program is sleepable we go through tasks trace
- * RCU GP and then "classic" RCU GP
+ * if either the underlying BPF program is sleepable or BPF link's
+ * target hook is sleepable, we'll go through tasks trace RCU GP and
+ * then "classic" RCU GP; this need for chaining tasks trace and
+ * classic RCU GPs is designated by setting bpf_link->sleepable flag
*/
void (*dealloc_deferred)(struct bpf_link *link);
int (*detach)(struct bpf_link *link);
@@ -2362,6 +2369,9 @@ int bpf_prog_new_fd(struct bpf_prog *prog);
void bpf_link_init(struct bpf_link *link, enum bpf_link_type type,
const struct bpf_link_ops *ops, struct bpf_prog *prog);
+void bpf_link_init_sleepable(struct bpf_link *link, enum bpf_link_type type,
+ const struct bpf_link_ops *ops, struct bpf_prog *prog,
+ bool sleepable);
int bpf_link_prime(struct bpf_link *link, struct bpf_link_primer *primer);
int bpf_link_settle(struct bpf_link_primer *primer);
void bpf_link_cleanup(struct bpf_link_primer *primer);
@@ -2713,7 +2723,13 @@ bpf_prog_inc_not_zero(struct bpf_prog *prog)
static inline void bpf_link_init(struct bpf_link *link, enum bpf_link_type type,
const struct bpf_link_ops *ops,
- struct bpf_prog *prog)
+ struct bpf_prog *prog, bool sleepable)
+{
+}
+
+static inline void bpf_link_init_sleepable(struct bpf_link *link, enum bpf_link_type type,
+ const struct bpf_link_ops *ops, struct bpf_prog *prog,
+ bool sleepable)
{
}
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index aa7246a399f3..0f5540627911 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2933,17 +2933,33 @@ static int bpf_obj_get(const union bpf_attr *attr)
attr->file_flags);
}
-void bpf_link_init(struct bpf_link *link, enum bpf_link_type type,
- const struct bpf_link_ops *ops, struct bpf_prog *prog)
+/* bpf_link_init_sleepable() allows to specify whether BPF link itself has
+ * "sleepable" semantics, which normally would mean that BPF link's attach
+ * hook can dereference link or link's underlying program for some time after
+ * detachment due to RCU Tasks Trace-based lifetime protection scheme.
+ * BPF program itself can be non-sleepable, yet, because it's transitively
+ * reachable through BPF link, its freeing has to be delayed until after RCU
+ * Tasks Trace GP.
+ */
+void bpf_link_init_sleepable(struct bpf_link *link, enum bpf_link_type type,
+ const struct bpf_link_ops *ops, struct bpf_prog *prog,
+ bool sleepable)
{
WARN_ON(ops->dealloc && ops->dealloc_deferred);
atomic64_set(&link->refcnt, 1);
link->type = type;
+ link->sleepable = sleepable;
link->id = 0;
link->ops = ops;
link->prog = prog;
}
+void bpf_link_init(struct bpf_link *link, enum bpf_link_type type,
+ const struct bpf_link_ops *ops, struct bpf_prog *prog)
+{
+ bpf_link_init_sleepable(link, type, ops, prog, false);
+}
+
static void bpf_link_free_id(int id)
{
if (!id)
@@ -3008,20 +3024,21 @@ static void bpf_link_defer_dealloc_mult_rcu_gp(struct rcu_head *rcu)
static void bpf_link_free(struct bpf_link *link)
{
const struct bpf_link_ops *ops = link->ops;
- bool sleepable = false;
bpf_link_free_id(link->id);
- if (link->prog) {
- sleepable = link->prog->sleepable;
- /* detach BPF program, clean up used resources */
+ /* detach BPF program, clean up used resources */
+ if (link->prog)
ops->release(link);
- }
if (ops->dealloc_deferred) {
- /* schedule BPF link deallocation; if underlying BPF program
- * is sleepable, we need to first wait for RCU tasks trace
- * sync, then go through "classic" RCU grace period
+ /* Schedule BPF link deallocation, which will only then
+ * trigger putting BPF program refcount.
+ * If underlying BPF program is sleepable or BPF link's target
+ * attach hookpoint is sleepable or otherwise requires RCU GPs
+ * to ensure link and its underlying BPF program is not
+ * reachable anymore, we need to first wait for RCU tasks
+ * trace sync, and then go through "classic" RCU grace period
*/
- if (sleepable)
+ if (link->sleepable || (link->prog && link->prog->sleepable))
call_rcu_tasks_trace(&link->rcu, bpf_link_defer_dealloc_mult_rcu_gp);
else
call_rcu(&link->rcu, bpf_link_defer_dealloc_rcu_gp);
--
2.43.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH trace/for-next 3/3] bpf: ensure RCU Tasks Trace GP for sleepable raw tracepoint BPF links
2024-10-31 21:09 [PATCH trace/for-next 1/3] bpf: put bpf_link's program when link is safe to be deallocated Andrii Nakryiko
2024-10-31 21:09 ` [PATCH trace/for-next 2/3] bpf: decouple BPF link/attach hook and BPF program sleepable semantics Andrii Nakryiko
@ 2024-10-31 21:09 ` Andrii Nakryiko
2024-11-01 5:07 ` kernel test robot
` (2 more replies)
2024-11-01 10:55 ` [PATCH trace/for-next 1/3] bpf: put bpf_link's program when link is safe to be deallocated Steven Rostedt
2 siblings, 3 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2024-10-31 21:09 UTC (permalink / raw)
To: linux-trace-kernel, bpf, rostedt, ast, daniel, martin.lau
Cc: mathieu.desnoyers, linux-kernel, mhiramat, peterz, paulmck, jrife,
Andrii Nakryiko
Now that kernel supports sleepable tracepoints, the fact that
bpf_probe_unregister() is asynchronous, i.e., that it doesn't wait for
any in-flight tracepoints to conclude before returning, we now need to
delay BPF raw tp link's deallocation and bpf_prog_put() of its
underlying BPF program (regardless of program's own sleepable semantics)
until after full RCU Tasks Trace GP. With that GP over, we'll have
a guarantee that no tracepoint can reach BPF link and thus its BPF program.
We use newly added tracepoint_is_faultable() check to know when this RCU
Tasks Trace GP is necessary and utilize BPF link's own sleepable flag
passed through bpf_link_init_sleepable() initializer.
Reported-by: Jordan Rife <jrife@google.com>
Fixes: a363d27cdbc2 ("tracing: Allow system call tracepoints to handle page faults")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
kernel/bpf/syscall.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 0f5540627911..db2a987504b2 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -35,6 +35,7 @@
#include <linux/rcupdate_trace.h>
#include <linux/memcontrol.h>
#include <linux/trace_events.h>
+#include <linux/tracepoint.h>
#include <net/netfilter/nf_bpf_link.h>
#include <net/netkit.h>
@@ -3845,8 +3846,9 @@ static int bpf_raw_tp_link_attach(struct bpf_prog *prog,
err = -ENOMEM;
goto out_put_btp;
}
- bpf_link_init(&link->link, BPF_LINK_TYPE_RAW_TRACEPOINT,
- &bpf_raw_tp_link_lops, prog);
+ bpf_link_init_sleepable(&link->link, BPF_LINK_TYPE_RAW_TRACEPOINT,
+ &bpf_raw_tp_link_lops, prog,
+ tracepoint_is_faultable(btp->tp));
link->btp = btp;
link->cookie = cookie;
--
2.43.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH trace/for-next 2/3] bpf: decouple BPF link/attach hook and BPF program sleepable semantics
2024-10-31 21:09 ` [PATCH trace/for-next 2/3] bpf: decouple BPF link/attach hook and BPF program sleepable semantics Andrii Nakryiko
@ 2024-11-01 5:07 ` kernel test robot
2024-11-01 16:26 ` Alexei Starovoitov
1 sibling, 0 replies; 12+ messages in thread
From: kernel test robot @ 2024-11-01 5:07 UTC (permalink / raw)
To: Andrii Nakryiko, linux-trace-kernel, bpf, rostedt, ast, daniel,
martin.lau
Cc: oe-kbuild-all, mathieu.desnoyers, linux-kernel, mhiramat, peterz,
paulmck, jrife, Andrii Nakryiko
Hi Andrii,
kernel test robot noticed the following build errors:
[auto build test ERROR on trace/for-next]
url: https://github.com/intel-lab-lkp/linux/commits/Andrii-Nakryiko/bpf-decouple-BPF-link-attach-hook-and-BPF-program-sleepable-semantics/20241101-051131
base: https://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace for-next
patch link: https://lore.kernel.org/r/20241031210938.1696639-2-andrii%40kernel.org
patch subject: [PATCH trace/for-next 2/3] bpf: decouple BPF link/attach hook and BPF program sleepable semantics
config: x86_64-defconfig (https://download.01.org/0day-ci/archive/20241101/202411011244.LrXOUj8p-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-12) 11.3.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241101/202411011244.LrXOUj8p-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411011244.LrXOUj8p-lkp@intel.com/
All errors (new ones prefixed by >>):
net/core/dev.c: In function 'bpf_xdp_link_attach':
>> net/core/dev.c:9767:9: error: too few arguments to function 'bpf_link_init'
9767 | bpf_link_init(&link->link, BPF_LINK_TYPE_XDP, &bpf_xdp_link_lops, prog);
| ^~~~~~~~~~~~~
In file included from include/linux/security.h:35,
from include/net/scm.h:9,
from include/linux/netlink.h:9,
from include/uapi/linux/neighbour.h:6,
from include/linux/netdevice.h:44,
from net/core/dev.c:92:
include/linux/bpf.h:2724:20: note: declared here
2724 | static inline void bpf_link_init(struct bpf_link *link, enum bpf_link_type type,
| ^~~~~~~~~~~~~
vim +/bpf_link_init +9767 net/core/dev.c
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9744
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9745 int bpf_xdp_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9746 {
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9747 struct net *net = current->nsproxy->net_ns;
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9748 struct bpf_link_primer link_primer;
bf4ea1d0b2cb22 Leon Hwang 2023-08-01 9749 struct netlink_ext_ack extack = {};
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9750 struct bpf_xdp_link *link;
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9751 struct net_device *dev;
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9752 int err, fd;
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9753
5acc7d3e8d3428 Xuan Zhuo 2021-07-10 9754 rtnl_lock();
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9755 dev = dev_get_by_index(net, attr->link_create.target_ifindex);
5acc7d3e8d3428 Xuan Zhuo 2021-07-10 9756 if (!dev) {
5acc7d3e8d3428 Xuan Zhuo 2021-07-10 9757 rtnl_unlock();
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9758 return -EINVAL;
5acc7d3e8d3428 Xuan Zhuo 2021-07-10 9759 }
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9760
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9761 link = kzalloc(sizeof(*link), GFP_USER);
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9762 if (!link) {
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9763 err = -ENOMEM;
5acc7d3e8d3428 Xuan Zhuo 2021-07-10 9764 goto unlock;
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9765 }
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9766
aa8d3a716b59db Andrii Nakryiko 2020-07-21 @9767 bpf_link_init(&link->link, BPF_LINK_TYPE_XDP, &bpf_xdp_link_lops, prog);
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9768 link->dev = dev;
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9769 link->flags = attr->link_create.flags;
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9770
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9771 err = bpf_link_prime(&link->link, &link_primer);
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9772 if (err) {
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9773 kfree(link);
5acc7d3e8d3428 Xuan Zhuo 2021-07-10 9774 goto unlock;
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9775 }
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9776
bf4ea1d0b2cb22 Leon Hwang 2023-08-01 9777 err = dev_xdp_attach_link(dev, &extack, link);
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9778 rtnl_unlock();
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9779
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9780 if (err) {
5acc7d3e8d3428 Xuan Zhuo 2021-07-10 9781 link->dev = NULL;
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9782 bpf_link_cleanup(&link_primer);
bf4ea1d0b2cb22 Leon Hwang 2023-08-01 9783 trace_bpf_xdp_link_attach_failed(extack._msg);
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9784 goto out_put_dev;
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9785 }
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9786
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9787 fd = bpf_link_settle(&link_primer);
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9788 /* link itself doesn't hold dev's refcnt to not complicate shutdown */
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9789 dev_put(dev);
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9790 return fd;
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9791
5acc7d3e8d3428 Xuan Zhuo 2021-07-10 9792 unlock:
5acc7d3e8d3428 Xuan Zhuo 2021-07-10 9793 rtnl_unlock();
5acc7d3e8d3428 Xuan Zhuo 2021-07-10 9794
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9795 out_put_dev:
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9796 dev_put(dev);
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9797 return err;
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9798 }
aa8d3a716b59db Andrii Nakryiko 2020-07-21 9799
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH trace/for-next 3/3] bpf: ensure RCU Tasks Trace GP for sleepable raw tracepoint BPF links
2024-10-31 21:09 ` [PATCH trace/for-next 3/3] bpf: ensure RCU Tasks Trace GP for sleepable raw tracepoint BPF links Andrii Nakryiko
@ 2024-11-01 5:07 ` kernel test robot
2024-11-01 5:07 ` kernel test robot
2024-11-01 15:03 ` Jordan Rife
2 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2024-11-01 5:07 UTC (permalink / raw)
To: Andrii Nakryiko, linux-trace-kernel, bpf, rostedt, ast, daniel,
martin.lau
Cc: llvm, oe-kbuild-all, mathieu.desnoyers, linux-kernel, mhiramat,
peterz, paulmck, jrife, Andrii Nakryiko
Hi Andrii,
kernel test robot noticed the following build errors:
[auto build test ERROR on trace/for-next]
url: https://github.com/intel-lab-lkp/linux/commits/Andrii-Nakryiko/bpf-decouple-BPF-link-attach-hook-and-BPF-program-sleepable-semantics/20241101-051131
base: https://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace for-next
patch link: https://lore.kernel.org/r/20241031210938.1696639-3-andrii%40kernel.org
patch subject: [PATCH trace/for-next 3/3] bpf: ensure RCU Tasks Trace GP for sleepable raw tracepoint BPF links
config: i386-buildonly-randconfig-001-20241101 (https://download.01.org/0day-ci/archive/20241101/202411011258.IemsLYSp-lkp@intel.com/config)
compiler: clang version 19.1.3 (https://github.com/llvm/llvm-project ab51eccf88f5321e7c60591c5546b254b6afab99)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241101/202411011258.IemsLYSp-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411011258.IemsLYSp-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from kernel/bpf/syscall.c:4:
In file included from include/linux/bpf.h:21:
In file included from include/linux/kallsyms.h:13:
In file included from include/linux/mm.h:2213:
include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
518 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
| ~~~~~~~~~~~ ^ ~~~
>> kernel/bpf/syscall.c:3866:5: error: call to undeclared function 'tracepoint_is_faultable'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
3866 | tracepoint_is_faultable(btp->tp));
| ^
kernel/bpf/syscall.c:5876:30: warning: bitwise operation between different enumeration types ('enum bpf_arg_type' and 'enum bpf_type_flag') [-Wenum-enum-conversion]
5876 | .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY,
| ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~
kernel/bpf/syscall.c:5926:41: warning: bitwise operation between different enumeration types ('enum bpf_arg_type' and 'enum bpf_type_flag') [-Wenum-enum-conversion]
5926 | .arg4_type = ARG_PTR_TO_FIXED_SIZE_MEM | MEM_UNINIT | MEM_ALIGNED,
| ~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~
3 warnings and 1 error generated.
vim +/tracepoint_is_faultable +3866 kernel/bpf/syscall.c
3818
3819 static int bpf_raw_tp_link_attach(struct bpf_prog *prog,
3820 const char __user *user_tp_name, u64 cookie)
3821 {
3822 struct bpf_link_primer link_primer;
3823 struct bpf_raw_tp_link *link;
3824 struct bpf_raw_event_map *btp;
3825 const char *tp_name;
3826 char buf[128];
3827 int err;
3828
3829 switch (prog->type) {
3830 case BPF_PROG_TYPE_TRACING:
3831 case BPF_PROG_TYPE_EXT:
3832 case BPF_PROG_TYPE_LSM:
3833 if (user_tp_name)
3834 /* The attach point for this category of programs
3835 * should be specified via btf_id during program load.
3836 */
3837 return -EINVAL;
3838 if (prog->type == BPF_PROG_TYPE_TRACING &&
3839 prog->expected_attach_type == BPF_TRACE_RAW_TP) {
3840 tp_name = prog->aux->attach_func_name;
3841 break;
3842 }
3843 return bpf_tracing_prog_attach(prog, 0, 0, 0);
3844 case BPF_PROG_TYPE_RAW_TRACEPOINT:
3845 case BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE:
3846 if (strncpy_from_user(buf, user_tp_name, sizeof(buf) - 1) < 0)
3847 return -EFAULT;
3848 buf[sizeof(buf) - 1] = 0;
3849 tp_name = buf;
3850 break;
3851 default:
3852 return -EINVAL;
3853 }
3854
3855 btp = bpf_get_raw_tracepoint(tp_name);
3856 if (!btp)
3857 return -ENOENT;
3858
3859 link = kzalloc(sizeof(*link), GFP_USER);
3860 if (!link) {
3861 err = -ENOMEM;
3862 goto out_put_btp;
3863 }
3864 bpf_link_init_sleepable(&link->link, BPF_LINK_TYPE_RAW_TRACEPOINT,
3865 &bpf_raw_tp_link_lops, prog,
> 3866 tracepoint_is_faultable(btp->tp));
3867 link->btp = btp;
3868 link->cookie = cookie;
3869
3870 err = bpf_link_prime(&link->link, &link_primer);
3871 if (err) {
3872 kfree(link);
3873 goto out_put_btp;
3874 }
3875
3876 err = bpf_probe_register(link->btp, link);
3877 if (err) {
3878 bpf_link_cleanup(&link_primer);
3879 goto out_put_btp;
3880 }
3881
3882 return bpf_link_settle(&link_primer);
3883
3884 out_put_btp:
3885 bpf_put_raw_tracepoint(btp);
3886 return err;
3887 }
3888
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH trace/for-next 3/3] bpf: ensure RCU Tasks Trace GP for sleepable raw tracepoint BPF links
2024-10-31 21:09 ` [PATCH trace/for-next 3/3] bpf: ensure RCU Tasks Trace GP for sleepable raw tracepoint BPF links Andrii Nakryiko
2024-11-01 5:07 ` kernel test robot
@ 2024-11-01 5:07 ` kernel test robot
2024-11-01 15:03 ` Jordan Rife
2 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2024-11-01 5:07 UTC (permalink / raw)
To: Andrii Nakryiko, linux-trace-kernel, bpf, rostedt, ast, daniel,
martin.lau
Cc: oe-kbuild-all, mathieu.desnoyers, linux-kernel, mhiramat, peterz,
paulmck, jrife, Andrii Nakryiko
Hi Andrii,
kernel test robot noticed the following build errors:
[auto build test ERROR on trace/for-next]
url: https://github.com/intel-lab-lkp/linux/commits/Andrii-Nakryiko/bpf-decouple-BPF-link-attach-hook-and-BPF-program-sleepable-semantics/20241101-051131
base: https://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace for-next
patch link: https://lore.kernel.org/r/20241031210938.1696639-3-andrii%40kernel.org
patch subject: [PATCH trace/for-next 3/3] bpf: ensure RCU Tasks Trace GP for sleepable raw tracepoint BPF links
config: x86_64-buildonly-randconfig-001-20241101 (https://download.01.org/0day-ci/archive/20241101/202411011255.GYntOfN5-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241101/202411011255.GYntOfN5-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411011255.GYntOfN5-lkp@intel.com/
All errors (new ones prefixed by >>):
kernel/bpf/syscall.c: In function 'bpf_raw_tp_link_attach':
>> kernel/bpf/syscall.c:3866:33: error: implicit declaration of function 'tracepoint_is_faultable' [-Werror=implicit-function-declaration]
3866 | tracepoint_is_faultable(btp->tp));
| ^~~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
vim +/tracepoint_is_faultable +3866 kernel/bpf/syscall.c
3818
3819 static int bpf_raw_tp_link_attach(struct bpf_prog *prog,
3820 const char __user *user_tp_name, u64 cookie)
3821 {
3822 struct bpf_link_primer link_primer;
3823 struct bpf_raw_tp_link *link;
3824 struct bpf_raw_event_map *btp;
3825 const char *tp_name;
3826 char buf[128];
3827 int err;
3828
3829 switch (prog->type) {
3830 case BPF_PROG_TYPE_TRACING:
3831 case BPF_PROG_TYPE_EXT:
3832 case BPF_PROG_TYPE_LSM:
3833 if (user_tp_name)
3834 /* The attach point for this category of programs
3835 * should be specified via btf_id during program load.
3836 */
3837 return -EINVAL;
3838 if (prog->type == BPF_PROG_TYPE_TRACING &&
3839 prog->expected_attach_type == BPF_TRACE_RAW_TP) {
3840 tp_name = prog->aux->attach_func_name;
3841 break;
3842 }
3843 return bpf_tracing_prog_attach(prog, 0, 0, 0);
3844 case BPF_PROG_TYPE_RAW_TRACEPOINT:
3845 case BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE:
3846 if (strncpy_from_user(buf, user_tp_name, sizeof(buf) - 1) < 0)
3847 return -EFAULT;
3848 buf[sizeof(buf) - 1] = 0;
3849 tp_name = buf;
3850 break;
3851 default:
3852 return -EINVAL;
3853 }
3854
3855 btp = bpf_get_raw_tracepoint(tp_name);
3856 if (!btp)
3857 return -ENOENT;
3858
3859 link = kzalloc(sizeof(*link), GFP_USER);
3860 if (!link) {
3861 err = -ENOMEM;
3862 goto out_put_btp;
3863 }
3864 bpf_link_init_sleepable(&link->link, BPF_LINK_TYPE_RAW_TRACEPOINT,
3865 &bpf_raw_tp_link_lops, prog,
> 3866 tracepoint_is_faultable(btp->tp));
3867 link->btp = btp;
3868 link->cookie = cookie;
3869
3870 err = bpf_link_prime(&link->link, &link_primer);
3871 if (err) {
3872 kfree(link);
3873 goto out_put_btp;
3874 }
3875
3876 err = bpf_probe_register(link->btp, link);
3877 if (err) {
3878 bpf_link_cleanup(&link_primer);
3879 goto out_put_btp;
3880 }
3881
3882 return bpf_link_settle(&link_primer);
3883
3884 out_put_btp:
3885 bpf_put_raw_tracepoint(btp);
3886 return err;
3887 }
3888
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH trace/for-next 1/3] bpf: put bpf_link's program when link is safe to be deallocated
2024-10-31 21:09 [PATCH trace/for-next 1/3] bpf: put bpf_link's program when link is safe to be deallocated Andrii Nakryiko
2024-10-31 21:09 ` [PATCH trace/for-next 2/3] bpf: decouple BPF link/attach hook and BPF program sleepable semantics Andrii Nakryiko
2024-10-31 21:09 ` [PATCH trace/for-next 3/3] bpf: ensure RCU Tasks Trace GP for sleepable raw tracepoint BPF links Andrii Nakryiko
@ 2024-11-01 10:55 ` Steven Rostedt
2024-11-01 17:54 ` Andrii Nakryiko
2 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2024-11-01 10:55 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: linux-trace-kernel, bpf, ast, daniel, martin.lau,
mathieu.desnoyers, linux-kernel, mhiramat, peterz, paulmck, jrife
On Thu, 31 Oct 2024 14:09:36 -0700
Andrii Nakryiko <andrii@kernel.org> wrote:
> In general, BPF link's underlying BPF program should be considered to be
> reachable through attach hook -> link -> prog chain, and, pessimistically,
> we have to assume that as long as link's memory is not safe to free,
> attach hook's code might hold a pointer to BPF program and use it.
>
> As such, it's not (generally) correct to put link's program early before
> waiting for RCU GPs to go through. More eager bpf_prog_put() that we
> currently do is mostly correct due to BPF program's release code doing
> similar RCU GP waiting, but as will be shown in the following patches,
> BPF program can be non-sleepable (and, thus, reliant on only "classic"
> RCU GP), while BPF link's attach hook can have sleepable semantics and
> needs to be protected by RCU Tasks Trace, and for such cases BPF link
> has to go through RCU Tasks Trace + "classic" RCU GPs before being
> deallocated. And so, if we put BPF program early, we might free BPF
> program before we free BPF link, leading to use-after-free situation.
>
> So, this patch defers bpf_prog_put() until we are ready to perform
> bpf_link's deallocation. At worst, this delays BPF program freeing by
> one extra RCU GP, but that seems completely acceptable. Alternatively,
> we'd need more elaborate ways to determine BPF hook, BPF link, and BPF
> program lifetimes, and how they relate to each other, which seems like
> an unnecessary complication.
>
> Note, for most BPF links we still will perform eager bpf_prog_put() and
> link dealloc, so for those BPF links there are no observable changes
> whatsoever. Only BPF links that use deferred dealloc might notice
> slightly delayed freeing of BPF programs.
>
> Also, to reduce code and logic duplication, extract program put + link
> dealloc logic into bpf_link_dealloc() helper.
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Hi Andrii,
Do you want me to add this on top of my queue? If so, would it be possible
that I can get a tested-by from someone? As I don't do much to test BPF
patches.
-- Steve
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH trace/for-next 3/3] bpf: ensure RCU Tasks Trace GP for sleepable raw tracepoint BPF links
2024-10-31 21:09 ` [PATCH trace/for-next 3/3] bpf: ensure RCU Tasks Trace GP for sleepable raw tracepoint BPF links Andrii Nakryiko
2024-11-01 5:07 ` kernel test robot
2024-11-01 5:07 ` kernel test robot
@ 2024-11-01 15:03 ` Jordan Rife
2024-11-01 17:53 ` Andrii Nakryiko
2 siblings, 1 reply; 12+ messages in thread
From: Jordan Rife @ 2024-11-01 15:03 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: linux-trace-kernel, bpf, rostedt, ast, daniel, martin.lau,
mathieu.desnoyers, linux-kernel, mhiramat, peterz, paulmck
Just to confirm, I ran the reproducer from [1] after combining this
series with Mathieu's from [2] and it ran for 20m with no issues.
[1]: https://lore.kernel.org/bpf/67121037.050a0220.10f4f4.000f.GAE@google.com/
[2]: https://lore.kernel.org/bpf/20241031152056.744137-1-mathieu.desnoyers@efficios.com/T/#u
Tested-by: Jordan Rife <jrife@google.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH trace/for-next 2/3] bpf: decouple BPF link/attach hook and BPF program sleepable semantics
2024-10-31 21:09 ` [PATCH trace/for-next 2/3] bpf: decouple BPF link/attach hook and BPF program sleepable semantics Andrii Nakryiko
2024-11-01 5:07 ` kernel test robot
@ 2024-11-01 16:26 ` Alexei Starovoitov
2024-11-01 17:51 ` Andrii Nakryiko
1 sibling, 1 reply; 12+ messages in thread
From: Alexei Starovoitov @ 2024-11-01 16:26 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: linux-trace-kernel, bpf, Steven Rostedt, Alexei Starovoitov,
Daniel Borkmann, Martin KaFai Lau, Mathieu Desnoyers, LKML,
Masami Hiramatsu, Peter Zijlstra, Paul E. McKenney, Jordan Rife
On Thu, Oct 31, 2024 at 2:23 PM Andrii Nakryiko <andrii@kernel.org> wrote:
>
> static inline void bpf_link_init(struct bpf_link *link, enum bpf_link_type type,
> const struct bpf_link_ops *ops,
> - struct bpf_prog *prog)
> + struct bpf_prog *prog, bool sleepable)
> +{
> +}
Obvious typo caught by build bot...
Other than that the set looks good.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH trace/for-next 2/3] bpf: decouple BPF link/attach hook and BPF program sleepable semantics
2024-11-01 16:26 ` Alexei Starovoitov
@ 2024-11-01 17:51 ` Andrii Nakryiko
0 siblings, 0 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2024-11-01 17:51 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Andrii Nakryiko, linux-trace-kernel, bpf, Steven Rostedt,
Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
Mathieu Desnoyers, LKML, Masami Hiramatsu, Peter Zijlstra,
Paul E. McKenney, Jordan Rife
On Fri, Nov 1, 2024 at 9:27 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Oct 31, 2024 at 2:23 PM Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > static inline void bpf_link_init(struct bpf_link *link, enum bpf_link_type type,
> > const struct bpf_link_ops *ops,
> > - struct bpf_prog *prog)
> > + struct bpf_prog *prog, bool sleepable)
> > +{
> > +}
>
> Obvious typo caught by build bot...
> Other than that the set looks good.
Yeah, leftover from the initial attempt (I decided to not touch
bpf_link_init() in the end to avoid updating like 20 places where we
do this for various link types). I'll send a new version.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH trace/for-next 3/3] bpf: ensure RCU Tasks Trace GP for sleepable raw tracepoint BPF links
2024-11-01 15:03 ` Jordan Rife
@ 2024-11-01 17:53 ` Andrii Nakryiko
0 siblings, 0 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2024-11-01 17:53 UTC (permalink / raw)
To: Jordan Rife
Cc: Andrii Nakryiko, linux-trace-kernel, bpf, rostedt, ast, daniel,
martin.lau, mathieu.desnoyers, linux-kernel, mhiramat, peterz,
paulmck
On Fri, Nov 1, 2024 at 8:03 AM Jordan Rife <jrife@google.com> wrote:
>
> Just to confirm, I ran the reproducer from [1] after combining this
> series with Mathieu's from [2] and it ran for 20m with no issues.
>
> [1]: https://lore.kernel.org/bpf/67121037.050a0220.10f4f4.000f.GAE@google.com/
> [2]: https://lore.kernel.org/bpf/20241031152056.744137-1-mathieu.desnoyers@efficios.com/T/#u
>
> Tested-by: Jordan Rife <jrife@google.com>
Great, thank you, Jordan! I was going to ask you specifically to
double-check, as I couldn't repro the original issue locally with my
setup (and I was too lazy to mess with custom images and stuff).
I need to send a fixed up v2, I'll add your tested-by.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH trace/for-next 1/3] bpf: put bpf_link's program when link is safe to be deallocated
2024-11-01 10:55 ` [PATCH trace/for-next 1/3] bpf: put bpf_link's program when link is safe to be deallocated Steven Rostedt
@ 2024-11-01 17:54 ` Andrii Nakryiko
0 siblings, 0 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2024-11-01 17:54 UTC (permalink / raw)
To: Steven Rostedt
Cc: Andrii Nakryiko, linux-trace-kernel, bpf, ast, daniel, martin.lau,
mathieu.desnoyers, linux-kernel, mhiramat, peterz, paulmck, jrife
On Fri, Nov 1, 2024 at 3:55 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Thu, 31 Oct 2024 14:09:36 -0700
> Andrii Nakryiko <andrii@kernel.org> wrote:
>
> > In general, BPF link's underlying BPF program should be considered to be
> > reachable through attach hook -> link -> prog chain, and, pessimistically,
> > we have to assume that as long as link's memory is not safe to free,
> > attach hook's code might hold a pointer to BPF program and use it.
> >
> > As such, it's not (generally) correct to put link's program early before
> > waiting for RCU GPs to go through. More eager bpf_prog_put() that we
> > currently do is mostly correct due to BPF program's release code doing
> > similar RCU GP waiting, but as will be shown in the following patches,
> > BPF program can be non-sleepable (and, thus, reliant on only "classic"
> > RCU GP), while BPF link's attach hook can have sleepable semantics and
> > needs to be protected by RCU Tasks Trace, and for such cases BPF link
> > has to go through RCU Tasks Trace + "classic" RCU GPs before being
> > deallocated. And so, if we put BPF program early, we might free BPF
> > program before we free BPF link, leading to use-after-free situation.
> >
> > So, this patch defers bpf_prog_put() until we are ready to perform
> > bpf_link's deallocation. At worst, this delays BPF program freeing by
> > one extra RCU GP, but that seems completely acceptable. Alternatively,
> > we'd need more elaborate ways to determine BPF hook, BPF link, and BPF
> > program lifetimes, and how they relate to each other, which seems like
> > an unnecessary complication.
> >
> > Note, for most BPF links we still will perform eager bpf_prog_put() and
> > link dealloc, so for those BPF links there are no observable changes
> > whatsoever. Only BPF links that use deferred dealloc might notice
> > slightly delayed freeing of BPF programs.
> >
> > Also, to reduce code and logic duplication, extract program put + link
> > dealloc logic into bpf_link_dealloc() helper.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>
>
> Hi Andrii,
>
> Do you want me to add this on top of my queue? If so, would it be possible
> that I can get a tested-by from someone? As I don't do much to test BPF
> patches.
Hey Steven,
Yep, this should go on top of Mathieu's patch set. Let me send v2 with
fixed up stub definition. Jordan gave his Tested-By and Alexei seems
fine with this as well, so I think it should be good to go.
>
> -- Steve
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-11-01 17:54 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-31 21:09 [PATCH trace/for-next 1/3] bpf: put bpf_link's program when link is safe to be deallocated Andrii Nakryiko
2024-10-31 21:09 ` [PATCH trace/for-next 2/3] bpf: decouple BPF link/attach hook and BPF program sleepable semantics Andrii Nakryiko
2024-11-01 5:07 ` kernel test robot
2024-11-01 16:26 ` Alexei Starovoitov
2024-11-01 17:51 ` Andrii Nakryiko
2024-10-31 21:09 ` [PATCH trace/for-next 3/3] bpf: ensure RCU Tasks Trace GP for sleepable raw tracepoint BPF links Andrii Nakryiko
2024-11-01 5:07 ` kernel test robot
2024-11-01 5:07 ` kernel test robot
2024-11-01 15:03 ` Jordan Rife
2024-11-01 17:53 ` Andrii Nakryiko
2024-11-01 10:55 ` [PATCH trace/for-next 1/3] bpf: put bpf_link's program when link is safe to be deallocated Steven Rostedt
2024-11-01 17:54 ` Andrii Nakryiko
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).