[PATCH RFC bpf-next v6 0/6] sleepable bpf_timer (was: allow HID-BPF to do device IOs)

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH RFC bpf-next v6 0/6] sleepable bpf_timer (was: allow HID-BPF to do device IOs)
@ 2024-04-08  8:09 Benjamin Tissoires
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 1/6] bpf/helpers: introduce sleepable bpf_timers Benjamin Tissoires
                   ` (6 more replies)
  0 siblings, 7 replies; 21+ messages in thread
From: Benjamin Tissoires @ 2024-04-08  8:09 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan
  Cc: Benjamin Tissoires, bpf, linux-kernel, linux-kselftest,
	Kumar Kartikeya Dwivedi

New version of the sleepable bpf_timer code.

I'm posting this as this is the result of the previous review, so we can
have a baseline to compare to.

The plan is now to introduce a new user API struct bpf_wq, as the timer
API working on softIRQ seems to be quite far away from a wq.

For reference, the use cases I have in mind:

---

Basically, I need to be able to defer a HID-BPF program for the
following reasons (from the aforementioned patch):
1. defer an event:
   Sometimes we receive an out of proximity event, but the device can not
   be trusted enough, and we need to ensure that we won't receive another
   one in the following n milliseconds. So we need to wait those n
   milliseconds, and eventually re-inject that event in the stack.

2. inject new events in reaction to one given event:
   We might want to transform one given event into several. This is the
   case for macro keys where a single key press is supposed to send
   a sequence of key presses. But this could also be used to patch a
   faulty behavior, if a device forgets to send a release event.

3. communicate with the device in reaction to one event:
   We might want to communicate back to the device after a given event.
   For example a device might send us an event saying that it came back
   from sleeping state and needs to be re-initialized.

Currently we can achieve that by keeping a userspace program around,
raise a bpf event, and let that userspace program inject the events and
commands.
However, we are just keeping that program alive as a daemon for just
scheduling commands. There is no logic in it, so it doesn't really justify
an actual userspace wakeup. So a kernel workqueue seems simpler to handle.

bpf_timers are currently running in a soft IRQ context, this patch
series implements a sleppable context for them.

Cheers,
Benjamin

To: Alexei Starovoitov <ast@kernel.org>
To: Daniel Borkmann <daniel@iogearbox.net>
To: Andrii Nakryiko <andrii@kernel.org>
To: Martin KaFai Lau <martin.lau@linux.dev>
To: Eduard Zingerman <eddyz87@gmail.com>
To: Song Liu <song@kernel.org>
To: Yonghong Song <yonghong.song@linux.dev>
To: John Fastabend <john.fastabend@gmail.com>
To: KP Singh <kpsingh@kernel.org>
To: Stanislav Fomichev <sdf@google.com>
To: Hao Luo <haoluo@google.com>
To: Jiri Olsa <jolsa@kernel.org>
To: Mykola Lysenko <mykolal@fb.com>
To: Shuah Khan <shuah@kernel.org>
Cc: Benjamin Tissoires <bentiss@kernel.org>
Cc: <bpf@vger.kernel.org>
Cc: <linux-kernel@vger.kernel.org>
Cc: <linux-kselftest@vger.kernel.org>

---
Changes in v6:
- Use of a workqueue to clean up sleepable timers
- integrated Kumar's patch instead of mine
- Link to v5: https://lore.kernel.org/r/20240322-hid-bpf-sleepable-v5-0-179c7b59eaaa@kernel.org

Changes in v5:
- took various reviews into account
- rewrote the tests to be separated to not have a uggly include
- Link to v4: https://lore.kernel.org/r/20240315-hid-bpf-sleepable-v4-0-5658f2540564@kernel.org

Changes in v4:
- dropped the HID changes, they can go independently from bpf-core
- addressed Alexei's and Eduard's remarks
- added selftests
- Link to v3: https://lore.kernel.org/r/20240221-hid-bpf-sleepable-v3-0-1fb378ca6301@kernel.org

Changes in v3:
- fixed the crash from v2
- changed the API to have only BPF_F_TIMER_SLEEPABLE for
  bpf_timer_start()
- split the new kfuncs/verifier patch into several sub-patches, for
  easier reviews
- Link to v2: https://lore.kernel.org/r/20240214-hid-bpf-sleepable-v2-0-5756b054724d@kernel.org

Changes in v2:
- make use of bpf_timer (and dropped the custom HID handling)
- implemented bpf_timer_set_sleepable_cb as a kfunc
- still not implemented global subprogs
- no sleepable bpf_timer selftests yet
- Link to v1: https://lore.kernel.org/r/20240209-hid-bpf-sleepable-v1-0-4cc895b5adbd@kernel.org

---
Benjamin Tissoires (5):
      bpf/helpers: introduce sleepable bpf_timers
      bpf/helpers: introduce bpf_timer_set_sleepable_cb() kfunc
      bpf/helpers: mark the callback of bpf_timer_set_sleepable_cb() as sleepable
      tools: sync include/uapi/linux/bpf.h
      selftests/bpf: add sleepable timer tests

Kumar Kartikeya Dwivedi (1):
      bpf: Add support for KF_ARG_PTR_TO_TIMER

 include/linux/bpf_verifier.h                       |   1 +
 include/uapi/linux/bpf.h                           |  13 ++
 kernel/bpf/helpers.c                               | 202 ++++++++++++++++---
 kernel/bpf/verifier.c                              |  98 +++++++++-
 tools/include/uapi/linux/bpf.h                     |  20 +-
 tools/testing/selftests/bpf/bpf_experimental.h     |   5 +
 .../selftests/bpf/bpf_testmod/bpf_testmod.c        |   5 +
 .../selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h  |   1 +
 tools/testing/selftests/bpf/prog_tests/timer.c     |  34 ++++
 .../testing/selftests/bpf/progs/timer_sleepable.c  | 213 +++++++++++++++++++++
 10 files changed, 553 insertions(+), 39 deletions(-)
---
base-commit: 61df575632d6b39213f47810c441bddbd87c3606
change-id: 20240205-hid-bpf-sleepable-c01260fd91c4

Best regards,
-- 
Benjamin Tissoires <bentiss@kernel.org>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH RFC bpf-next v6 1/6] bpf/helpers: introduce sleepable bpf_timers
  2024-04-08  8:09 [PATCH RFC bpf-next v6 0/6] sleepable bpf_timer (was: allow HID-BPF to do device IOs) Benjamin Tissoires
@ 2024-04-08  8:09 ` Benjamin Tissoires
  2024-04-08 17:07   ` Eduard Zingerman
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 2/6] bpf: Add support for KF_ARG_PTR_TO_TIMER bentiss
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Benjamin Tissoires @ 2024-04-08  8:09 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan
  Cc: Benjamin Tissoires, bpf, linux-kernel, linux-kselftest

They are implemented as a workqueue, which means that there are no
guarantees of timing nor ordering.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>

---

changes in v6:
- dropped extra spinlock
- implement cancel_and_free of a sleepable timer through
  a dedicated workqueue

no changes in v5

changes in v4:
- dropped __bpf_timer_compute_key()
- use a spin_lock instead of a semaphore
- ensure bpf_timer_cancel_and_free is not complaining about
  non sleepable context and use cancel_work() instead of
  cancel_work_sync()
- return -EINVAL if a delay is given to bpf_timer_start() with
  BPF_F_TIMER_SLEEPABLE

changes in v3:
- extracted the implementation in bpf_timer only, without
  bpf_timer_set_sleepable_cb()
- rely on schedule_work() only, from bpf_timer_start()
- add semaphore to ensure bpf_timer_work_cb() is accessing
  consistent data

changes in v2 (compared to the one attaches to v1 0/9):
- make use of a kfunc
- add a (non-used) BPF_F_TIMER_SLEEPABLE
- the callback is *not* called, it makes the kernel crashes
---
 include/uapi/linux/bpf.h |  13 ++++
 kernel/bpf/helpers.c     | 159 ++++++++++++++++++++++++++++++++++++++---------
 2 files changed, 143 insertions(+), 29 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 9585f5345353..f1890eed213a 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -7469,6 +7469,19 @@ enum {
 	BPF_F_TIMER_CPU_PIN = (1ULL << 1),
 };
 
+/* Extra flags to control bpf_timer_init() behaviour, in addition to CLOCK_*.
+ *     - BPF_F_TIMER_SLEEPABLE: Timer will run in a workqueue in a sleepable
+ *       context, with no guarantees of ordering nor timing (consider this as
+ *       being just offloaded immediately).
+ */
+enum {
+	/* CLOCK_* are using bits 0 to 3 */
+	BPF_F_TIMER_SLEEPABLE = (1ULL << 4),
+	__MAX_BPF_F_TIMER_INIT,
+};
+
+#define MAX_BPF_F_TIMER_INIT __MAX_BPF_F_TIMER_INIT
+
 /* BPF numbers iterator state */
 struct bpf_iter_num {
 	/* opaque iterator state; having __u64 here allows to preserve correct
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 9234174ccb21..fd05d4358b31 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1096,12 +1096,19 @@ const struct bpf_func_proto bpf_snprintf_proto = {
  * freeing the timers when inner map is replaced or deleted by user space.
  */
 struct bpf_hrtimer {
-	struct hrtimer timer;
+	union {
+		struct hrtimer timer;
+		struct work_struct work;
+	};
 	struct bpf_map *map;
 	struct bpf_prog *prog;
 	void __rcu *callback_fn;
 	void *value;
-	struct rcu_head rcu;
+	union {
+		struct rcu_head rcu;
+		struct work_struct sync_work;
+	};
+	u64 flags;
 };
 
 /* the actual struct hidden inside uapi struct bpf_timer */
@@ -1114,6 +1121,58 @@ struct bpf_timer_kern {
 	struct bpf_spin_lock lock;
 } __attribute__((aligned(8)));
 
+static void bpf_timer_work_cb(struct work_struct *work)
+{
+	struct bpf_hrtimer *t = container_of(work, struct bpf_hrtimer, work);
+	struct bpf_tramp_run_ctx __maybe_unused run_ctx;
+	struct bpf_prog *prog = t->prog;
+	struct bpf_map *map = t->map;
+	bpf_callback_t callback_fn;
+	void *value = t->value;
+	unsigned long flags;
+	void *key;
+	u32 idx;
+
+	BTF_TYPE_EMIT(struct bpf_timer);
+
+	callback_fn = READ_ONCE(t->callback_fn);
+	if (!callback_fn || !prog)
+		return;
+
+	if (map->map_type == BPF_MAP_TYPE_ARRAY) {
+		struct bpf_array *array = container_of(map, struct bpf_array, map);
+
+		/* compute the key */
+		idx = ((char *)value - array->value) / array->elem_size;
+		key = &idx;
+	} else { /* hash or lru */
+		key = value - round_up(map->key_size, 8);
+	}
+
+	run_ctx.bpf_cookie = 0;
+
+	if (!__bpf_prog_enter_sleepable_recur(prog, &run_ctx)) {
+		/* recursion detected */
+		__bpf_prog_exit_sleepable_recur(prog, 0, &run_ctx);
+		return;
+	}
+
+	callback_fn((u64)(long)map, (u64)(long)key, (u64)(long)value, 0, 0);
+	/* The verifier checked that return value is zero. */
+
+	__bpf_prog_exit_sleepable_recur(prog, 0 /* bpf_prog_run does runtime stats */,
+					&run_ctx);
+}
+
+static void bpf_timer_sync_work_cb(struct work_struct *work)
+{
+	struct bpf_hrtimer *t = container_of(work, struct bpf_hrtimer, sync_work);
+
+	cancel_work_sync(&t->work);
+
+	kfree_rcu(t, rcu);
+}
+
 static DEFINE_PER_CPU(struct bpf_hrtimer *, hrtimer_running);
 
 static enum hrtimer_restart bpf_timer_cb(struct hrtimer *hrtimer)
@@ -1155,10 +1214,14 @@ static enum hrtimer_restart bpf_timer_cb(struct hrtimer *hrtimer)
 	return HRTIMER_NORESTART;
 }
 
+#define BPF_TIMER_CLOCK_MASK (MAX_CLOCKS - 1)  /* 0xf */
+#define BPF_TIMER_FLAGS_MASK GENMASK_ULL(63, 4)
+
 BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map,
 	   u64, flags)
 {
-	clockid_t clockid = flags & (MAX_CLOCKS - 1);
+	clockid_t clockid = flags & BPF_TIMER_CLOCK_MASK;
+	u64 bpf_timer_flags = flags & BPF_TIMER_FLAGS_MASK;
 	struct bpf_hrtimer *t;
 	int ret = 0;
 
@@ -1169,7 +1232,7 @@ BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map
 	if (in_nmi())
 		return -EOPNOTSUPP;
 
-	if (flags >= MAX_CLOCKS ||
+	if (bpf_timer_flags & ~(BPF_F_TIMER_SLEEPABLE) ||
 	    /* similar to timerfd except _ALARM variants are not supported */
 	    (clockid != CLOCK_MONOTONIC &&
 	     clockid != CLOCK_REALTIME &&
@@ -1190,8 +1253,14 @@ BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map
 	t->value = (void *)timer - map->record->timer_off;
 	t->map = map;
 	t->prog = NULL;
+	t->flags = flags;
 	rcu_assign_pointer(t->callback_fn, NULL);
-	hrtimer_init(&t->timer, clockid, HRTIMER_MODE_REL_SOFT);
+	if (flags & BPF_F_TIMER_SLEEPABLE) {
+		INIT_WORK(&t->work, bpf_timer_work_cb);
+		INIT_WORK(&t->sync_work, bpf_timer_sync_work_cb);
+	} else {
+		hrtimer_init(&t->timer, clockid, HRTIMER_MODE_REL_SOFT);
+	}
 	t->timer.function = bpf_timer_cb;
 	WRITE_ONCE(timer->timer, t);
 	/* Guarantee the order between timer->timer and map->usercnt. So
@@ -1292,6 +1361,11 @@ BPF_CALL_3(bpf_timer_start, struct bpf_timer_kern *, timer, u64, nsecs, u64, fla
 		goto out;
 	}
 
+	if ((t->flags & BPF_F_TIMER_SLEEPABLE) && nsecs) {
+		ret = -EINVAL;
+		goto out;
+	}
+
 	if (flags & BPF_F_TIMER_ABS)
 		mode = HRTIMER_MODE_ABS_SOFT;
 	else
@@ -1300,7 +1374,10 @@ BPF_CALL_3(bpf_timer_start, struct bpf_timer_kern *, timer, u64, nsecs, u64, fla
 	if (flags & BPF_F_TIMER_CPU_PIN)
 		mode |= HRTIMER_MODE_PINNED;
 
-	hrtimer_start(&t->timer, ns_to_ktime(nsecs), mode);
+	if (t->flags & BPF_F_TIMER_SLEEPABLE)
+		schedule_work(&t->work);
+	else
+		hrtimer_start(&t->timer, ns_to_ktime(nsecs), mode);
 out:
 	__bpf_spin_unlock_irqrestore(&timer->lock);
 	return ret;
@@ -1329,6 +1406,7 @@ static void drop_prog_refcnt(struct bpf_hrtimer *t)
 BPF_CALL_1(bpf_timer_cancel, struct bpf_timer_kern *, timer)
 {
 	struct bpf_hrtimer *t;
+	u64 flags = 0;
 	int ret = 0;
 
 	if (in_nmi())
@@ -1340,6 +1418,7 @@ BPF_CALL_1(bpf_timer_cancel, struct bpf_timer_kern *, timer)
 		ret = -EINVAL;
 		goto out;
 	}
+	flags = t->flags;
 	if (this_cpu_read(hrtimer_running) == t) {
 		/* If bpf callback_fn is trying to bpf_timer_cancel()
 		 * its own timer the hrtimer_cancel() will deadlock
@@ -1351,10 +1430,20 @@ BPF_CALL_1(bpf_timer_cancel, struct bpf_timer_kern *, timer)
 	drop_prog_refcnt(t);
 out:
 	__bpf_spin_unlock_irqrestore(&timer->lock);
-	/* Cancel the timer and wait for associated callback to finish
-	 * if it was running.
-	 */
-	ret = ret ?: hrtimer_cancel(&t->timer);
+
+	if (flags & BPF_F_TIMER_SLEEPABLE) {
+		/* Cancel the sleepable work, but *do not* wait for
+		 * it to finish if it was running as we might not be in a
+		 * sleepable context
+		 */
+		ret = ret ?: cancel_work(&t->work);
+	} else {
+		/* Cancel the timer and wait for associated callback to finish
+		 * if it was running.
+		 */
+		ret = ret ?: hrtimer_cancel(&t->timer);
+	}
+
 	rcu_read_unlock();
 	return ret;
 }
@@ -1373,6 +1462,7 @@ void bpf_timer_cancel_and_free(void *val)
 {
 	struct bpf_timer_kern *timer = val;
 	struct bpf_hrtimer *t;
+	u64 flags;
 
 	/* Performance optimization: read timer->timer without lock first. */
 	if (!READ_ONCE(timer->timer))
@@ -1383,6 +1473,7 @@ void bpf_timer_cancel_and_free(void *val)
 	t = timer->timer;
 	if (!t)
 		goto out;
+	flags = t->flags;
 	drop_prog_refcnt(t);
 	/* The subsequent bpf_timer_start/cancel() helpers won't be able to use
 	 * this timer, since it won't be initialized.
@@ -1392,25 +1483,35 @@ void bpf_timer_cancel_and_free(void *val)
 	__bpf_spin_unlock_irqrestore(&timer->lock);
 	if (!t)
 		return;
-	/* Cancel the timer and wait for callback to complete if it was running.
-	 * If hrtimer_cancel() can be safely called it's safe to call kfree(t)
-	 * right after for both preallocated and non-preallocated maps.
-	 * The timer->timer = NULL was already done and no code path can
-	 * see address 't' anymore.
-	 *
-	 * Check that bpf_map_delete/update_elem() wasn't called from timer
-	 * callback_fn. In such case don't call hrtimer_cancel() (since it will
-	 * deadlock) and don't call hrtimer_try_to_cancel() (since it will just
-	 * return -1). Though callback_fn is still running on this cpu it's
-	 * safe to do kfree(t) because bpf_timer_cb() read everything it needed
-	 * from 't'. The bpf subprog callback_fn won't be able to access 't',
-	 * since timer->timer = NULL was already done. The timer will be
-	 * effectively cancelled because bpf_timer_cb() will return
-	 * HRTIMER_NORESTART.
-	 */
-	if (this_cpu_read(hrtimer_running) != t)
-		hrtimer_cancel(&t->timer);
-	kfree_rcu(t, rcu);
+
+	if (flags & BPF_F_TIMER_SLEEPABLE) {
+		/* Trigger cancel of the sleepable work, but *do not* wait for
+		 * it to finish if it was running as we might not be in a
+		 * sleepable context.
+		 * kfree will be called once the work has finished.
+		 */
+		schedule_work(&t->sync_work);
+	} else {
+		/* Cancel the timer and wait for callback to complete if it was running.
+		 * If hrtimer_cancel() can be safely called it's safe to call kfree(t)
+		 * right after for both preallocated and non-preallocated maps.
+		 * The timer->timer = NULL was already done and no code path can
+		 * see address 't' anymore.
+		 *
+		 * Check that bpf_map_delete/update_elem() wasn't called from timer
+		 * callback_fn. In such case don't call hrtimer_cancel() (since it will
+		 * deadlock) and don't call hrtimer_try_to_cancel() (since it will just
+		 * return -1). Though callback_fn is still running on this cpu it's
+		 * safe to do kfree(t) because bpf_timer_cb() read everything it needed
+		 * from 't'. The bpf subprog callback_fn won't be able to access 't',
+		 * since timer->timer = NULL was already done. The timer will be
+		 * effectively cancelled because bpf_timer_cb() will return
+		 * HRTIMER_NORESTART.
+		 */
+		if (this_cpu_read(hrtimer_running) != t)
+			hrtimer_cancel(&t->timer);
+		kfree_rcu(t, rcu);
+	}
 }
 
 BPF_CALL_2(bpf_kptr_xchg, void *, map_value, void *, ptr)

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC bpf-next v6 1/6] bpf/helpers: introduce sleepable bpf_timers
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 1/6] bpf/helpers: introduce sleepable bpf_timers Benjamin Tissoires
@ 2024-04-08 17:07   ` Eduard Zingerman
  2024-04-08 17:20     ` Benjamin Tissoires
  0 siblings, 1 reply; 21+ messages in thread
From: Eduard Zingerman @ 2024-04-08 17:07 UTC (permalink / raw)
  To: Benjamin Tissoires, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan
  Cc: bpf, linux-kernel, linux-kselftest

On Mon, 2024-04-08 at 10:09 +0200, Benjamin Tissoires wrote:

[...]

> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 9234174ccb21..fd05d4358b31 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -1096,12 +1096,19 @@ const struct bpf_func_proto bpf_snprintf_proto = {
>   * freeing the timers when inner map is replaced or deleted by user space.
>   */
>  struct bpf_hrtimer {
> -	struct hrtimer timer;
> +	union {
> +		struct hrtimer timer;
> +		struct work_struct work;
> +	};
>  	struct bpf_map *map;
>  	struct bpf_prog *prog;
>  	void __rcu *callback_fn;
>  	void *value;
> -	struct rcu_head rcu;
> +	union {
> +		struct rcu_head rcu;
> +		struct work_struct sync_work;

Nit:
I find this name very confusing, the field is used to cancel timer
execution, is it a convention to call such things '...sync...'?

> +	};
> +	u64 flags;
>  };
>  

[...]

> +static void bpf_timer_sync_work_cb(struct work_struct *work)
> +{
> +	struct bpf_hrtimer *t = container_of(work, struct bpf_hrtimer, sync_work);
> +
> +	cancel_work_sync(&t->work);
> +
> +	kfree_rcu(t, rcu);

Sorry, I might be wrong, but this looks suspicious.
The 'rcu' field of 'bpf_hrtimer' is defined as follows:

struct bpf_hrtimer {
	...
	union {
		struct rcu_head rcu;
		struct work_struct sync_work;
	};
	...
};

And for sleepable timers the 'sync_work' field is set as follows:

BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map,
	   u64, flags)
{
	...
	INIT_WORK(&t->sync_work, bpf_timer_sync_work_cb);
	...
}

So, it looks like 'kfree_rcu' would be called for a non-rcu pointer.

> +}
> +



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC bpf-next v6 1/6] bpf/helpers: introduce sleepable bpf_timers
  2024-04-08 17:07   ` Eduard Zingerman
@ 2024-04-08 17:20     ` Benjamin Tissoires
  2024-04-08 21:17       ` Eduard Zingerman
  2024-04-09  3:04       ` Alexei Starovoitov
  0 siblings, 2 replies; 21+ messages in thread
From: Benjamin Tissoires @ 2024-04-08 17:20 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Benjamin Tissoires, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan, bpf, linux-kernel, linux-kselftest

On Mon, Apr 8, 2024 at 7:08 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Mon, 2024-04-08 at 10:09 +0200, Benjamin Tissoires wrote:
>
> [...]
>
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index 9234174ccb21..fd05d4358b31 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -1096,12 +1096,19 @@ const struct bpf_func_proto bpf_snprintf_proto = {
> >   * freeing the timers when inner map is replaced or deleted by user space.
> >   */
> >  struct bpf_hrtimer {
> > -     struct hrtimer timer;
> > +     union {
> > +             struct hrtimer timer;
> > +             struct work_struct work;
> > +     };
> >       struct bpf_map *map;
> >       struct bpf_prog *prog;
> >       void __rcu *callback_fn;
> >       void *value;
> > -     struct rcu_head rcu;
> > +     union {
> > +             struct rcu_head rcu;
> > +             struct work_struct sync_work;
>
> Nit:
> I find this name very confusing, the field is used to cancel timer
> execution, is it a convention to call such things '...sync...'?
>
> > +     };
> > +     u64 flags;
> >  };
> >
>
> [...]
>
> > +static void bpf_timer_sync_work_cb(struct work_struct *work)
> > +{
> > +     struct bpf_hrtimer *t = container_of(work, struct bpf_hrtimer, sync_work);
> > +
> > +     cancel_work_sync(&t->work);
> > +
> > +     kfree_rcu(t, rcu);
>
> Sorry, I might be wrong, but this looks suspicious.
> The 'rcu' field of 'bpf_hrtimer' is defined as follows:
>
> struct bpf_hrtimer {
>         ...
>         union {
>                 struct rcu_head rcu;
>                 struct work_struct sync_work;
>         };
>         ...
> };
>
> And for sleepable timers the 'sync_work' field is set as follows:
>
> BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map,
>            u64, flags)
> {
>         ...
>         INIT_WORK(&t->sync_work, bpf_timer_sync_work_cb);
>         ...
> }
>
> So, it looks like 'kfree_rcu' would be called for a non-rcu pointer.

That was my initial assumption too, but Alexei told me it was fine.
And I think he is correct because kfree_rcu doesn't need the rcu_head
to be initialized.

So in the end, we initialize the memory as a work_struct, and when
that work kicks in, we reuse that exact same memory as the rcu_head.
This is fine because that work will never be reused.

If I understand correctly, this is to save a few bytes as this is a
critical struct used in programs with a high rate usage, and every
byte counts.

Cheers,
Benjamin

>
> > +}
> > +
>
>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC bpf-next v6 1/6] bpf/helpers: introduce sleepable bpf_timers
  2024-04-08 17:20     ` Benjamin Tissoires
@ 2024-04-08 21:17       ` Eduard Zingerman
  2024-04-09  3:04       ` Alexei Starovoitov
  1 sibling, 0 replies; 21+ messages in thread
From: Eduard Zingerman @ 2024-04-08 21:17 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Benjamin Tissoires, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan, bpf, linux-kernel, linux-kselftest

On Mon, 2024-04-08 at 19:20 +0200, Benjamin Tissoires wrote:

[...]

> That was my initial assumption too, but Alexei told me it was fine.
> And I think he is correct because kfree_rcu doesn't need the rcu_head
> to be initialized.
> 
> So in the end, we initialize the memory as a work_struct, and when
> that work kicks in, we reuse that exact same memory as the rcu_head.
> This is fine because that work will never be reused.

Oh, I get it, thank you for explanation.

Thanks,
Eduard

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC bpf-next v6 1/6] bpf/helpers: introduce sleepable bpf_timers
  2024-04-08 17:20     ` Benjamin Tissoires
  2024-04-08 21:17       ` Eduard Zingerman
@ 2024-04-09  3:04       ` Alexei Starovoitov
  1 sibling, 0 replies; 21+ messages in thread
From: Alexei Starovoitov @ 2024-04-09  3:04 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Eduard Zingerman, Benjamin Tissoires, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Mykola Lysenko, Shuah Khan, bpf, LKML,
	open list:KERNEL SELFTEST FRAMEWORK

On Mon, Apr 8, 2024 at 10:20 AM Benjamin Tissoires
<benjamin.tissoires@redhat.com> wrote:
>
> On Mon, Apr 8, 2024 at 7:08 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > On Mon, 2024-04-08 at 10:09 +0200, Benjamin Tissoires wrote:
> >
> > [...]
> >
> > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > > index 9234174ccb21..fd05d4358b31 100644
> > > --- a/kernel/bpf/helpers.c
> > > +++ b/kernel/bpf/helpers.c
> > > @@ -1096,12 +1096,19 @@ const struct bpf_func_proto bpf_snprintf_proto = {
> > >   * freeing the timers when inner map is replaced or deleted by user space.
> > >   */
> > >  struct bpf_hrtimer {
> > > -     struct hrtimer timer;
> > > +     union {
> > > +             struct hrtimer timer;
> > > +             struct work_struct work;
> > > +     };
> > >       struct bpf_map *map;
> > >       struct bpf_prog *prog;
> > >       void __rcu *callback_fn;
> > >       void *value;
> > > -     struct rcu_head rcu;
> > > +     union {
> > > +             struct rcu_head rcu;
> > > +             struct work_struct sync_work;
> >
> > Nit:
> > I find this name very confusing, the field is used to cancel timer
> > execution, is it a convention to call such things '...sync...'?
> >
> > > +     };
> > > +     u64 flags;
> > >  };
> > >
> >
> > [...]
> >
> > > +static void bpf_timer_sync_work_cb(struct work_struct *work)
> > > +{
> > > +     struct bpf_hrtimer *t = container_of(work, struct bpf_hrtimer, sync_work);
> > > +
> > > +     cancel_work_sync(&t->work);
> > > +
> > > +     kfree_rcu(t, rcu);
> >
> > Sorry, I might be wrong, but this looks suspicious.
> > The 'rcu' field of 'bpf_hrtimer' is defined as follows:
> >
> > struct bpf_hrtimer {
> >         ...
> >         union {
> >                 struct rcu_head rcu;
> >                 struct work_struct sync_work;
> >         };
> >         ...
> > };
> >
> > And for sleepable timers the 'sync_work' field is set as follows:
> >
> > BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map,
> >            u64, flags)
> > {
> >         ...
> >         INIT_WORK(&t->sync_work, bpf_timer_sync_work_cb);
> >         ...
> > }
> >
> > So, it looks like 'kfree_rcu' would be called for a non-rcu pointer.
>
> That was my initial assumption too, but Alexei told me it was fine.
> And I think he is correct because kfree_rcu doesn't need the rcu_head
> to be initialized.
>
> So in the end, we initialize the memory as a work_struct, and when
> that work kicks in, we reuse that exact same memory as the rcu_head.
> This is fine because that work will never be reused.
>
> If I understand correctly, this is to save a few bytes as this is a
> critical struct used in programs with a high rate usage, and every
> byte counts.

Yes. All correct.
Probably makes sense to add a comment before kfree_rcu()
line in bpf_timer_sync_work_cb() that
kfree_rcu will wait for bpf_timer_cancel() to finish
as was done in
commit 0281b919e175 ("bpf: Fix racing between
bpf_timer_cancel_and_free and bpf_timer_cancel").

I suspect that's what confused Eduard.

The patch 1 looks great overall.

If we're going to keep this combined bpf_timer_* api for both wq
and timer we'd need to add flags compatibility check
to bpf_timer_start() too.
We can disallow this flag in 'flags' argument and use one from t->flags.
Which kinda makes sense to make bpf_timer_start() less verbose.
Alternatively we can allow bpf_timer_start() to have it,
but then we'd need to check that it is actually passed.
Either way the patch needs an extra check in bpf_timer_start().
Just ignoring BPF_F_TIMER_SLEEPABLE in bpf_timer_start() doesn't look right.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH RFC bpf-next v6 2/6] bpf: Add support for KF_ARG_PTR_TO_TIMER
  2024-04-08  8:09 [PATCH RFC bpf-next v6 0/6] sleepable bpf_timer (was: allow HID-BPF to do device IOs) Benjamin Tissoires
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 1/6] bpf/helpers: introduce sleepable bpf_timers Benjamin Tissoires
@ 2024-04-08  8:09 ` bentiss
  2024-04-08 21:55   ` Eduard Zingerman
  2024-04-12  8:14   ` Benjamin Tissoires
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 3/6] bpf/helpers: introduce bpf_timer_set_sleepable_cb() kfunc Benjamin Tissoires
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 21+ messages in thread
From: bentiss @ 2024-04-08  8:09 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan
  Cc: Benjamin Tissoires, bpf, linux-kernel, linux-kselftest,
	Kumar Kartikeya Dwivedi

From: Kumar Kartikeya Dwivedi <memxor@gmail.com>

Introduce support for KF_ARG_PTR_TO_TIMER. The kfuncs will use bpf_timer
as argument and that will be recognized as timer argument by verifier.
bpf_timer_kern casting can happen inside kfunc, but using bpf_timer in
argument makes life easier for users who work with non-kern type in BPF
progs.

Fix up process_timer_func's meta argument usage (ignore if NULL) so that
we can share the same checks for helpers and kfuncs. meta argument is
only needed to ensure bpf_timer_init's timer and map arguments are
coming from the same map (map_uid logic is necessary for correct
inner-map handling).

No such concerns will be necessary for kfuncs as timer initialization
happens using helpers, hence pass NULL to process_timer_func from kfunc
argument handling code to ignore it.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>

---

changes in v6:
- used Kumar's version of the patch
- reverted `+BTF_ID(struct, bpf_timer_kern)`

changes in v5:
- also check for the reg offset

changes in v4:
- enforce KF_ARG_PTR_TO_TIMER to be of type PTR_TO_MAP_VALUE

new in v3 (split from v2 02/10)
---
 kernel/bpf/verifier.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ca6cacf7b42f..ccfe9057d8dc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7568,12 +7568,16 @@ static int process_timer_func(struct bpf_verifier_env *env, int regno,
 			val + reg->off, map->record->timer_off);
 		return -EINVAL;
 	}
+	/* meta is only needed for bpf_timer_init to match timer and map */
+	if (!meta)
+		goto out;
 	if (meta->map_ptr) {
 		verbose(env, "verifier bug. Two map pointers in a timer helper\n");
 		return -EFAULT;
 	}
 	meta->map_uid = reg->map_uid;
 	meta->map_ptr = map;
+out:
 	return 0;
 }
 
@@ -10826,6 +10830,7 @@ enum {
 	KF_ARG_LIST_NODE_ID,
 	KF_ARG_RB_ROOT_ID,
 	KF_ARG_RB_NODE_ID,
+	KF_ARG_TIMER_ID,
 };
 
 BTF_ID_LIST(kf_arg_btf_ids)
@@ -10834,6 +10839,7 @@ BTF_ID(struct, bpf_list_head)
 BTF_ID(struct, bpf_list_node)
 BTF_ID(struct, bpf_rb_root)
 BTF_ID(struct, bpf_rb_node)
+BTF_ID(struct, bpf_timer_kern)
 
 static bool __is_kfunc_ptr_arg_type(const struct btf *btf,
 				    const struct btf_param *arg, int type)
@@ -10877,6 +10883,11 @@ static bool is_kfunc_arg_rbtree_node(const struct btf *btf, const struct btf_par
 	return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_RB_NODE_ID);
 }
 
+static bool is_kfunc_arg_timer(const struct btf *btf, const struct btf_param *arg)
+{
+	return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_TIMER_ID);
+}
+
 static bool is_kfunc_arg_callback(struct bpf_verifier_env *env, const struct btf *btf,
 				  const struct btf_param *arg)
 {
@@ -10946,6 +10957,7 @@ enum kfunc_ptr_arg_type {
 	KF_ARG_PTR_TO_NULL,
 	KF_ARG_PTR_TO_CONST_STR,
 	KF_ARG_PTR_TO_MAP,
+	KF_ARG_PTR_TO_TIMER,
 };
 
 enum special_kfunc_type {
@@ -11102,6 +11114,9 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
 	if (is_kfunc_arg_map(meta->btf, &args[argno]))
 		return KF_ARG_PTR_TO_MAP;
 
+	if (is_kfunc_arg_timer(meta->btf, &args[argno]))
+		return KF_ARG_PTR_TO_TIMER;
+
 	if ((base_type(reg->type) == PTR_TO_BTF_ID || reg2btf_ids[base_type(reg->type)])) {
 		if (!btf_type_is_struct(ref_t)) {
 			verbose(env, "kernel function %s args#%d pointer type %s %s is not supported\n",
@@ -11735,6 +11750,7 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
 		case KF_ARG_PTR_TO_CALLBACK:
 		case KF_ARG_PTR_TO_REFCOUNTED_KPTR:
 		case KF_ARG_PTR_TO_CONST_STR:
+		case KF_ARG_PTR_TO_TIMER:
 			/* Trusted by default */
 			break;
 		default:
@@ -12021,6 +12037,15 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
 			if (ret)
 				return ret;
 			break;
+		case KF_ARG_PTR_TO_TIMER:
+			if (reg->type != PTR_TO_MAP_VALUE) {
+				verbose(env, "arg#%d doesn't point to a map value\n", i);
+				return -EINVAL;
+			}
+			ret = process_timer_func(env, regno, NULL);
+			if (ret < 0)
+				return ret;
+			break;
 		}
 	}
 

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC bpf-next v6 2/6] bpf: Add support for KF_ARG_PTR_TO_TIMER
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 2/6] bpf: Add support for KF_ARG_PTR_TO_TIMER bentiss
@ 2024-04-08 21:55   ` Eduard Zingerman
  2024-04-12  8:14   ` Benjamin Tissoires
  1 sibling, 0 replies; 21+ messages in thread
From: Eduard Zingerman @ 2024-04-08 21:55 UTC (permalink / raw)
  To: bentiss, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Mykola Lysenko,
	Shuah Khan
  Cc: bpf, linux-kernel, linux-kselftest, Kumar Kartikeya Dwivedi

On Mon, 2024-04-08 at 10:09 +0200, bentiss@kernel.org wrote:
> From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> 
> Introduce support for KF_ARG_PTR_TO_TIMER. The kfuncs will use bpf_timer
> as argument and that will be recognized as timer argument by verifier.
> bpf_timer_kern casting can happen inside kfunc, but using bpf_timer in
> argument makes life easier for users who work with non-kern type in BPF
> progs.
> 
> Fix up process_timer_func's meta argument usage (ignore if NULL) so that
> we can share the same checks for helpers and kfuncs. meta argument is
> only needed to ensure bpf_timer_init's timer and map arguments are
> coming from the same map (map_uid logic is necessary for correct
> inner-map handling).
> 
> No such concerns will be necessary for kfuncs as timer initialization
> happens using helpers, hence pass NULL to process_timer_func from kfunc
> argument handling code to ignore it.
> 
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
> 
> ---

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC bpf-next v6 2/6] bpf: Add support for KF_ARG_PTR_TO_TIMER
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 2/6] bpf: Add support for KF_ARG_PTR_TO_TIMER bentiss
  2024-04-08 21:55   ` Eduard Zingerman
@ 2024-04-12  8:14   ` Benjamin Tissoires
  1 sibling, 0 replies; 21+ messages in thread
From: Benjamin Tissoires @ 2024-04-12  8:14 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan
  Cc: bpf, linux-kernel, linux-kselftest, Kumar Kartikeya Dwivedi

On Apr 08 2024, bentiss@kernel.org wrote:
> From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> 
> Introduce support for KF_ARG_PTR_TO_TIMER. The kfuncs will use bpf_timer
> as argument and that will be recognized as timer argument by verifier.
> bpf_timer_kern casting can happen inside kfunc, but using bpf_timer in
> argument makes life easier for users who work with non-kern type in BPF
> progs.
> 
> Fix up process_timer_func's meta argument usage (ignore if NULL) so that
> we can share the same checks for helpers and kfuncs. meta argument is
> only needed to ensure bpf_timer_init's timer and map arguments are
> coming from the same map (map_uid logic is necessary for correct
> inner-map handling).
> 
> No such concerns will be necessary for kfuncs as timer initialization
> happens using helpers, hence pass NULL to process_timer_func from kfunc
> argument handling code to ignore it.
> 
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
> 
> ---
> 
> changes in v6:
> - used Kumar's version of the patch
> - reverted `+BTF_ID(struct, bpf_timer_kern)`

My bad. While working on bpf_wq I realized I shouldn't have touched this
part. See below:

> 
> changes in v5:
> - also check for the reg offset
> 
> changes in v4:
> - enforce KF_ARG_PTR_TO_TIMER to be of type PTR_TO_MAP_VALUE
> 
> new in v3 (split from v2 02/10)
> ---
>  kernel/bpf/verifier.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index ca6cacf7b42f..ccfe9057d8dc 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -7568,12 +7568,16 @@ static int process_timer_func(struct bpf_verifier_env *env, int regno,
>  			val + reg->off, map->record->timer_off);
>  		return -EINVAL;
>  	}
> +	/* meta is only needed for bpf_timer_init to match timer and map */
> +	if (!meta)
> +		goto out;
>  	if (meta->map_ptr) {
>  		verbose(env, "verifier bug. Two map pointers in a timer helper\n");
>  		return -EFAULT;
>  	}
>  	meta->map_uid = reg->map_uid;
>  	meta->map_ptr = map;
> +out:
>  	return 0;
>  }
>  
> @@ -10826,6 +10830,7 @@ enum {
>  	KF_ARG_LIST_NODE_ID,
>  	KF_ARG_RB_ROOT_ID,
>  	KF_ARG_RB_NODE_ID,
> +	KF_ARG_TIMER_ID,
>  };
>  
>  BTF_ID_LIST(kf_arg_btf_ids)
> @@ -10834,6 +10839,7 @@ BTF_ID(struct, bpf_list_head)
>  BTF_ID(struct, bpf_list_node)
>  BTF_ID(struct, bpf_rb_root)
>  BTF_ID(struct, bpf_rb_node)
> +BTF_ID(struct, bpf_timer_kern)

As Kumar originally put, this should be BTF_ID(struct, bpf_timer), and
he explained everything in the commit message. I was just too dumb to
understand it properly.

(Adding a comment here in case we want to extend bpf_timer API in the
future, and so this patch will be useful).

Cheers,
Benjamin

>  
>  static bool __is_kfunc_ptr_arg_type(const struct btf *btf,
>  				    const struct btf_param *arg, int type)
> @@ -10877,6 +10883,11 @@ static bool is_kfunc_arg_rbtree_node(const struct btf *btf, const struct btf_par
>  	return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_RB_NODE_ID);
>  }
>  
> +static bool is_kfunc_arg_timer(const struct btf *btf, const struct btf_param *arg)
> +{
> +	return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_TIMER_ID);
> +}
> +
>  static bool is_kfunc_arg_callback(struct bpf_verifier_env *env, const struct btf *btf,
>  				  const struct btf_param *arg)
>  {
> @@ -10946,6 +10957,7 @@ enum kfunc_ptr_arg_type {
>  	KF_ARG_PTR_TO_NULL,
>  	KF_ARG_PTR_TO_CONST_STR,
>  	KF_ARG_PTR_TO_MAP,
> +	KF_ARG_PTR_TO_TIMER,
>  };
>  
>  enum special_kfunc_type {
> @@ -11102,6 +11114,9 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
>  	if (is_kfunc_arg_map(meta->btf, &args[argno]))
>  		return KF_ARG_PTR_TO_MAP;
>  
> +	if (is_kfunc_arg_timer(meta->btf, &args[argno]))
> +		return KF_ARG_PTR_TO_TIMER;
> +
>  	if ((base_type(reg->type) == PTR_TO_BTF_ID || reg2btf_ids[base_type(reg->type)])) {
>  		if (!btf_type_is_struct(ref_t)) {
>  			verbose(env, "kernel function %s args#%d pointer type %s %s is not supported\n",
> @@ -11735,6 +11750,7 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
>  		case KF_ARG_PTR_TO_CALLBACK:
>  		case KF_ARG_PTR_TO_REFCOUNTED_KPTR:
>  		case KF_ARG_PTR_TO_CONST_STR:
> +		case KF_ARG_PTR_TO_TIMER:
>  			/* Trusted by default */
>  			break;
>  		default:
> @@ -12021,6 +12037,15 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
>  			if (ret)
>  				return ret;
>  			break;
> +		case KF_ARG_PTR_TO_TIMER:
> +			if (reg->type != PTR_TO_MAP_VALUE) {
> +				verbose(env, "arg#%d doesn't point to a map value\n", i);
> +				return -EINVAL;
> +			}
> +			ret = process_timer_func(env, regno, NULL);
> +			if (ret < 0)
> +				return ret;
> +			break;
>  		}
>  	}
>  
> 
> -- 
> 2.44.0
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH RFC bpf-next v6 3/6] bpf/helpers: introduce bpf_timer_set_sleepable_cb() kfunc
  2024-04-08  8:09 [PATCH RFC bpf-next v6 0/6] sleepable bpf_timer (was: allow HID-BPF to do device IOs) Benjamin Tissoires
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 1/6] bpf/helpers: introduce sleepable bpf_timers Benjamin Tissoires
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 2/6] bpf: Add support for KF_ARG_PTR_TO_TIMER bentiss
@ 2024-04-08  8:09 ` Benjamin Tissoires
  2024-04-08 14:25   ` Eduard Zingerman
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 4/6] bpf/helpers: mark the callback of bpf_timer_set_sleepable_cb() as sleepable Benjamin Tissoires
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Benjamin Tissoires @ 2024-04-08  8:09 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan
  Cc: Benjamin Tissoires, bpf, linux-kernel, linux-kselftest

In this patch, bpf_timer_set_sleepable_cb() is functionally equivalent
to bpf_timer_set_callback(), to the exception that it enforces
the timer to be started with BPF_F_TIMER_SLEEPABLE.

But given that bpf_timer_set_callback() is a helper when
bpf_timer_set_sleepable_cb() is a kfunc, we need to teach the verifier
about its attached callback.
Marking that callback as sleepable will be done in a separate patch

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>

---

changes in v6:
- adapted for flags being set during timer_init

changes in v5:
- enforced sleepable timers to only have BPF_F_TIMER_SLEEPABLE
- use is_bpf_timer_set_sleepable_cb_impl_kfunc() instead of generic
  is_async_cb()

changes in v4:
- added a new (ignored) argument to the kfunc so that we do not
  need to wlak the stack

new in v3 (split from v2 02/10)
---
 kernel/bpf/helpers.c  | 43 ++++++++++++++++++++++++++++++++++--
 kernel/bpf/verifier.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 99 insertions(+), 4 deletions(-)

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index fd05d4358b31..d6528359b3f4 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1291,8 +1291,8 @@ static const struct bpf_func_proto bpf_timer_init_proto = {
 	.arg3_type	= ARG_ANYTHING,
 };
 
-BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callback_fn,
-	   struct bpf_prog_aux *, aux)
+static int __bpf_timer_set_callback(struct bpf_timer_kern *timer, void *callback_fn,
+				    struct bpf_prog_aux *aux, bool is_sleepable)
 {
 	struct bpf_prog *prev, *prog = aux->prog;
 	struct bpf_hrtimer *t;
@@ -1306,6 +1306,10 @@ BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callb
 		ret = -EINVAL;
 		goto out;
 	}
+	if (!!(t->flags & BPF_F_TIMER_SLEEPABLE) != is_sleepable) {
+		ret = -EINVAL;
+		goto out;
+	}
 	if (!atomic64_read(&t->map->usercnt)) {
 		/* maps with timers must be either held by user space
 		 * or pinned in bpffs. Otherwise timer might still be
@@ -1336,6 +1340,12 @@ BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callb
 	return ret;
 }
 
+BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callback_fn,
+	   struct bpf_prog_aux *, aux)
+{
+	return __bpf_timer_set_callback(timer, callback_fn, aux, false);
+}
+
 static const struct bpf_func_proto bpf_timer_set_callback_proto = {
 	.func		= bpf_timer_set_callback,
 	.gpl_only	= true,
@@ -2650,6 +2660,34 @@ __bpf_kfunc void bpf_throw(u64 cookie)
 	WARN(1, "A call to BPF exception callback should never return\n");
 }
 
+/**
+ * bpf_timer_set_sleepable_cb_impl() - Configure the timer to call %callback_fn
+ * static function in a sleepable context.
+ * @timer: The bpf_timer that needs to be configured
+ * @callback_fn: a static bpf function
+ *
+ * @returns %0 on success. %-EINVAL if %timer was not initialized with
+ * bpf_timer_init() earlier. %-EPERM if %timer is in a map that doesn't
+ * have any user references.
+ * The user space should either hold a file descriptor to a map with timers
+ * or pin such map in bpffs. When map is unpinned or file descriptor is
+ * closed all timers in the map will be cancelled and freed.
+ *
+ * This kfunc is equivalent to %bpf_timer_set_callback except that it tells
+ * the verifier that the target callback is run in a sleepable context.
+ */
+__bpf_kfunc int bpf_timer_set_sleepable_cb_impl(struct bpf_timer_kern *timer,
+						int (callback_fn)(void *map, int *key, struct bpf_timer *timer),
+						void *aux__ign)
+{
+	struct bpf_prog_aux *aux = (struct bpf_prog_aux *)aux__ign;
+
+	if (!aux)
+		return -EINVAL;
+
+	return __bpf_timer_set_callback(timer, (void *)callback_fn, aux, true);
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(generic_btf_ids)
@@ -2726,6 +2764,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_is_null)
 BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
 BTF_ID_FLAGS(func, bpf_dynptr_size)
 BTF_ID_FLAGS(func, bpf_dynptr_clone)
+BTF_ID_FLAGS(func, bpf_timer_set_sleepable_cb_impl)
 BTF_KFUNCS_END(common_btf_ids)
 
 static const struct btf_kfunc_id_set common_kfunc_set = {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ccfe9057d8dc..00ac3a3a5f01 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -501,8 +501,12 @@ static bool is_dynptr_ref_function(enum bpf_func_id func_id)
 }
 
 static bool is_sync_callback_calling_kfunc(u32 btf_id);
+static bool is_async_callback_calling_kfunc(u32 btf_id);
+static bool is_callback_calling_kfunc(u32 btf_id);
 static bool is_bpf_throw_kfunc(struct bpf_insn *insn);
 
+static bool is_bpf_timer_set_sleepable_cb_impl_kfunc(u32 btf_id);
+
 static bool is_sync_callback_calling_function(enum bpf_func_id func_id)
 {
 	return func_id == BPF_FUNC_for_each_map_elem ||
@@ -530,7 +534,8 @@ static bool is_sync_callback_calling_insn(struct bpf_insn *insn)
 
 static bool is_async_callback_calling_insn(struct bpf_insn *insn)
 {
-	return bpf_helper_call(insn) && is_async_callback_calling_function(insn->imm);
+	return (bpf_helper_call(insn) && is_async_callback_calling_function(insn->imm)) ||
+	       (bpf_pseudo_kfunc_call(insn) && is_async_callback_calling_kfunc(insn->imm));
 }
 
 static bool is_may_goto_insn(struct bpf_insn *insn)
@@ -9475,7 +9480,7 @@ static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *ins
 	 */
 	env->subprog_info[subprog].is_cb = true;
 	if (bpf_pseudo_kfunc_call(insn) &&
-	    !is_sync_callback_calling_kfunc(insn->imm)) {
+	    !is_callback_calling_kfunc(insn->imm)) {
 		verbose(env, "verifier bug: kfunc %s#%d not marked as callback-calling\n",
 			func_id_name(insn->imm), insn->imm);
 		return -EFAULT;
@@ -10984,6 +10989,7 @@ enum special_kfunc_type {
 	KF_bpf_percpu_obj_drop_impl,
 	KF_bpf_throw,
 	KF_bpf_iter_css_task_new,
+	KF_bpf_timer_set_sleepable_cb_impl,
 };
 
 BTF_SET_START(special_kfunc_set)
@@ -11010,6 +11016,7 @@ BTF_ID(func, bpf_throw)
 #ifdef CONFIG_CGROUPS
 BTF_ID(func, bpf_iter_css_task_new)
 #endif
+BTF_ID(func, bpf_timer_set_sleepable_cb_impl)
 BTF_SET_END(special_kfunc_set)
 
 BTF_ID_LIST(special_kfunc_list)
@@ -11040,6 +11047,7 @@ BTF_ID(func, bpf_iter_css_task_new)
 #else
 BTF_ID_UNUSED
 #endif
+BTF_ID(func, bpf_timer_set_sleepable_cb_impl)
 
 static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta)
 {
@@ -11368,12 +11376,28 @@ static bool is_sync_callback_calling_kfunc(u32 btf_id)
 	return btf_id == special_kfunc_list[KF_bpf_rbtree_add_impl];
 }
 
+static bool is_async_callback_calling_kfunc(u32 btf_id)
+{
+	return btf_id == special_kfunc_list[KF_bpf_timer_set_sleepable_cb_impl];
+}
+
 static bool is_bpf_throw_kfunc(struct bpf_insn *insn)
 {
 	return bpf_pseudo_kfunc_call(insn) && insn->off == 0 &&
 	       insn->imm == special_kfunc_list[KF_bpf_throw];
 }
 
+static bool is_bpf_timer_set_sleepable_cb_impl_kfunc(u32 btf_id)
+{
+	return btf_id == special_kfunc_list[KF_bpf_timer_set_sleepable_cb_impl];
+}
+
+static bool is_callback_calling_kfunc(u32 btf_id)
+{
+	return is_sync_callback_calling_kfunc(btf_id) ||
+	       is_async_callback_calling_kfunc(btf_id);
+}
+
 static bool is_rbtree_lock_required_kfunc(u32 btf_id)
 {
 	return is_bpf_rbtree_api_kfunc(btf_id);
@@ -12157,6 +12181,16 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		}
 	}
 
+	if (is_bpf_timer_set_sleepable_cb_impl_kfunc(meta.func_id)) {
+		err = push_callback_call(env, insn, insn_idx, meta.subprogno,
+					 set_timer_callback_state);
+		if (err) {
+			verbose(env, "kfunc %s#%d failed callback verification\n",
+				func_name, meta.func_id);
+			return err;
+		}
+	}
+
 	rcu_lock = is_kfunc_bpf_rcu_read_lock(&meta);
 	rcu_unlock = is_kfunc_bpf_rcu_read_unlock(&meta);
 
@@ -19559,6 +19593,28 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		   desc->func_id == special_kfunc_list[KF_bpf_rdonly_cast]) {
 		insn_buf[0] = BPF_MOV64_REG(BPF_REG_0, BPF_REG_1);
 		*cnt = 1;
+	} else if (is_bpf_timer_set_sleepable_cb_impl_kfunc(desc->func_id)) {
+		/* The verifier will process callback_fn as many times as necessary
+		 * with different maps and the register states prepared by
+		 * set_timer_callback_state will be accurate.
+		 *
+		 * The following use case is valid:
+		 *   map1 is shared by prog1, prog2, prog3.
+		 *   prog1 calls bpf_timer_init for some map1 elements
+		 *   prog2 calls bpf_timer_set_callback for some map1 elements.
+		 *     Those that were not bpf_timer_init-ed will return -EINVAL.
+		 *   prog3 calls bpf_timer_start for some map1 elements.
+		 *     Those that were not both bpf_timer_init-ed and
+		 *     bpf_timer_set_callback-ed will return -EINVAL.
+		 */
+		struct bpf_insn ld_addrs[2] = {
+			BPF_LD_IMM64(BPF_REG_3, (long)env->prog->aux),
+		};
+
+		insn_buf[0] = ld_addrs[0];
+		insn_buf[1] = ld_addrs[1];
+		insn_buf[2] = *insn;
+		*cnt = 3;
 	}
 	return 0;
 }

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC bpf-next v6 3/6] bpf/helpers: introduce bpf_timer_set_sleepable_cb() kfunc
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 3/6] bpf/helpers: introduce bpf_timer_set_sleepable_cb() kfunc Benjamin Tissoires
@ 2024-04-08 14:25   ` Eduard Zingerman
  2024-04-08 17:16     ` Benjamin Tissoires
  0 siblings, 1 reply; 21+ messages in thread
From: Eduard Zingerman @ 2024-04-08 14:25 UTC (permalink / raw)
  To: Benjamin Tissoires, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan
  Cc: bpf, linux-kernel, linux-kselftest

On Mon, 2024-04-08 at 10:09 +0200, Benjamin Tissoires wrote:
[...]

> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index fd05d4358b31..d6528359b3f4 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c

[...]

> @@ -2726,6 +2764,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_is_null)
>  BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
>  BTF_ID_FLAGS(func, bpf_dynptr_size)
>  BTF_ID_FLAGS(func, bpf_dynptr_clone)
> +BTF_ID_FLAGS(func, bpf_timer_set_sleepable_cb_impl)

Note:
this hunk does not apply cleanly on top of current master.
The line 'BTF_ID_FLAGS(func, bpf_modify_return_test_tp)'
was added to the list since last time current patch-set was merged.

>  BTF_KFUNCS_END(common_btf_ids)
>  
>  static const struct btf_kfunc_id_set common_kfunc_set = {

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC bpf-next v6 3/6] bpf/helpers: introduce bpf_timer_set_sleepable_cb() kfunc
  2024-04-08 14:25   ` Eduard Zingerman
@ 2024-04-08 17:16     ` Benjamin Tissoires
  0 siblings, 0 replies; 21+ messages in thread
From: Benjamin Tissoires @ 2024-04-08 17:16 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Benjamin Tissoires, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan, bpf, linux-kernel, linux-kselftest

On Mon, Apr 8, 2024 at 4:31 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Mon, 2024-04-08 at 10:09 +0200, Benjamin Tissoires wrote:
> [...]
>
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index fd05d4358b31..d6528359b3f4 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
>
> [...]
>
> > @@ -2726,6 +2764,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_is_null)
> >  BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
> >  BTF_ID_FLAGS(func, bpf_dynptr_size)
> >  BTF_ID_FLAGS(func, bpf_dynptr_clone)
> > +BTF_ID_FLAGS(func, bpf_timer_set_sleepable_cb_impl)
>
> Note:
> this hunk does not apply cleanly on top of current master.
> The line 'BTF_ID_FLAGS(func, bpf_modify_return_test_tp)'
> was added to the list since last time current patch-set was merged.


Oops, thanks for the update.

Just to be clear, I already mentioned it in the cover letter, but this
series is not intended to be merged just now (thus RFC again). The
plan is to add a new bpf_wq API on the side, and compare it with this
v6 to see which one is best, because I am trying to force the
workqueue API into a timer, when it's getting further and further away
from each other.

Cheers,
Benjamin


>
>
> >  BTF_KFUNCS_END(common_btf_ids)
> >
> >  static const struct btf_kfunc_id_set common_kfunc_set = {
>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH RFC bpf-next v6 4/6] bpf/helpers: mark the callback of bpf_timer_set_sleepable_cb() as sleepable
  2024-04-08  8:09 [PATCH RFC bpf-next v6 0/6] sleepable bpf_timer (was: allow HID-BPF to do device IOs) Benjamin Tissoires
                   ` (2 preceding siblings ...)
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 3/6] bpf/helpers: introduce bpf_timer_set_sleepable_cb() kfunc Benjamin Tissoires
@ 2024-04-08  8:09 ` Benjamin Tissoires
  2024-04-08 22:35   ` Eduard Zingerman
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 5/6] tools: sync include/uapi/linux/bpf.h Benjamin Tissoires
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Benjamin Tissoires @ 2024-04-08  8:09 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan
  Cc: Benjamin Tissoires, bpf, linux-kernel, linux-kselftest

Now that we have bpf_timer_set_sleepable_cb() available and working, we
can tag the attached callback as sleepable, and let the verifier check
in the correct context the calls and kfuncs.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>

---

no changes in v6

no changes in v5

changes in v4:
- use a function parameter to forward the sleepable information

new in v3 (split from v2 02/10)
---
 include/linux/bpf_verifier.h |  1 +
 kernel/bpf/verifier.c        | 13 ++++++++++---
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 7cb1b75eee38..14e4ee67b694 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -426,6 +426,7 @@ struct bpf_verifier_state {
 	 * while they are still in use.
 	 */
 	bool used_as_loop_entry;
+	bool in_sleepable;
 
 	/* first and last insn idx of this verifier state */
 	u32 first_insn_idx;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 00ac3a3a5f01..b75a8aca6ec9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1434,6 +1434,7 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
 	}
 	dst_state->speculative = src->speculative;
 	dst_state->active_rcu_lock = src->active_rcu_lock;
+	dst_state->in_sleepable = src->in_sleepable;
 	dst_state->curframe = src->curframe;
 	dst_state->active_lock.ptr = src->active_lock.ptr;
 	dst_state->active_lock.id = src->active_lock.id;
@@ -2407,7 +2408,7 @@ static void init_func_state(struct bpf_verifier_env *env,
 /* Similar to push_stack(), but for async callbacks */
 static struct bpf_verifier_state *push_async_cb(struct bpf_verifier_env *env,
 						int insn_idx, int prev_insn_idx,
-						int subprog)
+						int subprog, bool is_sleepable)
 {
 	struct bpf_verifier_stack_elem *elem;
 	struct bpf_func_state *frame;
@@ -2434,6 +2435,7 @@ static struct bpf_verifier_state *push_async_cb(struct bpf_verifier_env *env,
 	 * Initialize it similar to do_check_common().
 	 */
 	elem->st.branches = 1;
+	elem->st.in_sleepable = is_sleepable;
 	frame = kzalloc(sizeof(*frame), GFP_KERNEL);
 	if (!frame)
 		goto err;
@@ -5279,7 +5281,8 @@ static int map_kptr_match_type(struct bpf_verifier_env *env,
 
 static bool in_sleepable(struct bpf_verifier_env *env)
 {
-	return env->prog->sleepable;
+	return env->prog->sleepable ||
+	       (env->cur_state && env->cur_state->in_sleepable);
 }
 
 /* The non-sleepable programs and sleepable programs with explicit bpf_rcu_read_lock()
@@ -9497,7 +9500,8 @@ static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *ins
 		/* there is no real recursion here. timer callbacks are async */
 		env->subprog_info[subprog].is_async_cb = true;
 		async_cb = push_async_cb(env, env->subprog_info[subprog].start,
-					 insn_idx, subprog);
+					 insn_idx, subprog,
+					 is_bpf_timer_set_sleepable_cb_impl_kfunc(insn->imm));
 		if (!async_cb)
 			return -EFAULT;
 		callee = async_cb->frame[0];
@@ -16947,6 +16951,9 @@ static bool states_equal(struct bpf_verifier_env *env,
 	if (old->active_rcu_lock != cur->active_rcu_lock)
 		return false;
 
+	if (old->in_sleepable != cur->in_sleepable)
+		return false;
+
 	/* for states to be equal callsites have to be the same
 	 * and all frame states need to be equivalent
 	 */

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC bpf-next v6 4/6] bpf/helpers: mark the callback of bpf_timer_set_sleepable_cb() as sleepable
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 4/6] bpf/helpers: mark the callback of bpf_timer_set_sleepable_cb() as sleepable Benjamin Tissoires
@ 2024-04-08 22:35   ` Eduard Zingerman
  2024-04-09  3:14     ` Alexei Starovoitov
  0 siblings, 1 reply; 21+ messages in thread
From: Eduard Zingerman @ 2024-04-08 22:35 UTC (permalink / raw)
  To: Benjamin Tissoires, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan
  Cc: bpf, linux-kernel, linux-kselftest

On Mon, 2024-04-08 at 10:09 +0200, Benjamin Tissoires wrote:
> Now that we have bpf_timer_set_sleepable_cb() available and working, we
> can tag the attached callback as sleepable, and let the verifier check
> in the correct context the calls and kfuncs.
> 
> Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
> 
> ---

I think this patch is fine with one nit regarding in_sleepable().
Acked-by: Eduard Zingerman <eddyz87@gmail.com>

> @@ -5279,7 +5281,8 @@ static int map_kptr_match_type(struct bpf_verifier_env *env,
>  
>  static bool in_sleepable(struct bpf_verifier_env *env)
>  {
> -	return env->prog->sleepable;
> +	return env->prog->sleepable ||
> +	       (env->cur_state && env->cur_state->in_sleepable);
>  }

Sorry, I already raised this before.
As far as I understand the 'env->cur_state' check is needed because
this function is used from do_misc_fixups():

		if (is_storage_get_function(insn->imm)) {
			if (!in_sleepable(env) ||
			    env->insn_aux_data[i + delta].storage_get_func_atomic)
				insn_buf[0] = BPF_MOV64_IMM(BPF_REG_5, (__force __s32)GFP_ATOMIC);
			else
				insn_buf[0] = BPF_MOV64_IMM(BPF_REG_5, (__force __s32)GFP_KERNEL);
			insn_buf[1] = *insn;
			cnt = 2;
			...
		}

For a timer callback function 'env->prog->sleepable' would be false.
Which means that inside sleepable callback function GFP_ATOMIC would
be used in cases where GFP_KERNEL would be sufficient.
An alternative would be to check (and set) sleepable flag not for a
full program but for a subprogram.

Whether or not this is something worth addressing I don't know.

[...]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC bpf-next v6 4/6] bpf/helpers: mark the callback of bpf_timer_set_sleepable_cb() as sleepable
  2024-04-08 22:35   ` Eduard Zingerman
@ 2024-04-09  3:14     ` Alexei Starovoitov
  0 siblings, 0 replies; 21+ messages in thread
From: Alexei Starovoitov @ 2024-04-09  3:14 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Benjamin Tissoires, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan, bpf, LKML,
	open list:KERNEL SELFTEST FRAMEWORK

On Mon, Apr 8, 2024 at 3:36 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Mon, 2024-04-08 at 10:09 +0200, Benjamin Tissoires wrote:
> > Now that we have bpf_timer_set_sleepable_cb() available and working, we
> > can tag the attached callback as sleepable, and let the verifier check
> > in the correct context the calls and kfuncs.
> >
> > Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
> >
> > ---
>
> I think this patch is fine with one nit regarding in_sleepable().
> Acked-by: Eduard Zingerman <eddyz87@gmail.com>
>
> > @@ -5279,7 +5281,8 @@ static int map_kptr_match_type(struct bpf_verifier_env *env,
> >
> >  static bool in_sleepable(struct bpf_verifier_env *env)
> >  {
> > -     return env->prog->sleepable;
> > +     return env->prog->sleepable ||
> > +            (env->cur_state && env->cur_state->in_sleepable);
> >  }
>
> Sorry, I already raised this before.
> As far as I understand the 'env->cur_state' check is needed because
> this function is used from do_misc_fixups():
>
>                 if (is_storage_get_function(insn->imm)) {
>                         if (!in_sleepable(env) ||
>                             env->insn_aux_data[i + delta].storage_get_func_atomic)
>                                 insn_buf[0] = BPF_MOV64_IMM(BPF_REG_5, (__force __s32)GFP_ATOMIC);
>                         else
>                                 insn_buf[0] = BPF_MOV64_IMM(BPF_REG_5, (__force __s32)GFP_KERNEL);
>                         insn_buf[1] = *insn;
>                         cnt = 2;
>                         ...
>                 }
>
> For a timer callback function 'env->prog->sleepable' would be false.
> Which means that inside sleepable callback function GFP_ATOMIC would
> be used in cases where GFP_KERNEL would be sufficient.
> An alternative would be to check (and set) sleepable flag not for a
> full program but for a subprogram.

At this point all subprograms are still part of the main program.
jit_subprogs() hasn't been called yet.
So there is only one 'prog' object.
So cannot really set prog->sleepable for callback subprog.

But you've raised a good point.
We can remove "!in_sleepable(env)" part in do_misc_fixups() with:
-                if (in_sleepable(env) && is_storage_get_function(func_id))
-
env->insn_aux_data[insn_idx].storage_get_func_atomic = true;

+ if (is_storage_get_function(func_id))
+   env->insn_aux_data[insn_idx].storage_get_func_atomic = !in_sleepable(env);

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH RFC bpf-next v6 5/6] tools: sync include/uapi/linux/bpf.h
  2024-04-08  8:09 [PATCH RFC bpf-next v6 0/6] sleepable bpf_timer (was: allow HID-BPF to do device IOs) Benjamin Tissoires
                   ` (3 preceding siblings ...)
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 4/6] bpf/helpers: mark the callback of bpf_timer_set_sleepable_cb() as sleepable Benjamin Tissoires
@ 2024-04-08  8:09 ` Benjamin Tissoires
  2024-04-09  3:15   ` Alexei Starovoitov
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 6/6] selftests/bpf: add sleepable timer tests Benjamin Tissoires
  2024-04-09  3:17 ` [PATCH RFC bpf-next v6 0/6] sleepable bpf_timer (was: allow HID-BPF to do device IOs) Alexei Starovoitov
  6 siblings, 1 reply; 21+ messages in thread
From: Benjamin Tissoires @ 2024-04-08  8:09 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan
  Cc: Benjamin Tissoires, bpf, linux-kernel, linux-kselftest

cp include/uapi/linux/bpf.h tools/include/uapi/linux/bpf.h

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>

---

updated in v6

no changes in v5

new in v4
---
 tools/include/uapi/linux/bpf.h | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index bf80b614c4db..f1890eed213a 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1662,9 +1662,10 @@ union bpf_attr {
 	} query;
 
 	struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
-		__u64 name;
-		__u32 prog_fd;
-		__aligned_u64 cookie;
+		__u64		name;
+		__u32		prog_fd;
+		__u32		:32;
+		__aligned_u64	cookie;
 	} raw_tracepoint;
 
 	struct { /* anonymous struct for BPF_BTF_LOAD */
@@ -7468,6 +7469,19 @@ enum {
 	BPF_F_TIMER_CPU_PIN = (1ULL << 1),
 };
 
+/* Extra flags to control bpf_timer_init() behaviour, in addition to CLOCK_*.
+ *     - BPF_F_TIMER_SLEEPABLE: Timer will run in a workqueue in a sleepable
+ *       context, with no guarantees of ordering nor timing (consider this as
+ *       being just offloaded immediately).
+ */
+enum {
+	/* CLOCK_* are using bits 0 to 3 */
+	BPF_F_TIMER_SLEEPABLE = (1ULL << 4),
+	__MAX_BPF_F_TIMER_INIT,
+};
+
+#define MAX_BPF_F_TIMER_INIT __MAX_BPF_F_TIMER_INIT
+
 /* BPF numbers iterator state */
 struct bpf_iter_num {
 	/* opaque iterator state; having __u64 here allows to preserve correct

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC bpf-next v6 5/6] tools: sync include/uapi/linux/bpf.h
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 5/6] tools: sync include/uapi/linux/bpf.h Benjamin Tissoires
@ 2024-04-09  3:15   ` Alexei Starovoitov
  0 siblings, 0 replies; 21+ messages in thread
From: Alexei Starovoitov @ 2024-04-09  3:15 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan, bpf, LKML,
	open list:KERNEL SELFTEST FRAMEWORK

On Mon, Apr 8, 2024 at 1:10 AM Benjamin Tissoires <bentiss@kernel.org> wrote:
>
> cp include/uapi/linux/bpf.h tools/include/uapi/linux/bpf.h
>
> Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
>
> ---
>
> updated in v6
>
> no changes in v5
>
> new in v4
> ---
>  tools/include/uapi/linux/bpf.h | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index bf80b614c4db..f1890eed213a 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -1662,9 +1662,10 @@ union bpf_attr {
>         } query;
>
>         struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
> -               __u64 name;
> -               __u32 prog_fd;
> -               __aligned_u64 cookie;
> +               __u64           name;
> +               __u32           prog_fd;
> +               __u32           :32;
> +               __aligned_u64   cookie;
>         } raw_tracepoint;

something wrong with your tree.
there is no such issue in bpf-next.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH RFC bpf-next v6 6/6] selftests/bpf: add sleepable timer tests
  2024-04-08  8:09 [PATCH RFC bpf-next v6 0/6] sleepable bpf_timer (was: allow HID-BPF to do device IOs) Benjamin Tissoires
                   ` (4 preceding siblings ...)
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 5/6] tools: sync include/uapi/linux/bpf.h Benjamin Tissoires
@ 2024-04-08  8:09 ` Benjamin Tissoires
  2024-04-08 23:00   ` Eduard Zingerman
  2024-04-09  3:17 ` [PATCH RFC bpf-next v6 0/6] sleepable bpf_timer (was: allow HID-BPF to do device IOs) Alexei Starovoitov
  6 siblings, 1 reply; 21+ messages in thread
From: Benjamin Tissoires @ 2024-04-08  8:09 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan
  Cc: Benjamin Tissoires, bpf, linux-kernel, linux-kselftest

bpf_experimental.h and ../bpf_testmod/bpf_testmod_kfunc.h are both
including vmlinux.h, which is not compatible with including time.h
or bpf_tcp_helpers.h.

So keep sleepable tests in a separate bpf source file.

The first correct test is run twice for convenience:
- first through RUN_TESTS
- then we ensure that the timer was actually executed, in a manual
  load/attach/run

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>

---

changes in v6:
- use of new _init API with BPF_F_TIMER_SLEEPABLE

changes in v5b:
- do not use SEC(syscall) as they are run in a sleepable context
  already
- duplicate test_call_sleepable() so we can test if the callback gets
  called

changes in v5:
- keep sleepable test separate

new in v4
---
 tools/testing/selftests/bpf/bpf_experimental.h     |   5 +
 .../selftests/bpf/bpf_testmod/bpf_testmod.c        |   5 +
 .../selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h  |   1 +
 tools/testing/selftests/bpf/prog_tests/timer.c     |  34 ++++
 .../testing/selftests/bpf/progs/timer_sleepable.c  | 213 +++++++++++++++++++++
 5 files changed, 258 insertions(+)

diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
index a5b9df38c162..06ed98b3bc51 100644
--- a/tools/testing/selftests/bpf/bpf_experimental.h
+++ b/tools/testing/selftests/bpf/bpf_experimental.h
@@ -459,4 +459,9 @@ extern int bpf_iter_css_new(struct bpf_iter_css *it,
 extern struct cgroup_subsys_state *bpf_iter_css_next(struct bpf_iter_css *it) __weak __ksym;
 extern void bpf_iter_css_destroy(struct bpf_iter_css *it) __weak __ksym;
 
+extern int bpf_timer_set_sleepable_cb_impl(struct bpf_timer *timer,
+		int (callback_fn)(void *map, int *key, struct bpf_timer *timer),
+		void *aux__ign) __ksym;
+#define bpf_timer_set_sleepable_cb(timer, cb) \
+	bpf_timer_set_sleepable_cb_impl(timer, cb, NULL)
 #endif
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index 39ad96a18123..eb2b78552ca2 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -494,6 +494,10 @@ __bpf_kfunc static u32 bpf_kfunc_call_test_static_unused_arg(u32 arg, u32 unused
 	return arg;
 }
 
+__bpf_kfunc void bpf_kfunc_call_test_sleepable(void)
+{
+}
+
 BTF_KFUNCS_START(bpf_testmod_check_kfunc_ids)
 BTF_ID_FLAGS(func, bpf_testmod_test_mod_kfunc)
 BTF_ID_FLAGS(func, bpf_kfunc_call_test1)
@@ -520,6 +524,7 @@ BTF_ID_FLAGS(func, bpf_kfunc_call_test_ref, KF_TRUSTED_ARGS | KF_RCU)
 BTF_ID_FLAGS(func, bpf_kfunc_call_test_destructive, KF_DESTRUCTIVE)
 BTF_ID_FLAGS(func, bpf_kfunc_call_test_static_unused_arg)
 BTF_ID_FLAGS(func, bpf_kfunc_call_test_offset)
+BTF_ID_FLAGS(func, bpf_kfunc_call_test_sleepable, KF_SLEEPABLE)
 BTF_KFUNCS_END(bpf_testmod_check_kfunc_ids)
 
 static int bpf_testmod_ops_init(struct btf *btf)
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h
index 7c664dd61059..ce5cd763561c 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h
@@ -96,6 +96,7 @@ void bpf_kfunc_call_test_pass2(struct prog_test_pass2 *p) __ksym;
 void bpf_kfunc_call_test_mem_len_fail2(__u64 *mem, int len) __ksym;
 
 void bpf_kfunc_call_test_destructive(void) __ksym;
+void bpf_kfunc_call_test_sleepable(void) __ksym;
 
 void bpf_kfunc_call_test_offset(struct prog_test_ref_kfunc *p);
 struct prog_test_member *bpf_kfunc_call_memb_acquire(void);
diff --git a/tools/testing/selftests/bpf/prog_tests/timer.c b/tools/testing/selftests/bpf/prog_tests/timer.c
index d66687f1ee6a..c6c7c623b31c 100644
--- a/tools/testing/selftests/bpf/prog_tests/timer.c
+++ b/tools/testing/selftests/bpf/prog_tests/timer.c
@@ -3,6 +3,7 @@
 #include <test_progs.h>
 #include "timer.skel.h"
 #include "timer_failure.skel.h"
+#include "timer_sleepable.skel.h"
 
 #define NUM_THR 8
 
@@ -95,3 +96,36 @@ void serial_test_timer(void)
 
 	RUN_TESTS(timer_failure);
 }
+
+/* isolate sleepable tests from main timer tests
+ * because if test_timer fails, it spews the console
+ * with 10000 * 5 "spin_lock_thread:PASS:test_run_opts retval 0 nsec"
+ */
+void serial_test_sleepable_timer(void)
+{
+	struct timer_sleepable *timer_sleepable_skel = NULL;
+	int err, prog_fd;
+
+	LIBBPF_OPTS(bpf_test_run_opts, topts);
+
+	RUN_TESTS(timer_sleepable);
+
+	/* re-run the success test to check if the timer was actually executed */
+
+	timer_sleepable_skel = timer_sleepable__open_and_load();
+	if (!ASSERT_OK_PTR(timer_sleepable_skel, "timer_sleepable_skel_load"))
+		return;
+
+	err = timer_sleepable__attach(timer_sleepable_skel);
+	if (!ASSERT_OK(err, "timer_sleepable_attach"))
+		return;
+
+	prog_fd = bpf_program__fd(timer_sleepable_skel->progs.test_syscall_sleepable);
+	err = bpf_prog_test_run_opts(prog_fd, &topts);
+	ASSERT_OK(err, "test_run");
+	ASSERT_EQ(topts.retval, 0, "test_run");
+
+	usleep(50); /* 10 usecs should be enough, but give it extra */
+
+	ASSERT_EQ(timer_sleepable_skel->bss->ok_sleepable, 1, "ok_sleepable");
+}
diff --git a/tools/testing/selftests/bpf/progs/timer_sleepable.c b/tools/testing/selftests/bpf/progs/timer_sleepable.c
new file mode 100644
index 000000000000..fc7829d8b6c4
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/timer_sleepable.c
@@ -0,0 +1,213 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Facebook
+ * Copyright (c) 2024 Benjamin Tissoires
+ */
+
+#include "bpf_experimental.h"
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+#include "../bpf_testmod/bpf_testmod_kfunc.h"
+
+char _license[] SEC("license") = "GPL";
+
+#define CLOCK_MONOTONIC 1
+
+struct elem {
+	struct bpf_timer t;
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(max_entries, 3);
+	__type(key, int);
+	__type(value, struct elem);
+} timer_map SEC(".maps");
+
+__u32 ok_sleepable;
+
+/* callback for sleepable timer */
+static int timer_cb_sleepable(void *map, int *key, struct bpf_timer *timer)
+{
+	bpf_kfunc_call_test_sleepable();
+	ok_sleepable |= (1 << *key);
+	return 0;
+}
+
+SEC("tc")
+/* check that calling bpf_timer_start() with BPF_F_TIMER_SLEEPABLE on a sleepable
+ * callback works
+ */
+__retval(0)
+long test_call_sleepable(void *ctx)
+{
+	int key = 0;
+	struct bpf_timer *timer;
+
+	if (ok_sleepable & 1)
+		return -1;
+
+	timer = bpf_map_lookup_elem(&timer_map, &key);
+	if (timer) {
+		if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE) != 0)
+			return -2;
+		bpf_timer_set_sleepable_cb(timer, timer_cb_sleepable);
+		if (bpf_timer_start(timer, 0, 0))
+			return -3;
+	} else {
+		return -4;
+	}
+
+	return 0;
+}
+
+SEC("syscall")
+/* check that calling bpf_timer_start() with BPF_F_TIMER_SLEEPABLE on a sleepable
+ * callback works.
+ */
+__retval(0)
+long test_syscall_sleepable(void *ctx)
+{
+	int key = 0;
+	struct bpf_timer *timer;
+
+	if (ok_sleepable & 1)
+		return -1;
+
+	timer = bpf_map_lookup_elem(&timer_map, &key);
+	if (timer) {
+		if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE) != 0)
+			return -2;
+		bpf_timer_set_sleepable_cb(timer, timer_cb_sleepable);
+		if (bpf_timer_start(timer, 0, 0))
+			return -3;
+	} else {
+		return -4;
+	}
+
+	return 0;
+}
+
+SEC("?tc")
+__log_level(2)
+__failure
+/* check that bpf_timer_set_callback() can not be called with a
+ * sleepable callback
+ */
+__msg("mark_precise: frame0: regs=r1 stack= before")
+__msg(": (85) call bpf_kfunc_call_test_sleepable#") /* anchor message */
+__msg("program must be sleepable to call sleepable kfunc bpf_kfunc_call_test_sleepable")
+long test_non_sleepable_sleepable_callback(void *ctx)
+{
+	int key = 0;
+	struct bpf_timer *timer;
+
+	timer = bpf_map_lookup_elem(&timer_map, &key);
+	if (timer) {
+		bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE);
+		bpf_timer_set_callback(timer, timer_cb_sleepable);
+		bpf_timer_start(timer, 0, 0);
+	}
+
+	return 0;
+}
+
+SEC("tc")
+/* check that calling bpf_timer_set_sleepable_cb() without a sleepable timer
+ * initialized with BPF_F_TIMER_SLEEPABLE on a sleepable
+ * callback is returning -EINVAL
+ */
+__retval(3)
+long test_call_sleepable_missing_flag(void *ctx)
+{
+	int key = 1;
+	struct bpf_timer *timer;
+
+	timer = bpf_map_lookup_elem(&timer_map, &key);
+	if (!timer)
+		return 1;
+
+	if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC))
+		return 2;
+
+	if (bpf_timer_set_sleepable_cb(timer, timer_cb_sleepable))
+		return 3;
+
+	return bpf_timer_start(timer, 0, 0);
+}
+
+SEC("tc")
+/* check that calling bpf_timer_start() with a delay on a sleepable
+ * callback is returning -EINVAL
+ */
+__retval(-22)
+long test_call_sleepable_delay(void *ctx)
+{
+	int key = 2;
+	struct bpf_timer *timer;
+
+	timer = bpf_map_lookup_elem(&timer_map, &key);
+	if (!timer)
+		return 1;
+
+	if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE))
+		return 2;
+
+	if (bpf_timer_set_sleepable_cb(timer, timer_cb_sleepable))
+		return 3;
+
+	return bpf_timer_start(timer, 1, 0);
+}
+
+SEC("?tc")
+__log_level(2)
+__failure
+/* check that the first argument of bpf_timer_set_callback()
+ * is a correct bpf_timer pointer.
+ */
+__msg("mark_precise: frame0: regs=r1 stack= before")
+__msg(": (85) call bpf_timer_set_sleepable_cb_impl#") /* anchor message */
+__msg("arg#0 doesn't point to a map value")
+long test_wrong_pointer(void *ctx)
+{
+	int key = 0;
+	struct bpf_timer *timer;
+
+	timer = bpf_map_lookup_elem(&timer_map, &key);
+	if (!timer)
+		return 1;
+
+	if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE))
+		return 2;
+
+	if (bpf_timer_set_sleepable_cb((void *)&timer, timer_cb_sleepable))
+		return 3;
+
+	return -22;
+}
+
+SEC("?tc")
+__log_level(2)
+__failure
+/* check that the first argument of bpf_timer_set_callback()
+ * is a correct bpf_timer pointer.
+ */
+__msg("mark_precise: frame0: regs=r1 stack= before")
+__msg(": (85) call bpf_timer_set_sleepable_cb_impl#") /* anchor message */
+__msg("off 1 doesn't point to 'struct bpf_timer' that is at 0")
+long test_wrong_pointer_offset(void *ctx)
+{
+	int key = 0;
+	struct bpf_timer *timer;
+
+	timer = bpf_map_lookup_elem(&timer_map, &key);
+	if (!timer)
+		return 1;
+
+	if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE))
+		return 2;
+
+	if (bpf_timer_set_sleepable_cb((void *)timer + 1, timer_cb_sleepable))
+		return 3;
+
+	return -22;
+}

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC bpf-next v6 6/6] selftests/bpf: add sleepable timer tests
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 6/6] selftests/bpf: add sleepable timer tests Benjamin Tissoires
@ 2024-04-08 23:00   ` Eduard Zingerman
  2024-04-09  3:16     ` Alexei Starovoitov
  0 siblings, 1 reply; 21+ messages in thread
From: Eduard Zingerman @ 2024-04-08 23:00 UTC (permalink / raw)
  To: Benjamin Tissoires, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan
  Cc: bpf, linux-kernel, linux-kselftest

On Mon, 2024-04-08 at 10:09 +0200, Benjamin Tissoires wrote:
> bpf_experimental.h and ../bpf_testmod/bpf_testmod_kfunc.h are both
> including vmlinux.h, which is not compatible with including time.h
> or bpf_tcp_helpers.h.
> 
> So keep sleepable tests in a separate bpf source file.
> 
> The first correct test is run twice for convenience:
> - first through RUN_TESTS
> - then we ensure that the timer was actually executed, in a manual
>   load/attach/run
> 
> Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
> 
> ---

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
(With a few nitpicks)

> diff --git a/tools/testing/selftests/bpf/prog_tests/timer.c b/tools/testing/selftests/bpf/prog_tests/timer.c
> index d66687f1ee6a..c6c7c623b31c 100644
> --- a/tools/testing/selftests/bpf/prog_tests/timer.c
> +++ b/tools/testing/selftests/bpf/prog_tests/timer.c

[...]

> +void serial_test_sleepable_timer(void)
> +{
> +	struct timer_sleepable *timer_sleepable_skel = NULL;
> +	int err, prog_fd;
> +
> +	LIBBPF_OPTS(bpf_test_run_opts, topts);
> +
> +	RUN_TESTS(timer_sleepable);
> +
> +	/* re-run the success test to check if the timer was actually executed */
> +
> +	timer_sleepable_skel = timer_sleepable__open_and_load();
> +	if (!ASSERT_OK_PTR(timer_sleepable_skel, "timer_sleepable_skel_load"))
> +		return;
> +
> +	err = timer_sleepable__attach(timer_sleepable_skel);
> +	if (!ASSERT_OK(err, "timer_sleepable_attach"))
> +		return;

Nit: this should call timer_sleepable__destroy();

> +
> +	prog_fd = bpf_program__fd(timer_sleepable_skel->progs.test_syscall_sleepable);
> +	err = bpf_prog_test_run_opts(prog_fd, &topts);
> +	ASSERT_OK(err, "test_run");
> +	ASSERT_EQ(topts.retval, 0, "test_run");
> +
> +	usleep(50); /* 10 usecs should be enough, but give it extra */
> +
> +	ASSERT_EQ(timer_sleepable_skel->bss->ok_sleepable, 1, "ok_sleepable");

Nit: same as above.

> +}
> diff --git a/tools/testing/selftests/bpf/progs/timer_sleepable.c b/tools/testing/selftests/bpf/progs/timer_sleepable.c
> new file mode 100644
> index 000000000000..fc7829d8b6c4
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/timer_sleepable.c

[...]

> +SEC("tc")
> +/* check that calling bpf_timer_start() with BPF_F_TIMER_SLEEPABLE on a sleepable
> + * callback works
> + */
> +__retval(0)
> +long test_call_sleepable(void *ctx)
> +{
> +	int key = 0;
> +	struct bpf_timer *timer;
> +
> +	if (ok_sleepable & 1)
> +		return -1;
> +
> +	timer = bpf_map_lookup_elem(&timer_map, &key);
> +	if (timer) {
> +		if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE) != 0)
> +			return -2;
> +		bpf_timer_set_sleepable_cb(timer, timer_cb_sleepable);
> +		if (bpf_timer_start(timer, 0, 0))
> +			return -3;
> +	} else {
> +		return -4;
> +	}
> +
> +	return 0;
> +}
> +
> +SEC("syscall")
> +/* check that calling bpf_timer_start() with BPF_F_TIMER_SLEEPABLE on a sleepable
> + * callback works.
> + */
> +__retval(0)
> +long test_syscall_sleepable(void *ctx)
> +{

Nit: the body of this function is the same as in test_call_sleepable(),
     maybe factor it out as an auxiliary static function?
     (also, should these tests use different 'key' ?)

> +	int key = 0;
> +	struct bpf_timer *timer;
> +
> +	if (ok_sleepable & 1)
> +		return -1;
> +
> +	timer = bpf_map_lookup_elem(&timer_map, &key);
> +	if (timer) {
> +		if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE) != 0)
> +			return -2;
> +		bpf_timer_set_sleepable_cb(timer, timer_cb_sleepable);
> +		if (bpf_timer_start(timer, 0, 0))
> +			return -3;
> +	} else {
> +		return -4;
> +	}
> +
> +	return 0;
> +}

[...]

> +SEC("tc")
> +/* check that calling bpf_timer_start() with a delay on a sleepable
> + * callback is returning -EINVAL
> + */
> +__retval(-22)
> +long test_call_sleepable_delay(void *ctx)
> +{
> +	int key = 2;
> +	struct bpf_timer *timer;
> +
> +	timer = bpf_map_lookup_elem(&timer_map, &key);
> +	if (!timer)
> +		return 1;
> +
> +	if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE))
> +		return 2;
> +
> +	if (bpf_timer_set_sleepable_cb(timer, timer_cb_sleepable))
> +		return 3;
> +
> +	return bpf_timer_start(timer, 1, 0);

Q: should verifier statically check that 3rd parameter is zero for sleepable timers?
  (same question for call to bpf_timer_set_sleepable_cb() with non-sleepable map)

[...]



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC bpf-next v6 6/6] selftests/bpf: add sleepable timer tests
  2024-04-08 23:00   ` Eduard Zingerman
@ 2024-04-09  3:16     ` Alexei Starovoitov
  0 siblings, 0 replies; 21+ messages in thread
From: Alexei Starovoitov @ 2024-04-09  3:16 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Benjamin Tissoires, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan, bpf, LKML,
	open list:KERNEL SELFTEST FRAMEWORK

On Mon, Apr 8, 2024 at 4:01 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> > +SEC("tc")
> > +/* check that calling bpf_timer_start() with a delay on a sleepable
> > + * callback is returning -EINVAL
> > + */
> > +__retval(-22)
> > +long test_call_sleepable_delay(void *ctx)
> > +{
> > +     int key = 2;
> > +     struct bpf_timer *timer;
> > +
> > +     timer = bpf_map_lookup_elem(&timer_map, &key);
> > +     if (!timer)
> > +             return 1;
> > +
> > +     if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE))
> > +             return 2;
> > +
> > +     if (bpf_timer_set_sleepable_cb(timer, timer_cb_sleepable))
> > +             return 3;
> > +
> > +     return bpf_timer_start(timer, 1, 0);
>
> Q: should verifier statically check that 3rd parameter is zero for sleepable timers?
>   (same question for call to bpf_timer_set_sleepable_cb() with non-sleepable map)

It can, but that sounds like more work for the verifier.
Which gives more reasons to use new kfuncs and clean start with bpf_wq.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC bpf-next v6 0/6] sleepable bpf_timer (was: allow HID-BPF to do device IOs)
  2024-04-08  8:09 [PATCH RFC bpf-next v6 0/6] sleepable bpf_timer (was: allow HID-BPF to do device IOs) Benjamin Tissoires
                   ` (5 preceding siblings ...)
  2024-04-08  8:09 ` [PATCH RFC bpf-next v6 6/6] selftests/bpf: add sleepable timer tests Benjamin Tissoires
@ 2024-04-09  3:17 ` Alexei Starovoitov
  6 siblings, 0 replies; 21+ messages in thread
From: Alexei Starovoitov @ 2024-04-09  3:17 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan, bpf, LKML,
	open list:KERNEL SELFTEST FRAMEWORK, Kumar Kartikeya Dwivedi

On Mon, Apr 8, 2024 at 1:09 AM Benjamin Tissoires <bentiss@kernel.org> wrote:
>
> New version of the sleepable bpf_timer code.
>
> I'm posting this as this is the result of the previous review, so we can
> have a baseline to compare to.
>
> The plan is now to introduce a new user API struct bpf_wq, as the timer
> API working on softIRQ seems to be quite far away from a wq.

Looking forward. I think it's almost there.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2024-04-12  8:14 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-08  8:09 [PATCH RFC bpf-next v6 0/6] sleepable bpf_timer (was: allow HID-BPF to do device IOs) Benjamin Tissoires
2024-04-08  8:09 ` [PATCH RFC bpf-next v6 1/6] bpf/helpers: introduce sleepable bpf_timers Benjamin Tissoires
2024-04-08 17:07   ` Eduard Zingerman
2024-04-08 17:20     ` Benjamin Tissoires
2024-04-08 21:17       ` Eduard Zingerman
2024-04-09  3:04       ` Alexei Starovoitov
2024-04-08  8:09 ` [PATCH RFC bpf-next v6 2/6] bpf: Add support for KF_ARG_PTR_TO_TIMER bentiss
2024-04-08 21:55   ` Eduard Zingerman
2024-04-12  8:14   ` Benjamin Tissoires
2024-04-08  8:09 ` [PATCH RFC bpf-next v6 3/6] bpf/helpers: introduce bpf_timer_set_sleepable_cb() kfunc Benjamin Tissoires
2024-04-08 14:25   ` Eduard Zingerman
2024-04-08 17:16     ` Benjamin Tissoires
2024-04-08  8:09 ` [PATCH RFC bpf-next v6 4/6] bpf/helpers: mark the callback of bpf_timer_set_sleepable_cb() as sleepable Benjamin Tissoires
2024-04-08 22:35   ` Eduard Zingerman
2024-04-09  3:14     ` Alexei Starovoitov
2024-04-08  8:09 ` [PATCH RFC bpf-next v6 5/6] tools: sync include/uapi/linux/bpf.h Benjamin Tissoires
2024-04-09  3:15   ` Alexei Starovoitov
2024-04-08  8:09 ` [PATCH RFC bpf-next v6 6/6] selftests/bpf: add sleepable timer tests Benjamin Tissoires
2024-04-08 23:00   ` Eduard Zingerman
2024-04-09  3:16     ` Alexei Starovoitov
2024-04-09  3:17 ` [PATCH RFC bpf-next v6 0/6] sleepable bpf_timer (was: allow HID-BPF to do device IOs) Alexei Starovoitov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox