From: Martin KaFai Lau <martin.lau@linux.dev>
To: Amery Hung <ameryhung@gmail.com>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
	yangpeihao@sjtu.edu.cn, daniel@iogearbox.net, andrii@kernel.org,
	alexei.starovoitov@gmail.com, martin.lau@kernel.org,
	sinquersw@gmail.com, toke@redhat.com, jhs@mojatatu.com,
	jiri@resnulli.us, xiyou.wangcong@gmail.com,
	yepeilin.cs@gmail.com
Subject: Re: [RFC PATCH v9 06/11] bpf: net_sched: Add bpf qdisc kfuncs
Date: Thu, 25 Jul 2024 15:38:54 -0700
Message-ID: <47a1dae1-7196-4991-b008-b50fb92fd5c3@linux.dev>
In-Reply-To: <20240714175130.4051012-7-amery.hung@bytedance.com>
On 7/14/24 10:51 AM, Amery Hung wrote:
> Add kfuncs for working on skb in qdisc.
>
> Both bpf_qdisc_skb_drop() and bpf_skb_release() can be used to release
> a reference to an skb. However, bpf_qdisc_skb_drop() can only be called
> in .enqueue where a to_free skb list is available from kernel to defer
Is restricting the bpf_qdisc_skb_drop() kfunc to ".enqueue" enforced by the
"struct bpf_sk_buff_ptr" pointer type being available only to the ".enqueue"
ops?
> the release. Otherwise, bpf_skb_release() should be used elsewhere. It
> is also used in bpf_obj_free_fields() when cleaning up skb in maps and
> collections.
>
> bpf_qdisc_schedule() can be used to schedule the execution of the qdisc.
> An example use case is to throttle a qdisc if the time to dequeue the
> next packet is known.
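To make the throttling use case concrete, here is a minimal userspace sketch of
how a rate-limited qdisc could compute the time at which the next packet may be
dequeued. It is not code from this patch: next_dequeue_ns(), its parameters and
the bytes-per-second rate model are assumptions made for illustration. In a
BPF qdisc the result would be the kind of value passed as the expiry time to
the scheduling kfunc (named bpf_qdisc_watchdog_schedule() in the diff below).

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: earliest time (in ns) at which the next packet may
 * leave, given that a packet of pkt_len bytes was sent at now_ns and the
 * qdisc is limited to rate_bytes_per_sec.
 */
static uint64_t next_dequeue_ns(uint64_t now_ns, uint32_t pkt_len,
				uint64_t rate_bytes_per_sec)
{
	/* A zero rate means "no limit" in this sketch: no throttling. */
	if (rate_bytes_per_sec == 0)
		return now_ns;

	/* pkt_len is 32-bit, so pkt_len * NSEC_PER_SEC fits in 64 bits. */
	return now_ns + (uint64_t)pkt_len * NSEC_PER_SEC / rate_bytes_per_sec;
}
```

With a 1500-byte packet and a limit of 1,500,000 bytes per second, the next
dequeue is one millisecond after now_ns.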
>
> bpf_skb_get_hash() returns the flow hash of an skb, which can be used
> to build flow-based queueing algorithms.
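As a rough illustration of what "flow-based" means here, the sketch below maps
a 32-bit flow hash to one of a fixed number of per-flow buckets, which is the
usual first step of an fq-style scheduler. It is plain userspace C, not part of
the patch: NUM_BUCKETS, struct flow_table, flow_bucket() and flow_enqueue() are
hypothetical names, and the hash argument stands in for the value that
bpf_skb_get_hash() would return.

```c
#include <stdint.h>

/* Hypothetical table size; a power of two so the mask below is valid. */
#define NUM_BUCKETS 1024u

/* Per-flow backlog counters, one slot per bucket. */
struct flow_table {
	uint32_t backlog[NUM_BUCKETS];
};

/* Map a flow hash to a bucket index in [0, NUM_BUCKETS). */
static uint32_t flow_bucket(uint32_t hash)
{
	return hash & (NUM_BUCKETS - 1);
}

/* Account one packet to the flow selected by its hash; return the bucket. */
static uint32_t flow_enqueue(struct flow_table *t, uint32_t hash)
{
	uint32_t idx = flow_bucket(hash);

	t->backlog[idx]++;
	return idx;
}
```

Packets with the same hash always land in the same bucket, so per-flow state
(backlog, credits, next transmit time) can be kept per bucket. Distinct flows
may collide in one bucket; a real scheduler has to tolerate that.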
>
> Signed-off-by: Amery Hung <amery.hung@bytedance.com>
> ---
> net/sched/bpf_qdisc.c | 74 ++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 73 insertions(+), 1 deletion(-)
>
> diff --git a/net/sched/bpf_qdisc.c b/net/sched/bpf_qdisc.c
> index a68fc115d8f8..eff7559aa346 100644
> --- a/net/sched/bpf_qdisc.c
> +++ b/net/sched/bpf_qdisc.c
> @@ -148,6 +148,64 @@ static int bpf_qdisc_btf_struct_access(struct bpf_verifier_log *log,
>  	return 0;
> }
>
> +__bpf_kfunc_start_defs();
> +
> +/* bpf_skb_get_hash - Get the flow hash of an skb.
> + * @skb: The skb to get the flow hash from.
> + */
> +__bpf_kfunc u32 bpf_skb_get_hash(struct sk_buff *skb)
> +{
> +	return skb_get_hash(skb);
> +}
> +
> +/* bpf_skb_release - Release an skb reference acquired on an skb immediately.
> + * @skb: The skb on which a reference is being released.
> + */
> +__bpf_kfunc void bpf_skb_release(struct sk_buff *skb)
> +{
> +	consume_skb(skb);
Snippet from the comment of consume_skb():
 * Functions identically to kfree_skb, but kfree_skb assumes that the frame
 * is being dropped after a failure and notes that
consume_skb() also has a different tracepoint from kfree_skb(). It is better
not to confuse the tracing.
I think at least Qdisc_ops.reset and the btf_id_dtor_kfunc are not
consume_skb() cases. It may be useful to add a kfree_skb[_reason?]() kfunc also?
> +}
> +
> +/* bpf_qdisc_skb_drop - Add an skb to be dropped later to a list.
> + * @skb: The skb on which a reference is being released and dropped.
> + * @to_free_list: The list of skbs to be dropped.
> + */
> +__bpf_kfunc void bpf_qdisc_skb_drop(struct sk_buff *skb,
> +				    struct bpf_sk_buff_ptr *to_free_list)
> +{
> +	__qdisc_drop(skb, (struct sk_buff **)to_free_list);
> +}
> +
> +/* bpf_qdisc_watchdog_schedule - Schedule a qdisc to a later time using a timer.
> + * @sch: The qdisc to be scheduled.
> + * @expire: The expiry time of the timer.
> + * @delta_ns: The slack range of the timer.
> + */
> +__bpf_kfunc void bpf_qdisc_watchdog_schedule(struct Qdisc *sch, u64 expire, u64 delta_ns)
> +{
> +	struct bpf_sched_data *q = qdisc_priv(sch);
> +
> +	qdisc_watchdog_schedule_range_ns(&q->watchdog, expire, delta_ns);
> +}
> +
> +__bpf_kfunc_end_defs();
> +
> +BTF_KFUNCS_START(bpf_qdisc_kfunc_ids)
> +BTF_ID_FLAGS(func, bpf_skb_get_hash)
Add KF_TRUSTED_ARGS. For now, avoid cases like getting an skb by walking
skb->next.
> +BTF_ID_FLAGS(func, bpf_skb_release, KF_RELEASE)
> +BTF_ID_FLAGS(func, bpf_qdisc_skb_drop, KF_RELEASE)
> +BTF_ID_FLAGS(func, bpf_qdisc_watchdog_schedule)
Also add KF_TRUSTED_ARGS here.
> +BTF_KFUNCS_END(bpf_qdisc_kfunc_ids)
> +
> +static const struct btf_kfunc_id_set bpf_qdisc_kfunc_set = {
> +	.owner = THIS_MODULE,
> +	.set = &bpf_qdisc_kfunc_ids,
> +};
> +
> +BTF_ID_LIST(skb_kfunc_dtor_ids)
> +BTF_ID(struct, sk_buff)
> +BTF_ID_FLAGS(func, bpf_skb_release, KF_RELEASE)
> +
> static const struct bpf_verifier_ops bpf_qdisc_verifier_ops = {
>  	.get_func_proto = bpf_qdisc_get_func_proto,
>  	.is_valid_access = bpf_qdisc_is_valid_access,
> @@ -347,6 +405,20 @@ static struct bpf_struct_ops bpf_Qdisc_ops = {
>
> static int __init bpf_qdisc_kfunc_init(void)
> {
> - return register_bpf_struct_ops(&bpf_Qdisc_ops, Qdisc_ops);
> +	int ret;
> +	const struct btf_id_dtor_kfunc skb_kfunc_dtors[] = {
> +		{
> +			.btf_id = skb_kfunc_dtor_ids[0],
> +			.kfunc_btf_id = skb_kfunc_dtor_ids[1]
> +		},
> +	};
> +
> +	ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &bpf_qdisc_kfunc_set);
> +	ret = ret ?: register_btf_id_dtor_kfuncs(skb_kfunc_dtors,
> +						 ARRAY_SIZE(skb_kfunc_dtors),
> +						 THIS_MODULE);
> +	ret = ret ?: register_bpf_struct_ops(&bpf_Qdisc_ops, Qdisc_ops);
> +
> +	return ret;
> }
> late_initcall(bpf_qdisc_kfunc_init);
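
To spell out the consume_skb() vs. kfree_skb() point made earlier in this
reply, here is a small userspace model of the accounting difference. It is only
a sketch: enum release_site, struct free_stats, release_is_drop() and
account_release() are hypothetical names, and the two counters merely stand in
for the two distinct kernel tracepoints. The classification follows the remark
above that Qdisc_ops.reset and the dtor kfunc are not "consume" cases; how any
other call site should be classified is left open.

```c
#include <stdbool.h>

/* Hypothetical labels for where a qdisc gives up an skb reference. */
enum release_site {
	SITE_DELIVERED,		/* packet was handled normally */
	SITE_QDISC_RESET,	/* Qdisc_ops.reset flushing queued packets */
	SITE_MAP_DTOR,		/* destructor run when a stored skb is freed */
};

/* Counters standing in for the "consumed" and "dropped" tracepoints. */
struct free_stats {
	unsigned long consumed;
	unsigned long dropped;
};

/* Assumption of this sketch: reset and dtor releases count as drops. */
static bool release_is_drop(enum release_site site)
{
	return site == SITE_QDISC_RESET || site == SITE_MAP_DTOR;
}

/* Record one release under the counter matching its site. */
static void account_release(struct free_stats *st, enum release_site site)
{
	if (release_is_drop(site))
		st->dropped++;
	else
		st->consumed++;
}
```

If every release went through the "consumed" path, packets flushed by a reset
would be indistinguishable from packets that were delivered, which is the
tracing confusion mentioned above.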