From: Yonghong Song <yonghong.song@linux.dev>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Jakub Sitnicki <jakub@cloudflare.com>,
John Fastabend <john.fastabend@gmail.com>,
kernel-team@fb.com, Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [PATCH bpf-next v4 1/5] bpf: Add bpf_link support for sk_msg and sk_skb progs
Date: Fri, 5 Apr 2024 22:21:19 -0700
Message-ID: <bbfb67d2-2674-41ed-a748-eb364fd22146@linux.dev>
In-Reply-To: <CAEf4Bza+V-DerE=Y2cuNNYgyv0jv0b4RH2r3gXt9=uzZzA2JTQ@mail.gmail.com>
On 4/5/24 1:12 PM, Andrii Nakryiko wrote:
> On Wed, Apr 3, 2024 at 7:53 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>> Add bpf_link support for sk_msg and sk_skb programs. We have an
>> internal request to support bpf_link for sk_msg programs so that user
>> space can handle them uniformly through the bpf_link based libbpf
>> APIs. Using the bpf_link based libbpf APIs also makes the system
>> more robust by decoupling the prog life cycle from the attachment
>> life cycle.
>>
>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>> ---
>> include/linux/bpf.h | 6 +
>> include/linux/skmsg.h | 4 +
>> include/uapi/linux/bpf.h | 5 +
>> kernel/bpf/syscall.c | 4 +
>> net/core/sock_map.c | 268 ++++++++++++++++++++++++++++++++-
>> tools/include/uapi/linux/bpf.h | 5 +
>> 6 files changed, 284 insertions(+), 8 deletions(-)
>>
> [...]
>
>> @@ -103,7 +111,7 @@ int sock_map_prog_detach(const union bpf_attr *attr, enum bpf_prog_type ptype)
>> goto put_prog;
>> }
>>
>> - ret = sock_map_prog_update(map, NULL, prog, attr->attach_type);
>> + ret = sock_map_prog_update(map, NULL, prog, NULL, attr->attach_type);
>> put_prog:
>> bpf_prog_put(prog);
>> put_map:
>> @@ -1488,21 +1496,79 @@ static int sock_map_prog_lookup(struct bpf_map *map, struct bpf_prog ***pprog,
>> return 0;
>> }
>>
>> +static int sock_map_link_lookup(struct bpf_map *map, struct bpf_link ***plink,
>> + struct bpf_link *link, bool skip_check, u32 which)
> why not combine prog + link into a single lookup? also it seems like
> sock_map_prog_lookup has some additional EBUSY conditions, do we need
> to replicate them here?
I can combine them together.
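
For illustration only, a combined lookup could look roughly like the user-space model below. This is a sketch, not the code that will go into the next revision: every model_/MODEL_ name is a hypothetical stand-in for the kernel type or constant, and the cross-check between stream_verdict and skb_verdict is an assumption about what the "additional EBUSY conditions" in sock_map_prog_lookup() are. The link check is copied from sock_map_link_lookup() in the quoted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel types. */
struct model_prog { int id; };
struct model_link { struct model_prog *prog; };

struct model_psock_progs {
	struct model_prog *msg_parser, *stream_parser;
	struct model_prog *stream_verdict, *skb_verdict;
	struct model_link *msg_parser_link, *stream_parser_link;
	struct model_link *stream_verdict_link, *skb_verdict_link;
};

enum model_attach_type {
	MODEL_SK_MSG_VERDICT,
	MODEL_SK_SKB_STREAM_PARSER,
	MODEL_SK_SKB_STREAM_VERDICT,
	MODEL_SK_SKB_VERDICT,
};

/* One switch yields both the prog slot and the link slot. */
static int model_prog_link_lookup(struct model_psock_progs *progs,
				  struct model_prog ***pprog,
				  struct model_link ***plink,
				  struct model_link *link,
				  int skip_link_check, unsigned int which)
{
	struct model_prog **cur_pprog;
	struct model_link **cur_plink;

	assert(progs && pprog && plink);

	switch (which) {
	case MODEL_SK_MSG_VERDICT:
		cur_pprog = &progs->msg_parser;
		cur_plink = &progs->msg_parser_link;
		break;
	case MODEL_SK_SKB_STREAM_PARSER:
		cur_pprog = &progs->stream_parser;
		cur_plink = &progs->stream_parser_link;
		break;
	case MODEL_SK_SKB_STREAM_VERDICT:
		/* Assumed: the two verdict flavours exclude each other. */
		if (progs->skb_verdict)
			return -EBUSY;
		cur_pprog = &progs->stream_verdict;
		cur_plink = &progs->stream_verdict_link;
		break;
	case MODEL_SK_SKB_VERDICT:
		if (progs->stream_verdict)
			return -EBUSY;
		cur_pprog = &progs->skb_verdict;
		cur_plink = &progs->skb_verdict_link;
		break;
	default:
		return -EOPNOTSUPP;
	}

	/* Same link ownership check as in the quoted patch. */
	if (!skip_link_check &&
	    ((!link && *cur_plink) || (link && link != *cur_plink)))
		return -EBUSY;

	*pprog = cur_pprog;
	*plink = cur_plink;
	return 0;
}
```

With a single function the two slots cannot get out of sync, and the existing conflict checks apply to the link path without being duplicated.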
>
>> +{
>> + struct sk_psock_progs *progs = sock_map_progs(map);
>> + struct bpf_link **cur_plink;
>> +
>> + switch (which) {
>> + case BPF_SK_MSG_VERDICT:
>> + cur_plink = &progs->msg_parser_link;
>> + break;
>> +#if IS_ENABLED(CONFIG_BPF_STREAM_PARSER)
>> + case BPF_SK_SKB_STREAM_PARSER:
>> + cur_plink = &progs->stream_parser_link;
>> + break;
>> +#endif
>> + case BPF_SK_SKB_STREAM_VERDICT:
>> + cur_plink = &progs->stream_verdict_link;
>> + break;
>> + case BPF_SK_SKB_VERDICT:
>> + cur_plink = &progs->skb_verdict_link;
>> + break;
>> + default:
>> + return -EOPNOTSUPP;
>> + }
>> +
>> + if (!skip_check && ((!link && *cur_plink) || (link && link != *cur_plink)))
>> + return -EBUSY;
>> +
>> + *plink = cur_plink;
>> + return 0;
>> +}
>> +
>> +/* Handle the following four cases:
>> + * prog_attach: prog != NULL, old == NULL, link == NULL
>> + * prog_detach: prog == NULL, old != NULL, link == NULL
>> + * link_attach: prog != NULL, old == NULL, link != NULL
>> + * link_detach: prog == NULL, old != NULL, link != NULL
>> + */
>> static int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog,
>> - struct bpf_prog *old, u32 which)
>> + struct bpf_prog *old, struct bpf_link *link,
>> + u32 which)
>> {
>> struct bpf_prog **pprog;
>> + struct bpf_link **plink;
>> int ret;
>>
>> + mutex_lock(&sockmap_mutex);
>> +
>> ret = sock_map_prog_lookup(map, &pprog, which);
>> if (ret)
>> - return ret;
>> + goto out;
>>
>> - if (old)
>> - return psock_replace_prog(pprog, prog, old);
>> + if (!link || prog)
>> + ret = sock_map_link_lookup(map, &plink, NULL, false, which);
>> + else
>> + ret = sock_map_link_lookup(map, &plink, NULL, true, which);
>> + if (ret)
>> + goto out;
>> +
>> + if (old) {
>> + ret = psock_replace_prog(pprog, prog, old);
>> + if (!ret)
>> + *plink = NULL;
>> + goto out;
>> + }
>>
>> psock_set_prog(pprog, prog);
>> - return 0;
>> + if (link)
>> + *plink = link;
> nit: feels more natural to do
>
> if (old) {
>         psock_replace_prog(...)
> } else {
>         psock_set_prog(...)
> }
>
> it's two alternatives, not one unlikely vs one main use case (but it's minor)
Ack, that is indeed better.
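
A rough user-space model of the suggested if/else shape, covering the four cases from the comment above sock_map_prog_update(). This is only a sketch: the model_ types and helpers are hypothetical, the compare-and-swap in model_replace_prog() is a simplification of what psock_replace_prog() does, and locking and reference counting are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct bpf_prog and struct bpf_link. */
struct model_prog { int id; };
struct model_link { struct model_prog *prog; };

/* Simplified psock_replace_prog(): succeed only if 'old' is installed. */
static int model_replace_prog(struct model_prog **pprog,
			      struct model_prog *prog,
			      struct model_prog *old)
{
	if (*pprog != old)
		return -ENOENT;
	*pprog = prog;
	return 0;
}

/* Simplified psock_set_prog(): unconditionally install 'prog'. */
static void model_set_prog(struct model_prog **pprog, struct model_prog *prog)
{
	*pprog = prog;
}

/*
 * prog_attach: prog != NULL, old == NULL, link == NULL
 * prog_detach: prog == NULL, old != NULL, link == NULL
 * link_attach: prog != NULL, old == NULL, link != NULL
 * link_detach: prog == NULL, old != NULL, link != NULL
 */
static int model_prog_update(struct model_prog **pprog,
			     struct model_link **plink,
			     struct model_prog *prog,
			     struct model_prog *old,
			     struct model_link *link)
{
	int ret = 0;

	assert(pprog && plink);

	if (old) {
		/* Detach: clear the link slot only if the prog was replaced. */
		ret = model_replace_prog(pprog, prog, old);
		if (!ret)
			*plink = NULL;
	} else {
		/* Attach: record the link only for link-based attachment. */
		model_set_prog(pprog, prog);
		if (link)
			*plink = link;
	}

	return ret;
}
```

Both branches now sit at the same level, so neither reads as the unlikely path, and the early goto is no longer needed.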
>
>> +
>> +out:
>> + mutex_unlock(&sockmap_mutex);
>> + return ret;
>> }
>>
>> int sock_map_bpf_prog_query(const union bpf_attr *attr,
>> @@ -1657,6 +1723,192 @@ void sock_map_close(struct sock *sk, long timeout)
>> }
>> EXPORT_SYMBOL_GPL(sock_map_close);
>>
>> +struct sockmap_link {
>> + struct bpf_link link;
>> + struct bpf_map *map;
>> + enum bpf_attach_type attach_type;
>> +};
>> +
>> +static void sock_map_link_release(struct bpf_link *link)
>> +{
>> + struct sockmap_link *sockmap_link = container_of(link, struct sockmap_link, link);
>> +
>> + if (sockmap_link->map) {
> nit: if (!sockmap_link->map) return;
>
> and reduce nesting of everything else
>
>> + WARN_ON_ONCE(sock_map_prog_update(sockmap_link->map, NULL, link->prog, link,
>> + sockmap_link->attach_type));
> I think sockmap_link->map access in general has to be always protected
> by sockmap_mutex (I'd do that even for the if above), because it can
> race with force-detach logic at least
Ack. Will fix this.
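
One possible shape for the release path, modelled in user space. It takes the lock before looking at ->map, returns early when the map is already gone, and does the detach and the map put in one critical section. This is a sketch under assumptions: the model_ names are hypothetical, the toy mutex only tracks whether it is held, and it assumes the detach helper is called with the lock already held (the quoted patch takes sockmap_mutex inside sock_map_prog_update(), so that would have to change).

```c
#include <assert.h>
#include <stddef.h>

/* Toy mutex that only records whether it is held (not thread safe). */
struct model_mutex { int held; };
static struct model_mutex model_sockmap_mutex;

static void model_lock(struct model_mutex *m)
{
	assert(!m->held);
	m->held = 1;
}

static void model_unlock(struct model_mutex *m)
{
	assert(m->held);
	m->held = 0;
}

/* Hypothetical stand-ins for the map and for struct sockmap_link. */
struct model_map { int refs; void *attached; };
struct model_sockmap_link { struct model_map *map; void *prog; };

/* Detach helper; the caller must already hold model_sockmap_mutex. */
static int model_detach_locked(struct model_map *map, void *prog)
{
	assert(model_sockmap_mutex.held);
	if (map->attached != prog)
		return -1;
	map->attached = NULL;
	return 0;
}

static void model_link_release(struct model_sockmap_link *l)
{
	model_lock(&model_sockmap_mutex);

	/* ->map is only read under the lock, so a racing detach is seen. */
	if (!l->map)
		goto out;

	/* The kernel code would wrap this call in WARN_ON_ONCE(). */
	(void)model_detach_locked(l->map, l->prog);

	l->map->refs--;		/* models bpf_map_put_with_uref() */
	l->map = NULL;
out:
	model_unlock(&model_sockmap_mutex);
}
```

The early exit also removes one level of nesting, which covers the earlier nit about `if (!sockmap_link->map) return;`.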
>
>> +
>> + mutex_lock(&sockmap_mutex);
>> + bpf_map_put_with_uref(sockmap_link->map);
>> + sockmap_link->map = NULL;
>> + mutex_unlock(&sockmap_mutex);
>> + }
>> +}
>> +
> [...]
>
>> + if (old) {
>> + ret = psock_replace_prog(pprog, prog, old);
>> + goto out;
>> + }
>> +
>> + psock_set_prog(pprog, prog);
>> +
>> +out:
> same nit, feels like
>
> if (old) /* replace */ else /* set */ is more natural, and then you
> can move xchg logic before out: knowing that it's the only success
> case
Ack. Will do.
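
A minimal user-space sketch of that layout for the update_prog path: the if/else picks replace or set, any failure jumps to out, and the reference swap on link->prog sits directly before the label because only the success path reaches it. The model_ names are hypothetical, the plain assignment stands in for xchg(), and the refcnt field tracks only the reference held through the link.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; refcnt models bpf_prog_inc()/bpf_prog_put(). */
struct model_prog { int refcnt; };
struct model_link { struct model_prog *prog; };

/* Simplified psock_replace_prog(): succeed only if 'old' is installed. */
static int model_replace_prog(struct model_prog **pprog,
			      struct model_prog *prog,
			      struct model_prog *old)
{
	if (*pprog != old)
		return -ENOENT;
	*pprog = prog;
	return 0;
}

/* Simplified psock_set_prog(): unconditionally install 'prog'. */
static void model_set_prog(struct model_prog **pprog, struct model_prog *prog)
{
	*pprog = prog;
}

static int model_link_update_prog(struct model_prog **pprog,
				  struct model_link *link,
				  struct model_prog *prog,
				  struct model_prog *old)
{
	struct model_prog *prev;
	int ret = 0;

	assert(prog && link && link->prog);
	/* mutex_lock(&sockmap_mutex) would be taken here. */

	if (old) {
		ret = model_replace_prog(pprog, prog, old);
		if (ret)
			goto out;
	} else {
		model_set_prog(pprog, prog);
	}

	/* Success only: move the link's reference to the new prog. */
	prog->refcnt++;
	prev = link->prog;	/* the kernel code uses xchg() here */
	link->prog = prog;
	prev->refcnt--;
out:
	/* mutex_unlock(&sockmap_mutex) would go here. */
	return ret;
}
```

Compared with the quoted version, the `if (!ret)` test after the label disappears, since nothing that failed can fall through to the swap.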
>
>> + if (!ret) {
>> + bpf_prog_inc(prog);
>> + old = xchg(&link->prog, prog);
>> + bpf_prog_put(old);
>> + }
>> + mutex_unlock(&sockmap_mutex);
>> + return ret;
>> +}
>> +
> [...]