BPF List
 help / color / mirror / Atom feed
From: Daniel Borkmann <daniel@iogearbox.net>
To: Shmulik Ladkani <shmulik@metanetworks.com>, bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Shmulik Ladkani <shmulik.ladkani@gmail.com>
Subject: Re: [PATCH bpf-next 1/2] bpf: Support getting tunnel flags
Date: Wed, 31 Aug 2022 22:46:15 +0200	[thread overview]
Message-ID: <3b4e74bb-5ede-e773-69e6-6c272ffa2459@iogearbox.net> (raw)
In-Reply-To: <20220831144010.174110-1-shmulik.ladkani@gmail.com>

On 8/31/22 4:40 PM, Shmulik Ladkani wrote:
> Existing 'bpf_skb_get_tunnel_key' extracts various tunnel parameters
> (id, ttl, tos, local and remote) but does not expose ip_tunnel_info's
> tun_flags to the bpf program.
> 
> It makes sense to expose tun_flags to the bpf program.
> 
> Assume for example multiple GRE tunnels maintained on a single GRE
> interface in collect_md mode. The program expects origins to initiate
> over GRE, however different origins use different GRE characteristics
> (e.g. some prefer to use GRE checksum, some do not; some pass a GRE key,
> some do not, etc..).
> 
> A bpf program getting tun_flags can therefore remember the relevant
> flags (e.g. TUNNEL_CSUM, TUNNEL_SEQ...) for each initiating remote.
> In the reply path, the program can use 'bpf_skb_set_tunnel_key' in order
> to correctly reply to the remote, using similar characteristics, based on
> the stored tunnel flags.
> 
> Introduce BPF_F_TUNINFO_FLAGS flag for bpf_skb_get_tunnel_key.
> If specified, 'bpf_tunnel_key->tunnel_flags' is set with the tun_flags.
> 
> Decided to use the existing unused 'tunnel_ext' as the storage for the
> 'tunnel_flags' in order to avoid changing bpf_tunnel_key's layout.
> 
> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>

The set looks good to me, one comment below..

> ---
>   include/uapi/linux/bpf.h       | 10 +++++++++-
>   net/core/filter.c              |  8 ++++++--
>   tools/include/uapi/linux/bpf.h | 10 +++++++++-
>   3 files changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 962960a98835..837c0f9b7fdd 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -5659,6 +5659,11 @@ enum {
>   	BPF_F_SEQ_NUMBER		= (1ULL << 3),
>   };
>   
> +/* BPF_FUNC_skb_get_tunnel_key flags. */
> +enum {
> +	BPF_F_TUNINFO_FLAGS		= (1ULL << 4),
> +};
> +
>   /* BPF_FUNC_perf_event_output, BPF_FUNC_perf_event_read and
>    * BPF_FUNC_perf_event_read_value flags.
>    */
> @@ -5848,7 +5853,10 @@ struct bpf_tunnel_key {
>   	};
>   	__u8 tunnel_tos;
>   	__u8 tunnel_ttl;
> -	__u16 tunnel_ext;	/* Padding, future use. */
> +	union {
> +		__u16 tunnel_ext;	/* compat */
> +		__be16 tunnel_flags;
> +	};
>   	__u32 tunnel_label;
>   	union {
>   		__u32 local_ipv4;
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 63e25d8ce501..74e2a4a0d747 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -4488,7 +4488,8 @@ BPF_CALL_4(bpf_skb_get_tunnel_key, struct sk_buff *, skb, struct bpf_tunnel_key
>   	void *to_orig = to;
>   	int err;
>   
> -	if (unlikely(!info || (flags & ~(BPF_F_TUNINFO_IPV6)))) {
> +	if (unlikely(!info || (flags & ~(BPF_F_TUNINFO_IPV6 |
> +					 BPF_F_TUNINFO_FLAGS)))) {
>   		err = -EINVAL;
>   		goto err_clear;
>   	}
> @@ -4520,7 +4521,10 @@ BPF_CALL_4(bpf_skb_get_tunnel_key, struct sk_buff *, skb, struct bpf_tunnel_key
>   	to->tunnel_id = be64_to_cpu(info->key.tun_id);
>   	to->tunnel_tos = info->key.tos;
>   	to->tunnel_ttl = info->key.ttl;
> -	to->tunnel_ext = 0;
> +	if (flags & BPF_F_TUNINFO_FLAGS)
> +		to->tunnel_flags = info->key.tun_flags;
> +	else
> +		to->tunnel_ext = 0;

The bpf_skb_set_tunnel_key() helper has a number of flags we pass in, e.g.
BPF_F_ZERO_CSUM_TX, BPF_F_DONT_FRAGMENT, BPF_F_SEQ_NUMBER, and then based on
those flags we set:

   [...]
   info->key.tun_flags = TUNNEL_KEY | TUNNEL_CSUM | TUNNEL_NOCACHE;
   if (flags & BPF_F_DONT_FRAGMENT)
           info->key.tun_flags |= TUNNEL_DONT_FRAGMENT;
   if (flags & BPF_F_ZERO_CSUM_TX)
           info->key.tun_flags &= ~TUNNEL_CSUM;
   if (flags & BPF_F_SEQ_NUMBER)
           info->key.tun_flags |= TUNNEL_SEQ;
   [...]

Should we similarly only expose those which are interesting/relevant to BPF
program authors as a __u16 tunnel_flags and not the whole set? Which ones
do you have a need for? TUNNEL_SEQ, TUNNEL_CSUM, TUNNEL_KEY, and then the
TUNNEL_OPTIONS_PRESENT?

Thanks,
Daniel

  parent reply	other threads:[~2022-08-31 20:46 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-31 14:40 [PATCH bpf-next 1/2] bpf: Support getting tunnel flags Shmulik Ladkani
2022-08-31 14:40 ` [PATCH bpf-next 2/2] selftests/bpf: Amend test_tunnel to exercise BPF_F_TUNINFO_FLAGS Shmulik Ladkani
2022-08-31 20:46 ` Daniel Borkmann [this message]
2022-09-01  6:10   ` [PATCH bpf-next 1/2] bpf: Support getting tunnel flags Shmulik Ladkani
2022-09-02 13:26     ` Daniel Borkmann
2022-09-02 13:30 ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3b4e74bb-5ede-e773-69e6-6c272ffa2459@iogearbox.net \
    --to=daniel@iogearbox.net \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=shmulik.ladkani@gmail.com \
    --cc=shmulik@metanetworks.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox