* [PATCH bpf v2 0/2] bpf: Fix src IP addr related limitation in bpf_*_fib_lookup() @ 2023-10-03 7:10 Martynas Pumputis 2023-10-03 7:10 ` [PATCH bpf v2 1/2] bpf: Derive source IP addr via bpf_*_fib_lookup() Martynas Pumputis 2023-10-03 7:10 ` [PATCH bpf v2 2/2] selftests/bpf: Add BPF_FIB_LOOKUP_SET_SRC tests Martynas Pumputis 0 siblings, 2 replies; 9+ messages in thread From: Martynas Pumputis @ 2023-10-03 7:10 UTC (permalink / raw) To: bpf Cc: Daniel Borkmann, netdev, Martin KaFai Lau, Nikolay Aleksandrov, Martynas Pumputis The patchset fixes the limitation of bpf_*_fib_lookup() helper, which prevents it from being used in BPF dataplanes with network interfaces which have more than one IP addr. See the first patch for more details. Thanks! Martynas Pumputis (2): bpf: Derive source IP addr via bpf_*_fib_lookup() selftests/bpf: Add BPF_FIB_LOOKUP_SET_SRC tests include/uapi/linux/bpf.h | 9 +++ net/core/filter.c | 13 +++- tools/include/uapi/linux/bpf.h | 10 +++ .../selftests/bpf/prog_tests/fib_lookup.c | 76 +++++++++++++++++-- 4 files changed, 101 insertions(+), 7 deletions(-) -- 2.42.0 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH bpf v2 1/2] bpf: Derive source IP addr via bpf_*_fib_lookup() 2023-10-03 7:10 [PATCH bpf v2 0/2] bpf: Fix src IP addr related limitation in bpf_*_fib_lookup() Martynas Pumputis @ 2023-10-03 7:10 ` Martynas Pumputis 2023-10-04 20:09 ` Martin KaFai Lau 2023-10-03 7:10 ` [PATCH bpf v2 2/2] selftests/bpf: Add BPF_FIB_LOOKUP_SET_SRC tests Martynas Pumputis 1 sibling, 1 reply; 9+ messages in thread From: Martynas Pumputis @ 2023-10-03 7:10 UTC (permalink / raw) To: bpf Cc: Daniel Borkmann, netdev, Martin KaFai Lau, Nikolay Aleksandrov, Martynas Pumputis Extend the bpf_fib_lookup() helper by making it to return the source IPv4/IPv6 address if the BPF_FIB_LOOKUP_SET_SRC flag is set. For example, the following snippet can be used to derive the desired source IP address: struct bpf_fib_lookup p = { .ipv4_dst = ip4->daddr }; ret = bpf_skb_fib_lookup(skb, p, sizeof(p), BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH); if (ret != BPF_FIB_LKUP_RET_SUCCESS) return TC_ACT_SHOT; /* the p.ipv4_src now contains the source address */ The inability to derive the proper source address may cause malfunctions in BPF-based dataplanes for hosts containing netdevs with more than one routable IP address or for multi-homed hosts. For example, Cilium implements packet masquerading in BPF. If an egressing netdev to which the Cilium's BPF prog is attached has multiple IP addresses, then only one [hardcoded] IP address can be used for masquerading. This breaks connectivity if any other IP address should have been selected instead, for example, when a public and private addresses are attached to the same egress interface. The change was tested with Cilium [1]. Nikolay Aleksandrov helped to figure out the IPv6 addr selection. [1]: https://github.com/cilium/cilium/pull/28283 Signed-off-by: Martynas Pumputis <m@lambda.lt> --- include/net/ipv6_stubs.h | 5 +++++ include/uapi/linux/bpf.h | 9 +++++++++ net/core/filter.c | 19 ++++++++++++++++++- net/ipv6/af_inet6.c | 1 + tools/include/uapi/linux/bpf.h | 10 ++++++++++ 5 files changed, 43 insertions(+), 1 deletion(-) diff --git a/include/net/ipv6_stubs.h b/include/net/ipv6_stubs.h index c48186bf4737..21da31e1dff5 100644 --- a/include/net/ipv6_stubs.h +++ b/include/net/ipv6_stubs.h @@ -85,6 +85,11 @@ struct ipv6_bpf_stub { sockptr_t optval, unsigned int optlen); int (*ipv6_getsockopt)(struct sock *sk, int level, int optname, sockptr_t optval, sockptr_t optlen); + int (*ipv6_dev_get_saddr)(struct net *net, + const struct net_device *dst_dev, + const struct in6_addr *daddr, + unsigned int prefs, + struct in6_addr *saddr); }; extern const struct ipv6_bpf_stub *ipv6_bpf_stub __read_mostly; diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 0448700890f7..a6bf686eecbc 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3257,6 +3257,10 @@ union bpf_attr { * and *params*->smac will not be set as output. A common * use case is to call **bpf_redirect_neigh**\ () after * doing **bpf_fib_lookup**\ (). + * **BPF_FIB_LOOKUP_SET_SRC** + * Derive and set source IP addr in *params*->ipv{4,6}_src + * for the nexthop. If the src addr cannot be derived, + * **BPF_FIB_LKUP_RET_NO_SRC_ADDR** is returned. * * *ctx* is either **struct xdp_md** for XDP programs or * **struct sk_buff** tc cls_act programs. @@ -6953,6 +6957,7 @@ enum { BPF_FIB_LOOKUP_OUTPUT = (1U << 1), BPF_FIB_LOOKUP_SKIP_NEIGH = (1U << 2), BPF_FIB_LOOKUP_TBID = (1U << 3), + BPF_FIB_LOOKUP_SET_SRC = (1U << 4), }; enum { @@ -6965,6 +6970,7 @@ enum { BPF_FIB_LKUP_RET_UNSUPP_LWT, /* fwd requires encapsulation */ BPF_FIB_LKUP_RET_NO_NEIGH, /* no neighbor entry for nh */ BPF_FIB_LKUP_RET_FRAG_NEEDED, /* fragmentation required to fwd */ + BPF_FIB_LKUP_RET_NO_SRC_ADDR, /* failed to derive IP src addr */ }; struct bpf_fib_lookup { @@ -6999,6 +7005,9 @@ struct bpf_fib_lookup { __u32 rt_metric; }; + /* input: source address to consider for lookup + * output: source address result from lookup + */ union { __be32 ipv4_src; __u32 ipv6_src[4]; /* in6_addr; network order */ diff --git a/net/core/filter.c b/net/core/filter.c index a094694899c9..f3777ef1840b 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -5850,6 +5850,9 @@ static int bpf_ipv4_fib_lookup(struct net *net, struct bpf_fib_lookup *params, params->rt_metric = res.fi->fib_priority; params->ifindex = dev->ifindex; + if (flags & BPF_FIB_LOOKUP_SET_SRC) + params->ipv4_src = fib_result_prefsrc(net, &res); + /* xdp and cls_bpf programs are run in RCU-bh so * rcu_read_lock_bh is not needed here */ @@ -5992,6 +5995,19 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params, params->rt_metric = res.f6i->fib6_metric; params->ifindex = dev->ifindex; + if (flags & BPF_FIB_LOOKUP_SET_SRC) { + if (res.f6i->fib6_prefsrc.plen) { + *(struct in6_addr *)params->ipv6_src = res.f6i->fib6_prefsrc.addr; + } else { + err = ipv6_bpf_stub->ipv6_dev_get_saddr(net, dev, + &fl6.daddr, 0, + (struct in6_addr *) + params->ipv6_src); + if (err) + return BPF_FIB_LKUP_RET_NO_SRC_ADDR; + } + } + if (flags & BPF_FIB_LOOKUP_SKIP_NEIGH) goto set_fwd_params; @@ -6010,7 +6026,8 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params, #endif #define BPF_FIB_LOOKUP_MASK (BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT | \ - BPF_FIB_LOOKUP_SKIP_NEIGH | BPF_FIB_LOOKUP_TBID) + BPF_FIB_LOOKUP_SKIP_NEIGH | BPF_FIB_LOOKUP_TBID | \ + BPF_FIB_LOOKUP_SET_SRC) BPF_CALL_4(bpf_xdp_fib_lookup, struct xdp_buff *, ctx, struct bpf_fib_lookup *, params, int, plen, u32, flags) diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 368824fe9719..5382c6543d46 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -1060,6 +1060,7 @@ static const struct ipv6_bpf_stub ipv6_bpf_stub_impl = { .udp6_lib_lookup = __udp6_lib_lookup, .ipv6_setsockopt = do_ipv6_setsockopt, .ipv6_getsockopt = do_ipv6_getsockopt, + .ipv6_dev_get_saddr = ipv6_dev_get_saddr, }; static int __init inet6_init(void) diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 0448700890f7..72cd0ca97689 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -3257,6 +3257,10 @@ union bpf_attr { * and *params*->smac will not be set as output. A common * use case is to call **bpf_redirect_neigh**\ () after * doing **bpf_fib_lookup**\ (). + * **BPF_FIB_LOOKUP_SET_SRC** + * Derive and set source IP addr in *params*->ipv{4,6}_src + * for the nexthop. If the src addr cannot be derived, + * **BPF_FIB_LKUP_RET_NO_SRC_ADDR** is returned. * * *ctx* is either **struct xdp_md** for XDP programs or * **struct sk_buff** tc cls_act programs. @@ -6953,6 +6957,7 @@ enum { BPF_FIB_LOOKUP_OUTPUT = (1U << 1), BPF_FIB_LOOKUP_SKIP_NEIGH = (1U << 2), BPF_FIB_LOOKUP_TBID = (1U << 3), + BPF_FIB_LOOKUP_SET_SRC = (1U << 4), }; enum { @@ -6965,6 +6970,7 @@ enum { BPF_FIB_LKUP_RET_UNSUPP_LWT, /* fwd requires encapsulation */ BPF_FIB_LKUP_RET_NO_NEIGH, /* no neighbor entry for nh */ BPF_FIB_LKUP_RET_FRAG_NEEDED, /* fragmentation required to fwd */ + BPF_FIB_LKUP_RET_NO_SRC_ADDR, /* failed to derive IP src addr */ }; struct bpf_fib_lookup { @@ -6999,6 +7005,10 @@ struct bpf_fib_lookup { __u32 rt_metric; }; + + /* input: source address to consider for lookup + * output: source address result from lookup + */ union { __be32 ipv4_src; __u32 ipv6_src[4]; /* in6_addr; network order */ -- 2.42.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH bpf v2 1/2] bpf: Derive source IP addr via bpf_*_fib_lookup() 2023-10-03 7:10 ` [PATCH bpf v2 1/2] bpf: Derive source IP addr via bpf_*_fib_lookup() Martynas Pumputis @ 2023-10-04 20:09 ` Martin KaFai Lau 2023-10-05 20:16 ` Martynas 0 siblings, 1 reply; 9+ messages in thread From: Martin KaFai Lau @ 2023-10-04 20:09 UTC (permalink / raw) To: Martynas Pumputis; +Cc: Daniel Borkmann, netdev, Nikolay Aleksandrov, bpf On 10/3/23 12:10 AM, Martynas Pumputis wrote: > Extend the bpf_fib_lookup() helper by making it to return the source > IPv4/IPv6 address if the BPF_FIB_LOOKUP_SET_SRC flag is set. > > For example, the following snippet can be used to derive the desired > source IP address: > > struct bpf_fib_lookup p = { .ipv4_dst = ip4->daddr }; > > ret = bpf_skb_fib_lookup(skb, p, sizeof(p), > BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH); > if (ret != BPF_FIB_LKUP_RET_SUCCESS) > return TC_ACT_SHOT; > > /* the p.ipv4_src now contains the source address */ > > The inability to derive the proper source address may cause malfunctions > in BPF-based dataplanes for hosts containing netdevs with more than one > routable IP address or for multi-homed hosts. > > For example, Cilium implements packet masquerading in BPF. If an > egressing netdev to which the Cilium's BPF prog is attached has > multiple IP addresses, then only one [hardcoded] IP address can be used for > masquerading. This breaks connectivity if any other IP address should have > been selected instead, for example, when a public and private addresses > are attached to the same egress interface. > > The change was tested with Cilium [1]. > > Nikolay Aleksandrov helped to figure out the IPv6 addr selection. > > [1]: https://github.com/cilium/cilium/pull/28283 > > Signed-off-by: Martynas Pumputis <m@lambda.lt> > --- > include/net/ipv6_stubs.h | 5 +++++ > include/uapi/linux/bpf.h | 9 +++++++++ > net/core/filter.c | 19 ++++++++++++++++++- > net/ipv6/af_inet6.c | 1 + > tools/include/uapi/linux/bpf.h | 10 ++++++++++ > 5 files changed, 43 insertions(+), 1 deletion(-) > > diff --git a/include/net/ipv6_stubs.h b/include/net/ipv6_stubs.h > index c48186bf4737..21da31e1dff5 100644 > --- a/include/net/ipv6_stubs.h > +++ b/include/net/ipv6_stubs.h > @@ -85,6 +85,11 @@ struct ipv6_bpf_stub { > sockptr_t optval, unsigned int optlen); > int (*ipv6_getsockopt)(struct sock *sk, int level, int optname, > sockptr_t optval, sockptr_t optlen); > + int (*ipv6_dev_get_saddr)(struct net *net, > + const struct net_device *dst_dev, > + const struct in6_addr *daddr, > + unsigned int prefs, > + struct in6_addr *saddr); > }; > extern const struct ipv6_bpf_stub *ipv6_bpf_stub __read_mostly; > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index 0448700890f7..a6bf686eecbc 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -3257,6 +3257,10 @@ union bpf_attr { > * and *params*->smac will not be set as output. A common > * use case is to call **bpf_redirect_neigh**\ () after > * doing **bpf_fib_lookup**\ (). > + * **BPF_FIB_LOOKUP_SET_SRC** > + * Derive and set source IP addr in *params*->ipv{4,6}_src > + * for the nexthop. If the src addr cannot be derived, > + * **BPF_FIB_LKUP_RET_NO_SRC_ADDR** is returned. > * > * *ctx* is either **struct xdp_md** for XDP programs or > * **struct sk_buff** tc cls_act programs. > @@ -6953,6 +6957,7 @@ enum { > BPF_FIB_LOOKUP_OUTPUT = (1U << 1), > BPF_FIB_LOOKUP_SKIP_NEIGH = (1U << 2), > BPF_FIB_LOOKUP_TBID = (1U << 3), > + BPF_FIB_LOOKUP_SET_SRC = (1U << 4), bikeshedding: Shorten it to BPF_FIB_LOOKUP_SRC? > }; > > enum { > @@ -6965,6 +6970,7 @@ enum { > BPF_FIB_LKUP_RET_UNSUPP_LWT, /* fwd requires encapsulation */ > BPF_FIB_LKUP_RET_NO_NEIGH, /* no neighbor entry for nh */ > BPF_FIB_LKUP_RET_FRAG_NEEDED, /* fragmentation required to fwd */ > + BPF_FIB_LKUP_RET_NO_SRC_ADDR, /* failed to derive IP src addr */ > }; > > struct bpf_fib_lookup { > @@ -6999,6 +7005,9 @@ struct bpf_fib_lookup { > __u32 rt_metric; > }; > > + /* input: source address to consider for lookup > + * output: source address result from lookup > + */ > union { > __be32 ipv4_src; > __u32 ipv6_src[4]; /* in6_addr; network order */ > diff --git a/net/core/filter.c b/net/core/filter.c > index a094694899c9..f3777ef1840b 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -5850,6 +5850,9 @@ static int bpf_ipv4_fib_lookup(struct net *net, struct bpf_fib_lookup *params, > params->rt_metric = res.fi->fib_priority; > params->ifindex = dev->ifindex; > > + if (flags & BPF_FIB_LOOKUP_SET_SRC) > + params->ipv4_src = fib_result_prefsrc(net, &res); Does it need to check 0 and return BPF_FIB_LKUP_RET_NO_SRC_ADDR for v4 also, or the fib_result_prefsrc does not fail? > + > /* xdp and cls_bpf programs are run in RCU-bh so > * rcu_read_lock_bh is not needed here > */ > @@ -5992,6 +5995,19 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params, > params->rt_metric = res.f6i->fib6_metric; > params->ifindex = dev->ifindex; > > + if (flags & BPF_FIB_LOOKUP_SET_SRC) { > + if (res.f6i->fib6_prefsrc.plen) { > + *(struct in6_addr *)params->ipv6_src = res.f6i->fib6_prefsrc.addr; > + } else { > + err = ipv6_bpf_stub->ipv6_dev_get_saddr(net, dev, > + &fl6.daddr, 0, > + (struct in6_addr *) > + params->ipv6_src); > + if (err) > + return BPF_FIB_LKUP_RET_NO_SRC_ADDR; This error also implies BPF_FIB_LKUP_RET_NO_NEIGH. I don't have a clean way of improving the API. May be others have some ideas. Considering dev has no saddr is probably (?) an unlikely case, it should be ok to leave it as is but at least a comment in the uapi will be needed. Otherwise, the bpf prog may use the 0 dmac as-is. I feel the current bpf_ipv[46]_fib_lookup helper is doing many things in one function and then requires different BPF_FIB_LOOKUP_* bits to select what/how to do. In the future, it may be worth to consider breaking it into smaller kfunc(s). e.g. the __ipv[46]_neigh_lookup could be in its own kfunc. > + } > + } > + > if (flags & BPF_FIB_LOOKUP_SKIP_NEIGH) > goto set_fwd_params; > > @@ -6010,7 +6026,8 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params, > #endif > > #define BPF_FIB_LOOKUP_MASK (BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT | \ > - BPF_FIB_LOOKUP_SKIP_NEIGH | BPF_FIB_LOOKUP_TBID) > + BPF_FIB_LOOKUP_SKIP_NEIGH | BPF_FIB_LOOKUP_TBID | \ > + BPF_FIB_LOOKUP_SET_SRC) > > BPF_CALL_4(bpf_xdp_fib_lookup, struct xdp_buff *, ctx, > struct bpf_fib_lookup *, params, int, plen, u32, flags) ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH bpf v2 1/2] bpf: Derive source IP addr via bpf_*_fib_lookup() 2023-10-04 20:09 ` Martin KaFai Lau @ 2023-10-05 20:16 ` Martynas 2023-10-06 6:29 ` Martin KaFai Lau 0 siblings, 1 reply; 9+ messages in thread From: Martynas @ 2023-10-05 20:16 UTC (permalink / raw) To: Martin KaFai Lau; +Cc: Daniel Borkmann, netdev, Nikolay Aleksandrov, bpf Thanks for the review. On Wed, Oct 4, 2023, at 10:09 PM, Martin KaFai Lau wrote: > On 10/3/23 12:10 AM, Martynas Pumputis wrote: >> Extend the bpf_fib_lookup() helper by making it to return the source >> IPv4/IPv6 address if the BPF_FIB_LOOKUP_SET_SRC flag is set. >> >> For example, the following snippet can be used to derive the desired >> source IP address: >> >> struct bpf_fib_lookup p = { .ipv4_dst = ip4->daddr }; >> >> ret = bpf_skb_fib_lookup(skb, p, sizeof(p), >> BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH); >> if (ret != BPF_FIB_LKUP_RET_SUCCESS) >> return TC_ACT_SHOT; >> >> /* the p.ipv4_src now contains the source address */ >> >> The inability to derive the proper source address may cause malfunctions >> in BPF-based dataplanes for hosts containing netdevs with more than one >> routable IP address or for multi-homed hosts. >> >> For example, Cilium implements packet masquerading in BPF. If an >> egressing netdev to which the Cilium's BPF prog is attached has >> multiple IP addresses, then only one [hardcoded] IP address can be used for >> masquerading. This breaks connectivity if any other IP address should have >> been selected instead, for example, when a public and private addresses >> are attached to the same egress interface. >> >> The change was tested with Cilium [1]. >> >> Nikolay Aleksandrov helped to figure out the IPv6 addr selection. >> >> [1]: https://github.com/cilium/cilium/pull/28283 >> >> Signed-off-by: Martynas Pumputis <m@lambda.lt> >> --- >> include/net/ipv6_stubs.h | 5 +++++ >> include/uapi/linux/bpf.h | 9 +++++++++ >> net/core/filter.c | 19 ++++++++++++++++++- >> net/ipv6/af_inet6.c | 1 + >> tools/include/uapi/linux/bpf.h | 10 ++++++++++ >> 5 files changed, 43 insertions(+), 1 deletion(-) >> >> diff --git a/include/net/ipv6_stubs.h b/include/net/ipv6_stubs.h >> index c48186bf4737..21da31e1dff5 100644 >> --- a/include/net/ipv6_stubs.h >> +++ b/include/net/ipv6_stubs.h >> @@ -85,6 +85,11 @@ struct ipv6_bpf_stub { >> sockptr_t optval, unsigned int optlen); >> int (*ipv6_getsockopt)(struct sock *sk, int level, int optname, >> sockptr_t optval, sockptr_t optlen); >> + int (*ipv6_dev_get_saddr)(struct net *net, >> + const struct net_device *dst_dev, >> + const struct in6_addr *daddr, >> + unsigned int prefs, >> + struct in6_addr *saddr); >> }; >> extern const struct ipv6_bpf_stub *ipv6_bpf_stub __read_mostly; >> >> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h >> index 0448700890f7..a6bf686eecbc 100644 >> --- a/include/uapi/linux/bpf.h >> +++ b/include/uapi/linux/bpf.h >> @@ -3257,6 +3257,10 @@ union bpf_attr { >> * and *params*->smac will not be set as output. A common >> * use case is to call **bpf_redirect_neigh**\ () after >> * doing **bpf_fib_lookup**\ (). >> + * **BPF_FIB_LOOKUP_SET_SRC** >> + * Derive and set source IP addr in *params*->ipv{4,6}_src >> + * for the nexthop. If the src addr cannot be derived, >> + * **BPF_FIB_LKUP_RET_NO_SRC_ADDR** is returned. >> * >> * *ctx* is either **struct xdp_md** for XDP programs or >> * **struct sk_buff** tc cls_act programs. >> @@ -6953,6 +6957,7 @@ enum { >> BPF_FIB_LOOKUP_OUTPUT = (1U << 1), >> BPF_FIB_LOOKUP_SKIP_NEIGH = (1U << 2), >> BPF_FIB_LOOKUP_TBID = (1U << 3), >> + BPF_FIB_LOOKUP_SET_SRC = (1U << 4), > > bikeshedding: Shorten it to BPF_FIB_LOOKUP_SRC? SGTM. > >> }; >> >> enum { >> @@ -6965,6 +6970,7 @@ enum { >> BPF_FIB_LKUP_RET_UNSUPP_LWT, /* fwd requires encapsulation */ >> BPF_FIB_LKUP_RET_NO_NEIGH, /* no neighbor entry for nh */ >> BPF_FIB_LKUP_RET_FRAG_NEEDED, /* fragmentation required to fwd */ >> + BPF_FIB_LKUP_RET_NO_SRC_ADDR, /* failed to derive IP src addr */ >> }; >> >> struct bpf_fib_lookup { >> @@ -6999,6 +7005,9 @@ struct bpf_fib_lookup { >> __u32 rt_metric; >> }; >> >> + /* input: source address to consider for lookup >> + * output: source address result from lookup >> + */ >> union { >> __be32 ipv4_src; >> __u32 ipv6_src[4]; /* in6_addr; network order */ >> diff --git a/net/core/filter.c b/net/core/filter.c >> index a094694899c9..f3777ef1840b 100644 >> --- a/net/core/filter.c >> +++ b/net/core/filter.c >> @@ -5850,6 +5850,9 @@ static int bpf_ipv4_fib_lookup(struct net *net, struct bpf_fib_lookup *params, >> params->rt_metric = res.fi->fib_priority; >> params->ifindex = dev->ifindex; >> >> + if (flags & BPF_FIB_LOOKUP_SET_SRC) >> + params->ipv4_src = fib_result_prefsrc(net, &res); > > Does it need to check 0 and return BPF_FIB_LKUP_RET_NO_SRC_ADDR for v4 also, > or the fib_result_prefsrc does not fail? From inspecting the other usages of the function - it does not fail. > >> + >> /* xdp and cls_bpf programs are run in RCU-bh so >> * rcu_read_lock_bh is not needed here >> */ >> @@ -5992,6 +5995,19 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params, >> params->rt_metric = res.f6i->fib6_metric; >> params->ifindex = dev->ifindex; >> >> + if (flags & BPF_FIB_LOOKUP_SET_SRC) { >> + if (res.f6i->fib6_prefsrc.plen) { >> + *(struct in6_addr *)params->ipv6_src = res.f6i->fib6_prefsrc.addr; >> + } else { >> + err = ipv6_bpf_stub->ipv6_dev_get_saddr(net, dev, >> + &fl6.daddr, 0, >> + (struct in6_addr *) >> + params->ipv6_src); >> + if (err) >> + return BPF_FIB_LKUP_RET_NO_SRC_ADDR; > > This error also implies BPF_FIB_LKUP_RET_NO_NEIGH. I don't have a clean way of > improving the API. May be others have some ideas. > > Considering dev has no saddr is probably (?) an unlikely case, it should be ok > to leave it as is but at least a comment in the uapi will be needed. Otherwise, > the bpf prog may use the 0 dmac as-is. I expect that a user of the helper checks that err == 0 before using any of the output params. > > I feel the current bpf_ipv[46]_fib_lookup helper is doing many things > in one > function and then requires different BPF_FIB_LOOKUP_* bits to select > what/how to > do. In the future, it may be worth to consider breaking it into smaller > kfunc(s). e.g. the __ipv[46]_neigh_lookup could be in its own kfunc. > Yep, good idea. At least it seems that the neigh lookup could live in its own function. >> + } >> + } >> + >> if (flags & BPF_FIB_LOOKUP_SKIP_NEIGH) >> goto set_fwd_params; >> >> @@ -6010,7 +6026,8 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params, >> #endif >> >> #define BPF_FIB_LOOKUP_MASK (BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT | \ >> - BPF_FIB_LOOKUP_SKIP_NEIGH | BPF_FIB_LOOKUP_TBID) >> + BPF_FIB_LOOKUP_SKIP_NEIGH | BPF_FIB_LOOKUP_TBID | \ >> + BPF_FIB_LOOKUP_SET_SRC) >> >> BPF_CALL_4(bpf_xdp_fib_lookup, struct xdp_buff *, ctx, >> struct bpf_fib_lookup *, params, int, plen, u32, flags) ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH bpf v2 1/2] bpf: Derive source IP addr via bpf_*_fib_lookup() 2023-10-05 20:16 ` Martynas @ 2023-10-06 6:29 ` Martin KaFai Lau 2023-10-06 7:03 ` Martynas 0 siblings, 1 reply; 9+ messages in thread From: Martin KaFai Lau @ 2023-10-06 6:29 UTC (permalink / raw) To: Martynas; +Cc: Daniel Borkmann, netdev, Nikolay Aleksandrov, bpf On 10/5/23 1:16 PM, Martynas wrote: >>> @@ -5992,6 +5995,19 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params, >>> params->rt_metric = res.f6i->fib6_metric; >>> params->ifindex = dev->ifindex; >>> >>> + if (flags & BPF_FIB_LOOKUP_SET_SRC) { >>> + if (res.f6i->fib6_prefsrc.plen) { >>> + *(struct in6_addr *)params->ipv6_src = res.f6i->fib6_prefsrc.addr; A nit. just noticed. Similar to the "*dst" assignment a few lines above: *src = res.f6i->fib6_prefsrc.addr; >>> + } else { >>> + err = ipv6_bpf_stub->ipv6_dev_get_saddr(net, dev, >>> + &fl6.daddr, 0, >>> + (struct in6_addr *) >>> + params->ipv6_src); Same here. Use the "src". >>> + if (err) >>> + return BPF_FIB_LKUP_RET_NO_SRC_ADDR; >> >> This error also implies BPF_FIB_LKUP_RET_NO_NEIGH. I don't have a clean way of >> improving the API. May be others have some ideas. >> >> Considering dev has no saddr is probably (?) an unlikely case, it should be ok >> to leave it as is but at least a comment in the uapi will be needed. Otherwise, >> the bpf prog may use the 0 dmac as-is. > > I expect that a user of the helper checks that err == 0 before using any of the output params. For example, the bpf prog gets BPF_FIB_LKUP_RET_NO_NEIGH and learns neigh is not available but ipv6_dst (and the optional ipv6_src) is still valid. If the bpf prog gets BPF_FIB_LKUP_RET_NO_SRC_ADDR, intuitively, only ipv6_src is not available. The bpf prog will continue to use the ipv6_dst and dmac (which is actually 0). > >> >> I feel the current bpf_ipv[46]_fib_lookup helper is doing many things >> in one >> function and then requires different BPF_FIB_LOOKUP_* bits to select >> what/how to >> do. In the future, it may be worth to consider breaking it into smaller >> kfunc(s). e.g. the __ipv[46]_neigh_lookup could be in its own kfunc. >> > > Yep, good idea. At least it seems that the neigh lookup could live in its own function. To be clear, it could be independent of this set. Thanks. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH bpf v2 1/2] bpf: Derive source IP addr via bpf_*_fib_lookup() 2023-10-06 6:29 ` Martin KaFai Lau @ 2023-10-06 7:03 ` Martynas 0 siblings, 0 replies; 9+ messages in thread From: Martynas @ 2023-10-06 7:03 UTC (permalink / raw) To: Martin KaFai Lau; +Cc: Daniel Borkmann, netdev, Nikolay Aleksandrov, bpf On Fri, Oct 6, 2023, at 8:29 AM, Martin KaFai Lau wrote: > On 10/5/23 1:16 PM, Martynas wrote: >>>> @@ -5992,6 +5995,19 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params, >>>> params->rt_metric = res.f6i->fib6_metric; >>>> params->ifindex = dev->ifindex; >>>> >>>> + if (flags & BPF_FIB_LOOKUP_SET_SRC) { >>>> + if (res.f6i->fib6_prefsrc.plen) { >>>> + *(struct in6_addr *)params->ipv6_src = res.f6i->fib6_prefsrc.addr; > > A nit. just noticed. Similar to the "*dst" assignment a few lines above: SGTM. > > *src = res.f6i->fib6_prefsrc.addr; > >>>> + } else { >>>> + err = ipv6_bpf_stub->ipv6_dev_get_saddr(net, dev, >>>> + &fl6.daddr, 0, >>>> + (struct in6_addr *) >>>> + params->ipv6_src); > > Same here. Use the "src". > >>>> + if (err) >>>> + return BPF_FIB_LKUP_RET_NO_SRC_ADDR; >>> >>> This error also implies BPF_FIB_LKUP_RET_NO_NEIGH. I don't have a clean way of >>> improving the API. May be others have some ideas. >>> >>> Considering dev has no saddr is probably (?) an unlikely case, it should be ok >>> to leave it as is but at least a comment in the uapi will be needed. Otherwise, >>> the bpf prog may use the 0 dmac as-is. >> >> I expect that a user of the helper checks that err == 0 before using any of the output params. > > For example, the bpf prog gets BPF_FIB_LKUP_RET_NO_NEIGH and learns > neigh is not > available but ipv6_dst (and the optional ipv6_src) is still valid. > > If the bpf prog gets BPF_FIB_LKUP_RET_NO_SRC_ADDR, intuitively, only > ipv6_src is > not available. The bpf prog will continue to use the ipv6_dst and dmac > (which is > actually 0). > Thinking out loud, we could make BPF_FIB_LOOKUP_SRC to require BPF_FIB_LOOKUP_SKIP_NEIGH, but then for some cases a user would be required to call the helper twice. This is a no-go due perf and instruction count reasons. Nothing betters comes than explicitly documenting the behavior in the uapi comments. >> >>> >>> I feel the current bpf_ipv[46]_fib_lookup helper is doing many things >>> in one >>> function and then requires different BPF_FIB_LOOKUP_* bits to select >>> what/how to >>> do. In the future, it may be worth to consider breaking it into smaller >>> kfunc(s). e.g. the __ipv[46]_neigh_lookup could be in its own kfunc. >>> >> >> Yep, good idea. At least it seems that the neigh lookup could live in its own function. > > To be clear, it could be independent of this set. > > Thanks. ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH bpf v2 2/2] selftests/bpf: Add BPF_FIB_LOOKUP_SET_SRC tests 2023-10-03 7:10 [PATCH bpf v2 0/2] bpf: Fix src IP addr related limitation in bpf_*_fib_lookup() Martynas Pumputis 2023-10-03 7:10 ` [PATCH bpf v2 1/2] bpf: Derive source IP addr via bpf_*_fib_lookup() Martynas Pumputis @ 2023-10-03 7:10 ` Martynas Pumputis 2023-10-04 22:02 ` Martin KaFai Lau 1 sibling, 1 reply; 9+ messages in thread From: Martynas Pumputis @ 2023-10-03 7:10 UTC (permalink / raw) To: bpf Cc: Daniel Borkmann, netdev, Martin KaFai Lau, Nikolay Aleksandrov, Martynas Pumputis This patch extends the existing fib_lookup test suite by adding two test cases (for each IP family): * Test source IP selection when default route is used. * Test source IP selection when an IP route has a preferred src IP addr. Signed-off-by: Martynas Pumputis <m@lambda.lt> --- .../selftests/bpf/prog_tests/fib_lookup.c | 76 +++++++++++++++++-- 1 file changed, 70 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/fib_lookup.c b/tools/testing/selftests/bpf/prog_tests/fib_lookup.c index 2fd05649bad1..1b0ab1dbd4f1 100644 --- a/tools/testing/selftests/bpf/prog_tests/fib_lookup.c +++ b/tools/testing/selftests/bpf/prog_tests/fib_lookup.c @@ -11,9 +11,13 @@ #define NS_TEST "fib_lookup_ns" #define IPV6_IFACE_ADDR "face::face" +#define IPV6_IFACE_ADDR_SEC "cafe::cafe" +#define IPV6_ADDR_DST "face::3" #define IPV6_NUD_FAILED_ADDR "face::1" #define IPV6_NUD_STALE_ADDR "face::2" #define IPV4_IFACE_ADDR "10.0.0.254" +#define IPV4_IFACE_ADDR_SEC "10.1.0.254" +#define IPV4_ADDR_DST "10.2.0.254" #define IPV4_NUD_FAILED_ADDR "10.0.0.1" #define IPV4_NUD_STALE_ADDR "10.0.0.2" #define IPV4_TBID_ADDR "172.0.0.254" @@ -31,6 +35,8 @@ struct fib_lookup_test { const char *desc; const char *daddr; int expected_ret; + const char *expected_ipv4_src; + const char *expected_ipv6_src; int lookup_flags; __u32 tbid; __u8 dmac[6]; @@ -69,6 +75,22 @@ static const struct fib_lookup_test tests[] = { .daddr = IPV6_TBID_DST, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, .lookup_flags = BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_TBID, .tbid = 100, .dmac = DMAC_INIT2, }, + { .desc = "IPv4 set src addr", + .daddr = IPV4_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, + .expected_ipv4_src = IPV4_IFACE_ADDR, + .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, }, + { .desc = "IPv6 set src addr", + .daddr = IPV6_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, + .expected_ipv6_src = IPV6_IFACE_ADDR, + .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, }, + { .desc = "IPv4 set prefsrc addr from route", + .daddr = IPV4_ADDR_DST, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, + .expected_ipv4_src = IPV4_IFACE_ADDR_SEC, + .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, }, + { .desc = "IPv6 set prefsrc addr route", + .daddr = IPV6_ADDR_DST, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, + .expected_ipv6_src = IPV6_IFACE_ADDR_SEC, + .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, }, }; static int ifindex; @@ -97,6 +119,13 @@ static int setup_netns(void) SYS(fail, "ip neigh add %s dev veth1 nud failed", IPV4_NUD_FAILED_ADDR); SYS(fail, "ip neigh add %s dev veth1 lladdr %s nud stale", IPV4_NUD_STALE_ADDR, DMAC); + /* Setup for prefsrc IP addr selection */ + SYS(fail, "ip addr add %s/24 dev veth1", IPV4_IFACE_ADDR_SEC); + SYS(fail, "ip route add %s/32 dev veth1 src %s", IPV4_ADDR_DST, IPV4_IFACE_ADDR_SEC); + + SYS(fail, "ip addr add %s/64 dev veth1 nodad", IPV6_IFACE_ADDR_SEC); + SYS(fail, "ip route add %s/128 dev veth1 src %s", IPV6_ADDR_DST, IPV6_IFACE_ADDR_SEC); + /* Setup for tbid lookup tests */ SYS(fail, "ip addr add %s/24 dev veth2", IPV4_TBID_ADDR); SYS(fail, "ip route del %s/24 dev veth2", IPV4_TBID_NET); @@ -133,9 +162,12 @@ static int set_lookup_params(struct bpf_fib_lookup *params, const struct fib_loo if (inet_pton(AF_INET6, test->daddr, params->ipv6_dst) == 1) { params->family = AF_INET6; - ret = inet_pton(AF_INET6, IPV6_IFACE_ADDR, params->ipv6_src); - if (!ASSERT_EQ(ret, 1, "inet_pton(IPV6_IFACE_ADDR)")) - return -1; + if (!(test->lookup_flags & BPF_FIB_LOOKUP_SET_SRC)) { + ret = inet_pton(AF_INET6, IPV6_IFACE_ADDR, params->ipv6_src); + if (!ASSERT_EQ(ret, 1, "inet_pton(IPV6_IFACE_ADDR)")) + return -1; + } + return 0; } @@ -143,9 +175,12 @@ static int set_lookup_params(struct bpf_fib_lookup *params, const struct fib_loo if (!ASSERT_EQ(ret, 1, "convert IP[46] address")) return -1; params->family = AF_INET; - ret = inet_pton(AF_INET, IPV4_IFACE_ADDR, ¶ms->ipv4_src); - if (!ASSERT_EQ(ret, 1, "inet_pton(IPV4_IFACE_ADDR)")) - return -1; + + if (!(test->lookup_flags & BPF_FIB_LOOKUP_SET_SRC)) { + ret = inet_pton(AF_INET, IPV4_IFACE_ADDR, ¶ms->ipv4_src); + if (!ASSERT_EQ(ret, 1, "inet_pton(IPV4_IFACE_ADDR)")) + return -1; + } return 0; } @@ -207,6 +242,35 @@ void test_fib_lookup(void) ASSERT_EQ(skel->bss->fib_lookup_ret, tests[i].expected_ret, "fib_lookup_ret"); + if (tests[i].expected_ipv4_src) { + __be32 expected_ipv4_src; + + ret = inet_pton(AF_INET, tests[i].expected_ipv4_src, + &expected_ipv4_src); + ASSERT_EQ(ret, 1, "inet_pton(expected_ipv4_src)"); + + ASSERT_EQ(fib_params->ipv4_src, expected_ipv4_src, + "fib_lookup ipv4 src"); + } + if (tests[i].expected_ipv6_src) { + __u32 expected_ipv6_src[4]; + + ret = inet_pton(AF_INET6, tests[i].expected_ipv6_src, + expected_ipv6_src); + ASSERT_EQ(ret, 1, "inet_pton(expected_ipv6_src)"); + + ret = memcmp(expected_ipv6_src, fib_params->ipv6_src, + sizeof(fib_params->ipv6_src)); + if (!ASSERT_EQ(ret, 0, "fib_lookup ipv6 src")) { + char src_ip6[64]; + + inet_ntop(AF_INET6, fib_params->ipv6_src, src_ip6, + sizeof(src_ip6)); + printf("ipv6 expected %s actual %s ", + tests[i].expected_ipv6_src, src_ip6); + } + } + ret = memcmp(tests[i].dmac, fib_params->dmac, sizeof(tests[i].dmac)); if (!ASSERT_EQ(ret, 0, "dmac not match")) { char expected[18], actual[18]; -- 2.42.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH bpf v2 2/2] selftests/bpf: Add BPF_FIB_LOOKUP_SET_SRC tests 2023-10-03 7:10 ` [PATCH bpf v2 2/2] selftests/bpf: Add BPF_FIB_LOOKUP_SET_SRC tests Martynas Pumputis @ 2023-10-04 22:02 ` Martin KaFai Lau 2023-10-05 20:19 ` Martynas 0 siblings, 1 reply; 9+ messages in thread From: Martin KaFai Lau @ 2023-10-04 22:02 UTC (permalink / raw) To: Martynas Pumputis; +Cc: Daniel Borkmann, netdev, Nikolay Aleksandrov, bpf On 10/3/23 12:10 AM, Martynas Pumputis wrote: > This patch extends the existing fib_lookup test suite by adding two test > cases (for each IP family): > > * Test source IP selection when default route is used. It will be helpful to reword "default route". I was looking in the patch for a new route addition like "default via xxx". I think the test is reusing the existing prefix route "/24" and "/64". This is to test the address selection from the dev. > * Test source IP selection when an IP route has a preferred src IP addr. > > Signed-off-by: Martynas Pumputis <m@lambda.lt> > --- > .../selftests/bpf/prog_tests/fib_lookup.c | 76 +++++++++++++++++-- > 1 file changed, 70 insertions(+), 6 deletions(-) > > diff --git a/tools/testing/selftests/bpf/prog_tests/fib_lookup.c b/tools/testing/selftests/bpf/prog_tests/fib_lookup.c > index 2fd05649bad1..1b0ab1dbd4f1 100644 > --- a/tools/testing/selftests/bpf/prog_tests/fib_lookup.c > +++ b/tools/testing/selftests/bpf/prog_tests/fib_lookup.c > @@ -11,9 +11,13 @@ > > #define NS_TEST "fib_lookup_ns" > #define IPV6_IFACE_ADDR "face::face" > +#define IPV6_IFACE_ADDR_SEC "cafe::cafe" SEC stands for secondary? > +#define IPV6_ADDR_DST "face::3" > #define IPV6_NUD_FAILED_ADDR "face::1" > #define IPV6_NUD_STALE_ADDR "face::2" > #define IPV4_IFACE_ADDR "10.0.0.254" > +#define IPV4_IFACE_ADDR_SEC "10.1.0.254" > +#define IPV4_ADDR_DST "10.2.0.254" > #define IPV4_NUD_FAILED_ADDR "10.0.0.1" > #define IPV4_NUD_STALE_ADDR "10.0.0.2" > #define IPV4_TBID_ADDR "172.0.0.254" > @@ -31,6 +35,8 @@ struct fib_lookup_test { > const char *desc; > const char *daddr; > int expected_ret; > + const char *expected_ipv4_src; > + const char *expected_ipv6_src; Instead of two members, can it be one "expected_src" member which is v4/v6 agnoastic (similar to the "daddr" above)? The logic needs to be a little smarter for one time but the future test additions will be easier and less error prone. > int lookup_flags; > __u32 tbid; > __u8 dmac[6]; > @@ -69,6 +75,22 @@ static const struct fib_lookup_test tests[] = { > .daddr = IPV6_TBID_DST, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, > .lookup_flags = BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_TBID, .tbid = 100, > .dmac = DMAC_INIT2, }, > + { .desc = "IPv4 set src addr", > + .daddr = IPV4_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, > + .expected_ipv4_src = IPV4_IFACE_ADDR, > + .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, }, > + { .desc = "IPv6 set src addr", > + .daddr = IPV6_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, > + .expected_ipv6_src = IPV6_IFACE_ADDR, > + .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, }, > + { .desc = "IPv4 set prefsrc addr from route", > + .daddr = IPV4_ADDR_DST, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, > + .expected_ipv4_src = IPV4_IFACE_ADDR_SEC, > + .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, }, > + { .desc = "IPv6 set prefsrc addr route", > + .daddr = IPV6_ADDR_DST, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, > + .expected_ipv6_src = IPV6_IFACE_ADDR_SEC, > + .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, }, > }; > > static int ifindex; > @@ -97,6 +119,13 @@ static int setup_netns(void) > SYS(fail, "ip neigh add %s dev veth1 nud failed", IPV4_NUD_FAILED_ADDR); > SYS(fail, "ip neigh add %s dev veth1 lladdr %s nud stale", IPV4_NUD_STALE_ADDR, DMAC); > > + /* Setup for prefsrc IP addr selection */ > + SYS(fail, "ip addr add %s/24 dev veth1", IPV4_IFACE_ADDR_SEC); > + SYS(fail, "ip route add %s/32 dev veth1 src %s", IPV4_ADDR_DST, IPV4_IFACE_ADDR_SEC); > + > + SYS(fail, "ip addr add %s/64 dev veth1 nodad", IPV6_IFACE_ADDR_SEC); > + SYS(fail, "ip route add %s/128 dev veth1 src %s", IPV6_ADDR_DST, IPV6_IFACE_ADDR_SEC); > + > /* Setup for tbid lookup tests */ > SYS(fail, "ip addr add %s/24 dev veth2", IPV4_TBID_ADDR); > SYS(fail, "ip route del %s/24 dev veth2", IPV4_TBID_NET); > @@ -133,9 +162,12 @@ static int set_lookup_params(struct bpf_fib_lookup *params, const struct fib_loo > > if (inet_pton(AF_INET6, test->daddr, params->ipv6_dst) == 1) { > params->family = AF_INET6; > - ret = inet_pton(AF_INET6, IPV6_IFACE_ADDR, params->ipv6_src); > - if (!ASSERT_EQ(ret, 1, "inet_pton(IPV6_IFACE_ADDR)")) > - return -1; > + if (!(test->lookup_flags & BPF_FIB_LOOKUP_SET_SRC)) { > + ret = inet_pton(AF_INET6, IPV6_IFACE_ADDR, params->ipv6_src); > + if (!ASSERT_EQ(ret, 1, "inet_pton(IPV6_IFACE_ADDR)")) > + return -1; > + } > + > return 0; > } > > @@ -143,9 +175,12 @@ static int set_lookup_params(struct bpf_fib_lookup *params, const struct fib_loo > if (!ASSERT_EQ(ret, 1, "convert IP[46] address")) > return -1; > params->family = AF_INET; > - ret = inet_pton(AF_INET, IPV4_IFACE_ADDR, ¶ms->ipv4_src); > - if (!ASSERT_EQ(ret, 1, "inet_pton(IPV4_IFACE_ADDR)")) > - return -1; > + > + if (!(test->lookup_flags & BPF_FIB_LOOKUP_SET_SRC)) { > + ret = inet_pton(AF_INET, IPV4_IFACE_ADDR, ¶ms->ipv4_src); > + if (!ASSERT_EQ(ret, 1, "inet_pton(IPV4_IFACE_ADDR)")) > + return -1; > + } > > return 0; > } > @@ -207,6 +242,35 @@ void test_fib_lookup(void) > ASSERT_EQ(skel->bss->fib_lookup_ret, tests[i].expected_ret, > "fib_lookup_ret"); > > + if (tests[i].expected_ipv4_src) { > + __be32 expected_ipv4_src; > + > + ret = inet_pton(AF_INET, tests[i].expected_ipv4_src, > + &expected_ipv4_src); > + ASSERT_EQ(ret, 1, "inet_pton(expected_ipv4_src)"); > + > + ASSERT_EQ(fib_params->ipv4_src, expected_ipv4_src, > + "fib_lookup ipv4 src"); > + } > + if (tests[i].expected_ipv6_src) { > + __u32 expected_ipv6_src[4]; > + > + ret = inet_pton(AF_INET6, tests[i].expected_ipv6_src, > + expected_ipv6_src); > + ASSERT_EQ(ret, 1, "inet_pton(expected_ipv6_src)"); > + > + ret = memcmp(expected_ipv6_src, fib_params->ipv6_src, > + sizeof(fib_params->ipv6_src)); > + if (!ASSERT_EQ(ret, 0, "fib_lookup ipv6 src")) { > + char src_ip6[64]; > + > + inet_ntop(AF_INET6, fib_params->ipv6_src, src_ip6, > + sizeof(src_ip6)); > + printf("ipv6 expected %s actual %s ", > + tests[i].expected_ipv6_src, src_ip6); > + } > + } nit. Move the v4/v6 expected_src comparison to a static function, potentially done in a v4/v6 agnostic way mentioned in the above expected_ipv[46]_src comment. > + > ret = memcmp(tests[i].dmac, fib_params->dmac, sizeof(tests[i].dmac)); > if (!ASSERT_EQ(ret, 0, "dmac not match")) { > char expected[18], actual[18]; ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH bpf v2 2/2] selftests/bpf: Add BPF_FIB_LOOKUP_SET_SRC tests 2023-10-04 22:02 ` Martin KaFai Lau @ 2023-10-05 20:19 ` Martynas 0 siblings, 0 replies; 9+ messages in thread From: Martynas @ 2023-10-05 20:19 UTC (permalink / raw) To: Martin KaFai Lau; +Cc: Daniel Borkmann, netdev, Nikolay Aleksandrov, bpf On Thu, Oct 5, 2023, at 12:02 AM, Martin KaFai Lau wrote: > On 10/3/23 12:10 AM, Martynas Pumputis wrote: >> This patch extends the existing fib_lookup test suite by adding two test >> cases (for each IP family): >> >> * Test source IP selection when default route is used. > > It will be helpful to reword "default route". I was looking in the patch for a > new route addition like "default via xxx". I think the test is reusing the > existing prefix route "/24" and "/64". This is to test the address selection > from the dev. > Yep, I will change to that. >> * Test source IP selection when an IP route has a preferred src IP addr. >> >> Signed-off-by: Martynas Pumputis <m@lambda.lt> >> --- >> .../selftests/bpf/prog_tests/fib_lookup.c | 76 +++++++++++++++++-- >> 1 file changed, 70 insertions(+), 6 deletions(-) >> >> diff --git a/tools/testing/selftests/bpf/prog_tests/fib_lookup.c b/tools/testing/selftests/bpf/prog_tests/fib_lookup.c >> index 2fd05649bad1..1b0ab1dbd4f1 100644 >> --- a/tools/testing/selftests/bpf/prog_tests/fib_lookup.c >> +++ b/tools/testing/selftests/bpf/prog_tests/fib_lookup.c >> @@ -11,9 +11,13 @@ >> >> #define NS_TEST "fib_lookup_ns" >> #define IPV6_IFACE_ADDR "face::face" >> +#define IPV6_IFACE_ADDR_SEC "cafe::cafe" > > SEC stands for secondary? > Yep, for secondary. IPV6_IFACE_ADDR_SECONDARY felt a bit too long. >> +#define IPV6_ADDR_DST "face::3" >> #define IPV6_NUD_FAILED_ADDR "face::1" >> #define IPV6_NUD_STALE_ADDR "face::2" >> #define IPV4_IFACE_ADDR "10.0.0.254" >> +#define IPV4_IFACE_ADDR_SEC "10.1.0.254" >> +#define IPV4_ADDR_DST "10.2.0.254" >> #define IPV4_NUD_FAILED_ADDR "10.0.0.1" >> #define IPV4_NUD_STALE_ADDR "10.0.0.2" >> #define IPV4_TBID_ADDR "172.0.0.254" >> @@ -31,6 +35,8 @@ struct fib_lookup_test { >> const char *desc; >> const char *daddr; >> int expected_ret; >> + const char *expected_ipv4_src; >> + const char *expected_ipv6_src; > > Instead of two members, can it be one "expected_src" member which is > v4/v6 > agnoastic (similar to the "daddr" above)? The logic needs to be a > little smarter > for one time but the future test additions will be easier and less > error prone. > SGTM. >> int lookup_flags; >> __u32 tbid; >> __u8 dmac[6]; >> @@ -69,6 +75,22 @@ static const struct fib_lookup_test tests[] = { >> .daddr = IPV6_TBID_DST, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, >> .lookup_flags = BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_TBID, .tbid = 100, >> .dmac = DMAC_INIT2, }, >> + { .desc = "IPv4 set src addr", >> + .daddr = IPV4_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, >> + .expected_ipv4_src = IPV4_IFACE_ADDR, >> + .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, }, >> + { .desc = "IPv6 set src addr", >> + .daddr = IPV6_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, >> + .expected_ipv6_src = IPV6_IFACE_ADDR, >> + .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, }, >> + { .desc = "IPv4 set prefsrc addr from route", >> + .daddr = IPV4_ADDR_DST, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, >> + .expected_ipv4_src = IPV4_IFACE_ADDR_SEC, >> + .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, }, >> + { .desc = "IPv6 set prefsrc addr route", >> + .daddr = IPV6_ADDR_DST, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, >> + .expected_ipv6_src = IPV6_IFACE_ADDR_SEC, >> + .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, }, >> }; >> >> static int ifindex; >> @@ -97,6 +119,13 @@ static int setup_netns(void) >> SYS(fail, "ip neigh add %s dev veth1 nud failed", IPV4_NUD_FAILED_ADDR); >> SYS(fail, "ip neigh add %s dev veth1 lladdr %s nud stale", IPV4_NUD_STALE_ADDR, DMAC); >> >> + /* Setup for prefsrc IP addr selection */ >> + SYS(fail, "ip addr add %s/24 dev veth1", IPV4_IFACE_ADDR_SEC); >> + SYS(fail, "ip route add %s/32 dev veth1 src %s", IPV4_ADDR_DST, IPV4_IFACE_ADDR_SEC); >> + >> + SYS(fail, "ip addr add %s/64 dev veth1 nodad", IPV6_IFACE_ADDR_SEC); >> + SYS(fail, "ip route add %s/128 dev veth1 src %s", IPV6_ADDR_DST, IPV6_IFACE_ADDR_SEC); >> + >> /* Setup for tbid lookup tests */ >> SYS(fail, "ip addr add %s/24 dev veth2", IPV4_TBID_ADDR); >> SYS(fail, "ip route del %s/24 dev veth2", IPV4_TBID_NET); >> @@ -133,9 +162,12 @@ static int set_lookup_params(struct bpf_fib_lookup *params, const struct fib_loo >> >> if (inet_pton(AF_INET6, test->daddr, params->ipv6_dst) == 1) { >> params->family = AF_INET6; >> - ret = inet_pton(AF_INET6, IPV6_IFACE_ADDR, params->ipv6_src); >> - if (!ASSERT_EQ(ret, 1, "inet_pton(IPV6_IFACE_ADDR)")) >> - return -1; >> + if (!(test->lookup_flags & BPF_FIB_LOOKUP_SET_SRC)) { >> + ret = inet_pton(AF_INET6, IPV6_IFACE_ADDR, params->ipv6_src); >> + if (!ASSERT_EQ(ret, 1, "inet_pton(IPV6_IFACE_ADDR)")) >> + return -1; >> + } >> + >> return 0; >> } >> >> @@ -143,9 +175,12 @@ static int set_lookup_params(struct bpf_fib_lookup *params, const struct fib_loo >> if (!ASSERT_EQ(ret, 1, "convert IP[46] address")) >> return -1; >> params->family = AF_INET; >> - ret = inet_pton(AF_INET, IPV4_IFACE_ADDR, ¶ms->ipv4_src); >> - if (!ASSERT_EQ(ret, 1, "inet_pton(IPV4_IFACE_ADDR)")) >> - return -1; >> + >> + if (!(test->lookup_flags & BPF_FIB_LOOKUP_SET_SRC)) { >> + ret = inet_pton(AF_INET, IPV4_IFACE_ADDR, ¶ms->ipv4_src); >> + if (!ASSERT_EQ(ret, 1, "inet_pton(IPV4_IFACE_ADDR)")) >> + return -1; >> + } >> >> return 0; >> } >> @@ -207,6 +242,35 @@ void test_fib_lookup(void) >> ASSERT_EQ(skel->bss->fib_lookup_ret, tests[i].expected_ret, >> "fib_lookup_ret"); >> >> + if (tests[i].expected_ipv4_src) { >> + __be32 expected_ipv4_src; >> + >> + ret = inet_pton(AF_INET, tests[i].expected_ipv4_src, >> + &expected_ipv4_src); >> + ASSERT_EQ(ret, 1, "inet_pton(expected_ipv4_src)"); >> + >> + ASSERT_EQ(fib_params->ipv4_src, expected_ipv4_src, >> + "fib_lookup ipv4 src"); >> + } >> + if (tests[i].expected_ipv6_src) { >> + __u32 expected_ipv6_src[4]; >> + >> + ret = inet_pton(AF_INET6, tests[i].expected_ipv6_src, >> + expected_ipv6_src); >> + ASSERT_EQ(ret, 1, "inet_pton(expected_ipv6_src)"); >> + >> + ret = memcmp(expected_ipv6_src, fib_params->ipv6_src, >> + sizeof(fib_params->ipv6_src)); >> + if (!ASSERT_EQ(ret, 0, "fib_lookup ipv6 src")) { >> + char src_ip6[64]; >> + >> + inet_ntop(AF_INET6, fib_params->ipv6_src, src_ip6, >> + sizeof(src_ip6)); >> + printf("ipv6 expected %s actual %s ", >> + tests[i].expected_ipv6_src, src_ip6); >> + } >> + } > > nit. Move the v4/v6 expected_src comparison to a static function, > potentially > done in a v4/v6 agnostic way mentioned in the above > expected_ipv[46]_src comment. > > SGTM. >> + >> ret = memcmp(tests[i].dmac, fib_params->dmac, sizeof(tests[i].dmac)); >> if (!ASSERT_EQ(ret, 0, "dmac not match")) { >> char expected[18], actual[18]; ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2023-10-06 7:03 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-10-03 7:10 [PATCH bpf v2 0/2] bpf: Fix src IP addr related limitation in bpf_*_fib_lookup() Martynas Pumputis 2023-10-03 7:10 ` [PATCH bpf v2 1/2] bpf: Derive source IP addr via bpf_*_fib_lookup() Martynas Pumputis 2023-10-04 20:09 ` Martin KaFai Lau 2023-10-05 20:16 ` Martynas 2023-10-06 6:29 ` Martin KaFai Lau 2023-10-06 7:03 ` Martynas 2023-10-03 7:10 ` [PATCH bpf v2 2/2] selftests/bpf: Add BPF_FIB_LOOKUP_SET_SRC tests Martynas Pumputis 2023-10-04 22:02 ` Martin KaFai Lau 2023-10-05 20:19 ` Martynas
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).