Re: [PATCH net-next v2] seg6: enable route leak for encap routes

public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
To: Andrea Mayer <andrea.mayer@uniroma2.it>
Cc: "David S . Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Eric Dumazet <edumazet@google.com>,
	David Lebrun <david.lebrun@uclouvain.be>,
	David Ahern <dsahern@kernel.org>,
	netdev@vger.kernel.org, stefano.salsano@uniroma2.it,
	Paolo Lungaroni <paolo.lungaroni@uniroma2.it>,
	ahabdels@cisco.com
Subject: Re: [PATCH net-next v2] seg6: enable route leak for encap routes
Date: Tue, 31 Mar 2026 17:57:17 +0200	[thread overview]
Message-ID: <b16b18c3-5e15-43a2-8712-cebe6ba65d12@6wind.com> (raw)
In-Reply-To: <20260329205846.801e8096bb8cde3eda5ed0c1@uniroma2.it>

Le 29/03/2026 à 20:58, Andrea Mayer a écrit :
> On Fri, 27 Mar 2026 15:06:24 +0100
> Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:
> 
>> The goal is to support x-vrf route. To avoid breaking existing setup, a new
>> flag is introduced: nh-vrf.
>>
>> The dev parameter is mandatory when a seg6 encap route is configured, but
>> before this commit, it is ignored/not used. After the srv6 encapsulation, a
>> second route lookup in the same vrf is performed.
>>
>> The new nh-vrf flag specifies to use the vrf associated with the dev
>> parameter to perform this second route lookup.
>>
> 
> Hi Nicolas,
>  
> thanks for looking into this. The use case is valid and limiting the
> effect of nh-vrf to routes that explicitly carry the attribute avoids
> regressions, which is good.
> 
> 
> I have a few thoughts on the nh-vrf semantics though. The attribute
> says "use the VRF of dev", but in your use case dev is not in any
> VRF, so the lookup ends up in the main table as a side effect of
> the loopback fallback, which is not obvious from the attribute
> name. Today dev is used for source address selection in
The 'no-vrf' / 'default-vrf', I don't know how to call it, is a kind of vrf. I
can try to find another name if it's clearer.
What about 'dev-context'?

> set_tun_src() and has no routing role; nh-vrf would give it one
> that does not match what actually happens.
> 
> 
>>
>> [snip]
>>
>> Fixes: 6c8702c60b88 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
> 
> This looks like a leftover from v1; since this is new UAPI on
> net-next, the Fixes tag should not be here.
Yep, I will remove it.

> 
> 
>> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>> ---
>>
>> v1 -> v2:
>>  - target net-next instead of net
>>  - add a new attribute to avoid breaking the legacy behavior
>>  - use dst_dev_rcu()
>>
>>  include/uapi/linux/seg6_iptunnel.h            |  1 +
>>  net/ipv6/seg6_iptunnel.c                      | 23 ++++++++++--
>>  .../selftests/net/srv6_end_dt46_l3vpn_test.sh | 35 +++++++++++++++++--
>>  .../selftests/net/srv6_end_dt4_l3vpn_test.sh  | 33 +++++++++++++++--
>>  .../selftests/net/srv6_end_dt6_l3vpn_test.sh  | 33 +++++++++++++++--
>>  5 files changed, 116 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/uapi/linux/seg6_iptunnel.h b/include/uapi/linux/seg6_iptunnel.h
>> index 485889b19900..d7d6aa2f72c5 100644
>> --- a/include/uapi/linux/seg6_iptunnel.h
>> +++ b/include/uapi/linux/seg6_iptunnel.h
>> @@ -21,6 +21,7 @@ enum {
>>  	SEG6_IPTUNNEL_UNSPEC,
>>  	SEG6_IPTUNNEL_SRH,
>>  	SEG6_IPTUNNEL_SRC,	/* struct in6_addr */
>> +	SEG6_IPTUNNEL_NH_VRF,
>>  	__SEG6_IPTUNNEL_MAX,
>>  };
>>  #define SEG6_IPTUNNEL_MAX (__SEG6_IPTUNNEL_MAX - 1)
>> diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c
>> index e76cc0cc481e..f8c6f0d719be 100644
>> --- a/net/ipv6/seg6_iptunnel.c
>> +++ b/net/ipv6/seg6_iptunnel.c
>> @@ -50,6 +50,7 @@ static size_t seg6_lwt_headroom(struct seg6_iptunnel_encap *tuninfo)
>>  struct seg6_lwt {
>>  	struct dst_cache cache;
>>  	struct in6_addr tunsrc;
>> +	bool nh_vrf;
>>  	struct seg6_iptunnel_encap tuninfo[];
>>  };
>>  
>> @@ -67,6 +68,7 @@ seg6_encap_lwtunnel(struct lwtunnel_state *lwt)
>>  static const struct nla_policy seg6_iptunnel_policy[SEG6_IPTUNNEL_MAX + 1] = {
>>  	[SEG6_IPTUNNEL_SRH]	= { .type = NLA_BINARY },
>>  	[SEG6_IPTUNNEL_SRC]	= NLA_POLICY_EXACT_LEN(sizeof(struct in6_addr)),
>> +	[SEG6_IPTUNNEL_NH_VRF]	= { .type = NLA_FLAG },
>>  };
>>  
>>  static int nla_put_srh(struct sk_buff *skb, int attrtype,
>> @@ -499,9 +501,15 @@ static int seg6_input_core(struct net *net, struct sock *sk,
>>  	 * now and use it later as a comparison.
>>  	 */
>>  	lwtst = orig_dst->lwtstate;
>> -
>>  	slwt = seg6_lwt_lwtunnel(lwtst);
>>  
>> +	if (slwt->nh_vrf) {
>> +		rcu_read_lock();
>> +		skb->dev = l3mdev_master_dev_rcu(dst_dev_rcu(orig_dst)) ?:
>> +			dev_net(skb->dev)->loopback_dev;
>> +		rcu_read_unlock();
>> +	}
>> +
>> [snip]
>>
> 
> Overwriting skb->dev alters flowi6_iif, which affects ip rule matching
yes, this is the goal of the patch.

> on the ingress interface and changes what netfilter FORWARD sees as "in"
> device. Also, seg6_output_core() never checks nh_vrf and its fl6 is
Sure, but it enables filtering on the vrf interface. It won't break anything
because the flag doesn't exist for now.

> built without involving skb->dev, so this only works for forwarded
> traffic. These aspects would need to be addressed in any case.
Yes, I saw this. There is already a difference between the forwarding path and
the local output path. On the forwarding path, the input vrf is used for the
second route lookup. On the local output path, the 'default-vrf' is always used.
I didn't want to mix problems, so I targeted the forwarding path to start.

> 
> 
> Looking at this from a different angle, specifying the FIB table ID
> directly could be a more natural fit here. Something like
> SEG6_IPTUNNEL_TABLE (u32) with fib6_get_table() + ip6_pol_route(),
> similar to seg6_lookup_any_nexthop() in seg6_local.c.
> It would work for both input and output with no need to touch skb->dev.
> For example:
>  
>   ip -6 route add cafe::1/128 vrf vrf-100 \
>       encap seg6 mode encap segs fc00::1 lookup 254 dev veth0
> 
>  
> Beyond the semantics, a table ID is also more general: it covers
> the main table, tables associated with VRFs, and custom tables with
> the same mechanism, and keeps dev consistent with its current role
> across behaviors. I think we should explore this direction before
> moving forward. I am happy to help if you want.
My goal is to 'fix' the current behavior, not to add a new feature.
Today, the dev arg is mandatory but not used, this is misleading. The selftests
shows the inconsistency. The device of the encap route is in the 'default-vrf'
but another route in the same vrf is needed, with the same nexthop (dev).

Regards,
Nicolas

     prev parent reply	other threads:[~2026-03-31 15:57 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-27 14:06 [PATCH net-next v2] seg6: enable route leak for encap routes Nicolas Dichtel
2026-03-29 18:58 ` Andrea Mayer
2026-03-31 15:57   ` Nicolas Dichtel [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b16b18c3-5e15-43a2-8712-cebe6ba65d12@6wind.com \
    --to=nicolas.dichtel@6wind.com \
    --cc=ahabdels@cisco.com \
    --cc=andrea.mayer@uniroma2.it \
    --cc=davem@davemloft.net \
    --cc=david.lebrun@uclouvain.be \
    --cc=dsahern@kernel.org \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=paolo.lungaroni@uniroma2.it \
    --cc=stefano.salsano@uniroma2.it \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox