From mboxrd@z Thu Jan 1 00:00:00 1970 From: Liran Alon Subject: Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns Date: Thu, 15 Mar 2018 05:14:15 -0700 (PDT) Message-ID: Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: , , , , , To: Return-path: Content-Disposition: inline Sender: linux-kernel-owner@vger.kernel.org List-Id: netdev.vger.kernel.org ----- shmulik.ladkani@gmail.com wrote: > Hi, >=20 > On Tue, 13 Mar 2018 17:07:22 +0200 Liran Alon > wrote: > > Before this commit, dev_forward_skb() always cleared packet's > > per-network-namespace info. Even if the packet doesn't cross > > network namespaces. > >=20 > > The comment above dev_forward_skb() describes that this is done > > because the receiving device may be in another network namespace. > > However, this case can easily be tested for and therefore we can > > scrub packet's per-network-namespace info only when receiving > device > > is indeed in another network namespace. > >=20 > > Therefore, this commit changes ____dev_forward_skb() to tell > > skb_scrub_packet() that skb has crossed network-namespace only in > case > > transmitting device (skb->dev) network namespace is different then > > receiving device (dev) network namespace. >=20 > Assuming the premise of this commit is correct, note it may not act > as > intended for xnet situation in ipvlan_process_multicast, snip: >=20 > =09=09nskb->dev =3D ipvlan->dev; > =09=09if (tx_pkt) > =09=09=09ret =3D dev_forward_skb(ipvlan->dev, nskb); > =09=09else > =09=09=09ret =3D netif_rx(nskb); >=20 > as 'dev' gets already assigned to nskb prior dev_forward_skb (hence > in > ____dev_forward_skb both dev and skb->dev are the same). > Fortunately every ipvlan_multicast_enqueue call is preceded by a > forced > scrub; It would be future-proof to not assign nskb->dev in the > dev_forward_skb case (assign it only in the netif_rx case). I agree. Nice catch. skb->dev will later get assigned in eth_type_trans() called from __dev_forw= ard_skb(). I will do this small change in a separate patch before this patch. (In v2 of this series) >=20 >=20 > Regarding the premise of this commit, this "reduces" the > ipvs/orphan/mark scrubbing in the following *non* xnet situations: >=20 > 1. mac2vlan port xmit to other macvlan ports in Bridge Mode > 2. similarly for ipvlan > 3. veth xmit > 4. l2tp_eth_dev_recv > 5. bpf redirect/clone_redirect ingress actions >=20 > Regarding l2tp recv, this commit seems to align the srubbing behavior > with ip tunnels (full scrub only if crossing netns, see > ip_tunnel_rcv). >=20 > Regarding veth xmit, it does makes sense to preserve the fields if > not > crossing netns. This is also the case when one uses tc mirred. >=20 > Regarding bpf redirect, well, it depends on the expectations of each > bpf > program. > I'd argue that preserving the fields (at least the mark field) in the > *non* xnet makes sense and provides more information and therefore > more > capabilities; Alas this might change behavior already being relied > on. I now noticed that a similar change was done in the past in commit 8a83a00b0735 ("net: maintain namespace isolation between vlan and real device"). Commit changed dev_forward_skb() from alway= s setting skb->mark=3D0 to do this only in case we cross netns. However, a later commit: 59b9997baba5 ("Revert "net: maintain namespace isolation between vlan and real device") seems to later reverted that chang= e. But I think that the regression issue mentioned in the revert isn't related to the change proposed by this commit. Please correct me if I'm missing something. >=20 > Maybe Daniel can comment on the matter. >=20 > Regards, > Shmulik