From mboxrd@z Thu Jan 1 00:00:00 1970
From: Liran Alon
Subject: Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns
Date: Thu, 15 Mar 2018 08:05:33 -0700 (PDT)
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

----- daniel@iogearbox.net wrote:

> On 03/15/2018 03:35 PM, Roman Mashak wrote:
> > Liran Alon writes:
> >
> > [...]
> >>> Overall I think it might be nice to not need scrubbing skb in such
> >>> cases, although my concern would be that this has potential to break
> >>> existing setups when they would expect mark being zero on other veth
> >>> peer in any case since it's the behavior for a long time already. The
> >>> safer option would be to have some sort of explicit opt-in e.g. on
> >>> link creation to let the skb->mark pass through unscrubbed. This would
> >>> definitely be a useful option e.g. when mark is set in the netns
> >>> facing veth via clsact/egress on xmit and when the container is
> >>> unprivileged anyway.
> >>>
> >>> Thanks,
> >>> Daniel
> >>
> >> I see your point in regards to backwards comparability.
> >> However, not scrubbing skb when it cross netns via some kernel
> >> functions compared to others is basically a bug which could easily
> >> break with a little bit of more refactoring.
> >> Therefore, it seems a bit weird to me to from now on, we will force
> >> every user on link creation to consider that once there was a bug
> >> leading to this weird behavior on specific netdevs.
>
> Why bug specifically? It could well be that for some unpriv containers
> it would be fine to do e.g. in cases where orchestrator sets up
> clsact/egress on veth/ipvlan/etc in the container to set the mark and
> where app cannot mess with this while for others you need to act out of
> host facing veth; thus, explicit opt-in per dev could provide such more
> fine grained control.
>
> > One valid use case could be preserving a source namespace nsid in
> > skb->mark when a packet crosses netns.
>
> Right, was thinking about something similar.

I agree with all the above, but this behavior was not supported either
before or after this commit: skb->mark is always zeroed when crossing
netns. This commit only changes behavior for an skb crossing netdevs in
the *same* netns via dev_forward_skb().

Therefore, I believe we should first discuss here what we want the
default behavior to be and how it should be controlled for backward
compatibility.
Only after that should we discuss adding an extra feature for
controlling skb scrubbing per netdev, or something similar.
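
To make the before/after distinction concrete, here is a minimal userspace
sketch of the two behaviors. It is only a toy model and not the kernel
code: every toy_* name and the netns_id field are hypothetical, and only
skb->mark is modelled out of all the per-netns state that gets scrubbed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for kernel objects; none of these are real kernel types. */
struct toy_dev {
	int netns_id;			/* hypothetical id of the device's netns */
};

struct toy_skb {
	uint32_t mark;			/* models skb->mark */
	const struct toy_dev *dev;	/* device the packet currently sits on */
};

/* Clear per-netns state (here: only the mark) when xnet is true. */
static void toy_scrub_packet(struct toy_skb *skb, bool xnet)
{
	if (!xnet)
		return;
	skb->mark = 0;
}

/* Models the behavior before the commit: always scrub as if crossing netns. */
static void toy_forward_always_scrub(struct toy_skb *skb,
				     const struct toy_dev *dst)
{
	toy_scrub_packet(skb, true);
	skb->dev = dst;
}

/* Models the behavior after the commit: scrub only when netns differs. */
static void toy_forward_scrub_on_xnet(struct toy_skb *skb,
				      const struct toy_dev *dst)
{
	bool xnet = skb->dev->netns_id != dst->netns_id;

	toy_scrub_packet(skb, xnet);
	skb->dev = dst;
}

/* Walks one packet through both variants and checks the resulting mark. */
static void toy_demo(void)
{
	struct toy_dev a = { .netns_id = 1 };
	struct toy_dev b = { .netns_id = 1 };	/* same netns as a */
	struct toy_dev c = { .netns_id = 2 };	/* different netns */
	struct toy_skb old_skb = { .mark = 42, .dev = &a };
	struct toy_skb new_skb = { .mark = 42, .dev = &a };

	/* Before: mark is lost even though a and b share a netns. */
	toy_forward_always_scrub(&old_skb, &b);
	assert(old_skb.mark == 0);

	/* After: mark survives the same-netns hop ... */
	toy_forward_scrub_on_xnet(&new_skb, &b);
	assert(new_skb.mark == 42);

	/* ... and is still zeroed once the packet crosses into another netns. */
	toy_forward_scrub_on_xnet(&new_skb, &c);
	assert(new_skb.mark == 0);
}
```

Calling toy_demo() from a main() exercises all three cases. In this model,
the cross-netns case behaves the same in both variants, which is the point
made above: only the same-netns hop differs.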