From mboxrd@z Thu Jan  1 00:00:00 1970
From: Pablo Neira Ayuso <pablo@netfilter.org>
Subject: Re: [PATCH 02/14] tcp: fix mark propagation with fwmark_reflect
 enabled
Date: Thu, 26 Jan 2017 20:19:35 +0100
Message-ID: <20170126191935.GA26591@salvia>
References: <1485448687-6072-1-git-send-email-pablo@netfilter.org>
 <1485448687-6072-3-git-send-email-pablo@netfilter.org>
 <1485453760.5145.144.camel@edumazet-glaptop3.roam.corp.google.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: netfilter-devel@vger.kernel.org, davem@davemloft.net,
        netdev@vger.kernel.org
To: Eric Dumazet <eric.dumazet@gmail.com>
Return-path: <netdev-owner@vger.kernel.org>
Content-Disposition: inline
In-Reply-To: <1485453760.5145.144.camel@edumazet-glaptop3.roam.corp.google.com>
Sender: netdev-owner@vger.kernel.org
List-Id: netfilter-devel.vger.kernel.org

On Thu, Jan 26, 2017 at 10:02:40AM -0800, Eric Dumazet wrote:
> On Thu, 2017-01-26 at 17:37 +0100, Pablo Neira Ayuso wrote:
> > From: Pau Espin Pedrol <pespin.shar@gmail.com>
> > 
> > Otherwise, RST packets generated by the TCP stack for non-existing
> > sockets always have mark 0.
> > The mark from the original packet is assigned to the netns_ipv4/6
> > socket used to send the response so that it can get copied into the
> > response skb when the socket sends it.
> > 
> > Fixes: e110861f8609 ("net: add a sysctl to reflect the fwmark on replies")
> > Cc: Lorenzo Colitti <lorenzo@google.com>
> > Signed-off-by: Pau Espin Pedrol <pau.espin@tessares.net>
> > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> > ---
> >  net/ipv4/ip_output.c | 1 +
> >  net/ipv6/tcp_ipv6.c  | 1 +
> >  2 files changed, 2 insertions(+)
> > 
> > diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> > index fac275c48108..b67719f45953 100644
> > --- a/net/ipv4/ip_output.c
> > +++ b/net/ipv4/ip_output.c
> > @@ -1629,6 +1629,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
> >  	sk->sk_protocol = ip_hdr(skb)->protocol;
> >  	sk->sk_bound_dev_if = arg->bound_dev_if;
> >  	sk->sk_sndbuf = sysctl_wmem_default;
> > +	sk->sk_mark = fl4.flowi4_mark;
> >  	err = ip_append_data(sk, &fl4, ip_reply_glue_bits, arg->iov->iov_base,
> >  			     len, 0, &ipc, &rt, MSG_DONTWAIT);
> >  	if (unlikely(err)) {
> > diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> > index 73bc8fc68acd..2b20622a5824 100644
> > --- a/net/ipv6/tcp_ipv6.c
> > +++ b/net/ipv6/tcp_ipv6.c
> > @@ -840,6 +840,7 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32
> >  	dst = ip6_dst_lookup_flow(ctl_sk, &fl6, NULL);
> >  	if (!IS_ERR(dst)) {
> >  		skb_dst_set(buff, dst);
> > +		ctl_sk->sk_mark = fl6.flowi6_mark;
> >  		ip6_xmit(ctl_sk, buff, &fl6, NULL, tclass);
> >  		TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
> >  		if (rst)
> 
> 
> This patch is wrong.
> 
> ctl_sk is a shared socket, and tcp_v6_send_response() can be called from
> many different cpus at the same time.

Right. Unlike in IPv4, the control socket here is not percpu.

I can send a follow-up patch to bring this in sync with the way we do it
in IPv4, i.e. add a percpu socket.
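
To make the problem and the proposed direction concrete, here is a rough
userspace sketch, not kernel code. Everything in it (struct fake_sock,
NR_CPUS, set_mark(), xmit_mark(), reply_mark_shared(), reply_mark_percpu())
is made up for illustration. It replays by hand one bad interleaving of
two cpus on a shared socket, then the same sequence with one socket per
cpu.

```c
#include <stdint.h>

#define NR_CPUS 2	/* hypothetical, for this sketch only */

/* Stand-in for struct sock; only the field that matters here. */
struct fake_sock {
	uint32_t sk_mark;
};

/* One control socket shared by every cpu (the IPv6 case in the patch). */
static struct fake_sock shared_ctl_sk;

/* One control socket per cpu (the direction proposed above). */
static struct fake_sock percpu_ctl_sk[NR_CPUS];

/* Step 1 of a reply: stash the mark of the incoming packet in the socket. */
static void set_mark(struct fake_sock *sk, uint32_t mark)
{
	sk->sk_mark = mark;
}

/* Step 2 of a reply: the xmit path copies sk_mark into the outgoing skb. */
static uint32_t xmit_mark(const struct fake_sock *sk)
{
	return sk->sk_mark;
}

/*
 * Shared socket: cpu0 sets its mark, cpu1 sets its mark, then cpu0
 * transmits. Returns the mark that cpu0's reply ends up carrying.
 */
uint32_t reply_mark_shared(uint32_t mark_cpu0, uint32_t mark_cpu1)
{
	set_mark(&shared_ctl_sk, mark_cpu0);
	set_mark(&shared_ctl_sk, mark_cpu1);	/* overwrites cpu0's mark */
	return xmit_mark(&shared_ctl_sk);
}

/* Same sequence of steps, but each cpu only touches its own socket. */
uint32_t reply_mark_percpu(uint32_t mark_cpu0, uint32_t mark_cpu1)
{
	set_mark(&percpu_ctl_sk[0], mark_cpu0);
	set_mark(&percpu_ctl_sk[1], mark_cpu1);
	return xmit_mark(&percpu_ctl_sk[0]);
}
```

With marks 1 and 2, reply_mark_shared(1, 2) returns 2, so cpu0's reply
goes out with cpu1's mark, while reply_mark_percpu(1, 2) returns 1.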

Fine with this approach? Thanks!