From: ebiederm@xmission.com (Eric W. Biederman)
To: nicolas.dichtel@6wind.com
Cc: netdev@vger.kernel.org, davem@davemloft.net, bcrl@kvack.org,
	ravi.mlists@gmail.com
Subject: Re: [RFC PATCH net-next 2/2] sit: add support of x-netns
Date: Mon, 24 Jun 2013 15:42:15 -0700
Message-ID: <874ncni114.fsf@xmission.com>
In-Reply-To: <51C8B5F6.7020603@6wind.com> (Nicolas Dichtel's message of "Mon, 24 Jun 2013 23:11:18 +0200")

Nicolas Dichtel <nicolas.dichtel@6wind.com> writes:

> On 24/06/2013 21:28, Eric W. Biederman wrote:
>> Nicolas Dichtel <nicolas.dichtel@6wind.com> writes:
>>
>>> This patch allows switching the netns when a packet is encapsulated or
>>> decapsulated. In other words, the encapsulated packet is received in one
>>> netns, where the lookup is done to find the tunnel. Once the tunnel is found,
>>> the packet is decapsulated and injected into the corresponding interface,
>>> which resides in another netns.
>>>
>>> When one of the two netns is removed, the tunnel is destroyed.
>>
>> I don't see any fundamental problems with this code.  There are bugs
>> with the cleanup noted below.
>>
>> The primary sit interface is marked as NETNS_LOCAL which is good.  A
>> comment explaining the reasoning might be nice, if only for code
>> archaeologists.
> Ok.
>
>>
>> Conditionally calling dev_cleanup_skb bugs me.  The extra conditional
>> looks like a maintenance hazard.   Unless I have missed some subtle
>> detail either we don't need the cleanup at all or actually it is a bug
>> that we aren't scrubbing our packets as they progress through tunnels
>> even in the same network namespace.
>>
>> Can we just make that function the skb scrubbing needed for packets to
>> traverse a tunnel?
>>
>> My concern going into this was that we would get code that would break
>> because it would not be tested enough.  If we can remove the conditional
>> from dev_cleanup_skb we won't have any code that is conditionally run
>> and the logic looks simple enough not to bitrot in routine maintenance.
> My idea was to have the same level of cleanup/scrubbing as when a packet is
> sent from one netns to another netns through a veth. I cannot use
> dev_forward_skb() because this function expects an Ethernet header, which is
> why I split it in patch #1.
>
> If we leave all information attached to the skb, we may have, for example, an
> skb with a conntrack from netns1 and a netdevice from netns2. That does not
> seem safe, but maybe I'm wrong. And in fact, the conntrack will not be created
> in the second netns (nf_conntrack_in() => skb->nfct is not null and not a
> template => stats ignore++).
> Another example is a socket from one netns and the netdevice or conntrack from
> another netns.

All of that I agree with.

I just don't see any need to make that scrubbing/cleaning of the packet
conditional.

Semantically going through a tunnel is the same as crossing between
network namespaces.  So you can change

>>> +	if (tunnel->net != dev_net(tunnel->dev))
>>> +		dev_cleanup_skb(skb);

to just:

	dev_cleanup_skb(skb);

> I was thinking that when a packet enters a namespace, it must not be
> associated with any object from the previous namespace; it should look as if
> we had just received it on the host.

Overall agree.  Tunnels have the same properties.

Which leads me to conclude either we are missing something or the
current tunnel code is mildly buggy because it does not do this level of
scrubbing.
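
To make the argument concrete, here is a toy userspace model of unconditional
scrubbing. It is only a sketch, not kernel code: struct toy_net, struct
toy_skb and the toy_* functions are hypothetical stand-ins, and the only
per-packet state modeled is the conntrack and socket association mentioned
above. The point it illustrates is that the receive path never compares
namespaces before scrubbing.

```c
#include <stddef.h>

/* Toy stand-ins for kernel objects; none of these are the real structures. */
struct toy_net {
	int id;
};

struct toy_skb {
	const struct toy_net *conntrack_net; /* netns of attached conntrack, or NULL */
	const struct toy_net *socket_net;    /* netns of attached socket, or NULL */
	const struct toy_net *dev_net;       /* netns of the current netdevice */
};

/* Drop the state that ties the packet to the namespace it came from. */
static void toy_scrub_skb(struct toy_skb *skb)
{
	skb->conntrack_net = NULL;
	skb->socket_net = NULL;
}

/*
 * Hand the packet to the tunnel device. The scrub is unconditional:
 * there is no "is this a different netns?" test to forget or to bitrot.
 */
static void toy_tunnel_rcv(struct toy_skb *skb,
			   const struct toy_net *tunnel_dev_net)
{
	toy_scrub_skb(skb);
	skb->dev_net = tunnel_dev_net;
}

/* Returns 1 when no attached object belongs to a netns other than the device's. */
static int toy_skb_is_consistent(const struct toy_skb *skb)
{
	if (skb->conntrack_net && skb->conntrack_net != skb->dev_net)
		return 0;
	if (skb->socket_net && skb->socket_net != skb->dev_net)
		return 0;
	return 1;
}
```

In this model a packet that enters toy_tunnel_rcv() with a conntrack and a
socket from one namespace always leaves with neither, whether or not the
tunnel device sits in the same namespace, so toy_skb_is_consistent() holds on
every path.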

Eric

>>> -static void __net_exit sit_destroy_tunnels(struct sit_net *sitn, struct list_head *head)
>>> +static void __net_exit sit_destroy_tunnels(struct net *net,
>>> +					   struct list_head *head)
>>>   {
>>> -	int prio;
>>> +	struct net_device *dev, *aux;
>>>
>>> -	for (prio = 1; prio < 4; prio++) {
>>> -		int h;
>>> -		for (h = 0; h < HASH_SIZE; h++) {
>>> -			struct ip_tunnel *t;
>>> -
>>> -			t = rtnl_dereference(sitn->tunnels[prio][h]);
>>> -			while (t != NULL) {
>>> -				unregister_netdevice_queue(t->dev, head);
>>> -				t = rtnl_dereference(t->next);
>>> -			}
>>> -		}
>>> -	}
>>> +	for_each_netdev_safe(net, dev, aux)
>>> +		if (dev->rtnl_link_ops &&
>>> +		    !strcmp(dev->rtnl_link_ops->kind, "sit"))
>>> +			unregister_netdevice_queue(dev, head);
>>
>> This entire idiom change is a bit ugly, and it is wrong.
>>
>> You need to look for two classes of tunnels to take down.  Tunnels that
>> originate in net and tunnels whose netdevice is in net.
>>
>> For tunnels that reside in net you should be able to just compare the
>> rtnl_link_ops pointer, rather than an ascii name.
>>
>> Tunnels that originate in this network namespace most definitely need to
>> be taken down as among other things you wisely do not keep a reference
>> count on the originating network namespace.
> Yes sure. My beta version was doing the right thing, but I changed this code
> before sending the patch :/

Bahahaha!  The dangers of the last minute cleanup.
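
The teardown rule from the review comments can be sketched the same way. This
is a toy userspace model under stated assumptions, not the eventual kernel
code: struct toy_tunnel, toy_sit_ops and toy_sit_exit_net() are hypothetical
names, and a flat array stands in for both the per-netns device list and the
tunnel hash tables. It shows the two classes of tunnels being queued, and the
ops pointer being compared instead of the kind string.

```c
#include <stddef.h>

/* Toy stand-ins for kernel objects; none of these are the real structures. */
struct toy_net {
	int id;
};

struct toy_link_ops {
	const char *kind;
};

static const struct toy_link_ops toy_sit_ops = { "sit" };

struct toy_tunnel {
	const struct toy_link_ops *ops;
	const struct toy_net *link_net; /* netns the tunnel originates in */
	const struct toy_net *dev_net;  /* netns the netdevice resides in */
	int queued;                     /* set once queued for unregistration */
};

/*
 * Queue every sit tunnel that either originates in @net or has its
 * netdevice in @net. Returns the number of tunnels newly queued.
 */
static size_t toy_sit_exit_net(const struct toy_net *net,
			       struct toy_tunnel *tunnels, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		struct toy_tunnel *t = &tunnels[i];

		/* Pointer comparison, not strcmp() on the kind name. */
		if (t->ops != &toy_sit_ops)
			continue;
		if (t->queued)
			continue;
		if (t->link_net == net || t->dev_net == net) {
			t->queued = 1;
			count++;
		}
	}
	return count;
}
```

A tunnel whose link_net is the dying namespace is queued even when its
netdevice lives elsewhere, which is the case a walk over only the devices in
net would miss.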

Eric
