netdev.vger.kernel.org archive mirror
From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
To: Hannes Frederic Sowa <hannes@stressinduktion.org>,
	David Miller <davem@davemloft.net>,
	jbenc@redhat.com
Cc: netdev@vger.kernel.org, thaller@redhat.com
Subject: Re: [PATCH net] net: try harder to not reuse ifindex when moving interfaces
Date: Thu, 22 Oct 2015 16:52:13 +0200
Message-ID: <5628F81D.1070009@6wind.com>
In-Reply-To: <1445447578.1265325.416533273.3D5599FD@webmail.messagingengine.com>

On 21/10/2015 19:12, Hannes Frederic Sowa wrote:
> Hello,
>
> On Wed, Oct 21, 2015, at 17:56, David Miller wrote:
>> From: Jiri Benc <jbenc@redhat.com>
>> Date: Wed, 21 Oct 2015 17:25:02 +0200
>>
>>> On Wed, 21 Oct 2015 08:32:14 -0700 (PDT), David Miller wrote:
>>>> As you say the apps are broken, so file a bug and have them fixed.
>>>>
>>>> The assumption is clearly invalid, so apps cannot make such an
>>>> assumption.
>>>
>>> Does it mean you would be okay with a patch that always allocates and
>>> assigns a new ifindex in the target netns when interface is moved
>>> between name spaces?
>>
>> I think you're misunderstanding me if you're still recommending
>> kernel changes.
>>
>> I'm plainly saying to remove the assumption in the apps.
>>
>> If you don't show me exactly how some kernel change can lead to
>> the apps implementing things properly, without the invalid
>> assumptions, then I can only assume you didn't hear what I
>> said.
>
> I think the reason why ifindexes exist as ints is that we want to have
> a lightweight way to refer to interfaces without taking references or
> timestamps or generation ids, which would completely remove the
> possibility of races. But the racy nature of ifindexes is something we
> actually want; otherwise a user space program acquiring an ifindex would
> need to get a reference on the device and release it either during socket
> close or program termination, which would be very costly.
>
> This patch minimizes the race quite a lot, from something we could
> actually see in everyday container creation to probably something only
> some users will experience when depleting the ifindex pool or playing
> around with CRIU.
>
> We could come up with more heavy machinery to close the race further for
> CRIU by keeping track of "poisoned" ifindexes, which would need a
> hashmap which could become pretty big and we could recycle when ifindex
> wraps around, but this seems too heavy weight to me.
>
> I am in favor of a solution to minimize this race in the kernel even
> though we cannot ever close it completely.
I am probably missing something, but if the app listens to netlink, I don't
see how such an app could have a race window.

With the proposed scenario:
1. create netns 'new_netns'
2. in root netns, move the interface with ifindex 2 to new_netns
3. in new_netns, delete the interface with ifindex 2
4. in new_netns, create an interface - it will get ifindex 2

Operations 2 and 4 are done by dev_change_net_namespace() under rtnl_lock().
RTM_DELLINK(root netns) and RTM_NEWLINK(new_netns) are sent by this function.
This means that operation 3 has already completed and that
RTM_DELLINK(new_netns) has already been sent.
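
To illustrate what I mean by an app that listens to netlink, here is a rough
user space sketch, not taken from any real application. It parses a buffer of
rtnetlink messages and keeps a per-netns cache keyed by ifindex. The LinkCache
class and its generation counter are hypothetical names used only for this
example; the message type values and the nlmsghdr/ifinfomsg layouts are the
ones from <linux/netlink.h> and <linux/rtnetlink.h>. Attributes such as
IFLA_IFNAME are ignored to keep it short.

```python
import struct

# Message types from <linux/rtnetlink.h>.
RTM_NEWLINK = 16
RTM_DELLINK = 17

# struct nlmsghdr: nlmsg_len, nlmsg_type, nlmsg_flags, nlmsg_seq, nlmsg_pid
NLMSGHDR = struct.Struct("=IHHII")
# struct ifinfomsg: ifi_family, pad, ifi_type, ifi_index, ifi_flags, ifi_change
IFINFOMSG = struct.Struct("=BxHiII")


def parse_link_events(buf):
    """Yield (msg_type, ifindex) for each link message found in buf."""
    offset = 0
    while offset + NLMSGHDR.size <= len(buf):
        length, msg_type, _flags, _seq, _pid = NLMSGHDR.unpack_from(buf, offset)
        if length < NLMSGHDR.size or offset + length > len(buf):
            break  # truncated or malformed message, stop parsing
        has_ifinfo = length >= NLMSGHDR.size + IFINFOMSG.size
        if msg_type in (RTM_NEWLINK, RTM_DELLINK) and has_ifinfo:
            fields = IFINFOMSG.unpack_from(buf, offset + NLMSGHDR.size)
            yield msg_type, fields[2]  # fields[2] is ifi_index
        offset += (length + 3) & ~3  # same rounding as NLMSG_ALIGN()


class LinkCache:
    """Hypothetical per-netns view: ifindex -> generation of the device."""

    def __init__(self):
        self.links = {}
        self.generation = 0

    def feed(self, buf):
        """Apply the link events in buf in the order they were received."""
        for msg_type, index in parse_link_events(buf):
            if msg_type == RTM_DELLINK:
                self.links.pop(index, None)
            elif index not in self.links:
                # RTM_NEWLINK is also sent for changes to an existing link,
                # so only an unknown ifindex counts as a new device.
                self.generation += 1
                self.links[index] = self.generation
```

A real listener would fill the buffer from an AF_NETLINK socket bound to the
RTMGRP_LINK group in the netns it cares about. As long as it applies the
messages in order, it sees RTM_DELLINK for ifindex 2 before the RTM_NEWLINK
that reuses it, so the cache entry is replaced and not confused with the old
device.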

Regards,
Nicolas

Thread overview: 20+ messages
2015-10-16 11:07 [PATCH net] net: try harder to not reuse ifindex when moving interfaces Jiri Benc
2015-10-18 15:11 ` Alexei Starovoitov
2015-10-19  9:06   ` Jiri Benc
2015-10-19 15:36     ` Alexei Starovoitov
2015-10-21 14:43 ` David Miller
2015-10-21 14:46   ` Jiri Benc
2015-10-21 15:32     ` David Miller
2015-10-21 15:25       ` Jiri Benc
2015-10-21 15:56         ` David Miller
2015-10-21 17:12           ` Hannes Frederic Sowa
2015-10-22 14:52             ` Nicolas Dichtel [this message]
2015-10-22 15:00               ` Jiri Benc
2015-10-22 15:10                 ` Hannes Frederic Sowa
2015-10-22 15:20                 ` Thomas Haller
2015-10-22 15:23                 ` Nicolas Dichtel
2015-10-22 16:45                 ` Thomas Graf
2015-10-22 17:21                   ` Hannes Frederic Sowa
2015-10-22 18:56                     ` Thomas Graf
2015-10-23 10:40                       ` Thomas Haller
2015-10-22 15:21               ` Thomas Haller
