netdev.vger.kernel.org archive mirror
From: ebiederm@xmission.com (Eric W. Biederman)
To: David Miller <davem@davemloft.net>
Cc: equinox@diac24.net, hans.schillstrom@ericsson.com,
	netdev@vger.kernel.org
Subject: Re: RFC Hanging clean-up of a namespace
Date: Thu, 19 Jan 2012 13:24:13 -0800
Message-ID: <m1lip32xoi.fsf@fess.ebiederm.org>
In-Reply-To: <20120119.152752.318442465605898328.davem@davemloft.net> (David Miller's message of "Thu, 19 Jan 2012 15:27:52 -0500 (EST)")

David Miller <davem@davemloft.net> writes:

> From: David Lamparter <equinox@diac24.net>
> Date: Thu, 19 Jan 2012 20:53:49 +0100
>
>> On Thu, Jan 19, 2012 at 02:31:05PM -0500, David Miller wrote:
>>> >> >> Keeping the timewait sockets around is necessary to absorb any lingering
>>> >> >> packets in the network meant for those sockets.
>> [...]
>>> >> The assumption is that the address is moving, which might not be true.
>>> > 
>>> > I don't understand what you mean, what address may not be moving?
>>> > 
>>> > We're talking about dropping a netns. All of its addresses disappear,
>>> > all of its soft devices disappear. Its hard devices fall back into the
>>> > init namespace, is that what you're referring to?
>>> 
>>> And then you immediately start up a new netns with the same address
>>> and then resets go back to lingering TCP packets the time-waits would
>>> have consumed.
>>> 
>>> The reason this is different from a host reboot is that a host reboot
>>> takes some amount of time, which even if around 30 seconds is superior
>>> in behavior to what can happen with netns which can be created almost
>>> instantly.
>> 
>> Arjan van de Ven booted Linux in 5 seconds in 2008,
>> cf. http://lwn.net/Articles/299483/
>> 
>> On the TCP timewait scale of time, this is pretty much "immediate".
>> 
>> [..]
>>> Then if a new netns is created that tries to reuse the address used by
>>> the mini-netns which hasn't cleared yet, you give -EAGAIN until all
>>> the timewaits expire.
>> 
>> The effect of this is that you end up being unable to reboot lxc based
>> virtualised hosts without waiting 2 minutes for the TCP timers to
>> expire. That sounds completely unacceptable to me.
>
> All you are saying to me is that we are on a trajectory to major problems
> if it becomes pervasive that time-wait gets cancelled out and addresses
> then get reused so quickly.

This thread is a fascinating disconnect from reality all of the way
around.

- inet_twsk_purge already implements throwing out timewait sockets
  when a network namespace is being cleaned up.  So the RFC is nonsense.

- Keeping the timewait sockets past the point where the code purges
  them can achieve nothing.  We don't have any userspace processes or
  network devices associated with the timewait sockets at the point we
  get rid of them.  The network namespace exists so long as a userspace
  process can find it.  The network namespace exit is asynchronous in
  its own workqueue so userspace definitely is not blocked.

- I don't see anything obvious that we can do in the kernel that will
  make the situation better than it is today.
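
As a rough illustration of the first point, here is a toy userspace
model of what purging timewait state per namespace amounts to.  It is
not the kernel's inet_twsk_purge; struct tw_entry, its fields, and
tw_purge are hypothetical names made up for this sketch, and a flat
array stands in for the real hash tables.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code: each entry stands in for a
 * timewait socket and records which namespace owns it. */
struct tw_entry {
    int netns_id;   /* hypothetical namespace identifier */
    int port;       /* hypothetical local port, for illustration only */
};

/* Drop every entry owned by dying_ns, compacting the array in place.
 * Returns the number of entries that remain. */
size_t tw_purge(struct tw_entry *table, size_t count, int dying_ns)
{
    size_t kept = 0;

    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (table[i].netns_id == dying_ns)
            continue;               /* entry goes away with its namespace */
        table[kept++] = table[i];   /* entry of another namespace survives */
    }
    return kept;
}
```

Entries belonging to other namespaces are untouched, which is the
only property the sketch is meant to show.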

I'm not arguing that we should reuse addresses quickly.  I see value
in the tcp_timewait mechanism.  I'm just saying this thread seems
to be discussing some other network stack than the one that lives
in the linux kernel.

Eric


Thread overview: 28+ messages
2012-01-19 11:07 RFC Hanging clean-up of a namespace Hans Schillstrom
2012-01-19 13:31 ` David Lamparter
2012-01-19 17:40 ` David Miller
2012-01-19 19:01   ` David Lamparter
2012-01-19 19:06     ` David Miller
2012-01-19 19:25       ` David Lamparter
2012-01-19 19:31         ` David Miller
2012-01-19 19:53           ` David Lamparter
2012-01-19 20:27             ` David Miller
2012-01-19 21:03               ` David Lamparter
2012-01-19 21:24               ` Eric W. Biederman [this message]
2012-01-19 21:40                 ` David Lamparter
2012-01-19 21:40                 ` Hagen Paul Pfeifer
2012-01-19 21:47                   ` David Lamparter
2012-01-19 22:10                     ` Rick Jones
2012-01-19 22:16                     ` Hagen Paul Pfeifer
2012-01-19 22:37                     ` David Miller
2012-01-20  6:08                   ` Hans Schillstrom
2012-01-20 10:08                     ` Eric W. Biederman
2012-01-20 11:51                       ` Hans Schillstrom
2012-01-20 20:55                         ` Eric W. Biederman
2012-01-23  6:07                           ` Hans Schillstrom
2012-01-23  6:25                             ` Eric W. Biederman
2012-01-23  6:58                               ` Hans Schillstrom
2012-01-23  7:17                                 ` Eric W. Biederman
2012-01-23  7:30                                   ` Hans Schillstrom
2012-01-23  7:55                                     ` Eric W. Biederman
2012-01-19 19:40   ` Hans Schillström
