From mboxrd@z Thu Jan 1 00:00:00 1970
From: ebiederm@xmission.com (Eric W. Biederman)
Subject: Re: RFC Hanging clean-up of a namespace
Date: Thu, 19 Jan 2012 13:24:13 -0800
Message-ID:
References: <20120119192541.GM2262734@jupiter.n2.diac24.net>
 <20120119.143105.735366189369504929.davem@davemloft.net>
 <20120119195349.GN2262734@jupiter.n2.diac24.net>
 <20120119.152752.318442465605898328.davem@davemloft.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: equinox@diac24.net, hans.schillstrom@ericsson.com, netdev@vger.kernel.org
To: David Miller
Return-path:
Received: from out03.mta.xmission.com ([166.70.13.233]:49859 "EHLO
 out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1750993Ab2ASVVr (ORCPT ); Thu, 19 Jan 2012 16:21:47 -0500
In-Reply-To: <20120119.152752.318442465605898328.davem@davemloft.net>
 (David Miller's message of "Thu, 19 Jan 2012 15:27:52 -0500 (EST)")
Sender: netdev-owner@vger.kernel.org
List-ID:

David Miller writes:

> From: David Lamparter
> Date: Thu, 19 Jan 2012 20:53:49 +0100
>
>> On Thu, Jan 19, 2012 at 02:31:05PM -0500, David Miller wrote:
>>> >> >> Keeping the timewait sockets around is necessary to absorb any lingering
>>> >> >> packets in the network meant for those sockets.
>> [...]
>>> >> The assumption is that the address is moving, which might not be true.
>>> >
>>> > I don't understand what you mean, what address may not be moving?
>>> >
>>> > We're talking about dropping a netns. All of its addresses disappear,
>>> > all of its soft devices disappear. Its hard devices fall back into the
>>> > init namespace, is that what you're referring to?
>>>
>>> And then you immediately start up a new netns with the same address
>>> and then resets go back to lingering TCP packets the time-waits would
>>> have consumed.
>>>
>>> The reason this is different from a host reboot is that a host reboot
>>> takes some amount of time, which even if around 30 seconds is superior
>>> in behavior to what can happen with netns which can be created almost
>>> instantly.
>>
>> Arjan van de Ven booted Linux in 5 seconds in 2008,
>> cf. http://lwn.net/Articles/299483/
>>
>> On the TCP timewait scale of time, this is pretty much "immediate".
>>
>> [..]
>>> Then if a new netns is created that tries to reuse the address used by
>>> the mini-netns which hasn't cleared yet, you give -EAGAIN until all
>>> the timewaits expire.
>>
>> The effect of this is that you end up being unable to reboot lxc based
>> virtualised hosts without waiting 2 minutes for the TCP timers to
>> expire. That sounds completely unacceptable to me.
>
> All you are saying to me is that we are on a trajectory to major problems
> if it becomes pervasive that time-wait gets cancelled out and addresses
> then get reused so quickly.

This thread is a fascinating disconnect from reality all of the way
around.

- inet_twsk_purge already implements throwing out of timewait sockets
  when a network namespace is being cleaned up. So the RFC is nonsense.

- Keeping the timewait sockets at the point where we purge them in the
  code can achieve nothing. We don't have any userspace processes or
  network devices associated with the timewait sockets at the point we
  get rid of them.

  The network namespace exists so long as a userspace process can find
  it.

  The network namespace exit is asynchronous in its own workqueue, so
  userspace definitely is not blocked.

- I don't see anything obvious that we can do in the kernel that will
  make the situation better than it is today.

I'm not arguing that we should reuse addresses quickly. I see value in
the tcp_timewait mechanism. I'm just saying this thread seems to be
discussing some other network stack than the one that lives in the
Linux kernel.

Eric