From: Hans Schillstrom <hans.schillstrom@ericsson.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>,
David Miller <davem@davemloft.net>,
"equinox@diac24.net" <equinox@diac24.net>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: RFC Hanging clean-up of a namespace
Date: Fri, 20 Jan 2012 12:51:24 +0100
Message-ID: <201201201251.25032.hans.schillstrom@ericsson.com>
In-Reply-To: <m1y5t2u1ne.fsf@fess.ebiederm.org>
On Friday 20 January 2012 11:08:37 Eric W. Biederman wrote:
> Hans Schillstrom <hans.schillstrom@ericsson.com> writes:
>
> > On Thursday 19 January 2012 22:40:53 Hagen Paul Pfeifer wrote:
> >> * Eric W. Biederman | 2012-01-19 13:24:13 [-0800]:
> >>
> >> >This thread is a fascinating disconnect from reality all of the way
> >> >around.
> >> >
> >> >- inet_twsk_purge already implements throwing out of timewait sockets
> >> > when a network namespace is being cleaned up. So the RFC is nonsense.
> >>
> >> This is how it is implemented, not how it should be. TIME_WAIT is not the
> >> problem, it is there to keep the stack from sending wrong RST messages. Maybe
> >> the 2*MSL could be fixed by a more accurate 2*RTT.
> >>
> >
> > I was only referring to my printk's, i.e. the last sockets leaving the namespace were
> > from tcp_timer() with state 7, 2 minutes after free_nsproxy() was called.
> > (and assumed that was the time_wait)
>
> Which kernel are you running?
3.2.0
> I can't find a mention of a function
> named tcp_timer() anywhere in the kernel since 2.6.16 when the kernel
> was put into git.
Sorry, it was tcp_write_timer() in tcp_timer.c
>
> There is a file named net/ipv4/tcp_timer.c
>
> But if you are actually describing normal sockets and not timewait
> sockets then it is remotely possible that something like what you are
> talking about is happening.
Hmm, state 7 is TCP_CLOSE; I simply assumed that it was TCP_TIME_WAIT ...
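For reference, a minimal userspace sketch of the state numbering behind the "state 7" printk. The enum values mirror include/net/tcp_states.h; the helper tcp_state_name() is hypothetical and only there to make the mapping readable, it is not a kernel function.

```c
#include <stddef.h>

/* Userspace copy of the TCP state numbering from include/net/tcp_states.h.
 * The numbering starts at 1, so TCP_TIME_WAIT is 6 and TCP_CLOSE is 7. */
enum tcp_state {
    TCP_ESTABLISHED = 1,
    TCP_SYN_SENT,
    TCP_SYN_RECV,
    TCP_FIN_WAIT1,
    TCP_FIN_WAIT2,
    TCP_TIME_WAIT,
    TCP_CLOSE,
    TCP_CLOSE_WAIT,
    TCP_LAST_ACK,
    TCP_LISTEN,
    TCP_CLOSING
};

/* Hypothetical helper (not part of the kernel): map a numeric state,
 * as printed by a debug printk, to its symbolic name. */
const char *tcp_state_name(int state)
{
    static const char *const names[] = {
        [TCP_ESTABLISHED] = "TCP_ESTABLISHED",
        [TCP_SYN_SENT]    = "TCP_SYN_SENT",
        [TCP_SYN_RECV]    = "TCP_SYN_RECV",
        [TCP_FIN_WAIT1]   = "TCP_FIN_WAIT1",
        [TCP_FIN_WAIT2]   = "TCP_FIN_WAIT2",
        [TCP_TIME_WAIT]   = "TCP_TIME_WAIT",
        [TCP_CLOSE]       = "TCP_CLOSE",
        [TCP_CLOSE_WAIT]  = "TCP_CLOSE_WAIT",
        [TCP_LAST_ACK]    = "TCP_LAST_ACK",
        [TCP_LISTEN]      = "TCP_LISTEN",
        [TCP_CLOSING]     = "TCP_CLOSING",
    };

    /* Anything outside the known range is reported as unknown. */
    if (state < TCP_ESTABLISHED || state > TCP_CLOSING)
        return "UNKNOWN";
    return names[state];
}
```

With this mapping, tcp_state_name(7) gives "TCP_CLOSE" and tcp_state_name(6) gives "TCP_TIME_WAIT", which is the off-by-one behind the wrong assumption.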
> Normal sockets keep the network namespace
> alive. So if something was keeping the sockets open. Like perhaps a
> process that has one of your sockets from your network namespace open
> then it could happen.
We had a number of processes with TCP connections open, and killed process 1
(lxc-init), i.e. all processes in the namespace got killed within a few ms
(or at least there were no visible traces left).
> nsproxy is not the only place that references to the network namespace
> are allowed to live that keep the network namespace alive.
>
> >> >- Keeping the timewait sockets at that point we purge them in the code
> >> > can achieve nothing. We don't have any userspace processes or network
> >> > devices associated with the timewait sockets at the point we get rid
> >> > of them. The network namespace exists so long as a userspace process
> >> > can find it. The network namespace exit is asynchronous in its own
> >> > workqueue so userspace definitely is not blocked.
> >>
> >
> > One example of a real-life problem is when a container crashes where a VLAN from
> > a physical interface is used in the container, and you automatically reboot
> > that container. A new namespace is created with that VLAN again, and what happens?
> > That VLAN id is busy (waiting for tcp_timer) and the container start fails ...
> > So you have to wait a couple of minutes :-(
>
> Yes, the vlan is busy until the network namespace is cleaned up, and
> we get as far as calling dellink on the network namespace.
>
> There are a lot of reasons why a network namespace would not be cleaned
> up immediately. Especially in older kernels.
>
> One problem people running older kernels had trouble with was that vsftp
> created an empty network namespace for every connection. On kernels pre
> 2.6.34, I think, before we had batching support for cleaning up network
> devices and network namespaces, the kernel could simply not keep up with
> the rate that vsftp was creating and destroying network namespaces, and
> would slowly fall farther and farther behind in its cleanup.
>
> If you are running an older kernel it is quite possible that you are
> missing some cleanups. It is also possible that you are hitting one of
> the cases where we can only destroy 4 network devices a second and you
> have lots of network devices dying with your network namespace.
>
We started with 2.6.32, but the cleanup process didn't work; we always ended up
with ref-counts on loopback.
Thanks
Hans