Linux Container Development
 help / color / mirror / Atom feed
From: Daniel Lezcano <dlezcano-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
To: "Denis V. Lunev" <den-3ImXcnM4P+0@public.gmane.org>
Cc: containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org,
	"Denis V. Lunev" <den-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org
Subject: Re: [PATCH] [NETNS49] support for per/namespace routing cache cleanup
Date: Wed, 17 Oct 2007 16:46:34 +0200	[thread overview]
Message-ID: <4716204A.60603@fr.ibm.com> (raw)
In-Reply-To: <471617CA.9090901-3ImXcnM4P+0@public.gmane.org>

Denis V. Lunev wrote:
> Daniel Lezcano wrote:
>> Denis V. Lunev wrote:
>>> Daniel Lezcano wrote:
>>>> Denis V. Lunev wrote:
>>>>> /proc/sys/net/route/flush should be accessible inside the net 
>>>>> namespace.
>>>>> Though, the complete opening of this file will result in a DoS or
>>>>> significant entire host slowdown if a namespace process will 
>>>>> continually
>>>>> flush routes.
>>>>>
>>>>> This patch introduces per/namespace route flush facility.
>>>>>
>>>>> Each namespace wanted to flush a cache copies global generation 
>>>>> count to
>>>>> itself and starts the timer. The cache is dropped for a specific 
>>>>> namespace
>>>>> iff the namespace counter is greater or equal global ones.
>>>>>
>>>>> So, in general, unwanted namespaces do nothing. They hold very old low
>>>>> counter and they are unaffected by the requested cleanup.
>>>>>
>>>>> Signed-of-by: Denis V. Lunev <den-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
>>>>
>>>> That's right and that will happen when manipulating ip addresses of 
>>>> the network devices too. But I am not confortable with your 
>>>> patchset. It touches the routing flush function too hardly and it 
>>>> uses current->nsproxy->net_ns.
>>>>
>>>> IMHO we should have two flush functions. One taking a network 
>>>> namespace parameter and one without the network namespace parameter. 
>>>> The first one is called when a write to 
>>>> /proc/sys/net/ipv4/route/flush is done (we must use the network 
>>>> namespace of the writer) or when a interface address is changed or 
>>>> shutdown|up. The last one is called by the timer, so we have a 
>>>> global timer flushing the routing cache for all the namespaces.
>>>
>>> we can't :( The unfortunate thing is that the actual cleanup is 
>>> called indirectly and asynchronously. The user _schedule_ the garbage 
>>> collector to run NOW and we are moving over a large routing cache. 
>>> Really large.
>>>
>>> The idea to iterate over the list of each namespace to flush is bad. 
>>> We are in atomic context. The list is protected by the mutex.
>>>
>>> The idea of several timers (per namespace) is also bad. You will 
>>> iterate over large cache several times.
>>>
>>> No other acceptable way here for me :(.
>>>
>>> As for "the trigger" - rt_cache_flush, looks like you are right. We 
>>> should pass namespace as a parameter. This should be done as a 
>>> separate patch.
>>
>> If we change:
>>
>>     rt_cache_flush(struct net *net, int delay);
>>
>> and inside we call rt_cache_flush((unsigned long)net);
>>
>> And then we check in the rt_run_flush function,
>>
>>     struct net *net = (struct net *)dummy;
>>
>>     ...
>>     for (i = rt_hash_mask; i >= 0; i--) {
>>         ...
>>         if (dummy && rth->fl.fl_net != net)
>>             continue
>>         ...
>>     ...
>>
>> So when rt_run_flush is called synchronously, the netns is specified 
>> in dummy and only the routes belonging to netns are flushed. Otherwise 
>> when it is called by the timer, netns is not set so all routes are 
>> flushed.
>>
> 
> this does not look good for me. The size of this cache for 4GB host is 
> 2*10^6 entries for IPv4 with a 131072 chains. The conventional 
> mainstream kernel wants to merge the routing cache cleanup requests from 
> the different sources if they are delayed (default).
> 
> The main idea for this patch is to protect all other namespaces from the 
> current one. This cache is an important resource. You proposal will work 
> for the forced synchronous cleanups. Though, there are some requests 
> with results in the delayed rt_run_flush via [mod/add]_timer
> 
> How should we handle them?

IHMO we should let the timer to remove all routes.

The problem you raised is someone in a namespace can flush routes for 
another namespaces and mess the network traffic for them.
This case is resolved with the synchronous call to rt_cache_flush and 
netns parameter.

The second point here is we can have a lot of routes and the timer will 
flush them all, with or without namespaces.
IMHO I don't think we should handle this case ... for now.

  parent reply	other threads:[~2007-10-17 14:46 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-10-17 11:12 [PATCH] [NETNS49] support for per/namespace routing cache cleanup Denis V. Lunev
     [not found] ` <20071017111215.GA29653-aPCOdVxUTlgvJsYlp49lxw@public.gmane.org>
2007-10-17 11:46   ` Daniel Lezcano
     [not found]     ` <4715F60F.6060304-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
2007-10-17 12:51       ` Denis V. Lunev
     [not found]         ` <4716055D.4010102-3ImXcnM4P+0@public.gmane.org>
2007-10-17 13:40           ` Daniel Lezcano
     [not found]             ` <471610E8.8020008-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
2007-10-17 14:10               ` Denis V. Lunev
     [not found]                 ` <471617CA.9090901-3ImXcnM4P+0@public.gmane.org>
2007-10-17 14:46                   ` Daniel Lezcano [this message]
2007-10-17 15:05                   ` Daniel Lezcano
     [not found]                     ` <471624C0.9020108-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
2007-10-17 17:56                       ` Denis V. Lunev
     [not found]                         ` <47164CBD.3040107-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2007-10-17 18:50                           ` Eric W. Biederman
2007-10-18  7:18                           ` Benjamin Thery
     [not found]                             ` <471708D8.3080808-6ktuUTfB/bM@public.gmane.org>
2007-10-18  9:54                               ` Denis V. Lunev
2007-10-18 14:51                               ` Denis V. Lunev
     [not found]                                 ` <471772EB.2060206-3ImXcnM4P+0@public.gmane.org>
2007-10-18 16:29                                   ` Benjamin Thery
     [not found]                                     ` <471789FD.5020802-6ktuUTfB/bM@public.gmane.org>
2007-10-18 19:01                                       ` Denis V. Lunev
2007-10-18 19:05                                       ` Denis V. Lunev
     [not found]                                         ` <4717AE7D.1060000-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2007-10-19  7:39                                           ` Daniel Lezcano
2007-10-19  7:39                                   ` Daniel Lezcano
     [not found]                                     ` <47185F2C.1070404-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
2007-10-19  8:53                                       ` Denis V. Lunev
     [not found]                                         ` <47187074.3090206-3ImXcnM4P+0@public.gmane.org>
2007-10-19 19:03                                           ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4716204A.60603@fr.ibm.com \
    --to=dlezcano-nmtc/0zbporqt0dzr+alfa@public.gmane.org \
    --cc=containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org \
    --cc=den-3ImXcnM4P+0@public.gmane.org \
    --cc=den-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org \
    --cc=ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox