netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Eric Dumazet <edumazet@google.com>,
	Willem de Bruijn <willemb@google.com>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>
Subject: Re: [PATCH v5 2/2] sock: Move the socket inuse to namespace.
Date: Fri, 8 Dec 2017 19:29:43 +0800	[thread overview]
Message-ID: <CAMDZJNX-JTGhQNTJSByLC2LCW=uY9C8MfmRr2eYoxbPQGGg6Sw@mail.gmail.com> (raw)
In-Reply-To: <CAMDZJNXoW1zPV6czAZiS1XN7Reb3w49whEnt25j8hPOQzdRJRg@mail.gmail.com>

hi all. we can add synchronize_rcu and rcu_barrier in sock_inuse_exit_net to
ensure there are no outstanding rcu callbacks using this network namespace.
we will not have to test if net->core.sock_inuse is NULL or not from
sock_inuse_add(). :)

 static void __net_exit sock_inuse_exit_net(struct net *net)
 {
        free_percpu(net->core.prot_inuse);
+
+       synchronize_rcu();
+       rcu_barrier();
+
+       free_percpu(net->core.sock_inuse);
 }


On Fri, Dec 8, 2017 at 5:52 PM, Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> On Fri, Dec 8, 2017 at 1:40 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> On Fri, 2017-12-08 at 13:28 +0800, Tonghao Zhang wrote:
>>> On Fri, Dec 8, 2017 at 1:20 AM, Eric Dumazet <eric.dumazet@gmail.com>
>>> wrote:
>>> > On Thu, 2017-12-07 at 08:45 -0800, Tonghao Zhang wrote:
>>> > > In some case, we want to know how many sockets are in use in
>>> > > different _net_ namespaces. It's a key resource metric.
>>> > >
>>> >
>>> > ...
>>> >
>>> > > +static void sock_inuse_add(struct net *net, int val)
>>> > > +{
>>> > > +     if (net->core.prot_inuse)
>>> > > +             this_cpu_add(*net->core.sock_inuse, val);
>>> > > +}
>>> >
>>> > This is very confusing.
>>> >
>>> > Why testing net->core.prot_inuse for NULL is needed at all ?
>>> >
>>> > Why not testing net->core.sock_inuse instead ?
>>> >
>>>
>>> Hi Eric and Cong, oh it's a typo. it's net->core.sock_inuse there.
>>> Why
>>> we should check the net->core.sock_inuse
>>> Now show you the code:
>>>
>>> cleanup_net will call all of the network namespace exit methods,
>>> rcu_barrier, and then remove the _net_ namespace.
>>>
>>> cleanup_net:
>>>     list_for_each_entry_reverse(ops, &pernet_list, list)
>>>          ops_exit_list(ops, &net_exit_list);
>>>
>>>     rcu_barrier(); /* for netlink sock, the ‘deferred_put_nlk_sk’
>>> will
>>> be called. But sock_inuse has been released. */
>>
>>
>> Thats would be a bug.
>>
>> Please find another way, but we want ultimately to check that before
>> net->core.sock_inuse is freed, folding the inuse count on all cpus is
>> 0, to make sure we do not have a bug somewhere.
>
> Yes, I am aware of this issue even we will destroy the network namespace.
> By the way, we can counter the socket-inuse in sock_alloc or sock_release.
> In this way, we have to hold the network namespace again(via
> get_net()) while sock
> may hold it.
>
> what do you think of this idea?
>
>> We should not have to test if net->core.sock_inuse is NULL or not from
>> sock_inuse_add(). Pointer must be there all the time.
>>
>> The freeing should only happen once we are sure sock_inuse_add() can
>> not be called anymore.
>>
>>>
>>>
>>>     /* Finally it is safe to free my network namespace structure */
>>>     list_for_each_entry_safe(net, tmp, &net_exit_list, exit_list) {}
>>>
>>>
>>>
>>> Release the netlink sock created in kernel(not hold the _net_
>>> namespace):
>>>
>>> netlink_release
>>>        call_rcu(&nlk->rcu, deferred_put_nlk_sk);
>>>
>>> deferred_put_nlk_sk
>>>        sk_free(sk);
>>>
>>>
>>> I may add a comment for sock_inuse_add in v6.
>>
>>

  reply	other threads:[~2017-12-08 11:29 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-07 16:45 [PATCH v5 1/2] sock: Change the netns_core member name Tonghao Zhang
2017-12-07 16:45 ` [PATCH v5 2/2] sock: Move the socket inuse to namespace Tonghao Zhang
2017-12-07 17:20   ` Eric Dumazet
2017-12-07 21:28     ` Cong Wang
2017-12-08  5:28     ` Tonghao Zhang
2017-12-08  5:40       ` Eric Dumazet
2017-12-08  9:52         ` Tonghao Zhang
2017-12-08 11:29           ` Tonghao Zhang [this message]
2017-12-08 13:24             ` Eric Dumazet
2017-12-09  5:25               ` Tonghao Zhang
2017-12-08 22:09       ` Cong Wang
2017-12-09  5:27         ` Tonghao Zhang
2017-12-09 19:42           ` Cong Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMDZJNX-JTGhQNTJSByLC2LCW=uY9C8MfmRr2eYoxbPQGGg6Sw@mail.gmail.com' \
    --to=xiangxia.m.yue@gmail.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=eric.dumazet@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=willemb@google.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).