netdev.vger.kernel.org archive mirror
From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
To: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: santosh.shilimkar@oracle.com, davem@davemloft.net,
	netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
	rds-devel@oss.oracle.com, edumazet@google.com
Subject: Re: [PATCH RFC RFC] rds: Use NETDEV_UNREGISTER in rds_tcp_dev_event() (then kill NETDEV_UNREGISTER_FINAL)
Date: Fri, 16 Mar 2018 13:29:18 -0400
Message-ID: <20180316172918.GM16870@oracle.com>
In-Reply-To: <512c2483-aa62-b385-6079-1b6ff8ba4636@virtuozzo.com>


I had taken some of this offline, but it occurs to me
that some of these notes should be saved to the netdev archives, 
in case this question pops up again in a few years.

When I run your patch, I get a repeatable panic by doing
  modprobe rds-tcp
  ip netns create blue
The panic occurs because we find a NULL trn in rds_tcp_init_net().

I think there's something very disturbed about calling
register_pernet_operations() twice, once via 
register_pernet_device() and again via register_pernet_subsys().

I suspect this has everything to do with the panic but I have
not had time to debug every little detail here.

In general, rds_tcp is not a network device, it is a kernel
module.  That is the fundamental problem here. 

To repeat the comments from net_namespace.h:
 * Network interfaces need to be removed from a dying netns _before_
 * subsys notifiers can be called, as most of the network code cleanup
 * (which is done from subsys notifiers) runs with the assumption that
 * dev_remove_pack has been called so no new packets will arrive during
 * and after the cleanup functions have been called.  dev_remove_pack
 * is not per namespace so instead the guarantee of no more packets
 * arriving in a network namespace is provided by ensuring that all
 * network devices and all sockets have left the network namespace
 * before the cleanup methods are called.

When the "blue" netns starts up, it creates at least one kernel listen
socket on *.16385. This socket, and any other child/client sockets 
created must be cleaned up before the cleanup_net can happen.

This is why I chose to call register_pernet_subsys(). Again, as per
comments in net_namespace.h:

 * Use these carefully.  If you implement a network device and it
 * needs per network namespace operations use device pernet operations,
 * otherwise use pernet subsys operations.
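
As a rough illustration of the ordering that the net_namespace.h
comments describe, here is a toy userspace model. It is not kernel
code, and every toy_* name is hypothetical. It assumes that subsys ops
are placed ahead of device ops in a single list and that namespace
teardown walks that list in reverse, so device ops exit before subsys
ops.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPS 16

/* Toy stand-in for a per-netns operations list; not kernel code. */
struct toy_pernet_list {
    const char *name[MAX_OPS];
    size_t count;        /* total registered ops */
    size_t first_device; /* index of first device op; == count if none */
};

/* Subsys ops are inserted just before the first device op. */
static void toy_register_subsys(struct toy_pernet_list *l, const char *name)
{
    assert(l->count < MAX_OPS);
    memmove(&l->name[l->first_device + 1], &l->name[l->first_device],
            (l->count - l->first_device) * sizeof(l->name[0]));
    l->name[l->first_device] = name;
    l->first_device++;
    l->count++;
}

/* Device ops are appended at the tail. */
static void toy_register_device(struct toy_pernet_list *l, const char *name)
{
    assert(l->count < MAX_OPS);
    l->name[l->count++] = name;
}

/* Teardown walks the list in reverse: device ops first, subsys ops last. */
static size_t toy_exit_order(const struct toy_pernet_list *l,
                             const char *out[MAX_OPS])
{
    for (size_t i = 0; i < l->count; i++)
        out[i] = l->name[l->count - 1 - i];
    return l->count;
}
```

Starting from a zero-initialized struct toy_pernet_list, registering a
device op, then a subsys op, then another device op still yields an
exit order in which both device ops run before the subsys op,
regardless of registration order.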

On (03/16/18 18:51), Kirill Tkhai wrote:
> > Let's find another approach. Could you tell what problem we have in 
> > case of rds_tcp_dev_ops is declared as pernet_device?

As above, rds-tcp is not a network device.

> One more question. Which time we take a reference of loopback device?
> Is it possible before we created a net completely?

We don't take a reference on the loopback device.
We make sure none of the kernel sockets does a get_net() so
that we don't block the cleanup_net, and then, when all
the network interfaces have been taken down (loopback is
the last one) we know there are no more packets coming in
and out, so it is safe to dismantle all kernel sockets 
created by rds-tcp.
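
The sequence above can be sketched with a toy userspace model. This is
not the rds-tcp implementation; the struct and all toy_* functions are
hypothetical. It only shows the idea that a kernel socket which takes
no namespace reference cannot hold off cleanup, and that sockets are
dismantled only after every interface is gone.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a network namespace; all names here are hypothetical. */
struct toy_netns {
    int refcnt;      /* references that keep the netns alive */
    int ifaces_up;   /* interfaces still registered, loopback included */
    int kern_socks;  /* kernel sockets owned by the module */
    bool cleanup_ran;
};

/* A user socket pins the namespace (modelled on get_net()). */
static void toy_user_sock_open(struct toy_netns *ns) { ns->refcnt++; }

/* A kernel socket deliberately takes no reference. */
static void toy_kern_sock_open(struct toy_netns *ns) { ns->kern_socks++; }

static void toy_cleanup(struct toy_netns *ns)
{
    /* Step 1: interfaces are taken down; loopback is the last one. */
    while (ns->ifaces_up > 0)
        ns->ifaces_up--;
    /* Step 2: no packets can arrive now, so kernel sockets can go. */
    assert(ns->ifaces_up == 0);
    ns->kern_socks = 0;
    ns->cleanup_ran = true;
}

/* Dropping the last reference triggers cleanup. */
static void toy_put(struct toy_netns *ns)
{
    assert(ns->refcnt > 0);
    if (--ns->refcnt == 0)
        toy_cleanup(ns);
}
```

In this model, a namespace with refcnt 1 and one kernel socket is
cleaned up by a single toy_put(), whereas an open user socket keeps
cleanup from running until its reference is dropped too.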

Hope that helps.

--Sowmini
