From mboxrd@z Thu Jan 1 00:00:00 1970
From: ebiederm@xmission.com (Eric W. Biederman)
Subject: Re: [PATCH] net: Convert net_mutex into rw_semaphore and down read it on net->init/->exit
Date: Tue, 14 Nov 2017 21:19:50 -0600
Message-ID: <87shdg8bzd.fsf@xmission.com>
References: <151066759055.14465.9783879083192000862.stgit@localhost.localdomain> <88152c11-a5b5-90f8-be46-99ed6c722064@virtuozzo.com>
Mime-Version: 1.0
Content-Type: text/plain
Cc: Cong Wang, David Miller, vyasevic@redhat.com, kstewart@linuxfoundation.org, pombredanne@nexb.com, Vladislav Yasevich, mark.rutland@arm.com, Greg KH, Alexey Dobriyan, Florian Westphal, Nicolas Dichtel, roman.kapl@sysgo.com, Paul Moore, David Ahern, Daniel Borkmann, lucien xin, Matthias Schiffer, rshearma@brocade.com, LKML, Linux Kernel Network Developers, avagin@virtuozzo.com, gorcunov@virtuozzo.com
To: Kirill Tkhai
Return-path:
In-Reply-To: <88152c11-a5b5-90f8-be46-99ed6c722064@virtuozzo.com> (Kirill Tkhai's message of "Tue, 14 Nov 2017 22:58:15 +0300")
Sender: linux-kernel-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

Kirill Tkhai writes:

> On 14.11.2017 21:39, Cong Wang wrote:
>> On Tue, Nov 14, 2017 at 5:53 AM, Kirill Tkhai wrote:
>>> @@ -406,7 +406,7 @@ struct net *copy_net_ns(unsigned long flags,
>>>
>>>  	get_user_ns(user_ns);
>>>
>>> -	rv = mutex_lock_killable(&net_mutex);
>>> +	rv = down_read_killable(&net_sem);
>>>  	if (rv < 0) {
>>>  		net_free(net);
>>>  		dec_net_namespaces(ucounts);
>>> @@ -421,7 +421,7 @@ struct net *copy_net_ns(unsigned long flags,
>>>  		list_add_tail_rcu(&net->list, &net_namespace_list);
>>>  		rtnl_unlock();
>>>  	}
>>> -	mutex_unlock(&net_mutex);
>>> +	up_read(&net_sem);
>>>  	if (rv < 0) {
>>>  		dec_net_namespaces(ucounts);
>>>  		put_user_ns(user_ns);
>>> @@ -446,7 +446,7 @@ static void cleanup_net(struct work_struct *work)
>>>  	list_replace_init(&cleanup_list, &net_kill_list);
>>>  	spin_unlock_irq(&cleanup_list_lock);
>>>
>>> -	mutex_lock(&net_mutex);
>>> +	down_read(&net_sem);
>>>
>>>  	/* Don't let anyone else find us. */
>>>  	rtnl_lock();
>>> @@ -486,7 +486,7 @@ static void cleanup_net(struct work_struct *work)
>>>  	list_for_each_entry_reverse(ops, &pernet_list, list)
>>>  		ops_free_list(ops, &net_exit_list);
>>>
>>> -	mutex_unlock(&net_mutex);
>>> +	up_read(&net_sem);
>>
>> After your patch setup_net() could run concurrently with cleanup_net(),
>> given that ops_exit_list() is called on error path of setup_net() too,
>> it means ops->exit() now could run concurrently if it doesn't have its
>> own lock. Not sure if this breaks any existing user.
>
> Yes, there will be possible concurrent ops->init() for a net namespace,
> and ops->exit() for another one. I hadn't found pernet operations, which
> have a problem with that. If they exist, they are hidden and not clear seen.
> The pernet operations in general do not touch someone else's memory.
> If suddenly there is one, KASAN should show it after a while.

Certainly the use of hash tables shared between multiple network
namespaces would count. I don't remember how many of these we have, but
there used to be quite a few.

Eric