From: ebiederm@xmission.com (Eric W. Biederman)
To: Alexander Mihalicyn
Cc: Manfred Spraul, Andrew Morton, "linux-kernel@vger.kernel.org",
    Milton Miller, Jack Miller, Pavel Tikhomirov, Davidlohr Bueso,
    Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrei Vagin,
    Christian Brauner
Subject: Re: [PATCH 0/2] shm: omit forced shm destroy if task IPC namespace was changed
Date: Mon, 12 Jul 2021 14:18:25 -0500
Message-ID: <87y2ab9w8u.fsf@disp2133>
In-Reply-To: (Alexander Mihalicyn's message of "Mon, 12 Jul 2021 12:54:58 +0300")
References: <20210706132259.71740-1-alexander.mikhalitsyn@virtuozzo.com>
    <20210709181241.cca57cf83c52964b2cd0dcf0@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Alexander Mihalicyn writes:

> Hello Manfred,
>
> On Sun, Jul 11, 2021 at 2:47 PM Manfred Spraul wrote:
>>
>> Hi Alex,
>>
>>
>> On Sunday, 11 July 2021, Alexander Mihalicyn wrote:
>> >
>> > Hi, Manfred,
>> >
>> > On Sun, Jul 11, 2021 at 12:13 PM Manfred Spraul
>> > wrote:
>> > >
>> > > Hi,
>> > >
>> > >
>> > > On Saturday, 10 July 2021, Alexander Mihalicyn wrote:
>> > >>
>> > >>
>> > >> Now, using setns() syscall, we can construct situation when on
>> > >> task->sysvshm.shm_clist list
>> > >> we have shm items from several (!) IPC namespaces.
>> > >>
>> > >>
>> > > Does this imply that locking is affected as well? According to
>> > > the initial patch, accesses to shm_clist are protected by "the"
>> > > IPC shm namespace rwsem.
>> > > This can't work if the list contains objects from several
>> > > namespaces.
>> >
>> > Of course, you are right. I have to rework this part -> I can add a check into
>> > static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
>> > function and, before adding a new shm to the task list, check that the list is empty OR
>> > that an item which is present on the list is from the same namespace as
>> > current->nsproxy->ipc_ns.
>> >
>> Ok. (Sorry, I have only smartphone internet, thus I could not check
>> the patch fully)
>>
>> > >> I've proposed a change which keeps the old behaviour of setns() but
>> > >> fixes double free.
>> > >>
>> > > Assuming that locking works, I would consider this as a namespace
>> > > design question: Do we want to support that a task contains shm
>> > > objects from several ipc namespaces?
>> >
>> > This depends on what we mean by "task contains shm objects from
>> > several ipc namespaces". There are two meanings:
>> >
>> > 1. Task has attached shm object from different ipc namespaces
>> >
>> > We already support that by design. When we change the namespace
>> > using unshare(CLONE_NEWIPC), even with
>> > sysctl shm_rmid_forced=1, we do not detach all ipc's from the task!
>>
>> OK. Thus shm and sem have different behavior anyways.
>>
>> >
>> > 2. Task task->sysvshm.shm_clist list has items from different IPC namespaces.
>> >
>> > I'm not sure, do we need that or not. But I'm ready to prepare a patch
>> > for any of the options which we choose:
>> > a) just add exit_shm(current)+shm_init_task(current);
>> > b) prepare PATCHv2 with appropriate check in the newseg() to prevent
>> > adding new items from different namespace to the list
>> > c) rework algorithm so we can safely have items from different
>> > namespaces in task->sysvshm.shm_clist
>> >
>> Before you write something, let's wait what the others say. I don't
>> qualify as shm expert
>>
>> a) is user space visible, without any good excuse
>
> yes, but maybe we decide that this is not so critical?
> We need more people here :)

It is barely visible. You have to do something very silly to see this
happening. It is probably ok, but verifying that nothing cares, so
that we can safely backport the change, is probably much more work
than just updating the list to handle shmids for multiple namespaces.

>> c) is probably highest amount of Changes
>
> yep. but ok, I will prepare patches fast.

Given that this is a bug I think c) is the safest option.

A couple of suggestions.

1) We can replace the test "shm_creator != NULL" with
   "!list_empty(&shp->shm_clist)" and remove shm_creator, along with
   replacing "shm_creator = NULL" with
   "list_del_init(&shp->shm_clist)".

2) We can update shmat to do "list_del_init(&shp->shm_clist)".
   The last unmap will still shm_destroy the shm segment as
   ns->shm_rmid_forced is set.

   For a multi-threaded process I think this will nicely clean up
   the clist, and make it clear that the clist only cares about
   those segments that have been created but never attached.

3) Put a non-reference-counted struct ipc_namespace pointer in
   struct shmid_kernel, and use it to remove the namespace parameter
   from shm_destroy.

I think that is enough to fix this bug with no changes in semantics,
no additional memory consumed, and an implementation that is easier
to read and perhaps a little faster.

Eric
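
To make suggestions 1) to 3) above concrete, here is a minimal userspace
sketch. It is not kernel code and not a proposed patch: the list helpers
are small re-implementations that only borrow the kernel's names, the
structures are reduced to the fields mentioned in this mail, locking is
left out entirely, and "struct task", "shm_attach", "shm_detach",
"shm_has_creator", the "ns" and "destroyed" fields and "clist_demo" are
hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel list helpers (same names only). */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }
static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	INIT_LIST_HEAD(e);	/* safe to call again on an unlinked entry */
}

/* Heavily reduced, hypothetical models of the kernel structures. */
struct ipc_namespace { bool shm_rmid_forced; int destroyed; };
struct shmid_kernel {
	struct list_head shm_clist;	/* on the creator's list, or empty */
	struct ipc_namespace *ns;	/* suggestion 3: non-refcounted pointer */
	int shm_nattch;
};
struct task { struct list_head shm_clist; };

/* Suggestion 3: the namespace comes from the segment, not a parameter. */
static void shm_destroy(struct shmid_kernel *shp)
{
	list_del_init(&shp->shm_clist);	/* stands in for "shm_creator = NULL" */
	shp->ns->destroyed++;
}

static void newseg(struct task *t, struct ipc_namespace *ns,
		   struct shmid_kernel *shp)
{
	INIT_LIST_HEAD(&shp->shm_clist);
	shp->ns = ns;
	shp->shm_nattch = 0;
	list_add(&shp->shm_clist, &t->shm_clist);
}

/* Suggestion 1: "has a creator" becomes a list_empty() test. */
static bool shm_has_creator(const struct shmid_kernel *shp)
{
	return !list_empty(&shp->shm_clist);
}

/* Suggestion 2: attaching removes the segment from the creator's list. */
static void shm_attach(struct shmid_kernel *shp)
{
	list_del_init(&shp->shm_clist);
	shp->shm_nattch++;
}

/* The last unmap destroys the segment if its own namespace forces it. */
static void shm_detach(struct shmid_kernel *shp)
{
	if (--shp->shm_nattch == 0 && shp->ns->shm_rmid_forced)
		shm_destroy(shp);
}

/* Task exit: entries on the list may belong to different namespaces. */
static void exit_shm(struct task *t)
{
	while (!list_empty(&t->shm_clist)) {
		struct shmid_kernel *shp = (struct shmid_kernel *)
			((char *)t->shm_clist.next -
			 offsetof(struct shmid_kernel, shm_clist));
		if (shp->ns->shm_rmid_forced && shp->shm_nattch == 0)
			shm_destroy(shp);
		else
			list_del_init(&shp->shm_clist);
	}
}

/* Returns the total number of segments destroyed across both namespaces. */
int clist_demo(void)
{
	struct ipc_namespace ns_a = { .shm_rmid_forced = true };
	struct ipc_namespace ns_b = { .shm_rmid_forced = true };
	struct task t;
	struct shmid_kernel s1, s2, s3;

	INIT_LIST_HEAD(&t.shm_clist);
	newseg(&t, &ns_a, &s1);		/* created in namespace A */
	newseg(&t, &ns_b, &s2);		/* created after a switch to namespace B */
	newseg(&t, &ns_b, &s3);

	shm_attach(&s3);		/* s3 leaves the clist */
	assert(!shm_has_creator(&s3));
	shm_detach(&s3);		/* last unmap destroys s3 in namespace B */
	exit_shm(&t);			/* s2 goes in namespace B, s1 in namespace A */

	assert(list_empty(&t.shm_clist));
	return ns_a.destroyed + ns_b.destroyed;
}
```

In this model each segment is destroyed exactly once and against the
namespace it was created in, even though one task's list holds segments
from two namespaces. What the sketch cannot show is the locking question
Manfred raised; that still has to be answered in the real code.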