From mboxrd@z Thu Jan 1 00:00:00 1970 From: ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org (Eric W. Biederman) Subject: Re: Possible bug: detached mounts difficult to cleanup Date: Wed, 11 Jan 2017 15:37:36 +1300 Message-ID: <87shoqtj7z.fsf@xmission.com> References: <20170111012454.GB2497@templeofstupid.com> <87fukqwcue.fsf@xmission.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <87fukqwcue.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> (Eric W. Biederman's message of "Wed, 11 Jan 2017 15:27:05 +1300") List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org To: Krister Johansen Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org, Al Viro List-Id: containers.vger.kernel.org ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org (Eric W. Biederman) writes: > Krister Johansen writes: > >> Gents, >> >> I wondered if a naive solution could re-walk the list of mounts >> processed in umount_tree() and if all of the detached but locked mounts >> had a refcount that indicated they're unused, they could be unlocked and >> unmounted. At least in the case of the containers I'm dealing with, the >> the container software should be ensuring that nothing in the container >> has a reference on anything that's under the detached portion of the >> tree. However, there's probably a better way to do this. > > So if the code is working correctly that should already happen. > > The design is for the parent mount to hold a reference to the submounts. > And when the reference on the parent drops to 0. The references on > all of the submounts will also be dropped. > > I was hoping to read the code and point it out to you quickly, but I am > not seeing it now. I am wondering if in all of the refactoring of that > code something was dropped/missed :( > > Somewhere there is supposed to be the equivalent of: > pin_insert_group(&p->mnt_umount, &p->mnt_parent->mnt, &unmounted); > when we unhash those mounts because the last count has gone away. > Either it is very sophisticated or I am missing it. Grr.... Ok. I see the code now, and it should be doing the right thing. During umount_tree the code calls pin_insert_group(...) with the last paramenter being NULL. That adds the mount to one or two lists. The mnt_pins list of the parent mount and the &unmounted hlist. Then later when the parent's cleanup_mnt is called if the mnt_pins still has entries mnt_pin_kill is called. For every mount on the mnt_pins list drop_mountpoint is called. Which calls dput and mntput. So that is how your references are supposed to be freed. Which leaves the question why aren't your mounts being freed? Is a file descriptor perhaps from a mmaped executable holding a mount reference? Perhaps you want to manually unmount everything and see if unmount will fail on some mount and let you see which mount has a reference to it. Eric From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from out02.mta.xmission.com ([166.70.13.232]:50216 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1760307AbdAKClg (ORCPT ); Tue, 10 Jan 2017 21:41:36 -0500 From: ebiederm@xmission.com (Eric W. Biederman) To: Krister Johansen Cc: Al Viro , linux-fsdevel@vger.kernel.org, containers@lists.linux-foundation.org References: <20170111012454.GB2497@templeofstupid.com> <87fukqwcue.fsf@xmission.com> Date: Wed, 11 Jan 2017 15:37:36 +1300 In-Reply-To: <87fukqwcue.fsf@xmission.com> (Eric W. Biederman's message of "Wed, 11 Jan 2017 15:27:05 +1300") Message-ID: <87shoqtj7z.fsf@xmission.com> MIME-Version: 1.0 Content-Type: text/plain Subject: Re: Possible bug: detached mounts difficult to cleanup Sender: linux-fsdevel-owner@vger.kernel.org List-ID: ebiederm@xmission.com (Eric W. Biederman) writes: > Krister Johansen writes: > >> Gents, >> >> I wondered if a naive solution could re-walk the list of mounts >> processed in umount_tree() and if all of the detached but locked mounts >> had a refcount that indicated they're unused, they could be unlocked and >> unmounted. At least in the case of the containers I'm dealing with, the >> the container software should be ensuring that nothing in the container >> has a reference on anything that's under the detached portion of the >> tree. However, there's probably a better way to do this. > > So if the code is working correctly that should already happen. > > The design is for the parent mount to hold a reference to the submounts. > And when the reference on the parent drops to 0. The references on > all of the submounts will also be dropped. > > I was hoping to read the code and point it out to you quickly, but I am > not seeing it now. I am wondering if in all of the refactoring of that > code something was dropped/missed :( > > Somewhere there is supposed to be the equivalent of: > pin_insert_group(&p->mnt_umount, &p->mnt_parent->mnt, &unmounted); > when we unhash those mounts because the last count has gone away. > Either it is very sophisticated or I am missing it. Grr.... Ok. I see the code now, and it should be doing the right thing. During umount_tree the code calls pin_insert_group(...) with the last paramenter being NULL. That adds the mount to one or two lists. The mnt_pins list of the parent mount and the &unmounted hlist. Then later when the parent's cleanup_mnt is called if the mnt_pins still has entries mnt_pin_kill is called. For every mount on the mnt_pins list drop_mountpoint is called. Which calls dput and mntput. So that is how your references are supposed to be freed. Which leaves the question why aren't your mounts being freed? Is a file descriptor perhaps from a mmaped executable holding a mount reference? Perhaps you want to manually unmount everything and see if unmount will fail on some mount and let you see which mount has a reference to it. Eric