Linux Container Development
From: ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org (Eric W. Biederman)
To: Andrey Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
Cc: linux-fsdevel
	<linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	LKML <linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Linux Containers
	<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
	Andrei Vagin <avagin-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>,
	Alexander Viro
	<viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
Subject: Re: [RFC][PATCH] mount: In mark_umount_candidates and __propogate_umount visit each mount once
Date: Thu, 13 Oct 2016 21:45:00 -0500
Message-ID: <87wphb4pjn.fsf@x220.int.ebiederm.org>
In-Reply-To: <CANaxB-xPkgdyeg0z6TvExMfyy4uOC+Nu4Q99WpCscNKMWz8VPg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> (Andrey Vagin's message of "Thu, 13 Oct 2016 19:31:22 -0700")

Andrey Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org> writes:

> On Thu, Oct 13, 2016 at 2:46 PM, Andrei Vagin <avagin-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org> wrote:
>> On Thu, Oct 13, 2016 at 02:53:46PM -0500, Eric W. Biederman wrote:
>>>
>>> Andrei Vagin pointed out that the time to execute propagate_umount can go
>>> non-linear (and take a ludicrous amount of time) when the mount
>>> propagation trees of the mounts to be unmounted by a lazy unmount
>>> overlap.
>>>
>>> Solve this in the most straightforward way possible, by adding a new
>>> mount flag to mark parts of the mount propagation tree that have been
>>> visited, and use that mark to skip parts of the mount propagation tree
>>> that have already been visited during an unmount.  This guarantees
>>> that each mountpoint in the possibly overlapping mount propagation
>>> trees will be visited exactly once.
>>>
>>> Add the functions propagation_visit_next and propagation_revisit_next
>>> to coordinate setting and clearing the visited mount mark.
>>>
>>> Here is a script to generate such mount tree:
>>> $ cat run.sh
>>> mount -t tmpfs test-mount /mnt
>>> mount --make-shared /mnt
>>> for i in `seq $1`; do
>>>         mkdir /mnt/test.$i
>>>         mount --bind /mnt /mnt/test.$i
>>> done
>>> cat /proc/mounts | grep test-mount | wc -l
>>> time umount -l /mnt
>>> $ for i in `seq 10 16`; do echo $i; unshare -Urm bash ./run.sh $i; done
>>>
>>> Here are the performance numbers with and without the patch:
>>>
>>> mounts | before | after (real sec)
>>> -----------------------------
>>>   1024 |  0.071 | 0.024
>>>   2048 |  0.184 | 0.030
>>>   4096 |  0.604 | 0.040
>>>   8192 |  4.471 | 0.043
>>>  16384 | 34.826 | 0.082
>>>  32768 |        | 0.151
>>>  65536 |        | 0.289
>>> 131072 |        | 0.659
>>>
>>> Fixing this performance problem, reported by Andrei Vagin, is part of
>>> the work to fix CVE-2016-6213.
>>>
>>> Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>>> Reported-by: Andrei Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
>>> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
>>> ---
>>>
>>> Andrei, can you take a look at this patch and see if you can see any
>>> problems?  My limited testing suggests this approach does a much better
>>> job of solving the problem you were seeing, with the time now looking
>>> almost linear in the number of mounts.
>>
>> I read this patch and I like the idea.
>>
>> Then I ran my tests and one of them doesn't work with this patch.
>> I haven't found a reason yet.
>
>>> +     for (m = propagation_visit_next(parent, parent); m;
>>> +                     m = propagation_visit_next(m, parent)) {
>>>               struct mount *child = __lookup_mnt_last(&m->mnt,
>>>                                               mnt->mnt_mountpoint);
>
> The reason is that this loop is called for different "mnt", but
> it is executed only once with this optimization.
>
> So I think the idea to mark parent will not work, because one parent
> can have a few children which have to be umounted.

Good catch.  So what needs to be marked is the parent mount and
mountpoint combination, which is effectively the child mount.
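
To make the difference concrete, here is a minimal toy sketch in C.  It
is not kernel code: struct toy_mount, struct toy_tree, walk_mark_parent
and walk_mark_child are hypothetical names invented for illustration,
and the "propagation tree" is just a fixed array of parents that each
carry one child per mountpoint.  It only shows that a mark kept on the
parent hides the children at a second mountpoint, while a mark kept on
the child does not.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPARENTS 3   /* parents reached by the propagation walk */
#define NPOINTS  2   /* distinct mountpoints under each parent  */

/* Toy stand-in for a mount; this is NOT the kernel's struct mount. */
struct toy_mount {
    bool visited;    /* hypothetical "already visited" mark */
};

struct toy_tree {
    struct toy_mount parent[NPARENTS];
    struct toy_mount child[NPARENTS][NPOINTS];
};

/*
 * Walk all parents for one mountpoint, keeping the mark on the PARENT.
 * Returns how many children at that mountpoint were reached.
 */
int walk_mark_parent(struct toy_tree *t, int mountpoint)
{
    int found = 0;

    for (size_t i = 0; i < NPARENTS; i++) {
        if (t->parent[i].visited)
            continue;               /* skipped for every later mountpoint */
        t->parent[i].visited = true;
        t->child[i][mountpoint].visited = true;
        found++;
    }
    return found;
}

/*
 * Same walk, but the mark lives on the CHILD, i.e. on the
 * (parent, mountpoint) combination.
 */
int walk_mark_child(struct toy_tree *t, int mountpoint)
{
    int found = 0;

    for (size_t i = 0; i < NPARENTS; i++) {
        struct toy_mount *c = &t->child[i][mountpoint];

        if (c->visited)
            continue;               /* only this exact child is skipped */
        c->visited = true;
        found++;
    }
    return found;
}
```

With a zero-initialized tree, calling walk_mark_parent for mountpoint 0
and then for mountpoint 1 reaches the children only on the first call,
which is the failure Andrey describes.  walk_mark_child reaches every
child once per mountpoint and still skips a repeated walk of the same
mountpoint.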

I still think replacing propagation_next and fixing the propagation
walk is the way to go.  But it sounds like, to make things work, the
__lookup_mnt_last call needs to be moved into the propagation walk function.
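
As a rough idea of the shape that could take, the sketch below folds the
child lookup into the walk itself, so the walk hands back children that
have not been visited yet rather than parents.  This is a toy model
under assumptions: struct toy_child, toy_visit_next and NPEERS are
hypothetical, the peers are a flat array instead of a real propagation
tree, and the array access stands in for the __lookup_mnt_last call.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPEERS 4     /* number of peers in the toy propagation walk */

/* Toy model only; these are not the kernel's data structures. */
struct toy_child {
    bool present;    /* is anything mounted at the mountpoint on this peer? */
    bool visited;    /* hypothetical per-child visited mark */
};

/*
 * Return the index of the next peer after `pos` whose child exists and
 * has not been visited yet, marking that child as visited.  Pass -1 to
 * start the walk; -1 is returned when the walk is finished.
 */
int toy_visit_next(struct toy_child *children, int pos)
{
    for (int i = pos + 1; i < NPEERS; i++) {
        /* The lookup happens inside the walk, not in the caller. */
        struct toy_child *c = &children[i];

        if (!c->present || c->visited)
            continue;
        c->visited = true;
        return i;
    }
    return -1;
}
```

A caller would loop with
"for (i = toy_visit_next(kids, -1); i >= 0; i = toy_visit_next(kids, i))",
mirroring the for loop in the quoted patch, except that the loop body
never sees a parent whose child is missing or already handled.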

That doesn't feel too hard.  I will have to see what the code looks like.

Eric
