linux-api.vger.kernel.org archive mirror
From: Andrew Vagin <avagin-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>
To: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: James Bottomley
	<James.Bottomley-JuX6DAaQMKPCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>,
	Andrey Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>,
	Linux API <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Linux Containers
	<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
	LKML <linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Alexander Viro
	<viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
	"criu-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org"
	<criu-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>,
	"Michael Kerrisk (man-pages)"
	<mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	linux-fsdevel
	<linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
Date: Mon, 1 Aug 2016 16:01:49 -0700
Message-ID: <20160801230147.GA32309@outlook.office365.com>
In-Reply-To: <87h9b8e2v7.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>

On Fri, Jul 29, 2016 at 01:05:48PM -0500, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> 
> > Hi Eric,
> >
> > On 07/28/2016 02:56 PM, Eric W. Biederman wrote:
> >> "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> >>
> >>> On 07/26/2016 10:39 PM, Andrew Vagin wrote:
> >>>> On Tue, Jul 26, 2016 at 09:17:31PM +0200, Michael Kerrisk (man-pages) wrote:
> >>
> >>>> If we want to compare two file descriptors of the current process,
> >>>> it is one of cases for which kcmp can be used. We can call kcmp to
> >>>> compare two namespaces which are opened in other processes.
> >>>
> >>> Is there really a use case there? I assume we're talking about the
> >>> scenario where a process in one namespace opens a /proc/PID/ns/*
> >>> file descriptor and passes that FD to another process via a UNIX
> >>> domain socket. Is that correct?
> >>>
> >>> So, supposing that we want to build a map of the relationships
> >>> between namespaces using the proposed kcmp() API, and there are
> >>> say N namespaces? Does this mean we make (N * (N-1) / 2) calls
> >>> to kcmp()?
> >>
> >> Potentially.  The numbers are small enough O(N^2) isn't fatal.
> >
> > Define "small", please.
> >
> > O(N^2) makes me nervous about what other use cases lurk out
> > there that may get bitten by this.
> 
> Worst case for N (One namespace per thread) is about 60k.
> A typical heavy use case may be 1000 namespaces of any type.
> So we are talking about O(N^2) that rarely happens and should be done in
> a couple of seconds.
> 
> >> Where kcmp shines is that it allows migration to happen, inode numbers
> >> to change (which they very much will today), and still have things work.
> >
> >
> >> We can keep it O(Nlog(N)) by taking advantage of not just the equality
> >> but the ordering relationship.  Although Ugh.
> >
> > Yes, that sounds pretty ugly...
> 
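To put rough numbers on the two approaches discussed above, here is a small
sketch. It is only an illustration: cmp() is a hypothetical stand-in for a
kcmp() that returns an ordering, and hidden_id stands in for whatever value
the kernel would order by. No real kcmp() call is made.

```python
from functools import cmp_to_key
from itertools import combinations
import random

def make_cmp(hidden_id, calls):
    """Build a three-way comparison (<0, 0, >0) that counts its own calls."""
    def cmp(a, b):
        calls[0] += 1
        return (hidden_id[a] > hidden_id[b]) - (hidden_id[a] < hidden_id[b])
    return cmp

def group_pairwise(handles, cmp):
    """Use equality only: compare every pair, N*(N-1)/2 calls."""
    rep = {h: h for h in handles}
    for a, b in combinations(handles, 2):
        if cmp(a, b) == 0:
            rep[b] = rep[a]
    groups = {}
    for h in handles:
        groups.setdefault(rep[h], []).append(h)
    return list(groups.values())

def group_sorted(handles, cmp):
    """Use the ordering: sort, then merge equal neighbours."""
    ordered = sorted(handles, key=cmp_to_key(cmp))
    groups = []
    for h in ordered:
        if groups and cmp(groups[-1][0], h) == 0:
            groups[-1].append(h)
        else:
            groups.append([h])
    return groups

random.seed(0)
N = 200                                   # number of open namespace handles
handles = [f"fd{i}" for i in range(N)]
hidden_id = {h: random.randrange(50) for h in handles}  # 50 distinct objects

pair_calls, sort_calls = [0], [0]
pair_groups = group_pairwise(handles, make_cmp(hidden_id, pair_calls))
sort_groups = group_sorted(handles, make_cmp(hidden_id, sort_calls))
print("pairwise calls:", pair_calls[0], "sorted calls:", sort_calls[0])
```

Both functions find the same groups of identical objects; only the number of
comparisons differs. The sorted variant works only if the comparison is a
consistent ordering for the whole run.
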
> Actually having thought about this a little more if kcmp returns an
> ordering by inode and migration preserves the relative order of
> the inodes (which should just be a creation order) it should be quite
> solvable.
> 
> Switch from an order by inode number to an order by object creation
> time, and guarantee that all creations have an order (which with
> tasklist_lock we practically already have) and it should be even easier
> to create.  (A 64bit nanosecond resolution timestamp is good for 584
> years of uptime).  A 64bit number that increments each time an object is
> created should have an even better lifespan.
> 
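The lifetime estimate above can be checked with a little arithmetic. This is
a sketch under stated assumptions: the counter is unsigned, a year is 365.25
days, and the rate of one million object creations per second is a made-up
upper bound chosen only for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_until_wrap(bits, events_per_second):
    """Years before an unsigned counter of `bits` bits wraps at a steady rate."""
    return (2 ** bits) / events_per_second / SECONDS_PER_YEAR

# A timestamp that ticks once per nanosecond.
ns_clock = years_until_wrap(64, 1e9)

# A counter bumped once per object creation (assumed rate: 1e6 per second).
per_create = years_until_wrap(64, 1e6)

print(f"nanosecond clock: {ns_clock:.1f} years")
print(f"creation counter: {per_create:.1f} years")
```

A creation counter advances far more slowly than a nanosecond clock, which
is why it lasts longer for the same width.
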
> I don't know if we can find a way to give that guarantee for other kcmp
> comparisons but it is worth a thought.
> 
> >> One disadvantage of
> >> kcmp currently is that the way the ordering relationship is defined
> >> the order is not preserved over migration :(
> >
> > So, does kcmp() fully solve the problem(s) at hand? It sounds like
> > not, if I understand your last point correctly.
> 
> There are 3 possibilities I see for nested migration (migration in
> migration), ordered by implementation difficulty.
> 1) Have a clear signal that migration happened and a nested migration
>    needs to restart.
> 2) Use kcmp so that only the relative order needs to be preserved.
> 3) Preserve the device number and inode numbers.
> 
> At a practical level I think (2) may actually in net be the simplest.
> It requires a little more care to implement and you have to opt in,
> but it should not require any rolling back of activity (merely careful
> ordering of object creation).
> 
> I definitely like kcmp knowing how to compare things by inode
> (aka st_dev, st_ino) because then even if you have to restart
> the comparisons after a migration the exact details you are comparing
> are hidden and so it is easier to support and harder to get wrong.
> 
> I can imagine how to preserve inode numbers by creating a new
> nsfs instance and using the old inode numbers upon restore.  I don't
> currently see how we could possibly preserve st_dev over migration short of
> a device number namespace.

I think we can avoid comparing st_dev if we also compare the inode numbers
of the parent user namespaces.

Namespaces look like a tree where user namespaces are directories and
other namespaces are files.

A namespace can be described by a path in this imaginary file system,
which looks like /userns1/userns2/XXXns.

In this case we need to guarantee that names are unique inside each
directory and that they do not change over migration.
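
A minimal sketch of this path model, to make the idea concrete. It is only an
illustration and not a kernel interface: the Namespace class, the name field,
and ns_path() are hypothetical. The name stands for whatever identifier
(for example an inode number) is unique among siblings and stable over
migration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Namespace:
    name: str                             # unique among siblings, stable over migration
    kind: str                             # "user", "net", "pid", ...
    owner: Optional["Namespace"] = None   # owning user namespace, None at the top

def ns_path(ns):
    """Walk up through the owning user namespaces and build the path."""
    parts = []
    while ns is not None:
        parts.append(ns.name)
        ns = ns.owner
    return "/" + "/".join(reversed(parts))

def same_namespace(a, b):
    """Two handles refer to the same namespace if their paths match."""
    return ns_path(a) == ns_path(b)

userns1 = Namespace("userns1", "user")
userns2 = Namespace("userns2", "user", owner=userns1)
netns = Namespace("netns", "net", owner=userns2)
other = Namespace("netns", "net", owner=userns1)  # same name, different directory

print(ns_path(netns))
print(same_namespace(netns, other))
```

Two namespaces may share a name as long as they have different owners, so
only the full path has to be unique, and st_dev never enters the comparison.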

> 
> So if we are going to continue with making device numbers be a legacy
> attribute applications should not care about, we need a way to compare
> things without looking at st_dev.  Which brings us back to kcmp.
> 
> Hmm.  Hotplugging a disk and plugging it back likely will change the
> device number and give the same kind of challenge with st_dev (although
> you can't keep a file descriptor open across that kind of event).  So
> certainly a hotplug event on a device should be enough to say don't care
> about the device number.
> 
> Eric
> 
