Re: [RFC] netns / sysfs interaction

From: Al Viro <viro@ZenIV.linux.org.uk>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: linux-kernel@vger.kernel.org, htejun@gmail.com,
	linux-fsdevel@vger.kernel.org, gregkh@suse.de
Subject: Re: [RFC] netns / sysfs interaction
Date: Mon, 7 Jan 2008 10:24:17 +0000
Message-ID: <20080107102417.GY27894@ZenIV.linux.org.uk>
In-Reply-To: <m1y7b2j6no.fsf@ebiederm.dsl.xmission.com>

On Mon, Jan 07, 2008 at 03:01:47AM -0700, Eric W. Biederman wrote:
> Al Viro <viro@ZenIV.linux.org.uk> writes:

> What appears to be a clean solution is to have multiple sysfs superblocks
> and to capture the namespace at mount time.

It is not a clean solution at all.  In particular, it leaves you with one
hell of a coherency problem between these trees.

>  For planning purposes there
> is a device namespace on the drawing board as well, so you can keep
> your same major minor numbers for devices (tty names, network attached
> disk) in a migration event.

Yes, I'm quite sure there's more coming.  Which is why I'm asking now,
before we are even deeper into that... area.

>   This means netns isn't the only
> namespace we will have to worry about with sysfs before it is all
> done.

Exciting.
 
> > 	a) what happens if I do chdir("/sys/class/net/eth42/") and then
> > migrate?
> 
> It shouldn't be any better or worse than any other filesystem.  The
> prerequisite for an OS-level migration is that the set of all
> namespaces and all of the processes that use them all go together.
> As we recreate the virtual filesystem and virtual devices we should
> recreate a sysfs that is essentially the same.  I doubt we will go
> to the trouble of keeping the unnamed device number we are mounted on
> and the inode numbers the same, but otherwise we should be able to
> recreate an identical-looking sysfs (barring real hardware changes).

Have you even bothered to read the pathname in question?  Please, do so.
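
One reading of question (a) is that a working directory refers to the directory object itself, not to the string that was passed to chdir().  A minimal userspace sketch of that behaviour on an ordinary filesystem follows; it is an illustration only, not sysfs or migration code.  The helper name cwd_survives_rename and the scratch names eth42/eth43 are hypothetical, chosen to echo the pathname in the question.

```python
import os
import tempfile


def cwd_survives_rename():
    """Show that the cwd follows the directory object across a rename.

    Returns (inode_before, inode_after, old_name_gone).
    """
    start = os.getcwd()
    with tempfile.TemporaryDirectory() as top:
        old = os.path.join(top, "eth42")  # hypothetical name, echoing the thread
        new = os.path.join(top, "eth43")  # hypothetical new name
        os.mkdir(old)
        os.chdir(old)
        try:
            before = os.stat(".").st_ino
            # Rename the directory we are sitting in, via its absolute path.
            os.rename(old, new)
            after = os.stat(".").st_ino
            old_name_gone = not os.path.exists(old)
        finally:
            # Leave the scratch directory so it can be cleaned up.
            os.chdir(start)
        return before, after, old_name_gone


if __name__ == "__main__":
    print(cwd_survives_rename())
```

The process keeps its reference to the same inode while the name it used to get there no longer resolves, so whatever replaces the tree after a migration has to account for references that are already held.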

> > 	c) what happens to open files?  E.g. to /sys/class/net - say it,
> > if migration happens between two getdents(2).
> 
> How do we restore the internal state?  Hmm.    The rule is that you
> are only guaranteed to see directory entries that existed
> both before you started to read the directory and after you finished.
> 
> The cheap solution is just to declare everything hotplugged and
> deleted and recreated, removing any meaningful guarantee of seeing
> anything.
> 
> Since we only depend upon the value of f_pos that should largely work.
> 
> If we ever figure out how to preserve inode numbers over a migration
> event the current scheme will work unmodified but that sounds like
> more pain than it is worth.
> 

Inode numbers?  Are you suggesting a wholesale replacement of every struct
file referenced by descriptor tables, all the way down to inodes?  May I see
the patches for that, please?
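
The directory-read rule quoted above (only entries that exist both before the read starts and after it finishes are guaranteed to be seen) can be observed from userspace.  The sketch below is an illustration on an ordinary temporary directory, not on sysfs; the helper name read_with_churn and the file names are hypothetical.  It starts a directory read, removes and creates entries, then finishes the read.

```python
import os
import tempfile


def read_with_churn(stable, doomed, late):
    """Read a directory while entries are removed and added mid-read.

    stable: names present for the whole read
    doomed: names removed after the read has started
    late:   names created after the read has started
    Returns the list of names the reader actually saw.
    """
    with tempfile.TemporaryDirectory() as top:
        for name in list(stable) + list(doomed):
            open(os.path.join(top, name), "w").close()

        seen = []
        with os.scandir(top) as it:
            # Start the read by fetching a single entry.
            first = next(it, None)
            if first is not None:
                seen.append(first.name)

            # Change the directory between reads.
            for name in doomed:
                os.unlink(os.path.join(top, name))
            for name in late:
                open(os.path.join(top, name), "w").close()

            # Finish the read.
            seen.extend(entry.name for entry in it)
        return seen


if __name__ == "__main__":
    stable = [f"eth{i}" for i in range(2000)]
    doomed = [f"gone{i}" for i in range(50)]
    late = [f"new{i}" for i in range(50)]
    seen = read_with_churn(stable, doomed, late)
    print("stable entries missed:", len(set(stable) - set(seen)))
    print("removed entries still seen:", len(set(doomed) & set(seen)))
    print("late entries seen:", len(set(late) & set(seen)))
```

Only the first count is covered by the rule; whether removed or late entries show up depends on the filesystem and on how much the C library buffered in its first read, which is why declaring everything deleted and recreated leaves a reader with no guarantee at all.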

> Third, when the goal is isolation and not migration (a better chroot)
> then our hardware never changes.

... and you have quite a bit of system state (starting with those net:eth0
symlinks, etc.) visible in there, not just the hardware.

> The idea is supporting multiple superblocks for sysfs:
> 
>   Ultimately capturing the relevant namespace at mount time
>   and if we don't have a superblock for that namespace creating
>   a new one.
> 
>   So we have one sysfs dirent tree and multiple dentry trees.
> 
>   The tricky parts are rename/move and blocking mount/unmount requests
>   for sysfs until we complete the rename operation calling d_move
>   everywhere.

Excuse me, _what_?  Are you seriously suggesting going through all dentry
trees, doing d_move() in each?  I want to see your locking.  It's promising
to be worse than devfs had ever been.  Much worse.


Thread overview: 3+ messages
2008-01-07  7:23 [RFC] netns / sysfs interaction Al Viro
2008-01-07 10:01 ` Eric W. Biederman
2008-01-07 10:24   ` Al Viro [this message]
