Re: [RFC] netns / sysfs interaction

public inbox for linux-kernel@vger.kernel.org

From: Al Viro <viro@ZenIV.linux.org.uk>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: linux-kernel@vger.kernel.org, htejun@gmail.com,
	linux-fsdevel@vger.kernel.org, gregkh@suse.de
Subject: Re: [RFC] netns / sysfs interaction
Date: Mon, 7 Jan 2008 10:24:17 +0000
Message-ID: <20080107102417.GY27894@ZenIV.linux.org.uk>
In-Reply-To: <m1y7b2j6no.fsf@ebiederm.dsl.xmission.com>

On Mon, Jan 07, 2008 at 03:01:47AM -0700, Eric W. Biederman wrote:
> Al Viro <viro@ZenIV.linux.org.uk> writes:

> What appears to be a clean solution is to have multiple sysfs superblocks
> and to capture the namespace at mount time.

It is not a clean solution at all.  In particular, it leaves you with a hell
of a coherency problem between these trees.

>  For planning purposes there
> is a device namespace on the drawing board as well, so you can keep
> your same major minor numbers for devices (tty names, network attached
> disk) in a migration event.

Yes, I'm quite sure there's more coming.  Which is why I'm asking now,
before we are even deeper into that... area.

>   This means netns isn't the only
> namespace we will have to worry about with sysfs before it is all
> done.

Exciting.
 
> > 	a) what happens if I do chdir("/sys/class/net/eth42/") and then
> > migrate?
> 
> It shouldn't be any better or worse than any other filesystem.  The
> prerequisite for an OS-level migration is that the set of all
> namespaces and all of the processes that use them all go together.
> As we recreate the virtual filesystem and virtual devices we should
> recreate a sysfs that is essentially the same.  I doubt we will go
> to the trouble of keeping the unnamed device number we are mounted on
> and the inode numbers the same, but otherwise we should be able to
> recreate an identical looking sysfs (barring real hardware changes).

Have you even bothered to read the pathname in question?  Please, do so.
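
For readers following the thread, here is a minimal sketch of what question
(a) turns on.  It uses an ordinary temporary directory, not sysfs, and the
helper name cwd_survives_rename and the file "mtu" are hypothetical.  It
only shows that a process that did chdir() holds the directory object
itself, so the name it used to get there ("eth42" here) can change
underneath it.  It does not model namespaces or migration.

```python
import os
import tempfile


def cwd_survives_rename():
    """Return (relative_read_ok, basename_of_cwd_after_rename).

    Hypothetical helper: chdir into a directory named "eth42", rename that
    directory to "eth0" from outside, then check what the process still sees.
    """
    saved = os.getcwd()
    with tempfile.TemporaryDirectory() as root:
        root = os.path.realpath(root)
        old = os.path.join(root, "eth42")
        new = os.path.join(root, "eth0")
        os.mkdir(old)
        with open(os.path.join(old, "mtu"), "w") as f:
            f.write("1500\n")
        try:
            os.chdir(old)          # process now holds the directory itself
            os.rename(old, new)    # the name changes underneath it
            with open("mtu") as f:  # relative lookup goes through the cwd
                ok = f.read() == "1500\n"
            name = os.path.basename(os.getcwd())
        finally:
            os.chdir(saved)        # leave before the temp dir is removed
    return ok, name
```

On Linux the relative open still succeeds and getcwd() reports the new
name.  Whatever "recreate an identical looking sysfs" means, it has to say
what such a held directory becomes.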

> > 	c) what happens to open files?  E.g. to /sys/class/net - say it,
> > if migration happens between two getdents(2).
> 
> How do we restore the internal state?  Hmm.    The rule is that you
> are only guaranteed to see directory entries that existed
> both before you started to read the directory and after you finished.
> 
> The cheap solution is just to declare everything hotplugged and
> deleted and recreated.  Removing any meaningful guarantee of seeing
> anything.
> 
> Since we only depend upon the value of f_pos that should largely work.
> 
> If we ever figure out how to preserve inode numbers over a migration
> event the current scheme will work unmodified but that sounds like
> more pain than it is worth.
> 

Inode numbers?  Are you suggesting a wholesale replacement of every struct
file referenced by descriptor tables, all the way down to inodes?  May I see
the patches for that, please?
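
A minimal sketch of the open-file concern, again on an ordinary temporary
directory rather than sysfs (open_fd_pins_inode is a hypothetical helper).
It assumes "recreated" means a new directory appears under the old name,
and shows that an already-open descriptor keeps referring to the old
object, so matching names or inode numbers alone does not move it over.

```python
import os
import tempfile


def open_fd_pins_inode():
    """Return (fd_still_same_object, path_now_names_new_object)."""
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "net")
        os.mkdir(path)
        fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
        try:
            before = os.fstat(fd)
            # Stand-in for "tear down and recreate an identical-looking tree":
            # move the old directory aside and create a fresh one in its place.
            os.rename(path, path + ".old")
            os.mkdir(path)
            via_fd = os.fstat(fd)      # what the open descriptor refers to
            via_path = os.stat(path)   # what the same pathname refers to now
        finally:
            os.close(fd)

    def ident(st):
        return (st.st_dev, st.st_ino)

    return ident(before) == ident(via_fd), ident(via_path) != ident(via_fd)
```

Both values come back true: the descriptor stays on the original directory
while the pathname resolves to the new one.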

> Third, when the goal is isolation and not migration (a better chroot)
> then our hardware never changes.

... and you have quite a bit of system state (starting with those net:eth0
symlinks, etc.) visible in there, not just the hardware.

> The idea is supporting multiple superblocks for sysfs:
> 
>   Ultimately capturing the relevant namespace at mount time
>   and if we don't have a superblock for that namespace creating
>   a new one.
> 
>   So we have one sysfs dirent tree and multiple dentry trees.
> 
>   The tricky parts are rename/move and blocking mount/unmount requests
>   for sysfs until we complete the rename operation calling d_move
>   everywhere.

Excuse me, _what_?  Are you seriously suggesting going through all dentry
trees, doing d_move() in each?  I want to see your locking.  It's promising
to be worse than devfs had ever been.  Much worse.
