Persisting mounts between 'ip netns' invocations


From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: netdev@vger.kernel.org, bpf@vger.kernel.org
Cc: David Ahern <dsahern@kernel.org>, Christian Brauner <brauner@kernel.org>
Subject: Persisting mounts between 'ip netns' invocations
Date: Thu, 28 Sep 2023 10:29:07 +0200
Message-ID: <87a5t68zvw.fsf@toke.dk>

Hi everyone

I recently ran into this problem again, so I figured I'd ask whether
anyone has a good idea for how to solve it:

When running a command through 'ip netns exec', iproute2 will
"helpfully" create a new mount namespace and remount /sys inside it,
AFAICT to make sure /sys/class/net/* refers to the right devices inside
the namespace. This makes sense, but unfortunately it has the side
effect that no mount commands executed inside the ns persist. In
particular, this makes it difficult to work with bpffs; even when
mounting a bpffs inside the ns, it will disappear along with the
namespace as soon as the process exits.
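
As a rough sketch of the behaviour described above (not iproute2's
actual code), the same sequence can be approximated with the util-linux
tools nsenter and unshare. The function name netns_exec_sketch and the
DRY_RUN switch are hypothetical, and the lazy unmount of /sys before
remounting is an assumption about how the remount is done. Running it
for real requires root:

```shell
# Hypothetical helper approximating what 'ip netns exec' is described as
# doing: enter the named netns, create a fresh mount namespace, remount
# /sys, then run the command. Assumes util-linux nsenter/unshare.
# With DRY_RUN set, the composed command is printed instead of executed.
netns_exec_sketch() {
    ns="$1"
    shift
    # Runs inside the new mount namespace: replace /sys so that
    # /sys/class/net reflects the netns, then exec the requested command.
    inner='umount -l /sys; mount -t sysfs none /sys && exec "$@"'
    if [ -n "${DRY_RUN:-}" ]; then
        echo "nsenter --net=/var/run/netns/$ns -- unshare --mount -- sh -c '$inner' sh $*"
    else
        nsenter --net="/var/run/netns/$ns" -- \
            unshare --mount -- sh -c "$inner" sh "$@"
    fi
}
```

Because the mount namespace is created anew on every call, anything
mounted by the command lives only as long as that command does.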

To illustrate:

# ip netns exec <nsname> bpftool map pin id 2 /sys/fs/bpf/mymap
# ip netns exec <nsname> ls /sys/fs/bpf
<nothing>

This happens because namespaces are cleaned up as soon as they have no
processes, unless they are persisted by some other means. For the
network namespace itself, iproute2 will bind mount /proc/self/ns/net to
/var/run/netns/<nsname> (in the root mount namespace) to persist the
namespace. I tried implementing something similar for the mount
namespace, but that doesn't work; I can't manually bind mount the 'mnt'
ns reference either:

# mount -o bind /proc/104444/ns/mnt /var/run/netns/mnt/testns
mount: /run/netns/mnt/testns: wrong fs type, bad option, bad superblock on /proc/104444/ns/mnt, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.

When running strace on that mount command, it seems the move_mount()
syscall returns EINVAL, which, AFAICT, is because the mount namespace
file references itself as its namespace, which means it can't be
bind-mounted into the containing mount namespace.
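
For comparison, the bind mount that does work for the network namespace
can be sketched as below. This is only an illustration of the mechanism
described earlier, not iproute2's implementation: the function name
persist_netns_sketch and the DRY_RUN switch are hypothetical, and it
takes an explicit PID where iproute2 uses /proc/self. Running it for
real requires root:

```shell
# Hypothetical helper: keep a network namespace alive by bind mounting
# its /proc reference onto a file under /var/run/netns.
# With DRY_RUN set, only the mount command is printed.
persist_netns_sketch() {
    name="$1"   # name the namespace should be reachable under
    pid="$2"    # a process currently inside the namespace
    target="/var/run/netns/$name"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "mount --bind /proc/$pid/ns/net $target"
        return 0
    fi
    # The bind mount needs an existing file as its target.
    mkdir -p /var/run/netns &&
        touch "$target" &&
        mount --bind "/proc/$pid/ns/net" "$target"
}
```

The failing command above is the same operation with 'mnt' substituted
for 'net'.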

So, my question is, how to overcome this limitation? I know it's
possible to get a reference to the namespace of a running process, but
there is no guarantee that any processes are running inside the
namespace (hence the persisting bind mount for the netns). So is there
some other way to persist the mount namespace reference, so we can pick
it back up on the next 'ip netns' invocation?

Hoping someone has a good idea :)

-Toke

Thread overview: 9+ messages
2023-09-28  8:29 Toke Høiland-Jørgensen [this message]
2023-09-28  9:54 ` Persisting mounts between 'ip netns' invocations Nicolas Dichtel
2023-09-28 16:17   ` Christian Brauner
2023-09-28 18:21     ` Toke Høiland-Jørgensen
2023-09-29  8:26       ` Nicolas Dichtel
2023-09-29  9:25         ` Christian Brauner
2023-09-29  9:45           ` Nicolas Dichtel
2023-09-29 21:23             ` David Laight
2023-09-29 15:00   ` Eric W. Biederman
