From: Petr Vorel <pvorel@suse.cz>
To: Jeff Layton <jlayton@kernel.org>
Cc: Nikita Yushchenko <nikita.yoush@cogentembedded.com>,
linux-nfs@vger.kernel.org, NeilBrown <neilb@suse.de>,
Steve Dickson <steved@redhat.com>,
ltp@lists.linux.it
Subject: Re: [LTP] [PATCH 1/1] nfslock01.sh: Don't test on NFS v3 on TCP
Date: Tue, 2 May 2023 15:41:46 +0200
Message-ID: <20230502134146.GA3654451@pevik>
In-Reply-To: <d441b9f9dfcbb4719d97c7b3b5950dfeeb8913d2.camel@kernel.org>
Hi all,
[ Cc Steve, Alexey, Nikita - I hope the new address is the correct one,
because nikita.yushchenko@virtuozzo.com is dead ]
> On Tue, 2023-05-02 at 09:59 +0200, Petr Vorel wrote:
> > nfs_flock (run via nfslock01.sh) is known to fail on NFS v3 [1]:
> > not unsharing /var makes AF_UNIX socket for host's rpcbind to become
> > available inside ltp_ns. Then, at NFS v3 mount time, kernel creates
> > an instance of lockd for ltp_ns, and ports for that instance leak to
> > host's rpcbind and overwrite ports for lockd already active for root
> > namespace. This breaks nfs3 file locking.
> Yeccchhh...that is pretty nasty.
> rpcbind was obviously written in a time before namespaces were even a
> thought to anyone. I wonder if there is something we can do in rpcbind
> itself to guard against these sorts of shenanigans? Probably not, I
> guess...
Maybe Steve or Neil have some idea.
> Is /var shared between namespaces in this test for some particular
> reason?
I hope I got it right: are we talking about /var/run/netns/ltp_ns, which is a
symlink to /proc/$pid/ns/net? Or is it really about /run/rpcbind.sock vs
/var/run/rpcbind.sock?
The /var/run/netns/<NETNS> file is something created by ip:
#define NETNS_RUN_DIR "/var/run/netns" [1]
NETNS_RUN_DIR?=/var/run/netns [2]
man ip-netns(8) [3]
By convention a named network namespace is an object at
/var/run/netns/NAME that can be opened. The file descriptor
resulting from opening /var/run/netns/NAME refers to the
specified network namespace. Holding that file descriptor open
keeps the network namespace alive. The file descriptor can be
used with the setns(2) system call to change the network
namespace associated with a task.
LTP used to use ip for creating namespaces. Later Alexey reused the custom
LTP code ns_exec.c and ns_ifmove.c [4] (NOTE: these were recently renamed to
tst_ns_exec.c and tst_ns_ifmove.c and moved) in order to allow reusing already
defined namespaces. This change symlinked /var/run/netns/ltp_ns to
/proc/$pid/ns/net, to keep the convention.
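To make the convention above concrete, here is a minimal sketch for inspecting
a named namespace entry. The helper names netns_path and netns_describe are
hypothetical (they are not part of LTP or iproute2); only the default directory
comes from the NETNS_RUN_DIR definition quoted above. My understanding is that
ip-netns creates a bind-mounted file, while the LTP code creates a symlink, so
the sketch only distinguishes those two cases.

```shell
#!/bin/sh
# Directory holding named network namespaces; default taken from
# iproute2's NETNS_RUN_DIR. Can be overridden from the environment.
NETNS_RUN_DIR="${NETNS_RUN_DIR:-/var/run/netns}"

# Hypothetical helper: print the path of the named namespace object.
netns_path() {
    printf '%s/%s\n' "$NETNS_RUN_DIR" "$1"
}

# Hypothetical helper: report what kind of entry NAME is.
# An LTP-style entry is a symlink (expected target: /proc/$pid/ns/net);
# an entry that exists but is not a symlink is assumed to be the
# bind-mounted file that "ip netns add" creates.
netns_describe() {
    p=$(netns_path "$1")
    if [ -L "$p" ]; then
        printf 'symlink -> %s\n' "$(readlink "$p")"
    elif [ -e "$p" ]; then
        printf 'file\n'
    else
        printf 'missing\n'
        return 1
    fi
}
```

Running netns_describe ltp_ns while the test is active should show the symlink
target. As root, something like nsenter --net="$(netns_path ltp_ns)" ip addr
would then enter that namespace through the same path.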
> > Before bd512e733 ("nfs_flock: fail the test if lock/unlock ops fail")
> > it run indefinitely with "unhandled error -107":
> > [ 2840.099565] lockd: cannot monitor 10.0.0.2
> > [ 2840.109353] lockd: cannot monitor 10.0.0.2
> > [ 2843.286811] xs_tcp_setup_socket: connect returned unhandled error -107
> > [ 2850.198791] xs_tcp_setup_socket: connect returned unhandled error -107
> > bd512e733 caused an early abort (therefore only "cannot monitor 10.0.0.2"
> > appears).
> > Although there is suggestion, how to fix the problem in kernel [2]:
> > > Maybe rpcb_create_local() shall detect that it is not in root
> > > netns, and only try AF_INET connection to localhost in that case.
> > That would be simple and might be sensible. IF changing the AF_UNIX
> > path to "/run/rpcbind.sock" isn't sufficient, then testing for the
> > root_ns is probably the best second option.
> Was it determined that changing the location of the socket wasn't
> sufficient to fix this? FWIW, My Fedora 38 machine seems to listen on
> that socket already:
> [Socket]
> ListenStream=/run/rpcbind.sock
NOTE: both openSUSE Tumbleweed and Debian 11 (bullseye), on which I'm able to
reproduce the problem (IMHO it would be reproducible on most distros):
* have /var/run as a symlink to /run
* use /run/rpcbind.lock (they carry a patch to change /var/run/rpcbind.lock to
/run/rpcbind.lock [5] [6]; @Steve, shouldn't this patch be accepted as the default?)
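For anyone who wants to check their own system, a rough sketch follows. The
function names same_location and rpcbind_sockets are made up for illustration,
and it assumes readlink -f is available (GNU coreutils or busybox). It only
tells whether /var/run and /run are the same directory and which of the two
socket paths mentioned in this thread exist; it does not reproduce the bug.

```shell
#!/bin/sh
# Hypothetical helper: succeed when both paths resolve to the same location,
# e.g. when /var/run is a symlink to /run.
same_location() {
    [ "$(readlink -f "$1")" = "$(readlink -f "$2")" ]
}

# Hypothetical helper: print each candidate rpcbind socket path that
# currently exists as a socket. Prints nothing if rpcbind is not listening.
rpcbind_sockets() {
    for s in /run/rpcbind.sock /var/run/rpcbind.sock; do
        if [ -S "$s" ]; then
            printf '%s\n' "$s"
        fi
    done
    return 0
}
```

If same_location /var/run /run succeeds, the two socket paths name the same
file, so switching the path between them would presumably make no difference
on such a system.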
Kind regards,
Petr
[1] https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/tree/include/namespace.h#n12
[2] https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/tree/Makefile#n20
[3] https://man7.org/linux/man-pages/man8/ip-netns.8.html
[4] https://github.com/linux-test-project/ltp/commit/3fb501e04c61fb3d6b6b82011919572a87425cf9
[5] https://build.opensuse.org/package/view_file/network/rpcbind/0001-change-lockingdir-to-run.patch?expand=1
[6] https://salsa.debian.org/debian/rpcbind/-/blob/master/debian/patches/run-migration
--
Mailing list info: https://lists.linux.it/listinfo/ltp