public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: Dave Chinner <david@fromorbit.com>,
	"zlang@redhat.com" <zlang@redhat.com>,
	"dchinner@redhat.com" <dchinner@redhat.com>,
	"fstests@vger.kernel.org" <fstests@vger.kernel.org>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	Johannes Thumshirn <Johannes.Thumshirn@wdc.com>,
	Naohiro Aota <Naohiro.Aota@wdc.com>
Subject: Re: [PATCH v3.1 15/34] check: run tests in a private pid/mount namespace
Date: Tue, 25 Feb 2025 18:18:16 -0800	[thread overview]
Message-ID: <20250226021816.GG6265@frogsfrogsfrogs> (raw)
In-Reply-To: <qjjk4spah52oyautabncgjfluaixy4rbfpuecydm5izauhuqki@lkcw62dpij3o>

On Wed, Feb 26, 2025 at 02:13:00AM +0000, Shinichiro Kawasaki wrote:
> On Feb 26, 2025 / 08:29, Dave Chinner wrote:
> > On Tue, Feb 25, 2025 at 07:49:10AM -0800, Darrick J. Wong wrote:
> > > On Tue, Feb 25, 2025 at 11:27:19AM +0000, Shinichiro Kawasaki wrote:
> > > > On Feb 14, 2025 / 13:13, Darrick J. Wong wrote:
> > > > > From: Darrick J. Wong <djwong@kernel.org>
> > > > > 
> > > > > As mentioned in the previous patch, trying to isolate processes from
> > > > > separate test instances through the use of distinct Unix process
> > > > > sessions is annoying due to the many complications with signal handling.
> > > > > 
> > > > > Instead, we could just use nsexec to run the test program with a private
> > > > > pid namespace so that each test instance can only see its own processes;
> > > > > and private mount namespace so that tests writing to /tmp cannot clobber
> > > > > other tests or the stuff running on the main system.  Further, the
> > > > > process created by the clone(CLONE_NEWPID) call is considered the init
> > > > > process of that pid namespace, so all processes will be SIGKILL'd when
> > > > > the init process terminates, so we no longer need systemd scopes for
> > > > > externally enforced cleanup.
> > > > > 
> > > > > However, it's not guaranteed that a particular kernel has pid and mount
> > > > > namespaces enabled.  Mount (2.4.19) and pid (2.6.24) namespaces have
> > > > > been around for a long time, but there's no hard requirement for the
> > > > > latter to be enabled in the kernel.  Therefore, this bugfix slips
> > > > > namespace support in alongside the session id thing.
> > > > > 
> > > > > Declaring CONFIG_PID_NS=n a deprecated configuration and removing
> > > > > support should be a separate conversation, not something that I have to
> > > > > do in a bug fix to get mainline QA back up.
> > > > > 
> > > > > Note that the new helper cannot unmount the /proc it inherits before
> > > > > mounting a pidns-specific /proc because generic/504 relies on being able
> > > > > to read the init_pid_ns (aka systemwide) version of /proc/locks to find
> > > > > a file lock that was taken and leaked by a process.
> > > > 
> > > > Hello Darrick,
> > > > 
> > > > I ran fstests for zoned btrfs using the latest fstests tag v2025.02.23, and
> > > > observed all test cases failed with my set up. I bisected and found that this
> > > > commit is the trigger. Let me share my observations.
> > > > 
> > > > For example, btrfs/001.out.bad contents are as follows:
> > > > 
> > > >   QA output created by 001
> > > >   mount: bad usage
> > > >   Try 'mount --help' for more information.
> > > >   common/rc: retrying test device mount with external set
> > > >   mount: bad usage
> > > >   Try 'mount --help' for more information.
> > > >   common/rc: could not mount /dev/sda on common/config: TEST_DIR (/tmp/test) is not a directory
> > > > 
> > > > As the last line above shows, fstests failed to find out TEST_DIR, /tmp/test.
> > > > 
> > > > My set up uses mount point directories in tmpfs, /tmp/*:
> > > > 
> > > >   export TEST_DIR=/tmp/test
> > > >   export SCRATCH_MNT=/tmp/scratch
> > > > 
> > > > I guessed that tmpfs might be a cause. As a trial, I modified these to,
> > > > 
> > > >   export TEST_DIR=/var/test
> > > >   export SCRATCH_MNT=/var/scratch
> > > > 
> > > > then I observed the failures disappeared. I guess this implies that the commit
> > > > for the private pid/mount namespace makes tmpfs unique to each namespace. Then,
> > > > the the mount points in tmpfs were not found in the private namespaces context,
> > > > probably.
> > > 
> > > Yes, /tmp is now private to the test program (e.g. tests/btrfs/001) so
> > > that tests run in parallel cannot interfere with each other.
> > > 
> > > > If this guess is correct, in indicates that tmpfs can no longer be used for
> > > > fstests mount points. Is this expected?
> > > 
> > > Expected, yes.  But still broken for you. :(
> 
> Darrick, thanks for the clarifications.
> 
> > > 
> > > I can think of a few solutions:
> > > 
> > > 1. Perhaps run_privatens could detect that TEST_DIR/SCRATCH_MNT start
> > > with "/tmp" and bind mount them into the private /tmp before it starts
> > > the test.
> > 
> > Which then makes it specific to test running, and that makes it
> > less suited to use from check-parallel (or any other generic test
> > context).
> > 
> > > 2. fstests could take care of defining and mkdir'ing the
> > > TEST_DIR/SCRATCH_MNT directories and users no longer have to create
> > > them.  It might, however, be useful to have them accessible to someone
> > > who has ssh'd in to look around after a failure.
> > 
> > check-parallel already does this, and it leaves them around after
> > the test, so....
> > 
> > 4. use check-parallel.
> 
> I took a quick look in check-parallel. IIUC, it creates loop devices then it can
> not run with real storage devices. I run fstests with real zoned storage
> devices regularly, so I'm afraid that this solution does not fit my use case.
> 
> > 
> > > 3. Everyone rewrites their fstests configs to choose something outside
> > > of /tmp (e.g. /var/tmp/{test,scratch})?
> 
> I'm okay with this, since it is not a big deal to change the mount points in my
> test environments.
> 
> If this solution is chosen, I suggest to document the restriction and/or have
> the test script to check the restriction, to avoid the confusion.

I would certainly do that if everyone else agrees to the additional
restriction + adding the necessary checks to ./check?

--D

> > 
> > How many people actually use /tmp for fstests mount points?
> > 
> > IMO, it's better for ongoing maintenance to drop support for /tmp
> > based mount points (less complexity in infrastructure setup). If
> > there are relatively few ppl who do this, perhaps it would be best
> > to treat this setup the same as the setsid encapsulation. i.e. works
> > for now, but is deprecated and is going to be removed in a years
> > time....
> > 
> > Then we can simply add a /tmp test to the HAVE_PRIVATENS setting and
> > avoid using a private ns for these setups for now. This gives
> > everyone who does use these setups time to migrate to a different
> > setup that will work with private namespaces correctly, whilst
> > keeping the long term maintenance burden of running tests in private
> > namespaces down to a minimum.
> > 
> > -Dave.
> > -- 
> > Dave Chinner
> > david@fromorbit.com

  reply	other threads:[~2025-02-26  2:18 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-12  3:30 [PATCHSET v3] fstests: random fixes for v2025.02.02 Darrick J. Wong
2025-02-12  3:30 ` [PATCH 01/34] generic/476: fix fsstress process management Darrick J. Wong
2025-02-12  3:31 ` [PATCH 02/34] metadump: make non-local function variables more obvious Darrick J. Wong
2025-02-12  3:31 ` [PATCH 03/34] metadump: fix cleanup for v1 metadump testing Darrick J. Wong
2025-02-12  3:31 ` [PATCH 04/34] generic/019: don't fail if fio crashes while shutting down Darrick J. Wong
2025-02-12  3:31 ` [PATCH 05/34] fuzzy: do not set _FSSTRESS_PID when exercising fsx Darrick J. Wong
2025-02-12  3:32 ` [PATCH 06/34] common/rc: revert recursive unmount in _clear_mount_stack Darrick J. Wong
2025-02-12  3:32 ` [PATCH 07/34] common/dump: don't replace pids arbitrarily Darrick J. Wong
2025-02-12  3:32 ` [PATCH 08/34] common/populate: correct the parent pointer name creation formulae Darrick J. Wong
2025-02-12  3:33 ` [PATCH 09/34] generic/759,760: fix MADV_COLLAPSE detection and inclusion Darrick J. Wong
2025-02-12 18:31   ` Joanne Koong
2025-02-12  3:33 ` [PATCH 10/34] generic/759,760: skip test if we can't set up a hugepage for IO Darrick J. Wong
2025-02-12 18:39   ` Joanne Koong
2025-02-12  3:33 ` [PATCH 11/34] common/rc: create a wrapper for the su command Darrick J. Wong
2025-02-12  3:33 ` [PATCH 12/34] fuzzy: kill subprocesses with SIGPIPE, not SIGINT Darrick J. Wong
2025-02-12  4:45   ` Dave Chinner
2025-02-12  6:05     ` Darrick J. Wong
2025-02-12  6:10   ` Darrick J. Wong
2025-02-12  3:34 ` [PATCH 13/34] common/rc: hoist pkill to a helper function Darrick J. Wong
2025-02-12  3:34 ` [PATCH 14/34] common: fix pkill by running test program in a separate session Darrick J. Wong
2025-02-14 17:34   ` Zorro Lang
2025-02-14 17:56     ` Darrick J. Wong
2025-02-14 20:50       ` Theodore Ts'o
2025-02-14 21:11         ` Darrick J. Wong
2025-02-15 13:34         ` Zorro Lang
2025-02-14 21:12   ` [PATCH 13.99/34] tools: add a Makefile " Darrick J. Wong
2025-02-15 13:16     ` Zorro Lang
2025-02-14 21:13   ` [PATCH v3.1 14/34] common: fix pkill by running test program in a " Darrick J. Wong
2025-02-15 13:22     ` Zorro Lang
2025-02-15 16:54       ` Darrick J. Wong
2025-02-12  3:34 ` [PATCH 15/34] check: run tests in a private pid/mount namespace Darrick J. Wong
2025-02-14 17:36   ` Zorro Lang
2025-02-14 21:13   ` [PATCH v3.1 " Darrick J. Wong
2025-02-25 11:27     ` Shinichiro Kawasaki
2025-02-25 15:49       ` Darrick J. Wong
2025-02-25 21:29         ` Dave Chinner
2025-02-26  2:13           ` Shinichiro Kawasaki
2025-02-26  2:18             ` Darrick J. Wong [this message]
2025-02-12  3:34 ` [PATCH 16/34] check: deprecate using process sessions to isolate test instances Darrick J. Wong
2025-02-12  3:35 ` [PATCH 17/34] common/rc: don't copy fsstress to $TEST_DIR Darrick J. Wong
2025-02-12  3:35 ` [PATCH 18/34] unmount: resume logging of stdout and stderr for filtering Darrick J. Wong
2025-02-12  3:35 ` [PATCH 19/34] mkfs: don't hardcode log size Darrick J. Wong
2025-02-12  3:35 ` [PATCH 20/34] common/rc: return mount_ret in _try_scratch_mount Darrick J. Wong
2025-02-12  3:36 ` [PATCH 21/34] preamble: fix missing _kill_fsstress Darrick J. Wong
2025-02-12  3:36 ` [PATCH 22/34] generic/650: revert SOAK DURATION changes Darrick J. Wong
2025-02-12  3:36 ` [PATCH 23/34] generic/032: fix pinned mount failure Darrick J. Wong
2025-02-12  3:36 ` [PATCH 24/34] fuzzy: stop __stress_scrub_fsx_loop if fsx fails Darrick J. Wong
2025-02-12  3:37 ` [PATCH 25/34] fuzzy: don't use readarray for xfsfind output Darrick J. Wong
2025-02-12  3:37 ` [PATCH 26/34] fuzzy: always stop the scrub fsstress loop on error Darrick J. Wong
2025-02-12  3:37 ` [PATCH 27/34] fuzzy: port fsx and fsstress loop to use --duration Darrick J. Wong
2025-02-12  4:47   ` Dave Chinner
2025-02-12  3:37 ` [PATCH 28/34] fix _require_scratch_duperemove ordering Darrick J. Wong
2025-02-12  3:38 ` [PATCH 29/34] fsstress: fix a memory leak Darrick J. Wong
2025-02-12  3:38 ` [PATCH 30/34] fsx: fix leaked log file pointer Darrick J. Wong
2025-02-12  3:38 ` [PATCH 31/34] misc: don't put nr_cpus into the fsstress -n argument Darrick J. Wong
2025-02-12  3:39 ` [PATCH 32/34] common/config: add $here to FSSTRESS_PROG Darrick J. Wong
2025-02-12  3:39 ` [PATCH 33/34] config: add FSX_PROG variable Darrick J. Wong
2025-02-12  3:39 ` [PATCH 34/34] build: initialize stack variables to zero by default Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250226021816.GG6265@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=Johannes.Thumshirn@wdc.com \
    --cc=Naohiro.Aota@wdc.com \
    --cc=david@fromorbit.com \
    --cc=dchinner@redhat.com \
    --cc=fstests@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=shinichiro.kawasaki@wdc.com \
    --cc=zlang@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox