Date: Tue, 21 Jan 2025 19:36:25 -0800
From: "Darrick J. Wong"
To: Theodore Ts'o
Cc: Dave Chinner, zlang@redhat.com, hch@lst.de, fstests@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 10/23] mkfs: don't hardcode log size
Message-ID: <20250122033625.GQ1611770@frogsfrogsfrogs>
References: <173706974044.1927324.7824600141282028094.stgit@frogsfrogsfrogs> <173706974228.1927324.17714311358227511791.stgit@frogsfrogsfrogs> <20250121124430.GA3809348@mit.edu>
In-Reply-To: <20250121124430.GA3809348@mit.edu>
X-Mailing-List: fstests@vger.kernel.org

On Tue, Jan 21, 2025 at 07:44:30AM -0500, Theodore Ts'o wrote:
> On Tue, Jan 21, 2025 at 02:58:25PM +1100, Dave Chinner wrote:
> > > +# Are there mkfs options to try to improve concurrency?
> > > +_scratch_mkfs_concurrency_options()
> > > +{
> > > +	local nr_cpus="$(( $1 * LOAD_FACTOR ))"
> >
> > caller does not need to pass a number of CPUs. This function can
> > simply do:
> >
> > 	local nr_cpus=$(getconf _NPROCESSORS_CONF)
> >
> > And that will set concurrency to be "optimal" for the number of CPUs
> > in the machine the test is going to run on. That way tests don't
> > need to hard code some number that is going to be too large for
> > small systems and too small for large systems...
>
> Hmm, but is this the right thing if you are using check-parallel? If
> you are running multiple tests that are all running some kind of load
> or stress-testing antagonist at the same time, then having 3x to 5x
> the number of necessary antagonist threads is going to unnecessarily
> slow down the test run, which goes against the original goal of what
> we were hoping to achieve with check-parallel.

Maybe a more appropriate thing to do is:

	local nr_cpus=$(grep Cpus_allowed /proc/self/status | hweight)

So a check-parallel could (if they see such problems) constrain the
parallelism through cpu pinning. I think getconf _NPROCESSORS_CONF is
probably fine for now.

(The other day I /did/ see some program in either util-linux or
coreutils that told you the number of "available" cpus based on checking
the affinity mask and whatever cgroups constraints are applied. I can't
find it now, alas...)

> How many tests are you currently able to run in parallel today, and
> what's the ultimate goal? We could have some kind of antagonist load
> which is shared across multiple tests, but it's not clear to me that
> it's worth the complexity. (And note that it's not just fs and cpu
> load antagonists; there could also be memory stress antagonists, where
> having multiple antagonists could lead to OOM kills...)

On the other hand, perhaps having random antagonistic processes from
other ./check instances is exactly the kind of stress testing that we
want to shake out weirder bugs? It's clear from Dave's RFC that the
generic/650 cpu hotplug shenanigans had some effect. ;)

--D

> - Ted
>
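
The "Cpus_allowed | hweight" line above is shorthand rather than a runnable command: there is no standard hweight utility, and a plain grep for Cpus_allowed also matches the Cpus_allowed_list line. A minimal bash sketch of the same idea follows. The helper names hweight_hex and allowed_cpus are hypothetical and not part of fstests, and the fallback to getconf is an assumption about what a caller would want when the mask cannot be read.

```shell
#!/usr/bin/env bash

# Hypothetical helper: count the set bits in a hex CPU mask such as "ff"
# or "ffffffff,0000000f" (the format of Cpus_allowed in /proc/<pid>/status).
hweight_hex() {
	local mask="${1//,/}"	# drop the commas between 32-bit groups
	local count=0 i digit

	for (( i = 0; i < ${#mask}; i++ )); do
		digit=$(( 16#${mask:i:1} ))	# one hex digit, value 0..15
		while (( digit )); do
			count=$(( count + (digit & 1) ))
			digit=$(( digit >> 1 ))
		done
	done
	echo "$count"
}

# Hypothetical helper: number of CPUs this process is allowed to run on.
allowed_cpus() {
	local mask

	# Match the "Cpus_allowed:" field exactly so "Cpus_allowed_list:" is
	# skipped.  awk reads its own status file here, but it inherits the
	# affinity mask of the calling shell.
	mask=$(awk '$1 == "Cpus_allowed:" { print $2 }' /proc/self/status 2>/dev/null)

	if [[ -n "$mask" ]]; then
		hweight_hex "$mask"
	else
		# Assumption: fall back to the configured CPU count.
		getconf _NPROCESSORS_CONF
	fi
}
```

A function like _scratch_mkfs_concurrency_options could then use local nr_cpus=$(allowed_cpus), and a check-parallel run pinned with taskset would see only the CPUs it was given. Separately, coreutils' nproc prints the number of processing units available to the current process and honors the affinity mask, so it may be a simpler route; whether it is the program mentioned above is only a guess.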