FS/XFS testing framework
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Zorro Lang <zlang@redhat.com>
Cc: David Disseldorp <ddiss@suse.de>,
	Dave Chinner <david@fromorbit.com>,
	fstests@vger.kernel.org
Subject: Re: [PATCH 1/3] more python dependence. was: populate: fix horrible performance due to excessive forking
Date: Sun, 15 Jan 2023 10:33:59 -0800	[thread overview]
Message-ID: <Y8RHF8c2vMJl3khu@magnolia> (raw)
In-Reply-To: <20230112204213.k6amqgcydn76pjsm@zlang-mailbox>

On Fri, Jan 13, 2023 at 04:42:13AM +0800, Zorro Lang wrote:
> On Thu, Jan 12, 2023 at 09:07:56AM -0800, Darrick J. Wong wrote:
> > On Thu, Jan 12, 2023 at 11:24:58AM +0100, David Disseldorp wrote:
> > > Hi Darrick,
> > > 
> > > On Wed, 11 Jan 2023 17:58:17 -0800, Darrick J. Wong wrote:
> > > 
> > > > > (removexattr looks like a pain in perl though...)
> > > > > 
> > > > > Anyway it's late now, I'll look at the diff tomorrow.  
> > > > 
> > > > ...or thursday now, since I decided to reply to the online fsck design
> > > > doc review comments, which took most of the workday.  I managed to bang
> > > > out a python script (perl doesn't support setxattr!) that cut the xattr
> > > > overhead down to nearly zero.
> > > 
> > > IIUC we currently only depend on python for the fio perf tests and
> > > btrfs/154 . My preference would be to not see it spread further
> > 
> > I don't appreciate your dismissal of the patch before I've even posted
> > it!
> > 
> > The fstests README clearly lists python3 as a dependency.  Argument
> > parsing and xattr calls are provided by the base python3 runtime.  No
> > third party libraries are required for this new program, and if they
> > were, they'd be added to the README.
> 
> Sorry Darrick, that README description might not exact enough :/ some packages
> are not *necessary* for the whole fstests running. Their missing might just
> cause some single cases be skipped.
> 
> The python3 isn't necessary running/building dependence of fstests. Some
> people might run fstests without python3 currently. And I don't plan to
> make it become *necessary* (no running if no python3) now. If you'd like
> to add python3 dependence to common helpers, that might affect more cases.
> 
> So how about fall back to old code if no python3? Or you'd like to skipped
> populate related testing if no python3? Or use another way to reduce the
> hard dependence change.

I'm still working on getting this to pass the online fsck fuzz QA tests.

I started by moving the btree dir creation fix to the front of the
series, and then split this first patch into one patch to speed up
directory creation and a second patch to run most of the creation
helpers in parallel.

It turned out that my earlier patch moving the loop to a perl script was
even faster than Dave's thing, so I substituted that one.

Next I reworked the xattr creation speedups to incorporate all the
review comments.  Zorro pointed out that unlike perl, there's nothing in
fstests that _fatals the lack of python.

For people with python3, I still provide a new script to handle creation
and selective deletion of xattrs.  This reduces the xattr part of the
runtime to nearly zero.  For people who don't want python3, I
resurrected the previous version of the patch that formats a fake xattr
dump file and pipes it to setattr --restore.  The deletion loop is still
slow, but overall it's faster than before.

That leaves the parallel creation patch.  This has caused lots of
headaches for me because the previous populate() implementation would
(perhaps accidentally) create a filesystem where most of the field
fuzzing didn't get in the way of mounting the filesystem, but with
parallel creation enabled, I see a lot of online fsck test failures
resulting from mount failure.

For example, the function creates a directory tree under
SCRATCH_MNT/INOBT/ with enough files to create a multilevel inode btree.
When the function ran serially, the inode allocation rotor would not
generally create the diretory (and its children) in AG 0.  This is what
we want, since problems in the AG 0 inode btree usually result in mount
failures when we try to look up the root directory.  With parallelism
enabled, the inobt gets created in AG 0 sometimes, which causes the
online fsck fuzz tests to fail because mount failed.

The populate function *could* get smarter about avoiding AG 0, though
that would come with an increase in code complexity.

I then measured the performance speedup of each of the individual
patches in my branch.  Removing the execve overhead resulted in a 10x
decrease in runtime (~170s to ~15s), but parallelizing after that only
netted a ~2s decrease in runtime.

Online fsck testing seems fine if I pop the parallelization patches off,
so I'll keep trying, but I think I've hit the point of diminished
returns.

--D

> Thanks,
> Zorro
> 
> > 
> > > (especially if it's just to shave off a little runtime), mostly because
> > > it's a pain for dependency tracking.
> > > Perhaps you could use perl's syscall(SYS_fsetxattr(), ...)? Well, that or
> > 
> > Raw system calls are a terrible idea for maintainability.  You'd
> > *seriously* rather I open-code the glibc xattr wrappers and make the
> > fstests community maintain that for the sake of your preference?
> > 
> > > rewrite it again in awk ;-P
> > 
> > WTAF?
> > 
> > --D
> > 
> > > Cheers, David
> > 
> 

  reply	other threads:[~2023-01-15 18:34 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-10 22:49 [PATCH 0/3] fstests: filesystem population fixes Dave Chinner
2023-01-10 22:49 ` [PATCH 1/3] populate: fix horrible performance due to excessive forking Dave Chinner
2023-01-11  6:02   ` Darrick J. Wong
2023-01-12  1:58     ` Darrick J. Wong
2023-01-12 10:24       ` [PATCH 1/3] more python dependence. was: " David Disseldorp
2023-01-12 17:07         ` Darrick J. Wong
2023-01-12 20:23           ` David Disseldorp
2023-01-12 20:42           ` Zorro Lang
2023-01-15 18:33             ` Darrick J. Wong [this message]
2023-01-10 22:49 ` [PATCH 2/3] populate: ensure btree directories are created reliably Dave Chinner
2023-01-11  5:47   ` Darrick J. Wong
2023-01-12  5:42   ` Gao Xiang
2023-01-10 22:49 ` [PATCH 3/3] xfs/294: performance is unreasonably slow Dave Chinner
2023-01-11 20:29   ` David Disseldorp
2023-01-12  8:39   ` Zorro Lang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y8RHF8c2vMJl3khu@magnolia \
    --to=djwong@kernel.org \
    --cc=david@fromorbit.com \
    --cc=ddiss@suse.de \
    --cc=fstests@vger.kernel.org \
    --cc=zlang@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox