public inbox for fstests@vger.kernel.org
* [PATCH] generic/038: speed up file creation
@ 2015-08-06  0:27 Dave Chinner
  2015-08-06 14:17 ` Eryu Guan
  0 siblings, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2015-08-06  0:27 UTC (permalink / raw)
  To: fstests

From: Dave Chinner <dchinner@redhat.com>

Now that generic/038 is running on my test machine, I notice how
slow it is:

generic/038      692s

11-12 minutes for a single test is way too long. The test creates
400,000 single-block files, a workload that can easily be
parallelised and hence run much faster than the test currently does.

Split the file creation up into 4 threads that create 100,000 files
each. 4 is chosen because XFS defaults to 4 AGs, ext4 still has decent
speedups at 4 concurrent creates, and other filesystems aren't hurt
by excessive concurrency. The result:

generic/038      237s

on the same machine, which is roughly 3x faster and so it is (just)
fast enough to be considered acceptable.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 tests/generic/038 | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/tests/generic/038 b/tests/generic/038
index 4d108cf..3c94a3b 100755
--- a/tests/generic/038
+++ b/tests/generic/038
@@ -105,19 +105,30 @@ trim_loop()
 # the fallocate calls happen. So we don't really care if they all succeed or
 # not, the goal is just to keep metadata space usage growing while data block
 # groups are deleted.
+#
+# Creating 400,000 files sequentially is really slow, so speed it up a bit
+# by doing it concurrently with 4 threads in 4 separate directories.
 create_files()
 {
 	local prefix=$1
 
-	for ((i = 1; i <= 400000; i++)); do
-		$XFS_IO_PROG -f -c "pwrite -S 0xaa 0 3900" \
-			$SCRATCH_MNT/"${prefix}_$i" &> /dev/null
-		if [ $? -ne 0 ]; then
-			echo "Failed creating file ${prefix}_$i" >>$seqres.full
-			break
-		fi
+	for ((n = 0; n < 4; n++)); do
+		mkdir $SCRATCH_MNT/$n
+		(
+		for ((i = 1; i <= 100000; i++)); do
+			$XFS_IO_PROG -f -c "pwrite -S 0xaa 0 3900" \
+				$SCRATCH_MNT/$n/"${prefix}_$i" &> /dev/null
+			if [ $? -ne 0 ]; then
+				echo "Failed creating file $n/${prefix}_$i" >>$seqres.full
+				break
+			fi
+		done
+		) &
+		create_pids[$n]=$!
 	done
 
+	wait ${create_pids[@]}
+
 }
 
 _scratch_mkfs >>$seqres.full 2>&1
-- 
2.1.4
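[Editorial note: the pattern the patch uses — N background subshells, each
creating files in its own directory, joined with a single wait on the
collected pids — can be sketched standalone. This is a toy illustration
with made-up sizes and a temp dir, not the real test, which uses xfs_io
and 100,000 files per thread on the scratch filesystem.]

```shell
#!/bin/bash
# Minimal sketch of the parallel-create pattern: 4 workers, one
# directory each, pids collected so wait can join all of them.
dir=$(mktemp -d)
for ((n = 0; n < 4; n++)); do
	mkdir "$dir/$n"
	(
	for ((i = 1; i <= 25; i++)); do
		printf 'x' > "$dir/$n/file_$i" || break
	done
	) &
	pids[$n]=$!
done
wait "${pids[@]}"
created=$(find "$dir" -type f | wc -l)
echo "$created"    # 4 workers x 25 files = 100
rm -rf "$dir"
```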


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] generic/038: speed up file creation
  2015-08-06  0:27 [PATCH] generic/038: speed up file creation Dave Chinner
@ 2015-08-06 14:17 ` Eryu Guan
  2015-08-06 22:21   ` Dave Chinner
  0 siblings, 1 reply; 7+ messages in thread
From: Eryu Guan @ 2015-08-06 14:17 UTC (permalink / raw)
  To: Dave Chinner; +Cc: fstests

On Thu, Aug 06, 2015 at 10:27:28AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Now that generic/038 is running on my test machine, I notice how
> slow it is:
> 
> generic/038      692s
> 
> 11-12 minutes for a single test is way too long.
> The test is creating
> 400,000 single block files, which can be easily parallelised and
> hence run much faster than the test is currently doing.
> 
> Split the file creation up into 4 threads that create 100,000 files
> each. 4 is chosen because XFS defaults to 4 AGs, ext4 still has decent
> speedups at 4 concurrent creates, and other filesystems aren't hurt
> by excessive concurrency. The result:
> 
> generic/038      237s
> 
> on the same machine, which is roughly 3x faster and so it is (just)
> fast enough to be considered acceptable.

I got a speedup from 5663s to 639s, and confirmed the test could
fail on unpatched btrfs (btrfsck failed, not every time).

Reviewed-by: Eryu Guan <eguan@redhat.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] generic/038: speed up file creation
  2015-08-06 14:17 ` Eryu Guan
@ 2015-08-06 22:21   ` Dave Chinner
  2015-08-07  8:09     ` Filipe David Manana
  2015-08-09 10:45     ` Eryu Guan
  0 siblings, 2 replies; 7+ messages in thread
From: Dave Chinner @ 2015-08-06 22:21 UTC (permalink / raw)
  To: Eryu Guan; +Cc: fstests

On Thu, Aug 06, 2015 at 10:17:22PM +0800, Eryu Guan wrote:
> On Thu, Aug 06, 2015 at 10:27:28AM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > Now that generic/038 is running on my test machine, I notice how
> > slow it is:
> > 
> > generic/038      692s
> > 
> > 11-12 minutes for a single test is way too long.
> > The test is creating
> > 400,000 single block files, which can be easily parallelised and
> > hence run much faster than the test is currently doing.
> > 
> > Split the file creation up into 4 threads that create 100,000 files
> > each. 4 is chosen because XFS defaults to 4 AGs, ext4 still has decent
> > speedups at 4 concurrent creates, and other filesystems aren't hurt
> > by excessive concurrency. The result:
> > 
> > generic/038      237s
> > 
> > on the same machine, which is roughly 3x faster and so it is (just)
> > fast enough to be considered acceptable.
> 
> I got a speedup from 5663s to 639s, and confirmed the test could

Oh, wow. You should consider any test that takes longer than 5
minutes in the auto group as taking too long. An hour for a test in
the auto group is not acceptable. I expect the auto group to
complete within 1-2 hours for an xfs run, depending on the storage in
use.

On my slowest test vm, the slowest tests are:

$ cat results/check.time | sort -nr -k 2 |head -10
generic/127 1060
generic/038 537
xfs/042 426
generic/231 273
xfs/227 267
generic/208 200
generic/027 156
shared/005 153
generic/133 125
xfs/217 123
$

As you can see, generic/038 is the second worst offender here (it's
a single CPU machine, so parallelism doesn't help a great deal).
generic/127 and xfs/042 are the other two tests that really need
looking at, and only generic/231 and xfs/227 are in the
"borderline-too-slow" category.

generic/038 was a simple one to speed up. I've looked at generic/127,
and it's limited by the pair of synchronous IO fsx runs of 100,000
ops, which means there's probably 40,000 synchronous writes in the
test. Of course, this is meaningless on a ramdisk - generic/127
takes only 24s on my fastest test vm....
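[Editorial note: the cost Dave is pointing at — per-op syncing dominating
runtime on real disks but vanishing on a ramdisk — is easy to demonstrate
outside fsx. A toy comparison, with arbitrary counts and a temp dir, of
buffered small writes versus the same writes with a per-write fsync:]

```shell
#!/bin/bash
# Toy illustration: time N small buffered writes, then the same N
# writes with an fsync after each one (dd's conv=fsync). On spinning
# or even SSD storage the synced loop is dramatically slower; on a
# ramdisk the two are nearly identical, which is why generic/127 is
# only meaningful on real storage.
dir=$(mktemp -d)
n=20
buffered() {
	for ((i = 0; i < n; i++)); do
		dd if=/dev/zero of="$dir/buf_$i" bs=4k count=1 status=none
	done
}
synced() {
	for ((i = 0; i < n; i++)); do
		dd if=/dev/zero of="$dir/sync_$i" bs=4k count=1 conv=fsync status=none
	done
}
time buffered
time synced
created=$(ls "$dir" | wc -l)
rm -rf "$dir"
```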

> fail on unpatched btrfs (btrfsck failed, not every time).

Seeing as you can reproduce the problem, I encourage you to work out
what the minimum number of files needed to reproduce the problem is,
and update the test to use that so that it runs even faster...
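[Editorial note: the search Dave suggests can be sketched as a shell
bisection. The `reproduces` predicate here is a stub — a real version
would run the test at the given per-thread file count, repeatedly, since
the failure is intermittent — and the 50000 threshold in the stub is
purely illustrative.]

```shell
#!/bin/bash
# Hypothetical minimisation loop for the per-thread file count.
# "reproduces" is a stub standing in for a full generic/038 run at a
# given count; replace it with a real run-and-check to use this.
reproduces() {
	[ "$1" -ge 50000 ]   # stub: pretend the bug needs >= 50000 files
}
lo=0
hi=100000
step=10000               # stop once bracketed to within one step
while [ $((hi - lo)) -gt $step ]; do
	mid=$(( (lo + hi) / 2 ))
	if reproduces "$mid"; then
		hi=$mid
	else
		lo=$mid
	fi
done
echo "smallest reproducing count is roughly $hi"
```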

> Reviewed-by: Eryu Guan <eguan@redhat.com>

Thanks!

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] generic/038: speed up file creation
  2015-08-06 22:21   ` Dave Chinner
@ 2015-08-07  8:09     ` Filipe David Manana
  2015-08-07 23:22       ` Dave Chinner
  2015-08-09 10:45     ` Eryu Guan
  1 sibling, 1 reply; 7+ messages in thread
From: Filipe David Manana @ 2015-08-07  8:09 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Eryu Guan, fstests

On Thu, Aug 6, 2015 at 11:21 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Thu, Aug 06, 2015 at 10:17:22PM +0800, Eryu Guan wrote:
>> On Thu, Aug 06, 2015 at 10:27:28AM +1000, Dave Chinner wrote:
>> > From: Dave Chinner <dchinner@redhat.com>
>> >
>> > Now that generic/038 is running on my test machine, I notice how
>> > slow it is:
>> >
>> > generic/038      692s
>> >
>> > 11-12 minutes for a single test is way too long.
>> > The test is creating
>> > 400,000 single block files, which can be easily parallelised and
>> > hence run much faster than the test is currently doing.
>> >
>> > Split the file creation up into 4 threads that create 100,000 files
>> > each. 4 is chosen because XFS defaults to 4 AGs, ext4 still has decent
>> > speedups at 4 concurrent creates, and other filesystems aren't hurt
>> > by excessive concurrency. The result:
>> >
>> > generic/038      237s
>> >
>> > on the same machine, which is roughly 3x faster and so it is (just)
>> > fast enough to be considered acceptable.
>>
>> I got a speedup from 5663s to 639s, and confirmed the test could
>
> Oh, wow. You should consider any test that takes longer than 5
> minutes in the auto group as taking too long. An hour for a test in
> the auto group is not acceptable. I expect the auto group to
> complete within 1-2 hours for an xfs run, depending on storage in
> use.
>
> On my slowest test vm, the slowest tests are:
>
> $ cat results/check.time | sort -nr -k 2 |head -10
> generic/127 1060
> generic/038 537
> xfs/042 426
> generic/231 273
> xfs/227 267
> generic/208 200
> generic/027 156
> shared/005 153
> generic/133 125
> xfs/217 123
> $
>
> As you can see, generic/038 is the second worst offender here (it's
> a single CPU machine, so parallelism doesn't help a great deal).
> generic/127 and xfs/042 are the other two tests that really need
> looking at, and only generic/231 and xfs/227 are in the
> "borderline-too-slow" category.
>
> generic/038 was a simple one to speed up. I've looked at generic/127,
> and it's limited by the pair of synchronous IO fsx runs of 100,000
> ops, which means there's probably 40,000 synchronous writes in the
> test. Of course, this is meaningless on a ramdisk - generic/127
> takes only 24s on my fastest test vm....
>
>> fail on unpatched btrfs (btrfsck failed, not every time).
>
> Seeing as you can reproduce the problem, I encourage you to work out
> what the minimum number of files needed to reproduce the problem is,
> and update the test to use that so that it runs even faster...

This test actually triggered several problems on btrfs (easily over a
dozen, with more found after the test was checked in), all of them
races leading to fs corruption, crashes, memory leaks, etc. Some were
very hard to hit, and the higher the number of files, the easier (or
rather, the less hard) it was to hit the races; with fewer than 400k
files they were really hard to hit on my test machines (locally I
often run this test with over 1 million files to verify specific
patches).

Thanks for doing this. You can add:

Reviewed-by: Filipe Manana <fdmanana@suse.com>

>
>> Reviewed-by: Eryu Guan <eguan@redhat.com>
>
> Thanks!
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] generic/038: speed up file creation
  2015-08-07  8:09     ` Filipe David Manana
@ 2015-08-07 23:22       ` Dave Chinner
  0 siblings, 0 replies; 7+ messages in thread
From: Dave Chinner @ 2015-08-07 23:22 UTC (permalink / raw)
  To: Filipe David Manana; +Cc: Eryu Guan, fstests

On Fri, Aug 07, 2015 at 09:09:43AM +0100, Filipe David Manana wrote:
> On Thu, Aug 6, 2015 at 11:21 PM, Dave Chinner <david@fromorbit.com> wrote:
> > On Thu, Aug 06, 2015 at 10:17:22PM +0800, Eryu Guan wrote:
> >> On Thu, Aug 06, 2015 at 10:27:28AM +1000, Dave Chinner wrote:
> >> > From: Dave Chinner <dchinner@redhat.com>
> >> >
> >> > Now that generic/038 is running on my test machine, I notice how
> >> > slow it is:
> >> >
> >> > generic/038      692s
> >> >
> >> > 11-12 minutes for a single test is way too long.
> >> > The test is creating
> >> > 400,000 single block files, which can be easily parallelised and
> >> > hence run much faster than the test is currently doing.
> >> >
> >> > Split the file creation up into 4 threads that create 100,000 files
> >> > each. 4 is chosen because XFS defaults to 4 AGs, ext4 still has decent
> >> > speedups at 4 concurrent creates, and other filesystems aren't hurt
> >> > by excessive concurrency. The result:
> >> >
> >> > generic/038      237s
> >> >
> >> > on the same machine, which is roughly 3x faster and so it is (just)
> >> > fast enough to be considered acceptable.
> >>
> >> I got a speedup from 5663s to 639s, and confirmed the test could
> >
> > Oh, wow. You should consider any test that takes longer than 5
> > minutes in the auto group as taking too long. An hour for a test in
> > the auto group is not acceptable. I expect the auto group to
> > complete within 1-2 hours for an xfs run, depending on storage in
> > use.
> >
> > On my slowest test vm, the slowest tests are:
> >
> > $ cat results/check.time | sort -nr -k 2 |head -10
> > generic/127 1060
> > generic/038 537
> > xfs/042 426
> > generic/231 273
> > xfs/227 267
> > generic/208 200
> > generic/027 156
> > shared/005 153
> > generic/133 125
> > xfs/217 123
> > $
> >
> > As you can see, generic/038 is the second worst offender here (it's
> > a single CPU machine, so parallelism doesn't help a great deal).
> > generic/127 and xfs/042 are the other two tests that really need
> > looking at, and only generic/231 and xfs/227 are in the
> > "borderline-too-slow" category.
> >
> > generic/038 was a simple one to speed up. I've looked at generic/127,
> > and it's limited by the pair of synchronous IO fsx runs of 100,000
> > ops, which means there's probably 40,000 synchronous writes in the
> > test. Of course, this is meaningless on a ramdisk - generic/127
> > takes only 24s on my fastest test vm....
> >
> >> fail on unpatched btrfs (btrfsck failed, not every time).
> >
> > Seeing as you can reproduce the problem, I encourage you to work out
> > what the minimum number of files needed to reproduce the problem is,
> > and update the test to use that so that it runs even faster...
> 
> This test actually triggered several problems on btrfs (easily over a
> dozen, with more found after the test was checked in), all of them
> races leading to fs corruption, crashes, memory leaks, etc. Some were
> very hard to hit, and the higher the number of files, the easier (or
> rather, the less hard) it was to hit the races; with fewer than 400k
> files they were really hard to hit on my test machines (locally I
> often run this test with over 1 million files to verify specific
> patches).

I use fsmark for this sort of large-scale directory testing. It is
much faster than using shell script loops to create files, and it's
much more flexible in terms of file layout, too. xfstests is not
really the best vehicle for testing problems that require lots of
time to create data sets. Jan Tulak's environment work is a step
towards being able to do that (i.e. define an environment
that persists over multiple tests which may take some time and
complexity to set up), but we need to get that sorted and merged
first...
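[Editorial note: for readers unfamiliar with fs_mark, here is a hedged
sketch of an invocation roughly equivalent to this patch's 4-thread
create loop. The flag meanings are assumed from fs_mark's usage text as
commonly cited (-S0 disables per-file syncing, each -d adds a worker
directory and thread, -n sets files per thread, -s sets file size in
bytes) — verify against your installed version. The script only runs a
tiny smoke-sized version in a temp dir, and skips when fs_mark is not
installed.]

```shell
#!/bin/bash
# Hedged sketch: the patch's create workload expressed as fs_mark.
# The full-scale equivalent of the patch would be something like:
#   fs_mark -S0 -n 100000 -s 3900 \
#       -d $SCRATCH_MNT/0 -d $SCRATCH_MNT/1 \
#       -d $SCRATCH_MNT/2 -d $SCRATCH_MNT/3
# Here we only smoke-test a 100-file run per directory, if available.
status=skipped
if command -v fs_mark >/dev/null 2>&1; then
	dir=$(mktemp -d)
	mkdir "$dir"/0 "$dir"/1 "$dir"/2 "$dir"/3
	if fs_mark -S0 -n 100 -s 3900 \
		-d "$dir"/0 -d "$dir"/1 -d "$dir"/2 -d "$dir"/3 >/dev/null
	then
		status=ok
	else
		status=failed
	fi
	rm -rf "$dir"
fi
echo "fs_mark smoke run: $status"
```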

Cheers,

-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] generic/038: speed up file creation
  2015-08-06 22:21   ` Dave Chinner
  2015-08-07  8:09     ` Filipe David Manana
@ 2015-08-09 10:45     ` Eryu Guan
  2015-08-09 23:20       ` Dave Chinner
  1 sibling, 1 reply; 7+ messages in thread
From: Eryu Guan @ 2015-08-09 10:45 UTC (permalink / raw)
  To: Dave Chinner; +Cc: fstests

On Fri, Aug 07, 2015 at 08:21:27AM +1000, Dave Chinner wrote:
> On Thu, Aug 06, 2015 at 10:17:22PM +0800, Eryu Guan wrote:
> > On Thu, Aug 06, 2015 at 10:27:28AM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > > 
> > > Now that generic/038 is running on my test machine, I notice how
> > > slow it is:
> > > 
> > > generic/038      692s
> > > 
> > > 11-12 minutes for a single test is way too long.
> > > The test is creating
> > > 400,000 single block files, which can be easily parallelised and
> > > hence run much faster than the test is currently doing.
> > > 
> > > Split the file creation up into 4 threads that create 100,000 files
> > > each. 4 is chosen because XFS defaults to 4 AGs, ext4 still has decent
> > > speedups at 4 concurrent creates, and other filesystems aren't hurt
> > > by excessive concurrency. The result:
> > > 
> > > generic/038      237s
> > > 
> > > on the same machine, which is roughly 3x faster and so it is (just)
> > > fast enough to be considered acceptable.
> > 
> > I got a speedup from 5663s to 639s, and confirmed the test could
> 
> Oh, wow. You should consider any test that takes longer than 5
> minutes in the auto group as taking too long. An hour for a test in
> the auto group is not acceptable. I expect the auto group to
> complete within 1-2 hours for an xfs run, depending on storage in
> use. 

Maybe it takes hours to finish on my test vm because I'm testing on
a loop device; my hard disk doesn't support trim, so generic/038 is a
_not_run for me otherwise, and I didn't notice its slowness before.

> 
> On my slowest test vm, the slowest tests are:
> 
> $ cat results/check.time | sort -nr -k 2 |head -10
> generic/127 1060
> generic/038 537
> xfs/042 426
> generic/231 273
> xfs/227 267
> generic/208 200
> generic/027 156
> shared/005 153
> generic/133 125
> xfs/217 123
> $
> 
> As you can see, generic/038 is the second worst offender here (it's
> a single CPU machine, so parallelism doesn't help a great deal).
> generic/127 and xfs/042 are the other two tests that really need
> looking at, and only generic/231 and xfs/227 are in the
> "borderline-too-slow" category.
> 
> generic/038 was a simple one to speed up. I've looked at generic/127,
> and it's limited by the pair of synchronous IO fsx runs of 100,000
> ops, which means there's probably 40,000 synchronous writes in the
> test. Of course, this is meaningless on a ramdisk - generic/127
> takes only 24s on my fastest test vm....
> 
> > fail on unpatched btrfs (btrfsck failed, not every time).
> 
> Seeing as you can reproduce the problem, I encourage you to work out
> what the minimum number of files needed to reproduce the problem is,
> and update the test to use that so that it runs even faster...

I found that 50000 files per thread is enough for me to reproduce
the fs corruption, and sometimes WARNINGs. With 20000 or 30000 files
per thread, only 20% to 33% of runs hit any problems. So this is what
I'm testing (comments are not updated yet):

[root@dhcp-66-87-213 xfstests]# git diff
diff --git a/tests/generic/038 b/tests/generic/038
index 3c94a3b..7564c87 100755
--- a/tests/generic/038
+++ b/tests/generic/038
@@ -108,6 +108,7 @@ trim_loop()
 #
 # Creating 400,000 files sequentially is really slow, so speed it up a bit
 # by doing it concurrently with 4 threads in 4 separate directories.
+nr_files=$((50000 * LOAD_FACTOR))
 create_files()
 {
        local prefix=$1
@@ -115,7 +116,7 @@ create_files()
        for ((n = 0; n < 4; n++)); do
                mkdir $SCRATCH_MNT/$n
                (
-               for ((i = 1; i <= 100000; i++)); do
+               for ((i = 1; i <= $nr_files; i++)); do
                        $XFS_IO_PROG -f -c "pwrite -S 0xaa 0 3900" \
                                $SCRATCH_MNT/$n/"${prefix}_$i" &> /dev/null
                        if [ $? -ne 0 ]; then

Would you like a follow up patch from me or you can just make this one a v2?

Thanks,
Eryu

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] generic/038: speed up file creation
  2015-08-09 10:45     ` Eryu Guan
@ 2015-08-09 23:20       ` Dave Chinner
  0 siblings, 0 replies; 7+ messages in thread
From: Dave Chinner @ 2015-08-09 23:20 UTC (permalink / raw)
  To: Eryu Guan; +Cc: fstests

On Sun, Aug 09, 2015 at 06:45:58PM +0800, Eryu Guan wrote:
> On Fri, Aug 07, 2015 at 08:21:27AM +1000, Dave Chinner wrote:
> > Seeing as you can reproduce the problem, I encourage you to work out
> > what the minimum number of files needed to reproduce the problem is,
> > and update the test to use that so that it runs even faster...
> 
> I found that 50000 files per thread is good enough for me to reproduce
> the fs corruption, sometimes WARNINGs. With 20000 or 30000 files per
> thread, only 20% to 33% runs could hit some problems. So this is what
> I'm testing (comments are not updated)
> 
> [root@dhcp-66-87-213 xfstests]# git diff
> diff --git a/tests/generic/038 b/tests/generic/038
> index 3c94a3b..7564c87 100755
> --- a/tests/generic/038
> +++ b/tests/generic/038
> @@ -108,6 +108,7 @@ trim_loop()
>  #
>  # Creating 400,000 files sequentially is really slow, so speed it up a bit
>  # by doing it concurrently with 4 threads in 4 separate directories.
> +nr_files=$((50000 * LOAD_FACTOR))
>  create_files()
>  {
>         local prefix=$1
> @@ -115,7 +116,7 @@ create_files()
>         for ((n = 0; n < 4; n++)); do
>                 mkdir $SCRATCH_MNT/$n
>                 (
> -               for ((i = 1; i <= 100000; i++)); do
> +               for ((i = 1; i <= $nr_files; i++)); do
>                         $XFS_IO_PROG -f -c "pwrite -S 0xaa 0 3900" \
>                                 $SCRATCH_MNT/$n/"${prefix}_$i" &> /dev/null
>                         if [ $? -ne 0 ]; then
> 
> Would you like a follow up patch from me or you can just make this one a v2?

Ok, I'll fold that into my original patch, update the comment and
the commit message with:

[Eryu Guan: reduced number of files to minimum needed to reproduce
 btrfs problem reliably, added $LOAD_FACTOR scaling for longer
 running.]
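[Editorial note: for readers unfamiliar with the knob, LOAD_FACTOR is an
fstests scaling variable that defaults to 1 when the runner doesn't set
it, so the folded-in change behaves like this minimal sketch:]

```shell
#!/bin/bash
# How the nr_files line in the folded-in patch scales. LOAD_FACTOR
# defaults to 1 when unset; setting it higher scales the workload
# (and the runtime) proportionally.
LOAD_FACTOR=${LOAD_FACTOR:-1}
nr_files=$((50000 * LOAD_FACTOR))
echo "$nr_files"   # 50000 with the default LOAD_FACTOR
```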

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-08-09 23:20 UTC | newest]

Thread overview: 7+ messages
2015-08-06  0:27 [PATCH] generic/038: speed up file creation Dave Chinner
2015-08-06 14:17 ` Eryu Guan
2015-08-06 22:21   ` Dave Chinner
2015-08-07  8:09     ` Filipe David Manana
2015-08-07 23:22       ` Dave Chinner
2015-08-09 10:45     ` Eryu Guan
2015-08-09 23:20       ` Dave Chinner
