From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from cn.fujitsu.com ([59.151.112.132]:31075 "EHLO heian.cn.fujitsu.com" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1750788AbaLYHi4 convert rfc822-to-8bit (ORCPT ); Thu, 25 Dec 2014 02:38:56 -0500
Message-ID: <549BBE3C.80201@cn.fujitsu.com>
Date: Thu, 25 Dec 2014 15:35:24 +0800
From: "gux.fnst"
MIME-Version: 1.0
Subject: Re: [PATCH] xfs: add test for truncate/collapse range race
References: <1419060301-26830-1-git-send-email-gux.fnst@cn.fujitsu.com> <20141224015335.GL4521@dastard>
In-Reply-To: <20141224015335.GL4521@dastard>
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: 8BIT
Sender: fstests-owner@vger.kernel.org
To: Dave Chinner
Cc: fstests@vger.kernel.org, guaneryu@gmail.com, lczerner@redhat.com
List-ID:

On 12/24/2014 09:53 AM, Dave Chinner wrote:
> On Sat, Dec 20, 2014 at 03:25:01PM +0800, Xing Gu wrote:
>> This case tests truncate/collapse range race. If
>> the race occurs, it will trigger BUG_ON.
>>
>> Signed-off-by: Xing Gu
>> ---
>
> What changed from the previous version?
>

Compared with the previous version, there are two main changes:
(1) Since this patch only checks for the truncate/collapse range race,
    the description in the previous version was not clear, so I changed
    the description.
(2) Test machines differ in performance, so it is not reasonable to run
    the loop for a fixed time (e.g. 3 minutes) as the previous version
    did. I changed the form of the loop.

> ...
>> +rm -f $seqres.full
>> +_scratch_mkfs >>$seqres.full 2>&1
>> +_scratch_mount
>> +
>> +old_bug=`dmesg | grep -c "kernel BUG"`
>> +
>> +testfile=$SCRATCH_MNT/file.$seq
>> +# fcollapse/truncate continuously and simultaneously a same file
>> +for ((i=1; i <= 100; i++)); do
>> +	for ((i=1; i <= 1000; i++)); do
>> +		$XFS_IO_PROG -f -c 'truncate 100k' $testfile 2>> $seqres.full
>> +		$XFS_IO_PROG -f -c 'fcollapse 0 16k' $testfile 2>> $seqres.full
>> +	done &
>> +	for ((i=1; i <= 1000; i++)); do
>> +		$XFS_IO_PROG -f -c 'truncate 0' $testfile 2>> $seqres.full
>> +	done &
>> +done
>
> The previous version of this ran a loop for 3 minutes, which we
> talked about being too long. This loop forks 300,000 processes
> and generates a 1.5MB $seqres.full file. On my single CPU test VM
> it takes:
>
> generic/039 302s
>
> About 5 minutes to run, so it takes longer than the 3 minute version
> of the same test we said was too long. FYI, my 16p test VM still
> takes 35s to crunch through this test and it pegs all 16 CPUs to
> 100% usage.
>
> We don't need to record the output of the xfs_io commands, so
> avoiding a fork and throwing away the output such as:
>
> 	$XFS_IO_PROG -f -c 'truncate 100k' \
> 		-c 'fcollapse 0 16k' \
> 		$testfile > /dev/null 2>&1
>
> makes the runtime on the 16p VM drop by 40% (22s) and by 33% (200s)
> on the single CPU VM. But that's still too long on the smaller CPU
> systems.
>
> I think the loop iterations need to be tuned to the number of CPUs
> in the system. This:
>
> 	NCPUS=`$here/src/feature -o`
> 	OUTER_LOOPS=$((10 * $NCPUS * $LOAD_FACTOR))
> 	INNER_LOOPS=$((50 * $NCPUS * $LOAD_FACTOR))
>
> plus the above xfs_io optimisations give a runtime of 3s on my 1p
> machine and 30s on my 16p machine. That would be more acceptable
> to everyone, I think.
>

Got it.

>> +wait
>> +
>> +new_bug=`dmesg | grep -c "kernel BUG"`
>> +if [ $new_bug -ne $old_bug ]; then
>> +	_fail "kernel bug detected, check dmesg for more information."
>> +fi
>
> A kernel bug in a process with an open file descriptor will cause
> the filesystem to be unmountable. It will hang the test and require a
> reboot. Hence there's no point in checking dmesg for a bug message
> as it will be noticed by the test failing to complete.
>

Got it.

>> +status=0
>> +exit
>> diff --git a/tests/generic/039.out b/tests/generic/039.out
>> new file mode 100644
>> index 0000000..0cacac7
>> --- /dev/null
>> +++ b/tests/generic/039.out
>> @@ -0,0 +1 @@
>> +QA output created by 039
>
> The test needs to echo something to indicate that an empty golden
> output file is expected. "Silence is golden" is the usual phrase
> here....
>

Got it.

>> 036 auto aio rw stress
>> 037 metadata auto quick
>> 038 auto stress
>> +039 auto metadata rw
>
> With the addition of $LOAD_FACTOR, this can be added to the stress
> group as well.
>

Got it.
Thanks for your suggestion!

Regards,
Xing Gu

> Cheers,
>
> Dave.
>
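
For reference, the fork counts discussed in this thread can be checked
with a short shell sketch. It is only an illustration of the arithmetic,
not the final test: count_forks and scaled_loops are hypothetical helper
names, getconf is assumed here as a stand-in for `$here/src/feature -o`
so that the sketch runs outside the fstests tree, and the "2 xfs_io calls
per pass" figure assumes the truncate and fcollapse commands are merged
into one xfs_io invocation as suggested.

```shell
#!/bin/sh
# Sketch only: fork-count arithmetic for the loop sizes discussed above.
# count_forks and scaled_loops are hypothetical helpers, not fstests code.

# Total xfs_io invocations = outer loops * inner loops * xfs_io calls per pass.
count_forks() {
	echo $(( $1 * $2 * $3 ))
}

# Loop sizes from the suggested formula; prints "OUTER_LOOPS INNER_LOOPS".
# $1 = number of CPUs, $2 = load factor.
scaled_loops() {
	echo "$(( 10 * $1 * $2 )) $(( 50 * $1 * $2 ))"
}

# Assumption: getconf replaces `$here/src/feature -o` for this sketch.
NCPUS=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
LOAD_FACTOR=${LOAD_FACTOR:-1}

# Original patch: 100 x 1000 loops, 3 xfs_io calls per pass
# (truncate 100k, fcollapse 0 16k, truncate 0).
echo "original: $(count_forks 100 1000 3) forks"

# Tuned version: truncate 100k and fcollapse share one xfs_io call,
# so 2 calls per pass (an assumption based on the suggestion above).
set -- $(scaled_loops "$NCPUS" "$LOAD_FACTOR")
echo "tuned on $NCPUS CPU(s): $(count_forks "$1" "$2" 2) forks"
```

With LOAD_FACTOR=1 this gives 10 x 50 loops (1,000 forks) on a 1-CPU
machine and 160 x 800 loops (256,000 forks) on a 16-CPU machine, so the
work grows with the number of CPUs available to absorb it.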