public inbox for fstests@vger.kernel.org
 help / color / mirror / Atom feed
From: Eryu Guan <eguan@redhat.com>
To: nave vardy <nave.vardy@plexistor.com>
Cc: fstests@vger.kernel.org
Subject: Re: [PATCH] generic/409: reflink concurrent operations
Date: Tue, 28 Feb 2017 10:49:12 +0800	[thread overview]
Message-ID: <20170228024912.GD16443@eguan.usersys.redhat.com> (raw)
In-Reply-To: <1488205583.6778.34.camel@plexistor.com>

On Mon, Feb 27, 2017 at 04:26:23PM +0200, nave vardy wrote:
> On Fri, 2017-02-24 at 17:47 +0800, Eryu Guan wrote:
> > On Thu, Feb 23, 2017 at 02:30:54PM +0200, Nave Vardy wrote:
> > > 
> > > reflink: concurrent operations test
> > > 
> > > perform read operation on the target file while
> > > doing write or fpunch operations on the reflinks.
> > I'm having the same question as Darrick's, is this test reproducing
> > some
> > kind of bugs for you?
> Maybe it would be better if I'll first explain my motivation for
> sending the patch. I am working at Plexistor a company that develops a
> new file-system for persistent memory. The file-system supports cow,
> and we wrote some fstests, we thought to contribute for the community.
> For the test itself, it helped us to found a bug in the development of
> cow, no other generic test currently exists did.

Thanks a lot for writing new tests!

> > 
> > > 
> > > 
> > > Signed-off-by: Nave Vardy <nave.vardy@plexistor.com>
> > > ---
> > >  tests/generic/409     | 97
> > > +++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  tests/generic/409.out |  2 ++
> > >  tests/generic/group   |  1 +
> > >  3 files changed, 100 insertions(+)
> > >  create mode 100644 tests/generic/409
> > 0755 file mode for new test. This is done automatically if you use
> > "./new generic" to setup new test template.
> > 
> In the next version of the patch I'll use this.
> > > 
> > >  create mode 100644 tests/generic/409.out
> > > 
> > > diff --git a/tests/generic/409 b/tests/generic/409
> > > new file mode 100644
> > > index 0000000..eb1da13
> > > --- /dev/null
> > > +++ b/tests/generic/409
> > > @@ -0,0 +1,97 @@
> > > +#!/bin/bash
> > > +# FS QA Test No. 409
> > > +#
> > > +# test for races between write or fpunch operations on reflinked
> > > files
> > > +# to read operations on the targed file
> > > +#
> > > +#-----------------------------------------------------------------
> > > ------
> > > +#
> > > +# Copyright (c) 2017 Plexistor Ltd. All Rights Reserved.
> > > +#
> > > +# This program is free software; you can redistribute it and/or
> > > +# modify it under the terms of the GNU General Public License as
> > > +# published by the Free Software Foundation.
> > > +#
> > > +# This program is distributed in the hope that it would be useful,
> > > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > +# GNU General Public License for more details.
> > > +#
> > > +# You should have received a copy of the GNU General Public
> > > License
> > > +# along with this program; if not, write the Free Software
> > > Foundation,
> > > +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> > > +#
> > > +#-----------------------------------------------------------------
> > > ------
> > > +
> > > +seq=`basename $0`
> > > +seqres=$RESULT_DIR/$seq
> > > +echo "QA output created by $seq"
> > > +
> > > +here=`pwd`
> > > +tmp=/tmp/$$
> > > +status=0    # success is the default!
> > > +trap "_cleanup; exit \$status" 0 1 2 3 15
> > > +
> > > +_cleanup()
> > > +{
> > > +	cd /
> > > +	rm -f $tmp.*
> > > +}
> > > +
> > > +# get standard environment, filters and checks
> > > +. ./common/rc
> > > +. ./common/reflink
> > > +
> > > +# remove previous $seqres.full before test
> > > +rm -f $seqres.full
> > > +
> > > +# real QA test starts here
> > > +_supported_fs generic
> > > +_supported_os Linux
> > > +_require_scratch_reflink
> > > +_require_cp_reflink
> > > +
> > > +_scratch_mkfs_sized $((400 * 1024 ** 2)) >> $seqres.full 2>&1 ||
> > > _fail "mkfs failed"
> > Same question on the filesystem size.
> > 
> I used _scratch_mkfs_sized to avoid enospace in this scenario. 
> I took a lager margin than I needed, which I can narrow down.
> The targted file is 10mb and I used 20 reflinks, because the tests
> writes the reflinks, in the end each will take place of max 10mb (it is
> not accurate because I use fpunch for every second block), so we have
> 21 files sum to max size of 210mb, so I think 250mb will be good
> enough.

Then I think, as Darrick suggested, _require_fs_space() is what you need.

> > > 
> > > +_scratch_mount || _fail "mount failed"
> > > +
> > > +echo "Silence is golden"
> > > +
> > > +workfile=${SCRATCH_MNT}/workfile
> > > +light_clone=${SCRATCH_MNT}/light_clone
> > > +
> > > +file_size=$((10 * 1024 * 1024))
> > > +bs=4096
> I used 4096 because this is our file-system copy-on-write granularity,
> this is in order to get maximum effect. I'll be able to use
> _get_file_block_size().
> > > +block_num=$((file_size / bs))
> > > +reflinks_num=20
> > > +
> > > +$XFS_IO_PROG -f -c "pwrite 0 $file_size" $workfile >> $seqres.full
> > > +
> > > +#create all reflinkfs
> > > +for ((i=1; i<reflinks_num; i++));
> > > +do
> > > +	cp --reflink $workfile ${light_clone}_$i
> I'll change this to use _cp_reflink()
> > > +done
> > > +
> > > +#for each block simultaneously write to all reflinks
> > > +for ((block_index=0; block_index<block_num; block_index++));
> > > +do
> > > +	for ((i=0; i<reflinks_num; i++));
> > > +	do
> > > +		if [ $i -eq 0 ]; then
> > > +			$XFS_IO_PROG -c "pread $((block_index *
> > > bs)) $bs" \
> > > +						$workfile >>
> > > $seqres.full &
> > > +		elif [ $((block_index % 2)) -eq 0 ]; then
> > > +			$XFS_IO_PROG -c "pwrite $((block_index *
> > > bs)) $bs" \
> > > +					${light_clone}_$i  >>
> > > $seqres.full &
> > > +		else
> > > +			$XFS_IO_PROG -c "fpunch $((block_index *
> > > bs)) $bs" \
> > > +					${light_clone}_$i >>
> > > $seqres.full &
> > > +		fi
> > I don't see much concurrency in this loop, for each block_index
> > there's
> > only one pread process and reflinks_num processes doing pwrite or
> > fpunch. Seems only pwrites are racing with pwrites and fpunchs are
> > racing with fpunchs. How about:
> > 
> > for ((block_index=0; block_index<block_num; block_index++)); do
> > 	for ((i=0; i<reflinks_num; i++)); do
> > 		$XFS_IO_PROG -c "pread ..." ... &
> > 		if [ $((i % 2)) -eq 0 ]; then
> > 			$XFS_IO_PROG -c "pwrite ..." ... &
> > 		else
> > 			$XFS_IO_PROG -c "fpunch ..." ... &
> > 		fi
> > 	done
> > 	wait
> > done
> > 
> > So for every pwrite or fpunch there's a pread racing with it. But
> > because I don't know if you're reproducing some specific bug with
> > this
> > loop, I'm not sure if my version still works.
> > 
> I understand your comment and your suggestion, we interntionally didn't
> won't to modify the workfile so we just read it and race the pread with
> the pwrites of the reflinks (or the fpunchs). we wanted that all
> reflinks simultaneously will perform pwrite and in the next iteration
> fpunch (and not half reflinks pwrite and the other half fpunch - which
> is fine but describes another scenario). I think we can combine your
> suggestion with my version, to get more concurrency like this:

Looks reasonable, please add some comments to describe this workload.

> 
> for ((block_index=0; block_index<block_num; block_index++)); do
>      for ((i=0; i<reflinks_num; i++)); do

Perhaps we can start with i=1 then remove the "$i -ne 0" check?

Thanks,
Eryu

>           $XFS_IO_PROG -c "pread $((block_index * 	bs)) $bs" \
> 					$workfile >> $seqres.full &
>           if [ $i -ne 0 ]; then
>               if [ $((block_index % 2)) -eq 0 ]; then
> 		  $XFS_IO_PROG -c "pwrite $((block_index * bs)) $bs" \        
>                             ${light_clone}_$i >> $seqres.full &
>               else
>                   $XFS_IO_PROG -c "fpunch $((block_index * bs)) $bs" \
>                                     ${light_clone}_$i >> $seqres.full &
>               fi
>            fi
>        wait
>        done
> done		 
> > > +	wait
> > > +done
> > > +
> > > +# success, all done
> > > +status=0
> > > +exit
> > > diff --git a/tests/generic/409.out b/tests/generic/409.out
> > > new file mode 100644
> > > index 0000000..6f11537
> > > --- /dev/null
> > > +++ b/tests/generic/409.out
> > > @@ -0,0 +1,2 @@
> > > +QA output created by 409
> > > +Silence is golden
> > > diff --git a/tests/generic/group b/tests/generic/group
> > > index d14f221..27ff229 100644
> > > --- a/tests/generic/group
> > > +++ b/tests/generic/group
> > > @@ -411,3 +411,4 @@
> > >  406 auto quick dangerous
> > >  407 auto quick clone metadata
> > >  408 auto quick clone dedupe metadata
> > > +409 auto quick clone
> > If you decide to increase the concurrency/workload, probably the test
> > will not be quick enough to fit in quick group :)
> > 
> It is not too important if it would only be in clone and auto
> > Thanks,
> > Eryu
> > 
> > > 

  reply	other threads:[~2017-02-28  8:13 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-23 12:30 [PATCH] generic/409: reflink concurrent operations Nave Vardy
2017-02-24  3:38 ` Darrick J. Wong
2017-02-24  9:47 ` Eryu Guan
2017-02-27 14:26   ` nave vardy
2017-02-28  2:49     ` Eryu Guan [this message]
2017-02-28  3:34     ` Eryu Guan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170228024912.GD16443@eguan.usersys.redhat.com \
    --to=eguan@redhat.com \
    --cc=fstests@vger.kernel.org \
    --cc=nave.vardy@plexistor.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox