Linux Btrfs filesystem development
From: Zorro Lang <zlang@redhat.com>
To: Naohiro Aota <naohiro.aota@wdc.com>
Cc: fstests@vger.kernel.org, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v3 1/3] common/rc: introduce _random_file() helper
Date: Fri, 25 Aug 2023 22:00:09 +0800
Message-ID: <20230825140009.4pg43yyprmunrxkn@zlang-mailbox>
In-Reply-To: <20230825133948.oubggt74y7cmci2j@zlang-mailbox>

On Fri, Aug 25, 2023 at 09:39:48PM +0800, Zorro Lang wrote:
> On Mon, Aug 21, 2023 at 04:12:11PM +0900, Naohiro Aota wrote:
> > Currently, we use "ls ... | sort -R | head -n1" (or tail) to choose a
> > random file in a directory. It sorts the files with "ls", sorts them
> > randomly, and picks the first line, which wastes the "ls" sort.
> > 
> > Also, using "sort -R | head -n1" is inefficient. For example, in a
> > directory with 1000000 files, it takes more than 15 seconds to pick a file.
> > 
> >   $ time bash -c "ls -U | sort -R | head -n 1 >/dev/null"
> >   bash -c "ls -U | sort -R | head -n 1 >/dev/null"  15.38s user 0.14s system 99% cpu 15.536 total
> > 
> >   $ time bash -c "ls -U | shuf -n 1 >/dev/null"
> >   bash -c "ls -U | shuf -n 1 >/dev/null"  0.30s user 0.12s system 138% cpu 0.306 total
> > 
> > So, we should just use "ls -U" and "shuf -n 1" to choose a random file.
> > Introduce _random_file() helper to do it properly.
> > 
> > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> > ---
> >  common/rc | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/common/rc b/common/rc
> > index 5c4429ed0425..4d414955f6d9 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -5224,6 +5224,13 @@ _soak_loop_running() {
> >  	return 0
> >  }
> >  
> > +# Return a random file in a directory. A directory is *not* followed
> > +# recursively.
> > +_random_file() {
> > +	local basedir=$1
> > +	echo "$basedir/$(ls -U $basedir | shuf -n 1)"
> 
> I think the "1" could be a second argument, since we might sometimes want
> a list of random files. For example:
> 
>   local basedir=$1
>   local num=$2
>   local opt
> 
>   if [ -n "$num" ];then
> 	  opt="-n $num"
>   fi
>   echo "$basedir/$(ls -U $basedir | shuf $opt)"
> 
> What do you think?

Hmm... nack my own review point. It turns a simple change into a
complicated one, especially with multiple output lines. I'll merge this
patchset first; we can add the second argument later if we ever need
that feature, or if you're interested in doing it.
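
For reference, here is a minimal sketch of how a second argument might be
handled so that every output line gets the directory prefix (a single
echo "$basedir/$(...)" would prefix only the first line). The name
_random_files and the default count of 1 are assumptions for illustration
only; they are not part of this patchset:

```shell
# Hypothetical multi-file variant of _random_file(); not part of the patch.
# Prints up to $2 (default 1) random entries of directory $1, one per line.
# The directory is *not* followed recursively, and file names containing
# newlines are not handled.
_random_files() {
	local basedir=$1
	local num=${2:-1}
	local name

	# "ls -U" skips the sort, "shuf -n" picks $num random lines, and the
	# loop prefixes each picked name with the directory.
	ls -U "$basedir" | shuf -n "$num" | while IFS= read -r name; do
		printf '%s\n' "$basedir/$name"
	done
}
```

With this shape, "_random_files $dir" would behave like the one-file helper
for a non-empty directory, and "_random_files $dir 3" would print three
distinct paths. Unlike the original, an empty directory produces no output
rather than "$basedir/".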

Thanks,
Zorro

> 
> Thanks,
> Zorro
> 
> > +}
> > +
> >  init_rc
> >  
> >  ################################################################################
> > -- 
> > 2.41.0
> > 


Thread overview: 10+ messages
2023-08-21  7:12 [PATCH v3 0/3] use shuf to choose a random file Naohiro Aota
2023-08-21  7:12 ` [PATCH v3 1/3] common/rc: introduce _random_file() helper Naohiro Aota
2023-08-21  9:24   ` Anand Jain
2023-08-25 13:39   ` Zorro Lang
2023-08-25 14:00     ` Zorro Lang [this message]
2023-08-28  1:37       ` Naohiro Aota
2023-08-21  7:12 ` [PATCH v3 2/3] fstests/btrfs: use " Naohiro Aota
2023-08-21  9:25   ` Anand Jain
2023-08-21  7:12 ` [PATCH v3 3/3] btrfs/004: use shuf to shuffle the file lines Naohiro Aota
2023-08-21  9:25   ` Anand Jain
