linux-fsdevel.vger.kernel.org archive mirror
From: Abhijith Das <adas@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Andreas Dilger <adilger@dilger.ca>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	cluster-devel@redhat.com, Steven Whitehouse <swhiteho@redhat.com>
Subject: Re: [RFC PATCH 0/2] dirreadahead system call
Date: Sun, 9 Nov 2014 22:41:17 -0500 (EST)
Message-ID: <1672873802.9686431.1415590877143.JavaMail.zimbra@redhat.com>
In-Reply-To: <1218263590.8113472.1413868891802.JavaMail.zimbra@redhat.com>


> Hi Dave/all,
>
> I finally got around to playing with the multithreaded userspace readahead
> idea and the results are quite promising. I tried to mimic what my kernel
> readahead patch did with this userspace program (userspace_ra.c)
> Source code here:
> https://www.dropbox.com/s/am9q26ndoiw1cdr/userspace_ra.c?dl=0
>
> Each thread has an associated buffer into which a chunk of directory
> entries are read in using getdents(). Each thread then sorts the entries in
> inode number order (for GFS2, this is also their disk block order) and
> proceeds to cache in the inodes in that order by issuing open(2) syscalls
> against them.
> In my tests, I backgrounded this program and issued an 'ls -l' on the dir
> in question. I did the same following the kernel dirreadahead syscall as
> well.
>
> I did not manage to test out too many parameter combinations for both
> userspace_ra and SYS_dirreadahead because the test matrix got pretty big and
> time consuming. However, I did notice that without sorting, userspace_ra did
> not perform as well in some of my tests. I haven't investigated that yet,
> so the numbers shown here are all with sorting enabled.
>
> For a directory with 100000 files,
> a) simple 'ls -l' took 14m11s
> b) SYS_dirreadahead + 'ls -l' took 3m9s, and
> c) userspace_ra (1M buffer/thread, 32 threads) took 1m42s
>
> https://www.dropbox.com/s/85na3hmo3qrtib1/ra_vs_u_ra_vs_ls.jpg?dl=0 is a
> graph that contains a few more data points. In the graph, along with data
> for 'ls -l' and SYS_dirreadahead, there are six data series for userspace_ra
> for each directory size (10K, 100K and 200K files). i.e. u_ra:XXX,YYY, where
> XXX is one of (64K, 1M) buffer size and YYY is one of (4, 16, 32) threads.
>
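For readers who cannot fetch the linked userspace_ra.c, the per-thread steps
described in the quoted text (read directory entries, sort them by inode
number, then open each one in that order) can be sketched roughly as below.
This is not the original program. It is a single-threaded simplification that
assumes readdir(3) instead of raw getdents() buffers, and the names ra_entry,
cmp_ino and ra_warm_dir are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record: inode number (the sort key) plus the entry name. */
struct ra_entry {
    ino_t ino;
    char name[NAME_MAX + 1];
};

/* qsort comparator: ascending inode number. */
static int cmp_ino(const void *a, const void *b)
{
    ino_t x = ((const struct ra_entry *)a)->ino;
    ino_t y = ((const struct ra_entry *)b)->ino;
    return (x > y) - (x < y);
}

/*
 * Read all entries of `path`, sort them by inode number, then open and
 * close each entry in that order so its inode gets pulled into the cache.
 * Returns the number of entries opened successfully, or -1 on error.
 */
static long ra_warm_dir(const char *path)
{
    assert(path != NULL);
    DIR *dir = opendir(path);
    if (!dir)
        return -1;

    struct ra_entry *ents = NULL;
    size_t n = 0, cap = 0;
    struct dirent *de;
    while ((de = readdir(dir)) != NULL) {
        if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
            continue;
        if (n == cap) {                      /* grow the array geometrically */
            size_t ncap = cap ? cap * 2 : 1024;
            struct ra_entry *tmp = realloc(ents, ncap * sizeof(*ents));
            if (!tmp) {
                free(ents);
                closedir(dir);
                return -1;
            }
            ents = tmp;
            cap = ncap;
        }
        size_t len = strlen(de->d_name);
        if (len > NAME_MAX)
            len = NAME_MAX;
        memcpy(ents[n].name, de->d_name, len);
        ents[n].name[len] = '\0';
        ents[n].ino = de->d_ino;
        n++;
    }

    if (n > 1)
        qsort(ents, n, sizeof(*ents), cmp_ino);

    long opened = 0;
    int dfd = dirfd(dir);
    for (size_t i = 0; i < n; i++) {
        /* O_NONBLOCK is an assumption here, to avoid hanging on FIFOs. */
        int fd = openat(dfd, ents[i].name, O_RDONLY | O_NONBLOCK);
        if (fd >= 0) {
            close(fd);
            opened++;
        }
    }
    free(ents);
    closedir(dir);
    return opened;
}
```

A multithreaded variant along the lines described above would presumably give
each thread its own chunk of entries and run the sort-and-open loop per chunk;
that part is left out of this sketch.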

Hi,

Here are some more numbers for larger directories. It seems like userspace
readahead scales well and is still a good option.

I've chosen the best-performing runs for kernel readahead and userspace
readahead. I have data for runs with different parameters (buffer size, number
of threads, etc.) that I can provide, if anybody's interested.

The numbers here are total elapsed times, in seconds, for the readahead plus
'ls -l' operations to complete.

                                                       #files in testdir
                                                  50k   100k   200k   500k     1m
---------------------------------------------------------------------------------
Readdir 'ls -l'                                    11    849   1873   5024  10365
Kernel readahead + 'ls -l' (best case)              7    214    814   2330   4900
Userspace MT readahead + 'ls -l' (best case)       12     99    239   1351   4761
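
To make the table easier to compare, the speedup of each readahead variant
over plain 'ls -l' can be computed as baseline time divided by variant time.
The snippet below is only a convenience sketch: the numbers are copied from
the table above, and the names speedup and print_speedups are made up for
illustration.

```c
#include <assert.h>
#include <stdio.h>

#define NSIZES 5

/* Elapsed times copied from the table, one column per directory size. */
static const char *const dir_sizes[NSIZES] = {"50k", "100k", "200k", "500k", "1m"};
static const double t_ls[NSIZES]     = {11, 849, 1873, 5024, 10365};
static const double t_kernel[NSIZES] = {7, 214, 814, 2330, 4900};
static const double t_user[NSIZES]   = {12, 99, 239, 1351, 4761};

/* Speedup over the baseline: values above 1.0 mean the variant was faster. */
static double speedup(double baseline, double t)
{
    assert(t > 0.0);
    return baseline / t;
}

/* Print one row per directory size with both speedup factors. */
static void print_speedups(FILE *out)
{
    fprintf(out, "%-6s %10s %10s\n", "#files", "kernel", "userspace");
    for (int i = 0; i < NSIZES; i++)
        fprintf(out, "%-6s %9.2fx %9.2fx\n", dir_sizes[i],
                speedup(t_ls[i], t_kernel[i]),
                speedup(t_ls[i], t_user[i]));
}
```

Calling print_speedups(stdout) from a main() prints the factors. With the
table's numbers, userspace readahead is roughly 8.6x faster than plain 'ls -l'
at 100k files, while at 1m files both variants land at a little over 2x, and
at 50k files the userspace run is slightly slower than the baseline.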

Cheers!
--Abhi


