From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Chinner
Subject: Re: [RFC] readdirplus implementations: xgetdents vs dirreadahead syscalls
Date: Sat, 26 Jul 2014 10:38:59 +1000
Message-ID: <20140726003859.GF20518@dastard>
References: <1106785262.13440918.1406308542921.JavaMail.zimbra@redhat.com> <1717400531.13456321.1406309839199.JavaMail.zimbra@redhat.com> <20140725175257.GK17798@lenny.home.zabbo.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Abhijith Das, linux-kernel@vger.kernel.org, linux-fsdevel, cluster-devel
To: Zach Brown
Return-path:
Content-Disposition: inline
In-Reply-To: <20140725175257.GK17798@lenny.home.zabbo.net>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Fri, Jul 25, 2014 at 10:52:57AM -0700, Zach Brown wrote:
> On Fri, Jul 25, 2014 at 01:37:19PM -0400, Abhijith Das wrote:
> > Hi all,
> >
> > The topic of a readdirplus-like syscall had come up for discussion at last year's
> > LSF/MM collab summit. I wrote a couple of syscalls with their GFS2 implementations
> > to get at a directory's entries as well as stat() info on the individual inodes.
> > I'm presenting these patches and some early test results on a single-node GFS2
> > filesystem.
> >
> > 1. dirreadahead() - This patchset is very simple compared to the xgetdents() system
> > call below and scales very well for large directories in GFS2. dirreadahead() is
> > designed to be called prior to getdents+stat operations.
>
> Hmm. Have you tried plumbing these read-ahead calls in under the normal
> getdents() syscalls?

The issue is not directory block readahead (which some filesystems
like XFS already have), but issuing inode readahead during the
getdents() syscall.

It's the semi-random, interleaved inode IO that is being optimised
here (i.e. queued, ordered, issued, cached), not the directory
blocks themselves. As such, why does this need to be done in the
kernel?
This can all be done in userspace, and even hidden within the
readdir() or ftw()/nftw() implementations themselves so it's OS,
kernel and filesystem independent...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
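
To illustrate the userspace suggestion above, here is a minimal sketch of one way a library could order the inode lookups itself: read every entry with readdir(), sort the entries by d_ino, then stat them in that order. It is not code from this thread. The helper name stat_dir_sorted() is hypothetical, and the idea that inode-number order approximates on-disk inode order is an assumption that holds to varying degrees depending on the filesystem. The sketch covers only the "ordered" part; queuing the lookups asynchronously (for example from worker threads) is left out.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for fstatat(), dirfd(), d_ino */
#endif
#include <dirent.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* One collected directory entry: inode number plus name. */
struct ent {
    ino_t ino;
    char name[NAME_MAX + 1];
};

/* qsort() comparator: ascending inode number. */
static int cmp_ino(const void *a, const void *b)
{
    ino_t x = ((const struct ent *)a)->ino;
    ino_t y = ((const struct ent *)b)->ino;
    return (x > y) - (x < y);
}

/*
 * Hypothetical helper: stat every entry of `path` in inode-number
 * order. Returns the number of entries stat'ed successfully, or -1
 * if the directory cannot be opened or memory runs out.
 */
long stat_dir_sorted(const char *path)
{
    DIR *dir = opendir(path);
    if (!dir)
        return -1;

    struct ent *ents = NULL;
    size_t n = 0, cap = 0;
    struct dirent *de;

    /* Pass 1: collect names and inode numbers; no inode IO yet. */
    while ((de = readdir(dir)) != NULL) {
        if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
            continue;
        if (n == cap) {
            size_t ncap = cap ? cap * 2 : 64;
            struct ent *tmp = realloc(ents, ncap * sizeof(*ents));
            if (!tmp) {
                free(ents);
                closedir(dir);
                return -1;
            }
            ents = tmp;
            cap = ncap;
        }
        ents[n].ino = de->d_ino;
        snprintf(ents[n].name, sizeof(ents[n].name), "%s", de->d_name);
        n++;
    }

    /* Order the lookups (assumption: inode order tracks disk order). */
    if (n > 1)
        qsort(ents, n, sizeof(*ents), cmp_ino);

    /* Pass 2: stat in sorted order, relative to the open directory. */
    long ok = 0;
    struct stat st;
    int dfd = dirfd(dir);
    for (size_t i = 0; i < n; i++)
        if (fstatat(dfd, ents[i].name, &st, AT_SYMLINK_NOFOLLOW) == 0)
            ok++;

    free(ents);
    closedir(dir);
    return ok;
}
```

A caller would run stat_dir_sorted(path) to warm the inode cache before its usual getdents+stat walk, or a readdir() wrapper could do the same internally. Whether this helps at all depends on the filesystem and on whether the inodes are already cached.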