From: Jörn Engel
Subject: Re: Filesystem benchmarks on reasonably fast hardware
Date: Mon, 18 Jul 2011 16:34:50 +0200
Message-ID: <20110718143450.GH1437@logfs.org>
In-Reply-To: <20110718105749.GE30254@dastard>
References: <20110717160501.GA1437@logfs.org> <20110717233252.GH21663@dastard> <20110718075339.GB1437@logfs.org> <20110718105749.GE30254@dastard>
To: Dave Chinner
Cc: linux-fsdevel@vger.kernel.org

On Mon, 18 July 2011 20:57:49 +1000, Dave Chinner wrote:
> On Mon, Jul 18, 2011 at 09:53:39AM +0200, Jörn Engel wrote:
> > On Mon, 18 July 2011 09:32:52 +1000, Dave Chinner wrote:
> > > On Sun, Jul 17, 2011 at 06:05:01PM +0200, Jörn Engel wrote:
> 
> > > > xfs:
> > > > ====
> > > > seqrd       1       2       4       8      16      32      64     128
> > > > 16384    4698    4424    4397    4402    4394    4398    4642    4679
> > > >  8192    6234    5827    5797    5801    5795    6114    5793    5812
> > > >  4096    9100    8835    8882    8896    8874    8890    8910    8906
> > > >  2048   14922   14391   14259   14248   14264   14264   14269   14273
> > > >  1024   23853   22690   22329   22362   22338   22277   22240   22301
> > > >   512   37353   33990   33292   33332   33306   33296   33224   33271

seqrd       1       2       4       8      16      32      64     128
16384    4542    8311   15738   28955   38273   36644   38530   38527
 8192    6000   10413   19208   33878   65927   76906   77083   77102
 4096    8931   14971   24794   44223   83512  144867  147581  150702
 2048   14375   23489   34364   56887  103053  192662  307167  309222
 1024   21647   36022   49649   77163  132886  243296  421389  497581
  512   31832   61257   79545  108782  176341  303836  517814  584741

Quite a nice improvement for such a small patch.  As they say, "every
small factor of 17 helps". ;)

What bothers me a bit is that the single-threaded numbers took such a
noticeable hit...

> Ok, the patch below takes the numbers on my test setup on a 16k IO
> size:
> 
> seqrd        1      2      4      8     16
> vanilla   3603   2798   2563   not tested...
> patches   3707   5746  10304  12875  11016

...in particular when your numbers improve even for a single thread.
Wonder what's going on here.

Anyway, feel free to add a Tested-By: or something from me.  And maybe
fix the two typos below.
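For anyone reading along who doesn't know the XFS locking helpers, the
pattern the patch implements boils down to the sketch below: take the IO
lock shared for the common case and only escalate to exclusive when the
page cache actually needs invalidating.  The names and helpers here are
made up for illustration (plain pthread rwlocks, not the real
xfs_rw_ilock()/XFS_IOLOCK code), and unlike pthread locks XFS can demote
the exclusive lock back to shared without dropping it:

#include <pthread.h>
#include <stddef.h>

/* Illustrative stand-in for the inode/file state involved. */
struct file_state {
	pthread_rwlock_t iolock;	/* plays the role of the XFS IO lock */
	unsigned long nrpages;		/* cached pages a DIO read must purge */
};

/* Hypothetical helper: write back and drop the cached pages. */
static int invalidate_cached_pages(struct file_state *fs)
{
	fs->nrpages = 0;		/* pretend the page cache is now empty */
	return 0;
}

/* Hypothetical helper: issue the actual direct read. */
static int submit_direct_read(struct file_state *fs, void *buf, size_t len)
{
	(void)fs; (void)buf; (void)len;
	return 0;
}

static int dio_read(struct file_state *fs, void *buf, size_t len)
{
	int ret = 0;

	/* Common case: no cached pages, stay shared and run concurrently. */
	pthread_rwlock_rdlock(&fs->iolock);

	if (fs->nrpages) {
		/*
		 * Rare case: cached pages must be invalidated before the
		 * direct read.  pthread rwlocks cannot be upgraded or
		 * demoted in place, so drop and retake; the XFS patch
		 * instead demotes the exclusive lock back to shared once
		 * the invalidation is done.
		 */
		pthread_rwlock_unlock(&fs->iolock);
		pthread_rwlock_wrlock(&fs->iolock);

		if (fs->nrpages)	/* re-check after reacquiring */
			ret = invalidate_cached_pages(fs);

		pthread_rwlock_unlock(&fs->iolock);
		if (ret)
			return ret;
		pthread_rwlock_rdlock(&fs->iolock);
	}

	ret = submit_direct_read(fs, buf, len);
	pthread_rwlock_unlock(&fs->iolock);
	return ret;
}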
> xfs: don't serialise direct IO reads on page cache checks
> 
> From: Dave Chinner
> 
> There is no need to grab the i_mutex of the IO lock in exclusive
> mode if we don't need to invalidate the page cache. Taking hese
                                                             ^
> locks on every direct IO effective serialisaes them as taking the IO
                                     ^
> lock in exclusive mode has to wait for all shared holders to drop
> the lock. That only happens when IO is complete, so effective it
> prevents dispatch of concurrent direct IO reads to the same inode.
> 
> Fix this by taking the IO lock shared to check the page cache state,
> and only then drop it and take the IO lock exclusively if there is
> work to be done. Hence for the normal direct IO case, no exclusive
> locking will occur.
> 
> Signed-off-by: Dave Chinner
> ---
>  fs/xfs/linux-2.6/xfs_file.c |   17 ++++++++++++++---
>  1 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/linux-2.6/xfs_file.c b/fs/xfs/linux-2.6/xfs_file.c
> index 1e641e6..16a4bf0 100644
> --- a/fs/xfs/linux-2.6/xfs_file.c
> +++ b/fs/xfs/linux-2.6/xfs_file.c
> @@ -321,7 +321,19 @@ xfs_file_aio_read(
>  	if (XFS_FORCED_SHUTDOWN(mp))
>  		return -EIO;
>  
> -	if (unlikely(ioflags & IO_ISDIRECT)) {
> +	/*
> +	 * Locking is a bit tricky here. If we take an exclusive lock
> +	 * for direct IO, we effectively serialise all new concurrent
> +	 * read IO to this file and block it behind IO that is currently in
> +	 * progress because IO in progress holds the IO lock shared. We only
> +	 * need to hold the lock exclusive to blow away the page cache, so
> +	 * only take lock exclusively if the page cache needs invalidation.
> +	 * This allows the normal direct IO case of no page cache pages to
> +	 * proceeed concurrently without serialisation.
> +	 */
> +	xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
> +	if ((ioflags & IO_ISDIRECT) && inode->i_mapping->nrpages) {
> +		xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED);
>  		xfs_rw_ilock(ip, XFS_IOLOCK_EXCL);
>  
>  		if (inode->i_mapping->nrpages) {
> @@ -334,8 +346,7 @@ xfs_file_aio_read(
>  			}
>  		}
>  		xfs_rw_ilock_demote(ip, XFS_IOLOCK_EXCL);
> -	} else
> -		xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
> +	}
>  
>  	trace_xfs_file_read(ip, size, iocb->ki_pos, ioflags);
>  

Jörn

-- 
Everything should be made as simple as possible, but not simpler.
-- Albert Einstein