public inbox for linux-xfs@vger.kernel.org
* Performance regression between 2.6.32 and 2.6.38
@ 2011-09-10  0:23 Joshua Aune
  2011-09-10  6:05 ` Christoph Hellwig
  0 siblings, 1 reply; 4+ messages in thread
From: Joshua Aune @ 2011-09-10  0:23 UTC (permalink / raw)
  To: xfs@oss.sgi.com; +Cc: Paul Saab

Hi,

We have been doing some performance testing on a handful of kernels and are seeing a significant performance regression at low numbers of outstanding I/Os somewhere between 2.6.32 and 2.6.38.  The test case shows a large drop in random read IOPS (45k -> 8k) and a much noisier latency profile, with a long tail of slow I/Os.

We also tested against the raw block device and against ext4.  The performance profiles of those tests were fairly consistent between the 2.6.32- and 3.0-based kernels, where most of the testing was done.

Also worth noting: the test case below uses 24 threads with one I/O each (~24 outstanding in total).  We also ran a small number of tests using 4 threads with libaio and 64 I/Os each (~256 outstanding in total); those showed fairly stable performance across the various kernel versions (a sketch of that job is included after the fio command below).


-- Results

2.6.32-71.el6.x86_64
   iops=45,694
   bw=731,107 KB/s
   lat (usec): min=149, max=2465, avg=523.58, stdev=106.68
   lat (usec): 250=0.01%, 500=48.93%, 750=48.30%, 1000=2.70%
   lat (msec): 2=0.07%, 4=0.01%

2.6.40.3-0.fc15.x86_64 (aka 3.0)
   iops=8,043
   bw=128,702 KB/s
   lat (usec): min=77, max=147441, avg=452.33, stdev=2773.88
   lat (usec): 100=0.01%, 250=61.30%, 500=37.59%, 750=0.01%, 1000=0.01%
   lat (msec): 2=0.05%, 4=0.04%, 10=0.30%, 20=0.33%, 50=0.30%
   lat (msec): 100=0.07%, 250=0.01%


-- Testing Configuration

Most testing was performed on various two-socket Intel Xeon X5600-class server systems using various models of ioDrive.  The results above are from a 160GB ioDrive with the 2.3.1 driver.

The fio benchmark tool was used for most of the testing, but another benchmark showed similar results.


-- Testing Process

# Load the ioDrive driver
modprobe iomemory-vs

# Reset the ioDrive back to a known state
fio-detach /dev/fct0
fio-format -y /dev/fct0
fio-attach /dev/fct0

# Setup XFS for testing and create the sample file
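# (-i size=2048 selects 2048-byte inodes, larger than the mkfs.xfs default)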
mkfs.xfs -i size=2048 /dev/fioa
mkdir -p /mnt/tmp
mount -t xfs /dev/fioa /mnt/tmp
dd if=/dev/zero of=/mnt/tmp/bigfile bs=1M oflag=direct count=$((10*1024))

# Run fio test
fio --direct=1 --rw=randread --bs=16k --numjobs=24 --runtime=60 --group_reporting --norandommap --time_based --ioengine=sync --name=file1 --filename=/mnt/tmp/bigfile
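
# For comparison, a sketch of the libaio variant mentioned above (4 threads,
# 64 I/Os each).  Flags other than numjobs, iodepth and ioengine are assumed
# to match the sync job, so treat it as approximate rather than the exact
# command we ran:
fio --direct=1 --rw=randread --bs=16k --numjobs=4 --iodepth=64 --runtime=60 --group_reporting --norandommap --time_based --ioengine=libaio --name=file1 --filename=/mnt/tmp/bigfile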


-- Other

Are there any mount options or other tests that can be run in the failing configuration that would be helpful to isolate this further?

Thanks,
Josh


Please cc Paul and me; we are not subscribed to the list.




* Re: Performance regression between 2.6.32 and 2.6.38
  2011-09-10  0:23 Performance regression between 2.6.32 and 2.6.38 Joshua Aune
@ 2011-09-10  6:05 ` Christoph Hellwig
  2011-09-10 18:10   ` Paul Saab
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2011-09-10  6:05 UTC (permalink / raw)
  To: Joshua Aune; +Cc: Paul Saab, xfs@oss.sgi.com

On Fri, Sep 09, 2011 at 06:23:54PM -0600, Joshua Aune wrote:
> Are there any mount options or other tests that can be run in the failing configuration that would be helpful to isolate this further?

The best thing would be to bisect it down to at least a kernel release,
and if possible to a -rc or individual change (the latter might start
to get hard due to various instabilities in early -rc kernels)
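
Roughly, a sketch of that bisect (version tags taken from the report; build,
boot and the fio run happen between each step):

git bisect start
git bisect bad v2.6.38     # first release known to show the regression
git bisect good v2.6.32    # last release known to be fast
# build and boot the kernel git checks out, run the fio job, then mark it:
git bisect good            # or: git bisect bad
# repeat until git prints the first bad commit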


* Re: Performance regression between 2.6.32 and 2.6.38
  2011-09-10  6:05 ` Christoph Hellwig
@ 2011-09-10 18:10   ` Paul Saab
  2011-09-10 18:26     ` Christoph Hellwig
  0 siblings, 1 reply; 4+ messages in thread
From: Paul Saab @ 2011-09-10 18:10 UTC (permalink / raw)
  To: Christoph Hellwig, Joshua Aune; +Cc: xfs@oss.sgi.com

On 9/9/11 11:05 PM, "Christoph Hellwig" <hch@infradead.org> wrote:

>On Fri, Sep 09, 2011 at 06:23:54PM -0600, Joshua Aune wrote:
>> Are there any mount options or other tests that can be run in the
>>failing configuration that would be helpful to isolate this further?
>
>The best thing would be to bisect it down to at least a kernel release,
>and if possible to a -rc or individual change (the latter might start
>to get hard due to various instabilities in early -rc kernels)

487f84f3 is where the regression was introduced.


* Re: Performance regression between 2.6.32 and 2.6.38
  2011-09-10 18:10   ` Paul Saab
@ 2011-09-10 18:26     ` Christoph Hellwig
  0 siblings, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2011-09-10 18:26 UTC (permalink / raw)
  To: Paul Saab; +Cc: Christoph Hellwig, Joshua Aune, xfs@oss.sgi.com


On Sat, Sep 10, 2011 at 06:10:50PM +0000, Paul Saab wrote:
> On 9/9/11 11:05 PM, "Christoph Hellwig" <hch@infradead.org> wrote:
> 
> >On Fri, Sep 09, 2011 at 06:23:54PM -0600, Joshua Aune wrote:
> >> Are there any mount options or other tests that can be run in the
> >>failing configuration that would be helpful to isolate this further?
> >
> >The best thing would be to bisect it down to at least a kernel release,
> >and if possible to a -rc or individual change (the latter might start
> >to get hard due to various instabilities in early -rc kernels)
> 
> 487f84f3 is where the regression was introduced.

The patch below, which is in the queue for Linux 3.2, should fix this
issue, and in fact improves behaviour even further.



[-- Attachment #2: xfs-dio-read-fix.diff --]
[-- Type: text/plain, Size: 2286 bytes --]

commit 37b652ec6445be99d0193047d1eda129a1a315d3
Author: Dave Chinner <dchinner@redhat.com>
Date:   Thu Aug 25 07:17:01 2011 +0000

    xfs: don't serialise direct IO reads on page cache checks
    
    There is no need to grab the i_mutex or the IO lock in exclusive
    mode if we don't need to invalidate the page cache. Taking these
    locks on every direct IO effectively serialises them, as taking the
    IO lock in exclusive mode has to wait for all shared holders to
    drop the lock. That only happens when IO is complete, so it
    effectively prevents dispatch of concurrent direct IO reads to the
    same inode.
    
    Fix this by taking the IO lock shared to check the page cache state,
    and only then drop it and take the IO lock exclusively if there is
    work to be done. Hence for the normal direct IO case, no exclusive
    locking will occur.
    
    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Tested-by: Joern Engel <joern@logfs.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Alex Elder <aelder@sgi.com>

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 7f7b424..8fd4a07 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -317,7 +317,19 @@ xfs_file_aio_read(
 	if (XFS_FORCED_SHUTDOWN(mp))
 		return -EIO;
 
-	if (unlikely(ioflags & IO_ISDIRECT)) {
+	/*
+	 * Locking is a bit tricky here. If we take an exclusive lock
+	 * for direct IO, we effectively serialise all new concurrent
+	 * read IO to this file and block it behind IO that is currently in
+	 * progress because IO in progress holds the IO lock shared. We only
+	 * need to hold the lock exclusive to blow away the page cache, so
+	 * only take lock exclusively if the page cache needs invalidation.
+	 * This allows the normal direct IO case of no page cache pages to
+	 * proceeed concurrently without serialisation.
+	 */
+	xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
+	if ((ioflags & IO_ISDIRECT) && inode->i_mapping->nrpages) {
+		xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED);
 		xfs_rw_ilock(ip, XFS_IOLOCK_EXCL);
 
 		if (inode->i_mapping->nrpages) {
@@ -330,8 +342,7 @@ xfs_file_aio_read(
 			}
 		}
 		xfs_rw_ilock_demote(ip, XFS_IOLOCK_EXCL);
-	} else
-		xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
+	}
 
 	trace_xfs_file_read(ip, size, iocb->ki_pos, ioflags);
 

