From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 11 Dec 2009 02:41:32 +0100
From: Asdo
Subject: Re: Disappointing performance of copy (MD raid + XFS)
In-reply-to: <4B207620.3060605@sandeen.net>
Message-id: <4B21A34C.9090100@shiftmail.org>
References: <4B204334.1000605@shiftmail.org> <4B207620.3060605@sandeen.net>
List-Id: XFS Filesystem from SGI
Cc: linux-raid, Kristleifur Daðason, Eric Sandeen, Gabor Gombas, xfs@oss.sgi.com

Eric Sandeen wrote:
Gabor Gombas wrote:
Kristleifur Daðason wrote:
[CUT]

Thank you guys for your help. I have done further investigation.

I still have not checked how performance is with very small files and multiple simultaneous rsyncs.

I have checked the other problem I mentioned: I couldn't go above 150 MB/sec even with large files and multiple simultaneous transfers. I can confirm it, and I have narrowed the problem down: two XFS defaults (optimizations) actually hurt performance.
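For reference, a minimal, self-contained sketch of the kind of parallel-copy test described above. The paths and the tiny file sizes are placeholders so the sketch runs anywhere, not the 16-disk array or 7 MB files from the real measurements:

```shell
#!/bin/sh
# Sketch of a simultaneous-copy throughput test (placeholder paths/sizes).
SRC=/tmp/xfs-src; DST=/tmp/xfs-dst
mkdir -p "$SRC" "$DST"

# Create a few source files (tiny here; the real test used 7 MB files).
for i in 1 2 3 4; do
    dd if=/dev/zero of="$SRC/file$i" bs=1M count=4 2>/dev/null
done

# Launch all copies in parallel and time the whole batch.
start=$(date +%s)
for i in 1 2 3 4; do
    cp "$SRC/file$i" "$DST/file$i" &
done
wait
end=$(date +%s)

echo "copied $(du -sk "$DST" | cut -f1) KiB in $((end - start))s"
```

In the real runs, `iostat -x 1` (for `await`) and `/sys/block/md*/md/stripe_cache_active` were watched while a loop like this was running.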
The first, and most important, is aligned writes. cat /proc/mounts lists this (autodetected) stripe geometry: "sunit=2048,swidth=28672". My chunk size is 1 MB and I have 16 disks in RAID-6, so 14 data disks. Do you think that's correct? xfs_info lists the block size as 4k, and there sunit and swidth are expressed in 4k blocks, so they have very different values. Please do not use the same names "sunit"/"swidth" to mean two different things in two different places; it can confuse the user (me!).

Anyway, that's not the problem: I have tried specifying other values in my mount (in particular, the values sunit and swidth would have had if they were in 4k blocks), but ANY aligned XFS mount kills performance for me. I have to specify "noalign" in my mount to go fast. (Also note this option cannot be changed with mount -o remount; I have to unmount.)

The other default that kills performance for me is the rotorstep. I have to max it out at 255 to get good performance. It actually seems reasonable that a higher rotorstep should be faster... why is 1 the default? Why does it even exist? With low values the await (iostat -x 1) increases, I guess because of the seeks, and stripe_cache_active stays higher, because fewer stripes are completely filled.

With noalign and rotorstep at 255 I can average 325 MB/sec (16 parallel transfers of 7 MB files), while with the defaults I get about 90 MB/sec.

Also, with noalign and rotorstep at 255, stripe_cache_active usually stays in the lower half (below 16000 out of 32000), while with the defaults it is stuck at the maximum most of the time, and processes end up sleeping in MD locks because of it.

Do you have any knowledge of the sunit/swidth alignment mechanism being broken in 2.6.31, or more specifically in the Ubuntu 2.6.31 generic-14 kernel?

(Kristleifur, thank you, I saw your mention of the Ubuntu vs. vanilla kernel; I will try a vanilla one, but right now I can't.
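To make the unit confusion above concrete, here is a quick sanity check of the geometry. The mount options express sunit/swidth in 512-byte sectors, while xfs_info reports them in 4 KiB filesystem blocks; the xfs_info numbers shown are the expected conversion, not values quoted from the original output:

```shell
#!/bin/sh
# Stripe geometry check: 1 MiB MD chunk, 16-disk RAID-6 => 14 data disks.
CHUNK_BYTES=$((1024 * 1024))   # MD chunk size
DATA_DISKS=$((16 - 2))         # RAID-6 keeps two disks' worth of parity

# Mount-option units: 512-byte sectors.
sunit_sectors=$((CHUNK_BYTES / 512))
swidth_sectors=$((DATA_DISKS * sunit_sectors))

# xfs_info units: 4 KiB filesystem blocks.
sunit_blocks=$((CHUNK_BYTES / 4096))
swidth_blocks=$((DATA_DISKS * sunit_blocks))

echo "mount units:    sunit=$sunit_sectors swidth=$swidth_sectors"
echo "xfs_info units: sunit=$sunit_blocks swidth=$swidth_blocks"
```

This prints sunit=2048 swidth=28672 in mount units, matching the /proc/mounts line above, so the autodetected geometry does look correct for 14 data disks.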
However, now that I have narrowed the problem down, the XFS people might want to look at the alignment issue more specifically.)

Regarding my previous post, I would still like to understand the stack traces I posted there: what are the functions xlog_state_get_iclog_space+0xed/0x2d0 [xfs] and xfs_buf_lock+0x1e/0x60 [xfs], and what are they waiting for? These are still the places where processes get stuck, even after working around the alignment/rotorstep problems.

And then a few questions on inode64:

- If I start using inode64, do I have to remember to use inode64 on every subsequent mount for the life of that filesystem? Or is it recorded somewhere in the filesystem that the option has been used once, so inode64 is applied automatically on subsequent mounts?

- If I use a 64-bit Linux distro, will ALL userland programs automatically support 64-bit inodes, or do I have to pay attention continuously and risk damaging my data?

Thanks for your help

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs