From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933344Ab0CKQ7G (ORCPT );
	Thu, 11 Mar 2010 11:59:06 -0500
Received: from moutng.kundenserver.de ([212.227.17.10]:63226 "EHLO
	moutng.kundenserver.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S933302Ab0CKQ7E (ORCPT );
	Thu, 11 Mar 2010 11:59:04 -0500
From: "Hans-Peter Jansen"
To: linux-kernel@vger.kernel.org
Subject: Re: howto combat highly pathologic latencies on a server?
Date: Thu, 11 Mar 2010 17:58:49 +0100
User-Agent: KMail/1.9.10
Cc: Dave Chinner
References: <201003101817.42812.hpj@urpla.net> <20100310232940.GB16344@discord.disaster>
In-Reply-To: <20100310232940.GB16344@discord.disaster>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <201003111758.50224.hpj@urpla.net>
X-Provags-ID: V01U2FsdGVkX1/4+7m7lGOLWmVxpcq7L7W82QKPDgqDeaAy17R
	HkOyYtkorPjwdbJtFK4Pu9UuKbQKUzGmDBy1czr5bXgv36GUNk bJo6vaNWDWpeGVHtkmmjw==
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thursday 11 March 2010, 00:29:40 Dave Chinner wrote:
> On Wed, Mar 10, 2010 at 06:17:42PM +0100, Hans-Peter Jansen wrote:
> >
> > The xfs filesystems are mounted with rw,noatime,attr2,nobarrier,noquota
> > (yes, I do have a BBU on the areca, and disk write cache is effectively
> > turned off).
>
> Make sure the filesystem has the "lazy-count=1" attribute set (use
> xfs_info to check, xfs_admin to change). That will remove the
> superblock from most transactions and significantly reduce the latency
> of transactions as they serialise while locking it...

Done that now on my local test system, but on one of its filesystems,
xfs_admin -c1 didn't succeed: it simply stopped, waiting on a futex.
Famous last syscall:

6750  futex(0x868330c8, FUTEX_WAIT_PRIVATE, 0, NULL

Consequently, xfs_repair behaved similarly, hanging in phase 6 at
"traversing filesystem ...". I have a huge strace of this run, if anyone
is interested. It's a 3 TB RAID 5 array (4 * 1 TB disks) carrying a
single filesystem, also driven by the areca:

meta-data=/dev/sdb1              isize=256    agcount=4, agsize=183105406 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=732421623, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Luckily, xfs_repair -P finally did succeed. Phuah..

This is with xfs_repair version 2.10.1.

After calling xfs_admin -c1, all filesystems showed differences in
superblock features (from an xfs_repair -n run). Is running xfs_repair
mandatory after this change, or does the next mount fix it up
automatically?

Thanks,
Pete
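
P.S.: For anyone following along, a minimal sketch of the check/change
sequence Dave describes, assuming the /dev/sdb1 device from the xfs_info
output above; the mount point is a placeholder, and xfs_admin has to run
against an unmounted filesystem:

  # check the current setting; lazy-count appears in the "log" section
  xfs_info /mount/point

  # unmount, then enable lazy superblock counters
  umount /mount/point
  xfs_admin -c1 /dev/sdb1

  # optional: no-modify consistency check before mounting again
  xfs_repair -n /dev/sdb1
  mount /dev/sdb1 /mount/point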