From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from cuda.sgi.com (cuda1.sgi.com [192.48.157.11])
	by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id q9AETj8K054451
	for ; Wed, 10 Oct 2012 09:29:45 -0500
Received: from mail.sandeen.net (sandeen.net [63.231.237.45])
	by cuda.sgi.com with ESMTP id 4CQ1DJeqB7Jve73e
	for ; Wed, 10 Oct 2012 07:31:16 -0700 (PDT)
Message-ID: <507586B4.6010201@sandeen.net>
Date: Wed, 10 Oct 2012 09:31:16 -0500
From: Eric Sandeen
MIME-Version: 1.0
Subject: Re: Performance degradation over time
References: <20121010105142.148519ca@booking.com> <50757583.9000901@hardwarefreak.com>
In-Reply-To: <50757583.9000901@hardwarefreak.com>
List-Id: XFS Filesystem from SGI
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: xfs-bounces@oss.sgi.com
Errors-To: xfs-bounces@oss.sgi.com
To: stan@hardwarefreak.com
Cc: xfs@oss.sgi.com

On 10/10/12 8:17 AM, Stan Hoeppner wrote:
> On 10/10/2012 3:51 AM, Marcin Deranek wrote:
>> Hi,
>>
>> We are running XFS filesystem on one of our machines which is a big
>> store (~3TB) of different data files (mostly images). Quite recently we
>> experienced some performance problems - machine wasn't able to keep up
>> with updates. After some investigation it turned out that open()
>> syscalls (open for writing) were taking significantly more time than
>> they should eg. 15-20ms vs 100-150us.
>> Some more info about our workload as I think it's important here:
>> our XFS filesystem is exclusively used as data store, so we only
>> read and write our data (we mostly write). When new update comes it's
>> written to a temporary file eg.
>>
>> /mountpoint/some/path/.tmp/file
>>
>> When file is completely stored we move it to final location eg.
>>
>> /mountpoint/some/path/different/subdir/newname
>>
>> That means that we create lots of files in /mountpoint/some/path/.tmp
>> directory, but directory is empty as they are moved (rename() syscall)
>> shortly after file creation to a different directory on the same
>> filesystem.
>> The workaround which I found so far is to remove that directory
>> (/mountpoint/some/path/.tmp in our case) with its content and re-create
>> it. After this operation open() syscall goes down to 100-150us again.
>> Is this a known problem?
>> Information regarding our system:
>> CentOS 5.8 / kernel 2.6.18-308.el5 / kmod-xfs-0.4-2
>> Let me know if you need to know anything more.
>
> Hi Marcin,
>
> I'll begin where you ended: kmod-xfs. DO NOT USE THAT. Use the kernel
> driver. Eric Sandeen can point you to the why. AIUI that XFS module
> hasn't been supported for many many years.

Yep. Ditch that; it overrides the maintained module that comes with the
kernel itself. See if that helps, first, I suppose.

I've been asking CentOS for a while to find some way to deprecate that,
but it's like Night of the Living Dead xfs modules.

(modinfo xfs will tell you for sure which xfs.ko is getting loaded, I
suppose.)

> Regarding your problem, I can't state some of the following with
> authority, though it might read that way. I'm making an educated guess
> based on what I do know of XFS and the behavior you're seeing. Dave
> will clobber and correct me if I'm wrong here. ;)
>
> XFS filesystems are divided into multiple equal sized allocation groups
> on the underlying storage device (single disk, RAID, LVM volume, etc).
> With inode32 each directory that is created has its files stored in only
> one AG, with some exceptions, which you appear to be bumping up against.
> If you're using inode64 the directories, along with their files, go into
> the AGs round robin.

Agreed that it would be good to know whether inode64 is in use. Let's
start there (and with a modern xfs.ko) before we speculate further.
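
One way to answer the inode64 question is to look at the mount options
the kernel reports for the filesystem. The following is a minimal
sketch, not something from the thread: the helper names and the sample
device are made up for illustration, and it assumes that the running
kernel lists inode64 among the options in /proc/mounts when it is in
effect, which may not hold for every XFS version.

```python
def xfs_mount_options(mounts_text, mountpoint):
    """Return the set of mount options for the XFS filesystem mounted at
    `mountpoint`, or None if no such XFS mount is listed.

    `mounts_text` is text in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mnt, fstype, options = fields[:4]
        if mnt == mountpoint and fstype == "xfs":
            return set(options.split(","))
    return None


def uses_inode64(mounts_text, mountpoint):
    """True only if the mount is XFS and 'inode64' appears in its options.

    Assumption: the kernel shows 'inode64' in the option list when it is
    active; absence from the list is a hint, not proof, of inode32.
    """
    opts = xfs_mount_options(mounts_text, mountpoint)
    return opts is not None and "inode64" in opts


# Hypothetical sample in /proc/mounts format (device name is invented).
SAMPLE = (
    "/dev/sda1 / ext3 rw 0 0\n"
    "/dev/sdb1 /mountpoint xfs rw,noatime,inode64 0 0\n"
)
print(uses_inode64(SAMPLE, "/mountpoint"))
```

On a live system the first argument would be the contents of
/proc/mounts, and the second the real mount point in place of the
/mountpoint placeholder used in the original report.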
> Educated guessing: When you use rename(2) to move the files, the file
> contents are not being moved, only the directory entry, as with EXTx
> etc. Thus the file data is still in the ".tmp" directory AG, but that
> AG is no longer its home. Once this temp dir AG gets full of these
> "phantom" file contents (you can only see them with XFS tools), the AG
> spills over. At that point XFS starts moving the phantom contents of
> the rename(2) files into the AG which owns the directory of the
> rename(2) target. I believe this is the source of your additional
> latency. Each time you do an open(2) call to write a new file, XFS is
> moving a file's contents (extents) to its new/correct parent AG, causing
> much additional IO, especially if these are large files.

Nope, don't think so ;) Nothing is going to be moving file contents
behind your back on a rename.

-Eric

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs