From mboxrd@z Thu Jan 1 00:00:00 1970
From: tj
Subject: Re: Raid-5 long write wait while reading
Date: Sun, 03 Jun 2007 02:14:27 +0200
Message-ID: <466207E3.60307@jager.no>
References: <4653306B.4090500@jager.no> <4658CB8A.8080503@jager.no> <465AFCC6.8040006@tmr.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <465AFCC6.8040006@tmr.com>
Sender: linux-raid-owner@vger.kernel.org
To: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

Bill Davidsen wrote:
> tj wrote:
>> Thomas Jager wrote:
>>> Hi list.
>>>
>>> I run a file server on MD raid-5.
>>> If a client reads one big file and at the same time another client
>>> tries to write a file, the thread writing just sits in
>>> uninterruptible sleep until the reader has finished. Only a very
>>> small amount of writes gets through while the reader is still working.
>>> I'm having some trouble pinpointing the problem.
>>> It's not consistent either; sometimes it works as expected and both
>>> the reader and writer get some transactions. On huge reads I've seen
>>> the writer blocked for 30-40 minutes without any significant writes
>>> happening (maybe a few megabytes, of several gigs waiting). It
>>> happens with NFS, SMB and FTP, and locally with dd, and seems to be
>>> connected to raid-5. This does not happen on block devices without
>>> raid-5. I'm also wondering if it can have anything to do with
>>> loop-aes? I use loop-aes on top of the md, but then again I have not
>>> observed this problem on loop devices with a disk backend. I do know
>>> that loop-aes degrades performance, but I didn't think it would do
>>> something like this?
>>>
>>> I've seen this problem in 2.6.16-2.6.21.
>>>
>>> All disks in the array are connected to a controller with a SiI 3114
>>> chip.
>>
>> I just noticed something else. A couple of slow readers were running
>> on my raid-5 array. Then I started a copy from another local disk to
>> the array. Then I got the extremely long wait. I noticed something in
>> iostat:
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            3.90    0.00   48.05   31.93    0.00   16.12
>>
>> Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
>> ....
>> sdg               0.80        25.55         0.00        128          0
>> sdh             154.89       632.34         0.00       3168          0
>> sdi               0.20        12.77         0.00         64          0
>> sdj               0.40        25.55         0.00        128          0
>> sdk               0.40        25.55         0.00        128          0
>> sdl               0.80        25.55         0.00        128          0
>> sdm               0.80        25.55         0.00        128          0
>> sdn               0.60        23.95         0.00        120          0
>> md0             199.20       796.81         0.00       3992          0
>>
>> All disks are members of the same raid array (md0). One of the disks
>> has a ton of transactions compared to the other disks. Read
>> operations as far as I can tell. Why? Could it be connected with my
>> problem?
> Two thoughts on that: if you are doing a lot of directory operations,
> it's possible that the inodes being used most are all in one chunk.

Hi, thanks for the reply. It's not directory operations AFAIK. I was
reading a few files (3 in this case) and writing one.

>
> The other possibility is that these are journal writes and reflect
> updates to the atime. The way to see if this is in some way related
> is to mount (remount) with noatime: "mount -o remount,noatime /dev/md0
> /wherever" and retest. If this is journal activity you can do several
> things to reduce the problem, which I'll go into (a) if it seems to be
> the problem, and (b) if someone else doesn't point you to an existing
> document or old post on the topic. Oh, you could also try mounting the
> filesystem as ext2, assuming that it's ext3 now. I wouldn't run that
> way, but it's useful as a diagnostic tool.

I don't use ext3, I use ReiserFS. (It seemed like a good idea at the
time.) It's mounted with -o noatime.

I've done some more testing and it seems like it might be connected to
mount --bind. If I write to a bind mount I get the slow writes, but if I
write directly to the real mount I don't. It might just be a random
occurrence, as the problem has always been inconsistent.

Thoughts?
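
On the iostat numbers quoted above: one way to put a figure on how lopsided the member disks are is to compare each disk's tps against the median and to work out what share of the reads a single disk is serving. The following is only a sketch in Python. The rows are copied from the iostat output above; the function names and the 10x threshold are assumptions made for this illustration and are not part of iostat or md.

```python
from statistics import median

# Member-disk rows copied from the quoted iostat output:
# (device, tps, kB_read/s). md0 itself is left out on purpose.
ROWS = [
    ("sdg", 0.80, 25.55),
    ("sdh", 154.89, 632.34),
    ("sdi", 0.20, 12.77),
    ("sdj", 0.40, 25.55),
    ("sdk", 0.40, 25.55),
    ("sdl", 0.80, 25.55),
    ("sdm", 0.80, 25.55),
    ("sdn", 0.60, 23.95),
]


def busy_members(rows, factor=10.0):
    """Return devices whose tps is more than `factor` times the median tps.

    `factor` is an arbitrary illustrative threshold, not a recommended value.
    """
    med = median(tps for _, tps, _ in rows)
    return [dev for dev, tps, _ in rows if tps > factor * med]


def read_share(rows, device):
    """Return the fraction of the summed kB_read/s served by `device`."""
    total = sum(rate for _, _, rate in rows)
    own = sum(rate for dev, _, rate in rows if dev == device)
    return own / total


def total_read_rate(rows):
    """Return the summed kB_read/s over all member disks."""
    return sum(rate for _, _, rate in rows)
```

With the figures above, busy_members(ROWS) picks out only sdh, the per-disk read rates add up to the 796.81 kB/s reported for md0, and sdh alone serves roughly 79% of that.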
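
Since the problem is inconsistent, a single run cannot tell a bind-mount effect from chance. One way to test it would be to alternate timed writes between the real mount point and the bind mount over several rounds while the readers are running. A minimal sketch, assuming Python is available on the server; the scratch file name, the sizes and the function names are hypothetical and chosen only for this example.

```python
import os
import time


def timed_write(directory, size_mb=64, chunk_mb=1):
    """Write `size_mb` MB of zeros to a scratch file in `directory`.

    The file is fsynced before the clock stops, so the figure reflects
    data reaching the device rather than the page cache. Returns MB/s.
    """
    path = os.path.join(directory, "md-write-test.tmp")  # hypothetical name
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    start = time.monotonic()
    with open(path, "wb") as f:
        for _ in range(size_mb // chunk_mb):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.monotonic() - start
    os.remove(path)
    return size_mb / elapsed if elapsed > 0 else float("inf")


def compare(real_dir, bind_dir, rounds=5, size_mb=64):
    """Alternate writes between the two directories.

    Returns a list of (round, label, MB/s) tuples so that the two paths
    are measured under the same read load, round by round.
    """
    results = []
    for round_no in range(1, rounds + 1):
        for label, directory in (("real", real_dir), ("bind", bind_dir)):
            results.append((round_no, label, timed_write(directory, size_mb)))
    return results
```

Called as, for example, compare("/mnt/array", "/mnt/array-bind") (both paths are placeholders), this gives paired numbers per round. If the "bind" rates are consistently far below the "real" ones across rounds, the bind mount is worth a closer look; if the slow rounds land on both, it is more likely the random behaviour seen before.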