* Array 'freezes' for some time after large writes?
From: Jim Duchek @ 2010-03-30 17:07 UTC
To: linux-raid

Hi all. Regularly after a large write to the disk (untarring a very
large file, etc.), my RAID5 will 'freeze' for a period of time --
perhaps around a minute. My system is completely responsive otherwise
during this time, with the exception of anything that is attempting to
read or write from the array -- it's as if any file descriptors simply
block. Nothing disk/RAID-related is written to the logs during this
time. The array is mounted as /home -- so an awful lot of things
completely freeze during this time (web browser, any video that is
running, etc.). The disks don't seem to be actually accessed during
this time (I can't hear them, and the disk access light stays off),
and it's not as if it's just reading slowly -- it's not reading at
all. Array performance is completely normal before and after the
freeze and simply non-existent during it. The root disk (which is on
a separate disk entirely from the RAID) runs fine during this time, as
does everything else (network, video card, etc. -- as long as it
doesn't touch the array) -- for example, an open terminal window is
still responsive during the freeze, and 'ls /' works fine while
'ls /home' blocks until the 'freeze' is over.

Some more detailed information on my setup is below. It's pretty
vanilla. Unfortunately this started around the time four things
happened -- a kernel upgrade to 2.6.32, upgrading my filesystems to
ext4, replacing a disk gone bad in the RAID, and a video card change.
I would assume one of these is the culprit, but you know what they say
about 'assume'. I cannot reproduce the problem reliably, but it
happens a couple of times a day. My questions are these:

1. Is there any way to turn on more detailed logging for the RAID
system in the kernel? Neither the wiki nor a Google search turns up
anything I can find, and mdadm doesn't put anything out during this
time.
2. Could this be a problem with the SATA system? My root drive is
PATA -- my RAID disks are all SATA.
3. Uh, any other ideas? :)

Thanks, all.

Jim Duchek

[jrduchek@jimbob ~]$ uname -a
Linux jimbob 2.6.32-ARCH #1 SMP PREEMPT Mon Mar 15 20:44:03 CET 2010
x86_64 Intel(R) Core(TM)2 Quad CPU Q8400 @ 2.66GHz GenuineIntel
GNU/Linux

[jrduchek@jimbob ~]$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdb1[0] sde1[3] sdd1[2] sdc1[1]
      1465151808 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

unused devices: <none>

[jrduchek@jimbob ~]$ mount
/dev/sda3 on / type ext4 (rw,noatime,user_xattr)
udev on /dev type tmpfs (rw,nosuid,relatime,size=10240k,mode=755)
none on /proc type proc (rw,relatime)
none on /sys type sysfs (rw,relatime)
none on /dev/pts type devpts (rw)
none on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext2 (rw)
/dev/md0 on /home type ext4 (rw,noatime,user_xattr)

[jrduchek@jimbob ~]$ more /etc/rc.local
#!/bin/bash
#
# /etc/rc.local: Local multi-user startup script.
#

echo 8192 > /sys/block/md0/md/stripe_cache_size
blockdev --setra 32768 /dev/md0
blockdev --setfra 32768 /dev/md0

dmesg (relevant):

ata3: SATA max UDMA/133 cmd 0xc400 ctl 0xc080 bmdma 0xb880 irq 19
ata4: SATA max UDMA/133 cmd 0xc000 ctl 0xbc00 bmdma 0xb888 irq 19
ata3.00: ATA-7: WDC WD5000AAJS-22TKA0, 12.01C01, max UDMA/133
ata3.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata3.01: ATA-8: WDC WD5002ABYS-02B1B0, 02.03B03, max UDMA/133
ata3.01: 976773168 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata3.00: configured for UDMA/133
ata3.01: configured for UDMA/133
ata4.00: ATA-7: WDC WD5000AAJS-22TKA0, 12.01C01, max UDMA/133
ata4.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata4.01: ATA-7: WDC WD5000AAJS-22TKA0, 12.01C01, max UDMA/133
ata4.01: 976773168 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata4.00: configured for UDMA/133
ata4.01: configured for UDMA/133
ata1.00: ATA-7: MAXTOR STM3160815A, 3.AAD, max UDMA/100
ata1.00: 312581808 sectors, multi 16: LBA48
ata1.01: ATAPI: LITE-ON DVDRW LH-20A1P, KL0G, max UDMA/66
ata1.00: configured for UDMA/100
ata1.01: configured for UDMA/66
scsi 0:0:0:0: Direct-Access ATA MAXTOR STM316081 3.AA PQ: 0 ANSI: 5
scsi 0:0:1:0: CD-ROM LITE-ON DVDRW LH-20A1P KL0G PQ: 0 ANSI: 5
scsi 2:0:0:0: Direct-Access ATA WDC WD5000AAJS-2 12.0 PQ: 0 ANSI: 5
scsi 2:0:1:0: Direct-Access ATA WDC WD5002ABYS-0 02.0 PQ: 0 ANSI: 5
scsi 3:0:0:0: Direct-Access ATA WDC WD5000AAJS-2 12.0 PQ: 0 ANSI: 5
scsi 3:0:1:0: Direct-Access ATA WDC WD5000AAJS-2 12.0 PQ: 0 ANSI: 5
sd 2:0:0:0: [sdb] 976773168 512-byte logical blocks: (500 GB/465 GiB)
sd 2:0:1:0: [sdc] 976773168 512-byte logical blocks: (500 GB/465 GiB)
sd 0:0:0:0: [sda] 312581808 512-byte logical blocks: (160 GB/149 GiB)
sd 3:0:0:0: [sdd] 976773168 512-byte logical blocks: (500 GB/465 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 3:0:0:0: [sdd] Write Protect is off
sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 2:0:0:0: [sdb] Write Protect is off
sd 2:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdd:
sda:
sdb:
sd 2:0:1:0: [sdc] Write Protect is off
sd 2:0:1:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:1:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdc: sdb1
sdd1
sd 3:0:0:0: [sdd] Attached SCSI disk
sd 3:0:1:0: [sde] 976773168 512-byte logical blocks: (500 GB/465 GiB)
sd 3:0:1:0: [sde] Write Protect is off
sd 3:0:1:0: [sde] Mode Sense: 00 3a 00 00
sd 3:0:1:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sde: sde1
sd 3:0:1:0: [sde] Attached SCSI disk
sda1 sda2 sda3
sdc1
sd 0:0:0:0: [sda] Attached SCSI disk
sd 2:0:0:0: [sdb] Attached SCSI disk
sd 2:0:1:0: [sdc] Attached SCSI disk
md: md0 stopped.
md: bind<sdc1>
md: bind<sdd1>
md: bind<sde1>
md: bind<sdb1>
async_tx: api initialized (async)
xor: automatically using best checksumming function: generic_sse
generic_sse: 7597.200 MB/sec
xor: using function: generic_sse (7597.200 MB/sec)
raid6: int64x1   1567 MB/s
raid6: int64x2   1994 MB/s
raid6: int64x4   1582 MB/s
raid6: int64x8   1427 MB/s
raid6: sse2x1    3698 MB/s
raid6: sse2x2    4184 MB/s
raid6: sse2x4    5888 MB/s
raid6: using algorithm sse2x4 (5888 MB/s)
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
raid5: device sdb1 operational as raid disk 0
raid5: device sde1 operational as raid disk 3
raid5: device sdd1 operational as raid disk 2
raid5: device sdc1 operational as raid disk 1
raid5: allocated 4272kB for md0
0: w=1 pa=0 pr=4 m=1 a=2 r=4 op1=0 op2=0
3: w=2 pa=0 pr=4 m=1 a=2 r=4 op1=0 op2=0
2: w=3 pa=0 pr=4 m=1 a=2 r=4 op1=0 op2=0
1: w=4 pa=0 pr=4 m=1 a=2 r=4 op1=0 op2=0
raid5: raid level 5 set md0 active with 4 out of 4 devices, algorithm 2
RAID5 conf printout:
 --- rd:4 wd:4
 disk 0, o:1, dev:sdb1
 disk 1, o:1, dev:sdc1
 disk 2, o:1, dev:sdd1
 disk 3, o:1, dev:sde1
md0: detected capacity change from 0 to 1500315451392
md0: unknown partition table
EXT4-fs (md0): mounted filesystem with ordered data mode

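Until a better answer to question 1 turns up, one low-impact way to get
visibility is to sample the kernel's writeback counters around a freeze.
A minimal sketch, assuming only standard procfs paths; the 1-second
interval and log location are arbitrary choices:

#!/bin/bash
# Sample dirty/writeback page counts and md0 state once a second, so a
# freeze can be lined up against dirty-page buildup after the fact.
while true; do
    date '+%T'
    grep -E 'Dirty|Writeback' /proc/meminfo
    grep md0 /proc/mdstat
    sleep 1
done >> /tmp/writeback.log
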
* Re: Array 'freezes' for some time after large writes?
From: Mark Knecht @ 2010-03-30 17:18 UTC
To: Jim Duchek; +Cc: linux-raid

On Tue, Mar 30, 2010 at 10:07 AM, Jim Duchek <jim.duchek@gmail.com> wrote:
> Hi all. Regularly after a large write to the disk (untarring a very
> large file, etc.), my RAID5 will 'freeze' for a period of time --
> perhaps around a minute.
<SNIP>

I'm seeing a lot of this on a new Intel-based system. I've never run
into it before.

In my case I can see the delays while looking at top. They correspond
to 100%wa, as shown here:

top - 02:27:17 up 28 min, 2 users, load average: 2.76, 1.95, 1.30
Tasks: 125 total, 1 running, 124 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id,  0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,  0.0%id,100.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id,  0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id,  0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id,  0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.3%sy, 0.0%ni,  0.0%id, 99.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id,  0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id,  0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem:  6107024k total, 1448676k used, 4658348k free,  187492k buffers
Swap: 4200988k total,       0k used, 4200988k free,  915900k cached

Like you, nothing seems to get written anywhere when this is happening,
and in my case it happens whether I'm using RAID1 or not.

From the command line, if I do the following and wait for one of these
100%wa events to occur

echo "1" > /proc/sys/vm/block_dump
... wait a short while ...
echo "0" > /proc/sys/vm/block_dump

then grepping dmesg with this command

dmesg | egrep "READ|WRITE|dirtied"

shows the following:

flush-8:0(3365): WRITE block 33555792 on sda3
flush-8:0(3365): WRITE block 33555800 on sda3
flush-8:0(3365): WRITE block 33701984 on sda3
flush-8:0(3365): WRITE block 33720128 on sda3
flush-8:0(3365): WRITE block 33721496 on sda3
flush-8:0(3365): WRITE block 33816576 on sda3

so something ugly is going on. I have no idea what causes these blocked
writes, but they are really messing me up. Sometimes these events last
for minutes. I've not yet discovered whether it's specific to my
drives, my motherboard, the kernel, or what.

- Mark

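Mark's block_dump steps fold naturally into a single capture script so
a freeze window isn't missed. A sketch assembled from the commands
above; the 60-second window is only a guess at the freeze length, and
it must run as root:

#!/bin/bash
# Capture block-layer activity across one suspected freeze window.
echo 1 > /proc/sys/vm/block_dump
sleep 60                            # roughly one freeze; adjust to taste
echo 0 > /proc/sys/vm/block_dump
dmesg | egrep "READ|WRITE|dirtied" > /tmp/block_dump.log
wc -l /tmp/block_dump.log
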
* Re: Array 'freezes' for some time after large writes?
From: Jim Duchek @ 2010-03-30 17:47 UTC
To: Mark Knecht; +Cc: linux-raid

Well, it appears that I can absolutely reproduce it 100% of the time by
copying a large (>1 gig) video file and then immediately playing it.
It seems to hit the freeze on just about the same frame every time,
and playing it seems to be necessary (it doesn't freeze if I just do
the copy and go about my business). Possibly an issue with disk
buffers? You're having this happen even if the disk in question is not
in an array? If so, perhaps it's a SATA issue and not a RAID one, and
we should move this discussion accordingly. I reproduced your steps
and I'm seeing pretty much the same thing, though not quite hitting
100% wa (I'm guessing it would if I shut everything else down -- I've
got a full desktop running).

I'm using a Biostar T41-A7 mobo, an Intel Core 2 Quad Q8400 Yorkfield
2.66GHz 4MB L2, and 3 Western Digital Caviar Blue WD5000AAJS drives
plus 1 WD5002ABYS. I note that the 3 older drives claim ATA-7 and the
newer one says ATA-8. Any similarities?

Jim

On 30 March 2010 12:18, Mark Knecht <markknecht@gmail.com> wrote:
> I'm seeing a lot of this on a new Intel-based system. I've never run
> into it before.
>
> In my case I can see the delays while looking at top. They correspond
> to 100%wa.
<SNIP>

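Jim's copy-then-play reproduction can be approximated without a video
player: copy a large file onto the array, then immediately stream the
copy back. A sketch; the paths are placeholders, and a sequential dd
read is only a rough stand-in for a player's read pattern:

#!/bin/bash
# Large write to the array followed by an immediate sequential read.
SRC=/home/media/big-video.mkv         # placeholder: any file >1 GiB
DST=/home/media/big-video-copy.mkv    # placeholder: destination on the array
cp "$SRC" "$DST"
dd if="$DST" of=/dev/null bs=1M       # read it straight back, like the player
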
* Re: Array 'freezes' for some time after large writes?
From: Mark Knecht @ 2010-03-30 18:00 UTC
To: Jim Duchek; +Cc: linux-raid

On Tue, Mar 30, 2010 at 10:47 AM, Jim Duchek <jim.duchek@gmail.com> wrote:
> Well, it appears that I can absolutely reproduce it 100% of the time by
> copying a large (>1 gig) video file and then immediately playing it.
<SNIP>

Intel DX58SO motherboard
6GB
i7 920
3 WD10EARS green 1TB (4K/sector)
nvidia 9500GT
2.6.32-gentoo kernel

In my setup (and being Gentoo-based, where we build everything from
scratch) I see this mostly when building code. I haven't copied any
large files yet, but I'm not surprised at your report, as my dmesg
error is about writes being blocked, so that makes sense.

The closest similarity would be that both motherboards are based on
Intel chipsets. SATA support is in the south bridge.

Your motherboard chipset:
North Bridge  Intel G41
South Bridge  Intel ICH7

My motherboard chipset:
North Bridge  Intel X58
South Bridge  Intel ICH10R

You did see the same message about writes being blocked, correct?
Nothing else as of yet?

- Mark

* Re: Array 'freezes' for some time after large writes?
From: Mark Knecht @ 2010-03-30 18:05 UTC
To: Jim Duchek; +Cc: linux-raid

On Tue, Mar 30, 2010 at 10:47 AM, Jim Duchek <jim.duchek@gmail.com> wrote:
<SNIP>
> You're having this happen even if the disk in question is not in an
> array? If so, perhaps it's a SATA issue and not a RAID one, and we
> should move this discussion accordingly.

Yes. In my case the delays are so long - sometimes 2 or 3 minutes -
that when I tried to build the system using RAID1 I got this kernel
bug in dmesg. It's just info - not a real failure - but because it's
talking about long delays I gave up on RAID and tried a standard
single-drive build. It turns out that it has (I think...) nothing to
do with RAID at all. You'll note that there are instructions for
turning the message off, but I've not tried them. I intend to do a
parallel RAID1 build on this machine so I can test RAID vs. non-RAID.

- Mark

INFO: task kjournald:17466 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kjournald     D ffff8800280bbe00     0 17466      2 0x00000000
 ffff8801adf9d890 0000000000000046 0000000000000000 0000000000000000
 ffff8801adcbde44 0000000000004000 000000000000fe00 000000000000c878
 0000000800000050 ffff88017a99aa40 ffff8801af90a150 ffff8801adf9db08
Call Trace:
 [<ffffffff812dd063>] ? md_make_request+0xb6/0xf1
 [<ffffffff8109c248>] ? sync_buffer+0x0/0x40
 [<ffffffff8137a4fc>] ? io_schedule+0x2d/0x3a
 [<ffffffff8109c283>] ? sync_buffer+0x3b/0x40
 [<ffffffff8137a879>] ? __wait_on_bit+0x41/0x70
 [<ffffffff8109c248>] ? sync_buffer+0x0/0x40
 [<ffffffff8137a913>] ? out_of_line_wait_on_bit+0x6b/0x77
 [<ffffffff810438b2>] ? wake_bit_function+0x0/0x23
 [<ffffffff8109c637>] ? sync_dirty_buffer+0x72/0xaa
 [<ffffffff81131b8e>] ? journal_commit_transaction+0xa74/0xde2
 [<ffffffff8103abcc>] ? lock_timer_base+0x26/0x4b
 [<ffffffff81043884>] ? autoremove_wake_function+0x0/0x2e
 [<ffffffff81134804>] ? kjournald+0xe3/0x206
 [<ffffffff81043884>] ? autoremove_wake_function+0x0/0x2e
 [<ffffffff81134721>] ? kjournald+0x0/0x206
 [<ffffffff81043591>] ? kthread+0x8b/0x93
 [<ffffffff8100bd3a>] ? child_rip+0xa/0x20
 [<ffffffff81043506>] ? kthread+0x0/0x93
 [<ffffffff8100bd30>] ? child_rip+0x0/0x20
livecd ~ #

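The 120-second threshold in that message is tunable, so the hung-task
detector can be made to fire on shorter stalls like the roughly
one-minute freezes reported here. A sketch using the sysctl the message
itself names; 30 seconds is an arbitrary choice, and this must run as
root:

# Read the current hung-task threshold, then lower it.
sysctl kernel.hung_task_timeout_secs
echo 30 > /proc/sys/kernel/hung_task_timeout_secs
# or equivalently:
sysctl -w kernel.hung_task_timeout_secs=30
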
* Re: Array 'freezes' for some time after large writes?
From: Jim Duchek @ 2010-03-30 20:32 UTC
To: Mark Knecht; +Cc: linux-raid

Hrm, I've never seen that kernel message. I don't think any of my
freezes have lasted as long as 120 seconds, though (my drives are half
as big -- might that matter?). It looks like we've both got WD drives
-- and we both have nvidia 9500GTs as well. Are you running the nvidia
binary drivers, or nouveau? (It seems like it shouldn't matter,
especially as, at least on my system, they don't share an interrupt or
anything, but I hate to ignore any hardware that we both have the same
of.) I did move to 2.6.33 for some time, but that didn't change the
behaviour.

Jim

On 30 March 2010 13:05, Mark Knecht <markknecht@gmail.com> wrote:
> Yes. In my case the delays are so long - sometimes 2 or 3 minutes -
> that when I tried to build the system using RAID1 I got this kernel
> bug in dmesg.
<SNIP>

* Re: Array 'freezes' for some time after large writes?
From: Mark Knecht @ 2010-03-30 20:45 UTC
To: Jim Duchek; +Cc: linux-raid

Hi,
I am running the nvidia binary drivers. I'm not doing anything with
X at this point, so I can just unload them, I think. I could even
remove the card, I suppose.

I built a machine for my dad a couple of months ago that uses the
same 1TB WD drive that I am using now. I don't remember seeing
anything like this on his machine, but I'm going to go check that.

One other similarity I suspect we have is ext3? There were problems
with ext3 priority inversion in earlier kernels. It's my understanding
that they thought they had that worked out, but possibly we're
triggering this somehow? Since I've got a lot of disk space I can set
up some other partitions -- ext4, reiser4, etc. -- and try copying
files to trigger it. However, it's difficult for me if it requires
read/write, as I'm not set up to really use the machine yet. Is that
something you have room to try?

Also, we haven't discussed what drivers are loaded or the kernel
config. Here's my current driver set:

keeper ~ # lsmod
Module                  Size  Used by
ipv6                  207757  30
usbhid                 21529  0
nvidia              10611606  22
snd_hda_codec_realtek 239530  1
snd_hda_intel          17688  0
ehci_hcd               30854  0
snd_hda_codec          45755  2 snd_hda_codec_realtek,snd_hda_intel
snd_pcm                58104  2 snd_hda_intel,snd_hda_codec
snd_timer              15030  1 snd_pcm
snd                    37476  5 snd_hda_codec_realtek,snd_hda_intel,snd_hda_codec,snd_pcm,snd_timer
soundcore                800  1 snd
snd_page_alloc          5809  2 snd_hda_intel,snd_pcm
rtc_cmos                7678  0
rtc_core               11093  1 rtc_cmos
sg                     23029  0
uhci_hcd               18047  0
usbcore               115023  4 usbhid,ehci_hcd,uhci_hcd
agpgart                24341  1 nvidia
processor              23121  0
e1000e                111701  0
firewire_ohci          20022  0
rtc_lib                 1617  1 rtc_core
firewire_core          36109  1 firewire_ohci
thermal                11650  0
keeper ~ #

- Mark

On Tue, Mar 30, 2010 at 1:32 PM, Jim Duchek <jim.duchek@gmail.com> wrote:
> Hrm, I've never seen that kernel message. I don't think any of my
> freezes have lasted as long as 120 seconds, though (my drives are half
> as big -- might that matter?).
<SNIP>

* Re: Array 'freezes' for some time after large writes?
From: Jim Duchek @ 2010-03-30 20:59 UTC
To: Mark Knecht; +Cc: linux-raid

I'm using ext4 on everything, but it's hard to judge which ext3 bugs
might affect ext4 as well. I really don't have the ability to
destructively test the array; I need all the data that's on it and I
don't have enough spare space elsewhere to back it all up. You might
see if you can trigger it with dd, writing to the drive directly with
no filesystem?

Jim

On 30 March 2010 14:45, Mark Knecht <markknecht@gmail.com> wrote:
> Hi,
> I am running the nvidia binary drivers. I'm not doing anything with
> X at this point, so I can just unload them, I think.
<SNIP>

* Re: Array 'freezes' for some time after large writes?
From: Mark Knecht @ 2010-03-30 22:21 UTC
To: Jim Duchek; +Cc: linux-raid

I just finished a long compile on my dad's i5-661/DH55HC machine, which
uses this same WD drive, and I didn't spot any sign of this happening
there. That's a very recent Intel chipset also, and probably more or
less the same SATA controller.

I'm going to turn on the kernel-message-into-dmesg thing for a while
and see if anything pops up.

I can set up some additional partitions on my local drive to test
other file systems, but since you're ext4 and I'm ext3, it's not that
unless the problem moved forward with the code over time.

I like the idea of using dd, but I want to be careful about that sort
of thing. I've not used dd before, but if I could tell it to write a
gigabyte without messing up existing stuff then that could be helpful.

Back later,
Mark

On Tue, Mar 30, 2010 at 1:59 PM, Jim Duchek <jim.duchek@gmail.com> wrote:
> I'm using ext4 on everything, but it's hard to judge which ext3 bugs
> might affect ext4 as well.
<SNIP>

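On the dd caution: only writing to a raw device (of=/dev/sdX) clobbers
data. Pointing dd at an ordinary file on the filesystem is
non-destructive, and conv=fdatasync forces the flush that seems to be
the trigger. A sketch, with an arbitrary 1 GiB size and a placeholder
path:

# Non-destructive large-write test: write 1 GiB to a scratch file,
# force it out to disk, then clean up. Reading a raw device with
# if=/dev/sdX is also safe; only writing to one is destructive.
dd if=/dev/zero of=/home/ddtest.bin bs=1M count=1024 conv=fdatasync
rm /home/ddtest.bin
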
* Re: Array 'freezes' for some time after large writes?
From: Mark Knecht @ 2010-03-30 23:50 UTC
To: Jim Duchek; +Cc: linux-raid

On Tue, Mar 30, 2010 at 3:21 PM, Mark Knecht <markknecht@gmail.com> wrote:
> I just finished a long compile on my dad's i5-661/DH55HC machine, which
> uses this same WD drive, and I didn't spot any sign of this happening
> there.
<SNIP>

I know this isn't going to survive email very well, but you might want
to look at interrupts. I'm seeing the count on CPU #5 rising much more
quickly than on the other CPUs, and in my case it's generally CPU #5
that stalls out with this 100% wait problem.

I'm looking at another 4-processor machine that's been up for a few
days. Its interrupt counts are fairly balanced, except for TLB
shootdowns, whatever those are.

I wouldn't know how to tell if it's related...

- Mark

keeper ~ # cat /proc/interrupts
        CPU0   CPU1   CPU2   CPU3   CPU4   CPU5   CPU6   CPU7
  0:     232      0      0      1      0      0      0      0  IO-APIC-edge     timer
  1:       0      0      0      2      0      0      0      0  IO-APIC-edge     i8042
  3:       0      0      0      2      0      0      0      0  IO-APIC-edge
  8:       0      0      0     91      0      0      0      0  IO-APIC-edge     rtc0
  9:       0      0      0      0      0      0      0      0  IO-APIC-fasteoi  acpi
 12:       0      0      0      4      0      0      0      0  IO-APIC-edge     i8042
 14:       0      0      0      0      0      0      0      0  IO-APIC-edge     ide0
 15:       0      0      0      0      0      0      0      0  IO-APIC-edge     ide1
 16:       0      0      0      0     82      0      0      0  IO-APIC-fasteoi  ahci, uhci_hcd:usb1, nvidia
 18:       0      0      0      0      0      0      0      0  IO-APIC-fasteoi  uhci_hcd:usb6, ehci_hcd:usb7
 19:       0      0      0      0      0   3137      0      0  IO-APIC-fasteoi  ahci, firewire_ohci, uhci_hcd:usb3, uhci_hcd:usb5
 20:       0      0      0      0      0      0    265      0  IO-APIC-fasteoi  eth0
 21:       0      0      0      0      0      0      0      0  IO-APIC-fasteoi  uhci_hcd:usb2
 22:     154      0      0      0      0      0      0      0  IO-APIC-fasteoi  hda_intel
 23:       0      0      0      0      0      0      0      0  IO-APIC-fasteoi  uhci_hcd:usb4, ehci_hcd:usb8
NMI:       0      0      0      0      0      0      0      0  Non-maskable interrupts
LOC:    7048   6722   3577   3598   3491   8425   3756   3569  Local timer interrupts
SPU:       0      0      0      0      0      0      0      0  Spurious interrupts
PMI:       0      0      0      0      0      0      0      0  Performance monitoring interrupts
PND:       0      0      0      0      0      0      0      0  Performance pending work
RES:     335    332    353    259    176    173    251     82  Rescheduling interrupts
CAL:     242    233    258    180    241    160    260    260  Function call interrupts
TLB:     232    242    270    235    342    474    537    497  TLB shootdowns
TRM:       0      0      0      0      0      0      0      0  Thermal event interrupts
THR:       0      0      0      0      0      0      0      0  Threshold APIC interrupts
MCE:       0      0      0      0      0      0      0      0  Machine check exceptions
MCP:       2      2      2      2      2      2      2      2  Machine check polls
ERR:       7
MIS:       0
keeper ~ # date
Tue Mar 30 16:45:13 PDT 2010
keeper ~ # cat /proc/interrupts
        CPU0   CPU1   CPU2   CPU3   CPU4   CPU5   CPU6   CPU7
  0:     232      0      0      9      0      0      0      0  IO-APIC-edge     timer
  1:       0      0      0      2      0      0      0      0  IO-APIC-edge     i8042
  3:       0      0      0      2      0      0      0      0  IO-APIC-edge
  8:       0      0      0     91      0      0      0      0  IO-APIC-edge     rtc0
  9:       0      0      0      0      0      0      0      0  IO-APIC-fasteoi  acpi
 12:       0      0      0      4      0      0      0      0  IO-APIC-edge     i8042
 14:       0      0      0      0      0      0      0      0  IO-APIC-edge     ide0
 15:       0      0      0      0      0      0      0      0  IO-APIC-edge     ide1
 16:       0      0      0      0   2660      0      0      0  IO-APIC-fasteoi  ahci, uhci_hcd:usb1, nvidia
 18:       0      0      0      0      0      0      0      0  IO-APIC-fasteoi  uhci_hcd:usb6, ehci_hcd:usb7
 19:       0      0      0      0      0  20762      0      0  IO-APIC-fasteoi  ahci, firewire_ohci, uhci_hcd:usb3, uhci_hcd:usb5
 20:       0      0      0      0      0      0   1903      0  IO-APIC-fasteoi  eth0
 21:       0      0      0      0      0      0      0      0  IO-APIC-fasteoi  uhci_hcd:usb2
 22:     154      0      0      0      0      0      0      0  IO-APIC-fasteoi  hda_intel
 23:       0      0      0      0      0      0      0      0  IO-APIC-fasteoi  uhci_hcd:usb4, ehci_hcd:usb8
NMI:       0      0      0      0      0      0      0      0  Non-maskable interrupts
LOC:   10618  11998   8756   6940   6484  22076   7456   6599  Local timer interrupts
SPU:       0      0      0      0      0      0      0      0  Spurious interrupts
PMI:       0      0      0      0      0      0      0      0  Performance monitoring interrupts
PND:       0      0      0      0      0      0      0      0  Performance pending work
RES:     335    332    353    259    176    173    251     82  Rescheduling interrupts
CAL:     242    233    258    180    241    160    260    260  Function call interrupts
TLB:     232    243    270    236    343    475    538    497  TLB shootdowns
TRM:       0      0      0      0      0      0      0      0  Thermal event interrupts
THR:       0      0      0      0      0      0      0      0  Threshold APIC interrupts
MCE:       0      0      0      0      0      0      0      0  Machine check exceptions
MCP:      10     10     10     10     10     10     10     10  Machine check polls
ERR:       7
MIS:       0
keeper ~ #

* Re: Array 'freezes' for some time after large writes?
From: Jim Duchek @ 2010-03-31 0:22 UTC
To: Mark Knecht; +Cc: linux-raid

Interesting... You're using the AHCI SATA driver... I'm using
ata_piix. I begin to think it might be a hardware issue.

Jim

On 30 March 2010 17:50, Mark Knecht <markknecht@gmail.com> wrote:
> I know this isn't going to survive email very well, but you might want
> to look at interrupts. I'm seeing the count on CPU #5 rising much more
> quickly than on the other CPUs, and in my case it's generally CPU #5
> that stalls out with this 100% wait problem.
<SNIP>

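To check which libata driver a given controller is bound to (ahci vs.
ata_piix here), lspci can report the kernel driver in use. A sketch,
assuming pciutils is installed; note ata_piix may be built into the
kernel rather than loaded as a module, in which case lsmod won't show
it:

# Show each storage controller and the kernel driver in use.
lspci -k | grep -A 2 -i 'sata\|ide'
# Cross-check against loaded modules:
lsmod | egrep 'ahci|ata_piix'
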
* Re: Array 'freezes' for some time after large writes? 2010-03-30 17:07 Array 'freezes' for some time after large writes? Jim Duchek 2010-03-30 17:18 ` Mark Knecht @ 2010-03-31 1:35 ` Roger Heflin 2010-03-31 16:12 ` Mark Knecht 2010-03-31 16:37 ` Asdo 2 siblings, 1 reply; 15+ messages in thread From: Roger Heflin @ 2010-03-31 1:35 UTC (permalink / raw) To: Jim Duchek; +Cc: linux-raid Jim Duchek wrote: > Hi all. Regularly after a large write to the disk (untarring a very > large file, etc), my RAID5 will 'freeze' for a period of time -- > perhaps around a minute. My system is completely responsive otherwise > during this time, with the exception of anything that is attempting to > read or write from the array -- it's as if any file descriptors simply > block. Nothing disk/raid-related is written to the logs during this > time. The array is mounted as /home -- so an awful lot of things > completely freeze during this time (web browser, any video that is > running, etc). The disks don't seem to be actually accessed during > this time (I can't hear them, and the disk access light stays off), > and it's not as if it's just reading slowly -- it's not reading at > all. Array performance is completely normal before and after the > freeze and simply non-existent during it. The root disk (which is on > a seperate disk entirely from the RAID) runs fine during this time, as > does everything else (network, video card, etc -- as long it doesn't > touch the array) -- for example, a Terminal window open is still > responsive during the freeze, and 'ls /' would work fine, while 'ls > /home' would block until the 'freeze' is over. > > Some more detailed information on my setup attached. It's pretty > vanilla. Unfortunately this started around the time four things > happened -- a kernel upgrade to 2.6.32, upgrading my filesystems to > ext4, replacing a disk gone bad in the RAID, and a video card change. > I would assume one of these is the culprit, but you know what they say > about 'assume'. I cannot reproduce the problem reliably, but it > happens a couple times a day. My questions are these: > > 1. Is there any way to turn on more detailed logging for the RAID > system in the kernel? The wiki or a google search makes no mention I > can find, and mdadm doesn't put anything out during this time. > 2. Possibly a problem with the SATA system? My root drive is PATA -- > my RAID disks are all SATA. > 2. Uh, any other ideas? :) > > > Thanks, all. > > Jim Duchek > > > > > > [jrduchek@jimbob ~]$ uname -a > Linux jimbob 2.6.32-ARCH #1 SMP PREEMPT Mon Mar 15 20:44:03 CET 2010 > x86_64 Intel(R) Core(TM)2 Quad CPU Q8400 @ 2.66GHz GenuineIntel > GNU/Linux > > [jrduchek@jimbob ~]$ cat /proc/mdstat > Personalities : [raid6] [raid5] [raid4] > md0 : active raid5 sdb1[0] sde1[3] sdd1[2] sdc1[1] > 1465151808 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU] > > unused devices: <none> > > > [jrduchek@jimbob ~]$ mount > /dev/sda3 on / type ext4 (rw,noatime,user_xattr) > udev on /dev type tmpfs (rw,nosuid,relatime,size=10240k,mode=755) > none on /proc type proc (rw,relatime) > none on /sys type sysfs (rw,relatime) > none on /dev/pts type devpts (rw) > none on /dev/shm type tmpfs (rw) > /dev/sda1 on /boot type ext2 (rw) > /dev/md0 on /home type ext4 (rw,noatime,user_xattr) > > [jrduchek@jimbob ~]$ more /etc/rc.local > #!/bin/bash > # > # /etc/rc.local: Local multi-user startup script. 
In /etc/sysctl.conf or with "sysctl -a | grep vm.dirty" check these two
settings:

vm.dirty_background_ratio = 5
vm.dirty_ratio = 6

The defaults will be something like 10 for the first one and 40 for the
second.

40% is how much memory the kernel lets get dirty with write data; 10%
(or whatever the bottom number is) is how low the dirty total has to
go, once the kernel starts cleaning up, before it lets anyone else
write again (i.e. it freezes all writes and massively slows down reads
until then).

I set the values to the above. In older kernels 5 is the minimum value;
newer ones may allow lower. I don't believe the limits are well
documented, and if you set a value below the minimum, older kernels
silently clamp it internally -- you won't see that in a "sysctl -a"
check. So on my machine a freeze lasts however long it takes to write
1% of memory out to disk, which with 8GB is 81MB -- at most a second or
two at 60MB/second or so. If you have 8GB and the difference between
the two settings is 10%, it can take 10+ seconds. I don't remember the
default, but the larger the gap, the longer the freeze.

And this depends on the underlying disk speed: if the underlying disk
is slower, writing out that amount of data takes longer and things get
uglier. File copies do a good job of causing this.

^ permalink raw reply	[flat|nested] 15+ messages in thread
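A minimal sketch of applying and sanity-checking this suggestion; the 60 MB/s write speed and the gap arithmetic mirror Roger's example and are assumptions, not measurements:

#!/bin/bash
# Sketch: persist the suggested dirty-ratio settings and estimate the
# worst-case flush window. Append the two lines to sysctl.conf once,
# then reload; repeated runs would duplicate the entries.
cat >> /etc/sysctl.conf <<'EOF'
vm.dirty_background_ratio = 5
vm.dirty_ratio = 6
EOF
sysctl -p

# Re-read what the kernel actually accepted -- older kernels silently
# clamp values below their internal minimum.
sysctl vm.dirty_background_ratio vm.dirty_ratio

# Rough worst-case freeze: the gap between the two ratios, as a share
# of RAM, divided by sustained write speed (assumed 60 MB/s here).
mem_mb=$(awk '/MemTotal/ {print int($2/1024)}' /proc/meminfo)
gap_pct=1      # dirty_ratio minus dirty_background_ratio
speed_mb_s=60  # assumption -- measure your own array
echo "worst-case flush: ~$(( mem_mb * gap_pct / 100 / speed_mb_s )) second(s)"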
* Re: Array 'freezes' for some time after large writes?
  2010-03-31  1:35 ` Roger Heflin
@ 2010-03-31 16:12   ` Mark Knecht
  2010-03-31 16:25     ` Jim Duchek
  0 siblings, 1 reply; 15+ messages in thread
From: Mark Knecht @ 2010-03-31 16:12 UTC (permalink / raw)
To: Roger Heflin; +Cc: Jim Duchek, linux-raid

On Tue, Mar 30, 2010 at 6:35 PM, Roger Heflin <rogerheflin@gmail.com> wrote:
> Jim Duchek wrote:
>> Hi all.  Regularly after a large write to the disk (untarring a very
>> large file, etc), my RAID5 will 'freeze' for a period of time --
>> perhaps around a minute.
> <SNIP>
>
> In /etc/sysctl.conf or with "sysctl -a | grep vm.dirty" check these two
> settings:
>
> vm.dirty_background_ratio = 5
> vm.dirty_ratio = 6
<SNIP>

Very interesting Roger. Thanks.

I did some reading on a couple of web sites and then did some testing.
I found that for the sort of jobs I do that create and write data --
compiling and installing MythTV, for example -- these settings have a
big effect on the percentage of time my system drops into these 100%
wait, 0% CPU states. The default setting on my system was 10/20, and
that tended to create this state quite a lot. 3/40 reduced it by
probably 50-75%, while 3/70 seemed to eliminate it until the end of the
build, where the kernel/compiler is presumably forcing everything out
to disk because the job is finishing.

One page I read mentioned data centers using a very good UPS and
internal power supply and then running at 1/100. I think the basic idea
is that if power is lost there should be enough time to flush all this
data to disk before the power completely drops out, but until then the
kernel is left to manage things entirely on its own.
Experimentally, what I see is that when I cross above the lower value
it isn't that nothing gets written; rather, the kernel sort of
opportunistically starts writing data to disk without letting it get
too much in the way of running programs, and then when the higher value
gets crossed the system goes to 100% wait while it pushes the data out
and waits for the disk. I used the command

grep -A 1 dirty /proc/vmstat

to watch a compile taking place, and looked at when it was 100%
user/system and when it went to 100% wait.

Some additional reading suggests tuning things like

vm.overcommit_ratio

and possibly changing the I/O scheduler

keeper ~ # cat /sys/block/sda/queue/scheduler
noop deadline [cfq]

or changing the number of requests

keeper ~ # cat /sys/block/sda/queue/nr_requests
128

or read-ahead values

keeper ~ # blockdev --getra /dev/sda
256

I haven't played with any of those.

Based on this info I think it's worth my time trying a new RAID
install and seeing if I'm more successful.

Thanks very much for your insights and help!

Cheers,
Mark


keeper ~ # vi /etc/sysctl.conf

vm.dirty_background_ratio = 10
vm.dirty_ratio = 20

keeper ~ # sysctl -p
keeper ~ # time emerge -DuN mythtv
<SNIP>
real 8m50.667s
user 30m6.995s
sys 1m30.605s
keeper ~ #


keeper ~ # vi /etc/sysctl.conf

vm.dirty_background_ratio = 3
vm.dirty_ratio = 40

keeper ~ # sysctl -p
keeper ~ # time emerge -DuN mythtv
<SNIP>
real 8m59.401s
user 30m9.980s
sys 1m30.303s
keeper ~ #


keeper ~ # vi /etc/sysctl.conf

vm.dirty_background_ratio = 3
vm.dirty_ratio = 70

keeper ~ # time emerge -DuN mythtv
<SNIP>
real 8m52.272s
user 30m0.889s
sys 1m30.609s
keeper ~ #

^ permalink raw reply	[flat|nested] 15+ messages in thread
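A small loop in the same spirit as the grep above, sampling the nr_dirty and nr_writeback counters that /proc/vmstat exposes on 2.6.3x kernels once a second while a build or copy runs:

#!/bin/bash
# Sketch: watch dirty/writeback page counts to see when the background
# and hard dirty limits are being approached or crossed.
while sleep 1; do
    printf '%s ' "$(date +%T)"
    awk '$1=="nr_dirty" || $1=="nr_writeback" {printf "%s=%s ", $1, $2} END {print ""}' /proc/vmstat
done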
* Re: Array 'freezes' for some time after large writes?
  2010-03-31 16:12   ` Mark Knecht
@ 2010-03-31 16:25     ` Jim Duchek
  0 siblings, 0 replies; 15+ messages in thread
From: Jim Duchek @ 2010-03-31 16:25 UTC (permalink / raw)
To: Mark Knecht; +Cc: Roger Heflin, linux-raid

Agreed, playing with some of these settings appears to clear the
problem up, at least for the cases in which I tend to trigger it.
Much obliged for the help!

Jim

On 31 March 2010 10:12, Mark Knecht <markknecht@gmail.com> wrote:
> On Tue, Mar 30, 2010 at 6:35 PM, Roger Heflin <rogerheflin@gmail.com> wrote:
>> Jim Duchek wrote:
>>> Hi all.  Regularly after a large write to the disk (untarring a very
>>> large file, etc), my RAID5 will 'freeze' for a period of time --
>>> perhaps around a minute.
<SNIP>
^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: Array 'freezes' for some time after large writes?
  2010-03-30 17:07 Array 'freezes' for some time after large writes? Jim Duchek
  2010-03-30 17:18 ` Mark Knecht
  2010-03-31  1:35 ` Roger Heflin
@ 2010-03-31 16:37 ` Asdo
  2 siblings, 0 replies; 15+ messages in thread
From: Asdo @ 2010-03-31 16:37 UTC (permalink / raw)
To: Jim Duchek; +Cc: linux-raid

Jim Duchek wrote:
> Hi all.  Regularly after a large write to the disk (untarring a very
> large file, etc), my RAID5 will 'freeze' for a period of time --
> perhaps around a minute.  My system is completely responsive otherwise
> during this time,

Why don't you cat /proc/<pid of blocked process>/stack to see what the
process is doing? Do it more than once, to see whether it always hangs
on the same syscall...

I believe this is most useful when there is only one access to the
array rather than concurrent access -- you need to trigger the freeze
with, e.g., the tar alone. This might identify filesystem problems;
other problems may be harder to pin down.
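A minimal sketch of that sampling loop; the PID argument and the five-sample count are arbitrary choices, and reading /proc/<pid>/stack requires root and a kernel with stack tracing enabled:

#!/bin/bash
# Sketch: repeatedly sample the kernel stack of a blocked process,
# e.g. the PID of an 'ls /home' that is hanging during a freeze.
pid=$1
for i in 1 2 3 4 5; do
    echo "--- sample $i ---"
    cat "/proc/$pid/stack"
    sleep 2
done

Tasks blocked this way usually show a D (uninterruptible sleep) in the STAT column of ps, which is one way to find the PID to sample.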
end of thread, other threads:[~2010-03-31 16:37 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-30 17:07 Array 'freezes' for some time after large writes? Jim Duchek
2010-03-30 17:18 ` Mark Knecht
2010-03-30 17:47 ` Jim Duchek
2010-03-30 18:00 ` Mark Knecht
2010-03-30 18:05 ` Mark Knecht
2010-03-30 20:32 ` Jim Duchek
2010-03-30 20:45 ` Mark Knecht
2010-03-30 20:59 ` Jim Duchek
2010-03-30 22:21 ` Mark Knecht
2010-03-30 23:50 ` Mark Knecht
2010-03-31  0:22 ` Jim Duchek
2010-03-31  1:35 ` Roger Heflin
2010-03-31 16:12 ` Mark Knecht
2010-03-31 16:25 ` Jim Duchek
2010-03-31 16:37 ` Asdo
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).