* Spare disk could not sleep / standby
@ 2005-03-08  4:05 Peter Evertz
  2005-03-08  4:14 ` Guy
  0 siblings, 1 reply; 20+ messages in thread
From: Peter Evertz @ 2005-03-08 4:05 UTC (permalink / raw)
  To: linux-raid

I have 2 RAID-5 arrays on an hpt375. Each has an (unused) spare disk.
Since changing from 2.4 to 2.6 I cannot put the spare disks to sleep or
standby: they wake up again after a few seconds, and /proc/diskstats shows
activity every 2 to 5 seconds.
It is a problem of the plain kernel driver (not an application), because
if I raidhotremove the drives they can sleep/standby and no activity is
shown in /proc/diskstats.
If the whole array is unmounted, there is no activity on any drive.
It seems that every access to md also causes an access to the spare disk?!

Any hints? Anyone with the same problem?

Regards
Peter Evertz
(Sorry if this is a dup.)

Linux pec6 2.6.8-24.11-default #1 Fri Jan 14 13:01:26 UTC 2005 i686 i686 i386 GNU/Linux

Output of ver_linux:
Gnu C                  3.3.4
Gnu make               3.80
binutils               2.15.91.0.2
util-linux             2.12c
mount                  2.12c
module-init-tools      3.1-pre5
e2fsprogs              1.35
jfsutils               1.1.7
reiserfsprogs          3.6.18
reiser4progs           line
xfsprogs               2.6.13
PPP                    2.4.2
isdn4k-utils           3.5
nfs-utils              1.0.6
Linux C Library        x   1 root root 1359489 Oct  2 02:30 /lib/tls/libc.so.6
Dynamic linker (ldd)   2.3.3
Linux C++ Library      5.0.7
Procps                 3.2.3
Net-tools              1.60
Kbd                    1.12
Sh-utils               5.2.1
Modules Loaded         speedstep_lib freq_table thermal processor fan button battery ac nvram nfsd exportfs pppoe pppox ipt_TOS ip6t_LOG ip6t_limit ipt_LOG ipt_limit ipt_TCPMSS ipt_MASQUERADE usbserial ipt_pkttype parport_pc lp parport eeprom w83781d i2c_sensor i2c_i801 af_packet ppp_generic ip6t_state ip6_conntrack ipt_state ip6t_REJECT ipt_REJECT iptable_mangle iptable_filter ip6table_mangle ip_nat_ftp iptable_nat ip_conntrack_ftp ip_conntrack ip_tables ip6table_filter ip6_tables ipv6 hisax crc_ccitt isdn slhc edd usbhid skystar2 dvb_core i2c_core ohci_hcd evdev joydev sg st sr_mod ide_cd cdrom ehci_hcd uhci_hcd intel_agp agpgart hw_random dm_mod natsemi usbcore bcm5700 raid5 xor ext3 jbd sd_mod scsi_mod

proc/version:
Linux version 2.6.8-24.11-default (geeko@buildhost) (gcc version 3.3.4 (pre 3.3.5 20040809)) #1 Fri Jan 14 13:01:26 UTC 2005

^ permalink raw reply	[flat|nested] 20+ messages in thread
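For readers who want to confirm which disk md is touching, one way is to sample the per-disk counters in /proc/diskstats every couple of seconds. A minimal sketch, assuming a 2.6-era diskstats layout and a spare called hde (substitute your own device name):

    # after the device name, the 1st counter is reads completed and the
    # 5th is writes completed (awk columns 4 and 8 overall on 2.6)
    while sleep 2; do
        awk '$3 == "hde" { print "reads:", $4, " writes:", $8 }' /proc/diskstats
    done

If only the write counter climbs while the array is otherwise idle, the wakeups are write traffic.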
* RE: Spare disk could not sleep / standby
  2005-03-08  4:05 Spare disk could not sleep / standby Peter Evertz
@ 2005-03-08  4:14 ` Guy
  2005-03-08  4:40   ` Neil Brown
  0 siblings, 1 reply; 20+ messages in thread
From: Guy @ 2005-03-08 4:14 UTC (permalink / raw)
  To: 'Peter Evertz', linux-raid

I have no idea, but...

Is the disk IO reads or writes? If writes, scary!!!! Maybe data destined
for the array goes to the spare sometimes. I hope not. I feel safe with my
2.4 kernel. :)

Guy

-----Original Message-----
From: linux-raid-owner@vger.kernel.org
[mailto:linux-raid-owner@vger.kernel.org] On Behalf Of Peter Evertz
Sent: Monday, March 07, 2005 11:05 PM
To: linux-raid@vger.kernel.org
Subject: Spare disk could not sleep / standby

I have 2 RAID-5 arrays on an hpt375. Each has an (unused) spare disk.
Since changing from 2.4 to 2.6 I cannot put the spare disks to sleep or
standby: they wake up again after a few seconds, and /proc/diskstats shows
activity every 2 to 5 seconds.
It is a problem of the plain kernel driver (not an application), because
if I raidhotremove the drives they can sleep/standby and no activity is
shown in /proc/diskstats.
If the whole array is unmounted, there is no activity on any drive.
It seems that every access to md also causes an access to the spare disk?!

Any hints? Anyone with the same problem?

[...]

^ permalink raw reply	[flat|nested] 20+ messages in thread
* RE: Spare disk could not sleep / standby
  2005-03-08  4:14 ` Guy
@ 2005-03-08  4:40   ` Neil Brown
  2005-03-08  5:20     ` Molle Bestefich
    ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Neil Brown @ 2005-03-08 4:40 UTC (permalink / raw)
  To: Guy; +Cc: 'Peter Evertz', linux-raid

On Monday March 7, bugzilla@watkins-home.com wrote:
> I have no idea, but...
>
> Is the disk IO reads or writes? If writes, scary!!!! Maybe data destined
> for the array goes to the spare sometimes. I hope not. I feel safe with my
> 2.4 kernel. :)

It is writes, but don't be scared. It is just super-block updates.

In 2.6, the superblock is marked 'clean' whenever there is a period of
about 20ms of no write activity. This increases the chance that a
resync won't be needed after a crash.
(Unfortunately) the superblocks on the spares need to be updated too.

The only way around this that I can think of is to have the spares
attached to some other array, and have mdadm monitoring the situation
and using the SpareGroup functionality to move the spare to where it
is needed when it is needed.
This would really require having an array with spare drives but no
data drives... maybe a 1-drive raid1 with a loopback device as the
main drive, and all the spares attached to that..... there must be a
better way, or at least some sensible support in mdadm to make it not
too horrible. I'll think about it.

NeilBrown

> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org
> [mailto:linux-raid-owner@vger.kernel.org] On Behalf Of Peter Evertz
> Sent: Monday, March 07, 2005 11:05 PM
> To: linux-raid@vger.kernel.org
> Subject: Spare disk could not sleep / standby
>
> I have 2 RAID-5 arrays on an hpt375. Each has an (unused) spare disk.
> Since changing from 2.4 to 2.6 I cannot put the spare disks to sleep or
> standby: they wake up again after a few seconds, and /proc/diskstats
> shows activity every 2 to 5 seconds.
> It is a problem of the plain kernel driver (not an application), because
> if I raidhotremove the drives they can sleep/standby and no activity is
> shown in /proc/diskstats.
> If the whole array is unmounted, there is no activity on any drive.
> It seems that every access to md also causes an access to the spare
> disk?!
>
> Any hints? Anyone with the same problem?

^ permalink raw reply	[flat|nested] 20+ messages in thread
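From userspace, that workaround might look roughly like the following. This is only a sketch of the idea, untested; the loop-image path, /dev/md9, and the spare's device name are all invented, and it assumes an mdadm new enough to migrate spares between arrays in the same spare-group while running in monitor mode:

    # a dummy 1-drive raid1 on a loop device, used only to park spares
    dd if=/dev/zero of=/var/tmp/spare-parking.img bs=1M count=64
    losetup /dev/loop0 /var/tmp/spare-parking.img
    mdadm --create /dev/md9 --level=1 --raid-devices=1 --force /dev/loop0
    mdadm /dev/md9 --add /dev/hde1        # park the spare here

    # /etc/mdadm.conf: same spare-group on both arrays, so a monitoring
    # mdadm can move the spare to md0 if a disk fails there:
    #   ARRAY /dev/md0 UUID=...  spare-group=shared
    #   ARRAY /dev/md9 UUID=...  spare-group=shared
    mdadm --monitor --scan --daemonise

Since nothing ever writes to the dummy array, it should never bounce between 'active' and 'clean', so the parked spare's superblock should stay untouched and the disk can stay spun down.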
* Re: Spare disk could not sleep / standby
  2005-03-08  4:40   ` Neil Brown
@ 2005-03-08  5:20     ` Molle Bestefich
  2005-03-08  5:36       ` Neil Brown
  2005-03-08 15:59       ` Mike Tran
  2005-03-09 15:53       ` Spare disk could not sleep / standby [probably dangerous PATCH] Peter Evertz
  2 siblings, 1 reply; 20+ messages in thread
From: Molle Bestefich @ 2005-03-08 5:20 UTC (permalink / raw)
  To: linux-raid

Neil Brown wrote:
> It is writes, but don't be scared. It is just super-block updates.
>
> In 2.6, the superblock is marked 'clean' whenever there is a period of
> about 20ms of no write activity. This increases the chance that a
> resync won't be needed after a crash.
> (Unfortunately) the superblocks on the spares need to be updated too.

Ack. One of the cool things that a Linux md array can do that others
can't is, imho, that the disks can spin down when inactive. Granted,
it's mostly for home users who want their desktop RAID to be quiet when
it's not in use, and their basement multi-terabyte facility to use a
minimum of power when idling, but anyway.

Is there any particular reason to update the superblocks every 20 msecs
when they're already marked clean?

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: Spare disk could not sleep / standby
  2005-03-08  5:20     ` Molle Bestefich
@ 2005-03-08  5:36       ` Neil Brown
  2005-03-08  5:46         ` Molle Bestefich
  2005-03-08  8:51         ` David Greaves
  0 siblings, 2 replies; 20+ messages in thread
From: Neil Brown @ 2005-03-08 5:36 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: linux-raid

On Tuesday March 8, molle.bestefich@gmail.com wrote:
> Neil Brown wrote:
> > It is writes, but don't be scared. It is just super-block updates.
> >
> > In 2.6, the superblock is marked 'clean' whenever there is a period of
> > about 20ms of no write activity. This increases the chance that a
> > resync won't be needed after a crash.
> > (Unfortunately) the superblocks on the spares need to be updated too.
>
> Ack. One of the cool things that a Linux md array can do that others
> can't is, imho, that the disks can spin down when inactive. Granted,
> it's mostly for home users who want their desktop RAID to be quiet
> when it's not in use, and their basement multi-terabyte facility to
> use a minimum of power when idling, but anyway.
>
> Is there any particular reason to update the superblocks every 20
> msecs when they're already marked clean?

It doesn't (well, it shouldn't, and I don't think it does).

Before the first write, they are all marked 'active'.
Then after 20ms with no write, they are all marked 'clean'.
Then before the next write they are all marked 'active'.

As the event count needs to be updated every time the superblock is
modified, the event count will be updated for every active->clean or
clean->active transition. All the drives in an array must have the
same value for the event count, so the spares need to be updated even
though they, themselves, aren't exactly 'active' or 'clean'.

NeilBrown

^ permalink raw reply	[flat|nested] 20+ messages in thread
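The event counter is visible from userspace, if you want to watch this happen. A quick sketch, assuming a 0.90 superblock and a member device of /dev/hde1 (both assumptions, adjust to taste):

    mdadm --examine /dev/hde1 | grep -i events   # e.g.  Events : 0.1042
    # write something to the filesystem on the array, wait a moment, then:
    mdadm --examine /dev/hde1 | grep -i events   # the counter has advanced,
                                                 # on spares as well as data disks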
* Re: Spare disk could not sleep / standby
  2005-03-08  5:36       ` Neil Brown
@ 2005-03-08  5:46         ` Molle Bestefich
  2005-03-08  6:03           ` Neil Brown
  0 siblings, 1 reply; 20+ messages in thread
From: Molle Bestefich @ 2005-03-08 5:46 UTC (permalink / raw)
  To: linux-raid

Neil Brown wrote:
> Then after 20ms with no write, they are all marked 'clean'.
> Then before the next write they are all marked 'active'.
>
> As the event count needs to be updated every time the superblock is
> modified, the event count will be updated for every active->clean or
> clean->active transition.

So.. Sorry if I'm a bit slow here.. But what you're saying is:

The kernel marks the partition clean when all pending writes have made
it to disk. This change is propagated through MD, and when it is, it
causes the event counter to rise, thus causing a write, thus marking the
superblock active. 20 msecs later, the same scenario repeats itself.

Is my perception of the situation correct?

Seems like a design flaw to me, but then again, I'm biased towards
hating this behaviour since I really like being able to put inactive
RAIDs to sleep..

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: Spare disk could not sleep / standby
  2005-03-08  5:46         ` Molle Bestefich
@ 2005-03-08  6:03           ` Neil Brown
  2005-03-08  6:24             ` Molle Bestefich
  0 siblings, 1 reply; 20+ messages in thread
From: Neil Brown @ 2005-03-08 6:03 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: linux-raid

On Tuesday March 8, molle.bestefich@gmail.com wrote:
> Neil Brown wrote:
> > Then after 20ms with no write, they are all marked 'clean'.
> > Then before the next write they are all marked 'active'.
> >
> > As the event count needs to be updated every time the superblock is
> > modified, the event count will be updated for every active->clean or
> > clean->active transition.
>
> So.. Sorry if I'm a bit slow here.. But what you're saying is:
>
> The kernel marks the partition clean when all pending writes have made
> it to disk. This change is propagated through MD, and when it is, it
> causes the event counter to rise, thus causing a write, thus marking
> the superblock active. 20 msecs later, the same scenario repeats itself.
>
> Is my perception of the situation correct?

No. Writing the superblock does not cause the array to be marked active.
If the array is idle, the individual drives will be idle.

> Seems like a design flaw to me, but then again, I'm biased towards
> hating this behaviour since I really like being able to put inactive
> RAIDs to sleep..

Hmmm... maybe I misunderstood your problem. I thought you were just
talking about a spare not being idle when you thought it should be.
Are you saying that your whole array is idle, but still seeing writes?
That would have to be something non-md-specific, I think.

NeilBrown

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: Spare disk could not sleep / standby
  2005-03-08  6:03           ` Neil Brown
@ 2005-03-08  6:24             ` Molle Bestefich
  [not found]                   ` <422D625C.5020803@medien.uni-weimar.de>
  0 siblings, 1 reply; 20+ messages in thread
From: Molle Bestefich @ 2005-03-08 6:24 UTC (permalink / raw)
  To: linux-raid

Neil Brown wrote:
>> Is my perception of the situation correct?
>
> No. Writing the superblock does not cause the array to be marked
> active.
> If the array is idle, the individual drives will be idle.

Ok, thank you for the clarification.

>> Seems like a design flaw to me, but then again, I'm biased towards
>> hating this behaviour since I really like being able to put inactive
>> RAIDs to sleep..
>
> Hmmm... maybe I misunderstood your problem. I thought you were just
> talking about a spare not being idle when you thought it should be.
> Are you saying that your whole array is idle, but still seeing writes?
> That would have to be something non-md-specific, I think.

No, the confusion is my bad. That was the original problem posted by
Peter Evertz, which you provided a workaround for.

_I_ was just curious about the workings of MD in 2.6, since it sounded
a bit like it wasn't possible to put a RAID array to sleep. I'm about
to upgrade a server to 2.6, which "needs" to spin down when idle. Got a
bit worried for a moment there =).

Thanks again.

^ permalink raw reply	[flat|nested] 20+ messages in thread
[parent not found: <422D625C.5020803@medien.uni-weimar.de>]
* Re: Spare disk could not sleep / standby
  [not found]                   ` <422D625C.5020803@medien.uni-weimar.de>
@ 2005-03-08  8:57                     ` Molle Bestefich
  2005-03-08 10:51                       ` Tobias Hofmann
  0 siblings, 1 reply; 20+ messages in thread
From: Molle Bestefich @ 2005-03-08 8:57 UTC (permalink / raw)
  To: linux-raid

Tobias wrote:
[...]
> I just found your mail on this list, where I have been lurking for
> some weeks now to get acquainted with RAID, but I fear my mail would
> be almost OT there:

Think so? It's about RAID on Linux, isn't it?
I'm gonna CC the list anyway, hope it's okay :-).

>> I was just curious about the workings of MD in 2.6, since it sounded
>> a bit like it wasn't possible to put a RAID array to sleep. I'm about
>> to upgrade a server to 2.6, which "needs" to spin down when idle.
> Which is exactly what I am planning to do at my home - currently, I have
[...]
> Thus my question: Would you have a link to info on the net concerning
> safely powering down an unused/idle Raid?

No, but I can tell you what I did.

I stuffed a bunch of cheap SATA disks and crappy controllers in an old
system. (And replaced the power supply with one that has enough power
on the 12V rail.)

It's running 2.4, and since it's IDE disks, I just call 'hdparm
-S<whatever>' in rc.local, which instructs the disks to go on standby
whenever they've been idle for 10 minutes.

Works like a charm so far, been running for a couple of years. There
does not seem to be any issue with MD and timing because of the disks
taking 5 seconds or so to spin up; MD happily waits for them, and no
corruption or wrong behaviour has stemmed from putting the disks in
sleep mode.

There have been a couple of annoyances, though.

One is that MD reads from the disks sequentially, thus spinning up the
disks one by one. The more disks you have, the longer you will have to
wait for the entire array to come up :-/. It would have been beautiful
if MD issued the requests in parallel.

Another is that you need to have your root partition outside of the
array. The reason for this is that some fancy feature in your favorite
distro is guaranteed to periodically write something to the disk, which
will make the array spin up constantly. Incidentally, this also makes
using Linux as a desktop system a PITA, since the disks are noisy as
hell if you leave it on. I'm currently using two old disks in RAID1 for
the root filesystem, but I'm thinking that there's probably a better
solution. Perhaps the root filesystem can be shifted to a ramdisk during
startup. Or you could boot from a custom-made CD - that would also be
extremely handy as a rescue disk.

^ permalink raw reply	[flat|nested] 20+ messages in thread
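A note on the -S argument Molle mentions, since its units trip people up: for values from 1 to 240 the timeout is that number times 5 seconds. So a 10-minute spindown looks like this (device name assumed):

    hdparm -S 120 /dev/hde    # 120 * 5 s = 600 s = 10 idle minutes, then standby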
* Re: Spare disk could not sleep / standby
  2005-03-08  8:57                     ` Molle Bestefich
@ 2005-03-08 10:51                       ` Tobias Hofmann
  2005-03-08 13:13                         ` Gordon Henderson
  0 siblings, 1 reply; 20+ messages in thread
From: Tobias Hofmann @ 2005-03-08 10:51 UTC (permalink / raw)
  To: linux-raid; +Cc: Molle Bestefich

On 08.03.2005 09:57, Molle Bestefich wrote:
[...]
> I'm gonna CC the list anyway, hope it's okay :-).

I hope so, too... ;)

[...]
> No, but I can tell you what I did.
>
> I stuffed a bunch of cheap SATA disks and crappy controllers in an old
> system. (And replaced the power supply with one that has enough power
> on the 12V rail.)
>
> It's running 2.4, and since it's IDE disks, I just call 'hdparm
> -S<whatever>' in rc.local, which instructs the disks to go on standby
> whenever they've been idle for 10 minutes.

I had found postings on the net claiming that doing so without
unmounting the fs on the raid would lead to bad things happening - but
your report seems to prove them wrong...

> Works like a charm so far, been running for a couple of years. There
> does not seem to be any issue with MD and timing because of the disks
> taking 5 seconds or so to spin up; MD happily waits for them, and no
> corruption or wrong behaviour has stemmed from putting the disks in
> sleep mode.

Good to read - I will test that once I have the raid free to fool
around with. Thanks for the info! :)

> There have been a couple of annoyances, though.
>
> One is that MD reads from the disks sequentially, thus spinning up the
> disks one by one. The more disks you have, the longer you will have to
> wait for the entire array to come up :-/. It would have been beautiful
> if MD issued the requests in parallel.

Ack.

> Another is that you need to have your root partition outside of the
> array.

Which will be the case here...

> The reason for this is that some fancy feature in your favorite distro
> is guaranteed to periodically write something to the disk, which will
> make the array spin up constantly.

Yup, I see that...

> Incidentally, this also makes using Linux as a desktop system a PITA,
> since the disks are noisy as hell if you leave it on.
> I'm currently using two old disks in RAID1 for the root filesystem,
> but I'm thinking that there's probably a better solution.
> Perhaps the root filesystem can be shifted to a ramdisk during
> startup. Or you could boot from a custom-made CD - that would also be
> extremely handy as a rescue disk.

Hm. Knoppix seems to be coming out in a new version real soon... ;)

Thanks for the feedback,
greets, tobi... :)

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: Spare disk could not sleep / standby
  2005-03-08 10:51                       ` Tobias Hofmann
@ 2005-03-08 13:13                         ` Gordon Henderson
  2005-03-09  5:11                           ` Brad Campbell
  2005-03-09  9:03                           ` Tobias Hofmann
  0 siblings, 2 replies; 20+ messages in thread
From: Gordon Henderson @ 2005-03-08 13:13 UTC (permalink / raw)
  To: linux-raid

On Tue, 8 Mar 2005, Tobias Hofmann wrote:
> > I stuffed a bunch of cheap SATA disks and crappy controllers in an old
> > system. (And replaced the power supply with one that has enough power
> > on the 12V rail.)
> >
> > It's running 2.4, and since it's IDE disks, I just call 'hdparm
> > -S<whatever>' in rc.local, which instructs the disks to go on standby
> > whenever they've been idle for 10 minutes.
>
> I had found postings on the net claiming that doing so without
> unmounting the fs on the raid would lead to bad things happening - but
> your report seems to prove them wrong...

I've been using something called noflushd on a couple of small "home
servers" for a couple of years now to spin the disks down. I made a
posting about it here some time back and the consensus seemed to be (at
the time) that it all should "just work"... And indeed it has been just
working.

They are only running RAID-1 though, with 2.4 and ext2. I understand
that ext3 would force a spin-up every 5 seconds, which would sort of
defeat it. There are other things to be aware of too (things that will
defeat using hdparm) - making sure every entry in syslog.conf is
-/var/log/whatever (ie. with the hyphen prepended) to stop it doing an
fsync on every write, which would spin up the disks. They are on UPSs,
but those have been known to run out in the past )-: so a long fsck and
some data loss is to be expected.

Essentially noflushd blocks the kernel from writing to disk until memory
fills up.. So most of the time the box sits with the disks spun down, and
only spins up when we do some file reading/saving to them.

Noflushd is at http://noflushd.sourceforge.net/ and claims to work with
2.6, but says it will never work with journaling FSs like ext3 and XFS.
(Which is understandable.)

It's a bit weird at times, but very predictable. It takes 8 seconds to
spin the disks up, and sometimes I login to it, suffer the delay, then
get a 2nd (frustrating :) delay as it spins up the other disk in a
RAID-1 set to read some more.

Here are some recent entries from the log-file: (hda & c are part of a
raid-1 set)

Mar  7 06:55:34 watertower noflushd[376]: Spinning down /dev/hda.
Mar  7 06:55:35 watertower noflushd[376]: Spinning down /dev/hdc.
Mar  7 14:10:06 watertower noflushd[376]: Spinning up /dev/hdc after 434 minutes.
Mar  7 14:40:13 watertower noflushd[376]: Spinning down /dev/hdc.
Mar  8 06:25:13 watertower noflushd[376]: Spinning up /dev/hda after 1409 minutes.
Mar  8 06:25:13 watertower noflushd[376]: Spinning up /dev/hdc after 944 minutes.
Mar  8 06:55:14 watertower noflushd[376]: Spinning down /dev/hda.
Mar  8 06:55:25 watertower noflushd[376]: Spinning down /dev/hdc.
Mar  8 13:01:02 watertower noflushd[376]: Spinning up /dev/hdc after 365 minutes.
Mar  8 13:01:12 watertower noflushd[376]: Spinning up /dev/hda after 365 minutes.

That's under 2.4.20 (gosh, is it that old? Uptime is 216 days!) That
machine is my firewall and a small server, so it's not used that often
interactively.

I have the timeout for spinning the disks down set to 30 minutes, as I
found that when set to 5-10 minutes it was sometimes spinning them down
while I was doing some work on it, which was a bit frustrating and
probably not that good for the disks themselves.
I'm in the middle of building up a new home server - looking at RAID-5
or 6 and 2.6.x, so maybe it's time to look at all this again, but it
sounds like the auto superblock update might thwart it all now...

Gordon

^ permalink raw reply	[flat|nested] 20+ messages in thread
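The syslog trick Gordon describes is just a leading hyphen on the file name, which tells syslogd not to fsync after every line it logs, e.g. in /etc/syslog.conf:

    *.info;mail.none;authpriv.none    -/var/log/messages
    mail.*                            -/var/log/maillog

And on 2.6, where noflushd is less at home, the nearest in-kernel equivalent is probably laptop_mode together with longer dirty-writeback intervals (present since around 2.6.6, if memory serves). A hedged sketch of those knobs:

    echo 5    > /proc/sys/vm/laptop_mode
    echo 6000 > /proc/sys/vm/dirty_writeback_centisecs   # flush every 60 s
    echo 6000 > /proc/sys/vm/dirty_expire_centisecs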
* Re: Spare disk could not sleep / standby
  2005-03-08 13:13                         ` Gordon Henderson
@ 2005-03-09  5:11                           ` Brad Campbell
  0 siblings, 0 replies; 20+ messages in thread
From: Brad Campbell @ 2005-03-09 5:11 UTC (permalink / raw)
  To: Gordon Henderson; +Cc: linux-raid

Gordon Henderson wrote:
>
> I'm in the middle of building up a new home server - looking at RAID-5
> or 6 and 2.6.x, so maybe it's time to look at all this again, but it
> sounds like the auto superblock update might thwart it all now...

Nah... As far as I can tell, 20ms after the last write the auto
superblock update will mark the array as clean. You can then spin the
disks down as you normally would after a delay. It's just like a normal
write. There is an overhead, I guess, in that prior to the next write
it's going to mark the superblocks as dirty. I wonder in your case if
this would spin up *all* the disks at once, or do a staged spin-up,
given it's going to touch all the disks "at the same time"?

I have my RAID-6 with ext3 and a commit time of 30s. With an idle
system, it really stays idle. Nothing touches the disks. If I wanted to
spin them down I could do that.

The thing I *love* about this feature is that when I do something
totally stupid and panic the box, 90% of the time I don't need a resync,
as the array was marked clean after the last write. Thanks Neil!

Just for yuks, here are a couple of photos of my latest Frankenstein.
3TB of RAID-6 in a midi-tower case. I had to re-wire the PSU internally
to export an extra 12v rail to an appropriate place, however. I have
been beating RAID-6 senseless for the last week now, doing horrid things
to the hardware. I'm now completely confident in its stability and ready
to use it for production. Thanks HPA!

http://www.wasp.net.au/~brad/nas/nas-front.jpg
http://www.wasp.net.au/~brad/nas/nas-psu.jpg
http://www.wasp.net.au/~brad/nas/nas-rear.jpg
http://www.wasp.net.au/~brad/nas/nas-side.jpg

Regards,
Brad
--
"Human beings, who are almost unique in having the ability to learn from
the experience of others, are also remarkable for their apparent
disinclination to do so." -- Douglas Adams

^ permalink raw reply	[flat|nested] 20+ messages in thread
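The commit interval Brad mentions is a plain ext3 mount option, so it costs nothing to experiment with; a sketch with an assumed device and mount point:

    mount -o remount,commit=30 /dev/md0 /srv/data
    # or permanently, in /etc/fstab:
    #   /dev/md0  /srv/data  ext3  defaults,commit=30  0 2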
* Re: Spare disk could not sleep / standby
  2005-03-08 13:13                         ` Gordon Henderson
  2005-03-09  5:11                           ` Brad Campbell
@ 2005-03-09  9:03                           ` Tobias Hofmann
  1 sibling, 0 replies; 20+ messages in thread
From: Tobias Hofmann @ 2005-03-09 9:03 UTC (permalink / raw)
  To: linux-raid

On 08.03.2005 14:13, Gordon Henderson wrote:
> On Tue, 8 Mar 2005, Tobias Hofmann wrote:
[...]
>> I had found postings on the net claiming that doing so without
>> unmounting the fs on the raid would lead to bad things happening -
>> but your report seems to prove them wrong...
>
> I've been using something called noflushd on a couple of small "home
> servers" for a couple of years now to spin the disks down. I made a
> posting about it here some time back and the consensus seemed to be
> (at the time) that it all should "just work"... And indeed it has been
> just working.

Thanks for mentioning this...

> They are only running RAID-1 though, with 2.4 and ext2.
[...]
> Essentially noflushd blocks the kernel from writing to disk until
> memory fills up.. So most of the time the box sits with the disks spun
> down, and only spins up when we do some file reading/saving to them.

...and this is no prob for me, as my idea is to only spin down a raid
used for data, not the OS...

> Noflushd is at http://noflushd.sourceforge.net/ and claims to work
> with 2.6, but says it will never work with journaling FSs like ext3
> and XFS. (Which is understandable.)

...true, but it bites me. I'll still look into it once I am free to fool
around with the raid (which currently is a backup, so I'd hesitate to
"kill" it... :)

greets, tobi... :)

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: Spare disk could not sleep / standby
  2005-03-08  5:36       ` Neil Brown
  2005-03-08  5:46         ` Molle Bestefich
@ 2005-03-08  8:51         ` David Greaves
  1 sibling, 0 replies; 20+ messages in thread
From: David Greaves @ 2005-03-08 8:51 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

Neil Brown wrote:
> As the event count needs to be updated every time the superblock is
> modified, the event count will be updated for every active->clean or
> clean->active transition. All the drives in an array must have the
> same value for the event count, so the spares need to be updated even
> though they, themselves, aren't exactly 'active' or 'clean'.

May I ask why?

I can understand why the active drives need to have the same superblock
- it marks the data set as consistent and is used on restart to indicate
integrity across the set and avoid a resync.

But the spare has no data on it. What does it mean that the superblock
is up to date? In fact, isn't that misleading? Surely, if anything, the
spare _should_ have an out-of-date superblock?

David

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: Spare disk could not sleep / standby
  2005-03-08  4:40   ` Neil Brown
  2005-03-08  5:20     ` Molle Bestefich
@ 2005-03-08 15:59     ` Mike Tran
  2005-03-09 15:53     ` Spare disk could not sleep / standby [probably dangerous PATCH] Peter Evertz
  1 sibling, 0 replies; 20+ messages in thread
From: Mike Tran @ 2005-03-08 15:59 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

Neil Brown wrote:
> On Monday March 7, bugzilla@watkins-home.com wrote:
> > I have no idea, but...
> >
> > Is the disk IO reads or writes? If writes, scary!!!! Maybe data
> > destined for the array goes to the spare sometimes. I hope not. I
> > feel safe with my 2.4 kernel. :)
>
> It is writes, but don't be scared. It is just super-block updates.
>
> In 2.6, the superblock is marked 'clean' whenever there is a period of
> about 20ms of no write activity. This increases the chance that a
> resync won't be needed after a crash.
> (Unfortunately) the superblocks on the spares need to be updated too.
>
> The only way around this that I can think of is to have the spares
> attached to some other array, and have mdadm monitoring the situation
> and using the SpareGroup functionality to move the spare to where it
> is needed when it is needed.
> This would really require having an array with spare drives but no
> data drives... maybe a 1-drive raid1 with a loopback device as the
> main drive, and all the spares attached to that..... there must be a
> better way, or at least some sensible support in mdadm to make it not
> too horrible. I'll think about it.

While updating superblocks, faulty disks are skipped. Maybe skipping the
superblock update on spares could be considered too. Of course, this
requires corresponding changes in the md superblock validation code.

In addition, I would suggest treating spares as shared global disks.
That is, a spare could be referenced by more than one md array. After a
spare is selected to recover a degraded array, it would be removed from
the shared list. I know that this suggestion departs from the SpareGroup
functionality used by mdadm, but I am afraid that there is a timing
issue with monitoring /proc/mdstat.

--
Regards,
Mike T.

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: Spare disk could not sleep / standby [probably dangerous PATCH]
  2005-03-08  4:40   ` Neil Brown
  2005-03-08  5:20     ` Molle Bestefich
  2005-03-08 15:59     ` Mike Tran
@ 2005-03-09 15:53     ` Peter Evertz
  2005-03-09 10:44       ` Mike Tran
  2 siblings, 1 reply; 20+ messages in thread
From: Peter Evertz @ 2005-03-09 15:53 UTC (permalink / raw)
  To: Neil Brown; +Cc: Guy, 'Peter Evertz', linux-raid

This patch removes my problem. I hope it doesn't influence the stability
of the system.

It is simple: the update routine normally skips only "faulty" disks. Now
it skips all disks that are not part of the working array
( raid_disk == -1 ).

I did some testing, but surely not all, so:

DON'T APPLY TO YOUR SYSTEM WITH IMPORTANT DATA!

Regards
Peter

--- md.c.orig	2005-01-14 16:33:49.000000000 +0100
+++ md.c	2005-03-09 15:27:23.000000000 +0100
@@ -1340,14 +1340,14 @@
 	ITERATE_RDEV(mddev,rdev,tmp) {
 		char b[BDEVNAME_SIZE];
 		dprintk(KERN_INFO "md: ");
-		if (rdev->faulty)
-			dprintk("(skipping faulty ");
+		if (rdev->faulty || rdev->raid_disk < 0)
+			dprintk("(skipping faulty/spare ");

 		dprintk("%s ", bdevname(rdev->bdev,b));
-		if (!rdev->faulty) {
+		if (!rdev->faulty && !rdev->raid_disk <0 ) {
 			err += write_disk_sb(rdev);
 		} else
-			dprintk(")\n");
+			dprintk("<%d>)\n",rdev->raid_disk);
 		if (!err && mddev->level == LEVEL_MULTIPATH)
 			/* only need to write one superblock... */
 			break;

^ permalink raw reply	[flat|nested] 20+ messages in thread
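A simple way to check whether a patch like this actually lets the spare rest: force the spare into standby by hand and query its power state a minute later (hde is an assumed device name):

    hdparm -y /dev/hde     # enter standby immediately
    sleep 60
    hdparm -C /dev/hde     # "standby" suggests the superblock writes stopped;
                           # "active/idle" means something woke the disk again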
* Re: Spare disk could not sleep / standby [probably dangerous PATCH]
  2005-03-09 15:53     ` Spare disk could not sleep / standby [probably dangerous PATCH] Peter Evertz
@ 2005-03-09 10:44       ` Mike Tran
  2005-03-09 20:05         ` Peter Evertz
  0 siblings, 1 reply; 20+ messages in thread
From: Mike Tran @ 2005-03-09 10:44 UTC (permalink / raw)
  To: linux-raid

Hi Peter,

After applying this patch, have you tried stopping and restarting the MD
array? I believe the spares will be kicked out in the analyze_sbs()
function (see the second ITERATE_RDEV).

--
Regards,
Mike T.

On Wed, 2005-03-09 at 09:53, Peter Evertz wrote:
> This patch removes my problem. I hope it doesn't influence the
> stability of the system.
> It is simple: the update routine normally skips only "faulty" disks.
> Now it skips all disks that are not part of the working array
> ( raid_disk == -1 ).
> I did some testing, but surely not all, so:
>
> DON'T APPLY TO YOUR SYSTEM WITH IMPORTANT DATA!
>
> Regards
> Peter
>
> [...]

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: Spare disk could not sleep / standby [probably dangerous PATCH]
  2005-03-09 10:44       ` Mike Tran
@ 2005-03-09 20:05         ` Peter Evertz
  2005-03-09 16:29           ` Mike Tran
  0 siblings, 1 reply; 20+ messages in thread
From: Peter Evertz @ 2005-03-09 20:05 UTC (permalink / raw)
  To: Mike Tran; +Cc: linux-raid

Mike Tran writes:

> Hi Peter,
>
> After applying this patch, have you tried stopping and restarting the
> MD array? I believe the spares will be kicked out in the analyze_sbs()
> function (see the second ITERATE_RDEV).

mdadm ( v1.6.0 - 4 June 2004 ) shows the arrays complete, including the
spare. /proc/mdstat is ok.

I booted with my patched raid modules, so analyze_sbs() should have run.
Maybe it works only for 0.90 superblocks; I haven't tried 1.00.

No problems yet. If it really fails the hard way, I will go to the next
internet cafe and tell you about it :)

Peter

> [...]

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: Spare disk could not sleep / standby [probably dangerous PATCH]
  2005-03-09 20:05         ` Peter Evertz
@ 2005-03-09 16:29           ` Mike Tran
  2005-03-09 23:20             ` Peter Evertz
  0 siblings, 1 reply; 20+ messages in thread
From: Mike Tran @ 2005-03-09 16:29 UTC (permalink / raw)
  To: linux-raid

I tried the patch and immediately found problems.

On creation of a raid1 array, only the spare has an md superblock; the
raid disks have no superblock. For instance:

mdadm -C /dev/md0 -l 1 -n 2 /dev/hdd1 /dev/hdd2 -x 1 /dev/hdd3
[wait for resync to finish if you want to...]
mdadm --stop /dev/md0
mdadm --examine /dev/hdd1 (no superblock found)
mdadm --examine /dev/hdd2 (no superblock found)
mdadm --examine /dev/hdd3 (nice output)

If you want to skip spares, you will need to alter the patch (see below)

On Wed, 2005-03-09 at 14:05, Peter Evertz wrote:
> Mike Tran writes:
>
> > Hi Peter,
> >
> > After applying this patch, have you tried stopping and restarting
> > the MD array? I believe the spares will be kicked out in the
> > analyze_sbs() function (see the second ITERATE_RDEV).
> mdadm ( v1.6.0 - 4 June 2004 ) shows the arrays complete, including
> the spare. /proc/mdstat is ok.
>
> I booted with my patched raid modules, so analyze_sbs() should have
> run. Maybe it works only for 0.90 superblocks; I haven't tried 1.00.
>
> No problems yet. If it really fails the hard way, I will go to the
> next internet cafe and tell you about it :)
>
> Peter
[...]
> >> 		dprintk("%s ", bdevname(rdev->bdev,b));
> >> -		if (!rdev->faulty) {
> >> +		if (!rdev->faulty && !rdev->raid_disk <0 ) {

		if (!rdev->faulty && rdev->in_sync)
			err += write_disk_sb(rdev);
		else {
			if (rdev->faulty)
				dprintk(" faulty.\n");
			else
				dprintk(" spare.\n");
		}

/*
 * Don't try this :(
 * because this still breaks creation of a new md array and..
 * for existing arrays with spares, the spares will be kicked out when
 * the arrays are re-assembled.
 */

--
Regards,
Mike T.

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: Spare disk could not sleep / standby [probably dangerous PATCH]
  2005-03-09 16:29           ` Mike Tran
@ 2005-03-09 23:20             ` Peter Evertz
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Evertz @ 2005-03-09 23:20 UTC (permalink / raw)
  To: Mike Tran; +Cc: linux-raid

Mike Tran writes:

> I tried the patch and immediately found problems.
>
> On creation of a raid1 array, only the spare has an md superblock; the
> raid disks have no superblock. For instance:
>
> mdadm -C /dev/md0 -l 1 -n 2 /dev/hdd1 /dev/hdd2 -x 1 /dev/hdd3
> [wait for resync to finish if you want to...]
> mdadm --stop /dev/md0
> mdadm --examine /dev/hdd1 (no superblock found)
> mdadm --examine /dev/hdd2 (no superblock found)
> mdadm --examine /dev/hdd3 (nice output)
>
> If you want to skip spares, you will need to alter the patch (see below)

Oops! I shouldn't have posted the patch for everyone. Nevertheless it
works for me, but I see that your version is much better. Testing for
raid_disk < 0 is not a good idea when you (re)create an array. The test
for "in_sync" is better, but I still don't know if it works under all
circumstances.

Thanks
Peter

[...]
> 		if (!rdev->faulty && rdev->in_sync)
> 			err += write_disk_sb(rdev);
> 		else {
> 			if (rdev->faulty)
> 				dprintk(" faulty.\n");
> 			else
> 				dprintk(" spare.\n");
> 		}
>
> /*
>  * Don't try this :(
>  * because this still breaks creation of a new md array and..
>  * for existing arrays with spares, the spares will be kicked out when
>  * the arrays are re-assembled.
>  */

^ permalink raw reply	[flat|nested] 20+ messages in thread
end of thread, other threads:[~2005-03-09 23:20 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-03-08 4:05 Spare disk could not sleep / standby Peter Evertz
2005-03-08 4:14 ` Guy
2005-03-08 4:40 ` Neil Brown
2005-03-08 5:20 ` Molle Bestefich
2005-03-08 5:36 ` Neil Brown
2005-03-08 5:46 ` Molle Bestefich
2005-03-08 6:03 ` Neil Brown
2005-03-08 6:24 ` Molle Bestefich
[not found] ` <422D625C.5020803@medien.uni-weimar.de>
2005-03-08 8:57 ` Molle Bestefich
2005-03-08 10:51 ` Tobias Hofmann
2005-03-08 13:13 ` Gordon Henderson
2005-03-09 5:11 ` Brad Campbell
2005-03-09 9:03 ` Tobias Hofmann
2005-03-08 8:51 ` David Greaves
2005-03-08 15:59 ` Mike Tran
2005-03-09 15:53 ` Spare disk could not sleep / standby [probably dangerous PATCH] Peter Evertz
2005-03-09 10:44 ` Mike Tran
2005-03-09 20:05 ` Peter Evertz
2005-03-09 16:29 ` Mike Tran
2005-03-09 23:20 ` Peter Evertz
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).