* problem killing raid 5
From: Daniel Santos @ 2007-10-01 11:04 UTC
To: linux-raid

Hello,

I had a RAID 5 array on three disks. Because of a hardware problem, two
disks disappeared one after the other. I have since been trying to create
a new array with them.

Between the two disk failures I tried removing one of the failed disks and
re-adding it to the array. When the second disk failed I looked at the
drive numbers on the broken array, and mysteriously a fourth drive had
appeared on it. Now I have numbers 0, 1 and 3, but no number 2, and mdadm
tells me that number 3 is a spare.

Now I want to start all over again, but even after zeroing the superblocks
on all three disks and creating a new array, /proc/mdstat shows the same
drive numbers while reconstructing the third drive. What should I do?

Daniel Santos
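For anyone wanting to retry this cleanly, a minimal sketch of wiping the old
metadata and recreating the array is below. The member names /dev/sdb1,
/dev/sdc1 and /dev/sdd1 are assumptions taken from the /proc/mdstat output
later in the thread; adjust them to your own devices.

  # Stop the half-built array so the member devices are released.
  mdadm --stop /dev/md0

  # Erase the md superblock on every member partition (names assumed).
  mdadm --zero-superblock /dev/sdb1 /dev/sdc1 /dev/sdd1

  # Recreate the three-disk RAID 5; the initial resync will then read and
  # write every block, which also tends to shake out bad sectors early.
  mdadm --create /dev/md0 --level=5 --raid-devices=3 \
        /dev/sdb1 /dev/sdc1 /dev/sdd1

  # Watch the resync progress.
  cat /proc/mdstat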
* Re: problem killing raid 5
From: Daniel Santos @ 2007-10-01 18:20 UTC
Cc: linux-raid

I retried rebuilding the array from scratch, and this time checked the
syslog messages. The reconstruction process is getting stuck at a disk
block that it can't read. I double-checked the block number by repeating
the array creation, and did a bad-block scan. No bad blocks were found.
How can the md driver be stuck if the block is fine?

Supposing that the disk does have bad blocks, can I have a RAID device on
disks with bad blocks? Each of the disks is 400 GB.

It's probably not a good idea, because a drive that has bad blocks will
probably develop more in the future. But can I anyway? The bad blocks
would have to be known to the md driver.

Daniel Santos wrote:
> Hello,
>
> I had a RAID 5 array on three disks. Because of a hardware problem, two
> disks disappeared one after the other. I have since been trying to
> create a new array with them.
>
> Between the two disk failures I tried removing one of the failed disks
> and re-adding it to the array. When the second disk failed I looked at
> the drive numbers on the broken array, and mysteriously a fourth drive
> had appeared on it. Now I have numbers 0, 1 and 3, but no number 2, and
> mdadm tells me that number 3 is a spare.
>
> Now I want to start all over again, but even after zeroing the
> superblocks on all three disks and creating a new array, /proc/mdstat
> shows the same drive numbers while reconstructing the third drive.
> What should I do?
>
> Daniel Santos
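To rule out a mismatch between what the kernel reported and what was
scanned, it can help to scan the member in 512-byte sectors and then try to
read the exact sector directly. A hedged sketch; /dev/sdb is assumed and
BLOCK is a placeholder for the sector number from the syslog error:

  # Non-destructive read-only scan; -b 512 makes the reported block
  # numbers match the 512-byte sector numbers the kernel prints
  # (badblocks defaults to 1024-byte blocks, which halves the numbers).
  badblocks -b 512 -sv /dev/sdb1

  # Try to read the single sector named in the kernel error message.
  # Note the kernel usually reports the sector relative to the whole
  # disk (sdb), not the partition (sdb1).
  dd if=/dev/sdb of=/dev/null bs=512 skip=BLOCK count=1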
* Re: problem killing raid 5
From: Michael Tokarev @ 2007-10-01 18:47 UTC
To: Daniel Santos; +Cc: linux-raid

Daniel Santos wrote:
> I retried rebuilding the array from scratch, and this time checked the
> syslog messages. The reconstruction process is getting stuck at a disk
> block that it can't read. I double-checked the block number by repeating
> the array creation, and did a bad-block scan. No bad blocks were found.
> How can the md driver be stuck if the block is fine?
>
> Supposing that the disk does have bad blocks, can I have a RAID device
> on disks with bad blocks? Each of the disks is 400 GB.
>
> It's probably not a good idea, because a drive that has bad blocks will
> probably develop more in the future. But can I anyway? The bad blocks
> would have to be known to the md driver.

Well, almost all modern drives can remap bad blocks (at least I know of no
drive that can't). Most of the time this happens on write, because if a
bad block is found during a read and the drive really can't read the
content of that block, it can't remap it either without losing data. In my
experience (about 20 years, many hundreds of drives, mostly (old) SCSI but
(old) IDE too), it's pretty normal for a drive to develop several bad
blocks, especially during its first year of use. Sometimes, however, the
number of bad blocks grows quite rapidly, and such a drive should
definitely be replaced - at least Seagate drives are covered by warranty
in this case.

SCSI drives have two so-called "defect lists", stored somewhere inside the
drive: a factory list (bad blocks found during internal testing when the
drive was produced) and a grown list (bad blocks found by the drive during
normal use). The factory list can contain from 0 to about 1000 entries or
even more (depending on the size too), and the grown list can reach 500
blocks or more; whether that is fatal depends on whether new bad blocks
keep being found. We have several drives that developed that many bad
blocks in their first few months of use, the list stopped growing, and
they have been working just fine for more than 5 years. Both defect lists
can be shown by the scsitools programs.

I don't know how one can see the defect lists on an IDE or SATA drive.

Note that the md layer (raid1, 4, 5, 6, 10 - but obviously not raid0 or
linear) is now able to repair bad blocks automatically, by forcing a write
to the same place on the drive where a read error occurred - this usually
makes the drive reallocate that block, and md continues.

But in any case, md should not stall - be it during reconstruction or not.
On that I can't comment - to me it smells like a bug somewhere (md layer?
error handling in the driver? something else?) which should be found and
fixed. And for that, some more details are needed I guess - the kernel
version is a start.

/mjt
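For IDE/SATA drives there is no direct equivalent of dumping the SCSI
defect lists, but the drive's own remapping counters are visible through
SMART. A hedged sketch using smartmontools (assuming it is installed and
the drive is /dev/sdb):

  # Overall health verdict from the drive's own self-assessment.
  smartctl -H /dev/sdb

  # Attribute table; the relevant counters are Reallocated_Sector_Ct
  # (sectors already remapped) and Current_Pending_Sector (sectors the
  # drive could not read and is waiting to remap on the next write).
  smartctl -A /dev/sdb

  # Drives behind USB or other bridges may need the transport given
  # explicitly, e.g.:
  #   smartctl -A -d ata /dev/sdb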
* Re: problem killing raid 5
From: Patrik Jonsson @ 2007-10-01 18:58 UTC
To: Michael Tokarev; +Cc: Daniel Santos, linux-raid

Michael Tokarev wrote:
[]
> Note that the md layer (raid1, 4, 5, 6, 10 - but obviously not raid0 or
> linear) is now able to repair bad blocks automatically, by forcing a
> write to the same place on the drive where a read error occurred - this
> usually makes the drive reallocate that block, and md continues.
>
> But in any case, md should not stall - be it during reconstruction or
> not. On that I can't comment - to me it smells like a bug somewhere (md
> layer? error handling in the driver? something else?) which should be
> found and fixed. And for that, some more details are needed I guess -
> the kernel version is a start.

Really? It's my understanding that if md finds an unreadable block during
raid5 reconstruction, it has no option but to fail, since the information
can't be reconstructed. When this happened to me, I had to wipe the bad
block, which should allow reconstruction to proceed at the cost of losing
the chunk that sits on the unreadable block. The bad block HOWTO and
messages on this list from about two years ago explain how to figure out
which file(s) are affected.

This is why it's important to run a weekly check, so md can repair blocks
*before* a drive fails.

cheers,

/Patrik
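Both of those points can be turned into commands; a hedged sketch, where
md0 and /dev/sdb are assumed and SECTOR is a placeholder for the failing
LBA reported in the kernel log:

  # Trigger a full read (and rewrite-on-error) pass over a running array.
  echo check > /sys/block/md0/md/sync_action
  cat /proc/mdstat                     # shows the "check" progressing
  cat /sys/block/md0/md/mismatch_cnt   # non-zero means inconsistencies found

  # An example weekly cron entry (e.g. in /etc/cron.d/md-check):
  #   30 1 * * 0  root  echo check > /sys/block/md0/md/sync_action

  # "Wiping the bad block" as a last resort: overwrite the unreadable
  # sector so the drive remaps it. This destroys the 512 bytes at SECTOR,
  # so double-check the number against the kernel log first.
  dd if=/dev/zero of=/dev/sdb bs=512 seek=SECTOR count=1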
* Re: problem killing raid 5
From: Michael Tokarev @ 2007-10-01 20:35 UTC
To: Patrik Jonsson; +Cc: Daniel Santos, linux-raid

Patrik Jonsson wrote:
> Michael Tokarev wrote:
[]
>> But in any case, md should not stall - be it during reconstruction or
>> not. On that I can't comment - to me it smells like a bug somewhere (md
>> layer? error handling in the driver? something else?) which should be
>> found and fixed. And for that, some more details are needed I guess -
>> the kernel version is a start.
>
> Really? It's my understanding that if md finds an unreadable block
> during raid5 reconstruction, it has no option but to fail, since the
> information can't be reconstructed. When this happened to me, I had to

Yes indeed, it should fail, but it should not get stuck as Daniel
reported. That is, it should either complete the work or fail, not sleep
somewhere in between.

[]
> This is why it's important to run a weekly check, so md can repair
> blocks *before* a drive fails.

*nod*.

/mjt
* Re: problem killing raid 5
From: Daniel Santos @ 2007-10-01 21:11 UTC
Cc: linux-raid

It stopped the reconstruction process, and the output of /proc/mdstat was:

oraculo:/home/dlsa# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid1] [raid0] [linear]
md0 : active raid5 sdc1[3](S) sdb1[4](F) sdd1[0]
      781417472 blocks level 5, 256k chunk, algorithm 2 [3/1] [U__]

I then stopped the array and tried to assemble it with a scan:

oraculo:/home/dlsa# mdadm --assemble --scan
mdadm: /dev/md0 assembled from 1 drive and 1 spare - not enough to start the array.
oraculo:/home/dlsa# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid1] [raid0] [linear]
md0 : inactive sdd1[0](S) sdc1[3](S) sdb1[1](S)
      1172126208 blocks

The fourth drive I had to list in mdadm.conf as missing.

The result is that, because of the read error, the reconstruction of the
new array aborted, and the assemble came up with an array that looks like
the one that failed before I created the new one.

I am running Debian with a 2.6.22 kernel.

Michael Tokarev wrote:
> Patrik Jonsson wrote:
>> Michael Tokarev wrote:
[]
>>> But in any case, md should not stall - be it during reconstruction or
>>> not. On that I can't comment - to me it smells like a bug somewhere
>>> (md layer? error handling in the driver? something else?) which should
>>> be found and fixed. And for that, some more details are needed I guess
>>> - the kernel version is a start.
>>
>> Really? It's my understanding that if md finds an unreadable block
>> during raid5 reconstruction, it has no option but to fail, since the
>> information can't be reconstructed. When this happened to me, I had to
>
> Yes indeed, it should fail, but it should not get stuck as Daniel
> reported. That is, it should either complete the work or fail, not sleep
> somewhere in between.
[]
>> This is why it's important to run a weekly check, so md can repair
>> blocks *before* a drive fails.
>
> *nod*.
>
> /mjt
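To see why the kernel keeps reporting the stale drive numbers, it can help
to dump the superblock each member actually carries and compare them. A
hedged sketch, with the member names taken from the /proc/mdstat output
above:

  # Print the md superblock from each member; compare the event counters
  # and the slot each device thinks it occupies.
  mdadm --examine /dev/sdb1 /dev/sdc1 /dev/sdd1

  # If the members mostly agree and only one is stale, a forced assembly
  # will start the array degraded from the freshest superblocks. This
  # skips some of mdadm's sanity checks, so copy off what you can first.
  mdadm --assemble --force /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdd1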
* Re: problem killing raid 5
From: Justin Piszcz @ 2007-10-01 21:44 UTC
To: Daniel Santos; +Cc: linux-raid

On Mon, 1 Oct 2007, Daniel Santos wrote:

> It stopped the reconstruction process, and the output of /proc/mdstat was:
>
> oraculo:/home/dlsa# cat /proc/mdstat
> Personalities : [raid6] [raid5] [raid4] [raid1] [raid0] [linear]
> md0 : active raid5 sdc1[3](S) sdb1[4](F) sdd1[0]
>       781417472 blocks level 5, 256k chunk, algorithm 2 [3/1] [U__]
>
> I then stopped the array and tried to assemble it with a scan:
>
> oraculo:/home/dlsa# mdadm --assemble --scan
> mdadm: /dev/md0 assembled from 1 drive and 1 spare - not enough to start the array.
> oraculo:/home/dlsa# cat /proc/mdstat
> Personalities : [raid6] [raid5] [raid4] [raid1] [raid0] [linear]
> md0 : inactive sdd1[0](S) sdc1[3](S) sdb1[1](S)
>       1172126208 blocks
>
> The fourth drive I had to list in mdadm.conf as missing.
>
> The result is that, because of the read error, the reconstruction of the
> new array aborted, and the assemble came up with an array that looks
> like the one that failed before I created the new one.
>
> I am running Debian with a 2.6.22 kernel.
[]

Yikes. By the way, are all those drives on the same chipset? What type of
drives did you use?

Justin.
* Re: problem killing raid 5
From: Daniel Santos @ 2007-10-02 6:53 UTC
Cc: linux-raid

All the drives are identical, and they are in identical USB enclosures.
I am starting to suspect the USB layer - it frequently resets the
enclosures. I'll have to look at that first. Anyway, it did work for some
time before.

Justin Piszcz wrote:
> On Mon, 1 Oct 2007, Daniel Santos wrote:
[]
> Yikes. By the way, are all those drives on the same chipset? What type
> of drives did you use?
>
> Justin.
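A quick way to check the USB suspicion is to look for bridge resets or
disconnects in the kernel log around the time the members dropped out. A
hedged sketch; the exact message wording varies with the kernel and the
usb-storage driver:

  # Search the current kernel ring buffer for USB trouble.
  dmesg | grep -iE 'usb.*reset|device descriptor|usb-storage'

  # Or watch the log live while the array resyncs (Debian sends kernel
  # messages to /var/log/syslog by default).
  tail -f /var/log/syslog | grep -iE 'usb|sd[bcd]'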