* Huge mdadm resync problem.
@ 2005-02-17 12:08 Phantazm
2005-02-17 12:37 ` Phantazm
0 siblings, 1 reply; 6+ messages in thread
From: Phantazm @ 2005-02-17 12:08 UTC
To: linux-raid
This is a really weird problem with mdadm.
I currently have 8 Maxtor 200 GB disks.
They are connected like this:
hdb hdc hdd = onboard IDE
hde hdf hdg hdh = Promise ATA133 card
hdk = Promise ATA133 card
Hardware is a P4 2.8 GHz with 2 GB of RAM and an MSI NEO 2 mobo.
The problem is that the resync is really slow, and when it's done it just
loops and the box crashes.
Here is some info.
Currently I'm testing a resync with a non-HT/SMP config and noapic, just to
check that it's not some IRQ routing crap. (It failed before, though.)
merlin / # uname -a
Linux merlin 2.6.10-gentoo-r6 #16 Thu Feb 17 11:00:11 CET 2005 i686 Intel(R) Pentium(R) 4 CPU 2.80GHz GenuineIntel GNU/Linux
merlin / # cat /proc/interrupts
           CPU0
  0:    6621371          XT-PIC  timer
  1:          8          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  3:    1487791          XT-PIC  eth1
 10:    1628242          XT-PIC  eth0, eth2
 11:     112644          XT-PIC  ide2, ide3
 12:      35197          XT-PIC  ide5
 14:      71092          XT-PIC  ide0
 15:      63376          XT-PIC  ide1
NMI:          0
ERR:      40328
cat /proc/mdstat
Personalities : [raid5]
md0 : active raid5 hde1[0] hdb1[8] hdd1[7] hdk1[6] hdc1[4] hdh1[3] hdg1[2] hdf1[1]
      1393991424 blocks level 5, 64k chunk, algorithm 2 [8/7] [UUUUU_UU]
      [=>...................]  recovery =  5.5% (11110168/199141632) finish=1641.7min speed=1906K/sec
unused devices: <none>
(The resync speed is always somewhere between 500K/s and 3000K/s; it should
be 10000K/s ;-)
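For reference, the finish estimate in the mdstat output above is roughly the remaining blocks divided by the current speed. A minimal Python sketch of that arithmetic, using only the numbers quoted above; the function name is made up for illustration, and it assumes the mdstat counters are 1 KiB blocks. The kernel averages the speed over a window, so its own figure differs slightly.

```python
def resync_eta_minutes(done_kib: int, total_kib: int, speed_kib_s: float) -> float:
    """Minutes left for a resync, given progress and speed in 1 KiB blocks."""
    remaining = total_kib - done_kib
    return remaining / speed_kib_s / 60.0


# Figures taken from the recovery line in the /proc/mdstat output above.
done, total = 11110168, 199141632

print(f"ETA at 1906K/sec:  {resync_eta_minutes(done, total, 1906):.0f} min")
print(f"ETA at 10000K/sec: {resync_eta_minutes(done, total, 10000):.0f} min")
```

At the reported 1906K/sec this gives about 1644 minutes, close to the kernel's finish=1641.7min; at the hoped-for 10000K/sec it would be about 313 minutes.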
This is the kernel log. It's just a small excerpt, since this output goes on
until I reboot the box (it's frozen).
This is what I get when the sync is finished and it should mark the array good.
Feb 17 07:17:08 [kernel] md: using maximum available idle IO bandwith (but not more than 150000 KB/sec) for reconstruction.
Feb 17 07:17:08 [kernel] md: md0: sync done.
Feb 17 07:17:08 [kernel] .<6>md: syncing RAID array md0
Feb 17 07:17:08 [kernel] md: md0: sync done.
Feb 17 07:17:08 [kernel] .<6>md: syncing RAID array md0
Feb 17 07:17:08 [kernel] md: md0: sync done.
Feb 17 07:17:08 [kernel] .<6>md: syncing RAID array md0
Feb 17 07:17:08 [kernel] md: md0: sync done.
Feb 17 07:17:08 [kernel] .<6>md: syncing RAID array md0
Feb 17 07:17:08 [kernel] md: md0: sync done.
Feb 17 07:17:08 [kernel] .<6>md: syncing RAID array md0
Feb 17 07:17:08 [kernel] md: md0: sync done.
Feb 17 07:17:08 [kernel] .<6>md: syncing RAID array md0
Feb 17 07:17:08 [kernel] md: md0: sync done.
Feb 17 07:17:08 [kernel] .<6>md: syncing RAID array md0
Feb 17 07:17:08 [kernel] md: md0: sync done.
Feb 17 07:17:08 [kernel] .<6>md: syncing RAID array md0
Feb 17 07:17:08 [kernel] md: using maximum available idle IO bandwith (but not more than 150000 KB/sec) for reconstruction.
Feb 17 07:17:08 [kernel] .<6>md: syncing RAID array md0
Feb 17 07:17:08 [kernel] md: md0: sync done.
Feb 17 07:17:08 [kernel] .<6>md: syncing RAID array md0
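The loop in the excerpt above can be made visible by counting md messages per timestamp. A small Python sketch, assuming syslog lines that start with a 15-character timestamp as shown above; the helper name is hypothetical and the sample lines are a tidied subset of the excerpt, not a full log.

```python
from collections import Counter


def count_md_events(lines):
    """Count 'sync done' and 'syncing RAID array' messages per timestamp."""
    done, started = Counter(), Counter()
    for line in lines:
        stamp = line[:15]  # e.g. "Feb 17 07:17:08"
        if "sync done" in line:
            done[stamp] += 1
        elif "syncing RAID array" in line:
            started[stamp] += 1
    return done, started


# Tidied sample lines in the same format as the excerpt above.
sample = [
    "Feb 17 07:17:08 [kernel] md: md0: sync done.",
    "Feb 17 07:17:08 [kernel] md: syncing RAID array md0",
    "Feb 17 07:17:08 [kernel] md: md0: sync done.",
    "Feb 17 07:17:08 [kernel] md: syncing RAID array md0",
]

done, started = count_md_events(sample)
for stamp in sorted(done):
    print(f"{stamp}: {done[stamp]} x sync done, {started[stamp]} x sync start")
```

Several start/done pairs within the same second, as in the excerpt, suggest the resync is being restarted immediately rather than completing once.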
I've also tried having 4 disks on each Promise card, with the same result.
(With APIC enabled I get a lot of CPU APIC error 60 messages.)
I have checked all disks with SMART tools and also benchmarked them. Each disk
gets about 1800 MB/s with hdparm -T and 60 MB/s with hdparm -t, so I doubt
that there's actually a broken disk.
I'm running mdadm 1.7.0.
This is totally bugging me out.
Help is really, really appreciated.
* Re: Huge mdadm resync problem.
2005-02-17 12:08 Huge mdadm resync problem Phantazm
@ 2005-02-17 12:37 ` Phantazm
0 siblings, 0 replies; 6+ messages in thread
From: Phantazm @ 2005-02-17 12:37 UTC
To: linux-raid
I forgot to mention that speed is generally no problem on the RAID set.
It's connected to a gigabit interface, and copying to and from the interface
gives about 35 MB/s over the link.
So the disks are fast and IO is working pretty well.
Now I just sit here with my last hope on you fellas ;-)
Regards
Phantazm
* Re: Huge mdadm resync problem.
@ 2005-02-17 13:52 Lord Hess,Raum 301Kw,54-8994
2005-02-17 14:33 ` Phantazm
0 siblings, 1 reply; 6+ messages in thread
From: Lord Hess,Raum 301Kw,54-8994 @ 2005-02-17 13:52 UTC
To: Phantazm, linux-raid
Hi,
Check the sync rate with the disks connected only as IDE masters. As far as
I can see from your configuration, this means you can try a RAID with 5 disks.
I assume every disk is correctly jumpered as master or slave? Or do you use
the "cable select" option?
Lord
--
Lord Hess, R. 3.307 ,KIP, Inf 227 69120 Heidelberg
* Re: Huge mdadm resync problem.
2005-02-17 13:52 Lord Hess,Raum 301Kw,54-8994
@ 2005-02-17 14:33 ` Phantazm
2005-02-17 14:40 ` Gordon Henderson
0 siblings, 1 reply; 6+ messages in thread
From: Phantazm @ 2005-02-17 14:33 UTC
To: linux-raid
I use master/slave. The problem is that I can't break the RAID set, because
if I do I will lose over 1 TB of data :/
Going to see if I can get more controller cards, though.
* Re: Huge mdadm resync problem.
2005-02-17 14:33 ` Phantazm
@ 2005-02-17 14:40 ` Gordon Henderson
2005-02-17 14:57 ` Phantazm
0 siblings, 1 reply; 6+ messages in thread
From: Gordon Henderson @ 2005-02-17 14:40 UTC
To: linux-raid
On Thu, 17 Feb 2005, Phantazm wrote:
> I use master/slave. The problem is that I can't break the RAID set, because
> if I do I will lose over 1 TB of data :/
>
> Going to see if I can get more controller cards, though.
Do it. Use 4 2-port cards for your 8 drives and only one drive per cable.
It is possible, and I've had it happen to me, that a hardware failure on a
failing drive can cause loss of access to the 2nd drive on the same cable.
Fortunately when it happened to me, the disks weren't in a RAID set and I
didn't lose any data on the other drive, but if the disks were in a RAID
set, then you'd have a 2-disk failure on your hands, and unless it was
RAID-6, then you'd be shafted!
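The exposure described above can be checked against the drive layout from the first message. A rough Python sketch, assuming the classic Linux IDE naming in which each channel carries two names (hda/hdb on ide0, hdc/hdd on ide1, and so on); the helper names are made up for illustration.

```python
from collections import defaultdict


def ide_channel(dev: str) -> int:
    """Map a legacy IDE name to its channel: hda/hdb -> 0, hdc/hdd -> 1, ..."""
    return (ord(dev[2]) - ord("a")) // 2


def shared_cables(devices):
    """Return {channel: [devices]} for channels carrying more than one array member."""
    by_channel = defaultdict(list)
    for dev in devices:
        by_channel[ide_channel(dev)].append(dev)
    return {ch: devs for ch, devs in by_channel.items() if len(devs) > 1}


# Array members as listed in the first message of the thread.
members = ["hdb", "hdc", "hdd", "hde", "hdf", "hdg", "hdh", "hdk"]
tolerated = 1  # RAID-5 survives one missing member; RAID-6 would survive two

for ch, devs in sorted(shared_cables(members).items()):
    verdict = "exceeds" if len(devs) > tolerated else "is within"
    print(f"ide{ch}: {' '.join(devs)} -> losing this cable {verdict} the tolerance")
```

For the posted layout this flags ide1, ide2 and ide3, each carrying two members of the RAID-5 set.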
I was about to suggest running with noapic, but you've already tried
that... You might also want to see if the motherboard has a way to turn
the APIC off too. I had that on one Athlon mobo, and running it all in PIC
mode kept it going much better.
Also experiment with PCI slot locations, although with 4 PCI cards, you
might not have much luck, depending on your motherboard. Try to get each
card on its own interrupt if at all possible. Some BIOSes can help fix
this for you, some can't....
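Whether cards already have their own interrupt can be read off /proc/interrupts. A minimal Python sketch that lists the IRQ lines shared by more than one device; it assumes the single-CPU column layout posted earlier in the thread (one count column per line), and the function name is hypothetical.

```python
def shared_irqs(text: str):
    """Return {irq: [devices]} for IRQ lines that list more than one device."""
    shared = {}
    for line in text.splitlines():
        head, sep, rest = line.partition(":")
        if not sep or not head.strip().isdigit():
            continue  # skip the CPU header and the NMI/ERR counters
        fields = rest.split(None, 2)  # count, controller type, device list
        if len(fields) < 3:
            continue
        devices = [d.strip() for d in fields[2].split(",")]
        if len(devices) > 1:
            shared[int(head)] = devices
    return shared


# The /proc/interrupts output posted in the first message.
sample = """\
           CPU0
  0:    6621371          XT-PIC  timer
  1:          8          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  3:    1487791          XT-PIC  eth1
 10:    1628242          XT-PIC  eth0, eth2
 11:     112644          XT-PIC  ide2, ide3
 12:      35197          XT-PIC  ide5
 14:      71092          XT-PIC  ide0
 15:      63376          XT-PIC  ide1
NMI:          0
ERR:      40328
"""

for irq, devices in sorted(shared_irqs(sample).items()):
    print(f"IRQ {irq} is shared by: {', '.join(devices)}")
```

On the posted output this reports IRQ 10 (eth0, eth2) and IRQ 11 (ide2, ide3). On a machine with several CPU columns the parsing would need adjusting.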
Gordon
* Re: Huge mdadm resync problem.
2005-02-17 14:40 ` Gordon Henderson
@ 2005-02-17 14:57 ` Phantazm
0 siblings, 0 replies; 6+ messages in thread
From: Phantazm @ 2005-02-17 14:57 UTC
To: linux-raid
It might be worth a try. I have 6 PCI slots. As far as I know today, my mobo
shares an IRQ between PCI slots 5 and 2; it looks like I can't get around that.
I have disabled APIC on the mobo too.
Going to see if I can get 2 more ATA133 cards, then, and see if I have more
luck :)
Thanks for your answers and suggestions.