* Hard drives shutting themselves off in RAID mode
@ 2006-06-13 21:53 Tom Wirschell
[not found] ` <62b0912f0606140419s60c30535p bcc97c30ef99c50d@mail.gmail.com>
` (3 more replies)
0 siblings, 4 replies; 21+ messages in thread
From: Tom Wirschell @ 2006-06-13 21:53 UTC (permalink / raw)
To: dm-devel
I'm trying to set up a poor man's RAID5 array that uses 11 200 GB Western
Digital hard disks. Two of them are the PATA Caviar SE 2000JB drives and
the other ten are SATA Caviar 2000JD drives.
Both PATA and 2 of the SATA drives are connected to the mainboard, an
ASUS PSCH-L with an Intel E7210+6300ESB chipset. The other drives were
previously connected to 2 Promise FastTrak S150 TX4's which I've since
replaced in favor of the 8-port SuperMicro AOC-SAT2-MV8 card in the
hopes of fixing the issue I'm having, but to no avail.
I want to create a RAID5 array of these drives. Unfortunately after a
varying amount of time of moderate use (though never more than 24 hours)
one of the drives not connected to the 6300ESB just out of the blue
shuts itself down, eventually followed by another at which point the
array is dead.
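Since it is the second failure that kills the array, it may help to notice the first one early. A minimal sketch, assuming the usual /proc/mdstat layout in which each array's status line ends in a bracketed member map such as [UUUUUUUU_U], with an underscore marking a missing member. The function name degraded_arrays is made up for illustration:

```sh
#!/bin/sh
# Print the name of every md array whose member map contains "_",
# i.e. an array that is running with at least one member missing.
# Reads mdstat-formatted text on stdin.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                 # remember the current array
        /\[[U_]+\]$/ && $NF ~ /_/ { print name }    # member map with a hole in it
    '
}

# Usage (not run here): degraded_arrays < /proc/mdstat
```

Run from cron or a loop, this could trigger a mail or a beep as soon as the first member drops out.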
When the drive shuts down I can hear the familiar click from the drive
cutting its power, and after a bit the following gets logged:
ata9: command timeout
when using the Promise controllers. The machine locks hard at this
point. With the SuperMicro card the machine remains usable, but the
drives are never to be heard from again. The following is logged:
ata14: no device found (phy stat 00000000)
sd 13:0:0:0: SCSI error: return code = 0x40000
end_request: I/O error, dev sdi, sector 390716676
raid5: Disk failure on sdi2, disabling device.
Pretty much every time it's a different disk, and I'm unable to revive
that disk without a reboot.
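To check whether it really is a different disk every time, the md failure lines can be tallied from saved kernel logs. A minimal sketch, assuming the message format quoted above; failed_members is a hypothetical helper name, and the log path in the usage comment is only an example:

```sh
#!/bin/sh
# Tally md "Disk failure" messages per member device.
# Reads kernel log text on stdin and prints "<device> <count>" lines.
failed_members() {
    sed -n 's/.*Disk failure on \([^,]*\),.*/\1/p' |
        awk '{ seen[$0]++ } END { for (d in seen) print d, seen[d] }' |
        sort
}

# Usage (example path, adjust to wherever syslog keeps kernel messages):
#   failed_members < /var/log/messages
```

If one device or one controller port dominates the tally, that would point at a specific drive, cable or port rather than at the RAID layer.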
I brought this issue to the attention of some WD support people who're
basically telling me that the RAID software is impatient. These being
desktop drives, they're not particularly fast (which I don't need them
to be) and not equally fast either, hovering between 20 and 30 MB/s
for writing. Haven't tried to measure reading yet.
When I mount the drives as separate partitions I can play with them to
my heart's content. As a test I filled up 5 drives, copied the data to
the other 5 drives (I'm using the 11th drive, a PATA one, for Linux
itself ATM) and vice versa. As I'm writing this I'm running Bonnie++ in
parallel on these partitions and so far everything's solid as a rock.
Besides the Promise controllers I've replaced the power supply (500W
HuntKey to a 550W Antec TruePower II), all SATA data cables, all SATA
power cables...
I've tried striping instead of RAID5 but that didn't help either.
To the best of my ability I've ruled out hardware faults. The only
thing I can think of now is that the RAID5 module, for whatever reason,
is _telling_ the drive to shut down, but I can't imagine that happening
without some serious logging going on.
Hopefully someone on this list can help me get this problem sorted?
When I was using the Promise controllers I was using version
2.6.11.12, and later 2.6.16.14 of the kernel. When I switched to the
SuperMicro card I had to upgrade to 2.6.17-rc5.
Any suggestions would be greatly appreciated.
Kind regards,
Tom Wirschell
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Hard drives shutting themselves off in RAID mode
2006-06-13 21:53 Hard drives shutting themselves off in RAID mode Tom Wirschell
[not found] ` <62b0912f0606140419s60c30535p bcc97c30ef99c50d@mail.gmail.com>
@ 2006-06-14 3:58 ` Arno Wagner
2006-06-14 7:42 ` Tom Wirschell
2006-06-14 11:19 ` Molle Bestefich
2006-06-23 11:27 ` Molle Bestefich
3 siblings, 1 reply; 21+ messages in thread
From: Arno Wagner @ 2006-06-14 3:58 UTC (permalink / raw)
To: device-mapper development
WD drives are misdesigned in such a way that they sometimes take
very long to respond to commands. I think it is a quality issue.
WD itself has "RAID ready" drives that don't do this.
I have run TB sized arrays of Seagate and Maxtor drives with
Linux software RAID for years now and never, ever had this issue.
My advice is to dump the WD drives and get others.
Arno
--
Arno Wagner, Dipl. Inform., CISSP --- CSG, ETH Zurich, wagner@tik.ee.ethz.ch
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
Windows is the "under-3" toy of the OS world. -- Matthew D. Fuller
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Hard drives shutting themselves off in RAID mode
2006-06-14 3:58 ` Arno Wagner
@ 2006-06-14 7:42 ` Tom Wirschell
0 siblings, 0 replies; 21+ messages in thread
From: Tom Wirschell @ 2006-06-14 7:42 UTC (permalink / raw)
To: dm-devel
On 14 Jun 2006, Arno Wagner wrote:
>
> WD drives are misdesigned in such a way that they sometimes take
> very long to respond to commands. I think it is a quality issue.
> WD itself has "RAID ready" drives that don't do this.
>
> I have run TB sized arrays of Seagate and Maxtor drives with
> Linux software RAID for years now and never, ever had this issue.
> My advice is to dump the WD drives and get others.
I was _really_ hoping it wouldn't have to come to that.
But I honestly don't get it. If the drives behave when accessed
individually, what is it that software raid does so differently that
makes them act like this?
Kind regards,
Tom Wirschell
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Hard drives shutting themselves off in RAID mode
2006-06-13 21:53 Hard drives shutting themselves off in RAID mode Tom Wirschell
[not found] ` <62b0912f0606140419s60c30535p bcc97c30ef99c50d@mail.gmail.com>
2006-06-14 3:58 ` Arno Wagner
@ 2006-06-14 11:19 ` Molle Bestefich
2006-06-14 16:00 ` Rune Saetre
2006-06-23 11:27 ` Molle Bestefich
3 siblings, 1 reply; 21+ messages in thread
From: Molle Bestefich @ 2006-06-14 11:19 UTC (permalink / raw)
To: device-mapper development
Tom Wirschell wrote:
> I want to create a RAID5 array of these drives. Unfortunately after a
> varying amount of time of moderate use (though never more than 24 hours)
> one of the drives not connected to the 6300ESB just out of the blue
> shuts itself down, eventually followed by another at which point the
> array is dead.
>
> When the drive shuts down I can hear the familiar click from the drive
> cutting its power, and after a bit the following gets logged:
Usually a 'click' just means that the drive is recalibrating because
it has failed to read a sector/track.
You are sure that it's shutting down?
> ata9: command timeout
Ugly.
Does the drive's SMART log say anything interesting?
> when using the Promise controllers. The machine locks hard at this
> point. With the SuperMicro card the machine remains usable, but the
> drives are never to be heard from again.
Bug?
Report it to the Promise maintainer?
> The following is logged:
>
> ata14: no device found (phy stat 00000000)
> sd 13:0:0:0: SCSI error: return code = 0x40000
> end_request: I/O error, dev sdi, sector 390716676
> raid5: Disk failure on sdi2, disabling device.
>
> Pretty much every time it's a different disk,
> and I'm unable to revive that disk without a reboot.
Have you tried poking the IDE driver to reset the bus, might get it
running again?
Not a very pretty solution, especially since you might still suffer
two drives going down at once from time to time. Maybe you can patch
MD to pause the array and poke the IDE driver whenever a disk is lost?
Then you would at least only have intermittent failures / timeouts on
a rare basis rather than a non-redundant array when something happens.
> I brought this issue to the attention of some WD support people who're
> basically telling me that the RAID software is impatient.
If the disk never comes up, being patient surely won't help.
Wait for an hour and see if the drive comes up, ask the WD folks
exactly how patient they want you to be? :-)
> When I mount the drives as separate partitions I can play with them to
> my heart's content. As a test I filled up 5 drives, copied the data to
> the other 5 drives (I'm using the 11th drive, a PATA one, for Linux
> itself ATM) and vice versa. As I'm writing this I'm running Bonnie++ in
> parallel on these partitions and so far everything's solid as a rock.
Bizarre!...
An idea that will take some amount of work, don't know if it's feasible:
Patch the IDE driver to log everything it does in a ring buffer in memory.
When a drive is lost, dump the buffer contents to disk so you can see
what happened, perhaps even try and reproduce it.
Perhaps the WD folks could even take a look at it..
> To the best of my ability I've ruled out hardware faults. The only
> thing I can think of now is that the RAID5 module, for whatever reason,
> is _telling_ the drive to shutdown, but I can't imagine that happening
> without some serious logging going on.
bonnie++ does random seeks, right?
> Hopefully someone on this list can help me get this problem sorted?
Sorry :-)...
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Hard drives shutting themselves off in RAID mode
2006-06-14 11:19 ` Molle Bestefich
@ 2006-06-14 16:00 ` Rune Saetre
2006-06-14 19:52 ` Tom Wirschell
0 siblings, 1 reply; 21+ messages in thread
From: Rune Saetre @ 2006-06-14 16:00 UTC (permalink / raw)
To: device-mapper development
Hi
I always thought the loud click came from the disks parking their heads
before spinning down.
Anyway, it can take several seconds before a disk responds to commands
after having spun down. Often the bus/drive must be reset and the commands
reissued several times before the disk responds. While this isn't a big
problem when running a filesystem or lvm directly on the disks I suspect
this would result in the raid5 module marking the disk as dead.
Can you stop the disks from spinning down using hdparm or similar?
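One way to try this, as a sketch: hdparm's -S option sets the drive's standby (spin-down) timeout, and a value of 0 disables it. Whether these particular WD drives honour the setting is an assumption. The HDPARM variable is a hypothetical hook so the loop can be dry-run with echo first.

```sh
#!/bin/sh
# Ask each listed drive to disable its standby (spin-down) timer.
# HDPARM defaults to the real hdparm; set HDPARM=echo for a dry run.
disable_spindown() {
    for dev in "$@"; do
        ${HDPARM:-hdparm} -S 0 "$dev" ||
            echo "could not set standby timeout on $dev" >&2
    done
}

# Usage (hypothetical device names, needs root):
#   disable_spindown /dev/sdc /dev/sdd
```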
If not, maybe you can access the disks frequently to prevent them from
spinning down. You'll probably have to access them individually, since not
all disks are used when reading/writing a small amount of data.
Something like this in root's crontab might do the trick:
* * * * * /bin/dd if=/dev/sda of=/dev/null bs=512 count=1 >/dev/null 2>&1
* * * * * /bin/dd if=/dev/sdb of=/dev/null bs=512 count=1 >/dev/null 2>&1
* * * * * /bin/dd if=/dev/sdc of=/dev/null bs=512 count=1 >/dev/null 2>&1
.
.
.
This reads the first block of each disk and discards it, every minute.
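The same idea can be written as one small script, so that a single crontab entry covers all disks and a failed read is reported rather than silently discarded. This is only a sketch: touch_disks is a made-up name and the device names in the usage comment are placeholders. One caveat: a repeated read of the same block may be served from the kernel's cache without reaching the drive; GNU dd's iflag=direct asks for an uncached read, at the cost of portability.

```sh
#!/bin/sh
# Read one 512-byte block from each device given as an argument and
# discard it. Prints a line for every device that could not be read
# and returns non-zero if any read failed.
touch_disks() {
    status=0
    for dev in "$@"; do
        if ! dd if="$dev" of=/dev/null bs=512 count=1 >/dev/null 2>&1; then
            echo "read failed: $dev"
            status=1
        fi
    done
    return "$status"
}

# Usage (placeholder device names):
#   touch_disks /dev/sda /dev/sdb /dev/sdc
```

Because cron mails any output to the owner of the crontab, a drive that stops answering would show up as a "read failed" mail within a minute.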
Regards
Rune
---
Rune Sætre <rune.saetre@netcom-gsm.no>
NetCom as
..
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Hard drives shutting themselves off in RAID mode
2006-06-14 16:00 ` Rune Saetre
@ 2006-06-14 19:52 ` Tom Wirschell
2006-06-15 20:29 ` Greg Freemyer
2006-06-16 18:21 ` Molle Bestefich
0 siblings, 2 replies; 21+ messages in thread
From: Tom Wirschell @ 2006-06-14 19:52 UTC (permalink / raw)
To: dm-devel
On 14 Jun 2006, Rune Saetre wrote:
>
> I always thought the loud click came from the disks parking their
> heads before spinning down.
Well, it's most certainly loud. The same type of loud that you get when
the machine shuts down and removes the power from the drives. I thought
recalibration ticks weren't particularly loud.
> Anyway, it can take several seconds before a disk responds to
> commands after having spun down.
The problem isn't that it takes time to come back up after a spin down.
The drive isn't spinning down. It's turning itself off completely
(note the 'no device found' bit in the error). And it does this while
it's actively being used.
> On Wed, 14 Jun 2006, Molle Bestefich wrote:
> >
> > Does the drive's SMART log say anything interesting?
That's a damned good question. I didn't even know you could query that,
so I just recreated the array and started my test again. Took about 90
minutes for one of the drives to die. Unfortunately when it dies it
refuses to respond to anything.
When I try the smartctl program on the failed drive I get:
Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)
When I issue the exact same command for another disk on the controller
I get a nice listing that you would expect from this program.
When I use hdparm -I on the dead drive I get:
HDIO_DRIVE_CMD(identify) failed: Input/output error
And again, if I issue the exact same command for another disk on this
same controller I get a nice bit of info on the drive.
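This comparison can be repeated across all drives in one pass right after a failure, to see whether exactly one drive has vanished. A rough sketch, assuming hdparm -I fails on a dead drive as shown above; probe_drives and the PROBE override are hypothetical names:

```sh
#!/bin/sh
# Report which of the given devices still answer an identify request.
# PROBE defaults to "hdparm -I"; it can be overridden, e.g. with
# "smartctl -i", or with a harmless command for a dry run.
probe_drives() {
    for dev in "$@"; do
        if ${PROBE:-hdparm -I} "$dev" >/dev/null 2>&1; then
            echo "$dev: responds"
        else
            echo "$dev: no response"
        fi
    done
}

# Usage (placeholder device names, needs root):
#   probe_drives /dev/sdc /dev/sdd /dev/sdi
```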
To me at least, this basically says that the drive is actually turned
off at this point in time. It would explain why SMART isn't getting any
data. On the other hand, it doesn't explain *WHY* the drive is off.
Do you know any program that's capable of telling a drive that isn't on
to activate itself? I don't think it's even possible but might be
mistaken there.
So, I reboot, run smartctl again and I'm presented with a nice sheet
of output that basically says all is well, nothing ever went wrong with
this drive and you can feel safe in using it.
This royally sucks...
> > Have you tried poking the IDE driver to reset the bus, might get it
> > running again?
How would I do this? I've compiled the driver into the kernel. But if
SMART data is kept even when a drive is off, this won't fix anything.
> > Not a very pretty solution, especially since you might still suffer
> > two drives going down at once from time to time. Maybe you can
> > patch MD to pause the array and poke the IDE driver whenever a disk
> > is lost? Then you would at least only have intermittent failures /
> > timeouts on a rare basis rather than a non-redundant array when
> > something happens.
The problem is that I can't tell if it's really MD that is telling the
drive to turn itself off. Is there even code in MD that does this?
Shouldn't it complain VERY LOUDLY that it's unhappy with a drive and
thus decide to kill it?
> > If the disk never comes up, being patient surely won't help.
> > Wait for an hour and see if the drive comes up, ask the WD folks
> > exactly how patient they want you to be? :-)
The assumption was that since the drive took so long to respond, MD is
telling the drive "You know what, fuck it. Never mind those outstanding
requests, just shut down and let the rest of us get on with business",
only thereby killing the array.
> > bonnie++ does random seeks, right?
I think so, yeah.
Kind regards,
Tom Wirschell
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Hard drives shutting themselves off in RAID mode
2006-06-14 19:52 ` Tom Wirschell
@ 2006-06-15 20:29 ` Greg Freemyer
2006-06-15 21:25 ` Tom Wirschell
2006-06-16 18:21 ` Molle Bestefich
1 sibling, 1 reply; 21+ messages in thread
From: Greg Freemyer @ 2006-06-15 20:29 UTC (permalink / raw)
To: device-mapper development
Tom,
I did not review your e-mail in total, but using lots of SATA drives
in a big RAID array is not something I would attempt with 2.6.17 or
older kernels. (I know 2.6.17 is not even out yet.).
In 2.6.17-mm there is a huge SATA error handler (EH) rewrite. It is
planned to hit the stable Linus kernel with 2.6.18 towards the end of
the summer, but even then it will only have a few of the actual
drivers modified to use the EH infrastructure.
I would repost your problem to the lkml-ide list and see if they think
that the new EH should help you, and when/if your controller will be
using the new EH infrastructure.
FYI: that is linux-ide@vger.kernel.org: sata is discussed there, no
need to subscribe, they will cc you on responses.
Also, there is a ton of testing going on with the new EH, so if you're
willing to be a guinea pig, I'm sure you will get a lot of support
from the dev. team and get your specific driver updated ASAP.
HTH
Greg
--
Greg Freemyer
The Norcross Group
Forensics for the 21st Century
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Hard drives shutting themselves off in RAID mode
2006-06-15 20:29 ` Greg Freemyer
@ 2006-06-15 21:25 ` Tom Wirschell
2006-06-15 22:18 ` Greg Freemyer
2006-06-15 22:48 ` Rune Saetre
0 siblings, 2 replies; 21+ messages in thread
From: Tom Wirschell @ 2006-06-15 21:25 UTC (permalink / raw)
To: dm-devel
On 15 Jun 2006, Greg Freemyer wrote:
>
> In 2.6.17-mm there is a huge SATA error handler (EH) rewrite. It is
> planned to hit the stable Linus kernel with 2.6.18 towards the end of
> the summer, but even then it will only have a few of the actual
> drivers modified to use the EH infrastructure.
Damn. I read about this in the LWN.net Kernel pages for May 17th. I
figured that once it was in, everything would automagically be using it
too (i.e. the drivers wouldn't need to be updated to the new system).
Guess I was a tad naïve about that.
I tried 2.6.17-rc5-mm2 and ran into some BUG()s which went away when I
just stuck to 2.6.17-rc5.
> I would repost your problem to the lkml-ide list and see if they think
> that the new EH should help you, and when/if your controller will be
> using the new EH infrastructure.
Looks like a good idea, thanks!
Don't let that deter anybody on this list who might have an explanation
for this from responding, though. :)
> Also, there is a ton of testing going on with the new EH, so if you're
> willing to be a guinea pig, I'm sure you will get a lot of support
> from the dev. team and get your specific driver updated ASAP.
I've been sitting on a non-functional array for almost 3 months now.
Don't mind having weird issues a little longer if it means things will
end up working nicely in the end (or I at least get a sensible
explanation for what's wrong, so I know what I must do to fix it).
Kind regards,
Tom Wirschell
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Hard drives shutting themselves off in RAID mode
2006-06-15 21:25 ` Tom Wirschell
@ 2006-06-15 22:18 ` Greg Freemyer
2006-06-15 22:48 ` Rune Saetre
1 sibling, 0 replies; 21+ messages in thread
From: Greg Freemyer @ 2006-06-15 22:18 UTC (permalink / raw)
To: device-mapper development
On 6/15/06, Tom Wirschell <Tom@wirschell.nl> wrote:
> On 15 Jun 2006, Greg Freemyer wrote:
> >
> > In 2.6.17-mm there is a huge SATA error handler (EH) rewrite. It is
> > planned to hit the stable Linus kernel with 2.6.18 towards the end of
> > the summer, but even then it will only have a few of the actual
> > drivers modified to use the EH infrastructure.
>
> Damn. I read about this in the LWN.net Kernel pages for may 17th. I
> figured that once it was in, everything would automagically be using it
> too (i.e. the drivers wouldn't need to be updated to the new system).
> Guess I was a tad naïve about that.
> I tried 2.6.17-rc5-mm2 and ran into some BUG()s which went away when I
> just stuck to 2.6.17-rc5.
>
I guess you know that the -mm kernels get everybody's
not-ready-for-prime-time code added to them, so the BUG() you got could
have come from anywhere.
If you're comfortable with the process, I would start with 2.6.17-rc6,
then get the #upstream patch (upgrade) from libata and compile up a new
kernel. (Think thousands of lines of modified code.)
If you get any issues with it the lkml-ide devs will jump right on it I suspect.
Greg
--
Greg Freemyer
The Norcross Group
Forensics for the 21st Century
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Hard drives shutting themselves off in RAID mode
2006-06-15 21:25 ` Tom Wirschell
2006-06-15 22:18 ` Greg Freemyer
@ 2006-06-15 22:48 ` Rune Saetre
2006-06-15 22:53 ` Arno Wagner
1 sibling, 1 reply; 21+ messages in thread
From: Rune Saetre @ 2006-06-15 22:48 UTC (permalink / raw)
To: device-mapper development
Hi
Next silly question then:
It can't be the power supply not coping with a large number of disks
seeking simultaneously? If the voltage drops too much some disks might
shut down.
Probably unlikely, but it would yield a simple solution..
Rune
---
Rune Sætre <rune.saetre@netcom-gsm.no>
NetCom as
..
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Hard drives shutting themselves off in RAID mode
2006-06-15 22:48 ` Rune Saetre
@ 2006-06-15 22:53 ` Arno Wagner
0 siblings, 0 replies; 21+ messages in thread
From: Arno Wagner @ 2006-06-15 22:53 UTC (permalink / raw)
To: device-mapper development
On Fri, Jun 16, 2006 at 12:48:24AM +0200, Rune Saetre wrote:
> Hi
>
> Next silly question then:
> It can't be the power supply not coping with a large number of disks
> seeking simultaneously? If the voltage drops too much some disks might
> shut down.
>
> Probably unlikely, but it would yield a simple solution..
Unlikely but possible IMO. If you can test it with reasonable
effort and cost (e.g. put some of them on another PSU),
you might want to do that.
Arno
> Rune
>
> ---
> Rune Sætre <rune.saetre@netcom-gsm.no>
> NetCom as
> ..
>
> On Thu, 15 Jun 2006, Tom Wirschell wrote:
>
> >On 15 Jun 2006, Greg Freemyer wrote:
> >>
> >>In 2.6.17-mm there is a huge SATA error handler (EH) rewrite. It is
> >>planned to hit the stable Linus kernel with 2.6.18 towards the end of
> >>the summer, but even then it will only have a few of the actual
> >>drivers modified to use the EH infrastructure.
> >
> >Damn. I read about this in the LWN.net Kernel pages for May 17th. I
> >figured that once it was in, everything would automagically be using it
> >too (i.e. the drivers wouldn't need to be updated to the new system).
> >Guess I was a tad naïve about that.
> >I tried 2.6.17-rc5-mm2 and ran into some BUG()s which went away when I
> >just stuck to 2.6.17-rc5.
> >
> >>I would repost your problem to the lkml-ide list and see if they think
> >>that the new EH should help you, and when/if your controller will be
> >>using the new EH infrastructure.
> >
> >Looks like a good idea, thanks!
> >Don't let that deter anybody on this list who might have an explanation
> >for this from responding, though. :)
> >
> >>Also, there is a ton of testing going on with the new EH, so if you're
> >>willing to be a guinea pig, I'm sure you will get a lot of support
> >>from the dev. team and get your specific driver updated ASAP.
> >
> >I've been sitting on a non-functional array for almost 3 months now.
> >Don't mind having weird issues a little longer if it means things will
> >end up working nicely in the end (or I at least get a sensible
> >explanation for what's wrong, so I know what I must do to fix it).
> >
> >Kind regards,
> >
> >Tom Wirschell
> >
--
Arno Wagner, Dipl. Inform., CISSP --- CSG, ETH Zurich, wagner@tik.ee.ethz.ch
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
Windows is the "under-3" toy of the OS world. -- Matthew D. Fuller
* Re: Hard drives shutting themselves off in RAID mode
2006-06-14 19:52 ` Tom Wirschell
2006-06-15 20:29 ` Greg Freemyer
@ 2006-06-16 18:21 ` Molle Bestefich
2006-06-16 18:32 ` Greg Freemyer
2006-06-18 13:45 ` Rune Saetre
1 sibling, 2 replies; 21+ messages in thread
From: Molle Bestefich @ 2006-06-16 18:21 UTC (permalink / raw)
To: device-mapper development
Greg Freemyer:
> I did not review your e-mail in total, but using lots of SATA drives
> in a big RAID array is not something I would attempt with 2.6.17 or
> older kernels. (I know 2.6.17 is not even out yet.).
I've seen IDE failures on every Linux kernel I ever tried.
I hope Linux IDE will mature, some glorious day in a distant future...
Tom Wirschell:
> > Have you tried poking the IDE driver to reset the bus,
> > might get it running again?
>
> How would I do this?
Not sure how it's done with libata. Perhaps:
# cd /sys/block/sda/device/
# echo 1 > rescan
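As a rough sketch of the idea above, the per-device rescan can be wrapped in a small function, together with a host-wide scan through the SCSI host's sysfs `scan` node. The device and host names are only examples, `SYSFS_ROOT` is a hypothetical override added here so the functions can be tried against a scratch directory, and whether libata reacts usefully to either write on a given kernel is an assumption, not something confirmed in this thread.

```shell
#!/bin/sh
# Sketch: ask the SCSI layer to re-probe one disk, or a whole host.
# SYSFS_ROOT is a hypothetical override for dry runs; on a real system
# it stays at /sys and the writes need root.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

rescan_device() {
    # $1 = block device name, e.g. sdi
    node="$SYSFS_ROOT/block/$1/device/rescan"
    [ -e "$node" ] || { echo "no rescan node for $1" >&2; return 1; }
    echo 1 > "$node"
}

rescan_host() {
    # $1 = SCSI host name, e.g. host13
    # "- - -" is the wildcard for all channels, targets and LUNs.
    node="$SYSFS_ROOT/class/scsi_host/$1/scan"
    [ -e "$node" ] || { echo "no scan node for $1" >&2; return 1; }
    echo "- - -" > "$node"
}
```

Typical use would be `rescan_device sdi` followed, if that fails, by `rescan_host host13`. Neither can bring back a drive that has physically powered itself down; they only tell the kernel to look again.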
Rune Saetre wrote:
> It can't be the power supply not coping with a large number of disks
> seeking simultaneously? If the voltage drops too much some disks
> might shut down.
I was of the impression that disks suck tons of juice when they
spin up, and only 5W a piece or so at any other time. Is that right?
Of course, WD disks could be weird.
Or they could be configured to go into standby mode after some time,
requiring another spinup - but AFAIR, MD would spin them up one
after one.
Arno Wagner:
> If you can test it with reasonable effort and cost (e.g. put some of
> them on another PSU), you might want to do that.
Is that safe?
Another way could be to use a meter to gauge how much power it drains.
* Re: Hard drives shutting themselves off in RAID mode
2006-06-16 18:21 ` Molle Bestefich
@ 2006-06-16 18:32 ` Greg Freemyer
2006-06-16 22:47 ` Tom Wirschell
2006-06-17 18:11 ` Tom Wirschell
2006-06-18 13:45 ` Rune Saetre
1 sibling, 2 replies; 21+ messages in thread
From: Greg Freemyer @ 2006-06-16 18:32 UTC (permalink / raw)
To: device-mapper development
On 6/16/06, Molle Bestefich <molle.bestefich@gmail.com> wrote:
> Greg Freemyer:
> > I did not review your e-mail in total, but using lots of SATA drives
> > in a big RAID array is not something I would attempt with 2.6.17 or
> > older kernels. (I know 2.6.17 is not even out yet.).
>
> I've seen IDE failures on every Linux kernel I ever tried.
> I hope Linux IDE will mature, some glorious day in a distant future...
Based on the feedback I've seen on the mailing list, the new EH for
SATA appears to be a major improvement. The guy who wrote it (Tejun
Heo) said he had participated in static discharge tests with SATA
drives and had made an effort to catch and recover from the various
transient errors he had seen induced during those tests. I don't know
if he actually ran any static discharge tests on drives while
developing the new EH routines.
> Tom Wirschell:
> > > Have you tried poking the IDE driver to reset the bus,
> > > might get it running again?
> >
> > How would I do this?
>
> Not sure how it's done with libata. Perhaps:
> # cd /sys/block/sda/device/
> # echo 1 > rescan
>
> Rune Saetre wrote:
> > It can't be the power supply not coping with a large number of disks
> > seeking simultaneously? If the voltage drops too much some disks
> > might shut down.
>
> I was of the impression that disks suck tons of juice when they
> spin up, and only 5W a piece or so at any other time. Is that right?
>
> Of course, WD disks could be weird.
>
> Or they could be configured to go into standby mode after some time,
> requiring another spinup - but AFAIR, MD would spin them up one
> after one.
>
> Arno Wagner:
> > If you can test it with reasonable effort and cost (e.g. put some of
> > them on another PSU), you might want to do that.
>
> Is that safe?
Seems dangerous to me, but I don't know if a standard SATA cable
carries ground or not. That is normally the problem, i.e. if you have
more than one ground, you can get ground loops, and most electronics are
not designed to work with those.
The two traditional ways to handle it: one is to use differential
circuits, like RS-422 and some SCSI cables. The other is to use fibre
connections to isolate any voltage issues.
> Another way could be to use a meter to gauge how much power it drains.
Not sure that will tell you much. On the ide list I seem to recall
several posts about problems when using multiple SATA drives. Many of
the problems were resolved by addressing power issues, even though the PS
seemed plenty big. IIRC one of the things done was to not have the
drives daisy-chained off the same power cable. Search the lkml-ide
archives if you're curious.
Greg
--
Greg Freemyer
The Norcross Group
Forensics for the 21st Century
* Re: Hard drives shutting themselves off in RAID mode
2006-06-16 18:32 ` Greg Freemyer
@ 2006-06-16 22:47 ` Tom Wirschell
2006-06-17 0:35 ` Greg Freemyer
2006-06-17 18:11 ` Tom Wirschell
1 sibling, 1 reply; 21+ messages in thread
From: Tom Wirschell @ 2006-06-16 22:47 UTC (permalink / raw)
To: dm-devel
On 16 Jun 2006, Greg Freemyer wrote:
>
> Based on the feedback I've seen on the mailing list, the new EH for
> SATA appears to be a major improvement. The guy who wrote it (Tejun
> Heo) said he had participated in static discharge tests with SATA
> drives and had made an effort to catch and recover from the various
> transient errors he had seen induced during those tests. I don't know
> if he actually ran any static discharge tests on drives while
> developing the new EH routines.
Well at least I should be an interesting subject for them as I can
fairly reliably take a disk out in a rather unpleasant manner, giving
the new EH code something to work with...
> > Not sure how it's done with libata. Perhaps:
> > # cd /sys/block/sda/device/
> > # echo 1 > rescan
Interesting. I'll try that.
> > Rune Saetre wrote:
> > > It can't be the power supply not coping with a large number of
> > > disks seeking simultaneously? If the voltage drops too much some
> > > disks might shut down.
> >
> > I was of the impression that disks suck tons of juice when they
> > spin up, and only 5W a piece or so at any other time. Is that
> > right?
Staggered spin-up would be mighty nice to have, and apparently the WD
drives actually support it. Unfortunately the Promise cards do not, so
I'm not using this feature. On boot, all drives spin up at once, and
I've yet to see any part of this setup complain about that. It usually
takes about an hour or two of moderate RAID activity before a drive
drops out: specifically, scp-ing a 200GB batch of files over gigabit
ethernet at about 20 MB/s tops. If it survives that, I copy this data
onto the array until it fills up. Thus far I've never managed to reach
that point.
As for power draw:
http://westerndigital.com/en/products/Products.asp?DriveID=58#jump1616
9W a pop for the SATA version.
http://westerndigital.com/en/products/products.asp?driveid=38&language=en#jump1818
Slightly less for the 2 PATA ones.
But even if we make it 10 W each, the 12 drives only draw 120 watts.
That leaves another 430 watts of the PSU's 550 for the mobo, CPU, IO card,
a DVD drive and a bunch of fans, which should be ample. It's behind an APC
Back-UPS CS 650
which hasn't complained yet by flickering the light that says it's
getting loaded too much. Haven't tried to get any sensible data out of
it yet, but I'm quite confident power draw isn't the problem. Hell,
I've had this problem happen when there were only 6 drives inside this
machine.
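The arithmetic above can be written down as a small sketch. The inputs are the figures from this message (12 drives, 10 W each as a rounded-up datasheet value, a 550 W supply); they are nominal ratings, not measurements, and the function name is made up for illustration.

```shell
#!/bin/sh
# Rough steady-state power budget from the numbers quoted in this thread.
# Assumptions: nominal per-drive wattage and the PSU's label rating.

power_headroom() {
    # $1 = number of drives, $2 = watts per drive, $3 = PSU rating in watts
    drives=$1
    watts_per_drive=$2
    psu_watts=$3
    drive_total=$((drives * watts_per_drive))
    echo "drives=${drive_total}W remaining=$((psu_watts - drive_total))W"
}

power_headroom 12 10 550
# prints: drives=120W remaining=430W
```

This only covers the average draw. It says nothing about short peaks on a single rail or a single daisy-chained cable during spin-up or seeks, which is the scenario raised elsewhere in the thread.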
> > Or they could be configured to go into standby mode after some time,
> > requiring another spinup - but AFAIR, MD would spin them up one
> > after one.
This would only make sense if the drives weren't in use, and they most
certainly were.
> > Arno Wagner:
> > > If you can test it with reasonable effort and cost (e.g. put some
> > > of them on another PSU), you might want to do that.
> >
> > Is that safe?
>
> Seems dangerous to me, but I don't know if a standard SATA cable
> carries ground or not. That is normally the problem, i.e. if you have
> more than one ground, you can get ground loops, and most electronics are
> not designed to work with those.
>
> The two traditional ways to handle it: one is to use differential
> circuits, like RS-422 and some SCSI cables. The other is to use fibre
> connections to isolate any voltage issues.
You're losing me here, but yes, both the SATA power and data cables
carry ground. Plus the ground connectors are closer to the outside than
any other connector so they connect before any data or powerlines get a
chance to. Supposedly it's safe and I've seen images of rigs that
made ample use of this. I wouldn't call it common practice though.
You can turn on a power supply by connecting the green wire in the cord
that connects to the mobo to any ground connector. Any drive that's
connected to the PS will at that time get its juice and spin up. Or at
least it should.
> > Another way could be to use a meter to gauge how much power it
> > drains.
Yeah, I think the UPS should be able to tell me that. Lemme see if I
can hook it up to another box sometime tomorrow.
> Not sure that will tell you much. On the ide list I seem to recall
> several posts about problems when using multiple SATA drives. Many of
> the problems were resolved by addressing power issues, even though the PS
> seemed plenty big. IIRC one of the things done was to not have the
> drives daisy-chained off the same power cable. Search the lkml-ide
> archives if you're curious.
I've yet to run into a PS that comes with sufficient connectors to
power a total of 12 drives. My PS (550W Antec TruePower II) has 4 SATA
connectors coming from it, and a total of 5 wide molex connectors that
you can connect regular drives to. Every single one of those has a
splitter attached to it to give me sufficient connectors to power all
the drives as well as the fans. If anything, the load should be spread
evenly across them, and on the off chance that the splitters were the
flaky part, I did in fact replace them but found no change in behaviour.
The replacements were regular molex splitters, I might add. The old ones were
molex-to-2-SATA-power and the new ones were molex-to-2-molex. But like
I said, no difference.
I honestly can't find any fault in the hardware, and no logical
explanation for the software deciding to do this. Anyways, I'll ask the
IDE guys. If some kind soul could inform me of the location of the
latest version of their #upstream patch, I'd be mighty grateful. I
didn't see any mention of it in the mailinglist archives for this month.
Kind regards,
Tom Wirschell
* Re: Hard drives shutting themselves off in RAID mode
2006-06-16 22:47 ` Tom Wirschell
@ 2006-06-17 0:35 ` Greg Freemyer
0 siblings, 0 replies; 21+ messages in thread
From: Greg Freemyer @ 2006-06-17 0:35 UTC (permalink / raw)
To: device-mapper development
> I honestly can't find any fault in the hardware, and no logical
> explanation for the software deciding to do this. Anyways, I'll ask the
> IDE guys. If some kind soul could inform me of the location of the
> latest version of their #upstream patch, I'd be mighty grateful. I
> didn't see any mention of it in the mailinglist archives for this month.
>
I suspect you're looking in the wrong archives.
Quoting a 3-week old post from the linux-ide list:
The 'upstream' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev.git
contains the following updates (queued for 2.6.18):
drivers/ide/pci/amd74xx.c | 7
drivers/scsi/Makefile | 2
drivers/scsi/ahci.c | 436 +++---
drivers/scsi/ata_piix.c | 25
drivers/scsi/libata-bmdma.c | 143 ++
drivers/scsi/libata-core.c | 2525 ++++++++++++++++++++++++--------------
drivers/scsi/libata-eh.c | 1561 +++++++++++++++++++++++
drivers/scsi/libata-scsi.c | 408 +++---
drivers/scsi/libata.h | 24
drivers/scsi/pdc_adma.c | 10
drivers/scsi/sata_mv.c | 70 -
drivers/scsi/sata_nv.c | 13
drivers/scsi/sata_promise.c | 39
drivers/scsi/sata_qstor.c | 14
drivers/scsi/sata_sil.c | 66
drivers/scsi/sata_sil24.c | 615 +++++----
drivers/scsi/sata_sis.c | 3
drivers/scsi/sata_svw.c | 5
--
Greg Freemyer
The Norcross Group
Forensics for the 21st Century
* Re: Hard drives shutting themselves off in RAID mode
2006-06-16 18:32 ` Greg Freemyer
2006-06-16 22:47 ` Tom Wirschell
@ 2006-06-17 18:11 ` Tom Wirschell
1 sibling, 0 replies; 21+ messages in thread
From: Tom Wirschell @ 2006-06-17 18:11 UTC (permalink / raw)
To: device-mapper development
On 6/16/06, Molle Bestefich <molle.bestefich@gmail.com> wrote:
>
> Not sure how it's done with libata. Perhaps:
> # cd /sys/block/sda/device/
> # echo 1 > rescan
Just tried it. Fails.
Basically it complains that the drive is off.
Kind regards,
Tom Wirschell
* Re: Hard drives shutting themselves off in RAID mode
2006-06-16 18:21 ` Molle Bestefich
2006-06-16 18:32 ` Greg Freemyer
@ 2006-06-18 13:45 ` Rune Saetre
2006-06-18 15:13 ` Greg Freemyer
2006-06-18 18:50 ` Arno Wagner
1 sibling, 2 replies; 21+ messages in thread
From: Rune Saetre @ 2006-06-18 13:45 UTC (permalink / raw)
To: device-mapper development
[-- Attachment #1: Type: TEXT/PLAIN, Size: 845 bytes --]
Hi
> Arno Wagner:
>> If you can test it with reasonable effort and cost (e.g. put some of
>> them on another PSU), you might want to do that.
>
> Is that safe?
Don't know how safe it is to use different PSU's, but I have done it with
success several times on old hardware used for testing. I can't think of
any reason you should run into trouble if ground is common, and the disks
are powered up first.
> Another way could be to use a meter to gauge how much power it drains.
You better use an oscilloscope if anything, as you are looking for very
short transients. I have heard (but never tested) that the disks pull
large currents when moving the heads, but with seek times of some
milliseconds you wouldn't even see it on a regular amperemeter.
Rune
---
Rune Sætre <rune.saetre@netcom-gsm.no>
NetCom as
..
* Re: Hard drives shutting themselves off in RAID mode
2006-06-18 13:45 ` Rune Saetre
@ 2006-06-18 15:13 ` Greg Freemyer
2006-06-18 15:33 ` Rune Saetre
2006-06-18 18:50 ` Arno Wagner
1 sibling, 1 reply; 21+ messages in thread
From: Greg Freemyer @ 2006-06-18 15:13 UTC (permalink / raw)
To: device-mapper development
On 6/18/06, Rune Saetre <rune.saetre@netcom-gsm.no> wrote:
> Hi
>
> > Arno Wagner:
> >> If you can test it with reasonable effort and cost (e.g. put some of
> >> them on another PSU), you might want to do that.
> >
> > Is that safe?
>
> Don't know how safe it is to use different PSU's, but I have done it with
> success several times on old hardware used for testing. I can't think of
> any reason you should run into trouble if ground is common, and the disks
> are powered up first.
It's the common ground part that can be challenging. I'm not an expert
on PC power supplies, but in general power supplies have a floating
ground, so if you have two of them plugged into the same AC power, you
will still not have a common ground on the DC side.
Greg
--
Greg Freemyer
The Norcross Group
Forensics for the 21st Century
* Re: Hard drives shutting themselves off in RAID mode
2006-06-18 15:13 ` Greg Freemyer
@ 2006-06-18 15:33 ` Rune Saetre
0 siblings, 0 replies; 21+ messages in thread
From: Rune Saetre @ 2006-06-18 15:33 UTC (permalink / raw)
To: device-mapper development
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1498 bytes --]
Hi
In PCs, ground is usually connected to the box itself. To be sure, the
grounds of the two PSUs can simply be connected to each other.
I have never experienced problems with ground loops in PCs. In PC's you
have all kinds of ground loops, as the same ground is usually connected to
signal leads, power supply leads and the chassis. As long as the cables
are kept short this shouldn't cause problems.
Rune
---
Rune Sætre <rune.saetre@netcom-gsm.no>
NetCom as
..
On Sun, 18 Jun 2006, Greg Freemyer wrote:
> On 6/18/06, Rune Saetre <rune.saetre@netcom-gsm.no> wrote:
>> Hi
>>
>> > Arno Wagner:
>> >> If you can test it with reasonable effort and cost (e.g. put some of
>> >> them on another PSU), you might want to do that.
>> >
>> > Is that safe?
>>
>> Don't know how safe it is to use different PSU's, but I have done it with
>> success several times on old hardware used for testing. I can't think of
>> any reason you should run into trouble if ground is common, and the disks
>> are powered up first.
>
> It's the common ground part that can be challenging. I'm not an expert
> on PC power supplies, but in general power supplies have a floating
> ground, so if you have two of them plugged into the same AC power, you
> will still not have a common ground on the DC side.
>
> Greg
> --
> Greg Freemyer
> The Norcross Group
> Forensics for the 21st Century
>
* Re: Hard drives shutting themselves off in RAID mode
2006-06-18 13:45 ` Rune Saetre
2006-06-18 15:13 ` Greg Freemyer
@ 2006-06-18 18:50 ` Arno Wagner
1 sibling, 0 replies; 21+ messages in thread
From: Arno Wagner @ 2006-06-18 18:50 UTC (permalink / raw)
To: device-mapper development
On Sun, Jun 18, 2006 at 03:45:36PM +0200, Rune Saetre wrote:
> Hi
>
> >Arno Wagner:
> >>If you can test it with reasonable effort and cost (e.g. put some of
> >>them on another PSU), you might want to do that.
> >
> >Is that safe?
>
> Don't know how safe it is to use different PSU's, but I have done it with
> success several times on old hardware used for testing. I can't think of
> any reason you should run into trouble if ground is common, and the disks
> are powered up first.
It is pretty safe, as long as you plug both PSUs into the same
outlet and the outlet is grounded. Also powering up may be a bit
difficult, since the power-up signal comes from the mainboard today.
Better solution: Get a PSU with more power reserves and try with that.
From the time you have already sunk into this problem, I gather
paying for a, say, 500W PSU would not be too bad.
> >Another way could be to use a meter to gauge how much power it drains.
>
> You better use an oscilloscope if anything, as you are looking for very
> short transients. I have heard (but never tested) that the disks pull
> large currents when moving the heads, but with seek times of some
> milliseconds you wouldn't even see it on a regular amperemeter.
Nor a regular oscilloscope. Would need to be the storage kind.
Expensive, unless you can borrow it.
Arno
--
Arno Wagner, Dipl. Inform., CISSP --- CSG, ETH Zurich, wagner@tik.ee.ethz.ch
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
Windows is the "under-3" toy of the OS world. -- Matthew D. Fuller
* Re: Hard drives shutting themselves off in RAID mode
2006-06-13 21:53 Hard drives shutting themselves off in RAID mode Tom Wirschell
` (2 preceding siblings ...)
2006-06-14 11:19 ` Molle Bestefich
@ 2006-06-23 11:27 ` Molle Bestefich
3 siblings, 0 replies; 21+ messages in thread
From: Molle Bestefich @ 2006-06-23 11:27 UTC (permalink / raw)
To: dm-devel; +Cc: dm-devel
Forwarding a comment from Ricky Beam over on linux-raid:
"
Where "for some reason" == HEAT. I've seen Maxtor, Seagate, AND Western
Digital drives all shutdown when they get too hot -- so hot you cannot
touch them. I know this all too well because Dell is too stupid or lazy
to design their cases with proper ventilation over the drives; one drive
simply gets hot, two drives get hot enough to discolor their plastic
drive sleds.
Unless you're talking about little laptop drives, hard drives need active
cooling. A few CFM is usually enough. A LOT of people underestimate
the cooling needs of their drives. (and sadly that includes far too many
manufacturers of IDE/SATA drive cages.)
"
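One way to check the heat theory, sketched under assumptions: if smartmontools is installed and the drives report SMART attribute 194, the temperature can be read from the attribute table printed by `smartctl -A`. The function below only parses that table from stdin. It assumes the usual column layout, where the raw value is the tenth field, and the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: extract the drive temperature from `smartctl -A` style output.
# Assumes the standard attribute table layout; not every drive reports
# attribute 194, in which case the function returns non-zero.

drive_temp_from_smart() {
    # Reads the attribute table on stdin, prints the raw value of ID 194.
    awk '$1 == 194 { print $10; found = 1; exit }
         END { if (!found) exit 1 }'
}
```

It would be used as `smartctl -A /dev/sdi | drive_temp_from_smart`, for example in a loop over all array members while the copy is running, to see whether the drive that drops out was also the hottest one.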
end of thread, other threads:[~2006-06-23 11:27 UTC | newest]
Thread overview: 21+ messages
2006-06-13 21:53 Hard drives shutting themselves off in RAID mode Tom Wirschell
[not found] ` <62b0912f0606140419s60c30535p bcc97c30ef99c50d@mail.gmail.com>
2006-06-14 3:58 ` Arno Wagner
2006-06-14 7:42 ` Tom Wirschell
2006-06-14 11:19 ` Molle Bestefich
2006-06-14 16:00 ` Rune Saetre
2006-06-14 19:52 ` Tom Wirschell
2006-06-15 20:29 ` Greg Freemyer
2006-06-15 21:25 ` Tom Wirschell
2006-06-15 22:18 ` Greg Freemyer
2006-06-15 22:48 ` Rune Saetre
2006-06-15 22:53 ` Arno Wagner
2006-06-16 18:21 ` Molle Bestefich
2006-06-16 18:32 ` Greg Freemyer
2006-06-16 22:47 ` Tom Wirschell
2006-06-17 0:35 ` Greg Freemyer
2006-06-17 18:11 ` Tom Wirschell
2006-06-18 13:45 ` Rune Saetre
2006-06-18 15:13 ` Greg Freemyer
2006-06-18 15:33 ` Rune Saetre
2006-06-18 18:50 ` Arno Wagner
2006-06-23 11:27 ` Molle Bestefich