* How to prefer some devices over others in raid
  From: Tomas M
  Date: 2013-12-31 14:42 UTC
  To: linux-raid

I'm using software RAID 5 (striping with one parity drive).

Is there a way to force the raid array to MOSTLY ignore one of the drives?

Let me explain. If a drive is failing (it is very slow, has errors, etc.),
I would still prefer to keep it in the array and simply COPY all data from
the array somewhere else, instead of risking a degraded array by removing
the failing drive and then having another one die at the same time.

Example:
- 4 drives in RAID 5
- one drive is slow; let's call it DISK1
- copying all data from the array is very slow because the array still
  reads from DISK1, even though it could IGNORE it and COMPUTE the data
  from the other three drives

The filesystem on the array still needs to be mounted read-write, though,
since there may be some changes.

Is there any mdadm parameter or option that would make the array IGNORE a
given disk on reads (since the data can be computed from the other
drives), while still NOT IGNORING the disk on writes?

If I could tell the array to ignore the failing slow drive on reads, I
would copy the data out 100 times faster while still keeping the array in
sync. And if ANOTHER drive does die in the meantime, having the data 100
times slower is still better than not having it at all. I hope you
understand me :)

Thanks!

Tomas M

* Re: How to prefer some devices over others in raid
  From: Stan Hoeppner
  Date: 2013-12-31 15:17 UTC
  To: Tomas M, linux-raid

On 12/31/2013 8:42 AM, Tomas M wrote:
> Is there any mdadm parameter or option that would make the array IGNORE a
> given disk on reads (since the data can be computed from the other
> drives), while still NOT IGNORING the disk on writes?
> [...]

The option you request is not available, to the best of my knowledge. So...

Fail the drive. Copy all the data off the array. Zero the drive and add it
back. Resync.

But while you're at it, why not simply replace the flaky drive before the
resync? Why would you intentionally keep a known-to-be-failing drive in
the array?

-- 
Stan
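
For reference, a minimal shell sketch of that fail / copy / re-add
sequence. The array name /dev/md0, the member /dev/sdd1, and the mount
points are assumptions, not details taken from this thread:

    # Mark the suspect member faulty and pull it from the (now degraded) array.
    mdadm /dev/md0 --fail /dev/sdd1
    mdadm /dev/md0 --remove /dev/sdd1

    # Copy everything off the degraded -- but no longer slow -- array.
    rsync -a /mnt/array/ /mnt/backup/

    # Wipe the old md superblock (the minimal form of "zero the drive"),
    # then re-add the drive (or a replacement), which triggers a full resync.
    mdadm --zero-superblock /dev/sdd1
    mdadm /dev/md0 --add /dev/sdd1

    # Watch resync progress.
    cat /proc/mdstat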

* Re: How to prefer some devices over others in raid
  From: Tomas M
  Date: 2014-01-01 6:50 UTC
  To: stan; Cc: linux-raid

My main concern is that I'm not always 100% sure that a particular drive
is the root cause of the slow read performance, so failing the wrong one
would have terrible consequences. On the other hand, if it were possible
to say "read the data as if one drive had failed, but write to the array
as if all drives are ok", then I could actually find out whether the
copying speed returns to normal.

Maybe I should write a kernel patch to try this out. Any suggestions
where to start?

Tomas M

* Re: How to prefer some devices over others in raid
  From: NeilBrown
  Date: 2014-01-01 7:05 UTC
  To: Tomas M; Cc: stan, linux-raid

On Wed, 1 Jan 2014 07:50:24 +0100 Tomas M <tomas@slax.org> wrote:

> My main concern is that I'm not always 100% sure that a particular drive
> is the root cause of the slow read performance, so failing the wrong one
> would have terrible consequences.
> [...]
> Maybe I should write a kernel patch to try this out. Any suggestions
> where to start?

raid1.c already has functionality like this, referred to as "write-mostly".

To add it to raid5.c you would need to modify fetch_block(). It sets
Wantread or Wantcompute to indicate that the block should be read, or
computed from the others (which is only allowed if all the others are
valid already).

So in the case where one device is write-mostly and no other devices are
faulty, it should set Wantread on all those other devices. Once they have
been read, fetch_block() will be called again and the current code will
notice that setting Wantcompute is sufficient.

NeilBrown
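
For context, the RAID1 "write-mostly" behaviour Neil refers to already has
user-facing controls; a short sketch of how it is driven today (device
names are placeholders, and this only applies to RAID1 members -- the
RAID5 case is exactly what would need the fetch_block() change):

    # mdadm: flag a member as write-mostly when adding it to a RAID1 array.
    mdadm /dev/md1 --add --write-mostly /dev/sdd1

    # Or toggle the flag at runtime through the md sysfs state attribute.
    echo writemostly  > /sys/block/md1/md/dev-sdd1/state
    echo -writemostly > /sys/block/md1/md/dev-sdd1/state    # clear it again

    # Write-mostly members show up with a "(W)" marker in /proc/mdstat.
    cat /proc/mdstat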

* Re: How to prefer some devices over others in raid
  From: Stan Hoeppner
  Date: 2014-01-01 16:49 UTC
  To: Tomas M; Cc: linux-raid

On 1/1/2014 12:50 AM, Tomas M wrote:
> My main concern is that I'm not always 100% sure that a particular drive
> is the root cause of the slow read performance, so failing the wrong one
> would have terrible consequences. On the other hand, if it were possible
> to say "read the data as if one drive had failed, but write to the array
> as if all drives are ok", then I could actually find out whether the
> copying speed returns to normal.

So you want to monkey with the way RAID works just to figure out whether
one of your drives is flaky and causing slow IO? Tools already exist to
identify such things.

"iostat -x" shows per-drive latency in the 'await' column.

"smartctl -A /dev/device" gives you Raw_Read_Error_Rate and
Seek_Error_Rate. High counts in the RAW_VALUE column for either indicate
the drive is re-reading sectors and applying extra DSP to figure out the
magnetism of the bits.

Comparing such data amongst your array member drives should pretty quickly
tell you whether one has a problem. If none do, you need to look elsewhere
for the problem.

Your initial post suggested you knew which drive was flaky. Now you
indicate you don't know which, if any, is flaky. This suggests you have no
idea why your array is slow.

The first place to look when IO suddenly slows down is the filesystem.
Nearly full filesystems scatter extents into whatever free space is
available when writing new files or appending to files. When you read
these back you get far more seeking than normal and everything slows down,
dramatically.

Investigate this possibility, and the things above, before attempting to
rewrite raid5.c, which will inevitably cause you more problems than it
solves.

-- 
Stan
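
A quick sketch of that comparison across the four members (the drive names
sda through sdd are assumptions -- substitute the actual members of the
array):

    # Per-drive latency: an outlier in the "await" column points at the
    # slow member.
    iostat -x 5 3

    # Read/seek error counters, side by side for each member.
    for d in sda sdb sdc sdd; do
        echo "== /dev/$d =="
        smartctl -A /dev/$d | grep -E 'Raw_Read_Error_Rate|Seek_Error_Rate'
    done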

* Re: How to prefer some devices over others in raid
  From: Tomas M
  Date: 2014-01-01 18:00 UTC
  To: stan; Cc: linux-raid

> Your initial post suggested you knew which drive was flaky. Now you
> indicate you don't know which, if any, is flaky. This suggests you have
> no idea why your array is slow.

Well, I always have an indication of which drive is flaky, based on dmesg
output (e.g. "hard resetting ATA3 link", etc.). However, sometimes it
reports that more than one drive has problems, and I can't be 100% sure
which of the flaky drives is the "more flaky" one :) and then it is too
late to replace any of them, since there is a high chance that the other
one dies as well during the resync (which has happened to me a few times
already). From my point of view it is better to keep the array in sync as
long as I can, and copy the data somewhere else as fast as I can.

NeilBrown's suggestion sounds like it needs only very simple logic to
implement; however, the amount of nested ANDs and ORs in fetch_block() in
raid5.c is too much for my understanding :-) so I'm giving up :)

Tomas M

* Re: How to prefer some devices over others in raid
  From: Phil Turmel
  Date: 2014-01-01 20:21 UTC
  To: Tomas M, stan; Cc: linux-raid

On 01/01/2014 01:00 PM, Tomas M wrote:
> Well, I always have an indication of which drive is flaky, based on dmesg
> output (e.g. "hard resetting ATA3 link", etc.). However, sometimes it
> reports that more than one drive has problems, and I can't be 100% sure
> which of the flaky drives is the "more flaky" one :) and then it is too
> late to replace any of them, since there is a high chance that the other
> one dies as well during the resync (which has happened to me a few times
> already).
> [...]

If you've experienced drive drops during resync a "few times already", and
you don't say that those drives were obviously dead, it makes me
suspicious that you are using non-enterprise drives. Using non-enterprise
drives in any raid array can expose you to false failures from the timeout
mismatch problem.

If you care to share the output of "smartctl -x" for all of your drives,
and of

  for x in /sys/block/*/device/timeout ; do echo $x $(< $x) ; done

we can immediately figure that out for you.

If you want to understand the issue, search this list's archives for
various combinations of "scterc", "URE", and "timeout mismatch".

You should also check whether your distro has a cron job that performs a
"check" scrub on your arrays for you.

HTH,

Phil
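
A sketch of the checks Phil asks for, along with the mitigations usually
recommended in those list archives. The 7-second ERC setting (scterc value
70) and the 180-second driver timeout are commonly cited values, not
figures Phil gives here, and the device names are placeholders:

    # Does each drive support and enable SCT ERC, and what is the kernel's
    # per-device command timeout?
    for d in sda sdb sdc sdd; do
        echo "== /dev/$d =="
        smartctl -l scterc /dev/$d
    done
    for x in /sys/block/*/device/timeout ; do echo $x $(< $x) ; done

    # Mitigation 1: if the drive supports ERC, cap its error recovery at
    # 7 seconds so it gives up before the kernel times out.
    smartctl -l scterc,70,70 /dev/sda

    # Mitigation 2: otherwise raise the driver timeout well above the
    # drive's internal recovery time (not persistent across reboots).
    echo 180 > /sys/block/sda/device/timeout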

* Re: How to prefer some devices over others in raid
  From: Stan Hoeppner
  Date: 2014-01-02 14:30 UTC
  To: Tomas M; Cc: linux-raid

On 1/1/2014 12:00 PM, Tomas M wrote:
> Well, I always have an indication of which drive is flaky, based on dmesg
> output (e.g. "hard resetting ATA3 link", etc.). However, sometimes it
> reports that more than one drive has problems
[snip]

Full stop.

Random resets on multiple links indicate a backplane (if present),
HBA/controller, or cable problem, not a drive problem.

If you're using an HBA plus backplane with an SFF-8087 4x multilane cable,
or a breakout cable, the problem could be as simple as a loose connection
at the SFF-8087 multilane connector, or a cable gone bad. If you have
multiple discrete SATA cables, one per drive, this points to a problem
with the controller/HBA.

Ergo, if you have a multilane cable, unplug/replug it and see if that
helps. If not, replace it. If that doesn't solve the problem, replace the
HBA. If replacing the HBA doesn't solve it, replace the backplane.

If you have discrete cables, replacing the HBA should fix the problem. If
you have discrete cables and are using motherboard SATA ports, you'll need
to acquire an HBA and cease using the motherboard ports.

-- 
Stan
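
If it helps to separate link trouble from media trouble before swapping
hardware, a small sketch (placeholder device name; the interpretation is a
rule of thumb, not a guarantee):

    # libata logs a "hard resetting link" line per reset; note which ata
    # ports the resets cluster on.
    dmesg | grep -i 'hard resetting link'

    # SATA PHY event counters: counts that climb here while the SMART
    # media-error attributes stay flat usually point at the cable,
    # backplane, or HBA path rather than the platters.
    smartctl -l sataphy /dev/sda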