* Sleepy drives and MD RAID 6
@ 2014-08-12 1:03 Adam Talbot
2014-08-12 1:21 ` Larkin Lowrey
2014-08-12 3:29 ` Roman Mamedov
0 siblings, 2 replies; 17+ messages in thread
From: Adam Talbot @ 2014-08-12 1:03 UTC (permalink / raw)
To: linux-raid
I need help from the Linux RAID pros.
To make a very long story short: I have a 7-disk RAID 6 array. I
put the drives to sleep after 7 minutes of inactivity. When I go to
use this array, the spin-up time causes applications to hang. The
current spin-up time is 50 seconds, and it will only get worse as I
add drives.
Here is the MUCH longer description including more specs (DingbatCA):
http://forums.gentoo.org/viewtopic-p-7599010.html
Any help would be greatly appreciated. I think this would make a
great wiki article.
More details below:
root@nas:/data# smartctl -a /dev/sdd | grep Spin_Up
  3 Spin_Up_Time          0x0027   150   137   021    Pre-fail  Always       -       9608
root@nas:/data# time (touch foo ; sync)
real 0m49.004s
user 0m0.000s
sys 0m0.004s
root@nas:/data# time (touch foo ; sync)
real 0m50.647s
user 0m0.000s
sys 0m0.008s
root@nas:/data# df -h /data
Filesystem Size Used Avail Use% Mounted on
/dev/md125 9.1T 3.8T 5.4T 42% /data
root@nas:/data# mdadm -D /dev/md125
/dev/md125:
Version : 1.2
Creation Time : Wed Jun 18 07:54:38 2014
Raid Level : raid6
Array Size : 9766909440 (9314.45 GiB 10001.32 GB)
Used Dev Size : 1953381888 (1862.89 GiB 2000.26 GB)
Raid Devices : 7
Total Devices : 7
Persistence : Superblock is persistent
Update Time : Mon Aug 11 16:30:16 2014
State : clean
Active Devices : 7
Working Devices : 7
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Name : nas:data (local to host nas)
UUID : 74f9ce7a:df1c2698:c8ec7259:5fdb2618
Events : 1038642
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 33 1 active sync /dev/sdc1
3 8 49 2 active sync /dev/sdd1
4 8 65 3 active sync /dev/sde1
5 8 81 4 active sync /dev/sdf1
7 8 145 5 active sync /dev/sdj1
6 8 129 6 active sync /dev/sdi1
^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Sleepy drives and MD RAID 6
  2014-08-12 1:03 Sleepy drives and MD RAID 6 Adam Talbot
@ 2014-08-12 1:21 ` Larkin Lowrey
  2014-08-12 1:31   ` Adam Talbot
  2014-08-12 5:55   ` Can Jeuleers
  2014-08-12 3:29 ` Roman Mamedov
  1 sibling, 2 replies; 17+ messages in thread
From: Larkin Lowrey @ 2014-08-12 1:21 UTC (permalink / raw)
To: Adam Talbot, linux-raid

I was in the same boat and decided to solve the problem with some code.

I wrote a daemon that monitors /sys/block/sd?/stat for each member of
the array. If all the drives have been idle for X seconds, the daemon
sends a spindown command to each member in parallel. If the array is
spun down, the daemon watches for any change in the aforementioned stat
file, and if there is one it spins up all members in parallel.

The effect of this is that the array spin-up time is only as long as
that of the slowest drive, and all the drives spin down at the same
time. My experience has been that leaving spindown up to the drives is
a bad idea: different models and different manufacturers have varying
notions of what 10 minutes means. Leaving spin-up to the controller is
also not so hot, since some controllers spin up the drives sequentially
rather than in parallel.

I'd be happy to share the code and even happier if someone wrote
something better!

--Larkin

On 8/11/2014 8:03 PM, Adam Talbot wrote:
> I need help from the Linux RAID pros.
>
> To make a very long story short; I have a 7 disk in a RAID 6 array. I
> put the drives to sleep after 7 minutes of inactivity. When I go to
> use this array the spin up time is causing applications to hang.
> Current spin up time is 50 seconds, but will be getting worse as I add
> drives.

^ permalink raw reply	[flat|nested] 17+ messages in thread
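The stat-watching primitive Larkin describes can be sketched in a few lines of shell. This is an illustrative sketch, not his actual daemon: the field positions follow the kernel's block-layer stat documentation (field 1 = reads completed, field 5 = writes completed), the names DISKS and POLL are hypothetical, and the demo parses a captured sample line instead of a live /sys file.

```shell
#!/bin/sh
# Idle detection for one array member: if the sum of completed reads and
# completed writes in /sys/block/<dev>/stat is unchanged between two
# samples, the disk saw no I/O in that interval.
activity_count() {
    # $1 = path to a stat file (normally /sys/block/sdX/stat)
    awk '{ print $1 + $5 }' "$1"
}

# Demonstrate against a captured sample rather than live sysfs:
tmp=$(mktemp)
printf '120 30 5000 800 45 10 2000 300 0 900 1100\n' > "$tmp"
count=$(activity_count "$tmp")
echo "$count"
rm -f "$tmp"

# The daemon's main loop would then look roughly like:
#   while sleep "$POLL"; do
#     for d in $DISKS; do compare activity_count against the last sample; done
#     # all idle long enough -> spin everything down in parallel:
#     #   for d in $DISKS; do hdparm -y /dev/$d & done; wait
#   done
```

`hdparm -y` issues the ATA standby-immediate command; how the daemon tracks the spun-down state so it can trigger the parallel wake-up, as Larkin's does, is left out of this sketch.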
* Re: Sleepy drives and MD RAID 6
  2014-08-12 1:21 ` Larkin Lowrey
@ 2014-08-12 1:31   ` Adam Talbot
  2014-08-12 5:55   ` Can Jeuleers
  1 sibling, 0 replies; 17+ messages in thread
From: Adam Talbot @ 2014-08-12 1:31 UTC (permalink / raw)
To: Larkin Lowrey; +Cc: linux-raid

Larkin,
I would be happy to take a look at your code and improve it, if I can.
I am good with C, C++ and shell scripts. I can read almost everything
else. I like your solution of going after /sys/block/sd?/stat. Would
it be better if we looked at including something like this directly in
the MD RAID stack, or at least added an option for parallel spin-up
(/sys/block/md125/md/parallel_spinup)?
Adam

On Mon, Aug 11, 2014 at 6:21 PM, Larkin Lowrey <llowrey@nuclearwinter.com> wrote:
> I was in the same boat and decided to solve the problem with some code.
>
> I wrote a daemon that monitors /sys/block/sd?/stat for each member of
> the array. If all the drives have been idle for X seconds the daemon
> sends a spindown command to each member in parallel. If the array is
> spun down the daemon watches for any change in the aforementioned stat
> file and if there is it spins up all members in parallel.

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: Sleepy drives and MD RAID 6
  2014-08-12 1:21 ` Larkin Lowrey
  2014-08-12 1:31   ` Adam Talbot
@ 2014-08-12 5:55   ` Can Jeuleers
  2014-08-12 9:46     ` Wilson, Jonathan
  1 sibling, 1 reply; 17+ messages in thread
From: Can Jeuleers @ 2014-08-12 5:55 UTC (permalink / raw)
To: linux-raid

On 08/12/2014 03:21 AM, Larkin Lowrey wrote:
> Also, leaving spin-up to the controller is
> also not so hot since some controllers spin-up the drives sequentially
> rather than in parallel.

Sequential spin-up is a feature to some, because it avoids large power
spikes.

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: Sleepy drives and MD RAID 6
  2014-08-12 5:55   ` Can Jeuleers
@ 2014-08-12 9:46     ` Wilson, Jonathan
  2014-08-12 15:26       ` Adam Talbot
       [not found]       ` <CAH_2GhfBKiJAT=+zCFH0_xkmyTfWFi=c-F+z8rwL1x++UwU+KA@mail.gmail.com>
  0 siblings, 2 replies; 17+ messages in thread
From: Wilson, Jonathan @ 2014-08-12 9:46 UTC (permalink / raw)
To: Can Jeuleers; +Cc: linux-raid

On Tue, 2014-08-12 at 07:55 +0200, Can Jeuleers wrote:
> On 08/12/2014 03:21 AM, Larkin Lowrey wrote:
> > Also, leaving spin-up to the controller is
> > also not so hot since some controllers spin-up the drives sequentially
> > rather than in parallel.
>
> Sequential spin-up is a feature to some, because it avoids large power
> spikes.

I vaguely recall older drives had a jumper to set a delayed spin-up, so
they stayed in a low-power (possibly not-spun-up) mode when power was
applied and only woke up when a command was received (I think any
command, not a specific "wake up" one).

Also, as mentioned, some controllers may only wake drives one after
the other. Likewise, mdraid does not care about the underlying
hardware/driver stack, only that it eventually responds, and even then I
believe it will happily wait till the end of time if no response or
error is propagated up the stack; hence the timeout lives in the
scsi_device stack, not in mdraid.

^ permalink raw reply	[flat|nested] 17+ messages in thread
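Since, as Wilson says, the timeout lives in the scsi_device stack, one practical mitigation (not suggested in the thread itself, so treat it as an editor's hedged aside) is to raise the per-device SCSI command timeout so a slow staggered spin-up completes late rather than erroring out. The sysfs knob is /sys/block/<dev>/device/timeout, in seconds. A small helper, exercised here against a scratch directory standing in for sysfs:

```shell
#!/bin/sh
# Write a new SCSI command timeout (in seconds) into each given sysfs
# "device" directory. Real targets would be /sys/block/sdX/device;
# the device names in the usage comment are assumptions.
set_scsi_timeout() {
    secs=$1; shift
    for dev in "$@"; do
        echo "$secs" > "$dev/timeout"
    done
}

# Real use (as root): set_scsi_timeout 120 /sys/block/sd[b-j]/device
# Demonstrated here against a mock directory:
mock=$(mktemp -d)
: > "$mock/timeout"
set_scsi_timeout 120 "$mock"
val=$(cat "$mock/timeout")
echo "$val"
rm -rf "$mock"
```

The default is typically 30 seconds, which is shorter than the ~50-second array spin-up Adam measures.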
* Re: Sleepy drives and MD RAID 6
  2014-08-12 9:46     ` Wilson, Jonathan
@ 2014-08-12 15:26       ` Adam Talbot
       [not found]       ` <CAH_2GhfBKiJAT=+zCFH0_xkmyTfWFi=c-F+z8rwL1x++UwU+KA@mail.gmail.com>
  1 sibling, 0 replies; 17+ messages in thread
From: Adam Talbot @ 2014-08-12 15:26 UTC (permalink / raw)
To: Wilson, Jonathan, rm; +Cc: Can Jeuleers, linux-raid

Thank you all for the input. At this point I think I am going to write
a simple daemon to do dm power management. I still think this would be
a good feature set to roll into the driver stack, or the mdadm tools.

As far as wear and tear on the disks: yes, starting and stopping the
drives shortens their life span. I don't trust my disks, regardless of
starting/stopping; that is why I run RAID 6. Let's say I use my NAS
with its 7 disks for 2 hours a day, 7 days a week, at 10 watts per
drive. The current price for power in my area is $0.11 per
kilowatt-hour. That comes out to $5.62 per year to run my drives for 2
hours daily, but if I run my drives 24/7 it would cost me $67.45/year.
Basically, it would cost me an extra $61.83/year to run the drives
24/7. The 2TB 5400RPM SATA drives I have been picking up from local
surplus or auction websites are costing me $40~$50, including shipping
and tax. In other words, I could buy a new disk every 8~10 months to
replace failures and it would be the same cost. Drives don't fail that
fast, even if I were starting/stopping them 10 times daily. This is
also completely ignoring the fact that drive prices are falling. Sorry
to disappoint, but I am going to spin down my array and save some
money.

On Tue, Aug 12, 2014 at 2:46 AM, Wilson, Jonathan
<piercing_male@hotmail.com> wrote:
> Also as mentioned some controllers may also only wake drives one after
> the other, likewise mdriad does not care about the underlying
> hardware/driver stack, only that it eventually responds, and even then I
> believe it will happily wait till the end of time if no response or
> error is propagated up the stack; hence the time out in scsi_device
> stack not in the mdraid.

^ permalink raw reply	[flat|nested] 17+ messages in thread
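Adam's arithmetic above checks out; as a sanity check, the estimate (7 drives at 10 W each, $0.11/kWh, 365 days, all figures taken from his message) can be reproduced directly:

```shell
#!/bin/sh
# Recompute the yearly power cost of 7 drives at 10 W each, $0.11/kWh.
costs=$(awk 'BEGIN {
    watts = 7 * 10                              # total draw while spinning
    rate  = 0.11                                # dollars per kWh
    two_hr = watts * 2  * 365 / 1000 * rate     # 2 hours/day usage
    always = watts * 24 * 365 / 1000 * rate     # spinning 24/7
    printf "2h/day: $%.2f  24/7: $%.2f  delta: $%.2f", two_hr, always, always - two_hr
}')
echo "$costs"
```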
[parent not found: <CAH_2GhfBKiJAT=+zCFH0_xkmyTfWFi=c-F+z8rwL1x++UwU+KA@mail.gmail.com>]
* Re: Sleepy drives and MD RAID 6
       [not found] ` <CAH_2GhfBKiJAT=+zCFH0_xkmyTfWFi=c-F+z8rwL1x++UwU+KA@mail.gmail.com>
@ 2014-08-13 16:07 ` Adam Talbot
  2014-08-14 16:50   ` Adam Talbot
  0 siblings, 1 reply; 17+ messages in thread
From: Adam Talbot @ 2014-08-13 16:07 UTC (permalink / raw)
To: Wilson, Jonathan; +Cc: Can Jeuleers, linux-raid, rm

Arg!! Am I hitting some kind of blocking at the Linux kernel?? No
matter what I do, I can't seem to get the drives to spin up in
parallel. Any ideas?

A simple test case trying to get two drives to spin up at once.
root@nas:~# hdparm -C /dev/sdh /dev/sdg
/dev/sdh:
 drive state is:  standby

/dev/sdg:
 drive state is:  standby

#Two terminal windows dd'ing sdg and sdh at the same time.
root@nas:~/dm_drive_sleeper# time dd if=/dev/sdh of=/dev/null bs=4096 count=1 iflag=direct
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 14.371 s, 0.3 kB/s

real    0m28.139s   ############# WHY?! ################
user    0m0.000s
sys     0m0.000s

#A single drive spin-up
root@nas:~/dm_drive_sleeper# time dd if=/dev/sdh of=/dev/null bs=4096 count=1 iflag=direct
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 14.4212 s, 0.3 kB/s

real    0m14.424s
user    0m0.000s
sys     0m0.000s

On Tue, Aug 12, 2014 at 8:23 AM, Adam Talbot <ajtalbot1@gmail.com> wrote:
> Thank you all for the input. At this point I think I am going to write a
> simple daemon to do dm power management. I still think this would be a good
> feature set to roll into the driver stack, or madam-tools.

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: Sleepy drives and MD RAID 6
  2014-08-13 16:07 ` Adam Talbot
@ 2014-08-14 16:50   ` Adam Talbot
  2014-08-14 17:00     ` Larkin Lowrey
  0 siblings, 1 reply; 17+ messages in thread
From: Adam Talbot @ 2014-08-14 16:50 UTC (permalink / raw)
To: Wilson, Jonathan; +Cc: Can Jeuleers, linux-raid, rm

I am running out of ideas. Does anyone know how to wake a disk with a
non-blocking, non-caching method? I have tried the following commands:
dd if=/dev/sdh of=/dev/null bs=4096 count=1 iflag=direct,nonblock
hdparm --dco-identify /dev/sdh   (This gets cached after the 3~10th time running)
hdparm --read-sector 48059863 /dev/sdh

Any ideas?

On Wed, Aug 13, 2014 at 9:07 AM, Adam Talbot <ajtalbot1@gmail.com> wrote:
> Arg!! Am I hitting some kind of blocking at the Linux kernel?? No
> matter what I do, I can't seem to get the drives to spin up in
> parallel. Any ideas?

^ permalink raw reply	[flat|nested] 17+ messages in thread
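One hedged workaround for the caching Adam is fighting (an editor's sketch, not something proposed in the thread): read a randomly chosen 4 KiB block with O_DIRECT each time, so neither the page cache nor the drive's cache can satisfy a repeat of the request without the platters. The device name and size below are assumptions, and the demo call at the end uses a harmless target so the sketch runs anywhere.

```shell
#!/bin/sh
# Read one 4 KiB block at a random offset, bypassing the page cache.
random_read() {
    # $1 = device (or any readable file), $2 = its size in 4 KiB blocks
    off=$(( $(od -An -N2 -tu2 /dev/urandom | tr -d ' ') % $2 ))
    dd if="$1" of=/dev/null bs=4096 count=1 skip="$off" iflag=direct 2>/dev/null
    echo "read block $off"
}

# Real use might be: random_read /dev/sdh 488378646   # 2 TB / 4 KiB (assumed)
random_read /dev/null 4096
```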
* Re: Sleepy drives and MD RAID 6
  2014-08-14 16:50   ` Adam Talbot
@ 2014-08-14 17:00     ` Larkin Lowrey
  2014-08-14 17:37       ` Adam Talbot
  0 siblings, 1 reply; 17+ messages in thread
From: Larkin Lowrey @ 2014-08-14 17:00 UTC (permalink / raw)
To: Adam Talbot, Wilson, Jonathan; +Cc: Can Jeuleers, linux-raid, rm

Have you tried the dd command w/o nonblock, putting it in the
background via &? You could then use the 'wait' command to wait for
them to finish.

I did dust off some old memories and recalled that one of my SAS
controllers (LSI) does the spin-ups serially no matter what, and I
ended up moving those low-duty-cycle drives to my other SAS controller
(Marvell) and put my always-spinning drives on the LSI. I've never seen
this behavior from any of my AHCI SATA controllers.

--Larkin

On 8/14/2014 11:50 AM, Adam Talbot wrote:
> I am running out of ideas. Does anyone know how to wake a disk with a
> non-blocking, and non-caching method?

^ permalink raw reply	[flat|nested] 17+ messages in thread
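Larkin's background-and-wait suggestion can be sketched as a helper: one direct read per member is launched in the background, and `wait` returns once the batch finishes, so total spin-up time is bounded by the slowest drive rather than the sum. The device names are assumptions; the demo call at the end uses harmless targets so the sketch runs anywhere.

```shell
#!/bin/sh
# Wake every given device in parallel: one backgrounded O_DIRECT read
# each, then wait for the whole batch.
wake_all() {
    for dev in "$@"; do
        dd if="$dev" of=/dev/null bs=4096 count=1 iflag=direct 2>/dev/null &
    done
    wait    # returns once every background dd has exited
    echo "issued wake reads to $# device(s)"
}

# Real use would be something like: wake_all /dev/sd[b-j]
wake_all /dev/null /dev/zero
```

As the thread goes on to show, this only helps if the controller itself is willing to spin drives up concurrently.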
* Re: Sleepy drives and MD RAID 6 2014-08-14 17:00 ` Larkin Lowrey @ 2014-08-14 17:37 ` Adam Talbot 2014-08-14 18:05 ` Larkin Lowrey 0 siblings, 1 reply; 17+ messages in thread From: Adam Talbot @ 2014-08-14 17:37 UTC (permalink / raw) To: Larkin Lowrey; +Cc: Wilson, Jonathan, Can Jeuleers, linux-raid, rm For testing I use two windows, just to make sure they are run independent. My shell script uses "(setsid put_some_command_here /dev/$i > /dev/null 2>&1 &)" to make sure the command is forced into the background. Hummm... A controller issue? lspci | grep LSI 07:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 02) 09:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08) 0b:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 02) lspci | grep -i sata (On-board) 00:1f.2 IDE interface: Intel Corporation 631xESB/632xESB/3100 Chipset SATA IDE Controller (rev 09) All but 1 of my drives are run through my 3X 4-port LSI cards. /dev/sdb is running through the onboard Intel SATA controller. Each drive takes 10 secounds to spin up. With a 7 disk RAID 6, I would expect a read/write to succeed 50 seconds (5 drives) after the request. But on my system it always takes 40 seconds?! Quick test. sdb & sdc at the same time (Intel + LSI): root@nas:~/dm_drive_sleeper# time (dd if=/dev/sdc of=/dev/null bs=512k count=16 iflag=direct) 16+0 records in 16+0 records out 8388608 bytes (8.4 MB) copied, 10.2006 s, 822 kB/s real 0m10.202s user 0m0.000s sys 0m0.000s sdf & sde at the same time (LSI + LSI):root@nas:~/dm_drive_sleeper# time (dd if=/dev/sdf of=/dev/null bs=512k count=16 iflag=direct) 16+0 records in 16+0 records out 8388608 bytes (8.4 MB) copied, 10.2417 s, 819 kB/s real 0m20.208s user 0m0.000s sys 0m0.000s I blame the LSI cards!??!? I have been looking for an excuse to upgrade, and now I have it! 
Any clue where I can find a dumb/cheap/used 12-port card (or 2X 8-port)?
My drive cage has 15 ports, standard SATA/SAS connections, so I will have
to pick up some adapter cables regardless of the new card type.

In other news, Larkin, I owe you a beer/coffee/tea.

On Thu, Aug 14, 2014 at 10:00 AM, Larkin Lowrey <llowrey@nuclearwinter.com> wrote:
> Have you tried the dd command w/o nonblock and putting it in the
> background via &? You could then use the 'wait' command to wait for them
> to finish.
>
> I did dust off some old memories and recalled that one of my SAS
> controllers (LSI) does the spin-ups serially no matter what, and I ended
> up moving these low duty cycle drives to my other SAS controller
> (Marvell) and put my always-spinning drives on the LSI. I've never seen
> this behavior from any of my AHCI SATA controllers.
>
> --Larkin
>
> On 8/14/2014 11:50 AM, Adam Talbot wrote:
>> I am running out of ideas. Does anyone know how to wake a disk with a
>> non-blocking, non-caching method?
>> I have tried the following commands:
>> dd if=/dev/sdh of=/dev/null bs=4096 count=1 iflag=direct,nonblock
>> hdparm --dco-identify /dev/sdh   (This gets cached after the 3rd~10th
>> time running)
>> hdparm --read-sector 48059863 /dev/sdh
>>
>> Any ideas?
>>
>> On Wed, Aug 13, 2014 at 9:07 AM, Adam Talbot <ajtalbot1@gmail.com> wrote:
>>> Arg!! Am I hitting some kind of blocking at the Linux kernel?? No
>>> matter what I do, I can't seem to get the drives to spin up in
>>> parallel. Any ideas?
>>>
>>> A simple test case trying to get two drives to spin up at once.
>>> root@nas:~# hdparm -C /dev/sdh /dev/sdg
>>> /dev/sdh:
>>> drive state is: standby
>>>
>>> /dev/sdg:
>>> drive state is: standby
>>>
>>> # Two terminal windows dd'ing sdg and sdh at the same time.
>>> root@nas:~/dm_drive_sleeper# time dd if=/dev/sdh of=/dev/null bs=4096 count=1 iflag=direct
>>> 1+0 records in
>>> 1+0 records out
>>> 4096 bytes (4.1 kB) copied, 14.371 s, 0.3 kB/s
>>>
>>> real 0m28.139s ############# WHY?! ################
>>> user 0m0.000s
>>> sys 0m0.000s
>>>
>>> # A single drive spin-up
>>> root@nas:~/dm_drive_sleeper# time dd if=/dev/sdh of=/dev/null bs=4096 count=1 iflag=direct
>>> 1+0 records in
>>> 1+0 records out
>>> 4096 bytes (4.1 kB) copied, 14.4212 s, 0.3 kB/s
>>>
>>> real 0m14.424s
>>> user 0m0.000s
>>> sys 0m0.000s
>>>
>>> On Tue, Aug 12, 2014 at 8:23 AM, Adam Talbot <ajtalbot1@gmail.com> wrote:
>>>> Thank you all for the input. At this point I think I am going to write a
>>>> simple daemon to do dm power management. I still think this would be a
>>>> good feature set to roll into the driver stack, or mdadm-tools.
>>>>
>>>> As far as wear and tear on the disks: yes, starting and stopping the
>>>> drives shortens their life span. I don't trust my disks, regardless of
>>>> starting/stopping; that is why I run RAID 6. Let's say I use my NAS with
>>>> its 7 disks for 2 hours a day, 7 days a week @ 10 watts per drive. The
>>>> current price for power in my area is $0.11 per kilowatt-hour. That comes
>>>> out to $5.62 per year to run my drives for 2 hours, daily. But if I run
>>>> my drives 24/7 it would cost me $67.45/year. Basically it would cost me
>>>> an extra $61.83/year to run the drives 24/7. The 2TB 5400RPM SATA drives
>>>> I have been picking up from local surplus, or auction websites, are
>>>> costing me $40~$50, including shipping and tax. In other words I could
>>>> buy a new disk every 8~10 months to replace failures and it would be the
>>>> same cost. Drives don't fail that fast, even if I was start/stopping them
>>>> 10 times daily. This is also completely ignoring the fact that drive
>>>> prices are falling. Sorry to disappoint, but I am going to spin down my
>>>> array and save some money.
>>>>
>>>> On Tue, Aug 12, 2014 at 2:46 AM, Wilson, Jonathan
>>>> <piercing_male@hotmail.com> wrote:
>>>>> On Tue, 2014-08-12 at 07:55 +0200, Can Jeuleers wrote:
>>>>>> On 08/12/2014 03:21 AM, Larkin Lowrey wrote:
>>>>>>> Also, leaving spin-up to the controller is
>>>>>>> also not so hot since some controllers spin up the drives sequentially
>>>>>>> rather than in parallel.
>>>>>> Sequential spin-up is a feature to some, because it avoids large power
>>>>>> spikes.
>>>>> I vaguely recall older drives had a jumper to set a delayed spin-up so
>>>>> they stayed in a low-power (possibly un-spun-up) mode when power was
>>>>> applied and only woke up when a command was received (I think any
>>>>> command, not a specific "wake up" one).
>>>>>
>>>>> Also, as mentioned, some controllers may only wake drives one after
>>>>> the other; likewise mdraid does not care about the underlying
>>>>> hardware/driver stack, only that it eventually responds, and even then I
>>>>> believe it will happily wait till the end of time if no response or
>>>>> error is propagated up the stack; hence the timeout is in the
>>>>> scsi_device stack, not in mdraid.
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread
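Larkin's suggestion quoted above — plain `dd` in the background plus the
shell's `wait`, instead of `setsid` — can be sketched as a tiny helper.
(The function name and device list here are illustrative, not taken from
anyone's actual script.)

```shell
# wake_all: issue one small O_DIRECT read to every given device in
# parallel, then wait, so total spin-up time is bounded by the slowest
# drive instead of the sum of all of them.
wake_all() {
    for dev in "$@"; do
        # plain & (no setsid) keeps each dd a child of this shell,
        # so the 'wait' below really does block until all reads finish
        dd if="$dev" of=/dev/null bs=4096 count=1 iflag=direct \
            >/dev/null 2>&1 &
    done
    wait    # returns once the slowest drive has answered
}
```

Usage would be something like `wake_all /dev/sd[b-j]`. Whether the reads
actually overlap on the platters still depends on the controller, which is
exactly what the timing tests in this thread are probing.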
* Re: Sleepy drives and MD RAID 6
  2014-08-14 17:37 ` Adam Talbot
@ 2014-08-14 18:05 ` Larkin Lowrey
  2014-08-18 15:41 ` Adam Talbot
  0 siblings, 1 reply; 17+ messages in thread
From: Larkin Lowrey @ 2014-08-14 18:05 UTC (permalink / raw)
To: Adam Talbot; +Cc: Wilson, Jonathan, Can Jeuleers, linux-raid, rm

My LSI SAS controller (SAS2008) is newer and may behave differently, but
I'm guessing this is your problem.

I've been very happy with my HighPoint controllers (difficult to say in
public). I have an 8-port Rocket 2720SGL ($150) and a 16-port RocketRaid
2740 ($400+). Both have worked flawlessly and performance has been
excellent. The 16-port card actually has two 8-port controllers on it
bridged together; I think you're better off with two 8-port cards.

The 8-port RocketRaid 2680 is slower (3Gb/s) but should be fine for
spinning rust and is about $100. I don't have any experience with those.
I found one on ebay for $45, so there may be some good deals on that one
since it's a generation older.

--Larkin

On 8/14/2014 12:37 PM, Adam Talbot wrote:
> For testing I use two windows, just to make sure they run independently.
> My shell script uses "(setsid put_some_command_here /dev/$i > /dev/null
> 2>&1 &)" to make sure the command is forced into the background.
> [...quoted text trimmed; see Adam's message above...]

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: Sleepy drives and MD RAID 6
  2014-08-14 18:05 ` Larkin Lowrey
@ 2014-08-18 15:41 ` Adam Talbot
  2014-08-18 16:16 ` Larkin Lowrey
  0 siblings, 1 reply; 17+ messages in thread
From: Adam Talbot @ 2014-08-18 15:41 UTC (permalink / raw)
To: Larkin Lowrey; +Cc: Wilson, Jonathan, Can Jeuleers, linux-raid, rm

Can you confirm the LSI 2008 SAS controller is, or is not, affected by
this problem?

On Thu, Aug 14, 2014 at 11:05 AM, Larkin Lowrey <llowrey@nuclearwinter.com> wrote:
> My LSI SAS controller (SAS2008) is newer and may behave differently, but
> I'm guessing this is your problem.
> [...quoted text trimmed; see Larkin's message above...]

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: Sleepy drives and MD RAID 6
  2014-08-18 15:41 ` Adam Talbot
@ 2014-08-18 16:16 ` Larkin Lowrey
  [not found] ` <CAH_2GhcwHszr75bbDq2aWRnnF8-ttpGERKE=CKt=Nri3ue9row@mail.gmail.com>
  0 siblings, 1 reply; 17+ messages in thread
From: Larkin Lowrey @ 2014-08-18 16:16 UTC (permalink / raw)
To: Adam Talbot; +Cc: Wilson, Jonathan, Can Jeuleers, linux-raid, rm

Yes, the SAS2008 will not spin up the drives in parallel no matter what
I try.

--Larkin

On 8/18/2014 10:41 AM, Adam Talbot wrote:
> Can you confirm the LSI 2008 SAS controller is, or is not, affected by
> this problem?
> [...quoted text trimmed; see the messages above...]

^ permalink raw reply	[flat|nested] 17+ messages in thread
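The daemon approach Larkin describes earlier in the thread — poll each
member's /sys/block/sd?/stat, spin everything down together after a quiet
period, and wake everything in parallel at the first sign of activity —
might look roughly like the following. This is a minimal sketch, not his
actual code; the device names, the `hdparm -y` spin-down, and the
dd-based wake are all assumptions layered on what the thread describes.

```python
import subprocess
import time
from pathlib import Path

def read_stats(drives, base="/sys/block"):
    """Snapshot each member's I/O counter line; any change means activity."""
    return {d: (Path(base) / d / "stat").read_text() for d in drives}

def spin_down(drives):
    """Put every member into standby at the same moment (hdparm -y)."""
    procs = [subprocess.Popen(["hdparm", "-y", f"/dev/{d}"]) for d in drives]
    for p in procs:
        p.wait()

def spin_up(drives):
    """One small direct read per drive, all in parallel, as in the thread."""
    procs = [subprocess.Popen(["dd", f"if=/dev/{d}", "of=/dev/null",
                               "bs=4096", "count=1", "iflag=direct"])
             for d in drives]
    for p in procs:
        p.wait()

def monitor(drives, idle_secs=420, poll=10, base="/sys/block"):
    """Main loop: track idleness from the stat files and react."""
    last, idle, asleep = read_stats(drives, base), 0, False
    while True:
        time.sleep(poll)
        cur = read_stats(drives, base)
        if cur != last:               # I/O happened since the last poll
            if asleep:
                spin_up(drives)       # wake all members in parallel
                asleep = False
            idle = 0
        elif not asleep:
            idle += poll
            if idle >= idle_secs:     # quiet long enough: all down at once
                spin_down(drives)
                asleep = True
        last = cur
```

This sidesteps the firmware-timer problem Adam hits later (drives
interpreting `hdparm -S` differently), since one process decides when the
whole array sleeps and wakes.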
[parent not found: <CAH_2GhcwHszr75bbDq2aWRnnF8-ttpGERKE=CKt=Nri3ue9row@mail.gmail.com>]
* Re: Sleepy drives and MD RAID 6
  [not found] ` <CAH_2GhcwHszr75bbDq2aWRnnF8-ttpGERKE=CKt=Nri3ue9row@mail.gmail.com>
@ 2014-08-28 21:04 ` Adam Talbot
  0 siblings, 0 replies; 17+ messages in thread
From: Adam Talbot @ 2014-08-28 21:04 UTC (permalink / raw)
To: Larkin Lowrey; +Cc: Can Jeuleers, rm, linux-raid, Wilson, Jonathan

Solved. The below link to the Gentoo forums is my formal write-up. I hope
this info can help those who follow me in the sleepy drive adventures.
Now I am off to take a nap.
https://forums.gentoo.org/viewtopic-t-997086.html

The raw text, just in case the above link does not work:

Success!!

Why? I wanted to put my DM software-based RAID6 to sleep when not in use.
At 10 watts per drive, it adds up! I did not want to wait 10 seconds per
drive, in series, for the array to come to life. I was tired of my
Windows desktop hanging while waiting for a simple directory lookup on my
NAS.

Disclaimer: Do not come crying to me when you destroy a hard drive, lose
all your data, fry a power supply, or cause a small country to be erased
from the face of the Earth.

The key points covered below:
Drive Controller
Bcache
Inotify

Drive Controller

My server/NAS was running 3X LSI SAS 1068e controllers to control my
7 drive RAID 6. Turns out that the cards are hard-coded to spin up in
series. No way to get around it; it just is. This applies to ANY card
running the LSI 1068e chipset, such as a Dell Perc 6/i or HP P400. It may
even apply to all LSI based cards. To make matters worse, the cards are
smart and will only spin up one drive at a time across all 3 cards. My
7 disk RAID 6 was taking 50 seconds to spin up (10 seconds per drive).
This dropped to 40 seconds when I moved 1 drive to the onboard SATA
controller. That was my first clue. Thanks to the Linux-Raid group
mailing list for the help isolating this one.

So I was on the Internets looking for a new, cheap, 12~16 port SATA II
controller card. I found a very strange card on ebay. A "Ciprico Inc.
RAIDCore" 16-port card. I can't even find any good pictures or links to add to this post so you can see it. It basically has 4 Marvell controllers and a PCIe bridge strapped onto a single card. No brains, no nothing. Just a pure, dumb controller without any spin-up stupidity. Same chipset (88SE6445) found on some RocketRAID cards. It was EXACTLY what I was looking for. At a cost of $60 I was thrilled. In Linux it shows up as a bridge + controller chips:

Code:
07:00.0 PCI bridge: Integrated Device Technology, Inc. PES24T6 PCI Express Switch (rev 0d)
08:02.0 PCI bridge: Integrated Device Technology, Inc. PES24T6 PCI Express Switch (rev 0d)
08:03.0 PCI bridge: Integrated Device Technology, Inc. PES24T6 PCI Express Switch (rev 0d)
08:04.0 PCI bridge: Integrated Device Technology, Inc. PES24T6 PCI Express Switch (rev 0d)
08:05.0 PCI bridge: Integrated Device Technology, Inc. PES24T6 PCI Express Switch (rev 0d)
09:00.0 SCSI storage controller: Marvell Technology Group Ltd. 88SE6440 SAS/SATA PCIe controller (rev 02)
0a:00.0 SCSI storage controller: Marvell Technology Group Ltd. 88SE6440 SAS/SATA PCIe controller (rev 02)
0b:00.0 SCSI storage controller: Marvell Technology Group Ltd. 88SE6440 SAS/SATA PCIe controller (rev 02)
0c:00.0 SCSI storage controller: Marvell Technology Group Ltd. 88SE6440 SAS/SATA PCIe controller (rev 02)

Bcache

https://www.kernel.org/doc/Documentation/bcache.txt

Now that I had the total spin-up time down from 50 seconds ((number_of_drives * 10) - 2) to 10 seconds, I was able to address the remaining 10 seconds using caching. In this case I am using bcache. My operating system disks are 2x OCZ Deneva 240GB SSDs set up in a basic mirror. I partitioned these drives out and used 24GB as a caching device for my RAID. I quickly found out that bcache is unstable on the 3.16 kernel and was forced back to the 3.14 LTS kernel. After I landed on the 3.14.15 kernel everything is running great.
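For completeness, the write-up skips how the bcache devices were created in the first place. Per the bcache.txt documentation linked above, the steps look roughly like the sketch below. The device names are examples (not the author's exact ones), and formatting a device for bcache destroys its existing contents:

```shell
# Example only: /dev/md125 = the RAID 6 array (backing device),
#               /dev/sda5  = a 24GB SSD partition (cache device).
make-bcache -B /dev/md125      # format the backing device; /dev/bcache0 appears
make-bcache -C /dev/sda5       # format the cache device
# udev normally registers both automatically; then attach the cache set
# to the backing device using the cache set's UUID:
bcache-super-show /dev/sda5 | grep cset.uuid
echo <cset-uuid> > /sys/block/bcache0/bcache/attach
```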
The basic bcache settings work, but I wanted more:

Code:
#Setup bcache just the way I like it, hun-hun, hun-hun
#Get involved in read and write activities
echo "writeback" > /sys/block/bcache0/bcache/cache_mode
#Allow bcache to put data in the cache, but get it out as fast as possible
echo "0" > /sys/block/bcache0/bcache/writeback_percent
echo "0" > /sys/block/bcache0/bcache/writeback_delay
echo $((16*1024)) > /sys/block/bcache0/bcache/writeback_rate
#Clean up jerky read performance on files that have never been cached
echo "16M" > /sys/block/bcache0/bcache/readahead

I put all the above code in rc.local so my system picks it up on boot. Writes still need to wake the array, but reads from cache don't even wake up the drives.

Code:
root@nas:/data# time (dd if=/dev/zero of=foo.dd bs=4096k count=16 ; sync)
16+0 records in
16+0 records out
67108864 bytes (67 MB) copied, 0.0963405 s, 697 MB/s

real 0m10.656s #######Array spin up time#########
user 0m0.000s
sys 0m0.128s
root@nas:~# ./sleeping_raid_status.sh
/dev/sdc standby
...
/dev/sdd standby
root@nas:/data# time (dd if=foo.dd of=/dev/null iflag=direct)
131072+0 records in
131072+0 records out
67108864 bytes (67 MB) copied, 0.118975 s, 564 MB/s

real 0m0.121s ########Array never even woke up#########
user 0m0.024s
sys 0m0.096s
root@nas:~# ./sleeping_raid_status.sh
/dev/sdc standby
/dev/sdj standby
...

Inotify

Wait... The array did not spin up because it read from cache?! Working exactly as expected, but not good enough: I have the file metadata in cache, but what happens when I want to read the file... 10 seconds later... Normally when I find a media file, I want to read/watch/listen to it. I accessed the metadata; why not spin up preemptively? Time for a fun script using inotify. I actually took this script one step further than just preemptive spin-up and have it do all drive power management. Turns out different drive manufacturers interpret `hdparm -S 84 $DRIVE` (go to sleep in 7m) differently.
This whole NAS was built on the cheap and I have 4 different types of drives in my array.

Code:
#!/bin/bash
WATCH_PATH="/data"
ARRAY_NAME="data"
SLEEPING_TIME_S="600"

# Resolve /dev/md/data to its mdXXX name, then list the member disks
# (strip partition numbers: sdb1 -> sdb).
ARRAY=$(basename "$(readlink -f "/dev/md/$ARRAY_NAME")")
PARTS=$(ls "/sys/block/$ARRAY/slaves" | sed 's/[0-9]*$//')

set -m
while true; do
    if inotifywait "$WATCH_PATH" -qq -t "$SLEEPING_TIME_S"; then
        # Filesystem activity seen: kick every member awake in parallel.
        for i in $PARTS; do
            (hdparm -S 0 "/dev/$i") &
        done
        wait
    else
        # Timed out with no activity: put any spinning member into standby.
        for i in $PARTS; do
            STATE=$(hdparm -C "/dev/$i" | awk '/drive state is/ {print $4}')
            # Really should also check that the array is not doing something
            # block related, like a check or rebuild.
            if [ "$STATE" != "standby" ]; then
                hdparm -y "/dev/$i" > /dev/null 2>&1
            fi
        done
    fi
    sleep 1
done

A few other key points have been addressed in this thread. There is much greater detail in the posts below:

Spinning drives up/down puts wear on them, but it is more cost effective to sleep the drives and wear them out than it is to pay for the power.
Spinning up X drives at once puts a huge load on the PSU (Power Supply Unit). According to Western Digital, their 7200RPM drives spike at 30 watts during spin-up. You have been warned.
Warning: formatting a drive for bcache will remove ALL your data. There is no way to remove bcache without reformatting the device.
5400RPM drives take about 10 seconds to spin up. 7200RPM drives take about 14 seconds.

^ permalink raw reply [flat|nested] 17+ messages in thread
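Larkin's daemon, described earlier in the thread, uses a different trigger than inotify: it polls each member's /sys/block/sdX/stat and treats an unchanged snapshot as "idle for the whole poll interval". A minimal sketch of that comparison logic, shown here against hard-coded sample snapshots rather than a live /sys (the values are illustrative):

```shell
#!/bin/sh
# A drive is considered idle if its I/O counters did not change between
# two polls. On a real system each snapshot would come from:
#   cat /sys/block/sdb/stat
prev="8320 123 210290 4410 512 9 4096 30 0 1200 4440"
curr="8320 123 210290 4410 512 9 4096 30 0 1200 4440"
if [ "$prev" = "$curr" ]; then
    state="idle"    # no reads or writes since the last poll
else
    state="busy"
fi
echo "$state"
```

In a real daemon this check would run once per poll interval for every member, spinning all drives down only when every one of them reports idle.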
* Re: Sleepy drives and MD RAID 6 2014-08-12 1:03 Sleepy drives and MD RAID 6 Adam Talbot 2014-08-12 1:21 ` Larkin Lowrey @ 2014-08-12 3:29 ` Roman Mamedov 2014-08-14 18:10 ` Bill Davidsen 1 sibling, 1 reply; 17+ messages in thread From: Roman Mamedov @ 2014-08-12 3:29 UTC (permalink / raw) To: Adam Talbot; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 709 bytes --]

On Mon, 11 Aug 2014 18:03:17 -0700 Adam Talbot <ajtalbot1@gmail.com> wrote:

> I need help from the Linux RAID pros.
>
> To make a very long story short; I have a 7 disk in a RAID 6 array. I
> put the drives to sleep after 7 minutes of inactivity.

It is well known that repeatedly spinning a drive down/up is absolutely the worst possible thing you can do to it, from a long-term reliability standpoint. So my personal suggestion would be to reconsider whether you really want this. The power consumption from 7 spinning drives with no access should be no higher than 60-70 watts; IMHO saving that amount is not worth risking your disks and data for.

--
With respect, Roman

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply [flat|nested] 17+ messages in thread
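Roman's 60-70 watt estimate is easy to sanity-check. The per-drive idle wattage below is an assumed figure for illustration, not a measured one:

```shell
#!/bin/sh
# 7 drives spinning idle at ~9 W each (assumed), running 24/7.
watts=$((7 * 9))
kwh_year=$((watts * 24 * 365 / 1000))
echo "${watts} W continuous, ~${kwh_year} kWh/year"
```

Multiply the kWh figure by a local electricity rate to see the actual dollar stakes of the spindown decision.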
* Re: Sleepy drives and MD RAID 6 2014-08-12 3:29 ` Roman Mamedov @ 2014-08-14 18:10 ` Bill Davidsen 2014-08-14 19:33 ` Adam Talbot 0 siblings, 1 reply; 17+ messages in thread From: Bill Davidsen @ 2014-08-14 18:10 UTC (permalink / raw) To: linux-raid

Roman Mamedov wrote:
> On Mon, 11 Aug 2014 18:03:17 -0700
> Adam Talbot <ajtalbot1@gmail.com> wrote:
>
>> I need help from the Linux RAID pros.
>>
>> To make a very long story short; I have a 7 disk in a RAID 6 array. I
>> put the drives to sleep after 7 minutes of inactivity.
>
> It is well known that repeatedly spinning a drive down/up is absolutely the
> worst possible thing you can do to it, from a long term reliability standpoint.
> So my personal suggestion would be to reconsider if you really want this. The
> power consumption from 7 spinning drives with no access should be no higher
> than 60-70 watt; IMHO saving that amount, is not something that's worth risking
> your disks and data for.
>
Unless you live someplace really cold, there's the cost of pumping that heat out of the room. Running A/C can almost double your power cost, and running the room hot shortens your component life. In other words there's more cost than the power for most people. Even if my hardware can run at 90F, I can't.

There appears to be a partial solution: get a small SSD (<$100) and put the journal on it. Get two and run RAID 1 if you must. Then configure the system to write the journal with all data (data=journal), and writes will be really fast, even if they don't fully complete for a minute or so. Doesn't help reads, of course. Turns the journal into cache, sort of.

I have the feeling that sequential spin-up is an option in a driver, but I can't remember or quickly find where. I say this because I had to set it on one machine I had: spinning up the whole array at once caused the power supply to overload, and until I could get a bigger one that would fit I set an option. That was long enough ago that I can't remember where I found it.
Turn it off and all seven drives will ask for power at once, which probably isn't a great thing, but it's not my system.

--
Bill Davidsen <davidsen@tmr.com>
"We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Sleepy drives and MD RAID 6 2014-08-14 18:10 ` Bill Davidsen @ 2014-08-14 19:33 ` Adam Talbot 0 siblings, 0 replies; 17+ messages in thread From: Adam Talbot @ 2014-08-14 19:33 UTC (permalink / raw) To: Bill Davidsen; +Cc: linux-raid

I live in Portland, OR, USA. My basement runs about 68F (20C) peak, and that is with my NAS at 100% activity for 40+ hours. That is why cooling was not included in my calculation. One could also include 2-3 year drive warranties in the spin up/down equation.

Wow. This whole topic is becoming a really good wiki article.

I remember a spin-up staggering option somewhere in the firmware of the cards. I am going to check that tonight, before I spend any money.

I have two SSDs (240GB) in RAID 1. I never knew I could break the journal out onto other drives?! Got any links/information on how to do this?

On Thu, Aug 14, 2014 at 11:10 AM, Bill Davidsen <davidsen@tmr.com> wrote:
> Roman Mamedov wrote:
>>
>> On Mon, 11 Aug 2014 18:03:17 -0700
>> Adam Talbot <ajtalbot1@gmail.com> wrote:
>>
>>> I need help from the Linux RAID pros.
>>>
>>> To make a very long story short; I have a 7 disk in a RAID 6 array. I
>>> put the drives to sleep after 7 minutes of inactivity.
>>
>>
>> It is well known that repeatedly spinning a drive down/up is absolutely
>> the
>> worst possible thing you can do to it, from a long term reliability
>> standpoint.
>> So my personal suggestion would be to reconsider if you really want this.
>> The
>> power consumption from 7 spinning drives with no access should be no
>> higher
>> than 60-70 watt; IMHO saving that amount, is not something that's worth
>> risking
>> your disks and data for.
>>
> Unless you live in someplace really cold, there's the cost of pumping that
> heat out of the room. Running A/C can almost double your power cost, and
> running the room hot shortens your component life. In other words there's
> more cost than the power for most people. Even if my hardware can run at
> 90F, I can't.
> > There appears to be a partial solution, get a small SDD (<$100) and put the > journal on it. Get two, run RAID1 if you must. Then configure the system to > write the journal with all data (data=journal), and the write will be really > fast, even if they don't fully complete for a minute or so. Doesn't help > reads, of course. Turns the journal into cache, sort of. > > I have the feeling that sequential spin is an option in a driver, but I > can't remember or quickly find where. I say this because I had to set it on > one machine I had, spinning up the whole array at one caused the power > supply to overload, and until I could get a bigger one which would fit I set > an option. That was long enough that I can't remember where I found that. > Turn it off and all seven drives will ask for power at once, which probably > isn't a great thing, but not my system. > > -- > Bill Davidsen <davidsen@tmr.com> > "We have more to fear from the bungling of the incompetent than from > the machinations of the wicked." - from Slashdot > > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
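To answer the external-journal question raised above: this is an ext3/ext4 feature, not an md one. A rough sketch of the setup Bill describes, using hypothetical device names (/dev/md1 as the SSD mirror, /dev/md125 as the data filesystem) — check the mke2fs(8) and tune2fs(8) man pages before trying this on real data:

```shell
# Turn the SSD mirror into a dedicated journal device (destroys its contents).
mke2fs -O journal_dev /dev/md1
# With the data filesystem unmounted, drop its internal journal...
tune2fs -O ^has_journal /dev/md125
# ...and attach the external one instead.
tune2fs -j -J device=/dev/md1 /dev/md125
# Mount with full data journaling so writes hit the SSD journal first.
mount -o data=journal /dev/md125 /data
```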
end of thread, other threads:[~2014-08-28 21:04 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-12 1:03 Sleepy drives and MD RAID 6 Adam Talbot
2014-08-12 1:21 ` Larkin Lowrey
2014-08-12 1:31 ` Adam Talbot
2014-08-12 5:55 ` Can Jeuleers
2014-08-12 9:46 ` Wilson, Jonathan
2014-08-12 15:26 ` Adam Talbot
[not found] ` <CAH_2GhfBKiJAT=+zCFH0_xkmyTfWFi=c-F+z8rwL1x++UwU+KA@mail.gmail.com>
2014-08-13 16:07 ` Adam Talbot
2014-08-14 16:50 ` Adam Talbot
2014-08-14 17:00 ` Larkin Lowrey
2014-08-14 17:37 ` Adam Talbot
2014-08-14 18:05 ` Larkin Lowrey
2014-08-18 15:41 ` Adam Talbot
2014-08-18 16:16 ` Larkin Lowrey
[not found] ` <CAH_2GhcwHszr75bbDq2aWRnnF8-ttpGERKE=CKt=Nri3ue9row@mail.gmail.com>
2014-08-28 21:04 ` Adam Talbot
2014-08-12 3:29 ` Roman Mamedov
2014-08-14 18:10 ` Bill Davidsen
2014-08-14 19:33 ` Adam Talbot
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox