* RAID5 alignment issues with 4K/AF drives (WD green ones)
From: Michele Codutti @ 2011-12-29 23:28 UTC
To: linux-raid

Hi all, I'm writing to this mailing list because I cannot figure out why
I am seeing performance issues with my three WD20EARS (2TB Western
Digital "Green" 4K/AF drives).

These drives have a sequential write throughput of around 100 MB/s each.
When I combine them in a RAID0 configuration the throughput is around
300 MB/s, and in a RAID1 configuration they preserve the single-drive
performance of 100 MB/s. But when I combine all three drives in a RAID5
configuration, the per-drive performance falls to around 40 MB/s. I get
the same performance level when I do individual misaligned writes (e.g.
dd if=/dev/zero bs=6K of=/dev/sda).

The drives are not partitioned. I'm using the default chunk size (512K)
and the default metadata superblock version (1.2). I did not format the
RAID or any single drive during my tests; I used the raw devices
directly. I'm running Ubuntu 11.10 with the 3.0.0 Linux kernel and mdadm
3.1.4. The hardware is an HP MicroServer.

Could you give me some advice?
Thanks in advance.

Michele
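P.S. For reference, I created the arrays with plain defaults and tested
them with dd against the raw md device, along these lines (sda/sdb/sdc
and md0 stand in for the actual device names):

  # mdadm --create /dev/md0 --level=5 --raid-devices=3 \
      /dev/sda /dev/sdb /dev/sdc
  # dd if=/dev/zero of=/dev/md0 bs=1M count=8192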
* Re: RAID5 alignment issues with 4K/AF drives (WD green ones)
From: Zdenek Kaspar @ 2011-12-30 2:00 UTC
To: linux-raid

On 30.12.2011 0:28, Michele Codutti wrote:
> Hi all, I'm writing to this mailing list because I cannot figure out why
> I am seeing performance issues with my three WD20EARS (2TB Western
> Digital "Green" 4K/AF drives).
> [...]
> But when I combine all three drives in a RAID5 configuration, the
> per-drive performance falls to around 40 MB/s.
> [...]

There must be some misalignment somewhere :( Do all drives really report
as 4K to the OS - physical_block_size and logical_block_size under
/sys/block/sdX/queue/?

NB: how does it perform with partitions starting at sector 2048? (Check
all disks with fdisk -lu /dev/sdX.)

HTH, Z.
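P.S. A quick way to dump what the kernel sees for all three disks at
once (the sd[abc] glob assumes your device names):

  $ grep . /sys/block/sd[abc]/queue/*_block_size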
* Re: RAID5 alignment issues with 4K/AF drives (WD green ones)
From: Marcus Sorensen @ 2011-12-30 4:48 UTC
To: Zdenek Kaspar; Cc: linux-raid

No, those drives generally DON'T report 4k to the OS, even though they
are 4k internally. If they did, there'd be fewer problems. They lie and
say 512b sectors for compatibility.

My only suggestion would be to experiment with various partitioning,
starting the first partition at 2048s or at various other offsets, to
see if you can find a placement that aligns the partitions properly. I'm
sure there's an explanation, but I'm not in the mood to put on my
thinking hat to figure it out at the moment.

It may also be worth using a different superblock version, as 1.2 sits
4k from the start of the device, which might be messing with alignment
(although I would expect that to affect all array levels). It's worth
trying 0.90, which goes at the end of the device.

On Thu, Dec 29, 2011 at 7:00 PM, Zdenek Kaspar <zkaspar82@gmail.com> wrote:
> There must be some misalignment somewhere :( Do all drives really report
> as 4K to the OS - physical_block_size and logical_block_size under
> /sys/block/sdX/queue/?
> [...]
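P.S. For example, to retry with the old end-of-device superblock (device
names are examples, and recreating the array destroys its contents):

  # mdadm --create /dev/md0 --metadata=0.90 --level=5 --raid-devices=3 \
      /dev/sda /dev/sdb /dev/sdc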
* Re: RAID5 alignment issues with 4K/AF drives (WD green ones)
From: Mikael Abrahamsson @ 2011-12-30 4:52 UTC
To: Marcus Sorensen; Cc: Zdenek Kaspar, linux-raid

On Thu, 29 Dec 2011, Marcus Sorensen wrote:

> My only suggestion would be to experiment with various partitioning,

The poster already said they're not partitioned:

>> The drives are not partitioned. I'm using the default chunk size (512K)
>> and the default metadata superblock version (1.2).

My recommendation would be to look into the stripe-cache settings and to
check the iostat -x 5 output. What is most likely happening is that when
writing to the RAID5, md is reading as well (most likely to calculate
parity) and not just writing. iostat will confirm whether this is indeed
the case.

Also, using RAID5 with 2TB drives or larger is not recommended; use
RAID6
<http://www.zdnet.com/blog/storage/why-raid-5-stops-working-in-2009/162>.
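P.S. e.g. run this alongside the write test and watch the read columns
(r/s, rkB/s) on the member disks; sustained reads during a pure write
workload mean md is doing read-modify-write:

  $ iostat -x 5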
* Re: RAID5 alignment issues with 4K/AF drives (WD green ones)
From: Marcus Sorensen @ 2011-12-30 5:45 UTC
To: Mikael Abrahamsson; Cc: Zdenek Kaspar, linux-raid

On Thu, Dec 29, 2011 at 9:52 PM, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
>> My only suggestion would be to experiment with various partitioning,
>
> The poster already said they're not partitioned.

Correct. Using partitioning allows you to adjust the alignment, so, for
example, if the MD superblock at the front moves the start of the
exported MD device out of alignment with the base disks, you could
compensate for it by starting your partition at the correct offset.

> My recommendation would be to look into the stripe-cache settings and
> to check the iostat -x 5 output. What is most likely happening is that
> when writing to the RAID5, md is reading as well (most likely to
> calculate parity) and not just writing.
> [...]

If he's writing full stripes, he doesn't need to calculate parity by
reading. I'm not sure how the MD layer determines this, though; unless
he's adding a sync or O_DIRECT flag to his test, he should be writing
full stripes regardless of the block size he sets.
* Re: RAID5 alignment issues with 4K/AF drives (WD green ones)
From: Marcus Sorensen @ 2011-12-30 6:09 UTC
To: Mikael Abrahamsson; Cc: Zdenek Kaspar, linux-raid

I think we need more info on his test. If he's running the dd until he
exhausts his writeback cache to see what the disk speed is, then yes,
he'll run into having to read stripes to calculate parity, since he'll
effectively be forced to write 4k blocks synchronously (prior to kernel
3.1, where his thread will still get to use dirty memory but will just
be forced to sleep if the disk can't keep up). I have seen bumping the
stripe cache help significantly in these cases, and in the real world,
where you're not writing large full-stripe files.

Instead of doing a monster sequential write to find my disk speed, I
generally find it more useful to add conv=fdatasync to the dd, so that
the dirty buffers are utilized as they are in most real-world working
environments, but I don't get a result until the data is on disk.

On Thu, Dec 29, 2011 at 10:45 PM, Marcus Sorensen <shadowsor@gmail.com> wrote:
> If he's writing full stripes, he doesn't need to calculate parity by
> reading. I'm not sure how the MD layer determines this, though [...]
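P.S. A sketch of what I mean (target device and sizes are examples, and
writing to the raw md device destroys whatever is on it):

  $ dd if=/dev/zero of=/dev/md0 bs=1M count=4096 conv=fdatasync

dd won't print its throughput until the final fdatasync completes, so
the figure reflects what actually hit the disks.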
* Re: RAID5 alignment issues with 4K/AF drives (WD green ones)
From: Mikael Abrahamsson @ 2011-12-31 3:12 UTC
To: Marcus Sorensen; Cc: Zdenek Kaspar, linux-raid

On Thu, 29 Dec 2011, Marcus Sorensen wrote:

>> The poster already said they're not partitioned.
>
> Correct. Using partitioning allows you to adjust the alignment [...]

Unless he has used the XP jumper, it's impossible to misalign MD when
running without partitions, afaik.

> If he's writing full stripes, he doesn't need to calculate parity by
> reading. I'm not sure how the MD layer determines this, though; unless
> he's adding a sync or O_DIRECT flag to his test, he should be writing
> full stripes regardless of the block size he sets.

I've seen MD do 10% reads in this situation. I believe the handling of
this is not optimal, and sometimes there will be reads. Neil can
probably tell us a lot more about what might be going on.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se
* Re: RAID5 alignment issues with 4K/AF drives (WD green ones)
From: Brad Campbell @ 2011-12-30 6:24 UTC
To: Michele Codutti; Cc: linux-raid

On 30/12/11 07:28, Michele Codutti wrote:
> But when I combine all three drives in a RAID5 configuration, the
> per-drive performance falls to around 40 MB/s.
> [...]

Just a thought, but do you have the "XP mode" jumper removed on all
drives?

Regards,
Brad
* Re: RAID5 alignment issues with 4K/AF drives (WD green ones)
From: Michele Codutti @ 2011-12-30 21:04 UTC
To: linux-raid

Hi all, thanks for the tips. I'll reply to everyone in one aggregated
message:

> Just a thought, but do you have the "XP mode" jumper removed on all
> drives?

Yes.

> Instead of doing a monster sequential write to find my disk speed, I
> generally find it more useful to add conv=fdatasync to the dd [...]

Done; same results (40 MB/s).

> Correct. Using partitioning allows you to adjust the alignment [...]
> you could compensate for it by starting your partition at the correct
> offset.

Done. I created one big partition using parted with "-a optimal". The
partition layout is (fdisk-friendly output):

Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00077f06

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1            2048  3907028991  1953513472   fd  Linux raid autodetect

I redid the test with the conv=fdatasync option as above: same results.

> My only suggestion would be to experiment with various partitioning,
> [...] It may also be worth using a different superblock version [...]

I've tried all the superblock versions: 0, 0.9, 1.0, 1.1 and 1.2. Same
results.

> No, those drives generally DON'T report 4k to the OS, even though they
> are 4k internally. [...] They lie and say 512b sectors for
> compatibility.

Yes, they are dirty liars. It's the same for the EADS series, not only
the EARS ones.

> My recommendation would be to look into the stripe-cache settings and
> to check the iostat -x 5 output. What is most likely happening is that
> when writing to the RAID5, md is reading as well (most likely to
> calculate parity) and not just writing.

Could you explain how I can look into the stripe-cache settings?

This is one of many similar outputs from iostat -x 5 during the initial
rebuilding phase:

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.00   0.00    13.29     0.00    0.00  86.71

Device:  rrqm/s  wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda     6585.60    0.00 4439.20    0.00 44099.20     0.00    19.87     6.14  1.38    1.38    0.00  0.09 39.28
sdb     6280.40    0.00 4746.60    0.00 44108.00     0.00    18.59     5.20  1.10    1.10    0.00  0.07 35.04
sdc        0.00 9895.40    0.00 1120.80     0.00 44152.80    78.79    12.03 10.73    0.00   10.73  0.82 92.32

I also built a RAID6 (with one drive missing): same results.

> There must be some misalignment somewhere :(

Yes, it looks like the same behavior.

> Do all drives really report as 4K to the OS - physical_block_size and
> logical_block_size under /sys/block/sdX/queue/?

No, they lie about the block size, as you can also see in the fdisk
output above.

> NB: how does it perform with partitions starting at sector 2048?

The same.

Any other suggestions?

I almost forgot: I've also booted OpenSolaris and created a zfs pool
(aligned to 4k sectors) from the same three drives, and they perform
very well, individually and together. I know that I'm comparing apples
and oranges, but... there must be a solution!
* Re: RAID5 alignment issues with 4K/AF drives (WD green ones)
From: Zdenek Kaspar @ 2011-12-30 23:17 UTC
To: linux-raid

On 30.12.2011 22:04, Michele Codutti wrote:
>> Just a thought, but do you have the "XP mode" jumper removed on all
>> drives?
> Yes.
> [...]
> I almost forgot: I've also booted OpenSolaris and created a zfs pool
> (aligned to 4k sectors) from the same three drives, and they perform
> very well, individually and together. I know that I'm comparing apples
> and oranges, but... there must be a solution!

WTF is the jumper for, then? (on a 512B drive)

Does it change any of these:
/sys/block/sdX/queue/physical_block_size
/sys/block/sdX/queue/logical_block_size
/sys/block/sdX/alignment_offset

If osol can handle it (enforcing 4k), that's a good sign. (You used
ashift=12 for the pool, right?)

Z.
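P.S. If I remember right, zdb can show the ashift the pool actually got,
something like this (the pool name is an example):

  # zdb -C tank | grep ashift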
* Re: RAID5 alignment issues with 4K/AF drives (WD green ones)
From: Marcus Sorensen @ 2011-12-31 22:20 UTC
To: Zdenek Kaspar; Cc: linux-raid

> WTF is the jumper for, then? (on a 512B drive)
> Does it change any of these:
> /sys/block/sdX/queue/physical_block_size
> /sys/block/sdX/queue/logical_block_size
> /sys/block/sdX/alignment_offset

No, it doesn't change what the OS sees at all. Off the top of my head, I
think it just changes how the drive maps sectors internally, so that an
OS which normally starts the first partition on sector 63 will align
correctly, hence the "XP mode" name. That might not be exactly what's
going on, but it's something along those lines.

> Could you explain how I can look into the stripe-cache settings?

It's in /sys/block/md0/md/stripe_cache_size. This allows the system to
keep the contents of recently read stripes in memory, so if they need to
be modified again, it doesn't have to read from disk to calculate
parity.
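P.S. For example, to read it and raise it (md0 is an example name; the
value is a number of cache entries, each holding one page per member
device, default 256):

  # cat /sys/block/md0/md/stripe_cache_size
  256
  # echo 8192 > /sys/block/md0/md/stripe_cache_size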
* Re: RAID5 alignment issues with 4K/AF drives (WD green ones)
From: John Robinson @ 2011-12-31 15:53 UTC
To: Michele Codutti; Cc: linux-raid

On 30/12/2011 21:04, Michele Codutti wrote:
[...]
> This is one of many similar outputs from iostat -x 5 during the initial
> rebuilding phase:
> [...]
> I also built a RAID6 (with one drive missing): same results.

Hang on, are you saying you see the 40MB/s speeds during the initial
rebuilding phase? Yes, you will get those results. You are seeing
degraded mode performance in the RAID5, just as you are in the RAID6
with a missing drive. When the array is fully built, which may well take
a day or two, you can expect better. Check /proc/mdstat for the progress
of the initial build.

If you happen to know that your array is already in sync (which three
brand-new all-zero drives would be for RAID5), or you want to test
without waiting for a rebuild, you can use --assume-clean when creating
the array.

Cheers,

John.
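P.S. Something along these lines (device names are examples;
--assume-clean skips the initial resync, so only use it when the members
really are in sync, e.g. three factory-zeroed drives for RAID5):

  # mdadm --create /dev/md0 --level=5 --raid-devices=3 --assume-clean \
      /dev/sda /dev/sdb /dev/sdc
  $ cat /proc/mdstat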