* slow sequential read on partitioned raid6
From: Nicolae Mihalache
Date: 2010-03-16 19:05 UTC
To: linux-raid

Hello,

I have created a partitioned raid6 array over 6x 1TB SATA disks using the
command (from memory):

  mdadm --create /dev/md_d1 --auto=mdp --level=6 --raid-devices=6 /dev/sd[b-g]

When I run a sequential read test using

  dd if=/dev/md_d1p1 of=/dev/null bs=1M

I get low read speeds of around 80 MB/s, but only when the partition is
mounted. If I unmount it, the speed is around 350 MB/s. The filesystems I
tried are ext3 and xfs.

The partitions were created with gparted, the partition table being of type
GPT.

If I create normal /dev/sdx1 partitions on each disk and then make a
/dev/md1 raid6 array over them, the read speed is fine.

I played with different read-ahead settings; while they changed the read
speed, the change was only marginal, staying around the values reported
above.

Can somebody explain what the difference is when accessing the raw device
while it is mounted versus unmounted? Also, when playing with those
read-ahead settings it was not clear how, or whether, the read-ahead of the
individual disks is taken into account.

With large read-ahead values, iostat shows that the tps of the individual
disks roughly doubles when reading the mounted partition compared to reading
it unmounted, despite the speed being three times lower. It is as if reading
the mounted partition also reads some other parts of the disks. I could not
find a way to print the blocks read from the individual disks: the sysctl
vm.block_dump=1 makes the kernel print the block numbers on the md array,
but not on the components of the array.

The system is Debian 5 with kernel 2.6.26-2-686.

Thanks for any hint on how to further debug the problem.

nicolae
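blktrace can show the per-request traffic on each component disk, which is
something vm.block_dump cannot do for array members. A minimal sketch,
assuming the blktrace and blkparse tools are installed (they are not
mentioned in the report above):

  # trace one component disk while the dd test runs in another terminal;
  # blkparse prints each request's sector offset and size as it completes
  blktrace -d /dev/sdb -o - | blkparse -i -

  # or capture traces from two components for 30 seconds and read them later
  blktrace -d /dev/sdb -d /dev/sdc -w 30
  blkparse sdb sdc > trace.txt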
* Re: slow sequential read on partitioned raid6
From: Neil Brown
Date: 2010-03-16 22:22 UTC
To: Nicolae Mihalache; Cc: linux-raid

On Tue, 16 Mar 2010 20:05:45 +0100 Nicolae Mihalache <mache@abcpages.com> wrote:

> Hello,
>
> I have created a partitioned raid6 array over 6x 1TB SATA disks using the
> command (from memory):
>   mdadm --create /dev/md_d1 --auto=mdp --level=6 --raid-devices=6 /dev/sd[b-g]
>
> When I run a sequential read test using
>   dd if=/dev/md_d1p1 of=/dev/null bs=1M
> I get low read speeds of around 80 MB/s, but only when the partition is
> mounted.
>
> If I unmount it, the speed is around 350 MB/s. The filesystems I tried are
> ext3 and xfs.

Thanks for reporting this.

I just did some testing and I get the reverse!!

When a filesystem is mounted I get 135MB/s. When it isn't mounted
I get 64MB/s.

I cannot think what could cause this. I will have to explore.
Can you please double-check your results and confirm that it definitely
is faster when unmounted.

> If I create normal /dev/sdx1 partitions on each disk and then make a
> /dev/md1 raid6 array over them, the read speed is fine.
>
> I played with different read-ahead settings; while they changed the read
> speed, the change was only marginal, staying around the values reported
> above.
>
> Can somebody explain what the difference is when accessing the raw device
> while it is mounted versus unmounted? Also, when playing with those
> read-ahead settings it was not clear how, or whether, the read-ahead of
> the individual disks is taken into account.

Only the read-ahead value of the array is considered. The read-ahead
settings of the individual devices in the array are ignored.

NeilBrown
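For reference, those read-ahead values can be inspected and changed with
blockdev(8). A small sketch using the device names from this thread; values
are reported in 512-byte sectors:

  blockdev --getra /dev/md_d1      # whole-array device
  blockdev --getra /dev/md_d1p1    # partition on the array
  blockdev --getra /dev/sdb        # component disk (ignored by md, per the reply above)
  blockdev --setra 4096 /dev/md_d1 # example of tuning it on the array device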
* Re: slow sequential read on partitioned raid6
From: Nicolae Mihalache
Date: 2010-03-16 23:16 UTC
To: Neil Brown; Cc: linux-raid

On 03/16/2010 11:22 PM, Neil Brown wrote:
> [...]
> Thanks for reporting this.
>
> I just did some testing and I get the reverse!!
>
> When a filesystem is mounted I get 135MB/s. When it isn't mounted
> I get 64MB/s.
>
> I cannot think what could cause this. I will have to explore.
> Can you please double-check your results and confirm that it definitely
> is faster when unmounted.

I'm positive that it's slow when mounted; that's how I discovered the
problem. See below (I recreated the array over 1/10 of the original disks
to be able to test more easily). In fact, the highest speed I get is when
accessing the entire device directly, even while one partition is mounted.

bacula:~# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md_d1 : active raid6 sdg1[5] sdf1[4] sde1[3] sdd1[2] sdc1[1] sdb1[0]
      390668288 blocks level 6, 64k chunk, algorithm 2 [6/6] [UUUUUU]

md2 : active raid1 sdi1[0] sdj1[1]
      1462750272 blocks [2/2] [UU]

unused devices: <none>

bacula:~# parted /dev/md_d1
GNU Parted 1.8.8
Using /dev/md_d1
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print
Model: Unknown (unknown)
Disk /dev/md_d1: 400GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name     Flags
 1      17.4kB  50.0GB  50.0GB  ext3         primary

(parted) quit

bacula:~# umount /dev/md_d1p1
umount: /dev/md_d1p1: not mounted

bacula:~# dd if=/dev/md_d1p1 of=/dev/null bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 37.4938 s, 280 MB/s

bacula:~# mount /dev/md_d1p1 /mnt

bacula:~# dd if=/dev/md_d1p1 of=/dev/null bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 132.894 s, 78.9 MB/s

bacula:~# dd if=/dev/md_d1 of=/dev/null bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 28.222 s, 372 MB/s
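When repeating back-to-back dd reads like these, the page cache can skew the
numbers; dropping it between runs keeps the comparison clean. A small sketch,
not something done in the test above:

  sync                                # flush dirty data first
  echo 3 > /proc/sys/vm/drop_caches   # drop page cache, dentries and inodes
  dd if=/dev/md_d1p1 of=/dev/null bs=1M count=10000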
[parent not found: <1268783497.3781.14.camel@localhost.localdomain>]
* Re: slow sequential read on partitioned raid6
From: Nicolae Mihalache
Date: 2010-03-17 8:23 UTC
To: linux-raid

I created a second 100GB partition on all the disks and then made a normal
/dev/md1 raid6 array out of them. The results I get:

bacula:~# dd if=/dev/zero of=/mnt1/test-file bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 72.6303 s, 144 MB/s

bacula:~# dd if=/mnt1/test-file of=/dev/null bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 29.1241 s, 360 MB/s

I really believe it's something with the partitioned array.
/proc/devices shows:

Block devices:
...
  9 md
...
253 mdp

All the md_d1 partitions have major number 253. I don't know if this means
something, but maybe there is a bug in the mdp driver (or whatever it is
called).

nicolae

Daniel Reurich wrote:
> On Wed, 2010-03-17 at 00:16 +0100, Nicolae Mihalache wrote:
>> [...]
>> I'm positive that it's slow when mounted; that's how I discovered the
>> problem. See below (I recreated the array over 1/10 of the original
>> disks to be able to test more easily). In fact, the highest speed I get
>> is when accessing the entire device directly, even while one partition
>> is mounted.
>> [...]
>
> Why are you reading directly from the block devices when they contain a
> mounted filesystem? Surely the fs layer would be holding locks on the
> block device, causing it to slow down raw-layer access.
>
> Might I suggest you should be reading files that are located within the
> mounted filesystem.
>
> I suggest you try this in the mounted filesystem:
>
>   dd if=/dev/zero of=/mnt/test-file bs=1M count=10000
>   dd if=/mnt/test-file of=/dev/null bs=1M
>   rm /mnt/test-file
>
> I hope this helps.
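While either of those dd runs is in progress, per-disk traffic can be watched
from a second terminal, which makes the doubled tps mentioned earlier in the
thread directly visible. A sketch, using the component disk names from this
thread and the documented sysstat argument order (devices before interval):

  # extended per-device statistics, refreshed every second
  iostat -x sdb sdc sdd sde sdf sdg 1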
* Re: slow sequential read on partitioned raid6
From: Michael Evans
Date: 2010-03-18 2:40 UTC
To: Nicolae Mihalache; Cc: linux-raid

On Wed, Mar 17, 2010 at 1:23 AM, Nicolae Mihalache <mache@abcpages.com> wrote:
> I created a second 100GB partition on all the disks and then made a normal
> /dev/md1 raid6 array out of them. The results I get:
>
> bacula:~# dd if=/dev/zero of=/mnt1/test-file bs=1M count=10000
> 10485760000 bytes (10 GB) copied, 72.6303 s, 144 MB/s
>
> bacula:~# dd if=/mnt1/test-file of=/dev/null bs=1M count=10000
> 10485760000 bytes (10 GB) copied, 29.1241 s, 360 MB/s
>
> I really believe it's something with the partitioned array.
> /proc/devices shows:
>
> Block devices:
> ...
>   9 md
> ...
> 253 mdp
>
> All the md_d1 partitions have major number 253. I don't know if this means
> something, but maybe there is a bug in the mdp driver (or whatever it is
> called).
>
> nicolae
> [...]

First off, why not use a hard-disk benchmark utility (their names escape me
aside from Bonnie++) which has these issues worked out?

Second, if you absolutely must do a benchmark with basic tools (which buffer
and use the cache), try this:

  dd if=/dev/zero bs=1M count=10000 | tr '\0' 't' > testfile
  dd if=testfile of=/dev/null bs=1M

You may note that you'll be writing a file of 't' characters instead of a
file of zeros; my method should not be detected as sparse, whereas the case
with zeros probably will be detected as sparse and simply not stored.

If in doubt you can check the size of the file on disk with ls -ls.
If I'm reading the output correctly, the leftmost column (size on disk) is
in kilobyte units, even on an ext4 filesystem with 4 KB blocks.
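To make the on-disk versus apparent size comparison explicit, a small sketch
(GNU coreutils assumed; 'testfile' is the file created above):

  ls -ls testfile                  # first column: blocks actually allocated, 1 KB units by default
  du -k testfile                   # allocated size in KB
  du -k --apparent-size testfile   # logical length in KB; much larger than the
                                   # allocated size if the file is sparse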
* Re: slow sequential read on partitioned raid6
From: Nicolae Mihalache
Date: 2010-03-19 6:47 UTC
To: Michael Evans; Cc: linux-raid

Actually my problem, as written in the subject of the mail, was that the
sequential read was slow. Somebody suggested using a file instead of the
raw partition. If the file were detected as sparse (who does that??), it
would be even faster to read, not slower.

nicolae

On 03/18/2010 03:40 AM, Michael Evans wrote:
> First off, why not use a hard-disk benchmark utility (their names escape
> me aside from Bonnie++) which has these issues worked out?
>
> Second, if you absolutely must do a benchmark with basic tools (which
> buffer and use the cache), try this:
>
>   dd if=/dev/zero bs=1M count=10000 | tr '\0' 't' > testfile
>   dd if=testfile of=/dev/null bs=1M
>
> You may note that you'll be writing a file of 't' characters instead of a
> file of zeros; my method should not be detected as sparse, whereas the
> case with zeros probably will be detected as sparse and simply not stored.
>
> If in doubt you can check the size of the file on disk with ls -ls.
> If I'm reading the output correctly, the leftmost column (size on disk)
> is in kilobyte units, even on an ext4 filesystem with 4 KB blocks.
* Re: slow sequential read on partitioned raid6
From: Michael Evans
Date: 2010-03-19 8:16 UTC
To: Nicolae Mihalache; Cc: linux-raid

On Thu, Mar 18, 2010 at 11:47 PM, Nicolae Mihalache <mache@abcpages.com> wrote:
> Actually my problem, as written in the subject of the mail, was that the
> sequential read was slow. Somebody suggested using a file instead of the
> raw partition. If the file were detected as sparse (who does that??), it
> would be even faster to read, not slower.
>
> nicolae
> [...]

Some versions of standard system utilities may do that by default. They only
have to preserve the file's contents, not its exact on-disk structure. I've
been told (by developers on the GNU project that includes it) that dd is
supposed to do it, at least in recent versions; probably other utilities like
cp could have it on by default as well.
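For reference, in GNU coreutils the sparse handling is an explicit option
rather than purely silent behaviour. A sketch, where the file names are made
up and conv=sparse exists only in newer dd releases:

  dd if=origfile of=sparsecopy bs=1M conv=sparse   # skip writing blocks that are all zeros
  cp --sparse=always origfile sparsecopy2          # create holes wherever possible
  cp --sparse=never  origfile densecopy            # force full allocation
  ls -ls sparsecopy2 densecopy                     # compare the allocated sizes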