* fio results show sequential reads and writes better for network block device than local block device?
From: K.R Kishore @ 2013-11-19 23:26 UTC
To: fio@vger.kernel.org; +Cc: krkishore@yahoo.com

Hi,

I am trying to compare the performance of a locally attached block device (SSD) with a network-attached block device (SSD), and I am seeing results for sequential reads and writes that I cannot explain. The other results (random reads, writes, etc.) are as expected, i.e. local is better than remote.

Here is my setup:
- Two machines connected back-to-back by a 10G link
- Running RHEL 6.4 (Santiago), 2.6.32-358.6.1.el6.x86_64
- Running nbd v2.9.20 (http://nbd.sourceforge.net/)
- Running fio v2.1.2
- Using identical SSDs on both machines: Samsung 840 PRO, 128G
- All 128G exported as a rw volume

I have my fio commands and output (only the relevant portions) below. I cannot understand how the network device can have higher throughput than the local device. When I use smaller block sizes to measure IOPS, the numbers are as expected (local > remote).

Has anyone tried fio on nbd? Does fio consider a transaction done when it sees the block I/O request handed off to the virtual device, and assume TCP will take care of completing the transaction? I can see that it might do so for posted operations such as writes, but reads?

Any clues?
thx,
Kishore

#-----------------------------------------------------------------------------
#### Sequential Write Bandwidth test: Locally attached SSD

fio --name=writebw --filename=/dev/sdb --direct=1 --rw=write --bs=1m \
    --numjobs=4 --iodepth=32 --iodepth_batch=16 --iodepth_batch_complete=16 \
    --runtime=300 --ramp_time=5 --norandommap --time_based \
    --ioengine=libaio --group_reporting > sdb_seqwrite_bw.out

---- output ----
writebw: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
...
writebw: (groupid=0, jobs=4): err= 0: pid=18689: Tue Nov 19 10:27:08 2013
  write: io=73932MB, bw=252024KB/s, iops=245, runt=300393msec
    slat (msec): min=1, max=262, avg=96.57, stdev=62.94
    clat (msec): min=74, max=808, avg=423.21, stdev=90.93
     lat (msec): min=194, max=912, avg=519.78, stdev=68.03
...

#### Sequential Write Bandwidth test: Network attached SSD (nbd0 is exposed as a NetworkBlockDevice)

fio --name=writebw --filename=/dev/nbd0 --direct=1 --rw=write --bs=1m \
    --numjobs=4 --iodepth=32 --iodepth_batch=16 --iodepth_batch_complete=16 \
    --runtime=300 --ramp_time=5 --norandommap --time_based \
    --ioengine=libaio --group_reporting > ndb0_seqwrite_bw.out

---- output ----
writebw: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
...
writebw: (groupid=0, jobs=4): err= 0: pid=8570: Mon Nov 18 18:18:34 2013
  write: io=142764MB, bw=487036KB/s, iops=475, runt=300163msec
    slat (msec): min=1, max=865, avg=134.08, stdev=70.20
    clat (msec): min=35, max=865, avg=134.61, stdev=70.20
     lat (msec): min=148, max=1029, avg=268.65, stdev=103.63
...
#-----------------------------------------------------------------------------
#### Sequential Read Bandwidth test: Locally attached SSD

fio --name=readbw --filename=/dev/sdb --direct=1 --rw=read --bs=1m \
    --numjobs=4 --iodepth=32 --iodepth_batch=16 --iodepth_batch_complete=16 \
    --runtime=300 --ramp_time=5 --norandommap --time_based \
    --ioengine=libaio --group_reporting > sdb_seqread_bw.out

---- output ----
readbw: (g=0): rw=read, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
...
readbw: (groupid=0, jobs=4): err= 0: pid=18943: Tue Nov 19 10:47:32 2013
  read : io=81596MB, bw=278285KB/s, iops=271, runt=300247msec
    slat (msec): min=2, max=181, avg=96.16, stdev=48.97
    clat (msec): min=58, max=1161, avg=375.17, stdev=192.42
     lat (msec): min=223, max=1335, avg=471.37, stdev=197.63
...

#### Sequential Read Bandwidth test: Network attached SSD (nbd0 is exposed as a NetworkBlockDevice)

fio --name=readbw --filename=/dev/nbd0 --direct=1 --rw=read --bs=1m \
    --numjobs=4 --iodepth=32 --iodepth_batch=16 --iodepth_batch_complete=16 \
    --runtime=300 --ramp_time=5 --norandommap --time_based \
    --ioengine=libaio --group_reporting > ndb0_seqread_bw.out

---- output ----
readbw: (g=0): rw=read, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
...
readbw: (groupid=0, jobs=4): err= 0: pid=8781: Mon Nov 18 18:38:56 2013
  read : io=115692MB, bw=394691KB/s, iops=385, runt=300155msec
    slat (msec): min=77, max=405, avg=165.67, stdev=25.84
    clat (msec): min=39, max=405, avg=166.11, stdev=25.58
     lat (msec): min=168, max=621, avg=332.26, stdev=28.49
...
#-----------------------------------------------------------------------------
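As a quick sanity check on the four summary lines above, here is a minimal Python sketch that pulls out bw and iops, confirms that bandwidth divided by IOPS comes out near the 1M block size, and computes the NBD-to-local ratio. The helper names (parse_summary, implied_block_kb, nbd_to_local_ratio) are made up for this illustration and are not part of fio; the regular expression only targets the summary format shown in this thread.

```python
import re

# Summary lines copied from the four fio runs quoted in this message.
RUNS = {
    ("write", "local"): "write: io=73932MB, bw=252024KB/s, iops=245, runt=300393msec",
    ("write", "nbd"): "write: io=142764MB, bw=487036KB/s, iops=475, runt=300163msec",
    ("read", "local"): "read : io=81596MB, bw=278285KB/s, iops=271, runt=300247msec",
    ("read", "nbd"): "read : io=115692MB, bw=394691KB/s, iops=385, runt=300155msec",
}

# Matches only the "bw=...KB/s, iops=..." part of the summary line.
SUMMARY_RE = re.compile(r"bw=(\d+)KB/s, iops=(\d+)")


def parse_summary(line):
    """Return (bandwidth in KB/s, IOPS) from one fio summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a fio summary line: {line!r}")
    return int(match.group(1)), int(match.group(2))


def implied_block_kb(line):
    """Bandwidth divided by IOPS, i.e. the average KB moved per I/O."""
    bw_kb, iops = parse_summary(line)
    return bw_kb / iops


def nbd_to_local_ratio(direction):
    """How many times faster the NBD run is than the local run."""
    bw_local, _ = parse_summary(RUNS[(direction, "local")])
    bw_nbd, _ = parse_summary(RUNS[(direction, "nbd")])
    return bw_nbd / bw_local


for key, line in RUNS.items():
    print(key, "KB per I/O:", round(implied_block_kb(line), 1))
for direction in ("write", "read"):
    print(direction, "nbd/local:", round(nbd_to_local_ratio(direction), 2))
```

Every run works out to roughly 1024 KB per I/O, so the bw and iops figures agree with each other and with bs=1m. The NBD device reports about 1.9 times the local bandwidth for writes and about 1.4 times for reads.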
* Re: fio results show sequential reads and writes better for network block device than local block device?
From: David Nellans @ 2013-11-19 23:48 UTC
To: K.R Kishore, fio@vger.kernel.org

278MB/s read bandwidth to a locally attached Samsung 840 PRO on 1M sequential reads is very low, unless you have it accidentally plugged into a SATA 3Gb/s port instead of a 6Gb/s one. I'd sort out why you're not seeing 500MB/s+ on this as a starting point for your investigation.

Also, sequential performance probably isn't what you want to look at for a long-latency block device (as opposed to one without the network in the way), as I/O merging could become the dominant factor for performance even when using large block sizes to start with.

Your latency data from the runs looks funny too, with the NBD latency being lower than the locally attached on writes, but not for reads. That would seem to indicate there is some buffering going on in the system that you're not aware of, which is making your results noisy (and confusing).
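One way to read the latency figures is through Little's law (mean requests in flight = throughput x mean latency). The sketch below is only a consistency check on the numbers posted in this thread; it assumes the 4 jobs each kept their queue of 32 full, and it says nothing about how fio takes its measurements internally. The names RUNS, TARGET_IN_FLIGHT and implied_in_flight are illustrative.

```python
# Average total latency ("lat", msec) and IOPS from the four runs posted above.
RUNS = {
    "local write": {"iops": 245, "lat_ms": 519.78},
    "nbd write": {"iops": 475, "lat_ms": 268.65},
    "local read": {"iops": 271, "lat_ms": 471.37},
    "nbd read": {"iops": 385, "lat_ms": 332.26},
}

NUMJOBS = 4   # --numjobs=4
IODEPTH = 32  # --iodepth=32

# Assumption: every job keeps its queue full for the whole run.
TARGET_IN_FLIGHT = NUMJOBS * IODEPTH


def implied_in_flight(iops, lat_ms):
    """Little's law: mean requests in flight = throughput x mean latency."""
    return iops * (lat_ms / 1000.0)


for name, run in RUNS.items():
    in_flight = implied_in_flight(run["iops"], run["lat_ms"])
    print(f"{name}: {in_flight:.1f} in flight (target {TARGET_IN_FLIGHT})")
```

All four runs come out within about one request of 128 in flight. Read this way, the lower average latency on the NBD device is the higher NBD throughput seen from another angle: with the queue pinned at 128 outstanding requests, one figure follows from the other, so the latency column does not by itself say where the extra throughput comes from.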
* Re: fio results show sequential reads and writes better for network block device than local block device?
From: K.R Kishore @ 2013-11-20 2:16 UTC
To: David Nellans, fio@vger.kernel.org

David,

Thanks for the response.

> 278MB/s read bandwidth to a locally attached samsung 840 pro on 1M
> sequential reads is very low unless you have it accidentally plugged
> into a SATA 3Gb/s port instead of a 6Gb/s. I'd sort out why you're not
> seeing 500 MB+ on this as starting point for your investigation.

I thought this was a good catch, so I tried

hdparm -I /dev/sdb|egrep -i "Model|speed"

and I get the same on both machines:

[root@lab-sj1-141 uc]# hdparm -I /dev/sdb|egrep -i "Model|speed"
        Model Number:       Samsung SSD 840 PRO Series
           *    Gen1 signaling speed (1.5Gb/s)
           *    Gen2 signaling speed (3.0Gb/s)
[root@lab-sj1-141 uc]#

Does this imply they are running at 3.0Gb/s with a peak rate of 300MB/s? I am using a Dell Precision T3600 workstation, and according to the specs it has 6G SAS ports, which is where these drives are connected. I am not sure if this needs to be enabled in some way. I rebooted and went through the BIOS settings and did not see anything in the drives/storage sections.

I ran the test on both machines and both got ~279MB/s for sequential reads. This does not explain why fio gives a higher number when one of the drives is exported over the network?!

> Also, Sequential performance probably isn't what you want to look at for
> a long latency block device (as opposed to without the network in the
> way) as io merging could become the dominant factor for performance even
> when using large block sizes to start.

Your point is noted. I ran all combinations of tests (read, write, readwrite, randread, randwrite, randrw) and did so with block sizes of 1M and 512. I was looking for some consistency and trying to quantify the effect of latency on performance.

> your latency data from the runs looks funny too - with the NBD latency
> being lower than the locally attached on writes, but not for reads.
> that would seem to indicate there is some buffering going on in the
> system that you're not aware of that is making your results noisy (and
> confusing)

I agree that the latency number is confusing. I am trying to understand how fio is measuring the latency for an NBD, and maybe that will help sort this out.

thx,
Kishore
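The 300MB/s figure asked about above can be checked with a little arithmetic. The sketch below assumes SATA's 8b/10b line encoding (10 bits on the wire per payload byte) and assumes that fio 2.1.2 reports KB/s in units of 1024 bytes, which fits the roughly 1024 KB per 1M I/O seen in the posted results. The function names and the MEASURED_KB table are illustrative only.

```python
def sata_payload_mb_per_s(line_rate_gbps):
    """Payload ceiling in MB/s for a SATA link, assuming 8b/10b encoding."""
    bits_per_second = line_rate_gbps * 1e9
    return bits_per_second / 10 / 1e6  # 10 line bits carry one payload byte


def fio_kb_to_mb_per_s(bw_kb):
    """Convert fio KB/s (assumed to be 1024-byte units) to MB/s (10^6 bytes)."""
    return bw_kb * 1024 / 1e6


# Bandwidth figures (KB/s) from the four runs posted earlier in the thread.
MEASURED_KB = {
    "local write": 252024,
    "local read": 278285,
    "nbd write": 487036,
    "nbd read": 394691,
}

ceiling = sata_payload_mb_per_s(3.0)
for name, bw_kb in MEASURED_KB.items():
    mb_s = fio_kb_to_mb_per_s(bw_kb)
    side = "above" if mb_s > ceiling else "below"
    print(f"{name}: {mb_s:.0f} MB/s, {side} the {ceiling:.0f} MB/s ceiling")
```

Under these assumptions a 3.0Gb/s link tops out at 300MB/s of payload and a 6.0Gb/s link at 600MB/s. Both local runs sit just under 300MB/s, while both NBD runs are above it. If the exported SSD really is behind a 3.0Gb/s link as well, that would be hard to square with every byte reaching the remote drive during the run, and it would fit the buffering suspicion raised earlier in the thread.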
* Re: fio results show sequential reads and writes better for network block device than local block device?
From: David Nellans @ 2013-11-27 22:11 UTC
To: K.R Kishore, fio@vger.kernel.org

Yes, 3.0Gb/s means that you're only getting half the throughput that particular drive should be able to give you. If they're not recognized as 6Gb ports then something in the BIOS might need to be swizzled, upgraded, something...

You say you ran random tests as well already. Does the same result hold for random as for sequential? Does the NBD still give you higher throughput than the native device?

While you certainly could have a really high-latency SAS/SATA controller, it seems unlikely that nbd could both do a network round trip and get through the userspace nbd-client with lower latency than the local controller. Doing synchronous, queue-depth-one I/Os of a small block size (512) will give you a good picture of the minimum latency you can get from the local controller versus the nbd-based disk, to try and sort out the latency issue.
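The queue-depth-one test suggested above could be set up roughly as follows. This is a sketch, not a command taken from the thread: the choice of random reads (so the device contents are not modified), the job name and the 60-second runtime are assumptions made for illustration, and qd1_latency_cmd is a hypothetical helper that only builds the argument list without running fio.

```python
import shlex


def qd1_latency_cmd(device, name="qd1lat", runtime_s=60):
    """Build a fio argv for synchronous, queue-depth-1, 512-byte direct reads."""
    return [
        "fio",
        f"--name={name}",
        f"--filename={device}",
        "--rw=randread",    # assumption: read-only, random offsets
        "--bs=512",         # small block size, as suggested in the reply
        "--ioengine=sync",  # synchronous I/O: one request at a time
        "--iodepth=1",
        "--numjobs=1",
        "--direct=1",       # bypass the page cache on the client side
        f"--runtime={runtime_s}",
        "--time_based",
    ]


# Same job against the local disk and the NBD device, for a like-for-like comparison.
for dev in ("/dev/sdb", "/dev/nbd0"):
    print(shlex.join(qd1_latency_cmd(dev)))
```

With a single outstanding request there is no queueing on the fio side, so the reported clat should be close to the per-request service time of each path, which is the number the two devices would then be compared on.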