* Bandwidth issues and blktrace debug
@ 2016-07-21 10:36 Gil Shomron
From: Gil Shomron @ 2016-07-21 10:36 UTC (permalink / raw)
Hi all,
I have an Intel P3700 SSD installed in a Lenovo x3650 M5 server with two
Intel Xeon E5-2630 v3 CPUs. The server runs Ubuntu 14.04 with kernel 4.6.4.
I've been benchmarking the SSD with fio, using synchronous sequential reads
with a 1MB block size. The resulting bandwidth is ~1.6GB/sec, well below the
~2.8GB/sec the drive should deliver. I have seen a P3700 reach that
bandwidth with a similar benchmark on a high-end PC.
This is the fio command I'm using:
sudo fio --time_based --name=benchmark --size=1G --runtime=15
--blocksize=1M --ioengine=sync --randrepeat=0 --iodepth=1 --direct=1
--sync=1 --verify_fatal=0 --numjobs=1 --rw=read --group_reporting
To debug this I installed blktrace and ran it on both the server and the
high-end PC for comparison; see below.
Notice that the server trace contains a bunch of splits (X) while the PC
trace has none, and that on the server the queued and dispatched totals (at
the end) don't match. What is going on?
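As a rough sketch (assuming the default blkparse field layout shown in the traces below, and a hypothetical helper name), the action and request size can be pulled out of each Q/G/D/C line like this:

```python
# Sketch: decode one blkparse line into (action, sector, size in KiB).
# Assumed field layout, per the default blkparse output below:
#   "maj,min cpu seq timestamp pid ACTION RWBS sector + nsectors [proc]"
# Note: X (split) lines use "sector / sector" instead and need different
# handling; this helper is only for the "+ nsectors" form.

def parse_blkparse_line(line):
    fields = line.split()
    action = fields[5]          # e.g. A (remap), Q, G, D, C
    sector = int(fields[7])     # starting sector
    nsectors = int(fields[9])   # length in 512-byte sectors
    return action, sector, nsectors * 512 // 1024  # size in KiB

action, sector, kib = parse_blkparse_line(
    "259,1 0 94210 1.353042800 1597 Q R 410329344 + 1792 [fio]")
print(action, sector, kib)  # -> Q 410329344 896
```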
Server trace (bad):
259,0 0 94207 1.353041898 1597 A R 410329088 + 2048 <-
(259,1) 410327040
259,1 0 94208 1.353042039 1597 Q R 410329088 + 2048 [fio]
259,1 0 94209 1.353042709 1597 X R 410329088 / 410329344 [fio]
259,1 0 94210 1.353042800 1597 Q R 410329344 + 1792 [fio]
259,1 0 94211 1.353042938 1597 G R 410329088 + 256 [fio]
259,1 0 94212 1.353043649 1597 X R 410329344 / 410329600 [fio]
259,1 0 94213 1.353043735 1597 Q R 410329600 + 1536 [fio]
259,1 0 94214 1.353043799 1597 G R 410329344 + 256 [fio]
259,1 0 94215 1.353044905 1597 D RS 410329088 + 256 [fio]
259,1 0 94216 1.353045604 1597 X R 410329600 / 410329856 [fio]
259,1 0 94217 1.353045690 1597 Q R 410329856 + 1280 [fio]
259,1 0 94218 1.353045879 1597 G R 410329600 + 256 [fio]
259,1 0 94219 1.353046790 1597 D RS 410329344 + 256 [fio]
259,1 0 94220 1.353047486 1597 X R 410329856 / 410330112 [fio]
259,1 0 94221 1.353047573 1597 Q R 410330112 + 1024 [fio]
259,1 0 94222 1.353047638 1597 G R 410329856 + 256 [fio]
259,1 0 94223 1.353048579 1597 D RS 410329600 + 256 [fio]
259,1 0 94224 1.353049292 1597 X R 410330112 / 410330368 [fio]
259,1 0 94225 1.353049378 1597 Q R 410330368 + 768 [fio]
259,1 0 94226 1.353049443 1597 G R 410330112 + 256 [fio]
259,1 0 94227 1.353050335 1597 D RS 410329856 + 256 [fio]
259,1 0 94228 1.353051032 1597 X R 410330368 / 410330624 [fio]
259,1 0 94229 1.353051118 1597 Q R 410330624 + 512 [fio]
259,1 0 94230 1.353051184 1597 G R 410330368 + 256 [fio]
259,1 0 94231 1.353052124 1597 D RS 410330112 + 256 [fio]
259,1 0 94232 1.353052830 1597 X R 410330624 / 410330880 [fio]
259,1 0 94233 1.353052915 1597 Q R 410330880 + 256 [fio]
259,1 0 94234 1.353052982 1597 G R 410330624 + 256 [fio]
259,1 0 94235 1.353053918 1597 D RS 410330368 + 256 [fio]
259,1 0 94236 1.353054583 1597 G R 410330880 + 256 [fio]
259,1 0 94237 1.353055518 1597 D RS 410330624 + 256 [fio]
259,1 0 94238 1.353055809 1597 U N [fio] 1
259,1 0 94239 1.353055885 1597 I RS 410330880 + 256 [fio]
259,1 0 94240 1.353056752 1597 D RS 410330880 + 256 [fio]
259,1 0 94241 1.353479359 0 C RS 410329344 + 256 [0]
259,1 0 94242 1.353485238 0 C RS 410329088 + 256 [0]
259,1 0 94243 1.353492491 0 C RS 410329600 + 256 [0]
259,1 0 94244 1.353520227 0 C RS 410330368 + 256 [0]
259,1 0 94245 1.353569613 0 C RS 410330880 + 256 [0]
259,1 0 94246 1.353574450 0 C RS 410329856 + 256 [0]
259,1 0 94247 1.353578150 0 C RS 410330112 + 256 [0]
259,1 0 94248 1.353585196 0 C RS 410330624 + 256 [0]
...
Total (259,1):
Reads Queued: 199808, 115089MiB Writes Queued: 0, 0KiB
Read Dispatches: 199808, 25575MiB Write Dispatches: 0, 0KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 199808, 25575MiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 0, 0KiB
IO unplugs: 24976 Timer unplugs: 0
PC trace (good):
259,0 2 57721 8.845611892 25106 Q R 288768 + 256 [fio]
259,0 2 57722 8.845612054 25106 G R 288768 + 256 [fio]
259,0 2 57723 8.845613129 25106 D RS 288768 + 256 [fio]
259,0 2 57724 8.845615758 25106 Q R 289024 + 256 [fio]
259,0 2 57725 8.845615829 25106 G R 289024 + 256 [fio]
259,0 2 57726 8.845616781 25106 D RS 289024 + 256 [fio]
259,0 2 57727 8.845618483 25106 Q R 289280 + 256 [fio]
259,0 2 57728 8.845618535 25106 G R 289280 + 256 [fio]
259,0 2 57729 8.845619400 25106 D RS 289280 + 256 [fio]
259,0 2 57730 8.845621996 25106 Q R 289536 + 256 [fio]
259,0 2 57731 8.845622051 25106 G R 289536 + 256 [fio]
259,0 2 57732 8.845622699 25106 D RS 289536 + 256 [fio]
259,0 2 57733 8.845624389 25106 Q R 289792 + 256 [fio]
259,0 2 57734 8.845624452 25106 G R 289792 + 256 [fio]
259,0 2 57735 8.845625047 25106 D RS 289792 + 256 [fio]
259,0 2 57736 8.845627636 25106 Q R 290048 + 256 [fio]
259,0 2 57737 8.845627709 25106 G R 290048 + 256 [fio]
259,0 2 57738 8.845628294 25106 D RS 290048 + 256 [fio]
259,0 2 57739 8.845629988 25106 Q R 290304 + 256 [fio]
259,0 2 57740 8.845630042 25106 G R 290304 + 256 [fio]
259,0 2 57741 8.845630593 25106 D RS 290304 + 256 [fio]
259,0 2 57742 8.845632278 25106 Q R 290560 + 256 [fio]
259,0 2 57743 8.845632332 25106 G R 290560 + 256 [fio]
259,0 2 57744 8.845633114 25106 D RS 290560 + 256 [fio]
259,0 2 57745 8.846056151 0 C RS 289024 + 256 [0]
259,0 2 57746 8.846066912 0 C RS 289280 + 256 [0]
259,0 2 57747 8.846095809 0 C RS 289536 + 256 [0]
259,0 2 57748 8.846130757 0 C RS 290048 + 256 [0]
259,0 2 57749 8.846141543 0 C RS 289792 + 256 [0]
259,0 2 57750 8.846148906 0 C RS 288768 + 256 [0]
259,0 2 57751 8.846173043 0 C RS 290560 + 256 [0]
259,0 2 57752 8.846180066 0 C RS 290304 + 256 [0]
...
Total (259,0):
Reads Queued: 88156, 11283MiB Writes Queued: 8214, 1048MiB
Read Dispatches: 88155, 11283MiB Write Dispatches: 8214, 1048MiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 88153, 11283MiB Writes Completed: 8214, 1048MiB
Read Merges: 0, 0KiB Write Merges: 0, 0KiB
IO unplugs: 0 Timer unplugs: 0
Can you please advise?
Thanks,
Gil.
* Bandwidth issues and blktrace debug
From: Jack Wang @ 2016-07-21 14:22 UTC (permalink / raw)
2016-07-21 13:39 GMT+02:00 Gil Shomron <gil.shomron at gmail.com>:
>
> Hi Jack, how did you understand that the IO in the good trace is 128K
> and on the bad one is 1M?
Or maybe you didn't catch the queue-splitting part in the good trace.
>
> As far as I understand there are 8*256K requests at the end of each
It's 256 sectors per request, so 128K.
> trace, so it is 1M.... or I'm missing something?
>
> About the max_hw_sectors_kb and max_sectors_kb - both are 128.
Read the user guide:
http://www.cse.unsw.edu.au/~aaronc/iosched/doc/blktrace.html
See comments inline.
I cc-ed the list back.
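To make the sector arithmetic above concrete (blktrace reports lengths in 512-byte sectors), a quick check:

```python
SECTOR_BYTES = 512  # blktrace lengths are in 512-byte sectors

piece_kib = 256 * SECTOR_BYTES // 1024     # one dispatched piece
request_kib = 2048 * SECTOR_BYTES // 1024  # the originally queued request
print(piece_kib, request_kib)  # -> 128 1024

# So both readings are consistent: the 1M (1024 KiB) request goes out
# as eight 128K pieces of 256 sectors each.
print(request_kib // piece_kib)  # -> 8
```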
>
> On Thu, Jul 21, 2016 at 2:18 PM, Jack Wang <xjtuwjp@gmail.com> wrote:
> >
> >
> > 2016-07-21 12:36 GMT+02:00 Gil Shomron <gil.shomron at gmail.com>:
> >>
> >> Hi all,
> >>
> >> I have an Intel P3700 SSD installed in a Lenovo x3650 M5 server with
> >> two Intel Xeon E5-2630 v3 CPUs. The server runs Ubuntu 14.04 with
> >> kernel 4.6.4.
> >>
> >> I've been benchmarking the SSD with fio, using synchronous sequential
> >> reads with a 1MB block size. The resulting bandwidth is ~1.6GB/sec,
> >> well below the ~2.8GB/sec the drive should deliver. I have seen a
> >> P3700 reach that bandwidth with a similar benchmark on a high-end PC.
> >>
> >> This is the fio command I'm using:
> >> sudo fio --time_based --name=benchmark --size=1G --runtime=15
> >> --blocksize=1M --ioengine=sync --randrepeat=0 --iodepth=1 --direct=1
> >> --sync=1 --verify_fatal=0 --numjobs=1 --rw=read --group_reporting
> >>
> >> To debug this I installed blktrace and ran it on both the server and
> >> the high-end PC for comparison; see below.
> >>
> >> Notice that the server trace contains a bunch of splits (X) while the
> >> PC trace has none, and that on the server the queued and dispatched
> >> totals (at the end) don't match. What is going on?
> >>
> >> Server trace (bad):
> >> 259,0 0 94207 1.353041898 1597 A R 410329088 + 2048 <-
> >> (259,1) 410327040
The request was remapped: it starts at sector 410329088 and is 2048 sectors
long, i.e. 1M in total.
> >> 259,1 0 94208 1.353042039 1597 Q R 410329088 + 2048 [fio]
> >> 259,1 0 94209 1.353042709 1597 X R 410329088 / 410329344 [fio]
Due to your hardware limit the request was split here; the first piece
(410329344 - 410329088 = 256 sectors) is 128K.
> >> 259,1 0 94210 1.353042800 1597 Q R 410329344 + 1792 [fio]
request from 410329344 got requeued.
Snip
Cheers,
Jack
> >>
> >> Thanks,
> >> Gil.
> >
> >
> > Hi Gil,
> >
> > Looks like you're not using the same fio configuration on both machines:
> > the good one received IO with a 128K block size, the bad one with 1M.
> >
> > Also have you checked:
> > /sys/block/nvmxx/queue/max_hw_sectors_kb
> > /sys/block/nvmxx/queue/max_sectors_kb
> >
> > The block layer splits IO based on those limits.
> >
> > You can increase the second one if possible.
> >
> > Regards,
> > Jack
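A rough sketch of the split arithmetic (hypothetical helper, not kernel code): with max_sectors_kb = 128, a 1M request must be dispatched as eight 128K pieces, which matches the seven X events followed by eight D events in the server trace above.

```python
import math

def dispatch_pieces(request_kib, max_sectors_kb):
    """Number of pieces the block layer must dispatch for one request
    that exceeds the queue's max_sectors_kb limit (hypothetical helper,
    illustrating the limit, not the actual kernel splitting code)."""
    return math.ceil(request_kib / max_sectors_kb)

pieces = dispatch_pieces(1024, 128)  # 1M request, 128K limit
print(pieces, "pieces,", pieces - 1, "split (X) events")
# -> 8 pieces, 7 split (X) events
```

Raising max_sectors_kb (up to max_hw_sectors_kb) reduces the piece count and therefore the per-request software overhead.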
* Bandwidth issues and blktrace debug
From: Jens Axboe @ 2016-07-26 21:42 UTC (permalink / raw)
>>>> 259,1 0 94210 1.353042800 1597 Q R 410329344 + 1792 [fio]
>
> request from 410329344 got requeued.
No it didn't, that's a Queue event for a Read. It's not a requeue event.
--
Jens Axboe