* problem with fio --client and latency logs
From: Ben England @ 2016-05-26 19:11 UTC
To: fio; +Cc: Tim Wilkinson, Andrew Theurer, John Harrigan

recently I noticed a problem in master branch when fio --client is used.
latency logs generated with it have large numbers of "0, 0, 0, 0" records
at the end of the file. I have a simple reproducer fio jobfile:

--------- reproducer.fiojob ----
[global]
numjobs=1
directory=/var/tmp

[shared-files]
rw=randread
write_lat_log=1thr
ioengine=sync
bs=4k # I/O size
filesize=1g # file size
runtime=20
-------------

If you run this job file like this:

# fio --server --daemonize=/var/run/fiosvr.pid
# fio --client=localhost reproducer.fiojob

You get the latency log 1thr_clat.1.log with the 0 records at the end.
If you run the same job file with:

# fio reproducer.fiojob

You don't get the 0 records. This explains why Ceph CBT does not have the
problem, for example.

I bisected fio history using git until I found the first commit that
failed:

-------
commit 0cba0f919ee6af7dd65df436884336cff9c903f9
Author: Jens Axboe <axboe@fb.com>
Date:   Thu Dec 17 14:54:15 2015 -0700

    client/server: transparent handling of storing compressed logs

    Signed-off-by: Jens Axboe <axboe@fb.com>
------

I could use some help figuring out what went wrong at this point, the
diffs were complicated but it seemed possible that this commit was related
to my problem.

-ben england
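One way to check a log for the symptom described above is to count how many all-zero records sit at the very end of the file. The following is a minimal sketch, not part of fio: the function name is made up for illustration, and it only assumes that each log line is a comma-separated record, as in the "0, 0, 0, 0" lines quoted in the report.

```python
def count_trailing_zero_records(lines):
    """Return how many consecutive all-zero records end the log.

    `lines` is any iterable of text lines, for example an open file.
    A record counts as all-zero when every comma-separated field is "0".
    """
    trailing = 0
    for line in lines:
        fields = [field.strip() for field in line.split(",")]
        if not any(fields):
            # Blank line: ignore it rather than breaking the run.
            continue
        if all(field == "0" for field in fields):
            trailing += 1
        else:
            # A real sample resets the run of zero records.
            trailing = 0
    return trailing


# Hypothetical usage against the log name from the report:
#
#     with open("1thr_clat.1.log") as log:
#         print(count_trailing_zero_records(log))
```

A result of 0 would match the behaviour reported for the plain "fio reproducer.fiojob" run, while a large count would match the --client symptom.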
* Re: problem with fio --client and latency logs
From: Jens Axboe @ 2016-05-27 14:43 UTC
To: Ben England, fio; +Cc: Tim Wilkinson, Andrew Theurer, John Harrigan

On 05/26/2016 01:11 PM, Ben England wrote:
> recently I noticed a problem in master branch when fio --client is used.
> latency logs generated with it have large numbers of "0, 0, 0, 0" records
> at the end of the file. I have a simple reproducer fio jobfile:
>
> --------- reproducer.fiojob ----
> [global]
> numjobs=1
> directory=/var/tmp
>
> [shared-files]
> rw=randread
> write_lat_log=1thr
> ioengine=sync
> bs=4k # I/O size
> filesize=1g # file size
> runtime=20
> -------------
>
> If you run this job file like this:
>
> # fio --server --daemonize=/var/run/fiosvr.pid
> # fio --client=localhost reproducer.fiojob
>
> You get the latency log 1thr_clat.1.log with the 0 records at the end.
> If you run the same job file with:
>
> # fio reproducer.fiojob
>
> You don't get the 0 records. This explains why Ceph CBT does not have the
> problem, for example.
>
> I bisected fio history using git until I found the first commit that
> failed:
>
> -------
> commit 0cba0f919ee6af7dd65df436884336cff9c903f9
> Author: Jens Axboe <axboe@fb.com>
> Date:   Thu Dec 17 14:54:15 2015 -0700
>
>     client/server: transparent handling of storing compressed logs
>
>     Signed-off-by: Jens Axboe <axboe@fb.com>
> ------
>
> I could use some help figuring out what went wrong at this point, the
> diffs were complicated but it seemed possible that this commit was related
> to my problem.

I have a good idea what this might be, I'll take a look at it.

-- 
Jens Axboe
* Re: problem with fio --client and latency logs
From: Jens Axboe @ 2016-05-27 17:02 UTC
To: Ben England, fio; +Cc: Tim Wilkinson, Andrew Theurer, John Harrigan

On 05/27/2016 08:43 AM, Jens Axboe wrote:
> On 05/26/2016 01:11 PM, Ben England wrote:
>> recently I noticed a problem in master branch when fio --client is
>> used. latency logs generated with it have large numbers of "0, 0, 0,
>> 0" records at the end of the file. I have a simple reproducer fio
>> jobfile:
>>
>> --------- reproducer.fiojob ----
>> [global]
>> numjobs=1
>> directory=/var/tmp
>>
>> [shared-files]
>> rw=randread
>> write_lat_log=1thr
>> ioengine=sync
>> bs=4k # I/O size
>> filesize=1g # file size
>> runtime=20
>> -------------
>>
>> If you run this job file like this:
>>
>> # fio --server --daemonize=/var/run/fiosvr.pid
>> # fio --client=localhost reproducer.fiojob
>>
>> You get the latency log 1thr_clat.1.log with the 0 records at the
>> end. If you run the same job file with:
>>
>> # fio reproducer.fiojob
>>
>> You don't get the 0 records. This explains why Ceph CBT does not have
>> the problem, for example.
>>
>> I bisected fio history using git until I found the first commit that
>> failed:
>>
>> -------
>> commit 0cba0f919ee6af7dd65df436884336cff9c903f9
>> Author: Jens Axboe <axboe@fb.com>
>> Date:   Thu Dec 17 14:54:15 2015 -0700
>>
>>     client/server: transparent handling of storing compressed logs
>>
>>     Signed-off-by: Jens Axboe <axboe@fb.com>
>> ------
>>
>> I could use some help figuring out what went wrong at this point, the
>> diffs were complicated but it seemed possible that this commit was
>> related to my problem.
>
> I have a good idea what this might be, I'll take a look at it.

http://git.kernel.dk/cgit/fio/commit/?id=e35fb4c43ecc5b9d35cb5d980e811d3408fc5a4e

This should fix it, can you try with current -git?

-- 
Jens Axboe
* Re: problem with fio --client and latency logs
From: Ben England @ 2016-05-27 19:29 UTC
To: Jens Axboe; +Cc: fio, Tim Wilkinson, Andrew Theurer, John Harrigan

it passes my simple test, I'll try it out in the large next, thx Jens.

This is the last commit I saw in master branch upstream just now after
pulling into my clone:

commit e35fb4c43ecc5b9d35cb5d980e811d3408fc5a4e
Author: Jens Axboe <axboe@fb.com>
Date:   Fri May 27 11:01:15 2016 -0600

    server: ensure that we flush compressed logs correctly

    Do chunkwise block compression, and flush at the end, adding more space
    as needed.

    Signed-off-by: Jens Axboe <axboe@fb.com>

When I run fio -h after rebuild, it shows I'm on this commit because
last 4 hex digits of version "fio-2.11-5-ge35f" are first 4 digits of
git commit ID, which is reassuring.

----- Original Message -----
> From: "Jens Axboe" <axboe@kernel.dk>
> To: "Ben England" <bengland@redhat.com>, fio@vger.kernel.org
> Cc: "Tim Wilkinson" <twilkins@redhat.com>, "Andrew Theurer" <atheurer@redhat.com>, "John Harrigan"
> <jharriga@redhat.com>
> Sent: Friday, May 27, 2016 1:02:10 PM
> Subject: Re: problem with fio --client and latency logs
>
> On 05/27/2016 08:43 AM, Jens Axboe wrote:
> > On 05/26/2016 01:11 PM, Ben England wrote:
> >> recently I noticed a problem in master branch when fio --client is
> >> used. latency logs generated with it have large numbers of "0, 0, 0,
> >> 0" records at the end of the file. I have a simple reproducer fio
> >> jobfile:
> >>
> >> --------- reproducer.fiojob ----
> >> [global]
> >> numjobs=1
> >> directory=/var/tmp
> >>
> >> [shared-files]
> >> rw=randread
> >> write_lat_log=1thr
> >> ioengine=sync
> >> bs=4k # I/O size
> >> filesize=1g # file size
> >> runtime=20
> >> -------------
> >>
> >> If you run this job file like this:
> >>
> >> # fio --server --daemonize=/var/run/fiosvr.pid
> >> # fio --client=localhost reproducer.fiojob
> >>
> >> You get the latency log 1thr_clat.1.log with the 0 records at the
> >> end. If you run the same job file with:
> >>
> >> # fio reproducer.fiojob
> >>
> >> You don't get the 0 records. This explains why Ceph CBT does not have
> >> the problem, for example.
> >>
> >> I bisected fio history using git until I found the first commit that
> >> failed:
> >>
> >> -------
> >> commit 0cba0f919ee6af7dd65df436884336cff9c903f9
> >> Author: Jens Axboe <axboe@fb.com>
> >> Date:   Thu Dec 17 14:54:15 2015 -0700
> >>
> >>     client/server: transparent handling of storing compressed logs
> >>
> >>     Signed-off-by: Jens Axboe <axboe@fb.com>
> >> ------
> >>
> >> I could use some help figuring out what went wrong at this point, the
> >> diffs were complicated but it seemed possible that this commit was
> >> related to my problem.
> >
> > I have a good idea what this might be, I'll take a look at it.
>
> http://git.kernel.dk/cgit/fio/commit/?id=e35fb4c43ecc5b9d35cb5d980e811d3408fc5a4e
>
> This should fix it, can you try with current -git?
>
> --
> Jens Axboe
>
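The version check Ben describes (the hex digits after "-g" in "fio-2.11-5-ge35f" matching the start of the commit ID) can be done mechanically. This is a small sketch under the assumption that the version string ends in a git-describe style "-g<abbreviated hash>" suffix; the helper names are invented for illustration and are not part of fio.

```python
import re


def describe_suffix(version):
    """Return the abbreviated hash after a trailing "-g", or None."""
    match = re.search(r"-g([0-9a-f]+)$", version)
    return match.group(1) if match else None


def matches_commit(version, commit_id):
    """True when the version's abbreviated hash is a prefix of commit_id."""
    abbrev = describe_suffix(version)
    return abbrev is not None and commit_id.startswith(abbrev)
```

With the strings from this thread, matches_commit("fio-2.11-5-ge35f", "e35fb4c43ecc5b9d35cb5d980e811d3408fc5a4e") compares "e35f" against the start of the fix commit's ID.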
* Re: problem with fio --client and latency logs
From: Jens Axboe @ 2016-05-27 19:33 UTC
To: Ben England; +Cc: fio, Tim Wilkinson, Andrew Theurer, John Harrigan

On 05/27/2016 01:29 PM, Ben England wrote:
> it passes my simple test, I'll try it out in the large next, thx Jens.

Super, let me know if you see issues. The difference in your two tests
is that fio will compress the log on the server side, to reduce the
amount of data we have to transfer. The bug was on the compression side,
so the client only got the first chunk of partial data. That meant that
we'd have the correct number of entries in the log, but the majority of
it would be zeroes...

> This is the last commit I saw in master branch upstream just now after
> pulling into my clone:
>
> commit e35fb4c43ecc5b9d35cb5d980e811d3408fc5a4e
> Author: Jens Axboe <axboe@fb.com>
> Date:   Fri May 27 11:01:15 2016 -0600
>
>     server: ensure that we flush compressed logs correctly
>
>     Do chunkwise block compression, and flush at the end, adding more space
>     as needed.
>
>     Signed-off-by: Jens Axboe <axboe@fb.com>

Yes, that's the fix in question.

> When I run fio -h after rebuild, it shows I'm on this commit because
> last 4 hex digits of version "fio-2.11-5-ge35f" are first 4 digits of
> git commit ID, which is reassuring.

That's why I added it :-) Most people never include the version of fio
when they report a bug, at least we'll have the version if they paste
the fio output.

-- 
Jens Axboe
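The failure mode Jens describes (right number of entries, mostly zeroes) can be reproduced in miniature. The sketch below is not fio's code: it uses Python's zlib rather than fio's C server, and the entry layout of four unsigned 64-bit integers and all function names are assumptions made purely for illustration. It compresses a log chunk by chunk and flushes at the end, then shows what a receiver sees if it is handed only part of the compressed stream and fills a zeroed buffer sized for the expected entry count.

```python
import struct
import zlib

# Hypothetical on-the-wire entry: four unsigned 64-bit little-endian fields.
ENTRY = struct.Struct("<4Q")


def pack_entries(entries):
    """Serialise a list of 4-tuples into one bytes object."""
    return b"".join(ENTRY.pack(*entry) for entry in entries)


def compress_chunkwise(raw, chunk_size=4096):
    """Deflate `raw` chunk by chunk and flush at the end.

    Returns the list of non-empty output pieces in stream order. Leaving
    out the final flush() would drop the data still buffered in the
    compressor.
    """
    comp = zlib.compressobj()
    pieces = []
    for offset in range(0, len(raw), chunk_size):
        piece = comp.compress(raw[offset:offset + chunk_size])
        if piece:
            pieces.append(piece)
    pieces.append(comp.flush())
    return pieces


def receive(pieces, n_entries):
    """Inflate `pieces` into a zero-filled buffer sized for n_entries."""
    buf = bytearray(n_entries * ENTRY.size)  # starts out as all zeroes
    decomp = zlib.decompressobj()
    data = b"".join(decomp.decompress(piece) for piece in pieces)
    whole = len(data) - len(data) % ENTRY.size  # keep complete entries only
    buf[:whole] = data[:whole]
    return [ENTRY.unpack_from(buf, i * ENTRY.size) for i in range(n_entries)]
```

Passing every piece to receive() gives back the original entries. Passing only a leading subset of the pieces still yields n_entries records, but the ones that were never decompressed stay (0, 0, 0, 0), which is the pattern reported at the start of this thread.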
Thread overview: 5+ messages
[not found] <671811755.66182002.1464284748641.JavaMail.zimbra@redhat.com>
2016-05-26 19:11 ` problem with fio --client and latency logs Ben England
2016-05-27 14:43 ` Jens Axboe
2016-05-27 17:02 ` Jens Axboe
2016-05-27 19:29 ` Ben England
2016-05-27 19:33 ` Jens Axboe