* problem with fio --client and latency logs
[not found] <671811755.66182002.1464284748641.JavaMail.zimbra@redhat.com>
@ 2016-05-26 19:11 ` Ben England
2016-05-27 14:43 ` Jens Axboe
0 siblings, 1 reply; 5+ messages in thread
From: Ben England @ 2016-05-26 19:11 UTC (permalink / raw)
To: fio; +Cc: Tim Wilkinson, Andrew Theurer, John Harrigan
Recently I noticed a problem in the master branch when fio --client is used: latency logs generated with it have large numbers of "0, 0, 0, 0" records at the end of the file. I have a simple reproducer fio job file:
--------- reproducer.fiojob ----
[global]
numjobs=1
directory=/var/tmp

[shared-files]
rw=randread
write_lat_log=1thr
ioengine=sync
bs=4k # I/O size
filesize=1g # file size
runtime=20
-------------
If you run this job file like this:
# fio --server --daemonize=/var/run/fiosvr.pid
# fio --client=localhost reproducer.fiojob
You get the latency log 1thr_clat.1.log with the 0 records at the end. If you run the same job file with:
# fio reproducer.fiojob
You don't get the 0 records. This explains why Ceph CBT does not have the problem, for example.
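To compare the two runs without eyeballing the log, a small check along these lines could count the zero records at the end of the file. This is only a sketch: it assumes each log line is a comma-separated list of integers, as in the "0, 0, 0, 0" records above, and the sample values are invented for illustration rather than taken from a real run.

```python
from typing import Iterable, List


def parse_record(line: str) -> List[int]:
    """Split one comma-separated log line into integer fields."""
    return [int(field) for field in line.split(",")]


def count_trailing_zero_records(lines: Iterable[str]) -> int:
    """Return how many all-zero records sit at the end of the log."""
    records = [parse_record(line) for line in lines if line.strip()]
    trailing = 0
    for record in reversed(records):
        if any(record):
            break  # first non-zero record from the end: stop counting
        trailing += 1
    return trailing


# Invented sample lines (not real fio output): two good records, two zero records.
sample = [
    "1, 112, 0, 4096",
    "2, 98, 0, 4096",
    "0, 0, 0, 0",
    "0, 0, 0, 0",
]
print(count_trailing_zero_records(sample))
```

Passing an open file object for 1thr_clat.1.log instead of the sample list should work the same way, since the function only iterates over lines.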
I bisected fio history using git until I found the first commit that failed:
-------
commit 0cba0f919ee6af7dd65df436884336cff9c903f9
Author: Jens Axboe <axboe@fb.com>
Date: Thu Dec 17 14:54:15 2015 -0700

    client/server: transparent handling of storing compressed logs

    Signed-off-by: Jens Axboe <axboe@fb.com>
------
I could use some help figuring out what went wrong at this point. The diffs were complicated, but it seems likely that this commit is related to my problem.
-ben england
* Re: problem with fio --client and latency logs
2016-05-26 19:11 ` problem with fio --client and latency logs Ben England
@ 2016-05-27 14:43 ` Jens Axboe
2016-05-27 17:02 ` Jens Axboe
0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2016-05-27 14:43 UTC (permalink / raw)
To: Ben England, fio; +Cc: Tim Wilkinson, Andrew Theurer, John Harrigan
On 05/26/2016 01:11 PM, Ben England wrote:
> I could use some help figuring out what went wrong at this point, the diffs were complicated but it seemed possible that this commit was related to my problem.
I have a good idea what this might be, I'll take a look at it.
--
Jens Axboe
* Re: problem with fio --client and latency logs
2016-05-27 14:43 ` Jens Axboe
@ 2016-05-27 17:02 ` Jens Axboe
2016-05-27 19:29 ` Ben England
0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2016-05-27 17:02 UTC (permalink / raw)
To: Ben England, fio; +Cc: Tim Wilkinson, Andrew Theurer, John Harrigan
On 05/27/2016 08:43 AM, Jens Axboe wrote:
> I have a good idea what this might be, I'll take a look at it.
http://git.kernel.dk/cgit/fio/commit/?id=e35fb4c43ecc5b9d35cb5d980e811d3408fc5a4e
This should fix it, can you try with current -git?
--
Jens Axboe
* Re: problem with fio --client and latency logs
2016-05-27 17:02 ` Jens Axboe
@ 2016-05-27 19:29 ` Ben England
2016-05-27 19:33 ` Jens Axboe
0 siblings, 1 reply; 5+ messages in thread
From: Ben England @ 2016-05-27 19:29 UTC (permalink / raw)
To: Jens Axboe; +Cc: fio, Tim Wilkinson, Andrew Theurer, John Harrigan
It passes my simple test. I'll try it out in the large next, thanks Jens.
This is the last commit I saw in master branch upstream just now after pulling into my clone:
commit e35fb4c43ecc5b9d35cb5d980e811d3408fc5a4e
Author: Jens Axboe <axboe@fb.com>
Date: Fri May 27 11:01:15 2016 -0600

    server: ensure that we flush compressed logs correctly

    Do chunkwise block compression, and flush at the end, adding more space
    as needed.

    Signed-off-by: Jens Axboe <axboe@fb.com>
When I run fio -h after the rebuild, it shows I'm on this commit: the last 4 hex digits of the version string "fio-2.11-5-ge35f" are the first 4 digits of the git commit ID, which is reassuring.
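That check can be scripted. The sketch below assumes the version string follows the usual `git describe` layout (tag, commit count, then "g" plus an abbreviated hash), which is what "fio-2.11-5-ge35f" looks like; the function names are made up for illustration.

```python
import re
from typing import Optional

# Assumed layout: <tag>-<commits since tag>-g<abbreviated commit hash>
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<abbrev>[0-9a-f]+)$")


def abbreviated_commit(version: str) -> Optional[str]:
    """Return the abbreviated hash embedded in the version, or None if absent."""
    match = DESCRIBE_RE.match(version)
    return match.group("abbrev") if match else None


def built_from(version: str, commit_id: str) -> bool:
    """True if the hash embedded in the version is a prefix of commit_id."""
    abbrev = abbreviated_commit(version)
    return abbrev is not None and commit_id.startswith(abbrev)


print(abbreviated_commit("fio-2.11-5-ge35f"))
print(built_from("fio-2.11-5-ge35f", "e35fb4c43ecc5b9d35cb5d980e811d3408fc5a4e"))
```

A version string built exactly on a tag would carry no hash suffix under this assumption, so the helper returns None rather than guessing.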
* Re: problem with fio --client and latency logs
2016-05-27 19:29 ` Ben England
@ 2016-05-27 19:33 ` Jens Axboe
0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2016-05-27 19:33 UTC (permalink / raw)
To: Ben England; +Cc: fio, Tim Wilkinson, Andrew Theurer, John Harrigan
On 05/27/2016 01:29 PM, Ben England wrote:
> it passes my simple test, I'll try it out in the large next, thx Jens.
Super, let me know if you see issues. The difference in your two tests
is that fio will compress the log on the server side, to reduce the
amount of data we have to transfer. The bug was on the compression side,
so the client only got the first chunk of partial data. That meant that
we'd have the correct number of entries in the log, but the majority of
it would be zeroes...
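As a rough illustration of that failure mode (not fio's actual code), the sketch below uses Python's zlib module: it compresses a payload chunk by chunk with a final flush, then inflates either the whole stream or only the first part of it into a zero-filled buffer of the expected size. The chunk size, the payload, and the function names are all made up for the example.

```python
import zlib

CHUNK = 4096  # hypothetical chunk size, chosen only for this sketch


def compress_chunkwise(data: bytes) -> bytes:
    """Deflate data one chunk at a time, then flush the tail of the stream."""
    compressor = zlib.compressobj()
    pieces = []
    for start in range(0, len(data), CHUNK):
        pieces.append(compressor.compress(data[start:start + CHUNK]))
    pieces.append(compressor.flush())  # without this the stream is incomplete
    return b"".join(pieces)


def inflate_into_buffer(stream: bytes, expected_len: int) -> bytes:
    """Inflate what arrived into a zero-filled buffer of the expected size."""
    buffer = bytearray(expected_len)  # every byte starts out as zero
    out = zlib.decompressobj().decompress(stream)
    buffer[:len(out)] = out
    return bytes(buffer)


# Invented log-like payload; the values carry no meaning.
data = "\n".join(f"{i}, {100 + i % 37}, 0, 4096" for i in range(1, 20001)).encode()

stream = compress_chunkwise(data)
full = inflate_into_buffer(stream, len(data))
partial = inflate_into_buffer(stream[:len(stream) // 4], len(data))

print(full == data)     # complete stream round-trips
print(partial == data)  # truncated stream does not
print(partial.count(0)) # bytes that were never filled in
```

The truncated case keeps the right overall size but leaves everything past the decoded prefix as zeroes, which is the same shape as the symptom in the log.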
> This is the last commit I saw in master branch upstream just now after
> pulling into my clone:
>
> commit e35fb4c43ecc5b9d35cb5d980e811d3408fc5a4e
> Author: Jens Axboe <axboe@fb.com>
> Date: Fri May 27 11:01:15 2016 -0600
>
> server: ensure that we flush compressed logs correctly
>
> Do chunkwise block compression, and flush at the end, adding more space
> as needed.
>
> Signed-off-by: Jens Axboe <axboe@fb.com>
Yes, that's the fix in question.
> When I run fio -h after rebuild, it shows I'm on this commit because
> last 4 hex digits of version "fio-2.11-5-ge35f" are first 4 digits of
> git commit ID, which is reassuring.
That's why I added it :-)
Most people never include the version of fio when they report a bug, at
least we'll have the version if they paste the fio output.
--
Jens Axboe