From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Return-Path: Message-ID: <533C55E8.1010108@gmail.com> Date: Wed, 02 Apr 2014 13:24:40 -0500 From: Mark Nelson MIME-Version: 1.0 Subject: Re: bug in logging code? References: <533C2B54.6030606@gmail.com> <533C4B06.9070108@kernel.dk> <533C4E9D.8050901@gmail.com> <533C4F2B.5070708@kernel.dk> In-Reply-To: <533C4F2B.5070708@kernel.dk> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit To: Jens Axboe , fio@vger.kernel.org List-ID: On 04/02/2014 12:55 PM, Jens Axboe wrote: > On 04/02/2014 11:53 AM, Mark Nelson wrote: >> On 04/02/2014 12:38 PM, Jens Axboe wrote: >>> On 04/02/2014 09:23 AM, Mark Nelson wrote: >>>> I've been playing around with usng the various write_*_log options in >>>> fio to record run statistics. What I'm finding is that some of the >>>> logs >>>> have malformed lines in them. The following snippet is from the bw >>>> log: >>>> >>>> 811820, 12263, 0, 4194304 >>>> 812619, 35929, 0, 4194304 >>>> 818437, 8448, 0, 419430, 4194304 >>>> 1256299, 160627, 0, 4194304 >>>> 1256831, 154274, 0, 4194304 >>>> >>>> Notice that the malformed line appears to be a partial write that gets >>>> truncated. The fio command being used to run the tests is >>>> autogenerated, so job files aren't used. here's the one that created >>>> the above log snippet: >>>> >>>> /home/nhm/bin/fio >>>> --rw=read >>>> --ioengine=libaio >>>> --runtime=7200 >>>> --ramp_time=0 >>>> --numjobs=1 >>>> --direct=1 >>>> --bs=4194304B >>>> --iodepth=16 >>>> --size=9216M >>>> --write_iops_log=/tmp/cbt/ceph/kvmrbdfio/osd_ra-00004096/client_ra-00000128/op_size-04194304/concurrent_procs-008/iodepth-016/read/output >>>> >>>> >>>> >>>> >>>> --write_bw_log=/tmp/cbt/ceph/kvmrbdfio/osd_ra-00004096/client_ra-00000128/op_size-04194304/concurrent_procs-008/iodepth-016/read/output >>>> >>>> >>>> >>>> >>>> --write_lat_log=/tmp/cbt/ceph/kvmrbdfio/osd_ra-00004096/client_ra-00000128/op_size-04194304/concurrent_procs-008/iodepth-016/read/output >>>> >>>> >>>> >>>> >>>> --time_based >>>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-0 >>>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-1 >>>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-2 >>>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-3 >>>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-4 >>>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-5 >>>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-6 >>>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-7 >>>> >>>> The --numjobs=1 line should be removed, but I believe it gets >>>> overridden >>>> anyway with multiple name parameters. I wonder if having multiple name >>>> parameters (ie multiple fio processes) is causing those processes to >>>> all >>>> try to write to the same log files, potentially at the same time? Any >>>> thoughts would be much appreicated! >>> >>> What version of fio are you running? >>> >> >> f42557f (ie master from yesterday) with a slightly different fix for the >> basename issues that just got resolved. >> >> I can verify now that running fio with a single --name flag appears to >> generate logs without malformated lines. > > OK, I'll take a look, We should already be serializing the writing to > the same log files, but your example definitely looks like we are not. > > BTW, the numjobs option applies to each --name section. Just to clear > that up from your original email. So you really get numjobs * > name_sections running. Aha, thanks for clearing that up! Is github the right place to create new issues? If so, want me to create one for this? Mark