From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Message-ID: <533C4F2B.5070708@kernel.dk> Date: Wed, 02 Apr 2014 11:55:55 -0600 From: Jens Axboe MIME-Version: 1.0 Subject: Re: bug in logging code? References: <533C2B54.6030606@gmail.com> <533C4B06.9070108@kernel.dk> <533C4E9D.8050901@gmail.com> In-Reply-To: <533C4E9D.8050901@gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit To: Mark Nelson , fio@vger.kernel.org List-ID: On 04/02/2014 11:53 AM, Mark Nelson wrote: > On 04/02/2014 12:38 PM, Jens Axboe wrote: >> On 04/02/2014 09:23 AM, Mark Nelson wrote: >>> I've been playing around with usng the various write_*_log options in >>> fio to record run statistics. What I'm finding is that some of the logs >>> have malformed lines in them. The following snippet is from the bw log: >>> >>> 811820, 12263, 0, 4194304 >>> 812619, 35929, 0, 4194304 >>> 818437, 8448, 0, 419430, 4194304 >>> 1256299, 160627, 0, 4194304 >>> 1256831, 154274, 0, 4194304 >>> >>> Notice that the malformed line appears to be a partial write that gets >>> truncated. The fio command being used to run the tests is >>> autogenerated, so job files aren't used. here's the one that created >>> the above log snippet: >>> >>> /home/nhm/bin/fio >>> --rw=read >>> --ioengine=libaio >>> --runtime=7200 >>> --ramp_time=0 >>> --numjobs=1 >>> --direct=1 >>> --bs=4194304B >>> --iodepth=16 >>> --size=9216M >>> --write_iops_log=/tmp/cbt/ceph/kvmrbdfio/osd_ra-00004096/client_ra-00000128/op_size-04194304/concurrent_procs-008/iodepth-016/read/output >>> >>> >>> >>> --write_bw_log=/tmp/cbt/ceph/kvmrbdfio/osd_ra-00004096/client_ra-00000128/op_size-04194304/concurrent_procs-008/iodepth-016/read/output >>> >>> >>> >>> --write_lat_log=/tmp/cbt/ceph/kvmrbdfio/osd_ra-00004096/client_ra-00000128/op_size-04194304/concurrent_procs-008/iodepth-016/read/output >>> >>> >>> >>> --time_based >>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-0 >>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-1 >>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-2 >>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-3 >>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-4 >>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-5 >>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-6 >>> --name=/srv/rbdfio-`hostname -s`-0/cbt-kvmrbdfio-7 >>> >>> The --numjobs=1 line should be removed, but I believe it gets overridden >>> anyway with multiple name parameters. I wonder if having multiple name >>> parameters (ie multiple fio processes) is causing those processes to all >>> try to write to the same log files, potentially at the same time? Any >>> thoughts would be much appreicated! >> >> What version of fio are you running? >> > > f42557f (ie master from yesterday) with a slightly different fix for the > basename issues that just got resolved. > > I can verify now that running fio with a single --name flag appears to > generate logs without malformated lines. OK, I'll take a look, We should already be serializing the writing to the same log files, but your example definitely looks like we are not. BTW, the numjobs option applies to each --name section. Just to clear that up from your original email. So you really get numjobs * name_sections running. -- Jens Axboe