From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Subject: Re: Running a separate fio process for each disk?
References: <56464ACC.9030605@kernel.dk> <56465F00.1060504@kernel.dk>
From: Jens Axboe
Message-ID: <564F798A.8050009@kernel.dk>
Date: Fri, 20 Nov 2015 12:50:34 -0700
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
To: Caio Villela, Allen Schade
Cc: fio@vger.kernel.org
List-ID:

On 11/20/2015 12:37 PM, Caio Villela wrote:
> Hello Allen and Jens,
>
> Sorry for the long output, this is just in case you want the details.
> Here is a simple explanation for the problem. I want to run a 15 minute
> random write, using 1 Meg requests, and measure throughput and latency.
> What seems to be the problem is that if the test system has a large
> number of drives - the system that I am testing here has 28 drives -
> then the time accounting seems to go bad for some of the processes.
> What you see below is that during the 15 minutes from start, all disks
> are getting hit the same, as they should. Then, after 15 minutes, there
> are 15 drives that are still running.... after 5 minutes over the
> specified 15 minutes, there is still one drive running. Then looking at
> the amount of IOs sent to each drive, the ones that ran on that excess
> time have much more IOs. FIO still reports that all drives ran for 15
> minutes, although some ran for more than 20 minutes.
>
> We will attempt to run a single process instead of 28 instances of FIO
> to see if this goes away.

Could you also check if adding clocksource=gettimeofday makes any
difference? This sounds very odd. Assuming this was run with fio -git?

-- 
Jens Axboe
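
For reference, the two ideas in this thread (one fio process covering all
disks, and clocksource=gettimeofday) could be combined in a single job
file roughly as follows. This is only a sketch, not the reporter's actual
job: the device paths, section names, ioengine and direct settings are
assumptions, and only two of the 28 drives are shown. The 15 minute
runtime and the 1 Meg random writes are taken from the description above.

```ini
; Sketch only. Random writes to raw devices destroy their contents.
[global]
rw=randwrite
bs=1m
; 15 minutes, expressed in seconds
runtime=900
; keep running for the full runtime instead of stopping after one pass
time_based
; the setting suggested in the reply above
clocksource=gettimeofday
; assumed; the thread does not say which engine or buffering was used
ioengine=libaio
direct=1

; one job section per drive; the paths below are hypothetical
[disk-sdb]
filename=/dev/sdb

[disk-sdc]
filename=/dev/sdc
```

Running this as one fio invocation starts one job per section under a
single parent process, so the per-drive runtimes reported at the end can
be compared directly against the 900 second limit.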