From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Message-ID: <53D20DB8.10100@kernel.dk>
Date: Fri, 25 Jul 2014 09:56:40 +0200
From: Jens Axboe
MIME-Version: 1.0
Subject: Re: fio hangs with --status-interval
References: <53BE5286.2060203@kernel.dk> <53BEF29E.3040500@kernel.dk>
 <53BFCF22.8020407@kernel.dk> <53C0F4EC.9010107@kernel.dk>
 <53CCCA96.7010703@kernel.dk> <53D20AA8.6020700@kernel.dk>
In-Reply-To: <53D20AA8.6020700@kernel.dk>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
To: Vasily Tarasov
Cc: Michael Mattsson , "fio@vger.kernel.org"
List-ID:

On 2014-07-25 09:43, Jens Axboe wrote:
> On 2014-07-21 22:25, Vasily Tarasov wrote:
>> Hi Jens,
>>
>> I tried your patch, but it didn't help. Interestingly, the number of
>> threads changes in the end. At first, during the run:
>>
>> # ps -eLf | grep fio
>> root 5224 4274 5224 1 2 11:12 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5225 0 2 11:12 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5231 5224 5231 60 1 11:12 ? 00:00:07 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5260 5237 5260 0 1 11:12 pts/0 00:00:00 grep fio
>> [root@bison01 vass]# ps -eLf | grep fio
>> root 5224 4274 5224 0 2 11:12 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5225 0 2 11:12 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5231 5224 5231 16 1 11:12 ? 00:00:21 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5293 5237 5293 0 1 11:14 pts/0 00:00:00 grep fio
>> [root@bison01 vass]# ps -eLf | grep fio
>> root 5224 4274 5224 0 2 11:12 pts/1 00:00:01 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5225 0 2 11:12 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5231 5224 5231 12 1 11:12 ? 00:01:13 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5411 5237 5411 0 1 11:22 pts/0 00:00:00 grep fio
>>
>> Later, when the threads are stuck:
>>
>> # ps -eLf | grep fio
>> root 5224 4274 5224 0 16 11:12 pts/1 00:00:02 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5225 0 16 11:12 pts/1 00:00:01 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5458 0 16 11:25 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5459 0 16 11:25 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5460 0 16 11:25 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5461 0 16 11:25 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5462 0 16 11:25 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5471 0 16 11:25 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5472 0 16 11:26 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5475 0 16 11:26 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5476 0 16 11:26 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5477 0 16 11:26 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5478 0 16 11:26 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5487 0 16 11:26 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5488 0 16 11:27 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 5224 4274 5489 0 16 11:27 pts/1 00:00:00 fio
>> --status-interval 10 --minimal fios/1.fio
>> root 6665 5237 6665 0 1 13:21 pts/0 00:00:00 grep fio
>>
>> Is the number of threads supposed to change?..
>
> Never answered this one... Yes, it'll change, since when you run the
> job, you'll have one backend process, a number of IO workers, and one
> disk util thread typically. When you get stuck, it's the backend that is
> left waiting for that mutex.
>
> In any case, I haven't been able to figure this one out yet. But it
> should be safe enough to just ignore the stat mutex for the final
> output, since the threads otherwise accessing it are gone. Can you see
> if this one makes the issue go away?

Patch was not compiled, was missing the non-static __show_run_stats().
But just pull current -git, I have committed a variant that does compile :-)

-- 
Jens Axboe
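For readers following the thread, the idea of ignoring the stat mutex for the final output can be sketched roughly as below. This is only an illustration under stated assumptions, not the committed fio change: `stat_mutex`, `reports_emitted` and `show_final_stats()` are hypothetical stand-ins built on plain pthreads, and only the name `__show_run_stats()` comes from the message above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not fio's actual definitions. */
static pthread_mutex_t stat_mutex = PTHREAD_MUTEX_INITIALIZER;
int reports_emitted;

/*
 * Unlocked worker that emits the stats. It is non-static so that callers
 * outside this file can reach it; any locking is the caller's job.
 * (The name mirrors the one in the message above.)
 */
void __show_run_stats(void)
{
	reports_emitted++;
	printf("run stats report #%d\n", reports_emitted);
}

/*
 * Periodic path (for example, a status-interval report): other threads
 * may still be touching the stats, so serialize on the mutex.
 */
void show_run_stats(void)
{
	pthread_mutex_lock(&stat_mutex);
	__show_run_stats();
	pthread_mutex_unlock(&stat_mutex);
}

/*
 * Final path: every thread that could access the stats is gone, so the
 * mutex is skipped. This cannot block even if the mutex was left held.
 */
void show_final_stats(int workers_running)
{
	assert(workers_running == 0);
	__show_run_stats();
}
```

The trade-off in this sketch is that the final report relies on the caller having reaped all other threads first; the assertion only documents that precondition, it does not enforce any ordering by itself.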