* BUG: option runtime not working during a particular failure mode.
From: Shawn Lewis @ 2008-10-06 22:04 UTC
  To: fio

Hi,

I have a random-read workload in which fio hung on a machine. The job is
time_based with runtime=60. A few of the disks in question experienced
errors at the same time, so I would expect fio to either fail or stop
after 60 seconds.

I haven't tried to debug this in depth yet. Jens, I thought an answer
might jump out at you. If not, I'll take a look.

Full disclosure: I modified the config and the strace output to show
fewer disks than were actually being accessed.

Here is the config file:
[sda-randomaccess]
filename=/export/sda3/datafile.tmp
rw=randread
bs=64k
ioengine=sync
time_based=1
runtime=3600
bwavgtime=5000
direct=1
thread=1

[sdb-randomaccess]
filename=/export/sdb3/datafile.tmp
rw=randread
bs=64k
ioengine=sync
time_based=1
runtime=3600
bwavgtime=5000
direct=1
thread=1

[sdc-randomaccess]
filename=/export/sdc3/datafile.tmp
rw=randread
bs=64k
ioengine=sync
time_based=1
runtime=3600
bwavgtime=5000
direct=1
thread=1

[sdd-randomaccess]
filename=/export/sdd3/datafile.tmp
rw=randread
bs=64k
ioengine=sync
time_based=1
runtime=3600
bwavgtime=5000
direct=1
thread=1


We get some hints from strace. It looks like we're just running the
sig_alrm loop. But why aren't we hitting runtime? Are the other threads
already stopped for some reason?

static void sig_alrm(int sig)
{
        if (threads) {
                update_io_ticks();
                print_thread_status();
                status_timer_arm();
        }
}
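
One possible explanation, not confirmed by anything in this report, is that
with a synchronous engine the runtime check can only run between I/O calls,
so a worker stuck inside a blocking read never gets back to it. The sketch
below illustrates that loop shape. It is not fio's code: runtime_exceeded(),
run_job() and the do_io callback are hypothetical names used only for
illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper (not from fio): true once at least runtime_sec
 * whole seconds have passed between *start and *now. */
static bool runtime_exceeded(const struct timespec *start,
                             const struct timespec *now,
                             long runtime_sec)
{
        assert(start != NULL && now != NULL);

        long elapsed = (long)(now->tv_sec - start->tv_sec);
        if (now->tv_nsec < start->tv_nsec)
                elapsed--;      /* borrow from the seconds field */
        return elapsed >= runtime_sec;
}

/* Hypothetical worker loop for a synchronous engine. do_io stands in
 * for a blocking read; it returns < 0 on error. The runtime check sits
 * between calls, so if do_io never returns, the check is never
 * reached again and the loop cannot end on time. */
static long run_job(long runtime_sec, int (*do_io)(void *), void *arg)
{
        struct timespec start, now;
        long completed = 0;

        clock_gettime(CLOCK_MONOTONIC, &start);
        for (;;) {
                clock_gettime(CLOCK_MONOTONIC, &now);
                if (runtime_exceeded(&start, &now, runtime_sec))
                        break;
                if (do_io(arg) < 0)     /* may block indefinitely */
                        break;
                completed++;
        }
        return completed;
}
```

If this is what is happening, the main thread would keep servicing SIGALRM
while the workers sit in the kernel, which would at least be consistent with
the trace that follows.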


strace output. This is repeated over and over:

--- SIGALRM (Alarm clock) @ 0 (0) ---
open("/sys/block/sdd/stat", O_RDONLY)   = 8
fstat(8, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaadf29000
read(8, "15490269   238448 1950476298  87"..., 4096) = 105
close(8)                                = 0
munmap(0x2aaaadf29000, 4096)            = 0
open("/sys/block/sdc/stat", O_RDONLY)   = 8
fstat(8, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaadf29000
read(8, "15489598   238450 1950388618  86"..., 4096) = 105
close(8)                                = 0
munmap(0x2aaaadf29000, 4096)            = 0
open("/sys/block/sda/stat", O_RDONLY)   = 8
fstat(8, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaadf29000
read(8, "15368259   237021 1934418422 103"..., 4096) = 105
close(8)                                = 0
munmap(0x2aaaadf29000, 4096)            = 0
open("/sys/block/sdb/stat", O_RDONLY)   = 8
fstat(8, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaadf29000
read(8, "15665652   241115 1972572554 101"..., 4096) = 105
close(8)                                = 0
munmap(0x2aaaadf29000, 4096)            = 0
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 250000}}, NULL) = 0
rt_sigreturn(0x5875e0)                  = -1 EINTR (Interrupted system call)
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, NULL)          = 0
nanosleep({0, 10000000}, 0)             = ? ERESTART_RESTARTBLOCK (To be restarted)
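
The strings read from /sys/block/<dev>/stat in the trace begin with the
per-device read counters (read I/Os, read merges, read sectors), so comparing
two samples taken a few seconds apart is one way to tell whether reads are
still completing on a given disk. The following is a minimal sketch of that
comparison; struct blk_read_stat, parse_read_stat() and reads_progressed()
are hypothetical names and are not part of fio.

```c
#include <assert.h>
#include <stdio.h>

/* First three fields of /sys/block/<dev>/stat. */
struct blk_read_stat {
        unsigned long long ios;         /* read I/Os completed */
        unsigned long long merges;      /* read requests merged */
        unsigned long long sectors;     /* sectors read */
};

/* Hypothetical helper: parse the leading read counters from one stat
 * line. Returns 0 on success, -1 if the line does not start with
 * three unsigned integers. Later fields are ignored. */
static int parse_read_stat(const char *line, struct blk_read_stat *out)
{
        assert(line != NULL && out != NULL);

        int n = sscanf(line, "%llu %llu %llu",
                       &out->ios, &out->merges, &out->sectors);
        return n == 3 ? 0 : -1;
}

/* Nonzero if any read completed between the two samples. */
static int reads_progressed(const struct blk_read_stat *prev,
                            const struct blk_read_stat *cur)
{
        return cur->ios > prev->ios || cur->sectors > prev->sectors;
}
```

Feeding it two successive samples for the same device and checking
reads_progressed() would show whether that disk is still servicing reads
while fio appears hung.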


Thanks,
Shawn


Thread overview: 7+ messages
2008-10-06 22:04 BUG: option runtime not working during a particular failure mode Shawn Lewis
2008-10-07  9:30 ` Jens Axboe
2008-10-07 16:28   ` Shawn Lewis
2008-10-08 11:01     ` Jens Axboe
2008-10-08 11:17       ` Jens Axboe
2008-10-08 16:24         ` Shawn Lewis
2008-10-08 17:23           ` Jens Axboe
