* Shared files with thread=1: func=xfer, err=Bad file descriptor
@ 2014-04-01 1:08 Andrey Kuzmin
2014-04-01 14:31 ` Jens Axboe
0 siblings, 1 reply; 6+ messages in thread
From: Andrey Kuzmin @ 2014-04-01 1:08 UTC (permalink / raw)
To: fio; +Cc: Jens Axboe
The following simple setup
---
[global]
ioengine=psync
direct=1
sync=1
[randrwrite]
filename=/tmp/test.dat
lockfile=exclusive
filesize=1m
;
thread=1
numjobs=4
runtime=10
time_based=1
;
rw=randwrite
bs=4k
---
runs fine with jobs being processes, and fails as follows with thread=1.
randrwrite: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
...
fio-2.1.6.1-23-g9f97
Starting 4 threads
fio: pid=10455, err=9/file:engines/sync.c:67, func=xfer, error=Bad
file descriptor]
fio: pid=10453, err=9/file:engines/sync.c:67, func=xfer, error=Bad
file descriptor
fio: pid=10454, err=9/file:engines/sync.c:67, func=xfer, error=Bad
file descriptor
Jobs: 1 (f=1): [wXXX] [100.0% done] [0KB/2372KB/0KB /s] [0/593/0 iops]
[eta 00m:00s]
randrwrite: (groupid=0, jobs=1): err= 0: pid=10452: Mon Mar 31 18:05:26 2014
write: io=14900KB, bw=1489.9KB/s, iops=372, runt= 10001msec
clat (usec): min=419, max=260739, avg=1546.00, stdev=4329.54
lat (usec): min=424, max=260744, avg=1551.95, stdev=4329.58
clat percentiles (usec):
| 1.00th=[ 588], 5.00th=[ 732], 10.00th=[ 780], 20.00th=[ 828],
| 30.00th=[ 876], 40.00th=[ 956], 50.00th=[ 1080], 60.00th=[ 1432],
| 70.00th=[ 2064], 80.00th=[ 2256], 90.00th=[ 2480], 95.00th=[ 2736],
| 99.00th=[ 3344], 99.50th=[ 3632], 99.90th=[ 4512], 99.95th=[23680],
| 99.99th=[261120]
bw (KB /s): min= 5, max= 1099, per=22.09%, avg=519.90, stdev=359.34
lat (usec) : 500=0.19%, 750=6.42%, 1000=37.61%
lat (msec) : 2=23.68%, 4=31.81%, 10=0.24%, 50=0.03%, 500=0.03%
cpu : usr=0.76%, sys=8.08%, ctx=8191, majf=0, minf=5
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=3725/d=0, short=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=1
randrwrite: (groupid=0, jobs=1): err= 9 (file:engines/sync.c:67,
func=xfer, error=Bad file descriptor): pid=10453: Mon Mar 31 18:05:26
2014
write: io=3072.0KB, bw=581465B/s, iops=142, runt= 5410msec
clat (usec): min=587, max=33827, avg=1817.50, stdev=2464.14
lat (usec): min=593, max=33837, avg=1823.85, stdev=2464.47
clat percentiles (usec):
| 1.00th=[ 804], 5.00th=[ 876], 10.00th=[ 932], 20.00th=[ 988],
| 30.00th=[ 1048], 40.00th=[ 1096], 50.00th=[ 1208], 60.00th=[ 2024],
| 70.00th=[ 2224], 80.00th=[ 2352], 90.00th=[ 2576], 95.00th=[ 2896],
| 99.00th=[ 4080], 99.50th=[32640], 99.90th=[34048], 99.95th=[34048],
| 99.99th=[34048]
bw (KB /s): min= 165, max= 1266, per=25.28%, avg=595.00, stdev=450.04
lat (usec) : 750=0.39%, 1000=22.76%
lat (msec) : 2=36.67%, 4=38.88%, 10=0.52%, 20=0.13%, 50=0.52%
cpu : usr=0.15%, sys=5.69%, ctx=2288, majf=0, minf=0
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.1%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=769/d=0, short=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=1
randrwrite: (groupid=0, jobs=1): err= 9 (file:engines/sync.c:67,
func=xfer, error=Bad file descriptor): pid=10454: Mon Mar 31 18:05:26
2014
write: io=2528.0KB, bw=478497B/s, iops=117, runt= 5410msec
clat (usec): min=299, max=33920, avg=2106.79, stdev=3775.38
lat (usec): min=303, max=33930, avg=2113.60, stdev=3775.68
clat percentiles (usec):
| 1.00th=[ 564], 5.00th=[ 820], 10.00th=[ 932], 20.00th=[ 996],
| 30.00th=[ 1064], 40.00th=[ 1160], 50.00th=[ 1256], 60.00th=[ 1976],
| 70.00th=[ 2224], 80.00th=[ 2384], 90.00th=[ 2672], 95.00th=[ 3056],
| 99.00th=[33024], 99.50th=[33024], 99.90th=[34048], 99.95th=[34048],
| 99.99th=[34048]
bw (KB /s): min= 46, max= 728, per=18.61%, avg=438.17, stdev=265.37
lat (usec) : 500=0.16%, 750=3.63%, 1000=16.59%
lat (msec) : 2=39.81%, 4=37.28%, 10=0.63%, 20=0.32%, 50=1.42%
cpu : usr=0.30%, sys=5.03%, ctx=2048, majf=0, minf=2
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.2%, 4=99.8%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=633/d=0, short=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=1
randrwrite: (groupid=0, jobs=1): err= 9 (file:engines/sync.c:67,
func=xfer, error=Bad file descriptor): pid=10455: Mon Mar 31 18:05:26
2014
write: io=3052.0KB, bw=583286B/s, iops=142, runt= 5358msec
clat (usec): min=541, max=24712, avg=1707.90, stdev=1533.18
lat (usec): min=547, max=24743, avg=1714.89, stdev=1533.64
clat percentiles (usec):
| 1.00th=[ 796], 5.00th=[ 876], 10.00th=[ 908], 20.00th=[ 980],
| 30.00th=[ 1048], 40.00th=[ 1128], 50.00th=[ 1240], 60.00th=[ 1736],
| 70.00th=[ 2224], 80.00th=[ 2384], 90.00th=[ 2640], 95.00th=[ 2896],
| 99.00th=[ 3280], 99.50th=[ 4640], 99.90th=[24704], 99.95th=[24704],
| 99.99th=[24704]
bw (KB /s): min= 204, max= 1009, per=20.39%, avg=480.00, stdev=283.93
lat (usec) : 750=0.65%, 1000=22.12%
lat (msec) : 2=38.35%, 4=38.09%, 10=0.26%, 50=0.39%
cpu : usr=0.37%, sys=5.45%, ctx=2261, majf=0, minf=5
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.1%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=764/d=0, short=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: io=23552KB, aggrb=2354KB/s, minb=467KB/s, maxb=1489KB/s,
mint=5358msec, maxt=10001msec
Disk stats (read/write):
sda: ios=0/13818, merge=0/2227, ticks=0/7428, in_queue=7260, util=72.37%
Is file sharing with non-trivial lockfile= setting is expected to work
with threads=1, or is it an unanticipated use case?
Regards,
Andrey
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: Shared files with thread=1: func=xfer, err=Bad file descriptor 2014-04-01 1:08 Shared files with thread=1: func=xfer, err=Bad file descriptor Andrey Kuzmin @ 2014-04-01 14:31 ` Jens Axboe 2014-04-01 15:56 ` Andrey Kuzmin 0 siblings, 1 reply; 6+ messages in thread From: Jens Axboe @ 2014-04-01 14:31 UTC (permalink / raw) To: Andrey Kuzmin, fio On 03/31/2014 07:08 PM, Andrey Kuzmin wrote: > The following simple setup > --- > [global] > ioengine=psync > direct=1 > sync=1 > > [randrwrite] > filename=/tmp/test.dat > lockfile=exclusive > filesize=1m > ; > thread=1 > numjobs=4 > runtime=10 > time_based=1 > ; > rw=randwrite > bs=4k > --- > > runs fine with jobs being processes, and fails as follows with thread=1. > > randrwrite: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1 > ... > fio-2.1.6.1-23-g9f97 > Starting 4 threads > fio: pid=10455, err=9/file:engines/sync.c:67, func=xfer, error=Bad > file descriptor] > fio: pid=10453, err=9/file:engines/sync.c:67, func=xfer, error=Bad > file descriptor > fio: pid=10454, err=9/file:engines/sync.c:67, func=xfer, error=Bad > file descriptor > Jobs: 1 (f=1): [wXXX] [100.0% done] [0KB/2372KB/0KB /s] [0/593/0 iops] > [eta 00m:00s] > randrwrite: (groupid=0, jobs=1): err= 0: pid=10452: Mon Mar 31 18:05:26 2014 > write: io=14900KB, bw=1489.9KB/s, iops=372, runt= 10001msec > clat (usec): min=419, max=260739, avg=1546.00, stdev=4329.54 > lat (usec): min=424, max=260744, avg=1551.95, stdev=4329.58 > clat percentiles (usec): > | 1.00th=[ 588], 5.00th=[ 732], 10.00th=[ 780], 20.00th=[ 828], > | 30.00th=[ 876], 40.00th=[ 956], 50.00th=[ 1080], 60.00th=[ 1432], > | 70.00th=[ 2064], 80.00th=[ 2256], 90.00th=[ 2480], 95.00th=[ 2736], > | 99.00th=[ 3344], 99.50th=[ 3632], 99.90th=[ 4512], 99.95th=[23680], > | 99.99th=[261120] > bw (KB /s): min= 5, max= 1099, per=22.09%, avg=519.90, stdev=359.34 > lat (usec) : 500=0.19%, 750=6.42%, 1000=37.61% > lat (msec) : 2=23.68%, 4=31.81%, 10=0.24%, 50=0.03%, 500=0.03% > cpu : usr=0.76%, sys=8.08%, ctx=8191, majf=0, minf=5 > IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% > submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > issued : total=r=0/w=3725/d=0, short=r=0/w=0/d=0 > latency : target=0, window=0, percentile=100.00%, depth=1 > randrwrite: (groupid=0, jobs=1): err= 9 (file:engines/sync.c:67, > func=xfer, error=Bad file descriptor): pid=10453: Mon Mar 31 18:05:26 > 2014 > write: io=3072.0KB, bw=581465B/s, iops=142, runt= 5410msec > clat (usec): min=587, max=33827, avg=1817.50, stdev=2464.14 > lat (usec): min=593, max=33837, avg=1823.85, stdev=2464.47 > clat percentiles (usec): > | 1.00th=[ 804], 5.00th=[ 876], 10.00th=[ 932], 20.00th=[ 988], > | 30.00th=[ 1048], 40.00th=[ 1096], 50.00th=[ 1208], 60.00th=[ 2024], > | 70.00th=[ 2224], 80.00th=[ 2352], 90.00th=[ 2576], 95.00th=[ 2896], > | 99.00th=[ 4080], 99.50th=[32640], 99.90th=[34048], 99.95th=[34048], > | 99.99th=[34048] > bw (KB /s): min= 165, max= 1266, per=25.28%, avg=595.00, stdev=450.04 > lat (usec) : 750=0.39%, 1000=22.76% > lat (msec) : 2=36.67%, 4=38.88%, 10=0.52%, 20=0.13%, 50=0.52% > cpu : usr=0.15%, sys=5.69%, ctx=2288, majf=0, minf=0 > IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% > submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > complete : 0=0.1%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > issued : total=r=0/w=769/d=0, short=r=0/w=0/d=0 > latency : target=0, window=0, percentile=100.00%, depth=1 > randrwrite: (groupid=0, jobs=1): err= 9 (file:engines/sync.c:67, > func=xfer, error=Bad file descriptor): pid=10454: Mon Mar 31 18:05:26 > 2014 > write: io=2528.0KB, bw=478497B/s, iops=117, runt= 5410msec > clat (usec): min=299, max=33920, avg=2106.79, stdev=3775.38 > lat (usec): min=303, max=33930, avg=2113.60, stdev=3775.68 > clat percentiles (usec): > | 1.00th=[ 564], 5.00th=[ 820], 10.00th=[ 932], 20.00th=[ 996], > | 30.00th=[ 1064], 40.00th=[ 1160], 50.00th=[ 1256], 60.00th=[ 1976], > | 70.00th=[ 2224], 80.00th=[ 2384], 90.00th=[ 2672], 95.00th=[ 3056], > | 99.00th=[33024], 99.50th=[33024], 99.90th=[34048], 99.95th=[34048], > | 99.99th=[34048] > bw (KB /s): min= 46, max= 728, per=18.61%, avg=438.17, stdev=265.37 > lat (usec) : 500=0.16%, 750=3.63%, 1000=16.59% > lat (msec) : 2=39.81%, 4=37.28%, 10=0.63%, 20=0.32%, 50=1.42% > cpu : usr=0.30%, sys=5.03%, ctx=2048, majf=0, minf=2 > IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% > submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > complete : 0=0.2%, 4=99.8%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > issued : total=r=0/w=633/d=0, short=r=0/w=0/d=0 > latency : target=0, window=0, percentile=100.00%, depth=1 > randrwrite: (groupid=0, jobs=1): err= 9 (file:engines/sync.c:67, > func=xfer, error=Bad file descriptor): pid=10455: Mon Mar 31 18:05:26 > 2014 > write: io=3052.0KB, bw=583286B/s, iops=142, runt= 5358msec > clat (usec): min=541, max=24712, avg=1707.90, stdev=1533.18 > lat (usec): min=547, max=24743, avg=1714.89, stdev=1533.64 > clat percentiles (usec): > | 1.00th=[ 796], 5.00th=[ 876], 10.00th=[ 908], 20.00th=[ 980], > | 30.00th=[ 1048], 40.00th=[ 1128], 50.00th=[ 1240], 60.00th=[ 1736], > | 70.00th=[ 2224], 80.00th=[ 2384], 90.00th=[ 2640], 95.00th=[ 2896], > | 99.00th=[ 3280], 99.50th=[ 4640], 99.90th=[24704], 99.95th=[24704], > | 99.99th=[24704] > bw (KB /s): min= 204, max= 1009, per=20.39%, avg=480.00, stdev=283.93 > lat (usec) : 750=0.65%, 1000=22.12% > lat (msec) : 2=38.35%, 4=38.09%, 10=0.26%, 50=0.39% > cpu : usr=0.37%, sys=5.45%, ctx=2261, majf=0, minf=5 > IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% > submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > complete : 0=0.1%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > issued : total=r=0/w=764/d=0, short=r=0/w=0/d=0 > latency : target=0, window=0, percentile=100.00%, depth=1 > > Run status group 0 (all jobs): > WRITE: io=23552KB, aggrb=2354KB/s, minb=467KB/s, maxb=1489KB/s, > mint=5358msec, maxt=10001msec > > Disk stats (read/write): > sda: ios=0/13818, merge=0/2227, ticks=0/7428, in_queue=7260, util=72.37% > > Is file sharing with non-trivial lockfile= setting is expected to work > with threads=1, or is it an unanticipated use case? No, that's supposed to work. I think it's a race between shared files, and opening/closing of them when you have a timed based job. I'll take a look at it. -- Jens Axboe ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Shared files with thread=1: func=xfer, err=Bad file descriptor 2014-04-01 14:31 ` Jens Axboe @ 2014-04-01 15:56 ` Andrey Kuzmin 2014-04-01 18:24 ` Jens Axboe 0 siblings, 1 reply; 6+ messages in thread From: Andrey Kuzmin @ 2014-04-01 15:56 UTC (permalink / raw) To: Jens Axboe; +Cc: fio Thanks. On a separate (or may be related) note, shared files handling in dup_files seems to be broken (unless I'm missing the logic behind dropping non-shared files). The patch below has fixed it for me. diff --git a/init.c b/init.c index 9b0666d..99038b2 100644 --- a/init.c +++ b/init.c @@ -1160,17 +1161,20 @@ static int add_job(struct thread_data *td, const char *j td_new->o.new_group = 0; if (file_alloced) { - td_new->files_index = 0; - td_new->files_size = 0; if (td_new->files) { struct fio_file *f; for_each_file(td_new, f, i) { if (f->file_name) - free(f->file_name); - free(f); + sfree(f->file_name); + sfree(f); } + free(td_new->files); td_new->files = NULL; } + td_new->files_index = 0; + td_new->files_size = 0; if (td_new->o.filename) { free(td_new->o.filename); td_new->o.filename = NULL; Regards, Andrey On Tue, Apr 1, 2014 at 7:31 AM, Jens Axboe <axboe@kernel.dk> wrote: > On 03/31/2014 07:08 PM, Andrey Kuzmin wrote: >> >> The following simple setup >> --- >> [global] >> ioengine=psync >> direct=1 >> sync=1 >> >> [randrwrite] >> filename=/tmp/test.dat >> lockfile=exclusive >> filesize=1m >> ; >> thread=1 >> numjobs=4 >> runtime=10 >> time_based=1 >> ; >> rw=randwrite >> bs=4k >> --- >> >> runs fine with jobs being processes, and fails as follows with thread=1. >> >> randrwrite: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, >> iodepth=1 >> ... >> fio-2.1.6.1-23-g9f97 >> Starting 4 threads >> fio: pid=10455, err=9/file:engines/sync.c:67, func=xfer, error=Bad >> file descriptor] >> fio: pid=10453, err=9/file:engines/sync.c:67, func=xfer, error=Bad >> file descriptor >> fio: pid=10454, err=9/file:engines/sync.c:67, func=xfer, error=Bad >> file descriptor >> Jobs: 1 (f=1): [wXXX] [100.0% done] [0KB/2372KB/0KB /s] [0/593/0 iops] >> [eta 00m:00s] >> randrwrite: (groupid=0, jobs=1): err= 0: pid=10452: Mon Mar 31 18:05:26 >> 2014 >> write: io=14900KB, bw=1489.9KB/s, iops=372, runt= 10001msec >> clat (usec): min=419, max=260739, avg=1546.00, stdev=4329.54 >> lat (usec): min=424, max=260744, avg=1551.95, stdev=4329.58 >> clat percentiles (usec): >> | 1.00th=[ 588], 5.00th=[ 732], 10.00th=[ 780], 20.00th=[ >> 828], >> | 30.00th=[ 876], 40.00th=[ 956], 50.00th=[ 1080], 60.00th=[ >> 1432], >> | 70.00th=[ 2064], 80.00th=[ 2256], 90.00th=[ 2480], 95.00th=[ >> 2736], >> | 99.00th=[ 3344], 99.50th=[ 3632], 99.90th=[ 4512], >> 99.95th=[23680], >> | 99.99th=[261120] >> bw (KB /s): min= 5, max= 1099, per=22.09%, avg=519.90, >> stdev=359.34 >> lat (usec) : 500=0.19%, 750=6.42%, 1000=37.61% >> lat (msec) : 2=23.68%, 4=31.81%, 10=0.24%, 50=0.03%, 500=0.03% >> cpu : usr=0.76%, sys=8.08%, ctx=8191, majf=0, minf=5 >> IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >> >=64=0.0% >> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >> >=64=0.0% >> complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >> >=64=0.0% >> issued : total=r=0/w=3725/d=0, short=r=0/w=0/d=0 >> latency : target=0, window=0, percentile=100.00%, depth=1 >> randrwrite: (groupid=0, jobs=1): err= 9 (file:engines/sync.c:67, >> func=xfer, error=Bad file descriptor): pid=10453: Mon Mar 31 18:05:26 >> 2014 >> write: io=3072.0KB, bw=581465B/s, iops=142, runt= 5410msec >> clat (usec): min=587, max=33827, avg=1817.50, stdev=2464.14 >> lat (usec): min=593, max=33837, avg=1823.85, stdev=2464.47 >> clat percentiles (usec): >> | 1.00th=[ 804], 5.00th=[ 876], 10.00th=[ 932], 20.00th=[ >> 988], >> | 30.00th=[ 1048], 40.00th=[ 1096], 50.00th=[ 1208], 60.00th=[ >> 2024], >> | 70.00th=[ 2224], 80.00th=[ 2352], 90.00th=[ 2576], 95.00th=[ >> 2896], >> | 99.00th=[ 4080], 99.50th=[32640], 99.90th=[34048], >> 99.95th=[34048], >> | 99.99th=[34048] >> bw (KB /s): min= 165, max= 1266, per=25.28%, avg=595.00, >> stdev=450.04 >> lat (usec) : 750=0.39%, 1000=22.76% >> lat (msec) : 2=36.67%, 4=38.88%, 10=0.52%, 20=0.13%, 50=0.52% >> cpu : usr=0.15%, sys=5.69%, ctx=2288, majf=0, minf=0 >> IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >> >=64=0.0% >> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >> >=64=0.0% >> complete : 0=0.1%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >> >=64=0.0% >> issued : total=r=0/w=769/d=0, short=r=0/w=0/d=0 >> latency : target=0, window=0, percentile=100.00%, depth=1 >> randrwrite: (groupid=0, jobs=1): err= 9 (file:engines/sync.c:67, >> func=xfer, error=Bad file descriptor): pid=10454: Mon Mar 31 18:05:26 >> 2014 >> write: io=2528.0KB, bw=478497B/s, iops=117, runt= 5410msec >> clat (usec): min=299, max=33920, avg=2106.79, stdev=3775.38 >> lat (usec): min=303, max=33930, avg=2113.60, stdev=3775.68 >> clat percentiles (usec): >> | 1.00th=[ 564], 5.00th=[ 820], 10.00th=[ 932], 20.00th=[ >> 996], >> | 30.00th=[ 1064], 40.00th=[ 1160], 50.00th=[ 1256], 60.00th=[ >> 1976], >> | 70.00th=[ 2224], 80.00th=[ 2384], 90.00th=[ 2672], 95.00th=[ >> 3056], >> | 99.00th=[33024], 99.50th=[33024], 99.90th=[34048], >> 99.95th=[34048], >> | 99.99th=[34048] >> bw (KB /s): min= 46, max= 728, per=18.61%, avg=438.17, >> stdev=265.37 >> lat (usec) : 500=0.16%, 750=3.63%, 1000=16.59% >> lat (msec) : 2=39.81%, 4=37.28%, 10=0.63%, 20=0.32%, 50=1.42% >> cpu : usr=0.30%, sys=5.03%, ctx=2048, majf=0, minf=2 >> IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >> >=64=0.0% >> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >> >=64=0.0% >> complete : 0=0.2%, 4=99.8%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >> >=64=0.0% >> issued : total=r=0/w=633/d=0, short=r=0/w=0/d=0 >> latency : target=0, window=0, percentile=100.00%, depth=1 >> randrwrite: (groupid=0, jobs=1): err= 9 (file:engines/sync.c:67, >> func=xfer, error=Bad file descriptor): pid=10455: Mon Mar 31 18:05:26 >> 2014 >> write: io=3052.0KB, bw=583286B/s, iops=142, runt= 5358msec >> clat (usec): min=541, max=24712, avg=1707.90, stdev=1533.18 >> lat (usec): min=547, max=24743, avg=1714.89, stdev=1533.64 >> clat percentiles (usec): >> | 1.00th=[ 796], 5.00th=[ 876], 10.00th=[ 908], 20.00th=[ >> 980], >> | 30.00th=[ 1048], 40.00th=[ 1128], 50.00th=[ 1240], 60.00th=[ >> 1736], >> | 70.00th=[ 2224], 80.00th=[ 2384], 90.00th=[ 2640], 95.00th=[ >> 2896], >> | 99.00th=[ 3280], 99.50th=[ 4640], 99.90th=[24704], >> 99.95th=[24704], >> | 99.99th=[24704] >> bw (KB /s): min= 204, max= 1009, per=20.39%, avg=480.00, >> stdev=283.93 >> lat (usec) : 750=0.65%, 1000=22.12% >> lat (msec) : 2=38.35%, 4=38.09%, 10=0.26%, 50=0.39% >> cpu : usr=0.37%, sys=5.45%, ctx=2261, majf=0, minf=5 >> IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >> >=64=0.0% >> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >> >=64=0.0% >> complete : 0=0.1%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >> >=64=0.0% >> issued : total=r=0/w=764/d=0, short=r=0/w=0/d=0 >> latency : target=0, window=0, percentile=100.00%, depth=1 >> >> Run status group 0 (all jobs): >> WRITE: io=23552KB, aggrb=2354KB/s, minb=467KB/s, maxb=1489KB/s, >> mint=5358msec, maxt=10001msec >> >> Disk stats (read/write): >> sda: ios=0/13818, merge=0/2227, ticks=0/7428, in_queue=7260, >> util=72.37% >> >> Is file sharing with non-trivial lockfile= setting is expected to work >> with threads=1, or is it an unanticipated use case? > > > No, that's supposed to work. I think it's a race between shared files, and > opening/closing of them when you have a timed based job. I'll take a look at > it. > > -- > Jens Axboe > ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: Shared files with thread=1: func=xfer, err=Bad file descriptor 2014-04-01 15:56 ` Andrey Kuzmin @ 2014-04-01 18:24 ` Jens Axboe 2014-04-01 18:56 ` Andrey Kuzmin 0 siblings, 1 reply; 6+ messages in thread From: Jens Axboe @ 2014-04-01 18:24 UTC (permalink / raw) To: Andrey Kuzmin; +Cc: fio [-- Attachment #1: Type: text/plain, Size: 1801 bytes --] On 04/01/2014 09:56 AM, Andrey Kuzmin wrote: > Thanks. On a separate (or may be related) note, shared files handling > in dup_files seems to be broken (unless I'm missing the logic behind > dropping non-shared files). The patch below has fixed it for me. > > diff --git a/init.c b/init.c > index 9b0666d..99038b2 100644 > --- a/init.c > +++ b/init.c > @@ -1160,17 +1161,20 @@ static int add_job(struct thread_data *td, const char *j > td_new->o.new_group = 0; > > if (file_alloced) { > - td_new->files_index = 0; > - td_new->files_size = 0; > if (td_new->files) { > struct fio_file *f; > for_each_file(td_new, f, i) { > if (f->file_name) > - free(f->file_name); > - free(f); > + sfree(f->file_name); > + sfree(f); > } > + free(td_new->files); > td_new->files = NULL; > } > + td_new->files_index = 0; > + td_new->files_size = 0; > if (td_new->o.filename) { > free(td_new->o.filename); > td_new->o.filename = NULL; Thanks, I'll add this, definitely looks like a missing conversion at some point... And while this isn't the bug, I think it's touching in the right area. Can you try with this one-liner and see if that fixes your issues with EBADF? -- Jens Axboe [-- Attachment #2: file-close-race-shadow.patch --] [-- Type: text/x-patch, Size: 315 bytes --] diff --git a/filesetup.c b/filesetup.c index db80a8b7e28b..ecbe3a5fb41b 100644 --- a/filesetup.c +++ b/filesetup.c @@ -1421,6 +1459,7 @@ void dup_files(struct thread_data *td, struct thread_data *org) assert(0); } __f->fd = -1; + __f->shadow_fd = -1; fio_file_reset(td, __f); if (f->file_name) { ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: Shared files with thread=1: func=xfer, err=Bad file descriptor 2014-04-01 18:24 ` Jens Axboe @ 2014-04-01 18:56 ` Andrey Kuzmin 2014-04-01 18:59 ` Jens Axboe 0 siblings, 1 reply; 6+ messages in thread From: Andrey Kuzmin @ 2014-04-01 18:56 UTC (permalink / raw) To: Jens Axboe; +Cc: fio Seems to have fixed it: EBADF does not show up anymore even in a 120s run. randrwrite: (groupid=0, jobs=1): err= 0: pid=11373: Tue Apr 1 11:54:39 2014 write: io=25608KB, bw=218501B/s, iops=53, runt=120011msec clat (usec): min=317, max=281422, avg=3839.52, stdev=8501.45 lat (usec): min=320, max=281427, avg=3847.68, stdev=8501.90 clat percentiles (usec): | 1.00th=[ 652], 5.00th=[ 844], 10.00th=[ 956], 20.00th=[ 1160], | 30.00th=[ 1512], 40.00th=[ 2160], 50.00th=[ 2576], 60.00th=[ 2960], | 70.00th=[ 3248], 80.00th=[ 3664], 90.00th=[ 4768], 95.00th=[15424], | 99.00th=[33024], 99.50th=[33536], 99.90th=[83456], 99.95th=[209920], | 99.99th=[280576] bw (KB /s): min= 1, max= 1016, per=24.43%, avg=238.72, stdev=226.57 lat (usec) : 500=0.05%, 750=2.58%, 1000=9.56% lat (msec) : 2=25.02%, 4=47.47%, 10=9.03%, 20=3.86%, 50=2.25% lat (msec) : 100=0.11%, 250=0.05%, 500=0.03% cpu : usr=0.24%, sys=3.14%, ctx=20699, majf=0, minf=3 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=6402/d=0, short=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 randrwrite: (groupid=0, jobs=1): err= 0: pid=11374: Tue Apr 1 11:54:39 2014 write: io=26880KB, bw=229370B/s, iops=55, runt=120003msec clat (usec): min=353, max=309465, avg=3914.37, stdev=9882.36 lat (usec): min=356, max=309476, avg=3922.20, stdev=9882.87 clat percentiles (usec): | 1.00th=[ 636], 5.00th=[ 812], 10.00th=[ 924], 20.00th=[ 1112], | 30.00th=[ 1400], 40.00th=[ 1976], 50.00th=[ 2416], 60.00th=[ 2864], | 70.00th=[ 3184], 80.00th=[ 3600], 90.00th=[ 4640], 95.00th=[15424], | 99.00th=[33536], 99.50th=[36096], 99.90th=[166912], 99.95th=[232448], | 99.99th=[309248] bw (KB /s): min= 2, max= 1208, per=24.62%, avg=240.54, stdev=228.83 lat (usec) : 500=0.10%, 750=3.15%, 1000=10.70% lat (msec) : 2=26.32%, 4=45.31%, 10=8.14%, 20=3.41%, 50=2.68% lat (msec) : 100=0.06%, 250=0.07%, 500=0.04% cpu : usr=0.29%, sys=3.12%, ctx=21225, majf=0, minf=1 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=6720/d=0, short=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 randrwrite: (groupid=0, jobs=1): err= 0: pid=11375: Tue Apr 1 11:54:39 2014 write: io=30728KB, bw=262194B/s, iops=64, runt=120008msec clat (usec): min=305, max=315339, avg=4275.75, stdev=12442.00 lat (usec): min=309, max=315347, avg=4284.04, stdev=12442.51 clat percentiles (usec): | 1.00th=[ 628], 5.00th=[ 812], 10.00th=[ 924], 20.00th=[ 1112], | 30.00th=[ 1384], 40.00th=[ 1960], 50.00th=[ 2384], 60.00th=[ 2832], | 70.00th=[ 3184], 80.00th=[ 3600], 90.00th=[ 4640], 95.00th=[16768], | 99.00th=[33536], 99.50th=[39680], 99.90th=[232448], 99.95th=[280576], | 99.99th=[313344] bw (KB /s): min= 2, max= 1295, per=28.94%, avg=282.72, stdev=271.58 lat (usec) : 500=0.09%, 750=3.37%, 1000=10.80% lat (msec) : 2=26.56%, 4=44.99%, 10=7.39%, 20=3.16%, 50=3.33% lat (msec) : 100=0.09%, 250=0.13%, 500=0.08% cpu : usr=0.33%, sys=3.40%, ctx=22837, majf=0, minf=0 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=7682/d=0, short=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 randrwrite: (groupid=0, jobs=1): err= 0: pid=11376: Tue Apr 1 11:54:39 2014 write: io=34064KB, bw=290616B/s, iops=70, runt=120026msec clat (usec): min=351, max=376615, avg=4112.48, stdev=11571.34 lat (usec): min=355, max=376621, avg=4120.65, stdev=11572.04 clat percentiles (usec): | 1.00th=[ 644], 5.00th=[ 812], 10.00th=[ 924], 20.00th=[ 1096], | 30.00th=[ 1352], 40.00th=[ 1864], 50.00th=[ 2384], 60.00th=[ 2768], | 70.00th=[ 3120], 80.00th=[ 3504], 90.00th=[ 4576], 95.00th=[16192], | 99.00th=[34048], 99.50th=[41728], 99.90th=[193536], 99.95th=[259072], | 99.99th=[374784] bw (KB /s): min= 1, max= 1736, per=31.18%, avg=304.63, stdev=317.27 lat (usec) : 500=0.09%, 750=3.43%, 1000=10.79% lat (msec) : 2=27.56%, 4=44.46%, 10=7.21%, 20=2.92%, 50=3.21% lat (msec) : 100=0.15%, 250=0.11%, 500=0.07% cpu : usr=0.28%, sys=3.66%, ctx=24223, majf=0, minf=2 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8516/d=0, short=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 Run status group 0 (all jobs): WRITE: io=117280KB, aggrb=977KB/s, minb=213KB/s, maxb=283KB/s, mint=120003msec, maxt=120026msec Disk stats (read/write): sda: ios=0/76211, merge=0/17768, ticks=0/99456, in_queue=98572, util=81.61% Regards, Andrey On Tue, Apr 1, 2014 at 11:24 AM, Jens Axboe <axboe@kernel.dk> wrote: > On 04/01/2014 09:56 AM, Andrey Kuzmin wrote: >> >> Thanks. On a separate (or may be related) note, shared files handling >> in dup_files seems to be broken (unless I'm missing the logic behind >> dropping non-shared files). The patch below has fixed it for me. >> >> diff --git a/init.c b/init.c >> index 9b0666d..99038b2 100644 >> --- a/init.c >> +++ b/init.c >> @@ -1160,17 +1161,20 @@ static int add_job(struct thread_data *td, const >> char *j >> td_new->o.new_group = 0; >> >> if (file_alloced) { >> - td_new->files_index = 0; >> - td_new->files_size = 0; >> if (td_new->files) { >> struct fio_file *f; >> for_each_file(td_new, f, i) { >> if (f->file_name) >> - free(f->file_name); >> - free(f); >> + sfree(f->file_name); >> + sfree(f); >> } >> + free(td_new->files); >> td_new->files = NULL; >> } >> + td_new->files_index = 0; >> + td_new->files_size = 0; >> if (td_new->o.filename) { >> free(td_new->o.filename); >> td_new->o.filename = NULL; > > > Thanks, I'll add this, definitely looks like a missing conversion at some > point... > > And while this isn't the bug, I think it's touching in the right area. Can > you try with this one-liner and see if that fixes your issues with EBADF? > > -- > Jens Axboe > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Shared files with thread=1: func=xfer, err=Bad file descriptor 2014-04-01 18:56 ` Andrey Kuzmin @ 2014-04-01 18:59 ` Jens Axboe 0 siblings, 0 replies; 6+ messages in thread From: Jens Axboe @ 2014-04-01 18:59 UTC (permalink / raw) To: Andrey Kuzmin; +Cc: fio On 04/01/2014 12:56 PM, Andrey Kuzmin wrote: > Seems to have fixed it: EBADF does not show up anymore even in a 120s run. Great, it has been committed. The issue was that fio would mistakenly not assign shadow_fd on that one alloc, hence it would think it was fd 0 and close it. And when open(2) then (validly) started handing out untracked fd 0, then all hell broke lose. -- Jens Axboe ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2014-04-01 18:59 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-04-01 1:08 Shared files with thread=1: func=xfer, err=Bad file descriptor Andrey Kuzmin 2014-04-01 14:31 ` Jens Axboe 2014-04-01 15:56 ` Andrey Kuzmin 2014-04-01 18:24 ` Jens Axboe 2014-04-01 18:56 ` Andrey Kuzmin 2014-04-01 18:59 ` Jens Axboe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox