* fio rbd hang for block sizes > 1M
@ 2014-10-24 2:38 Mark Kirkwood
2014-10-24 5:35 ` Jens Axboe
0 siblings, 1 reply; 52+ messages in thread
From: Mark Kirkwood @ 2014-10-24 2:38 UTC (permalink / raw)
To: fio
[-- Attachment #1: Type: text/plain, Size: 10899 bytes --]
I stumbled across this while performance testing a new Ceph cluster:
Env:
Ceph 0.86-467-g317b83d (317b83dddd1a917f70838870b31931a79bdd4dd0)
Ubuntu 14.04 (3.13.0-37-generic #64-Ubuntu SMP Mon Sep 22 21:28:38 UTC
2014 x86_64 x86_64 x86_64 GNU/Linux)
Fio fio-2.1.13-88-gb2ee7
Cmd:
$ rbd ls -l
NAME SIZE PARENT FMT PROT LOCK
vol0 4096M 1
$ fio read-test.fio # attached
rbd_thread: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=rbd, iodepth=32
fio-2.1.13-88-gb2ee7
Starting 1 process
rbd engine: RBD version: 0.1.8
Killed1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
1158050441d:06h:59m:33s]
A block size of 1M usually works; 2M and 4M always fail. The rbd volume
should be written to first (just change read to write in the workload
file). Note that a 2-4M block size is fine for writes!
Running the read variant under valgrind shows several invalid reads -
only for these bigger block sizes, so I'm guessing they are the problem:
$ valgrind fio read-test.fio
==12519== Memcheck, a memory error detector
==12519== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==12519== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for
copyright info
==12519== Command: fio read-test.fio
==12519==
rbd_thread: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=rbd, iodepth=32
fio-2.1.13-88-gb2ee7
Starting 1 process
rbd engine: RBD version: 0.1.8
==12519== Thread 6:
==12519== Invalid read of size 8
==12519== at 0x4EFA7B3: ObjectCacher::_readx(ObjectCacher::OSDRead*,
ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1158)
==12519== by 0x4E965A7:
librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*,
unsigned long, unsigned long, Context*) (ImageCtx.cc:484)
==12519== by 0x4EAA9FA: librbd::aio_read(librbd::ImageCtx*,
std::vector<std::pair<unsigned long, unsigned long>,
std::allocator<std::pair<unsigned long, unsigned long> > > const&,
char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3262)
==12519== by 0x4EAB872: librbd::aio_read(librbd::ImageCtx*, unsigned
long, unsigned long, char*, ceph::buffer::list*, librbd::AioCompletion*)
(internal.cc:3135)
==12519== by 0x4E8B737: rbd_aio_read (librbd.cc:1518)
==12519== by 0x459D92: fio_rbd_queue (rbd.c:294)
==12519== by 0x40D379: td_io_queue (ioengines.c:300)
==12519== by 0x44B77E: thread_main (backend.c:781)
==12519== by 0x81F6181: start_thread (pthread_create.c:312)
==12519== by 0x870AFBC: clone (clone.S:111)
==12519== Address 0x197b6fe0 is 48 bytes inside a block of size 264 free'd
==12519== at 0x4C2C2BC: operator delete(void*) (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12519== by 0x4EFA7AE: ObjectCacher::_readx(ObjectCacher::OSDRead*,
ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1149)
==12519== by 0x4E965A7:
librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*,
unsigned long, unsigned long, Context*) (ImageCtx.cc:484)
==12519== by 0x4EAA9FA: librbd::aio_read(librbd::ImageCtx*,
std::vector<std::pair<unsigned long, unsigned long>,
std::allocator<std::pair<unsigned long, unsigned long> > > const&,
char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3262)
==12519== by 0x4EAB872: librbd::aio_read(librbd::ImageCtx*, unsigned
long, unsigned long, char*, ceph::buffer::list*, librbd::AioCompletion*)
(internal.cc:3135)
==12519== by 0x4E8B737: rbd_aio_read (librbd.cc:1518)
==12519== by 0x459D92: fio_rbd_queue (rbd.c:294)
==12519== by 0x40D379: td_io_queue (ioengines.c:300)
==12519== by 0x44B77E: thread_main (backend.c:781)
==12519== by 0x81F6181: start_thread (pthread_create.c:312)
==12519== by 0x870AFBC: clone (clone.S:111)
==12519==
==12519== Invalid read of size 8
==12519== at 0x4EFA7CD: ObjectCacher::_readx(ObjectCacher::OSDRead*,
ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.h:170)
==12519== by 0x4E965A7:
librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*,
unsigned long, unsigned long, Context*) (ImageCtx.cc:484)
==12519== by 0x4EAA9FA: librbd::aio_read(librbd::ImageCtx*,
std::vector<std::pair<unsigned long, unsigned long>,
std::allocator<std::pair<unsigned long, unsigned long> > > const&,
char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3262)
==12519== by 0x4EAB872: librbd::aio_read(librbd::ImageCtx*, unsigned
long, unsigned long, char*, ceph::buffer::list*, librbd::AioCompletion*)
(internal.cc:3135)
==12519== by 0x4E8B737: rbd_aio_read (librbd.cc:1518)
==12519== by 0x459D92: fio_rbd_queue (rbd.c:294)
==12519== by 0x40D379: td_io_queue (ioengines.c:300)
==12519== by 0x44B77E: thread_main (backend.c:781)
==12519== by 0x81F6181: start_thread (pthread_create.c:312)
==12519== by 0x870AFBC: clone (clone.S:111)
==12519== Address 0x197b6fe8 is 56 bytes inside a block of size 264 free'd
==12519== at 0x4C2C2BC: operator delete(void*) (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12519== by 0x4EFA7AE: ObjectCacher::_readx(ObjectCacher::OSDRead*,
ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1149)
==12519== by 0x4E965A7:
librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*,
unsigned long, unsigned long, Context*) (ImageCtx.cc:484)
==12519== by 0x4EAA9FA: librbd::aio_read(librbd::ImageCtx*,
std::vector<std::pair<unsigned long, unsigned long>,
std::allocator<std::pair<unsigned long, unsigned long> > > const&,
char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3262)
==12519== by 0x4EAB872: librbd::aio_read(librbd::ImageCtx*, unsigned
long, unsigned long, char*, ceph::buffer::list*, librbd::AioCompletion*)
(internal.cc:3135)
==12519== by 0x4E8B737: rbd_aio_read (librbd.cc:1518)
==12519== by 0x459D92: fio_rbd_queue (rbd.c:294)
==12519== by 0x40D379: td_io_queue (ioengines.c:300)
==12519== by 0x44B77E: thread_main (backend.c:781)
==12519== by 0x81F6181: start_thread (pthread_create.c:312)
==12519== by 0x870AFBC: clone (clone.S:111)
==12519==
==12519== Thread 18:
==12519== Invalid read of size 8
==12519== at 0x4EFA7B3: ObjectCacher::_readx(ObjectCacher::OSDRead*,
ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1158)
==12519== by 0x4F027BF: ObjectCacher::C_RetryRead::finish(int)
(ObjectCacher.h:581)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519== by 0x4EFF083: void finish_contexts<Context>(CephContext*,
std::list<Context*, std::allocator<Context*> >&, int) (Context.h:120)
==12519== by 0x4EF489C: ObjectCacher::bh_read_finish(long, sobject_t,
unsigned long, long, unsigned long, ceph::buffer::list&, int, bool)
(ObjectCacher.cc:805)
==12519== by 0x4F01590: ObjectCacher::C_ReadFinish::finish(int)
(ObjectCacher.h:504)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519== by 0x4EB9BBC: librbd::C_Request::finish(int)
(LibrbdWriteback.cc:54)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519== by 0x53B64FC: librados::C_AioComplete::finish(int)
(AioCompletionImpl.h:180)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519== by 0x5452397: Finisher::finisher_thread_entry()
(Finisher.cc:59)
==12519== Address 0x1a299710 is 48 bytes inside a block of size 264 free'd
==12519== at 0x4C2C2BC: operator delete(void*) (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12519== by 0x4EFA7AE: ObjectCacher::_readx(ObjectCacher::OSDRead*,
ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1149)
==12519== by 0x4F027BF: ObjectCacher::C_RetryRead::finish(int)
(ObjectCacher.h:581)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519== by 0x4EFF083: void finish_contexts<Context>(CephContext*,
std::list<Context*, std::allocator<Context*> >&, int) (Context.h:120)
==12519== by 0x4EF489C: ObjectCacher::bh_read_finish(long, sobject_t,
unsigned long, long, unsigned long, ceph::buffer::list&, int, bool)
(ObjectCacher.cc:805)
==12519== by 0x4F01590: ObjectCacher::C_ReadFinish::finish(int)
(ObjectCacher.h:504)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519== by 0x4EB9BBC: librbd::C_Request::finish(int)
(LibrbdWriteback.cc:54)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519== by 0x53B64FC: librados::C_AioComplete::finish(int)
(AioCompletionImpl.h:180)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519==
==12519== Invalid read of size 8
==12519== at 0x4EFA7CD: ObjectCacher::_readx(ObjectCacher::OSDRead*,
ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.h:170)
==12519== by 0x4F027BF: ObjectCacher::C_RetryRead::finish(int)
(ObjectCacher.h:581)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519== by 0x4EFF083: void finish_contexts<Context>(CephContext*,
std::list<Context*, std::allocator<Context*> >&, int) (Context.h:120)
==12519== by 0x4EF489C: ObjectCacher::bh_read_finish(long, sobject_t,
unsigned long, long, unsigned long, ceph::buffer::list&, int, bool)
(ObjectCacher.cc:805)
==12519== by 0x4F01590: ObjectCacher::C_ReadFinish::finish(int)
(ObjectCacher.h:504)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519== by 0x4EB9BBC: librbd::C_Request::finish(int)
(LibrbdWriteback.cc:54)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519== by 0x53B64FC: librados::C_AioComplete::finish(int)
(AioCompletionImpl.h:180)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519== by 0x5452397: Finisher::finisher_thread_entry()
(Finisher.cc:59)
==12519== Address 0x1a299718 is 56 bytes inside a block of size 264 free'd
==12519== at 0x4C2C2BC: operator delete(void*) (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12519== by 0x4EFA7AE: ObjectCacher::_readx(ObjectCacher::OSDRead*,
ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1149)
==12519== by 0x4F027BF: ObjectCacher::C_RetryRead::finish(int)
(ObjectCacher.h:581)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519== by 0x4EFF083: void finish_contexts<Context>(CephContext*,
std::list<Context*, std::allocator<Context*> >&, int) (Context.h:120)
==12519== by 0x4EF489C: ObjectCacher::bh_read_finish(long, sobject_t,
unsigned long, long, unsigned long, ceph::buffer::list&, int, bool)
(ObjectCacher.cc:805)
==12519== by 0x4F01590: ObjectCacher::C_ReadFinish::finish(int)
(ObjectCacher.h:504)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519== by 0x4EB9BBC: librbd::C_Request::finish(int)
(LibrbdWriteback.cc:54)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519== by 0x53B64FC: librados::C_AioComplete::finish(int)
(AioCompletionImpl.h:180)
==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
==12519==
[-- Attachment #2: read-test.fio --]
[-- Type: text/plain, Size: 650 bytes --]
######################################################################
# Example test for the RBD engine.
#
# From http://telekomcloud.github.io/ceph/2014/02/26/ceph-performance-analysis_fio_rbd.html
#
# Runs a 2M sequential read test against an RBD via librbd
#
# NOTE: Make sure you have either an RBD named 'vol0' or change
# the rbdname parameter.
######################################################################
[global]
#logging
#write_iops_log=write_iops_log
#write_bw_log=write_bw_log
#write_lat_log=write_lat_log
ioengine=rbd
clientname=admin
pool=rbd
rbdname=vol0
invalidate=0 # mandatory
rw=read
bs=2M
[rbd_thread]
iodepth=32
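
The reproduction needs the same job run first as a write and then as a read, at several block sizes. A minimal sketch of how the job file above could be parameterized follows. The function name `make_fio_job` is hypothetical and not part of fio; the option values are copied from the attached read-test.fio.

```python
def make_fio_job(rw: str, bs: str, rbdname: str = "vol0", iodepth: int = 32) -> str:
    """Render a fio job file for the rbd engine, mirroring read-test.fio.

    make_fio_job is an illustrative helper, not a fio API.
    """
    lines = [
        "[global]",
        "ioengine=rbd",
        "clientname=admin",
        "pool=rbd",
        f"rbdname={rbdname}",
        "invalidate=0",
        f"rw={rw}",   # "write" to populate the volume, then "read"
        f"bs={bs}",   # the thread reports 1M, 2M and 4M
        "[rbd_thread]",
        f"iodepth={iodepth}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the write variant that has to run before the read test.
    print(make_fio_job("write", "2M"), end="")
```

Saving the output for `rw=write` and then for `rw=read` gives the two job files described in the report.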
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd hang for block sizes > 1M
2014-10-24 2:38 fio rbd hang for block sizes > 1M Mark Kirkwood
@ 2014-10-24 5:35 ` Jens Axboe
2014-10-24 6:17 ` Mark Kirkwood
2014-10-24 14:11 ` Danny Al-Gaaf
0 siblings, 2 replies; 52+ messages in thread
From: Jens Axboe @ 2014-10-24 5:35 UTC (permalink / raw)
To: Mark Kirkwood, fio; +Cc: d.gollub@telekom.de >> Daniel Gollub, xan.peng
CC'ing relevant parties, leaving email intact.
On 2014-10-23 20:38, Mark Kirkwood wrote:
> I stumbled across this while performance testing a new Ceph cluster:
>
> Env:
>
> Ceph 0.86-467-g317b83d (317b83dddd1a917f70838870b31931a79bdd4dd0)
> Ubuntu 14.04 (3.13.0-37-generic #64-Ubuntu SMP Mon Sep 22 21:28:38 UTC
> 2014 x86_64 x86_64 x86_64 GNU/Linux)
> Fio fio-2.1.13-88-gb2ee7
>
> Cmd:
>
> $ rbd ls -l
> NAME SIZE PARENT FMT PROT LOCK
> vol0 4096M 1
>
> $ fio read-test.fio # attached
> rbd_thread: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=rbd, iodepth=32
> fio-2.1.13-88-gb2ee7
> Starting 1 process
> rbd engine: RBD version: 0.1.8
> Killed1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
> 1158050441d:06h:59m:33s]
>
> A block size of 1M usually works; 2M and 4M always fail. The rbd volume
> should be written to first (just change read to write in the workload
> file). Note that a 2-4M block size is fine for writes!
>
> Running the read variant under valgrind shows several invalid reads -
> only for these bigger block sizes, so I'm guessing they are the problem:
>
> $ valgrind fio read-test.fio
> ==12519== Memcheck, a memory error detector
> ==12519== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
> ==12519== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for
> copyright info
> ==12519== Command: fio read-test.fio
> ==12519==
> rbd_thread: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=rbd, iodepth=32
> fio-2.1.13-88-gb2ee7
> Starting 1 process
> rbd engine: RBD version: 0.1.8
> ==12519== Thread 6:
> ==12519== Invalid read of size 8
> ==12519== at 0x4EFA7B3: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1158)
> ==12519== by 0x4E965A7:
> librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*,
> unsigned long, unsigned long, Context*) (ImageCtx.cc:484)
> ==12519== by 0x4EAA9FA: librbd::aio_read(librbd::ImageCtx*,
> std::vector<std::pair<unsigned long, unsigned long>,
> std::allocator<std::pair<unsigned long, unsigned long> > > const&,
> char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3262)
> ==12519== by 0x4EAB872: librbd::aio_read(librbd::ImageCtx*, unsigned
> long, unsigned long, char*, ceph::buffer::list*, librbd::AioCompletion*)
> (internal.cc:3135)
> ==12519== by 0x4E8B737: rbd_aio_read (librbd.cc:1518)
> ==12519== by 0x459D92: fio_rbd_queue (rbd.c:294)
> ==12519== by 0x40D379: td_io_queue (ioengines.c:300)
> ==12519== by 0x44B77E: thread_main (backend.c:781)
> ==12519== by 0x81F6181: start_thread (pthread_create.c:312)
> ==12519== by 0x870AFBC: clone (clone.S:111)
> ==12519== Address 0x197b6fe0 is 48 bytes inside a block of size 264 free'd
> ==12519== at 0x4C2C2BC: operator delete(void*) (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==12519== by 0x4EFA7AE: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1149)
> ==12519== by 0x4E965A7:
> librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*,
> unsigned long, unsigned long, Context*) (ImageCtx.cc:484)
> ==12519== by 0x4EAA9FA: librbd::aio_read(librbd::ImageCtx*,
> std::vector<std::pair<unsigned long, unsigned long>,
> std::allocator<std::pair<unsigned long, unsigned long> > > const&,
> char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3262)
> ==12519== by 0x4EAB872: librbd::aio_read(librbd::ImageCtx*, unsigned
> long, unsigned long, char*, ceph::buffer::list*, librbd::AioCompletion*)
> (internal.cc:3135)
> ==12519== by 0x4E8B737: rbd_aio_read (librbd.cc:1518)
> ==12519== by 0x459D92: fio_rbd_queue (rbd.c:294)
> ==12519== by 0x40D379: td_io_queue (ioengines.c:300)
> ==12519== by 0x44B77E: thread_main (backend.c:781)
> ==12519== by 0x81F6181: start_thread (pthread_create.c:312)
> ==12519== by 0x870AFBC: clone (clone.S:111)
> ==12519==
> ==12519== Invalid read of size 8
> ==12519== at 0x4EFA7CD: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.h:170)
> ==12519== by 0x4E965A7:
> librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*,
> unsigned long, unsigned long, Context*) (ImageCtx.cc:484)
> ==12519== by 0x4EAA9FA: librbd::aio_read(librbd::ImageCtx*,
> std::vector<std::pair<unsigned long, unsigned long>,
> std::allocator<std::pair<unsigned long, unsigned long> > > const&,
> char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3262)
> ==12519== by 0x4EAB872: librbd::aio_read(librbd::ImageCtx*, unsigned
> long, unsigned long, char*, ceph::buffer::list*, librbd::AioCompletion*)
> (internal.cc:3135)
> ==12519== by 0x4E8B737: rbd_aio_read (librbd.cc:1518)
> ==12519== by 0x459D92: fio_rbd_queue (rbd.c:294)
> ==12519== by 0x40D379: td_io_queue (ioengines.c:300)
> ==12519== by 0x44B77E: thread_main (backend.c:781)
> ==12519== by 0x81F6181: start_thread (pthread_create.c:312)
> ==12519== by 0x870AFBC: clone (clone.S:111)
> ==12519== Address 0x197b6fe8 is 56 bytes inside a block of size 264 free'd
> ==12519== at 0x4C2C2BC: operator delete(void*) (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==12519== by 0x4EFA7AE: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1149)
> ==12519== by 0x4E965A7:
> librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*,
> unsigned long, unsigned long, Context*) (ImageCtx.cc:484)
> ==12519== by 0x4EAA9FA: librbd::aio_read(librbd::ImageCtx*,
> std::vector<std::pair<unsigned long, unsigned long>,
> std::allocator<std::pair<unsigned long, unsigned long> > > const&,
> char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3262)
> ==12519== by 0x4EAB872: librbd::aio_read(librbd::ImageCtx*, unsigned
> long, unsigned long, char*, ceph::buffer::list*, librbd::AioCompletion*)
> (internal.cc:3135)
> ==12519== by 0x4E8B737: rbd_aio_read (librbd.cc:1518)
> ==12519== by 0x459D92: fio_rbd_queue (rbd.c:294)
> ==12519== by 0x40D379: td_io_queue (ioengines.c:300)
> ==12519== by 0x44B77E: thread_main (backend.c:781)
> ==12519== by 0x81F6181: start_thread (pthread_create.c:312)
> ==12519== by 0x870AFBC: clone (clone.S:111)
> ==12519==
> ==12519== Thread 18:
> ==12519== Invalid read of size 8
> ==12519== at 0x4EFA7B3: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1158)
> ==12519== by 0x4F027BF: ObjectCacher::C_RetryRead::finish(int)
> (ObjectCacher.h:581)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EFF083: void finish_contexts<Context>(CephContext*,
> std::list<Context*, std::allocator<Context*> >&, int) (Context.h:120)
> ==12519== by 0x4EF489C: ObjectCacher::bh_read_finish(long, sobject_t,
> unsigned long, long, unsigned long, ceph::buffer::list&, int, bool)
> (ObjectCacher.cc:805)
> ==12519== by 0x4F01590: ObjectCacher::C_ReadFinish::finish(int)
> (ObjectCacher.h:504)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EB9BBC: librbd::C_Request::finish(int)
> (LibrbdWriteback.cc:54)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x53B64FC: librados::C_AioComplete::finish(int)
> (AioCompletionImpl.h:180)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x5452397: Finisher::finisher_thread_entry()
> (Finisher.cc:59)
> ==12519== Address 0x1a299710 is 48 bytes inside a block of size 264 free'd
> ==12519== at 0x4C2C2BC: operator delete(void*) (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==12519== by 0x4EFA7AE: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1149)
> ==12519== by 0x4F027BF: ObjectCacher::C_RetryRead::finish(int)
> (ObjectCacher.h:581)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EFF083: void finish_contexts<Context>(CephContext*,
> std::list<Context*, std::allocator<Context*> >&, int) (Context.h:120)
> ==12519== by 0x4EF489C: ObjectCacher::bh_read_finish(long, sobject_t,
> unsigned long, long, unsigned long, ceph::buffer::list&, int, bool)
> (ObjectCacher.cc:805)
> ==12519== by 0x4F01590: ObjectCacher::C_ReadFinish::finish(int)
> (ObjectCacher.h:504)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EB9BBC: librbd::C_Request::finish(int)
> (LibrbdWriteback.cc:54)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x53B64FC: librados::C_AioComplete::finish(int)
> (AioCompletionImpl.h:180)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519==
> ==12519== Invalid read of size 8
> ==12519== at 0x4EFA7CD: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.h:170)
> ==12519== by 0x4F027BF: ObjectCacher::C_RetryRead::finish(int)
> (ObjectCacher.h:581)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EFF083: void finish_contexts<Context>(CephContext*,
> std::list<Context*, std::allocator<Context*> >&, int) (Context.h:120)
> ==12519== by 0x4EF489C: ObjectCacher::bh_read_finish(long, sobject_t,
> unsigned long, long, unsigned long, ceph::buffer::list&, int, bool)
> (ObjectCacher.cc:805)
> ==12519== by 0x4F01590: ObjectCacher::C_ReadFinish::finish(int)
> (ObjectCacher.h:504)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EB9BBC: librbd::C_Request::finish(int)
> (LibrbdWriteback.cc:54)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x53B64FC: librados::C_AioComplete::finish(int)
> (AioCompletionImpl.h:180)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x5452397: Finisher::finisher_thread_entry()
> (Finisher.cc:59)
> ==12519== Address 0x1a299718 is 56 bytes inside a block of size 264 free'd
> ==12519== at 0x4C2C2BC: operator delete(void*) (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==12519== by 0x4EFA7AE: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1149)
> ==12519== by 0x4F027BF: ObjectCacher::C_RetryRead::finish(int)
> (ObjectCacher.h:581)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EFF083: void finish_contexts<Context>(CephContext*,
> std::list<Context*, std::allocator<Context*> >&, int) (Context.h:120)
> ==12519== by 0x4EF489C: ObjectCacher::bh_read_finish(long, sobject_t,
> unsigned long, long, unsigned long, ceph::buffer::list&, int, bool)
> (ObjectCacher.cc:805)
> ==12519== by 0x4F01590: ObjectCacher::C_ReadFinish::finish(int)
> (ObjectCacher.h:504)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EB9BBC: librbd::C_Request::finish(int)
> (LibrbdWriteback.cc:54)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x53B64FC: librados::C_AioComplete::finish(int)
> (AioCompletionImpl.h:180)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519==
--
Jens Axboe
* Re: fio rbd hang for block sizes > 1M
2014-10-24 5:35 ` Jens Axboe
@ 2014-10-24 6:17 ` Mark Kirkwood
2014-10-24 13:19 ` Mark Nelson
2014-10-24 14:11 ` Danny Al-Gaaf
1 sibling, 1 reply; 52+ messages in thread
From: Mark Kirkwood @ 2014-10-24 6:17 UTC (permalink / raw)
To: Jens Axboe, fio; +Cc: d.gollub@telekom.de >> Daniel Gollub, xan.peng
On 24/10/14 18:35, Jens Axboe wrote:
> CC'ing relevant parties, leaving email intact.
>
Note that the 'Killed' is there because I killed the run - it hangs and
appears to be uninterruptible. I missed that when pasting, sorry!
>> $ fio read-test.fio # attached
>> rbd_thread: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=rbd,
>> iodepth=32
>> fio-2.1.13-88-gb2ee7
>> Starting 1 process
>> rbd engine: RBD version: 0.1.8
>> Killed1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
>> 1158050441d:06h:59m:33s]
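
Because the failing run hangs rather than exiting, a test harness needs a wall-clock limit to tell a hang from an ordinary failure. The sketch below shows one way to do that with Python's `subprocess.run`, which sends SIGKILL to the child when the timeout expires. The helper name `run_with_timeout` and the timeout value in the usage comment are assumptions, not anything from fio.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Classify a command as 'ok', 'failed', or 'hung' under a time limit.

    run_with_timeout is an illustrative helper. On timeout, subprocess.run
    kills the child and reaps it before raising TimeoutExpired.
    """
    try:
        result = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return "hung"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Harmless demonstration with a child that exits immediately.
    print(run_with_timeout([sys.executable, "-c", "pass"], 30))
    # Hypothetical use against the job in this thread (600 s is a guess):
    #   run_with_timeout(["fio", "read-test.fio"], 600)
```

A result of "hung" for the 2M read job and "ok" for the 1M job would match the behaviour reported here.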
* Re: fio rbd hang for block sizes > 1M
2014-10-24 6:17 ` Mark Kirkwood
@ 2014-10-24 13:19 ` Mark Nelson
2014-10-24 14:09 ` Mark Nelson
2014-10-24 22:30 ` fio rbd hang for block sizes > 1M Mark Kirkwood
0 siblings, 2 replies; 52+ messages in thread
From: Mark Nelson @ 2014-10-24 13:19 UTC (permalink / raw)
To: Mark Kirkwood, Jens Axboe, fio
Cc: d.gollub@telekom.de >> Daniel Gollub, xan.peng
FWIW we are seeing this at Redhat/Inktank with recent fio from master
and ceph giant branch as well.
Mark
On 10/24/2014 01:17 AM, Mark Kirkwood wrote:
> On 24/10/14 18:35, Jens Axboe wrote:
>> CC'ing relevant parties, leaving email intact.
>>
>
> Note that the 'Killed' is there because I killed the run - it hangs and
> appears to be uninterruptible. I missed that when pasting, sorry!
>
>>> $ fio read-test.fio # attached
>>> rbd_thread: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=rbd,
>>> iodepth=32
>>> fio-2.1.13-88-gb2ee7
>>> Starting 1 process
>>> rbd engine: RBD version: 0.1.8
>>> Killed1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
>>> 1158050441d:06h:59m:33s]
>
> --
> To unsubscribe from this list: send the line "unsubscribe fio" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: fio rbd hang for block sizes > 1M
2014-10-24 13:19 ` Mark Nelson
@ 2014-10-24 14:09 ` Mark Nelson
2014-10-24 14:30 ` Jens Axboe
2014-10-24 22:45 ` Mark Kirkwood
2014-10-24 22:30 ` fio rbd hang for block sizes > 1M Mark Kirkwood
1 sibling, 2 replies; 52+ messages in thread
From: Mark Nelson @ 2014-10-24 14:09 UTC (permalink / raw)
To: Mark Kirkwood, Jens Axboe, fio
Cc: d.gollub@telekom.de >> Daniel Gollub, xan.peng,
ceph-devel@vger.kernel.org
More info:
I went back and tested fio versions back to 2.1.10 and still encountered
the issue. I then went back and tested the v0.86 release versus giant
and was able to get through a 4MB read test without error. I suspect
this is not an fio problem. I'll try to narrow down the commit after
0.86 that is causing this.
Mark
On 10/24/2014 08:19 AM, Mark Nelson wrote:
> FWIW we are seeing this at Redhat/Inktank with recent fio from master
> and ceph giant branch as well.
>
> Mark
>
> On 10/24/2014 01:17 AM, Mark Kirkwood wrote:
>> On 24/10/14 18:35, Jens Axboe wrote:
>>> CC'ing relevant parties, leaving email intact.
>>>
>>
>> Note that the 'Killed' is there because I killed the run - it hangs and
>> appears to be uninterruptible. I missed that when pasting, sorry!
>>
>>>> $ fio read-test.fio # attached
>>>> rbd_thread: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=rbd,
>>>> iodepth=32
>>>> fio-2.1.13-88-gb2ee7
>>>> Starting 1 process
>>>> rbd engine: RBD version: 0.1.8
>>>> Killed1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
>>>> 1158050441d:06h:59m:33s]
>>
>
* Re: fio rbd hang for block sizes > 1M
2014-10-24 5:35 ` Jens Axboe
2014-10-24 6:17 ` Mark Kirkwood
@ 2014-10-24 14:11 ` Danny Al-Gaaf
2014-10-24 14:31 ` Jens Axboe
1 sibling, 1 reply; 52+ messages in thread
From: Danny Al-Gaaf @ 2014-10-24 14:11 UTC (permalink / raw)
To: Jens Axboe, Mark Kirkwood, fio; +Cc: xan.peng
Am 24.10.2014 um 07:35 schrieb Jens Axboe:
> CC'ing relevant parties, leaving email intact.
>
I'll take a look at it.
@Jens: I removed Daniel from the thread since his email is no longer valid.
Regards,
Danny
* Re: fio rbd hang for block sizes > 1M
2014-10-24 14:09 ` Mark Nelson
@ 2014-10-24 14:30 ` Jens Axboe
2014-10-24 22:45 ` Mark Kirkwood
1 sibling, 0 replies; 52+ messages in thread
From: Jens Axboe @ 2014-10-24 14:30 UTC (permalink / raw)
To: Mark Nelson, Mark Kirkwood, fio
Cc: d.gollub@telekom.de >> Daniel Gollub, xan.peng,
ceph-devel@vger.kernel.org
On 2014-10-24 08:09, Mark Nelson wrote:
> More info:
>
> I went back and tested fio versions back to 2.1.10 and still encountered
> the issue. I then went back and tested the v0.86 release versus giant
> and was able to get through a 4MB read test without error. I suspect
> this is not an fio problem. I'll try to narrow down the commit after
> 0.86 that is causing this.
Thanks, it doesn't look like a fio problem if it's dependent on the
block size used. Might warrant a check in the fio configure script, so
we can fail (or limit) read sizes on the problematic versions.
--
Jens Axboe
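
The "limit read sizes" option suggested above amounts to clamping the requested block size when an affected librbd is detected. The sketch below illustrates only that decision. fio's configure script is shell and the rbd engine is C, so this Python is not how fio would implement it. The name `clamp_read_bs`, the `librbd_affected` flag, and the 1 MiB cap are assumptions; the cap is taken from the report that 1M reads "usually" work, which is not a guarantee.

```python
MIB = 1024 * 1024


def clamp_read_bs(requested_bs: int, librbd_affected: bool, safe_bs: int = MIB) -> int:
    """Return the read block size to use, in bytes.

    Hypothetical helper: how to decide librbd_affected (which versions are
    problematic) is left to the caller and is not established in this thread.
    """
    if librbd_affected and requested_bs > safe_bs:
        return safe_bs
    return requested_bs


if __name__ == "__main__":
    print(clamp_read_bs(2 * MIB, librbd_affected=True))
    print(clamp_read_bs(2 * MIB, librbd_affected=False))
```

Failing outright instead of clamping would be the other option mentioned; it avoids silently changing the workload being measured.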
* Re: fio rbd hang for block sizes > 1M
2014-10-24 14:11 ` Danny Al-Gaaf
@ 2014-10-24 14:31 ` Jens Axboe
0 siblings, 0 replies; 52+ messages in thread
From: Jens Axboe @ 2014-10-24 14:31 UTC (permalink / raw)
To: Danny Al-Gaaf, Mark Kirkwood, fio; +Cc: xan.peng
On 2014-10-24 08:11, Danny Al-Gaaf wrote:
> Am 24.10.2014 um 07:35 schrieb Jens Axboe:
>> CC'ing relevant parties, leaving email intact.
>>
>
> I'll take a look at it.
Thanks!
> @Jens: I removed Daniel from the thread since his email is no longer valid.
Yeah, forgot to remove him on subsequent emails, sorry.
--
Jens Axboe
* Re: fio rbd hang for block sizes > 1M
2014-10-24 13:19 ` Mark Nelson
2014-10-24 14:09 ` Mark Nelson
@ 2014-10-24 22:30 ` Mark Kirkwood
2014-10-24 22:38 ` Mark Nelson
1 sibling, 1 reply; 52+ messages in thread
From: Mark Kirkwood @ 2014-10-24 22:30 UTC (permalink / raw)
To: Mark Nelson, Jens Axboe, fio
Cc: d.gollub@telekom.de >> Daniel Gollub, xan.peng
It looks like it is an rbd cache issue:
http://tracker.ceph.com/issues/9854
If I disable the rbd cache:
$ tail /etc/ceph/ceph.conf
...
[client]
rbd cache = false
then the 2-4M reads work fine (no invalid reads in valgrind either).
Regards
Mark
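
Before rerunning the read test it can help to confirm that the workaround is actually present in the config the client will load. The following is a minimal sketch that checks the text of a ceph.conf for a disabled rbd cache in the `[client]` section. It assumes the file is plain INI that Python's `configparser` can parse, it treats spaces, underscores and hyphens in the option name as equivalent, and the list of "false" spellings is an assumption. It does not look at other sections or at runtime overrides. The name `rbd_cache_disabled` is hypothetical.

```python
import configparser


def rbd_cache_disabled(conf_text: str) -> bool:
    """Return True if [client] sets the rbd cache option to a false value.

    Illustrative helper only; it does not replicate Ceph's own config parser.
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("client"):
        return False
    for key, value in parser.items("client"):
        # Accept "rbd cache", "rbd_cache" and "rbd-cache" as the same option.
        if key.replace("_", " ").replace("-", " ").split() == ["rbd", "cache"]:
            # Assumed spellings of a false boolean.
            return value.strip().lower() in ("false", "0", "no", "off")
    return False


if __name__ == "__main__":
    sample = "[client]\nrbd cache = false\n"
    print(rbd_cache_disabled(sample))
```

To check a real file, pass it the contents of /etc/ceph/ceph.conf read as a string.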
On 25/10/14 02:19, Mark Nelson wrote:
> FWIW we are seeing this at Redhat/Inktank with recent fio from master
> and ceph giant branch as well.
>
> Mark
>
> On 10/24/2014 01:17 AM, Mark Kirkwood wrote:
>> On 24/10/14 18:35, Jens Axboe wrote:
>>> CC'ing relevant parties, leaving email intact.
>>>
>>
>> Note that the 'Killed' is there because I killed the run - it hangs and
>> appears to be uninterruptible. I missed that when pasting, sorry!
>>
>>>> $ fio read-test.fio # attached
>>>> rbd_thread: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=rbd,
>>>> iodepth=32
>>>> fio-2.1.13-88-gb2ee7
>>>> Starting 1 process
>>>> rbd engine: RBD version: 0.1.8
>>>> Killed1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
>>>> 1158050441d:06h:59m:33s]
>>
>
* Re: fio rbd hang for block sizes > 1M
2014-10-24 22:30 ` fio rbd hang for block sizes > 1M Mark Kirkwood
@ 2014-10-24 22:38 ` Mark Nelson
0 siblings, 0 replies; 52+ messages in thread
From: Mark Nelson @ 2014-10-24 22:38 UTC (permalink / raw)
To: Mark Kirkwood, Jens Axboe, fio
Cc: d.gollub@telekom.de >> Daniel Gollub, xan.peng
Yeah, we reverted the commit that we think was causing it earlier today.
Should be able to confirm things are working again in the next hour or
two.
Mark
On 10/24/2014 05:30 PM, Mark Kirkwood wrote:
> It looks like it is an rbd cache issue:
>
> http://tracker.ceph.com/issues/9854
>
> If I disable the rbd cache:
>
> $ tail /etc/ceph/ceph.conf
> ...
> [client]
> rbd cache = false
>
> then the 2-4M reads work fine (no invalid reads in valgrind either).
>
> Regards
>
> Mark
>
> On 25/10/14 02:19, Mark Nelson wrote:
>> FWIW we are seeing this at Redhat/Inktank with recent fio from master
>> and ceph giant branch as well.
>>
>> Mark
>>
>> On 10/24/2014 01:17 AM, Mark Kirkwood wrote:
>>> On 24/10/14 18:35, Jens Axboe wrote:
>>>> CC'ing relevant parties, leaving email intact.
>>>>
>>>
>>> Note that the 'Killed' is because I killed the run - it hangs and
>>> appears to be uninterruptible. I missed that when pasting, sorry!
>>>
>>>>> $ fio read-test.fio # attached
>>>>> rbd_thread: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=rbd,
>>>>> iodepth=32
>>>>> fio-2.1.13-88-gb2ee7
>>>>> Starting 1 process
>>>>> rbd engine: RBD version: 0.1.8
>>>>> Killed1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
>>>>> 1158050441d:06h:59m:33s]
>>>
>>
>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd hang for block sizes > 1M
2014-10-24 14:09 ` Mark Nelson
2014-10-24 14:30 ` Jens Axboe
@ 2014-10-24 22:45 ` Mark Kirkwood
2014-10-25 0:12 ` Mark Nelson
1 sibling, 1 reply; 52+ messages in thread
From: Mark Kirkwood @ 2014-10-24 22:45 UTC (permalink / raw)
To: Mark Nelson, Jens Axboe, fio
Cc: d.gollub@telekom.de >> Daniel Gollub, xan.peng,
ceph-devel@vger.kernel.org
Interestingly, I first encountered this on (what I think is) 0.86
release (0.86-1precise). I wonder if you had a bigger rbd cache on the
release cluster you tested?
As mentioned in the same named thread on -users, disabling the rbd cache
stops the hang.
Regards
Mark
On 25/10/14 03:09, Mark Nelson wrote:
> More info:
>
> I went back and tested fio versions back to 2.1.10 and still encountered
> the issue. I then went back and tested the v0.86 release versus giant
> and was able to get through a 4MB read test without error. I suspect
> this is not an fio problem. I'll try to narrow down the commit after
> 0.86 that is causing this.
>
> Mark
>
> On 10/24/2014 08:19 AM, Mark Nelson wrote:
>> FWIW we are seeing this at Redhat/Inktank with recent fio from master
>> and ceph giant branch as well.
>>
>> Mark
>>
>> On 10/24/2014 01:17 AM, Mark Kirkwood wrote:
>>> On 24/10/14 18:35, Jens Axboe wrote:
>>>> CC'ing relevant parties, leaving email intact.
>>>>
>>>
>>> Note that the 'Killed' is because I killed the run - it hangs and
>>> appears to be uninterruptible. I missed that when pasting, sorry!
>>>
>>>>> $ fio read-test.fio # attached
>>>>> rbd_thread: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=rbd,
>>>>> iodepth=32
>>>>> fio-2.1.13-88-gb2ee7
>>>>> Starting 1 process
>>>>> rbd engine: RBD version: 0.1.8
>>>>> Killed1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
>>>>> 1158050441d:06h:59m:33s]
>>>
>>
>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd hang for block sizes > 1M
2014-10-24 22:45 ` Mark Kirkwood
@ 2014-10-25 0:12 ` Mark Nelson
2014-10-25 0:37 ` Mark Kirkwood
0 siblings, 1 reply; 52+ messages in thread
From: Mark Nelson @ 2014-10-25 0:12 UTC (permalink / raw)
To: Mark Kirkwood, Mark Nelson, Jens Axboe, fio
Cc: d.gollub@telekom.de >> Daniel Gollub, xan.peng,
ceph-devel@vger.kernel.org
Hi Mark,
Try the latest giant branch. I believe we've fixed this with 7272bb8.
My test cluster is passing read tests now.
Mark
On 10/24/2014 05:45 PM, Mark Kirkwood wrote:
> Interestingly, I first encountered this on (what I think is) 0.86
> release (0.86-1precise). I wonder if you had a bigger rbd cache on the
> release cluster you tested?
>
> As mentioned in the same named thread on -users, disabling the rbd cache
> stops the hang.
>
> Regards
>
> Mark
>
> On 25/10/14 03:09, Mark Nelson wrote:
>> More info:
>>
>> I went back and tested fio versions back to 2.1.10 and still encountered
>> the issue. I then went back and tested the v0.86 release versus giant
>> and was able to get through a 4MB read test without error. I suspect
>> this is not an fio problem. I'll try to narrow down the commit after
>> 0.86 that is causing this.
>>
>> Mark
>>
>> On 10/24/2014 08:19 AM, Mark Nelson wrote:
>>> FWIW we are seeing this at Redhat/Inktank with recent fio from master
>>> and ceph giant branch as well.
>>>
>>> Mark
>>>
>>> On 10/24/2014 01:17 AM, Mark Kirkwood wrote:
>>>> On 24/10/14 18:35, Jens Axboe wrote:
>>>>> CC'ing relevant parties, leaving email intact.
>>>>>
>>>>
>>>> Note that the 'Killed' is because I killed the run - it hangs and
>>>> appears to be uninterruptible. I missed that when pasting, sorry!
>>>>
>>>>>> $ fio read-test.fio # attached
>>>>>> rbd_thread: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=rbd,
>>>>>> iodepth=32
>>>>>> fio-2.1.13-88-gb2ee7
>>>>>> Starting 1 process
>>>>>> rbd engine: RBD version: 0.1.8
>>>>>> Killed1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
>>>>>> 1158050441d:06h:59m:33s]
>>>>
>>>
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd hang for block sizes > 1M
2014-10-25 0:12 ` Mark Nelson
@ 2014-10-25 0:37 ` Mark Kirkwood
2014-10-25 2:35 ` Mark Kirkwood
0 siblings, 1 reply; 52+ messages in thread
From: Mark Kirkwood @ 2014-10-25 0:37 UTC (permalink / raw)
To: Mark Nelson, Mark Nelson, Jens Axboe, fio
Cc: d.gollub@telekom.de >> Daniel Gollub, xan.peng,
ceph-devel@vger.kernel.org
Righty, building now.
On 25/10/14 13:12, Mark Nelson wrote:
> Hi Mark,
>
> Try the latest giant branch. I believe we've fixed this with 7272bb8.
> My test cluster is passing read tests now.
>
> Mark
>
> On 10/24/2014 05:45 PM, Mark Kirkwood wrote:
>> Interestingly, I first encountered this on (what I think is) 0.86
>> release (0.86-1precise). I wonder if you had a bigger rbd cache on the
>> release cluster you tested?
>>
>> As mentioned in the same named thread on -users, disabling the rbd cache
>> stops the hang.
>>
>> Regards
>>
>> Mark
>>
>> On 25/10/14 03:09, Mark Nelson wrote:
>>> More info:
>>>
>>> I went back and tested fio versions back to 2.1.10 and still encountered
>>> the issue. I then went back and tested the v0.86 release versus giant
>>> and was able to get through a 4MB read test without error. I suspect
>>> this is not an fio problem. I'll try to narrow down the commit after
>>> 0.86 that is causing this.
>>>
>>> Mark
>>>
>>> On 10/24/2014 08:19 AM, Mark Nelson wrote:
>>>> FWIW we are seeing this at Redhat/Inktank with recent fio from master
>>>> and ceph giant branch as well.
>>>>
>>>> Mark
>>>>
>>>> On 10/24/2014 01:17 AM, Mark Kirkwood wrote:
>>>>> On 24/10/14 18:35, Jens Axboe wrote:
>>>>>> CC'ing relevant parties, leaving email intact.
>>>>>>
>>>>>
>>>>> Note that the 'Killed' is because I killed the run - it hangs and
>>>>> appears to be uninterruptible. I missed that when pasting, sorry!
>>>>>
>>>>>>> $ fio read-test.fio # attached
>>>>>>> rbd_thread: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=rbd,
>>>>>>> iodepth=32
>>>>>>> fio-2.1.13-88-gb2ee7
>>>>>>> Starting 1 process
>>>>>>> rbd engine: RBD version: 0.1.8
>>>>>>> Killed1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
>>>>>>> 1158050441d:06h:59m:33s]
>>>>>
>>>>
>>>
>>
>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd hang for block sizes > 1M
2014-10-25 0:37 ` Mark Kirkwood
@ 2014-10-25 2:35 ` Mark Kirkwood
2014-10-25 3:47 ` Jens Axboe
0 siblings, 1 reply; 52+ messages in thread
From: Mark Kirkwood @ 2014-10-25 2:35 UTC (permalink / raw)
To: Mark Nelson, Mark Nelson, Jens Axboe, fio
Cc: d.gollub@telekom.de >> Daniel Gollub, xan.peng,
ceph-devel@vger.kernel.org
Patched the client machine *only* - re-running fio from there works fine
with the default cache settings (i.e. no [client] section at all):
$ fio read-test.fio
rbd_thread: (g=0): rw=read, bs=4M-4M/4M-4M/4M-4M, ioengine=rbd, iodepth=32
fio-2.1.13-88-gb2ee7
Starting 1 process
rbd engine: RBD version: 0.1.8
Jobs: 1 (f=1): [R(1)] [75.0% done] [1165MB/0KB/0KB /s] [291/0/0 iops] [eta 00m:0
Jobs: 1 (f=1): [R(1)] [83.3% done] [447.4MB/0KB/0KB /s] [111/0/0 iops] [eta 00m:
Jobs: 1 (f=1): [R(1)] [100.0% done] [268.0MB/0KB/0KB /s] [67/0/0 iops] [eta 00m:
Jobs: 1 (f=1): [R(1)] [100.0% done] [336.1MB/0KB/0KB /s] [84/0/0 iops] [eta 00m:00s]
rbd_thread: (groupid=0, jobs=1): err= 0: pid=5980: Sat Oct 25 15:32:16 2014
read : io=4096.0MB, bw=623410KB/s, iops=152, runt= 6728msec
slat (usec): min=7, max=230691, avg=5664.46, stdev=14434.46
clat (msec): min=11, max=1589, avg=193.03, stdev=246.84
lat (msec): min=13, max=1606, avg=198.70, stdev=248.62
clat percentiles (msec):
| 1.00th=[ 17], 5.00th=[ 30], 10.00th=[ 43], 20.00th=[ 60],
| 30.00th=[ 78], 40.00th=[ 93], 50.00th=[ 109], 60.00th=[ 124],
| 70.00th=[ 147], 80.00th=[ 210], 90.00th=[ 498], 95.00th=[ 758],
| 99.00th=[ 1237], 99.50th=[ 1467], 99.90th=[ 1565], 99.95th=[ 1598],
| 99.99th=[ 1598]
bw (KB /s): min=178086, max=1193644, per=100.00%, avg=637349.58,
stdev=397329.85
lat (msec) : 20=2.15%, 50=12.11%, 100=30.08%, 250=38.09%, 500=7.62%
lat (msec) : 750=4.79%, 1000=2.64%, 2000=2.54%
cpu : usr=1.69%, sys=0.28%, ctx=6234, majf=0, minf=78
IO depths : 1=0.1%, 2=0.2%, 4=0.4%, 8=1.7%, 16=58.6%, 32=39.1%,
>=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=94.3%, 8=5.0%, 16=0.4%, 32=0.3%, 64=0.0%,
>=64=0.0%
issued : total=r=1024/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: io=4096.0MB, aggrb=623410KB/s, minb=623410KB/s,
maxb=623410KB/s, mint=6728msec, maxt=6728msec
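For reference, the summary figures in the run above are internally consistent: bw and iops follow from io, the issued request count, and runt. A minimal C sketch of that arithmetic, assuming bandwidth is rounded to the nearest KB/s and iops is truncated (both assumptions are chosen to match the printed output, not taken from fio's source; the function names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Bandwidth in KB/s from total I/O (MB) and runtime (ms).
 * Rounding to nearest is an assumption made to match the output. */
uint64_t bw_kb_per_sec(uint64_t io_mb, uint64_t runt_msec)
{
    assert(runt_msec > 0);
    uint64_t io_kb = io_mb * 1024;
    return (io_kb * 1000 + runt_msec / 2) / runt_msec;
}

/* Completed I/Os per second, truncated (also an assumption). */
uint64_t iops_from_runtime(uint64_t ios, uint64_t runt_msec)
{
    assert(runt_msec > 0);
    return ios * 1000 / runt_msec;
}
```

With io=4096MB and runt=6728msec this gives 623410 KB/s, and the 1024 issued reads give 152 iops, which matches the output above.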
On 25/10/14 13:37, Mark Kirkwood wrote:
> Righty, building now.
>
> On 25/10/14 13:12, Mark Nelson wrote:
>> Hi Mark,
>>
>> Try the latest giant branch. I believe we've fixed this with 7272bb8.
>> My test cluster is passing read tests now.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd hang for block sizes > 1M
2014-10-25 2:35 ` Mark Kirkwood
@ 2014-10-25 3:47 ` Jens Axboe
2014-10-25 4:50 ` fio rbd completions (Was: fio rbd hang for block sizes > 1M) Mark Kirkwood
0 siblings, 1 reply; 52+ messages in thread
From: Jens Axboe @ 2014-10-25 3:47 UTC (permalink / raw)
To: Mark Kirkwood, Mark Nelson, Mark Nelson, fio
Cc: d.gollub@telekom.de >> Daniel Gollub, xan.peng,
ceph-devel@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 2460 bytes --]
On 2014-10-24 20:35, Mark Kirkwood wrote:
> Patched the client machine *only* - re-running fio from there works fine
> with the default cache settings (i.e. no [client] section at all):
>
> $ fio read-test.fio
> rbd_thread: (g=0): rw=read, bs=4M-4M/4M-4M/4M-4M, ioengine=rbd, iodepth=32
> fio-2.1.13-88-gb2ee7
> Starting 1 process
> rbd engine: RBD version: 0.1.8
> Jobs: 1 (f=1): [R(1)] [75.0% done] [1165MB/0KB/0KB /s] [291/0/0 iops] [eta 00m:0
> Jobs: 1 (f=1): [R(1)] [83.3% done] [447.4MB/0KB/0KB /s] [111/0/0 iops] [eta 00m:
> Jobs: 1 (f=1): [R(1)] [100.0% done] [268.0MB/0KB/0KB /s] [67/0/0 iops] [eta 00m:
> Jobs: 1 (f=1): [R(1)] [100.0% done] [336.1MB/0KB/0KB /s] [84/0/0 iops] [eta 00m:00s]
> rbd_thread: (groupid=0, jobs=1): err= 0: pid=5980: Sat Oct 25 15:32:16 2014
> read : io=4096.0MB, bw=623410KB/s, iops=152, runt= 6728msec
> slat (usec): min=7, max=230691, avg=5664.46, stdev=14434.46
> clat (msec): min=11, max=1589, avg=193.03, stdev=246.84
> lat (msec): min=13, max=1606, avg=198.70, stdev=248.62
> clat percentiles (msec):
> | 1.00th=[ 17], 5.00th=[ 30], 10.00th=[ 43], 20.00th=[ 60],
> | 30.00th=[ 78], 40.00th=[ 93], 50.00th=[ 109], 60.00th=[ 124],
> | 70.00th=[ 147], 80.00th=[ 210], 90.00th=[ 498], 95.00th=[ 758],
> | 99.00th=[ 1237], 99.50th=[ 1467], 99.90th=[ 1565], 99.95th=[ 1598],
> | 99.99th=[ 1598]
> bw (KB /s): min=178086, max=1193644, per=100.00%, avg=637349.58,
> stdev=397329.85
> lat (msec) : 20=2.15%, 50=12.11%, 100=30.08%, 250=38.09%, 500=7.62%
> lat (msec) : 750=4.79%, 1000=2.64%, 2000=2.54%
> cpu : usr=1.69%, sys=0.28%, ctx=6234, majf=0, minf=78
> IO depths : 1=0.1%, 2=0.2%, 4=0.4%, 8=1.7%, 16=58.6%, 32=39.1%,
> >=64=0.0%
> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
> complete : 0=0.0%, 4=94.3%, 8=5.0%, 16=0.4%, 32=0.3%, 64=0.0%,
> >=64=0.0%
> issued : total=r=1024/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
> latency : target=0, window=0, percentile=100.00%, depth=32
>
> Run status group 0 (all jobs):
> READ: io=4096.0MB, aggrb=623410KB/s, minb=623410KB/s,
> maxb=623410KB/s, mint=6728msec, maxt=6728msec
Since you're running rbd tests... Mind giving this patch a go? I don't
have an easy way to test it myself. It has nothing to do with this
issue, it's just a potentially faster way to do the rbd completions.
--
Jens Axboe
[-- Attachment #2: rbd-complete-v2.patch --]
[-- Type: text/x-patch, Size: 4345 bytes --]
diff --git a/engines/rbd.c b/engines/rbd.c
index 6fe87b8d010c..6aa96a5ff550 100644
--- a/engines/rbd.c
+++ b/engines/rbd.c
@@ -11,6 +11,7 @@
struct fio_rbd_iou {
struct io_u *io_u;
+ rbd_completion_t completion;
int io_complete;
};
@@ -221,34 +222,66 @@ static struct io_u *fio_rbd_event(struct thread_data *td, int event)
return rbd_data->aio_events[event];
}
-static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
- unsigned int max, const struct timespec *t)
+static inline int fri_check_complete(struct rbd_data *rbd_data,
+ struct io_u *io_u,
+ unsigned int *events)
+{
+ struct fio_rbd_iou *fri = io_u->engine_data;
+
+ if (fri->io_complete) {
+ fri->io_complete = 0;
+ rbd_data->aio_events[*events] = io_u;
+ (*events)++;
+ return 1;
+ }
+
+ return 0;
+}
+
+static int rbd_iter_events(struct thread_data *td, unsigned int *events,
+ unsigned int min_evts, int wait)
{
struct rbd_data *rbd_data = td->io_ops->data;
- unsigned int events = 0;
+ unsigned int this_events = 0;
struct io_u *io_u;
int i;
- struct fio_rbd_iou *fov;
- do {
- io_u_qiter(&td->io_u_all, io_u, i) {
- if (!(io_u->flags & IO_U_F_FLIGHT))
- continue;
+ io_u_qiter(&td->io_u_all, io_u, i) {
+ if (!(io_u->flags & IO_U_F_FLIGHT))
+ continue;
- fov = (struct fio_rbd_iou *)io_u->engine_data;
+ if (fri_check_complete(rbd_data, io_u, events))
+ this_events++;
+ else if (wait) {
+ struct fio_rbd_iou *fri = io_u->engine_data;
- if (fov->io_complete) {
- fov->io_complete = 0;
- rbd_data->aio_events[events] = io_u;
- events++;
- }
+ rbd_aio_wait_for_complete(fri->completion);
+ if (fri_check_complete(rbd_data, io_u, events))
+ this_events++;
}
- if (events < min)
- usleep(100);
- else
+ if (*events >= min_evts)
+ break;
+ }
+
+ return this_events;
+}
+
+static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
+ unsigned int max, const struct timespec *t)
+{
+ unsigned int this_events, events = 0;
+ int wait = 0;
+
+ do {
+ this_events = rbd_iter_events(td, &events, min, wait);
+
+ if (events >= min)
break;
+ if (this_events)
+ continue;
+ wait = 1;
} while (1);
return events;
@@ -258,7 +291,7 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
{
int r = -1;
struct rbd_data *rbd_data = td->io_ops->data;
- rbd_completion_t comp;
+ struct fio_rbd_iou *fri = io_u->engine_data;
fio_ro_check(td, io_u);
@@ -266,7 +299,7 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
r = rbd_aio_create_completion(io_u,
(rbd_callback_t)
_fio_rbd_finish_write_aiocb,
- &comp);
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_WRITE failed.\n");
@@ -274,7 +307,8 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
}
r = rbd_aio_write(rbd_data->image, io_u->offset,
- io_u->xfer_buflen, io_u->xfer_buf, comp);
+ io_u->xfer_buflen, io_u->xfer_buf,
+ fri->completion);
if (r < 0) {
log_err("rbd_aio_write failed.\n");
goto failed;
@@ -284,7 +318,7 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
r = rbd_aio_create_completion(io_u,
(rbd_callback_t)
_fio_rbd_finish_read_aiocb,
- &comp);
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_READ failed.\n");
@@ -292,7 +326,8 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
}
r = rbd_aio_read(rbd_data->image, io_u->offset,
- io_u->xfer_buflen, io_u->xfer_buf, comp);
+ io_u->xfer_buflen, io_u->xfer_buf,
+ fri->completion);
if (r < 0) {
log_err("rbd_aio_read failed.\n");
@@ -303,14 +338,14 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
r = rbd_aio_create_completion(io_u,
(rbd_callback_t)
_fio_rbd_finish_sync_aiocb,
- &comp);
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_SYNC failed.\n");
goto failed;
}
- r = rbd_aio_flush(rbd_data->image, comp);
+ r = rbd_aio_flush(rbd_data->image, fri->completion);
if (r < 0) {
log_err("rbd_flush failed.\n");
goto failed;
^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-25 3:47 ` Jens Axboe
@ 2014-10-25 4:50 ` Mark Kirkwood
2014-10-25 19:20 ` Jens Axboe
0 siblings, 1 reply; 52+ messages in thread
From: Mark Kirkwood @ 2014-10-25 4:50 UTC (permalink / raw)
To: Jens Axboe, Mark Nelson, Mark Nelson, fio
Cc: d.gollub@telekom.de >> Daniel Gollub, xan.peng,
ceph-devel@vger.kernel.org
On 25/10/14 16:47, Jens Axboe wrote:
>
> Since you're running rbd tests... Mind giving this patch a go? I don't
> have an easy way to test it myself. It has nothing to do with this
> issue, it's just a potentially faster way to do the rbd completions.
>
Sure - but note I'm testing this on my i7 workstation (4x OSDs running
on 2x Crucial M550) so not exactly server grade :-)
With that in mind, I'm seeing slightly *slower* performance with the
patch applied, e.g. for 128k blocks - 2 runs, 1 uncached and the next
cached.
Unpatched:
$ fio read-test.fio
rbd_thread: (g=0): rw=read, bs=128K-128K/128K-128K/128K-128K,
ioengine=rbd, iodepth=32
fio-2.1.13-88-gb2ee7
Starting 1 process
rbd engine: RBD version: 0.1.8
Jobs: 1 (f=1): [R(1)] [100.0% done] [588.5MB/0KB/0KB /s] [4707/0/0 iops]
[eta 00m:00s]
rbd_thread: (groupid=0, jobs=1): err= 0: pid=4305: Sat Oct 25 17:39:32 2014
read : io=4096.0MB, bw=596205KB/s, iops=4657, runt= 7035msec
slat (usec): min=2, max=2967, avg=36.67, stdev=58.70
clat (usec): min=1, max=28305, avg=6812.05, stdev=3062.44
lat (usec): min=24, max=28330, avg=6848.72, stdev=3061.25
clat percentiles (usec):
| 1.00th=[ 2008], 5.00th=[ 2544], 10.00th=[ 3024], 20.00th=[ 3952],
| 30.00th=[ 4832], 40.00th=[ 5664], 50.00th=[ 6560], 60.00th=[ 7456],
| 70.00th=[ 8384], 80.00th=[ 9280], 90.00th=[10816], 95.00th=[11968],
| 99.00th=[14912], 99.50th=[16512], 99.90th=[24192], 99.95th=[26496],
| 99.99th=[28032]
bw (KB /s): min=568064, max=620288, per=100.00%, avg=596434.86,
stdev=18741.30
lat (usec) : 2=0.01%, 50=0.01%, 750=0.01%, 1000=0.01%
lat (msec) : 2=0.94%, 4=19.48%, 10=65.18%, 20=14.16%, 50=0.22%
cpu : usr=12.84%, sys=1.96%, ctx=52370, majf=0, minf=78
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=25.6%, 32=74.3%,
>=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=99.6%, 8=0.4%, 16=0.1%, 32=0.1%, 64=0.0%,
>=64=0.0%
issued : total=r=32768/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
$ fio read-test.fio
rbd_thread: (g=0): rw=read, bs=128K-128K/128K-128K/128K-128K,
ioengine=rbd, iodepth=32
fio-2.1.13-88-gb2ee7
Starting 1 process
rbd engine: RBD version: 0.1.8
Jobs: 1 (f=1): [R(1)] [100.0% done] [843.8MB/0KB/0KB /s] [6750/0/0 iops]
[eta 00m:00s]
rbd_thread: (groupid=0, jobs=1): err= 0: pid=4393: Sat Oct 25 17:39:50 2014
read : io=4096.0MB, bw=847163KB/s, iops=6618, runt= 4951msec
slat (usec): min=2, max=3996, avg=46.39, stdev=106.38
clat (usec): min=1, max=19652, avg=4699.45, stdev=2251.49
lat (usec): min=14, max=19726, avg=4745.83, stdev=2244.04
clat percentiles (usec):
| 1.00th=[ 916], 5.00th=[ 1400], 10.00th=[ 1864], 20.00th=[ 2704],
| 30.00th=[ 3408], 40.00th=[ 3984], 50.00th=[ 4512], 60.00th=[ 5088],
| 70.00th=[ 5664], 80.00th=[ 6432], 90.00th=[ 7584], 95.00th=[ 8640],
| 99.00th=[11328], 99.50th=[11968], 99.90th=[14016], 99.95th=[14784],
| 99.99th=[16320]
bw (KB /s): min=823808, max=885760, per=100.00%, avg=847975.33,
stdev=24137.14
lat (usec) : 2=0.01%, 20=0.01%, 50=0.01%, 100=0.01%, 500=0.03%
lat (usec) : 750=0.32%, 1000=1.15%
lat (msec) : 2=10.05%, 4=28.67%, 10=57.42%, 20=2.34%
cpu : usr=15.31%, sys=3.15%, ctx=48359, majf=0, minf=82
IO depths : 1=0.1%, 2=0.1%, 4=0.5%, 8=2.3%, 16=43.4%, 32=53.7%,
>=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=98.3%, 8=1.0%, 16=0.4%, 32=0.3%, 64=0.0%,
>=64=0.0%
issued : total=r=32768/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
Patched:
$ fio read-test.fio
rbd_thread: (g=0): rw=read, bs=128K-128K/128K-128K/128K-128K,
ioengine=rbd, iodepth=32
fio-2.1.13-88-gb2ee7
Starting 1 process
rbd engine: RBD version: 0.1.8
Jobs: 1 (f=1): [R(1)] [100.0% done] [424.9MB/0KB/0KB /s] [3399/0/0 iops]
[eta 00m:00s]
rbd_thread: (groupid=0, jobs=1): err= 0: pid=4528: Sat Oct 25 17:40:31 2014
read : io=4096.0MB, bw=429744KB/s, iops=3357, runt= 9760msec
slat (usec): min=2, max=1450, avg=24.89, stdev=28.80
clat (usec): min=0, max=29343, avg=9504.27, stdev=3355.50
lat (usec): min=14, max=29352, avg=9529.17, stdev=3351.45
clat percentiles (usec):
| 1.00th=[ 852], 5.00th=[ 2960], 10.00th=[ 4512], 20.00th=[ 6688],
| 30.00th=[ 8512], 40.00th=[ 9408], 50.00th=[10304], 60.00th=[10944],
| 70.00th=[11456], 80.00th=[11968], 90.00th=[12480], 95.00th=[13632],
| 99.00th=[18048], 99.50th=[19072], 99.90th=[21376], 99.95th=[21888],
| 99.99th=[22400]
bw (KB /s): min=400606, max=463141, per=100.00%, avg=429940.42,
stdev=19324.84
lat (usec) : 2=0.07%, 500=0.01%, 750=0.56%, 1000=0.78%
lat (msec) : 2=1.70%, 4=5.10%, 10=38.37%, 20=53.20%, 50=0.21%
cpu : usr=6.36%, sys=0.79%, ctx=18607, majf=0, minf=81
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=99.9%,
>=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%,
>=64=0.0%
issued : total=r=32768/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
$ fio read-test.fio
rbd_thread: (g=0): rw=read, bs=128K-128K/128K-128K/128K-128K,
ioengine=rbd, iodepth=32
fio-2.1.13-88-gb2ee7
Starting 1 process
rbd engine: RBD version: 0.1.8
Jobs: 1 (f=0): [R(1)] [100.0% done] [711.9MB/0KB/0KB /s] [5695/0/0 iops]
[eta 00m:00s]
rbd_thread: (groupid=0, jobs=1): err= 0: pid=4594: Sat Oct 25 17:40:43 2014
read : io=4096.0MB, bw=719311KB/s, iops=5619, runt= 5831msec
slat (usec): min=2, max=3965, avg=32.65, stdev=86.47
clat (usec): min=0, max=16050, avg=5658.86, stdev=2230.99
lat (usec): min=17, max=16074, avg=5691.51, stdev=2222.24
clat percentiles (usec):
| 1.00th=[ 796], 5.00th=[ 1880], 10.00th=[ 2864], 20.00th=[ 3888],
| 30.00th=[ 4576], 40.00th=[ 5088], 50.00th=[ 5536], 60.00th=[ 6112],
| 70.00th=[ 6624], 80.00th=[ 7328], 90.00th=[ 8384], 95.00th=[ 9408],
| 99.00th=[11968], 99.50th=[12864], 99.90th=[15552], 99.95th=[15552],
| 99.99th=[15680]
bw (KB /s): min=631040, max=795904, per=100.00%, avg=719788.73,
stdev=49266.37
lat (usec) : 2=0.03%, 250=0.01%, 500=0.08%, 750=0.69%, 1000=0.99%
lat (msec) : 2=3.76%, 4=15.47%, 10=75.63%, 20=3.35%
cpu : usr=11.17%, sys=1.22%, ctx=22614, majf=0, minf=83
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=99.9%,
>=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%,
>=64=0.0%
issued : total=r=32768/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
I'll try it out next week on our real cluster (3x hosts, 24x OSDs on
spinners + SSD journals); Mark Nelson will probably beat me to it, mind you!
Cheers
Mark
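To put a number on the difference between the runs above, here is a small C sketch that computes the relative change in bandwidth. The helper name is hypothetical; the inputs are the bw figures printed by fio in this message.

```c
#include <assert.h>

/* Percentage change from a baseline bandwidth to a new bandwidth.
 * A negative result means the new figure is slower. */
double pct_change(double baseline_kb_s, double new_kb_s)
{
    assert(baseline_kb_s > 0.0);
    return (new_kb_s - baseline_kb_s) / baseline_kb_s * 100.0;
}
```

Applied to the figures above, pct_change(596205, 429744) is roughly -28 for the uncached runs and pct_change(847163, 719311) is roughly -15 for the cached runs. These are single runs on a workstation, so the numbers are only indicative.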
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-25 4:50 ` fio rbd completions (Was: fio rbd hang for block sizes > 1M) Mark Kirkwood
@ 2014-10-25 19:20 ` Jens Axboe
2014-10-25 22:25 ` Mark Kirkwood
0 siblings, 1 reply; 52+ messages in thread
From: Jens Axboe @ 2014-10-25 19:20 UTC (permalink / raw)
To: Mark Kirkwood, Mark Nelson, Mark Nelson, fio
Cc: d.gollub@telekom.de >> Daniel Gollub, xan.peng,
ceph-devel@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 871 bytes --]
On 10/24/2014 10:50 PM, Mark Kirkwood wrote:
> On 25/10/14 16:47, Jens Axboe wrote:
>>
>> Since you're running rbd tests... Mind giving this patch a go? I don't
>> have an easy way to test it myself. It has nothing to do with this
>> issue, it's just a potentially faster way to do the rbd completions.
>>
>
> Sure - but note I'm testing this on my i7 workstation (4x osd's running
> on 2x Crucial M550) so not exactly server grade :-)
>
> With that in mind, I'm seeing slightly *slower* performance with the
> patch applied: e.g: for 128k blocks - 2 runs, 1 uncached and the next
> cached.
Yeah, that doesn't look good. Mind trying this one out? I wonder if we
doubly wait on them - or perhaps rbd_aio_wait_for_complete() isn't
working correctly. If you try this one, we should know more...
Goal is, I want to get rid of that usleep() in getevents.
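The idea behind the patch can be sketched without librbd: make a non-blocking pass that reaps whatever has already completed, and only if that pass yields nothing, block on an outstanding request instead of sleeping for a fixed interval. The following is a toy model under stated assumptions, not the engine code: all toy_* names are hypothetical, the blocking wait is simulated by delivering the completion immediately, and the model clears its own in-flight flag where the real engine leaves that to fio's core.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for one request; not the real fio or librbd types. */
struct toy_req {
    int in_flight;  /* queued and not yet reaped */
    int complete;   /* set once the (simulated) completion has fired */
};

/* Hypothetical blocking wait. A real engine would block here until the
 * backend signals completion; this model just delivers it at once. */
void toy_wait_for_complete(struct toy_req *r)
{
    r->complete = 1;
}

/* One pass over all requests: reap completed ones and, if 'wait' is set,
 * block on a request that has not completed yet. */
unsigned int toy_iter_events(struct toy_req *reqs, size_t n,
                             unsigned int *events, unsigned int min_evts,
                             int wait)
{
    unsigned int this_events = 0;

    for (size_t i = 0; i < n; i++) {
        struct toy_req *r = &reqs[i];

        if (!r->in_flight)
            continue;
        if (!r->complete && wait)
            toy_wait_for_complete(r);
        if (r->complete) {
            r->complete = 0;
            r->in_flight = 0;  /* toy bookkeeping only */
            (*events)++;
            this_events++;
        }
        if (*events >= min_evts)
            break;
    }
    return this_events;
}

/* Reap at least min_evts completions; switch to waiting only after a
 * pass that found nothing. */
unsigned int toy_getevents(struct toy_req *reqs, size_t n,
                           unsigned int min_evts)
{
    unsigned int events = 0, in_flight = 0;
    int wait = 0;

    for (size_t i = 0; i < n; i++)
        in_flight += reqs[i].in_flight ? 1 : 0;
    assert(min_evts <= in_flight);  /* otherwise the loop cannot finish */

    for (;;) {
        unsigned int got = toy_iter_events(reqs, n, &events, min_evts, wait);

        if (events >= min_evts)
            break;
        if (got)
            continue;
        wait = 1;
    }
    return events;
}
```

The point of the two-pass structure is that the fast path never blocks, while the slow path sleeps exactly as long as the oldest unfinished request needs rather than in fixed 100 usec steps.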
--
Jens Axboe
[-- Attachment #2: rbd-comp-v3.patch --]
[-- Type: text/x-patch, Size: 5109 bytes --]
diff --git a/engines/rbd.c b/engines/rbd.c
index 6fe87b8d010c..2353b1f11caf 100644
--- a/engines/rbd.c
+++ b/engines/rbd.c
@@ -11,7 +11,9 @@
struct fio_rbd_iou {
struct io_u *io_u;
+ rbd_completion_t completion;
int io_complete;
+ int io_seen;
};
struct rbd_data {
@@ -221,34 +223,69 @@ static struct io_u *fio_rbd_event(struct thread_data *td, int event)
return rbd_data->aio_events[event];
}
-static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
- unsigned int max, const struct timespec *t)
+static inline int fri_check_complete(struct rbd_data *rbd_data,
+ struct io_u *io_u,
+ unsigned int *events)
+{
+ struct fio_rbd_iou *fri = io_u->engine_data;
+
+ if (fri->io_complete) {
+ fri->io_complete = 0;
+ fri->io_seen = 1;
+ rbd_data->aio_events[*events] = io_u;
+ (*events)++;
+ return 1;
+ }
+
+ return 0;
+}
+
+static int rbd_iter_events(struct thread_data *td, unsigned int *events,
+ unsigned int min_evts, int wait)
{
struct rbd_data *rbd_data = td->io_ops->data;
- unsigned int events = 0;
+ unsigned int this_events = 0;
struct io_u *io_u;
int i;
- struct fio_rbd_iou *fov;
- do {
- io_u_qiter(&td->io_u_all, io_u, i) {
- if (!(io_u->flags & IO_U_F_FLIGHT))
- continue;
+ io_u_qiter(&td->io_u_all, io_u, i) {
+ struct fio_rbd_iou *fri = io_u->engine_data;
- fov = (struct fio_rbd_iou *)io_u->engine_data;
+ if (!(io_u->flags & IO_U_F_FLIGHT))
+ continue;
+ if (fri->io_seen)
+ continue;
- if (fov->io_complete) {
- fov->io_complete = 0;
- rbd_data->aio_events[events] = io_u;
- events++;
- }
+ if (fri_check_complete(rbd_data, io_u, events))
+ this_events++;
+ else if (wait) {
+ rbd_aio_wait_for_complete(fri->completion);
+ if (fri_check_complete(rbd_data, io_u, events))
+ this_events++;
}
- if (events < min)
- usleep(100);
- else
+ if (*events >= min_evts)
+ break;
+ }
+
+ return this_events;
+}
+
+static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
+ unsigned int max, const struct timespec *t)
+{
+ unsigned int this_events, events = 0;
+ int wait = 0;
+
+ do {
+ this_events = rbd_iter_events(td, &events, min, wait);
+
+ if (events >= min)
break;
+ if (this_events)
+ continue;
+ wait = 1;
} while (1);
return events;
@@ -258,7 +295,7 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
{
int r = -1;
struct rbd_data *rbd_data = td->io_ops->data;
- rbd_completion_t comp;
+ struct fio_rbd_iou *fri = io_u->engine_data;
fio_ro_check(td, io_u);
@@ -266,7 +303,7 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
r = rbd_aio_create_completion(io_u,
(rbd_callback_t)
_fio_rbd_finish_write_aiocb,
- &comp);
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_WRITE failed.\n");
@@ -274,7 +311,8 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
}
r = rbd_aio_write(rbd_data->image, io_u->offset,
- io_u->xfer_buflen, io_u->xfer_buf, comp);
+ io_u->xfer_buflen, io_u->xfer_buf,
+ fri->completion);
if (r < 0) {
log_err("rbd_aio_write failed.\n");
goto failed;
@@ -284,7 +322,7 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
r = rbd_aio_create_completion(io_u,
(rbd_callback_t)
_fio_rbd_finish_read_aiocb,
- &comp);
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_READ failed.\n");
@@ -292,7 +330,8 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
}
r = rbd_aio_read(rbd_data->image, io_u->offset,
- io_u->xfer_buflen, io_u->xfer_buf, comp);
+ io_u->xfer_buflen, io_u->xfer_buf,
+ fri->completion);
if (r < 0) {
log_err("rbd_aio_read failed.\n");
@@ -303,14 +342,14 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
r = rbd_aio_create_completion(io_u,
(rbd_callback_t)
_fio_rbd_finish_sync_aiocb,
- &comp);
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_SYNC failed.\n");
goto failed;
}
- r = rbd_aio_flush(rbd_data->image, comp);
+ r = rbd_aio_flush(rbd_data->image, fri->completion);
if (r < 0) {
log_err("rbd_flush failed.\n");
goto failed;
@@ -439,22 +478,21 @@ static int fio_rbd_invalidate(struct thread_data *td, struct fio_file *f)
static void fio_rbd_io_u_free(struct thread_data *td, struct io_u *io_u)
{
- struct fio_rbd_iou *o = io_u->engine_data;
+ struct fio_rbd_iou *fri = io_u->engine_data;
- if (o) {
+ if (fri) {
io_u->engine_data = NULL;
- free(o);
+ free(fri);
}
}
static int fio_rbd_io_u_init(struct thread_data *td, struct io_u *io_u)
{
- struct fio_rbd_iou *o;
+ struct fio_rbd_iou *fri;
- o = malloc(sizeof(*o));
- o->io_complete = 0;
- o->io_u = io_u;
- io_u->engine_data = o;
+ fri = calloc(1, sizeof(*fri));
+ fri->io_u = io_u;
+ io_u->engine_data = fri;
return 0;
}
^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-25 19:20 ` Jens Axboe
@ 2014-10-25 22:25 ` Mark Kirkwood
2014-10-27 9:27 ` Ketor D
2014-10-27 14:19 ` Jens Axboe
0 siblings, 2 replies; 52+ messages in thread
From: Mark Kirkwood @ 2014-10-25 22:25 UTC (permalink / raw)
To: Jens Axboe, Mark Nelson, Mark Nelson, fio
Cc: xan.peng, ceph-devel@vger.kernel.org
On 26/10/14 08:20, Jens Axboe wrote:
> On 10/24/2014 10:50 PM, Mark Kirkwood wrote:
>> On 25/10/14 16:47, Jens Axboe wrote:
>>>
>>> Since you're running rbd tests... Mind giving this patch a go? I don't
>>> have an easy way to test it myself. It has nothing to do with this
>>> issue, it's just a potentially faster way to do the rbd completions.
>>>
>>
>> Sure - but note I'm testing this on my i7 workstation (4x osd's running
>> on 2x Crucial M550) so not exactly server grade :-)
>>
>> With that in mind, I'm seeing slightly *slower* performance with the
>> patch applied: e.g: for 128k blocks - 2 runs, 1 uncached and the next
>> cached.
>
> Yeah, that doesn't look good. Mind trying this one out? I wonder if we
> doubly wait on them - or perhaps rbd_aio_wait_for_complete() isn't
> working correctly. If you try this one, we should know more...
>
> Goal is, I want to get rid of that usleep() in getevents.
>
Testing with v3 patch applied hangs. I did wonder if we had somehow hit
a new variant of the cache issue - so reran with it disabled in
ceph.conf. Result is the same:
$ fio read-test.fio
rbd_thread: (g=0): rw=read, bs=128K-128K/128K-128K/128K-128K,
ioengine=rbd, iodepth=32
fio-2.1.13-88-gb2ee7
Starting 1 process
rbd engine: RBD version: 0.1.8
Jobs: 1 (f=1): [R(1)] [0.1% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
01h:25m:15s]
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-25 22:25 ` Mark Kirkwood
@ 2014-10-27 9:27 ` Ketor D
2014-10-27 10:25 ` Ketor D
2014-10-27 14:15 ` Jens Axboe
2014-10-27 14:19 ` Jens Axboe
1 sibling, 2 replies; 52+ messages in thread
From: Ketor D @ 2014-10-27 9:27 UTC (permalink / raw)
To: Mark Kirkwood
Cc: Jens Axboe, Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 1854 bytes --]
Hi, Jens:
I have tested your v2 and v3 patches.
The v2 patch gets SIGABRT and crashes. The v3 patch hangs.
Why not simply comment out the usleep()?
2014-10-26 6:25 GMT+08:00 Mark Kirkwood <mark.kirkwood@catalyst.net.nz>:
> On 26/10/14 08:20, Jens Axboe wrote:
>
>> On 10/24/2014 10:50 PM, Mark Kirkwood wrote:
>>
>>> On 25/10/14 16:47, Jens Axboe wrote:
>>>
>>>>
>>>> Since you're running rbd tests... Mind giving this patch a go? I don't
>>>> have an easy way to test it myself. It has nothing to do with this
>>>> issue, it's just a potentially faster way to do the rbd completions.
>>>>
>>>>
>>> Sure - but note I'm testing this on my i7 workstation (4x osd's running
>>> on 2x Crucial M550) so not exactly server grade :-)
>>>
>>> With that in mind, I'm seeing slightly *slower* performance with the
>>> patch applied: e.g: for 128k blocks - 2 runs, 1 uncached and the next
>>> cached.
>>>
>>
>> Yeah, that doesn't look good. Mind trying this one out? I wonder if we
>> doubly wait on them - or perhaps rbd_aio_wait_for_complete() isn't
>> working correctly. If you try this one, we should know more...
>>
>> Goal is, I want to get rid of that usleep() in getevents.
>>
>>
> Testing with v3 patch applied hangs. I did wonder if we had somehow hit a
> new variant of the cache issue - so reran with it disabled in ceph.conf.
> Result is the same:
>
> $ fio read-test.fio
> rbd_thread: (g=0): rw=read, bs=128K-128K/128K-128K/128K-128K,
> ioengine=rbd, iodepth=32
> fio-2.1.13-88-gb2ee7
> Starting 1 process
> rbd engine: RBD version: 0.1.8
> Jobs: 1 (f=1): [R(1)] [0.1% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
> 01h:25m:15s]
>
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
[-- Attachment #2: Type: text/html, Size: 2959 bytes --]
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 9:27 ` Ketor D
@ 2014-10-27 10:25 ` Ketor D
2014-10-27 14:19 ` Jens Axboe
2014-10-27 14:15 ` Jens Axboe
1 sibling, 1 reply; 52+ messages in thread
From: Ketor D @ 2014-10-27 10:25 UTC (permalink / raw)
To: Mark Kirkwood
Cc: Jens Axboe, Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 2684 bytes --]
Hi Jens:
After debugging the v3 patch, I found a bug in it.
On the first fio_rbd_getevents() loop, fri->io_seen is set to 1, and
this variable is never set back to 0. So the program gets stuck in an
endless loop in this code:
	do {
		this_events = rbd_iter_events(td, &events, min, wait);
		if (events >= min)
			break;
		if (this_events)
			continue;
		wait = 1;
	} while (1);
this_events and events are always 0 because fri->io_seen is always 1,
so no events can ever be reaped.
The bug fix is:
in the functions _fio_rbd_finish_read_aiocb,
_fio_rbd_finish_write_aiocb and _fio_rbd_finish_sync_aiocb, add
"fio_rbd_iou->io_seen = 0;" after "fio_rbd_iou->io_complete = 1;".
The attachment is the new patch.
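For readers following along, the hang can be reproduced in a small stand-alone model. This is only a sketch under stated assumptions: struct model_iou and the model_* functions are hypothetical stand-ins, not fio or librbd code, and no real I/O is issued. The model clears the flag at requeue time purely for simplicity; the point is only that a stale io_seen makes the reaping loop skip a unit that has in fact completed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct fio_rbd_iou; only the flags matter here. */
struct model_iou {
	int in_flight;   /* models IO_U_F_FLIGHT */
	int io_complete; /* set by the completion callback */
	int io_seen;     /* set once the event has been reaped */
};

/* Mirrors the skip logic of rbd_iter_events(): a unit whose io_seen
 * flag is set is never examined, even if it has completed again. */
static unsigned int model_reap(struct model_iou *u, size_t n)
{
	unsigned int events = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!u[i].in_flight || u[i].io_seen)
			continue;
		if (u[i].io_complete) {
			u[i].io_complete = 0;
			u[i].io_seen = 1;
			events++;
		}
	}
	return events;
}

/* Requeue a unit. reset_seen == 0 models the stale flag in v3,
 * reset_seen == 1 models clearing the flag before reuse. */
static void model_queue(struct model_iou *u, int reset_seen)
{
	u->in_flight = 1;
	u->io_complete = 0;
	if (reset_seen)
		u->io_seen = 0;
}

/* Models the completion callback firing. */
static void model_complete(struct model_iou *u)
{
	u->io_complete = 1;
}

/* Returns how many events are reaped the second time one unit is used. */
static unsigned int model_second_use(int reset_seen)
{
	struct model_iou u = { 0, 0, 0 };

	model_queue(&u, reset_seen);
	model_complete(&u);
	(void)model_reap(&u, 1); /* first use: reaped, io_seen becomes 1 */

	model_queue(&u, reset_seen);
	model_complete(&u);
	return model_reap(&u, 1);
}
```

In this model, model_second_use(0) reaps nothing on the second use, which is the condition under which the do/while loop above can never reach min events, while model_second_use(1) reaps the event as expected.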
2014-10-27 17:27 GMT+08:00 Ketor D <d.ketor@gmail.com>:
> Hi, Jens:
> I have test your v2 and v3 patch.
> The v2 patch get SIGABT and crash. The v3 patch hang.
>
> Why not simply comment usleep?
>
>
> 2014-10-26 6:25 GMT+08:00 Mark Kirkwood <mark.kirkwood@catalyst.net.nz>:
>>
>> On 26/10/14 08:20, Jens Axboe wrote:
>>>
>>> On 10/24/2014 10:50 PM, Mark Kirkwood wrote:
>>>>
>>>> On 25/10/14 16:47, Jens Axboe wrote:
>>>>>
>>>>>
>>>>> Since you're running rbd tests... Mind giving this patch a go? I don't
>>>>> have an easy way to test it myself. It has nothing to do with this
>>>>> issue, it's just a potentially faster way to do the rbd completions.
>>>>>
>>>>
>>>> Sure - but note I'm testing this on my i7 workstation (4x osd's running
>>>> on 2x Crucial M550) so not exactly server grade :-)
>>>>
>>>> With that in mind, I'm seeing slightly *slower* performance with the
>>>> patch applied: e.g: for 128k blocks - 2 runs, 1 uncached and the next
>>>> cached.
>>>
>>>
>>> Yeah, that doesn't look good. Mind trying this one out? I wonder if we
>>> doubly wait on them - or perhaps rbd_aio_wait_for_complete() isn't
>>> working correctly. If you try this one, we should know more...
>>>
>>> Goal is, I want to get rid of that usleep() in getevents.
>>>
>>
>> Testing with v3 patch applied hangs. I did wonder if we had somehow hit a
>> new variant of the cache issue - so reran with it disabled in ceph.conf.
>> Result is the same:
>>
>> $ fio read-test.fio
>> rbd_thread: (g=0): rw=read, bs=128K-128K/128K-128K/128K-128K,
>> ioengine=rbd, iodepth=32
>> fio-2.1.13-88-gb2ee7
>> Starting 1 process
>> rbd engine: RBD version: 0.1.8
>> Jobs: 1 (f=1): [R(1)] [0.1% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
>> 01h:25m:15s]
[-- Attachment #2: rbd-comp-v4.patch --]
[-- Type: application/octet-stream, Size: 6034 bytes --]
diff --git a/engines/rbd.c b/engines/rbd.c
index 6fe87b8..3fd815c 100644
--- a/engines/rbd.c
+++ b/engines/rbd.c
@@ -11,7 +11,9 @@
struct fio_rbd_iou {
struct io_u *io_u;
+ rbd_completion_t completion;
int io_complete;
+ int io_seen;
};
struct rbd_data {
@@ -170,6 +172,7 @@ static void _fio_rbd_finish_write_aiocb(rbd_completion_t comp, void *data)
(struct fio_rbd_iou *)io_u->engine_data;
fio_rbd_iou->io_complete = 1;
+ fio_rbd_iou->io_seen = 0;
/* if write needs to be verified - we should not release comp here
without fetching the result */
@@ -187,6 +190,7 @@ static void _fio_rbd_finish_read_aiocb(rbd_completion_t comp, void *data)
(struct fio_rbd_iou *)io_u->engine_data;
fio_rbd_iou->io_complete = 1;
+ fio_rbd_iou->io_seen = 0;
/* if read needs to be verified - we should not release comp here
without fetching the result */
@@ -204,6 +208,7 @@ static void _fio_rbd_finish_sync_aiocb(rbd_completion_t comp, void *data)
(struct fio_rbd_iou *)io_u->engine_data;
fio_rbd_iou->io_complete = 1;
+ fio_rbd_iou->io_seen = 0;
/* if sync needs to be verified - we should not release comp here
without fetching the result */
@@ -221,34 +226,70 @@ static struct io_u *fio_rbd_event(struct thread_data *td, int event)
return rbd_data->aio_events[event];
}
-static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
- unsigned int max, const struct timespec *t)
+static inline int fri_check_complete(struct rbd_data *rbd_data,
+ struct io_u *io_u,
+ unsigned int *events)
+{
+ struct fio_rbd_iou *fri = io_u->engine_data;
+
+ if (fri->io_complete) {
+ fri->io_complete = 0;
+ fri->io_seen = 1;
+ rbd_data->aio_events[*events] = io_u;
+ (*events)++;
+ return 1;
+ }
+
+ return 0;
+}
+
+static int rbd_iter_events(struct thread_data *td, unsigned int *events,
+ unsigned int min_evts, int wait)
{
struct rbd_data *rbd_data = td->io_ops->data;
- unsigned int events = 0;
+ unsigned int this_events = 0;
struct io_u *io_u;
int i;
- struct fio_rbd_iou *fov;
- do {
- io_u_qiter(&td->io_u_all, io_u, i) {
- if (!(io_u->flags & IO_U_F_FLIGHT))
- continue;
+ io_u_qiter(&td->io_u_all, io_u, i) {
+ struct fio_rbd_iou *fri = io_u->engine_data;
- fov = (struct fio_rbd_iou *)io_u->engine_data;
+ if (!(io_u->flags & IO_U_F_FLIGHT))
+ continue;
+ if (fri->io_seen)
+ continue;
- if (fov->io_complete) {
- fov->io_complete = 0;
- rbd_data->aio_events[events] = io_u;
- events++;
- }
+ if (fri_check_complete(rbd_data, io_u, events)){
+ this_events++;
+ }
+ else if (wait) {
+ rbd_aio_wait_for_complete(fri->completion);
+ if (fri_check_complete(rbd_data, io_u, events))
+ this_events++;
}
- if (events < min)
- usleep(100);
- else
+ if (*events >= min_evts)
+ break;
+ }
+
+ return this_events;
+}
+
+static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
+ unsigned int max, const struct timespec *t)
+{
+ unsigned int this_events, events = 0;
+ int wait = 0;
+
+ do {
+ this_events = rbd_iter_events(td, &events, min, wait);
+
+ if (events >= min)
break;
+ if (this_events)
+ continue;
+ wait = 1;
} while (1);
return events;
@@ -258,7 +299,7 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
{
int r = -1;
struct rbd_data *rbd_data = td->io_ops->data;
- rbd_completion_t comp;
+ struct fio_rbd_iou *fri = io_u->engine_data;
fio_ro_check(td, io_u);
@@ -266,7 +307,7 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
r = rbd_aio_create_completion(io_u,
(rbd_callback_t)
_fio_rbd_finish_write_aiocb,
- &comp);
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_WRITE failed.\n");
@@ -274,7 +315,8 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
}
r = rbd_aio_write(rbd_data->image, io_u->offset,
- io_u->xfer_buflen, io_u->xfer_buf, comp);
+ io_u->xfer_buflen, io_u->xfer_buf,
+ fri->completion);
if (r < 0) {
log_err("rbd_aio_write failed.\n");
goto failed;
@@ -284,7 +326,7 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
r = rbd_aio_create_completion(io_u,
(rbd_callback_t)
_fio_rbd_finish_read_aiocb,
- &comp);
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_READ failed.\n");
@@ -292,7 +334,8 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
}
r = rbd_aio_read(rbd_data->image, io_u->offset,
- io_u->xfer_buflen, io_u->xfer_buf, comp);
+ io_u->xfer_buflen, io_u->xfer_buf,
+ fri->completion);
if (r < 0) {
log_err("rbd_aio_read failed.\n");
@@ -303,14 +346,14 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
r = rbd_aio_create_completion(io_u,
(rbd_callback_t)
_fio_rbd_finish_sync_aiocb,
- &comp);
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_SYNC failed.\n");
goto failed;
}
- r = rbd_aio_flush(rbd_data->image, comp);
+ r = rbd_aio_flush(rbd_data->image, fri->completion);
if (r < 0) {
log_err("rbd_flush failed.\n");
goto failed;
@@ -439,22 +482,21 @@ static int fio_rbd_invalidate(struct thread_data *td, struct fio_file *f)
static void fio_rbd_io_u_free(struct thread_data *td, struct io_u *io_u)
{
- struct fio_rbd_iou *o = io_u->engine_data;
+ struct fio_rbd_iou *fri = io_u->engine_data;
- if (o) {
+ if (fri) {
io_u->engine_data = NULL;
- free(o);
+ free(fri);
}
}
static int fio_rbd_io_u_init(struct thread_data *td, struct io_u *io_u)
{
- struct fio_rbd_iou *o;
+ struct fio_rbd_iou *fri;
- o = malloc(sizeof(*o));
- o->io_complete = 0;
- o->io_u = io_u;
- io_u->engine_data = o;
+ fri = calloc(1, sizeof(*fri));
+ fri->io_u = io_u;
+ io_u->engine_data = fri;
return 0;
}
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 9:27 ` Ketor D
2014-10-27 10:25 ` Ketor D
@ 2014-10-27 14:15 ` Jens Axboe
1 sibling, 0 replies; 52+ messages in thread
From: Jens Axboe @ 2014-10-27 14:15 UTC (permalink / raw)
To: Ketor D, Mark Kirkwood
Cc: Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
On 10/27/2014 03:27 AM, Ketor D wrote:
> Hi, Jens:
> I have test your v2 and v3 patch.
> The v2 patch get SIGABT and crash. The v3 patch hang.
>
> Why not simply comment usleep?
Because that is very inefficient as well: fio would then basically be
busy looping while waiting for IO to finish.
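The trade-off between the two polling styles can be illustrated with a toy cost model. It is a sketch only: time is counted in whole microseconds, a spin without usleep() is assumed to cost one flag check per microsecond, and struct poll_cost and model_poll() are hypothetical names that exist nowhere in fio.

```c
#include <assert.h>

/* Hypothetical cost record for one wait on a single completion. */
struct poll_cost {
	unsigned long polls;      /* how many times the flag was checked */
	unsigned long latency_us; /* delay between completion and noticing it */
};

/* Simulate polling for an I/O that completes at done_at_us.
 * sleep_us == 0 models the loop with usleep() commented out
 * (assumed here to be one check per microsecond); a non-zero value
 * models sleeping that long between checks. */
static struct poll_cost model_poll(unsigned long done_at_us,
				   unsigned long sleep_us)
{
	struct poll_cost c = { 0, 0 };
	unsigned long now = 0;
	unsigned long step = sleep_us ? sleep_us : 1;

	for (;;) {
		c.polls++;
		if (now >= done_at_us) {
			c.latency_us = now - done_at_us;
			return c;
		}
		now += step;
	}
}
```

For an I/O that completes after 250 microseconds, model_poll(250, 100) checks the flag 4 times and notices the completion 50 microseconds late, whereas model_poll(250, 0) notices it immediately but checks 251 times. Neither is attractive, which is the motivation for blocking on the completion instead.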
--
Jens Axboe
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 10:25 ` Ketor D
@ 2014-10-27 14:19 ` Jens Axboe
0 siblings, 0 replies; 52+ messages in thread
From: Jens Axboe @ 2014-10-27 14:19 UTC (permalink / raw)
To: Ketor D, Mark Kirkwood
Cc: Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 1425 bytes --]
On 10/27/2014 04:25 AM, Ketor D wrote:
> Hi Jens:
> After debug the v3 patch, I found there is a bug in the patch.
> On the first fio_rbd_getevents loop, the fri->io_seen is set to
> 1, and this variable never set to 0 again. So the program get into
> endless loop in such code:
>
> do {
> this_events = rbd_iter_events(td, &events, min, wait);
>
> if (events >= min)
> break;
> if (this_events)
> continue;
>
> wait = 1;
> } while (1);
>
> this_events and events always be 0, because the fri->io_seen is always
> 1, so no events can be getted.
>
> The Bug fix is:
> in the function _fio_rbd_finish_read_aiocb,
> _fio_rbd_finish_write_aiocb and _fio_rbd_finish_sync_aiocb add
> "fio_rbd_iou->io_seen = 0;" after "fio_rbd_iou->io_complete = 1;".
So there are two issues. One is that ->io_seen should be reset in the
->queue() ops, before issuing the IO. The second is that the comp is
released in a racy way, so we can't use it in getevents() reliably.
The new patch moves the comp release to when we reap the event, and
cleans up the ->io_seen setting as well. As far as I can tell, this
should fix all cases.
Additionally, it now actually checks for IO errors and handles those
correctly. They were just ignored before. Gets rid of some useless
casting as well, and lots of duplicated IO comp functions.
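The error handling described above splits a completion's return value three ways: full transfer, short transfer, or failure. A minimal sketch of that split follows. struct model_io_u and model_io_finish() are hypothetical stand-ins rather than the fio types, the return value is passed in as a plain long instead of being fetched from librbd, and the sketch resets both fields on entry, which the attached patch does not do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the three io_u fields involved. */
struct model_io_u {
	size_t xfer_buflen; /* bytes requested */
	size_t resid;       /* bytes not transferred */
	int error;          /* non-zero on failure */
};

/* Classify a completion return value:
 *   ret < 0            -> failure, record the error code
 *   0 <= ret < buflen  -> short transfer, record the residual
 *   otherwise          -> full transfer, nothing to record */
static void model_io_finish(struct model_io_u *io_u, long ret)
{
	io_u->resid = 0;
	io_u->error = 0;

	if (ret < 0)
		io_u->error = (int)ret;
	else if ((size_t)ret < io_u->xfer_buflen)
		io_u->resid = io_u->xfer_buflen - (size_t)ret;
}
```

With a 4096-byte request, a return of 4096 leaves both fields at zero, a return of 1024 sets resid to 3072, and a return of -5 sets error to -5. Whether every operation type actually reports the transferred byte count on success is something to confirm against the librbd documentation before relying on the short-transfer branch.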
If everybody involved (Mark, you) could try this one out, then I'd
appreciate it.
--
Jens Axboe
[-- Attachment #2: rbd-comp-v5.patch --]
[-- Type: text/x-patch, Size: 7347 bytes --]
diff --git a/engines/rbd.c b/engines/rbd.c
index 6fe87b8d010c..89344033f894 100644
--- a/engines/rbd.c
+++ b/engines/rbd.c
@@ -11,7 +11,9 @@
struct fio_rbd_iou {
struct io_u *io_u;
+ rbd_completion_t completion;
int io_complete;
+ int io_seen;
};
struct rbd_data {
@@ -163,92 +165,102 @@ static void _fio_rbd_disconnect(struct rbd_data *rbd_data)
}
}
-static void _fio_rbd_finish_write_aiocb(rbd_completion_t comp, void *data)
+static void _fio_rbd_io_finish(struct io_u *io_u)
{
- struct io_u *io_u = (struct io_u *)data;
- struct fio_rbd_iou *fio_rbd_iou =
- (struct fio_rbd_iou *)io_u->engine_data;
+ struct fio_rbd_iou *fri = io_u->engine_data;
+ ssize_t ret;
+
+ fri->io_complete = 1;
+
+ ret = rbd_aio_get_return_value(&fri->completion);
+ if (ret != (int) io_u->xfer_buflen) {
+ if (ret >= 0) {
+ io_u->resid = io_u->xfer_buflen - ret;
+ io_u->error = 0;
+ } else
+ io_u->error = ret;
+ }
+}
- fio_rbd_iou->io_complete = 1;
+static void _fio_rbd_finish_aiocb(rbd_completion_t comp, void *data)
+{
+ struct io_u *io_u = data;
- /* if write needs to be verified - we should not release comp here
- without fetching the result */
+ _fio_rbd_io_finish(io_u);
+}
- rbd_aio_release(comp);
- /* TODO handle error */
+static struct io_u *fio_rbd_event(struct thread_data *td, int event)
+{
+ struct rbd_data *rbd_data = td->io_ops->data;
- return;
+ return rbd_data->aio_events[event];
}
-static void _fio_rbd_finish_read_aiocb(rbd_completion_t comp, void *data)
+static inline int fri_check_complete(struct rbd_data *rbd_data,
+ struct io_u *io_u,
+ unsigned int *events)
{
- struct io_u *io_u = (struct io_u *)data;
- struct fio_rbd_iou *fio_rbd_iou =
- (struct fio_rbd_iou *)io_u->engine_data;
+ struct fio_rbd_iou *fri = io_u->engine_data;
- fio_rbd_iou->io_complete = 1;
+ if (fri->io_complete) {
+ fri->io_complete = 0;
+ fri->io_seen = 1;
+ rbd_data->aio_events[*events] = io_u;
+ (*events)++;
- /* if read needs to be verified - we should not release comp here
- without fetching the result */
- rbd_aio_release(comp);
-
- /* TODO handle error */
+ rbd_aio_release(&fri->completion);
+ return 1;
+ }
- return;
+ return 0;
}
-static void _fio_rbd_finish_sync_aiocb(rbd_completion_t comp, void *data)
+static int rbd_iter_events(struct thread_data *td, unsigned int *events,
+ unsigned int min_evts, int wait)
{
- struct io_u *io_u = (struct io_u *)data;
- struct fio_rbd_iou *fio_rbd_iou =
- (struct fio_rbd_iou *)io_u->engine_data;
-
- fio_rbd_iou->io_complete = 1;
+ struct rbd_data *rbd_data = td->io_ops->data;
+ unsigned int this_events = 0;
+ struct io_u *io_u;
+ int i;
- /* if sync needs to be verified - we should not release comp here
- without fetching the result */
- rbd_aio_release(comp);
+ io_u_qiter(&td->io_u_all, io_u, i) {
+ struct fio_rbd_iou *fri = io_u->engine_data;
- /* TODO handle error */
+ if (!(io_u->flags & IO_U_F_FLIGHT))
+ continue;
+ if (fri->io_seen)
+ continue;
- return;
-}
+ if (fri_check_complete(rbd_data, io_u, events))
+ this_events++;
+ else if (wait) {
+ rbd_aio_wait_for_complete(fri->completion);
-static struct io_u *fio_rbd_event(struct thread_data *td, int event)
-{
- struct rbd_data *rbd_data = td->io_ops->data;
+ if (fri_check_complete(rbd_data, io_u, events))
+ this_events++;
+ }
+ if (*events >= min_evts)
+ break;
+ }
- return rbd_data->aio_events[event];
+ return this_events;
}
static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
unsigned int max, const struct timespec *t)
{
- struct rbd_data *rbd_data = td->io_ops->data;
- unsigned int events = 0;
- struct io_u *io_u;
- int i;
- struct fio_rbd_iou *fov;
+ unsigned int this_events, events = 0;
+ int wait = 0;
do {
- io_u_qiter(&td->io_u_all, io_u, i) {
- if (!(io_u->flags & IO_U_F_FLIGHT))
- continue;
-
- fov = (struct fio_rbd_iou *)io_u->engine_data;
-
- if (fov->io_complete) {
- fov->io_complete = 0;
- rbd_data->aio_events[events] = io_u;
- events++;
- }
+ this_events = rbd_iter_events(td, &events, min, wait);
- }
- if (events < min)
- usleep(100);
- else
+ if (events >= min)
break;
+ if (this_events)
+ continue;
+ wait = 1;
} while (1);
return events;
@@ -256,17 +268,18 @@ static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
{
- int r = -1;
struct rbd_data *rbd_data = td->io_ops->data;
- rbd_completion_t comp;
+ struct fio_rbd_iou *fri = io_u->engine_data;
+ int r = -1;
fio_ro_check(td, io_u);
+ fri->io_complete = 0;
+ fri->io_seen = 0;
+
if (io_u->ddir == DDIR_WRITE) {
- r = rbd_aio_create_completion(io_u,
- (rbd_callback_t)
- _fio_rbd_finish_write_aiocb,
- &comp);
+ r = rbd_aio_create_completion(io_u, _fio_rbd_finish_aiocb,
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_WRITE failed.\n");
@@ -274,17 +287,16 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
}
r = rbd_aio_write(rbd_data->image, io_u->offset,
- io_u->xfer_buflen, io_u->xfer_buf, comp);
+ io_u->xfer_buflen, io_u->xfer_buf,
+ fri->completion);
if (r < 0) {
log_err("rbd_aio_write failed.\n");
goto failed;
}
} else if (io_u->ddir == DDIR_READ) {
- r = rbd_aio_create_completion(io_u,
- (rbd_callback_t)
- _fio_rbd_finish_read_aiocb,
- &comp);
+ r = rbd_aio_create_completion(io_u, _fio_rbd_finish_aiocb,
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_READ failed.\n");
@@ -292,7 +304,8 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
}
r = rbd_aio_read(rbd_data->image, io_u->offset,
- io_u->xfer_buflen, io_u->xfer_buf, comp);
+ io_u->xfer_buflen, io_u->xfer_buf,
+ fri->completion);
if (r < 0) {
log_err("rbd_aio_read failed.\n");
@@ -300,17 +313,15 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
}
} else if (io_u->ddir == DDIR_SYNC) {
- r = rbd_aio_create_completion(io_u,
- (rbd_callback_t)
- _fio_rbd_finish_sync_aiocb,
- &comp);
+ r = rbd_aio_create_completion(io_u, _fio_rbd_finish_aiocb,
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_SYNC failed.\n");
goto failed;
}
- r = rbd_aio_flush(rbd_data->image, comp);
+ r = rbd_aio_flush(rbd_data->image, fri->completion);
if (r < 0) {
log_err("rbd_flush failed.\n");
goto failed;
@@ -439,22 +450,21 @@ static int fio_rbd_invalidate(struct thread_data *td, struct fio_file *f)
static void fio_rbd_io_u_free(struct thread_data *td, struct io_u *io_u)
{
- struct fio_rbd_iou *o = io_u->engine_data;
+ struct fio_rbd_iou *fri = io_u->engine_data;
- if (o) {
+ if (fri) {
io_u->engine_data = NULL;
- free(o);
+ free(fri);
}
}
static int fio_rbd_io_u_init(struct thread_data *td, struct io_u *io_u)
{
- struct fio_rbd_iou *o;
+ struct fio_rbd_iou *fri;
- o = malloc(sizeof(*o));
- o->io_complete = 0;
- o->io_u = io_u;
- io_u->engine_data = o;
+ fri = calloc(1, sizeof(*fri));
+ fri->io_u = io_u;
+ io_u->engine_data = fri;
return 0;
}
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-25 22:25 ` Mark Kirkwood
2014-10-27 9:27 ` Ketor D
@ 2014-10-27 14:19 ` Jens Axboe
2014-10-27 15:12 ` Ketor D
1 sibling, 1 reply; 52+ messages in thread
From: Jens Axboe @ 2014-10-27 14:19 UTC (permalink / raw)
To: Mark Kirkwood, Mark Nelson, Mark Nelson, fio
Cc: xan.peng, ceph-devel@vger.kernel.org
On 10/25/2014 04:25 PM, Mark Kirkwood wrote:
> On 26/10/14 08:20, Jens Axboe wrote:
>> On 10/24/2014 10:50 PM, Mark Kirkwood wrote:
>>> On 25/10/14 16:47, Jens Axboe wrote:
>>>>
>>>> Since you're running rbd tests... Mind giving this patch a go? I don't
>>>> have an easy way to test it myself. It has nothing to do with this
>>>> issue, it's just a potentially faster way to do the rbd completions.
>>>>
>>>
>>> Sure - but note I'm testing this on my i7 workstation (4x osd's running
>>> on 2x Crucial M550) so not exactly server grade :-)
>>>
>>> With that in mind, I'm seeing slightly *slower* performance with the
>>> patch applied: e.g: for 128k blocks - 2 runs, 1 uncached and the next
>>> cached.
>>
>> Yeah, that doesn't look good. Mind trying this one out? I wonder if we
>> doubly wait on them - or perhaps rbd_aio_wait_for_complete() isn't
>> working correctly. If you try this one, we should know more...
>>
>> Goal is, I want to get rid of that usleep() in getevents.
>>
>
> Testing with v3 patch applied hangs. I did wonder if we had somehow hit
> a new variant of the cache issue - so reran with it disabled in
> ceph.conf. Result is the same:
>
> $ fio read-test.fio
> rbd_thread: (g=0): rw=read, bs=128K-128K/128K-128K/128K-128K,
> ioengine=rbd, iodepth=32
> fio-2.1.13-88-gb2ee7
> Starting 1 process
> rbd engine: RBD version: 0.1.8
> Jobs: 1 (f=1): [R(1)] [0.1% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
> 01h:25m:15s]
There were, unfortunately, still two bugs left in -v3. I just posted an
updated one, please try that and see if it works for you.
--
Jens Axboe
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 14:19 ` Jens Axboe
@ 2014-10-27 15:12 ` Ketor D
2014-10-27 15:22 ` Jens Axboe
0 siblings, 1 reply; 52+ messages in thread
From: Ketor D @ 2014-10-27 15:12 UTC (permalink / raw)
To: Jens Axboe
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
The v5 patch does not work.
Run 5 times:
3 times SIGSEGV
2 times no IOPS, endless loop
2014-10-27 22:19 GMT+08:00 Jens Axboe <axboe@kernel.dk>:
> On 10/25/2014 04:25 PM, Mark Kirkwood wrote:
>> On 26/10/14 08:20, Jens Axboe wrote:
>>> On 10/24/2014 10:50 PM, Mark Kirkwood wrote:
>>>> On 25/10/14 16:47, Jens Axboe wrote:
>>>>>
>>>>> Since you're running rbd tests... Mind giving this patch a go? I don't
>>>>> have an easy way to test it myself. It has nothing to do with this
>>>>> issue, it's just a potentially faster way to do the rbd completions.
>>>>>
>>>>
>>>> Sure - but note I'm testing this on my i7 workstation (4x osd's running
>>>> on 2x Crucial M550) so not exactly server grade :-)
>>>>
>>>> With that in mind, I'm seeing slightly *slower* performance with the
>>>> patch applied: e.g: for 128k blocks - 2 runs, 1 uncached and the next
>>>> cached.
>>>
>>> Yeah, that doesn't look good. Mind trying this one out? I wonder if we
>>> doubly wait on them - or perhaps rbd_aio_wait_for_complete() isn't
>>> working correctly. If you try this one, we should know more...
>>>
>>> Goal is, I want to get rid of that usleep() in getevents.
>>>
>>
>> Testing with v3 patch applied hangs. I did wonder if we had somehow hit
>> a new variant of the cache issue - so reran with it disabled in
>> ceph.conf. Result is the same:
>>
>> $ fio read-test.fio
>> rbd_thread: (g=0): rw=read, bs=128K-128K/128K-128K/128K-128K,
>> ioengine=rbd, iodepth=32
>> fio-2.1.13-88-gb2ee7
>> Starting 1 process
>> rbd engine: RBD version: 0.1.8
>> Jobs: 1 (f=1): [R(1)] [0.1% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
>> 01h:25m:15s]
>
> There were, unfortunately, still two bugs left in -v3. I just posted an
> updated one, please try that and see if it works for you.
>
> --
> Jens Axboe
>
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 15:12 ` Ketor D
@ 2014-10-27 15:22 ` Jens Axboe
2014-10-27 15:25 ` Jens Axboe
0 siblings, 1 reply; 52+ messages in thread
From: Jens Axboe @ 2014-10-27 15:22 UTC (permalink / raw)
To: Ketor D
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 305 bytes --]
On 10/27/2014 09:12 AM, Ketor D wrote:
> The v5 patch does not work.
>
> Run 5 times:
> 3 times SEGSV
> 2 times NO IOPS, Endless loop
Try this one, perhaps it's the wrong type passed for release. Typedefs
for the win (or not).
This also fixes comp leaks, if the read/write/sync fails.
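The "wrong type" remark is easy to demonstrate in isolation. The sketch below assumes the completion handle is an opaque void * typedef, as the comment about typedefs suggests; model_completion_t and the functions around it are hypothetical and do not call librbd. Because any object pointer converts to void * implicitly in C, passing the address of the handle instead of the handle itself compiles without a diagnostic.

```c
#include <assert.h>

/* Hypothetical stand-in for an opaque handle declared as void *. */
typedef void *model_completion_t;

/* Returns 1 when the callee received the pointer to the real object. */
static int model_release(model_completion_t c, const void *expected)
{
	return c == expected;
}

/* Correct call: the handle is passed by value. */
static int release_by_value(void)
{
	int object;
	model_completion_t comp = &object;

	return model_release(comp, &object);
}

/* Buggy call: &comp has type void **, which converts to void *
 * silently, so the callee sees the address of the local variable
 * rather than the object the handle refers to. */
static int release_by_address(void)
{
	int object;
	model_completion_t comp = &object;

	return model_release(&comp, &object);
}
```

Here release_by_value() returns 1 and release_by_address() returns 0. In real code the callee would go on to treat that stray address as a completion object, which is consistent with the crashes reported against v5.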
--
Jens Axboe
[-- Attachment #2: rbd-comp-v6.patch --]
[-- Type: text/x-patch, Size: 9805 bytes --]
diff --git a/engines/rbd.c b/engines/rbd.c
index 6fe87b8d010c..a6e5dafb87fd 100644
--- a/engines/rbd.c
+++ b/engines/rbd.c
@@ -11,7 +11,9 @@
struct fio_rbd_iou {
struct io_u *io_u;
+ rbd_completion_t completion;
int io_complete;
+ int io_seen;
};
struct rbd_data {
@@ -30,35 +32,35 @@ struct rbd_options {
static struct fio_option options[] = {
{
- .name = "rbdname",
- .lname = "rbd engine rbdname",
- .type = FIO_OPT_STR_STORE,
- .help = "RBD name for RBD engine",
- .off1 = offsetof(struct rbd_options, rbd_name),
- .category = FIO_OPT_C_ENGINE,
- .group = FIO_OPT_G_RBD,
- },
+ .name = "rbdname",
+ .lname = "rbd engine rbdname",
+ .type = FIO_OPT_STR_STORE,
+ .help = "RBD name for RBD engine",
+ .off1 = offsetof(struct rbd_options, rbd_name),
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_RBD,
+ },
{
- .name = "pool",
- .lname = "rbd engine pool",
- .type = FIO_OPT_STR_STORE,
- .help = "Name of the pool hosting the RBD for the RBD engine",
- .off1 = offsetof(struct rbd_options, pool_name),
- .category = FIO_OPT_C_ENGINE,
- .group = FIO_OPT_G_RBD,
- },
+ .name = "pool",
+ .lname = "rbd engine pool",
+ .type = FIO_OPT_STR_STORE,
+ .help = "Name of the pool hosting the RBD for the RBD engine",
+ .off1 = offsetof(struct rbd_options, pool_name),
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_RBD,
+ },
{
- .name = "clientname",
- .lname = "rbd engine clientname",
- .type = FIO_OPT_STR_STORE,
- .help = "Name of the ceph client to access the RBD for the RBD engine",
- .off1 = offsetof(struct rbd_options, client_name),
- .category = FIO_OPT_C_ENGINE,
- .group = FIO_OPT_G_RBD,
- },
+ .name = "clientname",
+ .lname = "rbd engine clientname",
+ .type = FIO_OPT_STR_STORE,
+ .help = "Name of the ceph client to access the RBD for the RBD engine",
+ .off1 = offsetof(struct rbd_options, client_name),
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_RBD,
+ },
{
- .name = NULL,
- },
+ .name = NULL,
+ },
};
static int _fio_setup_rbd_data(struct thread_data *td,
@@ -163,92 +165,96 @@ static void _fio_rbd_disconnect(struct rbd_data *rbd_data)
}
}
-static void _fio_rbd_finish_write_aiocb(rbd_completion_t comp, void *data)
+static void _fio_rbd_finish_aiocb(rbd_completion_t comp, void *data)
{
- struct io_u *io_u = (struct io_u *)data;
- struct fio_rbd_iou *fio_rbd_iou =
- (struct fio_rbd_iou *)io_u->engine_data;
-
- fio_rbd_iou->io_complete = 1;
-
- /* if write needs to be verified - we should not release comp here
- without fetching the result */
+ struct io_u *io_u = data;
+ struct fio_rbd_iou *fri = io_u->engine_data;
+ ssize_t ret;
+
+ fri->io_complete = 1;
+
+ ret = rbd_aio_get_return_value(&fri->completion);
+ if (ret != (int) io_u->xfer_buflen) {
+ if (ret >= 0) {
+ io_u->resid = io_u->xfer_buflen - ret;
+ io_u->error = 0;
+ } else
+ io_u->error = ret;
+ }
+}
- rbd_aio_release(comp);
- /* TODO handle error */
+static struct io_u *fio_rbd_event(struct thread_data *td, int event)
+{
+ struct rbd_data *rbd_data = td->io_ops->data;
- return;
+ return rbd_data->aio_events[event];
}
-static void _fio_rbd_finish_read_aiocb(rbd_completion_t comp, void *data)
+static inline int fri_check_complete(struct rbd_data *rbd_data,
+ struct io_u *io_u,
+ unsigned int *events)
{
- struct io_u *io_u = (struct io_u *)data;
- struct fio_rbd_iou *fio_rbd_iou =
- (struct fio_rbd_iou *)io_u->engine_data;
-
- fio_rbd_iou->io_complete = 1;
+ struct fio_rbd_iou *fri = io_u->engine_data;
- /* if read needs to be verified - we should not release comp here
- without fetching the result */
- rbd_aio_release(comp);
+ if (fri->io_complete) {
+ fri->io_complete = 0;
+ fri->io_seen = 1;
+ rbd_data->aio_events[*events] = io_u;
+ (*events)++;
- /* TODO handle error */
+ rbd_aio_release(fri->completion);
+ return 1;
+ }
- return;
+ return 0;
}
-static void _fio_rbd_finish_sync_aiocb(rbd_completion_t comp, void *data)
+static int rbd_iter_events(struct thread_data *td, unsigned int *events,
+ unsigned int min_evts, int wait)
{
- struct io_u *io_u = (struct io_u *)data;
- struct fio_rbd_iou *fio_rbd_iou =
- (struct fio_rbd_iou *)io_u->engine_data;
-
- fio_rbd_iou->io_complete = 1;
+ struct rbd_data *rbd_data = td->io_ops->data;
+ unsigned int this_events = 0;
+ struct io_u *io_u;
+ int i;
- /* if sync needs to be verified - we should not release comp here
- without fetching the result */
- rbd_aio_release(comp);
+ io_u_qiter(&td->io_u_all, io_u, i) {
+ struct fio_rbd_iou *fri = io_u->engine_data;
- /* TODO handle error */
+ if (!(io_u->flags & IO_U_F_FLIGHT))
+ continue;
+ if (fri->io_seen)
+ continue;
- return;
-}
+ if (fri_check_complete(rbd_data, io_u, events))
+ this_events++;
+ else if (wait) {
+ rbd_aio_wait_for_complete(fri->completion);
-static struct io_u *fio_rbd_event(struct thread_data *td, int event)
-{
- struct rbd_data *rbd_data = td->io_ops->data;
+ if (fri_check_complete(rbd_data, io_u, events))
+ this_events++;
+ }
+ if (*events >= min_evts)
+ break;
+ }
- return rbd_data->aio_events[event];
+ return this_events;
}
static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
unsigned int max, const struct timespec *t)
{
- struct rbd_data *rbd_data = td->io_ops->data;
- unsigned int events = 0;
- struct io_u *io_u;
- int i;
- struct fio_rbd_iou *fov;
+ unsigned int this_events, events = 0;
+ int wait = 0;
do {
- io_u_qiter(&td->io_u_all, io_u, i) {
- if (!(io_u->flags & IO_U_F_FLIGHT))
- continue;
-
- fov = (struct fio_rbd_iou *)io_u->engine_data;
+ this_events = rbd_iter_events(td, &events, min, wait);
- if (fov->io_complete) {
- fov->io_complete = 0;
- rbd_data->aio_events[events] = io_u;
- events++;
- }
-
- }
- if (events < min)
- usleep(100);
- else
+ if (events >= min)
break;
+ if (this_events)
+ continue;
+ wait = 1;
} while (1);
return events;
@@ -256,17 +262,18 @@ static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
{
- int r = -1;
struct rbd_data *rbd_data = td->io_ops->data;
- rbd_completion_t comp;
+ struct fio_rbd_iou *fri = io_u->engine_data;
+ int r = -1;
fio_ro_check(td, io_u);
+ fri->io_complete = 0;
+ fri->io_seen = 0;
+
if (io_u->ddir == DDIR_WRITE) {
- r = rbd_aio_create_completion(io_u,
- (rbd_callback_t)
- _fio_rbd_finish_write_aiocb,
- &comp);
+ r = rbd_aio_create_completion(io_u, _fio_rbd_finish_aiocb,
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_WRITE failed.\n");
@@ -274,17 +281,17 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
}
r = rbd_aio_write(rbd_data->image, io_u->offset,
- io_u->xfer_buflen, io_u->xfer_buf, comp);
+ io_u->xfer_buflen, io_u->xfer_buf,
+ fri->completion);
if (r < 0) {
log_err("rbd_aio_write failed.\n");
+ rbd_aio_release(fri->completion);
goto failed;
}
} else if (io_u->ddir == DDIR_READ) {
- r = rbd_aio_create_completion(io_u,
- (rbd_callback_t)
- _fio_rbd_finish_read_aiocb,
- &comp);
+ r = rbd_aio_create_completion(io_u, _fio_rbd_finish_aiocb,
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_READ failed.\n");
@@ -292,27 +299,28 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
}
r = rbd_aio_read(rbd_data->image, io_u->offset,
- io_u->xfer_buflen, io_u->xfer_buf, comp);
+ io_u->xfer_buflen, io_u->xfer_buf,
+ fri->completion);
if (r < 0) {
log_err("rbd_aio_read failed.\n");
+ rbd_aio_release(fri->completion);
goto failed;
}
} else if (io_u->ddir == DDIR_SYNC) {
- r = rbd_aio_create_completion(io_u,
- (rbd_callback_t)
- _fio_rbd_finish_sync_aiocb,
- &comp);
+ r = rbd_aio_create_completion(io_u, _fio_rbd_finish_aiocb,
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_SYNC failed.\n");
goto failed;
}
- r = rbd_aio_flush(rbd_data->image, comp);
+ r = rbd_aio_flush(rbd_data->image, fri->completion);
if (r < 0) {
log_err("rbd_flush failed.\n");
+ rbd_aio_release(fri->completion);
goto failed;
}
@@ -344,7 +352,6 @@ static int fio_rbd_init(struct thread_data *td)
failed:
return 1;
-
}
static void fio_rbd_cleanup(struct thread_data *td)
@@ -379,8 +386,9 @@ static int fio_rbd_setup(struct thread_data *td)
}
td->io_ops->data = rbd_data;
- /* librbd does not allow us to run first in the main thread and later in a
- * fork child. It needs to be the same process context all the time.
+ /* librbd does not allow us to run first in the main thread and later
+ * in a fork child. It needs to be the same process context all the
+ * time.
*/
td->o.use_thread = 1;
@@ -439,22 +447,21 @@ static int fio_rbd_invalidate(struct thread_data *td, struct fio_file *f)
static void fio_rbd_io_u_free(struct thread_data *td, struct io_u *io_u)
{
- struct fio_rbd_iou *o = io_u->engine_data;
+ struct fio_rbd_iou *fri = io_u->engine_data;
- if (o) {
+ if (fri) {
io_u->engine_data = NULL;
- free(o);
+ free(fri);
}
}
static int fio_rbd_io_u_init(struct thread_data *td, struct io_u *io_u)
{
- struct fio_rbd_iou *o;
+ struct fio_rbd_iou *fri;
- o = malloc(sizeof(*o));
- o->io_complete = 0;
- o->io_u = io_u;
- io_u->engine_data = o;
+ fri = calloc(1, sizeof(*fri));
+ fri->io_u = io_u;
+ io_u->engine_data = fri;
return 0;
}
^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 15:22 ` Jens Axboe
@ 2014-10-27 15:25 ` Jens Axboe
2014-10-27 15:29 ` Ketor D
0 siblings, 1 reply; 52+ messages in thread
From: Jens Axboe @ 2014-10-27 15:25 UTC (permalink / raw)
To: Ketor D
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 408 bytes --]
On 10/27/2014 09:22 AM, Jens Axboe wrote:
> On 10/27/2014 09:12 AM, Ketor D wrote:
>> The v5 patch does not work.
>>
>> Run 5 times:
>> 3 times SIGSEGV
>> 2 times NO IOPS, Endless loop
>
> Try this one, perhaps it's the wrong type passed for release. Typedefs
> for the win (or not).
>
> This also fixes comp leaks, if the read/write/sync fails.
The get_return was wrong too, here's -v7...
--
Jens Axboe
[-- Attachment #2: rbd-comp-v7.patch --]
[-- Type: text/x-patch, Size: 9804 bytes --]
diff --git a/engines/rbd.c b/engines/rbd.c
index 6fe87b8d010c..0e04b610b3d9 100644
--- a/engines/rbd.c
+++ b/engines/rbd.c
@@ -11,7 +11,9 @@
struct fio_rbd_iou {
struct io_u *io_u;
+ rbd_completion_t completion;
int io_complete;
+ int io_seen;
};
struct rbd_data {
@@ -30,35 +32,35 @@ struct rbd_options {
static struct fio_option options[] = {
{
- .name = "rbdname",
- .lname = "rbd engine rbdname",
- .type = FIO_OPT_STR_STORE,
- .help = "RBD name for RBD engine",
- .off1 = offsetof(struct rbd_options, rbd_name),
- .category = FIO_OPT_C_ENGINE,
- .group = FIO_OPT_G_RBD,
- },
+ .name = "rbdname",
+ .lname = "rbd engine rbdname",
+ .type = FIO_OPT_STR_STORE,
+ .help = "RBD name for RBD engine",
+ .off1 = offsetof(struct rbd_options, rbd_name),
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_RBD,
+ },
{
- .name = "pool",
- .lname = "rbd engine pool",
- .type = FIO_OPT_STR_STORE,
- .help = "Name of the pool hosting the RBD for the RBD engine",
- .off1 = offsetof(struct rbd_options, pool_name),
- .category = FIO_OPT_C_ENGINE,
- .group = FIO_OPT_G_RBD,
- },
+ .name = "pool",
+ .lname = "rbd engine pool",
+ .type = FIO_OPT_STR_STORE,
+ .help = "Name of the pool hosting the RBD for the RBD engine",
+ .off1 = offsetof(struct rbd_options, pool_name),
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_RBD,
+ },
{
- .name = "clientname",
- .lname = "rbd engine clientname",
- .type = FIO_OPT_STR_STORE,
- .help = "Name of the ceph client to access the RBD for the RBD engine",
- .off1 = offsetof(struct rbd_options, client_name),
- .category = FIO_OPT_C_ENGINE,
- .group = FIO_OPT_G_RBD,
- },
+ .name = "clientname",
+ .lname = "rbd engine clientname",
+ .type = FIO_OPT_STR_STORE,
+ .help = "Name of the ceph client to access the RBD for the RBD engine",
+ .off1 = offsetof(struct rbd_options, client_name),
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_RBD,
+ },
{
- .name = NULL,
- },
+ .name = NULL,
+ },
};
static int _fio_setup_rbd_data(struct thread_data *td,
@@ -163,92 +165,96 @@ static void _fio_rbd_disconnect(struct rbd_data *rbd_data)
}
}
-static void _fio_rbd_finish_write_aiocb(rbd_completion_t comp, void *data)
+static void _fio_rbd_finish_aiocb(rbd_completion_t comp, void *data)
{
- struct io_u *io_u = (struct io_u *)data;
- struct fio_rbd_iou *fio_rbd_iou =
- (struct fio_rbd_iou *)io_u->engine_data;
-
- fio_rbd_iou->io_complete = 1;
-
- /* if write needs to be verified - we should not release comp here
- without fetching the result */
+ struct io_u *io_u = data;
+ struct fio_rbd_iou *fri = io_u->engine_data;
+ ssize_t ret;
+
+ fri->io_complete = 1;
+
+ ret = rbd_aio_get_return_value(fri->completion);
+ if (ret != (int) io_u->xfer_buflen) {
+ if (ret >= 0) {
+ io_u->resid = io_u->xfer_buflen - ret;
+ io_u->error = 0;
+ } else
+ io_u->error = ret;
+ }
+}
- rbd_aio_release(comp);
- /* TODO handle error */
+static struct io_u *fio_rbd_event(struct thread_data *td, int event)
+{
+ struct rbd_data *rbd_data = td->io_ops->data;
- return;
+ return rbd_data->aio_events[event];
}
-static void _fio_rbd_finish_read_aiocb(rbd_completion_t comp, void *data)
+static inline int fri_check_complete(struct rbd_data *rbd_data,
+ struct io_u *io_u,
+ unsigned int *events)
{
- struct io_u *io_u = (struct io_u *)data;
- struct fio_rbd_iou *fio_rbd_iou =
- (struct fio_rbd_iou *)io_u->engine_data;
-
- fio_rbd_iou->io_complete = 1;
+ struct fio_rbd_iou *fri = io_u->engine_data;
- /* if read needs to be verified - we should not release comp here
- without fetching the result */
- rbd_aio_release(comp);
+ if (fri->io_complete) {
+ fri->io_complete = 0;
+ fri->io_seen = 1;
+ rbd_data->aio_events[*events] = io_u;
+ (*events)++;
- /* TODO handle error */
+ rbd_aio_release(fri->completion);
+ return 1;
+ }
- return;
+ return 0;
}
-static void _fio_rbd_finish_sync_aiocb(rbd_completion_t comp, void *data)
+static int rbd_iter_events(struct thread_data *td, unsigned int *events,
+ unsigned int min_evts, int wait)
{
- struct io_u *io_u = (struct io_u *)data;
- struct fio_rbd_iou *fio_rbd_iou =
- (struct fio_rbd_iou *)io_u->engine_data;
-
- fio_rbd_iou->io_complete = 1;
+ struct rbd_data *rbd_data = td->io_ops->data;
+ unsigned int this_events = 0;
+ struct io_u *io_u;
+ int i;
- /* if sync needs to be verified - we should not release comp here
- without fetching the result */
- rbd_aio_release(comp);
+ io_u_qiter(&td->io_u_all, io_u, i) {
+ struct fio_rbd_iou *fri = io_u->engine_data;
- /* TODO handle error */
+ if (!(io_u->flags & IO_U_F_FLIGHT))
+ continue;
+ if (fri->io_seen)
+ continue;
- return;
-}
+ if (fri_check_complete(rbd_data, io_u, events))
+ this_events++;
+ else if (wait) {
+ rbd_aio_wait_for_complete(fri->completion);
-static struct io_u *fio_rbd_event(struct thread_data *td, int event)
-{
- struct rbd_data *rbd_data = td->io_ops->data;
+ if (fri_check_complete(rbd_data, io_u, events))
+ this_events++;
+ }
+ if (*events >= min_evts)
+ break;
+ }
- return rbd_data->aio_events[event];
+ return this_events;
}
static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
unsigned int max, const struct timespec *t)
{
- struct rbd_data *rbd_data = td->io_ops->data;
- unsigned int events = 0;
- struct io_u *io_u;
- int i;
- struct fio_rbd_iou *fov;
+ unsigned int this_events, events = 0;
+ int wait = 0;
do {
- io_u_qiter(&td->io_u_all, io_u, i) {
- if (!(io_u->flags & IO_U_F_FLIGHT))
- continue;
-
- fov = (struct fio_rbd_iou *)io_u->engine_data;
+ this_events = rbd_iter_events(td, &events, min, wait);
- if (fov->io_complete) {
- fov->io_complete = 0;
- rbd_data->aio_events[events] = io_u;
- events++;
- }
-
- }
- if (events < min)
- usleep(100);
- else
+ if (events >= min)
break;
+ if (this_events)
+ continue;
+ wait = 1;
} while (1);
return events;
@@ -256,17 +262,18 @@ static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
{
- int r = -1;
struct rbd_data *rbd_data = td->io_ops->data;
- rbd_completion_t comp;
+ struct fio_rbd_iou *fri = io_u->engine_data;
+ int r = -1;
fio_ro_check(td, io_u);
+ fri->io_complete = 0;
+ fri->io_seen = 0;
+
if (io_u->ddir == DDIR_WRITE) {
- r = rbd_aio_create_completion(io_u,
- (rbd_callback_t)
- _fio_rbd_finish_write_aiocb,
- &comp);
+ r = rbd_aio_create_completion(io_u, _fio_rbd_finish_aiocb,
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_WRITE failed.\n");
@@ -274,17 +281,17 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
}
r = rbd_aio_write(rbd_data->image, io_u->offset,
- io_u->xfer_buflen, io_u->xfer_buf, comp);
+ io_u->xfer_buflen, io_u->xfer_buf,
+ fri->completion);
if (r < 0) {
log_err("rbd_aio_write failed.\n");
+ rbd_aio_release(fri->completion);
goto failed;
}
} else if (io_u->ddir == DDIR_READ) {
- r = rbd_aio_create_completion(io_u,
- (rbd_callback_t)
- _fio_rbd_finish_read_aiocb,
- &comp);
+ r = rbd_aio_create_completion(io_u, _fio_rbd_finish_aiocb,
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_READ failed.\n");
@@ -292,27 +299,28 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
}
r = rbd_aio_read(rbd_data->image, io_u->offset,
- io_u->xfer_buflen, io_u->xfer_buf, comp);
+ io_u->xfer_buflen, io_u->xfer_buf,
+ fri->completion);
if (r < 0) {
log_err("rbd_aio_read failed.\n");
+ rbd_aio_release(fri->completion);
goto failed;
}
} else if (io_u->ddir == DDIR_SYNC) {
- r = rbd_aio_create_completion(io_u,
- (rbd_callback_t)
- _fio_rbd_finish_sync_aiocb,
- &comp);
+ r = rbd_aio_create_completion(io_u, _fio_rbd_finish_aiocb,
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_SYNC failed.\n");
goto failed;
}
- r = rbd_aio_flush(rbd_data->image, comp);
+ r = rbd_aio_flush(rbd_data->image, fri->completion);
if (r < 0) {
log_err("rbd_flush failed.\n");
+ rbd_aio_release(fri->completion);
goto failed;
}
@@ -344,7 +352,6 @@ static int fio_rbd_init(struct thread_data *td)
failed:
return 1;
-
}
static void fio_rbd_cleanup(struct thread_data *td)
@@ -379,8 +386,9 @@ static int fio_rbd_setup(struct thread_data *td)
}
td->io_ops->data = rbd_data;
- /* librbd does not allow us to run first in the main thread and later in a
- * fork child. It needs to be the same process context all the time.
+ /* librbd does not allow us to run first in the main thread and later
+ * in a fork child. It needs to be the same process context all the
+ * time.
*/
td->o.use_thread = 1;
@@ -439,22 +447,21 @@ static int fio_rbd_invalidate(struct thread_data *td, struct fio_file *f)
static void fio_rbd_io_u_free(struct thread_data *td, struct io_u *io_u)
{
- struct fio_rbd_iou *o = io_u->engine_data;
+ struct fio_rbd_iou *fri = io_u->engine_data;
- if (o) {
+ if (fri) {
io_u->engine_data = NULL;
- free(o);
+ free(fri);
}
}
static int fio_rbd_io_u_init(struct thread_data *td, struct io_u *io_u)
{
- struct fio_rbd_iou *o;
+ struct fio_rbd_iou *fri;
- o = malloc(sizeof(*o));
- o->io_complete = 0;
- o->io_u = io_u;
- io_u->engine_data = o;
+ fri = calloc(1, sizeof(*fri));
+ fri->io_u = io_u;
+ io_u->engine_data = fri;
return 0;
}
^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 15:25 ` Jens Axboe
@ 2014-10-27 15:29 ` Ketor D
2014-10-27 15:36 ` Jens Axboe
0 siblings, 1 reply; 52+ messages in thread
From: Ketor D @ 2014-10-27 15:29 UTC (permalink / raw)
To: Jens Axboe
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
I had just found the aio_get_return and aio_release bug when you fixed it. So fast!
But the test looks bad.
The bytes written are always zero.
2014-10-27 23:25 GMT+08:00 Jens Axboe <axboe@kernel.dk>:
> On 10/27/2014 09:22 AM, Jens Axboe wrote:
>> On 10/27/2014 09:12 AM, Ketor D wrote:
>>> The v5 patch does not work.
>>>
>>> Run 5 times:
>>> 3 times SIGSEGV
>>> 2 times NO IOPS, Endless loop
>>
>> Try this one, perhaps it's the wrong type passed for release. Typedefs
>> for the win (or not).
>>
>> This also fixes comp leaks, if the read/write/sync fails.
>
> The get_return was wrong too, here's -v7...
>
> --
> Jens Axboe
>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 15:29 ` Ketor D
@ 2014-10-27 15:36 ` Jens Axboe
2014-10-27 15:45 ` Ketor D
0 siblings, 1 reply; 52+ messages in thread
From: Jens Axboe @ 2014-10-27 15:36 UTC (permalink / raw)
To: Ketor D
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
On 10/27/2014 09:29 AM, Ketor D wrote:
> I had just found the aio_get_return and aio_release bug when you fixed it. So fast!
>
> But the test looks bad.
>
> The bytes written are always zero.
Looks like I need to set up a local test here, haven't run ceph/rbd
before... Can you put a debug printf() in _fio_rbd_finish_aiocb() and
dump what rbd_aio_get_return_value() returns?
--
Jens Axboe
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 15:36 ` Jens Axboe
@ 2014-10-27 15:45 ` Ketor D
2014-10-27 15:53 ` Jens Axboe
0 siblings, 1 reply; 52+ messages in thread
From: Ketor D @ 2014-10-27 15:45 UTC (permalink / raw)
To: Jens Axboe
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
The return code is 0 on success. I modified the code a bit and fio then runs very well.
I think if you fix this bug, the patch will be nearly perfect!!
	ret = rbd_aio_get_return_value(fri->completion);
	//printf("ret=%ld\n", ret);
	//if (ret != (int) io_u->xfer_buflen) {
	if (ret != 0) {
		if (ret >= 0) {
			io_u->resid = io_u->xfer_buflen - ret;
			io_u->error = 0;
		} else
			io_u->error = ret;
	}
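To illustrate why the v7 check would report zero bytes written, here is a
minimal sketch, assuming (as observed above for this write test) that
rbd_aio_get_return_value() returns 0 on success. The struct io_result and
both helper names are hypothetical; they only mirror the io_u->error and
io_u->resid fields and are not the engine code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the two io_u fields the callback sets. */
struct io_result {
	int error;	/* 0 on success, negative value on failure */
	size_t resid;	/* bytes counted as NOT transferred */
};

/* v7-style check: treats the return value as a byte count. */
static struct io_result map_ret_v7(ssize_t ret, size_t xfer_buflen)
{
	struct io_result res = { 0, 0 };

	if (ret != (int) xfer_buflen) {
		if (ret >= 0)
			res.resid = xfer_buflen - (size_t) ret;
		else
			res.error = (int) ret;
	}
	return res;
}

/* Alternative: 0 means success, a negative value means error. */
static struct io_result map_ret_zero_ok(ssize_t ret, size_t xfer_buflen)
{
	struct io_result res = { 0, 0 };

	if (ret < 0) {
		res.error = (int) ret;
		res.resid = xfer_buflen;
	}
	return res;
}
```

With ret == 0 and a 4096-byte transfer, the v7-style check leaves a residual
of 4096 bytes, so the whole buffer is counted as not transferred, while the
zero-means-success variant leaves a residual of 0.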
2014-10-27 23:36 GMT+08:00 Jens Axboe <axboe@kernel.dk>:
> On 10/27/2014 09:29 AM, Ketor D wrote:
>> I had just found the aio_get_return and aio_release bug when you fixed it. So fast!
>>
>> But the test looks bad.
>>
>> The bytes written are always zero.
>
> Looks like I need to set up a local test here, haven't run ceph/rbd
> before... Can you put a debug printf() in _fio_rbd_finish_aiocb() and
> dump what rbd_aio_get_return_value() returns?
>
> --
> Jens Axboe
>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 15:45 ` Ketor D
@ 2014-10-27 15:53 ` Jens Axboe
2014-10-27 16:20 ` Ketor D
0 siblings, 1 reply; 52+ messages in thread
From: Jens Axboe @ 2014-10-27 15:53 UTC (permalink / raw)
To: Ketor D
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 672 bytes --]
On 10/27/2014 09:45 AM, Ketor D wrote:
> The return code is 0 on success. I modified the code a bit and fio then runs very well.
> I think if you fix this bug, the patch will be nearly perfect!!
>
> ret = rbd_aio_get_return_value(fri->completion);
> //printf("ret=%ld\n", ret);
> //if (ret != (int) io_u->xfer_buflen) {
> if (ret != 0) {
> if (ret >= 0) {
> io_u->resid = io_u->xfer_buflen - ret;
> io_u->error = 0;
> } else
> io_u->error = ret;
> }
Weird, so it does not do partial completions I assume. Modified -v8 to
take that into account, hopefully this just works out-of-the-box.
What do the performance numbers look like for your sync test with this?
--
Jens Axboe
[-- Attachment #2: rbd-comp-v8.patch --]
[-- Type: text/x-patch, Size: 9900 bytes --]
diff --git a/engines/rbd.c b/engines/rbd.c
index 6fe87b8d010c..5160c32aedb0 100644
--- a/engines/rbd.c
+++ b/engines/rbd.c
@@ -11,7 +11,9 @@
struct fio_rbd_iou {
struct io_u *io_u;
+ rbd_completion_t completion;
int io_complete;
+ int io_seen;
};
struct rbd_data {
@@ -30,35 +32,35 @@ struct rbd_options {
static struct fio_option options[] = {
{
- .name = "rbdname",
- .lname = "rbd engine rbdname",
- .type = FIO_OPT_STR_STORE,
- .help = "RBD name for RBD engine",
- .off1 = offsetof(struct rbd_options, rbd_name),
- .category = FIO_OPT_C_ENGINE,
- .group = FIO_OPT_G_RBD,
- },
+ .name = "rbdname",
+ .lname = "rbd engine rbdname",
+ .type = FIO_OPT_STR_STORE,
+ .help = "RBD name for RBD engine",
+ .off1 = offsetof(struct rbd_options, rbd_name),
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_RBD,
+ },
{
- .name = "pool",
- .lname = "rbd engine pool",
- .type = FIO_OPT_STR_STORE,
- .help = "Name of the pool hosting the RBD for the RBD engine",
- .off1 = offsetof(struct rbd_options, pool_name),
- .category = FIO_OPT_C_ENGINE,
- .group = FIO_OPT_G_RBD,
- },
+ .name = "pool",
+ .lname = "rbd engine pool",
+ .type = FIO_OPT_STR_STORE,
+ .help = "Name of the pool hosting the RBD for the RBD engine",
+ .off1 = offsetof(struct rbd_options, pool_name),
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_RBD,
+ },
{
- .name = "clientname",
- .lname = "rbd engine clientname",
- .type = FIO_OPT_STR_STORE,
- .help = "Name of the ceph client to access the RBD for the RBD engine",
- .off1 = offsetof(struct rbd_options, client_name),
- .category = FIO_OPT_C_ENGINE,
- .group = FIO_OPT_G_RBD,
- },
+ .name = "clientname",
+ .lname = "rbd engine clientname",
+ .type = FIO_OPT_STR_STORE,
+ .help = "Name of the ceph client to access the RBD for the RBD engine",
+ .off1 = offsetof(struct rbd_options, client_name),
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_RBD,
+ },
{
- .name = NULL,
- },
+ .name = NULL,
+ },
};
static int _fio_setup_rbd_data(struct thread_data *td,
@@ -163,92 +165,99 @@ static void _fio_rbd_disconnect(struct rbd_data *rbd_data)
}
}
-static void _fio_rbd_finish_write_aiocb(rbd_completion_t comp, void *data)
+static void _fio_rbd_finish_aiocb(rbd_completion_t comp, void *data)
{
- struct io_u *io_u = (struct io_u *)data;
- struct fio_rbd_iou *fio_rbd_iou =
- (struct fio_rbd_iou *)io_u->engine_data;
+ struct io_u *io_u = data;
+ struct fio_rbd_iou *fri = io_u->engine_data;
+ ssize_t ret;
- fio_rbd_iou->io_complete = 1;
+ fri->io_complete = 1;
- /* if write needs to be verified - we should not release comp here
- without fetching the result */
+ /*
+ * Looks like return value is 0 for success, or < 0 for
+ * a specific error. So we have to assume that it can't do
+ * partial completions.
+ */
+ ret = rbd_aio_get_return_value(fri->completion);
+ if (ret < 0) {
+ io_u->error = ret;
+ io_u->resid = io_u->xfer_buflen;
+ } else
+ io_u->error = 0;
+}
- rbd_aio_release(comp);
- /* TODO handle error */
+static struct io_u *fio_rbd_event(struct thread_data *td, int event)
+{
+ struct rbd_data *rbd_data = td->io_ops->data;
- return;
+ return rbd_data->aio_events[event];
}
-static void _fio_rbd_finish_read_aiocb(rbd_completion_t comp, void *data)
+static inline int fri_check_complete(struct rbd_data *rbd_data,
+ struct io_u *io_u,
+ unsigned int *events)
{
- struct io_u *io_u = (struct io_u *)data;
- struct fio_rbd_iou *fio_rbd_iou =
- (struct fio_rbd_iou *)io_u->engine_data;
+ struct fio_rbd_iou *fri = io_u->engine_data;
- fio_rbd_iou->io_complete = 1;
+ if (fri->io_complete) {
+ fri->io_complete = 0;
+ fri->io_seen = 1;
+ rbd_data->aio_events[*events] = io_u;
+ (*events)++;
- /* if read needs to be verified - we should not release comp here
- without fetching the result */
- rbd_aio_release(comp);
-
- /* TODO handle error */
+ rbd_aio_release(fri->completion);
+ return 1;
+ }
- return;
+ return 0;
}
-static void _fio_rbd_finish_sync_aiocb(rbd_completion_t comp, void *data)
+static int rbd_iter_events(struct thread_data *td, unsigned int *events,
+ unsigned int min_evts, int wait)
{
- struct io_u *io_u = (struct io_u *)data;
- struct fio_rbd_iou *fio_rbd_iou =
- (struct fio_rbd_iou *)io_u->engine_data;
-
- fio_rbd_iou->io_complete = 1;
+ struct rbd_data *rbd_data = td->io_ops->data;
+ unsigned int this_events = 0;
+ struct io_u *io_u;
+ int i;
- /* if sync needs to be verified - we should not release comp here
- without fetching the result */
- rbd_aio_release(comp);
+ io_u_qiter(&td->io_u_all, io_u, i) {
+ struct fio_rbd_iou *fri = io_u->engine_data;
- /* TODO handle error */
+ if (!(io_u->flags & IO_U_F_FLIGHT))
+ continue;
+ if (fri->io_seen)
+ continue;
- return;
-}
+ if (fri_check_complete(rbd_data, io_u, events))
+ this_events++;
+ else if (wait) {
+ rbd_aio_wait_for_complete(fri->completion);
-static struct io_u *fio_rbd_event(struct thread_data *td, int event)
-{
- struct rbd_data *rbd_data = td->io_ops->data;
+ if (fri_check_complete(rbd_data, io_u, events))
+ this_events++;
+ }
+ if (*events >= min_evts)
+ break;
+ }
- return rbd_data->aio_events[event];
+ return this_events;
}
static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
unsigned int max, const struct timespec *t)
{
- struct rbd_data *rbd_data = td->io_ops->data;
- unsigned int events = 0;
- struct io_u *io_u;
- int i;
- struct fio_rbd_iou *fov;
+ unsigned int this_events, events = 0;
+ int wait = 0;
do {
- io_u_qiter(&td->io_u_all, io_u, i) {
- if (!(io_u->flags & IO_U_F_FLIGHT))
- continue;
+ this_events = rbd_iter_events(td, &events, min, wait);
- fov = (struct fio_rbd_iou *)io_u->engine_data;
-
- if (fov->io_complete) {
- fov->io_complete = 0;
- rbd_data->aio_events[events] = io_u;
- events++;
- }
-
- }
- if (events < min)
- usleep(100);
- else
+ if (events >= min)
break;
+ if (this_events)
+ continue;
+ wait = 1;
} while (1);
return events;
@@ -256,17 +265,18 @@ static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
{
- int r = -1;
struct rbd_data *rbd_data = td->io_ops->data;
- rbd_completion_t comp;
+ struct fio_rbd_iou *fri = io_u->engine_data;
+ int r = -1;
fio_ro_check(td, io_u);
+ fri->io_complete = 0;
+ fri->io_seen = 0;
+
if (io_u->ddir == DDIR_WRITE) {
- r = rbd_aio_create_completion(io_u,
- (rbd_callback_t)
- _fio_rbd_finish_write_aiocb,
- &comp);
+ r = rbd_aio_create_completion(io_u, _fio_rbd_finish_aiocb,
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_WRITE failed.\n");
@@ -274,17 +284,17 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
}
r = rbd_aio_write(rbd_data->image, io_u->offset,
- io_u->xfer_buflen, io_u->xfer_buf, comp);
+ io_u->xfer_buflen, io_u->xfer_buf,
+ fri->completion);
if (r < 0) {
log_err("rbd_aio_write failed.\n");
+ rbd_aio_release(fri->completion);
goto failed;
}
} else if (io_u->ddir == DDIR_READ) {
- r = rbd_aio_create_completion(io_u,
- (rbd_callback_t)
- _fio_rbd_finish_read_aiocb,
- &comp);
+ r = rbd_aio_create_completion(io_u, _fio_rbd_finish_aiocb,
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_READ failed.\n");
@@ -292,27 +302,28 @@ static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
}
r = rbd_aio_read(rbd_data->image, io_u->offset,
- io_u->xfer_buflen, io_u->xfer_buf, comp);
+ io_u->xfer_buflen, io_u->xfer_buf,
+ fri->completion);
if (r < 0) {
log_err("rbd_aio_read failed.\n");
+ rbd_aio_release(fri->completion);
goto failed;
}
} else if (io_u->ddir == DDIR_SYNC) {
- r = rbd_aio_create_completion(io_u,
- (rbd_callback_t)
- _fio_rbd_finish_sync_aiocb,
- &comp);
+ r = rbd_aio_create_completion(io_u, _fio_rbd_finish_aiocb,
+ &fri->completion);
if (r < 0) {
log_err
("rbd_aio_create_completion for DDIR_SYNC failed.\n");
goto failed;
}
- r = rbd_aio_flush(rbd_data->image, comp);
+ r = rbd_aio_flush(rbd_data->image, fri->completion);
if (r < 0) {
log_err("rbd_flush failed.\n");
+ rbd_aio_release(fri->completion);
goto failed;
}
@@ -344,7 +355,6 @@ static int fio_rbd_init(struct thread_data *td)
failed:
return 1;
-
}
static void fio_rbd_cleanup(struct thread_data *td)
@@ -379,8 +389,9 @@ static int fio_rbd_setup(struct thread_data *td)
}
td->io_ops->data = rbd_data;
- /* librbd does not allow us to run first in the main thread and later in a
- * fork child. It needs to be the same process context all the time.
+ /* librbd does not allow us to run first in the main thread and later
+ * in a fork child. It needs to be the same process context all the
+ * time.
*/
td->o.use_thread = 1;
@@ -439,22 +450,21 @@ static int fio_rbd_invalidate(struct thread_data *td, struct fio_file *f)
static void fio_rbd_io_u_free(struct thread_data *td, struct io_u *io_u)
{
- struct fio_rbd_iou *o = io_u->engine_data;
+ struct fio_rbd_iou *fri = io_u->engine_data;
- if (o) {
+ if (fri) {
io_u->engine_data = NULL;
- free(o);
+ free(fri);
}
}
static int fio_rbd_io_u_init(struct thread_data *td, struct io_u *io_u)
{
- struct fio_rbd_iou *o;
+ struct fio_rbd_iou *fri;
- o = malloc(sizeof(*o));
- o->io_complete = 0;
- o->io_u = io_u;
- io_u->engine_data = o;
+ fri = calloc(1, sizeof(*fri));
+ fri->io_u = io_u;
+ io_u->engine_data = fri;
return 0;
}
^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 15:53 ` Jens Axboe
@ 2014-10-27 16:20 ` Ketor D
2014-10-27 16:55 ` Jens Axboe
2014-10-27 21:59 ` Mark Kirkwood
0 siblings, 2 replies; 52+ messages in thread
From: Ketor D @ 2014-10-27 16:20 UTC (permalink / raw)
To: Jens Axboe
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
The v8 patch runs well.
The IOPS is 33032. If I just comment out the usleep(100) in master, I
can get 35245 IOPS.
CPU usage in the two tests is the same, 120%.
So maybe this patch could be better!
Compared to master, this patch is good enough!!
2014-10-27 23:53 GMT+08:00 Jens Axboe <axboe@kernel.dk>:
> On 10/27/2014 09:45 AM, Ketor D wrote:
>> The return code is 0 on success. I modified the code a bit and fio then runs very well.
>> I think if you fix this bug, the patch will be nearly perfect!!
>>
>> ret = rbd_aio_get_return_value(fri->completion);
>> //printf("ret=%ld\n", ret);
>> //if (ret != (int) io_u->xfer_buflen) {
>> if (ret != 0) {
>> if (ret >= 0) {
>> io_u->resid = io_u->xfer_buflen - ret;
>> io_u->error = 0;
>> } else
>> io_u->error = ret;
>> }
>
> Weird, so it does not do partial completions I assume. Modified -v8 to
> take that into account, hopefully this just works out-of-the-box.
>
> What do the performance numbers look like for your sync test with this?
>
> --
> Jens Axboe
>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 16:20 ` Ketor D
@ 2014-10-27 16:55 ` Jens Axboe
2014-10-27 21:59 ` Mark Kirkwood
1 sibling, 0 replies; 52+ messages in thread
From: Jens Axboe @ 2014-10-27 16:55 UTC (permalink / raw)
To: Ketor D
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
On 10/27/2014 10:20 AM, Ketor D wrote:
> The v8 patch runs well.
>
> The IOPS is 33032. If I just comment out the usleep(100) in master, I
> can get 35245 IOPS.
> CPU usage in the two tests is the same, 120%.
> So maybe this patch could be better!
>
> Compared to master, this patch is good enough!!
Agree, committed. I'll set up a local test here and see if we can't
recoup those last percentages. CPU usage may have been the same for your
test, but it will be more for others. A busy loop in there is not a good
idea.
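For reference, the reaping strategy in the patch (one non-blocking pass
over the in-flight requests, then blocking waits only if that pass found
nothing) can be sketched with simulated completions. This is a minimal
sketch, not the engine code: struct fake_io and fake_wait() are hypothetical
stand-ins, and fake_wait() simply marks the request complete where the real
engine would block in rbd_aio_wait_for_complete().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one in-flight request. */
struct fake_io {
	bool complete;	/* set by the (simulated) completion callback */
	bool seen;	/* already handed back to the caller */
};

/* Hypothetical blocking wait: here it just marks the request done. */
static void fake_wait(struct fake_io *io)
{
	io->complete = true;
}

/* Collect one request if its completion flag is set. */
static bool check_complete(struct fake_io *io, unsigned int *events)
{
	if (!io->complete)
		return false;
	io->complete = false;
	io->seen = true;
	(*events)++;
	return true;
}

/* One pass over all requests; blocks per request only when wait is set. */
static unsigned int iter_events(struct fake_io *ios, size_t n,
				unsigned int *events, unsigned int min,
				bool wait)
{
	unsigned int this_events = 0;

	for (size_t i = 0; i < n; i++) {
		struct fake_io *io = &ios[i];

		if (io->seen)
			continue;
		if (check_complete(io, events))
			this_events++;
		else if (wait) {
			fake_wait(io);
			if (check_complete(io, events))
				this_events++;
		}
		if (*events >= min)
			break;
	}
	return this_events;
}

/*
 * Reap at least min events. The caller must ensure min does not exceed
 * the number of unseen requests, otherwise this loop never terminates.
 */
static unsigned int reap(struct fake_io *ios, size_t n, unsigned int min)
{
	unsigned int events = 0;
	bool wait = false;

	for (;;) {
		unsigned int this_events =
			iter_events(ios, n, &events, min, wait);

		if (events >= min)
			break;
		if (this_events)
			continue;	/* made progress, poll again first */
		wait = true;		/* nothing ready: block next pass */
	}
	return events;
}
```

The point of the design is that the thread only sleeps inside a wait on a
specific completion, rather than spinning or sleeping for a fixed interval
between polls.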
--
Jens Axboe
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 16:20 ` Ketor D
2014-10-27 16:55 ` Jens Axboe
@ 2014-10-27 21:59 ` Mark Kirkwood
2014-10-27 22:32 ` Jens Axboe
1 sibling, 1 reply; 52+ messages in thread
From: Mark Kirkwood @ 2014-10-27 21:59 UTC (permalink / raw)
To: Ketor D, Jens Axboe
Cc: Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
On 28/10/14 05:20, Ketor D wrote:
> The v8 patch runs well.
>
> The IOPS is 33032. If I just comment out the usleep(100) in master, I
> can get 35245 IOPS.
> CPU usage in the two tests is the same, 120%.
> So maybe this patch could be better!
>
Yeah, v8 is working for me.
I'm seeing it a bit slower for some blocksizes, but faster (or perhaps
about the same within repeat measurement error) for others:
blocksize k | patched iops | orig iops
------------+---------------+-----------
4 | 12265 | 11930
128 | 5800 | 7100
1024 | 1193 | 1196
Regards
Mark
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 21:59 ` Mark Kirkwood
@ 2014-10-27 22:32 ` Jens Axboe
2014-10-27 23:21 ` Mark Kirkwood
0 siblings, 1 reply; 52+ messages in thread
From: Jens Axboe @ 2014-10-27 22:32 UTC (permalink / raw)
To: Mark Kirkwood, Ketor D
Cc: Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
On 10/27/2014 03:59 PM, Mark Kirkwood wrote:
> On 28/10/14 05:20, Ketor D wrote:
>> The v8 patch runs well.
>>
>> The IOPS is 33032. If I just comment out the usleep(100) in master, I
>> can get 35245 IOPS.
>> CPU usage in the two tests is the same, 120%.
>> So maybe this patch could be better!
>>
>
> Yeah, v8 is working for me.
>
> I'm seeing it a bit slower for some blocksizes, but faster (or perhaps
> about the same within repeat measurement error) for others:
>
> blocksize k | patched iops | orig iops
> ------------+---------------+-----------
> 4 | 12265 | 11930
> 128 | 5800 | 7100
> 1024 | 1193 | 1196
As for most things, the difference should be in IOPS, not bandwidth. So
I would assume that the above are within normal variance, since 4k
should show the biggest difference, then drop off after that and match
at 128/1024k.
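To put the IOPS-versus-bandwidth point in numbers, here is a small sketch (not fio code; the helper name is made up for illustration) that converts an IOPS figure at a given block size into KiB/s:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an IOPS figure measured at a given block size (in KiB)
 * into bandwidth in KiB/s. Illustrative helper, not part of fio.
 */
uint64_t iops_to_kib_per_sec(uint64_t iops, uint64_t block_kib)
{
	assert(block_kib > 0);
	return iops * block_kib;
}
```

With the patched figures quoted above, 12265 IOPS at 4k works out to 49060 KiB/s (about 48 MiB/s), while 1193 IOPS at 1024k is 1221632 KiB/s (about 1193 MiB/s). That suggests the large-block runs are dominated by data transfer rather than by per-I/O overhead.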
--
Jens Axboe
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 22:32 ` Jens Axboe
@ 2014-10-27 23:21 ` Mark Kirkwood
2014-10-28 3:23 ` Ketor D
0 siblings, 1 reply; 52+ messages in thread
From: Mark Kirkwood @ 2014-10-27 23:21 UTC (permalink / raw)
To: Jens Axboe, Ketor D
Cc: Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
On 28/10/14 11:32, Jens Axboe wrote:
> On 10/27/2014 03:59 PM, Mark Kirkwood wrote:
>> On 28/10/14 05:20, Ketor D wrote:
>>> V8 patch runs good.
>>>
>>> The iops is 33032. If I just comment the usleep(100) in the master, I
>>> can get iops 35245.
>>> The CPU usage about the two test is same 120%.
>>> So maybe this patch could be better!
>>>
>>
>> Yeah, v8 is working for me.
>>
>> I'm seeing it a bit slower for some blocksizes, but faster (or perhaps
>> about the same within repeat measurement error) for others:
>>
>> blocksize k | patched iops | orig iops
>> ------------+--------------+-----------
>>           4 |        12265 |     11930
>>         128 |         5800 |      7100
>>        1024 |         1193 |      1196
>
> As for most things, the difference should be in IOPS, not bandwidth. So
> I would assume that the above are within normal variance, since 4k
> should show the biggest difference, then drop off after that and match
> at 128/1024k.
>
Yeah, I suspect the 4K numbers are the same as we are bottlenecked by
Ceph's small blocksize performance, not fio itself. If Ketor has a setup
that can get higher IOPS @4K it would be interesting to see his numbers
for patched vs orig!
Cheers
Mark
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-27 23:21 ` Mark Kirkwood
@ 2014-10-28 3:23 ` Ketor D
2014-10-28 4:01 ` Mark Kirkwood
2014-10-28 4:05 ` Jens Axboe
0 siblings, 2 replies; 52+ messages in thread
From: Ketor D @ 2014-10-28 3:23 UTC (permalink / raw)
To: Mark Kirkwood
Cc: Jens Axboe, Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 209 bytes --]
Hi Mark,
I hope you can test my patch. I get the best performance with it.
2014-10-28 7:21 GMT+08:00 Mark Kirkwood <mark.kirkwood@catalyst.net.nz>:
> [...] it would be interesting to see his numbers for patched vs orig!
[-- Attachment #2: rbd_no_usleep.patch --]
[-- Type: application/octet-stream, Size: 290 bytes --]
diff --git a/engines/rbd.c b/engines/rbd.c
index 6fe87b8..e6f2dff 100644
--- a/engines/rbd.c
+++ b/engines/rbd.c
@@ -245,7 +245,7 @@ static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
}
if (events < min)
- usleep(100);
+ ;//usleep(100);
else
break;
^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-28 3:23 ` Ketor D
@ 2014-10-28 4:01 ` Mark Kirkwood
2014-10-28 4:05 ` Jens Axboe
1 sibling, 0 replies; 52+ messages in thread
From: Mark Kirkwood @ 2014-10-28 4:01 UTC (permalink / raw)
To: Ketor D
Cc: Jens Axboe, Mark Nelson, Mark Nelson, fio, xan.peng,
ceph-devel@vger.kernel.org
On 28/10/14 16:23, Ketor D wrote:
> Hi Mark,
> Wish you could test my patch.I get the best performance using this patch.
>
>
It is not clear cut for me (tested reads only):
blocksize k | v8 patched iops | Ketor patch iops | orig iops
------------+-----------------+------------------+-----------
          4 |           12265 |            11930 |     11516
        128 |            5800 |             7100 |      6550
       1024 |            1193 |             1196 |      1248
Cheers
Mark
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-28 3:23 ` Ketor D
2014-10-28 4:01 ` Mark Kirkwood
@ 2014-10-28 4:05 ` Jens Axboe
2014-10-28 4:49 ` Ketor D
1 sibling, 1 reply; 52+ messages in thread
From: Jens Axboe @ 2014-10-28 4:05 UTC (permalink / raw)
To: Ketor D
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio@vger.kernel.org,
xan.peng, ceph-devel@vger.kernel.org
> On Oct 27, 2014, at 9:23 PM, Ketor D <d.ketor@gmail.com> wrote:
>
> Hi Mark,
> Wish you could test my patch.I get the best performance using this patch.
There's no way we're doing a busy loop, sorry. As mentioned in a previous email, it'd be great if you would work off current git and potentially improve that.
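For readers following along, the trade-off can be sketched as below. This is an illustration only, not the fio rbd engine: `reap_events`, `fake_poll` and `struct fake_dev` are hypothetical names, and the "device" is simulated by a counter that completes one I/O every few polls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for "check how many I/Os finished since last call". */
typedef unsigned int (*poll_fn)(void *ctx);

/* Simulated device: one I/O completes on every `every`-th poll. */
struct fake_dev {
	unsigned int calls;
	unsigned int every;
};

unsigned int fake_poll(void *ctx)
{
	struct fake_dev *d = ctx;

	d->calls++;
	return (d->calls % d->every) == 0 ? 1 : 0;
}

/*
 * Poll until at least `min` events have been reaped.
 * busy == 0: sleep 100us between polls (gives up the CPU, adds latency).
 * busy != 0: spin without sleeping (lower latency, burns a core).
 * *polls receives the number of poll iterations performed.
 */
unsigned int reap_events(poll_fn poll, void *ctx, unsigned int min,
			 int busy, unsigned long *polls)
{
	const struct timespec ts = { 0, 100 * 1000 };	/* 100 microseconds */
	unsigned int events = 0;

	assert(poll != NULL && polls != NULL);
	*polls = 0;

	for (;;) {
		events += poll(ctx);
		(*polls)++;
		if (events >= min)
			break;
		if (!busy)
			nanosleep(&ts, NULL);
	}
	return events;
}
```

Both modes do the same number of polls against the simulated device; the only difference is whether the thread sleeps between them, which is exactly what commenting out usleep(100) changes.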
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-28 4:05 ` Jens Axboe
@ 2014-10-28 4:49 ` Ketor D
2014-10-28 15:14 ` Jens Axboe
0 siblings, 1 reply; 52+ messages in thread
From: Ketor D @ 2014-10-28 4:49 UTC (permalink / raw)
To: Jens Axboe
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio@vger.kernel.org,
xan.peng, ceph-devel@vger.kernel.org
Agreed. The busy loop is only for testing.
I will try the current git.
Thanks!
2014-10-28 12:05 GMT+08:00 Jens Axboe <axboe@kernel.dk>:
>
>> On Oct 27, 2014, at 9:23 PM, Ketor D <d.ketor@gmail.com> wrote:
>>
>> Hi Mark,
>> Wish you could test my patch.I get the best performance using this patch.
>
> There's no way we're doing a busy loop, sorry. As mentioned in a previous email, it'd be great if you would work off current git and potentially improve that.
>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-28 4:49 ` Ketor D
@ 2014-10-28 15:14 ` Jens Axboe
2014-10-28 15:49 ` Ketor D
0 siblings, 1 reply; 52+ messages in thread
From: Jens Axboe @ 2014-10-28 15:14 UTC (permalink / raw)
To: Ketor D
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio@vger.kernel.org,
xan.peng, ceph-devel@vger.kernel.org
On 2014-10-27 22:49, Ketor D wrote:
> Agree. Busy loop is only for test.
> I will try the current git.
Committed two more rbd changes:
- Add support for rbd_invalidate_cache() (if it exists)
- Use rbd_aio_is_complete() instead of using fri->io_complete. The
latter should have some locking to ensure it's always seen, so it's
better to use the API provided function to determine whether this IO is
done or not.
Unless we often hit the complete race, I would not expect this to make
much of a difference. But it's worth testing in any case, especially
since my two attempts at setting up ceph + rbd have failed miserably. So
I still can't test myself.
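As a generic illustration of why a plain "done" flag needs some synchronization to be reliably seen by another thread, here is a sketch using C11 atomics. It does not use librbd and is not the fio code; `struct io_state` and the function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-I/O state; not fio's struct fio_rbd_iou. */
struct io_state {
	int result;		/* written by the completion callback */
	atomic_bool done;	/* publishes `result` to the polling thread */
};

void io_state_init(struct io_state *s)
{
	assert(s != NULL);
	s->result = 0;
	atomic_init(&s->done, false);
}

/* Completion side: store the result, then publish it with release ordering. */
void io_state_complete(struct io_state *s, int result)
{
	s->result = result;
	atomic_store_explicit(&s->done, true, memory_order_release);
}

/* Polling side: an acquire load that returns true guarantees `result` is visible. */
bool io_state_is_complete(struct io_state *s)
{
	return atomic_load_explicit(&s->done, memory_order_acquire);
}
```

Asking the library whether the I/O is done, as rbd_aio_is_complete() does, leaves that synchronization to the API instead of to a flag maintained by the caller.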
--
Jens Axboe
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-28 15:14 ` Jens Axboe
@ 2014-10-28 15:49 ` Ketor D
2014-10-28 15:53 ` Jens Axboe
2014-10-28 17:09 ` Jens Axboe
0 siblings, 2 replies; 52+ messages in thread
From: Ketor D @ 2014-10-28 15:49 UTC (permalink / raw)
To: Jens Axboe
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio@vger.kernel.org,
xan.peng, ceph-devel@vger.kernel.org
I cannot get the newly committed code from GitHub yet.
I will test once I have the newest code.
2014-10-28 23:14 GMT+08:00 Jens Axboe <axboe@kernel.dk>:
> On 2014-10-27 22:49, Ketor D wrote:
>>
>> Agree. Busy loop is only for test.
>> I will try the current git.
>
>
> Committed two more rbd changes:
>
> - Add support for rbd_invalidate_cache() (if it exists)
> - Use rbd_aio_is_complete() instead of using fri->io_complete. The latter
> should have some locking to ensure it's always seen, so it's better to use
> the API provided function to determine whether this IO is done or not.
>
> Unless we often hit the complete race, I would not expect this to make much
> of a difference. But it's worth testing in any case, especially since my two
> attempts at setting up ceph + rbd have failed miserably. So I still can't
> test myself.
>
> --
> Jens Axboe
>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-28 15:49 ` Ketor D
@ 2014-10-28 15:53 ` Jens Axboe
2014-10-28 17:09 ` Jens Axboe
1 sibling, 0 replies; 52+ messages in thread
From: Jens Axboe @ 2014-10-28 15:53 UTC (permalink / raw)
To: Ketor D
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio@vger.kernel.org,
xan.peng, ceph-devel@vger.kernel.org
On 2014-10-28 09:49, Ketor D wrote:
> Cannot get the new commited code from github now.
> When I get the newest code, I will test.
github is just a mirror, I push to:
git://git.kernel.dk/fio
Github is pushed automatically every hour, if there are changes. So it
may lag an hour. Should be there now, though.
--
Jens Axboe
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-28 15:49 ` Ketor D
2014-10-28 15:53 ` Jens Axboe
@ 2014-10-28 17:09 ` Jens Axboe
2014-10-28 18:43 ` Ketor D
1 sibling, 1 reply; 52+ messages in thread
From: Jens Axboe @ 2014-10-28 17:09 UTC (permalink / raw)
To: Ketor D
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio@vger.kernel.org,
xan.peng, ceph-devel@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 610 bytes --]
On 2014-10-28 09:49, Ketor D wrote:
> Cannot get the new commited code from github now.
> When I get the newest code, I will test.
So here's another idea, applies on top of current -git. Basically it
makes rbd wait for the oldest event, not just the first one in the array
of all ios. This is the saner thing to do, as hopefully the oldest event
will be the one to complete first. At least it has a much higher chance
of being the right thing to do, than just waiting on a random event.
Completely untested, so you might have to fiddle a bit with it to ensure
that it actually works...
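The ordering idea on its own can be sketched as follows. This is a simplified stand-alone example, not the attached patch: `struct demo_io` is a hypothetical stand-in for fio's io_u, and it assumes each I/O carries its start time as a single microsecond timestamp. It compares start timestamps directly; if elapsed time were compared instead, the larger value would be the older I/O.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for an in-flight I/O. */
struct demo_io {
	int id;
	uint64_t start_usec;	/* when the I/O was issued */
};

/* qsort comparator over an array of pointers: earliest start time first. */
int demo_io_oldest_first(const void *p1, const void *p2)
{
	const struct demo_io *a = *(const struct demo_io * const *)p1;
	const struct demo_io *b = *(const struct demo_io * const *)p2;

	if (a->start_usec < b->start_usec)
		return -1;
	if (a->start_usec > b->start_usec)
		return 1;
	return 0;
}

/* Sort the pending I/Os so that index 0 is the one issued longest ago. */
void demo_sort_oldest_first(struct demo_io **ios, size_t n)
{
	assert(ios != NULL || n == 0);
	qsort(ios, n, sizeof(*ios), demo_io_oldest_first);
}
```

After sorting, a waiter would block on element 0 first, on the assumption that the longest-outstanding request is the most likely to complete next.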
--
Jens Axboe
[-- Attachment #2: rbd-time-sort.patch --]
[-- Type: text/x-patch, Size: 3060 bytes --]
diff --git a/engines/rbd.c b/engines/rbd.c
index cf7be0acd1e3..f3129044c430 100644
--- a/engines/rbd.c
+++ b/engines/rbd.c
@@ -20,6 +20,7 @@ struct rbd_data {
rados_ioctx_t io_ctx;
rbd_image_t image;
struct io_u **aio_events;
+ struct io_u **sort_events;
};
struct rbd_options {
@@ -80,20 +81,19 @@ static int _fio_setup_rbd_data(struct thread_data *td,
if (td->io_ops->data)
return 0;
- rbd_data = malloc(sizeof(struct rbd_data));
+ rbd_data = calloc(1, sizeof(struct rbd_data));
if (!rbd_data)
goto failed;
- memset(rbd_data, 0, sizeof(struct rbd_data));
-
- rbd_data->aio_events = malloc(td->o.iodepth * sizeof(struct io_u *));
+ rbd_data->aio_events = calloc(td->o.iodepth, sizeof(struct io_u *));
if (!rbd_data->aio_events)
goto failed;
- memset(rbd_data->aio_events, 0, td->o.iodepth * sizeof(struct io_u *));
+ rbd_data->sort_events = calloc(td->o.iodepth, sizeof(struct io_u *));
+ if (!rbd_data->sort_events)
+ goto failed;
*rbd_data_ptr = rbd_data;
-
return 0;
failed:
@@ -218,14 +218,32 @@ static inline int fri_check_complete(struct rbd_data *rbd_data,
return 0;
}
+static int rbd_io_u_cmp(const void *p1, const void *p2)
+{
+ const struct io_u **a = (const struct io_u **) p1;
+ const struct io_u **b = (const struct io_u **) p2;
+ uint64_t at, bt;
+
+ at = utime_since_now(&(*a)->start_time);
+ bt = utime_since_now(&(*b)->start_time);
+
+ if (at < bt)
+ return -1;
+ else if (at == bt)
+ return 0;
+ else
+ return 1;
+}
+
static int rbd_iter_events(struct thread_data *td, unsigned int *events,
unsigned int min_evts, int wait)
{
struct rbd_data *rbd_data = td->io_ops->data;
unsigned int this_events = 0;
struct io_u *io_u;
- int i;
+ int i, sort_idx;
+ sort_idx = 0;
io_u_qiter(&td->io_u_all, io_u, i) {
struct fio_rbd_iou *fri = io_u->engine_data;
@@ -236,16 +254,39 @@ static int rbd_iter_events(struct thread_data *td, unsigned int *events,
if (fri_check_complete(rbd_data, io_u, events))
this_events++;
- else if (wait) {
- rbd_aio_wait_for_complete(fri->completion);
+ else if (wait)
+ rbd_data->sort_events[sort_idx++] = io_u;
- if (fri_check_complete(rbd_data, io_u, events))
- this_events++;
- }
if (*events >= min_evts)
break;
}
+ if (!wait || !sort_idx)
+ return this_events;
+
+ qsort(rbd_data->sort_events, sort_idx, sizeof(struct io_u *), rbd_io_u_cmp);
+ for (i = 0; i < sort_idx; i++) {
+ struct fio_rbd_iou *fri;
+
+ io_u = rbd_data->sort_events[i];
+ fri = io_u->engine_data;
+
+ if (fri_check_complete(rbd_data, io_u, events)) {
+ this_events++;
+ continue;
+ }
+ if (!wait)
+ continue;
+
+ rbd_aio_wait_for_complete(fri->completion);
+
+ if (fri_check_complete(rbd_data, io_u, events))
+ this_events++;
+
+ if (wait && *events >= min_evts)
+ wait = 0;
+ }
+
return this_events;
}
@@ -359,6 +400,7 @@ static void fio_rbd_cleanup(struct thread_data *td)
if (rbd_data) {
_fio_rbd_disconnect(rbd_data);
free(rbd_data->aio_events);
+ free(rbd_data->sort_events);
free(rbd_data);
}
^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-28 17:09 ` Jens Axboe
@ 2014-10-28 18:43 ` Ketor D
2014-10-29 7:15 ` Ketor D
0 siblings, 1 reply; 52+ messages in thread
From: Ketor D @ 2014-10-28 18:43 UTC (permalink / raw)
To: Jens Axboe
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio@vger.kernel.org,
xan.peng, ceph-devel@vger.kernel.org
Yeah, the new wait strategy looks better.
I will test the patch soon.
2014-10-29 1:09 GMT+08:00 Jens Axboe <axboe@kernel.dk>:
> On 2014-10-28 09:49, Ketor D wrote:
>>
>> Cannot get the new commited code from github now.
>> When I get the newest code, I will test.
>
>
> So here's another idea, applies on top of current -git. Basically it makes
> rbd wait for the oldest event, not just the first one in the array of all
> ios. This is the saner thing to do, as hopefully the oldest event will be
> the one to complete first. At least it has a much higher chance of being the
> right thing to do, than just waiting on a random event.
>
> Completely untested, so you might have to fiddle a bit with it to ensure
> that it actually works...
>
> --
> Jens Axboe
>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-28 18:43 ` Ketor D
@ 2014-10-29 7:15 ` Ketor D
2014-10-29 14:31 ` Jens Axboe
0 siblings, 1 reply; 52+ messages in thread
From: Ketor D @ 2014-10-29 7:15 UTC (permalink / raw)
To: Jens Axboe
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio@vger.kernel.org,
xan.peng, ceph-devel@vger.kernel.org
Hi, Jens,
There is a command-line parsing bug in the fio rbd test.
I have fixed it and created a pull request on GitHub.
Please review.
With the bugs fixed, the fio test can run.
2014-10-29 2:43 GMT+08:00 Ketor D <d.ketor@gmail.com>:
> Yeah, the new wait strategy looks better.
>
> I will test the patch soon.
>
> 2014-10-29 1:09 GMT+08:00 Jens Axboe <axboe@kernel.dk>:
>> On 2014-10-28 09:49, Ketor D wrote:
>>>
>>> Cannot get the new commited code from github now.
>>> When I get the newest code, I will test.
>>
>>
>> So here's another idea, applies on top of current -git. Basically it makes
>> rbd wait for the oldest event, not just the first one in the array of all
>> ios. This is the saner thing to do, as hopefully the oldest event will be
>> the one to complete first. At least it has a much higher chance of being the
>> right thing to do, than just waiting on a random event.
>>
>> Completely untested, so you might have to fiddle a bit with it to ensure
>> that it actually works...
>>
>> --
>> Jens Axboe
>>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-29 7:15 ` Ketor D
@ 2014-10-29 14:31 ` Jens Axboe
2014-10-30 2:50 ` Ketor D
2014-10-30 7:44 ` Mark Kirkwood
0 siblings, 2 replies; 52+ messages in thread
From: Jens Axboe @ 2014-10-29 14:31 UTC (permalink / raw)
To: Ketor D
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio@vger.kernel.org,
xan.peng, ceph-devel@vger.kernel.org
On 2014-10-29 01:15, Ketor D wrote:
> Hi, Jens,
>
> There is cmdline parse bug in the fio rbd test.
>
> I have fixed this and create a pull request on the github.
>
> Please review.
>
> After fix the bugs, the fio test can run.
I merged your two pull requests (thanks!) and committed a polished
variant of the sort patch. Ketor and Mark, would you mind both running a
quick benchmark on the current -git head?
--
Jens Axboe
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-29 14:31 ` Jens Axboe
@ 2014-10-30 2:50 ` Ketor D
2014-10-30 2:55 ` Jens Axboe
2014-10-30 7:44 ` Mark Kirkwood
1 sibling, 1 reply; 52+ messages in thread
From: Ketor D @ 2014-10-30 2:50 UTC (permalink / raw)
To: Jens Axboe
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio@vger.kernel.org,
xan.peng, ceph-devel@vger.kernel.org
Hi Jens,
The current code runs well!
Test conf: jobs=1 iodepth=1 bs=4k
With busy_poll=1, IOPS is 38989.
With busy_poll=0, IOPS is 33031.
The busy_poll=0 result looks no different from the old code, which did
not have the sorted event wait.
2014-10-29 22:31 GMT+08:00 Jens Axboe <axboe@kernel.dk>:
> On 2014-10-29 01:15, Ketor D wrote:
>>
>> Hi, Jens,
>>
>> There is cmdline parse bug in the fio rbd test.
>>
>> I have fixed this and create a pull request on the github.
>>
>> Please review.
>>
>> After fix the bugs, the fio test can run.
>
>
> I merged your two pull requests (thanks!) and committed a polished variant
> of the sort patch. Ketor and Mark, would you mind both running a quick
> benchmark on the current -git head?
>
> --
> Jens Axboe
>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-30 2:50 ` Ketor D
@ 2014-10-30 2:55 ` Jens Axboe
2014-10-30 5:29 ` Ketor D
0 siblings, 1 reply; 52+ messages in thread
From: Jens Axboe @ 2014-10-30 2:55 UTC (permalink / raw)
To: Ketor D
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio@vger.kernel.org,
xan.peng, ceph-devel@vger.kernel.org
On 2014-10-29 20:50, Ketor D wrote:
> Hi Jens,
>
> The current code runs good!
>
> Test Conf: jobs=1 iodepth=1 bs=4k
>
> if busy_poll = 1, IOPS is 38989.
> if busy_poll = 0, IOPS is 33031.
>
> And busy_poll=0 test result looks no difference from the old code than
> do not have sorted events wait.
Good to hear! I think we can safely say that we've pushed rbd as far as
we can. At least on the fio side. Still appears to be some suboptimal
parts of the librbd API. And the kernel rbd driver could definitely be
improved a lot as well. Using busy_poll on the user side gets rid of a
sleep/wakeup cycle at that end, but the kernel driver still punts and
offloads any work item to a work queue...
Thanks for all your testing through this, really appreciated!
--
Jens Axboe
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-30 2:55 ` Jens Axboe
@ 2014-10-30 5:29 ` Ketor D
0 siblings, 0 replies; 52+ messages in thread
From: Ketor D @ 2014-10-30 5:29 UTC (permalink / raw)
To: Jens Axboe
Cc: Mark Kirkwood, Mark Nelson, Mark Nelson, fio@vger.kernel.org,
xan.peng, ceph-devel@vger.kernel.org
Thanks. I feel honoured to have finished these tests and am happy to have
helped improve fio.
I am also trying to decrease the librbd latency, and I have made some small progress.
2014-10-30 10:55 GMT+08:00 Jens Axboe <axboe@kernel.dk>:
> On 2014-10-29 20:50, Ketor D wrote:
>>
>> Hi Jens,
>>
>> The current code runs good!
>>
>> Test Conf: jobs=1 iodepth=1 bs=4k
>>
>> if busy_poll = 1, IOPS is 38989.
>> if busy_poll = 0, IOPS is 33031.
>>
>> And busy_poll=0 test result looks no difference from the old code than
>> do not have sorted events wait.
>
>
> Good to hear! I think we can safely say that we've pushed rbd as far as we
> can. At least on the fio side. Still appears to be some suboptimal parts of
> the librbd API. And the kernel rbd driver could definitely be improved a lot
> as well. Using busy_poll on the user side gets rid of a sleep/wakeup cycle
> at that end, but the kernel driver still punts and offloads any work item to
> a work queue...
>
> Thanks for all your testing through this, really appreciated!
>
> --
> Jens Axboe
>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-29 14:31 ` Jens Axboe
2014-10-30 2:50 ` Ketor D
@ 2014-10-30 7:44 ` Mark Kirkwood
2014-10-30 8:04 ` Ketor D
1 sibling, 1 reply; 52+ messages in thread
From: Mark Kirkwood @ 2014-10-30 7:44 UTC (permalink / raw)
To: Jens Axboe, Ketor D
Cc: Mark Nelson, Mark Nelson, fio@vger.kernel.org, xan.peng,
ceph-devel@vger.kernel.org
On 30/10/14 03:31, Jens Axboe wrote:
> On 2014-10-29 01:15, Ketor D wrote:
>> Hi, Jens,
>>
>> There is cmdline parse bug in the fio rbd test.
>>
>> I have fixed this and create a pull request on the github.
>>
>> Please review.
>>
>> After fix the bugs, the fio test can run.
>
> I merged your two pull requests (thanks!) and committed a polished
> variant of the sort patch. Ketor and Mark, would you mind both running a
> quick benchmark on the current -git head?
>
Better late than never (sorry), comparing with the 'original' fio code
containing the usleep(100):
blocksize k | head iops | orig iops
------------+-----------+-----------
          4 |     11114 |     11516
        128 |      4551 |      6550
       1024 |      1195 |      1248
So we do pretty much the same except in the middle blocksize range (I
checked again with the old binary just to rule out any other changes on
the ceph end...).
Regards
Mark
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-30 7:44 ` Mark Kirkwood
@ 2014-10-30 8:04 ` Ketor D
2014-10-31 8:54 ` Mark Kirkwood
0 siblings, 1 reply; 52+ messages in thread
From: Ketor D @ 2014-10-30 8:04 UTC (permalink / raw)
To: Mark Kirkwood
Cc: Jens Axboe, Mark Nelson, Mark Nelson, fio@vger.kernel.org,
xan.peng, ceph-devel@vger.kernel.org
Hi Mark,
Could you run a fio test in your environment with busy_poll=1?
I am very interested in the busy_poll result. Thanks!
2014-10-30 15:44 GMT+08:00 Mark Kirkwood <mark.kirkwood@catalyst.net.nz>:
> On 30/10/14 03:31, Jens Axboe wrote:
>>
>> On 2014-10-29 01:15, Ketor D wrote:
>>>
>>> Hi, Jens,
>>>
>>> There is cmdline parse bug in the fio rbd test.
>>>
>>> I have fixed this and create a pull request on the github.
>>>
>>> Please review.
>>>
>>> After fix the bugs, the fio test can run.
>>
>>
>> I merged your two pull requests (thanks!) and committed a polished
>> variant of the sort patch. Ketor and Mark, would you mind both running a
>> quick benchmark on the current -git head?
>>
>
> Better late than never (sorry), comparing with the 'original' fio code
> containing the usleep(100):
>
> blocksize k | head iops | orig iops
> ------------+-----------+-----------
>           4 |     11114 |     11516
>         128 |      4551 |      6550
>        1024 |      1195 |      1248
>
> So we do pretty much the same except in the middle blocksize range (I
> checked again with the old binary just to rule out any other changes on the
> ceph end...).
>
> Regards
>
> Mark
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
2014-10-30 8:04 ` Ketor D
@ 2014-10-31 8:54 ` Mark Kirkwood
0 siblings, 0 replies; 52+ messages in thread
From: Mark Kirkwood @ 2014-10-31 8:54 UTC (permalink / raw)
To: Ketor D
Cc: Jens Axboe, Mark Nelson, Mark Nelson, fio@vger.kernel.org,
xan.peng, ceph-devel@vger.kernel.org
On 30/10/14 21:04, Ketor D wrote:
> Hi Mark,
> Could you do a fio test in your env with the busy_poll=1 ?
> I am very interested in the busy_poll result. Thanks!
>
Sure:
blocksize k | head iops | head iops (busy_poll=1)
------------+-----------+-------------------------
          4 |     11114 |                   12608
        128 |      4551 |                    6422
       1024 |      1195 |                    1175
       4096 |       320 |                     316
So it looks like busy_poll=1 improves performance for small and mid-range
blocksizes but is a little slower at the larger end.
However, there are a lot of variables here. I'm using iodepth=32, for
instance, and altering that may change the pattern I'm seeing. A system
with more OSDs may also behave differently, since it can run the fio
client out of available CPU at the smaller block sizes.
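To quantify the table above, the relative change can be computed with a small helper (an illustrative sketch; the function name is made up and this is not part of fio):

```c
#include <assert.h>

/*
 * Relative change of `value` against `base`, in percent.
 * Positive means `value` is higher than `base`.
 */
double percent_change(double base, double value)
{
	assert(base > 0.0);
	return (value - base) / base * 100.0;
}
```

Applied to the figures above, the busy-polling column comes out roughly 13.4% higher at 4k and 41.1% higher at 128k, and about 1.7% and 1.25% lower at 1024k and 4096k respectively.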
Regards
Mark
^ permalink raw reply [flat|nested] 52+ messages in thread
end of thread, other threads:[~2014-10-31 8:54 UTC | newest]
Thread overview: 52+ messages
2014-10-24 2:38 fio rbd hang for block sizes > 1M Mark Kirkwood
2014-10-24 5:35 ` Jens Axboe
2014-10-24 6:17 ` Mark Kirkwood
2014-10-24 13:19 ` Mark Nelson
2014-10-24 14:09 ` Mark Nelson
2014-10-24 14:30 ` Jens Axboe
2014-10-24 22:45 ` Mark Kirkwood
2014-10-25 0:12 ` Mark Nelson
2014-10-25 0:37 ` Mark Kirkwood
2014-10-25 2:35 ` Mark Kirkwood
2014-10-25 3:47 ` Jens Axboe
2014-10-25 4:50 ` fio rbd completions (Was: fio rbd hang for block sizes > 1M) Mark Kirkwood
2014-10-25 19:20 ` Jens Axboe
2014-10-25 22:25 ` Mark Kirkwood
2014-10-27 9:27 ` Ketor D
2014-10-27 10:25 ` Ketor D
2014-10-27 14:19 ` Jens Axboe
2014-10-27 14:15 ` Jens Axboe
2014-10-27 14:19 ` Jens Axboe
2014-10-27 15:12 ` Ketor D
2014-10-27 15:22 ` Jens Axboe
2014-10-27 15:25 ` Jens Axboe
2014-10-27 15:29 ` Ketor D
2014-10-27 15:36 ` Jens Axboe
2014-10-27 15:45 ` Ketor D
2014-10-27 15:53 ` Jens Axboe
2014-10-27 16:20 ` Ketor D
2014-10-27 16:55 ` Jens Axboe
2014-10-27 21:59 ` Mark Kirkwood
2014-10-27 22:32 ` Jens Axboe
2014-10-27 23:21 ` Mark Kirkwood
2014-10-28 3:23 ` Ketor D
2014-10-28 4:01 ` Mark Kirkwood
2014-10-28 4:05 ` Jens Axboe
2014-10-28 4:49 ` Ketor D
2014-10-28 15:14 ` Jens Axboe
2014-10-28 15:49 ` Ketor D
2014-10-28 15:53 ` Jens Axboe
2014-10-28 17:09 ` Jens Axboe
2014-10-28 18:43 ` Ketor D
2014-10-29 7:15 ` Ketor D
2014-10-29 14:31 ` Jens Axboe
2014-10-30 2:50 ` Ketor D
2014-10-30 2:55 ` Jens Axboe
2014-10-30 5:29 ` Ketor D
2014-10-30 7:44 ` Mark Kirkwood
2014-10-30 8:04 ` Ketor D
2014-10-31 8:54 ` Mark Kirkwood
2014-10-24 22:30 ` fio rbd hang for block sizes > 1M Mark Kirkwood
2014-10-24 22:38 ` Mark Nelson
2014-10-24 14:11 ` Danny Al-Gaaf
2014-10-24 14:31 ` Jens Axboe