From: Jens Axboe <axboe@kernel.dk>
To: Mark Kirkwood <mark.kirkwood@catalyst.net.nz>, fio@vger.kernel.org
Cc: "d.gollub@telekom.de >> Daniel Gollub" <d.gollub@telekom.de>,
"xan.peng" <xanpeng@gmail.com>
Subject: Re: fio rbd hang for block sizes > 1M
Date: Thu, 23 Oct 2014 23:35:10 -0600
Message-ID: <5449E50E.7000808@kernel.dk>
In-Reply-To: <5449BBB3.7090109@catalyst.net.nz>
CC'ing relevant parties, leaving email intact.
On 2014-10-23 20:38, Mark Kirkwood wrote:
> I stumbled across this while performance testing a new Ceph cluster:
>
> Env:
>
> Ceph 0.86-467-g317b83d (317b83dddd1a917f70838870b31931a79bdd4dd0)
> Ubuntu 14.04 (3.13.0-37-generic #64-Ubuntu SMP Mon Sep 22 21:28:38 UTC
> 2014 x86_64 x86_64 x86_64 GNU/Linux)
> Fio fio-2.1.13-88-gb2ee7
>
> Cmd:
>
> $ rbd ls -l
> NAME SIZE PARENT FMT PROT LOCK
> vol0 4096M 1
>
> $ fio read-test.fio # attached
> rbd_thread: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=rbd, iodepth=32
> fio-2.1.13-88-gb2ee7
> Starting 1 process
> rbd engine: RBD version: 0.1.8
> Killed1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
> 1158050441d:06h:59m:33s]
>
> A block size of 1M usually works; 2M and 4M always fail. The rbd volume
> should be written to first (just change read to write in the workload
> file). Note that 2-4M block sizes are fine for writes!
>
> Running the read variant under valgrind shows several invalid reads -
> only for these bigger block sizes, so I'm guessing they are the problem:
>
> $ valgrind fio read-test.fio
> ==12519== Memcheck, a memory error detector
> ==12519== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
> ==12519== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for
> copyright info
> ==12519== Command: fio read-test.fio
> ==12519==
> rbd_thread: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=rbd, iodepth=32
> fio-2.1.13-88-gb2ee7
> Starting 1 process
> rbd engine: RBD version: 0.1.8
> ==12519== Thread 6:
> ==12519== Invalid read of size 8
> ==12519== at 0x4EFA7B3: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1158)
> ==12519== by 0x4E965A7:
> librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*,
> unsigned long, unsigned long, Context*) (ImageCtx.cc:484)
> ==12519== by 0x4EAA9FA: librbd::aio_read(librbd::ImageCtx*,
> std::vector<std::pair<unsigned long, unsigned long>,
> std::allocator<std::pair<unsigned long, unsigned long> > > const&,
> char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3262)
> ==12519== by 0x4EAB872: librbd::aio_read(librbd::ImageCtx*, unsigned
> long, unsigned long, char*, ceph::buffer::list*, librbd::AioCompletion*)
> (internal.cc:3135)
> ==12519== by 0x4E8B737: rbd_aio_read (librbd.cc:1518)
> ==12519== by 0x459D92: fio_rbd_queue (rbd.c:294)
> ==12519== by 0x40D379: td_io_queue (ioengines.c:300)
> ==12519== by 0x44B77E: thread_main (backend.c:781)
> ==12519== by 0x81F6181: start_thread (pthread_create.c:312)
> ==12519== by 0x870AFBC: clone (clone.S:111)
> ==12519== Address 0x197b6fe0 is 48 bytes inside a block of size 264 free'd
> ==12519== at 0x4C2C2BC: operator delete(void*) (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==12519== by 0x4EFA7AE: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1149)
> ==12519== by 0x4E965A7:
> librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*,
> unsigned long, unsigned long, Context*) (ImageCtx.cc:484)
> ==12519== by 0x4EAA9FA: librbd::aio_read(librbd::ImageCtx*,
> std::vector<std::pair<unsigned long, unsigned long>,
> std::allocator<std::pair<unsigned long, unsigned long> > > const&,
> char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3262)
> ==12519== by 0x4EAB872: librbd::aio_read(librbd::ImageCtx*, unsigned
> long, unsigned long, char*, ceph::buffer::list*, librbd::AioCompletion*)
> (internal.cc:3135)
> ==12519== by 0x4E8B737: rbd_aio_read (librbd.cc:1518)
> ==12519== by 0x459D92: fio_rbd_queue (rbd.c:294)
> ==12519== by 0x40D379: td_io_queue (ioengines.c:300)
> ==12519== by 0x44B77E: thread_main (backend.c:781)
> ==12519== by 0x81F6181: start_thread (pthread_create.c:312)
> ==12519== by 0x870AFBC: clone (clone.S:111)
> ==12519==
> ==12519== Invalid read of size 8
> ==12519== at 0x4EFA7CD: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.h:170)
> ==12519== by 0x4E965A7:
> librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*,
> unsigned long, unsigned long, Context*) (ImageCtx.cc:484)
> ==12519== by 0x4EAA9FA: librbd::aio_read(librbd::ImageCtx*,
> std::vector<std::pair<unsigned long, unsigned long>,
> std::allocator<std::pair<unsigned long, unsigned long> > > const&,
> char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3262)
> ==12519== by 0x4EAB872: librbd::aio_read(librbd::ImageCtx*, unsigned
> long, unsigned long, char*, ceph::buffer::list*, librbd::AioCompletion*)
> (internal.cc:3135)
> ==12519== by 0x4E8B737: rbd_aio_read (librbd.cc:1518)
> ==12519== by 0x459D92: fio_rbd_queue (rbd.c:294)
> ==12519== by 0x40D379: td_io_queue (ioengines.c:300)
> ==12519== by 0x44B77E: thread_main (backend.c:781)
> ==12519== by 0x81F6181: start_thread (pthread_create.c:312)
> ==12519== by 0x870AFBC: clone (clone.S:111)
> ==12519== Address 0x197b6fe8 is 56 bytes inside a block of size 264 free'd
> ==12519== at 0x4C2C2BC: operator delete(void*) (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==12519== by 0x4EFA7AE: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1149)
> ==12519== by 0x4E965A7:
> librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*,
> unsigned long, unsigned long, Context*) (ImageCtx.cc:484)
> ==12519== by 0x4EAA9FA: librbd::aio_read(librbd::ImageCtx*,
> std::vector<std::pair<unsigned long, unsigned long>,
> std::allocator<std::pair<unsigned long, unsigned long> > > const&,
> char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3262)
> ==12519== by 0x4EAB872: librbd::aio_read(librbd::ImageCtx*, unsigned
> long, unsigned long, char*, ceph::buffer::list*, librbd::AioCompletion*)
> (internal.cc:3135)
> ==12519== by 0x4E8B737: rbd_aio_read (librbd.cc:1518)
> ==12519== by 0x459D92: fio_rbd_queue (rbd.c:294)
> ==12519== by 0x40D379: td_io_queue (ioengines.c:300)
> ==12519== by 0x44B77E: thread_main (backend.c:781)
> ==12519== by 0x81F6181: start_thread (pthread_create.c:312)
> ==12519== by 0x870AFBC: clone (clone.S:111)
> ==12519==
> ==12519== Thread 18:
> ==12519== Invalid read of size 8
> ==12519== at 0x4EFA7B3: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1158)
> ==12519== by 0x4F027BF: ObjectCacher::C_RetryRead::finish(int)
> (ObjectCacher.h:581)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EFF083: void finish_contexts<Context>(CephContext*,
> std::list<Context*, std::allocator<Context*> >&, int) (Context.h:120)
> ==12519== by 0x4EF489C: ObjectCacher::bh_read_finish(long, sobject_t,
> unsigned long, long, unsigned long, ceph::buffer::list&, int, bool)
> (ObjectCacher.cc:805)
> ==12519== by 0x4F01590: ObjectCacher::C_ReadFinish::finish(int)
> (ObjectCacher.h:504)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EB9BBC: librbd::C_Request::finish(int)
> (LibrbdWriteback.cc:54)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x53B64FC: librados::C_AioComplete::finish(int)
> (AioCompletionImpl.h:180)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x5452397: Finisher::finisher_thread_entry()
> (Finisher.cc:59)
> ==12519== Address 0x1a299710 is 48 bytes inside a block of size 264 free'd
> ==12519== at 0x4C2C2BC: operator delete(void*) (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==12519== by 0x4EFA7AE: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1149)
> ==12519== by 0x4F027BF: ObjectCacher::C_RetryRead::finish(int)
> (ObjectCacher.h:581)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EFF083: void finish_contexts<Context>(CephContext*,
> std::list<Context*, std::allocator<Context*> >&, int) (Context.h:120)
> ==12519== by 0x4EF489C: ObjectCacher::bh_read_finish(long, sobject_t,
> unsigned long, long, unsigned long, ceph::buffer::list&, int, bool)
> (ObjectCacher.cc:805)
> ==12519== by 0x4F01590: ObjectCacher::C_ReadFinish::finish(int)
> (ObjectCacher.h:504)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EB9BBC: librbd::C_Request::finish(int)
> (LibrbdWriteback.cc:54)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x53B64FC: librados::C_AioComplete::finish(int)
> (AioCompletionImpl.h:180)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519==
> ==12519== Invalid read of size 8
> ==12519== at 0x4EFA7CD: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.h:170)
> ==12519== by 0x4F027BF: ObjectCacher::C_RetryRead::finish(int)
> (ObjectCacher.h:581)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EFF083: void finish_contexts<Context>(CephContext*,
> std::list<Context*, std::allocator<Context*> >&, int) (Context.h:120)
> ==12519== by 0x4EF489C: ObjectCacher::bh_read_finish(long, sobject_t,
> unsigned long, long, unsigned long, ceph::buffer::list&, int, bool)
> (ObjectCacher.cc:805)
> ==12519== by 0x4F01590: ObjectCacher::C_ReadFinish::finish(int)
> (ObjectCacher.h:504)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EB9BBC: librbd::C_Request::finish(int)
> (LibrbdWriteback.cc:54)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x53B64FC: librados::C_AioComplete::finish(int)
> (AioCompletionImpl.h:180)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x5452397: Finisher::finisher_thread_entry()
> (Finisher.cc:59)
> ==12519== Address 0x1a299718 is 56 bytes inside a block of size 264 free'd
> ==12519== at 0x4C2C2BC: operator delete(void*) (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==12519== by 0x4EFA7AE: ObjectCacher::_readx(ObjectCacher::OSDRead*,
> ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1149)
> ==12519== by 0x4F027BF: ObjectCacher::C_RetryRead::finish(int)
> (ObjectCacher.h:581)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EFF083: void finish_contexts<Context>(CephContext*,
> std::list<Context*, std::allocator<Context*> >&, int) (Context.h:120)
> ==12519== by 0x4EF489C: ObjectCacher::bh_read_finish(long, sobject_t,
> unsigned long, long, unsigned long, ceph::buffer::list&, int, bool)
> (ObjectCacher.cc:805)
> ==12519== by 0x4F01590: ObjectCacher::C_ReadFinish::finish(int)
> (ObjectCacher.h:504)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x4EB9BBC: librbd::C_Request::finish(int)
> (LibrbdWriteback.cc:54)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519== by 0x53B64FC: librados::C_AioComplete::finish(int)
> (AioCompletionImpl.h:180)
> ==12519== by 0x4E8EBE8: Context::complete(int) (Context.h:64)
> ==12519==
--
Jens Axboe