* ceph rbd crashes/stalls while random write 4k blocks
@ 2012-05-24 11:07 Stefan Priebe - Profihost AG
2012-05-24 12:12 ` Florian Haas
0 siblings, 1 reply; 7+ messages in thread
From: Stefan Priebe - Profihost AG @ 2012-05-24 11:07 UTC (permalink / raw)
To: ceph-devel@vger.kernel.org
Hi list,
I'm still testing ceph rbd with kvm. Right now I'm testing an rbd block
device within a network-booted kvm.
Sequential write/reads and random reads are fine. No problems so far.
But when I trigger lots of 4k random writes, all of them stall after a
short time and I get 0 iops and 0 transfer.
Command used:
fio --filename=/dev/vda --direct=1 --rw=randwrite --bs=4k --size=20G \
  --numjobs=50 --runtime=30 --group_reporting --name=file1
Then some time later I see this call trace:
INFO: task ceph-osd:3065 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ceph-osd D ffff8803b0e61d88 0 3065 1 0x00000004
ffff88032f3ab7f8 0000000000000086 ffff8803bffdac08 ffff880300000000
ffff8803b0e61820 0000000000010800 ffff88032f3abfd8 ffff88032f3aa010
ffff88032f3abfd8 0000000000010800 ffffffff81a0b020 ffff8803b0e61820
Call Trace:
[<ffffffff815e0e1a>] schedule+0x3a/0x60
[<ffffffff815e127d>] schedule_timeout+0x1fd/0x2e0
[<ffffffff812696c4>] ? xfs_iext_bno_to_ext+0x84/0x160
[<ffffffff81074db1>] ? down_trylock+0x31/0x50
[<ffffffff812696c4>] ? xfs_iext_bno_to_ext+0x84/0x160
[<ffffffff815e20b9>] __down+0x69/0xb0
[<ffffffff8128c4a6>] ? _xfs_buf_find+0xf6/0x280
[<ffffffff81074e6b>] down+0x3b/0x50
[<ffffffff8128b7b0>] xfs_buf_lock+0x40/0xe0
[<ffffffff8128c4a6>] _xfs_buf_find+0xf6/0x280
[<ffffffff8128c689>] xfs_buf_get+0x59/0x190
[<ffffffff8128ccf7>] xfs_buf_read+0x27/0x100
[<ffffffff81282f97>] xfs_trans_read_buf+0x1e7/0x420
[<ffffffff81239371>] xfs_read_agf+0x61/0x1a0
[<ffffffff812394e4>] xfs_alloc_read_agf+0x34/0xd0
[<ffffffff8123c877>] xfs_alloc_fix_freelist+0x3f7/0x470
[<ffffffff81288005>] ? kmem_free+0x35/0x40
[<ffffffff8127ff6e>] ? xfs_trans_free_item_desc+0x2e/0x30
[<ffffffff812800a7>] ? xfs_trans_free_items+0x87/0xb0
[<ffffffff8127cc73>] ? xfs_perag_get+0x33/0xb0
[<ffffffff8123c97f>] ? xfs_free_extent+0x8f/0x120
[<ffffffff8123c990>] xfs_free_extent+0xa0/0x120
[<ffffffff81287f07>] ? kmem_zone_alloc+0x77/0xf0
[<ffffffff81245ead>] xfs_bmap_finish+0x15d/0x1a0
[<ffffffff8126d15e>] xfs_itruncate_finish+0x15e/0x340
[<ffffffff81285495>] xfs_setattr+0x365/0x980
[<ffffffff812926e6>] xfs_vn_setattr+0x16/0x20
[<ffffffff8111e0ad>] notify_change+0x11d/0x300
[<ffffffff81103ccc>] do_truncate+0x5c/0x90
[<ffffffff8110ea35>] ? get_write_access+0x15/0x50
[<ffffffff81103ef7>] sys_truncate+0x127/0x130
[<ffffffff815e367b>] system_call_fastpath+0x16/0x1b
INFO: task flush-8:16:3089 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
flush-8:16 D ffff8803af0d9d88 0 3089 2 0x00000000
ffff88032e835940 0000000000000046 0000000100000fe0 ffff880300000000
ffff8803af0d9820 0000000000010800 ffff88032e835fd8 ffff88032e834010
ffff88032e835fd8 0000000000010800 ffff8803b0f7e080 ffff8803af0d9820
Call Trace:
[<ffffffff810be570>] ? __lock_page+0x70/0x70
[<ffffffff815e0e1a>] schedule+0x3a/0x60
[<ffffffff815e0ec7>] io_schedule+0x87/0xd0
[<ffffffff810be579>] sleep_on_page+0x9/0x10
[<ffffffff815e1412>] __wait_on_bit_lock+0x52/0xb0
[<ffffffff810be562>] __lock_page+0x62/0x70
[<ffffffff8106fb80>] ? autoremove_wake_function+0x40/0x40
[<ffffffff810c8fd0>] ? pagevec_lookup_tag+0x20/0x30
[<ffffffff810c7f66>] write_cache_pages+0x386/0x4d0
[<ffffffff810c6c10>] ? set_page_dirty+0x70/0x70
[<ffffffff810fd7ab>] ? kmem_cache_free+0x1b/0xe0
[<ffffffff810c80fc>] generic_writepages+0x4c/0x70
[<ffffffff81288bcf>] xfs_vm_writepages+0x4f/0x60
[<ffffffff810c813c>] do_writepages+0x1c/0x40
[<ffffffff81128854>] writeback_single_inode+0xf4/0x260
[<ffffffff81128c45>] writeback_sb_inodes+0xe5/0x1b0
[<ffffffff811290a8>] writeback_inodes_wb+0x98/0x160
[<ffffffff81129ac3>] wb_writeback+0x2f3/0x460
[<ffffffff815e089e>] ? __schedule+0x3ae/0x850
[<ffffffff8105df47>] ? lock_timer_base+0x37/0x70
[<ffffffff81129e4f>] wb_do_writeback+0x21f/0x270
[<ffffffff81129f3a>] bdi_writeback_thread+0x9a/0x230
[<ffffffff81129ea0>] ? wb_do_writeback+0x270/0x270
[<ffffffff81129ea0>] ? wb_do_writeback+0x270/0x270
[<ffffffff8106f646>] kthread+0x96/0xa0
[<ffffffff815e46d4>] kernel_thread_helper+0x4/0x10
[<ffffffff8106f5b0>] ? kthread_worker_fn+0x130/0x130
[<ffffffff815e46d0>] ? gs_change+0xb/0xb
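For anyone who wants to pull the affected tasks out of a longer kernel log, a minimal Python sketch follows. It is not part of the original report: the function name find_hung_tasks and the regular expression are illustrative assumptions, derived only from the two "blocked for more than" lines above.

```python
import re

# Matches the hung-task header line; the greedy name group keeps task
# names that themselves contain colons (e.g. "flush-8:16") intact.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return (task name, pid, seconds) for each hung-task report."""
    return [
        (m.group("name"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]


# The two header lines from the trace above, used as sample input.
sample = """\
INFO: task ceph-osd:3065 blocked for more than 120 seconds.
INFO: task flush-8:16:3089 blocked for more than 120 seconds.
"""

hung = find_hung_tasks(sample)
for name, pid, secs in hung:
    print(f"{name} (pid {pid}) blocked for more than {secs}s")
```

In practice the input would be the text of dmesg or a saved kernel log rather than the inline sample.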
Stefan
* Re: ceph rbd crashes/stalls while random write 4k blocks
From: Florian Haas @ 2012-05-24 12:12 UTC (permalink / raw)
To: Stefan Priebe - Profihost AG; +Cc: ceph-devel@vger.kernel.org
Stefan,
On 05/24/12 13:07, Stefan Priebe - Profihost AG wrote:
> Hi list,
>
> I'm still testing ceph rbd with kvm. Right now I'm testing an rbd block
> device within a network-booted kvm.
>
> Sequential write/reads and random reads are fine. No problems so far.
>
> But when I trigger lots of 4k random writes, all of them stall after a
> short time and I get 0 iops and 0 transfer.
>
> Command used:
> fio --filename=/dev/vda --direct=1 --rw=randwrite --bs=4k --size=20G \
>   --numjobs=50 --runtime=30 --group_reporting --name=file1
>
> Then some time later I see this call trace:
>
> INFO: task ceph-osd:3065 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> ceph-osd D ffff8803b0e61d88 0 3065 1 0x00000004
> ffff88032f3ab7f8 0000000000000086 ffff8803bffdac08 ffff880300000000
> ffff8803b0e61820 0000000000010800 ffff88032f3abfd8 ffff88032f3aa010
> ffff88032f3abfd8 0000000000010800 ffffffff81a0b020 ffff8803b0e61820
> Call Trace:
> [<ffffffff815e0e1a>] schedule+0x3a/0x60
> [<ffffffff815e127d>] schedule_timeout+0x1fd/0x2e0
> [<ffffffff812696c4>] ? xfs_iext_bno_to_ext+0x84/0x160
> [<ffffffff81074db1>] ? down_trylock+0x31/0x50
> [<ffffffff812696c4>] ? xfs_iext_bno_to_ext+0x84/0x160
> [<ffffffff815e20b9>] __down+0x69/0xb0
> [<ffffffff8128c4a6>] ? _xfs_buf_find+0xf6/0x280
> [<ffffffff81074e6b>] down+0x3b/0x50
Sorry I'm coming a bit late to the various threads you've posted
recently, but on this particular issue: what kernel are your OSDs
running on, and do these hung tasks occur if you're using a local
filesystem other than XFS?
As of late XFS has occasionally been producing seemingly random kernel
hangs. Your call trace doesn't have the signature entries from xfssyncd
that identify a particular problem that I've been struggling with
lately, but you just might be affected by some other effect of the same
root issue.
Take a look at these to see if anything looks familiar:
http://oss.sgi.com/bugzilla/show_bug.cgi?id=922
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/979498
http://oss.sgi.com/archives/xfs/2011-11/msg00400.html
Not sure if this helps at all; just thought I might pitch that in.
Cheers,
Florian
* Re: ceph rbd crashes/stalls while random write 4k blocks
From: Stefan Priebe - Profihost AG @ 2012-05-24 14:09 UTC (permalink / raw)
To: Florian Haas; +Cc: ceph-devel@vger.kernel.org
On 24.05.2012 14:12, Florian Haas wrote:
> Stefan,
> Sorry I'm coming a bit late to the various threads you've posted
> recently, but on this particular issue: what kernel are your OSDs
> running on, and do these hung tasks occur if you're using a local
> filesystem other than XFS?
OSDs run 3.0.30, but I tried 3.3.7 too: no difference (regarding the XFS
crash and random writes).
I just tried btrfs with a 3.4 kernel and the patch posted yesterday.
But with kernel 3.4 the performance is generally pretty low, no matter
whether I use xfs or btrfs:
~# rados -p data bench 10 write -t 16
Maintaining 16 concurrent writes of 4194304 bytes for at least 10 seconds.
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0       0         0         0         0         0         -         0
    1      16        35        19   75.9824        76  0.294869  0.376607
    2      16        51        35   69.9844        64  0.103118  0.345375
    3      16        72        56    74.652        84  0.113909    0.5364
    4      16        88        72   71.9866        64  0.641818  0.786378
    5      16        95        79   63.1887        28  0.131084  0.737699
    6      16       113        97   64.6553        72  0.232688  0.851319
    7      16       129       113   64.5604        64   0.35199  0.822971
    8      16       148       132   65.9888        76   0.09892  0.739852
    9      16       149       133   59.1007         4  0.833541  0.740556
   10      16       157       141   56.3899        32  0.101306  0.715187
   11      16       157       141   51.2634         0         -  0.715187
   12      16       157       141   46.9914         0         -  0.715187
   13      16       157       141   43.3766         0         -  0.715187
   14      16       157       141   40.2782         0         -  0.715187
   15      16       157       141    37.593         0         -  0.715187
   16      16       157       141   35.2434         0         -  0.715187
Total time run: 16.471636
Total writes made: 158
Write size: 4194304
Bandwidth (MB/sec): 38.369
Average Latency: 1.66534
Max latency: 13.554
Min latency: 0.095194
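As a rough cross-check of the numbers above, the following Python sketch recomputes the bandwidth figures and lists the seconds with no completed writes. It is an illustration only: the helper name bandwidth_mb_per_sec is made up, and it assumes the MB/s columns count each 4194304-byte write as 4 MB, which is consistent with the reported values.

```python
WRITE_SIZE_BYTES = 4194304  # "Write size" from the bench summary
MIB = 1024 * 1024


def bandwidth_mb_per_sec(writes, seconds, write_size=WRITE_SIZE_BYTES):
    """Average bandwidth for `writes` completed writes over `seconds`."""
    return writes * write_size / MIB / seconds


# Summary line: 158 writes over 16.471636 s (reported: 38.369 MB/sec).
overall = bandwidth_mb_per_sec(158, 16.471636)

# Row for sec 10: 141 finished writes (reported avg: 56.3899 MB/s; the
# small difference suggests slightly more than 10 s had elapsed).
at_10s = bandwidth_mb_per_sec(141, 10)

# "cur MB/s" column for seconds 1..16, copied from the output above.
cur_mb_per_sec = [76, 64, 84, 64, 28, 72, 64, 76, 4, 32, 0, 0, 0, 0, 0, 0]
stalled_seconds = [
    sec for sec, mbs in enumerate(cur_mb_per_sec, start=1) if mbs == 0
]

print(f"overall: {overall:.3f} MB/s, at 10 s: {at_10s:.1f} MB/s")
print(f"seconds with no throughput: {stalled_seconds}")
```

Read this way, the falling average in the last six rows comes purely from time passing with no further writes finishing, not from slower individual writes.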
> As of late XFS has occasionally been producing seemingly random kernel
> hangs. Your call trace doesn't have the signature entries from xfssyncd
> that identify a particular problem that I've been struggling with
> lately, but you just might be affected by some other effect of the same
> root issue.
>
> Take a look at these to see if anything looks familiar:
>
> http://oss.sgi.com/bugzilla/show_bug.cgi?id=922
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/979498
> http://oss.sgi.com/archives/xfs/2011-11/msg00400.html
These are solved by using 3.0.20.
Stefan
* Re: ceph rbd crashes/stalls while random write 4k blocks
From: Florian Haas @ 2012-05-24 14:19 UTC (permalink / raw)
To: Stefan Priebe - Profihost AG; +Cc: ceph-devel@vger.kernel.org
On Thu, May 24, 2012 at 4:09 PM, Stefan Priebe - Profihost AG
<s.priebe@profihost.ag> wrote:
>> Take a look at these to see if anything looks familiar:
>>
>> http://oss.sgi.com/bugzilla/show_bug.cgi?id=922
>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/979498
>> http://oss.sgi.com/archives/xfs/2011-11/msg00400.html
>
> These are solved by using 3.0.20.
... or so Christoph says, but comment #4 in bug 922 seems to indicate otherwise.
Florian
--
Need help with High Availability?
http://www.hastexo.com/now
* Re: ceph rbd crashes/stalls while random write 4k blocks
From: Stefan Priebe - Profihost AG @ 2012-05-25 6:47 UTC (permalink / raw)
To: Florian Haas; +Cc: ceph-devel@vger.kernel.org
On 24.05.2012 16:19, Florian Haas wrote:
> On Thu, May 24, 2012 at 4:09 PM, Stefan Priebe - Profihost AG
> <s.priebe@profihost.ag> wrote:
>>> Take a look at these to see if anything looks familiar:
>>>
>>> http://oss.sgi.com/bugzilla/show_bug.cgi?id=922
>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/979498
>>> http://oss.sgi.com/archives/xfs/2011-11/msg00400.html
>>
>> These are solved by using 3.0.20.
>
> ... or so Christoph says, but comment #4 in bug 922 seems to indicate otherwise.
I'm sorry, you're absolutely right. But XFS had some regressions with
xlog_grant_log_space since 2.6.28, which were fixed in 3.0.X by reverting
to a kernel thread instead of workers. I was working with Christoph
and Dave on this problem and it took me nearly a whole month to track
that down (git commit c7eead1e118fb7e34ee8f5063c3c090c054c3820). In this
case (#922) it seems it really is related to a too-small log. But I
don't have a too-small log in my ceph case ;-)
Stefan
* Re: ceph rbd crashes/stalls while random write 4k blocks
From: Florian Haas @ 2012-05-25 7:33 UTC (permalink / raw)
To: Stefan Priebe - Profihost AG; +Cc: ceph-devel@vger.kernel.org
On Fri, May 25, 2012 at 8:47 AM, Stefan Priebe - Profihost AG
<s.priebe@profihost.ag> wrote:
> On 24.05.2012 16:19, Florian Haas wrote:
>> On Thu, May 24, 2012 at 4:09 PM, Stefan Priebe - Profihost AG
>> <s.priebe@profihost.ag> wrote:
>>>> Take a look at these to see if anything looks familiar:
>>>>
>>>> http://oss.sgi.com/bugzilla/show_bug.cgi?id=922
>>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/979498
>>>> http://oss.sgi.com/archives/xfs/2011-11/msg00400.html
>>>
>>> These are solved by using 3.0.20.
>>
>> ... or so Christoph says, but comment #4 in bug 922 seems to indicate otherwise.
>
> I'm sorry, you're absolutely right. But XFS had some regressions with
> xlog_grant_log_space since 2.6.28, which were fixed in 3.0.X by reverting
> to a kernel thread instead of workers. I was working with Christoph
> and Dave on this problem and it took me nearly a whole month to track
> that down (git commit c7eead1e118fb7e34ee8f5063c3c090c054c3820). In this
> case (#922) it seems it really is related to a too-small log. But I
> don't have a too-small log in my ceph case ;-)
Hmmm. So what's Chinner saying about this one? Should we move this
discussion to an XFS list?
Cheers,
Florian
--
Need help with High Availability?
http://www.hastexo.com/now
* Re: ceph rbd crashes/stalls while random write 4k blocks
From: Stefan Priebe - Profihost AG @ 2012-05-25 7:35 UTC (permalink / raw)
To: Florian Haas; +Cc: ceph-devel@vger.kernel.org
On 25.05.2012 09:33, Florian Haas wrote:
> On Fri, May 25, 2012 at 8:47 AM, Stefan Priebe - Profihost AG
> <s.priebe@profihost.ag> wrote:
>> On 24.05.2012 16:19, Florian Haas wrote:
>>> On Thu, May 24, 2012 at 4:09 PM, Stefan Priebe - Profihost AG
>>> <s.priebe@profihost.ag> wrote:
>>>>> Take a look at these to see if anything looks familiar:
>>>>>
>>>>> http://oss.sgi.com/bugzilla/show_bug.cgi?id=922
>>>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/979498
>>>>> http://oss.sgi.com/archives/xfs/2011-11/msg00400.html
>>>>
>>>> These are solved by using 3.0.20.
>>>
>>> ... or so Christoph says, but comment #4 in bug 922 seems to indicate otherwise.
>>
>> I'm sorry, you're absolutely right. But XFS had some regressions with
>> xlog_grant_log_space since 2.6.28, which were fixed in 3.0.X by reverting
>> to a kernel thread instead of workers. I was working with Christoph
>> and Dave on this problem and it took me nearly a whole month to track
>> that down (git commit c7eead1e118fb7e34ee8f5063c3c090c054c3820). In this
>> case (#922) it seems it really is related to a too-small log. But I
>> don't have a too-small log in my ceph case ;-)
>
> Hmmm. So what's Chinner saying about this one? Should we move this
> discussion to an XFS list?
I already sent the trace to Christoph, Dave and the XFS list. Sadly, no
reply.
Stefan