From: Kasper Dieter <dieter.kasper@ts.fujitsu.com>
To: "ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Cc: Kasper Dieter <Dieter.Kasper@ts.fujitsu.com>
Subject: krbd kernel 3.16.0-1 with v0.83 got stuck during write
Date: Thu, 7 Aug 2014 19:36:46 +0200
Message-ID: <20140807173646.GA1130@oder.mch.fsc.net>
In-Reply-To: <alpine.DEB.2.00.1408050653170.18917@cobra.newdream.net>
Hi,
I'm running a 3-node cluster with 126 OSDs in total under CentOS 6.5 with
ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8).
The client side is 0.83, too,
with kernel 3.16.0-1.el6.elrepo.x86_64.
rbd showmapped
id pool image snap device
0 SAS-r2 sas2-r2-1T-4m.0 - /dev/rbd0
1 SAS-r2 sas2-r2-1T-4m.1 - /dev/rbd1
2 SAS-r2 sas2-r2-1T-4m.2 - /dev/rbd2
After a couple of minutes (trying to fill the 1TB volume)
fio --filename=/dev/rbd0 --direct=1 --rw=write --bs=8M --size=8G --numjobs=128 --offset_increment=8G --runtime=3600 --group_reporting --name=file1
got stuck.
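For reference, a rough sketch of how the fio options above carve up the device. It assumes fio's documented behaviour that each job starts at offset + offset_increment * job index, and that "G" means GiB (fio's default kb_base of 1024). The helper name job_ranges is made up for illustration only.

```python
GIB = 1024 ** 3  # assumption: fio's "G" suffix is GiB (default kb_base=1024)


def job_ranges(numjobs, size, offset_increment, offset=0):
    """Return the (start, end) byte range each fio job writes.

    Hypothetical helper: job j starts at offset + offset_increment * j
    and writes `size` bytes from there.
    """
    ranges = []
    for j in range(numjobs):
        start = offset + offset_increment * j
        ranges.append((start, start + size))
    return ranges


# Values taken from the fio command line in this report.
ranges = job_ranges(numjobs=128, size=8 * GIB, offset_increment=8 * GIB)
total = sum(end - start for start, end in ranges)

print(f"jobs: {len(ranges)}")
print(f"last job: {ranges[-1][0] // GIB} GiB .. {ranges[-1][1] // GIB} GiB")
print(f"total written: {total // GIB} GiB")
```

So the 128 jobs write adjacent, non-overlapping 8 GiB slices, 1024 GiB in total, which is what "fill the 1TB volume" refers to. With --bs=8M and --direct=1, that also means up to 128 large writes in flight against /dev/rbd0 at once.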
/var/log/messages:
(...)
Aug 7 19:22:34 rx37-0 kernel: libceph: osd118 192.168.113.54:6902 socket closed (con state OPEN)
Aug 7 19:22:34 rx37-0 kernel: libceph: osd40 192.168.113.52:6920 socket closed (con state OPEN)
Aug 7 19:22:34 rx37-0 kernel: libceph: osd109 192.168.113.54:6875 socket closed (con state OPEN)
Aug 7 19:22:34 rx37-0 kernel: libceph: osd67 192.168.113.53:6875 socket closed (con state OPEN)
Aug 7 19:22:34 rx37-0 kernel: libceph: osd37 192.168.113.52:6911 socket closed (con state OPEN)
Aug 7 19:22:34 rx37-0 kernel: libceph: osd98 192.168.113.54:6842 socket closed (con state OPEN)
Aug 7 19:22:34 rx37-0 kernel: libceph: osd26 192.168.113.52:6878 socket closed (con state OPEN)
Aug 7 19:24:43 rx37-0 kernel: INFO: task kworker/2:0:19 blocked for more than 120 seconds.
Aug 7 19:24:43 rx37-0 kernel: Not tainted 3.16.0-1.el6.elrepo.x86_64 #1
Aug 7 19:24:43 rx37-0 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 7 19:24:43 rx37-0 kernel: kworker/2:0 D 0000000000000002 0 19 2 0x00000000
Aug 7 19:24:43 rx37-0 kernel: Workqueue: ceph-msgr con_work [libceph]
Aug 7 19:24:43 rx37-0 kernel: ffff8810307bfb68 0000000000000046 ffff8810307bfb18 ffff8810307bc010
Aug 7 19:24:43 rx37-0 kernel: 0000000000014380 0000000000014380 ffff8810307ae390 ffff880079678250
Aug 7 19:24:43 rx37-0 kernel: 0000003500004040 ffff88102a1fd7c8 ffff88102a1fd7cc ffff8810307ae390
Aug 7 19:24:43 rx37-0 kernel: Call Trace:
Aug 7 19:24:43 rx37-0 kernel: [<ffffffff81647629>] schedule+0x29/0x70
Aug 7 19:24:43 rx37-0 kernel: [<ffffffff8164778e>] schedule_preempt_disabled+0xe/0x10
Aug 7 19:24:43 rx37-0 kernel: [<ffffffff816490fb>] __mutex_lock_slowpath+0xdb/0x1d0
Aug 7 19:24:43 rx37-0 kernel: [<ffffffff81649213>] mutex_lock+0x23/0x40
Aug 7 19:24:43 rx37-0 kernel: [<ffffffffa0615e0f>] get_reply+0x3f/0x200 [libceph]
Aug 7 19:24:43 rx37-0 kernel: [<ffffffffa0616058>] alloc_msg+0x88/0x90 [libceph]
Aug 7 19:24:43 rx37-0 kernel: [<ffffffffa060d8f1>] ceph_con_in_msg_alloc+0x71/0x240 [libceph]
Aug 7 19:24:43 rx37-0 kernel: [<ffffffffa060eba8>] read_partial_message+0x1e8/0x3d0 [libceph]
Aug 7 19:24:43 rx37-0 kernel: [<ffffffffa060d278>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
Aug 7 19:24:43 rx37-0 kernel: [<ffffffffa06101d6>] try_read+0x2b6/0x430 [libceph]
Aug 7 19:24:43 rx37-0 kernel: [<ffffffffa0610688>] con_work+0x78/0x220 [libceph]
Aug 7 19:24:43 rx37-0 kernel: [<ffffffff8108d60c>] process_one_work+0x17c/0x420
Aug 7 19:24:43 rx37-0 kernel: [<ffffffff8108e7d3>] worker_thread+0x123/0x420
Aug 7 19:24:43 rx37-0 kernel: [<ffffffff8108e6b0>] ? maybe_create_worker+0x180/0x180
Aug 7 19:24:43 rx37-0 kernel: [<ffffffff810943be>] kthread+0xce/0xf0
Aug 7 19:24:43 rx37-0 kernel: [<ffffffff810942f0>] ? kthread_freezable_should_stop+0x70/0x70
Aug 7 19:24:43 rx37-0 kernel: [<ffffffff8164ae3c>] ret_from_fork+0x7c/0xb0
Aug 7 19:24:43 rx37-0 kernel: [<ffffffff810942f0>] ? kthread_freezable_should_stop+0x70/0x70
Aug 7 19:24:43 rx37-0 kernel: INFO: task kworker/3:0:24 blocked for more than 120 seconds.
Aug 7 19:24:43 rx37-0 kernel: Not tainted 3.16.0-1.el6.elrepo.x86_64 #1
Aug 7 19:24:43 rx37-0 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 7 19:24:43 rx37-0 kernel: kworker/3:0 D 0000000000000003 0 24 2 0x00000000
Aug 7 19:24:43 rx37-0 kernel: Workqueue: ceph-msgr con_work [libceph]
Aug 7 19:24:43 rx37-0 kernel: ffff881030027c98 0000000000000046 ffff881019afe330 ffff881030024010
(...)
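In case it helps with triage, here is a small sketch for summarising a log like the one above: it counts "socket closed" events per OSD host and lists the tasks flagged by the hung-task detector. The regular expressions are only my guess at the line layout seen in this excerpt, not a format defined by libceph, and the function name summarize is arbitrary.

```python
import re
from collections import Counter

# A few lines copied from the excerpt above, used as sample input.
SAMPLE = """\
Aug 7 19:22:34 rx37-0 kernel: libceph: osd118 192.168.113.54:6902 socket closed (con state OPEN)
Aug 7 19:22:34 rx37-0 kernel: libceph: osd40 192.168.113.52:6920 socket closed (con state OPEN)
Aug 7 19:22:34 rx37-0 kernel: libceph: osd67 192.168.113.53:6875 socket closed (con state OPEN)
Aug 7 19:24:43 rx37-0 kernel: INFO: task kworker/2:0:19 blocked for more than 120 seconds.
Aug 7 19:24:43 rx37-0 kernel: INFO: task kworker/3:0:24 blocked for more than 120 seconds.
"""

# Assumed patterns, derived only from the lines shown in this report.
SOCKET_CLOSED = re.compile(
    r"libceph: (osd\d+) (\d+\.\d+\.\d+\.\d+):\d+ socket closed"
)
HUNG_TASK = re.compile(
    r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds"
)


def summarize(text):
    """Return (socket-closed count per OSD host, list of (task, pid))."""
    closed_per_host = Counter()
    hung = []
    for line in text.splitlines():
        m = SOCKET_CLOSED.search(line)
        if m:
            closed_per_host[m.group(2)] += 1
            continue
        m = HUNG_TASK.search(line)
        if m:
            hung.append((m.group(1), int(m.group(2))))
    return closed_per_host, hung


closed, hung = summarize(SAMPLE)
for host, count in sorted(closed.items()):
    print(f"{host}: {count} socket closed")
for task, pid in hung:
    print(f"hung: {task} (pid {pid})")
```

Run against the full /var/log/messages, this should show whether the closed sockets are spread across all three nodes and how many ceph-msgr workers end up blocked.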
Any ideas?
With kernel 3.10.32 on the client side everything worked fine.
Best regards
Dieter Kasper