Hi Cephers,
We have a Ceph cluster running 0.80.9, which consists of 36 OSDs with 3
replicas. Recently, some OSDs have been reporting slow requests and the
cluster's performance has degraded.
From the log of one OSD, I can see that all the slow requests are caused by
waiting for the replicas to complete. The replica OSDs involved are not a
fixed set; they can be any two other OSDs.
2016-01-06 08:17:11.887016 7f175ef25700 0 log [WRN] : slow request 1.162776
seconds old, received at 2016-01-06 08:17:11.887092:
osd_op(client.13302933.0:839452 rbd_data.c2659c728b0ddb.0000000000000024
[stat,set-alloc-hint object_size 16777216 write_size 16777216,write
12099584~8192] 3.abd08522 ack+ondisk+write e4661) v4 currently waiting for
subops from 24,31
I dumped the historic ops of the OSD and noticed the following:
1) The op waits about 8 seconds for the replies from the replica OSDs.
{ "time": "2016-01-06 08:17:03.879264",
"event": "op_applied"},
{ "time": "2016-01-06 08:17:11.684598",
"event": "sub_op_applied_rec"},
{ "time": "2016-01-06 08:17:11.687016",
"event": "sub_op_commit_rec"},
2) Another op spends more than 3 seconds in the writeq and about 2 seconds
writing the journal.
{ "time": "2016-01-06 08:19:16.887519",
"event": "commit_queued_for_journal_write"},
{ "time": "2016-01-06 08:19:20.109339",
"event": "write_thread_in_journal_buffer"},
{ "time": "2016-01-06 08:19:22.177952",
"event": "journaled_completion_queued"},
Any ideas or suggestions?
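For reference, the gaps quoted in 1) and 2) can be recomputed from the event
timestamps with a short script. This is only a minimal sketch: it assumes the
events have already been copied into Python lists of dicts with "time" and
"event" keys, as in the excerpts above, and the helper name event_gaps is
hypothetical (it is not part of Ceph).

```python
from datetime import datetime

# Timestamp layout used in the historic-ops excerpts above.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def event_gaps(events):
    """Return (previous_event, event, seconds_between) for consecutive events."""
    gaps = []
    for prev, cur in zip(events, events[1:]):
        t0 = datetime.strptime(prev["time"], FMT)
        t1 = datetime.strptime(cur["time"], FMT)
        gaps.append((prev["event"], cur["event"], (t1 - t0).total_seconds()))
    return gaps


# Events copied from excerpt 1) (waiting for the replicas).
replica_wait = [
    {"time": "2016-01-06 08:17:03.879264", "event": "op_applied"},
    {"time": "2016-01-06 08:17:11.684598", "event": "sub_op_applied_rec"},
    {"time": "2016-01-06 08:17:11.687016", "event": "sub_op_commit_rec"},
]

# Events copied from excerpt 2) (journal write path).
journal_wait = [
    {"time": "2016-01-06 08:19:16.887519", "event": "commit_queued_for_journal_write"},
    {"time": "2016-01-06 08:19:20.109339", "event": "write_thread_in_journal_buffer"},
    {"time": "2016-01-06 08:19:22.177952", "event": "journaled_completion_queued"},
]

if __name__ == "__main__":
    for label, events in (("replica wait", replica_wait), ("journal", journal_wait)):
        print(label)
        for prev_event, event, seconds in event_gaps(events):
            print(f"  {prev_event} -> {event}: {seconds:.3f} s")
```

With the timestamps above, this gives roughly 7.8 s between op_applied and
sub_op_applied_rec, 3.2 s before write_thread_in_journal_buffer, and 2.1 s
before journaled_completion_queued.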
BTW, I checked the underlying network with iperf, and it works fine.
Thanks,
Jevon