From: Jevon Qiao <scaleqiao-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Robert LeBlanc <robert-4JaGZRWAfWbajFs6igw21g@public.gmane.org>
Cc: ceph-devel <ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Ceph-User <ceph-users-Qp0mS5GaXlQ@public.gmane.org>
Subject: Re: Any suggestion to deal with slow request?
Date: Fri, 8 Jan 2016 12:22:04 +0800 [thread overview]
Message-ID: <568F396C.9020700@gmail.com> (raw)
In-Reply-To: <CAANLjFrTf1CCm=zWJSpFU78WeODhfVeqbr7RxOz3Df4ojABZug-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
[-- Attachment #1.1: Type: text/plain, Size: 3958 bytes --]
Hi Robert,
Thank you for the prompt response.
The OSDs are built on XFS and the drives are Intel SSDs. Each SSD is
parted into two partitions, one is for journal, the other is for data.
There is no alignment issue for the partitions.
When slow request msg is outputted, the workload is quite light on the
replication OSDs.
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.50 30.00 0.00 0.18
12.33 0.00 0.08 0.08 0.25
sdb 0.00 0.50 0.50 78.00 0.00 0.75
19.57 0.09 1.20 0.08 0.60
sdc 0.00 0.50 0.00 28.00 0.00 0.24
17.75 0.01 0.32 0.11 0.30
I benchmarked some OSDs with 'ceph tell osd.x bench',and learned that
the throughput for some OSDs(the disk usage is over 60%) is 21MB/s,
which seems abnormal.
$ ceph tell osd.24 bench
{ "bytes_written": 1073741824,
"blocksize": 4194304,
"bytes_per_sec": "22995975.000000"}
But the throughput for some newly added OSDs can reach 370MB/s. I
suspect if it is related to the GC of SSD. If so, it might explain why
it takes such long time to write journal. Any idea?
Another phenomenon that the journal_write is queued in writeq for 3
seconds, I checked the corresponding process logic in function
FileJournal::submit_entry() and FileJournal::write_thread_entry(), I did
not find anything suspicious point.
Thanks,
Jevon
On 8/1/16 00:43, Robert LeBlanc wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> What is the file system on the OSDs? Anything interesting in
> iostat/atop? What are the drives backing the OSDs? A few more details
> would be helpful.
> - ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
>
> On Wed, Jan 6, 2016 at 9:03 PM, Jevon Qiao wrote:
>> Hi Cephers,
>>
>> We have a Ceph cluster running 0.80.9, which consists of 36 OSDs with 3
>> replicas. Recently, some OSDs keep reporting slow request and the cluster
>> has a performance downgrade.
>>
>> From the log of one OSD, I observe that all the slow requests are resulted
>> from waiting for the replicas to complete. And the replication OSDs are not
>> always some specific ones but could be any other two OSDs.
>>
>> 2016-01-06 08:17:11.887016 7f175ef25700 0 log [WRN] : slow request 1.162776
>> seconds old, received at 2016-01-06 08:17:11.887092:
>> osd_op(client.13302933.0:839452 rbd_data.c2659c728b0ddb.0000000000000024
>> [stat,set-alloc-hint object_size 16777216 write_size 16777216,write
>> 12099584~8192] 3.abd08522 ack+ondisk+write e4661) v4 currently waiting for
>> subops from 24,31
>>
>> I dumped out the historic Ops of the OSD and noticed the following
>> information:
>> 1) wait about 8 seconds for the replies from the replica OSDs.
>> { "time": "2016-01-06 08:17:03.879264",
>> "event": "op_applied"},
>> { "time": "2016-01-06 08:17:11.684598",
>> "event": "sub_op_applied_rec"},
>> { "time": "2016-01-06 08:17:11.687016",
>> "event": "sub_op_commit_rec"},
>>
>> 2) spend more than 3 seconds in writeq and 2 seconds to write the journal.
>> { "time": "2016-01-06 08:19:16.887519",
>> "event": "commit_queued_for_journal_write"},
>> { "time": "2016-01-06 08:19:20.109339",
>> "event": "write_thread_in_journal_buffer"},
>> { "time": "2016-01-06 08:19:22.177952",
>> "event": "journaled_completion_queued"},
>>
>> Any ideas or suggestions?
>>
>> BTW, I checked the underlying network with iperf, it works fine.
>>
>> Thanks,
>> Jevon
[-- Attachment #1.2: Type: text/html, Size: 4718 bytes --]
[-- Attachment #2: Type: text/plain, Size: 178 bytes --]
_______________________________________________
ceph-users mailing list
ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
prev parent reply other threads:[~2016-01-08 4:22 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-01-07 4:03 Any suggestion to deal with slow request? Jevon Qiao
2016-01-07 16:43 ` [ceph-users] " Robert LeBlanc
[not found] ` <CAANLjFrTf1CCm=zWJSpFU78WeODhfVeqbr7RxOz3Df4ojABZug-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-01-08 4:22 ` Jevon Qiao [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=568F396C.9020700@gmail.com \
--to=scaleqiao-re5jqeeqqe8avxtiumwx3w@public.gmane.org \
--cc=ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=ceph-users-Qp0mS5GaXlQ@public.gmane.org \
--cc=robert-4JaGZRWAfWbajFs6igw21g@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.