* LTTng tracing: ReplicatedPG::log_operation
@ 2014-12-02 18:17 Andreas Bluemle
2014-12-02 18:32 ` Gregory Farnum
0 siblings, 1 reply; 4+ messages in thread
From: Andreas Bluemle @ 2014-12-02 18:17 UTC (permalink / raw)
To: Ceph Development
Hi,
during code profiling using LTTng, I found that while
processing write requests to the cluster, the ceph-osd
spends a lot of time in ReplicatedPG::log_operation
before the actual writes to the journal and the object
in the FileStore are triggered.
This happens in ReplicatedBackend::submit_transaction.
What I wonder is
- what is the purpose of the log_operation?
If I am not mistaken, then it is neither the write-to-journal
nor the write-to-object; both of these are triggered from
the queue_operation following that log_operation.
- can the sequence between the log_operation and
the actual queue_operation be reversed in
ReplicatedBackend::submit_transaction?
Regards
Andreas Bluemle
--
Andreas Bluemle mailto:Andreas.Bluemle@itxperts.de
ITXperts GmbH http://www.itxperts.de
Balanstrasse 73, Geb. 08 Phone: (+49) 89 89044917
D-81541 Muenchen (Germany) Fax: (+49) 89 89044910
Company details: http://www.itxperts.de/imprint.htm

* Re: LTTng tracing: ReplicatedPG::log_operation
2014-12-02 18:17 LTTng tracing: ReplicatedPG::log_operation Andreas Bluemle
@ 2014-12-02 18:32 ` Gregory Farnum
2014-12-03 8:58 ` Andreas Bluemle
0 siblings, 1 reply; 4+ messages in thread
From: Gregory Farnum @ 2014-12-02 18:32 UTC (permalink / raw)
To: Andreas Bluemle; +Cc: Ceph Development

On Tue, Dec 2, 2014 at 10:17 AM, Andreas Bluemle
<andreas.bluemle@itxperts.de> wrote:
> What I wonder is
> - what is the purpose of the log_operation?
> If I am not mistaken, then it is neither the write-to-journal
> nor the write-to-object; both of these are triggered from
> the queue_operation following that log_operation.

This is setting up the changes to the pg log, and encoding them into
the transaction.

> - can the sequence between the log_operation and
> the actual queue_operation be reversed in
> ReplicatedBackend::submit_transaction?

Nope, it needs to go into the transaction and get journaled.
I'm kind of surprised this is a big time sink, but there is a lot of
encoding so if you're running against a fast system I suppose it could
be relatively large.
-Greg
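
The ordering constraint described in this reply can be illustrated with a minimal C++ sketch. This is not Ceph code: `Transaction`, `Store`, `log_operation`, and `submit` below are hypothetical stand-ins, and the sketch assumes only that queueing hands the store a snapshot of whatever has been encoded into the transaction so far. Under that assumption, a pg log entry appended after queueing never reaches the journal.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a transaction: an ordered list of encoded ops.
struct Transaction {
    std::vector<std::string> ops;
    void append(const std::string& op) { ops.push_back(op); }
};

// Hypothetical stand-in for the store: queueing journals a copy of the
// transaction as it exists at that moment.
struct Store {
    std::vector<Transaction> journal;
    void queue(const Transaction& t) { journal.push_back(t); }
};

// Stand-in for log_operation: encodes a pg log entry into the transaction.
void log_operation(Transaction& t, const std::string& entry) {
    t.append("pglog:" + entry);
}

// True if any journaled transaction carries a pg log entry.
bool journaled_with_log(const Store& s) {
    for (const auto& t : s.journal)
        for (const auto& op : t.ops)
            if (op.rfind("pglog:", 0) == 0) return true;  // prefix match
    return false;
}

// Builds one write transaction and queues it, logging either before or
// after the queue step. Returns whether the log entry was journaled.
bool submit(bool log_first) {
    Store store;
    Transaction t;
    t.append("write:object");
    if (log_first) {
        log_operation(t, "v1");
        store.queue(t);
    } else {
        store.queue(t);
        log_operation(t, "v1");  // too late: the journaled copy is unchanged
    }
    return journaled_with_log(store);
}
```

Calling `submit(true)` journals the log entry together with the write, while `submit(false)` leaves the journal without it, which is the situation the reply rules out.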

* Re: LTTng tracing: ReplicatedPG::log_operation
2014-12-02 18:32 ` Gregory Farnum
@ 2014-12-03 8:58 ` Andreas Bluemle
2014-12-04 2:17 ` Haomai Wang
0 siblings, 1 reply; 4+ messages in thread
From: Andreas Bluemle @ 2014-12-03 8:58 UTC (permalink / raw)
To: Gregory Farnum; +Cc: Ceph Development

Hi Gregory,

On Tue, 2 Dec 2014 10:32:50 -0800
Gregory Farnum <greg@gregs42.com> wrote:
> > - can the sequence between the log_operation and
> > the actual queue_operation be reversed in
> > ReplicatedBackend::submit_transaction?
>
> Nope, it needs to go into the transaction and get journaled.
> I'm kind of surprised this is a big time sink, but there is a lot of
> encoding so if you're running against a fast system I suppose it could
> be relatively large.

From what I see on my test system, adding the pg log entry consumes
about 60 microseconds, which is about 12 % of the overall time spent
for a write request on a replicating OSD. That overall time is
something like 460 microseconds between receipt of the MSG_OSD_SUBOP
at the messenger and the sending of the corresponding
MSG_OSD_SUBOPREPLY back to the primary OSD.

--
Andreas Bluemle
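
As a quick sanity check on the figures in this message, the share of the pg log step in the overall request time can be computed directly. Both inputs are rounded in the message, so the result is only approximate, and the function name `share_percent` is made up for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Share (in percent) of one stage within the total request latency.
// Both arguments use the same time unit, here microseconds.
double share_percent(double stage_us, double total_us) {
    return 100.0 * stage_us / total_us;
}
```

With the rounded values of 60 and 460 microseconds this gives roughly 13 %, which is consistent with the "about 12 %" reported once the rounding of both measurements is taken into account.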

* Re: LTTng tracing: ReplicatedPG::log_operation
2014-12-03 8:58 ` Andreas Bluemle
@ 2014-12-04 2:17 ` Haomai Wang
0 siblings, 0 replies; 4+ messages in thread
From: Haomai Wang @ 2014-12-04 2:17 UTC (permalink / raw)
To: Andreas Bluemle; +Cc: Gregory Farnum, Ceph Development

Yes, it gives a small but significant performance hit. But it is
necessary, and there may be room for optimization.

On Wed, Dec 3, 2014 at 4:58 PM, Andreas Bluemle
<andreas.bluemle@itxperts.de> wrote:
> From what I see on my test system, adding the pg log entry consumes
> about 60 microseconds - which is about 12 % of the overall time spent
> for a write request on a replicating OSD, which is sth. like 460
> microseconds between receipt of the MSG_OSD_SUBOP at the messenger
> until the corresponding MSG_OSD_SUBOPREPLY is sent back to the primary
> OSD.

--
Best Regards,
Wheat