* LTTng tracing: ReplicatedPG::log_operation
@ 2014-12-02 18:17 Andreas Bluemle
2014-12-02 18:32 ` Gregory Farnum
0 siblings, 1 reply; 4+ messages in thread
From: Andreas Bluemle @ 2014-12-02 18:17 UTC (permalink / raw)
To: Ceph Development
Hi,
during code profiling using LTTng, I found that while
processing write requests to the cluster, the ceph-osd
spends a lot of time in ReplicatedPG::log_operation
before the actual writes to the journal and to the object
in the FileStore are triggered.
This happens in ReplicatedBackend::submit_transaction.
What I wonder is
- what is the purpose of the log_operation?
If I am not mistaken, then it is neither the write-to-journal
nor the write-to-object; both of these are triggered from
the queue_operation following that log_operation.
- can the sequence between the log_operation and
the actual queue_operation be reversed in
ReplicatedBackend::submit_transaction?
Regards
Andreas Bluemle
--
Andreas Bluemle mailto:Andreas.Bluemle@itxperts.de
ITXperts GmbH http://www.itxperts.de
Balanstrasse 73, Geb. 08 Phone: (+49) 89 89044917
D-81541 Muenchen (Germany) Fax: (+49) 89 89044910
Company details: http://www.itxperts.de/imprint.htm
* Re: LTTng tracing: ReplicatedPG::log_operation
2014-12-02 18:17 LTTng tracing: ReplicatedPG::log_operation Andreas Bluemle
@ 2014-12-02 18:32 ` Gregory Farnum
2014-12-03 8:58 ` Andreas Bluemle
0 siblings, 1 reply; 4+ messages in thread
From: Gregory Farnum @ 2014-12-02 18:32 UTC (permalink / raw)
To: Andreas Bluemle; +Cc: Ceph Development
On Tue, Dec 2, 2014 at 10:17 AM, Andreas Bluemle
<andreas.bluemle@itxperts.de> wrote:
> Hi,
>
> during code profiling using LTTng, I found that while
> processing write requests to the cluster, the ceph-osd
> spends a lot of time in ReplicatedPG::log_operation
> before the actual writes to the journal and to the object
> in the FileStore are triggered.
>
> This happens in ReplicatedBackend::submit_transaction.
>
> What I wonder is
> - what is the purpose of the log_operation?
> If I am not mistaken, then it is neither the write-to-journal
> nor the write-to-object; both of these are triggered from
> the queue_operation following that log_operation.
This is setting up the changes to the pg log, and encoding them into
the transaction.
> - can the sequence between the log_operation and
> the actual queue_operation be reversed in
> ReplicatedBackend::submit_transaction?
Nope, it needs to go into the transaction and get journaled.
I'm kind of surprised this is a big time sink, but there is a lot of
encoding so if you're running against a fast system I suppose it could
be relatively large.
-Greg
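
The ordering constraint described in this reply can be illustrated with a toy model. This is only a sketch under stated assumptions, not Ceph code: `Transaction`, `log_operation`, and `queue_transaction` below are hypothetical Python stand-ins, and JSON is used as a placeholder for whatever encoding the OSD really performs. The sketch only mimics the idea that the pg log entry is encoded into the same transaction that is later journaled.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Transaction:
    """Toy stand-in for an object-store transaction (hypothetical, not Ceph's class)."""
    ops: list = field(default_factory=list)
    queued: bool = False

    def append(self, op: bytes) -> None:
        # Once a transaction has been handed off, nothing more can be added.
        if self.queued:
            raise RuntimeError("transaction already queued; cannot add ops")
        self.ops.append(op)


def log_operation(txn: Transaction, entry: dict) -> None:
    """Encode a pg log entry and add it to the transaction (illustrative only)."""
    txn.append(json.dumps({"pg_log": entry}, sort_keys=True).encode())


def queue_transaction(txn: Transaction, journal: list) -> None:
    """Hand the transaction to the journal; after this it is sealed."""
    txn.queued = True
    journal.append(list(txn.ops))


journal = []

# Order used in the thread: log first, then queue.
txn = Transaction()
txn.append(b"write object data")
log_operation(txn, {"version": 1, "op": "modify"})
queue_transaction(txn, journal)

# Reversed order: the journaled transaction lacks the log entry,
# and the late log_operation is rejected in this model.
late = Transaction()
late.append(b"write object data")
queue_transaction(late, journal)
try:
    log_operation(late, {"version": 2, "op": "modify"})
    reversed_ok = True
except RuntimeError:
    reversed_ok = False
```

In this model the first journal record holds both the data write and the log entry, while the reversed sequence journals the data write alone, which is the situation the reply rules out.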
* Re: LTTng tracing: ReplicatedPG::log_operation
2014-12-02 18:32 ` Gregory Farnum
@ 2014-12-03 8:58 ` Andreas Bluemle
2014-12-04 2:17 ` Haomai Wang
0 siblings, 1 reply; 4+ messages in thread
From: Andreas Bluemle @ 2014-12-03 8:58 UTC (permalink / raw)
To: Gregory Farnum; +Cc: Ceph Development
Hi Gregory,
On Tue, 2 Dec 2014 10:32:50 -0800
Gregory Farnum <greg@gregs42.com> wrote:
> Nope, it needs to go into the transaction and get journaled.
> I'm kind of surprised this is a big time sink, but there is a lot of
> encoding so if you're running against a fast system I suppose it could
> be relatively large.
On my test system, adding the pg log entry takes about 60 microseconds.
That is about 12% of the overall time spent on a write request on a
replicating OSD, which is something like 460 microseconds from receipt
of the MSG_OSD_SUBOP at the messenger until the corresponding
MSG_OSD_SUBOPREPLY is sent back to the primary OSD.
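
As a quick check of that proportion, here is a minimal sketch that uses only the two rounded figures quoted above. The function and variable names are hypothetical, and the inputs are estimates rather than exact trace data. With these rounded inputs the share comes out at roughly 13%, which is in line with the "about 12%" stated above given that both numbers are approximate.

```python
def share_of_total(stage_us: float, total_us: float) -> float:
    """Return the percentage of total_us that is spent in stage_us."""
    if total_us <= 0:
        raise ValueError("total_us must be positive")
    return 100.0 * stage_us / total_us


# Rounded figures from the message (microseconds).
log_entry_us = 60.0      # time to add the pg log entry
subop_total_us = 460.0   # MSG_OSD_SUBOP receipt to MSG_OSD_SUBOPREPLY sent

pct = share_of_total(log_entry_us, subop_total_us)
print(f"pg log entry: {pct:.1f}% of the sub-op latency")
```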
--
Andreas Bluemle mailto:Andreas.Bluemle@itxperts.de
* Re: LTTng tracing: ReplicatedPG::log_operation
2014-12-03 8:58 ` Andreas Bluemle
@ 2014-12-04 2:17 ` Haomai Wang
0 siblings, 0 replies; 4+ messages in thread
From: Haomai Wang @ 2014-12-04 2:17 UTC (permalink / raw)
To: Andreas Bluemle; +Cc: Gregory Farnum, Ceph Development
Yes, it causes a small but significant performance hit. But it is
necessary, and there may be room for optimization.
--
Best Regards,
Wheat