* Checking on wip-blkin branch
@ 2015-06-09 11:59 Xue, Chendi
2015-06-09 14:44 ` Josh Durgin
0 siblings, 1 reply; 5+ messages in thread
From: Xue, Chendi @ 2015-06-09 11:59 UTC (permalink / raw)
To: ceph-devel@vger.kernel.org, Josh Durgin, Andrew Shewmaker
Hi Josh and Andrew,
Today I applied the wip-blkin branch to my 4-node Ceph setup and successfully collected Zipkin-based LTTng traces.
The LTTng output from one node looks like this:
[19:40:13.515540140] (+?.?????????) aceph01 zipkin:timestamp: { cpu_id = 3 }, { trace_name = "OSD Handling op", service_name = "PG 1.12ea", port_no = 0, ip = "", trace_id = 5603008495359114284, span_id = 7574382314084922818, parent_span_id = 8052152701610410440, event = "sub_op_commit_rec" }
[19:40:13.516052860] (+0.000512720) aceph01 zipkin:timestamp: { cpu_id = 0 }, { trace_name = "OSD Handling op", service_name = "PG 1.173e", port_no = 0, ip = "", trace_id = 5972487792843317983, span_id = 8783747543424673039, parent_span_id = 4641226164743578081, event = "sub_op_commit_rec" }
[19:40:13.517445543] (+0.001392683) aceph01 zipkin:timestamp: { cpu_id = 0 }, { trace_name = "Main", service_name = "MOSDOp", port_no = 0, ip = "0.0.0.0", trace_id = 6216541782147073283, span_id = 4139330703901011153, parent_span_id = 0, event = "Message allocated" }
[19:40:13.517464397] (+0.000018854) aceph01 zipkin:keyval: { cpu_id = 0 }, { trace_name = "Main", service_name = "MOSDOp", port_no = 0, ip = "0.0.0.0", trace_id = 6216541782147073283, span_id = 4139330703901011153, parent_span_id = 0, key = "Type", val = "MOSDOp" }
[19:40:13.517466586] (+0.000002189) aceph01 zipkin:keyval: { cpu_id = 0 }, { trace_name = "Main", service_name = "MOSDOp", port_no = 0, ip = "0.0.0.0", trace_id = 6216541782147073283, span_id = 4139330703901011153, parent_span_id = 0, key = "Reqid", val = "client.24276.0:517" }
[19:40:13.517470247] (+0.000003661) aceph01 zipkin:timestamp: { cpu_id = 0 }, { trace_name = "Main", service_name = "MOSDOp", port_no = 0, ip = "0.0.0.0", trace_id = 6216541782147073283, span_id = 4139330703901011153, parent_span_id = 0, event = "message_read" }
[19:40:13.517551767] (+0.000081520) aceph01 zipkin:timestamp: { cpu_id = 0 }, { trace_name = "OSD Handling op", service_name = "osd.6", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 389191001414244146, parent_span_id = 4139330703901011153, event = "waiting_on_osdmap" }
[19:40:13.517558902] (+0.000007135) aceph01 zipkin:timestamp: { cpu_id = 0 }, { trace_name = "OSD Handling op", service_name = "osd.6", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 389191001414244146, parent_span_id = 4139330703901011153, event = "handling_op" }
[19:40:13.517582555] (+0.000023653) aceph01 zipkin:timestamp: { cpu_id = 0 }, { trace_name = "OSD Handling op", service_name = "PG 1.c6d", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 389191001414244146, parent_span_id = 4139330703901011153, event = "enqueuing_op" }
[19:40:13.517592939] (+0.000010384) aceph01 zipkin:timestamp: { cpu_id = 0 }, { trace_name = "OSD Handling op", service_name = "PG 1.c6d", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 389191001414244146, parent_span_id = 4139330703901011153, event = "enqueued_op" }
[19:40:13.517610258] (+0.000017319) aceph01 zipkin:timestamp: { cpu_id = 3 }, { trace_name = "OSD Handling op", service_name = "PG 1.c6d", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 389191001414244146, parent_span_id = 4139330703901011153, event = "dequeuing_op" }
[19:40:13.517631460] (+0.000021202) aceph01 zipkin:timestamp: { cpu_id = 3 }, { trace_name = "OSD Handling op", service_name = "PG 1.c6d", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 389191001414244146, parent_span_id = 4139330703901011153, event = "starting_request" }
[19:40:13.517635339] (+0.000003879) aceph01 zipkin:timestamp: { cpu_id = 3 }, { trace_name = "OSD Handling op", service_name = "PG 1.c6d", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 389191001414244146, parent_span_id = 4139330703901011153, event = "handling_message" }
[19:40:13.517637085] (+0.000001746) aceph01 zipkin:timestamp: { cpu_id = 3 }, { trace_name = "OSD Handling op", service_name = "PG 1.c6d", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 389191001414244146, parent_span_id = 4139330703901011153, event = "do_op" }
[19:40:13.517638907] (+0.000001822) aceph01 zipkin:keyval: { cpu_id = 3 }, { trace_name = "OSD Handling op", service_name = "osd.6", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 389191001414244146, parent_span_id = 4139330703901011153, key = "object", val = "rbd_data.5e7b7ca40890.000000000000007d" }
[19:40:13.517677284] (+0.000038377) aceph01 zipkin:timestamp: { cpu_id = 3 }, { trace_name = "OSD Handling op", service_name = "PG 1.c6d", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 389191001414244146, parent_span_id = 4139330703901011153, event = "executing_ctx" }
[19:40:13.517717674] (+0.000040390) aceph01 zipkin:timestamp: { cpu_id = 3 }, { trace_name = "Main", service_name = "MOSDOpReply", port_no = 0, ip = "0.0.0.0", trace_id = 6216541782147073283, span_id = 2923417568587988974, parent_span_id = 389191001414244146, event = "Message allocated" }
[19:40:13.517721865] (+0.000004191) aceph01 zipkin:timestamp: { cpu_id = 3 }, { trace_name = "OSD Handling op", service_name = "PG 1.c6d", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 389191001414244146, parent_span_id = 4139330703901011153, event = "issuing_repop" }
[19:40:13.517728168] (+0.000006303) aceph01 zipkin:timestamp: { cpu_id = 3 }, { trace_name = "OSD Handling op", service_name = "PG 1.c6d", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 389191001414244146, parent_span_id = 4139330703901011153, event = "issuing_replication" }
[19:40:13.517742523] (+0.000014355) aceph01 zipkin:timestamp: { cpu_id = 3 }, { trace_name = "OSD Handling op", service_name = "PG 1.c6d", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 389191001414244146, parent_span_id = 4139330703901011153, event = "sub_op_sent | waiting for subops from 23" }
[19:40:13.517867388] (+0.000124865) aceph01 zipkin:timestamp: { cpu_id = 3 }, { trace_name = "Journal access", service_name = "Journal (/dev/sdh2)", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 5949552981320957406, parent_span_id = 389191001414244146, event = "commit_queued_for_journal_write" }
[19:40:13.517879771] (+0.000012383) aceph01 zipkin:timestamp: { cpu_id = 3 }, { trace_name = "OSD Handling op", service_name = "PG 1.c6d", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 389191001414244146, parent_span_id = 4139330703901011153, event = "dequeued_op" }
[19:40:13.517921734] (+0.000041963) aceph01 zipkin:timestamp: { cpu_id = 4 }, { trace_name = "OSD Handling op", service_name = "PG 1.204", port_no = 0, ip = "", trace_id = 7400594096956513409, span_id = 6293528614396127061, parent_span_id = 8738710816442983260, event = "sub_op_commit_rec" }
[19:40:13.517951799] (+0.000030065) aceph01 zipkin:timestamp: { cpu_id = 7 }, { trace_name = "Journal access", service_name = "Journal (/dev/sdh2)", port_no = 0, ip = "", trace_id = 6216541782147073283, span_id = 5949552981320957406, parent_span_id = 389191001414244146, event = "write_thread_in_journal_buffer" }
[19:40:13.517977726] (+0.000025927) aceph01 zipkin:timestamp: { cpu_id = 4 }, { trace_name = "Main", service_name = "MOSDOpReply", port_no = 0, ip = "0.0.0.0", trace_id = 7400594096956513409, span_id = 6966407715853117252, parent_span_id = 6293528614396127061, event = "Message allocated" }
[19:40:13.517993525] (+0.000015799) aceph01 zipkin:timestamp: { cpu_id = 4 }, { trace_name = "Main", service_name = "MOSDOp", port_no = 0, ip = "0.0.0.0", trace_id = 7400594096956513409, span_id = 8738710816442983260, parent_span_id = 0, event = "replied_commit" }
[19: ...
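For anyone who wants to post-process output like the above, here is a minimal sketch (not part of blkin or Ceph) that parses the text lines and reports the gap between consecutive timestamp events of the same span. The line format is assumed from the sample only; `parse_line`, `span_event_gaps`, and `SAMPLE` are hypothetical names. It compares wall-clock times within a single node's trace and does not handle a run that crosses midnight.

```python
import re
from collections import defaultdict

# One trace line: "[HH:MM:SS.nnnnnnnnn] (+delta) host zipkin:<kind>: { cpu_id = N }, { fields }"
# (format assumed from the sample output, not taken from any specification).
LINE_RE = re.compile(
    r'^\[(\d+):(\d+):(\d+)\.(\d+)\] \(\+[^)]*\) (\S+) zipkin:(\w+): '
    r'\{ cpu_id = (\d+) \}, \{ (.*) \}$'
)
# A field is either key = "quoted text" or key = unsigned integer.
FIELD_RE = re.compile(r'(\w+) = (?:"([^"]*)"|(\d+))')


def parse_line(line):
    """Return a dict for one trace line, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    hh, mm, ss, frac, host, kind, cpu, body = m.groups()
    # Normalise the fractional part to nanoseconds.
    frac_ns = int(frac.ljust(9, "0")[:9])
    time_ns = ((int(hh) * 60 + int(mm)) * 60 + int(ss)) * 10**9 + frac_ns
    record = {"time_ns": time_ns, "host": host, "kind": kind, "cpu_id": int(cpu)}
    for key, text, number in FIELD_RE.findall(body):
        record[key] = int(number) if number else text
    return record


def span_event_gaps(lines):
    """Map (trace_id, span_id) to a list of (previous_event, event, gap_ns)."""
    last_seen = {}
    gaps = defaultdict(list)
    for line in lines:
        rec = parse_line(line)
        if rec is None or rec["kind"] != "timestamp":
            continue  # skip keyval records and truncated or foreign lines
        key = (rec["trace_id"], rec["span_id"])
        if key in last_seen:
            prev_event, prev_ns = last_seen[key]
            gaps[key].append((prev_event, rec["event"], rec["time_ns"] - prev_ns))
        last_seen[key] = (rec["event"], rec["time_ns"])
    return dict(gaps)


# Two consecutive events of the same span, copied from the output above.
SAMPLE = [
    '[19:40:13.517551767] (+0.000081520) aceph01 zipkin:timestamp: { cpu_id = 0 }, '
    '{ trace_name = "OSD Handling op", service_name = "osd.6", port_no = 0, ip = "", '
    'trace_id = 6216541782147073283, span_id = 389191001414244146, '
    'parent_span_id = 4139330703901011153, event = "waiting_on_osdmap" }',
    '[19:40:13.517558902] (+0.000007135) aceph01 zipkin:timestamp: { cpu_id = 0 }, '
    '{ trace_name = "OSD Handling op", service_name = "osd.6", port_no = 0, ip = "", '
    'trace_id = 6216541782147073283, span_id = 389191001414244146, '
    'parent_span_id = 4139330703901011153, event = "handling_op" }',
]

if __name__ == "__main__":
    for (trace_id, span_id), steps in span_event_gaps(SAMPLE).items():
        for prev_event, event, gap_ns in steps:
            print(f"span {span_id}: {prev_event} -> {event}: {gap_ns} ns")
```

Grouping by (trace_id, span_id) rather than by file order matters because events from different operations are interleaved in the trace, as the sub_op_commit_rec lines above show.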
We want to use this latency analysis methodology on future releases and on the keyvalue store and newstore.
What gaps remain before blkin can be merged into master? Can I help rebase it onto master so that we can merge this branch?
Thanks so much!
Best Regards,
-Chendi
* Re: Checking on wip-blkin branch
2015-06-09 11:59 Checking on wip-blkin branch Xue, Chendi
@ 2015-06-09 14:44 ` Josh Durgin
2015-06-16 7:02 ` Xue, Chendi
0 siblings, 1 reply; 5+ messages in thread
From: Josh Durgin @ 2015-06-09 14:44 UTC (permalink / raw)
To: Xue, Chendi, ceph-devel@vger.kernel.org, Andrew Shewmaker
Hi Chendi,
On 06/09/2015 04:59 AM, Xue, Chendi wrote:
> Hi Josh and Andrew,
>
> Today I applied the wip-blkin branch to my 4-node Ceph setup and successfully collected Zipkin-based LTTng traces.
>
> We want to use this latency analysis methodology on future releases and on the keyvalue store and newstore.
> What gaps remain before blkin can be merged into master? Can I help rebase it onto master so that we can merge this branch?
That'd be great! Andrew and I simply haven't had time to finish fixing
up the remaining issues. Running it through a rados suite showed some
more changes were necessary to handle more complicated osd usage -
you'll see several osd crashes in this rados suite run [1].
The code was looking fine to me, it just needs rebasing and debugging
so it can pass the rados test suite.
Thanks!
Josh
[1]
http://pulpito.ceph.com/joshd-2015-03-25_17:29:16-rados-wip-blkin---blkin-multi/
* RE: Checking on wip-blkin branch
2015-06-09 14:44 ` Josh Durgin
@ 2015-06-16 7:02 ` Xue, Chendi
2015-06-16 17:13 ` Josh Durgin
0 siblings, 1 reply; 5+ messages in thread
From: Xue, Chendi @ 2015-06-16 7:02 UTC (permalink / raw)
To: Josh Durgin, ceph-devel@vger.kernel.org, Andrew Shewmaker
Hi Josh and Andrew,
I just rebased the wip-blkin branch onto 9.0.1 and ran a rados bench test on it:
https://github.com/ceph/ceph/pull/4963
LTTng results were collected successfully.
Regarding the rados test suite failures, I don't have a test environment. Any suggestions?
Best Regards,
-Chendi
* Re: Checking on wip-blkin branch
2015-06-16 7:02 ` Xue, Chendi
@ 2015-06-16 17:13 ` Josh Durgin
2015-06-16 18:20 ` Josh Durgin
0 siblings, 1 reply; 5+ messages in thread
From: Josh Durgin @ 2015-06-16 17:13 UTC (permalink / raw)
To: Xue, Chendi, ceph-devel@vger.kernel.org, Andrew Shewmaker
On 06/16/2015 12:02 AM, Xue, Chendi wrote:
> Hi Josh and Andrew,
>
> I just rebased the wip-blkin branch onto 9.0.1 and ran a rados bench test on it:
>
> https://github.com/ceph/ceph/pull/4963
Thanks! Reset the branch in ceph.git.
> LTTng results were collected successfully.
>
> Regarding the rados test suite failures, I don't have a test environment. Any suggestions?
You should be able to reproduce the failures without teuthology. Try
running these scripts from the ceph tree with

    ms inject socket failures = 5000

in the [global] section of ceph.conf:

    qa/workunits/rados/test.sh
    qa/workunits/rados/test_pool_quota.sh
    qa/workunits/rados/cls.sh
These resulted in osd crashes in the last rados suite run linked
earlier in the thread.
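
A rough sketch of driving those scripts from one place, in case it helps. This is only an illustration, not an existing Ceph tool: `collect_workunits` and `run_workunits` are hypothetical helper names, and the list uses every script in qa/workunits/cls/ in place of cls.sh, per the correction later in this thread. It assumes a test cluster is already running with the injection option set in ceph.conf, and that the scripts can be run with bash from the current environment.

```python
import subprocess
from pathlib import Path


def collect_workunits(ceph_root):
    """Return the workunit scripts named in this thread that exist under ceph_root."""
    root = Path(ceph_root)
    named = [
        root / "qa/workunits/rados/test.sh",
        root / "qa/workunits/rados/test_pool_quota.sh",
    ]
    # All shell scripts in qa/workunits/cls/ (there is no single cls.sh).
    cls_scripts = sorted((root / "qa/workunits/cls").glob("*.sh"))
    return [path for path in named + cls_scripts if path.is_file()]


def run_workunits(scripts):
    """Run each script with bash and return {script path: exit code}."""
    results = {}
    for script in scripts:
        # check=False: keep going after a failure so every script is exercised.
        proc = subprocess.run(["bash", str(script)], check=False)
        results[str(script)] = proc.returncode
    return results
```

Calling `run_workunits(collect_workunits("/path/to/ceph"))` and then checking the osd logs for crashes is the intended use; a non-zero exit code alone does not say whether an osd crashed.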
Josh
* Re: Checking on wip-blkin branch
2015-06-16 17:13 ` Josh Durgin
@ 2015-06-16 18:20 ` Josh Durgin
0 siblings, 0 replies; 5+ messages in thread
From: Josh Durgin @ 2015-06-16 18:20 UTC (permalink / raw)
To: Xue, Chendi, ceph-devel@vger.kernel.org, Andrew Shewmaker
On 06/16/2015 10:13 AM, Josh Durgin wrote:
> On 06/16/2015 12:02 AM, Xue, Chendi wrote:
>> Hi Josh and Andrew,
>>
>> I just rebased the wip-blkin branch onto 9.0.1 and ran a rados bench
>> test on it:
>>
>> https://github.com/ceph/ceph/pull/4963
>
> Thanks! Reset the branch in ceph.git.
>
>> LTTng results were collected successfully.
>>
>> Regarding the rados test suite failures, I don't have a test
>> environment. Any suggestions?
>
> You should be able to reproduce the failures without teuthology. Try
> running these scripts from the ceph tree with
>
> ms inject socket failures = 5000
>
> in the [global] section of ceph.conf:
>
> qa/workunits/rados/test.sh
> qa/workunits/rados/test_pool_quota.sh
> qa/workunits/rados/cls.sh
Whoops, there's no cls.sh. I meant all the scripts in qa/workunits/cls/
>
> These resulted in osd crashes in the last rados suite run linked
> earlier in the thread.
>
> Josh