* Re: Instrumenting RADOS with Zipkin + LTTng
@ 2014-10-30 18:49 Andrew Shewmaker
  2014-11-07 10:32 ` Sage Weil
  2014-11-07 14:18 ` Mark Nelson
  0 siblings, 2 replies; 13+ messages in thread
From: Andrew Shewmaker @ 2014-10-30 18:49 UTC
  To: ceph-devel

Hi everyone,

I'm Andrew, a new Ceph intern, and I'll be working to get Marios'
Zipkin + LTTng repo into a merge-able state, as Sam asked about
(http://www.spinics.net/lists/ceph-devel/msg20024.html).

I've exchanged email with Marios, and he is interested in 
helping me with it in his spare time. One of the important 
things he mentioned is that his blkin library only works 
with LTTng 2.4. Other versions experience deadlocks, and 
he'll work to resolve them in 2.5.

I'm still working on getting a ceph development environment 
set up, so I haven't tested the tracepoints Marios added.
I have built an RPM spec file for blkin, tested it against 
lttng-ust 2.3 and 2.4, and built (but not tested) Marios'
ceph branch.

Should I put effort into adding a "--with-blkin" or "--with-zipkin"
option to autoconf?

I've made a first attempt at dividing the changes into
logically grouped patches: common blkin infrastructure, 
osd, and rados. Does that sound reasonable? Or should 
I not bother separating osd and rados changes?

I saw a couple of instances of extraneous whitespace/newlines
that I'll clean up. What other issues should I look for?

After they're cleaned up, would it be best for me to submit these 
patches to the list, or just point to a GitHub repo?

Thanks,

Andrew Shewmaker

* Instrumenting RADOS with Zipkin + LTTng
@ 2014-08-01 16:28 Marios-Evaggelos Kogias
  2014-08-01 20:17 ` Samuel Just
  2014-08-01 20:54 ` Adam Crume
  0 siblings, 2 replies; 13+ messages in thread
From: Marios-Evaggelos Kogias @ 2014-08-01 16:28 UTC
  To: ceph-devel; +Cc: Vangelis Koukis, cven, Filippos Giannakos

Hello all,

My name is Marios Kogias and I am a student at the National Technical
University of Athens. As part of my diploma thesis and my participation in
Google Summer of Code 2014 (in the LTTng organization) I am working on a
low-overhead tracing infrastructure for distributed systems. I am also
collaborating with the Synnefo team (https://www.synnefo.org/), especially
with Vangelis Koukis, Constantinos Venetsanopoulos and Filippos Giannakos (cc'd).

Some time ago, we started experimenting with RADOS instrumentation using LTTng
and noticed that there are similar endeavours in the Ceph GitHub repository [1].

However, unlike your approach, we are following an annotation-based tracing
scheme, which enables us to track a specific request from the time it enters
the system at higher levels until it is finally served by RADOS.

In general, we try to implement the tracing semantics described in the Dapper
paper [2] in order to trace the causal relationships between the different
processing phases that an IO request may trigger. Our target is an end-to-end
visualisation of the request's route in the system, accompanied by information
concerning latencies in each processing phase. Thanks to LTTng this can happen
with minimal overhead and in real time. To visualize the results we have
integrated Twitter's Zipkin [3] (a tracing system based entirely on Dapper)
with LTTng.
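
The Dapper-style semantics described above can be sketched roughly as follows. This is only an illustration in Python, not blkin's actual API: the `Span` class, its field names, and the sequential `_ids` counter are all hypothetical. The idea is that every span of one request shares a trace ID, each span records its parent to capture causality, and timestamped annotations mark events within a processing phase.

```python
import itertools
import time
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical ID source for this sketch; real tracers typically use random IDs.
_ids = itertools.count(1)


@dataclass
class Span:
    """One processing phase of a request, in the Dapper sense."""
    trace_id: int                      # shared by every span of one request
    name: str
    span_id: int = field(default_factory=lambda: next(_ids))
    parent_id: Optional[int] = None    # None marks the root span
    annotations: List[Tuple[float, str]] = field(default_factory=list)

    def annotate(self, event: str) -> None:
        """Record a timestamped event on this span."""
        self.annotations.append((time.monotonic(), event))

    def child(self, name: str) -> "Span":
        """Start a causally dependent phase that keeps the same trace ID."""
        return Span(trace_id=self.trace_id, name=name, parent_id=self.span_id)


# A request enters at a higher layer and is then served by a lower one.
root = Span(trace_id=next(_ids), name="client write")
root.annotate("request sent")
osd_op = root.child("OSD op")
osd_op.annotate("op received")
osd_op.annotate("op committed")
root.annotate("reply received")
```

Because the child keeps the parent's trace ID and stores the parent's span ID, a collector can later reassemble the whole request from spans emitted by different processes.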

You can find a proof of concept of what we've done so far here:

http://snf-551656.vm.okeanos.grnet.gr:8080/traces/0b554b8a48cb3e84?serviceName=MOSDOp

The above link shows the trace of a write request served by a RADOS
pool with the replication level set to 3 (a primary copy plus two replicas).
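
As a rough idea of what such a trace contains, the sketch below rebuilds a span tree from flat span records and prints the latency of each phase. It assumes the primary OSD forwards the write to two replicas; the span names and all timings are invented for illustration and are not taken from the trace linked above.

```python
from collections import defaultdict

# (span_id, parent_id, name, start_ms, end_ms)
# All values are invented for illustration only.
spans = [
    (1, None, "client write", 0.0, 9.0),
    (2, 1, "primary OSD", 1.0, 8.0),
    (3, 2, "replica 1", 2.0, 6.0),
    (4, 2, "replica 2", 2.0, 7.0),
]


def build_tree(records):
    """Group span records by the ID of their parent span."""
    children = defaultdict(list)
    for rec in records:
        children[rec[1]].append(rec)
    return children


def render(children, parent_id=None, depth=0):
    """Return one indented line per span, ordered by start time."""
    lines = []
    for span_id, _, name, start, end in sorted(children[parent_id], key=lambda r: r[3]):
        lines.append(f"{'  ' * depth}{name}: {end - start:.1f} ms")
        lines.extend(render(children, span_id, depth + 1))
    return lines


report = render(build_tree(spans))
print("\n".join(report))
```

The indentation mirrors the causal nesting: the two replica spans sit under the primary's span, which in turn sits under the client's request.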

We'd love to have early feedback and comments from you guys too,
so that we can incorporate useful recommendations. You can find all
the relevant code here [4][5]. If you have any questions or wish to
experiment with the project, please do not hesitate to contact us.

Kind regards,
Marios

[1] https://github.com/ceph/ceph/tree/wip-lttng
[2] http://static.googleusercontent.com/media/research.google.com/el//pubs/archive/36356.pdf
[3] http://twitter.github.io/zipkin/
[4] https://github.com/marioskogias/blkin
[5] https://github.com/marioskogias/babeltrace-plugins


Thread overview: 13+ messages
2014-10-30 18:49 Instrumenting RADOS with Zipkin + LTTng Andrew Shewmaker
2014-11-07 10:32 ` Sage Weil
2014-11-07 17:29   ` Andrew Shewmaker
2014-11-07 14:18 ` Mark Nelson
2014-11-07 14:36   ` Matt W. Benjamin
  -- strict thread matches above, loose matches on Subject: below --
2014-08-01 16:28 Marios-Evaggelos Kogias
2014-08-01 20:17 ` Samuel Just
2014-08-01 20:24   ` Samuel Just
     [not found]     ` <CAFu9Dv1Aqsf69=fXGZ1QjgQuvmXyaDGX0vk91MAT=6QuWznXDw@mail.gmail.com>
2014-08-05 16:22       ` Marios-Evaggelos Kogias
2014-08-13 21:21         ` Samuel Just
2014-08-01 20:54 ` Adam Crume
2014-08-05 16:18   ` Marios-Evaggelos Kogias
2014-08-01 16:14 Marios-Evaggelos Kogias