All of lore.kernel.org
 help / color / mirror / Atom feed
* Tracing Ceph results
@ 2017-05-02 20:21 Mohamad Gebai
  2017-05-02 20:47 ` Sage Weil
  0 siblings, 1 reply; 6+ messages in thread
From: Mohamad Gebai @ 2017-05-02 20:21 UTC (permalink / raw)
  To: ceph-devel

Hi,

I thought I would share some results of tracing I've been conducting 
recently. I instrumented Ceph (OSD only) with GCC's 
-finstrument-functions option, and traced it and the kernel 
simultaneously using LTTng (no instrumentation is actually necessary, 
though rebuilding Ceph is required).

In [1]:
- the top view shows the state of the OSD process (22983). It is 
switching between the running (green) and blocked (yellow) states
- the middle view shows the critical path of that process, in other 
words it shows what the process 22983 was waiting for when it is in the 
blocked (yellow) state
- the bottom view shows the call stack graph, which is all function 
calls in OSD code, generated by -finstrument-functions

Image [2] shows a better image of the call stack view, and Image [3] 
shows a better one of the critical path.

The code is in branch [4].

After looking extensively at the result, I created a diagram that 
simplifies these interactions and shows the high level behavior of Ceph.

Here's the IO path of a write_full() - how it goes from the client to 
all OSDs and back. The OSDs talk to each other through their respective 
msg-worker thread using sockets, and the threads of the same OSD talk to 
each other through semaphores (int Wait(Mutex &mutex) in Cond.h). 
Colored arrows are messages across nodes (over the network). Thick 
horizontal lines separate nodes, thin horizontal lines separate threads 
on the same node.
https://docs.google.com/drawings/d/1fFb-aI8NQq7RESV2OKzrte2KvaBWAh6X9TL7woAGAmo/edit?usp=sharing

I did the same exercise for librbd (build with -finstrument-functions). 
Here's the IO path in librbd for writes against images with RBD 
journaling enabled and disabled. The red parts only appear when 
journaling is enabled. There's still a bit more differences for the 
final flush() part, but I've omitted them for simplicity. If there's 
need/interest I can add those as well. The dotted vertical line separate 
iterations of writes() in the client (rbd tool).
https://docs.google.com/drawings/d/1s9pRVygdPzBtQyYrvPueF8Cm6Xi1d4oy5gNMOjx2Wzo/edit?usp=sharing

I think these are the kind of diagrams we'll be able to get more easily 
from Zipkin/Blkin.

Mohamad


[1] https://drive.google.com/open?id=0B0_2xRlvBgt9c3ltaTNmMEplTlE
[2] https://drive.google.com/open?id=0B0_2xRlvBgt9TjVNTUpzWkpITjQ
[3] https://drive.google.com/open?id=0B0_2xRlvBgt9dl9vS083X2sxWTQ
[4] https://github.com/mogeb/ceph/tree/wip-finstrument-functions

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-05-03 13:46 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-02 20:21 Tracing Ceph results Mohamad Gebai
2017-05-02 20:47 ` Sage Weil
2017-05-02 21:05   ` Mohamad Gebai
2017-05-03  0:49     ` Mark Nelson
2017-05-03 13:45       ` Mohamad Gebai
2017-05-03 13:43     ` Jason Dillaman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.