* [Qemu-devel] [RFC 0/2] Tracing
@ 2010-05-21 9:42 Stefan Hajnoczi
2010-05-21 9:42 ` [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support Stefan Hajnoczi
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2010-05-21 9:42 UTC (permalink / raw)
To: qemu-devel, kvm; +Cc: Anthony Liguori, Prerna Saxena
Trace events in QEMU/KVM can be very useful for debugging and performance
analysis. I'd like to discuss tracing support and hope others have an interest
in this feature, too.
Following this email are patches I am using to debug virtio-blk and storage.
The patches provide trivial tracing support, but they don't address the details
of real tracing tools: enabling/disabling events at runtime, no overhead for
disabled events, multithreading support, etc.
It would be nice to have userland tracing facilities that work out-of-the-box
on production systems. Unfortunately, I'm not aware of any such facilities out
there right now on Linux. Perhaps SystemTap userspace tracing is the way to
go, has anyone tried it with KVM?
For the medium term, without userspace tracing facilities in the OS we could
put something into QEMU to address the need for tracing. Here are my thoughts
on fleshing out the tracing patch I have posted:
1. Make it possible to enable/disable events at runtime. Users enable only the
events they are interested in and aren't flooded with trace data for all
other events.
2. Either make trace events cheap or build without trace events by default.
Disable by default still allows tracing to be used for development but
less for production.
3. Allow events in any execution context (cpu, io, aio emulation threads). The
current code does not support concurrency and is meant for when the iothread
mutex is held.
4. Make it easy to add new events. Instead of keeping trace.h and trace.py in
sync manually, use something like .hx to produce the appropriate C and
Python.
Summary: Tracing is useful, are there external tools we can use right now? If
not, should we put in something that works well enough until external tools
catch up?
Stefan
^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support
2010-05-21 9:42 [Qemu-devel] [RFC 0/2] Tracing Stefan Hajnoczi
@ 2010-05-21 9:42 ` Stefan Hajnoczi
2010-05-21 9:49 ` [Qemu-devel] " Stefan Hajnoczi
` (2 more replies)
2010-05-21 9:42 ` [Qemu-devel] [PATCH 2/2] trace: Trace write requests in virtio-blk, multiwrite, and paio_submit Stefan Hajnoczi
2010-05-21 11:27 ` [Qemu-devel] [RFC 0/2] Tracing Prerna Saxena
2 siblings, 3 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2010-05-21 9:42 UTC (permalink / raw)
To: qemu-devel, kvm; +Cc: Anthony Liguori, Stefan Hajnoczi, Prerna Saxena
Trace events should be defined in trace.h. Events are written to
/tmp/trace.log and can be formatted using trace.py. Remember to add
events to trace.py for pretty-printing.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
Makefile.objs | 2 +-
trace.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
trace.h | 9 ++++++++
trace.py | 30 ++++++++++++++++++++++++++
4 files changed, 104 insertions(+), 1 deletions(-)
create mode 100644 trace.c
create mode 100644 trace.h
create mode 100755 trace.py
diff --git a/Makefile.objs b/Makefile.objs
index acbaf22..307e989 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -8,7 +8,7 @@ qobject-obj-y += qerror.o
# block-obj-y is code used by both qemu system emulation and qemu-img
block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o
-block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o
+block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o trace.o
block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
diff --git a/trace.c b/trace.c
new file mode 100644
index 0000000..2fec4d3
--- /dev/null
+++ b/trace.c
@@ -0,0 +1,64 @@
+#include <stdlib.h>
+#include <stdio.h>
+#include "trace.h"
+
+typedef struct {
+ unsigned long event;
+ unsigned long x1;
+ unsigned long x2;
+ unsigned long x3;
+ unsigned long x4;
+ unsigned long x5;
+} TraceRecord;
+
+enum {
+ TRACE_BUF_LEN = 64 * 1024 / sizeof(TraceRecord),
+};
+
+static TraceRecord trace_buf[TRACE_BUF_LEN];
+static unsigned int trace_idx;
+static FILE *trace_fp;
+
+static void trace(TraceEvent event, unsigned long x1,
+ unsigned long x2, unsigned long x3,
+ unsigned long x4, unsigned long x5) {
+ TraceRecord *rec = &trace_buf[trace_idx];
+ rec->event = event;
+ rec->x1 = x1;
+ rec->x2 = x2;
+ rec->x3 = x3;
+ rec->x4 = x4;
+ rec->x5 = x5;
+
+ if (++trace_idx == TRACE_BUF_LEN) {
+ trace_idx = 0;
+
+ if (!trace_fp) {
+ trace_fp = fopen("/tmp/trace.log", "w");
+ }
+ if (trace_fp) {
+ size_t result = fwrite(trace_buf, sizeof trace_buf, 1, trace_fp);
+ result = result;
+ }
+ }
+}
+
+void trace1(TraceEvent event, unsigned long x1) {
+ trace(event, x1, 0, 0, 0, 0);
+}
+
+void trace2(TraceEvent event, unsigned long x1, unsigned long x2) {
+ trace(event, x1, x2, 0, 0, 0);
+}
+
+void trace3(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3) {
+ trace(event, x1, x2, x3, 0, 0);
+}
+
+void trace4(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4) {
+ trace(event, x1, x2, x3, x4, 0);
+}
+
+void trace5(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4, unsigned long x5) {
+ trace(event, x1, x2, x3, x4, x5);
+}
diff --git a/trace.h b/trace.h
new file mode 100644
index 0000000..144aa1e
--- /dev/null
+++ b/trace.h
@@ -0,0 +1,9 @@
+typedef enum {
+ TRACE_MAX
+} TraceEvent;
+
+void trace1(TraceEvent event, unsigned long x1);
+void trace2(TraceEvent event, unsigned long x1, unsigned long x2);
+void trace3(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3);
+void trace4(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4);
+void trace5(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4, unsigned long x5);
diff --git a/trace.py b/trace.py
new file mode 100755
index 0000000..f38ab6b
--- /dev/null
+++ b/trace.py
@@ -0,0 +1,30 @@
+#!/usr/bin/env python
+import sys
+import struct
+
+trace_fmt = 'LLLLLL'
+trace_len = struct.calcsize(trace_fmt)
+
+events = {
+}
+
+def read_record(fobj):
+ s = fobj.read(trace_len)
+ if len(s) != trace_len:
+ return None
+ return struct.unpack(trace_fmt, s)
+
+def format_record(rec):
+ event = events[rec[0]]
+ fields = [event[0]]
+ for i in xrange(1, len(event)):
+ fields.append('%s=0x%x' % (event[i], rec[i]))
+ return ' '.join(fields)
+
+f = open(sys.argv[1], 'rb')
+while True:
+ rec = read_record(f)
+ if rec is None:
+ break
+
+ print format_record(rec)
--
1.7.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH 2/2] trace: Trace write requests in virtio-blk, multiwrite, and paio_submit
2010-05-21 9:42 [Qemu-devel] [RFC 0/2] Tracing Stefan Hajnoczi
2010-05-21 9:42 ` [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support Stefan Hajnoczi
@ 2010-05-21 9:42 ` Stefan Hajnoczi
2010-05-21 11:27 ` [Qemu-devel] [RFC 0/2] Tracing Prerna Saxena
2 siblings, 0 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2010-05-21 9:42 UTC (permalink / raw)
To: qemu-devel, kvm; +Cc: Anthony Liguori, Stefan Hajnoczi, Prerna Saxena
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
block.c | 7 +++++++
hw/virtio-blk.c | 6 ++++++
posix-aio-compat.c | 2 ++
trace.h | 42 +++++++++++++++++++++++++++++++++++++++++-
trace.py | 8 ++++++++
5 files changed, 64 insertions(+), 1 deletions(-)
diff --git a/block.c b/block.c
index bfe46e3..a7fb040 100644
--- a/block.c
+++ b/block.c
@@ -27,6 +27,7 @@
#include "block_int.h"
#include "module.h"
#include "qemu-objects.h"
+#include "trace.h"
#ifdef CONFIG_BSD
#include <sys/types.h>
@@ -1913,6 +1914,8 @@ static void multiwrite_cb(void *opaque, int ret)
{
MultiwriteCB *mcb = opaque;
+ trace_multiwrite_cb(mcb, ret);
+
if (ret < 0 && !mcb->error) {
mcb->error = ret;
multiwrite_user_cb(mcb);
@@ -2044,6 +2047,8 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs)
// Check for mergable requests
num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb);
+ trace_bdrv_aio_multiwrite(mcb, mcb->num_callbacks, num_reqs);
+
// Run the aio requests
for (i = 0; i < num_reqs; i++) {
acb = bdrv_aio_writev(bs, reqs[i].sector, reqs[i].qiov,
@@ -2054,9 +2059,11 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs)
// submitted yet. Otherwise we'll wait for the submitted AIOs to
// complete and report the error in the callback.
if (mcb->num_requests == 0) {
+ trace_bdrv_aio_multiwrite_earlyfail(mcb);
reqs[i].error = -EIO;
goto fail;
} else {
+ trace_bdrv_aio_multiwrite_latefail(mcb, i);
mcb->num_requests++;
multiwrite_cb(mcb, -EIO);
break;
diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index b05d15e..73b873e 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -13,6 +13,7 @@
#include <qemu-common.h>
#include <sysemu.h>
+#include "trace.h"
#include "virtio-blk.h"
#include "block_int.h"
#ifdef __linux__
@@ -50,6 +51,8 @@ static void virtio_blk_req_complete(VirtIOBlockReq *req, int status)
{
VirtIOBlock *s = req->dev;
+ trace_virtio_blk_req_complete(req, status);
+
req->in->status = status;
virtqueue_push(s->vq, &req->elem, req->qiov.size + sizeof(*req->in));
virtio_notify(&s->vdev, s->vq);
@@ -87,6 +90,8 @@ static void virtio_blk_rw_complete(void *opaque, int ret)
{
VirtIOBlockReq *req = opaque;
+ trace_virtio_blk_rw_complete(req, ret);
+
if (ret) {
int is_read = !(req->out->type & VIRTIO_BLK_T_OUT);
if (virtio_blk_handle_rw_error(req, -ret, is_read))
@@ -270,6 +275,7 @@ static void virtio_blk_handle_write(BlockRequest *blkreq, int *num_writes,
blkreq[*num_writes].cb = virtio_blk_rw_complete;
blkreq[*num_writes].opaque = req;
blkreq[*num_writes].error = 0;
+ trace_virtio_blk_handle_write(req, req->out->sector, req->qiov.size / 512);
(*num_writes)++;
}
diff --git a/posix-aio-compat.c b/posix-aio-compat.c
index b43c531..57d83f0 100644
--- a/posix-aio-compat.c
+++ b/posix-aio-compat.c
@@ -23,6 +23,7 @@
#include <stdio.h>
#include "qemu-queue.h"
+#include "trace.h"
#include "osdep.h"
#include "qemu-common.h"
#include "block_int.h"
@@ -583,6 +584,7 @@ BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
acb->next = posix_aio_state->first_aio;
posix_aio_state->first_aio = acb;
+ trace_paio_submit(acb, opaque, sector_num, nb_sectors, type);
qemu_paio_submit(acb);
return &acb->common;
}
diff --git a/trace.h b/trace.h
index 144aa1e..3c4564f 100644
--- a/trace.h
+++ b/trace.h
@@ -1,5 +1,12 @@
typedef enum {
- TRACE_MAX
+ TRACE_MULTIWRITE_CB,
+ TRACE_BDRV_AIO_MULTIWRITE,
+ TRACE_BDRV_AIO_MULTIWRITE_EARLYFAIL,
+ TRACE_BDRV_AIO_MULTIWRITE_LATEFAIL,
+ TRACE_VIRTIO_BLK_REQ_COMPLETE,
+ TRACE_VIRTIO_BLK_RW_COMPLETE,
+ TRACE_VIRTIO_BLK_HANDLE_WRITE,
+ TRACE_PAIO_SUBMIT,
} TraceEvent;
void trace1(TraceEvent event, unsigned long x1);
@@ -7,3 +14,36 @@ void trace2(TraceEvent event, unsigned long x1, unsigned long x2);
void trace3(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3);
void trace4(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4);
void trace5(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4, unsigned long x5);
+
+static inline void trace_multiwrite_cb(void *mcb, int ret) {
+ trace2(TRACE_MULTIWRITE_CB, (unsigned long)mcb, ret);
+}
+
+static inline void trace_bdrv_aio_multiwrite(void *mcb, int num_callbacks, int num_reqs) {
+ trace3(TRACE_BDRV_AIO_MULTIWRITE, (unsigned long)mcb, num_callbacks, num_reqs);
+}
+
+static inline void trace_bdrv_aio_multiwrite_earlyfail(void *mcb) {
+ trace1(TRACE_BDRV_AIO_MULTIWRITE_EARLYFAIL, (unsigned long)mcb);
+}
+
+static inline void trace_bdrv_aio_multiwrite_latefail(void *mcb, int i) {
+ trace2(TRACE_BDRV_AIO_MULTIWRITE_LATEFAIL, (unsigned long)mcb, i);
+}
+
+static inline void trace_virtio_blk_req_complete(void *req, int status) {
+ trace2(TRACE_VIRTIO_BLK_REQ_COMPLETE, (unsigned long)req, status);
+}
+
+static inline void trace_virtio_blk_rw_complete(void *req, int ret) {
+ trace2(TRACE_VIRTIO_BLK_RW_COMPLETE, (unsigned long)req, ret);
+}
+
+static inline void trace_virtio_blk_handle_write(void *req, unsigned long sector, unsigned long nsectors) {
+ trace3(TRACE_VIRTIO_BLK_HANDLE_WRITE, (unsigned long)req, sector, nsectors);
+}
+
+static inline void trace_paio_submit(void *acb, void *opaque, unsigned long sector_num, unsigned long nb_sectors, unsigned long type)
+{
+ trace5(TRACE_PAIO_SUBMIT, (unsigned long)acb, (unsigned long)opaque, sector_num, nb_sectors, type);
+}
diff --git a/trace.py b/trace.py
index f38ab6b..ba4476c 100755
--- a/trace.py
+++ b/trace.py
@@ -6,6 +6,14 @@ trace_fmt = 'LLLLLL'
trace_len = struct.calcsize(trace_fmt)
events = {
+ 0: ('multiwrite_cb', 'mcb', 'ret'),
+ 1: ('bdrv_aio_multiwrite', 'mcb', 'num_callbacks', 'num_reqs'),
+ 2: ('bdrv_aio_multiwrite_earlyfail', 'mcb'),
+ 3: ('bdrv_aio_multiwrite_latefail', 'mcb', 'i'),
+ 4: ('virtio_blk_req_complete', 'req', 'status'),
+ 5: ('virtio_blk_rw_complete', 'req', 'ret'),
+ 6: ('virtio_blk_handle_write', 'req', 'sector', 'nsectors'),
+ 7: ('paio_submit', 'acb', 'opaque', 'sector_num', 'nb_sectors', 'type'),
}
def read_record(fobj):
--
1.7.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] Re: [PATCH 1/2] trace: Add simple tracing support
2010-05-21 9:42 ` [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support Stefan Hajnoczi
@ 2010-05-21 9:49 ` Stefan Hajnoczi
2010-05-21 11:13 ` Jan Kiszka
2010-05-21 12:37 ` [Qemu-devel] " Anthony Liguori
2 siblings, 0 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2010-05-21 9:49 UTC (permalink / raw)
To: qemu-devel, kvm; +Cc: Anthony Liguori, Prerna Saxena
I should have used the "[RFC]" tag to make it clear that I'm not
proposing these patches for merge, sorry.
Stefan
^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] Re: [PATCH 1/2] trace: Add simple tracing support
2010-05-21 9:42 ` [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support Stefan Hajnoczi
2010-05-21 9:49 ` [Qemu-devel] " Stefan Hajnoczi
@ 2010-05-21 11:13 ` Jan Kiszka
2010-05-21 13:10 ` Stefan Hajnoczi
2010-05-21 12:37 ` [Qemu-devel] " Anthony Liguori
2 siblings, 1 reply; 18+ messages in thread
From: Jan Kiszka @ 2010-05-21 11:13 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Anthony Liguori, qemu-devel, kvm, Prerna Saxena
Stefan Hajnoczi wrote:
> Trace events should be defined in trace.h. Events are written to
> /tmp/trace.log and can be formatted using trace.py. Remember to add
> events to trace.py for pretty-printing.
When already writing to a file, why not reusing QEMU's logging
infrastructure ("log <foo>" / "-d foo")? Shouldn't make a huge
performance difference if the data is saved in clear-text.
Also, having support for ftrace's user space markers would be a very
nice option (only an option as it's Linux-specific), see
http://lwn.net/Articles/366796. This allows to correlate kernel events
(KVM as well as others) with what goes on in QEMU. It simply enables
integration with the whole kernel tracing infrastructure, e.g.
KernelShark (http://people.redhat.com/srostedt/kernelshark/HTML).
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [RFC 0/2] Tracing
2010-05-21 9:42 [Qemu-devel] [RFC 0/2] Tracing Stefan Hajnoczi
2010-05-21 9:42 ` [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support Stefan Hajnoczi
2010-05-21 9:42 ` [Qemu-devel] [PATCH 2/2] trace: Trace write requests in virtio-blk, multiwrite, and paio_submit Stefan Hajnoczi
@ 2010-05-21 11:27 ` Prerna Saxena
2 siblings, 0 replies; 18+ messages in thread
From: Prerna Saxena @ 2010-05-21 11:27 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Anthony Liguori, qemu-devel, kvm
Hi Stefan,
Nice to see the patchset.
I am working on something similar, on the lines of static trace events
for QEMU, that collect traces in a qemu-internal buffer. This would
employ monitor commands to read traces, as well as enable/disable trace
events at runtime.
I plan to post a prototype early next week.
On 05/21/2010 03:12 PM, Stefan Hajnoczi wrote:
> Trace events in QEMU/KVM can be very useful for debugging and performance
> analysis. I'd like to discuss tracing support and hope others have an interest
> in this feature, too.
>
> Following this email are patches I am using to debug virtio-blk and storage.
> The patches provide trivial tracing support, but they don't address the details
> of real tracing tools: enabling/disabling events at runtime, no overhead for
> disabled events, multithreading support, etc.
>
> It would be nice to have userland tracing facilities that work out-of-the-box
> on production systems. Unfortunately, I'm not aware of any such facilities out
> there right now on Linux. Perhaps SystemTap userspace tracing is the way to
> go, has anyone tried it with KVM?
>
> For the medium term, without userspace tracing facilities in the OS we could
> put something into QEMU to address the need for tracing. Here are my thoughts
> on fleshing out the tracing patch I have posted:
>
> 1. Make it possible to enable/disable events at runtime. Users enable only the
> events they are interested in and aren't flooded with trace data for all
> other events.
>
Agree, my upcoming patchset should address this.
> 2. Either make trace events cheap or build without trace events by default.
> Disable by default still allows tracing to be used for development but
> less for production.
>
I'm trying to do this too, though quite a lot remains to be improved in
my current implementation :-)
> 3. Allow events in any execution context (cpu, io, aio emulation threads).
>
Agree.
> 4. Make it easy to add new events.
Agree ! I'm trying to provide a unified macro interface like trace
events which makes it easy enough to add new events.
Regards,
--
Prerna Saxena
Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support
2010-05-21 9:42 ` [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support Stefan Hajnoczi
2010-05-21 9:49 ` [Qemu-devel] " Stefan Hajnoczi
2010-05-21 11:13 ` Jan Kiszka
@ 2010-05-21 12:37 ` Anthony Liguori
2010-05-21 13:46 ` Jan Kiszka
2 siblings, 1 reply; 18+ messages in thread
From: Anthony Liguori @ 2010-05-21 12:37 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: qemu-devel, kvm, Prerna Saxena
On 05/21/2010 04:42 AM, Stefan Hajnoczi wrote:
> Trace events should be defined in trace.h. Events are written to
> /tmp/trace.log and can be formatted using trace.py. Remember to add
> events to trace.py for pretty-printing.
>
> Signed-off-by: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com>
> ---
> Makefile.objs | 2 +-
> trace.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> trace.h | 9 ++++++++
> trace.py | 30 ++++++++++++++++++++++++++
> 4 files changed, 104 insertions(+), 1 deletions(-)
> create mode 100644 trace.c
> create mode 100644 trace.h
> create mode 100755 trace.py
>
> diff --git a/Makefile.objs b/Makefile.objs
> index acbaf22..307e989 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -8,7 +8,7 @@ qobject-obj-y += qerror.o
> # block-obj-y is code used by both qemu system emulation and qemu-img
>
> block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o
> -block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o
> +block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o trace.o
> block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
> block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>
> diff --git a/trace.c b/trace.c
> new file mode 100644
> index 0000000..2fec4d3
> --- /dev/null
> +++ b/trace.c
> @@ -0,0 +1,64 @@
> +#include<stdlib.h>
> +#include<stdio.h>
> +#include "trace.h"
> +
> +typedef struct {
> + unsigned long event;
> + unsigned long x1;
> + unsigned long x2;
> + unsigned long x3;
> + unsigned long x4;
> + unsigned long x5;
> +} TraceRecord;
> +
> +enum {
> + TRACE_BUF_LEN = 64 * 1024 / sizeof(TraceRecord),
> +};
> +
> +static TraceRecord trace_buf[TRACE_BUF_LEN];
> +static unsigned int trace_idx;
> +static FILE *trace_fp;
> +
> +static void trace(TraceEvent event, unsigned long x1,
> + unsigned long x2, unsigned long x3,
> + unsigned long x4, unsigned long x5) {
> + TraceRecord *rec =&trace_buf[trace_idx];
> + rec->event = event;
> + rec->x1 = x1;
> + rec->x2 = x2;
> + rec->x3 = x3;
> + rec->x4 = x4;
> + rec->x5 = x5;
> +
> + if (++trace_idx == TRACE_BUF_LEN) {
> + trace_idx = 0;
> +
> + if (!trace_fp) {
> + trace_fp = fopen("/tmp/trace.log", "w");
> + }
> + if (trace_fp) {
> + size_t result = fwrite(trace_buf, sizeof trace_buf, 1, trace_fp);
> + result = result;
> + }
> + }
> +}
>
It is probably worth while to read trace points via the monitor or
through some other mechanism. My concern would be that writing even 64k
out to disk would introduce enough performance overhead mainly because
it runs lock-step with the guest's VCPU.
Maybe it's worth adding a thread that syncs the ring to disk if we want
to write to disk?
> +void trace1(TraceEvent event, unsigned long x1) {
> + trace(event, x1, 0, 0, 0, 0);
> +}
> +
> +void trace2(TraceEvent event, unsigned long x1, unsigned long x2) {
> + trace(event, x1, x2, 0, 0, 0);
> +}
> +
> +void trace3(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3) {
> + trace(event, x1, x2, x3, 0, 0);
> +}
> +
> +void trace4(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4) {
> + trace(event, x1, x2, x3, x4, 0);
> +}
> +
> +void trace5(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4, unsigned long x5) {
> + trace(event, x1, x2, x3, x4, x5);
> +}
> diff --git a/trace.h b/trace.h
> new file mode 100644
> index 0000000..144aa1e
> --- /dev/null
> +++ b/trace.h
> @@ -0,0 +1,9 @@
> +typedef enum {
> + TRACE_MAX
> +} TraceEvent;
> +
> +void trace1(TraceEvent event, unsigned long x1);
> +void trace2(TraceEvent event, unsigned long x1, unsigned long x2);
> +void trace3(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3);
> +void trace4(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4);
> +void trace5(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4, unsigned long x5);
>
Looks good. I think we definitely need something like this.
Regards,
Anthony Liguori
> diff --git a/trace.py b/trace.py
> new file mode 100755
> index 0000000..f38ab6b
> --- /dev/null
> +++ b/trace.py
> @@ -0,0 +1,30 @@
> +#!/usr/bin/env python
> +import sys
> +import struct
> +
> +trace_fmt = 'LLLLLL'
> +trace_len = struct.calcsize(trace_fmt)
> +
> +events = {
> +}
> +
> +def read_record(fobj):
> + s = fobj.read(trace_len)
> + if len(s) != trace_len:
> + return None
> + return struct.unpack(trace_fmt, s)
> +
> +def format_record(rec):
> + event = events[rec[0]]
> + fields = [event[0]]
> + for i in xrange(1, len(event)):
> + fields.append('%s=0x%x' % (event[i], rec[i]))
> + return ' '.join(fields)
> +
> +f = open(sys.argv[1], 'rb')
> +while True:
> + rec = read_record(f)
> + if rec is None:
> + break
> +
> + print format_record(rec)
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] Re: [PATCH 1/2] trace: Add simple tracing support
2010-05-21 11:13 ` Jan Kiszka
@ 2010-05-21 13:10 ` Stefan Hajnoczi
2010-05-21 13:22 ` Jan Kiszka
0 siblings, 1 reply; 18+ messages in thread
From: Stefan Hajnoczi @ 2010-05-21 13:10 UTC (permalink / raw)
To: Jan Kiszka
Cc: Anthony Liguori, Prerna Saxena, Stefan Hajnoczi, kvm, qemu-devel
On Fri, May 21, 2010 at 12:13 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> Stefan Hajnoczi wrote:
>> Trace events should be defined in trace.h. Events are written to
>> /tmp/trace.log and can be formatted using trace.py. Remember to add
>> events to trace.py for pretty-printing.
>
> When already writing to a file, why not reusing QEMU's logging
> infrastructure ("log <foo>" / "-d foo")? Shouldn't make a huge
> performance difference if the data is saved in clear-text.
> Also, having support for ftrace's user space markers would be a very
> nice option (only an option as it's Linux-specific), see
> http://lwn.net/Articles/366796.
Thanks for the links.
I think using the platform's tracing facility has many advantages.
The main one being that we can focus on QEMU/KVM development rather
than re-implementing tracing infrastructure :).
It may be possible to have SystemTap, DTrace, or nop static trace
event code. A platform with no tracing support can only use the nop
backend, which results in a build without static trace events.
Platforms with tracing support can build with the appropriate backend
or nop. The backend tracing facility is abstracted and most of QEMU
doesn't need to know which one is being used.
I hadn't seen trace markers. However, I suspect they aren't ideal for
static trace events because logging an event requires a write system
call. They look useful for annotating kernel tracing information, but
less for high frequency/low overhead userspace tracing.
Stefan
^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] Re: [PATCH 1/2] trace: Add simple tracing support
2010-05-21 13:10 ` Stefan Hajnoczi
@ 2010-05-21 13:22 ` Jan Kiszka
0 siblings, 0 replies; 18+ messages in thread
From: Jan Kiszka @ 2010-05-21 13:22 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Anthony Liguori, Prerna Saxena, Stefan Hajnoczi,
kvm@vger.kernel.org, qemu-devel@nongnu.org
Stefan Hajnoczi wrote:
> On Fri, May 21, 2010 at 12:13 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>> Stefan Hajnoczi wrote:
>>> Trace events should be defined in trace.h. Events are written to
>>> /tmp/trace.log and can be formatted using trace.py. Remember to add
>>> events to trace.py for pretty-printing.
>> When already writing to a file, why not reusing QEMU's logging
>> infrastructure ("log <foo>" / "-d foo")? Shouldn't make a huge
>> performance difference if the data is saved in clear-text.
>
>> Also, having support for ftrace's user space markers would be a very
>> nice option (only an option as it's Linux-specific), see
>> http://lwn.net/Articles/366796.
>
> Thanks for the links.
>
> I think using the platform's tracing facility has many advantages.
> The main one being that we can focus on QEMU/KVM development rather
> than re-implementing tracing infrastructure :).
Indeed. :)
>
> It may be possible to have SystemTap, DTrace, or nop static trace
> event code. A platform with no tracing support can only use the nop
> backend, which results in a build without static trace events.
> Platforms with tracing support can build with the appropriate backend
> or nop. The backend tracing facility is abstracted and most of QEMU
> doesn't need to know which one is being used.
That would be ideal.
>
> I hadn't seen trace markers. However, I suspect they aren't ideal for
> static trace events because logging an event requires a write system
> call. They look useful for annotating kernel tracing information, but
> less for high frequency/low overhead userspace tracing.
You never know for sure until you tried :). There are surely lots of
scenarios where this overhead does not matter.
Moreover, I'm sure that something of LTTng's high-frequency/low-overhead
tracing capabilities will make it (in whatever form) into mainline
sooner or later. So we need that smart infrastructure to make use of it
once it's available (actually, LTTng is already available, just still
requires "some" kernel patching).
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support
2010-05-21 12:37 ` [Qemu-devel] " Anthony Liguori
@ 2010-05-21 13:46 ` Jan Kiszka
2010-05-21 14:03 ` Anthony Liguori
0 siblings, 1 reply; 18+ messages in thread
From: Jan Kiszka @ 2010-05-21 13:46 UTC (permalink / raw)
To: Anthony Liguori; +Cc: Prerna Saxena, Stefan Hajnoczi, kvm, qemu-devel
Anthony Liguori wrote:
> On 05/21/2010 04:42 AM, Stefan Hajnoczi wrote:
>> Trace events should be defined in trace.h. Events are written to
>> /tmp/trace.log and can be formatted using trace.py. Remember to add
>> events to trace.py for pretty-printing.
>>
>> Signed-off-by: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com>
>> ---
>> Makefile.objs | 2 +-
>> trace.c | 64
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> trace.h | 9 ++++++++
>> trace.py | 30 ++++++++++++++++++++++++++
>> 4 files changed, 104 insertions(+), 1 deletions(-)
>> create mode 100644 trace.c
>> create mode 100644 trace.h
>> create mode 100755 trace.py
>>
>> diff --git a/Makefile.objs b/Makefile.objs
>> index acbaf22..307e989 100644
>> --- a/Makefile.objs
>> +++ b/Makefile.objs
>> @@ -8,7 +8,7 @@ qobject-obj-y += qerror.o
>> # block-obj-y is code used by both qemu system emulation and qemu-img
>>
>> block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o
>> module.o
>> -block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o
>> +block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o trace.o
>> block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
>> block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>>
>> diff --git a/trace.c b/trace.c
>> new file mode 100644
>> index 0000000..2fec4d3
>> --- /dev/null
>> +++ b/trace.c
>> @@ -0,0 +1,64 @@
>> +#include<stdlib.h>
>> +#include<stdio.h>
>> +#include "trace.h"
>> +
>> +typedef struct {
>> + unsigned long event;
>> + unsigned long x1;
>> + unsigned long x2;
>> + unsigned long x3;
>> + unsigned long x4;
>> + unsigned long x5;
>> +} TraceRecord;
>> +
>> +enum {
>> + TRACE_BUF_LEN = 64 * 1024 / sizeof(TraceRecord),
>> +};
>> +
>> +static TraceRecord trace_buf[TRACE_BUF_LEN];
>> +static unsigned int trace_idx;
>> +static FILE *trace_fp;
>> +
>> +static void trace(TraceEvent event, unsigned long x1,
>> + unsigned long x2, unsigned long x3,
>> + unsigned long x4, unsigned long x5) {
>> + TraceRecord *rec =&trace_buf[trace_idx];
>> + rec->event = event;
>> + rec->x1 = x1;
>> + rec->x2 = x2;
>> + rec->x3 = x3;
>> + rec->x4 = x4;
>> + rec->x5 = x5;
>> +
>> + if (++trace_idx == TRACE_BUF_LEN) {
>> + trace_idx = 0;
>> +
>> + if (!trace_fp) {
>> + trace_fp = fopen("/tmp/trace.log", "w");
>> + }
>> + if (trace_fp) {
>> + size_t result = fwrite(trace_buf, sizeof trace_buf, 1,
>> trace_fp);
>> + result = result;
>> + }
>> + }
>> +}
>>
>
> It is probably worth while to read trace points via the monitor or
> through some other mechanism. My concern would be that writing even 64k
> out to disk would introduce enough performance overhead mainly because
> it runs lock-step with the guest's VCPU.
>
> Maybe it's worth adding a thread that syncs the ring to disk if we want
> to write to disk?
That's not what QEMU should worry about. If somehow possible, let's push
this into the hands of a (user space) tracing framework, ideally one
that is already designed for such requirements. E.g. there exists quite
useful work in the context of LTTng (user space RCU for application
tracing).
We may need simple stubs for the case that no such framework is (yet)
available. But effort should focus on a QEMU infrastructure to add
useful tracepoints to the code. Specifically when tracing over KVM, you
usually need information about kernel states as well, so you depend on
an integrated approach, not Yet Another Log File.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support
2010-05-21 13:46 ` Jan Kiszka
@ 2010-05-21 14:03 ` Anthony Liguori
2010-05-21 16:52 ` Jan Kiszka
0 siblings, 1 reply; 18+ messages in thread
From: Anthony Liguori @ 2010-05-21 14:03 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Prerna Saxena, Stefan Hajnoczi, kvm, qemu-devel
On 05/21/2010 08:46 AM, Jan Kiszka wrote:
> Anthony Liguori wrote:
>
>> On 05/21/2010 04:42 AM, Stefan Hajnoczi wrote:
>>
>>> Trace events should be defined in trace.h. Events are written to
>>> /tmp/trace.log and can be formatted using trace.py. Remember to add
>>> events to trace.py for pretty-printing.
>>>
>>> Signed-off-by: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com>
>>> ---
>>> Makefile.objs | 2 +-
>>> trace.c | 64
>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> trace.h | 9 ++++++++
>>> trace.py | 30 ++++++++++++++++++++++++++
>>> 4 files changed, 104 insertions(+), 1 deletions(-)
>>> create mode 100644 trace.c
>>> create mode 100644 trace.h
>>> create mode 100755 trace.py
>>>
>>> diff --git a/Makefile.objs b/Makefile.objs
>>> index acbaf22..307e989 100644
>>> --- a/Makefile.objs
>>> +++ b/Makefile.objs
>>> @@ -8,7 +8,7 @@ qobject-obj-y += qerror.o
>>> # block-obj-y is code used by both qemu system emulation and qemu-img
>>>
>>> block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o
>>> module.o
>>> -block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o
>>> +block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o trace.o
>>> block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
>>> block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>>>
>>> diff --git a/trace.c b/trace.c
>>> new file mode 100644
>>> index 0000000..2fec4d3
>>> --- /dev/null
>>> +++ b/trace.c
>>> @@ -0,0 +1,64 @@
>>> +#include<stdlib.h>
>>> +#include<stdio.h>
>>> +#include "trace.h"
>>> +
>>> +typedef struct {
>>> + unsigned long event;
>>> + unsigned long x1;
>>> + unsigned long x2;
>>> + unsigned long x3;
>>> + unsigned long x4;
>>> + unsigned long x5;
>>> +} TraceRecord;
>>> +
>>> +enum {
>>> + TRACE_BUF_LEN = 64 * 1024 / sizeof(TraceRecord),
>>> +};
>>> +
>>> +static TraceRecord trace_buf[TRACE_BUF_LEN];
>>> +static unsigned int trace_idx;
>>> +static FILE *trace_fp;
>>> +
>>> +static void trace(TraceEvent event, unsigned long x1,
>>> + unsigned long x2, unsigned long x3,
>>> + unsigned long x4, unsigned long x5) {
>>> + TraceRecord *rec =&trace_buf[trace_idx];
>>> + rec->event = event;
>>> + rec->x1 = x1;
>>> + rec->x2 = x2;
>>> + rec->x3 = x3;
>>> + rec->x4 = x4;
>>> + rec->x5 = x5;
>>> +
>>> + if (++trace_idx == TRACE_BUF_LEN) {
>>> + trace_idx = 0;
>>> +
>>> + if (!trace_fp) {
>>> + trace_fp = fopen("/tmp/trace.log", "w");
>>> + }
>>> + if (trace_fp) {
>>> + size_t result = fwrite(trace_buf, sizeof trace_buf, 1,
>>> trace_fp);
>>> + result = result;
>>> + }
>>> + }
>>> +}
>>>
>>>
>> It is probably worth while to read trace points via the monitor or
>> through some other mechanism. My concern would be that writing even 64k
>> out to disk would introduce enough performance overhead mainly because
>> it runs lock-step with the guest's VCPU.
>>
>> Maybe it's worth adding a thread that syncs the ring to disk if we want
>> to write to disk?
>>
> That's not what QEMU should worry about. If somehow possible, let's push
> this into the hands of a (user space) tracing framework, ideally one
> that is already designed for such requirements. E.g. there exists quite
> useful work in the context of LTTng (user space RCU for application
> tracing).
>
From what I understand, none of the current kernel approaches to
userspace tracing have much momentum at the moment.
> We may need simple stubs for the case that no such framework is (yet)
> available. But effort should focus on a QEMU infrastructure to add
> useful tracepoints to the code. Specifically when tracing over KVM, you
> usually need information about kernel states as well, so you depend on
> an integrated approach, not Yet Another Log File.
>
I think the simple code that Stefan pasted gives us 95% of what we need.
Regards,
Anthony Liguori
> Jan
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support
2010-05-21 14:03 ` Anthony Liguori
@ 2010-05-21 16:52 ` Jan Kiszka
2010-05-21 20:49 ` Stefan Hajnoczi
2010-05-21 21:06 ` Anthony Liguori
0 siblings, 2 replies; 18+ messages in thread
From: Jan Kiszka @ 2010-05-21 16:52 UTC (permalink / raw)
To: Anthony Liguori
Cc: Prerna Saxena, Stefan Hajnoczi, kvm@vger.kernel.org,
qemu-devel@nongnu.org
Anthony Liguori wrote:
> On 05/21/2010 08:46 AM, Jan Kiszka wrote:
>> Anthony Liguori wrote:
>>
>>> On 05/21/2010 04:42 AM, Stefan Hajnoczi wrote:
>>>
>>>> Trace events should be defined in trace.h. Events are written to
>>>> /tmp/trace.log and can be formatted using trace.py. Remember to add
>>>> events to trace.py for pretty-printing.
>>>>
>>>> Signed-off-by: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com>
>>>> ---
>>>> Makefile.objs | 2 +-
>>>> trace.c | 64
>>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> trace.h | 9 ++++++++
>>>> trace.py | 30 ++++++++++++++++++++++++++
>>>> 4 files changed, 104 insertions(+), 1 deletions(-)
>>>> create mode 100644 trace.c
>>>> create mode 100644 trace.h
>>>> create mode 100755 trace.py
>>>>
>>>> diff --git a/Makefile.objs b/Makefile.objs
>>>> index acbaf22..307e989 100644
>>>> --- a/Makefile.objs
>>>> +++ b/Makefile.objs
>>>> @@ -8,7 +8,7 @@ qobject-obj-y += qerror.o
>>>> # block-obj-y is code used by both qemu system emulation and qemu-img
>>>>
>>>> block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o
>>>> module.o
>>>> -block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o
>>>> +block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o trace.o
>>>> block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
>>>> block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>>>>
>>>> diff --git a/trace.c b/trace.c
>>>> new file mode 100644
>>>> index 0000000..2fec4d3
>>>> --- /dev/null
>>>> +++ b/trace.c
>>>> @@ -0,0 +1,64 @@
>>>> +#include<stdlib.h>
>>>> +#include<stdio.h>
>>>> +#include "trace.h"
>>>> +
>>>> +typedef struct {
>>>> + unsigned long event;
>>>> + unsigned long x1;
>>>> + unsigned long x2;
>>>> + unsigned long x3;
>>>> + unsigned long x4;
>>>> + unsigned long x5;
>>>> +} TraceRecord;
>>>> +
>>>> +enum {
>>>> + TRACE_BUF_LEN = 64 * 1024 / sizeof(TraceRecord),
>>>> +};
>>>> +
>>>> +static TraceRecord trace_buf[TRACE_BUF_LEN];
>>>> +static unsigned int trace_idx;
>>>> +static FILE *trace_fp;
>>>> +
>>>> +static void trace(TraceEvent event, unsigned long x1,
>>>> + unsigned long x2, unsigned long x3,
>>>> + unsigned long x4, unsigned long x5) {
>>>> + TraceRecord *rec =&trace_buf[trace_idx];
>>>> + rec->event = event;
>>>> + rec->x1 = x1;
>>>> + rec->x2 = x2;
>>>> + rec->x3 = x3;
>>>> + rec->x4 = x4;
>>>> + rec->x5 = x5;
>>>> +
>>>> + if (++trace_idx == TRACE_BUF_LEN) {
>>>> + trace_idx = 0;
>>>> +
>>>> + if (!trace_fp) {
>>>> + trace_fp = fopen("/tmp/trace.log", "w");
>>>> + }
>>>> + if (trace_fp) {
>>>> + size_t result = fwrite(trace_buf, sizeof trace_buf, 1,
>>>> trace_fp);
>>>> + result = result;
>>>> + }
>>>> + }
>>>> +}
>>>>
>>>>
>>> It is probably worth while to read trace points via the monitor or
>>> through some other mechanism. My concern would be that writing even 64k
>>> out to disk would introduce enough performance overhead mainly because
>>> it runs lock-step with the guest's VCPU.
>>>
>>> Maybe it's worth adding a thread that syncs the ring to disk if we want
>>> to write to disk?
>>>
>> That's not what QEMU should worry about. If somehow possible, let's push
>> this into the hands of a (user space) tracing framework, ideally one
>> that is already designed for such requirements. E.g. there exists quite
>> useful work in the context of LTTng (user space RCU for application
>> tracing).
>>
>
> From what I understand, none of the current kernel approaches to
> userspace tracing have much momentum at the moment.
>
>> We may need simple stubs for the case that no such framework is (yet)
>> available. But effort should focus on a QEMU infrastructure to add
>> useful tracepoints to the code. Specifically when tracing over KVM, you
>> usually need information about kernel states as well, so you depend on
>> an integrated approach, not Yet Another Log File.
>>
>
> I think the simple code that Stefan pasted gives us 95% of what we need.
IMHO not 95%, but it is a start.
I would just like to avoid that too much efforts are spent on
re-inventing smart trace buffers, trace daemons, or trace visualization
tools. Then better pick up some semi-perfect approach (e.g. [1], it
unfortunately still seems to lack kernel integration) and drive it
according to our needs.
Jan
[1] http://lttng.org/ust
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support
2010-05-21 16:52 ` Jan Kiszka
@ 2010-05-21 20:49 ` Stefan Hajnoczi
2010-05-21 21:26 ` Christoph Hellwig
2010-05-21 21:37 ` Jan Kiszka
2010-05-21 21:06 ` Anthony Liguori
1 sibling, 2 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2010-05-21 20:49 UTC (permalink / raw)
To: Jan Kiszka
Cc: kvm@vger.kernel.org, Prerna Saxena, Stefan Hajnoczi,
qemu-devel@nongnu.org
On Fri, May 21, 2010 at 5:52 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> I would just like to avoid that too much efforts are spent on
> re-inventing smart trace buffers, trace daemons, or trace visualization
> tools. Then better pick up some semi-perfect approach (e.g. [1], it
> unfortunately still seems to lack kernel integration) and drive it
> according to our needs.
I agree we have to consider existing solutions. The killer is the
usability: what dependencies are required to build with tracing? Is a
patched kernel or module required? How easy is it to add static trace
events during debugging?
If there are too many dependencies, especially to unpackaged software,
many people will stop right there and not bother. A patched kernel or
module isn't acceptable since the hassle of reconfiguring a system for
tracing becomes too great (or in some cases changing the kernel is not
possible/allowed).
Adding new static trace events should be easy, too. Ideally it
doesn't require adding information about the trace event in multiple
places (header files, C files, etc). It also shouldn't require
learning about the tracing system, adding a trace event should be
self-explanatory so anyone can easily add one for debugging.
A lot of opinions there, but what I'm saying is that friction must be
low. If the tracing system is a pain to use, then no-one will use it.
http://lttng.org/files/ust/manual/ust.html
LTTng Userspace Tracer looks interesting - no kernel support required
AFAICT. Toggling trace events in a running process supported.
Similar to kernel tracepoint.h and existing report/visualization tool.
x86 (32- and 64-bit) only. Like you say, no correlation with kernel trace data.
I'll try to give LTTng UST a spin by converting my trace events to use
UST. This seems closest to an existing tracing system we can drop in.
http://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps
Requires kernel support - not sure if enough of utrace is in mainline
for this to work out-of-the-box across distros.
Unclear how exactly SystemTap userspace probing would work out. Does
anyone have experience or want to try this?
Stefan
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support
2010-05-21 16:52 ` Jan Kiszka
2010-05-21 20:49 ` Stefan Hajnoczi
@ 2010-05-21 21:06 ` Anthony Liguori
2010-05-21 21:41 ` Jan Kiszka
1 sibling, 1 reply; 18+ messages in thread
From: Anthony Liguori @ 2010-05-21 21:06 UTC (permalink / raw)
To: Jan Kiszka
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, Stefan Hajnoczi,
Prerna Saxena
On 05/21/2010 11:52 AM, Jan Kiszka wrote:
> Anthony Liguori wrote:
>
>> On 05/21/2010 08:46 AM, Jan Kiszka wrote:
>>
>>> Anthony Liguori wrote:
>>>
>>>
>>>> On 05/21/2010 04:42 AM, Stefan Hajnoczi wrote:
>>>>
>>>>
>>>>> Trace events should be defined in trace.h. Events are written to
>>>>> /tmp/trace.log and can be formatted using trace.py. Remember to add
>>>>> events to trace.py for pretty-printing.
>>>>>
>>>>> Signed-off-by: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com>
>>>>> ---
>>>>> Makefile.objs | 2 +-
>>>>> trace.c | 64
>>>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>> trace.h | 9 ++++++++
>>>>> trace.py | 30 ++++++++++++++++++++++++++
>>>>> 4 files changed, 104 insertions(+), 1 deletions(-)
>>>>> create mode 100644 trace.c
>>>>> create mode 100644 trace.h
>>>>> create mode 100755 trace.py
>>>>>
>>>>> diff --git a/Makefile.objs b/Makefile.objs
>>>>> index acbaf22..307e989 100644
>>>>> --- a/Makefile.objs
>>>>> +++ b/Makefile.objs
>>>>> @@ -8,7 +8,7 @@ qobject-obj-y += qerror.o
>>>>> # block-obj-y is code used by both qemu system emulation and qemu-img
>>>>>
>>>>> block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o
>>>>> module.o
>>>>> -block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o
>>>>> +block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o trace.o
>>>>> block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
>>>>> block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>>>>>
>>>>> diff --git a/trace.c b/trace.c
>>>>> new file mode 100644
>>>>> index 0000000..2fec4d3
>>>>> --- /dev/null
>>>>> +++ b/trace.c
>>>>> @@ -0,0 +1,64 @@
>>>>> +#include<stdlib.h>
>>>>> +#include<stdio.h>
>>>>> +#include "trace.h"
>>>>> +
>>>>> +typedef struct {
>>>>> + unsigned long event;
>>>>> + unsigned long x1;
>>>>> + unsigned long x2;
>>>>> + unsigned long x3;
>>>>> + unsigned long x4;
>>>>> + unsigned long x5;
>>>>> +} TraceRecord;
>>>>> +
>>>>> +enum {
>>>>> + TRACE_BUF_LEN = 64 * 1024 / sizeof(TraceRecord),
>>>>> +};
>>>>> +
>>>>> +static TraceRecord trace_buf[TRACE_BUF_LEN];
>>>>> +static unsigned int trace_idx;
>>>>> +static FILE *trace_fp;
>>>>> +
>>>>> +static void trace(TraceEvent event, unsigned long x1,
>>>>> + unsigned long x2, unsigned long x3,
>>>>> + unsigned long x4, unsigned long x5) {
>>>>> + TraceRecord *rec =&trace_buf[trace_idx];
>>>>> + rec->event = event;
>>>>> + rec->x1 = x1;
>>>>> + rec->x2 = x2;
>>>>> + rec->x3 = x3;
>>>>> + rec->x4 = x4;
>>>>> + rec->x5 = x5;
>>>>> +
>>>>> + if (++trace_idx == TRACE_BUF_LEN) {
>>>>> + trace_idx = 0;
>>>>> +
>>>>> + if (!trace_fp) {
>>>>> + trace_fp = fopen("/tmp/trace.log", "w");
>>>>> + }
>>>>> + if (trace_fp) {
>>>>> + size_t result = fwrite(trace_buf, sizeof trace_buf, 1,
>>>>> trace_fp);
>>>>> + result = result;
>>>>> + }
>>>>> + }
>>>>> +}
>>>>>
>>>>>
>>>>>
>>>> It is probably worth while to read trace points via the monitor or
>>>> through some other mechanism. My concern would be that writing even 64k
>>>> out to disk would introduce enough performance overhead mainly because
>>>> it runs lock-step with the guest's VCPU.
>>>>
>>>> Maybe it's worth adding a thread that syncs the ring to disk if we want
>>>> to write to disk?
>>>>
>>>>
>>> That's not what QEMU should worry about. If somehow possible, let's push
>>> this into the hands of a (user space) tracing framework, ideally one
>>> that is already designed for such requirements. E.g. there exists quite
>>> useful work in the context of LTTng (user space RCU for application
>>> tracing).
>>>
>>>
>> From what I understand, none of the current kernel approaches to
>> userspace tracing have much momentum at the moment.
>>
>>
>>> We may need simple stubs for the case that no such framework is (yet)
>>> available. But effort should focus on a QEMU infrastructure to add
>>> useful tracepoints to the code. Specifically when tracing over KVM, you
>>> usually need information about kernel states as well, so you depend on
>>> an integrated approach, not Yet Another Log File.
>>>
>>>
>> I think the simple code that Stefan pasted gives us 95% of what we need.
>>
> IMHO not 95%, but it is a start.
>
I'm not opposed to using a framework, but I'd rather have an equivalent
to kvm_stat tomorrow than wait 3 years for LTTng to not get merged.
So let's have a dirt-simple tracing mechanism and focus on adding useful
trace points. Then when we have a framework we can use, we can just
convert the tracepoints to the new framework.
Regards,
Anthony Liguori
> I would just like to avoid that too much efforts are spent on
> re-inventing smart trace buffers, trace daemons, or trace visualization
> tools. Then better pick up some semi-perfect approach (e.g. [1], it
> unfortunately still seems to lack kernel integration) and drive it
> according to our needs.
>
> Jan
>
> [1] http://lttng.org/ust
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support
2010-05-21 20:49 ` Stefan Hajnoczi
@ 2010-05-21 21:26 ` Christoph Hellwig
2010-05-21 21:37 ` Jan Kiszka
1 sibling, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2010-05-21 21:26 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Stefan Hajnoczi, kvm@vger.kernel.org, Jan Kiszka,
qemu-devel@nongnu.org, Prerna Saxena
On Fri, May 21, 2010 at 09:49:56PM +0100, Stefan Hajnoczi wrote:
> http://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps
>
> Requires kernel support - not sure if enough of utrace is in mainline
> for this to work out-of-the-box across distros.
Nothing of utrace is in mainline, nevermind the whole systemtap code
which is intentionally keep out of the kernel tree. Using this means
that for every probe in userspace code you need to keep the configured
source tree of the currently running kernel around, which is completely
unusable for typical developer setups.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support
2010-05-21 20:49 ` Stefan Hajnoczi
2010-05-21 21:26 ` Christoph Hellwig
@ 2010-05-21 21:37 ` Jan Kiszka
1 sibling, 0 replies; 18+ messages in thread
From: Jan Kiszka @ 2010-05-21 21:37 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: kvm@vger.kernel.org, Prerna Saxena, Stefan Hajnoczi,
qemu-devel@nongnu.org
[-- Attachment #1: Type: text/plain, Size: 2446 bytes --]
Stefan Hajnoczi wrote:
> On Fri, May 21, 2010 at 5:52 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>> I would just like to avoid that too much efforts are spent on
>> re-inventing smart trace buffers, trace daemons, or trace visualization
>> tools. Then better pick up some semi-perfect approach (e.g. [1], it
>> unfortunately still seems to lack kernel integration) and drive it
>> according to our needs.
>
> I agree we have to consider existing solutions. The killer is the
> usability: what dependencies are required to build with tracing? Is a
> patched kernel or module required? How easy is it to add static trace
> events during debugging?
>
> If there are too many dependencies, especially to unpackaged software,
> many people will stop right there and not bother. A patched kernel or
> module isn't acceptable since the hassle of reconfiguring a system for
> tracing becomes too great (or in some cases changing the kernel is not
> possible/allowed).
>
> Adding new static trace events should be easy, too. Ideally it
> doesn't require adding information about the trace event in multiple
> places (header files, C files, etc). It also shouldn't require
> learning about the tracing system, adding a trace event should be
> self-explanatory so anyone can easily add one for debugging.
>
> A lot of opinions there, but what I'm saying is that friction must be
> low. If the tracing system is a pain to use, then no-one will use it.
No question.
I mentioned LTTng as it is most promising /wrt performance (both when
enabled and disabled). But LTTng was so far not best in class when it
came to usability.
>
> http://lttng.org/files/ust/manual/ust.html
>
> LTTng Userspace Tracer looks interesting - no kernel support required
> AFAICT. Toggling trace events in a running process supported.
> Similar to kernel tracepoint.h and existing report/visualization tool.
>
> x86 (32- and 64-bit) only.
Sure? I thought there might be an arch dependency due to urcu but it has
generic support as well now.
> Like you say, no correlation with kernel trace data.
It would be good if we could still hook into trancepoints and stream
them out differently. That would allow for add-hoc tracing when
performance does not matter that much (trace to file, trace to kernel).
But we would still benefit from enabling tracepoints during runtime and
keeping them built in.
Jan
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support
2010-05-21 21:06 ` Anthony Liguori
@ 2010-05-21 21:41 ` Jan Kiszka
2010-05-21 21:58 ` Anthony Liguori
0 siblings, 1 reply; 18+ messages in thread
From: Jan Kiszka @ 2010-05-21 21:41 UTC (permalink / raw)
To: Anthony Liguori
Cc: qemu-devel@nongnu.org, Stefan Hajnoczi, kvm@vger.kernel.org,
Prerna Saxena
[-- Attachment #1: Type: text/plain, Size: 487 bytes --]
Anthony Liguori wrote:
> I'm not opposed to using a framework, but I'd rather have an equivalent
> to kvm_stat tomorrow than wait 3 years for LTTng to not get merged.
>
> So let's have a dirt-simple tracing mechanism and focus on adding useful
> trace points. Then when we have a framework we can use, we can just
> convert the tracepoints to the new framework.
That could mean serializing the tracepoints to strings and dumping them
to our log file - no concerns.
Jan
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support
2010-05-21 21:41 ` Jan Kiszka
@ 2010-05-21 21:58 ` Anthony Liguori
0 siblings, 0 replies; 18+ messages in thread
From: Anthony Liguori @ 2010-05-21 21:58 UTC (permalink / raw)
To: Jan Kiszka
Cc: qemu-devel@nongnu.org, Stefan Hajnoczi, kvm@vger.kernel.org,
Prerna Saxena
On 05/21/2010 04:41 PM, Jan Kiszka wrote:
> Anthony Liguori wrote:
>
>> I'm not opposed to using a framework, but I'd rather have an equivalent
>> to kvm_stat tomorrow than wait 3 years for LTTng to not get merged.
>>
>> So let's have a dirt-simple tracing mechanism and focus on adding useful
>> trace points. Then when we have a framework we can use, we can just
>> convert the tracepoints to the new framework.
>>
> That could mean serializing the tracepoints to strings and dumping them
> to our log file - no concerns.
>
Which I really don't mind.
Regards,
Anthony Liguori
> Jan
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2010-05-21 21:58 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-05-21 9:42 [Qemu-devel] [RFC 0/2] Tracing Stefan Hajnoczi
2010-05-21 9:42 ` [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support Stefan Hajnoczi
2010-05-21 9:49 ` [Qemu-devel] " Stefan Hajnoczi
2010-05-21 11:13 ` Jan Kiszka
2010-05-21 13:10 ` Stefan Hajnoczi
2010-05-21 13:22 ` Jan Kiszka
2010-05-21 12:37 ` [Qemu-devel] " Anthony Liguori
2010-05-21 13:46 ` Jan Kiszka
2010-05-21 14:03 ` Anthony Liguori
2010-05-21 16:52 ` Jan Kiszka
2010-05-21 20:49 ` Stefan Hajnoczi
2010-05-21 21:26 ` Christoph Hellwig
2010-05-21 21:37 ` Jan Kiszka
2010-05-21 21:06 ` Anthony Liguori
2010-05-21 21:41 ` Jan Kiszka
2010-05-21 21:58 ` Anthony Liguori
2010-05-21 9:42 ` [Qemu-devel] [PATCH 2/2] trace: Trace write requests in virtio-blk, multiwrite, and paio_submit Stefan Hajnoczi
2010-05-21 11:27 ` [Qemu-devel] [RFC 0/2] Tracing Prerna Saxena
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).