* [Qemu-devel] [RFC 0/2] Tracing @ 2010-05-21 9:42 Stefan Hajnoczi 2010-05-21 9:42 ` [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support Stefan Hajnoczi ` (2 more replies) 0 siblings, 3 replies; 18+ messages in thread From: Stefan Hajnoczi @ 2010-05-21 9:42 UTC (permalink / raw) To: qemu-devel, kvm; +Cc: Anthony Liguori, Prerna Saxena Trace events in QEMU/KVM can be very useful for debugging and performance analysis. I'd like to discuss tracing support and hope others have an interest in this feature, too. Following this email are patches I am using to debug virtio-blk and storage. The patches provide trivial tracing support, but they don't address the details of real tracing tools: enabling/disabling events at runtime, no overhead for disabled events, multithreading support, etc. It would be nice to have userland tracing facilities that work out-of-the-box on production systems. Unfortunately, I'm not aware of any such facilities out there right now on Linux. Perhaps SystemTap userspace tracing is the way to go, has anyone tried it with KVM? For the medium term, without userspace tracing facilities in the OS we could put something into QEMU to address the need for tracing. Here are my thoughts on fleshing out the tracing patch I have posted: 1. Make it possible to enable/disable events at runtime. Users enable only the events they are interested in and aren't flooded with trace data for all other events. 2. Either make trace events cheap or build without trace events by default. Disable by default still allows tracing to be used for development but less for production. 3. Allow events in any execution context (cpu, io, aio emulation threads). The current code does not support concurrency and is meant for when the iothread mutex is held. 4. Make it easy to add new events. Instead of keeping trace.h and trace.py in sync manually, use something like .hx to produce the appropriate C and Python. Summary: Tracing is useful, are there external tools we can use right now? If not, should we put in something that works well enough until external tools catch up? Stefan ^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support 2010-05-21 9:42 [Qemu-devel] [RFC 0/2] Tracing Stefan Hajnoczi @ 2010-05-21 9:42 ` Stefan Hajnoczi 2010-05-21 9:49 ` [Qemu-devel] " Stefan Hajnoczi ` (2 more replies) 2010-05-21 9:42 ` [Qemu-devel] [PATCH 2/2] trace: Trace write requests in virtio-blk, multiwrite, and paio_submit Stefan Hajnoczi 2010-05-21 11:27 ` [Qemu-devel] [RFC 0/2] Tracing Prerna Saxena 2 siblings, 3 replies; 18+ messages in thread From: Stefan Hajnoczi @ 2010-05-21 9:42 UTC (permalink / raw) To: qemu-devel, kvm; +Cc: Anthony Liguori, Stefan Hajnoczi, Prerna Saxena Trace events should be defined in trace.h. Events are written to /tmp/trace.log and can be formatted using trace.py. Remember to add events to trace.py for pretty-printing. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> --- Makefile.objs | 2 +- trace.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ trace.h | 9 ++++++++ trace.py | 30 ++++++++++++++++++++++++++ 4 files changed, 104 insertions(+), 1 deletions(-) create mode 100644 trace.c create mode 100644 trace.h create mode 100755 trace.py diff --git a/Makefile.objs b/Makefile.objs index acbaf22..307e989 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -8,7 +8,7 @@ qobject-obj-y += qerror.o # block-obj-y is code used by both qemu system emulation and qemu-img block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o -block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o +block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o trace.o block-obj-$(CONFIG_POSIX) += posix-aio-compat.o block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o diff --git a/trace.c b/trace.c new file mode 100644 index 0000000..2fec4d3 --- /dev/null +++ b/trace.c @@ -0,0 +1,64 @@ +#include <stdlib.h> +#include <stdio.h> +#include "trace.h" + +typedef struct { + unsigned long event; + unsigned long x1; + unsigned long x2; + unsigned long x3; + unsigned long x4; + unsigned long x5; +} TraceRecord; + +enum { + TRACE_BUF_LEN = 64 * 1024 / sizeof(TraceRecord), +}; + +static TraceRecord trace_buf[TRACE_BUF_LEN]; +static unsigned int trace_idx; +static FILE *trace_fp; + +static void trace(TraceEvent event, unsigned long x1, + unsigned long x2, unsigned long x3, + unsigned long x4, unsigned long x5) { + TraceRecord *rec = &trace_buf[trace_idx]; + rec->event = event; + rec->x1 = x1; + rec->x2 = x2; + rec->x3 = x3; + rec->x4 = x4; + rec->x5 = x5; + + if (++trace_idx == TRACE_BUF_LEN) { + trace_idx = 0; + + if (!trace_fp) { + trace_fp = fopen("/tmp/trace.log", "w"); + } + if (trace_fp) { + size_t result = fwrite(trace_buf, sizeof trace_buf, 1, trace_fp); + result = result; + } + } +} + +void trace1(TraceEvent event, unsigned long x1) { + trace(event, x1, 0, 0, 0, 0); +} + +void trace2(TraceEvent event, unsigned long x1, unsigned long x2) { + trace(event, x1, x2, 0, 0, 0); +} + +void trace3(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3) { + trace(event, x1, x2, x3, 0, 0); +} + +void trace4(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4) { + trace(event, x1, x2, x3, x4, 0); +} + +void trace5(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4, unsigned long x5) { + trace(event, x1, x2, x3, x4, x5); +} diff --git a/trace.h b/trace.h new file mode 100644 index 0000000..144aa1e --- /dev/null +++ b/trace.h @@ -0,0 +1,9 @@ +typedef enum { + TRACE_MAX +} TraceEvent; + +void trace1(TraceEvent event, unsigned long x1); +void trace2(TraceEvent event, unsigned long x1, unsigned long x2); +void trace3(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3); +void trace4(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4); +void trace5(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4, unsigned long x5); diff --git a/trace.py b/trace.py new file mode 100755 index 0000000..f38ab6b --- /dev/null +++ b/trace.py @@ -0,0 +1,30 @@ +#!/usr/bin/env python +import sys +import struct + +trace_fmt = 'LLLLLL' +trace_len = struct.calcsize(trace_fmt) + +events = { +} + +def read_record(fobj): + s = fobj.read(trace_len) + if len(s) != trace_len: + return None + return struct.unpack(trace_fmt, s) + +def format_record(rec): + event = events[rec[0]] + fields = [event[0]] + for i in xrange(1, len(event)): + fields.append('%s=0x%x' % (event[i], rec[i])) + return ' '.join(fields) + +f = open(sys.argv[1], 'rb') +while True: + rec = read_record(f) + if rec is None: + break + + print format_record(rec) -- 1.7.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] Re: [PATCH 1/2] trace: Add simple tracing support 2010-05-21 9:42 ` [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support Stefan Hajnoczi @ 2010-05-21 9:49 ` Stefan Hajnoczi 2010-05-21 11:13 ` Jan Kiszka 2010-05-21 12:37 ` [Qemu-devel] " Anthony Liguori 2 siblings, 0 replies; 18+ messages in thread From: Stefan Hajnoczi @ 2010-05-21 9:49 UTC (permalink / raw) To: qemu-devel, kvm; +Cc: Anthony Liguori, Prerna Saxena I should have used the "[RFC]" tag to make it clear that I'm not proposing these patches for merge, sorry. Stefan ^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] Re: [PATCH 1/2] trace: Add simple tracing support 2010-05-21 9:42 ` [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support Stefan Hajnoczi 2010-05-21 9:49 ` [Qemu-devel] " Stefan Hajnoczi @ 2010-05-21 11:13 ` Jan Kiszka 2010-05-21 13:10 ` Stefan Hajnoczi 2010-05-21 12:37 ` [Qemu-devel] " Anthony Liguori 2 siblings, 1 reply; 18+ messages in thread From: Jan Kiszka @ 2010-05-21 11:13 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: Anthony Liguori, qemu-devel, kvm, Prerna Saxena Stefan Hajnoczi wrote: > Trace events should be defined in trace.h. Events are written to > /tmp/trace.log and can be formatted using trace.py. Remember to add > events to trace.py for pretty-printing. When already writing to a file, why not reusing QEMU's logging infrastructure ("log <foo>" / "-d foo")? Shouldn't make a huge performance difference if the data is saved in clear-text. Also, having support for ftrace's user space markers would be a very nice option (only an option as it's Linux-specific), see http://lwn.net/Articles/366796. This allows to correlate kernel events (KVM as well as others) with what goes on in QEMU. It simply enables integration with the whole kernel tracing infrastructure, e.g. KernelShark (http://people.redhat.com/srostedt/kernelshark/HTML). Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] Re: [PATCH 1/2] trace: Add simple tracing support 2010-05-21 11:13 ` Jan Kiszka @ 2010-05-21 13:10 ` Stefan Hajnoczi 2010-05-21 13:22 ` Jan Kiszka 0 siblings, 1 reply; 18+ messages in thread From: Stefan Hajnoczi @ 2010-05-21 13:10 UTC (permalink / raw) To: Jan Kiszka Cc: Anthony Liguori, Prerna Saxena, Stefan Hajnoczi, kvm, qemu-devel On Fri, May 21, 2010 at 12:13 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote: > Stefan Hajnoczi wrote: >> Trace events should be defined in trace.h. Events are written to >> /tmp/trace.log and can be formatted using trace.py. Remember to add >> events to trace.py for pretty-printing. > > When already writing to a file, why not reusing QEMU's logging > infrastructure ("log <foo>" / "-d foo")? Shouldn't make a huge > performance difference if the data is saved in clear-text. > Also, having support for ftrace's user space markers would be a very > nice option (only an option as it's Linux-specific), see > http://lwn.net/Articles/366796. Thanks for the links. I think using the platform's tracing facility has many advantages. The main one being that we can focus on QEMU/KVM development rather than re-implementing tracing infrastructure :). It may be possible to have SystemTap, DTrace, or nop static trace event code. A platform with no tracing support can only use the nop backend, which results in a build without static trace events. Platforms with tracing support can build with the appropriate backend or nop. The backend tracing facility is abstracted and most of QEMU doesn't need to know which one is being used. I hadn't seen trace markers. However, I suspect they aren't ideal for static trace events because logging an event requires a write system call. They look useful for annotating kernel tracing information, but less for high frequency/low overhead userspace tracing. Stefan ^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] Re: [PATCH 1/2] trace: Add simple tracing support 2010-05-21 13:10 ` Stefan Hajnoczi @ 2010-05-21 13:22 ` Jan Kiszka 0 siblings, 0 replies; 18+ messages in thread From: Jan Kiszka @ 2010-05-21 13:22 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Anthony Liguori, Prerna Saxena, Stefan Hajnoczi, kvm@vger.kernel.org, qemu-devel@nongnu.org Stefan Hajnoczi wrote: > On Fri, May 21, 2010 at 12:13 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote: >> Stefan Hajnoczi wrote: >>> Trace events should be defined in trace.h. Events are written to >>> /tmp/trace.log and can be formatted using trace.py. Remember to add >>> events to trace.py for pretty-printing. >> When already writing to a file, why not reusing QEMU's logging >> infrastructure ("log <foo>" / "-d foo")? Shouldn't make a huge >> performance difference if the data is saved in clear-text. > >> Also, having support for ftrace's user space markers would be a very >> nice option (only an option as it's Linux-specific), see >> http://lwn.net/Articles/366796. > > Thanks for the links. > > I think using the platform's tracing facility has many advantages. > The main one being that we can focus on QEMU/KVM development rather > than re-implementing tracing infrastructure :). Indeed. :) > > It may be possible to have SystemTap, DTrace, or nop static trace > event code. A platform with no tracing support can only use the nop > backend, which results in a build without static trace events. > Platforms with tracing support can build with the appropriate backend > or nop. The backend tracing facility is abstracted and most of QEMU > doesn't need to know which one is being used. That would be ideal. > > I hadn't seen trace markers. However, I suspect they aren't ideal for > static trace events because logging an event requires a write system > call. They look useful for annotating kernel tracing information, but > less for high frequency/low overhead userspace tracing. You never know for sure until you tried :). There are surely lots of scenarios where this overhead does not matter. Moreover, I'm sure that something of LTTng's high-frequency/low-overhead tracing capabilities will make it (in whatever form) into mainline sooner or later. So we need that smart infrastructure to make use of it once it's available (actually, LTTng is already available, just still requires "some" kernel patching). Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support 2010-05-21 9:42 ` [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support Stefan Hajnoczi 2010-05-21 9:49 ` [Qemu-devel] " Stefan Hajnoczi 2010-05-21 11:13 ` Jan Kiszka @ 2010-05-21 12:37 ` Anthony Liguori 2010-05-21 13:46 ` Jan Kiszka 2 siblings, 1 reply; 18+ messages in thread From: Anthony Liguori @ 2010-05-21 12:37 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel, kvm, Prerna Saxena On 05/21/2010 04:42 AM, Stefan Hajnoczi wrote: > Trace events should be defined in trace.h. Events are written to > /tmp/trace.log and can be formatted using trace.py. Remember to add > events to trace.py for pretty-printing. > > Signed-off-by: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> > --- > Makefile.objs | 2 +- > trace.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > trace.h | 9 ++++++++ > trace.py | 30 ++++++++++++++++++++++++++ > 4 files changed, 104 insertions(+), 1 deletions(-) > create mode 100644 trace.c > create mode 100644 trace.h > create mode 100755 trace.py > > diff --git a/Makefile.objs b/Makefile.objs > index acbaf22..307e989 100644 > --- a/Makefile.objs > +++ b/Makefile.objs > @@ -8,7 +8,7 @@ qobject-obj-y += qerror.o > # block-obj-y is code used by both qemu system emulation and qemu-img > > block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o > -block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o > +block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o trace.o > block-obj-$(CONFIG_POSIX) += posix-aio-compat.o > block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o > > diff --git a/trace.c b/trace.c > new file mode 100644 > index 0000000..2fec4d3 > --- /dev/null > +++ b/trace.c > @@ -0,0 +1,64 @@ > +#include<stdlib.h> > +#include<stdio.h> > +#include "trace.h" > + > +typedef struct { > + unsigned long event; > + unsigned long x1; > + unsigned long x2; > + unsigned long x3; > + unsigned long x4; > + unsigned long x5; > +} TraceRecord; > + > +enum { > + TRACE_BUF_LEN = 64 * 1024 / sizeof(TraceRecord), > +}; > + > +static TraceRecord trace_buf[TRACE_BUF_LEN]; > +static unsigned int trace_idx; > +static FILE *trace_fp; > + > +static void trace(TraceEvent event, unsigned long x1, > + unsigned long x2, unsigned long x3, > + unsigned long x4, unsigned long x5) { > + TraceRecord *rec =&trace_buf[trace_idx]; > + rec->event = event; > + rec->x1 = x1; > + rec->x2 = x2; > + rec->x3 = x3; > + rec->x4 = x4; > + rec->x5 = x5; > + > + if (++trace_idx == TRACE_BUF_LEN) { > + trace_idx = 0; > + > + if (!trace_fp) { > + trace_fp = fopen("/tmp/trace.log", "w"); > + } > + if (trace_fp) { > + size_t result = fwrite(trace_buf, sizeof trace_buf, 1, trace_fp); > + result = result; > + } > + } > +} > It is probably worth while to read trace points via the monitor or through some other mechanism. My concern would be that writing even 64k out to disk would introduce enough performance overhead mainly because it runs lock-step with the guest's VCPU. Maybe it's worth adding a thread that syncs the ring to disk if we want to write to disk? > +void trace1(TraceEvent event, unsigned long x1) { > + trace(event, x1, 0, 0, 0, 0); > +} > + > +void trace2(TraceEvent event, unsigned long x1, unsigned long x2) { > + trace(event, x1, x2, 0, 0, 0); > +} > + > +void trace3(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3) { > + trace(event, x1, x2, x3, 0, 0); > +} > + > +void trace4(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4) { > + trace(event, x1, x2, x3, x4, 0); > +} > + > +void trace5(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4, unsigned long x5) { > + trace(event, x1, x2, x3, x4, x5); > +} > diff --git a/trace.h b/trace.h > new file mode 100644 > index 0000000..144aa1e > --- /dev/null > +++ b/trace.h > @@ -0,0 +1,9 @@ > +typedef enum { > + TRACE_MAX > +} TraceEvent; > + > +void trace1(TraceEvent event, unsigned long x1); > +void trace2(TraceEvent event, unsigned long x1, unsigned long x2); > +void trace3(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3); > +void trace4(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4); > +void trace5(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4, unsigned long x5); > Looks good. I think we definitely need something like this. Regards, Anthony Liguori > diff --git a/trace.py b/trace.py > new file mode 100755 > index 0000000..f38ab6b > --- /dev/null > +++ b/trace.py > @@ -0,0 +1,30 @@ > +#!/usr/bin/env python > +import sys > +import struct > + > +trace_fmt = 'LLLLLL' > +trace_len = struct.calcsize(trace_fmt) > + > +events = { > +} > + > +def read_record(fobj): > + s = fobj.read(trace_len) > + if len(s) != trace_len: > + return None > + return struct.unpack(trace_fmt, s) > + > +def format_record(rec): > + event = events[rec[0]] > + fields = [event[0]] > + for i in xrange(1, len(event)): > + fields.append('%s=0x%x' % (event[i], rec[i])) > + return ' '.join(fields) > + > +f = open(sys.argv[1], 'rb') > +while True: > + rec = read_record(f) > + if rec is None: > + break > + > + print format_record(rec) > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support 2010-05-21 12:37 ` [Qemu-devel] " Anthony Liguori @ 2010-05-21 13:46 ` Jan Kiszka 2010-05-21 14:03 ` Anthony Liguori 0 siblings, 1 reply; 18+ messages in thread From: Jan Kiszka @ 2010-05-21 13:46 UTC (permalink / raw) To: Anthony Liguori; +Cc: Prerna Saxena, Stefan Hajnoczi, kvm, qemu-devel Anthony Liguori wrote: > On 05/21/2010 04:42 AM, Stefan Hajnoczi wrote: >> Trace events should be defined in trace.h. Events are written to >> /tmp/trace.log and can be formatted using trace.py. Remember to add >> events to trace.py for pretty-printing. >> >> Signed-off-by: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> >> --- >> Makefile.objs | 2 +- >> trace.c | 64 >> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> trace.h | 9 ++++++++ >> trace.py | 30 ++++++++++++++++++++++++++ >> 4 files changed, 104 insertions(+), 1 deletions(-) >> create mode 100644 trace.c >> create mode 100644 trace.h >> create mode 100755 trace.py >> >> diff --git a/Makefile.objs b/Makefile.objs >> index acbaf22..307e989 100644 >> --- a/Makefile.objs >> +++ b/Makefile.objs >> @@ -8,7 +8,7 @@ qobject-obj-y += qerror.o >> # block-obj-y is code used by both qemu system emulation and qemu-img >> >> block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o >> module.o >> -block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o >> +block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o trace.o >> block-obj-$(CONFIG_POSIX) += posix-aio-compat.o >> block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o >> >> diff --git a/trace.c b/trace.c >> new file mode 100644 >> index 0000000..2fec4d3 >> --- /dev/null >> +++ b/trace.c >> @@ -0,0 +1,64 @@ >> +#include<stdlib.h> >> +#include<stdio.h> >> +#include "trace.h" >> + >> +typedef struct { >> + unsigned long event; >> + unsigned long x1; >> + unsigned long x2; >> + unsigned long x3; >> + unsigned long x4; >> + unsigned long x5; >> +} TraceRecord; >> + >> +enum { >> + TRACE_BUF_LEN = 64 * 1024 / sizeof(TraceRecord), >> +}; >> + >> +static TraceRecord trace_buf[TRACE_BUF_LEN]; >> +static unsigned int trace_idx; >> +static FILE *trace_fp; >> + >> +static void trace(TraceEvent event, unsigned long x1, >> + unsigned long x2, unsigned long x3, >> + unsigned long x4, unsigned long x5) { >> + TraceRecord *rec =&trace_buf[trace_idx]; >> + rec->event = event; >> + rec->x1 = x1; >> + rec->x2 = x2; >> + rec->x3 = x3; >> + rec->x4 = x4; >> + rec->x5 = x5; >> + >> + if (++trace_idx == TRACE_BUF_LEN) { >> + trace_idx = 0; >> + >> + if (!trace_fp) { >> + trace_fp = fopen("/tmp/trace.log", "w"); >> + } >> + if (trace_fp) { >> + size_t result = fwrite(trace_buf, sizeof trace_buf, 1, >> trace_fp); >> + result = result; >> + } >> + } >> +} >> > > It is probably worth while to read trace points via the monitor or > through some other mechanism. My concern would be that writing even 64k > out to disk would introduce enough performance overhead mainly because > it runs lock-step with the guest's VCPU. > > Maybe it's worth adding a thread that syncs the ring to disk if we want > to write to disk? That's not what QEMU should worry about. If somehow possible, let's push this into the hands of a (user space) tracing framework, ideally one that is already designed for such requirements. E.g. there exists quite useful work in the context of LTTng (user space RCU for application tracing). We may need simple stubs for the case that no such framework is (yet) available. But effort should focus on a QEMU infrastructure to add useful tracepoints to the code. Specifically when tracing over KVM, you usually need information about kernel states as well, so you depend on an integrated approach, not Yet Another Log File. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support 2010-05-21 13:46 ` Jan Kiszka @ 2010-05-21 14:03 ` Anthony Liguori 2010-05-21 16:52 ` Jan Kiszka 0 siblings, 1 reply; 18+ messages in thread From: Anthony Liguori @ 2010-05-21 14:03 UTC (permalink / raw) To: Jan Kiszka; +Cc: Prerna Saxena, Stefan Hajnoczi, kvm, qemu-devel On 05/21/2010 08:46 AM, Jan Kiszka wrote: > Anthony Liguori wrote: > >> On 05/21/2010 04:42 AM, Stefan Hajnoczi wrote: >> >>> Trace events should be defined in trace.h. Events are written to >>> /tmp/trace.log and can be formatted using trace.py. Remember to add >>> events to trace.py for pretty-printing. >>> >>> Signed-off-by: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> >>> --- >>> Makefile.objs | 2 +- >>> trace.c | 64 >>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >>> trace.h | 9 ++++++++ >>> trace.py | 30 ++++++++++++++++++++++++++ >>> 4 files changed, 104 insertions(+), 1 deletions(-) >>> create mode 100644 trace.c >>> create mode 100644 trace.h >>> create mode 100755 trace.py >>> >>> diff --git a/Makefile.objs b/Makefile.objs >>> index acbaf22..307e989 100644 >>> --- a/Makefile.objs >>> +++ b/Makefile.objs >>> @@ -8,7 +8,7 @@ qobject-obj-y += qerror.o >>> # block-obj-y is code used by both qemu system emulation and qemu-img >>> >>> block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o >>> module.o >>> -block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o >>> +block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o trace.o >>> block-obj-$(CONFIG_POSIX) += posix-aio-compat.o >>> block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o >>> >>> diff --git a/trace.c b/trace.c >>> new file mode 100644 >>> index 0000000..2fec4d3 >>> --- /dev/null >>> +++ b/trace.c >>> @@ -0,0 +1,64 @@ >>> +#include<stdlib.h> >>> +#include<stdio.h> >>> +#include "trace.h" >>> + >>> +typedef struct { >>> + unsigned long event; >>> + unsigned long x1; >>> + unsigned long x2; >>> + unsigned long x3; >>> + unsigned long x4; >>> + unsigned long x5; >>> +} TraceRecord; >>> + >>> +enum { >>> + TRACE_BUF_LEN = 64 * 1024 / sizeof(TraceRecord), >>> +}; >>> + >>> +static TraceRecord trace_buf[TRACE_BUF_LEN]; >>> +static unsigned int trace_idx; >>> +static FILE *trace_fp; >>> + >>> +static void trace(TraceEvent event, unsigned long x1, >>> + unsigned long x2, unsigned long x3, >>> + unsigned long x4, unsigned long x5) { >>> + TraceRecord *rec =&trace_buf[trace_idx]; >>> + rec->event = event; >>> + rec->x1 = x1; >>> + rec->x2 = x2; >>> + rec->x3 = x3; >>> + rec->x4 = x4; >>> + rec->x5 = x5; >>> + >>> + if (++trace_idx == TRACE_BUF_LEN) { >>> + trace_idx = 0; >>> + >>> + if (!trace_fp) { >>> + trace_fp = fopen("/tmp/trace.log", "w"); >>> + } >>> + if (trace_fp) { >>> + size_t result = fwrite(trace_buf, sizeof trace_buf, 1, >>> trace_fp); >>> + result = result; >>> + } >>> + } >>> +} >>> >>> >> It is probably worth while to read trace points via the monitor or >> through some other mechanism. My concern would be that writing even 64k >> out to disk would introduce enough performance overhead mainly because >> it runs lock-step with the guest's VCPU. >> >> Maybe it's worth adding a thread that syncs the ring to disk if we want >> to write to disk? >> > That's not what QEMU should worry about. If somehow possible, let's push > this into the hands of a (user space) tracing framework, ideally one > that is already designed for such requirements. E.g. there exists quite > useful work in the context of LTTng (user space RCU for application > tracing). > From what I understand, none of the current kernel approaches to userspace tracing have much momentum at the moment. > We may need simple stubs for the case that no such framework is (yet) > available. But effort should focus on a QEMU infrastructure to add > useful tracepoints to the code. Specifically when tracing over KVM, you > usually need information about kernel states as well, so you depend on > an integrated approach, not Yet Another Log File. > I think the simple code that Stefan pasted gives us 95% of what we need. Regards, Anthony Liguori > Jan > > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support 2010-05-21 14:03 ` Anthony Liguori @ 2010-05-21 16:52 ` Jan Kiszka 2010-05-21 20:49 ` Stefan Hajnoczi 2010-05-21 21:06 ` Anthony Liguori 0 siblings, 2 replies; 18+ messages in thread From: Jan Kiszka @ 2010-05-21 16:52 UTC (permalink / raw) To: Anthony Liguori Cc: Prerna Saxena, Stefan Hajnoczi, kvm@vger.kernel.org, qemu-devel@nongnu.org Anthony Liguori wrote: > On 05/21/2010 08:46 AM, Jan Kiszka wrote: >> Anthony Liguori wrote: >> >>> On 05/21/2010 04:42 AM, Stefan Hajnoczi wrote: >>> >>>> Trace events should be defined in trace.h. Events are written to >>>> /tmp/trace.log and can be formatted using trace.py. Remember to add >>>> events to trace.py for pretty-printing. >>>> >>>> Signed-off-by: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> >>>> --- >>>> Makefile.objs | 2 +- >>>> trace.c | 64 >>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >>>> trace.h | 9 ++++++++ >>>> trace.py | 30 ++++++++++++++++++++++++++ >>>> 4 files changed, 104 insertions(+), 1 deletions(-) >>>> create mode 100644 trace.c >>>> create mode 100644 trace.h >>>> create mode 100755 trace.py >>>> >>>> diff --git a/Makefile.objs b/Makefile.objs >>>> index acbaf22..307e989 100644 >>>> --- a/Makefile.objs >>>> +++ b/Makefile.objs >>>> @@ -8,7 +8,7 @@ qobject-obj-y += qerror.o >>>> # block-obj-y is code used by both qemu system emulation and qemu-img >>>> >>>> block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o >>>> module.o >>>> -block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o >>>> +block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o trace.o >>>> block-obj-$(CONFIG_POSIX) += posix-aio-compat.o >>>> block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o >>>> >>>> diff --git a/trace.c b/trace.c >>>> new file mode 100644 >>>> index 0000000..2fec4d3 >>>> --- /dev/null >>>> +++ b/trace.c >>>> @@ -0,0 +1,64 @@ >>>> +#include<stdlib.h> >>>> +#include<stdio.h> >>>> +#include "trace.h" >>>> + >>>> +typedef struct { >>>> + unsigned long event; >>>> + unsigned long x1; >>>> + unsigned long x2; >>>> + unsigned long x3; >>>> + unsigned long x4; >>>> + unsigned long x5; >>>> +} TraceRecord; >>>> + >>>> +enum { >>>> + TRACE_BUF_LEN = 64 * 1024 / sizeof(TraceRecord), >>>> +}; >>>> + >>>> +static TraceRecord trace_buf[TRACE_BUF_LEN]; >>>> +static unsigned int trace_idx; >>>> +static FILE *trace_fp; >>>> + >>>> +static void trace(TraceEvent event, unsigned long x1, >>>> + unsigned long x2, unsigned long x3, >>>> + unsigned long x4, unsigned long x5) { >>>> + TraceRecord *rec =&trace_buf[trace_idx]; >>>> + rec->event = event; >>>> + rec->x1 = x1; >>>> + rec->x2 = x2; >>>> + rec->x3 = x3; >>>> + rec->x4 = x4; >>>> + rec->x5 = x5; >>>> + >>>> + if (++trace_idx == TRACE_BUF_LEN) { >>>> + trace_idx = 0; >>>> + >>>> + if (!trace_fp) { >>>> + trace_fp = fopen("/tmp/trace.log", "w"); >>>> + } >>>> + if (trace_fp) { >>>> + size_t result = fwrite(trace_buf, sizeof trace_buf, 1, >>>> trace_fp); >>>> + result = result; >>>> + } >>>> + } >>>> +} >>>> >>>> >>> It is probably worth while to read trace points via the monitor or >>> through some other mechanism. My concern would be that writing even 64k >>> out to disk would introduce enough performance overhead mainly because >>> it runs lock-step with the guest's VCPU. >>> >>> Maybe it's worth adding a thread that syncs the ring to disk if we want >>> to write to disk? >>> >> That's not what QEMU should worry about. If somehow possible, let's push >> this into the hands of a (user space) tracing framework, ideally one >> that is already designed for such requirements. E.g. there exists quite >> useful work in the context of LTTng (user space RCU for application >> tracing). >> > > From what I understand, none of the current kernel approaches to > userspace tracing have much momentum at the moment. > >> We may need simple stubs for the case that no such framework is (yet) >> available. But effort should focus on a QEMU infrastructure to add >> useful tracepoints to the code. Specifically when tracing over KVM, you >> usually need information about kernel states as well, so you depend on >> an integrated approach, not Yet Another Log File. >> > > I think the simple code that Stefan pasted gives us 95% of what we need. IMHO not 95%, but it is a start. I would just like to avoid that too much efforts are spent on re-inventing smart trace buffers, trace daemons, or trace visualization tools. Then better pick up some semi-perfect approach (e.g. [1], it unfortunately still seems to lack kernel integration) and drive it according to our needs. Jan [1] http://lttng.org/ust -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support 2010-05-21 16:52 ` Jan Kiszka @ 2010-05-21 20:49 ` Stefan Hajnoczi 2010-05-21 21:26 ` Christoph Hellwig 2010-05-21 21:37 ` Jan Kiszka 2010-05-21 21:06 ` Anthony Liguori 1 sibling, 2 replies; 18+ messages in thread From: Stefan Hajnoczi @ 2010-05-21 20:49 UTC (permalink / raw) To: Jan Kiszka Cc: kvm@vger.kernel.org, Prerna Saxena, Stefan Hajnoczi, qemu-devel@nongnu.org On Fri, May 21, 2010 at 5:52 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote: > I would just like to avoid that too much efforts are spent on > re-inventing smart trace buffers, trace daemons, or trace visualization > tools. Then better pick up some semi-perfect approach (e.g. [1], it > unfortunately still seems to lack kernel integration) and drive it > according to our needs. I agree we have to consider existing solutions. The killer is the usability: what dependencies are required to build with tracing? Is a patched kernel or module required? How easy is it to add static trace events during debugging? If there are too many dependencies, especially to unpackaged software, many people will stop right there and not bother. A patched kernel or module isn't acceptable since the hassle of reconfiguring a system for tracing becomes too great (or in some cases changing the kernel is not possible/allowed). Adding new static trace events should be easy, too. Ideally it doesn't require adding information about the trace event in multiple places (header files, C files, etc). It also shouldn't require learning about the tracing system, adding a trace event should be self-explanatory so anyone can easily add one for debugging. A lot of opinions there, but what I'm saying is that friction must be low. If the tracing system is a pain to use, then no-one will use it. http://lttng.org/files/ust/manual/ust.html LTTng Userspace Tracer looks interesting - no kernel support required AFAICT. Toggling trace events in a running process supported. Similar to kernel tracepoint.h and existing report/visualization tool. x86 (32- and 64-bit) only. Like you say, no correlation with kernel trace data. I'll try to give LTTng UST a spin by converting my trace events to use UST. This seems closest to an existing tracing system we can drop in. http://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps Requires kernel support - not sure if enough of utrace is in mainline for this to work out-of-the-box across distros. Unclear how exactly SystemTap userspace probing would work out. Does anyone have experience or want to try this? Stefan ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support 2010-05-21 20:49 ` Stefan Hajnoczi @ 2010-05-21 21:26 ` Christoph Hellwig 2010-05-21 21:37 ` Jan Kiszka 1 sibling, 0 replies; 18+ messages in thread From: Christoph Hellwig @ 2010-05-21 21:26 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Stefan Hajnoczi, kvm@vger.kernel.org, Jan Kiszka, qemu-devel@nongnu.org, Prerna Saxena On Fri, May 21, 2010 at 09:49:56PM +0100, Stefan Hajnoczi wrote: > http://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps > > Requires kernel support - not sure if enough of utrace is in mainline > for this to work out-of-the-box across distros. Nothing of utrace is in mainline, nevermind the whole systemtap code which is intentionally keep out of the kernel tree. Using this means that for every probe in userspace code you need to keep the configured source tree of the currently running kernel around, which is completely unusable for typical developer setups. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support 2010-05-21 20:49 ` Stefan Hajnoczi 2010-05-21 21:26 ` Christoph Hellwig @ 2010-05-21 21:37 ` Jan Kiszka 1 sibling, 0 replies; 18+ messages in thread From: Jan Kiszka @ 2010-05-21 21:37 UTC (permalink / raw) To: Stefan Hajnoczi Cc: kvm@vger.kernel.org, Prerna Saxena, Stefan Hajnoczi, qemu-devel@nongnu.org [-- Attachment #1: Type: text/plain, Size: 2446 bytes --] Stefan Hajnoczi wrote: > On Fri, May 21, 2010 at 5:52 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote: >> I would just like to avoid that too much efforts are spent on >> re-inventing smart trace buffers, trace daemons, or trace visualization >> tools. Then better pick up some semi-perfect approach (e.g. [1], it >> unfortunately still seems to lack kernel integration) and drive it >> according to our needs. > > I agree we have to consider existing solutions. The killer is the > usability: what dependencies are required to build with tracing? Is a > patched kernel or module required? How easy is it to add static trace > events during debugging? > > If there are too many dependencies, especially to unpackaged software, > many people will stop right there and not bother. A patched kernel or > module isn't acceptable since the hassle of reconfiguring a system for > tracing becomes too great (or in some cases changing the kernel is not > possible/allowed). > > Adding new static trace events should be easy, too. Ideally it > doesn't require adding information about the trace event in multiple > places (header files, C files, etc). It also shouldn't require > learning about the tracing system, adding a trace event should be > self-explanatory so anyone can easily add one for debugging. > > A lot of opinions there, but what I'm saying is that friction must be > low. If the tracing system is a pain to use, then no-one will use it. No question. I mentioned LTTng as it is most promising /wrt performance (both when enabled and disabled). But LTTng was so far not best in class when it came to usability. > > http://lttng.org/files/ust/manual/ust.html > > LTTng Userspace Tracer looks interesting - no kernel support required > AFAICT. Toggling trace events in a running process supported. > Similar to kernel tracepoint.h and existing report/visualization tool. > > x86 (32- and 64-bit) only. Sure? I thought there might be an arch dependency due to urcu but it has generic support as well now. > Like you say, no correlation with kernel trace data. It would be good if we could still hook into trancepoints and stream them out differently. That would allow for add-hoc tracing when performance does not matter that much (trace to file, trace to kernel). But we would still benefit from enabling tracepoints during runtime and keeping them built in. Jan [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 257 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support 2010-05-21 16:52 ` Jan Kiszka 2010-05-21 20:49 ` Stefan Hajnoczi @ 2010-05-21 21:06 ` Anthony Liguori 2010-05-21 21:41 ` Jan Kiszka 1 sibling, 1 reply; 18+ messages in thread From: Anthony Liguori @ 2010-05-21 21:06 UTC (permalink / raw) To: Jan Kiszka Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, Stefan Hajnoczi, Prerna Saxena On 05/21/2010 11:52 AM, Jan Kiszka wrote: > Anthony Liguori wrote: > >> On 05/21/2010 08:46 AM, Jan Kiszka wrote: >> >>> Anthony Liguori wrote: >>> >>> >>>> On 05/21/2010 04:42 AM, Stefan Hajnoczi wrote: >>>> >>>> >>>>> Trace events should be defined in trace.h. Events are written to >>>>> /tmp/trace.log and can be formatted using trace.py. Remember to add >>>>> events to trace.py for pretty-printing. >>>>> >>>>> Signed-off-by: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> >>>>> --- >>>>> Makefile.objs | 2 +- >>>>> trace.c | 64 >>>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >>>>> trace.h | 9 ++++++++ >>>>> trace.py | 30 ++++++++++++++++++++++++++ >>>>> 4 files changed, 104 insertions(+), 1 deletions(-) >>>>> create mode 100644 trace.c >>>>> create mode 100644 trace.h >>>>> create mode 100755 trace.py >>>>> >>>>> diff --git a/Makefile.objs b/Makefile.objs >>>>> index acbaf22..307e989 100644 >>>>> --- a/Makefile.objs >>>>> +++ b/Makefile.objs >>>>> @@ -8,7 +8,7 @@ qobject-obj-y += qerror.o >>>>> # block-obj-y is code used by both qemu system emulation and qemu-img >>>>> >>>>> block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o >>>>> module.o >>>>> -block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o >>>>> +block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o trace.o >>>>> block-obj-$(CONFIG_POSIX) += posix-aio-compat.o >>>>> block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o >>>>> >>>>> diff --git a/trace.c b/trace.c >>>>> new file mode 100644 >>>>> index 0000000..2fec4d3 >>>>> --- /dev/null >>>>> +++ b/trace.c >>>>> @@ -0,0 +1,64 @@ >>>>> +#include<stdlib.h> >>>>> +#include<stdio.h> >>>>> +#include "trace.h" >>>>> + >>>>> +typedef struct { >>>>> + unsigned long event; >>>>> + unsigned long x1; >>>>> + unsigned long x2; >>>>> + unsigned long x3; >>>>> + unsigned long x4; >>>>> + unsigned long x5; >>>>> +} TraceRecord; >>>>> + >>>>> +enum { >>>>> + TRACE_BUF_LEN = 64 * 1024 / sizeof(TraceRecord), >>>>> +}; >>>>> + >>>>> +static TraceRecord trace_buf[TRACE_BUF_LEN]; >>>>> +static unsigned int trace_idx; >>>>> +static FILE *trace_fp; >>>>> + >>>>> +static void trace(TraceEvent event, unsigned long x1, >>>>> + unsigned long x2, unsigned long x3, >>>>> + unsigned long x4, unsigned long x5) { >>>>> + TraceRecord *rec =&trace_buf[trace_idx]; >>>>> + rec->event = event; >>>>> + rec->x1 = x1; >>>>> + rec->x2 = x2; >>>>> + rec->x3 = x3; >>>>> + rec->x4 = x4; >>>>> + rec->x5 = x5; >>>>> + >>>>> + if (++trace_idx == TRACE_BUF_LEN) { >>>>> + trace_idx = 0; >>>>> + >>>>> + if (!trace_fp) { >>>>> + trace_fp = fopen("/tmp/trace.log", "w"); >>>>> + } >>>>> + if (trace_fp) { >>>>> + size_t result = fwrite(trace_buf, sizeof trace_buf, 1, >>>>> trace_fp); >>>>> + result = result; >>>>> + } >>>>> + } >>>>> +} >>>>> >>>>> >>>>> >>>> It is probably worth while to read trace points via the monitor or >>>> through some other mechanism. My concern would be that writing even 64k >>>> out to disk would introduce enough performance overhead mainly because >>>> it runs lock-step with the guest's VCPU. >>>> >>>> Maybe it's worth adding a thread that syncs the ring to disk if we want >>>> to write to disk? >>>> >>>> >>> That's not what QEMU should worry about. If somehow possible, let's push >>> this into the hands of a (user space) tracing framework, ideally one >>> that is already designed for such requirements. E.g. there exists quite >>> useful work in the context of LTTng (user space RCU for application >>> tracing). >>> >>> >> From what I understand, none of the current kernel approaches to >> userspace tracing have much momentum at the moment. >> >> >>> We may need simple stubs for the case that no such framework is (yet) >>> available. But effort should focus on a QEMU infrastructure to add >>> useful tracepoints to the code. Specifically when tracing over KVM, you >>> usually need information about kernel states as well, so you depend on >>> an integrated approach, not Yet Another Log File. >>> >>> >> I think the simple code that Stefan pasted gives us 95% of what we need. >> > IMHO not 95%, but it is a start. > I'm not opposed to using a framework, but I'd rather have an equivalent to kvm_stat tomorrow than wait 3 years for LTTng to not get merged. So let's have a dirt-simple tracing mechanism and focus on adding useful trace points. Then when we have a framework we can use, we can just convert the tracepoints to the new framework. Regards, Anthony Liguori > I would just like to avoid that too much efforts are spent on > re-inventing smart trace buffers, trace daemons, or trace visualization > tools. Then better pick up some semi-perfect approach (e.g. [1], it > unfortunately still seems to lack kernel integration) and drive it > according to our needs. > > Jan > > [1] http://lttng.org/ust > > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support 2010-05-21 21:06 ` Anthony Liguori @ 2010-05-21 21:41 ` Jan Kiszka 2010-05-21 21:58 ` Anthony Liguori 0 siblings, 1 reply; 18+ messages in thread From: Jan Kiszka @ 2010-05-21 21:41 UTC (permalink / raw) To: Anthony Liguori Cc: qemu-devel@nongnu.org, Stefan Hajnoczi, kvm@vger.kernel.org, Prerna Saxena [-- Attachment #1: Type: text/plain, Size: 487 bytes --] Anthony Liguori wrote: > I'm not opposed to using a framework, but I'd rather have an equivalent > to kvm_stat tomorrow than wait 3 years for LTTng to not get merged. > > So let's have a dirt-simple tracing mechanism and focus on adding useful > trace points. Then when we have a framework we can use, we can just > convert the tracepoints to the new framework. That could mean serializing the tracepoints to strings and dumping them to our log file - no concerns. Jan [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 257 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support 2010-05-21 21:41 ` Jan Kiszka @ 2010-05-21 21:58 ` Anthony Liguori 0 siblings, 0 replies; 18+ messages in thread From: Anthony Liguori @ 2010-05-21 21:58 UTC (permalink / raw) To: Jan Kiszka Cc: qemu-devel@nongnu.org, Stefan Hajnoczi, kvm@vger.kernel.org, Prerna Saxena On 05/21/2010 04:41 PM, Jan Kiszka wrote: > Anthony Liguori wrote: > >> I'm not opposed to using a framework, but I'd rather have an equivalent >> to kvm_stat tomorrow than wait 3 years for LTTng to not get merged. >> >> So let's have a dirt-simple tracing mechanism and focus on adding useful >> trace points. Then when we have a framework we can use, we can just >> convert the tracepoints to the new framework. >> > That could mean serializing the tracepoints to strings and dumping them > to our log file - no concerns. > Which I really don't mind. Regards, Anthony Liguori > Jan > > ^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH 2/2] trace: Trace write requests in virtio-blk, multiwrite, and paio_submit 2010-05-21 9:42 [Qemu-devel] [RFC 0/2] Tracing Stefan Hajnoczi 2010-05-21 9:42 ` [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support Stefan Hajnoczi @ 2010-05-21 9:42 ` Stefan Hajnoczi 2010-05-21 11:27 ` [Qemu-devel] [RFC 0/2] Tracing Prerna Saxena 2 siblings, 0 replies; 18+ messages in thread From: Stefan Hajnoczi @ 2010-05-21 9:42 UTC (permalink / raw) To: qemu-devel, kvm; +Cc: Anthony Liguori, Stefan Hajnoczi, Prerna Saxena Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> --- block.c | 7 +++++++ hw/virtio-blk.c | 6 ++++++ posix-aio-compat.c | 2 ++ trace.h | 42 +++++++++++++++++++++++++++++++++++++++++- trace.py | 8 ++++++++ 5 files changed, 64 insertions(+), 1 deletions(-) diff --git a/block.c b/block.c index bfe46e3..a7fb040 100644 --- a/block.c +++ b/block.c @@ -27,6 +27,7 @@ #include "block_int.h" #include "module.h" #include "qemu-objects.h" +#include "trace.h" #ifdef CONFIG_BSD #include <sys/types.h> @@ -1913,6 +1914,8 @@ static void multiwrite_cb(void *opaque, int ret) { MultiwriteCB *mcb = opaque; + trace_multiwrite_cb(mcb, ret); + if (ret < 0 && !mcb->error) { mcb->error = ret; multiwrite_user_cb(mcb); @@ -2044,6 +2047,8 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) // Check for mergable requests num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb); + trace_bdrv_aio_multiwrite(mcb, mcb->num_callbacks, num_reqs); + // Run the aio requests for (i = 0; i < num_reqs; i++) { acb = bdrv_aio_writev(bs, reqs[i].sector, reqs[i].qiov, @@ -2054,9 +2059,11 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) // submitted yet. Otherwise we'll wait for the submitted AIOs to // complete and report the error in the callback. if (mcb->num_requests == 0) { + trace_bdrv_aio_multiwrite_earlyfail(mcb); reqs[i].error = -EIO; goto fail; } else { + trace_bdrv_aio_multiwrite_latefail(mcb, i); mcb->num_requests++; multiwrite_cb(mcb, -EIO); break; diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c index b05d15e..73b873e 100644 --- a/hw/virtio-blk.c +++ b/hw/virtio-blk.c @@ -13,6 +13,7 @@ #include <qemu-common.h> #include <sysemu.h> +#include "trace.h" #include "virtio-blk.h" #include "block_int.h" #ifdef __linux__ @@ -50,6 +51,8 @@ static void virtio_blk_req_complete(VirtIOBlockReq *req, int status) { VirtIOBlock *s = req->dev; + trace_virtio_blk_req_complete(req, status); + req->in->status = status; virtqueue_push(s->vq, &req->elem, req->qiov.size + sizeof(*req->in)); virtio_notify(&s->vdev, s->vq); @@ -87,6 +90,8 @@ static void virtio_blk_rw_complete(void *opaque, int ret) { VirtIOBlockReq *req = opaque; + trace_virtio_blk_rw_complete(req, ret); + if (ret) { int is_read = !(req->out->type & VIRTIO_BLK_T_OUT); if (virtio_blk_handle_rw_error(req, -ret, is_read)) @@ -270,6 +275,7 @@ static void virtio_blk_handle_write(BlockRequest *blkreq, int *num_writes, blkreq[*num_writes].cb = virtio_blk_rw_complete; blkreq[*num_writes].opaque = req; blkreq[*num_writes].error = 0; + trace_virtio_blk_handle_write(req, req->out->sector, req->qiov.size / 512); (*num_writes)++; } diff --git a/posix-aio-compat.c b/posix-aio-compat.c index b43c531..57d83f0 100644 --- a/posix-aio-compat.c +++ b/posix-aio-compat.c @@ -23,6 +23,7 @@ #include <stdio.h> #include "qemu-queue.h" +#include "trace.h" #include "osdep.h" #include "qemu-common.h" #include "block_int.h" @@ -583,6 +584,7 @@ BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd, acb->next = posix_aio_state->first_aio; posix_aio_state->first_aio = acb; + trace_paio_submit(acb, opaque, sector_num, nb_sectors, type); qemu_paio_submit(acb); return &acb->common; } diff --git a/trace.h b/trace.h index 144aa1e..3c4564f 100644 --- a/trace.h +++ b/trace.h @@ -1,5 +1,12 @@ typedef enum { - TRACE_MAX + TRACE_MULTIWRITE_CB, + TRACE_BDRV_AIO_MULTIWRITE, + TRACE_BDRV_AIO_MULTIWRITE_EARLYFAIL, + TRACE_BDRV_AIO_MULTIWRITE_LATEFAIL, + TRACE_VIRTIO_BLK_REQ_COMPLETE, + TRACE_VIRTIO_BLK_RW_COMPLETE, + TRACE_VIRTIO_BLK_HANDLE_WRITE, + TRACE_PAIO_SUBMIT, } TraceEvent; void trace1(TraceEvent event, unsigned long x1); @@ -7,3 +14,36 @@ void trace2(TraceEvent event, unsigned long x1, unsigned long x2); void trace3(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3); void trace4(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4); void trace5(TraceEvent event, unsigned long x1, unsigned long x2, unsigned long x3, unsigned long x4, unsigned long x5); + +static inline void trace_multiwrite_cb(void *mcb, int ret) { + trace2(TRACE_MULTIWRITE_CB, (unsigned long)mcb, ret); +} + +static inline void trace_bdrv_aio_multiwrite(void *mcb, int num_callbacks, int num_reqs) { + trace3(TRACE_BDRV_AIO_MULTIWRITE, (unsigned long)mcb, num_callbacks, num_reqs); +} + +static inline void trace_bdrv_aio_multiwrite_earlyfail(void *mcb) { + trace1(TRACE_BDRV_AIO_MULTIWRITE_EARLYFAIL, (unsigned long)mcb); +} + +static inline void trace_bdrv_aio_multiwrite_latefail(void *mcb, int i) { + trace2(TRACE_BDRV_AIO_MULTIWRITE_LATEFAIL, (unsigned long)mcb, i); +} + +static inline void trace_virtio_blk_req_complete(void *req, int status) { + trace2(TRACE_VIRTIO_BLK_REQ_COMPLETE, (unsigned long)req, status); +} + +static inline void trace_virtio_blk_rw_complete(void *req, int ret) { + trace2(TRACE_VIRTIO_BLK_RW_COMPLETE, (unsigned long)req, ret); +} + +static inline void trace_virtio_blk_handle_write(void *req, unsigned long sector, unsigned long nsectors) { + trace3(TRACE_VIRTIO_BLK_HANDLE_WRITE, (unsigned long)req, sector, nsectors); +} + +static inline void trace_paio_submit(void *acb, void *opaque, unsigned long sector_num, unsigned long nb_sectors, unsigned long type) +{ + trace5(TRACE_PAIO_SUBMIT, (unsigned long)acb, (unsigned long)opaque, sector_num, nb_sectors, type); +} diff --git a/trace.py b/trace.py index f38ab6b..ba4476c 100755 --- a/trace.py +++ b/trace.py @@ -6,6 +6,14 @@ trace_fmt = 'LLLLLL' trace_len = struct.calcsize(trace_fmt) events = { + 0: ('multiwrite_cb', 'mcb', 'ret'), + 1: ('bdrv_aio_multiwrite', 'mcb', 'num_callbacks', 'num_reqs'), + 2: ('bdrv_aio_multiwrite_earlyfail', 'mcb'), + 3: ('bdrv_aio_multiwrite_latefail', 'mcb', 'i'), + 4: ('virtio_blk_req_complete', 'req', 'status'), + 5: ('virtio_blk_rw_complete', 'req', 'ret'), + 6: ('virtio_blk_handle_write', 'req', 'sector', 'nsectors'), + 7: ('paio_submit', 'acb', 'opaque', 'sector_num', 'nb_sectors', 'type'), } def read_record(fobj): -- 1.7.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [RFC 0/2] Tracing 2010-05-21 9:42 [Qemu-devel] [RFC 0/2] Tracing Stefan Hajnoczi 2010-05-21 9:42 ` [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support Stefan Hajnoczi 2010-05-21 9:42 ` [Qemu-devel] [PATCH 2/2] trace: Trace write requests in virtio-blk, multiwrite, and paio_submit Stefan Hajnoczi @ 2010-05-21 11:27 ` Prerna Saxena 2 siblings, 0 replies; 18+ messages in thread From: Prerna Saxena @ 2010-05-21 11:27 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: Anthony Liguori, qemu-devel, kvm Hi Stefan, Nice to see the patchset. I am working on something similar, on the lines of static trace events for QEMU, that collect traces in a qemu-internal buffer. This would employ monitor commands to read traces, as well as enable/disable trace events at runtime. I plan to post a prototype early next week. On 05/21/2010 03:12 PM, Stefan Hajnoczi wrote: > Trace events in QEMU/KVM can be very useful for debugging and performance > analysis. I'd like to discuss tracing support and hope others have an interest > in this feature, too. > > Following this email are patches I am using to debug virtio-blk and storage. > The patches provide trivial tracing support, but they don't address the details > of real tracing tools: enabling/disabling events at runtime, no overhead for > disabled events, multithreading support, etc. > > It would be nice to have userland tracing facilities that work out-of-the-box > on production systems. Unfortunately, I'm not aware of any such facilities out > there right now on Linux. Perhaps SystemTap userspace tracing is the way to > go, has anyone tried it with KVM? > > For the medium term, without userspace tracing facilities in the OS we could > put something into QEMU to address the need for tracing. Here are my thoughts > on fleshing out the tracing patch I have posted: > > 1. Make it possible to enable/disable events at runtime. Users enable only the > events they are interested in and aren't flooded with trace data for all > other events. > Agree, my upcoming patchset should address this. > 2. Either make trace events cheap or build without trace events by default. > Disable by default still allows tracing to be used for development but > less for production. > I'm trying to do this too, though quite a lot remains to be improved in my current implementation :-) > 3. Allow events in any execution context (cpu, io, aio emulation threads). > Agree. > 4. Make it easy to add new events. Agree ! I'm trying to provide a unified macro interface like trace events which makes it easy enough to add new events. Regards, -- Prerna Saxena Linux Technology Centre, IBM Systems and Technology Lab, Bangalore, India ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2010-05-21 21:58 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-05-21 9:42 [Qemu-devel] [RFC 0/2] Tracing Stefan Hajnoczi 2010-05-21 9:42 ` [Qemu-devel] [PATCH 1/2] trace: Add simple tracing support Stefan Hajnoczi 2010-05-21 9:49 ` [Qemu-devel] " Stefan Hajnoczi 2010-05-21 11:13 ` Jan Kiszka 2010-05-21 13:10 ` Stefan Hajnoczi 2010-05-21 13:22 ` Jan Kiszka 2010-05-21 12:37 ` [Qemu-devel] " Anthony Liguori 2010-05-21 13:46 ` Jan Kiszka 2010-05-21 14:03 ` Anthony Liguori 2010-05-21 16:52 ` Jan Kiszka 2010-05-21 20:49 ` Stefan Hajnoczi 2010-05-21 21:26 ` Christoph Hellwig 2010-05-21 21:37 ` Jan Kiszka 2010-05-21 21:06 ` Anthony Liguori 2010-05-21 21:41 ` Jan Kiszka 2010-05-21 21:58 ` Anthony Liguori 2010-05-21 9:42 ` [Qemu-devel] [PATCH 2/2] trace: Trace write requests in virtio-blk, multiwrite, and paio_submit Stefan Hajnoczi 2010-05-21 11:27 ` [Qemu-devel] [RFC 0/2] Tracing Prerna Saxena
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).