* RE: Hi,something about the xentrace tool
@ 2006-06-15 7:06 Ian Pratt
2006-06-15 8:58 ` [Xen-devel] " rickey berkeley
2006-06-15 16:41 ` Rob Gardner
0 siblings, 2 replies; 22+ messages in thread
From: Ian Pratt @ 2006-06-15 7:06 UTC (permalink / raw)
To: Rob Gardner, NAHieu
Cc: xen-tools, xen-devel, rickey berkeley, ryanh, xen-users
> If overflow occurs, it is not handled. The mechanism I implemented was
> just designed to drastically reduce the probability of overflow.
It does count the number of lost trace messages and add a trace message
to that effect though, right?
Thanks,
Ian
> Currently, the trace buffer "high water" mark is set to 50%. That is,
> when the hypervisor trace buffer becomes 1/2 full, it sends a soft
> interrupt to wake up xenbaked from its blocking select(). If nobody
> wakes up to read trace records from the trace buffer, I take that to
> mean that nobody cares about the trace records. When somebody does care,
> they will read those records in a timely manner. Obviously, the
> hypervisor cannot "block" if there is no room in the trace buffers; in
> this case, new trace records simply overwrite old ones, and the old ones
> are lost.
>
> If you encounter a situation where trace records are being generated too
> fast, and fill up the trace buffer too quickly, then the simple next
> step is to increase the size of the trace buffers. So far, use of the
> trace records has not been linked to anything so critical that it's
> necessary to take extraordinary measures to avoid loss of data.
>
> Rob
>
>
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Xen-devel] Hi,something about the xentrace tool
2006-06-15 7:06 Hi,something about the xentrace tool Ian Pratt
@ 2006-06-15 8:58 ` rickey berkeley
2006-06-15 17:03 ` Rob Gardner
2006-06-15 16:41 ` Rob Gardner
1 sibling, 1 reply; 22+ messages in thread
From: rickey berkeley @ 2006-06-15 8:58 UTC (permalink / raw)
To: Ian Pratt
Cc: xen-tools, xen-devel, NAHieu, Rob Gardner, ryanh, xen-users,
ian.pratt
>
>
> > If you encounter a situation where trace records are being generated
> > fast, and fill up the trace buffer too quickly, then the simple next
> > step is to increase the size of the trace buffers. So far, use of the
> > trace records has not been linked to anything so critical that it's
> > necessary to take extraordinary measures to avoid loss of data.
> >
> > Rob
>
Hi Rob,
xentrace can be used as a performance tracing and debugging tool.
You mean that when transferring large amounts of data from kernel space to
user space, xentrace uses its own mechanism to relay the data and balance
the transfer speed, and that we can enlarge the buffer size if we want to
save more raw tracing data.
So, does this mechanism affect system performance noticeably? As we
know, copying huge amounts of raw data from kernel space to user space
consumes a lot of time and system resources.
How about making use of relayfs? It is a kind of standardized way
of transferring large amounts of data from kernel space to user
space.
Anyway, it is just an idea.
_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-15 8:58 ` [Xen-devel] " rickey berkeley
@ 2006-06-15 17:03 ` Rob Gardner
2006-06-15 18:20 ` George Dunlap
0 siblings, 1 reply; 22+ messages in thread
From: Rob Gardner @ 2006-06-15 17:03 UTC (permalink / raw)
To: rickey berkeley; +Cc: xen-tools, xen-devel, xen-users
rickey berkeley wrote:
>
> So, does this mechanism affect system performance
> noticeably? As we know, copying huge amounts of raw data from kernel
> space to user space consumes a lot of time and system resources.
I wouldn't call the amount of data 'huge'. Even on a very busy system,
where there are thousands of trace records being generated every second,
that's still a pretty small amount of data. (The size of a trace record
is something like 50 or 60 bytes.) Also, the data is not "copied" from
kernel space to user space. There is a shared memory buffer which xen
writes into, and the user app reads out of. Memory read speeds are
currently in the Gb/s range. So to answer your question, I don't think
that this mechanism affects system performance in any significant way.
>
> How about making use of relayfs? It is a kind of standardized way
> of transferring large amounts of data from kernel
> space to user space.
>
If the data were only being transferred between the linux kernel and a
linux app, then I'd say yeah, relayfs sounds like a cool thing to do.
However, the trace records are generated by the xen hypervisor, not the
linux kernel. The hypervisor doesn't have relayfs (or any fs for that
matter), so you're stuck with involving the linux kernel which would
read stuff from a shared hypervisor buffer, then present the data to
userland via relayfs. Doesn't sound like a better solution than what we
have now.
Rob
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-15 17:03 ` Rob Gardner
@ 2006-06-15 18:20 ` George Dunlap
2006-06-15 18:28 ` George Dunlap
2006-06-15 18:53 ` Rob Gardner
0 siblings, 2 replies; 22+ messages in thread
From: George Dunlap @ 2006-06-15 18:20 UTC (permalink / raw)
To: Rob Gardner; +Cc: xen-tools, xen-devel, xen-users, rickey berkeley
On 6/15/06, Rob Gardner <rob.gardner@hp.com> wrote:
> I wouldn't call the amount of data 'huge'. Even on a very busy system,
> where there are thousands of trace records being generated every second,
> that's still a pretty small amount of data. (The size of a trace record
> is something like 50 or 60 bytes.)
For the record, I think the trace record size in the trace buffers is
probably 32 bytes:
struct {
unsigned long long rdtsc; /* 8 */
unsigned long event; /* + 4 = 12 */
unsigned long data[5]; /* + (4 * 5) = 32 */
};
The size on disk from xentrace is 36 bytes (it adds 4 bytes for the cpu).
If someone were really worried about copy time, one could write
something which uses raw disks (or, perhaps, the O_DIRECT flag) to DMA
data straight from the buffers to the disk. But I'm not really
worried about it at this point. :-)
Peace,
-George
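The two size estimates in this exchange can be reconciled with a small sketch. It is only an illustration: the struct names below are made up, fixed-width types stand in for "unsigned long" on each kind of build, and the 64-bit figure assumes a typical LP64 ABI (such as x86-64) where a 64-bit field is 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Record layout as a 32-bit build would see it (unsigned long = 4 bytes).
 * Hypothetical name; the real structure is struct t_rec. */
struct t_rec_32 {
    uint64_t cycles;    /* 8 */
    uint32_t event;     /* + 4 = 12 */
    uint32_t data[5];   /* + (4 * 5) = 32 */
};

/* Record layout as a typical LP64 build would see it (unsigned long = 8 bytes). */
struct t_rec_64 {
    uint64_t cycles;    /* 8 */
    uint32_t event;     /* + 4 = 12, then 4 bytes of padding */
    uint64_t data[5];   /* + (8 * 5) = 56 */
};

/* The 32-bit layout has no padding, so this holds on common ABIs. */
static_assert(sizeof(struct t_rec_32) == 32, "32-bit record should be 32 bytes");

/* In-buffer record size for each layout. */
size_t rec_size_32(void) { return sizeof(struct t_rec_32); }
size_t rec_size_64(void) { return sizeof(struct t_rec_64); }

/* On-disk size for the 32-bit layout: xentrace prepends a 4-byte cpu number. */
size_t disk_size_32(void) { return sizeof(uint32_t) + sizeof(struct t_rec_32); }
```

Under those assumptions the 32-bit record is 32 bytes in the buffer and 36 bytes on disk, while the 64-bit layout comes to 56 bytes, which is in the "50 or 60 bytes" range mentioned earlier.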
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-15 18:20 ` George Dunlap
@ 2006-06-15 18:28 ` George Dunlap
2006-06-15 18:53 ` Rob Gardner
1 sibling, 0 replies; 22+ messages in thread
From: George Dunlap @ 2006-06-15 18:28 UTC (permalink / raw)
To: Rob Gardner; +Cc: xen-devel
On 6/15/06, George Dunlap <dunlapg@umich.edu> wrote:
> For the record, I think the trace record size in the trace buffers is
> probably 32 bytes:
On 32-bit architectures, that is...
(Sorry for the 32-bit provincialism... haven't coded on a 64-bit box yet.)
-G
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-15 18:20 ` George Dunlap
2006-06-15 18:28 ` George Dunlap
@ 2006-06-15 18:53 ` Rob Gardner
2006-06-19 1:06 ` George Dunlap
1 sibling, 1 reply; 22+ messages in thread
From: Rob Gardner @ 2006-06-15 18:53 UTC (permalink / raw)
To: George Dunlap; +Cc: xen-tools, xen-devel, xen-users, rickey berkeley
George Dunlap wrote:
> On 6/15/06, Rob Gardner <rob.gardner@hp.com> wrote:
>> I wouldn't call the amount of data 'huge'. Even on a very busy system,
>> where there are thousands of trace records being generated every second,
>> that's still a pretty small amount of data. (The size of a trace record
>> is something like 50 or 60 bytes.)
>
> For the record, I think the trace record size in the trace buffers is
> probably 32 bytes:
You're right, I was thinking everything is 64 bits these days. In any
case, it's a small amount of data.
> If someone were really worried about copy time, one could write
> something which uses raw disks (or, perhaps, the O_DIRECT flag) to DMA
> data straight from the buffers to the disk.
Once again, there is no explicit copying of the data between kernel and
user space, so nobody should be worried about it.
Rob
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-15 18:53 ` Rob Gardner
@ 2006-06-19 1:06 ` George Dunlap
2006-06-19 5:00 ` Rob Gardner
0 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2006-06-19 1:06 UTC (permalink / raw)
To: Rob Gardner; +Cc: xen-tools, xen-devel, xen-users, rickey berkeley
On 6/15/06, Rob Gardner <rob.gardner@hp.com> wrote:
> > If someone were really worried about copy time, one could write
> > something which uses raw disks (or, perhaps, the O_DIRECT flag) to DMA
> > data straight from the buffers to the disk.
>
> Once again, there is no explicit copying of the data between kernel and
> user space, so nobody should be worried about it.
There's no copying from the HV to the xentrace process. But there is
copying from xentrace to the dom0 kernel for the output file. Some
copying is necessary right now, because rather than writing out the
pages verbatim, xentrace writes out the pcpu before writing out each
record:
void write_rec(unsigned int cpu, struct t_rec *rec, FILE *out)
{
size_t written = 0;
written += fwrite(&cpu, sizeof(cpu), 1, out);
written += fwrite(rec, sizeof(*rec), 1, out);
if ( written != 2 )
{
PERROR("Failed to write trace record");
exit(EXIT_FAILURE);
}
}
If we wanted to make it zero copy all the way from the HV to the disk,
we could have the xentrace process one stream per cpu, and do
whatever's necessary to use DMA. (Does anyone know if O_DIRECT will
do direct DMA, or if one would have to use a raw disk?)
But I think we all seem to agree, this is not a high priority. :-)
-George
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-19 1:06 ` George Dunlap
@ 2006-06-19 5:00 ` Rob Gardner
2006-06-19 14:02 ` George Dunlap
0 siblings, 1 reply; 22+ messages in thread
From: Rob Gardner @ 2006-06-19 5:00 UTC (permalink / raw)
To: George Dunlap; +Cc: xen-tools, xen-devel, xen-users, rickey berkeley
George Dunlap wrote:
>
> There's no copying from the HV to the xentrace process. But there is
> copying from xentrace to the dom0 kernel for the output file. Some
> copying is necessary right now, because rather than writing out the
> pages verbatim, xentrace writes out the pcpu before writing out each
> record:
>
> void write_rec(unsigned int cpu, struct t_rec *rec, FILE *out)
> {
> size_t written = 0;
> written += fwrite(&cpu, sizeof(cpu), 1, out);
> written += fwrite(rec, sizeof(*rec), 1, out);
> if ( written != 2 )
> {
> PERROR("Failed to write trace record");
> exit(EXIT_FAILURE);
> }
> }
>
> If we wanted to make it zero copy all the way from the HV to the disk,
> we could have the xentrace process one stream per cpu, and do
> whatever's necessary to use DMA. (Does anyone know if O_DIRECT will
> do direct DMA, or if one would have to use a raw disk?)
So you're saying if we didn't have to write the cpu number, then we
could bypass stdio, and directly do a write() using the trace buffer?
And this would be better because it would avoid a memory to memory copy,
and use DMA immediately on the trace buffer memory? Do I understand you
correctly? Assuming this is what you mean, allow me to correct a slight
logic flaw. Stdio is there for a reason; Doing lots of raw I/O using
very small buffers is highly inefficient. There's the overhead of kernel
entry/exit and of setting up and tearing down DMA transactions. And
writing to a block device will result in I/O's that are multiples of the
device's block size, so writing a 32-byte trace record will probably
cause a 512-byte block to actually be written to disk. So bypassing
stdio in this case will result in lots more disk accesses, lots more dma
setup/teardown, and lots more system calls. In other words, the
performance is going to be horrible. The Stdio library greatly reduces all
this overhead by buffering stuff in memory until there's enough to make
a genuine I/O relatively efficient. In this case, the memory copies are
intentional and beneficial; We do not want to eliminate them in our
quest for "zero copy".
Rob
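The buffering argument above can be sketched roughly as follows. This is an illustration only, not xentrace code: the struct and function names are hypothetical, and the record layout is the 32-bit one discussed earlier in the thread. The idea is that many small records go into a large stdio buffer, and the C library issues far fewer write() calls than one per record.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a 32-byte trace record. */
struct rec {
    uint64_t cycles;
    uint32_t event;
    uint32_t data[5];
};

/*
 * Write n records through stdio using a caller-supplied buffer.
 * The buffer must stay valid until the stream is closed, and setvbuf()
 * must be called before any other operation on the stream.
 * Returns the number of records written, or 0 on failure.
 */
size_t write_buffered(FILE *out, const struct rec *recs, size_t n,
                      char *buf, size_t buf_size)
{
    assert(out != NULL && recs != NULL);

    /* Fully buffered: data is handed to the kernel in large chunks. */
    if (setvbuf(out, buf, _IOFBF, buf_size) != 0)
        return 0;

    size_t written = fwrite(recs, sizeof(*recs), n, out);

    /* Push whatever is still sitting in the stdio buffer. */
    if (fflush(out) != 0)
        return 0;

    return written;
}
```

With a 64 KiB buffer and 32-byte records, roughly 2048 records fit in the buffer before a flush is needed; the exact number of write() calls depends on the C library.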
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-19 5:00 ` Rob Gardner
@ 2006-06-19 14:02 ` George Dunlap
2006-06-19 17:19 ` Rob Gardner
0 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2006-06-19 14:02 UTC (permalink / raw)
To: Rob Gardner; +Cc: xen-tools, xen-devel
On 6/19/06, Rob Gardner <rob.gardner@hp.com> wrote:
> Stdio is there for a reason; Doing lots of raw I/O using
> very small buffers is highly inefficient.
You misunderstand me. :-) I meant to write out (via DMA) several
pages at a time, straight from the HV trace buffers. The default tbuf
size in xentrace is 20 pages, so if (as the plan is) xentrace would be
notified when it would be half full, we could easily write out 10
pages in one transaction. The tbuf size could be increased if DMA
setup/teardown overhead were an issue on that scale.
You're right, for traces that fit in the file cache, buffering is a
big win. The copy overhead is negligible, writes to disk are more
efficient, and the data will be in the file cache for reading for
subsequent analysis. But for traces that won't fit in the file cache,
the best thing would be to get them to disk with as little copying and
cache-trashing as possible.
Some of my recent traces have been on the order of 10 gigabytes. I
haven't done much to modify xentrace, because I'm not worried about
the trace overhead at this point. But I've had to pull some tricks to
get my analysis tools to run in anything like a reasonable amount of
time.
-George
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-19 14:02 ` George Dunlap
@ 2006-06-19 17:19 ` Rob Gardner
2006-06-21 19:02 ` George Dunlap
0 siblings, 1 reply; 22+ messages in thread
From: Rob Gardner @ 2006-06-19 17:19 UTC (permalink / raw)
To: George Dunlap; +Cc: xen-tools, xen-devel
George Dunlap wrote:
> You misunderstand me. :-) I meant to write out (via DMA) several
> pages at a time, straight from the HV trace buffers. The default tbuf
> size in xentrace is 20 pages, so if (as the plan is) xentrace would be
> notified when it would be half full, we could easily write out 10
> pages in one transaction. The tbuf size could be increased if DMA
> setup/teardown overhead were an issue on that scale.
> ...
> Some of my recent traces have been on the order of 10 gigabytes. I
> haven't done much to modify xentrace, because I'm not worried about
> the trace overhead at this point. But I've had to pull some tricks to
> get my analysis tools to run in anything like a reasonable amount of
> time.
I am glad to discover that I misunderstood you. ;) But I am still having
trouble understanding what the actual problem is, or even if one exists.
If you have a trace that is 10 gigabytes, that's several days (maybe
weeks) worth of trace records, depending on the rate they're generated.
A memory to memory copy of 10 gigabytes will take mere seconds on any
modern machine, and amortized over a few days, I don't see how it's
worth any work to further reduce that or eliminate it. Is the system so
cpu-bound that the loss of a few seconds over several days is that
serious? Even compared to the disk I/O to write out 10 gb, which is
probably several minutes, I don't see how the memory copies are a big
deal. Perhaps kernel buffer cache effects are noticeable, but again at
the data rate you're talking about, the cache will only get completely
purged once every 5 or 10 hours.
If your analysis tools take a long time to run, I'd guess it's because
of the size of the data, not because system resources are being hogged
by xentrace. If you are generating that much data, maybe you should consider
methods to reduce it. Take a look at the trace code
(xen/common/trace.c) and you'll see that there is a facility to mask out
tracing of certain events, classes of events, and cpu's. You might use
this to drastically reduce the number of trace records generated. For
instance, if you are not interested in tracing I/O related events, you
don't want to be storing TRC_MEM records, which account for a large
percentage of the trace records generated on a busy system.
Rob
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-19 17:19 ` Rob Gardner
@ 2006-06-21 19:02 ` George Dunlap
0 siblings, 0 replies; 22+ messages in thread
From: George Dunlap @ 2006-06-21 19:02 UTC (permalink / raw)
To: Rob Gardner; +Cc: xen-tools, xen-devel
On 6/19/06, Rob Gardner <rob.gardner@hp.com> wrote:
> I am glad to discover that I misunderstood you. ;) But I am still having
> trouble understanding what the actual problem is, or even if one exists.
Well, I ran some tests, and no problem exists, yet. Running the following:
# time xentrace -e 0x81000 /tmp/test22-passmark.trace
change evtmask to 0x81000
real 7m15.456s
user 0m0.080s
sys 0m0.050s
# ls -l /tmp/test22-passmark.trace
-rw-r--r-- 1 root root 2654091720 Jun 21 14:49 /tmp/test22-passmark.trace
So although 2.6 gigabytes was generated in 7 minutes, the total time
spent in user and system (if the numbers time reports are accurate) was
only about 0.13 seconds.
The only potential issues would be with cache trashing -- both the
buffer cache (from plain writes to the file), and the cpu caches (from
copying the data). If anyone finds a workload this is a problem for,
we can look at it then.
-George
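For a sense of scale, the figures above can be turned into rates with a little arithmetic. The sketch below assumes 36-byte on-disk records (32-byte record plus 4-byte cpu number, as on a 32-bit build); that record size and the function names are assumptions for illustration, not something the test output states.

```c
#include <assert.h>
#include <stdint.h>

#define TRACE_BYTES   2654091720ULL  /* file size reported by ls -l */
#define DISK_REC_SIZE 36ULL          /* assumed: 32-byte record + 4-byte cpu */

/* Number of whole records in a trace file of the given size. */
uint64_t record_count(uint64_t bytes) { return bytes / DISK_REC_SIZE; }

/* Bytes left over if the size is not a multiple of the record size. */
uint64_t leftover_bytes(uint64_t bytes) { return bytes % DISK_REC_SIZE; }

/* Average rate over the elapsed wall-clock time. */
double per_second(double amount, double seconds)
{
    assert(seconds > 0.0);
    return amount / seconds;
}
```

With these inputs the file holds 73,724,770 records with no bytes left over. Over the 7m15.456s (435.456 s) of wall-clock time that is roughly 169,000 records per second, or about 6.1 MB/s, against about 0.13 s of CPU time.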
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-15 7:06 Hi,something about the xentrace tool Ian Pratt
2006-06-15 8:58 ` [Xen-devel] " rickey berkeley
@ 2006-06-15 16:41 ` Rob Gardner
1 sibling, 0 replies; 22+ messages in thread
From: Rob Gardner @ 2006-06-15 16:41 UTC (permalink / raw)
To: Ian Pratt; +Cc: xen-tools, xen-devel, rickey berkeley, NAHieu, ryanh, xen-users
Ian Pratt wrote:
>> If overflow occurs, it is not handled. The mechanism I implemented was
>> just designed to drastically reduce the probability of overflow.
>>
>
> It does count the number of lost trace messages and add a trace message
> to that effect though, right?
>
No, but I'll add that to the list of things to do in the future.
Rob
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: Hi,something about the xentrace tool
@ 2006-06-19 7:14 Ian Pratt
0 siblings, 0 replies; 22+ messages in thread
From: Ian Pratt @ 2006-06-19 7:14 UTC (permalink / raw)
To: George Dunlap, Rob Gardner
Cc: xen-tools, xen-devel, xen-users, rickey berkeley
> > Once again, there is no explicit copying of the data between kernel and
> > user space, so nobody should be worried about it.
>
> There's no copying from the HV to the xentrace process. But there is
> copying from xentrace to the dom0 kernel for the output file. Some
> copying is necessary right now, because rather than writing out the
> pages verbatim, xentrace writes out the pcpu before writing out each
> record:
We have the records in huge per-cpu blocks in memory, then write them
out individually?
That's nuts.
We should keep the IO page aligned, reserving the first record entry of
each block to fill in when we do a write-out to indicate the cpu and
#records in the batch.
I'd say this fix is less important than logging the number of dropped
records, but if we ever want to reduce the capture overhead in the
future we'll have to fix this.
Ian
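A rough sketch of the batching idea, for illustration only. The header event ID, the struct name and the function name are hypothetical, and the record uses the 32-bit layout discussed in this thread. The first slot of each block is reserved and filled in at write-out time with the cpu and the record count, so the whole block goes out in one call:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical 32-byte record, matching the 32-bit layout in the thread. */
struct t_rec32 {
    uint64_t cycles;
    uint32_t event;
    uint32_t data[5];
};

#define TRC_BATCH_HDR 0xffffffffu  /* hypothetical event ID for a batch header */

/*
 * block[0] is reserved for the header; block[1..nrecs] hold real records.
 * Fills in the header and writes header plus records in a single fwrite().
 * Returns the number of entries written (nrecs + 1 on success).
 */
size_t write_batch(uint32_t cpu, struct t_rec32 *block, uint32_t nrecs, FILE *out)
{
    assert(block != NULL && out != NULL);

    block[0].cycles  = 0;
    block[0].event   = TRC_BATCH_HDR;
    block[0].data[0] = cpu;     /* which cpu the batch came from */
    block[0].data[1] = nrecs;   /* how many records follow */
    for (int i = 2; i < 5; i++)
        block[0].data[i] = 0;

    return fwrite(block, sizeof(block[0]), (size_t)nrecs + 1, out);
}
```

With 4096-byte pages and 32-byte records, a page holds 128 entries, so a one-entry header costs under 1% of the space while keeping the I/O page aligned.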
> void write_rec(unsigned int cpu, struct t_rec *rec, FILE *out)
> {
> size_t written = 0;
> written += fwrite(&cpu, sizeof(cpu), 1, out);
> written += fwrite(rec, sizeof(*rec), 1, out);
> if ( written != 2 )
> {
> PERROR("Failed to write trace record");
> exit(EXIT_FAILURE);
> }
> }
>
> If we wanted to make it zero copy all the way from the HV to the disk,
> we could have the xentrace process one stream per cpu, and do
> whatever's necessary to use DMA. (Does anyone know if O_DIRECT will
> do direct DMA, or if one would have to use a raw disk?)
>
> But I think we all seem to agree, this is not a high priority. :-)
>
> -George
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: Hi,something about the xentrace tool
@ 2006-06-16 14:11 Ian Pratt
2006-06-16 16:56 ` Rob Gardner
0 siblings, 1 reply; 22+ messages in thread
From: Ian Pratt @ 2006-06-16 14:11 UTC (permalink / raw)
To: Rob Gardner
Cc: xen-tools, xen-devel, rickey berkeley, NAHieu, ryanh, xen-users
> > It does count the number of lost trace messages and add a trace
> > message to that effect though, right?
>
> No, but I'll add that to the list of things to do in the future.
I think we really need this. It's trivial to add -- as soon as you fill,
overwrite the last trace record with a new 'lost events' record and bump
a counter in it whenever you try to write a record. When space becomes
available just progress as before.
Ian
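The scheme described above might look something like the following sketch. It is not the Xen implementation: the buffer is a toy ring of four records, the "lost events" ID is a made-up value, and the names are hypothetical. When the buffer is full, the newest record is turned into a lost-events marker whose first data word counts the records dropped, including the one that was sacrificed.

```c
#include <assert.h>
#include <stdint.h>

#define TBUF_RECS        4            /* tiny on purpose; real buffers span pages */
#define TRC_LOST_RECORDS 0xfffffffeu  /* hypothetical event ID, not from trace.h */

struct rec {
    uint64_t cycles;
    uint32_t event;
    uint32_t data[5];
};

struct tbuf {
    struct rec recs[TBUF_RECS];
    unsigned int head;   /* next slot to write */
    unsigned int tail;   /* next slot to read */
    unsigned int count;  /* records currently queued */
};

/* Returns 1 if the record was stored, 0 if it was counted as lost. */
int tbuf_put(struct tbuf *tb, uint32_t event, uint32_t d1)
{
    assert(event != TRC_LOST_RECORDS);

    if (tb->count == TBUF_RECS) {
        /* Full: reuse the most recently written slot as the marker. */
        struct rec *last = &tb->recs[(tb->head + TBUF_RECS - 1) % TBUF_RECS];
        if (last->event != TRC_LOST_RECORDS) {
            last->event = TRC_LOST_RECORDS;
            last->data[0] = 1;   /* the record just overwritten */
        }
        last->data[0]++;         /* plus the record that could not be stored */
        return 0;
    }

    struct rec *r = &tb->recs[tb->head];
    r->cycles = 0;               /* a real tracer would read the TSC here */
    r->event = event;
    r->data[0] = d1;
    for (int i = 1; i < 5; i++)
        r->data[i] = 0;
    tb->head = (tb->head + 1) % TBUF_RECS;
    tb->count++;
    return 1;
}

/* Returns 1 and copies out the oldest record, or 0 if the buffer is empty. */
int tbuf_get(struct tbuf *tb, struct rec *out)
{
    if (tb->count == 0)
        return 0;
    *out = tb->recs[tb->tail];
    tb->tail = (tb->tail + 1) % TBUF_RECS;
    tb->count--;
    return 1;
}
```

Once the consumer frees a slot, tbuf_put() simply stores records again, so the marker stays in the stream at the point where the gap occurred.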
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-16 14:11 Ian Pratt
@ 2006-06-16 16:56 ` Rob Gardner
0 siblings, 0 replies; 22+ messages in thread
From: Rob Gardner @ 2006-06-16 16:56 UTC (permalink / raw)
To: Ian Pratt; +Cc: xen-tools, xen-devel, rickey berkeley, NAHieu, ryanh, xen-users
Ian Pratt wrote:
>> It does count the number of lost trace messages and add a trace
>> message
>> to that effect though, right?
>>
>> No, but I'll add that to the list of things to do in the future.
>>
>
> I think we really need this. It's trivial to add -- as soon as you fill,
> overwrite the last trace record with a new 'lost events' record and bump
> a counter in it whenever you try to write a record. When space becomes
> available just progress as before.
>
OK, I'll plan on doing this soon.
Rob
^ permalink raw reply [flat|nested] 22+ messages in thread
* Hi,something about the xentrace tool
@ 2006-06-11 13:17 rickey berkeley
2006-06-12 15:17 ` George Dunlap
2006-06-12 15:30 ` Ryan Harper
0 siblings, 2 replies; 22+ messages in thread
From: rickey berkeley @ 2006-06-11 13:17 UTC (permalink / raw)
To: xen-tools, xen-devel, xen-users
Hi folks:
Recently I have been studying the xentrace source code.
Actually, I think xentrace is a very useful tool for Xen debugging.
./xen/include/xen/trace.h
./xen/common/trace.c
etc.
One point in the source code confuses me: the trace record.
From the man page:
"Where CPU is the processor number, TSC is the record’s timestamp (the
value of the CPU cycle counter), EVENT is the event ID and D1...D5 are
the trace data."
"Which correspond to the CPU number, event ID, timestamp counter and
the 5 data fields from the trace record. There should be one such
rule for each type of event."
So I wonder what the D1...D5 data mean.
I found the structure holding d1-d5 defined in xen/include/public/trace.h:
/* This structure represents a single trace buffer record. */
struct t_rec {
    uint64_t cycles; /* cycle counter timestamp */
    uint32_t event; /* event ID */
    unsigned long data[5]; /* event data items */
};
and a trace function defined in ./xen/common/trace.c:
void trace(u32 event, unsigned long d1, unsigned long d2,
           unsigned long d3, unsigned long d4, unsigned long d5)
{
But I still can't understand what these data really mean.
I never found a place in the source code where this function
trace(...) is called.
Can someone give me a clue about this?
Or are those fields (d1...d5) meant for developers to define their own
data of interest for each traced event?
Your rapid reply will be appreciated.
Thanks.
----
Regards
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-11 13:17 rickey berkeley
@ 2006-06-12 15:17 ` George Dunlap
2006-06-12 15:30 ` Ryan Harper
1 sibling, 0 replies; 22+ messages in thread
From: George Dunlap @ 2006-06-12 15:17 UTC (permalink / raw)
To: rickey berkeley; +Cc: xen-tools, xen-devel, xen-users
Search for the "TRACE_nD" macros, where you replace 'n' with [1-5], to
see how traces are used in practice. For example,
TRACE_2D(TRC_[sometype], dataval1, dataval2);
This will be changed by the preprocessor into
trace(TRC_[sometype], dataval1, dataval2, 0, 0, 0);
This saves us poor exhausted programmers from typing 3 extra zeros. ;-)
As you surmised, the values in the data are defined per trace value,
depending on what information the programmer writing the trace wants
to gather.
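To make the macro expansion concrete, here is a simplified sketch. The trace() below is a stand-in that only remembers its arguments, and the macros show just the zero-padding idea; the real definitions in the Xen headers may add checks or casts that are not shown here.

```c
#include <stdint.h>

/* Stand-in for the hypervisor's trace(): remembers the last call. */
struct last_call {
    uint32_t event;
    unsigned long d[5];
};
static struct last_call last_call;

void trace(uint32_t event, unsigned long d1, unsigned long d2,
           unsigned long d3, unsigned long d4, unsigned long d5)
{
    last_call.event = event;
    last_call.d[0] = d1;
    last_call.d[1] = d2;
    last_call.d[2] = d3;
    last_call.d[3] = d4;
    last_call.d[4] = d5;
}

/* Simplified wrappers: unused data words are padded with zeros. */
#define TRACE_1D(e, d1)                 trace((e), (d1), 0, 0, 0, 0)
#define TRACE_2D(e, d1, d2)             trace((e), (d1), (d2), 0, 0, 0)
#define TRACE_3D(e, d1, d2, d3)         trace((e), (d1), (d2), (d3), 0, 0)
#define TRACE_4D(e, d1, d2, d3, d4)     trace((e), (d1), (d2), (d3), (d4), 0)
#define TRACE_5D(e, d1, d2, d3, d4, d5) trace((e), (d1), (d2), (d3), (d4), (d5))
```

A call such as TRACE_2D(event, a, b) therefore stores a and b in the first two data words and leaves the other three as zero.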
Be advised that xentrace adds the physical cpu that the event occurred
on, so the real structure to read from xentrace output looks like
this:
struct trace_record {
unsigned long cpu;
unsigned long long tsc;
unsigned long event;
unsigned long data[5];
};
Peace,
-George
On 6/11/06, rickey berkeley <rickey.berkeley@gmail.com> wrote:
>
> Hi folks:
>
> Recently, I am doing some research on xentrace source code.
> En,actually,I think the xentrace is a very useful tool for xen debug.
>
> ./xen/include/xen/trace.h
> ./xen/common/trace.c
> etc.
> I got a point very confused in the source code about the trace record.
>
> from man page
> "Where CPU is the processor number, TSC is the record’s
> timestamp (the
> value of the CPU cycle counter), EVENT is the event ID and
> D1...D5 are
> the trace data."
>
> "Which correspond to the CPU number, event ID, timestamp counter
> and
> the 5 data fields from the trace record. There should be one
> such
> rule for each type of event."
>
> So I just wonder what does these kind of D1....D5 data mean.
>
> I found the defined the d1-d5 structure in xen/include/public/trace.h
> 59 /* This structure represents a single trace buffer record. */
> 60 struct t_rec {
> 61 uint64_t cycles; /* cycle counter timestamp */
> 62 uint32_t event; /* event ID */
> 63 unsigned long data[5]; /* event data items */
> 64 };
>
> and defined a trace function in ./xen/common/trace.c
> 225 void trace(u32 event, unsigned long d1, unsigned long d2,
> 226 unsigned long d3, unsigned long d4, unsigned long d5)
> 227 {
>
> But I still can't understand what are these data real meaning
> I never found a place in the source code where this function void
> trace(...) had been called.
>
> Can someone give me a clue about this?
> Or are those interfaces (d1...d5) for developers to define their own
> interest to track the event?
>
> Your rapid reply will be appreciated.
> Thanks.
>
> ----
> Regards
>
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-11 13:17 rickey berkeley
2006-06-12 15:17 ` George Dunlap
@ 2006-06-12 15:30 ` Ryan Harper
2006-06-13 17:11 ` rickey berkeley
1 sibling, 1 reply; 22+ messages in thread
From: Ryan Harper @ 2006-06-12 15:30 UTC (permalink / raw)
To: rickey berkeley; +Cc: xen-tools, xen-devel, xen-users
* rickey berkeley <rickey.berkeley@gmail.com> [2006-06-11 08:18]:
> Hi folks:
Hi,
> from man page
> "Where CPU is the processor number, TSC is the record's
> timestamp (the
> value of the CPU cycle counter), EVENT is the event ID and
> D1...D5 are
> the trace data."
>
> "Which correspond to the CPU number, event ID, timestamp counter
> and
> the 5 data fields from the trace record. There should be one
> such
> rule for each type of event."
>
> So I just wonder what does these kind of D1....D5 data mean.
This is referring to the fact that a trace record can have up to 5
fields.
>
> I found the defined the d1-d5 structure in xen/include/public/trace.h
> 59 /* This structure represents a single trace buffer record. */
> 60 struct t_rec {
> 61 uint64_t cycles; /* cycle counter timestamp */
> 62 uint32_t event; /* event ID */
> 63 unsigned long data[5]; /* event data items */
> 64 };
>
> and defined a trace function in ./xen/common/trace.c
> 225 void trace(u32 event, unsigned long d1, unsigned long d2,
> 226 unsigned long d3, unsigned long d4, unsigned long d5)
> 227 {
>
> But I still can't understand what are these data real meaning
> I never found a place in the source code where this function void
> trace(...) had been called.
Search for TRACE_ , there are macros that wrap the proper call to trace
based on the number of fields that the trace record is using.
> Can someone give me a clue about this?
> Or are those interfaces (d1...d5) for developers to define their own
> interest to track the event?
Let's look at one of the trace macros:
xen/common/schedule.c:115
TRACE_2D(TRC_SCHED_DOM_ADD, v->domain->domain_id, v->vcpu_id);
The trace ID, TRC_SCHED_DOM_ADD, is defined in
xen/include/public/trace.h as
#define TRC_SCHED_DOM_ADD (TRC_SCHED + 1)
#define TRC_SCHED 0x0002f000 /* Xen Scheduler trace */
so TRC_SCHED_DOM_ADD is 0x0002f001.
The _2D means the trace record expects two inputs. The trace
infrastructure supports up to 5 parameters, TRACE_5D.
The xentrace formatter needs to know how to unpack the trace record, you
can look at the unpacking in tools/xentrace/formats , looking at the
TRC_SCHED_DOM_ADD (0x0002f001), we can see it is unpacked as:
0x0002f001 CPU%(cpu)d %(tsc)d sched_add_domain [ domid = 0x%(1)08x, edomid = 0x%(2)08x ]
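As a small illustration of the two steps above, here is a sketch that computes the event ID and prints one record the way the format rule describes. The two defines are the ones quoted from trace.h; the function name is hypothetical and the printf-style string is only a C rendering of the rule in tools/xentrace/formats, not the formatter xentrace actually uses.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define TRC_SCHED         0x0002f000   /* Xen Scheduler trace */
#define TRC_SCHED_DOM_ADD (TRC_SCHED + 1)

/* The class constant plus the event number gives the ID used in the formats file. */
static_assert(TRC_SCHED_DOM_ADD == 0x0002f001, "event ID should be 0x0002f001");

/*
 * Render a TRC_SCHED_DOM_ADD record: d1 is the domain ID, d2 the vcpu ID
 * (labelled "edomid" in the format rule). Returns snprintf()'s result.
 */
int format_sched_dom_add(char *buf, size_t len, unsigned int cpu,
                         uint64_t tsc, uint32_t d1, uint32_t d2)
{
    return snprintf(buf, len,
                    "CPU%u %" PRIu64 " sched_add_domain [ domid = 0x%08" PRIx32
                    ", edomid = 0x%08" PRIx32 " ]",
                    cpu, tsc, d1, d2);
}
```

Each event type needs its own rule of this kind, because only the code that emits the trace knows what the data words mean.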
--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253 T/L: 678-9253
ryanh@us.ibm.com
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-12 15:30 ` Ryan Harper
@ 2006-06-13 17:11 ` rickey berkeley
2006-06-13 17:25 ` Rob Gardner
0 siblings, 1 reply; 22+ messages in thread
From: rickey berkeley @ 2006-06-13 17:11 UTC (permalink / raw)
To: ryanh; +Cc: xen-tools, xen-devel, xen-users
Hi folks,
Thanks very much! I've got it. Your directions and clues were very clear.
I'd also like to raise another topic about the xentrace tool.
xentrace is a user-space tool.
Based on the xentrace source code (tools/xentrace/xentrace.c), it gets, maps
and monitors the tracing data
from kernel space in user space. I think Xen uses procfs to transfer data
from kernel space to user space.
Based on the trace source code (xen/common/trace.c), dom0 traces the events
which are enabled.
Xen initializes a trace buffer for each CPU; the trace buffer size (in pages)
in kernel space is defined by opt_tbuf_size.
The trace buffer uses a ring (loop) structure.
I guess that when the tracing data grows rapidly and the xentrace tool in user
space does not consume it immediately,
some tracing data will be lost. Maybe relayfs or something else can solve this
problem.
On 6/12/06, Ryan Harper <ryanh@us.ibm.com> wrote:
>
> * rickey berkeley <rickey.berkeley@gmail.com> [2006-06-11 08:18]:
> > Hi folks:
>
> Search for TRACE_ , there are macros that wrap the proper call to trace
> based on the number of fields that the trace record is using.
>
> > Can someone give me a clue about this?
> > Or are those interfaces (d1...d5) for developers to define their own
> > interest to track the event?
>
> Let's look at one of the trace macros:
>
> 0x0002f001 CPU%(cpu)d %(tsc)d sched_add_domain [ domid =
> 0x%(1)08x, edomid = 0x%(2)08x ]
>
> --
> Ryan Harper
> Software Engineer; Linux Technology Center
> IBM Corp., Austin, Tx
> (512) 838-9253 T/L: 678-9253
> ryanh@us.ibm.com
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-13 17:11 ` rickey berkeley
@ 2006-06-13 17:25 ` Rob Gardner
2006-06-14 3:47 ` NAHieu
0 siblings, 1 reply; 22+ messages in thread
From: Rob Gardner @ 2006-06-13 17:25 UTC (permalink / raw)
To: rickey berkeley; +Cc: xen-tools, ryanh, xen-devel, xen-users
rickey berkeley wrote:
> Based on the trace source code (xen/common/trace.c), dom0 traces the events
> that are enabled.
>
> Xen initializes a trace buffer for each CPU; the trace buffer size (in
> pages) in kernel space is defined by opt_tbuf_size.
> The trace buffer is a circular (ring) buffer.
>
> I guess that when the tracing data grows rapidly and the xentrace tool in
> user space does not consume it immediately,
> some tracing data will be lost. Maybe relayfs or something else can solve
> this problem.
>
I added a basic flow control mechanism to the trace buffer system a few
months ago. You can see an example of how to use it in tools/xenmon. The
way it works is that as trace records are generated, a software
interrupt is generated when the trace buffer gets filled to a certain
point. The user space tools can use select() on an event channel to find
out about these interrupts. See tools/xenmon/xenbaked.c for exact
programming details. This code has not been copied into xentrace yet;
feel free to do so yourself if you think it's necessary.
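The waiting side of this scheme can be sketched with plain POSIX select(). This is a rough illustration under assumptions: wait_for_notification() is a hypothetical helper, and the descriptor it takes is any readable file descriptor (a pipe is enough to try it out), standing in for the event channel handle. The actual event channel setup is in tools/xenmon/xenbaked.c and is not reproduced here.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Blocks until fd becomes readable or timeout_sec seconds pass.
 * Returns 1 if fd is readable (a notification is pending),
 * 0 on timeout, and -1 if select() itself fails.
 * fd is assumed to stand in for the event channel descriptor. */
static int wait_for_notification(int fd, int timeout_sec)
{
    fd_set rfds;
    struct timeval tv;
    int rc;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &rfds)) ? 1 : 0;
}
```

A consumer would call this in a loop and drain the trace buffers each time it returns 1, so it sleeps while the buffers are mostly empty instead of polling.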
Rob
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-13 17:25 ` Rob Gardner
@ 2006-06-14 3:47 ` NAHieu
2006-06-14 16:06 ` Rob Gardner
0 siblings, 1 reply; 22+ messages in thread
From: NAHieu @ 2006-06-14 3:47 UTC (permalink / raw)
To: Rob Gardner; +Cc: xen-tools, ryanh, xen-devel, rickey berkeley, xen-users
Hi Rob,
On 6/14/06, Rob Gardner <rob.gardner@hp.com> wrote:
> rickey berkeley wrote:
> > Based on the trace source code (xen/common/trace.c), dom0 traces the events
> > that are enabled.
> >
> > Xen initializes a trace buffer for each CPU; the trace buffer size (in
> > pages) in kernel space is defined by opt_tbuf_size.
> > The trace buffer is a circular (ring) buffer.
> >
> > I guess that when the tracing data grows rapidly and the xentrace tool in
> > user space does not consume it immediately,
> > some tracing data will be lost. Maybe relayfs or something else can solve
> > this problem.
> >
>
> I added a basic flow control mechanism to the trace buffer system a few
> months ago. You can see an example of how to use it in tools/xenmon. The
> way it works is that as trace records are generated, a software
> interrupt is generated when the trace buffer gets filled to a certain
> point. The user space tools can use select() on an event channel to find
> out about these interrupts. See tools/xenmon/xenbaked.c for exact
> programming details. This code has not been copied into xentrace yet;
> Feel free to do so yourself if you think it's necessary.
Would you explain a little more about how you handle overflow? If
userspace detects that event, what does it do then?
In general, if there is no more space available in the kernel buffer, what
do you do? Drop some data and hope that userspace will come along
and free up some space in the future? Or, if you don't drop it,
just block there and wait for userspace to come and free some?
I believe there are a few tactics for this problem, and which one we
should use depends on the situation.
Thanks.
H
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Hi,something about the xentrace tool
2006-06-14 3:47 ` NAHieu
@ 2006-06-14 16:06 ` Rob Gardner
0 siblings, 0 replies; 22+ messages in thread
From: Rob Gardner @ 2006-06-14 16:06 UTC (permalink / raw)
To: NAHieu; +Cc: xen-tools, ryanh, xen-devel, rickey berkeley, xen-users
NAHieu wrote:
> On 6/14/06, Rob Gardner <rob.gardner@hp.com> wrote:
>> rickey berkeley wrote:
>> > Based on the trace source code (xen/common/trace.c), dom0 traces the events
>> > that are enabled.
>> >
>> > Xen initializes a trace buffer for each CPU; the trace buffer size (in
>> > pages) in kernel space is defined by opt_tbuf_size.
>> > The trace buffer is a circular (ring) buffer.
>> >
>> > I guess that when the tracing data grows rapidly and the xentrace tool in
>> > user space does not consume it immediately,
>> > some tracing data will be lost. Maybe relayfs or something else can solve
>> > this problem.
>> >
>>
>> I added a basic flow control mechanism to the trace buffer system a few
>> months ago. You can see an example of how to use it in tools/xenmon. The
>> way it works is that as trace records are generated, a software
>> interrupt is generated when the trace buffer gets filled to a certain
>> point. The user space tools can use select() on an event channel to find
>> out about these interrupts. See tools/xenmon/xenbaked.c for exact
>> programming details. This code has not been copied into xentrace yet;
>> Feel free to do so yourself if you think it's necessary.
>
> Would you explain a little more about how you handle overflow? If
> userspace detects that event, what does it do then?
If overflow occurs, it is not handled. The mechanism I implemented was
just designed to drastically reduce the probability of overflow.
When xenbaked wakes up from its select() call (indicating the trace
buffer high water mark has been reached) it simply starts reading trace
records out of the trace buffers.
>
> Usually if there is no more space available in kernel buffer, what
> will you do? Drop some data, and hope that the userspace will come up
> to free some for more space in the future? Or if you don't drop them,
> just block there and wait for the userspace to come to free some?
Currently, the trace buffer "high water" mark is set to 50%. That is,
when the hypervisor trace buffer becomes 1/2 full, it sends a soft
interrupt to wake up xenbaked from its blocking select(). If nobody
wakes up to read trace records from the trace buffer, I take that to
mean that nobody cares about the trace records. When somebody does care,
they will read those records in a timely manner. Obviously, the
hypervisor cannot "block" if there is no room in the trace buffers; in
this case, new trace records simply overwrite old ones, and the old ones
are lost.
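The behaviour described above (notify at the 50% mark, never block, overwrite the oldest records when full) can be sketched as a small ring buffer. Everything here is hypothetical and for illustration: struct demo_ring, DEMO_SLOTS and the demo_ring_* functions are not taken from xen/common/trace.c, and the lost counter is an addition to make the overwrites visible, not a claim about what the hypervisor tracks.

```c
#include <stdint.h>

#define DEMO_SLOTS 8u  /* hypothetical capacity in records; a power of two */

struct demo_ring {
    uint32_t rec[DEMO_SLOTS];
    unsigned prod;  /* total records ever written */
    unsigned cons;  /* total records ever consumed (read or overwritten) */
    unsigned lost;  /* illustrative only: records overwritten unread */
};

/* Writes one record and never blocks. If the ring is full, the oldest
 * unread record is discarded to make room. Returns 1 when this write
 * brings the fill level to exactly half, which is where a wake-up
 * notification would be sent; otherwise returns 0. */
static int demo_ring_put(struct demo_ring *r, uint32_t v)
{
    if (r->prod - r->cons == DEMO_SLOTS) {
        r->cons++;  /* oldest record is overwritten below */
        r->lost++;
    }
    r->rec[r->prod % DEMO_SLOTS] = v;
    r->prod++;
    return (r->prod - r->cons) == DEMO_SLOTS / 2;
}

/* Reads the oldest unread record into *out.
 * Returns 1 on success, 0 if the ring is empty. */
static int demo_ring_get(struct demo_ring *r, uint32_t *out)
{
    if (r->prod == r->cons)
        return 0;
    *out = r->rec[r->cons % DEMO_SLOTS];
    r->cons++;
    return 1;
}
```

With a reader that drains the ring whenever demo_ring_put() returns 1, the ring has half its capacity as headroom before anything is overwritten, which is why this reduces the probability of overflow without eliminating it.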
If you encounter a situation where trace records are being generated too
fast, and fill up the trace buffer too quickly, then the simple next
step is to increase the size of the trace buffers. So far, use of the
trace records has not been linked to anything so critical that it's
necessary to take extraordinary measures to avoid loss of data.
Rob
^ permalink raw reply [flat|nested] 22+ messages in thread
Thread overview: 22+ messages
2006-06-15 7:06 Hi,something about the xentrace tool Ian Pratt
2006-06-15 8:58 ` [Xen-devel] " rickey berkeley
2006-06-15 17:03 ` Rob Gardner
2006-06-15 18:20 ` George Dunlap
2006-06-15 18:28 ` George Dunlap
2006-06-15 18:53 ` Rob Gardner
2006-06-19 1:06 ` George Dunlap
2006-06-19 5:00 ` Rob Gardner
2006-06-19 14:02 ` George Dunlap
2006-06-19 17:19 ` Rob Gardner
2006-06-21 19:02 ` George Dunlap
2006-06-15 16:41 ` Rob Gardner
-- strict thread matches above, loose matches on Subject: below --
2006-06-19 7:14 Ian Pratt
2006-06-16 14:11 Ian Pratt
2006-06-16 16:56 ` Rob Gardner
2006-06-11 13:17 rickey berkeley
2006-06-12 15:17 ` George Dunlap
2006-06-12 15:30 ` Ryan Harper
2006-06-13 17:11 ` rickey berkeley
2006-06-13 17:25 ` Rob Gardner
2006-06-14 3:47 ` NAHieu
2006-06-14 16:06 ` Rob Gardner