* [PATCH] ext4: Add a stub for mpage_da_data in the trace header @ 2009-09-29 21:40 Josh Stone 2009-09-30 13:33 ` Christoph Hellwig 0 siblings, 1 reply; 6+ messages in thread From: Josh Stone @ 2009-09-29 21:40 UTC (permalink / raw) To: LKML; +Cc: Josh Stone, Theodore Ts'o The tracepoint ext4_da_write_pages has a struct mpage_da_data* parameter, but that struct is only defined in fs/ext4/ext4.h. This patch adds a forward declaration for that struct, so this tracepoint header can still be used by tools like SystemTap. This is a continuation of the fix in commit 3661d286. http://sourceware.org/bugzilla/show_bug.cgi?id=10703 Signed-off-by: Josh Stone <jistone@redhat.com> Cc: "Theodore Ts'o" <tytso@mit.edu> --- include/trace/events/ext4.h | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h index c1bd8f1..d502974 100644 --- a/include/trace/events/ext4.h +++ b/include/trace/events/ext4.h @@ -11,6 +11,7 @@ struct ext4_allocation_context; struct ext4_allocation_request; struct ext4_prealloc_space; struct ext4_inode_info; +struct mpage_da_data; #define EXT4_I(inode) (container_of(inode, struct ext4_inode_info, vfs_inode)) -- 1.6.2.5 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] ext4: Add a stub for mpage_da_data in the trace header 2009-09-29 21:40 [PATCH] ext4: Add a stub for mpage_da_data in the trace header Josh Stone @ 2009-09-30 13:33 ` Christoph Hellwig 2009-09-30 14:20 ` Theodore Tso 0 siblings, 1 reply; 6+ messages in thread From: Christoph Hellwig @ 2009-09-30 13:33 UTC (permalink / raw) To: Josh Stone; +Cc: LKML, Theodore Ts'o On Tue, Sep 29, 2009 at 02:40:07PM -0700, Josh Stone wrote: > The tracepoint ext4_da_write_pages has a struct mpage_da_data* > parameter, but that struct is only defined in fs/ext4/ext4.h. This > patch adds a forward declaration for that struct, so this tracepoint > header can still be used by tools like SystemTap. That's not what the tracepoints are for anyway. No hacks for out of tree crap like this please - just use the trace buffer directly like ftrace and then you don't actually need any data types. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] ext4: Add a stub for mpage_da_data in the trace header 2009-09-30 13:33 ` Christoph Hellwig @ 2009-09-30 14:20 ` Theodore Tso 2009-09-30 19:45 ` Josh Stone 0 siblings, 1 reply; 6+ messages in thread From: Theodore Tso @ 2009-09-30 14:20 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Josh Stone, LKML On Wed, Sep 30, 2009 at 09:33:35AM -0400, Christoph Hellwig wrote: > On Tue, Sep 29, 2009 at 02:40:07PM -0700, Josh Stone wrote: > > The tracepoint ext4_da_write_pages has a struct mpage_da_data* > > parameter, but that struct is only defined in fs/ext4/ext4.h. This > > patch adds a forward declaration for that struct, so this tracepoint > > header can still be used by tools like SystemTap. > > That's not what the tracepoints are for anyway. No hacks for out of > tree crap like this please - just use the trace buffer directly like > ftrace and then you don't actually need any data types. > Josh, you do realize that SystemTap can pull stuff out of the trace buffer by examining information found in /sys/kernel/debug/tracing/events/*/*/format right? For example: name: ext4_sync_fs ID: 600 format: field:unsigned short common_type; offset:0; size:2; field:unsigned char common_flags; offset:2; size:1; field:unsigned char common_preempt_count; offset:3; size:1; field:int common_pid; offset:4; size:4; field:int common_tgid; offset:8; size:4; field:dev_t dev; offset:12; size:4; field:int wait; offset:16; size:4; print fmt: "dev %s wait %d", jbd2_dev_to_name(REC->dev), REC->wait No need to for users to download gigabytes and gigabytes of DWARF crapola (and for developers to grow old waiting for a kernel compile with -g enabled to complete); SystemTap can just get what it needs straight out of /sys/kernel/debug/tracing/events, and then reading what it needs out of the ring buffer. In any case I've added the blind structure pointer to include/trace/events/ext4.h, but in the long term you're *much* better off pulling the information you need from the ftrace ring buffer and getting the information you need to interpret it out of /sys/kernel/debug/tracing. - Ted ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] ext4: Add a stub for mpage_da_data in the trace header 2009-09-30 14:20 ` Theodore Tso @ 2009-09-30 19:45 ` Josh Stone 2009-09-30 21:23 ` Theodore Tso 0 siblings, 1 reply; 6+ messages in thread From: Josh Stone @ 2009-09-30 19:45 UTC (permalink / raw) To: Theodore Tso, Christoph Hellwig; +Cc: LKML On 09/30/2009 07:20 AM, Theodore Tso wrote: > On Wed, Sep 30, 2009 at 09:33:35AM -0400, Christoph Hellwig wrote: >> On Tue, Sep 29, 2009 at 02:40:07PM -0700, Josh Stone wrote: >>> The tracepoint ext4_da_write_pages has a struct mpage_da_data* >>> parameter, but that struct is only defined in fs/ext4/ext4.h. This >>> patch adds a forward declaration for that struct, so this tracepoint >>> header can still be used by tools like SystemTap. >> >> That's not what the tracepoints are for anyway. No hacks for out of >> tree crap like this please - just use the trace buffer directly like >> ftrace and then you don't actually need any data types. If you just want the data in the trace buffer, then SystemTap is not the tool for you. By all means, just write yourself a perl script or something that parses the trace buffer however you like. On the other hand, stap is useful to do some processing/inspection *live*, at the moment the event happens. For that, we register our own tracepoint handler that can do something different than ftrace. Anyway, I don't think it's a hack to add a simple struct declaration. It's wrong to have an include/... header that requires definitions from some subsystem internals, and that's the core thing I'm fixing here. > Josh, you do realize that SystemTap can pull stuff out of the trace > buffer by examining information found in > > /sys/kernel/debug/tracing/events/*/*/format > > right? Sure, but we want to be able to do more than just post-process the trace buffer, so we hook up directly to the tracepoints. > No need to for users to download gigabytes and gigabytes of DWARF > crapola (and for developers to grow old waiting for a kernel compile > with -g enabled to complete); SystemTap can just get what it needs > straight out of /sys/kernel/debug/tracing/events, and then reading > what it needs out of the ring buffer. FWIW, Fedora rawhide's x86_64 kernel-debuginfo is a ~140MB package, which installs to ~1.1GB, so that situation is not quite so bad. The compile time is of course still a beast. However, SystemTap does *not* require the kernel debuginfo for using tracepoints, even when reading parameters. It should work in the complete absence of CONFIG_DEBUGINFO, so if you find otherwise, please let me know and I will fix it. > In any case I've added the blind structure pointer to > include/trace/events/ext4.h, but in the long term you're *much* better > off pulling the information you need from the ftrace ring buffer and > getting the information you need to interpret it out of > /sys/kernel/debug/tracing. Thank you for including it. Josh ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] ext4: Add a stub for mpage_da_data in the trace header 2009-09-30 19:45 ` Josh Stone @ 2009-09-30 21:23 ` Theodore Tso 2009-10-01 0:13 ` Josh Stone 0 siblings, 1 reply; 6+ messages in thread From: Theodore Tso @ 2009-09-30 21:23 UTC (permalink / raw) To: Josh Stone; +Cc: Christoph Hellwig, LKML On Wed, Sep 30, 2009 at 12:45:23PM -0700, Josh Stone wrote: > If you just want the data in the trace buffer, then SystemTap is not the > tool for you. By all means, just write yourself a perl script or > something that parses the trace buffer however you like. > > On the other hand, stap is useful to do some processing/inspection > *live*, at the moment the event happens. For that, we register our own > tracepoint handler that can do something different than ftrace. So there are two things I would point out here. First of all, now that ftrace has the ability to do basic filtering, just about the only thing SystemTap can do which is unique is either complex filtering, summary statistics, or some kind of correlation between multiple events (within the limits of restricted memory allocation limits of SystemTap). So I'm not sure it's such a great idea to cede a large bit of functionality to as being something that SystemTap will never accomplish --- especially when it's far more convenient and stable to depend on fixed trace points than setting arbitrary dynamic trace points in the middle of source files which will break all the time when distro's release new kernels, etc. Secondly, while I'm not so sure it's that big of a restriction to have Systemtap pull events out of the trace buffer, if you must capture the event right as it happens, it should be possible set a kprobe in the ftrace subsystem, and then pull out the data of the event from the trace buffer. Keep in mind that one of the advantage DTrace has over SystemTap is that it can use pre-defined events in the kernel, and not have to keep userspace macro files in sync with a changing kernel source base. It seems counterproductive to throw away the opportunity of being able to read the tracepoint event data, since it would give SystemTap a lot more power. As people start creating more and more tracepoints, being able to tap into them without needing to download (or build) symbol tables would be a huge advantage. > FWIW, Fedora rawhide's x86_64 kernel-debuginfo is a ~140MB package, > which installs to ~1.1GB, so that situation is not quite so bad. The > compile time is of course still a beast. > > However, SystemTap does *not* require the kernel debuginfo for using > tracepoints, even when reading parameters. It should work in the > complete absence of CONFIG_DEBUGINFO, so if you find otherwise, please > let me know and I will fix it. Well, how is it going to do that if you don't have access to the structure definition? This is why fetching the information from the ring buffer is much more powerful. - Ted ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] ext4: Add a stub for mpage_da_data in the trace header 2009-09-30 21:23 ` Theodore Tso @ 2009-10-01 0:13 ` Josh Stone 0 siblings, 0 replies; 6+ messages in thread From: Josh Stone @ 2009-10-01 0:13 UTC (permalink / raw) To: Theodore Tso, LKML On 09/30/2009 02:23 PM, Theodore Tso wrote: > On Wed, Sep 30, 2009 at 12:45:23PM -0700, Josh Stone wrote: >> If you just want the data in the trace buffer, then SystemTap is not the >> tool for you. By all means, just write yourself a perl script or >> something that parses the trace buffer however you like. >> >> On the other hand, stap is useful to do some processing/inspection >> *live*, at the moment the event happens. For that, we register our own >> tracepoint handler that can do something different than ftrace. > > So there are two things I would point out here. First of all, now > that ftrace has the ability to do basic filtering, just about the only > thing SystemTap can do which is unique is either complex filtering, > summary statistics, or some kind of correlation between multiple > events (within the limits of restricted memory allocation limits of > SystemTap). This "only thing" seems like quite a lot to me, but I suppose the significance could be a matter of opinion. I would also add that SystemTap can better support concurrent users who want to monitor different things. > So I'm not sure it's such a great idea to cede a large bit of > functionality to as being something that SystemTap will never > accomplish --- especially when it's far more convenient and stable > to depend on fixed trace points than setting arbitrary dynamic trace > points in the middle of source files which will break all the time > when distro's release new kernels, etc. I don't understand your point about ceding here. But yes, I agree that fixed trace points are more convenient and stable, which is why we've long supported static instrumentation in the kernel. > Secondly, while I'm not so sure it's that big of a restriction to have > Systemtap pull events out of the trace buffer, if you must capture the > event right as it happens, it should be possible set a kprobe in the > ftrace subsystem, and then pull out the data of the event from the > trace buffer. This is possible, but it's a step backward for a few reasons: - A kprobe will be inherently slower than a tracepoint handler. - It requires debuginfo (maybe not to place the probe, but surely to dig into ftrace's internal data structures). - It requires knowledge about the ftrace internals, which is fragile and unmaintainable. - It assumes that every bit of data that the user wants is captured in the trace buffer. I think that last point is particularly significant. Kernel devs are not prescient, so the trace event might not be capturing all of the data that's relevant to a particular troubleshooting effort. With stap you can gather whatever data you want. (By the way, I seem to recall that we once discussed adding a proper hook for stap to grab ftrace data as it comes, but I don't think that went anywhere.) > Keep in mind that one of the advantage DTrace has over SystemTap is > that it can use pre-defined events in the kernel, and not have to > keep userspace macro files in sync with a changing kernel source > base. It seems counterproductive to throw away the opportunity of > being able to read the tracepoint event data, since it would give > SystemTap a lot more power. Aren't "pre-defined events" == tracepoints? That's exactly what we're trying to use in SystemTap! But then, DTrace doesn't dictate what data is captured at those events, so I don't understand why you think we should be more restrictive. >> However, SystemTap does *not* require the kernel debuginfo for using >> tracepoints, even when reading parameters. It should work in the >> complete absence of CONFIG_DEBUGINFO, so if you find otherwise, please >> let me know and I will fix it. > > Well, how is it going to do that if you don't have access to the > structure definition? This is why fetching the information from the > ring buffer is much more powerful. True, when neither a header nor debuginfo for a private type is available, then it will be opaque to us, so the ring buffer can offer pre-defined insight into those structures. But in sched_switch, for example, the ring buffer only knows prev/next->comm/pid/prio/state, whereas stap has the entire rq and task_structs at your disposal. Each has power in their own place... Josh ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2009-10-01 0:13 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-09-29 21:40 [PATCH] ext4: Add a stub for mpage_da_data in the trace header Josh Stone 2009-09-30 13:33 ` Christoph Hellwig 2009-09-30 14:20 ` Theodore Tso 2009-09-30 19:45 ` Josh Stone 2009-09-30 21:23 ` Theodore Tso 2009-10-01 0:13 ` Josh Stone
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox