* [PATCH] Tracepoint: Add 'file name' as a parameter of tracepoint events ext4:ext4_direct_IO_enter&ext4:ext4_direct_IO_exit
@ 2013-02-01 15:37 chenggang.qin
2013-02-01 16:26 ` Theodore Ts'o
2013-02-01 19:34 ` Al Viro
0 siblings, 2 replies; 3+ messages in thread
From: chenggang.qin @ 2013-02-01 15:37 UTC (permalink / raw)
To: linux-kernel; +Cc: chenggang, Steven Rostedt, Frederic Weisbecker, Ingo Molnar
From: chenggang <chenggang.qcg@alibaba-inc.com>
Yesterday I implemented these tracepoint events in the VFS subsystem, which
was not a good idea.
I have now modified two existing tracepoint events in the ext4 subsystem to
provide the same functionality.
Many database systems implement their own page cache and use direct IO to
access the disks. System engineers sometimes want to know the miss rate of
the database's page cache, and they also need to know which files the target
processes access with direct IO. Both requirements can be met by recording
the database's direct IO file accesses. So we add 'file name' as a parameter
of the tracepoint events
ext4:ext4_direct_IO_enter & ext4:ext4_direct_IO_exit.
Later, we will extend perf or blktrace to use these tracepoint events.
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Chenggang Qin <chenggang.qcg@alibaba-inc.com>
---
fs/ext4/inode.c | 7 +++++--
include/trace/events/ext4.h | 22 ++++++++++++++--------
2 files changed, 19 insertions(+), 10 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index cbfe13b..92a379f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3202,6 +3202,7 @@ static ssize_t ext4_direct_IO(int rw, struct kiocb *iocb,
struct file *file = iocb->ki_filp;
struct inode *inode = file->f_mapping->host;
ssize_t ret;
+ const unsigned char *fname;
/*
* If we are doing data journalling we don't support O_DIRECT
@@ -3213,13 +3214,15 @@ static ssize_t ext4_direct_IO(int rw, struct kiocb *iocb,
if (ext4_has_inline_data(inode))
return 0;
- trace_ext4_direct_IO_enter(inode, offset, iov_length(iov, nr_segs), rw);
+ fname = file->f_path.dentry->d_name.name;
+ trace_ext4_direct_IO_enter(inode, offset, iov_length(iov, nr_segs), rw,
+ fname);
if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
ret = ext4_ext_direct_IO(rw, iocb, iov, offset, nr_segs);
else
ret = ext4_ind_direct_IO(rw, iocb, iov, offset, nr_segs);
trace_ext4_direct_IO_exit(inode, offset,
- iov_length(iov, nr_segs), rw, ret);
+ iov_length(iov, nr_segs), rw, ret, fname);
return ret;
}
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 7e8c36b..532bbb4 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -1211,9 +1211,10 @@ DEFINE_EVENT(ext4__bitmap_load, ext4_load_inode_bitmap,
);
TRACE_EVENT(ext4_direct_IO_enter,
- TP_PROTO(struct inode *inode, loff_t offset, unsigned long len, int rw),
+ TP_PROTO(struct inode *inode, loff_t offset, unsigned long len, int rw,
+ const unsigned char *fname),
- TP_ARGS(inode, offset, len, rw),
+ TP_ARGS(inode, offset, len, rw, fname),
TP_STRUCT__entry(
__field( dev_t, dev )
@@ -1221,6 +1222,7 @@ TRACE_EVENT(ext4_direct_IO_enter,
__field( loff_t, pos )
__field( unsigned long, len )
__field( int, rw )
+ __string( fname, fname )
),
TP_fast_assign(
@@ -1229,19 +1231,20 @@ TRACE_EVENT(ext4_direct_IO_enter,
__entry->pos = offset;
__entry->len = len;
__entry->rw = rw;
+ __assign_str(fname, fname);
),
- TP_printk("dev %d,%d ino %lu pos %lld len %lu rw %d",
+ TP_printk("dev %d,%d ino %lu pos %lld len %lu rw %d fname %s",
MAJOR(__entry->dev), MINOR(__entry->dev),
(unsigned long) __entry->ino,
- __entry->pos, __entry->len, __entry->rw)
+ __entry->pos, __entry->len, __entry->rw, __get_str(fname))
);
TRACE_EVENT(ext4_direct_IO_exit,
TP_PROTO(struct inode *inode, loff_t offset, unsigned long len,
- int rw, int ret),
+ int rw, int ret, const unsigned char *fname),
- TP_ARGS(inode, offset, len, rw, ret),
+ TP_ARGS(inode, offset, len, rw, ret, fname),
TP_STRUCT__entry(
__field( dev_t, dev )
@@ -1250,6 +1253,7 @@ TRACE_EVENT(ext4_direct_IO_exit,
__field( unsigned long, len )
__field( int, rw )
__field( int, ret )
+ __string( fname, fname )
),
TP_fast_assign(
@@ -1259,13 +1263,15 @@ TRACE_EVENT(ext4_direct_IO_exit,
__entry->len = len;
__entry->rw = rw;
__entry->ret = ret;
+ __assign_str(fname, fname);
),
- TP_printk("dev %d,%d ino %lu pos %lld len %lu rw %d ret %d",
+ TP_printk("dev %d,%d ino %lu pos %lld len %lu rw %d ret %d fname %s",
MAJOR(__entry->dev), MINOR(__entry->dev),
(unsigned long) __entry->ino,
__entry->pos, __entry->len,
- __entry->rw, __entry->ret)
+ __entry->rw, __entry->ret,
+ __get_str(fname))
);
TRACE_EVENT(ext4_fallocate_enter,
--
1.7.9.5
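
For readers who want to consume the existing event from user space, here is a
minimal sketch of parsing the payload that the unpatched TP_printk format of
ext4_direct_IO_enter produces ("dev %d,%d ino %lu pos %lld len %lu rw %d").
The struct and function names are hypothetical and are not part of the patch;
the sketch assumes the caller has already stripped whatever prefix the tracing
tool puts in front of "dev".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the fields printed by ext4_direct_IO_enter. */
struct dio_enter_event {
	int dev_major;
	int dev_minor;
	unsigned long ino;
	long long pos;
	unsigned long len;
	int rw;
};

/*
 * Parse an event payload that starts at "dev ...".
 * Returns 0 on success, -1 if the text does not match the format.
 */
static int parse_dio_enter(const char *text, struct dio_enter_event *ev)
{
	assert(text != NULL && ev != NULL);

	/* Same conversions as the TP_printk format string in ext4.h. */
	int n = sscanf(text, "dev %d,%d ino %lu pos %lld len %lu rw %d",
		       &ev->dev_major, &ev->dev_minor, &ev->ino,
		       &ev->pos, &ev->len, &ev->rw);

	return n == 6 ? 0 : -1;
}
```

Calling parse_dio_enter("dev 8,1 ino 12 pos 4096 len 512 rw 0", &ev) fills ev
with major 8, minor 1, inode 12, and so on; any line that does not match all
six fields is rejected.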
* Re: [PATCH] Tracepoint: Add 'file name' as a parameter of tracepoint events ext4:ext4_direct_IO_enter&ext4:ext4_direct_IO_exit
From: Theodore Ts'o @ 2013-02-01 16:26 UTC (permalink / raw)
To: chenggang.qin
Cc: linux-kernel, chenggang, Steven Rostedt, Frederic Weisbecker,
Ingo Molnar
On Fri, Feb 01, 2013 at 11:37:38PM +0800, chenggang.qin@gmail.com wrote:
>
> Many database systems implement their own page cache and use direct
> IO to access the disks. System engineers sometimes want to know the
> miss rate of the database's page cache, and they also need to know
> which files the target processes access with direct IO. Both
> requirements can be met by recording the database's direct IO file
> accesses. So we add 'file name' as a parameter of the tracepoint
> events ext4:ext4_direct_IO_enter & ext4:ext4_direct_IO_exit.
Aren't the device and inode numbers sufficient? Database files tend to
be long-lasting, so it shouldn't be hard to use the inode number.
My concern with putting the filename into the string buffer is that it
will seriously bloat the size of the event that will end up getting
dropped into the ring buffer.
Regards,
- Ted
* Re: [PATCH] Tracepoint: Add 'file name' as a parameter of tracepoint events ext4:ext4_direct_IO_enter&ext4:ext4_direct_IO_exit
From: Al Viro @ 2013-02-01 19:34 UTC (permalink / raw)
To: chenggang.qin
Cc: linux-kernel, chenggang, Steven Rostedt, Frederic Weisbecker,
Ingo Molnar
On Fri, Feb 01, 2013 at 11:37:38PM +0800, chenggang.qin@gmail.com wrote:
> @@ -3213,13 +3214,15 @@ static ssize_t ext4_direct_IO(int rw, struct kiocb *iocb,
> if (ext4_has_inline_data(inode))
> return 0;
>
> - trace_ext4_direct_IO_enter(inode, offset, iov_length(iov, nr_segs), rw);
> + fname = file->f_path.dentry->d_name.name;
> + trace_ext4_direct_IO_enter(inode, offset, iov_length(iov, nr_segs), rw,
> + fname);
Oh, wonderful... "your patch is racy; there's no warranty that fname will
not be freed right under you" -- "OK, we shouldn't do it in VFS... let's
try to do exact same thing in ext4, then"
Let me spell it out for you: opened files *can* be renamed while they are
in the middle of IO. If both old and new names are short, the contents
of ->d_name.name will be overwritten, so dereferencing your fname can yield
a mix of old and new name, or something that isn't NUL-terminated. If they
are long enough, they will be allocated separately from struct dentry and
your fname can bloody well end up pointing to freed memory by the time you
get around to dereferencing it.
Again, there is no exclusion between ext4_direct_IO() (or its callers) and
rename(). The NAK still stands.
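
The hazard described above can be pictured with a user-space analogy. This is
only a sketch under stated assumptions: struct fake_dentry, its functions, and
the inline length of 32 are all hypothetical, and a pthread mutex stands in
for whatever exclusion the real code would need. It does not reproduce the
kernel's dentry layout or locking rules. Short names live in an inline buffer
that rename overwrites in place; long names live in separately allocated
memory that rename frees. A raw pointer taken before the rename is therefore
unsafe in both cases, while a copy made under the lock stays valid.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define INLINE_NAME_LEN 32	/* illustrative value only */

struct fake_dentry {
	pthread_mutex_t lock;
	char *name;			/* points at iname or at heap memory */
	char iname[INLINE_NAME_LEN];
};

/* Replace the name; the old external name, if any, is freed. */
static int fake_rename(struct fake_dentry *d, const char *new_name)
{
	size_t len = strlen(new_name);
	char *ext = NULL;
	char *old;

	if (len >= INLINE_NAME_LEN) {
		ext = malloc(len + 1);
		if (ext == NULL)
			return -1;
		memcpy(ext, new_name, len + 1);
	}

	pthread_mutex_lock(&d->lock);
	old = d->name;
	if (ext != NULL) {
		d->name = ext;
	} else {
		memcpy(d->iname, new_name, len + 1);	/* overwritten in place */
		d->name = d->iname;
	}
	pthread_mutex_unlock(&d->lock);

	if (old != d->iname)
		free(old);	/* stale pointers to old now dangle */
	return 0;
}

static int fake_dentry_init(struct fake_dentry *d, const char *name)
{
	pthread_mutex_init(&d->lock, NULL);
	d->iname[0] = '\0';
	d->name = d->iname;
	return fake_rename(d, name);
}

/* Copy the current name under the lock. Returns -1 if it was truncated. */
static int fake_name_snapshot(struct fake_dentry *d, char *buf, size_t buflen)
{
	size_t len;
	int truncated;

	assert(buflen > 0);
	pthread_mutex_lock(&d->lock);
	len = strlen(d->name);
	truncated = len >= buflen;
	if (truncated)
		len = buflen - 1;
	memcpy(buf, d->name, len);
	buf[len] = '\0';
	pthread_mutex_unlock(&d->lock);
	return truncated ? -1 : 0;
}

static void fake_dentry_destroy(struct fake_dentry *d)
{
	if (d->name != d->iname)
		free(d->name);
	pthread_mutex_destroy(&d->lock);
}
```

A snapshot taken before fake_rename() keeps its contents afterwards, whereas a
saved copy of d->name would either see the inline buffer change underneath it
or point at freed memory. Note that copying the name on every event has a
cost, which is the size concern raised earlier in the thread.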