* [RFC PATCH 0/2] tracing: ftrace_enable_fops fixes
@ 2025-06-12 10:43 Gabriele Paoloni
2025-06-12 10:43 ` [RFC PATCH 1/2] tracing: fixes of ftrace_enable_fops Gabriele Paoloni
2025-06-12 10:43 ` [RFC PATCH 2/2] tracing: add testable specifications for event_enable_write/read Gabriele Paoloni
0 siblings, 2 replies; 13+ messages in thread
From: Gabriele Paoloni @ 2025-06-12 10:43 UTC (permalink / raw)
To: rostedt, mhiramat, mathieu.desnoyers, linux-kernel,
linux-trace-kernel
Cc: acarmina, chuck.wolber, Gabriele Paoloni
This patchset addresses some issues found in ftrace_enable_fops and
proposes low-level documentation for event_enable_write/read.
Gabriele Paoloni (2):
tracing: fixes of ftrace_enable_fops
tracing: add testable specifications for event_enable_write/read
kernel/trace/trace_events.c | 83 +++++++++++++++++++++++++++++++++++--
1 file changed, 80 insertions(+), 3 deletions(-)
--
2.48.1
^ permalink raw reply [flat|nested] 13+ messages in thread
* [RFC PATCH 1/2] tracing: fixes of ftrace_enable_fops
2025-06-12 10:43 [RFC PATCH 0/2] tracing: ftrace_enable_fops fixes Gabriele Paoloni
@ 2025-06-12 10:43 ` Gabriele Paoloni
2025-06-13 2:45 ` Masami Hiramatsu
2025-06-12 10:43 ` [RFC PATCH 2/2] tracing: add testable specifications for event_enable_write/read Gabriele Paoloni
1 sibling, 1 reply; 13+ messages in thread
From: Gabriele Paoloni @ 2025-06-12 10:43 UTC (permalink / raw)
To: rostedt, mhiramat, mathieu.desnoyers, linux-kernel,
linux-trace-kernel
Cc: acarmina, chuck.wolber, Gabriele Paoloni
Currently there are different issues associated with ftrace_enable_fops
- event_enable_write: *ppos is increased while not used at all in the
write operation itself (following a write, this could lead a read to
fail or report a corrupted event status);
- event_enable_read: cnt < strlen(buf) is allowed and this can lead to
reading an incomplete event status (i.e. not all status characters
are retrieved) and/or reading the status in a non-atomic way (i.e.
the status could change between two consecutive reads);
- .llseek is set to default_llseek: this is wrong since for this
type of files it does not make sense to reposition the ppos offset.
Hence this should be set instead to noop_llseek.
This patch fixes all the issues listed above.
Signed-off-by: Gabriele Paoloni <gpaoloni@redhat.com>
Tested-by: Alessandro Carminati <acarmina@redhat.com>
---
kernel/trace/trace_events.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 120531268abf..5e84ef01d0c8 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1798,6 +1798,13 @@ event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
strcat(buf, "\n");
+ /*
+ * A requested cnt less than strlen(buf) could lead to a wrong
+ * event status being reported.
+ */
+ if (cnt < strlen(buf))
+ return -EINVAL;
+
return simple_read_from_buffer(ubuf, cnt, ppos, buf, strlen(buf));
}
@@ -1833,8 +1840,6 @@ event_enable_write(struct file *filp, const char __user *ubuf, size_t cnt,
return -EINVAL;
}
- *ppos += cnt;
-
return cnt;
}
@@ -2557,7 +2562,7 @@ static const struct file_operations ftrace_enable_fops = {
.read = event_enable_read,
.write = event_enable_write,
.release = tracing_release_file_tr,
- .llseek = default_llseek,
+ .llseek = noop_llseek,
};
static const struct file_operations ftrace_event_format_fops = {
--
2.48.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [RFC PATCH 2/2] tracing: add testable specifications for event_enable_write/read
2025-06-12 10:43 [RFC PATCH 0/2] tracing: ftrace_enable_fops fixes Gabriele Paoloni
2025-06-12 10:43 ` [RFC PATCH 1/2] tracing: fixes of ftrace_enable_fops Gabriele Paoloni
@ 2025-06-12 10:43 ` Gabriele Paoloni
2025-07-01 22:11 ` Steven Rostedt
1 sibling, 1 reply; 13+ messages in thread
From: Gabriele Paoloni @ 2025-06-12 10:43 UTC (permalink / raw)
To: rostedt, mhiramat, mathieu.desnoyers, linux-kernel,
linux-trace-kernel
Cc: acarmina, chuck.wolber, Gabriele Paoloni
This patch implements the documentation of event_enable_write and
event_enable_read in the form of testable function's expectations.
Signed-off-by: Gabriele Paoloni <gpaoloni@redhat.com>
---
kernel/trace/trace_events.c | 72 +++++++++++++++++++++++++++++++++++++
1 file changed, 72 insertions(+)
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 5e84ef01d0c8..eb3c5e6e557d 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1771,6 +1771,46 @@ static void p_stop(struct seq_file *m, void *p)
mutex_unlock(&event_mutex);
}
+/**
+ * event_enable_read - read from a trace event file to retrieve its status.
+ * @filp: file pointer associated with the target trace event;
+ * @ubuf: user space buffer where the event status is copied to;
+ * @cnt: number of bytes to be copied to the user space buffer;
+ * @ppos: the current position in the buffer.
+ *
+ * This is a way for user space executables to retrieve the status of a
+ * specific event
+ *
+ * Function's expectations:
+ * - This function shall lock the global event_mutex before performing any
+ * operation on the target event file and unlock it after all operations on
+ * the target event file have completed;
+ * - This function shall retrieve the status flags from the file associated
+ * with the target event;
+ * - This function shall format the string to report the event status to user
+ * space:
+ * - The first character of the string to be copied to user space shall be
+ * set to "1" if the enabled flag is set AND the soft_disabled flag is not
+ * set, else it shall be set to "0";
+ * - The second character of the string to be copied to user space shall be
+ * set to "*" if either the soft_disabled flag or the soft_mode flag is
+ * set, else it shall be set to "\n";
+ * - The third character of the string to be copied to user space shall b
+ * set to "\n" if either the soft_disabled flag or the soft_mode flag is
+ * set, else it shall be set to "0";
+ * - Any other character in the string to be copied to user space shall be
+ * set to "0";
+ * - This function shall check if the requested cnt bytes are equal or greater
+ * than the length of the string to be copied to user space (else a
+ * corrupted event status could be reported);
+ * - This function shall invoke simple_read_from_buffer() to perform the copy
+ * of the kernel space string to ubuf.
+ *
+ * Returns the number of copied bytes on success, -ENODEV if the event file
+ * cannot be retrieved from the input filp, -EINVAL if cnt is less than the
+ * length of the string to be copied to ubuf, any error returned by
+ * simple_read_from_buffer
+ */
static ssize_t
event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
loff_t *ppos)
@@ -1808,6 +1848,38 @@ event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
return simple_read_from_buffer(ubuf, cnt, ppos, buf, strlen(buf));
}
+/**
+ * event_enable_write - write to a trace event file to enable/disable it.
+ * @filp: file pointer associated with the target trace event;
+ * @ubuf: user space buffer where the enable/disable value is copied from;
+ * @cnt: number of bytes to be copied from the user space buffer;
+ * @ppos: the current position in the buffer.
+ *
+ * This is a way for user space executables to enable or disable event
+ * recording.
+ *
+ * Function's expectations:
+ * - This function shall copy cnt bytes from the input ubuf buffer to a kernel
+ * space buffer (or the whole input ubuf if cnt is larger than ubuf size)
+ * and shall convert the string within the kernel space buffer into a decimal
+ * base format number;
+ * - This function shall lock the global event_mutex before performing any
+ * operation on the target event file and unlock it after all operations on
+ * the target event file have completed;
+ * - This function shall check the size of the per-cpu ring-buffers used for
+ * the event trace data and, if smaller than TRACE_BUF_SIZE_DEFAULT, expand
+ * them to TRACE_BUF_SIZE_DEFAULT bytes (sizes larger than
+ * TRACE_BUF_SIZE_DEFAULT are not allowed);
+ * - This function shall invoke ftrace_event_enable_disable to enable or
+ * disable the target trace event according to the value read from user space
+ * (0 - disable, 1 - enable);
+ *
+ * Returns 0 on success, any error returned by kstrtoul_from_user, -ENODEV if
+ * the event file cannot be retrieved from the input filp, any error returned by
+ * tracing_update_buffers, any error returned by ftrace_event_enable_disable,
+ * -EINVAL if the value copied from the user space ubuf is different from 0 or
+ * 1.
+ */
static ssize_t
event_enable_write(struct file *filp, const char __user *ubuf, size_t cnt,
loff_t *ppos)
--
2.48.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 1/2] tracing: fixes of ftrace_enable_fops
2025-06-12 10:43 ` [RFC PATCH 1/2] tracing: fixes of ftrace_enable_fops Gabriele Paoloni
@ 2025-06-13 2:45 ` Masami Hiramatsu
2025-06-19 17:07 ` Gabriele Paoloni
0 siblings, 1 reply; 13+ messages in thread
From: Masami Hiramatsu @ 2025-06-13 2:45 UTC (permalink / raw)
To: Gabriele Paoloni
Cc: rostedt, mathieu.desnoyers, linux-kernel, linux-trace-kernel,
acarmina, chuck.wolber
On Thu, 12 Jun 2025 12:43:48 +0200
Gabriele Paoloni <gpaoloni@redhat.com> wrote:
> Currently there are different issues associated with ftrace_enable_fops
> - event_enable_write: *ppos is increased while not used at all in the
> write operation itself (following a write, this could lead a read to
> fail or report a corrupted event status);
Here, we expected the "enable" file is a pseudo text file. So if
there is a write, the ppos should be incremented.
> - event_enable_read: cnt < strlen(buf) is allowed and this can lead to
> reading an incomplete event status (i.e. not all status characters
> are retrieved) and/or reading the status in a non-atomic way (i.e.
> the status could change between two consecutive reads);
As I said, the "enable" file is a kind of text file. So reader must read
it until EOF. If you need to get the consistent result, user should
use the enough size of buffer.
> - .llseek is set to default_llseek: this is wrong since for this
> type of files it does not make sense to reposition the ppos offset.
> Hence this should be set instead to noop_llseek.
As I said, it is a kind of text file, default_llseek is better.
But, if we change (re-design) what is this "enable" file is,
we can accept these changes. So this is not a "Fix" but re-design
of the "enable" file as an interface (as a char device), not a text
file (or a block device).
I want to keep this as is, same as other tracefs files.
Thank you,
>
> This patch fixes all the issues listed above.
>
> Signed-off-by: Gabriele Paoloni <gpaoloni@redhat.com>
> Tested-by: Alessandro Carminati <acarmina@redhat.com>
> ---
> kernel/trace/trace_events.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 120531268abf..5e84ef01d0c8 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -1798,6 +1798,13 @@ event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
>
> strcat(buf, "\n");
>
> + /*
> + * A requested cnt less than strlen(buf) could lead to a wrong
> + * event status being reported.
> + */
> + if (cnt < strlen(buf))
> + return -EINVAL;
> +
> return simple_read_from_buffer(ubuf, cnt, ppos, buf, strlen(buf));
> }
>
> @@ -1833,8 +1840,6 @@ event_enable_write(struct file *filp, const char __user *ubuf, size_t cnt,
> return -EINVAL;
> }
>
> - *ppos += cnt;
> -
> return cnt;
> }
>
> @@ -2557,7 +2562,7 @@ static const struct file_operations ftrace_enable_fops = {
> .read = event_enable_read,
> .write = event_enable_write,
> .release = tracing_release_file_tr,
> - .llseek = default_llseek,
> + .llseek = noop_llseek,
> };
>
> static const struct file_operations ftrace_event_format_fops = {
> --
> 2.48.1
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 1/2] tracing: fixes of ftrace_enable_fops
2025-06-13 2:45 ` Masami Hiramatsu
@ 2025-06-19 17:07 ` Gabriele Paoloni
2025-06-20 9:35 ` Masami Hiramatsu
0 siblings, 1 reply; 13+ messages in thread
From: Gabriele Paoloni @ 2025-06-19 17:07 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: rostedt, mathieu.desnoyers, linux-kernel, linux-trace-kernel,
acarmina, chuck.wolber
Hi Masami
On Fri, Jun 13, 2025 at 4:45 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Thu, 12 Jun 2025 12:43:48 +0200
> Gabriele Paoloni <gpaoloni@redhat.com> wrote:
>
> > Currently there are different issues associated with ftrace_enable_fops
> > - event_enable_write: *ppos is increased while not used at all in the
> > write operation itself (following a write, this could lead a read to
> > fail or report a corrupted event status);
>
> Here, we expected the "enable" file is a pseudo text file. So if
> there is a write, the ppos should be incremented.
>
> > - event_enable_read: cnt < strlen(buf) is allowed and this can lead to
> > reading an incomplete event status (i.e. not all status characters
> > are retrieved) and/or reading the status in a non-atomic way (i.e.
> > the status could change between two consecutive reads);
>
> As I said, the "enable" file is a kind of text file. So reader must read
> it until EOF. If you need to get the consistent result, user should
> use the enough size of buffer.
What I am concerned about are scenarios like the one below:
---
# strace -Tfe trace=openat,open,read,write scat 1
/sys/kernel/tracing/events/kprobes/ev1/enable
open("/sys/kernel/tracing/events/kprobes/ev1/enable",
O_RDONLY|O_LARGEFILE) = 3 <0.000237>
Open /sys/kernel/tracing/events/kprobes/ev1/enable ->fd=3
read fd=3, 1
read(3, "0", 1) = 1 <0.000099>
1 bytes Read
30,
read(3, "\n", 1) = 1 <0.000095>
1 bytes Read
0a,
read(3, "", 1) = 0 <0.000102>
close fd=3
+++ exited with 0 +++
---
So in this case there are 2 consecutive reads byte by byte that
could lead to inconsistent results if in the meantime the event
status has changed.
With the proposed patchset the same test would result in a failure
as per log below:
---
# strace -Tfe trace=openat,open,read,write scat 1
/sys/kernel/tracing/events/kprobes/ev1/enable
open("/sys/kernel/tracing/events/kprobes/ev1/enable",
O_RDONLY|O_LARGEFILE) = 3 <0.000227>
Open /sys/kernel/tracing/events/kprobes/ev1/enable ->fd=3
read fd=3, 1
read(3, 0x7ffd960234e0, 1) = -1 EINVAL (Invalid argument)
<0.000228>
close fd=3
+++ exited with 0 +++
---
On the other side the proposed patchset would be still compatible with
“cat” or “scat 2” commands that continue to work as they do today.
>
> > - .llseek is set to default_llseek: this is wrong since for this
> > type of files it does not make sense to reposition the ppos offset.
> > Hence this should be set instead to noop_llseek.
>
> As I said, it is a kind of text file, default_llseek is better.
>
> But, if we change (re-design) what is this "enable" file is,
> we can accept these changes. So this is not a "Fix" but re-design
> of the "enable" file as an interface (as a char device), not a text
> file (or a block device).
>
> I want to keep this as is, same as other tracefs files.
IMO it is a redesign that is enforcing the user to avoid erroneous
usages of enable files (because the reads are either reporting the
whole and correct status of the event or failing to read; also the user
cannot llseek into a position that would lead to not reading or reading
a corrupted status).
On the other hand the proposed re-design is fully compatible with
the current user space commands reading and writing to the enable
files.
If the concern is having inconsistent implementations between tracefs
files, I am happy to extend this patchset to all traces files, however,
before doing so, I would like to have your approval.
Otherwise I will just document the current functions and associated
assumptions of use that the user must comply with in order to avoid
the erroneous behaviour.
Thanks a lot for your inputs and clarifications.
Gab
>
> Thank you,
>
> >
> > This patch fixes all the issues listed above.
> >
> > Signed-off-by: Gabriele Paoloni <gpaoloni@redhat.com>
> > Tested-by: Alessandro Carminati <acarmina@redhat.com>
> > ---
> > kernel/trace/trace_events.c | 11 ++++++++---
> > 1 file changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > index 120531268abf..5e84ef01d0c8 100644
> > --- a/kernel/trace/trace_events.c
> > +++ b/kernel/trace/trace_events.c
> > @@ -1798,6 +1798,13 @@ event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
> >
> > strcat(buf, "\n");
> >
> > + /*
> > + * A requested cnt less than strlen(buf) could lead to a wrong
> > + * event status being reported.
> > + */
> > + if (cnt < strlen(buf))
> > + return -EINVAL;
> > +
> > return simple_read_from_buffer(ubuf, cnt, ppos, buf, strlen(buf));
> > }
> >
> > @@ -1833,8 +1840,6 @@ event_enable_write(struct file *filp, const char __user *ubuf, size_t cnt,
> > return -EINVAL;
> > }
> >
> > - *ppos += cnt;
> > -
> > return cnt;
> > }
> >
> > @@ -2557,7 +2562,7 @@ static const struct file_operations ftrace_enable_fops = {
> > .read = event_enable_read,
> > .write = event_enable_write,
> > .release = tracing_release_file_tr,
> > - .llseek = default_llseek,
> > + .llseek = noop_llseek,
> > };
> >
> > static const struct file_operations ftrace_event_format_fops = {
> > --
> > 2.48.1
> >
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 1/2] tracing: fixes of ftrace_enable_fops
2025-06-19 17:07 ` Gabriele Paoloni
@ 2025-06-20 9:35 ` Masami Hiramatsu
2025-06-20 13:26 ` Gabriele Paoloni
0 siblings, 1 reply; 13+ messages in thread
From: Masami Hiramatsu @ 2025-06-20 9:35 UTC (permalink / raw)
To: Gabriele Paoloni
Cc: rostedt, mathieu.desnoyers, linux-kernel, linux-trace-kernel,
acarmina, chuck.wolber
On Thu, 19 Jun 2025 19:07:33 +0200
Gabriele Paoloni <gpaoloni@redhat.com> wrote:
> Hi Masami
>
> On Fri, Jun 13, 2025 at 4:45 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Thu, 12 Jun 2025 12:43:48 +0200
> > Gabriele Paoloni <gpaoloni@redhat.com> wrote:
> >
> > > Currently there are different issues associated with ftrace_enable_fops
> > > - event_enable_write: *ppos is increased while not used at all in the
> > > write operation itself (following a write, this could lead a read to
> > > fail or report a corrupted event status);
> >
> > Here, we expected the "enable" file is a pseudo text file. So if
> > there is a write, the ppos should be incremented.
> >
> > > - event_enable_read: cnt < strlen(buf) is allowed and this can lead to
> > > reading an incomplete event status (i.e. not all status characters
> > > are retrieved) and/or reading the status in a non-atomic way (i.e.
> > > the status could change between two consecutive reads);
> >
> > As I said, the "enable" file is a kind of text file. So reader must read
> > it until EOF. If you need to get the consistent result, user should
> > use the enough size of buffer.
>
> What I am concerned about are scenarios like the one below:
> ---
> # strace -Tfe trace=openat,open,read,write scat 1
> /sys/kernel/tracing/events/kprobes/ev1/enable
> open("/sys/kernel/tracing/events/kprobes/ev1/enable",
> O_RDONLY|O_LARGEFILE) = 3 <0.000237>
> Open /sys/kernel/tracing/events/kprobes/ev1/enable ->fd=3
> read fd=3, 1
> read(3, "0", 1) = 1 <0.000099>
> 1 bytes Read
> 30,
> read(3, "\n", 1) = 1 <0.000095>
> 1 bytes Read
> 0a,
> read(3, "", 1) = 0 <0.000102>
> close fd=3
> +++ exited with 0 +++
> ---
> So in this case there are 2 consecutive reads byte by byte that
> could lead to inconsistent results if in the meantime the event
> status has changed.
Unless you take a lock explicitly, ftrace (and other pseudo
files) does not guarantee the consistency between 2 read()
syscalls, because it is something like a file which is shared
with kernel side.
Please imagine that this is something like a file shared
between two processes, one updating it and one reading it.
The kernel guarantees the one read() will consistent, but
two read()s may not be consistent because it can be updated
by another.
> With the proposed patchset the same test would result in a failure
> as per log below:
> ---
> # strace -Tfe trace=openat,open,read,write scat 1
> /sys/kernel/tracing/events/kprobes/ev1/enable
> open("/sys/kernel/tracing/events/kprobes/ev1/enable",
> O_RDONLY|O_LARGEFILE) = 3 <0.000227>
> Open /sys/kernel/tracing/events/kprobes/ev1/enable ->fd=3
> read fd=3, 1
> read(3, 0x7ffd960234e0, 1) = -1 EINVAL (Invalid argument)
> <0.000228>
> close fd=3
> +++ exited with 0 +++
> ---
> On the other side the proposed patchset would be still compatible with
> “cat” or “scat 2” commands that continue to work as they do today.
>
> >
> > > - .llseek is set to default_llseek: this is wrong since for this
> > > type of files it does not make sense to reposition the ppos offset.
> > > Hence this should be set instead to noop_llseek.
> >
> > As I said, it is a kind of text file, default_llseek is better.
> >
> > But, if we change (re-design) what is this "enable" file is,
> > we can accept these changes. So this is not a "Fix" but re-design
> > of the "enable" file as an interface (as a char device), not a text
> > file (or a block device).
> >
> > I want to keep this as is, same as other tracefs files.
>
> IMO it is a redesign that is enforcing the user to avoid erroneous
> usages of enable files (because the reads are either reporting the
> whole and correct status of the event or failing to read; also the user
> cannot llseek into a position that would lead to not reading or reading
> a corrupted status).
Can you make it for files which can be bigger than PAGE_SIZE?
For example, "hist" file also can be inconsistent if user reads
it several times. Can you also update it to return -EINVAL
if read buffer size is smaller?
>
> On the other hand the proposed re-design is fully compatible with
> the current user space commands reading and writing to the enable
> files.
>
> If the concern is having inconsistent implementations between tracefs
> files, I am happy to extend this patchset to all traces files, however,
> before doing so, I would like to have your approval.
Hmm, I'm still not convinced. If you redesign it, that should also
be applied to other pseudo files. "why tracefs can not read partially,
but procfs can?" I guess that can cause more confusion and will
lead unneeded debug.
> Otherwise I will just document the current functions and associated
> assumptions of use that the user must comply with in order to avoid
> the erroneous behaviour.
Yeah, I like to update the document so that user must read with enough
size of buffer. And TIPs how to read consistent data from each file.
Thank you,
>
> Thanks a lot for your inputs and clarifications.
> Gab
> >
> > Thank you,
> >
> > >
> > > This patch fixes all the issues listed above.
> > >
> > > Signed-off-by: Gabriele Paoloni <gpaoloni@redhat.com>
> > > Tested-by: Alessandro Carminati <acarmina@redhat.com>
> > > ---
> > > kernel/trace/trace_events.c | 11 ++++++++---
> > > 1 file changed, 8 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > > index 120531268abf..5e84ef01d0c8 100644
> > > --- a/kernel/trace/trace_events.c
> > > +++ b/kernel/trace/trace_events.c
> > > @@ -1798,6 +1798,13 @@ event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
> > >
> > > strcat(buf, "\n");
> > >
> > > + /*
> > > + * A requested cnt less than strlen(buf) could lead to a wrong
> > > + * event status being reported.
> > > + */
> > > + if (cnt < strlen(buf))
> > > + return -EINVAL;
> > > +
> > > return simple_read_from_buffer(ubuf, cnt, ppos, buf, strlen(buf));
> > > }
> > >
> > > @@ -1833,8 +1840,6 @@ event_enable_write(struct file *filp, const char __user *ubuf, size_t cnt,
> > > return -EINVAL;
> > > }
> > >
> > > - *ppos += cnt;
> > > -
> > > return cnt;
> > > }
> > >
> > > @@ -2557,7 +2562,7 @@ static const struct file_operations ftrace_enable_fops = {
> > > .read = event_enable_read,
> > > .write = event_enable_write,
> > > .release = tracing_release_file_tr,
> > > - .llseek = default_llseek,
> > > + .llseek = noop_llseek,
> > > };
> > >
> > > static const struct file_operations ftrace_event_format_fops = {
> > > --
> > > 2.48.1
> > >
> >
> >
> > --
> > Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 1/2] tracing: fixes of ftrace_enable_fops
2025-06-20 9:35 ` Masami Hiramatsu
@ 2025-06-20 13:26 ` Gabriele Paoloni
2025-07-01 21:58 ` Steven Rostedt
0 siblings, 1 reply; 13+ messages in thread
From: Gabriele Paoloni @ 2025-06-20 13:26 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: rostedt, mathieu.desnoyers, linux-kernel, linux-trace-kernel,
acarmina, chuck.wolber
On Fri, Jun 20, 2025 at 11:35 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Thu, 19 Jun 2025 19:07:33 +0200
> Gabriele Paoloni <gpaoloni@redhat.com> wrote:
>
> > Hi Masami
> >
> > On Fri, Jun 13, 2025 at 4:45 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > >
> > > On Thu, 12 Jun 2025 12:43:48 +0200
> > > Gabriele Paoloni <gpaoloni@redhat.com> wrote:
> > >
> > > > Currently there are different issues associated with ftrace_enable_fops
> > > > - event_enable_write: *ppos is increased while not used at all in the
> > > > write operation itself (following a write, this could lead a read to
> > > > fail or report a corrupted event status);
> > >
> > > Here, we expected the "enable" file is a pseudo text file. So if
> > > there is a write, the ppos should be incremented.
> > >
> > > > - event_enable_read: cnt < strlen(buf) is allowed and this can lead to
> > > > reading an incomplete event status (i.e. not all status characters
> > > > are retrieved) and/or reading the status in a non-atomic way (i.e.
> > > > the status could change between two consecutive reads);
> > >
> > > As I said, the "enable" file is a kind of text file. So reader must read
> > > it until EOF. If you need to get the consistent result, user should
> > > use the enough size of buffer.
> >
> > What I am concerned about are scenarios like the one below:
> > ---
> > # strace -Tfe trace=openat,open,read,write scat 1
> > /sys/kernel/tracing/events/kprobes/ev1/enable
> > open("/sys/kernel/tracing/events/kprobes/ev1/enable",
> > O_RDONLY|O_LARGEFILE) = 3 <0.000237>
> > Open /sys/kernel/tracing/events/kprobes/ev1/enable ->fd=3
> > read fd=3, 1
> > read(3, "0", 1) = 1 <0.000099>
> > 1 bytes Read
> > 30,
> > read(3, "\n", 1) = 1 <0.000095>
> > 1 bytes Read
> > 0a,
> > read(3, "", 1) = 0 <0.000102>
> > close fd=3
> > +++ exited with 0 +++
> > ---
> > So in this case there are 2 consecutive reads byte by byte that
> > could lead to inconsistent results if in the meantime the event
> > status has changed.
>
> Unless you take a lock explicitly, ftrace (and other pseudo
> files) does not guarantee the consistency between 2 read()
> syscalls, because it is something like a file which is shared
> with kernel side.
>
> Please imagine that this is something like a file shared
> between two processes, one updating it and one reading it.
> The kernel guarantees the one read() will consistent, but
> two read()s may not be consistent because it can be updated
> by another.
Yes, this is the reason behind the proposal I made here.
In this case the Kernel rejects a read that is requesting
a number of bytes cnt that is smaller than strlen(buf).
>
> > With the proposed patchset the same test would result in a failure
> > as per log below:
> > ---
> > # strace -Tfe trace=openat,open,read,write scat 1
> > /sys/kernel/tracing/events/kprobes/ev1/enable
> > open("/sys/kernel/tracing/events/kprobes/ev1/enable",
> > O_RDONLY|O_LARGEFILE) = 3 <0.000227>
> > Open /sys/kernel/tracing/events/kprobes/ev1/enable ->fd=3
> > read fd=3, 1
> > read(3, 0x7ffd960234e0, 1) = -1 EINVAL (Invalid argument)
> > <0.000228>
> > close fd=3
> > +++ exited with 0 +++
> > ---
> > On the other side the proposed patchset would be still compatible with
> > “cat” or “scat 2” commands that continue to work as they do today.
> >
> > >
> > > > - .llseek is set to default_llseek: this is wrong since for this
> > > > type of files it does not make sense to reposition the ppos offset.
> > > > Hence this should be set instead to noop_llseek.
> > >
> > > As I said, it is a kind of text file, default_llseek is better.
> > >
> > > But, if we change (re-design) what is this "enable" file is,
> > > we can accept these changes. So this is not a "Fix" but re-design
> > > of the "enable" file as an interface (as a char device), not a text
> > > file (or a block device).
> > >
> > > I want to keep this as is, same as other tracefs files.
> >
> > IMO it is a redesign that is enforcing the user to avoid erroneous
> > usages of enable files (because the reads are either reporting the
> > whole and correct status of the event or failing to read; also the user
> > cannot llseek into a position that would lead to not reading or reading
> > a corrupted status).
>
> Can you make it for files which can be bigger than PAGE_SIZE?
>
> For example, "hist" file also can be inconsistent if user reads
> it several times. Can you also update it to return -EINVAL
> if read buffer size is smaller?
From a very quick look it seems that the read callback of event_hist_fops
is set to the standard seq_read, whereas the read callback in
ftrace_enable_fops defines its own method.
So maybe redesigning the read callback of event_hist_fops could
achieve this goal (in order to be sure I need to look deeper into it,
this is just a guess).
>
> >
> > On the other hand the proposed re-design is fully compatible with
> > the current user space commands reading and writing to the enable
> > files.
> >
> > If the concern is having inconsistent implementations between tracefs
> > files, I am happy to extend this patchset to all traces files, however,
> > before doing so, I would like to have your approval.
>
>
> Hmm, I'm still not convinced. If you redesign it, that should also
> be applied to other pseudo files. "why tracefs can not read partially,
> but procfs can?" I guess that can cause more confusion and will
> lead unneeded debug.
>
> > Otherwise I will just document the current functions and associated
> > assumptions of use that the user must comply with in order to avoid
> > the erroneous behaviour.
>
> Yeah, I like to update the document so that user must read with enough
> size of buffer. And TIPs how to read consistent data from each file.
I think that from my side I do not have comprehensive answers to all your
questions (I need to read the code more in depth).
So to be pragmatic I can split this effort into two parts (documentation and
redesign); I will propose documentation first with the TIPs that you mentioned
above and later, if we find a better re-design solution, we can also amend
the documentation as needed.
Many thanks for your advice and directions
Gab
>
> Thank you,
>
> >
> > Thanks a lot for your inputs and clarifications.
> > Gab
> > >
> > > Thank you,
> > >
> > > >
> > > > This patch fixes all the issues listed above.
> > > >
> > > > Signed-off-by: Gabriele Paoloni <gpaoloni@redhat.com>
> > > > Tested-by: Alessandro Carminati <acarmina@redhat.com>
> > > > ---
> > > > kernel/trace/trace_events.c | 11 ++++++++---
> > > > 1 file changed, 8 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > > > index 120531268abf..5e84ef01d0c8 100644
> > > > --- a/kernel/trace/trace_events.c
> > > > +++ b/kernel/trace/trace_events.c
> > > > @@ -1798,6 +1798,13 @@ event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
> > > >
> > > > strcat(buf, "\n");
> > > >
> > > > + /*
> > > > + * A requested cnt less than strlen(buf) could lead to a wrong
> > > > + * event status being reported.
> > > > + */
> > > > + if (cnt < strlen(buf))
> > > > + return -EINVAL;
> > > > +
> > > > return simple_read_from_buffer(ubuf, cnt, ppos, buf, strlen(buf));
> > > > }
> > > >
> > > > @@ -1833,8 +1840,6 @@ event_enable_write(struct file *filp, const char __user *ubuf, size_t cnt,
> > > > return -EINVAL;
> > > > }
> > > >
> > > > - *ppos += cnt;
> > > > -
> > > > return cnt;
> > > > }
> > > >
> > > > @@ -2557,7 +2562,7 @@ static const struct file_operations ftrace_enable_fops = {
> > > > .read = event_enable_read,
> > > > .write = event_enable_write,
> > > > .release = tracing_release_file_tr,
> > > > - .llseek = default_llseek,
> > > > + .llseek = noop_llseek,
> > > > };
> > > >
> > > > static const struct file_operations ftrace_event_format_fops = {
> > > > --
> > > > 2.48.1
> > > >
> > >
> > >
> > > --
> > > Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > >
> >
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 1/2] tracing: fixes of ftrace_enable_fops
2025-06-20 13:26 ` Gabriele Paoloni
@ 2025-07-01 21:58 ` Steven Rostedt
2025-07-02 14:16 ` Gabriele Paoloni
0 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2025-07-01 21:58 UTC (permalink / raw)
To: Gabriele Paoloni
Cc: Masami Hiramatsu, mathieu.desnoyers, linux-kernel,
linux-trace-kernel, acarmina, chuck.wolber
On Fri, 20 Jun 2025 15:26:27 +0200
Gabriele Paoloni <gpaoloni@redhat.com> wrote:
> I think that from my side I do not have comprehensive answers to all your
> questions (I need to read the code more in depth).
> So to be pragmatic I can split this effort into two parts (documentation and
> redesign); I will propose documentation first with the TIPs that you mentioned
> above and later, if we find a better re-design solution, we can also amend
> the documentation as needed.
Just to confirm, I agree with Masami. The enable file is quite special,
and I don't see the use of user space playing tricks with it, which
even includes lseek. Maybe to keep rewinding a read to get a new status
change?
But it usually contains a single character (sometimes two) and a new
line. It's not something that's ever been reported as an issue. I
rather not touch it if it hasn't been reported as broken because
there's some hypothetical use case that can see it as broken.
Documenting its current behavior is perfectly fine with me.
-- Steve
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 2/2] tracing: add testable specifications for event_enable_write/read
2025-06-12 10:43 ` [RFC PATCH 2/2] tracing: add testable specifications for event_enable_write/read Gabriele Paoloni
@ 2025-07-01 22:11 ` Steven Rostedt
2025-07-02 14:59 ` Gabriele Paoloni
0 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2025-07-01 22:11 UTC (permalink / raw)
To: Gabriele Paoloni
Cc: mhiramat, mathieu.desnoyers, linux-kernel, linux-trace-kernel,
acarmina, chuck.wolber
On Thu, 12 Jun 2025 12:43:49 +0200
Gabriele Paoloni <gpaoloni@redhat.com> wrote:
> This patch implements the documentation of event_enable_write and
> event_enable_read in the form of testable function's expectations.
>
> Signed-off-by: Gabriele Paoloni <gpaoloni@redhat.com>
> ---
> kernel/trace/trace_events.c | 72 +++++++++++++++++++++++++++++++++++++
> 1 file changed, 72 insertions(+)
>
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 5e84ef01d0c8..eb3c5e6e557d 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -1771,6 +1771,46 @@ static void p_stop(struct seq_file *m, void *p)
> mutex_unlock(&event_mutex);
> }
>
> +/**
> + * event_enable_read - read from a trace event file to retrieve its status.
> + * @filp: file pointer associated with the target trace event;
> + * @ubuf: user space buffer where the event status is copied to;
> + * @cnt: number of bytes to be copied to the user space buffer;
Why is the above ending with semicolons?
> + * @ppos: the current position in the buffer.
> + *
> + * This is a way for user space executables to retrieve the status of a
> + * specific event
> + *
> + * Function's expectations:
> + * - This function shall lock the global event_mutex before performing any
> + * operation on the target event file and unlock it after all operations on
> + * the target event file have completed;
> + * - This function shall retrieve the status flags from the file associated
> + * with the target event;
> + * - This function shall format the string to report the event status to user
> + * space:
> + * - The first character of the string to be copied to user space shall be
> + * set to "1" if the enabled flag is set AND the soft_disabled flag is not
> + * set, else it shall be set to "0";
> + * - The second character of the string to be copied to user space shall be
> + * set to "*" if either the soft_disabled flag or the soft_mode flag is
> + * set, else it shall be set to "\n";
> + * - The third character of the string to be copied to user space shall b
> + * set to "\n" if either the soft_disabled flag or the soft_mode flag is
> + * set, else it shall be set to "0";
> + * - Any other character in the string to be copied to user space shall be
> + * set to "0";
The above could use some new lines. Like between each bullet. My
eyesight isn't as good anymore and I find reading gobs of text more
difficult. Newlines give my eyes a break.
I know this is being written so that a tool could parse it, but also
needs to be parsed by aging developers. ;-)
> + * - This function shall check if the requested cnt bytes are equal or greater
> + * than the length of the string to be copied to user space (else a
> + * corrupted event status could be reported);
> + * - This function shall invoke simple_read_from_buffer() to perform the copy
> + * of the kernel space string to ubuf.
Hmm, and it is also verbose. This is a relatively simple function, and
it caused quite a bit of requirements. I wonder if it could be
shortened a bit. Otherwise more complex and larger functions are going
to be overwhelming and likely not acceptable. And those are the
functions that I think this will benefit the most!
> + *
> + * Returns the number of copied bytes on success, -ENODEV if the event file
> + * cannot be retrieved from the input filp, -EINVAL if cnt is less than the
> + * length of the string to be copied to ubuf, any error returned by
> + * simple_read_from_buffer
Returns should be formatted better where each return value is on a
separate line.
-- Steve
> + */
> static ssize_t
> event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
> loff_t *ppos)
> @@ -1808,6 +1848,38 @@ event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
> return simple_read_from_buffer(ubuf, cnt, ppos, buf, strlen(buf));
> }
>
> +/**
> + * event_enable_write - write to a trace event file to enable/disable it.
> + * @filp: file pointer associated with the target trace event;
> + * @ubuf: user space buffer where the enable/disable value is copied from;
> + * @cnt: number of bytes to be copied from the user space buffer;
> + * @ppos: the current position in the buffer.
> + *
> + * This is a way for user space executables to enable or disable event
> + * recording.
> + *
> + * Function's expectations:
> + * - This function shall copy cnt bytes from the input ubuf buffer to a kernel
> + * space buffer (or the whole input ubuf if cnt is larger than ubuf size)
> + * and shall convert the string within the kernel space buffer into a decimal
> + * base format number;
> + * - This function shall lock the global event_mutex before performing any
> + * operation on the target event file and unlock it after all operations on
> + * the target event file have completed;
> + * - This function shall check the size of the per-cpu ring-buffers used for
> + * the event trace data and, if smaller than TRACE_BUF_SIZE_DEFAULT, expand
> + * them to TRACE_BUF_SIZE_DEFAULT bytes (sizes larger than
> + * TRACE_BUF_SIZE_DEFAULT are not allowed);
> + * - This function shall invoke ftrace_event_enable_disable to enable or
> + * disable the target trace event according to the value read from user space
> + * (0 - disable, 1 - enable);
> + *
> + * Returns 0 on success, any error returned by kstrtoul_from_user, -ENODEV if
> + * the event file cannot be retrieved from the input filp, any error returned by
> + * tracing_update_buffers, any error returned by ftrace_event_enable_disable,
> + * -EINVAL if the value copied from the user space ubuf is different from 0 or
> + * 1.
> + */
> static ssize_t
> event_enable_write(struct file *filp, const char __user *ubuf, size_t cnt,
> loff_t *ppos)
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 1/2] tracing: fixes of ftrace_enable_fops
2025-07-01 21:58 ` Steven Rostedt
@ 2025-07-02 14:16 ` Gabriele Paoloni
0 siblings, 0 replies; 13+ messages in thread
From: Gabriele Paoloni @ 2025-07-02 14:16 UTC (permalink / raw)
To: Steven Rostedt
Cc: Masami Hiramatsu, mathieu.desnoyers, linux-kernel,
linux-trace-kernel, acarmina, chuck.wolber
On Tue, Jul 1, 2025 at 11:58 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Fri, 20 Jun 2025 15:26:27 +0200
> Gabriele Paoloni <gpaoloni@redhat.com> wrote:
>
> > I think that from my side I do not have comprehensive answers to all your
> > questions (I need to read the code more in depth).
> > So to be pragmatic I can split this effort into two parts (documentation and
> > redesign); I will propose documentation first with the TIPs that you mentioned
> > above and later, if we find a better re-design solution, we can also amend
> > the documentation as needed.
>
> Just to confirm, I agree with Masami. The enable file is quite special,
> and I don't see the use of user space playing tricks with it, which
> even includes lseek. Maybe to keep rewinding a read to get a new status
> change?
Well the proposed patchset was aiming to prevent the user from doing
stupid things (e.g. reading 1byte at a time or reading after a write has
increased ppos). However documenting the correct AoUs would still work
>
> But it usually contains a single character (sometimes two) and a new
> line. It's not something that's ever been reported as an issue. I
> rather not touch it if it hasn't been reported as broken because
> there's some hypothetical use case that can see it as broken.
>
> Documenting its current behavior is perfectly fine with me.
Yep "[RFC PATCH v3] tracing: add testable specifications for
event_enable_write/read" is already out.
Thanks
Gab
>
> -- Steve
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 2/2] tracing: add testable specifications for event_enable_write/read
2025-07-01 22:11 ` Steven Rostedt
@ 2025-07-02 14:59 ` Gabriele Paoloni
2025-07-02 15:12 ` Steven Rostedt
0 siblings, 1 reply; 13+ messages in thread
From: Gabriele Paoloni @ 2025-07-02 14:59 UTC (permalink / raw)
To: Steven Rostedt
Cc: mhiramat, mathieu.desnoyers, linux-kernel, linux-trace-kernel,
acarmina, chuck.wolber
On Wed, Jul 2, 2025 at 12:11 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Thu, 12 Jun 2025 12:43:49 +0200
> Gabriele Paoloni <gpaoloni@redhat.com> wrote:
>
> > This patch implements the documentation of event_enable_write and
> > event_enable_read in the form of testable function's expectations.
> >
> > Signed-off-by: Gabriele Paoloni <gpaoloni@redhat.com>
> > ---
> > kernel/trace/trace_events.c | 72 +++++++++++++++++++++++++++++++++++++
> > 1 file changed, 72 insertions(+)
> >
> > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > index 5e84ef01d0c8..eb3c5e6e557d 100644
> > --- a/kernel/trace/trace_events.c
> > +++ b/kernel/trace/trace_events.c
> > @@ -1771,6 +1771,46 @@ static void p_stop(struct seq_file *m, void *p)
> > mutex_unlock(&event_mutex);
> > }
> >
> > +/**
> > + * event_enable_read - read from a trace event file to retrieve its status.
> > + * @filp: file pointer associated with the target trace event;
> > + * @ubuf: user space buffer where the event status is copied to;
> > + * @cnt: number of bytes to be copied to the user space buffer;
>
> Why is the above ending with semicolons?
Sorry it should be a full stop, I'll fix it in v4.
>
> > + * @ppos: the current position in the buffer.
> > + *
> > + * This is a way for user space executables to retrieve the status of a
> > + * specific event
> > + *
> > + * Function's expectations:
> > + * - This function shall lock the global event_mutex before performing any
> > + * operation on the target event file and unlock it after all operations on
> > + * the target event file have completed;
> > + * - This function shall retrieve the status flags from the file associated
> > + * with the target event;
> > + * - This function shall format the string to report the event status to user
> > + * space:
> > + * - The first character of the string to be copied to user space shall be
> > + * set to "1" if the enabled flag is set AND the soft_disabled flag is not
> > + * set, else it shall be set to "0";
> > + * - The second character of the string to be copied to user space shall be
> > + * set to "*" if either the soft_disabled flag or the soft_mode flag is
> > + * set, else it shall be set to "\n";
> > + * - The third character of the string to be copied to user space shall b
> > + * set to "\n" if either the soft_disabled flag or the soft_mode flag is
> > + * set, else it shall be set to "0";
> > + * - Any other character in the string to be copied to user space shall be
> > + * set to "0";
>
> The above could use some new lines. Like between each bullet. My
> eyesight isn't as good anymore and I find reading gobs of text more
> difficult. Newlines give my eyes a break.
>
> I know this is being written so that a tool could parse it, but also
> needs to be parsed by aging developers. ;-)
Agreed, I'll fix it in v4.
>
> > + * - This function shall check if the requested cnt bytes are equal or greater
> > + * than the length of the string to be copied to user space (else a
> > + * corrupted event status could be reported);
> > + * - This function shall invoke simple_read_from_buffer() to perform the copy
> > + * of the kernel space string to ubuf.
>
> Hmm, and it is also verbose. This is a relatively simple function, and
> it caused quite a bit of requirements. I wonder if it could be
> shortened a bit. Otherwise more complex and larger functions are going
> to be overwhelming and likely not acceptable. And those are the
> functions that I think this will benefit the most!
Mmm got it. What about
* Function's expectations:
* - This function shall lock the global event_mutex before performing any
* operation on the target event file and unlock it after all operations on
* the target event file have completed;
*
* - This function shall format the string copied to userspace according to
* the status flags retrieved from the target event file:
* - The first character shall be set to "1" if the enabled flag is
set AND the
* soft_disabled flag is not set, else it shall be set to "0";
* - The second character is optional and shall be set to "*" if either the
* soft_disabled flag or the soft_mode flag is set;
* - The string shall be terminated by a newline ("\n") and any remaining
* character shall be set to "0";
*
* - This function shall invoke simple_read_from_buffer() to perform the copy
* of the kernel space string to ubuf.
(pls note that the check on cnt has been removed in v3 that is out already)
>
> > + *
> > + * Returns the number of copied bytes on success, -ENODEV if the event file
> > + * cannot be retrieved from the input filp, -EINVAL if cnt is less than the
> > + * length of the string to be copied to ubuf, any error returned by
> > + * simple_read_from_buffer
>
> Returns should be formatted better where each return value is on a
> separate line.
Understood, I'll fix this in v4
Many Thanks!
Gab
>
> -- Steve
>
> > + */
> > static ssize_t
> > event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
> > loff_t *ppos)
> > @@ -1808,6 +1848,38 @@ event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
> > return simple_read_from_buffer(ubuf, cnt, ppos, buf, strlen(buf));
> > }
> >
> > +/**
> > + * event_enable_write - write to a trace event file to enable/disable it.
> > + * @filp: file pointer associated with the target trace event;
> > + * @ubuf: user space buffer where the enable/disable value is copied from;
> > + * @cnt: number of bytes to be copied from the user space buffer;
> > + * @ppos: the current position in the buffer.
> > + *
> > + * This is a way for user space executables to enable or disable event
> > + * recording.
> > + *
> > + * Function's expectations:
> > + * - This function shall copy cnt bytes from the input ubuf buffer to a kernel
> > + * space buffer (or the whole input ubuf if cnt is larger than ubuf size)
> > + * and shall convert the string within the kernel space buffer into a decimal
> > + * base format number;
> > + * - This function shall lock the global event_mutex before performing any
> > + * operation on the target event file and unlock it after all operations on
> > + * the target event file have completed;
> > + * - This function shall check the size of the per-cpu ring-buffers used for
> > + * the event trace data and, if smaller than TRACE_BUF_SIZE_DEFAULT, expand
> > + * them to TRACE_BUF_SIZE_DEFAULT bytes (sizes larger than
> > + * TRACE_BUF_SIZE_DEFAULT are not allowed);
> > + * - This function shall invoke ftrace_event_enable_disable to enable or
> > + * disable the target trace event according to the value read from user space
> > + * (0 - disable, 1 - enable);
> > + *
> > + * Returns 0 on success, any error returned by kstrtoul_from_user, -ENODEV if
> > + * the event file cannot be retrieved from the input filp, any error returned by
> > + * tracing_update_buffers, any error returned by ftrace_event_enable_disable,
> > + * -EINVAL if the value copied from the user space ubuf is different from 0 or
> > + * 1.
> > + */
> > static ssize_t
> > event_enable_write(struct file *filp, const char __user *ubuf, size_t cnt,
> > loff_t *ppos)
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 2/2] tracing: add testable specifications for event_enable_write/read
2025-07-02 14:59 ` Gabriele Paoloni
@ 2025-07-02 15:12 ` Steven Rostedt
2025-07-02 16:12 ` Gabriele Paoloni
0 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2025-07-02 15:12 UTC (permalink / raw)
To: Gabriele Paoloni
Cc: mhiramat, mathieu.desnoyers, linux-kernel, linux-trace-kernel,
acarmina, chuck.wolber
On Wed, 2 Jul 2025 16:59:29 +0200
Gabriele Paoloni <gpaoloni@redhat.com> wrote:
> Mmm got it. What about
>
> * Function's expectations:
> * - This function shall lock the global event_mutex before performing any
> * operation on the target event file and unlock it after all operations on
> * the target event file have completed;
Since 99% of the time that a lock is taken in a function it is
released, I think that should be the default assumption here, and only
when a lock is taken and not release, that should be explicitly called
out.
And also we should remove "This function" we know that these
requirements are for this function.
- The global event_mutex shall be taken before performing any
operation on the target event.
Should be good enough.
If the lock can be released and taken again, that too should be
explicit in the requirements otherwise it is assumed it is taken once
and not released until the operation is completed.
> *
> * - This function shall format the string copied to userspace according to
> * the status flags retrieved from the target event file:
> * - The first character shall be set to "1" if the enabled flag is
> set AND the
> * soft_disabled flag is not set, else it shall be set to "0";
> * - The second character is optional and shall be set to "*" if either the
> * soft_disabled flag or the soft_mode flag is set;
> * - The string shall be terminated by a newline ("\n") and any remaining
> * character shall be set to "0";
- The string copied to user space shall be formatted according to the
status flags from the target event file:
- If the enable flag is set AND the soft_disable flag is not set then
the first character shall be set to "1" ELSE it shall be set to "0"
- If either the soft_disable fag or the soft_mode flag is set then the
second character shall be set to "*" ELSE it is skipped.
I think the above is easier to read and is a bit more consolidated.
Stating the status then the effect is also easier to read.
-- Steve
> *
> * - This function shall invoke simple_read_from_buffer() to perform the copy
> * of the kernel space string to ubuf.
>
> (pls note that the check on cnt has been removed in v3 that is out already)
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 2/2] tracing: add testable specifications for event_enable_write/read
2025-07-02 15:12 ` Steven Rostedt
@ 2025-07-02 16:12 ` Gabriele Paoloni
0 siblings, 0 replies; 13+ messages in thread
From: Gabriele Paoloni @ 2025-07-02 16:12 UTC (permalink / raw)
To: Steven Rostedt
Cc: mhiramat, mathieu.desnoyers, linux-kernel, linux-trace-kernel,
acarmina, chuck.wolber
On Wed, Jul 2, 2025 at 5:12 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Wed, 2 Jul 2025 16:59:29 +0200
> Gabriele Paoloni <gpaoloni@redhat.com> wrote:
>
> > Mmm got it. What about
> >
> > * Function's expectations:
> > * - This function shall lock the global event_mutex before performing any
> > * operation on the target event file and unlock it after all operations on
> > * the target event file have completed;
>
> Since 99% of the time that a lock is taken in a function it is
> released, I think that should be the default assumption here, and only
> when a lock is taken and not release, that should be explicitly called
> out.
>
> And also we should remove "This function" we know that these
> requirements are for this function.
>
> - The global event_mutex shall be taken before performing any
> operation on the target event.
>
> Should be good enough.
>
> If the lock can be released and taken again, that too should be
> explicit in the requirements otherwise it is assumed it is taken once
> and not released until the operation is completed.
>
> > *
> > * - This function shall format the string copied to userspace according to
> > * the status flags retrieved from the target event file:
> > * - The first character shall be set to "1" if the enabled flag is
> > set AND the
> > * soft_disabled flag is not set, else it shall be set to "0";
> > * - The second character is optional and shall be set to "*" if either the
> > * soft_disabled flag or the soft_mode flag is set;
> > * - The string shall be terminated by a newline ("\n") and any remaining
> > * character shall be set to "0";
>
> - The string copied to user space shall be formatted according to the
> status flags from the target event file:
>
> - If the enable flag is set AND the soft_disable flag is not set then
> the first character shall be set to "1" ELSE it shall be set to "0"
>
> - If either the soft_disable fag or the soft_mode flag is set then the
> second character shall be set to "*" ELSE it is skipped.
>
> I think the above is easier to read and is a bit more consolidated.
> Stating the status then the effect is also easier to read.
I will add all your suggestions in v4.
Many thanks for your review!
Gab
>
> -- Steve
>
>
> > *
> > * - This function shall invoke simple_read_from_buffer() to perform the copy
> > * of the kernel space string to ubuf.
> >
> > (pls note that the check on cnt has been removed in v3 that is out already)
>
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2025-07-02 16:12 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-12 10:43 [RFC PATCH 0/2] tracing: ftrace_enable_fops fixes Gabriele Paoloni
2025-06-12 10:43 ` [RFC PATCH 1/2] tracing: fixes of ftrace_enable_fops Gabriele Paoloni
2025-06-13 2:45 ` Masami Hiramatsu
2025-06-19 17:07 ` Gabriele Paoloni
2025-06-20 9:35 ` Masami Hiramatsu
2025-06-20 13:26 ` Gabriele Paoloni
2025-07-01 21:58 ` Steven Rostedt
2025-07-02 14:16 ` Gabriele Paoloni
2025-06-12 10:43 ` [RFC PATCH 2/2] tracing: add testable specifications for event_enable_write/read Gabriele Paoloni
2025-07-01 22:11 ` Steven Rostedt
2025-07-02 14:59 ` Gabriele Paoloni
2025-07-02 15:12 ` Steven Rostedt
2025-07-02 16:12 ` Gabriele Paoloni
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).