* [PATCH 1/3] tracing/user_events: Allow events to persist for perfmon_capable users
2023-09-01 20:43 [PATCH 0/3] tracing/user_events: Allow events to persist for perfmon_capable users Beau Belgrave
@ 2023-09-01 20:43 ` Beau Belgrave
2023-09-01 20:43 ` [PATCH 2/3] selftests/user_events: Test persist flag cases Beau Belgrave
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Beau Belgrave @ 2023-09-01 20:43 UTC (permalink / raw)
To: rostedt, mhiramat; +Cc: linux-kernel, linux-trace-kernel, ast, dcook
There are several scenarios that have come up where having a user_event
persist even if the process that registered it exits. The main one is
having a daemon create events on bootup that shouldn't get deleted if
the daemon has to exit or reload. Another is within OpenTelemetry
exporters, they wish to potentially check if a user_event exists on the
system to determine if exporting the data out should occur. The
user_event in this case must exist even in the absence of the owning
process running (such as the above daemon case).
Expose the previously internal flag USER_EVENT_REG_PERSIST to user
processes. Upon register or delete of events with this flag, ensure the
user is perfmon_capable to prevent random user processes with access to
tracefs from creating events that persist after exit.
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
---
include/uapi/linux/user_events.h | 11 ++++++++++-
kernel/trace/trace_events_user.c | 28 ++++++++++++++--------------
2 files changed, 24 insertions(+), 15 deletions(-)
diff --git a/include/uapi/linux/user_events.h b/include/uapi/linux/user_events.h
index 2984aae4a2b4..f74f3aedd49c 100644
--- a/include/uapi/linux/user_events.h
+++ b/include/uapi/linux/user_events.h
@@ -17,6 +17,15 @@
/* Create dynamic location entry within a 32-bit value */
#define DYN_LOC(offset, size) ((size) << 16 | (offset))
+/* List of supported registration flags */
+enum user_reg_flag {
+ /* Event will not delete upon last reference closing */
+ USER_EVENT_REG_PERSIST = 1U << 0,
+
+ /* This value or above is currently non-ABI */
+ USER_EVENT_REG_MAX = 1U << 1,
+};
+
/*
* Describes an event registration and stores the results of the registration.
* This structure is passed to the DIAG_IOCSREG ioctl, callers at a minimum
@@ -33,7 +42,7 @@ struct user_reg {
/* Input: Enable size in bytes at address */
__u8 enable_size;
- /* Input: Flags for future use, set to 0 */
+ /* Input: Flags to use, if any */
__u16 flags;
/* Input: Address to update when enabled */
diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
index 6f046650e527..af663de93492 100644
--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -49,18 +49,6 @@
#define EVENT_STATUS_PERF BIT(1)
#define EVENT_STATUS_OTHER BIT(7)
-/*
- * User register flags are not allowed yet, keep them here until we are
- * ready to expose them out to the user ABI.
- */
-enum user_reg_flag {
- /* Event will not delete upon last reference closing */
- USER_EVENT_REG_PERSIST = 1U << 0,
-
- /* This value or above is currently non-ABI */
- USER_EVENT_REG_MAX = 1U << 1,
-};
-
/*
* Stores the system name, tables, and locks for a group of events. This
* allows isolation for events by various means.
@@ -1888,10 +1876,16 @@ static int user_event_parse(struct user_event_group *group, char *name,
int argc = 0;
char **argv;
- /* User register flags are not ready yet */
- if (reg_flags != 0 || flags != NULL)
+ /* Currently don't support any text based flags */
+ if (flags != NULL)
return -EINVAL;
+ /* Persistent events require CAP_PERFMON / CAP_SYS_ADMIN */
+ if (reg_flags & USER_EVENT_REG_PERSIST) {
+ if (!perfmon_capable())
+ return -EPERM;
+ }
+
/* Prevent dyn_event from racing */
mutex_lock(&event_mutex);
user = find_user_event(group, name, &key);
@@ -2024,6 +2018,12 @@ static int delete_user_event(struct user_event_group *group, char *name)
if (!user_event_last_ref(user))
return -EBUSY;
+ /* Persistent events require CAP_PERFMON / CAP_SYS_ADMIN */
+ if (user->reg_flags & USER_EVENT_REG_PERSIST) {
+ if (!perfmon_capable())
+ return -EPERM;
+ }
+
return destroy_user_event(user);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 5+ messages in thread* [PATCH 2/3] selftests/user_events: Test persist flag cases
2023-09-01 20:43 [PATCH 0/3] tracing/user_events: Allow events to persist for perfmon_capable users Beau Belgrave
2023-09-01 20:43 ` [PATCH 1/3] " Beau Belgrave
@ 2023-09-01 20:43 ` Beau Belgrave
2023-09-01 20:43 ` [PATCH 3/3] tracing/user_events: Document persist event flags Beau Belgrave
2023-09-01 20:57 ` [PATCH 0/3] tracing/user_events: Allow events to persist for perfmon_capable users Steven Rostedt
3 siblings, 0 replies; 5+ messages in thread
From: Beau Belgrave @ 2023-09-01 20:43 UTC (permalink / raw)
To: rostedt, mhiramat; +Cc: linux-kernel, linux-trace-kernel, ast, dcook
Now that we have exposed USER_EVENT_REG_PERSIST events can persist both
via the ABI and in the /sys/kernel/tracing/dynamic_events file.
Ensure both the ABI and DYN cases work by calling both during the parse
tests. Add new flags test that ensures only USER_EVENT_REG_PERSIST is
honored and any other flag is invalid.
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
---
.../testing/selftests/user_events/abi_test.c | 55 ++++++++++++++++++-
.../testing/selftests/user_events/dyn_test.c | 54 +++++++++++++++++-
2 files changed, 107 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/user_events/abi_test.c b/tools/testing/selftests/user_events/abi_test.c
index 5125c42efe65..b95fc15496ba 100644
--- a/tools/testing/selftests/user_events/abi_test.c
+++ b/tools/testing/selftests/user_events/abi_test.c
@@ -23,6 +23,18 @@
const char *data_file = "/sys/kernel/tracing/user_events_data";
const char *enable_file = "/sys/kernel/tracing/events/user_events/__abi_event/enable";
+static bool event_exists(void)
+{
+ int fd = open(enable_file, O_RDWR);
+
+ if (fd < 0)
+ return false;
+
+ close(fd);
+
+ return true;
+}
+
static int change_event(bool enable)
{
int fd = open(enable_file, O_RDWR);
@@ -46,7 +58,22 @@ static int change_event(bool enable)
return ret;
}
-static int reg_enable(long *enable, int size, int bit)
+static int event_delete(void)
+{
+ int fd = open(data_file, O_RDWR);
+ int ret;
+
+ if (fd < 0)
+ return -1;
+
+ ret = ioctl(fd, DIAG_IOCSDEL, "__abi_event");
+
+ close(fd);
+
+ return ret;
+}
+
+static int reg_enable_flags(long *enable, int size, int bit, int flags)
{
struct user_reg reg = {0};
int fd = open(data_file, O_RDWR);
@@ -57,6 +84,7 @@ static int reg_enable(long *enable, int size, int bit)
reg.size = sizeof(reg);
reg.name_args = (__u64)"__abi_event";
+ reg.flags = flags;
reg.enable_bit = bit;
reg.enable_addr = (__u64)enable;
reg.enable_size = size;
@@ -68,6 +96,11 @@ static int reg_enable(long *enable, int size, int bit)
return ret;
}
+static int reg_enable(long *enable, int size, int bit)
+{
+ return reg_enable_flags(enable, size, bit, 0);
+}
+
static int reg_disable(long *enable, int bit)
{
struct user_unreg reg = {0};
@@ -121,6 +154,26 @@ TEST_F(user, enablement) {
ASSERT_EQ(0, change_event(false));
}
+TEST_F(user, flags) {
+ /* USER_EVENT_REG_PERSIST is allowed */
+ ASSERT_EQ(0, reg_enable_flags(&self->check, sizeof(int), 0,
+ USER_EVENT_REG_PERSIST));
+ ASSERT_EQ(0, reg_disable(&self->check, 0));
+
+ /* Ensure it exists after close and disable */
+ ASSERT_TRUE(event_exists());
+
+ /* Ensure we can delete it */
+ ASSERT_EQ(0, event_delete());
+
+ /* USER_EVENT_REG_MAX or above is not allowed */
+ ASSERT_EQ(-1, reg_enable_flags(&self->check, sizeof(int), 0,
+ USER_EVENT_REG_MAX));
+
+ /* Ensure it does not exist after invalid flags */
+ ASSERT_FALSE(event_exists());
+}
+
TEST_F(user, bit_sizes) {
/* Allow 0-31 bits for 32-bit */
ASSERT_EQ(0, reg_enable(&self->check, sizeof(int), 0));
diff --git a/tools/testing/selftests/user_events/dyn_test.c b/tools/testing/selftests/user_events/dyn_test.c
index 91a4444ad42b..f2a41bcb5ad8 100644
--- a/tools/testing/selftests/user_events/dyn_test.c
+++ b/tools/testing/selftests/user_events/dyn_test.c
@@ -16,9 +16,25 @@
#include "../kselftest_harness.h"
+const char *dyn_file = "/sys/kernel/tracing/dynamic_events";
const char *abi_file = "/sys/kernel/tracing/user_events_data";
const char *enable_file = "/sys/kernel/tracing/events/user_events/__test_event/enable";
+static int event_delete(void)
+{
+ int fd = open(abi_file, O_RDWR);
+ int ret;
+
+ if (fd < 0)
+ return -1;
+
+ ret = ioctl(fd, DIAG_IOCSDEL, "__test_event");
+
+ close(fd);
+
+ return ret;
+}
+
static bool wait_for_delete(void)
{
int i;
@@ -63,7 +79,31 @@ static int unreg_event(int fd, int *check, int bit)
return ioctl(fd, DIAG_IOCSUNREG, &unreg);
}
-static int parse(int *check, const char *value)
+static int parse_dyn(const char *value)
+{
+ int fd = open(dyn_file, O_RDWR | O_APPEND);
+ int len = strlen(value);
+ int ret;
+
+ if (fd == -1)
+ return -1;
+
+ ret = write(fd, value, len);
+
+ if (ret == len)
+ ret = 0;
+ else
+ ret = -1;
+
+ close(fd);
+
+ if (ret == 0)
+ event_delete();
+
+ return ret;
+}
+
+static int parse_abi(int *check, const char *value)
{
int fd = open(abi_file, O_RDWR);
int ret;
@@ -89,6 +129,18 @@ static int parse(int *check, const char *value)
return ret;
}
+static int parse(int *check, const char *value)
+{
+ int abi_ret = parse_abi(check, value);
+ int dyn_ret = parse_dyn(value);
+
+ /* Ensure both ABI and DYN parse the same way */
+ if (dyn_ret != abi_ret)
+ return -1;
+
+ return dyn_ret;
+}
+
static int check_match(int *check, const char *first, const char *second, bool *match)
{
int fd = open(abi_file, O_RDWR);
--
2.34.1
^ permalink raw reply related [flat|nested] 5+ messages in thread* [PATCH 3/3] tracing/user_events: Document persist event flags
2023-09-01 20:43 [PATCH 0/3] tracing/user_events: Allow events to persist for perfmon_capable users Beau Belgrave
2023-09-01 20:43 ` [PATCH 1/3] " Beau Belgrave
2023-09-01 20:43 ` [PATCH 2/3] selftests/user_events: Test persist flag cases Beau Belgrave
@ 2023-09-01 20:43 ` Beau Belgrave
2023-09-01 20:57 ` [PATCH 0/3] tracing/user_events: Allow events to persist for perfmon_capable users Steven Rostedt
3 siblings, 0 replies; 5+ messages in thread
From: Beau Belgrave @ 2023-09-01 20:43 UTC (permalink / raw)
To: rostedt, mhiramat; +Cc: linux-kernel, linux-trace-kernel, ast, dcook
Users need to know how to make events persist, now that we allow for
that. We also now allow the dynamic_events file to create events by
utilizing the persist flag during event register.
Add back in to documentation how /sys/kernel/tracing/dynamic_events can
be used to create peristent user_events. Add a section under registering
for the currently supported flags (USER_EVENT_REG_PERSIST) and the
required permissions. Add a note under deleting that deleting a
persistent event also requires sufficient permission.
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
---
Documentation/trace/user_events.rst | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/Documentation/trace/user_events.rst b/Documentation/trace/user_events.rst
index e7b07313550a..c5388a47376f 100644
--- a/Documentation/trace/user_events.rst
+++ b/Documentation/trace/user_events.rst
@@ -14,6 +14,11 @@ Programs can view status of the events via
/sys/kernel/tracing/user_events_status and can both register and write
data out via /sys/kernel/tracing/user_events_data.
+Programs can also use /sys/kernel/tracing/dynamic_events to register and
+delete user based events via the u: prefix. The format of the command to
+dynamic_events is the same as the ioctl with the u: prefix applied. This
+requires CAP_PERFMON due to the event persisting, otherwise -EPERM is returned.
+
Typically programs will register a set of events that they wish to expose to
tools that can read trace_events (such as ftrace and perf). The registration
process tells the kernel which address and bit to reflect if any tool has
@@ -45,7 +50,7 @@ This command takes a packed struct user_reg as an argument::
/* Input: Enable size in bytes at address */
__u8 enable_size;
- /* Input: Flags for future use, set to 0 */
+ /* Input: Flags to be used, if any */
__u16 flags;
/* Input: Address to update when enabled */
@@ -69,7 +74,7 @@ The struct user_reg requires all the above inputs to be set appropriately.
This must be 4 (32-bit) or 8 (64-bit). 64-bit values are only allowed to be
used on 64-bit kernels, however, 32-bit can be used on all kernels.
-+ flags: The flags to use, if any. For the initial version this must be 0.
++ flags: The flags to use, if any.
Callers should first attempt to use flags and retry without flags to ensure
support for lower versions of the kernel. If a flag is not supported -EINVAL
is returned.
@@ -80,6 +85,13 @@ The struct user_reg requires all the above inputs to be set appropriately.
+ name_args: The name and arguments to describe the event, see command format
for details.
+The following flags are currently supported.
+
++ USER_EVENT_REG_PERSIST: The event will not delete upon the last reference
+ closing. Callers may use this if an event should exist even after the
+ process closes or unregisters the event. Requires CAP_PERFMON otherwise
+ -EPERM is returned.
+
Upon successful registration the following is set.
+ write_index: The index to use for this file descriptor that represents this
@@ -141,7 +153,10 @@ event (in both user and kernel space). User programs should use a separate file
to request deletes than the one used for registration due to this.
**NOTE:** By default events will auto-delete when there are no references left
-to the event. Flags in the future may change this logic.
+to the event. If programs do not want auto-delete, they must use the
+USER_EVENT_REG_PERSIST flag when registering the event. Once that flag is used
+the event exists until DIAG_IOCSDEL is invoked. Both register and delete of an
+event that persists requires CAP_PERFMON, otherwise -EPERM is returned.
Unregistering
-------------
--
2.34.1
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH 0/3] tracing/user_events: Allow events to persist for perfmon_capable users
2023-09-01 20:43 [PATCH 0/3] tracing/user_events: Allow events to persist for perfmon_capable users Beau Belgrave
` (2 preceding siblings ...)
2023-09-01 20:43 ` [PATCH 3/3] tracing/user_events: Document persist event flags Beau Belgrave
@ 2023-09-01 20:57 ` Steven Rostedt
3 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2023-09-01 20:57 UTC (permalink / raw)
To: Beau Belgrave; +Cc: mhiramat, linux-kernel, linux-trace-kernel, ast, dcook
Hi Beau,
Just an FYI, except for fixes, it is never a good ideal to send out patches
while the merge window is open. They will likely be ignored for the
entirety of the merge window.
-- Steve
On Fri, 1 Sep 2023 20:43:29 +0000
Beau Belgrave <beaub@linux.microsoft.com> wrote:
> There are several scenarios that have come up where having a user_event
> persist even if the process that registered it exits. The main one is
> having a daemon create events on bootup that shouldn't get deleted if
> the daemon has to exit or reload. Another is within OpenTelemetry
> exporters, they wish to potentially check if a user_event exists on the
> system to determine if exporting the data out should occur. The
> user_event in this case must exist even in the absence of the owning
> process running (such as the above daemon case).
>
> Since persistent events aren't automatically cleaned up, we want to ensure
> only trusted users are allowed to do this. It seems reasonable to use
> CAP_PERFMON as that boundary, since those users can already do many things
> via perf_event_open without requiring full CAP_SYS_ADMIN.
>
> This patchset brings back the ability to use /sys/kernel/tracing/dynamic_events
> to create user_events, as persist is now back to being supported. Both the
> register and delete of events that persist require CAP_PERFMON, which prevents
> a non-perfmon user from making an event go away that a perfmon user decided
> should persist.
>
> Beau Belgrave (3):
> tracing/user_events: Allow events to persist for perfmon_capable users
> selftests/user_events: Test persist flag cases
> tracing/user_events: Document persist event flags
>
> Documentation/trace/user_events.rst | 21 ++++++-
> include/uapi/linux/user_events.h | 11 +++-
> kernel/trace/trace_events_user.c | 28 +++++-----
> .../testing/selftests/user_events/abi_test.c | 55 ++++++++++++++++++-
> .../testing/selftests/user_events/dyn_test.c | 54 +++++++++++++++++-
> 5 files changed, 149 insertions(+), 20 deletions(-)
>
>
> base-commit: f940e482b0f889e697372a22b6c15da87aa1f63a
^ permalink raw reply [flat|nested] 5+ messages in thread