Re: [PATCH] user_events: Enable user processes to create and write to trace events

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Beau Belgrave <beaub@linux.microsoft.com>
To: Masami Hiramatsu <mhiramat@kernel.org>
Cc: rostedt@goodmis.org, linux-trace-devel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] user_events: Enable user processes to create and write to trace events
Date: Thu, 7 Oct 2021 17:05:40 -0700	[thread overview]
Message-ID: <20211008000540.GA31220@kbox> (raw)
In-Reply-To: <20211008081249.8fbacc4f5d9fa7cf2e488d21@kernel.org>

On Fri, Oct 08, 2021 at 08:12:49AM +0900, Masami Hiramatsu wrote:
> > Please see v2 patch, I do this pattern except it's '(Used by ftrace)'
> > instead of '# Used by ftrace'.
> > 
> > Format is id:name status
> 
> Hm, why I suggested to use "# status" is that the comment will be
> removed when writing it. So you can do
> 
> cat user_events > ~/saved_events
> (reboot)
> cat ~/saved_events > user_events
> 
> to restore events :)
> 
Nice, good idea.

> > 
> > > > The other thing is we need ref counting to know if the event is busy.
> > > > Having the ID in the packet avoids having a fd per-event, but it also
> > > > makes ref counting process lifetime of each event quite hard.
> > > 
> > > Hmm, I don't think so. You can use an array of the pointer to
> > > events on the private data of the struct file.
> > > When you add (or start using) an event (this is identified by ioctl),
> > > you can increment the event refcount and add it to the array.
> > > When the file is closed (in exiting process), it will loop on the
> > > array and decrement the refcount for each event.
> > > Then, after all tracers disabled the event, ftrace can remove the
> > > event in background (unless it is defined through 'dynamic_events' or
> > > 'user_events').
> > > 
> > Yes, I didn't say it's impossible :) It's quite hard and takes a lot
> > more management. I don't see a clear benefit to that approach, why is it
> > better than an fd lifetime? Not trying to be difficult, just trying to
> > be pragmatic about what approach is best.
> 
> I'm not sure this point, you mean 1 fd == 1 event model?
> 
Yeah, I like the idea of not having an fd per event. I want to make
sure the complexity is worth it. Is the overhead of an FD per event in
user space too much? What happens to the first 4 bytes (ID)? Does it not
show up in the buffer? This would be fine as long as the rel_loc idea
gets into ftrace, etc.

This would require a global array as well as a local per-FD array. I'm
wondering if the per-FD array becoming large mitigates the gain by
simply having an FD per-event.

> > > > We also want
> > > > predicate filtering to work as cheap as possible. I would really like to
> > > > keep offset manipulation entirely in the user space to avoid confusion
> > > > across the various tracing mechanisms and avoid probing the user data
> > > > upon each call (eBPF programs only selectively probe in data).
> > > 
> > > OK, so let's add __rel_loc__ attribute. The rel_loc type will be
> > > 
> > > struct rel_loc {
> > > 	uint16_t len;	/* The data size (including '\0' if string )*/
> > > 	uint16_t offs;	/* The offset of actual data from this field */
> > > } __packed;
> > > 
> > > Hmm, btw, this will be good for probe events... I don't need to pass
> > > the base address with this attribute.
> > > 
> > What's the difference between __rel_loc__ and __data_loc? Seems like
> > instead of just offset it's length + offset?
> 
> In my idea, rel_loc is similar to the data_loc. It has the offset, but
> the offset is the data offset from the rel_loc, not from the entry of
> the recorded data. So kernel doesn't need to adjust it.
> 
Got it, makes sense and would eliminate the need for the IOCTL for
offsets. I like it.

Thanks,
-Beau
> Thank you,
> 
> -- 
> Masami Hiramatsu <mhiramat@kernel.org>

next prev parent reply	other threads:[~2021-10-08  0:05 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-05 22:44 [PATCH] user_events: Enable user processes to create and write to trace events Beau Belgrave
2021-10-06 16:28 ` Masami Hiramatsu
2021-10-06 17:56   ` Beau Belgrave
2021-10-07 14:17     ` Masami Hiramatsu
2021-10-07 16:22       ` Beau Belgrave
2021-10-07 23:12         ` Masami Hiramatsu
2021-10-08  0:05           ` Beau Belgrave [this message]
2021-10-08  9:22             ` Masami Hiramatsu
2021-10-11 16:25               ` Beau Belgrave
2021-10-13  1:18                 ` Steven Rostedt
2021-10-13 16:50                   ` Beau Belgrave
2021-10-13 17:11                     ` Steven Rostedt
2021-10-13 17:17                       ` Beau Belgrave
2021-10-13 17:27                         ` Steven Rostedt
2021-10-13 15:21                 ` Masami Hiramatsu
2021-10-13 15:40                   ` Steven Rostedt
2021-10-14 12:21                     ` Masami Hiramatsu
2021-10-13 16:56                   ` Beau Belgrave
2021-10-06 16:54 ` Steven Rostedt
2021-10-06 17:27   ` Beau Belgrave
2021-10-06 17:44     ` Steven Rostedt
2021-10-08 13:11 ` Peter.Enderborg
2021-10-08 16:09   ` Beau Belgrave

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211008000540.GA31220@kbox \
    --to=beaub@linux.microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-devel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.