public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Beau Belgrave <beaub@linux.microsoft.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: mhiramat@kernel.org, linux-trace-devel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6 02/13] user_events: Add minimal support for trace_event into ftrace
Date: Thu, 9 Dec 2021 09:40:50 -0800	[thread overview]
Message-ID: <20211209174050.GA21553@kbox> (raw)
In-Reply-To: <20211208210336.40c7741b@yoga.local.home>

On Wed, Dec 08, 2021 at 09:03:36PM -0500, Steven Rostedt wrote:
> On Wed, 8 Dec 2021 16:58:23 -0800
> Beau Belgrave <beaub@linux.microsoft.com> wrote:
> > >   
> > > > +/*
> > > > + * Handles the final close of the file from user mode.
> > > > + */
> > > > +static int user_events_release(struct inode *node, struct file
> > > > *file) +{
> > > > +	struct user_event_refs *refs;
> > > > +	struct user_event *user;
> > > > +	int i;
> > > > +
> > > > +	/*
> > > > +	 * refs is protected by RCU and could in theory change
> > > > immediately
> > > > +	 * before this call on another core. To ensure we read
> > > > the latest
> > > > +	 * version of refs we acquire the RCU read lock again.
> > > > +	 */
> > > > +	rcu_read_lock_sched();
> > > > +	refs = rcu_dereference_sched(file->private_data);
> > > > +	rcu_read_unlock_sched();  
> > > 
> > > This still bothers me. Can another CPU call an ioctl here?  
> > 
> > Sorry :)
> > 
> > No, another CPU cannot call the ioctl on the file, since if another
> > CPU had a reference to this file release couldn't be called.
> 
> OK, so it should be good, as the last fdput() will call this (and the
> ioctl should keep that from happening until its done). But this could
> also be done with less confusion and less paranoia if we simply take
> the reg_mutex, as that should keep everything from changing, and we
> wouldn't need to do any rcu_read_lock*() from the release function.
> 

Sure, could do that. This shouldn't be a fast path scenario, however
I dislike taking global locks when not required. Happy to change this
though.

I'm guessing for less confusion and paranoia you'd want the lock held
for the entire method call?

> > 
> > user_events_release is only called when the final reference to the
> > file has been closed, so there cannot be another ioctl pending,
> > starting or finishing for this file at the time it is called.
> > 
> > The last user mode program to call close() on the file will end up
> > invoking user_events_release.
> 
> It doesn't work like that. There's only one close(). But you are
> correct that it is protected, and that's by the fdget() and fdput()
> that is done within the ioctl (and other) system call.
> 

Yeah, I simplified to user space. IE: fork() then close() in each
process / task. We both understand each other now though ;)

> > 
> > The user_event_refs is only accessible via the file's private_data,
> > which now has zero references when release is executing. This means
> > the private_data can no longer change and the rcu deref ensures we
> > have the latest version.
> > 
> > refs is per-file, so while there can be other ioctl's occurring for
> > other files, they are completely different ref objects than the one
> > being cleaned up in the release of the file, it's not shared outside
> > of this file lifetime, which has now ended.
> 
> Right, but I'm still paranoid ;-)
> 
> > 
> > > 
> > >   user_events_ioctl_reg() {
> > >     user_events_ref_add() {
> > >       refs = rcu_dereference_protected(file->private_data, ..);
> > >       new_refs = kzalloc(size, GFP_KERNEL);
> > >       rcu_assign_pointer(file->private_data, new_refs);
> > >       if (refs)
> > >         kfree_rcu(refs, rcu);
> > > 
> > > refs now freed.
> > >   
> > 
> > If user_events_ioctl is executing for that same file,
> > user_events_release could not have been called due to the file being
> > in use to issue the ioctl.
> 
> 
> The only thing protecting against this is the fdget/put logic in the
> system calls.
> 

Yes, however, it is protected.

> > 
> > > > +
> > > > +	if (!refs)
> > > > +		goto out;
> > > > +
> > > > +	/*
> > > > +	 * Do not need RCU while enumerating the events that
> > > > were used.
> > > > +	 * The lifetime of refs has reached an end, it's tied to
> > > > this file.
> > > > +	 * The underlying user_events are ref counted, and
> > > > cannot be freed.
> > > > +	 * After this decrement, the user_events may be freed
> > > > elsewhere.
> > > > +	 */
> > > > +	for (i = 0; i < refs->count; ++i) {  
> > > 
> > > Fault on refs->count
> > > 
> > > ??  
> > 
> > refs after rcu_dereference is checked for null before accessing.
> > 
> > refs cannot be changed when release is being executed, since that
> > would mean the ioctl ran without a file reference (not sure how that
> > could happen).
> > 
> > This is why it's important that release get the latest version of
> > refs, an ioctl could have JUST happened before the final close() in
> > user mode, and if it jumped CPUs we could (in theory) get an old
> > value. If we got an old value, then yes, the fault could occur.
> > 
> > This code uses the file ops release method as a final sync point to
> > clean up everything for that file only after there are no more
> > references to it at all, so nobody can do this kind of thing.
> > 
> > Is there some case I am missing where an ioctl on a file can be
> > performed without a reference to that file?
> 
> 
> Well, the ioctl can be called as the close happens, but it's the
> internal working of fdget/put that protects it. If the ioctl is called
> at the same time as the close, the fdget in the ioctl will keep the
> close from calling the release. And as soon as the ioctl is finished,
> it will call the fdput() which then calls the release logic.
> 
> > 
> > Are you worried about a malicious user calling close on the file and
> > then immediately issuing an ioctl on the now closed file?
> > 
> > If so, wouldn't ioctl just reject that file reference being used as
> > not in the processes file table / invalid and not let the ioctl go
> > through?
> > 
> 
> I think it seems less confusing and saner to just use the mutex. It's
> not a fast path is it?
> 
> -- Steve

No, this is not a fast path, and I don't have a problem moving to a
mutex if you feel that is better. I've likely become too close to this
code to know when things are confusing for others.

Thanks,
-Beau

  reply	other threads:[~2021-12-09 17:40 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-01 18:25 [PATCH v6 00/13] user_events: Enable user processes to create and write to trace events Beau Belgrave
2021-12-01 18:25 ` [PATCH v6 01/13] user_events: Add UABI header for user access to user_events Beau Belgrave
2021-12-01 18:25 ` [PATCH v6 02/13] user_events: Add minimal support for trace_event into ftrace Beau Belgrave
2021-12-08 23:19   ` Steven Rostedt
2021-12-09  0:58     ` Beau Belgrave
2021-12-09  2:03       ` Steven Rostedt
2021-12-09 17:40         ` Beau Belgrave [this message]
2021-12-09 17:47           ` Steven Rostedt
2021-12-09 19:42             ` Beau Belgrave
2021-12-09 19:57               ` Steven Rostedt
2021-12-09 20:11                 ` Beau Belgrave
2021-12-09 20:19                   ` Steven Rostedt
2021-12-01 18:25 ` [PATCH v6 03/13] user_events: Add print_fmt generation support for basic types Beau Belgrave
2021-12-01 18:25 ` [PATCH v6 04/13] user_events: Handle matching arguments from dyn_events Beau Belgrave
2021-12-01 18:25 ` [PATCH v6 05/13] user_events: Add basic perf and eBPF support Beau Belgrave
2021-12-01 18:25 ` [PATCH v6 06/13] user_events: Add self-test for ftrace integration Beau Belgrave
2021-12-01 18:25 ` [PATCH v6 07/13] user_events: Add self-test for dynamic_events integration Beau Belgrave
2021-12-01 18:25 ` [PATCH v6 08/13] user_events: Add self-test for perf_event integration Beau Belgrave
2021-12-01 18:25 ` [PATCH v6 09/13] user_events: Optimize writing events by only copying data once Beau Belgrave
2021-12-01 18:25 ` [PATCH v6 10/13] user_events: Add documentation file Beau Belgrave
2021-12-01 18:25 ` [PATCH v6 11/13] user_events: Add sample code for typical usage Beau Belgrave
2021-12-01 18:25 ` [PATCH v6 12/13] user_events: Validate user payloads for size and null termination Beau Belgrave
2021-12-01 18:25 ` [PATCH v6 13/13] user_events: Use __get_rel_str for relative string fields Beau Belgrave

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211209174050.GA21553@kbox \
    --to=beaub@linux.microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-devel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox