linux-trace-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Beau Belgrave <beaub@linux.microsoft.com>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: rostedt@goodmis.org, mhiramat@kernel.org,
	dcook@linux.microsoft.com, alanau@linux.microsoft.com,
	brauner@kernel.org, akpm@linux-foundation.org,
	linux-trace-devel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 03/11] tracing/user_events: Use remote writes for event enablement
Date: Mon, 5 Dec 2022 14:26:57 -0800	[thread overview]
Message-ID: <20221205222657.GA2270@kbox> (raw)
In-Reply-To: <acd9c1b1-e56f-e49c-6092-d53d51cd8d4c@efficios.com>

On Mon, Dec 05, 2022 at 04:28:03PM -0500, Mathieu Desnoyers wrote:
> On 2022-12-05 16:00, Beau Belgrave wrote:
> [...]
> >   #ifdef CONFIG_USER_EVENTS
> >   struct user_event_mm {
> > +	struct list_head link;
> > +	struct list_head enablers;
> > +	struct mm_struct *mm;
> > +	struct user_event_mm *next;
> > +	refcount_t refcnt;
> > +	refcount_t tasks;
> >   };
> > -#endif
> > +extern void user_event_mm_dup(struct task_struct *t,
> > +			      struct user_event_mm *old_mm);
> > +
> > +extern void user_event_mm_remove(struct task_struct *t);
> > +
> > +static inline void user_events_fork(struct task_struct *t,
> > +				    unsigned long clone_flags)
> > +{
> > +	struct user_event_mm *old_mm;
> > +
> > +	if (!t || !current->user_event_mm)
> > +		return;
> > +
> > +	old_mm = current->user_event_mm;
> > +
> > +	if (clone_flags & CLONE_VM) {
> > +		t->user_event_mm = old_mm;
> > +		refcount_inc(&old_mm->tasks);
> > +		return;
> > +	}
> > +
> > +	user_event_mm_dup(t, old_mm);
> > +}
> > +
> > +static inline void user_events_execve(struct task_struct *t)
> > +{
> > +	if (!t || !t->user_event_mm)
> > +		return;
> > +
> > +	user_event_mm_remove(t);
> > +}
> > +
> > +static inline void user_events_exit(struct task_struct *t)
> > +{
> > +	if (!t || !t->user_event_mm)
> > +		return;
> > +
> > +	user_event_mm_remove(t);
> > +}
> 
> So this is adding user_event_mm_remove() calls on each execve and each
> process exit, correct ?
> 

Yes, as long as the process has registered a user_event. If it has not,
nothing happens.

> [...]
> 
> 
> > +
> > +void user_event_mm_remove(struct task_struct *t)
> > +{
> > +	struct user_event_mm *mm;
> > +	unsigned long flags;
> > +
> > +	might_sleep();
> > +
> > +	mm = t->user_event_mm;
> > +	t->user_event_mm = NULL;
> > +
> > +	/* Clone will increment the tasks, only remove if last clone */
> > +	if (!refcount_dec_and_test(&mm->tasks))
> > +		return;
> > +
> > +	/* Remove the mm from the list, so it can no longer be enabled */
> > +	spin_lock_irqsave(&user_event_mms_lock, flags);
> > +	list_del_rcu(&mm->link);
> > +	spin_unlock_irqrestore(&user_event_mms_lock, flags);
> > +
> > +	/*
> > +	 * Put for mm must be done after RCU sync to handle new refs in
> > +	 * between the list_del_rcu() and now. This ensures any get refs
> > +	 * during rcu_read_lock() are accounted for during list removal.
> > +	 *
> > +	 * CPU A			|	CPU B
> > +	 * ---------------------------------------------------------------
> > +	 * user_event_mm_remove()	|	rcu_read_lock();
> > +	 * list_del_rcu()		|	list_for_each_entry_rcu();
> > +	 * synchronize_rcu()		|	refcount_inc();
> > +	 * .				|	rcu_read_unlock();
> > +	 * user_event_mm_put()		|	.
> > +	 */
> > +	synchronize_rcu();
> 
> This means a synchronize_rcu() is added on each execve and each process exit
> ? I am really worried about the performance impact of this big hammer
> synchronization in those key points of process lifetime.
> 

Agreed, I can move this into a call_rcu() at the cost of an alloc.
Perhaps that will work better? I could have these in memcaches.

Thanks,
-Beau

> Thanks,
> 
> Mathieu
> 
> > +
> > +	/*
> > +	 * We need to wait for currently occurring writes to stop within
> > +	 * the mm. This is required since exit_mm() snaps the current rss
> > +	 * stats and clears them. On the final mmdrop(), check_mm() will
> > +	 * report a bug if these increment.
> > +	 *
> > +	 * All writes/pins are done under mmap_read lock, take the write
> > +	 * lock to ensure in-progress faults have completed. Faults that
> > +	 * are pending but yet to run will check the task count and skip
> > +	 * the fault since the mm is going away.
> > +	 */
> > +	mmap_write_lock(mm->mm);
> > +	mmap_write_unlock(mm->mm);
> > +
> > +	/* MM is still alive, but won't be updated anymore */
> > +	user_event_mm_put(mm);
> > +}
> > +
> > +void user_event_mm_dup(struct task_struct *t, struct user_event_mm *old_mm)
> >   {
> > -	int i = user->index;
> > +	struct user_event_mm *mm = user_event_mm_create(t);
> > +	struct user_event_enabler *enabler;
> > +
> > +	if (!mm)
> > +		return;
> > +
> > +	rcu_read_lock();
> > -	user->group->register_page_data[MAP_STATUS_BYTE(i)] |= MAP_STATUS_MASK(i);
> > +	list_for_each_entry_rcu(enabler, &old_mm->enablers, link)
> > +		if (!user_event_enabler_dup(enabler, mm))
> > +			goto error;
> > +
> > +	rcu_read_unlock();
> > +
> > +	return;
> > +error:
> > +	rcu_read_unlock();
> > +	user_event_mm_remove(t);
> >   }
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> https://www.efficios.com

  reply	other threads:[~2022-12-05 22:27 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-05 21:00 [PATCH v5 00/11] tracing/user_events: Remote write ABI Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 01/11] tracing/user_events: Split header into uapi and kernel Beau Belgrave
2022-12-05 21:13   ` Mathieu Desnoyers
2022-12-05 22:30     ` Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 02/11] tracing/user_events: Track fork/exec/exit for mm lifetime Beau Belgrave
2022-12-05 21:17   ` Mathieu Desnoyers
2022-12-05 21:00 ` [PATCH v5 03/11] tracing/user_events: Use remote writes for event enablement Beau Belgrave
2022-12-05 21:28   ` Mathieu Desnoyers
2022-12-05 22:26     ` Beau Belgrave [this message]
2022-12-05 21:00 ` [PATCH v5 04/11] tracing/user_events: Fixup enable faults asyncly Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 05/11] tracing/user_events: Add ioctl for disabling addresses Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 06/11] tracing/user_events: Update self-tests to write ABI Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 07/11] tracing/user_events: Add ABI self-test Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 08/11] tracing/user_events: Use write ABI in example Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 09/11] tracing/user_events: Update documentation for ABI Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 10/11] tracing/user_events: Charge event allocs to cgroups Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 11/11] tracing/user_events: Limit global user_event count Beau Belgrave
2022-12-05 21:33   ` Mathieu Desnoyers
2022-12-05 22:28     ` Beau Belgrave

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221205222657.GA2270@kbox \
    --to=beaub@linux.microsoft.com \
    --cc=akpm@linux-foundation.org \
    --cc=alanau@linux.microsoft.com \
    --cc=brauner@kernel.org \
    --cc=dcook@linux.microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-devel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhiramat@kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).