All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Beau Belgrave <beaub@linux.microsoft.com>,
	rostedt@goodmis.org, mhiramat@kernel.org,
	dcook@linux.microsoft.com, alanau@linux.microsoft.com,
	brauner@kernel.org, akpm@linux-foundation.org
Cc: linux-trace-devel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 03/11] tracing/user_events: Use remote writes for event enablement
Date: Mon, 5 Dec 2022 16:28:03 -0500	[thread overview]
Message-ID: <acd9c1b1-e56f-e49c-6092-d53d51cd8d4c@efficios.com> (raw)
In-Reply-To: <20221205210017.23440-4-beaub@linux.microsoft.com>

On 2022-12-05 16:00, Beau Belgrave wrote:
[...]
>   #ifdef CONFIG_USER_EVENTS
>   struct user_event_mm {
> +	struct list_head link;
> +	struct list_head enablers;
> +	struct mm_struct *mm;
> +	struct user_event_mm *next;
> +	refcount_t refcnt;
> +	refcount_t tasks;
>   };
> -#endif
>   
> +extern void user_event_mm_dup(struct task_struct *t,
> +			      struct user_event_mm *old_mm);
> +
> +extern void user_event_mm_remove(struct task_struct *t);
> +
> +static inline void user_events_fork(struct task_struct *t,
> +				    unsigned long clone_flags)
> +{
> +	struct user_event_mm *old_mm;
> +
> +	if (!t || !current->user_event_mm)
> +		return;
> +
> +	old_mm = current->user_event_mm;
> +
> +	if (clone_flags & CLONE_VM) {
> +		t->user_event_mm = old_mm;
> +		refcount_inc(&old_mm->tasks);
> +		return;
> +	}
> +
> +	user_event_mm_dup(t, old_mm);
> +}
> +
> +static inline void user_events_execve(struct task_struct *t)
> +{
> +	if (!t || !t->user_event_mm)
> +		return;
> +
> +	user_event_mm_remove(t);
> +}
> +
> +static inline void user_events_exit(struct task_struct *t)
> +{
> +	if (!t || !t->user_event_mm)
> +		return;
> +
> +	user_event_mm_remove(t);
> +}

So this is adding user_event_mm_remove() calls on each execve and each 
process exit, correct ?

[...]


> +
> +void user_event_mm_remove(struct task_struct *t)
> +{
> +	struct user_event_mm *mm;
> +	unsigned long flags;
> +
> +	might_sleep();
> +
> +	mm = t->user_event_mm;
> +	t->user_event_mm = NULL;
> +
> +	/* Clone will increment the tasks, only remove if last clone */
> +	if (!refcount_dec_and_test(&mm->tasks))
> +		return;
> +
> +	/* Remove the mm from the list, so it can no longer be enabled */
> +	spin_lock_irqsave(&user_event_mms_lock, flags);
> +	list_del_rcu(&mm->link);
> +	spin_unlock_irqrestore(&user_event_mms_lock, flags);
> +
> +	/*
> +	 * Put for mm must be done after RCU sync to handle new refs in
> +	 * between the list_del_rcu() and now. This ensures any get refs
> +	 * during rcu_read_lock() are accounted for during list removal.
> +	 *
> +	 * CPU A			|	CPU B
> +	 * ---------------------------------------------------------------
> +	 * user_event_mm_remove()	|	rcu_read_lock();
> +	 * list_del_rcu()		|	list_for_each_entry_rcu();
> +	 * synchronize_rcu()		|	refcount_inc();
> +	 * .				|	rcu_read_unlock();
> +	 * user_event_mm_put()		|	.
> +	 */
> +	synchronize_rcu();

This means a synchronize_rcu() is added on each execve and each process 
exit ? I am really worried about the performance impact of this big 
hammer synchronization in those key points of process lifetime.

Thanks,

Mathieu

> +
> +	/*
> +	 * We need to wait for currently occurring writes to stop within
> +	 * the mm. This is required since exit_mm() snaps the current rss
> +	 * stats and clears them. On the final mmdrop(), check_mm() will
> +	 * report a bug if these increment.
> +	 *
> +	 * All writes/pins are done under mmap_read lock, take the write
> +	 * lock to ensure in-progress faults have completed. Faults that
> +	 * are pending but yet to run will check the task count and skip
> +	 * the fault since the mm is going away.
> +	 */
> +	mmap_write_lock(mm->mm);
> +	mmap_write_unlock(mm->mm);
> +
> +	/* MM is still alive, but won't be updated anymore */
> +	user_event_mm_put(mm);
> +}
> +
> +void user_event_mm_dup(struct task_struct *t, struct user_event_mm *old_mm)
>   {
> -	int i = user->index;
> +	struct user_event_mm *mm = user_event_mm_create(t);
> +	struct user_event_enabler *enabler;
> +
> +	if (!mm)
> +		return;
> +
> +	rcu_read_lock();
>   
> -	user->group->register_page_data[MAP_STATUS_BYTE(i)] |= MAP_STATUS_MASK(i);
> +	list_for_each_entry_rcu(enabler, &old_mm->enablers, link)
> +		if (!user_event_enabler_dup(enabler, mm))
> +			goto error;
> +
> +	rcu_read_unlock();
> +
> +	return;
> +error:
> +	rcu_read_unlock();
> +	user_event_mm_remove(t);
>   }
>   
-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


  reply	other threads:[~2022-12-05 21:27 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-05 21:00 [PATCH v5 00/11] tracing/user_events: Remote write ABI Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 01/11] tracing/user_events: Split header into uapi and kernel Beau Belgrave
2022-12-05 21:13   ` Mathieu Desnoyers
2022-12-05 22:30     ` Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 02/11] tracing/user_events: Track fork/exec/exit for mm lifetime Beau Belgrave
2022-12-05 21:17   ` Mathieu Desnoyers
2022-12-05 21:00 ` [PATCH v5 03/11] tracing/user_events: Use remote writes for event enablement Beau Belgrave
2022-12-05 21:28   ` Mathieu Desnoyers [this message]
2022-12-05 22:26     ` Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 04/11] tracing/user_events: Fixup enable faults asyncly Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 05/11] tracing/user_events: Add ioctl for disabling addresses Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 06/11] tracing/user_events: Update self-tests to write ABI Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 07/11] tracing/user_events: Add ABI self-test Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 08/11] tracing/user_events: Use write ABI in example Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 09/11] tracing/user_events: Update documentation for ABI Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 10/11] tracing/user_events: Charge event allocs to cgroups Beau Belgrave
2022-12-05 21:00 ` [PATCH v5 11/11] tracing/user_events: Limit global user_event count Beau Belgrave
2022-12-05 21:33   ` Mathieu Desnoyers
2022-12-05 22:28     ` Beau Belgrave

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=acd9c1b1-e56f-e49c-6092-d53d51cd8d4c@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=akpm@linux-foundation.org \
    --cc=alanau@linux.microsoft.com \
    --cc=beaub@linux.microsoft.com \
    --cc=brauner@kernel.org \
    --cc=dcook@linux.microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-devel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.