From: Beau Belgrave <beaub@linux.microsoft.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: mhiramat@kernel.org, mathieu.desnoyers@efficios.com,
dcook@linux.microsoft.com, alanau@linux.microsoft.com,
brauner@kernel.org, akpm@linux-foundation.org,
ebiederm@xmission.com, keescook@chromium.org, tglx@linutronix.de,
linux-trace-devel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH v8 04/11] tracing/user_events: Fixup enable faults asyncly
Date: Tue, 28 Mar 2023 14:42:54 -0700 [thread overview]
Message-ID: <20230328214254.GA85@W11-BEAU-MD.localdomain> (raw)
In-Reply-To: <20230328172049.10061257@gandalf.local.home>
On Tue, Mar 28, 2023 at 05:20:49PM -0400, Steven Rostedt wrote:
> On Tue, 21 Feb 2023 13:11:36 -0800
> Beau Belgrave <beaub@linux.microsoft.com> wrote:
>
> > @@ -263,7 +277,85 @@ static int user_event_mm_fault_in(struct user_event_mm *mm, unsigned long uaddr)
> > }
> >
> > static int user_event_enabler_write(struct user_event_mm *mm,
> > - struct user_event_enabler *enabler)
> > + struct user_event_enabler *enabler,
> > + bool fixup_fault);
> > +
> > +static void user_event_enabler_fault_fixup(struct work_struct *work)
> > +{
> > + struct user_event_enabler_fault *fault = container_of(
> > + work, struct user_event_enabler_fault, work);
> > + struct user_event_enabler *enabler = fault->enabler;
> > + struct user_event_mm *mm = fault->mm;
> > + unsigned long uaddr = enabler->addr;
> > + int ret;
> > +
> > + ret = user_event_mm_fault_in(mm, uaddr);
> > +
> > + if (ret && ret != -ENOENT) {
> > + struct user_event *user = enabler->event;
> > +
> > + pr_warn("user_events: Fault for mm: 0x%pK @ 0x%llx event: %s\n",
> > + mm->mm, (unsigned long long)uaddr, EVENT_NAME(user));
> > + }
> > +
> > + /* Prevent state changes from racing */
> > + mutex_lock(&event_mutex);
> > +
> > + /*
> > + * If we managed to get the page, re-issue the write. We do not
> > + * want to get into a possible infinite loop, which is why we only
> > + * attempt again directly if the page came in. If we couldn't get
> > + * the page here, then we will try again the next time the event is
> > + * enabled/disabled.
> > + */
>
> What case would we not get the page? A bad page mapping? User space doing
> something silly?
>
A user space program unmapping the page is the most common I can think
of. A silly action would be unmapping the page while forgetting to call
the unregister IOCTL. We would then possibly see this if the event was
enabled in perf/ftrace before the process exited (and the mm getting
cleaned up).
> Or something else, for which how can it go into an infinite loop? Can that
> only happen if userspace is doing something mischievous?
>
I'm not sure if changing page permissions on the user side would prevent
write permitted mapping in the kernel, but I wanted to ensure if that
type of thing did occur, we wouldn't loop forever. The code lets the mm
decide if a page is ever coming in via fixup_user_fault() with
FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE set.
My understanding is that fixup_user_fault() will retry to get the page
up until it's decided it's either capable of coming in or not. It will
do this since we pass the unlocked bool as a parameter. I used
fixup_user_fault() since it was created for the futex code to handle
this scenario better.
From what I gather, the fault in should only fail for these reasons:
#define VM_FAULT_ERROR (VM_FAULT_OOM | VM_FAULT_SIGBUS | \
VM_FAULT_SIGSEGV | VM_FAULT_HWPOISON | \
VM_FAULT_HWPOISON_LARGE | VM_FAULT_FALLBACK)
If these are hit, I don't believe we want to retry as they aren't likely
to ever get corrected.
Thanks,
-Beau
> -- Steve
>
>
> > + clear_bit(ENABLE_VAL_FAULTING_BIT, ENABLE_BITOPS(enabler));
> > +
> > + if (!ret) {
> > + mmap_read_lock(mm->mm);
> > + user_event_enabler_write(mm, enabler, true);
> > + mmap_read_unlock(mm->mm);
> > + }
> > +
> > + mutex_unlock(&event_mutex);
> > +
> > + /* In all cases we no longer need the mm or fault */
> > + user_event_mm_put(mm);
> > + kmem_cache_free(fault_cache, fault);
> > +}
> > +
> > +static bool user_event_enabler_queue_fault(struct user_event_mm *mm,
> > + struct user_event_enabler *enabler)
> > +{
> > + struct user_event_enabler_fault *fault;
> > +
> > + fault = kmem_cache_zalloc(fault_cache, GFP_NOWAIT | __GFP_NOWARN);
> > +
> > + if (!fault)
> > + return false;
> > +
> > + INIT_WORK(&fault->work, user_event_enabler_fault_fixup);
> > + fault->mm = user_event_mm_get(mm);
> > + fault->enabler = enabler;
> > +
> > + /* Don't try to queue in again while we have a pending fault */
> > + set_bit(ENABLE_VAL_FAULTING_BIT, ENABLE_BITOPS(enabler));
> > +
> > + if (!schedule_work(&fault->work)) {
> > + /* Allow another attempt later */
> > + clear_bit(ENABLE_VAL_FAULTING_BIT, ENABLE_BITOPS(enabler));
> > +
> > + user_event_mm_put(mm);
> > + kmem_cache_free(fault_cache, fault);
> > +
> > + return false;
> > + }
> > +
> > + return true;
> > +}
> > +
next prev parent reply other threads:[~2023-03-28 21:43 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-21 21:11 [PATCH v8 00/11] tracing/user_events: Remote write ABI Beau Belgrave
2023-02-21 21:11 ` [PATCH v8 01/11] tracing/user_events: Split header into uapi and kernel Beau Belgrave
2023-02-21 21:11 ` [PATCH v8 02/11] tracing/user_events: Track fork/exec/exit for mm lifetime Beau Belgrave
2023-02-21 21:11 ` [PATCH v8 03/11] tracing/user_events: Use remote writes for event enablement Beau Belgrave
2023-03-28 20:59 ` Steven Rostedt
2023-02-21 21:11 ` [PATCH v8 04/11] tracing/user_events: Fixup enable faults asyncly Beau Belgrave
2023-03-28 21:20 ` Steven Rostedt
2023-03-28 21:42 ` Beau Belgrave [this message]
2023-02-21 21:11 ` [PATCH v8 05/11] tracing/user_events: Add ioctl for disabling addresses Beau Belgrave
2023-02-21 21:11 ` [PATCH v8 06/11] tracing/user_events: Update self-tests to write ABI Beau Belgrave
2023-02-21 21:11 ` [PATCH v8 07/11] tracing/user_events: Add ABI self-test Beau Belgrave
2023-02-21 21:11 ` [PATCH v8 08/11] tracing/user_events: Use write ABI in example Beau Belgrave
2023-02-21 21:11 ` [PATCH v8 09/11] tracing/user_events: Update documentation for ABI Beau Belgrave
2023-03-24 0:06 ` Masami Hiramatsu
2023-03-24 16:47 ` Beau Belgrave
2023-02-21 21:11 ` [PATCH v8 10/11] tracing/user_events: Charge event allocs to cgroups Beau Belgrave
2023-02-21 21:11 ` [PATCH v8 11/11] tracing/user_events: Limit global user_event count Beau Belgrave
2023-03-24 0:18 ` Masami Hiramatsu
2023-03-24 8:54 ` Vlastimil Babka
2023-03-24 16:43 ` Beau Belgrave
2023-03-24 17:06 ` Steven Rostedt
2023-03-26 15:22 ` Masami Hiramatsu
2023-03-24 0:05 ` [PATCH v8 00/11] tracing/user_events: Remote write ABI Masami Hiramatsu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230328214254.GA85@W11-BEAU-MD.localdomain \
--to=beaub@linux.microsoft.com \
--cc=akpm@linux-foundation.org \
--cc=alanau@linux.microsoft.com \
--cc=brauner@kernel.org \
--cc=dcook@linux.microsoft.com \
--cc=ebiederm@xmission.com \
--cc=keescook@chromium.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-trace-devel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mhiramat@kernel.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).