From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Peter Oskolkov <posk@google.com>
Cc: Florian Weimer <fweimer@redhat.com>,
Prakash Sangappa <prakash.sangappa@oracle.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-api <linux-api@vger.kernel.org>,
Ingo Molnar <mingo@redhat.com>, Paul Turner <pjt@google.com>,
Jann Horn <jannh@google.com>, Peter Oskolkov <posk@posk.io>,
Vincenzo Frascino <vincenzo.frascino@arm.com>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [RESEND RFC PATCH 0/3] Provide fast access to thread specific data
Date: Fri, 10 Sep 2021 13:55:55 -0400 (EDT)
Message-ID: <872090791.15342.1631296555821.JavaMail.zimbra@efficios.com>
In-Reply-To: <CAPNVh5d0jd=ks6WBnsheiAE394=31X963X+ZUG6x=ZZLHZ=jbQ@mail.gmail.com>
----- On Sep 10, 2021, at 1:48 PM, Peter Oskolkov posk@google.com wrote:
> On Fri, Sep 10, 2021 at 10:33 AM Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
>>
>> ----- On Sep 10, 2021, at 12:37 PM, Florian Weimer fweimer@redhat.com wrote:
>>
>> > * Peter Oskolkov:
>> >
>> >> In short, due to the need to read/write to the userspace from
>> >> non-sleepable contexts in the kernel it seems that we need to have some
>> >> form of per task/thread kernel/userspace shared memory that is pinned,
>> >> similar to what your sys_task_getshared does.
>> >
>> > In glibc, we'd also like to have this for PID and TID. Eventually,
>> > rt_sigprocmask without kernel roundtrip in most cases would be very nice
>> > as well. For performance and simplicity in userspace, it would be best
>> > if the memory region could be at the same offset from the TCB for all
>> > threads.
>> >
>> > For KTLS, the idea was that the auxiliary vector would contain size and
>> > alignment of the KTLS. Userspace would reserve that memory, register it
>> > with the kernel like rseq (or the robust list pointers), and pass its
>> > address to the vDSO functions that need them. The last part ensures
>> > that the vDSO functions do not need non-global data to determine the
>> > offset from the TCB. Registration is still needed for the caches.
>> >
>> > I think previous discussions (in the KTLS and rseq context) did not have
>> > the pinning constraint.
>>
>> If this data is per-thread, and read from user-space, why is it relevant
>> to update this data from non-sleepable kernel context rather than update it as
>> needed on return-to-userspace ? When returning to userspace, sleeping due to a
>> page fault is entirely acceptable. This is what we currently do for rseq.
>>
>> In short, the data could be accessible from the task struct. Flags in the
>> task struct can let return-to-userspace know that it has outdated ktls
>> data. So before returning to userspace, the kernel can copy the relevant data
>> from the task struct to the shared memory area, without requiring any pinning.
>>
>> What am I missing ?
>
> I can't speak about other use cases, but in the context of userspace
> scheduling, the information that a task has blocked in the kernel and
> is going to be removed from its runqueue cannot wait to be delivered
> to the userspace until the task wakes up, as the userspace scheduler
> needs to know of the event when it happened so that it can schedule
> another task in place of the blocked one. See the discussion here:
>
> https://lore.kernel.org/lkml/CAG48ez0mgCXpXnqAUsa0TcFBPjrid-74Gj=xG8HZqj2n+OPoKw@mail.gmail.com/
OK, just to confirm my understanding: the use case here is per-thread
state that can be read by other threads (in this case, the userspace scheduler)?
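
To make the distinction concrete, here is a minimal userspace sketch
(plain C, not kernel code) of the deferred-update scheme quoted above:
a dirty flag is set from the non-sleepable path, and the copy to the
shared area only happens on return to userspace. Every name in it
(struct shared_area, struct sim_task, the sim_* helpers, the SIM_*
states) is hypothetical and chosen for illustration; none of it is taken
from the patch series or from an existing kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical states, for illustration only. */
enum { SIM_RUNNING = 0, SIM_BLOCKED = 1 };

/* Hypothetical per-thread area shared with userspace (not a real ABI). */
struct shared_area {
	uint32_t state;       /* last published SIM_* state */
	uint64_t offcpu_ns;   /* last published off-cpu time */
};

/* Hypothetical stand-in for the relevant task_struct fields. */
struct sim_task {
	uint32_t state;
	uint64_t offcpu_ns;
	bool shared_dirty;            /* shared area is out of date */
	struct shared_area *shared;   /* unpinned user memory */
};

/* Models a non-sleepable path: touch kernel-private data, set the flag. */
static void sim_task_blocks(struct sim_task *t)
{
	t->state = SIM_BLOCKED;
	t->shared_dirty = true;
}

/* Models wakeup: still only kernel-private data is updated. */
static void sim_task_wakes(struct sim_task *t, uint64_t slept_ns)
{
	t->state = SIM_RUNNING;
	t->offcpu_ns += slept_ns;
	t->shared_dirty = true;
}

/* Models return-to-userspace, where faulting on user memory is fine. */
static void sim_return_to_user(struct sim_task *t)
{
	assert(t->shared);
	if (!t->shared_dirty)
		return;
	t->shared->state = t->state;
	t->shared->offcpu_ns = t->offcpu_ns;
	t->shared_dirty = false;
}
```

In this model, calling sim_task_blocks() leaves shared->state at
SIM_RUNNING until the task next passes through sim_return_to_user(),
by which point it has already woken up. The owning thread never
observes the stale value, but another thread reading the area (such as
a userspace scheduler) would never see SIM_BLOCKED at all.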
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com