public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Florian Weimer <fweimer@redhat.com>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	paulmck <paulmck@kernel.org>, Boqun Feng <boqun.feng@gmail.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Paul Turner <pjt@google.com>,
	linux-api <linux-api@vger.kernel.org>,
	Christian Brauner <christian.brauner@ubuntu.com>,
	carlos <carlos@redhat.com>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>
Subject: Re: [RFC PATCH 1/2] rseq: Implement KTLS prototype for x86-64
Date: Thu, 29 Oct 2020 16:35:49 +0100	[thread overview]
Message-ID: <87a6w5rqne.fsf@oldenburg2.str.redhat.com> (raw)
In-Reply-To: <1247061646.32339.1603219677094.JavaMail.zimbra@efficios.com> (Mathieu Desnoyers's message of "Tue, 20 Oct 2020 14:47:57 -0400 (EDT)")

* Mathieu Desnoyers:

> ----- On Sep 29, 2020, at 4:13 AM, Florian Weimer fweimer@redhat.com wrote:
>
>> * Mathieu Desnoyers:
>> 
>>>> So we have a bootstrap issue here that needs to be solved, I think.
>>>
>>> The one thing I'm not sure about is whether the vDSO interface is indeed
>>> superior to KTLS, or if it is just the model we are used to.
>>>
>>> AFAIU, the current use-cases for vDSO is that an application calls into
>>> glibc, which then calls the vDSO function exposed by the kernel. I wonder
>>> whether the vDSO indirection is really needed if we typically have a glibc
>>> function used as indirection ? For an end user, what is the benefit of vDSO
>>> over accessing KTLS data directly from glibc ?
>> 
>> I think the kernel can only reasonably maintain a single userspace data
>> structure.  It's not reasonable to update several versions of the data
>> structure in parallel.
>
> I disagree with your statement. Considering that the kernel needs to
> keep ABI compatibility for whatever it exposes to user-space, claiming
> that it should never update several versions of data structures
> exposed to user-space in parallel means that once a data structure is
> exposed to user-space as ABI in a certain way, it can never ever
> change in the future, even if we find a better way to do things.

I think it's possible to put data into userspace without making it ABI.
Think about the init_module system call.  The module blob comes from
userspace, but its (deeper) internal structure does not have a stable
ABI.  Similar for many BPF use cases.

If the internal KTLS blob structure turns into ABI, including the parts
that need to be updated on context switch, each versioning change has a
performance impact.

>> This means that glibc would have to support multiple kernel data
>> structures, and users might lose userspace acceleration after a kernel
>> update, until they update glibc as well.  The glibc update should be
>> ABI-compatible, but someone would still have to backport it, apply it to
>> container images, etc.
>
> No. If the kernel ever exposes a data structure to user-space as ABI,
> then it needs to stay there, and not break userspace. Hence the need to
> duplicate information provided to user-space if need be, so we can move
> on to better ABIs without breaking the old ones.

It can expose the data as an opaque blob.

> Or as Andy mentioned, we would simply pass the ktls offset as argument to
> the vDSO ? It seems simple enough. Would it fit all our use-cases including
> errno ?

That would work, yes.  It's neat, but it won't give you a way to provide
traditional syscall wrappers directly from the vDSO.

>> We'll see what will break once we have the correct TID after vfork. 8->
>> glibc currently supports malloc-after-vfork as an extension, and
>> a lot of software depends on it (OpenJDK, for example).
>
> I am not sure to see how that is related to ktls ?

The mutex implementation could switch to the KTLS TID because it always
correct.  But then locking in a vfork'ed subprocess would no longer look
like locking from the parent thread because the TID would be different.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


  reply	other threads:[~2020-10-29 15:37 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-25 18:15 [RFC PATCH 1/2] rseq: Implement KTLS prototype for x86-64 Mathieu Desnoyers
2020-09-25 18:15 ` [RFC PATCH 2/2] selftests/rseq: Adapt x86-64 rseq selftest to rseq KTLS prototype Mathieu Desnoyers
2020-09-28 15:13 ` [RFC PATCH 1/2] rseq: Implement KTLS prototype for x86-64 Florian Weimer
2020-09-28 17:29   ` Mathieu Desnoyers
2020-09-29  8:13     ` Florian Weimer
2020-10-20 18:47       ` Mathieu Desnoyers
2020-10-29 15:35         ` Florian Weimer [this message]
2020-09-29 18:01   ` Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a6w5rqne.fsf@oldenburg2.str.redhat.com \
    --to=fweimer@redhat.com \
    --cc=boqun.feng@gmail.com \
    --cc=carlos@redhat.com \
    --cc=christian.brauner@ubuntu.com \
    --cc=hpa@zytor.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=tglx@linutronix.de \
    --cc=vincenzo.frascino@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox