linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Carlos O'Donell <carlos@redhat.com>
Cc: Florian Weimer <fw@deneb.enyo.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	paulmck <paulmck@linux.ibm.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Paul Turner <pjt@google.com>,
	linux-api <linux-api@vger.kernel.org>,
	Dmitry Vyukov <dvyukov@google.com>,
	Neel Natu <neelnatu@google.com>
Subject: Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID
Date: Tue, 7 Jul 2020 14:59:21 -0400 (EDT)	[thread overview]
Message-ID: <1834900818.710.1594148361942.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <d6b28b3e-9866-ce6f-659e-2c0dba4cd527@redhat.com>

----- On Jul 7, 2020, at 2:53 PM, Carlos O'Donell carlos@redhat.com wrote:

> On 7/7/20 8:06 AM, Mathieu Desnoyers wrote:
>> ----- On Jul 7, 2020, at 7:32 AM, Florian Weimer fw@deneb.enyo.de wrote:
>> 
>>> * Mathieu Desnoyers:
>>>
>>>> Those are very good points. One possibility we have would be to let
>>>> glibc do the rseq registration without the RSEQ_FLAG_RELIABLE_CPU_ID
>>>> flag. On kernels with the bug present, the cpu_id field is still good
>>>> enough for typical uses of sched_getcpu() which does not appear to
>>>> have a very strict correctness requirement on returning the right
>>>> cpu number.
>>>>
>>>> Then libraries and applications which require a reliable cpu_id
>>>> field could check this on their own by calling rseq with the
>>>> RSEQ_FLAG_RELIABLE_CPU_ID flag. This would not make the state more
>>>> complex in __rseq_abi, and let each rseq user decide about its own
>>>> fate: whether it uses rseq or keeps using an rseq-free fallback.
>>>>
>>>> I am still tempted to allow combining RSEQ_FLAG_REGISTER |
>>>> RSEQ_FLAG_RELIABLE_CPU_ID for applications which would not be using
>>>> glibc, and want to check this flag on thread registration.
>>>
>>> Well, you could add a bug fix level field to the __rseq_abi variable.
>> 
>> Even though I initially planned to make the struct rseq_abi extensible,
>> the __rseq_abi variable ends up being fix-sized, so we need to be very
>> careful in choosing what we place in the remaining 12 bytes of padding.
>> I suspect we'd want to keep 8 bytes to express a pointer to an
>> "extended" structure.
>> 
>> I wonder if a bug fix level "version" is the right approach. We could
>> instead have a bitmask of fixes, which the application could independently
>> check. For instance, some applications may care about cpu_id field
>> reliability, and others not.
> 
> I agree with Florian.
> 
> Developers are not interested in a bitmask of fixes.
> 
> They want a version they can check and that at a given version everything
> works as expected.
> 
> In reality today this is the kernel version.
> 
> So rseq is broken from a developer perspective until they can get a new
> kernel or an agreement from their downstream vendor that revision Z of
> the kernel they are using has the fix you've just committed.
> 
> Essentially this problem solves itself at level higher than the interfaces
> we're talking about today.
> 
> Encoding this as a bitmask of fixes is an overengineered solution for a
> problem that the downstream communities already know how to solve.
> 
> I would strongly suggest a "version" or nothing.

That's a good point.

> 
>>> Then applications could check if the kernel has the appropriate level
>>> of non-buggyness.  But the same thing could be useful for many other
>>> kernel interfaces, and I haven't seen such a fix level value for them.
>>> What makes rseq so special?
>> 
>> I guess my only answer is because I care as a user of the system call, and
>> what is a system call without users ? I have real applications which work
>> today with end users deploying them on various kernels, old and new, and I
>> want to take advantage of this system call to speed them up. However, if I
>> have to choose between speed and correctness (in other words, not crashing
>> a critical application), I will choose correctness. So if I cannot detect
>> that I can safely use the system call, it becomes pretty much useless even
>> for my own use-cases.
> 
> Yes.
> 
> In the case of RHEL we have *tons*  of users in the same predicament.
> 
> They just wait until the RHEL kernel team releases a fixed kernel version
> and check for that version and deploy with that version.
> 
> Likewise every other user of a kernel. They solve it by asking their
> kernel provider, internal or external, to verify the fix is applied to
> the deployment kernel.
> 
> If they are an ISV they have to test and deploy similar strategies for
> multiple kernel version.
> 
> By trying to automate this you are encoding into the API some level of
> package management concepts e.g. patch level, revision level, etc.
> 
> This is difficult to do without a more generalized API. Why do it just
> for rseq? Why do it with the few bits you have?
> 
> It's not a great fit IMO. Just let the kernel version be the arbiter of
> correctness.
> 
>>> It won't help with the present bug, but maybe we should add an rseq
>>> API sub-call that blocks future rseq registration for the thread.
>>> Then we can add a glibc tunable that flips off rseq reliably if people
>>> do not want to use it for some reason (and switch to the non-rseq
>>> fallback code instead).  But that's going to help with future bugs
>>> only.
>> 
>> I don't think it's needed. All I really need is to have _some_ way to
>> let lttng-ust or liburcu query whether the cpu_id field is reliable. This
>> state does not have to be made quickly accessible to other libraries,
>> nor does it have to be shared between libraries. It would allow each
>> user library or application to make its own mind on whether it can use
>> rseq or not.
> 
> That check is "kernel version > x.y.z" :-)
> 
> or
> 
> "My kernel vendor told me it's fixed."

Allright, thanks for the insight! I'll drop these patches and focus only
on the bugfix.

Thanks,

Mathieu

> 
> --
> Cheers,
> Carlos.

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

  reply	other threads:[~2020-07-07 18:59 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-06 20:49 [RFC PATCH for 5.8 0/4] rseq cpu_id ABI fix Mathieu Desnoyers
2020-07-06 20:49 ` [RFC PATCH for 5.8 1/4] sched: Fix unreliable rseq cpu_id for new tasks Mathieu Desnoyers
2020-07-07  7:30   ` Florian Weimer
2020-07-07 10:51     ` Mathieu Desnoyers
2020-07-06 20:49 ` [RFC PATCH for 5.8 2/4] rseq: Introduce RSEQ_FLAG_REGISTER Mathieu Desnoyers
2020-07-06 20:49 ` [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID Mathieu Desnoyers
2020-07-07  7:29   ` Florian Weimer
2020-07-07 10:48     ` Mathieu Desnoyers
2020-07-07 11:32       ` Florian Weimer
2020-07-07 12:06         ` Mathieu Desnoyers
2020-07-07 18:53           ` Carlos O'Donell
2020-07-07 18:59             ` Mathieu Desnoyers [this message]
2020-07-08  8:31               ` Florian Weimer
2020-07-07 19:55             ` Florian Weimer
2020-07-08 15:33               ` Mathieu Desnoyers
2020-07-08 16:22                 ` Christian Brauner
2020-07-08 16:36                   ` Florian Weimer
2020-07-08 17:34                   ` Mathieu Desnoyers
2020-07-09 12:49                     ` Christian Brauner
2020-07-09 15:15                       ` Mathieu Desnoyers
2020-07-11 15:54                         ` Christian Brauner
2020-07-13 18:40                           ` Mathieu Desnoyers
2020-07-06 20:49 ` [RFC PATCH for 5.8 4/4] rseq: selftests: Expect reliable cpu_id field Mathieu Desnoyers
2020-07-07  6:26 ` [RFC PATCH for 5.8 0/4] rseq cpu_id ABI fix Florian Weimer
2020-07-07 14:54 ` Florian Weimer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1834900818.710.1594148361942.JavaMail.zimbra@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=boqun.feng@gmail.com \
    --cc=carlos@redhat.com \
    --cc=dvyukov@google.com \
    --cc=fw@deneb.enyo.de \
    --cc=hpa@zytor.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neelnatu@google.com \
    --cc=paulmck@linux.ibm.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).