From: Marcelo Tosatti <mtosatti@redhat.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org,
Daniel Bristot de Oliveira <bristot@kernel.org>,
Juri Lelli <juri.lelli@redhat.com>,
Valentin Schneider <vschneid@redhat.com>,
Frederic Weisbecker <frederic@kernel.org>,
Leonardo Bras <leobras@redhat.com>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [patch 04/12] clockevent unbind: use smp_call_func_single_fail
Date: Wed, 14 Feb 2024 15:58:49 -0300
Message-ID: <Zc0Naa3pwTyndUvK@tpad>
In-Reply-To: <87plx3l6wc.ffs@tglx>

On Sun, Feb 11, 2024 at 09:52:35AM +0100, Thomas Gleixner wrote:
> On Wed, Feb 07 2024 at 09:51, Marcelo Tosatti wrote:
> > On Wed, Feb 07, 2024 at 12:55:59PM +0100, Thomas Gleixner wrote:
> >
> > OK, so the problem is the following: due to software complexity, one is
> > often not aware of all operations that might take place.
>
> The problem is that people throw random crap on their systems and avoid
> proper system engineering and then complain that their realtime
> constraints are violated. So you are proliferating bad engineering
> practices and encourage people not to care.

It's more of a practicality and cost concern: one usually does not have
the resources to fully review software before using it.

> > Now think of all possible paths, from userspace, that lead to kernel
> > code that ends up in smp_call_function_* variants (or other functions
> > that cause IPIs to isolated CPUs).
>
> So you need to analyze every possible code path and interface and add
> your magic functions there after figuring out whether that's valid or
> not.

"A magic function", yes.

> > The alternative, from blocking this in the kernel, would be to validate all
> > userspace software involved in your application, to ensure it won't end
> > up in the kernel sending IPIs. Which is impractical, isnt it ?
>
> It's absolutely not impractical. It's part of proper system
> engineering. The wet dream that you can run random docker containers and
> everything works magically is just a wet dream.

Unfortunately, that is what people do.

I understand that a "full software review" would be the ideal, but in most
situations it does not seem to happen.

> > (or rather, with such option in the kernel, it would be possible to run
> > applications which have not been validated, since the kernel would fail
> > the operation that results in IPI to isolated CPU).
>
> That's a fallacy because you _cannot_ define with a single CPU mask
> which interface is valid in a particular configuration to end up with an
> IPI and which one is not. There are legitimate reasons in realtime or
> latency constraint systems to invoke selective functionality which
> interferes with the overall system constraints.
>
> How do you cover that with your magic CPU mask? You can't.
>
> Aside of that there is a decent chance that you are subtly breaking user
> space that way. Just look at that hwmon/coretemp commit you pointed to:
>
> "Temperature information from the housekeeping cores should be
> sufficient to infer die temperature."
>
> That's just wishful thinking for various reasons:
>
> - The die temperature on larger packages is not evenly distributed and
> you can run into situations where the housekeeping cores are sitting
> "far" enough away from the worker core which creates the heat spot

I know.

> - Some monitoring applications just stop to work when they can't read
> the full data set, which means that they break subtly and you can
> infer exactly nothing.
>
> > So the idea would be an additional "isolation mode", which when enabled,
> > would disallow the IPIs. Its still possible for root user to disable
> > this mode, and retry the operation.
> >
> > So lets say i want to read MSRs on a given CPU, as root.
> >
> > You'd have to:
> >
> > 1) readmsr on given CPU (returns -EPERM or whatever), since the
> > "block interference" mode is enabled for that CPU.
> >
> > 2) Disable that CPU in the block interference cpumask.
> >
> > 3) readmsr on the given CPU (success).
> >
> > 4) Re-enable CPU in block interference cpumask, if desired.
>
> That's just wrong. Why?
>
> Once you enable it just to read the MSR you enable the operation for
> _ALL_ other non-validated crap too. So while the single MSR read might
> be OK under certain circumstances the fact that you open up a window for
> all other interfaces to do far more interfering operations is a red
> flag.
>
> This whole thing is a really badly defined policy mechanism of very
> dubious value.
>
> Thanks,

OK, fair enough. From your comments, it seems that per-callsite
toggling would be ideal. For example:

A /sys/kernel/interference_blocking/ directory containing one
sub-directory per call site.

Inside each sub-directory, an "enabled" file controlling a boolean
that enables or disables interference blocking for that particular
call site.

Also a "cpumask" file in each sub-directory, by default containing the
same cpumask as the nohz_full CPUs, to control which CPUs interference
is blocked for.

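As a rough userspace model of the check each call site would perform
(a sketch only: struct blockif_site and blockif_check() are hypothetical
names, not part of the patch set, and the mask is limited to 64 CPUs
for brevity):

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-callsite state, mirroring the two proposed sysfs files. */
struct blockif_site {
	const char *name;	/* sub-directory name (illustrative) */
	bool enabled;		/* models the "enabled" file */
	uint64_t cpumask;	/* models the "cpumask" file: bit N == CPU N */
};

/*
 * Return -EPERM if an operation from this call site that would interrupt
 * @cpu has to be refused, 0 if it may proceed, -EINVAL if @cpu does not
 * fit in this simplified 64-bit mask.
 */
static int blockif_check(const struct blockif_site *site, unsigned int cpu)
{
	if (cpu >= 64)
		return -EINVAL;
	if (site->enabled && (site->cpumask & (UINT64_C(1) << cpu)))
		return -EPERM;
	return 0;
}
```

With this shape, disabling blocking for one call site leaves every other
call site's "enabled" and "cpumask" state untouched.
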
How does that sound?