Live Patching
* Re: [PATCH] killswitch: add per-function short-circuit mitigation primitive
From: Sasha Levin @ 2026-05-11 18:09 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Breno Leitao, Andrew Morton, corbet, skhan, linux-doc,
	linux-kernel, linux-kselftest, gregkh, akinobu.mita,
	live-patching

On Mon, May 11, 2026 at 07:10:46PM +0200, Michal Hocko wrote:
>On Mon 11-05-26 12:45:36, Sasha Levin wrote:
>> Could you describe an existing infrastructure I can use here?
>
>I think it would help to CC maintainers of subsystems that provide
>kernel modification functionality. They will surely have a better
>insight than me.
>
>> Let's look at
>> this recent "Copy Fail" thing as an example.
>>
>> I can obviously build my own kernel and enroll my own key, but 99.9% of our
>> users won't be doing that.
>> Livepatching, or manually building a module that just injects a kprobe is out
>> of the question as we previously agreed.
>
>Unless I am mistaken you can enroll your own key through MOK. But you
>are right that this is an additional step. But the real question is
>whether this is a major road block for users of this specific feature.

The roadblock here is that I then need to start owning the kernel package: I
need to pull updates, rebuild, reinstall, etc. I also lose the support I might
be getting from the distro vendor.

I see "users of this particular feature" as the other 99.9% of folks who don't
build their own kernel, who follow security updates from their distro vendor
and could apply the simple workaround that those vendors could now provide.

>> systemtap falls into the same bucket as building my own module.
>>
>> BPF doesn't help because bpf_override_return() requires the target to be on the
>> same within_error_injection_list() whitelist as fault injection, and the CVE
>> targets never are. Some of our fleet doesn't even have BPF enabled either, but
>> that's the smaller objection.
>>
>> I can't use fault injection because:
>>
>>  a. It's almost never built in production/distro kernels, and I suspect this
>> won't change.
>>  b. The functions I need are not whitelisted.
>>  c. Even if (a) and (b) were addressed, fault injection would still need a
>> securityfs front-end, a cmdline parser, a module-unload notifier, a taint flag,
>> and audit on engage and disengage. By the time those land in fail_function and
>> tie into/refactor the fault injection code, the net diff is bigger than this
>> proposal.
>
>I cannot comment on fault injection implementation details of course
>but I have to say that the whitelist nature is something that makes its
>use very limited. Maybe this is a good opportunity to change the
>approach.

Possibly, but IMO the bigger hurdle is the refactoring we'll need to do to
separate fail_function out of the fault injection umbrella.

One approach would be to abstract the kprobe logic out of fail_function into a
common lib that killswitch could also use, but from a brief look the benefit
will be minimal.
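
To make the whitelist limitation concrete, here is a minimal sketch of how one
could check whether a given function is on the error injection list at all. It
assumes debugfs is mounted and that the kernel exposes
/sys/kernel/debug/error_injection/list (which needs
CONFIG_FUNCTION_ERROR_INJECTION). The helper name and the
ERROR_INJECTION_LIST override are hypothetical and are not part of the patch.

```shell
# Hypothetical helper: report whether a function is on the error-injection
# whitelist. ERROR_INJECTION_LIST is an assumed override so the sketch can
# be pointed at a saved copy of the list instead of debugfs.
ERROR_INJECTION_LIST="${ERROR_INJECTION_LIST:-/sys/kernel/debug/error_injection/list}"

is_error_injectable() {
    # Each line is "<function><TAB><error type>"; compare only the first field.
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }' \
        "$ERROR_INJECTION_LIST"
}

# Usage: is_error_injectable xfrm4_udp_encap_rcv || echo "not whitelisted"
```

A non-zero exit status means the function is not listed (or the list is not
available), which is the case that rules out both fail_function and
bpf_override_return().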

>> In my case I can remove the module, but not if I run a distro that shipped with
>> CONFIG_CRYPTO_USER_API_AEAD=y (like RHEL/SUSE).
>
>If you look at copy fail[2], IIRC algif_aead, esp[46] and rxrpc are all
>modules that could be blacklisted.

On some distros, sure. On others, no:
https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-10/-/raw/main/redhat/configs/common/generic/CONFIG_CRYPTO_USER_API_AEAD

Even if it is a module, what if I can't just unload it because I have something
that actually uses it?
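
As a rough sketch of how to tell the two cases apart on a given machine, the
helper below classifies a Kconfig symbol as built-in, modular, or unset from a
kernel config file. The helper name is hypothetical, and the config file
location in the usage comment is an assumption that varies between distros.

```shell
# Hypothetical helper: classify a Kconfig symbol as "builtin", "module",
# or "unset" based on a kernel config file supplied by the caller.
kconfig_state() {
    symbol="$1"
    config_file="$2"
    if grep -q "^${symbol}=y\$" "$config_file"; then
        echo builtin      # compiled in: cannot be unloaded or blacklisted
    elif grep -q "^${symbol}=m\$" "$config_file"; then
        echo module       # loadable: blacklisting is at least possible
    else
        echo unset        # "# CONFIG_... is not set" or symbol absent
    fi
}

# Usage (path is an assumption; adjust for your distro):
# kconfig_state CONFIG_CRYPTO_USER_API_AEAD "/boot/config-$(uname -r)"
```

Only the "module" result leaves blacklisting on the table, and even then only
if nothing on the system depends on the module.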

Look at the ESP issue for example. I can mitigate it by simply doing:

   echo "engage xfrm4_udp_encap_rcv 0" > /sys/kernel/security/killswitch/control
   echo "engage xfrm6_udp_encap_rcv 0" > /sys/kernel/security/killswitch/control

which only kills ESP encapsulated in UDP. The remaining functionality will keep
working just fine.
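
A vendor-provided workaround built on the two commands above could be as small
as the following sketch. The wrapper name and the KILLSWITCH_CONTROL override
are hypothetical. Only the "engage <function> 0" syntax and the control path
come from the commands shown above, and the trailing 0 is passed through
unchanged.

```shell
# KILLSWITCH_CONTROL defaults to the control file used above; the override
# is an assumption that lets the sketch be exercised against a plain file.
KILLSWITCH_CONTROL="${KILLSWITCH_CONTROL:-/sys/kernel/security/killswitch/control}"

# Hypothetical wrapper: engage each function named on the command line,
# one write per function, and stop at the first write that fails.
killswitch_engage_all() {
    for fn in "$@"; do
        if ! echo "engage $fn 0" > "$KILLSWITCH_CONTROL"; then
            echo "failed to engage $fn" >&2
            return 1
        fi
    done
}

# Usage: killswitch_engage_all xfrm4_udp_encap_rcv xfrm6_udp_encap_rcv
```

Stopping at the first failure makes a partially applied mitigation visible to
whatever tooling runs the wrapper.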

>> I can use the "initcall_blacklist=" hack and reboot, but as things stand today,
>> I'll need to be rebooting a few times a day.
>
>With your approach of just disabling some functions in the kernel you
>might need to reboot even more. But more seriously...
>
>> Even if I'm okay with rebooting that often (and I really really would prefer
>> not to), this doesn't solve the issues of a larger fleet of servers that can't
>> just reboot that often.
>>
>> What am I missing?
>
>For one, you are missing more maintainers of code modification infrastructures.

Happy to add more, but I don't want to be too spammy. I'll add in the
livepatching ML and the fault injection maintainer (I couldn't find a list).
Please add any other folks/lists who you think might want to contribute to this
discussion. 

-- 
Thanks,
Sasha
