public inbox for live-patching@vger.kernel.org
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: jpoimboe@kernel.org, jikos@kernel.org, mbenes@suse.cz,
	pmladek@suse.com, joe.lawrence@redhat.com, rostedt@goodmis.org,
	mathieu.desnoyers@efficios.com, kpsingh@kernel.org,
	mattbobrowski@google.com, song@kernel.org, jolsa@kernel.org,
	ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	martin.lau@linux.dev, eddyz87@gmail.com, memxor@gmail.com,
	yonghong.song@linux.dev, live-patching@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	bpf@vger.kernel.org
Subject: Re: [RFC PATCH 0/4] trace, livepatch: Allow kprobe return overriding for livepatched functions
Date: Wed, 15 Apr 2026 09:48:28 +0900
Message-ID: <20260415094828.479ad388ac1eb5cd7ee84535@kernel.org>
In-Reply-To: <CALOAHbAOx=C4b+4xQwRf59xvY0vbMPfOjO5LMDghC4Ryksv++Q@mail.gmail.com>

On Sun, 12 Apr 2026 21:50:31 +0800
Yafang Shao <laoar.shao@gmail.com> wrote:

> On Fri, Apr 10, 2026 at 12:38 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > Hi Yafang,
> >
> > On Thu,  2 Apr 2026 17:26:03 +0800
> > Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > > Livepatching allows for rapid experimentation with new kernel features
> > > without interrupting production workloads. However, static livepatches lack
> > > the flexibility required to tune features based on task-specific attributes,
> > > such as cgroup membership, which is critical in multi-tenant k8s
> > > environments. Furthermore, hardcoding logic into a livepatch prevents
> > > dynamic adjustments based on the runtime environment.
> > >
> > > To address this, we propose a hybrid approach using BPF. Our production use
> > > case involves:
> > >
> > > 1. Deploying a Livepatch function to serve as a stable BPF hook.
> > >
> > > 2. Utilizing bpf_override_return() to dynamically modify the return value
> > >    of that hook based on the current task's context.
> >
> > First of all, I don't like this approach to test a new feature in the
> > kernel, because it sounds like allowing multiple different generations
> > of implementations to coexist simultaneously. The standard kernel code
> > is not designed to withstand such implementations.
> 
> However, this approach is invaluable for rapidly deploying new kernel
> features to production servers without downtime. Upgrading kernels
> across a large fleet remains a significant challenge.

I think downtime should generally be accepted as the cost of stability.
If your new kernel feature has a bug and causes a crash, it takes your
servers down anyway.

> >
> > For example, if you implement a well-designed framework in a specific
> > subsystem, like sched_ext, which allows multiple implementations extended
> > with BPF to coexist, there's no problem (at least it's debatable).
> >
> > But if it is for any function, it is a dangerous feature. Bugs that occur
> > in kernels that use this functionality cannot be addressed here. They
> > need to be treated the same way as out-of-tree drivers or forked kernels.
> > I mean, add a taint flag for this feature. Then we don't care about it.
> 
> Agreed. This should be handled as an OOT module rather than part of
> the core kernel.
> 
> >
> > >
> > > A significant challenge arises when atomic-replace is enabled. In this
> > > mode, deploying a new livepatch changes the target function's address,
> > > forcing a re-attachment of the BPF program. This re-attachment latency is
> > > unacceptable in critical paths, such as those handling networking policies.
> > >
> > > To solve this, we introduce a hybrid livepatch mode that allows specific
> > > patches to remain non-replaceable, ensuring the function address remains
> > > stable and the BPF program stays attached.
> >
> > Can you share your actual problem to be solved?
> 
> Here is an example we recently deployed on our production servers:
> 
>   https://lore.kernel.org/bpf/CALOAHbDnNba_w_nWH3-S9GAXw0+VKuLTh1gy5hy9Yqgeo4C0iA@mail.gmail.com/
> 
> In one of our specific clusters, we needed to send BGP traffic out
> through specific NICs based on the destination IP. To achieve this
> without interrupting service, we live-patched
> bond_xmit_3ad_xor_slave_get(), added a new hook called
> bond_get_slave_hook(), and then ran a BPF program attached to that
> hook to select the outgoing NIC from the SKB. This allowed us to
> rapidly deploy the feature with zero downtime.

In this case, you can write a dedicated livepatch or kernel module
that replaces the kernel function on your servers, without using BPF.

BGP traffic on a bonding device seems very specific, so it may not
cause trouble. But this is a very generic change, which allows
users to alter more of the core kernel, e.g. memory management
or the scheduler.

Excessive degrees of freedom introduce uncertainty and instability
into a system. While the functionality is interesting, it would be
a way to generalize sched_ext in an uncontrolled way.

At a minimum, some form of build-time and runtime constraint would be
necessary, along with a taint flag that clearly indicates in the crash
logs that this feature is in use. (This means it should not be used in
production environments.)
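
To make the build-time constraint, runtime constraint, and taint flag
concrete, here is a minimal userspace sketch, not kernel code. It is
only loosely modeled on how the kernel keeps a sticky taint mask and
prints it in crash logs. Every model_* name, the MODEL_ALLOW_RETURN_OVERRIDE
macro, and the 'V' taint character are hypothetical and exist only for
illustration; none of them is an existing kernel interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical taint bits for this userspace model (not the kernel's values). */
enum model_taint {
	MODEL_TAINT_OOT_MODULE = 0,
	MODEL_TAINT_LIVEPATCH,
	MODEL_TAINT_RETURN_OVERRIDE,	/* hypothetical: return override in use */
	MODEL_TAINT_COUNT,
};

/* One character per taint bit; 'V' is an assumption made for this sketch. */
static const char model_taint_chars[MODEL_TAINT_COUNT] = { 'O', 'K', 'V' };

static unsigned long model_tainted_mask;

/* Build-time constraint, modeled as a compile-time switch (hypothetical). */
#ifndef MODEL_ALLOW_RETURN_OVERRIDE
#define MODEL_ALLOW_RETURN_OVERRIDE 1
#endif

/* Runtime constraint, modeled as an opt-in knob that defaults to off. */
static int model_override_enabled;

/* Taint is sticky: this model offers no way to clear a bit once set. */
static void model_add_taint(enum model_taint flag)
{
	model_tainted_mask |= 1UL << flag;
}

static int model_test_taint(enum model_taint flag)
{
	return (int)((model_tainted_mask >> flag) & 1UL);
}

/*
 * Attach an override. Returns 0 on success, -1 if either constraint
 * refuses it. The taint bit is set only when the attach is allowed.
 */
static int model_attach_override(void)
{
	if (!MODEL_ALLOW_RETURN_OVERRIDE || !model_override_enabled)
		return -1;
	model_add_taint(MODEL_TAINT_RETURN_OVERRIDE);
	return 0;
}

/* Format the string a crash log line could carry in this model. */
static const char *model_print_tainted(char *buf, size_t len)
{
	size_t pos;
	int i;

	/* Room for the prefix (with its NUL) plus one char per taint bit. */
	assert(len >= sizeof("Tainted: ") + MODEL_TAINT_COUNT);

	if (!model_tainted_mask) {
		snprintf(buf, len, "Not tainted");
		return buf;
	}

	snprintf(buf, len, "Tainted: ");
	pos = strlen(buf);
	for (i = 0; i < MODEL_TAINT_COUNT; i++)
		buf[pos++] = model_test_taint((enum model_taint)i) ?
			     model_taint_chars[i] : ' ';
	buf[pos] = '\0';
	return buf;
}
```

In this model, model_attach_override() fails until model_override_enabled
is set, and after a successful attach model_print_tainted() reports the
hypothetical 'V' bit, so a crash log would show that the feature was used.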

Thank you,
> 
> [...]
> 
> -- 
> Regards
> Yafang


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>
