From: Petr Mladek <pmladek@suse.com>
To: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Yafang Shao <laoar.shao@gmail.com>,
	jpoimboe@kernel.org, jikos@kernel.org, mbenes@suse.cz,
	song@kernel.org, live-patching@vger.kernel.org
Subject: Re: [PATCH v3 3/7] livepatch: Support scoped atomic replace using replace_set
Date: Thu, 18 Jun 2026 15:03:56 +0200
Message-ID: <ajPsvHv4t3gyGOit@pathway.suse.cz>
In-Reply-To: <ajL-Y3dapC1FiAwx@redhat.com>

On Wed 2026-06-17 16:06:59, Joe Lawrence wrote:
> On Wed, Jun 17, 2026 at 03:52:27PM +0200, Petr Mladek wrote:
> > On Tue 2026-06-16 16:15:17, Joe Lawrence wrote:
> > > On Thu, Jun 11, 2026 at 02:58:39PM +0200, Petr Mladek wrote:
> > > > On Tue 2026-06-09 18:00:55, Petr Mladek wrote:
> > > > > On Sun 2026-06-07 21:16:55, Yafang Shao wrote:
> > [ ... snip ... ]
> > > I'm not against supercedes functionality, but continuing the
> > > brainstorming: what about solution 1 (.replace_set=0 special) with a
> > > special zero-day overlay?
> > 
> > I continue with the brainstorming ;-)
> > 
> 
> Thanks for walking through it with me.  Your reply crossed with my note
> to Yafang at nearly the same time.
> 
> > [ ... snip ... ]
> > > So maybe it boils down to: is the supercedes big hammer desired and safe
> > > enough to deploy?
> > 
> > I personally like the solution with a zero-terminated array of
> > replace_sets:
> > 
> > 	struct patch {
> > 	       [...]
> > 	       unsigned int *replace_sets;
> > 	       [...],
> > 	};
> > 
> > , which would allow to build a cumulative livepatch which replaces
> > known hotfixes out of box.
> > 
> 
> Question on this at the bottom ...
> 
> > Note that the hotfix should not be allowed to modify a function or
> > livepatch state which is modified by another livepatch. It would
> > be dangerous. We should allow to solve this only by a cumulative
> > livepatch.
> > 
> 
> Agreed.
> 
> > IMHO, the OS vendor should not touch customer specific livepatches
> > by default. The customer installed them for a reason. We should
> > just refuse to install two conflicting livepatches. Where
> > we could reliably compare only the livepatched functions.
> > But it still is good because most livepatches only modify
> > functions.
> > 
> > Plus, I would still allow to resolve the possible conflict by using
> > the atomic replace. It could be done by a module-specific parameter.
> > I would call it: override_replace_sets=X[,Y]... or so.
> > 
> 
> Naming nitpick: "override_replace_sets" sounds like it may override the
> "replace_sets" value and not supplement it.  But that's just an
> implementation detail to bikeshed later :D 

Good point! "supplement_replace_sets" or "add_replace_sets"
would be better :D

But see below.

> > Finally, I assume that most users will keep using only the default
> > replace_sets=0 [*]. They will never have to deal with another sets.
> > 
> > The non-default replace sets will be only for adventurous users
> > who want to deal with the complexity and accept the risks.
> > 
> > [*] It we allow the zero-terminated array of replace_sets then
> >     zero should not be the default. Or it could be but it would
> >     be a special set which could never be replaced by anything
> >     else than another zero replace set.
> > 
> >     The zero replace set might be for users who do not want to
> >     deal with the complexity at all. For example, for an os-vendor
> >     who does not want to release separate hotfixes.
> > 
> 
> Hmm, I do like the default replace_sets=0 not dealing with the
> complication of the replace sets.
> 
> But first, back to the larger question I mentioned at the beginning.
> 
> Originally there was:
> 
>   unsigned int replace_set;            /* the set I belong to */
>   const unsigned int *supersedes;      /* other sets I also replace */
> 
> and now it's just:
> 
>   unsigned int *replace_sets;          /* sets I belong to AND replace? */
> 
> Could you trace through a few cycles of cumulative + hotfix releases with
> this approach?  For example:
> 
>   Wed: klp-1a: cumulative    (replace_sets={1})
>   Thu: klp-1b: hotfix        (replace_sets={2})     <- coexists with klp-1a
>   Fri: klp-1c: hotfix v2     (replace_sets={2})     <- replaces klp-1b (same set)
>   Mon: klp-2a: cumulative    (replace_sets={1,2})   <- replaces klp-1a AND wipes klp-1c *
>   Tue: klp-2b: hotfix        (replace_sets={2})     <- coexists with klp-2a
> 
> [*] After klp-2a loads with {1, 2}, is it permanently in both sets?  Or
>     does it just evict set 2 and then only occupy set 1 going forward?  The
>     latter makes klp-2b's load straightforward.
> 
> I can read replace_sets two ways:
> 
>   1. Positional: { set [, eviction_set ...] } where the first element is
>      the patch's own set and the rest are evicted on load.
> 
>   2. Flat: the patch belongs to every listed set equally.  But then how
>      could klp-2b load into set 2 without replacing the entire
>      cumulative klp-2a that also occupies it?

I understand it in a 3rd way (similar to Yafang?) ;-)

      3. Set: the patch replaces the given set of replace_sets.
	 Where klp-2a is a cumulative livepatch for two
	 replace_sets: 1,2. And klp-2b hotfix would need to
	 use a new replace_set, e.g. 3.

	 I see "replace_set" as a set of modifications (functions,
	 shadow variables, and callbacks) which is supposed to
	 replace/update/downgrade the same "replace_set".

It would have the following consequences:
-----------------------------------------

First, any newer cumulative livepatch would need to replace all
older hotfixes. Let's extend your example:

   Wed: klp-1a: cumulative    (replace_sets={1})
   Thu: klp-1b: hotfix        (replace_sets={2})     <- coexists with klp-1a
   Fri: klp-1c: hotfix v2     (replace_sets={2})     <- replaces klp-1b (same set)
   Mon: klp-2a: cumulative    (replace_sets={1,2})   <- replaces klp-1a AND wipes klp-1c
   Tue: klp-2b: hotfix        (replace_sets={3})     <- coexists with klp-2a
   Fri: klp-3a: cumulative    (replace_sets={1,2,3}) <- replaces klp-2a AND wipes klp-2b
   Fri: klp-4a: cumulative    (replace_sets={1,2,3}) <- replaces klp-3a
   Fri: klp-5a: cumulative    (replace_sets={1,2,3}) <- replaces klp-4a
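
To make the trace above easier to check, here is a minimal userspace
sketch of the "Set" reading. It is not kernel code. The struct and
function names (set_patch, set_system, set_load, ...) are hypothetical,
and the eviction rule (a new patch evicts every loaded patch that shares
at least one replace set with it) is only an assumption about how the
"Set" reading would behave; the thread has not settled it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_LOADED 8

/* Hypothetical userspace model; this is not struct klp_patch. */
struct set_patch {
	const char *name;
	const unsigned int *replace_sets;	/* zero-terminated array */
};

struct set_system {
	const struct set_patch *loaded[MAX_LOADED];
	size_t nr_loaded;
};

/* True when the two zero-terminated arrays share at least one set. */
static bool sets_intersect(const unsigned int *a, const unsigned int *b)
{
	for (; *a; a++)
		for (const unsigned int *p = b; *p; p++)
			if (*a == *p)
				return true;
	return false;
}

/*
 * Assumed rule: loading @new evicts every loaded patch which shares
 * a replace set with it. All other patches stay installed in parallel.
 */
static void set_load(struct set_system *sys, const struct set_patch *new)
{
	size_t i, kept = 0;

	assert(sys && new && new->replace_sets);

	for (i = 0; i < sys->nr_loaded; i++)
		if (!sets_intersect(sys->loaded[i]->replace_sets,
				    new->replace_sets))
			sys->loaded[kept++] = sys->loaded[i];

	assert(kept < MAX_LOADED);
	sys->loaded[kept++] = new;
	sys->nr_loaded = kept;
}

static bool set_is_loaded(const struct set_system *sys, const char *name)
{
	for (size_t i = 0; i < sys->nr_loaded; i++)
		if (!strcmp(sys->loaded[i]->name, name))
			return true;
	return false;
}
```

With this rule, a hotfix using replace_sets={2} loaded on top of klp-2a
(replace_sets={1,2}) would evict the whole cumulative patch, which is
why klp-2b needs the new set 3 in the example.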

Second, it would limit downgrades, for example:

   + klp-3a, klp-4a, and klp-5a look compatible from the replace_set POV.
     The replace_set would not prevent them from replacing each other.

     Well, the replacing still might be limited by the states.

     Plus the pending patchset adds a per-state "block_disable" flag
     which should handle situations where the change (by a callback)
     can't be reverted, see
     https://lore.kernel.org/all/20250115082431.5550-1-pmladek@suse.com/


Hmm, this raises the question of how exactly replace_sets and states
play together and whether we need them both.

I did some brainstorming and came up with the following definitions:
--------------------------------------------------------------------

   + Each patch.objs[i].funcs[j] defines a particular
     livepatched function.

   + Each patch.states[i] defines a particular shadow variable
     (same id + state.is_shadow=true) [1] and/or a set of callbacks [2]

     [1] https://lore.kernel.org/all/20250115082431.5550-3-pmladek@suse.com/
     [2] https://lore.kernel.org/all/20250115082431.5550-2-pmladek@suse.com/

   + Each shadow variable "id" defines a particular data (type)

   + Each set of callbacks (pre/post/enable/disable) is connected
     either with a particular shadow variable (for its lifetime handling)
     or it can change the system state with a one-time operation.

Why do we need "states"?
------------------------

   I see "states" as a definition of shadow variable ids and
   callback sets. We need to somehow tell the kernel that the
   livepatch is going to use them.

   The numeric "id" makes it "easy" to compare the compatibility
   of the definitions between livepatches.

   Note that we do not need states to compare livepatched functions.
   The kernel can compare them by the info in patch.objs[].funcs[].old_name.
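
   As an illustration of the last point, here is a minimal userspace
   sketch of such a comparison. The structures are simplified,
   hypothetical stand-ins for the real klp_func/klp_object/klp_patch
   (only the fields needed for the comparison are kept), and
   fn_patches_conflict() is not an existing kernel function. Matching
   on old_name plus old_sympos within the same object is an assumption
   about what "the same function" should mean here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical stand-ins for the livepatch structures. */
struct fn_func {
	const char *old_name;		/* NULL terminates the funcs array */
	unsigned long old_sympos;
};

struct fn_object {
	const char *name;		/* NULL means vmlinux */
	const struct fn_func *funcs;	/* NULL terminates the objs array */
};

struct fn_patch {
	const struct fn_object *objs;
};

/* Two entries describe the same object when both names match. */
static bool fn_same_object(const struct fn_object *a,
			   const struct fn_object *b)
{
	if (!a->name || !b->name)
		return a->name == b->name;	/* both must be vmlinux */
	return !strcmp(a->name, b->name);
}

/* True when both patches modify at least one common function. */
static bool fn_patches_conflict(const struct fn_patch *p1,
				const struct fn_patch *p2)
{
	const struct fn_object *o1, *o2;
	const struct fn_func *f1, *f2;

	assert(p1 && p2);

	for (o1 = p1->objs; o1->funcs; o1++) {
		for (o2 = p2->objs; o2->funcs; o2++) {
			if (!fn_same_object(o1, o2))
				continue;
			for (f1 = o1->funcs; f1->old_name; f1++)
				for (f2 = o2->funcs; f2->old_name; f2++)
					if (!strcmp(f1->old_name, f2->old_name) &&
					    f1->old_sympos == f2->old_sympos)
						return true;
		}
	}
	return false;
}
```

   No state id is needed for this check. The function names alone
   are enough to refuse two livepatches which touch the same function
   without replacing each other.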


Why do we need "replace_set"?
-----------------------------

   I see "replace_set" as a union of fixes (livepatched functions,
   shadow variables, callbacks) which is supposed to be handled
   using atomic replace.

   It defines which livepatches upgrade/downgrade each other and
   which can be installed in parallel with other livepatches.

   I see it like "package name" in the RPM package management system.
   The "rpm" tool allows upgrading/downgrading packages with the same
   name. It can even upgrade/downgrade a package with another name
   when "provides" [*] are defined.

   Note that the "replace_set" would even allow a downgrade because
   a new livepatch might:

      + stop modifying some functions, see klp_add_nops().

      + stop using some shadow variables and/or revert changes
	done by some callbacks, until it gets blocked by per-state
	"block_disable", see
	https://lore.kernel.org/all/20250115082431.5550-5-pmladek@suse.com/

[*] "provides" seems to be a better name than "supplements".


Does "replace_set" have a good name and semantics?
--------------------------------------------------

I think that we really could find some analogy with the package
management.

"replace_set" does not exist in the package management terminology.
Also "replace_sets" is a set of replace sets which sounds a bit ugly.
Also I sometimes wanted to say "replace a replace set" which is
overwhelming.

Well, we likely do not want to introduce livepatch names. They
might get confused with module names.

We could not use module names because they must differ. Otherwise,
the kernel could not load both old and newer livepatch in parallel.

I would stay with numbers. They are kind of IDs. But we do not want
to name it patch.id because it might get confused with state.id.

We could call it "patch_id" but "patch.patch_id" is a bit ugly.

I did some brainstorming and came up with:

    "project_id"
    "changeset_id"
    "provides_id"
    "track_id"
    "fix_id"

and claude-sonnet-4.6 also suggested to use:

    "slot"

which has a similar meaning in the Gentoo package management,
see https://devmanual.gentoo.org/general-concepts/slotting/

The "slots" name is interesting but slots seem to be always
mutually exclusive. I do not see any concept of merging
slots in Gentoo.


My preferences:

Honestly, I feel a bit lost. I think that I need to sleep on it.

I kind of like:

     * @slot: Livepatches with the same slot replace each other.
     *        Livepatches with different slots might be installed in parallel.
     unsigned long slot;

And I would handle the merging of other slots separately by:

     * @merge_slots: Replace livepatches with the given slots.
     unsigned long merge_slots[];

Plus, a module parameter add_merge_slots=x[,y]...
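
A minimal userspace sketch of how the two fields might be evaluated
together follows. The field names "slot" and "merge_slots" are taken
from the proposal above. Everything else is hypothetical: the struct
slot_patch, the helper slot_replaces(), and the assumption that
merge_slots is a zero-terminated array (so slot 0 itself could not be
merged).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the proposed fields; not struct klp_patch. */
struct slot_patch {
	unsigned long slot;			/* the patch's own slot */
	const unsigned long *merge_slots;	/* assumed zero-terminated,
						 * NULL when unused */
};

/* Would loading @new atomically replace the already loaded @old? */
static bool slot_replaces(const struct slot_patch *new,
			  const struct slot_patch *old)
{
	const unsigned long *s;

	assert(new && old);

	/* Same slot: the patches always replace each other. */
	if (new->slot == old->slot)
		return true;

	/* Different slot: replaced only when explicitly merged. */
	for (s = new->merge_slots; s && *s; s++)
		if (*s == old->slot)
			return true;

	return false;
}
```

In this model a cumulative patch with slot=1 and merge_slots={2,3}
replaces an older cumulative patch in slot 1 and hotfixes in slots 2
and 3, while a hotfix in slot 4 stays installed in parallel. A slot
passed via add_merge_slots= would simply extend the merge_slots list.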

But I do not have a strong opinion.

Best Regards,
Petr
