public inbox for live-patching@vger.kernel.org
From: Petr Mladek <pmladek@suse.com>
To: Song Liu <song@kernel.org>
Cc: live-patching@vger.kernel.org, jpoimboe@kernel.org,
	jikos@kernel.org, joe.lawrence@redhat.com,
	Miroslav Benes <mbenes@suse.cz>,
	Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH v7] livepatch: Clear relocation targets on a module removal
Date: Thu, 5 Jan 2023 12:19:42 +0100
Message-ID: <Y7ayTvpxnDvX9Nfi@alley>
In-Reply-To: <CAPhsuW7EAFgUUgh3Q6wbE-PNLGnSFFWmdQaYfOqVW6adM0+G4g@mail.gmail.com>

On Wed 2023-01-04 09:34:25, Song Liu wrote:
> On Wed, Jan 4, 2023 at 2:26 AM Petr Mladek <pmladek@suse.com> wrote:
> >
> > On Wed 2022-12-14 09:40:35, Song Liu wrote:
> > > From: Miroslav Benes <mbenes@suse.cz>
> > >
> > > Josh reported a bug:
> > >
> > >   When the object to be patched is a module, and that module is
> > >   rmmod'ed and reloaded, it fails to load with:
> > >
> > >   module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
> > >   livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> > >   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> > >
> > >   The livepatch module has a relocation which references a symbol
> > >   in the _previous_ loading of nfsd. When apply_relocate_add()
> > >   tries to replace the old relocation with a new one, it sees that
> > >   the previous one is nonzero and it errors out.
> > >
> > > We thus decided to reverse the relocation patching (clear all relocation
> > > targets on x86_64). The solution is not
> > > universal and is too much arch-specific, but it may prove to be simpler
> > > in the end.
> > >
> > > --- a/arch/powerpc/kernel/module_64.c
> > > +++ b/arch/powerpc/kernel/module_64.c
> > > @@ -739,6 +739,67 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
> > >       return 0;
> > >  }
> > >
> > > +#ifdef CONFIG_LIVEPATCH
> > > +void clear_relocate_add(Elf64_Shdr *sechdrs,
> > > +                    const char *strtab,
> > > +                    unsigned int symindex,
> > > +                    unsigned int relsec,
> > > +                    struct module *me)
> > > +{
> > > +     unsigned int i;
> > > +     Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
> > > +     Elf64_Sym *sym;
> > > +     unsigned long *location;
> > > +     const char *symname;
> > > +     u32 *instruction;
> > > +
> > > +     pr_debug("Clearing ADD relocate section %u to %u\n", relsec,
> > > +              sechdrs[relsec].sh_info);
> > > +
> > > +     for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
> > > +             location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
> > > +                     + rela[i].r_offset;
> > > +             sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
> > > +                     + ELF64_R_SYM(rela[i].r_info);
> > > +             symname = me->core_kallsyms.strtab
> > > +                     + sym->st_name;
> > > +
> > > +             if (ELF64_R_TYPE(rela[i].r_info) != R_PPC_REL24)
> > > +                     continue;
> >
> > Is it OK to continue?
> >
> > IMHO, we should at least warn here. It means that the special elf
> > section contains a relocation that we are not able to clear. It will
> > most likely blow up when we try to load the livepatched module
> > again.
> >
> > > +             /*
> > > +              * reverse the operations in apply_relocate_add() for case
> > > +              * R_PPC_REL24.
> > > +              */
> > > +             if (sym->st_shndx != SHN_UNDEF &&
> > > +                 sym->st_shndx != SHN_LIVEPATCH)
> > > +                     continue;
> >
> > Same here. IMHO, we should warn when the section contains something
> > that we are not able to clear.
> >
> > > +             /* skip mprofile and ftrace calls, same as restore_r2() */
> > > +             if (is_mprofile_ftrace_call(symname))
> > > +                     continue;
> >
> > Is this correct? restore_r2() returns "1" in this case. As a result
> > apply_relocate_add() returns immediately with -ENOEXEC. IMHO, we
> > should print a warning and return as well.
> >
> > > +             instruction = (u32 *)location;
> > > +             /* skip sibling call, same as restore_r2() */
> > > +             if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
> > > +                     continue;
> >
> > Same here. restore_r2() returns '1' in this case...
> >
> > > +
> > > +             instruction += 1;
> > > +             /*
> > > +              * Patch location + 1 back to NOP so the next
> > > +              * apply_relocate_add() call (reload the module) will not
> > > +              * fail the sanity check in restore_r2():
> > > +              *
> > > +              *         if (*instruction != PPC_RAW_NOP()) {
> > > +              *             pr_err(...);
> > > +              *             return 0;
> > > +              *         }
> > > +              */
> > > +             patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
> > > +     }
> >
> > This seems incomplete. The above code reverts patch_instruction() called
> > from restore_r2(). But there is another patch_instruction() called in
> > apply_relocate_add() for case R_PPC_REL24. IMHO, we should revert this
> > as well.
> >
> > > +}
> > > +#endif
> >
> > IMHO, this approach is really bad. The function is not maintainable.
> > It will be very hard to keep it in sync with apply_relocate_add().
> > And all the mistakes are just a proof.
> 
> I don't really think the above are mistakes. This should be the same
> as the version that passed Joe's tests. (I didn't test it myself).

I am not sure if Joe tested these situations.

Anyway, we should make it as robust as possible. If we manipulate
the addresses the wrong way, it might bring down the system.

If the code reaches an unexpected situation, it should at
least warn about it.
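
To illustrate the point, here is a minimal userspace sketch (not kernel
code) of a clearing loop that warns about and counts every relocation it
cannot handle instead of skipping it silently. The struct reloc layout,
the RELOC_* constants and clear_relocs() are hypothetical stand-ins made
up for this example; in the kernel, the warning would presumably go
through pr_warn() or pr_err().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical relocation types; stand-ins for the real R_PPC_* values. */
enum { RELOC_REL24 = 10, RELOC_OTHER = 99 };

/* Hypothetical, simplified relocation record. */
struct reloc {
	unsigned int type;
	uint32_t *location;
};

/*
 * Clear every relocation we know how to clear. Anything else is
 * reported and counted instead of being skipped silently.
 *
 * Returns the number of relocations that could not be cleared.
 */
static int clear_relocs(struct reloc *relocs, size_t n)
{
	int unhandled = 0;
	size_t i;

	assert(relocs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (relocs[i].type != RELOC_REL24) {
			fprintf(stderr,
				"warning: cannot clear relocation %zu of type %u\n",
				i, relocs[i].type);
			unhandled++;
			continue;
		}
		*relocs[i].location = 0;
	}

	return unhandled;
}
```

A caller can then treat a non-zero return value as a sign that a later
reload of the patched module is likely to fail.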

The entire livepatching code tries to be as robust as possible.
The main motivation for livepatching is to avoid reboot.

> >
> > IMHO, the only sane way is to avoid the code duplication.
> 
> I think this falls back to the question that do we want
> clear_relocate_add() to
>    1) undo everything by apply_relocate_add();
> or
>    2) make sure the next apply_relocate_add() succeeds.

The ideal solution would be to add checks into apply_relocate_add().
It would make it more robust. In that case, clear_relocate_add()
would need to clear everything.

But this is not the case on powerpc and s390 at the moment.
In this case, I suggest clearing only the relocations that
are checked in apply_relocate_add().

But it should be done without duplicating the code.

It would actually make sense to compute the value that was
used in apply_relocate_add() and check that we are clearing
that value. If we try to clear some other value, then we are
probably doing something wrong.

This might actually be a solution. We could compute
the value in both situations. Then we could have
a common function for writing.

This write function would check that it replaces zero
with the value in apply_relocate_add() and that it replaces
the value with zero in clear_relocate_add().
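
A minimal userspace sketch of such a common write function, under the
assumption that the caller has already computed the relocation value in
both paths. The name write_reloc_checked(), its signature and the choice
of -ENOEXEC as the error code are hypothetical and only illustrate the
idea; real kernel code would write through the arch-specific patching
helpers rather than a plain store.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical common writer shared by the apply and the clear path.
 * "value" is the relocation value computed by the caller in both cases.
 *
 * apply == true:  the location must hold 0 and is set to value.
 * apply == false: the location must hold value and is set back to 0.
 *
 * Returns 0 on success, -ENOEXEC when the location holds something else.
 * The location is left untouched on failure.
 */
static int write_reloc_checked(uint64_t *location, uint64_t value, bool apply)
{
	uint64_t expected = apply ? 0 : value;
	uint64_t new_val = apply ? value : 0;

	assert(location != NULL);

	if (*location != expected) {
		fprintf(stderr,
			"%s: unexpected value 0x%llx at %p, expected 0x%llx\n",
			apply ? "apply" : "clear",
			(unsigned long long)*location, (void *)location,
			(unsigned long long)expected);
		return -ENOEXEC;
	}

	*location = new_val;
	return 0;
}
```

With this shape, both paths share the computation of the value and the
write itself, and only the direction flag differs.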

Best Regards,
Petr
