All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Christophe Leroy <christophe.leroy@csgroup.eu>,
	Andrew Donnellan <ajd@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	maddy@linux.ibm.com, mpe@ellerman.id.au, peterz@infradead.org,
	jpoimboe@kernel.org, jbaron@akamai.com
Cc: npiggin@gmail.com, rostedt@goodmis.org, ardb@kernel.org,
	Erhard Furtner <erhard_f@mailbox.org>
Subject: Re: [PATCH RFC] powerpc: Panic on jump label code patching failure
Date: Tue, 09 Sep 2025 08:49:05 +0530	[thread overview]
Message-ID: <68bfad34.050a0220.11f3f7.19c5@mx.google.com> (raw)
In-Reply-To: <1b4cc6d5-5f5b-4b39-8fdf-ac02c94cd5e2@csgroup.eu>

Christophe Leroy <christophe.leroy@csgroup.eu> writes:

> Le 06/09/2025 à 05:52, Ritesh Harjani a écrit :
>> [Vous ne recevez pas souvent de courriers de riteshh@linux.ibm.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
>> 
>> Andrew Donnellan <ajd@linux.ibm.com> writes:
>> 
>>> If patch_branch() or patch_instruction() fails while updating a jump
>>> label, we presently fail silently, leading to unpredictable behaviour
>>> later on.
>>>
>>> Change arch_jump_label_transform() to panic on a code patching failure,
>>> matching the existing behaviour of arch_static_call_transform().
>>>
>>> Reported-by: Erhard Furtner <erhard_f@mailbox.org>
>>> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
>>>
>>> ---
>>>
>>> Ran into this while debugging an issue that Erhard reported to me about my
>>> PAGE_TABLE_CHECK series on a G4, where updating a static key failed
>>> silently, but only for one call site, leading to an incorrect reference
>>> count later on. This looks to be due to the issue fixed in [0]. A loud
>>> failure would have saved us all considerable debugging time.
>>>
>>> Should I change the return type of arch_jump_label_transform() and handle
>>> this in an arch-independent way? Are there other users of code patching
>>> in powerpc that ought to be hardened?
>>>
>>> Or is this excessive?
>>>
>>> [0] https://patchwork.ozlabs.org/project/linuxppc-dev/patch/4b5e6eb281d7b1ea77619bee17095f905a125168.1757003584.git.christophe.leroy@csgroup.eu/
>>> ---
>>>   arch/powerpc/kernel/jump_label.c | 9 +++++++--
>>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kernel/jump_label.c b/arch/powerpc/kernel/jump_label.c
>>> index 2659e1ac8604..80d41ed7ac50 100644
>>> --- a/arch/powerpc/kernel/jump_label.c
>>> +++ b/arch/powerpc/kernel/jump_label.c
>>> @@ -12,9 +12,14 @@ void arch_jump_label_transform(struct jump_entry *entry,
>>>                               enum jump_label_type type)
>>>   {
>>>        u32 *addr = (u32 *)jump_entry_code(entry);
>>> +     int err;
>>>
>>>        if (type == JUMP_LABEL_JMP)
>>> -             patch_branch(addr, jump_entry_target(entry), 0);
>>> +             err = patch_branch(addr, jump_entry_target(entry), 0);
>>>        else
>>> -             patch_instruction(addr, ppc_inst(PPC_RAW_NOP()));
>>> +             err = patch_instruction(addr, ppc_inst(PPC_RAW_NOP()));
>>> +
>>> +     if (err)
>>> +             panic("%s: patching failed, err %d, type %d, addr %pS, target %pS\n",
>>> +                   __func__, err, type, addr, (void *)jump_entry_target(entry));
>>>   }
>> 
>> arch_jump_label_transform() is mainly getting called from
>> __jump_level_update() and it's used for enabling or updating static keys / branch.
>> 
>> But static keys can also be used by drivers / module subsystem whose
>> initialization happens late. Although I understand that if the above
>> fails, it might fail much before, from the arch setup code itself, but
>> panic() still feels like a big hammer.
>
> But not being able to patch the kernel as required means that you get a 
> kernel behaving differently from what is expected.
>
> Imagine a kernel running on a board that is controlling a saw. There is 
> a patch_instruction() to activate the safety feature which detects when 
> your hands are too close to the blade. Do you want the kernel to 
> continue running seamlessly when that patch_instruction() fails ? I'm 
> sure you don't !
>

;) Sure. Not a fan of playing with blades or saws. :) 

>> 
>> Would pr_err() print with WARN_ON_ONCE(1) would suffice in case of an
>> err?
>
> No, that's not enough, you can't rely on a kernel that will no behave as 
> expected.
>

Sure, Christophe. Thanks for the clarification.

My main concern was that during module load time we should not bring the
system down. But as I see the behavior in arch_static_call_transform()
is also the same.
And as you said, maybe it's good to panic if an important functionality
like patch_instruction() itself fails which means the kernel may not be
doing what we are expecting it to.


>> 
>> Also you said you ran into a problem at just one call site where above
>> was silently failing. With the above change are you able to hit the
>> panic() now? Because from what I see in patch_instruction(), it mainly
>> will boil down to calling __patch_mem() which always returns 0.
>
> As far as I can see, __patch_mem() returns -EPERM when 
> __put_kernel_nofault() fails:
>
> static int __patch_mem(void *exec_addr, unsigned long val, void 
> *patch_addr, bool is_dword)
> {
> 	if (!IS_ENABLED(CONFIG_PPC64) || likely(!is_dword)) {
> 		/* For big endian correctness: plain address would use the wrong half */
> 		u32 val32 = val;
>
> 		__put_kernel_nofault(patch_addr, &val32, u32, failed);
> 	} else {
> 		__put_kernel_nofault(patch_addr, &val, u64, failed);
> 	}
>
> 	asm ("dcbst 0, %0; sync; icbi 0,%1; sync; isync" :: "r" (patch_addr),
> 							    "r" (exec_addr));
>
> 	return 0;
>
> failed:
> 	mb();  /* sync */
> 	return -EPERM;
> }
>

Right, I somehow missed that "_nofault" part.

>
>> Although there are other places where there can be an error returned,
>> so I was wondering if that is what you were hitting or something else?
>
> Andrew was hitting the -EPERM because the memory area was read-only.

yes, that make sense.


>
> Christophe

Thanks for the comments!

-ritesh


      parent reply	other threads:[~2025-09-09  4:29 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-05  6:11 [PATCH RFC] powerpc: Panic on jump label code patching failure Andrew Donnellan
2025-09-05  6:34 ` Christophe Leroy
2025-09-06  3:52 ` Ritesh Harjani
2025-09-06  6:42   ` Christophe Leroy
2025-09-08  1:45     ` Andrew Donnellan
2025-09-09  3:19     ` Ritesh Harjani [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=68bfad34.050a0220.11f3f7.19c5@mx.google.com \
    --to=ritesh.list@gmail.com \
    --cc=ajd@linux.ibm.com \
    --cc=ardb@kernel.org \
    --cc=christophe.leroy@csgroup.eu \
    --cc=erhard_f@mailbox.org \
    --cc=jbaron@akamai.com \
    --cc=jpoimboe@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=maddy@linux.ibm.com \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.