From: Mark Rutland <mark.rutland@arm.com>
To: Khaja Hussain Shaik Khaji <khaja.khaji@oss.qualcomm.com>
Cc: linux-arm-msm@vger.kernel.org, dev.jain@arm.com,
	linux-kernel@vger.kernel.org, mhiramat@kernel.org,
	catalin.marinas@arm.com, will@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	yang@os.amperecomputing.com
Subject: Re: [PATCH v3 1/1] kernel: kprobes: fix cur_kprobe corruption during re-entrant kprobe_busy_begin() calls
Date: Mon, 2 Mar 2026 13:38:35 +0000
Message-ID: <aaWS20g-jGu8mCKH@J2N7QTR9R3>
In-Reply-To: <20260302105347.3602192-2-khaja.khaji@oss.qualcomm.com>

On Mon, Mar 02, 2026 at 04:23:47PM +0530, Khaja Hussain Shaik Khaji wrote:
> Fix cur_kprobe corruption that occurs when kprobe_busy_begin() is called
> re-entrantly during an active kprobe handler.
> 
> Previously, kprobe_busy_begin() unconditionally overwrites current_kprobe
> with &kprobe_busy, and kprobe_busy_end() writes NULL. This approach works
> correctly when no kprobe is active but fails during re-entrant calls.

The structure of kprobe_busy_begin() and kprobe_busy_end() implies that
re-entrancy is not expected, and that callers are supposed to avoid it
somehow.

Is that the case, or are kprobe_busy_begin() and kprobe_busy_end()
generally buggy?
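
For reference, here is a minimal userspace sketch of the pre-patch logic,
taken from the '-' lines of the diff below. It assumes a single CPU and
uses plain globals in place of the per-CPU variables and the kprobe
control block. All of the model_* names are made up for illustration and
do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for kernel state; one "CPU", no real per-CPU data. */
struct kprobe { const char *name; };

/* Illustrative status values for this model only. */
enum { KPROBE_HIT_ACTIVE = 1, KPROBE_HIT_SS = 2 };

static struct kprobe kprobe_busy = { "kprobe_busy" };
static struct kprobe *current_kprobe;
static unsigned long kprobe_status;

/* Mirrors the pre-patch kprobe_busy_begin(): unconditional overwrite. */
static void model_busy_begin(void)
{
	current_kprobe = &kprobe_busy;
	kprobe_status = KPROBE_HIT_ACTIVE;
}

/* Mirrors the pre-patch kprobe_busy_end(): unconditional write of NULL. */
static void model_busy_end(void)
{
	current_kprobe = NULL;
}

/*
 * Pretend a probe is mid-handling, then run a begin/end pair as if it
 * came from an interrupting context. Returns true if the outer probe's
 * state was lost.
 */
static bool model_nested_call_clobbers_state(void)
{
	struct kprobe probe = { "outer_probe" };

	current_kprobe = &probe;
	kprobe_status = KPROBE_HIT_SS;

	model_busy_begin();
	model_busy_end();

	return current_kprobe != &probe || kprobe_status != KPROBE_HIT_SS;
}
```

In this model the begin/end pair is only safe when no probe is active on
entry, which is consistent with the pair never having been written with
nesting in mind.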

> On arm64, arm64_enter_el1_dbg() re-enables IRQs before invoking kprobe
> handlers.

No, arm64_enter_el1_dbg() does not re-enable IRQs. It only manages state
tracking.

I don't know if you meant to say a different function here, but this
statement is clearly wrong.

> This allows an IRQ during kretprobe
> entry_handler to trigger kprobe_flush_task() via softirq, which calls
> kprobe_busy_begin/end and corrupts cur_kprobe.

This would be easier to follow if the backtrace were included in the
commit message rather than in the cover letter, so that it could be
referred to easily.

> Problem flow: kretprobe entry_handler -> IRQ -> softirq ->
> kprobe_flush_task -> kprobe_busy_begin/end -> cur_kprobe corruption.

We shouldn't take the IRQ in the first place here. AFAICT, nothing
unmasks IRQs prior to the entry handler.

That suggests that something is going wrong *within* your entry handler
that causes IRQs to be unmasked unexpectedly.

Please can we find out *exactly* where IRQs get unmasked for the first
time?

Mark.

> 
> This corruption causes two issues:
> 1. NULL cur_kprobe in setup_singlestep leading to panic in single-step
> handler
> 2. kprobe_status overwritten with HIT_ACTIVE during execute-out-of-line
> window
> 
> Implement a per-CPU re-entrancy tracking mechanism with:
> - A depth counter to track nested calls
> - Saved state for current_kprobe and kprobe_status
> - Save state on first entry, restore on final exit
> - Increment depth counter for nested calls only
> 
> This approach maintains compatibility with existing callers as
> save/restore of NULL is a no-op.
> 
> Signed-off-by: Khaja Hussain Shaik Khaji <khaja.khaji@oss.qualcomm.com>
> ---
>  kernel/kprobes.c | 34 ++++++++++++++++++++++++++++++----
>  1 file changed, 30 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index e2cd01cf5968..47a4ae50ee6c 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -70,6 +70,15 @@ static bool kprobes_all_disarmed;
>  static DEFINE_MUTEX(kprobe_mutex);
>  static DEFINE_PER_CPU(struct kprobe *, kprobe_instance);
>  
> +/* Per-CPU re-entrancy state for kprobe_busy_begin/end.
> + * kprobe_busy_begin() may be called while a kprobe handler
> + * is active - e.g. kprobe_flush_task() via softirq during
> + * kretprobe entry_handler on arm64 where IRQs are re-enabled.
> + */
> +static DEFINE_PER_CPU(int, kprobe_busy_depth);
> +static DEFINE_PER_CPU(struct kprobe *, kprobe_busy_saved_current);
> +static DEFINE_PER_CPU(unsigned long, kprobe_busy_saved_status);
> +
>  kprobe_opcode_t * __weak kprobe_lookup_name(const char *name,
>  					unsigned int __unused)
>  {
> @@ -1307,14 +1316,31 @@ void kprobe_busy_begin(void)
>  	struct kprobe_ctlblk *kcb;
>  
>  	preempt_disable();
> -	__this_cpu_write(current_kprobe, &kprobe_busy);
> -	kcb = get_kprobe_ctlblk();
> -	kcb->kprobe_status = KPROBE_HIT_ACTIVE;
> +	if (__this_cpu_read(kprobe_busy_depth) == 0) {
> +		kcb = get_kprobe_ctlblk();
> +		__this_cpu_write(kprobe_busy_saved_current,
> +				 __this_cpu_read(current_kprobe));
> +		__this_cpu_write(kprobe_busy_saved_status,
> +				 kcb->kprobe_status);
> +		__this_cpu_write(current_kprobe, &kprobe_busy);
> +		kcb->kprobe_status = KPROBE_HIT_ACTIVE;
> +	}
> +	__this_cpu_inc(kprobe_busy_depth);
>  }
>  
>  void kprobe_busy_end(void)
>  {
> -	__this_cpu_write(current_kprobe, NULL);
> +	struct kprobe_ctlblk *kcb;
> +
> +	__this_cpu_dec(kprobe_busy_depth);
> +
> +	if (__this_cpu_read(kprobe_busy_depth) == 0) {
> +		kcb = get_kprobe_ctlblk();
> +		__this_cpu_write(current_kprobe,
> +				 __this_cpu_read(kprobe_busy_saved_current));
> +		kcb->kprobe_status =
> +				__this_cpu_read(kprobe_busy_saved_status);
> +	}
>  	preempt_enable();
>  }
>  
> -- 
> 2.34.1
> 