From: Masami Hiramatsu <mhiramat@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
x86@kernel.org, linux-kernel@vger.kernel.org, bristot@redhat.com,
jbaron@akamai.com, torvalds@linux-foundation.org,
tglx@linutronix.de, namit@vmware.com, hpa@zytor.com,
luto@kernel.org, ard.biesheuvel@linaro.org, jpoimboe@redhat.com,
jeyu@kernel.org, alexei.starovoitov@gmail.com
Subject: Re: [PATCH -tip 1/2] x86/alternative: Sync bp_patching update for avoiding NULL pointer exception
Date: Wed, 11 Dec 2019 17:09:19 +0900
Message-ID: <20191211170919.54f6546d294f8a45c0a176c7@kernel.org>
In-Reply-To: <20191211000943.GG2871@hirez.programming.kicks-ass.net>
Hi Peter,
On Wed, 11 Dec 2019 01:09:43 +0100
Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Dec 10, 2019 at 06:32:09PM +0100, Peter Zijlstra wrote:
>
> > I feel that is actually more complicated... Let me try to see if I can
> > simplify things.
>
> How is this then?
This looks perfectly good to me :)
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Thank you!
>
> ---
> arch/x86/kernel/alternative.c | 84 +++++++++++++++++++++++++++----------------
> 1 file changed, 53 insertions(+), 31 deletions(-)
>
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index 30e86730655c..34360ca301a2 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -948,10 +948,29 @@ struct text_poke_loc {
> const u8 text[POKE_MAX_OPCODE_SIZE];
> };
>
> -static struct bp_patching_desc {
> +struct bp_patching_desc {
> struct text_poke_loc *vec;
> int nr_entries;
> -} bp_patching;
> + atomic_t refs;
> +};
> +
> +static struct bp_patching_desc *bp_desc;
> +
> +static inline struct bp_patching_desc *try_get_desc(struct bp_patching_desc **descp)
> +{
> + struct bp_patching_desc *desc = READ_ONCE(*descp); /* rcu_dereference */
> +
> + if (!desc || !atomic_inc_not_zero(&desc->refs))
> + return NULL;
> +
> + return desc;
> +}
> +
> +static inline void put_desc(struct bp_patching_desc *desc)
> +{
> + smp_mb__before_atomic();
> + atomic_dec(&desc->refs);
> +}
>
> static inline void *text_poke_addr(struct text_poke_loc *tp)
> {
> @@ -972,26 +991,26 @@ NOKPROBE_SYMBOL(patch_cmp);
>
> int notrace poke_int3_handler(struct pt_regs *regs)
> {
> + struct bp_patching_desc *desc;
> struct text_poke_loc *tp;
> + int len, ret = 0;
> void *ip;
> - int len;
> +
> + if (user_mode(regs))
> + return 0;
>
> /*
> * Having observed our INT3 instruction, we now must observe
> - * bp_patching.nr_entries.
> + * bp_desc:
> *
> - * nr_entries != 0 INT3
> + * bp_desc = desc INT3
> * WMB RMB
> - * write INT3 if (nr_entries)
> - *
> - * Idem for other elements in bp_patching.
> + * write INT3 if (desc)
> */
> smp_rmb();
>
> - if (likely(!bp_patching.nr_entries))
> - return 0;
> -
> - if (user_mode(regs))
> + desc = try_get_desc(&bp_desc);
> + if (!desc)
> return 0;
>
> /*
> @@ -1002,16 +1021,16 @@ int notrace poke_int3_handler(struct pt_regs *regs)
> /*
> * Skip the binary search if there is a single member in the vector.
> */
> - if (unlikely(bp_patching.nr_entries > 1)) {
> - tp = bsearch(ip, bp_patching.vec, bp_patching.nr_entries,
> + if (unlikely(desc->nr_entries > 1)) {
> + tp = bsearch(ip, desc->vec, desc->nr_entries,
> sizeof(struct text_poke_loc),
> patch_cmp);
> if (!tp)
> - return 0;
> + goto out_put;
> } else {
> - tp = bp_patching.vec;
> + tp = desc->vec;
> if (text_poke_addr(tp) != ip)
> - return 0;
> + goto out_put;
> }
>
> len = text_opcode_size(tp->opcode);
> @@ -1023,7 +1042,7 @@ int notrace poke_int3_handler(struct pt_regs *regs)
> * Someone poked an explicit INT3, they'll want to handle it,
> * do not consume.
> */
> - return 0;
> + goto out_put;
>
> case CALL_INSN_OPCODE:
> int3_emulate_call(regs, (long)ip + tp->rel32);
> @@ -1038,7 +1057,11 @@ int notrace poke_int3_handler(struct pt_regs *regs)
> BUG();
> }
>
> - return 1;
> + ret = 1;
> +
> +out_put:
> + put_desc(desc);
> + return ret;
> }
> NOKPROBE_SYMBOL(poke_int3_handler);
>
> @@ -1069,14 +1092,18 @@ static int tp_vec_nr;
> */
> static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries)
> {
> + struct bp_patching_desc desc = {
> + .vec = tp,
> + .nr_entries = nr_entries,
> + .refs = ATOMIC_INIT(1),
> + };
> unsigned char int3 = INT3_INSN_OPCODE;
> unsigned int i;
> int do_sync;
>
> lockdep_assert_held(&text_mutex);
>
> - bp_patching.vec = tp;
> - bp_patching.nr_entries = nr_entries;
> + smp_store_release(&bp_desc, &desc); /* rcu_assign_pointer */
>
> /*
> * Corresponding read barrier in int3 notifier for making sure the
> @@ -1131,17 +1158,12 @@ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries
> text_poke_sync();
>
> /*
> - * sync_core() implies an smp_mb() and orders this store against
> - * the writing of the new instruction.
> + * Remove and synchronize_rcu(), except we have a very primitive
> + * refcount based completion.
> */
> - bp_patching.nr_entries = 0;
> - /*
> - * This sync_core () call ensures that all INT3 handlers in progress
> - * have finished. This allows poke_int3_handler() after this to
> - * avoid touching bp_paching.vec by checking nr_entries == 0.
> - */
> - text_poke_sync();
> - bp_patching.vec = NULL;
> + WRITE_ONCE(bp_desc, NULL); /* RCU_INIT_POINTER */
> + if (!atomic_dec_and_test(&desc.refs))
> + atomic_cond_read_acquire(&desc.refs, !VAL);
> }
>
> void text_poke_loc_init(struct text_poke_loc *tp, void *addr,
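
For anyone following the thread, the get/put/retire scheme in the patch above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions, not the kernel code: it uses C11 <stdatomic.h> with default sequentially consistent ordering instead of the kernel's atomic_t and barrier primitives, and the names struct desc, cur_desc, inc_not_zero(), publish() and retire() are hypothetical stand-ins chosen for illustration.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bp_patching_desc; payload reduced to one int. */
struct desc {
	int nr_entries;
	atomic_int refs;
};

/* Hypothetical stand-in for the global bp_desc pointer. */
static _Atomic(struct desc *) cur_desc;

/* Models atomic_inc_not_zero(): take a reference unless refs is already 0. */
static int inc_not_zero(atomic_int *refs)
{
	int old = atomic_load(refs);

	while (old != 0) {
		/* On failure, 'old' is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(refs, &old, old + 1))
			return 1;
	}
	return 0;
}

/* Reader side: returns the descriptor with a reference held, or NULL. */
static struct desc *try_get_desc(void)
{
	struct desc *d = atomic_load(&cur_desc);

	if (!d || !inc_not_zero(&d->refs))
		return NULL;
	return d;
}

/* Reader side: drop the reference taken by try_get_desc(). */
static void put_desc(struct desc *d)
{
	atomic_fetch_sub(&d->refs, 1);
}

/* Writer side: publish the descriptor; the writer holds the initial reference. */
static void publish(struct desc *d, int nr_entries)
{
	d->nr_entries = nr_entries;
	atomic_init(&d->refs, 1);
	atomic_store(&cur_desc, d);
}

/* Writer side: unpublish, drop the initial reference, wait out the readers. */
static void retire(struct desc *d)
{
	atomic_store(&cur_desc, NULL);
	if (atomic_fetch_sub(&d->refs, 1) != 1) {
		/* Plain spin here; the patch uses atomic_cond_read_acquire(). */
		while (atomic_load(&d->refs) != 0)
			;
	}
}
```

The point of starting refs at 1 is that the count can only reach zero after the writer has dropped its own reference in retire(). A reader that loaded the pointer before it was cleared but has not yet taken a reference then sees refs == 0 in inc_not_zero() and backs off instead of using the descriptor.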
--
Masami Hiramatsu <mhiramat@kernel.org>