* [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot
@ 2022-09-07  2:33 Liao Chang
  2022-09-07 17:21 ` Jisheng Zhang
  0 siblings, 1 reply; 7+ messages in thread
From: Liao Chang @ 2022-09-07  2:33 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, mhiramat, rostedt, liaochang1
  Cc: linux-riscv, linux-kernel

Since no race condition occurs on each instruction slot, hence it is
safe to patch instruction slot without stopping machine.

Signed-off-by: Liao Chang <liaochang1@huawei.com>
---
 arch/riscv/kernel/probes/kprobes.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
index e6e950b7cf32..eff7d7fab535 100644
--- a/arch/riscv/kernel/probes/kprobes.c
+++ b/arch/riscv/kernel/probes/kprobes.c
@@ -24,12 +24,14 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
 static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
 {
 	unsigned long offset = GET_INSN_LENGTH(p->opcode);
+	const kprobe_opcode_t brk_insn = __BUG_INSN_32;
+	kprobe_opcode_t slot[MAX_INSN_SIZE];
 
 	p->ainsn.api.restore = (unsigned long)p->addr + offset;
 
-	patch_text(p->ainsn.api.insn, p->opcode);
-	patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
-		   __BUG_INSN_32);
+	memcpy(slot, &p->opcode, offset);
+	memcpy((void *)((unsigned long)slot + offset), &brk_insn, 4);
+	patch_text_nosync(p->ainsn.api.insn, slot, offset + 4);
 }
 
 static void __kprobes arch_prepare_simulate(struct kprobe *p)
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread
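
The idea in the patch above (assemble the probed instruction and the trailing
ebreak in a local buffer, then write the slot once without stopping the
machine) can be modelled in user space. This is only a rough sketch under
stated assumptions, not kernel code: every sketch_* name is hypothetical,
sketch_patch_nosync() is a plain memcpy standing in for patch_text_nosync(),
and the 8-byte slot size is an assumption chosen to hold one 4-byte
instruction plus a 4-byte ebreak.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel definitions used in the patch. */
#define SKETCH_EBREAK_32  0x00100073u /* 32-bit ebreak encoding */
#define SKETCH_SLOT_BYTES 8           /* assumed: 4-byte insn + 4-byte ebreak */

/* Counts how many times the slot memory is written. */
static size_t sketch_write_count;

/* RISC-V: low two bits 0b11 mean a 32-bit encoding, otherwise 16-bit. */
static size_t sketch_insn_len(uint32_t insn)
{
	return ((insn & 0x3u) == 0x3u) ? 4 : 2;
}

/* Stand-in for patch_text_nosync(): one plain copy, no stop_machine(). */
static void sketch_patch_nosync(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
	sketch_write_count++;
}

/* Build "probed insn + ebreak" in a staging buffer, then write it once. */
static size_t sketch_prepare_ss_slot(uint8_t *slot_mem, uint32_t opcode)
{
	const uint32_t brk = SKETCH_EBREAK_32;
	uint8_t staging[SKETCH_SLOT_BYTES];
	size_t len = sketch_insn_len(opcode);

	memcpy(staging, &opcode, len);            /* copy of the probed insn */
	memcpy(staging + len, &brk, sizeof(brk)); /* breakpoint right after it */
	sketch_patch_nosync(slot_mem, staging, len + sizeof(brk));
	return len + sizeof(brk);
}
```

Calling sketch_prepare_ss_slot(slot, 0x00000013) on an 8-byte array writes the
slot once and returns 8; a compressed opcode such as 0x0001 yields 6. The
original code issued two separate patch_text() calls for the same result.
Copying only the low 2 bytes of a compressed opcode assumes a little-endian
host, as RISC-V is.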
* Re: [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot
  2022-09-07  2:33 [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot Liao Chang
@ 2022-09-07 17:21 ` Jisheng Zhang
  2022-09-07 22:28   ` Masami Hiramatsu
  2022-09-08  1:43   ` liaochang (A)
  0 siblings, 2 replies; 7+ messages in thread
From: Jisheng Zhang @ 2022-09-07 17:21 UTC (permalink / raw)
  To: Liao Chang
  Cc: paul.walmsley, palmer, aou, mhiramat, rostedt, linux-riscv, linux-kernel

On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
> Since no race condition occurs on each instruction slot, hence it is
> safe to patch instruction slot without stopping machine.

hmm, IMHO there's a race when arming a kprobe under SMP, so stopping
the machine is necessary here. Maybe I misunderstand something.

>
> Signed-off-by: Liao Chang <liaochang1@huawei.com>
> ---
>  arch/riscv/kernel/probes/kprobes.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
> index e6e950b7cf32..eff7d7fab535 100644
> --- a/arch/riscv/kernel/probes/kprobes.c
> +++ b/arch/riscv/kernel/probes/kprobes.c
> @@ -24,12 +24,14 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
>  static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
>  {
>  	unsigned long offset = GET_INSN_LENGTH(p->opcode);
> +	const kprobe_opcode_t brk_insn = __BUG_INSN_32;
> +	kprobe_opcode_t slot[MAX_INSN_SIZE];
>
>  	p->ainsn.api.restore = (unsigned long)p->addr + offset;
>
> -	patch_text(p->ainsn.api.insn, p->opcode);
> -	patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
> -		   __BUG_INSN_32);
> +	memcpy(slot, &p->opcode, offset);
> +	memcpy((void *)((unsigned long)slot + offset), &brk_insn, 4);
> +	patch_text_nosync(p->ainsn.api.insn, slot, offset + 4);
>  }
>
>  static void __kprobes arch_prepare_simulate(struct kprobe *p)
> --
> 2.17.1
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot
  2022-09-07 17:21 ` Jisheng Zhang
@ 2022-09-07 22:28   ` Masami Hiramatsu
  2022-09-08  1:43   ` liaochang (A)
  1 sibling, 0 replies; 7+ messages in thread
From: Masami Hiramatsu @ 2022-09-07 22:28 UTC (permalink / raw)
  To: Jisheng Zhang
  Cc: Liao Chang, paul.walmsley, palmer, aou, mhiramat, rostedt,
	linux-riscv, linux-kernel

On Thu, 8 Sep 2022 01:21:27 +0800
Jisheng Zhang <jszhang@kernel.org> wrote:

> On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
> > Since no race condition occurs on each instruction slot, hence it is
> > safe to patch instruction slot without stopping machine.
>
> hmm, IMHO there's a race when arming a kprobe under SMP, so stopping
> the machine is necessary here. Maybe I misunderstand something.

Yeah, usually self-modifying code needs to stop the other CPUs at some
known points so that they do not execute the instruction which will be
modified. Even if a chip ensures that, is that safe for other
implementations? (Does the RISC-V specification guarantee this behavior?)

Thank you,

>
> >
> > Signed-off-by: Liao Chang <liaochang1@huawei.com>
> > ---
> >  arch/riscv/kernel/probes/kprobes.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
> > index e6e950b7cf32..eff7d7fab535 100644
> > --- a/arch/riscv/kernel/probes/kprobes.c
> > +++ b/arch/riscv/kernel/probes/kprobes.c
> > @@ -24,12 +24,14 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
> >  static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
> >  {
> >  	unsigned long offset = GET_INSN_LENGTH(p->opcode);
> > +	const kprobe_opcode_t brk_insn = __BUG_INSN_32;
> > +	kprobe_opcode_t slot[MAX_INSN_SIZE];
> >
> >  	p->ainsn.api.restore = (unsigned long)p->addr + offset;
> >
> > -	patch_text(p->ainsn.api.insn, p->opcode);
> > -	patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
> > -		   __BUG_INSN_32);
> > +	memcpy(slot, &p->opcode, offset);
> > +	memcpy((void *)((unsigned long)slot + offset), &brk_insn, 4);
> > +	patch_text_nosync(p->ainsn.api.insn, slot, offset + 4);
> >  }
> >
> >  static void __kprobes arch_prepare_simulate(struct kprobe *p)
> > --
> > 2.17.1
> >
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv

-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot
  2022-09-07 17:21 ` Jisheng Zhang
  2022-09-07 22:28   ` Masami Hiramatsu
@ 2022-09-08  1:43   ` liaochang (A)
  2022-09-08 12:49     ` Masami Hiramatsu
  1 sibling, 1 reply; 7+ messages in thread
From: liaochang (A) @ 2022-09-08  1:43 UTC (permalink / raw)
  To: Jisheng Zhang
  Cc: paul.walmsley, palmer, aou, mhiramat, rostedt, linux-riscv, linux-kernel

Thanks for the comment.

On 2022/9/8 1:21, Jisheng Zhang wrote:
> On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
>> Since no race condition occurs on each instruction slot, hence it is
>> safe to patch instruction slot without stopping machine.
>
> hmm, IMHO there's a race when arming a kprobe under SMP, so stopping
> the machine is necessary here. Maybe I misunderstand something.
>

It is indeed necessary to stop the machine when arming a kprobe under SMP,
but I don't think it is necessary to stop the machine when preparing the
instruction slot, for two reasons:

1. The instruction slot is dynamically allocated data.
2. The kernel does not execute the instruction slot until the original
   instruction is replaced by a breakpoint.

>>
>> Signed-off-by: Liao Chang <liaochang1@huawei.com>
>> ---
>>  arch/riscv/kernel/probes/kprobes.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
>> index e6e950b7cf32..eff7d7fab535 100644
>> --- a/arch/riscv/kernel/probes/kprobes.c
>> +++ b/arch/riscv/kernel/probes/kprobes.c
>> @@ -24,12 +24,14 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
>>  static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
>>  {
>>  	unsigned long offset = GET_INSN_LENGTH(p->opcode);
>> +	const kprobe_opcode_t brk_insn = __BUG_INSN_32;
>> +	kprobe_opcode_t slot[MAX_INSN_SIZE];
>>
>>  	p->ainsn.api.restore = (unsigned long)p->addr + offset;
>>
>> -	patch_text(p->ainsn.api.insn, p->opcode);
>> -	patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
>> -		   __BUG_INSN_32);
>> +	memcpy(slot, &p->opcode, offset);
>> +	memcpy((void *)((unsigned long)slot + offset), &brk_insn, 4);
>> +	patch_text_nosync(p->ainsn.api.insn, slot, offset + 4);
>>  }
>>
>>  static void __kprobes arch_prepare_simulate(struct kprobe *p)
>> --
>> 2.17.1
>>
>>
>> _______________________________________________
>> linux-riscv mailing list
>> linux-riscv@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-riscv
> .

-- 
BR,
Liao, Chang

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot
  2022-09-08  1:43   ` liaochang (A)
@ 2022-09-08 12:49     ` Masami Hiramatsu
  2022-09-09  1:55       ` liaochang (A)
  0 siblings, 1 reply; 7+ messages in thread
From: Masami Hiramatsu @ 2022-09-08 12:49 UTC (permalink / raw)
  To: liaochang (A)
  Cc: Jisheng Zhang, paul.walmsley, palmer, aou, mhiramat, rostedt,
	linux-riscv, linux-kernel

On Thu, 8 Sep 2022 09:43:45 +0800
"liaochang (A)" <liaochang1@huawei.com> wrote:

> Thanks for the comment.
>
> On 2022/9/8 1:21, Jisheng Zhang wrote:
> > On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
> >> Since no race condition occurs on each instruction slot, hence it is
> >> safe to patch instruction slot without stopping machine.
> >
> > hmm, IMHO there's a race when arming a kprobe under SMP, so stopping
> > the machine is necessary here. Maybe I misunderstand something.
> >
>
> It is indeed necessary to stop the machine when arming a kprobe under SMP,
> but I don't think it is necessary to stop the machine when preparing the
> instruction slot, for two reasons:
>
> 1. The instruction slot is dynamically allocated data.
> 2. The kernel does not execute the instruction slot until the original
>    instruction is replaced by a breakpoint.

Ah, this is for the ss (single-step out-of-line) slot. So until the
kprobe is enabled, this should not be used from other cores.
OK, then it should be safe.

> >>
> >> Signed-off-by: Liao Chang <liaochang1@huawei.com>
> >> ---
> >>  arch/riscv/kernel/probes/kprobes.c | 8 +++++---
> >>  1 file changed, 5 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
> >> index e6e950b7cf32..eff7d7fab535 100644
> >> --- a/arch/riscv/kernel/probes/kprobes.c
> >> +++ b/arch/riscv/kernel/probes/kprobes.c
> >> @@ -24,12 +24,14 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
> >>  static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
> >>  {
> >>  	unsigned long offset = GET_INSN_LENGTH(p->opcode);
> >> +	const kprobe_opcode_t brk_insn = __BUG_INSN_32;
> >> +	kprobe_opcode_t slot[MAX_INSN_SIZE];
> >>
> >>  	p->ainsn.api.restore = (unsigned long)p->addr + offset;
> >>
> >> -	patch_text(p->ainsn.api.insn, p->opcode);
> >> -	patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
> >> -		   __BUG_INSN_32);
> >> +	memcpy(slot, &p->opcode, offset);
> >> +	memcpy((void *)((unsigned long)slot + offset), &brk_insn, 4);
> >> +	patch_text_nosync(p->ainsn.api.insn, slot, offset + 4);

BTW, didn't you have a macro for the size of __BUG_INSN_32?

Thank you,

> >>  }
> >>
> >>  static void __kprobes arch_prepare_simulate(struct kprobe *p)
> >> --
> >> 2.17.1
> >>
> >>
> >> _______________________________________________
> >> linux-riscv mailing list
> >> linux-riscv@lists.infradead.org
> >> http://lists.infradead.org/mailman/listinfo/linux-riscv
> > .
>
> --
> BR,
> Liao, Chang

-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot
  2022-09-08 12:49     ` Masami Hiramatsu
@ 2022-09-09  1:55       ` liaochang (A)
  2022-09-10  2:24         ` Masami Hiramatsu
  0 siblings, 1 reply; 7+ messages in thread
From: liaochang (A) @ 2022-09-09  1:55 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Jisheng Zhang, paul.walmsley, palmer, aou, rostedt, linux-riscv,
	linux-kernel

On 2022/9/8 20:49, Masami Hiramatsu (Google) wrote:
> On Thu, 8 Sep 2022 09:43:45 +0800
> "liaochang (A)" <liaochang1@huawei.com> wrote:
>
>> Thanks for the comment.
>>
>> On 2022/9/8 1:21, Jisheng Zhang wrote:
>>> On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
>>>> Since no race condition occurs on each instruction slot, hence it is
>>>> safe to patch instruction slot without stopping machine.
>>>
>>> hmm, IMHO there's a race when arming a kprobe under SMP, so stopping
>>> the machine is necessary here. Maybe I misunderstand something.
>>>
>>
>> It is indeed necessary to stop the machine when arming a kprobe under SMP,
>> but I don't think it is necessary to stop the machine when preparing the
>> instruction slot, for two reasons:
>>
>> 1. The instruction slot is dynamically allocated data.
>> 2. The kernel does not execute the instruction slot until the original
>>    instruction is replaced by a breakpoint.
>
> Ah, this is for the ss (single-step out-of-line) slot. So until the
> kprobe is enabled, this should not be used from other cores.
> OK, then it should be safe.

Exactly, Masami. I also found that this optimization could be applied to
some other architectures, such as arm64 and csky. Do you think it is a good
time to do them all?

Thanks.

>
>
>>>>
>>>> Signed-off-by: Liao Chang <liaochang1@huawei.com>
>>>> ---
>>>>  arch/riscv/kernel/probes/kprobes.c | 8 +++++---
>>>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
>>>> index e6e950b7cf32..eff7d7fab535 100644
>>>> --- a/arch/riscv/kernel/probes/kprobes.c
>>>> +++ b/arch/riscv/kernel/probes/kprobes.c
>>>> @@ -24,12 +24,14 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
>>>>  static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
>>>>  {
>>>>  	unsigned long offset = GET_INSN_LENGTH(p->opcode);
>>>> +	const kprobe_opcode_t brk_insn = __BUG_INSN_32;
>>>> +	kprobe_opcode_t slot[MAX_INSN_SIZE];
>>>>
>>>>  	p->ainsn.api.restore = (unsigned long)p->addr + offset;
>>>>
>>>> -	patch_text(p->ainsn.api.insn, p->opcode);
>>>> -	patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
>>>> -		   __BUG_INSN_32);
>>>> +	memcpy(slot, &p->opcode, offset);
>>>> +	memcpy((void *)((unsigned long)slot + offset), &brk_insn, 4);
>>>> +	patch_text_nosync(p->ainsn.api.insn, slot, offset + 4);
>
> BTW, didn't you have a macro for the size of __BUG_INSN_32?
>
> Thank you,

I think you mean GET_INSN_LENGTH. I will use it to calculate the size of
__BUG_INSN_32 in v2, instead of the magic number '4'.

Thanks.

>
>
>>>>  }
>>>>
>>>>  static void __kprobes arch_prepare_simulate(struct kprobe *p)
>>>> --
>>>> 2.17.1
>>>>
>>>>
>>>> _______________________________________________
>>>> linux-riscv mailing list
>>>> linux-riscv@lists.infradead.org
>>>> http://lists.infradead.org/mailman/listinfo/linux-riscv
>>> .
>>
>> --
>> BR,
>> Liao, Chang
>
>

-- 
BR,
Liao, Chang

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot
  2022-09-09  1:55       ` liaochang (A)
@ 2022-09-10  2:24         ` Masami Hiramatsu
  0 siblings, 0 replies; 7+ messages in thread
From: Masami Hiramatsu @ 2022-09-10  2:24 UTC (permalink / raw)
  To: liaochang (A)
  Cc: Jisheng Zhang, paul.walmsley, palmer, aou, rostedt, linux-riscv,
	linux-kernel

On Fri, 9 Sep 2022 09:55:08 +0800
"liaochang (A)" <liaochang1@huawei.com> wrote:

>
>
> On 2022/9/8 20:49, Masami Hiramatsu (Google) wrote:
> > On Thu, 8 Sep 2022 09:43:45 +0800
> > "liaochang (A)" <liaochang1@huawei.com> wrote:
> >
> >> Thanks for the comment.
> >>
> >> On 2022/9/8 1:21, Jisheng Zhang wrote:
> >>> On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
> >>>> Since no race condition occurs on each instruction slot, hence it is
> >>>> safe to patch instruction slot without stopping machine.
> >>>
> >>> hmm, IMHO there's a race when arming a kprobe under SMP, so stopping
> >>> the machine is necessary here. Maybe I misunderstand something.
> >>>
> >>
> >> It is indeed necessary to stop the machine when arming a kprobe under SMP,
> >> but I don't think it is necessary to stop the machine when preparing the
> >> instruction slot, for two reasons:
> >>
> >> 1. The instruction slot is dynamically allocated data.
> >> 2. The kernel does not execute the instruction slot until the original
> >>    instruction is replaced by a breakpoint.
> >
> > Ah, this is for the ss (single-step out-of-line) slot. So until the
> > kprobe is enabled, this should not be used from other cores.
> > OK, then it should be safe.
>
> Exactly, Masami. I also found that this optimization could be applied to
> some other architectures, such as arm64 and csky. Do you think it is a good
> time to do them all?

Yes, we should reduce the stop_machine() usage. Thanks for pointing it out!

>
> Thanks.
>
> >
> >
> >>>>
> >>>> Signed-off-by: Liao Chang <liaochang1@huawei.com>
> >>>> ---
> >>>>  arch/riscv/kernel/probes/kprobes.c | 8 +++++---
> >>>>  1 file changed, 5 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
> >>>> index e6e950b7cf32..eff7d7fab535 100644
> >>>> --- a/arch/riscv/kernel/probes/kprobes.c
> >>>> +++ b/arch/riscv/kernel/probes/kprobes.c
> >>>> @@ -24,12 +24,14 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
> >>>>  static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
> >>>>  {
> >>>>  	unsigned long offset = GET_INSN_LENGTH(p->opcode);
> >>>> +	const kprobe_opcode_t brk_insn = __BUG_INSN_32;
> >>>> +	kprobe_opcode_t slot[MAX_INSN_SIZE];
> >>>>
> >>>>  	p->ainsn.api.restore = (unsigned long)p->addr + offset;
> >>>>
> >>>> -	patch_text(p->ainsn.api.insn, p->opcode);
> >>>> -	patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
> >>>> -		   __BUG_INSN_32);
> >>>> +	memcpy(slot, &p->opcode, offset);
> >>>> +	memcpy((void *)((unsigned long)slot + offset), &brk_insn, 4);
> >>>> +	patch_text_nosync(p->ainsn.api.insn, slot, offset + 4);
> >
> > BTW, didn't you have a macro for the size of __BUG_INSN_32?
> >
> > Thank you,
>
> I think you mean GET_INSN_LENGTH. I will use it to calculate the size of
> __BUG_INSN_32 in v2, instead of the magic number '4'.

Yeah, that's better. Thank you!

>
> Thanks.
>
> >
> >
> >>>>  }
> >>>>
> >>>>  static void __kprobes arch_prepare_simulate(struct kprobe *p)
> >>>> --
> >>>> 2.17.1
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> linux-riscv mailing list
> >>>> linux-riscv@lists.infradead.org
> >>>> http://lists.infradead.org/mailman/listinfo/linux-riscv
> >>> .
> >>
> >> --
> >> BR,
> >> Liao, Chang
> >
> >
>
> --
> BR,
> Liao, Chang

-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 7+ messages in thread
end of thread, other threads:[~2022-09-10  2:24 UTC | newest]

Thread overview: 7+ messages
2022-09-07  2:33 [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot Liao Chang
2022-09-07 17:21 ` Jisheng Zhang
2022-09-07 22:28   ` Masami Hiramatsu
2022-09-08  1:43   ` liaochang (A)
2022-09-08 12:49     ` Masami Hiramatsu
2022-09-09  1:55       ` liaochang (A)
2022-09-10  2:24         ` Masami Hiramatsu
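
The follow-up agreed on in the thread (derive the breakpoint size from
GET_INSN_LENGTH rather than hard-coding 4) might look roughly like the
user-space sketch below. It is not the v2 patch. The v2_* names are
hypothetical, and v2_insn_length() is only assumed to mirror what
GET_INSN_LENGTH does, namely treat an encoding whose low two bits are 0b11 as
4 bytes and anything else as 2 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; values are the RISC-V ebreak / c.ebreak encodings. */
#define V2_BUG_INSN_32 0x00100073u /* ebreak */
#define V2_BUG_INSN_16 0x9002u     /* c.ebreak */

/* Assumed equivalent of GET_INSN_LENGTH(): decide by the low two bits. */
static size_t v2_insn_length(uint32_t insn)
{
	return ((insn & 0x3u) == 0x3u) ? 4 : 2;
}

/* Bytes to write into the slot: probed insn plus the trailing ebreak. */
static size_t v2_slot_patch_len(uint32_t opcode)
{
	/* No magic '4': the breakpoint size comes from its own encoding. */
	return v2_insn_length(opcode) + v2_insn_length(V2_BUG_INSN_32);
}
```

With this shape, switching the slot terminator to a compressed breakpoint
would only require changing the constant passed to v2_insn_length(), not a
separate size literal.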