* [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot
@ 2022-09-07 2:33 Liao Chang
2022-09-07 17:21 ` Jisheng Zhang
0 siblings, 1 reply; 7+ messages in thread
From: Liao Chang @ 2022-09-07 2:33 UTC (permalink / raw)
To: paul.walmsley, palmer, aou, mhiramat, rostedt, liaochang1
Cc: linux-riscv, linux-kernel
Since no race condition can occur on an instruction slot, it is
safe to patch the slot without stopping the machine.
Signed-off-by: Liao Chang <liaochang1@huawei.com>
---
arch/riscv/kernel/probes/kprobes.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
index e6e950b7cf32..eff7d7fab535 100644
--- a/arch/riscv/kernel/probes/kprobes.c
+++ b/arch/riscv/kernel/probes/kprobes.c
@@ -24,12 +24,14 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
{
unsigned long offset = GET_INSN_LENGTH(p->opcode);
+ const kprobe_opcode_t brk_insn = __BUG_INSN_32;
+ kprobe_opcode_t slot[MAX_INSN_SIZE];
p->ainsn.api.restore = (unsigned long)p->addr + offset;
- patch_text(p->ainsn.api.insn, p->opcode);
- patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
- __BUG_INSN_32);
+ memcpy(slot, &p->opcode, offset);
+ memcpy((void *)((unsigned long)slot + offset), &brk_insn, 4);
+ patch_text_nosync(p->ainsn.api.insn, slot, offset + 4);
}
static void __kprobes arch_prepare_simulate(struct kprobe *p)
--
2.17.1
* Re: [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot
2022-09-07 2:33 [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot Liao Chang
@ 2022-09-07 17:21 ` Jisheng Zhang
2022-09-07 22:28 ` Masami Hiramatsu
2022-09-08 1:43 ` liaochang (A)
0 siblings, 2 replies; 7+ messages in thread
From: Jisheng Zhang @ 2022-09-07 17:21 UTC (permalink / raw)
To: Liao Chang
Cc: paul.walmsley, palmer, aou, mhiramat, rostedt, linux-riscv,
linux-kernel
On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
> Since no race condition occurs on each instruction slot, hence it is
> safe to patch instruction slot without stopping machine.
Hmm, IMHO there's a race when arming a kprobe under SMP, so stopping
the machine is necessary here. Maybe I misunderstand something.
* Re: [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot
2022-09-07 17:21 ` Jisheng Zhang
@ 2022-09-07 22:28 ` Masami Hiramatsu
2022-09-08 1:43 ` liaochang (A)
1 sibling, 0 replies; 7+ messages in thread
From: Masami Hiramatsu @ 2022-09-07 22:28 UTC (permalink / raw)
To: Jisheng Zhang
Cc: Liao Chang, paul.walmsley, palmer, aou, mhiramat, rostedt,
linux-riscv, linux-kernel
On Thu, 8 Sep 2022 01:21:27 +0800
Jisheng Zhang <jszhang@kernel.org> wrote:
> On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
> > Since no race condition occurs on each instruction slot, hence it is
> > safe to patch instruction slot without stopping machine.
>
> hmm, IMHO there's race when arming kprobe under SMP, so stopping
> machine is necessary here. Maybe I misunderstand something.
Yeah, self-modifying code usually needs to stop the other CPUs at some
known points so that they do not execute the instruction that is being
modified.
Even if one chip ensures that, is it safe for other implementations?
(Does the RISC-V specification guarantee this behavior?)
Thank you,
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
* Re: [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot
2022-09-07 17:21 ` Jisheng Zhang
2022-09-07 22:28 ` Masami Hiramatsu
@ 2022-09-08 1:43 ` liaochang (A)
2022-09-08 12:49 ` Masami Hiramatsu
1 sibling, 1 reply; 7+ messages in thread
From: liaochang (A) @ 2022-09-08 1:43 UTC (permalink / raw)
To: Jisheng Zhang
Cc: paul.walmsley, palmer, aou, mhiramat, rostedt, linux-riscv,
linux-kernel
Thanks for the comment.
On 2022/9/8 1:21, Jisheng Zhang wrote:
> On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
>> Since no race condition occurs on each instruction slot, hence it is
>> safe to patch instruction slot without stopping machine.
>
> hmm, IMHO there's race when arming kprobe under SMP, so stopping
> machine is necessary here. Maybe I misunderstand something.
>
It is indeed necessary to stop the machine when arming a kprobe under SMP,
but I don't think it is needed when preparing the instruction slot, for
two reasons:
1. The instruction slot is dynamically allocated data.
2. The kernel will not execute the instruction slot until the original
instruction is replaced by a breakpoint.
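For illustration, the slot layout that the patch builds can be sketched in
userspace C. This is only a sketch under assumptions: insn_length(),
build_ss_slot(), EBREAK_32 and SLOT_BYTES are hypothetical stand-ins for
GET_INSN_LENGTH(), arch_prepare_ss_slot(), __BUG_INSN_32 and the kernel's slot
size, and the final patch_text_nosync() call is not modelled. It stages the
probed instruction and the trailing breakpoint in a private buffer, so that the
slot could then be written in a single pass:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EBREAK_32  0x00100073u /* RISC-V ebreak encoding */
#define SLOT_BYTES 8           /* assumed capacity: one 32-bit insn + ebreak */

/* Length in bytes of a RISC-V instruction, decided by its two low bits. */
size_t insn_length(uint32_t insn)
{
	return (insn & 0x3u) == 0x3u ? 4 : 2;
}

/*
 * Stage the probed instruction followed by ebreak in a private buffer.
 * Returns the number of bytes that would be handed to the patching routine.
 * Copying the low bytes of 'opcode' assumes a little-endian host.
 */
size_t build_ss_slot(uint8_t slot[SLOT_BYTES], uint32_t opcode)
{
	const uint32_t brk = EBREAK_32;
	size_t offset = insn_length(opcode);

	assert(offset + sizeof(brk) <= SLOT_BYTES);
	memcpy(slot, &opcode, offset);
	memcpy(slot + offset, &brk, sizeof(brk));
	return offset + sizeof(brk);
}
```

With these assumptions the staged image is 8 bytes for a 32-bit instruction
and 6 bytes for a compressed one.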
--
BR,
Liao, Chang
* Re: [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot
2022-09-08 1:43 ` liaochang (A)
@ 2022-09-08 12:49 ` Masami Hiramatsu
2022-09-09 1:55 ` liaochang (A)
0 siblings, 1 reply; 7+ messages in thread
From: Masami Hiramatsu @ 2022-09-08 12:49 UTC (permalink / raw)
To: liaochang (A)
Cc: Jisheng Zhang, paul.walmsley, palmer, aou, mhiramat, rostedt,
linux-riscv, linux-kernel
On Thu, 8 Sep 2022 09:43:45 +0800
"liaochang (A)" <liaochang1@huawei.com> wrote:
> Thanks for comment.
>
> On 2022/9/8 1:21, Jisheng Zhang wrote:
> > On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
> >> Since no race condition occurs on each instruction slot, hence it is
> >> safe to patch instruction slot without stopping machine.
> >
> > hmm, IMHO there's race when arming kprobe under SMP, so stopping
> > machine is necessary here. Maybe I misunderstand something.
> >
>
> It is indeed necessary to stop machine when arm kprobe under SMP,
> but i don't think it need to stop machine when prepare instruction slot,
> two reasons:
>
> 1. Instruction slot is dynamically allocated data.
> 2. Kernel would not execute instruction slot until original instruction
> is replaced by breakpoint.
Ah, this is for the ss (single-step out-of-line) slot. So until the
kprobe is enabled, the slot should not be used from other cores.
OK, then it should be safe.
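The ordering argument can be shown with a rough userspace analogy in C11. This
is not how the kernel does it: struct probe_sketch, prepare_slot(), arm_probe()
and slot_if_armed() are hypothetical names, the release/acquire flag merely
stands in for installing the breakpoint, and instruction-cache synchronization
is not modelled at all. The point is only that the slot may be written without
any synchronization as long as nothing can reach it before the probe is armed:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct probe_sketch {
	uint8_t slot[8];   /* private to the writer until 'armed' is set */
	atomic_bool armed; /* stands in for the breakpoint at the probe address */
};

/* Plain, unsynchronized write: no other thread looks at the slot yet. */
void prepare_slot(struct probe_sketch *p, const uint8_t *bytes, size_t len)
{
	assert(len <= sizeof(p->slot));
	memcpy(p->slot, bytes, len);
}

/* Publishing step: the release store orders the slot contents before it. */
void arm_probe(struct probe_sketch *p)
{
	atomic_store_explicit(&p->armed, true, memory_order_release);
}

/* Other threads only touch the slot after observing the armed flag. */
const uint8_t *slot_if_armed(struct probe_sketch *p)
{
	if (!atomic_load_explicit(&p->armed, memory_order_acquire))
		return NULL;
	return p->slot;
}
```

In the kernel the publishing step is the arming of the kprobe, which still
goes through the synchronized patch_text() path.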
> >>
> >> Signed-off-by: Liao Chang <liaochang1@huawei.com>
> >> ---
> >> arch/riscv/kernel/probes/kprobes.c | 8 +++++---
> >> 1 file changed, 5 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
> >> index e6e950b7cf32..eff7d7fab535 100644
> >> --- a/arch/riscv/kernel/probes/kprobes.c
> >> +++ b/arch/riscv/kernel/probes/kprobes.c
> >> @@ -24,12 +24,14 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
> >> static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
> >> {
> >> unsigned long offset = GET_INSN_LENGTH(p->opcode);
> >> + const kprobe_opcode_t brk_insn = __BUG_INSN_32;
> >> + kprobe_opcode_t slot[MAX_INSN_SIZE];
> >>
> >> p->ainsn.api.restore = (unsigned long)p->addr + offset;
> >>
> >> - patch_text(p->ainsn.api.insn, p->opcode);
> >> - patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
> >> - __BUG_INSN_32);
> >> + memcpy(slot, &p->opcode, offset);
> >> + memcpy((void *)((unsigned long)slot + offset), &brk_insn, 4);
> >> + patch_text_nosync(p->ainsn.api.insn, slot, offset + 4);
BTW, didn't you have a macro for the size of __BUG_INSN_32?
Thank you,
> >> }
> >>
> >> static void __kprobes arch_prepare_simulate(struct kprobe *p)
> >> --
> >> 2.17.1
> >>
> >>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
* Re: [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot
2022-09-08 12:49 ` Masami Hiramatsu
@ 2022-09-09 1:55 ` liaochang (A)
2022-09-10 2:24 ` Masami Hiramatsu
0 siblings, 1 reply; 7+ messages in thread
From: liaochang (A) @ 2022-09-09 1:55 UTC (permalink / raw)
To: Masami Hiramatsu (Google)
Cc: Jisheng Zhang, paul.walmsley, palmer, aou, rostedt, linux-riscv,
linux-kernel
On 2022/9/8 20:49, Masami Hiramatsu (Google) wrote:
> On Thu, 8 Sep 2022 09:43:45 +0800
> "liaochang (A)" <liaochang1@huawei.com> wrote:
>
>> Thanks for comment.
>>
>> On 2022/9/8 1:21, Jisheng Zhang wrote:
>>> On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
>>>> Since no race condition occurs on each instruction slot, hence it is
>>>> safe to patch instruction slot without stopping machine.
>>>
>>> hmm, IMHO there's race when arming kprobe under SMP, so stopping
>>> machine is necessary here. Maybe I misunderstand something.
>>>
>>
>> It is indeed necessary to stop machine when arm kprobe under SMP,
>> but i don't think it need to stop machine when prepare instruction slot,
>> two reasons:
>>
>> 1. Instruction slot is dynamically allocated data.
>> 2. Kernel would not execute instruction slot until original instruction
>> is replaced by breakpoint.
>
> Ah, this is for ss (single step out of line) slot. So until
> kprobe is enabled, this should not be used from other cores.
> OK, then it should be safe.
Exactly, Masami. I also found that this optimization could be applied to some
other architectures, such as arm64 and csky. Do you think it is a good time to
do them all?
Thanks.
>
>
>>>>
>>>> Signed-off-by: Liao Chang <liaochang1@huawei.com>
>>>> ---
>>>> arch/riscv/kernel/probes/kprobes.c | 8 +++++---
>>>> 1 file changed, 5 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
>>>> index e6e950b7cf32..eff7d7fab535 100644
>>>> --- a/arch/riscv/kernel/probes/kprobes.c
>>>> +++ b/arch/riscv/kernel/probes/kprobes.c
>>>> @@ -24,12 +24,14 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
>>>> static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
>>>> {
>>>> unsigned long offset = GET_INSN_LENGTH(p->opcode);
>>>> + const kprobe_opcode_t brk_insn = __BUG_INSN_32;
>>>> + kprobe_opcode_t slot[MAX_INSN_SIZE];
>>>>
>>>> p->ainsn.api.restore = (unsigned long)p->addr + offset;
>>>>
>>>> - patch_text(p->ainsn.api.insn, p->opcode);
>>>> - patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
>>>> - __BUG_INSN_32);
>>>> + memcpy(slot, &p->opcode, offset);
>>>> + memcpy((void *)((unsigned long)slot + offset), &brk_insn, 4);
>>>> + patch_text_nosync(p->ainsn.api.insn, slot, offset + 4);
>
> BTW, didn't you have a macro for the size of __BUG_INSN_32?
>
> Thank you,
I think you mean GET_INSN_LENGTH; I will use it to calculate
the size of __BUG_INSN_32 in v2 instead of the magic number '4'.
Thanks.
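As a rough sketch of that change (userspace only, not the v2 patch): the
SKETCH_* macros below are hypothetical copies that follow the usual RISC-V
rule that an instruction is 32 bits wide when its two low bits are 0b11, and
brk_insn_size() is an illustrative helper, not a kernel function. Deriving the
size from the breakpoint encoding itself keeps it correct if the breakpoint is
ever changed to the compressed form:

```c
#include <assert.h> /* static_assert (C11) */

/* Hypothetical userspace copies of the kernel's length macros. */
#define SKETCH_INSN_LENGTH_MASK 0x3u
#define SKETCH_INSN_LENGTH_32   0x3u
#define SKETCH_GET_INSN_LENGTH(insn) \
	((((insn) & SKETCH_INSN_LENGTH_MASK) == SKETCH_INSN_LENGTH_32) ? 4u : 2u)

#define SKETCH_BUG_INSN_32 0x00100073u /* ebreak */
#define SKETCH_BUG_INSN_16 0x9002u     /* c.ebreak */

/* The length follows from the encoding, so no magic number is needed. */
static_assert(SKETCH_GET_INSN_LENGTH(SKETCH_BUG_INSN_32) == 4,
	      "ebreak is a 32-bit instruction");
static_assert(SKETCH_GET_INSN_LENGTH(SKETCH_BUG_INSN_16) == 2,
	      "c.ebreak is a 16-bit instruction");

/* Size in bytes of the breakpoint appended to the slot. */
unsigned int brk_insn_size(void)
{
	return SKETCH_GET_INSN_LENGTH(SKETCH_BUG_INSN_32);
}
```

The slot image would then be offset + brk_insn_size() bytes long rather than
offset + 4.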
>
>
>>>> }
>>>>
>>>> static void __kprobes arch_prepare_simulate(struct kprobe *p)
>>>> --
>>>> 2.17.1
>>>>
>>>>
--
BR,
Liao, Chang
* Re: [PATCH] riscv/kprobe: Optimize the performance of patching instruction slot
2022-09-09 1:55 ` liaochang (A)
@ 2022-09-10 2:24 ` Masami Hiramatsu
0 siblings, 0 replies; 7+ messages in thread
From: Masami Hiramatsu @ 2022-09-10 2:24 UTC (permalink / raw)
To: liaochang (A)
Cc: Jisheng Zhang, paul.walmsley, palmer, aou, rostedt, linux-riscv,
linux-kernel
On Fri, 9 Sep 2022 09:55:08 +0800
"liaochang (A)" <liaochang1@huawei.com> wrote:
>
>
> On 2022/9/8 20:49, Masami Hiramatsu (Google) wrote:
> > On Thu, 8 Sep 2022 09:43:45 +0800
> > "liaochang (A)" <liaochang1@huawei.com> wrote:
> >
> >> Thanks for comment.
> >>
> >> On 2022/9/8 1:21, Jisheng Zhang wrote:
> >>> On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
> >>>> Since no race condition occurs on each instruction slot, hence it is
> >>>> safe to patch instruction slot without stopping machine.
> >>>
> >>> hmm, IMHO there's race when arming kprobe under SMP, so stopping
> >>> machine is necessary here. Maybe I misunderstand something.
> >>>
> >>
> >> It is indeed necessary to stop machine when arm kprobe under SMP,
> >> but i don't think it need to stop machine when prepare instruction slot,
> >> two reasons:
> >>
> >> 1. Instruction slot is dynamically allocated data.
> >> 2. Kernel would not execute instruction slot until original instruction
> >> is replaced by breakpoint.
> >
> > Ah, this is for ss (single step out of line) slot. So until
> > kprobe is enabled, this should not be used from other cores.
> > OK, then it should be safe.
>
> Exactly, Masami, and i find out this optimization could be applied to some other
> architectures, such as arm64 and csky, do you think it is good time to do them all.
Yes, we should reduce the stop_machine() usage. Thanks for pointing it out!
>
> Thanks.
>
> >
> >
> >>>>
> >>>> Signed-off-by: Liao Chang <liaochang1@huawei.com>
> >>>> ---
> >>>> arch/riscv/kernel/probes/kprobes.c | 8 +++++---
> >>>> 1 file changed, 5 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
> >>>> index e6e950b7cf32..eff7d7fab535 100644
> >>>> --- a/arch/riscv/kernel/probes/kprobes.c
> >>>> +++ b/arch/riscv/kernel/probes/kprobes.c
> >>>> @@ -24,12 +24,14 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
> >>>> static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
> >>>> {
> >>>> unsigned long offset = GET_INSN_LENGTH(p->opcode);
> >>>> + const kprobe_opcode_t brk_insn = __BUG_INSN_32;
> >>>> + kprobe_opcode_t slot[MAX_INSN_SIZE];
> >>>>
> >>>> p->ainsn.api.restore = (unsigned long)p->addr + offset;
> >>>>
> >>>> - patch_text(p->ainsn.api.insn, p->opcode);
> >>>> - patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
> >>>> - __BUG_INSN_32);
> >>>> + memcpy(slot, &p->opcode, offset);
> >>>> + memcpy((void *)((unsigned long)slot + offset), &brk_insn, 4);
> >>>> + patch_text_nosync(p->ainsn.api.insn, slot, offset + 4);
> >
> > BTW, didn't you have a macro for the size of __BUG_INSN_32?
> >
> > Thank you,
>
> I think you are saying GET_INSN_LENGTH, i will use it to calculate
> the size of __BUG_INSN_32 in v2, instead of magic number '4'.
Yeah, that's better.
Thank you!
>
> Thanks.
>
> >
> >
> >>>> }
> >>>>
> >>>> static void __kprobes arch_prepare_simulate(struct kprobe *p)
> >>>> --
> >>>> 2.17.1
> >>>>
> >>>>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>