From: dave.long@linaro.org (David Long)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] arm64: Improve kprobes test for atomic sequence
Date: Thu, 1 Sep 2016 17:27:59 -0400 [thread overview]
Message-ID: <57C89D5F.7070808@linaro.org> (raw)
In-Reply-To: <20160901113810.91a6c6aefcdab761c9fb1ff0@kernel.org>
On 08/31/2016 10:38 PM, Masami Hiramatsu wrote:
> Hi Dave,
>
> On Wed, 31 Aug 2016 16:52:22 -0400
> David Long <dave.long@linaro.org> wrote:
>
>> From: "David A. Long" <dave.long@linaro.org>
>>
>> Kprobes searches backwards a finite number of instructions to determine if
>> there is an attempt to probe a load/store exclusive sequence. It stops when
>> it hits the maximum number of instructions or a load or store exclusive.
>
> Hmm, so on aarch64, we can not put a kprobe between load exclusive and
> store exclusive, because kprobe always breaks the atomicity, am I correct?
Yes.
> If so, what happen if any branch in the sequence? e.g.
>
> load-ex
> (do something)
> l1:
> store-ex
> ...
> load-ex
> (do something)
> branch l1;
>
I'm sure atomic code can be constructed in a way which we don't detect,
and probably can't detect, this is just a "best effort".
>> However this means it can run up past the beginning of the function and
>> start looking at literal constants. This has been shown to cause a false
>> positive and blocks insertion of the probe. To fix this add a test to see
>> if the typical:
>>
>> "stp x29, x30, [sp, #n]!"
>>
>> instruction beginning a function gets hit. This also improves efficiency by
>> not testing code that is not part of the function. There is some
>> possibility that a function will not begin with this instruction, in which
>> case the fixed code will behave no worse than before.
>
> If the function boundary is the problem, why you wouldn't use kallsyms information
> as I did in can_optimize()@arch/x86/kernel/kprobes/opt.c ?
>
> /* Lookup symbol including addr */
> if (!kallsyms_lookup_size_offset(paddr, &size, &offset))
> return 0;
>
> With this call, symbol start address is (paddr - offset) and end address
> is (paddr - offset + size).
>
Thanks for pointing this out. I shall work on a V2 patch ASAP.
> Thank you,
>
>>
>> There could also be the case that the stp instruction is found further in
>> the body of the function, which could theoretically allow probing of an
>> atomic squence. The likelihood of this seems low, and this would not be the
>> only aspect of kprobes where the user needs to be careful to avoid
>> problems.
>>
>> Signed-off-by: David A. Long <dave.long@linaro.org>
>> ---
>> arch/arm64/kernel/probes/decode-insn.c | 25 ++++++++++++++++++-------
>> 1 file changed, 18 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/probes/decode-insn.c b/arch/arm64/kernel/probes/decode-insn.c
>> index 37e47a9..248e820 100644
>> --- a/arch/arm64/kernel/probes/decode-insn.c
>> +++ b/arch/arm64/kernel/probes/decode-insn.c
>> @@ -122,16 +122,28 @@ arm_probe_decode_insn(kprobe_opcode_t insn, struct arch_specific_insn *asi)
>> static bool __kprobes
>> is_probed_address_atomic(kprobe_opcode_t *scan_start, kprobe_opcode_t *scan_end)
>> {
>> + const u32 stp_x29_x30_sp_pre = 0xa9807bfd;
>> + const u32 stp_ignore_index_mask = 0xffc07fff;
>> + u32 instruction = le32_to_cpu(*scan_start);
>> +
>> while (scan_start > scan_end) {
>> /*
>> - * atomic region starts from exclusive load and ends with
>> - * exclusive store.
>> + * Atomic region starts from exclusive load and ends with
>> + * exclusive store. If we hit a "stp x29, x30, [sp, #n]!"
>> + * assume it is the beginning of the function and end the
>> + * search. This helps avoid false positives from literal
>> + * constants that look like a load-exclusive, in addition
>> + * to being more efficient.
>> */
>> - if (aarch64_insn_is_store_ex(le32_to_cpu(*scan_start)))
>> + if ((instruction & stp_ignore_index_mask) == stp_x29_x30_sp_pre)
>> return false;
>> - else if (aarch64_insn_is_load_ex(le32_to_cpu(*scan_start)))
>> - return true;
>> +
>> scan_start--;
>> + instruction = le32_to_cpu(*scan_start);
>> + if (aarch64_insn_is_store_ex(instruction))
>> + return false;
>> + else if (aarch64_insn_is_load_ex(instruction))
>> + return true;
>> }
>>
>> return false;
>> @@ -142,7 +154,6 @@ arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi)
>> {
>> enum kprobe_insn decoded;
>> kprobe_opcode_t insn = le32_to_cpu(*addr);
>> - kprobe_opcode_t *scan_start = addr - 1;
>> kprobe_opcode_t *scan_end = addr - MAX_ATOMIC_CONTEXT_SIZE;
>> #if defined(CONFIG_MODULES) && defined(MODULES_VADDR)
>> struct module *mod;
>> @@ -167,7 +178,7 @@ arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi)
>> decoded = arm_probe_decode_insn(insn, asi);
>>
>> if (decoded == INSN_REJECTED ||
>> - is_probed_address_atomic(scan_start, scan_end))
>> + is_probed_address_atomic(addr, scan_end))
>> return INSN_REJECTED;
>>
>> return decoded;
>> --
>> 2.5.0
>>
>
>
Thanks,
-dl
WARNING: multiple messages have this Message-ID (diff)
From: David Long <dave.long@linaro.org>
To: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>,
Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>,
"David S. Miller" <davem@davemloft.net>,
Will Deacon <will.deacon@arm.com>,
linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, catalin.marinas@arm.com,
Sandeepa Prabhu <sandeepa.s.prabhu@gmail.com>,
William Cohen <wcohen@redhat.com>,
Pratyush Anand <panand@redhat.com>,
Mark Brown <broonie@kernel.org>
Subject: Re: [PATCH] arm64: Improve kprobes test for atomic sequence
Date: Thu, 1 Sep 2016 17:27:59 -0400 [thread overview]
Message-ID: <57C89D5F.7070808@linaro.org> (raw)
In-Reply-To: <20160901113810.91a6c6aefcdab761c9fb1ff0@kernel.org>
On 08/31/2016 10:38 PM, Masami Hiramatsu wrote:
> Hi Dave,
>
> On Wed, 31 Aug 2016 16:52:22 -0400
> David Long <dave.long@linaro.org> wrote:
>
>> From: "David A. Long" <dave.long@linaro.org>
>>
>> Kprobes searches backwards a finite number of instructions to determine if
>> there is an attempt to probe a load/store exclusive sequence. It stops when
>> it hits the maximum number of instructions or a load or store exclusive.
>
> Hmm, so on aarch64, we can not put a kprobe between load exclusive and
> store exclusive, because kprobe always breaks the atomicity, am I correct?
Yes.
> If so, what happen if any branch in the sequence? e.g.
>
> load-ex
> (do something)
> l1:
> store-ex
> ...
> load-ex
> (do something)
> branch l1;
>
I'm sure atomic code can be constructed in a way which we don't detect,
and probably can't detect, this is just a "best effort".
>> However this means it can run up past the beginning of the function and
>> start looking at literal constants. This has been shown to cause a false
>> positive and blocks insertion of the probe. To fix this add a test to see
>> if the typical:
>>
>> "stp x29, x30, [sp, #n]!"
>>
>> instruction beginning a function gets hit. This also improves efficiency by
>> not testing code that is not part of the function. There is some
>> possibility that a function will not begin with this instruction, in which
>> case the fixed code will behave no worse than before.
>
> If the function boundary is the problem, why you wouldn't use kallsyms information
> as I did in can_optimize()@arch/x86/kernel/kprobes/opt.c ?
>
> /* Lookup symbol including addr */
> if (!kallsyms_lookup_size_offset(paddr, &size, &offset))
> return 0;
>
> With this call, symbol start address is (paddr - offset) and end address
> is (paddr - offset + size).
>
Thanks for pointing this out. I shall work on a V2 patch ASAP.
> Thank you,
>
>>
>> There could also be the case that the stp instruction is found further in
>> the body of the function, which could theoretically allow probing of an
>> atomic squence. The likelihood of this seems low, and this would not be the
>> only aspect of kprobes where the user needs to be careful to avoid
>> problems.
>>
>> Signed-off-by: David A. Long <dave.long@linaro.org>
>> ---
>> arch/arm64/kernel/probes/decode-insn.c | 25 ++++++++++++++++++-------
>> 1 file changed, 18 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/probes/decode-insn.c b/arch/arm64/kernel/probes/decode-insn.c
>> index 37e47a9..248e820 100644
>> --- a/arch/arm64/kernel/probes/decode-insn.c
>> +++ b/arch/arm64/kernel/probes/decode-insn.c
>> @@ -122,16 +122,28 @@ arm_probe_decode_insn(kprobe_opcode_t insn, struct arch_specific_insn *asi)
>> static bool __kprobes
>> is_probed_address_atomic(kprobe_opcode_t *scan_start, kprobe_opcode_t *scan_end)
>> {
>> + const u32 stp_x29_x30_sp_pre = 0xa9807bfd;
>> + const u32 stp_ignore_index_mask = 0xffc07fff;
>> + u32 instruction = le32_to_cpu(*scan_start);
>> +
>> while (scan_start > scan_end) {
>> /*
>> - * atomic region starts from exclusive load and ends with
>> - * exclusive store.
>> + * Atomic region starts from exclusive load and ends with
>> + * exclusive store. If we hit a "stp x29, x30, [sp, #n]!"
>> + * assume it is the beginning of the function and end the
>> + * search. This helps avoid false positives from literal
>> + * constants that look like a load-exclusive, in addition
>> + * to being more efficient.
>> */
>> - if (aarch64_insn_is_store_ex(le32_to_cpu(*scan_start)))
>> + if ((instruction & stp_ignore_index_mask) == stp_x29_x30_sp_pre)
>> return false;
>> - else if (aarch64_insn_is_load_ex(le32_to_cpu(*scan_start)))
>> - return true;
>> +
>> scan_start--;
>> + instruction = le32_to_cpu(*scan_start);
>> + if (aarch64_insn_is_store_ex(instruction))
>> + return false;
>> + else if (aarch64_insn_is_load_ex(instruction))
>> + return true;
>> }
>>
>> return false;
>> @@ -142,7 +154,6 @@ arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi)
>> {
>> enum kprobe_insn decoded;
>> kprobe_opcode_t insn = le32_to_cpu(*addr);
>> - kprobe_opcode_t *scan_start = addr - 1;
>> kprobe_opcode_t *scan_end = addr - MAX_ATOMIC_CONTEXT_SIZE;
>> #if defined(CONFIG_MODULES) && defined(MODULES_VADDR)
>> struct module *mod;
>> @@ -167,7 +178,7 @@ arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi)
>> decoded = arm_probe_decode_insn(insn, asi);
>>
>> if (decoded == INSN_REJECTED ||
>> - is_probed_address_atomic(scan_start, scan_end))
>> + is_probed_address_atomic(addr, scan_end))
>> return INSN_REJECTED;
>>
>> return decoded;
>> --
>> 2.5.0
>>
>
>
Thanks,
-dl
next prev parent reply other threads:[~2016-09-01 21:27 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-31 20:52 [PATCH] arm64: Improve kprobes test for atomic sequence David Long
2016-08-31 20:52 ` David Long
2016-09-01 2:38 ` Masami Hiramatsu
2016-09-01 2:38 ` Masami Hiramatsu
2016-09-01 21:27 ` David Long [this message]
2016-09-01 21:27 ` David Long
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=57C89D5F.7070808@linaro.org \
--to=dave.long@linaro.org \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.