[PATCH] bpftool: Add CET-aware symbol matching for x86


* [PATCH] bpftool: Add CET-aware symbol matching for x86_64 architectures
@ 2025-06-26  6:11 Yuan Chen
  2025-06-26  7:11 ` [PATCH v2] " Yuan Chen
  2025-06-26  7:49 ` [PATCH v3] " Yuan Chen
  0 siblings, 2 replies; 14+ messages in thread
From: Yuan Chen @ 2025-06-26  6:11 UTC (permalink / raw)
  To: ast, qmo; +Cc: bpf, linux-kernel, chenyuan_fl, chenyuan

From: chenyuan <chenyuan@kylinos.cn>

Adjust symbol matching logic to account for Control-flow Enforcement
Technology (CET) on x86_64 systems. CET prefixes functions with a 4-byte
'endbr' instruction, shifting the actual entry point to symbol + 4.

Signed-off-by: chenyuan <chenyuan@kylinos.cn>
---
 tools/bpf/bpftool/link.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
index 189bf312c206..96c62d8aff8e 100644
--- a/tools/bpf/bpftool/link.c
+++ b/tools/bpf/bpftool/link.c
@@ -744,8 +744,21 @@ static void show_kprobe_multi_plain(struct bpf_link_info *info)
 
 	printf("\n\t%-16s %-16s %s", "addr", "cookie", "func [module]");
 	for (i = 0; i < dd.sym_count; i++) {
-		if (dd.sym_mapping[i].address != data[j].addr)
+		if (dd.sym_mapping[i].address != data[j].addr) {
+#if defined(__x86_64__) || defined(__amd64__)
+			/*
+			 * On x86_64 architectures with CET (Control-flow Enforcement Technology),
+			 * function entry points have a 4-byte 'endbr' instruction prefix.
+			 * This causes the actual function address = symbol address + 4.
+			 * Here we check if this symbol matches the target address minus 4,
+			 * indicating we've found a CET-enabled function entry point.
+			 */
+			if (dd.sym_mapping[i].address == data[j].addr - 4)
+				goto found;
+#endif
 			continue;
+		}
+found:
 		printf("\n\t%016lx %-16llx %s",
 		       dd.sym_mapping[i].address, data[j].cookie, dd.sym_mapping[i].name);
 		if (dd.sym_mapping[i].module[0] != '\0')
-- 
2.25.1



* [PATCH v2] bpftool: Add CET-aware symbol matching for x86_64 architectures
  2025-06-26  6:11 [PATCH] bpftool: Add CET-aware symbol matching for x86_64 architectures Yuan Chen
@ 2025-06-26  7:11 ` Yuan Chen
  2025-06-26  7:49 ` [PATCH v3] " Yuan Chen
  1 sibling, 0 replies; 14+ messages in thread
From: Yuan Chen @ 2025-06-26  7:11 UTC (permalink / raw)
  To: ast, qmo; +Cc: bpf, linux-kernel, chenyuan_fl, Yuan Chen

From: Yuan Chen <chenyuan@kylinos.cn>

Adjust symbol matching logic to account for Control-flow Enforcement
Technology (CET) on x86_64 systems. CET prefixes functions with a 4-byte
'endbr' instruction, shifting the actual entry point to symbol + 4.

Signed-off-by: Yuan Chen <chenyuan@kylinos.cn>
---
 tools/bpf/bpftool/link.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
index 189bf312c206..96c62d8aff8e 100644
--- a/tools/bpf/bpftool/link.c
+++ b/tools/bpf/bpftool/link.c
@@ -744,8 +744,21 @@ static void show_kprobe_multi_plain(struct bpf_link_info *info)
 
 	printf("\n\t%-16s %-16s %s", "addr", "cookie", "func [module]");
 	for (i = 0; i < dd.sym_count; i++) {
-		if (dd.sym_mapping[i].address != data[j].addr)
+		if (dd.sym_mapping[i].address != data[j].addr) {
+#if defined(__x86_64__) || defined(__amd64__)
+			/*
+			 * On x86_64 architectures with CET (Control-flow Enforcement Technology),
+			 * function entry points have a 4-byte 'endbr' instruction prefix.
+			 * This causes the actual function address = symbol address + 4.
+			 * Here we check if this symbol matches the target address minus 4,
+			 * indicating we've found a CET-enabled function entry point.
+			 */
+			if (dd.sym_mapping[i].address == data[j].addr - 4)
+				goto found;
+#endif
 			continue;
+		}
+found:
 		printf("\n\t%016lx %-16llx %s",
 		       dd.sym_mapping[i].address, data[j].cookie, dd.sym_mapping[i].name);
 		if (dd.sym_mapping[i].module[0] != '\0')
-- 
2.25.1



* [PATCH v3] bpftool: Add CET-aware symbol matching for x86_64 architectures
  2025-06-26  6:11 [PATCH] bpftool: Add CET-aware symbol matching for x86_64 architectures Yuan Chen
  2025-06-26  7:11 ` [PATCH v2] " Yuan Chen
@ 2025-06-26  7:49 ` Yuan Chen
  2025-06-27 11:08   ` Quentin Monnet
  2025-07-01  2:31   ` Yonghong Song
  1 sibling, 2 replies; 14+ messages in thread
From: Yuan Chen @ 2025-06-26  7:49 UTC (permalink / raw)
  To: ast, qmo; +Cc: bpf, linux-kernel, chenyuan_fl, Yuan Chen

From: Yuan Chen <chenyuan@kylinos.cn>

Adjust symbol matching logic to account for Control-flow Enforcement
Technology (CET) on x86_64 systems. CET prefixes functions with a 4-byte
'endbr' instruction, shifting the actual entry point to symbol + 4.

Signed-off-by: Yuan Chen <chenyuan@kylinos.cn>
---
 tools/bpf/bpftool/link.c | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
index 03513ffffb79..dfd192b4c5ad 100644
--- a/tools/bpf/bpftool/link.c
+++ b/tools/bpf/bpftool/link.c
@@ -307,8 +307,21 @@ show_kprobe_multi_json(struct bpf_link_info *info, json_writer_t *wtr)
 		goto error;
 
 	for (i = 0; i < dd.sym_count; i++) {
-		if (dd.sym_mapping[i].address != data[j].addr)
+		if (dd.sym_mapping[i].address != data[j].addr) {
+#if defined(__x86_64__) || defined(__amd64__)
+			/*
+			 * On x86_64 architectures with CET (Control-flow Enforcement Technology),
+			 * function entry points have a 4-byte 'endbr' instruction prefix.
+			 * This causes the actual function address = symbol address + 4.
+			 * Here we check if this symbol matches the target address minus 4,
+			 * indicating we've found a CET-enabled function entry point.
+			 */
+			if (dd.sym_mapping[i].address == data[j].addr - 4)
+				goto found;
+#endif
 			continue;
+		}
+found:
 		jsonw_start_object(json_wtr);
 		jsonw_uint_field(json_wtr, "addr", dd.sym_mapping[i].address);
 		jsonw_string_field(json_wtr, "func", dd.sym_mapping[i].name);
@@ -744,8 +757,21 @@ static void show_kprobe_multi_plain(struct bpf_link_info *info)
 
 	printf("\n\t%-16s %-16s %s", "addr", "cookie", "func [module]");
 	for (i = 0; i < dd.sym_count; i++) {
-		if (dd.sym_mapping[i].address != data[j].addr)
+		if (dd.sym_mapping[i].address != data[j].addr) {
+#if defined(__x86_64__) || defined(__amd64__)
+			/*
+			 * On x86_64 architectures with CET (Control-flow Enforcement Technology),
+			 * function entry points have a 4-byte 'endbr' instruction prefix.
+			 * This causes the actual function address = symbol address + 4.
+			 * Here we check if this symbol matches the target address minus 4,
+			 * indicating we've found a CET-enabled function entry point.
+			 */
+			if (dd.sym_mapping[i].address == data[j].addr - 4)
+				goto found;
+#endif
 			continue;
+		}
+found:
 		printf("\n\t%016lx %-16llx %s",
 		       dd.sym_mapping[i].address, data[j].cookie, dd.sym_mapping[i].name);
 		if (dd.sym_mapping[i].module[0] != '\0')
-- 
2.43.0



* Re: [PATCH v3] bpftool: Add CET-aware symbol matching for x86_64 architectures
  2025-06-26  7:49 ` [PATCH v3] " Yuan Chen
@ 2025-06-27 11:08   ` Quentin Monnet
  2025-07-11  6:35     ` chenyuan
  2025-07-01  2:31   ` Yonghong Song
  1 sibling, 1 reply; 14+ messages in thread
From: Quentin Monnet @ 2025-06-27 11:08 UTC (permalink / raw)
  To: Yuan Chen, ast; +Cc: bpf, linux-kernel, Yuan Chen, Jiri Olsa

Thanks! Next time, please try to add all relevant maintainers as
recipients or in copy of your message when submitting patches. You can
get the list with get_maintainer.pl, try running it on your patch or with
"./scripts/get_maintainer.pl -f tools/bpf/bpftool/link.c"

2025-06-26 15:49 UTC+0800 ~ Yuan Chen <chenyuan_fl@163.com>
> From: Yuan Chen <chenyuan@kylinos.cn>
> 
> Adjust symbol matching logic to account for Control-flow Enforcement
> Technology (CET) on x86_64 systems. CET prefixes functions with a 4-byte
> 'endbr' instruction, shifting the actual entry point to symbol + 4.
> 
> Signed-off-by: Yuan Chen <chenyuan@kylinos.cn>
> ---
>  tools/bpf/bpftool/link.c | 30 ++++++++++++++++++++++++++++--
>  1 file changed, 28 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
> index 03513ffffb79..dfd192b4c5ad 100644
> --- a/tools/bpf/bpftool/link.c
> +++ b/tools/bpf/bpftool/link.c
> @@ -307,8 +307,21 @@ show_kprobe_multi_json(struct bpf_link_info *info, json_writer_t *wtr)
>  		goto error;
>  
>  	for (i = 0; i < dd.sym_count; i++) {
> -		if (dd.sym_mapping[i].address != data[j].addr)
> +		if (dd.sym_mapping[i].address != data[j].addr) {
> +#if defined(__x86_64__) || defined(__amd64__)


I'm not familiar with CET, but from what I read, it's been around since
Tiger Lake processors (2020). Do we have a risk of false positive with
older CPUs? Maybe check that the instruction at
dd.sym_mapping[i].address is endbr32 or endbr64?


> +			/*
> +			 * On x86_64 architectures with CET (Control-flow Enforcement Technology),
> +			 * function entry points have a 4-byte 'endbr' instruction prefix.
> +			 * This causes the actual function address = symbol address + 4.
> +			 * Here we check if this symbol matches the target address minus 4,
> +			 * indicating we've found a CET-enabled function entry point.
> +			 */
> +			if (dd.sym_mapping[i].address == data[j].addr - 4)
> +				goto found;
> +#endif
>  			continue;
> +		}
> +found:
>  		jsonw_start_object(json_wtr);
>  		jsonw_uint_field(json_wtr, "addr", dd.sym_mapping[i].address);


I suppose we still want to print dd.sym_mapping[i].address (and not
data[j].addr) when we found it with the CET offset here - just
double-checking.


>  		jsonw_string_field(json_wtr, "func", dd.sym_mapping[i].name);
> @@ -744,8 +757,21 @@ static void show_kprobe_multi_plain(struct bpf_link_info *info)
>  
>  	printf("\n\t%-16s %-16s %s", "addr", "cookie", "func [module]");
>  	for (i = 0; i < dd.sym_count; i++) {
> -		if (dd.sym_mapping[i].address != data[j].addr)
> +		if (dd.sym_mapping[i].address != data[j].addr) {
> +#if defined(__x86_64__) || defined(__amd64__)
> +			/*
> +			 * On x86_64 architectures with CET (Control-flow Enforcement Technology),
> +			 * function entry points have a 4-byte 'endbr' instruction prefix.
> +			 * This causes the actual function address = symbol address + 4.
> +			 * Here we check if this symbol matches the target address minus 4,
> +			 * indicating we've found a CET-enabled function entry point.
> +			 */
> +			if (dd.sym_mapping[i].address == data[j].addr - 4)
> +				goto found;
> +#endif


Given that we have twice the same check, I'd move this to a dedicated
wrapper function that we could call from both show_kprobe_multi_json()
and show_kprobe_multi_plain().


>  			continue;
> +		}
> +found:
>  		printf("\n\t%016lx %-16llx %s",
>  		       dd.sym_mapping[i].address, data[j].cookie, dd.sym_mapping[i].name);
>  		if (dd.sym_mapping[i].module[0] != '\0')
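
The wrapper suggested above could be sketched roughly as follows. This is a
minimal sketch, not code from the patch: the helper name
symbol_matches_target(), the ENDBR_INSN_SIZE macro as used in user space, and
the KPROBE_ALLOW_ENDBR_OFFSET macro are all hypothetical names introduced for
illustration. Both show_kprobe_multi_json() and show_kprobe_multi_plain()
would call the helper instead of open-coding the comparison and the goto.

```c
#include <stdbool.h>
#include <stdint.h>

/* Size of an endbr32/endbr64 instruction on x86 (hypothetical user-space copy). */
#define ENDBR_INSN_SIZE 4

/* Hypothetical switch: only x86_64 builds accept the shifted match. */
#if defined(__x86_64__) || defined(__amd64__)
#define KPROBE_ALLOW_ENDBR_OFFSET true
#else
#define KPROBE_ALLOW_ENDBR_OFFSET false
#endif

/*
 * Hypothetical helper: returns true when a kallsyms symbol address
 * corresponds to the address reported for a kprobe_multi link, either
 * exactly or (when allowed) shifted by the size of an endbr prefix.
 */
static bool symbol_matches_target(uint64_t sym_addr, uint64_t target_addr,
				  bool allow_endbr_offset)
{
	if (sym_addr == target_addr)
		return true;
	/* The lower bound avoids unsigned wrap-around on tiny addresses. */
	if (allow_endbr_offset && target_addr >= ENDBR_INSN_SIZE &&
	    sym_addr == target_addr - ENDBR_INSN_SIZE)
		return true;
	return false;
}
```

In each loop the body would then reduce to a single test such as
"if (!symbol_matches_target(dd.sym_mapping[i].address, data[j].addr,
KPROBE_ALLOW_ENDBR_OFFSET)) continue;", which also removes the need for the
"found" label.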



* Re:Re: [PATCH v3] bpftool: Add CET-aware symbol matching for x86_64 architectures
  2025-06-27 11:08   ` Quentin Monnet
@ 2025-07-11  6:35     ` chenyuan
  0 siblings, 0 replies; 14+ messages in thread
From: chenyuan @ 2025-07-11  6:35 UTC (permalink / raw)
  To: Quentin Monnet; +Cc: ast, bpf, linux-kernel, Yuan Chen, Jiri Olsa

Thank you for reviewing the patch and providing valuable feedback! I appreciate your insights on CET
compatibility and code structure. Here are my responses to your points:

1. Maintainer List

I confirm that in future submissions, I will run:
./scripts/get_maintainer.pl -f tools/bpf/bpftool/link.c
to ensure all relevant maintainers are included in the recipient list. This was an oversight in the initial submission.

2. False Positives on Older CPUs

Your concern about older CPUs is valid. To address this:

Current Approach: The patch relies on address offset matching (symbol_addr == target_addr - 4), which is safe because:
  - Non-CET functions won’t have a valid symbol at target_addr - 4.
  - Symbol tables are deterministic, so accidental matches at addr - 4 are statistically negligible.

Instruction Verification: While checking for endbr32/endbr64 would be ideal, user space cannot directly inspect
kernel instruction memory for security and portability reasons.
Could you advise if there are any safe methods to verify the presence of endbr32/endbr64 instructions at kernel symbol addresses from user space?
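
For reference, the instruction check itself is only a 4-byte comparison once
the bytes are available. The sketch below assumes the caller has already
obtained the first four bytes at the symbol address by some means; how to do
that from user space is exactly the open question above and is not addressed
here. The function name bytes_are_endbr() is hypothetical. The encodings are
f3 0f 1e fa for endbr64 and f3 0f 1e fb for endbr32.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if the buffer starts with an
 * endbr64 (f3 0f 1e fa) or endbr32 (f3 0f 1e fb) instruction.
 * The caller is assumed to supply bytes read from the symbol address.
 */
static bool bytes_are_endbr(const unsigned char *insn, size_t len)
{
	static const unsigned char endbr64[4] = { 0xf3, 0x0f, 0x1e, 0xfa };
	static const unsigned char endbr32[4] = { 0xf3, 0x0f, 0x1e, 0xfb };

	/* Both encodings are exactly four bytes long. */
	if (len < sizeof(endbr64))
		return false;
	return memcmp(insn, endbr64, sizeof(endbr64)) == 0 ||
	       memcmp(insn, endbr32, sizeof(endbr32)) == 0;
}
```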








* Re: [PATCH v3] bpftool: Add CET-aware symbol matching for x86_64 architectures
  2025-06-26  7:49 ` [PATCH v3] " Yuan Chen
  2025-06-27 11:08   ` Quentin Monnet
@ 2025-07-01  2:31   ` Yonghong Song
  2025-07-11  7:07     ` chenyuan
  1 sibling, 1 reply; 14+ messages in thread
From: Yonghong Song @ 2025-07-01  2:31 UTC (permalink / raw)
  To: Yuan Chen, ast, qmo; +Cc: bpf, linux-kernel, Yuan Chen



On 6/26/25 12:49 AM, Yuan Chen wrote:
> From: Yuan Chen <chenyuan@kylinos.cn>
>
> Adjust symbol matching logic to account for Control-flow Enforcement
> Technology (CET) on x86_64 systems. CET prefixes functions with a 4-byte
> 'endbr' instruction, shifting the actual entry point to symbol + 4.
>
> Signed-off-by: Yuan Chen <chenyuan@kylinos.cn>
> ---
>   tools/bpf/bpftool/link.c | 30 ++++++++++++++++++++++++++++--
>   1 file changed, 28 insertions(+), 2 deletions(-)
>
> diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
> index 03513ffffb79..dfd192b4c5ad 100644
> --- a/tools/bpf/bpftool/link.c
> +++ b/tools/bpf/bpftool/link.c
> @@ -307,8 +307,21 @@ show_kprobe_multi_json(struct bpf_link_info *info, json_writer_t *wtr)
>   		goto error;
>   
>   	for (i = 0; i < dd.sym_count; i++) {
> -		if (dd.sym_mapping[i].address != data[j].addr)
> +		if (dd.sym_mapping[i].address != data[j].addr) {
> +#if defined(__x86_64__) || defined(__amd64__)
> +			/*
> +			 * On x86_64 architectures with CET (Control-flow Enforcement Technology),
> +			 * function entry points have a 4-byte 'endbr' instruction prefix.
> +			 * This causes the actual function address = symbol address + 4.
> +			 * Here we check if this symbol matches the target address minus 4,
> +			 * indicating we've found a CET-enabled function entry point.
> +			 */
> +			if (dd.sym_mapping[i].address == data[j].addr - 4)
> +				goto found;
> +#endif

In kernel/trace/bpf_trace.c, I see

static inline unsigned long get_entry_ip(unsigned long fentry_ip)
{
#ifdef CONFIG_X86_KERNEL_IBT
         if (is_endbr((void *)(fentry_ip - ENDBR_INSN_SIZE)))
                 fentry_ip -= ENDBR_INSN_SIZE;
#endif
         return fentry_ip;
}

Could you explain why arm64 would also need to do the check
     if (dd.sym_mapping[i].address == data[j].addr - 4)
like x86_64?

>   			continue;
> +		}
> +found:
>   		jsonw_start_object(json_wtr);
>   		jsonw_uint_field(json_wtr, "addr", dd.sym_mapping[i].address);
>   		jsonw_string_field(json_wtr, "func", dd.sym_mapping[i].name);
> @@ -744,8 +757,21 @@ static void show_kprobe_multi_plain(struct bpf_link_info *info)
>   
>   	printf("\n\t%-16s %-16s %s", "addr", "cookie", "func [module]");
>   	for (i = 0; i < dd.sym_count; i++) {
> -		if (dd.sym_mapping[i].address != data[j].addr)
> +		if (dd.sym_mapping[i].address != data[j].addr) {
> +#if defined(__x86_64__) || defined(__amd64__)
> +			/*
> +			 * On x86_64 architectures with CET (Control-flow Enforcement Technology),
> +			 * function entry points have a 4-byte 'endbr' instruction prefix.
> +			 * This causes the actual function address = symbol address + 4.
> +			 * Here we check if this symbol matches the target address minus 4,
> +			 * indicating we've found a CET-enabled function entry point.
> +			 */
> +			if (dd.sym_mapping[i].address == data[j].addr - 4)
> +				goto found;
> +#endif
>   			continue;
> +		}
> +found:
>   		printf("\n\t%016lx %-16llx %s",
>   		       dd.sym_mapping[i].address, data[j].cookie, dd.sym_mapping[i].name);
>   		if (dd.sym_mapping[i].module[0] != '\0')



* Re:Re: [PATCH v3] bpftool: Add CET-aware symbol matching for x86_64 architectures
  2025-07-01  2:31   ` Yonghong Song
@ 2025-07-11  7:07     ` chenyuan
  2025-07-12  0:47       ` Yonghong Song
  0 siblings, 1 reply; 14+ messages in thread
From: chenyuan @ 2025-07-11  7:07 UTC (permalink / raw)
  To: Yonghong Song; +Cc: ast, qmo, bpf, linux-kernel, Yuan Chen

Thank you for your feedback! Does ARM64 require similar address adjustment detection? In my ARM64
environment with BTI enabled, bpftool correctly retrieves and prints function symbols. Could my verification
method be flawed?
Here’s a detailed explanation:

ARM64 BTI vs. x86 CET: Fundamental Differences

    x86 CET (Control-flow Enforcement Technology):
        Requires endbr32/endbr64 at function entries. Overwriting these instructions breaks CET protection.
        Kernel logic (e.g., bpf_trace.c) adjusts symbol addresses by -4 to skip the endbr prefix.
    ARM64 BTI (Branch Target Identification):
        Uses BTI instructions as "landing pads" for indirect jumps. Kprobes can safely overwrite BTI instructions without triggering faults because:
            Executing BTI, SG, or PACBTI clears EPSR.B (the enforcement flag), allowing subsequent non-BTI instructions.
            Non-landing-pad instructions (e.g., probes) only fault if executed before EPSR.B is cleared, which doesn’t occur when probes replace BTI.

https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension











* Re: [PATCH v3] bpftool: Add CET-aware symbol matching for x86_64 architectures
  2025-07-11  7:07     ` chenyuan
@ 2025-07-12  0:47       ` Yonghong Song
  2025-07-21 12:51         ` chenyuan
  0 siblings, 1 reply; 14+ messages in thread
From: Yonghong Song @ 2025-07-12  0:47 UTC (permalink / raw)
  To: chenyuan; +Cc: ast, qmo, bpf, linux-kernel, Yuan Chen



On 7/11/25 12:07 AM, chenyuan wrote:
> Thank you for your feedback! Does ARM64 require similar address adjustment detection? In my ARM64
>   environment with BTI enabled, bpftool correctly retrieves and prints function symbols. Could my verification
>   method be flawed?
> Here’s a detailed explanation:
>
> ARM64 BTI vs. x86 CET: Fundamental Differences
>
>      x86 CET (Control-flow Enforcement Technology):
>          Requires endbr32/endbr64 at function entries. Overwriting these instructions breaks CET protection .
>          Kernel logic (e.g., bpf_trace.c) adjusts symbol addresses by -4 to skip the endbr prefix .

This interpretation is not correct. The adjustment by -4 is not to skip the endbr prefix,
but to get the actual symbol address. For example,

ffffffff83809cb0 <bpf_fentry_test3>:
ffffffff83809cb0: f3 0f 1e fa           endbr64
ffffffff83809cb4: 0f 1f 44 00 00        nopl    (%rax,%rax)
ffffffff83809cb9: 8d 04 37              leal    (%rdi,%rsi), %eax
ffffffff83809cbc: 01 d0                 addl    %edx, %eax
ffffffff83809cbe: 2e e9 6c d3 c8 00     jmp     0xffffffff84497030 <__x86_return_thunk>
ffffffff83809cc4: 66 66 66 2e 0f 1f 84 00 00 00 00 00   nopw    %cs:(%rax,%rax)

The fentry_ip argument of get_entry_ip() is 0xffffffff83809cb4. Subtracting 4
gives 0xffffffff83809cb0, which is the actual start of the function.
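
The arithmetic above can be replayed in user space as a small sketch. It is
only a model of get_entry_ip(): the kernel reads live kernel text, whereas
this sketch reads a byte array copied from the bpf_fentry_test3 listing. The
names FUNC_BASE, func_text and entry_ip_from_fentry_ip() are hypothetical,
and only the endbr64 encoding is checked.

```c
#include <stdint.h>
#include <string.h>

#define ENDBR_INSN_SIZE 4

/* Start address of bpf_fentry_test3 in the listing above. */
#define FUNC_BASE 0xffffffff83809cb0ULL

/* First bytes of bpf_fentry_test3, copied from the listing above. */
static const unsigned char func_text[] = {
	0xf3, 0x0f, 0x1e, 0xfa,		/* endbr64 */
	0x0f, 0x1f, 0x44, 0x00, 0x00,	/* nopl (%rax,%rax) */
	0x8d, 0x04, 0x37,		/* leal (%rdi,%rsi), %eax */
};

/*
 * User-space model of get_entry_ip(): if the four bytes immediately
 * before fentry_ip are an endbr64, step back to the symbol address.
 */
static uint64_t entry_ip_from_fentry_ip(uint64_t fentry_ip)
{
	static const unsigned char endbr64[ENDBR_INSN_SIZE] = {
		0xf3, 0x0f, 0x1e, 0xfa
	};
	uint64_t off;

	/* The bytes at fentry_ip - 4 must lie inside the copied text. */
	if (fentry_ip < FUNC_BASE + ENDBR_INSN_SIZE)
		return fentry_ip;
	off = fentry_ip - ENDBR_INSN_SIZE - FUNC_BASE;
	if (off + ENDBR_INSN_SIZE > sizeof(func_text))
		return fentry_ip;

	if (memcmp(func_text + off, endbr64, ENDBR_INSN_SIZE) == 0)
		fentry_ip -= ENDBR_INSN_SIZE;
	return fentry_ip;
}
```

With the listing's values, entry_ip_from_fentry_ip(0xffffffff83809cb4)
returns 0xffffffff83809cb0, while an address that is not preceded by endbr64,
such as 0xffffffff83809cb9, is returned unchanged.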

>      ARM64 BTI (Branch Target Identification):
>          Uses BTI instructions as "landing pads" for indirect jumps. Kprobes can safely overwrite BTI instructions without triggering faults because:
>              Executing BTI, SG, or PACBTI clears EPSR.B (the enforcement flag), allowing subsequent non-BTI instructions .
>              Non-landing-pad instructions (e.g., probes) only fault if executed before EPSR.B is cleared – which doesn’t occur when probes replace BTI .

I am not very familiar with arm64 BTI. But on an arm64 kernel built with my config file (based on bpf CI),
I didn't find bti insns for traceable functions. So I doubt the arm64 kernel will need address adjustment.
Otherwise, get_entry_ip() should do the adjustment there.

It would be great if you could provide an example showing that arm64 also needs the addr adjustment in bpftool
as in this patch.

>
> https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension
>
>
>
>
>
>
>
>
>



* Re:Re: [PATCH v3] bpftool: Add CET-aware symbol matching for x86_64 architectures
  2025-07-12  0:47       ` Yonghong Song
@ 2025-07-21 12:51         ` chenyuan
  2025-07-21 14:53           ` Yonghong Song
  0 siblings, 1 reply; 14+ messages in thread
From: chenyuan @ 2025-07-21 12:51 UTC (permalink / raw)
  To: Yonghong Song; +Cc: ast, qmo, bpf, linux-kernel, Yuan Chen

Apologies for any inaccuracies in my previous explanation. Below, I'll provide a brief clarification based
on verification across both ARM64 and x86 platforms:
arm64:
Without kprobe/kprobe_multi Hook:
(gdb) disassemble vfs_read
Dump of assembler code for function vfs_read:
   0xffffc000803ca308 <+0>:	bti	c   // ARM64 BTI security instruction  
   0xffffc000803ca30c <+4>:	nop
   0xffffc000803ca310 <+8>:	nop
   0xffffc000803ca314 <+12>:	paciasp
   0xffffc000803ca318 <+16>:	sub	sp, sp, #0xa0

With kprobe/kprobe_multi Hook:
(gdb) disassemble vfs_read
Dump of assembler code for function vfs_read:
   0xffffc000803ca308 <+0>:	brk	#0x4  // BTI replaced by breakpoint  
   0xffffc000803ca30c <+4>:	mov	x9, x30
   0xffffc000803ca310 <+8>:	nop
   0xffffc000803ca314 <+12>:	paciasp
   0xffffc000803ca318 <+16>:	sub	sp, sp, #0xa0

kprobe directly overwrites the first instruction (bti c → brk #0x4). Hook address (0xffffc000803ca308) matches
the symbol address exactly.

x86_64:
Without kprobe/kprobe_multi Hook:
(gdb) disassemble vfs_read
Dump of assembler code for function vfs_read:
   0xffffffff82112b40 <+0>:     endbr64  // x86 CET security instruction  
   0xffffffff82112b44 <+4>:     nopl   0x0(%rax,%rax,1)
   0xffffffff82112b49 <+9>:     push   %r15
   0xffffffff82112b4b <+11>:    mov    %rsi,%r15
   0xffffffff82112b4e <+14>:    push   %r14
   0xffffffff82112b50 <+16>:    push   %r13

With kprobe/kprobe_multi Hook:
(gdb) disassemble vfs_read
Dump of assembler code for function vfs_read:
   0xffffffff82112b40 <+0>:     endbr64   // Preserved security instruction  
   0xffffffff82112b44 <+4>:     call   0xffffffffa1830000  // Hook replaces nopl
   0xffffffff82112b49 <+9>:     push   %r15
   0xffffffff82112b4b <+11>:    mov    %rsi,%r15
   0xffffffff82112b4e <+14>:    push   %r14
   0xffffffff82112b50 <+16>:    push   %r13

kprobe preserves endbr64 and overwrites the subsequent instruction (nopl → call). Hook address (0xffffffff82112b44) 
requires -4 offset (0xffffffff82112b40) to match the symbol address.

ARM64 hooks replace the very first instruction (including security features like BTI), while x86_64 hooks target the instruction
immediately after endbr64, creating a 4-byte offset that must be compensated for when resolving symbol addresses.
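
The address arithmetic above can be checked with a small user-space sketch.
The addresses come from the two vfs_read dumps; the names `enum arch`,
`hook_to_symbol()` and `check_vfs_read_examples()` are hypothetical and exist
only for this illustration. It assumes every hooked x86_64 function starts
with a 4-byte endbr64, which is only true on IBT-enabled kernels.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, used only for this illustration. */
enum arch { ARCH_ARM64, ARCH_X86_64 };

/* endbr64 encodes as f3 0f 1e fa, i.e. 4 bytes. */
#define ENDBR64_LEN 4

/*
 * Map a kprobe_multi hook address back to the symbol address.
 * Assumption: on x86_64 the function begins with endbr64.
 */
static uint64_t hook_to_symbol(enum arch arch, uint64_t hook_addr)
{
	if (arch == ARCH_X86_64)
		return hook_addr - ENDBR64_LEN; /* step back over endbr64 */
	return hook_addr; /* arm64: brk replaces the first instruction */
}

/* Check the helper against the two vfs_read dumps. */
static void check_vfs_read_examples(void)
{
	/* arm64: hook address equals the symbol address. */
	assert(hook_to_symbol(ARCH_ARM64, 0xffffc000803ca308ULL) ==
	       0xffffc000803ca308ULL);
	/* x86_64: hook at +4 maps back to the symbol at +0. */
	assert(hook_to_symbol(ARCH_X86_64, 0xffffffff82112b44ULL) ==
	       0xffffffff82112b40ULL);
}
```

Calling check_vfs_read_examples() from a main() exercises both cases. bpftool
itself cannot know the architecture of each address this way, which is why the
patch compares against both the address and the address minus 4.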













At 2025-07-12 08:47:32, "Yonghong Song" <yonghong.song@linux.dev> wrote:
>
>
>On 7/11/25 12:07 AM, chenyuan wrote:
>> Thank you for your feedback! Does ARM64 require similar address adjustment detection? In my ARM64
>>   environment with BTI enabled, bpftool correctly retrieves and prints function symbols. Could my verification
>>   method be flawed?
>> Here’s a detailed explanation:
>>
>> ARM64 BTI vs. x86 CET: Fundamental Differences
>>
>>      x86 CET (Control-flow Enforcement Technology):
>>          Requires endbr32/endbr64 at function entries. Overwriting these instructions breaks CET protection.
>>          Kernel logic (e.g., bpf_trace.c) adjusts symbol addresses by -4 to skip the endbr prefix.
>
>This interpretation is not correct. The adjustment by -4 is not to skip the endbr prefix,
>but to get the actual symbol address. For example,
>
>ffffffff83809cb0 <bpf_fentry_test3>:
>ffffffff83809cb0: f3 0f 1e fa           endbr64
>ffffffff83809cb4: 0f 1f 44 00 00        nopl    (%rax,%rax)
>ffffffff83809cb9: 8d 04 37              leal    (%rdi,%rsi), %eax
>ffffffff83809cbc: 01 d0                 addl    %edx, %eax
>ffffffff83809cbe: 2e e9 6c d3 c8 00     jmp     0xffffffff84497030 <__x86_return_thunk>
>ffffffff83809cc4: 66 66 66 2e 0f 1f 84 00 00 00 00 00   nopw    %cs:(%rax,%rax)
>
>The fentry_ip argument in func get_entry_ip() is 0xffffffff83809cb4. Adding -4
>will get the value 0xffffffff83809cb0 which is the actual start of the function.
>
>>      ARM64 BTI (Branch Target Identification):
>>          Uses BTI instructions as "landing pads" for indirect jumps. Kprobes can safely overwrite BTI instructions without triggering faults because:
>>              Executing BTI, SG, or PACBTI clears EPSR.B (the enforcement flag), allowing subsequent non-BTI instructions.
>>              Non-landing-pad instructions (e.g., probes) only fault if executed before EPSR.B is cleared – which doesn’t occur when probes replace BTI.
>
>I am not super familiar with arm64 bti. But from an arm64 kernel, with my config file (based on bpf CI),
>I didn't find bti insns for traceable functions. So I doubt the arm64 kernel will need address adjustment.
>Otherwise, get_entry_ip() should do adjustment there.
>
>It would be great if you can have an example to show arm64 also needs addr adjustment in bpftool
>as in this patch.
>
>>
>> https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> At 2025-07-01 10:31:41, "Yonghong Song" <yonghong.song@linux.dev> wrote:
>>>
>>> On 6/26/25 12:49 AM, Yuan Chen wrote:
>>>> From: Yuan Chen <chenyuan@kylinos.cn>
>>>>
>>>> Adjust symbol matching logic to account for Control-flow Enforcement
>>>> Technology (CET) on x86_64 systems. CET prefixes functions with a 4-byte
>>>> 'endbr' instruction, shifting the actual entry point to symbol + 4.
>>>>
>>>> Signed-off-by: Yuan Chen <chenyuan@kylinos.cn>
>>>> ---
>>>>    tools/bpf/bpftool/link.c | 30 ++++++++++++++++++++++++++++--
>>>>    1 file changed, 28 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
>>>> index 03513ffffb79..dfd192b4c5ad 100644
>>>> --- a/tools/bpf/bpftool/link.c
>>>> +++ b/tools/bpf/bpftool/link.c
>>>> @@ -307,8 +307,21 @@ show_kprobe_multi_json(struct bpf_link_info *info, json_writer_t *wtr)
>>>>    		goto error;
>>>>    
>>>>    	for (i = 0; i < dd.sym_count; i++) {
>>>> -		if (dd.sym_mapping[i].address != data[j].addr)
>>>> +		if (dd.sym_mapping[i].address != data[j].addr) {
>>>> +#if defined(__x86_64__) || defined(__amd64__)
>>>> +			/*
>>>> +			 * On x86_64 architectures with CET (Control-flow Enforcement Technology),
>>>> +			 * function entry points have a 4-byte 'endbr' instruction prefix.
>>>> +			 * This causes the actual function address = symbol address + 4.
>>>> +			 * Here we check if this symbol matches the target address minus 4,
>>>> +			 * indicating we've found a CET-enabled function entry point.
>>>> +			 */
>>>> +			if (dd.sym_mapping[i].address == data[j].addr - 4)
>>>> +				goto found;
>>>> +#endif
>>> In kernel/trace/bpf_trace.c, I see
>>>
>>> static inline unsigned long get_entry_ip(unsigned long fentry_ip)
>>> {
>>> #ifdef CONFIG_X86_KERNEL_IBT
>>>          if (is_endbr((void *)(fentry_ip - ENDBR_INSN_SIZE)))
>>>                  fentry_ip -= ENDBR_INSN_SIZE;
>>> #endif
>>>          return fentry_ip;
>>> }
>>>
>>> Could you explain why arm64 also need to do checking
>>>      if (dd.sym_mapping[i].address == data[j].addr - 4)
>>> like x86_64?
>>>
>>>>    			continue;
>>>> +		}
>>>> +found:
>>>>    		jsonw_start_object(json_wtr);
>>>>    		jsonw_uint_field(json_wtr, "addr", dd.sym_mapping[i].address);
>>>>    		jsonw_string_field(json_wtr, "func", dd.sym_mapping[i].name);
>>>> @@ -744,8 +757,21 @@ static void show_kprobe_multi_plain(struct bpf_link_info *info)
>>>>    
>>>>    	printf("\n\t%-16s %-16s %s", "addr", "cookie", "func [module]");
>>>>    	for (i = 0; i < dd.sym_count; i++) {
>>>> -		if (dd.sym_mapping[i].address != data[j].addr)
>>>> +		if (dd.sym_mapping[i].address != data[j].addr) {
>>>> +#if defined(__x86_64__) || defined(__amd64__)
>>>> +			/*
>>>> +			 * On x86_64 architectures with CET (Control-flow Enforcement Technology),
>>>> +			 * function entry points have a 4-byte 'endbr' instruction prefix.
>>>> +			 * This causes the actual function address = symbol address + 4.
>>>> +			 * Here we check if this symbol matches the target address minus 4,
>>>> +			 * indicating we've found a CET-enabled function entry point.
>>>> +			 */
>>>> +			if (dd.sym_mapping[i].address == data[j].addr - 4)
>>>> +				goto found;
>>>> +#endif
>>>>    			continue;
>>>> +		}
>>>> +found:
>>>>    		printf("\n\t%016lx %-16llx %s",
>>>>    		       dd.sym_mapping[i].address, data[j].cookie, dd.sym_mapping[i].name);
>>>>    		if (dd.sym_mapping[i].module[0] != '\0')


* Re: [PATCH v3] bpftool: Add CET-aware symbol matching for x86_64 architectures
  2025-07-21 12:51         ` chenyuan
@ 2025-07-21 14:53           ` Yonghong Song
  2025-07-22  1:46             ` [PATCH v4] bpftool: Add CET-aware symbol matching for x86/x86_64 architectures chenyuan_fl
  2025-07-22  2:00             ` chenyuan_fl
  0 siblings, 2 replies; 14+ messages in thread
From: Yonghong Song @ 2025-07-21 14:53 UTC (permalink / raw)
  To: chenyuan; +Cc: ast, qmo, bpf, linux-kernel, Yuan Chen



On 7/21/25 5:51 AM, chenyuan wrote:
> Apologies for any inaccuracies in my previous explanation. Below, I'll provide a brief clarification based
> on verification across both ARM64 and x86 platforms:
> arm64:
> Without kprobe/kprobe_multi Hook:
> (gdb) disassemble vfs_read
> Dump of assembler code for function vfs_read:
>     0xffffc000803ca308 <+0>:	bti	c   // ARM64 BTI security instruction
>     0xffffc000803ca30c <+4>:	nop
>     0xffffc000803ca310 <+8>:	nop
>     0xffffc000803ca314 <+12>:	paciasp
>     0xffffc000803ca318 <+16>:	sub	sp, sp, #0xa0
>
> With kprobe/kprobe_multi Hook:
> (gdb) disassemble vfs_read
> Dump of assembler code for function vfs_read:
>     0xffffc000803ca308 <+0>:	brk	#0x4  // BTI replaced by breakpoint
>     0xffffc000803ca30c <+4>:	mov	x9, x30
>     0xffffc000803ca310 <+8>:	nop
>     0xffffc000803ca314 <+12>:	paciasp
>     0xffffc000803ca318 <+16>:	sub	sp, sp, #0xa0

Thanks for checking. If this is the case, then I don't think we
need to check
    if (dd.sym_mapping[i].address == data[j].addr - 4)
for arm64. In your v3 patch, the comment also only mentions x86_64.

>
> kprobe directly overwrites the first instruction (bti c → brk #0x4). Hook address (0xffffc000803ca308) matches
> the symbol address exactly.
>
> x86_64:
> Without kprobe/kprobe_multi Hook:
> (gdb) disassemble vfs_read
> Dump of assembler code for function vfs_read:
>     0xffffffff82112b40 <+0>:     endbr64  // x86 CET security instruction
>     0xffffffff82112b44 <+4>:     nopl   0x0(%rax,%rax,1)
>     0xffffffff82112b49 <+9>:     push   %r15
>     0xffffffff82112b4b <+11>:    mov    %rsi,%r15
>     0xffffffff82112b4e <+14>:    push   %r14
>     0xffffffff82112b50 <+16>:    push   %r13
>
> With kprobe/kprobe_multi Hook:
> (gdb) disassemble vfs_read
> Dump of assembler code for function vfs_read:
>     0xffffffff82112b40 <+0>:     endbr64   // Preserved security instruction
>     0xffffffff82112b44 <+4>:     call   0xffffffffa1830000  // Hook replaces nopl
>     0xffffffff82112b49 <+9>:     push   %r15
>     0xffffffff82112b4b <+11>:    mov    %rsi,%r15
>     0xffffffff82112b4e <+14>:    push   %r14
>     0xffffffff82112b50 <+16>:    push   %r13
>
> kprobe preserves endbr64 and overwrites the subsequent instruction (nopl → call). Hook address (0xffffffff82112b44)
> requires -4 offset (0xffffffff82112b40) to match the symbol address.
>
> ARM64 hooks replace the very first instruction (including security features like BTI), while x86_64 hooks target the instruction
> immediately after endbr64, creating a 4-byte offset that must be compensated for when resolving symbol addresses.

[...]



* [PATCH v4] bpftool: Add CET-aware symbol matching for x86/x86_64 architectures
  2025-07-21 14:53           ` Yonghong Song
@ 2025-07-22  1:46             ` chenyuan_fl
  2025-07-22  2:00             ` chenyuan_fl
  1 sibling, 0 replies; 14+ messages in thread
From: chenyuan_fl @ 2025-07-22  1:46 UTC (permalink / raw)
  To: qmo, ast, daniel, andrii, yonghong.song; +Cc: bpf, linux-kernel, Yuan Chen

From: Yuan Chen <chenyuan@kylinos.com>

Adjust symbol matching logic to account for Control-flow Enforcement
Technology (CET) on x86/x86_64 systems. CET prefixes functions with
a 4-byte 'endbr' instruction, shifting the actual hook entry point to
symbol + 4.

Changed in PATCH v4:
* Refactor repeated code into a function.
* Add detection for the x86 architecture.

Signed-off-by: Yuan Chen <chenyuan@kylinos.com>
---
 tools/bpf/bpftool/link.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
index a773e05d5ade..9e5d85421919 100644
--- a/tools/bpf/bpftool/link.c
+++ b/tools/bpf/bpftool/link.c
@@ -282,6 +282,28 @@ get_addr_cookie_array(__u64 *addrs, __u64 *cookies, __u32 count)
 	return data;
 }
 
+static bool
+symbol_matches_target(__u64 sym_addr, __u64 target_addr)
+{
+	if (sym_addr == target_addr)
+		return true;
+
+#if defined(__i386__) || defined(__x86_64__)
+	/*
+	 * On x86 architectures with CET (Control-flow Enforcement Technology),
+	 * function entry points have a 4-byte 'endbr' instruction prefix.
+	 * This causes kprobe hooks to target the address *after* 'endbr'
+	 * (symbol address + 4), preserving the CET instruction.
+	 * Here we check if the symbol address matches the hook target address minus 4,
+	 * indicating a CET-enabled function entry point.
+	 */
+	if (sym_addr == target_addr - 4)
+		return true;
+#endif
+
+	return false;
+}
+
 static void
 show_kprobe_multi_json(struct bpf_link_info *info, json_writer_t *wtr)
 {
@@ -307,8 +329,9 @@ show_kprobe_multi_json(struct bpf_link_info *info, json_writer_t *wtr)
 		goto error;
 
 	for (i = 0; i < dd.sym_count; i++) {
-		if (dd.sym_mapping[i].address != data[j].addr)
+		if (!symbol_matches_target(dd.sym_mapping[i].address, data[j].addr))
 			continue;
+
 		jsonw_start_object(json_wtr);
 		jsonw_uint_field(json_wtr, "addr", dd.sym_mapping[i].address);
 		jsonw_string_field(json_wtr, "func", dd.sym_mapping[i].name);
@@ -744,7 +767,7 @@ static void show_kprobe_multi_plain(struct bpf_link_info *info)
 
 	printf("\n\t%-16s %-16s %s", "addr", "cookie", "func [module]");
 	for (i = 0; i < dd.sym_count; i++) {
-		if (dd.sym_mapping[i].address != data[j].addr)
+		if (!symbol_matches_target(dd.sym_mapping[i].address, data[j].addr))
 			continue;
 		printf("\n\t%016lx %-16llx %s",
 		       dd.sym_mapping[i].address, data[j].cookie, dd.sym_mapping[i].name);
-- 
2.25.1



* [PATCH v4] bpftool: Add CET-aware symbol matching for x86/x86_64 architectures
  2025-07-21 14:53           ` Yonghong Song
  2025-07-22  1:46             ` [PATCH v4] bpftool: Add CET-aware symbol matching for x86/x86_64 architectures chenyuan_fl
@ 2025-07-22  2:00             ` chenyuan_fl
  2025-07-22 14:23               ` Quentin Monnet
  1 sibling, 1 reply; 14+ messages in thread
From: chenyuan_fl @ 2025-07-22  2:00 UTC (permalink / raw)
  To: qmo, ast, daniel, andrii, yonghong.song; +Cc: bpf, linux-kernel, Yuan Chen

From: Yuan Chen <chenyuan@kylinos.cn>

Adjust symbol matching logic to account for Control-flow Enforcement
Technology (CET) on x86/x86_64 systems. CET prefixes functions with
a 4-byte 'endbr' instruction, shifting the actual hook entry point to
symbol + 4.

Changed in PATCH v4:
* Refactor repeated code into a function.
* Add detection for the x86 architecture.

Signed-off-by: Yuan Chen <chenyuan@kylinos.cn>
---
 tools/bpf/bpftool/link.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
index a773e05d5ade..717ca8c5ff83 100644
--- a/tools/bpf/bpftool/link.c
+++ b/tools/bpf/bpftool/link.c
@@ -282,6 +282,28 @@ get_addr_cookie_array(__u64 *addrs, __u64 *cookies, __u32 count)
 	return data;
 }
 
+static bool
+symbol_matches_target(__u64 sym_addr, __u64 target_addr)
+{
+	if (sym_addr == target_addr)
+		return true;
+
+#if defined(__i386__) || defined(__x86_64__)
+	/*
+	 * On x86 architectures with CET (Control-flow Enforcement Technology),
+	 * function entry points have a 4-byte 'endbr' instruction prefix.
+	 * This causes kprobe hooks to target the address *after* 'endbr'
+	 * (symbol address + 4), preserving the CET instruction.
+	 * Here we check if the symbol address matches the hook target address minus 4,
+	 * indicating a CET-enabled function entry point.
+	 */
+	if (sym_addr == target_addr - 4)
+		return true;
+#endif
+
+	return false;
+}
+
 static void
 show_kprobe_multi_json(struct bpf_link_info *info, json_writer_t *wtr)
 {
@@ -307,7 +329,7 @@ show_kprobe_multi_json(struct bpf_link_info *info, json_writer_t *wtr)
 		goto error;
 
 	for (i = 0; i < dd.sym_count; i++) {
-		if (dd.sym_mapping[i].address != data[j].addr)
+		if (!symbol_matches_target(dd.sym_mapping[i].address, data[j].addr))
 			continue;
 		jsonw_start_object(json_wtr);
 		jsonw_uint_field(json_wtr, "addr", dd.sym_mapping[i].address);
@@ -744,7 +766,7 @@ static void show_kprobe_multi_plain(struct bpf_link_info *info)
 
 	printf("\n\t%-16s %-16s %s", "addr", "cookie", "func [module]");
 	for (i = 0; i < dd.sym_count; i++) {
-		if (dd.sym_mapping[i].address != data[j].addr)
+		if (!symbol_matches_target(dd.sym_mapping[i].address, data[j].addr))
 			continue;
 		printf("\n\t%016lx %-16llx %s",
 		       dd.sym_mapping[i].address, data[j].cookie, dd.sym_mapping[i].name);
-- 
2.25.1
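
To see how the helper behaves inside a lookup loop, here is a self-contained
sketch. `struct sym`, `find_symbol()` and `demo_syms` are hypothetical
stand-ins for bpftool's dd.sym_mapping table and the loops in the
show_kprobe_multi_*() functions; only the comparison logic is taken from the
patch. The architecture guard is dropped so the -4 case is always compiled,
and uint64_t replaces __u64 so the file builds outside the kernel tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of bpftool's dd.sym_mapping. */
struct sym {
	uint64_t address;
	const char *name;
};

/* Same comparison as the patch, without the architecture guard. */
static bool symbol_matches_target(uint64_t sym_addr, uint64_t target_addr)
{
	if (sym_addr == target_addr)
		return true;
	/* Hook placed right after a 4-byte endbr64. */
	return sym_addr == target_addr - 4;
}

/* Hypothetical lookup: return the symbol a hook address belongs to. */
static const struct sym *find_symbol(const struct sym *syms, size_t n,
				     uint64_t hook)
{
	assert(syms != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (symbol_matches_target(syms[i].address, hook))
			return &syms[i];
	}
	return NULL;
}

/* Symbol addresses taken from the dumps earlier in this thread. */
static const struct sym demo_syms[] = {
	{ 0xffffffff82112b40ULL, "vfs_read" },
	{ 0xffffffff83809cb0ULL, "bpf_fentry_test3" },
};
```

With this table, find_symbol(demo_syms, 2, 0xffffffff82112b44) resolves to the
vfs_read entry through the -4 case, while an address 8 bytes past the symbol
resolves to nothing. Note that the -4 comparison is purely numeric: it would
also accept an unrelated symbol that happens to start 4 bytes before a hook
address, so it relies on hook addresses coming from function entries.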



* Re: [PATCH v4] bpftool: Add CET-aware symbol matching for x86/x86_64 architectures
  2025-07-22  2:00             ` chenyuan_fl
@ 2025-07-22 14:23               ` Quentin Monnet
  2025-07-23  1:52                 ` chenyuan
  0 siblings, 1 reply; 14+ messages in thread
From: Quentin Monnet @ 2025-07-22 14:23 UTC (permalink / raw)
  To: chenyuan_fl, ast, daniel, andrii, yonghong.song
  Cc: bpf, linux-kernel, Yuan Chen

2025-07-22 10:00 UTC+0800 ~ chenyuan_fl@163.com
> From: Yuan Chen <chenyuan@kylinos.cn>
> 
> Adjust symbol matching logic to account for Control-flow Enforcement
> Technology (CET) on x86/x86_64 systems. CET prefixes functions with
> a 4-byte 'endbr' instruction, shifting the actual hook entry point to
> symbol + 4.
> 
> Changed in PATCH v4:
> * Refactor repeated code into a function.
> * Add detection for the x86 architecture.
> 
> Signed-off-by: Yuan Chen <chenyuan@kylinos.cn>
> ---
>  tools/bpf/bpftool/link.c | 26 ++++++++++++++++++++++++--
>  1 file changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
> index a773e05d5ade..717ca8c5ff83 100644
> --- a/tools/bpf/bpftool/link.c
> +++ b/tools/bpf/bpftool/link.c
> @@ -282,6 +282,28 @@ get_addr_cookie_array(__u64 *addrs, __u64 *cookies, __u32 count)
>  	return data;
>  }
>  
> +static bool
> +symbol_matches_target(__u64 sym_addr, __u64 target_addr)
> +{
> +	if (sym_addr == target_addr)
> +		return true;
> +
> +#if defined(__i386__) || defined(__x86_64__)


Do you really need it for __i386__ as well? My understanding was that
CET would apply only to 64-bit?

Thanks,
Quentin


* Re:Re: [PATCH v4] bpftool: Add CET-aware symbol matching for x86/x86_64 architectures
  2025-07-22 14:23               ` Quentin Monnet
@ 2025-07-23  1:52                 ` chenyuan
  0 siblings, 0 replies; 14+ messages in thread
From: chenyuan @ 2025-07-23  1:52 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: ast, daniel, andrii, yonghong.song, bpf, linux-kernel, Yuan Chen

You are absolutely right. My initial assumption was incorrect - while endbr32 can technically be
compiled for i386, I've verified in the kernel configuration that X86_KERNEL_IBT explicitly
depends on X86_64:

.config - Linux/i386 6.16.0-rc3 Kernel Configuration
> Search (X86_KERNEL_IBT) > Processor type and features > Search (X86_KERNEL_IBT)
Symbol: X86_KERNEL_IBT [=n]
Type  : bool
Defined at arch/x86/Kconfig:1771
Prompt: Indirect Branch Tracking
Depends on: X86_64 [=n] && CC_HAS_IBT [=y] && HAVE_OBJTOOL [=n] && (!LD_IS_LLD [=n] || LLD_VERSION [=0]>=140000)

This confirms that kernel IBT, the part of CET that matters here, is 64-bit only in the
current implementation. I'll revise the patch to remove the i386 check.
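
For reference, the helper with the i386 case dropped might look like the
sketch below. It only illustrates the planned change and is not the revised
patch itself; uint64_t stands in for __u64 so that it compiles outside the
kernel tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch only: uint64_t replaces __u64 so this builds standalone.
 * The guard covers x86_64 alone, since X86_KERNEL_IBT depends on X86_64.
 */
static bool symbol_matches_target(uint64_t sym_addr, uint64_t target_addr)
{
	if (sym_addr == target_addr)
		return true;

#if defined(__x86_64__)
	/* The kprobe hook sits right after the 4-byte endbr64. */
	if (sym_addr == target_addr - 4)
		return true;
#endif

	return false;
}
```

On an x86_64 build the helper accepts both the exact address and the address
plus 4; on any other build only the exact match is accepted.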

Thanks for catching this!
Best regards,
Yuan Chen



At 2025-07-22 22:23:23, "Quentin Monnet" <qmo@kernel.org> wrote:
>2025-07-22 10:00 UTC+0800 ~ chenyuan_fl@163.com
>> From: Yuan Chen <chenyuan@kylinos.cn>
>> 
>> Adjust symbol matching logic to account for Control-flow Enforcement
>> Technology (CET) on x86/x86_64 systems. CET prefixes functions with
>> a 4-byte 'endbr' instruction, shifting the actual hook entry point to
>> symbol + 4.
>> 
>> Changed in PATCH v4:
>> * Refactor repeated code into a function.
>> * Add detection for the x86 architecture.
>> 
>> Signed-off-by: Yuan Chen <chenyuan@kylinos.cn>
>> ---
>>  tools/bpf/bpftool/link.c | 26 ++++++++++++++++++++++++--
>>  1 file changed, 24 insertions(+), 2 deletions(-)
>> 
>> diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
>> index a773e05d5ade..717ca8c5ff83 100644
>> --- a/tools/bpf/bpftool/link.c
>> +++ b/tools/bpf/bpftool/link.c
>> @@ -282,6 +282,28 @@ get_addr_cookie_array(__u64 *addrs, __u64 *cookies, __u32 count)
>>  	return data;
>>  }
>>  
>> +static bool
>> +symbol_matches_target(__u64 sym_addr, __u64 target_addr)
>> +{
>> +	if (sym_addr == target_addr)
>> +		return true;
>> +
>> +#if defined(__i386__) || defined(__x86_64__)
>
>
>Do you really need it for __i386__ as well? My understanding was that
>CET would apply only to 64-bit?
>
>Thanks,
>Quentin


Thread overview: 14+ messages
2025-06-26  6:11 [PATCH] bpftool: Add CET-aware symbol matching for x86_64 architectures Yuan Chen
2025-06-26  7:11 ` [PATCH v2] " Yuan Chen
2025-06-26  7:49 ` [PATCH v3] " Yuan Chen
2025-06-27 11:08   ` Quentin Monnet
2025-07-11  6:35     ` chenyuan
2025-07-01  2:31   ` Yonghong Song
2025-07-11  7:07     ` chenyuan
2025-07-12  0:47       ` Yonghong Song
2025-07-21 12:51         ` chenyuan
2025-07-21 14:53           ` Yonghong Song
2025-07-22  1:46             ` [PATCH v4] bpftool: Add CET-aware symbol matching for x86/x86_64 architectures chenyuan_fl
2025-07-22  2:00             ` chenyuan_fl
2025-07-22 14:23               ` Quentin Monnet
2025-07-23  1:52                 ` chenyuan
