public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
@ 2025-01-16  6:51 Ethan Zhao
  2025-01-16  7:27 ` Xin Li
  2025-01-16 15:08 ` Nikolay Borisov
  0 siblings, 2 replies; 27+ messages in thread
From: Ethan Zhao @ 2025-01-16  6:51 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: tglx, dave.hansen, x86, hpa, xin, andrew.cooper3, mingo, bp,
	etzhao

External interrupts (EVENT_TYPE_EXTINT) and system calls (EVENT_TYPE_OTHER)
occur more frequently than other events in a typical system. Prioritizing
these events saves CPU cycles and optimizes the efficiency of performance-
critical paths.

When examining the compiler-generated assembly code for event dispatching
in the functions fred_entry_from_user() and fred_entry_from_kernel(), it
was observed that the compiler intelligently uses a binary search to match
all event type values (0-7) and perform dispatching. As a result, even if
the following cases:

	case EVENT_TYPE_EXTINT:
		return fred_extint(regs);
	case EVENT_TYPE_OTHER:
		return fred_other(regs);

are placed at the beginning of the switch() statement, the generated
assembly code would remain the same, and the expected prioritization would
not be achieved.

Command line to check the assembly code generated by the compiler for
fred_entry_from_user():

$objdump -d vmlinux.o | awk '/<fred_entry_from_user>:/{c=65} c&&c--'

00000000000015a0 <fred_entry_from_user>:
15a0:       0f b6 87 a6 00 00 00    movzbl 0xa6(%rdi),%eax
15a7:       48 8b 77 78             mov    0x78(%rdi),%rsi
15ab:       55                      push   %rbp
15ac:       48 c7 47 78 ff ff ff    movq   $0xffffffffffffffff,0x78(%rdi)
15b3:       ff
15b4:       83 e0 0f                and    $0xf,%eax
15b7:       48 89 e5                mov    %rsp,%rbp
15ba:       3c 04                   cmp    $0x4,%al
-->>			            /* match 4(EVENT_TYPE_SWINT) first */
15bc:       74 78                   je     1636 <fred_entry_from_user+0x96>
15be:       77 15                   ja     15d5 <fred_entry_from_user+0x35>
15c0:       3c 02                   cmp    $0x2,%al
15c2:       74 53                   je     1617 <fred_entry_from_user+0x77>
15c4:       77 65                   ja     162b <fred_entry_from_user+0x8b>
15c6:       84 c0                   test   %al,%al
15c8:       75 42                   jne    160c <fred_entry_from_user+0x6c>
15ca:       e8 71 fc ff ff          callq  1240 <fred_extint>
15cf:       5d                      pop    %rbp
15d0:       e9 00 00 00 00          jmpq   15d5 <fred_entry_from_user+0x35>
15d5:       3c 06                   cmp    $0x6,%al
15d7:       74 7c                   je     1655 <fred_entry_from_user+0xb5>
15d9:       72 66                   jb     1641 <fred_entry_from_user+0xa1>
15db:       3c 07                   cmp    $0x7,%al
15dd:       75 2d                   jne    160c <fred_entry_from_user+0x6c>
15df:       8b 87 a4 00 00 00       mov    0xa4(%rdi),%eax
15e5:       25 ff 00 00 02          and    $0x20000ff,%eax
15ea:       3d 01 00 00 02          cmp    $0x2000001,%eax
15ef:       75 6f                   jne    1660 <fred_entry_from_user+0xc0>
15f1:       48 8b 77 50             mov    0x50(%rdi),%rsi
15f5:       48 c7 47 50 da ff ff    movq   $0xffffffffffffffda,0x50(%rdi)
... ...

Command line to check the assembly code generated by the compiler for
fred_entry_from_kernel():

$objdump -d vmlinux.o | awk '/<fred_entry_from_kernel>:/{c=65} c&&c--'

00000000000016b0 <fred_entry_from_kernel>:
16b0:       0f b6 87 a6 00 00 00    movzbl 0xa6(%rdi),%eax
16b7:       48 8b 77 78             mov    0x78(%rdi),%rsi
16bb:       55                      push   %rbp
16bc:       48 c7 47 78 ff ff ff    movq   $0xffffffffffffffff,0x78(%rdi)
16c3:       ff
16c4:       83 e0 0f                and    $0xf,%eax
16c7:       48 89 e5                mov    %rsp,%rbp
16ca:       3c 03                   cmp    $0x3,%al
-->>                                /* match 3(EVENT_TYPE_HWEXC) first */
16cc:       74 3c                 je     170a <fred_entry_from_kernel+0x5a>
16ce:       76 13                 jbe    16e3 <fred_entry_from_kernel+0x33>
16d0:       3c 05                 cmp    $0x5,%al
16d2:       74 41                 je     1715 <fred_entry_from_kernel+0x65>
16d4:       3c 06                 cmp    $0x6,%al
16d6:       75 27                 jne    16ff <fred_entry_from_kernel+0x4f>
16d8:       e8 73 fe ff ff        callq  1550 <fred_swexc.isra.3>
16dd:       5d                    pop    %rbp
... ...

Therefore, it is necessary to handle EVENT_TYPE_EXTINT and EVENT_TYPE_OTHER
before the switch statement using if-else syntax to ensure the compiler
generates the desired code. After applying the patch, the verification
results are as follows:

$objdump -d vmlinux.o | awk '/<fred_entry_from_user>:/{c=65} c&&c--'

00000000000015a0 <fred_entry_from_user>:
15a0:       0f b6 87 a6 00 00 00    movzbl 0xa6(%rdi),%eax
15a7:       48 8b 77 78             mov    0x78(%rdi),%rsi
15ab:       55                      push   %rbp
15ac:       48 c7 47 78 ff ff ff    movq   $0xffffffffffffffff,0x78(%rdi)
15b3:       ff
15b4:       48 89 e5                mov    %rsp,%rbp
15b7:       83 e0 0f                and    $0xf,%eax
15ba:       74 34                   je     15f0 <fred_entry_from_user+0x50>
-->>				    /* match 0(EVENT_TYPE_EXTINT) first */
15bc:       3c 07                   cmp    $0x7,%al
-->>                                /* match 7(EVENT_TYPE_OTHER) second *
15be:       74 6e                   je     162e <fred_entry_from_user+0x8e>
15c0:       3c 04                   cmp    $0x4,%al
15c2:       0f 84 93 00 00 00       je     165b <fred_entry_from_user+0xbb>
15c8:       76 13                   jbe    15dd <fred_entry_from_user+0x3d>
15ca:       3c 05                   cmp    $0x5,%al
15cc:       74 41                   je     160f <fred_entry_from_user+0x6f>
15ce:       3c 06                   cmp    $0x6,%al
15d0:       75 51                   jne    1623 <fred_entry_from_user+0x83>
15d2:       e8 79 ff ff ff          callq  1550 <fred_swexc.isra.3>
15d7:       5d                      pop    %rbp
15d8:       e9 00 00 00 00          jmpq   15dd <fred_entry_from_user+0x3d>
15dd:       3c 02                   cmp    $0x2,%al
15df:       74 1a                   je     15fb <fred_entry_from_user+0x5b>
15e1:       3c 03                   cmp    $0x3,%al
15e3:       75 3e                   jne    1623 <fred_entry_from_user+0x83>
... ...

The same desired code in fred_entry_from_kernel is no longer repeated.

While the C code with if-else placed before switch() may appear ugly, it
works. Additionally, using a jump table is not advisable; even if the jump
table resides in the L1 cache, the cost of loading it is over 10 times the
latency of a cmp instruction.

Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
---
base commit: 619f0b6fad524f08d493a98d55bac9ab8895e3a6
---
 arch/x86/entry/entry_fred.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c
index f004a4dc74c2..591f47771ecf 100644
--- a/arch/x86/entry/entry_fred.c
+++ b/arch/x86/entry/entry_fred.c
@@ -228,9 +228,18 @@ __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
 	/* Invalidate orig_ax so that syscall_get_nr() works correctly */
 	regs->orig_ax = -1;
 
-	switch (regs->fred_ss.type) {
-	case EVENT_TYPE_EXTINT:
+	if (regs->fred_ss.type == EVENT_TYPE_EXTINT)
 		return fred_extint(regs);
+	else if (regs->fred_ss.type == EVENT_TYPE_OTHER)
+		return fred_other(regs);
+
+	/*
+	 * Dispatch EVENT_TYPE_EXTINT and EVENT_TYPE_OTHER(syscall) type events
+	 * first due to their high probability and let the compiler create binary search
+	 * dispatching for the remaining events
+	 */
+
+	switch (regs->fred_ss.type) {
 	case EVENT_TYPE_NMI:
 		if (likely(regs->fred_ss.vector == X86_TRAP_NMI))
 			return fred_exc_nmi(regs);
@@ -245,8 +254,6 @@ __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
 		break;
 	case EVENT_TYPE_SWEXC:
 		return fred_swexc(regs, error_code);
-	case EVENT_TYPE_OTHER:
-		return fred_other(regs);
 	default: break;
 	}
 
@@ -260,9 +267,15 @@ __visible noinstr void fred_entry_from_kernel(struct pt_regs *regs)
 	/* Invalidate orig_ax so that syscall_get_nr() works correctly */
 	regs->orig_ax = -1;
 
-	switch (regs->fred_ss.type) {
-	case EVENT_TYPE_EXTINT:
+	if (regs->fred_ss.type == EVENT_TYPE_EXTINT)
 		return fred_extint(regs);
+
+	/*
+	 * Dispatch EVENT_TYPE_EXTINT type event first due to its high probability
+	 * and let the compiler do binary search dispatching for the other events
+	 */
+
+	switch (regs->fred_ss.type) {
 	case EVENT_TYPE_NMI:
 		if (likely(regs->fred_ss.vector == X86_TRAP_NMI))
 			return fred_exc_nmi(regs);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-16  6:51 [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching Ethan Zhao
@ 2025-01-16  7:27 ` Xin Li
  2025-01-16  9:22   ` Ethan Zhao
  2025-01-16 13:03   ` Ethan Zhao
  2025-01-16 15:08 ` Nikolay Borisov
  1 sibling, 2 replies; 27+ messages in thread
From: Xin Li @ 2025-01-16  7:27 UTC (permalink / raw)
  To: Ethan Zhao, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, hpa, andrew.cooper3, mingo, bp, etzhao

On 1/15/2025 10:51 PM, Ethan Zhao wrote:
> External interrupts (EVENT_TYPE_EXTINT) and system calls (EVENT_TYPE_OTHER)
> occur more frequently than other events in a typical system. Prioritizing
> these events saves CPU cycles and optimizes the efficiency of performance-
> critical paths.

We deliberately hold off sending performance improvement patches at this
point, but first of all please read:
     https://lore.kernel.org/lkml/87fs766o3t.ffs@tglx/

Thanks!
     Xin

> 
> When examining the compiler-generated assembly code for event dispatching
> in the functions fred_entry_from_user() and fred_entry_from_kernel(), it
> was observed that the compiler intelligently uses a binary search to match
> all event type values (0-7) and perform dispatching. As a result, even if
> the following cases:
> 
> 	case EVENT_TYPE_EXTINT:
> 		return fred_extint(regs);
> 	case EVENT_TYPE_OTHER:
> 		return fred_other(regs);
> 
> are placed at the beginning of the switch() statement, the generated
> assembly code would remain the same, and the expected prioritization would
> not be achieved.
> 
> Command line to check the assembly code generated by the compiler for
> fred_entry_from_user():
> 
> $objdump -d vmlinux.o | awk '/<fred_entry_from_user>:/{c=65} c&&c--'
> 
> 00000000000015a0 <fred_entry_from_user>:
> 15a0:       0f b6 87 a6 00 00 00    movzbl 0xa6(%rdi),%eax
> 15a7:       48 8b 77 78             mov    0x78(%rdi),%rsi
> 15ab:       55                      push   %rbp
> 15ac:       48 c7 47 78 ff ff ff    movq   $0xffffffffffffffff,0x78(%rdi)
> 15b3:       ff
> 15b4:       83 e0 0f                and    $0xf,%eax
> 15b7:       48 89 e5                mov    %rsp,%rbp
> 15ba:       3c 04                   cmp    $0x4,%al
> -->>			            /* match 4(EVENT_TYPE_SWINT) first */
> 15bc:       74 78                   je     1636 <fred_entry_from_user+0x96>
> 15be:       77 15                   ja     15d5 <fred_entry_from_user+0x35>
> 15c0:       3c 02                   cmp    $0x2,%al
> 15c2:       74 53                   je     1617 <fred_entry_from_user+0x77>
> 15c4:       77 65                   ja     162b <fred_entry_from_user+0x8b>
> 15c6:       84 c0                   test   %al,%al
> 15c8:       75 42                   jne    160c <fred_entry_from_user+0x6c>
> 15ca:       e8 71 fc ff ff          callq  1240 <fred_extint>
> 15cf:       5d                      pop    %rbp
> 15d0:       e9 00 00 00 00          jmpq   15d5 <fred_entry_from_user+0x35>
> 15d5:       3c 06                   cmp    $0x6,%al
> 15d7:       74 7c                   je     1655 <fred_entry_from_user+0xb5>
> 15d9:       72 66                   jb     1641 <fred_entry_from_user+0xa1>
> 15db:       3c 07                   cmp    $0x7,%al
> 15dd:       75 2d                   jne    160c <fred_entry_from_user+0x6c>
> 15df:       8b 87 a4 00 00 00       mov    0xa4(%rdi),%eax
> 15e5:       25 ff 00 00 02          and    $0x20000ff,%eax
> 15ea:       3d 01 00 00 02          cmp    $0x2000001,%eax
> 15ef:       75 6f                   jne    1660 <fred_entry_from_user+0xc0>
> 15f1:       48 8b 77 50             mov    0x50(%rdi),%rsi
> 15f5:       48 c7 47 50 da ff ff    movq   $0xffffffffffffffda,0x50(%rdi)
> ... ...
> 
> Command line to check the assembly code generated by the compiler for
> fred_entry_from_kernel():
> 
> $objdump -d vmlinux.o | awk '/<fred_entry_from_kernel>:/{c=65} c&&c--'
> 
> 00000000000016b0 <fred_entry_from_kernel>:
> 16b0:       0f b6 87 a6 00 00 00    movzbl 0xa6(%rdi),%eax
> 16b7:       48 8b 77 78             mov    0x78(%rdi),%rsi
> 16bb:       55                      push   %rbp
> 16bc:       48 c7 47 78 ff ff ff    movq   $0xffffffffffffffff,0x78(%rdi)
> 16c3:       ff
> 16c4:       83 e0 0f                and    $0xf,%eax
> 16c7:       48 89 e5                mov    %rsp,%rbp
> 16ca:       3c 03                   cmp    $0x3,%al
> -->>                                /* match 3(EVENT_TYPE_HWEXC) first */
> 16cc:       74 3c                 je     170a <fred_entry_from_kernel+0x5a>
> 16ce:       76 13                 jbe    16e3 <fred_entry_from_kernel+0x33>
> 16d0:       3c 05                 cmp    $0x5,%al
> 16d2:       74 41                 je     1715 <fred_entry_from_kernel+0x65>
> 16d4:       3c 06                 cmp    $0x6,%al
> 16d6:       75 27                 jne    16ff <fred_entry_from_kernel+0x4f>
> 16d8:       e8 73 fe ff ff        callq  1550 <fred_swexc.isra.3>
> 16dd:       5d                    pop    %rbp
> ... ...
> 
> Therefore, it is necessary to handle EVENT_TYPE_EXTINT and EVENT_TYPE_OTHER
> before the switch statement using if-else syntax to ensure the compiler
> generates the desired code. After applying the patch, the verification
> results are as follows:
> 
> $objdump -d vmlinux.o | awk '/<fred_entry_from_user>:/{c=65} c&&c--'
> 
> 00000000000015a0 <fred_entry_from_user>:
> 15a0:       0f b6 87 a6 00 00 00    movzbl 0xa6(%rdi),%eax
> 15a7:       48 8b 77 78             mov    0x78(%rdi),%rsi
> 15ab:       55                      push   %rbp
> 15ac:       48 c7 47 78 ff ff ff    movq   $0xffffffffffffffff,0x78(%rdi)
> 15b3:       ff
> 15b4:       48 89 e5                mov    %rsp,%rbp
> 15b7:       83 e0 0f                and    $0xf,%eax
> 15ba:       74 34                   je     15f0 <fred_entry_from_user+0x50>
> -->>				    /* match 0(EVENT_TYPE_EXTINT) first */
> 15bc:       3c 07                   cmp    $0x7,%al
> -->>                                /* match 7(EVENT_TYPE_OTHER) second *
> 15be:       74 6e                   je     162e <fred_entry_from_user+0x8e>
> 15c0:       3c 04                   cmp    $0x4,%al
> 15c2:       0f 84 93 00 00 00       je     165b <fred_entry_from_user+0xbb>
> 15c8:       76 13                   jbe    15dd <fred_entry_from_user+0x3d>
> 15ca:       3c 05                   cmp    $0x5,%al
> 15cc:       74 41                   je     160f <fred_entry_from_user+0x6f>
> 15ce:       3c 06                   cmp    $0x6,%al
> 15d0:       75 51                   jne    1623 <fred_entry_from_user+0x83>
> 15d2:       e8 79 ff ff ff          callq  1550 <fred_swexc.isra.3>
> 15d7:       5d                      pop    %rbp
> 15d8:       e9 00 00 00 00          jmpq   15dd <fred_entry_from_user+0x3d>
> 15dd:       3c 02                   cmp    $0x2,%al
> 15df:       74 1a                   je     15fb <fred_entry_from_user+0x5b>
> 15e1:       3c 03                   cmp    $0x3,%al
> 15e3:       75 3e                   jne    1623 <fred_entry_from_user+0x83>
> ... ...
> 
> The same desired code in fred_entry_from_kernel is no longer repeated.
> 
> While the C code with if-else placed before switch() may appear ugly, it
> works. Additionally, using a jump table is not advisable; even if the jump
> table resides in the L1 cache, the cost of loading it is over 10 times the
> latency of a cmp instruction.
> 
> Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
> ---
> base commit: 619f0b6fad524f08d493a98d55bac9ab8895e3a6
> ---
>   arch/x86/entry/entry_fred.c | 25 +++++++++++++++++++------
>   1 file changed, 19 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c
> index f004a4dc74c2..591f47771ecf 100644
> --- a/arch/x86/entry/entry_fred.c
> +++ b/arch/x86/entry/entry_fred.c
> @@ -228,9 +228,18 @@ __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
>   	/* Invalidate orig_ax so that syscall_get_nr() works correctly */
>   	regs->orig_ax = -1;
>   
> -	switch (regs->fred_ss.type) {
> -	case EVENT_TYPE_EXTINT:
> +	if (regs->fred_ss.type == EVENT_TYPE_EXTINT)
>   		return fred_extint(regs);
> +	else if (regs->fred_ss.type == EVENT_TYPE_OTHER)
> +		return fred_other(regs);
> +
> +	/*
> +	 * Dispatch EVENT_TYPE_EXTINT and EVENT_TYPE_OTHER(syscall) type events
> +	 * first due to their high probability and let the compiler create binary search
> +	 * dispatching for the remaining events
> +	 */
> +
> +	switch (regs->fred_ss.type) {
>   	case EVENT_TYPE_NMI:
>   		if (likely(regs->fred_ss.vector == X86_TRAP_NMI))
>   			return fred_exc_nmi(regs);
> @@ -245,8 +254,6 @@ __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
>   		break;
>   	case EVENT_TYPE_SWEXC:
>   		return fred_swexc(regs, error_code);
> -	case EVENT_TYPE_OTHER:
> -		return fred_other(regs);
>   	default: break;
>   	}
>   
> @@ -260,9 +267,15 @@ __visible noinstr void fred_entry_from_kernel(struct pt_regs *regs)
>   	/* Invalidate orig_ax so that syscall_get_nr() works correctly */
>   	regs->orig_ax = -1;
>   
> -	switch (regs->fred_ss.type) {
> -	case EVENT_TYPE_EXTINT:
> +	if (regs->fred_ss.type == EVENT_TYPE_EXTINT)
>   		return fred_extint(regs);
> +
> +	/*
> +	 * Dispatch EVENT_TYPE_EXTINT type event first due to its high probability
> +	 * and let the compiler do binary search dispatching for the other events
> +	 */
> +
> +	switch (regs->fred_ss.type) {
>   	case EVENT_TYPE_NMI:
>   		if (likely(regs->fred_ss.vector == X86_TRAP_NMI))
>   			return fred_exc_nmi(regs);


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-16  7:27 ` Xin Li
@ 2025-01-16  9:22   ` Ethan Zhao
  2025-01-16 13:03   ` Ethan Zhao
  1 sibling, 0 replies; 27+ messages in thread
From: Ethan Zhao @ 2025-01-16  9:22 UTC (permalink / raw)
  To: Xin Li, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, hpa, andrew.cooper3, mingo, bp, etzhao


在 2025/1/16 15:27, Xin Li 写道:
> On 1/15/2025 10:51 PM, Ethan Zhao wrote:
>> External interrupts (EVENT_TYPE_EXTINT) and system calls 
>> (EVENT_TYPE_OTHER)
>> occur more frequently than other events in a typical system. 
>> Prioritizing
>> these events saves CPU cycles and optimizes the efficiency of 
>> performance-
>> critical paths.
>
> We deliberately hold off sending performance improvement patches at this
> point, but first of all please read:
>     https://lore.kernel.org/lkml/87fs766o3t.ffs@tglx/

   Got it, seems there is jump table implenmentation in that link page.  
will read

and discuss it -- two years ago, too old to reply that post.


Thanks,

Ethan

>
> Thanks!
>     Xin
>
>>
>> When examining the compiler-generated assembly code for event 
>> dispatching
>> in the functions fred_entry_from_user() and fred_entry_from_kernel(), it
>> was observed that the compiler intelligently uses a binary search to 
>> match
>> all event type values (0-7) and perform dispatching. As a result, 
>> even if
>> the following cases:
>>
>>     case EVENT_TYPE_EXTINT:
>>         return fred_extint(regs);
>>     case EVENT_TYPE_OTHER:
>>         return fred_other(regs);
>>
>> are placed at the beginning of the switch() statement, the generated
>> assembly code would remain the same, and the expected prioritization 
>> would
>> not be achieved.
>>
>> Command line to check the assembly code generated by the compiler for
>> fred_entry_from_user():
>>
>> $objdump -d vmlinux.o | awk '/<fred_entry_from_user>:/{c=65} c&&c--'
>>
>> 00000000000015a0 <fred_entry_from_user>:
>> 15a0:       0f b6 87 a6 00 00 00    movzbl 0xa6(%rdi),%eax
>> 15a7:       48 8b 77 78             mov    0x78(%rdi),%rsi
>> 15ab:       55                      push   %rbp
>> 15ac:       48 c7 47 78 ff ff ff    movq $0xffffffffffffffff,0x78(%rdi)
>> 15b3:       ff
>> 15b4:       83 e0 0f                and    $0xf,%eax
>> 15b7:       48 89 e5                mov    %rsp,%rbp
>> 15ba:       3c 04                   cmp    $0x4,%al
>> -->>                        /* match 4(EVENT_TYPE_SWINT) first */
>> 15bc:       74 78                   je     1636 
>> <fred_entry_from_user+0x96>
>> 15be:       77 15                   ja     15d5 
>> <fred_entry_from_user+0x35>
>> 15c0:       3c 02                   cmp    $0x2,%al
>> 15c2:       74 53                   je     1617 
>> <fred_entry_from_user+0x77>
>> 15c4:       77 65                   ja     162b 
>> <fred_entry_from_user+0x8b>
>> 15c6:       84 c0                   test   %al,%al
>> 15c8:       75 42                   jne    160c 
>> <fred_entry_from_user+0x6c>
>> 15ca:       e8 71 fc ff ff          callq  1240 <fred_extint>
>> 15cf:       5d                      pop    %rbp
>> 15d0:       e9 00 00 00 00          jmpq   15d5 
>> <fred_entry_from_user+0x35>
>> 15d5:       3c 06                   cmp    $0x6,%al
>> 15d7:       74 7c                   je     1655 
>> <fred_entry_from_user+0xb5>
>> 15d9:       72 66                   jb     1641 
>> <fred_entry_from_user+0xa1>
>> 15db:       3c 07                   cmp    $0x7,%al
>> 15dd:       75 2d                   jne    160c 
>> <fred_entry_from_user+0x6c>
>> 15df:       8b 87 a4 00 00 00       mov    0xa4(%rdi),%eax
>> 15e5:       25 ff 00 00 02          and    $0x20000ff,%eax
>> 15ea:       3d 01 00 00 02          cmp    $0x2000001,%eax
>> 15ef:       75 6f                   jne    1660 
>> <fred_entry_from_user+0xc0>
>> 15f1:       48 8b 77 50             mov    0x50(%rdi),%rsi
>> 15f5:       48 c7 47 50 da ff ff    movq $0xffffffffffffffda,0x50(%rdi)
>> ... ...
>>
>> Command line to check the assembly code generated by the compiler for
>> fred_entry_from_kernel():
>>
>> $objdump -d vmlinux.o | awk '/<fred_entry_from_kernel>:/{c=65} c&&c--'
>>
>> 00000000000016b0 <fred_entry_from_kernel>:
>> 16b0:       0f b6 87 a6 00 00 00    movzbl 0xa6(%rdi),%eax
>> 16b7:       48 8b 77 78             mov    0x78(%rdi),%rsi
>> 16bb:       55                      push   %rbp
>> 16bc:       48 c7 47 78 ff ff ff    movq $0xffffffffffffffff,0x78(%rdi)
>> 16c3:       ff
>> 16c4:       83 e0 0f                and    $0xf,%eax
>> 16c7:       48 89 e5                mov    %rsp,%rbp
>> 16ca:       3c 03                   cmp    $0x3,%al
>> -->>                                /* match 3(EVENT_TYPE_HWEXC) 
>> first */
>> 16cc:       74 3c                 je     170a 
>> <fred_entry_from_kernel+0x5a>
>> 16ce:       76 13                 jbe    16e3 
>> <fred_entry_from_kernel+0x33>
>> 16d0:       3c 05                 cmp    $0x5,%al
>> 16d2:       74 41                 je     1715 
>> <fred_entry_from_kernel+0x65>
>> 16d4:       3c 06                 cmp    $0x6,%al
>> 16d6:       75 27                 jne    16ff 
>> <fred_entry_from_kernel+0x4f>
>> 16d8:       e8 73 fe ff ff        callq  1550 <fred_swexc.isra.3>
>> 16dd:       5d                    pop    %rbp
>> ... ...
>>
>> Therefore, it is necessary to handle EVENT_TYPE_EXTINT and 
>> EVENT_TYPE_OTHER
>> before the switch statement using if-else syntax to ensure the compiler
>> generates the desired code. After applying the patch, the verification
>> results are as follows:
>>
>> $objdump -d vmlinux.o | awk '/<fred_entry_from_user>:/{c=65} c&&c--'
>>
>> 00000000000015a0 <fred_entry_from_user>:
>> 15a0:       0f b6 87 a6 00 00 00    movzbl 0xa6(%rdi),%eax
>> 15a7:       48 8b 77 78             mov    0x78(%rdi),%rsi
>> 15ab:       55                      push   %rbp
>> 15ac:       48 c7 47 78 ff ff ff    movq $0xffffffffffffffff,0x78(%rdi)
>> 15b3:       ff
>> 15b4:       48 89 e5                mov    %rsp,%rbp
>> 15b7:       83 e0 0f                and    $0xf,%eax
>> 15ba:       74 34                   je     15f0 
>> <fred_entry_from_user+0x50>
>> -->>                    /* match 0(EVENT_TYPE_EXTINT) first */
>> 15bc:       3c 07                   cmp    $0x7,%al
>> -->>                                /* match 7(EVENT_TYPE_OTHER) 
>> second *
>> 15be:       74 6e                   je     162e 
>> <fred_entry_from_user+0x8e>
>> 15c0:       3c 04                   cmp    $0x4,%al
>> 15c2:       0f 84 93 00 00 00       je     165b 
>> <fred_entry_from_user+0xbb>
>> 15c8:       76 13                   jbe    15dd 
>> <fred_entry_from_user+0x3d>
>> 15ca:       3c 05                   cmp    $0x5,%al
>> 15cc:       74 41                   je     160f 
>> <fred_entry_from_user+0x6f>
>> 15ce:       3c 06                   cmp    $0x6,%al
>> 15d0:       75 51                   jne    1623 
>> <fred_entry_from_user+0x83>
>> 15d2:       e8 79 ff ff ff          callq  1550 <fred_swexc.isra.3>
>> 15d7:       5d                      pop    %rbp
>> 15d8:       e9 00 00 00 00          jmpq   15dd 
>> <fred_entry_from_user+0x3d>
>> 15dd:       3c 02                   cmp    $0x2,%al
>> 15df:       74 1a                   je     15fb 
>> <fred_entry_from_user+0x5b>
>> 15e1:       3c 03                   cmp    $0x3,%al
>> 15e3:       75 3e                   jne    1623 
>> <fred_entry_from_user+0x83>
>> ... ...
>>
>> The same desired code in fred_entry_from_kernel is no longer repeated.
>>
>> While the C code with if-else placed before switch() may appear ugly, it
>> works. Additionally, using a jump table is not advisable; even if the 
>> jump
>> table resides in the L1 cache, the cost of loading it is over 10 
>> times the
>> latency of a cmp instruction.
>>
>> Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
>> ---
>> base commit: 619f0b6fad524f08d493a98d55bac9ab8895e3a6
>> ---
>>   arch/x86/entry/entry_fred.c | 25 +++++++++++++++++++------
>>   1 file changed, 19 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c
>> index f004a4dc74c2..591f47771ecf 100644
>> --- a/arch/x86/entry/entry_fred.c
>> +++ b/arch/x86/entry/entry_fred.c
>> @@ -228,9 +228,18 @@ __visible noinstr void 
>> fred_entry_from_user(struct pt_regs *regs)
>>       /* Invalidate orig_ax so that syscall_get_nr() works correctly */
>>       regs->orig_ax = -1;
>>   -    switch (regs->fred_ss.type) {
>> -    case EVENT_TYPE_EXTINT:
>> +    if (regs->fred_ss.type == EVENT_TYPE_EXTINT)
>>           return fred_extint(regs);
>> +    else if (regs->fred_ss.type == EVENT_TYPE_OTHER)
>> +        return fred_other(regs);
>> +
>> +    /*
>> +     * Dispatch EVENT_TYPE_EXTINT and EVENT_TYPE_OTHER(syscall) type 
>> events
>> +     * first due to their high probability and let the compiler 
>> create binary search
>> +     * dispatching for the remaining events
>> +     */
>> +
>> +    switch (regs->fred_ss.type) {
>>       case EVENT_TYPE_NMI:
>>           if (likely(regs->fred_ss.vector == X86_TRAP_NMI))
>>               return fred_exc_nmi(regs);
>> @@ -245,8 +254,6 @@ __visible noinstr void 
>> fred_entry_from_user(struct pt_regs *regs)
>>           break;
>>       case EVENT_TYPE_SWEXC:
>>           return fred_swexc(regs, error_code);
>> -    case EVENT_TYPE_OTHER:
>> -        return fred_other(regs);
>>       default: break;
>>       }
>>   @@ -260,9 +267,15 @@ __visible noinstr void 
>> fred_entry_from_kernel(struct pt_regs *regs)
>>       /* Invalidate orig_ax so that syscall_get_nr() works correctly */
>>       regs->orig_ax = -1;
>>   -    switch (regs->fred_ss.type) {
>> -    case EVENT_TYPE_EXTINT:
>> +    if (regs->fred_ss.type == EVENT_TYPE_EXTINT)
>>           return fred_extint(regs);
>> +
>> +    /*
>> +     * Dispatch EVENT_TYPE_EXTINT type event first due to its high 
>> probability
>> +     * and let the compiler do binary search dispatching for the 
>> other events
>> +     */
>> +
>> +    switch (regs->fred_ss.type) {
>>       case EVENT_TYPE_NMI:
>>           if (likely(regs->fred_ss.vector == X86_TRAP_NMI))
>>               return fred_exc_nmi(regs);
>
-- 
"firm, enduring, strong, and long-lived"


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-16  7:27 ` Xin Li
  2025-01-16  9:22   ` Ethan Zhao
@ 2025-01-16 13:03   ` Ethan Zhao
  2025-01-16 13:48     ` Xin Li
  1 sibling, 1 reply; 27+ messages in thread
From: Ethan Zhao @ 2025-01-16 13:03 UTC (permalink / raw)
  To: Xin Li, Ethan Zhao, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, hpa, andrew.cooper3, mingo, bp


On 1/16/2025 3:27 PM, Xin Li wrote:
> On 1/15/2025 10:51 PM, Ethan Zhao wrote:
>> External interrupts (EVENT_TYPE_EXTINT) and system calls 
>> (EVENT_TYPE_OTHER)
>> occur more frequently than other events in a typical system. 
>> Prioritizing
>> these events saves CPU cycles and optimizes the efficiency of 
>> performance-
>> critical paths.
>
> We deliberately hold off sending performance improvement patches at this
> point, but first of all please read:
>     https://lore.kernel.org/lkml/87fs766o3t.ffs@tglx/

Do you mean Thomas'point "Premature optimization is the enemy of correctness.
don't do it" ? Yep, I agree with it.

Compared to the asm code generated by the compiler with fewer than 20 cmp
instructions, placing a event handler jump table indexed by 
event-typethere is a negative optimization. That is not deliberately 'hold off', correctness goes first, definitely right.

Thanks,

Ethan

> Thanks!
>     Xin
>
>>
>> When examining the compiler-generated assembly code for event 
>> dispatching
>> in the functions fred_entry_from_user() and fred_entry_from_kernel(), it
>> was observed that the compiler intelligently uses a binary search to 
>> match
>> all event type values (0-7) and perform dispatching. As a result, 
>> even if
>> the following cases:
>>
>>     case EVENT_TYPE_EXTINT:
>>         return fred_extint(regs);
>>     case EVENT_TYPE_OTHER:
>>         return fred_other(regs);
>>
>> are placed at the beginning of the switch() statement, the generated
>> assembly code would remain the same, and the expected prioritization 
>> would
>> not be achieved.
>>
>> Command line to check the assembly code generated by the compiler for
>> fred_entry_from_user():
>>
>> $objdump -d vmlinux.o | awk '/<fred_entry_from_user>:/{c=65} c&&c--'
>>
>> 00000000000015a0 <fred_entry_from_user>:
>> 15a0:       0f b6 87 a6 00 00 00    movzbl 0xa6(%rdi),%eax
>> 15a7:       48 8b 77 78             mov    0x78(%rdi),%rsi
>> 15ab:       55                      push   %rbp
>> 15ac:       48 c7 47 78 ff ff ff    movq $0xffffffffffffffff,0x78(%rdi)
>> 15b3:       ff
>> 15b4:       83 e0 0f                and    $0xf,%eax
>> 15b7:       48 89 e5                mov    %rsp,%rbp
>> 15ba:       3c 04                   cmp    $0x4,%al
>> -->>                        /* match 4(EVENT_TYPE_SWINT) first */
>> 15bc:       74 78                   je     1636 
>> <fred_entry_from_user+0x96>
>> 15be:       77 15                   ja     15d5 
>> <fred_entry_from_user+0x35>
>> 15c0:       3c 02                   cmp    $0x2,%al
>> 15c2:       74 53                   je     1617 
>> <fred_entry_from_user+0x77>
>> 15c4:       77 65                   ja     162b 
>> <fred_entry_from_user+0x8b>
>> 15c6:       84 c0                   test   %al,%al
>> 15c8:       75 42                   jne    160c 
>> <fred_entry_from_user+0x6c>
>> 15ca:       e8 71 fc ff ff          callq  1240 <fred_extint>
>> 15cf:       5d                      pop    %rbp
>> 15d0:       e9 00 00 00 00          jmpq   15d5 
>> <fred_entry_from_user+0x35>
>> 15d5:       3c 06                   cmp    $0x6,%al
>> 15d7:       74 7c                   je     1655 
>> <fred_entry_from_user+0xb5>
>> 15d9:       72 66                   jb     1641 
>> <fred_entry_from_user+0xa1>
>> 15db:       3c 07                   cmp    $0x7,%al
>> 15dd:       75 2d                   jne    160c 
>> <fred_entry_from_user+0x6c>
>> 15df:       8b 87 a4 00 00 00       mov    0xa4(%rdi),%eax
>> 15e5:       25 ff 00 00 02          and    $0x20000ff,%eax
>> 15ea:       3d 01 00 00 02          cmp    $0x2000001,%eax
>> 15ef:       75 6f                   jne    1660 
>> <fred_entry_from_user+0xc0>
>> 15f1:       48 8b 77 50             mov    0x50(%rdi),%rsi
>> 15f5:       48 c7 47 50 da ff ff    movq $0xffffffffffffffda,0x50(%rdi)
>> ... ...
>>
>> Command line to check the assembly code generated by the compiler for
>> fred_entry_from_kernel():
>>
>> $objdump -d vmlinux.o | awk '/<fred_entry_from_kernel>:/{c=65} c&&c--'
>>
>> 00000000000016b0 <fred_entry_from_kernel>:
>> 16b0:       0f b6 87 a6 00 00 00    movzbl 0xa6(%rdi),%eax
>> 16b7:       48 8b 77 78             mov    0x78(%rdi),%rsi
>> 16bb:       55                      push   %rbp
>> 16bc:       48 c7 47 78 ff ff ff    movq $0xffffffffffffffff,0x78(%rdi)
>> 16c3:       ff
>> 16c4:       83 e0 0f                and    $0xf,%eax
>> 16c7:       48 89 e5                mov    %rsp,%rbp
>> 16ca:       3c 03                   cmp    $0x3,%al
>> -->>                                /* match 3(EVENT_TYPE_HWEXC) 
>> first */
>> 16cc:       74 3c                 je     170a 
>> <fred_entry_from_kernel+0x5a>
>> 16ce:       76 13                 jbe    16e3 
>> <fred_entry_from_kernel+0x33>
>> 16d0:       3c 05                 cmp    $0x5,%al
>> 16d2:       74 41                 je     1715 
>> <fred_entry_from_kernel+0x65>
>> 16d4:       3c 06                 cmp    $0x6,%al
>> 16d6:       75 27                 jne    16ff 
>> <fred_entry_from_kernel+0x4f>
>> 16d8:       e8 73 fe ff ff        callq  1550 <fred_swexc.isra.3>
>> 16dd:       5d                    pop    %rbp
>> ... ...
>>
>> Therefore, it is necessary to handle EVENT_TYPE_EXTINT and 
>> EVENT_TYPE_OTHER
>> before the switch statement using if-else syntax to ensure the compiler
>> generates the desired code. After applying the patch, the verification
>> results are as follows:
>>
>> $objdump -d vmlinux.o | awk '/<fred_entry_from_user>:/{c=65} c&&c--'
>>
>> 00000000000015a0 <fred_entry_from_user>:
>> 15a0:       0f b6 87 a6 00 00 00    movzbl 0xa6(%rdi),%eax
>> 15a7:       48 8b 77 78             mov    0x78(%rdi),%rsi
>> 15ab:       55                      push   %rbp
>> 15ac:       48 c7 47 78 ff ff ff    movq $0xffffffffffffffff,0x78(%rdi)
>> 15b3:       ff
>> 15b4:       48 89 e5                mov    %rsp,%rbp
>> 15b7:       83 e0 0f                and    $0xf,%eax
>> 15ba:       74 34                   je     15f0 
>> <fred_entry_from_user+0x50>
>> -->>                    /* match 0(EVENT_TYPE_EXTINT) first */
>> 15bc:       3c 07                   cmp    $0x7,%al
>> -->>                                /* match 7(EVENT_TYPE_OTHER) 
>> second *
>> 15be:       74 6e                   je     162e 
>> <fred_entry_from_user+0x8e>
>> 15c0:       3c 04                   cmp    $0x4,%al
>> 15c2:       0f 84 93 00 00 00       je     165b 
>> <fred_entry_from_user+0xbb>
>> 15c8:       76 13                   jbe    15dd 
>> <fred_entry_from_user+0x3d>
>> 15ca:       3c 05                   cmp    $0x5,%al
>> 15cc:       74 41                   je     160f 
>> <fred_entry_from_user+0x6f>
>> 15ce:       3c 06                   cmp    $0x6,%al
>> 15d0:       75 51                   jne    1623 
>> <fred_entry_from_user+0x83>
>> 15d2:       e8 79 ff ff ff          callq  1550 <fred_swexc.isra.3>
>> 15d7:       5d                      pop    %rbp
>> 15d8:       e9 00 00 00 00          jmpq   15dd 
>> <fred_entry_from_user+0x3d>
>> 15dd:       3c 02                   cmp    $0x2,%al
>> 15df:       74 1a                   je     15fb 
>> <fred_entry_from_user+0x5b>
>> 15e1:       3c 03                   cmp    $0x3,%al
>> 15e3:       75 3e                   jne    1623 
>> <fred_entry_from_user+0x83>
>> ... ...
>>
>> The same desired code in fred_entry_from_kernel is no longer repeated.
>>
>> While the C code with if-else placed before switch() may appear ugly, it
>> works. Additionally, using a jump table is not advisable; even if the 
>> jump
>> table resides in the L1 cache, the cost of loading it is over 10 
>> times the
>> latency of a cmp instruction.
>>
>> Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
>> ---
>> base commit: 619f0b6fad524f08d493a98d55bac9ab8895e3a6
>> ---
>>   arch/x86/entry/entry_fred.c | 25 +++++++++++++++++++------
>>   1 file changed, 19 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c
>> index f004a4dc74c2..591f47771ecf 100644
>> --- a/arch/x86/entry/entry_fred.c
>> +++ b/arch/x86/entry/entry_fred.c
>> @@ -228,9 +228,18 @@ __visible noinstr void 
>> fred_entry_from_user(struct pt_regs *regs)
>>       /* Invalidate orig_ax so that syscall_get_nr() works correctly */
>>       regs->orig_ax = -1;
>>   -    switch (regs->fred_ss.type) {
>> -    case EVENT_TYPE_EXTINT:
>> +    if (regs->fred_ss.type == EVENT_TYPE_EXTINT)
>>           return fred_extint(regs);
>> +    else if (regs->fred_ss.type == EVENT_TYPE_OTHER)
>> +        return fred_other(regs);
>> +
>> +    /*
>> +     * Dispatch EVENT_TYPE_EXTINT and EVENT_TYPE_OTHER(syscall) type 
>> events
>> +     * first due to their high probability and let the compiler 
>> create binary search
>> +     * dispatching for the remaining events
>> +     */
>> +
>> +    switch (regs->fred_ss.type) {
>>       case EVENT_TYPE_NMI:
>>           if (likely(regs->fred_ss.vector == X86_TRAP_NMI))
>>               return fred_exc_nmi(regs);
>> @@ -245,8 +254,6 @@ __visible noinstr void 
>> fred_entry_from_user(struct pt_regs *regs)
>>           break;
>>       case EVENT_TYPE_SWEXC:
>>           return fred_swexc(regs, error_code);
>> -    case EVENT_TYPE_OTHER:
>> -        return fred_other(regs);
>>       default: break;
>>       }
>>   @@ -260,9 +267,15 @@ __visible noinstr void 
>> fred_entry_from_kernel(struct pt_regs *regs)
>>       /* Invalidate orig_ax so that syscall_get_nr() works correctly */
>>       regs->orig_ax = -1;
>>   -    switch (regs->fred_ss.type) {
>> -    case EVENT_TYPE_EXTINT:
>> +    if (regs->fred_ss.type == EVENT_TYPE_EXTINT)
>>           return fred_extint(regs);
>> +
>> +    /*
>> +     * Dispatch EVENT_TYPE_EXTINT type event first due to its high 
>> probability
>> +     * and let the compiler do binary search dispatching for the 
>> other events
>> +     */
>> +
>> +    switch (regs->fred_ss.type) {
>>       case EVENT_TYPE_NMI:
>>           if (likely(regs->fred_ss.vector == X86_TRAP_NMI))
>>               return fred_exc_nmi(regs);
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-16 13:03   ` Ethan Zhao
@ 2025-01-16 13:48     ` Xin Li
  2025-01-17  0:37       ` Ethan Zhao
  0 siblings, 1 reply; 27+ messages in thread
From: Xin Li @ 2025-01-16 13:48 UTC (permalink / raw)
  To: Ethan Zhao, Ethan Zhao, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, hpa, andrew.cooper3, mingo, bp

On 1/16/2025 5:03 AM, Ethan Zhao wrote:
> On 1/16/2025 3:27 PM, Xin Li wrote:
>> On 1/15/2025 10:51 PM, Ethan Zhao wrote:
>>> External interrupts (EVENT_TYPE_EXTINT) and system calls 
>>> (EVENT_TYPE_OTHER)
>>> occur more frequently than other events in a typical system. 
>>> Prioritizing
>>> these events saves CPU cycles and optimizes the efficiency of 
>>> performance-
>>> critical paths.
>>
>> We deliberately hold off sending performance improvement patches at this
>> point, but first of all please read:
>> https://lore.kernel.org/lkml/87fs766o3t.ffs@tglx/
> 
> Do you mean Thomas'point "Premature optimization is the enemy of 
> correctness.
> don't do it" ? Yep, I agree with it.

Yes.

> 
> Compared to the asm code generated by the compiler with fewer than 20 cmp
> instructions, placing a event handler jump table indexed by event- 
> typethere is a negative optimization. That is not deliberately 'hold 
> off', correctness goes first, definitely right.

hpa suggested to introduce "switch_likely" for this kind of optimization
on a switch statement, which is also easier to read.  I measured it with
a user space focus test, it does improve performance a lot.  But 
obviously there are still a lot of work to do.

Thanks!
     Xin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-16  6:51 [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching Ethan Zhao
  2025-01-16  7:27 ` Xin Li
@ 2025-01-16 15:08 ` Nikolay Borisov
  2025-01-16 18:45   ` Xin Li
  2025-01-17  0:40   ` Ethan Zhao
  1 sibling, 2 replies; 27+ messages in thread
From: Nikolay Borisov @ 2025-01-16 15:08 UTC (permalink / raw)
  To: Ethan Zhao, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, hpa, xin, andrew.cooper3, mingo, bp,
	etzhao



On 16.01.25 г. 8:51 ч., Ethan Zhao wrote:
> External interrupts (EVENT_TYPE_EXTINT) and system calls (EVENT_TYPE_OTHER)
> occur more frequently than other events in a typical system. Prioritizing
> these events saves CPU cycles and optimizes the efficiency of performance-
> critical paths.
> 
<snip>

Can you also include some performance numbers?

> Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
> ---
> base commit: 619f0b6fad524f08d493a98d55bac9ab8895e3a6
> ---
>   arch/x86/entry/entry_fred.c | 25 +++++++++++++++++++------
>   1 file changed, 19 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c
> index f004a4dc74c2..591f47771ecf 100644
> --- a/arch/x86/entry/entry_fred.c
> +++ b/arch/x86/entry/entry_fred.c
> @@ -228,9 +228,18 @@ __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
>   	/* Invalidate orig_ax so that syscall_get_nr() works correctly */
>   	regs->orig_ax = -1;
>   
> -	switch (regs->fred_ss.type) {
> -	case EVENT_TYPE_EXTINT:
> +	if (regs->fred_ss.type == EVENT_TYPE_EXTINT)
>   		return fred_extint(regs);
> +	else if (regs->fred_ss.type == EVENT_TYPE_OTHER)
> +		return fred_other(regs);
> +
> +	/*
> +	 * Dispatch EVENT_TYPE_EXTINT and EVENT_TYPE_OTHER(syscall) type events
> +	 * first due to their high probability and let the compiler create binary search
> +	 * dispatching for the remaining events
> +	 */

nit: At least to me it makes sense to have the comment above the 'if' so 
that the flow is linear.

> +
> +	switch (regs->fred_ss.type) {
>   	case EVENT_TYPE_NMI:
>   		if (likely(regs->fred_ss.vector == X86_TRAP_NMI))
>   			return fred_exc_nmi(regs);
> @@ -245,8 +254,6 @@ __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
>   		break;
>   	case EVENT_TYPE_SWEXC:
>   		return fred_swexc(regs, error_code);
> -	case EVENT_TYPE_OTHER:
> -		return fred_other(regs);
>   	default: break;
>   	}
>   
> @@ -260,9 +267,15 @@ __visible noinstr void fred_entry_from_kernel(struct pt_regs *regs)
>   	/* Invalidate orig_ax so that syscall_get_nr() works correctly */
>   	regs->orig_ax = -1;
>   
> -	switch (regs->fred_ss.type) {
> -	case EVENT_TYPE_EXTINT:
> +	if (regs->fred_ss.type == EVENT_TYPE_EXTINT)
>   		return fred_extint(regs);
> +
> +	/*
> +	 * Dispatch EVENT_TYPE_EXTINT type event first due to its high probability
> +	 * and let the compiler do binary search dispatching for the other events
> +	 */
> +
> +	switch (regs->fred_ss.type) {
>   	case EVENT_TYPE_NMI:
>   		if (likely(regs->fred_ss.vector == X86_TRAP_NMI))
>   			return fred_exc_nmi(regs);


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-16 15:08 ` Nikolay Borisov
@ 2025-01-16 18:45   ` Xin Li
  2025-01-17  0:40   ` Ethan Zhao
  1 sibling, 0 replies; 27+ messages in thread
From: Xin Li @ 2025-01-16 18:45 UTC (permalink / raw)
  To: Nikolay Borisov, Ethan Zhao, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, hpa, andrew.cooper3, mingo, bp, etzhao

On 1/16/2025 7:08 AM, Nikolay Borisov wrote:
> 
> 
> On 16.01.25 г. 8:51 ч., Ethan Zhao wrote:
>> External interrupts (EVENT_TYPE_EXTINT) and system calls 
>> (EVENT_TYPE_OTHER)
>> occur more frequently than other events in a typical system. Prioritizing
>> these events saves CPU cycles and optimizes the efficiency of 
>> performance-
>> critical paths.
>>
> <snip>
> 
> Can you also include some performance numbers?

not a good time yet :)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-16 13:48     ` Xin Li
@ 2025-01-17  0:37       ` Ethan Zhao
  2025-01-17  1:21         ` H. Peter Anvin
  0 siblings, 1 reply; 27+ messages in thread
From: Ethan Zhao @ 2025-01-17  0:37 UTC (permalink / raw)
  To: Xin Li, Ethan Zhao, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, hpa, andrew.cooper3, mingo, bp


On 1/16/2025 9:48 PM, Xin Li wrote:
> On 1/16/2025 5:03 AM, Ethan Zhao wrote:
>> On 1/16/2025 3:27 PM, Xin Li wrote:
>>> On 1/15/2025 10:51 PM, Ethan Zhao wrote:
>>>> External interrupts (EVENT_TYPE_EXTINT) and system calls 
>>>> (EVENT_TYPE_OTHER)
>>>> occur more frequently than other events in a typical system. 
>>>> Prioritizing
>>>> these events saves CPU cycles and optimizes the efficiency of 
>>>> performance-
>>>> critical paths.
>>>
>>> We deliberately hold off sending performance improvement patches at 
>>> this
>>> point, but first of all please read:
>>> https://lore.kernel.org/lkml/87fs766o3t.ffs@tglx/
>>
>> Do you mean Thomas'point "Premature optimization is the enemy of 
>> correctness.
>> don't do it" ? Yep, I agree with it.
>
> Yes.
>
>>
>> Compared to the asm code generated by the compiler with fewer than 20 
>> cmp
>> instructions, placing a event handler jump table indexed by event- 
>> typethere is a negative optimization. That is not deliberately 'hold 
>> off', correctness goes first, definitely right.
>
> hpa suggested to introduce "switch_likely" for this kind of optimization
> on a switch statement, which is also easier to read.  I measured it with
> a user space focus test, it does improve performance a lot.  But 
> obviously there are still a lot of work to do.

Find a way to instruct compiler to pick the right hot branch meanwhile make folks
reading happy... yup, a lot of work.

Thanks,
Ethan

>
> Thanks!
>     Xin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-16 15:08 ` Nikolay Borisov
  2025-01-16 18:45   ` Xin Li
@ 2025-01-17  0:40   ` Ethan Zhao
  1 sibling, 0 replies; 27+ messages in thread
From: Ethan Zhao @ 2025-01-17  0:40 UTC (permalink / raw)
  To: Nikolay Borisov, Ethan Zhao, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, hpa, xin, andrew.cooper3, mingo, bp


On 1/16/2025 11:08 PM, Nikolay Borisov wrote:
>
>
> On 16.01.25 г. 8:51 ч., Ethan Zhao wrote:
>> External interrupts (EVENT_TYPE_EXTINT) and system calls 
>> (EVENT_TYPE_OTHER)
>> occur more frequently than other events in a typical system. 
>> Prioritizing
>> these events saves CPU cycles and optimizes the efficiency of 
>> performance-
>> critical paths.
>>
> <snip>
>
> Can you also include some performance numbers?

Of coz will do when timing is good :)

Thanks,
Ethan

>
>> Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
>> ---
>> base commit: 619f0b6fad524f08d493a98d55bac9ab8895e3a6
>> ---
>>   arch/x86/entry/entry_fred.c | 25 +++++++++++++++++++------
>>   1 file changed, 19 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c
>> index f004a4dc74c2..591f47771ecf 100644
>> --- a/arch/x86/entry/entry_fred.c
>> +++ b/arch/x86/entry/entry_fred.c
>> @@ -228,9 +228,18 @@ __visible noinstr void 
>> fred_entry_from_user(struct pt_regs *regs)
>>       /* Invalidate orig_ax so that syscall_get_nr() works correctly */
>>       regs->orig_ax = -1;
>>   -    switch (regs->fred_ss.type) {
>> -    case EVENT_TYPE_EXTINT:
>> +    if (regs->fred_ss.type == EVENT_TYPE_EXTINT)
>>           return fred_extint(regs);
>> +    else if (regs->fred_ss.type == EVENT_TYPE_OTHER)
>> +        return fred_other(regs);
>> +
>> +    /*
>> +     * Dispatch EVENT_TYPE_EXTINT and EVENT_TYPE_OTHER(syscall) type 
>> events
>> +     * first due to their high probability and let the compiler 
>> create binary search
>> +     * dispatching for the remaining events
>> +     */
>
> nit: At least to me it makes sense to have the comment above the 'if' 
> so that the flow is linear.
>
>> +
>> +    switch (regs->fred_ss.type) {
>>       case EVENT_TYPE_NMI:
>>           if (likely(regs->fred_ss.vector == X86_TRAP_NMI))
>>               return fred_exc_nmi(regs);
>> @@ -245,8 +254,6 @@ __visible noinstr void 
>> fred_entry_from_user(struct pt_regs *regs)
>>           break;
>>       case EVENT_TYPE_SWEXC:
>>           return fred_swexc(regs, error_code);
>> -    case EVENT_TYPE_OTHER:
>> -        return fred_other(regs);
>>       default: break;
>>       }
>>   @@ -260,9 +267,15 @@ __visible noinstr void 
>> fred_entry_from_kernel(struct pt_regs *regs)
>>       /* Invalidate orig_ax so that syscall_get_nr() works correctly */
>>       regs->orig_ax = -1;
>>   -    switch (regs->fred_ss.type) {
>> -    case EVENT_TYPE_EXTINT:
>> +    if (regs->fred_ss.type == EVENT_TYPE_EXTINT)
>>           return fred_extint(regs);
>> +
>> +    /*
>> +     * Dispatch EVENT_TYPE_EXTINT type event first due to its high 
>> probability
>> +     * and let the compiler do binary search dispatching for the 
>> other events
>> +     */
>> +
>> +    switch (regs->fred_ss.type) {
>>       case EVENT_TYPE_NMI:
>>           if (likely(regs->fred_ss.vector == X86_TRAP_NMI))
>>               return fred_exc_nmi(regs);
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-17  0:37       ` Ethan Zhao
@ 2025-01-17  1:21         ` H. Peter Anvin
  2025-01-17  3:04           ` Ethan Zhao
  2025-01-17  4:19           ` Ethan Zhao
  0 siblings, 2 replies; 27+ messages in thread
From: H. Peter Anvin @ 2025-01-17  1:21 UTC (permalink / raw)
  To: Ethan Zhao, Xin Li, Ethan Zhao, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp

On 1/16/25 16:37, Ethan Zhao wrote:
>>
>> hpa suggested to introduce "switch_likely" for this kind of optimization
>> on a switch statement, which is also easier to read.  I measured it with
>> a user space focus test, it does improve performance a lot.  But 
>> obviously there are still a lot of work to do.
> 
> Find a way to instruct compiler to pick the right hot branch meanwhile 
> make folks
> reading happy... yup, a lot of work.
> 

It's not that complicated, believe it or not.

/*
  * switch(v) biased for speed in the case v == l
  *
  * Note: gcc is quite sensitive to the exact form of this
  * expression.
  */
#define switch_likely(v,l) \
	switch((__typeof__(v))__builtin_expect((v),(l)))

	-hpa


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-17  1:21         ` H. Peter Anvin
@ 2025-01-17  3:04           ` Ethan Zhao
  2025-01-17  4:19           ` Ethan Zhao
  1 sibling, 0 replies; 27+ messages in thread
From: Ethan Zhao @ 2025-01-17  3:04 UTC (permalink / raw)
  To: H. Peter Anvin, Ethan Zhao, Xin Li, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp


在 2025/1/17 9:21, H. Peter Anvin 写道:
> On 1/16/25 16:37, Ethan Zhao wrote:
>>>
>>> hpa suggested to introduce "switch_likely" for this kind of 
>>> optimization
>>> on a switch statement, which is also easier to read.  I measured it 
>>> with
>>> a user space focus test, it does improve performance a lot. But 
>>> obviously there are still a lot of work to do.
>>
>> Find a way to instruct compiler to pick the right hot branch 
>> meanwhile make folks
>> reading happy... yup, a lot of work.
>>
>
> It's not that complicated, believe it or not.
>
> /*
>  * switch(v) biased for speed in the case v == l
>  *
>  * Note: gcc is quite sensitive to the exact form of this
>  * expression.
>  */
> #define switch_likely(v,l) \
>     switch((__typeof__(v))__builtin_expect((v),(l)))
>
I know we could play such trick for one branch, never think of there are 
2-3 branches
are expected among 7 branches. :) , --- external interrupts, syscall, 
page fault.

Thanks,
Ethan
>     -hpa
>
-- 
"firm, enduring, strong, and long-lived"


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-17  1:21         ` H. Peter Anvin
  2025-01-17  3:04           ` Ethan Zhao
@ 2025-01-17  4:19           ` Ethan Zhao
  2025-01-17  4:43             ` Xin Li
  1 sibling, 1 reply; 27+ messages in thread
From: Ethan Zhao @ 2025-01-17  4:19 UTC (permalink / raw)
  To: H. Peter Anvin, Ethan Zhao, Xin Li, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp


在 2025/1/17 9:21, H. Peter Anvin 写道:
> On 1/16/25 16:37, Ethan Zhao wrote:
>>>
>>> hpa suggested to introduce "switch_likely" for this kind of 
>>> optimization
>>> on a switch statement, which is also easier to read.  I measured it 
>>> with
>>> a user space focus test, it does improve performance a lot. But 
>>> obviously there are still a lot of work to do.
>>
>> Find a way to instruct compiler to pick the right hot branch 
>> meanwhile make folks
>> reading happy... yup, a lot of work.
>>
>
> It's not that complicated, believe it or not.
>
> /*
>  * switch(v) biased for speed in the case v == l
>  *
>  * Note: gcc is quite sensitive to the exact form of this
>  * expression.
>  */
> #define switch_likely(v,l) \
>     switch((__typeof__(v))__builtin_expect((v),(l)))

I tried this macro as following, but got something really *weird* from gcc.

+#define switch_likely(v,l) \
+        switch((__typeof__(v))__builtin_expect((v),(l)))
+
  __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
  {
         unsigned long error_code = regs->orig_ax;
+       unsigned short etype = regs->fred_ss.type & 0xf;

         /* Invalidate orig_ax so that syscall_get_nr() works correctly */
         regs->orig_ax = -1;

-       switch (regs->fred_ss.type) {
+       switch_likely ((etype == EVENT_TYPE_EXTINT || etype == 
EVENT_TYPE_OTHER), etype) {
         case EVENT_TYPE_EXTINT:
                 return fred_extint(regs);
         case EVENT_TYPE_NMI:
@@ -256,11 +260,12 @@ __visible noinstr void fred_entry_from_user(struct 
pt_regs *regs)
  __visible noinstr void fred_entry_from_kernel(struct pt_regs *regs)
  {
         unsigned long error_code = regs->orig_ax;
+       unsigned short etype = regs->fred_ss.type & 0xf;

         /* Invalidate orig_ax so that syscall_get_nr() works correctly */
         regs->orig_ax = -1;

-       switch (regs->fred_ss.type) {
+       switch_likely (etype == EVENT_TYPE_EXTINT, etype) {
         case EVENT_TYPE_EXTINT:
                 return fred_extint(regs);
         case EVENT_TYPE_NMI:

Got the asm code as following:

  objdump -d vmlinux.o | awk '/<fred_entry_from_user>:/{c=65} c&&c--'
00000000000015a0 <fred_entry_from_user>:
     15a0:       0f b6 87 a6 00 00 00 movzbl 0xa6(%rdi),%eax
     15a7:       48 8b 77 78 mov    0x78(%rdi),%rsi
     15ab:       55 push   %rbp
     15ac:       48 c7 47 78 ff ff ff movq   $0xffffffffffffffff,0x78(%rdi)
     15b3:       ff
     15b4:       48 89 e5 mov    %rsp,%rbp
     15b7:       66 83 e0 0f and    $0xf,%ax
     15bb:       74 11 je     15ce <fred_entry_from_user+0x2e>
     15bd:       66 83 f8 07 cmp    $0x7,%ax
     15c1:       74 0b je     15ce <fred_entry_from_user+0x2e>
     15c3:       e8 78 fc ff ff callq  1240 <fred_extint>
     15c8:       5d pop    %rbp
     15c9:       e9 00 00 00 00 jmpq   15ce <fred_entry_from_user+0x2e>
     15ce:       e8 4d fd ff ff callq  1320 <fred_bad_type>
     15d3:       5d pop    %rbp
     15d4:       e9 00 00 00 00 jmpq   15d9 <fred_entry_from_user+0x39>
     15d9:       0f 1f 80 00 00 00 00 nopl   0x0(%rax)

00000000000015e0 <__pfx_fred_entry_from_kernel>:
     15e0:       90                      nop
     15e1:       90                      nop

00000000000015f0 <fred_entry_from_kernel>:
     15f0:       55 push   %rbp
     15f1:       48 8b 77 78 mov    0x78(%rdi),%rsi
     15f5:       48 c7 47 78 ff ff ff movq   $0xffffffffffffffff,0x78(%rdi)
     15fc:       ff
     15fd:       48 89 e5 mov    %rsp,%rbp
     1600:       f6 87 a6 00 00 00 0f testb  $0xf,0xa6(%rdi)
     1607:       75 0b jne    1614 <fred_entry_from_kernel+0x24>
     1609:       e8 12 fd ff ff callq  1320 <fred_bad_type>
     160e:       5d pop    %rbp
     160f:       e9 00 00 00 00 jmpq   1614 <fred_entry_from_kernel+0x24>
     1614:       e8 27 fc ff ff callq  1240 <fred_extint>
     1619:       5d pop    %rbp
     161a:       e9 00 00 00 00 jmpq   161f <fred_entry_from_kernel+0x2f>
     161f:       90                      nop

0000000000001620 <__pfx___fred_entry_from_kvm>:
     1620:       90                      nop
     1621:       90                      nop


Even the fred_entry_from_kernel() asm code doesn't look right.
*gcc version 8.5.0 20210514 (Red Hat 8.5.0-10) (GCC)*
**
*Did I screw up something ?*
**
*Thanks,*
*Ethan*
>
>     -hpa
>

-- 
"firm, enduring, strong, and long-lived"


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-17  4:19           ` Ethan Zhao
@ 2025-01-17  4:43             ` Xin Li
  2025-01-17  5:18               ` Ethan Zhao
  0 siblings, 1 reply; 27+ messages in thread
From: Xin Li @ 2025-01-17  4:43 UTC (permalink / raw)
  To: Ethan Zhao, H. Peter Anvin, Ethan Zhao, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp

On 1/16/2025 8:19 PM, Ethan Zhao wrote:
> 
> 在 2025/1/17 9:21, H. Peter Anvin 写道:
>> On 1/16/25 16:37, Ethan Zhao wrote:
>>>>
>>>> hpa suggested to introduce "switch_likely" for this kind of 
>>>> optimization
>>>> on a switch statement, which is also easier to read.  I measured it 
>>>> with
>>>> a user space focus test, it does improve performance a lot. But 
>>>> obviously there are still a lot of work to do.
>>>
>>> Find a way to instruct compiler to pick the right hot branch 
>>> meanwhile make folks
>>> reading happy... yup, a lot of work.
>>>
>>
>> It's not that complicated, believe it or not.
>>
>> /*
>>  * switch(v) biased for speed in the case v == l
>>  *
>>  * Note: gcc is quite sensitive to the exact form of this
>>  * expression.
>>  */
>> #define switch_likely(v,l) \
>>     switch((__typeof__(v))__builtin_expect((v),(l)))
> 
> I tried this macro as following, but got something really *weird* from gcc.
> 
> +#define switch_likely(v,l) \
> +        switch((__typeof__(v))__builtin_expect((v),(l)))
> +
>   __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
>   {
>          unsigned long error_code = regs->orig_ax;
> +       unsigned short etype = regs->fred_ss.type & 0xf;
> 
>          /* Invalidate orig_ax so that syscall_get_nr() works correctly */
>          regs->orig_ax = -1;
> 
> -       switch (regs->fred_ss.type) {
> +       switch_likely ((etype == EVENT_TYPE_EXTINT || etype == 
> EVENT_TYPE_OTHER), etype) {

Just swap the 2 arguments, and it should be:
+	switch_likely (etype, EVENT_TYPE_OTHER) {


Probably also check __builtin_expect on 
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html.

>          case EVENT_TYPE_EXTINT:
>                  return fred_extint(regs);
>          case EVENT_TYPE_NMI:
> @@ -256,11 +260,12 @@ __visible noinstr void fred_entry_from_user(struct 
> pt_regs *regs)
>   __visible noinstr void fred_entry_from_kernel(struct pt_regs *regs)
>   {
>          unsigned long error_code = regs->orig_ax;
> +       unsigned short etype = regs->fred_ss.type & 0xf;
> 
>          /* Invalidate orig_ax so that syscall_get_nr() works correctly */
>          regs->orig_ax = -1;
> 
> -       switch (regs->fred_ss.type) {
> +       switch_likely (etype == EVENT_TYPE_EXTINT, etype) {
>          case EVENT_TYPE_EXTINT:
>                  return fred_extint(regs);
>          case EVENT_TYPE_NMI:
> 
> Got the asm code as following:
> 
>   objdump -d vmlinux.o | awk '/<fred_entry_from_user>:/{c=65} c&&c--'
> 00000000000015a0 <fred_entry_from_user>:
>      15a0:       0f b6 87 a6 00 00 00 movzbl 0xa6(%rdi),%eax
>      15a7:       48 8b 77 78 mov    0x78(%rdi),%rsi
>      15ab:       55 push   %rbp
>      15ac:       48 c7 47 78 ff ff ff movq   $0xffffffffffffffff,0x78(%rdi)
>      15b3:       ff
>      15b4:       48 89 e5 mov    %rsp,%rbp
>      15b7:       66 83 e0 0f and    $0xf,%ax
>      15bb:       74 11 je     15ce <fred_entry_from_user+0x2e>
>      15bd:       66 83 f8 07 cmp    $0x7,%ax
>      15c1:       74 0b je     15ce <fred_entry_from_user+0x2e>
>      15c3:       e8 78 fc ff ff callq  1240 <fred_extint>
>      15c8:       5d pop    %rbp
>      15c9:       e9 00 00 00 00 jmpq   15ce <fred_entry_from_user+0x2e>
>      15ce:       e8 4d fd ff ff callq  1320 <fred_bad_type>
>      15d3:       5d pop    %rbp
>      15d4:       e9 00 00 00 00 jmpq   15d9 <fred_entry_from_user+0x39>
>      15d9:       0f 1f 80 00 00 00 00 nopl   0x0(%rax)
> 
> 00000000000015e0 <__pfx_fred_entry_from_kernel>:
>      15e0:       90                      nop
>      15e1:       90                      nop
> 
> 00000000000015f0 <fred_entry_from_kernel>:
>      15f0:       55 push   %rbp
>      15f1:       48 8b 77 78 mov    0x78(%rdi),%rsi
>      15f5:       48 c7 47 78 ff ff ff movq   $0xffffffffffffffff,0x78(%rdi)
>      15fc:       ff
>      15fd:       48 89 e5 mov    %rsp,%rbp
>      1600:       f6 87 a6 00 00 00 0f testb  $0xf,0xa6(%rdi)
>      1607:       75 0b jne    1614 <fred_entry_from_kernel+0x24>
>      1609:       e8 12 fd ff ff callq  1320 <fred_bad_type>
>      160e:       5d pop    %rbp
>      160f:       e9 00 00 00 00 jmpq   1614 <fred_entry_from_kernel+0x24>
>      1614:       e8 27 fc ff ff callq  1240 <fred_extint>
>      1619:       5d pop    %rbp
>      161a:       e9 00 00 00 00 jmpq   161f <fred_entry_from_kernel+0x2f>
>      161f:       90                      nop
> 
> 0000000000001620 <__pfx___fred_entry_from_kvm>:
>      1620:       90                      nop
>      1621:       90                      nop
> 
> 
> Even the fred_entry_from_kernel() asm code doesn't look right.
> *gcc version 8.5.0 20210514 (Red Hat 8.5.0-10) (GCC)*
> **
> *Did I screw up something ?*
> **
> *Thanks,*
> *Ethan*
>>
>>     -hpa
>>
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-17  4:43             ` Xin Li
@ 2025-01-17  5:18               ` Ethan Zhao
  2025-01-17  5:54                 ` Xin Li
  2025-01-17 16:24                 ` H. Peter Anvin
  0 siblings, 2 replies; 27+ messages in thread
From: Ethan Zhao @ 2025-01-17  5:18 UTC (permalink / raw)
  To: Xin Li, H. Peter Anvin, Ethan Zhao, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp


在 2025/1/17 12:43, Xin Li 写道:
> On 1/16/2025 8:19 PM, Ethan Zhao wrote:
>>
>> 在 2025/1/17 9:21, H. Peter Anvin 写道:
>>> On 1/16/25 16:37, Ethan Zhao wrote:
>>>>>
>>>>> hpa suggested to introduce "switch_likely" for this kind of 
>>>>> optimization
>>>>> on a switch statement, which is also easier to read.  I measured 
>>>>> it with
>>>>> a user space focus test, it does improve performance a lot. But 
>>>>> obviously there are still a lot of work to do.
>>>>
>>>> Find a way to instruct compiler to pick the right hot branch 
>>>> meanwhile make folks
>>>> reading happy... yup, a lot of work.
>>>>
>>>
>>> It's not that complicated, believe it or not.
>>>
>>> /*
>>>  * switch(v) biased for speed in the case v == l
>>>  *
>>>  * Note: gcc is quite sensitive to the exact form of this
>>>  * expression.
>>>  */
>>> #define switch_likely(v,l) \
>>>     switch((__typeof__(v))__builtin_expect((v),(l)))
>>
>> I tried this macro as following, but got something really *weird* 
>> from gcc.
>>
>> +#define switch_likely(v,l) \
>> +        switch((__typeof__(v))__builtin_expect((v),(l)))
>> +
>>   __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
>>   {
>>          unsigned long error_code = regs->orig_ax;
>> +       unsigned short etype = regs->fred_ss.type & 0xf;
>>
>>          /* Invalidate orig_ax so that syscall_get_nr() works 
>> correctly */
>>          regs->orig_ax = -1;
>>
>> -       switch (regs->fred_ss.type) {
>> +       switch_likely ((etype == EVENT_TYPE_EXTINT || etype == 
>> EVENT_TYPE_OTHER), etype) {
>
> Just swap the 2 arguments, and it should be:
> +    switch_likely (etype, EVENT_TYPE_OTHER) {
>
>
after swapped the parameters as following:
+#define switch_likely(v,l) \
+ switch((__typeof__(v))__builtin_expect((v),(l)))
+
  __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
  {
         unsigned long error_code = regs->orig_ax;
+       unsigned short etype = regs->fred_ss.type & 0xf;

         /* Invalidate orig_ax so that syscall_get_nr() works correctly */
         regs->orig_ax = -1;

-       switch (regs->fred_ss.type) {
+       switch_likely (etype, (EVENT_TYPE_EXTINT == etype || 
EVENT_TYPE_OTHER == etype)) {
         case EVENT_TYPE_EXTINT:
                 return fred_extint(regs);
         case EVENT_TYPE_NMI:
@@ -256,11 +260,12 @@ __visible noinstr void fred_entry_from_user(struct 
pt_regs *regs)
  __visible noinstr void fred_entry_from_kernel(struct pt_regs *regs)
  {
         unsigned long error_code = regs->orig_ax;
+       unsigned short etype = regs->fred_ss.type & 0xf;

         /* Invalidate orig_ax so that syscall_get_nr() works correctly */
         regs->orig_ax = -1;

-       switch (regs->fred_ss.type) {
+       switch_likely (etype, EVENT_TYPE_EXTINT == etype) {
         case EVENT_TYPE_EXTINT:
                 return fred_extint(regs);

         case EVENT_TYPE_NMI:


Good news -- the gcc works normal, generated right asm code,
Bad news -- it fails back to binary search method to do matching, 
ignored our hints.


$objdump -d vmlinux.o | awk '/<fred_entry_from_kernel>:/{c=65} c&&c--'
00000000000016b0 <fred_entry_from_kernel>:
     16b0:       0f b6 87 a6 00 00 00 movzbl 0xa6(%rdi),%eax
     16b7:       48 8b 77 78 mov    0x78(%rdi),%rsi
     16bb:       55 push   %rbp
     16bc:       48 c7 47 78 ff ff ff movq   $0xffffffffffffffff,0x78(%rdi)
     16c3:       ff
     16c4:       83 e0 0f and    $0xf,%eax
     16c7:       48 89 e5 mov    %rsp,%rbp
     16ca:       3c 03 cmp    $0x3,%al
     16cc:       74 3c je     170a <fred_entry_from_kernel+0x5a>
     16ce:       76 13 jbe    16e3 <fred_entry_from_kernel+0x33>
     16d0:       3c 05 cmp    $0x5,%al
     16d2:       74 41 je     1715 <fred_entry_from_kernel+0x65>
     16d4:       3c 06 cmp    $0x6,%al
     16d6:       75 27 jne    16ff <fred_entry_from_kernel+0x4f>
     16d8:       e8 73 fe ff ff callq  1550 <fred_swexc>
     16dd:       5d pop    %rbp
     16de:       e9 00 00 00 00 jmpq   16e3 <fred_entry_from_kernel+0x33>
     16e3:       84 c0 test   %al,%al
     16e5:       74 42 je     1729 <fred_entry_from_kernel+0x79>
     16e7:       3c 02 cmp    $0x2,%al
     16e9:       75 14 jne    16ff <fred_entry_from_kernel+0x4f>
     16eb:       80 bf a4 00 00 00 02 cmpb   $0x2,0xa4(%rdi)
     16f2:       75 0b jne    16ff <fred_entry_from_kernel+0x4f>
     16f4:       e8 00 00 00 00 callq  16f9 <fred_entry_from_kernel+0x49>
     16f9:       5d pop    %rbp
     16fa:       e9 00 00 00 00 jmpq   16ff <fred_entry_from_kernel+0x4f>
     16ff:       e8 1c fc ff ff callq  1320 <fred_bad_type>
     1704:       5d pop    %rbp
     1705:       e9 00 00 00 00 jmpq   170a <fred_entry_from_kernel+0x5a>
     170a:       e8 21 fd ff ff callq  1430 <fred_hwexc>
     170f:       5d pop    %rbp
     1710:       e9 00 00 00 00 jmpq   1715 <fred_entry_from_kernel+0x65>
     1715:       80 bf a4 00 00 00 01 cmpb   $0x1,0xa4(%rdi)
     171c:       75 e1 jne    16ff <fred_entry_from_kernel+0x4f>
     171e:       e8 00 00 00 00 callq  1723 <fred_entry_from_kernel+0x73>
     1723:       5d pop    %rbp
     1724:       e9 00 00 00 00 jmpq   1729 <fred_entry_from_kernel+0x79>
     1729:       e8 12 fb ff ff callq  1240 <fred_extint>
     172e:       5d pop    %rbp
     172f:       e9 00 00 00 00 jmpq   1734 <fred_entry_from_kernel+0x84>
     1734:       66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
     173b:       00 00 00 00
     173f:       90                      nop

$ objdump -d vmlinux.o | awk '/<fred_entry_from_user>:/{c=65} c&&c--'
00000000000015a0 <fred_entry_from_user>:
     15a0:       0f b6 87 a6 00 00 00 movzbl 0xa6(%rdi),%eax
     15a7:       48 8b 77 78 mov    0x78(%rdi),%rsi
     15ab:       55 push   %rbp
     15ac:       48 c7 47 78 ff ff ff movq   $0xffffffffffffffff,0x78(%rdi)
     15b3:       ff
     15b4:       83 e0 0f and    $0xf,%eax
     15b7:       48 89 e5 mov    %rsp,%rbp
     15ba:       3c 04 cmp    $0x4,%al
     15bc:       74 78 je     1636 <fred_entry_from_user+0x96>
     15be:       77 15 ja     15d5 <fred_entry_from_user+0x35>
     15c0:       3c 02 cmp    $0x2,%al
     15c2:       74 53 je     1617 <fred_entry_from_user+0x77>
     15c4:       77 65 ja     162b <fred_entry_from_user+0x8b>
     15c6:       84 c0 test   %al,%al
     15c8:       75 42 jne    160c <fred_entry_from_user+0x6c>
     15ca:       e8 71 fc ff ff callq  1240 <fred_extint>
     15cf:       5d pop    %rbp
     15d0:       e9 00 00 00 00 jmpq   15d5 <fred_entry_from_user+0x35>
     15d5:       3c 06 cmp    $0x6,%al
     15d7:       74 7c je     1655 <fred_entry_from_user+0xb5>
     15d9:       72 66 jb     1641 <fred_entry_from_user+0xa1>
     15db:       3c 07 cmp    $0x7,%al
     15dd:       75 2d jne    160c <fred_entry_from_user+0x6c>
     15df:       8b 87 a4 00 00 00 mov    0xa4(%rdi),%eax
     15e5:       25 ff 00 00 02 and    $0x20000ff,%eax
     15ea:       3d 01 00 00 02 cmp    $0x2000001,%eax
     15ef:       75 6f jne    1660 <fred_entry_from_user+0xc0>
     15f1:       48 8b 77 50 mov    0x50(%rdi),%rsi
     15f5:       48 c7 47 50 da ff ff movq   $0xffffffffffffffda,0x50(%rdi)
     15fc:       ff
     15fd:       48 89 77 78 mov    %rsi,0x78(%rdi)
     1601:       e8 00 00 00 00 callq  1606 <fred_entry_from_user+0x66>
     1606:       5d pop    %rbp
     1607:       e9 00 00 00 00 jmpq   160c <fred_entry_from_user+0x6c>
     160c:       e8 0f fd ff ff callq  1320 <fred_bad_type>
     1611:       5d pop    %rbp
     1612:       e9 00 00 00 00 jmpq   1617 <fred_entry_from_user+0x77>
     1617:       80 bf a4 00 00 00 02 cmpb   $0x2,0xa4(%rdi)
     161e:       75 ec jne    160c <fred_entry_from_user+0x6c>
     1620:       e8 00 00 00 00 callq  1625 <fred_entry_from_user+0x85>
     1625:       5d pop    %rbp
     1626:       e9 00 00 00 00 jmpq   162b <fred_entry_from_user+0x8b>
     162b:       e8 00 fe ff ff callq  1430 <fred_hwexc>
     1630:       5d pop    %rbp
     1631:       e9 00 00 00 00 jmpq   1636 <fred_entry_from_user+0x96>
     1636:       e8 85 fc ff ff callq  12c0 <fred_intx>
     163b:       5d pop    %rbp
     163c:       e9 00 00 00 00 jmpq   1641 <fred_entry_from_user+0xa1>
     1641:       80 bf a4 00 00 00 01 cmpb   $0x1,0xa4(%rdi)
     1648:       75 c2 jne    160c <fred_entry_from_user+0x6c>
     164a:       e8 00 00 00 00 callq  164f <fred_entry_from_user+0xaf>
     164f:       5d pop    %rbp
     1650:       e9 00 00 00 00 jmpq   1655 <fred_entry_from_user+0xb5>
     1655:       e8 f6 fe ff ff callq  1550 <fred_swexc>
     165a:       5d pop    %rbp
     165b:       e9 00 00 00 00 jmpq   1660 <fred_entry_from_user+0xc0>
     1660:       83 f8 02 cmp    $0x2,%eax
     1663:       75 24 jne    1689 <fred_entry_from_user+0xe9>
     1665:       80 3d 00 00 00 00 00 cmpb   $0x0,0x0(%rip)        # 
166c <fred_entry_from_user+0xcc>
     166c:       74 1b je     1689 <fred_entry_from_user+0xe9>
     166e:       48 8b 47 50 mov    0x50(%rdi),%rax
     1672:       48 c7 47 50 da ff ff movq   $0xffffffffffffffda,0x50(%rdi)
     1679:       ff
     167a:       48 89 47 78 mov    %rax,0x78(%rdi)

In short, seems that __builtin_expect not work with switch(), at least for
  gcc version 8.5.0 20210514(RHEL).

Thanks,
Ethan



> Probably also check __builtin_expect on 
> https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html.
>
>>          case EVENT_TYPE_EXTINT:
>>                  return fred_extint(regs);
>>          case EVENT_TYPE_NMI:
>> @@ -256,11 +260,12 @@ __visible noinstr void 
>> fred_entry_from_user(struct pt_regs *regs)
>>   __visible noinstr void fred_entry_from_kernel(struct pt_regs *regs)
>>   {
>>          unsigned long error_code = regs->orig_ax;
>> +       unsigned short etype = regs->fred_ss.type & 0xf;
>>
>>          /* Invalidate orig_ax so that syscall_get_nr() works 
>> correctly */
>>          regs->orig_ax = -1;
>>
>> -       switch (regs->fred_ss.type) {
>> +       switch_likely (etype == EVENT_TYPE_EXTINT, etype) {
>>          case EVENT_TYPE_EXTINT:
>>                  return fred_extint(regs);
>>          case EVENT_TYPE_NMI:
>>
>> Got the asm code as following:
>>
>>   objdump -d vmlinux.o | awk '/<fred_entry_from_user>:/{c=65} c&&c--'
>> 00000000000015a0 <fred_entry_from_user>:
>>      15a0:       0f b6 87 a6 00 00 00 movzbl 0xa6(%rdi),%eax
>>      15a7:       48 8b 77 78 mov    0x78(%rdi),%rsi
>>      15ab:       55 push   %rbp
>>      15ac:       48 c7 47 78 ff ff ff movq 
>> $0xffffffffffffffff,0x78(%rdi)
>>      15b3:       ff
>>      15b4:       48 89 e5 mov    %rsp,%rbp
>>      15b7:       66 83 e0 0f and    $0xf,%ax
>>      15bb:       74 11 je     15ce <fred_entry_from_user+0x2e>
>>      15bd:       66 83 f8 07 cmp    $0x7,%ax
>>      15c1:       74 0b je     15ce <fred_entry_from_user+0x2e>
>>      15c3:       e8 78 fc ff ff callq  1240 <fred_extint>
>>      15c8:       5d pop    %rbp
>>      15c9:       e9 00 00 00 00 jmpq   15ce <fred_entry_from_user+0x2e>
>>      15ce:       e8 4d fd ff ff callq  1320 <fred_bad_type>
>>      15d3:       5d pop    %rbp
>>      15d4:       e9 00 00 00 00 jmpq   15d9 <fred_entry_from_user+0x39>
>>      15d9:       0f 1f 80 00 00 00 00 nopl   0x0(%rax)
>>
>> 00000000000015e0 <__pfx_fred_entry_from_kernel>:
>>      15e0:       90                      nop
>>      15e1:       90                      nop
>>
>> 00000000000015f0 <fred_entry_from_kernel>:
>>      15f0:       55 push   %rbp
>>      15f1:       48 8b 77 78 mov    0x78(%rdi),%rsi
>>      15f5:       48 c7 47 78 ff ff ff movq 
>> $0xffffffffffffffff,0x78(%rdi)
>>      15fc:       ff
>>      15fd:       48 89 e5 mov    %rsp,%rbp
>>      1600:       f6 87 a6 00 00 00 0f testb  $0xf,0xa6(%rdi)
>>      1607:       75 0b jne    1614 <fred_entry_from_kernel+0x24>
>>      1609:       e8 12 fd ff ff callq  1320 <fred_bad_type>
>>      160e:       5d pop    %rbp
>>      160f:       e9 00 00 00 00 jmpq   1614 
>> <fred_entry_from_kernel+0x24>
>>      1614:       e8 27 fc ff ff callq  1240 <fred_extint>
>>      1619:       5d pop    %rbp
>>      161a:       e9 00 00 00 00 jmpq   161f 
>> <fred_entry_from_kernel+0x2f>
>>      161f:       90                      nop
>>
>> 0000000000001620 <__pfx___fred_entry_from_kvm>:
>>      1620:       90                      nop
>>      1621:       90                      nop
>>
>>
>> Even the fred_entry_from_kernel() asm code doesn't look right.
>> *gcc version 8.5.0 20210514 (Red Hat 8.5.0-10) (GCC)*
>> **
>> *Did I screw up something ?*
>> **
>> *Thanks,*
>> *Ethan*
>>>
>>>     -hpa
>>>
>>
>
>
-- 
"firm, enduring, strong, and long-lived"


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-17  5:18               ` Ethan Zhao
@ 2025-01-17  5:54                 ` Xin Li
  2025-01-17  6:58                   ` Ethan Zhao
  2025-01-17 16:17                   ` H. Peter Anvin
  2025-01-17 16:24                 ` H. Peter Anvin
  1 sibling, 2 replies; 27+ messages in thread
From: Xin Li @ 2025-01-17  5:54 UTC (permalink / raw)
  To: Ethan Zhao, H. Peter Anvin, Ethan Zhao, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp

On 1/16/2025 9:18 PM, Ethan Zhao wrote:
>>
>> Just swap the 2 arguments, and it should be:
>> +    switch_likely (etype, EVENT_TYPE_OTHER) {
>>
>>
> after swapped the parameters as following:
> +#define switch_likely(v,l) \
> + switch((__typeof__(v))__builtin_expect((v),(l)))
> +
>   __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
>   {
>          unsigned long error_code = regs->orig_ax;
> +       unsigned short etype = regs->fred_ss.type & 0xf;
> 
>          /* Invalidate orig_ax so that syscall_get_nr() works correctly */
>          regs->orig_ax = -1;
> 
> -       switch (regs->fred_ss.type) {
> +       switch_likely (etype, (EVENT_TYPE_EXTINT == etype || 
> EVENT_TYPE_OTHER == etype)) {

This is not what I suggested, the (l) argument should be only one
constant; __builtin_expect() doesn't allow 2 different constants.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-17  5:54                 ` Xin Li
@ 2025-01-17  6:58                   ` Ethan Zhao
  2025-01-17 16:17                   ` H. Peter Anvin
  1 sibling, 0 replies; 27+ messages in thread
From: Ethan Zhao @ 2025-01-17  6:58 UTC (permalink / raw)
  To: Xin Li, Ethan Zhao, H. Peter Anvin, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp


On 1/17/2025 1:54 PM, Xin Li wrote:
> On 1/16/2025 9:18 PM, Ethan Zhao wrote:
>>>
>>> Just swap the 2 arguments, and it should be:
>>> +    switch_likely (etype, EVENT_TYPE_OTHER) {
>>>
>>>
>> after swapped the parameters as following:
>> +#define switch_likely(v,l) \
>> + switch((__typeof__(v))__builtin_expect((v),(l)))
>> +
>>   __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
>>   {
>>          unsigned long error_code = regs->orig_ax;
>> +       unsigned short etype = regs->fred_ss.type & 0xf;
>>
>>          /* Invalidate orig_ax so that syscall_get_nr() works 
>> correctly */
>>          regs->orig_ax = -1;
>>
>> -       switch (regs->fred_ss.type) {
>> +       switch_likely (etype, (EVENT_TYPE_EXTINT == etype || 
>> EVENT_TYPE_OTHER == etype)) {
>
> This is not what I suggested, the (l) argument should be only one
> constant; __builtin_expect() doesn't allow 2 different constants.

That is why it couldn't be used in switch(var), while it works with if(true) else if (false).
Switch(var) needs a var returned by the __builtin_expect(), but it only could return Const.

Thanks,
Ethan


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-17  5:54                 ` Xin Li
  2025-01-17  6:58                   ` Ethan Zhao
@ 2025-01-17 16:17                   ` H. Peter Anvin
  2025-01-17 16:23                     ` H. Peter Anvin
  1 sibling, 1 reply; 27+ messages in thread
From: H. Peter Anvin @ 2025-01-17 16:17 UTC (permalink / raw)
  To: Xin Li, Ethan Zhao, Ethan Zhao, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp



On 1/16/25 21:54, Xin Li wrote:
> On 1/16/2025 9:18 PM, Ethan Zhao wrote:
>>>
>>> Just swap the 2 arguments, and it should be:
>>> +    switch_likely (etype, EVENT_TYPE_OTHER) {
>>>
>>>
>> after swapped the parameters as following:
>> +#define switch_likely(v,l) \
>> + switch((__typeof__(v))__builtin_expect((v),(l)))
>> +
>>   __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
>>   {
>>          unsigned long error_code = regs->orig_ax;
>> +       unsigned short etype = regs->fred_ss.type & 0xf;
>>
>>          /* Invalidate orig_ax so that syscall_get_nr() works 
>> correctly */
>>          regs->orig_ax = -1;
>>
>> -       switch (regs->fred_ss.type) {
>> +       switch_likely (etype, (EVENT_TYPE_EXTINT == etype || 
>> EVENT_TYPE_OTHER == etype)) {
> 
> This is not what I suggested, the (l) argument should be only one
> constant; __builtin_expect() doesn't allow 2 different constants.
> 

The (l) argument is not a boolean expression! It is the *expected value* 
of (v).

	-hpa


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-17 16:17                   ` H. Peter Anvin
@ 2025-01-17 16:23                     ` H. Peter Anvin
  2025-01-18  4:00                       ` Ethan Zhao
  0 siblings, 1 reply; 27+ messages in thread
From: H. Peter Anvin @ 2025-01-17 16:23 UTC (permalink / raw)
  To: Xin Li, Ethan Zhao, Ethan Zhao, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp

On 1/17/25 08:17, H. Peter Anvin wrote:
>>>
>>> -       switch (regs->fred_ss.type) {
>>> +       switch_likely (etype, (EVENT_TYPE_EXTINT == etype || 
>>> EVENT_TYPE_OTHER == etype)) {
>>
>> This is not what I suggested, the (l) argument should be only one
>> constant; __builtin_expect() doesn't allow 2 different constants.
>>
> 
> The (l) argument is not a boolean expression! It is the *expected value* 
> of (v).
> 

Also, EVENT_TYPE_EXTINT == etype is not Linux style.

More fundamentally, though, I have to question this unless based on 
profiling, because it isn't at all clear that EXTINT is more important 
than FAULT (page faults, to be specific.)

To optimize syscalls, you want to do a one-shot comparison of the entire 
syscall64 signature (event type, 64-bit flag, and vector) as a mask and 
compare. For that you want to make sure the compiler loads the high 32 
bits into a register so that your mask and compare values can be 
immediates. In other words, you don't actually want it to be part of the 
switch at all, and you want *other* EVENT_TYPE_OTHER to fall back to the 
switch with regular (low) priority.

	-hpa


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-17  5:18               ` Ethan Zhao
  2025-01-17  5:54                 ` Xin Li
@ 2025-01-17 16:24                 ` H. Peter Anvin
  2025-01-18  3:29                   ` Ethan Zhao
  1 sibling, 1 reply; 27+ messages in thread
From: H. Peter Anvin @ 2025-01-17 16:24 UTC (permalink / raw)
  To: Ethan Zhao, Xin Li, Ethan Zhao, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp

> 
> In short, seems that __builtin_expect not work with switch(), at least for
>   gcc version 8.5.0 20210514(RHEL).
> 

For forward-facing optimizations, please don't use an ancient version of 
gcc as the benchmark.

	-hpa


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-17 16:24                 ` H. Peter Anvin
@ 2025-01-18  3:29                   ` Ethan Zhao
  2025-01-18  3:41                     ` H. Peter Anvin
  0 siblings, 1 reply; 27+ messages in thread
From: Ethan Zhao @ 2025-01-18  3:29 UTC (permalink / raw)
  To: H. Peter Anvin, Ethan Zhao, Xin Li, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp


On 1/18/2025 12:24 AM, H. Peter Anvin wrote:
>>
>> In short, seems that __builtin_expect not work with switch(), at 
>> least for
>>   gcc version 8.5.0 20210514(RHEL).
>>
>
> For forward-facing optimizations, please don't use an ancient version 
> of gcc as the benchmark.

Even there is a latest Gcc built-in feature could work for this case, it 
is highly unlikely that Linus would adopt such trick into upstream 
kernel (only works for specific ver compiler). the same resultto those 
downstream vendors/LTS kernels. thus, making an optimization with latest 
only Gcc would construct an impractical benchmark-only performance 
barrier. As to the __builtin_expect(), my understanding, it was designed 
to only work for if(bool value) {
}
else if(bool value) {
} The value of the condition expression returned by __builtin_expect() 
is a bool const. while switch(variable) expects a variable. so it is 
normal for Gcc that it doesn't work with it.

If I got something wrong, please let me know.

Thanks,
Ethan

>
>     -hpa
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-18  3:29                   ` Ethan Zhao
@ 2025-01-18  3:41                     ` H. Peter Anvin
  2025-01-18  4:06                       ` Ethan Zhao
  0 siblings, 1 reply; 27+ messages in thread
From: H. Peter Anvin @ 2025-01-18  3:41 UTC (permalink / raw)
  To: Ethan Zhao, Ethan Zhao, Xin Li, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp

On January 17, 2025 7:29:36 PM PST, Ethan Zhao <etzhao@outlook.com> wrote:
>
>On 1/18/2025 12:24 AM, H. Peter Anvin wrote:
>>> 
>>> In short, seems that __builtin_expect not work with switch(), at least for
>>>   gcc version 8.5.0 20210514(RHEL).
>>> 
>> 
>> For forward-facing optimizations, please don't use an ancient version of gcc as the benchmark.
>
>Even there is a latest Gcc built-in feature could work for this case, it is highly unlikely that Linus would adopt such trick into upstream kernel (only works for specific ver compiler). the same resultto those downstream vendors/LTS kernels. thus, making an optimization with latest only Gcc would construct an impractical benchmark-only performance barrier. As to the __builtin_expect(), my understanding, it was designed to only work for if(bool value) {
>}
>else if(bool value) {
>} The value of the condition expression returned by __builtin_expect() is a bool const. while switch(variable) expects a variable. so it is normal for Gcc that it doesn't work with it.
>
>If I got something wrong, please let me know.
>
>Thanks,
>Ethan
>
>> 
>>     -hpa
>> 

That is not true at all; we do that pretty much *all the time*. The reason is that the new compiler versions will become mainstream on a much shorter time scale than the lifespan of kernel code. 

We do care about not making the code for the current mainstream compilers *worse* in the process, and we care about not *breaking* the backrev compilers.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-17 16:23                     ` H. Peter Anvin
@ 2025-01-18  4:00                       ` Ethan Zhao
  0 siblings, 0 replies; 27+ messages in thread
From: Ethan Zhao @ 2025-01-18  4:00 UTC (permalink / raw)
  To: H. Peter Anvin, Xin Li, Ethan Zhao, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp

On 1/18/2025 12:23 AM, H. Peter Anvin wrote:
> On 1/17/25 08:17, H. Peter Anvin wrote:
>>>>
>>>> -       switch (regs->fred_ss.type) {
>>>> +       switch_likely (etype, (EVENT_TYPE_EXTINT == etype || 
>>>> EVENT_TYPE_OTHER == etype)) {
>>>
>>> This is not what I suggested, the (l) argument should be only one
>>> constant; __builtin_expect() doesn't allow 2 different constants.
>>>
>>
>> The (l) argument is not a boolean expression! It is the *expected 
>> value* of (v).
>>
>
> Also, EVENT_TYPE_EXTINT == etype is not Linux style.
>
> More fundamentally, though, I have to question this unless based on 
> profiling, because it isn't at all clear that EXTINT is more important 
> than FAULT (page faults, to be specific.)
>
Perhaps the conclusion about which is more important/higher probability among EXTINT,SYSCALL,PF only applies to specific kind of workload system,
no one-size-fit-all conclusion there to dig.

But for a normal system, it is certainty that events like EXTINT,SYSCALL,PF would happen in higher probability than others. saving some cycles for
their paths isn't hard to understand. just like taking shortcut at event type dispatching level, no other changes.

> To optimize syscalls, you want to do a one-shot comparison of the 
> entire syscall64 signature (event type, 64-bit flag, and vector) as a 
> mask and compare. For 

To whole event dispatching path for syscalls, yep.

> that you want to make sure the compiler loads the high 32 bits into a 
> register so that your mask and compare values can be immediates. In 
> other words, you don't actually want it to be part of the switch at 
> all, and you want *other* EVENT_TYPE_OTHER to fall back to the switch 
> with regular (low) priority.

switch() seems not too bad, at least compared to jump table.

Thanks,
Ethan

>
>     -hpa
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-18  3:41                     ` H. Peter Anvin
@ 2025-01-18  4:06                       ` Ethan Zhao
  2025-01-18  4:09                         ` H. Peter Anvin
  2025-01-18  4:14                         ` H. Peter Anvin
  0 siblings, 2 replies; 27+ messages in thread
From: Ethan Zhao @ 2025-01-18  4:06 UTC (permalink / raw)
  To: H. Peter Anvin, Ethan Zhao, Xin Li, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp

On 1/18/2025 11:41 AM, H. Peter Anvin wrote:
> On January 17, 2025 7:29:36 PM PST, Ethan Zhao <etzhao@outlook.com> wrote:
>> On 1/18/2025 12:24 AM, H. Peter Anvin wrote:
>>>> In short, seems that __builtin_expect not work with switch(), at least for
>>>>    gcc version 8.5.0 20210514(RHEL).
>>>>
>>> For forward-facing optimizations, please don't use an ancient version of gcc as the benchmark.
>> Even there is a latest Gcc built-in feature could work for this case, it is highly unlikely that Linus would adopt such trick into upstream kernel (only works for specific ver compiler). the same resultto those downstream vendors/LTS kernels. thus, making an optimization with latest only Gcc would construct an impractical benchmark-only performance barrier. As to the __builtin_expect(), my understanding, it was designed to only work for if(bool value) {
>> }
>> else if(bool value) {
>> } The value of the condition expression returned by __builtin_expect() is a bool const. while switch(variable) expects a variable. so it is normal for Gcc that it doesn't work with it.
>>
>> If I got something wrong, please let me know.
>>
>> Thanks,
>> Ethan
>>
>>>      -hpa
>>>
> That is not true at all; we do that pretty much *all the time*. The reason is that the new compiler versions will become mainstream on a much shorter time scale than the lifespan of kernel code.

Yup, time walks forward...
But it is very painful to backporting like jobs to make those things in position for eager/no-waiting customers.

Thanks,
Ethan

>
> We do care about not making the code for the current mainstream compilers *worse* in the process, and we care about not *breaking* the backrev compilers.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-18  4:06                       ` Ethan Zhao
@ 2025-01-18  4:09                         ` H. Peter Anvin
  2025-01-18  5:55                           ` Ethan Zhao
  2025-01-18  4:14                         ` H. Peter Anvin
  1 sibling, 1 reply; 27+ messages in thread
From: H. Peter Anvin @ 2025-01-18  4:09 UTC (permalink / raw)
  To: Ethan Zhao, Ethan Zhao, Xin Li, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp

On January 17, 2025 8:06:27 PM PST, Ethan Zhao <etzhao@outlook.com> wrote:
>On 1/18/2025 11:41 AM, H. Peter Anvin wrote:
>> On January 17, 2025 7:29:36 PM PST, Ethan Zhao <etzhao@outlook.com> wrote:
>>> On 1/18/2025 12:24 AM, H. Peter Anvin wrote:
>>>>> In short, seems that __builtin_expect not work with switch(), at least for
>>>>>    gcc version 8.5.0 20210514(RHEL).
>>>>> 
>>>> For forward-facing optimizations, please don't use an ancient version of gcc as the benchmark.
>>> Even there is a latest Gcc built-in feature could work for this case, it is highly unlikely that Linus would adopt such trick into upstream kernel (only works for specific ver compiler). the same resultto those downstream vendors/LTS kernels. thus, making an optimization with latest only Gcc would construct an impractical benchmark-only performance barrier. As to the __builtin_expect(), my understanding, it was designed to only work for if(bool value) {
>>> }
>>> else if(bool value) {
>>> } The value of the condition expression returned by __builtin_expect() is a bool const. while switch(variable) expects a variable. so it is normal for Gcc that it doesn't work with it.
>>> 
>>> If I got something wrong, please let me know.
>>> 
>>> Thanks,
>>> Ethan
>>> 
>>>>      -hpa
>>>> 
>> That is not true at all; we do that pretty much *all the time*. The reason is that the new compiler versions will become mainstream on a much shorter time scale than the lifespan of kernel code.
>
>Yup, time walks forward...
>But it is very painful to backporting like jobs to make those things in position for eager/no-waiting customers.
>
>Thanks,
>Ethan
>
>> 
>> We do care about not making the code for the current mainstream compilers *worse* in the process, and we care about not *breaking* the backrev compilers.

As I said, it is OK for an ancient compiler (gcc 8 is the oldest compiler still supported) to *not have an improvement* as long as it doesn't make it worse.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-18  4:06                       ` Ethan Zhao
  2025-01-18  4:09                         ` H. Peter Anvin
@ 2025-01-18  4:14                         ` H. Peter Anvin
  2025-01-18  6:02                           ` Ethan Zhao
  1 sibling, 1 reply; 27+ messages in thread
From: H. Peter Anvin @ 2025-01-18  4:14 UTC (permalink / raw)
  To: Ethan Zhao, Ethan Zhao, Xin Li, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp

On January 17, 2025 8:06:27 PM PST, Ethan Zhao <etzhao@outlook.com> wrote:
>On 1/18/2025 11:41 AM, H. Peter Anvin wrote:
>> On January 17, 2025 7:29:36 PM PST, Ethan Zhao <etzhao@outlook.com> wrote:
>>> On 1/18/2025 12:24 AM, H. Peter Anvin wrote:
>>>>> In short, seems that __builtin_expect not work with switch(), at least for
>>>>>    gcc version 8.5.0 20210514(RHEL).
>>>>> 
>>>> For forward-facing optimizations, please don't use an ancient version of gcc as the benchmark.
>>> Even there is a latest Gcc built-in feature could work for this case, it is highly unlikely that Linus would adopt such trick into upstream kernel (only works for specific ver compiler). the same resultto those downstream vendors/LTS kernels. thus, making an optimization with latest only Gcc would construct an impractical benchmark-only performance barrier. As to the __builtin_expect(), my understanding, it was designed to only work for if(bool value) {
>>> }
>>> else if(bool value) {
>>> } The value of the condition expression returned by __builtin_expect() is a bool const. while switch(variable) expects a variable. so it is normal for Gcc that it doesn't work with it.
>>> 
>>> If I got something wrong, please let me know.
>>> 
>>> Thanks,
>>> Ethan
>>> 
>>>>      -hpa
>>>> 
>> That is not true at all; we do that pretty much *all the time*. The reason is that the new compiler versions will become mainstream on a much shorter time scale than the lifespan of kernel code.
>
>Yup, time walks forward...
>But it is very painful to backporting like jobs to make those things in position for eager/no-waiting customers.
>
>Thanks,
>Ethan
>
>> 
>> We do care about not making the code for the current mainstream compilers *worse* in the process, and we care about not *breaking* the backrev compilers.

Keep in mind we're not even talking about feature enablement here, but a microoptimization. The latter we really need to be syntactically lightweight and abstracted enough to deal with, for example, compiler changes.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-18  4:09                         ` H. Peter Anvin
@ 2025-01-18  5:55                           ` Ethan Zhao
  0 siblings, 0 replies; 27+ messages in thread
From: Ethan Zhao @ 2025-01-18  5:55 UTC (permalink / raw)
  To: H. Peter Anvin, Ethan Zhao, Xin Li, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp

On 1/18/2025 12:09 PM, H. Peter Anvin wrote:
> On January 17, 2025 8:06:27 PM PST, Ethan Zhao <etzhao@outlook.com> wrote:
>> On 1/18/2025 11:41 AM, H. Peter Anvin wrote:
>>> On January 17, 2025 7:29:36 PM PST, Ethan Zhao <etzhao@outlook.com> wrote:
>>>> On 1/18/2025 12:24 AM, H. Peter Anvin wrote:
>>>>>> In short, seems that __builtin_expect not work with switch(), at least for
>>>>>>     gcc version 8.5.0 20210514(RHEL).
>>>>>>
>>>>> For forward-facing optimizations, please don't use an ancient version of gcc as the benchmark.
>>>> Even there is a latest Gcc built-in feature could work for this case, it is highly unlikely that Linus would adopt such trick into upstream kernel (only works for specific ver compiler). the same resultto those downstream vendors/LTS kernels. thus, making an optimization with latest only Gcc would construct an impractical benchmark-only performance barrier. As to the __builtin_expect(), my understanding, it was designed to only work for if(bool value) {
>>>> }
>>>> else if(bool value) {
>>>> } The value of the condition expression returned by __builtin_expect() is a bool const. while switch(variable) expects a variable. so it is normal for Gcc that it doesn't work with it.
>>>>
>>>> If I got something wrong, please let me know.
>>>>
>>>> Thanks,
>>>> Ethan
>>>>
>>>>>       -hpa
>>>>>
>>> That is not true at all; we do that pretty much *all the time*. The reason is that the new compiler versions will become mainstream on a much shorter time scale than the lifespan of kernel code.
>> Yup, time walks forward...
>> But it is very painful to backporting like jobs to make those things in position for eager/no-waiting customers.
>>
>> Thanks,
>> Ethan
>>
>>> We do care about not making the code for the current mainstream compilers *worse* in the process, and we care about not *breaking* the backrev compilers.
> As I said, it is OK for an ancient compiler (gcc 8 is the oldest compiler still supported) to *not have an improvement* as long as it doesn't make it worse.

There are always surprise that some customers want new features but wouldn't throw away much older pre-ancient toolchains (gcc etc)\kernels,and they
really have justification (business system). refuse to provide service ? hmmm :(,  they wanna both "no changing" (old system) and "changing" (new feature
and performance improvement). this is the way some guys living forward by looking & working backwards. so a little sensitive for 'the last' :)

Thanks,
Ethan


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching
  2025-01-18  4:14                         ` H. Peter Anvin
@ 2025-01-18  6:02                           ` Ethan Zhao
  0 siblings, 0 replies; 27+ messages in thread
From: Ethan Zhao @ 2025-01-18  6:02 UTC (permalink / raw)
  To: H. Peter Anvin, Ethan Zhao, Xin Li, linux-kernel, stable
  Cc: tglx, dave.hansen, x86, andrew.cooper3, mingo, bp

On 1/18/2025 12:14 PM, H. Peter Anvin wrote:
> On January 17, 2025 8:06:27 PM PST, Ethan Zhao <etzhao@outlook.com> wrote:
>> On 1/18/2025 11:41 AM, H. Peter Anvin wrote:
>>> On January 17, 2025 7:29:36 PM PST, Ethan Zhao <etzhao@outlook.com> wrote:
>>>> On 1/18/2025 12:24 AM, H. Peter Anvin wrote:
>>>>>> In short, seems that __builtin_expect not work with switch(), at least for
>>>>>>     gcc version 8.5.0 20210514(RHEL).
>>>>>>
>>>>> For forward-facing optimizations, please don't use an ancient version of gcc as the benchmark.
>>>> Even there is a latest Gcc built-in feature could work for this case, it is highly unlikely that Linus would adopt such trick into upstream kernel (only works for specific ver compiler). the same resultto those downstream vendors/LTS kernels. thus, making an optimization with latest only Gcc would construct an impractical benchmark-only performance barrier. As to the __builtin_expect(), my understanding, it was designed to only work for if(bool value) {
>>>> }
>>>> else if(bool value) {
>>>> } The value of the condition expression returned by __builtin_expect() is a bool const. while switch(variable) expects a variable. so it is normal for Gcc that it doesn't work with it.
>>>>
>>>> If I got something wrong, please let me know.
>>>>
>>>> Thanks,
>>>> Ethan
>>>>
>>>>>       -hpa
>>>>>
>>> That is not true at all; we do that pretty much *all the time*. The reason is that the new compiler versions will become mainstream on a much shorter time scale than the lifespan of kernel code.
>> Yup, time walks forward...
>> But it is very painful to backporting like jobs to make those things in position for eager/no-waiting customers.
>>
>> Thanks,
>> Ethan
>>
>>> We do care about not making the code for the current mainstream compilers *worse* in the process, and we care about not *breaking* the backrev compilers.
> Keep in mind we're not even talking about feature enablement here, but a microoptimization. The latter we really need to be syntactically lightweight and abstracted enough to deal with, for example, compiler changes.

Fully agree. we are looking at 'micro optimization', not feature level, system level optimization, need to be abstracted to micro lightweight method to
verify or think about its pros-cons.

Thanks,
Ethan


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2025-01-18  6:02 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-16  6:51 [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching Ethan Zhao
2025-01-16  7:27 ` Xin Li
2025-01-16  9:22   ` Ethan Zhao
2025-01-16 13:03   ` Ethan Zhao
2025-01-16 13:48     ` Xin Li
2025-01-17  0:37       ` Ethan Zhao
2025-01-17  1:21         ` H. Peter Anvin
2025-01-17  3:04           ` Ethan Zhao
2025-01-17  4:19           ` Ethan Zhao
2025-01-17  4:43             ` Xin Li
2025-01-17  5:18               ` Ethan Zhao
2025-01-17  5:54                 ` Xin Li
2025-01-17  6:58                   ` Ethan Zhao
2025-01-17 16:17                   ` H. Peter Anvin
2025-01-17 16:23                     ` H. Peter Anvin
2025-01-18  4:00                       ` Ethan Zhao
2025-01-17 16:24                 ` H. Peter Anvin
2025-01-18  3:29                   ` Ethan Zhao
2025-01-18  3:41                     ` H. Peter Anvin
2025-01-18  4:06                       ` Ethan Zhao
2025-01-18  4:09                         ` H. Peter Anvin
2025-01-18  5:55                           ` Ethan Zhao
2025-01-18  4:14                         ` H. Peter Anvin
2025-01-18  6:02                           ` Ethan Zhao
2025-01-16 15:08 ` Nikolay Borisov
2025-01-16 18:45   ` Xin Li
2025-01-17  0:40   ` Ethan Zhao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox