* [bug-report] possible performance problem in ret_to_user_from_irq
@ 2022-12-26 8:45 Hui Tang
2023-01-03 10:06 ` Russell King (Oracle)
0 siblings, 1 reply; 8+ messages in thread
From: Hui Tang @ 2022-12-26 8:45 UTC (permalink / raw)
To: Russell King; +Cc: linux-arm-kernel, linux-kernel, Jens Axboe, tanghui20
hi folks.
I found a performance problem which is introduced by commit
32d59773da38 ("arm: add support for TIF_NOTIFY_SIGNAL").
After the commit, any bit in the range of 0..15 will cause
do_work_pending() to be invoked. More frequent do_work_pending()
invoked possible result in worse performance.
Some of the tests I've done, as follows:
lmbench test base with patch
./lat_ctx -P 1 -s 0 2 7.3167 11.04
./lat_ctx -P 1 -s 16 2 8.0467 14.5367
./lat_ctx -P 1 -s 64 2 7.8667 11.43
./lat_ctx -P 1 -s 16 16 16.47 18.3667
./lat_pipe -P 1 28.1671 44.7904
libMicro-0.4.1 test base with patch
./cascade_cond -E -C 200\
-L -S -W -N "c_cond_1" -I 100 286.3333 358
When I adjust test bit, the performance problem gone.
- movs r1, r1, lsl #16
+ ldr r2, =#_TIF_WORK_MASK
+ tst r1, r2
Does anyone have a good suggestion for this problem?
should just test _TIF_WORK_MASK, as before?
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [bug-report] possible performance problem in ret_to_user_from_irq 2022-12-26 8:45 [bug-report] possible performance problem in ret_to_user_from_irq Hui Tang @ 2023-01-03 10:06 ` Russell King (Oracle) 2023-01-03 14:25 ` Jens Axboe 0 siblings, 1 reply; 8+ messages in thread From: Russell King (Oracle) @ 2023-01-03 10:06 UTC (permalink / raw) To: Hui Tang; +Cc: linux-arm-kernel, linux-kernel, Jens Axboe On Mon, Dec 26, 2022 at 04:45:20PM +0800, Hui Tang wrote: > hi folks. > > I found a performance problem which is introduced by commit > 32d59773da38 ("arm: add support for TIF_NOTIFY_SIGNAL"). > After the commit, any bit in the range of 0..15 will cause > do_work_pending() to be invoked. More frequent do_work_pending() > invoked possible result in worse performance. > > Some of the tests I've done, as follows: > lmbench test base with patch > ./lat_ctx -P 1 -s 0 2 7.3167 11.04 > ./lat_ctx -P 1 -s 16 2 8.0467 14.5367 > ./lat_ctx -P 1 -s 64 2 7.8667 11.43 > ./lat_ctx -P 1 -s 16 16 16.47 18.3667 > ./lat_pipe -P 1 28.1671 44.7904 > > libMicro-0.4.1 test base with patch > ./cascade_cond -E -C 200\ > -L -S -W -N "c_cond_1" -I 100 286.3333 358 > > When I adjust test bit, the performance problem gone. > - movs r1, r1, lsl #16 > + ldr r2, =#_TIF_WORK_MASK > + tst r1, r2 > > Does anyone have a good suggestion for this problem? > should just test _TIF_WORK_MASK, as before? I think it should be fine - but I would suggest re-organising the TIF definitions so that those TIF bits that shouldn't trigger do_work_pending are not in the first 16 bits. Note that all four bits in _TIF_SYSCALL_WORK need to stay within an 8-bit even-bit-aligned range, so the value is suitable for an immediate assembly constant. I'd suggest moving the TIF definitions for 20 to 19, and 4..7 to 20..23, and then 8 to 4. Thanks. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last! ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [bug-report] possible performance problem in ret_to_user_from_irq 2023-01-03 10:06 ` Russell King (Oracle) @ 2023-01-03 14:25 ` Jens Axboe 2023-01-03 14:34 ` Russell King (Oracle) 0 siblings, 1 reply; 8+ messages in thread From: Jens Axboe @ 2023-01-03 14:25 UTC (permalink / raw) To: Russell King (Oracle), Hui Tang; +Cc: linux-arm-kernel, linux-kernel On 1/3/23 3:06?AM, Russell King (Oracle) wrote: > On Mon, Dec 26, 2022 at 04:45:20PM +0800, Hui Tang wrote: >> hi folks. >> >> I found a performance problem which is introduced by commit >> 32d59773da38 ("arm: add support for TIF_NOTIFY_SIGNAL"). >> After the commit, any bit in the range of 0..15 will cause >> do_work_pending() to be invoked. More frequent do_work_pending() >> invoked possible result in worse performance. >> >> Some of the tests I've done? as follows: >> lmbench test base with patch >> ./lat_ctx -P 1 -s 0 2 7.3167 11.04 >> ./lat_ctx -P 1 -s 16 2 8.0467 14.5367 >> ./lat_ctx -P 1 -s 64 2 7.8667 11.43 >> ./lat_ctx -P 1 -s 16 16 16.47 18.3667 >> ./lat_pipe -P 1 28.1671 44.7904 >> >> libMicro-0.4.1 test base with patch >> ./cascade_cond -E -C 200\ >> -L -S -W -N "c_cond_1" -I 100 286.3333 358 >> >> When I adjust test bit, the performance problem gone. >> - movs r1, r1, lsl #16 >> + ldr r2, =#_TIF_WORK_MASK >> + tst r1, r2 >> >> Does anyone have a good suggestion for this problem? >> should just test _TIF_WORK_MASK, as before? > > I think it should be fine - but I would suggest re-organising the > TIF definitions so that those TIF bits that shouldn't trigger > do_work_pending are not in the first 16 bits. > > Note that all four bits in _TIF_SYSCALL_WORK need to stay within > an 8-bit even-bit-aligned range, so the value is suitable for an > immediate assembly constant. > > I'd suggest moving the TIF definitions for 20 to 19, and 4..7 to > 20..23, and then 8 to 4. Like this? diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h index aecc403b2880..7f092cb55a41 100644 --- a/arch/arm/include/asm/thread_info.h +++ b/arch/arm/include/asm/thread_info.h @@ -128,15 +128,16 @@ extern int vfp_restore_user_hwstate(struct user_vfp *, #define TIF_NEED_RESCHED 1 /* rescheduling necessary */ #define TIF_NOTIFY_RESUME 2 /* callback before returning to user */ #define TIF_UPROBE 3 /* breakpointed or singlestepping */ -#define TIF_SYSCALL_TRACE 4 /* syscall trace active */ -#define TIF_SYSCALL_AUDIT 5 /* syscall auditing active */ -#define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */ -#define TIF_SECCOMP 7 /* seccomp syscall filtering active */ -#define TIF_NOTIFY_SIGNAL 8 /* signal notifications exist */ +#define TIF_NOTIFY_SIGNAL 4 /* signal notifications exist */ #define TIF_USING_IWMMXT 17 #define TIF_MEMDIE 18 /* is terminating due to OOM killer */ -#define TIF_RESTORE_SIGMASK 20 +#define TIF_RESTORE_SIGMASK 19 +#define TIF_SYSCALL_TRACE 20 /* syscall trace active */ +#define TIF_SYSCALL_AUDIT 21 /* syscall auditing active */ +#define TIF_SYSCALL_TRACEPOINT 22 /* syscall tracepoint instrumentation */ +#define TIF_SECCOMP 23 /* seccomp syscall filtering active */ + #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) -- Jens Axboe ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [bug-report] possible performance problem in ret_to_user_from_irq 2023-01-03 14:25 ` Jens Axboe @ 2023-01-03 14:34 ` Russell King (Oracle) 2023-01-03 14:59 ` Jens Axboe 0 siblings, 1 reply; 8+ messages in thread From: Russell King (Oracle) @ 2023-01-03 14:34 UTC (permalink / raw) To: Jens Axboe; +Cc: Hui Tang, linux-arm-kernel, linux-kernel On Tue, Jan 03, 2023 at 07:25:26AM -0700, Jens Axboe wrote: > On 1/3/23 3:06?AM, Russell King (Oracle) wrote: > > On Mon, Dec 26, 2022 at 04:45:20PM +0800, Hui Tang wrote: > >> hi folks. > >> > >> I found a performance problem which is introduced by commit > >> 32d59773da38 ("arm: add support for TIF_NOTIFY_SIGNAL"). > >> After the commit, any bit in the range of 0..15 will cause > >> do_work_pending() to be invoked. More frequent do_work_pending() > >> invoked possible result in worse performance. > >> > >> Some of the tests I've done? as follows: > >> lmbench test base with patch > >> ./lat_ctx -P 1 -s 0 2 7.3167 11.04 > >> ./lat_ctx -P 1 -s 16 2 8.0467 14.5367 > >> ./lat_ctx -P 1 -s 64 2 7.8667 11.43 > >> ./lat_ctx -P 1 -s 16 16 16.47 18.3667 > >> ./lat_pipe -P 1 28.1671 44.7904 > >> > >> libMicro-0.4.1 test base with patch > >> ./cascade_cond -E -C 200\ > >> -L -S -W -N "c_cond_1" -I 100 286.3333 358 > >> > >> When I adjust test bit, the performance problem gone. > >> - movs r1, r1, lsl #16 > >> + ldr r2, =#_TIF_WORK_MASK > >> + tst r1, r2 > >> > >> Does anyone have a good suggestion for this problem? > >> should just test _TIF_WORK_MASK, as before? > > > > I think it should be fine - but I would suggest re-organising the > > TIF definitions so that those TIF bits that shouldn't trigger > > do_work_pending are not in the first 16 bits. > > > > Note that all four bits in _TIF_SYSCALL_WORK need to stay within > > an 8-bit even-bit-aligned range, so the value is suitable for an > > immediate assembly constant. > > > > I'd suggest moving the TIF definitions for 20 to 19, and 4..7 to > > 20..23, and then 8 to 4. > > Like this? > > diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h > index aecc403b2880..7f092cb55a41 100644 > --- a/arch/arm/include/asm/thread_info.h > +++ b/arch/arm/include/asm/thread_info.h > @@ -128,15 +128,16 @@ extern int vfp_restore_user_hwstate(struct user_vfp *, > #define TIF_NEED_RESCHED 1 /* rescheduling necessary */ > #define TIF_NOTIFY_RESUME 2 /* callback before returning to user */ > #define TIF_UPROBE 3 /* breakpointed or singlestepping */ > -#define TIF_SYSCALL_TRACE 4 /* syscall trace active */ > -#define TIF_SYSCALL_AUDIT 5 /* syscall auditing active */ > -#define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */ > -#define TIF_SECCOMP 7 /* seccomp syscall filtering active */ > -#define TIF_NOTIFY_SIGNAL 8 /* signal notifications exist */ > +#define TIF_NOTIFY_SIGNAL 4 /* signal notifications exist */ > > #define TIF_USING_IWMMXT 17 > #define TIF_MEMDIE 18 /* is terminating due to OOM killer */ > -#define TIF_RESTORE_SIGMASK 20 > +#define TIF_RESTORE_SIGMASK 19 > +#define TIF_SYSCALL_TRACE 20 /* syscall trace active */ > +#define TIF_SYSCALL_AUDIT 21 /* syscall auditing active */ > +#define TIF_SYSCALL_TRACEPOINT 22 /* syscall tracepoint instrumentation */ > +#define TIF_SECCOMP 23 /* seccomp syscall filtering active */ > + > > #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) > #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) Yep, LGTM, thanks. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last! ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [bug-report] possible performance problem in ret_to_user_from_irq 2023-01-03 14:34 ` Russell King (Oracle) @ 2023-01-03 14:59 ` Jens Axboe 2023-01-04 1:31 ` Hui Tang 2023-01-04 7:04 ` Hui Tang 0 siblings, 2 replies; 8+ messages in thread From: Jens Axboe @ 2023-01-03 14:59 UTC (permalink / raw) To: Russell King (Oracle); +Cc: Hui Tang, linux-arm-kernel, linux-kernel On 1/3/23 7:34?AM, Russell King (Oracle) wrote: > On Tue, Jan 03, 2023 at 07:25:26AM -0700, Jens Axboe wrote: >> On 1/3/23 3:06?AM, Russell King (Oracle) wrote: >>> On Mon, Dec 26, 2022 at 04:45:20PM +0800, Hui Tang wrote: >>>> hi folks. >>>> >>>> I found a performance problem which is introduced by commit >>>> 32d59773da38 ("arm: add support for TIF_NOTIFY_SIGNAL"). >>>> After the commit, any bit in the range of 0..15 will cause >>>> do_work_pending() to be invoked. More frequent do_work_pending() >>>> invoked possible result in worse performance. >>>> >>>> Some of the tests I've done? as follows: >>>> lmbench test base with patch >>>> ./lat_ctx -P 1 -s 0 2 7.3167 11.04 >>>> ./lat_ctx -P 1 -s 16 2 8.0467 14.5367 >>>> ./lat_ctx -P 1 -s 64 2 7.8667 11.43 >>>> ./lat_ctx -P 1 -s 16 16 16.47 18.3667 >>>> ./lat_pipe -P 1 28.1671 44.7904 >>>> >>>> libMicro-0.4.1 test base with patch >>>> ./cascade_cond -E -C 200\ >>>> -L -S -W -N "c_cond_1" -I 100 286.3333 358 >>>> >>>> When I adjust test bit, the performance problem gone. >>>> - movs r1, r1, lsl #16 >>>> + ldr r2, =#_TIF_WORK_MASK >>>> + tst r1, r2 >>>> >>>> Does anyone have a good suggestion for this problem? >>>> should just test _TIF_WORK_MASK, as before? >>> >>> I think it should be fine - but I would suggest re-organising the >>> TIF definitions so that those TIF bits that shouldn't trigger >>> do_work_pending are not in the first 16 bits. >>> >>> Note that all four bits in _TIF_SYSCALL_WORK need to stay within >>> an 8-bit even-bit-aligned range, so the value is suitable for an >>> immediate assembly constant. >>> >>> I'd suggest moving the TIF definitions for 20 to 19, and 4..7 to >>> 20..23, and then 8 to 4. >> >> Like this? >> >> diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h >> index aecc403b2880..7f092cb55a41 100644 >> --- a/arch/arm/include/asm/thread_info.h >> +++ b/arch/arm/include/asm/thread_info.h >> @@ -128,15 +128,16 @@ extern int vfp_restore_user_hwstate(struct user_vfp *, >> #define TIF_NEED_RESCHED 1 /* rescheduling necessary */ >> #define TIF_NOTIFY_RESUME 2 /* callback before returning to user */ >> #define TIF_UPROBE 3 /* breakpointed or singlestepping */ >> -#define TIF_SYSCALL_TRACE 4 /* syscall trace active */ >> -#define TIF_SYSCALL_AUDIT 5 /* syscall auditing active */ >> -#define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */ >> -#define TIF_SECCOMP 7 /* seccomp syscall filtering active */ >> -#define TIF_NOTIFY_SIGNAL 8 /* signal notifications exist */ >> +#define TIF_NOTIFY_SIGNAL 4 /* signal notifications exist */ >> >> #define TIF_USING_IWMMXT 17 >> #define TIF_MEMDIE 18 /* is terminating due to OOM killer */ >> -#define TIF_RESTORE_SIGMASK 20 >> +#define TIF_RESTORE_SIGMASK 19 >> +#define TIF_SYSCALL_TRACE 20 /* syscall trace active */ >> +#define TIF_SYSCALL_AUDIT 21 /* syscall auditing active */ >> +#define TIF_SYSCALL_TRACEPOINT 22 /* syscall tracepoint instrumentation */ >> +#define TIF_SECCOMP 23 /* seccomp syscall filtering active */ >> + >> >> #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) >> #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) > > Yep, LGTM, thanks. Hui Tang, can you give it a whirl? Just checked and it applies to 5.10-stable as well, just with a slight offset. -- Jens Axboe ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [bug-report] possible performance problem in ret_to_user_from_irq 2023-01-03 14:59 ` Jens Axboe @ 2023-01-04 1:31 ` Hui Tang 2023-01-04 7:04 ` Hui Tang 1 sibling, 0 replies; 8+ messages in thread From: Hui Tang @ 2023-01-04 1:31 UTC (permalink / raw) To: Jens Axboe, Russell King (Oracle); +Cc: linux-arm-kernel, linux-kernel On 2023/1/3 22:59, Jens Axboe wrote: > On 1/3/23 7:34?AM, Russell King (Oracle) wrote: >> On Tue, Jan 03, 2023 at 07:25:26AM -0700, Jens Axboe wrote: >>> On 1/3/23 3:06?AM, Russell King (Oracle) wrote: >>>> On Mon, Dec 26, 2022 at 04:45:20PM +0800, Hui Tang wrote: >>>>> hi folks. >>>>> >>>>> I found a performance problem which is introduced by commit >>>>> 32d59773da38 ("arm: add support for TIF_NOTIFY_SIGNAL"). >>>>> After the commit, any bit in the range of 0..15 will cause >>>>> do_work_pending() to be invoked. More frequent do_work_pending() >>>>> invoked possible result in worse performance. >>>>> >>>>> Some of the tests I've done? as follows: >>>>> lmbench test base with patch >>>>> ./lat_ctx -P 1 -s 0 2 7.3167 11.04 >>>>> ./lat_ctx -P 1 -s 16 2 8.0467 14.5367 >>>>> ./lat_ctx -P 1 -s 64 2 7.8667 11.43 >>>>> ./lat_ctx -P 1 -s 16 16 16.47 18.3667 >>>>> ./lat_pipe -P 1 28.1671 44.7904 >>>>> >>>>> libMicro-0.4.1 test base with patch >>>>> ./cascade_cond -E -C 200\ >>>>> -L -S -W -N "c_cond_1" -I 100 286.3333 358 >>>>> >>>>> When I adjust test bit, the performance problem gone. >>>>> - movs r1, r1, lsl #16 >>>>> + ldr r2, =#_TIF_WORK_MASK >>>>> + tst r1, r2 >>>>> >>>>> Does anyone have a good suggestion for this problem? >>>>> should just test _TIF_WORK_MASK, as before? >>>> >>>> I think it should be fine - but I would suggest re-organising the >>>> TIF definitions so that those TIF bits that shouldn't trigger >>>> do_work_pending are not in the first 16 bits. >>>> >>>> Note that all four bits in _TIF_SYSCALL_WORK need to stay within >>>> an 8-bit even-bit-aligned range, so the value is suitable for an >>>> immediate assembly constant. >>>> >>>> I'd suggest moving the TIF definitions for 20 to 19, and 4..7 to >>>> 20..23, and then 8 to 4. >>> >>> Like this? >>> >>> diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h >>> index aecc403b2880..7f092cb55a41 100644 >>> --- a/arch/arm/include/asm/thread_info.h >>> +++ b/arch/arm/include/asm/thread_info.h >>> @@ -128,15 +128,16 @@ extern int vfp_restore_user_hwstate(struct user_vfp *, >>> #define TIF_NEED_RESCHED 1 /* rescheduling necessary */ >>> #define TIF_NOTIFY_RESUME 2 /* callback before returning to user */ >>> #define TIF_UPROBE 3 /* breakpointed or singlestepping */ >>> -#define TIF_SYSCALL_TRACE 4 /* syscall trace active */ >>> -#define TIF_SYSCALL_AUDIT 5 /* syscall auditing active */ >>> -#define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */ >>> -#define TIF_SECCOMP 7 /* seccomp syscall filtering active */ >>> -#define TIF_NOTIFY_SIGNAL 8 /* signal notifications exist */ >>> +#define TIF_NOTIFY_SIGNAL 4 /* signal notifications exist */ >>> >>> #define TIF_USING_IWMMXT 17 >>> #define TIF_MEMDIE 18 /* is terminating due to OOM killer */ >>> -#define TIF_RESTORE_SIGMASK 20 >>> +#define TIF_RESTORE_SIGMASK 19 >>> +#define TIF_SYSCALL_TRACE 20 /* syscall trace active */ >>> +#define TIF_SYSCALL_AUDIT 21 /* syscall auditing active */ >>> +#define TIF_SYSCALL_TRACEPOINT 22 /* syscall tracepoint instrumentation */ >>> +#define TIF_SECCOMP 23 /* seccomp syscall filtering active */ >>> + >>> >>> #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) >>> #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) >> >> Yep, LGTM, thanks. > > Hui Tang, can you give it a whirl? Just checked and it applies to > 5.10-stable as well, just with a slight offset. Okay, I'll test it today. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [bug-report] possible performance problem in ret_to_user_from_irq 2023-01-03 14:59 ` Jens Axboe 2023-01-04 1:31 ` Hui Tang @ 2023-01-04 7:04 ` Hui Tang 2023-01-04 14:45 ` Jens Axboe 1 sibling, 1 reply; 8+ messages in thread From: Hui Tang @ 2023-01-04 7:04 UTC (permalink / raw) To: Jens Axboe, Russell King (Oracle); +Cc: linux-arm-kernel, linux-kernel On 2023/1/3 22:59, Jens Axboe wrote: > On 1/3/23 7:34?AM, Russell King (Oracle) wrote: >> On Tue, Jan 03, 2023 at 07:25:26AM -0700, Jens Axboe wrote: >>> On 1/3/23 3:06?AM, Russell King (Oracle) wrote: >>>> On Mon, Dec 26, 2022 at 04:45:20PM +0800, Hui Tang wrote: >>>>> hi folks. >>>>> >>>>> I found a performance problem which is introduced by commit >>>>> 32d59773da38 ("arm: add support for TIF_NOTIFY_SIGNAL"). >>>>> After the commit, any bit in the range of 0..15 will cause >>>>> do_work_pending() to be invoked. More frequent do_work_pending() >>>>> invoked possible result in worse performance. >>>>> >>>>> Some of the tests I've done? as follows: >>>>> lmbench test base with patch >>>>> ./lat_ctx -P 1 -s 0 2 7.3167 11.04 >>>>> ./lat_ctx -P 1 -s 16 2 8.0467 14.5367 >>>>> ./lat_ctx -P 1 -s 64 2 7.8667 11.43 >>>>> ./lat_ctx -P 1 -s 16 16 16.47 18.3667 >>>>> ./lat_pipe -P 1 28.1671 44.7904 >>>>> >>>>> libMicro-0.4.1 test base with patch >>>>> ./cascade_cond -E -C 200\ >>>>> -L -S -W -N "c_cond_1" -I 100 286.3333 358 >>>>> >>>>> When I adjust test bit, the performance problem gone. >>>>> - movs r1, r1, lsl #16 >>>>> + ldr r2, =#_TIF_WORK_MASK >>>>> + tst r1, r2 >>>>> >>>>> Does anyone have a good suggestion for this problem? >>>>> should just test _TIF_WORK_MASK, as before? >>>> >>>> I think it should be fine - but I would suggest re-organising the >>>> TIF definitions so that those TIF bits that shouldn't trigger >>>> do_work_pending are not in the first 16 bits. >>>> >>>> Note that all four bits in _TIF_SYSCALL_WORK need to stay within >>>> an 8-bit even-bit-aligned range, so the value is suitable for an >>>> immediate assembly constant. >>>> >>>> I'd suggest moving the TIF definitions for 20 to 19, and 4..7 to >>>> 20..23, and then 8 to 4. >>> >>> Like this? >>> >>> diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h >>> index aecc403b2880..7f092cb55a41 100644 >>> --- a/arch/arm/include/asm/thread_info.h >>> +++ b/arch/arm/include/asm/thread_info.h >>> @@ -128,15 +128,16 @@ extern int vfp_restore_user_hwstate(struct user_vfp *, >>> #define TIF_NEED_RESCHED 1 /* rescheduling necessary */ >>> #define TIF_NOTIFY_RESUME 2 /* callback before returning to user */ >>> #define TIF_UPROBE 3 /* breakpointed or singlestepping */ >>> -#define TIF_SYSCALL_TRACE 4 /* syscall trace active */ >>> -#define TIF_SYSCALL_AUDIT 5 /* syscall auditing active */ >>> -#define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */ >>> -#define TIF_SECCOMP 7 /* seccomp syscall filtering active */ >>> -#define TIF_NOTIFY_SIGNAL 8 /* signal notifications exist */ >>> +#define TIF_NOTIFY_SIGNAL 4 /* signal notifications exist */ >>> >>> #define TIF_USING_IWMMXT 17 >>> #define TIF_MEMDIE 18 /* is terminating due to OOM killer */ >>> -#define TIF_RESTORE_SIGMASK 20 >>> +#define TIF_RESTORE_SIGMASK 19 >>> +#define TIF_SYSCALL_TRACE 20 /* syscall trace active */ >>> +#define TIF_SYSCALL_AUDIT 21 /* syscall auditing active */ >>> +#define TIF_SYSCALL_TRACEPOINT 22 /* syscall tracepoint instrumentation */ >>> +#define TIF_SECCOMP 23 /* seccomp syscall filtering active */ >>> + >>> >>> #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) >>> #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) >> >> Yep, LGTM, thanks. > > Hui Tang, can you give it a whirl? Just checked and it applies to > 5.10-stable as well, just with a slight offset. With the latest patch, the testcase rusults shown in the 'new patch' column. I also retested previous commit of 32d59773da38, the results shown in the 'base' column. lmbench test base 32d59773da38 new patch ./lat_ctx -P 1 -s 0 2 8.04 11.04 8.25 ./lat_ctx -P 1 -s 16 2 9.08 14.5367 9.26 ./lat_ctx -P 1 -s 64 2 8.78 11.43 8.71 ./lat_ctx -P 1 -s 16 16 17.22 18.3667 17.32 ./lat_pipe -P 1 43.5021 44.7904 41.3729 libMicro-0.4.1 test base 32d59773da38 new patch ./cascade_cond -E -C 200\ -L -S -W -N "c_cond_1" -I 100 281 358 281 The performance problem also seem to gone with the latest patch, thanks. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [bug-report] possible performance problem in ret_to_user_from_irq 2023-01-04 7:04 ` Hui Tang @ 2023-01-04 14:45 ` Jens Axboe 0 siblings, 0 replies; 8+ messages in thread From: Jens Axboe @ 2023-01-04 14:45 UTC (permalink / raw) To: Hui Tang, Russell King (Oracle); +Cc: linux-arm-kernel, linux-kernel On 1/4/23 12:04?AM, Hui Tang wrote: > > > On 2023/1/3 22:59, Jens Axboe wrote: >> On 1/3/23 7:34?AM, Russell King (Oracle) wrote: >>> On Tue, Jan 03, 2023 at 07:25:26AM -0700, Jens Axboe wrote: >>>> On 1/3/23 3:06?AM, Russell King (Oracle) wrote: >>>>> On Mon, Dec 26, 2022 at 04:45:20PM +0800, Hui Tang wrote: >>>>>> hi folks. >>>>>> >>>>>> I found a performance problem which is introduced by commit >>>>>> 32d59773da38 ("arm: add support for TIF_NOTIFY_SIGNAL"). >>>>>> After the commit, any bit in the range of 0..15 will cause >>>>>> do_work_pending() to be invoked. More frequent do_work_pending() >>>>>> invoked possible result in worse performance. >>>>>> >>>>>> Some of the tests I've done? as follows: >>>>>> lmbench test base with patch >>>>>> ./lat_ctx -P 1 -s 0 2 7.3167 11.04 >>>>>> ./lat_ctx -P 1 -s 16 2 8.0467 14.5367 >>>>>> ./lat_ctx -P 1 -s 64 2 7.8667 11.43 >>>>>> ./lat_ctx -P 1 -s 16 16 16.47 18.3667 >>>>>> ./lat_pipe -P 1 28.1671 44.7904 >>>>>> >>>>>> libMicro-0.4.1 test base with patch >>>>>> ./cascade_cond -E -C 200\ >>>>>> -L -S -W -N "c_cond_1" -I 100 286.3333 358 >>>>>> >>>>>> When I adjust test bit, the performance problem gone. >>>>>> - movs r1, r1, lsl #16 >>>>>> + ldr r2, =#_TIF_WORK_MASK >>>>>> + tst r1, r2 >>>>>> >>>>>> Does anyone have a good suggestion for this problem? >>>>>> should just test _TIF_WORK_MASK, as before? >>>>> >>>>> I think it should be fine - but I would suggest re-organising the >>>>> TIF definitions so that those TIF bits that shouldn't trigger >>>>> do_work_pending are not in the first 16 bits. >>>>> >>>>> Note that all four bits in _TIF_SYSCALL_WORK need to stay within >>>>> an 8-bit even-bit-aligned range, so the value is suitable for an >>>>> immediate assembly constant. >>>>> >>>>> I'd suggest moving the TIF definitions for 20 to 19, and 4..7 to >>>>> 20..23, and then 8 to 4. >>>> >>>> Like this? >>>> >>>> diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h >>>> index aecc403b2880..7f092cb55a41 100644 >>>> --- a/arch/arm/include/asm/thread_info.h >>>> +++ b/arch/arm/include/asm/thread_info.h >>>> @@ -128,15 +128,16 @@ extern int vfp_restore_user_hwstate(struct user_vfp *, >>>> #define TIF_NEED_RESCHED 1 /* rescheduling necessary */ >>>> #define TIF_NOTIFY_RESUME 2 /* callback before returning to user */ >>>> #define TIF_UPROBE 3 /* breakpointed or singlestepping */ >>>> -#define TIF_SYSCALL_TRACE 4 /* syscall trace active */ >>>> -#define TIF_SYSCALL_AUDIT 5 /* syscall auditing active */ >>>> -#define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */ >>>> -#define TIF_SECCOMP 7 /* seccomp syscall filtering active */ >>>> -#define TIF_NOTIFY_SIGNAL 8 /* signal notifications exist */ >>>> +#define TIF_NOTIFY_SIGNAL 4 /* signal notifications exist */ >>>> >>>> #define TIF_USING_IWMMXT 17 >>>> #define TIF_MEMDIE 18 /* is terminating due to OOM killer */ >>>> -#define TIF_RESTORE_SIGMASK 20 >>>> +#define TIF_RESTORE_SIGMASK 19 >>>> +#define TIF_SYSCALL_TRACE 20 /* syscall trace active */ >>>> +#define TIF_SYSCALL_AUDIT 21 /* syscall auditing active */ >>>> +#define TIF_SYSCALL_TRACEPOINT 22 /* syscall tracepoint instrumentation */ >>>> +#define TIF_SECCOMP 23 /* seccomp syscall filtering active */ >>>> + >>>> >>>> #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) >>>> #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) >>> >>> Yep, LGTM, thanks. >> >> Hui Tang, can you give it a whirl? Just checked and it applies to >> 5.10-stable as well, just with a slight offset. > > With the latest patch, the testcase rusults shown in the 'new patch' column. > I also retested previous commit of 32d59773da38, the results shown in the 'base' column. > > lmbench test base 32d59773da38 new patch > ./lat_ctx -P 1 -s 0 2 8.04 11.04 8.25 > ./lat_ctx -P 1 -s 16 2 9.08 14.5367 9.26 > ./lat_ctx -P 1 -s 64 2 8.78 11.43 8.71 > ./lat_ctx -P 1 -s 16 16 17.22 18.3667 17.32 > ./lat_pipe -P 1 43.5021 44.7904 41.3729 > > libMicro-0.4.1 test base 32d59773da38 new patch > ./cascade_cond -E -C 200\ > -L -S -W -N "c_cond_1" -I 100 281 358 281 > > The performance problem also seem to gone with the latest patch, thanks. Thanks for testing! I'm going to send it out and add your tested-by (and reported-by). -- Jens Axboe ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-01-04 14:46 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-12-26 8:45 [bug-report] possible performance problem in ret_to_user_from_irq Hui Tang 2023-01-03 10:06 ` Russell King (Oracle) 2023-01-03 14:25 ` Jens Axboe 2023-01-03 14:34 ` Russell King (Oracle) 2023-01-03 14:59 ` Jens Axboe 2023-01-04 1:31 ` Hui Tang 2023-01-04 7:04 ` Hui Tang 2023-01-04 14:45 ` Jens Axboe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox