* [PATCH] powerpc/kernel: improve FP and vector registers restoration
@ 2017-06-02 20:56 Breno Leitao
0 siblings, 0 replies; 8+ messages in thread
From: Breno Leitao @ 2017-06-02 20:56 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Breno Leitao, Gustavo Romero
Currently tsk->thread->load_vec and load_fp are not initialized during a
task creation, which set garbage to these variables (non-zero value).
These variables will be checked later at restore_math() to validate if the
FP and vectors are being utilized. Since these values might be non-zero,
the restore_math() will continue to save the FP and vectors even if they
were never utilized before the userspace application. load_fp and load_vec
counters will then overflow and the FP and Altivec will be finally
disabled, but before that condition is reached (counter overflow) several
context switches restored FP and vector registers without need, causing a
performance degradation.
Signed-off-by: Breno Leitao <breno.leitao@gmail.com>
Signed-off-by: Gustavo Romero <gusbromero@gmail.com>
---
arch/powerpc/kernel/process.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index baae104b16c7..a9435397eab8 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1666,6 +1666,7 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
#ifdef CONFIG_VSX
current->thread.used_vsr = 0;
#endif
+ current->thread.load_fp = 0;
memset(¤t->thread.fp_state, 0, sizeof(current->thread.fp_state));
current->thread.fp_save_area = NULL;
#ifdef CONFIG_ALTIVEC
@@ -1674,6 +1675,7 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
current->thread.vr_save_area = NULL;
current->thread.vrsave = 0;
current->thread.used_vr = 0;
+ current->thread.load_vec = 0;
#endif /* CONFIG_ALTIVEC */
#ifdef CONFIG_SPE
memset(current->thread.evr, 0, sizeof(current->thread.evr));
--
2.11.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH] powerpc/kernel: improve FP and vector registers restoration
@ 2017-06-02 21:43 Breno Leitao
2017-06-02 22:04 ` Anton Blanchard
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Breno Leitao @ 2017-06-02 21:43 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Breno Leitao, Gustavo Romero
Currently tsk->thread->load_vec and load_fp are not initialized during a
task creation, which set garbage to these variables (non-zero value).
These variables will be checked later at restore_math() to validate if the
FP and vectors are being utilized. Since these values might be non-zero,
the restore_math() will continue to save the FP and vectors even if they
were never utilized before the userspace application. load_fp and load_vec
counters will then overflow and the FP and Altivec will be finally
disabled, but before that condition is reached (counter overflow) several
context switches restored FP and vector registers without need, causing a
performance degradation.
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Gustavo Romero <gusbromero@gmail.com>
---
arch/powerpc/kernel/process.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index baae104b16c7..a9435397eab8 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1666,6 +1666,7 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
#ifdef CONFIG_VSX
current->thread.used_vsr = 0;
#endif
+ current->thread.load_fp = 0;
memset(¤t->thread.fp_state, 0, sizeof(current->thread.fp_state));
current->thread.fp_save_area = NULL;
#ifdef CONFIG_ALTIVEC
@@ -1674,6 +1675,7 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
current->thread.vr_save_area = NULL;
current->thread.vrsave = 0;
current->thread.used_vr = 0;
+ current->thread.load_vec = 0;
#endif /* CONFIG_ALTIVEC */
#ifdef CONFIG_SPE
memset(current->thread.evr, 0, sizeof(current->thread.evr));
--
2.11.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] powerpc/kernel: improve FP and vector registers restoration
2017-06-02 21:43 [PATCH] powerpc/kernel: improve FP and vector registers restoration Breno Leitao
@ 2017-06-02 22:04 ` Anton Blanchard
2017-06-03 22:42 ` Breno Leitao
2017-06-05 5:59 ` Michael Ellerman
2017-06-08 4:05 ` Michael Ellerman
2 siblings, 1 reply; 8+ messages in thread
From: Anton Blanchard @ 2017-06-02 22:04 UTC (permalink / raw)
To: Breno Leitao; +Cc: linuxppc-dev, Gustavo Romero
Hi Breno,
> Currently tsk->thread->load_vec and load_fp are not initialized
> during a task creation, which set garbage to these variables
> (non-zero value).
Nice catch! It seems like we should zero load_tm too though?
Acked-by: Anton Blanchard <anton@samba.org>
Anton
> These variables will be checked later at restore_math() to validate
> if the FP and vectors are being utilized. Since these values might be
> non-zero, the restore_math() will continue to save the FP and vectors
> even if they were never utilized before the userspace application.
> load_fp and load_vec counters will then overflow and the FP and
> Altivec will be finally disabled, but before that condition is
> reached (counter overflow) several context switches restored FP and
> vector registers without need, causing a performance degradation.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> Signed-off-by: Gustavo Romero <gusbromero@gmail.com>
> ---
> arch/powerpc/kernel/process.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/powerpc/kernel/process.c
> b/arch/powerpc/kernel/process.c index baae104b16c7..a9435397eab8
> 100644 --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -1666,6 +1666,7 @@ void start_thread(struct pt_regs *regs,
> unsigned long start, unsigned long sp) #ifdef CONFIG_VSX
> current->thread.used_vsr = 0;
> #endif
> + current->thread.load_fp = 0;
> memset(¤t->thread.fp_state, 0,
> sizeof(current->thread.fp_state)); current->thread.fp_save_area =
> NULL; #ifdef CONFIG_ALTIVEC
> @@ -1674,6 +1675,7 @@ void start_thread(struct pt_regs *regs,
> unsigned long start, unsigned long sp) current->thread.vr_save_area =
> NULL; current->thread.vrsave = 0;
> current->thread.used_vr = 0;
> + current->thread.load_vec = 0;
> #endif /* CONFIG_ALTIVEC */
> #ifdef CONFIG_SPE
> memset(current->thread.evr, 0, sizeof(current->thread.evr));
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] powerpc/kernel: improve FP and vector registers restoration
2017-06-02 22:04 ` Anton Blanchard
@ 2017-06-03 22:42 ` Breno Leitao
2017-06-04 1:38 ` Anton Blanchard
0 siblings, 1 reply; 8+ messages in thread
From: Breno Leitao @ 2017-06-03 22:42 UTC (permalink / raw)
To: Anton Blanchard; +Cc: linuxppc-dev, Gustavo Romero
Hi Anton,
On Sat, Jun 03, 2017 at 08:04:11AM +1000, Anton Blanchard wrote:
> Hi Breno,
>
> > Currently tsk->thread->load_vec and load_fp are not initialized
> > during a task creation, which set garbage to these variables
> > (non-zero value).
>
> Nice catch! It seems like we should zero load_tm too though?
Yes, it seems we need to zero load_tm also, since it does not seem to be
zeroed anywhere else.
But I did some tests, and load_tm is always zero after start_thread()
is being called.
In fact, start_thread() is being called and pt_regs->load_tm is already
zero since the function start.
I also wrote a SystemTap script[1] to investigate it better, and I've
never seen a single load_tm != 0 in a my machine. I tested on both
POWER8 bare metal and KVM guests. (load_vec and load_fp happened to have
garbage all the time)
Any idea if this is just occasional event, or, if there is someone
zeroing it in an obscure code?
[1] https://github.com/leitao/htm_torture/blob/master/systemtap/load_tm_at_start_thread.stap
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] powerpc/kernel: improve FP and vector registers restoration
2017-06-03 22:42 ` Breno Leitao
@ 2017-06-04 1:38 ` Anton Blanchard
2017-06-04 14:34 ` Breno Leitao
0 siblings, 1 reply; 8+ messages in thread
From: Anton Blanchard @ 2017-06-04 1:38 UTC (permalink / raw)
To: Breno Leitao; +Cc: linuxppc-dev, Gustavo Romero
On Sat, 3 Jun 2017 19:42:14 -0300
Breno Leitao <leitao@debian.org> wrote:
> Hi Anton,
>
> On Sat, Jun 03, 2017 at 08:04:11AM +1000, Anton Blanchard wrote:
> > Hi Breno,
> >
> > > Currently tsk->thread->load_vec and load_fp are not initialized
> > > during a task creation, which set garbage to these variables
> > > (non-zero value).
> >
> > Nice catch! It seems like we should zero load_tm too though?
>
> Yes, it seems we need to zero load_tm also, since it does not seem to
> be zeroed anywhere else.
>
> But I did some tests, and load_tm is always zero after start_thread()
> is being called.
>
> In fact, start_thread() is being called and pt_regs->load_tm is
> already zero since the function start.
>
> I also wrote a SystemTap script[1] to investigate it better, and I've
> never seen a single load_tm != 0 in a my machine. I tested on both
> POWER8 bare metal and KVM guests. (load_vec and load_fp happened to
> have garbage all the time)
>
> Any idea if this is just occasional event, or, if there is someone
> zeroing it in an obscure code?
Quite likely no one uses TM :) Try:
#include <unistd.h>
int main(void)
{
__builtin_tbegin(0);
execlp("/bin/true", "/bin/true", NULL);
}
Anton
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] powerpc/kernel: improve FP and vector registers restoration
2017-06-04 1:38 ` Anton Blanchard
@ 2017-06-04 14:34 ` Breno Leitao
0 siblings, 0 replies; 8+ messages in thread
From: Breno Leitao @ 2017-06-04 14:34 UTC (permalink / raw)
To: Anton Blanchard; +Cc: linuxppc-dev, Gustavo Romero
On Sun, Jun 04, 2017 at 11:38:14AM +1000, Anton Blanchard wrote:
> On Sat, 3 Jun 2017 19:42:14 -0300
> Breno Leitao <leitao@debian.org> wrote:
>
> > Hi Anton,
> >
> > On Sat, Jun 03, 2017 at 08:04:11AM +1000, Anton Blanchard wrote:
> > > Hi Breno,
> > >
> > > > Currently tsk->thread->load_vec and load_fp are not initialized
> > > > during a task creation, which set garbage to these variables
> > > > (non-zero value).
> > >
> > > Nice catch! It seems like we should zero load_tm too though?
> >
> > Yes, it seems we need to zero load_tm also, since it does not seem to
> > be zeroed anywhere else.
> >
> > But I did some tests, and load_tm is always zero after start_thread()
> > is being called.
> >
> > In fact, start_thread() is being called and pt_regs->load_tm is
> > already zero since the function start.
> >
> > I also wrote a SystemTap script[1] to investigate it better, and I've
> > never seen a single load_tm != 0 in a my machine. I tested on both
> > POWER8 bare metal and KVM guests. (load_vec and load_fp happened to
> > have garbage all the time)
> >
> > Any idea if this is just occasional event, or, if there is someone
> > zeroing it in an obscure code?
>
> Quite likely no one uses TM :) Try:
In fact, I had tested with TM[1] and haven't seen any issue, but I was not
calling a nested application (through execve() syscall). Somehow if I
call "$ ./tm_application ; /bin/true", I do not see a non-zero load_tm
in the new task->thread.
On the other side, I see the corruption with your test case, mainly if I
sleep after 'tbegin.' and before execlp(), giving a chance to have
load_tm incremented, and this value is being inherited in the new
task->thread.
This is obviously wrong, I will send a patch to have it fixed.
Thanks for the guidance!
[1] https://github.com/leitao/htm_torture
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] powerpc/kernel: improve FP and vector registers restoration
2017-06-02 21:43 [PATCH] powerpc/kernel: improve FP and vector registers restoration Breno Leitao
2017-06-02 22:04 ` Anton Blanchard
@ 2017-06-05 5:59 ` Michael Ellerman
2017-06-08 4:05 ` Michael Ellerman
2 siblings, 0 replies; 8+ messages in thread
From: Michael Ellerman @ 2017-06-05 5:59 UTC (permalink / raw)
To: Breno Leitao, linuxppc-dev; +Cc: Breno Leitao, Gustavo Romero
Breno Leitao <leitao@debian.org> writes:
> Currently tsk->thread->load_vec and load_fp are not initialized during a
> task creation, which set garbage to these variables (non-zero value).
>
> These variables will be checked later at restore_math() to validate if the
> FP and vectors are being utilized. Since these values might be non-zero,
> the restore_math() will continue to save the FP and vectors even if they
> were never utilized before the userspace application. load_fp and load_vec
> counters will then overflow and the FP and Altivec will be finally
> disabled, but before that condition is reached (counter overflow) several
> context switches restored FP and vector registers without need, causing a
> performance degradation.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> Signed-off-by: Gustavo Romero <gusbromero@gmail.com>
Thanks, I tweaked the wording a little and added:
Fixes: 70fe3d980f5f ("powerpc: Restore FPU/VEC/VSX if previously used")
Cc: stable@vger.kernel.org # v4.6+
cheers
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: powerpc/kernel: improve FP and vector registers restoration
2017-06-02 21:43 [PATCH] powerpc/kernel: improve FP and vector registers restoration Breno Leitao
2017-06-02 22:04 ` Anton Blanchard
2017-06-05 5:59 ` Michael Ellerman
@ 2017-06-08 4:05 ` Michael Ellerman
2 siblings, 0 replies; 8+ messages in thread
From: Michael Ellerman @ 2017-06-08 4:05 UTC (permalink / raw)
To: Breno Leitao, linuxppc-dev; +Cc: Breno Leitao, Gustavo Romero
On Fri, 2017-06-02 at 21:43:30 UTC, Breno Leitao wrote:
> Currently tsk->thread->load_vec and load_fp are not initialized during a
> task creation, which set garbage to these variables (non-zero value).
>
> These variables will be checked later at restore_math() to validate if the
> FP and vectors are being utilized. Since these values might be non-zero,
> the restore_math() will continue to save the FP and vectors even if they
> were never utilized before the userspace application. load_fp and load_vec
> counters will then overflow and the FP and Altivec will be finally
> disabled, but before that condition is reached (counter overflow) several
> context switches restored FP and vector registers without need, causing a
> performance degradation.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> Signed-off-by: Gustavo Romero <gusbromero@gmail.com>
> Acked-by: Anton Blanchard <anton@samba.org>
Applied to powerpc fixes, thanks.
https://git.kernel.org/powerpc/c/1195892c091a15cc862f4e202482a3
cheers
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2017-06-08 4:05 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-02 21:43 [PATCH] powerpc/kernel: improve FP and vector registers restoration Breno Leitao
2017-06-02 22:04 ` Anton Blanchard
2017-06-03 22:42 ` Breno Leitao
2017-06-04 1:38 ` Anton Blanchard
2017-06-04 14:34 ` Breno Leitao
2017-06-05 5:59 ` Michael Ellerman
2017-06-08 4:05 ` Michael Ellerman
-- strict thread matches above, loose matches on Subject: below --
2017-06-02 20:56 [PATCH] " Breno Leitao
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).