* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-24 0:32 ` Boqun Feng
0 siblings, 0 replies; 80+ messages in thread
From: Boqun Feng @ 2023-04-24 0:32 UTC (permalink / raw)
To: Joel Fernandes
Cc: Zhouyi Zhou, linuxppc-dev, rcu, linux-kernel, lance,
Paul E. McKenney, Michael Ellerman
On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote:
> On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >
> > Dear PowerPC and RCU developers:
> > During the RCU torture test on mainline (on the VM of Opensource Lab
> > of Oregon State University), SRCU-P failed with __stack_chk_fail:
> > [ 264.381952][ T99] [c000000006c7bab0] [c0000000010c67c0]
> > dump_stack_lvl+0x94/0xd8 (unreliable)
> > [ 264.383786][ T99] [c000000006c7bae0] [c00000000014fc94] panic+0x19c/0x468
> > [ 264.385128][ T99] [c000000006c7bb80] [c0000000010fca24]
> > __stack_chk_fail+0x24/0x30
> > [ 264.386610][ T99] [c000000006c7bbe0] [c0000000002293b4]
> > srcu_gp_start_if_needed+0x5c4/0x5d0
> > [ 264.388188][ T99] [c000000006c7bc70] [c00000000022f7f4]
> > srcu_torture_call+0x34/0x50
> > [ 264.389611][ T99] [c000000006c7bc90] [c00000000022b5e8]
> > rcu_torture_fwd_prog+0x8c8/0xa60
> > [ 264.391439][ T99] [c000000006c7be00] [c00000000018e37c] kthread+0x15c/0x170
> > [ 264.392792][ T99] [c000000006c7be50] [c00000000000df94]
> > ret_from_kernel_thread+0x5c/0x64
> > The kernel config file can be found in [1].
> > And I write a bash script to accelerate the bug reproducing [2].
> > After a week's debugging, I found the cause of the bug is because the
> > register r10 used to judge for stack overflow is not constant between
> > context switches.
> > The assembly code for srcu_gp_start_if_needed is located at [3]:
> > c000000000226eb4: 78 6b aa 7d mr r10,r13
> > c000000000226eb8: 14 42 29 7d add r9,r9,r8
> > c000000000226ebc: ac 04 00 7c hwsync
> > c000000000226ec0: 10 00 7b 3b addi r27,r27,16
> > c000000000226ec4: 14 da 29 7d add r9,r9,r27
> > c000000000226ec8: a8 48 00 7d ldarx r8,0,r9
> > c000000000226ecc: 01 00 08 31 addic r8,r8,1
> > c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9
> > c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8
> > <srcu_gp_start_if_needed+0x1c8>
> > c000000000226ed8: 28 00 21 e9 ld r9,40(r1)
> > c000000000226edc: 78 0c 4a e9 ld r10,3192(r10)
> > c000000000226ee0: 79 52 29 7d xor. r9,r9,r10
> > c000000000226ee4: 00 00 40 39 li r10,0
> > c000000000226ee8: b8 03 82 40 bne c0000000002272a0
> > <srcu_gp_start_if_needed+0x5a0>
> > by debugging, I see the r10 is assigned with r13 on c000000000226eb4,
> > but if there is a context-switch before c000000000226edc, a false
> > positive will be reported.
> >
> > [1] http://154.220.3.115/logs/0422/configformainline.txt
> > [2] 154.220.3.115/logs/0422/whilebash.sh
> > [3] http://154.220.3.115/logs/0422/srcu_gp_start_if_needed.txt
> >
> > My analysis and debugging may not be correct, but the bug is easily
> > reproducible.
>
> If this is a bug in the stack smashing protection as you seem to hint,
> I wonder if you see the issue with a specific gcc version and is a
> compiler-specific issue. It's hard to say, but considering this I
Very likely, more asm code from Zhouyi's link:
This is the __srcu_read_unlock_nmisafe(), since "hwsync" is
smp_mb__{after,before}_atomic(), and the following code is first
barrier then atomic, so it's the unlock.
c000000000226eb4: 78 6b aa 7d mr r10,r13
^ r13 is the pointer to percpu data on PPC64 kernel, and it's also
the pointer to TLS data for userspace code.
c000000000226eb8: 14 42 29 7d add r9,r9,r8
c000000000226ebc: ac 04 00 7c hwsync
c000000000226ec0: 10 00 7b 3b addi r27,r27,16
c000000000226ec4: 14 da 29 7d add r9,r9,r27
c000000000226ec8: a8 48 00 7d ldarx r8,0,r9
c000000000226ecc: 01 00 08 31 addic r8,r8,1
c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9
c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 <srcu_gp_start_if_needed+0x1c8>
c000000000226ed8: 28 00 21 e9 ld r9,40(r1)
c000000000226edc: 78 0c 4a e9 ld r10,3192(r10)
here I think that the compiler is using r10 as an alias to r13, since
for userspace program, it's safe to assume the TLS pointer doesn't
change. However this is not true for kernel percpu pointer.
The real intention here is to compare 40(r1) vs 3192(r13) for stack
guard checking, however since r13 is the percpu pointer in kernel, so
the value of r13 can be changed if the thread gets scheduled to a
different CPU after reading r13 for r10.
__srcu_read_unlock_nmisafe() triggers this issue, because:
* it contains a read from r13
* it locates at the very end of srcu_gp_start_if_needed().
This gives the compiler more opportunity to "optimize" a read from r13
away.
c000000000226ee0: 79 52 29 7d xor. r9,r9,r10
c000000000226ee4: 00 00 40 39 li r10,0
c000000000226ee8: b8 03 82 40 bne c0000000002272a0 <srcu_gp_start_if_needed+0x5a0>
As a result, here triggers __stack_chk_fail if mis-match.
If I'm correct, the following should be a workaround:
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index ab4ee58af84b..f5ae3be3d04d 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -747,6 +747,7 @@ void __srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx)
smp_mb__before_atomic(); /* C */ /* Avoid leaking the critical section. */
atomic_long_inc(&sdp->srcu_unlock_count[idx]);
+ asm volatile("" : : : "r13", "memory");
}
EXPORT_SYMBOL_GPL(__srcu_read_unlock_nmisafe);
Zhouyi, could you give a try? Note I think the "memory" clobber here is
unnecesarry, but I just add it in case I'm wrong.
Needless to say, the correct fix is to make ppc stack protector aware of
r13 is volatile.
Regards,
Boqun
> think it's important for you to mention the compiler version in your
> report (along with kernel version, kernel logs etc.)
>
> thanks,
>
> - Joel
>
>
> - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-24 0:32 ` Boqun Feng
@ 2023-04-24 4:00 ` Zhouyi Zhou
-1 siblings, 0 replies; 80+ messages in thread
From: Zhouyi Zhou @ 2023-04-24 4:00 UTC (permalink / raw)
To: Boqun Feng
Cc: Paul E. McKenney, linux-kernel, rcu, lance, Joel Fernandes,
linuxppc-dev
Thank Boqun for your wonderful analysis!
On Mon, Apr 24, 2023 at 8:33 AM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote:
> > On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> > >
> > > Dear PowerPC and RCU developers:
> > > During the RCU torture test on mainline (on the VM of Opensource Lab
> > > of Oregon State University), SRCU-P failed with __stack_chk_fail:
> > > [ 264.381952][ T99] [c000000006c7bab0] [c0000000010c67c0]
> > > dump_stack_lvl+0x94/0xd8 (unreliable)
> > > [ 264.383786][ T99] [c000000006c7bae0] [c00000000014fc94] panic+0x19c/0x468
> > > [ 264.385128][ T99] [c000000006c7bb80] [c0000000010fca24]
> > > __stack_chk_fail+0x24/0x30
> > > [ 264.386610][ T99] [c000000006c7bbe0] [c0000000002293b4]
> > > srcu_gp_start_if_needed+0x5c4/0x5d0
> > > [ 264.388188][ T99] [c000000006c7bc70] [c00000000022f7f4]
> > > srcu_torture_call+0x34/0x50
> > > [ 264.389611][ T99] [c000000006c7bc90] [c00000000022b5e8]
> > > rcu_torture_fwd_prog+0x8c8/0xa60
> > > [ 264.391439][ T99] [c000000006c7be00] [c00000000018e37c] kthread+0x15c/0x170
> > > [ 264.392792][ T99] [c000000006c7be50] [c00000000000df94]
> > > ret_from_kernel_thread+0x5c/0x64
> > > The kernel config file can be found in [1].
> > > And I write a bash script to accelerate the bug reproducing [2].
> > > After a week's debugging, I found the cause of the bug is because the
> > > register r10 used to judge for stack overflow is not constant between
> > > context switches.
> > > The assembly code for srcu_gp_start_if_needed is located at [3]:
> > > c000000000226eb4: 78 6b aa 7d mr r10,r13
> > > c000000000226eb8: 14 42 29 7d add r9,r9,r8
> > > c000000000226ebc: ac 04 00 7c hwsync
> > > c000000000226ec0: 10 00 7b 3b addi r27,r27,16
> > > c000000000226ec4: 14 da 29 7d add r9,r9,r27
> > > c000000000226ec8: a8 48 00 7d ldarx r8,0,r9
> > > c000000000226ecc: 01 00 08 31 addic r8,r8,1
> > > c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9
> > > c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8
> > > <srcu_gp_start_if_needed+0x1c8>
> > > c000000000226ed8: 28 00 21 e9 ld r9,40(r1)
> > > c000000000226edc: 78 0c 4a e9 ld r10,3192(r10)
> > > c000000000226ee0: 79 52 29 7d xor. r9,r9,r10
> > > c000000000226ee4: 00 00 40 39 li r10,0
> > > c000000000226ee8: b8 03 82 40 bne c0000000002272a0
> > > <srcu_gp_start_if_needed+0x5a0>
> > > by debugging, I see the r10 is assigned with r13 on c000000000226eb4,
> > > but if there is a context-switch before c000000000226edc, a false
> > > positive will be reported.
> > >
> > > [1] http://154.220.3.115/logs/0422/configformainline.txt
> > > [2] 154.220.3.115/logs/0422/whilebash.sh
> > > [3] http://154.220.3.115/logs/0422/srcu_gp_start_if_needed.txt
> > >
> > > My analysis and debugging may not be correct, but the bug is easily
> > > reproducible.
> >
> > If this is a bug in the stack smashing protection as you seem to hint,
> > I wonder if you see the issue with a specific gcc version and is a
> > compiler-specific issue. It's hard to say, but considering this I
>
> Very likely, more asm code from Zhouyi's link:
>
> This is the __srcu_read_unlock_nmisafe(), since "hwsync" is
> smp_mb__{after,before}_atomic(), and the following code is first
> barrier then atomic, so it's the unlock.
>
> c000000000226eb4: 78 6b aa 7d mr r10,r13
>
> ^ r13 is the pointer to percpu data on PPC64 kernel, and it's also
> the pointer to TLS data for userspace code.
>
> c000000000226eb8: 14 42 29 7d add r9,r9,r8
> c000000000226ebc: ac 04 00 7c hwsync
> c000000000226ec0: 10 00 7b 3b addi r27,r27,16
> c000000000226ec4: 14 da 29 7d add r9,r9,r27
> c000000000226ec8: a8 48 00 7d ldarx r8,0,r9
> c000000000226ecc: 01 00 08 31 addic r8,r8,1
> c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9
> c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 <srcu_gp_start_if_needed+0x1c8>
> c000000000226ed8: 28 00 21 e9 ld r9,40(r1)
> c000000000226edc: 78 0c 4a e9 ld r10,3192(r10)
>
> here I think that the compiler is using r10 as an alias to r13, since
> for userspace program, it's safe to assume the TLS pointer doesn't
> change. However this is not true for kernel percpu pointer.
I learned a lot from your analysis, this is a fruitful learning
journey for me ;-)
>
> The real intention here is to compare 40(r1) vs 3192(r13) for stack
> guard checking, however since r13 is the percpu pointer in kernel, so
> the value of r13 can be changed if the thread gets scheduled to a
> different CPU after reading r13 for r10.
>
> __srcu_read_unlock_nmisafe() triggers this issue, because:
>
> * it contains a read from r13
> * it locates at the very end of srcu_gp_start_if_needed().
>
> This gives the compiler more opportunity to "optimize" a read from r13
> away.
Ah, this why adding __srcu_read_unlock_nmisafe() triggers this issue.
>
> c000000000226ee0: 79 52 29 7d xor. r9,r9,r10
> c000000000226ee4: 00 00 40 39 li r10,0
> c000000000226ee8: b8 03 82 40 bne c0000000002272a0 <srcu_gp_start_if_needed+0x5a0>
>
> As a result, here triggers __stack_chk_fail if mis-match.
>
> If I'm correct, the following should be a workaround:
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index ab4ee58af84b..f5ae3be3d04d 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -747,6 +747,7 @@ void __srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx)
>
> smp_mb__before_atomic(); /* C */ /* Avoid leaking the critical section. */
> atomic_long_inc(&sdp->srcu_unlock_count[idx]);
> + asm volatile("" : : : "r13", "memory");
> }
> EXPORT_SYMBOL_GPL(__srcu_read_unlock_nmisafe);
>
> Zhouyi, could you give a try? Note I think the "memory" clobber here is
> unnecesarry, but I just add it in case I'm wrong.
After applying above, the srcu_gp_start_if_needed becomes
http://140.211.169.189/0424/srcu_gp_start_if_needed.txt now.
Yes, the modified kernel has survived > 2 hours' test, while the
original kernel will certainly fail within 3 minutes.
>
>
> Needless to say, the correct fix is to make ppc stack protector aware of
> r13 is volatile.
Yes, agree, thank you
Thanks you all
Regards
Zhouyi
>
> Regards,
> Boqun
>
> > think it's important for you to mention the compiler version in your
> > report (along with kernel version, kernel logs etc.)
> >
> > thanks,
> >
> > - Joel
> >
> >
> > - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-24 4:00 ` Zhouyi Zhou
0 siblings, 0 replies; 80+ messages in thread
From: Zhouyi Zhou @ 2023-04-24 4:00 UTC (permalink / raw)
To: Boqun Feng
Cc: Joel Fernandes, linuxppc-dev, rcu, linux-kernel, lance,
Paul E. McKenney, Michael Ellerman
Thank Boqun for your wonderful analysis!
On Mon, Apr 24, 2023 at 8:33 AM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote:
> > On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> > >
> > > Dear PowerPC and RCU developers:
> > > During the RCU torture test on mainline (on the VM of Opensource Lab
> > > of Oregon State University), SRCU-P failed with __stack_chk_fail:
> > > [ 264.381952][ T99] [c000000006c7bab0] [c0000000010c67c0]
> > > dump_stack_lvl+0x94/0xd8 (unreliable)
> > > [ 264.383786][ T99] [c000000006c7bae0] [c00000000014fc94] panic+0x19c/0x468
> > > [ 264.385128][ T99] [c000000006c7bb80] [c0000000010fca24]
> > > __stack_chk_fail+0x24/0x30
> > > [ 264.386610][ T99] [c000000006c7bbe0] [c0000000002293b4]
> > > srcu_gp_start_if_needed+0x5c4/0x5d0
> > > [ 264.388188][ T99] [c000000006c7bc70] [c00000000022f7f4]
> > > srcu_torture_call+0x34/0x50
> > > [ 264.389611][ T99] [c000000006c7bc90] [c00000000022b5e8]
> > > rcu_torture_fwd_prog+0x8c8/0xa60
> > > [ 264.391439][ T99] [c000000006c7be00] [c00000000018e37c] kthread+0x15c/0x170
> > > [ 264.392792][ T99] [c000000006c7be50] [c00000000000df94]
> > > ret_from_kernel_thread+0x5c/0x64
> > > The kernel config file can be found in [1].
> > > And I write a bash script to accelerate the bug reproducing [2].
> > > After a week's debugging, I found the cause of the bug is because the
> > > register r10 used to judge for stack overflow is not constant between
> > > context switches.
> > > The assembly code for srcu_gp_start_if_needed is located at [3]:
> > > c000000000226eb4: 78 6b aa 7d mr r10,r13
> > > c000000000226eb8: 14 42 29 7d add r9,r9,r8
> > > c000000000226ebc: ac 04 00 7c hwsync
> > > c000000000226ec0: 10 00 7b 3b addi r27,r27,16
> > > c000000000226ec4: 14 da 29 7d add r9,r9,r27
> > > c000000000226ec8: a8 48 00 7d ldarx r8,0,r9
> > > c000000000226ecc: 01 00 08 31 addic r8,r8,1
> > > c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9
> > > c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8
> > > <srcu_gp_start_if_needed+0x1c8>
> > > c000000000226ed8: 28 00 21 e9 ld r9,40(r1)
> > > c000000000226edc: 78 0c 4a e9 ld r10,3192(r10)
> > > c000000000226ee0: 79 52 29 7d xor. r9,r9,r10
> > > c000000000226ee4: 00 00 40 39 li r10,0
> > > c000000000226ee8: b8 03 82 40 bne c0000000002272a0
> > > <srcu_gp_start_if_needed+0x5a0>
> > > by debugging, I see the r10 is assigned with r13 on c000000000226eb4,
> > > but if there is a context-switch before c000000000226edc, a false
> > > positive will be reported.
> > >
> > > [1] http://154.220.3.115/logs/0422/configformainline.txt
> > > [2] 154.220.3.115/logs/0422/whilebash.sh
> > > [3] http://154.220.3.115/logs/0422/srcu_gp_start_if_needed.txt
> > >
> > > My analysis and debugging may not be correct, but the bug is easily
> > > reproducible.
> >
> > If this is a bug in the stack smashing protection as you seem to hint,
> > I wonder if you see the issue with a specific gcc version and is a
> > compiler-specific issue. It's hard to say, but considering this I
>
> Very likely, more asm code from Zhouyi's link:
>
> This is the __srcu_read_unlock_nmisafe(), since "hwsync" is
> smp_mb__{after,before}_atomic(), and the following code is first
> barrier then atomic, so it's the unlock.
>
> c000000000226eb4: 78 6b aa 7d mr r10,r13
>
> ^ r13 is the pointer to percpu data on PPC64 kernel, and it's also
> the pointer to TLS data for userspace code.
>
> c000000000226eb8: 14 42 29 7d add r9,r9,r8
> c000000000226ebc: ac 04 00 7c hwsync
> c000000000226ec0: 10 00 7b 3b addi r27,r27,16
> c000000000226ec4: 14 da 29 7d add r9,r9,r27
> c000000000226ec8: a8 48 00 7d ldarx r8,0,r9
> c000000000226ecc: 01 00 08 31 addic r8,r8,1
> c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9
> c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 <srcu_gp_start_if_needed+0x1c8>
> c000000000226ed8: 28 00 21 e9 ld r9,40(r1)
> c000000000226edc: 78 0c 4a e9 ld r10,3192(r10)
>
> here I think that the compiler is using r10 as an alias to r13, since
> for userspace program, it's safe to assume the TLS pointer doesn't
> change. However this is not true for kernel percpu pointer.
I learned a lot from your analysis, this is a fruitful learning
journey for me ;-)
>
> The real intention here is to compare 40(r1) vs 3192(r13) for stack
> guard checking, however since r13 is the percpu pointer in kernel, so
> the value of r13 can be changed if the thread gets scheduled to a
> different CPU after reading r13 for r10.
>
> __srcu_read_unlock_nmisafe() triggers this issue, because:
>
> * it contains a read from r13
> * it locates at the very end of srcu_gp_start_if_needed().
>
> This gives the compiler more opportunity to "optimize" a read from r13
> away.
Ah, this why adding __srcu_read_unlock_nmisafe() triggers this issue.
>
> c000000000226ee0: 79 52 29 7d xor. r9,r9,r10
> c000000000226ee4: 00 00 40 39 li r10,0
> c000000000226ee8: b8 03 82 40 bne c0000000002272a0 <srcu_gp_start_if_needed+0x5a0>
>
> As a result, here triggers __stack_chk_fail if mis-match.
>
> If I'm correct, the following should be a workaround:
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index ab4ee58af84b..f5ae3be3d04d 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -747,6 +747,7 @@ void __srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx)
>
> smp_mb__before_atomic(); /* C */ /* Avoid leaking the critical section. */
> atomic_long_inc(&sdp->srcu_unlock_count[idx]);
> + asm volatile("" : : : "r13", "memory");
> }
> EXPORT_SYMBOL_GPL(__srcu_read_unlock_nmisafe);
>
> Zhouyi, could you give a try? Note I think the "memory" clobber here is
> unnecesarry, but I just add it in case I'm wrong.
After applying above, the srcu_gp_start_if_needed becomes
http://140.211.169.189/0424/srcu_gp_start_if_needed.txt now.
Yes, the modified kernel has survived > 2 hours' test, while the
original kernel will certainly fail within 3 minutes.
>
>
> Needless to say, the correct fix is to make ppc stack protector aware of
> r13 is volatile.
Yes, agree, thank you
Thanks you all
Regards
Zhouyi
>
> Regards,
> Boqun
>
> > think it's important for you to mention the compiler version in your
> > report (along with kernel version, kernel logs etc.)
> >
> > thanks,
> >
> > - Joel
> >
> >
> > - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-24 0:32 ` Boqun Feng
@ 2023-04-24 13:14 ` Michael Ellerman
-1 siblings, 0 replies; 80+ messages in thread
From: Michael Ellerman @ 2023-04-24 13:14 UTC (permalink / raw)
To: Boqun Feng, Joel Fernandes
Cc: Paul E. McKenney, Zhouyi Zhou, linux-kernel, rcu, lance,
linuxppc-dev
Hi Boqun,
Thanks for debugging this ...
Boqun Feng <boqun.feng@gmail.com> writes:
> On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote:
>> On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>> >
>> > Dear PowerPC and RCU developers:
>> > During the RCU torture test on mainline (on the VM of Opensource Lab
>> > of Oregon State University), SRCU-P failed with __stack_chk_fail:
>> > [ 264.381952][ T99] [c000000006c7bab0] [c0000000010c67c0]
>> > dump_stack_lvl+0x94/0xd8 (unreliable)
>> > [ 264.383786][ T99] [c000000006c7bae0] [c00000000014fc94] panic+0x19c/0x468
>> > [ 264.385128][ T99] [c000000006c7bb80] [c0000000010fca24]
>> > __stack_chk_fail+0x24/0x30
>> > [ 264.386610][ T99] [c000000006c7bbe0] [c0000000002293b4]
>> > srcu_gp_start_if_needed+0x5c4/0x5d0
>> > [ 264.388188][ T99] [c000000006c7bc70] [c00000000022f7f4]
>> > srcu_torture_call+0x34/0x50
>> > [ 264.389611][ T99] [c000000006c7bc90] [c00000000022b5e8]
>> > rcu_torture_fwd_prog+0x8c8/0xa60
>> > [ 264.391439][ T99] [c000000006c7be00] [c00000000018e37c] kthread+0x15c/0x170
>> > [ 264.392792][ T99] [c000000006c7be50] [c00000000000df94]
>> > ret_from_kernel_thread+0x5c/0x64
>> > The kernel config file can be found in [1].
>> > And I write a bash script to accelerate the bug reproducing [2].
>> > After a week's debugging, I found the cause of the bug is because the
>> > register r10 used to judge for stack overflow is not constant between
>> > context switches.
>> > The assembly code for srcu_gp_start_if_needed is located at [3]:
>> > c000000000226eb4: 78 6b aa 7d mr r10,r13
>> > c000000000226eb8: 14 42 29 7d add r9,r9,r8
>> > c000000000226ebc: ac 04 00 7c hwsync
>> > c000000000226ec0: 10 00 7b 3b addi r27,r27,16
>> > c000000000226ec4: 14 da 29 7d add r9,r9,r27
>> > c000000000226ec8: a8 48 00 7d ldarx r8,0,r9
>> > c000000000226ecc: 01 00 08 31 addic r8,r8,1
>> > c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9
>> > c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8
>> > <srcu_gp_start_if_needed+0x1c8>
>> > c000000000226ed8: 28 00 21 e9 ld r9,40(r1)
>> > c000000000226edc: 78 0c 4a e9 ld r10,3192(r10)
>> > c000000000226ee0: 79 52 29 7d xor. r9,r9,r10
>> > c000000000226ee4: 00 00 40 39 li r10,0
>> > c000000000226ee8: b8 03 82 40 bne c0000000002272a0
>> > <srcu_gp_start_if_needed+0x5a0>
>> > by debugging, I see the r10 is assigned with r13 on c000000000226eb4,
>> > but if there is a context-switch before c000000000226edc, a false
>> > positive will be reported.
>> >
>> > [1] http://154.220.3.115/logs/0422/configformainline.txt
>> > [2] 154.220.3.115/logs/0422/whilebash.sh
>> > [3] http://154.220.3.115/logs/0422/srcu_gp_start_if_needed.txt
>> >
>> > My analysis and debugging may not be correct, but the bug is easily
>> > reproducible.
>>
>> If this is a bug in the stack smashing protection as you seem to hint,
>> I wonder if you see the issue with a specific gcc version and is a
>> compiler-specific issue. It's hard to say, but considering this I
>
> Very likely, more asm code from Zhouyi's link:
>
> This is the __srcu_read_unlock_nmisafe(), since "hwsync" is
> smp_mb__{after,before}_atomic(), and the following code is first
> barrier then atomic, so it's the unlock.
>
> c000000000226eb4: 78 6b aa 7d mr r10,r13
>
> ^ r13 is the pointer to percpu data on PPC64 kernel, and it's also
> the pointer to TLS data for userspace code.
I've never understood why the compiler wants to make a copy of a
register variable into another register!? >:#
> c000000000226eb8: 14 42 29 7d add r9,r9,r8
> c000000000226ebc: ac 04 00 7c hwsync
> c000000000226ec0: 10 00 7b 3b addi r27,r27,16
> c000000000226ec4: 14 da 29 7d add r9,r9,r27
> c000000000226ec8: a8 48 00 7d ldarx r8,0,r9
> c000000000226ecc: 01 00 08 31 addic r8,r8,1
> c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9
> c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 <srcu_gp_start_if_needed+0x1c8>
> c000000000226ed8: 28 00 21 e9 ld r9,40(r1)
> c000000000226edc: 78 0c 4a e9 ld r10,3192(r10)
>
> here I think that the compiler is using r10 as an alias to r13, since
> for userspace program, it's safe to assume the TLS pointer doesn't
> change. However this is not true for kernel percpu pointer.
>
> The real intention here is to compare 40(r1) vs 3192(r13) for stack
> guard checking, however since r13 is the percpu pointer in kernel, so
> the value of r13 can be changed if the thread gets scheduled to a
> different CPU after reading r13 for r10.
Yeah that's not good.
> If I'm correct, the following should be a workaround:
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index ab4ee58af84b..f5ae3be3d04d 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -747,6 +747,7 @@ void __srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx)
>
> smp_mb__before_atomic(); /* C */ /* Avoid leaking the critical section. */
> atomic_long_inc(&sdp->srcu_unlock_count[idx]);
> + asm volatile("" : : : "r13", "memory");
> }
> EXPORT_SYMBOL_GPL(__srcu_read_unlock_nmisafe);
>
> Zhouyi, could you give a try? Note I think the "memory" clobber here is
> unnecesarry, but I just add it in case I'm wrong.
>
> Needless to say, the correct fix is to make ppc stack protector aware of
> r13 is volatile.
I suspect the compiler developers will tell us to go jump :)
The problem of the compiler caching r13 has come up in the past, but I
only remember it being "a worry" rather than causing an actual bug.
We've had the DEBUG_PREEMPT checks in get_paca(), which have given us at
least some comfort that if the compiler is caching r13, it shouldn't be
doing it in preemptable regions.
But obviously that doesn't help at all with the stack protector check.
I don't see an easy fix.
Adding "volatile" to the definition of local_paca seems to reduce but
not elimate the caching of r13, and the GCC docs explicitly say *not* to
use volatile. It also triggers lots of warnings about volatile being
discarded.
Short term we can make stack protector depend on !PREEMPT.
Longer term possibly we can move to having current in a register like
32-bit does, and then use that as the stack protector reg.
Or something simple I haven't thought of? :)
cheers
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-24 13:14 ` Michael Ellerman
0 siblings, 0 replies; 80+ messages in thread
From: Michael Ellerman @ 2023-04-24 13:14 UTC (permalink / raw)
To: Boqun Feng, Joel Fernandes
Cc: Zhouyi Zhou, linuxppc-dev, rcu, linux-kernel, lance,
Paul E. McKenney, Segher Boessenkool
Hi Boqun,
Thanks for debugging this ...
Boqun Feng <boqun.feng@gmail.com> writes:
> On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote:
>> On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>> >
>> > Dear PowerPC and RCU developers:
>> > During the RCU torture test on mainline (on the VM of Opensource Lab
>> > of Oregon State University), SRCU-P failed with __stack_chk_fail:
>> > [ 264.381952][ T99] [c000000006c7bab0] [c0000000010c67c0]
>> > dump_stack_lvl+0x94/0xd8 (unreliable)
>> > [ 264.383786][ T99] [c000000006c7bae0] [c00000000014fc94] panic+0x19c/0x468
>> > [ 264.385128][ T99] [c000000006c7bb80] [c0000000010fca24]
>> > __stack_chk_fail+0x24/0x30
>> > [ 264.386610][ T99] [c000000006c7bbe0] [c0000000002293b4]
>> > srcu_gp_start_if_needed+0x5c4/0x5d0
>> > [ 264.388188][ T99] [c000000006c7bc70] [c00000000022f7f4]
>> > srcu_torture_call+0x34/0x50
>> > [ 264.389611][ T99] [c000000006c7bc90] [c00000000022b5e8]
>> > rcu_torture_fwd_prog+0x8c8/0xa60
>> > [ 264.391439][ T99] [c000000006c7be00] [c00000000018e37c] kthread+0x15c/0x170
>> > [ 264.392792][ T99] [c000000006c7be50] [c00000000000df94]
>> > ret_from_kernel_thread+0x5c/0x64
>> > The kernel config file can be found in [1].
>> > And I write a bash script to accelerate the bug reproducing [2].
>> > After a week's debugging, I found the cause of the bug is because the
>> > register r10 used to judge for stack overflow is not constant between
>> > context switches.
>> > The assembly code for srcu_gp_start_if_needed is located at [3]:
>> > c000000000226eb4: 78 6b aa 7d mr r10,r13
>> > c000000000226eb8: 14 42 29 7d add r9,r9,r8
>> > c000000000226ebc: ac 04 00 7c hwsync
>> > c000000000226ec0: 10 00 7b 3b addi r27,r27,16
>> > c000000000226ec4: 14 da 29 7d add r9,r9,r27
>> > c000000000226ec8: a8 48 00 7d ldarx r8,0,r9
>> > c000000000226ecc: 01 00 08 31 addic r8,r8,1
>> > c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9
>> > c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8
>> > <srcu_gp_start_if_needed+0x1c8>
>> > c000000000226ed8: 28 00 21 e9 ld r9,40(r1)
>> > c000000000226edc: 78 0c 4a e9 ld r10,3192(r10)
>> > c000000000226ee0: 79 52 29 7d xor. r9,r9,r10
>> > c000000000226ee4: 00 00 40 39 li r10,0
>> > c000000000226ee8: b8 03 82 40 bne c0000000002272a0
>> > <srcu_gp_start_if_needed+0x5a0>
>> > by debugging, I see the r10 is assigned with r13 on c000000000226eb4,
>> > but if there is a context-switch before c000000000226edc, a false
>> > positive will be reported.
>> >
>> > [1] http://154.220.3.115/logs/0422/configformainline.txt
>> > [2] 154.220.3.115/logs/0422/whilebash.sh
>> > [3] http://154.220.3.115/logs/0422/srcu_gp_start_if_needed.txt
>> >
>> > My analysis and debugging may not be correct, but the bug is easily
>> > reproducible.
>>
>> If this is a bug in the stack smashing protection as you seem to hint,
>> I wonder if you see the issue with a specific gcc version and is a
>> compiler-specific issue. It's hard to say, but considering this I
>
> Very likely, more asm code from Zhouyi's link:
>
> This is the __srcu_read_unlock_nmisafe(), since "hwsync" is
> smp_mb__{after,before}_atomic(), and the following code is first
> barrier then atomic, so it's the unlock.
>
> c000000000226eb4: 78 6b aa 7d mr r10,r13
>
> ^ r13 is the pointer to percpu data on PPC64 kernel, and it's also
> the pointer to TLS data for userspace code.
I've never understood why the compiler wants to make a copy of a
register variable into another register!? >:#
> c000000000226eb8: 14 42 29 7d add r9,r9,r8
> c000000000226ebc: ac 04 00 7c hwsync
> c000000000226ec0: 10 00 7b 3b addi r27,r27,16
> c000000000226ec4: 14 da 29 7d add r9,r9,r27
> c000000000226ec8: a8 48 00 7d ldarx r8,0,r9
> c000000000226ecc: 01 00 08 31 addic r8,r8,1
> c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9
> c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 <srcu_gp_start_if_needed+0x1c8>
> c000000000226ed8: 28 00 21 e9 ld r9,40(r1)
> c000000000226edc: 78 0c 4a e9 ld r10,3192(r10)
>
> here I think that the compiler is using r10 as an alias to r13, since
> for userspace program, it's safe to assume the TLS pointer doesn't
> change. However this is not true for kernel percpu pointer.
>
> The real intention here is to compare 40(r1) vs 3192(r13) for stack
> guard checking, however since r13 is the percpu pointer in kernel, so
> the value of r13 can be changed if the thread gets scheduled to a
> different CPU after reading r13 for r10.
Yeah that's not good.
> If I'm correct, the following should be a workaround:
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index ab4ee58af84b..f5ae3be3d04d 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -747,6 +747,7 @@ void __srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx)
>
> smp_mb__before_atomic(); /* C */ /* Avoid leaking the critical section. */
> atomic_long_inc(&sdp->srcu_unlock_count[idx]);
> + asm volatile("" : : : "r13", "memory");
> }
> EXPORT_SYMBOL_GPL(__srcu_read_unlock_nmisafe);
>
> Zhouyi, could you give a try? Note I think the "memory" clobber here is
> unnecesarry, but I just add it in case I'm wrong.
>
> Needless to say, the correct fix is to make ppc stack protector aware of
> r13 is volatile.
I suspect the compiler developers will tell us to go jump :)
The problem of the compiler caching r13 has come up in the past, but I
only remember it being "a worry" rather than causing an actual bug.
We've had the DEBUG_PREEMPT checks in get_paca(), which have given us at
least some comfort that if the compiler is caching r13, it shouldn't be
doing it in preemptable regions.
But obviously that doesn't help at all with the stack protector check.
I don't see an easy fix.
Adding "volatile" to the definition of local_paca seems to reduce but
not elimate the caching of r13, and the GCC docs explicitly say *not* to
use volatile. It also triggers lots of warnings about volatile being
discarded.
Short term we can make stack protector depend on !PREEMPT.
Longer term possibly we can move to having current in a register like
32-bit does, and then use that as the stack protector reg.
Or something simple I haven't thought of? :)
cheers
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-24 13:14 ` Michael Ellerman
@ 2023-04-24 15:13 ` Segher Boessenkool
-1 siblings, 0 replies; 80+ messages in thread
From: Segher Boessenkool @ 2023-04-24 15:13 UTC (permalink / raw)
To: Michael Ellerman
Cc: Paul E. McKenney, Zhouyi Zhou, Boqun Feng, linux-kernel, rcu,
lance, Joel Fernandes, linuxppc-dev
Hi!
On Mon, Apr 24, 2023 at 11:14:00PM +1000, Michael Ellerman wrote:
> Boqun Feng <boqun.feng@gmail.com> writes:
> > On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote:
> >> On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >> > by debugging, I see the r10 is assigned with r13 on c000000000226eb4,
> >> > but if there is a context-switch before c000000000226edc, a false
> >> > positive will be reported.
> I've never understood why the compiler wants to make a copy of a
> register variable into another register!? >:#
It is usually because a) you told it to (maybe via an earlyclobber), or
b) it looked cheaper. I don't see either here :-(
> > here I think that the compiler is using r10 as an alias to r13, since
> > for userspace program, it's safe to assume the TLS pointer doesn't
> > change. However this is not true for kernel percpu pointer.
r13 is a "fixed" register, but that means it has a fixed purpose (so not
available for allocation), it does not mean "unchanging".
> > The real intention here is to compare 40(r1) vs 3192(r13) for stack
> > guard checking, however since r13 is the percpu pointer in kernel, so
> > the value of r13 can be changed if the thread gets scheduled to a
> > different CPU after reading r13 for r10.
>
> Yeah that's not good.
The GCC pattern here makes the four machine insns all stay together.
That is to make sure to not leak any secret value, which is impossible
to guarantee otherwise.
What tells GCC r13 can randomly change behind its back? And, what then
makes GCC ignore that fact?
> > + asm volatile("" : : : "r13", "memory");
Any asm without output is always volatile.
> > Needless to say, the correct fix is to make ppc stack protector aware of
> > r13 is volatile.
>
> I suspect the compiler developers will tell us to go jump :)
Why would r13 change over the course of *this* function / this macro,
why can this not happen anywhere else?
> The problem of the compiler caching r13 has come up in the past, but I
> only remember it being "a worry" rather than causing an actual bug.
In most cases the compiler is smart enough to use r13 directly, instead
of copying it to another reg and then using that one. But not here for
some strange reason. That of course is a very minor generated machine
code quality bug and nothing more :-(
> We've had the DEBUG_PREEMPT checks in get_paca(), which have given us at
> least some comfort that if the compiler is caching r13, it shouldn't be
> doing it in preemptable regions.
>
> But obviously that doesn't help at all with the stack protector check.
>
> I don't see an easy fix.
>
> Adding "volatile" to the definition of local_paca seems to reduce but
> not elimate the caching of r13, and the GCC docs explicitly say *not* to
> use volatile. It also triggers lots of warnings about volatile being
> discarded.
The point here is to say some code clobbers r13, not the asm volatile?
> Or something simple I haven't thought of? :)
At what points can r13 change? Only when some particular functions are
called?
Segher
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-24 15:13 ` Segher Boessenkool
0 siblings, 0 replies; 80+ messages in thread
From: Segher Boessenkool @ 2023-04-24 15:13 UTC (permalink / raw)
To: Michael Ellerman
Cc: Boqun Feng, Joel Fernandes, Zhouyi Zhou, linuxppc-dev, rcu,
linux-kernel, lance, Paul E. McKenney
Hi!
On Mon, Apr 24, 2023 at 11:14:00PM +1000, Michael Ellerman wrote:
> Boqun Feng <boqun.feng@gmail.com> writes:
> > On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote:
> >> On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >> > by debugging, I see the r10 is assigned with r13 on c000000000226eb4,
> >> > but if there is a context-switch before c000000000226edc, a false
> >> > positive will be reported.
> I've never understood why the compiler wants to make a copy of a
> register variable into another register!? >:#
It is usually because a) you told it to (maybe via an earlyclobber), or
b) it looked cheaper. I don't see either here :-(
> > here I think that the compiler is using r10 as an alias to r13, since
> > for userspace program, it's safe to assume the TLS pointer doesn't
> > change. However this is not true for kernel percpu pointer.
r13 is a "fixed" register, but that means it has a fixed purpose (so not
available for allocation), it does not mean "unchanging".
> > The real intention here is to compare 40(r1) vs 3192(r13) for stack
> > guard checking, however since r13 is the percpu pointer in kernel, so
> > the value of r13 can be changed if the thread gets scheduled to a
> > different CPU after reading r13 for r10.
>
> Yeah that's not good.
The GCC pattern here makes the four machine insns all stay together.
That is to make sure to not leak any secret value, which is impossible
to guarantee otherwise.
What tells GCC r13 can randomly change behind its back? And, what then
makes GCC ignore that fact?
> > + asm volatile("" : : : "r13", "memory");
Any asm without output is always volatile.
> > Needless to say, the correct fix is to make ppc stack protector aware of
> > r13 is volatile.
>
> I suspect the compiler developers will tell us to go jump :)
Why would r13 change over the course of *this* function / this macro,
why can this not happen anywhere else?
> The problem of the compiler caching r13 has come up in the past, but I
> only remember it being "a worry" rather than causing an actual bug.
In most cases the compiler is smart enough to use r13 directly, instead
of copying it to another reg and then using that one. But not here for
some strange reason. That of course is a very minor generated machine
code quality bug and nothing more :-(
> We've had the DEBUG_PREEMPT checks in get_paca(), which have given us at
> least some comfort that if the compiler is caching r13, it shouldn't be
> doing it in preemptable regions.
>
> But obviously that doesn't help at all with the stack protector check.
>
> I don't see an easy fix.
>
> Adding "volatile" to the definition of local_paca seems to reduce but
> not elimate the caching of r13, and the GCC docs explicitly say *not* to
> use volatile. It also triggers lots of warnings about volatile being
> discarded.
The point here is to say some code clobbers r13, not the asm volatile?
> Or something simple I haven't thought of? :)
At what points can r13 change? Only when some particular functions are
called?
Segher
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-24 15:13 ` Segher Boessenkool
@ 2023-04-24 15:28 ` Boqun Feng
-1 siblings, 0 replies; 80+ messages in thread
From: Boqun Feng @ 2023-04-24 15:28 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Paul E. McKenney, linux-kernel, rcu, lance, Zhouyi Zhou,
Joel Fernandes, linuxppc-dev
On Mon, Apr 24, 2023 at 10:13:51AM -0500, Segher Boessenkool wrote:
> Hi!
>
> On Mon, Apr 24, 2023 at 11:14:00PM +1000, Michael Ellerman wrote:
> > Boqun Feng <boqun.feng@gmail.com> writes:
> > > On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote:
> > >> On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> > >> > by debugging, I see the r10 is assigned with r13 on c000000000226eb4,
> > >> > but if there is a context-switch before c000000000226edc, a false
> > >> > positive will be reported.
>
> > I've never understood why the compiler wants to make a copy of a
> > register variable into another register!? >:#
>
> It is usually because a) you told it to (maybe via an earlyclobber), or
> b) it looked cheaper. I don't see either here :-(
>
> > > here I think that the compiler is using r10 as an alias to r13, since
> > > for userspace program, it's safe to assume the TLS pointer doesn't
> > > change. However this is not true for kernel percpu pointer.
>
> r13 is a "fixed" register, but that means it has a fixed purpose (so not
> available for allocation), it does not mean "unchanging".
>
> > > The real intention here is to compare 40(r1) vs 3192(r13) for stack
> > > guard checking, however since r13 is the percpu pointer in kernel, so
> > > the value of r13 can be changed if the thread gets scheduled to a
> > > different CPU after reading r13 for r10.
> >
> > Yeah that's not good.
>
> The GCC pattern here makes the four machine insns all stay together.
> That is to make sure to not leak any secret value, which is impossible
> to guarantee otherwise.
>
> What tells GCC r13 can randomly change behind its back? And, what then
> makes GCC ignore that fact?
>
> > > + asm volatile("" : : : "r13", "memory");
>
> Any asm without output is always volatile.
>
> > > Needless to say, the correct fix is to make ppc stack protector aware of
> > > r13 is volatile.
> >
> > I suspect the compiler developers will tell us to go jump :)
>
> Why would r13 change over the course of *this* function / this macro,
> why can this not happen anywhere else?
>
> > The problem of the compiler caching r13 has come up in the past, but I
> > only remember it being "a worry" rather than causing an actual bug.
>
> In most cases the compiler is smart enough to use r13 directly, instead
> of copying it to another reg and then using that one. But not here for
> some strange reason. That of course is a very minor generated machine
> code quality bug and nothing more :-(
>
> > We've had the DEBUG_PREEMPT checks in get_paca(), which have given us at
> > least some comfort that if the compiler is caching r13, it shouldn't be
> > doing it in preemptable regions.
> >
> > But obviously that doesn't help at all with the stack protector check.
> >
> > I don't see an easy fix.
> >
> > Adding "volatile" to the definition of local_paca seems to reduce but
> > not elimate the caching of r13, and the GCC docs explicitly say *not* to
> > use volatile. It also triggers lots of warnings about volatile being
> > discarded.
>
> The point here is to say some code clobbers r13, not the asm volatile?
>
> > Or something simple I haven't thought of? :)
>
> At what points can r13 change? Only when some particular functions are
> called?
>
r13 is the local paca:
register struct paca_struct *local_paca asm("r13");
, which is a pointer to percpu data.
So if a task schedule from one CPU to anotehr CPU, the value gets
changed.
Regards,
Boqun
>
> Segher
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-24 15:28 ` Boqun Feng
0 siblings, 0 replies; 80+ messages in thread
From: Boqun Feng @ 2023-04-24 15:28 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Michael Ellerman, Joel Fernandes, Zhouyi Zhou, linuxppc-dev, rcu,
linux-kernel, lance, Paul E. McKenney
On Mon, Apr 24, 2023 at 10:13:51AM -0500, Segher Boessenkool wrote:
> Hi!
>
> On Mon, Apr 24, 2023 at 11:14:00PM +1000, Michael Ellerman wrote:
> > Boqun Feng <boqun.feng@gmail.com> writes:
> > > On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote:
> > >> On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> > >> > by debugging, I see the r10 is assigned with r13 on c000000000226eb4,
> > >> > but if there is a context-switch before c000000000226edc, a false
> > >> > positive will be reported.
>
> > I've never understood why the compiler wants to make a copy of a
> > register variable into another register!? >:#
>
> It is usually because a) you told it to (maybe via an earlyclobber), or
> b) it looked cheaper. I don't see either here :-(
>
> > > here I think that the compiler is using r10 as an alias to r13, since
> > > for userspace program, it's safe to assume the TLS pointer doesn't
> > > change. However this is not true for kernel percpu pointer.
>
> r13 is a "fixed" register, but that means it has a fixed purpose (so not
> available for allocation), it does not mean "unchanging".
>
> > > The real intention here is to compare 40(r1) vs 3192(r13) for stack
> > > guard checking, however since r13 is the percpu pointer in kernel, so
> > > the value of r13 can be changed if the thread gets scheduled to a
> > > different CPU after reading r13 for r10.
> >
> > Yeah that's not good.
>
> The GCC pattern here makes the four machine insns all stay together.
> That is to make sure to not leak any secret value, which is impossible
> to guarantee otherwise.
>
> What tells GCC r13 can randomly change behind its back? And, what then
> makes GCC ignore that fact?
>
> > > + asm volatile("" : : : "r13", "memory");
>
> Any asm without output is always volatile.
>
> > > Needless to say, the correct fix is to make ppc stack protector aware of
> > > r13 is volatile.
> >
> > I suspect the compiler developers will tell us to go jump :)
>
> Why would r13 change over the course of *this* function / this macro,
> why can this not happen anywhere else?
>
> > The problem of the compiler caching r13 has come up in the past, but I
> > only remember it being "a worry" rather than causing an actual bug.
>
> In most cases the compiler is smart enough to use r13 directly, instead
> of copying it to another reg and then using that one. But not here for
> some strange reason. That of course is a very minor generated machine
> code quality bug and nothing more :-(
>
> > We've had the DEBUG_PREEMPT checks in get_paca(), which have given us at
> > least some comfort that if the compiler is caching r13, it shouldn't be
> > doing it in preemptable regions.
> >
> > But obviously that doesn't help at all with the stack protector check.
> >
> > I don't see an easy fix.
> >
> > Adding "volatile" to the definition of local_paca seems to reduce but
> > not elimate the caching of r13, and the GCC docs explicitly say *not* to
> > use volatile. It also triggers lots of warnings about volatile being
> > discarded.
>
> The point here is to say some code clobbers r13, not the asm volatile?
>
> > Or something simple I haven't thought of? :)
>
> At what points can r13 change? Only when some particular functions are
> called?
>
r13 is the local paca:
register struct paca_struct *local_paca asm("r13");
, which is a pointer to percpu data.
So if a task schedule from one CPU to anotehr CPU, the value gets
changed.
Regards,
Boqun
>
> Segher
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-24 15:28 ` Boqun Feng
@ 2023-04-24 17:29 ` Segher Boessenkool
-1 siblings, 0 replies; 80+ messages in thread
From: Segher Boessenkool @ 2023-04-24 17:29 UTC (permalink / raw)
To: Boqun Feng
Cc: Paul E. McKenney, linux-kernel, rcu, lance, Zhouyi Zhou,
Joel Fernandes, linuxppc-dev
On Mon, Apr 24, 2023 at 08:28:55AM -0700, Boqun Feng wrote:
> On Mon, Apr 24, 2023 at 10:13:51AM -0500, Segher Boessenkool wrote:
> > At what points can r13 change? Only when some particular functions are
> > called?
>
> r13 is the local paca:
>
> register struct paca_struct *local_paca asm("r13");
>
> , which is a pointer to percpu data.
Yes, it is a global register variable.
> So if a task schedule from one CPU to anotehr CPU, the value gets
> changed.
But the compiler does not see that something else changes local_paca (or
r13 some other way, via assembler code perhaps)? Or is there a compiler
bug?
If the latter is true:
Can you make a reproducer and open a GCC PR? <https://gcc.gnu.org/bugs/>
for how to get started doing that. We need *exact* code that shows the
problem, together with a compiler command line. So that we can
reproduce the problem. That is step 0 in figuring out what is going on,
and then maybe fixing the problem :-)
Segher
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-24 17:29 ` Segher Boessenkool
0 siblings, 0 replies; 80+ messages in thread
From: Segher Boessenkool @ 2023-04-24 17:29 UTC (permalink / raw)
To: Boqun Feng
Cc: Michael Ellerman, Joel Fernandes, Zhouyi Zhou, linuxppc-dev, rcu,
linux-kernel, lance, Paul E. McKenney
On Mon, Apr 24, 2023 at 08:28:55AM -0700, Boqun Feng wrote:
> On Mon, Apr 24, 2023 at 10:13:51AM -0500, Segher Boessenkool wrote:
> > At what points can r13 change? Only when some particular functions are
> > called?
>
> r13 is the local paca:
>
> register struct paca_struct *local_paca asm("r13");
>
> , which is a pointer to percpu data.
Yes, it is a global register variable.
> So if a task schedule from one CPU to anotehr CPU, the value gets
> changed.
But the compiler does not see that something else changes local_paca (or
r13 some other way, via assembler code perhaps)? Or is there a compiler
bug?
If the latter is true:
Can you make a reproducer and open a GCC PR? <https://gcc.gnu.org/bugs/>
for how to get started doing that. We need *exact* code that shows the
problem, together with a compiler command line. So that we can
reproduce the problem. That is step 0 in figuring out what is going on,
and then maybe fixing the problem :-)
Segher
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-24 17:29 ` Segher Boessenkool
@ 2023-04-24 19:25 ` Boqun Feng
-1 siblings, 0 replies; 80+ messages in thread
From: Boqun Feng @ 2023-04-24 19:25 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Paul E. McKenney, linux-kernel, rcu, lance, Zhouyi Zhou,
Joel Fernandes, linuxppc-dev
On Mon, Apr 24, 2023 at 12:29:00PM -0500, Segher Boessenkool wrote:
> On Mon, Apr 24, 2023 at 08:28:55AM -0700, Boqun Feng wrote:
> > On Mon, Apr 24, 2023 at 10:13:51AM -0500, Segher Boessenkool wrote:
> > > At what points can r13 change? Only when some particular functions are
> > > called?
> >
> > r13 is the local paca:
> >
> > register struct paca_struct *local_paca asm("r13");
> >
> > , which is a pointer to percpu data.
>
> Yes, it is a global register variable.
>
> > So if a task schedule from one CPU to anotehr CPU, the value gets
> > changed.
>
> But the compiler does not see that something else changes local_paca (or
It's more like this, however, in this case r13 is not changed:
CPU 0 CPU 1
{r13 = 0x00} {r13 = 0x04}
<thread 1>
<in interrupt>
_switch():
<switch to the stack of thread 2>
<no need to change r13>
<in thread 2>
<thread 2>
<thread 3>
_switch():
<switch to the stack of thread 1>
<no need to change r13>
<in thread 1>
<thread 1>
as you can see thread 1 schedules from CPU 0 to CPU 1 and neither CPU
changes its r13, but in the point of view for thread 1, its r13 changes.
> r13 some other way, via assembler code perhaps)? Or is there a compiler
> bug?
>
This looks to me a compiler bug, but I'm not 100% sure.
Regards,
Boqun
> If the latter is true:
>
> Can you make a reproducer and open a GCC PR? <https://gcc.gnu.org/bugs/>
> for how to get started doing that. We need *exact* code that shows the
> problem, together with a compiler command line. So that we can
> reproduce the problem. That is step 0 in figuring out what is going on,
> and then maybe fixing the problem :-)
>
>
> Segher
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-24 19:25 ` Boqun Feng
0 siblings, 0 replies; 80+ messages in thread
From: Boqun Feng @ 2023-04-24 19:25 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Michael Ellerman, Joel Fernandes, Zhouyi Zhou, linuxppc-dev, rcu,
linux-kernel, lance, Paul E. McKenney
On Mon, Apr 24, 2023 at 12:29:00PM -0500, Segher Boessenkool wrote:
> On Mon, Apr 24, 2023 at 08:28:55AM -0700, Boqun Feng wrote:
> > On Mon, Apr 24, 2023 at 10:13:51AM -0500, Segher Boessenkool wrote:
> > > At what points can r13 change? Only when some particular functions are
> > > called?
> >
> > r13 is the local paca:
> >
> > register struct paca_struct *local_paca asm("r13");
> >
> > , which is a pointer to percpu data.
>
> Yes, it is a global register variable.
>
> > So if a task schedule from one CPU to anotehr CPU, the value gets
> > changed.
>
> But the compiler does not see that something else changes local_paca (or
It's more like this, however, in this case r13 is not changed:
CPU 0 CPU 1
{r13 = 0x00} {r13 = 0x04}
<thread 1>
<in interrupt>
_switch():
<switch to the stack of thread 2>
<no need to change r13>
<in thread 2>
<thread 2>
<thread 3>
_switch():
<switch to the stack of thread 1>
<no need to change r13>
<in thread 1>
<thread 1>
as you can see thread 1 schedules from CPU 0 to CPU 1 and neither CPU
changes its r13, but in the point of view for thread 1, its r13 changes.
> r13 some other way, via assembler code perhaps)? Or is there a compiler
> bug?
>
This looks to me a compiler bug, but I'm not 100% sure.
Regards,
Boqun
> If the latter is true:
>
> Can you make a reproducer and open a GCC PR? <https://gcc.gnu.org/bugs/>
> for how to get started doing that. We need *exact* code that shows the
> problem, together with a compiler command line. So that we can
> reproduce the problem. That is step 0 in figuring out what is going on,
> and then maybe fixing the problem :-)
>
>
> Segher
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-24 15:28 ` Boqun Feng
@ 2023-04-24 18:55 ` Joel Fernandes
-1 siblings, 0 replies; 80+ messages in thread
From: Joel Fernandes @ 2023-04-24 18:55 UTC (permalink / raw)
To: Boqun Feng, Peter Zijlstra
Cc: Paul E. McKenney, linux-kernel, rcu, lance, Zhouyi Zhou,
linuxppc-dev
This is amazing debugging Boqun, like a boss! One comment below:
> > > Or something simple I haven't thought of? :)
> >
> > At what points can r13 change? Only when some particular functions are
> > called?
> >
>
> r13 is the local paca:
>
> register struct paca_struct *local_paca asm("r13");
>
> , which is a pointer to percpu data.
>
> So if a task schedule from one CPU to anotehr CPU, the value gets
> changed.
It appears the whole issue, per your analysis, is that the stack
checking code in gcc should not cache or alias r13, and must read its
most up-to-date value during stack checking, as its value may have
changed during a migration to a new CPU.
Did I get that right?
IMO, even without a reproducer, gcc on PPC should just not do that,
that feels terribly broken for the kernel. I wonder what clang does,
I'll go poke around with compilerexplorer after lunch.
Adding +Peter Zijlstra as well to join the party as I have a feeling
he'll be interested. ;-)
thanks,
- Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-24 18:55 ` Joel Fernandes
0 siblings, 0 replies; 80+ messages in thread
From: Joel Fernandes @ 2023-04-24 18:55 UTC (permalink / raw)
To: Boqun Feng, Peter Zijlstra
Cc: Segher Boessenkool, Michael Ellerman, Zhouyi Zhou, linuxppc-dev,
rcu, linux-kernel, lance, Paul E. McKenney
This is amazing debugging Boqun, like a boss! One comment below:
> > > Or something simple I haven't thought of? :)
> >
> > At what points can r13 change? Only when some particular functions are
> > called?
> >
>
> r13 is the local paca:
>
> register struct paca_struct *local_paca asm("r13");
>
> , which is a pointer to percpu data.
>
> So if a task schedule from one CPU to anotehr CPU, the value gets
> changed.
It appears the whole issue, per your analysis, is that the stack
checking code in gcc should not cache or alias r13, and must read its
most up-to-date value during stack checking, as its value may have
changed during a migration to a new CPU.
Did I get that right?
IMO, even without a reproducer, gcc on PPC should just not do that,
that feels terribly broken for the kernel. I wonder what clang does,
I'll go poke around with compilerexplorer after lunch.
Adding +Peter Zijlstra as well to join the party as I have a feeling
he'll be interested. ;-)
thanks,
- Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-24 18:55 ` Joel Fernandes
@ 2023-04-25 10:13 ` Peter Zijlstra
-1 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-04-25 10:13 UTC (permalink / raw)
To: Joel Fernandes
Cc: Paul E. McKenney, Boqun Feng, linux-kernel, rcu, lance,
Zhouyi Zhou, linuxppc-dev
On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> This is amazing debugging Boqun, like a boss! One comment below:
>
> > > > Or something simple I haven't thought of? :)
> > >
> > > At what points can r13 change? Only when some particular functions are
> > > called?
> > >
> >
> > r13 is the local paca:
> >
> > register struct paca_struct *local_paca asm("r13");
> >
> > , which is a pointer to percpu data.
> >
> > So if a task schedule from one CPU to anotehr CPU, the value gets
> > changed.
>
> It appears the whole issue, per your analysis, is that the stack
> checking code in gcc should not cache or alias r13, and must read its
> most up-to-date value during stack checking, as its value may have
> changed during a migration to a new CPU.
>
> Did I get that right?
>
> IMO, even without a reproducer, gcc on PPC should just not do that,
> that feels terribly broken for the kernel. I wonder what clang does,
> I'll go poke around with compilerexplorer after lunch.
>
> Adding +Peter Zijlstra as well to join the party as I have a feeling
> he'll be interested. ;-)
I'm a little confused; the way I understand the whole stack protector
thing to work is that we push a canary on the stack at call and on
return check it is still valid. Since in general tasks randomly migrate,
the per-cpu validation canary should be the same on all CPUs.
Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
but no guarantees.
Both cases use r13 (paca) in a racy manner, and in both cases it should
be safe.
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-25 10:13 ` Peter Zijlstra
0 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-04-25 10:13 UTC (permalink / raw)
To: Joel Fernandes
Cc: Boqun Feng, Segher Boessenkool, Michael Ellerman, Zhouyi Zhou,
linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney
On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> This is amazing debugging Boqun, like a boss! One comment below:
>
> > > > Or something simple I haven't thought of? :)
> > >
> > > At what points can r13 change? Only when some particular functions are
> > > called?
> > >
> >
> > r13 is the local paca:
> >
> > register struct paca_struct *local_paca asm("r13");
> >
> > , which is a pointer to percpu data.
> >
> > So if a task schedule from one CPU to anotehr CPU, the value gets
> > changed.
>
> It appears the whole issue, per your analysis, is that the stack
> checking code in gcc should not cache or alias r13, and must read its
> most up-to-date value during stack checking, as its value may have
> changed during a migration to a new CPU.
>
> Did I get that right?
>
> IMO, even without a reproducer, gcc on PPC should just not do that,
> that feels terribly broken for the kernel. I wonder what clang does,
> I'll go poke around with compilerexplorer after lunch.
>
> Adding +Peter Zijlstra as well to join the party as I have a feeling
> he'll be interested. ;-)
I'm a little confused; the way I understand the whole stack protector
thing to work is that we push a canary on the stack at call and on
return check it is still valid. Since in general tasks randomly migrate,
the per-cpu validation canary should be the same on all CPUs.
Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
but no guarantees.
Both cases use r13 (paca) in a racy manner, and in both cases it should
be safe.
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-25 10:13 ` Peter Zijlstra
@ 2023-04-25 10:58 ` Zhouyi Zhou
-1 siblings, 0 replies; 80+ messages in thread
From: Zhouyi Zhou @ 2023-04-25 10:58 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Paul E. McKenney, Boqun Feng, linux-kernel, rcu, lance,
Joel Fernandes, linuxppc-dev
hi
On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> > This is amazing debugging Boqun, like a boss! One comment below:
> >
> > > > > Or something simple I haven't thought of? :)
> > > >
> > > > At what points can r13 change? Only when some particular functions are
> > > > called?
> > > >
> > >
> > > r13 is the local paca:
> > >
> > > register struct paca_struct *local_paca asm("r13");
> > >
> > > , which is a pointer to percpu data.
> > >
> > > So if a task schedule from one CPU to anotehr CPU, the value gets
> > > changed.
> >
> > It appears the whole issue, per your analysis, is that the stack
> > checking code in gcc should not cache or alias r13, and must read its
> > most up-to-date value during stack checking, as its value may have
> > changed during a migration to a new CPU.
> >
> > Did I get that right?
> >
> > IMO, even without a reproducer, gcc on PPC should just not do that,
> > that feels terribly broken for the kernel. I wonder what clang does,
> > I'll go poke around with compilerexplorer after lunch.
> >
> > Adding +Peter Zijlstra as well to join the party as I have a feeling
> > he'll be interested. ;-)
>
> I'm a little confused; the way I understand the whole stack protector
> thing to work is that we push a canary on the stack at call and on
> return check it is still valid. Since in general tasks randomly migrate,
> the per-cpu validation canary should be the same on all CPUs.
>
> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> but no guarantees.
>
> Both cases use r13 (paca) in a racy manner, and in both cases it should
> be safe.
New test results today: both gcc build from git (git clone
git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
are immune from the above issue. We can see the assembly code on
http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
while
Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-25 10:58 ` Zhouyi Zhou
0 siblings, 0 replies; 80+ messages in thread
From: Zhouyi Zhou @ 2023-04-25 10:58 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Joel Fernandes, Boqun Feng, Segher Boessenkool, Michael Ellerman,
linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney
hi
On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> > This is amazing debugging Boqun, like a boss! One comment below:
> >
> > > > > Or something simple I haven't thought of? :)
> > > >
> > > > At what points can r13 change? Only when some particular functions are
> > > > called?
> > > >
> > >
> > > r13 is the local paca:
> > >
> > > register struct paca_struct *local_paca asm("r13");
> > >
> > > , which is a pointer to percpu data.
> > >
> > > So if a task schedule from one CPU to anotehr CPU, the value gets
> > > changed.
> >
> > It appears the whole issue, per your analysis, is that the stack
> > checking code in gcc should not cache or alias r13, and must read its
> > most up-to-date value during stack checking, as its value may have
> > changed during a migration to a new CPU.
> >
> > Did I get that right?
> >
> > IMO, even without a reproducer, gcc on PPC should just not do that,
> > that feels terribly broken for the kernel. I wonder what clang does,
> > I'll go poke around with compilerexplorer after lunch.
> >
> > Adding +Peter Zijlstra as well to join the party as I have a feeling
> > he'll be interested. ;-)
>
> I'm a little confused; the way I understand the whole stack protector
> thing to work is that we push a canary on the stack at call and on
> return check it is still valid. Since in general tasks randomly migrate,
> the per-cpu validation canary should be the same on all CPUs.
>
> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> but no guarantees.
>
> Both cases use r13 (paca) in a racy manner, and in both cases it should
> be safe.
New test results today: both gcc build from git (git clone
git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
are immune from the above issue. We can see the assembly code on
http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
while
Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-25 10:58 ` Zhouyi Zhou
@ 2023-04-25 11:06 ` Joel Fernandes
-1 siblings, 0 replies; 80+ messages in thread
From: Joel Fernandes @ 2023-04-25 11:06 UTC (permalink / raw)
To: Zhouyi Zhou, Christophe Leroy
Cc: Paul E. McKenney, Peter Zijlstra, Boqun Feng, linux-kernel, rcu,
lance, linuxppc-dev
On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>
> hi
>
> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> > > This is amazing debugging Boqun, like a boss! One comment below:
> > >
> > > > > > Or something simple I haven't thought of? :)
> > > > >
> > > > > At what points can r13 change? Only when some particular functions are
> > > > > called?
> > > > >
> > > >
> > > > r13 is the local paca:
> > > >
> > > > register struct paca_struct *local_paca asm("r13");
> > > >
> > > > , which is a pointer to percpu data.
> > > >
> > > > So if a task schedule from one CPU to anotehr CPU, the value gets
> > > > changed.
> > >
> > > It appears the whole issue, per your analysis, is that the stack
> > > checking code in gcc should not cache or alias r13, and must read its
> > > most up-to-date value during stack checking, as its value may have
> > > changed during a migration to a new CPU.
> > >
> > > Did I get that right?
> > >
> > > IMO, even without a reproducer, gcc on PPC should just not do that,
> > > that feels terribly broken for the kernel. I wonder what clang does,
> > > I'll go poke around with compilerexplorer after lunch.
> > >
> > > Adding +Peter Zijlstra as well to join the party as I have a feeling
> > > he'll be interested. ;-)
> >
> > I'm a little confused; the way I understand the whole stack protector
> > thing to work is that we push a canary on the stack at call and on
> > return check it is still valid. Since in general tasks randomly migrate,
> > the per-cpu validation canary should be the same on all CPUs.
> >
> > Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> > raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> > but no guarantees.
> >
> > Both cases use r13 (paca) in a racy manner, and in both cases it should
> > be safe.
> New test results today: both gcc build from git (git clone
> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
> are immune from the above issue. We can see the assembly code on
> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
>
> while
> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
> on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
Do you know what fixes the issue? I would not declare victory yet. My
feeling is something changes in timing, or compiler codegen which
hides the issue. So the issue is still there but it is just a matter
of time before someone else reports it.
Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
canary? Michael, is this an optimization? Adding Christophe as well
since it came in a few years ago via the following commit:
commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7
Author: Christophe Leroy <christophe.leroy@c-s.fr>
Date: Thu Sep 27 07:05:55 2018 +0000
powerpc/64: add stack protector support
On PPC64, as register r13 points to the paca_struct at all time,
this patch adds a copy of the canary there, which is copied at
task_switch.
That new canary is then used by using the following GCC options:
-mstack-protector-guard=tls
-mstack-protector-guard-reg=r13
-mstack-protector-guard-offset=offsetof(struct paca_struct, canary))
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
- Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-25 11:06 ` Joel Fernandes
0 siblings, 0 replies; 80+ messages in thread
From: Joel Fernandes @ 2023-04-25 11:06 UTC (permalink / raw)
To: Zhouyi Zhou, Christophe Leroy
Cc: Peter Zijlstra, Boqun Feng, Segher Boessenkool, Michael Ellerman,
linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney
On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>
> hi
>
> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> > > This is amazing debugging Boqun, like a boss! One comment below:
> > >
> > > > > > Or something simple I haven't thought of? :)
> > > > >
> > > > > At what points can r13 change? Only when some particular functions are
> > > > > called?
> > > > >
> > > >
> > > > r13 is the local paca:
> > > >
> > > > register struct paca_struct *local_paca asm("r13");
> > > >
> > > > , which is a pointer to percpu data.
> > > >
> > > > So if a task schedule from one CPU to anotehr CPU, the value gets
> > > > changed.
> > >
> > > It appears the whole issue, per your analysis, is that the stack
> > > checking code in gcc should not cache or alias r13, and must read its
> > > most up-to-date value during stack checking, as its value may have
> > > changed during a migration to a new CPU.
> > >
> > > Did I get that right?
> > >
> > > IMO, even without a reproducer, gcc on PPC should just not do that,
> > > that feels terribly broken for the kernel. I wonder what clang does,
> > > I'll go poke around with compilerexplorer after lunch.
> > >
> > > Adding +Peter Zijlstra as well to join the party as I have a feeling
> > > he'll be interested. ;-)
> >
> > I'm a little confused; the way I understand the whole stack protector
> > thing to work is that we push a canary on the stack at call and on
> > return check it is still valid. Since in general tasks randomly migrate,
> > the per-cpu validation canary should be the same on all CPUs.
> >
> > Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> > raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> > but no guarantees.
> >
> > Both cases use r13 (paca) in a racy manner, and in both cases it should
> > be safe.
> New test results today: both gcc build from git (git clone
> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
> are immune from the above issue. We can see the assembly code on
> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
>
> while
> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
> on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
Do you know what fixes the issue? I would not declare victory yet. My
feeling is something changes in timing, or compiler codegen which
hides the issue. So the issue is still there but it is just a matter
of time before someone else reports it.
Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
canary? Michael, is this an optimization? Adding Christophe as well
since it came in a few years ago via the following commit:
commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7
Author: Christophe Leroy <christophe.leroy@c-s.fr>
Date: Thu Sep 27 07:05:55 2018 +0000
powerpc/64: add stack protector support
On PPC64, as register r13 points to the paca_struct at all time,
this patch adds a copy of the canary there, which is copied at
task_switch.
That new canary is then used by using the following GCC options:
-mstack-protector-guard=tls
-mstack-protector-guard-reg=r13
-mstack-protector-guard-offset=offsetof(struct paca_struct, canary))
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
- Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-25 11:06 ` Joel Fernandes
@ 2023-04-25 3:12 ` Zhouyi Zhou
-1 siblings, 0 replies; 80+ messages in thread
From: Zhouyi Zhou @ 2023-04-25 3:12 UTC (permalink / raw)
To: Joel Fernandes
Cc: Christophe Leroy, Paul E. McKenney, Peter Zijlstra, Boqun Feng,
linux-kernel, rcu, lance, linuxppc-dev
On Tue, Apr 25, 2023 at 7:06 PM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >
> > hi
> >
> > On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> > > > This is amazing debugging Boqun, like a boss! One comment below:
> > > >
> > > > > > > Or something simple I haven't thought of? :)
> > > > > >
> > > > > > At what points can r13 change? Only when some particular functions are
> > > > > > called?
> > > > > >
> > > > >
> > > > > r13 is the local paca:
> > > > >
> > > > > register struct paca_struct *local_paca asm("r13");
> > > > >
> > > > > , which is a pointer to percpu data.
> > > > >
> > > > > So if a task schedule from one CPU to anotehr CPU, the value gets
> > > > > changed.
> > > >
> > > > It appears the whole issue, per your analysis, is that the stack
> > > > checking code in gcc should not cache or alias r13, and must read its
> > > > most up-to-date value during stack checking, as its value may have
> > > > changed during a migration to a new CPU.
> > > >
> > > > Did I get that right?
> > > >
> > > > IMO, even without a reproducer, gcc on PPC should just not do that,
> > > > that feels terribly broken for the kernel. I wonder what clang does,
> > > > I'll go poke around with compilerexplorer after lunch.
> > > >
> > > > Adding +Peter Zijlstra as well to join the party as I have a feeling
> > > > he'll be interested. ;-)
> > >
> > > I'm a little confused; the way I understand the whole stack protector
> > > thing to work is that we push a canary on the stack at call and on
> > > return check it is still valid. Since in general tasks randomly migrate,
> > > the per-cpu validation canary should be the same on all CPUs.
> > >
> > > Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> > > raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> > > but no guarantees.
> > >
> > > Both cases use r13 (paca) in a racy manner, and in both cases it should
> > > be safe.
> > New test results today: both gcc build from git (git clone
> > git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
> > are immune from the above issue. We can see the assembly code on
> > http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
> >
> > while
> > Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
> > on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
>
> Do you know what fixes the issue? I would not declare victory yet. My
> feeling is something changes in timing, or compiler codegen which
> hides the issue. So the issue is still there but it is just a matter
> of time before someone else reports it.
I am going to try bisect on GCC, hope we can find some clue.
>
> Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> canary? Michael, is this an optimization? Adding Christophe as well
> since it came in a few years ago via the following commit:
>
> commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7
> Author: Christophe Leroy <christophe.leroy@c-s.fr>
> Date: Thu Sep 27 07:05:55 2018 +0000
>
> powerpc/64: add stack protector support
>
> On PPC64, as register r13 points to the paca_struct at all time,
> this patch adds a copy of the canary there, which is copied at
> task_switch.
> That new canary is then used by using the following GCC options:
> -mstack-protector-guard=tls
> -mstack-protector-guard-reg=r13
> -mstack-protector-guard-offset=offsetof(struct paca_struct, canary))
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>
> - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-25 3:12 ` Zhouyi Zhou
0 siblings, 0 replies; 80+ messages in thread
From: Zhouyi Zhou @ 2023-04-25 3:12 UTC (permalink / raw)
To: Joel Fernandes
Cc: Christophe Leroy, Peter Zijlstra, Boqun Feng, Segher Boessenkool,
Michael Ellerman, linuxppc-dev, rcu, linux-kernel, lance,
Paul E. McKenney
On Tue, Apr 25, 2023 at 7:06 PM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >
> > hi
> >
> > On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> > > > This is amazing debugging Boqun, like a boss! One comment below:
> > > >
> > > > > > > Or something simple I haven't thought of? :)
> > > > > >
> > > > > > At what points can r13 change? Only when some particular functions are
> > > > > > called?
> > > > > >
> > > > >
> > > > > r13 is the local paca:
> > > > >
> > > > > register struct paca_struct *local_paca asm("r13");
> > > > >
> > > > > , which is a pointer to percpu data.
> > > > >
> > > > > So if a task schedule from one CPU to anotehr CPU, the value gets
> > > > > changed.
> > > >
> > > > It appears the whole issue, per your analysis, is that the stack
> > > > checking code in gcc should not cache or alias r13, and must read its
> > > > most up-to-date value during stack checking, as its value may have
> > > > changed during a migration to a new CPU.
> > > >
> > > > Did I get that right?
> > > >
> > > > IMO, even without a reproducer, gcc on PPC should just not do that,
> > > > that feels terribly broken for the kernel. I wonder what clang does,
> > > > I'll go poke around with compilerexplorer after lunch.
> > > >
> > > > Adding +Peter Zijlstra as well to join the party as I have a feeling
> > > > he'll be interested. ;-)
> > >
> > > I'm a little confused; the way I understand the whole stack protector
> > > thing to work is that we push a canary on the stack at call and on
> > > return check it is still valid. Since in general tasks randomly migrate,
> > > the per-cpu validation canary should be the same on all CPUs.
> > >
> > > Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> > > raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> > > but no guarantees.
> > >
> > > Both cases use r13 (paca) in a racy manner, and in both cases it should
> > > be safe.
> > New test results today: both gcc build from git (git clone
> > git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
> > are immune from the above issue. We can see the assembly code on
> > http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
> >
> > while
> > Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
> > on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
>
> Do you know what fixes the issue? I would not declare victory yet. My
> feeling is something changes in timing, or compiler codegen which
> hides the issue. So the issue is still there but it is just a matter
> of time before someone else reports it.
I am going to try bisect on GCC, hope we can find some clue.
>
> Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> canary? Michael, is this an optimization? Adding Christophe as well
> since it came in a few years ago via the following commit:
>
> commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7
> Author: Christophe Leroy <christophe.leroy@c-s.fr>
> Date: Thu Sep 27 07:05:55 2018 +0000
>
> powerpc/64: add stack protector support
>
> On PPC64, as register r13 points to the paca_struct at all time,
> this patch adds a copy of the canary there, which is copied at
> task_switch.
> That new canary is then used by using the following GCC options:
> -mstack-protector-guard=tls
> -mstack-protector-guard-reg=r13
> -mstack-protector-guard-offset=offsetof(struct paca_struct, canary))
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>
> - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-25 11:06 ` Joel Fernandes
@ 2023-04-25 13:40 ` Christophe Leroy
-1 siblings, 0 replies; 80+ messages in thread
From: Christophe Leroy @ 2023-04-25 13:40 UTC (permalink / raw)
To: Joel Fernandes, Zhouyi Zhou, Christophe Leroy
Cc: Paul E. McKenney, Peter Zijlstra, Boqun Feng, linux-kernel, rcu,
lance@osuosl.org, linuxppc-dev
Le 25/04/2023 à 13:06, Joel Fernandes a écrit :
> On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>>
>> hi
>>
>> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
>>>
>>> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
>>>> This is amazing debugging Boqun, like a boss! One comment below:
>>>>
>>>>>>> Or something simple I haven't thought of? :)
>>>>>>
>>>>>> At what points can r13 change? Only when some particular functions are
>>>>>> called?
>>>>>>
>>>>>
>>>>> r13 is the local paca:
>>>>>
>>>>> register struct paca_struct *local_paca asm("r13");
>>>>>
>>>>> , which is a pointer to percpu data.
>>>>>
>>>>> So if a task schedule from one CPU to anotehr CPU, the value gets
>>>>> changed.
>>>>
>>>> It appears the whole issue, per your analysis, is that the stack
>>>> checking code in gcc should not cache or alias r13, and must read its
>>>> most up-to-date value during stack checking, as its value may have
>>>> changed during a migration to a new CPU.
>>>>
>>>> Did I get that right?
>>>>
>>>> IMO, even without a reproducer, gcc on PPC should just not do that,
>>>> that feels terribly broken for the kernel. I wonder what clang does,
>>>> I'll go poke around with compilerexplorer after lunch.
>>>>
>>>> Adding +Peter Zijlstra as well to join the party as I have a feeling
>>>> he'll be interested. ;-)
>>>
>>> I'm a little confused; the way I understand the whole stack protector
>>> thing to work is that we push a canary on the stack at call and on
>>> return check it is still valid. Since in general tasks randomly migrate,
>>> the per-cpu validation canary should be the same on all CPUs.
>>>
>>> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
>>> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
>>> but no guarantees.
>>>
>>> Both cases use r13 (paca) in a racy manner, and in both cases it should
>>> be safe.
>> New test results today: both gcc build from git (git clone
>> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
>> are immune from the above issue. We can see the assembly code on
>> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
>>
>> while
>> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
>> on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
>
> Do you know what fixes the issue? I would not declare victory yet. My
> feeling is something changes in timing, or compiler codegen which
> hides the issue. So the issue is still there but it is just a matter
> of time before someone else reports it.
>
> Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> canary? Michael, is this an optimization? Adding Christophe as well
> since it came in a few years ago via the following commit:
It uses per-task canary. But unlike PPC32, PPC64 doesn't have a fixed
register pointing to 'current' at all time so the canary is copied into
a per-cpu struct during _switch().
If GCC keeps an old value of the per-cpu struct pointer, it then gets
the canary from the wrong CPU struct so from a different task.
Christophe
>
> commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7
> Author: Christophe Leroy <christophe.leroy@c-s.fr>
> Date: Thu Sep 27 07:05:55 2018 +0000
>
> powerpc/64: add stack protector support
>
> On PPC64, as register r13 points to the paca_struct at all time,
> this patch adds a copy of the canary there, which is copied at
> task_switch.
> That new canary is then used by using the following GCC options:
> -mstack-protector-guard=tls
> -mstack-protector-guard-reg=r13
> -mstack-protector-guard-offset=offsetof(struct paca_struct, canary))
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>
> - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-25 13:40 ` Christophe Leroy
0 siblings, 0 replies; 80+ messages in thread
From: Christophe Leroy @ 2023-04-25 13:40 UTC (permalink / raw)
To: Joel Fernandes, Zhouyi Zhou, Christophe Leroy
Cc: Peter Zijlstra, Boqun Feng, Segher Boessenkool, Michael Ellerman,
linuxppc-dev, rcu, linux-kernel, lance@osuosl.org,
Paul E. McKenney
Le 25/04/2023 à 13:06, Joel Fernandes a écrit :
> On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>>
>> hi
>>
>> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
>>>
>>> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
>>>> This is amazing debugging Boqun, like a boss! One comment below:
>>>>
>>>>>>> Or something simple I haven't thought of? :)
>>>>>>
>>>>>> At what points can r13 change? Only when some particular functions are
>>>>>> called?
>>>>>>
>>>>>
>>>>> r13 is the local paca:
>>>>>
>>>>> register struct paca_struct *local_paca asm("r13");
>>>>>
>>>>> , which is a pointer to percpu data.
>>>>>
>>>>> So if a task schedule from one CPU to anotehr CPU, the value gets
>>>>> changed.
>>>>
>>>> It appears the whole issue, per your analysis, is that the stack
>>>> checking code in gcc should not cache or alias r13, and must read its
>>>> most up-to-date value during stack checking, as its value may have
>>>> changed during a migration to a new CPU.
>>>>
>>>> Did I get that right?
>>>>
>>>> IMO, even without a reproducer, gcc on PPC should just not do that,
>>>> that feels terribly broken for the kernel. I wonder what clang does,
>>>> I'll go poke around with compilerexplorer after lunch.
>>>>
>>>> Adding +Peter Zijlstra as well to join the party as I have a feeling
>>>> he'll be interested. ;-)
>>>
>>> I'm a little confused; the way I understand the whole stack protector
>>> thing to work is that we push a canary on the stack at call and on
>>> return check it is still valid. Since in general tasks randomly migrate,
>>> the per-cpu validation canary should be the same on all CPUs.
>>>
>>> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
>>> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
>>> but no guarantees.
>>>
>>> Both cases use r13 (paca) in a racy manner, and in both cases it should
>>> be safe.
>> New test results today: both gcc build from git (git clone
>> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
>> are immune from the above issue. We can see the assembly code on
>> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
>>
>> while
>> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
>> on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
>
> Do you know what fixes the issue? I would not declare victory yet. My
> feeling is something changes in timing, or compiler codegen which
> hides the issue. So the issue is still there but it is just a matter
> of time before someone else reports it.
>
> Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> canary? Michael, is this an optimization? Adding Christophe as well
> since it came in a few years ago via the following commit:
It uses per-task canary. But unlike PPC32, PPC64 doesn't have a fixed
register pointing to 'current' at all time so the canary is copied into
a per-cpu struct during _switch().
If GCC keeps an old value of the per-cpu struct pointer, it then gets
the canary from the wrong CPU struct so from a different task.
Christophe
>
> commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7
> Author: Christophe Leroy <christophe.leroy@c-s.fr>
> Date: Thu Sep 27 07:05:55 2018 +0000
>
> powerpc/64: add stack protector support
>
> On PPC64, as register r13 points to the paca_struct at all time,
> this patch adds a copy of the canary there, which is copied at
> task_switch.
> That new canary is then used by using the following GCC options:
> -mstack-protector-guard=tls
> -mstack-protector-guard-reg=r13
> -mstack-protector-guard-offset=offsetof(struct paca_struct, canary))
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>
> - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-25 13:40 ` Christophe Leroy
@ 2023-04-25 13:49 ` Zhouyi Zhou
-1 siblings, 0 replies; 80+ messages in thread
From: Zhouyi Zhou @ 2023-04-25 13:49 UTC (permalink / raw)
To: Christophe Leroy
Cc: Paul E. McKenney, Peter Zijlstra, Boqun Feng, linux-kernel, rcu,
lance@osuosl.org, Joel Fernandes, linuxppc-dev
Hi
On Tue, Apr 25, 2023 at 9:40 PM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
>
>
> Le 25/04/2023 à 13:06, Joel Fernandes a écrit :
> > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >>
> >> hi
> >>
> >> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >>>
> >>> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> >>>> This is amazing debugging Boqun, like a boss! One comment below:
> >>>>
> >>>>>>> Or something simple I haven't thought of? :)
> >>>>>>
> >>>>>> At what points can r13 change? Only when some particular functions are
> >>>>>> called?
> >>>>>>
> >>>>>
> >>>>> r13 is the local paca:
> >>>>>
> >>>>> register struct paca_struct *local_paca asm("r13");
> >>>>>
> >>>>> , which is a pointer to percpu data.
> >>>>>
> >>>>> So if a task schedule from one CPU to anotehr CPU, the value gets
> >>>>> changed.
> >>>>
> >>>> It appears the whole issue, per your analysis, is that the stack
> >>>> checking code in gcc should not cache or alias r13, and must read its
> >>>> most up-to-date value during stack checking, as its value may have
> >>>> changed during a migration to a new CPU.
> >>>>
> >>>> Did I get that right?
> >>>>
> >>>> IMO, even without a reproducer, gcc on PPC should just not do that,
> >>>> that feels terribly broken for the kernel. I wonder what clang does,
> >>>> I'll go poke around with compilerexplorer after lunch.
> >>>>
> >>>> Adding +Peter Zijlstra as well to join the party as I have a feeling
> >>>> he'll be interested. ;-)
> >>>
> >>> I'm a little confused; the way I understand the whole stack protector
> >>> thing to work is that we push a canary on the stack at call and on
> >>> return check it is still valid. Since in general tasks randomly migrate,
> >>> the per-cpu validation canary should be the same on all CPUs.
> >>>
> >>> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> >>> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> >>> but no guarantees.
> >>>
> >>> Both cases use r13 (paca) in a racy manner, and in both cases it should
> >>> be safe.
> >> New test results today: both gcc build from git (git clone
> >> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
> >> are immune from the above issue. We can see the assembly code on
> >> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
> >>
> >> while
> >> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
> >> on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
> >
> > Do you know what fixes the issue? I would not declare victory yet. My
> > feeling is something changes in timing, or compiler codegen which
> > hides the issue. So the issue is still there but it is just a matter
> > of time before someone else reports it.
> >
> > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> > canary? Michael, is this an optimization? Adding Christophe as well
> > since it came in a few years ago via the following commit:
>
> It uses per-task canary. But unlike PPC32, PPC64 doesn't have a fixed
> register pointing to 'current' at all time so the canary is copied into
> a per-cpu struct during _switch().
>
> If GCC keeps an old value of the per-cpu struct pointer, it then gets
> the canary from the wrong CPU struct so from a different task.
This is a fruitful learning process for me!
Christophe:
Do you think there is still a need to bisect GCC ? If so, I am very
glad to continue
Cheers
Zhouyi
>
> Christophe
>
> >
> > commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7
> > Author: Christophe Leroy <christophe.leroy@c-s.fr>
> > Date: Thu Sep 27 07:05:55 2018 +0000
> >
> > powerpc/64: add stack protector support
> >
> > On PPC64, as register r13 points to the paca_struct at all time,
> > this patch adds a copy of the canary there, which is copied at
> > task_switch.
> > That new canary is then used by using the following GCC options:
> > -mstack-protector-guard=tls
> > -mstack-protector-guard-reg=r13
> > -mstack-protector-guard-offset=offsetof(struct paca_struct, canary))
> >
> > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> >
> > - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-25 13:49 ` Zhouyi Zhou
0 siblings, 0 replies; 80+ messages in thread
From: Zhouyi Zhou @ 2023-04-25 13:49 UTC (permalink / raw)
To: Christophe Leroy
Cc: Joel Fernandes, Peter Zijlstra, Boqun Feng, Segher Boessenkool,
Michael Ellerman, linuxppc-dev, rcu, linux-kernel,
lance@osuosl.org, Paul E. McKenney
Hi
On Tue, Apr 25, 2023 at 9:40 PM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
>
>
> Le 25/04/2023 à 13:06, Joel Fernandes a écrit :
> > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >>
> >> hi
> >>
> >> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >>>
> >>> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> >>>> This is amazing debugging Boqun, like a boss! One comment below:
> >>>>
> >>>>>>> Or something simple I haven't thought of? :)
> >>>>>>
> >>>>>> At what points can r13 change? Only when some particular functions are
> >>>>>> called?
> >>>>>>
> >>>>>
> >>>>> r13 is the local paca:
> >>>>>
> >>>>> register struct paca_struct *local_paca asm("r13");
> >>>>>
> >>>>> , which is a pointer to percpu data.
> >>>>>
> >>>>> So if a task schedule from one CPU to anotehr CPU, the value gets
> >>>>> changed.
> >>>>
> >>>> It appears the whole issue, per your analysis, is that the stack
> >>>> checking code in gcc should not cache or alias r13, and must read its
> >>>> most up-to-date value during stack checking, as its value may have
> >>>> changed during a migration to a new CPU.
> >>>>
> >>>> Did I get that right?
> >>>>
> >>>> IMO, even without a reproducer, gcc on PPC should just not do that,
> >>>> that feels terribly broken for the kernel. I wonder what clang does,
> >>>> I'll go poke around with compilerexplorer after lunch.
> >>>>
> >>>> Adding +Peter Zijlstra as well to join the party as I have a feeling
> >>>> he'll be interested. ;-)
> >>>
> >>> I'm a little confused; the way I understand the whole stack protector
> >>> thing to work is that we push a canary on the stack at call and on
> >>> return check it is still valid. Since in general tasks randomly migrate,
> >>> the per-cpu validation canary should be the same on all CPUs.
> >>>
> >>> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> >>> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> >>> but no guarantees.
> >>>
> >>> Both cases use r13 (paca) in a racy manner, and in both cases it should
> >>> be safe.
> >> New test results today: both gcc build from git (git clone
> >> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
> >> are immune from the above issue. We can see the assembly code on
> >> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
> >>
> >> while
> >> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
> >> on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
> >
> > Do you know what fixes the issue? I would not declare victory yet. My
> > feeling is something changes in timing, or compiler codegen which
> > hides the issue. So the issue is still there but it is just a matter
> > of time before someone else reports it.
> >
> > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> > canary? Michael, is this an optimization? Adding Christophe as well
> > since it came in a few years ago via the following commit:
>
> It uses per-task canary. But unlike PPC32, PPC64 doesn't have a fixed
> register pointing to 'current' at all time so the canary is copied into
> a per-cpu struct during _switch().
>
> If GCC keeps an old value of the per-cpu struct pointer, it then gets
> the canary from the wrong CPU struct so from a different task.
This is a fruitful learning process for me!
Christophe:
Do you think there is still a need to bisect GCC ? If so, I am very
glad to continue
Cheers
Zhouyi
>
> Christophe
>
> >
> > commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7
> > Author: Christophe Leroy <christophe.leroy@c-s.fr>
> > Date: Thu Sep 27 07:05:55 2018 +0000
> >
> > powerpc/64: add stack protector support
> >
> > On PPC64, as register r13 points to the paca_struct at all time,
> > this patch adds a copy of the canary there, which is copied at
> > task_switch.
> > That new canary is then used by using the following GCC options:
> > -mstack-protector-guard=tls
> > -mstack-protector-guard-reg=r13
> > -mstack-protector-guard-offset=offsetof(struct paca_struct, canary))
> >
> > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> >
> > - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-25 13:49 ` Zhouyi Zhou
@ 2023-04-26 0:32 ` Joel Fernandes
-1 siblings, 0 replies; 80+ messages in thread
From: Joel Fernandes @ 2023-04-26 0:32 UTC (permalink / raw)
To: Zhouyi Zhou
Cc: Paul E. McKenney, Peter Zijlstra, Boqun Feng, linux-kernel, rcu,
lance@osuosl.org, linuxppc-dev
On Tue, Apr 25, 2023 at 9:50 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>
> Hi
>
> On Tue, Apr 25, 2023 at 9:40 PM Christophe Leroy
> <christophe.leroy@csgroup.eu> wrote:
> >
> >
> >
> > Le 25/04/2023 à 13:06, Joel Fernandes a écrit :
> > > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> > >>
> > >> hi
> > >>
> > >> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >>>
> > >>> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> > >>>> This is amazing debugging Boqun, like a boss! One comment below:
> > >>>>
> > >>>>>>> Or something simple I haven't thought of? :)
> > >>>>>>
> > >>>>>> At what points can r13 change? Only when some particular functions are
> > >>>>>> called?
> > >>>>>>
> > >>>>>
> > >>>>> r13 is the local paca:
> > >>>>>
> > >>>>> register struct paca_struct *local_paca asm("r13");
> > >>>>>
> > >>>>> , which is a pointer to percpu data.
> > >>>>>
> > >>>>> So if a task schedule from one CPU to anotehr CPU, the value gets
> > >>>>> changed.
> > >>>>
> > >>>> It appears the whole issue, per your analysis, is that the stack
> > >>>> checking code in gcc should not cache or alias r13, and must read its
> > >>>> most up-to-date value during stack checking, as its value may have
> > >>>> changed during a migration to a new CPU.
> > >>>>
> > >>>> Did I get that right?
> > >>>>
> > >>>> IMO, even without a reproducer, gcc on PPC should just not do that,
> > >>>> that feels terribly broken for the kernel. I wonder what clang does,
> > >>>> I'll go poke around with compilerexplorer after lunch.
> > >>>>
> > >>>> Adding +Peter Zijlstra as well to join the party as I have a feeling
> > >>>> he'll be interested. ;-)
> > >>>
> > >>> I'm a little confused; the way I understand the whole stack protector
> > >>> thing to work is that we push a canary on the stack at call and on
> > >>> return check it is still valid. Since in general tasks randomly migrate,
> > >>> the per-cpu validation canary should be the same on all CPUs.
> > >>>
> > >>> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> > >>> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> > >>> but no guarantees.
> > >>>
> > >>> Both cases use r13 (paca) in a racy manner, and in both cases it should
> > >>> be safe.
> > >> New test results today: both gcc build from git (git clone
> > >> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
> > >> are immune from the above issue. We can see the assembly code on
> > >> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
> > >>
> > >> while
> > >> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
> > >> on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
> > >
> > > Do you know what fixes the issue? I would not declare victory yet. My
> > > feeling is something changes in timing, or compiler codegen which
> > > hides the issue. So the issue is still there but it is just a matter
> > > of time before someone else reports it.
> > >
> > > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> > > canary? Michael, is this an optimization? Adding Christophe as well
> > > since it came in a few years ago via the following commit:
> >
> > It uses per-task canary. But unlike PPC32, PPC64 doesn't have a fixed
> > register pointing to 'current' at all time so the canary is copied into
> > a per-cpu struct during _switch().
> >
> > If GCC keeps an old value of the per-cpu struct pointer, it then gets
> > the canary from the wrong CPU struct so from a different task.
> This is a fruitful learning process for me!
Nice work Zhouyi..
> Christophe:
> Do you think there is still a need to bisect GCC ? If so, I am very
> glad to continue
my 2 cents: It would be good to write a reproducer that Segher
suggested (but that might be hard since you depend on the compiler to
cache the r13 -- maybe some trial/error with CompilerExplorer will
give you the magic recipe?).
If I understand Christophe correctly, the issue requires the following
ingredients:
1. Task A is running on CPU 1, and the task's canary is copied into
the CPU1's per-cpu area pointed to by r13.
2. r13 is now cached into r10 in the offending function due to the compiler.
3. Task A running on CPU 1 now gets preempted right in the middle of
the offending SRCU function and gets migrated to CPU 2.
4. CPU 2's per-cpu canary is updated to that of task A since task A
is the current task now.
5. Task B now runs on CPU 1 and the per-cpu canary on CPU 1 is now that of B.
6. Task A exits the function, but stack checking code reads r10 which
contains CPU 1's canary which is that of task B!
7. Boom.
So the issue is precisely in #2. The issue is in the compiler that it
does not treat r13 as volatile as Boqun had initially mentioned.
- Joel
>
> Cheers
> Zhouyi
> >
> > Christophe
> >
> > >
> > > commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7
> > > Author: Christophe Leroy <christophe.leroy@c-s.fr>
> > > Date: Thu Sep 27 07:05:55 2018 +0000
> > >
> > > powerpc/64: add stack protector support
> > >
> > > On PPC64, as register r13 points to the paca_struct at all time,
> > > this patch adds a copy of the canary there, which is copied at
> > > task_switch.
> > > That new canary is then used by using the following GCC options:
> > > -mstack-protector-guard=tls
> > > -mstack-protector-guard-reg=r13
> > > -mstack-protector-guard-offset=offsetof(struct paca_struct, canary))
> > >
> > > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> > > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> > >
> > > - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-26 0:32 ` Joel Fernandes
0 siblings, 0 replies; 80+ messages in thread
From: Joel Fernandes @ 2023-04-26 0:32 UTC (permalink / raw)
To: Zhouyi Zhou
Cc: Christophe Leroy, Peter Zijlstra, Boqun Feng, Segher Boessenkool,
Michael Ellerman, linuxppc-dev, rcu, linux-kernel,
lance@osuosl.org, Paul E. McKenney
On Tue, Apr 25, 2023 at 9:50 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>
> Hi
>
> On Tue, Apr 25, 2023 at 9:40 PM Christophe Leroy
> <christophe.leroy@csgroup.eu> wrote:
> >
> >
> >
> > Le 25/04/2023 à 13:06, Joel Fernandes a écrit :
> > > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> > >>
> > >> hi
> > >>
> > >> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >>>
> > >>> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> > >>>> This is amazing debugging Boqun, like a boss! One comment below:
> > >>>>
> > >>>>>>> Or something simple I haven't thought of? :)
> > >>>>>>
> > >>>>>> At what points can r13 change? Only when some particular functions are
> > >>>>>> called?
> > >>>>>>
> > >>>>>
> > >>>>> r13 is the local paca:
> > >>>>>
> > >>>>> register struct paca_struct *local_paca asm("r13");
> > >>>>>
> > >>>>> , which is a pointer to percpu data.
> > >>>>>
> > >>>>> So if a task schedule from one CPU to anotehr CPU, the value gets
> > >>>>> changed.
> > >>>>
> > >>>> It appears the whole issue, per your analysis, is that the stack
> > >>>> checking code in gcc should not cache or alias r13, and must read its
> > >>>> most up-to-date value during stack checking, as its value may have
> > >>>> changed during a migration to a new CPU.
> > >>>>
> > >>>> Did I get that right?
> > >>>>
> > >>>> IMO, even without a reproducer, gcc on PPC should just not do that,
> > >>>> that feels terribly broken for the kernel. I wonder what clang does,
> > >>>> I'll go poke around with compilerexplorer after lunch.
> > >>>>
> > >>>> Adding +Peter Zijlstra as well to join the party as I have a feeling
> > >>>> he'll be interested. ;-)
> > >>>
> > >>> I'm a little confused; the way I understand the whole stack protector
> > >>> thing to work is that we push a canary on the stack at call and on
> > >>> return check it is still valid. Since in general tasks randomly migrate,
> > >>> the per-cpu validation canary should be the same on all CPUs.
> > >>>
> > >>> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> > >>> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> > >>> but no guarantees.
> > >>>
> > >>> Both cases use r13 (paca) in a racy manner, and in both cases it should
> > >>> be safe.
> > >> New test results today: both gcc build from git (git clone
> > >> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
> > >> are immune from the above issue. We can see the assembly code on
> > >> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
> > >>
> > >> while
> > >> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
> > >> on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
> > >
> > > Do you know what fixes the issue? I would not declare victory yet. My
> > > feeling is something changes in timing, or compiler codegen which
> > > hides the issue. So the issue is still there but it is just a matter
> > > of time before someone else reports it.
> > >
> > > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> > > canary? Michael, is this an optimization? Adding Christophe as well
> > > since it came in a few years ago via the following commit:
> >
> > It uses per-task canary. But unlike PPC32, PPC64 doesn't have a fixed
> > register pointing to 'current' at all time so the canary is copied into
> > a per-cpu struct during _switch().
> >
> > If GCC keeps an old value of the per-cpu struct pointer, it then gets
> > the canary from the wrong CPU struct so from a different task.
> This is a fruitful learning process for me!
Nice work Zhouyi..
> Christophe:
> Do you think there is still a need to bisect GCC ? If so, I am very
> glad to continue
my 2 cents: It would be good to write a reproducer that Segher
suggested (but that might be hard since you depend on the compiler to
cache the r13 -- maybe some trial/error with CompilerExplorer will
give you the magic recipe?).
If I understand Christophe correctly, the issue requires the following
ingredients:
1. Task A is running on CPU 1, and the task's canary is copied into
the CPU1's per-cpu area pointed to by r13.
2. r13 is now cached into r10 in the offending function due to the compiler.
3. Task A running on CPU 1 now gets preempted right in the middle of
the offending SRCU function and gets migrated to CPU 2.
4. CPU 2's per-cpu canary is updated to that of task A since task A
is the current task now.
5. Task B now runs on CPU 1 and the per-cpu canary on CPU 1 is now that of B.
6. Task A exits the function, but stack checking code reads r10 which
contains CPU 1's canary which is that of task B!
7. Boom.
So the issue is precisely in #2. The issue is in the compiler that it
does not treat r13 as volatile as Boqun had initially mentioned.
- Joel
>
> Cheers
> Zhouyi
> >
> > Christophe
> >
> > >
> > > commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7
> > > Author: Christophe Leroy <christophe.leroy@c-s.fr>
> > > Date: Thu Sep 27 07:05:55 2018 +0000
> > >
> > > powerpc/64: add stack protector support
> > >
> > > On PPC64, as register r13 points to the paca_struct at all time,
> > > this patch adds a copy of the canary there, which is copied at
> > > task_switch.
> > > That new canary is then used by using the following GCC options:
> > > -mstack-protector-guard=tls
> > > -mstack-protector-guard-reg=r13
> > > -mstack-protector-guard-offset=offsetof(struct paca_struct, canary))
> > >
> > > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> > > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> > >
> > > - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-26 0:32 ` Joel Fernandes
@ 2023-04-26 1:31 ` Zhouyi Zhou
-1 siblings, 0 replies; 80+ messages in thread
From: Zhouyi Zhou @ 2023-04-26 1:31 UTC (permalink / raw)
To: Joel Fernandes
Cc: Paul E. McKenney, Peter Zijlstra, Boqun Feng, linux-kernel, rcu,
lance@osuosl.org, linuxppc-dev
On Wed, Apr 26, 2023 at 8:33 AM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> On Tue, Apr 25, 2023 at 9:50 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >
> > Hi
> >
> > On Tue, Apr 25, 2023 at 9:40 PM Christophe Leroy
> > <christophe.leroy@csgroup.eu> wrote:
> > >
> > >
> > >
> > > Le 25/04/2023 à 13:06, Joel Fernandes a écrit :
> > > > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> > > >>
> > > >> hi
> > > >>
> > > >> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > > >>>
> > > >>> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> > > >>>> This is amazing debugging Boqun, like a boss! One comment below:
> > > >>>>
> > > >>>>>>> Or something simple I haven't thought of? :)
> > > >>>>>>
> > > >>>>>> At what points can r13 change? Only when some particular functions are
> > > >>>>>> called?
> > > >>>>>>
> > > >>>>>
> > > >>>>> r13 is the local paca:
> > > >>>>>
> > > >>>>> register struct paca_struct *local_paca asm("r13");
> > > >>>>>
> > > >>>>> , which is a pointer to percpu data.
> > > >>>>>
> > > >>>>> So if a task schedule from one CPU to anotehr CPU, the value gets
> > > >>>>> changed.
> > > >>>>
> > > >>>> It appears the whole issue, per your analysis, is that the stack
> > > >>>> checking code in gcc should not cache or alias r13, and must read its
> > > >>>> most up-to-date value during stack checking, as its value may have
> > > >>>> changed during a migration to a new CPU.
> > > >>>>
> > > >>>> Did I get that right?
> > > >>>>
> > > >>>> IMO, even without a reproducer, gcc on PPC should just not do that,
> > > >>>> that feels terribly broken for the kernel. I wonder what clang does,
> > > >>>> I'll go poke around with compilerexplorer after lunch.
> > > >>>>
> > > >>>> Adding +Peter Zijlstra as well to join the party as I have a feeling
> > > >>>> he'll be interested. ;-)
> > > >>>
> > > >>> I'm a little confused; the way I understand the whole stack protector
> > > >>> thing to work is that we push a canary on the stack at call and on
> > > >>> return check it is still valid. Since in general tasks randomly migrate,
> > > >>> the per-cpu validation canary should be the same on all CPUs.
> > > >>>
> > > >>> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> > > >>> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> > > >>> but no guarantees.
> > > >>>
> > > >>> Both cases use r13 (paca) in a racy manner, and in both cases it should
> > > >>> be safe.
> > > >> New test results today: both gcc build from git (git clone
> > > >> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
> > > >> are immune from the above issue. We can see the assembly code on
> > > >> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
> > > >>
> > > >> while
> > > >> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
> > > >> on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
> > > >
> > > > Do you know what fixes the issue? I would not declare victory yet. My
> > > > feeling is something changes in timing, or compiler codegen which
> > > > hides the issue. So the issue is still there but it is just a matter
> > > > of time before someone else reports it.
> > > >
> > > > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> > > > canary? Michael, is this an optimization? Adding Christophe as well
> > > > since it came in a few years ago via the following commit:
> > >
> > > It uses per-task canary. But unlike PPC32, PPC64 doesn't have a fixed
> > > register pointing to 'current' at all time so the canary is copied into
> > > a per-cpu struct during _switch().
> > >
> > > If GCC keeps an old value of the per-cpu struct pointer, it then gets
> > > the canary from the wrong CPU struct so from a different task.
> > This is a fruitful learning process for me!
>
> Nice work Zhouyi..
Thank Joel for your encouragement! Your encouragement is very
important to me ;-)
>
> > Christophe:
> > Do you think there is still a need to bisect GCC ? If so, I am very
> > glad to continue
>
> my 2 cents: It would be good to write a reproducer that Segher
> suggested (but that might be hard since you depend on the compiler to
> cache the r13 -- maybe some trial/error with CompilerExplorer will
> give you the magic recipe?).
I have reported to GCC bugzilla once before ;-) [1]
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88348
I think we could provide a preprocessed .i file, and give the command
line that invokes cc1,
The problem is the newest GCC is immune to our issue ;-(
>
> If I understand Christophe correctly, the issue requires the following
> ingredients:
> 1. Task A is running on CPU 1, and the task's canary is copied into
> the CPU1's per-cpu area pointed to by r13.
> 2. r13 is now cached into r10 in the offending function due to the compiler.
> 3. Task A running on CPU 1 now gets preempted right in the middle of
> the offending SRCU function and gets migrated to CPU 2.
> 4. CPU 2's per-cpu canary is updated to that of task A since task A
> is the current task now.
> 5. Task B now runs on CPU 1 and the per-cpu canary on CPU 1 is now that of B.
> 6. Task A exits the function, but stack checking code reads r10 which
> contains CPU 1's canary which is that of task B!
> 7. Boom.
Joel makes the learning process easier for me, indeed!
One question I have tried very hard to understand, but still confused.
for now, I know
r13 is fixed, but r1 is not, why "r9,40(r1)"'s 40(r1) can be assumed
to be equal to 3192(r10).
Thanks in advance.
>
> So the issue is precisely in #2. The issue is in the compiler that it
> does not treat r13 as volatile as Boqun had initially mentioned.
Please do not hesitate to email me if there is anything I can do (for
example bisecting ;-)). I am very glad to be of help ;-)
Cheers
Zhouyi
>
> - Joel
>
>
>
> >
> > Cheers
> > Zhouyi
> > >
> > > Christophe
> > >
> > > >
> > > > commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7
> > > > Author: Christophe Leroy <christophe.leroy@c-s.fr>
> > > > Date: Thu Sep 27 07:05:55 2018 +0000
> > > >
> > > > powerpc/64: add stack protector support
> > > >
> > > > On PPC64, as register r13 points to the paca_struct at all time,
> > > > this patch adds a copy of the canary there, which is copied at
> > > > task_switch.
> > > > That new canary is then used by using the following GCC options:
> > > > -mstack-protector-guard=tls
> > > > -mstack-protector-guard-reg=r13
> > > > -mstack-protector-guard-offset=offsetof(struct paca_struct, canary))
> > > >
> > > > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> > > > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> > > >
> > > > - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-26 1:31 ` Zhouyi Zhou
0 siblings, 0 replies; 80+ messages in thread
From: Zhouyi Zhou @ 2023-04-26 1:31 UTC (permalink / raw)
To: Joel Fernandes
Cc: Christophe Leroy, Peter Zijlstra, Boqun Feng, Segher Boessenkool,
Michael Ellerman, linuxppc-dev, rcu, linux-kernel,
lance@osuosl.org, Paul E. McKenney
On Wed, Apr 26, 2023 at 8:33 AM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> On Tue, Apr 25, 2023 at 9:50 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >
> > Hi
> >
> > On Tue, Apr 25, 2023 at 9:40 PM Christophe Leroy
> > <christophe.leroy@csgroup.eu> wrote:
> > >
> > >
> > >
> > > Le 25/04/2023 à 13:06, Joel Fernandes a écrit :
> > > > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> > > >>
> > > >> hi
> > > >>
> > > >> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > > >>>
> > > >>> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> > > >>>> This is amazing debugging Boqun, like a boss! One comment below:
> > > >>>>
> > > >>>>>>> Or something simple I haven't thought of? :)
> > > >>>>>>
> > > >>>>>> At what points can r13 change? Only when some particular functions are
> > > >>>>>> called?
> > > >>>>>>
> > > >>>>>
> > > >>>>> r13 is the local paca:
> > > >>>>>
> > > >>>>> register struct paca_struct *local_paca asm("r13");
> > > >>>>>
> > > >>>>> , which is a pointer to percpu data.
> > > >>>>>
> > > >>>>> So if a task schedule from one CPU to anotehr CPU, the value gets
> > > >>>>> changed.
> > > >>>>
> > > >>>> It appears the whole issue, per your analysis, is that the stack
> > > >>>> checking code in gcc should not cache or alias r13, and must read its
> > > >>>> most up-to-date value during stack checking, as its value may have
> > > >>>> changed during a migration to a new CPU.
> > > >>>>
> > > >>>> Did I get that right?
> > > >>>>
> > > >>>> IMO, even without a reproducer, gcc on PPC should just not do that,
> > > >>>> that feels terribly broken for the kernel. I wonder what clang does,
> > > >>>> I'll go poke around with compilerexplorer after lunch.
> > > >>>>
> > > >>>> Adding +Peter Zijlstra as well to join the party as I have a feeling
> > > >>>> he'll be interested. ;-)
> > > >>>
> > > >>> I'm a little confused; the way I understand the whole stack protector
> > > >>> thing to work is that we push a canary on the stack at call and on
> > > >>> return check it is still valid. Since in general tasks randomly migrate,
> > > >>> the per-cpu validation canary should be the same on all CPUs.
> > > >>>
> > > >>> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> > > >>> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> > > >>> but no guarantees.
> > > >>>
> > > >>> Both cases use r13 (paca) in a racy manner, and in both cases it should
> > > >>> be safe.
> > > >> New test results today: both gcc build from git (git clone
> > > >> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
> > > >> are immune from the above issue. We can see the assembly code on
> > > >> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
> > > >>
> > > >> while
> > > >> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
> > > >> on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
> > > >
> > > > Do you know what fixes the issue? I would not declare victory yet. My
> > > > feeling is something changes in timing, or compiler codegen which
> > > > hides the issue. So the issue is still there but it is just a matter
> > > > of time before someone else reports it.
> > > >
> > > > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> > > > canary? Michael, is this an optimization? Adding Christophe as well
> > > > since it came in a few years ago via the following commit:
> > >
> > > It uses per-task canary. But unlike PPC32, PPC64 doesn't have a fixed
> > > register pointing to 'current' at all time so the canary is copied into
> > > a per-cpu struct during _switch().
> > >
> > > If GCC keeps an old value of the per-cpu struct pointer, it then gets
> > > the canary from the wrong CPU struct so from a different task.
> > This is a fruitful learning process for me!
>
> Nice work Zhouyi..
Thank Joel for your encouragement! Your encouragement is very
important to me ;-)
>
> > Christophe:
> > Do you think there is still a need to bisect GCC ? If so, I am very
> > glad to continue
>
> my 2 cents: It would be good to write a reproducer that Segher
> suggested (but that might be hard since you depend on the compiler to
> cache the r13 -- maybe some trial/error with CompilerExplorer will
> give you the magic recipe?).
I have reported to GCC bugzilla once before ;-) [1]
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88348
I think we could provide a preprocessed .i file, and give the command
line that invokes cc1,
The problem is the newest GCC is immune to our issue ;-(
>
> If I understand Christophe correctly, the issue requires the following
> ingredients:
> 1. Task A is running on CPU 1, and the task's canary is copied into
> the CPU1's per-cpu area pointed to by r13.
> 2. r13 is now cached into r10 in the offending function due to the compiler.
> 3. Task A running on CPU 1 now gets preempted right in the middle of
> the offending SRCU function and gets migrated to CPU 2.
> 4. CPU 2's per-cpu canary is updated to that of task A since task A
> is the current task now.
> 5. Task B now runs on CPU 1 and the per-cpu canary on CPU 1 is now that of B.
> 6. Task A exits the function, but stack checking code reads r10 which
> contains CPU 1's canary which is that of task B!
> 7. Boom.
Joel makes the learning process easier for me, indeed!
One question I have tried very hard to understand, but still confused.
for now, I know
r13 is fixed, but r1 is not, why "r9,40(r1)"'s 40(r1) can be assumed
to be equal to 3192(r10).
Thanks in advance.
>
> So the issue is precisely in #2. The issue is in the compiler that it
> does not treat r13 as volatile as Boqun had initially mentioned.
Please do not hesitate to email me if there is anything I can do (for
example bisecting ;-)). I am very glad to be of help ;-)
Cheers
Zhouyi
>
> - Joel
>
>
>
> >
> > Cheers
> > Zhouyi
> > >
> > > Christophe
> > >
> > > >
> > > > commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7
> > > > Author: Christophe Leroy <christophe.leroy@c-s.fr>
> > > > Date: Thu Sep 27 07:05:55 2018 +0000
> > > >
> > > > powerpc/64: add stack protector support
> > > >
> > > > On PPC64, as register r13 points to the paca_struct at all time,
> > > > this patch adds a copy of the canary there, which is copied at
> > > > task_switch.
> > > > That new canary is then used by using the following GCC options:
> > > > -mstack-protector-guard=tls
> > > > -mstack-protector-guard-reg=r13
> > > > -mstack-protector-guard-offset=offsetof(struct paca_struct, canary))
> > > >
> > > > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> > > > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> > > >
> > > > - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-26 1:31 ` Zhouyi Zhou
@ 2023-04-26 2:15 ` Joel Fernandes
-1 siblings, 0 replies; 80+ messages in thread
From: Joel Fernandes @ 2023-04-26 2:15 UTC (permalink / raw)
To: Zhouyi Zhou
Cc: Paul E. McKenney, Peter Zijlstra, Boqun Feng, linux-kernel, rcu,
lance@osuosl.org, linuxppc-dev
Hi Zhouyi,
On Wed, Apr 26, 2023 at 09:31:17AM +0800, Zhouyi Zhou wrote:
[..]
> Joel makes the learning process easier for me, indeed!
I know that feeling being a learner myself ;-)
> One question I have tried very hard to understand, but still confused.
> for now, I know
> r13 is fixed, but r1 is not, why "r9,40(r1)"'s 40(r1) can be assumed
> to be equal to 3192(r10).
First you have to I guess read up a bit about stack canaries. Google for
"gcc stack protector" and "gcc stack canaries", and the look for basics of
"buffer overflow attacks". That'll explain the concept of stack guards etc
(Sorry if this is too obvious but I did not know how much you knew about it
already).
40(r1) is where the canary was stored. In the beginning of the function, you
have this:
c000000000226d58: 78 0c 2d e9 ld r9,3192(r13)
c000000000226d5c: 28 00 21 f9 std r9,40(r1)
r1 is your stack pointer. 3192(r13) is the canary value.
40(r1) is where the canary is stored for later comparison.
r1 should not change through out the function I believe, because otherwise
you don't know where the stack frame is, right?
Later you have this stuff before the function returns which gcc presumably
did due to optimization. That mr means move register and is where the caching
of r13 to r10 happens that Boqun pointed.
c000000000226eb4: 78 6b aa 7d mr r10,r13
[...]
and then the canary comparison happens:
c000000000226ed8: 28 00 21 e9 ld r9,40(r1)
c000000000226edc: 78 0c 4a e9 ld r10,3192(r10)
c000000000226ee0: 79 52 29 7d xor. r9,r9,r10
c000000000226ee4: 00 00 40 39 li r10,0
c000000000226ee8: b8 03 82 40 bne c0000000002272a0 <srcu_gp_start_if_needed+0x5a0>
So looks like for this to blow up, the preemption/migration has to happen
precisely between the mr doing the caching, and the xor doing the comparison,
since that's when the r10 will be stale.
thanks,
- Joel
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-26 2:15 ` Joel Fernandes
0 siblings, 0 replies; 80+ messages in thread
From: Joel Fernandes @ 2023-04-26 2:15 UTC (permalink / raw)
To: Zhouyi Zhou
Cc: Christophe Leroy, Peter Zijlstra, Boqun Feng, Segher Boessenkool,
Michael Ellerman, linuxppc-dev, rcu, linux-kernel,
lance@osuosl.org, Paul E. McKenney
Hi Zhouyi,
On Wed, Apr 26, 2023 at 09:31:17AM +0800, Zhouyi Zhou wrote:
[..]
> Joel makes the learning process easier for me, indeed!
I know that feeling being a learner myself ;-)
> One question I have tried very hard to understand, but still confused.
> for now, I know
> r13 is fixed, but r1 is not, why "r9,40(r1)"'s 40(r1) can be assumed
> to be equal to 3192(r10).
First you have to I guess read up a bit about stack canaries. Google for
"gcc stack protector" and "gcc stack canaries", and the look for basics of
"buffer overflow attacks". That'll explain the concept of stack guards etc
(Sorry if this is too obvious but I did not know how much you knew about it
already).
40(r1) is where the canary was stored. In the beginning of the function, you
have this:
c000000000226d58: 78 0c 2d e9 ld r9,3192(r13)
c000000000226d5c: 28 00 21 f9 std r9,40(r1)
r1 is your stack pointer. 3192(r13) is the canary value.
40(r1) is where the canary is stored for later comparison.
r1 should not change through out the function I believe, because otherwise
you don't know where the stack frame is, right?
Later you have this stuff before the function returns which gcc presumably
did due to optimization. That mr means move register and is where the caching
of r13 to r10 happens that Boqun pointed.
c000000000226eb4: 78 6b aa 7d mr r10,r13
[...]
and then the canary comparison happens:
c000000000226ed8: 28 00 21 e9 ld r9,40(r1)
c000000000226edc: 78 0c 4a e9 ld r10,3192(r10)
c000000000226ee0: 79 52 29 7d xor. r9,r9,r10
c000000000226ee4: 00 00 40 39 li r10,0
c000000000226ee8: b8 03 82 40 bne c0000000002272a0 <srcu_gp_start_if_needed+0x5a0>
So looks like for this to blow up, the preemption/migration has to happen
precisely between the mr doing the caching, and the xor doing the comparison,
since that's when the r10 will be stale.
thanks,
- Joel
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-26 2:15 ` Joel Fernandes
@ 2023-04-26 2:37 ` Zhouyi Zhou
-1 siblings, 0 replies; 80+ messages in thread
From: Zhouyi Zhou @ 2023-04-26 2:37 UTC (permalink / raw)
To: Joel Fernandes
Cc: Paul E. McKenney, Peter Zijlstra, Boqun Feng, linux-kernel, rcu,
lance@osuosl.org, linuxppc-dev
On Wed, Apr 26, 2023 at 10:15 AM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> Hi Zhouyi,
>
> On Wed, Apr 26, 2023 at 09:31:17AM +0800, Zhouyi Zhou wrote:
> [..]
> > Joel makes the learning process easier for me, indeed!
>
> I know that feeling being a learner myself ;-)
>
> > One question I have tried very hard to understand, but still confused.
> > for now, I know
> > r13 is fixed, but r1 is not, why "r9,40(r1)"'s 40(r1) can be assumed
> > to be equal to 3192(r10).
>
> First you have to I guess read up a bit about stack canaries. Google for
> "gcc stack protector" and "gcc stack canaries", and the look for basics of
> "buffer overflow attacks". That'll explain the concept of stack guards etc
> (Sorry if this is too obvious but I did not know how much you knew about it
> already).
>
> 40(r1) is where the canary was stored. In the beginning of the function, you
> have this:
>
> c000000000226d58: 78 0c 2d e9 ld r9,3192(r13)
> c000000000226d5c: 28 00 21 f9 std r9,40(r1)
>
> r1 is your stack pointer. 3192(r13) is the canary value.
>
> 40(r1) is where the canary is stored for later comparison.
>
> r1 should not change through out the function I believe, because otherwise
> you don't know where the stack frame is, right?
Thanks Joel's awesome explanation. I can understand the mechanics
behind our situation now!!
40(r1) is where the canary is stored for later comparison, this is
located on the stack.
while 3192(r13) is inside the cpu's PACA.
I quote Christophe's note here
"in order to be able to
have the canary as an offset of a fixed register as expected by GCC, we
copy the task canary into the cpu's PACA struct during _switch():
addi r6,r4,-THREAD /* Convert THREAD to 'current' */
std r6,PACACURRENT(r13) /* Set new 'current' */
#if defined(CONFIG_STACKPROTECTOR)
ld r6, TASK_CANARY(r6)
std r6, PACA_CANARY(r13)
#endif
"
>
> Later you have this stuff before the function returns which gcc presumably
> did due to optimization. That mr means move register and is where the caching
> of r13 to r10 happens that Boqun pointed.
Thank Boqun and all others' wonderful debugging! Your work confirmed
my bug report ;-)
>
> c000000000226eb4: 78 6b aa 7d mr r10,r13
> [...]
> and then the canary comparison happens:
>
> c000000000226ed8: 28 00 21 e9 ld r9,40(r1)
> c000000000226edc: 78 0c 4a e9 ld r10,3192(r10)
3192(r13) is correct because "we copy the task canary into the cpu's
PACA struct during _switch():"
but 3192(r10) is not correct, because r10 is the old value of r13.
> c000000000226ee0: 79 52 29 7d xor. r9,r9,r10
> c000000000226ee4: 00 00 40 39 li r10,0
> c000000000226ee8: b8 03 82 40 bne c0000000002272a0 <srcu_gp_start_if_needed+0x5a0>
>
> So looks like for this to blow up, the preemption/migration has to happen
> precisely between the mr doing the caching, and the xor doing the comparison,
> since that's when the r10 will be stale.
Thank Joel and all others for your time ;-)
I benefit a lot, and am very glad to do more good work to the
community in return ;-)
Cheers
Zhouyi
>
> thanks,
>
> - Joel
>
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-26 2:37 ` Zhouyi Zhou
0 siblings, 0 replies; 80+ messages in thread
From: Zhouyi Zhou @ 2023-04-26 2:37 UTC (permalink / raw)
To: Joel Fernandes
Cc: Christophe Leroy, Peter Zijlstra, Boqun Feng, Segher Boessenkool,
Michael Ellerman, linuxppc-dev, rcu, linux-kernel,
lance@osuosl.org, Paul E. McKenney
On Wed, Apr 26, 2023 at 10:15 AM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> Hi Zhouyi,
>
> On Wed, Apr 26, 2023 at 09:31:17AM +0800, Zhouyi Zhou wrote:
> [..]
> > Joel makes the learning process easier for me, indeed!
>
> I know that feeling being a learner myself ;-)
>
> > One question I have tried very hard to understand, but still confused.
> > for now, I know
> > r13 is fixed, but r1 is not, why "r9,40(r1)"'s 40(r1) can be assumed
> > to be equal to 3192(r10).
>
> First you have to I guess read up a bit about stack canaries. Google for
> "gcc stack protector" and "gcc stack canaries", and the look for basics of
> "buffer overflow attacks". That'll explain the concept of stack guards etc
> (Sorry if this is too obvious but I did not know how much you knew about it
> already).
>
> 40(r1) is where the canary was stored. In the beginning of the function, you
> have this:
>
> c000000000226d58: 78 0c 2d e9 ld r9,3192(r13)
> c000000000226d5c: 28 00 21 f9 std r9,40(r1)
>
> r1 is your stack pointer. 3192(r13) is the canary value.
>
> 40(r1) is where the canary is stored for later comparison.
>
> r1 should not change through out the function I believe, because otherwise
> you don't know where the stack frame is, right?
Thanks Joel's awesome explanation. I can understand the mechanics
behind our situation now!!
40(r1) is where the canary is stored for later comparison, this is
located on the stack.
while 3192(r13) is inside the cpu's PACA.
I quote Christophe's note here
"in order to be able to
have the canary as an offset of a fixed register as expected by GCC, we
copy the task canary into the cpu's PACA struct during _switch():
addi r6,r4,-THREAD /* Convert THREAD to 'current' */
std r6,PACACURRENT(r13) /* Set new 'current' */
#if defined(CONFIG_STACKPROTECTOR)
ld r6, TASK_CANARY(r6)
std r6, PACA_CANARY(r13)
#endif
"
>
> Later you have this stuff before the function returns which gcc presumably
> did due to optimization. That mr means move register and is where the caching
> of r13 to r10 happens that Boqun pointed.
Thank Boqun and all others' wonderful debugging! Your work confirmed
my bug report ;-)
>
> c000000000226eb4: 78 6b aa 7d mr r10,r13
> [...]
> and then the canary comparison happens:
>
> c000000000226ed8: 28 00 21 e9 ld r9,40(r1)
> c000000000226edc: 78 0c 4a e9 ld r10,3192(r10)
3192(r13) is correct because "we copy the task canary into the cpu's
PACA struct during _switch():"
but 3192(r10) is not correct, because r10 is the old value of r13.
> c000000000226ee0: 79 52 29 7d xor. r9,r9,r10
> c000000000226ee4: 00 00 40 39 li r10,0
> c000000000226ee8: b8 03 82 40 bne c0000000002272a0 <srcu_gp_start_if_needed+0x5a0>
>
> So looks like for this to blow up, the preemption/migration has to happen
> precisely between the mr doing the caching, and the xor doing the comparison,
> since that's when the r10 will be stale.
Thank Joel and all others for your time ;-)
I benefit a lot, and am very glad to do more good work to the
community in return ;-)
Cheers
Zhouyi
>
> thanks,
>
> - Joel
>
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-25 13:40 ` Christophe Leroy
@ 2023-04-26 0:42 ` Joel Fernandes
-1 siblings, 0 replies; 80+ messages in thread
From: Joel Fernandes @ 2023-04-26 0:42 UTC (permalink / raw)
To: Christophe Leroy
Cc: Paul E. McKenney, Peter Zijlstra, Zhouyi Zhou, Boqun Feng,
linux-kernel, rcu, lance@osuosl.org, linuxppc-dev
On Tue, Apr 25, 2023 at 9:40 AM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
>
>
> Le 25/04/2023 à 13:06, Joel Fernandes a écrit :
> > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >>
> >> hi
> >>
> >> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >>>
> >>> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> >>>> This is amazing debugging Boqun, like a boss! One comment below:
> >>>>
> >>>>>>> Or something simple I haven't thought of? :)
> >>>>>>
> >>>>>> At what points can r13 change? Only when some particular functions are
> >>>>>> called?
> >>>>>>
> >>>>>
> >>>>> r13 is the local paca:
> >>>>>
> >>>>> register struct paca_struct *local_paca asm("r13");
> >>>>>
> >>>>> , which is a pointer to percpu data.
> >>>>>
> >>>>> So if a task schedule from one CPU to anotehr CPU, the value gets
> >>>>> changed.
> >>>>
> >>>> It appears the whole issue, per your analysis, is that the stack
> >>>> checking code in gcc should not cache or alias r13, and must read its
> >>>> most up-to-date value during stack checking, as its value may have
> >>>> changed during a migration to a new CPU.
> >>>>
> >>>> Did I get that right?
> >>>>
> >>>> IMO, even without a reproducer, gcc on PPC should just not do that,
> >>>> that feels terribly broken for the kernel. I wonder what clang does,
> >>>> I'll go poke around with compilerexplorer after lunch.
> >>>>
> >>>> Adding +Peter Zijlstra as well to join the party as I have a feeling
> >>>> he'll be interested. ;-)
> >>>
> >>> I'm a little confused; the way I understand the whole stack protector
> >>> thing to work is that we push a canary on the stack at call and on
> >>> return check it is still valid. Since in general tasks randomly migrate,
> >>> the per-cpu validation canary should be the same on all CPUs.
> >>>
> >>> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> >>> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> >>> but no guarantees.
> >>>
> >>> Both cases use r13 (paca) in a racy manner, and in both cases it should
> >>> be safe.
> >> New test results today: both gcc build from git (git clone
> >> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
> >> are immune from the above issue. We can see the assembly code on
> >> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
> >>
> >> while
> >> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
> >> on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
> >
> > Do you know what fixes the issue? I would not declare victory yet. My
> > feeling is something changes in timing, or compiler codegen which
> > hides the issue. So the issue is still there but it is just a matter
> > of time before someone else reports it.
> >
> > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> > canary? Michael, is this an optimization? Adding Christophe as well
> > since it came in a few years ago via the following commit:
>
> It uses per-task canary. But unlike PPC32, PPC64 doesn't have a fixed
> register pointing to 'current' at all time so the canary is copied into
> a per-cpu struct during _switch().
>
> If GCC keeps an old value of the per-cpu struct pointer, it then gets
> the canary from the wrong CPU struct so from a different task.
Thanks a lot Christophe, that makes sense. Segher, are you convinced
that it is a compiler issue or is there still some doubt? Could you
modify gcc's stack checker to not optimize away r13 reads or is that
already the case in newer gcc?
thanks,
- Joel
>
> Christophe
>
> >
> > commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7
> > Author: Christophe Leroy <christophe.leroy@c-s.fr>
> > Date: Thu Sep 27 07:05:55 2018 +0000
> >
> > powerpc/64: add stack protector support
> >
> > On PPC64, as register r13 points to the paca_struct at all time,
> > this patch adds a copy of the canary there, which is copied at
> > task_switch.
> > That new canary is then used by using the following GCC options:
> > -mstack-protector-guard=tls
> > -mstack-protector-guard-reg=r13
> > -mstack-protector-guard-offset=offsetof(struct paca_struct, canary))
> >
> > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> >
> > - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-26 0:42 ` Joel Fernandes
0 siblings, 0 replies; 80+ messages in thread
From: Joel Fernandes @ 2023-04-26 0:42 UTC (permalink / raw)
To: Christophe Leroy
Cc: Zhouyi Zhou, Peter Zijlstra, Boqun Feng, Segher Boessenkool,
Michael Ellerman, linuxppc-dev, rcu, linux-kernel,
lance@osuosl.org, Paul E. McKenney
On Tue, Apr 25, 2023 at 9:40 AM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
>
>
> Le 25/04/2023 à 13:06, Joel Fernandes a écrit :
> > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >>
> >> hi
> >>
> >> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >>>
> >>> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> >>>> This is amazing debugging Boqun, like a boss! One comment below:
> >>>>
> >>>>>>> Or something simple I haven't thought of? :)
> >>>>>>
> >>>>>> At what points can r13 change? Only when some particular functions are
> >>>>>> called?
> >>>>>>
> >>>>>
> >>>>> r13 is the local paca:
> >>>>>
> >>>>> register struct paca_struct *local_paca asm("r13");
> >>>>>
> >>>>> , which is a pointer to percpu data.
> >>>>>
> >>>>> So if a task schedule from one CPU to anotehr CPU, the value gets
> >>>>> changed.
> >>>>
> >>>> It appears the whole issue, per your analysis, is that the stack
> >>>> checking code in gcc should not cache or alias r13, and must read its
> >>>> most up-to-date value during stack checking, as its value may have
> >>>> changed during a migration to a new CPU.
> >>>>
> >>>> Did I get that right?
> >>>>
> >>>> IMO, even without a reproducer, gcc on PPC should just not do that,
> >>>> that feels terribly broken for the kernel. I wonder what clang does,
> >>>> I'll go poke around with compilerexplorer after lunch.
> >>>>
> >>>> Adding +Peter Zijlstra as well to join the party as I have a feeling
> >>>> he'll be interested. ;-)
> >>>
> >>> I'm a little confused; the way I understand the whole stack protector
> >>> thing to work is that we push a canary on the stack at call and on
> >>> return check it is still valid. Since in general tasks randomly migrate,
> >>> the per-cpu validation canary should be the same on all CPUs.
> >>>
> >>> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> >>> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> >>> but no guarantees.
> >>>
> >>> Both cases use r13 (paca) in a racy manner, and in both cases it should
> >>> be safe.
> >> New test results today: both gcc build from git (git clone
> >> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0
> >> are immune from the above issue. We can see the assembly code on
> >> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
> >>
> >> while
> >> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler
> >> on my x86 laptop (gcc version 10.4.0) will reproduce the bug.
> >
> > Do you know what fixes the issue? I would not declare victory yet. My
> > feeling is something changes in timing, or compiler codegen which
> > hides the issue. So the issue is still there but it is just a matter
> > of time before someone else reports it.
> >
> > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> > canary? Michael, is this an optimization? Adding Christophe as well
> > since it came in a few years ago via the following commit:
>
> It uses per-task canary. But unlike PPC32, PPC64 doesn't have a fixed
> register pointing to 'current' at all time so the canary is copied into
> a per-cpu struct during _switch().
>
> If GCC keeps an old value of the per-cpu struct pointer, it then gets
> the canary from the wrong CPU struct so from a different task.
Thanks a lot Christophe, that makes sense. Segher, are you convinced
that it is a compiler issue or is there still some doubt? Could you
modify gcc's stack checker to not optimize away r13 reads or is that
already the case in newer gcc?
thanks,
- Joel
>
> Christophe
>
> >
> > commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7
> > Author: Christophe Leroy <christophe.leroy@c-s.fr>
> > Date: Thu Sep 27 07:05:55 2018 +0000
> >
> > powerpc/64: add stack protector support
> >
> > On PPC64, as register r13 points to the paca_struct at all time,
> > this patch adds a copy of the canary there, which is copied at
> > task_switch.
> > That new canary is then used by using the following GCC options:
> > -mstack-protector-guard=tls
> > -mstack-protector-guard-reg=r13
> > -mstack-protector-guard-offset=offsetof(struct paca_struct, canary))
> >
> > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> >
> > - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-25 11:06 ` Joel Fernandes
@ 2023-04-26 12:29 ` Michael Ellerman
-1 siblings, 0 replies; 80+ messages in thread
From: Michael Ellerman @ 2023-04-26 12:29 UTC (permalink / raw)
To: Joel Fernandes, Zhouyi Zhou, Christophe Leroy
Cc: Paul E. McKenney, Peter Zijlstra, Boqun Feng, linux-kernel, rcu,
lance, linuxppc-dev
Joel Fernandes <joel@joelfernandes.org> writes:
> On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
...
>
> Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> canary? Michael, is this an optimization? Adding Christophe as well
> since it came in a few years ago via the following commit:
I think Christophe also answered these in his reply.
We do use a per-task canary, but because we don't have "current" in a
register, we can't use the value in current for GCC.
In one of my replies I said a possible solution would be to keep current
in a register on 64-bit, but we'd need to do that in addition to the
paca, so that would consume another GPR which we'd need to think hard
about.
There's another reason to have it in the paca, which is that the paca is
always accessible, even when the MMU is off, whereas current isn't (in
some situations).
In general we don't want to use stack protector in code that runs with
the MMU off, but if the canary wasn't in the paca then we'd have a hard
requirement to not use stack protector in that code.
cheers
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-26 12:29 ` Michael Ellerman
0 siblings, 0 replies; 80+ messages in thread
From: Michael Ellerman @ 2023-04-26 12:29 UTC (permalink / raw)
To: Joel Fernandes, Zhouyi Zhou, Christophe Leroy
Cc: Peter Zijlstra, Boqun Feng, Segher Boessenkool, linuxppc-dev, rcu,
linux-kernel, lance, Paul E. McKenney
Joel Fernandes <joel@joelfernandes.org> writes:
> On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
...
>
> Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> canary? Michael, is this an optimization? Adding Christophe as well
> since it came in a few years ago via the following commit:
I think Christophe also answered these in his reply.
We do use a per-task canary, but because we don't have "current" in a
register, we can't use the value in current for GCC.
In one of my replies I said a possible solution would be to keep current
in a register on 64-bit, but we'd need to do that in addition to the
paca, so that would consume another GPR which we'd need to think hard
about.
There's another reason to have it in the paca, which is that the paca is
always accessible, even when the MMU is off, whereas current isn't (in
some situations).
In general we don't want to use stack protector in code that runs with
the MMU off, but if the canary wasn't in the paca then we'd have a hard
requirement to not use stack protector in that code.
cheers
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-26 12:29 ` Michael Ellerman
@ 2023-04-26 13:44 ` Joel Fernandes
-1 siblings, 0 replies; 80+ messages in thread
From: Joel Fernandes @ 2023-04-26 13:44 UTC (permalink / raw)
To: Michael Ellerman
Cc: Christophe Leroy, Paul E. McKenney, Peter Zijlstra, Zhouyi Zhou,
Boqun Feng, linux-kernel, rcu, lance, linuxppc-dev
On Wed, Apr 26, 2023 at 8:30 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Joel Fernandes <joel@joelfernandes.org> writes:
> > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> ...
> >
> > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> > canary? Michael, is this an optimization? Adding Christophe as well
> > since it came in a few years ago via the following commit:
>
> I think Christophe also answered these in his reply.
>
> We do use a per-task canary, but because we don't have "current" in a
> register, we can't use the value in current for GCC.
>
> In one of my replies I said a possible solution would be to keep current
> in a register on 64-bit, but we'd need to do that in addition to the
> paca, so that would consume another GPR which we'd need to think hard
> about.
Makes sense. I'd think it is not worth allocating a separate GPR just
for this, and may cause similar register optimization issues as well.
> There's another reason to have it in the paca, which is that the paca is
> always accessible, even when the MMU is off, whereas current isn't (in
> some situations).
>
> In general we don't want to use stack protector in code that runs with
> the MMU off, but if the canary wasn't in the paca then we'd have a hard
> requirement to not use stack protector in that code.
How could you control which code paths don't have the stack protector?
Just curious.
thanks,
- Joel
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-26 13:44 ` Joel Fernandes
0 siblings, 0 replies; 80+ messages in thread
From: Joel Fernandes @ 2023-04-26 13:44 UTC (permalink / raw)
To: Michael Ellerman
Cc: Zhouyi Zhou, Christophe Leroy, Peter Zijlstra, Boqun Feng,
Segher Boessenkool, linuxppc-dev, rcu, linux-kernel, lance,
Paul E. McKenney
On Wed, Apr 26, 2023 at 8:30 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Joel Fernandes <joel@joelfernandes.org> writes:
> > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> ...
> >
> > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
> > canary? Michael, is this an optimization? Adding Christophe as well
> > since it came in a few years ago via the following commit:
>
> I think Christophe also answered these in his reply.
>
> We do use a per-task canary, but because we don't have "current" in a
> register, we can't use the value in current for GCC.
>
> In one of my replies I said a possible solution would be to keep current
> in a register on 64-bit, but we'd need to do that in addition to the
> paca, so that would consume another GPR which we'd need to think hard
> about.
Makes sense. I'd think it is not worth allocating a separate GPR just
for this, and may cause similar register optimization issues as well.
> There's another reason to have it in the paca, which is that the paca is
> always accessible, even when the MMU is off, whereas current isn't (in
> some situations).
>
> In general we don't want to use stack protector in code that runs with
> the MMU off, but if the canary wasn't in the paca then we'd have a hard
> requirement to not use stack protector in that code.
How could you control which code paths don't have the stack protector?
Just curious.
thanks,
- Joel
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-26 13:44 ` Joel Fernandes
@ 2023-04-26 14:20 ` Peter Zijlstra
-1 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-04-26 14:20 UTC (permalink / raw)
To: Joel Fernandes
Cc: Christophe Leroy, Paul E. McKenney, Boqun Feng, linux-kernel, rcu,
lance, Zhouyi Zhou, linuxppc-dev
On Wed, Apr 26, 2023 at 09:44:22AM -0400, Joel Fernandes wrote:
> How could you control which code paths don't have the stack protector?
> Just curious.
https://lkml.kernel.org/r/20230412-no_stackp-v2-0-116f9fe4bbe7@google.com
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-26 14:20 ` Peter Zijlstra
0 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-04-26 14:20 UTC (permalink / raw)
To: Joel Fernandes
Cc: Michael Ellerman, Zhouyi Zhou, Christophe Leroy, Boqun Feng,
Segher Boessenkool, linuxppc-dev, rcu, linux-kernel, lance,
Paul E. McKenney
On Wed, Apr 26, 2023 at 09:44:22AM -0400, Joel Fernandes wrote:
> How could you control which code paths don't have the stack protector?
> Just curious.
https://lkml.kernel.org/r/20230412-no_stackp-v2-0-116f9fe4bbe7@google.com
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-26 14:20 ` Peter Zijlstra
@ 2023-04-26 14:45 ` Michael Ellerman
-1 siblings, 0 replies; 80+ messages in thread
From: Michael Ellerman @ 2023-04-26 14:45 UTC (permalink / raw)
To: Peter Zijlstra, Joel Fernandes
Cc: Christophe Leroy, Paul E. McKenney, Zhouyi Zhou, Boqun Feng,
linux-kernel, rcu, lance, linuxppc-dev
Peter Zijlstra <peterz@infradead.org> writes:
> On Wed, Apr 26, 2023 at 09:44:22AM -0400, Joel Fernandes wrote:
>
>> How could you control which code paths don't have the stack protector?
>> Just curious.
>
> https://lkml.kernel.org/r/20230412-no_stackp-v2-0-116f9fe4bbe7@google.com
We also have some entire files disabled, eg:
$ git grep no-stack-protector arch/powerpc/
arch/powerpc/kernel/Makefile:CFLAGS_prom_init.o += -fno-stack-protector
arch/powerpc/kernel/Makefile:CFLAGS_syscall.o += -fno-stack-protector
arch/powerpc/kernel/Makefile:CFLAGS_setup_64.o += -fno-stack-protector
arch/powerpc/kernel/Makefile:CFLAGS_paca.o += -fno-stack-protector
cheers
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-26 14:45 ` Michael Ellerman
0 siblings, 0 replies; 80+ messages in thread
From: Michael Ellerman @ 2023-04-26 14:45 UTC (permalink / raw)
To: Peter Zijlstra, Joel Fernandes
Cc: Zhouyi Zhou, Christophe Leroy, Boqun Feng, Segher Boessenkool,
linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney
Peter Zijlstra <peterz@infradead.org> writes:
> On Wed, Apr 26, 2023 at 09:44:22AM -0400, Joel Fernandes wrote:
>
>> How could you control which code paths don't have the stack protector?
>> Just curious.
>
> https://lkml.kernel.org/r/20230412-no_stackp-v2-0-116f9fe4bbe7@google.com
We also have some entire files disabled, eg:
$ git grep no-stack-protector arch/powerpc/
arch/powerpc/kernel/Makefile:CFLAGS_prom_init.o += -fno-stack-protector
arch/powerpc/kernel/Makefile:CFLAGS_syscall.o += -fno-stack-protector
arch/powerpc/kernel/Makefile:CFLAGS_setup_64.o += -fno-stack-protector
arch/powerpc/kernel/Makefile:CFLAGS_paca.o += -fno-stack-protector
cheers
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-26 12:29 ` Michael Ellerman
@ 2023-04-28 10:35 ` Christophe Leroy
-1 siblings, 0 replies; 80+ messages in thread
From: Christophe Leroy @ 2023-04-28 10:35 UTC (permalink / raw)
To: Michael Ellerman, Joel Fernandes, Zhouyi Zhou, Christophe Leroy
Cc: Paul E. McKenney, Peter Zijlstra, Boqun Feng, linux-kernel, rcu,
lance@osuosl.org, linuxppc-dev
Le 26/04/2023 à 14:29, Michael Ellerman a écrit :
> Joel Fernandes <joel@joelfernandes.org> writes:
>> On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> ...
>>
>> Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
>> canary? Michael, is this an optimization? Adding Christophe as well
>> since it came in a few years ago via the following commit:
>
> I think Christophe also answered these in his reply.
>
> We do use a per-task canary, but because we don't have "current" in a
> register, we can't use the value in current for GCC.
>
> In one of my replies I said a possible solution would be to keep current
> in a register on 64-bit, but we'd need to do that in addition to the
> paca, so that would consume another GPR which we'd need to think hard
> about.
An analysis was done here https://github.com/linuxppc/issues/issues/45
showing that r14 is very little used.
>
> There's another reason to have it in the paca, which is that the paca is
> always accessible, even when the MMU is off, whereas current isn't (in
> some situations).
Even now that powerpc is converted to CONFIG_THREAD_INFO_IN_TASK ?
Christophe
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-28 10:35 ` Christophe Leroy
0 siblings, 0 replies; 80+ messages in thread
From: Christophe Leroy @ 2023-04-28 10:35 UTC (permalink / raw)
To: Michael Ellerman, Joel Fernandes, Zhouyi Zhou, Christophe Leroy
Cc: Peter Zijlstra, Boqun Feng, Segher Boessenkool, linuxppc-dev, rcu,
linux-kernel, lance@osuosl.org, Paul E. McKenney
Le 26/04/2023 à 14:29, Michael Ellerman a écrit :
> Joel Fernandes <joel@joelfernandes.org> writes:
>> On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> ...
>>
>> Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task
>> canary? Michael, is this an optimization? Adding Christophe as well
>> since it came in a few years ago via the following commit:
>
> I think Christophe also answered these in his reply.
>
> We do use a per-task canary, but because we don't have "current" in a
> register, we can't use the value in current for GCC.
>
> In one of my replies I said a possible solution would be to keep current
> in a register on 64-bit, but we'd need to do that in addition to the
> paca, so that would consume another GPR which we'd need to think hard
> about.
An analysis was done here https://github.com/linuxppc/issues/issues/45
showing that r14 is very little used.
>
> There's another reason to have it in the paca, which is that the paca is
> always accessible, even when the MMU is off, whereas current isn't (in
> some situations).
Even now that powerpc is converted to CONFIG_THREAD_INFO_IN_TASK ?
Christophe
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-25 10:13 ` Peter Zijlstra
@ 2023-04-25 10:59 ` Joel Fernandes
-1 siblings, 0 replies; 80+ messages in thread
From: Joel Fernandes @ 2023-04-25 10:59 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Paul E. McKenney, Boqun Feng, linux-kernel, rcu, lance,
Zhouyi Zhou, linuxppc-dev
On Tue, Apr 25, 2023 at 6:13 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> > This is amazing debugging Boqun, like a boss! One comment below:
> >
> > > > > Or something simple I haven't thought of? :)
> > > >
> > > > At what points can r13 change? Only when some particular functions are
> > > > called?
> > > >
> > >
> > > r13 is the local paca:
> > >
> > > register struct paca_struct *local_paca asm("r13");
> > >
> > > , which is a pointer to percpu data.
> > >
> > > So if a task schedule from one CPU to anotehr CPU, the value gets
> > > changed.
> >
> > It appears the whole issue, per your analysis, is that the stack
> > checking code in gcc should not cache or alias r13, and must read its
> > most up-to-date value during stack checking, as its value may have
> > changed during a migration to a new CPU.
> >
> > Did I get that right?
> >
> > IMO, even without a reproducer, gcc on PPC should just not do that,
> > that feels terribly broken for the kernel. I wonder what clang does,
> > I'll go poke around with compilerexplorer after lunch.
> >
> > Adding +Peter Zijlstra as well to join the party as I have a feeling
> > he'll be interested. ;-)
>
> I'm a little confused; the way I understand the whole stack protector
> thing to work is that we push a canary on the stack at call and on
> return check it is still valid. Since in general tasks randomly migrate,
> the per-cpu validation canary should be the same on all CPUs.
>
> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> but no guarantees.
>
> Both cases use r13 (paca) in a racy manner, and in both cases it should
> be safe.
AFAICS, the canary is randomly chosen both in the kernel [1]. This
also appears to be the case in glibc. That makes sense because you
don't want the canary to be something that the attacker can easily
predict and store on the stack to bypass buffer overflow attacks:
[1] kernel :
/*
* Initialize the stackprotector canary value.
*
* NOTE: this must only be called from functions that never return,
* and it must always be inlined.
*/
static __always_inline void boot_init_stack_canary(void)
{
unsigned long canary = get_random_canary();
current->stack_canary = canary;
#ifdef CONFIG_PPC64
get_paca()->canary = canary;
#endif
}
thanks,
- Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-25 10:59 ` Joel Fernandes
0 siblings, 0 replies; 80+ messages in thread
From: Joel Fernandes @ 2023-04-25 10:59 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Boqun Feng, Segher Boessenkool, Michael Ellerman, Zhouyi Zhou,
linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney
On Tue, Apr 25, 2023 at 6:13 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> > This is amazing debugging Boqun, like a boss! One comment below:
> >
> > > > > Or something simple I haven't thought of? :)
> > > >
> > > > At what points can r13 change? Only when some particular functions are
> > > > called?
> > > >
> > >
> > > r13 is the local paca:
> > >
> > > register struct paca_struct *local_paca asm("r13");
> > >
> > > , which is a pointer to percpu data.
> > >
> > > So if a task schedule from one CPU to anotehr CPU, the value gets
> > > changed.
> >
> > It appears the whole issue, per your analysis, is that the stack
> > checking code in gcc should not cache or alias r13, and must read its
> > most up-to-date value during stack checking, as its value may have
> > changed during a migration to a new CPU.
> >
> > Did I get that right?
> >
> > IMO, even without a reproducer, gcc on PPC should just not do that,
> > that feels terribly broken for the kernel. I wonder what clang does,
> > I'll go poke around with compilerexplorer after lunch.
> >
> > Adding +Peter Zijlstra as well to join the party as I have a feeling
> > he'll be interested. ;-)
>
> I'm a little confused; the way I understand the whole stack protector
> thing to work is that we push a canary on the stack at call and on
> return check it is still valid. Since in general tasks randomly migrate,
> the per-cpu validation canary should be the same on all CPUs.
>
> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> but no guarantees.
>
> Both cases use r13 (paca) in a racy manner, and in both cases it should
> be safe.
AFAICS, the canary is randomly chosen both in the kernel [1]. This
also appears to be the case in glibc. That makes sense because you
don't want the canary to be something that the attacker can easily
predict and store on the stack to bypass buffer overflow attacks:
[1] kernel :
/*
* Initialize the stackprotector canary value.
*
* NOTE: this must only be called from functions that never return,
* and it must always be inlined.
*/
static __always_inline void boot_init_stack_canary(void)
{
unsigned long canary = get_random_canary();
current->stack_canary = canary;
#ifdef CONFIG_PPC64
get_paca()->canary = canary;
#endif
}
thanks,
- Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-25 10:59 ` Joel Fernandes
@ 2023-04-25 11:53 ` Peter Zijlstra
-1 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-04-25 11:53 UTC (permalink / raw)
To: Joel Fernandes
Cc: Paul E. McKenney, Boqun Feng, linux-kernel, rcu, lance,
Zhouyi Zhou, linuxppc-dev
On Tue, Apr 25, 2023 at 06:59:29AM -0400, Joel Fernandes wrote:
> > I'm a little confused; the way I understand the whole stack protector
> > thing to work is that we push a canary on the stack at call and on
> > return check it is still valid. Since in general tasks randomly migrate,
> > the per-cpu validation canary should be the same on all CPUs.
> AFAICS, the canary is randomly chosen both in the kernel [1]. This
Yes, at boot, once. But thereafter it should be the same for all CPUs.
> also appears to be the case in glibc. That makes sense because you
> don't want the canary to be something that the attacker can easily
> predict and store on the stack to bypass buffer overflow attacks:
>
> [1] kernel :
> /*
> * Initialize the stackprotector canary value.
> *
> * NOTE: this must only be called from functions that never return,
> * and it must always be inlined.
> */
> static __always_inline void boot_init_stack_canary(void)
> {
> unsigned long canary = get_random_canary();
>
> current->stack_canary = canary;
> #ifdef CONFIG_PPC64
> get_paca()->canary = canary;
> #endif
> }
>
> thanks,
>
> - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-25 11:53 ` Peter Zijlstra
0 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-04-25 11:53 UTC (permalink / raw)
To: Joel Fernandes
Cc: Boqun Feng, Segher Boessenkool, Michael Ellerman, Zhouyi Zhou,
linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney
On Tue, Apr 25, 2023 at 06:59:29AM -0400, Joel Fernandes wrote:
> > I'm a little confused; the way I understand the whole stack protector
> > thing to work is that we push a canary on the stack at call and on
> > return check it is still valid. Since in general tasks randomly migrate,
> > the per-cpu validation canary should be the same on all CPUs.
> AFAICS, the canary is randomly chosen both in the kernel [1]. This
Yes, at boot, once. But thereafter it should be the same for all CPUs.
> also appears to be the case in glibc. That makes sense because you
> don't want the canary to be something that the attacker can easily
> predict and store on the stack to bypass buffer overflow attacks:
>
> [1] kernel :
> /*
> * Initialize the stackprotector canary value.
> *
> * NOTE: this must only be called from functions that never return,
> * and it must always be inlined.
> */
> static __always_inline void boot_init_stack_canary(void)
> {
> unsigned long canary = get_random_canary();
>
> current->stack_canary = canary;
> #ifdef CONFIG_PPC64
> get_paca()->canary = canary;
> #endif
> }
>
> thanks,
>
> - Joel
^ permalink raw reply [flat|nested] 80+ messages in thread* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
2023-04-25 11:53 ` Peter Zijlstra
@ 2023-04-25 13:36 ` Christophe Leroy
-1 siblings, 0 replies; 80+ messages in thread
From: Christophe Leroy @ 2023-04-25 13:36 UTC (permalink / raw)
To: Peter Zijlstra, Joel Fernandes
Cc: Paul E. McKenney, Zhouyi Zhou, Boqun Feng, linux-kernel, rcu,
lance@osuosl.org, linuxppc-dev
Le 25/04/2023 à 13:53, Peter Zijlstra a écrit :
> On Tue, Apr 25, 2023 at 06:59:29AM -0400, Joel Fernandes wrote:
>>> I'm a little confused; the way I understand the whole stack protector
>>> thing to work is that we push a canary on the stack at call and on
>>> return check it is still valid. Since in general tasks randomly migrate,
>>> the per-cpu validation canary should be the same on all CPUs.
>
>> AFAICS, the canary is randomly chosen both in the kernel [1]. This
>
> Yes, at boot, once. But thereafter it should be the same for all CPUs.
Each task has its own canary, stored in task struct :
kernel/fork.c:1012: tsk->stack_canary = get_random_canary();
On PPC32 we have register 'r2' that points to task struct at all time,
so GCC is instructed to find canary at an offset from r2.
But on PPC64 we have no such register. Instead we have r13 that points
to the PACA struct which is a per-cpu structure, and we have a pointer
to 'current' task struct in the PACA struct. So in order to be able to
have the canary as an offset of a fixed register as expected by GCC, we
copy the task canary into the cpu's PACA struct during _switch():
addi r6,r4,-THREAD /* Convert THREAD to 'current' */
std r6,PACACURRENT(r13) /* Set new 'current' */
#if defined(CONFIG_STACKPROTECTOR)
ld r6, TASK_CANARY(r6)
std r6, PACA_CANARY(r13)
#endif
The problem is that r13 will change if a task is switched to another
CPU. But if GCC is using a copy of an older value of r13, then it will
take the canary from another CPU's PACA struct hence it'll get the
canary of the task running on that CPU instead of getting the canary of
the current task running on the current CPU.
Christophe
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
@ 2023-04-25 13:36 ` Christophe Leroy
0 siblings, 0 replies; 80+ messages in thread
From: Christophe Leroy @ 2023-04-25 13:36 UTC (permalink / raw)
To: Peter Zijlstra, Joel Fernandes
Cc: Paul E. McKenney, Boqun Feng, linux-kernel, rcu, lance@osuosl.org,
Zhouyi Zhou, linuxppc-dev
Le 25/04/2023 à 13:53, Peter Zijlstra a écrit :
> On Tue, Apr 25, 2023 at 06:59:29AM -0400, Joel Fernandes wrote:
>>> I'm a little confused; the way I understand the whole stack protector
>>> thing to work is that we push a canary on the stack at call and on
>>> return check it is still valid. Since in general tasks randomly migrate,
>>> the per-cpu validation canary should be the same on all CPUs.
>
>> AFAICS, the canary is randomly chosen both in the kernel [1]. This
>
> Yes, at boot, once. But thereafter it should be the same for all CPUs.
Each task has its own canary, stored in task struct :
kernel/fork.c:1012: tsk->stack_canary = get_random_canary();
On PPC32 we have register 'r2' that points to task struct at all time,
so GCC is instructed to find canary at an offset from r2.
But on PPC64 we have no such register. Instead we have r13 that points
to the PACA struct which is a per-cpu structure, and we have a pointer
to 'current' task struct in the PACA struct. So in order to be able to
have the canary as an offset of a fixed register as expected by GCC, we
copy the task canary into the cpu's PACA struct during _switch():
addi r6,r4,-THREAD /* Convert THREAD to 'current' */
std r6,PACACURRENT(r13) /* Set new 'current' */
#if defined(CONFIG_STACKPROTECTOR)
ld r6, TASK_CANARY(r6)
std r6, PACA_CANARY(r13)
#endif
The problem is that r13 will change if a task is switched to another
CPU. But if GCC is using a copy of an older value of r13, then it will
take the canary from another CPU's PACA struct hence it'll get the
canary of the task running on that CPU instead of getting the canary of
the current task running on the current CPU.
Christophe
^ permalink raw reply [flat|nested] 80+ messages in thread