From: Richard Sandiford <richard.sandiford@arm.com>
To: Dan Li <ashimida@linux.alibaba.com>
Cc: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com,
marcus.shawcroft@arm.com, kyrylo.tkachov@arm.com, hp@gcc.gnu.org,
ndesaulniers@google.com, nsz@gcc.gnu.org, pageexec@gmail.com,
qinzhao@gcc.gnu.org, linux-hardening@vger.kernel.org
Subject: Re: [PATCH] [PATCH,v4,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Date: Fri, 11 Feb 2022 15:35:01 +0000 [thread overview]
Message-ID: <mpto83dzaai.fsf@arm.com> (raw)
In-Reply-To: <81b35033-489e-d0c0-8677-55923f6abbe0@linux.alibaba.com> (Dan Li's message of "Fri, 11 Feb 2022 05:43:07 -0800")
Dan Li <ashimida@linux.alibaba.com> writes:
> On 2/11/22 01:53, Richard Sandiford wrote:
>> Dan Li <ashimida@linux.alibaba.com> writes:
>>> On 2/10/22 01:55, Richard Sandiford wrote:
>>>>>
>>>> But treating scs push and scs pop as part of the register save and
>>>> restore sequences would have one advantage: it would allow the
>>>> scs push and scs pop to be shrink-wrapped.
>>>>
>>>
>>> Sorry for my limited knowledge of shrink warping, I don't think I get
>>> it here (I tried to find a case when compiling the kernel and some
>>> gcc test cases but I still don't have a clue.).
>>>
>>> I see that the bitmap of LR_REGNUM is cleared in
>>> aarch64_get_separate_components and scs push/pop are x18 based operations.
>>>
>>> If we handle them in aarch64_restore/save_callee_saves,
>>> could scs push/pop be shrink-wrapped in some cases?
>>
>> Yeah, I think so. E.g. for:
>>
>> void f();
>> int g(int x) {
>> if (x == 0)
>> return 1;
>> f();
>> return 2;
>> }
>>
>> shrink wrapping would allow the scs push and pop to move along with the
>> x30 save:
>>
>> g:
>> cbnz w0, .L9
>> mov w0, 1
>> ret
>> .L9:
>> stp x29, x30, [sp, -16]!
>> mov x29, sp
>> bl f
>> mov w0, 2
>> ldp x29, x30, [sp], 16
>> ret
>>
>
> Thanks Richard, (to make sure I understand correctly :)) I think
> it means that the current patch could do a "shrink-wapping", but
> the X30 could not be treat as a "component", now it could gen code
> like:
>
> g:
> cbnz w0, .L9
> mov w0, 1
> ret
> .L9:
> str x30, [x18], 8
> stp x29, x30, [sp, -16]!
> mov x29, sp
> bl f
> ldr x30, [x18, -8]!
> mov w0, 2
> ldr x29, [sp], 16
> ret
Ah, right, sorry. I'd forgotten that this happened independently
of the components stuff (and has to, since like you say, we don't
treat LR_REGNUM as a separable component).
>> The idea is that aarch64_save_callee_saves would treat the scs push
>> as part of saving x30 (along with the normal store to the frame chain,
>> when used). aarch64_restore_callee_saves would similarly treat the scs
>> pop as the way of restoring x30 (instead of loading from the frame chain).
>> This is in contrast to the current patch, where the scs push and pop are
>> treated as fixed parts of the prologue and epilogue instead, and where
>> aarch64_restore_callee_saves tries to avoid doing anything for x30.
>>
>> If shrink-wrapping decides to treat x30 as a separate “component”, as it
>> does in the example above, then the scs push and pop would be emitted
>> by aarch64_process_components instead.
>>
>> It would be more complex, but it would give better code.
>>
>
> Following your idea, I made a poc to add x30 in component bitmap:
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 35f6f64f5b2..fc9b5e7af54 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -8359,7 +8359,7 @@ aarch64_get_separate_components (void)
> if (reg1 != INVALID_REGNUM)
> bitmap_clear_bit (components, reg1);
>
> - bitmap_clear_bit (components, LR_REGNUM);
> bitmap_clear_bit (components, SP_REGNUM);
>
> return components;
> @@ -8396,7 +8396,7 @@ aarch64_components_for_bb (basic_block bb)
> /* GPRs are used in a bb if they are in the IN, GEN, or KILL sets. */
> for (unsigned regno = 0; regno <= LAST_SAVED_REGNUM; regno++)
> if (!fixed_regs[regno]
> - && !crtl->abi->clobbers_full_reg_p (regno)
> + && (!crtl->abi->clobbers_full_reg_p (regno) || regno == R30_REGNUM)
> && (TEST_HARD_REG_BIT (extra_caller_saves, regno)
> || bitmap_bit_p (in, regno)
> || bitmap_bit_p (gen, regno)
>
> And with a test code compiled with -fno-omit-frame-pointer:
>
> void f();
> int g(int x) {
> if (x == 0) {
> __asm__ ("":::"x19", "x20");
> return 1;
> }
> f();
> return 2;
> }
>
> Then it seems X30 is treat as a "component" (the test
> result of aarch64.exp also seems fine).
>
> g:
> stp x19, x20, [sp, -32]!
> cbnz w0, .L2
> mov w0, 1
> ldp x19, x20, [sp], 32
> ret
> .L2:
> str x30, [sp, 16]
> bl f
> ldr x30, [sp, 16]
> mov w0, 2
> ldp x19, x20, [sp], 32
> ret
>
> And I think maybe we could handle this through three patches:
> 1.Keep current patch (a V5) unchanged for scs.
> 2.Add shrink-warpping for X30:
> logically this might be a separate topic, and I think more testing
> might be needed here (Well, I'm a little worried about if there might
> be other effects, since I just read this part of the code roughly
> yesterday).
> 3.Add scs push/pop to shrink-wrapping (and maybe we can do the same for
> the PAC code in pro/epilogue, since it's also the operation of the X30).
Yeah, that's fair.
(Like I said earlier, I wasn't asking for the shrink-wrapping change.
It was just a note in passing. But as you point out, the individual
shrink-wrapping support would be even more work than I'd imagined.)
Thanks,
Richard
next prev parent reply other threads:[~2022-02-11 15:35 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-05 11:04 [PATCH] [PATCH,v4,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack Dan Li
2022-02-09 16:08 ` Richard Sandiford
2022-02-10 3:06 ` Dan Li
2022-02-10 9:55 ` Richard Sandiford
2022-02-11 8:57 ` Dan Li
2022-02-11 9:53 ` Richard Sandiford
2022-02-11 13:43 ` Dan Li
2022-02-11 15:35 ` Richard Sandiford [this message]
2022-02-12 5:30 ` Dan Li
2022-02-12 8:43 ` Dan Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=mpto83dzaai.fsf@arm.com \
--to=richard.sandiford@arm.com \
--cc=ashimida@linux.alibaba.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=hp@gcc.gnu.org \
--cc=kyrylo.tkachov@arm.com \
--cc=linux-hardening@vger.kernel.org \
--cc=marcus.shawcroft@arm.com \
--cc=ndesaulniers@google.com \
--cc=nsz@gcc.gnu.org \
--cc=pageexec@gmail.com \
--cc=qinzhao@gcc.gnu.org \
--cc=richard.earnshaw@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.