public inbox for linux-kernel@vger.kernel.org
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: richard earnshaw <Richard.Earnshaw@arm.com>
Cc: peter maydell <peter.maydell@linaro.org>,
	Will Deacon <will.deacon@arm.com>,
	libc-alpha <libc-alpha@sourceware.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	carlos <carlos@redhat.com>
Subject: Re: rseq/arm32: choosing rseq code signature
Date: Wed, 17 Apr 2019 10:43:04 -0400 (EDT)
Message-ID: <1583901617.467.1555512184036.JavaMail.zimbra@efficios.com>
In-Reply-To: <cecd4a6e-4a36-1785-4f65-486ba3847a06@arm.com>

----- On Apr 17, 2019, at 6:37 AM, richard earnshaw Richard.Earnshaw@arm.com wrote:

> On 16/04/2019 14:39, Mathieu Desnoyers wrote:
>> ----- On Apr 15, 2019, at 9:37 AM, Mathieu Desnoyers
>> mathieu.desnoyers@efficios.com wrote:
>> 
>>> ----- On Apr 15, 2019, at 9:30 AM, peter maydell peter.maydell@linaro.org wrote:
>>>
>>>> On Mon, 15 Apr 2019 at 14:11, Mathieu Desnoyers
>>>> <mathieu.desnoyers@efficios.com> wrote:
>>>>>
>>>>> ----- On Apr 11, 2019, at 3:55 PM, peter maydell peter.maydell@linaro.org wrote:
>>>>>
>>>>>> On Thu, 11 Apr 2019 at 18:51, Mathieu Desnoyers
>>>>>> <mathieu.desnoyers@efficios.com> wrote:
>>>>>>>  * This translates to the following instruction pattern in the T16 instruction
>>>>>>>  * set:
>>>>>>>  *
>>>>>>>  * little endian:
>>>>>>>  * def3        udf    #243      ; 0xf3
>>>>>>>  * e7f5        b.n    <7f5>
>>>>>>>  *
>>>>>>>  * big endian:
>>>>>>>  * e7f5        b.n    <7f5>
>>>>>>>  * def3        udf    #243      ; 0xf3
>>>>>>
>>>>>> Do we really care about big-endian instruction-ordering for Thumb?
>>>>>> It requires (AIUI) either an ARMv7R CPU which implements and sets
>>>>>> SCTLR.IE to 1, or a v6-or-earlier CPU using BE32, and it's going to
>>>>>> be even rarer than normal BE8 big-endian...
>>>>>
>>>>> I don't think we care enough about it to look for a trick to
>>>>> turn the branch into something else (which would not branch away from the
>>>>> udf instruction), but considering this signature will be ABI, it's good to
>>>>> be thorough documentation-wise and cover all existing cases.
>>>>
>>>> I think if you want to document it it would be helpful to
>>>> readers to make it clear that this is the ultra-rare
>>>> big-endian-instruction-order "big endian Thumb", not the only
>>>> moderately-rare little-endian-instructions-big-endian-data
>>>> "big endian Thumb".
>>>
>>> I'm actually very much concerned about environments with big endian
>>> data and little endian code. Which gcc compiler flags do I need to
>>> use to test it?
>>>
>>> I'm concerned about a signature mismatch between what is passed to
>>> the rseq system call ("data-endian signature") and what is generated
>>> in the code ("instruction-endian signature").
>> 
>> Based on this page:
>> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0360f/CDFBBCHB.html
>> 
>> My understanding is that the situation is as follows (please confirm):
>> 
>> - Prior to ARMv6, you could build and run code that is either big or little
>>   endian, given you had a matching Linux kernel endianness. Code and data
>>   endianness needed to match.
>> - Starting from ARMv6, only little endian code is supported. The endianness
>>   for data access can be changed through bit [9], the E bit, of the Program
>>   Status Register (mixed endianness).
>> 
>> Looking at ARM build options for gcc, it seems you can select either big or
>> little endian (-mbig-endian or -mlittle-endian (default)) which affects both
>> instruction and data endianness. So I suspect the -mbig-endian option is
>> really only useful for pre-ARMv6.
> 
> -mbig-endian is still correct, even on later architectures.  The linker
> gets involved, however, and (using the mapping symbol information) swaps
> the code segments to little-endian form (this is why you have to use
> .inst rather than .word when inserting instructions, so that the correct
> mapping symbols are inserted).

So what you're saying is that if I have:

void main()
{
        asm volatile (
                        ".arm\n\t"
                        ".inst 0xe7f5def3\n\t"
                        ".long 0xe7f5def3\n\t");
}

and compile it with:

arm-linux-gnueabihf-gcc -mbig-endian -march=armv6 -c -o arm-big-endianv6.o arm-test-endian.c

it's expected that the generated .o will have big endian instructions, matching
the endianness of the data, e.g.:

hexdump arm-big-endianv6.o

[...]
0000030 0a00 0900 80b5 00af f5e7 f3de f5e7 f3de

It is then only at the linking stage that the linker reverses the
endianness of the word emitted by ".inst" (but not the one emitted by ".long").
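
As a rough illustration of that expectation (a sketch only; store_be32 and
store_le32 are hypothetical helper names, not part of any toolchain), these
are the two byte layouts the final link would contain if the linker swapped
the ".inst" word to little-endian instruction order and left the ".long"
word in big-endian data order:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: big-endian layout, most significant byte first.
 * This is how the ".long" data word is expected to stay. */
static void store_be32(uint32_t value, uint8_t out[4])
{
        assert(out != NULL);
        out[0] = (uint8_t)(value >> 24);
        out[1] = (uint8_t)(value >> 16);
        out[2] = (uint8_t)(value >> 8);
        out[3] = (uint8_t)value;
}

/* Hypothetical helper: little-endian layout, least significant byte first.
 * This is how the ".inst" word would look after a byte swap. */
static void store_le32(uint32_t value, uint8_t out[4])
{
        assert(out != NULL);
        out[0] = (uint8_t)value;
        out[1] = (uint8_t)(value >> 8);
        out[2] = (uint8_t)(value >> 16);
        out[3] = (uint8_t)(value >> 24);
}
```

For 0xe7f5def3, store_be32() yields the bytes e7 f5 de f3 and store_le32()
yields f3 de f5 e7, so a swapped ".inst" word would be distinguishable from
the ".long" word in a byte-level dump.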

Let's see:

arm-linux-gnueabihf-gcc -nodefaultlibs -nostdlib -mbig-endian -march=armv6 -o arm-big-endianv6 arm-big-endianv6.o 
/usr/lib/gcc-cross/arm-linux-gnueabihf/7/../../../../arm-linux-gnueabihf/bin/ld: warning: cannot find entry symbol _start; defaulting to 00000000000001b0

hexdump gives me:
[...]
00001b0 80b5 00af f5e7 f3de f5e7 f3de c046 bd46

So it has not reversed the instruction endianness.
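
To make the reading of the dump explicit, here is a small sketch that turns
hexdump's default 16-bit groups back into file-order bytes. It assumes
hexdump was run on a little-endian host (for instance an x86 build machine),
which is not stated above, and the function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert 16-bit groups as printed by default hexdump
 * on a little-endian host back into the bytes in file order. */
static void hexdump_groups_to_bytes(const uint16_t *groups, size_t n,
                                    uint8_t *bytes)
{
        assert(groups != NULL && bytes != NULL);
        for (size_t i = 0; i < n; i++) {
                /* The low half of each printed group is the first byte in the file. */
                bytes[2 * i] = (uint8_t)(groups[i] & 0xff);
                bytes[2 * i + 1] = (uint8_t)(groups[i] >> 8);
        }
}

/* Hypothetical helper: read four file-order bytes as a big-endian word. */
static uint32_t load_be32(const uint8_t b[4])
{
        return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
               ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}
```

Under that assumption, the groups "f5e7 f3de" decode to the bytes
e7 f5 de f3, which load_be32() reads back as 0xe7f5def3. Both words in the
linked binary therefore still have the big-endian layout.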

What am I doing wrong?

I'm using:

gcc version 7.3.0 (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04)
GNU ld (GNU Binutils for Ubuntu) 2.30

Thanks,

Mathieu

> 
>> 
>> For ARMv6+ mixed-endianness, it seems to be a mode that temporarily swaps
>> the endianness of load/store instructions for specific memory accesses
>> communicating with DMA devices, so I don't see any scenario where we can
>> generate a binary that has little endian code and big endian data. If that
>> is true, then it should be fine to declare the signature with ".arm .inst"
>> and expect the data endianness to be the same as code endianness.
>> 
>> Am I missing something?
>> 
>> Thanks,
>> 
>> Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
