Re: [PATCH 0/2] LoongArch: Make bounds-checking instructions useful

From: Xi Ruoyao <xry111@xry111.site>
To: WANG Xuerui <kernel@xen0n.name>, loongarch@lists.linux.dev
Cc: WANG Xuerui <git@xen0n.name>, Huacai Chen <chenhuacai@kernel.org>,
	Eric Biederman <ebiederm@xmission.com>,
	Al Viro <viro@zeniv.linux.org.uk>, Arnd Bergmann <arnd@arndb.de>,
	linux-api@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2] LoongArch: Make bounds-checking instructions useful
Date: Mon, 17 Apr 2023 17:50:40 +0800
Message-ID: <f54abfae989023fcfdabb4e9800a66847c357b85.camel@xry111.site>
In-Reply-To: <6ca642a9-62a6-00e5-39ac-f14ef36f6bdb@xen0n.name>

On Mon, 2023-04-17 at 15:54 +0800, WANG Xuerui wrote:
> On 2023/4/17 14:47, Xi Ruoyao wrote:
> > On Mon, 2023-04-17 at 01:33 +0800, WANG Xuerui wrote:
> > > From: WANG Xuerui <git@xen0n.name>
> > > 
> > > Hi,
> > > 
> > > The LoongArch-64 base architecture is capable of performing
> > > bounds-checking either before memory accesses or alone, with specialized
> > > instructions generating BCEs (bounds-checking errors) in case of failed
> > > assertions (ISA manual Volume 1, Sections 2.2.6.1 [1] and 2.2.10.3 [2]).
> > > This could be useful for managed runtimes, but the exception is not
> > > being handled so far, resulting in SIGSYSes in these cases, which is
> > > incorrect and warrants a fix in itself.
> > > 
> > > During experimentation, it was discovered that there is already UAPI for
> > > expressing such semantics: SIGSEGV with si_code=SEGV_BNDERR. This was
> > > originally added for Intel MPX, and there is currently no user (!) after
> > > the removal of MPX support a few years ago. Although the semantics is
> > > not a 1:1 match to that of LoongArch, still it is better than
> > > alternatives such as SIGTRAP or SIGBUS of BUS_OBJERR kind, due to being
> > > able to convey both the value that failed assertion and the bound value.
> > > 
> > > This patch series implements just this approach: translating BCEs into
> > > SIGSEGVs with si_code=SEGV_BNDERR, si_value set to the offending value,
> > > and si_lower and si_upper set to resemble a range with both lower and
> > > upper bound while in fact there is only one.
> > > 
> > > The instructions are not currently used anywhere yet in the fledgling
> > > LoongArch ecosystem, so it's not very urgent and we could take the time
> > > to figure out the best way forward (should SEGV_BNDERR turn out not
> > > suitable).
> > 
> > I don't think these instructions can be used in any systematic way
> > within a Linux userspace in 2023.  IMO they should not exist in
> > LoongArch at all because they have all the same disadvantages as Intel
> > MPX; MPX was removed by Intel in 2019, and LoongArch was designed
> > after 2019.
> 
> Well, the difference is IMO significant enough to make LoongArch 
> bounds-checking more useful, at least for certain use cases. For 
> example, the bounds were a separate register bank in Intel MPX, but in
> LoongArch they are just values in GPRs. This fits naturally into 
> JIT-ting or other managed runtimes (e.g. Go) whose slice indexing ops 
> already bounds-check with a temporary register per bound anyway, so it's 
> just a matter of this snippet (or something like it)
> 
> - calculate element address
> - if address < base: goto fail
> - load/calculate upper bound
> - if address >= upper bound: goto fail
> - access memory
> 
> becoming
> 
> - calculate element address
> - asrtgt address, base - 1
> - load/calculate upper bound
> - {ld,st}le address, upper bound
> 
> then in SIGSEGV handler, check PC to associate the signal back with the 
> exact access op;

I remember reading that using a signal handler for "usual" error
handling can be a very bad idea, but I can't remember where.  Are there
any managed environments doing so in practice?
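
For reference, here is a minimal sketch in C of what the handler side of
the quoted scheme might look like on Linux.  decode_bnderr(),
on_sigsegv() and install_bnderr_handler() are made-up names, not part of
the series.  Reading the offending value from si_addr is an assumption:
it is the field the generic SEGV_BNDERR layout uses for the fault
address.  Mapping the faulting PC back to the access op is only marked
with a comment, because the ucontext layout is per-architecture.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 and fills the outputs when `info`
 * describes a bounds-check failure (SIGSEGV with SEGV_BNDERR),
 * otherwise returns 0 and leaves the outputs untouched. */
int decode_bnderr(const siginfo_t *info, uintptr_t *addr,
                  uintptr_t *lower, uintptr_t *upper)
{
    if (info->si_signo != SIGSEGV || info->si_code != SEGV_BNDERR)
        return 0;
    *addr  = (uintptr_t)info->si_addr;   /* assumed: offending value */
    *lower = (uintptr_t)info->si_lower;
    *upper = (uintptr_t)info->si_upper;
    return 1;
}

/* Handler sketch: only async-signal-safe calls are used here. */
void on_sigsegv(int sig, siginfo_t *info, void *uctx)
{
    uintptr_t addr, lower, upper;

    (void)sig;
    (void)uctx; /* a runtime would read the faulting PC from here */

    if (decode_bnderr(info, &addr, &lower, &upper)) {
        static const char msg[] = "bounds check failed\n";
        if (write(STDERR_FILENO, msg, sizeof msg - 1) < 0) {
            /* nothing useful to do if stderr is gone */
        }
    }
    _exit(1);
}

/* Installs the handler with SA_SIGINFO so that siginfo_t is passed. */
int install_bnderr_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigsegv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

A runtime would call install_bnderr_handler() once at startup.  Instead
of exiting, it would then have to unwind from inside the handler to
raise a language-level exception, which is the part I'm worried about.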

If we redefine new_ldle/new_stle as "if [[likely]] the address is in
bounds, do the load/store and skip the next instruction; otherwise do
nothing", we can write:

blt        address, base, 1f          // below the lower bound
new_ldle.d rd, address, upperbound    // in bounds: load and skip next
1:b        panic_oob_access           // reached only when out of bounds
xori       rd, rd, 42                 // use rd to do something

This is more versatile, and useful for building a loop as well:

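or            a0, r0, r0      // sum = 0
0:new_ldle.d  t1, t0, t2      // in bounds: load and skip the branch
b             1f              // out of bounds: leave the loop
add.d         a0, t1, a0      // sum += loaded value
addi.d        t0, t0, 8       // advance to the next 8-byte element
b             0b
1:bl          do_something_with_the_sum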
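
To make the intended semantics concrete, here is a small software model
in C of the loop above.  new_ldle_d() and sum_until_bound() are
illustrative names only.  The model assumes the bound is inclusive
(address <= bound, as the "le" suffix suggests) and uses the boolean
return value to stand in for "skip the next instruction".

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the proposed (hypothetical) new_ldle.d:
 * in bounds     -> load into *rd and return true ("skip next insn");
 * out of bounds -> do nothing and return false ("fall through"). */
bool new_ldle_d(int64_t *rd, const int64_t *address,
                const int64_t *upper_bound)
{
    if ((uintptr_t)address <= (uintptr_t)upper_bound) {
        *rd = *address;
        return true;
    }
    return false;
}

/* Model of the summing loop: p plays t0, last plays t2, sum plays a0. */
int64_t sum_until_bound(const int64_t *p, const int64_t *last)
{
    int64_t sum = 0;                     /* or     a0, r0, r0        */
    int64_t t1;

    while (new_ldle_d(&t1, p, last)) {   /* 0: new_ldle.d t1, t0, t2 */
        sum += t1;                       /* add.d  a0, t1, a0        */
        p++;                             /* advance t0 by 8 bytes    */
    }
    return sum;                          /* 1: out of bounds, done   */
}
```

The loop needs no separate compare instruction: the bounds check and the
loop exit condition are the same operation.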

Yes, it's "non-RISC", but at least it's more RISC than the current ldle.
If you want a trap anyway, you can write:

blt        address, base, 1f
new_ldle.d rd, address, upperbound
1:break    {a code defined for OOB}
xori       rd, rd, 42 // use rd

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University
