public inbox for linux-kernel@vger.kernel.org
From: David Laight <David.Laight@ACULAB.COM>
To: "'Eric W. Biederman'" <ebiederm@xmission.com>,
	Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	X86 ML <x86@kernel.org>
Subject: RE: in_compat_syscall() on x86
Date: Mon, 4 Jan 2021 22:34:48 +0000
Message-ID: <fe2629460b4e4b44a120a8b56efe0ac1@AcuMS.aculab.com>
In-Reply-To: <87sg7gfnaa.fsf@x220.int.ebiederm.org>

From: Eric W. Biederman
> Sent: 04 January 2021 20:41
> 
> Al Viro <viro@zeniv.linux.org.uk> writes:
> 
> > On Mon, Jan 04, 2021 at 12:16:56PM +0000, David Laight wrote:
> >> On x86 in_compat_syscall() is defined as:
> >>     in_ia32_syscall() || in_x32_syscall()
> >>
> >> Now in_ia32_syscall() is a simple check of the TS_COMPAT flag.
> >> However in_x32_syscall() is a horrid beast that has to indirect
> >> through to the original %eax value (ie the syscall number) and
> >> check for a bit there.
> >>
> >> So on a kernel with x32 support (probably most distro kernels)
> >> the in_compat_syscall() check is rather more expensive than
> >> one might expect.
> 
> I suggest you check the distro kernels.  I suspect they don't compile in
> support for x32.  As far as I can tell x32 is an undead beast of a
> subarchitecture that just enough people use that it can't be removed,
> but few enough people use that it likely has a few lurking scary bugs.

x32 support is enabled in both of the Ubuntu kernel configs I have to hand:
3.8.0-19_generic (Ubuntu 13.04) and 5.4.0-56_generic (probably 20.04).
That is probably why it is in my test builds (all I did was cut out
a lot of modules).

> >> It would be much better if both checks could be done together.
> >> I think this would require the syscall entry code to set a
> >> value in both the 64bit and x32 entry paths.
> >> (Can a process make both 64bit and x32 system calls?)
> >
> > Yes, it bloody well can.
> >
> > And I see no benefit in pushing that logics into syscall entry,
> > since anything that calls in_compat_syscall() more than once
> > per syscall execution is doing the wrong thing.  Moreover,
> > in quite a few cases we don't call the sucker at all, and for
> > all of those pushing that crap into syscall entry logics is
> > pure loss.
> 
> The x32 system calls have their own system call table and it would be
> trivial to set a flag like TS_COMPAT when looking up a system call from
> that table.  I expect such a change would be purely in the noise.

Certainly a write of 0/1/2 into an already-dirtied cache line of 'current'
could easily cost absolutely nothing, especially if 'current' has
already been read.
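
To make the comparison concrete, here is a minimal user-space sketch of
the two ways of answering "is this a compat syscall?". It is a model, not
kernel code: struct model_task, the syscall_mode field and the helper
names are hypothetical, and the two constants are only meant to mirror
the kernel's TS_COMPAT and x32 syscall-number bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values, intended to mirror TS_COMPAT and the x32 syscall bit. */
#define MODEL_TS_COMPAT        0x0002u
#define MODEL_X32_SYSCALL_BIT  0x40000000u

/* Hypothetical 0/1/2 encoding for a single per-task mode field. */
enum model_mode { MODEL_64 = 0, MODEL_IA32 = 1, MODEL_X32 = 2 };

struct model_regs {
    uint64_t orig_ax;            /* syscall number as seen at entry */
};

struct model_task {
    uint32_t status;             /* holds the ia32 flag */
    struct model_regs *regs;     /* reached through a pointer, as in the kernel */
    uint8_t syscall_mode;        /* hypothetical combined field */
};

/* Existing shape: one flag test plus an indirection to the saved registers. */
static bool compat_two_checks(const struct model_task *t)
{
    return (t->status & MODEL_TS_COMPAT) ||
           (t->regs->orig_ax & MODEL_X32_SYSCALL_BIT);
}

/* Proposed shape: a single load and compare on the task itself. */
static bool compat_one_check(const struct model_task *t)
{
    return t->syscall_mode != MODEL_64;
}

/* Model of syscall entry keeping both representations in step. */
static void model_syscall_entry(struct model_task *t, enum model_mode mode,
                                uint64_t nr)
{
    t->regs->orig_ax = (mode == MODEL_X32) ? (nr | MODEL_X32_SYSCALL_BIT) : nr;
    if (mode == MODEL_IA32)
        t->status |= MODEL_TS_COMPAT;
    else
        t->status &= ~MODEL_TS_COMPAT;
    t->syscall_mode = (uint8_t)mode;
}
```

Both predicates agree for all three entry modes; the difference is only
that the second never has to follow the pointer to the saved registers.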

I also wondered about resetting it to zero when an x32 system call
exits (rather than entry to a 64bit one).

For ia32 the flag is set (with |=) on every syscall entry,
even though I'm pretty sure it can only change during exec.
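
A rough sketch of the "reset on x32 exit" idea, again as a user-space
model with hypothetical names (struct mode_task, mode_writes and the
helpers below are not kernel symbols). The write counter exists only to
show which paths touch the field under each policy.

```c
#include <stdint.h>

/* Hypothetical per-task state; mode_writes only counts stores to 'mode'. */
struct mode_task {
    uint8_t  mode;          /* 0 = 64-bit, 2 = x32 */
    unsigned mode_writes;
};

/* Policy A: every syscall entry stores the mode, 64-bit entries included. */
static void entry_always_store(struct mode_task *t, uint8_t mode)
{
    t->mode = mode;
    t->mode_writes++;
}

/* Policy B: only the x32 path writes, setting on entry and clearing on exit. */
static void x32_entry(struct mode_task *t)
{
    t->mode = 2;
    t->mode_writes++;
}

static void x32_exit(struct mode_task *t)
{
    t->mode = 0;
    t->mode_writes++;
}

/* Policy B: the 64-bit entry path leaves the field alone. */
static void native_entry(struct mode_task *t)
{
    (void)t;
}
```

Under policy B a task that only ever makes 64-bit system calls never
writes the field at all; the cost moves entirely onto the x32 path.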

> > What's the point, really?
> 
> Before we came up with the current games with __copy_siginfo_to_user
> and x32_copy_siginfo_to_user I was wondering if we should make such
> a change.  The delivery of compat signal frames and core dumps which
> do not go through the system call entry path could almost benefit from
> a flag that could be set/tested when on those paths.

For signal delivery it should (probably) depend on the system call
that set up the signal handler.
Although I'm sure I remember one kernel where some of it was done
in libc (with a single entrypoint for all handlers).

> The fact that only SIGCHLD (which can not trigger a coredump) is
> different saves the coredump code from needing such a test.
> 
> The fact that the signal frame code is simple enough it can directly
> call x32_copy_siginfo_to_user or __copy_siginfo_to_user saves us there.
> 
> So I don't think we have any cases where we actually need a flag that
> is independent of the system call but we have come very close.

If a program can do both 64bit and x32 system calls you probably
need to generate a 64bit core dump if it has ever made a 64bit
system call??

> For people who want to optimize I suggest tracking down the handful of
> users of x32 and see if x32 can be made to just go away.

Unlikely, since Ubuntu seem to have had it enabled for years.

	David
