linux-arch.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yury Norov <ynorov@caviumnetworks.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Miller <davem@davemloft.net>,
	arnd@arndb.de, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	libc-alpha@sourceware.org, schwidefsky@de.ibm.com,
	heiko.carstens@de.ibm.com, pinskia@gmail.com, broonie@kernel.org,
	joseph@codesourcery.com,
	christoph.muellner@theobroma-systems.com,
	bamvor.zhangjian@huawei.com, szabolcs.nagy@arm.com,
	klimov.linux@gmail.com, Nathan_Lynch@mentor.com, agraf@suse.de,
	Prasun.Kapoor@caviumnetworks.com, kilobyte@angband.pl,
	geert@linux-m68k.org, philipp.tomsich@theobroma-systems.com
Subject: Re: [PATCH 01/23] all: syscall wrappers: add documentation
Date: Wed, 15 Jun 2016 02:08:38 +0300	[thread overview]
Message-ID: <20160614230838.GA15130@yury-N73SV> (raw)
In-Reply-To: <20160526222943.GA16729@MBP.local>

Hi Catalin, David, all

> COMPAT_SYSCALL_WRAP2(creat, ...):
> 	mov	w0, w0
> 	b	<sys_creat>
> 
> > > Cost wise, this seems like it all cancels out in the end, but what
> > > do I know?
> > 
> > I think you know something, and I also think Heiko and other s390 guys
> > know something as well. So I'd like to listen their arguments here.
> > 
> > For me spark64 way is looking reasonable only because it's really simple
> > and takes less coding. I'll try it on some branch and share here what happened.
> 
> The kernel code will definitely look simpler ;). It would be good to see
> if there actually is any performance impact. Even with 16 more cycles on
> syscall entry, would they be lost in the noise? You don't need a full
> implementation, just some dummy mov x0, x0 on the entry path.
> 
> -- 
> Catalin

I wrote a simple test:

        struct timeval start, end;
        unsigned long long ut;

        int main()
        {
                gettimeofday(&start, NULL);

                for (int i = 1000000; i; i--)
                        syscall(__NR_getrusage, 100 /* EINVAL */, NULL);

                gettimeofday(&end, NULL);
                
                ut = (end.tv_sec - start.tv_sec) * 1000000ULL
                        + end.tv_usec - start.tv_usec;

                printf("%lld\n", ut);

                exit(EXIT_SUCCESS);
        }

In kernel there's minimal overhead:
diff --git a/kernel/sys.c b/kernel/sys.c
index 89d5be4..003d5ad 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1634,6 +1634,17 @@ COMPAT_SYSCALL_DEFINE2(getrusage, int, who,
struct compat_rusage __user *, ru)
{
        struct rusage r;
         
+       asm volatile (
+       "       mov w0, w0      \n"
+       "       mov w1, w1      \n"
+       "       mov w2, w2      \n"
+       "       mov w3, w3      \n"
+       "       mov w4, w4      \n"
+       "       mov w5, w5      \n"
+       "       mov w6, w6      \n"
+       "       mov w7, w7      \n"
+       );
+
        if (who != RUSAGE_SELF && who != RUSAGE_CHILDREN &&
            who != RUSAGE_THREAD)
                return -EINVAL;

On QEMU:
With MOVs:      W/O MOVs:
832015          814564
840639          803165
830482          813116
832895          802928
832083          832658
834461          802993
829405          812465
846677          822651
828409          803393
836845          821470
828716          801044
831620          821301
825423          800278
829946          821476

We have 83 mS vs 81 mS, ~2.6% of performance degradation.
And I can show bigger numbers if I'll use asm svc instead of
syscall() wrapper which increases time as well. 

It's definitely more than 0, but not so big anyway. For syscalls
with heavy payload it will be non-measurable. So the choice
is still there. Should we use wrappers and save 2.5% of syscall
performance. Or clear top-halves unconditionally and win in simplicity?

If QEMU is looking non-representative, I can run test on real
hardware, but it takes a time, and I think will end up with similar
results.

Latest kernel with wrappers and library are here:
https://github.com/norov/linux/commits/ilp32
https://github.com/norov/glibc/commits/ilp32-dev

BTW, notice the change in ABI: syscalls that take stat and statfs
structures now routed to (wrapped) native handlers, after switching
userspace to use 64-bit off_t, ino_t, blkcnt_t, fsblkcnt_t and
fsfilcnt_t types.

Yury.

  parent reply	other threads:[~2016-06-14 23:08 UTC|newest]

Thread overview: 125+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-24  0:04 [PATCH v6 00/21] ILP32 for ARM64 Yury Norov
2016-05-24  0:04 ` [PATCH 01/23] all: syscall wrappers: add documentation Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-25 19:30   ` David Miller
2016-05-25 19:30     ` David Miller
2016-05-25 20:03     ` Yury Norov
2016-05-25 20:03       ` Yury Norov
2016-05-25 20:21       ` David Miller
2016-05-25 20:47         ` Arnd Bergmann
2016-05-25 20:50           ` David Miller
2016-05-25 21:01             ` Arnd Bergmann
2016-05-25 21:28               ` David Miller
2016-05-25 21:28                 ` David Miller
2016-05-26 14:20                 ` Catalin Marinas
2016-05-26 14:20                   ` Catalin Marinas
2016-05-26 14:50                   ` Szabolcs Nagy
2016-05-26 14:50                     ` Szabolcs Nagy
2016-05-26 15:19                     ` Catalin Marinas
2016-05-26 15:19                       ` Catalin Marinas
2016-05-26 19:43                   ` David Miller
2016-05-27 10:10                     ` Catalin Marinas
2016-05-26 20:48                 ` Yury Norov
2016-05-26 22:29                   ` Catalin Marinas
2016-05-27  0:37                     ` Yury Norov
2016-05-27  6:03                       ` Heiko Carstens
2016-05-27  6:03                         ` Heiko Carstens
2016-05-27  8:42                         ` Arnd Bergmann
2016-05-27  8:42                           ` Arnd Bergmann
2016-05-27  9:30                           ` Catalin Marinas
2016-05-27 10:49                             ` Arnd Bergmann
2016-05-27 13:04                               ` Catalin Marinas
2016-05-27 13:04                                 ` Catalin Marinas
2016-05-27 16:58                                 ` Yury Norov
2016-05-27 16:58                                   ` Yury Norov
2016-05-27 17:36                                   ` Catalin Marinas
2016-05-27  9:01                         ` Catalin Marinas
2016-06-14 23:08                     ` Yury Norov [this message]
2016-06-14 23:08                       ` Yury Norov
2016-05-27  5:52     ` Heiko Carstens
2016-05-24  0:04 ` [PATCH 02/23] all: introduce COMPAT_WRAPPER option and enable it for s390 Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-24  0:04 ` [PATCH 03/23] all: s390: move wrapper infrastructure to generic headers Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-24  0:04 ` [PATCH 04/23] all: s390: move compat_wrappers.c from arch/s390/kernel to kernel/ Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-24  0:04 ` [PATCH 05/23] all: wrap needed syscalls in generic unistd Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-24  0:04 ` [PATCH 06/23] compat ABI: use non-compat openat and open_by_handle_at variants Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-24  0:04 ` [PATCH 07/23] 32-bit ABI: introduce ARCH_32BIT_OFF_T config option Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-24  0:04 ` [PATCH 08/23] arm64: ilp32: add documentation on the ILP32 ABI for ARM64 Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-24  0:04 ` [PATCH 09/23] arm64: ensure the kernel is compiled for LP64 Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-24  0:04 ` [PATCH 10/23] arm64: rename COMPAT to AARCH32_EL0 in Kconfig Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-24  0:04 ` [PATCH 11/23] arm64:uapi: set __BITS_PER_LONG correctly for ILP32 and LP64 Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-24  0:04 ` [PATCH 12/23] thread: move thread bits accessors to separated file Yury Norov
2016-05-24  0:04 ` [PATCH 13/23] arm64: introduce is_a32_task and is_a32_thread (for AArch32 compat) Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-06-12 12:21   ` Zhangjian (Bamvor)
2016-06-12 12:21     ` Zhangjian (Bamvor)
2016-06-12 13:08     ` Zhangjian (Bamvor)
2016-06-12 13:08       ` Zhangjian (Bamvor)
2016-06-12 17:56       ` Yury Norov
2016-06-12 17:56         ` Yury Norov
2016-05-24  0:04 ` [PATCH 14/23] arm64: ilp32: add is_ilp32_compat_{task,thread} and TIF_32BIT_AARCH64 Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-24  0:04 ` [PATCH 15/23] arm64: introduce binfmt_elf32.c Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-24  0:04 ` [PATCH 16/23] arm64: ilp32: introduce binfmt_ilp32.c Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-26 13:49   ` Zhangjian (Bamvor)
2016-05-26 13:49     ` Zhangjian (Bamvor)
2016-05-26 21:08     ` Yury Norov
2016-05-26 21:08       ` Yury Norov
2016-06-15  0:40     ` Yury Norov
2016-06-15  0:40       ` Yury Norov
2016-06-13  3:05   ` Zhangjian (Bamvor)
2016-06-13  3:05     ` Zhangjian (Bamvor)
2016-06-13 13:22     ` Zhangjian (Bamvor)
2016-05-24  0:04 ` [PATCH 17/23] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32 Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-06-08  1:34   ` zhouchengming
2016-06-08  1:34     ` zhouchengming
2016-06-08 17:00     ` Yury Norov
2016-06-08 17:00       ` Yury Norov
2016-06-25  9:36       ` zhouchengming
2016-06-25  9:36         ` zhouchengming
2016-06-25 14:15         ` Bamvor Zhang
2016-06-27  2:09           ` zhouchengming
2016-06-27  2:09             ` zhouchengming
2016-05-24  0:04 ` [PATCH 18/23] arm64: ilp32: add sys_ilp32.c and a separate table (in entry.S) to use it Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-25 20:26   ` Arnd Bergmann
2016-05-24  0:04 ` [PATCH 19/23] arm64: signal: share lp64 signal routines to ilp32 Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-24  0:04 ` [PATCH 20/23] arm64: signal32: move ilp32 and aarch32 common code to separated file Yury Norov
2016-05-24  0:04 ` [PATCH 21/23] arm64: ilp32: introduce ilp32-specific handlers for sigframe and ucontext Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-06-04 11:34   ` Zhangjian (Bamvor)
2016-06-04 11:34     ` Zhangjian (Bamvor)
2016-06-12 12:34     ` Zhangjian (Bamvor)
2016-06-12 12:34       ` Zhangjian (Bamvor)
2016-06-12 13:12     ` Zhangjian (Bamvor)
2016-06-12 13:12       ` Zhangjian (Bamvor)
2016-06-12 17:44     ` Yury Norov
2016-06-12 17:44       ` Yury Norov
2016-06-16 11:21       ` Zhangjian (Bamvor)
2016-06-16 11:21         ` Zhangjian (Bamvor)
2016-06-12 12:39   ` Zhangjian (Bamvor)
2016-06-12 12:39     ` Zhangjian (Bamvor)
2016-05-24  0:04 ` [PATCH 22/23] arm64:ilp32: add vdso-ilp32 and use for signal return Yury Norov
2016-05-24  0:04   ` Yury Norov
2016-05-24  0:04 ` [PATCH 23/23] arm64:ilp32: add ARM64_ILP32 to Kconfig Yury Norov
2016-05-25 10:42 ` [PATCH v6 00/21] ILP32 for ARM64 Szabolcs Nagy
2016-05-25 10:42   ` Szabolcs Nagy
2016-05-25 16:41   ` Yury Norov
2016-05-25 16:41     ` Yury Norov
2016-06-02 19:03 ` Yury Norov
2016-06-02 19:03   ` Yury Norov
2016-06-03 11:02   ` Szabolcs Nagy
2016-06-03 11:02     ` Szabolcs Nagy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160614230838.GA15130@yury-N73SV \
    --to=ynorov@caviumnetworks.com \
    --cc=Nathan_Lynch@mentor.com \
    --cc=Prasun.Kapoor@caviumnetworks.com \
    --cc=agraf@suse.de \
    --cc=arnd@arndb.de \
    --cc=bamvor.zhangjian@huawei.com \
    --cc=broonie@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=christoph.muellner@theobroma-systems.com \
    --cc=davem@davemloft.net \
    --cc=geert@linux-m68k.org \
    --cc=heiko.carstens@de.ibm.com \
    --cc=joseph@codesourcery.com \
    --cc=kilobyte@angband.pl \
    --cc=klimov.linux@gmail.com \
    --cc=libc-alpha@sourceware.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=philipp.tomsich@theobroma-systems.com \
    --cc=pinskia@gmail.com \
    --cc=schwidefsky@de.ibm.com \
    --cc=szabolcs.nagy@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).