Linux ARM-MSM sub-architecture
From: "Måns Rullgård" <mans@mansr.com>
To: Nicolas Pitre <nico@fluxnic.net>
Cc: Stephen Boyd <sboyd@codeaurora.org>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	Michal Marek <mmarek@suse.com>,
	linux-kbuild@vger.kernel.org,
	Russell King - ARM Linux <linux@arm.linux.org.uk>,
	Arnd Bergmann <arnd@arndb.de>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Subject: Re: [PATCH v2 2/2] ARM: Replace calls to __aeabi_{u}idiv with udiv/sdiv instructions
Date: Thu, 26 Nov 2015 00:07:48 +0000
Message-ID: <yw1x8u5lpqqz.fsf@unicorn.mansr.com>
In-Reply-To: <alpine.LFD.2.20.1511251733570.22569@knanqh.ubzr> (Nicolas Pitre's message of "Wed, 25 Nov 2015 18:09:13 -0500 (EST)")

Nicolas Pitre <nico@fluxnic.net> writes:

> On Wed, 25 Nov 2015, Stephen Boyd wrote:
>
>> The ARM compiler inserts calls to __aeabi_uidiv() and
>> __aeabi_idiv() when it needs to perform division on signed and
>> unsigned integers. If a processor has support for the udiv and
>> sdiv division instructions the calls to these support routines
>> can be replaced with those instructions. Now that recordmcount
>> records the locations of calls to these library functions in
>> two sections (one for udiv and one for sdiv), iterate over these
>> sections early at boot and patch the call sites with the
>> appropriate division instruction when we determine that the
>> processor supports the division instructions. Using the division
>> instructions should be faster and less power intensive than
>> running the support code.
>
> A few remarks:
>
> 1) The code assumes unconditional branches to __aeabi_idiv and 
>    __aeabi_uidiv. What if there are conditional branches? Also, tail 
>    call optimizations will generate a straight b opcode rather than a bl 
>    and patching those will obviously have catastrophic results.  I think 
>    you should validate the presence of a bl before patching over it.

I did a quick check on a compiled kernel I had at hand, and it contains
no conditional or tail calls to those functions.  Such branches should
obviously still be detected for correctness, but leaving them unpatched
is unlikely to matter for performance.
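
As an illustration of the validation Nicolas asks for, here is a minimal
sketch, assuming ARM (A32) encoding only; Thumb-2 kernels use a different
encoding and are not covered.  The helper names are hypothetical and are
not taken from the patch.  In A32, bits 31-28 hold the condition, bits
27-25 are 101 for b/bl, bit 24 is the link bit, and bits 23-0 are a
signed word offset relative to pc + 8.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true only for an unconditional "bl <label>".
 * 0xeb...... means cond = AL (0xe), bits 27-24 = 1011.
 * A plain "b" (tail call) is 0xea......, and a conditional bl such as
 * "blne" has a condition field other than 0xe, so both are rejected. */
static bool arm_is_unconditional_bl(uint32_t insn)
{
	return (insn & 0xff000000u) == 0xeb000000u;
}

/* Hypothetical helper: compute the target of an A32 b/bl located at
 * address pc, so the caller can confirm it really points at
 * __aeabi_idiv or __aeabi_uidiv before overwriting it. */
static uint32_t arm_branch_target(uint32_t pc, uint32_t insn)
{
	uint32_t imm24 = insn & 0x00ffffffu;
	/* Sign-extend the 24-bit field without relying on signed shifts. */
	int32_t words = (int32_t)imm24 - ((imm24 & 0x00800000u) ? 0x01000000 : 0);

	/* The offset is in words and is relative to pc + 8. */
	return pc + 8u + (uint32_t)(words * 4);
}
```

A patching loop would skip any recorded site for which
arm_is_unconditional_bl() is false, or whose computed target is not one
of the two division routines.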

However, there are more than half as many calls to __aeabi_{u}idivmod as
to the plain div functions: 129 vs 228 for signed and unsigned combined.
For best results, the bodies of these functions should also be patched
to use the hardware instructions.  Obviously the call sites for these
can't be patched, since a single bl can't be replaced by the two or more
instructions needed to produce both quotient and remainder.
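
A rough C model of what a divmod routine built on the hardware divide
would compute, offered as a sketch rather than as the code from any
patch.  The struct and function names are hypothetical.  The AEABI
routines return the quotient in r0 and the remainder in r1; the model
assumes a non-zero divisor and, for the signed case, does not handle
INT32_MIN / -1.

```c
#include <stdint.h>

/* Hypothetical result types: quotient (r0) and remainder (r1). */
struct uidivmod_result { uint32_t quot; uint32_t rem; };
struct idivmod_result  { int32_t  quot; int32_t  rem; };

/* Unsigned case.  A possible instruction sequence, shown for
 * illustration only:
 *   udiv r2, r0, r1        @ r2 = n / d
 *   mls  r1, r1, r2, r0    @ r1 = n - d * q
 *   mov  r0, r2            @ r0 = q
 *   bx   lr
 */
static struct uidivmod_result uidivmod_model(uint32_t n, uint32_t d)
{
	struct uidivmod_result r;

	r.quot = n / d;            /* caller guarantees d != 0 */
	r.rem  = n - r.quot * d;   /* multiply-subtract step */
	return r;
}

/* Signed case: sdiv truncates toward zero, as C division does, so the
 * remainder takes the sign of the dividend. */
static struct idivmod_result idivmod_model(int32_t n, int32_t d)
{
	struct idivmod_result r;

	r.quot = n / d;            /* caller guarantees d != 0 */
	r.rem  = n - r.quot * d;
	return r;
}
```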

> 2) For those cases where a call to __aeabi_uidiv and __aeabi_idiv is not 
>    patched due to (1), you could patch a udiv/sdiv plus "bx lr" 
>    at those function entry points too.
>
> 3) In fact I was wondering if the overhead of the branch and back is 
>    really significant compared to the non trivial cost of a idiv 
>    instruction and all the complex infrastructure required to patch 
>    those branches directly, and consequently if the performance 
>    difference is actually worth it versus simply doing (2) alone.

Depending on the operands, the div instruction can take as few as 3
cycles on a Cortex-A7.
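
For reference, remark (2) above amounts to overwriting the first two
words of each library function.  The sketch below builds those two words
for ARM (A32) mode only; the function names are hypothetical, and it
writes into an ordinary buffer rather than live kernel text, which would
additionally need the usual text-patching and cache maintenance steps.

```c
#include <stdbool.h>
#include <stdint.h>

/* A32 encoding of "bx lr". */
#define ARM_BX_LR 0xe12fff1eu

/* Hypothetical encoder for A32 "sdiv/udiv rd, rn, rm" with cond = AL.
 * Layout: cond | 0111 0U01 | Rd | 1111 | Rm | 0001 | Rn,
 * where U = 1 selects udiv and U = 0 selects sdiv. */
static uint32_t arm_encode_div(bool is_signed, unsigned rd, unsigned rn,
			       unsigned rm)
{
	uint32_t base = is_signed ? 0xe710f010u : 0xe730f010u;

	return base | ((rd & 0xfu) << 16) | ((rm & 0xfu) << 8) | (rn & 0xfu);
}

/* Hypothetical helper: fill a two-word buffer with
 * "{s,u}div r0, r0, r1; bx lr", the replacement body for
 * __aeabi_idiv or __aeabi_uidiv. */
static void build_div_entry(uint32_t entry[2], bool is_signed)
{
	entry[0] = arm_encode_div(is_signed, 0, 0, 1);
	entry[1] = ARM_BX_LR;
}
```

With r0 as both destination and dividend and r1 as divisor, the encoder
yields 0xe730f110 for udiv and 0xe710f110 for sdiv.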

-- 
Måns Rullgård
mans@mansr.com
