All of lore.kernel.org
 help / color / mirror / Atom feed
From: Segher Boessenkool <segher@kernel.crashing.org>
To: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Paul Mackerras <paulus@samba.org>,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] powerpc: Force inlining of csum_add()
Date: Wed, 12 May 2021 09:31:05 -0500	[thread overview]
Message-ID: <20210512143105.GW10366@gate.crashing.org> (raw)
In-Reply-To: <e996ef13-c25c-5e9c-edd2-444eded88802@csgroup.eu>

On Wed, May 12, 2021 at 02:56:56PM +0200, Christophe Leroy wrote:
> Le 11/05/2021 à 12:51, Segher Boessenkool a écrit :
> >Something seems to have decided this asm is more expensive than it is.
> >That isn't always avoidable -- the compiler cannot look inside asms --
> >but it seems it could be improved here.
> >
> >Do you have (or can make) a self-contained testcase?
> 
> I have not tried, and I fear it might be difficult, because on a kernel 
> build with dozens of calls to csum_add(), only ip6_tunnel.o exhibits such 
> an issue.

Yeah.  Sometimes you can force some of the decisions, but that usually
requires knowing too many GCC internals :-/

> >>And there is even one completely unused instance of csum_add().
> >
> >That is strange, that should never happen.
> 
> It seems that several .o include unused versions of csum_add. After the 
> final link, one remains (in addition to the used one) in vmlinux.

But it is a static function, so it should not end up in any object file
where it isn't used.

> >>In the non-inlined version, the first sum with 0 was performed.
> >>Here it is skipped.
> >
> >That is because of how __builtin_constant_p works, most likely.  As we
> >discussed elsewhere it is evaluated before all forms of loop unrolling.
> 
> But we are not talking about loop unrolling here, are we ?

Oh, right you are, but that doesn't change much.  The
_builtin_constant_p(len) is evaluated long before the compiler sees len
is a constant here.

> It seems that the reason here is that __builtin_constant_p() is evaluated 
> long after GCC decided to not inline that call to csum_add().

Yes, it seems we do not currently do even trivial inlining except very
early in the compiler.

Thanks,


Segher

WARNING: multiple messages have this Message-ID (diff)
From: Segher Boessenkool <segher@kernel.crashing.org>
To: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] powerpc: Force inlining of csum_add()
Date: Wed, 12 May 2021 09:31:05 -0500	[thread overview]
Message-ID: <20210512143105.GW10366@gate.crashing.org> (raw)
In-Reply-To: <e996ef13-c25c-5e9c-edd2-444eded88802@csgroup.eu>

On Wed, May 12, 2021 at 02:56:56PM +0200, Christophe Leroy wrote:
> Le 11/05/2021 à 12:51, Segher Boessenkool a écrit :
> >Something seems to have decided this asm is more expensive than it is.
> >That isn't always avoidable -- the compiler cannot look inside asms --
> >but it seems it could be improved here.
> >
> >Do you have (or can make) a self-contained testcase?
> 
> I have not tried, and I fear it might be difficult, because on a kernel 
> build with dozens of calls to csum_add(), only ip6_tunnel.o exhibits such 
> an issue.

Yeah.  Sometimes you can force some of the decisions, but that usually
requires knowing too many GCC internals :-/

> >>And there is even one completely unused instance of csum_add().
> >
> >That is strange, that should never happen.
> 
> It seems that several .o include unused versions of csum_add. After the 
> final link, one remains (in addition to the used one) in vmlinux.

But it is a static function, so it should not end up in any object file
where it isn't used.

> >>In the non-inlined version, the first sum with 0 was performed.
> >>Here it is skipped.
> >
> >That is because of how __builtin_constant_p works, most likely.  As we
> >discussed elsewhere it is evaluated before all forms of loop unrolling.
> 
> But we are not talking about loop unrolling here, are we ?

Oh, right you are, but that doesn't change much.  The
_builtin_constant_p(len) is evaluated long before the compiler sees len
is a constant here.

> It seems that the reason here is that __builtin_constant_p() is evaluated 
> long after GCC decided to not inline that call to csum_add().

Yes, it seems we do not currently do even trivial inlining except very
early in the compiler.

Thanks,


Segher

  reply	other threads:[~2021-05-12 14:34 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-11  6:08 [PATCH] powerpc: Force inlining of csum_add() Christophe Leroy
2021-05-11  6:08 ` Christophe Leroy
2021-05-11 10:51 ` Segher Boessenkool
2021-05-11 10:51   ` Segher Boessenkool
2021-05-12 12:56   ` Christophe Leroy
2021-05-12 12:56     ` Christophe Leroy
2021-05-12 14:31     ` Segher Boessenkool [this message]
2021-05-12 14:31       ` Segher Boessenkool
2021-05-12 14:43       ` Christophe Leroy
2021-05-12 14:43         ` Christophe Leroy
2021-05-12 18:21         ` Segher Boessenkool
2021-05-12 18:21           ` Segher Boessenkool
2021-06-18  3:51 ` Michael Ellerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210512143105.GW10366@gate.crashing.org \
    --to=segher@kernel.crashing.org \
    --cc=christophe.leroy@csgroup.eu \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.