Generic Linux architectural discussions
 help / color / mirror / Atom feed
From: Nathan Chancellor <nathan@kernel.org>
To: Palmer Dabbelt <palmer@dabbelt.com>
Cc: bjorn@kernel.org, ndesaulniers@google.com,
	Conor Dooley <conor@kernel.org>,
	jszhang@kernel.org, llvm@lists.linux.dev,
	Paul Walmsley <paul.walmsley@sifive.com>,
	aou@eecs.berkeley.edu, Arnd Bergmann <arnd@arndb.de>,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	linux-arch@vger.kernel.org
Subject: Re: [PATCH v2 0/4] riscv: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION
Date: Thu, 22 Jun 2023 23:18:03 +0000	[thread overview]
Message-ID: <20230622231803.GA1790165@dev-arch.thelio-3990X> (raw)
In-Reply-To: <mhng-6c34765c-126d-4e6c-8904-e002d49a4336@palmer-ri-x1c9a>

On Thu, Jun 22, 2023 at 03:16:51PM -0700, Palmer Dabbelt wrote:
> On Thu, 22 Jun 2023 14:53:27 PDT (-0700), nathan@kernel.org wrote:
> > On Wed, Jun 21, 2023 at 11:19:31AM -0700, Palmer Dabbelt wrote:
> > > On Wed, 21 Jun 2023 10:51:15 PDT (-0700), bjorn@kernel.org wrote:
> > > > Conor Dooley <conor@kernel.org> writes:
> > > >
> > > > [...]
> > > >
> > > > > > So I'm no longer actually sure there's a hang, just something
> > > > > > slow.  That's even more of a grey area, but I think it's sane to
> > > > > > call a 1-hour link time a regression -- unless it's expected
> > > > > > that this is just very slow to link?
> > > > >
> > > > > I dunno, if it was only a thing for allyesconfig, then whatever - but
> > > > > it's gonna significantly increase build times for any large kernels if LLD
> > > > > is this much slower than LD. Regression in my book.
> > > > >
> > > > > I'm gonna go and experiment with mixed toolchain builds, I'll report
> > > > > back..
> > > >
> > > > I took palmer/for-next (1bd2963b2175 ("Merge patch series "riscv: enable
> > > > HAVE_LD_DEAD_CODE_DATA_ELIMINATION"")) for a tuxmake build with llvm-16:
> > > >
> > > >   | ~/src/tuxmake/run -v --wrapper ccache --target-arch riscv \
> > > >   |     --toolchain=llvm-16 --runtime docker --directory . -k \
> > > >   |     allyesconfig
> > > >
> > > > Took forever, but passed after 2.5h.
> > > 
> > > Thanks.  I just re-ran mine 17/trunk LLD under time (rather that just
> > > checking top sometimes), it's at 1.5h but even that seems quite long.
> > > 
> > > I guess this is sort of up to the LLVM folks: if it's expected that DCE
> > > takes a very long time to link then I'm not opposed to allowing it, but if
> > > this is probably a bug in LLD then it seems best to turn it off until we
> > > sort things out over there.
> > > 
> > > I think maybe Nick or Nathan is the best bet to know?
> > 
> > I can confirm a regression with allyesconfig but not allmodconfig using
> > LLVM 16.0.6 on my 80-core Ampere Altra system.
> > 
> > allmodconfig: 8m 4s
> > allmodconfig + CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=n: 7m 4s
> > allyesconfig: 1h 58m 30s
> > allyesconfig + CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=n: 12m 41s
> 
> Are those backwards?  I'm getting super slow builds after merging the patch
> set, not before -- though apologize in advance if I'm reading it wrong, I'm
> well on my way to falling asleep already ;)

I know I already responded to you around this on IRC but I will do it
here too for the benefit of others following this thread.

These numbers are from the patchset applied on top of dad9774deaf1
("Merge tag 'timers-urgent-2023-06-21' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip"); in other words,
allmodconfig and allyesconfig have CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y
so turning it off is basically like building allmodconfig and
allyesconfig before the patchset was applied.

> > I am sure there is something that ld.lld can do better, given GNU ld
> > does not have any problems as earlier established, so that should
> > definitely be explored further. I see Nick already had a response about
> > writing up a report (I wrote most of this before that email so I am
> > still sending this one).
> > 
> > However, allyesconfig is pretty special and not really indicative of a
> > "real world" kernel build in my opinion (which will either be a fully
> > modular kernel to allow use on a wide range of hardware or a monolithic
> > kernel with just the drivers needed for a specific platform, which will
> > be much smaller than allyesconfig); it has given us problems with large
> > kernels before on other architectures.
> 
> I totally agree that allyesconfig is an oddity, but it's something that does
> get regularly build tested so a big build time hit there is going to cause
> trouble -- maybe not for users, but it'll be a problem for maintainers and
> that's way more likely to get me yelled at ;)

Agreed. That comment was more around justification for opting out of
CONFIG_LD_DEAD_CODE_DATA_ELIMINATION with these configurations, since
CONFIG_COMPILE_TEST has effective become "am I allmodconfig or
allyesconfig?" nowadays.

> > CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is already marked with 'depends on
> > EXPERT' and its help text mentions its perils, so it does not seem
> > unreasonable to me to add an additional dependency on !COMPILE_TEST so
> > that allmodconfig and allyesconfig cannot flip this on, something like
> > the following perhaps?
> > 
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 32c24950c4ce..25434cbd2a6e 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -1388,7 +1388,7 @@ config HAVE_LD_DEAD_CODE_DATA_ELIMINATION
> >  config LD_DEAD_CODE_DATA_ELIMINATION
> >  	bool "Dead code and data elimination (EXPERIMENTAL)"
> >  	depends on HAVE_LD_DEAD_CODE_DATA_ELIMINATION
> > -	depends on EXPERT
> > +	depends on EXPERT && !COMPILE_TEST
> >  	depends on $(cc-option,-ffunction-sections -fdata-sections)
> >  	depends on $(ld-option,--gc-sections)
> >  	help
> > 
> > If applying that dependency to all architectures is too much, the
> > selection in arch/riscv/Kconfig could be gated on the same condition.
> 
> Is the regression for all ports, or just RISC-V?  I'm fine gating this with
> some sort of Kconfig flag, if it's just impacting RISC-V then it seems sane
> to keep it over here.

I am not sure. Only mips selects HAVE_LD_DEAD_CODE_DATA_ELIMINATION
unconditionally and we don't test ARCH=mips all{mod,yes}config (not sure
why off the top of my head). powerpc selects it when using objtool for
mcount generation, which only happens for ppc32 (which we don't test
heavily or with large kernels) or using '-mprofile-kernel', which clang
does not support.

If you wanted to restrict it to just LD_IS_BFD in arch/riscv/Kconfig,
that would be fine with me too.

  select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if LD_IS_BFD

Nick said he would work on a report for the LLVM side, so as long as
this issue is handled in some way to avoid regressing LLD builds until
it is resolved, I don't think there is anything else for the kernel to
do. We like to have breadcrumbs via issue links, not sure if the report
will be internal to Google or on LLVM's issue tracker though;
regardless, we will have to touch this block to add a version check
later, at which point we can add a link to the fix in LLD.

Cheers,
Nathan

  reply	other threads:[~2023-06-22 23:19 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-23 16:54 [PATCH v2 0/4] riscv: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION Jisheng Zhang
2023-05-23 16:54 ` [PATCH v2 1/4] riscv: move options to keep entries sorted Jisheng Zhang
2023-05-25 14:01   ` Conor Dooley
2023-06-01  4:48   ` Guo Ren
2023-05-23 16:55 ` [PATCH v2 2/4] riscv: vmlinux-xip.lds.S: remove .alternative section Jisheng Zhang
2023-06-01  4:43   ` Guo Ren
2023-05-23 16:55 ` [PATCH v2 3/4] vmlinux.lds.h: use correct .init.data.* section name Jisheng Zhang
2023-05-24 11:02   ` Kefeng Wang
2023-05-23 16:55 ` [PATCH v2 4/4] riscv: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION Jisheng Zhang
2023-05-24 11:04   ` Kefeng Wang
2023-06-14 14:49 ` [PATCH v2 0/4] " Palmer Dabbelt
2023-06-14 16:25   ` Jisheng Zhang
2023-06-15 13:54     ` Palmer Dabbelt
2023-06-19 22:06       ` Palmer Dabbelt
2023-06-20 20:05         ` Nick Desaulniers
2023-06-20 20:13           ` Conor Dooley
2023-06-20 20:32             ` Nick Desaulniers
2023-06-20 20:41               ` Palmer Dabbelt
2023-06-20 20:47                 ` Nick Desaulniers
2023-06-20 21:08                   ` Palmer Dabbelt
2023-06-21  0:13                     ` Palmer Dabbelt
2023-06-21 14:53                       ` Palmer Dabbelt
2023-06-21 16:42                         ` Conor Dooley
2023-06-21 17:23                           ` Conor Dooley
2023-06-21 17:51                           ` Björn Töpel
2023-06-21 18:19                             ` Palmer Dabbelt
2023-06-21 19:46                               ` Palmer Dabbelt
2023-06-22 21:40                                 ` Nick Desaulniers
2023-06-22 21:42                                   ` Palmer Dabbelt
2023-06-22 21:53                               ` Nathan Chancellor
2023-06-22 22:16                                 ` Palmer Dabbelt
2023-06-22 23:18                                   ` Nathan Chancellor [this message]
2023-06-23 17:17                                     ` Nick Desaulniers
2023-06-25 12:24                                       ` Jisheng Zhang
2023-06-25 12:43                                         ` Conor Dooley
2023-06-25 20:06                                           ` Palmer Dabbelt
2023-06-25 20:05                                             ` Palmer Dabbelt
2023-07-22  0:37                                             ` Fangrui Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230622231803.GA1790165@dev-arch.thelio-3990X \
    --to=nathan@kernel.org \
    --cc=aou@eecs.berkeley.edu \
    --cc=arnd@arndb.de \
    --cc=bjorn@kernel.org \
    --cc=conor@kernel.org \
    --cc=jszhang@kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=llvm@lists.linux.dev \
    --cc=ndesaulniers@google.com \
    --cc=palmer@dabbelt.com \
    --cc=paul.walmsley@sifive.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox