All of lore.kernel.org
 help / color / mirror / Atom feed
From: Fangrui Song <maskray@google.com>
To: Arvind Sankar <nivedita@alum.mit.edu>
Cc: Nick Desaulniers <ndesaulniers@google.com>,
	jpoimboe@redhat.com, peterz@infradead.org,
	clang-built-linux@googlegroups.com,
	Nathan Chancellor <natechancellor@gmail.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] objtool: ignore .L prefixed local symbols
Date: Fri, 14 Feb 2020 14:20:46 -0800	[thread overview]
Message-ID: <20200214222046.bkafub6dbtapgter@google.com> (raw)
In-Reply-To: <20200214204249.GA3624438@rani.riverdale.lan>

On 2020-02-14, Arvind Sankar wrote:
>On Fri, Feb 14, 2020 at 10:05:27AM -0800, Fangrui Song wrote:
>> I know little about objtool, but if it may be used by other
>> architectures, hope the following explanations don't appear to be too
>> off-topic:)
>>
>> On 2020-02-14, Arvind Sankar wrote:
>> >Can you describe what case the clang change is supposed to optimize?
>> >AFAICT, it kicks in when the symbol is known by the compiler to be local
>> >to the DSO and defined in the same translation unit.
>> >
>> >But then there are two cases:
>> >(a) we have call foo, where foo is defined in the same section as the
>> >call instruction. In this case the assembler should be able to fully
>> >resolve foo and not generate any relocation, regardless of whether foo
>> >is global or local.
>>
>> If foo is STB_GLOBAL or STB_WEAK, the assembler cannot fully resolve a
>> reference to foo in the same section, unless the assembler can assume
>> (the codegen tells it) the call to foo cannot be interposed by another
>> foo definition at runtime.
>
>I was testing with hidden/protected visibility, I see you want this for
>the no-semantic-interposition case. Actually a bit more testing shows
>some peculiarities even with hidden visibility. With the below, the call
>and lea create relocations in the object file, but the jmp doesn't. ld
>does avoid creating a plt for this though.
>
>	.text
>	.globl foo, bar
>	.hidden foo
>	bar:
>		call	foo
>		leaq	foo(%rip), %rax
>		jmp	foo
>
>	foo:	ret

Yes, GNU as is inconsistent here.  While fixing
https://sourceware.org/ml/binutils/2020-02/msg00243.html , I noticed
that the rule is quite complex. There are definitely lots of places to
improve.  clang 10 emits relocations consistently.

   call	foo              # R_X86_64_PLT32
   leaq	foo(%rip), %rax  # R_X86_64_PC32
   jmp	foo              # R_X86_64_PLT32

We can teach the assembler to not emit relocations referencing STV_HIDDEN or
STV_INTERNAL symbols, but I favor the simpler rule that only relocations
referencing STB_LOCAL non-STT_GNU_IFUNC symbols defined in the same section are resolved.
Leave the visibility jobs to the linker.

If we ever teach GNU objcopy or llvm-objcopt an option to set
visibility, resolving relocations may disallow such use cases.

Unfortunately gcc>=5 x86 and GNU ld>=2.26 x86 are in a bad status
regarding STV_PROTECTED (https://reviews.llvm.org/D72197#1866384).
(Now I retest it, I think I may add a special -no-integrated-as rule to
clang just to work around GNU ld x86>=2.26.)

>> >(b) we have call foo, where foo is defined in a different section from
>> >the call instruction. In this case the assembler must generate a
>> >relocation regardless of whether foo is global or local, and the linker
>> >should eliminate it.
>> >In what case does does replacing call foo with call .Lfoo$local help?
>>
>> For -fPIC -fno-semantic-interposition, the assembly emitter can perform
>> the following optimization:
>>
>>    void foo() {}
>>    void bar() { foo(); }
>>
>>    .globl foo, bar
>>    foo:
>>    .Lfoo$local:
>>      ret
>>    bar:
>>      call foo  --> call .Lfoo$local
>>      ret
>>
>> call foo generates an R_X86_64_PLT32. In a -shared link, it creates an
>> unneeded PLT entry for foo.
>>
>> call .Lfoo$local generates an R_X86_64_PLT32. In a -shared link, .Lfoo$local is
>> non-preemptible => no PLT entry is created.
>>
>> For -fno-PIC and -fPIE, the final link is expected to be -no-pie or
>> -pie. This optimization does not save anything, because PLT entries will
>> not be generated. With clang's integrated assembler, it may increase the
>> number of STT_SECTION symbols (because .Lfoo$local will be turned to a
>> STT_SECTION relative relocation), but the size increase is very small.
>>
>>
>> I want to teach clang -fPIC to use -fno-semantic-interposition by
>> default. (It is currently an LLVM optimization, not realized in clang.)
>> clang traditionally makes various -fno-semantic-interposition
>> assumptions and can perform interprocedural optimizations even if the
>> strict ELF rule disallows them.
>
>FWIW, gcc with no-semantic-interposition also uses local aliases, but
>rather than using .L labels, it creates a local alias by
>	.set foo.localalias, foo
>This makes the type of foo.localalias the same as foo, which I gather
>should placate objtool as it'll still see an STT_FUNC no matter whether
>it picks up foo.localalias or foo.

The GCC approach costs more bytes. foo.localalias is not prefixed by .L,
thus it wastes sizeof(Elf*_Sym) bytes for each such function.

      5: 0000000000401000     7 FUNC    LOCAL  DEFAULT    1 foo.localalias

Call/jump relocations on ARM and MIPS treat STT_FUNC differently.
If eventually we use the clang optimization for ARM and MIPS, we
probably should consider changing `.Lfoo$local:` to `.set .Lfoo$local, foo`
The assembler is quite complex. I need to investigate more into LLVM MC.

R_ARM_CALL/R_ARM_THM_CALL can be used against STT_NOTYPE symbols.
That disables interwork thunks (https://reviews.llvm.org/D73542).
If objtool is used by ARM and such disabling semantic is ever needed,
the rule should be loosened to allow STT_NOTYPE.

  reply	other threads:[~2020-02-14 22:20 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-13 18:47 [PATCH] objtool: ignore .L prefixed local symbols Nick Desaulniers
2020-02-13 19:20 ` Fangrui Song
2020-02-14  6:16   ` Arvind Sankar
2020-02-14 18:05     ` Fangrui Song
2020-02-14 20:42       ` Arvind Sankar
2020-02-14 22:20         ` Fangrui Song [this message]
2020-02-15  0:05           ` Arvind Sankar
2020-02-15  0:34             ` Arvind Sankar
2020-02-13 22:17 ` Josh Poimboeuf
2020-02-13 22:37   ` Fangrui Song
2020-02-13 23:16     ` Josh Poimboeuf
2020-02-13 22:55   ` Nick Desaulniers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200214222046.bkafub6dbtapgter@google.com \
    --to=maskray@google.com \
    --cc=clang-built-linux@googlegroups.com \
    --cc=jpoimboe@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=natechancellor@gmail.com \
    --cc=ndesaulniers@google.com \
    --cc=nivedita@alum.mit.edu \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.