From: Denys Vlasenko <dvlasenk@redhat.com>
To: "H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
Borislav Petkov <bp@alien8.de>,
Andy Lutomirski <luto@amacapital.net>,
Frederic Weisbecker <fweisbec@gmail.com>,
Alexei Starovoitov <ast@plumgrid.com>,
Will Drewry <wad@chromium.org>, Kees Cook <keescook@chromium.org>,
x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86: Deinline cpuid_eax and friends
Date: Thu, 07 May 2015 10:57:49 +0200
Message-ID: <554B290D.6000005@redhat.com>
In-Reply-To: <554A7C8B.4030007@zytor.com>

On 05/06/2015 10:41 PM, H. Peter Anvin wrote:
> On 05/06/2015 12:09 PM, Denys Vlasenko wrote:
>>>
>>> How on Earth does it make 44 bytes? Is this due to paravirt_fail?
>>
>> No, just this construct
>>
>> unsigned int eax, ebx, ecx, edx;
>> cpuid(op, &eax, &ebx, &ecx, &edx);
>>
>> is not really that cheap to set up. You need to allocate
>> variables on stack and take address of each:
>>
>> ffffffff81063668 <cpuid_eax>:
>> ffffffff81063668:  55                     push   %rbp
>> ffffffff81063669:  48 89 e5               mov    %rsp,%rbp
>> ffffffff8106366c:  48 83 ec 10            sub    $0x10,%rsp
>> ffffffff81063670:  48 8d 4d fc            lea    -0x4(%rbp),%rcx
>> ffffffff81063674:  89 7d f0               mov    %edi,-0x10(%rbp)
>> ffffffff81063677:  48 8d 55 f8            lea    -0x8(%rbp),%rdx
>> ffffffff8106367b:  48 8d 75 f4            lea    -0xc(%rbp),%rsi
>> ffffffff8106367f:  48 8d 7d f0            lea    -0x10(%rbp),%rdi
>> ffffffff81063683:  c7 45 f8 00 00 00 00   movl   $0x0,-0x8(%rbp)
>> ffffffff8106368a:  e8 3c ff ff ff         callq  ffffffff810635cb <__cpuid>
>> ffffffff8106368f:  8b 45 f0               mov    -0x10(%rbp),%eax
>> ffffffff81063692:  c9                     leaveq
>> ffffffff81063693:  c3                     retq
>>
>
> That almost certainly is due to paravirt_fail, because otherwise cpuid
> would be inline, and gcc actually knows how to optimize around the cpuid
> instruction to the point of eliminating the temporaries.

Yes, with HYPERVISOR_GUEST off, cpuid_eax() is smaller (14 bytes instead of 44):

ffffffff81055a66 <cpuid_eax>:
ffffffff81055a66:  55                     push   %rbp
ffffffff81055a67:  89 f8                  mov    %edi,%eax
ffffffff81055a69:  31 c9                  xor    %ecx,%ecx
ffffffff81055a6b:  48 89 e5               mov    %rsp,%rbp
ffffffff81055a6e:  53                     push   %rbx
ffffffff81055a6f:  0f a2                  cpuid
ffffffff81055a71:  5b                     pop    %rbx
ffffffff81055a72:  5d                     pop    %rbp
ffffffff81055a73:  c3                     retq
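
For anyone following along outside the kernel tree, here is a minimal userspace sketch of the construct under discussion. The sketch_* names are made up for illustration and are not kernel API; the asm constraints are modeled on the kernel's native_cpuid() as I remember it, so treat that as an assumption. When sketch_cpuid() is inlined, the compiler can keep eax/ecx in registers, which is what the short listing above reflects. When the call goes out of line through pointers, the four temporaries have to live on the stack.

```c
/* Userspace sketch only; sketch_* names are hypothetical, not kernel API. */
#if defined(__x86_64__) || defined(__i386__)
static inline void sketch_cpuid(unsigned int *eax, unsigned int *ebx,
                                unsigned int *ecx, unsigned int *edx)
{
    /* CPUID takes the leaf in EAX and the subleaf in ECX,
     * and writes its results to EAX, EBX, ECX and EDX. */
    __asm__ volatile("cpuid"
                     : "=a"(*eax), "=b"(*ebx), "=c"(*ecx), "=d"(*edx)
                     : "0"(*eax), "2"(*ecx)
                     : "memory");
}
#else
/* Stub so the sketch still compiles on non-x86 targets: reports all zeros. */
static inline void sketch_cpuid(unsigned int *eax, unsigned int *ebx,
                                unsigned int *ecx, unsigned int *edx)
{
    *eax = *ebx = *ecx = *edx = 0;
}
#endif

/* Same shape as the construct quoted earlier: four temporaries,
 * addresses passed down, only EAX returned to the caller. */
static inline unsigned int sketch_cpuid_eax(unsigned int op)
{
    unsigned int eax = op, ebx, ecx = 0, edx;

    sketch_cpuid(&eax, &ebx, &ecx, &edx);
    return eax;
}
```

On x86, sketch_cpuid_eax(0) returns the highest supported basic CPUID leaf. Comparing the disassembly of this function with sketch_cpuid() marked noinline should show the same kind of difference as the two listings in this thread, although the exact byte counts depend on compiler and flags.
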

However, even this version is not small enough for deinlining to make vmlinux
grow. With the patch applied, text still shrinks by 21 bytes:

     text     data      bss       dec     hex filename
 81746530 13978160 20066304 115790994 6e6d492 vmlinux.before
 81746509 13978160 20066304 115790973 6e6d47d vmlinux
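
The byte counts can be double-checked from the listings and the size output in this thread. A small sketch of that arithmetic follows; the helper names are made up for illustration.

```c
/* Hypothetical helpers for checking the numbers quoted in this thread. */

/* Size of a function given the address of its first instruction and the
 * address and length of its last instruction (retq is 1 byte long). */
static unsigned long long func_size(unsigned long long first_insn,
                                    unsigned long long last_insn,
                                    unsigned int last_insn_len)
{
    return last_insn + last_insn_len - first_insn;
}

/* Bytes saved between two builds; positive means "after" is smaller. */
static long bytes_saved(long before, long after)
{
    return before - after;
}
```

With the addresses from the two cpuid_eax listings, func_size() gives 44 bytes for the HYPERVISOR_GUEST build and 14 bytes for the build without it. bytes_saved(81746530, 81746509) gives 21, matching the difference in both the text and dec columns.
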

To recap, with this patch:

- Code is smaller both with and without HYPERVISOR_GUEST.
- Slowdown per cpuid_REG() call is at worst 4%.