From mboxrd@z Thu Jan 1 00:00:00 1970
From: Josh Poimboeuf
Subject: Re: [PATCH v2 0/9] x86: macrofying inline asm for better compilation
Date: Mon, 4 Jun 2018 14:05:52 -0500
Message-ID: <20180604190552.hm5e6zcabeyxt26u@treble>
References: <20180604112131.59100-1-namit@vmware.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <20180604112131.59100-1-namit@vmware.com>
Sender: virtualization-bounces@lists.linux-foundation.org
Errors-To: virtualization-bounces@lists.linux-foundation.org
To: Nadav Amit
Cc: Juergen Gross, Kate Stewart, Kees Cook, Peter Zijlstra,
 Greg Kroah-Hartman, Christopher Li, x86@kernel.org,
 linux-kernel@vger.kernel.org, Philippe Ombredanne,
 virtualization@lists.linux-foundation.org, linux-sparse@vger.kernel.org,
 Ingo Molnar, Jan Beulich, "H. Peter Anvin", Alok Kataria,
 Linus Torvalds, Thomas Gleixner
List-Id: virtualization@lists.linuxfoundation.org

On Mon, Jun 04, 2018 at 04:21:22AM -0700, Nadav Amit wrote:
> This patch-set deals with an interesting yet stupid problem: kernel code
> that does not get inlined despite its simplicity. There are several
> causes for this behavior: "cold" attribute on __init, different function
> optimization levels; conditional constant computations based on
> __builtin_constant_p(); and finally large inline assembly blocks.
>
> This patch-set deals with the inline assembly problem. I separated these
> patches from the others (that were sent in the RFC) for easier
> inclusion. I also separated the removal of unnecessary new-lines which
> would be sent separately.
>
> The problem with inline assembly is that inline assembly is often used
> by the kernel for things that are other than code - for example,
> assembly directives and data. GCC however is oblivious to the content of
> the blocks and assumes their cost in space and time is proportional to
> the number of the perceived assembly "instructions", according to the
> number of newlines and semicolons. Alternatives, paravirt and other
> mechanisms are affected, causing code not to be inlined, and degrading
> compilation quality in general.
>
> The solution that this patch-set carries for this problem is to create
> an assembly macro, and then call it from the inline assembly block. As
> a result, the compiler sees a single "instruction" and assigns the more
> appropriate cost to the code.
>
> To avoid uglification of the code, as many noted, the macros are first
> precompiled into an assembly file, which is later assembled together
> with the C files. This also enables to avoid duplicate
> implementation that was set before for the asm and C code. This can be
> seen in the exception table changes.
>
> Overall this patch-set slightly increases the kernel size (my build was
> done using my Ubuntu 18.04 config + localyesconfig for the record):
>
>     text     data     bss      dec     hex filename
> 18140829 10224724 2957312 31322865 1ddf2f1 ./vmlinux before
> 18163608 10227348 2957312 31348268 1de562c ./vmlinux after (+0.1%)
>
> The number of static functions in the image is reduced by 379, but
> actually inlining is even better, which does not always show in these
> numbers: a function may be inlined causing the calling function not to
> be inlined.
>
> The Makefile stuff may not be too clean. Ideas for improvements are
> welcome.
>
> v1->v2: * Compiling the macros into a separate .s file, improving
>           readability (Linus)
>         * Improving assembly formatting, applying most of the comments
>           according to my judgment (Jan)
>         * Adding exception-table, cpufeature and jump-labels
>         * Removing new-line cleanup; to be submitted separately

How did you find these issues?  Is there some way to find them
automatically in the future?  Perhaps with a GCC plugin?

-- 
Josh
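
For illustration, the size estimate described in the quoted cover letter
can be modelled roughly as below. This is a simplified sketch, not GCC's
actual code: the function name estimated_asm_insns(), the section name
.example_table and the macro name EXAMPLE_BUG are all hypothetical, and
the model only counts newlines and semicolons as the cover letter
describes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified model of the estimate described above: start at one
 * "instruction" and add one for every newline or semicolon found in
 * the asm template. Illustrative only; not taken from GCC.
 */
static int estimated_asm_insns(const char *templ)
{
	int count = 1;

	assert(templ != NULL);
	for (; *templ != '\0'; templ++)
		if (*templ == '\n' || *templ == ';')
			count++;
	return count;
}

/*
 * Hypothetical open-coded template: one real instruction followed by
 * bookkeeping data emitted into another section.
 */
static const char open_coded[] =
	"1:\tud2\n"
	".pushsection .example_table, \"a\"\n"
	"\t.long 1b - .\n"
	"\t.word 0\n"
	".popsection";

/* The same block hidden behind a hypothetical assembler macro. */
static const char via_macro[] = "EXAMPLE_BUG 0";
```

Under this model the open-coded template is counted as five
"instructions" although it emits only one, while the macro invocation
is counted as one. A script or plugin that compares such an estimate
against the number of real instructions in each asm block might be one
way to flag candidates automatically.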