Message-ID: <4A0C59FE.3060702@zytor.com>
Date: Thu, 14 May 2009 10:50:54 -0700
From: "H. Peter Anvin"
To: Jeremy Fitzhardinge
CC: Ingo Molnar, "Xin, Xiaohui", "Li, Xin", "Nakajima, Jun",
    Nick Piggin, Linux Kernel Mailing List, Xen-devel
Subject: Re: Performance overhead of paravirt_ops on native identified
References: <4A0B62F7.5030802@goop.org> <4A0B6F9C.4060405@zytor.com> <4A0C568B.7070907@goop.org>
In-Reply-To: <4A0C568B.7070907@goop.org>

Jeremy Fitzhardinge wrote:
>
> We did consider something like this at the outset.  As I remember, there
> were a few concerns:
>
>     * There was no relocation data available in the kernel.  I played
>       around with ways to make it work, but they ended up being fairly
>       complex and brittle, with a tendency (of course) to trigger
>       binutils bugs.  Maybe that has changed.

We already do this pass (in fact, we do something like three passes of
it.)  It's basically the vmlinux.o pass.

>     * We didn't really want to implement two separate mechanisms for the
>       same thing.  Given that we wanted to inline things like
>       cli/sti/pushf/popf, we needed to have something capable of full
>       patching.  Having a separate mechanism for patching calls is
>       harder to justify.  Now that pvops is well settled, perhaps it
>       makes sense to consider adding another more general patching
>       mechanism to avoid the indirect calls (a dynamic linker, essentially).

Full patching is understandable (although I think sometimes the code
generated was worse than out-of-line... I believe you have fixed that.)

> I won't make any great claims about the beauty of the PV_CALL* gunk, but
> at the very least it is contained within paravirt.h.

There is still massive spillover into other code, though, at least some
of which could possibly be avoided.  I don't know.

>> (*) if patching code on SMP was cheaper, we could actually do this
>> lazily, and wouldn't have to store a list of patch sites.  I don't feel
>> brave enough to go down that route.
>>
> The problem that the tracepoints people were trying to solve was harder,
> where they wanted to replace an arbitrary set of instructions with some
> other arbitrary instructions (or a call) - that would need some kind of
> SMP synchronization, both for general sanity and to keep the Intel rules
> happy.
>
> In theory relinking a call should just be a single word write into the
> instruction, but I don't know if that gets into undefined territory or
> not.  On older P4 systems it would end up blowing away the trace cache
> on all cpus when you write to code like that, so you'd want to be sure
> that your references are getting resolved fairly quickly.  But it's hard
> to see how patching the offset in a call instruction would end up
> calling something other than the old or new function.

The problem is that since the call offset field can be arbitrarily
aligned -- it could even cross page boundaries -- you still have
absolutely no SMP atomicity guarantees.  So you still have all the same
problems.

Without

	-hpa
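
To make the alignment point concrete, here is a minimal sketch in C.
It assumes the usual x86 near-call encoding (one 0xE8 opcode byte
followed by a 32-bit displacement), a 64-byte cache line, and 4 KiB
pages.  The helper names and the size constants are hypothetical,
chosen only for illustration; they are not kernel interfaces.  The
code only checks whether the displacement field of a call at a given
address straddles a cache-line or page boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes for illustration; real values depend on the CPU. */
#define CACHE_LINE_SIZE 64u
#define PAGE_SIZE_BYTES 4096u

/* x86 near relative call: opcode 0xE8 followed by a 32-bit displacement. */
#define CALL_OPCODE_LEN 1u
#define CALL_REL32_LEN  4u

/* True if the byte range [start, start + len) spans a multiple of `boundary`.
 * `boundary` must be a nonzero power of two. */
static bool crosses_boundary(uintptr_t start, unsigned len, uintptr_t boundary)
{
    assert(len > 0 && boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (start & ~(boundary - 1)) != ((start + len - 1) & ~(boundary - 1));
}

/* Hypothetical helper: call_addr is the address of the 0xE8 opcode byte. */
static bool call_rel32_splits_cache_line(uintptr_t call_addr)
{
    return crosses_boundary(call_addr + CALL_OPCODE_LEN, CALL_REL32_LEN,
                            CACHE_LINE_SIZE);
}

/* Hypothetical helper: same check against the assumed page size. */
static bool call_rel32_splits_page(uintptr_t call_addr)
{
    return crosses_boundary(call_addr + CALL_OPCODE_LEN, CALL_REL32_LEN,
                            PAGE_SIZE_BYTES);
}
```

Since the compiler places call instructions at arbitrary byte offsets,
nothing prevents either helper from returning true for a given call
site.  A false result does not by itself make a live rewrite of the
displacement safe; the sketch only shows that a "single word write"
cannot be assumed to land inside one cache line or one page.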