From mboxrd@z Thu Jan 1 00:00:00 1970 From: Konrad Rzeszutek Wilk Subject: Re: [RFC PATCH v3.1 1/2] xsplice: rfc.v3.1 Date: Fri, 31 Jul 2015 11:46:27 -0400 Message-ID: <20150731154626.GA30139@localhost.localdomain> References: <1438024817-26942-1-git-send-email-konrad.wilk@oracle.com> <1438024817-26942-2-git-send-email-konrad.wilk@oracle.com> <20150730164740.GO17367@sventech.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1ZLCbW-000211-EZ for xen-devel@lists.xenproject.org; Fri, 31 Jul 2015 15:52:14 +0000 Content-Disposition: inline In-Reply-To: <20150730164740.GO17367@sventech.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Johannes Erdfelt Cc: elena.ufimtseva@oracle.com, jeremy@goop.org, hanweidong@huawei.com, jbeulich@suse.com, john.liuqiming@huawei.com, paul.voccio@rackspace.com, Konrad Rzeszutek Wilk , daniel.kiper@oracle.com, major.hayden@rackspace.com, liuyingdong@huawei.com, aliguori@amazon.com, xen-devel@lists.xenproject.org, steven.wilson@rackspace.com, peter.huangpeng@huawei.com, msw@amazon.com, xiantao.zxt@alibaba-inc.com, rick.harris@rackspace.com, josh.kearney@rackspace.com, jinsong.liu@alibaba-inc.com, amesserl@rackspace.com, mpohlack@amazon.com, dslutz@verizon.com, fanhenglong@huawei.com, Andrew.Cooper3@citrix.com List-Id: xen-devel@lists.xenproject.org On Thu, Jul 30, 2015 at 09:47:40AM -0700, Johannes Erdfelt wrote: > Thanks for the work on this. I had some comments and questions on the > latest draft. Hey Johannes! Thank you for your review! > > On Mon, Jul 27, 2015, Konrad Rzeszutek Wilk wrote: > > +#define XSPLICE_HOWTO_FLAG_PC_REL 0x1 /* Is PC relative. */ > > +#define XSPLICE_HOWOT_FLAG_SIGN 0x2 /* Should the new value be treated as signed value. */ > > s/HOWOT/HOWTO/ > > > +struct xsplice_reloc_howto { > > + uint32_t howto; /* XSPLICE_HOWTO_* */ > > + uint32_t flag; /* XSPLICE_HOWTO_FLAG_* */ > > + uint32_t size; /* Size, in bytes, of the item to be relocated. */ > > + uint32_t r_shift; /* The value the final relocation is shifted right by; used to drop unwanted data from the relocation. */ > > + uint64_t mask; /* Bitmask for which parts of the instruction or data are replaced with the relocated value. */ > > + uint8_t pad[8]; /* Must be zero. */ > > +}; > > I'm curious how r_shift and mask are used. I'm familiar with x86 and > x86_64 and I'm not sure how these fit in. Is this to support other > architectures? It is to patch up data. We can specify the exact mask for an unsigned int - so we only patch specific bits. Ditto if we want to remove certain values. > > > +### Trampoline (e9 opcode) > > + > > +The e9 opcode used for jmpq uses a 32-bit signed displacement. That means > > +we are limited to up to 2GB of virtual address to place the new code > > +from the old code. That should not be a problem since Xen hypervisor has > > +a very small footprint. > > + > > +However if we need - we can always add two trampolines. One at the 2GB > > +limit that calls the next trampoline. > > I'm not sure I understand the concern. At least on x86_64, there is a > ton of unused virtual address space where the hypervisor image is > mapped. Why not simply map memory at the end of virtual address space? > > There shouldn't be a concern with 2GB jumps then. Correct. This is added on if the hypervisor ends up gobbling up tons of memory and having those virtual addresses eaten up. > > > +Please note there is a small limitation for trampolines in > > +function entries: The target function (+ trailing padding) must be able > > +to accomodate the trampoline. On x86 with +-2 GB relative jumps, > > +this means 5 bytes are required. > > + > > +Depending on compiler settings, there are several functions in Xen that > > +are smaller (without inter-function padding). > > + > > +
 
> > +readelf -sW xen-syms | grep " FUNC " | \
> > +    awk '{ if ($3 < 5) print $3, $4, $5, $8 }'
> > +
> > +...
> > +3 FUNC LOCAL wbinvd_ipi
> > +3 FUNC LOCAL shadow_l1_index
> > +...
> > +
> > +A compile-time check for, e.g., a minimum alignment of functions or a > > +runtime check that verifies symbol size (+ padding to next symbols) for > > +that in the hypervisor is advised. > > Is this really necessary? The way Xen is currently compiled results in > functions being aligned at 16-byte boundaries. The extra space is padded > with NOPs. Even if a function is only 3 bytes, it still has at least 16 > bytes of space to use. > > This can be controlled with the -falign-functions option to gcc. Right. The 'compile-time' check can be just to make sure that the compiler is controlled by that - otherwise we will halt the compilation. > > Also, there are ways to get a 5-byte NOP added before the function. > This is what the Linux kernel does for ftrace which is what the recent > Linux kernel live patching support is built on. > > It seems like it would be easier to be explicit during the build process > than do runtime checks to ensure there is enough space. Correct. > > > +### When to patch > > + > > +During the discussion on the design two candidates bubbled where > > +the call stack for each CPU would be deterministic. This would > > +minimize the chance of the patch not being applied due to safety > > +checks failing. > > It would be nice to have the consistency model be more explicit. > > Maybe using the terminology from this LKML post? > > https://lkml.org/lkml/2014/11/7/354 Certainy we can borrow. > > > +To randezvous all the CPUs an barrier with an maximum timeout (which > > +could be adjusted), combined with forcing all other CPUs through the > > +hypervisor with IPIs, can be utilized to have all the CPUs be lockstep. > > s/randezvous/rendezvous/ > > > +### Compiling the hypervisor code > > + > > +Hotpatch generation often requires support for compiling the target > > +with -ffunction-sections / -fdata-sections. Changes would have to > > +be done to the linker scripts to support this. > > Is this for correctness reasons? Sanity mostly. Without that having the objdump and tools to figure out what piece of code is for what would be complicated. > > I understand this would be a good idea to reduce the size of patches, > but I wanted to make sure I'm not missing something. > > > +### Symbol names > > + > > + > > +Xen as it is now, has a couple of non-unique symbol names which will > > +make runtime symbol identification hard. Sometimes, static symbols > > +simply have the same name in C files, sometimes such symbols get > > +included via header files, and some C files are also compiled > > +multiple times and linked under different names (guest_walk.c). > > I'm not sure I understand the problem with static symbols. They aren't > visible outside of the .c file, so when performing the linking against > the target xen image, there shouldn't be any conflicts. To do run-time checking based on symbols. We may get information that we need to patch 'xen_someting_static+<0x1f>' and we have not been supplied the virtual address. As such if xen_something_static may not show up at all - and we can't patch it. We need some mechanism to tweak the build so that those symbols do bubble up. > > JE > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel