From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wei Liu Subject: Re: [PATCH v2 01/13] xsplice: Design document (v5). Date: Tue, 19 Jan 2016 11:14:11 +0000 Message-ID: <20160119111411.GN1691@citrix.com> References: <1452808031-706-1-git-send-email-konrad.wilk@oracle.com> <1452808031-706-2-git-send-email-konrad.wilk@oracle.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta14.messagelabs.com ([193.109.254.103]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1aLUEp-0004b6-Rg for xen-devel@lists.xenproject.org; Tue, 19 Jan 2016 11:14:16 +0000 Content-Disposition: inline In-Reply-To: <1452808031-706-2-git-send-email-konrad.wilk@oracle.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Konrad Rzeszutek Wilk Cc: wei.liu2@citrix.com, ian.campbell@citrix.com, andrew.cooper3@citrix.com, ian.jackson@eu.citrix.com, mpohlack@amazon.com, ross.lagerwall@citrix.com, stefano.stabellini@citrix.com, jbeulich@suse.com, sasha.levin@oracle.com, xen-devel@lists.xenproject.org List-Id: xen-devel@lists.xenproject.org I skimmed this document and managed to do some non-technical nitpicks. :-) On Thu, Jan 14, 2016 at 04:46:59PM -0500, Konrad Rzeszutek Wilk wrote: [...] > +## Patching code > + > +The first mechanism to patch that comes in mind is in-place replacement. > +That is replace the affected code with new code. Unfortunately the x86 "replacing" or "to replace" > +ISA is variable size which places limits on how much space we have available > +to replace the instructions. That is not a problem if the change is smaller > +than the original opcode and we can fill it with nops. Problems will > +appear if the replacement code is longer. > + > +The second mechanism is by replacing the call or jump to the > +old function with the address of the new function. > + > +A third mechanism is to add a jump to the new function at the > +start of the old function. N.B. The Xen hypervisor implements the third > +mechanism. > + > +### Example of trampoline and in-place splicing > + > +As example we will assume the hypervisor does not have XSA-132 (see > +*domctl/sysctl: don't leak hypervisor stack to toolstacks* > +4ff3449f0e9d175ceb9551d3f2aecb59273f639d) and we would like to binary patch > +the hypervisor with it. The original code looks as so: > + > +
> +   48 89 e0                  mov    %rsp,%rax  
> +   48 25 00 80 ff ff         and    $0xffffffffffff8000,%rax  
> +
> + > +while the new patched hypervisor would be: > + > +
> +   48 c7 45 b8 00 00 00 00   movq   $0x0,-0x48(%rbp)  
> +   48 c7 45 c0 00 00 00 00   movq   $0x0,-0x40(%rbp)  
> +   48 c7 45 c8 00 00 00 00   movq   $0x0,-0x38(%rbp)  
> +   48 89 e0                  mov    %rsp,%rax  
> +   48 25 00 80 ff ff         and    $0xffffffffffff8000,%rax  
> +
> + > +This is inside the arch_do_domctl. This new change adds 21 extra > +bytes of code which alters all the offsets inside the function. To alter > +these offsets and add the extra 21 bytes of code we might not have enough > +space in .text to squeeze this in. > + > +As such we could simplify this problem by only patching the site > +which calls arch_do_domctl: > + > +
> +:  
> + e8 4b b1 05 00          callq  ffff82d08015fbb9   
> +
> + > +with a new address for where the new `arch_do_domctl` would be (this > +area would be allocated dynamically). > + > +Astute readers will wonder what we need to do if we were to patch `do_domctl` > +- which is not called directly by hypervisor but on behalf of the guests via > +the `compat_hypercall_table` and `hypercall_table`. > +Patching the offset in `hypercall_table` for `do_domctl: > +(ffff82d080103079 :) Blank line here please. > +
> +
> + ffff82d08024d490:   79 30  
> + ffff82d08024d492:   10 80 d0 82 ff ff   
> +
> +
Blank line. > +with the new address where the new `do_domctl` is possible. The other > +place where it is used is in `hvm_hypercall64_table` which would need > +to be patched in a similar way. This would require an in-place splicing > +of the new virtual address of `arch_do_domctl`. > + > +In summary this example patched the callee of the affected function by > + * allocating memory for the new code to live in, > + * changing the virtual address in all the functions which called the old > + code (computing the new offset, patching the callq with a new callq). > + * changing the function pointer tables with the new virtual address of > + the function (splicing in the new virtual address). Since this table > + resides in the .rodata section we would need to temporarily change the > + page table permissions during this part. > + > + > +However it has severe drawbacks - the safety checks which have to make sure > +the function is not on the stack - must also check every caller. For some > +patches this could mean - if there were an sufficient large amount of > +callers - that we would never be able to apply the update. > + > +### Example of different trampoline patching. > + > +An alternative mechanism exists where we can insert a trampoline in the > +existing function to be patched to jump directly to the new code. This > +lessens the locations to be patched to one but it puts pressure on the > +CPU branching logic (I-cache, but it is just one unconditional jump). > + > +For this example we will assume that the hypervisor has not been compiled > +with fe2e079f642effb3d24a6e1a7096ef26e691d93e (XSA-125: *pre-fill structures > +for certain HYPERVISOR_xen_version sub-ops*) which mem-sets an structure > +in `xen_version` hypercall. This function is not called **anywhere** in > +the hypervisor (it is called by the guest) but referenced in the > +`compat_hypercall_table` and `hypercall_table` (and indirectly called > +from that). Patching the offset in `hypercall_table` for the old > +`do_xen_version` (ffff82d080112f9e ) > + > + > + ffff82d08024b270 > + ... > + ffff82d08024b2f8: 9e 2f 11 80 d0 82 ff ff > + > + Blank line. > +with the new address where the new `do_xen_version` is possible. The other > +place where it is used is in `hvm_hypercall64_table` which would need > +to be patched in a similar way. This would require an in-place splicing > +of the new virtual address of `do_xen_version`. > + [...] > +#### Before entering the guest code. > + > +Before we call VMXResume we check whether any soft IRQs need to be executed. > +This is a good spot because all Xen stacks are effectively empty at > +that point. > + > +To randezvous all the CPUs an barrier with an maximum timeout (which "rendezvous" > +could be adjusted), combined with forcing all other CPUs through the > +hypervisor with IPIs, can be utilized to have all the CPUs be lockstep. > + I couldn't parse the last part of this sentence. But I'm not a native speaker. Wei.