Re: [RFC v2] xSplice design - Konrad Rzeszutek Wilk

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>,
	jeremy@goop.org, hanweidong@huawei.com,
	john.liuqiming@huawei.com,
	Paul Voccio <paul.voccio@rackspace.com>,
	xen-devel@lists.xenproject.org,
	Daniel Kiper <daniel.kiper@oracle.com>,
	Major Hayden <major.hayden@rackspace.com>,
	liuyingdong@huawei.com, aliguori@amazon.com, konrad@darnok.org,
	lars.kurth@citrix.com,
	Steven Wilson <steven.wilson@rackspace.com>,
	peter.huangpeng@huawei.com, msw@amazon.com,
	xiantao.zxt@alibaba-inc.com,
	Rick Harris <rick.harris@rackspace.com>,
	boris.ostrovsky@oracle.com,
	Josh Kearney <josh.kearney@rackspace.com>,
	jinsong.liu@alibaba-inc.com,
	Antony Messerli <amesserl@rackspace.com>,
	fanhenglong@huawei.com, andrew.cooper3@citrix.com
Subject: Re: [RFC v2] xSplice design
Date: Fri, 5 Jun 2015 12:00:52 -0400	[thread overview]
Message-ID: <20150605160052.GB11714@l.oracle.com> (raw)
In-Reply-To: <5571D98B0200007800081785@mail.emea.novell.com>

On Fri, Jun 05, 2015 at 04:16:59PM +0100, Jan Beulich wrote:
> >>> On 05.06.15 at 16:49, <konrad.wilk@oracle.com> wrote:
> > On Mon, May 18, 2015 at 01:41:34PM +0100, Jan Beulich wrote:
> >> >>> On 15.05.15 at 21:44, <konrad.wilk@oracle.com> wrote:
> >> > As such having the payload in an ELF file is the sensible way. We would be
> >> > carrying the various set of structures (and data) in the ELF sections under
> >> > different names and with definitions. The prefix for the ELF section name
> >> > would always be: *.xsplice_*
> >> > 
> >> > Note that every structure has padding. This is added so that the hypervisor
> >> > can re-use those fields as it sees fit.
> >> > 
> >> > There are five sections *.xsplice_* sections:
> >> 
> >> A general remark: Having worked on ELF on different occasions,
> >> UNIX'es (and hence Linux'es) use of section names to identify the
> >> sections' purposes is pretty contrary to ELF's original intentions.
> >> Section types (and architecture as well as OS extensions to them)
> >> exist for a reason. Of course from the following sections it's not
> >> really clear in how far you intended this to be done. If you did
> >> and just didn't explicitly say so, pointing out that relations
> >> between sections should then also be expressed suitably via ELF
> >> mechanisms (sh_link, sh_info) would then be unnecessary too. If
> >> you didn't, then I'd strongly suggest switching to a formally more
> >> correct model, which would then include specifying section types
> >> and section attributes (and optional other relevant aspects)
> >> along with their (then no longer mandatory) names.
> > 
> > 
> > Something like:
> > http://docs.oracle.com/cd/E23824_01/html/819-0690/chapter7-1.html 
> 
> Yes, the SHT_SUNW_* ones are examples of what I mean.
> 
> > With (for example) an pointer to a specific section:
> > http://docs.oracle.com/cd/E23824_01/html/819-0690/chapter7-28.html#scrolltoc 
> > ?
> 
> Not sure what part of that page you mean to refer to.

The whole page - with the:

typedef struct {
        Elf32_Word      c_tag;
        union {
                Elf32_Word      c_val;
                Elf32_Addr      c_ptr;
        } c_un;
} Elf32_Cap;

.. and such.

> 
> >> >  * `.bss` section for the new code (function)
> >> >  * `.data` and `.data.read_mostly` section for the new and old code (function)
> >> >  * `.rodata` section for the new and old code (function).
> >> > 
> >> > In short the *.xsplice_* sections represent various structures and the
> >> > ELF provides the mechanism to glue it all together when loaded in memory.
> >> 
> >> As per above - I think ELF can do this in a more explicit way. Of
> >> course that would imply sticking to strictly ELF types in the structure
> >> definitions below. That would in turn have the advantage of
> >> expressing e.g. "const char *" references to strings by section
> >> offsets into a specified string section, reducing (with the ultimate
> >> goal of eliminating) the need for processing relocations on the
> >> ELF object (possible references to "external symbols" left aside for
> >> the moment).
> > 
> > Do you think that going that way would be too complex for this design?
> > Folks have expressed that perhaps I went overboard with this ELF and
> > should have just stuck with structures and just mention that it is
> > in an ELF file?
> 
> I think it should be done either completely ELF-agnostic (with ELF
> just happening the container to convey the data) or fully ELF-
> compliant.

Let me do it ELF agnostic for right now and focus on the structures.
If it becomes to difficult to understand then I can look at converting
the whole thing to be ELF compliant.

> 
> >> > #define XSPLICE_HOWTO_RELOC_TIME    3 /* __TIME__ type chnage. */  
> >> 
> >> Assuming the former is DATE - what do you foresee date/time
> >> kinds of changes to be good for?
> > 
> > In case we have source code which uses __DATE__ and the binary patch
> > modifies it - we would need to sort out other users of the __DATE__
> > and make sure that they are not affected (or perhaps they should be).
> 
> And how is this different from arbitrary other string literals?

They are exactly the same. Will coalesce them.
> 
> >> > #define XSPLICE_HOWTO_BUG           4 /* BUG_ON being replaced.*/  
> >> > #define XSPLICE_HOWTO_EXTABLE       5 /* exception_table change. */  
> >> > #define XSPLICE_HOWTO_SYMBOL        6 /* change in symbol table. */  
> >> 
> >> For none of these I really understand their purpose.
> > 
> > This is to help the hypervisor to figure out which of the lookup
> > mechanisms to use to find the relevant section. As in if we need
> > to update (as part of the patch) a normal function, but also
> > the exception table (because the new function is at a new virtual
> > address) we want to search in the exception table for the old 
> > one and replace it with the new virtual address.
> 
> I think this gets too specific. Telling apart code and data may make
> sense, but I think beyond that thing should be expressed as
> generically as possible. Imagine new special data kinds showing up
> - do you envision adding a special type (and hence special handling)
> for all of them?

Yes and then updating the design document to include it.

Would there be an more relaxed way to go about this?

> 
> >> > #define XSPLICE_SECTION_STRING 0x00000008 /* Section is in .str */  
> >> > #define XSPLICE_SECTION_ALTINSTRUCTIONS 0x00000010 /* Section has 
> > .altinstructions. */  
> >> 
> >> I don't see the purpose of adding flags for a few special sections we
> >> may know about now - what if over time hundreds of sections appear?
> > 
> > We just need to know if it has - and then the hypervisor needs
> > to call apply_altinstructions to redo the alternative instructions.
> > 
> > I presume you are advocating to move this flag to an 'higher' level part
> > of the structures?
> 
> As above I'd rather see such things not special cased at all. You
> already need to deal with the to be patched code having two
> (or more) possible variants due to live patching at boot. I think
> for the purpose of runtime patching it ought to be quite fine to
> use correct, but less optimal instructions (i.e. no consideration of
> alternative instructions at all).

OK.
> 
> >> > #define XSPLICE_SECTION_TEXT_INPLACE 0x00000200 /* Change is in place. */   
> >> > #dekine XSPLICE_SECTION_MATCH_EXACT 0x00000400 /* Must match exactly. */  
> >> > #define XSPLICE_SECTION_NO_STACKCHECK 0x00000800 /* Do not check the stack. */  
> >> 
> >> This isn't the first reference to "stack", without me having seen a
> >> description of what the word is meant to stand for in such contexts.
> >> Did I overlook it?
> > 
> > I mentioned it earlier in 'Example of trampoline and in-place splicing':
> > 
> > 126 However it has severe drawbacks - the safety checks which have to make sure     
> > 127 the function is not on the stack - must also check every caller. For some 
> > 128 patches this could if there were an sufficient large amount of callers   
> > 129 that we would never be able to apply the update.
> > 
> > But perhaps I should dive much deeper in about the patching
> > mechanism having to check whether the function to be patched
> > is on the stack.
> 
> Yeah - the thing is that the lines you quote are precisely what I
> referred to by "this isn't the first reference..." - it's not being
> clarified anywhere what "stack" you talk about. In some places
> it looked like you meant the stack in memory that the local
> variables and spilled registers go onto, but in other places it
> wasn't clear whether something else was meant. And with us
> supposedly being in stop-machine state, what is on the call
> stacks of each CPU should be pretty deterministic.

True, the 'should' is the operative word. Hence the extra mechanism
to check for stack as an fail-safe mechanism. Or if we really really
are sure we can go away with - just patch it and don't do any
stack checking.

I will update the glossary to describe the 'stack' problem.
> 
> >> > ### Alternative assembler
> >> > 
> >> > Alternative assembler is a mechanism to use different instructions depending
> >> > on what the CPU supports. This is done by providing multiple streams of code
> >> > that can be patched in - or if the CPU does not support it - padded with
> >> > `nop` operations. The alternative assembler macros cause the compiler to
> >> > expand the code to place a most generic code in place - emit a special
> >> > ELF .section header to tag this location. During run-time the hypervisor
> >> > can leave the areas alone or patch them with an better suited opcodes.
> >> > 
> >> > As we might be patching the alternative assembler sections as well - by
> >> > providing a new better suited op-codes or perhaps with nops - we need to
> >> > also re-run the alternative assembler patching after we have done our
> >> > patching.
> >> 
> >> These sections are part of .init.* and as such can't reasonably be
> >> subject to patching.
> > 
> > True, thought I thought we have some for run-time as well.
> 
> No, we're not switching between UP and SMP like Linux does (or did).

That simplifies it! Thanks.
> 
> Jan

next prev parent reply	other threads:[~2015-06-05 16:01 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-15 19:44 [RFC v2] xSplice design Konrad Rzeszutek Wilk
2015-05-18 12:41 ` Jan Beulich
2015-06-05 14:49   ` Konrad Rzeszutek Wilk
2015-06-05 15:16     ` Jan Beulich
2015-06-05 16:00       ` Konrad Rzeszutek Wilk [this message]
2015-06-05 16:14         ` Jan Beulich
2015-05-18 12:54 ` Liuqiming (John)
2015-05-18 13:11   ` Daniel Kiper
2015-06-05 14:50   ` Konrad Rzeszutek Wilk
2015-05-19 19:13 ` Lars Kurth
2015-05-20 15:11 ` Martin Pohlack
2015-06-05 15:00   ` Konrad Rzeszutek Wilk
2015-06-05 15:15     ` Andrew Cooper
2015-06-05 15:27     ` Jan Beulich
2015-06-08  8:34       ` Martin Pohlack
2015-06-08  8:51         ` Jan Beulich
2015-06-08 14:38     ` Martin Pohlack
2015-06-08 15:19       ` Konrad Rzeszutek Wilk
2015-06-12 11:51         ` Martin Pohlack
2015-06-12 14:06           ` Konrad Rzeszutek Wilk
2015-06-12 11:39 ` Martin Pohlack
2015-06-12 14:03   ` Konrad Rzeszutek Wilk
2015-06-12 14:31     ` Martin Pohlack
2015-06-12 14:43       ` Jan Beulich
2015-06-12 17:31         ` Martin Pohlack
2015-06-12 18:46           ` Konrad Rzeszutek Wilk
2015-06-12 16:09       ` Konrad Rzeszutek Wilk
2015-06-12 16:17         ` Andrew Cooper
2015-06-12 16:39           ` Konrad Rzeszutek Wilk
2015-06-12 18:36             ` Martin Pohlack
2015-06-12 18:51               ` Konrad Rzeszutek Wilk
2015-07-06 19:36         ` Konrad Rzeszutek Wilk
2015-10-27 12:05   ` Ross Lagerwall
2015-10-29 16:55     ` Ross Lagerwall
2015-10-30 10:39       ` Martin Pohlack
2015-10-30 14:03         ` Ross Lagerwall
2015-10-30 14:06           ` Martin Pohlack

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150605160052.GB11714@l.oracle.com \
    --to=konrad.wilk@oracle.com \
    --cc=JBeulich@suse.com \
    --cc=aliguori@amazon.com \
    --cc=amesserl@rackspace.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=daniel.kiper@oracle.com \
    --cc=elena.ufimtseva@oracle.com \
    --cc=fanhenglong@huawei.com \
    --cc=hanweidong@huawei.com \
    --cc=jeremy@goop.org \
    --cc=jinsong.liu@alibaba-inc.com \
    --cc=john.liuqiming@huawei.com \
    --cc=josh.kearney@rackspace.com \
    --cc=konrad@darnok.org \
    --cc=lars.kurth@citrix.com \
    --cc=liuyingdong@huawei.com \
    --cc=major.hayden@rackspace.com \
    --cc=msw@amazon.com \
    --cc=paul.voccio@rackspace.com \
    --cc=peter.huangpeng@huawei.com \
    --cc=rick.harris@rackspace.com \
    --cc=steven.wilson@rackspace.com \
    --cc=xen-devel@lists.xenproject.org \
    --cc=xiantao.zxt@alibaba-inc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).