From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>,
jeremy@goop.org, hanweidong@huawei.com,
john.liuqiming@huawei.com,
Paul Voccio <paul.voccio@rackspace.com>,
xen-devel@lists.xenproject.org,
Daniel Kiper <daniel.kiper@oracle.com>,
Major Hayden <major.hayden@rackspace.com>,
liuyingdong@huawei.com, aliguori@amazon.com, konrad@darnok.org,
lars.kurth@citrix.com,
Steven Wilson <steven.wilson@rackspace.com>,
peter.huangpeng@huawei.com, msw@amazon.com,
xiantao.zxt@alibaba-inc.com,
Rick Harris <rick.harris@rackspace.com>,
boris.ostrovsky@oracle.com,
Josh Kearney <josh.kearney@rackspace.com>,
jinsong.liu@alibaba-inc.com,
Antony Messerli <amesserl@rackspace.com>,
fanhenglong@huawei.com, andrew.cooper3@citrix.com
Subject: Re: [RFC v2] xSplice design
Date: Fri, 5 Jun 2015 12:00:52 -0400 [thread overview]
Message-ID: <20150605160052.GB11714@l.oracle.com> (raw)
In-Reply-To: <5571D98B0200007800081785@mail.emea.novell.com>
On Fri, Jun 05, 2015 at 04:16:59PM +0100, Jan Beulich wrote:
> >>> On 05.06.15 at 16:49, <konrad.wilk@oracle.com> wrote:
> > On Mon, May 18, 2015 at 01:41:34PM +0100, Jan Beulich wrote:
> >> >>> On 15.05.15 at 21:44, <konrad.wilk@oracle.com> wrote:
> >> > As such having the payload in an ELF file is the sensible way. We would be
> >> > carrying the various set of structures (and data) in the ELF sections under
> >> > different names and with definitions. The prefix for the ELF section name
> >> > would always be: *.xsplice_*
> >> >
> >> > Note that every structure has padding. This is added so that the hypervisor
> >> > can re-use those fields as it sees fit.
> >> >
> >> > There are five sections *.xsplice_* sections:
> >>
> >> A general remark: Having worked on ELF on different occasions,
> >> UNIX'es (and hence Linux'es) use of section names to identify the
> >> sections' purposes is pretty contrary to ELF's original intentions.
> >> Section types (and architecture as well as OS extensions to them)
> >> exist for a reason. Of course from the following sections it's not
> >> really clear in how far you intended this to be done. If you did
> >> and just didn't explicitly say so, pointing out that relations
> >> between sections should then also be expressed suitably via ELF
> >> mechanisms (sh_link, sh_info) would then be unnecessary too. If
> >> you didn't, then I'd strongly suggest switching to a formally more
> >> correct model, which would then include specifying section types
> >> and section attributes (and optional other relevant aspects)
> >> along with their (then no longer mandatory) names.
> >
> >
> > Something like:
> > http://docs.oracle.com/cd/E23824_01/html/819-0690/chapter7-1.html
>
> Yes, the SHT_SUNW_* ones are examples of what I mean.
>
> > With (for example) an pointer to a specific section:
> > http://docs.oracle.com/cd/E23824_01/html/819-0690/chapter7-28.html#scrolltoc
> > ?
>
> Not sure what part of that page you mean to refer to.
The whole page - with the:
typedef struct {
Elf32_Word c_tag;
union {
Elf32_Word c_val;
Elf32_Addr c_ptr;
} c_un;
} Elf32_Cap;
.. and such.
>
> >> > * `.bss` section for the new code (function)
> >> > * `.data` and `.data.read_mostly` section for the new and old code (function)
> >> > * `.rodata` section for the new and old code (function).
> >> >
> >> > In short the *.xsplice_* sections represent various structures and the
> >> > ELF provides the mechanism to glue it all together when loaded in memory.
> >>
> >> As per above - I think ELF can do this in a more explicit way. Of
> >> course that would imply sticking to strictly ELF types in the structure
> >> definitions below. That would in turn have the advantage of
> >> expressing e.g. "const char *" references to strings by section
> >> offsets into a specified string section, reducing (with the ultimate
> >> goal of eliminating) the need for processing relocations on the
> >> ELF object (possible references to "external symbols" left aside for
> >> the moment).
> >
> > Do you think that going that way would be too complex for this design?
> > Folks have expressed that perhaps I went overboard with this ELF and
> > should have just stuck with structures and just mention that it is
> > in an ELF file?
>
> I think it should be done either completely ELF-agnostic (with ELF
> just happening the container to convey the data) or fully ELF-
> compliant.
Let me do it ELF agnostic for right now and focus on the structures.
If it becomes to difficult to understand then I can look at converting
the whole thing to be ELF compliant.
>
> >> > #define XSPLICE_HOWTO_RELOC_TIME 3 /* __TIME__ type chnage. */
> >>
> >> Assuming the former is DATE - what do you foresee date/time
> >> kinds of changes to be good for?
> >
> > In case we have source code which uses __DATE__ and the binary patch
> > modifies it - we would need to sort out other users of the __DATE__
> > and make sure that they are not affected (or perhaps they should be).
>
> And how is this different from arbitrary other string literals?
They are exactly the same. Will coalesce them.
>
> >> > #define XSPLICE_HOWTO_BUG 4 /* BUG_ON being replaced.*/
> >> > #define XSPLICE_HOWTO_EXTABLE 5 /* exception_table change. */
> >> > #define XSPLICE_HOWTO_SYMBOL 6 /* change in symbol table. */
> >>
> >> For none of these I really understand their purpose.
> >
> > This is to help the hypervisor to figure out which of the lookup
> > mechanisms to use to find the relevant section. As in if we need
> > to update (as part of the patch) a normal function, but also
> > the exception table (because the new function is at a new virtual
> > address) we want to search in the exception table for the old
> > one and replace it with the new virtual address.
>
> I think this gets too specific. Telling apart code and data may make
> sense, but I think beyond that thing should be expressed as
> generically as possible. Imagine new special data kinds showing up
> - do you envision adding a special type (and hence special handling)
> for all of them?
Yes and then updating the design document to include it.
Would there be an more relaxed way to go about this?
>
> >> > #define XSPLICE_SECTION_STRING 0x00000008 /* Section is in .str */
> >> > #define XSPLICE_SECTION_ALTINSTRUCTIONS 0x00000010 /* Section has
> > .altinstructions. */
> >>
> >> I don't see the purpose of adding flags for a few special sections we
> >> may know about now - what if over time hundreds of sections appear?
> >
> > We just need to know if it has - and then the hypervisor needs
> > to call apply_altinstructions to redo the alternative instructions.
> >
> > I presume you are advocating to move this flag to an 'higher' level part
> > of the structures?
>
> As above I'd rather see such things not special cased at all. You
> already need to deal with the to be patched code having two
> (or more) possible variants due to live patching at boot. I think
> for the purpose of runtime patching it ought to be quite fine to
> use correct, but less optimal instructions (i.e. no consideration of
> alternative instructions at all).
OK.
>
> >> > #define XSPLICE_SECTION_TEXT_INPLACE 0x00000200 /* Change is in place. */
> >> > #dekine XSPLICE_SECTION_MATCH_EXACT 0x00000400 /* Must match exactly. */
> >> > #define XSPLICE_SECTION_NO_STACKCHECK 0x00000800 /* Do not check the stack. */
> >>
> >> This isn't the first reference to "stack", without me having seen a
> >> description of what the word is meant to stand for in such contexts.
> >> Did I overlook it?
> >
> > I mentioned it earlier in 'Example of trampoline and in-place splicing':
> >
> > 126 However it has severe drawbacks - the safety checks which have to make sure
> > 127 the function is not on the stack - must also check every caller. For some
> > 128 patches this could if there were an sufficient large amount of callers
> > 129 that we would never be able to apply the update.
> >
> > But perhaps I should dive much deeper in about the patching
> > mechanism having to check whether the function to be patched
> > is on the stack.
>
> Yeah - the thing is that the lines you quote are precisely what I
> referred to by "this isn't the first reference..." - it's not being
> clarified anywhere what "stack" you talk about. In some places
> it looked like you meant the stack in memory that the local
> variables and spilled registers go onto, but in other places it
> wasn't clear whether something else was meant. And with us
> supposedly being in stop-machine state, what is on the call
> stacks of each CPU should be pretty deterministic.
True, the 'should' is the operative word. Hence the extra mechanism
to check for stack as an fail-safe mechanism. Or if we really really
are sure we can go away with - just patch it and don't do any
stack checking.
I will update the glossary to describe the 'stack' problem.
>
> >> > ### Alternative assembler
> >> >
> >> > Alternative assembler is a mechanism to use different instructions depending
> >> > on what the CPU supports. This is done by providing multiple streams of code
> >> > that can be patched in - or if the CPU does not support it - padded with
> >> > `nop` operations. The alternative assembler macros cause the compiler to
> >> > expand the code to place a most generic code in place - emit a special
> >> > ELF .section header to tag this location. During run-time the hypervisor
> >> > can leave the areas alone or patch them with an better suited opcodes.
> >> >
> >> > As we might be patching the alternative assembler sections as well - by
> >> > providing a new better suited op-codes or perhaps with nops - we need to
> >> > also re-run the alternative assembler patching after we have done our
> >> > patching.
> >>
> >> These sections are part of .init.* and as such can't reasonably be
> >> subject to patching.
> >
> > True, thought I thought we have some for run-time as well.
>
> No, we're not switching between UP and SMP like Linux does (or did).
That simplifies it! Thanks.
>
> Jan
next prev parent reply other threads:[~2015-06-05 16:01 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-05-15 19:44 [RFC v2] xSplice design Konrad Rzeszutek Wilk
2015-05-18 12:41 ` Jan Beulich
2015-06-05 14:49 ` Konrad Rzeszutek Wilk
2015-06-05 15:16 ` Jan Beulich
2015-06-05 16:00 ` Konrad Rzeszutek Wilk [this message]
2015-06-05 16:14 ` Jan Beulich
2015-05-18 12:54 ` Liuqiming (John)
2015-05-18 13:11 ` Daniel Kiper
2015-06-05 14:50 ` Konrad Rzeszutek Wilk
2015-05-19 19:13 ` Lars Kurth
2015-05-20 15:11 ` Martin Pohlack
2015-06-05 15:00 ` Konrad Rzeszutek Wilk
2015-06-05 15:15 ` Andrew Cooper
2015-06-05 15:27 ` Jan Beulich
2015-06-08 8:34 ` Martin Pohlack
2015-06-08 8:51 ` Jan Beulich
2015-06-08 14:38 ` Martin Pohlack
2015-06-08 15:19 ` Konrad Rzeszutek Wilk
2015-06-12 11:51 ` Martin Pohlack
2015-06-12 14:06 ` Konrad Rzeszutek Wilk
2015-06-12 11:39 ` Martin Pohlack
2015-06-12 14:03 ` Konrad Rzeszutek Wilk
2015-06-12 14:31 ` Martin Pohlack
2015-06-12 14:43 ` Jan Beulich
2015-06-12 17:31 ` Martin Pohlack
2015-06-12 18:46 ` Konrad Rzeszutek Wilk
2015-06-12 16:09 ` Konrad Rzeszutek Wilk
2015-06-12 16:17 ` Andrew Cooper
2015-06-12 16:39 ` Konrad Rzeszutek Wilk
2015-06-12 18:36 ` Martin Pohlack
2015-06-12 18:51 ` Konrad Rzeszutek Wilk
2015-07-06 19:36 ` Konrad Rzeszutek Wilk
2015-10-27 12:05 ` Ross Lagerwall
2015-10-29 16:55 ` Ross Lagerwall
2015-10-30 10:39 ` Martin Pohlack
2015-10-30 14:03 ` Ross Lagerwall
2015-10-30 14:06 ` Martin Pohlack
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150605160052.GB11714@l.oracle.com \
--to=konrad.wilk@oracle.com \
--cc=JBeulich@suse.com \
--cc=aliguori@amazon.com \
--cc=amesserl@rackspace.com \
--cc=andrew.cooper3@citrix.com \
--cc=boris.ostrovsky@oracle.com \
--cc=daniel.kiper@oracle.com \
--cc=elena.ufimtseva@oracle.com \
--cc=fanhenglong@huawei.com \
--cc=hanweidong@huawei.com \
--cc=jeremy@goop.org \
--cc=jinsong.liu@alibaba-inc.com \
--cc=john.liuqiming@huawei.com \
--cc=josh.kearney@rackspace.com \
--cc=konrad@darnok.org \
--cc=lars.kurth@citrix.com \
--cc=liuyingdong@huawei.com \
--cc=major.hayden@rackspace.com \
--cc=msw@amazon.com \
--cc=paul.voccio@rackspace.com \
--cc=peter.huangpeng@huawei.com \
--cc=rick.harris@rackspace.com \
--cc=steven.wilson@rackspace.com \
--cc=xen-devel@lists.xenproject.org \
--cc=xiantao.zxt@alibaba-inc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).