From: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
To: Vojtech Pavlik <vojtech@suse.cz>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>,
Christoph Hellwig <hch@infradead.org>,
Seth Jennings <sjenning@redhat.com>,
Jiri Kosina <jkosina@suse.cz>,
Steven Rostedt <rostedt@goodmis.org>,
live-patching@vger.kernel.org, kpatch@redhat.com,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2] Kernel Live Patching
Date: Fri, 14 Nov 2014 00:56:38 +0900 [thread overview]
Message-ID: <5464D4B6.5010808@hitachi.com> (raw)
In-Reply-To: <20141112214719.GA26809@suse.cz>
(2014/11/13 6:47), Vojtech Pavlik wrote:
> On Thu, Nov 13, 2014 at 02:33:24AM +0900, Masami Hiramatsu wrote:
>> Right. Consistency model is still same as kpatch. Btw, I think
>> we can just use the difference of consistency for classifying
>> the patches, since we have these classes, only limited combination
>> is meaningful.
>>
>>>> LEAVE_FUNCTION
>>>> LEAVE_PATCHED_SET
>>>> LEAVE_KERNEL
>>>>
>>>> SWITCH_FUNCTION
>>>> SWITCH_THREAD
>>>> SWITCH_KERNEL
>>
>> How about the below combination of consistent flags?
>>
>> <flags>
>> CONSISTENT_IN_THREAD - patching is consistent in a thread.
>> CONSISTENT_IN_TIME - patching is atomically done.
>>
>> <combination>
>> (none) - the 'null' mode? same as LEAVE_FUNCTION & SWITCH_FUNCTION
>>
>> CONSISTENT_IN_THREAD - kGraft mode. same as LEAVE_KERNEL & SWITCH_THREAD
>>
>> CONSISTENT_IN_TIME - kpatch mode. same as LEAVE_PATCHED_SET & SWITCH_KERNEL
>>
>> CONSISTENT_IN_THREAD|CONSISTENT_IN_TIME - CRIU mode. same as LEAVE_KERNEL & SWITCH_KERNEL
>
> The reason I tried to parametrize the consistency model in a more
> flexible and fine-grained manner than just describing the existing
> solutions was for the purpose of exploring whether any of the remaining
> combinations make sense.
I see. I don't mind the implementation of how to check the execution path.
I just considers that we need classify consistency requirements when
checking the "patch" itself (maybe by manual at first).
And since your classification seemed mixing the consistency and switching
timings, I thought we'd better split them into the consistency requirement
flags and implementation of safeness checking :)
Even if you can use refcounting with per-thread patching, it still switches
per-thread basis, inconsistent among threads.
> It allowed me to look at what value we're getting from the consistency
> models: Most importantly the ability to change function prototypes and
> still make calls work.
>
> For this, the minimum requirements are LEAVE_PATCHED_SET (what
> kpatch does) and SWITCH_THREAD (which is what kGraft does).
>
> Both kpatch and kGraft do more, but:
>
> I was able to show that LEAVE_KERNEL is unnecessary and any cases where
> it is beneficial can be augmented by just increasing the patched set.
>
> I believe at this point that SWITCH_KERNEL is unnecessary and that data or
> locking changes - the major benefit of switching at once can be done by
> shadowing/versioning of data structures, which is what both kpatch and
> kGraft had planned to do anyway.
>
> I haven't shown yet whether the strongest consistency (LEAVE_KERNEL +
> SWITCH_KERNEL) is possible at all. CRIU is close, but not necessarily
> doing quite that. It might be possible to just force processes to sleep
> at syscall entry one by one until all are asleep. Also the benefits of
> doing that are still unclear.
Of course, that is what kernel/freezer.c does :)
So, if you need to patch with the strongest consistency, you can freeze
them all.
>
> The goal is to find a consistency model that is best suited for the
> goals of both kpatch and kGraft: Reliably apply simple to
> mid-complexity kernel patches.
Same as me. I just sorted out the possible consistency requirements.
And I've thought that the key was "consistent in a context of each thread" or
"consistent at the moment among all threads but not in a context" or
"consistent in contexts of all threads". What would you think, any other
consistency model is there?
>> So, each patch requires consistency constrains flag and livepatch tool
>> chooses the mode based on the flag.
>>
>>>> So, I think the patch may be classified by following four types
>>>>
>>>> PATCH_FUNCTION - Patching per function. This ignores context, just
>>>> change the function.
>>>> User must ensure that the new function can co-exist
>>>> with old functions on the same context (e.g. recursive
>>>> call can cause inconsistency).
>>>>
>>>> PATCH_THREAD - Patching per thread. If a thread leave the kernel,
>>>> changes are applied for that thread.
>>>> User must ensure that the new functions can co-exist
>>>> with old functions per-thread. Inter-thread shared
>>>> data acquisition(locks) should not be involved.
>>>>
>>>> PATCH_KERNEL - Patching all threads. This wait for all threads leave the
>>>> all target functions.
>>>> User must ensure that the new functions can co-exist
>>>> with old functions on a thread (note that if there is a
>>>> loop, old one can be called first n times, and new one
>>>> can be called afterwords).(**)
>>>
>>> Yes, but only when the function calling it is not included in the
>>> patched set, which is only a problem for semantic changes accompanied by
>>> no change in the function prototyppe. This can be avoided by changing
>>> the prototype deliberately.
>>
>> Hmm, but what would you think about following simple case?
>>
>> ----
>> int func(int a) {
>> return a + 1;
>> }
>>
>> ...
>> b = 0;
>> for (i = 0; i < 10; i++)
>> b = func(b);
>> ...
>> ----
>> ----
>> int func(int a) {
>> return a + 2; /* Changed */
>> }
>>
>> ...
>> b = 0;
>> for (i = 0; i < 10; i++)
>> b = func(b);
>> ...
>> ----
>>
>> So, after the patch, "b" will be in a range of 10 to 20, not 10 or 20.
>> Of course CONSISTENT_IN_THREAD can ensure it should be 10 or 20 :)
>
> If you force a prototype change, eg by changing func() to an unsigned
> int, or simply add a parameter, the place where it is called from will
> also be changed and will be included in the patched set. (Or you can
> just include it manually in the set.)
Yes.
> Then, you can be sure that the place which calls func() is not on the
> stack when patching. This way, in your classification, PATCH_KERNEL can
> be as good as PATCH_THREAD. In my classification, I'm saying that
> LEAVE_PATCHED_SET is as good as LEAVE_KERNEL.
OK, but again, to be sure that, we need to dump stack for each kernel
as I did.
>>>> (*) Instead of checking stacks, at first, wait for all threads leaving
>>>> the kernel once, after that, wait for refcount becomes zero and switch
>>>> all the patched functions.
>>>
>>> This is a very beautiful idea.
>>>
>>> It does away with both the stack parsing and the kernel stopping,
>>> achieving kGraft's goals, while preserving kpatch's consistency model.
>>>
>>> Sadly, it combines the disadvantages of both kpatch and kGraft: From
>>> kpatch it takes the inability to patch functions where threads are
>>> sleeping often and as such never leave them at once. From kGraft it
>>> takes the need to annotate kernel threads and wake sleepers from
>>> userspace.
>>
>> But how frequently the former case happens? It seems very very rare.
>> And if we aim to enable both kpatch mode and kGraft mode in the kernel,
>> anyway we'll have something for the latter cases.
>
> The kpatch problem case isn't that rare. It just happened with a CVE in
> futexes recently. It will happen if you try to patch anything that is on
> the stack when a TTY or TCP read is waiting for data as another example.
Oh, I see. this should be solved then... perhaps, we can freeze those
tasks and thaw it again.
> The kGraft problem case will happen when you load a 3rd party module
> with a non-annotated kernel thread. Or a different problem will happen
> when you have an application sleeping that will exit when receiving any
> signal.
Ah, yes. especially latter case is serious. maybe freezer can handle
this too...
> Both the cases can be handled with tricks and workarounds. But it'd be
> much nicer to have a patching engine that is reliable.
>
>>> So while it is beautiful, it's less practical than either kpatch or
>>> kGraft alone.
>>
>> Ah, sorry for confusing, I don't tend to integrate kpatch and kGraft.
>> Actually, it is just about modifying kpatch, since it may shorten
>> stack-checking time.
>> This means that does not change the consistency model.
>> We certainly need both of kGraft mode and kpatch mode.
>
> What I'm proposing is a LEAVE_PATCHED_SET + SWITCH_THREAD mode. It's
> less consistency, but it is enough. And it is more reliable (likely to
> succeed in finite time) than either kpatch or kGraft.
Yeah, that is actual merge of kpatch and kGraft, and also can avoid
stop_machine (yes, that is important for me :)).
> It'd be mostly based on your refcounting code, including stack
> checking (when a process sleeps, counter gets set based on number of
> patched functions on the stack), possibly including setting the counter
> to 0 on syscall entry/exit, but it'd make the switch per-thread like
> kGraft does, not for the whole system, when the respective counters
> reach zero.
I'm not sure what happens if a process sleeps on the patched-set?
If we switch the other threads, when this sleeping thread wakes up
that will see the old functions (and old data). So I think we need
both SWITCH_THREAD and SWITCH_KERNEL options in that case.
What I'm thinking is to merge the code (technique) of both and
allow to choose the "switch-timing" based on the patch's consistency
requirement.
>
> This handles the frequent sleeper case, it doesn't need annotated kernel
> thread main loops, it will not need the user to wake up every process in
> the system unless it sleeps in a patched function.
>
> And it can handle all the patches that kpatch and kGraft can (it needs
> shadowing for some).
>
>>> Yes, this is what I call 'extending the patched set'. You can do that
>>> either by deliberately changing the prototype of the patched function
>>> being called, which causes the calling function to be considered
>>> different, or just add it to the set of functions considered manually.
>>
>> I'd prefer latter one :) or just gives hints of watching targets.
>
> Me too.
>
Anyway, I'd like to support for this effort from kernel side.
At least I have to solve ftrace regs conflict by IPMODIFY flag and
a headache kretprobe failure case by sharing per-thread retstack
with ftrace-callgraph.
Thank you,
--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
next prev parent reply other threads:[~2014-11-13 15:56 UTC|newest]
Thread overview: 73+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-11-06 14:39 [PATCH 0/2] Kernel Live Patching Seth Jennings
2014-11-06 14:39 ` [PATCH 1/2] kernel: add TAINT_LIVEPATCH Seth Jennings
2014-11-09 20:19 ` Greg KH
2014-11-11 14:54 ` Seth Jennings
2014-11-06 14:39 ` [PATCH 2/2] kernel: add support for live patching Seth Jennings
2014-11-06 15:11 ` Jiri Kosina
2014-11-06 16:20 ` Seth Jennings
2014-11-06 16:32 ` Josh Poimboeuf
2014-11-06 18:00 ` Vojtech Pavlik
2014-11-06 22:20 ` Jiri Kosina
2014-11-07 12:50 ` Josh Poimboeuf
2014-11-07 13:13 ` Jiri Kosina
2014-11-07 13:22 ` Josh Poimboeuf
2014-11-07 14:57 ` Seth Jennings
2014-11-06 15:51 ` Jiri Slaby
2014-11-06 16:57 ` Seth Jennings
2014-11-06 17:12 ` Josh Poimboeuf
2014-11-07 18:21 ` Petr Mladek
2014-11-07 20:31 ` Josh Poimboeuf
2014-11-30 12:23 ` Pavel Machek
2014-12-01 16:49 ` Seth Jennings
2014-11-06 20:02 ` Steven Rostedt
2014-11-06 20:19 ` Seth Jennings
2014-11-07 17:13 ` module notifier: was " Petr Mladek
2014-11-07 18:07 ` Seth Jennings
2014-11-07 18:40 ` Petr Mladek
2014-11-07 18:55 ` Seth Jennings
2014-11-11 19:40 ` Seth Jennings
2014-11-11 22:17 ` Jiri Kosina
2014-11-11 22:48 ` Seth Jennings
2014-11-07 17:39 ` more patches for the same func: " Petr Mladek
2014-11-07 21:54 ` Josh Poimboeuf
2014-11-07 19:40 ` Andy Lutomirski
2014-11-07 19:42 ` Seth Jennings
2014-11-07 19:52 ` Seth Jennings
2014-11-10 10:08 ` Jiri Kosina
2014-11-10 17:31 ` Josh Poimboeuf
2014-11-13 10:16 ` Miroslav Benes
2014-11-13 14:38 ` Josh Poimboeuf
2014-11-13 17:12 ` Seth Jennings
2014-11-14 13:30 ` Miroslav Benes
2014-11-14 14:52 ` Petr Mladek
2014-11-06 18:44 ` [PATCH 0/2] Kernel Live Patching Christoph Hellwig
2014-11-06 18:51 ` Vojtech Pavlik
2014-11-06 18:58 ` Christoph Hellwig
2014-11-06 19:34 ` Josh Poimboeuf
2014-11-06 19:49 ` Steven Rostedt
2014-11-06 20:02 ` Josh Poimboeuf
2014-11-07 7:46 ` Christoph Hellwig
2014-11-07 7:45 ` Christoph Hellwig
2014-11-06 20:24 ` Vojtech Pavlik
2014-11-07 7:47 ` Christoph Hellwig
2014-11-07 13:11 ` Josh Poimboeuf
2014-11-07 14:04 ` Vojtech Pavlik
2014-11-07 15:45 ` Josh Poimboeuf
2014-11-07 21:27 ` Vojtech Pavlik
2014-11-08 3:45 ` Josh Poimboeuf
2014-11-08 8:07 ` Vojtech Pavlik
2014-11-10 17:09 ` Josh Poimboeuf
2014-11-11 9:05 ` Vojtech Pavlik
2014-11-11 17:45 ` Josh Poimboeuf
2014-11-11 1:24 ` Masami Hiramatsu
2014-11-11 10:26 ` Vojtech Pavlik
2014-11-12 17:33 ` Masami Hiramatsu
2014-11-12 21:47 ` Vojtech Pavlik
2014-11-13 15:56 ` Masami Hiramatsu [this message]
2014-11-13 16:38 ` Vojtech Pavlik
2014-11-18 12:47 ` Petr Mladek
2014-11-18 18:58 ` Josh Poimboeuf
2014-11-07 12:31 ` Josh Poimboeuf
2014-11-07 12:48 ` Vojtech Pavlik
2014-11-07 13:06 ` Josh Poimboeuf
2014-11-09 20:16 ` Greg KH
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5464D4B6.5010808@hitachi.com \
--to=masami.hiramatsu.pt@hitachi.com \
--cc=hch@infradead.org \
--cc=jkosina@suse.cz \
--cc=jpoimboe@redhat.com \
--cc=kpatch@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=live-patching@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=sjenning@redhat.com \
--cc=vojtech@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).