From: Petr Mladek <pmladek@suse.com>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jessica Yu <jeyu@redhat.com>, Jiri Kosina <jikos@kernel.org>,
Miroslav Benes <mbenes@suse.cz>, Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Michael Ellerman <mpe@ellerman.id.au>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
live-patching@vger.kernel.org, linux-kernel@vger.kernel.org,
x86@kernel.org, linuxppc-dev@lists.ozlabs.org,
linux-s390@vger.kernel.org, Vojtech Pavlik <vojtech@suse.com>,
Jiri Slaby <jslaby@suse.cz>,
Chris J Arges <chris.j.arges@canonical.com>,
Andy Lutomirski <luto@kernel.org>
Subject: Re: klp_task_patch: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model
Date: Mon, 9 May 2016 14:23:03 +0200 [thread overview]
Message-ID: <20160509122303.GD2895@pathway.suse.cz> (raw)
In-Reply-To: <20160506123855.oabdeehqu3aj6nqy@treble>
On Fri 2016-05-06 07:38:55, Josh Poimboeuf wrote:
> On Thu, May 05, 2016 at 01:57:01PM +0200, Petr Mladek wrote:
> > I have missed that the two commands are called with preemption
> > disabled. So, I had the following crazy scenario in mind:
> >
> >
> > CPU0 CPU1
> >
> > klp_enable_patch()
> >
> > klp_target_state = KLP_PATCHED;
> >
> > for_each_task()
> > set TIF_PENDING_PATCH
> >
> > # task 123
> >
> > if (klp_patch_pending(current)
> > klp_patch_task(current)
> >
> > clear TIF_PENDING_PATCH
> >
> > smp_rmb();
> >
> > # switch to assembly of
> > # klp_patch_task()
> >
> > mov klp_target_state, %r12
> >
> > # interrupt and schedule
> > # another task
> >
> >
> > klp_reverse_transition();
> >
> > klp_target_state = KLP_UNPATCHED;
> >
> > klt_try_to_complete_transition()
> >
> > task = 123;
> > if (task->patch_state == klp_target_state;
> > return 0;
> >
> > => task 123 is in target state and does
> > not block conversion
> >
> > klp_complete_transition()
> >
> >
> > # disable previous patch on the stack
> > klp_disable_patch();
> >
> > klp_target_state = KLP_UNPATCHED;
> >
> >
> > # task 123 gets scheduled again
> > lea %r12, task->patch_state
> >
> > => it happily stores an outdated
> > state
> >
>
> Thanks for the clear explanation, this helps a lot.
>
> > This is why the two functions should get called with preemption
> > disabled. We should document it at least. I imagine that we will
> > use them later also in another context and nobody will remember
> > this crazy scenario.
> >
> > Well, even disabled preemption does not help. The process on
> > CPU1 might be also interrupted by an NMI and do some long
> > printk in it.
> >
> > IMHO, the only safe approach is to call klp_patch_task()
> > only for "current" on a safe place. Then this race is harmless.
> > The switch happen on a safe place, so that it does not matter
> > into which state the process is switched.
>
> I'm not sure about this solution. When klp_complete_transition() is
> called, we need all tasks to be patched, for good. We don't want any of
> them to randomly switch to the wrong state at some later time in the
> middle of a future patch operation. How would changing klp_patch_task()
> to only use "current" prevent that?
You are right that it is pity but it really should be safe because
it is not entirely random.
If the race happens and assign an outdated value, there are two
situations:
1. It is assigned when there is not transition in the progress.
Then it is OK because it will be ignored by the ftrace handler.
The right state will be set before the next transition starts.
2. It is assigned when some other transition is in progress.
Then it is OK as long as the function is called from "current".
The "wrong" state will be used consistently. It will switch
to the right state on another safe state.
> > By other words, the task state might be updated only
> >
> > + by the task itself on a safe place
> > + by other task when the updated on is sleeping on a safe place
> >
> > This should be well documented and the API should help to avoid
> > a misuse.
>
> I think we could fix it to be safe for future callers who might not have
> preemption disabled with a couple of changes to klp_patch_task():
> disabling preemption and testing/clearing the TIF_PATCH_PENDING flag
> before changing the patch state:
>
> void klp_patch_task(struct task_struct *task)
> {
> preempt_disable();
>
> if (test_and_clear_tsk_thread_flag(task, TIF_PATCH_PENDING))
> task->patch_state = READ_ONCE(klp_target_state);
>
> preempt_enable();
> }
It reduces the race window a bit but it is still there. For example,
NMI still might add a huge delay between reading klp_target_state
and assigning task->patch state.
What about the following?
/*
* This function might assign an outdated value if the transaction
`* is reverted and finalized in parallel. But it is safe. If the value
* is assigned outside of a transaction, it is ignored and the next
* transaction will set the right one. Or if it gets assigned
* inside another transaction, it will repeat the cycle and
* set the right state.
*/
void klp_update_current_patch_state()
{
while (test_and_clear_tsk_thread_flag(current, TIF_PATCH_PENDING))
current->patch_state = READ_ONCE(klp_target_state);
}
Note that the disabled preemption helped only partially,
so I think that it was not really needed.
Hmm, it means that the task->patch_state might be either
KLP_PATCHED or KLP_UNPATCHED outside a transition. I wonder
if the tristate really brings some advantages.
Alternatively, we might synchronize the operation with klp_mutex.
The function is called in a slow path and in a safe context.
Well, it might cause contention on the lock when many CPUs are
trying to update their tasks.
Best Regards,
Petr
next prev parent reply other threads:[~2016-05-09 12:23 UTC|newest]
Thread overview: 118+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-04-28 20:44 [RFC PATCH v2 00/18] livepatch: hybrid consistency model Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 01/18] x86/asm/head: clean up initial stack variable Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 02/18] x86/asm/head: use a common function for starting CPUs Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 03/18] x86/asm/head: standardize the bottom of the stack for idle tasks Josh Poimboeuf
2016-04-29 18:46 ` Brian Gerst
2016-04-29 20:28 ` Josh Poimboeuf
2016-04-29 19:39 ` Andy Lutomirski
2016-04-29 20:50 ` Josh Poimboeuf
2016-04-29 21:38 ` Andy Lutomirski
2016-04-29 23:27 ` Josh Poimboeuf
2016-04-30 0:10 ` Andy Lutomirski
2016-04-28 20:44 ` [RFC PATCH v2 04/18] x86: move _stext marker before head code Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 05/18] sched: add task flag for preempt IRQ tracking Josh Poimboeuf
2016-04-29 18:06 ` Andy Lutomirski
2016-04-29 20:11 ` Josh Poimboeuf
2016-04-29 20:19 ` Andy Lutomirski
2016-04-29 20:27 ` Josh Poimboeuf
2016-04-29 20:32 ` Andy Lutomirski
2016-04-29 21:25 ` Josh Poimboeuf
2016-04-29 21:37 ` Andy Lutomirski
2016-04-29 22:11 ` Jiri Kosina
2016-04-29 22:57 ` Josh Poimboeuf
2016-04-30 0:09 ` Andy Lutomirski
2016-04-29 22:41 ` Josh Poimboeuf
2016-04-30 0:08 ` Andy Lutomirski
2016-05-02 13:52 ` Josh Poimboeuf
2016-05-02 15:52 ` Andy Lutomirski
2016-05-02 17:31 ` Josh Poimboeuf
2016-05-02 18:12 ` Andy Lutomirski
2016-05-02 18:34 ` Ingo Molnar
2016-05-02 19:44 ` Josh Poimboeuf
2016-05-02 19:54 ` Jiri Kosina
2016-05-02 20:00 ` Jiri Kosina
2016-05-03 0:39 ` Andy Lutomirski
2016-05-04 15:16 ` David Laight
2016-05-19 23:15 ` Josh Poimboeuf
2016-05-19 23:39 ` Andy Lutomirski
2016-05-20 14:05 ` Josh Poimboeuf
2016-05-20 15:41 ` Andy Lutomirski
2016-05-20 16:41 ` Josh Poimboeuf
2016-05-20 16:59 ` Andy Lutomirski
2016-05-20 17:49 ` Josh Poimboeuf
2016-05-23 23:02 ` Jiri Kosina
2016-05-24 1:42 ` Andy Lutomirski
2016-05-23 21:34 ` Andy Lutomirski
2016-05-24 2:28 ` Josh Poimboeuf
2016-05-24 3:52 ` Andy Lutomirski
2016-06-22 16:30 ` Josh Poimboeuf
2016-06-22 17:59 ` Andy Lutomirski
2016-06-22 18:22 ` Josh Poimboeuf
2016-06-22 18:26 ` Andy Lutomirski
2016-06-22 18:40 ` Josh Poimboeuf
2016-06-22 19:17 ` Andy Lutomirski
2016-06-23 16:19 ` Josh Poimboeuf
2016-06-23 16:35 ` Andy Lutomirski
2016-06-23 18:31 ` Josh Poimboeuf
2016-06-23 20:40 ` Josh Poimboeuf
2016-06-23 22:00 ` Andy Lutomirski
2016-06-23 0:09 ` Andy Lutomirski
2016-06-23 15:55 ` Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 06/18] x86: dump_trace() error handling Josh Poimboeuf
2016-04-29 13:45 ` Minfei Huang
2016-04-29 14:00 ` Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 07/18] stacktrace/x86: function for detecting reliable stack traces Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 08/18] livepatch: temporary stubs for klp_patch_pending() and klp_patch_task() Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 09/18] livepatch/x86: add TIF_PATCH_PENDING thread flag Josh Poimboeuf
2016-04-29 18:08 ` Andy Lutomirski
2016-04-29 20:18 ` Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 10/18] livepatch/powerpc: " Josh Poimboeuf
2016-05-03 9:07 ` Petr Mladek
2016-05-03 12:06 ` Miroslav Benes
2016-04-28 20:44 ` [RFC PATCH v2 11/18] livepatch/s390: reorganize TIF thread flag bits Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 12/18] livepatch/s390: add TIF_PATCH_PENDING thread flag Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 13/18] livepatch: separate enabled and patched states Josh Poimboeuf
2016-05-03 9:30 ` Petr Mladek
2016-05-03 13:48 ` Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 14/18] livepatch: remove unnecessary object loaded check Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 15/18] livepatch: move patching functions into patch.c Josh Poimboeuf
2016-05-03 9:39 ` Petr Mladek
2016-04-28 20:44 ` [RFC PATCH v2 16/18] livepatch: store function sizes Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model Josh Poimboeuf
2016-05-04 8:42 ` Petr Mladek
2016-05-04 15:51 ` Josh Poimboeuf
2016-05-05 9:41 ` Miroslav Benes
2016-05-05 13:06 ` Petr Mladek
2016-05-04 12:39 ` barriers: was: " Petr Mladek
2016-05-04 13:53 ` Peter Zijlstra
2016-05-04 16:51 ` Josh Poimboeuf
2016-05-04 14:12 ` Petr Mladek
2016-05-04 17:25 ` Josh Poimboeuf
2016-05-05 11:21 ` Petr Mladek
2016-05-09 15:42 ` Miroslav Benes
2016-05-04 17:02 ` Josh Poimboeuf
2016-05-05 10:21 ` Petr Mladek
2016-05-04 14:48 ` klp_task_patch: " Petr Mladek
2016-05-04 14:56 ` Jiri Kosina
2016-05-04 17:57 ` Josh Poimboeuf
2016-05-05 11:57 ` Petr Mladek
2016-05-06 12:38 ` Josh Poimboeuf
2016-05-09 12:23 ` Petr Mladek [this message]
2016-05-16 18:12 ` Josh Poimboeuf
2016-05-18 13:12 ` Petr Mladek
2016-05-06 11:33 ` Petr Mladek
2016-05-06 12:44 ` Josh Poimboeuf
2016-05-09 9:41 ` Miroslav Benes
2016-05-16 17:27 ` Josh Poimboeuf
2016-05-10 11:39 ` Miroslav Benes
2016-05-17 22:53 ` Jessica Yu
2016-05-18 8:16 ` Jiri Kosina
2016-05-18 16:51 ` Josh Poimboeuf
2016-05-18 20:22 ` Jiri Kosina
2016-05-23 9:42 ` David Laight
2016-05-23 18:44 ` Jiri Kosina
2016-05-24 15:06 ` David Laight
2016-05-24 22:45 ` Jiri Kosina
2016-06-06 13:54 ` [RFC PATCH v2 17/18] " Petr Mladek
2016-06-06 14:29 ` Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 18/18] livepatch: add /proc/<pid>/patch_state Josh Poimboeuf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160509122303.GD2895@pathway.suse.cz \
--to=pmladek@suse.com \
--cc=chris.j.arges@canonical.com \
--cc=heiko.carstens@de.ibm.com \
--cc=jeyu@redhat.com \
--cc=jikos@kernel.org \
--cc=jpoimboe@redhat.com \
--cc=jslaby@suse.cz \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-s390@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=live-patching@vger.kernel.org \
--cc=luto@kernel.org \
--cc=mbenes@suse.cz \
--cc=mingo@redhat.com \
--cc=mpe@ellerman.id.au \
--cc=peterz@infradead.org \
--cc=vojtech@suse.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).