From: Petr Mladek <pmladek@suse.com>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jessica Yu <jeyu@redhat.com>, Jiri Kosina <jikos@kernel.org>,
Miroslav Benes <mbenes@suse.cz>, Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Michael Ellerman <mpe@ellerman.id.au>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
live-patching@vger.kernel.org, linux-kernel@vger.kernel.org,
x86@kernel.org, linuxppc-dev@lists.ozlabs.org,
linux-s390@vger.kernel.org, Vojtech Pavlik <vojtech@suse.com>,
Jiri Slaby <jslaby@suse.cz>,
Chris J Arges <chris.j.arges@canonical.com>,
Andy Lutomirski <luto@kernel.org>
Subject: Re: klp_task_patch: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model
Date: Wed, 18 May 2016 15:12:18 +0200 [thread overview]
Message-ID: <20160518131218.GF21055@pathway.suse.cz> (raw)
In-Reply-To: <20160516181250.k3bqqk4hd3dyxldf@treble>
On Mon 2016-05-16 13:12:50, Josh Poimboeuf wrote:
> On Mon, May 09, 2016 at 02:23:03PM +0200, Petr Mladek wrote:
> > On Fri 2016-05-06 07:38:55, Josh Poimboeuf wrote:
> > > On Thu, May 05, 2016 at 01:57:01PM +0200, Petr Mladek wrote:
> > > > I have missed that the two commands are called with preemption
> > > > disabled. So, I had the following crazy scenario in mind:
> > > >
> > > >
> > > > CPU0 CPU1
> > > >
> > > > klp_enable_patch()
> > > >
> > > > klp_target_state = KLP_PATCHED;
> > > >
> > > > for_each_task()
> > > > set TIF_PENDING_PATCH
> > > >
> > > > # task 123
> > > >
> > > > if (klp_patch_pending(current)
> > > > klp_patch_task(current)
> > > >
> > > > clear TIF_PENDING_PATCH
> > > >
> > > > smp_rmb();
> > > >
> > > > # switch to assembly of
> > > > # klp_patch_task()
> > > >
> > > > mov klp_target_state, %r12
> > > >
> > > > # interrupt and schedule
> > > > # another task
> > > >
> > > >
> > > > klp_reverse_transition();
> > > >
> > > > klp_target_state = KLP_UNPATCHED;
> > > >
> > > > klt_try_to_complete_transition()
> > > >
> > > > task = 123;
> > > > if (task->patch_state == klp_target_state;
> > > > return 0;
> > > >
> > > > => task 123 is in target state and does
> > > > not block conversion
> > > >
> > > > klp_complete_transition()
> > > >
> > > >
> > > > # disable previous patch on the stack
> > > > klp_disable_patch();
> > > >
> > > > klp_target_state = KLP_UNPATCHED;
> > > >
> > > >
> > > > # task 123 gets scheduled again
> > > > lea %r12, task->patch_state
> > > >
> > > > => it happily stores an outdated
> > > > state
> > > >
> > >
> > > Thanks for the clear explanation, this helps a lot.
> > >
> > > > This is why the two functions should get called with preemption
> > > > disabled. We should document it at least. I imagine that we will
> > > > use them later also in another context and nobody will remember
> > > > this crazy scenario.
> > > >
> > > > Well, even disabled preemption does not help. The process on
> > > > CPU1 might be also interrupted by an NMI and do some long
> > > > printk in it.
> > > >
> > > > IMHO, the only safe approach is to call klp_patch_task()
> > > > only for "current" on a safe place. Then this race is harmless.
> > > > The switch happen on a safe place, so that it does not matter
> > > > into which state the process is switched.
> > >
> > > I'm not sure about this solution. When klp_complete_transition() is
> > > called, we need all tasks to be patched, for good. We don't want any of
> > > them to randomly switch to the wrong state at some later time in the
> > > middle of a future patch operation. How would changing klp_patch_task()
> > > to only use "current" prevent that?
> >
> > You are right that it is pity but it really should be safe because
> > it is not entirely random.
> >
> > If the race happens and assign an outdated value, there are two
> > situations:
> >
> > 1. It is assigned when there is not transition in the progress.
> > Then it is OK because it will be ignored by the ftrace handler.
> > The right state will be set before the next transition starts.
> >
> > 2. It is assigned when some other transition is in progress.
> > Then it is OK as long as the function is called from "current".
> > The "wrong" state will be used consistently. It will switch
> > to the right state on another safe state.
>
> Maybe it would be safe, though I'm not entirely convinced. Regardless I
> think we should avoid these situations entirely because they create
> windows for future bugs and races.
Yup, I would prefer a cleaner solution as well.
> > > > By other words, the task state might be updated only
> > > >
> > > > + by the task itself on a safe place
> > > > + by other task when the updated on is sleeping on a safe place
> > > >
> > > > This should be well documented and the API should help to avoid
> > > > a misuse.
> > >
> > > I think we could fix it to be safe for future callers who might not have
> > > preemption disabled with a couple of changes to klp_patch_task():
> > > disabling preemption and testing/clearing the TIF_PATCH_PENDING flag
> > > before changing the patch state:
> > >
> > > void klp_patch_task(struct task_struct *task)
> > > {
> > > preempt_disable();
> > >
> > > if (test_and_clear_tsk_thread_flag(task, TIF_PATCH_PENDING))
> > > task->patch_state = READ_ONCE(klp_target_state);
> > >
> > > preempt_enable();
> > > }
> >
> > It reduces the race window a bit but it is still there. For example,
> > NMI still might add a huge delay between reading klp_target_state
> > and assigning task->patch state.
>
> Maybe you missed this paragraph from my last email:
>
> | We would also need a synchronize_sched() after the patching is complete,
> | either at the end of klp_try_complete_transition() or in
> | klp_complete_transition(). That would make sure that all existing calls
> | to klp_patch_task() are done.
>
> So a huge NMI delay wouldn't be a problem here. The call to
> synchronize_sched() in klp_complete_transition() would sleep until the
> NMI handler returns and the critical section of klp_patch_task()
> finishes. So once a patch is complete, we know that it's really
> complete.
Yes, synchronize_sched() will help with the premeption disabled. I did
not shake my head enough last time.
> > What about the following?
> >
> > /*
> > * This function might assign an outdated value if the transaction
> > `* is reverted and finalized in parallel. But it is safe. If the value
> > * is assigned outside of a transaction, it is ignored and the next
> > * transaction will set the right one. Or if it gets assigned
> > * inside another transaction, it will repeat the cycle and
> > * set the right state.
> > */
> > void klp_update_current_patch_state()
> > {
> > while (test_and_clear_tsk_thread_flag(current, TIF_PATCH_PENDING))
> > current->patch_state = READ_ONCE(klp_target_state);
> > }
>
> I'm not sure how this would work. How would the thread flag get set
> again after it's been cleared?
See the race described in the previous mail. The problem is when the
target_state and the TIF flags gets set after reading klp_target_state
into a register and before storing the value into current->patch_state.
We do not need this if use the synchronize_sched() and fix up
current->patch_state then.
> Also I really don't like the idea of randomly updating a task's patch
> state after the transition has been completed.
>
> > Note that the disabled preemption helped only partially,
> > so I think that it was not really needed.
> >
> > Hmm, it means that the task->patch_state might be either
> > KLP_PATCHED or KLP_UNPATCHED outside a transition. I wonder
> > if the tristate really brings some advantages.
> >
> >
> > Alternatively, we might synchronize the operation with klp_mutex.
> > The function is called in a slow path and in a safe context.
> > Well, it might cause contention on the lock when many CPUs are
> > trying to update their tasks.
>
> I don't think a mutex would work because at least the ftrace handler
> (and maybe more) can't sleep. Maybe a spinlock could work but I think
> that would be overkill.
Sure, I had a spinlock in mind.
Best Regards,
Petr
next prev parent reply other threads:[~2016-05-18 13:12 UTC|newest]
Thread overview: 118+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-04-28 20:44 [RFC PATCH v2 00/18] livepatch: hybrid consistency model Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 01/18] x86/asm/head: clean up initial stack variable Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 02/18] x86/asm/head: use a common function for starting CPUs Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 03/18] x86/asm/head: standardize the bottom of the stack for idle tasks Josh Poimboeuf
2016-04-29 18:46 ` Brian Gerst
2016-04-29 20:28 ` Josh Poimboeuf
2016-04-29 19:39 ` Andy Lutomirski
2016-04-29 20:50 ` Josh Poimboeuf
2016-04-29 21:38 ` Andy Lutomirski
2016-04-29 23:27 ` Josh Poimboeuf
2016-04-30 0:10 ` Andy Lutomirski
2016-04-28 20:44 ` [RFC PATCH v2 04/18] x86: move _stext marker before head code Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 05/18] sched: add task flag for preempt IRQ tracking Josh Poimboeuf
2016-04-29 18:06 ` Andy Lutomirski
2016-04-29 20:11 ` Josh Poimboeuf
2016-04-29 20:19 ` Andy Lutomirski
2016-04-29 20:27 ` Josh Poimboeuf
2016-04-29 20:32 ` Andy Lutomirski
2016-04-29 21:25 ` Josh Poimboeuf
2016-04-29 21:37 ` Andy Lutomirski
2016-04-29 22:11 ` Jiri Kosina
2016-04-29 22:57 ` Josh Poimboeuf
2016-04-30 0:09 ` Andy Lutomirski
2016-04-29 22:41 ` Josh Poimboeuf
2016-04-30 0:08 ` Andy Lutomirski
2016-05-02 13:52 ` Josh Poimboeuf
2016-05-02 15:52 ` Andy Lutomirski
2016-05-02 17:31 ` Josh Poimboeuf
2016-05-02 18:12 ` Andy Lutomirski
2016-05-02 18:34 ` Ingo Molnar
2016-05-02 19:44 ` Josh Poimboeuf
2016-05-02 19:54 ` Jiri Kosina
2016-05-02 20:00 ` Jiri Kosina
2016-05-03 0:39 ` Andy Lutomirski
2016-05-04 15:16 ` David Laight
2016-05-19 23:15 ` Josh Poimboeuf
2016-05-19 23:39 ` Andy Lutomirski
2016-05-20 14:05 ` Josh Poimboeuf
2016-05-20 15:41 ` Andy Lutomirski
2016-05-20 16:41 ` Josh Poimboeuf
2016-05-20 16:59 ` Andy Lutomirski
2016-05-20 17:49 ` Josh Poimboeuf
2016-05-23 23:02 ` Jiri Kosina
2016-05-24 1:42 ` Andy Lutomirski
2016-05-23 21:34 ` Andy Lutomirski
2016-05-24 2:28 ` Josh Poimboeuf
2016-05-24 3:52 ` Andy Lutomirski
2016-06-22 16:30 ` Josh Poimboeuf
2016-06-22 17:59 ` Andy Lutomirski
2016-06-22 18:22 ` Josh Poimboeuf
2016-06-22 18:26 ` Andy Lutomirski
2016-06-22 18:40 ` Josh Poimboeuf
2016-06-22 19:17 ` Andy Lutomirski
2016-06-23 16:19 ` Josh Poimboeuf
2016-06-23 16:35 ` Andy Lutomirski
2016-06-23 18:31 ` Josh Poimboeuf
2016-06-23 20:40 ` Josh Poimboeuf
2016-06-23 22:00 ` Andy Lutomirski
2016-06-23 0:09 ` Andy Lutomirski
2016-06-23 15:55 ` Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 06/18] x86: dump_trace() error handling Josh Poimboeuf
2016-04-29 13:45 ` Minfei Huang
2016-04-29 14:00 ` Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 07/18] stacktrace/x86: function for detecting reliable stack traces Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 08/18] livepatch: temporary stubs for klp_patch_pending() and klp_patch_task() Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 09/18] livepatch/x86: add TIF_PATCH_PENDING thread flag Josh Poimboeuf
2016-04-29 18:08 ` Andy Lutomirski
2016-04-29 20:18 ` Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 10/18] livepatch/powerpc: " Josh Poimboeuf
2016-05-03 9:07 ` Petr Mladek
2016-05-03 12:06 ` Miroslav Benes
2016-04-28 20:44 ` [RFC PATCH v2 11/18] livepatch/s390: reorganize TIF thread flag bits Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 12/18] livepatch/s390: add TIF_PATCH_PENDING thread flag Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 13/18] livepatch: separate enabled and patched states Josh Poimboeuf
2016-05-03 9:30 ` Petr Mladek
2016-05-03 13:48 ` Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 14/18] livepatch: remove unnecessary object loaded check Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 15/18] livepatch: move patching functions into patch.c Josh Poimboeuf
2016-05-03 9:39 ` Petr Mladek
2016-04-28 20:44 ` [RFC PATCH v2 16/18] livepatch: store function sizes Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model Josh Poimboeuf
2016-05-04 8:42 ` Petr Mladek
2016-05-04 15:51 ` Josh Poimboeuf
2016-05-05 9:41 ` Miroslav Benes
2016-05-05 13:06 ` Petr Mladek
2016-05-04 12:39 ` barriers: was: " Petr Mladek
2016-05-04 13:53 ` Peter Zijlstra
2016-05-04 16:51 ` Josh Poimboeuf
2016-05-04 14:12 ` Petr Mladek
2016-05-04 17:25 ` Josh Poimboeuf
2016-05-05 11:21 ` Petr Mladek
2016-05-09 15:42 ` Miroslav Benes
2016-05-04 17:02 ` Josh Poimboeuf
2016-05-05 10:21 ` Petr Mladek
2016-05-04 14:48 ` klp_task_patch: " Petr Mladek
2016-05-04 14:56 ` Jiri Kosina
2016-05-04 17:57 ` Josh Poimboeuf
2016-05-05 11:57 ` Petr Mladek
2016-05-06 12:38 ` Josh Poimboeuf
2016-05-09 12:23 ` Petr Mladek
2016-05-16 18:12 ` Josh Poimboeuf
2016-05-18 13:12 ` Petr Mladek [this message]
2016-05-06 11:33 ` Petr Mladek
2016-05-06 12:44 ` Josh Poimboeuf
2016-05-09 9:41 ` Miroslav Benes
2016-05-16 17:27 ` Josh Poimboeuf
2016-05-10 11:39 ` Miroslav Benes
2016-05-17 22:53 ` Jessica Yu
2016-05-18 8:16 ` Jiri Kosina
2016-05-18 16:51 ` Josh Poimboeuf
2016-05-18 20:22 ` Jiri Kosina
2016-05-23 9:42 ` David Laight
2016-05-23 18:44 ` Jiri Kosina
2016-05-24 15:06 ` David Laight
2016-05-24 22:45 ` Jiri Kosina
2016-06-06 13:54 ` [RFC PATCH v2 17/18] " Petr Mladek
2016-06-06 14:29 ` Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 18/18] livepatch: add /proc/<pid>/patch_state Josh Poimboeuf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160518131218.GF21055@pathway.suse.cz \
--to=pmladek@suse.com \
--cc=chris.j.arges@canonical.com \
--cc=heiko.carstens@de.ibm.com \
--cc=jeyu@redhat.com \
--cc=jikos@kernel.org \
--cc=jpoimboe@redhat.com \
--cc=jslaby@suse.cz \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-s390@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=live-patching@vger.kernel.org \
--cc=luto@kernel.org \
--cc=mbenes@suse.cz \
--cc=mingo@redhat.com \
--cc=mpe@ellerman.id.au \
--cc=peterz@infradead.org \
--cc=vojtech@suse.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox