public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] static_key: fix concurrent static_key_slow_inc
@ 2016-06-21 16:52 Paolo Bonzini
  2016-06-21 19:20 ` Peter Zijlstra
  2016-06-22  8:50 ` Christian Borntraeger
  0 siblings, 2 replies; 4+ messages in thread
From: Paolo Bonzini @ 2016-06-21 16:52 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: stable, Peter Zijlstra, Ingo Molnar

The following scenario is possible:

    CPU 1                                   CPU 2
    static_key_slow_inc
     atomic_inc_not_zero
      -> key.enabled == 0, no increment
     jump_label_lock
     atomic_inc_return
      -> key.enabled == 1 now
                                            static_key_slow_inc
                                             atomic_inc_not_zero
                                              -> key.enabled == 1, inc to 2
                                             return
                                            ** static key is wrong!
     jump_label_update
     jump_label_unlock

Testing the static key at the point marked by (**) will follow the wrong
path for jumps that have not been patched yet.  This can actually happen
when creating many KVM virtual machines with userspace LAPIC emulation;
just run several copies of the following program:

    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    int main(void)
    {
        for (;;) {
            int kvmfd = open("/dev/kvm", O_RDONLY);
            int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
            close(ioctl(vmfd, KVM_CREATE_VCPU, 1));
            close(vmfd);
            close(kvmfd);
        }
        return 0;
    }

Every KVM_CREATE_VCPU ioctl will attempt a static_key_slow_inc.  The
static key's purpose is to skip NULL pointer checks and indeed one of
the processes eventually dereferences NULL.

As explained in the commit that introduced the bug (which is 706249c222f6,
"locking/static_keys: Rework update logic", 2015-07-24), jump_label_update
needs key.enabled to be true.  The solution adopted here is to temporarily
make key.enabled == -1, and use go down the slow path when key.enabled
<= 0.

Cc: stable@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: 706249c222f6
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
	So it wasn't as easy as I thought :) and this patch is
	a bit clumsy, but I cannot think of anything better.

 include/linux/jump_label.h | 16 +++++++++++++---
 kernel/jump_label.c        | 34 +++++++++++++++++++++++++++++++---
 2 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index 0536524bb9eb..4bed5cefdaa9 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -117,13 +117,18 @@ struct module;
 
 #include <linux/atomic.h>
 
+#ifdef HAVE_JUMP_LABEL
+
 static inline int static_key_count(struct static_key *key)
 {
-	return atomic_read(&key->enabled);
+	/*
+	 * -1 means the first static_key_slow_inc is in progress.
+	 *  static_key_enabled must return true, so return 1 here.
+	 */
+	int n = atomic_read(&key->enabled);
+	return n >= 0 ? n : 1;
 }
 
-#ifdef HAVE_JUMP_LABEL
-
 #define JUMP_TYPE_FALSE	0UL
 #define JUMP_TYPE_TRUE	1UL
 #define JUMP_TYPE_MASK	1UL
@@ -162,6 +167,11 @@ extern void jump_label_apply_nops(struct module *mod);
 
 #else  /* !HAVE_JUMP_LABEL */
 
+static inline int static_key_count(struct static_key *key)
+{
+	return atomic_read(&key->enabled);
+}
+
 static __always_inline void jump_label_init(void)
 {
 	static_key_initialized = true;
diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index 05254eeb4b4e..1ce0663bfa33 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -58,13 +58,34 @@ static void jump_label_update(struct static_key *key);
 
 void static_key_slow_inc(struct static_key *key)
 {
+	int v, v1;
 	STATIC_KEY_CHECK_USE();
-	if (atomic_inc_not_zero(&key->enabled))
-		return;
+
+	/*
+	 * Careful if we get concurrent static_key_slow_inc calls;
+	 * later calls must wait for the first one to _finish_ the
+	 * jump_label_update process.  At the same time, however,
+	 * the jump_label_update call below wants to see
+	 * static_key_enabled(&key) for jumps to be updated properly.
+	 *
+	 * So give a special meaning to negative key->enabled: it sends
+	 * static_key_slow_inc down the slow path, and it is non-zero
+	 * so it counts as "enabled" in jump_label_update.  Note that
+	 * atomic_inc_unless_negative checks >= 0, so roll our own.
+	 */
+	for (v = atomic_read(&key->enabled); v > 0; v = v1) {
+		v1 = atomic_cmpxchg(&key->enabled, v, v + 1);
+		if (likely(v1 == v))
+			return;
+        }
 
 	jump_label_lock();
-	if (atomic_inc_return(&key->enabled) == 1)
+	if (atomic_read(&key->enabled) == 0) {
+		atomic_set(&key->enabled, -1);
 		jump_label_update(key);
+		atomic_set(&key->enabled, 1);
+	} else
+		atomic_inc(&key->enabled);
 	jump_label_unlock();
 }
 EXPORT_SYMBOL_GPL(static_key_slow_inc);
@@ -72,6 +93,13 @@ EXPORT_SYMBOL_GPL(static_key_slow_inc);
 static void __static_key_slow_dec(struct static_key *key,
 		unsigned long rate_limit, struct delayed_work *work)
 {
+	/*
+	 * The negative count check is valid even when a negative
+	 * key->enabled is in use by static_key_slow_inc; a
+	 * __static_key_slow_dec before the first static_key_slow_inc
+	 * returns is unbalanced, because all other static_key_slow_inc
+	 * block while the update is in progress.
+	 */
 	if (!atomic_dec_and_mutex_lock(&key->enabled, &jump_label_mutex)) {
 		WARN(atomic_read(&key->enabled) < 0,
 		     "jump label: negative count!\n");
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] static_key: fix concurrent static_key_slow_inc
  2016-06-21 16:52 [PATCH] static_key: fix concurrent static_key_slow_inc Paolo Bonzini
@ 2016-06-21 19:20 ` Peter Zijlstra
  2016-06-22  8:50 ` Christian Borntraeger
  1 sibling, 0 replies; 4+ messages in thread
From: Peter Zijlstra @ 2016-06-21 19:20 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, stable, Ingo Molnar

On Tue, Jun 21, 2016 at 06:52:17PM +0200, Paolo Bonzini wrote:
> The following scenario is possible:
> 
>     CPU 1                                   CPU 2
>     static_key_slow_inc
>      atomic_inc_not_zero
>       -> key.enabled == 0, no increment
>      jump_label_lock
>      atomic_inc_return
>       -> key.enabled == 1 now
>                                             static_key_slow_inc
>                                              atomic_inc_not_zero
>                                               -> key.enabled == 1, inc to 2
>                                              return
>                                             ** static key is wrong!
>      jump_label_update
>      jump_label_unlock
> 
> Testing the static key at the point marked by (**) will follow the wrong
> path for jumps that have not been patched yet.  This can actually happen
> when creating many KVM virtual machines with userspace LAPIC emulation;
> just run several copies of the following program:
> 
>     #include <fcntl.h>
>     #include <unistd.h>
>     #include <sys/ioctl.h>
>     #include <linux/kvm.h>
> 
>     int main(void)
>     {
>         for (;;) {
>             int kvmfd = open("/dev/kvm", O_RDONLY);
>             int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
>             close(ioctl(vmfd, KVM_CREATE_VCPU, 1));
>             close(vmfd);
>             close(kvmfd);
>         }
>         return 0;
>     }
> 
> Every KVM_CREATE_VCPU ioctl will attempt a static_key_slow_inc.  The
> static key's purpose is to skip NULL pointer checks and indeed one of
> the processes eventually dereferences NULL.
> 
> As explained in the commit that introduced the bug (which is 706249c222f6,
> "locking/static_keys: Rework update logic", 2015-07-24), jump_label_update
> needs key.enabled to be true.  The solution adopted here is to temporarily
> make key.enabled == -1, and use go down the slow path when key.enabled
> <= 0.

Thanks!

(I frobbed a whitespace fail and fixed the Fixes line).

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] static_key: fix concurrent static_key_slow_inc
  2016-06-21 16:52 [PATCH] static_key: fix concurrent static_key_slow_inc Paolo Bonzini
  2016-06-21 19:20 ` Peter Zijlstra
@ 2016-06-22  8:50 ` Christian Borntraeger
  2016-06-22  9:52   ` Paolo Bonzini
  1 sibling, 1 reply; 4+ messages in thread
From: Christian Borntraeger @ 2016-06-22  8:50 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm; +Cc: stable, Peter Zijlstra, Ingo Molnar

On 06/21/2016 06:52 PM, Paolo Bonzini wrote:
> The following scenario is possible:
> 
>     CPU 1                                   CPU 2
>     static_key_slow_inc
>      atomic_inc_not_zero
>       -> key.enabled == 0, no increment
>      jump_label_lock
>      atomic_inc_return
>       -> key.enabled == 1 now
>                                             static_key_slow_inc
>                                              atomic_inc_not_zero
>                                               -> key.enabled == 1, inc to 2
>                                              return
>                                             ** static key is wrong!
>      jump_label_update
>      jump_label_unlock
> 
> Testing the static key at the point marked by (**) will follow the wrong
> path for jumps that have not been patched yet.  This can actually happen
> when creating many KVM virtual machines with userspace LAPIC emulation;
> just run several copies of the following program:
> 
>     #include <fcntl.h>
>     #include <unistd.h>
>     #include <sys/ioctl.h>
>     #include <linux/kvm.h>
> 
>     int main(void)
>     {
>         for (;;) {
>             int kvmfd = open("/dev/kvm", O_RDONLY);
>             int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
>             close(ioctl(vmfd, KVM_CREATE_VCPU, 1));
>             close(vmfd);
>             close(kvmfd);
>         }
>         return 0;
>     }
> 
> Every KVM_CREATE_VCPU ioctl will attempt a static_key_slow_inc.  The
> static key's purpose is to skip NULL pointer checks and indeed one of
> the processes eventually dereferences NULL.


Interesting. Some time ago I had a spurious bug on the preempt_notifier
when starting/stopping lots of guests, but I was never able to reliably 
reproduce it. I was chasing some other bug, so I did not even considered
static_key to be broken, but this might actually be the fix for that
problem.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] static_key: fix concurrent static_key_slow_inc
  2016-06-22  8:50 ` Christian Borntraeger
@ 2016-06-22  9:52   ` Paolo Bonzini
  0 siblings, 0 replies; 4+ messages in thread
From: Paolo Bonzini @ 2016-06-22  9:52 UTC (permalink / raw)
  To: Christian Borntraeger, linux-kernel, kvm
  Cc: stable, Peter Zijlstra, Ingo Molnar



On 22/06/2016 10:50, Christian Borntraeger wrote:
> On 06/21/2016 06:52 PM, Paolo Bonzini wrote:
>> The following scenario is possible:
>>
>>     CPU 1                                   CPU 2
>>     static_key_slow_inc
>>      atomic_inc_not_zero
>>       -> key.enabled == 0, no increment
>>      jump_label_lock
>>      atomic_inc_return
>>       -> key.enabled == 1 now
>>                                             static_key_slow_inc
>>                                              atomic_inc_not_zero
>>                                               -> key.enabled == 1, inc to 2
>>                                              return
>>                                             ** static key is wrong!
>>      jump_label_update
>>      jump_label_unlock
>>
>> Testing the static key at the point marked by (**) will follow the wrong
>> path for jumps that have not been patched yet.  This can actually happen
>> when creating many KVM virtual machines with userspace LAPIC emulation;
>> just run several copies of the following program:
>>
>>     #include <fcntl.h>
>>     #include <unistd.h>
>>     #include <sys/ioctl.h>
>>     #include <linux/kvm.h>
>>
>>     int main(void)
>>     {
>>         for (;;) {
>>             int kvmfd = open("/dev/kvm", O_RDONLY);
>>             int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
>>             close(ioctl(vmfd, KVM_CREATE_VCPU, 1));
>>             close(vmfd);
>>             close(kvmfd);
>>         }
>>         return 0;
>>     }
>>
>> Every KVM_CREATE_VCPU ioctl will attempt a static_key_slow_inc.  The
>> static key's purpose is to skip NULL pointer checks and indeed one of
>> the processes eventually dereferences NULL.
> 
> Interesting. Some time ago I had a spurious bug on the preempt_notifier
> when starting/stopping lots of guests, but I was never able to reliably 
> reproduce it. I was chasing some other bug, so I did not even considered
> static_key to be broken, but this might actually be the fix for that
> problem.

It could be the same that was reported here:
http://article.gmane.org/gmane.comp.emulators.kvm.devel/154069

Paolo

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-06-22  9:52 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-06-21 16:52 [PATCH] static_key: fix concurrent static_key_slow_inc Paolo Bonzini
2016-06-21 19:20 ` Peter Zijlstra
2016-06-22  8:50 ` Christian Borntraeger
2016-06-22  9:52   ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox