From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751968AbcFVIuT (ORCPT ); Wed, 22 Jun 2016 04:50:19 -0400
Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:56502
	"EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751879AbcFVIuO (ORCPT );
	Wed, 22 Jun 2016 04:50:14 -0400
X-IBM-Helo: d06dlp02.portsmouth.uk.ibm.com
X-IBM-MailFrom: borntraeger@de.ibm.com
X-IBM-RcptTo: kvm@vger.kernel.org;linux-kernel@vger.kernel.org;stable@vger.kernel.org
Subject: Re: [PATCH] static_key: fix concurrent static_key_slow_inc
To: Paolo Bonzini , linux-kernel@vger.kernel.org, kvm@vger.kernel.org
References: <1466527937-69798-1-git-send-email-pbonzini@redhat.com>
Cc: stable@vger.kernel.org, Peter Zijlstra , Ingo Molnar
From: Christian Borntraeger
Date: Wed, 22 Jun 2016 10:50:00 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0
MIME-Version: 1.0
In-Reply-To: <1466527937-69798-1-git-send-email-pbonzini@redhat.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-TM-AS-MML: disable
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 16062208-0020-0000-0000-000001C657E5
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 16062208-0021-0000-0000-00001C79F937
Message-Id: <576A5138.8040604@de.ibm.com>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2016-06-22_07:,, signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1604210000 definitions=main-1606220094
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On 06/21/2016 06:52 PM, Paolo Bonzini wrote:
> The following scenario is possible:
>
>     CPU 1                                   CPU 2
>     static_key_slow_inc
>      atomic_inc_not_zero
>       -> key.enabled == 0, no increment
>      jump_label_lock
>      atomic_inc_return
>       -> key.enabled == 1 now
>                                             static_key_slow_inc
>                                              atomic_inc_not_zero
>                                               -> key.enabled == 1, inc to 2
>                                              return
>                                             ** static key is wrong!
>      jump_label_update
>      jump_label_unlock
>
> Testing the static key at the point marked by (**) will follow the wrong
> path for jumps that have not been patched yet.  This can actually happen
> when creating many KVM virtual machines with userspace LAPIC emulation;
> just run several copies of the following program:
>
>     #include <fcntl.h>
>     #include <unistd.h>
>     #include <sys/ioctl.h>
>     #include <linux/kvm.h>
>
>     int main(void)
>     {
>         for (;;) {
>             int kvmfd = open("/dev/kvm", O_RDONLY);
>             int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
>             close(ioctl(vmfd, KVM_CREATE_VCPU, 1));
>             close(vmfd);
>             close(kvmfd);
>         }
>         return 0;
>     }
>
> Every KVM_CREATE_VCPU ioctl will attempt a static_key_slow_inc.  The
> static key's purpose is to skip NULL pointer checks and indeed one of
> the processes eventually dereferences NULL.

Interesting. Some time ago I had a spurious bug on the preempt_notifier
when starting/stopping lots of guests, but I was never able to reliably
reproduce it. I was chasing some other bug, so I did not even consider
that static_key might be broken, but this might actually be the fix for
that problem.
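
For reference, the interleaving from the quoted changelog can be replayed
deterministically in user space. The following is only a sketch and not
kernel code: model_key, model_inc_not_zero, replay_result and replay_race
are hypothetical names, the "patched" flag stands in for the jump sites
having been rewritten by jump_label_update, and the two CPUs are replayed
one step at a time in a single thread rather than actually raced. The
jump_label_lock/unlock pair is not modeled since only one thread runs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct static_key. */
struct model_key {
	atomic_int enabled;	/* reference count, like key.enabled */
	bool patched;		/* true once the jump sites are rewritten */
};

/* What an observer sees at the point marked (**) in the changelog. */
struct replay_result {
	int enabled_at_window;
	bool patched_at_window;
};

/* Mimics atomic_inc_not_zero(): increment unless the value is 0. */
static bool model_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Replays the quoted CPU 1 / CPU 2 interleaving step by step. */
static struct replay_result replay_race(void)
{
	struct model_key key;
	struct replay_result res;
	bool fast;

	atomic_init(&key.enabled, 0);
	key.patched = false;

	/* CPU 1: fast path fails because enabled == 0. */
	fast = model_inc_not_zero(&key.enabled);
	assert(!fast);

	/* CPU 1: slow path, atomic_inc_return makes enabled == 1. */
	atomic_fetch_add(&key.enabled, 1);

	/* CPU 2: fast path now succeeds (1 -> 2) and returns at once. */
	fast = model_inc_not_zero(&key.enabled);
	assert(fast);
	(void)fast;

	/* (**) count says "enabled", but nothing has been patched yet. */
	res.enabled_at_window = atomic_load(&key.enabled);
	res.patched_at_window = key.patched;

	/* CPU 1: jump_label_update finally rewrites the jump sites. */
	key.patched = true;

	return res;
}
```

Calling replay_race() from a small main() shows the window directly: the
count is already nonzero while the patched flag is still false, which is
the state in which CPU 2 would take the wrong branch.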