From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED,
 DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
 INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham
 autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 1F429C432BE
 for ; Thu, 19 Aug 2021 21:39:19 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by mail.kernel.org (Postfix) with ESMTP id 07056610A0
 for ; Thu, 19 Aug 2021 21:39:19 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S233890AbhHSVjz (ORCPT ); Thu, 19 Aug 2021 17:39:55 -0400
Received: from mail.efficios.com ([167.114.26.124]:57280 "EHLO
 mail.efficios.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S229619AbhHSVjy (ORCPT );
 Thu, 19 Aug 2021 17:39:54 -0400
Received: from localhost (localhost [127.0.0.1])
 by mail.efficios.com (Postfix) with ESMTP id 0AFE8378536;
 Thu, 19 Aug 2021 17:39:17 -0400 (EDT)
Received: from mail.efficios.com ([127.0.0.1])
 by localhost (mail03.efficios.com [127.0.0.1]) (amavisd-new, port 10032)
 with ESMTP id sqgMh8PS9est; Thu, 19 Aug 2021 17:39:12 -0400 (EDT)
Received: from localhost (localhost [127.0.0.1])
 by mail.efficios.com (Postfix) with ESMTP id 7F9FE3784AD;
 Thu, 19 Aug 2021 17:39:12 -0400 (EDT)
DKIM-Filter: OpenDKIM Filter v2.10.3 mail.efficios.com 7F9FE3784AD
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=efficios.com;
 s=default; t=1629409152;
 bh=5UmwxCORiN5yLIQm2BJu9shxBCOVFj1bc8s4bRqQFQo=;
 h=Date:From:To:Message-ID:MIME-Version;
 b=MF1eZ97b/KNT5hI9VL2d8/bJnxRVnz1puSaP8aXdbA0Yr8sbq1UW1Xnp5ROlOTAxA
 BrBHYFiVE9fyDS+W6acYJK/dsDcXuZ5FxN6/TGgq81Ge5A7ri3UTyOOmZCNrrwpGaX
 CaJRQA/6RdzkBds+FVqCWioAvW9dS868z5YzJqUiWVEa1eSWq/ub7V3SdNutNGYaHW
 /BGNriEzzY77HlrWaGIlJrha5VHDUBgHaX9E9aF4RYwTeW4avIB2YHkFNmb79XRV+u
 RMSG+xk/uZO1X64FabUncXvFyriO1d5efn0OTQ8xv/HTN1zVXC9OXfvrjvq/mcfJyu
 98fIeMx5uJpUQ==
X-Virus-Scanned: amavisd-new at efficios.com
Received: from mail.efficios.com ([127.0.0.1])
 by localhost (mail03.efficios.com [127.0.0.1]) (amavisd-new, port 10026)
 with ESMTP id gNagT47KYem3; Thu, 19 Aug 2021 17:39:12 -0400 (EDT)
Received: from mail03.efficios.com (mail03.efficios.com [167.114.26.124])
 by mail.efficios.com (Postfix) with ESMTP id 5C6F23784AA;
 Thu, 19 Aug 2021 17:39:12 -0400 (EDT)
Date: Thu, 19 Aug 2021 17:39:12 -0400 (EDT)
From: Mathieu Desnoyers
To: Sean Christopherson
Cc: "Russell King, ARM Linux" , Catalin Marinas , Will Deacon , Guo Ren ,
 Thomas Bogendoerfer , Michael Ellerman , Heiko Carstens , gor ,
 Christian Borntraeger , Oleg Nesterov , rostedt , Ingo Molnar ,
 Thomas Gleixner , Peter Zijlstra , Andy Lutomirski , paulmck ,
 Boqun Feng , Paolo Bonzini , shuah , Benjamin Herrenschmidt ,
 Paul Mackerras , linux-arm-kernel , linux-kernel , linux-csky ,
 linux-mips@vger.kernel.org, linuxppc-dev , linux-s390@vger.kernel.org,
 KVM list , linux-kselftest , Peter Foley , Shakeel Butt , Ben Gardon
Message-ID: <1673583543.19718.1629409152244.JavaMail.zimbra@efficios.com>
In-Reply-To: <20210818001210.4073390-2-seanjc@google.com>
References: <20210818001210.4073390-1-seanjc@google.com>
 <20210818001210.4073390-2-seanjc@google.com>
Subject: Re: [PATCH 1/5] KVM: rseq: Update rseq when processing
 NOTIFY_RESUME on xfer to KVM guest
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [167.114.26.124]
X-Mailer: Zimbra 8.8.15_GA_4101 (ZimbraWebClient - FF90 (Linux)/8.8.15_GA_4059)
Thread-Topic: rseq: Update rseq when processing NOTIFY_RESUME on xfer to
 KVM guest
Thread-Index: bvPQyLDSoOg+aSRtrktKYXzxAVZwCA==
Precedence: bulk
List-ID:
X-Mailing-List: linux-csky@vger.kernel.org

----- On Aug 17, 2021, at 8:12 PM, Sean Christopherson seanjc@google.com wrote:

> Invoke rseq's NOTIFY_RESUME handler when processing the flag prior to
> transferring to a KVM guest, which is roughly equivalent to an exit to
> userspace and processes many of the same pending actions. While the task
> cannot be in an rseq critical section as the KVM path is reachable only
> via ioctl(KVM_RUN), the side effects that apply to rseq outside of a
> critical section still apply, e.g. the CPU ID needs to be updated if the
> task is migrated.
>
> Clearing TIF_NOTIFY_RESUME without informing rseq can lead to segfaults
> and other badness in userspace VMMs that use rseq in combination with KVM,
> e.g. due to the CPU ID being stale after task migration.

I agree with the problem assessment, but I would recommend a small change
to this fix.

>
> Fixes: 72c3c0fe54a3 ("x86/kvm: Use generic xfer to guest work function")
> Reported-by: Peter Foley
> Bisected-by: Doug Evans
> Cc: Shakeel Butt
> Cc: Thomas Gleixner
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson
> ---
> kernel/entry/kvm.c | 4 +++-
> kernel/rseq.c | 4 ++--
> 2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c
> index 49972ee99aff..049fd06b4c3d 100644
> --- a/kernel/entry/kvm.c
> +++ b/kernel/entry/kvm.c
> @@ -19,8 +19,10 @@ static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu,
> unsigned long ti_work)
>  		if (ti_work & _TIF_NEED_RESCHED)
>  			schedule();
>
> -		if (ti_work & _TIF_NOTIFY_RESUME)
> +		if (ti_work & _TIF_NOTIFY_RESUME) {
>  			tracehook_notify_resume(NULL);
> +			rseq_handle_notify_resume(NULL, NULL);
> +		}
>
>  		ret = arch_xfer_to_guest_mode_handle_work(vcpu, ti_work);
>  		if (ret)
> diff --git a/kernel/rseq.c b/kernel/rseq.c
> index 35f7bd0fced0..58c79a7918cd 100644
> --- a/kernel/rseq.c
> +++ b/kernel/rseq.c
> @@ -236,7 +236,7 @@ static bool in_rseq_cs(unsigned long ip, struct rseq_cs
> *rseq_cs)
>
>  static int rseq_ip_fixup(struct pt_regs *regs)
>  {
> -	unsigned long ip = instruction_pointer(regs);
> +	unsigned long ip = regs ? instruction_pointer(regs) : 0;
>  	struct task_struct *t = current;
>  	struct rseq_cs rseq_cs;
>  	int ret;
> @@ -250,7 +250,7 @@ static int rseq_ip_fixup(struct pt_regs *regs)
>  	 * If not nested over a rseq critical section, restart is useless.
>  	 * Clear the rseq_cs pointer and return.
>  	 */
> -	if (!in_rseq_cs(ip, &rseq_cs))
> +	if (!regs || !in_rseq_cs(ip, &rseq_cs))

I think clearing the thread's rseq_cs unconditionally here when regs is
NULL is not the behavior we want when this is called from
xfer_to_guest_mode_work.

If we have a scenario where userspace ends up calling this ioctl(KVM_RUN)
from within a rseq c.s., we really want a CONFIG_DEBUG_RSEQ=y kernel to
kill this application in the rseq_syscall handler when exiting back to
usermode when the ioctl eventually returns.

However, clearing the thread's rseq_cs will prevent this from happening.

So I would favor an approach where we simply do:

	if (!regs)
		return 0;

immediately at the beginning of rseq_ip_fixup, before getting the
instruction pointer, so effectively skip all side-effects of the ip fixup
code. Indeed, it is not relevant to do any fixup here, because it is
nested in an ioctl system call.

Effectively, this would preserve the SIGSEGV behavior when this ioctl is
erroneously called by user-space from a rseq critical section.

Thanks for looking into this!

Mathieu

>  		return clear_rseq_cs(t);
>  	ret = rseq_need_restart(t, rseq_cs.flags);
>  	if (ret <= 0)
> --
> 2.33.0.rc1.237.g0d66db33f3-goog

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
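
Editor's illustration of the review comment above: a minimal user-space sketch, not kernel code, contrasting the two ways of handling regs == NULL. Every name here (mock_regs, mock_task, mock_in_cs, fixup_as_posted, fixup_suggested, debug_check_would_kill) is a hypothetical stand-in invented for this sketch. The model assumes a registered critical section is a simple address range, omits the rseq_need_restart() step entirely, and reduces the CONFIG_DEBUG_RSEQ check at syscall exit to "is the user IP still inside a registered critical section".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; not the real definitions. */
struct mock_regs {
	unsigned long ip;		/* user-space instruction pointer */
};

struct mock_task {
	bool cs_set;			/* models a non-NULL rseq_cs pointer */
	unsigned long cs_start;		/* start of the critical section */
	unsigned long cs_len;		/* length of the critical section */
	unsigned long abort_ip;		/* where an interrupted c.s. restarts */
};

/* True if ip falls inside the task's registered critical section. */
static bool mock_in_cs(const struct mock_task *t, unsigned long ip)
{
	assert(t != NULL);
	return t->cs_set && ip - t->cs_start < t->cs_len;
}

/* Models the patch as posted: regs == NULL takes the "clear" path. */
static int fixup_as_posted(struct mock_task *t, struct mock_regs *regs)
{
	unsigned long ip = regs ? regs->ip : 0;

	if (!regs || !mock_in_cs(t, ip)) {
		t->cs_set = false;	/* models clear_rseq_cs() */
		return 0;
	}
	regs->ip = t->abort_ip;		/* models the abort fixup */
	t->cs_set = false;
	return 0;
}

/* Models the suggested change: bail out before any side effect. */
static int fixup_suggested(struct mock_task *t, struct mock_regs *regs)
{
	if (!regs)
		return 0;		/* rseq_cs left untouched */

	if (!mock_in_cs(t, regs->ip)) {
		t->cs_set = false;
		return 0;
	}
	regs->ip = t->abort_ip;
	t->cs_set = false;
	return 0;
}

/*
 * Models the debug check on return from a system call: true means the
 * task would be killed for issuing a syscall inside a critical section.
 */
static bool debug_check_would_kill(const struct mock_task *t,
				   unsigned long user_ip)
{
	return mock_in_cs(t, user_ip);
}
```

In this model, a task that wrongly issues the ioctl from inside its critical section and then passes through the NULL-regs path is still flagged by debug_check_would_kill() after fixup_suggested(), but no longer after fixup_as_posted(), because the latter has already cleared the critical-section state.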