From: Tony Luck
To: Borislav Petkov
Cc: Tony Luck, x86@kernel.org, Andrew Morton, Peter Zijlstra, Darren Hart,
    Andy Lutomirski, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org,
    linux-mm@kvack.org
Subject: [PATCH v2 0/3] Fix infinite machine check loop in futex_wait_setup()
Date: Mon, 11 Jan 2021 13:44:49 -0800
Message-Id: <20210111214452.1826-1-tony.luck@intel.com>
In-Reply-To: <20210108222251.14391-1-tony.luck@intel.com>
References: <20210108222251.14391-1-tony.luck@intel.com>

Linux can now recover from machine checks where kernel code is doing
get_user() to access application memory. But there isn't a way to
distinguish whether get_user() failed because of a page fault or a
machine check.

Thus there is a problem if any kernel code thinks it can retry an access
after doing something that would fix the page fault.

One such example (I'm sure there are more) is in futex_wait_setup(),
where an attempt is made to read the futex with page faults disabled,
followed by a retry (after dropping a lock so page faults are safe):

	ret = get_futex_value_locked(&uval, uaddr);

	if (ret) {
		queue_unlock(*hb);

		ret = get_user(uval, uaddr);

It would be good to avoid deliberately taking a second machine check
(especially as the recovery code does really bad things and ends up in
an infinite loop!).

V2 (thanks to feedback from PeterZ) fixes this by changing get_user() to
return -ENXIO ("No such device or address") for the case where a machine
check occurred. Peter left it open which error code to use (suggesting
"-EMEMERR or whatever name we come up with"). I think the existing ENXIO
error code is appropriate (the address being accessed has effectively
gone away). But I don't have a strong attachment if anyone thinks we
need a new code.

Callers can check for ENXIO in paths where the access would be retried
so they can avoid a second machine check.

Patch roadmap:

Part 1 (unchanged since v1):
Add code to avoid the infinite loop in the machine check code. Just
panic if code runs into the same machine check a second time. This
should make it much easier to debug other places where this happens.

Part 2:
Change recovery path for get_user() to return -ENXIO.

Part 3:
Fix the one case in futex code that my test case hits (I'm sure there
are more).

TBD: There are a few places in arch/x86 code that test "ret == -EFAULT"
or have "switch (ret) { case -EFAULT: }" that may benefit from an
additional check for -ENXIO. For now those will continue to crash (just
like every pre-v5.10 kernel crashed when get_user() touched poison).

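To make the caller-side check concrete, here is a rough user-space model
of the retry decision. It is only a sketch and not the code in the
patches: struct model_page, machine_checks and the model_*() helpers are
hypothetical stand-ins for the user page, the poison hit, and
get_futex_value_locked()/get_user(). The one thing taken from the series
is the idea that a machine check is reported as -ENXIO while a plain
page fault stays -EFAULT.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of one user page; not a kernel structure. */
struct model_page {
	int present;        /* 0: access would take a page fault */
	int poisoned;       /* 1: access would take a machine check */
	unsigned int value; /* the futex word stored in the page */
};

/* Counts how many times the model touched poisoned memory. */
static int machine_checks;

/* Stand-in for get_futex_value_locked(): page faults are disabled. */
static int model_read_nofault(unsigned int *val, const struct model_page *pg)
{
	if (pg->poisoned) {
		machine_checks++;
		return -ENXIO;   /* error code proposed in this series */
	}
	if (!pg->present)
		return -EFAULT;  /* fixable: the caller may fault the page in */
	*val = pg->value;
	return 0;
}

/* Stand-in for get_user(): page faults are allowed and get resolved. */
static int model_read_faultable(unsigned int *val, struct model_page *pg)
{
	if (pg->poisoned) {
		machine_checks++;
		return -ENXIO;
	}
	pg->present = 1;     /* the page fault handler brings the page in */
	*val = pg->value;
	return 0;
}

/* Retry only when the first failure was not a machine check. */
static int model_wait_setup(unsigned int *val, struct model_page *pg)
{
	int ret;

	assert(val != NULL && pg != NULL);

	ret = model_read_nofault(val, pg);
	if (ret == -ENXIO)
		return ret;      /* poison will not go away; do not touch it again */
	if (ret)
		ret = model_read_faultable(val, pg);
	return ret;
}
```

In this model a poisoned page is touched exactly once and the caller
sees -ENXIO, while a page that is merely not present still takes the
retry path and succeeds. Without the -ENXIO test, the retry would touch
the poison a second time.
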
Tony Luck (3):
  x86/mce: Avoid infinite loop for copy from user recovery
  x86/mce: Add new return value to get_user() for machine check
  futex, x86/mce: Avoid double machine checks

 arch/x86/kernel/cpu/mce/core.c | 7 ++++++-
 arch/x86/lib/getuser.S         | 8 +++++++-
 arch/x86/mm/extable.c          | 1 +
 include/linux/sched.h          | 3 ++-
 kernel/futex.c                 | 5 ++++-
 5 files changed, 20 insertions(+), 4 deletions(-)

-- 
2.21.1