From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4F219C433EF for ; Tue, 15 Feb 2022 17:55:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242746AbiBORza (ORCPT ); Tue, 15 Feb 2022 12:55:30 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:41514 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233741AbiBORz3 (ORCPT ); Tue, 15 Feb 2022 12:55:29 -0500 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 22D5BFEB34; Tue, 15 Feb 2022 09:55:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1644947719; x=1676483719; h=message-id:date:mime-version:to:cc:references:from: subject:in-reply-to:content-transfer-encoding; bh=WrTqW/MTMMAtSIFLsehmFFuDfgHFboDGMW3aqd59c2k=; b=M+yG+oMi9auL/Ns4gmTwrINxF0zRyZaqQ5QIWIBjfLpzWttCKoDfKh2L dnBB+jkfyWFPoAxLpsMNN6AvKDm89TKMFOuAEmUYXl+UtQyOrTAwF51BY V/ZE5A7aomIveycfC9oQuTVPlIMYr7x6LI7MMWBGRjdhsOAV5CdnIAJUI RnHV+VpGweTday9/dv/dP1xvPmyYoA5hcjr/9OnPpYJzXzNqw2ggLahJ8 sQVYdnVW+LK9Vvs3E//n1xozX7F6N10E9bGJXN2djT2CaskNyNd0JqED3 ud7qyxWng/eKL2NDW7/4szNI8yvTrx+g7fkLQzlSQakmiySPmgliYmL8F w==; X-IronPort-AV: E=McAfee;i="6200,9189,10259"; a="274989548" X-IronPort-AV: E=Sophos;i="5.88,371,1635231600"; d="scan'208";a="274989548" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2022 09:55:16 -0800 X-IronPort-AV: E=Sophos;i="5.88,371,1635231600"; d="scan'208";a="528990176" Received: from tngodup-mobl.amr.corp.intel.com (HELO [10.209.32.98]) ([10.209.32.98]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2022 09:55:13 -0800 Message-ID: <543efc25-9b99-53cd-e305-d8b4d917b64b@intel.com> Date: Tue, 15 Feb 2022 09:55:09 -0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.0 Content-Language: en-US To: Brian Geffon Cc: Thomas Gleixner , Willis Kung , Guenter Roeck , Borislav Petkov , Andy Lutomirski , "# v4 . 10+" , the arch/x86 maintainers , LKML References: <20220215153644.3654582-1-bgeffon@google.com> <56fc0ced-d8d2-146f-6ca8-b95bd7e0b4f5@intel.com> From: Dave Hansen Subject: Re: [PATCH] x86/fpu: Correct pkru/xstate inconsistency In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2/15/22 09:50, Brian Geffon wrote: >>> is not a kernel thread as kernel threads will never use PKRU. It's >>> possible that this_cpu_read_stable() on current_task (ie. >>> get_current()) is returning an old cached value. By forcing the read >>> with this_cpu_read() the correct task is used. Without this it's >>> possible when switching from a kernel thread to a userspace thread >>> that we'll still observe the PF_KTHREAD flag and never restore the >>> PKRU. And as a result this issue only occurs when switching from a >>> kernel thread to a userspace thread, switching from a non kernel >>> thread works perfectly fine because all we consider in that situation >>> is the flags from some other non kernel task and the next fpu is >>> passed in to switch_fpu_finish(). >> >> It makes *sense* that there would be a place in the context switch code >> where 'current' is wonky, but I never realized this. This seems really >> fragile, but *also* trivially detectable. >> >> Is the PKRU code really the only code to use 'current' in a buggy way >> like this? > > Yes, because the remaining code in __switch_to() references the next > task as next_p rather than current. Technically, it might be more > correct to pass next_p to switch_fpu_finish(), what do you think? This > may make sense since we're also passing the next fpu anyway. Yeah, passing next_p instead of next_fpu makes a lot of sense to me.