From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 32C4BC43334 for ; Wed, 22 Jun 2022 17:10:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1377149AbiFVRK2 (ORCPT ); Wed, 22 Jun 2022 13:10:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43310 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1377183AbiFVRKG (ORCPT ); Wed, 22 Jun 2022 13:10:06 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 424ED403CB for ; Wed, 22 Jun 2022 10:09:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1655917771; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gwE0a9PC37waDzGpK64QMjVnshbm6LUfxIuUv+3detE=; b=VsgfROOtmSwZ5TzH3LfCjxPrim442LhcihSvNqH6ib8M6zP9ERlDCb33f+8M7dL1ZQiRl/ pK2Jk7zTPuYZ8tDnOnuyM1TSVXWLQ9JFtcQ6akvMNVTHrHF7Q1OQY+Fge5J+0QC769s+rf Us6NBwfX3el9w7/d1lFzVvzLU+OChno= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-341-CMtuCWFWNuyk6lneCrY_7w-1; Wed, 22 Jun 2022 13:09:28 -0400 X-MC-Unique: CMtuCWFWNuyk6lneCrY_7w-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 5A4458001EA; Wed, 22 Jun 2022 17:09:27 +0000 (UTC) Received: from starship (unknown [10.40.194.180]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1666B40D2962; Wed, 22 Jun 2022 17:09:21 +0000 (UTC) Message-ID: <70deac4c885b2bf41daecaed054f80f867ed19de.camel@redhat.com> Subject: Re: [PATCH 4/4] x86/cpuid: check for dependencies violations in CPUID and attempt to fix them From: Maxim Levitsky To: Dave Hansen , linux-kernel@vger.kernel.org Cc: "Chang S. Bae" , Jiri Olsa , linux-perf-users@vger.kernel.org, Peter Zijlstra , "H. Peter Anvin" , "David S. Miller" , Borislav Petkov , Kees Cook , Mark Rutland , Alexander Shishkin , Namhyung Kim , Tim Chen , Pawan Gupta , Herbert Xu , Dave Hansen , "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" , Jane Malalane , Tony Luck , Ingo Molnar , Thomas Gleixner , Arnaldo Carvalho de Melo , "open list:CRYPTO API" , Paolo Bonzini Date: Wed, 22 Jun 2022 20:09:20 +0300 In-Reply-To: References: <20220622144820.751402-1-mlevitsk@redhat.com> <20220622144820.751402-5-mlevitsk@redhat.com> Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.36.5 (3.36.5-2.fc32) MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.84 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org On Wed, 2022-06-22 at 08:32 -0700, Dave Hansen wrote: > On 6/22/22 07:48, Maxim Levitsky wrote: > > Due to configuration bugs, sometimes a CPU feature is disabled in CPUID, > > but not features that depend on it. > > > > While the above is not supported, the kernel should try to not crash, > > and clearing the dependent cpu caps is the best way to do it. > > That's a rather paltry changelog. > > If I remember correctly, there's a crystal clear problem: > > If a CPU enumerates support for AVX2 but AVX via CPUID, the > kernel crashes. > > There's also a follow-on problem. The kernel has all the data it needs > to fix this, but just doesn't consult it: > > To make matters worse, the kernel _knows_ that this is an ill- > advised situation: The kernel prevents itself from clearing the > software representation of the AVX CPUID bit without also > clearing AVX2. > > But, the kernel only consults this knowledge when it is clearing > cpu_cap bits. It does not consult this information when it is > populating those cpu_cap bits. Yes, I agree. I'll update the changelog with something more in depth. > > > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c > > index 4cc79971d2d847..c83a8f447d6aed 100644 > > --- a/arch/x86/kernel/cpu/common.c > > +++ b/arch/x86/kernel/cpu/common.c > > @@ -1469,7 +1469,7 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c) > > this_cpu->c_early_init(c); > > > > c->cpu_index = 0; > > - filter_cpuid_features(c, false); > > + filter_cpuid_features(c, true); > > > > if (this_cpu->c_bsp_init) > > this_cpu->c_bsp_init(c); > > @@ -1757,7 +1757,7 @@ static void identify_cpu(struct cpuinfo_x86 *c) > > */ > > > > /* Filter out anything that depends on CPUID levels we don't have */ > > - filter_cpuid_features(c, true); > > + filter_cpuid_features(c, false); > > > > /* If the model name is still unset, do table lookup. */ > > if (!c->x86_model_id[0]) { > > While we're at it, could we please rid ourselves of this unreadable > mystery true/false gunk? It is present if I understand the code correctly to avoid printing a warning twice. It used to be 'warn' parameter, and I changed it to 'early' parameter, inverting its boolean value, because I have seen that warning is not printed at all, and I assumed that it is because the first early call already clears the cpuid cap and the second call doesn't get the warning. Now however, looking at that I think that the same will happen with the cpuid level fitering as well, and thus we can just remove that 'warn' parameter. > > > diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c > > index bcb091d02a754b..6d9c0e39851805 100644 > > --- a/arch/x86/kernel/cpu/cpuid-deps.c > > +++ b/arch/x86/kernel/cpu/cpuid-deps.c > > @@ -94,6 +94,11 @@ static inline void clear_feature(struct cpuinfo_x86 *c, unsigned int feature) > > set_bit(feature, (unsigned long *)cpu_caps_cleared); > > } > > > > +static inline bool test_feature(struct cpuinfo_x86 *c, unsigned int feature) > > +{ > > + return test_bit(feature, (unsigned long *)c->x86_capability); > > +} > > + > > /* Take the capabilities and the BUG bits into account */ > > #define MAX_FEATURE_BITS ((NCAPINTS + NBUGINTS) * sizeof(u32) * 8) > > > > @@ -127,6 +132,7 @@ void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int feature) > > } while (changed); > > } > > > > + > > void setup_clear_cpu_cap(unsigned int feature) > > More superfluous whitespace. Sorry about that, will check better next time. > > > { > > clear_cpu_cap(&boot_cpu_data, feature); > > @@ -137,6 +143,10 @@ void setup_clear_cpu_cap(unsigned int feature) > > * Some CPU features depend on higher CPUID levels, which may not always > > * be available due to CPUID level capping or broken virtualization > > * software. Add those features to this table to auto-disable them. > > + * > > + * Also due to configuration bugs, some CPUID features might be present > > + * while CPUID features that they depend on are not present, > > + * e.g a AVX2 present but AVX is not present. > > */ > > struct cpuid_dependent_feature { > > u32 feature; > > @@ -151,9 +161,10 @@ cpuid_dependent_features[] = { > > { 0, 0 } > > }; > > > > -void filter_cpuid_features(struct cpuinfo_x86 *c, bool warn) > > +void filter_cpuid_features(struct cpuinfo_x86 *c, bool early) > > { > > I have at least an inkling what 'warn' could mean. But, 'early'? One > man's 'early' is another one's 'late'. I understand what you mean, as I said above, I will try to reproduce the original issue of cpuid level mismatch and see if I can remove the warn parameter at all. > > > const struct cpuid_dependent_feature *df; > > + const struct cpuid_dep *d; > > > > for (df = cpuid_dependent_features; df->feature; df++) { > > > > @@ -172,10 +183,22 @@ void filter_cpuid_features(struct cpuinfo_x86 *c, bool warn) > > continue; > > > > clear_cpu_cap(c, df->feature); > > - if (!warn) > > + if (early) > > continue; > > Why is it that 'early' calls don't want warnings? I don't know to be honest, except that I assumed that this allows to not print the warning twice, but as I said above, I might be able to just remove that code. > > > pr_warn("CPU: CPU feature " X86_CAP_FMT " disabled, no CPUID level 0x%x\n", > > x86_cap_flag(df->feature), df->level); > > } > > + > > + for (d = cpuid_deps; d->feature; d++) { > > + > > + if (!test_feature(c, d->feature) || test_feature(c, d->depends)) > > + continue; > > + > > + clear_feature(c, d->feature); > > + > > + pr_warn("CPU: CPU feature " X86_CAP_FMT " disabled, because it depends on " > > + X86_CAP_FMT " which is not supported in CPUID\n", > > + x86_cap_flag(d->feature), x86_cap_flag(d->depends)); > > + } > > } > > The do_clear_cpu_cap() does this with a loop, presumably because a later > (higher index in the array) feature in cpuid_deps[] could theoretically > clear an earlier (lower index) feature. Sorry this is my silly mistake. I intended to call clear_cpu_cap here, which will if needed disable all the depedencies, so a loop doesn't seem to be needed here. It's not very efficient but this is only done once per vCPU so shouldn't matter. > > Also, is that message strictly correct? There might have been a > clearcpuid= argument or even another dependency that ended up clearing a > bit. It might have nothing to do with CPUID itself. I think it should work, because clearcpuid= will end up calling clear_cpu_cap which will disable both the requested feature and everything that depends on it, thus filter_cpuid_features should not notice any inconsistencies. Other way around, if the clear_cpu_cap is called before filter_cpuid_features, it might 'fix' the inconsistency, and silence the warning but that isn't an issue IMHO. Thanks a lot for the review, Best regards, Maxim Levitsky >