Date: Tue, 29 Aug 2023 11:35:19 +0200 (CEST)
Message-Id: <20230829.113519.1499398743089914237.rene@exactcode.com>
To: Meng Li
Cc: Huang Rui, Shuah Khan, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du, Viresh Kumar,
	Borislav Petkov, Meng Li, Wyes Karny
Subject: Re: [PATCH V4 3/7] cpufreq: amd-pstate: Enable AMD Pstate Preferred Core Supporting.
From: Rene Rebe
X-Mailing-List: linux-acpi@vger.kernel.org

Dear Meng Li and team,

thank you so much for working on finally bringing AMD preferred core
scheduling to mainline Linux!

> The initial core rankings are set up by AMD Pstate when the
> system boots.

I tested this patch on our Ryzen 7950x and 5950x systems and
unfortunately could not find any performance difference. I therefore
took a closer look, and as far as I can tell the conditional for the
initial preferred performance priorities appears to be reversed. I
marked the places down below. I also attached a patch for the fix.

With that fixed I can measure a 0.7% improvement compiling Firefox on
the 7950x. I do wonder how this ever passed testing before ...

I think it would be a good idea to always expose the hardware perf
values in sysfs, to help users debug hardware issues or BIOS settings,
even with prefcore not enabled, rather than reporting the otherwise
unused 166 or 255 values.

With that fixed, however, Linux is still not always scheduling to the
preferred cores. That appears to be an independent limitation of the
current Linux scheduler, which does not yet strictly use the priority
for scheduling. With manual taskset guidance I could improve the
Firefox build time by a few more seconds, to over 1% overall. That gain
would be available by default if the Linux scheduler more reliably
placed the minutes-long Rust LTO link tasks on the preferred cores
rather than on some mediocre ones.
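To make the manual taskset guidance above a bit more concrete, here is a
minimal sketch of how one might pick the best-ranked cores and turn them
into a CPU list for `taskset -c`. It is only an illustration under stated
assumptions: the function names pick_preferred_cpus() and
format_cpu_list() are hypothetical and not part of the kernel or of this
patch, and the caller is assumed to already have one ranking value per
CPU (higher meaning more preferred) from wherever it trusts.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write the indices of the `want` highest-ranked
 * CPUs into `out`, best first. `perf[i]` is a per-CPU ranking value
 * (higher is better); where it comes from is up to the caller.
 * Ties go to the lower CPU index. Returns the number of entries written.
 */
size_t pick_preferred_cpus(const unsigned *perf, size_t ncpus,
                           size_t want, size_t *out)
{
    size_t count = 0;

    if (want > ncpus)
        want = ncpus;

    while (count < want) {
        size_t best = ncpus; /* sentinel: nothing chosen in this pass */

        for (size_t cpu = 0; cpu < ncpus; cpu++) {
            int taken = 0;

            /* Skip CPUs that were already selected in earlier passes. */
            for (size_t k = 0; k < count; k++)
                if (out[k] == cpu)
                    taken = 1;
            if (taken)
                continue;

            if (best == ncpus || perf[cpu] > perf[best])
                best = cpu;
        }
        out[count++] = best;
    }
    return count;
}

/*
 * Hypothetical helper: format the chosen CPUs as a comma-separated
 * list such as "1,3", the form accepted by `taskset -c`.
 * Returns 0 on success, -1 if `buf` is too small.
 */
int format_cpu_list(const size_t *cpus, size_t n, char *buf, size_t len)
{
    size_t used = 0;

    if (len == 0)
        return -1;
    buf[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        int w = snprintf(buf + used, len - used, "%s%zu",
                         i ? "," : "", cpus[i]);

        if (w < 0 || (size_t)w >= len - used)
            return -1;
        used += (size_t)w;
    }
    return 0;
}
```

The resulting string could then be passed on the command line, for
example as `taskset -c <list> <command>`. The selection is a plain
quadratic scan, which seems fine for the few dozen CPUs on these
desktop parts.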

> -	highest_perf = amd_get_highest_perf();
> -	if (highest_perf > AMD_CPPC_HIGHEST_PERF(cap1))
> -		highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
> -
> -	WRITE_ONCE(cpudata->highest_perf, highest_perf);
> +	if (prefcore)
> +		WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
> +	else
> +		WRITE_ONCE(cpudata->highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));

Conditional reversed, assigns THRESHOLD if enabled!

>  	WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
>  	WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
> @@ -318,17 +322,15 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
>  static int cppc_init_perf(struct amd_cpudata *cpudata)
>  {
>  	struct cppc_perf_caps cppc_perf;
> -	u32 highest_perf;
> 
>  	int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
>  	if (ret)
>  		return ret;
> 
> -	highest_perf = amd_get_highest_perf();
> -	if (highest_perf > cppc_perf.highest_perf)
> -		highest_perf = cppc_perf.highest_perf;
> -
> -	WRITE_ONCE(cpudata->highest_perf, highest_perf);
> +	if (prefcore)
> +		WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
> +	else
> +		WRITE_ONCE(cpudata->highest_perf, cppc_perf.highest_perf);

Same here. Not using highest_perf if enabled, ...

Signed-off-by: René Rebe

--- linux-6.4/drivers/cpufreq/amd-pstate.c.vanilla	2023-08-25 22:34:25.254995690 +0200
+++ linux-6.4/drivers/cpufreq/amd-pstate.c	2023-08-25 22:35:49.194991446 +0200
@@ -282,9 +282,9 @@
 	 * the default max perf.
 	 */
 	if (prefcore)
-		WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
-	else
 		WRITE_ONCE(cpudata->highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
+	else
+		WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
 
 	WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
 	WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
@@ -303,9 +303,9 @@
 		return ret;
 
 	if (prefcore)
-		WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
-	else
 		WRITE_ONCE(cpudata->highest_perf, cppc_perf.highest_perf);
+	else
+		WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
 
 	WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf);
 	WRITE_ONCE(cpudata->lowest_nonlinear_perf,

-- 
René Rebe, ExactCODE GmbH, Lietzenburger Str. 42, DE-10789 Berlin
https://exactcode.com | https://t2sde.org | https://rene.rebe.de