From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 21 Jan 2016 15:04:38 +0800
From: Huang Rui
To: Peter Zijlstra
CC: Borislav Petkov, Ingo Molnar, Andy Lutomirski, Thomas Gleixner,
 Robert Richter, Jacob Shin, John Stultz, Frédéric Weisbecker,
 Guenter Roeck, Andreas Herrmann, Suravee Suthikulpanit,
 Aravind Gopalakrishnan, Borislav Petkov, Fengguang Wu, Aaron Lu
Subject: Re: [PATCH v2 5/5] perf/x86/amd/power: Add AMD accumulated power reporting mechanism
Message-ID: <20160121070437.GA15130@hr-amur2>
References: <1452739808-11871-1-git-send-email-ray.huang@amd.com>
 <1452739808-11871-6-git-send-email-ray.huang@amd.com>
 <20160119121250.GA6344@twins.programming.kicks-ass.net>
 <20160120044823.GA13477@hr-amur2>
 <20160120092244.GH6357@twins.programming.kicks-ass.net>
In-Reply-To: <20160120092244.GH6357@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jan 20, 2016 at 10:22:44AM +0100, Peter Zijlstra wrote:
> On Wed, Jan 20, 2016 at 12:48:24PM +0800, Huang Rui wrote:
> > Hi Peter,
> >
> > Thanks so much to your comments.
> >
> > On Tue, Jan 19, 2016 at 01:12:50PM +0100, Peter Zijlstra wrote:
> > > On Thu, Jan 14, 2016 at 10:50:08AM +0800, Huang Rui wrote:
> > > > +struct power_pmu {
> > > > +	spinlock_t lock;
> > >
> > > This should be a raw_spinlock_t, as it'll be nested under other
> > > raw_spinlock_t's.
> > >
> > Do you mean the following spinlock operations are in hardware
> > interrupts disabled case, so I need use raw_spinlock_t instead, right?
>
>
>                     mainline      -rt
>
> raw_spinlock_t      spin-waits    spin-waits
> spinlock_t          spin-waits    blocks (rt-mutex)
> struct mutex        blocks        blocks (rt-mutex)
>
>
> since these functions are themselves called with raw_spinlock_t held
> (perf_event_context::lock for example, but also rq::lock), any lock
> nested inside them must also be raw_spinlock_t.
>

I see, thank you. :-)

I just took a quick look at spinlocks on -rt. The realtime kernel
provides two kinds of spinlock: the original spinlock_t is replaced by
one that can sleep, much like a mutex, while the other one
(raw_spinlock_t, which you mention here) keeps the non-sleeping
behavior and is the real spinlock.

And my lock here will also be nested under perf_event_context::lock,
right?

> I have a lockdep patch somewhere that checks these ordering things; I
> should rebase and post that again.
>

Can you CC me when you post that patch next time?

> > Use raw_spin_lock_irqsave/raw_spin_unlock_irqrestore?
>
> pmu::{start,stop,add,del} will be called with IRQs already disabled.
>
> > > > +static int power_cpu_init(int cpu)
> > > > +{
> > > > +	int i, cu, ret = 0;
> > > > +	cpumask_var_t mask, dummy_mask;
> > > > +
> > > > +	cu = cpu / cores_per_cu;
> > > > +
> > > > +	if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
> > > > +		return -ENOMEM;
> > > > +
> > > > +	if (!zalloc_cpumask_var(&dummy_mask, GFP_KERNEL)) {
> > > > +		ret = -ENOMEM;
> > > > +		goto out;
> > > > +	}
> > > > +
> > > > +	for (i = 0; i < cores_per_cu; i++)
> > > > +		cpumask_set_cpu(i, mask);
> > > > +
> > > > +	cpumask_shift_left(mask, mask, cu * cores_per_cu);
> > > > +
> > > > +	if (!cpumask_and(dummy_mask, mask, &cpu_mask))
> > > > +		cpumask_set_cpu(cpu, &cpu_mask);
> > > > +
> > > > +	free_cpumask_var(dummy_mask);
> > > > +out:
> > > > +	free_cpumask_var(mask);
> > > > +
> > > > +	return ret;
> > > > +}
> > >
> > > > +static int power_cpu_notifier(struct notifier_block *self,
> > > > +			      unsigned long action, void *hcpu)
> > > > +{
> > > > +	unsigned int cpu = (long)hcpu;
> > > > +
> > > > +	switch (action & ~CPU_TASKS_FROZEN) {
> > > > +	case CPU_UP_PREPARE:
> > > > +		if (power_cpu_prepare(cpu))
> > > > +			return NOTIFY_BAD;
> > > > +		break;
> > > > +	case CPU_STARTING:
> > > > +		if (power_cpu_init(cpu))
> > > > +			return NOTIFY_BAD;
> > >
> > > this is called with IRQs disabled, which makes those GFP_KERNEL allocs
> > > above a pretty bad idea.
> > >
> >
> > Right, so should I use GFP_ATOMIC to allocate cpumask here?
>
> One should not use GFP_ATOMIC if at all possible, also no, -rt cannot do
> _any_ allocations from this site.
>

OK, so the allocation might sleep while IRQs are disabled, which is
incorrect.

> > > Also, note that -rt cannot actually do _any_ allocations/frees from
> > > STARTING.
> > >
> > > Please move the allocs/frees to PREPARE/ONLINE.
> > >
> >
> > How about add two cpumask_var_t at power_pmu structure? Then allocate
> > the two cpumask_var_t (pmu->mask, pmu->dummy_mask), and they can be
> > also used on power_cpu_init.
>
> That would work.

I drafted an updated diff based on the original patch; please take a
look.

8<--------------------------------------------------------------------------
diff --git a/arch/x86/kernel/cpu/perf_event_amd_power.c b/arch/x86/kernel/cpu/perf_event_amd_power.c
index 69ef234..e71d993 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_power.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_power.c
@@ -46,10 +46,17 @@ static unsigned int cu_num;
 static u64 max_cu_acc_power;

 struct power_pmu {
-	spinlock_t lock;
+	raw_spinlock_t lock;
 	struct list_head active_list;
 	struct pmu *pmu; /* pointer to power_pmu_class */
 	local64_t cpu_sw_pwr_ptsc;
+	/*
+	 * These two cpumasks are used to avoid allocations in the
+	 * CPU_STARTING phase, because power_cpu_init will be called
+	 * with IRQs disabled.
+	 */
+	cpumask_var_t mask;
+	cpumask_var_t tmp_mask;
 };

 static struct pmu pmu_class;
@@ -126,9 +133,9 @@ static void pmu_event_start(struct perf_event *event, int mode)
 	struct power_pmu *pmu = __this_cpu_read(amd_power_pmu);
 	unsigned long flags;

-	spin_lock_irqsave(&pmu->lock, flags);
+	raw_spin_lock_irqsave(&pmu->lock, flags);
 	__pmu_event_start(pmu, event);
-	spin_unlock_irqrestore(&pmu->lock, flags);
+	raw_spin_unlock_irqrestore(&pmu->lock, flags);
 }

 static void pmu_event_stop(struct perf_event *event, int mode)
@@ -137,7 +144,7 @@ static void pmu_event_stop(struct perf_event *event, int mode)
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long flags;

-	spin_lock_irqsave(&pmu->lock, flags);
+	raw_spin_lock_irqsave(&pmu->lock, flags);

 	/* mark event as deactivated and stopped */
 	if (!(hwc->state & PERF_HES_STOPPED)) {
@@ -155,7 +162,7 @@ static void pmu_event_stop(struct perf_event *event, int mode)
 		hwc->state |= PERF_HES_UPTODATE;
 	}

-	spin_unlock_irqrestore(&pmu->lock, flags);
+	raw_spin_unlock_irqrestore(&pmu->lock, flags);
 }

 static int pmu_event_add(struct perf_event *event, int mode)
@@ -164,14 +171,14 @@ static int pmu_event_add(struct perf_event *event, int mode)
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long flags;

-	spin_lock_irqsave(&pmu->lock, flags);
+	raw_spin_lock_irqsave(&pmu->lock, flags);

 	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;

 	if (mode & PERF_EF_START)
 		__pmu_event_start(pmu, event);

-	spin_unlock_irqrestore(&pmu->lock, flags);
+	raw_spin_unlock_irqrestore(&pmu->lock, flags);

 	return 0;
 }
@@ -297,89 +304,71 @@ static int power_cpu_exit(int cpu)
 	struct power_pmu *pmu = per_cpu(amd_power_pmu, cpu);
 	int i, cu, ret = 0;
 	int target = nr_cpumask_bits;
-	cpumask_var_t mask, tmp_mask;

 	cu = cpu / cores_per_cu;

-	if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
-		return -ENOMEM;
-
-	if (!zalloc_cpumask_var(&tmp_mask, GFP_KERNEL)) {
-		ret = -ENOMEM;
-		goto out;
-	}
+	cpumask_clear(pmu->mask);
+	cpumask_clear(pmu->tmp_mask);

 	for (i = 0; i < cores_per_cu; i++)
-		cpumask_set_cpu(i, mask);
+		cpumask_set_cpu(i, pmu->mask);

-	cpumask_shift_left(mask, mask, cu * cores_per_cu);
+	cpumask_shift_left(pmu->mask, pmu->mask, cu * cores_per_cu);

 	cpumask_clear_cpu(cpu, &cpu_mask);
-	cpumask_clear_cpu(cpu, mask);
+	cpumask_clear_cpu(cpu, pmu->mask);

-	if (!cpumask_and(tmp_mask, mask, cpu_online_mask))
-		goto out1;
+	if (!cpumask_and(pmu->tmp_mask, pmu->mask, cpu_online_mask))
+		goto out;

 	/*
 	 * find a new CPU on same compute unit, if was set in cpumask
 	 * and still some CPUs on compute unit, then move to the new
 	 * CPU
 	 */
-	target = cpumask_any(tmp_mask);
+	target = cpumask_any(pmu->tmp_mask);
 	if (target < nr_cpumask_bits && target != cpu)
 		cpumask_set_cpu(target, &cpu_mask);

 	WARN_ON(cpumask_empty(&cpu_mask));

-out1:
+out:
 	/*
 	 * migrate events and context to new CPU
 	 */
 	if (target < nr_cpumask_bits)
 		perf_pmu_migrate_context(pmu->pmu, cpu, target);

-	free_cpumask_var(tmp_mask);
-out:
-	free_cpumask_var(mask);
-
 	return ret;
 }

 static int power_cpu_init(int cpu)
 {
-	int i, cu, ret = 0;
-	cpumask_var_t mask, dummy_mask;
-
-	cu = cpu / cores_per_cu;
+	struct power_pmu *pmu = per_cpu(amd_power_pmu, cpu);
+	int i, cu;

-	if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
-		return -ENOMEM;
+	if (!pmu)
+		return 0;

-	if (!zalloc_cpumask_var(&dummy_mask, GFP_KERNEL)) {
-		ret = -ENOMEM;
-		goto out;
-	}
+	cu = cpu / cores_per_cu;

 	for (i = 0; i < cores_per_cu; i++)
-		cpumask_set_cpu(i, mask);
+		cpumask_set_cpu(i, pmu->mask);

-	cpumask_shift_left(mask, mask, cu * cores_per_cu);
+	cpumask_shift_left(pmu->mask, pmu->mask, cu * cores_per_cu);

-	if (!cpumask_and(dummy_mask, mask, &cpu_mask))
+	if (!cpumask_and(pmu->tmp_mask, pmu->mask, &cpu_mask))
 		cpumask_set_cpu(cpu, &cpu_mask);

-	free_cpumask_var(dummy_mask);
-out:
-	free_cpumask_var(mask);
-
-	return ret;
+	return 0;
 }

 static int power_cpu_prepare(int cpu)
 {
 	struct power_pmu *pmu = per_cpu(amd_power_pmu, cpu);
 	int phys_id = topology_physical_package_id(cpu);
+	int ret = 0;

 	if (pmu)
 		return 0;
@@ -391,7 +380,17 @@ static int power_cpu_prepare(int cpu)
 	if (!pmu)
 		return -ENOMEM;

-	spin_lock_init(&pmu->lock);
+	if (!zalloc_cpumask_var(&pmu->mask, GFP_KERNEL)) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	if (!zalloc_cpumask_var(&pmu->tmp_mask, GFP_KERNEL)) {
+		ret = -ENOMEM;
+		goto out1;
+	}
+
+	raw_spin_lock_init(&pmu->lock);

 	INIT_LIST_HEAD(&pmu->active_list);

@@ -400,12 +399,21 @@ static int power_cpu_prepare(int cpu)

 	per_cpu(amd_power_pmu, cpu) = pmu;

 	return 0;
+
+out1:
+	free_cpumask_var(pmu->mask);
+out:
+	kfree(pmu);
+
+	return ret;
 }

 static void power_cpu_kfree(int cpu)
 {
 	struct power_pmu *pmu = per_cpu(amd_power_pmu, cpu);

+	free_cpumask_var(pmu->mask);
+	free_cpumask_var(pmu->tmp_mask);
 	kfree(pmu);
 }
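
For anyone following the mask arithmetic in power_cpu_init() above,
here is a rough userspace sketch of the same idea. It is not kernel
code: mask_t, cu_mask() and cu_init() are made-up names for
illustration, and a plain uint64_t stands in for cpumask_var_t, so it
only covers up to 64 CPUs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: one bit per CPU, at most 64 CPUs. */
typedef uint64_t mask_t;

/* Build the mask of all cores in the compute unit that contains `cpu`. */
static mask_t cu_mask(unsigned int cpu, unsigned int cores_per_cu)
{
	unsigned int cu, i;
	mask_t mask = 0;

	assert(cores_per_cu > 0 && cpu < 64);

	cu = cpu / cores_per_cu;

	/* Set the low cores_per_cu bits, as the cpumask_set_cpu() loop does. */
	for (i = 0; i < cores_per_cu; i++)
		mask |= (mask_t)1 << i;

	/* Shift them up to this compute unit, like cpumask_shift_left(). */
	return mask << (cu * cores_per_cu);
}

/*
 * Mirror of the decision in power_cpu_init(): add `cpu` to *cpu_mask
 * only if no core of its compute unit is in there yet.
 * Returns 1 if `cpu` was added, 0 otherwise.
 */
static int cu_init(unsigned int cpu, unsigned int cores_per_cu,
		   mask_t *cpu_mask)
{
	if (cu_mask(cpu, cores_per_cu) & *cpu_mask)
		return 0;

	*cpu_mask |= (mask_t)1 << cpu;
	return 1;
}
```

With cores_per_cu = 2 and CPUs 0 to 7 brought up in order, calling
cu_init() for each one leaves cpu_mask at 0x55, that is, one reporting
CPU per compute unit.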