From mboxrd@z Thu Jan 1 00:00:00 1970 From: viresh.kumar@linaro.org (Viresh Kumar) Date: Fri, 9 Sep 2016 16:52:04 +0530 Subject: [PATCH] cpufreq: create link to policy only for registered CPUs In-Reply-To: <20160909111620.GQ1041@n2100.armlinux.org.uk> References: <20160819110032.GM1041@n2100.armlinux.org.uk> <20160909111620.GQ1041@n2100.armlinux.org.uk> Message-ID: <20160909112204.GE18547@vireshk-i7> To: linux-arm-kernel@lists.infradead.org List-Id: linux-arm-kernel.lists.infradead.org On 09-09-16, 12:16, Russell King - ARM Linux wrote: > On Fri, Sep 09, 2016 at 03:24:14PM +0530, Viresh Kumar wrote: > > If a cpufreq driver is registered very early in the boot stage (e.g. > > registered from postcore_initcall()), then cpufreq core may generate > > kernel warnings for it. > > > > In this case, the CPUs are registered as devices with the kernel only > > after the cpufreq driver is registered, while the CPUs were brought > > online way before that. > > This seems confusing, maybe: > > "In this case, the CPUs are brought online, then the cpufreq driver is > registered, and then the CPU topology devices are registered." > > which gives more of a linear A happens, then B, then C. Sure, thanks for the tip.. > > ... And by the time cpufreq_add_dev() gets called, > > the cpu device isn't stored in the per-cpu variable (cpu_sys_devices,) > > which is read by get_cpu_device(). > > s/And by/By/ or "However, by" > > > And so cpufreq core fails to get device for the CPU, for which > > cpufreq_add_dev() was called in the first place and we will hit a > > WARN_ON(!cpu_dev). > > s/And so/So the/ > > This isn't the WARN_ON() statement that's triggering for me. The WARN_ON() that was triggering for you was already removed by a patch from Rafael (see below), but with that patch, you would have hit this WARN_ON() :(. > > Even if we reuse the 'dev' parameter passed to cpufreq_add_dev() to > > avoid that warning, there might be other CPUs online that share the > > policy with the cpu for which cpufreq_add_dev() is called. And > > eventually get_cpu_device() will return NULL for them as well, and we > > will hit the same WARN_ON() again. > > s/And eventually/Eventually/ Thanks for all the suggestions.. > > In order to fix these issues, change cpufreq core to create links to the > > policy for a cpu only when cpufreq_add_dev() is called for that CPU. > > > > Reuse the 'real_cpus' mask to track that as well. > > > > Note that cpufreq_remove_dev() already handles removal of the links for > > individual CPUs and cpufreq_add_dev() has aligned with that now. > > I applied this patch, but I still get: > > WARNING: CPU: 0 PID: 1 at drivers/cpufreq/cpufreq.c:1040 cpufreq_add_dev+0x144/0x634 > Modules linked in: > CPU: 0 PID: 1 Comm: swapper Not tainted 4.8.0-rc5+ #1061 > Hardware name: Intel-Assabet > Backtrace: > [] (dump_backtrace) from [] (show_stack+0x18/0x1c) > r6:00000000 r5:c05ca11d r4:00000000 > [] (show_stack) from [] (dump_stack+0x20/0x28) > [] (dump_stack) from [] (__warn+0xd0/0xfc) > [] (__warn) from [] (warn_slowpath_null+0x28/0x30) > r10:00000000 r8:c062927c r7:00000000 r6:00000000 r5:c063d288 r4:00000000 > [] (warn_slowpath_null) from [] (cpufreq_add_dev+0x144/0x634) > [] (cpufreq_add_dev) from [] (bus_probe_device+0x5c/0x84) > r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:c062927c r5:c063d288 > r4:c0640148 > [] (bus_probe_device) from [] (device_add+0x390/0x520) > r6:c0629284 r5:00000000 r4:c062927c > [] (device_add) from [] (device_register+0x1c/0x20) > r10:c061d848 r8:c0603524 r7:00000001 r6:00000000 r5:c062927c r4:c062927c > [] (device_register) from [] (register_cpu+0x88/0xac) > r4:c0629274 > [] (register_cpu) from [] (topology_init+0x20/0x2c) > r7:c0646760 r6:c0623568 r5:c061d834 r4:00000000 > [] (topology_init) from [] (do_one_initcall+0xc0/0x178) > r4:00000004 > [] (do_one_initcall) from [] (kernel_init_freeable+0xfc/0x1c4) > r10:c061d848 r9:00000000 r8:0000008c r7:c0646760 r6:c0623568 r5:c061d834 > r4:00000004 > [] (kernel_init_freeable) from [] (kernel_init+0x10/0xf4) > r10:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c052b6bc r4:00000000 > [] (kernel_init) from [] (ret_from_fork+0x14/0x24) > r4:00000000 > ---[ end trace d7209ea270f4f585 ]--- > > I'm afraid I rather predicted that after reading the patch but before > running the test: the patch does nothing to solve the original warning, > as the code path which gets us to that warning remains untouched by > this patch. > > The code path is: > > static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) > { > if (cpu_online(cpu)) > return cpufreq_online(cpu); > > static int cpufreq_online(unsigned int cpu) > { > policy = per_cpu(cpufreq_cpu_data, cpu); > if (policy) { > } else { > new_policy = true; > policy = cpufreq_policy_alloc(cpu); > > static struct cpufreq_policy *cpufreq_policy_alloc(unsigned int cpu) > { > if (WARN_ON(!dev)) > return NULL; > > The only change in your patch that affected this path was this: > > - if (cpu_online(cpu)) > - return cpufreq_online(cpu); > + if (cpu_online(cpu)) { > + ret = cpufreq_online(cpu); > + if (ret) > + return ret; > + } > > which obviously has no bearing on that WARN_ON() firing. > > Maybe I'm testing the wrong patch. Thanks for testing it.. You need another patch from Rafael, which should be in linux-next by now.. commit 3689ad7ed6a8 ("cpufreq: Drop unnecessary check from cpufreq_policy_alloc()") Both patches combined will fix the problem you were getting. -- viresh