From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752790AbdBMNdD (ORCPT ); Mon, 13 Feb 2017 08:33:03 -0500 Received: from merlin.infradead.org ([205.233.59.134]:33340 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751472AbdBMNdC (ORCPT ); Mon, 13 Feb 2017 08:33:02 -0500 Date: Mon, 13 Feb 2017 14:32:53 +0100 From: Peter Zijlstra To: Kefeng Wang Cc: Ingo Molnar , linux-kernel@vger.kernel.org, guohanjun@huawei.com, Prarit Bhargava , Tejun Heo Subject: Re: [PATCH] sched/isolcpus: Show isolated cpu map Message-ID: <20170213133253.GP6515@twins.programming.kicks-ass.net> References: <1486979039-17874-1-git-send-email-wangkefeng.wang@huawei.com> <20170213120650.GN6515@twins.programming.kicks-ass.net> <46f994b0-ca60-65d7-3675-c01aeda5e439@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <46f994b0-ca60-65d7-3675-c01aeda5e439@huawei.com> User-Agent: Mutt/1.5.23.1 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Feb 13, 2017 at 09:07:02PM +0800, Kefeng Wang wrote: > Hi Peter > > +Tejun > > On 2017/2/13 20:06, Peter Zijlstra wrote: > > On Mon, Feb 13, 2017 at 05:43:59PM +0800, Kefeng Wang wrote: > >> The commit a6e4491c682a ("sched/isolcpus: Output warning when the > >> 'isolcpus=' kernel parameter is invalid") adds an error message > >> when specified cpu bigger than nr_cpu_ids, but nr_cpumask_bits in > >> cpulist_parse() could be nr_cpu_ids or NR_CPUS. > >> > >> eg, NR_CPUS=64, nr_cpu_ids=8 in ARM64, cpulist_parse() won't return > >> -ERANGE if isolcpus=1-10; > >> > > > > But why does cpulist_parse() use nr_cpumask_bits, that seems to be the > > problem, so why not look there? > > > > > > Paste the Tejun's patch, > > commit 4d59b6ccf000862beed6fc0765d3209f98a8d8a2 > Author: Tejun Heo > Date: Wed Feb 8 14:30:56 2017 -0800 > > cpumask: use nr_cpumask_bits for parsing functions > > Commit 513e3d2d11c9 ("cpumask: always use nr_cpu_ids in formatting and > parsing functions") converted both cpumask printing and parsing > functions to use nr_cpu_ids instead of nr_cpumask_bits. While this was > okay for the printing functions as it just picked one of the two output > formats that we were alternating between depending on a kernel config, > doing the same for parsing wasn't okay. > > nr_cpumask_bits can be either nr_cpu_ids or NR_CPUS. We can always use > nr_cpu_ids but that is a variable while NR_CPUS is a constant, so it can > be more efficient to use NR_CPUS when we can get away with it. > Converting the printing functions to nr_cpu_ids makes sense because it > affects how the masks get presented to userspace and doesn't break > anything; however, using nr_cpu_ids for parsing functions can > incorrectly leave the higher bits uninitialized while reading in these > masks from userland. As all testing and comparison functions use > nr_cpumask_bits which can be larger than nr_cpu_ids, the parsed cpumasks > can erroneously yield false negative results. > > This made the taskstats interface incorrectly return -EINVAL even when > the inputs were correct. > > Fix it by restoring the parse functions to use nr_cpumask_bits instead > of nr_cpu_ids. OK, so its wrong both ways. Problem seems to be that cpumask is internally inconsistent with the number of bits because a small constant NR_CPUS is more efficient for things like cpumask_subset(). If everything were consistent and used nr_cpu_ids it would all be fine, but using a mixture is giving pain. Does something like the below work? It parses up to nr_cpu_ids and then and's with cpu_possible_mask (which has all bits set). In case nr_cpumask_bits is larger than nr_cpu_ids this should result in clearing the top bits (and therefore not leave them uninitialized). And using nr_cpu_ids for parsing now makes the range check work again. Since parsing in general is a really slow thing anyway, the extra cpumask operation doesn't matter. --- include/linux/cpumask.h | 32 +++++++++++++++++++++++++++----- 1 file changed, 27 insertions(+), 5 deletions(-) diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h index 96f1e88b767c..6cf8945b999d 100644 --- a/include/linux/cpumask.h +++ b/include/linux/cpumask.h @@ -560,7 +560,13 @@ static inline void cpumask_copy(struct cpumask *dstp, static inline int cpumask_parse_user(const char __user *buf, int len, struct cpumask *dstp) { - return bitmap_parse_user(buf, len, cpumask_bits(dstp), nr_cpumask_bits); + int ret; + + ret = bitmap_parse_user(buf, len, cpumask_bits(dstp), nr_cpu_ids); + if (!ret) + cpumask_and(dstp, dstp, cpu_possible_mask); + + return ret; } /** @@ -574,8 +580,13 @@ static inline int cpumask_parse_user(const char __user *buf, int len, static inline int cpumask_parselist_user(const char __user *buf, int len, struct cpumask *dstp) { - return bitmap_parselist_user(buf, len, cpumask_bits(dstp), - nr_cpumask_bits); + int ret; + + ret = bitmap_parselist_user(buf, len, cpumask_bits(dstp), nr_cpu_ids); + if (!ret) + cpumask_and(dstp, dstp, cpu_possible_mask); + + return ret; } /** @@ -589,8 +600,13 @@ static inline int cpumask_parse(const char *buf, struct cpumask *dstp) { char *nl = strchr(buf, '\n'); unsigned int len = nl ? (unsigned int)(nl - buf) : strlen(buf); + int ret; - return bitmap_parse(buf, len, cpumask_bits(dstp), nr_cpumask_bits); + ret = bitmap_parse(buf, len, cpumask_bits(dstp), nr_cpu_ids); + if (!ret) + cpumask_and(dstp, dstp, cpu_possible_mask); + + return ret; } /** @@ -602,7 +618,13 @@ static inline int cpumask_parse(const char *buf, struct cpumask *dstp) */ static inline int cpulist_parse(const char *buf, struct cpumask *dstp) { - return bitmap_parselist(buf, cpumask_bits(dstp), nr_cpumask_bits); + int ret; + + ret = bitmap_parselist(buf, cpumask_bits(dstp), nr_cpu_ids); + if (!ret) + cpumask_and(dstp, dstp, cpu_possible_mask); + + return ret; } /**