From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751430AbdBBL3S (ORCPT ); Thu, 2 Feb 2017 06:29:18 -0500
Received: from mx1.redhat.com ([209.132.183.28]:41488 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751245AbdBBL3Q (ORCPT ); Thu, 2 Feb 2017 06:29:16 -0500
Date: Thu, 2 Feb 2017 12:29:13 +0100
From: Jiri Olsa
To: Jan Stancek
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
	acme@kernel.org, alexander.shishkin@linux.intel.com, jolsa@kernel.org,
	mhiramat@kernel.org, rui.teng@linux.vnet.ibm.com,
	sukadev@linux.vnet.ibm.com
Subject: Re: [PATCH] perf: fix topology test on systems with sparse CPUs
Message-ID: <20170202112913.GA2305@krava>
References: <290bf2031885722414cb1ae031869094a18b0580.1485794959.git.jstancek@redhat.com>
	<20170130184908.GB28444@krava>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.7.1 (2016-10-04)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
	(mx1.redhat.com [10.5.110.39]); Thu, 02 Feb 2017 11:29:17 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 31, 2017 at 05:03:51PM +0100, Jan Stancek wrote:
> On 01/30/2017 07:49 PM, Jiri Olsa wrote:
> > so basically we're changing from avail to online cpus
> >
> > have you checked all the users of this FEATURE
> > if such change is ok?
>
> Jiri,
>
> It wasn't OK as there are other users who index cpu_topology_map by CPU id.
> I decided to give the alternative a try (attached): keep cpu_topology_map
> indexed by CPU id, but extend it to fit max present CPU.

please send this next time as a standard patchset,
it's hard to discuss over attachments

SNIP

> When build_cpu_topo() encounters offline/absent CPUs,
> it fails to find any sysfs entries and returns failure.
> This leads to build_cpu_topology() and write_cpu_topology()
> failing as well.
>
> Because HEADER_CPU_TOPOLOGY has not been written, read leaves
> cpu_topology_map NULL and we get NULL ptr deref at:
>
>   ...
>    cmd_test
>     __cmd_test
>      test_and_print
>       run_test
>        test_session_topology
>         check_cpu_topology

So IIUC that's the key issue here.. write_cpu_topology fails
to write the TOPO data and the following readers crash on
processing incomplete data?

if that's the case, write_cpu_topology needs to be fixed,
instead of doing workarounds

SNIP

> 	u32 nr, i;
> 	size_t sz;
> 	long ncpus;
> -	int ret = -1;
> +	int ret = 0;
> +	struct cpu_map *map;
>
> 	ncpus = sysconf(_SC_NPROCESSORS_CONF);
> 	if (ncpus < 0)
> -		return NULL;
> +		goto out;

can just return NULL

> +
> +	/* build online CPU map */
> +	map = cpu_map__new(NULL);
> +	if (map == NULL) {
> +		pr_debug("failed to get system cpumap\n");
> +		goto out;
> +	}
>
> 	nr = (u32)(ncpus & UINT_MAX);
>
> 	sz = nr * sizeof(char *);
> -
> 	addr = calloc(1, sizeof(*tp) + 2 * sz);
> 	if (!addr)
> -		return NULL;
> +		goto out_free;
>
> 	tp = addr;
> 	tp->cpu_nr = nr;
> @@ -530,14 +537,21 @@ static struct cpu_topo *build_cpu_topology(void)
> 	tp->thread_siblings = addr;
>
> 	for (i = 0; i < nr; i++) {
> +		if (!cpu_map__has(map, i))
> +			continue;
> +

so this prevents build_cpu_topo from failing due to missing topology
info because the cpu is offline.. can it fail for other reasons?

> 		ret = build_cpu_topo(tp, i);
> 		if (ret < 0)
> 			break;

SNIP
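
To make the two sides of the discussion concrete, here is a minimal
standalone sketch. It is not perf code: struct topo_sketch and the
topo_sketch_*() helpers are hypothetical names, and the "0-1" string
only stands in for a sibling list that would be read from sysfs. It
assumes the table stays indexed by CPU id, that offline CPUs keep an
empty slot instead of failing the whole build, and that readers check
for a missing table or slot before dereferencing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for perf's cpu_topo: one slot per configured CPU id. */
struct topo_sketch {
	size_t nr;		/* number of configured CPU ids */
	const char **core;	/* core[i] stays NULL when CPU i is offline */
};

/*
 * Writer side: build a table indexed by CPU id. Offline CPUs are
 * skipped and keep a NULL slot, so one missing CPU does not fail
 * the whole build.
 */
static struct topo_sketch *topo_sketch_build(const bool *online, size_t nr)
{
	struct topo_sketch *tp;
	size_t i;

	assert(online != NULL);

	tp = calloc(1, sizeof(*tp));
	if (!tp)
		return NULL;

	tp->core = calloc(nr, sizeof(*tp->core));
	if (!tp->core) {
		free(tp);
		return NULL;
	}
	tp->nr = nr;

	for (i = 0; i < nr; i++) {
		if (!online[i])
			continue;	/* no topology info for offline CPUs */
		tp->core[i] = "0-1";	/* placeholder sibling list */
	}
	return tp;
}

/* Reader side: tolerate a missing table, an out-of-range id, or an empty slot. */
static const char *topo_sketch_core(const struct topo_sketch *tp, size_t cpu)
{
	if (!tp || cpu >= tp->nr)
		return NULL;
	return tp->core[cpu];
}

static void topo_sketch_free(struct topo_sketch *tp)
{
	if (!tp)
		return;
	free(tp->core);
	free(tp);
}
```

With this shape a reader gets NULL both for an offline CPU and for a
table that was never written, so it can skip the entry rather than
crash. Whether perf should handle it this way is exactly the open
question in this thread.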