From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9F19219CC33; Wed, 29 Jan 2025 21:24:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738185851; cv=none; b=Fo8nn/rZcfbq/RJJ3XptXQoddmApJX3GV871v8gWBpWr6JDBAbPeUf4JlxufdSMKynzdO3vT2XGw0jHV/8jMvYKKUy6h7fDH4SRpV/aqGyT5GWq0Y8d1ICblYpP7gxeRL7QAnXPes9t93sA8i3woiluZ4C/qOVnIBaoQTwJ94+w= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738185851; c=relaxed/simple; bh=QmMDy4hSKkSGdkNKClNFLZcY0gag8iYZv1FwISiITCw=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=tHB2Gj0WxD+OEfvtyERUnJZJQQ8mIx/froByj724nLR22vMfgxjMG0gTb+YS9Yq3W4HdJQob6NWGIPTjTk+L6SVatMXqPGFTJUYmMSVR6BCf6+W0BX4cthNjt6v8UlD4WmesZ+grpOeLMqAd3438Ba79hQ4GkNJPI8ptZbkWoiw= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=HoRPa/11; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="HoRPa/11" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5A664C4CED1; Wed, 29 Jan 2025 21:24:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1738185850; bh=QmMDy4hSKkSGdkNKClNFLZcY0gag8iYZv1FwISiITCw=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=HoRPa/11Ni4HPEGxiGEh6wiztJMtzAUy3Z/81nQktNBVT2tLAOqAdNWR1C1oYpTA+ /UN69mZqQoXlvvAfOuyJuB9eslpawixmAD6d+rJgOCJkS7ovPKPDZvRyB/hXasaVy9 n1sGTeEWGE6zugv2RZnAg7xOrI8kAmeixpXAfQCB+Dt6EMimdM3bAZo87WCuraruh1 +ar1HKKZguzlXoGd3pIlHEUiUlMN2kApLvmthrq+zgU+//OsiOqxXf5+bKhpU9G6X4 6xxEIP0/DYuZ/6D02wQ6MSPYb2WS82xeAAOppCa8lgi0DDisJuyK3o3FNipQx9f/Zi vpC1ErAR2GIRQ== Date: Wed, 29 Jan 2025 13:24:07 -0800 From: Namhyung Kim To: Ian Rogers Cc: Arnaldo Carvalho de Melo , Linus Torvalds , Peter Zijlstra , Ingo Molnar , Mark Rutland , Alexander Shishkin , Jiri Olsa , Adrian Hunter , Kan Liang , James Clark , Ze Gao , Weilin Wang , Dominique Martinet , Jean-Philippe Romain , Junhao He , linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, bpf@vger.kernel.org, Aditya Bodkhe , Leo Yan , Atish Patra Subject: Re: [PATCH v5 3/4] perf record: Skip don't fail for events that don't open Message-ID: References: <20250109222109.567031-4-irogers@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: Hi Ian, Sorry for the delay. On Wed, Jan 15, 2025 at 09:56:59AM -0800, Ian Rogers wrote: > On Wed, Jan 15, 2025 at 9:31 AM Namhyung Kim wrote: > > > > On Mon, Jan 13, 2025 at 03:04:26PM -0800, Ian Rogers wrote: > > > On Mon, Jan 13, 2025 at 12:51 PM Namhyung Kim wrote: > > > > > > > > Hi Ian, > > > > > > > > On Fri, Jan 10, 2025 at 01:33:57PM -0800, Ian Rogers wrote: > > > > > On Fri, Jan 10, 2025 at 11:26 AM Namhyung Kim wrote: > > > > > > > > > > > > On Fri, Jan 10, 2025 at 08:42:02AM -0800, Ian Rogers wrote: [...] > > > > > > > A patch lowering the priority of error messages should be independent > > > > > > > of the 4 changes here. I'd be happy if someone follows this series > > > > > > > with a patch doing it. > > > > > > > > > > > > I think the error behavior is a part of this change. > > > > > > > > > > I disagree with it, so I think you need to address my comments. > > > > > > > > You are changing the error behavior by skipping failed events then the > > > > relevant error messages should be handled properly in this patchset. > > > > > > I'm not sure what you are asking and I'm not sure why it matters? > > > Previously you'd asked for all the output to be moved under verbose. > > > > > > If I specify an event that doesn't work with perf record today then it > > > fails. With this patch it fails too. If that event is a core PMU event > > > then there will be an error message for each core PMU that doesn't > > > support the event. So I get 2 error messages on hybrid. This doesn't > > > feel egregious or warrant a new error message mechanism. I would like > > > it so that evsels supported 1 or more PMUs, in which case this would > > > be 1 error message. > > > > > > If I specify perf record today on an uncore event then perf record > > > fails and I get 1 error message for the uncore PMU. The new behavior > > > will be to get 1 error message per uncore PMU. If I'm on a server with > > > 10s of uncore PMUs then maybe the message is spammy, but the command > > > fails today and will continue to fail with this series. I don't see a > > > motivation to change or optimize for this case and again, evsels that > > > support >1 PMU would be the most appropriate fix. > > > > > > The only case where there is no message today but would be with this > > > patch series is for cycles on ARM's neoverse. There will be one > > > warning for the evsel on the SLC PMU. That's one warning and not many. > > > > > > As I've said, if you want a more elaborate error reporting system then > > > take these patches and add it to them. There's a larger refactor to > > > make evsels support >1 PMU that would clean up the many events on > > > server uncore PMUs issue, but that shouldn't be part of this series > > > nor gate it. If you are trying to perf record on uncore PMUs then you > > > already have problems and optimizing the error messages for your > > > mistake, I don't get why it matters? > > > > What about with multiple events in the command line - one of them > > failing with >1 PMUs and the command now succeeds? > > So this would be something like: > ``` > $ perf record -e cycles,instructions,data_read -a sleep 1 > ``` > where data_read is an uncore PMU event. The current behavior is: > ``` > $ perf record -e cycles,instructions,data_read -a sleep 1 > Error: > The sys_perf_event_open() syscall returned with 22 (Invalid argument) > for event (data_read). > "dmesg | grep -i perf" may provide additional information. > ``` > The new behavior is: > ``` > $ perf record -e cycles,instructions,data_read -a sleep 1 > Error: > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0' > which will be removed. > The sys_perf_event_open() syscall returned with 22 (Invalid argument) > for event (data_read). > "dmesg | grep -i perf" may provide additional information. > > Error: > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_1' > which will be removed. > The sys_perf_event_open() syscall returned with 22 (Invalid argument) > for event (data_read). > "dmesg | grep -i perf" may provide additional information. > > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 3.138 MB perf.data (11670 samples) ] > ``` > > We know nobody does this, as the command currently fails. It succeeds > with this change, because that's the whole point of the change. Well, I think it's because it failed before. New users can come anytime and do whatever they want (or can). They might pass 100 failing events with 1 successful event and it will give a ton of warnings with this. So it'd be better ratelimit the message and make it optional (with -v). But more importantly, I think we should agree on the patch 4 first. Thanks, Namhyung > I'm not offended by seeing the event was being opened on >1 PMU. For the > only currently succeeding situation where this will now warn, the > cycles case on Neoverse because of the buggy event name in ARM's SLC > PMU, there will be 1 warning. For my example the appropriate fix is to > remove the data_read event. For the Neoverse case, specifying the PMU > resolves the issue until ARM fixes their driver.