From: Andi Kleen
Subject: Re: perf_event: rdpmc self-monitoring overhead issue
Date: Mon, 2 Sep 2013 19:26:59 +0200
Message-ID: <20130902172659.GJ19750@two.firstfloor.org>
References: <87ob8cdji8.fsf@tassilo.jf.intel.com>
To: Vince Weaver
Cc: eranian@gmail.com, Andi Kleen, LKML, linux-perf-users@vger.kernel.org, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo

> I assume he means MAP_POPULATE

Yes.

>
> which does improve things, from ~3000 cycles to ~219 cycles but that's
> still more overhead than the ~130 or so you get by manually touching the
> page first.

That seems odd. It should be the same.

Can you do a trace-cmd function trace and compare the two cases?

trace-cmd record -p function_graph ...
trace-cmd report

(as usual for tracing perf remove the useless -pg removal for perf in
kernel/events/Makefile and arch/x86/kernel/cpu/Makefile first)

-Andi
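
For readers following the thread, here is a minimal sketch of the two prefaulting approaches being compared: asking the kernel to populate the mapping with MAP_POPULATE, versus mapping normally and touching each page by hand. It is only an illustration under stated assumptions. It uses an anonymous shared mapping as a stand-in for the perf_event mmap page, the helper names map_populated and map_and_touch are hypothetical, and it makes no attempt to reproduce the cycle counts quoted above.

```c
#define _GNU_SOURCE            /* exposes MAP_POPULATE and MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `len` bytes and ask the kernel to prefault
 * the pages at mmap() time via MAP_POPULATE.
 * ASSUMPTION: an anonymous mapping stands in for the perf_event fd mapping.
 */
static void *map_populated(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical helper: map `len` bytes without MAP_POPULATE, then read one
 * byte from every page so the page faults are taken up front.
 */
static void *map_and_touch(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(len > 0 && page > 0);

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    volatile unsigned char *bytes = p;
    for (size_t off = 0; off < len; off += (size_t)page)
        (void)bytes[off];      /* volatile read forces the fault now */

    return p;
}
```

In both cases the first measured access should find the page already mapped, which is why the two paths would be expected to cost about the same. Release either mapping with munmap(p, len) when done.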