From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1760254AbdAJKZj (ORCPT <rfc822;w@1wt.eu>);
        Tue, 10 Jan 2017 05:25:39 -0500
Received: from mail-pf0-f174.google.com ([209.85.192.174]:35284 "EHLO
        mail-pf0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1750924AbdAJKZh (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Tue, 10 Jan 2017 05:25:37 -0500
From: David Carrillo-Cisneros <davidcc@google.com>
To: linux-kernel@vger.kernel.org
Cc: "x86@kernel.org" <x86@kernel.org>, Ingo Molnar <mingo@redhat.com>,
        Thomas Gleixner <tglx@linutronix.de>, Andi Kleen <ak@linux.intel.com>,
        Kan Liang <kan.liang@intel.com>, Peter Zijlstra <peterz@infradead.org>,
        Borislav Petkov <bp@suse.de>,
        Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
        Dave Hansen <dave.hansen@linux.intel.com>,
        Vikas Shivappa <vikas.shivappa@linux.intel.com>,
        Mark Rutland <mark.rutland@arm.com>,
        Arnaldo Carvalho de Melo <acme@kernel.org>,
        Vince Weaver <vince@deater.net>, Paul Turner <pjt@google.com>,
        Stephane Eranian <eranian@google.com>,
        David Carrillo-Cisneros <davidcc@google.com>
Subject: [RFC 0/6] optimize ctx switch with rb-tree
Date: Tue, 10 Jan 2017 02:24:56 -0800
Message-Id: <20170110102502.106187-1-davidcc@google.com>
X-Mailer: git-send-email 2.11.0.390.gc69c2f50cf-goog
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Following the discussion in:
https://patchwork.kernel.org/patch/9420035/

This is an early version of a series of perf context switch
optimizations.

The main idea is to create and maintain a list of inactive events sorted
by timestamp, and an rb-tree to index it. The rb-tree's keys are
{cpu,flexible,stamp} for task contexts and {cgroup,flexible,stamp}
for CPU contexts.
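
To make the key ordering concrete, here is a minimal userspace sketch of
a lexicographic comparison over {cpu,flexible,stamp}. The struct and
function names are hypothetical and are not taken from the patches; the
field order simply follows the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical key for a task context; names are illustrative only. */
struct group_key {
	int      cpu;       /* CPU the event group is bound to */
	int      flexible;  /* 0 = pinned, 1 = flexible */
	uint64_t stamp;     /* timestamp used to order the inactive list */
};

/*
 * Lexicographic comparison: cpu first, then flexible, then stamp.
 * Returns <0, 0 or >0, in the style of a qsort() comparator.
 */
static int group_key_cmp(const struct group_key *a, const struct group_key *b)
{
	assert(a && b);

	if (a->cpu != b->cpu)
		return a->cpu < b->cpu ? -1 : 1;
	if (a->flexible != b->flexible)
		return a->flexible < b->flexible ? -1 : 1;
	if (a->stamp != b->stamp)
		return a->stamp < b->stamp ? -1 : 1;
	return 0;
}
```

With this ordering, all groups sharing the same (cpu, flexible) prefix
are adjacent in key order and, within that run, ordered by stamp. A CPU
context would presumably use a cgroup identifier in place of cpu.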

The rb-tree provides functions to find intervals in the inactive event
list so that ctx_sched_in only has to visit the events that can
potentially be scheduled (i.e. it avoids iterating over events bound
to CPUs or cgroups that are not current).
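
As a rough illustration of the interval lookup, the sketch below uses a
sorted array and binary search as a stand-in for the rb-tree plus list.
It assumes the groups are ordered by the full {cpu,flexible,stamp} key;
the patches may organize this differently, and every name here is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inactive group; the array stands in for the rb-tree. */
struct inactive_group {
	int      cpu;
	int      flexible;
	uint64_t stamp;
};

/* Compare only the (cpu, flexible) prefix of a group's key. */
static int prefix_cmp(const struct inactive_group *g, int cpu, int flexible)
{
	if (g->cpu != cpu)
		return g->cpu < cpu ? -1 : 1;
	if (g->flexible != flexible)
		return g->flexible < flexible ? -1 : 1;
	return 0;
}

/*
 * Find the half-open interval [*first, *last) of groups whose prefix is
 * (cpu, flexible). Only that interval needs to be visited at sched in.
 */
static void find_interval(const struct inactive_group *groups, size_t n,
			  int cpu, int flexible, size_t *first, size_t *last)
{
	size_t lo = 0, hi = n;

	assert(groups || n == 0);

	/* Lower bound: first index whose prefix is >= (cpu, flexible). */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (prefix_cmp(&groups[mid], cpu, flexible) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	*first = lo;

	/* Upper bound: first index whose prefix is > (cpu, flexible). */
	hi = n;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (prefix_cmp(&groups[mid], cpu, flexible) <= 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	*last = lo;
}
```

Both searches are logarithmic in the number of inactive groups, so the
cost of a sched in depends mostly on how many groups match the current
CPU (or cgroup) rather than on the total number of groups in the context.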

Since the inactive list is sorted by timestamp, rotation can be done by
simply scheduling the events out and back in. This implies that on each
timer interrupt the events rotate by q events (where q is the number
of hardware counters). This changes the current behavior of rotation.
Feedback welcome!
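
The rotation-by-q effect can be seen in a toy model. The sketch below
assumes that scheduling out appends events to the tail of the inactive
list (they get the newest stamps) and that scheduling in takes up to q
events from the head. It ignores pinned events, groups and CPU/cgroup
filtering, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 16

/* Toy context: event ids only, inactive[] is ordered oldest stamp first. */
struct ctx_sketch {
	int    inactive[MAX_EVENTS];
	size_t nr_inactive;
	int    active[MAX_EVENTS];
	size_t nr_active;
};

/* Sched out: active events get the newest stamps, so they go to the tail. */
static void sched_out_all(struct ctx_sketch *c)
{
	for (size_t i = 0; i < c->nr_active; i++) {
		assert(c->nr_inactive < MAX_EVENTS);
		c->inactive[c->nr_inactive++] = c->active[i];
	}
	c->nr_active = 0;
}

/* Sched in: take up to q events from the head (the oldest stamps). */
static void sched_in(struct ctx_sketch *c, size_t q)
{
	size_t take = q < c->nr_inactive ? q : c->nr_inactive;

	assert(c->nr_active + take <= MAX_EVENTS);
	for (size_t i = 0; i < take; i++)
		c->active[c->nr_active++] = c->inactive[i];

	/* Drop the scheduled events from the head of the inactive list. */
	for (size_t i = take; i < c->nr_inactive; i++)
		c->inactive[i - take] = c->inactive[i];
	c->nr_inactive -= take;
}

/* One timer interrupt: out and back in, which rotates by q events. */
static void timer_tick(struct ctx_sketch *c, size_t q)
{
	sched_out_all(c);
	sched_in(c, q);
}
```

With five events 0..4 and q = 2, successive ticks in this model run
events {0,1}, then {2,3}, then {4,0}: the window advances by q events
per tick.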

I haven't profiled the new approach. I am only assuming it will be
superior when the number of per-cpu or distinct cgroup events is large.

The last patch shows how perf_iterate_ctx can use the new rb-tree index
to reduce the number of visited events. I haven't yet checked carefully
whether locking and other details are correct.

If these changes are in the right direction, a next version could remove
some existing code; specifically, the lists ctx->pinned_groups and
ctx->flexible_groups could be removed. Also, event_filter_match could be
simplified when called on event groups filtered using the rb-tree, since
both perform similar checks.

David Carrillo-Cisneros (6):
  perf/core: create active and inactive event groups
  perf/core: add a rb-tree index to inactive_groups
  perf/core: use rb-tree to sched in event groups
  perf/core: avoid rb-tree traversal when no inactive events
  perf/core: rotation no longer neccesary. Behavior has changed. Beware
  perf/core: use rb-tree index to optimize filtered  perf_iterate_ctx

 include/linux/perf_event.h |  13 ++
 kernel/events/core.c       | 466 +++++++++++++++++++++++++++++++++++++++------
 2 files changed, 426 insertions(+), 53 deletions(-)

-- 
2.11.0.390.gc69c2f50cf-goog