[PATCH V2 0/4] Optimize cgroup context switch

From: kan.liang@linux.intel.com
To: peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com,
	linux-kernel@vger.kernel.org
Cc: eranian@google.com, tj@kernel.org, mark.rutland@arm.com,
	irogers@google.com, ak@linux.intel.com,
	Kan Liang <kan.liang@linux.intel.com>
Subject: [PATCH V2 0/4] Optimize cgroup context switch
Date: Wed, 15 May 2019 14:01:28 -0700
Message-ID: <1557954092-67275-1-git-send-email-kan.liang@linux.intel.com>

From: Kan Liang <kan.liang@linux.intel.com>

Changes since V1:
- Add a new event_type to indicate a cgroup-only switch.
  Add cgrp_event_type to track the event type of a cgroup.
  Extend ctx_pinned/flexible_sched_in and struct sched_in_data to pass
  the event_type.
- If the new cgroup has pinned events, schedule out all flexible events
  before scheduling in all events.
- Add a macro and a helper function to replace duplicated content in patch 1.
- Add new RB tree keys, cgrp_id and cgrp_group_index, for cgroup.
  Now, cgrp_id is the same as the css subsys-unique ID.
- Add per-cpu pinned/flexible_event in perf_cgroup to track the leftmost
  event for a cgroup.
- Add per-cpu rotated_event in perf_cgroup to handle multiplexing.
  Disable the fast path for multiplexing.
- Support hierarchies.
- Update test results. Test with different hierarchies.


On systems with very high context switch rates between cgroups,
using cgroup perf incurs high overhead.

The current code has two issues.
- System-wide events are mistakenly switched during a cgroup
  context switch. This causes system-wide events to be miscounted
  and adds avoidable overhead.
  Patch 1 fixes the issue.
- The cgroup context switch sched_in is inefficient.
  All cgroup events share the same per-cpu pinned/flexible groups.
  The RB trees for pinned/flexible groups don't understand cgroups.
  The current code has to traverse all events and use
  event_filter_match() to filter the events for a specific cgroup.
  Patches 2-4 add a fast path for cgroup context switch sched_in by
  teaching the RB trees to understand cgroups, so the extra filtering
  can be avoided.
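
As a rough illustration of the difference between the two lookups
(this is not the code from the patches): the sketch below uses a sorted
array in place of the RB tree. The names demo_event, demo_events,
count_by_filter() and count_by_key() are made up for this example, and
the ordering by (cpu, cgrp_id, group_index) is an assumption about how
a cgroup-aware key could be laid out.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a perf event (not the kernel struct). */
struct demo_event {
	int cpu;
	int cgrp_id;		/* 0 means "no cgroup" (system-wide) in this sketch */
	int group_index;
};

/* Sample data, already sorted by (cpu, cgrp_id, group_index). */
static const struct demo_event demo_events[] = {
	{ 0, 0, 1 }, { 0, 0, 2 },	/* system-wide */
	{ 0, 1, 3 }, { 0, 1, 4 },	/* cgroup 1 */
	{ 0, 2, 5 }, { 0, 2, 6 },	/* cgroup 2 */
	{ 0, 3, 7 }, { 0, 3, 8 },	/* cgroup 3 */
};

/* Order an event against a (cpu, cgrp_id) search key. */
static int key_cmp(const struct demo_event *e, int cpu, int cgrp_id)
{
	if (e->cpu != cpu)
		return e->cpu < cpu ? -1 : 1;
	if (e->cgrp_id != cgrp_id)
		return e->cgrp_id < cgrp_id ? -1 : 1;
	return 0;
}

/* Slow path: visit every event and filter out the ones that do not match. */
static size_t count_by_filter(const struct demo_event *ev, size_t n,
			      int cpu, int cgrp_id, size_t *visited)
{
	size_t i, hits = 0;

	*visited = 0;
	for (i = 0; i < n; i++) {
		(*visited)++;
		if (key_cmp(&ev[i], cpu, cgrp_id) == 0)
			hits++;
	}
	return hits;
}

/*
 * Fast path: binary-search the leftmost event of the cgroup, then walk
 * only its events. Requires ev[] to be sorted as described above.
 */
static size_t count_by_key(const struct demo_event *ev, size_t n,
			   int cpu, int cgrp_id, size_t *visited)
{
	size_t lo = 0, hi = n, hits = 0;

	*visited = 0;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		(*visited)++;
		if (key_cmp(&ev[mid], cpu, cgrp_id) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	while (lo < n && key_cmp(&ev[lo], cpu, cgrp_id) == 0) {
		(*visited)++;
		hits++;
		lo++;
	}
	return hits;
}
```

Both functions return the same events for a given cgroup. The filter
version touches every entry, while the keyed version touches a
logarithmic number of entries plus the matches themselves.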


Here is a test with 6 child cgroups (sibling cgroups), 1 parent cgroup
and system-wide events.
A specjbb benchmark runs in each child cgroup.
The perf command is as follows.
   perf stat -e cycles,instructions -e cycles,instructions
   -e cycles,instructions -e cycles,instructions
   -e cycles,instructions -e cycles,instructions
   -e cycles,instructions -e cycles,instructions
   -G cgroup1,cgroup1,cgroup2,cgroup2,cgroup3,cgroup3
   -G cgroup4,cgroup4,cgroup5,cgroup5,cgroup6,cgroup6
   -G cgroup_parent,cgroup_parent
   -a -e cycles,instructions -I 1000

The average RT (Response Time) reported by specjbb is
used as the key performance metric (lower is better).

                                        RT(us)              Overhead
Baseline (no perf stat):                4286.9
Use cgroup perf, no patches:            4537.1                5.84%
Use cgroup perf, apply patch 1:         4440.7                3.59%
Use cgroup perf, apply all patches:     4403.5                2.72%
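
The Overhead column is consistent with the relative RT increase over
the baseline, (RT - baseline) / baseline. The formula is inferred from
the numbers above rather than stated explicitly, and overhead_pct() is
a name invented for this sketch.

```c
/* Relative slowdown of rt versus the baseline, in percent. */
static double overhead_pct(double baseline_rt, double rt)
{
	return (rt - baseline_rt) / baseline_rt * 100.0;
}
```

For example, overhead_pct(4286.9, 4537.1) corresponds to the
"no patches" row.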

Kan Liang (4):
  perf: Fix system-wide events miscounting during cgroup monitoring
  perf: Add filter_match() as a parameter for pinned/flexible_sched_in()
  perf cgroup: Add new RB tree keys for cgroup
  perf cgroup: Add fast path for cgroup switch

 include/linux/perf_event.h |   6 +
 kernel/events/core.c       | 427 ++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 394 insertions(+), 39 deletions(-)

-- 
2.7.4

