From mboxrd@z Thu Jan  1 00:00:00 1970
From: Namhyung Kim
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo
Cc: Jiri Olsa, Alexander Shishkin, Mark Rutland, Stephane Eranian,
    LKML, linux-perf-users@vger.kernel.org, Tejun Heo, Li Zefan,
    Johannes Weiner, Adrian Hunter
Subject: [PATCH 1/9] perf/core: Add PERF_RECORD_CGROUP event
Date: Fri, 20 Dec 2019 13:32:45 +0900
Message-Id: <20191220043253.3278951-2-namhyung@kernel.org>
In-Reply-To: <20191220043253.3278951-1-namhyung@kernel.org>
References: <20191220043253.3278951-1-namhyung@kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Mailing-List: linux-kernel@vger.kernel.org

To support cgroup tracking, add CGROUP event to save a link between
cgroup path and inode number.  The attr.cgroup bit was also added to
enable cgroup tracking from userspace.

This event will be generated when a new cgroup becomes active.
Userspace might need to synthesize those events for existing cgroups.

Cc: Tejun Heo
Cc: Li Zefan
Cc: Johannes Weiner
Cc: Adrian Hunter
Signed-off-by: Namhyung Kim
---
 include/uapi/linux/perf_event.h |  14 +++-
 kernel/events/core.c            | 112 ++++++++++++++++++++++++++++++++
 2 files changed, 125 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 377d794d3105..7bae2d3380a6 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -377,7 +377,8 @@ struct perf_event_attr {
				ksymbol        :  1, /* include ksymbol events */
				bpf_event      :  1, /* include bpf events */
				aux_output     :  1, /* generate AUX records instead of events */
-				__reserved_1   : 32;
+				cgroup         :  1, /* include cgroup events */
+				__reserved_1   : 31;

	union {
		__u32		wakeup_events;	  /* wakeup every n events */
@@ -1006,6 +1007,17 @@ enum perf_event_type {
	 */
	PERF_RECORD_BPF_EVENT			= 18,

+	/*
+	 * struct {
+	 *	struct perf_event_header	header;
+	 *	u64				id;
+	 *	u64				path_len;
+	 *	char				path[];
+	 *	struct sample_id		sample_id;
+	 * };
+	 */
+	PERF_RECORD_CGROUP			= 19,
+
	PERF_RECORD_MAX,			/* non-ABI */
 };

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4ff86d57f9e5..9bcb2b552acc 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -386,6 +386,7 @@ static atomic_t nr_freq_events __read_mostly;
 static atomic_t nr_switch_events __read_mostly;
 static atomic_t nr_ksymbol_events __read_mostly;
 static atomic_t nr_bpf_events __read_mostly;
+static atomic_t nr_cgroup_events __read_mostly;

 static LIST_HEAD(pmus);
 static DEFINE_MUTEX(pmus_lock);
@@ -4455,6 +4456,8 @@ static void unaccount_event(struct perf_event *event)
		atomic_dec(&nr_comm_events);
	if (event->attr.namespaces)
		atomic_dec(&nr_namespaces_events);
+	if (event->attr.cgroup)
+		atomic_dec(&nr_cgroup_events);
	if (event->attr.task)
		atomic_dec(&nr_task_events);
	if (event->attr.freq)
@@ -7564,6 +7567,106 @@ void perf_event_namespaces(struct task_struct *task)
			NULL);
 }

+/*
+ * cgroup tracking
+ */
+#ifdef CONFIG_CGROUPS
+
+struct perf_cgroup_event {
+	char				*path;
+	struct {
+		struct perf_event_header	header;
+		u64				id;
+		u64				path_len;
+		char				path[];
+	} event_id;
+};
+
+static int perf_event_cgroup_match(struct perf_event *event)
+{
+	return event->attr.cgroup;
+}
+
+static void perf_event_cgroup_output(struct perf_event *event, void *data)
+{
+	struct perf_cgroup_event *cgroup_event = data;
+	struct perf_output_handle handle;
+	struct perf_sample_data sample;
+	u16 header_size = cgroup_event->event_id.header.size;
+	int ret;
+
+	if (!perf_event_cgroup_match(event))
+		return;
+
+	perf_event_header__init_id(&cgroup_event->event_id.header,
+				   &sample, event);
+	ret = perf_output_begin(&handle, event,
+				cgroup_event->event_id.header.size);
+	if (ret)
+		goto out;
+
+	perf_output_put(&handle, cgroup_event->event_id);
+	__output_copy(&handle, cgroup_event->path,
+		      cgroup_event->event_id.path_len);
+
+	perf_event__output_id_sample(event, &handle, &sample);
+
+	perf_output_end(&handle);
+out:
+	cgroup_event->event_id.header.size = header_size;
+}
+
+void perf_event_cgroup(struct cgroup *cgrp)
+{
+	struct perf_cgroup_event cgroup_event;
+	char path_enomem[16] = "//enomem";
+	char *pathname;
+	size_t size;
+
+	if (!atomic_read(&nr_cgroup_events))
+		return;
+
+	cgroup_event = (struct perf_cgroup_event){
+		.event_id = {
+			.header = {
+				.type = PERF_RECORD_CGROUP,
+				.misc = 0,
+				.size = sizeof(cgroup_event.event_id),
+			},
+			.id = cgroup_id(cgrp),
+		},
+	};
+
+	pathname = kmalloc(PATH_MAX, GFP_KERNEL);
+	if (pathname == NULL) {
+		cgroup_event.path = path_enomem;
+	} else {
+		/* just to be sure to have enough space for alignment */
+		cgroup_path(cgrp, pathname, PATH_MAX - sizeof(u64));
+		cgroup_event.path = pathname;
+	}
+
+	/*
+	 * Since our buffer works in 8 byte units we need to align our string
+	 * size to a multiple of 8. However, we must guarantee the tail end is
+	 * zero'd out to avoid leaking random bits to userspace.
+	 */
+	size = strlen(cgroup_event.path) + 1;
+	while (!IS_ALIGNED(size, sizeof(u64)))
+		cgroup_event.path[size++] = '\0';
+
+	cgroup_event.event_id.header.size += size;
+	cgroup_event.event_id.path_len = size;
+
+	perf_iterate_sb(perf_event_cgroup_output,
+			&cgroup_event,
+			NULL);
+
+	kfree(pathname);
+}
+
+#endif
+
 /*
  * mmap tracking
  */
@@ -10607,6 +10710,8 @@ static void account_event(struct perf_event *event)
		atomic_inc(&nr_comm_events);
	if (event->attr.namespaces)
		atomic_inc(&nr_namespaces_events);
+	if (event->attr.cgroup)
+		atomic_inc(&nr_cgroup_events);
	if (event->attr.task)
		atomic_inc(&nr_task_events);
	if (event->attr.freq)
@@ -12581,6 +12686,12 @@ static void perf_cgroup_css_free(struct cgroup_subsys_state *css)
	kfree(jc);
 }

+static int perf_cgroup_css_online(struct cgroup_subsys_state *css)
+{
+	perf_event_cgroup(css->cgroup);
+	return 0;
+}
+
 static int __perf_cgroup_move(void *info)
 {
	struct task_struct *task = info;
@@ -12602,6 +12713,7 @@ static void perf_cgroup_attach(struct cgroup_taskset *tset)
 struct cgroup_subsys perf_event_cgrp_subsys = {
	.css_alloc	= perf_cgroup_css_alloc,
	.css_free	= perf_cgroup_css_free,
+	.css_online	= perf_cgroup_css_online,
	.attach		= perf_cgroup_attach,
	/*
	 * Implicitly enable on dfl hierarchy so that perf events can
-- 
2.24.1.735.g03f4e72817-goog