From: Alexey Budankov <alexey.budankov@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Andi Kleen <ak@linux.intel.com>, Kan Liang <kan.liang@intel.com>,
Dmitri Prokhorov <Dmitry.Prohorov@intel.com>,
Valery Cherepennikov <valery.cherepennikov@intel.com>,
Mark Rutland <mark.rutland@arm.com>,
Stephane Eranian <eranian@google.com>,
David Carrillo-Cisneros <davidcc@google.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
Vince Weaver <vince@deater.net>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [RFC][PATCH] perf: Rewrite enabled/running timekeeping
Date: Tue, 5 Sep 2017 15:06:26 +0300 [thread overview]
Message-ID: <f66ce7fe-db8d-2456-5510-c99db6cdcb2a@linux.intel.com> (raw)
In-Reply-To: <9c3c1fc7-5a2e-2745-dab1-390dc7d58d50@linux.intel.com>
Hi,
On 05.09.2017 13:17, Alexey Budankov wrote:
> On 04.09.2017 18:41, Peter Zijlstra wrote:
>> On Mon, Sep 04, 2017 at 05:56:06PM +0300, Alexey Budankov wrote:
>>> On 04.09.2017 15:08, Peter Zijlstra wrote:
>>>> On Mon, Sep 04, 2017 at 01:46:45PM +0300, Alexey Budankov wrote:
>>>>>> So the below completely rewrites timekeeping (and probably breaks
>>>>>> world) but does away with the need to touch events that don't get
>>>>>> scheduled.
>>>>>
>>>>> We still need and do iterate thru all events at some points e.g. on context switches.
>>>>
>>>> Why do we _need_ to?
>>>
>>> We do so in the current implementation with several tstamp_* fields.
>>
>> Right, but we want to stop doing so asap :-)
>>
>
> Well, I see you point :). It turns out that with straightforward timekeeping
> we can also avoid whole tree iteration on context switches additionally to RB
> tree based iterations and rotations on hrtimer interrupt. That brings even more
> performance and rotation switch can be avoided.
>
> However we can't completely get rid of whole tree iterations because of
> inheritance code on forks in perf_event_init_context() here:
>
> perf_event_groups_for_each(event, &parent_ctx->pinned_groups, group_node) {
> ret = inherit_task_group(event, parent, parent_ctx,
> child, ctxn, &inherited_all);
> if (ret)
> goto out_unlock;
> }
>
> and here:
>
> perf_event_groups_for_each(event, &parent_ctx->flexible_groups, group_node) {
> ret = inherit_task_group(event, parent, parent_ctx,
> child, ctxn, &inherited_all);
> if (ret)
> goto out_unlock;
> }
>
> Below is the patch set put on top of the timekeeping rework.
>
> It is for tip/master branch.
>
> ---
> include/linux/perf_event.h | 40 +--
> kernel/events/core.c | 839 +++++++++++++++++++++++----------------------
> 2 files changed, 448 insertions(+), 431 deletions(-)
>
Got this under perf_fuzzer on Xeon Phi (KNL):
[ 6614.226280] ------------[ cut here ]------------
[ 6614.226305] WARNING: CPU: 45 PID: 43385 at kernel/events/core.c:239 event_function+0xb3/0xe0
[ 6614.226310] Modules linked in: btrfs xor raid6_pq ufs hfsplus hfs minix vfat msdos fat jfs xfs reiserfs binfmt_misc xt_CHECKSUM iptable_mangle fuse ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables cmac arc4 md4 nls_utf8 nfsv3 cifs rpcsec_gss_krb5 nfsv4 nfs ccm dns_resolver fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm intel_rapl hfi1 sb_edac nfsd x86_pkg_temp_thermal intel_powerclamp rdmavt ipmi_ssif coretemp iTCO_wdt joydev ib_core crct10dif_pclmul iTCO_vendor_support crc32_pclmul ipmi_si ghash_clmulni_intel
[ 6614.226444] auth_rpcgss ipmi_devintf mei_me tpm_tis intel_cstate mei tpm_tis_core intel_uncore pcspkr intel_rapl_perf tpm ipmi_msghandler shpchp nfs_acl lpc_ich lockd i2c_i801 wmi grace acpi_power_meter acpi_pad sunrpc mgag200 drm_kms_helper ttm drm igb crc32c_intel ptp pps_core dca i2c_algo_bit
[ 6614.226501] CPU: 45 PID: 43385 Comm: perf_fuzzer Not tainted 4.13.0-v11.1.2+ #5
[ 6614.226506] Hardware name: Intel Corporation S7200AP/S7200AP, BIOS S72C610.86B.01.01.0190.080520162104 08/05/2016
[ 6614.226511] task: ffff8d0f5866c000 task.stack: ffffa6b05aae8000
[ 6614.226517] RIP: 0010:event_function+0xb3/0xe0
[ 6614.226522] RSP: 0018:ffffa6b05aaebc30 EFLAGS: 00010087
[ 6614.226528] RAX: 0000000000000000 RBX: ffffc6b03c545a90 RCX: ffffa6b05aaebd00
[ 6614.226532] RDX: 0000000000000000 RSI: ffffa6b05aaebca0 RDI: ffffc6b03c545a98
[ 6614.226536] RBP: ffffa6b05aaebc58 R08: 000000000001f8e0 R09: ffff8d0fa99fd120
[ 6614.226540] R10: ffff8d0fbb407900 R11: 0000000000000000 R12: ffffc6b03ba05a90
[ 6614.226544] R13: 0000000000000000 R14: ffffa6b05aaebd48 R15: ffff8d0f9c48a000
[ 6614.226550] FS: 00007f05cff85740(0000) GS:ffff8d0fbc540000(0000) knlGS:0000000000000000
[ 6614.226555] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6614.226559] CR2: 00000000007211c0 CR3: 0000002f269fa000 CR4: 00000000001407e0
[ 6614.226563] Call Trace:
[ 6614.226577] remote_function+0x3b/0x50
[ 6614.226585] generic_exec_single+0x9a/0xd0
[ 6614.226592] smp_call_function_single+0xc8/0x100
[ 6614.226599] cpu_function_call+0x43/0x60
[ 6614.226606] ? cpu_clock_event_read+0x10/0x10
[ 6614.226612] perf_event_read+0xc7/0xe0
[ 6614.226619] ? perf_install_in_context+0xf0/0xf0
[ 6614.226625] __perf_read_group_add+0x25/0x180
[ 6614.226632] perf_read+0xcb/0x2b0
[ 6614.226640] __vfs_read+0x37/0x160
[ 6614.226648] ? security_file_permission+0x9d/0xc0
[ 6614.226655] vfs_read+0x8c/0x130
[ 6614.226661] SyS_read+0x55/0xc0
[ 6614.226670] do_syscall_64+0x67/0x180
[ 6614.226678] entry_SYSCALL64_slow_path+0x25/0x25
[ 6614.226684] RIP: 0033:0x7f05cfaad980
[ 6614.226688] RSP: 002b:00007fff1b562b48 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 6614.226694] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f05cfaad980
[ 6614.226699] RDX: 0000000000004fcf RSI: 0000000000735680 RDI: 0000000000000003
[ 6614.226703] RBP: 00007fff1b562b60 R08: 00007f05cfd800f4 R09: 00007f05cfd80140
[ 6614.226707] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000401980
[ 6614.226711] R13: 00007fff1b564f60 R14: 0000000000000000 R15: 0000000000000000
[ 6614.226716] Code: e2 48 89 de 4c 89 ff 41 ff 56 08 31 c0 4d 85 ed 74 05 41 c6 45 08 00 c6 43 08 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 49 39 dc 74 cf <0f> ff eb cb 0f ff 0f 1f 80 00 00 00 00 e9 78 ff ff ff 0f ff 66
[ 6614.226812] ---[ end trace ff12704813059a28 ]---
static int event_function(void *info)
{
struct event_function_struct *efs = info;
struct perf_event *event = efs->event;
struct perf_event_context *ctx = event->ctx;
struct perf_cpu_context *cpuctx = __get_cpu_context(ctx);
struct perf_event_context *task_ctx = cpuctx->task_ctx;
int ret = 0;
WARN_ON_ONCE(!irqs_disabled());
perf_ctx_lock(cpuctx, task_ctx);
/*
* Since we do the IPI call without holding ctx->lock things can have
* changed, double check we hit the task we set out to hit.
*/
if (ctx->task) {
if (ctx->task != current) {
ret = -ESRCH;
goto unlock;
}
/*
* We only use event_function_call() on established contexts,
* and event_function() is only ever called when active (or
* rather, we'll have bailed in task_function_call() or the
* above ctx->task != current test), therefore we must have
* ctx->is_active here.
*/
WARN_ON_ONCE(!ctx->is_active);
/*
* And since we have ctx->is_active, cpuctx->task_ctx must
* match.
*/
WARN_ON_ONCE(task_ctx != ctx);
} else {
===> WARN_ON_ONCE(&cpuctx->ctx != ctx);
}
efs->func(event, cpuctx, ctx, efs->data);
unlock:
perf_ctx_unlock(cpuctx, task_ctx);
return ret;
}
next prev parent reply other threads:[~2017-09-05 12:06 UTC|newest]
Thread overview: 76+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-02 8:11 [PATCH v6 0/3] perf/core: addressing 4x slowdown during per-process profiling of STREAM benchmark on Intel Xeon Phi Alexey Budankov
2017-08-02 8:13 ` [PATCH v6 1/3] perf/core: use rb trees for pinned/flexible groups Alexey Budankov
2017-08-03 13:00 ` Peter Zijlstra
2017-08-03 20:30 ` Alexey Budankov
2017-08-04 14:36 ` Peter Zijlstra
2017-08-07 7:17 ` Alexey Budankov
2017-08-07 8:39 ` Peter Zijlstra
2017-08-07 9:13 ` Peter Zijlstra
2017-08-07 15:32 ` Alexey Budankov
2017-08-07 15:55 ` Peter Zijlstra
2017-08-07 16:27 ` Alexey Budankov
2017-08-07 16:57 ` Peter Zijlstra
2017-08-07 17:39 ` Andi Kleen
2017-08-07 18:12 ` Peter Zijlstra
2017-08-07 18:13 ` Alexey Budankov
2017-08-15 17:28 ` Alexey Budankov
2017-08-23 13:39 ` Alexander Shishkin
2017-08-23 14:18 ` Alexey Budankov
2017-08-29 13:51 ` Alexander Shishkin
2017-08-30 8:30 ` Alexey Budankov
2017-08-30 10:18 ` Alexander Shishkin
2017-08-30 10:30 ` Alexey Budankov
2017-08-30 11:13 ` Alexander Shishkin
2017-08-30 11:16 ` Alexey Budankov
2017-08-31 10:12 ` Alexey Budankov
2017-08-31 10:12 ` Alexey Budankov
2017-08-04 14:53 ` Peter Zijlstra
2017-08-07 15:22 ` Alexey Budankov
2017-08-02 8:15 ` [PATCH v6 2/3]: perf/core: use context tstamp_data for skipped events on mux interrupt Alexey Budankov
2017-08-03 13:04 ` Peter Zijlstra
2017-08-03 14:00 ` Peter Zijlstra
2017-08-03 15:58 ` Alexey Budankov
2017-08-04 12:36 ` Peter Zijlstra
2017-08-03 15:00 ` Peter Zijlstra
2017-08-03 18:47 ` Alexey Budankov
2017-08-04 12:35 ` Peter Zijlstra
2017-08-04 12:51 ` Peter Zijlstra
2017-08-04 14:25 ` Alexey Budankov
2017-08-04 14:23 ` Alexey Budankov
2017-08-10 15:57 ` Alexey Budankov
2017-08-22 20:47 ` Peter Zijlstra
2017-08-23 8:54 ` Alexey Budankov
2017-08-31 17:18 ` [RFC][PATCH] perf: Rewrite enabled/running timekeeping Peter Zijlstra
2017-08-31 19:51 ` Stephane Eranian
2017-09-05 7:51 ` Stephane Eranian
2017-09-05 9:44 ` Peter Zijlstra
2017-09-01 10:45 ` Alexey Budankov
2017-09-01 12:31 ` Peter Zijlstra
2017-09-01 11:17 ` Alexey Budankov
2017-09-01 12:42 ` Peter Zijlstra
2017-09-01 21:03 ` Vince Weaver
2017-09-04 10:46 ` Alexey Budankov
2017-09-04 12:08 ` Peter Zijlstra
2017-09-04 14:56 ` Alexey Budankov
2017-09-04 15:41 ` Peter Zijlstra
2017-09-04 15:58 ` Peter Zijlstra
2017-09-05 10:17 ` Alexey Budankov
2017-09-05 11:19 ` Peter Zijlstra
2017-09-11 6:55 ` Alexey Budankov
2017-09-05 12:06 ` Alexey Budankov [this message]
2017-09-05 12:59 ` Peter Zijlstra
2017-09-05 16:03 ` Peter Zijlstra
2017-09-06 13:48 ` Alexey Budankov
2017-09-08 8:47 ` Alexey Budankov
2018-03-12 17:43 ` [tip:perf/core] perf/cor: Use RB trees for pinned/flexible groups tip-bot for Alexey Budankov
2017-08-02 8:16 ` [PATCH v6 3/3]: perf/core: add mux switch to skip to the current CPU's events list on mux interrupt Alexey Budankov
2017-08-18 5:17 ` [PATCH v7 0/2] perf/core: addressing 4x slowdown during per-process profiling of STREAM benchmark on Intel Xeon Phi Alexey Budankov
2017-08-18 5:21 ` [PATCH v7 1/2] perf/core: use rb trees for pinned/flexible groups Alexey Budankov
2017-08-23 11:17 ` Alexander Shishkin
2017-08-23 17:23 ` Alexey Budankov
2017-08-18 5:22 ` [PATCH v7 2/2] perf/core: add mux switch to skip to the current CPU's events list on mux interrupt Alexey Budankov
2017-08-23 11:54 ` Alexander Shishkin
2017-08-23 18:12 ` Alexey Budankov
2017-08-22 20:21 ` [PATCH v7 0/2] perf/core: addressing 4x slowdown during per-process profiling of STREAM benchmark on Intel Xeon Phi Peter Zijlstra
2017-08-23 8:54 ` Alexey Budankov
2017-08-31 10:12 ` Alexey Budankov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f66ce7fe-db8d-2456-5510-c99db6cdcb2a@linux.intel.com \
--to=alexey.budankov@linux.intel.com \
--cc=Dmitry.Prohorov@intel.com \
--cc=acme@kernel.org \
--cc=ak@linux.intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=davidcc@google.com \
--cc=eranian@google.com \
--cc=kan.liang@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=valery.cherepennikov@intel.com \
--cc=vince@deater.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.