* [PATCH 1/2] perf: return ENOENT instead of ENOTSUPP @ 2015-06-11 9:59 Hendrik Brueckner 2015-06-11 9:59 ` [PATCH 2/2] perf: correct event accounting imbalance on error path Hendrik Brueckner 2015-06-11 10:25 ` [PATCH 1/2] perf: return ENOENT instead of ENOTSUPP Peter Zijlstra 0 siblings, 2 replies; 5+ messages in thread From: Hendrik Brueckner @ 2015-06-11 9:59 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo Cc: Vince Weaver, Paul Mackerras, Martin Schwidefsky, Heiko Carstens, linux-kernel, linux-s390, Hendrik Brueckner The ENOTSUPP (which actually should be EOPNOTSUPP for user space) does not trigger a fallback event selection, for example, by perf record. If hardware support for the cycles perf event is available, but the hardware does not provide interrupts, returning ENOTSUPP causes perf to end. Returning ENOENT causes the perf tool to fallback to a software-based cycle PMU that supports interrupts. The commit 53b25335dd ("perf: Disable sampled events if no PMU interrupt") introduced that incompatible change. Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> --- kernel/events/core.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index eddf1ed..4c66465 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7964,7 +7964,7 @@ SYSCALL_DEFINE5(perf_event_open, if (is_sampling_event(event)) { if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) { - err = -ENOTSUPP; + err = -ENOENT; goto err_alloc; } } -- 1.7.1 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 2/2] perf: correct event accounting imbalance on error path 2015-06-11 9:59 [PATCH 1/2] perf: return ENOENT instead of ENOTSUPP Hendrik Brueckner @ 2015-06-11 9:59 ` Hendrik Brueckner 2015-06-11 10:25 ` [PATCH 1/2] perf: return ENOENT instead of ENOTSUPP Peter Zijlstra 1 sibling, 0 replies; 5+ messages in thread From: Hendrik Brueckner @ 2015-06-11 9:59 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo Cc: Vince Weaver, Paul Mackerras, Martin Schwidefsky, Heiko Carstens, linux-kernel, linux-s390, Hendrik Brueckner If the perf_event_open() syscall is called for sampling (perf record), but the selected PMU does not support the sampling, the event is freed. This particular free includes a decrement of the perf_sched_events jump label. However, the accounting which actually increase perf_sched_events is not yet called and, therefore, triggers a warning in the jump_label code. On s390, this ends in this warning: [ 16.633195] ------------[ cut here ]------------ [ 16.633196] WARNING: at ../kernel/jump_label.c:82 [ 16.633197] Modules linked in: eadm_sch nfsd auth_rpcgss oid_registry nfs_acl lockd dm_multipath grace dm_mod sunrpc scsi_dh autofs4 [ 16.633204] CPU: 0 PID: 539 Comm: perf Not tainted 3.18.3-20150126.0.fdf02cc.31d6da9.fc20.s390xperformance #1 [ 16.633206] task: 000000000afecec0 ti: 0000000004a94000 task.ti: 0000000004a94000 [ 16.633207] Krnl PSW : 0704e00180000000 000000000024c6c2 (__static_key_slow_dec+0xfa/0x100) [ 16.633214] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 EA:3 Krnl GPRS: 0000000000000086 0000000000e39e14 000000000000001b 0000000000000000 [ 16.633216] 000000000024c6be 0000000000000000 0000000000000000 0000000000000000 [ 16.633218] 0000000000000000 0000000005d13480 fffffffffffffdf4 0000000000ccac18 [ 16.633219] 0000000000000064 0000000000ccabf8 000000000024c6be 0000000004a97d58 [ 16.633227] Krnl Code: 000000000024c6b2: c0200041fd3e larl %r2,a8c12e 000000000024c6b8: c0e5003292c4 brasl %r14,89ec40 #000000000024c6be: a7f40001 brc 15,24c6c0 >000000000024c6c2: a7f4ffb2 brc 15,24c626 000000000024c6c6: 0707 bcr 0,%r7 000000000024c6c8: c0f40000000c brcl 15,24c6e0 000000000024c6ce: c0100055c8ed larl %r1,d058a8 000000000024c6d4: c0e50033065c brasl %r14,8ad38c [ 16.633236] Call Trace: [ 16.633238] ([<000000000024c6be>] __static_key_slow_dec+0xf6/0x100) [ 16.633240] [<000000000024087c>] _free_event+0x15c/0x198 [ 16.633241] [<0000000000246e7a>] SyS_perf_event_open+0x432/0xa70 [ 16.633245] [<00000000008ac5f2>] system_call+0xd6/0x258 [ 16.633246] [<000003ffaaa784e2>] 0x3ffaaa784e2 [ 16.633247] Last Breaking-Event-Address: [ 16.633249] [<000000000024c6be>] __static_key_slow_dec+0xf6/0x100 [ 16.633250] ---[ end trace cd0b0e85985e8baa ]--- To solve this problem, just free the event without taking care of the accounting. Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> --- kernel/events/core.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 4c66465..d9051e0 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7965,7 +7965,8 @@ SYSCALL_DEFINE5(perf_event_open, if (is_sampling_event(event)) { if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) { err = -ENOENT; - goto err_alloc; + __free_event(event); + goto err_cpus; } } -- 1.7.1 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] perf: return ENOENT instead of ENOTSUPP 2015-06-11 9:59 [PATCH 1/2] perf: return ENOENT instead of ENOTSUPP Hendrik Brueckner 2015-06-11 9:59 ` [PATCH 2/2] perf: correct event accounting imbalance on error path Hendrik Brueckner @ 2015-06-11 10:25 ` Peter Zijlstra 2015-06-11 13:02 ` Hendrik Brueckner 1 sibling, 1 reply; 5+ messages in thread From: Peter Zijlstra @ 2015-06-11 10:25 UTC (permalink / raw) To: Hendrik Brueckner Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Vince Weaver, Paul Mackerras, Martin Schwidefsky, Heiko Carstens, linux-kernel, linux-s390 On Thu, 2015-06-11 at 11:59 +0200, Hendrik Brueckner wrote: > The ENOTSUPP (which actually should be EOPNOTSUPP for user space) does not > trigger a fallback event selection, for example, by perf record. > If hardware support for the cycles perf event is available, but the hardware > does not provide interrupts, returning ENOTSUPP causes perf to end. Returning > ENOENT causes the perf tool to fallback to a software-based cycle PMU that > supports interrupts. > > The commit 53b25335dd ("perf: Disable sampled events if no PMU interrupt") > introduced that incompatible change. That's 3.16 > if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) { > - err = -ENOTSUPP; > + err = -ENOENT; > goto err_alloc; > } > } And now you would be changing an API that's been around for at least 4 releases. Also, I really think -ENOENT is the wrong return here, you're asking for things that's not supported, not for something that's not there. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] perf: return ENOENT instead of ENOTSUPP 2015-06-11 10:25 ` [PATCH 1/2] perf: return ENOENT instead of ENOTSUPP Peter Zijlstra @ 2015-06-11 13:02 ` Hendrik Brueckner 2015-06-11 13:28 ` Peter Zijlstra 0 siblings, 1 reply; 5+ messages in thread From: Hendrik Brueckner @ 2015-06-11 13:02 UTC (permalink / raw) To: Peter Zijlstra Cc: Hendrik Brueckner, Ingo Molnar, Arnaldo Carvalho de Melo, Vince Weaver, Paul Mackerras, Martin Schwidefsky, Heiko Carstens, linux-kernel, linux-s390 On Thu, Jun 11, 2015 at 12:25:01PM +0200, Peter Zijlstra wrote: > On Thu, 2015-06-11 at 11:59 +0200, Hendrik Brueckner wrote: > > The ENOTSUPP (which actually should be EOPNOTSUPP for user space) does not > > trigger a fallback event selection, for example, by perf record. > > If hardware support for the cycles perf event is available, but the hardware > > does not provide interrupts, returning ENOTSUPP causes perf to end. Returning > > ENOENT causes the perf tool to fallback to a software-based cycle PMU that > > supports interrupts. > > > > The commit 53b25335dd ("perf: Disable sampled events if no PMU interrupt") > > introduced that incompatible change. > > That's 3.16 Correct... I recently encountered the problem. > > > if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) { > > - err = -ENOTSUPP; > > + err = -ENOENT; > > goto err_alloc; > > } > > } > > And now you would be changing an API that's been around for at least 4 > releases. Well... the behavior before 53b25335dd was differently in this regard. Of course, the API changed 4 releases ago. The question here is rather was this desired or not. In my mind I considered this problem as a regression. > > Also, I really think -ENOENT is the wrong return here, you're asking for > things that's not supported, not for something that's not there. So looks like -ENOTSUPP is the desired API now. So the problem I'd like to solve is that there are two different hardware PMUs that support the "cycles" event. Just one of them supports sampling of cycles, the other not. In the past (prior to 3.16), the perf tool tried several PMUs if -ENOENT was returned. With 3.16, -ENOTSUPP is returned (which actually should be -EOPNOTSUPP but different story) and the perf tool exits. So the question is: what is the desired behavior? A solution towards the "fallback-behavior" would be to change perf_init_event() and consider the sampling/non-sampling criteria (in general pmu->capabilities) when looking for a matching PMU to serve the event? Thanks and kind regards, Hendrik -- Hendrik Brueckner brueckner@linux.vnet.ibm.com | IBM Deutschland Research & Development GmbH Linux on System z Development | Schoenaicher Str. 220, 71032 Boeblingen IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martina Koederitz Geschaeftsfuehrung: Dirk Wittkopp Sitz der Gesellschaft: Boeblingen Registergericht: Amtsgericht Stuttgart, HRB 243294 ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] perf: return ENOENT instead of ENOTSUPP 2015-06-11 13:02 ` Hendrik Brueckner @ 2015-06-11 13:28 ` Peter Zijlstra 0 siblings, 0 replies; 5+ messages in thread From: Peter Zijlstra @ 2015-06-11 13:28 UTC (permalink / raw) To: Ingo Molnar, Arnaldo Carvalho de Melo, Vince Weaver, Paul Mackerras, Martin Schwidefsky, Heiko Carstens, linux-kernel, linux-s390 On Thu, Jun 11, 2015 at 03:02:55PM +0200, Hendrik Brueckner wrote: > On Thu, Jun 11, 2015 at 12:25:01PM +0200, Peter Zijlstra wrote: > > On Thu, 2015-06-11 at 11:59 +0200, Hendrik Brueckner wrote: > > > The ENOTSUPP (which actually should be EOPNOTSUPP for user space) does not > > > trigger a fallback event selection, for example, by perf record. > > > If hardware support for the cycles perf event is available, but the hardware > > > does not provide interrupts, returning ENOTSUPP causes perf to end. Returning > > > ENOENT causes the perf tool to fallback to a software-based cycle PMU that > > > supports interrupts. > > > > > > The commit 53b25335dd ("perf: Disable sampled events if no PMU interrupt") > > > introduced that incompatible change. > > > > That's 3.16 > > Correct... I recently encountered the problem. > > > > > > if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) { > > > - err = -ENOTSUPP; > > > + err = -ENOENT; > > > goto err_alloc; > > > } > > > } > > > > And now you would be changing an API that's been around for at least 4 > > releases. > > Well... the behavior before 53b25335dd was differently in this regard. Of > course, the API changed 4 releases ago. The question here is rather was > this desired or not. In my mind I considered this problem as a regression. Ah, I see what you mean: 97b1198fece0 ("s390, perf: Use common PMU interrupt disabled code") > > Also, I really think -ENOENT is the wrong return here, you're asking for > > things that's not supported, not for something that's not there. > > So looks like -ENOTSUPP is the desired API now. So the problem I'd like > to solve is that there are two different hardware PMUs that support the > "cycles" event. Just one of them supports sampling of cycles, the other not. perf_event_attr::type will uniquely identify the pmu. if (event->attr.type != pmu->type) return -ENOENT; /* event is for this pmu, any fail hereafter should be fatal */ if (is_sampling_event(event)) return -EOPNOTSUPP; > In the past (prior to 3.16), the perf tool tried several PMUs if -ENOENT > was returned. With 3.16, -ENOTSUPP is returned (which actually should be > -EOPNOTSUPP but different story) and the perf tool exits. So cpumf_pmu_event_init() should not have returned -ENOENT to start with. It should have first ascertained that this event was indeed for that pmu, if not, -ENOENT would indeed be the correct return. However once it finds its an event for this pmu, which requests sampling, which this pmu cannot deliver, it should have returned a fatal error (!-ENOENT). -EOPNOTSUPP would have been a good one there. > So the question is: what is the desired behavior? I think desired would be EOPNOTSUPP, but it would mean yet another change to the API. Then again, seeing how this isn't actually working no, that might not be too bad. ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2015-06-11 13:28 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-06-11 9:59 [PATCH 1/2] perf: return ENOENT instead of ENOTSUPP Hendrik Brueckner 2015-06-11 9:59 ` [PATCH 2/2] perf: correct event accounting imbalance on error path Hendrik Brueckner 2015-06-11 10:25 ` [PATCH 1/2] perf: return ENOENT instead of ENOTSUPP Peter Zijlstra 2015-06-11 13:02 ` Hendrik Brueckner 2015-06-11 13:28 ` Peter Zijlstra
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).