Message-ID: <1433508406.1495.11.camel@twins>
Subject: Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
From: Peter Zijlstra
To: Thomas Gleixner
Cc: Mathieu Desnoyers, linux-kernel@vger.kernel.org, Ingo Molnar, Steven Rostedt, Francis Giraldeau
Date: Fri, 05 Jun 2015 14:46:46 +0200
References: <1433504509-17013-1-git-send-email-mathieu.desnoyers@efficios.com> <20150605120909.GG19282@twins.programming.kicks-ass.net>

On Fri, 2015-06-05 at 14:32 +0200, Thomas Gleixner wrote:
> On Fri, 5 Jun 2015, Peter Zijlstra wrote:
> > On Fri, Jun 05, 2015 at 01:41:49PM +0200, Mathieu Desnoyers wrote:
> > > Commit 317f394160e9 "sched: Move the second half of ttwu() to the remote cpu"
> > > moves ttwu_do_wakeup() to an IPI handler context on the remote CPU for
> > > remote wakeups. This commit appeared upstream in Linux v3.0.
> > >
> > > Unfortunately, ttwu_do_wakeup() happens to contain the "sched_wakeup"
> > > tracepoint. Analyzing wakeup latencies depends on getting the wakeup
> > > chain right: which process is the waker, which is the wakee. Moving this
> > > instrumentation outside of the waker context prevents trace analysis tools
> > > from getting the waker pid, either through "current" in the tracepoint
> > > probe, or by deducing it using other scheduler events based on the CPU
> > > executing the tracepoint.
> > >
> > > Another side-effect of moving this instrumentation to the scheduler IPI
> > > is that the delay during which the wakeup is sitting in the pending
> > > queue is not accounted for when calculating wakeup latency.
> > >
> > > Therefore, move the sched_wakeup instrumentation back to the waker
> > > context to fix those two shortcomings.
> >
> > What do you consider wakeup-latency? I don't see how moving the
> > tracepoint into the caller will magically account the queue time.
>
> Well, the point of wakeup is when the wakee calls wakeup. If the trace
> point is in the IPI then you account the time between the wakeup and
> the actual handling in the IPI to the wakee instead of accounting it
> to the time between wakeup and sched switch.

My point exactly: wake->schedule is what we call the scheduling latency,
not the wake latency, which would be from 'event' to the task being
runnable.
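
The two intervals being argued about can be put into numbers with a minimal
sketch. This is not kernel code and not taken from the patch: the
WakeupTimeline type, its field names and the timestamps are all made up for
illustration, and the mapping of terms follows one reading of the thread
(wake latency = 'event' to runnable, scheduling latency = runnable to
scheduled in).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WakeupTimeline:
    """Hypothetical timestamps (microseconds) for one remote wakeup."""
    event_us: int      # waker issues the wakeup (the 'event')
    runnable_us: int   # wakee becomes runnable (IPI handler on the remote CPU)
    switch_in_us: int  # sched_switch to the wakee


def wake_latency(t: WakeupTimeline) -> int:
    """Interval from the 'event' to the task being runnable."""
    return t.runnable_us - t.event_us


def scheduling_latency(t: WakeupTimeline) -> int:
    """Interval from the task being runnable to it being scheduled in."""
    return t.switch_in_us - t.runnable_us


def tracepoint_to_switch(t: WakeupTimeline, tracepoint_in_waker: bool) -> int:
    """Interval a tool would measure from the wakeup tracepoint to sched_switch.

    Assumption: a tracepoint in waker context fires at event_us, while one
    in the IPI handler fires at runnable_us.
    """
    start = t.event_us if tracepoint_in_waker else t.runnable_us
    return t.switch_in_us - start


# Made-up numbers: 30 us spent in the pending queue, 70 us waiting to run.
example = WakeupTimeline(event_us=0, runnable_us=30, switch_in_us=100)
print("wake latency:", wake_latency(example))
print("scheduling latency:", scheduling_latency(example))
print("tracepoint in IPI -> switch:", tracepoint_to_switch(example, False))
print("tracepoint in waker -> switch:", tracepoint_to_switch(example, True))
```

With the tracepoint in the IPI handler, the measured interval equals the
scheduling latency alone; with it in the waker, the measured interval is the
sum of both, so the queue time is included but not separated out.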