From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754328AbbCXQTO (ORCPT <rfc822;w@1wt.eu>);
	Tue, 24 Mar 2015 12:19:14 -0400
Received: from aserp1040.oracle.com ([141.146.126.69]:26170 "EHLO
	aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753719AbbCXQTI (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Tue, 24 Mar 2015 12:19:08 -0400
Message-ID: <55118E5A.20803@oracle.com>
Date: Tue, 24 Mar 2015 10:18:34 -0600
From: David Ahern <david.ahern@oracle.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.5.0
MIME-Version: 1.0
To: Ingo Molnar <mingo@kernel.org>
CC: acme@kernel.org, linux-kernel@vger.kernel.org,
        Frederic Weisbecker <fweisbec@gmail.com>,
        Peter Zijlstra <peterz@infradead.org>, Jiri Olsa <jolsa@kernel.org>,
        Namhyung Kim <namhyung@kernel.org>,
        Stephane Eranian <eranian@google.com>,
        Adrian Hunter <adrian.hunter@intel.com>
Subject: Re: [PATCH] perf record: Allow poll timeout to be specified
References: <1427213388-127148-1-git-send-email-david.ahern@oracle.com> <20150324161210.GA8661@gmail.com>
In-Reply-To: <20150324161210.GA8661@gmail.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-Source-IP: aserv0021.oracle.com [141.146.126.233]
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 3/24/15 10:12 AM, Ingo Molnar wrote:
>
> * David Ahern <david.ahern@oracle.com> wrote:
>
>> Record currently wakes up based on watermarks to read events from
>> the mmaps and write them out to the file. The result is a file that
>> can have large blocks of events per mmap before a finished round
>> event is added to the stream.  This in turn affects the quantity of
>> events that have to be passed through the ordered events queue
>> before results can be displayed to the user. For commands like
>> perf-script this can lead to unnecessarily long delays before a
>> user gets output. Large systems (e.g., 1024 cpus) further compound
>> this effect. I have seen instances where I have to wait 45 minutes
>> for perf-script to process a 5GB file before any events are shown.
>>
>> This patch adds an option to perf-record to allow a user to specify
>> the poll timeout in msec. For example using 100 msec timeouts
>> similar to perf-top means the mmaps are traversed much more
>> frequently leading to a smoother analysis side.
>
> Please tune the default value (perhaps influenced by N_PROC?) so that
> users will get sane behavior without having to specify this option!

I knew you were going to say that! ;-)

It's really a function of the rate of incoming events, not the number
of CPUs. The CPU count just compounds the problem.

I thought about making perf-record use a 100 msec timeout like perf-top,
but that can lead to unnecessary FINISHED_ROUND events in the file and
unnecessary noise/overhead on the record side. On the other hand, with
scheduler tracepoints, kvm tracepoints, etc., events can flood in to
the point that even 100 msec is too long.
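
To put rough numbers on that trade-off, here is a toy model. It is an
illustration only, not perf's actual code: it assumes a reader driven
purely by the poll timeout, ignores watermark wakeups, and assumes one
FINISHED_ROUND per wakeup. The function name and parameters are made up
for this sketch.

```python
def round_stats(event_rate_hz, timeout_ms, duration_s=1.0):
    """Toy model of a timeout-driven reader (hypothetical, not perf code).

    Returns (rounds, events_per_round), where:
      rounds           = number of timeout wakeups in duration_s, each
                         assumed to emit one FINISHED_ROUND
      events_per_round = average number of events that pile up between
                         two consecutive wakeups
    """
    rounds = int(duration_s * 1000 // timeout_ms)
    events_per_round = event_rate_hz * timeout_ms / 1000.0
    return rounds, events_per_round


if __name__ == "__main__":
    # Compare a quiet workload with a tracepoint flood at two timeouts.
    for rate in (10, 1_000_000):
        for timeout in (100, 10):
            rounds, per_round = round_stats(rate, timeout)
            print(f"rate={rate}/s timeout={timeout}ms "
                  f"rounds={rounds} events/round={per_round:.0f}")
```

Under these assumptions, 10 events/sec with a 100 msec timeout gives 10
rounds per second holding one event each, which is mostly overhead. At
1M events/sec the same timeout leaves 100,000 events per round to pass
through the ordered events queue, so the event rate, not the CPU count,
decides what a sensible timeout is.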