From: Joe Mario
Date: Wed, 25 Mar 2015 08:22:34 -0400
To: David Ahern, Don Zickus
Cc: acme@kernel.org, linux-kernel@vger.kernel.org, Jiri Olsa
Subject: Re: [PATCH] perf tool: Fix ppid for synthesized fork events
Message-ID: <5512A88A.1010608@redhat.com>
In-Reply-To: <5511D349.7070508@gmail.com>
References: <1426786875-18025-1-git-send-email-dsahern@gmail.com>
 <20150319205648.GC199787@redhat.com>
 <550B3A66.2030902@gmail.com>
 <20150324201020.GH199787@redhat.com>
 <5511D349.7070508@gmail.com>

On 03/24/2015 05:12 PM, David Ahern wrote:
> On 3/24/15 2:10 PM, Don Zickus wrote:
>> He does this with and without the patch. The difference is usually over
>> 50% extra time with the patch for both the record and report timings. :-(
>
> I find that shocking. The patch only populates ppid and ptid with a value
> read from the file that is already opened and processed. Most of the patch
> is just plumbing the value from the low-level function that processes the
> status file back to synthesize_fork.
>
> What benchmark is this? Is it something I can download and run? If so,
> details? Site, command, args?
>
> Thanks,
> David

Hi David:

We ran:

   time perf mem record -a -e cpu/mem-loads,ldlat=50/pp -e cpu/mem-stores/pp sleep 10

on a system that was running SPECjbb2013 in the background. There were
about 10,000 Java threads, with roughly 500 to 800 in a runnable state at
any given time. We ran it on a 4-socket x86 IVB server.

We had two perf binaries, one with your patch and one without it. Because
the benchmark's load is not constant, we ran the above perf command in a
loop, alternating between the patched and unpatched versions. The elapsed
wall-clock time (the "real" field from time) for the patched perf was
typically at least 50% longer than for the equivalent unpatched perf.

We can't give out a copy of the SPEC benchmark (that's part of the SPEC
agreement), but try to find an application (or create your own) that loads
the system with lots of busy threads.

Joe
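
P.S. In case it helps: below is a minimal, untested sketch of the kind of
load generator I mean. It just spins up a lot of busy pthreads; the thread
count and runtime are arbitrary placeholders, and it's only a rough stand-in
for the SPECjbb thread load, not a reproduction of it:

   /* load.c: spawn NTHREADS busy-looping threads so that
    * "perf record -a" has lots of runnable threads whose fork
    * events it must synthesize from /proc at startup. */
   #include <pthread.h>
   #include <stdio.h>
   #include <stdlib.h>
   #include <unistd.h>

   #define NTHREADS 1000           /* arbitrary placeholder; scale to taste */

   static volatile unsigned long sink;

   static void *spin(void *arg)
   {
           (void)arg;
           for (;;)
                   sink++;         /* burn CPU forever */
           return NULL;
   }

   int main(void)
   {
           pthread_t tid;
           int i;

           for (i = 0; i < NTHREADS; i++) {
                   if (pthread_create(&tid, NULL, spin, NULL) != 0) {
                           perror("pthread_create");
                           exit(1);
                   }
           }
           sleep(600);             /* keep the load up while profiling */
           return 0;
   }

Build with "gcc -O2 -pthread load.c -o load", start it, then run the patched
and unpatched perf commands against it as described above.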