From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753207AbaJXIMs (ORCPT ); Fri, 24 Oct 2014 04:12:48 -0400
Received: from mga01.intel.com ([192.55.52.88]:57194 "EHLO mga01.intel.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1750720AbaJXIMi (ORCPT ); Fri, 24 Oct 2014 04:12:38 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.97,862,1389772800"; d="scan'208";a="405278584"
Message-ID: <544A099C.50104@intel.com>
Date: Fri, 24 Oct 2014 11:11:08 +0300
From: Adrian Hunter
Organization: Intel Finland Oy, Registered Address: PL 281, 00181 Helsinki,
        Business Identity Code: 0357606 - 4, Domiciled in Helsinki
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.1.2
MIME-Version: 1.0
To: Namhyung Kim
CC: Arnaldo Carvalho de Melo , Peter Zijlstra ,
        linux-kernel@vger.kernel.org, David Ahern , Frederic Weisbecker ,
        Jiri Olsa , Paul Mackerras , Stephane Eranian
Subject: Re: [PATCH 05/16] perf tools: Add facility to export data in database-friendly way
References: <1414061124-26830-1-git-send-email-adrian.hunter@intel.com>
        <1414061124-26830-6-git-send-email-adrian.hunter@intel.com>
        <87lho6c8ec.fsf@sejong.aot.lge.com>
In-Reply-To: <87lho6c8ec.fsf@sejong.aot.lge.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 24/10/14 09:02, Namhyung Kim wrote:
> On Thu, 23 Oct 2014 13:45:13 +0300, Adrian Hunter wrote:
>> This patch introduces an abstraction for exporting sample
>> data in a database-friendly way. The abstraction does not
>> implement the actual output. A subsequent patch takes this
>> facility into use for extending the script interface.
>>
>> The abstraction is needed because static data like symbols,
>> dsos, comms etc need to be exported only once. That means
>> allocating them a unique identifier and recording it on each
>> structure. The member 'db_id' is used for that. 'db_id'
>> is just a 64-bit sequence number.
>
> Can we do it somewhere in a script not in the core code? I don't feel
> comfortable to add those bits into the core code. What if we export

Please explain what you mean by "comfortable".

> meta events like task, mmap or some user events to the script optionally
> and let it process the data?

Intel PT decoding can generate a lot of data. Many millions of samples
at least. Each sample contains a lot of duplicated information like
symbol names, dso names, comms, etc. So it is not optimal to export that
for every sample. Even then, each piece of supporting information must
be looked up - was it the same as the last sample, no then look it up,
was it found, no then add a record. That is very very inefficient.

Exporting each piece of information once, instead of over and over again,
is a reasonable thing to do.

>
> Thanks,
> Namhyung
>
>
>>
>> Exporting centres around the db_export__sample() function
>> which exports the associated data structures if they have
>> not yet been allocated a db_id.
>
>
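
To make the export-once idea above concrete, here is a minimal sketch in C.
It is not the code from the patch: the names export_item, exporter and
export_once() are hypothetical, and the only details taken from the thread
are that 'db_id' is a 64-bit sequence number and that a structure is
exported only if it has not yet been allocated one. The sketch also assumes
that a db_id of 0 means "not exported yet".

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a perf structure such as a dso, comm or symbol. */
struct export_item {
	const char *name;
	uint64_t db_id;		/* assumption: 0 means "not exported yet" */
};

/* Hypothetical exporter state: hands out sequence numbers and counts records. */
struct exporter {
	uint64_t last_id;
	unsigned int records_written;
};

/*
 * Export an item the first time it is seen and return its db_id.
 * Later calls for the same item only return the stored id, so no
 * lookup by name and no duplicate record is needed per sample.
 */
uint64_t export_once(struct exporter *ex, struct export_item *item)
{
	assert(ex != NULL && item != NULL);

	if (item->db_id)
		return item->db_id;

	item->db_id = ++ex->last_id;
	ex->records_written++;
	printf("export %s id=%llu\n", item->name,
	       (unsigned long long)item->db_id);
	return item->db_id;
}
```

With this shape, a sample record would carry only the ids returned by
export_once() for its dso, comm and so on, and the cost per sample is one
comparison against zero rather than a search for a matching record.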