From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752123AbbAGHA6 (ORCPT ); Wed, 7 Jan 2015 02:00:58 -0500
Received: from LGEMRELSE6Q.lge.com ([156.147.1.121]:36386 "EHLO
        lgemrelse6q.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1750989AbbAGHA5 (ORCPT ); Wed, 7 Jan 2015 02:00:57 -0500
X-Original-SENDERIP: 10.177.220.203
X-Original-MAILFROM: namhyung@kernel.org
Date: Wed, 7 Jan 2015 15:58:49 +0900
From: Namhyung Kim
To: Andi Kleen
Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, Jiri Olsa,
        LKML, David Ahern, Stephane Eranian, Adrian Hunter,
        Frederic Weisbecker
Subject: Re: [RFC/PATCHSET 00/37] perf tools: Speed-up perf report by using multi thread (v1)
Message-ID: <20150107065849.GB849@sejong>
References: <1419405333-27952-1-git-send-email-namhyung@kernel.org>
        <20150105184811.GQ2915@two.firstfloor.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20150105184811.GQ2915@two.firstfloor.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Andi,

On Mon, Jan 05, 2015 at 07:48:11PM +0100, Andi Kleen wrote:
>
> Thanks for working on this. Haven't read any code, just
> some high level comments on the design.

Really appreciate it!

>
> > So my approach is like this:
> >
> > Partially do stage 1 first - but only for meta events that changes
> > machine state. To do this I add a dummy tracking event to perf record
> > and make it collect such meta events only. They are saved in a
> > separate file (perf.header) and processed before sample events at perf
> > report time.
>
> Can't you just use seek to put the offset into the perf.data header
> like it's already done for other sections? Managing another file would be
> a big change for users and especially is a problem if the data
> is moved between different systems.

The files are located in a directory and users only deal with the
directory, so I don't think it's a big problem. In addition, moving
data between different systems requires archiving the related
debuginfo, and I think we can extend perf-archive to put that
debuginfo in the data directory so that it can find the symbols more
easily.

>
> Also I thought Adrian's meta data index already addressed this
> at least partially.

I know Adrian's work might have some common parts, but I haven't looked
at it deeply, sorry! It'd be great if we could discuss how to
coordinate the future direction.

>
> >
> > This also requires to handle multiple files and to find a
> > corresponding machine state when processing samples. On a large
> > profiling session, many tasks were created and exited so pid might be
> > recycled (even more than once!). To deal with it, I managed to have
> > thread, map_groups and comm in time sorted. The only remaining thing
> > is symbol loading as it's done lazily when sample requires it.
>
> FWIW there's often a lot of unnecessary information in this
> (e.g. mmaps that are not used). The Quipper page
> claims large saving in data files by avoided redundancies.
>
> It would be probably better if perf record avoided writing redundant
> information better (I realize that's not easy)

Right, many mmap events won't be used, but we cannot predict which
ones will be.

>
> >
> > With that being done, the stage 2 can be done by multiple threads. I
> > also save each sample data (per-cpu or per-thread) in separate files
> > during record. On perf report time, each file will be processed by
> > each thread. And symbol loading is protected by a mutex lock.
>
> I really don't like the multiple files. See above. Also it could easily
> cause additional seeking on spinning disks.

Right, I admit that my results were measured on an SSD.

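For readers following the pid-recycling point quoted above, here is a minimal sketch of what keeping per-pid task entries "time sorted" could look like. It is not the patchset's code: the `ThreadTable` class, its `add`/`find` methods, and the integer timestamps are all hypothetical names chosen for illustration. The idea is only that a sample is matched to the latest task that started at or before the sample's timestamp.

```python
import bisect
from collections import defaultdict


class ThreadTable:
    """Hypothetical table mapping (pid, time) to the task that owned the pid."""

    def __init__(self):
        # pid -> parallel lists of start times and comm names, sorted by time
        self._starts = defaultdict(list)
        self._names = defaultdict(list)

    def add(self, pid, start_time, comm):
        # Insert in time order so lookups can use binary search.
        idx = bisect.bisect_right(self._starts[pid], start_time)
        self._starts[pid].insert(idx, start_time)
        self._names[pid].insert(idx, comm)

    def find(self, pid, sample_time):
        # Pick the latest task that started at or before the sample time.
        starts = self._starts.get(pid, [])
        idx = bisect.bisect_right(starts, sample_time) - 1
        if idx < 0:
            return None
        return self._names[pid][idx]


table = ThreadTable()
table.add(1234, 100, "make")
table.add(1234, 500, "gcc")  # pid 1234 recycled for a new task
table.add(1234, 300, "sh")   # entries may be added out of order

print(table.find(1234, 150))
print(table.find(1234, 350))
print(table.find(1234, 900))
print(table.find(1234, 50))
```

With entries sorted per pid, a lookup costs one binary search, so samples from different files can resolve their owning task independently of the order in which the files are processed.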
>
> Isn't it fast enough to have a single thread that pre scans
> the events (perhaps with some single-thread optimizations
> like vectorization), and then load balances the work to
> a thread pool?

I don't quite follow. Could you please elaborate?

>
> BTW I suspect if you used cilk plus or a similar library that
> would make the code much simpler.

I'm not sure how much simpler the code would get with such a library.
I think most of the changes in this patchset are preparation for
concurrent access in libperf, and they would still be needed even if
such a library were used.

Thanks,
Namhyung

>
> > Here is the result:
> >
> > This is just elapsed (real) time measured by shell 'time' function.
> >
> > The data file was recorded during kernel build with fp callchain and
> > size is 2.1GB. The machine has 6 core with hyper-threading enabled
> > and I got a similar result on my laptop too.
> >
> >  time perf report  --children  --no-children  + --call-graph none
> >                    ----------  -------------  -------------------
> >  current            4m43.260s      1m32.779s            0m35.866s
> >  patched            4m43.710s      1m29.695s            0m33.995s
> >  --multi-thread     2m46.265s      0m45.486s             0m7.570s
> >
> >
> > This result is with 7.7GB data file using libunwind for callchain.
>
> Nice results!
>
> -Andi
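As an illustration of the stage 2 scheme discussed in this thread (one worker per sample file, with lazy symbol loading serialized by a mutex), the following is a rough sketch only. It does not use perf or libperf; `load_symbols`, `process_file`, `report`, the fake symbol tables, and the in-memory "files" are all assumptions made up for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

symbol_cache = {}
symbol_lock = threading.Lock()
load_count = 0  # counts how many times a symbol table is actually built


def load_symbols(dso):
    """Lazily build a (fake) symbol table for a DSO, once, under a lock."""
    global load_count
    with symbol_lock:
        if dso not in symbol_cache:
            load_count += 1
            # Stand-in for parsing a real symbol table.
            symbol_cache[dso] = {0x1000: dso + ":main", 0x2000: dso + ":helper"}
        return symbol_cache[dso]


def process_file(samples):
    """Build a private histogram for one per-cpu (or per-thread) sample file."""
    hist = {}
    for dso, addr in samples:
        name = load_symbols(dso).get(addr, "[unknown]")
        hist[name] = hist.get(name, 0) + 1
    return hist


def report(files):
    # One worker per file; the main thread merges the private histograms.
    total = {}
    with ThreadPoolExecutor(max_workers=len(files)) as pool:
        for hist in pool.map(process_file, files):
            for name, count in hist.items():
                total[name] = total.get(name, 0) + count
    return total


# Each inner list stands in for one sample file of (dso, address) pairs.
files = [
    [("vmlinux", 0x1000), ("gcc", 0x2000)],
    [("vmlinux", 0x1000), ("vmlinux", 0x3000)],
    [("gcc", 0x2000), ("gcc", 0x1000)],
]
result = report(files)
print(result)
```

Each worker only touches its own histogram, so the symbol cache is the single shared structure that needs the lock; merging happens in one thread after the workers return.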