From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S262138AbUHNNpw (ORCPT ); Sat, 14 Aug 2004 09:45:52 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S262279AbUHNNpw (ORCPT ); Sat, 14 Aug 2004 09:45:52 -0400 Received: from [195.23.16.24] ([195.23.16.24]:54428 "EHLO bipbip.comserver-pie.com") by vger.kernel.org with ESMTP id S262138AbUHNNpt (ORCPT ); Sat, 14 Aug 2004 09:45:49 -0400 Message-ID: <411E1783.1030103@grupopie.com> Date: Sat, 14 Aug 2004 14:45:39 +0100 From: Paulo Marques Organization: Grupo PIE User-Agent: Mozilla Thunderbird 0.7.1 (X11/20040626) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Andi Kleen Cc: Ingo Molnar , linux-kernel@vger.kernel.org Subject: Re: [patch] Latency Tracer, voluntary-preempt-2.6.8-rc4-O6 References: <2rfT9-5wi-17@gated-at.bofh.it> <2rF1c-6Iy-7@gated-at.bofh.it> <2sxEs-46P-1@gated-at.bofh.it> <2sCkH-7i5-15@gated-at.bofh.it> <2sHu9-2EW-31@gated-at.bofh.it> <20040813121502.GA18860@elte.hu> <20040813121800.GA68967@muc.de> <20040813135109.GA20638@elte.hu> <411D9A2A.1000202@grupopie.com> <20040814124145.GA96097@muc.de> In-Reply-To: <20040814124145.GA96097@muc.de> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-AntiVirus: checked by Vexira MailArmor (version: 2.0.1.16; VAE: 6.27.0.4; VDF: 6.27.0.10; host: bipbip) Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Andi Kleen wrote: >>The final algorithm pre-calculates markers on the compressed symbols so >>that the search time is almost divided by the number of markers. > > > You could do that at compile time (in scripts/kallsyms.c) You're right! I'll see if I can come up with a patch. If this is done in scripts/kallsyms.c, there is less code needed in the actual kernel, and there is no time penalty on the first lookup. This is a true win-win scenario :) Thanks a lot for your suggestion. >>There are still a few issues with this approach. The biggest issue is >>that this is clearly a speed/space trade-off, and maybe we don't want to >>waste the space on a code path that is not supposed to be "hot". If this >>is the case, I can make a smaller patch, that fixes just the name >>"decompression" strcpy's. > > > I'm surprised that using 8 markers helps anything. There should > be many many more 0 stems than that in a not so big kernel. > Did you actually measure the hit rate of the cache? I bet it is pretty low. Well, I only left in the resulting source code the pieces that I actually measured to be a significant improvment. The thing is, this is not exactly a cache. The problem that the markers address is that the decompression must be done sequentially. If I have 20000 symbols and already figured out from the binary search that I need symbol 11201 (for instance), I would have to go through 11201 symbols to get to the symbol I need. The algorithm I wrote keeps markers in the middle of the stream, so that when I try to fetch symbol 11201, I already know where symbol 10000 is because of the marker and only have to go through 1201 symbols. With N markers you get N times the speed on the name search. (actually this is not exactly right for high values of N, because you can only use a marker that is a 0 stem, but for N<~32 this is a good approximation) -- Paulo Marques - www.grupopie.com "In a world without walls and fences who needs windows and gates?"