From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1750794AbWJBIOt (ORCPT ); Mon, 2 Oct 2006 04:14:49 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id S1750781AbWJBIOt (ORCPT ); Mon, 2 Oct 2006 04:14:49 -0400
Received: from amsfep17-int.chello.nl ([213.46.243.15]:42082 "EHLO
        amsfep13-int.chello.nl") by vger.kernel.org with ESMTP
        id S1750794AbWJBIOs (ORCPT ); Mon, 2 Oct 2006 04:14:48 -0400
Subject: Re: [RFC][PATCH 0/2] Swap token re-tuned
From: Peter Zijlstra
To: Andrew Morton
Cc: ashwin.chaugule@celunite.com, linux-kernel@vger.kernel.org,
        Rik van Riel
In-Reply-To: <20061002005905.a97a7b90.akpm@osdl.org>
References: <1159555312.2141.13.camel@localhost.localdomain>
        <20061001155608.0a464d4c.akpm@osdl.org>
        <1159774552.13651.80.camel@lappy>
        <20061002005905.a97a7b90.akpm@osdl.org>
Content-Type: text/plain
Date: Mon, 02 Oct 2006 10:14:33 +0200
Message-Id: <1159776873.13651.89.camel@lappy>
Mime-Version: 1.0
X-Mailer: Evolution 2.6.1
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 2006-10-02 at 00:59 -0700, Andrew Morton wrote:
> On Mon, 02 Oct 2006 09:35:52 +0200
> Peter Zijlstra wrote:
>
> > On Sun, 2006-10-01 at 15:56 -0700, Andrew Morton wrote:
> > > On Sat, 30 Sep 2006 00:11:51 +0530
> > > Ashwin Chaugule wrote:
> >
> > > > PATCH 2:
> > > >
> > > > Instead of using TIMEOUT as a parameter to transfer the token, I think a
> > > > better solution is to hand it over to a process that proves its
> > > > eligibility.
> > > >
> > > > What my scheme does, is to find out how frequently a process is calling
> > > > these functions. The processes that call these more frequently get a
> > > > higher priority.
> > > > The idea is to guarantee that a high priority process gets the token.
> > > > The priority of a process is determined by the number of consecutive
> > > > calls to swap-in and no-page.
> > > > I mean "consecutive" not from the
> > > > scheduler point of view, but from the process point of view. In other
> > > > words, if the task called these functions every time it was scheduled,
> > > > it means it is not getting any further with its execution.
> > > >
> > > > This way, it's a matter of simple comparison of task priorities, to
> > > > decide whether to transfer the token or not.
> > >
> > > Does this introduce the possibility of starvation?  Where the
> > > fast-allocating process hogs the system and everything else makes no
> > > progress?
> >
> > I tinkered with this a bit yesterday, and didn't get good results for:
> >  mem=64M ; make -j5
> >
> > -vanilla:		2h32:55

        Command being timed: "make -j5"
        User time (seconds): 2726.81
        System time (seconds): 2266.85
        Percent of CPU this job got: 54%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 2:32:55
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 269956
        Minor (reclaiming a frame) page faults: 8699298
        Voluntary context switches: 414020
        Involuntary context switches: 242365
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

> > -swap-token:		2h41:48

        Command being timed: "make -j5"
        User time (seconds): 2720.54
        System time (seconds): 2428.60
        Percent of CPU this job got: 53%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 2:41:48
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 281943
        Minor (reclaiming a frame) page faults: 8692417
        Voluntary context switches: 421770
        Involuntary context switches: 241323
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

> > various other attempts at tweaking the code only made it worse. (will
> > have to rerun these tests, but a ~3h test is well, a 3h test ;-)
>
> I don't think that's a region of operation where we care a great deal.
> What was the average CPU utilisation?  Only a few percent.

~50%, it's a slow box this, a p3-550.

> It's just thrashing too much to bother optimising for.  Obviously we want
> it to terminate in a sane period of time and we'd _like_ to improve it.
> But I think we'd accept a 10% slowdown in this region of operation if it
> gave us a 10% speedup in the 25%-utilisation region.
>
> IOW: does the patch help mem=96M;make -j5??

Will kick off some tests later today.

> > Being frustrated with these results - I mean the idea made sense, so
> > what is going on - I came up with this answer:
> >
> > Tasks owning the swap token will retain their pages and will hence swap
> > less, other (contending) tasks will get fewer pages and will fault more
> > frequently. This prio mechanism will favour exactly those tasks not
> > holding the token. Which makes for token bouncing.
>
> OK.
>
> (We need to do something with
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/broken-out/mm-thrash-detect-process-thrashing-against-itself.patch,
> btw.  Has been in -mm since March and I'm still waiting for some benchmarks
> which would justify its inclusion..)

Hmm, benchmarks, I need VM benchmarks for my page replacement work too ;-)
Perhaps I can create a multi-threaded program that knows a few patterns.

> > The current mechanism seemingly assigns the token randomly (whomever
> > asks while not held gets it - and the hold time is fixed), however this
> > change in paging behaviour (holder less, contenders more) shifts the
> > odds in favour of one of the contenders.
> > Also the fixed holding time
> > will make sure the token doesn't get released too soon and the holder
> > can make some progress.
> >
> > So while I agree it would be nice to get rid of all magic variables
> > (holding time in the current impl) this proposed solution hasn't
> > convinced me (for one it introduces another).
> >
> > (for the interested, the various attempts I tried are available here:
> >   http://programming.kicks-ass.net/kernel-patches/swap_token/ )
>
> OK, thanks for looking into it.  I do think this is rich ground for
> optimisation.

Given the amazing reduction in speed I accomplished yesterday (worst was
3h09:02), I'd say we're not doing bad, but yeah, I too think there is room
for improvement.
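
As a sanity check on the figures quoted in this mail, here is a minimal
Python sketch that recomputes the CPU share and the relative slowdowns from
the reported times. The helper names (to_seconds, cpu_share, slowdown_pct)
are made up for illustration and are not part of any patch; the inputs are
copied from the two time reports and the worst-case run mentioned above.

```python
def to_seconds(hms: str) -> int:
    """Convert an h:mm:ss wall-clock string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def cpu_share(user: float, system: float, elapsed: int) -> float:
    """Fraction of wall-clock time spent on the CPU (user + system)."""
    return (user + system) / elapsed


def slowdown_pct(baseline: int, candidate: int) -> float:
    """Extra wall-clock time of candidate relative to baseline, in percent."""
    return (candidate / baseline - 1.0) * 100.0


# Wall-clock times as reported for mem=64M ; make -j5.
vanilla = to_seconds("2:32:55")
swap_token = to_seconds("2:41:48")
worst = to_seconds("3:09:02")  # worst of the other attempts

# User and system times (seconds) from the two time reports.
vanilla_cpu = cpu_share(2726.81, 2266.85, vanilla)
swap_token_cpu = cpu_share(2720.54, 2428.60, swap_token)

print(f"vanilla CPU share:     {vanilla_cpu:.0%}")
print(f"swap-token CPU share:  {swap_token_cpu:.0%}")
print(f"swap-token vs vanilla: +{slowdown_pct(vanilla, swap_token):.1f}%")
print(f"worst vs vanilla:      +{slowdown_pct(vanilla, worst):.1f}%")
```

Run as-is, this puts the two runs at roughly 54% and 53% CPU, the swap-token
run about 6% slower than vanilla, and the worst attempt about 24% slower.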