From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id ; Thu, 7 Jun 2001 04:58:31 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id ; Thu, 7 Jun 2001 04:58:22 -0400
Received: from ebiederm.dsl.xmission.com ([166.70.28.69]:56620 "EHLO
        flinx.biederman.org") by vger.kernel.org with ESMTP
        id ; Thu, 7 Jun 2001 04:58:17 -0400
To: Linus Torvalds
Cc: linux-kernel@vger.kernel.org
Subject: Re: Break 2.4 VM in five easy steps
In-Reply-To:
From: ebiederm@xmission.com (Eric W. Biederman)
Date: 07 Jun 2001 02:54:34 -0600
In-Reply-To:
Message-ID:
User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/20.5
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Linus Torvalds writes:

> On 7 Jun 2001, Eric W. Biederman wrote:
>
> No - I suspect that we're not actually doing all that much IO at all, and
> the real reason for the lock-up is just that the current algorithm is so
> bad that when it starts to act exponentially worse it really _is_ taking
> minutes of CPU time following pointers and generally not being very nice
> on the CPU cache etc..

Hmm.  Unless I am mistaken, the complexity is O(SwapPages*VMSize),
which is very bad, but nowhere near exponentially horrible.

> The bulk of the work is walking the process page tables thousands and
> thousands of times. Expensive.

Definitely.  I played with following the page tables in a good way a
while back, and even when you do it right the process is slow.  Is

        if (need_resched) {
                schedule();
        }

a good idiom to use when you know you have a loop that will take a
long time?  Even if we do this right, we should do our best to avoid
starving other processes in the system.

Hmm.  There is a nasty case with turning the walk inside out.  When we
read a page into RAM there could be other users of that page that
still refer to the swap entry.
So we cannot immediately remove the page from the swap cache, unless
we want to break sharing and increase the demands upon the virtual
memory when we are shrinking it...

> > If this is going on I think we need to look at our delayed
> > deallocation policy a little more carefully.
>
> Agreed. I already talked in private with some people about just
> re-visiting the issue of the lazy de-allocation. It has nice properties,
> but it certainly appears as if the nasty cases just plain outweigh the
> advantages.

I'm trying to remember the advantages, besides not having to care that
a page is a swap page in free_pte.  If there really is some value in
not handling the pages there (and I seem to recall something about
pages under I/O), it might at least be worth putting the pages on
their own LRU list, so that kswapd can crunch through the list
whenever it wakes up and give back a bunch of free pages.

Eric
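
P.S. To make the O(SwapPages*VMSize) estimate above concrete, here is a
rough user-space toy model, not kernel code.  Everything in it is an
assumption made for illustration: the "page table" is a flat array of
longs, NOT_SWAPPED, naive_walk() and inside_out_walk() are made-up
names, and it deliberately ignores the sharing problem described above
(other users that still refer to the swap entry).  It only counts how
many page table slots each strategy has to look at.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical marker: this slot does not point at a swap entry. */
#define NOT_SWAPPED (-1L)

/*
 * Per-swap-entry walk: for every swap entry, scan the whole "page table"
 * looking for slots that refer to it.  Examines swap_pages * vm_size slots.
 */
long naive_walk(long *pte, long vm_size, long swap_pages)
{
    long visits = 0;

    assert(pte != NULL);
    for (long entry = 0; entry < swap_pages; entry++) {
        for (long i = 0; i < vm_size; i++) {
            visits++;
            if (pte[i] == entry)
                pte[i] = NOT_SWAPPED;   /* pretend the page was read back in */
        }
    }
    return visits;
}

/*
 * Walk turned inside out: scan the "page table" once and handle every
 * swapped slot as it is found.  Examines vm_size slots.
 */
long inside_out_walk(long *pte, long vm_size)
{
    long visits = 0;

    assert(pte != NULL);
    for (long i = 0; i < vm_size; i++) {
        visits++;
        if (pte[i] != NOT_SWAPPED)
            pte[i] = NOT_SWAPPED;       /* pretend the page was read back in */
    }
    return visits;
}
```

With a table of 8 slots, 4 of which refer to swap entries 0..3,
naive_walk() examines 4 * 8 = 32 slots while inside_out_walk() examines
8.  The gap grows linearly with the number of swap pages, which is bad
but not exponential.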