From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S261626AbTI3RiU (ORCPT ); Tue, 30 Sep 2003 13:38:20 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S261625AbTI3RiU (ORCPT ); Tue, 30 Sep 2003 13:38:20 -0400
Received: from host16.fastclick.com ([205.180.85.17]:6873 "EHLO mail.fastclick.net")
	by vger.kernel.org with ESMTP id S261689AbTI3ReT (ORCPT );
	Tue, 30 Sep 2003 13:34:19 -0400
Message-ID: <3F79BE9A.7010308@fastclick.com>
Date: Tue, 30 Sep 2003 10:34:18 -0700
From: Brett
Reply-To: brettspamacct@fastclick.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: linux-kernel@vger.kernel.org
Subject: 2.4.20 & 2.4.22 paging out when it shouldn't
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

We noticed that with Linux kernel 2.4.20, and probably earlier versions, our machines consistently go into paging overload at the same time every hour, exactly when we kill a few processes and start new ones.

I am running sar, iostat and ps while this happens. sar shows pgpgout/s jumping to 1000 or more for 30 seconds, with a corresponding increase in disk write activity (iostat's Blk_wrtn/s goes from about 15 to 5000). Meanwhile the server is bogged down, connections to the server time out and all hell breaks loose. I also see swap usage increasing, so I can only assume it's paging to disk.

The problem is that there are around 500 megs of cache according to top and sar, so we shouldn't have to page. So I added 500 megs of memory, for a grand total of 1.5 gigs. Same problem, except the cache grew to 800 megs. So I did a swapoff -a.
Same problem, except that vmstat and sar now show swap at 0 while sar still reports high pgpgout/s.

Next I upgraded to kernel 2.4.22 and patched it with the latest rmap patch (-rmap15k), figuring the new VM would help. The cache became a bit smaller, but it still paged out to disk.

I have gone over the linux-kernel mailing list archives and found others who have run across a similar problem, but there were no solid answers. Someone recommended issuing this command as a workaround:

    dd if=/dev/hda bs=8M count=$(awk '/MemTotal/ { printf "%d", $2/4096 }' /proc/meminfo)

So I did that. kswapd took up 20-30% CPU and the cache shot up. Then I killed the process and the cache went down to 300 megs. So I figured I had finally shrunk the disk cache and freed up memory, and it shouldn't page. But it still paged.

Am I doing something wrong? It shouldn't page out to disk if I do swapoff -a and have more than enough memory. Also, it should just evict the disk cache and use that memory for process pages instead of paging out to disk; the disk cache isn't that valuable. It doesn't make sense, so I hope I'm doing something wrong.

Any tips? If anyone needs more information, please ask.

Thanks,
Brett
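
P.S. For anyone who wants to check how much data that dd workaround actually reads, here is a rough sketch of the arithmetic in Python. The MemTotal value is a made-up sample for a 1.5 gig box (a real /proc/meminfo reports a little less than the installed RAM), and the function name dd_read_plan is only for illustration. It assumes dd's bs=8M means 8 MiB blocks and that awk's "%d" truncates the quotient.

```python
import re

# Hypothetical sample line, standing in for /proc/meminfo on a 1.5 GB machine.
SAMPLE_MEMINFO = "MemTotal:      1572864 kB\n"


def dd_read_plan(meminfo_text, block_mib=8):
    """Mirror the awk expression $2/4096 and report what dd would read."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no MemTotal line found")
    mem_total_kib = int(match.group(1))
    # awk's printf "%d" drops the fractional part of $2/4096.
    count = mem_total_kib // 4096
    # dd reads `count` blocks of `block_mib` MiB each.
    read_mib = count * block_mib
    return mem_total_kib, count, read_mib


mem_kib, count, read_mib = dd_read_plan(SAMPLE_MEMINFO)
print("MemTotal: %d MiB" % (mem_kib // 1024))
print("dd count: %d blocks of 8 MiB" % count)
print("dd reads: %d MiB" % read_mib)
```

With the sample value the count comes out to 384 blocks, so dd reads roughly twice the machine's memory from the disk, which would be consistent with the cache shooting up while it ran.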