From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx198.postini.com [74.125.245.198]) by kanga.kvack.org (Postfix) with SMTP id 81AAB6B0002 for ; Sun, 17 Feb 2013 17:03:03 -0500 (EST)
Received: by mail-ee0-f45.google.com with SMTP id b57so2510946eek.4 for ; Sun, 17 Feb 2013 14:03:01 -0800 (PST)
Message-ID: <51215393.1070409@suse.cz>
Date: Sun, 17 Feb 2013 23:02:59 +0100
From: Jiri Slaby
MIME-Version: 1.0
Subject: kswapd craziness round 2
Content-Type: text/plain; charset=ISO-8859-2
Content-Transfer-Encoding: 8bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: linux-mm
Cc: Mel Gorman , Andrew Morton , Valdis Kletnieks , LKML , Rik van Riel

Hi,

You still feel the sour taste of the "kswapd craziness in v3.7" thread,
right? Welcome to the hell, part two :{.

I believe this started happening after update from
3.8.0-rc4-next-20130125 to 3.8.0-rc7-next-20130211. The same as before,
many hours of uptime are needed and perhaps some suspend/resume cycles
too. Memory pressure is not high, plenty of I/O cache:
# free
             total       used       free     shared    buffers     cached
Mem:       6026692    5571184     455508          0     351252    2016648
-/+ buffers/cache:    3203284    2823408
Swap:            0          0          0

kswap is working very toughly though:
root       580  0.6  0.0      0     0 ?        S    úno12  46:21 [kswapd0]

This happens on I/O activity right now. For example by updatedb or find
/. This is what the stack trace of kswapd0 looks like:
[] shrink_slab+0xa1/0x2d0
[] kswapd+0x541/0x930
[] kthread+0xc0/0xd0
[] ret_from_fork+0x7c/0xb0
[] 0xffffffffffffffff

Any ideas?

thanks,
--
js
suse labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
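The claim that memory pressure is low rests on the "-/+ buffers/cache" row of the `free` output in the report. As a minimal sketch of that arithmetic (the function name `buffers_cache_adjusted` is made up for this illustration, and the figures are the KiB values from the report), the row can be rederived from the `Mem:` row like this:

```python
def buffers_cache_adjusted(used_kib, free_kib, buffers_kib, cached_kib):
    """Recompute the '-/+ buffers/cache' row of the old procps `free` output.

    Buffers and page cache are counted as reclaimable here, so they are
    subtracted from 'used' and added to 'free'. This is the simple model
    that `free` itself printed at the time, not a statement about what
    kswapd can actually reclaim.
    """
    reclaimable = buffers_kib + cached_kib
    return used_kib - reclaimable, free_kib + reclaimable


# Figures from the Mem: row in the report.
total_kib = 6026692
used_adj, free_adj = buffers_cache_adjusted(
    used_kib=5571184, free_kib=455508, buffers_kib=351252, cached_kib=2016648
)
print(used_adj, free_adj)  # 3203284 2823408, matching the reported row
print(f"{100 * free_adj / total_kib:.1f}% of RAM is free or cache-backed")
```

Under that model a little under half of the machine's memory is free or backed by buffers and cache, which is why a busy kswapd0 looks out of place in the report.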
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx173.postini.com [74.125.245.173]) by kanga.kvack.org (Postfix) with SMTP id 2314B6B0002 for ; Fri, 8 Mar 2013 01:42:32 -0500 (EST) Received: by mail-ob0-f172.google.com with SMTP id tb18so1038648obb.17 for ; Thu, 07 Mar 2013 22:42:31 -0800 (PST) MIME-Version: 1.0 In-Reply-To: <5138EC6C.6030906@suse.cz> References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> <512F8D8B.3070307@suse.cz> <5138EC6C.6030906@suse.cz> Date: Fri, 8 Mar 2013 14:42:31 +0800 Message-ID: Subject: Re: kswapd craziness round 2 From: Hillf Danton Content-Type: text/plain; charset=UTF-8 Sender: owner-linux-mm@kvack.org List-ID: To: Jiri Slaby Cc: Daniel J Blueman , Linux Kernel , Steffen Persvold , mm , Mel Gorman , Andrew Morton On Fri, Mar 8, 2013 at 3:37 AM, Jiri Slaby wrote: > On 03/01/2013 03:02 PM, Hillf Danton wrote: >> On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby wrote: >>> >>> Ok, no difference, kswap is still crazy. I'm attaching the output of >>> "grep -vw '0' /proc/vmstat" if you see something there. >>> >> Thanks to you for test and data. >> >> Lets try to restore the deleted nap, then. > > Oh, it seems to be nice now: > root 579 0.0 0.0 0 0 ? S Mar04 0:13 [kswapd0] > Double thanks. But Mel does not like it, probably. Lets try nap in another way. 
Hillf --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 +++ b/mm/vmscan.c Fri Mar 8 14:36:10 2013 @@ -2793,6 +2793,10 @@ loop_again: * speculatively avoid congestion waits */ zone_clear_flag(zone, ZONE_CONGESTED); + + else if (sc.priority > 2 && + sc.priority < DEF_PRIORITY - 2) + wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10); } /* -- >> >> --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 >> +++ b/mm/vmscan.c Fri Mar 1 21:55:40 2013 >> @@ -2817,6 +2817,10 @@ loop_again: >> */ >> if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX) >> break; >> + >> + if (sc.priority < DEF_PRIORITY - 2) >> + congestion_wait(BLK_RW_ASYNC, HZ/10); >> + >> } while (--sc.priority >= 0); >> >> out: >> -- >> > > > -- > js > suse labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx199.postini.com [74.125.245.199]) by kanga.kvack.org (Postfix) with SMTP id 067326B0002 for ; Fri, 8 Mar 2013 02:29:47 -0500 (EST) Message-ID: <51399368.3040200@bitsync.net> Date: Fri, 08 Mar 2013 08:29:44 +0100 From: Zlatko Calusic MIME-Version: 1.0 Subject: Re: kswapd craziness round 2 References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> <512F8D8B.3070307@suse.cz> <5138EC6C.6030906@suse.cz> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Hillf Danton Cc: Jiri Slaby , Daniel J Blueman , Linux Kernel , Steffen Persvold , mm , Mel Gorman , Andrew Morton On 08.03.2013 07:42, Hillf Danton wrote: > On Fri, Mar 8, 2013 at 3:37 AM, Jiri Slaby wrote: >> On 03/01/2013 03:02 PM, Hillf Danton wrote: >>> On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby wrote: >>>> >>>> Ok, no difference, kswap is still crazy. I'm attaching the output of >>>> "grep -vw '0' /proc/vmstat" if you see something there. 
>>>>
>>> Thanks to you for test and data.
>>>
>>> Lets try to restore the deleted nap, then.
>>
>> Oh, it seems to be nice now:
>> root 579 0.0 0.0 0 0 ? S Mar04 0:13 [kswapd0]
>>
> Double thanks.
>
> But Mel does not like it, probably.
> Lets try nap in another way.
>
> Hillf
>
> --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013
> +++ b/mm/vmscan.c Fri Mar 8 14:36:10 2013
> @@ -2793,6 +2793,10 @@ loop_again:
>          * speculatively avoid congestion waits
>          */
>         zone_clear_flag(zone, ZONE_CONGESTED);
> +
> +     else if (sc.priority > 2 &&
> +              sc.priority < DEF_PRIORITY - 2)
> +             wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10);
>     }
>
>     /*
> --
>

There's another bug in there, which I'm still chasing. Artificial sleeps
like this just mask the real bug and introduce new problems (on my 4GB
server kswapd spends all the time in those congestion wait calls).

The problem is that the bug needs about 5 days of uptime to reveal its
ugly head. So far I can only tell that it was introduced somewhere
between 3.1 & 3.4.

Also, check shrink_inactive_list(), it already sleeps if really needed:

        if (nr_writeback && nr_writeback >=
                        (nr_taken >> (DEF_PRIORITY - sc->priority)))
                wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10);

Regards,
--
Zlatko

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
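The `shrink_inactive_list()` condition Zlatko quotes can be tried out in user space. The following is only a rough model, not kernel code: `should_throttle` is a hypothetical name, and `DEF_PRIORITY = 12` is assumed to match the kernel constant of that period. It shows how the writeback threshold halves each time the scan priority drops by one.

```python
DEF_PRIORITY = 12  # assumed value of the kernel constant in this era


def should_throttle(nr_writeback, nr_taken, priority):
    """User-space model of the quoted wait_iff_congested() guard.

    Mirrors:
        nr_writeback && nr_writeback >= (nr_taken >> (DEF_PRIORITY - priority))
    At priority == DEF_PRIORITY every isolated page must be under
    writeback before the caller sleeps; each lower priority halves
    that threshold.
    """
    threshold = nr_taken >> (DEF_PRIORITY - priority)
    return nr_writeback > 0 and nr_writeback >= threshold


# With 32 isolated pages, list the smallest writeback count that would
# trigger the sleep at a few priorities.
for prio in (12, 11, 10, 8, 0):
    needed = max(1, 32 >> (DEF_PRIORITY - prio))
    print(f"priority {prio:2d}: sleeps once {needed} of 32 pages are under writeback")
```

The model makes Zlatko's point concrete: the existing guard only sleeps when pages are actually under writeback, whereas the proposed patch naps based on priority alone.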
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx184.postini.com [74.125.245.184]) by kanga.kvack.org (Postfix) with SMTP id 36BBE6B0006 for ; Fri, 8 Mar 2013 03:27:54 -0500 (EST) Received: by mail-ob0-f170.google.com with SMTP id wc20so1101629obb.29 for ; Fri, 08 Mar 2013 00:27:53 -0800 (PST) MIME-Version: 1.0 In-Reply-To: <51399368.3040200@bitsync.net> References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> <512F8D8B.3070307@suse.cz> <5138EC6C.6030906@suse.cz> <51399368.3040200@bitsync.net> Date: Fri, 8 Mar 2013 16:27:53 +0800 Message-ID: Subject: Re: kswapd craziness round 2 From: Hillf Danton Content-Type: text/plain; charset=UTF-8 Sender: owner-linux-mm@kvack.org List-ID: To: Zlatko Calusic Cc: Jiri Slaby , Daniel J Blueman , Linux Kernel , Steffen Persvold , mm , Mel Gorman , Andrew Morton On Fri, Mar 8, 2013 at 3:29 PM, Zlatko Calusic wrote: > There's another bug in there, which I'm still chasing. > I am busy in discovering an employer(a really hard work?) so I dunno the hours I have for that bug. Hmm, take a look at Mels thoughts? http://marc.info/?l=linux-mm&m=136189593423501&w=2 BTW, he will be online next week. Hillf -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx134.postini.com [74.125.245.134]) by kanga.kvack.org (Postfix) with SMTP id D6EC46B0005 for ; Fri, 8 Mar 2013 18:21:11 -0500 (EST) Received: by mail-ea0-f169.google.com with SMTP id z7so404091eaf.14 for ; Fri, 08 Mar 2013 15:21:10 -0800 (PST) Message-ID: <513A7263.5090303@suse.cz> Date: Sat, 09 Mar 2013 00:21:07 +0100 From: Jiri Slaby MIME-Version: 1.0 Subject: Re: kswapd craziness round 2 References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> <512F8D8B.3070307@suse.cz> <5138EC6C.6030906@suse.cz> In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Hillf Danton Cc: Daniel J Blueman , Linux Kernel , Steffen Persvold , mm , Mel Gorman , Andrew Morton On 03/08/2013 07:42 AM, Hillf Danton wrote: > On Fri, Mar 8, 2013 at 3:37 AM, Jiri Slaby wrote: >> On 03/01/2013 03:02 PM, Hillf Danton wrote: >>> On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby wrote: >>>> >>>> Ok, no difference, kswap is still crazy. I'm attaching the output of >>>> "grep -vw '0' /proc/vmstat" if you see something there. >>>> >>> Thanks to you for test and data. >>> >>> Lets try to restore the deleted nap, then. >> >> Oh, it seems to be nice now: >> root 579 0.0 0.0 0 0 ? S Mar04 0:13 [kswapd0] >> > Double thanks. There is one downside. I'm not sure whether that patch was the culprit. My Thunderbird is jerky when scrolling and lags while writing this message. The letters sometimes appear later than typed and in groups. Like I (kbd): My Thunder TB: My Thunder I (kbd): b-i-r-d TB: is silent I (kbd): still typing... TB: bird is Perhaps it's not only TB. > But Mel does not like it, probably. > Lets try nap in another way. Will try next week. 
> --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 > +++ b/mm/vmscan.c Fri Mar 8 14:36:10 2013 > @@ -2793,6 +2793,10 @@ loop_again: > * speculatively avoid congestion waits > */ > zone_clear_flag(zone, ZONE_CONGESTED); > + > + else if (sc.priority > 2 && > + sc.priority < DEF_PRIORITY - 2) > + wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10); > } > > /* -- js suse labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx123.postini.com [74.125.245.123]) by kanga.kvack.org (Postfix) with SMTP id 268366B0037 for ; Tue, 19 Mar 2013 12:59:46 -0400 (EDT) Message-ID: <51489979.2070403@draigBrady.com> Date: Tue, 19 Mar 2013 16:59:37 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= MIME-Version: 1.0 Subject: Re: kswapd craziness round 2 References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> <512F8D8B.3070307@suse.cz> <5138EC6C.6030906@suse.cz> <513A7263.5090303@suse.cz> In-Reply-To: <513A7263.5090303@suse.cz> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: owner-linux-mm@kvack.org List-ID: To: Jiri Slaby Cc: Hillf Danton , Daniel J Blueman , Linux Kernel , Steffen Persvold , mm , Mel Gorman , Andrew Morton On 03/08/2013 11:21 PM, Jiri Slaby wrote: > On 03/08/2013 07:42 AM, Hillf Danton wrote: >> On Fri, Mar 8, 2013 at 3:37 AM, Jiri Slaby wrote: >>> On 03/01/2013 03:02 PM, Hillf Danton wrote: >>>> On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby wrote: >>>>> >>>>> Ok, no difference, kswap is still crazy. I'm attaching the output of >>>>> "grep -vw '0' /proc/vmstat" if you see something there. >>>>> >>>> Thanks to you for test and data. >>>> >>>> Lets try to restore the deleted nap, then. >>> >>> Oh, it seems to be nice now: >>> root 579 0.0 0.0 0 0 ? S Mar04 0:13 [kswapd0] >>> >> Double thanks. > > There is one downside. 
I'm not sure whether that patch was the culprit.
> My Thunderbird is jerky when scrolling and lags while writing this
> message. The letters sometimes appear later than typed and in groups. Like
> I (kbd): My Thunder
> TB: My Thunder
> I (kbd): b-i-r-d
> TB: is silent
> I (kbd): still typing...
> TB: bird is
>
> Perhaps it's not only TB.

I notice the same thunderbird issue on the much older 2.6.40.4-5.fc15.x86_64
which I'd hoped would be fixed on upgrade :(

My Thunderbird is using 1957m virt, 722m RSS on my 3G system.
What are your corresponding mem values?

For reference:
http://marc.info/?t=130865025500001&r=1&w=2
https://bugzilla.redhat.com/show_bug.cgi?id=712019

thanks,
Pádraig.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx162.postini.com [74.125.245.162]) by kanga.kvack.org (Postfix) with SMTP id 842256B0005 for ; Wed, 20 Mar 2013 00:12:33 -0400 (EDT)
Received: by mail-oa0-f48.google.com with SMTP id j1so1315038oag.21 for ; Tue, 19 Mar 2013 21:12:32 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <51489979.2070403@draigBrady.com>
References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> <512F8D8B.3070307@suse.cz> <5138EC6C.6030906@suse.cz> <513A7263.5090303@suse.cz> <51489979.2070403@draigBrady.com>
Date: Wed, 20 Mar 2013 12:12:32 +0800
Message-ID:
Subject: Re: kswapd craziness round 2
From: Hillf Danton
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Sender: owner-linux-mm@kvack.org
List-ID:
To: =?UTF-8?Q?P=C3=A1draig_Brady?=
Cc: Jiri Slaby , Daniel J Blueman , Linux Kernel , Steffen Persvold , mm , Mel Gorman

On Wed, Mar 20, 2013 at 12:59 AM, Pádraig Brady wrote:
>
> I notice the same thunderbird issue on the much older 2.6.40.4-5.fc15.x86_64
> which I'd hoped would be fixed
on upgrade :(
>
> My Thunderbird is using 1957m virt, 722m RSS on my 3G system.
> What are your corresponding mem values?
>
> For reference:
> http://marc.info/?t=130865025500001&r=1&w=2
> https://bugzilla.redhat.com/show_bug.cgi?id=712019
>
Hey, would you all please try Mel's new work?
http://marc.info/?l=linux-mm&m=136352546814642&w=4

thanks
Hillf

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx201.postini.com [74.125.245.201]) by kanga.kvack.org (Postfix) with SMTP id 9B8EE6B0002 for ; Wed, 20 Mar 2013 04:38:45 -0400 (EDT)
Received: by mail-ee0-f54.google.com with SMTP id c41so669655eek.13 for ; Wed, 20 Mar 2013 01:38:44 -0700 (PDT)
Message-ID: <514975B9.7050304@suse.cz>
Date: Wed, 20 Mar 2013 09:39:21 +0100
From: Jiri Slaby
MIME-Version: 1.0
Subject: Re: kswapd craziness round 2
References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> <512F8D8B.3070307@suse.cz> <5138EC6C.6030906@suse.cz> <513A7263.5090303@suse.cz> <51489979.2070403@draigBrady.com>
In-Reply-To:
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Hillf Danton , =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?=
Cc: Daniel J Blueman , Linux Kernel , Steffen Persvold , mm , Mel Gorman

On 03/20/2013 05:12 AM, Hillf Danton wrote:
> Hey, would you all please try Mel's new work?
> http://marc.info/?l=linux-mm&m=136352546814642&w=4

Yeah, I was in CC and also asked Mel if I should apply those. I will as
soon as I'm back home (next week).

thanks,
--
js
suse labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753905Ab3BQWDH (ORCPT ); Sun, 17 Feb 2013 17:03:07 -0500 Received: from mail-ee0-f52.google.com ([74.125.83.52]:53202 "EHLO mail-ee0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753195Ab3BQWDF (ORCPT ); Sun, 17 Feb 2013 17:03:05 -0500 Message-ID: <51215393.1070409@suse.cz> Date: Sun, 17 Feb 2013 23:02:59 +0100 From: Jiri Slaby User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:19.0) Gecko/20130124 Thunderbird/19.0 MIME-Version: 1.0 To: linux-mm CC: Mel Gorman , Andrew Morton , Valdis Kletnieks , LKML , Rik van Riel Subject: kswapd craziness round 2 X-Enigmail-Version: 1.6a1pre Content-Type: text/plain; charset=ISO-8859-2 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, You still feel the sour taste of the "kswapd craziness in v3.7" thread, right? Welcome to the hell, part two :{. I believe this started happening after update from 3.8.0-rc4-next-20130125 to 3.8.0-rc7-next-20130211. The same as before, many hours of uptime are needed and perhaps some suspend/resume cycles too. Memory pressure is not high, plenty of I/O cache: # free total used free shared buffers cached Mem: 6026692 5571184 455508 0 351252 2016648 -/+ buffers/cache: 3203284 2823408 Swap: 0 0 0 kswap is working very toughly though: root 580 0.6 0.0 0 0 ? S úno12 46:21 [kswapd0] This happens on I/O activity right now. For example by updatedb or find /. This is what the stack trace of kswapd0 looks like: [] shrink_slab+0xa1/0x2d0 [] kswapd+0x541/0x930 [] kthread+0xc0/0xd0 [] ret_from_fork+0x7c/0xb0 [] 0xffffffffffffffff Any ideas? 
thanks, -- js suse labs From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757922Ab3BRGS3 (ORCPT ); Mon, 18 Feb 2013 01:18:29 -0500 Received: from mail-da0-f42.google.com ([209.85.210.42]:36953 "EHLO mail-da0-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753372Ab3BRGS1 (ORCPT ); Mon, 18 Feb 2013 01:18:27 -0500 Message-ID: <5121C7AF.2090803@numascale-asia.com> Date: Mon, 18 Feb 2013 14:18:23 +0800 From: Daniel J Blueman Organization: Numascale Asia User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130106 Thunderbird/17.0.2 MIME-Version: 1.0 To: Jiri Slaby CC: "Linux Kernel" , "Steffen Persvold" Subject: Re: kswapd craziness round 2 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Monday, 18 February 2013 06:10:02 UTC+8, Jiri Slaby wrote: > Hi, > > You still feel the sour taste of the "kswapd craziness in v3.7" thread, > right? Welcome to the hell, part two :{. > > I believe this started happening after update from > 3.8.0-rc4-next-20130125 to 3.8.0-rc7-next-20130211. The same as before, > many hours of uptime are needed and perhaps some suspend/resume cycles > too. Memory pressure is not high, plenty of I/O cache: > # free > total used free shared buffers cached > Mem: 6026692 5571184 455508 0 351252 2016648 > -/+ buffers/cache: 3203284 2823408 > Swap: 0 0 0 > > kswap is working very toughly though: > root 580 0.6 0.0 0 0 ? S úno12 46:21 [kswapd0] > > This happens on I/O activity right now. For example by updatedb or find > /. 
This is what the stack trace of kswapd0 looks like:
> [] shrink_slab+0xa1/0x2d0
> [] kswapd+0x541/0x930
> [] kthread+0xc0/0xd0
> [] ret_from_fork+0x7c/0xb0
> [] 0xffffffffffffffff

Likewise with 3.8-rc, I've been able to reproduce [1] a livelock scenario
which hoses the box and observe RCU stalls [2].

There may be a connection; I'll do a bit more debugging in the next few
days.

Daniel

--- [1]

1. live-booted image using ramdisk
2. boot 3.8-rc with <16GB memory and without swap
3. run OpenMP NAS Parallel Benchmark dc.B against local disk (ie not
ramdisk)
4. observe hang O(30) mins later

--- [2]

[ 2675.587878] INFO: rcu_sched self-detected stall on CPU { 5} (t=24000
jiffies g=6313 c=6312 q=68)
--
Daniel J Blueman
Principal Software Engineer, Numascale Asia

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753821Ab3BRLmd (ORCPT ); Mon, 18 Feb 2013 06:42:33 -0500
Received: from mail-ob0-f182.google.com ([209.85.214.182]:62125 "EHLO mail-ob0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752163Ab3BRLmb convert rfc822-to-8bit (ORCPT ); Mon, 18 Feb 2013 06:42:31 -0500
MIME-Version: 1.0
In-Reply-To: <5121C7AF.2090803@numascale-asia.com>
References: <5121C7AF.2090803@numascale-asia.com>
Date: Mon, 18 Feb 2013 19:42:30 +0800
Message-ID:
Subject: Re: kswapd craziness round 2
From: Hillf Danton
To: Daniel J Blueman
Cc: Jiri Slaby , Linux Kernel , Steffen Persvold
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Feb 18, 2013 at 2:18 PM, Daniel J Blueman wrote:
> On Monday, 18 February 2013 06:10:02 UTC+8, Jiri Slaby wrote:
>
>> Hi,
>>
>> You still feel the sour taste of the "kswapd craziness in v3.7" thread,
>> right? Welcome to the hell, part two :{.
>>
>> I believe this started happening after update from
>> 3.8.0-rc4-next-20130125 to 3.8.0-rc7-next-20130211. The same as before,
>> many hours of uptime are needed and perhaps some suspend/resume cycles
>> too. Memory pressure is not high, plenty of I/O cache:
>> # free
>>              total       used       free     shared    buffers     cached
>> Mem:       6026692    5571184     455508          0     351252    2016648
>> -/+ buffers/cache:    3203284    2823408
>> Swap:            0          0          0
>>
>> kswap is working very toughly though:
>> root       580  0.6  0.0      0     0 ?        S    úno12  46:21 [kswapd0]
>>
>> This happens on I/O activity right now. For example by updatedb or find
>> /. This is what the stack trace of kswapd0 looks like:
>> [] shrink_slab+0xa1/0x2d0
>> [] kswapd+0x541/0x930
>> [] kthread+0xc0/0xd0
>> [] ret_from_fork+0x7c/0xb0
>> [] 0xffffffffffffffff
>
> Likewise with 3.8-rc, I've been able to reproduce [1] a livelock scenario
> which hoses the box and observe RCU stalls [2].
>
> There may be a connection; I'll do a bit more debugging in the next few
> days.
>
> Daniel
>
> --- [1]
>
> 1. live-booted image using ramdisk
> 2. boot 3.8-rc with <16GB memory and without swap
> 3. run OpenMP NAS Parallel Benchmark dc.B against local disk (ie not
> ramdisk)
> 4. observe hang O(30) mins later
>
> --- [2]
>
> [ 2675.587878] INFO: rcu_sched self-detected stall on CPU { 5} (t=24000
> jiffies g=6313 c=6312 q=68)

Does Ingo's revert help?
https://lkml.org/lkml/2013/2/15/168

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753253Ab3BRPGE (ORCPT ); Mon, 18 Feb 2013 10:06:04 -0500
Received: from mail-pa0-f43.google.com ([209.85.220.43]:54275 "EHLO mail-pa0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752664Ab3BRPGB (ORCPT ); Mon, 18 Feb 2013 10:06:01 -0500
Message-ID: <51224354.4010909@numascale-asia.com>
Date: Mon, 18 Feb 2013 23:05:56 +0800
From: Daniel J Blueman
Organization: Numascale Asia
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130106 Thunderbird/17.0.2
MIME-Version: 1.0
To: Hillf Danton
CC: Jiri Slaby , Linux Kernel , Steffen Persvold , Ingo Molnar , Linus Torvalds
Subject: Re: kswapd craziness round 2
References: <5121C7AF.2090803@numascale-asia.com>
In-Reply-To:
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 18/02/2013 19:42, Hillf Danton wrote:
> On Mon, Feb 18, 2013 at 2:18 PM, Daniel J Blueman
> wrote:
>> On Monday, 18 February 2013 06:10:02 UTC+8, Jiri Slaby wrote:
>>
>>> Hi,
>>>
>>> You still feel the sour taste of the "kswapd craziness in v3.7" thread,
>>> right? Welcome to the hell, part two :{.
>>>
>>> I believe this started happening after update from
>>> 3.8.0-rc4-next-20130125 to 3.8.0-rc7-next-20130211. The same as before,
>>> many hours of uptime are needed and perhaps some suspend/resume cycles
>>> too. Memory pressure is not high, plenty of I/O cache:
>>> # free
>>>              total       used       free     shared    buffers     cached
>>> Mem:       6026692    5571184     455508          0     351252    2016648
>>> -/+ buffers/cache:    3203284    2823408
>>> Swap:            0          0          0
>>>
>>> kswap is working very toughly though:
>>> root       580  0.6  0.0      0     0 ?        S    úno12  46:21 [kswapd0]
>>>
>>> This happens on I/O activity right now. For example by updatedb or find
>>> /.
This is what the stack trace of kswapd0 looks like:
>>> [] shrink_slab+0xa1/0x2d0
>>> [] kswapd+0x541/0x930
>>> [] kthread+0xc0/0xd0
>>> [] ret_from_fork+0x7c/0xb0
>>> [] 0xffffffffffffffff
>>
>> Likewise with 3.8-rc, I've been able to reproduce [1] a livelock scenario
>> which hoses the box and observe RCU stalls [2].
>>
>> There may be a connection; I'll do a bit more debugging in the next few
>> days.
>>
>> Daniel
>>
>> --- [1]
>>
>> 1. live-booted image using ramdisk
>> 2. boot 3.8-rc with <16GB memory and without swap
>> 3. run OpenMP NAS Parallel Benchmark dc.B against local disk (ie not
>> ramdisk)
>> 4. observe hang O(30) mins later
>>
>> --- [2]
>>
>> [ 2675.587878] INFO: rcu_sched self-detected stall on CPU { 5} (t=24000
>> jiffies g=6313 c=6312 q=68)
>
> Does Ingo's revert help? https://lkml.org/lkml/2013/2/15/168

Close, but no cigar; I still hit this livelock on 3.8-rc7 with Ingo's
revert or Linus's fix.

However, I am unable to reproduce the hang with 3.7.9, so will begin
bisection tomorrow, probably automating via pexpect.
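The bisection Daniel describes comes down to a binary search between a known-good and a known-bad kernel. The sketch below shows only that search and is not his actual tooling: `first_bad` and the `is_bad` callback are hypothetical names, the revision labels are placeholders, and a linear history is assumed, whereas `git bisect` works on the real commit graph. In practice `is_bad` would boot the revision and run the reproducer, for example under pexpect.

```python
def first_bad(revisions, is_bad):
    """Return the first revision for which is_bad() reports the hang.

    Assumes revisions[0] is known good, revisions[-1] is known bad, and
    that every revision after the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # the culprit is at mid or earlier
        else:
            lo = mid  # the culprit is after mid
    return revisions[hi]


# Placeholder history: 100 made-up revisions with the regression
# pretended to start at rev37.
history = [f"rev{i}" for i in range(100)]
culprit = first_bad(history, lambda rev: int(rev[3:]) >= 37)
print(culprit)  # rev37
```

Each step halves the remaining range, so 100 revisions need about seven test boots. Since the hang reportedly takes around 30 minutes to appear, a run that survives is only probably good, and a real script would likely want a generous timeout before declaring a revision clean.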
Thanks,
Daniel
--
Daniel J Blueman
Principal Software Engineer, Numascale Asia

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751361Ab3BTWOt (ORCPT ); Wed, 20 Feb 2013 17:14:49 -0500
Received: from mail-ee0-f41.google.com ([74.125.83.41]:58675 "EHLO mail-ee0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751063Ab3BTWOs (ORCPT ); Wed, 20 Feb 2013 17:14:48 -0500
Message-ID: <51254AD2.7000906@suse.cz>
Date: Wed, 20 Feb 2013 23:14:42 +0100
From: Jiri Slaby
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:19.0) Gecko/20130124 Thunderbird/19.0
MIME-Version: 1.0
To: Hillf Danton , Daniel J Blueman
CC: Linux Kernel , Steffen Persvold
Subject: Re: kswapd craziness round 2
References: <5121C7AF.2090803@numascale-asia.com>
In-Reply-To:
X-Enigmail-Version: 1.6a1pre
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/18/2013 12:42 PM, Hillf Danton wrote:
> On Mon, Feb 18, 2013 at 2:18 PM, Daniel J Blueman
> wrote:
>> On Monday, 18 February 2013 06:10:02 UTC+8, Jiri Slaby wrote:
>>
>>> Hi,
>>>
>>> You still feel the sour taste of the "kswapd craziness in v3.7" thread,
>>> right? Welcome to the hell, part two :{.
...
>>> kswap is working very toughly though:
>>> root       580  0.6  0.0      0     0 ?        S    úno12  46:21 [kswapd0]
...
>> [ 2675.587878] INFO: rcu_sched self-detected stall on CPU { 5} (t=24000
>> jiffies g=6313 c=6312 q=68)
>
> Does Ingo's revert help? https://lkml.org/lkml/2013/2/15/168

Not at all...
-- js suse labs From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754118Ab3BUMHF (ORCPT ); Thu, 21 Feb 2013 07:07:05 -0500 Received: from mail-ob0-f176.google.com ([209.85.214.176]:36932 "EHLO mail-ob0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753585Ab3BUMHD (ORCPT ); Thu, 21 Feb 2013 07:07:03 -0500 MIME-Version: 1.0 In-Reply-To: <51254AD2.7000906@suse.cz> References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> Date: Thu, 21 Feb 2013 20:07:03 +0800 Message-ID: Subject: Re: kswapd craziness round 2 From: Hillf Danton To: Jiri Slaby Cc: Daniel J Blueman , Linux Kernel , Steffen Persvold Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Feb 21, 2013 at 6:14 AM, Jiri Slaby wrote: >> >> Does Ingo's revert help? https://lkml.org/lkml/2013/2/15/168 > > Not at all... > Then mind taking a try? --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 +++ b/mm/vmscan.c Thu Feb 21 20:05:58 2013 @@ -1715,7 +1715,7 @@ static void get_scan_count(struct lruvec * to swap. Better start now and leave the - probably heavily * thrashing - remaining file pages alone. */ - if (global_reclaim(sc)) { + if (global_reclaim(sc) && sc->priority >= DEF_PRIORITY - 2) { free = zone_page_state(zone, NR_FREE_PAGES); if (unlikely(file + free <= high_wmark_pages(zone))) { scan_balance = SCAN_ANON; @@ -2840,9 +2840,10 @@ out: * reclaim if they wish. 
*/ if (sc.nr_reclaimed < SWAP_CLUSTER_MAX) - order = sc.order = 0; - - goto loop_again; + if (order != 0) { + sc.order = order = 0; + goto loop_again; + } } /* -- From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759408Ab3BXV13 (ORCPT ); Sun, 24 Feb 2013 16:27:29 -0500 Received: from mail-ee0-f43.google.com ([74.125.83.43]:34446 "EHLO mail-ee0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758354Ab3BXV11 (ORCPT ); Sun, 24 Feb 2013 16:27:27 -0500 Message-ID: <512A85BB.8070500@suse.cz> Date: Sun, 24 Feb 2013 22:27:23 +0100 From: Jiri Slaby User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:19.0) Gecko/20130124 Thunderbird/19.0 MIME-Version: 1.0 To: Hillf Danton CC: Daniel J Blueman , Linux Kernel , Steffen Persvold Subject: Re: kswapd craziness round 2 References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> In-Reply-To: X-Enigmail-Version: 1.6a1pre Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 02/21/2013 01:07 PM, Hillf Danton wrote: > On Thu, Feb 21, 2013 at 6:14 AM, Jiri Slaby wrote: >>> >>> Does Ingo's revert help? https://lkml.org/lkml/2013/2/15/168 >> >> Not at all... >> > Then mind taking a try? Applied now, I'll report in a week or so as it needs a couple of days of uptime to occur. 
thanks, -- js suse labs From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759471Ab3B1RCM (ORCPT ); Thu, 28 Feb 2013 12:02:12 -0500 Received: from mail-ee0-f43.google.com ([74.125.83.43]:47188 "EHLO mail-ee0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755194Ab3B1RCI (ORCPT ); Thu, 28 Feb 2013 12:02:08 -0500 Message-ID: <512F8D8B.3070307@suse.cz> Date: Thu, 28 Feb 2013 18:02:03 +0100 From: Jiri Slaby User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:19.0) Gecko/20130124 Thunderbird/19.0 MIME-Version: 1.0 To: Hillf Danton CC: Daniel J Blueman , Linux Kernel , Steffen Persvold Subject: Re: kswapd craziness round 2 References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> In-Reply-To: X-Enigmail-Version: 1.6a1pre Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 02/21/2013 01:07 PM, Hillf Danton wrote: > On Thu, Feb 21, 2013 at 6:14 AM, Jiri Slaby wrote: >>> >>> Does Ingo's revert help? https://lkml.org/lkml/2013/2/15/168 >> >> Not at all... >> > Then mind taking a try? Ok, no difference, kswap is still crazy. I'm attaching the output of "grep -vw '0' /proc/vmstat" if you see something there. > --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 > +++ b/mm/vmscan.c Thu Feb 21 20:05:58 2013 > @@ -1715,7 +1715,7 @@ static void get_scan_count(struct lruvec > * to swap. Better start now and leave the - probably heavily > * thrashing - remaining file pages alone. > */ > - if (global_reclaim(sc)) { > + if (global_reclaim(sc) && sc->priority >= DEF_PRIORITY - 2) { > free = zone_page_state(zone, NR_FREE_PAGES); > if (unlikely(file + free <= high_wmark_pages(zone))) { > scan_balance = SCAN_ANON; > @@ -2840,9 +2840,10 @@ out: > * reclaim if they wish. 
>          */
>         if (sc.nr_reclaimed < SWAP_CLUSTER_MAX)
> -               order = sc.order = 0;
> -
> -       goto loop_again;
> +               if (order != 0) {
> +                       sc.order = order = 0;
> +                       goto loop_again;
> +               }
>     }
>
>     /*

nr_free_pages 36767
nr_inactive_anon 209253
nr_active_anon 1000355
nr_inactive_file 130500
nr_active_file 82677
nr_anon_pages 781334
nr_mapped 94443
nr_file_pages 554906
nr_dirty 29
nr_slab_reclaimable 13104
nr_slab_unreclaimable 9202
nr_page_table_pages 11694
nr_kernel_stack 477
nr_vmscan_write 114
nr_vmscan_immediate_reclaim 831
nr_shmem 341734
nr_dirtied 13492560
nr_written 13388832
nr_anon_transparent_hugepages 169
nr_dirty_threshold 20063
nr_dirty_background_threshold 10031
pgpgin 29026221
pgpgout 55166319
pgalloc_dma 256
pgalloc_dma32 75887179
pgalloc_normal 127591749
pgfree 212204191
pgactivate 5665900
pgdeactivate 1370274
pgfault 130946292
pgmajfault 91443
pgrefill_dma32 582854
pgrefill_normal 1140727
pgsteal_kswapd_dma32 6244454
pgsteal_kswapd_normal 6341734
pgsteal_direct_dma32 1209055
pgsteal_direct_normal 2280164
pgscan_kswapd_dma32 6271350
pgscan_kswapd_normal 6403760
pgscan_direct_dma32 1213349
pgscan_direct_normal 2300634
pginodesteal 190690
slabs_scanned 5139200
kswapd_inodesteal 456779
kswapd_low_wmark_hit_quickly 5042
kswapd_high_wmark_hit_quickly 156125
pageoutrun 170524
allocstall 32073
pgrotated 1321
pgmigrate_success 890843
pgmigrate_fail 282
compact_migrate_scanned 7776871
compact_free_scanned 565089036
compact_isolated 10590951
compact_stall 3114
compact_fail 2675
compact_success 439
unevictable_pgs_culled 658
unevictable_pgs_rescued 5309
unevictable_pgs_mlocked 5309
unevictable_pgs_munlocked 5309
thp_fault_alloc 6071
thp_fault_fallback 34735
thp_collapse_alloc 1817
thp_collapse_alloc_failed 2822
thp_split 292
thp_zero_page_alloc 2
thp_zero_page_alloc_failed 243

thanks,
--
js
suse labs

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751346Ab3CAOCa (ORCPT ); Fri, 1 Mar 2013 09:02:30
-0500 Received: from mail-oa0-f51.google.com ([209.85.219.51]:53608 "EHLO mail-oa0-f51.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751034Ab3CAOC3 (ORCPT ); Fri, 1 Mar 2013 09:02:29 -0500 MIME-Version: 1.0 In-Reply-To: <512F8D8B.3070307@suse.cz> References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> <512F8D8B.3070307@suse.cz> Date: Fri, 1 Mar 2013 22:02:28 +0800 Message-ID: Subject: Re: kswapd craziness round 2 From: Hillf Danton To: Jiri Slaby Cc: Daniel J Blueman , Linux Kernel , Steffen Persvold Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby wrote: > > Ok, no difference, kswap is still crazy. I'm attaching the output of > "grep -vw '0' /proc/vmstat" if you see something there. > Thanks to you for test and data. Lets try to restore the deleted nap, then. Hillf --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 +++ b/mm/vmscan.c Fri Mar 1 21:55:40 2013 @@ -2817,6 +2817,10 @@ loop_again: */ if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX) break; + + if (sc.priority < DEF_PRIORITY - 2) + congestion_wait(BLK_RW_ASYNC, HZ/10); + } while (--sc.priority >= 0); out: -- From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933320Ab3CGThW (ORCPT ); Thu, 7 Mar 2013 14:37:22 -0500 Received: from mail-ea0-f170.google.com ([209.85.215.170]:42702 "EHLO mail-ea0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932724Ab3CGThU (ORCPT ); Thu, 7 Mar 2013 14:37:20 -0500 Message-ID: <5138EC6C.6030906@suse.cz> Date: Thu, 07 Mar 2013 20:37:16 +0100 From: Jiri Slaby User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:19.0) Gecko/20130124 Thunderbird/19.0 MIME-Version: 1.0 To: Hillf Danton CC: Daniel J Blueman , Linux Kernel , Steffen Persvold Subject: Re: kswapd craziness round 2 References: <5121C7AF.2090803@numascale-asia.com> 
<51254AD2.7000906@suse.cz> <512F8D8B.3070307@suse.cz> In-Reply-To: X-Enigmail-Version: 1.6a1pre Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 03/01/2013 03:02 PM, Hillf Danton wrote: > On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby wrote: >> >> Ok, no difference, kswap is still crazy. I'm attaching the output of >> "grep -vw '0' /proc/vmstat" if you see something there. >> > Thanks to you for test and data. > > Lets try to restore the deleted nap, then. Oh, it seems to be nice now: root 579 0.0 0.0 0 0 ? S Mar04 0:13 [kswapd0] Thanks. > Hillf > --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 > +++ b/mm/vmscan.c Fri Mar 1 21:55:40 2013 > @@ -2817,6 +2817,10 @@ loop_again: > */ > if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX) > break; > + > + if (sc.priority < DEF_PRIORITY - 2) > + congestion_wait(BLK_RW_ASYNC, HZ/10); > + > } while (--sc.priority >= 0); > > out: > -- > -- js suse labs From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753656Ab3CHGmd (ORCPT ); Fri, 8 Mar 2013 01:42:33 -0500 Received: from mail-oa0-f43.google.com ([209.85.219.43]:61005 "EHLO mail-oa0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751546Ab3CHGmb (ORCPT ); Fri, 8 Mar 2013 01:42:31 -0500 MIME-Version: 1.0 In-Reply-To: <5138EC6C.6030906@suse.cz> References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> <512F8D8B.3070307@suse.cz> <5138EC6C.6030906@suse.cz> Date: Fri, 8 Mar 2013 14:42:31 +0800 Message-ID: Subject: Re: kswapd craziness round 2 From: Hillf Danton To: Jiri Slaby Cc: Daniel J Blueman , Linux Kernel , Steffen Persvold , mm , Mel Gorman , Andrew Morton Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Mar 8, 2013 at 3:37 AM, Jiri Slaby wrote: > On 
03/01/2013 03:02 PM, Hillf Danton wrote: >> On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby wrote: >>> >>> Ok, no difference, kswap is still crazy. I'm attaching the output of >>> "grep -vw '0' /proc/vmstat" if you see something there. >>> >> Thanks to you for test and data. >> >> Lets try to restore the deleted nap, then. > > Oh, it seems to be nice now: > root 579 0.0 0.0 0 0 ? S Mar04 0:13 [kswapd0] > Double thanks. But Mel does not like it, probably. Lets try nap in another way. Hillf --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 +++ b/mm/vmscan.c Fri Mar 8 14:36:10 2013 @@ -2793,6 +2793,10 @@ loop_again: * speculatively avoid congestion waits */ zone_clear_flag(zone, ZONE_CONGESTED); + + else if (sc.priority > 2 && + sc.priority < DEF_PRIORITY - 2) + wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10); } /* -- >> >> --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 >> +++ b/mm/vmscan.c Fri Mar 1 21:55:40 2013 >> @@ -2817,6 +2817,10 @@ loop_again: >> */ >> if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX) >> break; >> + >> + if (sc.priority < DEF_PRIORITY - 2) >> + congestion_wait(BLK_RW_ASYNC, HZ/10); >> + >> } while (--sc.priority >= 0); >> >> out: >> -- >> > > > -- > js > suse labs From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754721Ab3CHHk0 (ORCPT ); Fri, 8 Mar 2013 02:40:26 -0500 Received: from bitsync.net ([80.83.126.10]:34363 "EHLO bitsync.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751180Ab3CHHkZ (ORCPT ); Fri, 8 Mar 2013 02:40:25 -0500 X-Greylist: delayed 637 seconds by postgrey-1.27 at vger.kernel.org; Fri, 08 Mar 2013 02:40:25 EST Message-ID: <51399368.3040200@bitsync.net> Date: Fri, 08 Mar 2013 08:29:44 +0100 From: Zlatko Calusic MIME-Version: 1.0 To: Hillf Danton CC: Jiri Slaby , Daniel J Blueman , Linux Kernel , Steffen Persvold , mm , Mel Gorman , Andrew Morton Subject: Re: kswapd craziness round 2 References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> 
<512F8D8B.3070307@suse.cz> <5138EC6C.6030906@suse.cz> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 08.03.2013 07:42, Hillf Danton wrote: > On Fri, Mar 8, 2013 at 3:37 AM, Jiri Slaby wrote: >> On 03/01/2013 03:02 PM, Hillf Danton wrote: >>> On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby wrote: >>>> >>>> Ok, no difference, kswap is still crazy. I'm attaching the output of >>>> "grep -vw '0' /proc/vmstat" if you see something there. >>>> >>> Thanks to you for test and data. >>> >>> Lets try to restore the deleted nap, then. >> >> Oh, it seems to be nice now: >> root 579 0.0 0.0 0 0 ? S Mar04 0:13 [kswapd0] >> > Double thanks. > > But Mel does not like it, probably. > Lets try nap in another way. > > Hillf > > --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 > +++ b/mm/vmscan.c Fri Mar 8 14:36:10 2013 > @@ -2793,6 +2793,10 @@ loop_again: > * speculatively avoid congestion waits > */ > zone_clear_flag(zone, ZONE_CONGESTED); > + > + else if (sc.priority > 2 && > + sc.priority < DEF_PRIORITY - 2) > + wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10); > } > > /* > -- > There's another bug in there, which I'm still chasing. Artificial sleeps like this just mask the real bug and introduce new problems (on my 4GB server kswapd spends all the time in those congestion wait calls). The problem is that the bug needs about 5 days of uptime to reveal its ugly head. So far I can only tell that it was introduced somewhere between 3.1 & 3.4.
Also, check shrink_inactive_list(); it already sleeps if really needed: if (nr_writeback && nr_writeback >= (nr_taken >> (DEF_PRIORITY - sc->priority))) wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10); Regards, -- Zlatko From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933917Ab3CHI1y (ORCPT ); Fri, 8 Mar 2013 03:27:54 -0500 Received: from mail-ob0-f170.google.com ([209.85.214.170]:60461 "EHLO mail-ob0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752283Ab3CHI1x (ORCPT ); Fri, 8 Mar 2013 03:27:53 -0500 MIME-Version: 1.0 In-Reply-To: <51399368.3040200@bitsync.net> References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> <512F8D8B.3070307@suse.cz> <5138EC6C.6030906@suse.cz> <51399368.3040200@bitsync.net> Date: Fri, 8 Mar 2013 16:27:53 +0800 Message-ID: Subject: Re: kswapd craziness round 2 From: Hillf Danton To: Zlatko Calusic Cc: Jiri Slaby , Daniel J Blueman , Linux Kernel , Steffen Persvold , mm , Mel Gorman , Andrew Morton Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Mar 8, 2013 at 3:29 PM, Zlatko Calusic wrote: > There's another bug in there, which I'm still chasing. > I am busy looking for an employer (really hard work?), so I don't know how many hours I have for that bug. Hmm, take a look at Mel's thoughts? http://marc.info/?l=linux-mm&m=136189593423501&w=2 BTW, he will be online next week.
Hillf From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759852Ab3CHXVN (ORCPT ); Fri, 8 Mar 2013 18:21:13 -0500 Received: from mail-ee0-f52.google.com ([74.125.83.52]:59098 "EHLO mail-ee0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757654Ab3CHXVL (ORCPT ); Fri, 8 Mar 2013 18:21:11 -0500 Message-ID: <513A7263.5090303@suse.cz> Date: Sat, 09 Mar 2013 00:21:07 +0100 From: Jiri Slaby User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:19.0) Gecko/20130124 Thunderbird/19.0 MIME-Version: 1.0 To: Hillf Danton CC: Daniel J Blueman , Linux Kernel , Steffen Persvold , mm , Mel Gorman , Andrew Morton Subject: Re: kswapd craziness round 2 References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> <512F8D8B.3070307@suse.cz> <5138EC6C.6030906@suse.cz> In-Reply-To: X-Enigmail-Version: 1.6a1pre Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 03/08/2013 07:42 AM, Hillf Danton wrote: > On Fri, Mar 8, 2013 at 3:37 AM, Jiri Slaby wrote: >> On 03/01/2013 03:02 PM, Hillf Danton wrote: >>> On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby wrote: >>>> >>>> Ok, no difference, kswap is still crazy. I'm attaching the output of >>>> "grep -vw '0' /proc/vmstat" if you see something there. >>>> >>> Thanks to you for test and data. >>> >>> Lets try to restore the deleted nap, then. >> >> Oh, it seems to be nice now: >> root 579 0.0 0.0 0 0 ? S Mar04 0:13 [kswapd0] >> > Double thanks. There is one downside. I'm not sure whether that patch was the culprit. My Thunderbird is jerky when scrolling and lags while writing this message. The letters sometimes appear later than typed and in groups. Like I (kbd): My Thunder TB: My Thunder I (kbd): b-i-r-d TB: is silent I (kbd): still typing... TB: bird is Perhaps it's not only TB. > But Mel does not like it, probably. 
> Lets try nap in another way. Will try next week. > --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 > +++ b/mm/vmscan.c Fri Mar 8 14:36:10 2013 > @@ -2793,6 +2793,10 @@ loop_again: > * speculatively avoid congestion waits > */ > zone_clear_flag(zone, ZONE_CONGESTED); > + > + else if (sc.priority > 2 && > + sc.priority < DEF_PRIORITY - 2) > + wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10); > } > > /* -- js suse labs From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933119Ab3CSRBV (ORCPT ); Tue, 19 Mar 2013 13:01:21 -0400 Received: from mx1.redhat.com ([209.132.183.28]:64429 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932483Ab3CSRBU (ORCPT ); Tue, 19 Mar 2013 13:01:20 -0400 Message-ID: <51489979.2070403@draigBrady.com> Date: Tue, 19 Mar 2013 16:59:37 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2 MIME-Version: 1.0 To: Jiri Slaby CC: Hillf Danton , Daniel J Blueman , Linux Kernel , Steffen Persvold , mm , Mel Gorman , Andrew Morton Subject: Re: kswapd craziness round 2 References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> <512F8D8B.3070307@suse.cz> <5138EC6C.6030906@suse.cz> <513A7263.5090303@suse.cz> In-Reply-To: <513A7263.5090303@suse.cz> X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 03/08/2013 11:21 PM, Jiri Slaby wrote: > On 03/08/2013 07:42 AM, Hillf Danton wrote: >> On Fri, Mar 8, 2013 at 3:37 AM, Jiri Slaby wrote: >>> On 03/01/2013 03:02 PM, Hillf Danton wrote: >>>> On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby wrote: >>>>> >>>>> Ok, no difference, kswap is still crazy. I'm attaching the output of >>>>> "grep -vw '0' /proc/vmstat" if you see something there. >>>>> >>>> Thanks to you for test and data. 
>>>> >>>> Lets try to restore the deleted nap, then. >>> >>> Oh, it seems to be nice now: >>> root 579 0.0 0.0 0 0 ? S Mar04 0:13 [kswapd0] >>> >> Double thanks. > > There is one downside. I'm not sure whether that patch was the culprit. > My Thunderbird is jerky when scrolling and lags while writing this > message. The letters sometimes appear later than typed and in groups. Like > I (kbd): My Thunder > TB: My Thunder > I (kbd): b-i-r-d > TB: is silent > I (kbd): still typing... > TB: bird is > > Perhaps it's not only TB. I notice the same thunderbird issue on the much older 2.6.40.4-5.fc15.x86_64 which I'd hoped would be fixed on upgrade :( My Thunderbird is using 1957m virt, 722m RSS on my 3G system. What are your corresponding mem values? For reference: http://marc.info/?t=130865025500001&r=1&w=2 https://bugzilla.redhat.com/show_bug.cgi?id=712019 thanks, Pádraig. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751476Ab3CTEMe (ORCPT ); Wed, 20 Mar 2013 00:12:34 -0400 Received: from mail-oa0-f47.google.com ([209.85.219.47]:42941 "EHLO mail-oa0-f47.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751269Ab3CTEMd convert rfc822-to-8bit (ORCPT ); Wed, 20 Mar 2013 00:12:33 -0400 MIME-Version: 1.0 In-Reply-To: <51489979.2070403@draigBrady.com> References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> <512F8D8B.3070307@suse.cz> <5138EC6C.6030906@suse.cz> <513A7263.5090303@suse.cz> <51489979.2070403@draigBrady.com> Date: Wed, 20 Mar 2013 12:12:32 +0800 Message-ID: Subject: Re: kswapd craziness round 2 From: Hillf Danton To: =?UTF-8?Q?P=C3=A1draig_Brady?= Cc: Jiri Slaby , Daniel J Blueman , Linux Kernel , Steffen Persvold , mm , Mel Gorman Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8BIT Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Mar 20, 2013 at 12:59 AM, Pádraig
Brady wrote: > > I notice the same thunderbird issue on the much older 2.6.40.4-5.fc15.x86_64 > which I'd hoped would be fixed on upgrade :( > > My Thunderbird is using 1957m virt, 722m RSS on my 3G system. > What are your corresponding mem values? > > For reference: > http://marc.info/?t=130865025500001&r=1&w=2 > https://bugzilla.redhat.com/show_bug.cgi?id=712019 > Hey, would you all please try Mel's new work? http://marc.info/?l=linux-mm&m=136352546814642&w=4 thanks Hillf From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752503Ab3CTIis (ORCPT ); Wed, 20 Mar 2013 04:38:48 -0400 Received: from mail-ee0-f49.google.com ([74.125.83.49]:47218 "EHLO mail-ee0-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750991Ab3CTIip (ORCPT ); Wed, 20 Mar 2013 04:38:45 -0400 Message-ID: <514975B9.7050304@suse.cz> Date: Wed, 20 Mar 2013 09:39:21 +0100 From: Jiri Slaby User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:19.0) Gecko/20130124 Thunderbird/19.0 MIME-Version: 1.0 To: Hillf Danton , =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= CC: Daniel J Blueman , Linux Kernel , Steffen Persvold , mm , Mel Gorman Subject: Re: kswapd craziness round 2 References: <5121C7AF.2090803@numascale-asia.com> <51254AD2.7000906@suse.cz> <512F8D8B.3070307@suse.cz> <5138EC6C.6030906@suse.cz> <513A7263.5090303@suse.cz> <51489979.2070403@draigBrady.com> In-Reply-To: X-Enigmail-Version: 1.6a1pre Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 03/20/2013 05:12 AM, Hillf Danton wrote: > Hey, would you all please try Mel's new work? > http://marc.info/?l=linux-mm&m=136352546814642&w=4 Yeah, I was in CC and also asked Mel if I should apply those. I will as soon as I'm back home (next week). thanks, -- js suse labs