From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: linux-nfs-owner@vger.kernel.org
Received: from mailout-de.gmx.net ([213.165.64.22]:49555 "HELO mailout-de.gmx.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1752467Ab2GCMfb (ORCPT ); Tue, 3 Jul 2012 08:35:31 -0400
Message-ID: <4FF2E710.5010804@gmx.com>
Date: Tue, 03 Jul 2012 14:35:28 +0200
From: Andreas Heinlein
MIME-Version: 1.0
To: Jeff Layton
CC: linux-nfs@vger.kernel.org
Subject: Re: Kernel NFSd CPU hog?
References: <4FF16D54.6090200@gmx.com> <20120702150758.66893d08@tlielax.poochiereds.net>
In-Reply-To: <20120702150758.66893d08@tlielax.poochiereds.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

On 02.07.2012 21:07, Jeff Layton wrote:
> On Mon, 02 Jul 2012 11:43:48 +0200
> Andreas Heinlein wrote:
>
>> Hello,
>>
>> we have a strange NFS problem with a newly setup Linux server, and I
>> hope someone here can help.
>>
>> The symptom is that, slowly over time (speaking of several days up to 2
>> weeks), the kernel nfsd processes/threads consume more and more CPU
>> until the system finally becomes unresponsive. We recorded system
>> activity with sar, which shows that CPU (system) usage slowly rises
>> after reboot from about 1% to nearly 100% over the course of several
>> days. Load averages stay around 0.1-0.3 until 100% are reached, up to
>> this point the problem is almost not noticeable from the clients. Then
>> load averages climb up to 30.0; at this point the system becomes more or
>> less unusable and has to be restarted. 'top' output shows the CPU usage
>> evenly distributed across all nfsd threads.
>>
>> The system is a fairly recent, though entry level server with a Core i3
>> and 4G RAM, hosting the home directories for about 15-20 clients. CPU
>> activity does not drop at night, when no clients are connected.
>> It is running Debian 6.0 with linux 3.2.0 (from the backports repository),
>> with nfs-utils 1.2.5 (also from the backports repository). I suspect
>> that these backports might be the culprit, but since we need this kernel
>> for other purposes, and I cannot reboot that machine during office
>> hours, I'd rather not try going back to the official Debian kernel
>> without good reasons. If there are known problems, I'd give it a try.
>>
> Find the pid of one of the nfsd threads that's spinning, then get a
> stack trace from it:
>
> # cat /proc/<pid>/stack
>
> ...that should give us some idea of what it's doing.
>
> --
> Jeff Layton
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

Hello,

I've run into the problem again, and did a 'watch cat /proc/<pid>/stack'.
It actually seems to be doing something, because the stack trace changes
every now and then, but mostly looks like

[] try_to_wake_up+0x144/0x14d
[] lock_timer_base+0x19/0x34
[] __mod_timer+0x10c/0x116
[] process_timeout+0x0/0x5
[] svc_recv+0x2e2/0x698 [sunrpc]
[] default_wake_function+0x0/0x8
[] nfsd+0x90/0x108 [nfsd]
[] nfsd+0x0/0x108 [nfsd]
[] kthread+0x63/0x68
[] kthread+0x0/0x68
[] kernel_thread_helper+0x6/0x10
[] 0xffffffff

Meanwhile, I've found a quite recent thread on this list named "3.0+ NFS
issues", and within it two links to Ubuntu bug reports
(https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/879334 and
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1006446) and again
to a kernel bug (https://bugzilla.kernel.org/show_bug.cgi?id=40912), all
suggesting that this is indeed a kernel 3.0 problem. So I will try going
back to 2.6.32 and hope this issue gets fixed soon.

Thanks for your help!
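
Since the trace changes between reads, a single snapshot can be misleading. A rough sketch of tallying function names across repeated reads of the per-thread stack file might look like the following. The helper names (parse_frames, tally_frames, sample_stack), the sample count and the interval are assumptions chosen for illustration, and reading the stack file normally requires root.

```python
import re
import time
from collections import Counter

# Matches the symbol name in a line such as
# "[<address>] svc_recv+0x2e2/0x698 [sunrpc]" (the address may be missing).
FRAME_RE = re.compile(r"\]\s*([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse_frames(stack_text):
    """Return the function names found in one stack dump, in order."""
    return FRAME_RE.findall(stack_text)


def tally_frames(samples):
    """Count in how many samples each function name appears at least once."""
    counts = Counter()
    for text in samples:
        counts.update(set(parse_frames(text)))
    return counts


def sample_stack(pid, n=20, interval=0.5):
    """Read /proc/<pid>/stack n times, sleeping between reads (hypothetical helper)."""
    samples = []
    for _ in range(n):
        with open(f"/proc/{pid}/stack") as fh:
            samples.append(fh.read())
        time.sleep(interval)
    return samples
```

With the pid of a spinning nfsd thread, tally_frames(sample_stack(pid)).most_common(5) would list the functions seen in the most samples, which gives a slightly firmer picture than watching the output by eye.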