Message-ID: <4ad99e05050712024319bc7ada@mail.gmail.com>
Date: Tue, 12 Jul 2005 11:43:23 +0200
From: Lars Roland
Reply-To: Lars Roland
To: Rob Mueller
Subject: Re: 2.6.12.2 dies after 24 hours
Cc: linux-kernel@vger.kernel.org, Bron Gondwana, Jeremy Howard
In-Reply-To: <01dd01c586c3$cdd525d0$7c00a8c0@ROBMHP>
References: <01dd01c586c3$cdd525d0$7c00a8c0@ROBMHP>
X-Mailing-List: linux-kernel@vger.kernel.org

On 7/12/05, Rob Mueller wrote:
> As background, we've been using a relatively old kernel (2.6.4-mm2) on some
> IBM x235 machines with 6G of RAM, umem cards, and serveraid storage. These
> machines are under continuous heavy-ish load, load avg between about 1 and
> 5, with between 2500-3500 procs at all times, with several largish ReiserFS
> partitions and have been running *really* well with >250 days uptime on one
> machine.
>
> We recently tried upgrading one of the machines to the latest kernel
> (2.6.12.2) and it's died after about 24 hours. It seemed to end up in some
> weird state where we could ssh into it, and some commands worked (eg uptime)
> but process list related commands (ps) would just freeze up into an
> unkillable state and we'd have to close the session and ssh in again.

I experienced the exact same thing on an IBM 335; in my case I had messed
up the ACPI setup. Could you paste the output of /proc/interrupts? Also,
is your kernel running with IRQ balancing?

Regards.

Lars Roland