From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S264638AbUEaOi0 (ORCPT ); Mon, 31 May 2004 10:38:26 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S264639AbUEaOi0 (ORCPT ); Mon, 31 May 2004 10:38:26 -0400
Received: from parcelfarce.linux.theplanet.co.uk ([195.92.249.252]:26034
	"EHLO www.linux.org.uk") by vger.kernel.org with ESMTP
	id S264638AbUEaOiN (ORCPT ); Mon, 31 May 2004 10:38:13 -0400
Date: Mon, 31 May 2004 11:39:15 -0300
From: Marcelo Tosatti
To: axboe@suse.de
Cc: linux-kernel@vger.kernel.org
Subject: loop/highmem related 2.4.26 lockup
Message-ID: <20040531143915.GA20653@logos.cnet>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.5.1i
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Jens,

We are seeing a deadlock on our SMP buildserver, running 2.4.26.

It seems to be related to loop/HIGHMEM (the machine makes intensive use
of loop).

It usually takes 9 days of relatively high load (loadavg = 2.5) to
trigger.

Any clues?

MemTotal:      2069596 kB
MemFree:         25956 kB
MemShared:           0 kB
Buffers:         52332 kB
Cached:        1126108 kB
SwapCached:        516 kB
Active:         152892 kB
Inactive:      1058772 kB
HighTotal:     1179584 kB
HighFree:        19768 kB

I can post full ksymoops output if required (all tasks).
I pasted the ones I think are relevant:

Proc;  kswapd

>>EIP; f7bf1ed0   <=====
Trace; c01463ae <__wait_on_buffer+6e/a0>
Trace; c0149e48
Trace; c0149fe0
Trace; c013b45a
Trace; c013b6da
Trace; c013b752
Trace; c013b91c
Trace; c013b988
Trace; c013bacc
Trace; c0105000 <_stext+0/0>
Trace; c010740e
Trace; c013ba30

Proc;  bdflush

>>EIP; f7bca000   <=====
Trace; c01280aa
Trace; c01463ae <__wait_on_buffer+6e/a0>
Trace; c0149e48
Trace; c0149fe0
Trace; c013b45a
Trace; c013b6da
Trace; c013b752
Trace; c013c74e
Trace; c013ca78 <__alloc_pages+188/290>
Trace; c0142882
Trace; c0142a0c
Trace; c01d792c
Trace; c01d4a4c
Trace; c01d4b0a
Trace; c014643c
Trace; c014653e
Trace; c014a458
Trace; c0105000 <_stext+0/0>
Trace; c010740e
Trace; c014a390

Proc;  loop0

>>EIP; 00000000 Before first symbol
Trace; c0107c68 <__down_interruptible+88/f0>
Trace; c0107d36 <__down_failed_interruptible+6/c>
Trace; c01d8ae6 <.text.lock.loop+bc/136>
Trace; c0109172
Trace; c01d7960
Trace; c010740e
Trace; c01d7960

Proc;  loop1

>>EIP; 00000000 Before first symbol
Trace; c0107c68 <__down_interruptible+88/f0>
Trace; c0107d36 <__down_failed_interruptible+6/c>
Trace; c01d8ae6 <.text.lock.loop+bc/136>
Trace; c0109172
Trace; c01d7960
Trace; c010740e
Trace; c01d7960

Proc;  loop2

>>EIP; e5d145c0   <=====
Trace; c0189ec8
Trace; c0189fa2
Trace; c0107b92 <__down+82/d0>
Trace; c0107d2c <__down_failed+8/c>
Trace; c01845fe <.text.lock.transaction+4/246>
Trace; c018204a
Trace; c0182114
Trace; c017bc3c
Trace; c01795ec
Trace; c013349e
Trace; c01d6f84
Trace; c01d7408
Trace; c01d7b84
Trace; c0109172
Trace; c01d7960
Trace; c010740e
Trace; c01d7960

There are several processes like this

Proc;  sshd

>>EIP; f7b59400   <=====
Trace; c0107b92 <__down+82/d0>
Trace; c0107d2c <__down_failed+8/c>
Trace; c01845fe <.text.lock.transaction+4/246>
Trace; c018204a
Trace; c0182114
Trace; c017bfa0
Trace; c015d434 <__mark_inode_dirty+b4/c0>
Trace; c015ef0a
Trace; c013413c
Trace; c0133f80
Trace; c010ff3e
Trace; c0144d62
Trace; c014def8
Trace; c01091be