From: "Gary Funck"
To: "Linux-Kernel@Vger. Kernel. Org"
Subject: 2.6.17-1.2145_FC5 mmap-related soft lockup
Date: Sat, 15 Jul 2006 10:07:26 -0700
X-Mailing-List: linux-kernel@vger.kernel.org

A test program triggers a soft lockup when exiting. It allocates about
256M of MAP_ANONYMOUS mmap memory and then spawns 4 processes; each
process i writes to 1/4 of the mapped memory and then reads the memory
written by process (i + 1)%4.

Hardware: dual dual-core Opteron 275 (Tyan motherboard, 4G physical
memory), which has been rock solid reliable.

BUG: soft lockup detected on CPU#3!
Call Trace:
 {softlockup_tick+219}
 {update_process_times+66}
 {smp_local_timer_interrupt+35}
 {smp_apic_timer_interrupt+65}
 {apic_timer_interrupt+135}
 {__set_page_dirty_nobuffers+0}
 {_write_unlock_irq+11}
 {__set_page_dirty_nobuffers+181}
 {unmap_vmas+1037}
 {exit_mmap+120}
 {mmput+44}
 {do_exit+599}
 {debug_mutex_init+0}
 {tracesys+209}

BUG: soft lockup detected on CPU#0!
Call Trace:
 {softlockup_tick+219}
 {update_process_times+66}
 {smp_local_timer_interrupt+35}
 {smp_apic_timer_interrupt+65}
 {apic_timer_interrupt+135}
 {__set_page_dirty_nobuffers+0}
 {_write_unlock_irq+11}
 {__set_page_dirty_nobuffers+181}
 {unmap_vmas+1037}
 {exit_mmap+120}
 {mmput+44}
 {do_exit+599}
 {debug_mutex_init+0}
 {tracesys+209}

BUG: soft lockup detected on CPU#2!
Call Trace:
 {softlockup_tick+219}
 {update_process_times+66}
 {smp_local_timer_interrupt+35}
 {smp_apic_timer_interrupt+65}
 {apic_timer_interrupt+135}
 {__set_page_dirty_nobuffers+0}
 {_write_unlock_irq+11}
 {__set_page_dirty_nobuffers+181}
 {unmap_vmas+1037}
 {exit_mmap+120}
 {mmput+44}
 {do_exit+599}
 {debug_mutex_init+0}
 {tracesys+209}

BUG: soft lockup detected on CPU#1!
Call Trace:
 {softlockup_tick+219}
 {update_process_times+66}
 {smp_local_timer_interrupt+35}
 {smp_apic_timer_interrupt+65}
 {apic_timer_interrupt+135}
 {__set_page_dirty_nobuffers+0}
 {_write_unlock_irq+11}
 {__set_page_dirty_nobuffers+181}
 {unmap_vmas+1037}
 {exit_mmap+120}
 {mmput+44}
 {do_exit+599}
 {debug_mutex_init+0}
 {tracesys+209}

The test program runs successfully, but hangs for several seconds upon
exit. The hardware and software configuration has been solid for several
months, but we have seen timer-related synchronization issues with recent
kernels (for example, ntp has to force a re-sync, and there is an
occasional lost ticks message).

The test program mentioned above is more complicated than described, and
can't easily be reproduced in source form, but the binary could be made
available.
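For readers who want something to experiment with, the access pattern
described at the top of this report can be roughly sketched as below.
This is not the original test program, which is more complicated. The
function name ring_test, the use of MAP_SHARED (the report says only
MAP_ANONYMOUS, but the processes have to see each other's writes), the
shared counter used as a barrier between the write and read phases, and
the byte pattern are all assumptions made for illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: map `total` bytes, fork `nproc` children, have child i
 * fill slice i and then verify slice (i + 1) % nproc.
 * Returns 0 if every child saw the expected data, -1 otherwise. */
int ring_test(size_t total, int nproc)
{
    if (nproc <= 0 || total == 0 || total % (size_t)nproc != 0)
        return -1;
    size_t chunk = total / (size_t)nproc;
    assert(chunk * (size_t)nproc == total);

    /* Assumption: MAP_SHARED, so writes are visible across fork(). */
    unsigned char *mem = mmap(NULL, total, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    /* Assumption: a shared counter acts as a barrier between phases. */
    atomic_int *done = mmap(NULL, sizeof *done, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (done == MAP_FAILED) {
        munmap(mem, total);
        return -1;
    }
    atomic_init(done, 0);

    int forked = 0;
    for (int i = 0; i < nproc; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            /* Release children that are already waiting at the barrier. */
            atomic_store(done, nproc);
            break;
        }
        if (pid == 0) {
            /* Write phase: fill this child's slice with a per-child byte. */
            memset(mem + (size_t)i * chunk, i + 1, chunk);
            atomic_fetch_add(done, 1);
            while (atomic_load(done) < nproc)
                sched_yield();

            /* Read phase: check the slice written by the next child. */
            int next = (i + 1) % nproc;
            const unsigned char *peer = mem + (size_t)next * chunk;
            unsigned char want = (unsigned char)(next + 1);
            int bad = 0;
            for (size_t k = 0; k < chunk; k++) {
                if (peer[k] != want) {
                    bad = 1;
                    break;
                }
            }
            _exit(bad);   /* exit tears down the mapping in each child */
        }
        forked++;
    }

    int failed = (forked != nproc);
    for (int i = 0; i < forked; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }
    munmap(done, sizeof *done);
    munmap(mem, total);
    return failed ? -1 : 0;
}
```

Calling ring_test((size_t)256 << 20, 4) from main matches the sizes given
in the report. Whether this reduced version triggers the same lockup on a
given kernel is untested and should not be assumed.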