From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755059Ab1G1Sfc (ORCPT );
        Thu, 28 Jul 2011 14:35:32 -0400
Received: from mx1.redhat.com ([209.132.183.28]:6783 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1752813Ab1G1Sf3 (ORCPT );
        Thu, 28 Jul 2011 14:35:29 -0400
Date: Thu, 28 Jul 2011 14:35:21 -0400
From: Don Zickus
To: Ben Greear
Cc: Linux Kernel Mailing List, tglx@linutronix.de
Subject: Re: WARNING at watchdog_overflow_callback in 3.0-rc7+
Message-ID: <20110728183521.GA26882@redhat.com>
References: <4E1DD3AF.80201@candelatech.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4E1DD3AF.80201@candelatech.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 13, 2011 at 10:19:43AM -0700, Ben Greear wrote:
> This is on the same nfs testing machine I've been posting about.  This
> has some additional nfs patches included, running tests to mount, do io, unmount
> over and over again.  Seems that the NFS bugs might be finally fixed, but
> system is still un-stable in general:

This looks like it is stuck spinning on a lock while trying to cancel an
hrtimer.  Not sure under what conditions the hrtimer can't get this lock.

I cc'd Thomas, perhaps he might know.

Cheers,
Don

>
> WARNING: at /home/greearb/git/linux-3.0-nfs/kernel/watchdog.c:240 watchdog_overflow_callback+0x97/0xa2()
> Hardware name: X7DBU
> Watchdog detected hard LOCKUP on cpu 4
> Modules linked in: 8021q garp xt_addrtype xt_TPROXY nf_tproxy_core
> xt_socket nf_defrag_ipv6 xt_set ip_set nfnetlink xt_connlimit
> macvlan ip6table_filter ip6_tables ebtable_nat ebtables pktgen fuse
> iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi stp llc nfs
> lockd fscache auth_rpcgss nfs_acl sunrpc ipv6 kvm_intel kvm uinput
> i5k_amb i5000_edac ioatdma pcspkr iTCO_wdt e1000e dca
> iTCO_vendor_support edac_core microcode shpchp i2c_i801 floppy
> radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core [last
> unloaded: xt_connmark]
> Pid: 18179, comm: btserver Not tainted 3.0.0-rc7+ #23
> Call Trace:
> [] warn_slowpath_common+0x80/0x98
> [] warn_slowpath_fmt+0x41/0x43
> [] watchdog_overflow_callback+0x97/0xa2
> [] __perf_event_overflow+0x11f/0x1d1
> [] ? rcu_read_unlock+0x21/0x23
> [] ? perf_event_update_userpage+0xfe/0x103
> [] perf_event_overflow+0x14/0x16
> [] intel_pmu_handle_irq+0x46d/0x4e0
> [] perf_event_nmi_handler+0x39/0x81
> [] notifier_call_chain+0x54/0x81
> [] __atomic_notifier_call_chain+0x5e/0x90
> [] ? notifier_call_chain+0x81/0x81
> [] atomic_notifier_call_chain+0xf/0x11
> [] notify_die+0x2e/0x30
> [] do_nmi+0x80/0x242
> [] nmi+0x20/0x39
> [] ? do_raw_spin_lock+0x11d/0x13c
> <> [] _raw_spin_lock_irqsave+0x56/0x60
> [] ? lock_hrtimer_base+0x25/0x4b
> [] lock_hrtimer_base+0x25/0x4b
> [] hrtimer_try_to_cancel+0x15/0x46
> [] hrtimer_cancel+0x14/0x20
> [] rtc_irq_set_state+0x8b/0xaf
> [] rtc_dev_release+0x35/0x58
> [] fput+0x117/0x1b2
> [] filp_close+0x6d/0x78
> [] put_files_struct+0xca/0x190
> [] exit_files+0x46/0x4e
> [] do_exit+0x2b5/0x760
> [] ? fput+0x1a3/0x1b2
> [] ? lockdep_sys_exit_thunk+0x35/0x67
> [] do_group_exit+0x7e/0xa9
> [] sys_exit_group+0x12/0x16
> [] system_call_fastpath+0x16/0x1b
> ---[ end trace e90038ab73718706 ]---
>
> --
> Ben Greear
> Candela Technologies Inc  http://www.candelatech.com
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/