From: Ben Greear
Organization: Candela Technologies
To: Linux Kernel Mailing List
Subject: lockdep splat in 3.0-rc7 (circular locking, scheduler related?)
Date: Tue, 12 Jul 2011 13:51:23 -0700
Message-ID: <4E1CB3CB.5070108@candelatech.com>

I've been seeing this one for a while in 3.0, and it is still there in -rc7.
The kernel has lots of debugging enabled, plus some NFS patches applied.

=======================================================
[ INFO: possible circular locking dependency detected ]
3.0.0-rc7+ #23
-------------------------------------------------------
ip/30705 is trying to acquire lock:
 (rcu_node_level_0){..-...}, at: [] rcu_report_unblock_qs_rnp+0x52/0x72

but task is already holding lock:
 (&rq->lock){-.-.-.}, at: [] sched_ttwu_pending+0x34/0x58

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (&rq->lock){-.-.-.}:
       [] lock_acquire+0xf4/0x14b
       [] _raw_spin_lock+0x36/0x45
       [] __task_rq_lock+0x5b/0x89
       [] wake_up_new_task+0x41/0x116
       [] do_fork+0x207/0x2f1
       [] kernel_thread+0x70/0x72
       [] rest_init+0x21/0xd7
       [] start_kernel+0x3bd/0x3c8
       [] x86_64_start_reservations+0xb8/0xbc
       [] x86_64_start_kernel+0x101/0x110

-> #2 (&p->pi_lock){-.-.-.}:
       [] lock_acquire+0xf4/0x14b
       [] _raw_spin_lock_irqsave+0x4e/0x60
       [] try_to_wake_up+0x29/0x1a0
       [] default_wake_function+0xd/0xf
       [] autoremove_wake_function+0x13/0x38
       [] __wake_up_common+0x49/0x7f
       [] __wake_up+0x34/0x48
       [] rcu_report_exp_rnp+0x50/0x89
       [] __rcu_read_unlock+0x1e9/0x24e
       [] rcu_read_unlock+0x21/0x23
       [] __mem_cgroup_try_charge+0x195/0x4f5
       [] mem_cgroup_charge_common+0xa7/0xdb
       [] mem_cgroup_newpage_charge+0x34/0x43
       [] handle_pte_fault+0x1b6/0x84d
       [] handle_mm_fault+0x1ac/0x1c4
       [] do_page_fault+0x32d/0x374
       [] page_fault+0x25/0x30

-> #1 (sync_rcu_preempt_exp_wq.lock){......}:
       [] lock_acquire+0xf4/0x14b
       [] _raw_spin_lock_irqsave+0x4e/0x60
       [] __wake_up+0x1d/0x48
       [] rcu_report_exp_rnp+0x50/0x89
       [] sync_rcu_preempt_exp_init.clone.0+0x3e/0x53
       [] synchronize_rcu_expedited+0xdb/0x1c3
       [] synchronize_net+0x25/0x2e
       [] rollback_registered_many+0xee/0x1e1
       [] unregister_netdevice_many+0x14/0x55
       [] 0xffffffffa036a138
       [] ops_exit_list+0x25/0x4e
       [] unregister_pernet_operations+0x5c/0x8e
       [] unregister_pernet_subsys+0x22/0x32
       [] 0xffffffffa0372e2c
       [] sys_delete_module+0x1aa/0x20e
       [] system_call_fastpath+0x16/0x1b

-> #0 (rcu_node_level_0){..-...}:
       [] __lock_acquire+0xae6/0xdd5
       [] lock_acquire+0xf4/0x14b
       [] _raw_spin_lock+0x36/0x45
       [] rcu_report_unblock_qs_rnp+0x52/0x72
       [] __rcu_read_unlock+0x1a7/0x24e
       [] rcu_read_unlock+0x21/0x23
       [] cpuacct_charge+0x53/0x5b
       [] update_curr+0x11f/0x15a
       [] enqueue_task_fair+0x46/0x22a
       [] enqueue_task+0x61/0x68
       [] activate_task+0x28/0x30
       [] ttwu_activate+0x12/0x34
       [] ttwu_do_activate.clone.4+0x2d/0x3f
       [] sched_ttwu_pending+0x43/0x58
       [] scheduler_ipi+0x9/0xb
       [] smp_reschedule_interrupt+0x25/0x27
       [] reschedule_interrupt+0x13/0x20
       [] rcu_read_unlock+0x21/0x23
       [] find_get_pages_tag+0xc1/0xed
       [] pagevec_lookup_tag+0x20/0x29
       [] write_cache_pages_da+0xeb/0x37b
       [] ext4_da_writepages+0x2ee/0x565
       [] do_writepages+0x1f/0x28
       [] __filemap_fdatawrite_range+0x4e/0x50
       [] filemap_flush+0x17/0x19
       [] ext4_alloc_da_blocks+0x81/0xab
       [] ext4_release_file+0x35/0xb3
       [] fput+0x117/0x1b2
       [] filp_close+0x6d/0x78
       [] put_files_struct+0xca/0x190
       [] exit_files+0x46/0x4e
       [] do_exit+0x2b5/0x760
       [] do_group_exit+0x7e/0xa9
       [] sys_exit_group+0x12/0x16
       [] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

Chain exists of:
  rcu_node_level_0 --> &p->pi_lock --> &rq->lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rq->lock);
                               lock(&p->pi_lock);
                               lock(&rq->lock);
  lock(rcu_node_level_0);

 *** DEADLOCK ***

2 locks held by ip/30705:
 #0:  (jbd2_handle){+.+...}, at: [] start_this_handle+0x605/0x64d
 #1:  (&rq->lock){-.-.-.}, at: [] sched_ttwu_pending+0x34/0x58

stack backtrace:
Pid: 30705, comm: ip Not tainted 3.0.0-rc7+ #23
Call Trace:
 [] ? _raw_spin_unlock_irqrestore+0x6b/0x79
 [] print_circular_bug+0x1fe/0x20f
 [] __lock_acquire+0xae6/0xdd5
 [] ? rcu_report_unblock_qs_rnp+0x52/0x72
 [] lock_acquire+0xf4/0x14b
 [] ? rcu_report_unblock_qs_rnp+0x52/0x72
 [] _raw_spin_lock+0x36/0x45
 [] ? rcu_report_unblock_qs_rnp+0x52/0x72
 [] ? _raw_spin_unlock+0x45/0x52
 [] rcu_report_unblock_qs_rnp+0x52/0x72
 [] ? __rcu_read_unlock+0xdc/0x24e
 [] __rcu_read_unlock+0x1a7/0x24e
 [] rcu_read_unlock+0x21/0x23
 [] cpuacct_charge+0x53/0x5b
 [] update_curr+0x11f/0x15a
 [] enqueue_task_fair+0x46/0x22a
 [] enqueue_task+0x61/0x68
 [] activate_task+0x28/0x30
 [] ttwu_activate+0x12/0x34
 [] ttwu_do_activate.clone.4+0x2d/0x3f
 [] sched_ttwu_pending+0x43/0x58
 [] scheduler_ipi+0x9/0xb
 [] smp_reschedule_interrupt+0x25/0x27
 [] reschedule_interrupt+0x13/0x20
 [] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [] ? __rcu_read_unlock+0x53/0x24e
 [] ? radix_tree_deref_slot+0x57/0x57
 [] ? lock_release+0x6/0x1e9
 [] rcu_read_unlock+0x21/0x23
 [] find_get_pages_tag+0xc1/0xed
 [] ? start_this_handle+0x605/0x64d
 [] pagevec_lookup_tag+0x20/0x29
 [] write_cache_pages_da+0xeb/0x37b
 [] ? jbd2_journal_start+0xe/0x10
 [] ? ext4_journal_start_sb+0x115/0x124
 [] ext4_da_writepages+0x2ee/0x565
 [] do_writepages+0x1f/0x28
 [] __filemap_fdatawrite_range+0x4e/0x50
 [] filemap_flush+0x17/0x19
 [] ext4_alloc_da_blocks+0x81/0xab
 [] ext4_release_file+0x35/0xb3
 [] fput+0x117/0x1b2
 [] filp_close+0x6d/0x78
 [] put_files_struct+0xca/0x190
 [] exit_files+0x46/0x4e
 [] do_exit+0x2b5/0x760
 [] ? fsnotify_modify+0x5d/0x65
 [] ? retint_swapgs+0x13/0x1b
 [] do_group_exit+0x7e/0xa9
 [] sys_exit_group+0x12/0x16
 [] system_call_fastpath+0x16/0x1b

Thanks,
Ben

--
Ben Greear
Candela Technologies Inc  http://www.candelatech.com
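P.S.  The "Possible unsafe locking scenario" section above is just the usual
circular-acquisition pattern.  Here is a minimal user-space sketch of that
pattern for anyone who wants to see it outside the kernel; the pthread mutex
names are made-up stand-ins for rq->lock and rcu_node_level_0, and this is
only an illustration of the general lock-order inversion, not the actual
kernel code paths shown in the trace.

/*
 * Illustrative only: the generic lock-order inversion that lockdep's
 * "circular locking dependency" check catches, reduced to two
 * user-space pthread mutexes.  Mutex names are hypothetical stand-ins;
 * this is NOT the kernel code from the splat above.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t rq_lock       = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t rcu_node_lock = PTHREAD_MUTEX_INITIALIZER;

/* Plays the role of CPU0 in the scenario: rq_lock, then rcu_node_lock. */
static void *cpu0(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&rq_lock);
	sleep(1);                        /* widen the race window */
	pthread_mutex_lock(&rcu_node_lock);
	printf("cpu0: got both locks\n");
	pthread_mutex_unlock(&rcu_node_lock);
	pthread_mutex_unlock(&rq_lock);
	return NULL;
}

/* Plays the role of CPU1: takes the same locks in the opposite order. */
static void *cpu1(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&rcu_node_lock);
	sleep(1);
	pthread_mutex_lock(&rq_lock);
	printf("cpu1: got both locks\n");
	pthread_mutex_unlock(&rq_lock);
	pthread_mutex_unlock(&rcu_node_lock);
	return NULL;
}

int main(void)
{
	pthread_t t0, t1;

	pthread_create(&t0, NULL, cpu0, NULL);
	pthread_create(&t1, NULL, cpu1, NULL);
	pthread_join(t0, NULL);          /* never returns once both block */
	pthread_join(t1, NULL);
	return 0;
}

Build with 'gcc -pthread'.  With the sleeps in place, each thread ends up
blocked on the mutex the other one holds and the program hangs, which is the
same shape of deadlock lockdep is warning could happen with rq->lock,
&p->pi_lock and rcu_node_level_0.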