From mboxrd@z Thu Jan 1 00:00:00 1970 From: Bernard Pidoux Subject: Inconsistent lock state and possible irq lock inversion dependency detected in ax25.ko Date: Wed, 21 Nov 2007 23:13:31 +0100 Message-ID: <4744AD8B.2000908@ccr.jussieu.fr> References: <20071003185413.GA16242@martell.zuzino.mipt.ru> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="------------050601050908080509090703" Cc: davem@davemloft.net, netdev@vger.kernel.org To: Alexey Dobriyan , ralf@linux-mips.org Return-path: Received: from postfix2-g20.free.fr ([212.27.60.43]:58417 "EHLO postfix2-g20.free.fr" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751961AbXKUWKY (ORCPT ); Wed, 21 Nov 2007 17:10:24 -0500 Received: from smtp5-g19.free.fr (smtp5-g19.free.fr [212.27.42.35]) by postfix2-g20.free.fr (Postfix) with ESMTP id AE6C91F25F6B for ; Wed, 21 Nov 2007 21:08:04 +0100 (CET) In-Reply-To: <20071003185413.GA16242@martell.zuzino.mipt.ru> Sender: netdev-owner@vger.kernel.org List-Id: netdev.vger.kernel.org This is a multi-part message in MIME format. --------------050601050908080509090703 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Hi, I am practicing intensively AX25 packet radio that uses ax25.ko together with mkiss, crc16, netrom, and rose modules using two PIII CPU Linux machines with 2.6.23.8 kernel. On the first Linux machine I did not validate kernel hacking and AX25 applications are running 100% of the time without serious problems. On the second machine I validated kernel hacking and sooner or later I get exactly the same message after a connect timeout expires : [ INFO: inconsistent lock state ] The error seems to reside around ax25_disconnect+0x46/0xaf [ax25] that is called when an AX25 connect timeout or a connection failure occurs. Connect timeout is probably activating ax25_std_heartbeat_expiry+0x19/0xd3 [ax25] The message is only displayed once on a boot session. Ralf Baechle explained to me that ax25 code is very buggy and spinlocks difficult to trace. However, as the cause of error is clearlHowever, as the cause of error is clearly identified and the y identified and the reported address is constant I suspect that an experienced programmer (which I am not) could trace the problem. Moreover, I had the opportunity to catch a different message, that was longer than usual, and seem more explicit : [ INFO: possible irq lock inversion dependency detected ] Although the symptom is different it is related to the same origin : fpac/4933 just changed the state of lock: (slock-AF_AX25){--..}, at: [] ax25_disconnect+0x46/0xaf [ax25] Whatever the running application is the inconsistent lock state could be observed with ax25_call, flexd, fpac ax25 application programms. Please find attached a few reports captured from dmesg after each event. Could someone look at the listing and identify the origin of the problem ,if unique ? Thanks. Bernard Pidoux --------------050601050908080509090703 Content-Type: text/plain; name="inconsistent_locks_report" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="inconsistent_locks_report" ========================================================= [ INFO: possible irq lock inversion dependency detected ] 2.6.23.1 #1 --------------------------------------------------------- fpac/4933 just changed the state of lock: (slock-AF_AX25){--..}, at: [] ax25_disconnect+0x46/0xaf [ax25] but this lock was taken by another, soft-irq-safe lock in the past: (ax25_list_lock){-+..} and interrupts could create inverse lock ordering between them. other info that might help us debug this: no locks held by fpac/4933. the first lock's dependencies: -> (slock-AF_AX25){--..} ops: 410 { initial-use at: [] mark_lock+0x5b/0x44b [] __lock_acquire+0x4c2/0xc02 [] lock_acquire+0x6e/0x87 [] lock_sock_nested+0x26/0xcc [] _spin_lock_bh+0x2e/0x39 [] lock_sock_nested+0x26/0xcc [] lock_sock_nested+0x26/0xcc [] sock_fasync+0x61/0x116 [] sock_close+0x22/0x2f [] __fput+0xbc/0x172 [] filp_close+0x51/0x58 [] put_files_struct+0x5e/0xa6 [] do_exit+0x22e/0x6d9 [] sysenter_past_esp+0x8f/0x99 [] trace_hardirqs_on+0x11f/0x148 [] sys_exit_group+0x0/0xd [] sysenter_past_esp+0x5f/0x99 [] 0xffffffff softirq-on-W at: [] __lock_acquire+0xbba/0xc02 [] __lock_acquire+0x4ad/0xc02 [] local_bh_enable_ip+0xbd/0xc5 [] lock_acquire+0x6e/0x87 [] ax25_disconnect+0x46/0xaf [ax25] [] _spin_lock+0x29/0x34 [] ax25_disconnect+0x46/0xaf [ax25] [] ax25_disconnect+0x46/0xaf [ax25] [] ax25_release+0x9d/0x182 [ax25] [] sock_release+0x14/0x56 [] sock_close+0x2a/0x2f [] __fput+0xbc/0x172 [] filp_close+0x51/0x58 [] sys_close+0x66/0x9d [] sysenter_past_esp+0x5f/0x99 [] 0xffffffff hardirq-on-W at: [] mark_lock+0x5b/0x44b [] __lock_acquire+0x488/0xc02 [] lock_acquire+0x6e/0x87 [] lock_sock_nested+0x26/0xcc [] _spin_lock_bh+0x2e/0x39 [] lock_sock_nested+0x26/0xcc [] lock_sock_nested+0x26/0xcc [] sock_fasync+0x61/0x116 [] sock_close+0x22/0x2f [] __fput+0xbc/0x172 [] filp_close+0x51/0x58 [] put_files_struct+0x5e/0xa6 [] do_exit+0x22e/0x6d9 [] sysenter_past_esp+0x8f/0x99 [] trace_hardirqs_on+0x11f/0x148 [] sys_exit_group+0x0/0xd [] sysenter_past_esp+0x5f/0x99 [] 0xffffffff } ... key at: [] af_family_slock_keys+0x18/0x120 the second lock's dependencies: -> (ax25_list_lock){-+..} ops: 24485 { initial-use at: [] __lock_acquire+0x4c2/0xc02 [] local_bh_enable+0xfe/0x105 [] mark_held_locks+0x39/0x53 [] local_bh_enable_ip+0xbd/0xc5 [] lock_acquire+0x6e/0x87 [] ax25_cb_add+0xd/0x39 [ax25] [] _spin_lock_bh+0x2e/0x39 [] ax25_cb_add+0xd/0x39 [ax25] [] ax25_cb_add+0xd/0x39 [ax25] [] ax25_bind+0x129/0x14b [ax25] [] sys_bind+0x75/0xa0 [] _spin_lock+0x29/0x34 [] _spin_unlock+0x14/0x1c [] sock_map_fd+0x41/0x4a [] sys_socketcall+0x93/0x261 [] trace_hardirqs_on+0x11f/0x148 [] sysenter_past_esp+0x5f/0x99 [] 0xffffffff in-softirq-W at: [] mark_lock+0x1de/0x44b [] __lock_acquire+0x436/0xc02 [] __lock_acquire+0xbba/0xc02 [] lock_acquire+0x6e/0x87 [] ax25_find_cb+0x18/0xc6 [ax25] [] _spin_lock_bh+0x2e/0x39 [] ax25_find_cb+0x18/0xc6 [ax25] [] ax25_find_cb+0x18/0xc6 [ax25] [] ax25_kiss_rcv+0x37d/0x712 [ax25] [] mark_held_locks+0x39/0x53 [] sock_queue_rcv_skb+0xd6/0xf3 [] _read_unlock+0x14/0x1c [] sock_queue_rcv_skb+0xd6/0xf3 [] netif_receive_skb+0x22d/0x289 [] trace_hardirqs_on+0x109/0x148 [] process_backlog+0x7b/0xeb [] net_rx_action+0x57/0xfd [] __do_softirq+0x40/0x90 [] seq_read+0x1d1/0x277 [] do_softirq+0x27/0x3d [] local_bh_enable_ip+0x9d/0xc5 [] seq_read+0x1d1/0x277 [] seq_read+0x0/0x277 [] proc_reg_read+0x64/0x77 [] proc_reg_read+0x0/0x77 [] vfs_read+0xa6/0x12e [] sys_read+0x41/0x67 [] sysenter_past_esp+0x5f/0x99 [] 0xffffffff hardirq-on-W at: [] __lock_acquire+0x488/0xc02 [] local_bh_enable+0xfe/0x105 [] mark_held_locks+0x39/0x53 [] local_bh_enable_ip+0xbd/0xc5 [] lock_acquire+0x6e/0x87 [] ax25_cb_add+0xd/0x39 [ax25] [] _spin_lock_bh+0x2e/0x39 [] ax25_cb_add+0xd/0x39 [ax25] [] ax25_cb_add+0xd/0x39 [ax25] [] ax25_bind+0x129/0x14b [ax25] [] sys_bind+0x75/0xa0 [] _spin_lock+0x29/0x34 [] _spin_unlock+0x14/0x1c [] sock_map_fd+0x41/0x4a [] sys_socketcall+0x93/0x261 [] trace_hardirqs_on+0x11f/0x148 [] sysenter_past_esp+0x5f/0x99 [] 0xffffffff } ... key at: [] ax25_list_lock+0x10/0xffffbf0f [ax25] -> (slock-AF_AX25){--..} ops: 410 { initial-use at: [] mark_lock+0x5b/0x44b [] __lock_acquire+0x4c2/0xc02 [] lock_acquire+0x6e/0x87 [] lock_sock_nested+0x26/0xcc [] _spin_lock_bh+0x2e/0x39 [] lock_sock_nested+0x26/0xcc [] lock_sock_nested+0x26/0xcc [] sock_fasync+0x61/0x116 [] sock_close+0x22/0x2f [] __fput+0xbc/0x172 [] filp_close+0x51/0x58 [] put_files_struct+0x5e/0xa6 [] do_exit+0x22e/0x6d9 [] sysenter_past_esp+0x8f/0x99 [] trace_hardirqs_on+0x11f/0x148 [] sys_exit_group+0x0/0xd [] sysenter_past_esp+0x5f/0x99 [] 0xffffffff softirq-on-W at: [] __lock_acquire+0xbba/0xc02 [] __lock_acquire+0x4ad/0xc02 [] local_bh_enable_ip+0xbd/0xc5 [] lock_acquire+0x6e/0x87 [] ax25_disconnect+0x46/0xaf [ax25] [] _spin_lock+0x29/0x34 [] ax25_disconnect+0x46/0xaf [ax25] [] ax25_disconnect+0x46/0xaf [ax25] [] ax25_release+0x9d/0x182 [ax25] [] sock_release+0x14/0x56 [] sock_close+0x2a/0x2f [] __fput+0xbc/0x172 [] filp_close+0x51/0x58 [] sys_close+0x66/0x9d [] sysenter_past_esp+0x5f/0x99 [] 0xffffffff hardirq-on-W at: [] mark_lock+0x5b/0x44b [] __lock_acquire+0x488/0xc02 [] lock_acquire+0x6e/0x87 [] lock_sock_nested+0x26/0xcc [] _spin_lock_bh+0x2e/0x39 [] lock_sock_nested+0x26/0xcc [] lock_sock_nested+0x26/0xcc [] sock_fasync+0x61/0x116 [] sock_close+0x22/0x2f [] __fput+0xbc/0x172 [] filp_close+0x51/0x58 [] put_files_struct+0x5e/0xa6 [] do_exit+0x22e/0x6d9 [] sysenter_past_esp+0x8f/0x99 [] trace_hardirqs_on+0x11f/0x148 [] sys_exit_group+0x0/0xd [] sysenter_past_esp+0x5f/0x99 [] 0xffffffff } ... key at: [] af_family_slock_keys+0x18/0x120 ... acquired at: [] add_lock_to_list+0x68/0x8b [] __lock_acquire+0xa15/0xc02 [] ax25_info_show+0x23e/0x2aa [ax25] [] lock_acquire+0x6e/0x87 [] ax25_info_show+0x23e/0x2aa [ax25] [] _spin_lock+0x29/0x34 [] ax25_info_show+0x23e/0x2aa [ax25] [] ax25_info_show+0x23e/0x2aa [ax25] [] ax2asc+0x30/0x5a [ax25] [] seq_read+0x195/0x277 [] seq_read+0x0/0x277 [] proc_reg_read+0x64/0x77 [] proc_reg_read+0x0/0x77 [] vfs_read+0xa6/0x12e [] sys_read+0x41/0x67 [] sysenter_past_esp+0x5f/0x99 [] 0xffffffff stack backtrace: [] print_irq_inversion_bug+0x10b/0x115 [] check_usage_backwards+0x3c/0x41 [] mark_lock+0x2b8/0x44b [] __lock_acquire+0xbba/0xc02 [] __lock_acquire+0x4ad/0xc02 [] local_bh_enable_ip+0xbd/0xc5 [] lock_acquire+0x6e/0x87 [] ax25_disconnect+0x46/0xaf [ax25] [] _spin_lock+0x29/0x34 [] ax25_disconnect+0x46/0xaf [ax25] [] ax25_disconnect+0x46/0xaf [ax25] [] ax25_release+0x9d/0x182 [ax25] [] sock_release+0x14/0x56 [] sock_close+0x2a/0x2f [] __fput+0xbc/0x172 [] filp_close+0x51/0x58 [] sys_close+0x66/0x9d [] sysenter_past_esp+0x5f/0x99 ======================= ================================= [ INFO: inconsistent lock state ] 2.6.23.1 #1 --------------------------------- inconsistent {in-softirq-W} -> {softirq-on-W} usage. ax25_call/4005 [HC0[0]:SC0[0]:HE1:SE1] takes: (slock-AF_AX25){-+..}, at: [] ax25_disconnect+0x46/0xaf [ax25] {in-softirq-W} state was registered at: [] __lock_acquire+0xbba/0xc02 [] __lock_acquire+0x436/0xc02 [] __lock_acquire+0xbba/0xc02 [] lock_acquire+0x6e/0x87 [] ax25_std_heartbeat_expiry+0x19/0xd3 [ax25] [] ax25_heartbeat_expiry+0x0/0x29 [ax25] [] _spin_lock+0x29/0x34 [] ax25_std_heartbeat_expiry+0x19/0xd3 [ax25] [] ax25_std_heartbeat_expiry+0x19/0xd3 [ax25] [] run_timer_softirq+0xee/0x14a [] __do_softirq+0x31/0x90 [] trace_hardirqs_on+0x109/0x148 [] __do_softirq+0x40/0x90 [] do_softirq+0x27/0x3d [] do_IRQ+0x58/0x6c [] common_interrupt+0x2e/0x40 [] acpi_processor_idle+0x262/0x3cf [processor] [] cpu_idle+0x3c/0x51 [] start_kernel+0x272/0x277 [] unknown_bootoption+0x0/0x195 [] 0xffffffff irq event stamp: 1663 hardirqs last enabled at (1663): [] local_bh_enable_ip+0xbd/0xc5 hardirqs last disabled at (1661): [] local_bh_enable_ip+0x5b/0xc5 softirqs last enabled at (1662): [] ax25_disconnect+0x34/0xaf [ax25] softirqs last disabled at (1656): [] _spin_lock_bh+0xb/0x39 other info that might help us debug this: no locks held by ax25_call/4005. stack backtrace: [] print_usage_bug+0x13b/0x145 [] mark_lock+0x275/0x44b [] __lock_acquire+0xbba/0xc02 [] __lock_acquire+0x4ad/0xc02 [] local_bh_enable_ip+0xbd/0xc5 [] lock_acquire+0x6e/0x87 [] ax25_disconnect+0x46/0xaf [ax25] [] _spin_lock+0x29/0x34 [] ax25_disconnect+0x46/0xaf [ax25] [] ax25_disconnect+0x46/0xaf [ax25] [] ax25_release+0x9d/0x182 [ax25] [] sock_release+0x14/0x56 [] sock_close+0x2a/0x2f [] __fput+0xbc/0x172 [] filp_close+0x51/0x58 [] put_files_struct+0x5e/0xa6 [] do_exit+0x22e/0x6d9 [] _spin_unlock_irq+0x20/0x23 [] trace_hardirqs_on+0x11f/0x148 [] sys_exit_group+0x0/0xd [] get_signal_to_deliver+0x3c6/0x3ea [] sys_connect+0x82/0xad [] do_notify_resume+0x81/0x5fe [] _spin_unlock_irq+0x20/0x23 [] trace_hardirqs_on+0x11f/0x148 [] _spin_unlock_irq+0x20/0x23 [] sys_socketcall+0xac/0x261 [] trace_hardirqs_on+0x11f/0x148 [] work_notifysig+0x13/0x26 ======================= Linux version 2.6.23.1 (root@bernard) (gcc version 4.1.2 20070302 (prerelease) (4.1.2-1mdv2007.1)) #1 Thu Oct 18 21:45:24 CEST 2007 Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar ... MAX_LOCKDEP_SUBCLASSES: 8 ... MAX_LOCK_DEPTH: 30 ... MAX_LOCKDEP_KEYS: 2048 ... CLASSHASH_SIZE: 1024 ... MAX_LOCKDEP_ENTRIES: 8192 ... MAX_LOCKDEP_CHAINS: 16384 ... CHAINHASH_SIZE: 8192 memory used by lock dependency info: 992 kB per task-struct memory footprint: 1200 bytes ================================= [ INFO: inconsistent lock state ] 2.6.23.1 #1 --------------------------------- inconsistent {in-softirq-W} -> {softirq-on-W} usage. flexd/3005 [HC0[0]:SC0[0]:HE1:SE1] takes: (slock-AF_AX25){-+..}, at: [] ax25_disconnect+0x46/0xaf [ax25] {in-softirq-W} state was registered at: [] __lock_acquire+0xbba/0xc02 [] __lock_acquire+0x436/0xc02 [] __lock_acquire+0xbba/0xc02 [] lock_acquire+0x6e/0x87 [] ax25_std_heartbeat_expiry+0x19/0xd3 [ax25] [] ax25_heartbeat_expiry+0x0/0x29 [ax25] [] _spin_lock+0x29/0x34 [] ax25_std_heartbeat_expiry+0x19/0xd3 [ax25] [] ax25_std_heartbeat_expiry+0x19/0xd3 [ax25] [] run_timer_softirq+0xee/0x14a [] __do_softirq+0x31/0x90 [] trace_hardirqs_on+0x109/0x148 [] __do_softirq+0x40/0x90 [] do_softirq+0x27/0x3d [] do_IRQ+0x58/0x6c [] common_interrupt+0x2e/0x40 [] acpi_safe_halt+0x19/0x25 [processor] [] acpi_processor_idle+0x146/0x3cf [processor] [] cpu_idle+0x3c/0x51 [] start_kernel+0x272/0x277 [] unknown_bootoption+0x0/0x195 [] 0xffffffff irq event stamp: 519 hardirqs last enabled at (519): [] local_bh_enable_ip+0xbd/0xc5 hardirqs last disabled at (517): [] local_bh_enable_ip+0x5b/0xc5 softirqs last enabled at (518): [] ax25_disconnect+0x34/0xaf [ax25] softirqs last disabled at (512): [] _spin_lock_bh+0xb/0x39 other info that might help us debug this: no locks held by flexd/3005. stack backtrace: [] print_usage_bug+0x13b/0x145 [] mark_lock+0x275/0x44b [] __lock_acquire+0xbba/0xc02 [] __lock_acquire+0x4ad/0xc02 [] local_bh_enable_ip+0xbd/0xc5 [] lock_acquire+0x6e/0x87 [] ax25_disconnect+0x46/0xaf [ax25] [] _spin_lock+0x29/0x34 [] ax25_disconnect+0x46/0xaf [ax25] [] ax25_disconnect+0x46/0xaf [ax25] [] ax25_release+0x9d/0x182 [ax25] [] sock_release+0x14/0x56 [] sock_close+0x2a/0x2f [] __fput+0xbc/0x172 [] filp_close+0x51/0x58 [] sys_close+0x66/0x9d [] sysenter_past_esp+0x5f/0x99 ======================= Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar ... MAX_LOCKDEP_SUBCLASSES: 8 ... MAX_LOCK_DEPTH: 30 ... MAX_LOCKDEP_KEYS: 2048 ... CLASSHASH_SIZE: 1024 ... MAX_LOCKDEP_ENTRIES: 8192 ... MAX_LOCKDEP_CHAINS: 16384 ... CHAINHASH_SIZE: 8192 memory used by lock dependency info: 992 kB per task-struct memory footprint: 1200 bytes ================================= [ INFO: inconsistent lock state ] 2.6.23.8 #2 --------------------------------- inconsistent {in-softirq-W} -> {softirq-on-W} usage. flexd/4679 [HC0[0]:SC0[0]:HE1:SE1] takes: (slock-AF_AX25){-+..}, at: [] ax25_disconnect+0x46/0xaf [ax25] {in-softirq-W} state was registered at: [] __lock_acquire+0xb9e/0xbe6 [] __lock_acquire+0x418/0xbe6 [] __lock_acquire+0xb9e/0xbe6 [] lock_acquire+0x6e/0x87 [] ax25_std_heartbeat_expiry+0x19/0xd3 [ax25] [] ax25_heartbeat_expiry+0x0/0x29 [ax25] [] _spin_lock+0x29/0x34 [] ax25_std_heartbeat_expiry+0x19/0xd3 [ax25] [] ax25_std_heartbeat_expiry+0x19/0xd3 [ax25] [] run_timer_softirq+0xee/0x14a [] __do_softirq+0x31/0x90 [] trace_hardirqs_on+0x109/0x148 [] __do_softirq+0x40/0x90 [] do_softirq+0x27/0x3d [] do_IRQ+0x58/0x6c [] common_interrupt+0x2e/0x40 [] acpi_processor_idle+0x262/0x3cf [processor] [] cpu_idle+0x3c/0x51 [] start_kernel+0x272/0x277 [] unknown_bootoption+0x0/0x195 [] 0xffffffff irq event stamp: 601 hardirqs last enabled at (601): [] local_bh_enable_ip+0xbd/0xc5 hardirqs last disabled at (599): [] local_bh_enable_ip+0x5b/0xc5 softirqs last enabled at (600): [] ax25_disconnect+0x34/0xaf [ax25] softirqs last disabled at (594): [] _spin_lock_bh+0xb/0x39 other info that might help us debug this: no locks held by flexd/4679. stack backtrace: [] print_usage_bug+0x13b/0x145 [] mark_lock+0x275/0x44b [] __lock_acquire+0xb9e/0xbe6 [] __lock_acquire+0x47f/0xbe6 [] mark_held_locks+0x39/0x53 [] local_bh_enable_ip+0xbd/0xc5 [] lock_acquire+0x6e/0x87 [] ax25_disconnect+0x46/0xaf [ax25] [] _spin_lock+0x29/0x34 [] ax25_disconnect+0x46/0xaf [ax25] [] ax25_disconnect+0x46/0xaf [ax25] [] ax25_release+0x9d/0x182 [ax25] [] sock_release+0x14/0x56 [] sock_close+0x2a/0x2f [] __fput+0xbc/0x172 [] filp_close+0x51/0x58 [] sys_close+0x66/0x9d [] sysenter_past_esp+0x5f/0x99 ======================= --------------050601050908080509090703--