From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ingo Molnar
Subject: Re: unsafe locks seen with netperf on net-2.6.29 tree
Date: Mon, 29 Dec 2008 13:38:19 +0100
Message-ID: <20081229123819.GA18321@elte.hu>
References: <1230410308.9487.295.camel@twins>
	<1230544927.16718.12.camel@twins>
	<20081229103154.GA9691@gondor.apana.org.au>
	<20081229103735.GA9763@gondor.apana.org.au>
	<20081229112858.GA16385@elte.hu>
	<20081229114907.GA10170@gondor.apana.org.au>
	<20081229115827.GA441@elte.hu>
	<20081229120132.GA10363@gondor.apana.org.au>
	<20081229121626.GF9628@elte.hu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Peter Zijlstra , "Tantilov, Emil S" ,
	"Kirsher, Jeffrey T" , netdev ,
	David Miller , "Waskiewicz Jr, Peter P" ,
	"Duyck, Alexander H" , Eric Dumazet
To: Herbert Xu
Return-path:
Received: from mx3.mail.elte.hu ([157.181.1.138]:57835 "EHLO
	mx3.mail.elte.hu" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750710AbYL2Mif (ORCPT );
	Mon, 29 Dec 2008 07:38:35 -0500
Content-Disposition: inline
In-Reply-To: <20081229121626.GF9628@elte.hu>
Sender: netdev-owner@vger.kernel.org
List-ID:

* Ingo Molnar wrote:

>
> * Herbert Xu wrote:
>
> > On Mon, Dec 29, 2008 at 12:58:27PM +0100, Ingo Molnar wrote:
> > >
> > > no, i only applied one of them. Is his second patch a good solution in
> > > your opinion, and should i thus test both of them? (or will the second one
> > > iterate some more - in which case i will keep the revert for now)
> >
> > Well the second patch is definitely the right solution to the problem
> > as reported. It just needs to be extended to fix other similar bugs
> > introduced by the original changeset.
>
> okay - will keep the revert for now and will wait for you guys to do the
> full fix.

hm, even with the revert i got the splat below. So some other commits are
causing this too?

	Ingo

=================================
[ INFO: inconsistent lock state ]
2.6.28-tip-03883-gf855e6c-dirty #13150
---------------------------------
inconsistent {softirq-on-W} -> {in-softirq-W} usage.
kjournald/1435 [HC0[0]:SC1[1]:HE1:SE0] takes:
 (&fbc->lock){-+..}, at: [] __percpu_counter_add+0x65/0xb0
{softirq-on-W} state was registered at:
  [] __lock_acquire+0x4c6/0xae0
  [] lock_acquire+0x89/0xc0
  [] _spin_lock+0x38/0x50
  [] __percpu_counter_add+0x65/0xb0
  [] get_empty_filp+0x6a/0x1d0
  [] path_lookup_open+0x29/0x90
  [] do_filp_open+0x9e/0x790
  [] do_sys_open+0x50/0xe0
  [] sys_open+0x2e/0x40
  [] syscall_call+0x7/0xb
  [] 0xffffffff
irq event stamp: 125790
hardirqs last enabled at (125790): [] free_hot_cold_page+0x1b6/0x280
hardirqs last disabled at (125789): [] free_hot_cold_page+0x10e/0x280
softirqs last enabled at (123900): [] __do_softirq+0x132/0x180
softirqs last disabled at (125765): [] call_on_stack+0x1a/0x30

other info that might help us debug this:
4 locks held by kjournald/1435:
 #0: (rcu_read_lock){..--}, at: [] net_rx_action+0xd0/0x220
 #1: (rcu_read_lock){..--}, at: [] netif_receive_skb+0x101/0x3a0
 #2: (rcu_read_lock){..--}, at: [] ip_local_deliver+0x55/0x1d0
 #3: (slock-AF_INET/1){-+..}, at: [] tcp_v4_rcv+0x55a/0x6e0

stack backtrace:
Pid: 1435, comm: kjournald Not tainted 2.6.28-tip-03883-gf855e6c-dirty #13150
Call Trace:
 [] print_usage_bug+0x176/0x1d0
 [] mark_lock+0xbd0/0xd80
 [] __lock_acquire+0x483/0xae0
 [] ? trace_hardirqs_on+0xb/0x10
 [] lock_acquire+0x89/0xc0
 [] ? __percpu_counter_add+0x65/0xb0
 [] _spin_lock+0x38/0x50
 [] ? __percpu_counter_add+0x65/0xb0
 [] __percpu_counter_add+0x65/0xb0
 [] tcp_v4_destroy_sock+0x1d9/0x240
 [] inet_csk_destroy_sock+0x4a/0x140
 [] ? inet_csk_clear_xmit_timers+0x45/0x50
 [] tcp_done+0x4d/0x70
 [] tcp_rcv_state_process+0x68c/0x950
 [] tcp_v4_do_rcv+0xd6/0x310
 [] ? _spin_lock_nested+0x3d/0x50
 [] tcp_v4_rcv+0x5e4/0x6e0
 [] ? ip_local_deliver+0x55/0x1d0
 [] ip_local_deliver+0xa4/0x1d0
 [] ? ip_local_deliver+0x55/0x1d0
 [] ip_rcv+0x2aa/0x510
 [] ? netif_receive_skb+0x101/0x3a0
 [] ? ip_rcv+0x0/0x510
 [] netif_receive_skb+0x2e9/0x3a0
 [] ? netif_receive_skb+0x101/0x3a0
 [] ? __lock_acquire+0x361/0xae0
 [] napi_gro_receive+0x1c1/0x200
 [] ? mark_held_locks+0x30/0x80
 [] ? process_backlog+0x7b/0xd0
 [] process_backlog+0x92/0xd0
 [] net_rx_action+0x154/0x220
 [] ? net_rx_action+0xd0/0x220
 [] __do_softirq+0xa9/0x180
 [] ? __do_softirq+0x0/0x180
 [] ? irq_exit+0x4d/0x60
 [] ? do_IRQ+0x8a/0xe0
 [] ? check_object+0xef/0x1f0
 [] ? common_interrupt+0x2c/0x34
 [] ? kmem_cache_free+0xc2/0xf0
 [] ? journal_write_revoke_records+0xa5/0x140
 [] ? journal_write_revoke_records+0xa5/0x140
 [] ? journal_write_revoke_records+0xa5/0x140
 [] ? journal_commit_transaction+0x42d/0xe80
 [] ? trace_hardirqs_on_caller+0x17e/0x1e0
 [] ? trace_hardirqs_on+0xb/0x10
 [] ? try_to_del_timer_sync+0x4e/0x60
 [] ? kjournald+0xbb/0x1d0
 [] ? autoremove_wake_function+0x0/0x40
 [] ? kjournald+0x0/0x1d0
 [] ? kthread+0x47/0x80
 [] ? kthread+0x0/0x80
 [] ? kernel_thread_helper+0x7/0x10