From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: Possible circular locking dependency while running valgrind
 on an executable over CIFS
From: Peter Zijlstra
To: Steve French
Cc: Brandon Ehle, linux-kernel@vger.kernel.org, Ingo Molnar,
 shaggy@austin.ibm.com
In-Reply-To: <524f69650804210507x5e19971ajfdd42049e50d0e2b@mail.gmail.com>
References: <3c5a513f0804202004y35820ddfse9195f883428a7db@mail.gmail.com>
 <1208769262.7115.164.camel@twins>
 <524f69650804210507x5e19971ajfdd42049e50d0e2b@mail.gmail.com>
Content-Type: text/plain
Date: Mon, 21 Apr 2008 14:15:04 +0200
Message-Id: <1208780104.7115.171.camel@twins>
Mime-Version: 1.0
X-Mailer: Evolution 2.22.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 2008-04-21 at 07:07 -0500, Steve French wrote:
> I thought that this was considered a false positive before, but Shaggy
> might remember if that was the same warning.
>
> But it would seem impractical to delay sending read requests from
> readpages - that seems to defeat the whole purpose of having a
> higher-performance read interface. If tcp_sendmsg can deadlock when
> called from readpages or other code which holds this sem, wouldn't this
> deadlock all over the place (network block device drivers, SAN drivers,
> various network file systems, some cluster file systems), since
> readpages is a common entry point and sendmsg or equivalent could be
> used by various networking-related block drivers, file systems, etc.?
Ah, is this one of those cases where the lock-order inversion is between
different sockets, like we had with NFS in
ed07536ed6731775219c1df7fa26a7588753e693?

So the normal sk_lock -> mmap_sem order is not possible for these
sockets, because they are never exposed to user-space and will thus
never need to do the whole copy_from/to_user() thing.

Is that the case?

> On Mon, Apr 21, 2008 at 4:14 AM, Peter Zijlstra wrote:
>   On Sun, 2008-04-20 at 20:04 -0700, Brandon Ehle wrote:
>   > While running valgrind on various executables located on a CIFS
>   > share, I eventually got the message below. The machine in question
>   > is running 2.6.25 with a few of the debug options enabled,
>   > including lock debugging. Any idea if this is indicative of a real
>   > problem or just a false positive?
>
>   Looks like a simple AB-BA deadlock to me. I admit to being a bit
>   ignorant about CIFS (not really minding that), however it looks like
>   you need to delay or decouple the actual sending of the read request
>   from the .readpages() implementation.
>
>   Steve?
>
> > =======================================================
> > [ INFO: possible circular locking dependency detected ]
> > 2.6.25 #29
> > -------------------------------------------------------
> > valgrind.bin/23200 is trying to acquire lock:
> >  (sk_lock-AF_INET){--..}, at: [] tcp_sendmsg+0x17/0xb00
> >
> > but task is already holding lock:
> >  (&mm->mmap_sem){----}, at: [] do_page_fault+0x16d/0x8e0
> >
> > which lock already depends on the new lock.
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #1 (&mm->mmap_sem){----}:
> >        [] __lock_acquire+0xfa9/0xfdb
> >        [] lock_acquire+0x8a/0xbd
> >        [] down_read+0x3d/0x85
> >        [] do_page_fault+0x185/0x8e0
> >        [] error_code+0x72/0x78
> >        [] memcpy_toiovec+0x43/0x5a
> >        [] skb_copy_datagram_iovec+0x47/0x1d7
> >        [] tcp_recvmsg+0xb3d/0xcc8
> >        [] sock_common_recvmsg+0x3d/0x53
> >        [] sock_recvmsg+0x10d/0x12a
> >        [] sys_recvfrom+0x7c/0xc7
> >        [] sys_recv+0x36/0x38
> >        [] sys_socketcall+0x163/0x25b
> >        [] sysenter_past_esp+0x6d/0xc5
> >        [] 0xffffffff
> >
> > -> #0 (sk_lock-AF_INET){--..}:
> >        [] __lock_acquire+0xba1/0xfdb
> >        [] lock_acquire+0x8a/0xbd
> >        [] lock_sock_nested+0xb5/0xc2
> >        [] tcp_sendmsg+0x17/0xb00
> >        [] sock_sendmsg+0xf9/0x116
> >        [] kernel_sendmsg+0x28/0x37
> >        [] SendReceive2+0x197/0x69b
> >        [] CIFSSMBRead+0x110/0x2a0
> >        [] cifs_readpages+0x18c/0x5c7
> >        [] __do_page_cache_readahead+0x161/0x1e1
> >        [] do_page_cache_readahead+0x45/0x59
> >        [] filemap_fault+0x30f/0x467
> >        [] __do_fault+0x56/0x43a
> >        [] handle_mm_fault+0x150/0x873
> >        [] do_page_fault+0x39d/0x8e0
> >        [] error_code+0x72/0x78
> >        [] 0xffffffff
> >
> > other info that might help us debug this:
> >
> > 1 lock held by valgrind.bin/23200:
> >  #0:  (&mm->mmap_sem){----}, at: [] do_page_fault+0x16d/0x8e0
> >
> > stack backtrace:
> > Pid: 23200, comm: valgrind.bin Not tainted 2.6.25 #29
> >  [] print_circular_bug_tail+0x68/0x6a
> >  [] __lock_acquire+0xba1/0xfdb
> >  [] ? restore_nocheck+0x12/0x15
> >  [] lock_acquire+0x8a/0xbd
> >  [] ? tcp_sendmsg+0x17/0xb00
> >  [] lock_sock_nested+0xb5/0xc2
> >  [] ? tcp_sendmsg+0x17/0xb00
> >  [] tcp_sendmsg+0x17/0xb00
> >  [] ? get_lock_stats+0x1b/0x3e
> >  [] ? getnstimeofday+0x34/0xde
> >  [] ? lapic_next_event+0x15/0x1e
> >  [] sock_sendmsg+0xf9/0x116
> >  [] ? mempool_alloc_slab+0xe/0x10
> >  [] ? autoremove_wake_function+0x0/0x3a
> >  [] ? _spin_unlock+0x27/0x42
> >  [] ? allocate_mid+0xec/0x149
> >  [] kernel_sendmsg+0x28/0x37
> >  [] SendReceive2+0x197/0x69b
> >  [] CIFSSMBRead+0x110/0x2a0
> >  [] cifs_readpages+0x18c/0x5c7
> >  [] ? __alloc_pages+0x6d/0x3c7
> >  [] ? mark_held_locks+0x4d/0x84
> >  [] ? __rcu_read_unlock+0x81/0xa4
> >  [] ? cifs_readpages+0x0/0x5c7
> >  [] __do_page_cache_readahead+0x161/0x1e1
> >  [] do_page_cache_readahead+0x45/0x59
> >  [] filemap_fault+0x30f/0x467
> >  [] __do_fault+0x56/0x43a
> >  [] handle_mm_fault+0x150/0x873
> >  [] ? native_sched_clock+0xa6/0xea
> >  [] ? do_page_fault+0x16d/0x8e0
> >  [] ? down_read_trylock+0x51/0x59
> >  [] do_page_fault+0x39d/0x8e0
> >  [] ? __lock_acquire+0x2d1/0xfdb
> >  [] ? vma_link+0xbc/0xd4
> >  [] ? __filemap_fdatawrite_range+0x61/0x6d
> >  [] ? filemap_fdatawrite+0x26/0x28
> >  [] ? cifs_flush+0x1c/0x5c
> >  [] ? sysenter_past_esp+0xb6/0xc5
> >  [] ? trace_hardirqs_on+0x10a/0x15a
> >  [] ? do_page_fault+0x0/0x8e0
> >  [] error_code+0x72/0x78
> >  =======================
>
>
>
> --
> Thanks,
>
> Steve