From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <518DCDB1.30408@gmail.com>
Date: Sat, 11 May 2013 00:48:49 -0400
From: "Michael L. Semon"
To: Dave Chinner
Cc: "xfs@oss.sgi.com"
Subject: Re: Rambling noise #1: generic/230 can trigger kernel debug lock detector
References: <518B08D9.1060906@gmail.com> <20130509031646.GN24635@dastard>
 <20130509072045.GO24635@dastard> <518C54AA.7070908@gmail.com>
 <20130510021942.GP23072@dastard> <20130511011732.GC32675@dastard>
In-Reply-To: <20130511011732.GC32675@dastard>
List-Id: XFS Filesystem from SGI
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Sender: xfs-bounces@oss.sgi.com

On 05/10/2013 09:17 PM, Dave Chinner wrote:
> On Fri, May 10, 2013 at 03:07:19PM -0400, Michael L. Semon wrote:
>> On Thu, May 9, 2013 at 10:19 PM, Dave Chinner wrote:
>>> On Thu, May 09, 2013 at 10:00:10PM -0400, Michael L. Semon wrote:
>> Thanks for looking at it.  There are going to be plenty of false
>> positives out there.  Is there a pecking order of what works best?
>> As in...
>>
>> * IRQ (IRQs-off?) checking: worth reporting...?
>> * sleep inside atomic sections: fascinating, but almost anything can
>>   trigger it
>> * multiple-CPU deadlock detection: can only speculate on a
>>   uniprocessor system
>> * circular dependency checking: YMMV
>> * reclaim-fs checking: wish I knew how much developers need to
>>   conform to reclaim-fs, or what it is
>
> If there's XFS in the trace, then just post them. We try to fix
> false positives (as well as real bugs) so lockdep reporting gets more
> accurate and less noisy over time.
>
> Cheers,
>
> Dave.
>

Feel free to ignore and flame them as well.  I'm going to make another
attempt to triage my eldest Pentium 4, and there's a high chance that
you'll have to reply, "Despite the xfs_* functions, that looks like a
DRM issue.  Go bug those guys."

Thanks!

Michael

During generic/249 (lucky, first test out)...

======================================================
[ INFO: possible circular locking dependency detected ]
3.9.0+ #2 Not tainted
-------------------------------------------------------
xfs_io/1181 is trying to acquire lock:
 (sb_writers#3){.+.+.+}, at: [] generic_file_splice_write+0x7e/0x1b0

but task is already holding lock:
 (&(&ip->i_iolock)->mr_lock){++++++}, at: [] xfs_ilock+0xea/0x190

which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:

-> #2 (&(&ip->i_iolock)->mr_lock){++++++}:
       [] lock_acquire+0x80/0x100
       [] down_write_nested+0x54/0xa0
       [] xfs_ilock+0xea/0x190
       [] xfs_setattr_size+0x30c/0x4a0
       [] xfs_vn_setattr+0x2c/0x30
       [] notify_change+0x13c/0x360
       [] do_truncate+0x5a/0xa0
       [] do_last.isra.46+0x31e/0xb90
       [] path_openat.isra.47+0x9b/0x3e0
       [] do_filp_open+0x31/0x80
       [] do_sys_open+0xf1/0x1c0
       [] sys_open+0x28/0x30
       [] sysenter_do_call+0x12/0x36

-> #1 (&sb->s_type->i_mutex_key#6){+.+.+.}:
       [] lock_acquire+0x80/0x100
       [] mutex_lock_nested+0x64/0x2b0
       [] do_truncate+0x50/0xa0
       [] do_last.isra.46+0x31e/0xb90
       [] path_openat.isra.47+0x9b/0x3e0
       [] do_filp_open+0x31/0x80
       [] do_sys_open+0xf1/0x1c0
       [] sys_open+0x28/0x30
       [] sysenter_do_call+0x12/0x36

-> #0 (sb_writers#3){.+.+.+}:
       [] __lock_acquire+0x1465/0x1690
       [] lock_acquire+0x80/0x100
       [] __sb_start_write+0xad/0x1b0
       [] generic_file_splice_write+0x7e/0x1b0
       [] xfs_file_splice_write+0x83/0x120
       [] do_splice_from+0x65/0x90
       [] direct_splice_actor+0x2b/0x40
       [] splice_direct_to_actor+0xb9/0x1e0
       [] do_splice_direct+0x62/0x80
       [] do_sendfile+0x1b6/0x2d0
       [] sys_sendfile64+0x4e/0xb0
       [] sysenter_do_call+0x12/0x36

other info that might help us debug this:

Chain exists of:
  sb_writers#3 --> &sb->s_type->i_mutex_key#6 --> &(&ip->i_iolock)->mr_lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&(&ip->i_iolock)->mr_lock);
                               lock(&sb->s_type->i_mutex_key#6);
                               lock(&(&ip->i_iolock)->mr_lock);
  lock(sb_writers#3);

 *** DEADLOCK ***

1 lock held by xfs_io/1181:
 #0:  (&(&ip->i_iolock)->mr_lock){++++++}, at: [] xfs_ilock+0xea/0x190

stack backtrace:
Pid: 1181, comm: xfs_io Not tainted 3.9.0+ #2
Call Trace:
 [] print_circular_bug+0x1b8/0x1c2
 [] __lock_acquire+0x1465/0x1690
 [] ? trace_hardirqs_off+0xb/0x10
 [] lock_acquire+0x80/0x100
 [] ? generic_file_splice_write+0x7e/0x1b0
 [] __sb_start_write+0xad/0x1b0
 [] ? generic_file_splice_write+0x7e/0x1b0
 [] ? generic_file_splice_write+0x7e/0x1b0
 [] generic_file_splice_write+0x7e/0x1b0
 [] ? xfs_ilock+0xea/0x190
 [] xfs_file_splice_write+0x83/0x120
 [] ? xfs_file_fsync+0x210/0x210
 [] do_splice_from+0x65/0x90
 [] direct_splice_actor+0x2b/0x40
 [] splice_direct_to_actor+0xb9/0x1e0
 [] ? do_splice_from+0x90/0x90
 [] do_splice_direct+0x62/0x80
 [] do_sendfile+0x1b6/0x2d0
 [] ? might_fault+0x94/0xa0
 [] sys_sendfile64+0x4e/0xb0
 [] sysenter_do_call+0x12/0x36
XFS (sdb5): Mounting Filesystem
XFS (sdb5): Ending clean mount

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
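[Editor's note: for readers unfamiliar with lockdep's circular-dependency check, the report above is an AB-BA ordering inversion: a chain sb_writers -> i_mutex -> i_iolock was recorded on earlier paths, and the splice path then tried i_iolock -> sb_writers, closing the loop. The sketch below is a toy illustration of that style of detection only, not kernel code; `MiniLockdep` and the simplified lock names are hypothetical, chosen to mirror the trace.]

```python
# Toy sketch of lockdep-style circular dependency checking
# (illustrative only; all names hypothetical, mirroring the report above).

class MiniLockdep:
    """Records 'held lock -> newly acquired lock' edges and flags cycles."""

    def __init__(self):
        self.edges = {}   # lock name -> set of locks acquired while it was held
        self.held = []    # stack of currently held locks

    def acquire(self, lock):
        for h in self.held:
            self.edges.setdefault(h, set()).add(lock)
            # If 'lock' can already reach 'h' through recorded orderings,
            # the new edge h -> lock closes a cycle: the AB-BA pattern.
            if self._reaches(lock, h):
                return "possible circular locking dependency: %s -> %s" % (h, lock)
        self.held.append(lock)
        return None

    def release(self, lock):
        self.held.remove(lock)

    def _reaches(self, src, dst, seen=None):
        seen = seen if seen is not None else set()
        if src == dst:
            return True
        for nxt in self.edges.get(src, ()):
            if nxt not in seen:
                seen.add(nxt)
                if self._reaches(nxt, dst, seen):
                    return True
        return False


dep = MiniLockdep()
# Recorded earlier (truncate path): i_mutex taken, then i_iolock.
dep.acquire("i_mutex"); dep.acquire("i_iolock")
dep.release("i_iolock"); dep.release("i_mutex")
# Recorded earlier (open-for-write path): sb_writers taken, then i_mutex.
dep.acquire("sb_writers"); dep.acquire("i_mutex")
dep.release("i_mutex"); dep.release("sb_writers")
# Splice path: i_iolock held, then sb_writers requested -- cycle closed.
dep.acquire("i_iolock")
warning = dep.acquire("sb_writers")
print(warning)
```

Running this prints a warning naming the inverted pair (i_iolock -> sb_writers), which is the same inversion the kernel's real dependency graph caught above; the real lockdep additionally tracks IRQ and reclaim states per lock class.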