public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Suzuki <suzuki@in.ibm.com>
To: Nathan Scott <nathans@sgi.com>
Cc: lkml <linux-kernel@vger.kernel.org>, suparna <suparna@in.ibm.com>,
	akpm@osdl.org, linux-xfs@oss.sgi.com
Subject: Re: [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests
Date: Fri, 10 Mar 2006 11:36:23 +0530	[thread overview]
Message-ID: <4411175F.8040504@in.ibm.com> (raw)
In-Reply-To: <20060309222232.GB1135@frodo>

Hi Nathan,

Nathan Scott wrote:
> On Thu, Mar 09, 2006 at 12:47:01PM +0530, Suzuki wrote:
> 
>>Hi all,
> 
> 
> Hi there Suzuki,
> 
> 
>>I was working on an issue with getting "Badness in
>>__mutex_unlock_slowpath" and hence a stack trace, while running FS
>>stress tests on XFS on 2.6.16-rc5 kernel.
> 
> 
> Thanks for looking into this.
> 
> 
>>The dmesg looks like :
>>
>>Badness in __mutex_unlock_slowpath at kernel/mutex.c:207
>> [<c0103c0c>] show_trace+0x20/0x22
>> [<c0103d4b>] dump_stack+0x1e/0x20
>> [<c0473f1f>] __mutex_unlock_slowpath+0x12a/0x23b
>> [<c0473938>] mutex_unlock+0xb/0xd
>> [<c02a5720>] xfs_read+0x230/0x2d9
>> [<c02a1bed>] linvfs_aio_read+0x8d/0x98
>> [<c015f3df>] do_sync_read+0xb8/0x107
>> [<c015f4f7>] vfs_read+0xc9/0x19b
>> [<c015f8b2>] sys_read+0x47/0x6e
>> [<c0102db7>] sysenter_past_esp+0x54/0x75
> 
> 
> Yeah, test 008 from the xfstests suite was reliably hitting this for
> me, it'd just not percolated to the top of my todo list yet.
> 
> 
>>This happens with XFS DIO reads. xfs_read holds the i_mutex and issues a
>>__generic_file_aio_read(), which falls into __blockdev_direct_IO with
>>DIO_OWN_LOCKING flag (since xfs uses own_locking ). Now
>>__blockdev_direct_IO releases the i_mutex for READs with
>>DIO_OWN_LOCKING.When it returns to xfs_read, it tries to unlock the
>>i_mutex ( which is now already unlocked), causing the "Badness".
> 
> 
> Indeed.  And there's the problem - why is XFS releasing i_mutex
> for the direct read in xfs_read?  Shouldn't be - fs/direct-io.c
> will always release i_mutex for a direct read in the own-locking
> case, so XFS shouldn't be doing it too (thats what the code does
> and thats what the comment preceding __blockdev_direct_IO says).
> 
> The only piece of the puzzle I don't understand is why we don't
> always get that badness message at the end of every direct read.
> Perhaps its some subtle fastpath/slowpath difference, or maybe
> "debug_mutex_on" gets switched off after the first occurance...

Yes, the debug_mutex_on gets switched off after the first occurence.

> 
> Anyway, with the above change (remove 2 lines near xfs_read end),
> I can no longer reproduce the problem in that previously-warning
> test case, and all the other XFS tests seem to be chugging along
> OK (which includes a healthy mix of dio testing).
> 
> 
>>The possible solution which we can think of, is not to unlock the
>>i_mutex for DIO_OWN_LOCKING. This will only affect the DIO_OWN_LOCKING 
>>users (as of now, only XFS ) with concurrent DIO sync read requests. AIO 
>>read requests would not suffer this problem since they would just return 
>>once the DIO is submitted.
> 
> 
> I don't think that level of invasiveness is necessary at this stage,
> but perhaps you're seeing something that I've missed?  Do you see
> any reason why removing the xfs_read unlock wont work?
> 

But, what happens if __generic_file_aio_read() hits some error before 
doing the aops->direct_IO ?

> 
>>Another work around for this can  be adding a check "mutex_is_locked"
>>before trying to unlock i_mutex in xfs_read. But this seems to be an
>>ugly hack. :(
> 
> 
> Hmm, that just plain wouldn't work - what if the lock was released
> in generic direct IO code, and someone else had acquired it before
> we got to the end of xfs_read?  Badness for sure.
> 
> cheers.
> 

Suzuki.

  reply	other threads:[~2006-03-10  6:00 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-03-09  7:17 [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests Suzuki
2006-03-09 22:22 ` Nathan Scott
2006-03-10  6:06   ` Suzuki [this message]
  -- strict thread matches above, loose matches on Subject: below --
2006-03-09  7:54 Suzuki
2006-03-09 12:03 ` Christoph Hellwig
2006-03-09 22:30   ` Nathan Scott
2006-03-09 22:42     ` Christoph Hellwig
2006-03-09 23:14       ` Nathan Scott
2006-03-10  0:50         ` Nathan Scott
2006-03-10 15:49           ` Christoph Hellwig
2006-03-14  4:46             ` Suparna Bhattacharya
2006-03-17 17:22           ` Adrian Bunk
2006-03-18  3:34             ` Nathan Scott
2006-03-18  5:03               ` Adrian Bunk
2006-07-10 16:46           ` Stephane Doyon
2006-07-11  0:18             ` Nathan Scott
2006-07-11 13:40               ` Stephane Doyon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4411175F.8040504@in.ibm.com \
    --to=suzuki@in.ibm.com \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@oss.sgi.com \
    --cc=nathans@sgi.com \
    --cc=suparna@in.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox