All of lore.kernel.org
 help / color / mirror / Atom feed
From: Suzuki <suzuki@in.ibm.com>
To: Nathan Scott <nathans@sgi.com>
Cc: lkml <linux-kernel@vger.kernel.org>, suparna <suparna@in.ibm.com>,
	akpm@osdl.org, linux-xfs@oss.sgi.com
Subject: Re: [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests
Date: Fri, 10 Mar 2006 11:36:23 +0530	[thread overview]
Message-ID: <4411175F.8040504@in.ibm.com> (raw)
In-Reply-To: <20060309222232.GB1135@frodo>

Hi Nathan,

Nathan Scott wrote:
> On Thu, Mar 09, 2006 at 12:47:01PM +0530, Suzuki wrote:
> 
>>Hi all,
> 
> 
> Hi there Suzuki,
> 
> 
>>I was working on an issue with getting "Badness in
>>__mutex_unlock_slowpath" and hence a stack trace, while running FS
>>stress tests on XFS on 2.6.16-rc5 kernel.
> 
> 
> Thanks for looking into this.
> 
> 
>>The dmesg looks like :
>>
>>Badness in __mutex_unlock_slowpath at kernel/mutex.c:207
>> [<c0103c0c>] show_trace+0x20/0x22
>> [<c0103d4b>] dump_stack+0x1e/0x20
>> [<c0473f1f>] __mutex_unlock_slowpath+0x12a/0x23b
>> [<c0473938>] mutex_unlock+0xb/0xd
>> [<c02a5720>] xfs_read+0x230/0x2d9
>> [<c02a1bed>] linvfs_aio_read+0x8d/0x98
>> [<c015f3df>] do_sync_read+0xb8/0x107
>> [<c015f4f7>] vfs_read+0xc9/0x19b
>> [<c015f8b2>] sys_read+0x47/0x6e
>> [<c0102db7>] sysenter_past_esp+0x54/0x75
> 
> 
> Yeah, test 008 from the xfstests suite was reliably hitting this for
> me, it'd just not percolated to the top of my todo list yet.
> 
> 
>>This happens with XFS DIO reads. xfs_read holds the i_mutex and issues a
>>__generic_file_aio_read(), which falls into __blockdev_direct_IO with
>>DIO_OWN_LOCKING flag (since xfs uses own_locking ). Now
>>__blockdev_direct_IO releases the i_mutex for READs with
>>DIO_OWN_LOCKING.When it returns to xfs_read, it tries to unlock the
>>i_mutex ( which is now already unlocked), causing the "Badness".
> 
> 
> Indeed.  And there's the problem - why is XFS releasing i_mutex
> for the direct read in xfs_read?  Shouldn't be - fs/direct-io.c
> will always release i_mutex for a direct read in the own-locking
> case, so XFS shouldn't be doing it too (thats what the code does
> and thats what the comment preceding __blockdev_direct_IO says).
> 
> The only piece of the puzzle I don't understand is why we don't
> always get that badness message at the end of every direct read.
> Perhaps its some subtle fastpath/slowpath difference, or maybe
> "debug_mutex_on" gets switched off after the first occurance...

Yes, the debug_mutex_on gets switched off after the first occurence.

> 
> Anyway, with the above change (remove 2 lines near xfs_read end),
> I can no longer reproduce the problem in that previously-warning
> test case, and all the other XFS tests seem to be chugging along
> OK (which includes a healthy mix of dio testing).
> 
> 
>>The possible solution which we can think of, is not to unlock the
>>i_mutex for DIO_OWN_LOCKING. This will only affect the DIO_OWN_LOCKING 
>>users (as of now, only XFS ) with concurrent DIO sync read requests. AIO 
>>read requests would not suffer this problem since they would just return 
>>once the DIO is submitted.
> 
> 
> I don't think that level of invasiveness is necessary at this stage,
> but perhaps you're seeing something that I've missed?  Do you see
> any reason why removing the xfs_read unlock wont work?
> 

But, what happens if __generic_file_aio_read() hits some error before 
doing the aops->direct_IO ?

> 
>>Another work around for this can  be adding a check "mutex_is_locked"
>>before trying to unlock i_mutex in xfs_read. But this seems to be an
>>ugly hack. :(
> 
> 
> Hmm, that just plain wouldn't work - what if the lock was released
> in generic direct IO code, and someone else had acquired it before
> we got to the end of xfs_read?  Badness for sure.
> 
> cheers.
> 

Suzuki.

  reply	other threads:[~2006-03-10  6:00 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-03-09  7:17 [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests Suzuki
2006-03-09 22:22 ` Nathan Scott
2006-03-10  6:06   ` Suzuki [this message]
  -- strict thread matches above, loose matches on Subject: below --
2006-03-09  7:54 Suzuki
2006-03-09 12:03 ` Christoph Hellwig
2006-03-09 12:03   ` Christoph Hellwig
2006-03-09 22:30   ` Nathan Scott
2006-03-09 22:42     ` Christoph Hellwig
2006-03-09 23:14       ` Nathan Scott
2006-03-10  0:50         ` Nathan Scott
2006-03-10 15:49           ` Christoph Hellwig
2006-03-14  4:46             ` Suparna Bhattacharya
2006-03-17 17:22           ` Adrian Bunk
2006-03-18  3:34             ` Nathan Scott
2006-03-18  5:03               ` Adrian Bunk
2006-07-10 16:46           ` Stephane Doyon
2006-07-11  0:18             ` Nathan Scott
2006-07-11 13:40               ` Stephane Doyon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4411175F.8040504@in.ibm.com \
    --to=suzuki@in.ibm.com \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@oss.sgi.com \
    --cc=nathans@sgi.com \
    --cc=suparna@in.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.