From: Dave Chinner <david@fromorbit.com>
To: Daniel J Blueman <daniel.blueman@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>,
Mel Gorman <mel@csn.ul.ie>, Christoph Lameter <clameter@sgi.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Alexander Beregalov <a.beregalov@gmail.com>,
Linux Kernel <linux-kernel@vger.kernel.org>,
xfs@oss.sgi.com
Subject: Re: [2.6.26-rc7] shrink_icache from pagefault locking (nee: nfsd hangs for a few sec)...
Date: Mon, 23 Jun 2008 08:29:52 +1000
Message-ID: <20080622222952.GB11558@disturbed>
In-Reply-To: <6278d2220806221356o4c611e43n305ec9653d6d5359@mail.gmail.com>

On Sun, Jun 22, 2008 at 09:56:17PM +0100, Daniel J Blueman wrote:
> On Sun, Jun 22, 2008 at 7:21 PM, Arjan van de Ven <arjan@infradead.org> wrote:
> > this sort of thing can easily be exposed with the latencytop tool...
> > it will at least tell you WHAT the system is blocking on.
> > (not so much the why, the tool isn't smart enough to automatically spit
> > out kernel patches yet)
>
> Good plan. I reproduced this without NFS mounted, so pure XFS. I
> wasn't able to capture the same process's (ie 5480) latencytop trace,
> but it may have contributed to the list.
>
> A fair amount of debugging was turned on, hurting the latency some
> (despite it being a 3.6GHz Penryn).
>
> Daniel
>
> --- [1]
>
> $ vmstat 1
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> <snip>
> 1 1 156 14424 12 2329672 0 0 0 110755 177 3820 57 6 36 0
> 2 1 156 14736 12 2348152 0 0 24 25172 204 26018 35 21 43 1
> 5 0 156 24252 12 2363800 0 0 59656 31 545 25292 35 14 28 23
> 4 0 156 14696 12 2317784 0 0 3824 0 38 23083 95 6 0 0
> 4 0 156 14440 12 2319304 0 0 4672 0 72 3372 93 3 3 2
> 2 0 156 14428 12 2318484 0 0 0 4 27 731 52 0 49 0
> 2 0 156 14480 12 2308512 0 0 0 12 32 36629 39 13 49 0
> 2 0 156 14572 12 2301220 0 0 3904 12316 117 10760 58 7 26 11
>
> --- [2]
>
> Cause Maximum Percentage
> down xfs_buf_lock _xfs_buf_find xfs_buf_get_flags 271.1 msec 0.4 %

Waiting on I/O to complete. Your disk is busy.

> down xfs_buf_iowait xfs_buf_iostart xfs_buf_read_f 206.1 msec 1.3 %

Waiting on I/O to complete. Your disk is busy.

> down xfs_buf_lock xfs_getsb xfs_trans_getsb xfs_tr 160.4 msec 0.5 %

Waiting on a superblock I/O or a transaction to complete. Your
disk is busy. (Note, this one can be avoided with lazy
superblock counters).

[snip rest of "busy disk trace"]

But really, all latencytop is telling you here is that it takes
time to wait for I/O to complete. It's mostly useless for tracking
down locking issues when you've got I/O in progress...

> 2.6.26-rc7-211c #2
> -------------------------------------------------------
> AutopanoPro/5480 is trying to acquire lock:
> (iprune_mutex){--..}, at: [<ffffffff802d2a5d>] shrink_icache_memory+0x7d/0x280
>
> but task is already holding lock:
> (&(&ip->i_iolock)->mr_lock){----}, at: [<ffffffff803a459f>]
> xfs_ilock+0xbf/0x110
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&(&ip->i_iolock)->mr_lock){----}:
> [<ffffffff802769ad>] __lock_acquire+0xbdd/0x1020
> [<ffffffff80276e55>] lock_acquire+0x65/0x90
> [<ffffffff8026b41b>] down_write_nested+0x4b/0x90
> [<ffffffff803a45df>] xfs_ilock+0xff/0x110
> [<ffffffff803a47af>] xfs_ireclaim+0x3f/0x90
> [<ffffffff803c9169>] xfs_finish_reclaim+0x59/0x220
> [<ffffffff803cc3b5>] xfs_reclaim+0x185/0x190
> [<ffffffff803d9211>] xfs_fs_clear_inode+0xe1/0x130
> [<ffffffff802d25c7>] clear_inode+0x87/0x120
> [<ffffffff802d290a>] dispose_list+0x2a/0x100
> [<ffffffff802d2c06>] shrink_icache_memory+0x226/0x280
> [<ffffffff8029d9c5>] shrink_slab+0x125/0x180
> [<ffffffff8029dc52>] try_to_free_pages+0x232/0x360
> [<ffffffff80297f0d>] __alloc_pages_internal+0x1ed/0x4a0
> [<ffffffff802981db>] __alloc_pages+0xb/0x10
> [<ffffffff802a312a>] handle_mm_fault+0x46a/0x6d0
> [<ffffffff8060510a>] do_page_fault+0x3ca/0x830
> [<ffffffff80602add>] error_exit+0x0/0xa9
> [<ffffffffffffffff>] 0xffffffffffffffff

mmap_sem -> iprune_mutex -> xfs_ilock

> -> #0 (iprune_mutex){--..}:
> [<ffffffff80276817>] __lock_acquire+0xa47/0x1020
> [<ffffffff80276e55>] lock_acquire+0x65/0x90
> [<ffffffff8060059a>] mutex_lock_nested+0xba/0x2b0
> [<ffffffff802d2a5d>] shrink_icache_memory+0x7d/0x280
> [<ffffffff8029d9c5>] shrink_slab+0x125/0x180
> [<ffffffff8029dc52>] try_to_free_pages+0x232/0x360
> [<ffffffff80297f0d>] __alloc_pages_internal+0x1ed/0x4a0
> [<ffffffff802981db>] __alloc_pages+0xb/0x10
> [<ffffffff8029a6b6>] __do_page_cache_readahead+0x136/0x230
> [<ffffffff8029aa08>] ondemand_readahead+0x128/0x1f0
> [<ffffffff8029ab45>] page_cache_async_readahead+0x75/0xa0
> [<ffffffff80293a8a>] generic_file_aio_read+0x28a/0x610
> [<ffffffff803d78c4>] xfs_read+0x124/0x270
> [<ffffffff803d4416>] __xfs_file_read+0x46/0x50
> [<ffffffff803d4451>] xfs_file_aio_read+0x11/0x20
> [<ffffffff802bc1b1>] do_sync_read+0xf1/0x130
> [<ffffffff802bca74>] vfs_read+0xc4/0x160
> [<ffffffff802bcf10>] sys_read+0x50/0x90
> [<ffffffff8022639b>] system_call_after_swapgs+0x7b/0x80
> [<ffffffffffffffff>] 0xffffffffffffffff

xfs_ilock -> iprune_mutex

This is exactly the situation I mentioned in the previous email.
There is no potential deadlock here between the xfs_ilock and
iprune_mutex, as the xfs_ilock held before taking iprune_mutex and
the xfs_ilock taken under it are guaranteed to be on different
inodes (the first inode is in use, so it will never be found by
shrink_icache_memory())...
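
To make that argument concrete, here is a minimal userspace sketch of
it. This is a toy model, not kernel code: struct toy_inode,
toy_shrink_icache() and toy_read_with_reclaim() are hypothetical names
made up for illustration, and a plain flag stands in for the real
i_iolock. It assumes only what is stated above, namely that reclaim
skips any inode that is in use.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name in this sketch is hypothetical. */
struct toy_inode {
    int  refcount;     /* > 0 means the inode is in use */
    bool iolock_held;  /* stands in for ip->i_iolock */
};

/*
 * Stand-in for the reclaim scan. Only unreferenced inodes are
 * candidates, so an inode whose iolock is held by an active reader
 * (refcount > 0) is skipped. Returns the number of inodes reclaimed,
 * or -1 if the scan would have to take an iolock that is already
 * held, which is the deadlock the lockdep report describes.
 */
static int toy_shrink_icache(struct toy_inode *inodes, size_t n)
{
    int reclaimed = 0;

    for (size_t i = 0; i < n; i++) {
        if (inodes[i].refcount > 0)
            continue;                   /* in use: never found by reclaim */
        if (inodes[i].iolock_held)
            return -1;                  /* would deadlock on this inode */
        inodes[i].iolock_held = true;   /* lock for teardown */
        inodes[i].iolock_held = false;  /* unlock once torn down */
        reclaimed++;
    }
    return reclaimed;
}

/*
 * Stand-in for a read that holds the iolock of the inode it is
 * reading and then enters reclaim through a page allocation.
 */
static int toy_read_with_reclaim(struct toy_inode *inodes, size_t n,
                                 size_t reading)
{
    int reclaimed;

    inodes[reading].refcount++;          /* the reader pins the inode */
    inodes[reading].iolock_held = true;  /* xfs_ilock in the read path */
    reclaimed = toy_shrink_icache(inodes, n);
    inodes[reading].iolock_held = false;
    inodes[reading].refcount--;
    return reclaimed;
}
```

In this model the reader's inode is always skipped by the scan, so
the two locks of the same class never refer to the same inode.
Lockdep tracks lock classes rather than individual lock instances,
which is why it still reports the cycle.
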
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com