From: Steven Rostedt <rostedt@goodmis.org>
To: Valdis.Kletnieks@vt.edu
Cc: Janos Haar <djani22@netcenter.hu>,
	Jesper Juhl <jesper.juhl@gmail.com>,
	linux-kernel@vger.kernel.org, xfs-masters@oss.sgi.com,
	nathans@sgi.com, linux-xfs@oss.sgi.com
Subject: Re: How to send a break? - dump from frozen 64bit linux
Date: Tue, 30 May 2006 21:20:31 -0400
Message-ID: <1149038431.21827.20.camel@localhost.localdomain>
In-Reply-To: <200605301903.k4UJ3xQU008919@turing-police.cc.vt.edu>

Added all those listed in the MAINTAINERS file for XFS.

On Tue, 2006-05-30 at 15:03 -0400, Valdis.Kletnieks@vt.edu wrote:
> On Tue, 30 May 2006 12:22:01 +0200, Janos Haar said:
> 
> > http://download.netcenter.hu/bughunt/20060530/dump.txt  (The frozen system,
> > 540KB)
> 
> > Can somebody tell me, whats wrong?
> 
> kblockd/1     D ffff81011f641778     0    25     19            26    24 (L-TLB)
> ffff81011f641778 0000000000000000 0000000000000009 ffff81011f735358 
>        ffff81011f735140 ffff81011fc79100 000014a00f9a0ef2 00000000000410dd 
>        0000000102866d40 ffff810003900280 
> Call Trace: <ffffffff8026d72a>{xfs_qm_shake+135} <ffffffff804e6046>{__mutex_lock_slowpath+424}
>        <ffffffff804e62e4>{mutex_lock+41} <ffffffff8026d72a>{xfs_qm_shake+135}
>        <ffffffff80157cfd>{shrink_slab+100} <ffffffff801584d9>{try_to_free_pages+372}
>        <ffffffff80153c3f>{__alloc_pages+432} <ffffffff8046aef3>{tcp_sendmsg+1373}
>        <ffffffff804848ad>{inet_sendmsg+70} <ffffffff8043f619>{sock_sendmsg+270}
>        <ffffffff8013d3e0>{autoremove_wake_function+0} <ffffffff80440db3>{kernel_sendmsg+61}
>        <ffffffff8802c111>{:nbd:sock_xmit+273} <ffffffff8015195d>{mempool_alloc_slab+17}
>        <ffffffff80169b1b>{poison_obj+39} <ffffffff8015195d>{mempool_alloc_slab+17}
>        <ffffffff80169c11>{cache_alloc_debugcheck_after+235}
>        <ffffffff8015195d>{mempool_alloc_slab+17} <ffffffff802da471>{as_remove_queued_request+267}
>        <ffffffff8802c472>{:nbd:nbd_send_req+517} <ffffffff8802c712>{:nbd:do_nbd_request+329}
>        <ffffffff802d9b45>{as_work_handler+46} <ffffffff80139d30>{run_workqueue+168}
>        <ffffffff802d9b17>{as_work_handler+0} <ffffffff8013a27f>{worker_thread+0}
>        <ffffffff8013a383>{worker_thread+260} <ffffffff80123fa4>{default_wake_function+0}
>        <ffffffff8013a27f>{worker_thread+0} <ffffffff8013d29f>{kthread+219}
>        <ffffffff8012590d>{schedule_tail+70} <ffffffff8010bba6>{child_rip+8}
>        <ffffffff8013d1c4>{kthread+0} <ffffffff8010bb9e>{child_rip+0}
> 
> Half the processes on the box seem wedged at that same mutex_lock. I can't
> seem to find an xfs_qm_shake in my source tree though.

What everyone is waiting for is being blocked here:

kswapd0       D ffff81011fe03c38     0   297      1          1287    19 (L-TLB)
ffff81011fe03c38 0000000000000004 000000000000000a ffff81011f92ba68
       ffff81011f92b850 ffffffff805a23a0 0000149f99fa7d7c 000000000003bcde
       000000002f2c46e0 ffff81008bc37180
Call Trace: <ffffffff804e5522>{schedule_timeout+34}
       <ffffffff80269f87>{xfs_qm_dqunpin_wait+220} <ffffffff80140e74>{debug_mutex_free_waiter+141}
       <ffffffff80123fa4>{default_wake_function+0} <ffffffff80268ca5>{xfs_qm_dqflush+70}
       <ffffffff8026d7a7>{xfs_qm_shake+260} <ffffffff80157cfd>{shrink_slab+100}
       <ffffffff8015801e>{balance_pgdat+559} <ffffffff801582e8>{kswapd+283}
       <ffffffff8013d3e0>{autoremove_wake_function+0} <ffffffff804e6a80>{_spin_unlock_irq+9}
       <ffffffff8012590d>{schedule_tail+70} <ffffffff8010bba6>{child_rip+8}
       <ffffffff801581cd>{kswapd+0} <ffffffff8010bb9e>{child_rip+0}


It seems that kswapd0 holds the lock in question and has put itself to
sleep waiting to be woken up.  I don't know the xfs code very well, but
kswapd0 seems to be in this function:

xfs_qm_dqunpin_wait(
	xfs_dquot_t	*dqp)
{
	SPLDECL(s);

	ASSERT(XFS_DQ_IS_LOCKED(dqp));
	if (dqp->q_pincount == 0) {
		return;
	}

	/*
	 * Give the log a push so we don't wait here too long.
	 */
	xfs_log_force(dqp->q_mount, (xfs_lsn_t)0, XFS_LOG_FORCE);
	s = XFS_DQ_PINLOCK(dqp);
	if (dqp->q_pincount == 0) {
		XFS_DQ_PINUNLOCK(dqp, s);
		return;
	}
	sv_wait(&(dqp->q_pinwait), PINOD,
		&(XFS_DQ_TO_QINF(dqp)->qi_pinlock), s);
}


Where sv_wait is:

#define sv_wait(sv, pri, lock, s) \
	_sv_wait(sv, lock, TASK_UNINTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT)

And our macro hell goes further ...

static inline void _sv_wait(sv_t *sv, spinlock_t *lock, int state,
			     unsigned long timeout)
{
	DECLARE_WAITQUEUE(wait, current);

	add_wait_queue_exclusive(&sv->waiters, &wait);
	__set_current_state(state);
	spin_unlock(lock);

	schedule_timeout(timeout);

	remove_wait_queue(&sv->waiters, &wait);
}


So it is now waiting to be woken up by something that calls
xfs_qm_dquot_logitem_unpin, which seems to be the function that wakes it
up.

And deciphering all the macro crap, it seems that the wake-up ultimately
comes from xfs_trans_chunk_committed or xfs_trans_uncommit.
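
For anyone not used to this pattern, here is a rough userspace sketch of
the same wait/wake shape using pthreads.  It is not the XFS code: the
struct and the obj_* functions are made-up names, the mutex stands in
for qi_pinlock, the condition variable for q_pinwait, and the counter
for q_pincount.  The waiter checks the count under the lock and sleeps
until the unpin side drops it to zero and wakes it.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace stand-in for an object with a pin count. */
struct pinned_obj {
	pthread_mutex_t lock;     /* plays the role of qi_pinlock */
	pthread_cond_t  unpinned; /* plays the role of q_pinwait  */
	int             pincount; /* plays the role of q_pincount */
};

static void obj_init(struct pinned_obj *o)
{
	pthread_mutex_init(&o->lock, NULL);
	pthread_cond_init(&o->unpinned, NULL);
	o->pincount = 0;
}

/* Take a pin; the object must not be flushed while pinned. */
static void obj_pin(struct pinned_obj *o)
{
	pthread_mutex_lock(&o->lock);
	o->pincount++;
	pthread_mutex_unlock(&o->lock);
}

/* Drop a pin and wake all waiters once the last pin is gone. */
static void obj_unpin(struct pinned_obj *o)
{
	pthread_mutex_lock(&o->lock);
	assert(o->pincount > 0);
	if (--o->pincount == 0)
		pthread_cond_broadcast(&o->unpinned);
	pthread_mutex_unlock(&o->lock);
}

/*
 * Sleep until the pin count reaches zero.  pthread_cond_wait() drops
 * the mutex and sleeps atomically, and the loop rechecks the count
 * after every wake-up.  If nobody ever calls obj_unpin(), this never
 * returns.
 */
static void obj_unpin_wait(struct pinned_obj *o)
{
	pthread_mutex_lock(&o->lock);
	while (o->pincount != 0)
		pthread_cond_wait(&o->unpinned, &o->lock);
	pthread_mutex_unlock(&o->lock);
}
```

The point of the sketch is only that the waiter depends entirely on the
unpin side running; whatever the waiter holds while it sleeps stays
held until then.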


The above xfs_qm_dqunpin_wait still looks awfully racy, and the
xfs_log_force, which I'm assuming wakes up whoever is supposed to wake up
kswapd0, doesn't have a return code check.  So if it failed to do
whatever the hell it's doing (that code gives me a headache), it looks
like this guy might sleep forever holding a lock that will prevent
others from freeing kernel memory.

Well that's about all I can figure out.

Good luck,

-- Steve


