From: Dave Chinner <david@fromorbit.com>
To: Peter Leckie <pleckie@sgi.com>
Cc: xfs@oss.sgi.com, xfs-dev@sgi.com
Subject: Re: [PATCH] Use atomic_t and wait_event to track dquot pincount
Date: Wed, 24 Sep 2008 17:43:17 +1000 [thread overview]
Message-ID: <20080924074317.GJ5448@disturbed> (raw)
In-Reply-To: <48D9E3E2.3040504@sgi.com>
On Wed, Sep 24, 2008 at 04:53:22PM +1000, Peter Leckie wrote:
> Dave Chinner wrote:
>> [ Pete, please wrap your text at 68-72 columns]
>>
> Ok will do in future
>> On Wed, Sep 24, 2008 at 02:28:13PM +1000, Peter Leckie wrote:
>>
>>> The reason the lsn had changed was due to it not being initialized
>>> by the time a copy of the lsn was taken in xfs_qm_dqflush().
>>> The lsn was then latter updated causing the test in
>>> xfs_qm_dqflush_done() to fail.
>>>
>>> Synchronizations between the 2 functions is done by the pincount
>>> in struct xfs_dquot_t and xfs_qm_dqflush() calls
>>> xfs_qm_dqunpin_wait() to wait for the pincount == 0. However after
>>> been woken up it does not validate the pincount is actually 0,
>>>
>>
>> Sure - that's because we only ever send a wakeup when the pin count
>> falls to zero. Because we have to be holding the dquot lock when we
>> either pin a dquot or wait for it to be unpinned, the act of waiting
>> for it to be unpinned with the dquot lock held guarantees that it
>> is not pinned when we wake up.
>>
>> IOWs, the pin count cannot be incremented while we are waiting for
>> it to be unpinned, and hence it must be zero when we are woken......
>>
>>
>>> allowing a false wake up by the scheduler to cause
>>> xfs_qm_dqflush() to prematurely start processing the dquot.
>>>
>>
>> .... which means I can't see how that would happen...
>>
>> What am I missing here?
>>
> Yeah it's quite intriguing isn't it, I added the pincount to the
> dquot tracing code and sure enough it's 1 all the way through
> xfs_qm_dqflush() I can tell you that none of the XFS code is
> causing it to wake up.
Only the XFS code can cause it to be woken....
> I was going to add some tracing code to
> the scheduler to determine who is causing us to wake up however
> I had other priorities to work on.
>
> Reading the code it appears things like a threads dying or a
> system suspending can cause it.
Wait queues are not affected by such events when they are
configured to be uninterruptable. The sv_t:
#define sv_wait(sv, pri, lock, s) \
_sv_wait(sv, lock, TASK_UNINTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT)
Is as uninterruptible as it comes. Which means only an explicit
wakeup will cause any waiters to wake up.
> However none of these had
> happened, after reading other examples in the Linux kernel
> it appeared pretty standard to always re-evaluate the condition
> after being woken so I suspect that Linux simply does not
> guarantee that we will only be woken by the thread calling
> wake_up on us.
Not true - wake_up() will only wake tasks on the wait queue
it was called for.
> Any insight into this would be much appreciated as I'm also very curious
> as to why this is happening.
Assert fail the machine when a dquot with a non-zero pincount is
woken in xfs_qm_dqunpin_wait(). If the assert trips, then we need
to work out how the pin count is getting elevated while we have
a waiter....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2008-09-24 7:42 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-09-24 4:28 [PATCH] Use atomic_t and wait_event to track dquot pincount Peter Leckie
2008-09-24 6:05 ` Dave Chinner
2008-09-24 6:53 ` Peter Leckie
2008-09-24 7:43 ` Dave Chinner [this message]
2008-09-24 7:26 ` [PATCH v2] " Peter Leckie
2008-09-24 7:42 ` Lachlan McIlroy
2008-09-24 7:46 ` Dave Chinner
2008-09-24 8:03 ` Lachlan McIlroy
2008-09-24 14:42 ` Christoph Hellwig
2008-09-24 8:15 ` Peter Leckie
2008-09-25 1:03 ` Dave Chinner
2008-09-25 8:43 ` Peter Leckie
2008-09-25 9:12 ` Christoph Hellwig
2008-09-26 0:34 ` Dave Chinner
2008-09-26 1:09 ` Peter Leckie
2008-09-26 1:26 ` Lachlan McIlroy
2008-09-27 1:08 ` Dave Chinner
2008-09-26 1:32 ` Lachlan McIlroy
2008-09-26 1:38 ` Peter Leckie
2008-09-26 1:44 ` Mark Goodwin
2008-09-26 1:54 ` Peter Leckie
2008-09-26 11:31 ` Christoph Hellwig
2008-09-26 2:57 ` Dave Chinner
2008-09-26 3:38 ` Lachlan McIlroy
2008-09-27 1:11 ` Dave Chinner
2008-09-26 11:30 ` Christoph Hellwig
2008-09-26 11:27 ` Christoph Hellwig
2008-09-27 1:18 ` Dave Chinner
2008-09-26 1:10 ` Lachlan McIlroy
2008-09-26 11:28 ` Christoph Hellwig
2008-09-29 3:08 ` Lachlan McIlroy
2008-09-29 21:45 ` Christoph Hellwig
2008-09-24 14:41 ` Christoph Hellwig
2008-09-25 1:08 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080924074317.GJ5448@disturbed \
--to=david@fromorbit.com \
--cc=pleckie@sgi.com \
--cc=xfs-dev@sgi.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.