From: Dave Chinner <david@fromorbit.com>
To: Peter Leckie <pleckie@sgi.com>
Cc: xfs@oss.sgi.com, xfs-dev@sgi.com
Subject: Re: [PATCH] Use atomic_t and wait_event to track dquot pincount
Date: Wed, 24 Sep 2008 17:43:17 +1000 [thread overview]
Message-ID: <20080924074317.GJ5448@disturbed> (raw)
In-Reply-To: <48D9E3E2.3040504@sgi.com>
On Wed, Sep 24, 2008 at 04:53:22PM +1000, Peter Leckie wrote:
> Dave Chinner wrote:
>> [ Pete, please wrap your text at 68-72 columns]
>>
> Ok will do in future
>> On Wed, Sep 24, 2008 at 02:28:13PM +1000, Peter Leckie wrote:
>>
>>> The reason the lsn had changed was due to it not being initialized
>>> by the time a copy of the lsn was taken in xfs_qm_dqflush().
>>> The lsn was then latter updated causing the test in
>>> xfs_qm_dqflush_done() to fail.
>>>
>>> Synchronizations between the 2 functions is done by the pincount
>>> in struct xfs_dquot_t and xfs_qm_dqflush() calls
>>> xfs_qm_dqunpin_wait() to wait for the pincount == 0. However after
>>> been woken up it does not validate the pincount is actually 0,
>>>
>>
>> Sure - that's because we only ever send a wakeup when the pin count
>> falls to zero. Because we have to be holding the dquot lock when we
>> either pin a dquot or wait for it to be unpinned, the act of waiting
>> for it to be unpinned with the dquot lock held guarantees that it
>> is not pinned when we wake up.
>>
>> IOWs, the pin count cannot be incremented while we are waiting for
>> it to be unpinned, and hence it must be zero when we are woken......
>>
>>
>>> allowing a false wake up by the scheduler to cause
>>> xfs_qm_dqflush() to prematurely start processing the dquot.
>>>
>>
>> .... which means I can't see how that would happen...
>>
>> What am I missing here?
>>
> Yeah it's quite intriguing isn't it, I added the pincount to the
> dquot tracing code and sure enough it's 1 all the way through
> xfs_qm_dqflush() I can tell you that none of the XFS code is
> causing it to wake up.
Only the XFS code can cause it to be woken....
> I was going to add some tracing code to
> the scheduler to determine who is causing us to wake up however
> I had other priorities to work on.
>
> Reading the code it appears things like a threads dying or a
> system suspending can cause it.
Wait queues are not affected by such events when they are
configured to be uninterruptable. The sv_t:
#define sv_wait(sv, pri, lock, s) \
_sv_wait(sv, lock, TASK_UNINTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT)
Is as uninterruptible as it comes. Which means only an explicit
wakeup will cause any waiters to wake up.
> However none of these had
> happened, after reading other examples in the Linux kernel
> it appeared pretty standard to always re-evaluate the condition
> after being woken so I suspect that Linux simply does not
> guarantee that we will only be woken by the thread calling
> wake_up on us.
Not true - wake_up() will only wake tasks on the wait queue
it was called for.
> Any insight into this would be much appreciated as I'm also very curious
> as to why this is happening.
Assert fail the machine when a dquot with a non-zero pincount is
woken in xfs_qm_dqunpin_wait(). If the assert trips, then we need
to work out how the pin count is getting elevated while we have
a waiter....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2008-09-24 7:42 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-09-24 4:28 [PATCH] Use atomic_t and wait_event to track dquot pincount Peter Leckie
2008-09-24 6:05 ` Dave Chinner
2008-09-24 6:53 ` Peter Leckie
2008-09-24 7:43 ` Dave Chinner [this message]
2008-09-24 7:26 ` [PATCH v2] " Peter Leckie
2008-09-24 7:42 ` Lachlan McIlroy
2008-09-24 7:46 ` Dave Chinner
2008-09-24 8:03 ` Lachlan McIlroy
2008-09-24 14:42 ` Christoph Hellwig
2008-09-24 8:15 ` Peter Leckie
2008-09-25 1:03 ` Dave Chinner
2008-09-25 8:43 ` Peter Leckie
2008-09-25 9:12 ` Christoph Hellwig
2008-09-26 0:34 ` Dave Chinner
2008-09-26 1:09 ` Peter Leckie
2008-09-26 1:26 ` Lachlan McIlroy
2008-09-27 1:08 ` Dave Chinner
2008-09-26 1:32 ` Lachlan McIlroy
2008-09-26 1:38 ` Peter Leckie
2008-09-26 1:44 ` Mark Goodwin
2008-09-26 1:54 ` Peter Leckie
2008-09-26 11:31 ` Christoph Hellwig
2008-09-26 2:57 ` Dave Chinner
2008-09-26 3:38 ` Lachlan McIlroy
2008-09-27 1:11 ` Dave Chinner
2008-09-26 11:30 ` Christoph Hellwig
2008-09-26 11:27 ` Christoph Hellwig
2008-09-27 1:18 ` Dave Chinner
2008-09-26 1:10 ` Lachlan McIlroy
2008-09-26 11:28 ` Christoph Hellwig
2008-09-29 3:08 ` Lachlan McIlroy
2008-09-29 21:45 ` Christoph Hellwig
2008-09-24 14:41 ` Christoph Hellwig
2008-09-25 1:08 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080924074317.GJ5448@disturbed \
--to=david@fromorbit.com \
--cc=pleckie@sgi.com \
--cc=xfs-dev@sgi.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox