From: Christoph Hellwig <hch@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@infradead.org>,
Chandra Seetharaman <sekharan@us.ibm.com>,
XFS Mailing List <xfs@oss.sgi.com>
Subject: Re: [PATCH] xfs: Remove the entries from the queue while waking them up.
Date: Sun, 20 Nov 2011 06:43:42 -0500 [thread overview]
Message-ID: <20111120114342.GA11278@infradead.org> (raw)
In-Reply-To: <20111120082235.GN7046@dastard>
On Sun, Nov 20, 2011 at 07:22:35PM +1100, Dave Chinner wrote:
> Be nice to factor these into a single function - they do exactly the
> same thing except for the queue they walk and the need_bytes
> calculation.
I thought about it, and I think we can do it as a broader refactoring
of the two list/lock/head tuples for the reserve/grant queues. But
I'd rather do a simple minimal patch that can easily be backported
first.
>
> > +
> > void
> > xfs_log_move_tail(xfs_mount_t	*mp,
> > 		  xfs_lsn_t	tail_lsn)
>
> Also, wouldn't it be a good idea to use these helpers in
> xfs_log_move_tail() and remove the code duplication from there as
> well? i.e. the four places we do this open coded wakeup walk could
> be replaced with a single implementation....
Yes, I even hinted at that in the changelog. Doing it cleanly requires
removing the opportunistic wakers first. Fortunately I already
have patches for that, which I plan to submit for 3.3. We can easily
do that factoring as part of that patchset.
> Same with this function and xlog_writeq_wait - they are identical
> code just operating on a different grant head. If we had
>
> struct xlog_grant_head {
> 	spinlock_t		lock;
> 	struct list_head	queue;
> 	atomic64_t		head;
> };
>
> Rather than the current open-coded definitions, then this could all
> share the same code rather than continuing to duplicate it. If we've
> got to restructure the code to fix the bug, it makes sense to me to
> remove the code duplication as well...
I agree. But let's fix the issue first and do the cleanups later.
> .....
> >  	free_bytes = xlog_space_left(log, &log->l_grant_reserve_head);
> > +	if (!list_empty_careful(&log->l_reserveq)) {
> >  		spin_lock(&log->l_grant_reserve_lock);
> .....
> > +		if (!xlog_wake_reserveq(log, free_bytes))
> > +			error = xlog_reserveq_wait(log, tic, need_bytes);
> > +		spin_unlock(&log->l_grant_reserve_lock);
> > +	} else if (free_bytes < need_bytes) {
> >  		spin_lock(&log->l_grant_reserve_lock);
> > -		list_del_init(&tic->t_queue);
> > +		error = xlog_reserveq_wait(log, tic, need_bytes);
> >  		spin_unlock(&log->l_grant_reserve_lock);
> >  	}
>
> I don't think this logic is quite correct. When we have a waiter on
> the queue, we try to wake all the waiters before sleeping if that
> did not succeed. If we woke everyone, we allow this process to
> continue. The problem is that we don't check to see if there is
> space remaining for this process to continue after doing all the
> wakeups.
>
> i.e. if need_bytes = 10000 and free_bytes = 50000, and the waiters
> on the queue total 45000 bytes, then we fall through with free_bytes
> = 5000 and need_bytes = 10000. In that case, we should have gone to
> sleep because we are 5000 bytes short of our required space but we
> don't and potentially deadlock due to oversubscribing the log space.
>
> IOWs, xlog_wake_reserveq() needs to return the amount of remaining
> free space after the wakeups so we can then run the if (free_bytes <
> need_bytes) check unconditionally....
Indeed. I think the better way to do it is to make the free_bytes
argument of xlog_wake_reserveq a pointer, so that the caller sees the
updated value, and then do:
	free_bytes = xlog_space_left(log, &log->l_grant_reserve_head);
	if (!list_empty_careful(&log->l_reserveq)) {
		spin_lock(&log->l_grant_reserve_lock);
		if (!xlog_wake_reserveq(log, &free_bytes) ||
		    free_bytes < need_bytes)
			error = xlog_reserveq_wait(log, tic, need_bytes);
		spin_unlock(&log->l_grant_reserve_lock);
	} else if (free_bytes < need_bytes) {
		spin_lock(&log->l_grant_reserve_lock);
		list_del_init(&tic->t_queue);
		error = xlog_reserveq_wait(log, tic, need_bytes);
		spin_unlock(&log->l_grant_reserve_lock);
	}
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
2011-11-18 19:20 [PATCH] xfs: Remove the entries from the queue while waking them up Chandra Seetharaman
2011-11-19 1:14 ` Dave Chinner
2011-11-21 23:01 ` Chandra Seetharaman
2011-11-19 18:19 ` Christoph Hellwig
2011-11-20 8:22 ` Dave Chinner
2011-11-20 11:43 ` Christoph Hellwig [this message]
2011-11-20 19:49 ` [PATCH v2] xfs: fix and cleanup logspace waiter lists Christoph Hellwig
2011-11-21 19:11 ` [PATCH] xfs: Remove the entries from the queue while waking them up Chandra Seetharaman
2011-11-22 10:23 ` Christoph Hellwig