public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] xfs: prevent spurious "head behind tail" warnings
@ 2013-11-19 22:37 Dave Chinner
  2013-11-19 23:08 ` Mark Tinguely
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2013-11-19 22:37 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

When xlog_space_left() cracks the grant head and the log tail, it
does so without locking to synchronise the sampling of the
variables. It samples the grant head first, so if there is a delay
before it smaples the log tail, there is a window where the log tail
could have moved onwards and be moved past the sampled value of the
grant head. This then leads to the "xlog_space_left: head behind
tail" warning message.

To avoid spurious output in this situation, swap the order in which
the variables are cracked. This means that the head may grant head
may move if there is a delay, but the log tail will be stable, hence
ensure the tail does not jump the head accidentally.

While this avoids the spurious head behind tail problem, it
introduces the opposite problem - the head can move more than a full
cycle past the tail. The code already handles this case by
indicating that the log is full (i.e. zero space available) but
that's still (generally) a spurious situation.

Hence, if we detect that the head is more than a cycle ahead of the
tail or the head is behind the tail, start the calculation again by
resampling the variables and trying again. If we get too many
resamples, then throw a warning and return a full or empty log
appropriately.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_log.c | 102 ++++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 78 insertions(+), 24 deletions(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 8497a00..0cfcc20 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -1113,37 +1113,91 @@ xlog_space_left(
 	struct xlog	*log,
 	atomic64_t	*head)
 {
-	int		free_bytes;
 	int		tail_bytes;
 	int		tail_cycle;
 	int		head_cycle;
 	int		head_bytes;
+	int		retries = 0;
+#define XLOG_SPACE_MAX_RETRIES	5
+
+	do {
+		/*
+		 * sample tail before head to avoid spurious "head behind tail"
+		 * warnings due to racing tail updates. We dump a memory barrier
+		 * here to make sure we pick up the latest values that have been
+		 * written by other threads each time through.
+		 */
+		smp_mb();
+		xlog_crack_atomic_lsn(&log->l_tail_lsn, &tail_cycle,
+				      &tail_bytes);
+		tail_bytes = BBTOB(tail_bytes);
+		xlog_crack_grant_head(head, &head_cycle, &head_bytes);
+
+		/*
+		 * The tail cycle should always be the same or one less than the
+		 * head cycle. If not, restart again because we've likely had a
+		 * race with an update during sampling.
+		 */
+		if (tail_cycle + 1 < head_cycle || tail_cycle > head_cycle)
+			continue;
+
+		/*
+		 * If the tail is in previous cycle, space available is the
+		 * region between the tail and the head.
+		 *
+		 *	     H		 T
+		 *    +2222222111111111111111+
+		 *            +-- free --+
+		 */
+		if (tail_cycle < head_cycle) {
+			ASSERT(tail_cycle == (head_cycle - 1));
+			return tail_bytes - head_bytes;
+		}
+
+		/*
+		 * If the tail is in the same cycle, space available is the
+		 * regions outside the range between the tail and the head:
+		 *
+		 *	   T	       H
+		 *    +2222222222222222211111+
+		 *          +-- used --+
+		 *    +free+		+free+
+		 *
+		 * If the head and tail are the same, then the whole log is
+		 * free.
+		 */
+		if (tail_cycle == head_cycle && head_bytes >= tail_bytes)
+			return log->l_logsize - (head_bytes - tail_bytes);
 
-	xlog_crack_grant_head(head, &head_cycle, &head_bytes);
-	xlog_crack_atomic_lsn(&log->l_tail_lsn, &tail_cycle, &tail_bytes);
-	tail_bytes = BBTOB(tail_bytes);
-	if (tail_cycle == head_cycle && head_bytes >= tail_bytes)
-		free_bytes = log->l_logsize - (head_bytes - tail_bytes);
-	else if (tail_cycle + 1 < head_cycle)
-		return 0;
-	else if (tail_cycle < head_cycle) {
-		ASSERT(tail_cycle == (head_cycle - 1));
-		free_bytes = tail_bytes - head_bytes;
-	} else {
 		/*
-		 * The reservation head is behind the tail.
-		 * In this case we just want to return the size of the
-		 * log as the amount of space left.
+		 * The head is behind the tail. Invalid, so let's go around
+		 * again and resample.
 		 */
-		xfs_alert(log->l_mp,
-			"xlog_space_left: head behind tail\n"
-			"  tail_cycle = %d, tail_bytes = %d\n"
-			"  GH   cycle = %d, GH   bytes = %d",
-			tail_cycle, tail_bytes, head_cycle, head_bytes);
-		ASSERT(0);
-		free_bytes = log->l_logsize;
-	}
-	return free_bytes;
+	} while (retries++ < XLOG_SPACE_MAX_RETRIES);
+
+	/*
+	 * OK, we're in trouble now - the head and tail are out of sync. Time to
+	 * issue a warning about it and return a default free space value
+	 * depending on whether the tail is ahead or behind the log.
+	 */
+	ASSERT(retries >= XLOG_SPACE_MAX_RETRIES);
+	xfs_alert(log->l_mp,
+		  "%s: The grant head is %s the log tail.\n"
+		  "  tail_cycle = %d, tail_bytes = %d\n"
+		  "  GH   cycle = %d, GH   bytes = %d",
+		  __func__,
+		  tail_cycle + 1 < head_cycle ? "in front of" : "behind",
+		  tail_cycle, tail_bytes, head_cycle, head_bytes);
+	ASSERT(0);
+
+	/*
+	 * If the head is in front, there is not space left to advance it so we
+	 * consider the log as being full. If the tail is behind, then we assume
+	 * that the whole log is available.
+	 */
+	if (tail_cycle + 1 < head_cycle)
+		return 0;
+	return log->l_logsize;
 }
 
 
-- 
1.8.4.rc3

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2013-11-22  2:58 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-11-19 22:37 [PATCH] xfs: prevent spurious "head behind tail" warnings Dave Chinner
2013-11-19 23:08 ` Mark Tinguely
2013-11-19 23:24   ` Eric Sandeen
2013-11-19 23:44     ` Mark Tinguely
2013-11-21  2:19       ` Dave Chinner
2013-11-21  3:16         ` Dave Chinner
2013-11-21  1:42   ` Dave Chinner
2013-11-21 13:01     ` Dave Chinner
2013-11-22  2:57       ` Dave Chinner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox