* [PATCH] xfs: force background CIL push under sustained load
@ 2010-09-23 2:27 Dave Chinner
2010-09-23 10:43 ` Christoph Hellwig
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Dave Chinner @ 2010-09-23 2:27 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
I have been seeing relatively frequent pauses in transaction throughput up to
30s long under heavy parallel workloads. The only thing that seemed strange
about them was that the xfsaild was active during the pauses, but making no
progress. It was running exactly 20 times a second (on the 50ms no-progress
backoff), and the number of pushbuf events was constant across this time as
well. IOWs, the xfsaild appeared to be stuck on buffers that it could not push
out.
Further investigation indicated that it was trying to push out inode buffers
that were pinned and/or locked. The xfsbufd was also getting woken at the same
frequency (by the xfsaild, no doubt) to push out delayed write buffers. The
xfsbufd was not making any progress because all the buffers in the delwri queue
were pinned. This scan and make no progress dance went one in the trace for
some seconds, before the xfssyncd came along an issued a log force, and then
things started going again.
However, I noticed something strange about the log force - there were way too
many IO's issued. 516 log buffers were written, to be exact. That added up to
129MB of log IO, which got me very interested because it's almost exactly 25%
of the size of the log. That "25% of the log" is actually very important - it's
the distance that tail-pushing tries to keep free in the log; i.e. where it
pushes to. Further, the delayed logging code is suppose to aggregate the
minimum of 25% of the log or 8MB worth of changes before flushing. That's what
really puzzled me - why did a log force write 129MB instead of only 8MB?
Essentially what has happened is that no CIL pushes had occurred since the
previous tail push which cleared out 25% of the log space. That caused all the
new transactions to block because there wasn't log space for them, but they
kick the xfsaild to push the tail. However, the xfsaild was not making
progress because there were buffers it could not lock and flush, and the
xfsbufd could not flush them because they were pinned. As a result, both
the xfsaild and the xfsbufd could not move the tail of the log forward without
the CIL first committing.
The cause of the problem was that the background CIL push, which should happen
when 8MB of aggregated changes have been committed, is being held off by the
concurrent transaction commit load. The background push does a
down_write_trylock() which will fail if there is a concurrent transaction
commit holding the push lock in read mode. With 8 CPUs all doing transactions
as fast as they can, there was enough concurrent transaction commits to hold
off the background push until tail-pushing could no longer free log space,
and the halt would occur.
It should be noted that there is no reason why it would halt at 25% of log
space used by a single CIL checkpoint. This bug could definitely violate the
"no transaction should be larger than half the log" requirement and hence
result in corruption if the system crashed under heavy load. This sort of
bug is exactly the reason why delayed logging was tagged as experimental....
The fix is to start blocking background pushes once the threshold has been
exceeded by 50%. This is still only 37.5% of log space at worst, so keeps us
well short of the limit a transaction must not exceed.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_log_cil.c | 12 +++++++++---
fs/xfs/xfs_log_priv.h | 28 ++++++++++++++++++++--------
2 files changed, 29 insertions(+), 11 deletions(-)
diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index 3c78a17..23d6ceb 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -424,9 +424,15 @@ xlog_cil_push(
new_ctx = kmem_zalloc(sizeof(*new_ctx), KM_SLEEP|KM_NOFS);
new_ctx->ticket = xlog_cil_ticket_alloc(log);
- /* lock out transaction commit, but don't block on background push */
+ /*
+ * Lock out transaction commit, but don't block for background pushes
+ * unless we are well over the CIL space limit. See the definition of
+ * XLOG_CIL_HARD_SPACE_LIMIT() for the full explanation of the logic
+ * used here.
+ */
if (!down_write_trylock(&cil->xc_ctx_lock)) {
- if (!push_seq)
+ if (!push_seq &&
+ cil->xc_ctx->space_used < XLOG_CIL_HARD_SPACE_LIMIT(log))
goto out_free_ticket;
down_write(&cil->xc_ctx_lock);
}
@@ -441,7 +447,7 @@ xlog_cil_push(
goto out_skip;
/* check for a previously pushed seqeunce */
- if (push_seq < cil->xc_ctx->sequence)
+ if (push_seq && push_seq < cil->xc_ctx->sequence)
goto out_skip;
/*
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index ced52b9..9776956 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -460,15 +460,27 @@ struct xfs_cil {
* than the CIL itself.
*
* With dynamic reservations, we can basically make up arbitrary limits for the
- * checkpoint size so long as they don't violate any other size rules. Hence
- * the initial maximum size for the checkpoint transaction will be set to a
- * quarter of the log or 8MB, which ever is smaller. 8MB is an arbitrary limit
- * right now based on the latency of writing out a large amount of data through
- * the circular iclog buffers.
+ * checkpoint size so long as they don't violate any other size rules. A key
+ * size we should try to keep below is the AIL push threshold which is 25% of
+ * the log. That is how much empty space in the log the transaction reservation
+ * subsystem tries to keep free. If we make transactions larger than that, then
+ * we risk requiring more space for the transaction commit than the AIL will
+ * try to free for us. Hence the maximum size we should allow is under 25% of
+ * the log. This will also keep us below the "no transaction shoul dbe more
+ * than half the log" rule that recovery requires us to keep.
+ *
+ * Further, we need to make sure the background CIL push is efficient, which
+ * means we need to give the background push a chance to commit without
+ * blocking all the current transaction commits. Hence we need some space
+ * between the threshold and the 25% limit to allow background pushes to be
+ * tried, but not enforced. To make this simple and fast to calculate, set
+ * the background push threshold to 1/8th (12.5%) the size of the log, and then start
+ * enforcing the background push at 50% above this. i.e. at 3/16th or 18.75% of
+ * the log size. This should keep us well under the limits of the AIL pushing
+ * threshold, yet give us plenty of space for aggregation on large logs.
*/
-
-#define XLOG_CIL_SPACE_LIMIT(log) \
- (min((log->l_logsize >> 2), (8 * 1024 * 1024)))
+#define XLOG_CIL_SPACE_LIMIT(log) (log->l_logsize >> 3)
+#define XLOG_CIL_HARD_SPACE_LIMIT(log) (3 * (log->l_logsize >> 4))
/*
* The reservation head lsn is not made up of a cycle number and block number.
--
1.7.1
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH] xfs: force background CIL push under sustained load
2010-09-23 2:27 [PATCH] xfs: force background CIL push under sustained load Dave Chinner
@ 2010-09-23 10:43 ` Christoph Hellwig
2010-09-23 18:53 ` Alex Elder
2010-09-23 18:58 ` Alex Elder
2 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2010-09-23 10:43 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Thu, Sep 23, 2010 at 12:27:43PM +1000, Dave Chinner wrote:
> The fix is to start blocking background pushes once the threshold has been
> exceeded by 50%. This is still only 37.5% of log space at worst, so keeps us
> well short of the limit a transaction must not exceed.
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
> + * the background push threshold to 1/8th (12.5%) the size of the log, and then start
if you can please shorten this line a bit for the final version.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] xfs: force background CIL push under sustained load
2010-09-23 2:27 [PATCH] xfs: force background CIL push under sustained load Dave Chinner
2010-09-23 10:43 ` Christoph Hellwig
@ 2010-09-23 18:53 ` Alex Elder
2010-09-24 1:18 ` Dave Chinner
2010-09-23 18:58 ` Alex Elder
2 siblings, 1 reply; 7+ messages in thread
From: Alex Elder @ 2010-09-23 18:53 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Thu, 2010-09-23 at 12:27 +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> I have been seeing relatively frequent pauses in transaction throughput up to
> 30s long under heavy parallel workloads. The only thing that seemed strange
> about them was that the xfsaild was active during the pauses, but making no
> progress. It was running exactly 20 times a second (on the 50ms no-progress
> backoff), and the number of pushbuf events was constant across this time as
> well. IOWs, the xfsaild appeared to be stuck on buffers that it could not push
> out.
>
> Further investigation indicated that it was trying to push out inode buffers
> that were pinned and/or locked. The xfsbufd was also getting woken at the same
> frequency (by the xfsaild, no doubt) to push out delayed write buffers. The
> xfsbufd was not making any progress because all the buffers in the delwri queue
> were pinned. This scan and make no progress dance went one in the trace for
> some seconds, before the xfssyncd came along an issued a log force, and then
> things started going again.
>
> However, I noticed something strange about the log force - there were way too
> many IO's issued. 516 log buffers were written, to be exact. That added up to
> 129MB of log IO, which got me very interested because it's almost exactly 25%
> of the size of the log. That "25% of the log" is actually very important - it's
> the distance that tail-pushing tries to keep free in the log; i.e. where it
> pushes to. Further, the delayed logging code is suppose to aggregate the
> minimum of 25% of the log or 8MB worth of changes before flushing. That's what
> really puzzled me - why did a log force write 129MB instead of only 8MB?
>
> Essentially what has happened is that no CIL pushes had occurred since the
> previous tail push which cleared out 25% of the log space. That caused all the
> new transactions to block because there wasn't log space for them, but they
> kick the xfsaild to push the tail. However, the xfsaild was not making
> progress because there were buffers it could not lock and flush, and the
> xfsbufd could not flush them because they were pinned. As a result, both
> the xfsaild and the xfsbufd could not move the tail of the log forward without
> the CIL first committing.
>
> The cause of the problem was that the background CIL push, which should happen
> when 8MB of aggregated changes have been committed, is being held off by the
> concurrent transaction commit load. The background push does a
> down_write_trylock() which will fail if there is a concurrent transaction
> commit holding the push lock in read mode. With 8 CPUs all doing transactions
> as fast as they can, there was enough concurrent transaction commits to hold
> off the background push until tail-pushing could no longer free log space,
> and the halt would occur.
>
> It should be noted that there is no reason why it would halt at 25% of log
> space used by a single CIL checkpoint. This bug could definitely violate the
> "no transaction should be larger than half the log" requirement and hence
> result in corruption if the system crashed under heavy load. This sort of
> bug is exactly the reason why delayed logging was tagged as experimental....
>
> The fix is to start blocking background pushes once the threshold has been
> exceeded by 50%. This is still only 37.5% of log space at worst, so keeps us
> well short of the limit a transaction must not exceed.
This looks good. You're making the CIL push be controlled by
low/high water marks, basically, with a "best effort" at the
low water mark (XLOG_CIL_SPACE_LIMIT()) and forced once
XLOG_CIL_HARD_SPACE_LIMIT() has been reached. And the
important bug fix is that previously there was no point at
which the forced push would occur.
As usual I have a bit of gratuitous commentary below,
but this change looks good to me.
Reviewed-by: Alex Elder <aelder@sgi.com>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> fs/xfs/xfs_log_cil.c | 12 +++++++++---
> fs/xfs/xfs_log_priv.h | 28 ++++++++++++++++++++--------
> 2 files changed, 29 insertions(+), 11 deletions(-)
. . .
> diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
> index ced52b9..9776956 100644
> --- a/fs/xfs/xfs_log_priv.h
> +++ b/fs/xfs/xfs_log_priv.h
> @@ -460,15 +460,27 @@ struct xfs_cil {
> * than the CIL itself.
> *
> * With dynamic reservations, we can basically make up arbitrary limits for the
> - * checkpoint size so long as they don't violate any other size rules. Hence
> - * the initial maximum size for the checkpoint transaction will be set to a
> - * quarter of the log or 8MB, which ever is smaller. 8MB is an arbitrary limit
> - * right now based on the latency of writing out a large amount of data through
> - * the circular iclog buffers.
> + * checkpoint size so long as they don't violate any other size rules. A key
> + * size we should try to keep below is the AIL push threshold which is 25% of
> + * the log. That is how much empty space in the log the transaction reservation
> + * subsystem tries to keep free. If we make transactions larger than that, then
> + * we risk requiring more space for the transaction commit than the AIL will
> + * try to free for us. Hence the maximum size we should allow is under 25% of
> + * the log. This will also keep us below the "no transaction shoul dbe more
s/shoul dbe/should be/
> + * than half the log" rule that recovery requires us to keep.
> + *
> + * Further, we need to make sure the background CIL push is efficient, which
> + * means we need to give the background push a chance to commit without
> + * blocking all the current transaction commits. Hence we need some space
> + * between the threshold and the 25% limit to allow background pushes to be
> + * tried, but not enforced. To make this simple and fast to calculate, set
> + * the background push threshold to 1/8th (12.5%) the size of the log, and then start
> + * enforcing the background push at 50% above this. i.e. at 3/16th or 18.75% of
> + * the log size. This should keep us well under the limits of the AIL pushing
> + * threshold, yet give us plenty of space for aggregation on large logs.
> */
I think the above explanation is pretty good but I don't know that it's
as clear or concise as it could be. I don't claim this is better but
I'll take a shot (I don't like offering criticism without suggesting
an alternative).
* With dynamic reservations, we can basically make up arbitrary
* limits for the checkpoint size so long as they don't violate any
* other size rules. Recovery imposes a rule that no transaction
* exceed half the log, so we are limited by that. Furthermore, the
* log transaction reservation subsystem tries to keep 25% of the
* log free, so we should keep below that limit or we risk not being
* able to get the space we need.
*
* In order to keep background CIL push efficient, we will set a
* lower threshold at which background pushing is attempted without
* blocking current transaction commits. A separate, higher bound
* defines when CIL pushes are forced in order to ensure we stay
* within our transaction size limits.
> -
> -#define XLOG_CIL_SPACE_LIMIT(log) \
> - (min((log->l_logsize >> 2), (8 * 1024 * 1024)))
> +#define XLOG_CIL_SPACE_LIMIT(log) (log->l_logsize >> 3)
> +#define XLOG_CIL_HARD_SPACE_LIMIT(log) (3 * (log->l_logsize >> 4))
Maybe "LIMIT" isn't quite the right name for these two.
(But I have no better suggestion.)
I don't really care much about this, but I'll take this
opportunity for a small rant.
The difference in calculation cost/speed between "x >> 3" and
"x / 8" is vanishingly small. I think it is meaningful to use
a shift in places where a power-of-two is mandated, but in places
like this it suggests there is a constraint that simply doesn't
exist. So for example, you could have chosen (log->logsize / 10)
as the "try pushing" value, and (log->logsize / 4 - 1) as the
"must push" value.
Anyway, this is all fine as-is, it just surprises me that
people still use this idiom when it makes such a negligible
actual difference in execution.
>
> /*
> * The reservation head lsn is not made up of a cycle number and block number.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] xfs: force background CIL push under sustained load
2010-09-23 18:53 ` Alex Elder
@ 2010-09-24 1:18 ` Dave Chinner
2010-09-24 14:02 ` Alex Elder
0 siblings, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2010-09-24 1:18 UTC (permalink / raw)
To: Alex Elder; +Cc: xfs
On Thu, Sep 23, 2010 at 01:53:55PM -0500, Alex Elder wrote:
> On Thu, 2010-09-23 at 12:27 +1000, Dave Chinner wrote:
> > + * than half the log" rule that recovery requires us to keep.
> > + *
> > + * Further, we need to make sure the background CIL push is efficient, which
> > + * means we need to give the background push a chance to commit without
> > + * blocking all the current transaction commits. Hence we need some space
> > + * between the threshold and the 25% limit to allow background pushes to be
> > + * tried, but not enforced. To make this simple and fast to calculate, set
> > + * the background push threshold to 1/8th (12.5%) the size of the log, and then start
> > + * enforcing the background push at 50% above this. i.e. at 3/16th or 18.75% of
> > + * the log size. This should keep us well under the limits of the AIL pushing
> > + * threshold, yet give us plenty of space for aggregation on large logs.
> > */
>
> I think the above explanation is pretty good but I don't know that it's
> as clear or concise as it could be. I don't claim this is better but
> I'll take a shot (I don't like offering criticism without suggesting
> an alternative).
>
> * With dynamic reservations, we can basically make up arbitrary
> * limits for the checkpoint size so long as they don't violate any
> * other size rules. Recovery imposes a rule that no transaction
> * exceed half the log, so we are limited by that. Furthermore, the
> * log transaction reservation subsystem tries to keep 25% of the
> * log free, so we should keep below that limit or we risk not being
> * able to get the space we need.
> *
> * In order to keep background CIL push efficient, we will set a
> * lower threshold at which background pushing is attempted without
> * blocking current transaction commits. A separate, higher bound
> * defines when CIL pushes are forced in order to ensure we stay
> * within our transaction size limits.
Yes, makes sense. I'll rework it along these lines.
> > -
> > -#define XLOG_CIL_SPACE_LIMIT(log) \
> > - (min((log->l_logsize >> 2), (8 * 1024 * 1024)))
> > +#define XLOG_CIL_SPACE_LIMIT(log) (log->l_logsize >> 3)
> > +#define XLOG_CIL_HARD_SPACE_LIMIT(log) (3 * (log->l_logsize >> 4))
>
> Maybe "LIMIT" isn't quite the right name for these two.
> (But I have no better suggestion.)
Threshold is really the only other word that matches, but I think
limit is better here as it conveys a sense that it is something we
don't really want to cross...
> I don't really care much about this, but I'll take this
> opportunity for a small rant.
>
> The difference in calculation cost/speed between "x >> 3" and
> "x / 8" is vanishingly small.
That is architecture dependent, but in most cases these days the
compiler will optimise a divide-by-power-of−2-constant into a shift
operation anyway. I'm pretty sure that optimisation is done on even
on x86 as a shift is a single cycle operation while an integer
divide still takes several cycles and consumes more power.
> I think it is meaningful to use
> a shift in places where a power-of-two is mandated, but in places
> like this it suggests there is a constraint that simply doesn't
> exist. So for example, you could have chosen (log->logsize / 10)
> as the "try pushing" value, and (log->logsize / 4 - 1) as the
> "must push" value.
It's more the fact that XFS uses power-of-2 logic (i.e shifts)
everywhere. I just tend to be consistent with what is already there.
In this case, the AIL push thresholds are calculated using shifts:
free_threshold = MAX(free_threshold, (log->l_logBBsize >> 2));
and so when you compare that to the XLOG_CIL_SPACE_LIMIT()
definitions, it is immediately clear that the CIL limits are smaller
than the AIL push threshold...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] xfs: force background CIL push under sustained load
2010-09-24 1:18 ` Dave Chinner
@ 2010-09-24 14:02 ` Alex Elder
0 siblings, 0 replies; 7+ messages in thread
From: Alex Elder @ 2010-09-24 14:02 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Fri, 2010-09-24 at 11:18 +1000, Dave Chinner wrote:
> On Thu, Sep 23, 2010 at 01:53:55PM -0500, Alex Elder wrote:
> > On Thu, 2010-09-23 at 12:27 +1000, Dave Chinner wrote:
. . .
> > I don't really care much about this, but I'll take this
> > opportunity for a small rant.
> >
> > The difference in calculation cost/speed between "x >> 3" and
> > "x / 8" is vanishingly small.
>
> That is architecture dependent, but in most cases these days the
> compiler will optimise a divide-by-power-of−2-constant into a shift
> operation anyway. I'm pretty sure that optimisation is done on even
> on x86 as a shift is a single cycle operation while an integer
> divide still takes several cycles and consumes more power.
I know, it'll take about a nanosecond longer than necessary,
and an extra 25 picowatt hours (IF the compiler doesn't optimize
it away). This being done to determine whether to start a series
of I/O operations...
> > I think it is meaningful to use
> > a shift in places where a power-of-two is mandated, but in places
> > like this it suggests there is a constraint that simply doesn't
> > exist. So for example, you could have chosen (log->logsize / 10)
> > as the "try pushing" value, and (log->logsize / 4 - 1) as the
> > "must push" value.
>
> It's more the fact that XFS uses power-of-2 logic (i.e shifts)
> everywhere. I just tend to be consistent with what is already there.
> In this case, the AIL push thresholds are calculated using shifts:
>
> free_threshold = MAX(free_threshold, (log->l_logBBsize >> 2));
>
> and so when you compare that to the XLOG_CIL_SPACE_LIMIT()
> definitions, it is immediately clear that the CIL limits are smaller
> than the AIL push threshold...
I get it. And as I said before, I truly don't care much about
this--it's an observation more than even an annoyance, and it's
already taken a lot more time and mental energy (not to mention
electric power) than it ever really deserved.
-Alex
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] xfs: force background CIL push under sustained load
2010-09-23 2:27 [PATCH] xfs: force background CIL push under sustained load Dave Chinner
2010-09-23 10:43 ` Christoph Hellwig
2010-09-23 18:53 ` Alex Elder
@ 2010-09-23 18:58 ` Alex Elder
2010-09-24 2:15 ` Dave Chinner
2 siblings, 1 reply; 7+ messages in thread
From: Alex Elder @ 2010-09-23 18:58 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Thu, 2010-09-23 at 12:27 +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> I have been seeing relatively frequent pauses in transaction throughput up to
> 30s long under heavy parallel workloads. The only thing that seemed strange
> about them was that the xfsaild was active during the pauses, but making no
> progress. It was running exactly 20 times a second (on the 50ms no-progress
> backoff), and the number of pushbuf events was constant across this time as
> well. IOWs, the xfsaild appeared to be stuck on buffers that it could not push
> out.
. . .
If you like I can take this patch directly (i.e., not wait for you to
send a separate pull request). It fixes a real bug but since delayed
logging still an experimental feature I am not inclined to send it to
Linus at this point in the cycle. Let me know if you disagree.
-Alex
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] xfs: force background CIL push under sustained load
2010-09-23 18:58 ` Alex Elder
@ 2010-09-24 2:15 ` Dave Chinner
0 siblings, 0 replies; 7+ messages in thread
From: Dave Chinner @ 2010-09-24 2:15 UTC (permalink / raw)
To: Alex Elder; +Cc: xfs
On Thu, Sep 23, 2010 at 01:58:32PM -0500, Alex Elder wrote:
> On Thu, 2010-09-23 at 12:27 +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> >
> > I have been seeing relatively frequent pauses in transaction throughput up to
> > 30s long under heavy parallel workloads. The only thing that seemed strange
> > about them was that the xfsaild was active during the pauses, but making no
> > progress. It was running exactly 20 times a second (on the 50ms no-progress
> > backoff), and the number of pushbuf events was constant across this time as
> > well. IOWs, the xfsaild appeared to be stuck on buffers that it could not push
> > out.
>
> . . .
>
> If you like I can take this patch directly (i.e., not wait for you to
> send a separate pull request). It fixes a real bug but since delayed
> logging still an experimental feature I am not inclined to send it to
> Linus at this point in the cycle. Let me know if you disagree.
I think it needs to go to linus as well back to 2.6.35.y as it can
result in recovery silently corrupting the filesystem if a
checkpoint larger than half the log is present in the log during
recovery. I don' tthink the experimental status of the code makes
any difference, especially as we've already pushed checkpoint/
recovery corruption fixes into this release....
I'm adding it to the start of the metadata scale patchset branch
right now, which I'll probably being sending a pull request out for
later today.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2010-09-24 14:01 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-09-23 2:27 [PATCH] xfs: force background CIL push under sustained load Dave Chinner
2010-09-23 10:43 ` Christoph Hellwig
2010-09-23 18:53 ` Alex Elder
2010-09-24 1:18 ` Dave Chinner
2010-09-24 14:02 ` Alex Elder
2010-09-23 18:58 ` Alex Elder
2010-09-24 2:15 ` Dave Chinner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox