* [PATCH] xfs: debug mode sysfs flag to force [un]pin the log tail @ 2017-06-16 16:44 Brian Foster 2017-06-16 16:46 ` [PATCH] tests/xfs: test for log recovery failure after tail overwrite Brian Foster 2017-06-20 21:48 ` [PATCH] xfs: debug mode sysfs flag to force [un]pin the log tail Darrick J. Wong 0 siblings, 2 replies; 10+ messages in thread From: Brian Foster @ 2017-06-16 16:44 UTC (permalink / raw) To: linux-xfs Create a debug mode only sysfs option to force pin the tail of the log. This option can be used by test infrastructure to induce head behind tail conditions. Specifically, this is intended to be used by xfstests to reproduce log recovery problems after failed/corrupted log writes overwrite the last good tail LSN in the log. When enabled, AIL push attempts see every log item on the AIL in the pinned state. This stalls metadata writeback and thus prevents the current tail of the log from moving forward. When disabled, subsequent AIL pushes observe the log items in their appropriate state and filesystem operation continues as normal. Signed-off-by: Brian Foster <bfoster@redhat.com> --- Hi all, This patch is a supporting patch for an xfstests test I'm about to post that pins the tail of the log in order to reproduce the log recovery problem that appears to be the root of the problem in this[1] thread. That is the primary motivation for the patch and so should probably be reviewed with that context. IOW, if there's a better way to reproduce the problem in the test without the need for kernel support, I'm happy to drop this. Thoughts, reviews, flames appreciated. Brian [1] http://www.spinics.net/lists/linux-xfs/msg07499.html fs/xfs/xfs_log_priv.h | 2 ++ fs/xfs/xfs_sysfs.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ fs/xfs/xfs_trans_ail.c | 20 +++++++++++++++++++- 3 files changed, 67 insertions(+), 1 deletion(-) diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h index c2604a5..bfbfde12 100644 --- a/fs/xfs/xfs_log_priv.h +++ b/fs/xfs/xfs_log_priv.h @@ -413,6 +413,8 @@ struct xlog { void *l_iclog_bak[XLOG_MAX_ICLOGS]; /* log record crc error injection factor */ uint32_t l_badcrc_factor; + /* force pin the log tail */ + bool l_pin_tail; #endif /* log recovery lsn tracking (for buffer submission */ xfs_lsn_t l_recovery_lsn; diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c index ec6e0e2..b86148a 100644 --- a/fs/xfs/xfs_sysfs.c +++ b/fs/xfs/xfs_sysfs.c @@ -378,6 +378,51 @@ log_badcrc_factor_show( } XFS_SYSFS_ATTR_RW(log_badcrc_factor); + +/* + * DEBUG mode flag to force pin the tail of the log. Used from test + * infrastructure to manufacture head-behind-tail conditions. DO NOT USE + * DIRECTLY. This will lock up the fs! + * + * When this option is enabled, all log items present in the AIL are emulated as + * being in the pinned state until the option is disabled. Once disabled, log + * items return to their natural state and fs operation continues as normal. + */ +STATIC ssize_t +log_pin_tail_store( + struct kobject *kobject, + const char *buf, + size_t count) +{ + struct xlog *log = to_xlog(kobject); + int ret; + int val; + + ret = kstrtoint(buf, 0, &val); + if (ret) + return ret; + + if (val == 1) + log->l_pin_tail = true; + else if (val == 0) + log->l_pin_tail = false; + else + return -EINVAL; + + return count; +} + +STATIC ssize_t +log_pin_tail_show( + struct kobject *kobject, + char *buf) +{ + struct xlog *log = to_xlog(kobject); + + return snprintf(buf, PAGE_SIZE, "%d\n", log->l_pin_tail ? 1 : 0); +} +XFS_SYSFS_ATTR_RW(log_pin_tail); + #endif /* DEBUG */ static struct attribute *xfs_log_attrs[] = { @@ -387,6 +432,7 @@ static struct attribute *xfs_log_attrs[] = { ATTR_LIST(write_grant_head), #ifdef DEBUG ATTR_LIST(log_badcrc_factor), + ATTR_LIST(log_pin_tail), #endif NULL, }; diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index 9056c0f..c901e61 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -27,6 +27,7 @@ #include "xfs_trace.h" #include "xfs_error.h" #include "xfs_log.h" +#include "xfs_log_priv.h" #ifdef DEBUG /* @@ -325,6 +326,23 @@ xfs_ail_delete( xfs_trans_ail_cursor_clear(ailp, lip); } +static inline uint +xfsaild_push_item( + struct xfs_ail *ailp, + struct xfs_log_item *lip) +{ +#ifdef DEBUG + /* + * If tail pinning is enabled, skip the push and track all items as + * pinned to force pin the log tail. This helps induce head-behind-tail + * conditions. + */ + if (ailp->xa_mount->m_log->l_pin_tail) + return XFS_ITEM_PINNED; +#endif + return lip->li_ops->iop_push(lip, &ailp->xa_buf_list); +} + static long xfsaild_push( struct xfs_ail *ailp) @@ -382,7 +400,7 @@ xfsaild_push( * rely on the AIL cursor implementation to be able to deal with * the dropped lock. */ - lock_result = lip->li_ops->iop_push(lip, &ailp->xa_buf_list); + lock_result = xfsaild_push_item(ailp, lip); switch (lock_result) { case XFS_ITEM_SUCCESS: XFS_STATS_INC(mp, xs_push_ail_success); -- 2.7.5 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH] tests/xfs: test for log recovery failure after tail overwrite 2017-06-16 16:44 [PATCH] xfs: debug mode sysfs flag to force [un]pin the log tail Brian Foster @ 2017-06-16 16:46 ` Brian Foster 2017-06-30 3:44 ` Eryu Guan 2017-06-20 21:48 ` [PATCH] xfs: debug mode sysfs flag to force [un]pin the log tail Darrick J. Wong 1 sibling, 1 reply; 10+ messages in thread From: Brian Foster @ 2017-06-16 16:46 UTC (permalink / raw) To: linux-xfs; +Cc: fstests XFS is susceptible to log recovery problems if the fs crashes under certain circumstances. If the tail has been pinned for long enough to the log to fill and the next batch of log buffer submissions happen to fail, the filesystem shutsdown having potentially overwritten part of the range between the last good tail->head range in the log. This causes log recovery to fail with crc mismatch or invalid log record errors. This problem is not yet fixed and thus known/expected to fail. At this time, this test serves as a reminder that the problem exists and a reproducer for future verification purposes. Note that this problem is currently only reproducible with larger (non-default) log buffer sizes (i.e., '-o logbsize=256k') or smaller block sizes (1k). Signed-off-by: Brian Foster <bfoster@redhat.com> --- Hi all, This patch uses the XFS debug kernel mechanism recently posted for review[1] to reproduce an XFS log recovery problem. Note that this test depends on the aforementioned patch and thus should not be merged until/unless the corresponding kernel patch is merged. Brian [1] "xfs: debug mode sysfs flag to force [un]pin the log tail" tests/xfs/999 | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ tests/xfs/999.out | 2 + tests/xfs/group | 1 + 3 files changed, 116 insertions(+) create mode 100644 tests/xfs/999 create mode 100644 tests/xfs/999.out diff --git a/tests/xfs/999 b/tests/xfs/999 new file mode 100644 index 0000000..6913a43 --- /dev/null +++ b/tests/xfs/999 @@ -0,0 +1,113 @@ +#! /bin/bash +# FS QA Test No. 999 +# +# Attempt to reproduce log recovery failure by writing corrupt log records over +# the last good tail in the log. The tail is force pinned while a workload runs +# the head as close as possible behind the tail. Once the head is pinned, +# corrupted log records are written to the log and the filesystem shuts down. +# +# While log recovery should handle the corrupted log records, it has historical +# problems dealing with the situation where the corrupted log records may have +# overwritten the tail of the previous good record in the log. If this occurs, +# log recovery may fail. +# +# This can be reproduced more reliably under non-default conditions such as with +# the smallest supported FSB sizes and/or largest supported log buffer sizes and +# counts (logbufs and logbsize mount options). +# +# Note that this test requires a DEBUG mode kernel. +# +#----------------------------------------------------------------------- +# Copyright (c) 2017 Red Hat, Inc. All Rights Reserved. +# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License as +# published by the Free Software Foundation. +# +# This program is distributed in the hope that it would be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write the Free Software Foundation, +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +#----------------------------------------------------------------------- +# + +seq=`basename $0` +seqres=$RESULT_DIR/$seq +echo "QA output created by $seq" + +here=`pwd` +tmp=/tmp/$$ +status=1 # failure is the default! +trap "_cleanup; exit \$status" 0 1 2 3 15 + +_cleanup() +{ + cd / + rm -f $tmp.* + $KILLALL_PROG -9 fsstress > /dev/null 2>&1 + [ -e /sys/fs/xfs/$sdev/log/log_pin_tail ] && + echo 0 > /sys/fs/xfs/$sdev/log/log_pin_tail + wait > /dev/null 2>&1 +} + +rm -f $seqres.full + +# get standard environment, filters and checks +. ./common/rc + +# real QA test starts here + +# Modify as appropriate. +_supported_fs xfs +_supported_os Linux +_require_xfs_sysfs $(_short_dev $TEST_DEV)/log/log_badcrc_factor +_require_xfs_sysfs $(_short_dev $TEST_DEV)/log/log_pin_tail +_require_scratch +_require_command "$KILLALL_PROG" killall + +echo "Silence is golden." + +sdev=$(_short_dev $SCRATCH_DEV) + +_scratch_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed" +_scratch_mount || _fail "mount failed" + +# populate the fs with some data and cycle the mount to reset the log head/tail +$FSSTRESS_PROG -d $SCRATCH_MNT -z -fcreat=1 -p 4 -n 100000 > /dev/null 2>&1 +_scratch_cycle_mount || _fail "mount failed" + +# Pin the tail and start a file removal workload. File removal tends to +# reproduce the corruption more reliably. +echo 1 > /sys/fs/xfs/$sdev/log/log_pin_tail + +rm -rf $SCRATCH_MNT/* > /dev/null 2>&1 & +workpid=$! + +# wait for the head to stop pushing forward +prevhead=-1 +head=`cat /sys/fs/xfs/$sdev/log/log_head_lsn` +while [ "$head" != "$prevhead" ]; do + sleep 5 + prevhead=$head + head=`cat /sys/fs/xfs/$sdev/log/log_head_lsn` +done + +# Once the head is pinned behind the tail, enable log record corruption and +# unpin the tail. All subsequent log buffer writes end up corrupted on-disk and +# result in log I/O errors. +echo 1 > /sys/fs/xfs/$sdev/log/log_badcrc_factor +echo 0 > /sys/fs/xfs/$sdev/log/log_pin_tail + +# wait for fs shutdown to kill the workload +wait $workpid + +# cycle mount to test log recovery +_scratch_cycle_mount + +# success, all done +status=0 +exit diff --git a/tests/xfs/999.out b/tests/xfs/999.out new file mode 100644 index 0000000..d254382 --- /dev/null +++ b/tests/xfs/999.out @@ -0,0 +1,2 @@ +QA output created by 999 +Silence is golden. diff --git a/tests/xfs/group b/tests/xfs/group index 792161a..d94f010 100644 --- a/tests/xfs/group +++ b/tests/xfs/group @@ -416,3 +416,4 @@ 416 dangerous_fuzzers dangerous_scrub dangerous_repair 417 dangerous_fuzzers dangerous_scrub dangerous_online_repair 418 dangerous_fuzzers dangerous_scrub dangerous_repair +999 auto log -- 2.7.5 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] tests/xfs: test for log recovery failure after tail overwrite 2017-06-16 16:46 ` [PATCH] tests/xfs: test for log recovery failure after tail overwrite Brian Foster @ 2017-06-30 3:44 ` Eryu Guan 2017-06-30 4:09 ` Darrick J. Wong 0 siblings, 1 reply; 10+ messages in thread From: Eryu Guan @ 2017-06-30 3:44 UTC (permalink / raw) To: Brian Foster; +Cc: linux-xfs, fstests Hi Brian, On Fri, Jun 16, 2017 at 12:46:19PM -0400, Brian Foster wrote: > XFS is susceptible to log recovery problems if the fs crashes under > certain circumstances. If the tail has been pinned for long enough > to the log to fill and the next batch of log buffer submissions > happen to fail, the filesystem shutsdown having potentially > overwritten part of the range between the last good tail->head range > in the log. This causes log recovery to fail with crc mismatch or > invalid log record errors. > > This problem is not yet fixed and thus known/expected to fail. At > this time, this test serves as a reminder that the problem exists > and a reproducer for future verification purposes. Note that this > problem is currently only reproducible with larger (non-default) log > buffer sizes (i.e., '-o logbsize=256k') or smaller block sizes (1k). > > Signed-off-by: Brian Foster <bfoster@redhat.com> > --- > > Hi all, > > This patch uses the XFS debug kernel mechanism recently posted for > review[1] to reproduce an XFS log recovery problem. Note that this test > depends on the aforementioned patch and thus should not be merged > until/unless the corresponding kernel patch is merged. Any decision made on the kernel side patch? Sorry, I'm a bit lost on the kernel side discussions. Thanks, Eryu ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] tests/xfs: test for log recovery failure after tail overwrite 2017-06-30 3:44 ` Eryu Guan @ 2017-06-30 4:09 ` Darrick J. Wong 0 siblings, 0 replies; 10+ messages in thread From: Darrick J. Wong @ 2017-06-30 4:09 UTC (permalink / raw) To: Eryu Guan; +Cc: Brian Foster, linux-xfs, fstests On Fri, Jun 30, 2017 at 11:44:05AM +0800, Eryu Guan wrote: > Hi Brian, > > On Fri, Jun 16, 2017 at 12:46:19PM -0400, Brian Foster wrote: > > XFS is susceptible to log recovery problems if the fs crashes under > > certain circumstances. If the tail has been pinned for long enough > > to the log to fill and the next batch of log buffer submissions > > happen to fail, the filesystem shutsdown having potentially > > overwritten part of the range between the last good tail->head range > > in the log. This causes log recovery to fail with crc mismatch or > > invalid log record errors. > > > > This problem is not yet fixed and thus known/expected to fail. At > > this time, this test serves as a reminder that the problem exists > > and a reproducer for future verification purposes. Note that this > > problem is currently only reproducible with larger (non-default) log > > buffer sizes (i.e., '-o logbsize=256k') or smaller block sizes (1k). > > > > Signed-off-by: Brian Foster <bfoster@redhat.com> > > --- > > > > Hi all, > > > > This patch uses the XFS debug kernel mechanism recently posted for > > review[1] to reproduce an XFS log recovery problem. Note that this test > > depends on the aforementioned patch and thus should not be merged > > until/unless the corresponding kernel patch is merged. > > Any decision made on the kernel side patch? Sorry, I'm a bit lost on the > kernel side discussions. Uhhhh, I think this test is being redone to work with the new xfs errortag mechanism that'll be appearing soon. I will send patches to fix up xfstests momentarily. --D > > Thanks, > Eryu > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] xfs: debug mode sysfs flag to force [un]pin the log tail 2017-06-16 16:44 [PATCH] xfs: debug mode sysfs flag to force [un]pin the log tail Brian Foster 2017-06-16 16:46 ` [PATCH] tests/xfs: test for log recovery failure after tail overwrite Brian Foster @ 2017-06-20 21:48 ` Darrick J. Wong 2017-06-21 10:16 ` Brian Foster 1 sibling, 1 reply; 10+ messages in thread From: Darrick J. Wong @ 2017-06-20 21:48 UTC (permalink / raw) To: Brian Foster; +Cc: linux-xfs On Fri, Jun 16, 2017 at 12:44:33PM -0400, Brian Foster wrote: > Create a debug mode only sysfs option to force pin the tail of the > log. This option can be used by test infrastructure to induce head > behind tail conditions. Specifically, this is intended to be used by > xfstests to reproduce log recovery problems after failed/corrupted > log writes overwrite the last good tail LSN in the log. > > When enabled, AIL push attempts see every log item on the AIL in the > pinned state. This stalls metadata writeback and thus prevents the > current tail of the log from moving forward. When disabled, > subsequent AIL pushes observe the log items in their appropriate > state and filesystem operation continues as normal. > > Signed-off-by: Brian Foster <bfoster@redhat.com> Looks ok enough to test... Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --D > --- > > Hi all, > > This patch is a supporting patch for an xfstests test I'm about to post > that pins the tail of the log in order to reproduce the log recovery > problem that appears to be the root of the problem in this[1] thread. > That is the primary motivation for the patch and so should probably be > reviewed with that context. IOW, if there's a better way to reproduce > the problem in the test without the need for kernel support, I'm happy > to drop this. Thoughts, reviews, flames appreciated. > > Brian > > [1] http://www.spinics.net/lists/linux-xfs/msg07499.html > > fs/xfs/xfs_log_priv.h | 2 ++ > fs/xfs/xfs_sysfs.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ > fs/xfs/xfs_trans_ail.c | 20 +++++++++++++++++++- > 3 files changed, 67 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h > index c2604a5..bfbfde12 100644 > --- a/fs/xfs/xfs_log_priv.h > +++ b/fs/xfs/xfs_log_priv.h > @@ -413,6 +413,8 @@ struct xlog { > void *l_iclog_bak[XLOG_MAX_ICLOGS]; > /* log record crc error injection factor */ > uint32_t l_badcrc_factor; > + /* force pin the log tail */ > + bool l_pin_tail; > #endif > /* log recovery lsn tracking (for buffer submission */ > xfs_lsn_t l_recovery_lsn; > diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c > index ec6e0e2..b86148a 100644 > --- a/fs/xfs/xfs_sysfs.c > +++ b/fs/xfs/xfs_sysfs.c > @@ -378,6 +378,51 @@ log_badcrc_factor_show( > } > > XFS_SYSFS_ATTR_RW(log_badcrc_factor); > + > +/* > + * DEBUG mode flag to force pin the tail of the log. Used from test > + * infrastructure to manufacture head-behind-tail conditions. DO NOT USE > + * DIRECTLY. This will lock up the fs! > + * > + * When this option is enabled, all log items present in the AIL are emulated as > + * being in the pinned state until the option is disabled. Once disabled, log > + * items return to their natural state and fs operation continues as normal. > + */ > +STATIC ssize_t > +log_pin_tail_store( > + struct kobject *kobject, > + const char *buf, > + size_t count) > +{ > + struct xlog *log = to_xlog(kobject); > + int ret; > + int val; > + > + ret = kstrtoint(buf, 0, &val); > + if (ret) > + return ret; > + > + if (val == 1) > + log->l_pin_tail = true; > + else if (val == 0) > + log->l_pin_tail = false; > + else > + return -EINVAL; > + > + return count; > +} > + > +STATIC ssize_t > +log_pin_tail_show( > + struct kobject *kobject, > + char *buf) > +{ > + struct xlog *log = to_xlog(kobject); > + > + return snprintf(buf, PAGE_SIZE, "%d\n", log->l_pin_tail ? 1 : 0); > +} > +XFS_SYSFS_ATTR_RW(log_pin_tail); > + > #endif /* DEBUG */ > > static struct attribute *xfs_log_attrs[] = { > @@ -387,6 +432,7 @@ static struct attribute *xfs_log_attrs[] = { > ATTR_LIST(write_grant_head), > #ifdef DEBUG > ATTR_LIST(log_badcrc_factor), > + ATTR_LIST(log_pin_tail), > #endif > NULL, > }; > diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c > index 9056c0f..c901e61 100644 > --- a/fs/xfs/xfs_trans_ail.c > +++ b/fs/xfs/xfs_trans_ail.c > @@ -27,6 +27,7 @@ > #include "xfs_trace.h" > #include "xfs_error.h" > #include "xfs_log.h" > +#include "xfs_log_priv.h" > > #ifdef DEBUG > /* > @@ -325,6 +326,23 @@ xfs_ail_delete( > xfs_trans_ail_cursor_clear(ailp, lip); > } > > +static inline uint > +xfsaild_push_item( > + struct xfs_ail *ailp, > + struct xfs_log_item *lip) > +{ > +#ifdef DEBUG > + /* > + * If tail pinning is enabled, skip the push and track all items as > + * pinned to force pin the log tail. This helps induce head-behind-tail > + * conditions. > + */ > + if (ailp->xa_mount->m_log->l_pin_tail) > + return XFS_ITEM_PINNED; > +#endif > + return lip->li_ops->iop_push(lip, &ailp->xa_buf_list); > +} > + > static long > xfsaild_push( > struct xfs_ail *ailp) > @@ -382,7 +400,7 @@ xfsaild_push( > * rely on the AIL cursor implementation to be able to deal with > * the dropped lock. > */ > - lock_result = lip->li_ops->iop_push(lip, &ailp->xa_buf_list); > + lock_result = xfsaild_push_item(ailp, lip); > switch (lock_result) { > case XFS_ITEM_SUCCESS: > XFS_STATS_INC(mp, xs_push_ail_success); > -- > 2.7.5 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] xfs: debug mode sysfs flag to force [un]pin the log tail 2017-06-20 21:48 ` [PATCH] xfs: debug mode sysfs flag to force [un]pin the log tail Darrick J. Wong @ 2017-06-21 10:16 ` Brian Foster 2017-06-21 15:47 ` Darrick J. Wong 0 siblings, 1 reply; 10+ messages in thread From: Brian Foster @ 2017-06-21 10:16 UTC (permalink / raw) To: Darrick J. Wong; +Cc: linux-xfs On Tue, Jun 20, 2017 at 02:48:32PM -0700, Darrick J. Wong wrote: > On Fri, Jun 16, 2017 at 12:44:33PM -0400, Brian Foster wrote: > > Create a debug mode only sysfs option to force pin the tail of the > > log. This option can be used by test infrastructure to induce head > > behind tail conditions. Specifically, this is intended to be used by > > xfstests to reproduce log recovery problems after failed/corrupted > > log writes overwrite the last good tail LSN in the log. > > > > When enabled, AIL push attempts see every log item on the AIL in the > > pinned state. This stalls metadata writeback and thus prevents the > > current tail of the log from moving forward. When disabled, > > subsequent AIL pushes observe the log items in their appropriate > > state and filesystem operation continues as normal. > > > > Signed-off-by: Brian Foster <bfoster@redhat.com> > > Looks ok enough to test... > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > I assume you mean Reviewed-by... thanks! :) Brian > --D > > > --- > > > > Hi all, > > > > This patch is a supporting patch for an xfstests test I'm about to post > > that pins the tail of the log in order to reproduce the log recovery > > problem that appears to be the root of the problem in this[1] thread. > > That is the primary motivation for the patch and so should probably be > > reviewed with that context. IOW, if there's a better way to reproduce > > the problem in the test without the need for kernel support, I'm happy > > to drop this. Thoughts, reviews, flames appreciated. > > > > Brian > > > > [1] http://www.spinics.net/lists/linux-xfs/msg07499.html > > > > fs/xfs/xfs_log_priv.h | 2 ++ > > fs/xfs/xfs_sysfs.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ > > fs/xfs/xfs_trans_ail.c | 20 +++++++++++++++++++- > > 3 files changed, 67 insertions(+), 1 deletion(-) > > > > diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h > > index c2604a5..bfbfde12 100644 > > --- a/fs/xfs/xfs_log_priv.h > > +++ b/fs/xfs/xfs_log_priv.h > > @@ -413,6 +413,8 @@ struct xlog { > > void *l_iclog_bak[XLOG_MAX_ICLOGS]; > > /* log record crc error injection factor */ > > uint32_t l_badcrc_factor; > > + /* force pin the log tail */ > > + bool l_pin_tail; > > #endif > > /* log recovery lsn tracking (for buffer submission */ > > xfs_lsn_t l_recovery_lsn; > > diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c > > index ec6e0e2..b86148a 100644 > > --- a/fs/xfs/xfs_sysfs.c > > +++ b/fs/xfs/xfs_sysfs.c > > @@ -378,6 +378,51 @@ log_badcrc_factor_show( > > } > > > > XFS_SYSFS_ATTR_RW(log_badcrc_factor); > > + > > +/* > > + * DEBUG mode flag to force pin the tail of the log. Used from test > > + * infrastructure to manufacture head-behind-tail conditions. DO NOT USE > > + * DIRECTLY. This will lock up the fs! > > + * > > + * When this option is enabled, all log items present in the AIL are emulated as > > + * being in the pinned state until the option is disabled. Once disabled, log > > + * items return to their natural state and fs operation continues as normal. > > + */ > > +STATIC ssize_t > > +log_pin_tail_store( > > + struct kobject *kobject, > > + const char *buf, > > + size_t count) > > +{ > > + struct xlog *log = to_xlog(kobject); > > + int ret; > > + int val; > > + > > + ret = kstrtoint(buf, 0, &val); > > + if (ret) > > + return ret; > > + > > + if (val == 1) > > + log->l_pin_tail = true; > > + else if (val == 0) > > + log->l_pin_tail = false; > > + else > > + return -EINVAL; > > + > > + return count; > > +} > > + > > +STATIC ssize_t > > +log_pin_tail_show( > > + struct kobject *kobject, > > + char *buf) > > +{ > > + struct xlog *log = to_xlog(kobject); > > + > > + return snprintf(buf, PAGE_SIZE, "%d\n", log->l_pin_tail ? 1 : 0); > > +} > > +XFS_SYSFS_ATTR_RW(log_pin_tail); > > + > > #endif /* DEBUG */ > > > > static struct attribute *xfs_log_attrs[] = { > > @@ -387,6 +432,7 @@ static struct attribute *xfs_log_attrs[] = { > > ATTR_LIST(write_grant_head), > > #ifdef DEBUG > > ATTR_LIST(log_badcrc_factor), > > + ATTR_LIST(log_pin_tail), > > #endif > > NULL, > > }; > > diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c > > index 9056c0f..c901e61 100644 > > --- a/fs/xfs/xfs_trans_ail.c > > +++ b/fs/xfs/xfs_trans_ail.c > > @@ -27,6 +27,7 @@ > > #include "xfs_trace.h" > > #include "xfs_error.h" > > #include "xfs_log.h" > > +#include "xfs_log_priv.h" > > > > #ifdef DEBUG > > /* > > @@ -325,6 +326,23 @@ xfs_ail_delete( > > xfs_trans_ail_cursor_clear(ailp, lip); > > } > > > > +static inline uint > > +xfsaild_push_item( > > + struct xfs_ail *ailp, > > + struct xfs_log_item *lip) > > +{ > > +#ifdef DEBUG > > + /* > > + * If tail pinning is enabled, skip the push and track all items as > > + * pinned to force pin the log tail. This helps induce head-behind-tail > > + * conditions. > > + */ > > + if (ailp->xa_mount->m_log->l_pin_tail) > > + return XFS_ITEM_PINNED; > > +#endif > > + return lip->li_ops->iop_push(lip, &ailp->xa_buf_list); > > +} > > + > > static long > > xfsaild_push( > > struct xfs_ail *ailp) > > @@ -382,7 +400,7 @@ xfsaild_push( > > * rely on the AIL cursor implementation to be able to deal with > > * the dropped lock. > > */ > > - lock_result = lip->li_ops->iop_push(lip, &ailp->xa_buf_list); > > + lock_result = xfsaild_push_item(ailp, lip); > > switch (lock_result) { > > case XFS_ITEM_SUCCESS: > > XFS_STATS_INC(mp, xs_push_ail_success); > > -- > > 2.7.5 > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] xfs: debug mode sysfs flag to force [un]pin the log tail 2017-06-21 10:16 ` Brian Foster @ 2017-06-21 15:47 ` Darrick J. Wong 2017-06-21 16:08 ` Brian Foster 0 siblings, 1 reply; 10+ messages in thread From: Darrick J. Wong @ 2017-06-21 15:47 UTC (permalink / raw) To: Brian Foster; +Cc: linux-xfs On Wed, Jun 21, 2017 at 06:16:54AM -0400, Brian Foster wrote: > On Tue, Jun 20, 2017 at 02:48:32PM -0700, Darrick J. Wong wrote: > > On Fri, Jun 16, 2017 at 12:44:33PM -0400, Brian Foster wrote: > > > Create a debug mode only sysfs option to force pin the tail of the > > > log. This option can be used by test infrastructure to induce head > > > behind tail conditions. Specifically, this is intended to be used by > > > xfstests to reproduce log recovery problems after failed/corrupted > > > log writes overwrite the last good tail LSN in the log. > > > > > > When enabled, AIL push attempts see every log item on the AIL in the > > > pinned state. This stalls metadata writeback and thus prevents the > > > current tail of the log from moving forward. When disabled, > > > subsequent AIL pushes observe the log items in their appropriate > > > state and filesystem operation continues as normal. > > > > > > Signed-off-by: Brian Foster <bfoster@redhat.com> > > > > Looks ok enough to test... > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > > > I assume you mean Reviewed-by... thanks! :) Yep. Though, having now pushed out the errortag-via-sysfs patchset, I wonder if this could target that instead? But, first things first, that series has to be reviewed. We could (in theory) push e-v-s and this for 4.13; or hold off on both for another cycle; or integrate this patch for 4.13 and e-v-s for 4.14, and update the test to sniff out the location of the knob dynamically like xfs/196 does. <mostly just pre-coffee babbling at this point> --D > > Brian > > > --D > > > > > --- > > > > > > Hi all, > > > > > > This patch is a supporting patch for an xfstests test I'm about to post > > > that pins the tail of the log in order to reproduce the log recovery > > > problem that appears to be the root of the problem in this[1] thread. > > > That is the primary motivation for the patch and so should probably be > > > reviewed with that context. IOW, if there's a better way to reproduce > > > the problem in the test without the need for kernel support, I'm happy > > > to drop this. Thoughts, reviews, flames appreciated. > > > > > > Brian > > > > > > [1] http://www.spinics.net/lists/linux-xfs/msg07499.html > > > > > > fs/xfs/xfs_log_priv.h | 2 ++ > > > fs/xfs/xfs_sysfs.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ > > > fs/xfs/xfs_trans_ail.c | 20 +++++++++++++++++++- > > > 3 files changed, 67 insertions(+), 1 deletion(-) > > > > > > diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h > > > index c2604a5..bfbfde12 100644 > > > --- a/fs/xfs/xfs_log_priv.h > > > +++ b/fs/xfs/xfs_log_priv.h > > > @@ -413,6 +413,8 @@ struct xlog { > > > void *l_iclog_bak[XLOG_MAX_ICLOGS]; > > > /* log record crc error injection factor */ > > > uint32_t l_badcrc_factor; > > > + /* force pin the log tail */ > > > + bool l_pin_tail; > > > #endif > > > /* log recovery lsn tracking (for buffer submission */ > > > xfs_lsn_t l_recovery_lsn; > > > diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c > > > index ec6e0e2..b86148a 100644 > > > --- a/fs/xfs/xfs_sysfs.c > > > +++ b/fs/xfs/xfs_sysfs.c > > > @@ -378,6 +378,51 @@ log_badcrc_factor_show( > > > } > > > > > > XFS_SYSFS_ATTR_RW(log_badcrc_factor); > > > + > > > +/* > > > + * DEBUG mode flag to force pin the tail of the log. Used from test > > > + * infrastructure to manufacture head-behind-tail conditions. DO NOT USE > > > + * DIRECTLY. This will lock up the fs! > > > + * > > > + * When this option is enabled, all log items present in the AIL are emulated as > > > + * being in the pinned state until the option is disabled. Once disabled, log > > > + * items return to their natural state and fs operation continues as normal. > > > + */ > > > +STATIC ssize_t > > > +log_pin_tail_store( > > > + struct kobject *kobject, > > > + const char *buf, > > > + size_t count) > > > +{ > > > + struct xlog *log = to_xlog(kobject); > > > + int ret; > > > + int val; > > > + > > > + ret = kstrtoint(buf, 0, &val); > > > + if (ret) > > > + return ret; > > > + > > > + if (val == 1) > > > + log->l_pin_tail = true; > > > + else if (val == 0) > > > + log->l_pin_tail = false; > > > + else > > > + return -EINVAL; > > > + > > > + return count; > > > +} > > > + > > > +STATIC ssize_t > > > +log_pin_tail_show( > > > + struct kobject *kobject, > > > + char *buf) > > > +{ > > > + struct xlog *log = to_xlog(kobject); > > > + > > > + return snprintf(buf, PAGE_SIZE, "%d\n", log->l_pin_tail ? 1 : 0); > > > +} > > > +XFS_SYSFS_ATTR_RW(log_pin_tail); > > > + > > > #endif /* DEBUG */ > > > > > > static struct attribute *xfs_log_attrs[] = { > > > @@ -387,6 +432,7 @@ static struct attribute *xfs_log_attrs[] = { > > > ATTR_LIST(write_grant_head), > > > #ifdef DEBUG > > > ATTR_LIST(log_badcrc_factor), > > > + ATTR_LIST(log_pin_tail), > > > #endif > > > NULL, > > > }; > > > diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c > > > index 9056c0f..c901e61 100644 > > > --- a/fs/xfs/xfs_trans_ail.c > > > +++ b/fs/xfs/xfs_trans_ail.c > > > @@ -27,6 +27,7 @@ > > > #include "xfs_trace.h" > > > #include "xfs_error.h" > > > #include "xfs_log.h" > > > +#include "xfs_log_priv.h" > > > > > > #ifdef DEBUG > > > /* > > > @@ -325,6 +326,23 @@ xfs_ail_delete( > > > xfs_trans_ail_cursor_clear(ailp, lip); > > > } > > > > > > +static inline uint > > > +xfsaild_push_item( > > > + struct xfs_ail *ailp, > > > + struct xfs_log_item *lip) > > > +{ > > > +#ifdef DEBUG > > > + /* > > > + * If tail pinning is enabled, skip the push and track all items as > > > + * pinned to force pin the log tail. This helps induce head-behind-tail > > > + * conditions. > > > + */ > > > + if (ailp->xa_mount->m_log->l_pin_tail) > > > + return XFS_ITEM_PINNED; > > > +#endif > > > + return lip->li_ops->iop_push(lip, &ailp->xa_buf_list); > > > +} > > > + > > > static long > > > xfsaild_push( > > > struct xfs_ail *ailp) > > > @@ -382,7 +400,7 @@ xfsaild_push( > > > * rely on the AIL cursor implementation to be able to deal with > > > * the dropped lock. > > > */ > > > - lock_result = lip->li_ops->iop_push(lip, &ailp->xa_buf_list); > > > + lock_result = xfsaild_push_item(ailp, lip); > > > switch (lock_result) { > > > case XFS_ITEM_SUCCESS: > > > XFS_STATS_INC(mp, xs_push_ail_success); > > > -- > > > 2.7.5 > > > > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] xfs: debug mode sysfs flag to force [un]pin the log tail 2017-06-21 15:47 ` Darrick J. Wong @ 2017-06-21 16:08 ` Brian Foster 2017-06-21 17:06 ` Darrick J. Wong 0 siblings, 1 reply; 10+ messages in thread From: Brian Foster @ 2017-06-21 16:08 UTC (permalink / raw) To: Darrick J. Wong; +Cc: linux-xfs On Wed, Jun 21, 2017 at 08:47:59AM -0700, Darrick J. Wong wrote: > On Wed, Jun 21, 2017 at 06:16:54AM -0400, Brian Foster wrote: > > On Tue, Jun 20, 2017 at 02:48:32PM -0700, Darrick J. Wong wrote: > > > On Fri, Jun 16, 2017 at 12:44:33PM -0400, Brian Foster wrote: > > > > Create a debug mode only sysfs option to force pin the tail of the > > > > log. This option can be used by test infrastructure to induce head > > > > behind tail conditions. Specifically, this is intended to be used by > > > > xfstests to reproduce log recovery problems after failed/corrupted > > > > log writes overwrite the last good tail LSN in the log. > > > > > > > > When enabled, AIL push attempts see every log item on the AIL in the > > > > pinned state. This stalls metadata writeback and thus prevents the > > > > current tail of the log from moving forward. When disabled, > > > > subsequent AIL pushes observe the log items in their appropriate > > > > state and filesystem operation continues as normal. > > > > > > > > Signed-off-by: Brian Foster <bfoster@redhat.com> > > > > > > Looks ok enough to test... > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > > > > > > I assume you mean Reviewed-by... thanks! :) > > Yep. Though, having now pushed out the errortag-via-sysfs patchset, > I wonder if this could target that instead? > Indeed, I think that is a good idea... > But, first things first, that series has to be reviewed. We could (in > theory) push e-v-s and this for 4.13; or hold off on both for another > cycle; or integrate this patch for 4.13 and e-v-s for 4.14, and update > the test to sniff out the location of the knob dynamically like xfs/196 > does. > I'm fine with deferring this particular mechanism and the test if we have a general error injection mechanism in mind. This is really just for reproducing the problem and verification (once I have a fix put together ;P, though I think I now have a decent idea on how to fix this), so to me it's fine to hold off so long as something is available for reviewers to use in the meantime (i.e., same way I feel about Carlos' AIL retry patches). That saves us from modifying the test again if we can help it as well. I assume this patch would morph into something like making ->iop_push() an injection point (where an "error" means return XFS_ITEM_PINNED) and rather than having the log_pin_tail knob, the test can just set that error between 0 and 100% (but I'll try to get a look at the new patches shortly...). Brian > <mostly just pre-coffee babbling at this point> > > --D > > > > > Brian > > > > > --D > > > > > > > --- > > > > > > > > Hi all, > > > > > > > > This patch is a supporting patch for an xfstests test I'm about to post > > > > that pins the tail of the log in order to reproduce the log recovery > > > > problem that appears to be the root of the problem in this[1] thread. > > > > That is the primary motivation for the patch and so should probably be > > > > reviewed with that context. IOW, if there's a better way to reproduce > > > > the problem in the test without the need for kernel support, I'm happy > > > > to drop this. Thoughts, reviews, flames appreciated. > > > > > > > > Brian > > > > > > > > [1] http://www.spinics.net/lists/linux-xfs/msg07499.html > > > > > > > > fs/xfs/xfs_log_priv.h | 2 ++ > > > > fs/xfs/xfs_sysfs.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ > > > > fs/xfs/xfs_trans_ail.c | 20 +++++++++++++++++++- > > > > 3 files changed, 67 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h > > > > index c2604a5..bfbfde12 100644 > > > > --- a/fs/xfs/xfs_log_priv.h > > > > +++ b/fs/xfs/xfs_log_priv.h > > > > @@ -413,6 +413,8 @@ struct xlog { > > > > void *l_iclog_bak[XLOG_MAX_ICLOGS]; > > > > /* log record crc error injection factor */ > > > > uint32_t l_badcrc_factor; > > > > + /* force pin the log tail */ > > > > + bool l_pin_tail; > > > > #endif > > > > /* log recovery lsn tracking (for buffer submission */ > > > > xfs_lsn_t l_recovery_lsn; > > > > diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c > > > > index ec6e0e2..b86148a 100644 > > > > --- a/fs/xfs/xfs_sysfs.c > > > > +++ b/fs/xfs/xfs_sysfs.c > > > > @@ -378,6 +378,51 @@ log_badcrc_factor_show( > > > > } > > > > > > > > XFS_SYSFS_ATTR_RW(log_badcrc_factor); > > > > + > > > > +/* > > > > + * DEBUG mode flag to force pin the tail of the log. Used from test > > > > + * infrastructure to manufacture head-behind-tail conditions. DO NOT USE > > > > + * DIRECTLY. This will lock up the fs! > > > > + * > > > > + * When this option is enabled, all log items present in the AIL are emulated as > > > > + * being in the pinned state until the option is disabled. Once disabled, log > > > > + * items return to their natural state and fs operation continues as normal. > > > > + */ > > > > +STATIC ssize_t > > > > +log_pin_tail_store( > > > > + struct kobject *kobject, > > > > + const char *buf, > > > > + size_t count) > > > > +{ > > > > + struct xlog *log = to_xlog(kobject); > > > > + int ret; > > > > + int val; > > > > + > > > > + ret = kstrtoint(buf, 0, &val); > > > > + if (ret) > > > > + return ret; > > > > + > > > > + if (val == 1) > > > > + log->l_pin_tail = true; > > > > + else if (val == 0) > > > > + log->l_pin_tail = false; > > > > + else > > > > + return -EINVAL; > > > > + > > > > + return count; > > > > +} > > > > + > > > > +STATIC ssize_t > > > > +log_pin_tail_show( > > > > + struct kobject *kobject, > > > > + char *buf) > > > > +{ > > > > + struct xlog *log = to_xlog(kobject); > > > > + > > > > + return snprintf(buf, PAGE_SIZE, "%d\n", log->l_pin_tail ? 1 : 0); > > > > +} > > > > +XFS_SYSFS_ATTR_RW(log_pin_tail); > > > > + > > > > #endif /* DEBUG */ > > > > > > > > static struct attribute *xfs_log_attrs[] = { > > > > @@ -387,6 +432,7 @@ static struct attribute *xfs_log_attrs[] = { > > > > ATTR_LIST(write_grant_head), > > > > #ifdef DEBUG > > > > ATTR_LIST(log_badcrc_factor), > > > > + ATTR_LIST(log_pin_tail), > > > > #endif > > > > NULL, > > > > }; > > > > diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c > > > > index 9056c0f..c901e61 100644 > > > > --- a/fs/xfs/xfs_trans_ail.c > > > > +++ b/fs/xfs/xfs_trans_ail.c > > > > @@ -27,6 +27,7 @@ > > > > #include "xfs_trace.h" > > > > #include "xfs_error.h" > > > > #include "xfs_log.h" > > > > +#include "xfs_log_priv.h" > > > > > > > > #ifdef DEBUG > > > > /* > > > > @@ -325,6 +326,23 @@ xfs_ail_delete( > > > > xfs_trans_ail_cursor_clear(ailp, lip); > > > > } > > > > > > > > +static inline uint > > > > +xfsaild_push_item( > > > > + struct xfs_ail *ailp, > > > > + struct xfs_log_item *lip) > > > > +{ > > > > +#ifdef DEBUG > > > > + /* > > > > + * If tail pinning is enabled, skip the push and track all items as > > > > + * pinned to force pin the log tail. This helps induce head-behind-tail > > > > + * conditions. > > > > + */ > > > > + if (ailp->xa_mount->m_log->l_pin_tail) > > > > + return XFS_ITEM_PINNED; > > > > +#endif > > > > + return lip->li_ops->iop_push(lip, &ailp->xa_buf_list); > > > > +} > > > > + > > > > static long > > > > xfsaild_push( > > > > struct xfs_ail *ailp) > > > > @@ -382,7 +400,7 @@ xfsaild_push( > > > > * rely on the AIL cursor implementation to be able to deal with > > > > * the dropped lock. > > > > */ > > > > - lock_result = lip->li_ops->iop_push(lip, &ailp->xa_buf_list); > > > > + lock_result = xfsaild_push_item(ailp, lip); > > > > switch (lock_result) { > > > > case XFS_ITEM_SUCCESS: > > > > XFS_STATS_INC(mp, xs_push_ail_success); > > > > -- > > > > 2.7.5 > > > > > > > > -- > > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > > > > the body of a message to majordomo@vger.kernel.org > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] xfs: debug mode sysfs flag to force [un]pin the log tail 2017-06-21 16:08 ` Brian Foster @ 2017-06-21 17:06 ` Darrick J. Wong 0 siblings, 0 replies; 10+ messages in thread From: Darrick J. Wong @ 2017-06-21 17:06 UTC (permalink / raw) To: Brian Foster; +Cc: linux-xfs On Wed, Jun 21, 2017 at 12:08:23PM -0400, Brian Foster wrote: > On Wed, Jun 21, 2017 at 08:47:59AM -0700, Darrick J. Wong wrote: > > On Wed, Jun 21, 2017 at 06:16:54AM -0400, Brian Foster wrote: > > > On Tue, Jun 20, 2017 at 02:48:32PM -0700, Darrick J. Wong wrote: > > > > On Fri, Jun 16, 2017 at 12:44:33PM -0400, Brian Foster wrote: > > > > > Create a debug mode only sysfs option to force pin the tail of the > > > > > log. This option can be used by test infrastructure to induce head > > > > > behind tail conditions. Specifically, this is intended to be used by > > > > > xfstests to reproduce log recovery problems after failed/corrupted > > > > > log writes overwrite the last good tail LSN in the log. > > > > > > > > > > When enabled, AIL push attempts see every log item on the AIL in the > > > > > pinned state. This stalls metadata writeback and thus prevents the > > > > > current tail of the log from moving forward. When disabled, > > > > > subsequent AIL pushes observe the log items in their appropriate > > > > > state and filesystem operation continues as normal. > > > > > > > > > > Signed-off-by: Brian Foster <bfoster@redhat.com> > > > > > > > > Looks ok enough to test... > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > > > > > > > > > I assume you mean Reviewed-by... thanks! :) > > > > Yep. Though, having now pushed out the errortag-via-sysfs patchset, > > I wonder if this could target that instead? > > > > Indeed, I think that is a good idea... > > > But, first things first, that series has to be reviewed. We could (in > > theory) push e-v-s and this for 4.13; or hold off on both for another > > cycle; or integrate this patch for 4.13 and e-v-s for 4.14, and update > > the test to sniff out the location of the knob dynamically like xfs/196 > > does. > > > > I'm fine with deferring this particular mechanism and the test if we > have a general error injection mechanism in mind. This is really just > for reproducing the problem and verification (once I have a fix put > together ;P, though I think I now have a decent idea on how to fix > this), so to me it's fine to hold off so long as something is available > for reviewers to use in the meantime (i.e., same way I feel about > Carlos' AIL retry patches). That saves us from modifying the test again > if we can help it as well. > > I assume this patch would morph into something like making ->iop_push() > an injection point (where an "error" means return XFS_ITEM_PINNED) and > rather than having the log_pin_tail knob, the test can just set that > error between 0 and 100% (but I'll try to get a look at the new patches > shortly...). Yes, you'd more or less just #define a new XFS_ERRTAG_PIN_LOGS and the appropriate sysfs attributes, then: static inline uint xfsaild_push_item( struct xfs_ail *ailp, struct xfs_log_item *lip) { if (XFS_TEST_ERROR(false, ailp->xa_mount, XFS_ERRTAG_PIN_LOGS)) return XFS_ITEM_PINNED; return lip->li_ops->iop_push(lip, &ailp->xa_buf_list); } ...and then we can flip the whole thing on by: echo 43 > /sys/fs/xfs/sda/errortag/pin_logs ...which will trigger the pinning about once every 43rd invocation. --D > > Brian > > > <mostly just pre-coffee babbling at this point> > > > > --D > > > > > > > > Brian > > > > > > > --D > > > > > > > > > --- > > > > > > > > > > Hi all, > > > > > > > > > > This patch is a supporting patch for an xfstests test I'm about to post > > > > > that pins the tail of the log in order to reproduce the log recovery > > > > > problem that appears to be the root of the problem in this[1] thread. > > > > > That is the primary motivation for the patch and so should probably be > > > > > reviewed with that context. IOW, if there's a better way to reproduce > > > > > the problem in the test without the need for kernel support, I'm happy > > > > > to drop this. Thoughts, reviews, flames appreciated. > > > > > > > > > > Brian > > > > > > > > > > [1] http://www.spinics.net/lists/linux-xfs/msg07499.html > > > > > > > > > > fs/xfs/xfs_log_priv.h | 2 ++ > > > > > fs/xfs/xfs_sysfs.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ > > > > > fs/xfs/xfs_trans_ail.c | 20 +++++++++++++++++++- > > > > > 3 files changed, 67 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h > > > > > index c2604a5..bfbfde12 100644 > > > > > --- a/fs/xfs/xfs_log_priv.h > > > > > +++ b/fs/xfs/xfs_log_priv.h > > > > > @@ -413,6 +413,8 @@ struct xlog { > > > > > void *l_iclog_bak[XLOG_MAX_ICLOGS]; > > > > > /* log record crc error injection factor */ > > > > > uint32_t l_badcrc_factor; > > > > > + /* force pin the log tail */ > > > > > + bool l_pin_tail; > > > > > #endif > > > > > /* log recovery lsn tracking (for buffer submission */ > > > > > xfs_lsn_t l_recovery_lsn; > > > > > diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c > > > > > index ec6e0e2..b86148a 100644 > > > > > --- a/fs/xfs/xfs_sysfs.c > > > > > +++ b/fs/xfs/xfs_sysfs.c > > > > > @@ -378,6 +378,51 @@ log_badcrc_factor_show( > > > > > } > > > > > > > > > > XFS_SYSFS_ATTR_RW(log_badcrc_factor); > > > > > + > > > > > +/* > > > > > + * DEBUG mode flag to force pin the tail of the log. Used from test > > > > > + * infrastructure to manufacture head-behind-tail conditions. DO NOT USE > > > > > + * DIRECTLY. This will lock up the fs! > > > > > + * > > > > > + * When this option is enabled, all log items present in the AIL are emulated as > > > > > + * being in the pinned state until the option is disabled. Once disabled, log > > > > > + * items return to their natural state and fs operation continues as normal. > > > > > + */ > > > > > +STATIC ssize_t > > > > > +log_pin_tail_store( > > > > > + struct kobject *kobject, > > > > > + const char *buf, > > > > > + size_t count) > > > > > +{ > > > > > + struct xlog *log = to_xlog(kobject); > > > > > + int ret; > > > > > + int val; > > > > > + > > > > > + ret = kstrtoint(buf, 0, &val); > > > > > + if (ret) > > > > > + return ret; > > > > > + > > > > > + if (val == 1) > > > > > + log->l_pin_tail = true; > > > > > + else if (val == 0) > > > > > + log->l_pin_tail = false; > > > > > + else > > > > > + return -EINVAL; > > > > > + > > > > > + return count; > > > > > +} > > > > > + > > > > > +STATIC ssize_t > > > > > +log_pin_tail_show( > > > > > + struct kobject *kobject, > > > > > + char *buf) > > > > > +{ > > > > > + struct xlog *log = to_xlog(kobject); > > > > > + > > > > > + return snprintf(buf, PAGE_SIZE, "%d\n", log->l_pin_tail ? 1 : 0); > > > > > +} > > > > > +XFS_SYSFS_ATTR_RW(log_pin_tail); > > > > > + > > > > > #endif /* DEBUG */ > > > > > > > > > > static struct attribute *xfs_log_attrs[] = { > > > > > @@ -387,6 +432,7 @@ static struct attribute *xfs_log_attrs[] = { > > > > > ATTR_LIST(write_grant_head), > > > > > #ifdef DEBUG > > > > > ATTR_LIST(log_badcrc_factor), > > > > > + ATTR_LIST(log_pin_tail), > > > > > #endif > > > > > NULL, > > > > > }; > > > > > diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c > > > > > index 9056c0f..c901e61 100644 > > > > > --- a/fs/xfs/xfs_trans_ail.c > > > > > +++ b/fs/xfs/xfs_trans_ail.c > > > > > @@ -27,6 +27,7 @@ > > > > > #include "xfs_trace.h" > > > > > #include "xfs_error.h" > > > > > #include "xfs_log.h" > > > > > +#include "xfs_log_priv.h" > > > > > > > > > > #ifdef DEBUG > > > > > /* > > > > > @@ -325,6 +326,23 @@ xfs_ail_delete( > > > > > xfs_trans_ail_cursor_clear(ailp, lip); > > > > > } > > > > > > > > > > +static inline uint > > > > > +xfsaild_push_item( > > > > > + struct xfs_ail *ailp, > > > > > + struct xfs_log_item *lip) > > > > > +{ > > > > > +#ifdef DEBUG > > > > > + /* > > > > > + * If tail pinning is enabled, skip the push and track all items as > > > > > + * pinned to force pin the log tail. This helps induce head-behind-tail > > > > > + * conditions. > > > > > + */ > > > > > + if (ailp->xa_mount->m_log->l_pin_tail) > > > > > + return XFS_ITEM_PINNED; > > > > > +#endif > > > > > + return lip->li_ops->iop_push(lip, &ailp->xa_buf_list); > > > > > +} > > > > > + > > > > > static long > > > > > xfsaild_push( > > > > > struct xfs_ail *ailp) > > > > > @@ -382,7 +400,7 @@ xfsaild_push( > > > > > * rely on the AIL cursor implementation to be able to deal with > > > > > * the dropped lock. > > > > > */ > > > > > - lock_result = lip->li_ops->iop_push(lip, &ailp->xa_buf_list); > > > > > + lock_result = xfsaild_push_item(ailp, lip); > > > > > switch (lock_result) { > > > > > case XFS_ITEM_SUCCESS: > > > > > XFS_STATS_INC(mp, xs_push_ail_success); > > > > > -- > > > > > 2.7.5 > > > > > > > > > > -- > > > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > > > > > the body of a message to majordomo@vger.kernel.org > > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > -- > > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > > > > the body of a message to majordomo@vger.kernel.org > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 0/4] xfs: log recovery wrap and tail overwrite fixes @ 2017-06-27 14:40 Brian Foster 2017-06-27 14:50 ` [PATCH] tests/xfs: test for log recovery failure after tail overwrite Brian Foster 0 siblings, 1 reply; 10+ messages in thread From: Brian Foster @ 2017-06-27 14:40 UTC (permalink / raw) To: linux-xfs Hi all, Here's a first real stab at a fix for the log tail overwrite issue. The general approach is similar to torn write detection: move the tail forward when corruption is detected within the range of a possible tail overwrite. Patch 1 fixes an independent and spurious log recovery failure when a log record header wraps around the end of the physical log. Patch 2 is a semi-preparatory patch that unconditionally invokes log tail verification rather than only after torn write detection at the head. Patch 3 introduces the core fix to move the tail forward in the event of corruption. Patch 4 introduces an error injection tag to force log item pinning and facilitates the test that reliably reproduces the tail overwrite problem. This survives the latest variant of the xfstests test that reproduces the tail overwrite condition and otherwise hasn't shown any regressions in my testing so far (still ongoing). This also allows the metadump images provided by Sweet Tea[1] to mount (though those images do still show filesystem corruption after mount, so I suspect something more is going on there). One other slight change worth noting in log recovery behavior is that tail overwrite detection causes earlier reporting of legitimate log CRC or corruption errors. Before this series, a log corruption that is not resolved by torn write/tail overwrite detection results in log recovery failure after a partial recovery up to the point at which the corruption is encountered. After this series, it is very likely that the corruption is identified during tail verification and an error returned to userspace before real recovery begins. An xfs_repair is necessary in either case, but I'm curious if there is a preference towards the old or newly proposed behavior. An alternative I've considered to preserve the old behavior, for example, is to use the tail verification CRC pass for tail fixing only (and otherwise consider errors at this point as nonfatal). This means that we would fix up the tail if possible, otherwise leave errors to the real recovery sequence such that a partial recovery can occur before the (imminent) failure. Thoughts? Brian v1: - Add patch to fix log recovery header wrapping problem. - Replace transaction reservation rfc with log recovery based fix. - Replace custom log pinning sysfs knob with error injection tag. rfc: http://www.spinics.net/lists/linux-xfs/msg07623.html [1] http://www.spinics.net/lists/linux-xfs/msg07667.html Brian Foster (4): xfs: fix recovery failure when log record header wraps log end xfs: always verify the log tail during recovery xfs: fix log recovery corruption error due to tail overwrite xfs: add log item pinning error injection tag fs/xfs/xfs_error.c | 3 + fs/xfs/xfs_error.h | 4 +- fs/xfs/xfs_log_recover.c | 150 +++++++++++++++++++++++++++++------------------ fs/xfs/xfs_trans_ail.c | 17 +++++- 4 files changed, 114 insertions(+), 60 deletions(-) -- 2.7.5 ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] tests/xfs: test for log recovery failure after tail overwrite 2017-06-27 14:40 [PATCH 0/4] xfs: log recovery wrap and tail overwrite fixes Brian Foster @ 2017-06-27 14:50 ` Brian Foster 0 siblings, 0 replies; 10+ messages in thread From: Brian Foster @ 2017-06-27 14:50 UTC (permalink / raw) To: linux-xfs XFS is susceptible to log recovery problems if the fs crashes under certain circumstances. If the tail has been pinned for long enough to the log to fill and the next batch of log buffer submissions happen to fail, the filesystem shutsdown having potentially overwritten part of the range between the last good tail->head range in the log. This causes log recovery to fail with crc mismatch or invalid log record errors. This problem is not yet fixed and thus known/expected to fail. At this time, this test serves as a reminder that the problem exists and a reproducer for future verification purposes. Note that this problem is currently only reproducible with larger (non-default) log buffer sizes (i.e., '-o logbsize=256k') or smaller block sizes (1k). Signed-off-by: Brian Foster <bfoster@redhat.com> --- Here's the latest variant of the test that can reproduce this problem. This version depends on the recent error injection sysfs knobs and the existence of the associated knobs for log item pinning and bad log record crc injection. I've left fstests off the cc list for now since much of that code is still outstanding. This is currently posted for reference. Brian tests/xfs/999 | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ tests/xfs/999.out | 2 + tests/xfs/group | 1 + 3 files changed, 116 insertions(+) create mode 100755 tests/xfs/999 create mode 100644 tests/xfs/999.out diff --git a/tests/xfs/999 b/tests/xfs/999 new file mode 100755 index 0000000..e321d73 --- /dev/null +++ b/tests/xfs/999 @@ -0,0 +1,113 @@ +#! /bin/bash +# FS QA Test No. 999 +# +# Attempt to reproduce log recovery failure by writing corrupt log records over +# the last good tail in the log. The tail is force pinned while a workload runs +# the head as close as possible behind the tail. Once the head is pinned, +# corrupted log records are written to the log and the filesystem shuts down. +# +# While log recovery should handle the corrupted log records, it has historical +# problems dealing with the situation where the corrupted log records may have +# overwritten the tail of the previous good record in the log. If this occurs, +# log recovery may fail. +# +# This can be reproduced more reliably under non-default conditions such as with +# the smallest supported FSB sizes and/or largest supported log buffer sizes and +# counts (logbufs and logbsize mount options). +# +# Note that this test requires a DEBUG mode kernel. +# +#----------------------------------------------------------------------- +# Copyright (c) 2017 Red Hat, Inc. All Rights Reserved. +# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License as +# published by the Free Software Foundation. +# +# This program is distributed in the hope that it would be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write the Free Software Foundation, +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +#----------------------------------------------------------------------- +# + +seq=`basename $0` +seqres=$RESULT_DIR/$seq +echo "QA output created by $seq" + +here=`pwd` +tmp=/tmp/$$ +status=1 # failure is the default! +trap "_cleanup; exit \$status" 0 1 2 3 15 + +_cleanup() +{ + cd / + rm -f $tmp.* + $KILLALL_PROG -9 fsstress > /dev/null 2>&1 + [ -e /sys/fs/xfs/$sdev/errortag/log_item_pin ] && + echo 0 > /sys/fs/xfs/$sdev/errortag/log_item_pin + wait > /dev/null 2>&1 +} + +rm -f $seqres.full + +# get standard environment, filters and checks +. ./common/rc + +# real QA test starts here + +# Modify as appropriate. +_supported_fs xfs +_supported_os Linux +_require_xfs_sysfs $(_short_dev $TEST_DEV)/errortag/log_item_pin +_require_xfs_sysfs $(_short_dev $TEST_DEV)/errortag/log_bad_crc +_require_scratch +_require_command "$KILLALL_PROG" killall + +echo "Silence is golden." + +sdev=$(_short_dev $SCRATCH_DEV) + +_scratch_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed" +_scratch_mount || _fail "mount failed" + +# populate the fs with some data and cycle the mount to reset the log head/tail +$FSSTRESS_PROG -d $SCRATCH_MNT -z -fcreat=1 -p 4 -n 100000 > /dev/null 2>&1 +_scratch_cycle_mount || _fail "mount failed" + +# Pin the tail and start a file removal workload. File removal tends to +# reproduce the corruption more reliably. +echo 1 > /sys/fs/xfs/$sdev/errortag/log_item_pin + +rm -rf $SCRATCH_MNT/* > /dev/null 2>&1 & +workpid=$! + +# wait for the head to stop pushing forward +prevhead=-1 +head=`cat /sys/fs/xfs/$sdev/log/log_head_lsn` +while [ "$head" != "$prevhead" ]; do + sleep 5 + prevhead=$head + head=`cat /sys/fs/xfs/$sdev/log/log_head_lsn` +done + +# Once the head is pinned behind the tail, enable log record corruption and +# unpin the tail. All subsequent log buffer writes end up corrupted on-disk and +# result in log I/O errors. +echo 1 > /sys/fs/xfs/$sdev/errortag/log_bad_crc +echo 0 > /sys/fs/xfs/$sdev/errortag/log_item_pin + +# wait for fs shutdown to kill the workload +wait $workpid + +# cycle mount to test log recovery +_scratch_cycle_mount + +# success, all done +status=0 +exit diff --git a/tests/xfs/999.out b/tests/xfs/999.out new file mode 100644 index 0000000..d254382 --- /dev/null +++ b/tests/xfs/999.out @@ -0,0 +1,2 @@ +QA output created by 999 +Silence is golden. diff --git a/tests/xfs/group b/tests/xfs/group index 2a3b950..a10fd6d 100644 --- a/tests/xfs/group +++ b/tests/xfs/group @@ -419,3 +419,4 @@ 419 auto quick 420 auto quick clone dedupe 421 auto quick clone dedupe +999 auto log -- 2.7.5 ^ permalink raw reply related [flat|nested] 10+ messages in thread
end of thread, other threads:[~2017-06-30 4:10 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-06-16 16:44 [PATCH] xfs: debug mode sysfs flag to force [un]pin the log tail Brian Foster 2017-06-16 16:46 ` [PATCH] tests/xfs: test for log recovery failure after tail overwrite Brian Foster 2017-06-30 3:44 ` Eryu Guan 2017-06-30 4:09 ` Darrick J. Wong 2017-06-20 21:48 ` [PATCH] xfs: debug mode sysfs flag to force [un]pin the log tail Darrick J. Wong 2017-06-21 10:16 ` Brian Foster 2017-06-21 15:47 ` Darrick J. Wong 2017-06-21 16:08 ` Brian Foster 2017-06-21 17:06 ` Darrick J. Wong -- strict thread matches above, loose matches on Subject: below -- 2017-06-27 14:40 [PATCH 0/4] xfs: log recovery wrap and tail overwrite fixes Brian Foster 2017-06-27 14:50 ` [PATCH] tests/xfs: test for log recovery failure after tail overwrite Brian Foster
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).