* [PATCH] xfs: force log before unmount [not found] <20120621192541.630212563@sgi.com> @ 2012-06-21 19:25 ` tinguely 2012-06-21 23:50 ` Dave Chinner 0 siblings, 1 reply; 4+ messages in thread From: tinguely @ 2012-06-21 19:25 UTC (permalink / raw) To: xfs [-- Attachment #1: xfs-force-log-before-unmount.patch --] [-- Type: text/plain, Size: 2235 bytes --] From: Mark Tinguely <tinguely@sgi.com> Testing has found a few crashes of dereference of NULL in the xlog_assign_tail_lsn_locked() and xlog_get_lowest_lsn() routines from buf iodone callbacks. The filesystem is be shutdown. At the point of the crash, the log has alreadly been freed which leads to the NULL pointer dereference. One crash still pointed to the buffer that was going through the iodone path and that buffer had XFS_SB_MAGIC in it. This patch sits on top of Ben's proposed patch to shutdown the sync worker. Although it did not cause the crash I would like to be on the safe side, and perform the shutdown of the sync worker to occur before the last write of the superblock. Then force the log to get the superblock buffer into the AIL before it is pushed for the last time.. To the best of my searching, xfs_log_unmount_write() should not dirty the superblock.. Signed-off-by: Mark Tinguely <tinguely@sgi.com> --- fs/xfs/xfs_log.c | 1 - fs/xfs/xfs_mount.c | 2 ++ 2 files changed, 2 insertions(+), 1 deletion(-) Index: b/fs/xfs/xfs_log.c =================================================================== --- a/fs/xfs/xfs_log.c +++ b/fs/xfs/xfs_log.c @@ -810,7 +810,6 @@ xfs_log_unmount_write(xfs_mount_t *mp) void xfs_log_unmount(xfs_mount_t *mp) { - cancel_delayed_work_sync(&mp->m_sync_work); xfs_trans_ail_destroy(mp); xlog_dealloc_log(mp->m_log); } Index: b/fs/xfs/xfs_mount.c =================================================================== --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -1517,6 +1517,7 @@ xfs_unmountfs( xfs_warn(mp, "Unable to free reserved block pool. " "Freespace may not be correct on next mount."); + cancel_delayed_work_sync(&mp->m_sync_work); error = xfs_log_sbcount(mp); if (error) xfs_warn(mp, "Unable to update superblock counters. " @@ -1526,6 +1527,7 @@ xfs_unmountfs( * At this point we might have modified the superblock again and thus * added an item to the AIL, thus flush it again. */ + xfs_log_force(mp, XFS_LOG_SYNC); xfs_ail_push_all_sync(mp->m_ail); xfs_wait_buftarg(mp->m_ddev_targp); _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: force log before unmount 2012-06-21 19:25 ` [PATCH] xfs: force log before unmount tinguely @ 2012-06-21 23:50 ` Dave Chinner 2012-06-22 20:39 ` Mark Tinguely 0 siblings, 1 reply; 4+ messages in thread From: Dave Chinner @ 2012-06-21 23:50 UTC (permalink / raw) To: tinguely; +Cc: xfs On Thu, Jun 21, 2012 at 02:25:42PM -0500, tinguely@sgi.com wrote: > From: Mark Tinguely <tinguely@sgi.com> > > Testing has found a few crashes of dereference of NULL in the > xlog_assign_tail_lsn_locked() and xlog_get_lowest_lsn() routines from > buf iodone callbacks. The filesystem is be shutdown. At the point of > the crash, the log has alreadly been freed which leads to the NULL pointer > dereference. > > One crash still pointed to the buffer that was going through the iodone > path and that buffer had XFS_SB_MAGIC in it. Which points to the problem, but you haven't quite worked out what the root cause of this is so the patch doesn't actually fix the problem, It changes the timing of IO, so it can certainly mke it seem like it fixed the race condition.... > This patch sits on top of Ben's proposed patch to shutdown the sync worker. > Although it did not cause the crash I would like to be on the safe side, and > perform the shutdown of the sync worker to occur before the last write of the > superblock. The sync worker is not an actor in this race condition - where it was placed in xfs_log_unmount() is the correct place for it. > Then force the log to get the superblock buffer into the AIL before > it is pushed for the last time.. The superblock is already in the AIL - otherwise we wouldn't have crashed trying to remove it from the AIL after the log has been torn down. > To the best of my searching, xfs_log_unmount_write() should not dirty the > superblock.. It doesn't - it just writes an unmount record to the log. > Signed-off-by: Mark Tinguely <tinguely@sgi.com> > --- > fs/xfs/xfs_log.c | 1 - > fs/xfs/xfs_mount.c | 2 ++ > 2 files changed, 2 insertions(+), 1 deletion(-) > > Index: b/fs/xfs/xfs_log.c > =================================================================== > --- a/fs/xfs/xfs_log.c > +++ b/fs/xfs/xfs_log.c > @@ -810,7 +810,6 @@ xfs_log_unmount_write(xfs_mount_t *mp) > void > xfs_log_unmount(xfs_mount_t *mp) > { > - cancel_delayed_work_sync(&mp->m_sync_work); > xfs_trans_ail_destroy(mp); > xlog_dealloc_log(mp->m_log); > } > Index: b/fs/xfs/xfs_mount.c > =================================================================== > --- a/fs/xfs/xfs_mount.c > +++ b/fs/xfs/xfs_mount.c > @@ -1517,6 +1517,7 @@ xfs_unmountfs( > xfs_warn(mp, "Unable to free reserved block pool. " > "Freespace may not be correct on next mount."); > > + cancel_delayed_work_sync(&mp->m_sync_work); > error = xfs_log_sbcount(mp); > if (error) > xfs_warn(mp, "Unable to update superblock counters. " > @@ -1526,6 +1527,7 @@ xfs_unmountfs( > * At this point we might have modified the superblock again and thus > * added an item to the AIL, thus flush it again. > */ > + xfs_log_force(mp, XFS_LOG_SYNC); This is a no-op - xfs_log_sbcount() does a synchronous transaction, so the log has already been forced at this point and the superblock is not pinned in memory. > xfs_ail_push_all_sync(mp->m_ail); > xfs_wait_buftarg(mp->m_ddev_targp); So, what is different about the superblock buffer? It's an uncached buffer. What does that mean? It means that when xfs_wait_buftarg() walks the LRU, it never finds the superblock buffer because uncached buffers are never put on the LRU. Hence, when xfs_ail_push_all_sync() triggers an async write of the superblock, nobody ever waits for it to complete. Hence the log and AIL are torn down, and when the IO completes it goes to remove it from the AIL (which succeeds) and then move the log tail because it was the only item in the AIL, it goes kaboom. So the root cause is that we are not waiting for the superblock IO to complete i.e. it's not at all related to the xfs_sync_worker(), log forces, etc. What we really need to do is after the xfs_log_sbcount() call is write the superblock synchronously, and we can do that in xfs_log_sbcount() because all the callers of xfs_log_sbcount() have this same problem of not waiting for the SB to be written correctly. i.e. add the guts of xfs_sync_fsdata() to xfs_log_sbcount().... Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: force log before unmount 2012-06-21 23:50 ` Dave Chinner @ 2012-06-22 20:39 ` Mark Tinguely 2012-06-22 23:37 ` Dave Chinner 0 siblings, 1 reply; 4+ messages in thread From: Mark Tinguely @ 2012-06-22 20:39 UTC (permalink / raw) To: Dave Chinner; +Cc: xfs On 06/21/12 18:50, Dave Chinner wrote: > On Thu, Jun 21, 2012 at 02:25:42PM -0500, tinguely@sgi.com wrote: >> From: Mark Tinguely<tinguely@sgi.com> >> >> Testing has found a few crashes of dereference of NULL in the >> xlog_assign_tail_lsn_locked() and xlog_get_lowest_lsn() routines from >> buf iodone callbacks. The filesystem is be shutdown. At the point of >> the crash, the log has alreadly been freed which leads to the NULL pointer >> dereference. >> >> One crash still pointed to the buffer that was going through the iodone >> path and that buffer had XFS_SB_MAGIC in it. > > Which points to the problem, but you haven't quite worked out what > the root cause of this is so the patch doesn't actually fix the > problem, It changes the timing of IO, so it can certainly mke it > seem like it fixed the race condition.... > >> This patch sits on top of Ben's proposed patch to shutdown the sync worker. >> Although it did not cause the crash I would like to be on the safe side, and >> perform the shutdown of the sync worker to occur before the last write of the >> superblock. > > The sync worker is not an actor in this race condition - where it > was placed in xfs_log_unmount() is the correct place for it. > >> Then force the log to get the superblock buffer into the AIL before >> it is pushed for the last time.. > > The superblock is already in the AIL - otherwise we wouldn't have > crashed trying to remove it from the AIL after the log has been torn > down. > >> To the best of my searching, xfs_log_unmount_write() should not dirty the >> superblock.. > > It doesn't - it just writes an unmount record to the log. > >> Signed-off-by: Mark Tinguely<tinguely@sgi.com> >> --- >> fs/xfs/xfs_log.c | 1 - >> fs/xfs/xfs_mount.c | 2 ++ >> 2 files changed, 2 insertions(+), 1 deletion(-) >> >> Index: b/fs/xfs/xfs_log.c >> =================================================================== >> --- a/fs/xfs/xfs_log.c >> +++ b/fs/xfs/xfs_log.c >> @@ -810,7 +810,6 @@ xfs_log_unmount_write(xfs_mount_t *mp) >> void >> xfs_log_unmount(xfs_mount_t *mp) >> { >> - cancel_delayed_work_sync(&mp->m_sync_work); >> xfs_trans_ail_destroy(mp); >> xlog_dealloc_log(mp->m_log); >> } >> Index: b/fs/xfs/xfs_mount.c >> =================================================================== >> --- a/fs/xfs/xfs_mount.c >> +++ b/fs/xfs/xfs_mount.c >> @@ -1517,6 +1517,7 @@ xfs_unmountfs( >> xfs_warn(mp, "Unable to free reserved block pool. " >> "Freespace may not be correct on next mount."); >> >> + cancel_delayed_work_sync(&mp->m_sync_work); >> error = xfs_log_sbcount(mp); >> if (error) >> xfs_warn(mp, "Unable to update superblock counters. " >> @@ -1526,6 +1527,7 @@ xfs_unmountfs( >> * At this point we might have modified the superblock again and thus >> * added an item to the AIL, thus flush it again. >> */ >> + xfs_log_force(mp, XFS_LOG_SYNC); > > This is a no-op - xfs_log_sbcount() does a synchronous transaction, > so the log has already been forced at this point and the superblock > is not pinned in memory. > >> xfs_ail_push_all_sync(mp->m_ail); >> xfs_wait_buftarg(mp->m_ddev_targp); > > So, what is different about the superblock buffer? It's an uncached > buffer. What does that mean? It means that when xfs_wait_buftarg() > walks the LRU, it never finds the superblock buffer because uncached > buffers are never put on the LRU. > Thanks Dave. Just interested; are the uncached buffers kept off the LRU with an extra b_hold reference? > Hence, when xfs_ail_push_all_sync() triggers an async write of the > superblock, nobody ever waits for it to complete. Hence the log and > AIL are torn down, and when the IO completes it goes to remove it > from the AIL (which succeeds) and then move the log tail because it > was the only item in the AIL, it goes kaboom. > > So the root cause is that we are not waiting for the superblock IO > to complete i.e. it's not at all related to the xfs_sync_worker(), log > forces, etc. What we really need to do is after the > xfs_log_sbcount() call is write the superblock synchronously, and we > can do that in xfs_log_sbcount() because all the callers of > xfs_log_sbcount() have this same problem of not waiting for the SB > to be written correctly. i.e. add the guts of xfs_sync_fsdata() to > xfs_log_sbcount().... > I did not expect the sync worker was the cause. Ben's patch is not even in at least one of the crashes. I thought it would be good thing to have the sync worker idle by the time we write the unmount record in the log. Thank-you again for the lesson and insight. --Mark Tinguely. _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: force log before unmount 2012-06-22 20:39 ` Mark Tinguely @ 2012-06-22 23:37 ` Dave Chinner 0 siblings, 0 replies; 4+ messages in thread From: Dave Chinner @ 2012-06-22 23:37 UTC (permalink / raw) To: Mark Tinguely; +Cc: xfs On Fri, Jun 22, 2012 at 03:39:35PM -0500, Mark Tinguely wrote: > On 06/21/12 18:50, Dave Chinner wrote: > >On Thu, Jun 21, 2012 at 02:25:42PM -0500, tinguely@sgi.com wrote: > >> xfs_ail_push_all_sync(mp->m_ail); > >> xfs_wait_buftarg(mp->m_ddev_targp); > > > >So, what is different about the superblock buffer? It's an uncached > >buffer. What does that mean? It means that when xfs_wait_buftarg() > >walks the LRU, it never finds the superblock buffer because uncached > >buffers are never put on the LRU. > > > > Thanks Dave. Just interested; are the uncached buffers kept off the > LRU with an extra b_hold reference? No, they take the uncached path through xfs_buf_rele() which avoids putting them on the LRU. i.e: 828 void 829 xfs_buf_rele( 830 xfs_buf_t *bp) 831 { 832 struct xfs_perag *pag = bp->b_pag; 833 834 trace_xfs_buf_rele(bp, _RET_IP_); 835 836 if (!pag) { 837 ASSERT(list_empty(&bp->b_lru)); 838 ASSERT(RB_EMPTY_NODE(&bp->b_rbnode)); 839 if (atomic_dec_and_test(&bp->b_hold)) 840 xfs_buf_free(bp); 841 return; 842 } This first !pag path.... > >Hence, when xfs_ail_push_all_sync() triggers an async write of the > >superblock, nobody ever waits for it to complete. Hence the log and > >AIL are torn down, and when the IO completes it goes to remove it > >from the AIL (which succeeds) and then move the log tail because it > >was the only item in the AIL, it goes kaboom. > > > >So the root cause is that we are not waiting for the superblock IO > >to complete i.e. it's not at all related to the xfs_sync_worker(), log > >forces, etc. What we really need to do is after the > >xfs_log_sbcount() call is write the superblock synchronously, and we > >can do that in xfs_log_sbcount() because all the callers of > >xfs_log_sbcount() have this same problem of not waiting for the SB > >to be written correctly. i.e. add the guts of xfs_sync_fsdata() to > >xfs_log_sbcount().... > > > I did not expect the sync worker was the cause. Ben's patch is not even > in at least one of the crashes. I thought it would be good thing to have > the sync worker idle by the time we write the unmount record in the log. I guess it's possible it might race with writing back the superblock, but I think that we need to solve this problem once and for all rather than moving the sync worker shutdown every time we find a problem. I think part of that needs to be handled by the log itself given that really all the sync worker is doing is trying to idle the log.... Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2012-06-22 23:37 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20120621192541.630212563@sgi.com>
2012-06-21 19:25 ` [PATCH] xfs: force log before unmount tinguely
2012-06-21 23:50 ` Dave Chinner
2012-06-22 20:39 ` Mark Tinguely
2012-06-22 23:37 ` Dave Chinner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox