* [PATCH] xfs: cancel eofblocks background trimming on remount read-only
@ 2016-06-06 17:12 Brian Foster
2016-06-06 20:24 ` Eric Sandeen
0 siblings, 1 reply; 3+ messages in thread
From: Brian Foster @ 2016-06-06 17:12 UTC (permalink / raw)
To: xfs
The filesystem quiesce sequence performs the operations necessary to
drain all background work, push pending transactions through the log
infrastructure and wait on I/O resulting from the final AIL push. We
have had reports of remount,ro hangs in xfs_log_quiesce() ->
xfs_wait_buftarg(), however, and some instrumentation code to detect
transaction commits at this point in the quiesce sequence has inculpated
the eofblocks background scanner as a cause.
While higher level remount code generally prevents user modifications by
the time the filesystem has made it to xfs_log_quiesce(), the background
scanner may still be alive and can perform pending work at any time. If
this occurs between the xfs_log_force() and xfs_wait_buftarg() calls
within xfs_log_quiesce(), this can lead to an indefinite lockup in
xfs_wait_buftarg().
To prevent this problem, cancel the background eofblocks scan worker
during the remount read-only quiesce sequence. This suspends background
trimming when a filesystem is remounted read-only. This is only done in
the remount path because the freeze codepath has already locked out new
transactions by the time the filesystem attempts to quiesce (and thus
waiting on an active work item could deadlock). Kick the eofblocks
worker to pick up where it left off once an fs is remounted back to
read-write.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
To confirm the problem, I have managed to manufacture the remount,ro
hang issue described above by hacking in some delays/coordination
between the quiesce and eofblocks background worker.
Also, an alternative approach that I was considering is to run a
synchronous scan around the same place this patch cancels the background
scan. The idea is that the background scan would then have no work to do
on the subsequent iteration and then die off naturally (until
preallocation occurs once again). I suppose the downside is that a
remount might not necessarily be expected to muck with preallocation
state of affected files. This patch seemed more simple, but I'm open to
either. Thoughts?
Brian
fs/xfs/xfs_icache.c | 2 +-
fs/xfs/xfs_icache.h | 1 +
fs/xfs/xfs_super.c | 8 ++++++++
3 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 99ee6eee..fb39a66 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -765,7 +765,7 @@ restart:
* Background scanning to trim post-EOF preallocated space. This is queued
* based on the 'speculative_prealloc_lifetime' tunable (5m by default).
*/
-STATIC void
+void
xfs_queue_eofblocks(
struct xfs_mount *mp)
{
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index 62f1f91..05bac99 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -68,6 +68,7 @@ void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);
int xfs_icache_free_eofblocks(struct xfs_mount *, struct xfs_eofblocks *);
int xfs_inode_free_quota_eofblocks(struct xfs_inode *ip);
void xfs_eofblocks_worker(struct work_struct *);
+void xfs_queue_eofblocks(struct xfs_mount *);
int xfs_inode_ag_iterator(struct xfs_mount *mp,
int (*execute)(struct xfs_inode *ip, int flags, void *args),
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 4700f09..7965371 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1294,6 +1294,7 @@ xfs_fs_remount(
*/
xfs_restore_resvblks(mp);
xfs_log_work_queue(mp);
+ xfs_queue_eofblocks(mp);
}
/* rw -> ro */
@@ -1306,6 +1307,13 @@ xfs_fs_remount(
* return it to the same size.
*/
xfs_save_resvblks(mp);
+
+ /*
+ * Cancel background eofb scanning so it cannot race with the
+ * final log force+buftarg wait and deadlock the remount.
+ */
+ cancel_delayed_work_sync(&mp->m_eofblocks_work);
+
xfs_quiesce_attr(mp);
mp->m_flags |= XFS_MOUNT_RDONLY;
}
--
2.5.5
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] xfs: cancel eofblocks background trimming on remount read-only
2016-06-06 17:12 [PATCH] xfs: cancel eofblocks background trimming on remount read-only Brian Foster
@ 2016-06-06 20:24 ` Eric Sandeen
2016-06-06 20:57 ` Brian Foster
0 siblings, 1 reply; 3+ messages in thread
From: Eric Sandeen @ 2016-06-06 20:24 UTC (permalink / raw)
To: xfs
On 6/6/16 12:12 PM, Brian Foster wrote:
> The filesystem quiesce sequence performs the operations necessary to
> drain all background work, push pending transactions through the log
> infrastructure and wait on I/O resulting from the final AIL push. We
> have had reports of remount,ro hangs in xfs_log_quiesce() ->
> xfs_wait_buftarg(), however, and some instrumentation code to detect
> transaction commits at this point in the quiesce sequence has inculpated
> the eofblocks background scanner as a cause.
>
> While higher level remount code generally prevents user modifications by
> the time the filesystem has made it to xfs_log_quiesce(), the background
> scanner may still be alive and can perform pending work at any time. If
> this occurs between the xfs_log_force() and xfs_wait_buftarg() calls
> within xfs_log_quiesce(), this can lead to an indefinite lockup in
> xfs_wait_buftarg().
>
> To prevent this problem, cancel the background eofblocks scan worker
> during the remount read-only quiesce sequence. This suspends background
> trimming when a filesystem is remounted read-only. This is only done in
> the remount path because the freeze codepath has already locked out new
> transactions by the time the filesystem attempts to quiesce (and thus
> waiting on an active work item could deadlock). Kick the eofblocks
> worker to pick up where it left off once an fs is remounted back to
> read-write.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>
> To confirm the problem, I have managed to manufacture the remount,ro
> hang issue described above by hacking in some delays/coordination
> between the quiesce and eofblocks background worker.
>
> Also, an alternative approach that I was considering is to run a
> synchronous scan around the same place this patch cancels the background
> scan. The idea is that the background scan would then have no work to do
> on the subsequent iteration and then die off naturally (until
> preallocation occurs once again). I suppose the downside is that a
> remount might not necessarily be expected to muck with preallocation
> state of affected files. This patch seemed more simple, but I'm open to
> either. Thoughts?
Your approach makes sense to me - "finishing" the scan prior to quiesce
could take a relatively unknown amount of time as well, right? And I guess
I don't see a real advantage to that; we don't wait for it on unmount
(right?)
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
> Brian
>
> fs/xfs/xfs_icache.c | 2 +-
> fs/xfs/xfs_icache.h | 1 +
> fs/xfs/xfs_super.c | 8 ++++++++
> 3 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index 99ee6eee..fb39a66 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -765,7 +765,7 @@ restart:
> * Background scanning to trim post-EOF preallocated space. This is queued
> * based on the 'speculative_prealloc_lifetime' tunable (5m by default).
> */
> -STATIC void
> +void
> xfs_queue_eofblocks(
> struct xfs_mount *mp)
> {
> diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
> index 62f1f91..05bac99 100644
> --- a/fs/xfs/xfs_icache.h
> +++ b/fs/xfs/xfs_icache.h
> @@ -68,6 +68,7 @@ void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);
> int xfs_icache_free_eofblocks(struct xfs_mount *, struct xfs_eofblocks *);
> int xfs_inode_free_quota_eofblocks(struct xfs_inode *ip);
> void xfs_eofblocks_worker(struct work_struct *);
> +void xfs_queue_eofblocks(struct xfs_mount *);
>
> int xfs_inode_ag_iterator(struct xfs_mount *mp,
> int (*execute)(struct xfs_inode *ip, int flags, void *args),
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 4700f09..7965371 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1294,6 +1294,7 @@ xfs_fs_remount(
> */
> xfs_restore_resvblks(mp);
> xfs_log_work_queue(mp);
> + xfs_queue_eofblocks(mp);
> }
>
> /* rw -> ro */
> @@ -1306,6 +1307,13 @@ xfs_fs_remount(
> * return it to the same size.
> */
> xfs_save_resvblks(mp);
> +
> + /*
> + * Cancel background eofb scanning so it cannot race with the
> + * final log force+buftarg wait and deadlock the remount.
> + */
> + cancel_delayed_work_sync(&mp->m_eofblocks_work);
> +
> xfs_quiesce_attr(mp);
> mp->m_flags |= XFS_MOUNT_RDONLY;
> }
>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] xfs: cancel eofblocks background trimming on remount read-only
2016-06-06 20:24 ` Eric Sandeen
@ 2016-06-06 20:57 ` Brian Foster
0 siblings, 0 replies; 3+ messages in thread
From: Brian Foster @ 2016-06-06 20:57 UTC (permalink / raw)
To: Eric Sandeen; +Cc: xfs
On Mon, Jun 06, 2016 at 03:24:10PM -0500, Eric Sandeen wrote:
>
>
> On 6/6/16 12:12 PM, Brian Foster wrote:
> > The filesystem quiesce sequence performs the operations necessary to
> > drain all background work, push pending transactions through the log
> > infrastructure and wait on I/O resulting from the final AIL push. We
> > have had reports of remount,ro hangs in xfs_log_quiesce() ->
> > xfs_wait_buftarg(), however, and some instrumentation code to detect
> > transaction commits at this point in the quiesce sequence has inculpated
> > the eofblocks background scanner as a cause.
> >
> > While higher level remount code generally prevents user modifications by
> > the time the filesystem has made it to xfs_log_quiesce(), the background
> > scanner may still be alive and can perform pending work at any time. If
> > this occurs between the xfs_log_force() and xfs_wait_buftarg() calls
> > within xfs_log_quiesce(), this can lead to an indefinite lockup in
> > xfs_wait_buftarg().
> >
> > To prevent this problem, cancel the background eofblocks scan worker
> > during the remount read-only quiesce sequence. This suspends background
> > trimming when a filesystem is remounted read-only. This is only done in
> > the remount path because the freeze codepath has already locked out new
> > transactions by the time the filesystem attempts to quiesce (and thus
> > waiting on an active work item could deadlock). Kick the eofblocks
> > worker to pick up where it left off once an fs is remounted back to
> > read-write.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >
> > To confirm the problem, I have managed to manufacture the remount,ro
> > hang issue described above by hacking in some delays/coordination
> > between the quiesce and eofblocks background worker.
> >
> > Also, an alternative approach that I was considering is to run a
> > synchronous scan around the same place this patch cancels the background
> > scan. The idea is that the background scan would then have no work to do
> > on the subsequent iteration and then die off naturally (until
> > preallocation occurs once again). I suppose the downside is that a
> > remount might not necessarily be expected to muck with preallocation
> > state of affected files. This patch seemed more simple, but I'm open to
> > either. Thoughts?
>
> Your approach makes sense to me - "finishing" the scan prior to quiesce
> could take a relatively unknown amount of time as well, right? And I guess
> I don't see a real advantage to that; we don't wait for it on unmount
> (right?)
>
Yes, it could take an unknown amount of time. It would probably be
similar to what we would do on an unmount. We don't explicitly wait for
a scan on unmount (though we destroy the workqueue at some point), but I
believe we have to reclaim all cached inodes, which then trims eofblocks
for every inode where appropriate (via xfs_inactive()).
> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
>
Thanks!
Brian
> > Brian
> >
> > fs/xfs/xfs_icache.c | 2 +-
> > fs/xfs/xfs_icache.h | 1 +
> > fs/xfs/xfs_super.c | 8 ++++++++
> > 3 files changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > index 99ee6eee..fb39a66 100644
> > --- a/fs/xfs/xfs_icache.c
> > +++ b/fs/xfs/xfs_icache.c
> > @@ -765,7 +765,7 @@ restart:
> > * Background scanning to trim post-EOF preallocated space. This is queued
> > * based on the 'speculative_prealloc_lifetime' tunable (5m by default).
> > */
> > -STATIC void
> > +void
> > xfs_queue_eofblocks(
> > struct xfs_mount *mp)
> > {
> > diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
> > index 62f1f91..05bac99 100644
> > --- a/fs/xfs/xfs_icache.h
> > +++ b/fs/xfs/xfs_icache.h
> > @@ -68,6 +68,7 @@ void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);
> > int xfs_icache_free_eofblocks(struct xfs_mount *, struct xfs_eofblocks *);
> > int xfs_inode_free_quota_eofblocks(struct xfs_inode *ip);
> > void xfs_eofblocks_worker(struct work_struct *);
> > +void xfs_queue_eofblocks(struct xfs_mount *);
> >
> > int xfs_inode_ag_iterator(struct xfs_mount *mp,
> > int (*execute)(struct xfs_inode *ip, int flags, void *args),
> > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> > index 4700f09..7965371 100644
> > --- a/fs/xfs/xfs_super.c
> > +++ b/fs/xfs/xfs_super.c
> > @@ -1294,6 +1294,7 @@ xfs_fs_remount(
> > */
> > xfs_restore_resvblks(mp);
> > xfs_log_work_queue(mp);
> > + xfs_queue_eofblocks(mp);
> > }
> >
> > /* rw -> ro */
> > @@ -1306,6 +1307,13 @@ xfs_fs_remount(
> > * return it to the same size.
> > */
> > xfs_save_resvblks(mp);
> > +
> > + /*
> > + * Cancel background eofb scanning so it cannot race with the
> > + * final log force+buftarg wait and deadlock the remount.
> > + */
> > + cancel_delayed_work_sync(&mp->m_eofblocks_work);
> > +
> > xfs_quiesce_attr(mp);
> > mp->m_flags |= XFS_MOUNT_RDONLY;
> > }
> >
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2016-06-06 20:58 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-06-06 17:12 [PATCH] xfs: cancel eofblocks background trimming on remount read-only Brian Foster
2016-06-06 20:24 ` Eric Sandeen
2016-06-06 20:57 ` Brian Foster
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox