* [PATCH 0/2] xfs: regression fixes
@ 2014-05-26 22:26 Dave Chinner
2014-05-26 22:26 ` [PATCH 1/2] xfs: block allocation work needs to be kswapd aware Dave Chinner
2014-05-26 22:26 ` [PATCH 2/2] xfs: xfs_readsb needs to check for magic numbers Dave Chinner
0 siblings, 2 replies; 5+ messages in thread
From: Dave Chinner @ 2014-05-26 22:26 UTC (permalink / raw)
To: xfs
Hi folks,
These are a couple of fixes that have come to light in the past few
days. The first prevents bad things happening in memory reclaim
when kswapd delegates block allocation to a workqueue, and the
second ensures XFS doesn't incorrectly try to check the sector size
on non-xfs devices when mount is probing.
Both are regressions, but given how late in the 3.15 cycle we are,
I'll probably just push these through the 3.16 merge to give them a
bit more testing and let them be handled by stable kernel backports.
Anyone have any particular concerns over that plan?
Cheers,
Dave.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/2] xfs: block allocation work needs to be kswapd aware
2014-05-26 22:26 [PATCH 0/2] xfs: regression fixes Dave Chinner
@ 2014-05-26 22:26 ` Dave Chinner
2014-05-27 10:38 ` Christoph Hellwig
2014-05-26 22:26 ` [PATCH 2/2] xfs: xfs_readsb needs to check for magic numbers Dave Chinner
1 sibling, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2014-05-26 22:26 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
Upon memory pressure, kswapd calls xfs_vm_writepage() from
shrink_page_list(). This can result in delayed allocation occurring
and that gets deferred to the the allocation workqueue.
The allocation then runs outside kswapd context, which means if it
needs memory (and it does to demand page metadata from disk) it can
block in shrink_inactive_list() waiting for IO congestion. These
blocking waits are normally avoiding in kswapd context, so under
memory pressure writeback from kswapd can be arbitrarily delayed by
memory reclaim.
To avoid this, pass the kswapd context to the allocation being done
by the workqueue, so that memory reclaim understands correctly that
the work is being done for kswapd and therefore it is not blocked
and does not delay memory reclaim.
To avoid issues with int->char conversion of flag fields (as noticed
in v1 of this patch) convert the flag fields in the struct
xfs_bmalloca to bool types. pahole indicates these variables are
still single byte variables, so no extra space is consumed by this
change.
cc: <stable@vger.kernel.org>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_bmap_util.c | 16 +++++++++++++---
fs/xfs/xfs_bmap_util.h | 13 +++++++------
2 files changed, 20 insertions(+), 9 deletions(-)
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 057f671..703b3ec 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -258,14 +258,23 @@ xfs_bmapi_allocate_worker(
struct xfs_bmalloca *args = container_of(work,
struct xfs_bmalloca, work);
unsigned long pflags;
+ unsigned long new_pflags = PF_FSTRANS;
- /* we are in a transaction context here */
- current_set_flags_nested(&pflags, PF_FSTRANS);
+ /*
+ * we are in a transaction context here, but may also be doing work
+ * in kswapd context, and hence we may need to inherit that state
+ * temporarily to ensure that we don't block waiting for memory reclaim
+ * in any way.
+ */
+ if (args->kswapd)
+ new_pflags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;
+
+ current_set_flags_nested(&pflags, new_pflags);
args->result = __xfs_bmapi_allocate(args);
complete(args->done);
- current_restore_flags_nested(&pflags, PF_FSTRANS);
+ current_restore_flags_nested(&pflags, new_pflags);
}
/*
@@ -284,6 +293,7 @@ xfs_bmapi_allocate(
args->done = &done;
+ args->kswapd = current_is_kswapd();
INIT_WORK_ONSTACK(&args->work, xfs_bmapi_allocate_worker);
queue_work(xfs_alloc_wq, &args->work);
wait_for_completion(&done);
diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
index 935ed2b..075f722 100644
--- a/fs/xfs/xfs_bmap_util.h
+++ b/fs/xfs/xfs_bmap_util.h
@@ -50,12 +50,13 @@ struct xfs_bmalloca {
xfs_extlen_t total; /* total blocks needed for xaction */
xfs_extlen_t minlen; /* minimum allocation size (blocks) */
xfs_extlen_t minleft; /* amount must be left after alloc */
- char eof; /* set if allocating past last extent */
- char wasdel; /* replacing a delayed allocation */
- char userdata;/* set if is user data */
- char aeof; /* allocated space at eof */
- char conv; /* overwriting unwritten extents */
- char stack_switch;
+ bool eof; /* set if allocating past last extent */
+ bool wasdel; /* replacing a delayed allocation */
+ bool userdata;/* set if is user data */
+ bool aeof; /* allocated space at eof */
+ bool conv; /* overwriting unwritten extents */
+ bool stack_switch;
+ bool kswapd; /* allocation in kswapd context */
int flags;
struct completion *done;
struct work_struct work;
--
1.9.0
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 2/2] xfs: xfs_readsb needs to check for magic numbers
2014-05-26 22:26 [PATCH 0/2] xfs: regression fixes Dave Chinner
2014-05-26 22:26 ` [PATCH 1/2] xfs: block allocation work needs to be kswapd aware Dave Chinner
@ 2014-05-26 22:26 ` Dave Chinner
2014-05-27 10:40 ` Christoph Hellwig
1 sibling, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2014-05-26 22:26 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
Commit daba542 ("xfs: skip verification on initial "guess"
superblock read") dropped the use of a verifier for the initial
superblock read so we can probe the sector size of the filesystem
stored in the superblock. It, however, now fails to validate that
what was read initially is actually an XFS superblock and hence will
fail the sector size check and return ENOSYS.
This causes probe-based mounts to fail because it expects XFS to
return EINVAL when it doesn't recognise the superblock format.
cc: <stable@vger.kernel.org>
Reported-by: Plamen Petrov <plamen.sisi@gmail.com>
Tested-by: Plamen Petrov <plamen.sisi@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_mount.c | 23 +++++++++++++++++------
1 file changed, 17 insertions(+), 6 deletions(-)
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 3f09782..fa8a420 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -323,8 +323,19 @@ reread:
/*
* Initialize the mount structure from the superblock.
*/
- xfs_sb_from_disk(&mp->m_sb, XFS_BUF_TO_SBP(bp));
- xfs_sb_quota_from_disk(&mp->m_sb);
+ xfs_sb_from_disk(sbp, XFS_BUF_TO_SBP(bp));
+ xfs_sb_quota_from_disk(sbp);
+
+ /*
+ * If we haven't validated the superblock, do so now before we try
+ * to check the sector size and reread the superblock appropriately.
+ */
+ if (sbp->sb_magicnum != XFS_SB_MAGIC) {
+ if (loud)
+ xfs_warn(mp, "Invalid superblock magic number");
+ error = EINVAL;
+ goto release_buf;
+ }
/*
* We must be able to do sector-sized and sector-aligned IO.
@@ -337,11 +348,11 @@ reread:
goto release_buf;
}
- /*
- * Re-read the superblock so the buffer is correctly sized,
- * and properly verified.
- */
if (buf_ops == NULL) {
+ /*
+ * Re-read the superblock so the buffer is correctly sized,
+ * and properly verified.
+ */
xfs_buf_relse(bp);
sector_size = sbp->sb_sectsize;
buf_ops = loud ? &xfs_sb_buf_ops : &xfs_sb_quiet_buf_ops;
--
1.9.0
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] xfs: block allocation work needs to be kswapd aware
2014-05-26 22:26 ` [PATCH 1/2] xfs: block allocation work needs to be kswapd aware Dave Chinner
@ 2014-05-27 10:38 ` Christoph Hellwig
0 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2014-05-27 10:38 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Tue, May 27, 2014 at 08:26:53AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> Upon memory pressure, kswapd calls xfs_vm_writepage() from
> shrink_page_list(). This can result in delayed allocation occurring
> and that gets deferred to the the allocation workqueue.
>
> The allocation then runs outside kswapd context, which means if it
> needs memory (and it does to demand page metadata from disk) it can
> block in shrink_inactive_list() waiting for IO congestion. These
> blocking waits are normally avoiding in kswapd context, so under
> memory pressure writeback from kswapd can be arbitrarily delayed by
> memory reclaim.
>
> To avoid this, pass the kswapd context to the allocation being done
> by the workqueue, so that memory reclaim understands correctly that
> the work is being done for kswapd and therefore it is not blocked
> and does not delay memory reclaim.
>
> To avoid issues with int->char conversion of flag fields (as noticed
> in v1 of this patch) convert the flag fields in the struct
> xfs_bmalloca to bool types. pahole indicates these variables are
> still single byte variables, so no extra space is consumed by this
> change.
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] xfs: xfs_readsb needs to check for magic numbers
2014-05-26 22:26 ` [PATCH 2/2] xfs: xfs_readsb needs to check for magic numbers Dave Chinner
@ 2014-05-27 10:40 ` Christoph Hellwig
0 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2014-05-27 10:40 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Tue, May 27, 2014 at 08:26:54AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> Commit daba542 ("xfs: skip verification on initial "guess"
> superblock read") dropped the use of a verifier for the initial
> superblock read so we can probe the sector size of the filesystem
> stored in the superblock. It, however, now fails to validate that
> what was read initially is actually an XFS superblock and hence will
> fail the sector size check and return ENOSYS.
>
> This causes probe-based mounts to fail because it expects XFS to
> return EINVAL when it doesn't recognise the superblock format.
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2014-05-27 10:40 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-05-26 22:26 [PATCH 0/2] xfs: regression fixes Dave Chinner
2014-05-26 22:26 ` [PATCH 1/2] xfs: block allocation work needs to be kswapd aware Dave Chinner
2014-05-27 10:38 ` Christoph Hellwig
2014-05-26 22:26 ` [PATCH 2/2] xfs: xfs_readsb needs to check for magic numbers Dave Chinner
2014-05-27 10:40 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox