From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: [PATCH 061/102] xfs: introduce an allocation workqueue
Date: Thu, 23 Aug 2012 15:02:19 +1000 [thread overview]
Message-ID: <1345698180-13612-62-git-send-email-david@fromorbit.com> (raw)
In-Reply-To: <1345698180-13612-1-git-send-email-david@fromorbit.com>
From: Dave Chinner <dchinner@redhat.com>
Upstream commit: c999a223c2f0d31c64ef7379814cea1378b2b800
We currently have significant issues with the amount of stack that
allocation in XFS uses, especially in the writeback path. We can
easily consume 4k of stack between mapping the page, manipulating
the bmap btree and allocating blocks from the free list. Not to
mention btree block readahead and other functionality that issues IO
in the allocation path.
As a result, we can no longer fit allocation in the writeback path
in the stack space provided on x86_64. To alleviate this problem,
introduce an allocation workqueue and move all allocations to a
seperate context. This can be easily added as an interposing layer
into xfs_alloc_vextent(), which takes a single argument structure
and does not return until the allocation is complete or has failed.
To do this, add a work structure and a completion to the allocation
args structure. This allows xfs_alloc_vextent to queue the args onto
the workqueue and wait for it to be completed by the worker. This
can be done completely transparently to the caller.
The worker function needs to ensure that it sets and clears the
PF_TRANS flag appropriately as it is being run in an active
transaction context. Work can also be queued in a memory reclaim
context, so a rescuer is needed for the workqueue.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
---
fs/xfs/linux-2.6/xfs_super.c | 16 ++++++++++++++++
fs/xfs/xfs_alloc.c | 34 +++++++++++++++++++++++++++++++++-
fs/xfs/xfs_alloc.h | 5 +++++
3 files changed, 54 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index ec5ca7b..d687329 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -1645,12 +1645,28 @@ xfs_init_workqueues(void)
xfs_syncd_wq = alloc_workqueue("xfssyncd", WQ_NON_REENTRANT, 0);
if (!xfs_syncd_wq)
return -ENOMEM;
+
+ /*
+ * The allocation workqueue can be used in memory reclaim situations
+ * (writepage path), and parallelism is only limited by the number of
+ * AGs in all the filesystems mounted. Hence use the default large
+ * max_active value for this workqueue.
+ */
+ xfs_alloc_wq = alloc_workqueue("xfsalloc", WQ_MEM_RECLAIM, 0);
+ if (!xfs_alloc_wq)
+ goto out_destroy_syncd;
+
return 0;
+
+out_destroy_syncd:
+ destroy_workqueue(xfs_syncd_wq);
+ return -ENOMEM;
}
STATIC void
xfs_destroy_workqueues(void)
{
+ destroy_workqueue(xfs_alloc_wq);
destroy_workqueue(xfs_syncd_wq);
}
diff --git a/fs/xfs/xfs_alloc.c b/fs/xfs/xfs_alloc.c
index 95862bb..2445e1a 100644
--- a/fs/xfs/xfs_alloc.c
+++ b/fs/xfs/xfs_alloc.c
@@ -35,6 +35,7 @@
#include "xfs_error.h"
#include "xfs_trace.h"
+struct workqueue_struct *xfs_alloc_wq;
#define XFS_ABSDIFF(a,b) (((a) <= (b)) ? ((b) - (a)) : ((a) - (b)))
@@ -2212,7 +2213,7 @@ xfs_alloc_read_agf(
* group or loop over the allocation groups to find the result.
*/
int /* error */
-xfs_alloc_vextent(
+__xfs_alloc_vextent(
xfs_alloc_arg_t *args) /* allocation argument structure */
{
xfs_agblock_t agsize; /* allocation group size */
@@ -2422,6 +2423,37 @@ error0:
return error;
}
+static void
+xfs_alloc_vextent_worker(
+ struct work_struct *work)
+{
+ struct xfs_alloc_arg *args = container_of(work,
+ struct xfs_alloc_arg, work);
+ unsigned long pflags;
+
+ /* we are in a transaction context here */
+ current_set_flags_nested(&pflags, PF_FSTRANS);
+
+ args->result = __xfs_alloc_vextent(args);
+ complete(args->done);
+
+ current_restore_flags_nested(&pflags, PF_FSTRANS);
+}
+
+
+int /* error */
+xfs_alloc_vextent(
+ xfs_alloc_arg_t *args) /* allocation argument structure */
+{
+ DECLARE_COMPLETION_ONSTACK(done);
+
+ args->done = &done;
+ INIT_WORK(&args->work, xfs_alloc_vextent_worker);
+ queue_work(xfs_alloc_wq, &args->work);
+ wait_for_completion(&done);
+ return args->result;
+}
+
/*
* Free an extent.
* Just break up the extent address and hand off to xfs_free_ag_extent
diff --git a/fs/xfs/xfs_alloc.h b/fs/xfs/xfs_alloc.h
index 2f52b92..ab5d0fd 100644
--- a/fs/xfs/xfs_alloc.h
+++ b/fs/xfs/xfs_alloc.h
@@ -25,6 +25,8 @@ struct xfs_perag;
struct xfs_trans;
struct xfs_busy_extent;
+extern struct workqueue_struct *xfs_alloc_wq;
+
/*
* Freespace allocation types. Argument to xfs_alloc_[v]extent.
*/
@@ -119,6 +121,9 @@ typedef struct xfs_alloc_arg {
char isfl; /* set if is freelist blocks - !acctg */
char userdata; /* set if this is user data */
xfs_fsblock_t firstblock; /* io first block allocated */
+ struct completion *done;
+ struct work_struct work;
+ int result;
} xfs_alloc_arg_t;
/*
--
1.7.10
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2012-08-23 5:04 UTC|newest]
Thread overview: 117+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-08-23 5:01 [RFC, PATCH 0/102]: xfs: 3.0.x stable kernel update Dave Chinner
2012-08-23 5:01 ` [PATCH 001/102] xfs: don't serialise adjacent concurrent direct IO appending writes Dave Chinner
2012-08-23 5:01 ` [PATCH 002/102] xfs: remove dead ENODEV handling in xfs_destroy_ioend Dave Chinner
2012-08-23 5:01 ` [PATCH 003/102] xfs: defer AIO/DIO completions Dave Chinner
2012-08-23 5:01 ` [PATCH 004/102] xfs: reduce ioend latency Dave Chinner
2012-08-23 5:01 ` [PATCH 005/102] xfs: wait for I/O completion when writing out pages in xfs_setattr_size Dave Chinner
2012-08-23 5:01 ` [PATCH 006/102] xfs: improve ioend error handling Dave Chinner
2012-08-23 5:01 ` [PATCH 007/102] xfs: Check the return value of xfs_buf_get() Dave Chinner
2012-08-23 5:01 ` [PATCH 008/102] xfs: Check the return value of xfs_trans_get_buf() Dave Chinner
2012-08-23 5:01 ` [PATCH 009/102] xfs: dont ignore error code from xfs_bmbt_update Dave Chinner
2012-08-23 5:01 ` [PATCH 010/102] xfs: fix possible overflow in xfs_ioc_trim() Dave Chinner
2012-08-23 5:01 ` [PATCH 011/102] xfs: XFS_TRANS_SWAPEXT is not a valid flag for Dave Chinner
2012-08-23 5:01 ` [PATCH 012/102] xfs: Don't allocate new buffers on every call to _xfs_buf_find Dave Chinner
2012-08-23 5:01 ` [PATCH 013/102] xfs: reduce the number of log forces from tail pushing Dave Chinner
2012-08-23 5:01 ` [PATCH 014/102] xfs: optimize fsync on directories Dave Chinner
2012-08-23 5:01 ` [PATCH 015/102] xfs: clean up buffer allocation Dave Chinner
2012-08-23 5:01 ` [PATCH 016/102] xfs: clean up xfs_ioerror_alert Dave Chinner
2012-08-23 5:01 ` [PATCH 017/102] xfs: use xfs_ioerror_alert in xfs_buf_iodone_callbacks Dave Chinner
2012-08-23 5:01 ` [PATCH 018/102] xfs: do not flush data workqueues in xfs_flush_buftarg Dave Chinner
2012-08-23 5:01 ` [PATCH 019/102] xfs: add AIL pushing tracepoints Dave Chinner
2012-08-23 5:01 ` [PATCH 020/102] xfs: warn if direct reclaim tries to writeback pages Dave Chinner
2012-08-27 18:17 ` Christoph Hellwig
2012-09-05 11:32 ` Mel Gorman
2012-09-13 20:12 ` Mark Tinguely
2012-09-14 9:45 ` Mel Gorman
2012-09-14 12:56 ` Mark Tinguely
2012-09-14 16:18 ` Mark Tinguely
2012-08-23 5:01 ` [PATCH 021/102] xfs: fix force shutdown handling in xfs_end_io Dave Chinner
2012-08-23 5:01 ` [PATCH 022/102] xfs: fix allocation length overflow in xfs_bmapi_write() Dave Chinner
2012-08-23 5:01 ` [PATCH 023/102] xfs: fix the logspace waiting algorithm Dave Chinner
2012-08-23 5:01 ` [PATCH 024/102] xfs: untangle SYNC_WAIT and SYNC_TRYLOCK meanings for xfs_qm_dqflush Dave Chinner
2012-08-23 5:01 ` [PATCH 025/102] xfs: make sure to really flush all dquots in xfs_qm_quotacheck Dave Chinner
2012-08-23 5:01 ` [PATCH 026/102] xfs: simplify xfs_qm_detach_gdquots Dave Chinner
2012-08-23 5:01 ` [PATCH 027/102] xfs: mark the xfssyncd workqueue as non-reentrant Dave Chinner
2012-08-23 5:01 ` [PATCH 028/102] xfs: make i_flags an unsigned long Dave Chinner
2012-08-23 5:01 ` [PATCH 029/102] xfs: remove the i_size field in struct xfs_inode Dave Chinner
2012-08-23 5:01 ` [PATCH 030/102] xfs: remove the i_new_size " Dave Chinner
2012-08-23 5:01 ` [PATCH 031/102] xfs: always return with the iolock held from Dave Chinner
2012-08-23 5:01 ` [PATCH 032/102] xfs: cleanup xfs_file_aio_write Dave Chinner
2012-08-23 5:01 ` [PATCH 033/102] xfs: pass KM_SLEEP flag to kmem_realloc() in Dave Chinner
2012-08-23 5:01 ` [PATCH 034/102] xfs: show uuid when mount fails due to duplicate uuid Dave Chinner
2012-08-23 5:01 ` [PATCH 035/102] xfs: xfs_trans_add_item() - don't assign in ASSERT() when compare is intended Dave Chinner
2012-08-23 5:01 ` [PATCH 036/102] xfs: split tail_lsn assignments from log space wakeups Dave Chinner
2012-08-23 5:01 ` [PATCH 037/102] xfs: do exact log space wakeups in xlog_ungrant_log_space Dave Chinner
2012-08-23 5:01 ` [PATCH 038/102] xfs: remove xfs_trans_unlocked_item Dave Chinner
2012-08-23 5:01 ` [PATCH 039/102] xfs: cleanup xfs_log_space_wake Dave Chinner
2012-08-23 5:01 ` [PATCH 040/102] xfs: remove log space waitqueues Dave Chinner
2012-08-23 5:01 ` [PATCH 041/102] xfs: add the xlog_grant_head structure Dave Chinner
2012-08-23 5:02 ` [PATCH 042/102] xfs: add xlog_grant_head_init Dave Chinner
2012-08-23 5:02 ` [PATCH 043/102] xfs: add xlog_grant_head_wake_all Dave Chinner
2012-08-23 5:02 ` [PATCH 044/102] xfs: share code for grant head waiting Dave Chinner
2012-08-23 5:02 ` [PATCH 045/102] xfs: share code for grant head wakeups Dave Chinner
2012-08-23 5:02 ` [PATCH 046/102] xfs: share code for grant head availability checks Dave Chinner
2012-08-23 5:02 ` [PATCH 047/102] xfs: split and cleanup xfs_log_reserve Dave Chinner
2012-08-23 5:02 ` [PATCH 048/102] xfs: only take the ILOCK in xfs_reclaim_inode() Dave Chinner
2012-08-23 5:02 ` [PATCH 049/102] xfs: use per-filesystem I/O completion workqueues Dave Chinner
2012-08-23 5:02 ` [PATCH 050/102] xfs: do not require an ioend for new EOF calculation Dave Chinner
2012-08-23 5:02 ` [PATCH 054/102] xfs: make xfs_inode_item_size idempotent Dave Chinner
2012-08-23 5:02 ` [PATCH 055/102] xfs: split in-core and on-disk inode log item fields Dave Chinner
2012-08-23 5:02 ` [PATCH 056/102] xfs: reimplement fdatasync support Dave Chinner
2012-08-23 5:02 ` [PATCH 057/102] xfs: fallback to vmalloc for large buffers in xfs_attrmulti_attr_get Dave Chinner
2012-08-23 5:02 ` [PATCH 058/102] xfs: fallback to vmalloc for large buffers in xfs_getbmap Dave Chinner
2012-08-23 5:02 ` [PATCH 059/102] xfs: fix deadlock in xfs_rtfree_extent Dave Chinner
2012-08-23 5:02 ` [PATCH 060/102] xfs: Fix open flag handling in open_by_handle code Dave Chinner
2012-08-23 5:02 ` Dave Chinner [this message]
2012-08-23 5:02 ` [PATCH 062/102] xfs: trace xfs_name strings correctly Dave Chinner
2012-08-23 5:02 ` [PATCH 063/102] xfs: Account log unmount transaction correctly Dave Chinner
2012-08-23 5:02 ` [PATCH 064/102] xfs: fix fstrim offset calculations Dave Chinner
2012-08-23 5:02 ` [PATCH 065/102] xfs: add lots of attribute trace points Dave Chinner
2012-08-23 5:02 ` [PATCH 066/102] xfs: don't fill statvfs with project quota for a directory Dave Chinner
2012-08-23 5:02 ` [PATCH 067/102] xfs: Ensure inode reclaim can run during quotacheck Dave Chinner
2012-08-23 5:02 ` [PATCH 068/102] xfs: avoid taking the ilock unnessecarily in xfs_qm_dqattach Dave Chinner
2012-08-23 5:02 ` [PATCH 069/102] xfs: reduce ilock hold times in xfs_file_aio_write_checks Dave Chinner
2012-08-23 5:02 ` [PATCH 070/102] xfs: reduce ilock hold times in xfs_setattr_size Dave Chinner
2012-08-23 5:02 ` [PATCH 071/102] xfs: push the ilock into xfs_zero_eof Dave Chinner
2012-08-23 5:02 ` [PATCH 072/102] xfs: use shared ilock mode for direct IO writes by default Dave Chinner
2012-08-23 5:02 ` [PATCH 073/102] xfs: punch all delalloc blocks beyond EOF on write failure Dave Chinner
2012-08-23 5:02 ` [PATCH 074/102] xfs: using GFP_NOFS for blkdev_issue_flush Dave Chinner
2012-08-23 5:02 ` [PATCH 075/102] xfs: page type check in writeback only checks last buffer Dave Chinner
2012-08-23 5:02 ` [PATCH 076/102] xfs: punch new delalloc blocks out of failed writes inside Dave Chinner
2012-08-23 5:02 ` [PATCH 077/102] xfs: prevent needless mount warning causing test failures Dave Chinner
2012-08-23 5:02 ` [PATCH 078/102] xfs: don't assert on delalloc regions beyond EOF Dave Chinner
2012-08-23 5:02 ` [PATCH 079/102] xfs: limit specualtive delalloc to maxioffset Dave Chinner
2012-08-23 5:02 ` [PATCH 080/102] xfs: Use preallocation for inodes with extsz hints Dave Chinner
2012-08-23 5:02 ` [PATCH 081/102] xfs: fix buffer lookup race on allocation failure Dave Chinner
2012-08-23 5:02 ` [PATCH 082/102] xfs: check for buffer errors before waiting Dave Chinner
2012-08-23 5:02 ` [PATCH 083/102] xfs: fix incorrect b_offset initialisation Dave Chinner
2012-08-23 5:02 ` [PATCH 084/102] xfs: use kmem_zone_zalloc for buffers Dave Chinner
2012-08-23 5:02 ` [PATCH 085/102] xfs: use iolock on XFS_IOC_ALLOCSP calls Dave Chinner
2012-08-23 5:02 ` [PATCH 086/102] xfs: Properly exclude IO type flags from buffer flags Dave Chinner
2012-08-23 5:02 ` [PATCH 087/102] xfs: flush outstanding buffers on log mount failure Dave Chinner
2012-08-23 5:02 ` [PATCH 088/102] xfs: protect xfs_sync_worker with s_umount semaphore Dave Chinner
2012-08-23 5:02 ` [PATCH 089/102] xfs: fix memory reclaim deadlock on agi buffer Dave Chinner
2012-08-23 5:02 ` [PATCH 090/102] xfs: add trace points for log forces Dave Chinner
2012-08-23 5:02 ` [PATCH 091/102] xfs: switch to proper __bitwise type for KM_... flags Dave Chinner
2012-08-23 5:02 ` [PATCH 092/102] xfs: xfs_vm_writepage clear iomap_valid when Dave Chinner
2012-08-23 5:02 ` [PATCH 093/102] xfs: fix debug_object WARN at xfs_alloc_vextent() Dave Chinner
2012-08-23 5:02 ` [PATCH 094/102] xfs: m_maxioffset is redundant Dave Chinner
2012-08-23 5:02 ` [PATCH 095/102] xfs: make largest supported offset less shouty Dave Chinner
2012-08-23 5:02 ` [PATCH 096/102] xfs: kill copy and paste segment checks in xfs_file_aio_read Dave Chinner
2012-08-23 5:02 ` [PATCH 097/102] xfs: fix allocbt cursor leak in xfs_alloc_ag_vextent_near Dave Chinner
2012-08-23 5:02 ` [PATCH 098/102] xfs: shutdown xfs_sync_worker before the log Dave Chinner
2012-08-23 5:02 ` [PATCH 099/102] xfs: really fix the cursor leak in xfs_alloc_ag_vextent_near Dave Chinner
2012-08-23 5:02 ` [PATCH 100/102] xfs: don't defer metadata allocation to the workqueue Dave Chinner
2012-08-23 5:02 ` [PATCH 101/102] xfs: prevent recursion in xfs_buf_iorequest Dave Chinner
2012-08-23 5:03 ` [PATCH 102/102] xfs: handle EOF correctly in xfs_vm_writepage Dave Chinner
2012-08-23 21:54 ` [RFC, PATCH 0/102]: xfs: 3.0.x stable kernel update Dave Chinner
2012-08-23 22:14 ` Ben Myers
2012-08-23 22:23 ` Matthias Schniedermeyer
2012-09-01 23:10 ` Christoph Hellwig
2012-09-03 6:04 ` Dave Chinner
2012-09-04 21:13 ` Ben Myers
2012-09-05 4:24 ` Dave Chinner
2012-09-13 18:32 ` Mark Tinguely
2012-09-18 13:59 ` Mark Tinguely
2012-09-18 23:50 ` Dave Chinner
2012-09-19 13:14 ` Mark Tinguely
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1345698180-13612-62-git-send-email-david@fromorbit.com \
--to=david@fromorbit.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox