* [PATCH 0/5] log all file size updates
@ 2011-11-08 8:56 Christoph Hellwig
2011-11-08 8:56 ` [PATCH 1/5] fix: force shutdown handling in xfs_end_io Christoph Hellwig
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Christoph Hellwig @ 2011-11-08 8:56 UTC (permalink / raw)
To: xfs
With more reports showing up that the VFS writeback code is not able
to write back our size updates dirtied from the I/O completion handler
in reasonable time I think it's time to move to logging all file size
updates ASAP, that is for 3.2 and maybe after a reasonable testing
period even -stable.
This series has been sent out a few times, and I've been doing QA on
it for weeks. Note that I haven't implemented the log space reservations
from ->writepage - see the actual patch for the rationale.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 14+ messages in thread* [PATCH 1/5] fix: force shutdown handling in xfs_end_io 2011-11-08 8:56 [PATCH 0/5] log all file size updates Christoph Hellwig @ 2011-11-08 8:56 ` Christoph Hellwig 2011-11-08 8:56 ` [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues Christoph Hellwig 2011-11-08 8:56 ` [PATCH 3/5] xfs: do not require an ioend for new EOF calculation Christoph Hellwig 2 siblings, 0 replies; 14+ messages in thread From: Christoph Hellwig @ 2011-11-08 8:56 UTC (permalink / raw) To: xfs [-- Attachment #1: xfs-fix-end-io-error-handling --] [-- Type: text/plain, Size: 741 bytes --] Only ioend->io_error gets propagated back to e.g. AIO completions. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_aops.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6/fs/xfs/xfs_aops.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_aops.c 2011-11-02 09:00:45.569119028 +0100 +++ linux-2.6/fs/xfs/xfs_aops.c 2011-11-07 12:19:22.282316090 +0100 @@ -189,7 +189,7 @@ xfs_end_io( int error = 0; if (XFS_FORCED_SHUTDOWN(ip->i_mount)) { - error = -EIO; + ioend->io_error = -EIO; goto done; } if (ioend->io_error) _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues 2011-11-08 8:56 [PATCH 0/5] log all file size updates Christoph Hellwig 2011-11-08 8:56 ` [PATCH 1/5] fix: force shutdown handling in xfs_end_io Christoph Hellwig @ 2011-11-08 8:56 ` Christoph Hellwig 2011-11-08 23:11 ` Dave Chinner 2011-11-08 8:56 ` [PATCH 3/5] xfs: do not require an ioend for new EOF calculation Christoph Hellwig 2 siblings, 1 reply; 14+ messages in thread From: Christoph Hellwig @ 2011-11-08 8:56 UTC (permalink / raw) To: xfs [-- Attachment #1: xfs-split-workqueues --] [-- Type: text/plain, Size: 7659 bytes --] The new concurrency managed workqueues are cheap enough that we can create per-filesystem instead of global workqueues. This allows us to remove the trylock or defer scheme on the ilock, which has the potential of delaying size updates, and is not helpful once we start to log the inode size. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_aops.c | 39 ++++++++++----------------------------- fs/xfs/xfs_aops.h | 2 -- fs/xfs/xfs_buf.c | 17 ----------------- fs/xfs/xfs_mount.h | 3 +++ fs/xfs/xfs_super.c | 43 ++++++++++++++++++++++++++++++++++++++++++- 5 files changed, 55 insertions(+), 49 deletions(-) Index: linux-2.6/fs/xfs/xfs_aops.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_aops.c 2011-11-08 08:08:11.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_aops.c 2011-11-08 08:09:29.458887066 +0100 @@ -131,21 +131,15 @@ static inline bool xfs_ioend_is_append(s * will be the intended file size until i_size is updated. If this write does * not extend all the way to the valid file size then restrict this update to * the end of the write. - * - * This function does not block as blocking on the inode lock in IO completion - * can lead to IO completion order dependency deadlocks.. If it can't get the - * inode ilock it will return EAGAIN. Callers must handle this. */ -STATIC int +STATIC void xfs_setfilesize( - xfs_ioend_t *ioend) + struct xfs_ioend *ioend) { - xfs_inode_t *ip = XFS_I(ioend->io_inode); + struct xfs_inode *ip = XFS_I(ioend->io_inode); xfs_fsize_t isize; - if (!xfs_ilock_nowait(ip, XFS_ILOCK_EXCL)) - return EAGAIN; - + xfs_ilock(ip, XFS_ILOCK_EXCL); isize = xfs_ioend_new_eof(ioend); if (isize) { trace_xfs_setfilesize(ip, ioend->io_offset, ioend->io_size); @@ -154,7 +148,6 @@ xfs_setfilesize( } xfs_iunlock(ip, XFS_ILOCK_EXCL); - return 0; } /* @@ -168,10 +161,12 @@ xfs_finish_ioend( struct xfs_ioend *ioend) { if (atomic_dec_and_test(&ioend->io_remaining)) { + struct xfs_mount *mp = XFS_I(ioend->io_inode)->i_mount; + if (ioend->io_type == IO_UNWRITTEN) - queue_work(xfsconvertd_workqueue, &ioend->io_work); + queue_work(mp->m_unwritten_workqueue, &ioend->io_work); else if (xfs_ioend_is_append(ioend)) - queue_work(xfsdatad_workqueue, &ioend->io_work); + queue_work(mp->m_data_workqueue, &ioend->io_work); else xfs_destroy_ioend(ioend); } @@ -212,23 +207,9 @@ xfs_end_io( * We might have to update the on-disk file size after extending * writes. */ - error = xfs_setfilesize(ioend); - ASSERT(!error || error == EAGAIN); - + xfs_setfilesize(ioend); done: - /* - * If we didn't complete processing of the ioend, requeue it to the - * tail of the workqueue for another attempt later. Otherwise destroy - * it. - */ - if (error == EAGAIN) { - atomic_inc(&ioend->io_remaining); - xfs_finish_ioend(ioend); - /* ensure we don't spin on blocked ioends */ - delay(1); - } else { - xfs_destroy_ioend(ioend); - } + xfs_destroy_ioend(ioend); } /* Index: linux-2.6/fs/xfs/xfs_aops.h =================================================================== --- linux-2.6.orig/fs/xfs/xfs_aops.h 2011-11-08 08:02:50.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_aops.h 2011-11-08 08:08:55.915886285 +0100 @@ -18,8 +18,6 @@ #ifndef __XFS_AOPS_H__ #define __XFS_AOPS_H__ -extern struct workqueue_struct *xfsdatad_workqueue; -extern struct workqueue_struct *xfsconvertd_workqueue; extern mempool_t *xfs_ioend_pool; /* Index: linux-2.6/fs/xfs/xfs_buf.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_buf.c 2011-11-08 08:02:50.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_buf.c 2011-11-08 08:08:55.919886682 +0100 @@ -45,8 +45,6 @@ static kmem_zone_t *xfs_buf_zone; STATIC int xfsbufd(void *); static struct workqueue_struct *xfslogd_workqueue; -struct workqueue_struct *xfsdatad_workqueue; -struct workqueue_struct *xfsconvertd_workqueue; #ifdef XFS_BUF_LOCK_TRACKING # define XB_SET_OWNER(bp) ((bp)->b_last_holder = current->pid) @@ -1797,21 +1795,8 @@ xfs_buf_init(void) if (!xfslogd_workqueue) goto out_free_buf_zone; - xfsdatad_workqueue = alloc_workqueue("xfsdatad", WQ_MEM_RECLAIM, 1); - if (!xfsdatad_workqueue) - goto out_destroy_xfslogd_workqueue; - - xfsconvertd_workqueue = alloc_workqueue("xfsconvertd", - WQ_MEM_RECLAIM, 1); - if (!xfsconvertd_workqueue) - goto out_destroy_xfsdatad_workqueue; - return 0; - out_destroy_xfsdatad_workqueue: - destroy_workqueue(xfsdatad_workqueue); - out_destroy_xfslogd_workqueue: - destroy_workqueue(xfslogd_workqueue); out_free_buf_zone: kmem_zone_destroy(xfs_buf_zone); out: @@ -1821,8 +1806,6 @@ xfs_buf_init(void) void xfs_buf_terminate(void) { - destroy_workqueue(xfsconvertd_workqueue); - destroy_workqueue(xfsdatad_workqueue); destroy_workqueue(xfslogd_workqueue); kmem_zone_destroy(xfs_buf_zone); } Index: linux-2.6/fs/xfs/xfs_super.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_super.c 2011-11-08 08:02:50.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_super.c 2011-11-08 08:08:55.927886477 +0100 @@ -769,6 +769,40 @@ xfs_setup_devices( return 0; } +STATIC int +xfs_init_mount_workqueues( + struct xfs_mount *mp) +{ +#define XFS_WQ_NAME_LEN 512 + char name[XFS_WQ_NAME_LEN]; + + snprintf(name, XFS_WQ_NAME_LEN, "xfs-data/%s", mp->m_fsname); + mp->m_data_workqueue = alloc_workqueue(name, WQ_MEM_RECLAIM, 1); + if (!mp->m_data_workqueue) + goto out; + + snprintf(name, XFS_WQ_NAME_LEN, "xfs-conv/%s", mp->m_fsname); + mp->m_unwritten_workqueue = alloc_workqueue(name, WQ_MEM_RECLAIM, 1); + if (!mp->m_unwritten_workqueue) + goto out_destroy_data_iodone_queue; + + return 0; + +out_destroy_data_iodone_queue: + destroy_workqueue(mp->m_data_workqueue); +out: + return -ENOMEM; +#undef XFS_WQ_NAME_LEN +} + +STATIC void +xfs_destroy_mount_workqueues( + struct xfs_mount *mp) +{ + destroy_workqueue(mp->m_data_workqueue); + destroy_workqueue(mp->m_unwritten_workqueue); +} + /* Catch misguided souls that try to use this interface on XFS */ STATIC struct inode * xfs_fs_alloc_inode( @@ -1020,6 +1054,7 @@ xfs_fs_put_super( xfs_unmountfs(mp); xfs_freesb(mp); xfs_icsb_destroy_counters(mp); + xfs_destroy_mount_workqueues(mp); xfs_close_devices(mp); xfs_free_fsname(mp); kfree(mp); @@ -1353,10 +1388,14 @@ xfs_fs_fill_super( if (error) goto out_free_fsname; - error = xfs_icsb_init_counters(mp); + error = xfs_init_mount_workqueues(mp); if (error) goto out_close_devices; + error = xfs_icsb_init_counters(mp); + if (error) + goto out_destroy_workqueues; + error = xfs_readsb(mp, flags); if (error) goto out_destroy_counters; @@ -1419,6 +1458,8 @@ xfs_fs_fill_super( xfs_freesb(mp); out_destroy_counters: xfs_icsb_destroy_counters(mp); +out_destroy_workqueues: + xfs_destroy_mount_workqueues(mp); out_close_devices: xfs_close_devices(mp); out_free_fsname: Index: linux-2.6/fs/xfs/xfs_mount.h =================================================================== --- linux-2.6.orig/fs/xfs/xfs_mount.h 2011-11-08 08:02:50.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_mount.h 2011-11-08 08:08:55.931904609 +0100 @@ -211,6 +211,9 @@ typedef struct xfs_mount { struct shrinker m_inode_shrink; /* inode reclaim shrinker */ int64_t m_low_space[XFS_LOWSP_MAX]; /* low free space thresholds */ + + struct workqueue_struct *m_data_workqueue; + struct workqueue_struct *m_unwritten_workqueue; } xfs_mount_t; /* _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues 2011-11-08 8:56 ` [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues Christoph Hellwig @ 2011-11-08 23:11 ` Dave Chinner 2011-11-09 7:58 ` Christoph Hellwig 0 siblings, 1 reply; 14+ messages in thread From: Dave Chinner @ 2011-11-08 23:11 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs On Tue, Nov 08, 2011 at 03:56:16AM -0500, Christoph Hellwig wrote: > The new concurrency managed workqueues are cheap enough that we can create > per-filesystem instead of global workqueues. This allows us to remove the > trylock or defer scheme on the ilock, which has the potential of delaying > size updates, and is not helpful once we start to log the inode size. So why does the per-mount workqueues allow removal of the trylock? IOWs, it might be worthwhile pointing to the commit (77d7a0c "xfs: Non-blocking inode locking in IO completion") to indicate that the functionality being removed was introduced to avoid IO completion deadlocks between dependent filesystems (e.g. XFS on loop device on XFS) and that moving to per-mount completion queues removes that dependency.... Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues 2011-11-08 23:11 ` Dave Chinner @ 2011-11-09 7:58 ` Christoph Hellwig 2011-11-10 17:42 ` Ben Myers 0 siblings, 1 reply; 14+ messages in thread From: Christoph Hellwig @ 2011-11-09 7:58 UTC (permalink / raw) To: Dave Chinner; +Cc: Christoph Hellwig, xfs On Wed, Nov 09, 2011 at 10:11:18AM +1100, Dave Chinner wrote: > On Tue, Nov 08, 2011 at 03:56:16AM -0500, Christoph Hellwig wrote: > > The new concurrency managed workqueues are cheap enough that we can create > > per-filesystem instead of global workqueues. This allows us to remove the > > trylock or defer scheme on the ilock, which has the potential of delaying > > size updates, and is not helpful once we start to log the inode size. > > So why does the per-mount workqueues allow removal of the trylock? > > IOWs, it might be worthwhile pointing to the commit (77d7a0c "xfs: > Non-blocking inode locking in IO completion") to indicate that the > functionality being removed was introduced to avoid IO completion > deadlocks between dependent filesystems (e.g. XFS on loop device on > XFS) and that moving to per-mount completion queues removes that > dependency.... Done: -- From: Christoph Hellwig <hch@lst.de> Subject: xfs: use per-filesystem I/O completion workqueues commit 77d7a0c "xfs: Non-blocking inode locking in IO completion" introduced a trylocked and defer scheme in xfs_setfilesize to avoid deadlocks when on XFS filesystem is used ontop of another using the loop device, and we fsync in the loop filesystem. Now that we have the cheap enough concurrency managed workqueues, we can create per-filesystem instead of global workqueues and remove this scheme again, given that it has the potential of delaying size updates and is not helpful once we start to log the inode size. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_aops.c | 39 ++++++++++----------------------------- fs/xfs/xfs_aops.h | 2 -- fs/xfs/xfs_buf.c | 17 ----------------- fs/xfs/xfs_mount.h | 3 +++ fs/xfs/xfs_super.c | 43 ++++++++++++++++++++++++++++++++++++++++++- 5 files changed, 55 insertions(+), 49 deletions(-) Index: linux-2.6/fs/xfs/xfs_aops.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_aops.c 2011-11-08 08:08:11.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_aops.c 2011-11-08 08:09:29.458887066 +0100 @@ -131,21 +131,15 @@ static inline bool xfs_ioend_is_append(s * will be the intended file size until i_size is updated. If this write does * not extend all the way to the valid file size then restrict this update to * the end of the write. - * - * This function does not block as blocking on the inode lock in IO completion - * can lead to IO completion order dependency deadlocks.. If it can't get the - * inode ilock it will return EAGAIN. Callers must handle this. */ -STATIC int +STATIC void xfs_setfilesize( - xfs_ioend_t *ioend) + struct xfs_ioend *ioend) { - xfs_inode_t *ip = XFS_I(ioend->io_inode); + struct xfs_inode *ip = XFS_I(ioend->io_inode); xfs_fsize_t isize; - if (!xfs_ilock_nowait(ip, XFS_ILOCK_EXCL)) - return EAGAIN; - + xfs_ilock(ip, XFS_ILOCK_EXCL); isize = xfs_ioend_new_eof(ioend); if (isize) { trace_xfs_setfilesize(ip, ioend->io_offset, ioend->io_size); @@ -154,7 +148,6 @@ xfs_setfilesize( } xfs_iunlock(ip, XFS_ILOCK_EXCL); - return 0; } /* @@ -168,10 +161,12 @@ xfs_finish_ioend( struct xfs_ioend *ioend) { if (atomic_dec_and_test(&ioend->io_remaining)) { + struct xfs_mount *mp = XFS_I(ioend->io_inode)->i_mount; + if (ioend->io_type == IO_UNWRITTEN) - queue_work(xfsconvertd_workqueue, &ioend->io_work); + queue_work(mp->m_unwritten_workqueue, &ioend->io_work); else if (xfs_ioend_is_append(ioend)) - queue_work(xfsdatad_workqueue, &ioend->io_work); + queue_work(mp->m_data_workqueue, &ioend->io_work); else xfs_destroy_ioend(ioend); } @@ -212,23 +207,9 @@ xfs_end_io( * We might have to update the on-disk file size after extending * writes. */ - error = xfs_setfilesize(ioend); - ASSERT(!error || error == EAGAIN); - + xfs_setfilesize(ioend); done: - /* - * If we didn't complete processing of the ioend, requeue it to the - * tail of the workqueue for another attempt later. Otherwise destroy - * it. - */ - if (error == EAGAIN) { - atomic_inc(&ioend->io_remaining); - xfs_finish_ioend(ioend); - /* ensure we don't spin on blocked ioends */ - delay(1); - } else { - xfs_destroy_ioend(ioend); - } + xfs_destroy_ioend(ioend); } /* Index: linux-2.6/fs/xfs/xfs_aops.h =================================================================== --- linux-2.6.orig/fs/xfs/xfs_aops.h 2011-11-08 08:02:50.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_aops.h 2011-11-08 08:08:55.915886285 +0100 @@ -18,8 +18,6 @@ #ifndef __XFS_AOPS_H__ #define __XFS_AOPS_H__ -extern struct workqueue_struct *xfsdatad_workqueue; -extern struct workqueue_struct *xfsconvertd_workqueue; extern mempool_t *xfs_ioend_pool; /* Index: linux-2.6/fs/xfs/xfs_buf.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_buf.c 2011-11-08 08:02:50.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_buf.c 2011-11-08 08:08:55.919886682 +0100 @@ -45,8 +45,6 @@ static kmem_zone_t *xfs_buf_zone; STATIC int xfsbufd(void *); static struct workqueue_struct *xfslogd_workqueue; -struct workqueue_struct *xfsdatad_workqueue; -struct workqueue_struct *xfsconvertd_workqueue; #ifdef XFS_BUF_LOCK_TRACKING # define XB_SET_OWNER(bp) ((bp)->b_last_holder = current->pid) @@ -1797,21 +1795,8 @@ xfs_buf_init(void) if (!xfslogd_workqueue) goto out_free_buf_zone; - xfsdatad_workqueue = alloc_workqueue("xfsdatad", WQ_MEM_RECLAIM, 1); - if (!xfsdatad_workqueue) - goto out_destroy_xfslogd_workqueue; - - xfsconvertd_workqueue = alloc_workqueue("xfsconvertd", - WQ_MEM_RECLAIM, 1); - if (!xfsconvertd_workqueue) - goto out_destroy_xfsdatad_workqueue; - return 0; - out_destroy_xfsdatad_workqueue: - destroy_workqueue(xfsdatad_workqueue); - out_destroy_xfslogd_workqueue: - destroy_workqueue(xfslogd_workqueue); out_free_buf_zone: kmem_zone_destroy(xfs_buf_zone); out: @@ -1821,8 +1806,6 @@ xfs_buf_init(void) void xfs_buf_terminate(void) { - destroy_workqueue(xfsconvertd_workqueue); - destroy_workqueue(xfsdatad_workqueue); destroy_workqueue(xfslogd_workqueue); kmem_zone_destroy(xfs_buf_zone); } Index: linux-2.6/fs/xfs/xfs_super.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_super.c 2011-11-08 08:02:50.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_super.c 2011-11-08 08:08:55.927886477 +0100 @@ -769,6 +769,40 @@ xfs_setup_devices( return 0; } +STATIC int +xfs_init_mount_workqueues( + struct xfs_mount *mp) +{ +#define XFS_WQ_NAME_LEN 512 + char name[XFS_WQ_NAME_LEN]; + + snprintf(name, XFS_WQ_NAME_LEN, "xfs-data/%s", mp->m_fsname); + mp->m_data_workqueue = alloc_workqueue(name, WQ_MEM_RECLAIM, 1); + if (!mp->m_data_workqueue) + goto out; + + snprintf(name, XFS_WQ_NAME_LEN, "xfs-conv/%s", mp->m_fsname); + mp->m_unwritten_workqueue = alloc_workqueue(name, WQ_MEM_RECLAIM, 1); + if (!mp->m_unwritten_workqueue) + goto out_destroy_data_iodone_queue; + + return 0; + +out_destroy_data_iodone_queue: + destroy_workqueue(mp->m_data_workqueue); +out: + return -ENOMEM; +#undef XFS_WQ_NAME_LEN +} + +STATIC void +xfs_destroy_mount_workqueues( + struct xfs_mount *mp) +{ + destroy_workqueue(mp->m_data_workqueue); + destroy_workqueue(mp->m_unwritten_workqueue); +} + /* Catch misguided souls that try to use this interface on XFS */ STATIC struct inode * xfs_fs_alloc_inode( @@ -1020,6 +1054,7 @@ xfs_fs_put_super( xfs_unmountfs(mp); xfs_freesb(mp); xfs_icsb_destroy_counters(mp); + xfs_destroy_mount_workqueues(mp); xfs_close_devices(mp); xfs_free_fsname(mp); kfree(mp); @@ -1353,10 +1388,14 @@ xfs_fs_fill_super( if (error) goto out_free_fsname; - error = xfs_icsb_init_counters(mp); + error = xfs_init_mount_workqueues(mp); if (error) goto out_close_devices; + error = xfs_icsb_init_counters(mp); + if (error) + goto out_destroy_workqueues; + error = xfs_readsb(mp, flags); if (error) goto out_destroy_counters; @@ -1419,6 +1458,8 @@ xfs_fs_fill_super( xfs_freesb(mp); out_destroy_counters: xfs_icsb_destroy_counters(mp); +out_destroy_workqueues: + xfs_destroy_mount_workqueues(mp); out_close_devices: xfs_close_devices(mp); out_free_fsname: Index: linux-2.6/fs/xfs/xfs_mount.h =================================================================== --- linux-2.6.orig/fs/xfs/xfs_mount.h 2011-11-08 08:02:50.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_mount.h 2011-11-08 08:08:55.931904609 +0100 @@ -211,6 +211,9 @@ typedef struct xfs_mount { struct shrinker m_inode_shrink; /* inode reclaim shrinker */ int64_t m_low_space[XFS_LOWSP_MAX]; /* low free space thresholds */ + + struct workqueue_struct *m_data_workqueue; + struct workqueue_struct *m_unwritten_workqueue; } xfs_mount_t; /* _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues 2011-11-09 7:58 ` Christoph Hellwig @ 2011-11-10 17:42 ` Ben Myers 2011-11-14 10:34 ` Christoph Hellwig 0 siblings, 1 reply; 14+ messages in thread From: Ben Myers @ 2011-11-10 17:42 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs Hey Christoph, On Wed, Nov 09, 2011 at 02:58:47AM -0500, Christoph Hellwig wrote: > +STATIC int > +xfs_init_mount_workqueues( > + struct xfs_mount *mp) > +{ > +#define XFS_WQ_NAME_LEN 512 > + char name[XFS_WQ_NAME_LEN]; > + > + snprintf(name, XFS_WQ_NAME_LEN, "xfs-data/%s", mp->m_fsname); > + mp->m_data_workqueue = alloc_workqueue(name, WQ_MEM_RECLAIM, 1); > + if (!mp->m_data_workqueue) > + goto out; Looks to me like alloc_workqueue holds on to that name pointer in wq->name... won't overwriting the name below be a problem? > + snprintf(name, XFS_WQ_NAME_LEN, "xfs-conv/%s", mp->m_fsname); > + mp->m_unwritten_workqueue = alloc_workqueue(name, WQ_MEM_RECLAIM, 1); > + if (!mp->m_unwritten_workqueue) > + goto out_destroy_data_iodone_queue; -Ben _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues 2011-11-10 17:42 ` Ben Myers @ 2011-11-14 10:34 ` Christoph Hellwig 2011-11-14 18:13 ` Tejun Heo 0 siblings, 1 reply; 14+ messages in thread From: Christoph Hellwig @ 2011-11-14 10:34 UTC (permalink / raw) To: Ben Myers, Tejun Heo; +Cc: linux-kernel, xfs On Thu, Nov 10, 2011 at 11:42:42AM -0600, Ben Myers wrote: > > +STATIC int > > +xfs_init_mount_workqueues( > > + struct xfs_mount *mp) > > +{ > > +#define XFS_WQ_NAME_LEN 512 > > + char name[XFS_WQ_NAME_LEN]; > > + > > + snprintf(name, XFS_WQ_NAME_LEN, "xfs-data/%s", mp->m_fsname); > > + mp->m_data_workqueue = alloc_workqueue(name, WQ_MEM_RECLAIM, 1); > > + if (!mp->m_data_workqueue) > > + goto out; > > Looks to me like alloc_workqueue holds on to that name pointer in > wq->name... won't overwriting the name below be a problem? It applies deep magic to make sure a pattern like mine is fine for the lockdep lock name, but just uses it directly for the workqueue name. Oddly enough the names seem to display correctly on my test systems anyway. Tejun, any chance to change alloc_workqueue to use the string pasting trick also for the normal workqueue name, or even better add a varargs version of alloc_workqueue? _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues 2011-11-14 10:34 ` Christoph Hellwig @ 2011-11-14 18:13 ` Tejun Heo 0 siblings, 0 replies; 14+ messages in thread From: Tejun Heo @ 2011-11-14 18:13 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Ben Myers, linux-kernel, xfs Hello, Christoph. On Mon, Nov 14, 2011 at 05:34:10AM -0500, Christoph Hellwig wrote: > On Thu, Nov 10, 2011 at 11:42:42AM -0600, Ben Myers wrote: > > > +STATIC int > > > +xfs_init_mount_workqueues( > > > + struct xfs_mount *mp) > > > +{ > > > +#define XFS_WQ_NAME_LEN 512 > > > + char name[XFS_WQ_NAME_LEN]; > > > + > > > + snprintf(name, XFS_WQ_NAME_LEN, "xfs-data/%s", mp->m_fsname); > > > + mp->m_data_workqueue = alloc_workqueue(name, WQ_MEM_RECLAIM, 1); > > > + if (!mp->m_data_workqueue) > > > + goto out; > > > > Looks to me like alloc_workqueue holds on to that name pointer in > > wq->name... won't overwriting the name below be a problem? > > It applies deep magic to make sure a pattern like mine is fine for > the lockdep lock name, but just uses it directly for the workqueue name. For lockdep lock name, it isn't about reusing but ensuring that same lockdep key doesn't end up with different names which lockdep doesn't allow. But yeah, given that workqueue is dynamically allocated, it's a bit silly to require the name to be constant. > Oddly enough the names seem to display correctly on my test systems > anyway. > > Tejun, any chance to change alloc_workqueue to use the string pasting > trick also for the normal workqueue name, or even better add a varargs > version of alloc_workqueue? Yeah, will whip something up. Thanks. -- tejun _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 3/5] xfs: do not require an ioend for new EOF calculation 2011-11-08 8:56 [PATCH 0/5] log all file size updates Christoph Hellwig 2011-11-08 8:56 ` [PATCH 1/5] fix: force shutdown handling in xfs_end_io Christoph Hellwig 2011-11-08 8:56 ` [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues Christoph Hellwig @ 2011-11-08 8:56 ` Christoph Hellwig 2 siblings, 0 replies; 14+ messages in thread From: Christoph Hellwig @ 2011-11-08 8:56 UTC (permalink / raw) To: xfs [-- Attachment #1: xfs-simplify-eof-calculation --] [-- Type: text/plain, Size: 3171 bytes --] Replace xfs_ioend_new_eof with a new inline xfs_new_eof helper that doesn't require and ioend, and is available also outside of xfs_aops.c. Also make the code a bit more clear by using a normal if statement instead of a slightly misleading MIN(). Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_aops.c | 26 +++++--------------------- fs/xfs/xfs_inode.h | 14 ++++++++++++++ 2 files changed, 19 insertions(+), 21 deletions(-) Index: linux-2.6/fs/xfs/xfs_aops.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_aops.c 2011-11-08 08:11:44.891887054 +0100 +++ linux-2.6/fs/xfs/xfs_aops.c 2011-11-08 08:12:31.586400976 +0100 @@ -99,24 +99,6 @@ xfs_destroy_ioend( } /* - * If the end of the current ioend is beyond the current EOF, - * return the new EOF value, otherwise zero. - */ -STATIC xfs_fsize_t -xfs_ioend_new_eof( - xfs_ioend_t *ioend) -{ - xfs_inode_t *ip = XFS_I(ioend->io_inode); - xfs_fsize_t isize; - xfs_fsize_t bsize; - - bsize = ioend->io_offset + ioend->io_size; - isize = MAX(ip->i_size, ip->i_new_size); - isize = MIN(isize, bsize); - return isize > ip->i_d.di_size ? isize : 0; -} - -/* * Fast and loose check if this write could update the on-disk inode size. */ static inline bool xfs_ioend_is_append(struct xfs_ioend *ioend) @@ -140,7 +122,7 @@ xfs_setfilesize( xfs_fsize_t isize; xfs_ilock(ip, XFS_ILOCK_EXCL); - isize = xfs_ioend_new_eof(ioend); + isize = xfs_new_eof(ip, ioend->io_offset + ioend->io_size); if (isize) { trace_xfs_setfilesize(ip, ioend->io_offset, ioend->io_size); ip->i_d.di_size = isize; @@ -362,6 +344,8 @@ xfs_submit_ioend_bio( xfs_ioend_t *ioend, struct bio *bio) { + struct xfs_inode *ip = XFS_I(ioend->io_inode); + atomic_inc(&ioend->io_remaining); bio->bi_private = ioend; bio->bi_end_io = xfs_end_bio; @@ -370,8 +354,8 @@ xfs_submit_ioend_bio( * If the I/O is beyond EOF we mark the inode dirty immediately * but don't update the inode size until I/O completion. */ - if (xfs_ioend_new_eof(ioend)) - xfs_mark_inode_dirty(XFS_I(ioend->io_inode)); + if (xfs_new_eof(ip, ioend->io_offset + ioend->io_size)) + xfs_mark_inode_dirty(ip); submit_bio(wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE, bio); } Index: linux-2.6/fs/xfs/xfs_inode.h =================================================================== --- linux-2.6.orig/fs/xfs/xfs_inode.h 2011-11-08 08:02:50.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_inode.h 2011-11-08 08:13:01.290386996 +0100 @@ -278,6 +278,20 @@ static inline struct inode *VFS_I(struct } /* + * If this I/O goes past the on-disk inode size update it unless it would + * be past the current in-core or write in-progress inode size. + */ +static inline xfs_fsize_t +xfs_new_eof(struct xfs_inode *ip, xfs_fsize_t new_size) +{ + xfs_fsize_t i_size = max(ip->i_size, ip->i_new_size); + + if (new_size > i_size) + new_size = i_size; + return new_size > ip->i_d.di_size ? new_size : 0; +} + +/* * i_flags helper functions */ static inline void _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 0/5] for-3.2 queue @ 2011-11-15 20:14 Christoph Hellwig 2011-11-15 20:14 ` [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues Christoph Hellwig 0 siblings, 1 reply; 14+ messages in thread From: Christoph Hellwig @ 2011-11-15 20:14 UTC (permalink / raw) To: xfs With more reports showing up that the VFS writeback code is not able to write back our size updates dirtied from the I/O completion handler in reasonable time I think it's time to move to logging all file size updates ASAP, that is for 3.2 and maybe after a reasonable testing period even -stable. This series has been sent out a few times, and I've been doing QA on it for weeks. Note that I haven't implemented the log space reservations from ->writepage - see the actual patch for the rationale. Also throw in the fork offset fix as it's an user-triggerable oops. _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues 2011-11-15 20:14 [PATCH 0/5] for-3.2 queue Christoph Hellwig @ 2011-11-15 20:14 ` Christoph Hellwig 2011-11-16 19:01 ` Ben Myers 2011-11-16 23:24 ` Dave Chinner 0 siblings, 2 replies; 14+ messages in thread From: Christoph Hellwig @ 2011-11-15 20:14 UTC (permalink / raw) To: xfs [-- Attachment #1: xfs-split-workqueues --] [-- Type: text/plain, Size: 8068 bytes --] commit 77d7a0c "xfs: Non-blocking inode locking in IO completion" introduced a trylocked and defer scheme in xfs_setfilesize to avoid deadlocks when on XFS filesystem is used ontop of another using the loop device, and we fsync in the loop filesystem. Now that we have the cheap enough concurrency managed workqueues, we can create per-filesystem instead of global workqueues and remove this scheme again, given that it has the potential of delaying size updates and is not helpful once we start to log the inode size. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_aops.c | 39 ++++++++++----------------------------- fs/xfs/xfs_aops.h | 2 -- fs/xfs/xfs_buf.c | 17 ----------------- fs/xfs/xfs_mount.h | 3 +++ fs/xfs/xfs_super.c | 43 ++++++++++++++++++++++++++++++++++++++++++- 5 files changed, 55 insertions(+), 49 deletions(-) Index: linux-2.6/fs/xfs/xfs_aops.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_aops.c 2011-11-12 15:37:28.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_aops.c 2011-11-15 09:14:16.856650210 +0100 @@ -131,21 +131,15 @@ static inline bool xfs_ioend_is_append(s * will be the intended file size until i_size is updated. If this write does * not extend all the way to the valid file size then restrict this update to * the end of the write. - * - * This function does not block as blocking on the inode lock in IO completion - * can lead to IO completion order dependency deadlocks.. If it can't get the - * inode ilock it will return EAGAIN. Callers must handle this. */ -STATIC int +STATIC void xfs_setfilesize( - xfs_ioend_t *ioend) + struct xfs_ioend *ioend) { - xfs_inode_t *ip = XFS_I(ioend->io_inode); + struct xfs_inode *ip = XFS_I(ioend->io_inode); xfs_fsize_t isize; - if (!xfs_ilock_nowait(ip, XFS_ILOCK_EXCL)) - return EAGAIN; - + xfs_ilock(ip, XFS_ILOCK_EXCL); isize = xfs_ioend_new_eof(ioend); if (isize) { trace_xfs_setfilesize(ip, ioend->io_offset, ioend->io_size); @@ -154,7 +148,6 @@ xfs_setfilesize( } xfs_iunlock(ip, XFS_ILOCK_EXCL); - return 0; } /* @@ -168,10 +161,12 @@ xfs_finish_ioend( struct xfs_ioend *ioend) { if (atomic_dec_and_test(&ioend->io_remaining)) { + struct xfs_mount *mp = XFS_I(ioend->io_inode)->i_mount; + if (ioend->io_type == IO_UNWRITTEN) - queue_work(xfsconvertd_workqueue, &ioend->io_work); + queue_work(mp->m_unwritten_workqueue, &ioend->io_work); else if (xfs_ioend_is_append(ioend)) - queue_work(xfsdatad_workqueue, &ioend->io_work); + queue_work(mp->m_data_workqueue, &ioend->io_work); else xfs_destroy_ioend(ioend); } @@ -212,23 +207,9 @@ xfs_end_io( * We might have to update the on-disk file size after extending * writes. */ - error = xfs_setfilesize(ioend); - ASSERT(!error || error == EAGAIN); - + xfs_setfilesize(ioend); done: - /* - * If we didn't complete processing of the ioend, requeue it to the - * tail of the workqueue for another attempt later. Otherwise destroy - * it. - */ - if (error == EAGAIN) { - atomic_inc(&ioend->io_remaining); - xfs_finish_ioend(ioend); - /* ensure we don't spin on blocked ioends */ - delay(1); - } else { - xfs_destroy_ioend(ioend); - } + xfs_destroy_ioend(ioend); } /* Index: linux-2.6/fs/xfs/xfs_aops.h =================================================================== --- linux-2.6.orig/fs/xfs/xfs_aops.h 2011-11-10 16:50:40.751797337 +0100 +++ linux-2.6/fs/xfs/xfs_aops.h 2011-11-15 09:14:16.856650210 +0100 @@ -18,8 +18,6 @@ #ifndef __XFS_AOPS_H__ #define __XFS_AOPS_H__ -extern struct workqueue_struct *xfsdatad_workqueue; -extern struct workqueue_struct *xfsconvertd_workqueue; extern mempool_t *xfs_ioend_pool; /* Index: linux-2.6/fs/xfs/xfs_buf.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_buf.c 2011-11-10 16:50:40.763795564 +0100 +++ linux-2.6/fs/xfs/xfs_buf.c 2011-11-15 09:14:16.859983547 +0100 @@ -45,8 +45,6 @@ static kmem_zone_t *xfs_buf_zone; STATIC int xfsbufd(void *); static struct workqueue_struct *xfslogd_workqueue; -struct workqueue_struct *xfsdatad_workqueue; -struct workqueue_struct *xfsconvertd_workqueue; #ifdef XFS_BUF_LOCK_TRACKING # define XB_SET_OWNER(bp) ((bp)->b_last_holder = current->pid) @@ -1797,21 +1795,8 @@ xfs_buf_init(void) if (!xfslogd_workqueue) goto out_free_buf_zone; - xfsdatad_workqueue = alloc_workqueue("xfsdatad", WQ_MEM_RECLAIM, 1); - if (!xfsdatad_workqueue) - goto out_destroy_xfslogd_workqueue; - - xfsconvertd_workqueue = alloc_workqueue("xfsconvertd", - WQ_MEM_RECLAIM, 1); - if (!xfsconvertd_workqueue) - goto out_destroy_xfsdatad_workqueue; - return 0; - out_destroy_xfsdatad_workqueue: - destroy_workqueue(xfsdatad_workqueue); - out_destroy_xfslogd_workqueue: - destroy_workqueue(xfslogd_workqueue); out_free_buf_zone: kmem_zone_destroy(xfs_buf_zone); out: @@ -1821,8 +1806,6 @@ xfs_buf_init(void) void xfs_buf_terminate(void) { - destroy_workqueue(xfsconvertd_workqueue); - destroy_workqueue(xfsdatad_workqueue); destroy_workqueue(xfslogd_workqueue); kmem_zone_destroy(xfs_buf_zone); } Index: linux-2.6/fs/xfs/xfs_super.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_super.c 2011-11-10 16:50:40.771795378 +0100 +++ linux-2.6/fs/xfs/xfs_super.c 2011-11-15 09:17:13.763315819 +0100 @@ -769,6 +769,42 @@ xfs_setup_devices( return 0; } +STATIC int +xfs_init_mount_workqueues( + struct xfs_mount *mp) +{ + snprintf(mp->m_data_workqueue_name, XFS_WQ_NAME_LEN, + "xfs-data/%s", mp->m_fsname); + mp->m_data_workqueue = + alloc_workqueue(mp->m_data_workqueue_name, WQ_MEM_RECLAIM, 1); + if (!mp->m_data_workqueue) + goto out; + + snprintf(mp->m_unwritten_workqueue_name, XFS_WQ_NAME_LEN, + "xfs-conv/%s", mp->m_fsname); + mp->m_unwritten_workqueue = + alloc_workqueue(mp->m_unwritten_workqueue_name, + WQ_MEM_RECLAIM, 1); + if (!mp->m_unwritten_workqueue) + goto out_destroy_data_iodone_queue; + + return 0; + +out_destroy_data_iodone_queue: + destroy_workqueue(mp->m_data_workqueue); +out: + return -ENOMEM; +#undef XFS_WQ_NAME_LEN +} + +STATIC void +xfs_destroy_mount_workqueues( + struct xfs_mount *mp) +{ + destroy_workqueue(mp->m_data_workqueue); + destroy_workqueue(mp->m_unwritten_workqueue); +} + /* Catch misguided souls that try to use this interface on XFS */ STATIC struct inode * xfs_fs_alloc_inode( @@ -1020,6 +1056,7 @@ xfs_fs_put_super( xfs_unmountfs(mp); xfs_freesb(mp); xfs_icsb_destroy_counters(mp); + xfs_destroy_mount_workqueues(mp); xfs_close_devices(mp); xfs_free_fsname(mp); kfree(mp); @@ -1353,10 +1390,14 @@ xfs_fs_fill_super( if (error) goto out_free_fsname; - error = xfs_icsb_init_counters(mp); + error = xfs_init_mount_workqueues(mp); if (error) goto out_close_devices; + error = xfs_icsb_init_counters(mp); + if (error) + goto out_destroy_workqueues; + error = xfs_readsb(mp, flags); if (error) goto out_destroy_counters; @@ -1419,6 +1460,8 @@ xfs_fs_fill_super( xfs_freesb(mp); out_destroy_counters: xfs_icsb_destroy_counters(mp); +out_destroy_workqueues: + xfs_destroy_mount_workqueues(mp); out_close_devices: xfs_close_devices(mp); out_free_fsname: Index: linux-2.6/fs/xfs/xfs_mount.h =================================================================== --- linux-2.6.orig/fs/xfs/xfs_mount.h 2011-11-10 16:50:40.787796774 +0100 +++ linux-2.6/fs/xfs/xfs_mount.h 2011-11-15 09:15:25.053316473 +0100 @@ -211,6 +211,12 @@ typedef struct xfs_mount { struct shrinker m_inode_shrink; /* inode reclaim shrinker */ int64_t m_low_space[XFS_LOWSP_MAX]; /* low free space thresholds */ + + struct workqueue_struct *m_data_workqueue; + struct workqueue_struct *m_unwritten_workqueue; +#define XFS_WQ_NAME_LEN 512 + char m_data_workqueue_name[XFS_WQ_NAME_LEN]; + char m_unwritten_workqueue_name[XFS_WQ_NAME_LEN]; } xfs_mount_t; /* _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues 2011-11-15 20:14 ` [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues Christoph Hellwig @ 2011-11-16 19:01 ` Ben Myers 2011-11-17 7:40 ` Christoph Hellwig 2011-11-16 23:24 ` Dave Chinner 1 sibling, 1 reply; 14+ messages in thread From: Ben Myers @ 2011-11-16 19:01 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs On Tue, Nov 15, 2011 at 03:14:09PM -0500, Christoph Hellwig wrote: > commit 77d7a0c "xfs: Non-blocking inode locking in IO completion" introduced > a trylocked and defer scheme in xfs_setfilesize to avoid deadlocks when on > XFS filesystem is used ontop of another using the loop device, and we > fsync in the loop filesystem. > > Now that we have the cheap enough concurrency managed workqueues, we can > create per-filesystem instead of global workqueues and remove this scheme > again, given that it has the potential of delaying size updates and is not > helpful once we start to log the inode size. > > Signed-off-by: Christoph Hellwig <hch@lst.de> ... > /* > @@ -168,10 +161,12 @@ xfs_finish_ioend( > struct xfs_ioend *ioend) > { > if (atomic_dec_and_test(&ioend->io_remaining)) { > + struct xfs_mount *mp = XFS_I(ioend->io_inode)->i_mount; > + > if (ioend->io_type == IO_UNWRITTEN) > - queue_work(xfsconvertd_workqueue, &ioend->io_work); > + queue_work(mp->m_unwritten_workqueue, &ioend->io_work); > else if (xfs_ioend_is_append(ioend)) I wonder if we could skip size updates due to the 'fast and loose' nature of xfs_ioend_is_append, and end up destroying the ioend below, without updating the file size. It's not strictly related to your patch though. > - queue_work(xfsdatad_workqueue, &ioend->io_work); > + queue_work(mp->m_data_workqueue, &ioend->io_work); > else > xfs_destroy_ioend(ioend); > } ... > Index: linux-2.6/fs/xfs/xfs_super.c > =================================================================== > --- linux-2.6.orig/fs/xfs/xfs_super.c 2011-11-10 16:50:40.771795378 +0100 > +++ linux-2.6/fs/xfs/xfs_super.c 2011-11-15 09:17:13.763315819 +0100 > @@ -769,6 +769,42 @@ xfs_setup_devices( > return 0; > } > > +STATIC int > +xfs_init_mount_workqueues( > + struct xfs_mount *mp) > +{ > + snprintf(mp->m_data_workqueue_name, XFS_WQ_NAME_LEN, > + "xfs-data/%s", mp->m_fsname); > + mp->m_data_workqueue = > + alloc_workqueue(mp->m_data_workqueue_name, WQ_MEM_RECLAIM, 1); > + if (!mp->m_data_workqueue) > + goto out; > + > + snprintf(mp->m_unwritten_workqueue_name, XFS_WQ_NAME_LEN, > + "xfs-conv/%s", mp->m_fsname); > + mp->m_unwritten_workqueue = > + alloc_workqueue(mp->m_unwritten_workqueue_name, > + WQ_MEM_RECLAIM, 1); Hrm... mp->m_fsname can be MAXNAMELEN (256 in xfs), and XFS_WQ_NAME_LEN you chose is 512. As it stands there really isn't a problem here. And, it sounds like you are wanting to replace this once Tejun improves the interface... maybe that was worth pointing out. > + if (!mp->m_unwritten_workqueue) > + goto out_destroy_data_iodone_queue; > + > + return 0; > + > +out_destroy_data_iodone_queue: > + destroy_workqueue(mp->m_data_workqueue); > +out: > + return -ENOMEM; > +#undef XFS_WQ_NAME_LEN Reviewed-by: Ben Myers <bpm@sgi.com> _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues 2011-11-16 19:01 ` Ben Myers @ 2011-11-17 7:40 ` Christoph Hellwig 0 siblings, 0 replies; 14+ messages in thread From: Christoph Hellwig @ 2011-11-17 7:40 UTC (permalink / raw) To: Ben Myers; +Cc: Christoph Hellwig, xfs On Wed, Nov 16, 2011 at 01:01:20PM -0600, Ben Myers wrote: > On Tue, Nov 15, 2011 at 03:14:09PM -0500, Christoph Hellwig wrote: > > commit 77d7a0c "xfs: Non-blocking inode locking in IO completion" introduced > > a trylocked and defer scheme in xfs_setfilesize to avoid deadlocks when on > > XFS filesystem is used ontop of another using the loop device, and we > > fsync in the loop filesystem. > > > > Now that we have the cheap enough concurrency managed workqueues, we can > > create per-filesystem instead of global workqueues and remove this scheme > > again, given that it has the potential of delaying size updates and is not > > helpful once we start to log the inode size. > > > > Signed-off-by: Christoph Hellwig <hch@lst.de> > > ... > > > /* > > @@ -168,10 +161,12 @@ xfs_finish_ioend( > > struct xfs_ioend *ioend) > > { > > if (atomic_dec_and_test(&ioend->io_remaining)) { > > + struct xfs_mount *mp = XFS_I(ioend->io_inode)->i_mount; > > + > > if (ioend->io_type == IO_UNWRITTEN) > > - queue_work(xfsconvertd_workqueue, &ioend->io_work); > > + queue_work(mp->m_unwritten_workqueue, &ioend->io_work); > > else if (xfs_ioend_is_append(ioend)) > > I wonder if we could skip size updates due to the 'fast and loose' > nature of xfs_ioend_is_append, and end up destroying the ioend below, > without updating the file size. It's not strictly related to your patch > though. No - xfs_ioend_is_append check that the offset is beyond the on-disk inode size. The loose part is that we don't bother with the in-core i_size and i_new_size which could change due to I/O errors. di_size on the other hand will only go downwards during truncate, and we make sure all outstanding buffered I/Os have finished first. _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues 2011-11-15 20:14 ` [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues Christoph Hellwig 2011-11-16 19:01 ` Ben Myers @ 2011-11-16 23:24 ` Dave Chinner 2011-11-17 7:25 ` Christoph Hellwig 1 sibling, 1 reply; 14+ messages in thread From: Dave Chinner @ 2011-11-16 23:24 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs On Tue, Nov 15, 2011 at 03:14:09PM -0500, Christoph Hellwig wrote: > commit 77d7a0c "xfs: Non-blocking inode locking in IO completion" introduced > a trylocked and defer scheme in xfs_setfilesize to avoid deadlocks when on > XFS filesystem is used ontop of another using the loop device, and we > fsync in the loop filesystem. > > Now that we have the cheap enough concurrency managed workqueues, we can > create per-filesystem instead of global workqueues and remove this scheme > again, given that it has the potential of delaying size updates and is not > helpful once we start to log the inode size. > > Signed-off-by: Christoph Hellwig <hch@lst.de> .... > > +STATIC int > +xfs_init_mount_workqueues( > + struct xfs_mount *mp) > +{ > + snprintf(mp->m_data_workqueue_name, XFS_WQ_NAME_LEN, > + "xfs-data/%s", mp->m_fsname); > + mp->m_data_workqueue = > + alloc_workqueue(mp->m_data_workqueue_name, WQ_MEM_RECLAIM, 1); > + if (!mp->m_data_workqueue) > + goto out; > + > + snprintf(mp->m_unwritten_workqueue_name, XFS_WQ_NAME_LEN, > + "xfs-conv/%s", mp->m_fsname); > + mp->m_unwritten_workqueue = > + alloc_workqueue(mp->m_unwritten_workqueue_name, > + WQ_MEM_RECLAIM, 1); > + if (!mp->m_unwritten_workqueue) > + goto out_destroy_data_iodone_queue; > + > + return 0; > + > +out_destroy_data_iodone_queue: > + destroy_workqueue(mp->m_data_workqueue); > +out: > + return -ENOMEM; > +#undef XFS_WQ_NAME_LEN Don't need the undef here anymore as the #define is in the header file and not local to the function. > Index: linux-2.6/fs/xfs/xfs_mount.h > =================================================================== > --- linux-2.6.orig/fs/xfs/xfs_mount.h 2011-11-10 16:50:40.787796774 +0100 > +++ linux-2.6/fs/xfs/xfs_mount.h 2011-11-15 09:15:25.053316473 +0100 > @@ -211,6 +211,12 @@ typedef struct xfs_mount { > struct shrinker m_inode_shrink; /* inode reclaim shrinker */ > int64_t m_low_space[XFS_LOWSP_MAX]; > /* low free space thresholds */ > + > + struct workqueue_struct *m_data_workqueue; > + struct workqueue_struct *m_unwritten_workqueue; > +#define XFS_WQ_NAME_LEN 512 > + char m_data_workqueue_name[XFS_WQ_NAME_LEN]; > + char m_unwritten_workqueue_name[XFS_WQ_NAME_LEN]; > } xfs_mount_t; Can't say I'm a great fan of this - making the workqueue code use kstrdup() would be a much better better solution, IMO, just like was done a while for the SLAB cache names to solve exactly the same problem.... Regardless, Reviewed-by: Dave Chinner <dchinner@redhat.com> Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues 2011-11-16 23:24 ` Dave Chinner @ 2011-11-17 7:25 ` Christoph Hellwig 0 siblings, 0 replies; 14+ messages in thread From: Christoph Hellwig @ 2011-11-17 7:25 UTC (permalink / raw) To: Dave Chinner; +Cc: Christoph Hellwig, xfs On Thu, Nov 17, 2011 at 10:24:30AM +1100, Dave Chinner wrote: > Don't need the undef here anymore as the #define is in the header > file and not local to the function. I'll fix it up. > Can't say I'm a great fan of this - making the workqueue code use > kstrdup() would be a much better better solution, IMO, just like was > done a while for the SLAB cache names to solve exactly the same > problem.... Tejun has a patch to make the name argument to alloc_workqueue both dynamically allocated and varags format. That'll fix it, but we can't rely on it yet. I will switch over to it later in the 3.3 cycle. _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2011-11-17 7:40 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-11-08 8:56 [PATCH 0/5] log all file size updates Christoph Hellwig 2011-11-08 8:56 ` [PATCH 1/5] fix: force shutdown handling in xfs_end_io Christoph Hellwig 2011-11-08 8:56 ` [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues Christoph Hellwig 2011-11-08 23:11 ` Dave Chinner 2011-11-09 7:58 ` Christoph Hellwig 2011-11-10 17:42 ` Ben Myers 2011-11-14 10:34 ` Christoph Hellwig 2011-11-14 18:13 ` Tejun Heo 2011-11-08 8:56 ` [PATCH 3/5] xfs: do not require an ioend for new EOF calculation Christoph Hellwig -- strict thread matches above, loose matches on Subject: below -- 2011-11-15 20:14 [PATCH 0/5] for-3.2 queue Christoph Hellwig 2011-11-15 20:14 ` [PATCH 2/5] xfs: use per-filesystem I/O completion workqueues Christoph Hellwig 2011-11-16 19:01 ` Ben Myers 2011-11-17 7:40 ` Christoph Hellwig 2011-11-16 23:24 ` Dave Chinner 2011-11-17 7:25 ` Christoph Hellwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox