* [PATCH 0/2] Improve writeout pattern from xfs_flush_pages()
@ 2011-08-03 20:49 Jan Kara
2011-08-03 20:49 ` [PATCH 1/2] vfs: Create filemap_flush_range() Jan Kara
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Jan Kara @ 2011-08-03 20:49 UTC (permalink / raw)
To: hch; +Cc: xfs, linux-fsdevel
Hi,
on one of our customer's machines, I've spotted an issue where sync(1) called
after writing a single huge file was achieving rather low throughput. After
debugging this with blktrace, I've found that the culprit was the flusher thread
racing with page writeout happening from the XFS sync code. The patches below
helped that case. Although they are not a complete solution, I believe they are
useful anyway, so please consider merging them...
Honza
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/2] vfs: Create filemap_flush_range()
2011-08-03 20:49 [PATCH 0/2] Improve writeout pattern from xfs_flush_pages() Jan Kara
@ 2011-08-03 20:49 ` Jan Kara
2011-08-03 20:49 ` [PATCH 2/2] xfs: Call filemap_flush_range() for async xfs_flush_pages() call Jan Kara
2011-08-03 21:42 ` [PATCH 0/2] Improve writeout pattern from xfs_flush_pages() Christoph Hellwig
2 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2011-08-03 20:49 UTC (permalink / raw)
To: hch; +Cc: xfs, linux-fsdevel, Jan Kara
Create a version of filemap_flush() function which can take a range.
Signed-off-by: Jan Kara <jack@suse.cz>
---
include/linux/fs.h | 2 ++
mm/filemap.c | 16 ++++++++++++++++
2 files changed, 18 insertions(+), 0 deletions(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 786b3b1..091cf90 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2177,6 +2177,8 @@ extern int invalidate_inode_pages2_range(struct address_space *mapping,
extern int write_inode_now(struct inode *, int);
extern int filemap_fdatawrite(struct address_space *);
extern int filemap_flush(struct address_space *);
+extern int filemap_flush_range(struct address_space *mapping, loff_t start,
+ loff_t end);
extern int filemap_fdatawait(struct address_space *);
extern int filemap_fdatawait_range(struct address_space *, loff_t lstart,
loff_t lend);
diff --git a/mm/filemap.c b/mm/filemap.c
index 867d402..8fd2347 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -253,6 +253,22 @@ int filemap_flush(struct address_space *mapping)
EXPORT_SYMBOL(filemap_flush);
/**
+ * filemap_flush_range - mostly a non-blocking flush of a mapping range
+ * @mapping: target address_space
+ * @start: offset in bytes where the range starts
+ * @end: offset in bytes where the range ends (inclusive)
+ *
+ * This is a mostly non-blocking flush. Not suitable for data-integrity
+ * purposes - I/O may not be started against all dirty pages.
+ */
+int filemap_flush_range(struct address_space *mapping, loff_t start,
+ loff_t end)
+{
+ return __filemap_fdatawrite_range(mapping, start, end, WB_SYNC_NONE);
+}
+EXPORT_SYMBOL(filemap_flush_range);
+
+/**
* filemap_fdatawait_range - wait for writeback to complete
* @mapping: address space structure to wait for
* @start_byte: offset in bytes where the range starts
--
1.7.1
* [PATCH 2/2] xfs: Call filemap_flush_range() for async xfs_flush_pages() call
2011-08-03 20:49 [PATCH 0/2] Improve writeout pattern from xfs_flush_pages() Jan Kara
2011-08-03 20:49 ` [PATCH 1/2] vfs: Create filemap_flush_range() Jan Kara
@ 2011-08-03 20:49 ` Jan Kara
2011-08-03 22:03 ` Christoph Hellwig
2011-08-03 22:34 ` Vivek Goyal
2011-08-03 21:42 ` [PATCH 0/2] Improve writeout pattern from xfs_flush_pages() Christoph Hellwig
2 siblings, 2 replies; 13+ messages in thread
From: Jan Kara @ 2011-08-03 20:49 UTC (permalink / raw)
To: hch; +Cc: xfs, linux-fsdevel, Jan Kara
XFS does its own data writeout from several places using xfs_flush_pages().
When this writeout races with the flusher thread writing the same inode,
performance suffers because the flusher thread submits writes as normal WRITE
requests while xfs_flush_pages() submits them as WRITE_SYNC (since it uses
filemap_fdatawrite_range()), and CFQ does not merge such requests. So we end up
with relatively small requests from the flusher thread interleaved with sync requests.
Short excerpt from blktrace:
254,16 2 16949 103.786278461 25233 Q W 54479888 + 424 [flush-254:16]
254,16 4 20662 103.786386751 25232 Q WS 54492920 + 1024 [dd]
254,16 4 20665 103.786444241 25232 Q WS 54493944 + 104 [dd]
254,16 2 16952 103.786533451 25233 Q W 54494048 + 288 [flush-254:16]
254,16 4 20668 103.787295241 25232 Q WS 54494336 + 1024 [dd]
254,16 4 20671 103.787332801 25232 Q WS 54495360 + 216 [dd]
254,16 2 16955 103.789062500 25233 Q W 54495576 + 1024 [flush-254:16]
(actually, I was observing even smaller requests on a different HW which isn't
available to me now).
Although we cannot fix all the cases (e.g. when fsync is called), we can use
filemap_flush_range() for the case when xfs_flush_pages() is called with the
XBF_ASYNC flag, which fixes the problem at least for some common cases like
the first round of sync writeback, or flushing of data during truncate or after
a file is closed.
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/xfs/linux-2.6/xfs_fs_subr.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/linux-2.6/xfs_fs_subr.c b/fs/xfs/linux-2.6/xfs_fs_subr.c
index ed88ed1..6f33164 100644
--- a/fs/xfs/linux-2.6/xfs_fs_subr.c
+++ b/fs/xfs/linux-2.6/xfs_fs_subr.c
@@ -70,10 +70,12 @@ xfs_flush_pages(
int ret2;
xfs_iflags_clear(ip, XFS_ITRUNCATED);
+ if (flags & XBF_ASYNC) {
+ return -filemap_flush_range(mapping, first,
+ last == -1 ? LLONG_MAX : last);
+ }
ret = -filemap_fdatawrite_range(mapping, first,
last == -1 ? LLONG_MAX : last);
- if (flags & XBF_ASYNC)
- return ret;
ret2 = xfs_wait_on_pages(ip, first, last);
if (!ret)
ret = ret2;
--
1.7.1
* Re: [PATCH 0/2] Improve writeout pattern from xfs_flush_pages()
2011-08-03 20:49 [PATCH 0/2] Improve writeout pattern from xfs_flush_pages() Jan Kara
2011-08-03 20:49 ` [PATCH 1/2] vfs: Create filemap_flush_range() Jan Kara
2011-08-03 20:49 ` [PATCH 2/2] xfs: Call filemap_flush_range() for async xfs_flush_pages() call Jan Kara
@ 2011-08-03 21:42 ` Christoph Hellwig
2011-08-04 10:36 ` Jan Kara
2 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2011-08-03 21:42 UTC (permalink / raw)
To: Jan Kara; +Cc: hch, xfs, linux-fsdevel
On Wed, Aug 03, 2011 at 10:49:03PM +0200, Jan Kara wrote:
>
> Hi,
>
> on one of our customer's machines, I've spotted an issue where sync(1) called
> after writing a single huge file was achieving rather low throughput. After
> debugging this with blktrace, I've found that the culprit was the flusher thread
> racing with page writeout happening from the XFS sync code. The patches below
> helped that case. Although they are not a complete solution, I believe they are
> useful anyway, so please consider merging them...
We currently have three calls to xfs_flush_pages with XBF_ASYNC set:
- xfs_setattr_size
- xfs_sync_inode_data
- xfs_release
The first one actually is a synchronous writeout, just implemented in
a rather odd way by doing the xfs_ioend_wait right after it, so your
change is actively harmful for it. The second is only called from
xfs_flush_worker, which is the workqueue offload when we hit ENOSPC.
I can see how this might race with the writeback code, but the correct
fix is to replace it with a call to writeback_inodes_sb(_if_idle)
once that one is fixed to do a trylock on s_umount and thus won't
deadlock. The third one is opportunistic writeout if a file got
truncated down on final release. filemap_flush probably is fine
here, but there's no need for a range version. If you replace it
with filemap_flush please also kill the useless wrapper while you're
at it.
* Re: [PATCH 2/2] xfs: Call filemap_flush_range() for async xfs_flush_pages() call
2011-08-03 20:49 ` [PATCH 2/2] xfs: Call filemap_flush_range() for async xfs_flush_pages() call Jan Kara
@ 2011-08-03 22:03 ` Christoph Hellwig
2011-08-03 22:34 ` Vivek Goyal
1 sibling, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2011-08-03 22:03 UTC (permalink / raw)
To: Jan Kara; +Cc: hch, xfs, linux-fsdevel
On Wed, Aug 03, 2011 at 10:49:05PM +0200, Jan Kara wrote:
> XFS does its own data writeout from several places using xfs_flush_pages().
> When this writeout races with the flusher thread writing the same inode,
> performance suffers because the flusher thread submits writes as normal WRITE
> requests while xfs_flush_pages() submits them as WRITE_SYNC (since it uses
Oh, I think you really want commit
xfs: improve sync behaviour in the face of aggressive dirtying
for your vendor tree.
* Re: [PATCH 2/2] xfs: Call filemap_flush_range() for async xfs_flush_pages() call
2011-08-03 20:49 ` [PATCH 2/2] xfs: Call filemap_flush_range() for async xfs_flush_pages() call Jan Kara
2011-08-03 22:03 ` Christoph Hellwig
@ 2011-08-03 22:34 ` Vivek Goyal
2011-08-04 10:03 ` Jan Kara
1 sibling, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2011-08-03 22:34 UTC (permalink / raw)
To: Jan Kara; +Cc: hch, linux-fsdevel, xfs
On Wed, Aug 03, 2011 at 10:49:05PM +0200, Jan Kara wrote:
> XFS does its own data writeout from several places using xfs_flush_pages().
> When this writeout races with the flusher thread writing the same inode,
> performance suffers because the flusher thread submits writes as normal WRITE
> requests while xfs_flush_pages() submits them as WRITE_SYNC (since it uses
> filemap_fdatawrite_range()), and CFQ does not merge such requests. So we end up
> with relatively small requests from the flusher thread interleaved with sync requests.
> Short excerpt from blktrace:
> 254,16 2 16949 103.786278461 25233 Q W 54479888 + 424 [flush-254:16]
> 254,16 4 20662 103.786386751 25232 Q WS 54492920 + 1024 [dd]
> 254,16 4 20665 103.786444241 25232 Q WS 54493944 + 104 [dd]
> 254,16 2 16952 103.786533451 25233 Q W 54494048 + 288 [flush-254:16]
> 254,16 4 20668 103.787295241 25232 Q WS 54494336 + 1024 [dd]
> 254,16 4 20671 103.787332801 25232 Q WS 54495360 + 216 [dd]
> 254,16 2 16955 103.789062500 25233 Q W 54495576 + 1024 [flush-254:16]
> (actually, I was observing even smaller requests on a different HW which isn't
> available to me now).
Jan,
Do you have the rest of the blktrace messages? I am curious whether, apart from
the smaller request size, CFQ idling might also be hurting in this
IO pattern. (grep for "fired" in your blktrace and see how many times the
idle timer fired).
We are probably idling on the "dd" thread. As long as dd's writes are not
dependent on the async writes, things will still be fine, as dd will continue
to make progress and the async writes will not make much progress.
But if that's not the case, and after submitting a bunch of synchronous
writes dd waits for some writes to finish (writes submitted by the flusher
thread), then idling will start hurting.
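Counting idle-timer activity as suggested can be a simple grep over the blkparse text output. A sketch follows; the trace.txt built here is an illustrative sample shaped like the CFQ messages in this thread, so point the greps at a real capture instead:

```shell
# Sketch: count CFQ idle-timer events in a blkparse text dump.
# The sample lines below are illustrative; replace trace.txt with your capture.
cat > trace.txt <<'EOF'
254,16 1 0 103.816811760 0 m N cfq25232 Not idling. st->count:1
254,16 1 0 103.826811760 0 m N cfq idle timer fired
EOF
echo "idle timer fired: $(grep -c 'fired' trace.txt)"
echo "not idling:       $(grep -c 'Not idling' trace.txt)"
```

A high "fired" count relative to queued requests would suggest the idling cost described above, on top of the small-request problem.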
Thanks
Vivek
>
> Although we cannot fix all the cases (e.g. when fsync is called), we can use
> filemap_flush_range() for the case when xfs_flush_pages() is called with the
> XBF_ASYNC flag, which fixes the problem at least for some common cases like
> the first round of sync writeback, or flushing of data during truncate or after
> a file is closed.
>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> fs/xfs/linux-2.6/xfs_fs_subr.c | 6 ++++--
> 1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/xfs/linux-2.6/xfs_fs_subr.c b/fs/xfs/linux-2.6/xfs_fs_subr.c
> index ed88ed1..6f33164 100644
> --- a/fs/xfs/linux-2.6/xfs_fs_subr.c
> +++ b/fs/xfs/linux-2.6/xfs_fs_subr.c
> @@ -70,10 +70,12 @@ xfs_flush_pages(
> int ret2;
>
> xfs_iflags_clear(ip, XFS_ITRUNCATED);
> + if (flags & XBF_ASYNC) {
> + return -filemap_flush_range(mapping, first,
> + last == -1 ? LLONG_MAX : last);
> + }
> ret = -filemap_fdatawrite_range(mapping, first,
> last == -1 ? LLONG_MAX : last);
> - if (flags & XBF_ASYNC)
> - return ret;
> ret2 = xfs_wait_on_pages(ip, first, last);
> if (!ret)
> ret = ret2;
> --
> 1.7.1
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
* Re: [PATCH 2/2] xfs: Call filemap_flush_range() for async xfs_flush_pages() call
2011-08-03 22:34 ` Vivek Goyal
@ 2011-08-04 10:03 ` Jan Kara
0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2011-08-04 10:03 UTC (permalink / raw)
To: Vivek Goyal; +Cc: Jan Kara, hch, linux-fsdevel, xfs
On Wed 03-08-11 18:34:39, Vivek Goyal wrote:
> On Wed, Aug 03, 2011 at 10:49:05PM +0200, Jan Kara wrote:
> > XFS does its own data writeout from several places using xfs_flush_pages().
> > When this writeout races with the flusher thread writing the same inode,
> > performance suffers because the flusher thread submits writes as normal WRITE
> > requests while xfs_flush_pages() submits them as WRITE_SYNC (since it uses
> > filemap_fdatawrite_range()), and CFQ does not merge such requests. So we end up
> > with relatively small requests from the flusher thread interleaved with sync requests.
> > Short excerpt from blktrace:
> > 254,16 2 16949 103.786278461 25233 Q W 54479888 + 424 [flush-254:16]
> > 254,16 4 20662 103.786386751 25232 Q WS 54492920 + 1024 [dd]
> > 254,16 4 20665 103.786444241 25232 Q WS 54493944 + 104 [dd]
> > 254,16 2 16952 103.786533451 25233 Q W 54494048 + 288 [flush-254:16]
> > 254,16 4 20668 103.787295241 25232 Q WS 54494336 + 1024 [dd]
> > 254,16 4 20671 103.787332801 25232 Q WS 54495360 + 216 [dd]
> > 254,16 2 16955 103.789062500 25233 Q W 54495576 + 1024 [flush-254:16]
> > (actually, I was observing even smaller requests on a different HW which isn't
> > available to me now).
>
> Do you have the rest of the blktrace messages? I am curious whether, apart from
> the smaller request size, CFQ idling might also be hurting in this
> IO pattern. (grep for "fired" in your blktrace and see how many times the
> idle timer fired).
At least in the trace I have (which is sadly not from the machine where
the problem was really bad - but that was with a 2.6.32 kernel anyway)
the idle timer is not a problem. Apparently we are not idling, because I see
messages like
254,16 1 0 103.816811760 0 m N cfq25232 Not idling. st->count:1
> We are probably idling on the "dd" thread. As long as dd's writes are not
> dependent on the async writes, things will still be fine, as dd will continue
> to make progress and the async writes will not make much progress.
>
> But if that's not the case, and after submitting a bunch of synchronous
> writes dd waits for some writes to finish (writes submitted by the flusher
> thread), then idling will start hurting.
Umm, we only have to wait for writes submitted by the flusher thread once
we have gone through the whole file and are in fdatawait(). So at least for
big files this won't be an issue. But thanks for your suggestions anyway.
Honza
> > Although we cannot fix all the cases (e.g. when fsync is called), we can use
> > filemap_flush_range() for the case when xfs_flush_pages() is called with the
> > XBF_ASYNC flag, which fixes the problem at least for some common cases like
> > the first round of sync writeback, or flushing of data during truncate or after
> > a file is closed.
> >
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> > fs/xfs/linux-2.6/xfs_fs_subr.c | 6 ++++--
> > 1 files changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/xfs/linux-2.6/xfs_fs_subr.c b/fs/xfs/linux-2.6/xfs_fs_subr.c
> > index ed88ed1..6f33164 100644
> > --- a/fs/xfs/linux-2.6/xfs_fs_subr.c
> > +++ b/fs/xfs/linux-2.6/xfs_fs_subr.c
> > @@ -70,10 +70,12 @@ xfs_flush_pages(
> > int ret2;
> >
> > xfs_iflags_clear(ip, XFS_ITRUNCATED);
> > + if (flags & XBF_ASYNC) {
> > + return -filemap_flush_range(mapping, first,
> > + last == -1 ? LLONG_MAX : last);
> > + }
> > ret = -filemap_fdatawrite_range(mapping, first,
> > last == -1 ? LLONG_MAX : last);
> > - if (flags & XBF_ASYNC)
> > - return ret;
> > ret2 = xfs_wait_on_pages(ip, first, last);
> > if (!ret)
> > ret = ret2;
> > --
> > 1.7.1
> >
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* Re: [PATCH 0/2] Improve writeout pattern from xfs_flush_pages()
2011-08-03 21:42 ` [PATCH 0/2] Improve writeout pattern from xfs_flush_pages() Christoph Hellwig
@ 2011-08-04 10:36 ` Jan Kara
2011-08-04 10:42 ` Christoph Hellwig
0 siblings, 1 reply; 13+ messages in thread
From: Jan Kara @ 2011-08-04 10:36 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jan Kara, xfs, linux-fsdevel
On Wed 03-08-11 17:42:06, Christoph Hellwig wrote:
> On Wed, Aug 03, 2011 at 10:49:03PM +0200, Jan Kara wrote:
> >
> > Hi,
> >
> > on one of our customer's machines, I've spotted an issue where sync(1) called
> > after writing a single huge file was achieving rather low throughput. After
> > debugging this with blktrace, I've found that the culprit was the flusher thread
> > racing with page writeout happening from the XFS sync code. The patches below
> > helped that case. Although they are not a complete solution, I believe they are
> > useful anyway, so please consider merging them...
>
> We currently have three calls to xfs_flush_pages with XBF_ASYNC set:
>
> - xfs_setattr_size
> - xfs_sync_inode_data
> - xfs_release
>
> The first one actually is a synchronous writeout, just implemented in
> a rather odd way by doing the xfs_ioend_wait right after it, so your
> change is actively harmful for it.
Oh, right. BTW, couldn't truncate get livelocked on a busy file because of
that xfs_ioend_wait()?
> The second is only called from xfs_flush_worker, which is the workqueue
> offload when we hit ENOSPC. I can see how this might race with the
> writeback code, but the correct fix is to replace it with a call to
> writeback_inodes_sb(_if_idle) once that one is fixed to do a trylock on
> s_umount and thus won't deadlock.
OK.
> The third one is opportunistic writeout if a file got truncated down on
> final release. filemap_flush probably is fine here, but there's no need
> for a range version. If you replace it with filemap_flush please also
> kill the useless wrapper while you're at it.
Do you mean xfs_flush_pages()? OK, I can do that.
Thanks for having a look.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* Re: [PATCH 0/2] Improve writeout pattern from xfs_flush_pages()
2011-08-04 10:36 ` Jan Kara
@ 2011-08-04 10:42 ` Christoph Hellwig
2011-08-04 12:07 ` Jan Kara
0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2011-08-04 10:42 UTC (permalink / raw)
To: Jan Kara; +Cc: Christoph Hellwig, xfs, linux-fsdevel
On Thu, Aug 04, 2011 at 12:36:16PM +0200, Jan Kara wrote:
> > The first one actually is a synchronous writeout, just implemented in
> > a rather odd way by doing the xfs_ioend_wait right after it, so your
> > change is actively harmful for it.
> Oh, right. BTW, couldn't truncate get livelocked on a busy file because of
> that xfs_ioend_wait()?
Not really. We require the iolock for new writes to start, and truncate
holds it exclusively. But I'm working on a series for 3.2 to remove
xfs_ioend_wait and just rely on inode_dio_wait for direct I/O, so it
will be gone soon. At this point I'll also have to switch to
filemap_write_and_wait_range for this caller.
> > The third one is opportunistic writeout if a file got truncated down on
> > final release. filemap_flush probably is fine here, but there's no need
> > for a range version. If you replace it with filemap_flush please also
> > kill the useless wrapper while you're at it.
> Do you mean xfs_flush_pages()? OK, I can do that.
Yes, xfs_flush_pages should go - at least the async version and its
abuse of the buffer flags.
* Re: [PATCH 0/2] Improve writeout pattern from xfs_flush_pages()
2011-08-04 10:42 ` Christoph Hellwig
@ 2011-08-04 12:07 ` Jan Kara
2011-08-04 12:19 ` Christoph Hellwig
0 siblings, 1 reply; 13+ messages in thread
From: Jan Kara @ 2011-08-04 12:07 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jan Kara, xfs, linux-fsdevel
On Thu 04-08-11 06:42:10, Christoph Hellwig wrote:
> On Thu, Aug 04, 2011 at 12:36:16PM +0200, Jan Kara wrote:
> > > The first one actually is a synchronous writeout, just implemented in
> > > a rather odd way by doing the xfs_ioend_wait right after it, so your
> > > change is actively harmful for it.
> > Oh, right. BTW, couldn't truncate get livelocked on a busy file because of
> > that xfs_ioend_wait()?
>
> Not really. We require the iolock for new writes to start, and truncate
> holds it exclusively. But I'm working on a series for 3.2 to remove
> xfs_ioend_wait and just rely on inode_dio_wait for direct I/O, so it
> will be gone soon. At this point I'll also have to switch to
> filemap_write_and_wait_range for this caller.
>
> > > The third one is opportunistic writeout if a file got truncated down on
> > > final release. filemap_flush probably is fine here, but there's no need
> > > for a range version. If you replace it with filemap_flush please also
> > > kill the useless wrapper while you're at it.
> > Do you mean xfs_flush_pages()? OK, I can do that.
>
> Yes, xfs_flush_pages should go - at least the async version and its
> abuse of the buffer flags.
Hmm, BTW, shouldn't the call to xfs_flush_pages() in
xfs_file_buffered_aio_write() be converted to an asynchronous one? I don't
quite see a point in waiting for IO completion... Generally, flushing of
the inode there seems of limited usefulness to me since that inode could be
just a tiny victim not holding many delayed-allocated blocks.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* Re: [PATCH 0/2] Improve writeout pattern from xfs_flush_pages()
2011-08-04 12:07 ` Jan Kara
@ 2011-08-04 12:19 ` Christoph Hellwig
2011-08-04 12:37 ` Jan Kara
0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2011-08-04 12:19 UTC (permalink / raw)
To: Jan Kara; +Cc: Christoph Hellwig, xfs, linux-fsdevel
On Thu, Aug 04, 2011 at 02:07:24PM +0200, Jan Kara wrote:
> Hmm, BTW, shouldn't the call to xfs_flush_pages() in
> xfs_file_buffered_aio_write() be converted to an asynchronous one? I don't
> quite see a point in waiting for IO completion... Generally, flushing of
> the inode there seems of limited usefulness to me since that inode could be
> just a tiny victim not holding many delayed-allocated blocks.
This comes from commit
xfs: make inode flush at ENOSPC synchronous
from Dave - before that it was asynchronous and in a weird context, so
it seems we definitely need it to be synchronous. I agree that just
flushing this inode seems like rather odd handling for ENOSPC. It's
even more odd as we already use the big hammer earlier, when we
get ENOSPC in ->write_begin. The only thing I can imagine is that
this is the last attempt to get anything freed.
* Re: [PATCH 0/2] Improve writeout pattern from xfs_flush_pages()
2011-08-04 12:19 ` Christoph Hellwig
@ 2011-08-04 12:37 ` Jan Kara
2011-08-04 12:41 ` Christoph Hellwig
0 siblings, 1 reply; 13+ messages in thread
From: Jan Kara @ 2011-08-04 12:37 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jan Kara, xfs, linux-fsdevel
On Thu 04-08-11 08:19:16, Christoph Hellwig wrote:
> On Thu, Aug 04, 2011 at 02:07:24PM +0200, Jan Kara wrote:
> > Hmm, BTW, shouldn't the call to xfs_flush_pages() in
> > xfs_file_buffered_aio_write() be converted to an asynchronous one? I don't
> > quite see a point in waiting for IO completion... Generally, flushing of
> > the inode there seems of limited usefulness to me since that inode could be
> > just a tiny victim not holding many delayed-allocated blocks.
>
> This comes from commit
>
> xfs: make inode flush at ENOSPC synchronous
>
> from Dave - before that it was asynchronous and in a weird context, so
> it seems we definitely need it to be synchronous.
From the changelog it seems it needs to be synchronous in the sense that
we don't offload flushing to a different thread as we used to. Also, the
reason why flushing previously didn't work was that we held page locks and
the IO lock, but that's not the case in xfs_file_buffered_aio_write()
anymore. So filemap_flush() still looks like an appropriate thing to me.
> I agree that just flushing this inode seems like rather odd handling
> for ENOSPC. It's even more odd as we already use the big hammer earlier,
> when we get ENOSPC in ->write_begin. The only thing I can imagine is
> that this is the last attempt to get anything freed.
OK, I'll leave it there then. I just wonder whether I should convert it
to filemap_flush() or to filemap_write_and_wait()...
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* Re: [PATCH 0/2] Improve writeout pattern from xfs_flush_pages()
2011-08-04 12:37 ` Jan Kara
@ 2011-08-04 12:41 ` Christoph Hellwig
0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2011-08-04 12:41 UTC (permalink / raw)
To: Jan Kara; +Cc: Christoph Hellwig, xfs, linux-fsdevel
On Thu, Aug 04, 2011 at 02:37:22PM +0200, Jan Kara wrote:
> > from Dave - before that it was asynchronous and in a weird context, so
> > it seems we definitely need it to be synchronous.
> From the changelog it seems it needs to be synchronous in the sense that
> we don't offload flushing to a different thread as we used to. Also, the
> reason why flushing previously didn't work was that we held page locks and
> the IO lock, but that's not the case in xfs_file_buffered_aio_write()
> anymore. So filemap_flush() still looks like an appropriate thing to me.
>
> > I agree that just flushing this inode seems like rather odd handling
> > for ENOSPC. It's even more odd as we already use the big hammer earlier,
> > when we get ENOSPC in ->write_begin. The only thing I can imagine is
> > that this is the last attempt to get anything freed.
> OK, I'll leave it there then. I just wonder whether I should convert it
> to filemap_flush() or to filemap_write_and_wait()...
My preference would be to not touch it unless we have a good reason.
end of thread, other threads:[~2011-08-04 12:41 UTC | newest]
Thread overview: 13+ messages
2011-08-03 20:49 [PATCH 0/2] Improve writeout pattern from xfs_flush_pages() Jan Kara
2011-08-03 20:49 ` [PATCH 1/2] vfs: Create filemap_flush_range() Jan Kara
2011-08-03 20:49 ` [PATCH 2/2] xfs: Call filemap_flush_range() for async xfs_flush_pages() call Jan Kara
2011-08-03 22:03 ` Christoph Hellwig
2011-08-03 22:34 ` Vivek Goyal
2011-08-04 10:03 ` Jan Kara
2011-08-03 21:42 ` [PATCH 0/2] Improve writeout pattern from xfs_flush_pages() Christoph Hellwig
2011-08-04 10:36 ` Jan Kara
2011-08-04 10:42 ` Christoph Hellwig
2011-08-04 12:07 ` Jan Kara
2011-08-04 12:19 ` Christoph Hellwig
2011-08-04 12:37 ` Jan Kara
2011-08-04 12:41 ` Christoph Hellwig