* TAKE 964002 - Don't wait for pending I/Os when purging blocks beyond eof.
@ 2007-12-10 5:59 Lachlan McIlroy
2007-12-11 8:11 ` Christoph Hellwig
2007-12-11 16:52 ` Bhagi rathi
0 siblings, 2 replies; 6+ messages in thread
From: Lachlan McIlroy @ 2007-12-10 5:59 UTC (permalink / raw)
To: sgi.bugs.xfs, xfs
Don't wait for pending I/Os when purging blocks beyond eof.
On last close of a file we purge blocks beyond eof. The same
code is used when we truncate the file size down. In this case
we need to wait for any pending I/Os for dirty pages beyond the
new eof. For the last close case we are not changing the file
size and therefore do not need to wait for any I/Os to complete.
This fixes a performance bottleneck where writes into the page
cache and cache flushes can become mutually exclusive.
Date: Mon Dec 10 16:59:09 AEDT 2007
Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-vniowait
Inspected by: pleckie
Author: lachlan
The following file(s) were checked into:
longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb
Modid: xfs-linux-melb:xfs-kern:30220a
fs/xfs/xfs_inode.c - 1.489 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.489&r2=text&tr2=1.488&f=h
- Don't wait for pending I/Os when purging blocks beyond eof.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: TAKE 964002 - Don't wait for pending I/Os when purging blocks beyond eof.
2007-12-10 5:59 TAKE 964002 - Don't wait for pending I/Os when purging blocks beyond eof Lachlan McIlroy
@ 2007-12-11 8:11 ` Christoph Hellwig
2007-12-11 23:25 ` David Chinner
2007-12-11 16:52 ` Bhagi rathi
1 sibling, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2007-12-11 8:11 UTC (permalink / raw)
To: Lachlan McIlroy; +Cc: sgi.bugs.xfs, xfs
On Mon, Dec 10, 2007 at 04:59:55PM +1100, Lachlan McIlroy wrote:
> Don't wait for pending I/Os when purging blocks beyond eof.
>
> On last close of a file we purge blocks beyond eof. The same
> code is used when we truncate the file size down. In this case
> we need to wait for any pending I/Os for dirty pages beyond the
> new eof. For the last close case we are not changing the file
> size and therefore do not need to wait for any I/Os to complete.
> This fixes a performance bottleneck where writes into the page
> cache and cache flushes can become mutually exclusive.
I think I shortened from of this should be in the comment above
the conditional vn_iowait intead of the current
/* wait for the completion of any pending DIOs */
which is wrong given that we don't wait for all pending direct I/O
requests.. (and vn_iowait doesn't wait for direct I/O anyway)
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: TAKE 964002 - Don't wait for pending I/Os when purging blocks beyond eof.
2007-12-11 8:11 ` Christoph Hellwig
@ 2007-12-11 23:25 ` David Chinner
2007-12-12 5:13 ` Christoph Hellwig
0 siblings, 1 reply; 6+ messages in thread
From: David Chinner @ 2007-12-11 23:25 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Lachlan McIlroy, sgi.bugs.xfs, xfs
On Tue, Dec 11, 2007 at 08:11:17AM +0000, Christoph Hellwig wrote:
> On Mon, Dec 10, 2007 at 04:59:55PM +1100, Lachlan McIlroy wrote:
> > Don't wait for pending I/Os when purging blocks beyond eof.
> >
> > On last close of a file we purge blocks beyond eof. The same
> > code is used when we truncate the file size down. In this case
> > we need to wait for any pending I/Os for dirty pages beyond the
> > new eof. For the last close case we are not changing the file
> > size and therefore do not need to wait for any I/Os to complete.
> > This fixes a performance bottleneck where writes into the page
> > cache and cache flushes can become mutually exclusive.
>
> I think I shortened from of this should be in the comment above
> the conditional vn_iowait intead of the current
>
> /* wait for the completion of any pending DIOs */
>
> which is wrong given that we don't wait for all pending direct I/O
> requests.. (and vn_iowait doesn't wait for direct I/O anyway)
vn_iowait() does wait for direct I/O. That was it's entire purpose - to be
able to prevent truncate vs direct I/O write races by tracking direct I/Os.
We increment ip->i_iocount in xfs_alloc_ioend() which is called from both the
buffered write and direct I/O write path, so vn_iowait() does wait for both
buffered and direct writes to complete.
FWIW, the freeze code makes use of this functionality (SYNC_IOWAIT) to ensure
all pending data writes are complete complete before the freeze is completed....
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: TAKE 964002 - Don't wait for pending I/Os when purging blocks beyond eof.
2007-12-11 23:25 ` David Chinner
@ 2007-12-12 5:13 ` Christoph Hellwig
0 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2007-12-12 5:13 UTC (permalink / raw)
To: David Chinner; +Cc: Christoph Hellwig, Lachlan McIlroy, sgi.bugs.xfs, xfs
On Wed, Dec 12, 2007 at 10:25:17AM +1100, David Chinner wrote:
> > which is wrong given that we don't wait for all pending direct I/O
> > requests.. (and vn_iowait doesn't wait for direct I/O anyway)
>
> vn_iowait() does wait for direct I/O. That was it's entire purpose - to be
> able to prevent truncate vs direct I/O write races by tracking direct I/Os.
> We increment ip->i_iocount in xfs_alloc_ioend() which is called from both the
> buffered write and direct I/O write path, so vn_iowait() does wait for both
> buffered and direct writes to complete.
Sorry, forgot a little important word above - it should read
'and vn_iowait doesn't wait _just_ for direct I/O anyway), because it
waits for completion of regular I/O aswell. Not that it should actually
matter in that caller.forgot a little important word above - it should
read
'and vn_iowait doesn't wait _just_ for direct I/O anyway), because it
waits for completion of regular I/O aswell. Not that it should actually
matter in that caller.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: TAKE 964002 - Don't wait for pending I/Os when purging blocks beyond eof.
2007-12-10 5:59 TAKE 964002 - Don't wait for pending I/Os when purging blocks beyond eof Lachlan McIlroy
2007-12-11 8:11 ` Christoph Hellwig
@ 2007-12-11 16:52 ` Bhagi rathi
2007-12-12 4:04 ` Lachlan McIlroy
1 sibling, 1 reply; 6+ messages in thread
From: Bhagi rathi @ 2007-12-11 16:52 UTC (permalink / raw)
To: Lachlan McIlroy; +Cc: sgi.bugs.xfs, xfs
On Dec 10, 2007 11:29 AM, Lachlan McIlroy <lachlan@sgi.com> wrote:
> Don't wait for pending I/Os when purging blocks beyond eof.
>
> On last close of a file we purge blocks beyond eof. The same
> code is used when we truncate the file size down. In this case
> we need to wait for any pending I/Os for dirty pages beyond the
> new eof. For the last close case we are not changing the file
> size and therefore do not need to wait for any I/Os to complete.
> This fixes a performance bottleneck where writes into the page
> cache and cache flushes can become mutually exclusive.
Lachlan,
How the following is addressed if we don't wait for I/O to complete during
close of the file and
issue truncate:
- We didn't waited for the I/O to complete
- Truncated blocks beyond EOF.
- These blocks are re-used for another transaction as meta-data.
- Meta-data flush was induced. However the flush of meta-data
reached first before the data I/O
because of some issues with under-lying driver. Later the file I/O
over-written the meta-data I/O.
We have corruption of data.
It seems the corruption could be in various ways. Isn't this the reason why
truncate has to wait
for the I/O to complete? I believe fundamental problem is once the blocks
are free'ed, the re-association should
not expect some I/O in concurrent to those same block addresses.
-Cheers,
Bhagi.
>
>
> Date: Mon Dec 10 16:59:09 AEDT 2007
> Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-vniowait
> Inspected by: pleckie
> Author: lachlan
>
> The following file(s) were checked into:
> longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb
>
>
> Modid: xfs-linux-melb:xfs-kern:30220a
> fs/xfs/xfs_inode.c - 1.489 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/> xfs_inode.c.diff?r1=text&tr1=1.489&r2=text&tr2=1.488&f=h
>
> http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.489&r2=text&tr2=1.488&f=h
> - Don't wait for pending I/Os when purging blocks beyond eof.
>
>
>
>
>
[[HTML alternate version deleted]]
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: TAKE 964002 - Don't wait for pending I/Os when purging blocks beyond eof.
2007-12-11 16:52 ` Bhagi rathi
@ 2007-12-12 4:04 ` Lachlan McIlroy
0 siblings, 0 replies; 6+ messages in thread
From: Lachlan McIlroy @ 2007-12-12 4:04 UTC (permalink / raw)
To: Bhagi rathi; +Cc: sgi.bugs.xfs, xfs
Bhagi rathi wrote:
> On Dec 10, 2007 11:29 AM, Lachlan McIlroy <lachlan@sgi.com> wrote:
>
>> Don't wait for pending I/Os when purging blocks beyond eof.
>>
>> On last close of a file we purge blocks beyond eof. The same
>> code is used when we truncate the file size down. In this case
>> we need to wait for any pending I/Os for dirty pages beyond the
>> new eof. For the last close case we are not changing the file
>> size and therefore do not need to wait for any I/Os to complete.
>> This fixes a performance bottleneck where writes into the page
>> cache and cache flushes can become mutually exclusive.
>
>
> Lachlan,
>
> How the following is addressed if we don't wait for I/O to complete during
> close of the file and
> issue truncate:
>
> - We didn't waited for the I/O to complete
> - Truncated blocks beyond EOF.
> - These blocks are re-used for another transaction as meta-data.
> - Meta-data flush was induced. However the flush of meta-data
> reached first before the data I/O
> because of some issues with under-lying driver. Later the file I/O
> over-written the meta-data I/O.
> We have corruption of data.
This case will not happen. For a truncate down we may have dirty pages
beyond eof that are in the process of being written to disk while we are
trying to shrink the file - we need to wait for those. In the case of
truncating blocks beyond eof on last close we are not changing the file
size and so there cannot be pages beyond eof. Any I/O that may be in
progress will be within the file size and not to any of the blocks we
are trying to free.
>
> It seems the corruption could be in various ways. Isn't this the reason why
> truncate has to wait
> for the I/O to complete?
Corruption could certainly occur if we don't wait for I/O. I think the
reason this code was added was to synchronize with direct I/O unwritten
extent conversions which could occur after the direct writer thread has
released the iolock and so we can't use the iolock alone as an I/O barrier.
> I believe fundamental problem is once the blocks
> are free'ed, the re-association should
> not expect some I/O in concurrent to those same block addresses.
>
> -Cheers,
> Bhagi.
>
>
>>
>> Date: Mon Dec 10 16:59:09 AEDT 2007
>> Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-vniowait
>> Inspected by: pleckie
>> Author: lachlan
>>
>> The following file(s) were checked into:
>> longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb
>>
>>
>> Modid: xfs-linux-melb:xfs-kern:30220a
>> fs/xfs/xfs_inode.c - 1.489 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/>> xfs_inode.c.diff?r1=text&tr1=1.489&r2=text&tr2=1.488&f=h
> http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/> xfs_inode.c.diff?r1=text&tr1=1.489&r2=text&tr2=1.488&f=h
>> http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.489&r2=text&tr2=1.488&f=h
>> - Don't wait for pending I/Os when purging blocks beyond eof.
>>
>>
>>
>>
>>
>
>
> [[HTML alternate version deleted]]
>
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2007-12-12 5:14 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-12-10 5:59 TAKE 964002 - Don't wait for pending I/Os when purging blocks beyond eof Lachlan McIlroy
2007-12-11 8:11 ` Christoph Hellwig
2007-12-11 23:25 ` David Chinner
2007-12-12 5:13 ` Christoph Hellwig
2007-12-11 16:52 ` Bhagi rathi
2007-12-12 4:04 ` Lachlan McIlroy
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox