From: Chris Mason <clm@fb.com>
To: xfs@oss.sgi.com, Dave Chinner <david@fromorbit.com>,
Eric Sandeen <sandeen@redhat.com>
Subject: Re: [PATCH] xfs: don't zero partial page cache pages during O_DIRECT
Date: Fri, 8 Aug 2014 11:17:48 -0400
Message-ID: <53E4EA1C.6030009@fb.com>
In-Reply-To: <53E4E03A.7050101@fb.com>

On 08/08/2014 10:35 AM, Chris Mason wrote:
>
> xfs is using truncate_pagecache_range to invalidate the page cache
> during DIO reads. This is different from the other filesystems who only
> invalidate pages during DIO writes.
>
> truncate_pagecache_range is meant to be used when we are freeing the
> underlying data structs from disk, so it will zero any partial ranges
> in the page. This means a DIO read can zero out part of the page cache
> page, and it is possible the page will stay in cache.
>
> buffered reads will find an up to date page with zeros instead of the
> data actually on disk.
>
> This patch fixes things by leaving the page cache alone during DIO
> reads.
>
> We discovered this when our buffered IO program for distributing
> database indexes was finding zero filled blocks. I think writes
> are broken too, but I'll leave that for a separate patch because I don't
> fully understand what XFS needs to happen during a DIO write.

I stuck a cc: stable@vger.kernel.org after my sob, but then inserted a
giant test program. Just realized the cc might get lost... sorry, I
wasn't trying to sneak it in.

I've been trying to figure out why this bug doesn't show up in our 3.2
kernels but does show up now. Today xfs does this:

	truncate_pagecache_range(VFS_I(ip), pos, -1);

But in 3.2 we did this:

	ret = -xfs_flushinval_pages(ip,
			(iocb->ki_pos & PAGE_CACHE_MASK),
			-1, FI_REMAPF_LOCKED);

Since we've done pos & PAGE_CACHE_MASK, the 3.2 code never sent a
partial offset. So it never zeroed partial pages.
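
To make the difference concrete, here is a rough userspace sketch of the
arithmetic, not kernel code. The SKETCH_* macros and sketch_* helpers are
made-up names for illustration, and the 4 KiB page size is an assumption.
It models the behavior described above: a truncate-style call that starts
mid-page zeroes the tail of that page instead of dropping it, while a
page-aligned start leaves no partial page to zero.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_CACHE_SIZE may differ. */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Round a file offset down to the start of its page, which is what
 * (iocb->ki_pos & PAGE_CACHE_MASK) did in the 3.2 code. */
static uint64_t sketch_page_align(uint64_t pos)
{
	return pos & SKETCH_PAGE_MASK;
}

/* Model of how many bytes of the first page get zeroed when a
 * truncate-style range starts at 'start' and runs to end of file.
 * An aligned start has no partial page, so nothing is zeroed. */
static uint64_t sketch_partial_zero_bytes(uint64_t start)
{
	uint64_t in_page = start & (SKETCH_PAGE_SIZE - 1);

	return in_page ? SKETCH_PAGE_SIZE - in_page : 0;
}
```

With pos = 6144, passing pos straight through gives
sketch_partial_zero_bytes(6144) == 2048, so the second half of that page
would be zeroed in cache. Masking first gives sketch_page_align(6144) ==
4096, and sketch_partial_zero_bytes(4096) == 0.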
>
> Signed-off-by: Chris Mason <clm@fb.com>
> cc: stable@vger.kernel.org
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 1f66779..8d25d98 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -295,7 +295,11 @@ xfs_file_read_iter(
>  				xfs_rw_iunlock(ip, XFS_IOLOCK_EXCL);
>  				return ret;
>  			}
> -			truncate_pagecache_range(VFS_I(ip), pos, -1);
> +
> +			/* we don't remove any pages here.  A direct read
> +			 * does not invalidate any contents of the page
> +			 * cache
> +			 */
>  		}
>  		xfs_rw_ilock_demote(ip, XFS_IOLOCK_EXCL);
>  	}
>