* testing stable pages being modified
@ 2013-05-30 22:36 Zach Brown
2013-05-31 5:11 ` Chris Mason
0 siblings, 1 reply; 6+ messages in thread
From: Zach Brown @ 2013-05-30 22:36 UTC (permalink / raw)
To: linux-fsdevel, linux-btrfs
'stable' pages have always been a bit of a fiction. It's easy to
intentionally modify stable pages under io with some help from page
references that ignore mappings and page state.
Here's little test that uses O_DIRECT to get the pinned aio ring pages
under IO and then has event completion stores modify them while they're
in flight.
It's a nice quick way to test the consequences of stable pages being
modified. It can be used to burp out ratelimited csum failure kernel
messages with btrfs, for example.
- z
#define _GNU_SOURCE
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <limits.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/uio.h>
#include <assert.h>
#include <libaio.h>
int main(int argc, char **argv)
{
size_t total = 1 * 1024 * 1024;
size_t page_size = sysconf(_SC_PAGESIZE);
struct iovec *iov;
size_t iov_nr = total / page_size;
void *junk;
io_context_t ctx = NULL;
int nr_iocbs = 3;
struct iocb iocbs[nr_iocbs];
struct iocb *iocb_ptrs[nr_iocbs];
struct io_event events[nr_iocbs];
int ret;
int fd;
int nr;
int i;
if (argc != 2) {
fprintf(stderr, "usage: %s <file_to_overwrite>\n", argv[0]);
exit(1);
}
iov = calloc(iov_nr, sizeof(*iov));
junk = malloc(total);
assert(iov && junk);
fd = open(argv[1], O_RDWR|O_CREAT|O_DIRECT, 0644);
assert(fd >= 0);
ret = io_setup(nr_iocbs, &ctx);
assert(ret >= 0);
for (i = 0; i < iov_nr; i++) {
iov[i].iov_base = ctx;
iov[i].iov_len = page_size;
}
/* initial write to allocate the file region */
ret = writev(fd, iov, iov_nr);
assert(ret == total);
/*
* Keep one of each of these iocbs in flight:
*
* [0]: hopefully fast 0 byte read to keep churning events
* [1]: dio read of file bytes to trigger csum verification
* [2]: dio write of unstable event pages
*/
io_prep_pread(&iocbs[0], fd, junk, 0, 0);
io_prep_pread(&iocbs[1], fd, junk, total, 0);
io_prep_pwritev(&iocbs[2], fd, iov, iov_nr, 0);
for (i = 0; i < nr_iocbs; i++)
iocb_ptrs[i] = &iocbs[i];
nr = nr_iocbs;
for(;;) {
ret = io_submit(ctx, nr, iocb_ptrs);
assert(ret == nr);
nr = io_getevents(ctx, 1, nr_iocbs, events, NULL);
assert(nr > 0);
for (i = 0; i < nr; i++)
iocb_ptrs[i] = events[i].obj;
}
return 0;
}
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: testing stable pages being modified
2013-05-30 22:36 testing stable pages being modified Zach Brown
@ 2013-05-31 5:11 ` Chris Mason
2013-05-31 6:24 ` Zach Brown
0 siblings, 1 reply; 6+ messages in thread
From: Chris Mason @ 2013-05-31 5:11 UTC (permalink / raw)
To: Zach Brown, linux-fsdevel@vger.kernel.org,
linux-btrfs@vger.kernel.org
Quoting Zach Brown (2013-05-30 18:36:10)
> 'stable' pages have always been a bit of a fiction. It's easy to
> intentionally modify stable pages under io with some help from page
> references that ignore mappings and page state.
>
> Here's little test that uses O_DIRECT to get the pinned aio ring pages
> under IO and then has event completion stores modify them while they're
> in flight.
>
> It's a nice quick way to test the consequences of stable pages being
> modified. It can be used to burp out ratelimited csum failure kernel
> messages with btrfs, for example.
Changing O_DIRECT in flight has always been a deep dark corner case, and
crc errors are the expected result. Have you found anyone doing this in
real life?
I do like the small test program though, we should extend it into a test
to make sure crcs are really crcing.
-chris
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: testing stable pages being modified
2013-05-31 5:11 ` Chris Mason
@ 2013-05-31 6:24 ` Zach Brown
2013-05-31 13:29 ` Josef Bacik
0 siblings, 1 reply; 6+ messages in thread
From: Zach Brown @ 2013-05-31 6:24 UTC (permalink / raw)
To: Chris Mason; +Cc: linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org
> Changing O_DIRECT in flight has always been a deep dark corner case, and
> crc errors are the expected result. Have you found anyone doing this in
> real life?
Agreed; and no, I haven't heard of people accidentally modifying stable
pages.
> I do like the small test program though, we should extend it into a test
> to make sure crcs are really crcing.
Yeah, the little test is more to make sure that the error paths handle
the modification case as expected :).
- z
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: testing stable pages being modified
2013-05-31 6:24 ` Zach Brown
@ 2013-05-31 13:29 ` Josef Bacik
2013-05-31 13:53 ` Chris Mason
0 siblings, 1 reply; 6+ messages in thread
From: Josef Bacik @ 2013-05-31 13:29 UTC (permalink / raw)
To: Zach Brown
Cc: Chris Mason, linux-fsdevel@vger.kernel.org,
linux-btrfs@vger.kernel.org
On Fri, May 31, 2013 at 12:24:30AM -0600, Zach Brown wrote:
> > Changing O_DIRECT in flight has always been a deep dark corner case, and
> > crc errors are the expected result. Have you found anyone doing this in
> > real life?
>
> Agreed; and no, I haven't heard of people accidentally modifying stable
> pages.
>
Windows does this, it also will send down the same page for different offsets
which is why we have that special check in check_direct_IO for reads because
that would cause lots of csum errors too. I tried to fix the modified in flight
problem by checking the csums of the pages in the io completion handler and
re-submitting the IO if the pages had changed, but this of course dramatically
reduced performance for all of those well behaved O_DIRECT applications, so in
the end I just set nodatasum for that vm image and carried on. I'm not sure
what the solution is for this problem. Thanks,
Josef
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: testing stable pages being modified
2013-05-31 13:29 ` Josef Bacik
@ 2013-05-31 13:53 ` Chris Mason
2013-05-31 14:34 ` Josef Bacik
0 siblings, 1 reply; 6+ messages in thread
From: Chris Mason @ 2013-05-31 13:53 UTC (permalink / raw)
To: Josef Bacik, Zach Brown
Cc: linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org
Quoting Josef Bacik (2013-05-31 09:29:07)
> On Fri, May 31, 2013 at 12:24:30AM -0600, Zach Brown wrote:
> > > Changing O_DIRECT in flight has always been a deep dark corner case, and
> > > crc errors are the expected result. Have you found anyone doing this in
> > > real life?
> >
> > Agreed; and no, I haven't heard of people accidentally modifying stable
> > pages.
> >
>
> Windows does this, it also will send down the same page for different offsets
> which is why we have that special check in check_direct_IO for reads because
> that would cause lots of csum errors too. I tried to fix the modified in flight
> problem by checking the csums of the pages in the io completion handler and
> re-submitting the IO if the pages had changed, but this of course dramatically
> reduced performance for all of those well behaved O_DIRECT applications, so in
> the end I just set nodatasum for that vm image and carried on. I'm not sure
> what the solution is for this problem. Thanks,
Ugh, I forgot about the windows case. KVM should really be copying the
pages for sectors where IO is already in flight.
We could do the copies ourselves: mount -o dio_copies
-chris
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: testing stable pages being modified
2013-05-31 13:53 ` Chris Mason
@ 2013-05-31 14:34 ` Josef Bacik
0 siblings, 0 replies; 6+ messages in thread
From: Josef Bacik @ 2013-05-31 14:34 UTC (permalink / raw)
To: Chris Mason
Cc: Josef Bacik, Zach Brown, linux-fsdevel@vger.kernel.org,
linux-btrfs@vger.kernel.org
On Fri, May 31, 2013 at 07:53:29AM -0600, Chris Mason wrote:
> Quoting Josef Bacik (2013-05-31 09:29:07)
> > On Fri, May 31, 2013 at 12:24:30AM -0600, Zach Brown wrote:
> > > > Changing O_DIRECT in flight has always been a deep dark corner case, and
> > > > crc errors are the expected result. Have you found anyone doing this in
> > > > real life?
> > >
> > > Agreed; and no, I haven't heard of people accidentally modifying stable
> > > pages.
> > >
> >
> > Windows does this, it also will send down the same page for different offsets
> > which is why we have that special check in check_direct_IO for reads because
> > that would cause lots of csum errors too. I tried to fix the modified in flight
> > problem by checking the csums of the pages in the io completion handler and
> > re-submitting the IO if the pages had changed, but this of course dramatically
> > reduced performance for all of those well behaved O_DIRECT applications, so in
> > the end I just set nodatasum for that vm image and carried on. I'm not sure
> > what the solution is for this problem. Thanks,
>
> Ugh, I forgot about the windows case. KVM should really be copying the
> pages for sectors where IO is already in flight.
>
> We could do the copies ourselves: mount -o dio_copies
>
That would work for us, but what about other people that rely on stable pages,
like *fs on iscsi and such? It might be good to have a generic mount option
that the vfs notices and makes the copying happen before it gets to the file
system and that way we're all save and don't have a different solution/mount
option for each fs. Thanks,
Josef
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2013-05-31 14:34 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-05-30 22:36 testing stable pages being modified Zach Brown
2013-05-31 5:11 ` Chris Mason
2013-05-31 6:24 ` Zach Brown
2013-05-31 13:29 ` Josef Bacik
2013-05-31 13:53 ` Chris Mason
2013-05-31 14:34 ` Josef Bacik
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).