linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* non-aligned DIO reads on NFS are corrupting memory in 3.7.0
@ 2012-12-12 14:46 Jeff Layton
  2012-12-12 16:20 ` Fred Isaman
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff Layton @ 2012-12-12 14:46 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs, eguan

[-- Attachment #1: Type: text/plain, Size: 1525 bytes --]

One of our QA folks found that the attached testcase would segfault
when run on a recent rhel6 kernel that has a backport of the pnfs dio
code. I get the same segfault when I run it on a 3.7.0 kernel as well.

I think the problem is that because the buffer we're reading into is on
the stack, the kernel is scribbling over the rest of the page after the
read and corrupting it.

The problem, I think is this block in nfs_direct_read_completion():

-----------------------[snip]-----------------------
                if (test_bit(NFS_IOHDR_EOF, &hdr->flags)) {
                        if (bytes > hdr->good_bytes)
                                zero_user(page, 0, PAGE_SIZE);
                        else if (hdr->good_bytes - bytes < PAGE_SIZE)
                                zero_user_segment(page,
                                        hdr->good_bytes & ~PAGE_MASK,
                                        PAGE_SIZE);
                }
-----------------------[snip]-----------------------

If I comment that out, then the test passes and it doesn't scribble
over memory. I'm not clear on what that block is trying to accomplish.
If we get a short read in the DIO codepath, I don't think we ought to
be zeroing out the rest of the page. We should just return the number
of bytes read and be done with it.

I'm also suspicious of the "if (!PageCompound(page))" check in that
function as well. It doesn't seem like we ought to be marking pages
dirty in the DIO codepaths, should we?

-- 
Jeff Layton <jlayton@redhat.com>

[-- Attachment #2: dio.c --]
[-- Type: text/x-csrc, Size: 1475 bytes --]

#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define T_TEST_TEXT "This is a test\n"
#define BUF_SIZE 32
/* 16 won't work, 17 will cause segfault */
//#define READ_SIZE 16
#define READ_SIZE 17

int test_write(char *testfile)
{
	int fd;
	int ret;
	int status;
	char buf[BUF_SIZE];

	status = 0;
	printf("Write file with O_DIRECT\n");
	memset(buf, 'a', BUF_SIZE);
	/* Buffered write or direct write both work */
	fd = open(testfile, O_CREAT | O_RDWR | O_DIRECT | O_TRUNC, 0644);
//	fd = open(testfile, O_CREAT | O_RDWR | O_TRUNC, 0644);
	if (fd == -1) {
		printf("FAIL - open: %s\n", strerror(errno));
		exit(EXIT_FAILURE);
	}

	ret = write(fd, T_TEST_TEXT, sizeof(T_TEST_TEXT));
	if (ret == -1) {
		printf("FAIL - write: %s\n", strerror(errno));
		exit(EXIT_FAILURE);
	}

	close(fd);

	printf("Read file with O_DIRECT\n");
	/* Only direct read would get segfault */
	fd = open(testfile, O_RDONLY | O_DIRECT);
//	fd = open(testfile, O_RDONLY);
	ret = read(fd, buf, READ_SIZE);
	if (ret == -1) {
		printf("FAIL - re-read: %s\n", strerror(errno));
		exit(EXIT_FAILURE);
	}

	if (!strcmp(buf, T_TEST_TEXT)) {
		printf("PASS\n");
	} else {
		printf("FAIL - expect \"%s\" - got \"%s\"\n",
			T_TEST_TEXT, buf);
		status++;
	}
	close(fd);

	return status;
}

int main(int argc, char *argv[])
{
	int ret;
	ret = test_write(argv[1]);
	exit(ret);
}

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2012-12-12 16:31 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-12-12 14:46 non-aligned DIO reads on NFS are corrupting memory in 3.7.0 Jeff Layton
2012-12-12 16:20 ` Fred Isaman
2012-12-12 16:30   ` Myklebust, Trond
2012-12-12 16:31   ` Jeff Layton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).