linux-nfs.vger.kernel.org archive mirror
From: Quentin Barnes <qbarnes@gmail.com>
To: linux-nfs@vger.kernel.org
Subject: nfs-backed mmap file results in 1000s of WRITEs per second
Date: Thu, 5 Sep 2013 11:21:10 -0500
Message-ID: <20130905162110.GA17920@gmail.com>

If two (or more) processes are doing nothing more than writing to
the memory addresses of an mmapped shared file on an NFS mounted
file system, it results in the kernel scribbling WRITEs to the
server as fast as it can (1000s per second) even while no syscalls
are going on.

The problem happens on NFS clients mounting NFSv3 or NFSv4.  I've
reproduced this on the 3.11 kernel, and it happens as far back as
RHEL6 (2.6.32 based).  However, it is not a problem on RHEL5 (2.6.18
based).  (All x86_64 systems.)  I didn't try anything in between.

I've created a self-contained program below that will demonstrate
the problem (call it "t1").  Assuming /mnt has an NFS file system:

  $ t1 /mnt/mynfsfile 1    # Fork 1 writer, kernel behaves normally
  $ t1 /mnt/mynfsfile 2    # Fork 2 writers, kernel goes crazy WRITEing

Just run "watch -d nfsstat" in another window while running the
two-writer test and watch the WRITE count explode.

I don't see anything particularly wrong with what the example code
is doing with its use of mmap.  Is there anything undefined about
the code that would explain this behavior, or is this an NFS bug
that's really lived this long?

Quentin



#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

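/* Signal our whole process group, reap the children, return the count. */
int
kill_children(void)
{
	int		cnt = 0;
	siginfo_t	infop;

	/* Ignore SIGINT ourselves, since kill(0, ...) also signals the caller. */
	signal(SIGINT, SIG_IGN);
	kill(0, SIGINT);
	/* waitid() fails with ECHILD once no children remain. */
	while (waitid(P_ALL, 0, &infop, WEXITED) != -1)
		++cnt;

	return cnt;
}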

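/*
 * SIGINT handler for the parent.  printf() and exit() are not
 * async-signal-safe; that is tolerated here only because the parent
 * is normally parked in pause() when the signal arrives.
 */
void
sighandler(int sig)
{
	(void)sig;
	printf("Cleaning up all children.\n");
	int cnt = kill_children();
	printf("Cleaned up %d child%s.\n", cnt, cnt == 1 ? "" : "ren");

	exit(0);
}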

/* Store to the shared mapping forever; no syscalls are made. */
void
do_child(volatile int *iaddr)
{
	while (1)
		*iaddr = 1;
}

int
main(int argc, char **argv)
{
	const char	*path;
	int		fd;
	ssize_t		wlen;
	int		*ip;
	int		fork_count = 1;

	if (argc == 1) {
		fprintf(stderr, "Usage: %s {filename} [fork_count].\n",
			argv[0]);
		return 1;
	}

	path = argv[1];

	if (argc > 2) {
		int fc = atoi(argv[2]);
		if (fc >= 0)
			fork_count = fc;
	}

	fd = open(path, O_CREAT|O_TRUNC|O_RDWR|O_APPEND, S_IRUSR|S_IWUSR);
	if (fd < 0) {
		fprintf(stderr, "Open of '%s' failed: %s (%d)\n",
			path, strerror(errno), errno);
		return 1;
	}

	wlen = write(fd, &(int){0}, sizeof(int));
	if (wlen != sizeof(int)) {
		if (wlen < 0)
			fprintf(stderr, "Write of '%s' failed: %s (%d)\n",
				path, strerror(errno), errno);
		else
			fprintf(stderr, "Short write to '%s'\n", path);
		return 1;
	}

	ip = (int *)mmap(NULL, sizeof(int), PROT_READ|PROT_WRITE,
			   MAP_SHARED, fd, 0);
	if (ip == MAP_FAILED) {
		fprintf(stderr, "Mmap of '%s' failed: %s (%d)\n",
			path, strerror(errno), errno);
		return 1;
	}

	signal(SIGINT, sighandler);

	while (fork_count-- > 0) {
		switch(fork()) {
		case -1:
			fprintf(stderr, "Fork failed: %s (%d)\n",
				strerror(errno), errno);
			kill_children();
			return 1;
		case 0:   /* child  */
			signal(SIGINT, SIG_DFL);
			do_child(ip);
			break;
		default:  /* parent */
			break;
		}
	}

	printf("Press ^C to terminate test.\n");
	pause();

	return 0;
}

Thread overview: 20+ messages
2013-09-05 16:21 Quentin Barnes [this message]
2013-09-05 17:03 ` nfs-backed mmap file results in 1000s of WRITEs per second Malahal Naineni
2013-09-05 19:11   ` Quentin Barnes
2013-09-05 20:02     ` Myklebust, Trond
2013-09-05 21:36       ` Quentin Barnes
2013-09-05 21:57         ` Myklebust, Trond
2013-09-05 22:34           ` Quentin Barnes
2013-09-06 13:36             ` Jeff Layton
2013-09-06 15:00               ` Myklebust, Trond
2013-09-06 15:04                 ` Jeff Layton
2013-09-06 15:39                   ` Myklebust, Trond
2013-09-08 14:25                     ` William Dauchy
2013-09-06 16:48               ` Quentin Barnes
2013-09-07 14:51                 ` Jeff Layton
2013-09-07 15:00                   ` Myklebust, Trond
2013-09-09 13:04                 ` Jeff Layton
2013-09-09 17:32                   ` Quentin Barnes
2013-09-09 17:47                     ` Myklebust, Trond
2013-09-09 18:21                       ` Jeff Layton
2013-09-05 22:07         ` Myklebust, Trond