From: Andrew Morton <akpm@linux-foundation.org>
To: Ying Han <yinghan@google.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	guichaz@gmail.com, Alex Khesin <alexk@google.com>,
	Mike Waychison <mikew@google.com>,
	Rohit Seth <rohitseth@google.com>,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: ftruncate-mmap: pages are lost after writing to mmaped file.
Date: Wed, 18 Mar 2009 15:11:57 -0700
Message-ID: <20090318151157.85109100.akpm@linux-foundation.org>
In-Reply-To: <604427e00903181244w360c5519k9179d5c3e5cd6ab3@mail.gmail.com>

On Wed, 18 Mar 2009 12:44:08 -0700 Ying Han <yinghan@google.com> wrote:

> We triggered this failure during an internal experiment with an
> ftruncate/mmap/write/read sequence, and found that some pages are
> "lost" after writing to the mmaped file. In the test case below this
> shows up as count > 0.
> 
> First we deployed the test case to a group of machines and saw a
> failure rate of more than 20% on average. Then I ran a couple of
> experiments to try to reproduce it on a single machine. What I found:
> 1. With an fsync added after writing the file, I cannot reproduce the issue.
> 2. With memory pressure (mmap/mlock) added while the test runs in an
> infinite loop, the failure reproduces quickly. (background flushing?)
> 
> The "bad pages" count differs on each run, from one digit to four or
> five digits for the 128M ftruncated file. I also found that the bad page
> numbers are contiguous within each segment, and the full set of bad
> pages contains several segments, e.g. "1-4, 9-20, 48-50". (batch flushing?)
> 
> (The failure was reproduced on 2.6.29-rc8 and also happens on a
> 2.6.18 kernel. Here is a simple test case that reproduces it when run
> under memory pressure.)
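
As a rough sketch of observation 1 above (this is not part of the original
report), the write path with the extra fsync might look like the helper
below. The name fill_mapped_file and its do_fsync flag are hypothetical,
and the helper takes the page size from sysconf() rather than hard-coding
4096. It only illustrates where the flush sits relative to the memset; it
does not explain why the flush hides the problem.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original report): size the file with
 * ftruncate(), fill it through a MAP_SHARED mapping, optionally fsync(),
 * then count pages whose first byte reads back as zero.
 * Returns the bad-page count, or -1 if a system call fails.
 */
static long fill_mapped_file(int fd, size_t size, int do_fsync)
{
	long page = sysconf(_SC_PAGESIZE);
	long bad = 0;
	char *mem;

	if (page <= 0 || ftruncate(fd, (off_t)size) != 0)
		return -1;

	mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (mem == MAP_FAILED)
		return -1;

	/* Fill the mapping with 1s, as the reported test case does. */
	memset(mem, 1, size);

	/* Observation 1: with this flush the failure was not reproduced. */
	if (do_fsync && fsync(fd) != 0) {
		munmap(mem, size);
		return -1;
	}

	/* Check the first byte of every page, as the reported test case does. */
	for (size_t off = 0; off < size; off += (size_t)page)
		if (mem[off] == 0)
			bad++;

	munmap(mem, size);
	return bad;
}
```

Calling it with do_fsync set to 0 corresponds to the failing sequence in
the report. The memory pressure from observation 2 is not shown here and
would have to be generated separately.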

Thanks.  This will be a regression - the testing I did back in the days
when I actually wrote stuff would have picked this up.

Perhaps it is a 2.6.17 thing.  Which, IIRC, is when we made the changes to
redirty pages on each write fault.  Or maybe it was something else.

Nick, Peter: I'm in .au at present, not able to build and run kernels - is
this something you'd have time to look into please?

Given the amount of time for which this bug has existed, I guess it isn't a
2.6.29 blocker, but once we've found out the cause we should have a little
post-mortem to work out how a bug of this nature has gone undetected for so
long.


> #include <sys/mman.h>
> #include <sys/types.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> 
> long kMemSize  = 128 << 20;
> int kPageSize = 4096;
> 
> int main(int argc, char **argv) {
> 	int status;
> 	int count = 0;
> 	int i;
> 	char *fname = "/root/test.mmap";
> 	char *mem;
> 
> 	unlink(fname);
> 	int fd = open(fname, O_CREAT | O_EXCL | O_RDWR, 0600);
> 	status = ftruncate(fd, kMemSize);
> 
> 	mem = mmap(0, kMemSize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
> 	// Fill the memory with 1s.
> 	memset(mem, 1, kMemSize);
> 
> 	for (i = 0; i < kMemSize; i++) {
> 		int byte_good = mem[i] != 0;
> 
> 		if (!byte_good && ((i % kPageSize) == 0)) {
> 			//printf("%d ", i / kPageSize);
> 			count++;
> 		}
> 	}
> 
> 	munmap(mem, kMemSize);
> 	close(fd);
> 	unlink(fname);
> 
> 	if (count > 0) {
> 		printf("Running %d bad page\n", count);
> 		return 1;
> 	}
> 	return 0;
> }
> 
> --Ying
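
The test case above only counts bad pages (the printf of page numbers is
commented out). To see the contiguous segments described in the report
("1-4, 9-20, 48-50"), the scan could be written along these lines. This is
a minimal sketch: report_bad_pages is a hypothetical name, and it keeps
the original criterion of checking only the first byte of each page.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Scan a mapping that was filled with non-zero bytes and report every page
 * whose first byte reads back as zero.  Contiguous bad pages are printed
 * as one range ("9-20"), single pages as one number.
 * Returns the number of bad pages.  `out` may be NULL to suppress printing.
 */
static size_t report_bad_pages(const char *mem, size_t len, size_t page_size,
			       FILE *out)
{
	size_t npages = len / page_size;
	size_t bad = 0;
	size_t start = 0;
	int in_run = 0;
	int printed = 0;

	/* The extra iteration at pg == npages closes a trailing run. */
	for (size_t pg = 0; pg <= npages; pg++) {
		int is_bad = pg < npages && mem[pg * page_size] == 0;

		if (is_bad) {
			if (!in_run) {
				start = pg;
				in_run = 1;
			}
			bad++;
		} else if (in_run) {
			if (out) {
				if (printed)
					fprintf(out, ", ");
				if (start == pg - 1)
					fprintf(out, "%zu", start);
				else
					fprintf(out, "%zu-%zu", start, pg - 1);
				printed = 1;
			}
			in_run = 0;
		}
	}
	if (out && printed)
		fputc('\n', out);
	return bad;
}
```

In the test case it would replace the byte-by-byte loop, for example
report_bad_pages(mem, kMemSize, kPageSize, stdout), with the return value
taking the place of count.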
