Date: Wed, 18 Mar 2009 12:44:08 -0700
Message-ID: <604427e00903181244w360c5519k9179d5c3e5cd6ab3@mail.gmail.com>
Subject: ftruncate-mmap: pages are lost after writing to mmaped file.
From: Ying Han
To: linux-kernel, linux-mm, Andrew Morton, guichaz@gmail.com, Alex Khesin, Mike Waychison, Rohit Seth

We triggered this failure during an internal experiment with an
ftruncate/mmap/write/read sequence: some pages are "lost" after
writing to the mmaped file, which shows up as count > 0 in the test
case below.

First we deployed the test case to a group of machines and saw a
failure rate of more than 20% on average. Then I ran a couple of
experiments to try to reproduce it on a single machine. What I found
is that:

1. If I add an fsync after writing the file, I cannot reproduce the
   issue.
2. If I add memory pressure (mmap/mlock) while running the test in an
   infinite loop, the failure reproduces quickly; a sketch of such a
   pressure load follows the test case below. (background flushing?)

The "bad page" count differs from run to run, ranging from one digit
up to four or five digits for the 128MB ftruncated file. The bad page
numbers are also contiguous within each segment, with the total bad
pages spanning several segments, e.g. "1-4, 9-20, 48-50". (batch
flushing?)

The failure was reproduced on 2.6.29-rc8, and it also happens on a
2.6.18 kernel. Here is the simple test case that reproduces it under
memory pressure:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>

long kMemSize = 128 << 20;  // 128MB file, mapped in full
int kPageSize = 4096;

int main(int argc, char **argv) {
  int status;
  int count = 0;
  long i;
  char *fname = "/root/test.mmap";
  char *mem;

  unlink(fname);
  int fd = open(fname, O_CREAT | O_EXCL | O_RDWR, 0600);
  if (fd < 0) {
    perror("open");
    return 1;
  }
  status = ftruncate(fd, kMemSize);
  if (status != 0) {
    perror("ftruncate");
    return 1;
  }
  mem = mmap(0, kMemSize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (mem == MAP_FAILED) {
    perror("mmap");
    return 1;
  }

  // Fill the memory with 1s.
  memset(mem, 1, kMemSize);

  // Read everything back and count pages whose first byte reverted
  // to 0, i.e. pages whose written data was lost.
  for (i = 0; i < kMemSize; i++) {
    int byte_good = mem[i] != 0;
    if (!byte_good && ((i % kPageSize) == 0)) {
      //printf("%d ", i / kPageSize);
      count++;
    }
  }

  munmap(mem, kMemSize);
  close(fd);
  unlink(fname);

  if (count > 0) {
    printf("Found %d bad pages\n", count);
    return 1;
  }
  return 0;
}

--Ying
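
P.S. The memory pressure in (2) was generated with mmap/mlock. A
minimal hog along those lines might look like this (a sketch only --
the 512MB allocation size is an assumed value; size it near the
machine's free memory):

// pressure.c: minimal mmap/mlock memory hog (illustrative sketch).
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
  size_t size = 512UL << 20;  // 512MB per pass; an assumed value
  for (;;) {
    char *p = mmap(0, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
      perror("mmap");
      return 1;
    }
    // mlock forces the pages resident right away; this may require
    // CAP_IPC_LOCK or a raised "ulimit -l".
    if (mlock(p, size) != 0)
      perror("mlock");
    memset(p, 0xff, size);  // touch every page
    munlock(p, size);
    munmap(p, size);
  }
  return 0;
}

A couple of instances of this running next to the looping test should
be enough to keep reclaim and background writeback busy.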