linux-mm.kvack.org archive mirror
* Partialy mapped page stays in page cache after unmap
@ 2012-10-30 18:24 chrubis
  2012-10-31 12:52 ` Bob Liu
  2012-12-03  6:26 ` Vineet Gupta
  0 siblings, 2 replies; 5+ messages in thread
From: chrubis @ 2012-10-30 18:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, Hugh Dickins, Michel Lespinasse, Ingo Molnar, Al Viro,
	Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 1201 bytes --]

Hi!
I'm currently revisiting the mmap-related tests in LTP (Linux Test Project)
and I've come to the tests that check that writes to the partially
mapped page (at the end of the mapping) are carried out correctly.

These tests fail because even after the object is unmapped and the
file descriptor closed, the pages still stay in the page cache, so if
the file is opened and mapped again (possibly by another process) the
whole content of the partial page is preserved.

Strictly speaking this is not a bug, at least for regular files, as
POSIX only says that the change is not written out. In this case the
file content is correct, and forcing the data to be written out with
msync() makes the test pass. The SHM mappings seem to preserve the
content even after calling msync(), which is, in my opinion, a POSIX
violation, although a minor one.

Looking at the test results I have, the file-based mmap test worked fine
on 2.6.5 (or perhaps the page cache was working/set up differently and
the test succeeded by accident).

Attached is a stripped-down LTP test for the problem; uncommenting the
msync() makes the test succeed.

I would like to hear your opinions on this problem.

-- 
Cyril Hrubis
chrubis@suse.cz

[-- Attachment #2: reproducer.c --]
[-- Type: text/x-c, Size: 1964 bytes --]

#define _XOPEN_SOURCE 600

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>

int main(void)
{
	char tmpfname[256];
	long page_size;
	long total_size;

	void *pa;
	size_t len;
	int i, fd;
	
	char *ch;

	page_size = sysconf(_SC_PAGE_SIZE);

	/* Size of the file to be mapped */
	total_size = page_size / 2;

	/* mmap will create a partial page */
	len = page_size / 2;

	snprintf(tmpfname, sizeof(tmpfname), "/tmp/pts_mmap_11_5_%d", getpid());
	
	/* Create shared file */
	unlink(tmpfname);
	fd = open(tmpfname, O_CREAT | O_RDWR | O_EXCL, S_IRUSR | S_IWUSR);
	if (fd == -1) {
		printf("Error at open(): %s\n", strerror(errno));
		return 1;
	}
	if (ftruncate(fd, total_size) == -1) {
		printf("Error at ftruncate(): %s\n", strerror(errno));
		return 1;
	}

	pa = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (pa == MAP_FAILED) {
		printf("Error at mmap(): %s\n", strerror(errno));
		return 1;
	}
		
	ch = (char*)pa + len + 1;

	/* Check that the partial page is zero-filled */
	for (i = 0; i < page_size/2 - 1; i++) {
		if (ch[i] != 0) {
			printf("Test FAILED: The partial page at the "
			       "end of the file is not zero-filled\n");
			return 1;
		}
	}

	/* Write to the partial page */
	*ch = 'b';
	//msync(pa, len, MS_SYNC);
	munmap(pa, len);
	close(fd);

	/* Open and map it again */
	fd = open(tmpfname, O_RDWR, 0);
	if (fd == -1) {
		printf("Error at 2nd open(): %s\n", strerror(errno));
		return 1;
	}
	unlink(tmpfname);

	pa = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (pa == MAP_FAILED) {
		printf("Error at 2nd mmap(): %s\n", strerror(errno));
		return 1;
	}

	/* Cast first: arithmetic on void * is a GNU extension */
	ch = (char*)pa + len + 1;
	if (*ch == 'b') {
		printf("Test FAILED: Modification of the partial page "
		       "at the end of an object is written out\n");
		return 1;
	}
	
	close(fd);
	munmap(pa, len);

	printf("Test PASSED\n");
	return 0;
}


* Re: Partialy mapped page stays in page cache after unmap
  2012-10-30 18:24 Partialy mapped page stays in page cache after unmap chrubis
@ 2012-10-31 12:52 ` Bob Liu
  2012-10-31 14:19   ` chrubis
  2012-12-03  6:26 ` Vineet Gupta
  1 sibling, 1 reply; 5+ messages in thread
From: Bob Liu @ 2012-10-31 12:52 UTC (permalink / raw)
  To: chrubis
  Cc: linux-kernel, linux-mm, Hugh Dickins, Michel Lespinasse,
	Ingo Molnar, Al Viro, Andrew Morton

On Wed, Oct 31, 2012 at 2:24 AM,  <chrubis@suse.cz> wrote:
> Hi!
> I'm currently revisiting the mmap-related tests in LTP (Linux Test Project)
> and I've come to the tests that check that writes to the partially
> mapped page (at the end of the mapping) are carried out correctly.
>
> These tests fail because even after the object is unmapped and the
> file descriptor closed, the pages still stay in the page cache, so if
> the file is opened and mapped again (possibly by another process) the
> whole content of the partial page is preserved.
>
> Strictly speaking this is not a bug, at least for regular files, as
> POSIX only says that the change is not written out. In this case the
> file content is correct, and forcing the data to be written out with
> msync() makes the test pass. The SHM mappings seem to preserve the
> content even after calling msync(), which is, in my opinion, a POSIX
> violation, although a minor one.
>

The fsync implemented for SHM is noop_fsync.
Maybe we should extend it if needed.


-- 
Regards,
--Bob

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: Partialy mapped page stays in page cache after unmap
  2012-10-31 12:52 ` Bob Liu
@ 2012-10-31 14:19   ` chrubis
  0 siblings, 0 replies; 5+ messages in thread
From: chrubis @ 2012-10-31 14:19 UTC (permalink / raw)
  To: Bob Liu
  Cc: linux-kernel, linux-mm, Hugh Dickins, Michel Lespinasse,
	Ingo Molnar, Al Viro, Andrew Morton, Michael Kerrisk (man-pages)

[-- Attachment #1: Type: text/plain, Size: 1566 bytes --]

Hi!
> > Strictly speaking this is not a bug, at least for regular files, as
> > POSIX only says that the change is not written out. In this case the
> > file content is correct, and forcing the data to be written out with
> > msync() makes the test pass. The SHM mappings seem to preserve the
> > content even after calling msync(), which is, in my opinion, a POSIX
> > violation, although a minor one.
> >
> 
> The fsync implemented for SHM is noop_fsync.
> Maybe we should extend it if needed.

I'm not entirely sure that would fix the interface correctly. POSIX
says:


mmap:

...
The system shall always zero-fill any partial page at the end of an
object. Further, the system shall never write out any modified portions
of the last page of an object which are beyond its end. 
...
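
As a rough illustration of which bytes the mmap text above covers, here
is a minimal sketch in C. The helper name tail_bytes() is hypothetical
(it comes from neither LTP nor POSIX); it only does the arithmetic for the
part of the last page that lies beyond the end of the object, assuming
sizes in bytes and a positive page size.

```c
#include <stdio.h>

/*
 * Hypothetical helper: number of bytes in the last page of an object
 * that lie beyond its end. Returns 0 when the object ends exactly on
 * a page boundary (or is empty), i.e. when there is no partial page.
 */
long tail_bytes(long file_size, long page_size)
{
	long used = file_size % page_size;	/* bytes used in the last page */

	return used ? page_size - used : 0;
}

/* Print the result for one file size; illustration only. */
void print_tail(long file_size, long page_size)
{
	printf("size %ld, page %ld: %ld byte(s) beyond EOF in the last page\n",
	       file_size, page_size, tail_bytes(file_size, page_size));
}
```

For a file of page_size / 2 bytes this gives page_size / 2; for a file
that is exactly one page long it gives 0.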


msync:

...
The effect of msync() on a shared memory object or a typed memory object
is unspecified. 
...

Hmm, that is a little confusing, and it looks like it depends on how
'write out' is interpreted for a SHM object. I guess that leaving the
SHM part as it is is reasonable. Maybe it is worth a note in the
manual page.


On the other hand there seem to be several bugs in mmap() on regular
files. For example, mapping half of a page from a file doesn't fill the
rest of the page with zeroes. And it looks like when half of a page is
mapped, the second half modified, the mapping unmapped and then the
whole page mapped, the content isn't right, although this seems to
change randomly. The reproducer for the first case is attached.

-- 
Cyril Hrubis
chrubis@suse.cz

[-- Attachment #2: reproducer1.c --]
[-- Type: text/x-c, Size: 1316 bytes --]

#define _XOPEN_SOURCE 600

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>

int main(void)
{
	char tmpfname[256];
	long page_size;
	void *data;
	char *pa;
	size_t len;
	int fd, i, fail = 0;

	page_size = sysconf(_SC_PAGE_SIZE);

	snprintf(tmpfname, sizeof(tmpfname), "/tmp/test");
	
	/* Create file */
	unlink(tmpfname);
	fd = open(tmpfname, O_CREAT | O_RDWR | O_EXCL,
		  S_IRUSR | S_IWUSR);
	if (fd == -1) {
		printf("Error at open(): %s\n", strerror(errno));
		return 1;
	}
	
	/* Fill it to the size of the page with 'a' */
	data = malloc(page_size);
	memset(data, 'a', page_size);
	if (write(fd, data, page_size) != page_size) {
		printf("Error at write(): %s\n", strerror(errno));
		return 1;
	}
	free(data);

	/* mmap half of the page; len is also used by munmap() below */
	len = page_size / 2;
	pa = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (pa == MAP_FAILED) {
		printf("Error at mmap(): %s\n", strerror(errno));
		return 1;
	}
	
	for (i = 0; i < page_size; i++) {
		/* The unmapped second half starts at offset page_size/2 */
		if (i >= page_size/2 && pa[i] != 0)
			fail++;
		printf("%4i %2x\n", i, (unsigned char)pa[i]);
	}

	close(fd);
	munmap(pa, len);
	
	if (fail)
		printf("FAILED: Page not zeroed\n");
	else
		printf("SUCCEEDED\n");

	return 0;
}


* Re: Partialy mapped page stays in page cache after unmap
  2012-10-30 18:24 Partialy mapped page stays in page cache after unmap chrubis
  2012-10-31 12:52 ` Bob Liu
@ 2012-12-03  6:26 ` Vineet Gupta
  2012-12-05 12:46   ` chrubis
  1 sibling, 1 reply; 5+ messages in thread
From: Vineet Gupta @ 2012-12-03  6:26 UTC (permalink / raw)
  To: chrubis
  Cc: linux-kernel, linux-mm, Hugh Dickins, Michel Lespinasse,
	Ingo Molnar, Al Viro, Andrew Morton

On Tuesday 30 October 2012 11:54 PM, chrubis@suse.cz wrote:
> Hi!
> I'm currently revisiting the mmap-related tests in LTP (Linux Test Project)
> and I've come to the tests that check that writes to the partially
> mapped page (at the end of the mapping) are carried out correctly.
> 
> These tests fail because even after the object is unmapped and the
> file descriptor closed, the pages still stay in the page cache, so if
> the file is opened and mapped again (possibly by another process) the
> whole content of the partial page is preserved.
> 
> Strictly speaking this is not a bug, at least for regular files, as
> POSIX only says that the change is not written out. In this case the
> file content is correct, and forcing the data to be written out with
> msync() makes the test pass.

Hi Cyril,

I've seen the LTP open posix mmap/{11-4,11-5} issues in the past myself
and it was something I wanted to discuss on the lists myself. Thanks for
bringing this up.

Just to reiterate: the expectations are

1. zero filling of unmapped (trailing) partial page
2. NO Writeout (to disk) of trailing partial page.

#1 is broken, as your latter test case proves. We can have an alternate
test case which starts with a non-empty file (preloaded with all 'a's),
mmaps a partial page and then reads out the trailing partial page; it
will not be zeroes (it's probably ftruncate which does the trick in the
first place).

Regarding #2 - I did verify that msync indeed makes it pass - but I'm
confused why. After all, it is going to commit the buffer with 'b' to
disk - so a subsequent mmap is bound to see the update to the file and
hence would make the test fail. What am I missing here?

Thx,
-Vineet



* Re: Partialy mapped page stays in page cache after unmap
  2012-12-03  6:26 ` Vineet Gupta
@ 2012-12-05 12:46   ` chrubis
  0 siblings, 0 replies; 5+ messages in thread
From: chrubis @ 2012-12-05 12:46 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: linux-kernel, linux-mm, Hugh Dickins, Michel Lespinasse,
	Ingo Molnar, Al Viro, Andrew Morton

Hi!
> I've seen the LTP open posix mmap/{11-4,11-5} issues in the past myself
> and it was something I wanted to discuss on the lists myself. Thanks for
> bringing this up.
> 
> Just to reiterate: the expectations are
> 
> 1. zero filling of unmapped (trailing) partial page
> 2. NO Writeout (to disk) of trailing partial page.
> 
> #1 is broken, as your latter test case proves. We can have an alternate
> test case which starts with a non-empty file (preloaded with all 'a's),
> mmaps a partial page and then reads out the trailing partial page; it
> will not be zeroes (it's probably ftruncate which does the trick in the
> first place).
> 
> Regarding #2 - I did verify that msync indeed makes it pass - but I'm
> confused why. After all, it is going to commit the buffer with 'b' to
> disk - so a subsequent mmap is bound to see the update to the file and
> hence would make the test fail. What am I missing here?

I've been researching that issue for quite some time and found this:

Once the partial page gets loaded into the page cache it stays there
till it's flushed back to the disk. The page cache keeps no information
about the length of the valid data in that page. The page is zeroed at
the time it's loaded into the cache, but once you dirty the content
it's not zeroed again until it's flushed back to the disk; it just stays
in the cache as it is and any subsequent mapping will just pick up this
page. The page is not written back until it's forced to leave the cache
(which does not happen when the mapping is destroyed or the process
exits), which is the reason why msync() makes the test succeed.
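
The behaviour described above can be probed with something like the
following minimal sketch. The function name tail_byte_survives() and its
parameters are hypothetical (it is not part of LTP); it assumes a
writable path on a filesystem that supports shared mappings. Whether the
byte survives depends on the kernel and filesystem, so the sketch only
reports what it sees instead of assuming a result.

```c
#define _XOPEN_SOURCE 700

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical probe: create `path` with a size of half a page, write
 * 'b' beyond EOF through a shared mapping, optionally msync(), then
 * unmap, close, reopen and map the file again.
 *
 * Returns 1 if 'b' is still visible in the new mapping, 0 if the byte
 * reads back as zero, -1 on error. *size_after receives the file size
 * seen after reopening.
 */
int tail_byte_survives(const char *path, int do_msync, off_t *size_after)
{
	size_t len = (size_t)sysconf(_SC_PAGE_SIZE) / 2;
	struct stat st;
	char *pa;
	int fd, seen;

	fd = open(path, O_CREAT | O_TRUNC | O_RDWR, S_IRUSR | S_IWUSR);
	if (fd == -1)
		return -1;
	if (ftruncate(fd, (off_t)len) == -1) {
		close(fd);
		return -1;
	}
	pa = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (pa == MAP_FAILED) {
		close(fd);
		return -1;
	}

	pa[len + 1] = 'b';	/* beyond EOF, but still inside the page */
	if (do_msync && msync(pa, len, MS_SYNC) == -1) {
		munmap(pa, len);
		close(fd);
		return -1;
	}
	munmap(pa, len);
	close(fd);

	/* Open and map it again, as the LTP test does */
	fd = open(path, O_RDWR);
	if (fd == -1)
		return -1;
	if (fstat(fd, &st) == -1) {
		close(fd);
		return -1;
	}
	*size_after = st.st_size;
	pa = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (pa == MAP_FAILED) {
		close(fd);
		return -1;
	}
	seen = (pa[len + 1] == 'b');
	munmap(pa, len);
	close(fd);
	return seen;
}
```

Calling it once with do_msync set to 0 and once with 1 on the same path
lets the two cases be compared; in both cases the reported file size
should stay at half a page, since a store beyond EOF does not extend
the file.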

In my opinion this behavior is not 100% POSIXly correct; on the other
hand I find it quite reasonable. Making every mmap() see a zeroed page
would only waste memory (I can't see any other solution than
duplicating the last page for each new mmap).

Also note that msync() doesn't work for shm, as the shm filesystem's
fsync is a no-op (the data doesn't have to be synced
anywhere).


I've sent a patch to the Linux man pages with a similar description:

http://marc.info/?l=linux-man&m=135271969606543&w=2


Hope this clarifies the issue.

-- 
Cyril Hrubis
chrubis@suse.cz

