From: "Tom May" <tom@tommay.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/8][for -mm] mem_notify v6
Date: Mon, 14 Apr 2008 17:16:50 -0700
Message-ID: <ab3f9b940804141716x755787f5h8e0122c394922a83@mail.gmail.com>
In-Reply-To: <20080402154910.9588.KOSAKI.MOTOHIRO@jp.fujitsu.com>

On Wed, Apr 2, 2008 at 12:31 AM, KOSAKI Motohiro
<kosaki.motohiro@jp.fujitsu.com> wrote:
> Hi Tom,
>
> Thank you for the very useful comment.
> That is very interesting.
>
>
> > I tried it with a real-world program that, among other things, mmaps
> > anonymous pages and touches them at a reasonable speed until it gets
> > notified via /dev/mem_notify, releases most of them with
> > madvise(MADV_DONTNEED), then loops to start the cycle again.
> >
> > What tends to happen is that I do indeed get notifications via
> > /dev/mem_notify when the kernel would like to be swapping, at which
> > point I free memory. But the notifications come at a time when the
> > kernel needs memory, and it gets the memory by discarding some Cached
> > or Mapped memory (I can see these decreasing in /proc/meminfo with
> > each notification). With each mmap/notify/madvise cycle the Cached
> > and Mapped memory gets smaller, until eventually while I'm touching
> > pages the kernel can't find enough memory and will either invoke the
> > OOM killer or return ENOMEM from syscalls. This is precisely the
> > situation I'm trying to avoid by using /dev/mem_notify.
>
> Could you send your test program?
> I can't reproduce that now, sorry.
>
>
>
> > The criterion of "notify when the kernel would like to swap" feels
> > correct, but in addition I seem to need something like "notify when
> > cached+mapped+free memory is getting low".
>
> Hmmm,
> I think this idea is only useful when a userland process calls
> madvise(MADV_DONTNEED) periodically.
>
> But I hope to improve my patch and solve your problem.
> If you don't mind, please help with my testing ;)

Here's a test program that allocates memory and frees it on notification.
It takes one argument, the number of pages to use; pick a number
considerably higher than the number of pages of memory in the system.
I'm running this on a system without swap. Each time it gets a
notification, it frees memory and writes out the /proc/meminfo
contents. What I see is that Cached gradually decreases, then Mapped
decreases, and eventually the kernel invokes the OOM killer. It may
be necessary to tune some of the constants that control the allocation
and free rates and latency; these values work for my system.

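#define _GNU_SOURCE   /* for clone() and the CLONE_* flags */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <poll.h>
#include <sched.h>
#include <time.h>
#include <unistd.h>   /* read(), write(), close() */

#define PAGESIZE 4096
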
/* How many pages we've mmap'd. */
static int pages;

/* Pointer to mmap'd memory used as a circular buffer. One thread
   touches pages, another thread releases them on notification. */
static char *p;

/* How many pages to touch each 5ms. This makes at most 2000
   pages/sec. */
#define TOUCH_CHUNK 10

/* How many pages to free when we're notified. With a 100ms FREE_DELAY,
   we can free ~9110 pages/sec, or perhaps only 5*911 = 4555 pages/sec
   if we're notified only 5 times/sec. */
#define FREE_CHUNK 911

/* Delay in milliseconds before freeing pages, to simulate latency while
   finding pages to free. */
#define FREE_DELAY 100

static void touch(void);
static int release(void *arg);
static void release_pages(void);
static void show_meminfo(void);

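/* Stack for the release thread. Aligned so that the top of the array
   is a valid initial stack pointer. */
static char stack[8192] __attribute__((aligned(16)));

int
main (int argc, char **argv)
{
    if (argc != 2 || (pages = atoi(argv[1])) <= 0) {
        fprintf(stderr, "usage: %s <number-of-pages>\n", argv[0]);
        exit(1);
    }

    /* Anonymous mappings take fd -1; the size_t cast avoids int
       overflow for large page counts. */
    p = mmap(NULL, (size_t)pages * PAGESIZE, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        exit(1);
    }

    /* The stack grows down, so hand clone() the top of the array. */
    if (clone(release, stack + sizeof(stack),
              CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | CLONE_THREAD,
              NULL) == -1) {
        perror("clone failed");
        exit(1);
    }

    touch();
    return 0;
}
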
/* Touch TOUCH_CHUNK pages every 5ms, wrapping around the buffer. */
static void
touch (void)
{
    int page = 0;

    while (1) {
        int i;
        struct timespec t;

        for (i = 0; i < TOUCH_CHUNK; i++) {
            /* size_t cast: page * PAGESIZE can overflow an int. */
            p[(size_t)page * PAGESIZE] = 1;
            if (++page >= pages) {
                page = 0;
            }
        }

        t.tv_sec = 0;
        t.tv_nsec = 5000000;
        if (nanosleep(&t, NULL) == -1) {
            perror("nanosleep");
        }
    }
}

/* Thread body: wait for a notification, then free pages and report. */
static int
release (void *arg)
{
    int fd = open("/dev/mem_notify", O_RDONLY);

    (void)arg;
    if (fd == -1) {
        perror("open(/dev/mem_notify)");
        exit(1);
    }

    while (1) {
        struct pollfd pfd;
        int nfds;

        pfd.fd = fd;
        pfd.events = POLLIN;
        nfds = poll(&pfd, 1, -1);
        if (nfds == -1) {
            perror("poll");
            exit(1);
        }

        if (nfds == 1) {
            struct timespec t;

            t.tv_sec = 0;
            t.tv_nsec = FREE_DELAY * 1000000;
            if (nanosleep(&t, NULL) == -1) {
                perror("nanosleep");
            }

            release_pages();

            /* time_t is not an int; cast to match the format. Flush so
               the timestamp lands before the raw write() of meminfo. */
            printf("time: %ld\n", (long)time(NULL));
            fflush(stdout);
            show_meminfo();
        }
    }
}

static void
release_pages (void)
{
    /* Index of the next page to free. */
    static int page = 0;
    int i;

    /* Release FREE_CHUNK pages. */
    for (i = 0; i < FREE_CHUNK; i++) {
        /* size_t cast: page * PAGESIZE can overflow an int. */
        int r = madvise(p + (size_t)page * PAGESIZE, PAGESIZE,
                        MADV_DONTNEED);
        if (r == -1) {
            perror("madvise");
            exit(1);
        }
        if (++page >= pages) {
            page = 0;
        }
    }
}

/* Copy the start of /proc/meminfo (up to sizeof(buffer) bytes) to stdout. */
static void
show_meminfo (void)
{
    char buffer[2000];
    int fd;
    ssize_t n;

    fd = open("/proc/meminfo", O_RDONLY);
    if (fd == -1) {
        perror("open(/proc/meminfo)");
        exit(1);
    }

    n = read(fd, buffer, sizeof(buffer));
    if (n == -1) {
        perror("read(/proc/meminfo)");
        exit(1);
    }

    n = write(1, buffer, n);
    if (n == -1) {
        perror("write(stdout)");
        exit(1);
    }

    if (close(fd) == -1) {
        perror("close(/proc/meminfo)");
        exit(1);
    }
}
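
As a rough illustration of the "cached+mapped+free memory is getting low" criterion discussed earlier in this thread, here is a userland approximation that works on the text of /proc/meminfo. It is only a sketch and not something the mem_notify patch provides: the helper names meminfo_kb() and memory_is_low() and the threshold are made up for illustration, and since the Cached and Mapped fields may overlap, the sum is a heuristic rather than an exact count.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the value in kB of a "Key:   1234 kB" line from /proc/meminfo
   text, or -1 if no line starts with that key.  Hypothetical helper. */
static long
meminfo_kb (const char *text, const char *key)
{
    size_t keylen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        /* Match only at the start of a line, so "Cached" does not
           match the "SwapCached:" line. */
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
            return strtol(line + keylen + 1, NULL, 10);
        }
        line = strchr(line, '\n');
        if (line != NULL) {
            line++;
        }
    }
    return -1;
}

/* Return 1 if MemFree + Cached + Mapped is below threshold_kb, 0 if
   not, and -1 if any of the three fields is missing.  The threshold is
   an arbitrary tunable, not a value taken from the kernel. */
static int
memory_is_low (const char *text, long threshold_kb)
{
    long free_kb = meminfo_kb(text, "MemFree");
    long cached_kb = meminfo_kb(text, "Cached");
    long mapped_kb = meminfo_kb(text, "Mapped");

    if (free_kb < 0 || cached_kb < 0 || mapped_kb < 0) {
        return -1;
    }
    return free_kb + cached_kb + mapped_kb < threshold_kb;
}
```

To try it with the test program, the buffer that show_meminfo() reads would need to be NUL-terminated before being passed to memory_is_low(), and the check would have to run on a timer rather than only after a notification.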
.tom