From: Anton Vorontsov <anton.vorontsov-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
To: Minchan Kim <minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Mel Gorman <mgorman-l3A5Bk7waGM@public.gmane.org>,
	Pekka Enberg <penberg-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Leonid Moiseichuk
	<leonid.moiseichuk-xNZwKgViW5gAvxtiuMwx3w@public.gmane.org>,
	KOSAKI Motohiro
	<kosaki.motohiro-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Bartlomiej Zolnierkiewicz
	<b.zolnierkie-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>,
	John Stultz <john.stultz-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linaro-kernel-cunTk1MwBs8s++Sfvej+rw@public.gmane.org,
	patches-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org,
	kernel-team-z5hGa2qSFaRBDgjK7y7TUQ@public.gmane.org,
	linux-man-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [RFC v2 0/2] vmevent: A bit reworked pressure attribute + docs + man page
Date: Fri, 26 Oct 2012 18:02:15 -0700
Message-ID: <20121027010215.GA9152@lizard>
In-Reply-To: <20121026023720.GE15767@bbox>

On Fri, Oct 26, 2012 at 11:37:20AM +0900, Minchan Kim wrote:
[...]
> > > Of course, it's very flexible and has the potential to add new VM knobs
> > > easily, but the only thing we are about to use now is VMEVENT_ATTR_PRESSURE.
> > > Are there any other use cases for swap or free? Or potential users?
> > 
> > The number of idle pages by itself might not be that interesting, but
> > the cache+idle level is quite interesting.
> > 
> > By definition, _MED happens when performance has already degraded,
> > slightly, but still -- we can be swapping.
> > 
> > But _LOW notifications are coming when kernel is just reclaiming, so by
> > using _LOW notifications + watching for cache level we can very easily
> > predict the swapping activity long before we have even _MED pressure.
> 
> So, for seeing cache level, we need new vmevent_attr?

Hopefully not. We're not interested in the raw values of the cache level;
what we want is to tell the kernel how many "easily reclaimable
pages" userland has, and get notified when the kernel believes that it's a
good time for userland to help. I.e. this new _MILD level:

> > Maybe it makes sense to implement something like PRESSURE_MILD with an
> > additional nr_pages threshold, which basically hints the kernel about how
> > many easily reclaimable pages userland has (that would be a part of our
> > definition for the mild pressure level). So, essentially it will be
> > 
> > 	if (pressure_index >= oom_level)
> > 		return PRESSURE_OOM;
> > 	else if (pressure_index >= med_level)
> > 		return PRESSURE_MEDIUM;
> > 	else if (userland_reclaimable_pages >= nr_reclaimable_pages)
> > 		return PRESSURE_MILD;
> > 	return PRESSURE_LOW;
> > 
> > I must admit I like the idea more than exposing NR_FREE and stuff, but the
> > scheme reminds me of the blended attributes, which we abandoned. Although,
> > the definition sounds better now, and we seem to be doing it in the right
> > place.
> > 
> > And if we go this way, then sure, we won't need any other attributes, and
> > so we could make the API much simpler.
> 
> That's what I want! If there isn't any user who is really willing to use it,
> let's drop it. Do not persuade with imaginary scenarios, because we should be
> careful about introducing new ABI.

Yeah, I think you're right. Let's make the vmevent_fd slim first. I won't
even focus on the _MILD/_BALANCE level for now, we can do it later, and we
always have the /proc/vmstat even if the _MILD turns out to be a bad idea.
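
For illustration only, the _MILD scheme quoted above can be written out as a
self-contained C function. This is a minimal sketch, not code from the posted
patches: the enum, struct pressure_cfg and classify_pressure() are
hypothetical names, and the comparison for the mild level is copied as-is
from the pseudocode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical level names, mirroring the pseudocode in this thread. */
enum pressure_level {
	PRESSURE_LOW,
	PRESSURE_MILD,
	PRESSURE_MEDIUM,
	PRESSURE_OOM,
};

/* Hypothetical thresholds; these fields are assumptions for the sketch. */
struct pressure_cfg {
	unsigned int med_level;             /* index at which _MEDIUM starts */
	unsigned int oom_level;             /* index at which _OOM starts */
	unsigned long nr_reclaimable_pages; /* page threshold for _MILD */
};

/*
 * Map a pressure index plus the userland hint to a level, checking the
 * most severe level first, in the same order as the quoted pseudocode.
 */
enum pressure_level classify_pressure(const struct pressure_cfg *cfg,
				      unsigned int pressure_index,
				      unsigned long userland_reclaimable_pages)
{
	assert(cfg != NULL);

	if (pressure_index >= cfg->oom_level)
		return PRESSURE_OOM;
	if (pressure_index >= cfg->med_level)
		return PRESSURE_MEDIUM;
	if (userland_reclaimable_pages >= cfg->nr_reclaimable_pages)
		return PRESSURE_MILD;
	return PRESSURE_LOW;
}
```

The ordering matters: the mild check is only reached when neither the OOM nor
the medium threshold has been crossed.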

Reading /proc/vmstat is a bit more overhead, but it's not that much at all
(especially when we don't have to timer-poll the vmstat).
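
As a rough idea of what the userland side of that could look like, here is a
minimal sketch that looks up one counter in /proc/vmstat-style text (one
"name value" pair per line). vmstat_lookup() is a hypothetical helper, not
part of any existing library, and which counters are present depends on the
kernel version.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan "name value" pairs from fp and return the value for 'name'.
 * Returns 0 and stores the value on success, -1 if the name is not found
 * (or the input stops being parseable before it is found).
 */
int vmstat_lookup(FILE *fp, const char *name, unsigned long long *value)
{
	char key[64];
	unsigned long long v;

	assert(fp != NULL && name != NULL && value != NULL);

	while (fscanf(fp, "%63s %llu", key, &v) == 2) {
		if (strcmp(key, name) == 0) {
			*value = v;
			return 0;
		}
	}
	return -1;
}
```

A caller would fopen("/proc/vmstat", "r") after receiving a notification and
look up, for example, nr_free_pages and nr_file_pages, so no periodic polling
of the file is needed.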

> > > Adding vmevent_fd without them is rather overkill.
> > > 
> > > And I want to avoid timer-based polling of vmevent if possible.
> > > mem_notify of KOSAKI doesn't use such timer.
> > 
> > For pressure notifications we don't use the timers. We also read the
> 
> Hmm, when I see the code, timer still works and can notify to user. No?

Yes, I was mostly saying that it is technically not required anymore, but
you're right, the code still fires the timer (it just runs needlessly for
the pressure attr).

Bad wording on my side.

[..]
> > We can do it via eventfd, or /dev/chardev (which has been discussed and
> > people didn't like it, IIRC), or signals (which also has been discussed
> > and there are problems with this approach as well).
> > 
> > I'm not sure why having a syscall is a big issue. If we're making eventfd
> > interface, then we'd need to maintain /sys/.../ ABI the same way as we
> > maintain the syscall. What's the difference? A dedicated syscall is just a
> 
> No difference. What I want is just to remove unnecessary stuff in vmevent_fd
> and keep it simple. If we do it via /dev/chardev, I expect we can do the
> necessary things for VM pressure. But if we can slim down vmevent_fd, it
> would be better. If so, maybe we have to change vmevent_fd to lowmem_fd or
> vmpressure_fd.

Sure, then I'm starting the work to slim the API down, and we'll see how
things are going to look after that.

Thanks a lot!

Anton.
