From: Minchan Kim <minchan@kernel.org>
To: Michal Hocko <mhocko@suse.cz>
Cc: Luiz Capitulino <lcapitulino@redhat.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
anton@enomsg.org, akpm@linux-foundation.org,
kmpark@infradead.org, hyunhee.kim@samsung.com
Subject: Re: [PATCH v2] vmpressure: implement strict mode
Date: Fri, 28 Jun 2013 00:53:57 +0900 [thread overview]
Message-ID: <20130627155357.GC5006@gmail.com> (raw)
In-Reply-To: <20130627092616.GB17647@dhcp22.suse.cz>
On Thu, Jun 27, 2013 at 11:26:16AM +0200, Michal Hocko wrote:
> On Wed 26-06-13 23:17:12, Luiz Capitulino wrote:
> > Currently, an eventfd is notified for the level it's registered for
> > _plus_ higher levels.
> >
> > This is a problem if an application wants to implement different
> > actions for different levels. For example, an application might want
> > to release 10% of its cache on level low, 50% on medium and 100% on
> > critical. To do this, an application has to register a different
> > eventfd for each pressure level. However, fd low is always going to
> > be notified and and all fds are going to be notified on level critical.
> >
> > Strict mode solves this problem by strictly notifiying an eventfd
> > for the pressure level it registered for. This new mode is optional,
> > by default we still notify eventfds on higher levels too.
> >
> > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
>
> Two nits bellow but it looks good in general to me. You can add my
> Reviewed-by: Michal Hocko <mhocko@suse.cz>
>
> I still think that edge triggering makes some sense but that one might
> be rebased on top of this patch. We should still figure out whether the
> edge triggering is the right approach for the use case Hyunhee Kim wants
> it for so the strict mode should go first IMO.
For me, edge trigger as avoiding excessive event sending doesn't make sense
at all so I'd like to merge strict mode firstly, too.
>
> > ---
> >
> > o v2
> >
> > - Improve documentation
> > - Use a bit to store mode instead of a bool
> > - Minor changelog changes
> >
> > Documentation/cgroups/memory.txt | 26 ++++++++++++++++++++++----
> > mm/vmpressure.c | 26 ++++++++++++++++++++++++--
> > 2 files changed, 46 insertions(+), 6 deletions(-)
> >
> > diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
> > index ddf4f93..412872b 100644
> > --- a/Documentation/cgroups/memory.txt
> > +++ b/Documentation/cgroups/memory.txt
> > @@ -791,6 +791,22 @@ way to trigger. Applications should do whatever they can to help the
> > system. It might be too late to consult with vmstat or any other
> > statistics, so it's advisable to take an immediate action.
> >
> > +Applications can also choose between two notification modes when
> > +registering an eventfd for memory pressure events:
> > +
> > +When in "non-strict" mode, an eventfd is notified for the specific level
> > +it's registered for and higher levels. For example, an eventfd registered
> > +for low level is also going to be notified on medium and critical levels.
> > +This mode makes sense for applications interested on monitoring reclaim
> > +activity or implementing simple load-balacing logic. The non-strict mode
> > +is the default notification mode.
> > +
> > +When in "strict" mode, an eventfd is strictly notified for the pressure
> > +level it's registered for. For example, an eventfd registered for the low
> > +level event is not going to be notified when memory pressure gets into
> > +medium or critical levels. This allows for more complex logic based on
> > +the actual pressure level the system is experiencing.
>
> It would be also fair to mention that there is no guarantee that lower
> levels are signaled before higher so nobody should rely on seeing LOW
> before MEDIUM or CRITICAL.
You mean "non-strict" mode guarantees it?
>
> > +
> > The events are propagated upward until the event is handled, i.e. the
> > events are not pass-through. Here is what this means: for example you have
> > three cgroups: A->B->C. Now you set up an event listener on cgroups A, B
> > @@ -807,12 +823,14 @@ register a notification, an application must:
> >
> > - create an eventfd using eventfd(2);
> > - open memory.pressure_level;
> > -- write string like "<event_fd> <fd of memory.pressure_level> <level>"
> > +- write string like "<event_fd> <fd of memory.pressure_level> <level> [strict]"
> > to cgroup.event_control.
> >
> > -Application will be notified through eventfd when memory pressure is at
> > -the specific level (or higher). Read/write operations to
> > -memory.pressure_level are no implemented.
> > +Applications will be notified through eventfd when memory pressure is at
> > +the specific level or higher. If strict is passed, then applications
> > +will only be notified when memory pressure reaches the specified level.
> > +
> > +Read/write operations to memory.pressure_level are no implemented.
> >
> > Test:
> >
> > diff --git a/mm/vmpressure.c b/mm/vmpressure.c
> > index 736a601..ba5c17e 100644
> > --- a/mm/vmpressure.c
> > +++ b/mm/vmpressure.c
> > @@ -138,8 +138,16 @@ struct vmpressure_event {
> > struct eventfd_ctx *efd;
> > enum vmpressure_levels level;
> > struct list_head node;
> > + unsigned int mode;
>
> You would fill up a hole between level and node if you move it up on
> 64b. Doesn't matter much but why not do it...
>
> --
> Michal Hocko
> SUSE Labs
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2013-06-27 15:54 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-06-27 3:17 [PATCH v2] vmpressure: implement strict mode Luiz Capitulino
2013-06-27 9:26 ` Michal Hocko
2013-06-27 13:34 ` Luiz Capitulino
2013-06-27 14:59 ` Michal Hocko
2013-06-27 15:53 ` Minchan Kim [this message]
2013-06-27 17:42 ` Michal Hocko
2013-06-27 15:44 ` Minchan Kim
2013-06-27 22:02 ` Andrew Morton
2013-06-28 0:02 ` Minchan Kim
2013-06-28 0:34 ` Andrew Morton
2013-06-28 0:58 ` Anton Vorontsov
2013-06-28 1:13 ` Andrew Morton
2013-06-28 4:34 ` Anton Vorontsov
2013-06-28 5:07 ` Anton Vorontsov
2013-06-28 14:00 ` Luiz Capitulino
2013-06-28 16:57 ` Anton Vorontsov
2013-06-28 17:09 ` Anton Vorontsov
2013-06-28 18:25 ` Luiz Capitulino
2013-06-28 18:45 ` Anton Vorontsov
2013-06-28 18:58 ` Luiz Capitulino
2013-06-28 18:45 ` Luiz Capitulino
2013-06-28 18:55 ` Anton Vorontsov
2013-06-28 19:44 ` Luiz Capitulino
2013-06-29 0:56 ` Anton Vorontsov
2013-07-01 8:22 ` Hyunhee Kim
2013-07-02 4:32 ` Anton Vorontsov
2013-07-02 8:29 ` Hyunhee Kim
2013-07-02 13:29 ` Michal Hocko
2013-07-02 14:59 ` Luiz Capitulino
2013-07-02 17:24 ` Anton Vorontsov
2013-07-02 18:38 ` Luiz Capitulino
2013-06-28 5:24 ` Andrew Morton
2013-06-28 13:43 ` Luiz Capitulino
2013-06-28 9:04 ` Minchan Kim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130627155357.GC5006@gmail.com \
--to=minchan@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=anton@enomsg.org \
--cc=hyunhee.kim@samsung.com \
--cc=kmpark@infradead.org \
--cc=lcapitulino@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).