From: Glauber Costa <glommer@openvz.org>
To: <linux-mm@kvack.org>
Cc: <cgroups@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Greg Thelen <gthelen@google.com>,
<kamezawa.hiroyu@jp.fujitsu.com>, Michal Hocko <mhocko@suse.cz>,
Johannes Weiner <hannes@cmpxchg.org>,
<linux-fsdevel@vger.kernel.org>,
Dave Chinner <david@fromorbit.com>,
Glauber Costa <glommer@openvz.org>,
John Stultz <john.stultz@linaro.org>,
Joonsoo Kim <js1304@gmail.com>
Subject: [PATCH v6 30/31] vmpressure: in-kernel notifications
Date: Sun, 12 May 2013 22:13:51 +0400
Message-ID: <1368382432-25462-31-git-send-email-glommer@openvz.org>
In-Reply-To: <1368382432-25462-1-git-send-email-glommer@openvz.org>
From: Glauber Costa <glommer@parallels.com>
Over the past weeks, it became clear to us that the shrinker interface
we have right now works very well for some types of users, but not that
well for others. The latter are usually people interested in one-shot
notifications, who were forced to adapt themselves to the count+scan
behavior of shrinkers. To do so, they had no choice but to greatly abuse
the shrinker interface, producing little monsters all over.

During LSF/MM, one of the proposals that came up during our session
was to reuse Anton Vorontsov's vmpressure for this. vmpressure events
are designed for userspace consumption, but they also provide a
well-established, cgroup-aware entry point for notifications.

This patch extends that to also support in-kernel users. Events that
should be generated for in-kernel consumption will be marked as such,
and for those, we will call a registered function instead of triggering
an eventfd notification.

Please note that due to my lack of understanding of each shrinker user,
I will stay away from converting the actual users; you are all welcome
to do so.

Signed-off-by: Glauber Costa <glommer@openvz.org>
Acked-by: Anton Vorontsov <anton@enomsg.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Reviewed-by: Greg Thelen <gthelen@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
include/linux/vmpressure.h | 6 ++++++
mm/vmpressure.c | 52 +++++++++++++++++++++++++++++++++++++++++++---
2 files changed, 55 insertions(+), 3 deletions(-)
diff --git a/include/linux/vmpressure.h b/include/linux/vmpressure.h
index 76be077..3131e72 100644
--- a/include/linux/vmpressure.h
+++ b/include/linux/vmpressure.h
@@ -19,6 +19,9 @@ struct vmpressure {
/* Have to grab the lock on events traversal or modifications. */
struct mutex events_lock;
+ /* False if only kernel users want to be notified, true otherwise. */
+ bool notify_userspace;
+
struct work_struct work;
};
@@ -36,6 +39,9 @@ extern struct vmpressure *css_to_vmpressure(struct cgroup_subsys_state *css);
extern int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
struct eventfd_ctx *eventfd,
const char *args);
+
+extern int vmpressure_register_kernel_event(struct cgroup *cg,
+ void (*fn)(void));
extern void vmpressure_unregister_event(struct cgroup *cg, struct cftype *cft,
struct eventfd_ctx *eventfd);
#else
diff --git a/mm/vmpressure.c b/mm/vmpressure.c
index 736a601..e16256e 100644
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -135,8 +135,12 @@ static enum vmpressure_levels vmpressure_calc_level(unsigned long scanned,
}
struct vmpressure_event {
- struct eventfd_ctx *efd;
+ union {
+ struct eventfd_ctx *efd;
+ void (*fn)(void);
+ };
enum vmpressure_levels level;
+ bool kernel_event;
struct list_head node;
};
@@ -152,12 +156,15 @@ static bool vmpressure_event(struct vmpressure *vmpr,
mutex_lock(&vmpr->events_lock);
list_for_each_entry(ev, &vmpr->events, node) {
- if (level >= ev->level) {
+ if (ev->kernel_event) {
+ ev->fn();
+ } else if (vmpr->notify_userspace && level >= ev->level) {
eventfd_signal(ev->efd, 1);
signalled = true;
}
}
+ vmpr->notify_userspace = false;
mutex_unlock(&vmpr->events_lock);
return signalled;
@@ -227,7 +234,7 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg,
* we account it too.
*/
if (!(gfp & (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO | __GFP_FS)))
- return;
+ goto schedule;
/*
* If we got here with no pages scanned, then that is an indicator
@@ -244,8 +251,15 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg,
vmpr->scanned += scanned;
vmpr->reclaimed += reclaimed;
scanned = vmpr->scanned;
+ /*
+ * If we didn't reach this point, only kernel events will be triggered.
+ * It is the job of the worker thread to clean this up once the
+ * notifications are all delivered.
+ */
+ vmpr->notify_userspace = true;
mutex_unlock(&vmpr->sr_lock);
+schedule:
if (scanned < vmpressure_win || work_pending(&vmpr->work))
return;
schedule_work(&vmpr->work);
@@ -328,6 +342,38 @@ int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
}
/**
+ * vmpressure_register_kernel_event() - Register kernel-side notification
+ * @cg: cgroup that is interested in vmpressure notifications
+ * @fn: function to be called when pressure happens
+ *
+ * This function registers in-kernel users interested in receiving
+ * notifications about pressure conditions. Pressure notifications will be
+ * triggered at the same time as userspace notifications (with no particular
+ * ordering relative to them).
+ *
+ * Pressure notifications are an alternative method to shrinkers and will
+ * serve well users that are interested in a one-shot notification, with a
+ * well-defined cgroup-aware interface.
+ */
+int vmpressure_register_kernel_event(struct cgroup *cg, void (*fn)(void))
+{
+ struct vmpressure *vmpr = cg_to_vmpressure(cg);
+ struct vmpressure_event *ev;
+
+ ev = kzalloc(sizeof(*ev), GFP_KERNEL);
+ if (!ev)
+ return -ENOMEM;
+
+ ev->kernel_event = true;
+ ev->fn = fn;
+
+ mutex_lock(&vmpr->events_lock);
+ list_add(&ev->node, &vmpr->events);
+ mutex_unlock(&vmpr->events_lock);
+ return 0;
+}
+
+/**
* vmpressure_unregister_event() - Unbind eventfd from vmpressure
* @cg: cgroup handle
* @cft: cgroup control files handle
--
1.8.1.4
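
For readers who want to see the dispatch rule from the vmpressure_event()
hunk in isolation, here is a minimal userspace sketch. It is not kernel
code and not part of the patch: every model_* name is hypothetical, a plain
int counter stands in for the eventfd, a singly linked list stands in for
the kernel list_head, and locking is left out. It only mirrors the logic
visible in the diff: kernel events always call their function, userspace
events fire only when notify_userspace is set and the level threshold is
met, and the flag is cleared after each pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for enum vmpressure_levels. */
enum model_level { MODEL_LOW, MODEL_MEDIUM, MODEL_CRITICAL };

/* Mirrors the shape of struct vmpressure_event after the patch. */
struct model_event {
	union {
		int *efd_counter;	/* assumption: counter replaces eventfd_ctx */
		void (*fn)(void);	/* in-kernel style callback */
	};
	enum model_level level;
	bool kernel_event;
	struct model_event *next;	/* assumption: simple list, no list_head */
};

struct model_vmpressure {
	struct model_event *events;
	bool notify_userspace;
};

/* Allocate an event and push it onto the front of the list. */
static struct model_event *model_add(struct model_vmpressure *vmpr)
{
	struct model_event *ev = calloc(1, sizeof(*ev));

	if (!ev)
		return NULL;
	ev->next = vmpr->events;
	vmpr->events = ev;
	return ev;
}

/* Counterpart of vmpressure_register_kernel_event() in this model. */
static int model_register_kernel_event(struct model_vmpressure *vmpr,
				       void (*fn)(void))
{
	struct model_event *ev;

	assert(fn != NULL);	/* model-only check, the patch has none */
	ev = model_add(vmpr);
	if (!ev)
		return -1;
	ev->kernel_event = true;
	ev->fn = fn;
	return 0;
}

/* Counterpart of an eventfd registration at a given level. */
static int model_register_user_event(struct model_vmpressure *vmpr,
				     int *counter, enum model_level level)
{
	struct model_event *ev = model_add(vmpr);

	if (!ev)
		return -1;
	ev->efd_counter = counter;
	ev->level = level;
	return 0;
}

/* Same branch structure as the patched loop in vmpressure_event(). */
static bool model_dispatch(struct model_vmpressure *vmpr,
			   enum model_level level)
{
	struct model_event *ev;
	bool signalled = false;

	for (ev = vmpr->events; ev; ev = ev->next) {
		if (ev->kernel_event) {
			ev->fn();
		} else if (vmpr->notify_userspace && level >= ev->level) {
			(*ev->efd_counter)++;
			signalled = true;
		}
	}
	vmpr->notify_userspace = false;
	return signalled;
}

/* Example callback: counts how often it was invoked. */
static int model_kernel_hits;

static void model_on_pressure(void)
{
	model_kernel_hits++;
}
```

In this model, a caller registers model_on_pressure once and then sees it
invoked on every model_dispatch() pass, whatever the level and whether or
not notify_userspace was set; the userspace counter only moves on passes
where the flag was set beforehand and the level reaches the registered
threshold.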