* [RFC] [PATCH 00/12] CKRM after a major overhaul
From: sekharan @ 2006-04-21 2:24 UTC
To: linux-kernel, ckrm-tech; +Cc: sekharan
CKRM has gone through a major overhaul: some of the complexity has been
removed, features have been cut down, and portions have moved to userspace.
Diffstat of this patchset (including the numtasks controller that follows)
is:
23 files changed, 2475 insertions(+), 5 deletions(-)
including documentation and comments.
This patchset will be followed by two controllers:
- numtasks, a simple controller that limits the number of tasks
- a CPU controller, which controls the CPU resource.
--
Brief Intro for CKRM:
Class-based Kernel Resource Management (CKRM) enables control of system
resource usage and monitoring of resource usage through user-defined
groups of tasks called classes.
A class is a group of tasks defined by the administrator.
By assigning tasks to classes, administrators can monitor and bound the
usage of any system resource that has a resource controller.
Resources amenable to such control include CPU ticks, physical pages,
disk I/O bandwidth, number of open file handles, and number of tasks, to
name a few.
Userspace interfaces with CKRM through a configfs subsystem: the Resource
Control File System (RCFS). Users create and delete classes simply by issuing
mkdir or rmdir commands. Once a class is created, the user may set its
resource shares and alter the group of tasks bound to it by writing to files
in the class directory. Similarly, to monitor the subsequent resource
utilization of the class, users read files in the class directory.
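The RCFS workflow described above can be sketched from userspace as plain
directory and file operations. This is only an illustration: the helper
names are hypothetical, the caller supplies the RCFS directory because the
mount point is not stated here, and the value formats written to each file
are assumptions. Only the file names "shares", "members" and "stats" come
from the patch descriptions below.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a class: the equivalent of "mkdir <parent>/<name>". */
int rcfs_create_class(const char *parent, const char *name)
{
	char path[512];

	snprintf(path, sizeof(path), "%s/%s", parent, name);
	return mkdir(path, 0755);
}

/* Delete a class: the equivalent of "rmdir <parent>/<name>". */
int rcfs_delete_class(const char *parent, const char *name)
{
	char path[512];

	snprintf(path, sizeof(path), "%s/%s", parent, name);
	return rmdir(path);
}

/* Write a string to <class_dir>/<attr>, e.g. "shares" or "members". */
int rcfs_write_attr(const char *class_dir, const char *attr,
		    const char *value)
{
	char path[512];
	FILE *f;
	int ok;

	snprintf(path, sizeof(path), "%s/%s", class_dir, attr);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(value, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read <class_dir>/<attr> (e.g. "stats") into buf; returns bytes read. */
long rcfs_read_attr(const char *class_dir, const char *attr,
		    char *buf, size_t len)
{
	char path[512];
	FILE *f;
	size_t n;

	if (len == 0)
		return -1;
	snprintf(path, sizeof(path), "%s/%s", class_dir, attr);
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(buf, 1, len - 1, f);
	buf[n] = '\0';
	fclose(f);
	return (long)n;
}
```

With configfs mounted, a caller would pass the RCFS directory as parent,
create a class, write a task identifier to its members file and later read
its stats file. The exact strings each file accepts are defined by the RCFS
patches, not by this sketch.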
Users control each resource share of a class independently of the other
resources. In other words, the CPU share of a class can be very different
from its memory share or its I/O share.
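The ckrm.h comment in patch 01 describes a share as a child's fraction of
its parent's resource: a dividend in the child over a divisor in the parent.
One possible reading of that arithmetic, for a single resource, is sketched
below. The struct and function names are hypothetical, and multiplying the
fractions along the path to the root is an assumption of this sketch, not
something the patches state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node: only what the arithmetic needs. */
struct share_node {
	const struct share_node *parent;	/* NULL for the root class */
	int shares;				/* dividend held by this class */
	int child_shares_divisor;		/* divisor applied to children */
};

/*
 * Fraction of the whole resource that a class may use, assuming the root
 * owns everything and each level scales its parent's fraction by
 * shares / parent->child_shares_divisor.
 */
double effective_fraction(const struct share_node *node)
{
	double fraction = 1.0;

	for (; node && node->parent; node = node->parent)
		fraction *= (double)node->shares /
			    node->parent->child_shares_divisor;
	return fraction;
}
```

For example, a class holding 50 shares under a root whose divisor is 100
gets half of the resource, and a child holding 5 shares under that class,
with a divisor of 10, gets half of that, a quarter of the total. Because
each resource has its own shares struct per class, the same class can hold
a very different fraction of another resource.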
Resource controllers implement a small set of functions that respond to
changes in resource shares, class creation/deletion and class membership.
Given a class and its shares the controller then manages resource usage
of tasks in that class. For instance, a CPU resource controller might
manipulate the timeslice of each task according to its class' remaining
CPU share.
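To make the controller callbacks concrete, here is a userspace mock of a
controller that merely counts the tasks in each class. The kernel types are
replaced by simplified stand-ins, only a subset of the callbacks from
include/linux/ckrm_rc.h (patch 01) is modelled, and every demo_ name is
hypothetical. Treat it as a sketch of the calling pattern rather than
kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel types. */
struct task_struct { int pid; };
struct ckrm_class { int depth; };
struct ckrm_shares { int min_shares; int max_shares; };

/* Subset of struct ckrm_controller from include/linux/ckrm_rc.h. */
struct ckrm_controller {
	const char *name;
	int depth_supported;
	struct ckrm_shares *(*alloc_shares_struct)(struct ckrm_class *);
	void (*free_shares_struct)(struct ckrm_shares *);
	void (*move_task)(struct task_struct *, struct ckrm_shares *,
			  struct ckrm_shares *);
};

/* Hypothetical per-class state: generic shares embedded in private data. */
struct demo_res {
	struct ckrm_shares shares;
	int nr_tasks;
};

/* Recover the private struct from the embedded shares pointer. */
static struct demo_res *to_demo_res(struct ckrm_shares *shares)
{
	return (struct demo_res *)((char *)shares -
				   offsetof(struct demo_res, shares));
}

struct ckrm_shares *demo_alloc(struct ckrm_class *cls)
{
	struct demo_res *res = calloc(1, sizeof(*res));

	(void)cls;	/* a real controller might inspect the class */
	return res ? &res->shares : NULL;
}

void demo_free(struct ckrm_shares *shares)
{
	free(to_demo_res(shares));
}

/* Accepting NULL for either side is an assumption of this sketch. */
void demo_move_task(struct task_struct *task, struct ckrm_shares *from,
		    struct ckrm_shares *to)
{
	(void)task;
	if (from)
		to_demo_res(from)->nr_tasks--;
	if (to)
		to_demo_res(to)->nr_tasks++;
}

struct ckrm_controller demo_ctlr = {
	.name = "demo",
	.depth_supported = 3,
	.alloc_shares_struct = demo_alloc,
	.free_shares_struct = demo_free,
	.move_task = demo_move_task,
};
```

Embedding struct ckrm_shares inside the controller's own per-class struct
lets the core pass around a generic pointer while the controller recovers
its private state, the same idea the kernel expresses with container_of().
In the kernel the structure would be handed to ckrm_register_controller().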
--
Patch Descriptions:
This set of patches implements classes, resource controller registration,
and the RCFS interface. Subsequent sets of patches add specific resource
controllers.
More details are available in the documentation patch.
Patch descriptions:
01/12: ckrm_core
- Provides register/unregister functions for a controller
02/12: ckrm_core_class_support
- Provides functions to alloc and free a user defined class
- Provides utility functions to walk through the class hierarchy
03/12: ckrm_core_handle_shares
- Provides functions to set/get shares of a class
- Defines a teardown function that is intended to be called when
user disables CKRM (by umount of configfs or rmmod of rcfs)
04/12: ckrm_tasksupport
- Adds logic to support adding/removing task to/from a class
- Provides an interface to set a task's class
05/12: ckrm_tasksupport_fork_exit_init
- Initializes and clears ckrm specific information at fork() and
exit()
- Initializes ckrm (called from start_kernel)
06/12: ckrm_tasksupport_procsupport
- Adds an interface in /proc to get the class name of a task.
07/12 - ckrm_configfs_rcfs
Creates configfs interface(RCFS) for managing CKRM.
Hooks up with configfs. Provides functions for creating and
deleting classes.
08/12 - ckrm_configfs_rcfs_attr_support
Adds the basic attribute store and show functions.
09/12 - ckrm_configfs_rcfs_stats
Adds attr_store and attr_show support for stats file.
10/12 - ckrm_configfs_rcfs_shares
Adds attr_store and attr_show support for shares file.
11/12 - ckrm_configfs_rcfs_members
Adds attr_store and attr_show support for members file.
12/12 - ckrm_docs
Documentation describing important CKRM elements such as classes,
shares, controllers, and the interface provided to userspace via RCFS
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
* [RFC] [PATCH 01/12] Register/Unregister interface for Controllers
From: sekharan @ 2006-04-21 2:24 UTC
To: linux-kernel, ckrm-tech; +Cc: sekharan
01/12 - ckrm_core
This patch defines the data structures that describe a class and a resource
controller.
Provides register/unregister functions for a controller.
Provides utility functions to get a controller's data structure.
--
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Hubertus Franke <frankeh@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@watson.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>
Signed-Off-By: Matt Helsley <matthltc@us.ibm.com>
include/linux/ckrm.h | 76 +++++++++++++++++
include/linux/ckrm_rc.h | 67 ++++++++++++++
init/Kconfig | 14 +++
kernel/Makefile | 1
kernel/ckrm/Makefile | 1
kernel/ckrm/ckrm.c | 210 +++++++++++++++++++++++++++++++++++++++++++++++
kernel/ckrm/ckrm_local.h | 13 ++
7 files changed, 382 insertions(+)
Index: linux2617-rc2/include/linux/ckrm.h
===================================================================
--- /dev/null
+++ linux2617-rc2/include/linux/ckrm.h
@@ -0,0 +1,76 @@
+/*
+ * ckrm.h - Header file to be used by Class-based Kernel Resource
+ * Management (CKRM).
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003, 2004
+ * (C) Shailabh Nagar, IBM Corp. 2003, 2004
+ * (C) Chandra Seetharaman, IBM Corp. 2003, 2004, 2005
+ *
+ * Provides data structures, macros and kernel APIs
+ *
+ * More details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#ifndef _LINUX_CKRM_H
+#define _LINUX_CKRM_H
+
+#ifdef CONFIG_CKRM
+#include <linux/spinlock.h>
+#include <linux/list.h>
+#include <linux/kref.h>
+
+#define CKRM_SHARE_UNCHANGED (-1) /* implicitly specified by userspace,
+ * never stored in a class' shares
+ * struct, and never displayed */
+#define CKRM_SHARE_UNSUPPORTED (-2) /* If the resource controller doesn't
+ * support user changing a shares value
+ * it sets the corresponding share
+ * value to UNSUPPORTED when it returns
+ * the newly allocated shares data
+ * structure */
+#define CKRM_SHARE_DONT_CARE (-3)
+
+#define CKRM_SHARE_DEFAULT_DIVISOR (100)
+
+#define CKRM_MAX_RES_CTLRS 8 /* max # of resource controllers */
+
+#define CKRM_NO_CLASS NULL
+#define CKRM_NO_SHARE NULL
+#define CKRM_NO_RES_ID CKRM_MAX_RES_CTLRS /* Invalid ID */
+
+/*
+ * Share quantities are a child's fraction of the parent's resource
+ * specified by a divisor in the parent and a dividend in the child.
+ *
+ * Shares are represented as a relative quantity between parent and child
+ * to simplify locking when propagating modifications to the shares of a
+ * class. Only the parent and the children of the modified class need to be
+ * locked.
+*/
+struct ckrm_shares {
+};
+
+/*
+ * Class is the grouping of tasks with shares of each resource that has
+ * registered a resource controller (see include/linux/ckrm_rc.h).
+ */
+struct ckrm_class {
+ int depth; /* depth of this class. root == 0 */
+ spinlock_t class_lock; /* protects task_list, shares and children
+ * When grabbing class_lock in a hierarchy,
+ * grab parent's class_lock first.
+ * If resource controller uses a class
+ * specific lock, grab class_lock before
+ * grabbing resource specific lock */
+ struct ckrm_shares *shares[CKRM_MAX_RES_CTLRS];/* resource shares */
+ struct list_head class_list; /* entry in list of all classes */
+};
+
+#endif /* CONFIG_CKRM */
+#endif /* _LINUX_CKRM_H */
Index: linux2617-rc2/include/linux/ckrm_rc.h
===================================================================
--- /dev/null
+++ linux2617-rc2/include/linux/ckrm_rc.h
@@ -0,0 +1,67 @@
+/*
+ * ckrm_rc.h - Header file to be used by Resource controllers of CKRM
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003
+ * (C) Shailabh Nagar, IBM Corp. 2003
+ * (C) Chandra Seetharaman, IBM Corp. 2003, 2004, 2005
+ * (C) Vivek Kashyap , IBM Corp. 2004
+ *
+ * Provides data structures, macros and kernel API of CKRM for
+ * resource controllers.
+ *
+ * More details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#ifndef _LINUX_CKRM_RC_H
+#define _LINUX_CKRM_RC_H
+
+#include <linux/ckrm.h>
+
+struct ckrm_controller {
+ const char *name;
+ int depth_supported;/* maximum hierarchy supported by controller */
+ unsigned int ctlr_id;
+
+ /*
+ * Keeps number of references to this controller structure. kref
+ * does not work as we want to be able to allow removal of a
+ * controller even when some classes are defined.
+ */
+ atomic_t count;
+
+ /*
+ * Allocate a new shares struct for this resource controller.
+ * Called when registering a resource controller with pre-existing
+ * classes and when new classes are created by the user.
+ */
+ struct ckrm_shares *(*alloc_shares_struct)(struct ckrm_class *);
+ /* Corresponding free of shares struct for this resource controller */
+ void (*free_shares_struct)(struct ckrm_shares *);
+
+ /* Notifies the controller when the shares are changed */
+ void (*shares_changed)(struct ckrm_shares *);
+
+ /* resource statistics */
+ ssize_t (*show_stats)(struct ckrm_shares *, char *, size_t);
+ int (*reset_stats)(struct ckrm_shares *, const char *);
+
+ /*
+ * move_task is called when a task moves from one class to another.
+ * The first parameter is the task that is moving, the second
+ * is the resource specific shares of the previous class the task
+ * was in, and the third is the shares of the class the task has
+ * moved to.
+ */
+ void (*move_task)(struct task_struct *, struct ckrm_shares *,
+ struct ckrm_shares *);
+};
+
+extern int ckrm_register_controller(struct ckrm_controller *);
+extern int ckrm_unregister_controller(struct ckrm_controller *);
+#endif /* _LINUX_CKRM_RC_H */
Index: linux2617-rc2/kernel/ckrm/ckrm.c
===================================================================
--- /dev/null
+++ linux2617-rc2/kernel/ckrm/ckrm.c
@@ -0,0 +1,210 @@
+/* ckrm.c - Class-based Kernel Resource Management (CKRM)
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003, 2004
+ * (C) Shailabh Nagar, IBM Corp. 2003, 2004
+ * (C) Chandra Seetharaman, IBM Corp. 2003, 2004, 2005
+ * (C) Vivek Kashyap, IBM Corp. 2004
+ * (C) Matt Helsley, IBM Corp. 2006
+ *
+ * Provides the kernel API of CKRM for in-kernel, per-resource controllers
+ * (one each for cpu, memory and io).
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/ckrm_rc.h>
+
+static struct ckrm_controller *ckrm_controllers[CKRM_MAX_RES_CTLRS];
+/* res_ctlrs_lock protects ckrm_controllers array and count in controllers*/
+static spinlock_t res_ctlrs_lock = SPIN_LOCK_UNLOCKED;
+
+static LIST_HEAD(ckrm_classes);/* link all classes */
+static rwlock_t ckrm_class_lock = RW_LOCK_UNLOCKED; /* protects ckrm_classes */
+
+/* Must be called with res_ctlr_lock held */
+static inline int is_ctlr_id_valid(unsigned int ctlr_id)
+{
+ return ((ctlr_id < CKRM_MAX_RES_CTLRS) &&
+ (ckrm_controllers[ctlr_id] != NULL));
+}
+
+struct ckrm_controller *ckrm_get_controller_by_id(unsigned int ctlr_id)
+{
+ /*
+ * inc of controller[i].count has to be atomically done with
+ * checking the array ckrm_controllers, as remove_controller()
+ * checks controller[i].count and clears ckrm_controllers[i]
+ * atomically, under res_ctlrs_lock.
+ */
+ spin_lock(&res_ctlrs_lock);
+ if (!is_ctlr_id_valid(ctlr_id)) {
+ spin_unlock(&res_ctlrs_lock);
+ return NULL;
+ }
+ atomic_inc(&ckrm_controllers[ctlr_id]->count);
+ spin_unlock(&res_ctlrs_lock);
+ return ckrm_controllers[ctlr_id];
+}
+
+struct ckrm_controller *ckrm_get_controller_by_name(const char *name)
+{
+ struct ckrm_controller *ctlr;
+ unsigned int i;
+
+ spin_lock(&res_ctlrs_lock);
+ for (i = 0; i < CKRM_MAX_RES_CTLRS; i++, ctlr = NULL) {
+ ctlr = ckrm_controllers[i];
+ if (!ctlr)
+ continue;
+ if (!strcmp(name, ctlr->name)) {
+ atomic_inc(&ckrm_controllers[i]->count);
+ break;
+ }
+ }
+ spin_unlock(&res_ctlrs_lock);
+ return ctlr;
+}
+
+static void ckrm_get_controller(struct ckrm_controller *ctlr)
+{
+ atomic_inc(&ctlr->count);
+}
+
+void ckrm_put_controller(struct ckrm_controller *ctlr)
+{
+ atomic_dec(&ctlr->count);
+}
+
+/* Allocate resource specific information for a class */
+static void do_alloc_shares_struct(struct ckrm_class *class,
+ struct ckrm_controller *ctlr)
+{
+ if (class->shares[ctlr->ctlr_id]) /* already allocated */
+ return;
+
+ if (class->depth > ctlr->depth_supported)
+ return;
+
+ class->shares[ctlr->ctlr_id] = ctlr->alloc_shares_struct(class);
+ if (class->shares[ctlr->ctlr_id] != NULL)
+ ckrm_get_controller(ctlr);
+}
+
+/* Free up the given resource specific information in a class */
+static void do_free_shares_struct(struct ckrm_class *class,
+ struct ckrm_controller *ctlr)
+{
+ struct ckrm_shares *shares = class->shares[ctlr->ctlr_id];
+
+ /* No shares alloced previously */
+ if (shares == NULL)
+ return;
+
+ spin_lock(&class->class_lock);
+ class->shares[ctlr->ctlr_id] = NULL;
+ spin_unlock(&class->class_lock);
+ ctlr->free_shares_struct(shares);
+ ckrm_put_controller(ctlr); /* Drop reference acquired in do_alloc */
+}
+
+static int add_controller(struct ckrm_controller *ctlr)
+{
+ int ctlr_id, ret = -ENOSPC;
+
+ spin_lock(&res_ctlrs_lock);
+ for (ctlr_id = 0; ctlr_id < CKRM_MAX_RES_CTLRS; ctlr_id++)
+ if (ckrm_controllers[ctlr_id] == NULL) {
+ ckrm_controllers[ctlr_id] = ctlr;
+ ret = ctlr_id;
+ break;
+ }
+ spin_unlock(&res_ctlrs_lock);
+ return ret;
+}
+
+/*
+ * Interface for registering a resource controller.
+ *
+ * Returns 0 on success, -errno for failure.
+ * Fills ctlr->ctlr_id with a valid controller id on success.
+ */
+int ckrm_register_controller(struct ckrm_controller *ctlr)
+{
+ int ret;
+ struct ckrm_class *class;
+
+ if (!ctlr)
+ return -EINVAL;
+
+ /* Make sure there is an alloc and a free */
+ if (!ctlr->alloc_shares_struct || !ctlr->free_shares_struct)
+ return -EINVAL;
+
+ ret = add_controller(ctlr);
+
+ if (ret < 0)
+ return ret;
+
+ ctlr->ctlr_id = ret;
+
+ atomic_set(&ctlr->count, 0);
+
+ /*
+ * Run through all classes and create the controller specific data
+ * structures.
+ */
+ read_lock(&ckrm_class_lock);
+ list_for_each_entry(class, &ckrm_classes, class_list)
+ do_alloc_shares_struct(class, ctlr);
+ read_unlock(&ckrm_class_lock);
+ return 0;
+}
+
+static int remove_controller(struct ckrm_controller *ctlr)
+{
+ spin_lock(&res_ctlrs_lock);
+ if (atomic_read(&ctlr->count) > 0) {
+ spin_unlock(&res_ctlrs_lock);
+ return -EBUSY;
+ }
+
+ ckrm_controllers[ctlr->ctlr_id] = NULL;
+ ctlr->ctlr_id = CKRM_NO_RES_ID;
+ spin_unlock(&res_ctlrs_lock);
+ return 0;
+}
+
+/*
+ * Unregistering resource controller.
+ *
+ * Returns 0 on success, -errno for failure.
+ */
+int ckrm_unregister_controller(struct ckrm_controller *ctlr)
+{
+ struct ckrm_class *class;
+
+ if (!ctlr)
+ return -EINVAL;
+
+ if (ckrm_get_controller_by_id(ctlr->ctlr_id) != ctlr)
+ return -EINVAL;
+
+ /* free shares structs for this resource from all the classes */
+ read_lock(&ckrm_class_lock);
+ list_for_each_entry_reverse(class, &ckrm_classes, class_list)
+ do_free_shares_struct(class, ctlr);
+ read_unlock(&ckrm_class_lock);
+
+ ckrm_put_controller(ctlr);
+ return remove_controller(ctlr);
+}
+
+EXPORT_SYMBOL_GPL(ckrm_register_controller);
+EXPORT_SYMBOL_GPL(ckrm_unregister_controller);
Index: linux2617-rc2/kernel/Makefile
===================================================================
--- linux2617-rc2.orig/kernel/Makefile
+++ linux2617-rc2/kernel/Makefile
@@ -38,6 +38,7 @@ obj-$(CONFIG_GENERIC_HARDIRQS) += irq/
obj-$(CONFIG_SECCOMP) += seccomp.o
obj-$(CONFIG_RCU_TORTURE_TEST) += rcutorture.o
obj-$(CONFIG_RELAY) += relay.o
+obj-$(CONFIG_CKRM) += ckrm/
ifneq ($(CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER),y)
# According to Alan Modra <alan@linuxcare.com.au>, the -fno-omit-frame-pointer is
Index: linux2617-rc2/kernel/ckrm/Makefile
===================================================================
--- /dev/null
+++ linux2617-rc2/kernel/ckrm/Makefile
@@ -0,0 +1 @@
+obj-y = ckrm.o
Index: linux2617-rc2/init/Kconfig
===================================================================
--- linux2617-rc2.orig/init/Kconfig
+++ linux2617-rc2/init/Kconfig
@@ -150,6 +150,20 @@ config BSD_PROCESS_ACCT_V3
for processing it. A preliminary version of these tools is available
at <http://www.physik3.uni-rostock.de/tim/kernel/utils/acct/>.
+menu "Class Based Kernel Resource Management"
+
+config CKRM
+ bool "Class Based Kernel Resource Management Core"
+ depends on EXPERIMENTAL
+ help
+ Class-based Kernel Resource Management is a framework for controlling
+ and monitoring resource allocation of user-defined groups of tasks.
+ For more information, please visit http://ckrm.sf.net.
+
+ If you say Y here, enable the Resource Class File System and at least
+ one of the resource controllers below. Say N if you are unsure.
+
+endmenu
config SYSCTL
bool "Sysctl support"
---help---
Index: linux2617-rc2/kernel/ckrm/ckrm_local.h
===================================================================
--- /dev/null
+++ linux2617-rc2/kernel/ckrm/ckrm_local.h
@@ -0,0 +1,13 @@
+/*
+ * Contains function declarations that are local to ckrm core.
+ * NOT to be included by controllers.
+ */
+
+#include <linux/ckrm_rc.h>
+
+extern struct ckrm_controller *ckrm_get_controller_by_name(const char *);
+extern struct ckrm_controller *ckrm_get_controller_by_id(unsigned int);
+extern void ckrm_put_controller(struct ckrm_controller *);
+extern struct ckrm_class *ckrm_alloc_class(struct ckrm_class *, const char *);
+extern int ckrm_free_class(struct ckrm_class *);
+extern void ckrm_release_class(struct kref *);
* [RFC] [PATCH 02/12] Class creation/deletion
From: sekharan @ 2006-04-21 2:24 UTC
To: linux-kernel, ckrm-tech; +Cc: sekharan
02/12 - ckrm_core_class_support
Provides functions to alloc and free a user-defined class.
Provides a utility macro to walk through the class hierarchy.
--
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Hubertus Franke <frankeh@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@watson.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>
Signed-Off-By: Vivek Kashyap <kashyapv@us.ibm.com>
Signed-Off-By: Matt Helsley <matthltc@us.ibm.com>
include/linux/ckrm.h | 8 ++
include/linux/ckrm_rc.h | 9 ++
kernel/ckrm/ckrm.c | 171 ++++++++++++++++++++++++++++++++++++++++++++----
3 files changed, 175 insertions(+), 13 deletions(-)
Index: linux2617-rc2/include/linux/ckrm.h
===================================================================
--- linux2617-rc2.orig/include/linux/ckrm.h
+++ linux2617-rc2/include/linux/ckrm.h
@@ -61,6 +61,8 @@ struct ckrm_shares {
* registered a resource controller (see include/linux/ckrm_rc.h).
*/
struct ckrm_class {
+ const char *name;
+ struct kref ref;
int depth; /* depth of this class. root == 0 */
spinlock_t class_lock; /* protects task_list, shares and children
* When grabbing class_lock in a hierarchy,
@@ -70,6 +72,12 @@ struct ckrm_class {
* grabbing resource specific lock */
struct ckrm_shares *shares[CKRM_MAX_RES_CTLRS];/* resource shares */
struct list_head class_list; /* entry in list of all classes */
+
+ struct list_head task_list; /* this class's tasks */
+
+ struct ckrm_class *parent;
+ struct list_head siblings; /* entry in list of siblings */
+ struct list_head children; /* head of children */
};
#endif /* CONFIG_CKRM */
Index: linux2617-rc2/kernel/ckrm/ckrm.c
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/ckrm.c
+++ linux2617-rc2/kernel/ckrm/ckrm.c
@@ -19,14 +19,32 @@
*/
#include <linux/module.h>
-#include <linux/ckrm_rc.h>
+#include "ckrm_local.h"
static struct ckrm_controller *ckrm_controllers[CKRM_MAX_RES_CTLRS];
/* res_ctlrs_lock protects ckrm_controllers array and count in controllers*/
static spinlock_t res_ctlrs_lock = SPIN_LOCK_UNLOCKED;
static LIST_HEAD(ckrm_classes);/* link all classes */
-static rwlock_t ckrm_class_lock = RW_LOCK_UNLOCKED; /* protects ckrm_classes */
+static int ckrm_num_classes; /* Number of user defined classes */
+static rwlock_t ckrm_class_lock = RW_LOCK_UNLOCKED;
+ /* protects ckrm_classes list and ckrm_num_classes */
+
+struct ckrm_class ckrm_default_class = {
+ .task_list = LIST_HEAD_INIT(ckrm_default_class.task_list),
+ .class_lock = SPIN_LOCK_UNLOCKED,
+ .name = "task",
+ .class_list = LIST_HEAD_INIT(ckrm_default_class.class_list),
+ .siblings = LIST_HEAD_INIT(ckrm_default_class.siblings),
+ .children = LIST_HEAD_INIT(ckrm_default_class.children),
+};
+
+/* Must be called with parent's class_lock held */
+static inline void ckrm_remove_child(struct ckrm_class *child)
+{
+ list_del(&child->siblings);
+ child->parent = CKRM_NO_CLASS;
+}
/* Must be called with res_ctlr_lock held */
static inline int is_ctlr_id_valid(unsigned int ctlr_id)
@@ -97,6 +115,55 @@ static void do_alloc_shares_struct(struc
ckrm_get_controller(ctlr);
}
+static void ckrm_class_init(struct ckrm_class *class)
+{
+ class->class_lock = SPIN_LOCK_UNLOCKED;
+ kref_init(&class->ref);
+ INIT_LIST_HEAD(&class->task_list);
+ INIT_LIST_HEAD(&class->children);
+ INIT_LIST_HEAD(&class->siblings);
+}
+
+struct ckrm_class *ckrm_alloc_class(struct ckrm_class *parent,
+ const char *name)
+{
+ int i;
+ struct ckrm_class *class;
+
+ BUG_ON(parent == NULL);
+
+ kref_get(&parent->ref);
+ class = kzalloc(sizeof(struct ckrm_class), GFP_KERNEL);
+ if (!class) {
+ kref_put(&parent->ref, ckrm_release_class);
+ return NULL;
+ }
+ ckrm_class_init(class);
+ class->name = name;
+ class->depth = parent->depth + 1;
+
+ /* Add to parent */
+ spin_lock(&parent->class_lock);
+ class->parent = parent;
+ list_add(&class->siblings, &parent->children);
+ spin_unlock(&parent->class_lock);
+
+ write_lock(&ckrm_class_lock);
+ list_add_tail(&class->class_list, &ckrm_classes);
+ ckrm_num_classes++;
+ write_unlock(&ckrm_class_lock);
+
+ for (i = 0; i < CKRM_MAX_RES_CTLRS; i++) {
+ struct ckrm_controller *ctlr = ckrm_get_controller_by_id(i);
+ if (!ctlr)
+ continue;
+ do_alloc_shares_struct(class, ctlr);
+ ckrm_put_controller(ctlr);
+ }
+
+ return class;
+}
+
/* Free up the given resource specific information in a class */
static void do_free_shares_struct(struct ckrm_class *class,
struct ckrm_controller *ctlr)
@@ -114,6 +181,59 @@ static void do_free_shares_struct(struct
ckrm_put_controller(ctlr); /* Drop reference acquired in do_alloc */
}
+/*
+ * Release a class
+ * requires that all tasks were previously reassigned to another class
+ *
+ * Invoked through kref_put() once the last reference is dropped.
+ */
+void ckrm_release_class(struct kref *kref)
+{
+ int i;
+ struct ckrm_class *class = container_of(kref,
+ struct ckrm_class, ref);
+ struct ckrm_class *parent = class->parent;
+
+ BUG_ON(ckrm_is_class_root(class));
+
+ for (i = 0; i < CKRM_MAX_RES_CTLRS; i++) {
+ struct ckrm_controller *ctlr = ckrm_get_controller_by_id(i);
+ if (!ctlr)
+ continue;
+ do_free_shares_struct(class, ctlr);
+ ckrm_put_controller(ctlr);
+ }
+
+ /* Remove this class from the list of all classes */
+ write_lock(&ckrm_class_lock);
+ list_del(&class->class_list);
+ ckrm_num_classes--;
+ write_unlock(&ckrm_class_lock);
+
+ /* remove from parent */
+ spin_lock(&parent->class_lock);
+ list_del(&class->siblings);
+ class->parent = CKRM_NO_CLASS;
+ spin_unlock(&parent->class_lock);
+
+ kref_put(&parent->ref, ckrm_release_class);
+ kfree(class);
+}
+
+int ckrm_free_class(struct ckrm_class *class)
+{
+ BUG_ON(ckrm_is_class_root(class));
+ spin_lock(&class->class_lock);
+ if (!list_empty(&class->children)) {
+ spin_unlock(&class->class_lock);
+ return -EBUSY;
+ }
+ spin_unlock(&class->class_lock);
+ kref_put(&class->ref, ckrm_release_class);
+ return 0;
+}
+
+
static int add_controller(struct ckrm_controller *ctlr)
{
int ctlr_id, ret = -ENOSPC;
@@ -128,7 +248,6 @@ static int add_controller(struct ckrm_co
spin_unlock(&res_ctlrs_lock);
return ret;
}
-
/*
* Interface for registering a resource controller.
*
@@ -138,7 +257,7 @@ static int add_controller(struct ckrm_co
int ckrm_register_controller(struct ckrm_controller *ctlr)
{
int ret;
- struct ckrm_class *class;
+ struct ckrm_class *class, *prev_class;
if (!ctlr)
return -EINVAL;
@@ -160,10 +279,20 @@ int ckrm_register_controller(struct ckrm
* Run through all classes and create the controller specific data
* structures.
*/
- read_lock(&ckrm_class_lock);
- list_for_each_entry(class, &ckrm_classes, class_list)
- do_alloc_shares_struct(class, ctlr);
- read_unlock(&ckrm_class_lock);
+ prev_class = NULL;
+ read_lock(&ckrm_class_lock);
+ list_for_each_entry(class, &ckrm_classes, class_list) {
+ kref_get(&class->ref);
+ read_unlock(&ckrm_class_lock);
+ do_alloc_shares_struct(class, ctlr);
+ if (prev_class)
+ kref_put(&prev_class->ref, ckrm_release_class);
+ prev_class = class;
+ read_lock(&ckrm_class_lock);
+ }
+ read_unlock(&ckrm_class_lock);
+ if (prev_class)
+ kref_put(&prev_class->ref, ckrm_release_class);
return 0;
}
@@ -188,7 +317,7 @@ static int remove_controller(struct ckrm
*/
int ckrm_unregister_controller(struct ckrm_controller *ctlr)
{
- struct ckrm_class *class;
+ struct ckrm_class *class, *prev_class;
if (!ctlr)
return -EINVAL;
@@ -197,10 +326,20 @@ int ckrm_unregister_controller(struct ck
return -EINVAL;
/* free shares structs for this resource from all the classes */
- read_lock(&ckrm_class_lock);
- list_for_each_entry_reverse(class, &ckrm_classes, class_list)
- do_free_shares_struct(class, ctlr);
- read_unlock(&ckrm_class_lock);
+ prev_class = NULL;
+ read_lock(&ckrm_class_lock);
+ list_for_each_entry_reverse(class, &ckrm_classes, class_list) {
+ kref_get(&class->ref);
+ read_unlock(&ckrm_class_lock);
+ do_free_shares_struct(class, ctlr);
+ if (prev_class)
+ kref_put(&prev_class->ref, ckrm_release_class);
+ prev_class = class;
+ read_lock(&ckrm_class_lock);
+ }
+ read_unlock(&ckrm_class_lock);
+ if (prev_class)
+ kref_put(&prev_class->ref, ckrm_release_class);
ckrm_put_controller(ctlr);
return remove_controller(ctlr);
@@ -208,3 +347,9 @@ int ckrm_unregister_controller(struct ck
EXPORT_SYMBOL_GPL(ckrm_register_controller);
EXPORT_SYMBOL_GPL(ckrm_unregister_controller);
+EXPORT_SYMBOL_GPL(ckrm_alloc_class);
+EXPORT_SYMBOL_GPL(ckrm_free_class);
+EXPORT_SYMBOL_GPL(ckrm_default_class);
+EXPORT_SYMBOL_GPL(ckrm_get_controller_by_name);
+EXPORT_SYMBOL_GPL(ckrm_get_controller_by_id);
+EXPORT_SYMBOL_GPL(ckrm_put_controller);
Index: linux2617-rc2/include/linux/ckrm_rc.h
===================================================================
--- linux2617-rc2.orig/include/linux/ckrm_rc.h
+++ linux2617-rc2/include/linux/ckrm_rc.h
@@ -64,4 +64,13 @@ struct ckrm_controller {
extern int ckrm_register_controller(struct ckrm_controller *);
extern int ckrm_unregister_controller(struct ckrm_controller *);
+extern struct ckrm_class ckrm_default_class;
+static inline int ckrm_is_class_root(const struct ckrm_class* class)
+{
+ return (class == &ckrm_default_class);
+}
+
+#define for_each_child(child, parent) \
+ list_for_each_entry(child, &parent->children, siblings)
+
#endif /* _LINUX_CKRM_RC_H */
* [RFC] [PATCH 03/12] Share Handling
From: sekharan @ 2006-04-21 2:24 UTC
To: linux-kernel, ckrm-tech; +Cc: sekharan
03/12 - ckrm_core_handle_shares
Provides functions to set/get the shares of a specific resource of a class.
Defines a teardown function that is intended to be called when the user
disables CKRM (by umount of RCFS).
--
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Hubertus Franke <frankeh@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@watson.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>
Signed-off-by: MAEDA Naoaki <maeda.naoaki@jp.fujitsu.com>
Signed-Off-By: Matt Helsley <matthltc@us.ibm.com>
include/linux/ckrm.h | 14 ++
include/linux/ckrm_rc.h | 10 +
kernel/ckrm/Makefile | 2
kernel/ckrm/ckrm.c | 24 ++++
kernel/ckrm/ckrm_local.h | 6 +
kernel/ckrm/ckrm_shares.c | 242 ++++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 297 insertions(+), 1 deletion(-)
Index: linux2617-rc2/include/linux/ckrm.h
===================================================================
--- linux2617-rc2.orig/include/linux/ckrm.h
+++ linux2617-rc2/include/linux/ckrm.h
@@ -54,6 +54,20 @@
* locked.
*/
struct ckrm_shares {
+ /* shares only set by userspace */
+ int min_shares; /* minimum fraction of parent's resources allowed */
+ int max_shares; /* maximum fraction of parent's resources allowed */
+ int child_shares_divisor; /* >= 1, may not be DONT_CARE */
+
+ /*
+ * share values invisible to userspace. adjusted when userspace
+ * sets shares
+ */
+ int unused_min_shares;
+ /* 0 <= unused_min_shares <= (child_shares_divisor -
+ * Sum of min_shares of children)
+ */
+ int cur_max_shares; /* max(children's max_shares). need better name */
};
/*
Index: linux2617-rc2/kernel/ckrm/ckrm_shares.c
===================================================================
--- /dev/null
+++ linux2617-rc2/kernel/ckrm/ckrm_shares.c
@@ -0,0 +1,242 @@
+/*
+ * ckrm_shares.c - Share management functions for CKRM
+ *
+ * Copyright (C) Chandra Seetharaman, IBM Corp. 2003, 2004, 2005, 2006
+ * (C) Hubertus Franke, IBM Corp. 2004
+ * (C) Matt Helsley, IBM Corp. 2006
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/errno.h>
+#include <linux/ckrm_rc.h>
+
+/*
+ * Share values can be quantitative (quantity of memory for instance) or
+ * symbolic. The symbolic value DONT_CARE allows for any quantity of a resource
+ * to be substituted in its place. The symbolic value UNCHANGED is only used
+ * when setting share values and means that the old value should be used.
+ */
+
+/* Is the share a quantity (as opposed to "symbols" DONT_CARE or UNCHANGED) */
+static inline int is_share_quantitative(int share)
+{
+ return (share >= 0);
+}
+
+static inline int is_share_symbolic(int share)
+{
+ return !is_share_quantitative(share);
+}
+
+static inline int is_share_valid(int share)
+{
+ return ((share == CKRM_SHARE_DONT_CARE) ||
+ (share == CKRM_SHARE_UNSUPPORTED) ||
+ is_share_quantitative(share));
+}
+
+static inline int did_share_change(int share)
+{
+ return (share != CKRM_SHARE_UNCHANGED);
+}
+
+static inline int change_supported(int share)
+{
+ return (share != CKRM_SHARE_UNSUPPORTED);
+}
+
+/*
+ * Caller is responsible for protecting 'parent'
+ * Caller is responsible for making sure that the sum of sibling min_shares
+ * doesn't exceed parent's total min_shares.
+ */
+static inline void ckrm_child_min_shares_changed(struct ckrm_shares *parent,
+ int child_cur_min_shares,
+ int child_new_min_shares)
+{
+ if (is_share_quantitative(child_new_min_shares))
+ parent->unused_min_shares -= child_new_min_shares;
+ if (is_share_quantitative(child_cur_min_shares))
+ parent->unused_min_shares += child_cur_min_shares;
+}
+
+/*
+ * Set parent's cur_max_shares to the largest 'max_shares' of all
+ * of its children.
+ */
+static inline void ckrm_set_cur_max_shares(struct ckrm_class *parent,
+ struct ckrm_controller *ctlr)
+{
+ int max_shares = 0;
+ struct ckrm_class *child = NULL;
+ struct ckrm_shares *child_shares, *parent_shares;
+
+ for_each_child(child, parent) {
+ child_shares = ckrm_get_controller_shares(child, ctlr);
+ max_shares = max(max_shares, child_shares->max_shares);
+ }
+
+ parent_shares = ckrm_get_controller_shares(parent, ctlr);
+ parent_shares->cur_max_shares = max_shares;
+}
+
+/*
+ * Return -EINVAL if the child's shares violate self-consistency or
+ * parent-imposed restrictions. Otherwise return 0.
+ *
+ * This involves checking shares between the child and its parent;
+ * the child and itself (userspace can't be trusted).
+ */
+static inline int are_shares_valid(struct ckrm_shares *child,
+ struct ckrm_shares *parent,
+ int current_usage,
+ int min_shares_increase)
+{
+ /*
+ * CHILD <-> PARENT validation
+ * Increases in child's min_shares or max_shares can't exceed
+ * limitations imposed by the parent class.
+ * Only validate this if we have a parent.
+ */
+ if (parent &&
+ ((is_share_quantitative(child->min_shares) &&
+ (min_shares_increase > parent->unused_min_shares)) ||
+ (is_share_quantitative(child->max_shares) &&
+ (child->max_shares > parent->child_shares_divisor))))
+ return -EINVAL;
+
+ /* CHILD validation: is min valid */
+ if (!is_share_valid(child->min_shares))
+ return -EINVAL;
+
+ /* CHILD validation: is max valid */
+ if (!is_share_valid(child->max_shares))
+ return -EINVAL;
+
+ /*
+ * CHILD validation: is divisor quantitative & current_usage
+ * is not more than the new divisor
+ */
+ if (!is_share_quantitative(child->child_shares_divisor) ||
+ (current_usage > child->child_shares_divisor))
+ return -EINVAL;
+
+ /*
+ * CHILD validation: is the new child_shares_divisor large
+ * enough to accommodate the largest max_shares among my children
+ */
+ if (child->child_shares_divisor < child->cur_max_shares)
+ return -EINVAL;
+
+ /* CHILD validation: min <= max */
+ if (is_share_quantitative(child->min_shares) &&
+ is_share_quantitative(child->max_shares) &&
+ (child->min_shares > child->max_shares))
+ return -EINVAL;
+
+ return 0;
+}
+
+/*
+ * Set the resource shares of a child class given the new shares
+ * specified by userspace, the child's current shares, and the parent class'
+ * shares.
+ *
+ * Caller is responsible for holding class->lock of child and parent
+ * classes to protect the shares structures passed to this function.
+ */
+static int ckrm_set_shares(const struct ckrm_shares *new,
+ struct ckrm_shares *child_shares,
+ struct ckrm_shares *parent_shares)
+{
+ int rc, current_usage, min_shares_increase;
+ struct ckrm_shares final_shares;
+
+ BUG_ON(!new || !child_shares);
+
+ final_shares = *child_shares;
+ if (did_share_change(new->child_shares_divisor) &&
+ change_supported(child_shares->child_shares_divisor))
+ final_shares.child_shares_divisor = new->child_shares_divisor;
+ if (did_share_change(new->min_shares) &&
+ change_supported(child_shares->min_shares))
+ final_shares.min_shares = new->min_shares;
+ if (did_share_change(new->max_shares) &&
+ change_supported(child_shares->max_shares))
+ final_shares.max_shares = new->max_shares;
+
+ current_usage = child_shares->child_shares_divisor -
+ child_shares->unused_min_shares;
+ min_shares_increase = final_shares.min_shares;
+ if (is_share_quantitative(child_shares->min_shares))
+ min_shares_increase -= child_shares->min_shares;
+
+ rc = are_shares_valid(&final_shares, parent_shares, current_usage,
+ min_shares_increase);
+ if (rc)
+ return rc; /* new shares would violate restrictions */
+
+ if (did_share_change(new->child_shares_divisor))
+ final_shares.unused_min_shares =
+ (final_shares.child_shares_divisor - current_usage);
+ *child_shares = final_shares;
+ return 0;
+}
+
+int ckrm_set_controller_shares(struct ckrm_class *class,
+ struct ckrm_controller *ctlr,
+ const struct ckrm_shares *new_shares)
+{
+ struct ckrm_shares *shares, *parent_shares;
+ int prev_min, prev_max, rc;
+
+ if (!ctlr->shares_changed)
+ return -EINVAL;
+
+ shares = ckrm_get_controller_shares(class, ctlr);
+ if (!shares)
+ return -EINVAL;
+
+ prev_min = shares->min_shares;
+ prev_max = shares->max_shares;
+
+ if (!ckrm_is_class_root(class))
+ spin_lock(&class->parent->class_lock);
+ spin_lock(&class->class_lock);
+ parent_shares = ckrm_get_controller_shares(class->parent, ctlr);
+ rc = ckrm_set_shares(new_shares, shares, parent_shares);
+ spin_unlock(&class->class_lock);
+
+ if (rc || ckrm_is_class_root(class))
+ goto done;
+
+ /* Notify parent about changes in my shares */
+ ckrm_child_min_shares_changed(parent_shares, prev_min,
+ shares->min_shares);
+ if (prev_max != shares->max_shares)
+ ckrm_set_cur_max_shares(class->parent, ctlr);
+
+done:
+ if (!ckrm_is_class_root(class))
+ spin_unlock(&class->parent->class_lock);
+ if (!rc)
+ ctlr->shares_changed(shares);
+ return rc;
+}
+
+void ckrm_set_shares_to_default(struct ckrm_class *class,
+ struct ckrm_controller *ctlr)
+{
+ struct ckrm_shares shares = {
+ .min_shares = CKRM_SHARE_DONT_CARE,
+ .max_shares = CKRM_SHARE_DONT_CARE,
+ .child_shares_divisor = CKRM_SHARE_DEFAULT_DIVISOR,
+ };
+ ckrm_set_controller_shares(class, ctlr, &shares);
+}
+
Index: linux2617-rc2/kernel/ckrm/Makefile
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/Makefile
+++ linux2617-rc2/kernel/ckrm/Makefile
@@ -1 +1 @@
-obj-y = ckrm.o
+obj-y = ckrm.o ckrm_shares.o
Index: linux2617-rc2/kernel/ckrm/ckrm.c
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/ckrm.c
+++ linux2617-rc2/kernel/ckrm/ckrm.c
@@ -174,6 +174,8 @@ static void do_free_shares_struct(struct
if (shares == NULL)
return;
+ ckrm_set_shares_to_default(class, ctlr);
+
spin_lock(&class->class_lock);
class->shares[ctlr->ctlr_id] = NULL;
spin_unlock(&class->class_lock);
@@ -345,6 +347,26 @@ int ckrm_unregister_controller(struct ck
return remove_controller(ctlr);
}
+/*
+ * Bring the state of CKRM to the initial state.
+ * Only shares of the default class need to be changed to default values.
+ * At this point no user-defined classes should exist.
+ */
+void ckrm_teardown(void)
+{
+ int i;
+ struct ckrm_controller *ctlr;
+
+ BUG_ON(ckrm_num_classes != 0);
+ for (i = 0; i < CKRM_MAX_RES_CTLRS; i++) {
+ ctlr = ckrm_get_controller_by_id(i);
+ if (ctlr) {
+ ckrm_set_shares_to_default(&ckrm_default_class, ctlr);
+ ckrm_put_controller(ctlr);
+ }
+ }
+}
+
EXPORT_SYMBOL_GPL(ckrm_register_controller);
EXPORT_SYMBOL_GPL(ckrm_unregister_controller);
EXPORT_SYMBOL_GPL(ckrm_alloc_class);
@@ -353,3 +375,5 @@ EXPORT_SYMBOL_GPL(ckrm_default_class);
EXPORT_SYMBOL_GPL(ckrm_get_controller_by_name);
EXPORT_SYMBOL_GPL(ckrm_get_controller_by_id);
EXPORT_SYMBOL_GPL(ckrm_put_controller);
+EXPORT_SYMBOL_GPL(ckrm_set_controller_shares);
+EXPORT_SYMBOL_GPL(ckrm_teardown);
Index: linux2617-rc2/include/linux/ckrm_rc.h
===================================================================
--- linux2617-rc2.orig/include/linux/ckrm_rc.h
+++ linux2617-rc2/include/linux/ckrm_rc.h
@@ -73,4 +73,14 @@ static inline int ckrm_is_class_root(con
#define for_each_child(child, parent) \
list_for_each_entry(child, &parent->children, siblings)
+/* Get controller specific shares structure for the given class */
+static inline struct ckrm_shares *ckrm_get_controller_shares(
+ struct ckrm_class *class, struct ckrm_controller *ctlr)
+{
+ if (class && ctlr)
+ return class->shares[ctlr->ctlr_id];
+ else
+ return CKRM_NO_SHARE;
+}
+
#endif /* _LINUX_CKRM_RC_H */
Index: linux2617-rc2/kernel/ckrm/ckrm_local.h
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/ckrm_local.h
+++ linux2617-rc2/kernel/ckrm/ckrm_local.h
@@ -11,3 +11,9 @@ extern void ckrm_put_controller(struct c
extern struct ckrm_class *ckrm_alloc_class(struct ckrm_class *, const char *);
extern int ckrm_free_class(struct ckrm_class *);
extern void ckrm_release_class(struct kref *);
+extern int ckrm_set_controller_shares(struct ckrm_class *,
+ struct ckrm_controller *, const struct ckrm_shares *);
+/* Set the shares for the given class and resource to default values */
+extern void ckrm_set_shares_to_default(struct ckrm_class *,
+ struct ckrm_controller *);
+extern void ckrm_teardown(void);
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
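A note on the share accounting in this patch: a minimal userspace sketch, not the kernel code. It models only the parent/child checks from are_shares_valid() and the bookkeeping from ckrm_child_min_shares_changed(). The struct and function names and the value of SHARE_DONT_CARE are assumptions made for illustration; the real constants live in include/linux/ckrm.h, which this patch does not show in full.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for CKRM_SHARE_DONT_CARE; assumed negative,
 * since the patch treats any share >= 0 as a quantity. */
#define SHARE_DONT_CARE (-1)

struct shares {
	int min_shares;
	int max_shares;
	int child_shares_divisor;
	int unused_min_shares;
};

static int is_quantitative(int share)
{
	return share >= 0;
}

/* Try to give a child new min/max values against its parent's budget. */
static int set_child_shares(struct shares *parent, struct shares *child,
			    int new_min, int new_max)
{
	int increase = new_min;

	assert(parent && child);
	if (is_quantitative(child->min_shares))
		increase -= child->min_shares;

	/* the increase must fit in what the parent has not handed out yet */
	if (is_quantitative(new_min) && increase > parent->unused_min_shares)
		return -EINVAL;
	/* max is expressed as a fraction of the parent's divisor */
	if (is_quantitative(new_max) && new_max > parent->child_shares_divisor)
		return -EINVAL;
	if (is_quantitative(new_min) && is_quantitative(new_max) &&
	    new_min > new_max)
		return -EINVAL;

	/* commit: charge the new minimum, refund the old one */
	if (is_quantitative(new_min))
		parent->unused_min_shares -= new_min;
	if (is_quantitative(child->min_shares))
		parent->unused_min_shares += child->min_shares;
	child->min_shares = new_min;
	child->max_shares = new_max;
	return 0;
}
```

With a parent whose divisor and unused_min_shares are both 100, a first child asking for min=30 leaves 70 unused, and a sibling asking for min=80 is then refused with -EINVAL. Setting a child back to SHARE_DONT_CARE refunds its old minimum to the parent.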
* [RFC] [PATCH 04/12] Add task logic to class
2006-04-21 2:24 [RFC] [PATCH 00/12] CKRM after a major overhaul sekharan
` (2 preceding siblings ...)
2006-04-21 2:24 ` [RFC] [PATCH 03/12] Share Handling sekharan
@ 2006-04-21 2:24 ` sekharan
2006-04-21 2:24 ` [RFC] [PATCH 05/12] Init and clear class info in task sekharan
` (8 subsequent siblings)
12 siblings, 0 replies; 35+ messages in thread
From: sekharan @ 2006-04-21 2:24 UTC (permalink / raw)
To: linux-kernel, ckrm-tech; +Cc: sekharan
04/12 - ckrm_tasksupport
Adds logic to support adding a task to, and removing a task from, a class.
Provides an interface to set a task's class.
--
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Matt Helsley <matthltc@us.ibm.com>
include/linux/sched.h | 4 +
kernel/ckrm/Makefile | 2
kernel/ckrm/ckrm.c | 24 +++++++
kernel/ckrm/ckrm_local.h | 1
kernel/ckrm/ckrm_task.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++
5 files changed, 174 insertions(+), 1 deletion(-)
Index: linux2617-rc2/include/linux/sched.h
===================================================================
--- linux2617-rc2.orig/include/linux/sched.h
+++ linux2617-rc2/include/linux/sched.h
@@ -888,6 +888,10 @@ struct task_struct {
* cache last used pipe for splice
*/
struct pipe_inode_info *splice_pipe;
+#ifdef CONFIG_CKRM
+ struct ckrm_class *class;
+ struct list_head member_list; /* list of tasks in class */
+#endif /* CONFIG_CKRM */
};
static inline pid_t process_group(struct task_struct *tsk)
Index: linux2617-rc2/kernel/ckrm/ckrm_task.c
===================================================================
--- /dev/null
+++ linux2617-rc2/kernel/ckrm/ckrm_task.c
@@ -0,0 +1,144 @@
+/* ckrm_task.c - Class-based Kernel Resource Management (CKRM)
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003,2004
+ * (C) Shailabh Nagar, IBM Corp. 2003
+ * (C) Chandra Seetharaman, IBM Corp. 2003, 2004, 2005
+ * (C) Vivek Kashyap, IBM Corp. 2004
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+#include <linux/sched.h>
+#include <linux/module.h>
+#include "ckrm_local.h"
+
+static inline struct ckrm_class *remove_from_old_class(struct task_struct *tsk)
+{
+ struct ckrm_class *class;
+
+retry:
+ class = tsk->class;
+ if (class == CKRM_NO_CLASS)
+ goto done;
+
+ spin_lock(&class->class_lock);
+ if (class != tsk->class) { /* lost the race, retry */
+ spin_unlock(&class->class_lock);
+ goto retry;
+ }
+ /* take out of old class */
+ list_del_init(&tsk->member_list);
+ tsk->class = CKRM_NO_CLASS;
+ spin_unlock(&class->class_lock);
+done:
+ return class;
+}
+
+static void move_to_new_class(struct task_struct *tsk,
+ struct ckrm_class *newclass)
+{
+ BUG_ON(!list_empty(&tsk->member_list));
+ BUG_ON(tsk->class != CKRM_NO_CLASS);
+
+ spin_lock(&newclass->class_lock);
+ tsk->class = newclass;
+ list_add(&tsk->member_list, &newclass->task_list);
+ spin_unlock(&newclass->class_lock);
+}
+
+static void notify_res_ctlrs(struct task_struct *tsk,
+ struct ckrm_class *oldclass, struct ckrm_class *newclass)
+{
+ int i;
+ struct ckrm_controller *ctlr;
+ struct ckrm_shares *old_shares, *new_shares;
+
+ for (i = 0; i < CKRM_MAX_RES_CTLRS; i++) {
+ ctlr = ckrm_get_controller_by_id(i);
+ if (ctlr == NULL)
+ continue;
+ if (ctlr->move_task) {
+ old_shares = ckrm_get_controller_shares(oldclass, ctlr);
+ new_shares = ckrm_get_controller_shares(newclass, ctlr);
+ ctlr->move_task(tsk, old_shares, new_shares);
+ }
+ ckrm_put_controller(ctlr);
+ }
+}
+
+/*
+ * Change the class of the given task to "newclass"
+ *
+ * Caller is responsible to make sure the task structure stays put
+ * through this function.
+ *
+ * This function should be called without holding class_lock of
+ * newclass and tsk->class.
+ *
+ * Called with a reference to the new class held. The reference is
+ * dropped only when the task is assigned to a different class
+ * or when the task exits.
+ */
+static void ckrm_setclass_internal(struct task_struct *tsk,
+ struct ckrm_class *newclass)
+{
+ struct ckrm_class *oldclass;
+
+retry:
+ oldclass = remove_from_old_class(tsk);
+
+ /* The task is either exiting or is moving to a different class. */
+ if (oldclass == CKRM_NO_CLASS) {
+ /* In the exit path, must succeed */
+ if (newclass == CKRM_NO_CLASS)
+ goto retry;
+ kref_put(&newclass->ref, ckrm_release_class);
+ return;
+ }
+
+ /*
+ * notify resource controllers before we actually set the class
+ * in the task to avoid a race with notify_res_ctlrs being called
+ * from another ckrm_setclass_internal.
+ */
+ notify_res_ctlrs(tsk, oldclass, newclass);
+ if (newclass != CKRM_NO_CLASS)
+ move_to_new_class(tsk, newclass);
+ kref_put(&oldclass->ref, ckrm_release_class);
+}
+
+/*
+ * Set class of the task associated with pid to class.
+ * returns 0 on success, -errno on error.
+ */
+int ckrm_setclass(pid_t pid, struct ckrm_class *class)
+{
+ int rc = 0;
+ struct task_struct *tsk;
+
+ read_lock(&tasklist_lock);
+ tsk = find_task_by_pid(pid);
+ if (tsk == NULL) {
+ read_unlock(&tasklist_lock);
+ return -ESRCH; /* pid not found */
+ }
+ get_task_struct(tsk);
+ read_unlock(&tasklist_lock);
+
+ /* Check permissions */
+ if ((!capable(CAP_SYS_NICE)) &&
+ (!capable(CAP_SYS_RESOURCE)) && (current->user != tsk->user))
+ rc = -EPERM;
+ else {
+ kref_get(&class->ref);
+ ckrm_setclass_internal(tsk, class);
+ }
+ put_task_struct(tsk);
+ return rc;
+}
+EXPORT_SYMBOL_GPL(ckrm_setclass);
Index: linux2617-rc2/kernel/ckrm/Makefile
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/Makefile
+++ linux2617-rc2/kernel/ckrm/Makefile
@@ -1 +1 @@
-obj-y = ckrm.o ckrm_shares.o
+obj-y = ckrm.o ckrm_shares.o ckrm_task.o
Index: linux2617-rc2/kernel/ckrm/ckrm.c
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/ckrm.c
+++ linux2617-rc2/kernel/ckrm/ckrm.c
@@ -251,6 +251,26 @@ static int add_controller(struct ckrm_co
return ret;
}
/*
+ * Helper function to move all tasks in a class to/from the registering
+ * /unregistering resource controller.
+ *
+ * Assumes ctlr is valid and class is initialized with resource
+ * controller's shares.
+ */
+static void move_tasks(struct ckrm_class *class, struct ckrm_controller *ctlr,
+ struct ckrm_shares *from, struct ckrm_shares *to)
+{
+ struct task_struct *tsk;
+
+ if (!ctlr->move_task)
+ return;
+ spin_lock(&class->class_lock);
+ list_for_each_entry(tsk, &class->task_list, member_list)
+ ctlr->move_task(tsk, from, to);
+ spin_unlock(&class->class_lock);
+}
+
+/*
* Interface for registering a resource controller.
*
* Returns the 0 on success, -errno for failure.
@@ -287,6 +307,8 @@ int ckrm_register_controller(struct ckrm
kref_get(&class->ref);
read_unlock(&ckrm_class_lock);
do_alloc_shares_struct(class, ctlr);
+ move_tasks(class, ctlr, CKRM_NO_SHARE,
+ class->shares[ctlr->ctlr_id]);
if (prev_class)
kref_put(&prev_class->ref, ckrm_release_class);
prev_class = class;
@@ -333,6 +355,8 @@ int ckrm_unregister_controller(struct ck
list_for_each_entry_reverse(class, &ckrm_classes, class_list) {
kref_get(&class->ref);
read_unlock(&ckrm_class_lock);
+ move_tasks(class, ctlr, class->shares[ctlr->ctlr_id],
+ CKRM_NO_SHARE);
do_free_shares_struct(class, ctlr);
if (prev_class)
kref_put(&prev_class->ref, ckrm_release_class);
Index: linux2617-rc2/kernel/ckrm/ckrm_local.h
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/ckrm_local.h
+++ linux2617-rc2/kernel/ckrm/ckrm_local.h
@@ -17,3 +17,4 @@ extern int ckrm_set_controller_shares(st
extern void ckrm_set_shares_to_default(struct ckrm_class *,
struct ckrm_controller *);
extern void ckrm_teardown(void);
+extern int ckrm_setclass(pid_t, struct ckrm_class *);
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
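The lock-and-recheck pattern used by remove_from_old_class() can be tried out in userspace. What follows is a sketch under stated assumptions, not the kernel implementation: a C11 atomic_flag stands in for class_lock, a counter stands in for the task list, and all names (klass, detach, attach, move_task) are hypothetical. Reference counting and controller notification are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct klass {
	atomic_flag lock;	/* stand-in for class_lock */
	int ntasks;		/* stand-in for task_list membership */
};

struct task {
	_Atomic(struct klass *) class;	/* NULL plays CKRM_NO_CLASS */
};

static void klass_lock(struct klass *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock,
						 memory_order_acquire))
		;	/* spin until the flag was clear */
}

static void klass_unlock(struct klass *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Detach the task from whatever class it is in; NULL if it has none. */
static struct klass *detach(struct task *t)
{
	struct klass *c;

	for (;;) {
		c = atomic_load(&t->class);
		if (c == NULL)
			return NULL;
		klass_lock(c);
		if (c == atomic_load(&t->class))
			break;		/* class did not change under us */
		klass_unlock(c);	/* lost the race, retry */
	}
	c->ntasks--;
	atomic_store(&t->class, NULL);
	klass_unlock(c);
	return c;
}

static void attach(struct task *t, struct klass *c)
{
	assert(atomic_load(&t->class) == NULL);
	klass_lock(c);
	atomic_store(&t->class, c);
	c->ntasks++;
	klass_unlock(c);
}

/* Returns 0 on success, -1 if another mover already detached the task. */
static int move_task(struct task *t, struct klass *to)
{
	if (detach(t) == NULL)
		return -1;
	attach(t, to);
	return 0;
}
```

The point of the recheck is that the class pointer is read before the lock is taken, so it may be stale by the time the lock is held. Only a mover that still sees the same class after locking is allowed to detach the task.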
* [RFC] [PATCH 05/12] Init and clear class info in task
2006-04-21 2:24 [RFC] [PATCH 00/12] CKRM after a major overhaul sekharan
` (3 preceding siblings ...)
2006-04-21 2:24 ` [RFC] [PATCH 04/12] Add task logic to class sekharan
@ 2006-04-21 2:24 ` sekharan
2006-04-21 2:24 ` [RFC] [PATCH 06/12] Add proc interface to get class info of task sekharan
` (7 subsequent siblings)
12 siblings, 0 replies; 35+ messages in thread
From: sekharan @ 2006-04-21 2:24 UTC (permalink / raw)
To: linux-kernel, ckrm-tech; +Cc: sekharan
05/12 - ckrm_tasksupport_fork_exit_init
Initializes and clears CKRM-specific information in a task at fork() and
exit().
Initializes CKRM (called from start_kernel).
--
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
include/linux/ckrm.h | 7 ++++++
init/main.c | 2 +
kernel/ckrm/ckrm.c | 11 +++++++++
kernel/ckrm/ckrm_local.h | 1
kernel/ckrm/ckrm_task.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++
kernel/exit.c | 2 +
kernel/fork.c | 2 +
7 files changed, 77 insertions(+)
Index: linux2617-rc2/kernel/exit.c
===================================================================
--- linux2617-rc2.orig/kernel/exit.c
+++ linux2617-rc2/kernel/exit.c
@@ -35,6 +35,7 @@
#include <linux/futex.h>
#include <linux/compat.h>
#include <linux/pipe_fs_i.h>
+#include <linux/ckrm.h>
#include <asm/uaccess.h>
#include <asm/unistd.h>
@@ -731,6 +732,7 @@ static void exit_notify(struct task_stru
struct task_struct *t;
struct list_head ptrace_dead, *_p, *_n;
+ ckrm_clear_task(tsk);
if (signal_pending(tsk) && !(tsk->signal->flags & SIGNAL_GROUP_EXIT)
&& !thread_group_empty(tsk)) {
/*
Index: linux2617-rc2/kernel/fork.c
===================================================================
--- linux2617-rc2.orig/kernel/fork.c
+++ linux2617-rc2/kernel/fork.c
@@ -44,6 +44,7 @@
#include <linux/rmap.h>
#include <linux/acct.h>
#include <linux/cn_proc.h>
+#include <linux/ckrm.h>
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
@@ -1214,6 +1215,7 @@ static task_t *copy_process(unsigned lon
total_forks++;
 spin_unlock(&current->sighand->siglock);
write_unlock_irq(&tasklist_lock);
+ ckrm_init_task(p);
proc_fork_connector(p);
return p;
Index: linux2617-rc2/init/main.c
===================================================================
--- linux2617-rc2.orig/init/main.c
+++ linux2617-rc2/init/main.c
@@ -47,6 +47,7 @@
#include <linux/rmap.h>
#include <linux/mempolicy.h>
#include <linux/key.h>
+#include <linux/ckrm.h>
#include <asm/io.h>
#include <asm/bugs.h>
@@ -541,6 +542,7 @@ asmlinkage void __init start_kernel(void
proc_root_init();
#endif
cpuset_init();
+ ckrm_init();
check_bugs();
Index: linux2617-rc2/kernel/ckrm/ckrm.c
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/ckrm.c
+++ linux2617-rc2/kernel/ckrm/ckrm.c
@@ -231,6 +231,7 @@ int ckrm_free_class(struct ckrm_class *c
return -EBUSY;
}
spin_unlock(&class->class_lock);
+ ckrm_move_tasks_to_parent(class);
kref_put(&class->ref, ckrm_release_class);
return 0;
}
@@ -391,6 +392,16 @@ void ckrm_teardown(void)
}
}
+void ckrm_init(void)
+{
+ write_lock(&ckrm_class_lock);
+ list_add_tail(&ckrm_default_class.class_list, &ckrm_classes);
+ write_unlock(&ckrm_class_lock);
+ kref_init(&ckrm_default_class.ref);
+ init_task.class = &ckrm_default_class;
+ ckrm_init_task(&init_task);
+}
+
EXPORT_SYMBOL_GPL(ckrm_register_controller);
EXPORT_SYMBOL_GPL(ckrm_unregister_controller);
EXPORT_SYMBOL_GPL(ckrm_alloc_class);
Index: linux2617-rc2/kernel/ckrm/ckrm_task.c
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/ckrm_task.c
+++ linux2617-rc2/kernel/ckrm/ckrm_task.c
@@ -141,4 +141,56 @@ int ckrm_setclass(pid_t pid, struct ckrm
put_task_struct(tsk);
return rc;
}
+
+void ckrm_init_task(struct task_struct *tsk)
+{
+ struct ckrm_class *class;
+
+ /*
+ * processes inherit their class from their real parent, and
+ * threads inherit class from their process.
+ */
+ if (thread_group_leader(tsk))
+ class = tsk->real_parent->class;
+ else
+ class = tsk->group_leader->class;
+
+ tsk->class = CKRM_NO_CLASS;
+ INIT_LIST_HEAD(&tsk->member_list);
+
+ BUG_ON(class == NULL);
+ kref_get(&class->ref);
+ move_to_new_class(tsk, class);
+ notify_res_ctlrs(tsk, CKRM_NO_CLASS, class);
+}
+
+void ckrm_clear_task(struct task_struct *tsk)
+{
+ ckrm_setclass_internal(tsk, CKRM_NO_CLASS);
+}
+
+/*
+ * Move all tasks in the given class to its parent.
+ */
+void ckrm_move_tasks_to_parent(struct ckrm_class *class)
+{
+ kref_get(&class->ref);
+
+next_task:
+ spin_lock(&class->class_lock);
+ if (!list_empty(&class->task_list)) {
+ struct task_struct *tsk =
+ list_entry(class->task_list.next,
+ struct task_struct, member_list);
+ get_task_struct(tsk);
+ spin_unlock(&class->class_lock);
+ kref_get(&class->parent->ref);
+ ckrm_setclass_internal(tsk, class->parent);
+ put_task_struct(tsk);
+ goto next_task;
+ }
+ spin_unlock(&class->class_lock);
+ kref_put(&class->ref, ckrm_release_class);
+}
+
EXPORT_SYMBOL_GPL(ckrm_setclass);
Index: linux2617-rc2/include/linux/ckrm.h
===================================================================
--- linux2617-rc2.orig/include/linux/ckrm.h
+++ linux2617-rc2/include/linux/ckrm.h
@@ -94,5 +94,12 @@ struct ckrm_class {
struct list_head children; /* head of children */
};
+extern void ckrm_init_task(struct task_struct *);
+extern void ckrm_clear_task(struct task_struct *);
+extern void ckrm_init(void);
+#else /* CONFIG_CKRM */
+static inline void ckrm_init_task(struct task_struct *tsk) { }
+static inline void ckrm_clear_task(struct task_struct *tsk) { }
+static inline void ckrm_init(void) { }
#endif /* CONFIG_CKRM */
#endif /* _LINUX_CKRM_H */
Index: linux2617-rc2/kernel/ckrm/ckrm_local.h
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/ckrm_local.h
+++ linux2617-rc2/kernel/ckrm/ckrm_local.h
@@ -18,3 +18,4 @@ extern void ckrm_set_shares_to_default(s
struct ckrm_controller *);
extern void ckrm_teardown(void);
extern int ckrm_setclass(pid_t, struct ckrm_class *);
+extern void ckrm_move_tasks_to_parent(struct ckrm_class *);
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
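The inheritance rule in ckrm_init_task() is easy to state in isolation: a process takes the class of its real parent, and a thread takes the class of its thread group leader. A small sketch of just that decision, assuming simplified hypothetical structs in place of task_struct and ckrm_class (no locking, no reference counts, no controller notification):

```c
#include <assert.h>
#include <stddef.h>

struct klass {
	const char *name;
};

struct task {
	struct task *real_parent;
	struct task *group_leader;	/* points to itself for a process */
	struct klass *class;
};

/* Pick the class a freshly forked task should start in. */
static struct klass *class_to_inherit(const struct task *t)
{
	assert(t->real_parent && t->group_leader);
	if (t->group_leader == t)	/* new process */
		return t->real_parent->class;
	return t->group_leader->class;	/* new thread */
}
```

One consequence worth noting: if a process has been moved to another class after it was forked, threads it creates later follow the process, not the original parent.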
* [RFC] [PATCH 06/12] Add proc interface to get class info of task
2006-04-21 2:24 [RFC] [PATCH 00/12] CKRM after a major overhaul sekharan
` (4 preceding siblings ...)
2006-04-21 2:24 ` [RFC] [PATCH 05/12] Init and clear class info in task sekharan
@ 2006-04-21 2:24 ` sekharan
2006-04-21 2:24 ` [RFC] [PATCH 07/12] Configfs based filesystem user interface - RCFS sekharan
` (6 subsequent siblings)
12 siblings, 0 replies; 35+ messages in thread
From: sekharan @ 2006-04-21 2:24 UTC (permalink / raw)
To: linux-kernel, ckrm-tech; +Cc: sekharan
06/12: ckrm_tasksupport_procsupport
Adds an interface in /proc to get the class name of a task.
--
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: MAEDA Naoaki <maeda.naoaki@jp.fujitsu.com>
fs/proc/base.c | 19 +++++++++++++++++++
include/linux/ckrm.h | 2 ++
kernel/ckrm/ckrm_task.c | 32 +++++++++++++++++++++++++++++++-
3 files changed, 52 insertions(+), 1 deletion(-)
Index: linux2617-rc2/fs/proc/base.c
===================================================================
--- linux2617-rc2.orig/fs/proc/base.c
+++ linux2617-rc2/fs/proc/base.c
@@ -70,6 +70,7 @@
#include <linux/ptrace.h>
#include <linux/seccomp.h>
#include <linux/cpuset.h>
+#include <linux/ckrm.h>
#include <linux/audit.h>
#include <linux/poll.h>
#include "internal.h"
@@ -115,6 +116,9 @@ enum pid_directory_inos {
#ifdef CONFIG_CPUSETS
PROC_TGID_CPUSET,
#endif
+#ifdef CONFIG_CKRM
+ PROC_TGID_CKRM_CLASS,
+#endif
#ifdef CONFIG_SECURITY
PROC_TGID_ATTR,
PROC_TGID_ATTR_CURRENT,
@@ -156,6 +160,9 @@ enum pid_directory_inos {
#ifdef CONFIG_CPUSETS
PROC_TID_CPUSET,
#endif
+#ifdef CONFIG_CKRM
+ PROC_TID_CKRM_CLASS,
+#endif
#ifdef CONFIG_SECURITY
PROC_TID_ATTR,
PROC_TID_ATTR_CURRENT,
@@ -219,6 +226,9 @@ static struct pid_entry tgid_base_stuff[
#ifdef CONFIG_CPUSETS
E(PROC_TGID_CPUSET, "cpuset", S_IFREG|S_IRUGO),
#endif
+#ifdef CONFIG_CKRM
+ E(PROC_TGID_CKRM_CLASS,"ckrm_class",S_IFREG|S_IRUGO),
+#endif
E(PROC_TGID_OOM_SCORE, "oom_score",S_IFREG|S_IRUGO),
E(PROC_TGID_OOM_ADJUST,"oom_adj", S_IFREG|S_IRUGO|S_IWUSR),
#ifdef CONFIG_AUDITSYSCALL
@@ -261,6 +271,9 @@ static struct pid_entry tid_base_stuff[]
#ifdef CONFIG_CPUSETS
E(PROC_TID_CPUSET, "cpuset", S_IFREG|S_IRUGO),
#endif
+#ifdef CONFIG_CKRM
+ E(PROC_TID_CKRM_CLASS, "ckrm_class",S_IFREG|S_IRUGO),
+#endif
E(PROC_TID_OOM_SCORE, "oom_score",S_IFREG|S_IRUGO),
E(PROC_TID_OOM_ADJUST, "oom_adj", S_IFREG|S_IRUGO|S_IWUSR),
#ifdef CONFIG_AUDITSYSCALL
@@ -1814,6 +1827,12 @@ static struct dentry *proc_pident_lookup
inode->i_fop = &proc_cpuset_operations;
break;
#endif
+#ifdef CONFIG_CKRM
+ case PROC_TID_CKRM_CLASS:
+ case PROC_TGID_CKRM_CLASS:
+ inode->i_fop = &proc_ckrm_class_operations;
+ break;
+#endif
case PROC_TID_OOM_SCORE:
case PROC_TGID_OOM_SCORE:
inode->i_fop = &proc_info_file_operations;
Index: linux2617-rc2/kernel/ckrm/ckrm_task.c
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/ckrm_task.c
+++ linux2617-rc2/kernel/ckrm/ckrm_task.c
@@ -13,7 +13,8 @@
* (at your option) any later version.
*
*/
-#include <linux/sched.h>
+#include <linux/seq_file.h>
+#include <linux/proc_fs.h>
#include <linux/module.h>
#include "ckrm_local.h"
@@ -193,4 +194,33 @@ next_task:
kref_put(&class->ref, ckrm_release_class);
}
+static int proc_ckrm_class_show(struct seq_file *m, void *v)
+{
+ struct task_struct *tsk = m->private;
+ struct ckrm_class *class = tsk->class;
+
+ if (!class)
+ return -EINVAL;
+
+ kref_get(&class->ref);
+ seq_puts(m, "/");
+ if (!ckrm_is_class_root(class))
+ seq_puts(m, class->name);
+ seq_putc(m, '\n');
+ kref_put(&class->ref, ckrm_release_class);
+ return 0;
+}
+
+static int ckrm_class_open(struct inode *inode, struct file *file)
+{
+ struct task_struct *tsk = PROC_I(inode)->task;
+ return single_open(file, proc_ckrm_class_show, tsk);
+}
+
+struct file_operations proc_ckrm_class_operations = {
+ .open = ckrm_class_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
EXPORT_SYMBOL_GPL(ckrm_setclass);
Index: linux2617-rc2/include/linux/ckrm.h
===================================================================
--- linux2617-rc2.orig/include/linux/ckrm.h
+++ linux2617-rc2/include/linux/ckrm.h
@@ -94,6 +94,8 @@ struct ckrm_class {
struct list_head children; /* head of children */
};
+extern struct file_operations proc_ckrm_class_operations;
+
extern void ckrm_init_task(struct task_struct *);
extern void ckrm_clear_task(struct task_struct *);
extern void ckrm_init(void);
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
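For anyone wanting to consume the new proc file from userspace, here is a hedged sketch of a reader. The file name ckrm_class comes from the pid_entry tables in this patch; the helper names are made up for illustration, and the code simply reports failure on a kernel that does not carry this patchset.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Build "/proc/<pid>/ckrm_class"; returns 0 on success, -1 if it won't fit. */
static int ckrm_class_path(pid_t pid, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/proc/%ld/ckrm_class", (long)pid);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read the class name of a task into 'out' with the trailing newline
 * removed. Returns 0 on success, -1 if the file is missing or unreadable.
 */
static int read_ckrm_class(pid_t pid, char *out, size_t len)
{
	char path[64];
	FILE *f;

	assert(out != NULL);
	if (len == 0 || ckrm_class_path(pid, path, sizeof(path)) < 0)
		return -1;
	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	if (fgets(out, (int)len, f) == NULL) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';
	return 0;
}
```

Per proc_ckrm_class_show(), a task in the default class should read back as "/", and any other task as "/" followed by the class name.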
* [RFC] [PATCH 07/12] Configfs based filesystem user interface - RCFS
2006-04-21 2:24 [RFC] [PATCH 00/12] CKRM after a major overhaul sekharan
` (5 preceding siblings ...)
2006-04-21 2:24 ` [RFC] [PATCH 06/12] Add proc interface to get class info of task sekharan
@ 2006-04-21 2:24 ` sekharan
2006-04-21 2:24 ` [RFC] [PATCH 08/12] Add attribute support to RCFS sekharan
` (5 subsequent siblings)
12 siblings, 0 replies; 35+ messages in thread
From: sekharan @ 2006-04-21 2:24 UTC (permalink / raw)
To: linux-kernel, ckrm-tech; +Cc: sekharan
07/12 - ckrm_configfs_rcfs
Creates the filesystem (RCFS) for managing CKRM and hooks it up with configfs.
Provides functions for creating and deleting classes.
--
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@watson.ibm.com>
Signed-Off-By: Matt Helsley <matthltc@us.ibm.com>
init/Kconfig | 12 +++
kernel/ckrm/Makefile | 1
kernel/ckrm/ckrm_rcfs.c | 160 ++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 173 insertions(+)
Index: linux2617-rc2/init/Kconfig
===================================================================
--- linux2617-rc2.orig/init/Kconfig
+++ linux2617-rc2/init/Kconfig
@@ -163,6 +163,18 @@ config CKRM
If you say Y here, enable the Resource Class File System and at least
one of the resource controllers below. Say N if you are unsure.
+config CKRM_RCFS
+ tristate "Resource Control File System (User API for CKRM)"
+ depends on CKRM
+ select CONFIGFS_FS
+ default m
+ help
+ RCFS is the filesystem API for CKRM. Compiling it as a module permits
+ users to only load RCFS if they intend to use CKRM.
+
+ Say M if unsure, Y to save on module loading. N doesn't make sense
+ when CKRM has been configured.
+
endmenu
config SYSCTL
bool "Sysctl support"
Index: linux2617-rc2/kernel/ckrm/ckrm_rcfs.c
===================================================================
--- /dev/null
+++ linux2617-rc2/kernel/ckrm/ckrm_rcfs.c
@@ -0,0 +1,160 @@
+/*
+ * kernel/ckrm/ckrm_rcfs.c
+ *
+ * Copyright (C) Shailabh Nagar, IBM Corp. 2005
+ * Chandra Seetharaman, IBM Corp. 2005, 2006
+ *
+ * Configfs based Resource class filesystem (rcfs) serving the
+ * user interface to Class-based Kernel Resource Management (CKRM).
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ */
+#include <linux/module.h>
+#include <linux/configfs.h>
+#include "ckrm_local.h"
+
+static struct configfs_subsystem rcfs_subsys;
+static struct config_item_type rcfs_class_type;
+
+struct rcfs_class {
+ char *name;
+ struct ckrm_class *core;
+ struct config_group group;
+};
+
+static inline struct rcfs_class *group_to_rcfs_class(struct config_group *grp)
+{
+ return container_of(grp, struct rcfs_class, group);
+}
+
+static inline struct rcfs_class *item_to_rcfs_class(struct config_item *item)
+{
+ return group_to_rcfs_class(to_config_group(item));
+}
+
+static inline struct ckrm_class *group_to_ckrm_class(struct config_group *grp)
+{
+ struct rcfs_class *rclass;
+ /*
+ * A configfs wrinkle forces us to treat the root group as a special
+ * case instead of wrapping the group in a struct rcfs_class like all
+ * other groups.
+ */
+ if (grp == &rcfs_subsys.su_group)
+ return &ckrm_default_class;
+ rclass = group_to_rcfs_class(grp);
+ return rclass->core;
+}
+
+static inline struct ckrm_class *item_to_ckrm_class(struct config_item *item)
+{
+ return group_to_ckrm_class(to_config_group(item));
+}
+
+/*
+ * This is the function that is called when a 'mkdir' command
+ * is issued under our filesystem
+ */
+static struct config_group *make_rcfs_class(struct config_group *group,
+ const char *name)
+{
+ struct rcfs_class *rclass, *rc_par;
+ struct ckrm_class *core, *parent;
+ char *new_name = NULL, *par_name = NULL;
+ int par_sz = 0;
+
+ rclass = kzalloc(sizeof(struct rcfs_class), GFP_KERNEL);
+ if (!rclass)
+ return NULL;
+
+ parent = group_to_ckrm_class(group);
+
+ if (parent != &ckrm_default_class) {
+ rc_par = group_to_rcfs_class(group);
+ par_name = rc_par->name;
+ par_sz = strlen(par_name);
+ }
+ new_name = kmalloc(par_sz + strlen(name) + 2, GFP_KERNEL);
+ if (!new_name)
+ goto noname;
+ if (par_name)
+ sprintf(new_name, "%s/%s", par_name, name);
+ else
+ sprintf(new_name, "%s", name);
+
+ core = ckrm_alloc_class(parent, new_name);
+ if (!core)
+ goto nocore;
+ rclass->core = core;
+ rclass->name = new_name;
+
+ config_group_init_type_name(&rclass->group, name, &rcfs_class_type);
+ return &rclass->group;
+
+nocore:
+ kfree(new_name);
+noname:
+ kfree(rclass);
+ return NULL;
+}
+
+/*
+ * This is the function that is called when a 'rmdir' command
+ * is issued under our filesystem
+ */
+static void rcfs_class_release_item(struct config_item *item)
+{
+ struct rcfs_class *rclass = item_to_rcfs_class(item);
+
+ ckrm_free_class(rclass->core);
+ kfree(rclass->name);
+ kfree(rclass);
+}
+
+static struct configfs_item_operations rcfs_class_item_ops = {
+ .release = rcfs_class_release_item,
+};
+
+static struct configfs_group_operations rcfs_class_group_ops = {
+ .make_group = make_rcfs_class,
+};
+
+static struct config_item_type rcfs_class_type = {
+ .ct_owner = THIS_MODULE,
+ .ct_item_ops = &rcfs_class_item_ops,
+ .ct_group_ops = &rcfs_class_group_ops,
+};
+
+static struct configfs_subsystem rcfs_subsys = {
+ .su_group = {
+ .cg_item = {
+ .ci_namebuf = "ckrm",
+ .ci_type = &rcfs_class_type,
+ }
+ }
+};
+
+static int __init rcfs_init(void)
+{
+ config_group_init(&rcfs_subsys.su_group);
+ init_MUTEX(&rcfs_subsys.su_sem);
+ return configfs_register_subsystem(&rcfs_subsys);
+}
+
+static void __exit rcfs_exit(void)
+{
+ configfs_unregister_subsystem(&rcfs_subsys);
+ ckrm_teardown();
+}
+
+late_initcall(rcfs_init);
+module_exit(rcfs_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("RCFS - Provides an interface to classes and allows control "
+ "of their resource usage");
Index: linux2617-rc2/kernel/ckrm/Makefile
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/Makefile
+++ linux2617-rc2/kernel/ckrm/Makefile
@@ -1 +1,2 @@
obj-y = ckrm.o ckrm_shares.o ckrm_task.o
+obj-$(CONFIG_CKRM_RCFS) += ckrm_rcfs.o
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
^ permalink raw reply [flat|nested] 35+ messages in thread
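The naming step in make_rcfs_class() in the patch above (full class name is "parent/name" for a nested class, plain "name" directly under the root) can be sketched in userspace as follows. This is only an illustration under stated assumptions: build_class_name() is a hypothetical helper, malloc() stands in for kmalloc(), and a NULL parent stands in for the root-class special case.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace helper, not part of the patch.
 * Builds "parent/name", or just "name" when parent is NULL.
 * Returns a malloc()ed string that the caller must free(), or NULL
 * if the allocation fails.
 */
static char *build_class_name(const char *parent, const char *name)
{
	size_t par_sz, len;
	char *full;

	assert(name != NULL);
	par_sz = parent ? strlen(parent) : 0;
	/* +2 covers the '/' separator and the terminating NUL. */
	len = par_sz + strlen(name) + 2;
	full = malloc(len);
	if (!full)
		return NULL;
	if (parent)
		snprintf(full, len, "%s/%s", parent, name);
	else
		snprintf(full, len, "%s", name);
	return full;
}
```

Using snprintf() with the computed length instead of sprintf() is a choice made for this sketch; the size arithmetic is the same as in the patch.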
* [RFC] [PATCH 08/12] Add attribute support to RCFS
2006-04-21 2:24 [RFC] [PATCH 00/12] CKRM after a major overhaul sekharan
` (6 preceding siblings ...)
2006-04-21 2:24 ` [RFC] [PATCH 07/12] Configfs based filesystem user interface - RCFS sekharan
@ 2006-04-21 2:24 ` sekharan
2006-04-21 2:25 ` [RFC] [PATCH 09/12] Add stats file " sekharan
` (4 subsequent siblings)
12 siblings, 0 replies; 35+ messages in thread
From: sekharan @ 2006-04-21 2:24 UTC (permalink / raw)
To: linux-kernel, ckrm-tech; +Cc: sekharan
08/12 - ckrm_configfs_rcfs_attr_support
Adds the basic attribute store and show functions.
--
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@watson.ibm.com>
Signed-Off-By: Matt Helsley <matthltc@us.ibm.com>
kernel/ckrm/ckrm_rcfs.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 50 insertions(+), 1 deletion(-)
Index: linux2617-rc2/kernel/ckrm/ckrm_rcfs.c
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/ckrm_rcfs.c
+++ linux2617-rc2/kernel/ckrm/ckrm_rcfs.c
@@ -14,13 +14,21 @@
* as published by the Free Software Foundation.
*
*/
+#include <linux/ctype.h>
#include <linux/module.h>
#include <linux/configfs.h>
+#include <linux/parser.h>
#include "ckrm_local.h"
static struct configfs_subsystem rcfs_subsys;
static struct config_item_type rcfs_class_type;
+struct class_attribute {
+ struct configfs_attribute configfs_attr;
+ ssize_t (*show)(struct ckrm_class *, char *);
+ int (*store)(struct ckrm_class *, const char *);
+};
+
struct rcfs_class {
char *name;
struct ckrm_class *core;
@@ -56,6 +64,40 @@ static inline struct ckrm_class *item_to
return group_to_ckrm_class(to_config_group(item));
}
+static ssize_t rcfs_attr_show(struct config_item *item,
+ struct configfs_attribute *attr, char *buf)
+{
+ struct class_attribute *class_attr;
+ struct ckrm_class *class = item_to_ckrm_class(item);
+
+ class_attr = container_of(attr, struct class_attribute, configfs_attr);
+ return class_attr->show(class, buf);
+}
+
+static ssize_t rcfs_attr_store(struct config_item *item,
+ struct configfs_attribute *attr, const char *buf,
+ size_t count)
+{
+ char *filtered_buf, *p;
+ ssize_t rc;
+ struct class_attribute *class_attr;
+ struct ckrm_class *class = item_to_ckrm_class(item);
+
+ class_attr = container_of(attr, struct class_attribute, configfs_attr);
+ filtered_buf = kzalloc(count + 1, GFP_KERNEL);
+ if (!filtered_buf)
+ return -ENOMEM;
+ strncpy(filtered_buf, buf, count);
+ for (p = filtered_buf; isprint(*p); ++p)
+ ;
+ *p = '\0';
+ rc = class_attr->store(class, filtered_buf);
+ kfree(filtered_buf);
+ if (rc)
+ return rc;
+ return count;
+}
+
/*
* This is the function that is called when a 'mkdir' command
* is issued under our filesystem
@@ -117,17 +159,24 @@ static void rcfs_class_release_item(stru
}
static struct configfs_item_operations rcfs_class_item_ops = {
- .release = rcfs_class_release_item,
+ .release = rcfs_class_release_item,
+ .show_attribute = rcfs_attr_show,
+ .store_attribute = rcfs_attr_store,
};
static struct configfs_group_operations rcfs_class_group_ops = {
.make_group = make_rcfs_class,
};
+static struct configfs_attribute *class_attrs[] = {
+ NULL
+};
+
static struct config_item_type rcfs_class_type = {
.ct_owner = THIS_MODULE,
.ct_item_ops = &rcfs_class_item_ops,
.ct_group_ops = &rcfs_class_group_ops,
+ .ct_attrs = class_attrs
};
static struct configfs_subsystem rcfs_subsys = {
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
^ permalink raw reply [flat|nested] 35+ messages in thread
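The input filtering done by rcfs_attr_store() in the patch above (copy the written bytes, then cut the string at the first non-printable character, which drops the trailing newline from echo) can be sketched in userspace like this. filter_printable() is a hypothetical name, and calloc()/memcpy() stand in for kzalloc()/strncpy(); treat it as an illustration rather than the patch's code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace helper, not part of the patch.
 * Copies count bytes from buf and truncates the copy at the first
 * non-printable character. Returns a malloc-family string that the
 * caller must free(), or NULL if the allocation fails.
 */
static char *filter_printable(const char *buf, size_t count)
{
	char *out, *p;

	assert(buf != NULL);
	/* One extra zeroed byte guarantees the scan below terminates. */
	out = calloc(count + 1, 1);
	if (!out)
		return NULL;
	memcpy(out, buf, count);
	/* The cast avoids undefined behaviour for negative char values. */
	for (p = out; isprint((unsigned char)*p); ++p)
		;
	*p = '\0';
	return out;
}
```

With this filtering, a write of "res=numtasks\n" reaches the store callback as "res=numtasks", and anything after an embedded control character is discarded.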
* [RFC] [PATCH 09/12] Add stats file support to RCFS
2006-04-21 2:24 [RFC] [PATCH 00/12] CKRM after a major overhaul sekharan
` (7 preceding siblings ...)
2006-04-21 2:24 ` [RFC] [PATCH 08/12] Add attribute support to RCFS sekharan
@ 2006-04-21 2:25 ` sekharan
2006-04-21 2:25 ` [RFC] [PATCH 10/12] Add shares " sekharan
` (3 subsequent siblings)
12 siblings, 0 replies; 35+ messages in thread
From: sekharan @ 2006-04-21 2:25 UTC (permalink / raw)
To: linux-kernel, ckrm-tech; +Cc: sekharan
09/12 - ckrm_configfs_rcfs_stats
Adds attr_store and attr_show support for stats file.
--
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@watson.ibm.com>
Signed-Off-By: Matt Helsley <matthltc@us.ibm.com>
kernel/ckrm/ckrm_rcfs.c | 114 +++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 112 insertions(+), 2 deletions(-)
Index: linux2617-rc2/kernel/ckrm/ckrm_rcfs.c
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/ckrm_rcfs.c
+++ linux2617-rc2/kernel/ckrm/ckrm_rcfs.c
@@ -20,8 +20,104 @@
#include <linux/parser.h>
#include "ckrm_local.h"
-static struct configfs_subsystem rcfs_subsys;
-static struct config_item_type rcfs_class_type;
+#define CKRM_NAME_LEN 20
+
+#define RES_STRING "res"
+
+static ssize_t show_stats(struct ckrm_class *class, char *buf)
+{
+ int i, j = 0, rc = 0;
+ size_t buf_size = PAGE_SIZE-1; /* allow only PAGE_SIZE # of bytes */
+ struct ckrm_controller *ctlr;
+ struct ckrm_shares *shares;
+
+ for (i = 0; i < CKRM_MAX_RES_CTLRS; i++, j = 0) {
+ if (buf_size <= 0)
+ break;
+ ctlr = ckrm_get_controller_by_id(i);
+ if (!ctlr)
+ continue;
+ shares = ckrm_get_controller_shares(class, ctlr);
+ if (shares && ctlr->show_stats)
+ j = ctlr->show_stats(shares, buf, buf_size);
+ ckrm_put_controller(ctlr);
+ rc += j;
+ buf += j;
+ buf_size -= j;
+ }
+ if (i < CKRM_MAX_RES_CTLRS)
+ rc = -ENOSPC;
+ return rc;
+}
+
+enum parse_token_t {
+ parse_res_type, parse_err
+};
+
+static match_table_t parse_tokens = {
+ {parse_res_type, RES_STRING"=%s"},
+ {parse_err, NULL}
+};
+
+static int ckrm_stats_parse(const char *options,
+ char **resname, char **remaining_line)
+{
+ char *p, *str;
+ int rc = -EINVAL;
+
+ if (!options)
+ return -EINVAL;
+
+ while ((p = strsep((char **)&options, ",")) != NULL) {
+ substring_t args[MAX_OPT_ARGS];
+ int token;
+
+ if (!*p)
+ continue;
+ token = match_token(p, parse_tokens, args);
+ if (token == parse_res_type) {
+ *resname = match_strdup(args);
+ str = p + strlen(p) + 1;
+ *remaining_line = kmalloc(strlen(str) + 1, GFP_KERNEL);
+ if (*remaining_line == NULL) {
+ kfree(*resname);
+ *resname = NULL;
+ rc = -ENOMEM;
+ } else {
+ strcpy(*remaining_line, str);
+ rc = 0;
+ }
+ break;
+ }
+ }
+ return rc;
+}
+
+static int reset_stats(struct ckrm_class *class, const char *str)
+{
+ int rc;
+ char *resname = NULL, *statstr = NULL;
+ struct ckrm_controller *ctlr;
+ struct ckrm_shares *shares;
+
+ rc = ckrm_stats_parse(str, &resname, &statstr);
+ if (rc)
+ return rc;
+
+ ctlr = ckrm_get_controller_by_name(resname);
+ if (!ctlr) {
+ rc = -EINVAL;
+ goto done;
+ }
+ shares = ckrm_get_controller_shares(class, ctlr);
+ if (shares && ctlr->reset_stats)
+ rc = ctlr->reset_stats(shares, statstr);
+ ckrm_put_controller(ctlr);
+done:
+ kfree(resname);
+ kfree(statstr);
+ return rc;
+}
struct class_attribute {
struct configfs_attribute configfs_attr;
@@ -29,6 +125,19 @@ struct class_attribute {
int (*store)(struct ckrm_class *, const char *);
};
+struct class_attribute stats_attr = {
+ .configfs_attr = {
+ .ca_name = "stats",
+ .ca_owner = THIS_MODULE,
+ .ca_mode = S_IRUGO | S_IWUSR
+ },
+ .show = show_stats,
+ .store = reset_stats
+};
+
+static struct configfs_subsystem rcfs_subsys;
+static struct config_item_type rcfs_class_type;
+
struct rcfs_class {
char *name;
struct ckrm_class *core;
@@ -169,6 +278,7 @@ static struct configfs_group_operations
};
static struct configfs_attribute *class_attrs[] = {
+ &stats_attr.configfs_attr,
NULL
};
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
^ permalink raw reply [flat|nested] 35+ messages in thread
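show_stats() in the patch above fills a single page by letting each controller append its output and then advancing the buffer pointer and shrinking the remaining size. A minimal userspace sketch of that accounting follows. append_bounded() is a hypothetical helper, not something from the patch, and it makes one assumption explicit: vsnprintf() returns the length that would have been written, so the return value has to be checked against the space left before it is used to advance the pointer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical userspace helper, not part of the patch.
 * Appends formatted text at *buf, which has *remaining bytes left.
 * On success advances *buf, shrinks *remaining and returns the number
 * of bytes appended. Returns -1 and leaves the buffer contents and the
 * accounting unchanged if the text does not fit.
 */
static int append_bounded(char **buf, size_t *remaining, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf != NULL && *buf != NULL && remaining != NULL);
	if (*remaining == 0)
		return -1;
	va_start(ap, fmt);
	n = vsnprintf(*buf, *remaining, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= *remaining) {
		**buf = '\0';	/* discard the partial write */
		return -1;
	}
	*buf += n;
	*remaining -= (size_t)n;
	return n;
}
```

A caller in the style of the show functions would loop over its records, stop at the first -1, and report an out-of-space error for the remainder.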
* [RFC] [PATCH 10/12] Add shares file support to RCFS
2006-04-21 2:24 [RFC] [PATCH 00/12] CKRM after a major overhaul sekharan
` (8 preceding siblings ...)
2006-04-21 2:25 ` [RFC] [PATCH 09/12] Add stats file " sekharan
@ 2006-04-21 2:25 ` sekharan
2006-04-21 2:25 ` [RFC] [PATCH 11/12] Add members " sekharan
` (2 subsequent siblings)
12 siblings, 0 replies; 35+ messages in thread
From: sekharan @ 2006-04-21 2:25 UTC (permalink / raw)
To: linux-kernel, ckrm-tech; +Cc: sekharan
10/12 - ckrm_configfs_rcfs_shares
Adds attr_store and attr_show support for shares file.
--
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@watson.ibm.com>
Signed-Off-By: Matt Helsley <matthltc@us.ibm.com>
kernel/ckrm/ckrm_rcfs.c | 136 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 136 insertions(+)
Index: linux2617-rc2/kernel/ckrm/ckrm_rcfs.c
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/ckrm_rcfs.c
+++ linux2617-rc2/kernel/ckrm/ckrm_rcfs.c
@@ -23,6 +23,9 @@
#define CKRM_NAME_LEN 20
#define RES_STRING "res"
+#define MIN_SHARES_STRING "min_shares"
+#define MAX_SHARES_STRING "max_shares"
+#define CHILD_SHARES_DIVISOR_STRING "child_shares_divisor"
static ssize_t show_stats(struct ckrm_class *class, char *buf)
{
@@ -119,6 +122,128 @@ done:
return rc;
}
+
+enum share_token_t {
+ MIN_SHARES_TOKEN,
+ MAX_SHARES_TOKEN,
+ CHILD_SHARES_DIVISOR_TOKEN,
+ RESOURCE_TYPE_TOKEN,
+ ERROR_TOKEN
+};
+
+/* Token matching for parsing input to this magic file */
+static match_table_t shares_tokens = {
+ {RESOURCE_TYPE_TOKEN, RES_STRING"=%s"},
+ {MIN_SHARES_TOKEN, MIN_SHARES_STRING"=%d"},
+ {MAX_SHARES_TOKEN, MAX_SHARES_STRING"=%d"},
+ {CHILD_SHARES_DIVISOR_TOKEN, CHILD_SHARES_DIVISOR_STRING"=%d"},
+ {ERROR_TOKEN, NULL}
+};
+
+static int shares_parse(const char *options, char **resname,
+ struct ckrm_shares *shares)
+{
+ char *p;
+ int option, rc = -EINVAL;
+
+ *resname = NULL;
+ if (!options)
+ goto done;
+ while ((p = strsep((char **)&options, ",")) != NULL) {
+ substring_t args[MAX_OPT_ARGS];
+ int token;
+
+ if (!*p)
+ continue;
+ token = match_token(p, shares_tokens, args);
+ switch (token) {
+ case RESOURCE_TYPE_TOKEN:
+ if (*resname)
+ goto done;
+ *resname = match_strdup(args);
+ break;
+ case MIN_SHARES_TOKEN:
+ if (match_int(args, &option))
+ goto done;
+ shares->min_shares = option;
+ break;
+ case MAX_SHARES_TOKEN:
+ if (match_int(args, &option))
+ goto done;
+ shares->max_shares = option;
+ break;
+ case CHILD_SHARES_DIVISOR_TOKEN:
+ if (match_int(args, &option))
+ goto done;
+ shares->child_shares_divisor = option;
+ break;
+ default:
+ goto done;
+ }
+ }
+ rc = 0;
+done:
+ if (rc) {
+ kfree(*resname);
+ *resname = NULL;
+ }
+ return rc;
+}
+
+static int set_shares(struct ckrm_class *class, const char *str)
+{
+ char *resname = NULL;
+ int rc;
+ struct ckrm_controller *ctlr;
+ struct ckrm_shares shares = {
+ .min_shares = CKRM_SHARE_UNCHANGED,
+ .max_shares = CKRM_SHARE_UNCHANGED,
+ .child_shares_divisor = CKRM_SHARE_UNCHANGED,
+ };
+
+ rc = shares_parse(str, &resname, &shares);
+ if (!rc) {
+ ctlr = ckrm_get_controller_by_name(resname);
+ if (ctlr) {
+ rc = ckrm_set_controller_shares(class, ctlr, &shares);
+ ckrm_put_controller(ctlr);
+ } else
+ rc = -EINVAL;
+ kfree(resname);
+ }
+ return rc;
+}
+
+static ssize_t show_shares(struct ckrm_class *class, char *buf)
+{
+ int i;
+ ssize_t j, rc = 0, bufsize = PAGE_SIZE;
+ struct ckrm_shares *shares;
+ struct ckrm_controller *ctlr;
+
+ for (i = 0; i < CKRM_MAX_RES_CTLRS; i++) {
+ ctlr = ckrm_get_controller_by_id(i);
+ if (!ctlr)
+ continue;
+ shares = ckrm_get_controller_shares(class, ctlr);
+ if (shares) {
+ if (bufsize <= 0)
+ break;
+ j = snprintf(buf, bufsize, "%s=%s,%s=%d,%s=%d,%s=%d\n",
+ RES_STRING, ctlr->name,
+ MIN_SHARES_STRING, shares->min_shares,
+ MAX_SHARES_STRING, shares->max_shares,
+ CHILD_SHARES_DIVISOR_STRING,
+ shares->child_shares_divisor);
+ rc += j; buf += j; bufsize -= j;
+ }
+ ckrm_put_controller(ctlr);
+ }
+ if (i < CKRM_MAX_RES_CTLRS)
+ rc = -ENOSPC;
+ return rc;
+}
+
struct class_attribute {
struct configfs_attribute configfs_attr;
ssize_t (*show)(struct ckrm_class *, char *);
@@ -135,6 +260,16 @@ struct class_attribute stats_attr = {
.store = reset_stats
};
+struct class_attribute shares_attr = {
+ .configfs_attr = {
+ .ca_name = "shares",
+ .ca_owner = THIS_MODULE,
+ .ca_mode = S_IRUGO | S_IWUSR
+ },
+ .show = show_shares,
+ .store = set_shares
+};
+
static struct configfs_subsystem rcfs_subsys;
static struct config_item_type rcfs_class_type;
@@ -279,6 +414,7 @@ static struct configfs_group_operations
static struct configfs_attribute *class_attrs[] = {
&stats_attr.configfs_attr,
+ &shares_attr.configfs_attr,
NULL
};
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
^ permalink raw reply [flat|nested] 35+ messages in thread
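The comma-separated "key=value" format accepted by the shares file in the patch above can be sketched in userspace as follows. This is an illustration only: struct shares_req, parse_shares() and SHARE_UNCHANGED are hypothetical names (the sentinel value is a placeholder, not the real CKRM_SHARE_UNCHANGED), the fixed buffer sizes are arbitrary, and requiring res= inside the parser is an assumption of this sketch. The keys themselves are the ones defined in the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define SHARE_UNCHANGED INT_MIN	/* placeholder sentinel for this sketch */

struct shares_req {		/* hypothetical stand-in for struct ckrm_shares */
	char res[32];
	int min_shares;
	int max_shares;
	int child_shares_divisor;
};

/* Parses a complete decimal integer; returns 0 on success, -1 otherwise. */
static int parse_int(const char *val, int *out)
{
	char *end;
	long v = strtol(val, &end, 10);

	if (end == val || *end != '\0' || v < INT_MIN || v > INT_MAX)
		return -1;
	*out = (int)v;
	return 0;
}

/* Returns 0 on success, -1 on any malformed, unknown or duplicate field. */
static int parse_shares(const char *line, struct shares_req *req)
{
	char copy[128];
	char *p, *next, *eq;

	assert(line != NULL && req != NULL);
	if (strlen(line) >= sizeof(copy))
		return -1;
	strcpy(copy, line);	/* work on a private, writable copy */
	req->res[0] = '\0';
	req->min_shares = SHARE_UNCHANGED;
	req->max_shares = SHARE_UNCHANGED;
	req->child_shares_divisor = SHARE_UNCHANGED;

	for (p = copy; p && *p; p = next) {
		next = strchr(p, ',');
		if (next)
			*next++ = '\0';
		if (!*p)
			continue;	/* skip empty fields, as the patch does */
		eq = strchr(p, '=');
		if (!eq)
			return -1;
		*eq++ = '\0';
		if (strcmp(p, "res") == 0) {
			if (req->res[0] || !*eq || strlen(eq) >= sizeof(req->res))
				return -1;
			strcpy(req->res, eq);
		} else if (strcmp(p, "min_shares") == 0) {
			if (parse_int(eq, &req->min_shares))
				return -1;
		} else if (strcmp(p, "max_shares") == 0) {
			if (parse_int(eq, &req->max_shares))
				return -1;
		} else if (strcmp(p, "child_shares_divisor") == 0) {
			if (parse_int(eq, &req->child_shares_divisor))
				return -1;
		} else {
			return -1;	/* unknown key */
		}
	}
	return req->res[0] ? 0 : -1;	/* res= is mandatory in this sketch */
}
```

Fields that are not mentioned keep the sentinel, which mirrors how set_shares() pre-fills the structure so that unspecified values are left unchanged.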
* [RFC] [PATCH 11/12] Add members file support to RCFS
2006-04-21 2:24 [RFC] [PATCH 00/12] CKRM after a major overhaul sekharan
` (9 preceding siblings ...)
2006-04-21 2:25 ` [RFC] [PATCH 10/12] Add shares " sekharan
@ 2006-04-21 2:25 ` sekharan
2006-04-21 2:25 ` [RFC] [PATCH 12/12] Documentation for CKRM sekharan
2006-04-21 14:49 ` [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul Dave Hansen
12 siblings, 0 replies; 35+ messages in thread
From: sekharan @ 2006-04-21 2:25 UTC (permalink / raw)
To: linux-kernel, ckrm-tech; +Cc: sekharan
11/12 - ckrm_configfs_rcfs_members
Adds attr_store and attr_show support for members file.
--
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@watson.ibm.com>
Signed-Off-By: Matt Helsley <matthltc@us.ibm.com>
kernel/ckrm/ckrm_rcfs.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 49 insertions(+)
Index: linux2617-rc2/kernel/ckrm/ckrm_rcfs.c
===================================================================
--- linux2617-rc2.orig/kernel/ckrm/ckrm_rcfs.c
+++ linux2617-rc2/kernel/ckrm/ckrm_rcfs.c
@@ -244,6 +244,43 @@ static ssize_t show_shares(struct ckrm_c
return rc;
}
+/*
+ * Given a buffer with a pid in it, add the task with that pid to the class.
+ * Ignores entire buffer after the first pid is parsed.
+ */
+static int add_member(struct ckrm_class *class, const char *str)
+{
+ pid_t pid;
+
+ pid = (pid_t) simple_strtol(str, NULL, 0);
+ if (pid <= 0)
+ return -EINVAL; /* Not a valid pid */
+ return ckrm_setclass(pid, class);
+}
+
+/*
+ * Lists pids of tasks that belong to the given class.
+ */
+static ssize_t show_members(struct ckrm_class *class, char *buf)
+{
+ ssize_t i, rc = 0, bufsize = PAGE_SIZE;
+ struct task_struct *tsk;
+
+ spin_lock(&class->class_lock);
+ list_for_each_entry(tsk, &class->task_list, member_list) {
+ if (bufsize <= 0) {
+ rc = -ENOSPC;
+ break;
+ }
+ if (!tsk->pid) /* Ignore swappers */
+ continue;
+ i = snprintf(buf, bufsize, "%ld\n", (long)tsk->pid);
+ buf += i; rc += i; bufsize -= i;
+ }
+ spin_unlock(&class->class_lock);
+ return rc;
+}
+
struct class_attribute {
struct configfs_attribute configfs_attr;
ssize_t (*show)(struct ckrm_class *, char *);
@@ -270,6 +307,17 @@ struct class_attribute shares_attr = {
.store = set_shares
};
+struct class_attribute members_attr = {
+ .configfs_attr = {
+ .ca_name = "members",
+ .ca_owner = THIS_MODULE,
+ .ca_mode = S_IRUGO | S_IWUSR
+ },
+ .show = show_members,
+ .store = add_member
+};
+
+
static struct configfs_subsystem rcfs_subsys;
static struct config_item_type rcfs_class_type;
@@ -415,6 +463,7 @@ static struct configfs_group_operations
static struct configfs_attribute *class_attrs[] = {
&stats_attr.configfs_attr,
&shares_attr.configfs_attr,
+ &members_attr.configfs_attr,
NULL
};
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
^ permalink raw reply [flat|nested] 35+ messages in thread
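add_member() in the patch above parses the first number in the written string with base 0 and rejects values that are not positive. A userspace sketch of the same parsing, with strtol() standing in for simple_strtol() and parse_pid() as a hypothetical name, is shown below. One consequence of base 0 that is easy to miss: a leading "0" selects octal and a leading "0x" selects hexadecimal.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical userspace helper, not part of the patch.
 * Returns the parsed pid (> 0), or -1 if the string does not start
 * with a valid positive number. Text after the number is ignored,
 * as in the patch.
 */
static long parse_pid(const char *str)
{
	char *end;
	long v;

	assert(str != NULL);
	errno = 0;
	v = strtol(str, &end, 0);	/* base 0: accepts decimal, octal, hex */
	if (end == str || errno == ERANGE || v <= 0 || v > INT_MAX)
		return -1;
	return v;
}
```

So a write of "010" names pid 8 and "0x10" names pid 16, while "abc", "0" and "-5" are rejected.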
* [RFC] [PATCH 12/12] Documentation for CKRM
2006-04-21 2:24 [RFC] [PATCH 00/12] CKRM after a major overhaul sekharan
` (10 preceding siblings ...)
2006-04-21 2:25 ` [RFC] [PATCH 11/12] Add members " sekharan
@ 2006-04-21 2:25 ` sekharan
2006-04-21 14:49 ` [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul Dave Hansen
12 siblings, 0 replies; 35+ messages in thread
From: sekharan @ 2006-04-21 2:25 UTC (permalink / raw)
To: linux-kernel, ckrm-tech; +Cc: sekharan
12/12 - ckrm_docs
Documentation describing important CKRM elements such as classes, shares,
controllers, and the interface provided to userspace via RCFS
--
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Hubertus Franke <frankeh@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@watson.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>
Signed-Off-By: Vivek Kashyap <kashyapv@us.ibm.com>
Signed-Off-By: Matt Helsley <matthltc@us.ibm.com>
Documentation/ckrm/ckrm_basics | 65 ++++++++++++++++++++++++++++++++++++++++
Documentation/ckrm/ckrm_install | 54 +++++++++++++++++++++++++++++++++
Documentation/ckrm/ckrm_usage | 52 ++++++++++++++++++++++++++++++++
3 files changed, 171 insertions(+)
Index: linux-2.6.16/Documentation/ckrm/ckrm_basics
===================================================================
--- /dev/null
+++ linux-2.6.16/Documentation/ckrm/ckrm_basics
@@ -0,0 +1,65 @@
+CKRM Basics
+-------------
+A brief review of CKRM concepts and terminology will help make installation
+and testing easier. For more details, please visit http://ckrm.sf.net.
+
+Concept:
+A user defines a class, associates some amount of resources with the class,
+and associates tasks with the class. Tasks belonging to that class will be
+bound by the amount of resources that are assigned to that class.
+
+RCFS depicts a CKRM class as a directory. A hierarchy of classes can be
+created in which children of a class share resources allotted to
+the parent. Tasks can be assigned to a class at any level.
+There is no correlation between the parent-child relationship of tasks and
+the parent-child relationship of the classes they belong to.
+
+During fork(), a task inherits the class of its parent. A privileged user
+can reassign a task to any class.
+
+Characteristics of a class can be accessed/changed through the following
+files under the directory representing the class:
+
+shares: allows changing shares of different resources managed by the class
+stats: shows statistics associated with each resource managed by the class
+members: allows assignment of tasks to a class and shows tasks that are
+ assigned to a class.
+
+Resource allocation of a class is proportional to the amount of resources
+available to the class's parent.
+Resource allocation for a class is controlled by the parameters:
+
+min_shares: Minimum amount of shares that can be allocated by a class. A
+ special value of DONT_CARE(-3) means that no minimum
+ shares of a resource are specified. This class may not get
+ any resource if the system is running short on resources.
+max_shares: Specifies the maximum amount of resource that is allowed to be
+ allocated by a class. A special value of DONT_CARE(-3) means
+ that no specific limit is specified; this class can get all
+ the resources available.
+child_shares_divisor: total guarantee that is allowed among the children of
+ this class. In other words, the sum of "guarantee"s of all
+ children of this class cannot exceed this number.
+
+Any of these parameters can have a special value, UNSUPPORTED(-2), meaning
+that the specific controller does not support this parameter. A user
+request to change the value will be ignored.
+
+None of these parameters is absolute or has any units associated with
+it. These are just numbers that are used to calculate the absolute amount
+of resource available for a specific class.
+
+In order to make them independent of the type of resource and handle
+complexities like hotplug none of these parameters have units associated
+with them. Furthermore they are not percentages. They are called shares
+because an appropriate analogy would be shares in a stock market.
+
+The absolute amount (for example no. of tasks) of minimum shares available
+for a class is calculated as:
+
+ absolute minimum shares = (parent's absolute amount of resource) *
+ (class's min_shares / parent's child_shares_divisor)
+
+Maximum shares is also calculated in the same way.
+
+Root class is allocated all the resources available in the system. In other
Index: linux-2.6.16/Documentation/ckrm/ckrm_install
===================================================================
--- /dev/null
+++ linux-2.6.16/Documentation/ckrm/ckrm_install
@@ -0,0 +1,54 @@
+Kernel installation
+------------------------------
+
+<kernver> = version of mainline Linux kernel
+<ckrmver> = version of CKRM
+
+Note: It is expected that CKRM versions will change often. Hence once
+a CKRM version has been released for some <kernver>, it will only be made
+available for future <kernver>'s until the next CKRM version is released.
+
+Patches released will specify which version of kernel source that patchset
+is released against.
+
+Core patches will be released in two formats
+ 1. set of patches with a series file (to be used with quilt)
+ 2. a single patch that is inclusive of all the core patches.
+
+Controller patches will be released as a set. An exception would be the
+numtasks controller which would be released as part of the core patchset.
+
+1. Patch
+
+ Apply ckrm-single-<ckrmver>.patch to a mainline kernel
+ tree with version <kernver>.
+
+2. Configure
+
+Select appropriate configuration options:
+
+ Enable configfs filesystem:
+ File systems --->
+ Pseudo filesystems --->
+ <M> Userspace-driven configuration filesystem (EXPERIMENTAL)
+
+ Enable CKRM components:
+ General Setup --->
+ Class Based Kernel Resource Management --->
+ [*] Class Based Kernel Resource Management Core
+ <M> Resource Class File System (User API)
+ [*] Number of Tasks Resource Manager
+
+
+3. Build, boot the kernel
+
+4. Enable rcfs
+
+ # insmod <patchedtree>/fs/configfs/configfs.ko # if compiled as module
+ # insmod <patchedtree>/kernel/ckrm/ckrm_rcfs.ko # if compiled as module
+ # mount -t configfs none /config
+
+ This will create the directory /config/ckrm which is the root of classes.
+
+5. Work with class hierarchy as explained in the file ckrm_usage
+
Index: linux-2.6.16/Documentation/ckrm/ckrm_usage
===================================================================
--- /dev/null
+++ linux-2.6.16/Documentation/ckrm/ckrm_usage
@@ -0,0 +1,52 @@
+Usage of CKRM
+-------------
+
+1. Create a class
+
+ # mkdir /config/ckrm/c1
+ creates a class named c1.
+
+The newly created class directory is automatically populated by magic files
+shares, stats, members, and attrib.
+
+2. View default shares of a class
+
+ # cat /config/ckrm/c1/shares
+ min_shares=-3,max_shares=-3,child_shares_divisor=100
+
+ Above is the default value set for resources that have controllers
+ registered with CKRM.
+
+3. Change shares of a specific resource in a class
+
+ One or more of the following fields can/must be specified
+ res=<res_name> #mandatory
+ min_shares=<number>
+ max_shares=<number>
+ child_shares_divisor=<number>
+ e.g.
+ # echo "res=numtasks,max_shares=20" > /config/ckrm/c1/shares
+
+ If any of these parameters are not specified, the current value will be
+ retained.
+
+4. Reclassify a task
+
+ Write the pid of the process to the destination class's members file
+ # echo 1004 > /config/ckrm/c1/members
+
+5. Get a list of tasks assigned to a class
+
+ # cat /config/ckrm/c1/members
+ lists pids of tasks belonging to c1
+
+6. Get statistics of different resources of a class
+
+ # cat /config/ckrm/c1/stats
+ shows c1's statistics for each registered resource controller.
+
+7. Configuration settings for controllers
+ Configuration values for controllers are available through module
+ parameter interfaces. Consult the controller specific documents for
+ details. For example, numtasks has it available through
+ /sys/module/ckrm_numtasks/parameters.
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
^ permalink raw reply [flat|nested] 35+ messages in thread
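The share formula in ckrm_basics above (parent's absolute amount multiplied by the class's shares over the parent's child_shares_divisor) can be worked through with a small sketch. absolute_share() is a hypothetical helper; how the real controllers round, and how they treat the special values DONT_CARE and UNSUPPORTED, is not covered here and is left as an assumption.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of the patchset.
 * Converts a unitless share value into an absolute amount, given the
 * parent's absolute amount and the parent's child_shares_divisor.
 * Multiplies before dividing so that integer division does not
 * truncate the ratio to zero. Special values are not handled.
 */
static long absolute_share(long parent_abs, int shares, int divisor)
{
	assert(parent_abs >= 0);
	assert(shares >= 0);
	assert(divisor > 0);
	return parent_abs * shares / divisor;
}
```

For example, if the root class has 1000 tasks available and a divisor of 100, a child with min_shares=20 gets an absolute minimum of 200 tasks. If that child in turn has a divisor of 100 and its own child has min_shares=50, the grandchild's absolute minimum is 100 tasks.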
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-21 2:24 [RFC] [PATCH 00/12] CKRM after a major overhaul sekharan
` (11 preceding siblings ...)
2006-04-21 2:25 ` [RFC] [PATCH 12/12] Documentation for CKRM sekharan
@ 2006-04-21 14:49 ` Dave Hansen
2006-04-21 16:58 ` Chandra Seetharaman
12 siblings, 1 reply; 35+ messages in thread
From: Dave Hansen @ 2006-04-21 14:49 UTC (permalink / raw)
To: sekharan; +Cc: linux-kernel, ckrm-tech
On Thu, 2006-04-20 at 19:24 -0700, sekharan@us.ibm.com wrote:
> CKRM has gone through a major overhaul by removing some of the complexity,
> cutting down on features and moving portions to userspace.
What do you want done with these patches? Do you think they are ready
for mainline? -mm? Or, are you just posting here for comments?
-- Dave
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-21 14:49 ` [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul Dave Hansen
@ 2006-04-21 16:58 ` Chandra Seetharaman
2006-04-21 22:57 ` Andrew Morton
0 siblings, 1 reply; 35+ messages in thread
From: Chandra Seetharaman @ 2006-04-21 16:58 UTC (permalink / raw)
To: Dave Hansen; +Cc: linux-kernel, ckrm-tech
On Fri, 2006-04-21 at 07:49 -0700, Dave Hansen wrote:
> On Thu, 2006-04-20 at 19:24 -0700, sekharan@us.ibm.com wrote:
> > CKRM has gone through a major overhaul by removing some of the complexity,
> > cutting down on features and moving portions to userspace.
>
> What do you want done with these patches? Do you think they are ready
> for mainline? -mm? Or, are you just posting here for comments?
>
We think it is ready for -mm. But, want to go through a review cycle in
lkml before i request Andrew for that.
Thanks for asking,
chandra
> -- Dave
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-21 16:58 ` Chandra Seetharaman
@ 2006-04-21 22:57 ` Andrew Morton
2006-04-22 1:48 ` Chandra Seetharaman
0 siblings, 1 reply; 35+ messages in thread
From: Andrew Morton @ 2006-04-21 22:57 UTC (permalink / raw)
To: sekharan; +Cc: haveblue, linux-kernel, ckrm-tech
Chandra Seetharaman <sekharan@us.ibm.com> wrote:
>
> On Fri, 2006-04-21 at 07:49 -0700, Dave Hansen wrote:
> > On Thu, 2006-04-20 at 19:24 -0700, sekharan@us.ibm.com wrote:
> > > CKRM has gone through a major overhaul by removing some of the complexity,
> > > cutting down on features and moving portions to userspace.
> >
> > What do you want done with these patches? Do you think they are ready
> > for mainline? -mm? Or, are you just posting here for comments?
> >
>
> We think it is ready for -mm. But, want to go through a review cycle in
> lkml before i request Andrew for that.
From a quick scan, the overall code quality is probably the best I've seen
for an initial submission of this magnitude. I had a few minor issues and
questions, but it'd need a couple of hours to go through it all.
So. Send 'em over when you're ready.
I have one concern. If we merge this framework into mainline then we'd
(quite reasonably) expect to see an ongoing dribble of new controllers
being submitted. But we haven't seen those controllers yet. So there is a
risk that you'll submit a major new controller (most likely a net or memory
controller) and it will provoke a reviewer revolt. We'd then be in a
situation of cant-go-forward, cant-go-backward.
It would increase the comfort level if we could see what the major
controllers look like before committing. But that's unreasonable.
Could I ask that you briefly enumerate
a) which controllers you think we'll need in the foreseeable future
b) what they need to do
c) pointer to prototype code if poss
Thanks.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-21 22:57 ` Andrew Morton
@ 2006-04-22 1:48 ` Chandra Seetharaman
2006-04-22 2:13 ` Andrew Morton
2006-04-24 1:47 ` Hirokazu Takahashi
0 siblings, 2 replies; 35+ messages in thread
From: Chandra Seetharaman @ 2006-04-22 1:48 UTC (permalink / raw)
To: Andrew Morton; +Cc: haveblue, linux-kernel, ckrm-tech
On Fri, 2006-04-21 at 15:57 -0700, Andrew Morton wrote:
> Chandra Seetharaman <sekharan@us.ibm.com> wrote:
> >
> > On Fri, 2006-04-21 at 07:49 -0700, Dave Hansen wrote:
> > > On Thu, 2006-04-20 at 19:24 -0700, sekharan@us.ibm.com wrote:
> > > > CKRM has gone through a major overhaul by removing some of the complexity,
> > > > cutting down on features and moving portions to userspace.
> > >
> > > What do you want done with these patches? Do you think they are ready
> > > for mainline? -mm? Or, are you just posting here for comments?
> > >
> >
> > We think it is ready for -mm. But, want to go through a review cycle in
> > lkml before i request Andrew for that.
>
> From a quick scan, the overall code quality is probably the best I've seen
> for an initial submission of this magnitude. I had a few minor issues and
Thanks, and thanks to all that helped.
> questions, but it'd need a couple of hours to go through it all.
>
> So. Send 'em over when you're ready.
Great. I will wait a couple of days for comments and then send them
your way.
>
> I have one concern. If we merge this framework into mainline then we'd
> (quite reasonably) expect to see an ongoing dribble of new controllers
> being submitted. But we haven't seen those controllers yet. So there is a
> risk that you'll submit a major new controller (most likely a net or memory
> controller) and it will provoke a reviewer revolt. We'd then be in a
> situation of cant-go-forward, cant-go-backward.
>
I totally understand your concern.
CKRM's design is not tied to a specific implementation of a
controller. It allows hooking up different controllers for the same
resource. If a controller is considered too complex, it can drop some of
its features and be made simpler. Or a simpler controller can replace an
earlier complex controller without affecting the user interface.
This flexibility reduces the "cant-go-forward, cant-go-backward"
problem somewhat.
FYI, we found that managing network resources did not fit into
this task-based model, and we had to invent complex layering to
accommodate it. So, we dropped our plans for network support.
One can write a controller for any resource that can be accounted at the
task level. The corresponding subsystem stakeholders can ensure that it is
clean and at an acceptable level.
> It would increase the comfort level if we could see what the major
> controllers look like before committing. But that's unreasonable.
You might have seen the CPU controller (different implementation than
what we had earlier) and the numtasks controller (can prevent fork
bombs) that followed this patchset.
>
> Could I ask that you briefly enumerate
>
> a) which controllers you think we'll need in the forseeable future
>
Our main objective is to provide resource control for the hardware
resources: CPU, I/O and memory.
We have already posted the CPU controller.
We have two implementations of a memory controller, and an I/O controller.
The memory controller is understandably more complex and controversial, and
that is the reason we haven't posted it this time around (we are looking
at ways to simplify the design and hence reduce the complexity). Both
memory controllers have been posted to linux-mm.
The I/O controller is based on the CFQ scheduler.
> b) what they need to do
Both memory controllers provide control for LRU lists.
- One maintains the active/inactive lists per class for each zone. Its
reclamation path is O(1). The current code is a little complex. We are
looking at ways to simplify it.
- Another creates pseudo zones under each zone (by splitting the
number of pages available in a zone) and attaches them to
each class.
The I/O controller that we are working on is based on the CFQ scheduler and
provides bandwidth control.
>
> c) pointer to prototype code if poss
Both the memory controllers are fully functional. We need to trim them
down.
active/inactive list per class memory controller:
http://prdownloads.sourceforge.net/ckrm/mem_rc-f0.4-2615-v2.tz?download
pzone based memory controller:
http://marc.theaimsgroup.com/?l=ckrm-tech&m=113867467006531&w=2
I/O controller: this controller has not been ported to the framework posted,
but can be taken as a prototype version. The new version would be simpler,
though.
http://prdownloads.sourceforge.net/ckrm/io_rc.tar.bz2?download
Thanks & Regards,
chandra
>
> Thanks.
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-22 1:48 ` Chandra Seetharaman
@ 2006-04-22 2:13 ` Andrew Morton
2006-04-22 2:20 ` Matt Helsley
` (3 more replies)
2006-04-24 1:47 ` Hirokazu Takahashi
1 sibling, 4 replies; 35+ messages in thread
From: Andrew Morton @ 2006-04-22 2:13 UTC (permalink / raw)
To: sekharan; +Cc: haveblue, linux-kernel, ckrm-tech
Chandra Seetharaman <sekharan@us.ibm.com> wrote:
>
> >
> > c) pointer to prototype code if poss
>
> Both the memory controllers are fully functional. We need to trim them
> down.
>
> active/inactive list per class memory controller:
> http://prdownloads.sourceforge.net/ckrm/mem_rc-f0.4-2615-v2.tz?download
Oh my gosh. That converts memory reclaim from per-zone LRU to
per-CKRM-class LRU. If configured.
This is huge. It means that we have basically two quite different versions
of memory reclaim to test and maintain. This is a problem.
(I hope that's the before-we-added-comments version of the patch btw).
> pzone based memory controller:
> http://marc.theaimsgroup.com/?l=ckrm-tech&m=113867467006531&w=2
From a super-quick scan that looks saner. Is it effective? Is this the
way you're planning on proceeding?
This requirement is basically a glorified RLIMIT_RSS manager, isn't it?
Just that it covers a group of mm's and not just the one mm?
Do you attempt to manage just pagecache? So if class A tries to read 10GB
from disk, does that get more aggressively reclaimed based on class A's
resource limits?
This all would have been more comfortable if done on top of the 2.4
kernel's virtual scanner.
(btw, using the term "class" to identify a group of tasks isn't very
comfortable - it's an instance, not a class...)
Worried.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-22 2:13 ` Andrew Morton
@ 2006-04-22 2:20 ` Matt Helsley
2006-04-22 2:33 ` Andrew Morton
2006-04-22 5:28 ` Chandra Seetharaman
` (2 subsequent siblings)
3 siblings, 1 reply; 35+ messages in thread
From: Matt Helsley @ 2006-04-22 2:20 UTC (permalink / raw)
To: Andrew Morton; +Cc: Chandra S. Seetharaman, Dave Hansen, LKML, CKRM-Tech
On Fri, 2006-04-21 at 19:13 -0700, Andrew Morton wrote:
<snip> (I'll let those more familiar with the memory controller efforts
comment on those concerns)
> (btw, using the term "class" to identify a group of tasks isn't very
> comfortable - it's an instance, not a class...)
Yes, I can see how this would be uncomfortable. How about replacing
"class" with "resource group"?
Cheers,
-Matt Helsley
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-22 2:20 ` Matt Helsley
@ 2006-04-22 2:33 ` Andrew Morton
0 siblings, 0 replies; 35+ messages in thread
From: Andrew Morton @ 2006-04-22 2:33 UTC (permalink / raw)
To: Matt Helsley; +Cc: sekharan, haveblue, linux-kernel, ckrm-tech
Matt Helsley <matthltc@us.ibm.com> wrote:
>
> > (btw, using the term "class" to identify a group of tasks isn't very
> > comfortable - it's an instance, not a class...)
>
> Yes, I can see how this would be uncomfortable. How about replacing
> "class" with "resource group"?
Much more comfortable, thanks ;)
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-22 2:13 ` Andrew Morton
2006-04-22 2:20 ` Matt Helsley
@ 2006-04-22 5:28 ` Chandra Seetharaman
2006-04-24 1:10 ` KUROSAWA Takahiro
2006-04-24 5:18 ` Hirokazu Takahashi
2006-04-23 6:52 ` Paul Jackson
2006-04-28 1:58 ` Chandra Seetharaman
3 siblings, 2 replies; 35+ messages in thread
From: Chandra Seetharaman @ 2006-04-22 5:28 UTC (permalink / raw)
To: Andrew Morton
Cc: haveblue, linux-kernel, ckrm-tech, Valerie Clement,
Takahiro Kurosawa
On Fri, 2006-04-21 at 19:13 -0700, Andrew Morton wrote:
> Chandra Seetharaman <sekharan@us.ibm.com> wrote:
> >
> > >
> > > c) pointer to prototype code if poss
> >
> > Both the memory controllers are fully functional. We need to trim them
> > down.
> >
> > active/inactive list per class memory controller:
> > http://prdownloads.sourceforge.net/ckrm/mem_rc-f0.4-2615-v2.tz?download
>
> Oh my gosh. That converts memory reclaim from per-zone LRU to
> per-CKRM-class LRU. If configured.
Yes. We originally had an implementation that would use the existing
per-zone LRU, but the reclamation path was O(n), where n is the number
of classes. So, we moved towards an O(1) algorithm.
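As a rough illustration of the trade-off described above (not code from either memory controller patch), the C sketch below contrasts picking a reclaim victim from one shared LRU list with picking it from a list private to a class. All names (`page_model`, `lru_list`, `shared_reclaim`, `class_reclaim`) are hypothetical, and the shared-list walk is only one assumed way such a path can get more expensive; the thread does not say where the O(n) cost in the earlier implementation came from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: each page remembers the class it is charged to. */
struct page_model {
    int class_id;
    struct page_model *next;
};

/* A singly linked LRU list; head is the first reclaim candidate. */
struct lru_list {
    struct page_model *head;
};

/* Put a page at the front of a list. */
static void lru_add_head(struct lru_list *lru, struct page_model *page)
{
    page->next = lru->head;
    lru->head = page;
}

/*
 * Shared list: walk from the head until a page charged to class_id is
 * found, unlink it and return it. *scanned counts the pages inspected.
 */
static struct page_model *shared_reclaim(struct lru_list *lru, int class_id,
                                         int *scanned)
{
    struct page_model **link = &lru->head;

    *scanned = 0;
    while (*link) {
        struct page_model *page = *link;

        (*scanned)++;
        if (page->class_id == class_id) {
            *link = page->next;
            page->next = NULL;
            return page;
        }
        link = &page->next;
    }
    return NULL;
}

/*
 * Private list: every page on it belongs to the class, so the head can be
 * taken without inspecting any other page.
 */
static struct page_model *class_reclaim(struct lru_list *lru)
{
    struct page_model *page = lru->head;

    if (page) {
        lru->head = page->next;
        page->next = NULL;
    }
    return page;
}
```

With three pages of class 1 queued ahead of one page of class 2 on a shared list, `shared_reclaim()` for class 2 inspects four pages before it finds a victim, whereas `class_reclaim()` on a list holding only class 2 pages returns its head directly.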
>
> This is huge. It means that we have basically two quite different versions
> of memory reclaim to test and maintain. This is a problem.
Understood. We will work on coming up with an acceptable memory controller.
>
> (I hope that's the before-we-added-comments version of the patch btw).
Yes, indeed :). As I said earlier, this patch is not ready for lkml or
-mm yet.
>
> > pzone based memory controller:
> > http://marc.theaimsgroup.com/?l=ckrm-tech&m=113867467006531&w=2
>
> From a super-quick scan that looks saner. Is it effective? Is this the
> way you're planning on proceeding?
>
Yes, it is effective, and the reclamation is O(1) too. It has a couple of
problems by design: (1) it doesn't handle shared pages and (2) it doesn't
provide support for both min_shares and max_shares.
> This requirement is basically a glorified RLIMIT_RSS manager, isn't it?
> Just that it covers a group of mm's and not just the one mm?
Yes, that is the core objective of CKRM: associating resources with a
group of tasks.
>
> Do you attempt to manage just pagecache? So if class A tries to read 10GB
> from disk, does that get more aggressively reclaimed based on class A's
> resource limits?
Yes, it would get more aggressively reclaimed. But if you also have the I/O
controller configured appropriately, only class A will be affected.
>
> This all would have been more comfortable if done on top of the 2.4
> kernel's virtual scanner.
>
> (btw, using the term "class" to identify a group of tasks isn't very
> comfortable - it's an instance, not a class...)
We could go with "Resource Group" as Matt suggested.
>
>
Valerie, KUROSAWA, please feel free to add any more details.
> Worried.
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-22 2:13 ` Andrew Morton
2006-04-22 2:20 ` Matt Helsley
2006-04-22 5:28 ` Chandra Seetharaman
@ 2006-04-23 6:52 ` Paul Jackson
2006-04-23 9:31 ` Matt Helsley
2006-04-28 1:58 ` Chandra Seetharaman
3 siblings, 1 reply; 35+ messages in thread
From: Paul Jackson @ 2006-04-23 6:52 UTC (permalink / raw)
To: Andrew Morton; +Cc: sekharan, haveblue, linux-kernel, ckrm-tech
Andrew wrote:
> (btw, using the term "class" to identify a group of tasks isn't very
> comfortable - it's an instance, not a class...)
Bless you. I objected to the term 'class' a long time ago, but failed
to advance my case in a successful fashion.
Matt replied:
> "resource group"?
Nice.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-23 6:52 ` Paul Jackson
@ 2006-04-23 9:31 ` Matt Helsley
0 siblings, 0 replies; 35+ messages in thread
From: Matt Helsley @ 2006-04-23 9:31 UTC (permalink / raw)
To: Paul Jackson; +Cc: Andrew Morton, sekharan, haveblue, linux-kernel, ckrm-tech
On Sat, 2006-04-22 at 23:52 -0700, Paul Jackson wrote:
> Andrew wrote:
> > (btw, using the term "class" to identify a group of tasks isn't very
> > comfortable - it's an instance, not a class...)
>
> Bless you. I objected to the term 'class' a long time ago, but failed
> to advance my case in a successful fashion.
Well, I wouldn't say you were entirely unsuccessful. I distinctly
remembered your case and I tried to think of suitable names during the
recent changes. Please take a look at the latest set of patches and see
if you think the names are clearer.
> Matt replied:
> > "resource group"?
>
> Nice.
Cheers,
-Matt Helsley
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-22 5:28 ` Chandra Seetharaman
@ 2006-04-24 1:10 ` KUROSAWA Takahiro
2006-04-24 4:39 ` Kirill Korotaev
2006-04-24 5:18 ` Hirokazu Takahashi
1 sibling, 1 reply; 35+ messages in thread
From: KUROSAWA Takahiro @ 2006-04-24 1:10 UTC (permalink / raw)
To: sekharan
Cc: akpm, haveblue, linux-kernel, ckrm-tech,
" Valerie.Clement"
On Fri, 21 Apr 2006 22:28:45 -0700
Chandra Seetharaman <sekharan@us.ibm.com> wrote:
> > > pzone based memory controller:
> > > http://marc.theaimsgroup.com/?l=ckrm-tech&m=113867467006531&w=2
> >
> > From a super-quick scan that looks saner. Is it effective? Is this the
> > way you're planning on proceeding?
>
> Yes, it is effective, and the reclamation is O(1) too. It has couple of
> problems by design, (1) doesn't handle shared pages and (2) doesn't
> provide support for both min_shares and max_shares.
Right. I wanted to show a proof of concept of the pzone based controller,
so I implemented only the minimal features necessary for a memory controller.
So, the pzone based controller still needs development and some cleanup.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-22 1:48 ` Chandra Seetharaman
2006-04-22 2:13 ` Andrew Morton
@ 2006-04-24 1:47 ` Hirokazu Takahashi
2006-04-24 20:42 ` Shailabh Nagar
1 sibling, 1 reply; 35+ messages in thread
From: Hirokazu Takahashi @ 2006-04-24 1:47 UTC (permalink / raw)
To: sekharan; +Cc: akpm, haveblue, linux-kernel, ckrm-tech
Hi Chandra,
> > Could I ask that you briefly enumerate
> >
> > a) which controllers you think we'll need in the forseeable future
> >
>
> Our main object is to provide resource control for the hardware
> resources: CPU, I/O and memory.
>
> We have already posted the CPU controller.
>
> We have two implementations of memory controller and a I/O controller.
>
> Memory controller is understandably more complex and controversial, and
> that is the reason we haven't posted it this time around (we are looking
> at ways to simplify the design and hence the complexity). Both the
> memory controllers has been posted to linux-mm.
>
> I/O controller is based on CFQ-scheduler.
>
> > b) what they need to do
(snip)
> I/O Controller that we are working on is based on CFQ scheduler and
> provides bandwidth control.
> >
> > c) pointer to prototype code if poss
(snip)
> i/o controller: This controller is not ported to the framework posted,
> but can be taken for a prototype version. New version would be simpler
> though.
I think controlling I/O bandwidth is the right way to go.
However, I think you need to change the design of the controller a bit.
A lot of the I/O requests that processes issue will be handled by other
contexts. There are AIO, journaling, pdflush and vmscan, which kernel threads
handle on behalf of the processes.
The current design does not seem to take this into account.
Thanks,
Hirokazu Takahashi.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-24 1:10 ` KUROSAWA Takahiro
@ 2006-04-24 4:39 ` Kirill Korotaev
2006-04-24 5:41 ` KUROSAWA Takahiro
0 siblings, 1 reply; 35+ messages in thread
From: Kirill Korotaev @ 2006-04-24 4:39 UTC (permalink / raw)
To: KUROSAWA Takahiro
Cc: sekharan, akpm, haveblue, linux-kernel, ckrm-tech,
Valerie.Clement
>>>> pzone based memory controller:
>>>> http://marc.theaimsgroup.com/?l=ckrm-tech&m=113867467006531&w=2
>>> From a super-quick scan that looks saner. Is it effective? Is this the
>>> way you're planning on proceeding?
>> Yes, it is effective, and the reclamation is O(1) too. It has couple of
>> problems by design, (1) doesn't handle shared pages and (2) doesn't
>> provide support for both min_shares and max_shares.
>
> Right. I wanted to show proof-of-cencept of the pzone based controller
> and implemented minimal features necessary as the memory controller.
> So, the pzone based controller still needs development and some cleanup.
Just out of curiosity, how was it measured that it is effective?
How does it work when there is a global memory shortage in the system?
Thanks,
Kirill
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-22 5:28 ` Chandra Seetharaman
2006-04-24 1:10 ` KUROSAWA Takahiro
@ 2006-04-24 5:18 ` Hirokazu Takahashi
2006-04-25 1:42 ` Chandra Seetharaman
1 sibling, 1 reply; 35+ messages in thread
From: Hirokazu Takahashi @ 2006-04-24 5:18 UTC (permalink / raw)
To: sekharan
Cc: akpm, haveblue, linux-kernel, ckrm-tech,
" Valerie.Clement", kurosawa
Hi Chandra,
> > > > c) pointer to prototype code if poss
> > >
> > > Both the memory controllers are fully functional. We need to trim them
> > > down.
> > >
> > > active/inactive list per class memory controller:
> > > http://prdownloads.sourceforge.net/ckrm/mem_rc-f0.4-2615-v2.tz?download
> >
> > Oh my gosh. That converts memory reclaim from per-zone LRU to
> > per-CKRM-class LRU. If configured.
>
> Yes. We originally had an implementation that would use the existing
> per-zone LRU, but the reclamation path was O(n), where n is the number
> of classes. So, we moved towards a O(1) algorithm.
>
> >
> > This is huge. It means that we have basically two quite different versions
> > of memory reclaim to test and maintain. This is a problem.
>
> Understood, will work and come up with an acceptable memory controller.
> >
> > (I hope that's the before-we-added-comments version of the patch btw).
>
> Yes, indeed :). As I told earlier this patch is not ready for lkml or -
> mm yet.
> >
> > > pzone based memory controller:
> > > http://marc.theaimsgroup.com/?l=ckrm-tech&m=113867467006531&w=2
> >
> > From a super-quick scan that looks saner. Is it effective? Is this the
> > way you're planning on proceeding?
> >
>
> Yes, it is effective, and the reclamation is O(1) too. It has couple of
> problems by design, (1) doesn't handle shared pages and (2) doesn't
> provide support for both min_shares and max_shares.
I'm not sure all of them have to be managed under ckrm_core and rcfs
in the kernel.
The functions you mentioned can be implemented in user space
to minimize the overhead in usual VM operations, because a quick
response is not expected when resizing. This is a bit different from
the CPU resource.
You don't need to invent everything. I think you can reuse what
the NUMA team is doing instead. This approach may not fit in your rcfs,
though.
> > This requirement is basically a glorified RLIMIT_RSS manager, isn't it?
> > Just that it covers a group of mm's and not just the one mm?
>
> Yes, that is the core object of ckrm, associate resources to a group of
> tasks.
Thanks,
Hirokazu Takahashi.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-24 4:39 ` Kirill Korotaev
@ 2006-04-24 5:41 ` KUROSAWA Takahiro
2006-04-24 6:45 ` Kirill Korotaev
0 siblings, 1 reply; 35+ messages in thread
From: KUROSAWA Takahiro @ 2006-04-24 5:41 UTC (permalink / raw)
To: Kirill Korotaev
Cc: sekharan, akpm, haveblue, linux-kernel, ckrm-tech,
Valerie.Clement
On Mon, 24 Apr 2006 08:39:52 +0400
Kirill Korotaev <dev@openvz.org> wrote:
> >>>> pzone based memory controller:
> >>>> http://marc.theaimsgroup.com/?l=ckrm-tech&m=113867467006531&w=2
> >>> From a super-quick scan that looks saner. Is it effective? Is this the
> >>> way you're planning on proceeding?
> >> Yes, it is effective, and the reclamation is O(1) too. It has couple of
> >> problems by design, (1) doesn't handle shared pages and (2) doesn't
> >> provide support for both min_shares and max_shares.
> >
> > Right. I wanted to show proof-of-cencept of the pzone based controller
> > and implemented minimal features necessary as the memory controller.
> > So, the pzone based controller still needs development and some cleanup.
> Just out of curiosity, how it was meassured that it is effective?
I don't have any benchmark numbers yet, so I can't explain the
effectiveness with numbers. I've been looking for a way to
measure the cost of pzones correctly, but I haven't found one yet.
> How does it work when there is a global memory shortage in the system?
I guess you are referring to the situation where global memory is running
out but there are free pages in pzones. These free pages in pzones are
treated as reserved for pzone users and are not used even during a global
memory shortage.
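The reservation behaviour described above could be modelled roughly as follows. This is a sketch under assumptions, not the pzone patch: `zone_model`, `pzone_model`, `pzone_create`, `global_alloc` and `pzone_alloc` are hypothetical names, and pages are reduced to plain counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a zone: only its count of free pages. */
struct zone_model {
    unsigned long free_pages;
};

/* Hypothetical pseudo zone carved out of a parent zone for one class. */
struct pzone_model {
    struct zone_model *parent;
    unsigned long free_pages;
};

/* Split 'pages' free pages off the zone and hand them to the pzone. */
static bool pzone_create(struct zone_model *zone, struct pzone_model *pz,
                         unsigned long pages)
{
    if (zone->free_pages < pages)
        return false;
    zone->free_pages -= pages;
    pz->parent = zone;
    pz->free_pages = pages;
    return true;
}

/* Allocation by a task outside any pzone: only the zone's own pages count. */
static bool global_alloc(struct zone_model *zone, unsigned long pages)
{
    if (zone->free_pages < pages)
        return false;
    zone->free_pages -= pages;
    return true;
}

/* Allocation by a task in the class that owns the pzone. */
static bool pzone_alloc(struct pzone_model *pz, unsigned long pages)
{
    if (pz->free_pages < pages)
        return false;
    pz->free_pages -= pages;
    return true;
}
```

In this model, once the zone's remaining pages are used up, `global_alloc()` fails even while the pzone still holds free pages, which is the situation the question about global memory shortage is probing.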
Thanks,
--
KUROSAWA, Takahiro
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-24 5:41 ` KUROSAWA Takahiro
@ 2006-04-24 6:45 ` Kirill Korotaev
2006-04-24 7:12 ` KUROSAWA Takahiro
0 siblings, 1 reply; 35+ messages in thread
From: Kirill Korotaev @ 2006-04-24 6:45 UTC (permalink / raw)
To: KUROSAWA Takahiro
Cc: Kirill Korotaev, sekharan, akpm, haveblue, linux-kernel,
ckrm-tech, Valerie.Clement, devel
>>>>Yes, it is effective, and the reclamation is O(1) too. It has couple of
>>>>problems by design, (1) doesn't handle shared pages and (2) doesn't
>>>>provide support for both min_shares and max_shares.
>>>
>>>Right. I wanted to show proof-of-cencept of the pzone based controller
>>>and implemented minimal features necessary as the memory controller.
>>>So, the pzone based controller still needs development and some cleanup.
>>
>>Just out of curiosity, how it was meassured that it is effective?
>
>
> I don't have any benchmark numbers yet, so I can't explain the
> effectiveness with numbers. I've been looking for the way to
> measure the cost of pzones correctly, but I've not found it out yet.
>
>
>>How does it work when there is a global memory shortage in the system?
>
>
> I guess you are referring to the situation that global memory is running
> out but there are free pages in pzones. These free pages in pzones are
> handled as reserved for pzone users and not used even in global memory
> shortage.
OK. Let me explain what I mean.
Imagine the situation with a global memory shortage. In the kernel, there are
threads which do some work on behalf of the user, e.g. kjournald, loop etc. If
the user has some pzone memory, but these threads fail to do their job,
some nasty things can happen (ext3 problems, deadlocks, OOM etc.)
If such behaviour is ok for you, then great. But did you consider it?
Also, I can't understand how it works with the OOM killer. If pzones have
enough memory, but there is a global shortage, who will be killed?
Thanks,
Kirill
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-24 6:45 ` Kirill Korotaev
@ 2006-04-24 7:12 ` KUROSAWA Takahiro
0 siblings, 0 replies; 35+ messages in thread
From: KUROSAWA Takahiro @ 2006-04-24 7:12 UTC (permalink / raw)
To: Kirill Korotaev
Cc: sekharan, akpm, haveblue, linux-kernel, ckrm-tech,
Valerie.Clement, devel
On Mon, 24 Apr 2006 10:45:59 +0400
Kirill Korotaev <dev@openvz.org> wrote:
> >>>>Yes, it is effective, and the reclamation is O(1) too. It has couple of
> >>>>problems by design, (1) doesn't handle shared pages and (2) doesn't
> >>>>provide support for both min_shares and max_shares.
> >>>
> >>>Right. I wanted to show proof-of-cencept of the pzone based controller
> >>>and implemented minimal features necessary as the memory controller.
> >>>So, the pzone based controller still needs development and some cleanup.
> >>
> >>Just out of curiosity, how it was meassured that it is effective?
> >
> > I don't have any benchmark numbers yet, so I can't explain the
> > effectiveness with numbers. I've been looking for the way to
> > measure the cost of pzones correctly, but I've not found it out yet.
> >
> >>How does it work when there is a global memory shortage in the system?
> >
> > I guess you are referring to the situation that global memory is running
> > out but there are free pages in pzones. These free pages in pzones are
> > handled as reserved for pzone users and not used even in global memory
> > shortage.
> ok. Let me explain what I mean.
> Imagine the situation with global memory shortage. In kernel, there are
> threads which do some job behalf the user, e.g. kjournald, loop etc. If
> the user has some pzone memory, but these threads fail to do their job
> some nasty things can happen (ext3 problems, deadlocks, OOM etc.)
> If such behaviour is ok for you, then great. But did you consider it?
>
> Also, I can't understand how it works with OOM killer. If pzones has
> enough memory, but there is a global shortage, who will be killed?
I understand.
IMHO, only the system processes should use global memory.
User processes that may cause such memory shortage should be
enclosed in pzones first.
Thanks,
--
KUROSAWA, Takahiro
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-24 1:47 ` Hirokazu Takahashi
@ 2006-04-24 20:42 ` Shailabh Nagar
0 siblings, 0 replies; 35+ messages in thread
From: Shailabh Nagar @ 2006-04-24 20:42 UTC (permalink / raw)
To: Hirokazu Takahashi; +Cc: sekharan, akpm, haveblue, linux-kernel, ckrm-tech
Hirokazu Takahashi wrote:
>
>
>>i/o controller: This controller is not ported to the framework posted,
>>but can be taken for a prototype version. New version would be simpler
>>though.
>>
>>
>
>I think controlling I/O bandwidth is right way to go.
>
>
Thanks. Obviously we agree heartily :-)
>However, I think you need to change the design of the controller a bit.
>A lot of I/O requests processes issue will be handled by other contexts.
>There are AIO, journaling, pdflush and vmscan, which some kernel threads
>treat instead of the processes.
>
>The current design looks not to care about this.
>
>
Yes. The current design, which builds directly on top of the CFQ
scheduler, does not attempt to treat kernel threads specially in order
to properly account the I/O they're doing on behalf of others. This was
mainly because of the desire to keep the controller simple.
I suspect pdflush and vmscan I/O is never going to be properly
attributable, and journaling may be possible but is unlikely to be worth
it given the risks of throttling it? AIO is likely to be something we
can address if there is consensus that one is willing to pay the price
of tracking the source through the I/O submission layers.
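To make the attribution question concrete, here is a small sketch, assuming a hypothetical per-request origin tag; it is not taken from the CFQ-based controller. `io_request_model`, `io_account`, `charge_by_context` and `charge_by_origin` are invented for illustration, as is the idea that kernel threads run in class 0. It shows how the same request is billed differently depending on whether accounting follows the submitting context or the class on whose behalf the I/O is done.

```c
#include <assert.h>

#define NR_CLASSES 4      /* hypothetical fixed number of classes */
#define KTHREAD_CLASS 0   /* assumed class that kernel threads run in */

/* Hypothetical request carrying both the originator and the submitter. */
struct io_request_model {
    int origin_class;     /* class on whose behalf the I/O is done */
    int submit_class;     /* class of the context that submits it */
    unsigned long bytes;
};

/* Bytes charged to each class. */
struct io_account {
    unsigned long bytes[NR_CLASSES];
};

/* Add bytes to one class, ignoring out-of-range class ids. */
static void charge(struct io_account *acct, int class_id, unsigned long bytes)
{
    if (class_id >= 0 && class_id < NR_CLASSES)
        acct->bytes[class_id] += bytes;
}

/* Bill whoever happens to be submitting, e.g. a kernel thread. */
static void charge_by_context(struct io_account *acct,
                              const struct io_request_model *rq)
{
    charge(acct, rq->submit_class, rq->bytes);
}

/* Bill the class recorded as the source when the request was created. */
static void charge_by_origin(struct io_account *acct,
                             const struct io_request_model *rq)
{
    charge(acct, rq->origin_class, rq->bytes);
}
```

For a 4096-byte request that originates in class 2 but is submitted by a kernel thread, `charge_by_context()` bills class 0 and `charge_by_origin()` bills class 2. Carrying `origin_class` with every request is the tracking cost mentioned above.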
I suppose this would be a good time to dust off the I/O controller and
post it so discussions can become more concrete.
But as always, changes in the design and implementation are
welcome....
Regards,
Shailabh
>Thanks,
>Hirokazu Takahashi.
>
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-24 5:18 ` Hirokazu Takahashi
@ 2006-04-25 1:42 ` Chandra Seetharaman
0 siblings, 0 replies; 35+ messages in thread
From: Chandra Seetharaman @ 2006-04-25 1:42 UTC (permalink / raw)
To: Hirokazu Takahashi
Cc: akpm, haveblue, linux-kernel, ckrm-tech, Valerie.Clement,
kurosawa
On Mon, 2006-04-24 at 14:18 +0900, Hirokazu Takahashi wrote:
> Hi Chandra,
<snip>
> > Yes, it is effective, and the reclamation is O(1) too. It has couple of
> > problems by design, (1) doesn't handle shared pages and (2) doesn't
> > provide support for both min_shares and max_shares.
>
> I'm not sure all of them have to be managed under ckrm_core and rcfs
> in kernel.
>
> These functions you mentioned can be implemented in user space
> to minimize the overhead in usual VM operations because it isn't
> expected quick response to resize it. It is a bit different from
> that of CPU resource.
Agreed, that is where the additional complexity arises from.
If the user can achieve the same results with a user space solution, that
would be good too.
Thanks
chandra
> You don't need to invent everything. I think you can reuse what
> NUMA team is doing instead. This approach may not fit in your rcfs,
> though.
>
> > > This requirement is basically a glorified RLIMIT_RSS manager, isn't it?
> > > Just that it covers a group of mm's and not just the one mm?
> >
> > Yes, that is the core object of ckrm, associate resources to a group of
> > tasks.
>
> Thanks,
> Hirokazu Takahahsi.
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-22 2:13 ` Andrew Morton
` (2 preceding siblings ...)
2006-04-23 6:52 ` Paul Jackson
@ 2006-04-28 1:58 ` Chandra Seetharaman
2006-04-28 6:07 ` Kirill Korotaev
3 siblings, 1 reply; 35+ messages in thread
From: Chandra Seetharaman @ 2006-04-28 1:58 UTC (permalink / raw)
To: Andrew Morton; +Cc: haveblue, linux-kernel, ckrm-tech
On Fri, 2006-04-21 at 19:13 -0700, Andrew Morton wrote:
> Chandra Seetharaman <sekharan@us.ibm.com> wrote:
> >
> > >
> > > c) pointer to prototype code if poss
> >
> > Both the memory controllers are fully functional. We need to trim them
> > down.
> >
> > active/inactive list per class memory controller:
> > http://prdownloads.sourceforge.net/ckrm/mem_rc-f0.4-2615-v2.tz?download
>
> Oh my gosh. That converts memory reclaim from per-zone LRU to
> per-CKRM-class LRU. If configured.
>
> This is huge. It means that we have basically two quite different versions
> of memory reclaim to test and maintain. This is a problem.
>
> (I hope that's the before-we-added-comments version of the patch btw).
>
> > pzone based memory controller:
> > http://marc.theaimsgroup.com/?l=ckrm-tech&m=113867467006531&w=2
>
> From a super-quick scan that looks saner. Is it effective? Is this the
> way you're planning on proceeding?
>
> This requirement is basically a glorified RLIMIT_RSS manager, isn't it?
> Just that it covers a group of mm's and not just the one mm?
>
> Do you attempt to manage just pagecache? So if class A tries to read 10GB
> from disk, does that get more aggressively reclaimed based on class A's
> resource limits?
>
> This all would have been more comfortable if done on top of the 2.4
> kernel's virtual scanner.
>
> (btw, using the term "class" to identify a group of tasks isn't very
> comfortable - it's an instance, not a class...)
>
>
> Worried.
The object of this infrastructure is to get a unified interface for
resource management, irrespective of the resource that is being managed.

As I mentioned in my earlier email, subsystem experts are the ones who
will finally decide what type of resource controller they will accept. With
VM experts' direction and advice, I am positive that we will get an
excellent memory controller (as well as other controllers).

As you might have noticed, we have gone through major changes to reach the
community's acceptance levels. We are now making use of all possible
features (kref, process event connector, configfs, module parameter,
kzalloc) in this infrastructure.

Having a CPU controller, two memory controllers, an I/O controller and a
numtasks controller proves that the infrastructure does handle major
resources nicely and is also capable of managing virtual resources.

Hope I reduced your worries (at least some :).
regards,
chandra
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-28 1:58 ` Chandra Seetharaman
@ 2006-04-28 6:07 ` Kirill Korotaev
2006-04-28 17:57 ` Chandra Seetharaman
0 siblings, 1 reply; 35+ messages in thread
From: Kirill Korotaev @ 2006-04-28 6:07 UTC (permalink / raw)
To: sekharan; +Cc: Andrew Morton, haveblue, linux-kernel, ckrm-tech
>>Worried.
> The object of this infrastructure is to get a unified interface for
> resource management, irrespective of the resource that is being managed.
>
> As I mentioned in my earlier email, subsystem experts are the ones who
> will finally decide what type of resource controller they will accept. With
> VM experts' direction and advice, I am positive that we will get an
> excellent memory controller (as well as other controllers).
>
> As you might have noticed, we have gone through major changes to reach the
> community's acceptance levels. We are now making use of all possible
> features (kref, process event connector, configfs, module parameter,
> kzalloc) in this infrastructure.
>
> Having a CPU controller, two memory controllers, an I/O controller and a
> numtasks controller proves that the infrastructure does handle major
> resources nicely and is also capable of managing virtual resources.
>
> Hope I reduced your worries (at least some :).
Not all :) Let me explain.
Until you provide something more complex than numtasks, this
infrastructure is pure theory. For example, when you add a memory
resource controller with data sharing to your infrastructure, you will
find that changing the CKRM class of a task in a suitable way is almost
impossible. Another possible situation: hierarchical classes with shared
memory are an even more complicated thing.

In both cases you can end up with a poor/complicated/slow solution, or
with dropping some of your infrastructure features (changing class on the
fly, hierarchy), or, which is worse IMHO, with inconsistency between
controllers and interfaces.
Thanks,
Kirill
* Re: [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul
2006-04-28 6:07 ` Kirill Korotaev
@ 2006-04-28 17:57 ` Chandra Seetharaman
0 siblings, 0 replies; 35+ messages in thread
From: Chandra Seetharaman @ 2006-04-28 17:57 UTC (permalink / raw)
To: Kirill Korotaev; +Cc: Andrew Morton, haveblue, linux-kernel, ckrm-tech
On Fri, 2006-04-28 at 10:07 +0400, Kirill Korotaev wrote:
> >>Worried.
> > The object of this infrastructure is to get a unified interface for
> > resource management, irrespective of the resource that is being managed.
> >
> > As I mentioned in my earlier email, subsystem experts are the ones who
> > will finally decide what type of resource controller they will accept. With
> > VM experts' direction and advice, I am positive that we will get an
> > excellent memory controller (as well as other controllers).
> >
> > As you might have noticed, we have gone through major changes to reach the
> > community's acceptance levels. We are now making use of all possible
> > features (kref, process event connector, configfs, module parameter,
> > kzalloc) in this infrastructure.
> >
> > Having a CPU controller, two memory controllers, an I/O controller and a
> > numtasks controller proves that the infrastructure does handle major
> > resources nicely and is also capable of managing virtual resources.
> >
> > Hope I reduced your worries (at least some :).
> Not all :) Let me explain.
>
> Until you provide something more complex than numtasks, this
> infrastructure is pure theory. For example, in your infrastructure, when
> you will add memory resource controller with data sharing, you will face
> that changing CKRM class of the tasks is almost impossible in a suitable
I do not see a problem here; there could be two solutions:
- do not account shared pages against the resource group (put them in
  the default resource group, as some other OSs do).
- when you are moving the task to a different class, calculate the
  resource group's usage depending on how many users are using a
  specific page.
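
A minimal userspace sketch of the second option, assuming each shared page is charged to resource groups in equal fractions per task that maps it. This is an illustration only, not code from the CKRM patches; the names `group_usage` and `move_task` and the dictionary layout are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def group_usage(page_users, task_group):
    """Charge every page to resource groups in proportion to its users.

    page_users: dict mapping page id -> set of task ids mapping that page
    task_group: dict mapping task id -> resource group name
    Returns a dict mapping group name -> charged pages (exact Fractions).
    """
    usage = defaultdict(Fraction)
    for users in page_users.values():
        if not users:
            continue  # unmapped page, nothing to charge
        share = Fraction(1, len(users))  # equal split between all users
        for task in users:
            usage[task_group[task]] += share
    return dict(usage)


def move_task(task, new_group, page_users, task_group):
    """Move a task to another group and recompute the per-group usage.

    Returns the updated task -> group mapping and the new usage table;
    the input mapping is left untouched.
    """
    updated = dict(task_group)
    updated[task] = new_group
    return updated, group_usage(page_users, updated)
```

With this kind of accounting the total charge stays equal to the number of mapped pages, so moving a task only shifts fractions of pages between groups. A real controller would have to do this incrementally rather than by rescanning every page.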
> way. Another possible situation: hierarchical classes with shared memory
> are an even more complicated thing.
Hierarchy is not an issue. The resource controller can calculate the
absolute number of resources (say, the number of pages in this case) when
the shares are assigned and then treat all resource groups as flat.
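
As a rough illustration of that flattening step, the sketch below assumes each class is given a percentage of its parent's allocation and resolves that to an absolute page count once, so later checks can use a flat table. The function name `absolute_limits`, the percentage representation, and the class names in the usage note are hypothetical and not taken from the patches.

```python
def absolute_limits(classes, total_pages):
    """Resolve hierarchical shares into a flat table of absolute limits.

    classes: dict mapping class name -> (parent name or None, percent),
             where percent is the share of the parent's allocation.
    total_pages: pages available to a class that has no parent.
    Returns a dict mapping class name -> absolute page limit.
    """
    limits = {}

    def resolve(name):
        # Compute each class once; parents are resolved on demand.
        if name not in limits:
            parent, percent = classes[name]
            base = total_pages if parent is None else resolve(parent)
            limits[name] = base * percent // 100
        return limits[name]

    for name in classes:
        resolve(name)
    return limits
```

For example, with 1000 pages, a `web` class holding 50% of the root and a `web/cgi` class holding 40% of `web`, the flat table would give `web` 500 pages and `web/cgi` 200 pages. The table would need to be recomputed whenever shares are reassigned.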
>
> In both cases you can end up with a poor/complicated/slow solution or
> dropping some of your infrastructure features (changing class on the fly,
> hierarchy) or which is worse IMHO with inconsistency between controllers
> and interfaces.
I am not convinced (based on the above explanations).
>
> Thanks,
> Kirill
>
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
end of thread, other threads:[~2006-04-28 17:57 UTC | newest]
Thread overview: 35+ messages
2006-04-21 2:24 [RFC] [PATCH 00/12] CKRM after a major overhaul sekharan
2006-04-21 2:24 ` [RFC] [PATCH 01/12] Register/Unregister interface for Controllers sekharan
2006-04-21 2:24 ` [RFC] [PATCH 02/12] Class creation/deletion sekharan
2006-04-21 2:24 ` [RFC] [PATCH 03/12] Share Handling sekharan
2006-04-21 2:24 ` [RFC] [PATCH 04/12] Add task logic to class sekharan
2006-04-21 2:24 ` [RFC] [PATCH 05/12] Init and clear class info in task sekharan
2006-04-21 2:24 ` [RFC] [PATCH 06/12] Add proc interface to get class info of task sekharan
2006-04-21 2:24 ` [RFC] [PATCH 07/12] Configfs based filesystem user interface - RCFS sekharan
2006-04-21 2:24 ` [RFC] [PATCH 08/12] Add attribute support to RCFS sekharan
2006-04-21 2:25 ` [RFC] [PATCH 09/12] Add stats file " sekharan
2006-04-21 2:25 ` [RFC] [PATCH 10/12] Add shares " sekharan
2006-04-21 2:25 ` [RFC] [PATCH 11/12] Add members " sekharan
2006-04-21 2:25 ` [RFC] [PATCH 12/12] Documentation for CKRM sekharan
2006-04-21 14:49 ` [ckrm-tech] [RFC] [PATCH 00/12] CKRM after a major overhaul Dave Hansen
2006-04-21 16:58 ` Chandra Seetharaman
2006-04-21 22:57 ` Andrew Morton
2006-04-22 1:48 ` Chandra Seetharaman
2006-04-22 2:13 ` Andrew Morton
2006-04-22 2:20 ` Matt Helsley
2006-04-22 2:33 ` Andrew Morton
2006-04-22 5:28 ` Chandra Seetharaman
2006-04-24 1:10 ` KUROSAWA Takahiro
2006-04-24 4:39 ` Kirill Korotaev
2006-04-24 5:41 ` KUROSAWA Takahiro
2006-04-24 6:45 ` Kirill Korotaev
2006-04-24 7:12 ` KUROSAWA Takahiro
2006-04-24 5:18 ` Hirokazu Takahashi
2006-04-25 1:42 ` Chandra Seetharaman
2006-04-23 6:52 ` Paul Jackson
2006-04-23 9:31 ` Matt Helsley
2006-04-28 1:58 ` Chandra Seetharaman
2006-04-28 6:07 ` Kirill Korotaev
2006-04-28 17:57 ` Chandra Seetharaman
2006-04-24 1:47 ` Hirokazu Takahashi
2006-04-24 20:42 ` Shailabh Nagar