* [PATCH -mm v2 0/3] sysv ipc: increase msgmnb with the number of cpus
@ 2008-07-15 21:14 Solofo.Ramangalahy
2008-07-15 21:14 ` [PATCH -mm v2 1/3] sysv ipc: increase msgmnb default value wrt. " Solofo.Ramangalahy
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Solofo.Ramangalahy @ 2008-07-15 21:14 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, Matt Helsley, Mingming Cao, Nadia Derbey,
Manfred Spraul
The size in bytes of a SysV IPC message queue, msgmnb, is too small
for large machines, but we don't want to bloat small machines.
This series increases ("scales") the default value of
/proc/sys/kernel/msgmnb.
Several methods are used already to modify (mainly increase) msgmnb:
. distribution specific patch (e.g. openSUSE)
. system wide sysctl.conf
. application specific tuning via /proc/sys/kernel/msgmnb
Integrating this series would:
. reflect hardware and software evolutions and diversity,
. reduce configuration/tuning for the applications.
Here is the timeline of the evolution of MSG* #defines:
Year            1994   1999    1999    2008    2008
Version         1.0    2.3.27  2.3.30  2.6.25  2.6.26
#define MSGMNI  128    128     16      16      16-32768
#define MSGMAX  4056   8192    8192    8192    8192
#define MSGMNB  16384  16384   16384   16384   16384
This series increases msgmnb with respect to the number of cpus/cores
for larger machines. For uniprocessor machines the value does not
increase. The scaling factor is at most 4x, which leads to a value of
65536.
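As an illustration only, the default described above can be modelled
in user space as follows. This is a sketch, not the kernel code: the
function name scaled_msgmnb() is hypothetical, and the CPU count is
passed in as a parameter instead of being read from
num_online_cpus().

```c
#define MSGMNB        16384  /* historical default queue size, in bytes */
#define MSG_CPU_SCALE 4      /* cap on the scaling factor (see patch 1) */

/*
 * User-space model of min(MSGMNB * cpus, MSGMNB * MSG_CPU_SCALE).
 * scaled_msgmnb() is a hypothetical name used for this sketch only.
 */
unsigned int scaled_msgmnb(unsigned int online_cpus)
{
	unsigned int factor = online_cpus;

	if (factor < 1)
		factor = 1;              /* assumption: treat 0 like a UP machine */
	if (factor > MSG_CPU_SCALE)
		factor = MSG_CPU_SCALE;  /* never more than 4x, i.e. 65536 */
	return MSGMNB * factor;
}
```

With this model, 1 cpu gives 16384, 2 cpus give 32768, and 4 or more
cpus give 65536.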
This series is similar to (and depends on) the series which scales
msgmni, the number of IPC message queue identifiers, to the amount of
low memory.
While Nadia's series scaled msgmni along the memory axis, hence the
message pool (msgmni x msgmnb), this series uses a second axis: the
number of online CPUs.
As well as covering the (cpu, memory) space of machine sizes, this
reflects the parallelism allowed by lockless send/receive for
in-flight messages in queues (msgmnb / msgmax messages).
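To make the "msgmnb / msgmax messages" figure concrete, here is a
small worked sketch. The helper name full_size_messages_per_queue()
is hypothetical; it only performs the division mentioned above, using
the MSGMAX value from the timeline (8192).

```c
/*
 * Number of maximum-sized messages that fit in one queue.
 * Hypothetical helper, for illustration only.
 */
unsigned int full_size_messages_per_queue(unsigned int msgmnb,
					  unsigned int msgmax)
{
	if (msgmax == 0)
		return 0;  /* guard for the sketch, avoids dividing by zero */
	return msgmnb / msgmax;
}
```

With msgmax = 8192, the unscaled msgmnb of 16384 leaves room for 2
full-sized messages per queue, and the scaled maximum of 65536 leaves
room for 8.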
The initial scaling is done at initialization of the ipc namespace.
Furthermore, the value becomes dynamic with respect to cpu hotplug,
decreasing/increasing when a cpu is taken offline/online.
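The event handling can be sketched in user space as below. This is a
simplified model under stated assumptions: struct ns_model, the EV_*
names and the model_* functions are hypothetical stand-ins, the
online cpu count is a plain field, and the msgmni recomputation is
reduced to a counter. It only shows that a cpu event recomputes
msgmnb and then falls through to the msgmni recomputation, while the
other events leave msgmnb alone.

```c
#define MSGMNB        16384
#define MSG_CPU_SCALE 4

/* Hypothetical stand-ins for the IPCNS_* notification values. */
enum ipcns_event { EV_MEMCHANGED = 1, EV_CREATED, EV_REMOVED, EV_CPUCHANGED };

struct ns_model {
	unsigned int online_cpus;        /* stand-in for num_online_cpus() */
	unsigned int msgmnb;             /* modelled msg_ctlmnb */
	unsigned int msgmni_recomputed;  /* counts modelled msgmni recomputations */
};

void model_recompute_msgmnb(struct ns_model *ns)
{
	unsigned int factor = ns->online_cpus;

	if (factor < 1)
		factor = 1;
	if (factor > MSG_CPU_SCALE)
		factor = MSG_CPU_SCALE;
	ns->msgmnb = MSGMNB * factor;
}

void model_handle_event(struct ns_model *ns, enum ipcns_event ev)
{
	switch (ev) {
	case EV_CPUCHANGED:
		model_recompute_msgmnb(ns);
		/* fall through - msgmni depends on msgmnb */
	case EV_MEMCHANGED:
	case EV_CREATED:
	case EV_REMOVED:
		ns->msgmni_recomputed++;
		break;
	default:
		break;
	}
}
```

In this model, bringing a machine from 1 to 8 cpus and delivering
EV_CPUCHANGED moves msgmnb from 16384 to 65536; a later
EV_MEMCHANGED only bumps the msgmni counter.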
Like /proc/sys/kernel/msgmni scaling, which can be deactivated via
/proc/sys/kernel/auto_msgmni, /proc/sys/kernel/msgmnb scaling can be
deactivated via /proc/sys/kernel/auto_msgmnb.
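For applications that prefer to toggle the setting programmatically
rather than with echo, a minimal user-space sketch follows. The
helper names write_tunable() and read_tunable() are hypothetical, and
the auto_msgmnb file only exists on a kernel carrying this series;
writing to it normally requires root.

```c
#include <stdio.h>

/* Write an integer to a sysctl-style file; 0 on success, -1 on error. */
int write_tunable(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", value) > 0;
	if (fclose(f) != 0)  /* the write may only fail at flush time */
		ok = 0;
	return ok ? 0 : -1;
}

/* Read an integer back from the file; 0 on success, -1 on error. */
int read_tunable(const char *path, int *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%d", value) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

A caller would, for example, use
write_tunable("/proc/sys/kernel/auto_msgmnb", 0) to stop the
recomputation and write_tunable("/proc/sys/kernel/auto_msgmnb", 1) to
resume it, checking the return value in both cases.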
The msgmni and msgmnb values become dependent, as the value of msgmni
is computed with respect to the value of msgmnb.
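The dependency can be illustrated with a user-space model of the
msgmni computation. This is a sketch under assumptions: model_msgmni()
is a hypothetical name, low memory is passed in as a byte count
instead of coming from si_meminfo(), and the clamping to
[MSGMNI, IPCMNI] is a simplification that may differ in detail from
recompute_msgmni().

```c
#define MSG_MEM_SCALE 32     /* message pool limited to 1/32 of lowmem */
#define MSGMNI        16     /* lower bound used by this sketch */
#define IPCMNI        32768  /* upper bound used by this sketch */

/*
 * Model: msgmni = (lowmem / MSG_MEM_SCALE) / msgmnb / nb_ns,
 * clamped to [MSGMNI, IPCMNI] (the clamping is an assumption).
 */
unsigned int model_msgmni(unsigned long long lowmem_bytes,
			  unsigned int msgmnb, unsigned int nb_ns)
{
	unsigned long long allowed;

	if (msgmnb == 0 || nb_ns == 0)
		return MSGMNI;  /* guard for the sketch */
	allowed = (lowmem_bytes / MSG_MEM_SCALE) / msgmnb;
	allowed /= nb_ns;
	if (allowed < MSGMNI)
		return MSGMNI;
	if (allowed > IPCMNI)
		return IPCMNI;
	return (unsigned int)allowed;
}
```

With 1 GB of low memory and a single ipc namespace, the model gives
2048 identifiers for msgmnb = 16384 and 512 for msgmnb = 65536: the
queues get larger, there are fewer of them, and the total pool stays
within the same memory bound.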
Other solutions are possible, such as using a dbus/hal daemon. This
series seems light enough not to need to go to user space. In
particular, the computation formula is simple and the limit is rather
low (x4 in more than 10 years), which limits memory consumption for
0-sized messages.
The series is as follows:
. patch 1 introduces the scaling function
. patch 2 deals with cpu hotplug
. patch 3 deals with auto_msgmnb deactivation/reactivation
---
The series applies to 2.6.26-rc8-mm1 + Nadia's patch
"[PATCH 1/1] IPC - Do not use a negative value to re-enable msgmni automatic recomputing"
http://lkml.org/lkml/2008/7/3/135
Documentation/sysctl/kernel.txt | 35 +++++++++++++++++
include/linux/ipc_namespace.h | 6 ++
include/linux/msg.h | 7 +++
ipc/ipc_sysctl.c | 81 +++++++++++++++++++++++++++++++++++++---
ipc/ipcns_notifier.c | 67 ++++++++++++++++++++++++++++++---
ipc/msg.c | 22 ++++++++--
ipc/namespace.c | 2
ipc/util.c | 29 ++++++++++++++
ipc/util.h | 1
9 files changed, 235 insertions(+), 15 deletions(-)
Changelog:
The following changes have been made compared to the patch (V1) for
2.6.26-rc5-mm3 (http://lkml.org/lkml/2008/6/24/272):
. adapt to new auto_ method to avoid using a negative value.
This also addresses the remark about not fully using
notifiers [Nadia]
(http://lkml.org/lkml/2008/6/10/31)
. corrected incremental patch compilation [Andrew]
(http://lkml.org/lkml/2008/7/1/490)
. renamed recompute_msgmnb to ipc_recompute_msgmnb [Andrew]
(http://lkml.org/lkml/2008/7/1/490)
. Add note about 0-sized messages [Manfred, Nadia]
(http://lkml.org/lkml/2008/6/25/112)
if needed, an additional constraint on memory could easily be added.
The following changes have been made compared to the RFC,
(http://lkml.org/lkml/2008/6/6/30):
. reduce use of "scale" word which leads to confusion about the fact
that this is not directly a performance patch [Nick]
. mention that the formula is simple, not needing logarithm (or user
space) [Nick]
. example of distribution using patch [Manfred]
. mention hal/dbus daemon [Manfred]
. do not reenable msgmni recomputation when reenabling msgmnb. It
suffices to do a one shot recomputation [Nadia]
. patch 3 and 4 have been merged with patch 1 [Nadia]
. Integrated documentation with patch [Matt]
. corrected a bug in the last patch
(forgot to add braces when adding statement in if)
--
* [PATCH -mm v2 1/3] sysv ipc: increase msgmnb default value wrt. the number of cpus
2008-07-15 21:14 [PATCH -mm v2 0/3] sysv ipc: increase msgmnb with the number of cpus Solofo.Ramangalahy
@ 2008-07-15 21:14 ` Solofo.Ramangalahy
2008-07-15 21:14 ` [PATCH -mm v2 2/3] sysv ipc: recompute msgmnb (and msgmni) on cpu hotplug addition and removal Solofo.Ramangalahy
2008-07-15 21:14 ` [PATCH -mm v2 3/3] sysv ipc: use auto_msgmnb to desactivate and reactivate msgmnb recomputation Solofo.Ramangalahy
2 siblings, 0 replies; 5+ messages in thread
From: Solofo.Ramangalahy @ 2008-07-15 21:14 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, Matt Helsley, Mingming Cao, Nadia Derbey,
Manfred Spraul, Solofo Ramangalahy
[-- Attachment #1: ipc-scale-msgmnb-with-the-number-of-cpus.patch --]
[-- Type: text/plain, Size: 5250 bytes --]
From: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net>
Initialize msgmnb value to
min(MSGMNB * num_online_cpus(), MSGMNB * MSG_CPU_SCALE)
to increase the default value for larger machines.
MSG_CPU_SCALE scaling factor is defined to be 4, as 16384 x 4 = 65536
is an already used and recommended value.
The msgmni value is made dependent on msgmnb to keep the memory
dedicated to message queues within the 1/MSG_MEM_SCALE of lowmem
bound.
Unlike msgmni, the value is not scaled (down) with respect to the
number of ipc namespaces for simplicity.
To disable recomputation when the user explicitly sets a value,
we reuse the callback defined for msgmni.
As msgmni and msgmnb are correlated, a user setting of either of the
two disables recomputation of both, for now. This is refined in a
later patch.
When a negative value is put in /proc/sys/kernel/msgmnb
automatic recomputing is re-enabled.
Signed-off-by: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net>
---
include/linux/msg.h | 7 +++++++
ipc/ipc_sysctl.c | 5 +++--
ipc/msg.c | 22 ++++++++++++++++++----
ipc/util.h | 1 +
4 files changed, 29 insertions(+), 6 deletions(-)
Index: linux-2.6.26-rc8-mm1-MSGMNB3/ipc/msg.c
===================================================================
--- linux-2.6.26-rc8-mm1-MSGMNB3.orig/ipc/msg.c
+++ linux-2.6.26-rc8-mm1-MSGMNB3/ipc/msg.c
@@ -38,6 +38,7 @@
#include <linux/rwsem.h>
#include <linux/nsproxy.h>
#include <linux/ipc_namespace.h>
+#include <linux/cpumask.h>
#include <asm/current.h>
#include <asm/uaccess.h>
@@ -92,7 +93,7 @@ void recompute_msgmni(struct ipc_namespa
si_meminfo(&i);
allowed = (((i.totalram - i.totalhigh) / MSG_MEM_SCALE) * i.mem_unit)
- / MSGMNB;
+ / ns->msg_ctlmnb;
nb_ns = atomic_read(&nr_ipc_ns);
allowed /= nb_ns;
@@ -108,11 +109,24 @@ void recompute_msgmni(struct ipc_namespa
ns->msg_ctlmni = allowed;
}
+/*
+ * Scale msgmnb with the number of online cpus, up to 4x MSGMNB. For
+ * simplicity, it is not adjusted wrt. ipc namespaces. This is done
+ * indirectly by msgmni for the whole message pool. Zero-sized
+ * messages can be enqueued up to msgmnb consuming memory. If this
+ * becomes a problem, this function can be updated with a constraint
+ * on low memory like in recompute_msgmni().
+ */
+void ipc_recompute_msgmnb(struct ipc_namespace *ns)
+{
+ ns->msg_ctlmnb =
+ min(MSGMNB * num_online_cpus(), MSGMNB * MSG_CPU_SCALE);
+}
void msg_init_ns(struct ipc_namespace *ns)
{
ns->msg_ctlmax = MSGMAX;
- ns->msg_ctlmnb = MSGMNB;
+ ipc_recompute_msgmnb(ns);
recompute_msgmni(ns);
@@ -132,8 +146,8 @@ void __init msg_init(void)
{
msg_init_ns(&init_ipc_ns);
- printk(KERN_INFO "msgmni has been set to %d\n",
- init_ipc_ns.msg_ctlmni);
+ printk(KERN_INFO "msgmni has been set to %d, msgmnb to %d\n",
+ init_ipc_ns.msg_ctlmni, init_ipc_ns.msg_ctlmnb);
ipc_init_proc_interface("sysvipc/msg",
" key msqid perms cbytes qnum lspid lrpid uid gid cuid cgid stime rtime ctime\n",
Index: linux-2.6.26-rc8-mm1-MSGMNB3/include/linux/msg.h
===================================================================
--- linux-2.6.26-rc8-mm1-MSGMNB3.orig/include/linux/msg.h
+++ linux-2.6.26-rc8-mm1-MSGMNB3/include/linux/msg.h
@@ -58,6 +58,13 @@ struct msginfo {
* more than 16 GB : msgmni = 32K (IPCMNI)
*/
#define MSG_MEM_SCALE 32
+/*
+ * Scaling factor to compute msgmnb: ns->msg_ctlmnb is between MSGMNB
+ * and MSGMNB * MSG_CPU_SCALE. This leads to a max msgmnb value of
+ * 65536 which is an already used and recommended value.
+ * Note that zero-sized messages can be enqueued up to this max value.
+ */
+#define MSG_CPU_SCALE 4
#define MSGMNI 16 /* <= IPCMNI */ /* max # of msg queue identifiers */
#define MSGMAX 8192 /* <= INT_MAX */ /* max size of message (bytes) */
Index: linux-2.6.26-rc8-mm1-MSGMNB3/ipc/ipc_sysctl.c
===================================================================
--- linux-2.6.26-rc8-mm1-MSGMNB3.orig/ipc/ipc_sysctl.c
+++ linux-2.6.26-rc8-mm1-MSGMNB3/ipc/ipc_sysctl.c
@@ -44,6 +44,7 @@ static void ipc_auto_callback(int val)
* Re-enable automatic recomputing only if not already
* enabled.
*/
+ ipc_recompute_msgmnb(current->nsproxy->ipc_ns);
recompute_msgmni(current->nsproxy->ipc_ns);
cond_register_ipcns_notifier(current->nsproxy->ipc_ns);
}
@@ -246,8 +247,8 @@ static struct ctl_table ipc_kern_table[]
.data = &init_ipc_ns.msg_ctlmnb,
.maxlen = sizeof (init_ipc_ns.msg_ctlmnb),
.mode = 0644,
- .proc_handler = proc_ipc_dointvec,
- .strategy = sysctl_ipc_data,
+ .proc_handler = proc_ipc_callback_dointvec,
+ .strategy = sysctl_ipc_registered_data,
},
{
.ctl_name = KERN_SEM,
Index: linux-2.6.26-rc8-mm1-MSGMNB3/ipc/util.h
===================================================================
--- linux-2.6.26-rc8-mm1-MSGMNB3.orig/ipc/util.h
+++ linux-2.6.26-rc8-mm1-MSGMNB3/ipc/util.h
@@ -122,6 +122,7 @@ extern struct msg_msg *load_msg(const vo
extern int store_msg(void __user *dest, struct msg_msg *msg, int len);
extern void recompute_msgmni(struct ipc_namespace *);
+extern void ipc_recompute_msgmnb(struct ipc_namespace *);
static inline int ipc_buildid(int id, int seq)
{
--
* [PATCH -mm v2 2/3] sysv ipc: recompute msgmnb (and msgmni) on cpu hotplug addition and removal
2008-07-15 21:14 [PATCH -mm v2 0/3] sysv ipc: increase msgmnb with the number of cpus Solofo.Ramangalahy
2008-07-15 21:14 ` [PATCH -mm v2 1/3] sysv ipc: increase msgmnb default value wrt. " Solofo.Ramangalahy
@ 2008-07-15 21:14 ` Solofo.Ramangalahy
2008-07-15 21:14 ` [PATCH -mm v2 3/3] sysv ipc: use auto_msgmnb to desactivate and reactivate msgmnb recomputation Solofo.Ramangalahy
2 siblings, 0 replies; 5+ messages in thread
From: Solofo.Ramangalahy @ 2008-07-15 21:14 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, Matt Helsley, Mingming Cao, Nadia Derbey,
Manfred Spraul, Solofo Ramangalahy
[-- Attachment #1: ipc-recompute-msgmnb-and-msgmni-on-cpu-hotplug-addition-removal.patch --]
[-- Type: text/plain, Size: 3402 bytes --]
From: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net>
As msgmnb is scaled wrt. online cpus, cpu hotplug events should grow
and shrink the value.
Like msgmni with ipc_memory_callback(), the ipc_cpu_callback()
function triggers msgmnb recomputation.
Signed-off-by: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net>
---
include/linux/ipc_namespace.h | 1 +
ipc/ipcns_notifier.c | 8 +++-----
ipc/util.c | 28 ++++++++++++++++++++++++++++
3 files changed, 32 insertions(+), 5 deletions(-)
Index: linux-2.6.26-rc8-mm1-MSGMNB3/include/linux/ipc_namespace.h
===================================================================
--- linux-2.6.26-rc8-mm1-MSGMNB3.orig/include/linux/ipc_namespace.h
+++ linux-2.6.26-rc8-mm1-MSGMNB3/include/linux/ipc_namespace.h
@@ -12,6 +12,7 @@
#define IPCNS_MEMCHANGED 0x00000001 /* Notify lowmem size changed */
#define IPCNS_CREATED 0x00000002 /* Notify new ipc namespace created */
#define IPCNS_REMOVED 0x00000003 /* Notify ipc namespace removed */
+#define IPCNS_CPUCHANGED 0x00000004 /* Notify cpu hotplug addition/removal */
#define IPCNS_CALLBACK_PRI 0
Index: linux-2.6.26-rc8-mm1-MSGMNB3/ipc/util.c
===================================================================
--- linux-2.6.26-rc8-mm1-MSGMNB3.orig/ipc/util.c
+++ linux-2.6.26-rc8-mm1-MSGMNB3/ipc/util.c
@@ -34,6 +34,7 @@
#include <linux/nsproxy.h>
#include <linux/rwsem.h>
#include <linux/memory.h>
+#include <linux/cpu.h>
#include <linux/ipc_namespace.h>
#include <asm/unistd.h>
@@ -96,6 +97,32 @@ static int ipc_memory_callback(struct no
#endif /* CONFIG_MEMORY_HOTPLUG */
+#ifdef CONFIG_HOTPLUG_CPU
+
+static void ipc_cpu_notifier(struct work_struct *work)
+{
+ ipcns_notify(IPCNS_CPUCHANGED);
+}
+
+static DECLARE_WORK(ipc_cpu_wq, ipc_cpu_notifier);
+
+static int __cpuinit ipc_cpu_callback(struct notifier_block *nfb,
+ unsigned long action, void *hcpu)
+{
+ switch (action) {
+ case CPU_ONLINE:
+ case CPU_ONLINE_FROZEN:
+ case CPU_DEAD:
+ case CPU_DEAD_FROZEN:
+ schedule_work(&ipc_cpu_wq);
+ break;
+ default:
+ break;
+ }
+ return NOTIFY_OK;
+}
+
+#endif /* CONFIG_HOTPLUG_CPU */
/**
* ipc_init - initialise IPC subsystem
*
@@ -112,6 +139,7 @@ static int __init ipc_init(void)
msg_init();
shm_init();
hotplug_memory_notifier(ipc_memory_callback, IPC_CALLBACK_PRI);
+ hotcpu_notifier(ipc_cpu_callback, IPC_CALLBACK_PRI);
register_ipcns_notifier(&init_ipc_ns);
return 0;
}
Index: linux-2.6.26-rc8-mm1-MSGMNB3/ipc/ipcns_notifier.c
===================================================================
--- linux-2.6.26-rc8-mm1-MSGMNB3.orig/ipc/ipcns_notifier.c
+++ linux-2.6.26-rc8-mm1-MSGMNB3/ipc/ipcns_notifier.c
@@ -26,16 +26,14 @@ static int ipcns_callback(struct notifie
unsigned long action, void *arg)
{
struct ipc_namespace *ns;
-
+ ns = container_of(self, struct ipc_namespace, ipcns_nb);
switch (action) {
+ case IPCNS_CPUCHANGED:
+ ipc_recompute_msgmnb(ns); /* Fall through */
case IPCNS_MEMCHANGED: /* amount of lowmem has changed */
case IPCNS_CREATED:
case IPCNS_REMOVED:
/*
- * It's time to recompute msgmni
- */
- ns = container_of(self, struct ipc_namespace, ipcns_nb);
- /*
* No need to get a reference on the ns: the 1st job of
* free_ipc_ns() is to unregister the callback routine.
* blocking_notifier_chain_unregister takes the wr lock to do
--
* [PATCH -mm v2 3/3] sysv ipc: use auto_msgmnb to desactivate and reactivate msgmnb recomputation
2008-07-15 21:14 [PATCH -mm v2 0/3] sysv ipc: increase msgmnb with the number of cpus Solofo.Ramangalahy
2008-07-15 21:14 ` [PATCH -mm v2 1/3] sysv ipc: increase msgmnb default value wrt. " Solofo.Ramangalahy
2008-07-15 21:14 ` [PATCH -mm v2 2/3] sysv ipc: recompute msgmnb (and msgmni) on cpu hotplug addition and removal Solofo.Ramangalahy
@ 2008-07-15 21:14 ` Solofo.Ramangalahy
2008-07-17 21:14 ` Randy Dunlap
2 siblings, 1 reply; 5+ messages in thread
From: Solofo.Ramangalahy @ 2008-07-15 21:14 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, Matt Helsley, Mingming Cao, Nadia Derbey,
Manfred Spraul, Solofo Ramangalahy
[-- Attachment #1: ipc-use-auto_msgmnb.patch --]
[-- Type: text/plain, Size: 10478 bytes --]
From: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net>
Add /proc/sys/kernel/auto_msgmnb to control automatic recomputation of
/proc/sys/kernel/msgmnb (msg_ctlmnb).
Signed-off-by: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net>
---
Documentation/sysctl/kernel.txt | 35 ++++++++++++++++
include/linux/ipc_namespace.h | 5 ++
ipc/ipc_sysctl.c | 87 ++++++++++++++++++++++++++++++++++++----
ipc/ipcns_notifier.c | 62 +++++++++++++++++++++++++++-
ipc/namespace.c | 2
ipc/util.c | 1
6 files changed, 183 insertions(+), 9 deletions(-)
Index: linux-2.6.26-rc8-mm1-MSGMNB3/include/linux/ipc_namespace.h
===================================================================
--- linux-2.6.26-rc8-mm1-MSGMNB3.orig/include/linux/ipc_namespace.h
+++ linux-2.6.26-rc8-mm1-MSGMNB3/include/linux/ipc_namespace.h
@@ -38,6 +38,7 @@ struct ipc_namespace {
atomic_t msg_bytes;
atomic_t msg_hdrs;
int auto_msgmni;
+ int auto_msgmnb;
size_t shm_ctlmax;
size_t shm_ctlall;
@@ -45,6 +46,7 @@ struct ipc_namespace {
int shm_tot;
struct notifier_block ipcns_nb;
+ struct notifier_block ipcns_nb_msgmnb;
};
extern struct ipc_namespace init_ipc_ns;
@@ -56,6 +58,9 @@ extern atomic_t nr_ipc_ns;
extern int register_ipcns_notifier(struct ipc_namespace *);
extern int cond_register_ipcns_notifier(struct ipc_namespace *);
extern void unregister_ipcns_notifier(struct ipc_namespace *);
+extern int register_ipcns_notifier_msgmnb(struct ipc_namespace *);
+extern int cond_register_ipcns_notifier_msgmnb(struct ipc_namespace *);
+extern void unregister_ipcns_notifier_msgmnb(struct ipc_namespace *);
extern int ipcns_notify(unsigned long);
#else /* CONFIG_SYSVIPC */
Index: linux-2.6.26-rc8-mm1-MSGMNB3/ipc/ipc_sysctl.c
===================================================================
--- linux-2.6.26-rc8-mm1-MSGMNB3.orig/ipc/ipc_sysctl.c
+++ linux-2.6.26-rc8-mm1-MSGMNB3/ipc/ipc_sysctl.c
@@ -50,6 +50,22 @@ static void ipc_auto_callback(int val)
}
}
+static void ipc_auto_callback_msgmnb(int val)
+{
+ struct ipc_namespace *ns = current->nsproxy->ipc_ns;
+ if (!val)
+ unregister_ipcns_notifier_msgmnb(ns);
+ else {
+ /*
+ * Re-enable automatic recomputing only if not already
+ * enabled.
+ */
+ ipc_recompute_msgmnb(ns);
+ recompute_msgmni(ns);
+ cond_register_ipcns_notifier_msgmnb(ns);
+ }
+}
+
#ifdef CONFIG_PROC_FS
static int proc_ipc_dointvec(ctl_table *table, int write, struct file *filp,
void __user *buffer, size_t *lenp, loff_t *ppos)
@@ -73,14 +89,24 @@ static int proc_ipc_callback_dointvec(ct
rc = proc_dointvec(&ipc_table, write, filp, buffer, lenp, ppos);
- if (write && !rc && lenp_bef == *lenp)
+ if (write && !rc && lenp_bef == *lenp) {
+ struct ipc_namespace *ns = current->nsproxy->ipc_ns;
/*
* Tunable has successfully been changed by hand. Disable its
* automatic adjustment. This simply requires unregistering
* the notifiers that trigger recalculation.
*/
- unregister_ipcns_notifier(current->nsproxy->ipc_ns);
-
+ switch (table->ctl_name) {
+ case KERN_MSGMNI:
+ unregister_ipcns_notifier(ns);
+ break;
+ case KERN_MSGMNB:
+ unregister_ipcns_notifier_msgmnb(ns);
+ break;
+ default:
+ break;
+ }
+ }
return rc;
}
@@ -123,6 +149,34 @@ static int proc_ipcauto_dointvec_minmax(
return rc;
}
+static int proc_ipcauto_dointvec_minmax_msgmnb(ctl_table *table, int write,
+ struct file *filp, void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ struct ctl_table ipc_table;
+ size_t lenp_bef = *lenp;
+ int oldval;
+ int rc;
+
+ memcpy(&ipc_table, table, sizeof(ipc_table));
+ ipc_table.data = get_ipc(table);
+ oldval = *((int *)(ipc_table.data));
+
+ rc = proc_dointvec_minmax(&ipc_table, write, filp, buffer, lenp, ppos);
+
+ if (write && !rc && lenp_bef == *lenp) {
+ int newval = *((int *)(ipc_table.data));
+ /*
+ * The file "auto_msgmnb" has correctly been set.
+ * React by (un)registering the corresponding tunable, if the
+ * value has changed.
+ */
+ if (newval != oldval)
+ ipc_auto_callback_msgmnb(newval);
+ }
+
+ return rc;
+}
+
#else
#define proc_ipc_doulongvec_minmax NULL
#define proc_ipc_dointvec NULL
@@ -175,16 +229,25 @@ static int sysctl_ipc_registered_data(ct
void __user *newval, size_t newlen)
{
int rc;
-
rc = sysctl_ipc_data(table, name, nlen, oldval, oldlenp, newval,
newlen);
- if (newval && newlen && rc > 0)
+ if (newval && newlen && rc > 0) {
+ struct ipc_namespace *ns = current->nsproxy->ipc_ns;
/*
* Tunable has successfully been changed from userland
*/
- unregister_ipcns_notifier(current->nsproxy->ipc_ns);
-
+ switch (table->ctl_name) {
+ case KERN_MSGMNI:
+ unregister_ipcns_notifier(ns);
+ break;
+ case KERN_MSGMNB:
+ unregister_ipcns_notifier_msgmnb(ns);
+ break;
+ default:
+ break;
+ }
+ }
return rc;
}
#else
@@ -269,6 +332,16 @@ static struct ctl_table ipc_kern_table[]
.extra1 = &zero,
.extra2 = &one,
},
+ {
+ .ctl_name = CTL_UNNUMBERED,
+ .procname = "auto_msgmnb",
+ .data = &init_ipc_ns.auto_msgmnb,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_ipcauto_dointvec_minmax_msgmnb,
+ .extra1 = &zero,
+ .extra2 = &one,
+ },
{}
};
Index: linux-2.6.26-rc8-mm1-MSGMNB3/ipc/ipcns_notifier.c
===================================================================
--- linux-2.6.26-rc8-mm1-MSGMNB3.orig/ipc/ipcns_notifier.c
+++ linux-2.6.26-rc8-mm1-MSGMNB3/ipc/ipcns_notifier.c
@@ -29,7 +29,8 @@ static int ipcns_callback(struct notifie
ns = container_of(self, struct ipc_namespace, ipcns_nb);
switch (action) {
case IPCNS_CPUCHANGED:
- ipc_recompute_msgmnb(ns); /* Fall through */
+ if (ns->auto_msgmnb)
+ ipc_recompute_msgmnb(ns); /* Fall through */
case IPCNS_MEMCHANGED: /* amount of lowmem has changed */
case IPCNS_CREATED:
case IPCNS_REMOVED:
@@ -42,7 +43,8 @@ static int ipcns_callback(struct notifie
* blocking_notifier_call_chain.
* So the ipc ns cannot be freed while we are here.
*/
- recompute_msgmni(ns);
+ if (ns->auto_msgmni)
+ recompute_msgmni(ns);
break;
default:
break;
@@ -88,3 +90,59 @@ int ipcns_notify(unsigned long val)
{
return blocking_notifier_call_chain(&ipcns_chain, val, NULL);
}
+
+static int ipcns_callback_msgmnb(struct notifier_block *self,
+ unsigned long action, void *arg)
+{
+ struct ipc_namespace *ns;
+ ns = container_of(self, struct ipc_namespace, ipcns_nb_msgmnb);
+ switch (action) {
+ case IPCNS_CPUCHANGED:
+ if (ns->auto_msgmnb)
+ ipc_recompute_msgmnb(ns); /* Fall through */
+ case IPCNS_MEMCHANGED:
+ case IPCNS_CREATED:
+ case IPCNS_REMOVED:
+ if (ns->auto_msgmni)
+ recompute_msgmni(ns);
+ break;
+ default:
+ break;
+ }
+
+ return NOTIFY_OK;
+}
+
+int register_ipcns_notifier_msgmnb(struct ipc_namespace *ns)
+{
+ int rc;
+
+ memset(&ns->ipcns_nb_msgmnb, 0, sizeof(ns->ipcns_nb_msgmnb));
+ ns->ipcns_nb_msgmnb.notifier_call = ipcns_callback_msgmnb;
+ ns->ipcns_nb_msgmnb.priority = IPCNS_CALLBACK_PRI;
+ rc = blocking_notifier_chain_register(&ipcns_chain,
+ &ns->ipcns_nb_msgmnb);
+ if (!rc)
+ ns->auto_msgmnb = 1;
+ return rc;
+}
+
+int cond_register_ipcns_notifier_msgmnb(struct ipc_namespace *ns)
+{
+ int rc;
+
+ memset(&ns->ipcns_nb_msgmnb, 0, sizeof(ns->ipcns_nb_msgmnb));
+ ns->ipcns_nb_msgmnb.notifier_call = ipcns_callback_msgmnb;
+ ns->ipcns_nb_msgmnb.priority = IPCNS_CALLBACK_PRI;
+ rc = blocking_notifier_chain_cond_register(&ipcns_chain,
+ &ns->ipcns_nb_msgmnb);
+ if (!rc)
+ ns->auto_msgmnb = 1;
+ return rc;
+}
+
+void unregister_ipcns_notifier_msgmnb(struct ipc_namespace *ns)
+{
+ blocking_notifier_chain_unregister(&ipcns_chain, &ns->ipcns_nb_msgmnb);
+ ns->auto_msgmnb = 0;
+}
Index: linux-2.6.26-rc8-mm1-MSGMNB3/ipc/namespace.c
===================================================================
--- linux-2.6.26-rc8-mm1-MSGMNB3.orig/ipc/namespace.c
+++ linux-2.6.26-rc8-mm1-MSGMNB3/ipc/namespace.c
@@ -26,6 +26,7 @@ static struct ipc_namespace *clone_ipc_n
msg_init_ns(ns);
shm_init_ns(ns);
+ register_ipcns_notifier_msgmnb(ns);
/*
* msgmni has already been computed for the new ipc ns.
* Thus, do the ipcns creation notification before registering that
@@ -98,6 +99,7 @@ void free_ipc_ns(struct kref *kref)
* released the rd lock.
*/
unregister_ipcns_notifier(ns);
+ unregister_ipcns_notifier_msgmnb(ns);
sem_exit_ns(ns);
msg_exit_ns(ns);
shm_exit_ns(ns);
Index: linux-2.6.26-rc8-mm1-MSGMNB3/ipc/util.c
===================================================================
--- linux-2.6.26-rc8-mm1-MSGMNB3.orig/ipc/util.c
+++ linux-2.6.26-rc8-mm1-MSGMNB3/ipc/util.c
@@ -141,6 +141,7 @@ static int __init ipc_init(void)
hotplug_memory_notifier(ipc_memory_callback, IPC_CALLBACK_PRI);
hotcpu_notifier(ipc_cpu_callback, IPC_CALLBACK_PRI);
register_ipcns_notifier(&init_ipc_ns);
+ register_ipcns_notifier_msgmnb(&init_ipc_ns);
return 0;
}
__initcall(ipc_init);
Index: linux-2.6.26-rc8-mm1-MSGMNB3/Documentation/sysctl/kernel.txt
===================================================================
--- linux-2.6.26-rc8-mm1-MSGMNB3.orig/Documentation/sysctl/kernel.txt
+++ linux-2.6.26-rc8-mm1-MSGMNB3/Documentation/sysctl/kernel.txt
@@ -179,6 +179,41 @@ kernel stack.
==============================================================
+msgmnb
+
+Maximum size in bytes, not in message count, of a single SystemV IPC
+message queue (b stands for bytes).
+
+This value is dynamic and depends on the online cpu count of the
+machine (taking cpu hotplug into account).
+
+Computed values are between MSGMNB and MSGMNB*MSG_CPU_SCALE #define
+constants (currently [16384,65536]).
+
+The exact value is automatically (re)computed, but:
+
+. If the value is positioned from user space (via procfs or sysctl()),
+ then the automatic recomputation is disabled. E.g.:
+
+ # echo 16384 > /proc/sys/kernel/msgmnb
+
+. The automatic recomputation can also be disabled via auto_msgmnb,
+ e.g.:
+
+ # echo 0 > /proc/sys/kernel/auto_msgmnb
+
+. When disabled, the automatic recomputation can be reenabled via
+ auto_msgmnb, e.g.:
+
+ # echo 1 > /proc/sys/kernel/auto_msgmnb
+
+The msgmnb and auto_msgmnb values in each (ipc) namespace are
+independent.
+
+Initially, the msgmnb value is computed automatically: at boot time
+and (ipc) namespace creation.
+
+==============================================================
osrelease, ostype & version:
# cat osrelease
--
* Re: [PATCH -mm v2 3/3] sysv ipc: use auto_msgmnb to desactivate and reactivate msgmnb recomputation
2008-07-15 21:14 ` [PATCH -mm v2 3/3] sysv ipc: use auto_msgmnb to desactivate and reactivate msgmnb recomputation Solofo.Ramangalahy
@ 2008-07-17 21:14 ` Randy Dunlap
0 siblings, 0 replies; 5+ messages in thread
From: Randy Dunlap @ 2008-07-17 21:14 UTC (permalink / raw)
To: Solofo.Ramangalahy
Cc: Andrew Morton, linux-kernel, Matt Helsley, Mingming Cao,
Nadia Derbey, Manfred Spraul
On Tue, 15 Jul 2008 23:14:10 +0200 Solofo.Ramangalahy@bull.net wrote:
> From: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net>
>
> Add /proc/sys/kernel/auto_msgmnb to control automatic recomputation of
> /proc/sys/kernel/msgmnb (msg_ctlmnb).
>
> Signed-off-by: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net>
> ---
> Documentation/sysctl/kernel.txt | 35 ++++++++++++++++
> include/linux/ipc_namespace.h | 5 ++
> ipc/ipc_sysctl.c | 87 ++++++++++++++++++++++++++++++++++++----
> ipc/ipcns_notifier.c | 62 +++++++++++++++++++++++++++-
> ipc/namespace.c | 2
> ipc/util.c | 1
> 6 files changed, 183 insertions(+), 9 deletions(-)
>
> Index: linux-2.6.26-rc8-mm1-MSGMNB3/Documentation/sysctl/kernel.txt
> ===================================================================
> --- linux-2.6.26-rc8-mm1-MSGMNB3.orig/Documentation/sysctl/kernel.txt
> +++ linux-2.6.26-rc8-mm1-MSGMNB3/Documentation/sysctl/kernel.txt
> @@ -179,6 +179,41 @@ kernel stack.
>
> ==============================================================
>
> +msgmnb
> +
> +Maximum size in bytes, not in message count, of a single SystemV IPC
> +message queue (b stands for bytes).
> +
> +This value is dynamic and depends on the online cpu count of the
> +machine (taking cpu hotplug into account).
Prefer "CPU" to "cpu".
> +
> +Computed values are between MSGMNB and MSGMNB*MSG_CPU_SCALE #define
> +constants (currently [16384,65536]).
> +
> +The exact value is automatically (re)computed, but:
> +
> +. If the value is positioned from user space (via procfs or sysctl()),
s/positioned/specified/ or set
> + then the automatic recomputation is disabled. E.g.:
> +
> + # echo 16384 > /proc/sys/kernel/msgmnb
> +
> +. The automatic recomputation can also be disabled via auto_msgmnb,
> + e.g.:
> +
> + # echo 0 > /proc/sys/kernel/auto_msgmnb
> +
> +. When disabled, the automatic recomputation can be reenabled via
> + auto_msgmnb, e.g.:
> +
> + # echo 1 > /proc/sys/kernel/auto_msgmnb
> +
> +The msgmnb and auto_msgmnb values in each (ipc) namespace are
> +independent.
> +
Prefer IPC instead of ipc (above and below).
> +Initially, the msgmnb value is computed automatically: at boot time
> +and (ipc) namespace creation.
> +
> +==============================================================
---
~Randy
Linux Plumbers Conference, 17-19 September 2008, Portland, Oregon USA
http://linuxplumbersconf.org/