From: Andrew Morton <akpm@linux-foundation.org>
To: <Solofo.Ramangalahy@bull.net>
Cc: linux-kernel@vger.kernel.org, matthltc@us.ibm.com,
	cmm@us.ibm.com, Nadia.Derbey@bull.net, manfred@colorfullife.com,
	nickpiggin@yahoo.com.au, Solofo.Ramangalahy@bull.net
Subject: Re: [PATCH -mm 1/3] sysv ipc: increase msgmnb default value wrt. the number of cpus
Date: Tue, 24 Jun 2008 14:31:20 -0700
Message-ID: <20080624143120.9bed4f18.akpm@linux-foundation.org>
In-Reply-To: <20080624093453.201071209@bull.net>

On Tue, 24 Jun 2008 11:34:53 +0200 <Solofo.Ramangalahy@bull.net> wrote:
> From: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net>
>
> Initialize msgmnb value to
> min(MSGMNB * num_online_cpus(), MSGMNB * MSG_CPU_SCALE)
> to increase the default value for larger machines.
>
> MSG_CPU_SCALE scaling factor is defined to be 4, as 16384 x 4 = 65536
> is an already used and recommended value.
>
> The msgmni value is made dependent on msgmnb to keep the memory
> dedicated to message queues within the 1/MSG_MEM_SCALE of lowmem
> bound.
>
> Unlike msgmni, the value is not scaled (down) with respect to the
> number of ipc namespaces for simplicity.
>
> To disable recomputation when the user explicitly sets a value,
> we reuse the callback defined for msgmni.
>
> As msgmni and msgmnb are correlated, a user setting of either of the
> two disables recomputation of both, for now. This is refined in a
> later patch.
>
> When a negative value is put in /proc/sys/kernel/msgmnb
> automatic recomputing is re-enabled.
>
Thanks for taking the time to describe this work so well.
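
For anyone following along, the arithmetic in the description can be
sketched in user space roughly as below. This is only an illustration,
not the kernel code: sketch_msgmnb() and sketch_msgmni() are
hypothetical names, the constants are copied from the quoted patch, and
the msgmni helper deliberately leaves out the per-namespace division
and any clamping that recompute_msgmni() applies.

```c
#include <assert.h>

#define MSGMNB        16384  /* default max bytes per queue (from the patch) */
#define MSG_CPU_SCALE 4      /* cap on the cpu scaling factor (from the patch) */
#define MSG_MEM_SCALE 32     /* queues may use 1/32 of lowmem (from msg.h) */

/*
 * Hypothetical helper mirroring the formula
 *   min(MSGMNB * num_online_cpus(), MSGMNB * MSG_CPU_SCALE)
 * with the cpu count passed in as a plain argument.
 */
static unsigned long sketch_msgmnb(unsigned int online_cpus)
{
	unsigned long scaled = (unsigned long)MSGMNB * online_cpus;
	unsigned long cap = (unsigned long)MSGMNB * MSG_CPU_SCALE;

	return scaled < cap ? scaled : cap;
}

/*
 * Hypothetical helper: how many queues of size msgmnb fit into
 * lowmem / MSG_MEM_SCALE. Namespace division and clamping are omitted.
 */
static unsigned long sketch_msgmni(unsigned long long lowmem_bytes,
				   unsigned long msgmnb)
{
	assert(msgmnb > 0);	/* avoid dividing by zero in the sketch */
	return (unsigned long)((lowmem_bytes / MSG_MEM_SCALE) / msgmnb);
}
```

With 1 GB of lowmem, going from msgmnb = 16384 to msgmnb = 65536 cuts
the queue count computed this way by a factor of four, which is the
coupling between msgmni and msgmnb that the description mentions.
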
>
> ---
>  Documentation/sysctl/kernel.txt |   28 ++++++++++++++++++++++++++++
>  include/linux/msg.h             |    6 ++++++
>  ipc/ipc_sysctl.c                |    5 +++--
>  ipc/msg.c                       |   17 +++++++++++++----
>  4 files changed, 50 insertions(+), 6 deletions(-)
>
> Index: b/ipc/msg.c
> ===================================================================
> --- a/ipc/msg.c
> +++ b/ipc/msg.c
> @@ -38,6 +38,7 @@
> #include <linux/rwsem.h>
> #include <linux/nsproxy.h>
> #include <linux/ipc_namespace.h>
> +#include <linux/cpumask.h>
>
> #include <asm/current.h>
> #include <asm/uaccess.h>
> @@ -92,7 +93,7 @@ void recompute_msgmni(struct ipc_namespa
>
>  	si_meminfo(&i);
>  	allowed = (((i.totalram - i.totalhigh) / MSG_MEM_SCALE) * i.mem_unit)
> -		/ MSGMNB;
> +		/ ns->msg_ctlmnb;
>  	nb_ns = atomic_read(&nr_ipc_ns);
>  	allowed /= nb_ns;
>
> @@ -108,11 +109,19 @@ void recompute_msgmni(struct ipc_namespa
>
>  	ns->msg_ctlmni = allowed;
>  }
> +/*
> + * Scale msgmnb with the number of online cpus, up to 4x MSGMNB.
> + */
> +void recompute_msgmnb(struct ipc_namespace *ns)
> +{
> +	ns->msg_ctlmnb =
> +		min(MSGMNB * num_online_cpus(), MSGMNB * MSG_CPU_SCALE);
> +}
>
> void msg_init_ns(struct ipc_namespace *ns)
> {
>  	ns->msg_ctlmax = MSGMAX;
> -	ns->msg_ctlmnb = MSGMNB;
> +	recompute_msgmnb(ns);
>
>  	recompute_msgmni(ns);
>
> @@ -132,8 +141,8 @@ void __init msg_init(void)
> {
>  	msg_init_ns(&init_ipc_ns);
>
> -	printk(KERN_INFO "msgmni has been set to %d\n",
> -		init_ipc_ns.msg_ctlmni);
> +	printk(KERN_INFO "msgmni has been set to %d, msgmnb to %d\n",
> +		init_ipc_ns.msg_ctlmni, init_ipc_ns.msg_ctlmnb);
>
>  	ipc_init_proc_interface("sysvipc/msg",
> " key msqid perms cbytes qnum lspid lrpid uid gid cuid cgid stime rtime ctime\n",
> Index: b/include/linux/msg.h
> ===================================================================
> --- a/include/linux/msg.h
> +++ b/include/linux/msg.h
> @@ -58,6 +58,12 @@ struct msginfo {
> * more than 16 GB : msgmni = 32K (IPCMNI)
> */
> #define MSG_MEM_SCALE 32
> +/*
> + * Scaling factor to compute msgmnb: ns->msg_ctlmnb is between MSGMNB
> + * and MSGMNB * MSG_CPU_SCALE. This leads to a max msgmnb value of
> + * 65536 which is an already used and recommended value.
> + */
> +#define MSG_CPU_SCALE 4
>
> #define MSGMNI 16 /* <= IPCMNI */ /* max # of msg queue identifiers */
> #define MSGMAX 8192 /* <= INT_MAX */ /* max size of message (bytes) */
> Index: b/ipc/ipc_sysctl.c
> ===================================================================
> --- a/ipc/ipc_sysctl.c
> +++ b/ipc/ipc_sysctl.c
> @@ -42,6 +42,7 @@ static void tunable_set_callback(int val
>  		 * Re-enable automatic recomputing only if not already
>  		 * enabled.
>  		 */
> +		recompute_msgmnb(current->nsproxy->ipc_ns);
>  		recompute_msgmni(current->nsproxy->ipc_ns);
>  		cond_register_ipcns_notifier(current->nsproxy->ipc_ns);
>  	}
> @@ -210,8 +211,8 @@ static struct ctl_table ipc_kern_table[]
>  		.data		= &init_ipc_ns.msg_ctlmnb,
>  		.maxlen		= sizeof (init_ipc_ns.msg_ctlmnb),
>  		.mode		= 0644,
> -		.proc_handler	= proc_ipc_dointvec,
> -		.strategy	= sysctl_ipc_data,
> +		.proc_handler	= proc_ipc_callback_dointvec,
> +		.strategy	= sysctl_ipc_registered_data,
>  	},
> {
> .ctl_name = KERN_SEM,
> Index: b/Documentation/sysctl/kernel.txt
> ===================================================================
> --- a/Documentation/sysctl/kernel.txt
> +++ b/Documentation/sysctl/kernel.txt
> @@ -179,6 +179,34 @@ kernel stack.
>
> ==============================================================
>
> +msgmnb
> +
> +Maximum size in bytes (not in message count) of a single SystemV IPC
> +message queue (b stands for bytes).
> +
> +This value is dynamic and depends on the online cpu count of the
> +machine (taking cpu hotplug into account).
> +
> +Computed values are between MSGMNB and MSGMNB*MSG_CPU_SCALE #define
> +constants (currently [16384,65536]).
> +
> +The exact value is automatically (re)computed, but:
> +. If the value is positioned from user space (via procfs or sysctl()),
> + to a positive value then the automatic recomputation is
> + disabled. This leaves control to user space. E.g.
> +
> + # echo 16384 > /proc/sys/kernel/msgmnb
> +
> +. If the value is positioned from user space to a negative value, then
> + the computation is reenabled. E.g.
> +
> + # echo -1 > /proc/sys/kernel/msgmnb
> +
> +See recompute_msgmnb() function in ipc/ directory for details.
> +The value of msgmnb is coupled with the value of msgmni.
> +

The magical positive-versus-negative number trick is a bit obscure, and
I don't think there's any precedent for it in the kernel ABI (which is
what this is).

Is there anything we can do to reduce the unusualness of this
interface?  Say, add a new /proc/sys/kernel/automatic-msgmnb which
contains the automatic scaling and leave /proc/sys/kernel/msgmnb
containing the manual scaling?  Or something like that?
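
To make the comparison concrete, here is a rough user-space model of
the two interfaces. It is a sketch under assumptions, not a proposed
implementation: struct msgmnb_tunable and all function names are
hypothetical, treating a written zero as a manual value is an
assumption, and modelling automatic-msgmnb as an on/off switch is only
one possible reading of the suggestion above.

```c
#include <stdbool.h>

#define MSGMNB        16384	/* from the quoted patch */
#define MSG_CPU_SCALE 4		/* from the quoted patch */

/* Hypothetical user-space model of the tunable; not kernel code. */
struct msgmnb_tunable {
	int value;	/* what reading msgmnb would return */
	bool automatic;	/* whether recomputation is enabled */
};

/* Same formula as recompute_msgmnb(), with the cpu count passed in. */
static int auto_msgmnb(int online_cpus)
{
	int scale = online_cpus < MSG_CPU_SCALE ? online_cpus : MSG_CPU_SCALE;

	return MSGMNB * scale;
}

/*
 * Interface as posted: the sign of the written value selects the mode.
 * Assumption: zero is treated like a positive (manual) value.
 */
static void write_msgmnb_signed(struct msgmnb_tunable *t, int written,
				int online_cpus)
{
	if (written < 0) {
		t->automatic = true;
		t->value = auto_msgmnb(online_cpus);
	} else {
		t->automatic = false;
		t->value = written;
	}
}

/* Alternative: the mode lives in its own knob (modelled as a boolean). */
static void write_automatic_flag(struct msgmnb_tunable *t, bool on,
				 int online_cpus)
{
	t->automatic = on;
	if (on)
		t->value = auto_msgmnb(online_cpus);
}

/*
 * Alternative: msgmnb only ever takes a plain size. Assumption: the
 * write is refused (returns false) while automatic mode is on.
 */
static bool write_msgmnb_plain(struct msgmnb_tunable *t, int written)
{
	if (t->automatic || written < 0)
		return false;
	t->value = written;
	return true;
}
```

In the second variant a value written to msgmnb never means anything
other than a size, and the mode can be read back on its own, which is
the property the signed scheme lacks.
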