* Re: [PATCH 2.6.11-rc4-mm1] connector: Add a fork connector
       [not found] <1109240677.1738.196.camel@frecb000711.frec.bull.fr>
@ 2005-03-02  8:48 ` Guillaume Thouvenin
  2005-03-02 14:51   ` Paul Jackson
                     ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Guillaume Thouvenin @ 2005-03-02  8:48 UTC
To: Andrew Morton
Cc: lkml, Evgeniy Polyakov, elsa-devel, Jay Lan, Gerrit Huizenga,
    Erich Focht, Netlink List, Kaigai Kohei

ChangeLog:

  - Add parentheses around sizeof(struct cn_msg) + CN_FORK_INFO_SIZE
    in the CN_FORK_MSG_SIZE macro
  - cn_fork_lock is declared with DEFINE_SPINLOCK()
  - cn_fork_lock is defined as static and local to fork_connector()
  - Create a specific module cn_fork.c in drivers/connector to
    register the callback.
  - Improve the callback that turns on/off the fork connector

  I also ran lmbench and the results were sent in response to another
thread, "A common layer for Accounting packages". When the fork connector
is turned off the overhead is negligible. This patch works with another
small patch that fixes a problem in the connector. Without it, there is a
message that says "skb does not have enough length". It will be fixed in
the next -mm tree, I think.

Thanks everyone for the comments,
Guillaume

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@bull.net>

---

 drivers/connector/Kconfig   |   11 +++++
 drivers/connector/Makefile  |    1
 drivers/connector/cn_fork.c |   85 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/connector.h   |    4 ++
 kernel/fork.c               |   44 ++++++++++++++++++++++
 5 files changed, 145 insertions(+)

diff -uprN -X dontdiff linux-2.6.11-rc4-mm1/drivers/connector/cn_fork.c linux-2.6.11-rc4-mm1-cnfork/drivers/connector/cn_fork.c
--- linux-2.6.11-rc4-mm1/drivers/connector/cn_fork.c	1970-01-01 01:00:00.000000000 +0100
+++ linux-2.6.11-rc4-mm1-cnfork/drivers/connector/cn_fork.c	2005-03-01 13:13:05.000000000 +0100
@@ -0,0 +1,85 @@
+/*
+ * cn_fork.c
+ *
+ * 2005 Copyright (c) Guillaume Thouvenin <guillaume.thouvenin@bull.net>
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+
+#include <linux/connector.h>
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Guillaume Thouvenin <guillaume.thouvenin@bull.net>");
+MODULE_DESCRIPTION("Enable or disable the usage of the fork connector");
+
+int cn_fork_enable = 0;
+struct cb_id cb_fork_id = { CN_IDX_FORK, CN_VAL_FORK };
+
+/**
+ * cn_fork_callback - enable or disable the fork connector
+ * @data: message send by the connector
+ *
+ * The callback allows to enable or disable the sending of information
+ * about fork in the do_fork() routine. To enable the fork, the user
+ * space application must send the integer 1 in the data part of the
+ * message. To disable the fork connector, it must send the integer 0.
+ */
+static void cn_fork_callback(void *data)
+{
+	struct cn_msg *msg = (struct cn_msg *)data;
+
+	if (cn_already_initialized && (msg->len == sizeof(cn_fork_enable)))
+		memcpy(&cn_fork_enable, msg->data, sizeof(cn_fork_enable));
+}
+
+/**
+ * cn_fork_init - initialization entry point
+ *
+ * This routine will be run at kernel boot time because this driver is
+ * built in the kernel. It adds the connector callback to the connector
+ * driver.
+ */
+static int cn_fork_init(void)
+{
+	int err;
+
+	err = cn_add_callback(&cb_fork_id, "cn_fork", &cn_fork_callback);
+	if (err) {
+		printk(KERN_WARNING "Failed to register cn_fork\n");
+		return -EINVAL;
+	}
+
+	printk(KERN_NOTICE "cn_fork is registered\n");
+	return 0;
+}
+
+/**
+ * cn_fork_exit - exit entry point
+ *
+ * As this driver is always statically compiled into the kernel the
+ * cn_fork_exit has no effect.
+ */
+static void cn_fork_exit(void)
+{
+	cn_del_callback(&cb_fork_id);
+}
+
+module_init(cn_fork_init);
+module_exit(cn_fork_exit);
diff -uprN -X dontdiff linux-2.6.11-rc4-mm1/drivers/connector/Kconfig linux-2.6.11-rc4-mm1-cnfork/drivers/connector/Kconfig
--- linux-2.6.11-rc4-mm1/drivers/connector/Kconfig	2005-02-23 11:12:15.000000000 +0100
+++ linux-2.6.11-rc4-mm1-cnfork/drivers/connector/Kconfig	2005-02-24 10:29:11.000000000 +0100
@@ -10,4 +10,15 @@ config CONNECTOR
 	  Connector support can also be built as a module. If so, the module
 	  will be called cn.ko.
 
+config FORK_CONNECTOR
+	bool "Enable fork connector"
+	depends on CONNECTOR=y
+	default y
+	---help---
+	  It adds a connector in kernel/fork.c:do_fork() function. When a fork
+	  occurs, netlink is used to transfer information about the parent and
+	  its child. This information can be used by a user space application.
+	  The fork connector can be enable/disable by sending a message to the
+	  connector with the corresponding group id.
+
 endmenu
diff -uprN -X dontdiff linux-2.6.11-rc4-mm1/drivers/connector/Makefile linux-2.6.11-rc4-mm1-cnfork/drivers/connector/Makefile
--- linux-2.6.11-rc4-mm1/drivers/connector/Makefile	2005-02-23 11:12:15.000000000 +0100
+++ linux-2.6.11-rc4-mm1-cnfork/drivers/connector/Makefile	2005-02-25 13:49:57.000000000 +0100
@@ -1,2 +1,3 @@
 obj-$(CONFIG_CONNECTOR)		+= cn.o
+obj-$(CONFIG_FORK_CONNECTOR)	+= cn_fork.o
 cn-objs		:= cn_queue.o connector.o
diff -uprN -X dontdiff linux-2.6.11-rc4-mm1/include/linux/connector.h linux-2.6.11-rc4-mm1-cnfork/include/linux/connector.h
--- linux-2.6.11-rc4-mm1/include/linux/connector.h	2005-02-23 11:12:17.000000000 +0100
+++ linux-2.6.11-rc4-mm1-cnfork/include/linux/connector.h	2005-03-01 12:44:50.000000000 +0100
@@ -28,6 +28,8 @@
 #define CN_VAL_KOBJECT_UEVENT		0x0000
 #define CN_IDX_SUPERIO			0xaabb	/* SuperIO subsystem */
 #define CN_VAL_SUPERIO			0xccdd
+#define CN_IDX_FORK			0xfeed	/* fork events */
+#define CN_VAL_FORK			0xbeef
 
 #define CONNECTOR_MAX_MSG_SIZE		1024
 
@@ -133,6 +135,8 @@ struct cn_dev
 };
 
 extern int cn_already_initialized;
+extern int cn_fork_enable;
+extern struct cb_id cb_fork_id;
 
 int cn_add_callback(struct cb_id *, char *, void (* callback)(void *));
 void cn_del_callback(struct cb_id *);
diff -uprN -X dontdiff linux-2.6.11-rc4-mm1/kernel/fork.c linux-2.6.11-rc4-mm1-cnfork/kernel/fork.c
--- linux-2.6.11-rc4-mm1/kernel/fork.c	2005-02-23 11:12:17.000000000 +0100
+++ linux-2.6.11-rc4-mm1-cnfork/kernel/fork.c	2005-03-01 08:39:13.000000000 +0100
@@ -41,6 +41,7 @@
 #include <linux/profile.h>
 #include <linux/rmap.h>
 #include <linux/acct.h>
+#include <linux/connector.h>
 
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -63,6 +64,47 @@ DEFINE_PER_CPU(unsigned long, process_co
 
 EXPORT_SYMBOL(tasklist_lock);
 
+#ifdef CONFIG_FORK_CONNECTOR
+
+#define CN_FORK_INFO_SIZE	64
+#define CN_FORK_MSG_SIZE	(sizeof(struct cn_msg) + CN_FORK_INFO_SIZE)
+
+static inline void fork_connector(pid_t parent, pid_t child)
+{
+	static DEFINE_SPINLOCK(cn_fork_lock);
+	static __u32 seq;	/* used to test if message is lost */
+
+	if (cn_fork_enable) {
+		struct cn_msg *msg;
+
+		__u8 buffer[CN_FORK_MSG_SIZE];
+
+		msg = (struct cn_msg *)buffer;
+
+		memcpy(&msg->id, &cb_fork_id, sizeof(msg->id));
+		spin_lock(&cn_fork_lock);
+		msg->seq = seq++;
+		spin_unlock(&cn_fork_lock);
+		msg->ack = 0;	/* not used */
+		/*
+		 * size of data is the number of characters
+		 * printed plus one for the trailing '\0'
+		 */
+		/* just fill the data part with '\0' */
+		memset(msg->data, '\0', CN_FORK_INFO_SIZE);
+		msg->len = scnprintf(msg->data, CN_FORK_INFO_SIZE-1,
+				    "%i %i", parent, child) + 1;
+
+		cn_netlink_send(msg, CN_IDX_FORK);
+	}
+}
+#else
+static inline void fork_connector(pid_t parent, pid_t child)
+{
+	return;
+}
+#endif
+
 int nr_processes(void)
 {
 	int cpu;
@@ -1238,6 +1280,8 @@ long do_fork(unsigned long clone_flags,
 			if (unlikely (current->ptrace & PT_TRACE_VFORK_DONE))
 				ptrace_notify ((PTRACE_EVENT_VFORK_DONE << 8) | SIGTRAP);
 		}
+
+		fork_connector(current->pid, p->pid);
 	} else {
 		free_pidmap(pid);
 		pid = PTR_ERR(p);

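A note on the first ChangeLog item: the userspace sketch below shows why the outer parentheses in CN_FORK_MSG_SIZE matter once the macro is used inside a larger expression. The struct is only a stand-in with the same fields as struct cn_msg (assumed to be 20 bytes), and the _BAD/_GOOD macro names and both functions are made up for the illustration; none of them are part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's struct cn_msg (assumed layout, 20 bytes). */
struct cn_msg_sketch {
	uint32_t idx, val;	/* the two fields of struct cb_id */
	uint32_t seq, ack;
	uint32_t len;		/* length of the following data */
	uint8_t  data[];
};

#define CN_FORK_INFO_SIZE	64

/* Hypothetical names: the macro without and with outer parentheses. */
#define CN_FORK_MSG_SIZE_BAD	sizeof(struct cn_msg_sketch) + CN_FORK_INFO_SIZE
#define CN_FORK_MSG_SIZE_GOOD	(sizeof(struct cn_msg_sketch) + CN_FORK_INFO_SIZE)

/* Expands to 2 * sizeof(...) + 64, so only the header size is doubled. */
size_t two_messages_bad(void)
{
	return 2 * CN_FORK_MSG_SIZE_BAD;
}

/* Expands to 2 * (sizeof(...) + 64), which is what the caller meant. */
size_t two_messages_good(void)
{
	return 2 * CN_FORK_MSG_SIZE_GOOD;
}
```

The patch only uses the macro as an array size, where both forms agree, so the parentheses are defensive rather than a fix for a visible bug.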
* Re: [PATCH 2.6.11-rc4-mm1] connector: Add a fork connector
  2005-03-02  8:48 ` [PATCH 2.6.11-rc4-mm1] connector: Add a fork connector Guillaume Thouvenin
@ 2005-03-02 14:51   ` Paul Jackson
  2005-03-02 17:48     ` Jesse Barnes
  2005-03-02 15:50   ` Paul Jackson
  2005-03-03  3:18   ` Kaigai Kohei
  2 siblings, 1 reply; 8+ messages in thread
From: Paul Jackson @ 2005-03-02 14:51 UTC
To: Guillaume Thouvenin
Cc: akpm, linux-kernel, johnpol, elsa-devel, jlan, gh, efocht, netdev, kaigai

Guillaume wrote:
>
>   I also ran lmbench and the results were sent in response to another
> thread, "A common layer for Accounting packages". When the fork connector
> is turned off the overhead is negligible.

Good.

If I read this code right:

> +static inline void fork_connector(pid_t parent, pid_t child)
> +{
> +	static DEFINE_SPINLOCK(cn_fork_lock);
> +	static __u32 seq;	/* used to test if message is lost */
> +
> +	if (cn_fork_enable) {

then the code executed if the fork connector is off is a call to an
inline function that tests an integer, finds it zero, and returns.

This is sufficiently little code that I for one would hardly
even need lmbench to be comfortable that fork() wasn't impacted
seriously, in the case that the fork connector is disabled.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.650.933.1373, 1.925.600.0401

* Re: [PATCH 2.6.11-rc4-mm1] connector: Add a fork connector
  2005-03-02 14:51   ` Paul Jackson
@ 2005-03-02 17:48     ` Jesse Barnes
  0 siblings, 0 replies; 8+ messages in thread
From: Jesse Barnes @ 2005-03-02 17:48 UTC
To: Paul Jackson
Cc: Guillaume Thouvenin, akpm, linux-kernel, johnpol, elsa-devel, jlan, gh, efocht, netdev, kaigai

On Wednesday, March 2, 2005 6:51 am, Paul Jackson wrote:
> Guillaume wrote:
> >   I also ran lmbench and the results were sent in response to another
> > thread, "A common layer for Accounting packages". When the fork connector
> > is turned off the overhead is negligible.
>
> Good.
>
> If I read this code right:
> > +static inline void fork_connector(pid_t parent, pid_t child)
> > +{
> > +	static DEFINE_SPINLOCK(cn_fork_lock);
> > +	static __u32 seq;	/* used to test if message is lost */
> > +
> > +	if (cn_fork_enable) {
>
> then the code executed if the fork connector is off is a call to an
> inline function that tests an integer, finds it zero, and returns.
>
> This is sufficiently little code that I for one would hardly
> even need lmbench to be comfortable that fork() wasn't impacted
> seriously, in the case that the fork connector is disabled.

But if it *is* enabled, it takes a global lock on every fork. That can't
scale on a big multiprocessor if lots of CPUs are doing lots of forks...

Jesse

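The lock Jesse refers to only protects the shared seq counter. One way to picture a lock-free variant is an atomic fetch-and-add; the userspace sketch below uses C11 atomics as a stand-in for the kernel's atomic operations, and next_fork_seq() is a made-up name. Nobody proposed this in the thread, and a single shared counter can still bounce a cache line between CPUs, so it would remove the lock but not all of the contention.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Shared sequence counter; stands in for "static __u32 seq" plus the spinlock. */
static _Atomic uint32_t fork_seq;

/*
 * Hypothetical helper: hand out the next sequence number.
 * atomic_fetch_add() returns the value held before the increment,
 * which matches the "msg->seq = seq++" statement in the patch.
 */
uint32_t next_fork_seq(void)
{
	return atomic_fetch_add(&fork_seq, 1);
}
```

Each caller gets a distinct number without taking a lock, so a listener can still detect lost messages from gaps in the sequence.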
* Re: [PATCH 2.6.11-rc4-mm1] connector: Add a fork connector
  2005-03-02  8:48 ` [PATCH 2.6.11-rc4-mm1] connector: Add a fork connector Guillaume Thouvenin
  2005-03-02 14:51   ` Paul Jackson
@ 2005-03-02 15:50   ` Paul Jackson
  2005-03-03  3:18   ` Kaigai Kohei
  2 siblings, 0 replies; 8+ messages in thread
From: Paul Jackson @ 2005-03-02 15:50 UTC
To: Guillaume Thouvenin
Cc: akpm, linux-kernel, johnpol, elsa-devel, jlan, gh, efocht, netdev, kaigai

In addition to worrying about performance and scaling, with accounting
enabled or disabled, one should also try to minimize code clutter in
key kernel files, such as fork.c.

For example, instead of adding 40 lines of fork_connector() code to
kernel/fork.c, one might add just the #include <linux/connector.h> and
the "fork_connector(current->pid, p->pid)" call to kernel/fork.c, where
include/linux/connector.h had something like:

    #ifdef CONFIG_FORK_CONNECTOR
    static inline void fork_connector(pid_t parent, pid_t child)
    {
	if (cn_fork_enable)
		__fork_connector(parent, child);
    }
    #else
    static inline void fork_connector(pid_t parent, pid_t child) {}
    #endif

Then bury the interesting code in the implementation of
__fork_connector(), in drivers/connector/cn_fork.c or some such place.

This adds a real function call in the case that cn_fork_enable is set.
That code path requires more than that anyway (and it makes kernel stack
backtraces more transparent). But it removes 40 lines of fork_connector
detail from fork.c. And it avoids marking a 40 line routine as inline ...

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.650.933.1373, 1.925.600.0401

* Re: [PATCH 2.6.11-rc4-mm1] connector: Add a fork connector
  2005-03-02  8:48 ` [PATCH 2.6.11-rc4-mm1] connector: Add a fork connector Guillaume Thouvenin
  2005-03-02 14:51   ` Paul Jackson
  2005-03-02 15:50   ` Paul Jackson
@ 2005-03-03  3:18   ` Kaigai Kohei
  2005-03-03  5:46     ` Evgeniy Polyakov
  2 siblings, 1 reply; 8+ messages in thread
From: Kaigai Kohei @ 2005-03-03  3:18 UTC
To: Guillaume Thouvenin
Cc: Andrew Morton, lkml, Evgeniy Polyakov, elsa-devel, Jay Lan, Gerrit Huizenga, Erich Focht, Netlink List

[-- Attachment #1: Type: text/plain, Size: 2212 bytes --]

Hello, Guillaume

I tried to measure the process-creation/destruction performance on 2.6.11-rc4-mm1 plus
some extensions (Normal/with PAGG/with Fork-Connector).
But I received the following message endlessly on the system console with the Fork-Connector extension.

# on IA-64 environment / When a simple fork() iteration is run in parallel.
skb does not have enough length: requested msg->len=10[28], nlh->nlmsg_len=48[32], skb->len=48[must be 30].
skb does not have enough length: requested msg->len=10[28], nlh->nlmsg_len=48[32], skb->len=48[must be 30].
skb does not have enough length: requested msg->len=10[28], nlh->nlmsg_len=48[32], skb->len=48[must be 30].
  :

It's generated at drivers/connector/connector.c:__cn_rx_skb(), and it warns that the length of the msg's payload
does not fit in the nlmsghdr's length.
This message means the netlink packet is not sent to user space.
I was notified of the occurrence of fork() by printk(). :-(

The attached simple *.c file can enable/disable the fork-connector and listen for the fork notification.
Because this is my first experience writing code that uses netlink, please point out the right usage
if there are mistakes in the user-side application.

Thanks.

P.S. I can't reproduce the lockup on the 367th fork() with your latest patch.

Guillaume Thouvenin wrote:
> ChangeLog:
>
>   - Add parentheses around sizeof(struct cn_msg) + CN_FORK_INFO_SIZE
>     in the CN_FORK_MSG_SIZE macro
>   - cn_fork_lock is declared with DEFINE_SPINLOCK()
>   - cn_fork_lock is defined as static and local to fork_connector()
>   - Create a specific module cn_fork.c in drivers/connector to
>     register the callback.
>   - Improve the callback that turns on/off the fork connector
>
>   I also ran lmbench and the results were sent in response to another
> thread, "A common layer for Accounting packages". When the fork connector
> is turned off the overhead is negligible. This patch works with another
> small patch that fixes a problem in the connector. Without it, there is a
> message that says "skb does not have enough length". It will be fixed in
> the next -mm tree, I think.
>
>
> Thanks everyone for the comments,
> Guillaume

-- 
Linux Promotion Center, NEC
KaiGai Kohei <kaigai@ak.jp.nec.com>

[-- Attachment #2: fclisten.c --]
[-- Type: text/plain, Size: 2433 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>	/* strcasecmp() */
#include <unistd.h>	/* getpid() */
#include <asm/types.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>

void usage(){
  puts("usage: fclisten <on|off>");
  puts("  Default -> listening fork-connector");
  puts("  on      -> fork-connector Enable");
  puts("  off     -> fork-connector Disable");
  exit(0);
}

#define MODE_LISTEN  (1)
#define MODE_ENABLE  (2)
#define MODE_DISABLE (3)

struct cb_id
{
  __u32 idx;
  __u32 val;
};

struct cn_msg
{
  struct cb_id id;
  __u32 seq;
  __u32 ack;
  __u32 len;   /* Length of the following data */
  __u8 data[0];
};


int main(int argc, char *argv[]){
  char buf[4096];
  int mode, sockfd, len;
  struct sockaddr_nl ad;
  struct nlmsghdr *hdr = (struct nlmsghdr *)buf;
  struct cn_msg *msg = (struct cn_msg *)(buf+sizeof(struct nlmsghdr));

  switch(argc){
  case 1:
    mode = MODE_LISTEN;
    break;
  case 2:
    if (strcasecmp("on",argv[1])==0) {
      mode = MODE_ENABLE;
    }else if (strcasecmp("off",argv[1])==0){
      mode = MODE_DISABLE;
    }else{
      usage();
    }
    break;
  default:
    usage();
    break;
  }

  if( (sockfd=socket(PF_NETLINK, SOCK_RAW, NETLINK_NFLOG)) < 0 ){
    fprintf(stderr, "Fault on socket().\n");
    return( 1 );
  }
  ad.nl_family = AF_NETLINK;
  ad.nl_pad = 0;
  ad.nl_pid = getpid();
  ad.nl_groups = -1;
  if( bind(sockfd, (struct sockaddr *)&ad, sizeof(ad)) ){
    fprintf(stderr, "Fault on bind to netlink.\n");
    return( 2 );
  }

  if (mode==MODE_LISTEN) {
    /* Loop forever, printing the size and sequence number of each message. */
    while(-1){
      len = recvfrom(sockfd, buf, 4096, 0, NULL, NULL);
      printf("%d-byte recv Seq=%u\n", len, hdr->nlmsg_seq);
    }
  }else{
    ad.nl_family = AF_NETLINK;
    ad.nl_pad = 0;
    ad.nl_pid = 0;
    ad.nl_groups = 1;

    hdr->nlmsg_len = sizeof(struct nlmsghdr) + sizeof(struct cn_msg) + sizeof(int);
    hdr->nlmsg_type = 0;
    hdr->nlmsg_flags = 0;
    hdr->nlmsg_seq = 0;
    hdr->nlmsg_pid = getpid();
    msg->id.idx = 0xfeed;
    msg->id.val = 0xbeef;
    msg->seq = msg->ack = 0;
    msg->len = sizeof(int);

    /* The callback expects the integer 1 (enable) or 0 (disable) as data. */
    if (mode==MODE_ENABLE){
      (*(int *)(msg->data)) = 1;
    } else {
      (*(int *)(msg->data)) = 0;
    }
    sendto(sockfd, buf, sizeof(struct nlmsghdr)+sizeof(struct cn_msg)+sizeof(int),
           0, (struct sockaddr *)&ad, sizeof(ad));
  }
  return 0;
}

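The listener above only prints the byte count and the netlink sequence number. As a sketch of a possible next step, the function below decodes the data part of a fork message, assuming the format produced by the scnprintf() call in the patch ("%i %i", parent first, then child, NUL-terminated). parse_fork_payload() is a hypothetical helper and not part of fclisten.c.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: decode "parent child" from the data part of a
 * fork message. Returns 0 on success, -1 if the payload has no NUL
 * within len bytes or does not contain two integers.
 */
int parse_fork_payload(const char *data, size_t len, pid_t *parent, pid_t *child)
{
	int p, c;

	/* Refuse unterminated input so sscanf() cannot read past the buffer. */
	if (len == 0 || memchr(data, '\0', len) == NULL)
		return -1;
	if (sscanf(data, "%d %d", &p, &c) != 2)
		return -1;
	*parent = (pid_t)p;
	*child = (pid_t)c;
	return 0;
}
```

In the receive loop this would be called with msg->data and msg->len once the netlink and connector headers have been checked.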
* Re: [PATCH 2.6.11-rc4-mm1] connector: Add a fork connector
  2005-03-03  3:18   ` Kaigai Kohei
@ 2005-03-03  5:46     ` Evgeniy Polyakov
  2005-03-03 11:51       ` Evgeniy Polyakov
  0 siblings, 1 reply; 8+ messages in thread
From: Evgeniy Polyakov @ 2005-03-03  5:46 UTC
To: Kaigai Kohei
Cc: Guillaume Thouvenin, Andrew Morton, lkml, elsa-devel, Jay Lan, Gerrit Huizenga, Erich Focht, Netlink List

On Thu, Mar 03, 2005 at 12:18:25PM +0900, Kaigai Kohei (kaigai@ak.jp.nec.com) wrote:
> Hello, Guillaume
>
> I tried to measure the process-creation/destruction performance on 2.6.11-rc4-mm1 plus
> some extensions (Normal/with PAGG/with Fork-Connector).
> But I received the following message endlessly on the system console with the Fork-Connector extension.
>
> # on IA-64 environment / When a simple fork() iteration is run in parallel.
> skb does not have enough length: requested msg->len=10[28], nlh->nlmsg_len=48[32], skb->len=48[must be 30].
> skb does not have enough length: requested msg->len=10[28], nlh->nlmsg_len=48[32], skb->len=48[must be 30].
> skb does not have enough length: requested msg->len=10[28], nlh->nlmsg_len=48[32], skb->len=48[must be 30].
>   :
>
> It's generated at drivers/connector/connector.c:__cn_rx_skb(), and it warns that the length of the msg's payload
> does not fit in the nlmsghdr's length.
> This message means the netlink packet is not sent to user space.
> I was notified of the occurrence of fork() by printk(). :-(

No, the lengths are correct, but the skb can be dropped due to the misaligned sizes check.

> The attached simple *.c file can enable/disable the fork-connector and listen for the fork notification.
> Because this is my first experience writing code that uses netlink, please point out the right usage
> if there are mistakes in the user-side application.
>
> Thanks.
>
> P.S. I can't reproduce the lockup on the 367th fork() with your latest patch.

I've sent that patch to Guillaume and upstream; hopefully it will be integrated into the next -mm release.

> Guillaume Thouvenin wrote:
> > ChangeLog:
> >
> >   - Add parentheses around sizeof(struct cn_msg) + CN_FORK_INFO_SIZE
> >     in the CN_FORK_MSG_SIZE macro
> >   - cn_fork_lock is declared with DEFINE_SPINLOCK()
> >   - cn_fork_lock is defined as static and local to fork_connector()
> >   - Create a specific module cn_fork.c in drivers/connector to
> >     register the callback.
> >   - Improve the callback that turns on/off the fork connector
> >
> >   I also ran lmbench and the results were sent in response to another
> > thread, "A common layer for Accounting packages". When the fork connector
> > is turned off the overhead is negligible. This patch works with another
> > small patch that fixes a problem in the connector. Without it, there is a
> > message that says "skb does not have enough length". It will be fixed in
> > the next -mm tree, I think.
> >
> >
> > Thanks everyone for the comments,
> > Guillaume
>
> --
> Linux Promotion Center, NEC
> KaiGai Kohei <kaigai@ak.jp.nec.com>

> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <strings.h>	/* strcasecmp() */
> #include <unistd.h>	/* getpid() */
> #include <asm/types.h>
> #include <sys/types.h>
> #include <sys/socket.h>
> #include <linux/netlink.h>
>
> void usage(){
>   puts("usage: fclisten <on|off>");
>   puts("  Default -> listening fork-connector");
>   puts("  on      -> fork-connector Enable");
>   puts("  off     -> fork-connector Disable");
>   exit(0);
> }
>
> #define MODE_LISTEN  (1)
> #define MODE_ENABLE  (2)
> #define MODE_DISABLE (3)
>
> struct cb_id
> {
>   __u32 idx;
>   __u32 val;
> };
>
> struct cn_msg
> {
>   struct cb_id id;
>   __u32 seq;
>   __u32 ack;
>   __u32 len;   /* Length of the following data */
>   __u8 data[0];
> };
>
>
> int main(int argc, char *argv[]){
>   char buf[4096];
>   int mode, sockfd, len;
>   struct sockaddr_nl ad;
>   struct nlmsghdr *hdr = (struct nlmsghdr *)buf;
>   struct cn_msg *msg = (struct cn_msg *)(buf+sizeof(struct nlmsghdr));
>
>   switch(argc){
>   case 1:
>     mode = MODE_LISTEN;
>     break;
>   case 2:
>     if (strcasecmp("on",argv[1])==0) {
>       mode = MODE_ENABLE;
>     }else if (strcasecmp("off",argv[1])==0){
>       mode = MODE_DISABLE;
>     }else{
>       usage();
>     }
>     break;
>   default:
>     usage();
>     break;
>   }
>
>   if( (sockfd=socket(PF_NETLINK, SOCK_RAW, NETLINK_NFLOG)) < 0 ){
>     fprintf(stderr, "Fault on socket().\n");
>     return( 1 );
>   }
>   ad.nl_family = AF_NETLINK;
>   ad.nl_pad = 0;
>   ad.nl_pid = getpid();
>   ad.nl_groups = -1;

Group should be CN_IDX_FORK to receive only fork's messages.

>   if( bind(sockfd, (struct sockaddr *)&ad, sizeof(ad)) ){
>     fprintf(stderr, "Fault on bind to netlink.\n");
>     return( 2 );
>   }
>
>   if (mode==MODE_LISTEN) {
>     while(-1){
>       len = recvfrom(sockfd, buf, 4096, 0, NULL, NULL);
>       printf("%d-byte recv Seq=%u\n", len, hdr->nlmsg_seq);
>     }
>   }else{
>     ad.nl_family = AF_NETLINK;
>     ad.nl_pad = 0;
>     ad.nl_pid = 0;
>     ad.nl_groups = 1;
>
>     hdr->nlmsg_len = sizeof(struct nlmsghdr) + sizeof(struct cn_msg) + sizeof(int);
>     hdr->nlmsg_type = 0;
>     hdr->nlmsg_flags = 0;
>     hdr->nlmsg_seq = 0;
>     hdr->nlmsg_pid = getpid();
>     msg->id.idx = 0xfeed;
>     msg->id.val = 0xbeef;
>     msg->seq = msg->ack = 0;
>     msg->len = sizeof(int);
>
>     if (mode==MODE_ENABLE){
>       (*(int *)(msg->data)) = 1;
>     } else {
>       (*(int *)(msg->data)) = 0;
>     }
>     sendto(sockfd, buf, sizeof(struct nlmsghdr)+sizeof(struct cn_msg)+sizeof(int),
>            0, (struct sockaddr *)&ad, sizeof(ad));
>   }
>   return 0;
> }

Later today I will post the finished connector.c with all the pending patches in,
and a simple test program for anyone who wants to test fork() performance
with and without the fork connector enabled.
Since Guillaume is busy, I will test it on my 2-way (1+1HT) CPU system.

-- 
	Evgeniy Polyakov ( s0mbre )

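To make the numbers in the quoted log line concrete, the sketch below redoes the netlink length arithmetic with local copies of the alignment macros. It assumes a 16-byte struct nlmsghdr, 4-byte alignment and the 20-byte struct cn_msg from fclisten.c; the SK_ prefixed names and cn_nlmsg_space() are made up so that they do not clash with <linux/netlink.h>.

```c
#define SK_NLMSG_ALIGNTO	4U
#define SK_NLMSG_ALIGN(len)	(((len) + SK_NLMSG_ALIGNTO - 1) & ~(SK_NLMSG_ALIGNTO - 1))
#define SK_NLMSG_HDRLEN		16U	/* assumed sizeof(struct nlmsghdr) */
#define SK_NLMSG_LENGTH(len)	((len) + SK_NLMSG_HDRLEN)
#define SK_NLMSG_SPACE(len)	SK_NLMSG_ALIGN(SK_NLMSG_LENGTH(len))
#define SK_CN_MSG_SIZE		20U	/* assumed sizeof(struct cn_msg) */

/* Aligned size of a netlink message that carries a cn_msg with data_len bytes. */
unsigned int cn_nlmsg_space(unsigned int data_len)
{
	return SK_NLMSG_SPACE(SK_CN_MSG_SIZE + data_len);
}
```

With msg->len = 10 this gives 16 + 20 + 10 = 46, rounded up to 48, which matches the nlh->nlmsg_len=48 in the log. The 4-byte payload sent by fclisten.c gives 40, which is already aligned.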
* Re: [PATCH 2.6.11-rc4-mm1] connector: Add a fork connector
  2005-03-03  5:46     ` Evgeniy Polyakov
@ 2005-03-03 11:51       ` Evgeniy Polyakov
  2005-03-03 12:20         ` Evgeniy Polyakov
  0 siblings, 1 reply; 8+ messages in thread
From: Evgeniy Polyakov @ 2005-03-03 11:51 UTC
To: Kaigai Kohei
Cc: Guillaume Thouvenin, Andrew Morton, lkml, elsa-devel, Jay Lan, Gerrit Huizenga, Erich Focht, Netlink List

[-- Attachment #1: Type: text/plain, Size: 1437 bytes --]

Simple program to test fork() performance.

#include <stdio.h>	/* printf() */
#include <stdlib.h>	/* atoi(), exit() */
#include <unistd.h>	/* fork(), sleep() */
#include <sys/signal.h>
#include <sys/time.h>

int main(int argc, char *argv[])
{
	int pid;
	int i = 0, max = 100000;
	struct timeval tv0, tv1;
	struct timezone tz;
	long diff;

	if (argc >= 2)
		max = atoi(argv[1]);

	/* Ignore SIGCHLD so that exited children are reaped automatically. */
	signal(SIGCHLD, SIG_IGN);

	gettimeofday(&tv0, &tz);
	while (i++ < max) {
		pid = fork();
		if (pid == 0) {
			sleep(1);
			exit (0);
		}
	}
	gettimeofday(&tv1, &tz);

	diff = (tv1.tv_sec - tv0.tv_sec)*1000000 + (tv1.tv_usec - tv0.tv_usec);

	printf("Average per process fork+exit time is %ld usecs [diff=%ld, max=%d].\n", diff/max, diff, max);
	return 0;
}

Creating 10k forks 100 times.
Results on 2-way SMP (1+1HT) Xeon for one fork()+exit():

2.6.11-rc4-mm1                                494 usec
2.6.11-rc4-mm1-fork-connector-no_userspace    509 usec
2.6.11-rc4-mm1-fork-connector-userspace       520 usec

5% fork() degradation (connector with userspace vs. vanilla) with the fork() connector.
On my test system the global fork lock does not cost anything
(tested both with and without a userspace listener), but it is only 2-way (pseudo).

-- 
	Evgeniy Polyakov

Crash is better than data corruption -- Arthur Grabowski

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

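The 5% figure follows from the timings above. Here is a small sketch of the arithmetic, kept in integer tenths of a percent to avoid floating point; overhead_permille() is a made-up helper name.

```c
/*
 * Relative slowdown of measured_usec versus baseline_usec, in tenths
 * of a percent, rounded down. Assumes measured_usec >= baseline_usec
 * and baseline_usec > 0.
 */
unsigned int overhead_permille(unsigned int baseline_usec, unsigned int measured_usec)
{
	return (measured_usec - baseline_usec) * 1000U / baseline_usec;
}
```

overhead_permille(494, 520) is 52, that is about 5.2% with a userspace listener, and overhead_permille(494, 509) is 30, about 3.0% with the connector enabled but nobody listening.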
* Re: [PATCH 2.6.11-rc4-mm1] connector: Add a fork connector
  2005-03-03 11:51       ` Evgeniy Polyakov
@ 2005-03-03 12:20         ` Evgeniy Polyakov
  0 siblings, 0 replies; 8+ messages in thread
From: Evgeniy Polyakov @ 2005-03-03 12:20 UTC
To: Kaigai Kohei
Cc: Guillaume Thouvenin, Andrew Morton, lkml, elsa-devel, Jay Lan, Gerrit Huizenga, Erich Focht, Netlink List

[-- Attachment #1.1: Type: text/plain, Size: 1286 bytes --]

On Thu, 2005-03-03 at 14:51 +0300, Evgeniy Polyakov wrote:
> Simple program to test fork() performance.
...

A slightly more advanced version checks for an error value, but that never happened.
It could also take more fine-grained measurements, but IMHO the picture is clear for small systems.

> Creating 10k forks 100 times.
> Results on 2-way SMP (1+1HT) Xeon for one fork()+exit():
>
> 2.6.11-rc4-mm1                                494 usec

Actually sometimes it drops to 480 usecs.

> 2.6.11-rc4-mm1-fork-connector-no_userspace    509 usec
> 2.6.11-rc4-mm1-fork-connector-userspace       520 usec
>
> 5% fork() degradation (connector with userspace vs. vanilla) with the fork() connector.
> On my test system the global fork lock does not cost anything
> (tested both with and without a userspace listener), but it is only 2-way (pseudo).

The connector.c used in the experiments is attached.

If analysis of the fork connector shows that the global fork lock is a big bottleneck,
then the seq counter can be replaced with a per-cpu counter, but then the
inner header should include the cpu id to properly distinguish messages.
But that is entirely the fork connector's area, so I will not break things.

-- 
	Evgeniy Polyakov

Crash is better than data corruption -- Arthur Grabowski

[-- Attachment #1.2: connector.c --]
[-- Type: text/x-csrc, Size: 13174 bytes --]

/* * connector.c * * 2004 Copyright (c) Evgeniy Polyakov <johnpol@2ka.mipt.ru> * All rights reserved. 
* * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ #include <linux/kernel.h> #include <linux/module.h> #include <linux/list.h> #include <linux/skbuff.h> #include <linux/netlink.h> #include <linux/moduleparam.h> #include <linux/connector.h> #include <net/sock.h> MODULE_LICENSE("GPL"); MODULE_AUTHOR("Evgeniy Polyakov <johnpol@2ka.mipt.ru>"); MODULE_DESCRIPTION("Generic userspace <-> kernelspace connector."); static int unit = NETLINK_NFLOG; static u32 cn_idx = -1; static u32 cn_val = -1; module_param(unit, int, 0); module_param(cn_idx, uint, 0); module_param(cn_val, uint, 0); static DEFINE_SPINLOCK(notify_lock); static LIST_HEAD(notify_list); static struct cn_dev cdev; int cn_already_initialized = 0; int cn_fork_enable = 0; struct cb_id cb_fork_id = { CN_IDX_FORK, CN_VAL_FORK }; /* * msg->seq and msg->ack are used to determine message genealogy. * When someone sends message it puts there locally unique sequence * and random acknowledge numbers. * Sequence number may be copied into nlmsghdr->nlmsg_seq too. * * Sequence number is incremented with each message to be sent. * * If we expect reply to our message, * then sequence number in received message MUST be the same as in original message, * and acknowledge number MUST be the same + 1. 
 *
 * If we receive message and it's sequence number is not equal to one we are expecting,
 * then it is new message.
 * If we receive message and it's sequence number is the same as one we are expecting,
 * but it's acknowledge is not equal acknowledge number in original message + 1,
 * then it is new message.
 *
 */
void cn_netlink_send(struct cn_msg *msg, u32 __groups)
{
	struct cn_callback_entry *n, *__cbq;
	unsigned int size;
	struct sk_buff *skb, *uskb;
	struct nlmsghdr *nlh;
	struct cn_msg *data;
	struct cn_dev *dev = &cdev;
	u32 groups = 0;
	int found = 0;

	if (!__groups) {
		spin_lock_bh(&dev->cbdev->queue_lock);
		list_for_each_entry_safe(__cbq, n, &dev->cbdev->queue_list, callback_entry) {
			if (cn_cb_equal(&__cbq->cb->id, &msg->id)) {
				found = 1;
				groups = __cbq->group;
			}
		}
		spin_unlock_bh(&dev->cbdev->queue_lock);

		if (!found) {
			printk(KERN_ERR "Failed to find multicast netlink group for callback[0x%x.0x%x]. seq=%u\n",
			       msg->id.idx, msg->id.val, msg->seq);
			return;
		}
	} else
		groups = __groups;

	size = NLMSG_SPACE(sizeof(*msg) + msg->len);

	skb = alloc_skb(size, GFP_ATOMIC);
	if (!skb) {
		printk(KERN_ERR "Failed to allocate new skb with size=%u.\n", size);
		return;
	}

	nlh = NLMSG_PUT(skb, 0, msg->seq, NLMSG_DONE, size - NLMSG_ALIGN(sizeof(*nlh)));

	data = (struct cn_msg *)NLMSG_DATA(nlh);

	memcpy(data, msg, sizeof(*data) + msg->len);
#if 0
	printk("%s: len=%u, seq=%u, ack=%u, group=%u.\n",
	       __func__, msg->len, msg->seq, msg->ack, groups);
#endif

	NETLINK_CB(skb).dst_groups = groups;

	uskb = skb_clone(skb, GFP_ATOMIC);
	if (uskb) {
		netlink_unicast(dev->nls, uskb, 0, 0);
	}

	netlink_broadcast(dev->nls, skb, 0, groups, GFP_ATOMIC);

	return;

nlmsg_failure:
	printk(KERN_ERR "Failed to send %u.%u\n", msg->seq, msg->ack);
	kfree_skb(skb);
	return;
}

static int cn_call_callback(struct cn_msg *msg, void (*destruct_data) (void *), void *data)
{
	struct cn_callback_entry *n, *__cbq;
	struct cn_dev *dev = &cdev;
	int found = 0;

	spin_lock_bh(&dev->cbdev->queue_lock);
	list_for_each_entry_safe(__cbq, n, &dev->cbdev->queue_list, callback_entry) {
		if (cn_cb_equal(&__cbq->cb->id, &msg->id)) {
			__cbq->cb->priv = msg;

			__cbq->ddata = data;
			__cbq->destruct_data = destruct_data;

			queue_work(dev->cbdev->cn_queue, &__cbq->work);
			found = 1;
			break;
		}
	}
	spin_unlock_bh(&dev->cbdev->queue_lock);

	return found;
}

static int __cn_rx_skb(struct sk_buff *skb, struct nlmsghdr *nlh)
{
	u32 pid, uid, seq, group;
	struct cn_msg *msg;

	pid = NETLINK_CREDS(skb)->pid;
	uid = NETLINK_CREDS(skb)->uid;
	seq = nlh->nlmsg_seq;
	group = NETLINK_CB((skb)).groups;
	msg = (struct cn_msg *)NLMSG_DATA(nlh);

	if (NLMSG_SPACE(msg->len + sizeof(*msg)) != nlh->nlmsg_len) {
		printk(KERN_ERR "skb does not have enough length: "
		       "requested msg->len=%u[%u], nlh->nlmsg_len=%u, skb->len=%u.\n",
		       msg->len, NLMSG_SPACE(msg->len + sizeof(*msg)),
		       nlh->nlmsg_len, skb->len);
		kfree_skb(skb);
		return -EINVAL;
	}
#if 0
	printk(KERN_INFO "pid=%u, uid=%u, seq=%u, group=%u.\n", pid, uid, seq, group);
#endif
	return cn_call_callback(msg, (void (*)(void *))kfree_skb, skb);
}

static void cn_rx_skb(struct sk_buff *__skb)
{
	struct nlmsghdr *nlh;
	u32 len;
	int err;
	struct sk_buff *skb;

	skb = skb_get(__skb);
	if (!skb) {
		printk(KERN_ERR "Failed to reference an skb.\n");
		kfree_skb(__skb);
		return;
	}
#if 0
	printk(KERN_INFO "skb: len=%u, data_len=%u, truesize=%u, proto=%u, cloned=%d, shared=%d.\n",
	       skb->len, skb->data_len, skb->truesize, skb->protocol,
	       skb_cloned(skb), skb_shared(skb));
#endif
	while (skb->len >= NLMSG_SPACE(0)) {
		nlh = (struct nlmsghdr *)skb->data;
		if (nlh->nlmsg_len < sizeof(struct cn_msg) ||
		    skb->len < nlh->nlmsg_len ||
		    nlh->nlmsg_len > CONNECTOR_MAX_MSG_SIZE) {
#if 0
			printk(KERN_INFO "nlmsg_len=%u, sizeof(*nlh)=%u\n",
			       nlh->nlmsg_len, sizeof(*nlh));
#endif
			kfree_skb(skb);
			break;
		}

		len = NLMSG_ALIGN(nlh->nlmsg_len);
		if (len > skb->len)
			len = skb->len;

		err = __cn_rx_skb(skb, nlh);
		if (err) {
#if 0
			if (err < 0 && (nlh->nlmsg_flags & NLM_F_ACK))
				netlink_ack(skb, nlh, -err);
#endif
			break;
		} else {
#if 0
			if (nlh->nlmsg_flags & NLM_F_ACK)
				netlink_ack(skb, nlh, 0);
#endif
			break;
		}
		skb_pull(skb, len);
	}

	kfree_skb(__skb);
}

static void cn_input(struct sock *sk, int len)
{
	struct sk_buff *skb;

	while ((skb = skb_dequeue(&sk->sk_receive_queue)) != NULL)
		cn_rx_skb(skb);
}

static void cn_notify(struct cb_id *id, u32 notify_event)
{
	struct cn_ctl_entry *ent;

	spin_lock_bh(&notify_lock);
	list_for_each_entry(ent, &notify_list, notify_entry) {
		int i;
		struct cn_notify_req *req;
		struct cn_ctl_msg *ctl = ent->msg;
		int a, b;

		a = b = 0;

		req = (struct cn_notify_req *)ctl->data;
		for (i=0; i<ctl->idx_notify_num; ++i, ++req) {
			if (id->idx >= req->first && id->idx < req->first + req->range) {
				a = 1;
				break;
			}
		}

		for (i=0; i<ctl->val_notify_num; ++i, ++req) {
			if (id->val >= req->first && id->val < req->first + req->range) {
				b = 1;
				break;
			}
		}

		if (a && b) {
			struct cn_msg m;

			printk(KERN_INFO "Notifying group %x with event %u about %x.%x.\n",
			       ctl->group, notify_event, id->idx, id->val);

			memset(&m, 0, sizeof(m));
			m.ack = notify_event;

			memcpy(&m.id, id, sizeof(m.id));
			cn_netlink_send(&m, ctl->group);
		}
	}
	spin_unlock_bh(&notify_lock);
}

int cn_add_callback(struct cb_id *id, char *name, void (*callback) (void *))
{
	int err;
	struct cn_dev *dev = &cdev;
	struct cn_callback *cb;

	cb = kmalloc(sizeof(*cb), GFP_KERNEL);
	if (!cb) {
		printk(KERN_INFO "%s: Failed to allocate new struct cn_callback.\n",
		       dev->cbdev->name);
		return -ENOMEM;
	}

	memset(cb, 0, sizeof(*cb));

	snprintf(cb->name, sizeof(cb->name), "%s", name);

	memcpy(&cb->id, id, sizeof(cb->id));
	cb->callback = callback;

	atomic_set(&cb->refcnt, 0);

	err = cn_queue_add_callback(dev->cbdev, cb);
	if (err) {
		kfree(cb);
		return err;
	}

	cn_notify(id, 0);

	return 0;
}

void cn_del_callback(struct cb_id *id)
{
	struct cn_dev *dev = &cdev;
	struct cn_callback_entry *n, *__cbq;

	list_for_each_entry_safe(__cbq, n, &dev->cbdev->queue_list, callback_entry) {
		if (cn_cb_equal(&__cbq->cb->id, id)) {
			cn_queue_del_callback(dev->cbdev, __cbq->cb);
			cn_notify(id, 1);
			break;
		}
	}
}

static int cn_ctl_msg_equals(struct cn_ctl_msg *m1, struct cn_ctl_msg *m2)
{
	int i;
	struct cn_notify_req *req1, *req2;

	if (m1->idx_notify_num != m2->idx_notify_num)
		return 0;

	if (m1->val_notify_num != m2->val_notify_num)
		return 0;

	if (m1->len != m2->len)
		return 0;

	if ((m1->idx_notify_num + m1->val_notify_num)*sizeof(*req1) != m1->len) {
		printk(KERN_ERR "Notify entry[idx_num=%x, val_num=%x, len=%u] contains garbage. Removing.\n",
		       m1->idx_notify_num, m1->val_notify_num, m1->len);
		return 1;
	}

	req1 = (struct cn_notify_req *)m1->data;
	req2 = (struct cn_notify_req *)m2->data;

	for (i=0; i<m1->idx_notify_num; ++i) {
		if (memcmp(req1, req2, sizeof(*req1)))
			return 0;

		req1++;
		req2++;
	}

	for (i=0; i<m1->val_notify_num; ++i) {
		if (memcmp(req1, req2, sizeof(*req1)))
			return 0;

		req1++;
		req2++;
	}

	return 1;
}

static void cn_callback(void * data)
{
	struct cn_msg *msg = (struct cn_msg *)data;
	struct cn_ctl_msg *ctl;
	struct cn_ctl_entry *ent;
	u32 size;

	if (msg->len < sizeof(*ctl)) {
		printk(KERN_ERR "Wrong connector request size %u, must be >= %u.\n",
		       msg->len, sizeof(*ctl));
		return;
	}

	ctl = (struct cn_ctl_msg *)msg->data;

	size = sizeof(*ctl) + (ctl->idx_notify_num + ctl->val_notify_num)*sizeof(struct cn_notify_req);

	if (msg->len != size) {
		printk(KERN_ERR "Wrong connector request size %u, must be == %u.\n",
		       msg->len, size);
		return;
	}

	if (ctl->len + sizeof(*ctl) != msg->len) {
		printk(KERN_ERR "Wrong message: msg->len=%u must be equal to inner_len=%u [+%u].\n",
		       msg->len, ctl->len, sizeof(*ctl));
		return;
	}

	/*
	 * Remove notification.
	 */
	if (ctl->group == 0) {
		struct cn_ctl_entry *n;

		spin_lock_bh(&notify_lock);
		list_for_each_entry_safe(ent, n, &notify_list, notify_entry) {
			if (cn_ctl_msg_equals(ent->msg, ctl)) {
				list_del(&ent->notify_entry);
				kfree(ent);
			}
		}
		spin_unlock_bh(&notify_lock);

		return;
	}

	size += sizeof(*ent);

	ent = kmalloc(size, GFP_ATOMIC);
	if (!ent) {
		printk(KERN_ERR "Failed to allocate %d bytes for new notify entry.\n", size);
		return;
	}

	memset(ent, 0, size);

	ent->msg = (struct cn_ctl_msg *)(ent + 1);

	memcpy(ent->msg, ctl, size - sizeof(*ent));

	spin_lock_bh(&notify_lock);
	list_add(&ent->notify_entry, &notify_list);
	spin_unlock_bh(&notify_lock);

	{
		int i;
		struct cn_notify_req *req;

		printk("Notify group %x for idx: ", ctl->group);

		req = (struct cn_notify_req *)ctl->data;
		for (i=0; i<ctl->idx_notify_num; ++i, ++req) {
			printk("%u-%u ", req->first, req->first+req->range-1);
		}

		printk("\nNotify group %x for val: ", ctl->group);

		for (i=0; i<ctl->val_notify_num; ++i, ++req) {
			printk("%u-%u ", req->first, req->first+req->range-1);
		}
		printk("\n");
	}
}

static void cn_fork_callback(void *data)
{
	if (cn_already_initialized)
		cn_fork_enable = 1;
}

static int cn_init(void)
{
	struct cn_dev *dev = &cdev;
	int err;

	dev->input = cn_input;
	dev->id.idx = cn_idx;
	dev->id.val = cn_val;

	dev->nls = netlink_kernel_create(unit, dev->input);
	if (!dev->nls) {
		printk(KERN_ERR "Failed to create new netlink socket(%u).\n", unit);
		return -EIO;
	}

	dev->cbdev = cn_queue_alloc_dev("cqueue", dev->nls);
	if (!dev->cbdev) {
		if (dev->nls->sk_socket)
			sock_release(dev->nls->sk_socket);
		return -EINVAL;
	}

	err = cn_add_callback(&dev->id, "connector", &cn_callback);
	if (err) {
		cn_queue_free_dev(dev->cbdev);
		if (dev->nls->sk_socket)
			sock_release(dev->nls->sk_socket);
		return -EINVAL;
	}

	err = cn_add_callback(&cb_fork_id, "cn_fork", &cn_fork_callback);
	if (err) {
		cn_del_callback(&dev->id);
		cn_queue_free_dev(dev->cbdev);
		if (dev->nls->sk_socket)
			sock_release(dev->nls->sk_socket);
		return -EINVAL;
	}

	cn_already_initialized = 1;

	return 0;
}

static void cn_fini(void)
{
	struct cn_dev *dev = &cdev;

	cn_del_callback(&dev->id);
	cn_queue_free_dev(dev->cbdev);
	if (dev->nls->sk_socket)
		sock_release(dev->nls->sk_socket);
}

module_init(cn_init);
module_exit(cn_fini);

EXPORT_SYMBOL_GPL(cn_add_callback);
EXPORT_SYMBOL_GPL(cn_del_callback);
EXPORT_SYMBOL_GPL(cn_netlink_send);

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
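Editorial note: the comment above cn_netlink_send() states the seq/ack convention only in prose. A minimal userspace sketch of that rule follows, assuming the receiver remembers the seq and ack of the message it sent. The names struct seq_ack and cn_is_new_message() are hypothetical and are not part of the connector code; the posted source does not show where, or whether, the kernel side enforces this check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the seq and ack fields of struct cn_msg. */
struct seq_ack {
	uint32_t seq;
	uint32_t ack;
};

/*
 * Return true when 'rx' should be treated as a new message rather than
 * as the reply to 'sent':
 *   - the sequence number differs from the expected one, or
 *   - the sequence number matches but ack != original ack + 1.
 */
static bool cn_is_new_message(const struct seq_ack *sent,
			      const struct seq_ack *rx)
{
	if (rx->seq != sent->seq)
		return true;

	/* The cast keeps the comparison in 32 bits so the +1 wraps like the field. */
	if (rx->ack != (uint32_t)(sent->ack + 1))
		return true;

	return false;
}
```

Under this reading, a message counts as a reply only when both conditions hold: same seq, and ack exactly one greater than in the original message.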
end of thread (newest message: ~2005-03-03 12:20 UTC)
Thread overview: 8+ messages
[not found] <1109240677.1738.196.camel@frecb000711.frec.bull.fr>
2005-03-02 8:48 ` [PATCH 2.6.11-rc4-mm1] connector: Add a fork connector Guillaume Thouvenin
2005-03-02 14:51 ` Paul Jackson
2005-03-02 17:48 ` Jesse Barnes
2005-03-02 15:50 ` Paul Jackson
2005-03-03 3:18 ` Kaigai Kohei
2005-03-03 5:46 ` Evgeniy Polyakov
2005-03-03 11:51 ` Evgeniy Polyakov
2005-03-03 12:20 ` Evgeniy Polyakov