* Re: Re: A common layer for Accounting packages [not found] ` <20050227140355.GA23055@logos.cnet> @ 2005-02-28 1:59 ` Kaigai Kohei 2005-02-28 2:32 ` [Lse-tech] " Thomas Graf ` (2 more replies) 0 siblings, 3 replies; 26+ messages in thread From: Kaigai Kohei @ 2005-02-28 1:59 UTC (permalink / raw) To: Marcelo Tosatti Cc: Andrew Morton, davem, jlan, lse-tech, linux-kernel, netdev

Hello,

Marcelo Tosatti wrote:
> Yep, the netlink people should be able to help - they known what would be
> required for not sending messages in case there is no listener registered.
>
> Maybe its already possible? I have never used netlink myself.

If we notify user space of the fork/exec/exit events directly, as you said,
I don't think any hacking on netlink is necessary.
For example, such packets are sent only when /proc/sys/.../process_grouping is set;
the user-side daemon sets this value and unsets it when the daemon exits.
It's not necessary to take this too seriously.

>>And, why can't netlink packets send always?
>>If there are fork/exec/exit hooks, and they call CSA or other
>>process-grouping modules,
>>then those modules will decide whether packets for interaction with the
>>daemon should be
>>sent or not.
>
>
> The netlink data will be sent to userspace at fork/exec/exit hooks - one wants
> to avoid that if there are no listeners, so setups which dont want to run the
> accounting daemon dont pay the cost of building and sending the information
> through netlink.
>
> Thats what Andrew asked for if I understand correctly.

Does it mean "netlink packets should be sent to userspace unconditionally"?
I have steadfastly advocated that fork/exec/exit hooks are necessary to support
process grouping and per-group accounting.
In my understanding, the hooked functions are intended to decide whether packets
should be sent or not.
Is the choice of whether to use a netlink socket also just one of the possible implementations?

>>In most considerable case, CSA's kernel-loadable-module using such hooks
>>will not be loaded
>>when no accounting daemon is running. Adversely, this module must be loaded
>>when accounting
>>daemon needs CSA's netlink packets.
>>Thus, it is only necessary to refer flag valiable and to execute
>>conditional-jump
>>when no-accounting daemon is running.
>
>
> That would be one hack, although it is uglier than the pure netlink
> selection.

No, I can't agree with this opinion.
It means netlink packets will be sent unconditionally whenever fork/exec/exit occurs.
Nobody can decide which packets are sent to user space, I think.
In addition, the definition of process grouping is lightweight in many cases.
For example, CpuSet can define its own process group with one increment operation.
I think it's not impossible to implement this in userspace, but it's not reasonable.

An implementation as a kernel loadable module is reasonable and small enough.

>>In my estimation, we must pay additional cost for an increment-operation,
>>an decrement-op,
>>an comparison-op and an conditional jump-op. It's enough lightweight, I
>>think.
>>
>>For example:
>>If CSA's module isn't loaded, 'privates_for_grouping' will be empty.
>>
>>inline int on_fork_hook(task_struct *parent, task_struct *newtask){
>>  rcu_read_lock();
>>  if( !list_empty(&parent->privates_for_grouping) ){
>>    ..<Calling to any process grouping module>..;
>>  }
>>  rcu_read_unlock();
>>}
>
>
> Andrew has been talking about sending data over netlink to implement the
> accounting at userspace, so this piece of code is out of the game, no?

Indeed, I'm not opposed to implementing the accounting in userspace and using
a netlink socket for kernel-daemon communication.

But the definition of process grouping, based on whatever grouping policy, should be done
in kernel space, because that is the more reasonable place for it.

Thanks.
-- Linux Promotion Center, NEC KaiGai Kohei <kaigai@ak.jp.nec.com> ^ permalink raw reply [flat|nested] 26+ messages in thread
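For readers following the argument above: a minimal userspace sketch of the "pay almost nothing when no grouping module is loaded" idea. It is not kernel code and not the CSA implementation. The names grouping_hook, register_grouping_hook and demo_on_fork are hypothetical, pids stand in for task_struct pointers, and the RCU locking from the quoted snippet is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a process-grouping module's fork callback. */
struct grouping_hook {
    void (*on_fork)(int parent_pid, int child_pid);
    struct grouping_hook *next;
};

/* NULL while no grouping module is registered (the "module not loaded" case). */
static struct grouping_hook *grouping_hooks;

/* Push a module's hook onto the list; no locking in this sketch. */
void register_grouping_hook(struct grouping_hook *hook)
{
    assert(hook != NULL && hook->on_fork != NULL);
    hook->next = grouping_hooks;
    grouping_hooks = hook;
}

/* Returns how many hooks ran. With nothing registered the loop body never
 * executes, so the cost is a single pointer comparison. */
int on_fork_hook(int parent_pid, int child_pid)
{
    int called = 0;

    for (struct grouping_hook *h = grouping_hooks; h != NULL; h = h->next) {
        h->on_fork(parent_pid, child_pid);
        called++;
    }
    return called;
}

/* Demo callback (assumption): only counts the forks it was told about. */
static int demo_forks_seen;

void demo_on_fork(int parent_pid, int child_pid)
{
    (void)parent_pid;
    (void)child_pid;
    demo_forks_seen++;
}
```

Whether a hook then builds a netlink packet is left to the registered module, which is the split being argued for in the message above.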
* Re: [Lse-tech] Re: A common layer for Accounting packages 2005-02-28 1:59 ` Re: A common layer for Accounting packages Kaigai Kohei @ 2005-02-28 2:32 ` Thomas Graf 2005-02-28 5:17 ` Evgeniy Polyakov 2005-02-28 7:20 ` [Lse-tech] " Guillaume Thouvenin 2 siblings, 0 replies; 26+ messages in thread From: Thomas Graf @ 2005-02-28 2:32 UTC (permalink / raw) To: Kaigai Kohei Cc: Marcelo Tosatti, Andrew Morton, davem, jlan, lse-tech, linux-kernel, netdev

First of all, I'm not aware of the whole discussion, so ignore this if it
has been brought to attention already.

> > Yep, the netlink people should be able to help - they known what would be
> > required for not sending messages in case there is no listener registered.
> >
> > Maybe its already possible? I have never used netlink myself.

The easiest way is to use netlink_broadcast() and have userspace
register to a netlink multicast group (set .nl_groups before connecting
the socket). The netlink message will be sent only to those netlink
sockets assigned to the group; no message will be sent out if no
userspace listener has registered.

Did you have a look at the syscall enter/exit audit netlink hooks before
trying to invent your own thing?

I can also give you some code if you want. I use it to track the path of
skbs in the net stack. It puts events into a preallocated ring buffer,
and a separate kernel thread broadcasts them over netlink. The events
can be enqueued in any context, at the cost of a possible ring buffer
overrun resulting in loss of events. It's just a debugging hack though.

^ permalink raw reply [flat|nested] 26+ messages in thread
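The userspace half of the multicast-group registration described above might look roughly like the sketch below. It uses the standard Linux netlink socket API, where the group mask goes into sockaddr_nl.nl_groups and is applied with bind(). The helper names nl_group_mask, nl_listener_addr and nl_open_listener are made up for illustration, and the protocol and group numbers are left to the caller because the thread does not fix them.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* The legacy nl_groups field is a 32-bit mask: group N (1..32) is bit N-1. */
unsigned int nl_group_mask(unsigned int group)
{
    assert(group >= 1 && group <= 32);
    return 1u << (group - 1);
}

/* Address a listener binds to; nl_pid = 0 lets the kernel assign one. */
struct sockaddr_nl nl_listener_addr(unsigned int group)
{
    struct sockaddr_nl addr;

    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_groups = nl_group_mask(group);
    return addr;
}

/* Open a netlink socket for `protocol` subscribed to `group`.
 * Returns the fd, or -1 on error (errno is set by socket()/bind()). */
int nl_open_listener(int protocol, unsigned int group)
{
    struct sockaddr_nl addr = nl_listener_addr(group);
    int fd = socket(AF_NETLINK, SOCK_RAW, protocol);

    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Closing the returned fd, or the process dying, drops the subscription, which is why a kernel-side listener check copes with crashed daemons. Some protocols only allow privileged processes to join groups, so bind() may fail with EPERM.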
* Re: Re: A common layer for Accounting packages 2005-02-28 1:59 ` Re: A common layer for Accounting packages Kaigai Kohei 2005-02-28 2:32 ` [Lse-tech] " Thomas Graf @ 2005-02-28 5:17 ` Evgeniy Polyakov 2005-02-28 7:20 ` [Lse-tech] " Guillaume Thouvenin 2 siblings, 0 replies; 26+ messages in thread From: Evgeniy Polyakov @ 2005-02-28 5:17 UTC (permalink / raw) To: Kaigai Kohei Cc: Marcelo Tosatti, Andrew Morton, davem, jlan, lse-tech, linux-kernel, netdev [-- Attachment #1: Type: text/plain, Size: 1029 bytes --] On Mon, 2005-02-28 at 10:59 +0900, Kaigai Kohei wrote: > Hello, > > Marcelo Tosatti wrote: > > Yep, the netlink people should be able to help - they known what would be > > required for not sending messages in case there is no listener registered. > > > > Maybe its already possible? I have never used netlink myself. > > If we notify the fork/exec/exit-events to user-space directly as you said, > I don't think some hackings on netlink is necessary. > For example, such packets is sent only when /proc/sys/.../process_grouping is set, > and user-side daemon set this value, and unset when daemon will exit. > It's not necessary to take too seriously. Kernel accounting already was discussed in lkml week ago - I'm quite sure Guillaume Thouvenin created exactly that. His module creates do_fork() hook and broadcasts various process' states over netlink. Discussion at http://lkml.org/lkml/2005/2/17/87 -- Evgeniy Polyakov Crash is better than data corruption -- Arthur Grabowski [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages 2005-02-28 1:59 ` Re: A common layer for Accounting packages Kaigai Kohei 2005-02-28 2:32 ` [Lse-tech] " Thomas Graf 2005-02-28 5:17 ` Evgeniy Polyakov @ 2005-02-28 7:20 ` Guillaume Thouvenin 2005-02-28 7:39 ` Andrew Morton 2 siblings, 1 reply; 26+ messages in thread From: Guillaume Thouvenin @ 2005-02-28 7:20 UTC (permalink / raw) To: Kaigai Kohei Cc: Marcelo Tosatti, Andrew Morton, davem, jlan, LSE-Tech, lkml, netdev, elsa-devel

On Mon, 2005-02-28 at 10:59 +0900, Kaigai Kohei wrote:
> Marcelo Tosatti wrote:
> > Yep, the netlink people should be able to help - they known what would be
> > required for not sending messages in case there is no listener registered.
> >
> > Maybe its already possible? I have never used netlink myself.
>
> If we notify the fork/exec/exit-events to user-space directly as you said,
> I don't think some hackings on netlink is necessary.
> For example, such packets is sent only when /proc/sys/.../process_grouping is set,
> and user-side daemon set this value, and unset when daemon will exit.
> It's not necessary to take too seriously.

I wrote a new fork connector patch with a callback to enable/disable
messages depending on whether there is a listener. I will post it this
week. Basically there is a global variable that is manipulated through a
connector callback, so a user space daemon can change the variable. In
the fork_connector() function you have:

static inline void fork_connector(pid_t parent, pid_t child)
{
	static DEFINE_SPINLOCK(cn_fork_lock);
	static __u32 seq;   /* used to test if message is lost */

	if (cn_fork_enable) {

		[...]

		cn_netlink_send(msg, CN_IDX_FORK);
	}
}

and in the cn_fork module (drivers/connector/cn_fork.c) the callback is
defined as:

static void cn_fork_callback(void *data)
{
	if (cn_already_initialized)
		cn_fork_enable = cn_fork_enable ? 0 : 1;
}

OK, the protocol is maybe too "basic", but with this mechanism the user
space application that uses the fork connector can start and stop the
sending of messages. This implementation needs some improvements because
currently, if two applications are using the fork connector, one can
enable it and the other doesn't know whether it is enabled or not, but
the idea is there, I think.

Regards,
Guillaume

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages 2005-02-28 7:20 ` [Lse-tech] " Guillaume Thouvenin @ 2005-02-28 7:39 ` Andrew Morton 2005-02-28 8:04 ` Evgeniy Polyakov 2005-02-28 12:10 ` jamal 0 siblings, 2 replies; 26+ messages in thread From: Andrew Morton @ 2005-02-28 7:39 UTC (permalink / raw) To: Guillaume Thouvenin Cc: kaigai, marcelo.tosatti, davem, jlan, lse-tech, linux-kernel, netdev, elsa-devel Guillaume Thouvenin <guillaume.thouvenin@bull.net> wrote: > > Ok the protocol is maybe too "basic" but with this mechanism the user > space application that uses the fork connector can start and stop the > send of messages. This implementation needs somme improvements because > currently, if two application are using the fork connector one can > enable it and the other don't know if it is enable or not, but the idea > is here I think. Yes. But this problem can be solved in userspace, with a little library function and a bit of locking. IOW: use the library to enable/disable the fork connector rather than directly doing syscalls. It has the problem that if a client of that library crashes, the counter gets out of whack, but really, it's not all _that_ important, and to handle this properly in-kernel each client would need an open fd against some object so we can do the close-on-exit thing properly. You'd need to create a separate netlink socket for the purpose. ^ permalink raw reply [flat|nested] 26+ messages in thread
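The counter-based library suggested above could be sketched as follows. This shows only the counting logic: the struct fork_switch, its set_enabled callback and the demo backend are hypothetical names, and the "bit of locking" (which would have to work across processes, for example a file lock around a shared counter) is deliberately left out.

```c
#include <assert.h>

/* Hypothetical library state: how many clients currently want fork events,
 * plus a callback that actually flips the kernel-side switch. */
struct fork_switch {
    unsigned int users;
    void (*set_enabled)(int on);
};

/* A client starts listening: the first user turns the connector on.
 * Returns the new user count. */
unsigned int fork_switch_get(struct fork_switch *s)
{
    if (s->users++ == 0)
        s->set_enabled(1);
    return s->users;
}

/* A client stops listening: the last user turns the connector off.
 * Returns the new user count. */
unsigned int fork_switch_put(struct fork_switch *s)
{
    assert(s->users > 0);
    if (--s->users == 0)
        s->set_enabled(0);
    return s->users;
}

/* Demo backend (assumption): records the state instead of talking to the kernel. */
static int demo_state;

void demo_set_enabled(int on)
{
    demo_state = on;
}
```

Compared with the toggle in cn_fork_callback(), a second client calling fork_switch_get() cannot switch off events the first client still needs. As noted in the message above, a client that crashes without calling fork_switch_put() leaves the count too high.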
* Re: [Lse-tech] Re: A common layer for Accounting packages 2005-02-28 7:39 ` Andrew Morton @ 2005-02-28 8:04 ` Evgeniy Polyakov 2005-02-28 12:10 ` jamal 1 sibling, 0 replies; 26+ messages in thread From: Evgeniy Polyakov @ 2005-02-28 8:04 UTC (permalink / raw) To: Andrew Morton Cc: Guillaume Thouvenin, kaigai, marcelo.tosatti, davem, jlan, lse-tech, linux-kernel, netdev, elsa-devel [-- Attachment #1: Type: text/plain, Size: 1369 bytes --] On Sun, 2005-02-27 at 23:39 -0800, Andrew Morton wrote: > Guillaume Thouvenin <guillaume.thouvenin@bull.net> wrote: > > > > Ok the protocol is maybe too "basic" but with this mechanism the user > > space application that uses the fork connector can start and stop the > > send of messages. This implementation needs somme improvements because > > currently, if two application are using the fork connector one can > > enable it and the other don't know if it is enable or not, but the idea > > is here I think. > > Yes. But this problem can be solved in userspace, with a little library > function and a bit of locking. > > IOW: use the library to enable/disable the fork connector rather than > directly doing syscalls. > > It has the problem that if a client of that library crashes, the counter > gets out of whack, but really, it's not all _that_ important, and to handle > this properly in-kernel each client would need an open fd against some > object so we can do the close-on-exit thing properly. You'd need to create > a separate netlink socket for the purpose. Why dont just extend protocol a bit? Add header after cn_msg, which will have get/set field and that is all. Properly using seq/ack fields userspace can avoid locks. -- Evgeniy Polyakov Crash is better than data corruption -- Arthur Grabowski [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
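One possible shape for the protocol extension proposed above, as a sketch only. The struct cn_fork_ctl layout, the CN_FORK_OP_* values and cn_fork_ctl_apply() are all invented for illustration; nothing here is taken from the connector sources, and the seq/ack handling mentioned in the message is not modelled.

```c
#include <stdint.h>

/* Hypothetical operations carried in the control header. */
enum cn_fork_op {
    CN_FORK_OP_GET = 0,   /* report the current state */
    CN_FORK_OP_SET = 1    /* set the state to `value`, then report it */
};

/* Hypothetical header placed in the payload right after struct cn_msg. */
struct cn_fork_ctl {
    uint32_t op;      /* one of enum cn_fork_op */
    uint32_t value;   /* for SET: 0 = disable, non-zero = enable */
};

/* What a receiving callback might do with the header.
 * Returns the resulting state (0 or 1), or -1 for an unknown op. */
int cn_fork_ctl_apply(int *enabled, const struct cn_fork_ctl *ctl)
{
    switch (ctl->op) {
    case CN_FORK_OP_SET:
        *enabled = (ctl->value != 0);
        return *enabled;
    case CN_FORK_OP_GET:
        return *enabled;
    default:
        return -1;
    }
}
```

With an explicit SET value instead of a toggle, and a GET to read the state back, a second application can find out whether the connector is already enabled before changing anything.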
* Re: [Lse-tech] Re: A common layer for Accounting packages 2005-02-28 7:39 ` Andrew Morton 2005-02-28 8:04 ` Evgeniy Polyakov @ 2005-02-28 12:10 ` jamal 2005-02-28 9:29 ` Marcelo Tosatti ` (2 more replies) 1 sibling, 3 replies; 26+ messages in thread From: jamal @ 2005-02-28 12:10 UTC (permalink / raw) To: Andrew Morton Cc: Guillaume Thouvenin, kaigai, marcelo.tosatti, David S. Miller, jlan, lse-tech, linux-kernel, netdev, elsa-devel Havent seen the beginnings of this thread. But whatever you are trying to do seems to suggest some complexity that you are trying to workaround. What was wrong with just going ahead and just always invoking your netlink_send()? If there are nobody in user space (or kernel) listening, it wont go anywhere. cheers, jamal On Mon, 2005-02-28 at 02:39, Andrew Morton wrote: > Guillaume Thouvenin <guillaume.thouvenin@bull.net> wrote: > > > > Ok the protocol is maybe too "basic" but with this mechanism the user > > space application that uses the fork connector can start and stop the > > send of messages. This implementation needs somme improvements because > > currently, if two application are using the fork connector one can > > enable it and the other don't know if it is enable or not, but the idea > > is here I think. > > Yes. But this problem can be solved in userspace, with a little library > function and a bit of locking. > > IOW: use the library to enable/disable the fork connector rather than > directly doing syscalls. > > It has the problem that if a client of that library crashes, the counter > gets out of whack, but really, it's not all _that_ important, and to handle > this properly in-kernel each client would need an open fd against some > object so we can do the close-on-exit thing properly. You'd need to create > a separate netlink socket for the purpose. 
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Re: A common layer for Accounting packages 2005-02-28 12:10 ` jamal @ 2005-02-28 9:29 ` Marcelo Tosatti 2005-02-28 13:20 ` [Lse-tech] " Thomas Graf 2005-03-01 20:40 ` Paul Jackson 2 siblings, 0 replies; 26+ messages in thread From: Marcelo Tosatti @ 2005-02-28 9:29 UTC (permalink / raw) To: jamal Cc: Andrew Morton, Guillaume Thouvenin, kaigai, David S. Miller, jlan, lse-tech, linux-kernel, netdev, elsa-devel I'm net ignorant, so just hit me with a cluebat if thats appropriate. On Mon, Feb 28, 2005 at 07:10:58AM -0500, jamal wrote: > > Havent seen the beginnings of this thread. But whatever you are trying > to do seems to suggest some complexity that you are trying to > workaround. What was wrong with just going ahead and just always > invoking your netlink_send()? What overhead does the netlink_send() impose if there are no listeners? Sure, it wont go anywhere, but the message will have to be assembled and sent anyway. Correct? The way things are now, its necessary to make the decision to invoke or not netlink_send() due to the supposed overhead. Thats what Guillaume is doing, and thats what will have to be done whenever one wants to send information through netlink from performance critical paths. Can't the assembly/etc overhead associated with netlink_send() be avoided earlier, approaching zero-cost ? Being able to get rid of the decision to invoke or not the sendmsg would be nice. TIA > If there are nobody in user space (or > kernel) listening, it wont go anywhere. > > cheers, > jamal > > On Mon, 2005-02-28 at 02:39, Andrew Morton wrote: > > Guillaume Thouvenin <guillaume.thouvenin@bull.net> wrote: > > > > > > Ok the protocol is maybe too "basic" but with this mechanism the user > > > space application that uses the fork connector can start and stop the > > > send of messages. 
This implementation needs somme improvements because > > > currently, if two application are using the fork connector one can > > > enable it and the other don't know if it is enable or not, but the idea > > > is here I think. > > > > Yes. But this problem can be solved in userspace, with a little library > > function and a bit of locking. > > > > IOW: use the library to enable/disable the fork connector rather than > > directly doing syscalls. > > > > It has the problem that if a client of that library crashes, the counter > > gets out of whack, but really, it's not all _that_ important, and to handle > > this properly in-kernel each client would need an open fd against some > > object so we can do the close-on-exit thing properly. You'd need to create > > a separate netlink socket for the purpose. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages 2005-02-28 12:10 ` jamal 2005-02-28 9:29 ` Marcelo Tosatti @ 2005-02-28 13:20 ` Thomas Graf 2005-02-28 13:40 ` jamal 2005-03-01 20:40 ` Paul Jackson 2 siblings, 1 reply; 26+ messages in thread From: Thomas Graf @ 2005-02-28 13:20 UTC (permalink / raw) To: jamal Cc: Andrew Morton, Guillaume Thouvenin, kaigai, marcelo.tosatti, David S. Miller, jlan, lse-tech, linux-kernel, netdev, elsa-devel

> Havent seen the beginnings of this thread. But whatever you are trying
> to do seems to suggest some complexity that you are trying to
> workaround. What was wrong with just going ahead and just always
> invoking your netlink_send()?

I guess parts of the wheel are broken and need to be reinvented ;->

> If there are nobody in user space (or kernel) listening, it wont go anywhere.

Additionally, you may want to extend netlink a bit to check whether
there is a listener before creating the messages. The method to do so
depends on whether you use netlink_send() or netlink_broadcast(). The
latter is more flexible because you can add more groups later on
and the userspace applications can decide which ones they want to
listen to. Both methods handle dying clients perfectly fine: the
association to the netlink socket gets destroyed as soon as the socket
is closed. Therefore you can simply check mc_list of the netlink
protocol you use to see if there are any listeners registered:

static inline int netlink_has_listeners(struct sock *sk)
{
	int ret;

	read_lock(&nl_table_lock);
	ret = list_empty(&nl_table[sk->sk_protocol].mc_list);
	read_unlock(&nl_table_lock);

	return !ret;
}

This is simplified and ignores the actual group assignments, i.e. you
might want to extend it to check whether there are listeners for
a certain group.

^ permalink raw reply [flat|nested] 26+ messages in thread
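The per-group extension mentioned at the end of the message above can be illustrated with a small userspace model. It is not netlink code: struct demo_group_table and the demo_* functions are hypothetical, the table stands in for whatever per-protocol bookkeeping the kernel keeps, and the read lock from the snippet is omitted.

```c
#include <assert.h>

#define DEMO_MAX_GROUPS 32

/* Hypothetical bookkeeping: number of subscribed sockets per group (1..32). */
struct demo_group_table {
    unsigned int listeners[DEMO_MAX_GROUPS];
};

/* Called when a socket joins a group, e.g. at bind time. */
void demo_subscribe(struct demo_group_table *t, unsigned int group)
{
    assert(group >= 1 && group <= DEMO_MAX_GROUPS);
    t->listeners[group - 1]++;
}

/* Called when a socket leaves the group or is closed (a dying client). */
void demo_unsubscribe(struct demo_group_table *t, unsigned int group)
{
    assert(group >= 1 && group <= DEMO_MAX_GROUPS);
    assert(t->listeners[group - 1] > 0);
    t->listeners[group - 1]--;
}

/* Per-group variant of the check: non-zero only if this group has listeners. */
int demo_has_listeners(const struct demo_group_table *t, unsigned int group)
{
    assert(group >= 1 && group <= DEMO_MAX_GROUPS);
    return t->listeners[group - 1] != 0;
}
```

A sender would call demo_has_listeners() for its own group before building a message, so listeners on unrelated groups of the same protocol do not force the work.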
* Re: [Lse-tech] Re: A common layer for Accounting packages 2005-02-28 13:20 ` [Lse-tech] " Thomas Graf @ 2005-02-28 13:40 ` jamal 2005-02-28 13:53 ` Thomas Graf 0 siblings, 1 reply; 26+ messages in thread From: jamal @ 2005-02-28 13:40 UTC (permalink / raw) To: Thomas Graf Cc: Andrew Morton, Guillaume Thouvenin, kaigai, marcelo.tosatti, David S. Miller, jlan, lse-tech, linux-kernel, netdev, elsa-devel netlink broadcast or a wrapper around it. Why even bother doing the check with netlink_has_listeners()? cheers, jamal On Mon, 2005-02-28 at 08:20, Thomas Graf wrote: > > Havent seen the beginnings of this thread. But whatever you are trying > > to do seems to suggest some complexity that you are trying to > > workaround. What was wrong with just going ahead and just always > > invoking your netlink_send()? > > I guess parts of the wheel are broken and need to be reinvented ;-> > > > If there are nobody in user space (or kernel) listening, it wont go anywhere. > > Additional you may want to extend netlink a bit to check whether > there is a listener before creating the messages. The method to do so > depends on whether you use netlink_send() or netlink_brodacast(). The > latter is more flexiable because you can add more groups later on > and the userspace applications can decicde which ones they want to > listen to. Both methods handle dying clients perfectly fine, the > association to the netlink socket gets destroyed as soon as the socket > is closed. Therefore you can simply check mc_list of the netlink > protocol you use to see if there are any listeners registered: > > static inline int netlink_has_listeners(struct sock *sk) > { > int ret; > > read_lock(&nl_table_lock); > ret = list_empty(&nl_table[sk->sk_protocol].mc_list) > read_unlock(&nl_table_lock); > > return !ret; > } > > This is simplified and ignores the actual group assignments, i.e. you > might want to extend it to have it check if there are listeners for > a certain group. 
> ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages 2005-02-28 13:40 ` jamal @ 2005-02-28 13:53 ` Thomas Graf 2005-02-28 9:52 ` Marcelo Tosatti 2005-02-28 14:10 ` jamal 0 siblings, 2 replies; 26+ messages in thread From: Thomas Graf @ 2005-02-28 13:53 UTC (permalink / raw) To: jamal Cc: Andrew Morton, Guillaume Thouvenin, kaigai, marcelo.tosatti, David S. Miller, jlan, lse-tech, linux-kernel, netdev, elsa-devel * jamal <1109598010.2188.994.camel@jzny.localdomain> 2005-02-28 08:40 > > netlink broadcast or a wrapper around it. > Why even bother doing the check with netlink_has_listeners()? To implement the master enable/disable switch they want. The messages don't get send out anyway but why bother doing all the work if nothing will get send out in the end? It implements a well defined flag controlled by open/close on fds (thus handles dying applications) stating whether the whole code should be enabled or disabled. It is of course not needed to avoid sending unnecessary messages. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages 2005-02-28 13:53 ` Thomas Graf @ 2005-02-28 9:52 ` Marcelo Tosatti 2005-02-28 14:10 ` jamal 1 sibling, 0 replies; 26+ messages in thread From: Marcelo Tosatti @ 2005-02-28 9:52 UTC (permalink / raw) To: Thomas Graf Cc: jamal, Andrew Morton, Guillaume Thouvenin, kaigai, David S. Miller, jlan, lse-tech, linux-kernel, netdev, elsa-devel On Mon, Feb 28, 2005 at 02:53:07PM +0100, Thomas Graf wrote: > * jamal <1109598010.2188.994.camel@jzny.localdomain> 2005-02-28 08:40 > > > > netlink broadcast or a wrapper around it. > > Why even bother doing the check with netlink_has_listeners()? > > To implement the master enable/disable switch they want. The messages > don't get send out anyway but why bother doing all the work if nothing > will get send out in the end? It implements a well defined flag > controlled by open/close on fds (thus handles dying applications) > stating whether the whole code should be enabled or disabled. Yep - this far from "reinventing the wheel". ;) > It is of course not needed to avoid sending unnecessary messages. Thats the goal, thanks. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages 2005-02-28 13:53 ` Thomas Graf 2005-02-28 9:52 ` Marcelo Tosatti @ 2005-02-28 14:10 ` jamal 2005-02-28 14:25 ` Thomas Graf 1 sibling, 1 reply; 26+ messages in thread From: jamal @ 2005-02-28 14:10 UTC (permalink / raw) To: Thomas Graf Cc: Andrew Morton, Guillaume Thouvenin, kaigai, marcelo.tosatti, David S. Miller, jlan, lse-tech, linux-kernel, netdev, elsa-devel On Mon, 2005-02-28 at 08:53, Thomas Graf wrote: > * jamal <1109598010.2188.994.camel@jzny.localdomain> 2005-02-28 08:40 > > > > netlink broadcast or a wrapper around it. > > Why even bother doing the check with netlink_has_listeners()? > > To implement the master enable/disable switch they want. The messages > don't get send out anyway but why bother doing all the work if nothing > will get send out in the end? It implements a well defined flag > controlled by open/close on fds (thus handles dying applications) > stating whether the whole code should be enabled or disabled. It is of > course not needed to avoid sending unnecessary messages. To justify writting the new code, I am assuming someone has actually sat down and in the minimal stuck their finger in the air and said "yes, there is definetely wind there". Which leadsto Marcello's question in other email: Theres some overhead. - Message needs to be built with skbs allocated (not the cn_xxx thing that seems to be invoked - I suspect that thing will build the skbs); - the netlink table needs to be locked -and searched and only then do you find theres nothing to send to. The point i was making is if you actually had to post this question, then you must be running into some problems of complexity ;-> which implies to me that the delta overhead maybe worth it compared to introducing the complexity or any new code. I wasnt involved in the discussion - I just woke up and saw the posting and was bored. 
So the justification for the optimization has probably been explained and it may be worth doing the check (but probably such check should go into whatever that cn_xxx is). cheers, jamal ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages 2005-02-28 14:10 ` jamal @ 2005-02-28 14:25 ` Thomas Graf 2005-02-28 15:31 ` jamal 0 siblings, 1 reply; 26+ messages in thread From: Thomas Graf @ 2005-02-28 14:25 UTC (permalink / raw) To: jamal Cc: Andrew Morton, Guillaume Thouvenin, kaigai, marcelo.tosatti, David S. Miller, jlan, lse-tech, linux-kernel, netdev, elsa-devel * jamal <1109599803.2188.1014.camel@jzny.localdomain> 2005-02-28 09:10 > On Mon, 2005-02-28 at 08:53, Thomas Graf wrote: > > * jamal <1109598010.2188.994.camel@jzny.localdomain> 2005-02-28 08:40 > > > > > > netlink broadcast or a wrapper around it. > > > Why even bother doing the check with netlink_has_listeners()? > > > > To implement the master enable/disable switch they want. The messages > > don't get send out anyway but why bother doing all the work if nothing > > will get send out in the end? It implements a well defined flag > > controlled by open/close on fds (thus handles dying applications) > > stating whether the whole code should be enabled or disabled. It is of > > course not needed to avoid sending unnecessary messages. > > To justify writting the new code, I am assuming someone has actually sat > down and in the minimal stuck their finger in the air > and said "yes, there is definetely wind there". I did, not for this problem though. The code this idea comes from sends batched events of skb passing points to userspace. Not every call invokes has_listeneres() but rather the kernel thread processing the ring buffer sending the events to userspaces does. The result is globally cached in a atomic_t making it possible to check for it at zero-cost and really saving time and effort. I have no clue wether it does make sense in this case I just pointed out how to do it properly at my point of view. > Which leadsto Marcello's question in other email: > Theres some overhead. 
> - Message needs to be built with skbs allocated (not the cn_xxx thing > that seems to be invoked - I suspect that thing will build the skbs); > - the netlink table needs to be locked > -and searched and only then do you find theres nothing to send to. > > The point i was making is if you actually had to post this question, > then you must be running into some problems of complexity ;-> > which implies to me that the delta overhead maybe worth it compared to > introducing the complexity or any new code. > I wasnt involved in the discussion - I just woke up and saw the posting > and was bored. So the justification for the optimization has probably > been explained and it may be worth doing the check (but probably such > check should go into whatever that cn_xxx is). I wasn't involved in the discussion either. Using rtmsg_ifinfo as example, the check should probably go in straight at the beginning _IFF_ rtmsg_ifinfo was subject to performance overhead which obviously isn't the case but just served as an example. ^ permalink raw reply [flat|nested] 26+ messages in thread
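The "cache the result globally and check it cheaply" approach described in the message above might look like this in portable C11. The kernel would use atomic_t; this sketch substitutes <stdatomic.h>, and listeners_cached, refresh_listener_cache() and emit_event() are illustrative names, not code from the thread.

```c
#include <stdatomic.h>

/* Cached answer to "is anybody listening?"; 0 until the first refresh. */
static atomic_int listeners_cached;

/* Counts events that actually went through the expensive build path. */
static unsigned long events_built;

/* Slow path: run occasionally (in the message above, by the thread that
 * drains the ring buffer) with the result of the real listener check. */
void refresh_listener_cache(int has_listeners)
{
    atomic_store(&listeners_cached, has_listeners ? 1 : 0);
}

/* Hot path: one relaxed atomic load decides whether to do any work.
 * Returns 1 if an event was built, 0 if it was skipped. */
int emit_event(void)
{
    if (!atomic_load_explicit(&listeners_cached, memory_order_relaxed))
        return 0;
    events_built++;   /* stands in for allocating and filling the message */
    return 1;
}

unsigned long events_built_count(void)
{
    return events_built;
}
```

The cache can be briefly stale after a listener appears or disappears, so a few events may be dropped or built needlessly around the transition. That is acceptable for a debugging aid; whether it suits accounting is a separate question.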
* Re: [Lse-tech] Re: A common layer for Accounting packages 2005-02-28 14:25 ` Thomas Graf @ 2005-02-28 15:31 ` jamal 2005-02-28 16:17 ` Evgeniy Polyakov 0 siblings, 1 reply; 26+ messages in thread From: jamal @ 2005-02-28 15:31 UTC (permalink / raw) To: Thomas Graf Cc: Andrew Morton, Guillaume Thouvenin, kaigai, marcelo.tosatti, David S. Miller, jlan, lse-tech, linux-kernel, netdev, elsa-devel On Mon, 2005-02-28 at 09:25, Thomas Graf wrote: > * jamal <1109599803.2188.1014.camel@jzny.localdomain> 2005-02-28 09:10 [..] > > To justify writting the new code, I am assuming someone has actually sat > > down and in the minimal stuck their finger in the air > > and said "yes, there is definetely wind there". > > I did, not for this problem though. The code this idea comes from sends > batched events I would bet the benefit you are seeing has to do with batching rather than such an optimization flag. Different ballgame. I relooked at their code snippet, they dont even have skbs built nor even figured out what sock or PID. That work still needs to be done it seems in cn_netlink_send(). So probably all they need to do is move the check in cn_netlink_send() instead. This is assuming they are not scratching their heads with some realted complexities. I am gonna disapear for a while; hopefully the original posters have gathered some ideas from what we discussed. cheers, jamal ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Re: A common layer for Accounting packages
  2005-02-28 15:31 ` jamal
@ 2005-02-28 16:17 ` Evgeniy Polyakov
  2005-03-01  8:21   ` [Lse-tech] " Guillaume Thouvenin
  0 siblings, 1 reply; 26+ messages in thread
From: Evgeniy Polyakov @ 2005-02-28 16:17 UTC (permalink / raw)
To: hadi
Cc: Thomas Graf, Andrew Morton, Guillaume Thouvenin, kaigai, marcelo.tosatti, David S. Miller, jlan, lse-tech, linux-kernel, netdev, elsa-devel

On 28 Feb 2005 10:31:33 -0500
jamal <hadi@cyberus.ca> wrote:

> On Mon, 2005-02-28 at 09:25, Thomas Graf wrote:
> > * jamal <1109599803.2188.1014.camel@jzny.localdomain> 2005-02-28 09:10
> [..]
> > > To justify writing the new code, I am assuming someone has actually sat
> > > down and at the minimum stuck their finger in the air
> > > and said "yes, there is definitely wind there".
> >
> > I did, not for this problem though. The code this idea comes from sends
> > batched events
>
> I would bet the benefit you are seeing has to do with batching rather
> than such an optimization flag. Different ballgame.
> I looked at their code snippet again; they don't even have skbs built, nor
> have they figured out which sock or PID. That work still needs to be done,
> it seems, in cn_netlink_send(). So probably all they need to do is move the
> check into cn_netlink_send() instead. This is assuming they are not
> scratching their heads over some related complexities.
>
> I am going to disappear for a while; hopefully the original posters have
> gathered some ideas from what we discussed.

As connector author, I still doubt it is worth copying several lines
from netlink_broadcast() before the skb allocation in cn_netlink_send().
Of course it is easy and can be done, but I do not see any profit here.
Atomic allocation is fast: if it succeeds but there are no groups/sockets
to send to, the skb will be freed; if the allocation fails, then the group
check is useless.

I would prefer Guillaume Thouvenin, as fork connector author, to test
his current implementation and show that the connector's cost is negligible
both with and without userspace listeners.
As far as I remember, it is the first entry in the fork connector's TODO list.

> cheers,
> jamal
>

	Evgeniy Polyakov

Only failure makes us experts. -- Theo de Raadt

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages
  2005-02-28 16:17 ` Evgeniy Polyakov
@ 2005-03-01  8:21 ` Guillaume Thouvenin
  2005-03-01 13:38   ` Kaigai Kohei
  0 siblings, 1 reply; 26+ messages in thread
From: Guillaume Thouvenin @ 2005-03-01 8:21 UTC (permalink / raw)
To: Evgeniy Polyakov
Cc: hadi, Thomas Graf, Andrew Morton, Kaigai Kohei, marcelo.tosatti, David S. Miller, jlan, LSE-Tech, lkml, netdev, elsa-devel

On Mon, 2005-02-28 at 19:17 +0300, Evgeniy Polyakov wrote:
> On 28 Feb 2005 10:31:33 -0500
> jamal <hadi@cyberus.ca> wrote:
> > I would bet the benefit you are seeing has to do with batching rather
> > than such an optimization flag. Different ballgame.
> > I looked at their code snippet again; they don't even have skbs built, nor
> > have they figured out which sock or PID. That work still needs to be done,
> > it seems, in cn_netlink_send(). So probably all they need to do is move the
> > check into cn_netlink_send() instead. This is assuming they are not
> > scratching their heads over some related complexities.
> [...]
> As connector author, I still doubt it is worth copying several lines
> from netlink_broadcast() before the skb allocation in cn_netlink_send().
> Of course it is easy and can be done, but I do not see any profit here.
> Atomic allocation is fast: if it succeeds but there are no groups/sockets
> to send to, the skb will be freed; if the allocation fails, then the group
> check is useless.
>
> I would prefer Guillaume Thouvenin, as fork connector author, to test
> his current implementation and show that the connector's cost is negligible
> both with and without userspace listeners.
> As far as I remember, it is the first entry in the fork connector's TODO list.

I tested without user space listeners and the cost is negligible. I will
test with a user space listener and see the results. I'm going to run
the test this week after improving the mechanism that switches the
sending of the message on and off.

Best regards,
Guillaume

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages
  2005-03-01  8:21 ` [Lse-tech] " Guillaume Thouvenin
@ 2005-03-01 13:38 ` Kaigai Kohei
  2005-03-01 13:53   ` Guillaume Thouvenin
  ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Kaigai Kohei @ 2005-03-01 13:38 UTC (permalink / raw)
To: Guillaume Thouvenin
Cc: Evgeniy Polyakov, hadi, Thomas Graf, Andrew Morton, marcelo.tosatti, David S. Miller, jlan, LSE-Tech, lkml, netdev, elsa-devel

Hello,

> I tested without user space listeners and the cost is negligible. I will
> test with a user space listener and see the results. I'm going to run
> the test this week after improving the mechanism that switches the
> sending of the message on and off.

I'm also trying to measure the process-creation/destruction performance
in the following three environments.

Architecture: i686 / Distribution: Fedora Core 3
* Kernel preemption is DISABLED
* SMP kernel but UP machine / no Hyper-Threading
[1] 2.6.11-rc4-mm1 normal
[2] 2.6.11-rc4-mm1 with PAGG based Process Accounting Module
[3] 2.6.11-rc4-mm1 with fork-connector notification (it's enabled)

When the 367th fork() was called after fork-connector notification, the kernel locked up.
(The user-space listener also kept running until the 366th fork() notification was received.)
Does this number have any significance?

In my second trial, the kernel also locked up after the 366th fork() notification.
Currently, I don't know the cause. Has anyone else encountered this?

# I wanted to say "[2] is faster than [3]" when process-grouping is enabled,
# but the plan fell through. :(

Thanks.
-- 
Linux Promotion Center, NEC
KaiGai Kohei <kaigai@ak.jp.nec.com>

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages
  2005-03-01 13:38 ` Kaigai Kohei
@ 2005-03-01 13:53 ` Guillaume Thouvenin
  2005-03-01 14:17   ` Evgeniy Polyakov
  2005-03-02  4:50 ` Paul Jackson
  2005-03-02  8:58 ` Guillaume Thouvenin
  2 siblings, 1 reply; 26+ messages in thread
From: Guillaume Thouvenin @ 2005-03-01 13:53 UTC (permalink / raw)
To: Kaigai Kohei
Cc: Evgeniy Polyakov, hadi, Thomas Graf, Andrew Morton, marcelo.tosatti, David S. Miller, jlan, LSE-Tech, lkml, netdev, elsa-devel

On Tue, 2005-03-01 at 22:38 +0900, Kaigai Kohei wrote:
> > I tested without user space listeners and the cost is negligible. I will
> > test with a user space listener and see the results. I'm going to run
> > the test this week after improving the mechanism that switches the
> > sending of the message on and off.
>
> I'm also trying to measure the process-creation/destruction performance
> in the following three environments.
> Architecture: i686 / Distribution: Fedora Core 3
> * Kernel preemption is DISABLED
> * SMP kernel but UP machine / no Hyper-Threading
> [1] 2.6.11-rc4-mm1 normal
> [2] 2.6.11-rc4-mm1 with PAGG based Process Accounting Module
> [3] 2.6.11-rc4-mm1 with fork-connector notification (it's enabled)
>
> When the 367th fork() was called after fork-connector notification, the kernel locked up.
> (The user-space listener also kept running until the 366th fork() notification was received.)

I don't see this limit on my computer. I'm currently running lmbench
with a new fork connector patch (one that enables/disables the fork
connector) on an SMP computer. I will send the results and the new patch
tomorrow because the test takes a while...

I'm using a small patch provided by Evgeniy that is not included in the
2.6.11-rc4-mm1 tree.

Best regards,
Guillaume

--- orig/connector.c
+++ mod/connector.c
@@ -168,12 +168,11 @@
 	group = NETLINK_CB((skb)).groups;
 	msg = (struct cn_msg *)NLMSG_DATA(nlh);
 
-	if (msg->len != nlh->nlmsg_len - sizeof(*msg) - sizeof(*nlh)) {
+	if (NLMSG_SPACE(msg->len + sizeof(*msg)) != nlh->nlmsg_len) {
 		printk(KERN_ERR "skb does not have enough length: "
-				"requested msg->len=%u[%u], nlh->nlmsg_len=%u[%u], skb->len=%u[must be %u].\n",
-				msg->len, NLMSG_SPACE(msg->len),
-				nlh->nlmsg_len, nlh->nlmsg_len - sizeof(*nlh),
-				skb->len, msg->len + sizeof(*msg));
+				"requested msg->len=%u[%u], nlh->nlmsg_len=%u, skb->len=%u.\n",
+				msg->len, NLMSG_SPACE(msg->len + sizeof(*msg)),
+				nlh->nlmsg_len, skb->len);
 		kfree_skb(skb);
 		return -EINVAL;
 	}

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages
  2005-03-01 13:53 ` Guillaume Thouvenin
@ 2005-03-01 14:17 ` Evgeniy Polyakov
  0 siblings, 0 replies; 26+ messages in thread
From: Evgeniy Polyakov @ 2005-03-01 14:17 UTC (permalink / raw)
To: Guillaume Thouvenin
Cc: Kaigai Kohei, hadi, Thomas Graf, Andrew Morton, marcelo.tosatti, David S. Miller, jlan, LSE-Tech, lkml, netdev, elsa-devel

On Tue, 2005-03-01 at 14:53 +0100, Guillaume Thouvenin wrote:
> On Tue, 2005-03-01 at 22:38 +0900, Kaigai Kohei wrote:
> > > I tested without user space listeners and the cost is negligible. I will
> > > test with a user space listener and see the results. I'm going to run
> > > the test this week after improving the mechanism that switches the
> > > sending of the message on and off.
> >
> > I'm also trying to measure the process-creation/destruction performance
> > in the following three environments.
> > Architecture: i686 / Distribution: Fedora Core 3
> > * Kernel preemption is DISABLED
> > * SMP kernel but UP machine / no Hyper-Threading
> > [1] 2.6.11-rc4-mm1 normal
> > [2] 2.6.11-rc4-mm1 with PAGG based Process Accounting Module
> > [3] 2.6.11-rc4-mm1 with fork-connector notification (it's enabled)
> >
> > When the 367th fork() was called after fork-connector notification, the kernel locked up.
> > (The user-space listener also kept running until the 366th fork() notification was received.)
>
> I don't see this limit on my computer. I'm currently running lmbench
> with a new fork connector patch (one that enables/disables the fork
> connector) on an SMP computer. I will send the results and the new patch
> tomorrow because the test takes a while...
>
> I'm using a small patch provided by Evgeniy that is not included in the
> 2.6.11-rc4-mm1 tree.

The 2.6.11-rc4-mm1 tree does not have the latest connector.
Various fixes were added, not only that one.

I ran the latest patch Guillaume sent me (with small updates);
a fork bomb with more than 100k forks has already passed without any freeze.
I do not have numbers, though.

> Best regards,
> Guillaume
>
> --- orig/connector.c
> +++ mod/connector.c
> @@ -168,12 +168,11 @@
>  	group = NETLINK_CB((skb)).groups;
>  	msg = (struct cn_msg *)NLMSG_DATA(nlh);
>  
> -	if (msg->len != nlh->nlmsg_len - sizeof(*msg) - sizeof(*nlh)) {
> +	if (NLMSG_SPACE(msg->len + sizeof(*msg)) != nlh->nlmsg_len) {
>  		printk(KERN_ERR "skb does not have enough length: "
> -				"requested msg->len=%u[%u], nlh->nlmsg_len=%u[%u], skb->len=%u[must be %u].\n",
> -				msg->len, NLMSG_SPACE(msg->len),
> -				nlh->nlmsg_len, nlh->nlmsg_len - sizeof(*nlh),
> -				skb->len, msg->len + sizeof(*msg));
> +				"requested msg->len=%u[%u], nlh->nlmsg_len=%u, skb->len=%u.\n",
> +				msg->len, NLMSG_SPACE(msg->len + sizeof(*msg)),
> +				nlh->nlmsg_len, skb->len);
>  		kfree_skb(skb);
>  		return -EINVAL;
>  	}
>

-- 
	Evgeniy Polyakov

Crash is better than data corruption -- Arthur Grabowski

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages
  2005-03-01 13:38 ` Kaigai Kohei
  2005-03-01 13:53   ` Guillaume Thouvenin
@ 2005-03-02  4:50 ` Paul Jackson
  2005-03-02  8:58   ` Guillaume Thouvenin
  2 siblings, 0 replies; 26+ messages in thread
From: Paul Jackson @ 2005-03-02 4:50 UTC (permalink / raw)
To: Kaigai Kohei
Cc: guillaume.thouvenin, johnpol, hadi, tgraf, akpm, marcelo.tosatti, davem, jlan, lse-tech, linux-kernel, netdev, elsa-devel

Just a thought - perhaps you could see if Jay can test the performance
scaling of these changes on larger systems (8 to 64 CPUs, give or take,
small for SGI, but big for some vendors.)

Things like a global lock, for example, might be harmless on smaller
systems, but hurt big time on bigger systems. I don't know if you have
any such constructs ... perhaps this doesn't matter.

At the very least, we need to know that performance and scaling are not
significantly impacted, on systems not using accounting, either because
it is obvious from the code, or because someone has tested it.

And if performance or scaling was impacted when accounting was enabled,
then at least we would want to know how much performance was impacted,
so that users would know what to expect when they use accounting.

> the process-creation/destruction performance in the following three environments.

I think this is a good choice of what to measure, and where. Thank-you.

> the kernel also locked up after the 366th fork()

I have no idea what this is -- good luck finding it.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.650.933.1373, 1.925.600.0401

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages
  2005-03-01 13:38 ` Kaigai Kohei
  2005-03-01 13:53   ` Guillaume Thouvenin
  2005-03-02  4:50   ` Paul Jackson
@ 2005-03-02  8:58 ` Guillaume Thouvenin
  2005-03-02  9:06   ` Andrew Morton
  2 siblings, 1 reply; 26+ messages in thread
From: Guillaume Thouvenin @ 2005-03-02 8:58 UTC (permalink / raw)
To: Kaigai Kohei
Cc: Evgeniy Polyakov, hadi, Thomas Graf, Andrew Morton, Marcelo Tosatti, David S. Miller, jlan, LSE-Tech, lkml, Netlink List, elsa-devel

On Tue, 2005-03-01 at 22:38 +0900, Kaigai Kohei wrote:
> > I tested without user space listeners and the cost is negligible. I will
> > test with a user space listener and see the results. I'm going to run
> > the test this week after improving the mechanism that switches the
> > sending of the message on and off.
>
> I'm also trying to measure the process-creation/destruction performance
> in the following three environments.
> Architecture: i686 / Distribution: Fedora Core 3
> * Kernel preemption is DISABLED
> * SMP kernel but UP machine / no Hyper-Threading
> [1] 2.6.11-rc4-mm1 normal
> [2] 2.6.11-rc4-mm1 with PAGG based Process Accounting Module
> [3] 2.6.11-rc4-mm1 with fork-connector notification (it's enabled)
>
> When the 367th fork() was called after fork-connector notification, the kernel locked up.
> (The user-space listener also kept running until the 366th fork() notification was received.)

So I ran lmbench with three different kernels with the fork
connector patch I just sent. Results are attached at the end of the mail,
and each table has three lines, which are:

  o First line is a linux-2.6.11-rc4-mm1-cnfork
  o Second line is a linux-2.6.11-rc4-mm1
  o Third line is a linux-2.6.11-rc4-mm1-cnfork with a user space
    application. The user space application listened during 15h
    and received 6496 messages.

Each test has been run only once.

Best regards,
Guillaume

---

cd results && make summary percent 2>/dev/null | more
make[1]: Entering directory `/home/guill/benchmark/lmbench/lmbench-3.0-a4/results'

                 L M B E N C H  3 . 0   S U M M A R Y
                 ------------------------------------
                 (Alpha software, do not distribute)

Basic system parameters
------------------------------------------------------------------------------
Host                 OS Description              Mhz  tlb  cache  mem   scal
                                                     pages line   par   load
                                                           bytes
--------- ------------- ----------------------- ---- ----- ----- ------ ----
account   Linux 2.6.11-       i686-pc-linux-gnu 2765    63   128 2.4900    1
account   Linux 2.6.11-       i686-pc-linux-gnu 2765    67   128 2.4200    1
account   Linux 2.6.11-       i686-pc-linux-gnu 2765    69   128 2.4400    1

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
account   Linux 2.6.11- 2765 0.17 0.26 3.57 4.19 16.9 0.51 2.31 162. 629. 2415
account   Linux 2.6.11- 2765 0.16 0.26 3.56 4.17 17.6 0.50 2.30 163. 628. 2417
account   Linux 2.6.11- 2765 0.16 0.27 3.67 4.25 17.6 0.51 2.28 176. 664. 2456

Basic integer operations - times in nanoseconds - smaller is better
-------------------------------------------------------------------
Host                 OS  intgr  intgr  intgr  intgr  intgr
                          bit    add    mul    div    mod
--------- ------------- ------ ------ ------ ------ ------
account   Linux 2.6.11- 0.1800 0.1700 4.9900   20.8   23.1
account   Linux 2.6.11- 0.1800 0.1700 4.9900   20.8   23.1
account   Linux 2.6.11- 0.1800 0.1700 4.9900   20.8   23.1

Basic float operations - times in nanoseconds - smaller is better
-----------------------------------------------------------------
Host                 OS  float  float  float  float
                          add    mul    div    bogo
--------- ------------- ------ ------ ------ ------
account   Linux 2.6.11- 1.7300 2.4800   15.5   15.4
account   Linux 2.6.11- 1.7300 2.4800   15.5   15.6
account   Linux 2.6.11- 1.7400 2.5000   15.7   15.6

Basic double operations - times in nanoseconds - smaller is better
------------------------------------------------------------------
Host                 OS  double double double double
                          add    mul    div    bogo
--------- ------------- ------ ------ ------ ------
account   Linux 2.6.11- 1.7300 2.4800   15.5   15.4
account   Linux 2.6.11- 1.7300 2.4800   15.5   15.6
account   Linux 2.6.11- 1.7400 2.5000   15.7   15.6

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host                 OS  2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                         ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
account   Linux 2.6.11- 5.1300 5.2900 4.9700 3.1700   10.9 6.30000    32.6
account   Linux 2.6.11- 4.9000 5.2100 5.1600 4.4700   20.3 6.48000    27.7
account   Linux 2.6.11- 4.8600 5.3000 4.9200 3.5600   20.5 6.87000    31.5

*Local* Communication latencies in microseconds - smaller is better
---------------------------------------------------------------------
Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
account   Linux 2.6.11- 5.130  14.3 11.9  17.7  23.2  20.3  28.3  40.
account   Linux 2.6.11- 4.900  14.6 12.0  18.5  23.9  20.8  28.6  40.
account   Linux 2.6.11- 4.860  14.8 12.6  18.1  23.9  20.8  27.8  40.

File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host                 OS   0K File      10K File     Mmap    Prot   Page   100fd
                        Create Delete Create Delete Latency Fault  Fault  selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
account   Linux 2.6.11-   18.9   16.1   65.6   33.5   15.4K 0.771 2.22520  16.4
account   Linux 2.6.11-   18.8   16.3   64.2   33.2   15.7K 0.841 2.20690  16.5
account   Linux 2.6.11-   19.2   16.4   65.4   33.5   15.7K 0.782 2.19950  16.4

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------------------------
Host                OS  Pipe AF    TCP  File   Mmap  Bcopy  Bcopy  Mem   Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
account   Linux 2.6.11- 664. 497. 369. 1468.8 1836.1  596.6  568.4 1819 779.7
account   Linux 2.6.11- 671. 521. 338. 1481.6 1817.2  593.8  568.8 1838 783.0
account   Linux 2.6.11- 667. 543. 372. 1469.4 1816.8  594.2  568.3 1818 783.0

Memory latencies in nanoseconds - smaller is better
    (WARNING - may not be correct, check graphs)
------------------------------------------------------------------------------
Host                 OS   Mhz   L1 $   L2 $    Main mem    Rand mem    Guesses
--------- -------------   ---   ----   ----    --------    --------    -------
account   Linux 2.6.11-  2765 0.7030 6.5710       140.6       246.7
account   Linux 2.6.11-  2765 0.7090 6.6350       142.4       249.5
account   Linux 2.6.11-  2765 0.7110 6.6340       142.5       249.5

make[1]: Leaving directory `/home/guill/benchmark/lmbench/lmbench-3.0-a4/results'

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages
  2005-03-02  8:58 ` Guillaume Thouvenin
@ 2005-03-02  9:06 ` Andrew Morton
  2005-03-02  9:25   ` Guillaume Thouvenin
  2005-03-02 15:30   ` Paul Jackson
  0 siblings, 2 replies; 26+ messages in thread
From: Andrew Morton @ 2005-03-02 9:06 UTC (permalink / raw)
To: Guillaume Thouvenin
Cc: kaigai, johnpol, hadi, tgraf, marcelo.tosatti, davem, jlan, lse-tech, linux-kernel, netdev, elsa-devel

Guillaume Thouvenin <guillaume.thouvenin@bull.net> wrote:
>
> So I ran lmbench with three different kernels with the fork
> connector patch I just sent. Results are attached at the end of the mail,
> and each table has three lines, which are:
>
>   o First line is a linux-2.6.11-rc4-mm1-cnfork
>   o Second line is a linux-2.6.11-rc4-mm1
>   o Third line is a linux-2.6.11-rc4-mm1-cnfork with a user space
>     application. The user space application listened during 15h
>     and received 6496 messages.
>
> Each test has been run only once.
>
> ...
> ------------------------------------------------------------------------------
> Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
>                              call  I/O stat clos TCP  inst hndl proc proc proc
> --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
> account   Linux 2.6.11- 2765 0.17 0.26 3.57 4.19 16.9 0.51 2.31 162. 629. 2415
> account   Linux 2.6.11- 2765 0.16 0.26 3.56 4.17 17.6 0.50 2.30 163. 628. 2417
> account   Linux 2.6.11- 2765 0.16 0.27 3.67 4.25 17.6 0.51 2.28 176. 664. 2456

This is the interesting bit, yes? 5-10% slowdown on fork is expected, but
why was exec slower?

What does "The user space application listened during 15h" mean?

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages
  2005-03-02  9:06 ` Andrew Morton
@ 2005-03-02  9:25 ` Guillaume Thouvenin
  2005-03-02 15:30 ` Paul Jackson
  1 sibling, 0 replies; 26+ messages in thread
From: Guillaume Thouvenin @ 2005-03-02 9:25 UTC (permalink / raw)
To: Andrew Morton
Cc: Kaigai Kohei, Evgeniy Polyakov, hadi, tgraf, Marcelo Tosatti, David S. Miller, jlan, LSE-Tech, lkml, Netlink List, elsa-devel

On Wed, 2005-03-02 at 01:06 -0800, Andrew Morton wrote:
> Guillaume Thouvenin <guillaume.thouvenin@bull.net> wrote:
> >
> > So I ran lmbench with three different kernels with the fork
> > connector patch I just sent. Results are attached at the end of the mail,
> > and each table has three lines, which are:
> >
> >   o First line is a linux-2.6.11-rc4-mm1-cnfork
> >   o Second line is a linux-2.6.11-rc4-mm1
> >   o Third line is a linux-2.6.11-rc4-mm1-cnfork with a user space
> >     application. The user space application listened during 15h
> >     and received 6496 messages.
> >
> > Each test has been run only once.
> >
> > ...
> > ------------------------------------------------------------------------------
> > Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
> >                              call  I/O stat clos TCP  inst hndl proc proc proc
> > --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
> > account   Linux 2.6.11- 2765 0.17 0.26 3.57 4.19 16.9 0.51 2.31 162. 629. 2415
> > account   Linux 2.6.11- 2765 0.16 0.26 3.56 4.17 17.6 0.50 2.30 163. 628. 2417
> > account   Linux 2.6.11- 2765 0.16 0.27 3.67 4.25 17.6 0.51 2.28 176. 664. 2456
>
> This is the interesting bit, yes? 5-10% slowdown on fork is expected, but
> why was exec slower?

I can't explain it for the moment. I will run the test more than once to
see if this difference is still there.

> What does "The user space application listened during 15h" mean?

It means that I started the user space application before the test and
stopped it 15 hours later (this morning for me). The test itself ran for
5h30min.
The user space application increments a counter to show how many
processes have been created during a period of time.

I have not used the user space daemon that manages groups of processes
because it still uses the old mechanism (a signal sent from
do_fork()), and as I wanted to provide quick results, I used another
user space application. I attach the test program (get_fork_info.c) that
I'm using at the end of the mail to clearly show what it does.

I will run new tests with the real user space daemon, but it will only be
ready next week; sorry for the delay.

Best regards,
Guillaume

---

/*
 * get_fork_info.c
 *
 * This program listens on the netlink interface to retrieve the
 * information sent by the kernel on each fork. It increments a counter
 * for each fork and, when the user hits CTRL-C, it displays how many
 * forks occurred during the period.
 */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <signal.h>

#include <asm/types.h>

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>

#include <linux/netlink.h>
#include <linux/connector.h>

#define CN_FORK_OFF	0
#define CN_FORK_ON	1

#define MESSAGE_SIZE	(sizeof(struct nlmsghdr) + \
			 sizeof(struct cn_msg) + \
			 sizeof(int))

int sock;
unsigned long total_p;
struct timeval test_time;

/* Ask the kernel to switch the fork connector on or off. */
static inline void switch_cn_fork(int sock, int action)
{
	char buff[128];		/* must be > MESSAGE_SIZE */
	struct nlmsghdr *hdr;
	struct cn_msg *msg;

	/* Clear the buffer */
	memset(buff, '\0', sizeof(buff));

	/* fill the message header */
	hdr = (struct nlmsghdr *) buff;

	hdr->nlmsg_len = MESSAGE_SIZE;
	hdr->nlmsg_type = NLMSG_DONE;
	hdr->nlmsg_flags = 0;
	hdr->nlmsg_seq = 0;
	hdr->nlmsg_pid = getpid();

	/* the message */
	msg = (struct cn_msg *) NLMSG_DATA(hdr);
	msg->id.idx = CN_IDX_FORK;
	msg->id.val = CN_VAL_FORK;
	msg->seq = 0;
	msg->ack = 0;
	msg->len = sizeof(int);
	/* copy the whole int: data[] is a byte array and len says sizeof(int) */
	memcpy(msg->data, &action, sizeof(action));

	send(sock, hdr, hdr->nlmsg_len, 0);
}

/* Signal handler: switch the connector off, print the total and exit. */
static void cleanup(int signo)
{
	struct timeval tmp_time;

	(void) signo;

	switch_cn_fork(sock, CN_FORK_OFF);

	tmp_time = test_time;
	gettimeofday(&test_time, NULL);

	printf("%lu processes were created in %li seconds.\n",
	       total_p, test_time.tv_sec - tmp_time.tv_sec);

	close(sock);

	exit(EXIT_SUCCESS);
}

int main()
{
	int err;
	struct sockaddr_nl sa;	/* information for NETLINK interface */

	/*
	 * To be able to quit the application properly we install a
	 * signal handler that catches CTRL-C
	 */
	signal(SIGTERM, cleanup);
	signal(SIGINT, cleanup);

	/*
	 * Create an endpoint for communication. Use the kernel user
	 * interface device (PF_NETLINK) which is a datagram oriented
	 * service (SOCK_DGRAM). The protocol used is the netfilter/iptables
	 * ULOG protocol (NETLINK_NFLOG)
	 */
	sock = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_NFLOG);
	if (sock == -1) {
		perror("socket");
		return -1;
	}

	/* zero the address first so that the padding field is defined */
	memset(&sa, 0, sizeof(sa));
	sa.nl_family = AF_NETLINK;
	sa.nl_groups = CN_IDX_FORK;
	sa.nl_pid = getpid();

	err = bind(sock, (struct sockaddr *) &sa, sizeof(struct sockaddr_nl));
	if (err == -1) {
		perror("bind");
		close(sock);
		return -1;
	}

	switch_cn_fork(sock, CN_FORK_ON);
	total_p = 0;
	gettimeofday(&test_time, NULL);

	for (;;) {
		char buff[1024];	/* it's large enough */
		struct nlmsghdr *hdr;
		struct cn_msg *msg;
		int len;

		/* Clear the buffer */
		memset(buff, '\0', sizeof(buff));

		/* Listen */
		len = recv(sock, buff, sizeof(buff), 0);
		if (len == -1) {
			perror("recv");
			close(sock);
			return -1;
		}

		/* point to the message header */
		hdr = (struct nlmsghdr *) buff;

		switch (hdr->nlmsg_type) {
		case NLMSG_DONE:
			msg = (struct cn_msg *) NLMSG_DATA(hdr);
			total_p++;
#if 0
			printf("[idx=0x%x seq=%u] %s\n", msg->id.idx, msg->seq, msg->data);
#endif
			break;
		case NLMSG_ERROR:
			printf("NLMSG_ERROR\n");
			/* Fall through */
		default:
			break;
		}
	}

	/*
	 * In fact we never reach this part of the code because there is an
	 * infinite loop above.
	 */
	cleanup(0);
	return 0;
}

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages
  2005-03-02  9:06 ` Andrew Morton
  2005-03-02  9:25   ` Guillaume Thouvenin
@ 2005-03-02 15:30 ` Paul Jackson
  1 sibling, 0 replies; 26+ messages in thread
From: Paul Jackson @ 2005-03-02 15:30 UTC (permalink / raw)
To: Andrew Morton
Cc: guillaume.thouvenin, kaigai, johnpol, hadi, tgraf, marcelo.tosatti, davem, jlan, lse-tech, linux-kernel, netdev, elsa-devel

Andrew wrote:
> 5-10% slowdown on fork is expected, but
> why was exec slower?

Thanks for the summary, Andrew.

Guillaume (or anyone else tempted to do this) - it's a good idea, when
posting 100 lines of data, to summarize with a line or two of words, as
Andrew did here. It is far more efficient for one writer to do this,
than each of a thousand readers.

Hmmm ... so why was exec slower?

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.650.933.1373, 1.925.600.0401

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: A common layer for Accounting packages
  2005-02-28 12:10 ` jamal
  2005-02-28  9:29   ` Marcelo Tosatti
  2005-02-28 13:20   ` [Lse-tech] " Thomas Graf
@ 2005-03-01 20:40 ` Paul Jackson
  2 siblings, 0 replies; 26+ messages in thread
From: Paul Jackson @ 2005-03-01 20:40 UTC (permalink / raw)
To: hadi
Cc: akpm, guillaume.thouvenin, kaigai, marcelo.tosatti, davem, jlan, lse-tech, linux-kernel, netdev, elsa-devel

Jamal wrote:
> What was wrong with just going ahead and just always
> invoking your netlink_send()?

I think the hope was to reduce the cost of the accounting hook in fork
to "next-to-zero" if accounting is not being used on that system.

See Andrew's query earlier:
> b) they are next-to-zero cost if something is listening on the netlink
>    socket but no accounting daemon is running.

Presumably sending an ignored packet costs something, quite possibly
more than "next-to-zero".

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.650.933.1373, 1.925.600.0401

^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2005-03-02 15:30 UTC | newest]
Thread overview: 26+ messages
[not found] <42168D9E.1010900@sgi.com>
[not found] ` <20050218171610.757ba9c9.akpm@osdl.org>
[not found] ` <421993A2.4020308@ak.jp.nec.com>
[not found] ` <421B955A.9060000@sgi.com>
[not found] ` <421C2B99.2040600@ak.jp.nec.com>
[not found] ` <421CEC38.7010008@sgi.com>
[not found] ` <421EB299.4010906@ak.jp.nec.com>
[not found] ` <20050224212839.7953167c.akpm@osdl.org>
[not found] ` <20050227094949.GA22439@logos.cnet>
[not found] ` <4221E548.4000008@ak.jp.nec.com>
[not found] ` <20050227140355.GA23055@logos.cnet>
2005-02-28 1:59 ` Re: A common layer for Accounting packages Kaigai Kohei
2005-02-28 2:32 ` [Lse-tech] " Thomas Graf
2005-02-28 5:17 ` Evgeniy Polyakov
2005-02-28 7:20 ` [Lse-tech] " Guillaume Thouvenin
2005-02-28 7:39 ` Andrew Morton
2005-02-28 8:04 ` Evgeniy Polyakov
2005-02-28 12:10 ` jamal
2005-02-28 9:29 ` Marcelo Tosatti
2005-02-28 13:20 ` [Lse-tech] " Thomas Graf
2005-02-28 13:40 ` jamal
2005-02-28 13:53 ` Thomas Graf
2005-02-28 9:52 ` Marcelo Tosatti
2005-02-28 14:10 ` jamal
2005-02-28 14:25 ` Thomas Graf
2005-02-28 15:31 ` jamal
2005-02-28 16:17 ` Evgeniy Polyakov
2005-03-01 8:21 ` [Lse-tech] " Guillaume Thouvenin
2005-03-01 13:38 ` Kaigai Kohei
2005-03-01 13:53 ` Guillaume Thouvenin
2005-03-01 14:17 ` Evgeniy Polyakov
2005-03-02 4:50 ` Paul Jackson
2005-03-02 8:58 ` Guillaume Thouvenin
2005-03-02 9:06 ` Andrew Morton
2005-03-02 9:25 ` Guillaume Thouvenin
2005-03-02 15:30 ` Paul Jackson
2005-03-01 20:40 ` Paul Jackson