* Re: Netlink connector
From: Evgeniy Polyakov @ 2005-07-25 7:06 UTC
To: James Morris
Cc: Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel, netdev

On Mon, Jul 25, 2005 at 02:02:10AM -0400, James Morris (jmorris@redhat.com) wrote:
> On Sun, 24 Jul 2005, David S. Miller wrote:
> >From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
> >Date: Sat, 23 Jul 2005 13:14:55 +0400
> >
> >>Andrew has no objection against connector and it lives in -mm
> >
> >A patch sitting in -mm has zero significance.

That is why I'm asking netdev@ people again...

> The significance I think is that Andrew is trying to gently encourage some
> further progress in the area.
>
> I recall some netconf discussion about TIPC over Netlink, or more likely a
> variation thereof, which may be a better way forward.
>
> It's cool stuff http://tipc.sourceforge.net/

I read it quite a long time ago - I'm sure you do not want to use that
monster for an event bus.
It was designed and implemented for heavy intermachine communications,
and it is quite hard to set up as a userspace <-> kernelspace message bus.

> - James
> --
> James Morris
> <jmorris@redhat.com>

--
	Evgeniy Polyakov
* Re: Netlink connector
From: Patrick McHardy @ 2005-07-25 14:32 UTC
To: Evgeniy Polyakov
Cc: Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel, netdev

Evgeniy Polyakov wrote:
> On Mon, Jul 25, 2005 at 02:02:10AM -0400, James Morris (jmorris@redhat.com) wrote:
>
>>On Sun, 24 Jul 2005, David S. Miller wrote:
>>
>>>From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
>>>Date: Sat, 23 Jul 2005 13:14:55 +0400
>>>
>>>>Andrew has no objection against connector and it lives in -mm
>>>
>>>A patch sitting in -mm has zero significance.
>
> That is why I'm asking netdev@ people again...

If I understand correctly it tries to work around some netlink
limitations (limited number of netlink families and multicast groups)
by sending everything to userspace and demultiplexing it there.
Same in the other direction, an additional layer on top of netlink
does basically the same thing netlink already does. This looks like
a step in the wrong direction to me, netlink should instead be fixed
to support what is needed.
* Re: Netlink connector
From: Eric Leblond @ 2005-07-25 14:43 UTC
To: Patrick McHardy
Cc: Evgeniy Polyakov, Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel, netdev

On Monday 25 July 2005 at 16:32 +0200, Patrick McHardy wrote:
> Evgeniy Polyakov wrote:
> > On Mon, Jul 25, 2005 at 02:02:10AM -0400, James Morris (jmorris@redhat.com) wrote:
> If I understand correctly it tries to work around some netlink
> limitations (limited number of netlink families and multicast groups)
> by sending everything to userspace and demultiplexing it there.
> Same in the other direction, an additional layer on top of netlink
> does basically the same thing netlink already does. This looks like
> a step in the wrong direction to me, netlink should instead be fixed
> to support what is needed.

I totally agree with you, it would be great to fix netlink to support
multiple queues.
I would like to be able to use projects like snort-inline or nufw together.
This will make Netfilter really stronger.
Furthermore, there's a repetition of filtering capabilities with such a
solution. Netfilter has to filter to send to netlink and this is the
same with the queue dispatcher. I think this introduces too much
complexity.

my 0.02$

BR,
--
Éric Leblond, eleblond@inl.fr
Telephone: 01 44 89 46 40, Fax: 01 44 89 45 01
INL, http://www.inl.fr
* Re: Netlink connector
From: Evgeniy Polyakov @ 2005-07-25 19:33 UTC
To: Eric Leblond
Cc: Patrick McHardy, Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel

On Mon, Jul 25, 2005 at 04:43:43PM +0200, Eric Leblond (eleblond@inl.fr) wrote:
> On Monday 25 July 2005 at 16:32 +0200, Patrick McHardy wrote:
> > Evgeniy Polyakov wrote:
> > > On Mon, Jul 25, 2005 at 02:02:10AM -0400, James Morris (jmorris@redhat.com) wrote:
> > If I understand correctly it tries to work around some netlink
> > limitations (limited number of netlink families and multicast groups)
> > by sending everything to userspace and demultiplexing it there.
> > Same in the other direction, an additional layer on top of netlink
> > does basically the same thing netlink already does. This looks like
> > a step in the wrong direction to me, netlink should instead be fixed
> > to support what is needed.
>
> I totally agree with you, it would be great to fix netlink to support
> multiple queues.
> I would like to be able to use projects like snort-inline or nufw together.
> This will make Netfilter really stronger.
> Furthermore, there's a repetition of filtering capabilities with such a
> solution. Netfilter has to filter to send to netlink and this is the
> same with the queue dispatcher. I think this introduces too much
> complexity.

Netlink is a transport protocol - no need to add complexity into it,
it must be as simple as possible and thus extensible.
Multiple queues and filtering should be created on a different layer,
as is done for TCP/IP and other protocols.
I'm not advertising, but connector is exactly the place where it can be
implemented.

> my 0.02$
>
> BR,
> --
> Éric Leblond, eleblond@inl.fr
> Telephone: 01 44 89 46 40, Fax: 01 44 89 45 01
> INL, http://www.inl.fr
>

--
	Evgeniy Polyakov
* Re: Netlink connector
From: Harald Welte @ 2005-07-26 8:45 UTC
To: Evgeniy Polyakov
Cc: Andrew Morton, Eric Leblond, netdev, netfilter-devel, linux-kernel, Patrick McHardy

On Mon, Jul 25, 2005 at 11:33:51PM +0400, Evgeniy Polyakov wrote:
> Netlink is a transport protocol - no need to add complexity into it,
> it must be as simple as possible and thus extensible.

Yes, but when you run into a serious addressing shortage (like the
internet does with ipv4), you develop something that provides more
addresses (such as ipv6). That's why support for more than 32 groups
(per family) is something that should be put in the netlink protocol.

I totally agree that we need a higher-level api on top of that, in order
to hide the details of the networking stack for those not interested in
it.

--
- Harald Welte <laforge@netfilter.org>                 http://netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie
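To make the 32-group ceiling concrete: it follows from the groups being carried as a single 32-bit bitmask (the "fixed groups bitmask" in struct sockaddr_nl that Patrick mentions elsewhere in this thread). The following is only a minimal userspace sketch of that idea; the helper names group_to_mask() and mask_has_group() are hypothetical and are not kernel functions.

```c
#include <stdint.h>

/* Hypothetical helper: map a 1-based group number to its bit in a
 * 32-bit groups mask. Returns 0 when the group does not fit. */
static uint32_t group_to_mask(unsigned int group)
{
    if (group == 0 || group > 32)
        return 0;                       /* no bit left for this group */
    return (uint32_t)1 << (group - 1);
}

/* Hypothetical helper: is the given group set in a subscriber's mask? */
static int mask_has_group(uint32_t mask, unsigned int group)
{
    return (mask & group_to_mask(group)) != 0;
}
```

With this representation group 33 simply cannot be expressed, which is the addressing shortage being discussed.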
* Re: Netlink connector
From: Evgeniy Polyakov @ 2005-07-25 19:28 UTC
To: Patrick McHardy
Cc: Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel

On Mon, Jul 25, 2005 at 04:32:32PM +0200, Patrick McHardy (kaber@trash.net) wrote:
> Evgeniy Polyakov wrote:
> >On Mon, Jul 25, 2005 at 02:02:10AM -0400, James Morris
> >(jmorris@redhat.com) wrote:
> >
> >>On Sun, 24 Jul 2005, David S. Miller wrote:
> >>
> >>>From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
> >>>Date: Sat, 23 Jul 2005 13:14:55 +0400
> >>>
> >>>>Andrew has no objection against connector and it lives in -mm
> >>>
> >>>A patch sitting in -mm has zero significance.
> >
> >That is why I'm asking netdev@ people again...
>
> If I understand correctly it tries to work around some netlink
> limitations (limited number of netlink families and multicast groups)
> by sending everything to userspace and demultiplexing it there.
> Same in the other direction, an additional layer on top of netlink
> does basically the same thing netlink already does. This looks like
> a step in the wrong direction to me, netlink should instead be fixed
> to support what is needed.

Not only that.
The main _first_ idea was to simplify userspace message handling as much
as possible.
In first releases I called it ioctl-ng - any module that wants to
communicate with userspace in the way ioctl does,
requires skb allocation/freeing/handling.
Does RTC driver writer need to know what is the difference between
shared and cloned skb? Should kernel user of such message bus
have to know about skb at all?

With char device I only need to register my callback - with kernel
connector it is the same, but allows to use the whole power of netlink,
especially without nice ioctl features like different pointer size
in userspace and kernelspace.

And number of free netlink sockets is _very_ small, especially
if allocate new one for simple notifications, which can be easily done
using connector.

No need to allocate skb, no need to know what those monsters in the
header are, and so on.

And netlink can be extended to support it - netlink is a transport
protocol, it should not care about higher layer message handling,
connector instead will deliver message to the end user in a very
convenient form.

P.S. I've removed netdev@redhat.com - please do not add subscribers-only
private mail lists.

--
	Evgeniy Polyakov
* Re: Netlink connector
From: Patrick McHardy @ 2005-07-25 23:46 UTC
To: Evgeniy Polyakov
Cc: Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel

Evgeniy Polyakov wrote:
> On Mon, Jul 25, 2005 at 04:32:32PM +0200, Patrick McHardy (kaber@trash.net) wrote:
>
>>If I understand correctly it tries to work around some netlink
>>limitations (limited number of netlink families and multicast groups)
>>by sending everything to userspace and demultiplexing it there.
>>Same in the other direction, an additional layer on top of netlink
>>does basically the same thing netlink already does. This looks like
>>a step in the wrong direction to me, netlink should instead be fixed
>>to support what is needed.
>
> Not only that.
> The main _first_ idea was to simplify userspace message handling as much
> as possible.
> In first releases I called it ioctl-ng - any module that wants to
> communicate with userspace in the way ioctl does,

Usually netlink is easily extendable by using nested TLVs. By hiding
this you basically remove this extensibility.

> requires skb allocation/freeing/handling.
> Does RTC driver writer need to know what is the difference between
> shared and cloned skb? Should kernel user of such message bus
> have to know about skb at all?

Netlink users don't have to care about shared or cloned skbs. I don't
think it's a big issue to use alloc_skb and then the usual netlink
macros. Thomas added a number of macros that simplify use a lot.

But my main objection is that it sends everything to userspace even
if no one is listening. This can't be used for things that generate
lots of events, and also will get problematic if the number of users
increases.

> With char device I only need to register my callback - with kernel
> connector it is the same, but allows to use the whole power of netlink,
> especially without nice ioctl features like different pointer size
> in userspace and kernelspace.

You still have to take care of mixed 64/32 bit environments, u64 fields
for example are differently aligned.

> And number of free netlink sockets is _very_ small, especially
> if allocate new one for simple notifications, which can be easily done
> using connector.

Then fix it so we can use more families and groups. I started some work
on this, but I'm not sure if I have time to complete it.

> And netlink can be extended to support it - netlink is a transport
> protocol, it should not care about higher layer message handling,
> connector instead will deliver message to the end user in a very
> convenient form.

You can still build this stuff on top, but the workarounds for netlink
limitations need to be fixed in netlink.

> P.S. I've removed netdev@redhat.com - please do not add subscribers-only
> private mail lists.

Wasn't me :)
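The 64/32-bit point above can be seen with a small layout experiment. This is a sketch only, assuming a GCC- or Clang-compatible compiler; the struct names are made up for illustration. sample_align4 emulates an ABI such as i386, where a u64 inside a struct is aligned to 4 bytes, while sample_native shows whatever the build host does (on common 64-bit ABIs the u64 is 8-byte aligned, so padding is inserted after the first field).

```c
#include <stddef.h>
#include <stdint.h>

/* Natural layout on the build host. On common 64-bit ABIs the u64 is
 * 8-byte aligned, so 4 bytes of padding follow 'flags'. */
struct sample_native {
    uint32_t flags;
    uint64_t bytes;
};

/* Emulates an ABI that aligns u64 to only 4 bytes (as i386 does),
 * using compiler attributes: no internal padding, alignment 4. */
struct sample_align4 {
    uint32_t flags;
    uint64_t bytes;
} __attribute__((packed, aligned(4)));

/* Explicit padding makes both kinds of ABI agree on the layout. */
struct sample_portable {
    uint32_t flags;
    uint32_t pad;       /* keeps 'bytes' at offset 8 everywhere */
    uint64_t bytes;
};
```

Comparing sizeof() and offsetof(..., bytes) for the first two structs shows why the same message structure can differ between a 32-bit userspace and a 64-bit kernel; the third struct is one common way to avoid the mismatch.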
* Re: Netlink connector
From: Thomas Graf @ 2005-07-25 23:56 UTC
To: Patrick McHardy
Cc: Evgeniy Polyakov, Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel

* Patrick McHardy <42E579BC.8000701@trash.net> 2005-07-26 01:46
> Netlink users don't have to care about shared or cloned skbs. I don't
> think it's a big issue to use alloc_skb and then the usual netlink
> macros. Thomas added a number of macros that simplify use a lot.

Once I've finished the generic netlink attribute macros the usage
will be even simpler. I wrote down all the things I want to do
today in a park and I intend to write the code once I'm back
from my vacation.

> But my main objection is that it sends everything to userspace even
> if no one is listening. This can't be used for things that generate
> lots of events, and also will get problematic if the number of users
> increases.

My patches will include a new function netlink_nr_subscribers()
taking the socket and a mask of groups. I posted something similar
during an earlier connector discussion already.

> You still have to take care of mixed 64/32 bit environments, u64 fields
> for example are differently aligned.

My solution to this (in the same patchset) is that we never
dereference u64s but instead copy them.

> Then fix it so we can use more families and groups. I started some work
> on this, but I'm not sure if I have time to complete it.

Great, this is one of the remaining issues I haven't solved yet.
If you want me to take over just hand over your unfinished work
and I'll integrate it into my patchset. I'm sorry not to be
able to provide any code yet, it's one of the first things I'll
do once I'm back.
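The thread only describes netlink_nr_subscribers() as taking the socket and a mask of groups; its code is not shown. Purely as an illustration of the intent (skip building an event when nobody listens), here is a userspace model under that assumption. struct listener, nr_subscribers() and notify() are hypothetical names, not the proposed kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bound socket: only its groups mask
 * matters for this sketch. */
struct listener {
    uint32_t groups;
};

/* Count listeners subscribed to at least one group in 'mask'. */
static unsigned int nr_subscribers(const struct listener *ls, size_t n,
                                   uint32_t mask)
{
    unsigned int count = 0;

    for (size_t i = 0; i < n; i++)
        if (ls[i].groups & mask)
            count++;
    return count;
}

/* Build and "send" an event only when someone listens. Returns the
 * number of deliveries; *built records whether the (potentially
 * expensive) message construction would have run. */
static unsigned int notify(const struct listener *ls, size_t n,
                           uint32_t mask, int *built)
{
    unsigned int subs = nr_subscribers(ls, n, mask);

    *built = 0;
    if (subs == 0)
        return 0;       /* nobody listening: skip allocation and fill */
    *built = 1;         /* stand-in for allocating and filling a message */
    return subs;
}
```

The point of checking first is that the allocation and message fill are avoided entirely for event sources that currently have no subscribers.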
* Re: Netlink connector
From: Patrick McHardy @ 2005-07-26 0:16 UTC
To: Thomas Graf
Cc: Evgeniy Polyakov, Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel

Thomas Graf wrote:
> * Patrick McHardy <42E579BC.8000701@trash.net> 2005-07-26 01:46
>
>>You still have to take care of mixed 64/32 bit environments, u64 fields
>>for example are differently aligned.
>
> My solution to this (in the same patchset) is that we never
> dereference u64s but instead copy them.

I don't understand. The problem is mainly u64 embedded in structures,
the structs have different sizes if the u64 is not 8 byte aligned
and the structure size padded to a multiple of 8.

>>Then fix it so we can use more families and groups. I started some work
>>on this, but I'm not sure if I have time to complete it.
>
> Great, this is one of the remaining issues I haven't solved yet.
> If you want me to take over just hand over your unfinished work
> and I'll integrate it into my patchset.

I started working on it after the OLS party, so no postable code yet :)
The idea for more groups is basically to remove the fixed groups
bitmask from struct sockaddr_nl and use setsockopt to add/remove
multicast subscriptions. If we add the limitation that a packet
can only be multicasted to a single group we can support an arbitrary
number of groups, otherwise we would still be limited by size of
skb->cb. This limitation shouldn't be a problem, AFAIK nothing is
multicasting to multiple groups at once right now and the increased
number of groups will allow a better granularity anyway. The main
problem is keeping it backwards-compatible for current netlink users.
If this isn't possible we may need to call it netlink2.

Regards
Patrick
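The subscription idea in the last paragraph can be modelled in a few lines. This is a hedged userspace sketch, not Patrick's unposted code: each socket keeps a growable membership bitmap that is changed one group at a time (the role setsockopt would play), and each packet names exactly one destination group, so no per-packet mask is needed. All names here (struct membership, membership_add() and so on) are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-socket subscription set: a growable bitmap, so the
 * number of groups is not tied to the width of a single word. */
struct membership {
    uint8_t *bits;
    size_t nbytes;
};

/* Subscribe to one group; returns 0 on success, -1 if out of memory. */
static int membership_add(struct membership *m, unsigned int group)
{
    size_t byte = group / 8;

    if (byte >= m->nbytes) {
        uint8_t *p = realloc(m->bits, byte + 1);

        if (!p)
            return -1;
        memset(p + m->nbytes, 0, byte + 1 - m->nbytes);
        m->bits = p;
        m->nbytes = byte + 1;
    }
    m->bits[byte] |= (uint8_t)(1u << (group % 8));
    return 0;
}

/* Unsubscribe from one group (no-op if it was never set). */
static void membership_drop(struct membership *m, unsigned int group)
{
    size_t byte = group / 8;

    if (byte < m->nbytes)
        m->bits[byte] &= (uint8_t)~(1u << (group % 8));
}

/* A packet carries a single destination group; deliver if subscribed. */
static int membership_has(const struct membership *m, unsigned int group)
{
    size_t byte = group / 8;

    return byte < m->nbytes && ((m->bits[byte] >> (group % 8)) & 1);
}
```

Because a packet carries a group number rather than a mask, the cost per packet stays constant however many groups exist, which matches the single-group restriction described above.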
* Re: Netlink connector
From: Thomas Graf @ 2005-07-26 0:30 UTC
To: Patrick McHardy
Cc: Evgeniy Polyakov, Andrew Morton, Harald Welte, netdev, netfilter-devel, Jamal Hadi Salim, linux-kernel

* Patrick McHardy <42E580CF.4010800@trash.net> 2005-07-26 02:16
> Thomas Graf wrote:
> >* Patrick McHardy <42E579BC.8000701@trash.net> 2005-07-26 01:46
> >
> >>You still have to take care of mixed 64/32 bit environments, u64 fields
> >>for example are differently aligned.
> >
> >My solution to this (in the same patchset) is that we never
> >dereference u64s but instead copy them.
>
> I don't understand. The problem is mainly u64 embedded in structures,
> the structs have different sizes if the u64 is not 8 byte aligned
> and the structure size padded to a multiple of 8.

Like in gnet_stats, yes. I thought you meant usages like
*(u64 *) which we shouldn't do either.

> I started working on it after the OLS party, so no postable code yet :)
> The idea for more groups is basically to remove the fixed groups
> bitmask from struct sockaddr_nl and use setsockopt to add/remove
> multicast subscriptions. If we add the limitation that a packet
> can only be multicasted to a single group we can support an arbitrary
> number of groups, otherwise we would still be limited by size of
> skb->cb.

I was thinking of subscription messages over netlink itself for
the advantage that we could use it within the distributed netlink
protocol that has to come up sometime soon. Well, both ways
are ok I guess, the ease of distributive usage is my only argument.

> This limitation shouldn't be a problem, AFAIK nothing is
> multicasting to multiple groups at once right now and the increased
> number of groups will allow a better granularity anyway.

I'm not aware of any and I agree. We don't need n<->n subscriptions,
1<->n is perfectly fine as I see it.

> The main
> problem is keeping it backwards-compatible for current netlink users.
> If this isn't possible we may need to call it netlink2.

I think Jamal has a moral patent on the name netlink2 so be
careful ;-> It should be possible to remain compatible, I don't
see any unresolvable issues right now.
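For the *(u64 *) remark, the usual alternative is to copy the bytes instead of dereferencing a cast pointer, which is safe even when the value sits at an unaligned offset inside a message. The helpers below are a minimal sketch of that technique; get_u64() and put_u64() are illustrative names and are not taken from Thomas's patchset, which is not shown in this thread.

```c
#include <stdint.h>
#include <string.h>

/* Read a u64 stored at an arbitrary (possibly unaligned) address by
 * copying its bytes, rather than casting to (uint64_t *) and
 * dereferencing. */
static uint64_t get_u64(const void *src)
{
    uint64_t v;

    memcpy(&v, src, sizeof(v));
    return v;
}

/* Store a u64 at an arbitrary (possibly unaligned) address. */
static void put_u64(void *dst, uint64_t v)
{
    memcpy(dst, &v, sizeof(v));
}
```

Note that this addresses unaligned access only; it does not change the struct size and padding differences Patrick describes.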
* Re: Netlink connector
From: Evgeniy Polyakov @ 2005-07-26 4:45 UTC
To: Patrick McHardy
Cc: Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel

On Tue, Jul 26, 2005 at 01:46:04AM +0200, Patrick McHardy (kaber@trash.net) wrote:
> Evgeniy Polyakov wrote:
> >On Mon, Jul 25, 2005 at 04:32:32PM +0200, Patrick McHardy
> >(kaber@trash.net) wrote:
> >
> >>If I understand correctly it tries to work around some netlink
> >>limitations (limited number of netlink families and multicast groups)
> >>by sending everything to userspace and demultiplexing it there.
> >>Same in the other direction, an additional layer on top of netlink
> >>does basically the same thing netlink already does. This looks like
> >>a step in the wrong direction to me, netlink should instead be fixed
> >>to support what is needed.
> >
> >Not only that.
> >The main _first_ idea was to simplify userspace message handling as much
> >as possible.
> >In first releases I called it ioctl-ng - any module that wants to
> >communicate with userspace in the way ioctl does,
>
> Usually netlink is easily extendable by using nested TLVs. By hiding
> this you basically remove this extensibility.

Current netlink is not extensible for _many_ different users.
It has only 32 sockets.

> >requires skb allocation/freeing/handling.
> >Does RTC driver writer need to know what is the difference between
> >shared and cloned skb? Should kernel user of such message bus
> >have to know about skb at all?
>
> Netlink users don't have to care about shared or cloned skbs. I don't
> think it's a big issue to use alloc_skb and then the usual netlink
> macros. Thomas added a number of macros that simplify use a lot.

Kernel user also must know about difference between unicast/broadcast,
how to dequeue the skb, how to free it and in what context.
ioctl users do not need to know how file_operations is bound to file.

> But my main objection is that it sends everything to userspace even
> if no one is listening. This can't be used for things that generate
> lots of events, and also will get problematic if the number of users
> increases.

It is a problem for existing netlink - either check at bind time,
which could be done for connector, or at socket creation time.

Actually it is not even a problem, since checking is being done,
but after allocation and message filling, such check can be moved into
cn_netlink_send() in connector, but different netlink users,
who prefer to use different sockets, must perform it themselves in each
place, where skb is allocated...

Connector is a solution for current situation,
it can be deployed with few casualties.
Creating a new netlink2 socket for device, which wants to replace ioctl
controlling or broadcast its state is a wrong way.
Different sockets/flows do not allow easy flow control.

We have one pipe - ethernet, and many protocols inside this pipe
with different headers - it is the same here - netlink is such a pipe,
and with connector it allows to have different protocols in it.

> >With char device I only need to register my callback - with kernel
> >connector it is the same, but allows to use the whole power of netlink,
> >especially without nice ioctl features like different pointer size
> >in userspace and kernelspace.
>
> You still have to take care of mixed 64/32 bit environments, u64 fields
> for example are differently aligned.

Connector has a size in its header - ioctl does not.

> >And number of free netlink sockets is _very_ small, especially
> >if allocate new one for simple notifications, which can be easily done
> >using connector.
>
> Then fix it so we can use more families and groups. I started some work
> on this, but I'm not sure if I have time to complete it.

It does not "fix" the "problem" of skb management knowledge, which I
described.
Netlink is a transport protocol, some general logic must be created on
top of it, like it is done in TCP/IP.

> >And netlink can be extended to support it - netlink is a transport
> >protocol, it should not care about higher layer message handling,
> >connector instead will deliver message to the end user in a very
> >convenient form.
>
> You can still build this stuff on top, but the workarounds for netlink
> limitations need to be fixed in netlink.

I would not call it a workaround, I think it is a management layer,
which allows:
1. easy usage. Just register a callback and that is all. Callback will
be invoked each time new message arrives. No need to
dequeue/free/anything.
2. easy usage. Call one function for message delivering, which can
take care of nonexistent users, perform flow control, congestion control,
guarantee delivery and any other.
3. Easily deployable - current implementation is so simple, and it does
work with existing netlink.
4. It is logical level on top of transport protocol, it is UDP/IP over
ethernet :)

> >P.S. I've removed netdev@redhat.com - please do not add subscribers-only
> >private mail lists.
>
> Wasn't me :)

Yep :)

--
	Evgeniy Polyakov
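Point 1 of the list above (register a callback, get invoked per message) can be pictured with a small dispatcher. This is a plain userspace sketch and deliberately not the connector API: struct bus_msg, bus_add_callback() and bus_dispatch() are hypothetical names, and the header carries an id and a length only because the message mentions that the connector header includes a size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message: an id selecting the handler plus a length. */
struct bus_msg {
    uint32_t id;
    uint32_t len;
    const void *data;
};

typedef void (*bus_cb)(const struct bus_msg *msg, void *priv);

#define BUS_MAX_CB 16

/* Hypothetical registry of callbacks keyed by message id. */
struct bus {
    struct {
        uint32_t id;
        bus_cb cb;
        void *priv;
    } slot[BUS_MAX_CB];
    size_t used;
};

/* Register a handler for one id; -1 if the table is full or id taken. */
static int bus_add_callback(struct bus *b, uint32_t id, bus_cb cb, void *priv)
{
    if (b->used == BUS_MAX_CB)
        return -1;
    for (size_t i = 0; i < b->used; i++)
        if (b->slot[i].id == id)
            return -1;
    b->slot[b->used].id = id;
    b->slot[b->used].cb = cb;
    b->slot[b->used].priv = priv;
    b->used++;
    return 0;
}

/* Hand a message to its handler; returns 1 if delivered, 0 if dropped. */
static int bus_dispatch(const struct bus *b, const struct bus_msg *msg)
{
    for (size_t i = 0; i < b->used; i++) {
        if (b->slot[i].id == msg->id) {
            b->slot[i].cb(msg, b->slot[i].priv);
            return 1;
        }
    }
    return 0;
}

/* Example handler: accumulates payload lengths into *priv. */
static void count_bytes(const struct bus_msg *msg, void *priv)
{
    *(size_t *)priv += msg->len;
}
```

The handler never sees how the message was queued or freed, which is the convenience being argued for; Patrick's objection is about where this demultiplexing should live, not whether it is convenient.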
* Re: Netlink connector
From: Stephen Hemminger @ 2005-07-26 4:56 UTC
To: Evgeniy Polyakov
Cc: Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel, Patrick McHardy

Evgeniy Polyakov wrote:

>On Tue, Jul 26, 2005 at 01:46:04AM +0200, Patrick McHardy (kaber@trash.net) wrote:
>
>>Evgeniy Polyakov wrote:
>>
>>>On Mon, Jul 25, 2005 at 04:32:32PM +0200, Patrick McHardy
>>>(kaber@trash.net) wrote:
>>>
>>>>If I understand correctly it tries to work around some netlink
>>>>limitations (limited number of netlink families and multicast groups)
>>>>by sending everything to userspace and demultiplexing it there.
>>>>Same in the other direction, an additional layer on top of netlink
>>>>does basically the same thing netlink already does. This looks like
>>>>a step in the wrong direction to me, netlink should instead be fixed
>>>>to support what is needed.
>>>>
>>>Not only that.
>>>The main _first_ idea was to simplify userspace message handling as much
>>>as possible.
>>>In first releases I called it ioctl-ng - any module that wants to
>>>communicate with userspace in the way ioctl does,
>>>
>>Usually netlink is easily extendable by using nested TLVs. By hiding
>>this you basically remove this extensibility.
>>
>
>Current netlink is not extensible for _many_ different users.
>It has only 32 sockets.
>
>>>requires skb allocation/freeing/handling.
>>>Does RTC driver writer need to know what is the difference between
>>>shared and cloned skb? Should kernel user of such message bus
>>>have to know about skb at all?
>>>
>>Netlink users don't have to care about shared or cloned skbs. I don't
>>think it's a big issue to use alloc_skb and then the usual netlink
>>macros. Thomas added a number of macros that simplify use a lot.
>>
>
>Kernel user also must know about difference between unicast/broadcast,
>how to dequeue the skb, how to free it and in what context.
>ioctl users do not need to know how file_operations is bound to file.
>
>>But my main objection is that it sends everything to userspace even
>>if no one is listening. This can't be used for things that generate
>>lots of events, and also will get problematic if the number of users
>>increases.
>>
>
>It is a problem for existing netlink - either check at bind time,
>which could be done for connector, or at socket creation time.
>
>Actually it is not even a problem, since checking is being done,
>but after allocation and message filling, such check can be moved into
>cn_netlink_send() in connector, but different netlink users,
>who prefer to use different sockets, must perform it themselves in each
>place, where skb is allocated...
>
>Connector is a solution for current situation,
>it can be deployed with few casualties.
>Creating a new netlink2 socket for device, which wants to replace ioctl
>controlling or broadcast its state is a wrong way.
>Different sockets/flows do not allow easy flow control.
>
>We have one pipe - ethernet, and many protocols inside this pipe
>with different headers - it is the same here - netlink is such a pipe,
>and with connector it allows to have different protocols in it.
>
>>>With char device I only need to register my callback - with kernel
>>>connector it is the same, but allows to use the whole power of netlink,
>>>especially without nice ioctl features like different pointer size
>>>in userspace and kernelspace.
>>>
>>You still have to take care of mixed 64/32 bit environments, u64 fields
>>for example are differently aligned.
>>
>
>Connector has a size in its header - ioctl does not.
>
>>>And number of free netlink sockets is _very_ small, especially
>>>if allocate new one for simple notifications, which can be easily done
>>>using connector.
>>>
>>Then fix it so we can use more families and groups. I started some work
>>on this, but I'm not sure if I have time to complete it.
>>
>
>It does not "fix" the "problem" of skb management knowledge, which I
>described.
>Netlink is a transport protocol, some general logic must be created on
>top of it, like it is done in TCP/IP.
>
>>>And netlink can be extended to support it - netlink is a transport
>>>protocol, it should not care about higher layer message handling,
>>>connector instead will deliver message to the end user in a very
>>>convenient form.
>>>
>>You can still build this stuff on top, but the workarounds for netlink
>>limitations need to be fixed in netlink.
>>
>
>I would not call it a workaround, I think it is a management layer,
>which allows:
>1. easy usage. Just register a callback and that is all. Callback will
>be invoked each time new message arrives. No need to
>dequeue/free/anything.
>2. easy usage. Call one function for message delivering, which can
>take care of nonexistent users, perform flow control, congestion control,
>guarantee delivery and any other.
>3. Easily deployable - current implementation is so simple, and it does
>work with existing netlink.
>4. It is logical level on top of transport protocol, it is UDP/IP over
>ethernet :)
>

If it is a transport, then it should be in the kernel. Otherwise, it
becomes painful for applications with multiple input sources. Think of
epoll/poll/select and threads, doing the demultiplexing in user space
would be a pain for applications and libraries.

The other way to go is to use something like dbus/hal and use a higher
level application oriented interface. The problem with that approach is
that it assumes every management app wants to drag in gnome..
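To illustrate the epoll/poll/select remark: when each event source has its own descriptor, the kernel tells the application which source is ready, and no userspace demultiplexing is needed. The helper below is a small sketch of that situation using the POSIX poll() call; first_readable() and MAX_SOURCES are names invented for this example, and any readable descriptors (pipes, sockets) could stand in for the sources.

```c
#include <poll.h>
#include <stddef.h>

#define MAX_SOURCES 8

/* Return the index of the first descriptor with data to read, or -1
 * if none becomes readable within timeout_ms (or on error). */
static int first_readable(const int *fds, size_t n, int timeout_ms)
{
    struct pollfd pfd[MAX_SOURCES];

    if (n > MAX_SOURCES)
        return -1;
    for (size_t i = 0; i < n; i++) {
        pfd[i].fd = fds[i];
        pfd[i].events = POLLIN;
        pfd[i].revents = 0;
    }
    if (poll(pfd, (nfds_t)n, timeout_ms) <= 0)
        return -1;              /* timeout or error */
    for (size_t i = 0; i < n; i++)
        if (pfd[i].revents & POLLIN)
            return (int)i;
    return -1;
}
```

With a single shared socket carrying every protocol, poll() could only report that something arrived; the application or a library would then have to read the message and route it by its header itself.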
* Re: Netlink connector
From: Evgeniy Polyakov @ 2005-07-26 5:01 UTC
To: Stephen Hemminger
Cc: Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel, Patrick McHardy

On Mon, Jul 25, 2005 at 09:56:56PM -0700, Stephen Hemminger (shemminger@osdl.org) wrote:
> Evgeniy Polyakov wrote:
>
> >On Tue, Jul 26, 2005 at 01:46:04AM +0200, Patrick McHardy
> >(kaber@trash.net) wrote:
> >
> >>Evgeniy Polyakov wrote:
> >>
> >>>On Mon, Jul 25, 2005 at 04:32:32PM +0200, Patrick McHardy
> >>>(kaber@trash.net) wrote:
> >>>
> >>>>If I understand correctly it tries to work around some netlink
> >>>>limitations (limited number of netlink families and multicast groups)
> >>>>by sending everything to userspace and demultiplexing it there.
> >>>>Same in the other direction, an additional layer on top of netlink
> >>>>does basically the same thing netlink already does. This looks like
> >>>>a step in the wrong direction to me, netlink should instead be fixed
> >>>>to support what is needed.
> >>>>
> >>>Not only that.
> >>>The main _first_ idea was to simplify userspace message handling as much
> >>>as possible.
> >>>In first releases I called it ioctl-ng - any module that wants to
> >>>communicate with userspace in the way ioctl does,
> >>>
> >>Usually netlink is easily extendable by using nested TLVs. By hiding
> >>this you basically remove this extensibility.
> >>
> >
> >Current netlink is not extensible for _many_ different users.
> >It has only 32 sockets.
> >
> >>>requires skb allocation/freeing/handling.
> >>>Does RTC driver writer need to know what is the difference between
> >>>shared and cloned skb? Should kernel user of such message bus
> >>>have to know about skb at all?
> >>>
> >>Netlink users don't have to care about shared or cloned skbs. I don't
> >>think it's a big issue to use alloc_skb and then the usual netlink
> >>macros. Thomas added a number of macros that simplify use a lot.
> >>
> >
> >Kernel user also must know about difference between unicast/broadcast,
> >how to dequeue the skb, how to free it and in what context.
> >ioctl users do not need to know how file_operations is bound to file.
> >
> >>But my main objection is that it sends everything to userspace even
> >>if no one is listening. This can't be used for things that generate
> >>lots of events, and also will get problematic if the number of users
> >>increases.
> >>
> >
> >It is a problem for existing netlink - either check at bind time,
> >which could be done for connector, or at socket creation time.
> >
> >Actually it is not even a problem, since checking is being done,
> >but after allocation and message filling, such check can be moved into
> >cn_netlink_send() in connector, but different netlink users,
> >who prefer to use different sockets, must perform it themselves in each
> >place, where skb is allocated...
> >
> >Connector is a solution for current situation,
> >it can be deployed with few casualties.
> >Creating a new netlink2 socket for device, which wants to replace ioctl
> >controlling or broadcast its state is a wrong way.
> >Different sockets/flows do not allow easy flow control.
> >
> >We have one pipe - ethernet, and many protocols inside this pipe
> >with different headers - it is the same here - netlink is such a pipe,
> >and with connector it allows to have different protocols in it.
> >
> >>>With char device I only need to register my callback - with kernel
> >>>connector it is the same, but allows to use the whole power of netlink,
> >>>especially without nice ioctl features like different pointer size
> >>>in userspace and kernelspace.
> >>>
> >>You still have to take care of mixed 64/32 bit environments, u64 fields
> >>for example are differently aligned.
> >>
> >
> >Connector has a size in its header - ioctl does not.
> >
> >>>And number of free netlink sockets is _very_ small, especially
> >>>if allocate new one for simple notifications, which can be easily done
> >>>using connector.
> >>>
> >>Then fix it so we can use more families and groups. I started some work
> >>on this, but I'm not sure if I have time to complete it.
> >>
> >
> >It does not "fix" the "problem" of skb management knowledge, which I
> >described.
> >Netlink is a transport protocol, some general logic must be created on
> >top of it, like it is done in TCP/IP.
> >
> >>>And netlink can be extended to support it - netlink is a transport
> >>>protocol, it should not care about higher layer message handling,
> >>>connector instead will deliver message to the end user in a very
> >>>convenient form.
> >>>
> >>You can still build this stuff on top, but the workarounds for netlink
> >>limitations need to be fixed in netlink.
> >>
> >
> >I would not call it a workaround, I think it is a management layer,
> >which allows:
> >1. easy usage. Just register a callback and that is all. Callback will
> >be invoked each time new message arrives. No need to
> >dequeue/free/anything.
> >2. easy usage. Call one function for message delivering, which can
> >take care of nonexistent users, perform flow control, congestion control,
> >guarantee delivery and any other.
> >3. Easily deployable - current implementation is so simple, and it does
> >work with existing netlink.
> >4. It is logical level on top of transport protocol, it is UDP/IP over
> >ethernet :)
> >
>
> If it is a transport, then it should be in the kernel. Otherwise, it
> becomes painful for applications with multiple input sources. Think of
> epoll/poll/select and threads, doing the demultiplexing in user space
> would be a pain for applications and libraries.
It _is_ in the kernel - multiplexing is done at send time, so userspace does not receive messages for other IDs. Currently it is done using netlink groups, and I would like to change that, but the connector layer itself will not change, so no application will need to change - they bind now and will still only need to bind afterwards. One socket, different groups. OK, right now an application bound to the -1 group will receive all traffic, but I posted a proof-of-concept patch to remove that behaviour. > The other way to go is to use something like dbus/hal and use a higher level > application oriented interface. The problem with that approach, is it > assumes > every management app wants to drag in gnome.. No need to parse headers there. When we read from a UDP socket, we do not get headers - connector users do not read the netlink header, and it is possible to remove even the connector header completely, although I would like to keep it - some kind of HDRINCL option... -- Evgeniy Polyakov ^ permalink raw reply [flat|nested] 15+ messages in thread
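The send-time demultiplexing described in the message above (a listener only sees the groups it bound to, while a socket bound to group -1 sees all traffic) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: groups are treated as a 32-bit bitmask, and `struct listener` and `deliver()` are hypothetical names invented for the illustration, not part of netlink or the connector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical listener: a socket bound with a bitmask of groups. */
struct listener {
    uint32_t bound_groups;  /* one bit per group; 0xffffffff (-1) means "all" */
    unsigned received;      /* messages delivered so far */
};

/*
 * Deliver a message addressed to `group_mask` to every listener whose
 * bound mask overlaps it. The filtering happens on the sending side, so
 * a listener never sees traffic for groups it did not bind to.
 * Returns the number of listeners that received the message.
 */
static unsigned deliver(struct listener *ls, unsigned n, uint32_t group_mask)
{
    unsigned hits = 0;

    assert(ls != NULL || n == 0);
    for (unsigned i = 0; i < n; i++) {
        if (ls[i].bound_groups & group_mask) {
            ls[i].received++;
            hits++;
        }
    }
    return hits;
}
```

With three listeners bound to group bit 3, group bit 5 and -1, a message for bit 5 reaches the second and third only; the -1 listener is the "receives all traffic" case mentioned above.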
* Re: Netlink connector 2005-07-26 4:45 ` Evgeniy Polyakov 2005-07-26 4:56 ` Stephen Hemminger @ 2005-07-26 6:14 ` Thomas Graf 2005-07-26 6:31 ` Evgeniy Polyakov 1 sibling, 1 reply; 15+ messages in thread From: Thomas Graf @ 2005-07-26 6:14 UTC (permalink / raw) To: Evgeniy Polyakov Cc: Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel, Patrick McHardy * Evgeniy Polyakov <20050726044547.GA32006@2ka.mipt.ru> 2005-07-26 08:45 > On Tue, Jul 26, 2005 at 01:46:04AM +0200, Patrick McHardy (kaber@trash.net) wrote: > > Usually netlink is easily extendable by using nested TLVs. By hiding > > this you basically remove this extensibility. > > Current netlink is not extensible for _many_ different users. Patrick's key point was that by hiding some of the functionality you remove a lot of the flexibility. > It has only 32 sockets. You mean MAX_LINKS? That is the current number of reserved netlink protocols. The ethertaps are obsolete and can be reused so we're currently using 16 out of 256 possible protocols. If that is not enough there are ways to work around this. However, I also see a need for a generic protocol providing a simplified interface for small applications. Nevertheless we should take the time and work things out on the netlink level first; netlink has issues and we should not work around them in an upper layer. > > But my main objection is that it sends everything to userspace even > > if noone is listening. This can't be used for things that generate > > lots of events, and also will get problematic is the number of users > > increases. > > It is a problem for existing netlink - either check in bind time, > what could be done for connector, or in socket creation time. No, I think you are misunderstanding something. As I said, we can easily add a function netlink_nr_subscribers(sk, groups) so the check can be done before starting to build the message.
This is no problem, it simply didn't make sense so far because netlink event messages were mostly used for rare events. > Actually it is not even a problem, since checking is being done, > but after allocation and message filling, such check can be moved into > cn_netlink_send() in connector, but different netlink users, > who prefers to use different sockets, must perform it by itself in each > place, where skb is allocated... Sure, which is the right thing, it makes perfect sense to check before starting the process of building an event and sending it. > Connector is a solution for current situation, > it can be deployed with few casualties. The problem is that netlink is likely to change in order to cope with some recent needs, e.g. ctnetlink but also other current issues which need to be addressed. Therefore I suggest building connector on top of the updated netlink so we have one thing less to worry about when thinking about compatibility. > Creating a new netlink2 socket for device, which wants to replace ioctl > controlling or broadcast it's state is a wrong way. Slow down: we might need netlink2 _in case_ we cannot work things out without breaking compatibility. This has nothing to do with the connector, there are netlink users which have new needs such as more groups, at least some of them need the flexibility of netlink itself so we have to work things out for them. > Different sockets/flows does not allow easy flow control. I'm not sure what you mean. > We have one pipe - ethernet, and many protocols inside this pipe > with different headers - it is the same here - netlink is such a pipe, > and with connector it allows to have different protocols in it. At least parts of your connector are just a redundant implementation of what netlink is already capable of doing. Sure, some of them have issues but there is no reason to just build a new protocol on top of another one if the protocol beneath has issues which can be resolved.
> > You still have to take care of mixed 64/32 bit environments, u64 fields > > for example are differently alligned. > > Connector has a size in it's header - ioctl does not. You have exactly the same issues as netlink as soon as you transfer structs, believe it or not. > It does not "fix" the "problem" of skb management knowledge, which I > described. Yes ok, this is a different issue and as Patrick stated already those have been mostly worked out by providing a new set of macros. Except for a few leftovers, which will be addressed, there is no need to call skb functions anymore. The reason the plain skb interface was used is simply that the authors of most of the netlink using code are in fact very familiar with the skb interface, that's it. > > You can still built this stuff on top, but the workarounds for netlink > > limitations need to be fixed in netlink. > > I could not call it workaround, I think it is a management layer, > which allows : Listen, nobody wants to take away your baby. ;-> There are some objections about things which should rather be fixed in the netlink layer first, with the remaining part that is missing going into the connector. I see a lot of replicated netlink code in the connector which is not necessary. I perfectly agree with you that we require some form of simplified addressing and easier message handling for simple applications but just building another layer on top of netlink without respecting the capabilities of netlink itself is not the way to go as I see it. For example, we'll probably add a new group subscription mechanism to netlink which might perfectly suit the needs of your connector. > 1. easy usage. Just register a callback and that is all. Callback will > be invoced each time new message arrives. No need to > dequeue/free/anything. Good point, also doable in netlink directly. Just get rid of the usual family_rcv -> family_rcv_skb -> family_rcv_msg process and do a callback registration interface instead.
However, often the processing of a message and the resulting ack must be done as an atomic operation, e.g. rtnetlink. > 2. easy usage. Call one function for message delivering, which can > care of nonexistent users, perform flow control, congestion control, > guarantee delivery and any other. I don't understand what exactly you mean but netlink itself is not reliable under memory pressure. ^ permalink raw reply [flat|nested] 15+ messages in thread
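The point above about mixed 64/32-bit environments (a size field in the header does not help once structs are transferred) can be shown with a small layout check. This is a sketch, not code from netlink or the connector: `struct event_naive` and `struct event_fixed` are hypothetical message layouts. On the common i386 ABI a 64-bit member inside a struct is only 4-byte aligned, while on x86-64 it is 8-byte aligned, so the naive struct has different offsets on the two sides of the boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Naive layout: `value` typically sits at offset 4 on i386, 8 on x86-64. */
struct event_naive {
    uint32_t id;
    uint64_t value;
};

/* Explicit padding pins `value` at offset 8 under both ABIs. */
struct event_fixed {
    uint32_t id;
    uint32_t pad;    /* reserved, always zero */
    uint64_t value;
};

/* Offset of the 64-bit field as seen by the compiler building this code. */
static size_t naive_value_offset(void)
{
    return offsetof(struct event_naive, value);
}

/* Sanity check that the padded layout is the same wherever it is built. */
static void check_fixed_layout(void)
{
    assert(offsetof(struct event_fixed, value) == 8);
    assert(sizeof(struct event_fixed) == 16);
}
```

A 32-bit userspace program and a 64-bit kernel can only exchange `struct event_fixed` as raw bytes safely; with `struct event_naive` each side would read `value` from a different offset.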
* Re: Netlink connector 2005-07-26 6:14 ` Thomas Graf @ 2005-07-26 6:31 ` Evgeniy Polyakov 0 siblings, 0 replies; 15+ messages in thread From: Evgeniy Polyakov @ 2005-07-26 6:31 UTC (permalink / raw) To: Thomas Graf Cc: Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel, Patrick McHardy On Tue, Jul 26, 2005 at 08:14:47AM +0200, Thomas Graf (tgraf@suug.ch) wrote: > * Evgeniy Polyakov <20050726044547.GA32006@2ka.mipt.ru> 2005-07-26 08:45 > > On Tue, Jul 26, 2005 at 01:46:04AM +0200, Patrick McHardy (kaber@trash.net) wrote: > > > Usually netlink is easily extendable by using nested TLVs. By hiding > > > this you basically remove this extensibility. > > > > Current netlink is not extensible for _many_ different users. > > Patrick's key point was that by hiding some of the functionality > you remove a lot of the flexbility. > > > It has only 32 sockets. > > You mean MAX_LINKS? That is the current number of reserved > netlink protocols. The ethertaps are obsolete and can be > reused so we're currently using 16 out of 256 possible > protocols. If that is not enough there are ways to work > around this. However, I also see a need for a generic protocol > providing a simplified interface for small applications. > Nevertheless we should take the time and work things out on > the netlink level first, netlink has issues and we should not > work around them in a upper layer. > > > > But my main objection is that it sends everything to userspace even > > > if noone is listening. This can't be used for things that generate > > > lots of events, and also will get problematic is the number of users > > > increases. > > > > It is a problem for existing netlink - either check in bind time, > > what could be done for connector, or in socket creation time. > > No, I think you are misunderstanding something. As I said, we can > easly add a function netlink_nr_subscribers(sk, groups) so the > check can be done before starting to build the message. 
This is > no problem, it simply didn't make sense so far because netlink > event messages were mostly used for rare events. Yep. > > Actually it is not even a problem, since checking is being done, > > but after allocation and message filling, such check can be moved into > > cn_netlink_send() in connector, but different netlink users, > > who prefers to use different sockets, must perform it by itself in each > > place, where skb is allocated... > > Sure, which is the right thing, it makes perfect sense to check > before starting the process of building and event and sending it. > > > Connector is a solution for current situation, > > it can be deployed with few casualties. > > The problem is that netlink is likely to change in order > to cope with some recent needs, e.g. ctnetlink but also other > current issues which need to be addressed. Therefore I suggest > to build connector on top of the updated netlink so you we have > one thing less to worry about when thinking about compatibility. > > > Creating a new netlink2 socket for device, which wants to replace ioctl > > controlling or broadcast it's state is a wrong way. > > Slowly, we might need netlink2 _in case_ we cannot work things > out without breaking compatibility. This has nothing to do with > the connector, there are netlink users which have new needs such > as more groups, at least some of them need the flexibility of > netlink itself so we have to work things out for them. > > > Different sockets/flows does not allow easy flow control. > > I'm not sure what you mean. Consider a socket overrun - the message will be dropped. Using special flags in the connector [its size field was chosen to be 4 bytes, so it has plenty of reserve] this subsystem can requeue the message later, after a timeout or something similar...
> > We have one pipe - ethernet, and many protocols inside this pipe > > with different headers - it is the same here - netlink is such a pipe, > > and with connector it allows to have different protocols in it. > > At least parts of your connector is just a redudant implementation > of what netlink is already capable of doing. Sure, some of them > have issues but there is no reason to just build a new protocol on > top of another one if the protocol beneath has issues which can be > resolved. > > > > You still have to take care of mixed 64/32 bit environments, u64 fields > > > for example are differently alligned. > > > > Connector has a size in it's header - ioctl does not. > > You have exactly the same issues as netlink as soon as you transfer > structs, believe it or not. > > > It does not "fix" the "problem" of skb management knowledge, which I > > described. > > Yes ok, this is a different issue and as Patrick stated already > those have been mostly worked out by providing a new set of > macros. Except for a few leftovers, which will be addressed, there > is no need to call skb functions anymore. The reason the plain > skb interface was used is simply that the authors of most of the > netlink using code are in fact very familiar with the skb interface, > that's it. I saw your changes - they are very useful, but _only_ for the sending part. A kernel receiver still needs dequeuing, freeing and NLKMSG macros. In the early netlink days it also needed skb_recv_msg() or something similar... > > > You can still built this stuff on top, but the workarounds for netlink > > > limitations need to be fixed in netlink. > > > > I could not call it workaround, I think it is a management layer, > > which allows : > > Listen, nobody wants to take away your baby. ;-> There are some Yeah :) > objections of things which would rather be fixed in the netlink > layer first and the remaining part that is missing goes into the > connector.
I see a lot of replicated netlink code in the connector > which is no necessary. I perfectly agree with you that we require > some form of simplified addressing and easier message handling > for simple applications but just building another layer on top > of netlink without respecting the capabilities of netlink itself > is not the way to go as I see it. For example, we'll probably add > a new group subscription mechanism to netlink which might perfectly > suit the needs of your connector. That is why I raise this question again and again, to see which ideas should be moved from connector into netlink and vice versa... :) > > 1. easy usage. Just register a callback and that is all. Callback will > > be invoced each time new message arrives. No need to > > dequeue/free/anything. > > Good point, also doable in netlink directly. Just get rid of the > usual family_rcv -> family_rcv_skb -> family_rcv_msg process and > do a callback registration interface instead. However, often the > processing of a message and the resulting ack must be done as an > atomic operation, e.g. rtnetlink. It is also better to move it into a workqueue - just to be sure users will not do anything wrong... > > 2. easy usage. Call one function for message delivering, which can > > care of nonexistent users, perform flow control, congestion control, > > guarantee delivery and any other. > > I don't understand what exactly you mean but netlink itself > is not reliable under memory pressure. Connector has cn_netlink_send(), which is a wrapper on top of skb allocation, queuing and so on. By flow/congestion control I mean here that this function can check whether a remote peer exists, requeue the message if a socket overrun is detected, make sure no OOM condition was hit and requeue if it was, and so on. -- Evgeniy Polyakov ^ permalink raw reply [flat|nested] 15+ messages in thread
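The behaviour described for a single send function (check that a peer exists, requeue on socket overrun) together with callback-based receiving can be sketched as a userspace toy. None of this is the connector's actual code: `struct peer`, `struct retry_slot`, `bus_send()`, `bus_drain()` and `sum_cb()` are hypothetical names, the "socket" is a two-entry array, and only one message can be parked for retry at a time.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 2   /* tiny receive queue so an overrun is easy to trigger */

enum send_result { SENT, NO_LISTENER, REQUEUED };

/* Hypothetical receiver: a bounded queue plus a registered callback. */
struct peer {
    int listening;               /* is anyone bound at all? */
    int queue[QUEUE_CAP];
    size_t queued;
    void (*callback)(int msg);   /* invoked once per drained message */
};

/* Single parking slot; a second overrun would overwrite the first. */
struct retry_slot { int msg; int pending; };

/* Example callback: the receiver never dequeues or frees anything itself. */
static int seen_sum;
static void sum_cb(int msg) { seen_sum += msg; }

/* One entry point for senders: check for a listener first, then try to
 * enqueue, and park the message for a later retry if the queue is full. */
static enum send_result bus_send(struct peer *p, struct retry_slot *r, int msg)
{
    assert(p != NULL && r != NULL);
    if (!p->listening)
        return NO_LISTENER;      /* nothing is queued for nobody */
    if (p->queued == QUEUE_CAP) {
        r->msg = msg;
        r->pending = 1;
        return REQUEUED;
    }
    p->queue[p->queued++] = msg;
    return SENT;
}

/* Receiver side: hand every queued message to the callback, then retry
 * the parked message now that the queue has room again. */
static void bus_drain(struct peer *p, struct retry_slot *r)
{
    for (size_t i = 0; i < p->queued; i++)
        p->callback(p->queue[i]);
    p->queued = 0;
    if (r->pending) {
        r->pending = 0;
        bus_send(p, r, r->msg);
    }
}
```

In this model the sender calls one function and learns whether the message was queued, parked or had no listener, and the receiver only supplies `sum_cb()`. Whether such retry logic belongs in a layer on top of netlink or in netlink itself is exactly what the thread is arguing about.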
Thread overview: 15+ messages
[not found] <20050723125427.GA11177@rama>
[not found] ` <20050723091455.GA12015@2ka.mipt.ru>
[not found] ` <20050724.191756.105797967.davem@davemloft.net>
[not found] ` <Lynx.SEL.4.62.0507250154000.21934@thoron.boston.redhat.com>
2005-07-25 7:06 ` Netlink connector Evgeniy Polyakov
2005-07-25 14:32 ` Patrick McHardy
2005-07-25 14:43 ` Eric Leblond
2005-07-25 19:33 ` Evgeniy Polyakov
2005-07-26 8:45 ` Harald Welte
2005-07-25 19:28 ` Evgeniy Polyakov
2005-07-25 23:46 ` Patrick McHardy
2005-07-25 23:56 ` Thomas Graf
2005-07-26 0:16 ` Patrick McHardy
2005-07-26 0:30 ` Thomas Graf
2005-07-26 4:45 ` Evgeniy Polyakov
2005-07-26 4:56 ` Stephen Hemminger
2005-07-26 5:01 ` Evgeniy Polyakov
2005-07-26 6:14 ` Thomas Graf
2005-07-26 6:31 ` Evgeniy Polyakov