Re: Netlink connector - Evgeniy Polyakov

From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
To: Stephen Hemminger <shemminger@osdl.org>
Cc: Andrew Morton <akpm@osdl.org>,
	Harald Welte <laforge@netfilter.org>,
	netdev@vger.kernel.org, netfilter-devel@lists.netfilter.org,
	linux-kernel@vger.kernel.org, Patrick McHardy <kaber@trash.net>
Subject: Re: Netlink connector
Date: Tue, 26 Jul 2005 09:01:56 +0400
Message-ID: <20050726050156.GA15653@2ka.mipt.ru>
In-Reply-To: <42E5C298.8010209@osdl.org>

On Mon, Jul 25, 2005 at 09:56:56PM -0700, Stephen Hemminger (shemminger@osdl.org) wrote:
> Evgeniy Polyakov wrote:
> 
> >On Tue, Jul 26, 2005 at 01:46:04AM +0200, Patrick McHardy 
> >(kaber@trash.net) wrote:
> > 
> >
> >>Evgeniy Polyakov wrote:
> >>   
> >>
> >>>On Mon, Jul 25, 2005 at 04:32:32PM +0200, Patrick McHardy 
> >>>(kaber@trash.net) wrote:
> >>>
> >>>     
> >>>
> >>>>If I understand correctly it tries to work around some netlink
> >>>>limitations (limited number of netlink families and multicast groups)
> >>>>by sending everything to userspace and demultiplexing it there.
> >>>>Same in the other direction, an additional layer on top of netlink
> >>>>does basically the same thing netlink already does. This looks like
> >>>>a step in the wrong direction to me, netlink should instead be fixed
> >>>>to support what is needed.
> >>>>       
> >>>>
> >>>Not only that.
> >>>The main _first_ idea was to simplify userspace message handling as much
> >>>as possible.
> >>>In the first releases I called it ioctl-ng - any module that wants to
> >>>communicate with userspace in the way ioctl does,
> >>>     
> >>>
> >>Usually netlink is easily extendable by using nested TLVs. By hiding
> >>this you basically remove this extensibility.
> >>   
> >>
> >
> >Current netlink is not extensible for _many_ different users.
> >It has only 32 sockets.
> >
> > 
> >
> >>>requires skb allocation/freeing/handling.
> >>>Does an RTC driver writer need to know the difference between
> >>>a shared and a cloned skb? Should a kernel user of such a message bus
> >>>have to know about skbs at all?
> >>>     
> >>>
> >>Netlink users don't have to care about shared or cloned skbs. I don't
> >>think it's a big issue to use alloc_skb and then the usual netlink
> >>macros. Thomas added a number of macros that simplify use a lot.
> >>   
> >>
> >
> >A kernel user must also know the difference between unicast and broadcast,
> >how to dequeue the skb, how to free it and in what context.
> >ioctl users do not need to know how file_operations is bound to a file.
> >
> > 
> >
> >>But my main objection is that it sends everything to userspace even
> >>if no one is listening. This can't be used for things that generate
> >>lots of events, and will also get problematic if the number of users
> >>increases.
> >>   
> >>
> >
> >It is a problem for existing netlink - either check at bind time,
> >which could be done for connector, or at socket creation time.
> >
> >Actually it is not even a problem, since the check is being done,
> >but only after allocation and message filling. Such a check can be moved
> >into cn_netlink_send() in connector, but the different netlink users
> >who prefer to use different sockets must perform it themselves in each
> >place where an skb is allocated...
> >
> >Connector is a solution for the current situation,
> >it can be deployed with few casualties.
> >Creating a new netlink2 socket for a device which wants to replace ioctl
> >control or broadcast its state is the wrong way.
> >Different sockets/flows do not allow easy flow control.
> >
> >We have one pipe - ethernet, and many protocols inside this pipe
> >with different headers - it is the same here - netlink is such a pipe,
> >and connector allows it to carry different protocols.
> >
> > 
> >
> >>>With a char device I only need to register my callback - with kernel
> >>>connector it is the same, but it allows using the whole power of netlink,
> >>>especially without nice ioctl features like different pointer sizes
> >>>in userspace and kernelspace.
> >>>     
> >>>
> >>You still have to take care of mixed 64/32 bit environments, u64 fields
> >>for example are aligned differently.
> >>   
> >>
> >
> >Connector has a size in its header - ioctl does not.
> >
> > 
> >
> >>>And the number of free netlink sockets is _very_ small, especially
> >>>if a new one is allocated for simple notifications, which can easily
> >>>be done using connector.
> >>>     
> >>>
> >>Then fix it so we can use more families and groups. I started some work
> >>on this, but I'm not sure if I have time to complete it.
> >>   
> >>
> >
> >It does not "fix" the "problem" of skb management knowledge, which I
> >described.
> >Netlink is a transport protocol, some general logic must be created on
> >top of it, like it is done in TCP/IP.
> >
> > 
> >
> >>>And netlink can be extended to support it - netlink is a transport
> >>>protocol, it should not care about higher layer message handling,
> >>>connector instead will deliver message to the end user in a very
> >>>convenient form.
> >>>     
> >>>
> >>You can still build this stuff on top, but the workarounds for netlink
> >>limitations need to be fixed in netlink.
> >>   
> >>
> >
> >I would not call it a workaround, I think it is a management layer,
> >which allows:
> >1. Easy usage. Just register a callback and that is all. The callback will
> >be invoked each time a new message arrives. No need to
> >dequeue/free/anything.
> >2. Easy usage. Call one function for message delivery, which can take
> >care of nonexistent users, perform flow control, congestion control,
> >guarantee delivery and so on.
> >3. Easy deployment - the current implementation is very simple, and it
> >works with existing netlink.
> >4. It is a logical level on top of a transport protocol, it is UDP/IP over
> >ethernet :)
> >
> > 
> >
> If it is a transport, then it should be in the kernel. Otherwise, it
> becomes painful for applications with multiple input sources.  Think of
> epoll/poll/select and threads, doing the demultiplexing in user space
> would be a pain for applications and libraries.

It _is_ in the kernel - multiplexing is done at send time, and
userspace does not receive messages for other IDs.
Currently it is done using netlink groups, and I would like to change
that, but the connector layer itself will not change, so no application
will need to change - they bind now and will still only need to bind after.
One socket, different groups.

Ok, right now an application bound to the -1 group will receive all
traffic, but I posted a proof-of-concept patch to remove that behaviour.
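
To make the "multiplexing at send time" point concrete, here is a minimal
userspace sketch in C. It is not the connector code and not its API:
demo_id, demo_register() and demo_send() are hypothetical names invented
for this illustration, and the small fixed-size table is only an assumption
to keep the example short. It shows the idea that the sender looks up the
ID first, so a message reaches only the callback registered for that ID,
and a send to an ID nobody registered fails early.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message ID, made up for this sketch. */
struct demo_id {
	uint32_t idx;
	uint32_t val;
};

typedef void (*demo_callback)(const void *data, size_t len, void *priv);

struct demo_entry {
	struct demo_id id;
	demo_callback cb;
	void *priv;
	int used;
};

#define DEMO_MAX_ENTRIES 8
static struct demo_entry demo_table[DEMO_MAX_ENTRIES];

static int demo_id_equal(struct demo_id a, struct demo_id b)
{
	return a.idx == b.idx && a.val == b.val;
}

/* Register one callback per ID. Returns 0 on success, -1 if the ID is
 * already taken or the table is full. */
int demo_register(struct demo_id id, demo_callback cb, void *priv)
{
	int free_slot = -1;

	assert(cb != NULL);
	for (int i = 0; i < DEMO_MAX_ENTRIES; i++) {
		if (demo_table[i].used) {
			if (demo_id_equal(demo_table[i].id, id))
				return -1;
		} else if (free_slot < 0) {
			free_slot = i;
		}
	}
	if (free_slot < 0)
		return -1;

	demo_table[free_slot].id = id;
	demo_table[free_slot].cb = cb;
	demo_table[free_slot].priv = priv;
	demo_table[free_slot].used = 1;
	return 0;
}

/* Demultiplex at send time: only the callback registered for this ID is
 * invoked. Returns 0 if delivered, -1 if there is no listener. */
int demo_send(struct demo_id id, const void *data, size_t len)
{
	for (int i = 0; i < DEMO_MAX_ENTRIES; i++) {
		if (demo_table[i].used && demo_id_equal(demo_table[i].id, id)) {
			demo_table[i].cb(data, len, demo_table[i].priv);
			return 0;
		}
	}
	return -1;
}

/* Example callback: adds the message length to a counter passed as priv. */
void demo_count_bytes(const void *data, size_t len, void *priv)
{
	(void)data;
	*(size_t *)priv += len;
}
```

A caller would register demo_count_bytes() for two different IDs with two
separate counters; a demo_send() to the first ID then changes only the
first counter, and a send to an unregistered ID returns -1 without
invoking anything.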

> The other way to go is to use something like dbus/hal and use a higher level
> application oriented interface. The problem with that approach is that it
> assumes every management app wants to drag in gnome..

No need to parse headers there.
When we read from a UDP socket, we do not get headers - connector users
do not read the netlink header, and it is even possible to remove the
connector header completely, although I would like to keep it - some kind
of HDRINCL option...
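
As a rough illustration of a layer stripping its own header before the
user sees the payload, here is a small userspace sketch in C. The header
layout below is an assumption made up for the example (fixed-width fields
plus an explicit length); it is not the real connector header, and
demo_hdr and demo_strip_header() are hypothetical names. The explicit
length is what lets the receiver reject a truncated message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header with fixed-width fields only, so its layout does
 * not depend on pointer size. NOT the real connector layout. */
struct demo_hdr {
	uint32_t idx;
	uint32_t val;
	uint32_t len;	/* number of payload bytes following the header */
};

/* Copy the header out of buf and return a pointer to the payload inside
 * buf. Returns NULL if the buffer is shorter than the header, or if the
 * declared payload length exceeds the bytes actually received. */
const unsigned char *demo_strip_header(const unsigned char *buf,
				       size_t buflen,
				       struct demo_hdr *hdr_out)
{
	assert(buf != NULL && hdr_out != NULL);

	if (buflen < sizeof(*hdr_out))
		return NULL;

	/* memcpy avoids an unaligned read from the raw buffer. */
	memcpy(hdr_out, buf, sizeof(*hdr_out));

	if (hdr_out->len > buflen - sizeof(*hdr_out))
		return NULL;

	return buf + sizeof(*hdr_out);
}
```

The caller gets only the payload pointer and the already validated header
copy, and never has to walk the raw buffer itself.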

-- 
	Evgeniy Polyakov
