* [PATCH] 1 Wire drivers illegally overload NETLINK_NFLOG
@ 2005-07-23 12:54 Harald Welte
2005-07-23 3:05 ` YOSHIFUJI Hideaki / 吉藤英明
2005-07-23 9:14 ` Evgeniy Polyakov
0 siblings, 2 replies; 26+ messages in thread
From: Harald Welte @ 2005-07-23 12:54 UTC (permalink / raw)
To: David Miller
Cc: Evgeniy Polyakov, Netfilter Development Mailinglist,
Linux Kernel Mailinglist
[-- Attachment #1.1: Type: text/plain, Size: 1093 bytes --]
Hi Dave,
Hi Evgeniy,
the following patch fixes the illegal use of NETLINK_NFLOG by the
1wire drivers. It assumes that the netlink tap families can now safely
be reclaimed, which is the case according to Dave at netconf'05.
I'm not sure who would be the right person to fix this, but this patch
needs to go into both 2.6.12.x and 2.6.13 trees, since it potentially
causes a security problem by preventing the iptables ULOG
This has been the third new piece of code that reuses NETLINK_NFLOG
within a couple of months. I would really appreciate if people would
actually ask/apply for a new protocol number instead of just overloading
existing values and thereby causing breakage.
Thanks,
Harald
--
- Harald Welte <laforge@netfilter.org> http://netfilter.org/
============================================================================
"Fragmentation is like classful addressing -- an interesting early
architectural error that shows how much experimentation was going
on while IP was being designed." -- Paul Vixie
[-- Attachment #1.2: 06-w1-nflog.patch --]
[-- Type: text/plain, Size: 1787 bytes --]
Give the 1-wire driver stack its own netlink protocol number, instead of
overloading NETLINK_NFLOG.
I wonder what I have done to people, that they always overload the
NETLINK_NFLOG protocol number and thereby effectively prevent the packet
filter logging mechanism. Please don't re-use protocol numbers.
Signed-off-by: Harald Welte <laforge@netfilter.org>
---
commit b4a566c332048b642506eff7de825fce710ff42c
tree 07ef162f6d449dd67c586c9c63680004787b86c5
parent d5d3fb40b6db511dbd47a84634a1249de6b7b297
author laforge <laforge@netfilter.org> Sa, 23 Jul 2005 08:41:24 -0400
committer laforge <laforge@netfilter.org> Sa, 23 Jul 2005 08:41:24 -0400
drivers/w1/w1_int.c | 4 ++--
include/linux/netlink.h | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/w1/w1_int.c b/drivers/w1/w1_int.c
--- a/drivers/w1/w1_int.c
+++ b/drivers/w1/w1_int.c
@@ -88,10 +88,10 @@ static struct w1_master * w1_alloc_dev(u
dev->groups = 23;
dev->seq = 1;
- dev->nls = netlink_kernel_create(NETLINK_NFLOG, NULL);
+ dev->nls = netlink_kernel_create(NETLINK_W1, NULL);
if (!dev->nls) {
printk(KERN_ERR "Failed to create new netlink socket(%u) for w1 master %s.\n",
- NETLINK_NFLOG, dev->dev.bus_id);
+ NETLINK_W1, dev->dev.bus_id);
}
err = device_register(&dev->dev);
diff --git a/include/linux/netlink.h b/include/linux/netlink.h
--- a/include/linux/netlink.h
+++ b/include/linux/netlink.h
@@ -20,7 +20,7 @@
#define NETLINK_IP6_FW 13
#define NETLINK_DNRTMSG 14 /* DECnet routing messages */
#define NETLINK_KOBJECT_UEVENT 15 /* Kernel messages to userspace */
-#define NETLINK_TAPBASE 16 /* 16 to 31 are ethertap */
+#define NETLINK_W1 16 /* 16 to 31 are ethertap */
#define MAX_LINKS 32
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
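Editor's sketch (not part of the original mail): a toy model of why sharing a protocol number breaks the other user. It assumes, as the mail implies, that only one kernel-side user can own a given netlink protocol number at a time. The registry and `claim_protocol()` are hypothetical stand-ins, not the kernel's `netlink_kernel_create()`. `NETLINK_W1` takes the value 16 proposed in the patch above, and `NETLINK_NFLOG` is 5 as in `include/linux/netlink.h` of that era.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LINKS      32
#define NETLINK_NFLOG   5   /* netfilter/iptables ULOG */
#define NETLINK_W1     16   /* value proposed in the patch above */

/* Toy registry (hypothetical): one owner per netlink protocol number. */
static const char *proto_owner[MAX_LINKS];

/* Claim a protocol number; 0 on success, -1 if out of range or taken. */
static int claim_protocol(int proto, const char *who)
{
    assert(who != NULL);
    if (proto < 0 || proto >= MAX_LINKS)
        return -1;
    if (proto_owner[proto] != NULL)
        return -1;              /* somebody else already owns this number */
    proto_owner[proto] = who;
    return 0;
}
```

In this model, if w1 claims `NETLINK_NFLOG` first, a later claim by the ULOG code fails. Once w1 has its own number, both claims succeed.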
* Re: [PATCH] 1 Wire drivers illegally overload NETLINK_NFLOG
2005-07-23 12:54 [PATCH] 1 Wire drivers illegally overload NETLINK_NFLOG Harald Welte
@ 2005-07-23 3:05 ` YOSHIFUJI Hideaki / 吉藤英明
2005-07-23 13:33 ` Harald Welte
2005-07-23 9:14 ` Evgeniy Polyakov
1 sibling, 1 reply; 26+ messages in thread
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2005-07-23 3:05 UTC (permalink / raw)
To: laforge
Cc: davem, johnpol, netfilter-devel, linux-kernel, yoshfuji

In article <20050723125427.GA11177@rama> (at Sat, 23 Jul 2005 08:54:27 -0400), Harald Welte <laforge@netfilter.org> says:

> --- a/include/linux/netlink.h
> +++ b/include/linux/netlink.h
> @@ -20,7 +20,7 @@
> #define NETLINK_IP6_FW 13
> #define NETLINK_DNRTMSG 14 /* DECnet routing messages */
> #define NETLINK_KOBJECT_UEVENT 15 /* Kernel messages to userspace */
> -#define NETLINK_TAPBASE 16 /* 16 to 31 are ethertap */
> +#define NETLINK_W1 16 /* 16 to 31 are ethertap */
>
> #define MAX_LINKS 32

Comment says that 16-31 are used for ethertap.
So, probably assign NETLINK_W1 at 32, and bump MAX_LINKS?

--yoshfuji
* Re: [PATCH] 1 Wire drivers illegally overload NETLINK_NFLOG
2005-07-23 3:05 ` YOSHIFUJI Hideaki / 吉藤英明
@ 2005-07-23 13:33 ` Harald Welte
2005-07-25 2:09 ` David S. Miller
2005-07-25 2:15 ` David S. Miller
0 siblings, 2 replies; 26+ messages in thread
From: Harald Welte @ 2005-07-23 13:33 UTC (permalink / raw)
To: YOSHIFUJI Hideaki / 吉藤英明
Cc: davem, johnpol, netfilter-devel, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1544 bytes --]

On Fri, Jul 22, 2005 at 11:05:59PM -0400, YOSHIFUJI Hideaki / 吉藤英明 wrote:
> In article <20050723125427.GA11177@rama> (at Sat, 23 Jul 2005 08:54:27 -0400), Harald Welte <laforge@netfilter.org> says:
>
> > --- a/include/linux/netlink.h
> > +++ b/include/linux/netlink.h
> > @@ -20,7 +20,7 @@
> > #define NETLINK_IP6_FW 13
> > #define NETLINK_DNRTMSG 14 /* DECnet routing messages */
> > #define NETLINK_KOBJECT_UEVENT 15 /* Kernel messages to userspace */
> > -#define NETLINK_TAPBASE 16 /* 16 to 31 are ethertap */
> > +#define NETLINK_W1 16 /* 16 to 31 are ethertap */
> >
> > #define MAX_LINKS 32
>
> Comment says that 16-31 are used for ethertap.
> So, probably assign NETLINK_W1 at 32, and bump MAX_LINKS?

MAX_LINKS > 32 would result in larger statically allocated pointer
arrays. It would also only work if NPROTO is increased too, IIRC.

I strongly disrecommend increasing NPROTO. Maybe we should look into
reusing NETLINK_FIREWALL (which was an old 2.2.x kernel interface).

But to be honest, I don't really care all that much as long as existing
and still very actively used values are not just overloaded.

--
- Harald Welte <laforge@netfilter.org> http://netfilter.org/
============================================================================
"Fragmentation is like classful addressing -- an interesting early
architectural error that shows how much experimentation was going
on while IP was being designed." -- Paul Vixie
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
* Re: [PATCH] 1 Wire drivers illegally overload NETLINK_NFLOG
2005-07-23 13:33 ` Harald Welte
@ 2005-07-25 2:09 ` David S. Miller
2005-07-25 2:15 ` David S. Miller
1 sibling, 0 replies; 26+ messages in thread
From: David S. Miller @ 2005-07-25 2:09 UTC (permalink / raw)
To: laforge
Cc: yoshfuji, johnpol, netfilter-devel, linux-kernel

From: Harald Welte <laforge@netfilter.org>
Date: Sat, 23 Jul 2005 09:33:53 -0400

> I strongly disrecommend increasing NPROTO. Maybe we should look into
> reusing NETLINK_FIREWALL (which was an old 2.2.x kernel interface).

That is how I will fix this 1-wire case, by reusing the
NETLINK_FIREWALL thing.

> But to be honest, I don't really care all that much as long as existing
> and still very actively used values are not just overloaded.

Absolutely.
* Re: [PATCH] 1 Wire drivers illegally overload NETLINK_NFLOG
2005-07-23 13:33 ` Harald Welte
2005-07-25 2:09 ` David S. Miller
@ 2005-07-25 2:15 ` David S. Miller
2005-07-26 9:48 ` Harald Welte
1 sibling, 1 reply; 26+ messages in thread
From: David S. Miller @ 2005-07-25 2:15 UTC (permalink / raw)
To: laforge
Cc: yoshfuji, johnpol, netfilter-devel, linux-kernel

From: Harald Welte <laforge@netfilter.org>
Date: Sat, 23 Jul 2005 09:33:53 -0400

> I strongly disrecommend increasing NPROTO. Maybe we should look into
> reusing NETLINK_FIREWALL (which was an old 2.2.x kernel interface).

ip_queue.c still uses NETLINK_FIREWALL so we really can't use
that.

So instead, as in the patch below, I solved this for now by using
the NETLINK_SKIP value which was reserved years ago yet never
made use of.

diff --git a/drivers/w1/w1_int.c b/drivers/w1/w1_int.c
--- a/drivers/w1/w1_int.c
+++ b/drivers/w1/w1_int.c
@@ -88,7 +88,7 @@ static struct w1_master * w1_alloc_dev(u
dev->groups = 23;
dev->seq = 1;
- dev->nls = netlink_kernel_create(NETLINK_NFLOG, NULL);
+ dev->nls = netlink_kernel_create(NETLINK_W1, NULL);
if (!dev->nls) {
printk(KERN_ERR "Failed to create new netlink socket(%u) for w1 master %s.\n",
NETLINK_NFLOG, dev->dev.bus_id);
diff --git a/include/linux/netlink.h b/include/linux/netlink.h
--- a/include/linux/netlink.h
+++ b/include/linux/netlink.h
@@ -5,7 +5,7 @@
#include <linux/types.h>
#define NETLINK_ROUTE 0 /* Routing/device hook */
-#define NETLINK_SKIP 1 /* Reserved for ENskip */
+#define NETLINK_W1 1 /* 1-wire subsystem */
#define NETLINK_USERSOCK 2 /* Reserved for user mode socket protocols */
#define NETLINK_FIREWALL 3 /* Firewalling hook */
#define NETLINK_TCPDIAG 4 /* TCP socket monitoring */
* Re: [PATCH] 1 Wire drivers illegally overload NETLINK_NFLOG
2005-07-25 2:15 ` David S. Miller
@ 2005-07-26 9:48 ` Harald Welte
0 siblings, 0 replies; 26+ messages in thread
From: Harald Welte @ 2005-07-26 9:48 UTC (permalink / raw)
To: David S. Miller
Cc: johnpol, netfilter-devel, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 954 bytes --]

On Sun, Jul 24, 2005 at 07:15:05PM -0700, David S. Miller wrote:
> > I strongly disrecommend increasing NPROTO. Maybe we should look into
> > reusing NETLINK_FIREWALL (which was an old 2.2.x kernel interface).
>
> ip_queue.c still uses NETLINK_FIREWALL so we really can't use
> that.

sorry, I didn't remember that ip_queue reused the 2.2.x netlink number :(
We should have renamed it to make it clear.

> So instead, as in the patch below, I solved this for now by using
> the NETLINK_SKIP value which was reserved years ago yet never
> made use of.

thanks.

--
- Harald Welte <laforge@netfilter.org> http://netfilter.org/
============================================================================
"Fragmentation is like classful addressing -- an interesting early
architectural error that shows how much experimentation was going
on while IP was being designed." -- Paul Vixie
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
* Re: [PATCH] 1 Wire drivers illegally overload NETLINK_NFLOG
2005-07-23 12:54 [PATCH] 1 Wire drivers illegally overload NETLINK_NFLOG Harald Welte
2005-07-23 3:05 ` YOSHIFUJI Hideaki / 吉藤英明
@ 2005-07-23 9:14 ` Evgeniy Polyakov
2005-07-25 2:17 ` David S. Miller
1 sibling, 1 reply; 26+ messages in thread
From: Evgeniy Polyakov @ 2005-07-23 9:14 UTC (permalink / raw)
To: Harald Welte, David Miller, Netfilter Development Mailinglist, Linux Kernel Mailinglist

On Sat, Jul 23, 2005 at 08:54:27AM -0400, Harald Welte (laforge@netfilter.org) wrote:
> Hi Dave,
> Hi Evgeniy,
>
> the following patch fixes the illegal use of NETLINK_NFLOG by the
> 1wire drivers. It assumes that the netlink tap families can now safely
> be reclaimed, which is the case according to Dave at netconf'05.
>
> I'm not sure who would be the right person to fix this, but this patch
> needs to go into both 2.6.12.x and 2.6.13 trees, since it potentially
> causes a security problem by preventing the iptables ULOG

Yep.
Actually w1 uses it only for simple event notifications, which
definitely will be replaced with connector stuff...
So I would like to ask Dave about it, and if network people are still
against it, I have no objection against this patch.
But I would definitely prefer to move all such simple events into
separate event bus.

> This has been the third new piece of code that reuses NETLINK_NFLOG
> within a couple of months. I would really appreciate if people would
> actually ask/apply for a new protocol number instead of just overloading
> existing values and thereby causing breakage.

I even know who added it... :)

I still have question opened about message bus and connector.
Andrew has no objection against connector and it lives in -mm
quite long time, although was several times removed due to GregKH
i2c tree changes.
All objections against it was only type of - "I do not like it"
Dmitry had some bugfixes which were added.
It was tested under quite heavy load on different types of systems
without overhead (with CBUS) and with _very_ convenient way of
controlling kernelspace from userspace and reverse event bus.

> Thanks,
> Harald
>
> --
> - Harald Welte <laforge@netfilter.org> http://netfilter.org/
> ============================================================================
> "Fragmentation is like classful addressing -- an interesting early
> architectural error that shows how much experimentation was going
> on while IP was being designed." -- Paul Vixie

> Give the 1-wire driver stack its own netlink protocol number, instead of
> overloading NETLINK_NFLOG.
>
> I wonder what I have done to people, that they always overload the
> NETLINK_NFLOG protocol number and thereby effectively prevent the packet
> filter logging mechanism. Please don't re-use protocol numbers.
>
> Signed-off-by: Harald Welte <laforge@netfilter.org>
>
> ---
> commit b4a566c332048b642506eff7de825fce710ff42c
> tree 07ef162f6d449dd67c586c9c63680004787b86c5
> parent d5d3fb40b6db511dbd47a84634a1249de6b7b297
> author laforge <laforge@netfilter.org> Sa, 23 Jul 2005 08:41:24 -0400
> committer laforge <laforge@netfilter.org> Sa, 23 Jul 2005 08:41:24 -0400
>
> drivers/w1/w1_int.c | 4 ++--
> include/linux/netlink.h | 2 +-
> 2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/w1/w1_int.c b/drivers/w1/w1_int.c
> --- a/drivers/w1/w1_int.c
> +++ b/drivers/w1/w1_int.c
> @@ -88,10 +88,10 @@ static struct w1_master * w1_alloc_dev(u
>
> dev->groups = 23;
> dev->seq = 1;
> - dev->nls = netlink_kernel_create(NETLINK_NFLOG, NULL);
> + dev->nls = netlink_kernel_create(NETLINK_W1, NULL);
> if (!dev->nls) {
> printk(KERN_ERR "Failed to create new netlink socket(%u) for w1 master %s.\n",
> - NETLINK_NFLOG, dev->dev.bus_id);
> + NETLINK_W1, dev->dev.bus_id);
> }
>
> err = device_register(&dev->dev);
> diff --git a/include/linux/netlink.h b/include/linux/netlink.h
> --- a/include/linux/netlink.h
> +++ b/include/linux/netlink.h
> @@ -20,7 +20,7 @@
> #define NETLINK_IP6_FW 13
> #define NETLINK_DNRTMSG 14 /* DECnet routing messages */
> #define NETLINK_KOBJECT_UEVENT 15 /* Kernel messages to userspace */
> -#define NETLINK_TAPBASE 16 /* 16 to 31 are ethertap */
> +#define NETLINK_W1 16 /* 16 to 31 are ethertap */
>
> #define MAX_LINKS 32

--
Evgeniy Polyakov
* Re: [PATCH] 1 Wire drivers illegally overload NETLINK_NFLOG
2005-07-23 9:14 ` Evgeniy Polyakov
@ 2005-07-25 2:17 ` David S. Miller
2005-07-25 6:02 ` Netlink connector James Morris
0 siblings, 1 reply; 26+ messages in thread
From: David S. Miller @ 2005-07-25 2:17 UTC (permalink / raw)
To: johnpol
Cc: laforge, netfilter-devel, linux-kernel

From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Date: Sat, 23 Jul 2005 13:14:55 +0400

> Andrew has no objection against connector and it lives in -mm

A patch sitting in -mm has zero significance. A lot of junk and
useless things end up there as often Andrew incorporates just about
every single patch he sees posted to linux-kernel unless he personally
knows of some reason why the change might be wrong.

So "it's in -mm" is not a metric to use.

> All objections against it was only type of - "I do not like it"
> Dmitry had some bugfixes which were added.

People like James Morris had very specific objections about it.

You are a control freak and in general very very difficult to work
with, so nobody wants to help you fix things up.
* Netlink connector
2005-07-25 2:17 ` David S. Miller
@ 2005-07-25 6:02 ` James Morris
2005-07-25 7:06 ` Evgeniy Polyakov
2005-07-26 8:42 ` Harald Welte
0 siblings, 2 replies; 26+ messages in thread
From: James Morris @ 2005-07-25 6:02 UTC (permalink / raw)
To: David S. Miller
Cc: johnpol, Harald Welte, netfilter-devel, linux-kernel, Andrew Morton, netdev

On Sun, 24 Jul 2005, David S. Miller wrote:

> From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
> Date: Sat, 23 Jul 2005 13:14:55 +0400
>
>> Andrew has no objection against connector and it lives in -mm
>
> A patch sitting in -mm has zero significance.

The significance I think is that Andrew is trying to gently encourage some
further progress in the area.

I recall some netconf discussion about TIPC over Netlink, or more likely a
variation thereof, which may be a better way forward.

It's cool stuff http://tipc.sourceforge.net/

- James
--
James Morris
<jmorris@redhat.com>
* Re: Netlink connector
2005-07-25 6:02 ` Netlink connector James Morris
@ 2005-07-25 7:06 ` Evgeniy Polyakov
2005-07-25 14:32 ` Patrick McHardy
2005-07-26 8:42 ` Harald Welte
1 sibling, 1 reply; 26+ messages in thread
From: Evgeniy Polyakov @ 2005-07-25 7:06 UTC (permalink / raw)
To: James Morris
Cc: David S. Miller, Harald Welte, netfilter-devel, linux-kernel, Andrew Morton, netdev, netdev

On Mon, Jul 25, 2005 at 02:02:10AM -0400, James Morris (jmorris@redhat.com) wrote:
> On Sun, 24 Jul 2005, David S. Miller wrote:
> >From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
> >Date: Sat, 23 Jul 2005 13:14:55 +0400
> >
> >>Andrew has no objection against connector and it lives in -mm
> >
> >A patch sitting in -mm has zero significance.

That is why I'm asking netdev@ people again...

> The significance I think is that Andrew is trying to gently encourage some
> further progress in the area.
>
> I recall some netconf discussion about TIPC over Netlink, or more likely a
> variation thereof, which may be a better way forward.
>
> It's cool stuff http://tipc.sourceforge.net/

I read it quite long ago - I'm sure you do not want to use that monster
for event bus.
It was designed and implemented for heavy intermachine communications,
and it is quite hard to setup for userspace <-> kernelspace message bus.

>
> - James
> --
> James Morris
> <jmorris@redhat.com>

--
Evgeniy Polyakov
* Re: Netlink connector
2005-07-25 7:06 ` Evgeniy Polyakov
@ 2005-07-25 14:32 ` Patrick McHardy
2005-07-25 14:43 ` Eric Leblond
2005-07-25 19:28 ` Evgeniy Polyakov
0 siblings, 2 replies; 26+ messages in thread
From: Patrick McHardy @ 2005-07-25 14:32 UTC (permalink / raw)
To: Evgeniy Polyakov
Cc: James Morris, David S. Miller, Harald Welte, netfilter-devel, linux-kernel, Andrew Morton, netdev, netdev

Evgeniy Polyakov wrote:
> On Mon, Jul 25, 2005 at 02:02:10AM -0400, James Morris (jmorris@redhat.com) wrote:
>
>>On Sun, 24 Jul 2005, David S. Miller wrote:
>>
>>>From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
>>>Date: Sat, 23 Jul 2005 13:14:55 +0400
>>>
>>>
>>>>Andrew has no objection against connector and it lives in -mm
>>>
>>>A patch sitting in -mm has zero significance.
>
> That is why I'm asking netdev@ people again...

If I understand correctly it tries to workaround some netlink
limitations (limited number of netlink families and multicast groups)
by sending everything to userspace and demultiplexing it there.
Same in the other direction, an additional layer on top of netlink
does basically the same thing netlink already does. This looks like
a step in the wrong direction to me, netlink should instead be fixed
to support what is needed.
* Re: Netlink connector
2005-07-25 14:32 ` Patrick McHardy
@ 2005-07-25 14:43 ` Eric Leblond
2005-07-25 19:33 ` Evgeniy Polyakov
2005-07-25 19:28 ` Evgeniy Polyakov
1 sibling, 1 reply; 26+ messages in thread
From: Eric Leblond @ 2005-07-25 14:43 UTC (permalink / raw)
To: Patrick McHardy
Cc: Evgeniy Polyakov, Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel, netdev

On Monday 25 July 2005 at 16:32 +0200, Patrick McHardy wrote:
> Evgeniy Polyakov wrote:
> > On Mon, Jul 25, 2005 at 02:02:10AM -0400, James Morris (jmorris@redhat.com) wrote:
> If I understand correctly it tries to workaround some netlink
> limitations (limited number of netlink families and multicast groups)
> by sending everything to userspace and demultiplexing it there.
> Same in the other direction, an additional layer on top of netlink
> does basically the same thing netlink already does. This looks like
> a step in the wrong direction to me, netlink should instead be fixed
> to support what is needed.

I totally agree with you, it could be great to fix netlink to support
multiple queues.
I like to be able to use projects like snort-inline or nufw together.
This will make Netfilter really stronger.
Furthermore, there's a repetition of filtering capabilities with such a
solution. Netfilter has to filter to send to netlink and this is the
same with the queue dispatcher. I think this introduces too much
complexity.

my 0.02$

BR,
--
Éric Leblond, eleblond@inl.fr
Phone: 01 44 89 46 40, Fax: 01 44 89 45 01
INL, http://www.inl.fr
* Re: Netlink connector
2005-07-25 14:43 ` Eric Leblond
@ 2005-07-25 19:33 ` Evgeniy Polyakov
2005-07-26 8:45 ` Harald Welte
0 siblings, 1 reply; 26+ messages in thread
From: Evgeniy Polyakov @ 2005-07-25 19:33 UTC (permalink / raw)
To: Eric Leblond
Cc: Patrick McHardy, Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel

On Mon, Jul 25, 2005 at 04:43:43PM +0200, Eric Leblond (eleblond@inl.fr) wrote:
> On Monday 25 July 2005 at 16:32 +0200, Patrick McHardy wrote:
> > Evgeniy Polyakov wrote:
> > > On Mon, Jul 25, 2005 at 02:02:10AM -0400, James Morris (jmorris@redhat.com) wrote:
> > If I understand correctly it tries to workaround some netlink
> > limitations (limited number of netlink families and multicast groups)
> > by sending everything to userspace and demultiplexing it there.
> > Same in the other direction, an additional layer on top of netlink
> > does basically the same thing netlink already does. This looks like
> > a step in the wrong direction to me, netlink should instead be fixed
> > to support what is needed.
>
> I totally agree with you, it could be great to fix netlink to support
> multiple queues.
> I like to be able to use projects like snort-inline or nufw together.
> This will make Netfilter really stronger.
> Furthermore, there's a repetition of filtering capabilities with such a
> solution. Netfilter has to filter to send to netlink and this is the
> same with the queue dispatcher. I think this introduces too much
> complexity.

Netlink is transport protocol - no need to add complexity into it,
it must be as simple as possible and thus extensible.
Multiple queues and filtering should be created on different layer,
like it is done for TCP/IP and other protocols.
I'm not advertising, but connector is exactly the place where it can be
implemented.

> my 0.02$
>
> BR,
> --
> Éric Leblond, eleblond@inl.fr
> Phone: 01 44 89 46 40, Fax: 01 44 89 45 01
> INL, http://www.inl.fr
>

--
Evgeniy Polyakov
* Re: Netlink connector
2005-07-25 19:33 ` Evgeniy Polyakov
@ 2005-07-26 8:45 ` Harald Welte
0 siblings, 0 replies; 26+ messages in thread
From: Harald Welte @ 2005-07-26 8:45 UTC (permalink / raw)
To: Evgeniy Polyakov
Cc: Eric Leblond, Patrick McHardy, Andrew Morton, netdev, netfilter-devel, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1005 bytes --]

On Mon, Jul 25, 2005 at 11:33:51PM +0400, Evgeniy Polyakov wrote:
> Netlink is transport protocol - no need to add complexity into it,
> it must be as simple as possible and thus extensible.

yes. but when you run into a serious addressing shortage (like the
internet does with ipv4), you develop something that provides more
addresses (such as ipv6). That's why support for more groups than 32
(per family) is something that should be put in the netlink protocol.

I totally agree that we need a higher-level api on top of that, in order
to hide the details of the networking stack for those not interested in
it.

--
- Harald Welte <laforge@netfilter.org> http://netfilter.org/
============================================================================
"Fragmentation is like classful addressing -- an interesting early
architectural error that shows how much experimentation was going
on while IP was being designed." -- Paul Vixie
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
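Editor's sketch (not part of the original mail): the "32 groups per family" limit follows from multicast groups being carried as a single 32-bit mask. The helper names below are hypothetical, and the 1-based numbering of groups (group n maps to bit n-1) is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Width of the group bitmask: a single 32-bit word. */
#define NL_GROUP_BITS 32

/* Map a 1-based group number to its bit; 0 means "cannot be expressed". */
static uint32_t group_to_mask(unsigned int group)
{
    if (group == 0 || group > NL_GROUP_BITS)
        return 0;
    return (uint32_t)1 << (group - 1);
}

/* Add a group to a subscription mask; -1 if the group does not fit. */
static int subscribe(uint32_t *mask, unsigned int group)
{
    uint32_t bit = group_to_mask(group);

    assert(mask != NULL);
    if (bit == 0)
        return -1;
    *mask |= bit;
    return 0;
}
```

Group 33 has no bit to live in, which is the addressing shortage described above.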
* Re: Netlink connector
2005-07-25 14:32 ` Patrick McHardy
2005-07-25 14:43 ` Eric Leblond
@ 2005-07-25 19:28 ` Evgeniy Polyakov
2005-07-25 23:46 ` Patrick McHardy
1 sibling, 1 reply; 26+ messages in thread
From: Evgeniy Polyakov @ 2005-07-25 19:28 UTC (permalink / raw)
To: Patrick McHardy
Cc: James Morris, David S. Miller, Harald Welte, netfilter-devel, linux-kernel, Andrew Morton, netdev

On Mon, Jul 25, 2005 at 04:32:32PM +0200, Patrick McHardy (kaber@trash.net) wrote:
> Evgeniy Polyakov wrote:
> >On Mon, Jul 25, 2005 at 02:02:10AM -0400, James Morris
> >(jmorris@redhat.com) wrote:
> >
> >>On Sun, 24 Jul 2005, David S. Miller wrote:
> >>
> >>>From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
> >>>Date: Sat, 23 Jul 2005 13:14:55 +0400
> >>>
> >>>
> >>>>Andrew has no objection against connector and it lives in -mm
> >>>
> >>>A patch sitting in -mm has zero significance.
> >
> >That is why I'm asking netdev@ people again...
>
> If I understand correctly it tries to workaround some netlink
> limitations (limited number of netlink families and multicast groups)
> by sending everything to userspace and demultiplexing it there.
> Same in the other direction, an additional layer on top of netlink
> does basically the same thing netlink already does. This looks like
> a step in the wrong direction to me, netlink should instead be fixed
> to support what is needed.

Not only it.
The main _first_ idea was to simplify userspace message handling as much
as possible.
In first releases I called it ioctl-ng - any module that want to
communicate with userspace in the way ioctl does,
requires skb allocation/freeing/handling.
Does RTC driver writer need to know what is the difference between
shared and cloned skb? Should kernel user of such message bus
have to know about skb at all?

With char device I only need to register my callback - with kernel
connector it is the same, but allows to use the whole power of netlink,
especially without nice ioctl features like different pointer size
in userspace and kernelspace.

And number of free netlink sockets is _very_ small, especially
if allocate new one for simple notifications, which can be easily done
using connector.

No need to allocate skb, no need to know who are those monsters in
header and so on.

And netlink can be extended to support it - netlink is a transport
protocol, it should not care about higher layer message handling,
connector instead will deliver message to the end user in a very
convenient form.

P.S. I've removed netdev@redhat.com - please do not add subscribers-only
private mail lists.

--
Evgeniy Polyakov
* Re: Netlink connector
2005-07-25 19:28 ` Evgeniy Polyakov
@ 2005-07-25 23:46 ` Patrick McHardy
2005-07-25 23:56 ` Thomas Graf
2005-07-26 4:45 ` Evgeniy Polyakov
0 siblings, 2 replies; 26+ messages in thread
From: Patrick McHardy @ 2005-07-25 23:46 UTC (permalink / raw)
To: Evgeniy Polyakov
Cc: James Morris, David S. Miller, Harald Welte, netfilter-devel, linux-kernel, Andrew Morton, netdev

Evgeniy Polyakov wrote:
> On Mon, Jul 25, 2005 at 04:32:32PM +0200, Patrick McHardy (kaber@trash.net) wrote:
>
>>If I understand correctly it tries to workaround some netlink
>>limitations (limited number of netlink families and multicast groups)
>>by sending everything to userspace and demultiplexing it there.
>>Same in the other direction, an additional layer on top of netlink
>>does basically the same thing netlink already does. This looks like
>>a step in the wrong direction to me, netlink should instead be fixed
>>to support what is needed.
>
> Not only it.
> The main _first_ idea was to simplify userspace message handling as much
> as possible.
> In first releases I called it ioctl-ng - any module that want to
> communicate with userspace in the way ioctl does,

Usually netlink is easily extendable by using nested TLVs. By hiding
this you basically remove this extensibility.

> requires skb allocation/freeing/handling.
> Does RTC driver writer need to know what is the difference between
> shared and cloned skb? Should kernel user of such message bus
> have to know about skb at all?

Netlink users don't have to care about shared or cloned skbs. I don't
think it's a big issue to use alloc_skb and then the usual netlink
macros. Thomas added a number of macros that simplify use a lot.

But my main objection is that it sends everything to userspace even
if no one is listening. This can't be used for things that generate
lots of events, and also will get problematic if the number of users
increases.

> With char device I only need to register my callback - with kernel
> connector it is the same, but allows to use the whole power of netlink,
> especially without nice ioctl features like different pointer size
> in userspace and kernelspace.

You still have to take care of mixed 64/32 bit environments, u64 fields
for example are differently aligned.

> And number of free netlink sockets is _very_ small, especially
> if allocate new one for simple notifications, which can be easily done
> using connector.

Then fix it so we can use more families and groups. I started some work
on this, but I'm not sure if I have time to complete it.

> And netlink can be extended to support it - netlink is a transport
> protocol, it should not care about higher layer message handling,
> connector instead will deliver message to the end user in a very
> convenient form.

You can still build this stuff on top, but the workarounds for netlink
limitations need to be fixed in netlink.

> P.S. I've removed netdev@redhat.com - please do not add subscribers-only
> private mail lists.

Wasn't me :)
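Editor's sketch (not part of the original mail): the mixed 64/32-bit concern can be made concrete with two hypothetical message structs. In the first, the offset of the 64-bit field and the total size depend on the ABI (for example offset 4 and size 12 on i386, offset 8 and size 16 on x86-64). In the second, an explicit pad field pins the layout. The struct and field names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout depends on the ABI: 'bytes' may sit at offset 4 or 8. */
struct stats_implicit {
    uint32_t id;
    uint64_t bytes;
};

/* Explicit padding keeps the 64-bit field at offset 8 on common ABIs. */
struct stats_explicit {
    uint32_t id;
    uint32_t pad;       /* reserved, always zero */
    uint64_t bytes;
};

/* Compile-time checks: the explicit layout is the same everywhere. */
static_assert(offsetof(struct stats_explicit, bytes) == 8,
              "bytes must start at offset 8");
static_assert(sizeof(struct stats_explicit) == 16,
              "struct must be 16 bytes with no hidden padding");
```

A 32-bit userspace talking to a 64-bit kernel would disagree about `struct stats_implicit` but agree about `struct stats_explicit`.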
* Re: Netlink connector
2005-07-25 23:46 ` Patrick McHardy
@ 2005-07-25 23:56 ` Thomas Graf
2005-07-26 0:16 ` Patrick McHardy
2005-07-26 4:45 ` Evgeniy Polyakov
1 sibling, 1 reply; 26+ messages in thread
From: Thomas Graf @ 2005-07-25 23:56 UTC (permalink / raw)
To: Patrick McHardy
Cc: Evgeniy Polyakov, Andrew Morton, Harald Welte, netdev, netfilter-devel, linux-kernel

* Patrick McHardy <42E579BC.8000701@trash.net> 2005-07-26 01:46
> Netlink users don't have to care about shared or cloned skbs. I don't
> think it's a big issue to use alloc_skb and then the usual netlink
> macros. Thomas added a number of macros that simplify use a lot.

Once I've finished the generic netlink attribute macros the usage
will be even simpler. I wrote down all the things I want to do
today in a park and I intend to write the code once I'm back
from my vacation.

> But my main objection is that it sends everything to userspace even
> if no one is listening. This can't be used for things that generate
> lots of events, and also will get problematic if the number of users
> increases.

My patches will include a new function netlink_nr_subscribers()
taking the socket and a mask of groups. I posted something similar
during an earlier connector discussion already.

> You still have to take care of mixed 64/32 bit environments, u64 fields
> for example are differently aligned.

My solution to this (in the same patchset) is that we never
dereference u64s but instead copy them.

> Then fix it so we can use more families and groups. I started some work
> on this, but I'm not sure if I have time to complete it.

Great, this is one of the remaining issues I haven't solved yet.
If you want me to take over just hand over your unfinished work
and I'll integrate it into my patchset.

I'm sorry to not being able to provide any code yet, it's one
of the first things I'll do once I'm back.
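Editor's sketch (not part of the original mail): the "count subscribers before sending" idea can be modelled in a few lines. This is not the `netlink_nr_subscribers()` function announced above, for which no code appears in the thread. The listener table, `nr_subscribers()` and `should_send()` are hypothetical names, and each listener is reduced to a 32-bit group mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical listener table: each entry is one socket's group bitmask. */
struct listeners {
    const uint32_t *masks;
    size_t count;
};

/* Count listeners subscribed to at least one group in 'groups'. */
static size_t nr_subscribers(const struct listeners *l, uint32_t groups)
{
    size_t i, n = 0;

    assert(l != NULL);
    for (i = 0; i < l->count; i++)
        if (l->masks[i] & groups)
            n++;
    return n;
}

/* Only pay for building an event when somebody would receive it. */
static int should_send(const struct listeners *l, uint32_t groups)
{
    return nr_subscribers(l, groups) > 0;
}
```

A sender would call `should_send()` before allocating and filling a message, which addresses the objection that events are generated even when no one is listening.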
* Re: Netlink connector
2005-07-25 23:56 ` Thomas Graf
@ 2005-07-26 0:16 ` Patrick McHardy
2005-07-26 0:30 ` Thomas Graf
0 siblings, 1 reply; 26+ messages in thread
From: Patrick McHardy @ 2005-07-26 0:16 UTC (permalink / raw)
To: Thomas Graf
Cc: Evgeniy Polyakov, Andrew Morton, Harald Welte, netdev,
netfilter-devel, linux-kernel

Thomas Graf wrote:
> * Patrick McHardy <42E579BC.8000701@trash.net> 2005-07-26 01:46
>
>>You still have to take care of mixed 64/32 bit environments, u64 fields
>>for example are differently aligned.
>
> My solution to this (in the same patchset) is that we never
> dereference u64s but instead copy them.

I don't understand. The problem is mainly u64s embedded in structures:
the structs have different sizes unless the u64 is 8 byte aligned
and the structure size is padded to a multiple of 8.

>>Then fix it so we can use more families and groups. I started some work
>>on this, but I'm not sure if I have time to complete it.
>
> Great, this is one of the remaining issues I haven't solved yet.
> If you want me to take over just hand over your unfinished work
> and I'll integrate it into my patchset.

I started working on it after the OLS party, so no postable code yet :)
The idea for more groups is basically to remove the fixed groups
bitmask from struct sockaddr_nl and use setsockopt to add/remove
multicast subscriptions. If we add the limitation that a packet
can only be multicast to a single group we can support an arbitrary
number of groups, otherwise we would still be limited by the size of
skb->cb. This limitation shouldn't be a problem, AFAIK nothing is
multicasting to multiple groups at once right now and the increased
number of groups will allow a better granularity anyway. The main
problem is keeping it backwards-compatible for current netlink users.
If this isn't possible we may need to call it netlink2.

Regards
Patrick

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Netlink connector
2005-07-26 0:16 ` Patrick McHardy
@ 2005-07-26 0:30 ` Thomas Graf
0 siblings, 0 replies; 26+ messages in thread
From: Thomas Graf @ 2005-07-26 0:30 UTC (permalink / raw)
To: Patrick McHardy
Cc: Jamal Hadi Salim, Evgeniy Polyakov, Andrew Morton, Harald Welte,
netdev, netfilter-devel, linux-kernel

* Patrick McHardy <42E580CF.4010800@trash.net> 2005-07-26 02:16
> Thomas Graf wrote:
> >* Patrick McHardy <42E579BC.8000701@trash.net> 2005-07-26 01:46
> >
> >>You still have to take care of mixed 64/32 bit environments, u64 fields
> >>for example are differently aligned.
> >
> >My solution to this (in the same patchset) is that we never
> >dereference u64s but instead copy them.
>
> I don't understand. The problem is mainly u64 embedded in structures,
> the structs have different sizes if the u64 is not 8 byte aligned
> and the structure size padded to a multiple of 8.

Like in gnet_stats, yes. I thought you meant usages like
*(u64 *) which we shouldn't do either.

> I started working on it after the OLS party, so no postable code yet :)
> The idea for more groups is basically to remove the fixed groups
> bitmask from struct sockaddr_nl and use setsockopt to add/remove
> multicast subscriptions. If we add the limitation that a packet
> can only be multicasted to a single group we can support an arbitrary
> number of groups, otherwise we would still be limited by size of
> skb->cb.

I was thinking of subscription messages over netlink itself for
the advantage that we could use it within the distributed netlink
protocol that has to come up sometime soon. Well, both ways are
ok I guess, the ease of distributed usage is my only argument.

> This limitation shouldn't be a problem, AFAIK nothing is
> multicasting to multiple groups at once right now and the increased
> number of groups will allow a better granularity anyway.

I'm not aware of any and I agree. We don't need n<->n subscriptions,
1<->n is perfectly fine as I see it.

> The main
> problem is keeping it backwards-compatible for current netlink users.
> If this isn't possible we may need to call it netlink2.

I think Jamal has a moral patent on the name netlink2 so be careful ;->
It should be possible to remain compatible, I don't see any
unresolvable issues right now.

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Netlink connector
2005-07-25 23:46 ` Patrick McHardy
2005-07-25 23:56 ` Thomas Graf
@ 2005-07-26 4:45 ` Evgeniy Polyakov
2005-07-26 4:56 ` Stephen Hemminger
2005-07-26 6:14 ` Thomas Graf
1 sibling, 2 replies; 26+ messages in thread
From: Evgeniy Polyakov @ 2005-07-26 4:45 UTC (permalink / raw)
To: Patrick McHardy
Cc: James Morris, David S. Miller, Harald Welte, netfilter-devel,
linux-kernel, Andrew Morton, netdev

On Tue, Jul 26, 2005 at 01:46:04AM +0200, Patrick McHardy (kaber@trash.net) wrote:
> Evgeniy Polyakov wrote:
> >On Mon, Jul 25, 2005 at 04:32:32PM +0200, Patrick McHardy
> >(kaber@trash.net) wrote:
> >
> >>If I understand correctly it tries to work around some netlink
> >>limitations (limited number of netlink families and multicast groups)
> >>by sending everything to userspace and demultiplexing it there.
> >>Same in the other direction, an additional layer on top of netlink
> >>does basically the same thing netlink already does. This looks like
> >>a step in the wrong direction to me, netlink should instead be fixed
> >>to support what is needed.
> >
> >Not only it.
> >The main _first_ idea was to simplify userspace message handling as much
> >as possible.
> >In first releases I called it ioctl-ng - any module that wants to
> >communicate with userspace in the way ioctl does,
>
> Usually netlink is easily extendable by using nested TLVs. By hiding
> this you basically remove this extensibility.

Current netlink is not extensible for _many_ different users.
It has only 32 sockets.

> >requires skb allocation/freeing/handling.
> >Does RTC driver writer need to know what is the difference between
> >shared and cloned skb? Should kernel user of such message bus
> >have to know about skb at all?
>
> Netlink users don't have to care about shared or cloned skbs. I don't
> think it's a big issue to use alloc_skb and then the usual netlink
> macros. Thomas added a number of macros that simplify use a lot.

A kernel user must also know the difference between unicast and
broadcast, how to dequeue the skb, how to free it and in what context.
ioctl users do not need to know how file_operations is bound to a file.

> But my main objection is that it sends everything to userspace even
> if no one is listening. This can't be used for things that generate
> lots of events, and also will get problematic if the number of users
> increases.

It is a problem for existing netlink - either check at bind time,
which could be done for connector, or at socket creation time.

Actually it is not even a problem, since the check is being done,
but only after allocation and message filling. Such a check can be
moved into cn_netlink_send() in connector, but different netlink users,
who prefer to use different sockets, must perform it themselves in each
place where an skb is allocated...

Connector is a solution for the current situation,
it can be deployed with few casualties.
Creating a new netlink2 socket for a device which wants to replace ioctl
control or broadcast its state is the wrong way.
Different sockets/flows do not allow easy flow control.

We have one pipe - ethernet, and many protocols inside this pipe
with different headers - it is the same here - netlink is such a pipe,
and with connector it can carry different protocols.

> >With char device I only need to register my callback - with kernel
> >connector it is the same, but allows to use the whole power of netlink,
> >especially without nice ioctl features like different pointer size
> >in userspace and kernelspace.
>
> You still have to take care of mixed 64/32 bit environments, u64 fields
> for example are differently aligned.

Connector has a size in its header - ioctl does not.

> >And number of free netlink sockets is _very_ small, especially
> >if allocate new one for simple notifications, which can be easily done
> >using connector.
>
> Then fix it so we can use more families and groups. I started some work
> on this, but I'm not sure if I have time to complete it.

It does not "fix" the "problem" of skb management knowledge, which I
described.
Netlink is a transport protocol, some general logic must be created on
top of it, like it is done in TCP/IP.

> >And netlink can be extended to support it - netlink is a transport
> >protocol, it should not care about higher layer message handling,
> >connector instead will deliver message to the end user in a very
> >convenient form.
>
> You can still build this stuff on top, but the workarounds for netlink
> limitations need to be fixed in netlink.

I would not call it a workaround, I think it is a management layer,
which allows:
1. Easy usage. Just register a callback and that is all. The callback
will be invoked each time a new message arrives. No need to
dequeue/free/anything.
2. Easy usage. Call one function for message delivery, which can
take care of nonexistent users, perform flow control, congestion control,
guarantee delivery and anything else.
3. Easily deployable - the current implementation is simple, and it
works with existing netlink.
4. It is a logical level on top of a transport protocol, it is UDP/IP
over ethernet :)

> >P.S. I've removed netdev@redhat.com - please do not add subscribers-only
> >private mail lists.
>
> Wasn't me :)

Yep :)

--
Evgeniy Polyakov

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Netlink connector
2005-07-26 4:45 ` Evgeniy Polyakov
@ 2005-07-26 4:56 ` Stephen Hemminger
2005-07-26 5:01 ` Evgeniy Polyakov
2005-07-26 6:14 ` Thomas Graf
1 sibling, 1 reply; 26+ messages in thread
From: Stephen Hemminger @ 2005-07-26 4:56 UTC (permalink / raw)
To: Evgeniy Polyakov
Cc: Patrick McHardy, James Morris, David S. Miller, Harald Welte,
netfilter-devel, linux-kernel, Andrew Morton, netdev

Evgeniy Polyakov wrote:
[...]
>I could not call it workaround, I think it is a management layer,
>which allows :
>1. easy usage. Just register a callback and that is all. Callback will
>be invoked each time new message arrives. No need to
>dequeue/free/anything.
>2. easy usage. Call one function for message delivering, which can
>care of nonexistent users, perform flow control, congestion control,
>guarantee delivery and any other.
>3. Easily deployable - current implementation is so simple, and it does
>work with existing netlink.
>4. It is logical level on top of transport protocol, it is UDP/IP over
>ethernet :)

If it is a transport, then it should be in the kernel. Otherwise, it
becomes painful for applications with multiple input sources. Think of
epoll/poll/select and threads, doing the demultiplexing in user space
would be a pain for applications and libraries.

The other way to go is to use something like dbus/hal and use a higher
level application oriented interface. The problem with that approach,
is it assumes every management app wants to drag in gnome..

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Netlink connector
2005-07-26 4:56 ` Stephen Hemminger
@ 2005-07-26 5:01 ` Evgeniy Polyakov
0 siblings, 0 replies; 26+ messages in thread
From: Evgeniy Polyakov @ 2005-07-26 5:01 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Patrick McHardy, James Morris, David S. Miller, Harald Welte,
netfilter-devel, linux-kernel, Andrew Morton, netdev

On Mon, Jul 25, 2005 at 09:56:56PM -0700, Stephen Hemminger (shemminger@osdl.org) wrote:
[...]
> If it is a transport, then it should be in the kernel. Otherwise, it
> becomes painful
> for applications with multiple input sources. Think of
> epoll/poll/select and threads,
> doing the demultiplexing in user space would be a pain for applications
> and libraries.

It _is_ in the kernel - multiplexing is done at send time,
userspace does not receive messages for different IDs.
Currently it is done using netlink groups, and I would like to change
that, but the connector layer itself will not change, so no application
will have to change - they bind now and will still only bind afterwards.
One socket, different groups.
Ok, right now an application bound to the -1 group will receive all
traffic, but I posted a proof-of-concept patch to remove such behaviour.

> The other way to go is to use something like dbus/hal and use a higher level
> application oriented interface. The problem with that approach, is it
> assumes
> every management app wants to drag in gnome..

No need to parse headers there.
When we read from a UDP socket, we do not get headers - connector users
do not read the netlink header, and it is possible to completely remove
even the connector header, although I would like to keep it - some kind
of HDRINCL option...

--
Evgeniy Polyakov

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Netlink connector
2005-07-26 4:45 ` Evgeniy Polyakov
2005-07-26 4:56 ` Stephen Hemminger
@ 2005-07-26 6:14 ` Thomas Graf
2005-07-26 6:31 ` Evgeniy Polyakov
1 sibling, 1 reply; 26+ messages in thread
From: Thomas Graf @ 2005-07-26 6:14 UTC (permalink / raw)
To: Evgeniy Polyakov
Cc: Patrick McHardy, Andrew Morton, Harald Welte, netdev,
netfilter-devel, linux-kernel

* Evgeniy Polyakov <20050726044547.GA32006@2ka.mipt.ru> 2005-07-26 08:45
> On Tue, Jul 26, 2005 at 01:46:04AM +0200, Patrick McHardy (kaber@trash.net) wrote:
> > Usually netlink is easily extendable by using nested TLVs. By hiding
> > this you basically remove this extensibility.
>
> Current netlink is not extensible for _many_ different users.

Patrick's key point was that by hiding some of the functionality
you remove a lot of the flexibility.

> It has only 32 sockets.

You mean MAX_LINKS? That is the current number of reserved
netlink protocols. The ethertaps are obsolete and can be
reused so we're currently using 16 out of 256 possible
protocols. If that is not enough there are ways to work
around this. However, I also see a need for a generic protocol
providing a simplified interface for small applications.
Nevertheless we should take the time and work things out on
the netlink level first, netlink has issues and we should not
work around them in an upper layer.

> > But my main objection is that it sends everything to userspace even
> > if no one is listening. This can't be used for things that generate
> > lots of events, and also will get problematic if the number of users
> > increases.
>
> It is a problem for existing netlink - either check in bind time,
> what could be done for connector, or in socket creation time.

No, I think you are misunderstanding something. As I said, we can
easily add a function netlink_nr_subscribers(sk, groups) so the
check can be done before starting to build the message. This is
no problem, it simply didn't make sense so far because netlink
event messages were mostly used for rare events.

> Actually it is not even a problem, since checking is being done,
> but after allocation and message filling, such check can be moved into
> cn_netlink_send() in connector, but different netlink users,
> who prefers to use different sockets, must perform it by itself in each
> place, where skb is allocated...

Sure, which is the right thing, it makes perfect sense to check
before starting the process of building an event and sending it.

> Connector is a solution for current situation,
> it can be deployed with few casualties.

The problem is that netlink is likely to change in order
to cope with some recent needs, e.g. ctnetlink but also other
current issues which need to be addressed. Therefore I suggest
to build connector on top of the updated netlink so we have
one thing less to worry about when thinking about compatibility.

> Creating a new netlink2 socket for device, which wants to replace ioctl
> controlling or broadcast it's state is a wrong way.

Slowly, we might need netlink2 _in case_ we cannot work things
out without breaking compatibility. This has nothing to do with
the connector, there are netlink users which have new needs such
as more groups, at least some of them need the flexibility of
netlink itself so we have to work things out for them.

> Different sockets/flows does not allow easy flow control.

I'm not sure what you mean.

> We have one pipe - ethernet, and many protocols inside this pipe
> with different headers - it is the same here - netlink is such a pipe,
> and with connector it allows to have different protocols in it.

At least parts of your connector are just a redundant implementation
of what netlink is already capable of doing. Sure, some of them
have issues but there is no reason to just build a new protocol on
top of another one if the protocol beneath has issues which can be
resolved.

> > You still have to take care of mixed 64/32 bit environments, u64 fields
> > for example are differently aligned.
>
> Connector has a size in it's header - ioctl does not.

You have exactly the same issues as netlink as soon as you transfer
structs, believe it or not.

> It does not "fix" the "problem" of skb management knowledge, which I
> described.

Yes ok, this is a different issue and as Patrick stated already
those have been mostly worked out by providing a new set of
macros. Except for a few leftovers, which will be addressed, there
is no need to call skb functions anymore. The reason the plain
skb interface was used is simply that the authors of most of the
netlink using code are in fact very familiar with the skb interface,
that's it.

> > You can still build this stuff on top, but the workarounds for netlink
> > limitations need to be fixed in netlink.
>
> I could not call it workaround, I think it is a management layer,
> which allows :

Listen, nobody wants to take away your baby. ;-> There are some
objections about things which should rather be fixed in the netlink
layer first, and the remaining part that is missing goes into the
connector. I see a lot of replicated netlink code in the connector
which is not necessary. I perfectly agree with you that we require
some form of simplified addressing and easier message handling
for simple applications but just building another layer on top
of netlink without respecting the capabilities of netlink itself
is not the way to go as I see it. For example, we'll probably add
a new group subscription mechanism to netlink which might perfectly
suit the needs of your connector.

> 1. easy usage. Just register a callback and that is all. Callback will
> be invoked each time new message arrives. No need to
> dequeue/free/anything.

Good point, also doable in netlink directly. Just get rid of the
usual family_rcv -> family_rcv_skb -> family_rcv_msg process and
do a callback registration interface instead. However, often the
processing of a message and the resulting ack must be done as an
atomic operation, e.g. rtnetlink.

> 2. easy usage. Call one function for message delivering, which can
> care of nonexistent users, perform flow control, congestion control,
> guarantee delivery and any other.

I don't understand what exactly you mean but netlink itself
is not reliable under memory pressure.

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Netlink connector
2005-07-26 6:14 ` Thomas Graf
@ 2005-07-26 6:31 ` Evgeniy Polyakov
0 siblings, 0 replies; 26+ messages in thread
From: Evgeniy Polyakov @ 2005-07-26 6:31 UTC (permalink / raw)
To: Thomas Graf
Cc: Patrick McHardy, Andrew Morton, Harald Welte, netdev,
netfilter-devel, linux-kernel
On Tue, Jul 26, 2005 at 08:14:47AM +0200, Thomas Graf (tgraf@suug.ch) wrote:
> * Evgeniy Polyakov <20050726044547.GA32006@2ka.mipt.ru> 2005-07-26 08:45
> > On Tue, Jul 26, 2005 at 01:46:04AM +0200, Patrick McHardy (kaber@trash.net) wrote:
> > > Usually netlink is easily extendable by using nested TLVs. By hiding
> > > this you basically remove this extensibility.
> >
> > Current netlink is not extensible for _many_ different users.
>
> Patrick's key point was that by hiding some of the functionality
> you remove a lot of the flexibility.
>
> > It has only 32 sockets.
>
> You mean MAX_LINKS? That is the current number of reserved
> netlink protocols. The ethertaps are obsolete and can be
> reused so we're currently using 16 out of 256 possible
> protocols. If that is not enough there are ways to work
> around this. However, I also see a need for a generic protocol
> providing a simplified interface for small applications.
> Nevertheless we should take the time and work things out on
> the netlink level first, netlink has issues and we should not
> work around them in an upper layer.
>
> > > But my main objection is that it sends everything to userspace even
> > > if no one is listening. This can't be used for things that generate
> > > lots of events, and also will get problematic if the number of users
> > > increases.
> >
> > It is a problem for existing netlink - either check in bind time,
> > what could be done for connector, or in socket creation time.
>
> No, I think you are misunderstanding something. As I said, we can
> easily add a function netlink_nr_subscribers(sk, groups) so the
> check can be done before starting to build the message. This is
> no problem, it simply didn't make sense so far because netlink
> event messages were mostly used for rare events.

Yep.

> > Actually it is not even a problem, since checking is being done,
> > but after allocation and message filling, such check can be moved into
> > cn_netlink_send() in connector, but different netlink users,
> > who prefer to use different sockets, must perform it by themselves in
> > each place, where skb is allocated...
>
> Sure, which is the right thing, it makes perfect sense to check
> before starting the process of building an event and sending it.
>
> > Connector is a solution for current situation,
> > it can be deployed with few casualties.
>
> The problem is that netlink is likely to change in order
> to cope with some recent needs, e.g. ctnetlink but also other
> current issues which need to be addressed. Therefore I suggest
> to build connector on top of the updated netlink so we have
> one thing less to worry about when thinking about compatibility.
>
> > Creating a new netlink2 socket for device, which wants to replace ioctl
> > controlling or broadcast its state is a wrong way.
>
> Slowly, we might need netlink2 _in case_ we cannot work things
> out without breaking compatibility. This has nothing to do with
> the connector, there are netlink users which have new needs such
> as more groups, at least some of them need the flexibility of
> netlink itself so we have to work things out for them.
>
> > Different sockets/flows does not allow easy flow control.
>
> I'm not sure what you mean.

Consider socket overrun - message will be dropped, using special flags
in connector [its size field was selected to be 4 bytes, and thus has
big reserve] this subsystem can requeue message later after timeout or
something similar...

> > We have one pipe - ethernet, and many protocols inside this pipe
> > with different headers - it is the same here - netlink is such a pipe,
> > and with connector it allows to have different protocols in it.
>
> At least parts of your connector are just a redundant implementation
> of what netlink is already capable of doing. Sure, some of them
> have issues but there is no reason to just build a new protocol on
> top of another one if the protocol beneath has issues which can be
> resolved.
>
> > > You still have to take care of mixed 64/32 bit environments, u64 fields
> > > for example are differently aligned.
> >
> > Connector has a size in its header - ioctl does not.
>
> You have exactly the same issues as netlink as soon as you transfer
> structs, believe it or not.
>
> > It does not "fix" the "problem" of skb management knowledge, which I
> > described.
>
> Yes ok, this is a different issue and as Patrick stated already
> those have been mostly worked out by providing a new set of
> macros. Except for a few leftovers, which will be addressed, there
> is no need to call skb functions anymore. The reason the plain
> skb interface was used is simply that the authors of most of the
> netlink using code are in fact very familiar with the skb interface,
> that's it.

I saw your changes - they are very useful, but _only_ for sending part.
Kernel receiver still needs dequeuing, freeing and NLKMSG macros.
In first netlink days it also needed skb_recv_msg() or something
similar...

> > > You can still build this stuff on top, but the workarounds for netlink
> > > limitations need to be fixed in netlink.
> >
> > I could not call it workaround, I think it is a management layer,
> > which allows :
>
> Listen, nobody wants to take away your baby. ;-> There are some

Yeah :)

> objections of things which would rather be fixed in the netlink
> layer first and the remaining part that is missing goes into the
> connector. I see a lot of replicated netlink code in the connector
> which is not necessary. I perfectly agree with you that we require
> some form of simplified addressing and easier message handling
> for simple applications but just building another layer on top
> of netlink without respecting the capabilities of netlink itself
> is not the way to go as I see it. For example, we'll probably add
> a new group subscription mechanism to netlink which might perfectly
> suit the needs of your connector.

That is why I raise this question again and again to see, what ideas
should be moved from connector into netlink and vice versa... :)

> > 1. easy usage. Just register a callback and that is all. Callback will
> > be invoked each time new message arrives. No need to
> > dequeue/free/anything.
>
> Good point, also doable in netlink directly. Just get rid of the
> usual family_rcv -> family_rcv_skb -> family_rcv_msg process and
> do a callback registration interface instead. However, often the
> processing of a message and the resulting ack must be done as an
> atomic operation, e.g. rtnetlink.

It is also better to move it into a workqueue - just to be sure users
will not do some wrong things...

> > 2. easy usage. Call one function for message delivering, which can
> > care of nonexistent users, perform flow control, congestion control,
> > guarantee delivery and any other.
>
> I don't understand what exactly you mean but netlink itself
> is not reliable under memory pressure.

Connector has cn_netlink_send() which is a wrapper on top of skb
allocation, queuing and so on.
By flow/congestion control I mean here, that this function can check
whether a remote peer exists, requeue the message if a socket overrun is
caught, guarantee that no OOM condition was caught, and requeue if it
was the case and so on.

--
	Evgeniy Polyakov
^ permalink raw reply [flat|nested] 26+ messages in thread
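The two behaviours debated in the message above (checking for listeners before building an event, and requeueing on socket overrun instead of dropping) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `demo_chan`, `demo_send()` and `demo_receive_one()` are hypothetical names invented here, the queue capacities are arbitrary, and nothing below is the kernel's netlink or connector code. In particular it does not reproduce `cn_netlink_send()` or the proposed `netlink_nr_subscribers()`; it only models the order of the checks.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define RX_CAP    2   /* hypothetical receiver queue capacity (models the socket buffer) */
#define RETRY_CAP 8   /* hypothetical capacity of the sender-side retry queue */

enum send_result { SENT, SKIPPED_NO_LISTENER, REQUEUED, DROPPED };

/* Hypothetical stand-in for one netlink-like channel. */
struct demo_chan {
    uint32_t listening_groups; /* bit g set: group g has at least one listener */
    int rx_used;               /* messages currently in the receiver's queue */
    int retry_used;            /* messages parked for a later retry */
    int built;                 /* how many messages were actually formatted */
};

enum send_result demo_send(struct demo_chan *ch, uint32_t groups, int value)
{
    char msg[32];

    assert(ch != NULL);

    /* Check for listeners first, so no work is done for events nobody wants. */
    if ((ch->listening_groups & groups) == 0)
        return SKIPPED_NO_LISTENER;

    /* Only now pay the cost of building the message. */
    snprintf(msg, sizeof(msg), "event %d", value);
    ch->built++;

    if (ch->rx_used < RX_CAP) {        /* room in the receiver queue */
        ch->rx_used++;
        return SENT;
    }
    if (ch->retry_used < RETRY_CAP) {  /* overrun: park it instead of dropping */
        ch->retry_used++;
        return REQUEUED;
    }
    return DROPPED;                    /* retry queue is full as well */
}

/* The receiver consumes one message; one parked message then moves in, if any. */
void demo_receive_one(struct demo_chan *ch)
{
    assert(ch != NULL);

    if (ch->rx_used > 0)
        ch->rx_used--;
    if (ch->retry_used > 0 && ch->rx_used < RX_CAP) {
        ch->retry_used--;
        ch->rx_used++;
    }
}
```

With a listener on group 3 only, a call to `demo_send()` for group 5 returns `SKIPPED_NO_LISTENER` without formatting anything, while a third send to group 3 returns `REQUEUED` once the two-slot receiver queue is full. A real implementation would also need locking and a timer or workqueue to drive the retry, which this model leaves out.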
* Re: Netlink connector
2005-07-25 6:02 ` Netlink connector James Morris
2005-07-25 7:06 ` Evgeniy Polyakov
@ 2005-07-26 8:42 ` Harald Welte
2005-07-26 9:01 ` Evgeniy Polyakov
1 sibling, 1 reply; 26+ messages in thread
From: Harald Welte @ 2005-07-26 8:42 UTC (permalink / raw)
To: James Morris
Cc: David S. Miller, johnpol, netfilter-devel, linux-kernel,
Andrew Morton, netdev
[-- Attachment #1: Type: text/plain, Size: 1365 bytes --]
On Mon, Jul 25, 2005 at 02:02:10AM -0400, James Morris wrote:
> On Sun, 24 Jul 2005, David S. Miller wrote:
> >From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
> >Date: Sat, 23 Jul 2005 13:14:55 +0400
> >>Andrew has no objection against connector and it lives in -mm
> >A patch sitting in -mm has zero significance.
>
> The significance I think is that Andrew is trying to gently encourage some
> further progress in the area.

Patrick McHardy is currently working on some ideas on how to extend
netlink.

The fundamental problem that the connector is trying to solve:

1) provide more 'groups' (to transport more different kinds of events)
2) provide an abstract API for other kernel code, so it doesn't have to
   know anything about skb's or networking.

IMHO issue number '1' should (and can) be addressed within netlink. Wait
for Patrick's work on this to show up on netdev. We can then think
whether the connector API (or something similar) can be put on top of it.

--
- Harald Welte <laforge@netfilter.org> http://netfilter.org/
============================================================================
"Fragmentation is like classful addressing -- an interesting early
architectural error that shows how much experimentation was going
on while IP was being designed." -- Paul Vixie
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
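Point 2 in the message above, an abstract API that keeps callers away from skb handling, is essentially a callback registry: a subsystem registers a handler under an id and only ever sees the payload. The following is a minimal user-space sketch of that shape, assuming a small fixed table of ids. `demo_register()`, `demo_deliver()` and `demo_count_bytes()` are hypothetical names made up for this illustration and are not the connector's actual interface.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_IDS 8  /* hypothetical number of message ids */

/* A handler receives only the payload; buffer management stays in the layer. */
typedef void (*demo_cb)(const void *payload, size_t len, void *ctx);

struct demo_slot {
    demo_cb cb;
    void *ctx;
};

static struct demo_slot demo_table[DEMO_MAX_IDS];

/* Register a handler for one id; fails if the id is invalid or already taken. */
int demo_register(unsigned id, demo_cb cb, void *ctx)
{
    if (id >= DEMO_MAX_IDS || cb == NULL || demo_table[id].cb != NULL)
        return -1;
    demo_table[id].cb = cb;
    demo_table[id].ctx = ctx;
    return 0;
}

/* Called by the transport layer when a message for `id` has arrived. */
int demo_deliver(unsigned id, const void *payload, size_t len)
{
    assert(len == 0 || payload != NULL);

    if (id >= DEMO_MAX_IDS || demo_table[id].cb == NULL)
        return -1;  /* nobody registered: the caller may drop or log */
    demo_table[id].cb(payload, len, demo_table[id].ctx);
    return 0;
}

/* Example handler: adds the payload length to a counter passed as context. */
void demo_count_bytes(const void *payload, size_t len, void *ctx)
{
    (void)payload;
    *(size_t *)ctx += len;
}
```

A caller would register `demo_count_bytes` once with a pointer to its own counter and never dequeue or free anything itself. Whether such a layer belongs on top of netlink or inside it is exactly what the thread leaves open.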
* Re: Netlink connector
2005-07-26 8:42 ` Harald Welte
@ 2005-07-26 9:01 ` Evgeniy Polyakov
0 siblings, 0 replies; 26+ messages in thread
From: Evgeniy Polyakov @ 2005-07-26 9:01 UTC (permalink / raw)
To: Harald Welte, James Morris, David S. Miller, netfilter-devel,
linux-kernel, Andrew Morton, netdev
On Tue, Jul 26, 2005 at 04:42:14AM -0400, Harald Welte (laforge@netfilter.org) wrote:
> On Mon, Jul 25, 2005 at 02:02:10AM -0400, James Morris wrote:
> > On Sun, 24 Jul 2005, David S. Miller wrote:
> > >From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
> > >Date: Sat, 23 Jul 2005 13:14:55 +0400
> > >>Andrew has no objection against connector and it lives in -mm
> > >A patch sitting in -mm has zero significance.
> >
> > The significance I think is that Andrew is trying to gently encourage some
> > further progress in the area.
>
> Patrick McHardy is currently working on some ideas on how to extend
> netlink.
>
> The fundamental problem that the connector is trying to solve:
>
> 1) provide more 'groups' (to transport more different kinds of events)
> 2) provide an abstract API for other kernel code, so it doesn't have to
>    know anything about skb's or networking.
>
> IMHO issue number '1' should (and can) be addressed within netlink. Wait
> for Patrick's work on this to show up on netdev. We can then think
> whether the connector API (or something similar) can be put on top of it.

Fair enough.
Let's do it this way.

> --
> - Harald Welte <laforge@netfilter.org> http://netfilter.org/
> ============================================================================
> "Fragmentation is like classful addressing -- an interesting early
> architectural error that shows how much experimentation was going
> on while IP was being designed." -- Paul Vixie

--
	Evgeniy Polyakov
^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2005-07-26 12:58 UTC | newest]
Thread overview: 26+ messages
2005-07-23 12:54 [PATCH] 1 Wire drivers illegally overload NETLINK_NFLOG Harald Welte
2005-07-23 3:05 ` YOSHIFUJI Hideaki / 吉藤英明
2005-07-23 13:33 ` Harald Welte
2005-07-25 2:09 ` David S. Miller
2005-07-25 2:15 ` David S. Miller
2005-07-26 9:48 ` Harald Welte
2005-07-23 9:14 ` Evgeniy Polyakov
2005-07-25 2:17 ` David S. Miller
2005-07-25 6:02 ` Netlink connector James Morris
2005-07-25 7:06 ` Evgeniy Polyakov
2005-07-25 14:32 ` Patrick McHardy
2005-07-25 14:43 ` Eric Leblond
2005-07-25 19:33 ` Evgeniy Polyakov
2005-07-26 8:45 ` Harald Welte
2005-07-25 19:28 ` Evgeniy Polyakov
2005-07-25 23:46 ` Patrick McHardy
2005-07-25 23:56 ` Thomas Graf
2005-07-26 0:16 ` Patrick McHardy
2005-07-26 0:30 ` Thomas Graf
2005-07-26 4:45 ` Evgeniy Polyakov
2005-07-26 4:56 ` Stephen Hemminger
2005-07-26 5:01 ` Evgeniy Polyakov
2005-07-26 6:14 ` Thomas Graf
2005-07-26 6:31 ` Evgeniy Polyakov
2005-07-26 8:42 ` Harald Welte
2005-07-26 9:01 ` Evgeniy Polyakov