netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] NETLINK_UESTABLISHED notifier event
@ 2005-04-05 19:54 Dmitry Yusupov
  2005-04-05 21:39 ` Jean-Mickael Guerin
  2005-04-06  2:45 ` Herbert Xu
  0 siblings, 2 replies; 22+ messages in thread
From: Dmitry Yusupov @ 2005-04-05 19:54 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller

[-- Attachment #1: Type: text/plain, Size: 338 bytes --]

This patch provides:

* new event for unicast connections NETLINK_UESTABLISHED.

* netlink alloc_skb() now uses sk_allocation instead of hard-coded
GFP_KERNEL

* since netlink event described via proto and pid,
netlink_getsockbypid() is exported, so netlink user can identify socket.

Signed-off-by: Dmitry Yusupov <dmitry_yus@yahoo.com>

[-- Attachment #2: netlink_uestablish_event.patch --]
[-- Type: text/x-patch, Size: 3418 bytes --]

--- linux-2.6.11-um/net/netlink/af_netlink.c	2005-04-05 12:40:33.000000000 -0700
+++ linux-2.6.11.orig/net/netlink/af_netlink.c	2005-03-01 23:38:09.000000000 -0800
@@ -467,14 +467,8 @@
 			return err;
 	}
 
-	if (!nladdr->nl_groups && !nlk->groups) {
-		struct netlink_notify n = {
-						.protocol = sk->sk_protocol,
-						.pid = nladdr->nl_pid,
-					  };
-		notifier_call_chain(&netlink_chain, NETLINK_UESTABLISHED, &n);
+	if (!nladdr->nl_groups && !nlk->groups)
 		return 0;
-	}
 
 	netlink_table_grab();
 	if (nlk->groups && !nladdr->nl_groups)
@@ -510,6 +504,7 @@
 
 	if (!nlk->pid)
 		err = netlink_autobind(sock);
+
 	if (err == 0) {
 		sk->sk_state	= NETLINK_CONNECTED;
 		nlk->dst_pid 	= nladdr->nl_pid;
@@ -547,7 +542,7 @@
 	}
 }
 
-struct sock *netlink_getsockbypid(struct sock *ssk, u32 pid)
+static struct sock *netlink_getsockbypid(struct sock *ssk, u32 pid)
 {
 	int protocol = ssk->sk_protocol;
 	struct sock *sock;
@@ -925,7 +920,7 @@
 	if (len > sk->sk_sndbuf - 32)
 		goto out;
 	err = -ENOBUFS;
-	skb = alloc_skb(len, sk->sk_allocation);
+	skb = alloc_skb(len, GFP_KERNEL);
 	if (skb==NULL)
 		goto out;
 
@@ -1184,7 +1179,7 @@
 	else
 		size = NLMSG_SPACE(4 + NLMSG_ALIGN(nlh->nlmsg_len));
 
-	skb = alloc_skb(size, in_skb->sk->sk_allocation);
+	skb = alloc_skb(size, GFP_KERNEL);
 	if (!skb) {
 		struct sock *sk;
 
@@ -1470,5 +1465,4 @@
 EXPORT_SYMBOL(netlink_set_nonroot);
 EXPORT_SYMBOL(netlink_unicast);
 EXPORT_SYMBOL(netlink_unregister_notifier);
-EXPORT_SYMBOL(netlink_getsockbypid);
 
--- linux-2.6.11-um/include/linux/notifier.h	2005-04-05 12:30:10.000000000 -0700
+++ linux-2.6.11.orig/include/linux/notifier.h	2005-03-01 23:37:48.000000000 -0800
@@ -62,15 +62,14 @@
 #define SYS_HALT	0x0002	/* Notify of system halt */
 #define SYS_POWER_OFF	0x0003	/* Notify of system power off */
 
-#define NETLINK_UESTABLISHED	0x0001	/* Unicast netlink socket established */
-#define NETLINK_URELEASE	0x0002	/* Unicast netlink socket released */
+#define NETLINK_URELEASE	0x0001	/* Unicast netlink socket released */
 
-#define CPU_ONLINE		0x0003 /* CPU (unsigned)v is up */
-#define CPU_UP_PREPARE		0x0004 /* CPU (unsigned)v coming up */
-#define CPU_UP_CANCELED		0x0005 /* CPU (unsigned)v NOT coming up */
-#define CPU_DOWN_PREPARE	0x0006 /* CPU (unsigned)v going down */
-#define CPU_DOWN_FAILED		0x0007 /* CPU (unsigned)v NOT going down */
-#define CPU_DEAD		0x0008 /* CPU (unsigned)v dead */
+#define CPU_ONLINE		0x0002 /* CPU (unsigned)v is up */
+#define CPU_UP_PREPARE		0x0003 /* CPU (unsigned)v coming up */
+#define CPU_UP_CANCELED		0x0004 /* CPU (unsigned)v NOT coming up */
+#define CPU_DOWN_PREPARE	0x0005 /* CPU (unsigned)v going down */
+#define CPU_DOWN_FAILED		0x0006 /* CPU (unsigned)v NOT going down */
+#define CPU_DEAD		0x0007 /* CPU (unsigned)v dead */
 
 #endif /* __KERNEL__ */
 #endif /* _LINUX_NOTIFIER_H */
--- linux-2.6.11-um/include/linux/netlink.h	2005-04-05 11:35:15.000000000 -0700
+++ linux-2.6.11.orig/include/linux/netlink.h	2005-03-01 23:38:25.000000000 -0800
@@ -124,7 +124,6 @@
 extern void netlink_set_err(struct sock *ssk, __u32 pid, __u32 group, int code);
 extern int netlink_register_notifier(struct notifier_block *nb);
 extern int netlink_unregister_notifier(struct notifier_block *nb);
-extern struct sock *netlink_getsockbypid(struct sock *ssk, u32 pid);
 
 /* finegrained unicast helpers: */
 struct sock *netlink_getsockbyfilp(struct file *filp);

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-05 19:54 [PATCH] NETLINK_UESTABLISHED notifier event Dmitry Yusupov
@ 2005-04-05 21:39 ` Jean-Mickael Guerin
  2005-04-05 23:53   ` Dmitry Yusupov
  2005-04-06  0:13   ` Dmitry Yusupov
  2005-04-06  2:45 ` Herbert Xu
  1 sibling, 2 replies; 22+ messages in thread
From: Jean-Mickael Guerin @ 2005-04-05 21:39 UTC (permalink / raw)
  To: Dmitry Yusupov; +Cc: netdev

Dmitry Yusupov wrote:
> This patch provides:
> 
> * new event for unicast connections NETLINK_UESTABLISHED.

Which kind of application takes benefit of this event ?

btw your diff is reversed ;-)

Jean-Mickael

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-05 21:39 ` Jean-Mickael Guerin
@ 2005-04-05 23:53   ` Dmitry Yusupov
  2005-04-06  0:13   ` Dmitry Yusupov
  1 sibling, 0 replies; 22+ messages in thread
From: Dmitry Yusupov @ 2005-04-05 23:53 UTC (permalink / raw)
  To: Jean-Mickael Guerin; +Cc: netdev

On Tue, 2005-04-05 at 23:39 +0200, Jean-Mickael Guerin wrote:
> Dmitry Yusupov wrote:
> > This patch provides:
> > 
> > * new event for unicast connections NETLINK_UESTABLISHED.
> 
> Which kind of application takes benefit of this event ?

we have NETLINK_URELEASE already where netlink user usually free
associated resources. It is nice to have NETLINK_UESTABLISHED to
implement reverse needs. For example, netlink user wants to allocate
associated resources or to monitor new connections from user-space
before it actually gets first interrupt from it.

> btw your diff is reversed ;-)

oh, well, I was running out of time. still applicable... :)

> Jean-Mickael
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-05 21:39 ` Jean-Mickael Guerin
  2005-04-05 23:53   ` Dmitry Yusupov
@ 2005-04-06  0:13   ` Dmitry Yusupov
  1 sibling, 0 replies; 22+ messages in thread
From: Dmitry Yusupov @ 2005-04-06  0:13 UTC (permalink / raw)
  To: Jean-Mickael Guerin; +Cc: netdev

[-- Attachment #1: Type: text/plain, Size: 304 bytes --]

resend.

On Tue, 2005-04-05 at 23:39 +0200, Jean-Mickael Guerin wrote:
> Dmitry Yusupov wrote:
> > This patch provides:
> > 
> > * new event for unicast connections NETLINK_UESTABLISHED.
> 
> Which kind of application takes benefit of this event ?
> 
> btw your diff is reversed ;-)
> 
> Jean-Mickael
> 

[-- Attachment #2: netlink_uestablish_event.patch --]
[-- Type: text/x-patch, Size: 3418 bytes --]

--- linux-2.6.11.orig/net/netlink/af_netlink.c	2005-03-01 23:38:09.000000000 -0800
+++ linux-2.6.11-um/net/netlink/af_netlink.c	2005-04-05 12:40:33.000000000 -0700
@@ -467,8 +467,14 @@
 			return err;
 	}
 
-	if (!nladdr->nl_groups && !nlk->groups)
+	if (!nladdr->nl_groups && !nlk->groups) {
+		struct netlink_notify n = {
+						.protocol = sk->sk_protocol,
+						.pid = nladdr->nl_pid,
+					  };
+		notifier_call_chain(&netlink_chain, NETLINK_UESTABLISHED, &n);
 		return 0;
+	}
 
 	netlink_table_grab();
 	if (nlk->groups && !nladdr->nl_groups)
@@ -504,7 +510,6 @@
 
 	if (!nlk->pid)
 		err = netlink_autobind(sock);
-
 	if (err == 0) {
 		sk->sk_state	= NETLINK_CONNECTED;
 		nlk->dst_pid 	= nladdr->nl_pid;
@@ -542,7 +547,7 @@
 	}
 }
 
-static struct sock *netlink_getsockbypid(struct sock *ssk, u32 pid)
+struct sock *netlink_getsockbypid(struct sock *ssk, u32 pid)
 {
 	int protocol = ssk->sk_protocol;
 	struct sock *sock;
@@ -920,7 +925,7 @@
 	if (len > sk->sk_sndbuf - 32)
 		goto out;
 	err = -ENOBUFS;
-	skb = alloc_skb(len, GFP_KERNEL);
+	skb = alloc_skb(len, sk->sk_allocation);
 	if (skb==NULL)
 		goto out;
 
@@ -1179,7 +1184,7 @@
 	else
 		size = NLMSG_SPACE(4 + NLMSG_ALIGN(nlh->nlmsg_len));
 
-	skb = alloc_skb(size, GFP_KERNEL);
+	skb = alloc_skb(size, in_skb->sk->sk_allocation);
 	if (!skb) {
 		struct sock *sk;
 
@@ -1465,4 +1470,5 @@
 EXPORT_SYMBOL(netlink_set_nonroot);
 EXPORT_SYMBOL(netlink_unicast);
 EXPORT_SYMBOL(netlink_unregister_notifier);
+EXPORT_SYMBOL(netlink_getsockbypid);
 
--- linux-2.6.11.orig/include/linux/notifier.h	2005-03-01 23:37:48.000000000 -0800
+++ linux-2.6.11-um/include/linux/notifier.h	2005-04-05 12:30:10.000000000 -0700
@@ -62,14 +62,15 @@
 #define SYS_HALT	0x0002	/* Notify of system halt */
 #define SYS_POWER_OFF	0x0003	/* Notify of system power off */
 
-#define NETLINK_URELEASE	0x0001	/* Unicast netlink socket released */
+#define NETLINK_UESTABLISHED	0x0001	/* Unicast netlink socket established */
+#define NETLINK_URELEASE	0x0002	/* Unicast netlink socket released */
 
-#define CPU_ONLINE		0x0002 /* CPU (unsigned)v is up */
-#define CPU_UP_PREPARE		0x0003 /* CPU (unsigned)v coming up */
-#define CPU_UP_CANCELED		0x0004 /* CPU (unsigned)v NOT coming up */
-#define CPU_DOWN_PREPARE	0x0005 /* CPU (unsigned)v going down */
-#define CPU_DOWN_FAILED		0x0006 /* CPU (unsigned)v NOT going down */
-#define CPU_DEAD		0x0007 /* CPU (unsigned)v dead */
+#define CPU_ONLINE		0x0003 /* CPU (unsigned)v is up */
+#define CPU_UP_PREPARE		0x0004 /* CPU (unsigned)v coming up */
+#define CPU_UP_CANCELED		0x0005 /* CPU (unsigned)v NOT coming up */
+#define CPU_DOWN_PREPARE	0x0006 /* CPU (unsigned)v going down */
+#define CPU_DOWN_FAILED		0x0007 /* CPU (unsigned)v NOT going down */
+#define CPU_DEAD		0x0008 /* CPU (unsigned)v dead */
 
 #endif /* __KERNEL__ */
 #endif /* _LINUX_NOTIFIER_H */
--- linux-2.6.11.orig/include/linux/netlink.h	2005-03-01 23:38:25.000000000 -0800
+++ linux-2.6.11-um/include/linux/netlink.h	2005-04-05 11:35:15.000000000 -0700
@@ -124,6 +124,7 @@
 extern void netlink_set_err(struct sock *ssk, __u32 pid, __u32 group, int code);
 extern int netlink_register_notifier(struct notifier_block *nb);
 extern int netlink_unregister_notifier(struct notifier_block *nb);
+extern struct sock *netlink_getsockbypid(struct sock *ssk, u32 pid);
 
 /* finegrained unicast helpers: */
 struct sock *netlink_getsockbyfilp(struct file *filp);

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-05 19:54 [PATCH] NETLINK_UESTABLISHED notifier event Dmitry Yusupov
  2005-04-05 21:39 ` Jean-Mickael Guerin
@ 2005-04-06  2:45 ` Herbert Xu
  2005-04-06  5:55   ` Dmitry Yusupov
  2005-04-06 16:23   ` Mike Christie
  1 sibling, 2 replies; 22+ messages in thread
From: Herbert Xu @ 2005-04-06  2:45 UTC (permalink / raw)
  To: Dmitry Yusupov; +Cc: netdev, davem

Dmitry Yusupov <dmitry_yus@yahoo.com> wrote:
> 
> * new event for unicast connections NETLINK_UESTABLISHED.

Huh? In the patch you're actually sending the notification
when the socket stops listening to multicast traffic.

Please document why we need this in greater detail too.

> * netlink alloc_skb() now uses sk_allocation instead of hard-coded
> GFP_KERNEL

Why? We never set it to anything else for netlink.

> * since netlink event described via proto and pid,
> netlink_getsockbypid() is exported, so netlink user can identify socket.

Please submit the users for kernel inclusion first.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-06  2:45 ` Herbert Xu
@ 2005-04-06  5:55   ` Dmitry Yusupov
  2005-04-06  6:03     ` Herbert Xu
  2005-04-06 16:23   ` Mike Christie
  1 sibling, 1 reply; 22+ messages in thread
From: Dmitry Yusupov @ 2005-04-06  5:55 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, davem

On Wed, 2005-04-06 at 12:45 +1000, Herbert Xu wrote:
> Dmitry Yusupov <dmitry_yus@yahoo.com> wrote:
> > 
> > * new event for unicast connections NETLINK_UESTABLISHED.
> 
> Huh? In the patch you're actually sending the notification
> when the socket stops listening to multicast traffic.

it could be that I just screwed up everything as usual and my
application just works as expected for some mythical reasons 8)
but on a serious note, it is sends event from netlink_bind() context for
non-multicast connections only. may be I messed up something, so please
correct me.

> Please document why we need this in greater detail too.

main reason for this is to have clear way to notify netlink user that
new socket created and bound.

> > * netlink alloc_skb() now uses sk_allocation instead of hard-coded
> > GFP_KERNEL
> 
> Why? We never set it to anything else for netlink.

one reason is for consistency with sock interface. sk_allocation is
equal to GFP_KERNEL by default, so nothing changed. but. in some cases
application might require non-blocking kmalloc behavior. one real life
example is networking block device used for swap partition. this way any
GFP_KERENL allocation on recovery path might lead to deadlock condition.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-06  5:55   ` Dmitry Yusupov
@ 2005-04-06  6:03     ` Herbert Xu
  2005-04-06 15:16       ` Dmitry Yusupov
  0 siblings, 1 reply; 22+ messages in thread
From: Herbert Xu @ 2005-04-06  6:03 UTC (permalink / raw)
  To: Dmitry Yusupov; +Cc: netdev, davem

On Tue, Apr 05, 2005 at 10:55:01PM -0700, Dmitry Yusupov wrote:
> 
> main reason for this is to have clear way to notify netlink user that
> new socket created and bound.

Why do you want to exclude sockets that listen to multicast
messages? They can still do unicast transactions.
 
> one reason is for consistency with sock interface. sk_allocation is
> equal to GFP_KERNEL by default, so nothing changed. but. in some cases
> application might require non-blocking kmalloc behavior. one real life
> example is networking block device used for swap partition. this way any
> GFP_KERENL allocation on recovery path might lead to deadlock condition.

Setting of sk_allocation is controlled by the protocol itself.  In this
case this is af_netlink.c.  As you can see, we never set this to anything
other than GFP_KERNEL.

Using sk_allocation will only confuse those who are not familiar with
netlink into thinking that this can be atomic.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-06  6:03     ` Herbert Xu
@ 2005-04-06 15:16       ` Dmitry Yusupov
  0 siblings, 0 replies; 22+ messages in thread
From: Dmitry Yusupov @ 2005-04-06 15:16 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, davem

On Wed, 2005-04-06 at 16:03 +1000, Herbert Xu wrote:
> On Tue, Apr 05, 2005 at 10:55:01PM -0700, Dmitry Yusupov wrote:
> > 
> > main reason for this is to have clear way to notify netlink user that
> > new socket created and bound.
> 
> Why do you want to exclude sockets that listen to multicast
> messages? They can still do unicast transactions.

this probably makes sense. I didn't do that intentionally, all I wanted
is to have the clean way to initialize control structures in proper time
- NETLINK_UESTABLISHED so it will work with my application. If you want
to extent change for multicast - feel free.

> > one reason is for consistency with sock interface. sk_allocation is
> > equal to GFP_KERNEL by default, so nothing changed. but. in some cases
> > application might require non-blocking kmalloc behavior. one real life
> > example is networking block device used for swap partition. this way any
> > GFP_KERENL allocation on recovery path might lead to deadlock condition.
> 
> Setting of sk_allocation is controlled by the protocol itself.  In this
> case this is af_netlink.c.  As you can see, we never set this to anything
> other than GFP_KERNEL.
> 
> Using sk_allocation will only confuse those who are not familiar with
> netlink into thinking that this can be atomic.

I don't think so. It is not clean enough to call alloc_skb() with hard-
coded vm priority. And it limits netlink in the way I described before.
If we want to defer critical processing to the user-space(like number of
applications includes: iscsi, lvm, multipath, fuse...), this change is
essential. Having GFP_KERNEL hard-coded will lead to deadlock on "down"
calls in some cases. So, we do need this change to make a progress.

> Cheers,

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-06  2:45 ` Herbert Xu
  2005-04-06  5:55   ` Dmitry Yusupov
@ 2005-04-06 16:23   ` Mike Christie
  2005-04-06 21:29     ` Herbert Xu
  1 sibling, 1 reply; 22+ messages in thread
From: Mike Christie @ 2005-04-06 16:23 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Dmitry Yusupov, netdev, davem

Herbert Xu wrote:
> Dmitry Yusupov <dmitry_yus@yahoo.com> wrote:
>>* netlink alloc_skb() now uses sk_allocation instead of hard-coded
>>GFP_KERNEL
> 
> 
> Why? We never set it to anything else for netlink.
> 

It is due to where it is being used. open-iscsi uses netlink
sockets for communication in a block (scsi specificically)
driver that has pushed much of its code to usersapce. Forcing
open-iscsi to use GFP_KERNEL causes a couple of problems. The
worst would be where a GFP_KERNEL allocation causes a write,
and that write is to an iscsi disk that open-iscsi is managing.
The write could then hit the same code path and cause another
GFP_KERNEL allocation and we could loop like that until the
system locks up.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-06 16:23   ` Mike Christie
@ 2005-04-06 21:29     ` Herbert Xu
  2005-04-06 21:37       ` Dmitry Yusupov
  0 siblings, 1 reply; 22+ messages in thread
From: Herbert Xu @ 2005-04-06 21:29 UTC (permalink / raw)
  To: Mike Christie; +Cc: Dmitry Yusupov, netdev, davem

On Wed, Apr 06, 2005 at 09:23:15AM -0700, Mike Christie wrote:
> 
> It is due to where it is being used. open-iscsi uses netlink
> sockets for communication in a block (scsi specificically)
> driver that has pushed much of its code to usersapce. Forcing
> open-iscsi to use GFP_KERNEL causes a couple of problems. The
> worst would be where a GFP_KERNEL allocation causes a write,
> and that write is to an iscsi disk that open-iscsi is managing.
> The write could then hit the same code path and cause another
> GFP_KERNEL allocation and we could loop like that until the
> system locks up.

In that case it's not enough to just use sk_allocation here.
You'll need a way to actually set it to GFP_ATOMIC.
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-06 21:29     ` Herbert Xu
@ 2005-04-06 21:37       ` Dmitry Yusupov
  2005-04-06 22:04         ` Herbert Xu
  0 siblings, 1 reply; 22+ messages in thread
From: Dmitry Yusupov @ 2005-04-06 21:37 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Mike Christie, netdev, davem

On Thu, 2005-04-07 at 07:29 +1000, Herbert Xu wrote:
> On Wed, Apr 06, 2005 at 09:23:15AM -0700, Mike Christie wrote:
> > 
> > It is due to where it is being used. open-iscsi uses netlink
> > sockets for communication in a block (scsi specificically)
> > driver that has pushed much of its code to usersapce. Forcing
> > open-iscsi to use GFP_KERNEL causes a couple of problems. The
> > worst would be where a GFP_KERNEL allocation causes a write,
> > and that write is to an iscsi disk that open-iscsi is managing.
> > The write could then hit the same code path and cause another
> > GFP_KERNEL allocation and we could loop like that until the
> > system locks up.
> 
> In that case it's not enough to just use sk_allocation here.
> You'll need a way to actually set it to GFP_ATOMIC.

correct.

That's why in my patch I provided NETLINK_UESTABLISHED event. It is a
right way and time to set sk->sk_allocation to GFP_ATOMIC for newly
established netlink connection. imho.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-06 21:37       ` Dmitry Yusupov
@ 2005-04-06 22:04         ` Herbert Xu
  2005-04-06 22:26           ` Dmitry Yusupov
  0 siblings, 1 reply; 22+ messages in thread
From: Herbert Xu @ 2005-04-06 22:04 UTC (permalink / raw)
  To: Dmitry Yusupov; +Cc: Mike Christie, netdev, davem

On Wed, Apr 06, 2005 at 02:37:21PM -0700, Dmitry Yusupov wrote:
>
> > In that case it's not enough to just use sk_allocation here.
> > You'll need a way to actually set it to GFP_ATOMIC.
> 
> That's why in my patch I provided NETLINK_UESTABLISHED event. It is a
> right way and time to set sk->sk_allocation to GFP_ATOMIC for newly
> established netlink connection. imho.

Right.  In that case I think you're going about this the wrong way.
It's unmanageable to have some external kernel module fiddle with
netlink's sk_allocation upon the receipt of a notification.

If you really want to do this, do it as a setsockopt.  It's the owner
of the socket that wants the GFP_ATOMIC, not anybody else.

However, I'd like to know more about how open-iscsi uses this.  In
particular, what are these messages and what do you do when the
GFP_ATOMIC alloc_skb fails?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-06 22:04         ` Herbert Xu
@ 2005-04-06 22:26           ` Dmitry Yusupov
  2005-04-07 11:30             ` jamal
  2005-04-07 21:32             ` Herbert Xu
  0 siblings, 2 replies; 22+ messages in thread
From: Dmitry Yusupov @ 2005-04-06 22:26 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Mike Christie, netdev, davem

On Thu, 2005-04-07 at 08:04 +1000, Herbert Xu wrote:
> On Wed, Apr 06, 2005 at 02:37:21PM -0700, Dmitry Yusupov wrote:
> >
> > > In that case it's not enough to just use sk_allocation here.
> > > You'll need a way to actually set it to GFP_ATOMIC.
> > 
> > That's why in my patch I provided NETLINK_UESTABLISHED event. It is a
> > right way and time to set sk->sk_allocation to GFP_ATOMIC for newly
> > established netlink connection. imho.
> 
> Right.  In that case I think you're going about this the wrong way.
> It's unmanageable to have some external kernel module fiddle with
> netlink's sk_allocation upon the receipt of a notification.
>
> If you really want to do this, do it as a setsockopt.  It's the owner
> of the socket that wants the GFP_ATOMIC, not anybody else.

I see. yes. either way is OK for me. yours is cleaner. We probably
should provide new optname for netlink proto.

> However, I'd like to know more about how open-iscsi uses this.  In
> particular, what are these messages and what do you do when the
> GFP_ATOMIC alloc_skb fails?

Messages from kernel are asynchronous and there are no alloc_skb on "up"
calls. It is "mempooled out" on interface level. (see Open-iSCSI
interface). Messages to kernel requires copy_from_user to newly
allocated skb, here is where we need sk_allocation bit set. Those
messages are synchronous from daemon perspective. If "down" call fails,
we will re-try later or take some other management action. We assuming
that later OOM-killer will free some memory for us and atomic allocation
will succeed eventually.

Dmitry

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-06 22:26           ` Dmitry Yusupov
@ 2005-04-07 11:30             ` jamal
  2005-04-07 15:05               ` Dmitry Yusupov
  2005-04-07 21:32             ` Herbert Xu
  1 sibling, 1 reply; 22+ messages in thread
From: jamal @ 2005-04-07 11:30 UTC (permalink / raw)
  To: Dmitry Yusupov; +Cc: Herbert Xu, Mike Christie, netdev, David S. Miller

On Wed, 2005-04-06 at 18:26, Dmitry Yusupov wrote:

> (see Open-iSCSI interface).

url?

cheers,
jamal

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-07 11:30             ` jamal
@ 2005-04-07 15:05               ` Dmitry Yusupov
  0 siblings, 0 replies; 22+ messages in thread
From: Dmitry Yusupov @ 2005-04-07 15:05 UTC (permalink / raw)
  To: hadi; +Cc: Herbert Xu, Mike Christie, netdev, David S. Miller

http://www.open-iscsi.org

On Thu, 2005-04-07 at 07:30 -0400, jamal wrote:
> On Wed, 2005-04-06 at 18:26, Dmitry Yusupov wrote:
> 
> > (see Open-iSCSI interface).
> 
> url?
> 
> cheers,
> jamal
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-06 22:26           ` Dmitry Yusupov
  2005-04-07 11:30             ` jamal
@ 2005-04-07 21:32             ` Herbert Xu
  2005-04-07 23:36               ` Dmitry Yusupov
  1 sibling, 1 reply; 22+ messages in thread
From: Herbert Xu @ 2005-04-07 21:32 UTC (permalink / raw)
  To: Dmitry Yusupov; +Cc: Mike Christie, netdev, davem

On Wed, Apr 06, 2005 at 03:26:25PM -0700, Dmitry Yusupov wrote:
> 
> Messages from kernel are asynchronous and there are no alloc_skb on "up"
> calls. It is "mempooled out" on interface level. (see Open-iSCSI
> interface). Messages to kernel requires copy_from_user to newly
> allocated skb, here is where we need sk_allocation bit set. Those
> messages are synchronous from daemon perspective. If "down" call fails,
> we will re-try later or take some other management action. We assuming
> that later OOM-killer will free some memory for us and atomic allocation
> will succeed eventually.

I presume you only need to send one message at a time of a fixed size.
Would it better to always have an skb allocated for the socket so that
we don't need to allocate at sendmsg time?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-07 21:32             ` Herbert Xu
@ 2005-04-07 23:36               ` Dmitry Yusupov
  2005-04-08 11:36                 ` Herbert Xu
  0 siblings, 1 reply; 22+ messages in thread
From: Dmitry Yusupov @ 2005-04-07 23:36 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Mike Christie, netdev, davem

On Fri, 2005-04-08 at 07:32 +1000, Herbert Xu wrote:
> On Wed, Apr 06, 2005 at 03:26:25PM -0700, Dmitry Yusupov wrote:
> > 
> > Messages from kernel are asynchronous and there are no alloc_skb on "up"
> > calls. It is "mempooled out" on interface level. (see Open-iSCSI
> > interface). Messages to kernel requires copy_from_user to newly
> > allocated skb, here is where we need sk_allocation bit set. Those
> > messages are synchronous from daemon perspective. If "down" call fails,
> > we will re-try later or take some other management action. We assuming
> > that later OOM-killer will free some memory for us and atomic allocation
> > will succeed eventually.
> 
> I presume you only need to send one message at a time of a fixed size.
> Would it better to always have an skb allocated for the socket so that
> we don't need to allocate at sendmsg time?

This actually even better since we will guarantee "down" call delivery.
My only concern is that it is not very generic. Do you know clean way to
implement it?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-07 23:36               ` Dmitry Yusupov
@ 2005-04-08 11:36                 ` Herbert Xu
  2005-04-08 15:30                   ` Dmitry Yusupov
  0 siblings, 1 reply; 22+ messages in thread
From: Herbert Xu @ 2005-04-08 11:36 UTC (permalink / raw)
  To: Dmitry Yusupov; +Cc: Mike Christie, netdev, davem

On Thu, Apr 07, 2005 at 04:36:41PM -0700, Dmitry Yusupov wrote:
> 
> This actually even better since we will guarantee "down" call delivery.
> My only concern is that it is not very generic. Do you know clean way to
> implement it?

How generic do you want this? Do you need this for socket types other
than netlink?

For a one-packet version, we can pre-allocate an skb/page in response to
a setsockopt and store it in sk_send_head/sk_sndmsg_page.

This can then be used at sendmsg time.  Obviously subsequent messages
will have to use alloc_skb until that skb is released.

Alternatively, we can let the socket allocate skb's from an emergency
pool similar to what was discussed in the thread
"Summary of 2005 Kernel Summit Proposed Topics".

Again this could enabled on a setsockopt.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-08 11:36                 ` Herbert Xu
@ 2005-04-08 15:30                   ` Dmitry Yusupov
  2005-04-09  1:44                     ` Herbert Xu
  0 siblings, 1 reply; 22+ messages in thread
From: Dmitry Yusupov @ 2005-04-08 15:30 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Mike Christie, netdev, davem

On Fri, 2005-04-08 at 21:36 +1000, Herbert Xu wrote:
> On Thu, Apr 07, 2005 at 04:36:41PM -0700, Dmitry Yusupov wrote:
> > 
> > This actually even better since we will guarantee "down" call delivery.
> > My only concern is that it is not very generic. Do you know clean way to
> > implement it?
> 
> How generic do you want this? Do you need this for socket types other
> than netlink?

as discussed on kernel-summit-discuss. yes. we need generic OOM-safety
solution.

> For a one-packet version, we can pre-allocate an skb/page in response to
> a setsockopt and store it in sk_send_head/sk_sndmsg_page.
> 
> This can then be used at sendmsg time.  Obviously subsequent messages
> will have to use alloc_skb until that skb is released.
> 
> Alternatively, we can let the socket allocate skb's from an emergency
> pool similar to what was discussed in the thread
> "Summary of 2005 Kernel Summit Proposed Topics".

This is ideally how it should be for sendmsg paths. socket applications
like iscsi, nbd, etc will use it for TCP/IP type of socket. iscsi could
re-use the same generic "emergency pool" code for netlink.

> Again this could enabled on a setsockopt.

perfect.

> Cheers,

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-08 15:30                   ` Dmitry Yusupov
@ 2005-04-09  1:44                     ` Herbert Xu
  2005-04-09 16:02                       ` Dmitry Yusupov
  0 siblings, 1 reply; 22+ messages in thread
From: Herbert Xu @ 2005-04-09  1:44 UTC (permalink / raw)
  To: Dmitry Yusupov; +Cc: Mike Christie, netdev, davem

On Fri, Apr 08, 2005 at 08:30:14AM -0700, Dmitry Yusupov wrote:
> 
> This is ideally how it should be for sendmsg paths. socket applications
> like iscsi, nbd, etc will use it for TCP/IP type of socket. iscsi could
> re-use the same generic "emergency pool" code for netlink.

In that case we'll wait for the resolution of the discussion on TCP
itself, OK?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-09  1:44                     ` Herbert Xu
@ 2005-04-09 16:02                       ` Dmitry Yusupov
  2005-04-09 19:36                         ` Herbert Xu
  0 siblings, 1 reply; 22+ messages in thread
From: Dmitry Yusupov @ 2005-04-09 16:02 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Mike Christie, netdev, davem

On Sat, 2005-04-09 at 11:44 +1000, Herbert Xu wrote:
> On Fri, Apr 08, 2005 at 08:30:14AM -0700, Dmitry Yusupov wrote:
> > 
> > This is ideally how it should be for sendmsg paths. socket applications
> > like iscsi, nbd, etc will use it for TCP/IP type of socket. iscsi could
> > re-use the same generic "emergency pool" code for netlink.
> 
> In that case we'll wait for the resolution of the discussion on TCP
> itself, OK?

"emergency pool" is a big and generic feature, needs design and
resolution. it could definitely wait. Do you have something against
one liner patch for using sk->sk_allocation instead of hard-coded vm
priority?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] NETLINK_UESTABLISHED notifier event
  2005-04-09 16:02                       ` Dmitry Yusupov
@ 2005-04-09 19:36                         ` Herbert Xu
  0 siblings, 0 replies; 22+ messages in thread
From: Herbert Xu @ 2005-04-09 19:36 UTC (permalink / raw)
  To: Dmitry Yusupov; +Cc: Mike Christie, netdev, davem

On Sat, Apr 09, 2005 at 09:02:59AM -0700, Dmitry Yusupov wrote:
> 
> "emergency pool" is a big and generic feature, needs design and
> resolution. it could definitely wait. Do you have something against
> one liner patch for using sk->sk_allocation instead of hard-coded vm
> priority?

Whatever we come up with will need to be maintained as part of
the kernel ABI.  So I'd rather not see a temporary hack carried
around forever.

You can always patch your kernels to do whatever you want of course.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2005-04-09 19:36 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-04-05 19:54 [PATCH] NETLINK_UESTABLISHED notifier event Dmitry Yusupov
2005-04-05 21:39 ` Jean-Mickael Guerin
2005-04-05 23:53   ` Dmitry Yusupov
2005-04-06  0:13   ` Dmitry Yusupov
2005-04-06  2:45 ` Herbert Xu
2005-04-06  5:55   ` Dmitry Yusupov
2005-04-06  6:03     ` Herbert Xu
2005-04-06 15:16       ` Dmitry Yusupov
2005-04-06 16:23   ` Mike Christie
2005-04-06 21:29     ` Herbert Xu
2005-04-06 21:37       ` Dmitry Yusupov
2005-04-06 22:04         ` Herbert Xu
2005-04-06 22:26           ` Dmitry Yusupov
2005-04-07 11:30             ` jamal
2005-04-07 15:05               ` Dmitry Yusupov
2005-04-07 21:32             ` Herbert Xu
2005-04-07 23:36               ` Dmitry Yusupov
2005-04-08 11:36                 ` Herbert Xu
2005-04-08 15:30                   ` Dmitry Yusupov
2005-04-09  1:44                     ` Herbert Xu
2005-04-09 16:02                       ` Dmitry Yusupov
2005-04-09 19:36                         ` Herbert Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).