Netdev List
 help / color / mirror / Atom feed
* Re: [2.6 patch] do not select NET_CLS
From: Sam Ravnborg @ 2005-11-22 22:49 UTC (permalink / raw)
  To: David S. Miller; +Cc: kaber, bunk, evil, linux-kernel, netdev, zippel
In-Reply-To: <20051122.143713.101129339.davem@davemloft.net>

On Tue, Nov 22, 2005 at 02:37:13PM -0800, David S. Miller wrote:
> 
> One thing we can do to prevent human
> mistakes, is to make the "make modules" pass do a quick "is vmlinux
> uptodate?" check, and if not print out an error message explaining the
> situation and aborting the "make modules" attempt.


I do not quite follow you here.

For a while I have considered implementing something that told why a
given file was compiled - like:

 CC     net/ipv4/ip_gre.o   due to net/dsfield.h, net/xfrm.h
 CC     net/ipv4/raw.c   due to include/config/ip/mroute.h

The latter is a config option that I do not see a possibility to change
back to a config option syntax (at least not without doing some effort).

My thinking was that 'make V=2' would give above printout.

But what you request is something that keep the dense printout without
building the kernel - right?

Any suggestion for an intuitive syntax to enable that?
'make -n V=2' will not do it.

	Sam

^ permalink raw reply

* Re: [NETLINK]: Use tgid instead of pid for nlmsg_pid
From: David S. Miller @ 2005-11-22 22:43 UTC (permalink / raw)
  To: herbert; +Cc: kuznet, cfriesen, netdev, linux-kernel
In-Reply-To: <E1EeJxb-0006xG-00@gondolin.me.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Tue, 22 Nov 2005 09:16:27 +1100

> Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> wrote:
> > 
> > I agree, apparently netlink_autobind was missed when sed'ing pid->tgid.
> > Of course, it does not matter, but tgid is nicer choice from user's viewpoint.
> 
> Great, here is the patch to do just that.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, of course.

I can't for the life of me figure out how we missed this when
we fixed up all the current->pid references under net/.
Ulrich Drepper let us know that the problem existed, and
I was sure we eliminated all such cases.

It is possible we accidently reintroduced current->pid when
we redid all of the netlink hashing. :-)

^ permalink raw reply

* Re: [PATCH 00/13]: Netfilter IPsec support
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2005-11-22 22:38 UTC (permalink / raw)
  To: davem; +Cc: netdev, netfilter-devel, kaber
In-Reply-To: <20051122.143438.84749134.davem@davemloft.net>

In article <20051122.143438.84749134.davem@davemloft.net> (at Tue, 22 Nov 2005 14:34:38 -0800 (PST)), "David S. Miller" <davem@davemloft.net> says:

> I want to do this so that Patrick doesn't have to repost
> 13 or so patches every time one of the parts still under
> discussion gets changed.
> 
> Actually, it seems the only part under discussion is how to
> avoid extension header reparsing and routing re-lookups on
> the ipv6 side.  That could be fixed by a follow-on patch and
> is not %100 necessary for initial integration in my opinion.
> 
> Can I get agreement on that?  Patrick sends me a dump of the
> current state of his patch set right now, we put that into
> net-2.6.16, and fix problems with followon patches.
> 
> Ok?

I believe he can manage these patches in his tree,
but anyway...

Well, it is very important to fix the packet processing
path issues (including the extension header issue)
before we merge it to the mainline. Definitely.
What I want to ensure is that they will not reach the mainline
tree without resolving those issues.

Thank you.

--yoshfuji

^ permalink raw reply

* Re: [2.6 patch] do not select NET_CLS
From: David S. Miller @ 2005-11-22 22:37 UTC (permalink / raw)
  To: kaber; +Cc: bunk, evil, linux-kernel, netdev, zippel
In-Reply-To: <4381F2D2.5000605@trash.net>

From: Patrick McHardy <kaber@trash.net>
Date: Mon, 21 Nov 2005 17:16:18 +0100

> Adrian Bunk wrote:
> > This patch therefore changes NET_CLS back to the 2.6.14 status quo of 
> > being an user-visible option.
> 
> I disagree with this patch. NET_CLS enables the infrastructure support
> for classifiers. Users generally don't care about infrastructure but
> directly usable things, so I'd prefer to have it automatically selected. 
> And there are lots of other cases where enabling a module causes changes
> in the kernel image. Some examples include some of the netfilter stuff,
> the IPsec transforms, NET_CLS_ROUTE4, the ieee80211 stuff, and a lot
> more.

I agree with Patrick.

Changing config options of any kind can result in the main kernel
image needing to be rebuilt.  One thing we can do to prevent human
mistakes, is to make the "make modules" pass do a quick "is vmlinux
uptodate?" check, and if not print out an error message explaining the
situation and aborting the "make modules" attempt.

^ permalink raw reply

* Re: [PATCH 00/13]: Netfilter IPsec support
From: David S. Miller @ 2005-11-22 22:34 UTC (permalink / raw)
  To: kaber; +Cc: netdev, netfilter-devel
In-Reply-To: <20051120163128.16666.38111.sendpatchset@localhost.localdomain>

From: Patrick McHardy <kaber@trash.net>
Date: Sun, 20 Nov 2005 17:31:28 +0100

> This is the latest netfilter/IPsec patchset. Its purpose is to make
> IPsec look as much as a normal tunnel device to netfilter as possible
> and to enable NAT support.

I think there are some of these patches that we can merge in
right now into net-2.6.16...

I want to do this so that Patrick doesn't have to repost
13 or so patches every time one of the parts still under
discussion gets changed.

Actually, it seems the only part under discussion is how to
avoid extension header reparsing and routing re-lookups on
the ipv6 side.  That could be fixed by a follow-on patch and
is not %100 necessary for initial integration in my opinion.

Can I get agreement on that?  Patrick sends me a dump of the
current state of his patch set right now, we put that into
net-2.6.16, and fix problems with followon patches.

Ok?

^ permalink raw reply

* Re: [PATCH] NETFILTER arp_tables: Fix NUMA optimization
From: David S. Miller @ 2005-11-22 22:28 UTC (permalink / raw)
  To: laforge; +Cc: netdev, netfilter-devel, dada1
In-Reply-To: <20051119103131.GE20775@sunbeam.de.gnumonks.org>

From: Harald Welte <laforge@netfilter.org>
Date: Sat, 19 Nov 2005 11:31:31 +0100

> There's another one that I detected while merging those changes with
> x_tables (painful).  Please merge:
> 
> [NETFILTER] arp_tables: Fix bug introduced with NUMA aware allocation
> 
> This fix shows how bad I am with copy & paste.
> 
> Signed-off-by: Harald Welte <laforge@netfilter.org>

Added to net-2.6.16, thanks a lot.

^ permalink raw reply

* Re: e1000 82571 Packet Splitting
From: Jeff V. Merkey @ 2005-11-22 22:27 UTC (permalink / raw)
  To: Jesse Brandeburg; +Cc: linux-kernel, Kernel Netdev Mailing List
In-Reply-To: <4807377b0511221546h7a16c0a9p793e441197b23118@mail.gmail.com>

Jesse Brandeburg wrote:

>this should be on netdev.
>
>On 11/22/05, Jeff V. Merkey <jmerkey@soleranetworks.com> wrote:
>  
>
>>I have noted that the e1000 driver is now supporting DMA splitting of
>>the packet header and payload into separate pages.  I also noticed
>>that none of the config options enable it.  Is anyone using this feature
>>at present and has it even been tested on Linux?
>>    
>>
>
>The 6.2.15 driver off http://prdownloads.sf.net/e1000 will enable it
>by default.  We've gone through our release approval, which includes
>quite a bit of testing.
>If the patches posted recently don't include that support, I missed
>something :-)
>
>This is only on the 82571 and greater hardware that packet split is
>supported, BTW.
>
>  
>
Got it.  I am merging packet capture support into the e1000 for DSFS for 
direct cache DMA support.  I noticed the two distinct fill routines
for the ring buffer.  I have enabled DSFS for both with this driver.  I 
will post patches late tonight or in the morning for the 82571 support 
for DSFS.  I will
also be posting the patches for 2.6.12, 2.6.10. 2.6.14, Suse 9 and 10.0, 
Red Hat ES/AS 3 and 4, and Fedora Core 2/3/4 by end of week with the
packet splitting capabilities with packet capture on DSFS.  This is a 
very good optimization and should come in handy for reducing header copies
in the protocol stacks.

Jeff

^ permalink raw reply

* Re: [PATCH] ebtables: port ebt*[u]log.c to nf[netlink]_log
From: David S. Miller @ 2005-11-22 22:26 UTC (permalink / raw)
  To: bdschuym-LPO8gxj9N8aZIoH1IeqzKA
  Cc: laforge-TgoAw6mPHtdg9hUCZPvPmw, netdev-u79uwXL29TY76Z2rM5mHXA,
	ebtables-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
In-Reply-To: <1132078881.3464.5.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>

From: Bart De Schuymer <bdschuym-LPO8gxj9N8aZIoH1IeqzKA@public.gmane.org>
Date: Tue, 15 Nov 2005 18:21:21 +0000

> [NETFILTER] ebtables: Support nf_log API from ebt_log and ebt_ulog
> 
> This makes ebt_log and ebt_ulog use the new nf_log api.  This enables
> the bridging packet filter to log packets e.g. via nfnetlink_log.
> 
> Signed-off-by: Bart De Schuymer <bdschuym-LPO8gxj9N8aZIoH1IeqzKA@public.gmane.org>
> Signed-off-by: Harald Welte <laforge-Cap9r6Oaw4JrovVCs/uTlw@public.gmane.org>

I'm going to queue this up for 2.6.16, thanks.


-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
for problems?  Stop!  Download the new AJAX search engine that makes
searching your log files as easy as surfing the  web.  DOWNLOAD SPLUNK!
http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click

^ permalink raw reply

* Re: Network lockup under load
From: Herbert Xu @ 2005-11-22 20:40 UTC (permalink / raw)
  To: Joe Korty; +Cc: David S. Miller, linux-kernel, netdev
In-Reply-To: <20051122164502.GA12498@tsunami.ccur.com>

On Tue, Nov 22, 2005 at 04:45:02PM +0000, Joe Korty wrote:
> 
> To trigger the lockup, I do a 'scp -rp' of the kernel
> tree from a machine with a defective kernel to any other
> machine.  It triggers anywhere after the first file
> transfered, to perhaps 30 files transfered.

Please generate a stack backtrace of your processes using either
CTRL-SCROLLLOCK or SysRq.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 02/10]: [NETFILTER]: Defer fragmentation in ip_output when connection tracking is used
From: Harald Welte @ 2005-11-22 14:19 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Kernel Netdev Mailing List, Netfilter Development Mailinglist,
	Herbert Xu
In-Reply-To: <4382D419.6070705@trash.net>

[-- Attachment #1: Type: text/plain, Size: 763 bytes --]

On Tue, Nov 22, 2005 at 09:17:29AM +0100, Patrick McHardy wrote:

> >>I've CCed Harald for his opinion in case I missed something.
> >I don't think you've missed something.  I agree that the features you
> >pointed out seem to be useful and/or important.
> 
> I hope you meant to write "don't seem to be"? :)

yes.  I shouldn't be answering emails while holding a tutorial, sorry.

-- 
- Harald Welte <laforge@netfilter.org>                 http://netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply

* Re: [PATCH 05/13]: [IPV4/6]: Netfilter IPsec output hooks
From: Herbert Xu @ 2005-11-22 12:13 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, netfilter-devel, davem
In-Reply-To: <20051122103139.GA4632@gondor.apana.org.au>

On Tue, Nov 22, 2005 at 09:31:39PM +1100, herbert wrote:
> 
> Unfortunately it looks like gcc 3.3.5 at least is too dumb to optimise
> it away.  I think we'll need a better strategy.

OK, the idea is still the same: Move the loop from dst_output into
xfrm4_output/xfrm6_output since they're the only ones who need to it.

In order to avoid the tail call issue, I've added the inline function
nf_hook which is nf_hook_slow plus the empty list check.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -168,6 +168,37 @@ void nf_log_packet(int pf,
 		   const struct net_device *out,
 		   struct nf_loginfo *li,
 		   const char *fmt, ...);
+
+int nf_hook_slow(int pf, unsigned int hook, struct sk_buff **pskb,
+		 struct net_device *indev, struct net_device *outdev,
+		 int (*okfn)(struct sk_buff *), int thresh);
+
+/**
+ *	nf_hook_thresh - call a netfilter hook
+ *	
+ *	Returns 1 if the hook has allowed the packet to pass.  The function
+ *	okfn must be invoked by the caller in this case.  Any other return
+ *	value indicates the packet has been consumed by the hook.
+ */
+static inline int nf_hook_thresh(int pf, unsigned int hook,
+				 struct sk_buff **pskb,
+				 struct net_device *indev,
+				 struct net_device *outdev,
+				 int (*okfn)(struct sk_buff *), int thresh)
+{
+#ifndef CONFIG_NETFILTER_DEBUG
+	if (list_empty(&nf_hooks[pf][hook]))
+		return 1;
+#endif
+	return nf_hook_slow(pf, hook, pskb, indev, outdev, okfn, thresh);
+}
+
+static inline int nf_hook(int pf, unsigned int hook, struct sk_buff **pskb,
+			  struct net_device *indev, struct net_device *outdev,
+			  int (*okfn)(struct sk_buff *))
+{
+	return nf_hook_thresh(pf, hook, pskb, indev, outdev, okfn, INT_MIN);
+}
                    
 /* Activate hook; either okfn or kfree_skb called, unless a hook
    returns NF_STOLEN (in which case, it's up to the hook to deal with
@@ -188,35 +219,17 @@ void nf_log_packet(int pf,
 
 /* This is gross, but inline doesn't cut it for avoiding the function
    call in fast path: gcc doesn't inline (needs value tracking?). --RR */
-#ifdef CONFIG_NETFILTER_DEBUG
-#define NF_HOOK(pf, hook, skb, indev, outdev, okfn)			       \
-({int __ret;								       \
-if ((__ret=nf_hook_slow(pf, hook, &(skb), indev, outdev, okfn, INT_MIN)) == 1) \
-	__ret = (okfn)(skb);						       \
-__ret;})
-#define NF_HOOK_THRESH(pf, hook, skb, indev, outdev, okfn, thresh)	       \
-({int __ret;								       \
-if ((__ret=nf_hook_slow(pf, hook, &(skb), indev, outdev, okfn, thresh)) == 1)  \
-	__ret = (okfn)(skb);						       \
-__ret;})
-#else
-#define NF_HOOK(pf, hook, skb, indev, outdev, okfn)			       \
-({int __ret;								       \
-if (list_empty(&nf_hooks[pf][hook]) ||					       \
-    (__ret=nf_hook_slow(pf, hook, &(skb), indev, outdev, okfn, INT_MIN)) == 1) \
-	__ret = (okfn)(skb);						       \
-__ret;})
+
+/* HX: It's slightly less gross now. */
+
 #define NF_HOOK_THRESH(pf, hook, skb, indev, outdev, okfn, thresh)	       \
 ({int __ret;								       \
-if (list_empty(&nf_hooks[pf][hook]) ||					       \
-    (__ret=nf_hook_slow(pf, hook, &(skb), indev, outdev, okfn, thresh)) == 1)  \
+if ((__ret=nf_hook_thresh(pf, hook, &(skb), indev, outdev, okfn, thresh)) == 1)\
 	__ret = (okfn)(skb);						       \
 __ret;})
-#endif
 
-int nf_hook_slow(int pf, unsigned int hook, struct sk_buff **pskb,
-		 struct net_device *indev, struct net_device *outdev,
-		 int (*okfn)(struct sk_buff *), int thresh);
+#define NF_HOOK(pf, hook, skb, indev, outdev, okfn) \
+	NF_HOOK_THRESH(pf, hook, skb, indev, outdev, okfn, INT_MIN)
 
 /* Call setsockopt() */
 int nf_setsockopt(struct sock *sk, int pf, int optval, char __user *opt, 
diff --git a/include/net/dst.h b/include/net/dst.h
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -224,16 +224,7 @@ static inline void dst_set_expires(struc
 /* Output packet to network from transport.  */
 static inline int dst_output(struct sk_buff *skb)
 {
-	int err;
-
-	for (;;) {
-		err = skb->dst->output(skb);
-
-		if (likely(err == 0))
-			return err;
-		if (unlikely(err != NET_XMIT_BYPASS))
-			return err;
-	}
+	return skb->dst->output(skb);
 }
 
 /* Input packet from network to transport.  */
diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
--- a/net/ipv4/xfrm4_output.c
+++ b/net/ipv4/xfrm4_output.c
@@ -8,8 +8,10 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#include <linux/compiler.h>
 #include <linux/skbuff.h>
 #include <linux/spinlock.h>
+#include <linux/netfilter_ipv4.h>
 #include <net/inet_ecn.h>
 #include <net/ip.h>
 #include <net/xfrm.h>
@@ -95,7 +97,7 @@ out:
 	return ret;
 }
 
-int xfrm4_output(struct sk_buff *skb)
+static int xfrm4_output_one(struct sk_buff *skb)
 {
 	struct dst_entry *dst = skb->dst;
 	struct xfrm_state *x = dst->xfrm;
@@ -138,7 +140,7 @@ int xfrm4_output(struct sk_buff *skb)
 		x = dst->xfrm;
 	} while (x && !x->props.mode);
 
-	err = NET_XMIT_BYPASS;
+	err = 0;
 
 out_exit:
 	return err;
@@ -148,3 +150,34 @@ error_nolock:
 	kfree_skb(skb);
 	goto out_exit;
 }
+
+static int xfrm4_output_finish(struct sk_buff *skb)
+{
+	int err;
+
+	while (likely((err = xfrm4_output_one(skb)) == 0)) {
+		nf_reset(skb);
+
+		err = nf_hook(PF_INET, NF_IP_LOCAL_OUT, &skb, NULL,
+			      skb->dst->dev, dst_output);
+		if (unlikely(err != 1))
+			break;
+
+		err = 0;
+		if (!skb->dst->xfrm)
+			break;
+
+		err = nf_hook(PF_INET, NF_IP_POST_ROUTING, &skb, NULL,
+			      skb->dst->dev, xfrm4_output_finish);
+		if (unlikely(err != 1))
+			break;
+	}
+
+	return err;
+}
+
+int xfrm4_output(struct sk_buff *skb)
+{
+	return NF_HOOK(PF_INET, NF_IP_POST_ROUTING, skb, NULL, skb->dst->dev,
+		       xfrm4_output_finish);
+}
diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -9,9 +9,11 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#include <linux/compiler.h>
 #include <linux/skbuff.h>
 #include <linux/spinlock.h>
 #include <linux/icmpv6.h>
+#include <linux/netfilter_ipv6.h>
 #include <net/dsfield.h>
 #include <net/inet_ecn.h>
 #include <net/ipv6.h>
@@ -92,7 +94,7 @@ static int xfrm6_tunnel_check_size(struc
 	return ret;
 }
 
-int xfrm6_output(struct sk_buff *skb)
+static int xfrm6_output_one(struct sk_buff *skb)
 {
 	struct dst_entry *dst = skb->dst;
 	struct xfrm_state *x = dst->xfrm;
@@ -137,7 +139,7 @@ int xfrm6_output(struct sk_buff *skb)
 		x = dst->xfrm;
 	} while (x && !x->props.mode);
 
-	err = NET_XMIT_BYPASS;
+	err = 0;
 
 out_exit:
 	return err;
@@ -147,3 +149,34 @@ error_nolock:
 	kfree_skb(skb);
 	goto out_exit;
 }
+
+static int xfrm6_output_finish(struct sk_buff *skb)
+{
+	int err;
+
+	while (likely((err = xfrm6_output_one(skb)) == 0)) {
+		nf_reset(skb);
+	
+		err = nf_hook(PF_INET6, NF_IP6_LOCAL_OUT, &skb, NULL,
+			      skb->dst->dev, dst_output);
+		if (unlikely(err != 1))
+			break;
+
+		err = 0;
+		if (!skb->dst->xfrm)
+			break;
+
+		err = nf_hook(PF_INET6, NF_IP6_POST_ROUTING, &skb, NULL,
+			      skb->dst->dev, xfrm6_output_finish);
+		if (unlikely(err != 1))
+			break;
+	}
+
+	return err;
+}
+
+int xfrm6_output(struct sk_buff *skb)
+{
+	return NF_HOOK(PF_INET6, NF_IP6_POST_ROUTING, skb, NULL, skb->dst->dev,
+		       xfrm6_output_finish);
+}

^ permalink raw reply

* Re: [PATCH 05/13]: [IPV4/6]: Netfilter IPsec output hooks
From: Herbert Xu @ 2005-11-22 10:31 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, netfilter-devel, davem
In-Reply-To: <20051122103038.GA31532@gondor.apana.org.au>

On Tue, Nov 22, 2005 at 09:30:38PM +1100, herbert wrote:
>
> the need to return control to dst_output.  The only reason for
> dst_output to exist is to handle compilers that can't optimise away
> tail calls.  So if we are going to rely on the compiler to do away
> with tail calls (ip_dst_output <-> __ip_dst_output), then we might
> as well get rid of the loop in dst_output.

Unfortunately it looks like gcc 3.3.5 at least is too dumb to optimise
it away.  I think we'll need a better strategy.
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 05/13]: [IPV4/6]: Netfilter IPsec output hooks
From: Herbert Xu @ 2005-11-22 10:30 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, netfilter-devel, davem
In-Reply-To: <4382A44F.9000105@trash.net>

On Tue, Nov 22, 2005 at 05:53:35AM +0100, Patrick McHardy wrote:
> 
> This looks nice, but placing the hooks at the end of the xfrm[46]
> functions doesn't work with queueing without recursively calling
> dst_output (as okfn) since we have to provide an okfn but also
> have to return ownership of the skb back to dst_output.

This patch (on top of the last one) is what we could do to eliminate
the need to return control to dst_output.  The only reason for
dst_output to exist is to handle compilers that can't optimise away
tail calls.  So if we are going to rely on the compiler to do away
with tail calls (ip_dst_output <-> __ip_dst_output), then we might
as well get rid of the loop in dst_output.

BTW, I killed the corresponding inline (which would have gone onto
xfrm[46]_output_finish) because the compiler should be able to
optimise it into a tail call.  If it doesn't, then we're in trouble
anyway since it won't be able to optimise away the call to dst_output.

> This is after POST_ROUTING :) POST_ROUTING is called before the
> transforms and LOCAL_OUT afterwards. But it could be moved to the
> ip/ip6_dst_output functions to avoid unnecessarily trying to reset
> the skb for transport mode transforms.

You're absolute right.  I somehow mistook LOCAL_OUT for POST_ROUTING :)

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/include/net/dst.h b/include/net/dst.h
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -224,16 +224,7 @@ static inline void dst_set_expires(struc
 /* Output packet to network from transport.  */
 static inline int dst_output(struct sk_buff *skb)
 {
-	int err;
-
-	for (;;) {
-		err = skb->dst->output(skb);
-
-		if (likely(err == 0))
-			return err;
-		if (unlikely(err != NET_XMIT_BYPASS))
-			return err;
-	}
+	return skb->dst->output(skb);
 }
 
 /* Input packet from network to transport.  */
diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
--- a/net/ipv4/xfrm4_output.c
+++ b/net/ipv4/xfrm4_output.c
@@ -10,6 +10,7 @@
 
 #include <linux/skbuff.h>
 #include <linux/spinlock.h>
+#include <linux/netfilter_ipv4.h>
 #include <net/inet_ecn.h>
 #include <net/ip.h>
 #include <net/xfrm.h>
@@ -95,7 +96,7 @@ out:
 	return ret;
 }
 
-int xfrm4_output(struct sk_buff *skb)
+static int xfrm4_output_finish(struct sk_buff *skb)
 {
 	struct dst_entry *dst = skb->dst;
 	struct xfrm_state *x = dst->xfrm;
@@ -138,13 +139,20 @@ int xfrm4_output(struct sk_buff *skb)
 		x = dst->xfrm;
 	} while (x && !x->props.mode);
 
-	err = NET_XMIT_BYPASS;
+	nf_reset(skb);
+	
+	return NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, dst->dev,
+		       dst_output);
 
-out_exit:
-	return err;
 error:
 	spin_unlock_bh(&x->lock);
 error_nolock:
 	kfree_skb(skb);
-	goto out_exit;
+	return err;
+}
+
+int xfrm4_output(struct sk_buff *skb)
+{
+	return NF_HOOK(PF_INET, NF_IP_POST_ROUTING, skb, NULL, skb->dst->dev,
+		       xfrm4_output_finish);
 }
diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -12,6 +12,7 @@
 #include <linux/skbuff.h>
 #include <linux/spinlock.h>
 #include <linux/icmpv6.h>
+#include <linux/netfilter_ipv6.h>
 #include <net/dsfield.h>
 #include <net/inet_ecn.h>
 #include <net/ipv6.h>
@@ -92,7 +93,7 @@ static int xfrm6_tunnel_check_size(struc
 	return ret;
 }
 
-int xfrm6_output(struct sk_buff *skb)
+static int xfrm6_output_finish(struct sk_buff *skb)
 {
 	struct dst_entry *dst = skb->dst;
 	struct xfrm_state *x = dst->xfrm;
@@ -137,13 +138,20 @@ int xfrm6_output(struct sk_buff *skb)
 		x = dst->xfrm;
 	} while (x && !x->props.mode);
 
-	err = NET_XMIT_BYPASS;
+	nf_reset(skb);
+	
+	return NF_HOOK(PF_INET6, NF_IP6_LOCAL_OUT, skb, NULL, dst->dev,
+		       dst_output);
 
-out_exit:
-	return err;
 error:
 	spin_unlock_bh(&x->lock);
 error_nolock:
 	kfree_skb(skb);
-	goto out_exit;
+	return err;
+}
+
+int xfrm6_output(struct sk_buff *skb)
+{
+	return NF_HOOK(PF_INET6, NF_IP6_POST_ROUTING, skb, NULL, skb->dst->dev,
+		       xfrm6_output_finish);
 }

^ permalink raw reply

* Re: [PATCH 02/10]: [NETFILTER]: Defer fragmentation in ip_output when connection tracking is used
From: Patrick McHardy @ 2005-11-22  8:17 UTC (permalink / raw)
  To: Harald Welte
  Cc: Kernel Netdev Mailing List, Netfilter Development Mailinglist,
	Herbert Xu
In-Reply-To: <20051122075904.GL31478@sunbeam.de.gnumonks.org>

Harald Welte wrote:
> On Thu, Nov 17, 2005 at 03:28:24AM +0100, Patrick McHardy wrote:
> 
> 
>>- CLASSIFY fragments differently
>>- MARK fragments differently
>>- DSCP/ECN/TOS mark fragments differently
>>- Change TTLs of fragments to differently values
> 
> [...]
> 
>>I've CCed Harald for his opinion in case I missed something.
> 
> 
> I don't think you've missed something.  I agree that the features you
> pointed out seem to be useful and/or important.

I hope you meant to write "don't seem to be"? :)

^ permalink raw reply

* Re: [PATCH 02/10]: [NETFILTER]: Defer fragmentation in ip_output when connection tracking is used
From: Harald Welte @ 2005-11-22  7:59 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Kernel Netdev Mailing List, Netfilter Development Mailinglist,
	Herbert Xu
In-Reply-To: <437BEAC8.40904@trash.net>

[-- Attachment #1: Type: text/plain, Size: 799 bytes --]

On Thu, Nov 17, 2005 at 03:28:24AM +0100, Patrick McHardy wrote:

> - CLASSIFY fragments differently
> - MARK fragments differently
> - DSCP/ECN/TOS mark fragments differently
> - Change TTLs of fragments to differently values
[...]
> I've CCed Harald for his opinion in case I missed something.

I don't think you've missed something.  I agree that the features you
pointed out seem to be useful and/or important.

-- 
- Harald Welte <laforge@netfilter.org>                 http://netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply

* Re: [PATCH 05/13]: [IPV4/6]: Netfilter IPsec output hooks
From: Patrick McHardy @ 2005-11-22  5:13 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, netfilter-devel, davem
In-Reply-To: <4382A44F.9000105@trash.net>

Patrick McHardy wrote:
> Herbert Xu wrote:
> 
>> On Sun, Nov 20, 2005 at 04:31:34PM +0000, Patrick McHardy wrote:
>>
>>> diff --git a/net/ipv4/netfilter.c b/net/ipv4/netfilter.c
>>> index ae0779d..b93e7cd 100644
>>> --- a/net/ipv4/netfilter.c
>>> +++ b/net/ipv4/netfilter.c
>>> @@ -78,6 +79,34 @@ int ip_route_me_harder(struct sk_buff **
>>> }
>>> EXPORT_SYMBOL(ip_route_me_harder);
>>>
>>> +#ifdef CONFIG_XFRM
>>> +static inline int __ip_dst_output(struct sk_buff *skb)
>>
>>
>>
>> I'd like to suggest an alternative way of doing this that
>>
>> 1) Keeps this XFRM stuff in xfrm*.c.
>> 2) Removes the need for ip_dst_output.
>>
>> Please see the attached patch.
>>
>>
>>> +    do {
>>> +        err = skb->dst->output(skb);
>>> +
>>> +        if (likely(err == 0))
>>> +            return err;
>>> +        if (unlikely(err != NET_XMIT_BYPASS))
>>> +            return err;
>>> +    } while (skb->dst->xfrm && !skb->dst->xfrm->props.mode);
>>> +
>>> +    return NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, skb->dst->dev,
>>> +                   ip_dst_output);
>>
>>
>>
>> The idea is simply to put this stuff in xfrm[46]_output directly.
>> So for your patch you would simply need to add the two NF_HOOK
>> calls at the beginning and end of xfrm[46]_output once they've
>> been modified in the way I outline below.
> 
> 
> This looks nice, but placing the hooks at the end of the xfrm[46]
> functions doesn't work with queueing without recursively calling
> dst_output (as okfn) since we have to provide an okfn but also
> have to return ownership of the skb back to dst_output.

I should add, the same affects ip_dst_output/__ip_dst_output of
course, which is why they do call themselves recursively. But since
__ip_dst_output is an inline function and is not called through the
function pointer except from a different context (ip_queue), the
compiler does a good job at eliminating the recursion for the
inlined version. This probably wouldn't work if we used a recursive
dst_output call in xfrm[46]_output. But your patch looks like a nice
idea anyway, so how about we move the looping to xfrm[46]_input and
keep ip/ip6_dst_output for the hooks?

^ permalink raw reply

* Re: [PATCH 05/13]: [IPV4/6]: Netfilter IPsec output hooks
From: Patrick McHardy @ 2005-11-22  4:53 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, netfilter-devel, davem
In-Reply-To: <20051122044046.GA29166@gondor.apana.org.au>

Herbert Xu wrote:
> On Sun, Nov 20, 2005 at 04:31:34PM +0000, Patrick McHardy wrote:
> 
>>diff --git a/net/ipv4/netfilter.c b/net/ipv4/netfilter.c
>>index ae0779d..b93e7cd 100644
>>--- a/net/ipv4/netfilter.c
>>+++ b/net/ipv4/netfilter.c
>>@@ -78,6 +79,34 @@ int ip_route_me_harder(struct sk_buff **
>> }
>> EXPORT_SYMBOL(ip_route_me_harder);
>> 
>>+#ifdef CONFIG_XFRM
>>+static inline int __ip_dst_output(struct sk_buff *skb)
> 
> 
> I'd like to suggest an alternative way of doing this that
> 
> 1) Keeps this XFRM stuff in xfrm*.c.
> 2) Removes the need for ip_dst_output.
> 
> Please see the attached patch.
> 
> 
>>+	do {
>>+		err = skb->dst->output(skb);
>>+
>>+		if (likely(err == 0))
>>+			return err;
>>+		if (unlikely(err != NET_XMIT_BYPASS))
>>+			return err;
>>+	} while (skb->dst->xfrm && !skb->dst->xfrm->props.mode);
>>+
>>+	return NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, skb->dst->dev,
>>+	               ip_dst_output);
> 
> 
> The idea is simply to put this stuff in xfrm[46]_output directly.
> So for your patch you would simply need to add the two NF_HOOK
> calls at the beginning and end of xfrm[46]_output once they've
> been modified in the way I outline below.

This looks nice, but placing the hooks at the end of the xfrm[46]
functions doesn't work with queueing without recursively calling
dst_output (as okfn) since we have to provide an okfn but also
have to return ownership of the skb back to dst_output.

>>diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
>>index 66620a9..c135746 100644
>>--- a/net/ipv4/xfrm4_output.c
>>+++ b/net/ipv4/xfrm4_output.c
>>@@ -133,6 +133,7 @@ int xfrm4_output(struct sk_buff *skb)
>> 		err = -EHOSTUNREACH;
>> 		goto error_nolock;
>> 	}
>>+	nf_reset(skb);
>> 	err = NET_XMIT_BYPASS;
> 
> 
> Shouldn't this sit after POST_ROUTING, i.e., once the connection
> has been confirmed?

This is after POST_ROUTING :) POST_ROUTING is called before the
transforms and LOCAL_OUT afterwards. But it could be moved to the
ip/ip6_dst_output functions to avoid unnecessarily trying to reset
the skb for transport mode transforms.

^ permalink raw reply

* Re: [PATCH 05/13]: [IPV4/6]: Netfilter IPsec output hooks
From: Herbert Xu @ 2005-11-22  4:40 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, netfilter-devel, davem
In-Reply-To: <20051120163134.16666.9265.sendpatchset@localhost.localdomain>

On Sun, Nov 20, 2005 at 04:31:34PM +0000, Patrick McHardy wrote:
>
> diff --git a/net/ipv4/netfilter.c b/net/ipv4/netfilter.c
> index ae0779d..b93e7cd 100644
> --- a/net/ipv4/netfilter.c
> +++ b/net/ipv4/netfilter.c
> @@ -78,6 +79,34 @@ int ip_route_me_harder(struct sk_buff **
>  }
>  EXPORT_SYMBOL(ip_route_me_harder);
>  
> +#ifdef CONFIG_XFRM
> +static inline int __ip_dst_output(struct sk_buff *skb)

I'd like to suggest an alternative way of doing this that

1) Keeps this XFRM stuff in xfrm*.c.
2) Removes the need for ip_dst_output.

Please see the attached patch.

> +	do {
> +		err = skb->dst->output(skb);
> +
> +		if (likely(err == 0))
> +			return err;
> +		if (unlikely(err != NET_XMIT_BYPASS))
> +			return err;
> +	} while (skb->dst->xfrm && !skb->dst->xfrm->props.mode);
> +
> +	return NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, skb->dst->dev,
> +	               ip_dst_output);

The idea is simply to put this stuff in xfrm[46]_output directly.
So for your patch you would simply need to add the two NF_HOOK
calls at the beginning and end of xfrm[46]_output once they've
been modified in the way I outline below.

> diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
> index 66620a9..c135746 100644
> --- a/net/ipv4/xfrm4_output.c
> +++ b/net/ipv4/xfrm4_output.c
> @@ -133,6 +133,7 @@ int xfrm4_output(struct sk_buff *skb)
>  		err = -EHOSTUNREACH;
>  		goto error_nolock;
>  	}
> +	nf_reset(skb);
>  	err = NET_XMIT_BYPASS;

Shouldn't this sit after POST_ROUTING, i.e., once the connection
has been confirmed?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/include/net/ip.h b/include/net/ip.h
diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
--- a/net/ipv4/xfrm4_output.c
+++ b/net/ipv4/xfrm4_output.c
@@ -113,26 +113,31 @@ int xfrm4_output(struct sk_buff *skb)
 			goto error_nolock;
 	}
 
-	spin_lock_bh(&x->lock);
-	err = xfrm_state_check(x, skb);
-	if (err)
-		goto error;
-
-	xfrm4_encap(skb);
-
-	err = x->type->output(x, skb);
-	if (err)
-		goto error;
+	do {
+		spin_lock_bh(&x->lock);
+		err = xfrm_state_check(x, skb);
+		if (err)
+			goto error;
+
+		xfrm4_encap(skb);
+
+		err = x->type->output(x, skb);
+		if (err)
+			goto error;
 
-	x->curlft.bytes += skb->len;
-	x->curlft.packets++;
+		x->curlft.bytes += skb->len;
+		x->curlft.packets++;
 
-	spin_unlock_bh(&x->lock);
+		spin_unlock_bh(&x->lock);
 	
-	if (!(skb->dst = dst_pop(dst))) {
-		err = -EHOSTUNREACH;
-		goto error_nolock;
-	}
+		if (!(skb->dst = dst_pop(dst))) {
+			err = -EHOSTUNREACH;
+			goto error_nolock;
+		}
+		dst = skb->dst;
+		x = dst->xfrm;
+	} while (x && !x->props.mode);
+
 	err = NET_XMIT_BYPASS;
 
 out_exit:
diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -110,28 +110,33 @@ int xfrm6_output(struct sk_buff *skb)
 			goto error_nolock;
 	}
 
-	spin_lock_bh(&x->lock);
-	err = xfrm_state_check(x, skb);
-	if (err)
-		goto error;
-
-	xfrm6_encap(skb);
-
-	err = x->type->output(x, skb);
-	if (err)
-		goto error;
-
-	x->curlft.bytes += skb->len;
-	x->curlft.packets++;
-
-	spin_unlock_bh(&x->lock);
+	do {
+		spin_lock_bh(&x->lock);
+		err = xfrm_state_check(x, skb);
+		if (err)
+			goto error;
+
+		xfrm6_encap(skb);
+
+		err = x->type->output(x, skb);
+		if (err)
+			goto error;
+
+		x->curlft.bytes += skb->len;
+		x->curlft.packets++;
+
+		spin_unlock_bh(&x->lock);
+
+		skb->nh.raw = skb->data;
+		
+		if (!(skb->dst = dst_pop(dst))) {
+			err = -EHOSTUNREACH;
+			goto error_nolock;
+		}
+		dst = skb->dst;
+		x = dst->xfrm;
+	} while (x && !x->props.mode);
 
-	skb->nh.raw = skb->data;
-	
-	if (!(skb->dst = dst_pop(dst))) {
-		err = -EHOSTUNREACH;
-		goto error_nolock;
-	}
 	err = NET_XMIT_BYPASS;
 
 out_exit:

^ permalink raw reply

* Re: [PATCH -mm2] net: Fix compiler-error on dgrs.c when !CONFIG_PCI
From: Richard Knutsson @ 2005-11-22  1:56 UTC (permalink / raw)
  To: Herbert Xu; +Cc: akpm, linux-kernel, netdev, jgarzik, ashutosh.naik
In-Reply-To: <20051121203758.GA25509@gondor.apana.org.au>

Herbert Xu wrote:

>On Mon, Nov 21, 2005 at 01:52:49PM +0100, Richard Knutsson wrote:
>  
>
>>This patch requirer the 
>>"net-fix-compiler-error-on-dgrsc-when-config_pci.patch" (added to the 
>>-mm tree after 2.6.15-rc1-mm2):
>>
>>--- 
>>devel/drivers/net/dgrs.c~net-fix-compiler-error-on-dgrsc-when-config_pci 
>>2005-11-19 18:00:34.000000000 -0800
>>+++ devel-akpm/drivers/net/dgrs.c	2005-11-19 18:00:34.000000000 -0800
>>@@ -1458,6 +1458,8 @@ static struct pci_driver dgrs_pci_driver
>>	.probe = dgrs_pci_probe,
>>	.remove = __devexit_p(dgrs_pci_remove),
>>};
>>+#else
>>+static struct pci_driver dgrs_pci_driver = {};
>>#endif
>>    
>>
>
>I don't see the point.  We shouldn't have this structure at all
>if CONFIG_PCI is not set.
>
>Cheers,
>  
>
Opps, misread your mail. Sorry.
But in that case, why shall we have any pci_*-function in the first 
place when !CONFIG_PCI? As it was before, they were contained within 
#ifdef CONFIG_PCI's.
You said your patch were easier to read, please elaborate. Yes, in the 
dgrs_init_module() (no arguments there), but you introduces new 
functions (who need to be checked out if you read the code for the first 
time) and is really #ifdef's a good idea to change function behavior? 
Isn't better to change the input? (I know, linux/pci.h does it, but at 
least that is in a .h-file with inline functions containing at most 
"return 0;").

Btw, you said Jeff should decide. Why not Rich Richardson who is the author?

Apologize for the ranting.
As I said before, thinking of deleting CONFIG_PCI's containing 
pci_*-functions and if so, need a valid plan for it (because of the 
pci_driver-struct). If there is any no-no in "my" way, please point it 
out to spare me/you/lkml the patches. If not for this, I would've let 
this rest and leave it to Jeff (or Rich ;) ).

cu,
Richard

^ permalink raw reply

* Re: netlink nlmsg_pid supposed to be pid or tid?
From: Herbert Xu @ 2005-11-22  0:09 UTC (permalink / raw)
  To: Christopher Friesen; +Cc: kuznet, netdev, linux-kernel, davem
In-Reply-To: <43824F57.1040007@nortel.com>

On Mon, Nov 21, 2005 at 04:51:03PM -0600, Christopher Friesen wrote:
> 
> Should that be getsockname(2)?  Anyways, I now understand the issues.

Yes that's what I meant.

> As it turns out, we were actually already calling getsockname() a bit 
> earlier in the code path, so we can just use the "pid" value from there.

Excellent.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH -mm2] net: Fix compiler-error on dgrs.c when !CONFIG_PCI
From: Herbert Xu @ 2005-11-21 23:36 UTC (permalink / raw)
  To: Richard Knutsson; +Cc: akpm, linux-kernel, netdev, jgarzik, ashutosh.naik
In-Reply-To: <4382563E.50502@student.ltu.se>

On Tue, Nov 22, 2005 at 12:20:30AM +0100, Richard Knutsson wrote:
> 
> Am thinking of removing the #ifdef CONFIG_PCI's in other files, to 
> "clean up" the code, but that would introduce this problem again, don't 
> you think it is more readable to make an empty struct when !CONFIG_PCI, 
> then making new pci_*-functions specific to the driver?

I think my version is more readable.  However, it's really up to Jeff
to decide.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: netlink nlmsg_pid supposed to be pid or tid?
From: Herbert Xu @ 2005-11-21 23:34 UTC (permalink / raw)
  To: Alexey Kuznetsov; +Cc: Christopher Friesen, netdev, linux-kernel, davem
In-Reply-To: <20051121224913.GA31287@ms2.inr.ac.ru>

On Tue, Nov 22, 2005 at 01:49:13AM +0300, Alexey Kuznetsov wrote:
> 
> Actually, I remember one discussion. Herbert, wait a minute...
> That's it: February 2005, Subject: [PATCH] Add audit uid to netlink credentials
> We decided (or not?) that binding to anything but tgid and pid
> must be prohibited by security reasons. Apaprently, the finding was lost.

Thanks for reminding me.  We may still need to track that down (we
have now serialised most of the netlink processing so this my not be
as bad as it was).

However, I think explicit binding should still be allowed for root,
so nobody should take the PID for granted.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH -mm2] net: Fix compiler-error on dgrs.c when !CONFIG_PCI
From: Richard Knutsson @ 2005-11-21 23:20 UTC (permalink / raw)
  To: Herbert Xu; +Cc: akpm, linux-kernel, netdev, jgarzik, ashutosh.naik
In-Reply-To: <E1EeJu8-0006vr-00@gondolin.me.apana.org.au>

Herbert Xu wrote:

>Richard Knutsson <ricknu-0@student.ltu.se> wrote:
>  
>
>>We need it even if pci_register_driver() and pci_register_driver() is 
>>empty shells. And instead of removing #endif CONFIG_PCI for 
>>    
>>
>
>I don't think so.  Please see my previous patch where pci_register_driver
>is not called at all if CONFIG_PCI is not defined.
>
>Cheers,
>  
>
Yes, I know you patch don't need it (thought you commented the patch vs. 
dgrs.c, not vs. you patch). But to do that, you have to introduce a new 
dgrs-specific pci-layer to compensate. If you don't want to have an 
empty struct, is it not nicer/easier to just #ifdef CONFIG_PCI the 
pci_*-functions? (instead of two inline functions for every used 
pci_*-function?)

Am thinking of removing the #ifdef CONFIG_PCI's in other files, to 
"clean up" the code, but that would introduce this problem again, don't 
you think it is more readable to make an empty struct when !CONFIG_PCI, 
then making new pci_*-functions specific to the driver?

cu

^ permalink raw reply

* Re: netlink nlmsg_pid supposed to be pid or tid?
From: Christopher Friesen @ 2005-11-21 22:51 UTC (permalink / raw)
  To: Herbert Xu; +Cc: kuznet, netdev, linux-kernel, davem
In-Reply-To: <E1EeJwF-0006wc-00@gondolin.me.apana.org.au>

Herbert Xu wrote:

> As I said before, you should not rely on getpid() to work.

> You should always use getsockaddr(2) to get your local address.

Should that be getsockname(2)?  Anyways, I now understand the issues.

As it turns out, we were actually already calling getsockname() a bit 
earlier in the code path, so we can just use the "pid" value from there.

Chris

^ permalink raw reply

* Re: netlink nlmsg_pid supposed to be pid or tid?
From: Alexey Kuznetsov @ 2005-11-21 22:49 UTC (permalink / raw)
  To: Christopher Friesen; +Cc: Herbert Xu, netdev, linux-kernel, davem
In-Reply-To: <4382406D.1040508@nortel.com>

Hello!

> TIPC wants the user to fill in the pid to use in the nlmsghdr portion of 
> a particular message.

It is wrong. netlink_pid used not to be associated with process pids.
Kernel used pid just as a seed to calculate a random value to bind,
when user did not bind explicitly. It is equal to current->pid occasionally.
F.e. libnetlink from iproute autobinds and gets netlink_pid with
getsockname().

When user binds the socket himself, he was free to bind to any value,
including pid and tgid.

Actually, I remember one discussion. Herbert, wait a minute...
That's it: February 2005, Subject: [PATCH] Add audit uid to netlink credentials
We decided (or not?) that binding to anything but tgid and pid
must be prohibited by security reasons. Apaprently, the finding was lost.

Alexey

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox